Mathematical Optimization
Author: Aster Santana
Mathematical optimization, or math optimization for short, is a powerful and well-established technology widely used to solve practical decision-making problems across industries.
If you know what linear programming (LP) or mixed-integer programming (MIP) is, then you can simply think of math optimization as an extension of those technologies.
If you’re not familiar with optimization at all, the first thing you need to know is that there are three key steps in solving decision problems with math optimization:
Step 1: Write a mathematical formulation
This is the process of creating a very precise representation of the problem to be solved—which requires a very good understanding of the business problem. We use the word "formulation" because we literally write mathematical formulas to model the problem. These formulas define the objective function and the constraints, which are functions of the decision variables, i.e., the unknowns of the problem.
Step 2: Implement a mathematical model
In this step, we implement the mathematical formulation using some modeling/programming language so that we can use computers to solve the problem on any given data instance. In other words, the mathematical model is a representation of the mathematical formulation that computers understand.
Step 3: Solve the model
In theory, this is the hardest step. However, we simply hand this task over to the computer. There are incredibly powerful algorithms available from commercial and open-source solvers that we can leverage. And these solvers can often handle problems with even millions of decision variables and constraints in a matter of minutes!
Classes of optimization problems
Classes of optimization problems can be defined, for example, based on modeling assumptions, such as deterministic or stochastic; type of decision variables present in the model, such as continuous, integer, or both; or type of expressions defining the objective function and constraints, such as linear or non-linear. The Convex Vs. Non-convex Optimization article offers a deeper dive into this topic.
When it comes to solving real-world problems, we often have to make approximations or simplify assumptions so that the formulation falls into a class of optimization that we can solve efficiently with state-of-the-art algorithms. This is an ongoing battle between representativeness and tractability.
One class that deserves special attention is that of deterministic mixed-integer programming (MIP), which may involve a mix of continuous and discrete (integer or binary) decision variables. MIP is also an extension of Linear Programming (LP), the most basic class of optimization problems. As a side note, when people say MIP they are generally referring to mixed-integer linear programming, and they use the acronym MILP when they want to emphasize that.
What’s so special about MIP?
The first great thing about MIP is that it strikes a great balance between representativeness and tractability. This means that a broad class of business problems can be represented and solved efficiently when modeled as MIP.
It should be noted that tractability was not a feature of MIP a few decades ago. Here is a quote from a recent paper by Dimitris Bertsimas and Angela King:
“In the period 1991–2015, algorithmic advances in Mixed-Integer Linear Optimization (MILO) coupled with hardware improvements have resulted in an astonishing 450 billion factor speedup in solving MILO problems.”
What’s even more impressive is that this number continues to grow every year as reported by commercial MIP solvers such as CPLEX and Gurobi.
Robustness is another strength of MIP. For the most part, it is easy to extend or modify a MIP model to address new business requirements, which is particularly relevant for business problems that tend to change quite often, i.e., most of them.
Finally, solutions from a MIP model always come with an optimality guarantee. This means that the optimization solver terminates only when it has found a provably optimal solution. And if the solver terminates before proving optimality, because it hit a time limit, for example, it will show how much the objective function value could possibly improve in the best-case scenario.
A little bit of history
For several decades, math optimization, and especially MIP, has been transforming operations across industries, including airlines crew scheduling, sports scheduling, and the whole field of the supply chain. Initially, however, those developing the theory were the only ones using math optimization for solving practical problems. That was because a deep understanding of the theoretical and computational aspects of math optimization used to be required for modeling and solving real-world problems.
This reality began to change in the 90s when the body of literature on how to formulate and solve problems with MIP increased. Around the same time, the commercial optimization solvers Xpress and CPLEX began to gain popularity. A few years later, in the early 2000s, two of the first open-source MIP solvers, GLPK and COIN-OR CBC, were publicly released. Another big player in the commercial side, Gurobi, was founded in 2008.
With that, a much broader community began to adopt math optimization to solve challenging problems in various industries. However, some specific modeling languages, such as GAMS, AMPL, or AIMMS, would still be required. This used to somehow limit the usage of math optimization to the Operations Research community.
Another wave of democratization of math optimization came with the broad adoption of Python as a programming language for data science and analytics. Many Python interfaces for optimization packages, such as Pyomo, Mip, and PuLP, emerged. Gurobi, in particular, has put tremendous effort into making its optimization package easily accessible via gurobipy, the Gurobi-Python interface, along with comprehensive, well-structured documentation. CPLEX has also taken a similar path.
Now, any data scientist (and other analytics professionals) can much more easily leverage math optimization for solving all sorts of practical problems. However, the myth that math optimization is only for Operation Researchers has not vanished yet.
How hard is it to learn math optimization?
This is a tricky question. And the answer highly depends on the level of expertise you want to achieve. Compared to machine learning, it might be fair to say that:
Math optimization is harder to grasp than basic regression and classification techniques.
Deep learning and reinforcement learning, on the other hand, are more difficult to learn than math optimization.
However, there are several levels of expertise that one can achieve in optimization. This includes being able to:
Admit that a problem is an optimization problem when an expert says so.
Recognize when a problem can be formulated as an optimization problem.
Formulate business problems as optimization problems.
Implement and solve optimization models using existing solvers and algorithms.
Design customized algorithms and heuristics for solving challenging optimization problems.
Develop theory for solving whole classes of optimization problems.
Levels 1 and 2 can be achieved by just reading or watching videos on the subject.
Level 3 is the one that requires the most time commitment because there isn’t an effective recipe for writing formulations. Some people describe it as a bag of tricks and some others say it’s an art. But the truth is that proficiency in formulating business problems using optimization only comes with deliberate practice.
A method is a trick you use more than once. — Richard Stone.
So the trick is not how many tricks you have but how you use them!
Level 4 is the most fun for most people and particularly easy with Python. Implementation of optimization models only becomes trickier for complex or large-scale problems, when performance issues might begin to arise.
Beginners typically spend a lot of time around Level 3 and Level 4.
Level 5 is also super fun but requires some more experience. At this point, math optimization may become just a tool to solve subproblems within a custom heuristic, for example. A decent understanding of the theory and how optimization solvers work also becomes crucial here.
Level 6 is where you find operations research professionals that hold a Ph.D. degree. To get to this level, you need to understand the theory underlying Levels 3-5 deeply.
If you are interested in developing skills to solve practical decision-making problems with math optimization, or take your existing skills to the next level, make sure to check our training programs.
If you have any questions, comments, or would like to chat more about this topic, just reach out to us.