This vignettes discribes the modelling techniques available in ompr to make your life easier when developing a mixed integer programming model.

It uses the MILPModel backend which is still experimental!

A MILP Model

You can think of a MIP Model as a big constraint maxtrix and a set of vectors. But you can also think of it as a set of decision variables, an objective function and a number of constraints as equations/inequalities. ompr implements the latter approach.

For example, Wikipedia describes the Knapsack problem like this: $\begin{equation*} \begin{array}{ll@{}ll} \text{max} & \displaystyle\sum\limits_{i=1}^{n} v_{i}x_{i} & &\\ \text{subject to}& \displaystyle\sum\limits_{i=1}^{n} w_{i}x_{i} \leq W, & &\\ & x_{i} \in \{0,1\}, &i=1 ,\ldots, n& \end{array} \end{equation*}$

This is the ompr equivalent:

n <- 10; W <- 2
v <- runif(n);w <- runif(n)
model <- MILPModel() %>%
add_variable(x[i], i = 1:n, type = "binary") %>%
set_objective(sum_expr(colwise(v[i]) * x[i], i = 1:n)) %>%
add_constraint(sum_expr(colwise(w[i]) * x[i], i = 1:n) <= W)

The overall idea is to use modern R idioms to construct models like the one above as readable as possible directly in R. ompr will do the heavy lifting and transforms everything into matrices/vectors and pass it to your favorite solver.

Vectorized semantics

ompr supppots different backends. A backend is the empty model to which you add variables, constraints etc. Currently two backends exist: MIPModel and MILPModel. This vignette describes the latter as the first will become deprecated.

Compared to the old MIPModel backend, MILPModel has vectorized semantics. Meaning that model variables accept and expect vectors. This enables a speedup by a factor of 1000 and more. More details can be found at the end of this document.

Pipes

Each function in ompr creates immutable copies of the models. In addition the function interface has been designed to work with magrittr pipes. You always start with an empty model and add components to it.

## Mixed integer linear optimization problem
## Variables:
##   Continuous: 1
##   Integer: 0
##   Binary: 0
## Model sense: maximize
## Constraints: 1

Variable types

Variables can be of type continuous, integer or binary.

## Mixed integer linear optimization problem
## Variables:
##   Continuous: 1
##   Integer: 1
##   Binary: 1
## No objective function.
## Constraints: 0

Variable bounds

Variables can have lower and upper bounds.

add_variable(y, lb = 5, ub = 10)
## Mixed integer linear optimization problem
## Variables:
##   Continuous: 2
##   Integer: 0
##   Binary: 0
## No objective function.
## Constraints: 0

Indexed variables

Often when you develop a complex model you work with indexed variables. This is an important concept ompr supports.

add_variable(x[i], i = 1:10) %>%  # creates 10 decision variables
set_objective(x[5]) %>%
## Mixed integer linear optimization problem
## Variables:
##   Continuous: 10
##   Integer: 0
##   Binary: 0
## Model sense: maximize
## Constraints: 1

Summation over variables

If you have indexed variables then you often want to sum over a subset of variables.

The following code creates a model with three decision variables $$x_1$$, $$x_2$$, $$x_3$$. An objective function $$\sum_i x_i$$ and one constraint $$\sum_i x_i \leq 10$$.

set_objective(sum_expr(x[i], i = 1:3)) %>%
add_constraint(sum_expr(x[i], i = 1:3) <= 10)
## Mixed integer linear optimization problem
## Variables:
##   Continuous: 3
##   Integer: 0
##   Binary: 0
## Model sense: maximize
## Constraints: 1

Quantifiers

add_variable, add_constraint, set_bounds, sum_expr all support a common quantifier interface that also supports filter expression. A more complex example will show what that means.

# Create x_{i, j} variables for all combinations of i and j where
# i = 1:10 and j = 1:10.
add_variable(x[i, j], type = "binary", i = 1:10, j = 1:10) %>%

# add a y_i variable for all i between 1 and 10 with i mod 2 = 0
add_variable(y[i], type = "binary", i = 1:10, i %% 2 == 0) %>%

# we maximize all x_{i,j} where i = j + 1
set_objective(sum_expr(x[i, j], i = 1:10, j = 1:10, i == j + 1)) %>%

# for each i between 1 and 10 with i mod 2 = 0
# we add a constraint \sum_j x_{i,j}
add_constraint(sum_expr(x[i, j], j = 1:10) <= 1, i = 1:10, i %% 2 == 0) %>%

# of course you can leave out filters or add more than 1
add_constraint(sum_expr(x[i, j], j = 1:10) <= 2, i = 1:10)
## Mixed integer linear optimization problem
## Variables:
##   Continuous: 0
##   Integer: 0
##   Binary: 105
## Model sense: maximize
## Constraints: 15

Special bounds on a subset of variables

Imagine you want to model a matching problem with a single binary decision variable $$x_{i,j}$$ that is $$1$$ iff object $$i$$ is matched to object $$j$$. One constraint would be to allow matches only if $$i \neq j$$. This can be modelled by a constraint or by selectively changing bounds on variables. The latter approach can be used by solvers to improve the solution process.

add_variable(x[i, j], i = 1:10, j = 1:10,
type = "integer", lb = 0, ub = 1) %>%
set_objective(sum_expr(x[i, j], i = 1:10, j = 1:10)) %>%
add_constraint(x[i, i] == 0, i = 1:10) %>%

# this sets the ub to 0 without adding new constraints
set_bounds(x[i, i], ub = 0, i = 1:10)
## Mixed integer linear optimization problem
## Variables:
##   Continuous: 0
##   Integer: 100
##   Binary: 0
## Model sense: maximize
## Constraints: 10

External model parameters

Of course you will need external parameters for your models. You can reuse any variable defined in your R environment within the MIP Model.

n <- 5 # number of our variables
costs <- rpois(n, lambda = 3) # a cost vector
max_elements <- 3

add_variable(x[i], type = "binary", i = 1:n) %>%
set_objective(sum_expr(colwise(costs[i]) * x[i], i = 1:n)) %>%
add_constraint(sum_expr(x[i], i = 1:n) <= max_elements)
## Mixed integer linear optimization problem
## Variables:
##   Continuous: 0
##   Integer: 0
##   Binary: 5
## Model sense: maximize
## Constraints: 1

Extract model solutions

Once you have a model, you pass it to a solver and get back a solutions. The main interface to extract variable values from a solution is the function get_solution. It returns a data.frame for indexed variable and thus makes it easy to subsequently use the value.

We use ROI and GLPK to solve it.

## ROI: R Optimization Infrastructure
## Registered solver plugins: nlminb, glpk.
## Default solver: auto.

set.seed(1)
n <- 5
weights <- matrix(rpois(n * n, 5), ncol = n, nrow = n)
# construct a function that is vectorized
w <- function(i, j) {
vapply(seq_along(i), function(k) weights[i[k], j[k]], numeric(1L))
}
result <- MILPModel() %>%
add_variable(x[i, j], i = 1:n, j = 1:n, type = "binary") %>%
set_objective(sum_expr(colwise(w(i, j)) * x[i, j], i = 1:n, j = 1:n)) %>%
add_constraint(sum_expr(x[i, j], j = 1:n) == 1, i = 1:n) %>%
solve_model(with_ROI("glpk", verbose = TRUE))
## <SOLVER MSG>  ----
## GLPK Simplex Optimizer 5.0
## 5 rows, 25 columns, 25 non-zeros
##       0: obj =  -0.000000000e+00 inf =   5.000e+00 (5)
##       5: obj =   2.400000000e+01 inf =   0.000e+00 (0)
## *    14: obj =   4.400000000e+01 inf =   0.000e+00 (0)
## OPTIMAL LP SOLUTION FOUND
## GLPK Integer Optimizer 5.0
## 5 rows, 25 columns, 25 non-zeros
## 25 integer variables, all of which are binary
## Integer optimization begins...
## Long-step dual simplex will be used
## +    14: mip =     not found yet <=              +inf        (1; 0)
## +    14: >>>>>   4.400000000e+01 <=   4.400000000e+01   0.0% (1; 0)
## +    14: mip =   4.400000000e+01 <=     tree is empty   0.0% (0; 1)
## INTEGER OPTIMAL SOLUTION FOUND
## <!SOLVER MSG> ----
get_solution(result, x[i, j]) %>%
dplyr::filter(value == 1)
##   variable i j value
## 1        x 4 1     1
## 2        x 2 2     1
## 3        x 5 3     1
## 4        x 3 4     1
## 5        x 1 5     1

You can also fix certain indexes.

get_solution(result, x[2, j])
##   variable j value
## 1        x 1     0
## 2        x 2     1
## 3        x 3     0
## 4        x 4     0
## 5        x 5     0

Vectorized semantics revisited

Each variable accepts vectors. The following code snippets show the behaviour by example:

Instead of passing index variables through quantifiers, you can also use vectors directly. Each element of a vector creates a new row for that variable. The two constraint groups below are equivalent.

n <- 10L

add_variable(x[i, j], i = 1:n, j = 1:n) %>%
add_constraint(x[i, j] == 1, i = 1:n, j = 1:n, i == j) %>%
add_constraint(x[1:n, 1:n] == 1) # this this equivalent
## Mixed integer linear optimization problem
## Variables:
##   Continuous: 100
##   Integer: 0
##   Binary: 0
## No objective function.
## Constraints: 20

You can also add vectors columnwise using the function colwise or as_colwise:

add_variable(x[i, j], i = 1:n, j = 1:n) %>%
add_constraint(sum_expr(x[i, j], i = 1:n) == 1, j = 1:n) %>%
add_constraint(x[colwise(1:n), 1:n] == 1) # this this equivalent
## Mixed integer linear optimization problem
## Variables:
##   Continuous: 100
##   Integer: 0
##   Binary: 0
## No objective function.
## Constraints: 20

Another example:

Say you want to express the below matrix:

x[1, 1]
x[1, 1] + x[1, 2]
x[1, 1] + x[1, 2] + x[1, 3]

With vectorized semantics you can do the following:

x[1, colwise(1, 1:2, 1:3)]

Or with the support of the ompr function sum_expr

add_variable(x[i, j], i = 1, j = 1:n) %>%
add_constraint(sum_expr(x[1, j], j = colwise(1, 1:2, 1:3)) == 1)
## Mixed integer linear optimization problem
## Variables:
##   Continuous: 10
##   Integer: 0
##   Binary: 0
## No objective function.
## Constraints: 3

Coefficients and colwise

colwise is probably the concept that is likely to change in the future.

However it is currently in particular necessary to set coefficients correctly within sum_expr. This section shows a couple of examples on how to use it.

Below is a model that demonstrates the use of colwise to set coefficients within sum_expr.

model <- MILPModel() %>%
add_variable(x[i, j], i = 1:2, j = 1:3) %>%

# this creates
# x[1, 1] + x[1, 2] + x[1, 3]
# x[2, 1] + x[2, 2] + x[2, 3]
add_constraint(sum_expr(x[i, j], j = 1:3) == 1, i = 1:2) %>%

# this creates
# 1 * x[1, 1] + 2 * x[1, 2] + 3 * x[1, 3]
# 1 * x[2, 1] + 2 * x[2, 2] + 3 * x[2, 3]
add_constraint(colwise(1:3) * sum_expr(x[i, j], j = 1:3) == 1, i = 1:2) %>%

# 1 * x[1, 1] + 2 * x[1, 2] + 3 * x[1, 3]
# 4 * x[2, 1] + 5 * x[2, 2] + 6 * x[2, 3]
add_constraint(sum_expr(colwise(1:6) * x[i, j], j = 1:3) == 1, i = 1:2) %>%

# 1 * x[1, 1] + 1 * x[1, 2] + 1 * x[1, 3]
# 2 * x[2, 1] + 2 * x[2, 2] + 2 * x[2, 3]
add_constraint(sum_expr(1:3 * x[i, j], j = 1:3) == 1, i = 1:2)
# the standard semantic is that numeric vectors have one coefficient
# for the entire row. Thus the first row gets multiplied by 1
# the second row by 2. And since there are only two rows, the third component
# gets ignored

Often, you have a function taking the variable indexes and returning the variable coefficients. As long as the function returns a colwise vector, everything works as above. The only difficulty is to figure out the length of the return value.

Say you have constraint that looks like this:

add_constraint(model, sum_expr(sum_fun(i, j) * x[i, j], j = 1:3) == 1, i = 1:2)

sum_fun should return a numeric vector wrapped in colwise with the length of any row index (e.g. i) times the length of any column index (e.g. j). In the above case 3 * 2 = 6.

A template function could look like this:

sum_fun <- function(i, j) {
n_rows <- length(i)
n_col <- length(j)
colwise(vapply(seq_len(n_rows * n_col), function(x) {
x # 1, 2, 3, 4, 5, 6
# where 1, 2, 3 are for cols in the first row
# 4, 5, 6 are the cols in the second row and so on
}, numeric(1L)))
}

Note that when using multiple indexes, all combinations are of those indexes are bound to indexes.

Sometimes, you want to control both variable indexes and their bounds very specifically. You can either use filter expressions together with set_bounds or pass in the concrete variable indexes explicitly.

The following code is equivalent:

add_variable(x[i, j], i = 1:10, j = 1:10)
## Mixed integer linear optimization problem
## Variables:
##   Continuous: 100
##   Integer: 0
##   Binary: 0
## No objective function.
## Constraints: 0
grid <- expand.grid(i = 1:10, j = 1:10)

add_variable(x[grid$i, grid$j])
## Mixed integer linear optimization problem
## Variables:
##   Continuous: 100
##   Integer: 0
##   Binary: 0
## No objective function.
## Constraints: 0

In addition to the first block of code, you can also set bounds on the variables in the same way:

add_variable(x[grid$i, grid$j], lb = grid$i * 10, ub = grid$i * 20)
## Mixed integer linear optimization problem
## Variables:
##   Continuous: 100
##   Integer: 0
##   Binary: 0
## No objective function.
## Constraints: 0

This is especially handy if you work with large models where a lot of index combinations are invalid. An example is routing problem, where you define a connection between two cities by a binary variable x[i, j] - a lot of i,j combinations are usually invalid and do not need to enter the model explicitly. Although a lot of solvers remove those variables during presolve.

Feedback

Do you have any questions, ideas, comments? Or did you find a mistake? Let’s discuss on Github.