Kantorovich distance with the 'ompr' package
Do you know my former blog? It contains four posts about the computation of the Kantorovich distance:
The Julia way using the JumP library has the most convenient syntax:
using JuMP
= [1/7, 2/7, 4/7]
mu = [1/4, 1/4, 1/2]
nu = length(mu)
n = Model()
m @defVar(m, p[1:n, 1:n] >= 0)
@setObjective(m, Min, sum{p[i, j], i in 1:n, j in 1:n; i != j})
for k in 1:n
@addConstraint(m, sum(p[k, :]) == mu[k])
@addConstraint(m, sum(p[:, k]) == nu[k])
end
solve(m)
This allows to get the Kantorovich distance between the two probabilities mu
and nu
corresponding to the 0-1 distance (assuming mu
and nu
have the same support). This is totally useless because one can straightforwardly get this distance: it is one minus the total weight of the infimum measure of the two probability measures (1 - sum(pmin(mu, nu))
in R). But this is just for a simple illustration purpose. This problem is not trivial for another distance on the support of mu
and nu
. Encoding this distance as a matrix D
, the linear programming model allowing to get the corresponding Kantorovich distance is obtained by replacing
, j], i in 1:n, j in 1:n; i != j} sum{p[i
with
, j] * D[i, j], i in 1:n, j in 1:n; i != j} sum{p[i
Now I want to show again how to compute the Kantorovich distance with R, but using another package I discovered yesterday: the ompr package. It allows to write the model with a convenient syntax, close to the mathematical language, similar to the one above with JumP. Here is the model:
library(ompr)
library(ompr.roi)
library(ROI.plugin.glpk)
c(1/7, 2/7, 4/7)
mu <- c(1/4, 1/4, 1/2)
nu <- length(mu)
n <-
MIPModel() |>
model <- add_variable(p[i, j], i = 1:n, j = 1:n, type = "continuous") |>
add_constraint(p[i, j] >= 0, i = 1:n, j = 1:n) |>
add_constraint(sum_over(p[i, j], j = 1:n) == mu[i], i = 1:n) |>
add_constraint(sum_over(p[i, j], i = 1:n) == nu[j], j = 1:n) |>
set_objective(sum_over(p[i, j], i = 1:n, j = 1:n, i != j), "min")
This is nicely readable. Now we solve the problem:
model |>
optimization <- solve_model(with_ROI(solver = "glpk"))
And we get the Kantorovich distance:
objective_value(optimization)
## [1] 0.1071429