# ot.regpath

Regularization path OT solvers

## Functions

ot.regpath.complement_schur(M_current, b, d, id_pop)[source]

This function computes the inverse of the design matrix in the regularization path using the Schur complement. Two cases may arise:

Case 1: one variable is added to the active set

$\begin{split}M_{k+1}^{-1} = \begin{bmatrix} M_{k}^{-1} + s^{-1} M_{k}^{-1} \mathbf{b} \mathbf{b}^T M_{k}^{-1} \ & - M_{k}^{-1} \mathbf{b} s^{-1} \\ - s^{-1} \mathbf{b}^T M_{k}^{-1} & s^{-1} \end{bmatrix}\end{split}$

where :

• $$M_k^{-1}$$ is the inverse of the design matrix $$H_A^tH_A$$ of the previous iteration

• $$\mathbf{b}$$ is the last column of $$M_{k}$$

• $$s$$ is the Schur complement, given by $$s = \mathbf{d} - \mathbf{b}^T M_{k}^{-1} \mathbf{b}$$

Case 2: one variable is removed from the active set.

$M_{k+1}^{-1} = M^{-1}_{k \backslash q} - \frac{r_{-q,q} r^{T}_{-q,q}}{r_{q,q}}$

where :

• $$q$$ is the index of column and row to delete

• $$M^{-1}_{k \backslash q}$$ is the previous inverse matrix deprived of the $$q$$ th column and $$q$$ th row

• $$r_{-q,q}$$ is the $$q$$ th column of $$M^{-1}_{k}$$ without the $$q$$ th element

• $$r_{q, q}$$ is the element of $$q$$ th column and $$q$$ th row in $$M^{-1}_{k}$$

Parameters
• M_current (ndarray, shape (size(A)-1, size(A)-1)) – Inverse matrix of $$H_A^tH_A$$ at the previous iteration, with size(A) the size of the active set

• b (ndarray, shape (size(A)-1, )) – None for case 2 (removal), last column of $$M_{k}$$ for case 1 (addition)

• d (float) – should be equal to 2 when UOT and 1 for the semi-relaxed OT

• id_pop (int) – Index of the variable to be removed, equal to -1 if no variable is deleted at the current iteration

Returns

M – Inverse matrix of $$H_A^tH_A$$ of the current iteration

Return type

ndarray, shape (size(A), size(A))

References

41

Chapel, L., Flamary, R., Wu, H., Févotte, C., and Gasso, G. (2021). Unbalanced optimal transport through non-negative penalized linear regression. NeurIPS.

ot.regpath.compute_next_removal(phi, delta, current_gamma)[source]

This function computes the next gamma value if a variable is removed at the next iteration of the regularization path.

We look for the largest value of the regularization parameter such that an element of the current solution vanishes

$\max_{j \in A} \frac{\phi_j}{\delta_j}$

where :

• A is the current active set

• $$\phi_j$$ is the $$j$$ th element of the intercept of the current solution

• $$\delta_j$$ is the $$j$$ th element of the slope of the current solution

Parameters
• phi (ndarray, shape (size(A), )) – Intercept of the solution at the current iteration

• delta (ndarray, shape (size(A), )) – Slope of the solution at the current iteration

• current_gamma (float) – Value of the regularization parameter at the beginning of the current iteration

Returns

• next_removal_gamma (float) – Gamma value if a variable is removed at the next iteration

• next_removal_index (int) – Index of the variable to be removed at the next iteration

References

41

Chapel, L., Flamary, R., Wu, H., Févotte, C., and Gasso, G. (2021). Unbalanced optimal transport through non-negative penalized linear regression. NeurIPS.

ot.regpath.compute_transport_plan(gamma, gamma_list, Pi_list)[source]

Given the regularization path, this function computes the transport plan for any value of gamma thanks to the piecewise linearity of the path.

$t(\gamma) = \phi(\gamma) - \gamma \delta(\gamma)$

where:

• $$\gamma$$ is the regularization parameter

• $$\phi(\gamma)$$ is the corresponding intercept

• $$\delta(\gamma)$$ is the corresponding slope

• $$\mathbf{t}$$ is the flattened version of the transport matrix

Parameters
• gamma (float) – Regularization coefficient

• gamma_list (list) – List of regularization parameters of the regularization path

• Pi_list (list) – List of all the solutions of the regularization path

Returns

t – Vectorization of the transport plan corresponding to the given value of gamma

Return type

np.ndarray (dim_a*dim_b, )

Examples

>>> import ot
>>> import numpy as np
>>> n = 3
>>> xs = np.array([1., 2., 3.]).reshape((n, 1))
>>> xt = np.array([5., 6., 7.]).reshape((n, 1))
>>> C = ot.dist(xs, xt)
>>> C /= C.max()
>>> a = np.array([0.2, 0.5, 0.3])
>>> b = np.array([0.2, 0.5, 0.3])
>>> t, pi_list, g_list = ot.regpath.regularization_path(a, b, C, reg=1e-4)
>>> gamma = 1
>>> t2 = ot.regpath.compute_transport_plan(gamma, g_list, pi_list)
>>> t2
array([0.        , 0.        , 0.        , 0.19722222, 0.05555556,
0.        , 0.        , 0.24722222, 0.        ])


References

41

Chapel, L., Flamary, R., Wu, H., Févotte, C., and Gasso, G. (2021). Unbalanced optimal transport through non-negative penalized linear regression. NeurIPS.

### Examples using ot.regpath.compute_transport_plan

ot.regpath.construct_augmented_H(active_index, m, Hc, HrHr)[source]

This function constructs an augmented matrix for the first iteration of the semi-relaxed regularization path

$\begin{split}\text{Augmented}_H = \begin{bmatrix} 0 & H_{c A} \\ H_{c A}^T & H_{r A}^T H_{r A} \end{bmatrix}\end{split}$

where :

• $$H_{r A}$$ is the sub-matrix constructed by the columns of $$H_r$$ whose indices belong to the active set A

• $$H_{c A}$$ is the sub-matrix constructed by the columns of $$H_c$$ whose indices belong to the active set A

Parameters
• active_index (list) – Indices of the active variables

• m (int) – Length of the target distribution

• Hc (np.ndarray (dim_b, dim_a * dim_b)) – Matrix that computes the sum along the columns of the transport plan $$T$$

• HrHr (np.ndarray (dim_a * dim_b, dim_a * dim_b)) – Matrix product of $$H_r^T H_r$$

Returns

H_augmented – Augmented matrix for the first iteration of the semi-relaxed regularization path

Return type

np.ndarray (dim_b + size(A), dim_b + size(A))

ot.regpath.fully_relaxed_path(a: numpy.array, b: numpy.array, C: numpy.array, reg=0.0001, itmax=50000)[source]

This function gives the regularization path of l2-penalized UOT problem

The problem to optimize is the Lasso reformulation of the l2-penalized UOT:

\begin{align}\begin{aligned}\min_t \gamma \mathbf{c}^T \mathbf{t} + 0.5 * \|{H} \mathbf{t} - \mathbf{y}\|_2^2\\s.t. \mathbf{t} \geq 0\end{aligned}\end{align}

where :

• $$\mathbf{c}$$ is the flattened version of the cost matrix $${C}$$

• $$\gamma = 1/\lambda$$ is the l2-regularization coefficient

• $$\mathbf{y}$$ is the concatenation of vectors $$\mathbf{a}$$ and $$\mathbf{b}$$, defined as $$\mathbf{y}^T = [\mathbf{a}^T \mathbf{b}^T]$$

• $${H}$$ is a design matrix, see [41] for the design of $${H}$$. The matrix product $$H\mathbf{t}$$ computes both the source marginal and the target marginals.

• $$\mathbf{t}$$ is the flattened version of the transport matrix

Parameters
• a (np.ndarray (dim_a,)) – Histogram of dimension dim_a

• b (np.ndarray (dim_b,)) – Histogram of dimension dim_b

• C (np.ndarray, shape (dim_a, dim_b)) – Cost matrix

• reg (float) – l2-regularization coefficient

• itmax (int) – Maximum number of iteration

Returns

• t (np.ndarray (dim_a*dim_b, )) – Flattened vector of the optimal transport matrix

• t_list (list) – List of solutions in the regularization path

• gamma_list (list) – List of regularization coefficients in the regularization path

Examples

>>> import ot
>>> import numpy as np
>>> n = 3
>>> xs = np.array([1., 2., 3.]).reshape((n, 1))
>>> xt = np.array([5., 6., 7.]).reshape((n, 1))
>>> C = ot.dist(xs, xt)
>>> C /= C.max()
>>> a = np.array([0.2, 0.5, 0.3])
>>> b = np.array([0.2, 0.5, 0.3])
>>> t, _, _ = ot.regpath.fully_relaxed_path(a, b, C, 1e-4)
>>> t
array([1.99958333e-01, 0.00000000e+00, 0.00000000e+00, 3.88888889e-05,
4.99938889e-01, 0.00000000e+00, 0.00000000e+00, 3.88888889e-05,
2.99958333e-01])


References

41

Chapel, L., Flamary, R., Wu, H., Févotte, C., and Gasso, G. (2021). Unbalanced optimal transport through non-negative penalized linear regression. NeurIPS.

ot.regpath.ot_next_gamma(phi, delta, HtH, Hty, c, active_index, current_gamma)[source]

This function computes the next value of gamma if a variable is added in the next iteration of the regularization path.

We look for the largest value of gamma such that the gradient of an inactive variable vanishes

$\max_{i \in \bar{A}} \frac{\mathbf{h}_i^T(H_A \phi - \mathbf{y})} {\mathbf{h}_i^T H_A \delta - \mathbf{c}_i}$

where :

• A is the current active set

• $$\mathbf{h}_i$$ is the $$i$$ th column of the design matrix $${H}$$

• $${H}_A$$ is the sub-matrix constructed by the columns of $${H}$$ whose indices belong to the active set A

• $$\mathbf{c}_i$$ is the $$i$$ th element of the cost vector $$\mathbf{c}$$

• $$\mathbf{y}$$ is the concatenation of the source and target distributions

• $$\phi$$ is the intercept of the solutions at the current iteration

• $$\delta$$ is the slope of the solutions at the current iteration

Parameters
• phi (np.ndarray (size(A), )) – Intercept of the solutions at the current iteration

• delta (np.ndarray (size(A), )) – Slope of the solutions at the current iteration

• HtH (np.ndarray (dim_a * dim_b, dim_a * dim_b)) – Matrix product of $${H}^T {H}$$

• Hty (np.ndarray (dim_a + dim_b, )) – Matrix product of $${H}^T \mathbf{y}$$

• c (np.ndarray (dim_a * dim_b, )) – Flattened array of the cost matrix $${C}$$

• active_index (list) – Indices of active variables

• current_gamma (float) – Value of the regularization parameter at the beginning of the current iteration

Returns

• next_gamma (float) – Value of gamma if a variable is added to active set in next iteration

• next_active_index (int) – Index of variable to be activated

References

41

Chapel, L., Flamary, R., Wu, H., Févotte, C., and Gasso, G. (2021). Unbalanced optimal transport through non-negative penalized linear regression. NeurIPS.

ot.regpath.recast_ot_as_lasso(a, b, C)[source]

This function recasts the l2-penalized UOT problem as a Lasso problem.

Recall the l2-penalized UOT problem defined in [41]

\begin{align}\begin{aligned}\text{UOT}_{\lambda} = \min_T <C, T> + \lambda \|T 1_m - \mathbf{a}\|_2^2 + \lambda \|T^T 1_n - \mathbf{b}\|_2^2\\s.t. T \geq 0\end{aligned}\end{align}

where :

• $$C$$ is the cost matrix

• $$\lambda$$ is the l2-regularization parameter

• $$\mathbf{a}$$ and $$\mathbf{b}$$ are the source and target distributions

• $$T$$ is the transport plan to optimize

The problem above can be reformulated as a non-negative penalized linear regression problem, particularly Lasso

\begin{align}\begin{aligned}\text{UOT2}_{\lambda} = \min_{\mathbf{t}} \gamma \mathbf{c}^T \mathbf{t} + 0.5 * \|H \mathbf{t} - \mathbf{y}\|_2^2\\s.t. \mathbf{t} \geq 0\end{aligned}\end{align}

where :

• $$\mathbf{c}$$ is the flattened version of the cost matrix $$C$$

• $$\mathbf{y}$$ is the concatenation of vectors $$\mathbf{a}$$ and $$\mathbf{b}$$

• $$H$$ is a metric matrix, see [41] for the design of $$H$$. The matrix product $$H\mathbf{t}$$ computes both the source marginal and the target marginals.

• $$\mathbf{t}$$ is the flattened version of the transport plan $$T$$

Parameters
• a (np.ndarray (dim_a,)) – Histogram of dimension dim_a

• b (np.ndarray (dim_b,)) – Histogram of dimension dim_b

• C (np.ndarray, shape (dim_a, dim_b)) – Cost matrix

Returns

• H (np.ndarray (dim_a+dim_b, dim_a*dim_b)) – Design matrix that contains only 0 and 1

• y (np.ndarray (ns + nt, )) – Concatenation of histograms $$\mathbf{a}$$ and $$\mathbf{b}$$

• c (np.ndarray (ns * nt, )) – Flattened array of the cost matrix

Examples

>>> import ot
>>> a = np.array([0.2, 0.3, 0.5])
>>> b = np.array([0.1, 0.9])
>>> C = np.array([[16., 25.], [28., 16.], [40., 36.]])
>>> H, y, c = ot.regpath.recast_ot_as_lasso(a, b, C)
>>> H.toarray()
array([[1., 1., 0., 0., 0., 0.],
[0., 0., 1., 1., 0., 0.],
[0., 0., 0., 0., 1., 1.],
[1., 0., 1., 0., 1., 0.],
[0., 1., 0., 1., 0., 1.]])
>>> y
array([0.2, 0.3, 0.5, 0.1, 0.9])
>>> c
array([16., 25., 28., 16., 40., 36.])


References

41

Chapel, L., Flamary, R., Wu, H., Févotte, C., and Gasso, G. (2021). Unbalanced optimal transport through non-negative penalized linear regression. NeurIPS.

ot.regpath.recast_semi_relaxed_as_lasso(a, b, C)[source]

This function recasts the semi-relaxed l2-UOT problem as Lasso problem.

\begin{align}\begin{aligned}\text{semi-relaxed UOT} = \min_T <C, T> + \lambda \|T 1_m - \mathbf{a}\|_2^2\\s.t. T^T 1_n = \mathbf{b}\\ \mathbf{t} \geq 0\end{aligned}\end{align}

where :

• $$C$$ is the metric cost matrix

• $$\lambda$$ is the l2-regularization parameter

• $$\mathbf{a}$$ and $$\mathbf{b}$$ are the source and target distributions

• $$T$$ is the transport plan to optimize

The problem above can be reformulated as follows

\begin{align}\begin{aligned}\text{semi-relaxed UOT2} = \min_t \gamma \mathbf{c}^T t + 0.5 * \|H_r \mathbf{t} - \mathbf{a}\|_2^2\\s.t. H_c \mathbf{t} = \mathbf{b}\\ \mathbf{t} \geq 0\end{aligned}\end{align}

where :

• $$\mathbf{c}$$ is flattened version of the cost matrix $$C$$

• $$\gamma = 1/\lambda$$ is the l2-regularization parameter

• $$H_r$$ is a metric matrix which computes the sum along the rows of the transport plan $$T$$

• $$H_c$$ is a metric matrix which computes the sum along the columns of the transport plan $$T$$

• $$\mathbf{t}$$ is the flattened version of $$T$$

Parameters
• a (np.ndarray (dim_a,)) – Histogram of dimension dim_a

• b (np.ndarray (dim_b,)) – Histogram of dimension dim_b

• C (np.ndarray, shape (dim_a, dim_b)) – Cost matrix

Returns

• Hr (np.ndarray (dim_a, dim_a * dim_b)) – Auxiliary matrix constituted by 0 and 1, which computes the sum along the rows of transport plan $$T$$

• Hc (np.ndarray (dim_b, dim_a * dim_b)) – Auxiliary matrix constituted by 0 and 1, which computes the sum along the columns of transport plan $$T$$

• c (np.ndarray (ns * nt, )) – Flattened array of the cost matrix

Examples

>>> import ot
>>> a = np.array([0.2, 0.3, 0.5])
>>> b = np.array([0.1, 0.9])
>>> C = np.array([[16., 25.], [28., 16.], [40., 36.]])
>>> Hr,Hc,c = ot.regpath.recast_semi_relaxed_as_lasso(a, b, C)
>>> Hr.toarray()
array([[1., 1., 0., 0., 0., 0.],
[0., 0., 1., 1., 0., 0.],
[0., 0., 0., 0., 1., 1.]])
>>> Hc.toarray()
array([[1., 0., 1., 0., 1., 0.],
[0., 1., 0., 1., 0., 1.]])
>>> c
array([16., 25., 28., 16., 40., 36.])

ot.regpath.regularization_path(a: numpy.array, b: numpy.array, C: numpy.array, reg=0.0001, semi_relaxed=False, itmax=50000)[source]

This function provides all the solutions of the regularization path of the l2-UOT problem [41].

The problem to optimize is the Lasso reformulation of the l2-penalized UOT:

\begin{align}\begin{aligned}\min_t \gamma \mathbf{c}^T \mathbf{t} + 0.5 * \|{H} \mathbf{t} - \mathbf{y}\|_2^2\\s.t. \mathbf{t} \geq 0\end{aligned}\end{align}

where :

• $$\mathbf{c}$$ is the flattened version of the cost matrix $${C}$$

• $$\gamma = 1/\lambda$$ is the l2-regularization coefficient

• $$\mathbf{y}$$ is the concatenation of vectors $$\mathbf{a}$$ and $$\mathbf{b}$$, defined as $$\mathbf{y}^T = [\mathbf{a}^T \mathbf{b}^T]$$

• $${H}$$ is a design matrix, see [41] for the design of $${H}$$. The matrix product $$H\mathbf{t}$$ computes both the source marginal and the target marginals.

• $$\mathbf{t}$$ is the flattened version of the transport matrix

For the semi-relaxed problem, it optimizes the Lasso reformulation of the l2-penalized UOT:

\begin{align}\begin{aligned}\min_t \gamma \mathbf{c}^T \mathbf{t} + 0.5 * \|H_r \mathbf{t} - \mathbf{a}\|_2^2\\s.t. H_c \mathbf{t} = \mathbf{b}\\ \mathbf{t} \geq 0\end{aligned}\end{align}
Parameters
• a (np.ndarray (dim_a,)) – Histogram of dimension dim_a

• b (np.ndarray (dim_b,)) – Histogram of dimension dim_b

• C (np.ndarray, shape (dim_a, dim_b)) – Cost matrix

• reg (float (optional)) – l2-regularization coefficient

• semi_relaxed (bool (optional)) – Give the semi-relaxed path if True

• itmax (int (optional)) – Maximum number of iteration

Returns

• t (np.ndarray (dim_a*dim_b, )) – Flattened vector of the (unregularized) optimal transport matrix

• t_list (list) – List of all the optimal transport vectors of the regularization path

• gamma_list (list) – List of the regularization parameters in the path

References

41

Chapel, L., Flamary, R., Wu, H., Févotte, C., and Gasso, G. (2021). Unbalanced optimal transport through non-negative penalized linear regression. NeurIPS.

### Examples using ot.regpath.regularization_path

ot.regpath.semi_relaxed_next_gamma(phi, delta, phi_u, delta_u, HrHr, Hc, Hra, c, active_index, current_gamma)[source]

This function computes the next value of gamma when a variable is active in the regularization path of semi-relaxed UOT.

By taking the Lagrangian form of the problem, we obtain a similar update as the two-sided relaxed UOT

$\max_{i \in \bar{A}} \frac{\mathbf{h}_{ri}^T(H_{rA} \phi - \mathbf{a}) + \mathbf{h}_{c i}^T\phi_u}{\mathbf{h}_{r i}^T H_{r A} \delta + \ \mathbf{h}_{c i} \delta_u - \mathbf{c}_i}$

where :

• A is the current active set

• $$\mathbf{h}_{r i}$$ is the ith column of the matrix $$H_r$$

• $$\mathbf{h}_{c i}$$ is the ith column of the matrix $$H_c$$

• $$H_{r A}$$ is the sub-matrix constructed by the columns of $$H_r$$ whose indices belong to the active set A

• $$\mathbf{c}_i$$ is the $$i$$ th element of cost vector $$\mathbf{c}$$

• $$\phi$$ is the intercept of the solutions in current iteration

• $$\delta$$ is the slope of the solutions in current iteration

• $$\phi_u$$ is the intercept of Lagrange parameter at the current iteration

• $$\delta_u$$ is the slope of Lagrange parameter at the current iteration

Parameters
• phi (np.ndarray (size(A), )) – Intercept of the solutions at the current iteration

• delta (np.ndarray (size(A), )) – Slope of the solutions at the current iteration

• phi_u (np.ndarray (dim_b, )) – Intercept of the Lagrange parameter at the current iteration

• delta_u (np.ndarray (dim_b, )) – Slope of the Lagrange parameter at the current iteration

• HrHr (np.ndarray (dim_a * dim_b, dim_a * dim_b)) – Matrix product of $$H_r^T H_r$$

• Hc (np.ndarray (dim_b, dim_a * dim_b)) – Matrix that computes the sum along the columns of the transport plan $$T$$

• Hra (np.ndarray (dim_a * dim_b, )) – Matrix product of $$H_r^T \mathbf{a}$$

• c (np.ndarray (dim_a * dim_b, )) – Flattened array of cost matrix $$C$$

• active_index (list) – Indices of active variables

• current_gamma (float) – Value of regularization coefficient at the start of current iteration

Returns

• next_gamma (float) – Value of gamma if a variable is added to active set in next iteration

• next_active_index (int) – Index of variable to be activated

References

41

Chapel, L., Flamary, R., Wu, H., Févotte, C., and Gasso, G. (2021). Unbalanced optimal transport through non-negative penalized linear regression. NeurIPS.

ot.regpath.semi_relaxed_path(a: numpy.array, b: numpy.array, C: numpy.array, reg=0.0001, itmax=50000)[source]

This function gives the regularization path of semi-relaxed l2-UOT problem.

The problem to optimize is the Lasso reformulation of the l2-penalized UOT:

\begin{align}\begin{aligned}\min_t \gamma \mathbf{c}^T t + 0.5 * \|H_r \mathbf{t} - \mathbf{a}\|_2^2\\s.t. H_c \mathbf{t} = \mathbf{b}\\ \mathbf{t} \geq 0\end{aligned}\end{align}

where :

• $$\mathbf{c}$$ is the flattened version of the cost matrix $$C$$

• $$\gamma = 1/\lambda$$ is the l2-regularization parameter

• $$H_r$$ is a matrix that computes the sum along the rows of the transport plan $$T$$

• $$H_c$$ is a matrix that computes the sum along the columns of the transport plan $$T$$

• $$\mathbf{t}$$ is the flattened version of the transport plan $$T$$

Parameters
• a (np.ndarray (dim_a,)) – Histogram of dimension dim_a

• b (np.ndarray (dim_b,)) – Histogram of dimension dim_b

• C (np.ndarray, shape (dim_a, dim_b)) – Cost matrix

• reg (float (optional)) – l2-regularization coefficient

• itmax (int (optional)) – Maximum number of iteration

Returns

• t (np.ndarray (dim_a*dim_b, )) – Flattened vector of the (unregularized) optimal transport matrix

• t_list (list) – List of all the optimal transport vectors of the regularization path

• gamma_list (list) – List of the regularization parameters in the path

Examples

>>> import ot
>>> import numpy as np
>>> n = 3
>>> xs = np.array([1., 2., 3.]).reshape((n, 1))
>>> xt = np.array([5., 6., 7.]).reshape((n, 1))
>>> C = ot.dist(xs, xt)
>>> C /= C.max()
>>> a = np.array([0.2, 0.5, 0.3])
>>> b = np.array([0.2, 0.5, 0.3])
>>> t, _, _ = ot.regpath.semi_relaxed_path(a, b, C, 1e-4)
>>> t
array([1.99980556e-01, 0.00000000e+00, 0.00000000e+00, 1.94444444e-05,
4.99980556e-01, 0.00000000e+00, 0.00000000e+00, 1.94444444e-05,
3.00000000e-01])


References

41

Chapel, L., Flamary, R., Wu, H., Févotte, C., and Gasso, G. (2021). Unbalanced optimal transport through non-negative penalized linear regression. NeurIPS.