ot.unbalanced

Solvers related to Unbalanced Optimal Transport problems.

ot.unbalanced.barycenter_unbalanced(A, M, reg, reg_m, method='sinkhorn', weights=None, numItermax=1000, stopThr=1e-06, verbose=False, log=False, **kwargs)[source]

Compute the entropic unbalanced Wasserstein barycenter of \(\mathbf{A}\).

The function solves the following optimization problem with respect to \(\mathbf{a}\):

\[\mathbf{a} = \mathop{\arg \min}_\mathbf{a} \quad \sum_i W_{u_{reg}}(\mathbf{a},\mathbf{a}_i)\]

where :

  • \(W_{u_{reg}}(\cdot,\cdot)\) is the unbalanced entropic regularized Wasserstein distance (see ot.unbalanced.sinkhorn_unbalanced())

  • \(\mathbf{a}_i\) are training distributions in the columns of matrix \(\mathbf{A}\)

  • reg and \(\mathbf{M}\) are respectively the regularization term and the cost matrix for OT

  • reg_m is the marginal relaxation hyperparameter

The algorithm used for solving the problem is the generalized Sinkhorn-Knopp matrix scaling algorithm as proposed in [10]

Parameters:
  • A (array-like, shape (dim, n_hists)) – n_hists training distributions \(\mathbf{a}_i\) of dimension dim

  • M (array-like, shape (dim, dim)) – ground metric matrix for OT.

  • reg (float) – Entropy regularization term > 0

  • reg_m (float) – Marginal relaxation term > 0

  • weights (array-like, shape (n_hists,), optional) – Weight of each distribution (barycentric coordinates). If None, uniform weights are used.

  • numItermax (int, optional) – Max number of iterations

  • stopThr (float, optional) – Stop threshold on error (> 0)

  • verbose (bool, optional) – Print information along iterations

  • log (bool, optional) – record log if True

Returns:

  • a (array-like, shape (dim,)) – Unbalanced Wasserstein barycenter

  • log (dict) – log dictionary returned only if log is True
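
Examples

A minimal usage sketch (not part of the original documentation); the histograms, cost matrix and regularization values are illustrative, and it assumes the usual POT convention that log=True makes the function return the barycenter together with the log dictionary:

>>> import ot
>>> import numpy as np
>>> A = np.array([[0.7, 0.4], [0.3, 0.6]])  # each column is one input histogram (dim=2, n_hists=2)
>>> M = np.array([[0.0, 1.0], [1.0, 0.0]])  # ground cost between the two bins
>>> bary = ot.unbalanced.barycenter_unbalanced(A, M, reg=0.1, reg_m=1.0)  # barycenter, shape (dim,)
>>> bary, log = ot.unbalanced.barycenter_unbalanced(A, M, reg=0.1, reg_m=1.0, log=True)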

References

ot.unbalanced.barycenter_unbalanced_sinkhorn(A, M, reg, reg_m, weights=None, numItermax=1000, stopThr=1e-06, verbose=False, log=False)[source]

Compute the entropic unbalanced Wasserstein barycenter of \(\mathbf{A}\).

The function solves the following optimization problem with respect to \(\mathbf{a}\):

\[\mathbf{a} = \mathop{\arg \min}_\mathbf{a} \quad \sum_i W_{u_{reg}}(\mathbf{a},\mathbf{a}_i)\]

where :

  • \(W_{u_{reg}}(\cdot,\cdot)\) is the unbalanced entropic regularized Wasserstein distance (see ot.unbalanced.sinkhorn_unbalanced())

  • \(\mathbf{a}_i\) are training distributions in the columns of matrix \(\mathbf{A}\)

  • reg and \(\mathbf{M}\) are respectively the regularization term and the cost matrix for OT

  • reg_m is the marginal relaxation hyperparameter

The algorithm used for solving the problem is the generalized Sinkhorn-Knopp matrix scaling algorithm as proposed in [10]

Parameters:
  • A (array-like, shape (dim, n_hists)) – n_hists training distributions \(\mathbf{a}_i\) of dimension dim

  • M (array-like, shape (dim, dim)) – ground metric matrix for OT.

  • reg (float) – Entropy regularization term > 0

  • reg_m (float) – Marginal relaxation term > 0

  • weights (array-like, shape (n_hists,), optional) – Weight of each distribution (barycentric coordinates). If None, uniform weights are used.

  • numItermax (int, optional) – Max number of iterations

  • stopThr (float, optional) – Stop threshold on error (> 0)

  • verbose (bool, optional) – Print information along iterations

  • log (bool, optional) – record log if True

Returns:

  • a (array-like, shape (dim,)) – Unbalanced Wasserstein barycenter

  • log (dict) – log dictionary returned only if log is True

References

ot.unbalanced.barycenter_unbalanced_stabilized(A, M, reg, reg_m, weights=None, tau=1000.0, numItermax=1000, stopThr=1e-06, verbose=False, log=False)[source]

Compute the entropic unbalanced Wasserstein barycenter of \(\mathbf{A}\) with stabilization.

The function solves the following optimization problem:

\[\mathbf{a} = \mathop{\arg \min}_\mathbf{a} \quad \sum_i W_{u_{reg}}(\mathbf{a},\mathbf{a}_i)\]

where :

  • \(W_{u_{reg}}(\cdot,\cdot)\) is the unbalanced entropic regularized Wasserstein distance (see ot.unbalanced.sinkhorn_unbalanced())

  • \(\mathbf{a}_i\) are training distributions in the columns of matrix \(\mathbf{A}\)

  • reg and \(\mathbf{M}\) are respectively the regularization term and the cost matrix for OT

  • reg_m is the marginal relaxation hyperparameter

The algorithm used for solving the problem is the generalized Sinkhorn-Knopp matrix scaling algorithm as proposed in [10]

Parameters:
  • A (array-like, shape (dim, n_hists)) – n_hists training distributions \(\mathbf{a}_i\) of dimension dim

  • M (array-like, shape (dim, dim)) – ground metric matrix for OT.

  • reg (float) – Entropy regularization term > 0

  • reg_m (float) – Marginal relaxation term > 0

  • tau (float) – Stabilization threshold for log domain absorption.

  • weights (array-like, shape (n_hists,), optional) – Weight of each distribution (barycentric coordinates). If None, uniform weights are used.

  • numItermax (int, optional) – Max number of iterations

  • stopThr (float, optional) – Stop threshold on error (> 0)

  • verbose (bool, optional) – Print information along iterations

  • log (bool, optional) – record log if True

Returns:

  • a (array-like, shape (dim,)) – Unbalanced Wasserstein barycenter

  • log (dict) – log dictionary returned only if log is True
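
Examples

A minimal usage sketch (not part of the original documentation) showing the stabilized variant with an explicit log-domain absorption threshold tau; all numerical values are illustrative:

>>> import ot
>>> import numpy as np
>>> A = np.array([[0.7, 0.4], [0.3, 0.6]])  # each column is one input histogram (dim=2, n_hists=2)
>>> M = np.array([[0.0, 1.0], [1.0, 0.0]])  # ground cost matrix
>>> bary = ot.unbalanced.barycenter_unbalanced_stabilized(A, M, reg=0.05, reg_m=1.0, tau=100.0)  # shape (dim,)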

References

ot.unbalanced.lbfgsb_unbalanced(a, b, M, reg, reg_m, c=None, reg_div='kl', regm_div='kl', G0=None, numItermax=1000, stopThr=1e-15, method='L-BFGS-B', verbose=False, log=False)[source]

Solve the unbalanced optimal transport problem and return the OT plan using L-BFGS-B algorithm. The function solves the following optimization problem:

\[ \begin{align}\begin{aligned}W = \arg \min_\gamma \quad \langle \gamma, \mathbf{M} \rangle_F + \mathrm{reg} \mathrm{div}(\gamma, \mathbf{c}) + \mathrm{reg_{m1}} \cdot \mathrm{div_m}(\gamma \mathbf{1}, \mathbf{a}) + \mathrm{reg_{m2}} \cdot \mathrm{div_m}(\gamma^T \mathbf{1}, \mathbf{b})\\s.t. \gamma \geq 0\end{aligned}\end{align} \]

where:

  • \(\mathbf{M}\) is the (dim_a, dim_b) metric cost matrix

  • \(\mathbf{a}\) and \(\mathbf{b}\) are source and target unbalanced distributions

  • \(\mathbf{c}\) is a reference distribution for the regularization

  • \(\mathrm{div_m}\) is a divergence, either Kullback-Leibler, half-squared \(\ell_2\), or Total Variation divergence

  • \(\mathrm{div}\) is a divergence, either Kullback-Leibler or half-squared \(\ell_2\) divergence

Note

This function is backend-compatible and will work on arrays from all compatible backends. First, it converts all arrays into Numpy arrays, then uses the L-BFGS-B algorithm from scipy.optimize to solve the optimization problem.

Parameters:
  • a (array-like (dim_a,)) – Unnormalized histogram of dimension dim_a. If a is an empty list or array ([]), then a is set to the uniform distribution.

  • b (array-like (dim_b,)) – Unnormalized histogram of dimension dim_b. If b is an empty list or array ([]), then b is set to the uniform distribution.

  • M (array-like (dim_a, dim_b)) – loss matrix

  • reg (float) – regularization term >=0

  • reg_m (float or indexable object of length 1 or 2) – Marginal relaxation term: nonnegative (including 0) but cannot be infinity. If \(\mathrm{reg_{m}}\) is a scalar or an indexable object of length 1, then the same \(\mathrm{reg_{m}}\) is applied to both marginal relaxations. If \(\mathrm{reg_{m}}\) is an array, it must be a Numpy array.

  • c (array-like (dim_a, dim_b), optional (default = None)) – Reference measure for the regularization. If None, then use \(\mathbf{c} = \mathbf{a} \mathbf{b}^T\).

  • reg_div (string or pair of callable functions, optional (default = 'kl')) – Divergence used for regularization. Can take three values: ‘entropy’ (negative entropy), or ‘kl’ (Kullback-Leibler) or ‘l2’ (half-squared) or a tuple of two callable functions returning the reg term and its derivative. Note that the callable functions should be able to handle Numpy arrays and not tensors from the backend, otherwise functions will be converted to Numpy leading to a computational overhead.

  • regm_div (string, optional (default = 'kl')) – Divergence to quantify the difference between the marginals. Can take three values: ‘kl’ (Kullback-Leibler) or ‘l2’ (half-squared) or ‘tv’ (Total Variation)

  • G0 (array-like (dim_a, dim_b), optional (default = None)) – Initialization of the transport matrix. None corresponds to uniform product.

  • numItermax (int, optional) – Max number of iterations

  • stopThr (float, optional) – Stop threshold on error (> 0)

  • verbose (bool, optional) – Print information along iterations

  • log (bool, optional) – record log if True

Returns:

  • gamma ((dim_a, dim_b) array-like) – Optimal transportation matrix for the given parameters

  • log (dict) – log dictionary returned only if log is True

Examples

>>> import ot
>>> import numpy as np
>>> a=[.5, .5]
>>> b=[.5, .5]
>>> M=[[1., 36.],[9., 4.]]
>>> np.round(ot.unbalanced.lbfgsb_unbalanced(a, b, M, reg=0, reg_m=5, reg_div='kl', regm_div='kl'), 2)
array([[0.45, 0.  ],
       [0.  , 0.34]])
>>> np.round(ot.unbalanced.lbfgsb_unbalanced(a, b, M, reg=0, reg_m=5, reg_div='l2', regm_div='l2'), 2)
array([[0.4, 0. ],
       [0. , 0.1]])
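
A further sketch (not part of the original documentation) showing two documented options: passing reg_m as a pair to relax the two marginals differently, and retrieving the log dictionary. It assumes the usual POT convention that log=True makes the solver return the plan together with a log dict; the values are illustrative:

>>> G = ot.unbalanced.lbfgsb_unbalanced(a, b, M, reg=0, reg_m=(5, 50), regm_div='kl')  # stronger penalty on the second marginal
>>> G, log = ot.unbalanced.lbfgsb_unbalanced(a, b, M, reg=0, reg_m=5, log=True)        # log holds solver information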

References

See also

ot.lp.emd2

Unregularized OT loss

ot.unbalanced.sinkhorn_unbalanced2

Entropic regularized OT loss

ot.unbalanced.lbfgsb_unbalanced2(a, b, M, reg, reg_m, c=None, reg_div='kl', regm_div='kl', G0=None, returnCost='linear', numItermax=1000, stopThr=1e-15, method='L-BFGS-B', verbose=False, log=False)[source]

Solve the unbalanced optimal transport problem and return the OT cost using L-BFGS-B. The function solves the following optimization problem:

\[ \begin{align}\begin{aligned}\min_\gamma \quad \langle \gamma, \mathbf{M} \rangle_F + \mathrm{reg} \mathrm{div}(\gamma, \mathbf{c}) + \mathrm{reg_{m1}} \cdot \mathrm{div_m}(\gamma \mathbf{1}, \mathbf{a}) + \mathrm{reg_{m2}} \cdot \mathrm{div_m}(\gamma^T \mathbf{1}, \mathbf{b})\\s.t. \gamma \geq 0\end{aligned}\end{align} \]

where:

  • \(\mathbf{M}\) is the (dim_a, dim_b) metric cost matrix

  • \(\mathbf{a}\) and \(\mathbf{b}\) are source and target unbalanced distributions

  • \(\mathbf{c}\) is a reference distribution for the regularization

  • \(\mathrm{div_m}\) is a divergence, either Kullback-Leibler, half-squared \(\ell_2\), or Total Variation divergence

  • \(\mathrm{div}\) is a divergence, either Kullback-Leibler or half-squared \(\ell_2\) divergence

Note

This function is backend-compatible and will work on arrays from all compatible backends. First, it converts all arrays into Numpy arrays, then uses the L-BFGS-B algorithm from scipy.optimize to solve the optimization problem.

Parameters:
  • a (array-like (dim_a,)) – Unnormalized histogram of dimension dim_a. If a is an empty list or array ([]), then a is set to the uniform distribution.

  • b (array-like (dim_b,)) – Unnormalized histogram of dimension dim_b. If b is an empty list or array ([]), then b is set to the uniform distribution.

  • M (array-like (dim_a, dim_b)) – loss matrix

  • reg (float) – regularization term >=0

  • reg_m (float or indexable object of length 1 or 2) – Marginal relaxation term: nonnegative (including 0) but cannot be infinity. If \(\mathrm{reg_{m}}\) is a scalar or an indexable object of length 1, then the same \(\mathrm{reg_{m}}\) is applied to both marginal relaxations. If \(\mathrm{reg_{m}}\) is an array, it must be a Numpy array.

  • c (array-like (dim_a, dim_b), optional (default = None)) – Reference measure for the regularization. If None, then use \(\mathbf{c} = \mathbf{a} \mathbf{b}^T\).

  • reg_div (string or pair of callable functions, optional (default = 'kl')) – Divergence used for regularization. Can take three values: ‘entropy’ (negative entropy), or ‘kl’ (Kullback-Leibler) or ‘l2’ (half-squared) or a tuple of two callable functions returning the reg term and its derivative. Note that the callable functions should be able to handle Numpy arrays and not tensors from the backend, otherwise functions will be converted to Numpy leading to a computational overhead.

  • regm_div (string, optional (default = 'kl')) – Divergence to quantify the difference between the marginals. Can take three values: ‘kl’ (Kullback-Leibler) or ‘l2’ (half-squared) or ‘tv’ (Total Variation)

  • G0 (array-like (dim_a, dim_b), optional (default = None)) – Initialization of the transport matrix. None corresponds to uniform product.

  • returnCost (string, optional (default = "linear")) – If returnCost = “linear”, then return the linear part of the unbalanced OT loss. If returnCost = “total”, then return the total unbalanced OT loss.

  • numItermax (int, optional) – Max number of iterations

  • stopThr (float, optional) – Stop threshold on error (> 0)

  • verbose (bool, optional) – Print information along iterations

  • log (bool, optional) – record log if True

Returns:

  • ot_cost (array-like) – the OT cost between \(\mathbf{a}\) and \(\mathbf{b}\)

  • log (dict) – log dictionary returned only if log is True

Examples

>>> import ot
>>> import numpy as np
>>> a=[.5, .5]
>>> b=[.5, .5]
>>> M=[[1., 36.],[9., 4.]]
>>> np.round(ot.unbalanced.lbfgsb_unbalanced2(a, b, M, reg=0, reg_m=5, reg_div='kl', regm_div='kl'), 2)
1.79
>>> np.round(ot.unbalanced.lbfgsb_unbalanced2(a, b, M, reg=0, reg_m=5, reg_div='l2', regm_div='l2'), 2)
0.8
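
A further sketch (not part of the original documentation) illustrating the returnCost option documented above: by default only the linear part \(\langle \gamma, \mathbf{M} \rangle_F\) is returned, while "total" also includes the relaxation (and regularization) terms. The values are illustrative:

>>> cost_lin = ot.unbalanced.lbfgsb_unbalanced2(a, b, M, reg=0, reg_m=5, returnCost='linear')
>>> cost_tot = ot.unbalanced.lbfgsb_unbalanced2(a, b, M, reg=0, reg_m=5, returnCost='total')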

References

See also

ot.lp.emd2

Unregularized OT loss

ot.unbalanced.sinkhorn_unbalanced2

Entropic regularized OT loss

ot.unbalanced.mm_unbalanced(a, b, M, reg_m, c=None, reg=0, div='kl', G0=None, numItermax=1000, stopThr=1e-15, verbose=False, log=False)[source]

Solve the unbalanced optimal transport problem and return the OT plan. The function solves the following optimization problem:

\[ \begin{align}\begin{aligned}W = \arg \min_\gamma \quad \langle \gamma, \mathbf{M} \rangle_F + \mathrm{reg_{m1}} \cdot \mathrm{div}(\gamma \mathbf{1}, \mathbf{a}) + \mathrm{reg_{m2}} \cdot \mathrm{div}(\gamma^T \mathbf{1}, \mathbf{b}) + \mathrm{reg} \cdot \mathrm{div}(\gamma, \mathbf{c})\\s.t. \gamma \geq 0\end{aligned}\end{align} \]

where:

  • \(\mathbf{M}\) is the (dim_a, dim_b) metric cost matrix

  • \(\mathbf{a}\) and \(\mathbf{b}\) are source and target unbalanced distributions

  • \(\mathbf{c}\) is a reference distribution for the regularization

  • div is a divergence, either Kullback-Leibler or half-squared \(\ell_2\) divergence

The algorithm used for solving the problem is a maximization-minimization algorithm as proposed in [41]

Parameters:
  • a (array-like (dim_a,)) – Unnormalized histogram of dimension dim_a. If a is an empty list or array ([]), then a is set to the uniform distribution.

  • b (array-like (dim_b,)) – Unnormalized histogram of dimension dim_b. If b is an empty list or array ([]), then b is set to the uniform distribution.

  • M (array-like (dim_a, dim_b)) – loss matrix

  • reg_m (float or indexable object of length 1 or 2) – Marginal relaxation term: nonnegative but cannot be infinity. If \(\mathrm{reg_{m}}\) is a scalar or an indexable object of length 1, then the same \(\mathrm{reg_{m}}\) is applied to both marginal relaxations. If \(\mathrm{reg_{m}}\) is an array, it must have the same backend as input arrays (a, b, M).

  • reg (float, optional (default = 0)) – Regularization term >= 0. By default, solve the unregularized problem

  • c (array-like (dim_a, dim_b), optional (default = None)) – Reference measure for the regularization. If None, then use \(\mathbf{c} = \mathbf{a} \mathbf{b}^T\).

  • div (string, optional) – Divergence to quantify the difference between the marginals. Can take two values: ‘kl’ (Kullback-Leibler) or ‘l2’ (half-squared)

  • G0 (array-like (dim_a, dim_b)) – Initialization of the transport matrix

  • numItermax (int, optional) – Max number of iterations

  • stopThr (float, optional) – Stop threshold on error (> 0)

  • verbose (bool, optional) – Print information along iterations

  • log (bool, optional) – record log if True

Returns:

  • gamma ((dim_a, dim_b) array-like) – Optimal transportation matrix for the given parameters

  • log (dict) – log dictionary returned only if log is True

Examples

>>> import ot
>>> import numpy as np
>>> a=[.5, .5]
>>> b=[.5, .5]
>>> M=[[1., 36.],[9., 4.]]
>>> np.round(ot.unbalanced.mm_unbalanced(a, b, M, 5, div='kl'), 2)
array([[0.45, 0.  ],
       [0.  , 0.34]])
>>> np.round(ot.unbalanced.mm_unbalanced(a, b, M, 5, div='l2'), 2)
array([[0.4, 0. ],
       [0. , 0.1]])
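
A further sketch (not part of the original documentation) showing the regularized variant of the problem via the reg parameter documented above; the value 0.1 is illustrative:

>>> G_reg = ot.unbalanced.mm_unbalanced(a, b, M, reg_m=5, reg=0.1, div='kl')  # adds reg * div(gamma, c) to the objective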

References

See also

ot.lp.emd

Unregularized OT

ot.unbalanced.sinkhorn_unbalanced

Entropic regularized OT

ot.unbalanced.mm_unbalanced2(a, b, M, reg_m, c=None, reg=0, div='kl', G0=None, returnCost='linear', numItermax=1000, stopThr=1e-15, verbose=False, log=False)[source]

Solve the unbalanced optimal transport problem and return the OT cost. The function solves the following optimization problem:

\[ \begin{align}\begin{aligned}\min_\gamma \quad \langle \gamma, \mathbf{M} \rangle_F + \mathrm{reg_{m1}} \cdot \mathrm{div}(\gamma \mathbf{1}, \mathbf{a}) + \mathrm{reg_{m2}} \cdot \mathrm{div}(\gamma^T \mathbf{1}, \mathbf{b}) + \mathrm{reg} \cdot \mathrm{div}(\gamma, \mathbf{c})\\s.t. \gamma \geq 0\end{aligned}\end{align} \]

where:

  • \(\mathbf{M}\) is the (dim_a, dim_b) metric cost matrix

  • \(\mathbf{a}\) and \(\mathbf{b}\) are source and target unbalanced distributions

  • \(\mathbf{c}\) is a reference distribution for the regularization

  • \(\mathrm{div}\) is a divergence, either Kullback-Leibler or half-squared \(\ell_2\) divergence

The algorithm used for solving the problem is a maximization-minimization algorithm as proposed in [41]

Parameters:
  • a (array-like (dim_a,)) – Unnormalized histogram of dimension dim_a. If a is an empty list or array ([]), then a is set to the uniform distribution.

  • b (array-like (dim_b,)) – Unnormalized histogram of dimension dim_b. If b is an empty list or array ([]), then b is set to the uniform distribution.

  • M (array-like (dim_a, dim_b)) – loss matrix

  • reg_m (float or indexable object of length 1 or 2) – Marginal relaxation term: nonnegative but cannot be infinity. If \(\mathrm{reg_{m}}\) is a scalar or an indexable object of length 1, then the same \(\mathrm{reg_{m}}\) is applied to both marginal relaxations. If \(\mathrm{reg_{m}}\) is an array, it must have the same backend as input arrays (a, b, M).

  • reg (float, optional (default = 0)) – Regularization term >= 0. By default, solve the unregularized problem

  • c (array-like (dim_a, dim_b), optional (default = None)) – Reference measure for the regularization. If None, then use \(\mathbf{c} = \mathbf{a} \mathbf{b}^T\).

  • div (string, optional) – Divergence to quantify the difference between the marginals. Can take two values: ‘kl’ (Kullback-Leibler) or ‘l2’ (half-squared)

  • G0 (array-like (dim_a, dim_b)) – Initialization of the transport matrix

  • returnCost (string, optional (default = "linear")) – If returnCost = “linear”, then return the linear part of the unbalanced OT loss. If returnCost = “total”, then return the total unbalanced OT loss.

  • numItermax (int, optional) – Max number of iterations

  • stopThr (float, optional) – Stop threshold on error (> 0)

  • verbose (bool, optional) – Print information along iterations

  • log (bool, optional) – record log if True

Returns:

  • ot_cost (array-like) – the OT cost between \(\mathbf{a}\) and \(\mathbf{b}\)

  • log (dict) – log dictionary returned only if log is True

Examples

>>> import ot
>>> import numpy as np
>>> a=[.5, .5]
>>> b=[.5, .5]
>>> M=[[1., 36.],[9., 4.]]
>>> np.round(ot.unbalanced.mm_unbalanced2(a, b, M, 5, div='l2'), 2)
0.8
>>> np.round(ot.unbalanced.mm_unbalanced2(a, b, M, 5, div='kl'), 2)
1.79

References

See also

ot.lp.emd2

Unregularized OT loss

ot.unbalanced.sinkhorn_unbalanced2

Entropic regularized OT loss

ot.unbalanced.sinkhorn_knopp_unbalanced(a, b, M, reg, reg_m, reg_type='kl', c=None, warmstart=None, numItermax=1000, stopThr=1e-06, verbose=False, log=False, **kwargs)[source]

Solve the entropic regularization unbalanced optimal transport problem and return the OT plan

The function solves the following optimization problem:

\[ \begin{align}\begin{aligned}W = \arg \min_\gamma \quad \langle \gamma, \mathbf{M} \rangle_F + \mathrm{reg} \cdot \mathrm{KL}(\gamma, \mathbf{c}) + \mathrm{reg_{m1}} \cdot \mathrm{KL}(\gamma \mathbf{1}, \mathbf{a}) + \mathrm{reg_{m2}} \cdot \mathrm{KL}(\gamma^T \mathbf{1}, \mathbf{b})\\s.t. \gamma \geq 0\end{aligned}\end{align} \]

where :

  • \(\mathbf{M}\) is the (dim_a, dim_b) metric cost matrix

  • \(\mathbf{a}\) and \(\mathbf{b}\) are source and target unbalanced distributions

  • \(\mathbf{c}\) is a reference distribution for the regularization

  • KL is the Kullback-Leibler divergence

The algorithm used for solving the problem is the generalized Sinkhorn-Knopp matrix scaling algorithm as proposed in [10, 25]

Warning

Starting from version 0.9.5, the default value has been changed to reg_type=’kl’ instead of reg_type=’entropy’. This makes the function more consistent with the literature and the other solvers. If you want to use the entropy regularization, please set reg_type=’entropy’ explicitly.

Parameters:
  • a (array-like, shape (dim_a,)) – Unnormalized histogram of dimension dim_a. If a is an empty list or array ([]), then a is set to the uniform distribution.

  • b (array-like, shape (dim_b,) or (dim_b, n_hists)) – One or multiple unnormalized histograms of dimension dim_b. If b is an empty list or array ([]), then b is set to the uniform distribution. If multiple histograms are given, compute all the OT costs \((\mathbf{a}, \mathbf{b}_i)_i\).

  • M (array-like, shape (dim_a, dim_b)) – loss matrix

  • reg (float) – Entropy regularization term > 0

  • reg_m (float or indexable object of length 1 or 2) – Marginal relaxation term. If \(\mathrm{reg_{m}}\) is a scalar or an indexable object of length 1, then the same \(\mathrm{reg_{m}}\) is applied to both marginal relaxations. The entropic balanced OT can be recovered using \(\mathrm{reg_{m}}=float("inf")\). For semi-relaxed case, use either \(\mathrm{reg_{m}}=(float("inf"), scalar)\) or \(\mathrm{reg_{m}}=(scalar, float("inf"))\). If \(\mathrm{reg_{m}}\) is an array, it must have the same backend as input arrays (a, b, M).

  • reg_type (string, optional) –

    Regularizer term. Can take two values:

    • Negative entropy: ‘entropy’: \(\Omega(\gamma) = \sum_{i,j} \gamma_{i,j} \log(\gamma_{i,j}) - \sum_{i,j} \gamma_{i,j}\). This is equivalent (up to a constant) to \(\Omega(\gamma) = \text{KL}(\gamma, 1_{dim_a} 1_{dim_b}^T)\).

    • Kullback-Leibler divergence: ‘kl’: \(\Omega(\gamma) = \text{KL}(\gamma, \mathbf{a} \mathbf{b}^T)\).

  • c (array-like, shape (dim_a, dim_b), optional (default=None)) – Reference measure for the regularization. If None, then use \(\mathbf{c} = \mathbf{a} \mathbf{b}^T\). If \(\texttt{reg_type}=\)’entropy’, then \(\mathbf{c} = 1_{dim_a} 1_{dim_b}^T\).

  • warmstart (tuple of arrays, shape (dim_a, dim_b), optional) – Initialization of dual potentials. If provided, the dual potentials should be given (that is the logarithm of the u, v sinkhorn scaling vectors).

  • numItermax (int, optional) – Max number of iterations

  • stopThr (float, optional) – Stop threshold on error (> 0)

  • verbose (bool, optional) – Print information along iterations

  • log (bool, optional) – record log if True

Returns:

  • if n_hists == 1

    • gamma : array-like, shape (dim_a, dim_b)

      Optimal transportation matrix for the given parameters

    • log : dict

      log dictionary returned only if log is True

  • else

    • ot_cost : array-like, shape (n_hists,)

      the OT cost between \(\mathbf{a}\) and each of the histograms \(\mathbf{b}_i\)

    • log : dict

      log dictionary returned only if log is True

Examples

>>> import ot
>>> import numpy as np
>>> a=[.5, .5]
>>> b=[.5, .5]
>>> M=[[0., 1.],[1., 0.]]
>>> np.round(ot.unbalanced.sinkhorn_knopp_unbalanced(a, b, M, 1., 1.), 7)
array([[0.3220536, 0.1184769],
       [0.1184769, 0.3220536]])
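
A further sketch (not part of the original documentation) of the semi-relaxed setting described for reg_m above, where one marginal constraint is enforced exactly by setting its relaxation term to infinity:

>>> G_semi = ot.unbalanced.sinkhorn_knopp_unbalanced(a, b, M, 1., reg_m=(float("inf"), 1.))  # exact first marginal, relaxed second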

References

See also

ot.lp.emd

Unregularized OT

ot.optim.cg

General regularized OT

ot.unbalanced.sinkhorn_stabilized_unbalanced(a, b, M, reg, reg_m, reg_type='kl', c=None, warmstart=None, tau=100000.0, numItermax=1000, stopThr=1e-06, verbose=False, log=False, **kwargs)[source]

Solve the entropic regularization unbalanced optimal transport problem and return the OT plan

The function solves the following optimization problem using log-domain stabilization as proposed in [10]:

\[ \begin{align}\begin{aligned}W = \arg \min_\gamma \quad \langle \gamma, \mathbf{M} \rangle_F + \mathrm{reg} \cdot \mathrm{KL}(\gamma, \mathbf{c}) + \mathrm{reg_{m1}} \cdot \mathrm{KL}(\gamma \mathbf{1}, \mathbf{a}) + \mathrm{reg_{m2}} \cdot \mathrm{KL}(\gamma^T \mathbf{1}, \mathbf{b})\\s.t. \gamma \geq 0\end{aligned}\end{align} \]

where :

  • \(\mathbf{M}\) is the (dim_a, dim_b) metric cost matrix

  • \(\mathbf{a}\) and \(\mathbf{b}\) are source and target unbalanced distributions

  • \(\mathbf{c}\) is a reference distribution for the regularization

  • KL is the Kullback-Leibler divergence

The algorithm used for solving the problem is the generalized Sinkhorn-Knopp matrix scaling algorithm as proposed in [10, 25]

Parameters:
  • a (array-like, shape (dim_a,)) – Unnormalized histogram of dimension dim_a. If a is an empty list or array ([]), then a is set to the uniform distribution.

  • b (array-like, shape (dim_b,) or (dim_b, n_hists)) – One or multiple unnormalized histograms of dimension dim_b. If b is an empty list or array ([]), then b is set to the uniform distribution. If multiple histograms are given, compute all the OT costs \((\mathbf{a}, \mathbf{b}_i)_i\).

  • M (array-like, shape (dim_a, dim_b)) – loss matrix

  • reg (float) – Entropy regularization term > 0

  • reg_m (float or indexable object of length 1 or 2) – Marginal relaxation term. If \(\mathrm{reg_{m}}\) is a scalar or an indexable object of length 1, then the same \(\mathrm{reg_{m}}\) is applied to both marginal relaxations. The entropic balanced OT can be recovered using \(\mathrm{reg_{m}}=float("inf")\). For semi-relaxed case, use either \(\mathrm{reg_{m}}=(float("inf"), scalar)\) or \(\mathrm{reg_{m}}=(scalar, float("inf"))\). If \(\mathrm{reg_{m}}\) is an array, it must have the same backend as input arrays (a, b, M).

  • method (str) – method used for the solver, either ‘sinkhorn’, ‘sinkhorn_stabilized’ or ‘sinkhorn_reg_scaling’; see those functions for specific parameters

  • reg_type (string, optional) –

    Regularizer term. Can take two values:

    • Negative entropy: ‘entropy’: \(\Omega(\gamma) = \sum_{i,j} \gamma_{i,j} \log(\gamma_{i,j}) - \sum_{i,j} \gamma_{i,j}\). This is equivalent (up to a constant) to \(\Omega(\gamma) = \text{KL}(\gamma, 1_{dim_a} 1_{dim_b}^T)\).

    • Kullback-Leibler divergence: ‘kl’: \(\Omega(\gamma) = \text{KL}(\gamma, \mathbf{a} \mathbf{b}^T)\).

  • c (array-like, shape (dim_a, dim_b), optional (default=None)) – Reference measure for the regularization. If None, then use \(\mathbf{c} = \mathbf{a} \mathbf{b}^T\). If \(\texttt{reg_type}=\)’entropy’, then \(\mathbf{c} = 1_{dim_a} 1_{dim_b}^T\).

  • warmstart (tuple of arrays, shape (dim_a, dim_b), optional) – Initialization of dual potentials. If provided, the dual potentials should be given (that is the logarithm of the u, v sinkhorn scaling vectors).

  • tau (float) – threshold for max value in u or v for log scaling

  • numItermax (int, optional) – Max number of iterations

  • stopThr (float, optional) – Stop threshold on error (>0)

  • verbose (bool, optional) – Print information along iterations

  • log (bool, optional) – record log if True

Warning

Starting from version 0.9.5, the default value has been changed to reg_type=’kl’ instead of reg_type=’entropy’. This makes the function more consistent with the literature and the other solvers. If you want to use the entropy regularization, please set reg_type=’entropy’ explicitly.

Returns:

  • if n_hists == 1

    • gamma : array-like, shape (dim_a, dim_b)

      Optimal transportation matrix for the given parameters

    • log : dict

      log dictionary returned only if log is True

  • else

    • ot_cost : array-like, shape (n_hists,)

      the OT cost between \(\mathbf{a}\) and each of the histograms \(\mathbf{b}_i\)

    • log : dict

      log dictionary returned only if log is True

Examples

>>> import ot
>>> import numpy as np
>>> a=[.5, .5]
>>> b=[.5, .5]
>>> M=[[0., 1.],[1., 0.]]
>>> np.round(ot.unbalanced.sinkhorn_stabilized_unbalanced(a, b, M, 1., 1.), 7)
array([[0.3220536, 0.1184769],
       [0.1184769, 0.3220536]])

References

See also

ot.lp.emd

Unregularized OT

ot.optim.cg

General regularized OT

ot.unbalanced.sinkhorn_unbalanced(a, b, M, reg, reg_m, method='sinkhorn', reg_type='kl', c=None, warmstart=None, numItermax=1000, stopThr=1e-06, verbose=False, log=False, **kwargs)[source]

Solve the unbalanced entropic regularization optimal transport problem and return the OT plan

The function solves the following optimization problem:

\[ \begin{align}\begin{aligned}W = \arg \min_\gamma \ \langle \gamma, \mathbf{M} \rangle_F + \mathrm{reg} \cdot \mathrm{KL}(\gamma, \mathbf{c}) + \mathrm{reg_{m1}} \cdot \mathrm{KL}(\gamma \mathbf{1}, \mathbf{a}) + \mathrm{reg_{m2}} \cdot \mathrm{KL}(\gamma^T \mathbf{1}, \mathbf{b})\\s.t. \gamma \geq 0\end{aligned}\end{align} \]

where :

  • \(\mathbf{M}\) is the (dim_a, dim_b) metric cost matrix

  • \(\mathbf{a}\) and \(\mathbf{b}\) are source and target unbalanced distributions

  • \(\mathbf{c}\) is a reference distribution for the regularization

  • KL is the Kullback-Leibler divergence

The algorithm used for solving the problem is the generalized Sinkhorn-Knopp matrix scaling algorithm as proposed in [10, 25]

Warning

Starting from version 0.9.5, the default value has been changed to reg_type=’kl’ instead of reg_type=’entropy’. This makes the function more consistent with the literature and the other solvers. If you want to use the entropy regularization, please set reg_type=’entropy’ explicitly.

Parameters:
  • a (array-like, shape (dim_a,)) – Unnormalized histogram of dimension dim_a. If a is an empty list or array ([]), then a is set to the uniform distribution.

  • b (array-like, shape (dim_b,) or (dim_b, n_hists)) – One or multiple unnormalized histograms of dimension dim_b. If b is an empty list or array ([]), then b is set to the uniform distribution. If multiple histograms are given, compute all the OT costs \((\mathbf{a}, \mathbf{b}_i)_i\).

  • M (array-like, shape (dim_a, dim_b)) – loss matrix

  • reg (float) – Entropy regularization term > 0

  • reg_m (float or indexable object of length 1 or 2) – Marginal relaxation term. If \(\mathrm{reg_{m}}\) is a scalar or an indexable object of length 1, then the same \(\mathrm{reg_{m}}\) is applied to both marginal relaxations. The entropic balanced OT can be recovered using \(\mathrm{reg_{m}}=float("inf")\). For semi-relaxed case, use either \(\mathrm{reg_{m}}=(float("inf"), scalar)\) or \(\mathrm{reg_{m}}=(scalar, float("inf"))\). If \(\mathrm{reg_{m}}\) is an array, it must have the same backend as input arrays (a, b, M).

  • method (str) – method used for the solver, either ‘sinkhorn’, ‘sinkhorn_stabilized’, ‘sinkhorn_translation_invariant’ or ‘sinkhorn_reg_scaling’; see those functions for specific parameters

  • reg_type (string, optional) –

    Regularizer term. Can take two values:

    • Negative entropy: ‘entropy’: \(\Omega(\gamma) = \sum_{i,j} \gamma_{i,j} \log(\gamma_{i,j}) - \sum_{i,j} \gamma_{i,j}\). This is equivalent (up to a constant) to \(\Omega(\gamma) = \text{KL}(\gamma, 1_{dim_a} 1_{dim_b}^T)\).

    • Kullback-Leibler divergence (default): ‘kl’: \(\Omega(\gamma) = \text{KL}(\gamma, \mathbf{a} \mathbf{b}^T)\).

  • c (array-like, shape (dim_a, dim_b), optional (default=None)) – Reference measure for the regularization. If None, then use \(\mathbf{c} = \mathbf{a} \mathbf{b}^T\). If \(\texttt{reg_type}=\)’entropy’, then \(\mathbf{c} = 1_{dim_a} 1_{dim_b}^T\).

  • warmstart (tuple of arrays, shape (dim_a, dim_b), optional) – Initialization of dual potentials. If provided, the dual potentials should be given (that is the logarithm of the u, v sinkhorn scaling vectors).

  • numItermax (int, optional) – Max number of iterations

  • stopThr (float, optional) – Stop threshold on error (>0)

  • verbose (bool, optional) – Print information along iterations

  • log (bool, optional) – record log if True

Returns:

  • if n_hists == 1

    • gamma : array-like, shape (dim_a, dim_b)

      Optimal transportation matrix for the given parameters

    • log : dict

      log dictionary returned only if log is True

  • else

    • ot_distance : array-like, shape (n_hists,)

      the OT distance between \(\mathbf{a}\) and each of the histograms \(\mathbf{b}_i\)

    • log : dict

      log dictionary returned only if log is True

Examples

>>> import ot
>>> import numpy as np
>>> a=[.5, .5]
>>> b=[.5, .5]
>>> M=[[0., 1.], [1., 0.]]
>>> np.round(ot.sinkhorn_unbalanced(a, b, M, 1, 1), 7)
array([[0.3220536, 0.1184769],
       [0.1184769, 0.3220536]])
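
A further sketch (not part of the original documentation) showing how to select one of the documented solver variants through the method argument:

>>> G_stab = ot.unbalanced.sinkhorn_unbalanced(a, b, M, 1, 1, method='sinkhorn_stabilized')  # log-domain stabilized solver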

References

See also

ot.unbalanced.sinkhorn_knopp_unbalanced

Unbalanced Classic Sinkhorn [10]

ot.unbalanced.sinkhorn_stabilized_unbalanced

Unbalanced Stabilized sinkhorn [9, 10]

ot.unbalanced.sinkhorn_reg_scaling_unbalanced

Unbalanced Sinkhorn with epsilon scaling [9, 10]

ot.unbalanced.sinkhorn_unbalanced_translation_invariant

Translation Invariant Unbalanced Sinkhorn [73]

ot.unbalanced.sinkhorn_unbalanced2(a, b, M, reg, reg_m, method='sinkhorn', reg_type='kl', c=None, warmstart=None, returnCost='linear', numItermax=1000, stopThr=1e-06, verbose=False, log=False, **kwargs)[source]

Solve the entropic regularization unbalanced optimal transport problem and return the cost

The function solves the following optimization problem:

\[ \begin{align}\begin{aligned}\min_\gamma \quad \langle \gamma, \mathbf{M} \rangle_F + \mathrm{reg} \cdot \mathrm{KL}(\gamma, \mathbf{c}) + \mathrm{reg_{m1}} \cdot \mathrm{KL}(\gamma \mathbf{1}, \mathbf{a}) + \mathrm{reg_{m2}} \cdot \mathrm{KL}(\gamma^T \mathbf{1}, \mathbf{b})\\s.t. \gamma\geq 0\end{aligned}\end{align} \]

where :

  • \(\mathbf{M}\) is the (dim_a, dim_b) metric cost matrix

  • \(\mathbf{a}\) and \(\mathbf{b}\) are source and target unbalanced distributions

  • \(\mathbf{c}\) is a reference distribution for the regularization

  • KL is the Kullback-Leibler divergence

The algorithm used for solving the problem is the generalized Sinkhorn-Knopp matrix scaling algorithm as proposed in [10, 25]

Warning

Starting from version 0.9.5, the default value has been changed to reg_type=’kl’ instead of reg_type=’entropy’. This makes the function more consistent with the literature and the other solvers. If you want to use the entropy regularization, please set reg_type=’entropy’ explicitly.

Parameters:
  • a (array-like, shape (dim_a,)) – Unnormalized histogram of dimension dim_a. If a is an empty list or array ([]), then a is set to the uniform distribution.

  • b (array-like, shape (dim_b,) or (dim_b, n_hists)) – One or multiple unnormalized histograms of dimension dim_b. If b is an empty list or array ([]), then b is set to the uniform distribution. If multiple histograms are given, compute all the OT costs \((\mathbf{a}, \mathbf{b}_i)_i\).

  • M (array-like, shape (dim_a, dim_b)) – loss matrix

  • reg (float) – Entropy regularization term > 0

  • reg_m (float or indexable object of length 1 or 2) – Marginal relaxation term. If \(\mathrm{reg_{m}}\) is a scalar or an indexable object of length 1, then the same \(\mathrm{reg_{m}}\) is applied to both marginal relaxations. The entropic balanced OT can be recovered using \(\mathrm{reg_{m}}=float("inf")\). For semi-relaxed case, use either \(\mathrm{reg_{m}}=(float("inf"), scalar)\) or \(\mathrm{reg_{m}}=(scalar, float("inf"))\). If \(\mathrm{reg_{m}}\) is an array, it must have the same backend as input arrays (a, b, M).

  • method (str) – method used for the solver, either ‘sinkhorn’, ‘sinkhorn_stabilized’, ‘sinkhorn_translation_invariant’ or ‘sinkhorn_reg_scaling’; see those functions for specific parameters

  • reg_type (string, optional) –

    Regularizer term. Can take two values:

    • Negative entropy: ‘entropy’: \(\Omega(\gamma) = \sum_{i,j} \gamma_{i,j} \log(\gamma_{i,j}) - \sum_{i,j} \gamma_{i,j}\). This is equivalent (up to a constant) to \(\Omega(\gamma) = \text{KL}(\gamma, 1_{dim_a} 1_{dim_b}^T)\).

    • Kullback-Leibler divergence: ‘kl’: \(\Omega(\gamma) = \text{KL}(\gamma, \mathbf{a} \mathbf{b}^T)\).

  • c (array-like, shape (dim_a, dim_b), optional (default=None)) – Reference measure for the regularization. If None, then use \(\mathbf{c} = \mathbf{a} \mathbf{b}^T\). If \(\texttt{reg_type}=\)’entropy’, then \(\mathbf{c} = 1_{dim_a} 1_{dim_b}^T\).

  • warmstart (tuple of arrays, shape (dim_a, dim_b), optional) – Initialization of dual potentials. If provided, the dual potentials should be given (that is the logarithm of the u,v sinkhorn scaling vectors).

  • returnCost (string, optional (default = "linear")) – If returnCost = “linear”, then return the linear part of the unbalanced OT loss. If returnCost = “total”, then return the total unbalanced OT loss.

  • numItermax (int, optional) – Max number of iterations

  • stopThr (float, optional) – Stop threshold on error (>0)

  • verbose (bool, optional) – Print information along iterations

  • log (bool, optional) – record log if True

Returns:

  • ot_cost (array-like, shape (n_hists,)) – the OT cost between \(\mathbf{a}\) and each of the histograms \(\mathbf{b}_i\)

  • log (dict) – log dictionary returned only if log is True

Examples

>>> import ot
>>> import numpy as np
>>> a=[.5, .10]
>>> b=[.5, .5]
>>> M=[[0., 1.],[1., 0.]]
>>> np.round(ot.unbalanced.sinkhorn_unbalanced2(a, b, M, 1., 1.), 8)
0.19600125
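
A further sketch (not part of the original documentation) of the multi-histogram case described for b, computing the cost to several target histograms at once; it assumes the (dim_b, n_hists) column convention used elsewhere in this module, and the values are illustrative:

>>> B = np.array([[0.5, 0.4], [0.5, 0.6]])  # two target histograms stored as columns, shape (dim_b, n_hists)
>>> costs = ot.unbalanced.sinkhorn_unbalanced2(a, B, M, 1., 1.)  # array of shape (n_hists,)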

References

See also

ot.unbalanced.sinkhorn_knopp_unbalanced

Unbalanced Classic Sinkhorn [10]

ot.unbalanced.sinkhorn_stabilized_unbalanced

Unbalanced Stabilized sinkhorn [9, 10]

ot.unbalanced.sinkhorn_reg_scaling_unbalanced

Unbalanced Sinkhorn with epsilon scaling [9, 10]

ot.unbalanced.sinkhorn_unbalanced_translation_invariant

Translation Invariant Unbalanced Sinkhorn [73]

ot.unbalanced.sinkhorn_unbalanced_translation_invariant(a, b, M, reg, reg_m, reg_type='kl', c=None, warmstart=None, numItermax=1000, stopThr=1e-06, verbose=False, log=False, **kwargs)[source]

Solve the entropic regularization unbalanced optimal transport problem and return the OT plan

The function solves the following optimization problem:

\[ \begin{align}\begin{aligned}W = \arg \min_\gamma \ \langle \gamma, \mathbf{M} \rangle_F + \mathrm{reg} \cdot \mathrm{KL}(\gamma, \mathbf{c}) + \mathrm{reg_{m1}} \cdot \mathrm{KL}(\gamma \mathbf{1}, \mathbf{a}) + \mathrm{reg_{m2}} \cdot \mathrm{KL}(\gamma^T \mathbf{1}, \mathbf{b})\\s.t. \gamma \geq 0\end{aligned}\end{align} \]

where :

  • \(\mathbf{M}\) is the (dim_a, dim_b) metric cost matrix

  • \(\Omega\) is the entropic regularization term, here the KL divergence

  • \(\mathbf{a}\) and \(\mathbf{b}\) are source and target unbalanced distributions

  • KL is the Kullback-Leibler divergence

The algorithm used for solving the problem is the translation invariant Sinkhorn algorithm as proposed in [73]

Parameters:
  • a (array-like, shape (dim_a,)) – Unnormalized histogram of dimension dim_a

  • b (array-like, shape (dim_b,) or (dim_b, n_hists)) – One or multiple unnormalized histograms of dimension dim_b. If many, compute all the OT distances \((\mathbf{a}, \mathbf{b}_i)_i\)

  • M (array-like, shape (dim_a, dim_b)) – loss matrix

  • reg (float) – Entropy regularization term > 0

  • reg_m (float or indexable object of length 1 or 2) – Marginal relaxation term. If reg_m is a scalar or an indexable object of length 1, then the same reg_m is applied to both marginal relaxations. The entropic balanced OT can be recovered using reg_m=float(“inf”). For semi-relaxed case, use either reg_m=(float(“inf”), scalar) or reg_m=(scalar, float(“inf”)). If reg_m is an array, it must have the same backend as input arrays (a, b, M).

  • reg_type (string, optional) – Regularizer term. Can take two values: ‘entropy’ (negative entropy) \(\Omega(\gamma) = \sum_{i,j} \gamma_{i,j} \log(\gamma_{i,j}) - \sum_{i,j} \gamma_{i,j}\), or ‘kl’ (Kullback-Leibler) \(\Omega(\gamma) = \text{KL}(\gamma, \mathbf{a} \mathbf{b}^T)\).

  • c (array-like, shape (dim_a, dim_b), optional (default=None)) – Reference measure for the regularization. If None, then use \(\mathbf{c} = \mathbf{a} \mathbf{b}^T\). If \(\texttt{reg_type}=\)’entropy’, then \(\mathbf{c} = 1_{dim_a} 1_{dim_b}^T\).

  • warmstart (tuple of arrays, shape (dim_a, dim_b), optional) – Initialization of dual potentials. If provided, the dual potentials should be given (that is the logarithm of the u,v sinkhorn scaling vectors).

  • numItermax (int, optional) – Max number of iterations

  • stopThr (float, optional) – Stop threshold on error (> 0)

  • verbose (bool, optional) – Print information along iterations

  • log (bool, optional) – record log if True

Returns:

  • if n_hists == 1

    • gamma : array-like, shape (dim_a, dim_b)

      Optimal transportation matrix for the given parameters

    • log : dict

      log dictionary returned only if log is True

  • else

    • ot_distance : array-like, shape (n_hists,)

      the OT distance between \(\mathbf{a}\) and each of the histograms \(\mathbf{b}_i\)

    • log : dict

      log dictionary returned only if log is True

Examples

>>> import ot
>>> a=[.5, .5]
>>> b=[.5, .5]
>>> M=[[0., 1.],[1., 0.]]
>>> ot.unbalanced.sinkhorn_unbalanced_translation_invariant(a, b, M, 1., 1.)
array([[0.32205357, 0.11847689],
       [0.11847689, 0.32205357]])

References