ot.unbalanced

Solvers related to Unbalanced Optimal Transport problems.

ot.unbalanced.barycenter_unbalanced(A, M, reg, reg_m, method='sinkhorn', weights=None, numItermax=1000, stopThr=1e-06, verbose=False, log=False, **kwargs)[source]

Compute the entropic unbalanced Wasserstein barycenter of \(\mathbf{A}\).

The function solves the following optimization problem with respect to \(\mathbf{a}\):

\[\mathbf{a} = \mathop{\arg \min}_\mathbf{a} \quad \sum_i W_{u_{reg}}(\mathbf{a},\mathbf{a}_i)\]

where:

  • \(W_{u_{reg}}(\cdot,\cdot)\) is the unbalanced entropic regularized Wasserstein distance (see ot.unbalanced.sinkhorn_unbalanced())

  • \(\mathbf{a}_i\) are training distributions in the columns of matrix \(\mathbf{A}\)

  • reg and \(\mathbf{M}\) are respectively the regularization term and the cost matrix for OT

  • reg_m is the marginal relaxation hyperparameter

The algorithm used for solving the problem is the generalized Sinkhorn-Knopp matrix scaling algorithm as proposed in [10]

Parameters:
  • A (array-like (dim, n_hists)) – n_hists training distributions \(\mathbf{a}_i\) of dimension dim

  • M (array-like (dim, dim)) – ground metric matrix for OT.

  • reg (float) – Entropy regularization term > 0

  • reg_m (float) – Marginal relaxation term > 0

  • weights (array-like (n_hists,) optional) – Weight of each distribution (barycentric coordinates). If None, uniform weights are used.

  • numItermax (int, optional) – Max number of iterations

  • stopThr (float, optional) – Stop threshold on error (> 0)

  • verbose (bool, optional) – Print information along iterations

  • log (bool, optional) – record log if True

Returns:

  • a ((dim,) array-like) – Unbalanced Wasserstein barycenter

  • log (dict) – log dictionary returned only if log is True
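
Examples

A minimal usage sketch (not part of the original docstring); the histogram values are hypothetical and only the shape of the result is shown, since the exact barycenter depends on reg and reg_m:

>>> import numpy as np
>>> import ot
>>> A = np.array([[0.6, 0.3],
...               [0.4, 0.7]])  # two input histograms stored as columns (dim=2, n_hists=2)
>>> M = np.array([[0., 1.],
...               [1., 0.]])    # ground cost between the two bins
>>> bary = ot.unbalanced.barycenter_unbalanced(A, M, reg=0.1, reg_m=1.0)
>>> bary.shape
(2,)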

References

ot.unbalanced.barycenter_unbalanced_sinkhorn(A, M, reg, reg_m, weights=None, numItermax=1000, stopThr=1e-06, verbose=False, log=False)[source]

Compute the entropic unbalanced Wasserstein barycenter of \(\mathbf{A}\).

The function solves the following optimization problem with respect to \(\mathbf{a}\):

\[\mathbf{a} = \mathop{\arg \min}_\mathbf{a} \quad \sum_i W_{u_{reg}}(\mathbf{a},\mathbf{a}_i)\]

where:

  • \(W_{u_{reg}}(\cdot,\cdot)\) is the unbalanced entropic regularized Wasserstein distance (see ot.unbalanced.sinkhorn_unbalanced())

  • \(\mathbf{a}_i\) are training distributions in the columns of matrix \(\mathbf{A}\)

  • reg and \(\mathbf{M}\) are respectively the regularization term and the cost matrix for OT

  • reg_m is the marginal relaxation hyperparameter

The algorithm used for solving the problem is the generalized Sinkhorn-Knopp matrix scaling algorithm as proposed in [10]

Parameters:
  • A (array-like (dim, n_hists)) – n_hists training distributions \(\mathbf{a}_i\) of dimension dim

  • M (array-like (dim, dim)) – ground metric matrix for OT.

  • reg (float) – Entropy regularization term > 0

  • reg_m (float) – Marginal relaxation term > 0

  • weights (array-like (n_hists,) optional) – Weight of each distribution (barycentric coordinates). If None, uniform weights are used.

  • numItermax (int, optional) – Max number of iterations

  • stopThr (float, optional) – Stop threshold on error (> 0)

  • verbose (bool, optional) – Print information along iterations

  • log (bool, optional) – record log if True

Returns:

  • a ((dim,) array-like) – Unbalanced Wasserstein barycenter

  • log (dict) – log dictionary returned only if log is True
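
Examples

A hedged sketch (not part of the original docstring) with non-uniform barycentric weights; input values are hypothetical and only the output shape is shown:

>>> import numpy as np
>>> import ot
>>> A = np.array([[0.6, 0.3],
...               [0.4, 0.7]])  # two input histograms as columns
>>> M = np.array([[0., 1.],
...               [1., 0.]])
>>> w = np.array([0.25, 0.75])  # barycentric coordinates, one weight per histogram
>>> bary = ot.unbalanced.barycenter_unbalanced_sinkhorn(A, M, reg=0.1, reg_m=1.0, weights=w)
>>> bary.shape
(2,)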

References

ot.unbalanced.barycenter_unbalanced_stabilized(A, M, reg, reg_m, weights=None, tau=1000.0, numItermax=1000, stopThr=1e-06, verbose=False, log=False)[source]

Compute the entropic unbalanced Wasserstein barycenter of \(\mathbf{A}\) with stabilization.

The function solves the following optimization problem:

\[\mathbf{a} = \mathop{\arg \min}_\mathbf{a} \quad \sum_i W_{u_{reg}}(\mathbf{a},\mathbf{a}_i)\]

where:

  • \(W_{u_{reg}}(\cdot,\cdot)\) is the unbalanced entropic regularized Wasserstein distance (see ot.unbalanced.sinkhorn_unbalanced())

  • \(\mathbf{a}_i\) are training distributions in the columns of matrix \(\mathbf{A}\)

  • reg and \(\mathbf{M}\) are respectively the regularization term and the cost matrix for OT

  • reg_m is the marginal relaxation hyperparameter

The algorithm used for solving the problem is the generalized Sinkhorn-Knopp matrix scaling algorithm as proposed in [10]

Parameters:
  • A (array-like (dim, n_hists)) – n_hists training distributions \(\mathbf{a}_i\) of dimension dim

  • M (array-like (dim, dim)) – ground metric matrix for OT.

  • reg (float) – Entropy regularization term > 0

  • reg_m (float) – Marginal relaxation term > 0

  • tau (float) – Stabilization threshold for log domain absorption.

  • weights (array-like (n_hists,) optional) – Weight of each distribution (barycentric coordinates). If None, uniform weights are used.

  • numItermax (int, optional) – Max number of iterations

  • stopThr (float, optional) – Stop threshold on error (> 0)

  • verbose (bool, optional) – Print information along iterations

  • log (bool, optional) – record log if True

Returns:

  • a ((dim,) array-like) – Unbalanced Wasserstein barycenter

  • log (dict) – log dictionary returned only if log is True
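
Examples

A hedged sketch (not part of the original docstring); same inputs as the barycenter examples above, with a smaller (hypothetical) absorption threshold tau so that log-domain stabilization kicks in earlier:

>>> import numpy as np
>>> import ot
>>> A = np.array([[0.6, 0.3],
...               [0.4, 0.7]])
>>> M = np.array([[0., 1.],
...               [1., 0.]])
>>> bary = ot.unbalanced.barycenter_unbalanced_stabilized(A, M, reg=0.1, reg_m=1.0, tau=100.0)
>>> bary.shape
(2,)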

References

ot.unbalanced.lbfgsb_unbalanced(a, b, M, reg, reg_m, c=None, reg_div='kl', regm_div='kl', G0=None, numItermax=1000, stopThr=1e-15, method='L-BFGS-B', verbose=False, log=False)[source]

Solve the unbalanced optimal transport problem and return the OT plan using the L-BFGS-B algorithm. The function solves the following optimization problem:

\[ \begin{align}\begin{aligned}W = \arg \min_\gamma \quad \langle \gamma, \mathbf{M} \rangle_F + \mathrm{reg} \mathrm{div}(\gamma, \mathbf{c}) + \mathrm{reg_{m1}} \cdot \mathrm{div_m}(\gamma \mathbf{1}, \mathbf{a}) + \mathrm{reg_{m2}} \cdot \mathrm{div_m}(\gamma^T \mathbf{1}, \mathbf{b})\\s.t. \gamma \geq 0\end{aligned}\end{align} \]

where:

  • \(\mathbf{M}\) is the (dim_a, dim_b) metric cost matrix

  • \(\mathbf{a}\) and \(\mathbf{b}\) are source and target unbalanced distributions

  • \(\mathbf{c}\) is a reference distribution for the regularization

  • \(\mathrm{div_m}\) is a divergence, either Kullback-Leibler divergence, half-squared \(\ell_2\) divergence, or Total Variation

  • \(\mathrm{div}\) is a divergence, either Kullback-Leibler divergence or half-squared \(\ell_2\) divergence

Note

This function is backend-compatible and will work on arrays from all compatible backends. First, it converts all arrays into Numpy arrays, then uses the L-BFGS-B algorithm from scipy.optimize to solve the optimization problem.

Parameters:
  • a (array-like (dim_a,)) – Unnormalized histogram of dimension dim_a. If a is an empty list or array ([]), then a is set to the uniform distribution.

  • b (array-like (dim_b,)) – Unnormalized histogram of dimension dim_b. If b is an empty list or array ([]), then b is set to the uniform distribution.

  • M (array-like (dim_a, dim_b)) – loss matrix

  • reg (float) – regularization term >=0

  • c (array-like (dim_a, dim_b), optional (default = None)) – Reference measure for the regularization. If None, then use \(\mathbf{c} = \mathbf{a} \mathbf{b}^T\).

  • reg_m (float or indexable object of length 1 or 2) – Marginal relaxation term: nonnegative (including 0) but cannot be infinity. If \(\mathrm{reg_{m}}\) is a scalar or an indexable object of length 1, then the same \(\mathrm{reg_{m}}\) is applied to both marginal relaxations. If \(\mathrm{reg_{m}}\) is an array, it must be a Numpy array.

  • reg_div (string, optional) – Divergence used for regularization. Can take three string values: ‘entropy’ (negative entropy), ‘kl’ (Kullback-Leibler) or ‘l2’ (half-squared), or a tuple of two callable functions returning the regularization term and its derivative. Note that the callable functions should be able to handle Numpy arrays and not tensors from the backend.

  • regm_div (string, optional) – Divergence to quantify the difference between the marginals. Can take three values: ‘kl’ (Kullback-Leibler) or ‘l2’ (half-squared) or ‘tv’ (Total Variation)

  • G0 (array-like (dim_a, dim_b)) – Initialization of the transport matrix

  • numItermax (int, optional) – Max number of iterations

  • stopThr (float, optional) – Stop threshold on error (> 0)

  • verbose (bool, optional) – Print information along iterations

  • log (bool, optional) – record log if True

Returns:

  • gamma ((dim_a, dim_b) array-like) – Optimal transportation matrix for the given parameters

  • log (dict) – log dictionary returned only if log is True

Examples

>>> import ot
>>> import numpy as np
>>> a=[.5, .5]
>>> b=[.5, .5]
>>> M=[[1., 36.],[9., 4.]]
>>> np.round(ot.unbalanced.lbfgsb_unbalanced(a, b, M, reg=0, reg_m=5, reg_div='kl', regm_div='kl'), 2)
array([[0.45, 0.  ],
       [0.  , 0.34]])
>>> np.round(ot.unbalanced.lbfgsb_unbalanced(a, b, M, reg=0, reg_m=5, reg_div='l2', regm_div='l2'), 2)
array([[0.4, 0. ],
       [0. , 0.1]])
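
As documented above, reg_m can also be an indexable of length 2 to relax the two marginals differently; a hedged sketch reusing a, b and M from the example above (numerical values not shown):

>>> G, log_dict = ot.unbalanced.lbfgsb_unbalanced(a, b, M, reg=0, reg_m=(5, 10),
...                                               reg_div='kl', regm_div='kl', log=True)
>>> G.shape
(2, 2)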

References

See also

ot.lp.emd2

Unregularized OT loss

ot.unbalanced.sinkhorn_unbalanced2

Entropic regularized OT loss

ot.unbalanced.lbfgsb_unbalanced2(a, b, M, reg, reg_m, c=None, reg_div='kl', regm_div='kl', G0=None, returnCost='linear', numItermax=1000, stopThr=1e-15, method='L-BFGS-B', verbose=False, log=False)[source]

Solve the unbalanced optimal transport problem and return the OT cost using L-BFGS-B. The function solves the following optimization problem:

\[ \begin{align}\begin{aligned}\min_\gamma \quad \langle \gamma, \mathbf{M} \rangle_F + \mathrm{reg} \mathrm{div}(\gamma, \mathbf{c}) + \mathrm{reg_{m1}} \cdot \mathrm{div_m}(\gamma \mathbf{1}, \mathbf{a}) + \mathrm{reg_{m2}} \cdot \mathrm{div_m}(\gamma^T \mathbf{1}, \mathbf{b})\\s.t. \gamma \geq 0\end{aligned}\end{align} \]

where:

  • \(\mathbf{M}\) is the (dim_a, dim_b) metric cost matrix

  • \(\mathbf{a}\) and \(\mathbf{b}\) are source and target unbalanced distributions

  • \(\mathbf{c}\) is a reference distribution for the regularization

  • \(\mathrm{div_m}\) is a divergence, either Kullback-Leibler divergence, half-squared \(\ell_2\) divergence, or Total Variation

  • \(\mathrm{div}\) is a divergence, either Kullback-Leibler divergence or half-squared \(\ell_2\) divergence

Note

This function is backend-compatible and will work on arrays from all compatible backends. First, it converts all arrays into Numpy arrays, then uses the L-BFGS-B algorithm from scipy.optimize to solve the optimization problem.

Parameters:
  • a (array-like (dim_a,)) – Unnormalized histogram of dimension dim_a. If a is an empty list or array ([]), then a is set to the uniform distribution.

  • b (array-like (dim_b,)) – Unnormalized histogram of dimension dim_b. If b is an empty list or array ([]), then b is set to the uniform distribution.

  • M (array-like (dim_a, dim_b)) – loss matrix

  • reg (float) – regularization term >=0

  • c (array-like (dim_a, dim_b), optional (default = None)) – Reference measure for the regularization. If None, then use \(\mathbf{c} = \mathbf{a} \mathbf{b}^T\).

  • reg_m (float or indexable object of length 1 or 2) – Marginal relaxation term: nonnegative (including 0) but cannot be infinity. If \(\mathrm{reg_{m}}\) is a scalar or an indexable object of length 1, then the same \(\mathrm{reg_{m}}\) is applied to both marginal relaxations. If \(\mathrm{reg_{m}}\) is an array, it must be a Numpy array.

  • reg_div (string, optional) – Divergence used for regularization. Can take three string values: ‘entropy’ (negative entropy), ‘kl’ (Kullback-Leibler) or ‘l2’ (half-squared), or a tuple of two callable functions returning the regularization term and its derivative. Note that the callable functions should be able to handle Numpy arrays and not tensors from the backend.

  • regm_div (string, optional) – Divergence to quantify the difference between the marginals. Can take three values: ‘kl’ (Kullback-Leibler) or ‘l2’ (half-squared) or ‘tv’ (Total Variation)

  • G0 (array-like (dim_a, dim_b)) – Initialization of the transport matrix

  • returnCost (string, optional (default = "linear")) – If returnCost = “linear”, then return the linear part of the unbalanced OT loss. If returnCost = “total”, then return the total unbalanced OT loss.

  • numItermax (int, optional) – Max number of iterations

  • stopThr (float, optional) – Stop threshold on error (> 0)

  • verbose (bool, optional) – Print information along iterations

  • log (bool, optional) – record log if True

Returns:

  • ot_cost (array-like) – the OT cost between \(\mathbf{a}\) and \(\mathbf{b}\)

  • log (dict) – log dictionary returned only if log is True

Examples

>>> import ot
>>> import numpy as np
>>> a=[.5, .5]
>>> b=[.5, .5]
>>> M=[[1., 36.],[9., 4.]]
>>> np.round(ot.unbalanced.lbfgsb_unbalanced2(a, b, M, reg=0, reg_m=5, reg_div='kl', regm_div='kl'), 2)
1.79
>>> np.round(ot.unbalanced.lbfgsb_unbalanced2(a, b, M, reg=0, reg_m=5, reg_div='l2', regm_div='l2'), 2)
0.8
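
The returnCost option selects which value is reported; a hedged sketch reusing a, b and M from the example above (no numerical value is claimed here, since the total cost also includes the marginal relaxation terms):

>>> cost_total = ot.unbalanced.lbfgsb_unbalanced2(a, b, M, reg=0, reg_m=5,
...                                               reg_div='kl', regm_div='kl',
...                                               returnCost='total')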

References

See also

ot.lp.emd2

Unregularized OT loss

ot.unbalanced.sinkhorn_unbalanced2

Entropic regularized OT loss

ot.unbalanced.mm_unbalanced(a, b, M, reg_m, c=None, reg=0, div='kl', G0=None, numItermax=1000, stopThr=1e-15, verbose=False, log=False)[source]

Solve the unbalanced optimal transport problem and return the OT plan. The function solves the following optimization problem:

\[ \begin{align}\begin{aligned}W = \arg \min_\gamma \quad \langle \gamma, \mathbf{M} \rangle_F + \mathrm{reg_{m1}} \cdot \mathrm{div}(\gamma \mathbf{1}, \mathbf{a}) + \mathrm{reg_{m2}} \cdot \mathrm{div}(\gamma^T \mathbf{1}, \mathbf{b}) + \mathrm{reg} \cdot \mathrm{div}(\gamma, \mathbf{c})\\s.t. \gamma \geq 0\end{aligned}\end{align} \]

where:

  • \(\mathbf{M}\) is the (dim_a, dim_b) metric cost matrix

  • \(\mathbf{a}\) and \(\mathbf{b}\) are source and target unbalanced distributions

  • \(\mathbf{c}\) is a reference distribution for the regularization

  • div is a divergence, either Kullback-Leibler or half-squared \(\ell_2\) divergence

The algorithm used for solving the problem is a maximization-minimization algorithm as proposed in [41]

Parameters:
  • a (array-like (dim_a,)) – Unnormalized histogram of dimension dim_a. If a is an empty list or array ([]), then a is set to the uniform distribution.

  • b (array-like (dim_b,)) – Unnormalized histogram of dimension dim_b. If b is an empty list or array ([]), then b is set to the uniform distribution.

  • M (array-like (dim_a, dim_b)) – loss matrix

  • reg_m (float or indexable object of length 1 or 2) – Marginal relaxation term: nonnegative but cannot be infinity. If \(\mathrm{reg_{m}}\) is a scalar or an indexable object of length 1, then the same \(\mathrm{reg_{m}}\) is applied to both marginal relaxations. If \(\mathrm{reg_{m}}\) is an array, it must have the same backend as input arrays (a, b, M).

  • reg (float, optional (default = 0)) – Regularization term >= 0. By default, solve the unregularized problem

  • c (array-like (dim_a, dim_b), optional (default = None)) – Reference measure for the regularization. If None, then use \(\mathbf{c} = \mathbf{a} \mathbf{b}^T\).

  • div (string, optional) – Divergence to quantify the difference between the marginals. Can take two values: ‘kl’ (Kullback-Leibler) or ‘l2’ (half-squared)

  • G0 (array-like (dim_a, dim_b)) – Initialization of the transport matrix

  • numItermax (int, optional) – Max number of iterations

  • stopThr (float, optional) – Stop threshold on error (> 0)

  • verbose (bool, optional) – Print information along iterations

  • log (bool, optional) – record log if True

Returns:

  • gamma ((dim_a, dim_b) array-like) – Optimal transportation matrix for the given parameters

  • log (dict) – log dictionary returned only if log is True

Examples

>>> import ot
>>> import numpy as np
>>> a=[.5, .5]
>>> b=[.5, .5]
>>> M=[[1., 36.],[9., 4.]]
>>> np.round(ot.unbalanced.mm_unbalanced(a, b, M, 5, div='kl'), 2)
array([[0.45, 0.  ],
       [0.  , 0.34]])
>>> np.round(ot.unbalanced.mm_unbalanced(a, b, M, 5, div='l2'), 2)
array([[0.4, 0. ],
       [0. , 0.1]])
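
A regularization term \(\mathrm{div}(\gamma, \mathbf{c})\) can be added on top of the marginal relaxation by setting reg > 0, as in the problem above; a hedged sketch reusing a, b and M from the example (output not shown, since it depends on the regularization strength):

>>> G = ot.unbalanced.mm_unbalanced(a, b, M, reg_m=5, reg=0.1, div='kl')
>>> G.shape
(2, 2)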

References

See also

ot.lp.emd

Unregularized OT

ot.unbalanced.sinkhorn_unbalanced

Entropic regularized OT

ot.unbalanced.mm_unbalanced2(a, b, M, reg_m, c=None, reg=0, div='kl', G0=None, returnCost='linear', numItermax=1000, stopThr=1e-15, verbose=False, log=False)[source]

Solve the unbalanced optimal transport problem and return the OT cost. The function solves the following optimization problem:

\[ \begin{align}\begin{aligned}\min_\gamma \quad \langle \gamma, \mathbf{M} \rangle_F + \mathrm{reg_{m1}} \cdot \mathrm{div}(\gamma \mathbf{1}, \mathbf{a}) + \mathrm{reg_{m2}} \cdot \mathrm{div}(\gamma^T \mathbf{1}, \mathbf{b}) + \mathrm{reg} \cdot \mathrm{div}(\gamma, \mathbf{c})\\s.t. \gamma \geq 0\end{aligned}\end{align} \]

where:

  • \(\mathbf{M}\) is the (dim_a, dim_b) metric cost matrix

  • \(\mathbf{a}\) and \(\mathbf{b}\) are source and target unbalanced distributions

  • \(\mathbf{c}\) is a reference distribution for the regularization

  • \(\mathrm{div}\) is a divergence, either Kullback-Leibler or half-squared \(\ell_2\) divergence

The algorithm used for solving the problem is a maximization-minimization algorithm as proposed in [41]

Parameters:
  • a (array-like (dim_a,)) – Unnormalized histogram of dimension dim_a. If a is an empty list or array ([]), then a is set to the uniform distribution.

  • b (array-like (dim_b,)) – Unnormalized histogram of dimension dim_b. If b is an empty list or array ([]), then b is set to the uniform distribution.

  • M (array-like (dim_a, dim_b)) – loss matrix

  • reg_m (float or indexable object of length 1 or 2) – Marginal relaxation term: nonnegative but cannot be infinity. If \(\mathrm{reg_{m}}\) is a scalar or an indexable object of length 1, then the same \(\mathrm{reg_{m}}\) is applied to both marginal relaxations. If \(\mathrm{reg_{m}}\) is an array, it must have the same backend as input arrays (a, b, M).

  • reg (float, optional (default = 0)) – Regularization term >= 0. By default, solve the unregularized problem

  • c (array-like (dim_a, dim_b), optional (default = None)) – Reference measure for the regularization. If None, then use \(\mathbf{c} = \mathbf{a} \mathbf{b}^T\).

  • div (string, optional) – Divergence to quantify the difference between the marginals. Can take two values: ‘kl’ (Kullback-Leibler) or ‘l2’ (half-squared)

  • G0 (array-like (dim_a, dim_b)) – Initialization of the transport matrix

  • returnCost (string, optional (default = "linear")) – If returnCost = “linear”, then return the linear part of the unbalanced OT loss. If returnCost = “total”, then return the total unbalanced OT loss.

  • numItermax (int, optional) – Max number of iterations

  • stopThr (float, optional) – Stop threshold on error (> 0)

  • verbose (bool, optional) – Print information along iterations

  • log (bool, optional) – record log if True

Returns:

  • ot_cost (array-like) – the OT cost between \(\mathbf{a}\) and \(\mathbf{b}\)

  • log (dict) – log dictionary returned only if log is True

Examples

>>> import ot
>>> import numpy as np
>>> a=[.5, .5]
>>> b=[.5, .5]
>>> M=[[1., 36.],[9., 4.]]
>>> np.round(ot.unbalanced.mm_unbalanced2(a, b, M, 5, div='l2'), 2)
0.8
>>> np.round(ot.unbalanced.mm_unbalanced2(a, b, M, 5, div='kl'), 2)
1.79

References

See also

ot.lp.emd2

Unregularized OT loss

ot.unbalanced.sinkhorn_unbalanced2

Entropic regularized OT loss

ot.unbalanced.sinkhorn_knopp_unbalanced(a, b, M, reg, reg_m, reg_type='kl', c=None, warmstart=None, numItermax=1000, stopThr=1e-06, verbose=False, log=False, **kwargs)[source]

Solve the entropic regularization unbalanced optimal transport problem and return the OT plan

The function solves the following optimization problem:

\[ \begin{align}\begin{aligned}W = \arg \min_\gamma \quad \langle \gamma, \mathbf{M} \rangle_F + \mathrm{reg} \cdot \mathrm{KL}(\gamma, \mathbf{c}) + \mathrm{reg_{m1}} \cdot \mathrm{KL}(\gamma \mathbf{1}, \mathbf{a}) + \mathrm{reg_{m2}} \cdot \mathrm{KL}(\gamma^T \mathbf{1}, \mathbf{b})\\s.t. \gamma \geq 0\end{aligned}\end{align} \]

where:

  • \(\mathbf{M}\) is the (dim_a, dim_b) metric cost matrix

  • \(\mathbf{a}\) and \(\mathbf{b}\) are source and target unbalanced distributions

  • \(\mathbf{c}\) is a reference distribution for the regularization

  • KL is the Kullback-Leibler divergence

The algorithm used for solving the problem is the generalized Sinkhorn-Knopp matrix scaling algorithm as proposed in [10, 25]

Parameters:
  • a (array-like (dim_a,)) – Unnormalized histogram of dimension dim_a. If a is an empty list or array ([]), then a is set to the uniform distribution.

  • b (array-like (dim_b,) or (dim_b, n_hists)) – One or multiple unnormalized histograms of dimension dim_b. If b is an empty list or array ([]), then b is set to the uniform distribution. If several histograms are given, compute all the OT costs \((\mathbf{a}, \mathbf{b}_i)_i\)

  • M (array-like (dim_a, dim_b)) – loss matrix

  • reg (float) – Entropy regularization term > 0

  • reg_m (float or indexable object of length 1 or 2) – Marginal relaxation term. If \(\mathrm{reg_{m}}\) is a scalar or an indexable object of length 1, then the same \(\mathrm{reg_{m}}\) is applied to both marginal relaxations. The entropic balanced OT can be recovered using \(\mathrm{reg_{m}}=float("inf")\). For semi-relaxed case, use either \(\mathrm{reg_{m}}=(float("inf"), scalar)\) or \(\mathrm{reg_{m}}=(scalar, float("inf"))\). If \(\mathrm{reg_{m}}\) is an array, it must have the same backend as input arrays (a, b, M).

  • reg_type (string, optional) – Regularizer term. Can take two values: ‘entropy’ (negative entropy): \(\Omega(\gamma) = \sum_{i,j} \gamma_{i,j} \log(\gamma_{i,j}) - \sum_{i,j} \gamma_{i,j}\), which is equivalent (up to a constant) to \(\Omega(\gamma) = \text{KL}(\gamma, 1_{dim_a} 1_{dim_b}^T)\); or ‘kl’ (Kullback-Leibler divergence): \(\Omega(\gamma) = \text{KL}(\gamma, \mathbf{a} \mathbf{b}^T)\).

  • c (array-like (dim_a, dim_b), optional (default=None)) – Reference measure for the regularization. If None, then use \(\mathbf{c} = \mathbf{a} \mathbf{b}^T\). If \(\texttt{reg_type}='entropy'\), then \(\mathbf{c} = 1_{dim_a} 1_{dim_b}^T\).

  • warmstart (tuple of arrays, shape (dim_a, dim_b), optional) – Initialization of dual potentials. If provided, the dual potentials should be given (that is the logarithm of the u, v sinkhorn scaling vectors).

  • numItermax (int, optional) – Max number of iterations

  • stopThr (float, optional) – Stop threshold on error (> 0)

  • verbose (bool, optional) – Print information along iterations

  • log (bool, optional) – record log if True

Returns:

  • if n_hists == 1

    • gamma ((dim_a, dim_b) array-like) – Optimal transportation matrix for the given parameters

    • log (dict) – log dictionary returned only if log is True

  • else

    • ot_cost ((n_hists,) array-like) – the OT cost between \(\mathbf{a}\) and each of the histograms \(\mathbf{b}_i\)

    • log (dict) – log dictionary returned only if log is True

Examples

>>> import ot
>>> import numpy as np
>>> a=[.5, .5]
>>> b=[.5, .5]
>>> M=[[0., 1.],[1., 0.]]
>>> np.round(ot.unbalanced.sinkhorn_knopp_unbalanced(a, b, M, 1., 1.), 7)
array([[0.3220536, 0.1184769],
       [0.1184769, 0.3220536]])
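
As documented for reg_m above, passing a pair with float("inf") on one side yields a semi-relaxed problem in which the corresponding marginal is kept fixed; a hedged sketch reusing a, b and M from the example above (values not shown):

>>> G = ot.unbalanced.sinkhorn_knopp_unbalanced(a, b, M, 1., (float("inf"), 1.))
>>> G.shape
(2, 2)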

References

See also

ot.lp.emd

Unregularized OT

ot.optim.cg

General regularized OT

ot.unbalanced.sinkhorn_stabilized_unbalanced(a, b, M, reg, reg_m, reg_type='kl', c=None, warmstart=None, tau=100000.0, numItermax=1000, stopThr=1e-06, verbose=False, log=False, **kwargs)[source]

Solve the entropic regularization unbalanced optimal transport problem and return the OT plan

The function solves the following optimization problem using log-domain stabilization as proposed in [10]:

\[ \begin{align}\begin{aligned}W = \arg \min_\gamma \quad \langle \gamma, \mathbf{M} \rangle_F + \mathrm{reg} \cdot \mathrm{KL}(\gamma, \mathbf{c}) + \mathrm{reg_{m1}} \cdot \mathrm{KL}(\gamma \mathbf{1}, \mathbf{a}) + \mathrm{reg_{m2}} \cdot \mathrm{KL}(\gamma^T \mathbf{1}, \mathbf{b})\\s.t. \gamma \geq 0\end{aligned}\end{align} \]

where:

  • \(\mathbf{M}\) is the (dim_a, dim_b) metric cost matrix

  • \(\mathbf{a}\) and \(\mathbf{b}\) are source and target unbalanced distributions

  • \(\mathbf{c}\) is a reference distribution for the regularization

  • KL is the Kullback-Leibler divergence

The algorithm used for solving the problem is the generalized Sinkhorn-Knopp matrix scaling algorithm as proposed in [10, 25]

Parameters:
  • a (array-like (dim_a,)) – Unnormalized histogram of dimension dim_a. If a is an empty list or array ([]), then a is set to the uniform distribution.

  • b (array-like (dim_b,) or (dim_b, n_hists)) – One or multiple unnormalized histograms of dimension dim_b. If b is an empty list or array ([]), then b is set to the uniform distribution. If several histograms are given, compute all the OT costs \((\mathbf{a}, \mathbf{b}_i)_i\)

  • M (array-like (dim_a, dim_b)) – loss matrix

  • reg (float) – Entropy regularization term > 0

  • reg_m (float or indexable object of length 1 or 2) – Marginal relaxation term. If \(\mathrm{reg_{m}}\) is a scalar or an indexable object of length 1, then the same \(\mathrm{reg_{m}}\) is applied to both marginal relaxations. The entropic balanced OT can be recovered using \(\mathrm{reg_{m}}=float("inf")\). For semi-relaxed case, use either \(\mathrm{reg_{m}}=(float("inf"), scalar)\) or \(\mathrm{reg_{m}}=(scalar, float("inf"))\). If \(\mathrm{reg_{m}}\) is an array, it must have the same backend as input arrays (a, b, M).

  • method (str) – method used for the solver, either ‘sinkhorn’, ‘sinkhorn_stabilized’ or ‘sinkhorn_reg_scaling’; see those functions for specific parameters

  • reg_type (string, optional) – Regularizer term. Can take two values: ‘entropy’ (negative entropy): \(\Omega(\gamma) = \sum_{i,j} \gamma_{i,j} \log(\gamma_{i,j}) - \sum_{i,j} \gamma_{i,j}\), which is equivalent (up to a constant) to \(\Omega(\gamma) = \text{KL}(\gamma, 1_{dim_a} 1_{dim_b}^T)\); or ‘kl’ (Kullback-Leibler divergence): \(\Omega(\gamma) = \text{KL}(\gamma, \mathbf{a} \mathbf{b}^T)\).

  • c (array-like (dim_a, dim_b), optional (default=None)) – Reference measure for the regularization. If None, then use \(\mathbf{c} = \mathbf{a} \mathbf{b}^T\). If \(\texttt{reg_type}='entropy'\), then \(\mathbf{c} = 1_{dim_a} 1_{dim_b}^T\).

  • warmstart (tuple of arrays, shape (dim_a, dim_b), optional) – Initialization of dual potentials. If provided, the dual potentials should be given (that is the logarithm of the u, v sinkhorn scaling vectors).

  • tau (float) – threshold for max value in u or v for log scaling

  • numItermax (int, optional) – Max number of iterations

  • stopThr (float, optional) – Stop threshold on error (>0)

  • verbose (bool, optional) – Print information along iterations

  • log (bool, optional) – record log if True

Returns:

  • if n_hists == 1

    • gamma ((dim_a, dim_b) array-like) – Optimal transportation matrix for the given parameters

    • log (dict) – log dictionary returned only if log is True

  • else

    • ot_cost ((n_hists,) array-like) – the OT cost between \(\mathbf{a}\) and each of the histograms \(\mathbf{b}_i\)

    • log (dict) – log dictionary returned only if log is True

Examples

>>> import ot
>>> import numpy as np
>>> a=[.5, .5]
>>> b=[.5, .5]
>>> M=[[0., 1.],[1., 0.]]
>>> np.round(ot.unbalanced.sinkhorn_stabilized_unbalanced(a, b, M, 1., 1.), 7)
array([[0.3220536, 0.1184769],
       [0.1184769, 0.3220536]])

References

See also

ot.lp.emd

Unregularized OT

ot.optim.cg

General regularized OT

ot.unbalanced.sinkhorn_unbalanced(a, b, M, reg, reg_m, method='sinkhorn', reg_type='kl', c=None, warmstart=None, numItermax=1000, stopThr=1e-06, verbose=False, log=False, **kwargs)[source]

Solve the unbalanced entropic regularization optimal transport problem and return the OT plan

The function solves the following optimization problem:

\[ \begin{align}\begin{aligned}W = \arg \min_\gamma \ \langle \gamma, \mathbf{M} \rangle_F + \mathrm{reg} \cdot \mathrm{KL}(\gamma, \mathbf{c}) + \mathrm{reg_{m1}} \cdot \mathrm{KL}(\gamma \mathbf{1}, \mathbf{a}) + \mathrm{reg_{m2}} \cdot \mathrm{KL}(\gamma^T \mathbf{1}, \mathbf{b})\\s.t. \gamma \geq 0\end{aligned}\end{align} \]

where:

  • \(\mathbf{M}\) is the (dim_a, dim_b) metric cost matrix

  • \(\mathbf{a}\) and \(\mathbf{b}\) are source and target unbalanced distributions

  • \(\mathbf{c}\) is a reference distribution for the regularization

  • KL is the Kullback-Leibler divergence

The algorithm used for solving the problem is the generalized Sinkhorn-Knopp matrix scaling algorithm as proposed in [10, 25]

Parameters:
  • a (array-like (dim_a,)) – Unnormalized histogram of dimension dim_a. If a is an empty list or array ([]), then a is set to the uniform distribution.

  • b (array-like (dim_b,) or (dim_b, n_hists)) – One or multiple unnormalized histograms of dimension dim_b. If b is an empty list or array ([]), then b is set to the uniform distribution. If several histograms are given, compute all the OT costs \((\mathbf{a}, \mathbf{b}_i)_i\)

  • M (array-like (dim_a, dim_b)) – loss matrix

  • reg (float) – Entropy regularization term > 0

  • reg_m (float or indexable object of length 1 or 2) – Marginal relaxation term. If \(\mathrm{reg_{m}}\) is a scalar or an indexable object of length 1, then the same \(\mathrm{reg_{m}}\) is applied to both marginal relaxations. The entropic balanced OT can be recovered using \(\mathrm{reg_{m}}=float("inf")\). For semi-relaxed case, use either \(\mathrm{reg_{m}}=(float("inf"), scalar)\) or \(\mathrm{reg_{m}}=(scalar, float("inf"))\). If \(\mathrm{reg_{m}}\) is an array, it must have the same backend as input arrays (a, b, M).

  • method (str) – method used for the solver, either ‘sinkhorn’, ‘sinkhorn_stabilized’, ‘sinkhorn_translation_invariant’ or ‘sinkhorn_reg_scaling’; see those functions for specific parameters

  • reg_type (string, optional) – Regularizer term. Can take two values: ‘entropy’ (negative entropy): \(\Omega(\gamma) = \sum_{i,j} \gamma_{i,j} \log(\gamma_{i,j}) - \sum_{i,j} \gamma_{i,j}\), which is equivalent (up to a constant) to \(\Omega(\gamma) = \text{KL}(\gamma, 1_{dim_a} 1_{dim_b}^T)\); or ‘kl’ (Kullback-Leibler divergence): \(\Omega(\gamma) = \text{KL}(\gamma, \mathbf{a} \mathbf{b}^T)\).

  • c (array-like (dim_a, dim_b), optional (default=None)) – Reference measure for the regularization. If None, then use \(\mathbf{c} = \mathbf{a} \mathbf{b}^T\). If \(\texttt{reg_type}='entropy'\), then \(\mathbf{c} = 1_{dim_a} 1_{dim_b}^T\).

  • warmstart (tuple of arrays, shape (dim_a, dim_b), optional) – Initialization of dual potentials. If provided, the dual potentials should be given (that is the logarithm of the u, v sinkhorn scaling vectors).

  • numItermax (int, optional) – Max number of iterations

  • stopThr (float, optional) – Stop threshold on error (>0)

  • verbose (bool, optional) – Print information along iterations

  • log (bool, optional) – record log if True

Returns:

  • if n_hists == 1

    • gamma ((dim_a, dim_b) array-like) – Optimal transportation matrix for the given parameters

    • log (dict) – log dictionary returned only if log is True

  • else

    • ot_distance ((n_hists,) array-like) – the OT distance between \(\mathbf{a}\) and each of the histograms \(\mathbf{b}_i\)

    • log (dict) – log dictionary returned only if log is True

Examples

>>> import ot
>>> import numpy as np
>>> a=[.5, .5]
>>> b=[.5, .5]
>>> M=[[0., 1.], [1., 0.]]
>>> np.round(ot.sinkhorn_unbalanced(a, b, M, 1, 1), 7)
array([[0.3220536, 0.1184769],
       [0.1184769, 0.3220536]])
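
The same problem can be routed to the log-stabilized solver through the method argument; for these inputs the result matches the ot.unbalanced.sinkhorn_stabilized_unbalanced example above (rounded to 7 decimals):

>>> np.round(ot.sinkhorn_unbalanced(a, b, M, 1, 1, method='sinkhorn_stabilized'), 7)
array([[0.3220536, 0.1184769],
       [0.1184769, 0.3220536]])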

References

See also

ot.unbalanced.sinkhorn_knopp_unbalanced

Unbalanced Classic Sinkhorn [10]

ot.unbalanced.sinkhorn_stabilized_unbalanced

Unbalanced Stabilized sinkhorn [9, 10]

ot.unbalanced.sinkhorn_reg_scaling_unbalanced

Unbalanced Sinkhorn with epsilon scaling [9, 10]

ot.unbalanced.sinkhorn_unbalanced_translation_invariant

Translation Invariant Unbalanced Sinkhorn [73]

ot.unbalanced.sinkhorn_unbalanced2(a, b, M, reg, reg_m, method='sinkhorn', reg_type='kl', c=None, warmstart=None, returnCost='linear', numItermax=1000, stopThr=1e-06, verbose=False, log=False, **kwargs)[source]

Solve the entropic regularization unbalanced optimal transport problem and return the cost

The function solves the following optimization problem:

\[ \begin{align}\begin{aligned}\min_\gamma \quad \langle \gamma, \mathbf{M} \rangle_F + \mathrm{reg} \cdot \mathrm{KL}(\gamma, \mathbf{c}) + \mathrm{reg_{m1}} \cdot \mathrm{KL}(\gamma \mathbf{1}, \mathbf{a}) + \mathrm{reg_{m2}} \cdot \mathrm{KL}(\gamma^T \mathbf{1}, \mathbf{b})\\s.t. \gamma\geq 0\end{aligned}\end{align} \]

where:

  • \(\mathbf{M}\) is the (dim_a, dim_b) metric cost matrix

  • \(\mathbf{a}\) and \(\mathbf{b}\) are source and target unbalanced distributions

  • \(\mathbf{c}\) is a reference distribution for the regularization

  • KL is the Kullback-Leibler divergence

The algorithm used for solving the problem is the generalized Sinkhorn-Knopp matrix scaling algorithm as proposed in [10, 25]

Parameters:
  • a (array-like (dim_a,)) – Unnormalized histogram of dimension dim_a. If a is an empty list or array ([]), then a is set to the uniform distribution.

  • b (array-like (dim_b,) or (dim_b, n_hists)) – One or multiple unnormalized histograms of dimension dim_b. If b is an empty list or array ([]), then b is set to the uniform distribution. If several histograms are given, compute all the OT costs \((\mathbf{a}, \mathbf{b}_i)_i\)

  • M (array-like (dim_a, dim_b)) – loss matrix

  • reg (float) – Entropy regularization term > 0

  • reg_m (float or indexable object of length 1 or 2) – Marginal relaxation term. If \(\mathrm{reg_{m}}\) is a scalar or an indexable object of length 1, then the same \(\mathrm{reg_{m}}\) is applied to both marginal relaxations. The entropic balanced OT can be recovered using \(\mathrm{reg_{m}}=float("inf")\). For semi-relaxed case, use either \(\mathrm{reg_{m}}=(float("inf"), scalar)\) or \(\mathrm{reg_{m}}=(scalar, float("inf"))\). If \(\mathrm{reg_{m}}\) is an array, it must have the same backend as input arrays (a, b, M).

  • method (str) – method used for the solver, either ‘sinkhorn’, ‘sinkhorn_stabilized’, ‘sinkhorn_translation_invariant’ or ‘sinkhorn_reg_scaling’; see those functions for specific parameters

  • reg_type (string, optional) – Regularizer term. Can take two values: ‘entropy’ (negative entropy): \(\Omega(\gamma) = \sum_{i,j} \gamma_{i,j} \log(\gamma_{i,j}) - \sum_{i,j} \gamma_{i,j}\), which is equivalent (up to a constant) to \(\Omega(\gamma) = \text{KL}(\gamma, 1_{dim_a} 1_{dim_b}^T)\); or ‘kl’ (Kullback-Leibler divergence): \(\Omega(\gamma) = \text{KL}(\gamma, \mathbf{a} \mathbf{b}^T)\).

  • c (array-like (dim_a, dim_b), optional (default=None)) – Reference measure for the regularization. If None, then use \(\mathbf{c} = \mathbf{a} \mathbf{b}^T\). If \(\texttt{reg_type}='entropy'\), then \(\mathbf{c} = 1_{dim_a} 1_{dim_b}^T\).

  • warmstart (tuple of arrays, shape (dim_a, dim_b), optional) – Initialization of dual potentials. If provided, the dual potentials should be given (that is the logarithm of the u,v sinkhorn scaling vectors).

  • returnCost (string, optional (default = "linear")) – If returnCost = “linear”, then return the linear part of the unbalanced OT loss. If returnCost = “total”, then return the total unbalanced OT loss.

  • numItermax (int, optional) – Max number of iterations

  • stopThr (float, optional) – Stop threshold on error (>0)

  • verbose (bool, optional) – Print information along iterations

  • log (bool, optional) – record log if True

Returns:

  • ot_cost ((n_hists,) array-like) – the OT cost between \(\mathbf{a}\) and each of the histograms \(\mathbf{b}_i\)

  • log (dict) – log dictionary returned only if log is True

Examples

>>> import ot
>>> import numpy as np
>>> a=[.5, .10]
>>> b=[.5, .5]
>>> M=[[0., 1.],[1., 0.]]
>>> np.round(ot.unbalanced.sinkhorn_unbalanced2(a, b, M, 1., 1.), 8)
0.19600125

References

See also

ot.unbalanced.sinkhorn_knopp_unbalanced

Unbalanced Classic Sinkhorn [10]

ot.unbalanced.sinkhorn_stabilized_unbalanced

Unbalanced Stabilized sinkhorn [9, 10]

ot.unbalanced.sinkhorn_reg_scaling_unbalanced

Unbalanced Sinkhorn with epsilon scaling [9, 10]

ot.unbalanced.sinkhorn_unbalanced_translation_invariant

Translation Invariant Unbalanced Sinkhorn [73]

ot.unbalanced.sinkhorn_unbalanced_translation_invariant(a, b, M, reg, reg_m, reg_type='kl', c=None, warmstart=None, numItermax=1000, stopThr=1e-06, verbose=False, log=False, **kwargs)[source]

Solve the entropic regularization unbalanced optimal transport problem and return the OT plan

The function solves the following optimization problem:

\[ \begin{align}\begin{aligned}W = \arg \min_\gamma \ \langle \gamma, \mathbf{M} \rangle_F + \mathrm{reg} \cdot \mathrm{KL}(\gamma, \mathbf{c}) + \mathrm{reg_{m1}} \cdot \mathrm{KL}(\gamma \mathbf{1}, \mathbf{a}) + \mathrm{reg_{m2}} \cdot \mathrm{KL}(\gamma^T \mathbf{1}, \mathbf{b})\\s.t. \gamma \geq 0\end{aligned}\end{align} \]

where:

  • \(\mathbf{M}\) is the (dim_a, dim_b) metric cost matrix

  • \(\Omega\) is the entropic regularization term, here the KL divergence

  • \(\mathbf{a}\) and \(\mathbf{b}\) are source and target unbalanced distributions

  • KL is the Kullback-Leibler divergence

The algorithm used for solving the problem is the translation invariant Sinkhorn algorithm as proposed in [73]

Parameters:
  • a (array-like (dim_a,)) – Unnormalized histogram of dimension dim_a

  • b (array-like (dim_b,) or array-like (dim_b, n_hists)) – One or multiple unnormalized histograms of dimension dim_b. If several histograms are given, compute all the OT distances \((\mathbf{a}, \mathbf{b}_i)\)

  • M (array-like (dim_a, dim_b)) – loss matrix

  • reg (float) – Entropy regularization term > 0

  • reg_m (float or indexable object of length 1 or 2) – Marginal relaxation term. If reg_m is a scalar or an indexable object of length 1, then the same reg_m is applied to both marginal relaxations. The entropic balanced OT can be recovered using reg_m=float(“inf”). For semi-relaxed case, use either reg_m=(float(“inf”), scalar) or reg_m=(scalar, float(“inf”)). If reg_m is an array, it must have the same backend as input arrays (a, b, M).

  • reg_type (string, optional) – Regularizer term. Can take two values: ‘entropy’ (negative entropy) \(\Omega(\gamma) = \sum_{i,j} \gamma_{i,j} \log(\gamma_{i,j}) - \sum_{i,j} \gamma_{i,j}\), or ‘kl’ (Kullback-Leibler) \(\Omega(\gamma) = \text{KL}(\gamma, \mathbf{a} \mathbf{b}^T)\).

  • c (array-like (dim_a, dim_b), optional (default=None)) – Reference measure for the regularization. If None, then use \(\mathbf{c} = \mathbf{a} \mathbf{b}^T\). If \(\texttt{reg_type}='entropy'\), then \(\mathbf{c} = 1_{dim_a} 1_{dim_b}^T\).

  • warmstart (tuple of arrays, shape (dim_a, dim_b), optional) – Initialization of dual potentials. If provided, the dual potentials should be given (that is the logarithm of the u,v sinkhorn scaling vectors).

  • numItermax (int, optional) – Max number of iterations

  • stopThr (float, optional) – Stop threshold on error (> 0)

  • verbose (bool, optional) – Print information along iterations

  • log (bool, optional) – record log if True

Returns:

  • if n_hists == 1

    • gamma ((dim_a, dim_b) array-like) – Optimal transportation matrix for the given parameters

    • log (dict) – log dictionary returned only if log is True

  • else

    • ot_distance ((n_hists,) array-like) – the OT distance between \(\mathbf{a}\) and each of the histograms \(\mathbf{b}_i\)

    • log (dict) – log dictionary returned only if log is True

Examples

>>> import ot
>>> a=[.5, .5]
>>> b=[.5, .5]
>>> M=[[0., 1.],[1., 0.]]
>>> ot.unbalanced.sinkhorn_unbalanced_translation_invariant(a, b, M, 1., 1.)
array([[0.32205357, 0.11847689],
       [0.11847689, 0.32205357]])

References