ot.unbalanced

Solvers related to Unbalanced Optimal Transport problems.

ot.unbalanced.barycenter_unbalanced(A, M, reg, reg_m, method='sinkhorn', weights=None, numItermax=1000, stopThr=1e-06, verbose=False, log=False, **kwargs)[source]

Compute the entropic unbalanced wasserstein barycenter of \(\mathbf{A}\).

The function solves the following optimization problem with \(\mathbf{a}\)

\[\mathbf{a} = \mathop{\arg \min}_\mathbf{a} \quad \sum_i W_{u_{reg}}(\mathbf{a},\mathbf{a}_i)\]

where :

\(W_{u_{reg}}(\cdot,\cdot)\) is the unbalanced entropic regularized Wasserstein distance (see ot.unbalanced.sinkhorn_unbalanced())
\(\mathbf{a}_i\) are training distributions in the columns of matrix \(\mathbf{A}\)
reg and \(\mathbf{M}\) are respectively the regularization term and the cost matrix for OT
reg_mis the marginal relaxation hyperparameter

The algorithm used for solving the problem is the generalized Sinkhorn-Knopp matrix scaling algorithm as proposed in [10]

Parameters:

A (array-like (dim, n_hists)) – n_hists training distributions \(\mathbf{a}_i\) of dimension dim
M (array-like (dim, dim)) – ground metric matrix for OT.
reg (float) – Entropy regularization term > 0
reg_m (float) – Marginal relaxation term > 0
weights (array-like (n_hists,) optional) – Weight of each distribution (barycentric coordinates) If None, uniform weights are used.
numItermax (int, optional) – Max number of iterations
stopThr (float, optional) – Stop threshold on error (> 0)
verbose (bool, optional) – Print information along iterations
log (bool, optional) – record log if True

Returns:

a ((dim,) array-like) – Unbalanced Wasserstein barycenter
log (dict) – log dictionary return only if log==True in parameters

References

ot.unbalanced.barycenter_unbalanced_sinkhorn(A, M, reg, reg_m, weights=None, numItermax=1000, stopThr=1e-06, verbose=False, log=False)[source]

Compute the entropic unbalanced wasserstein barycenter of \(\mathbf{A}\).

The function solves the following optimization problem with \(\mathbf{a}\)

\[\mathbf{a} = \mathop{\arg \min}_\mathbf{a} \quad \sum_i W_{u_{reg}}(\mathbf{a},\mathbf{a}_i)\]

where :

\(W_{u_{reg}}(\cdot,\cdot)\) is the unbalanced entropic regularized Wasserstein distance (see ot.unbalanced.sinkhorn_unbalanced())
\(\mathbf{a}_i\) are training distributions in the columns of matrix \(\mathbf{A}\)
reg and \(\mathbf{M}\) are respectively the regularization term and the cost matrix for OT
reg_mis the marginal relaxation hyperparameter

The algorithm used for solving the problem is the generalized Sinkhorn-Knopp matrix scaling algorithm as proposed in [10]

Parameters:

A (array-like (dim, n_hists)) – n_hists training distributions \(\mathbf{a}_i\) of dimension dim
M (array-like (dim, dim)) – ground metric matrix for OT.
reg (float) – Entropy regularization term > 0
reg_m (float) – Marginal relaxation term > 0
weights (array-like (n_hists,) optional) – Weight of each distribution (barycentric coordinates) If None, uniform weights are used.
numItermax (int, optional) – Max number of iterations
stopThr (float, optional) – Stop threshold on error (> 0)
verbose (bool, optional) – Print information along iterations
log (bool, optional) – record log if True

Returns:

a ((dim,) array-like) – Unbalanced Wasserstein barycenter
log (dict) – log dictionary return only if \(log==True\) in parameters

References

ot.unbalanced.barycenter_unbalanced_stabilized(A, M, reg, reg_m, weights=None, tau=1000.0, numItermax=1000, stopThr=1e-06, verbose=False, log=False)[source]

Compute the entropic unbalanced wasserstein barycenter of \(\mathbf{A}\) with stabilization.

The function solves the following optimization problem:

\[\mathbf{a} = \mathop{\arg \min}_\mathbf{a} \quad \sum_i W_{u_{reg}}(\mathbf{a},\mathbf{a}_i)\]

where :

\(W_{u_{reg}}(\cdot,\cdot)\) is the unbalanced entropic regularized Wasserstein distance (see ot.unbalanced.sinkhorn_unbalanced())
\(\mathbf{a}_i\) are training distributions in the columns of matrix \(\mathbf{A}\)
reg and \(\mathbf{M}\) are respectively the regularization term and the cost matrix for OT
reg_mis the marginal relaxation hyperparameter

The algorithm used for solving the problem is the generalized Sinkhorn-Knopp matrix scaling algorithm as proposed in [10]

Parameters:

A (array-like (dim, n_hists)) – n_hists training distributions \(\mathbf{a}_i\) of dimension dim
M (array-like (dim, dim)) – ground metric matrix for OT.
reg (float) – Entropy regularization term > 0
reg_m (float) – Marginal relaxation term > 0
tau (float) – Stabilization threshold for log domain absorption.
weights (array-like (n_hists,) optional) – Weight of each distribution (barycentric coordinates) If None, uniform weights are used.
numItermax (int, optional) – Max number of iterations
stopThr (float, optional) – Stop threshold on error (> 0)
verbose (bool, optional) – Print information along iterations
log (bool, optional) – record log if True

Returns:

a ((dim,) array-like) – Unbalanced Wasserstein barycenter
log (dict) – log dictionary return only if \(log==True\) in parameters

References

ot.unbalanced.lbfgsb_unbalanced(a, b, M, reg, reg_m, c=None, reg_div='kl', regm_div='kl', G0=None, numItermax=1000, stopThr=1e-15, method='L-BFGS-B', verbose=False, log=False)[source]

Solve the unbalanced optimal transport problem and return the OT plan using L-BFGS-B algorithm. The function solves the following optimization problem:

\[ \begin{align}\begin{aligned}W = \arg \min_\gamma \quad \langle \gamma, \mathbf{M} \rangle_F + \mathrm{reg} \mathrm{div}(\gamma, \mathbf{c}) + \mathrm{reg_{m1}} \cdot \mathrm{div_m}(\gamma \mathbf{1}, \mathbf{a}) + \mathrm{reg_{m2}} \cdot \mathrm{div_m}(\gamma^T \mathbf{1}, \mathbf{b})\\s.t. \gamma \geq 0\end{aligned}\end{align} \]

where:

\(\mathbf{M}\) is the (dim_a, dim_b) metric cost matrix
\(\mathbf{a}\) and \(\mathbf{b}\) are source and target unbalanced distributions
\(\mathbf{c}\) is a reference distribution for the regularization
\(\mathrm{div_m}\) is a divergence, either Kullback-Leibler divergence,

or half-squared \(\ell_2\) divergence, or Total variation - \(\mathrm{div}\) is a divergence, either Kullback-Leibler divergence, or half-squared \(\ell_2\) divergence

Note

This function is backend-compatible and will work on arrays from all compatible backends. First, it converts all arrays into Numpy arrays, then uses the L-BFGS-B algorithm from scipy.optimize to solve the optimization problem.

Parameters:

a (array-like (dim_a,)) – Unnormalized histogram of dimension dim_a If a is an empty list or array ([]), then a is set to uniform distribution.
b (array-like (dim_b,)) – Unnormalized histogram of dimension dim_b If b is an empty list or array ([]), then b is set to uniform distribution.
M (array-like (dim_a, dim_b)) – loss matrix
reg (float) – regularization term >=0
c (array-like (dim_a, dim_b), optional (default = None)) – Reference measure for the regularization. If None, then use \(\mathbf{c} = \mathbf{a} \mathbf{b}^T\).
reg_m (float or indexable object of length 1 or 2) – Marginal relaxation term: nonnegative (including 0) but cannot be infinity. If \(\mathrm{reg_{m}}\) is a scalar or an indexable object of length 1, then the same \(\mathrm{reg_{m}}\) is applied to both marginal relaxations. If \(\mathrm{reg_{m}}\) is an array, it must be a Numpy array.
reg_div (string, optional) – Divergence used for regularization. Can take three values: ‘entropy’ (negative entropy), or ‘kl’ (Kullback-Leibler) or ‘l2’ (half-squared) or a tuple of two calable functions returning the reg term and its derivative. Note that the callable functions should be able to handle Numpy arrays and not tesors from the backend
regm_div (string, optional) – Divergence to quantify the difference between the marginals. Can take three values: ‘kl’ (Kullback-Leibler) or ‘l2’ (half-squared) or ‘tv’ (Total Variation)
G0 (array-like (dim_a, dim_b)) – Initialization of the transport matrix
numItermax (int, optional) – Max number of iterations
stopThr (float, optional) – Stop threshold on error (> 0)
verbose (bool, optional) – Print information along iterations
log (bool, optional) – record log if True

Returns:

gamma ((dim_a, dim_b) array-like) – Optimal transportation matrix for the given parameters
log (dict) – log dictionary returned only if log is True

Examples

>>> import ot
>>> import numpy as np
>>> a=[.5, .5]
>>> b=[.5, .5]
>>> M=[[1., 36.],[9., 4.]]
>>> np.round(ot.unbalanced.lbfgsb_unbalanced(a, b, M, reg=0, reg_m=5, reg_div='kl', regm_div='kl'), 2)
array([[0.45, 0.  ],
       [0.  , 0.34]])
>>> np.round(ot.unbalanced.lbfgsb_unbalanced(a, b, M, reg=0, reg_m=5, reg_div='l2', regm_div='l2'), 2)
array([[0.4, 0. ],
       [0. , 0.1]])

References

See also

ot.lp.emd2: Unregularized OT loss
ot.unbalanced.sinkhorn_unbalanced2: Entropic regularized OT loss

ot.unbalanced.lbfgsb_unbalanced2(a, b, M, reg, reg_m, c=None, reg_div='kl', regm_div='kl', G0=None, returnCost='linear', numItermax=1000, stopThr=1e-15, method='L-BFGS-B', verbose=False, log=False)[source]

Solve the unbalanced optimal transport problem and return the OT cost using L-BFGS-B. The function solves the following optimization problem:

\[ \begin{align}\begin{aligned}\min_\gamma \quad \langle \gamma, \mathbf{M} \rangle_F + \mathrm{reg} \mathrm{div}(\gamma, \mathbf{c}) + \mathrm{reg_{m1}} \cdot \mathrm{div_m}(\gamma \mathbf{1}, \mathbf{a}) + \mathrm{reg_{m2}} \cdot \mathrm{div_m}(\gamma^T \mathbf{1}, \mathbf{b})\\s.t. \gamma \geq 0\end{aligned}\end{align} \]

where:

\(\mathbf{M}\) is the (dim_a, dim_b) metric cost matrix
\(\mathbf{a}\) and \(\mathbf{b}\) are source and target unbalanced distributions
\(\mathbf{c}\) is a reference distribution for the regularization
\(\mathrm{div_m}\) is a divergence, either Kullback-Leibler divergence,

or half-squared \(\ell_2\) divergence, or Total variation - \(\mathrm{div}\) is a divergence, either Kullback-Leibler divergence, or half-squared \(\ell_2\) divergence

Note

This function is backend-compatible and will work on arrays from all compatible backends. First, it converts all arrays into Numpy arrays, then uses the L-BFGS-B algorithm from scipy.optimize to solve the optimization problem.

Parameters:

a (array-like (dim_a,)) – Unnormalized histogram of dimension dim_a If a is an empty list or array ([]), then a is set to uniform distribution.
b (array-like (dim_b,)) – Unnormalized histogram of dimension dim_b If b is an empty list or array ([]), then b is set to uniform distribution.
M (array-like (dim_a, dim_b)) – loss matrix
reg (float) – regularization term >=0
c (array-like (dim_a, dim_b), optional (default = None)) – Reference measure for the regularization. If None, then use \(\mathbf{c} = \mathbf{a} \mathbf{b}^T\).
reg_m (float or indexable object of length 1 or 2) – Marginal relaxation term: nonnegative (including 0) but cannot be infinity. If \(\mathrm{reg_{m}}\) is a scalar or an indexable object of length 1, then the same \(\mathrm{reg_{m}}\) is applied to both marginal relaxations. If \(\mathrm{reg_{m}}\) is an array, it must be a Numpy array.
reg_div (string, optional) – Divergence used for regularization. Can take three values: ‘entropy’ (negative entropy), or ‘kl’ (Kullback-Leibler) or ‘l2’ (half-squared) or a tuple of two calable functions returning the reg term and its derivative. Note that the callable functions should be able to handle Numpy arrays and not tesors from the backend
regm_div (string, optional) – Divergence to quantify the difference between the marginals. Can take three values: ‘kl’ (Kullback-Leibler) or ‘l2’ (half-squared) or ‘tv’ (Total Variation)
G0 (array-like (dim_a, dim_b)) – Initialization of the transport matrix
returnCost (string, optional (default = "linear")) – If returnCost = “linear”, then return the linear part of the unbalanced OT loss. If returnCost = “total”, then return the total unbalanced OT loss.
numItermax (int, optional) – Max number of iterations
stopThr (float, optional) – Stop threshold on error (> 0)
verbose (bool, optional) – Print information along iterations
log (bool, optional) – record log if True

Returns:

ot_cost (array-like) – the OT cost between \(\mathbf{a}\) and \(\mathbf{b}\)
log (dict) – log dictionary returned only if log is True

Examples

>>> import ot
>>> import numpy as np
>>> a=[.5, .5]
>>> b=[.5, .5]
>>> M=[[1., 36.],[9., 4.]]
>>> np.round(ot.unbalanced.lbfgsb_unbalanced2(a, b, M, reg=0, reg_m=5, reg_div='kl', regm_div='kl'), 2)
1.79
>>> np.round(ot.unbalanced.lbfgsb_unbalanced2(a, b, M, reg=0, reg_m=5, reg_div='l2', regm_div='l2'), 2)
0.8

References

See also

ot.lp.emd2: Unregularized OT loss
ot.unbalanced.sinkhorn_unbalanced2: Entropic regularized OT loss

ot.unbalanced.mm_unbalanced(a, b, M, reg_m, c=None, reg=0, div='kl', G0=None, numItermax=1000, stopThr=1e-15, verbose=False, log=False)[source]

Solve the unbalanced optimal transport problem and return the OT plan. The function solves the following optimization problem:

\[ \begin{align}\begin{aligned}W = \arg \min_\gamma \quad \langle \gamma, \mathbf{M} \rangle_F + \mathrm{reg_{m1}} \cdot \mathrm{div}(\gamma \mathbf{1}, \mathbf{a}) + \mathrm{reg_{m2}} \cdot \mathrm{div}(\gamma^T \mathbf{1}, \mathbf{b}) + \mathrm{reg} \cdot \mathrm{div}(\gamma, \mathbf{c})\\s.t. \gamma \geq 0\end{aligned}\end{align} \]

where:

\(\mathbf{M}\) is the (dim_a, dim_b) metric cost matrix
\(\mathbf{a}\) and \(\mathbf{b}\) are source and target unbalanced distributions
\(\mathbf{c}\) is a reference distribution for the regularization
div is a divergence, either Kullback-Leibler or half-squared \(\ell_2\) divergence

The algorithm used for solving the problem is a maximization- minimization algorithm as proposed in [41]

Parameters:

a (array-like (dim_a,)) – Unnormalized histogram of dimension dim_a If a is an empty list or array ([]), then a is set to uniform distribution.
b (array-like (dim_b,)) – Unnormalized histogram of dimension dim_b If b is an empty list or array ([]), then b is set to uniform distribution.
M (array-like (dim_a, dim_b)) – loss matrix
reg_m (float or indexable object of length 1 or 2) – Marginal relaxation term: nonnegative but cannot be infinity. If \(\mathrm{reg_{m}}\) is a scalar or an indexable object of length 1, then the same \(\mathrm{reg_{m}}\) is applied to both marginal relaxations. If \(\mathrm{reg_{m}}\) is an array, it must have the same backend as input arrays (a, b, M).
reg (float, optional (default = 0)) – Regularization term >= 0. By default, solve the unregularized problem
c (array-like (dim_a, dim_b), optional (default = None)) – Reference measure for the regularization. If None, then use \(\mathbf{c} = \mathbf{a} \mathbf{b}^T\).
div (string, optional) – Divergence to quantify the difference between the marginals. Can take two values: ‘kl’ (Kullback-Leibler) or ‘l2’ (half-squared)
G0 (array-like (dim_a, dim_b)) – Initialization of the transport matrix
numItermax (int, optional) – Max number of iterations
stopThr (float, optional) – Stop threshold on error (> 0)
verbose (bool, optional) – Print information along iterations
log (bool, optional) – record log if True

Returns:

gamma ((dim_a, dim_b) array-like) – Optimal transportation matrix for the given parameters
log (dict) – log dictionary returned only if log is True

Examples

>>> import ot
>>> import numpy as np
>>> a=[.5, .5]
>>> b=[.5, .5]
>>> M=[[1., 36.],[9., 4.]]
>>> np.round(ot.unbalanced.mm_unbalanced(a, b, M, 5, div='kl'), 2)
array([[0.45, 0.  ],
       [0.  , 0.34]])
>>> np.round(ot.unbalanced.mm_unbalanced(a, b, M, 5, div='l2'), 2)
array([[0.4, 0. ],
       [0. , 0.1]])

References

See also

ot.lp.emd: Unregularized OT
ot.unbalanced.sinkhorn_unbalanced: Entropic regularized OT

ot.unbalanced.mm_unbalanced2(a, b, M, reg_m, c=None, reg=0, div='kl', G0=None, returnCost='linear', numItermax=1000, stopThr=1e-15, verbose=False, log=False)[source]

Solve the unbalanced optimal transport problem and return the OT cost. The function solves the following optimization problem:

\[ \begin{align}\begin{aligned}\min_\gamma \quad \langle \gamma, \mathbf{M} \rangle_F + \mathrm{reg_{m1}} \cdot \mathrm{div}(\gamma \mathbf{1}, \mathbf{a}) + \mathrm{reg_{m2}} \cdot \mathrm{div}(\gamma^T \mathbf{1}, \mathbf{b}) + \mathrm{reg} \cdot \mathrm{div}(\gamma, \mathbf{c})\\s.t. \gamma \geq 0\end{aligned}\end{align} \]

where:

\(\mathbf{M}\) is the (dim_a, dim_b) metric cost matrix
\(\mathbf{a}\) and \(\mathbf{b}\) are source and target unbalanced distributions
\(\mathbf{c}\) is a reference distribution for the regularization
\(\mathrm{div}\) is a divergence, either Kullback-Leibler or half-squared \(\ell_2\) divergence

The algorithm used for solving the problem is a maximization- minimization algorithm as proposed in [41]

Parameters:

a (array-like (dim_a,)) – Unnormalized histogram of dimension dim_a If a is an empty list or array ([]), then a is set to uniform distribution.
b (array-like (dim_b,)) – Unnormalized histogram of dimension dim_b If b is an empty list or array ([]), then b is set to uniform distribution.
M (array-like (dim_a, dim_b)) – loss matrix
reg_m (float or indexable object of length 1 or 2) – Marginal relaxation term: nonnegative but cannot be infinity. If \(\mathrm{reg_{m}}\) is a scalar or an indexable object of length 1, then the same \(\mathrm{reg_{m}}\) is applied to both marginal relaxations. If \(\mathrm{reg_{m}}\) is an array, it must have the same backend as input arrays (a, b, M).
reg (float, optional (default = 0)) – Entropy regularization term >= 0. By default, solve the unregularized problem
c (array-like (dim_a, dim_b), optional (default = None)) – Reference measure for the regularization. If None, then use \(\mathbf{c} = mathbf{a} mathbf{b}^T\).
div (string, optional) – Divergence to quantify the difference between the marginals. Can take two values: ‘kl’ (Kullback-Leibler) or ‘l2’ (half-squared)
G0 (array-like (dim_a, dim_b)) – Initialization of the transport matrix
returnCost (string, optional (default = "linear")) – If returnCost = “linear”, then return the linear part of the unbalanced OT loss. If returnCost = “total”, then return the total unbalanced OT loss.
numItermax (int, optional) – Max number of iterations
stopThr (float, optional) – Stop threshold on error (> 0)
verbose (bool, optional) – Print information along iterations
log (bool, optional) – record log if True

Returns:

ot_cost (array-like) – the OT cost between \(\mathbf{a}\) and \(\mathbf{b}\)
log (dict) – log dictionary returned only if log is True

Examples

>>> import ot
>>> import numpy as np
>>> a=[.5, .5]
>>> b=[.5, .5]
>>> M=[[1., 36.],[9., 4.]]
>>> np.round(ot.unbalanced.mm_unbalanced2(a, b, M, 5, div='l2'), 2)
0.8
>>> np.round(ot.unbalanced.mm_unbalanced2(a, b, M, 5, div='kl'), 2)
1.79

References

See also

ot.lp.emd2: Unregularized OT loss
ot.unbalanced.sinkhorn_unbalanced2: Entropic regularized OT loss

ot.unbalanced.sinkhorn_knopp_unbalanced(a, b, M, reg, reg_m, reg_type='kl', c=None, warmstart=None, numItermax=1000, stopThr=1e-06, verbose=False, log=False, **kwargs)[source]

Solve the entropic regularization unbalanced optimal transport problem and return the OT plan

The function solves the following optimization problem:

\[ \begin{align}\begin{aligned}W = \arg \min_\gamma \quad \langle \gamma, \mathbf{M} \rangle_F + \mathrm{reg} \cdot \mathrm{KL}(\gamma, \mathbf{c}) + \mathrm{reg_{m1}} \cdot \mathrm{KL}(\gamma \mathbf{1}, \mathbf{a}) + \mathrm{reg_{m2}} \cdot \mathrm{KL}(\gamma^T \mathbf{1}, \mathbf{b})\\s.t. \gamma \geq 0\end{aligned}\end{align} \]

where :

\(\mathbf{M}\) is the (dim_a, dim_b) metric cost matrix
\(\mathbf{a}\) and \(\mathbf{b}\) are source and target unbalanced distributions
\(\mathbf{c}\) is a reference distribution for the regularization
KL is the Kullback-Leibler divergence

The algorithm used for solving the problem is the generalized Sinkhorn-Knopp matrix scaling algorithm as proposed in [10, 25]

Parameters:

a (array-like (dim_a,)) – Unnormalized histogram of dimension dim_a If a is an empty list or array ([]), then a is set to uniform distribution.
b (array-like (dim_b,)) – One or multiple unnormalized histograms of dimension dim_b. If b is an empty list or array ([]), then b is set to uniform distribution. If many, compute all the OT costs \((\mathbf{a}, \mathbf{b}_i)_i\)
M (array-like (dim_a, dim_b)) – loss matrix
reg (float) – Entropy regularization term > 0
reg_m (float or indexable object of length 1 or 2) – Marginal relaxation term. If \(\mathrm{reg_{m}}\) is a scalar or an indexable object of length 1, then the same \(\mathrm{reg_{m}}\) is applied to both marginal relaxations. The entropic balanced OT can be recovered using \(\mathrm{reg_{m}}=float("inf")\). For semi-relaxed case, use either \(\mathrm{reg_{m}}=(float("inf"), scalar)\) or \(\mathrm{reg_{m}}=(scalar, float("inf"))\). If \(\mathrm{reg_{m}}\) is an array, it must have the same backend as input arrays (a, b, M).
reg_type (string, optional) – Regularizer term. Can take two values: + Negative entropy: ‘entropy’: \(\Omega(\gamma) = \sum_{i,j} \gamma_{i,j} \log(\gamma_{i,j}) - \sum_{i,j} \gamma_{i,j}\). This is equivalent (up to a constant) to \(\Omega(\gamma) = \text{KL}(\gamma, 1_{dim_a} 1_{dim_b}^T)\). + Kullback-Leibler divergence: ‘kl’: \(\Omega(\gamma) = \text{KL}(\gamma, \mathbf{a} \mathbf{b}^T)\).
c (array-like (dim_a, dim_b), optional (default=None)) – Reference measure for the regularization. If None, then use \(\mathbf{c} = \mathbf{a} \mathbf{b}^T\). If \(\texttt{reg_type}='entropy'\), then \(\mathbf{c} = 1_{dim_a} 1_{dim_b}^T\).
warmstart (tuple of arrays, shape (dim_a, dim_b), optional) – Initialization of dual potentials. If provided, the dual potentials should be given (that is the logarithm of the u, v sinkhorn scaling vectors).
numItermax (int, optional) – Max number of iterations
stopThr (float, optional) – Stop threshold on error (> 0)
verbose (bool, optional) – Print information along iterations
log (bool, optional) – record log if True

Returns:

if n_hists == 1 –
- gamma(dim_a, dim_b) array-like
  Optimal transportation matrix for the given parameters
- logdict
  log dictionary returned only if log is True
else –
- ot_cost(n_hists,) array-like
  the OT cost between \(\mathbf{a}\) and each of the histograms \(\mathbf{b}_i\)
- logdict
  log dictionary returned only if log is True

Examples

>>> import ot
>>> import numpy as np
>>> a=[.5, .5]
>>> b=[.5, .5]
>>> M=[[0., 1.],[1., 0.]]
>>> np.round(ot.unbalanced.sinkhorn_knopp_unbalanced(a, b, M, 1., 1.), 7)
array([[0.3220536, 0.1184769],
       [0.1184769, 0.3220536]])

References

See also

ot.lp.emd: Unregularized OT
ot.optim.cg: General regularized OT

ot.unbalanced.sinkhorn_stabilized_unbalanced(a, b, M, reg, reg_m, reg_type='kl', c=None, warmstart=None, tau=100000.0, numItermax=1000, stopThr=1e-06, verbose=False, log=False, **kwargs)[source]

Solve the entropic regularization unbalanced optimal transport problem and return the loss

The function solves the following optimization problem using log-domain stabilization as proposed in [10]:

\[ \begin{align}\begin{aligned}W = \arg \min_\gamma \quad \langle \gamma, \mathbf{M} \rangle_F + \mathrm{reg} \cdot \mathrm{KL}(\gamma, \mathbf{c}) + \mathrm{reg_{m1}} \cdot \mathrm{KL}(\gamma \mathbf{1}, \mathbf{a}) + \mathrm{reg_{m2}} \cdot \mathrm{KL}(\gamma^T \mathbf{1}, \mathbf{b})\\s.t. \gamma \geq 0\end{aligned}\end{align} \]

where :

\(\mathbf{M}\) is the (dim_a, dim_b) metric cost matrix
\(\mathbf{a}\) and \(\mathbf{b}\) are source and target unbalanced distributions
\(\mathbf{c}\) is a reference distribution for the regularization
KL is the Kullback-Leibler divergence

The algorithm used for solving the problem is the generalized Sinkhorn-Knopp matrix scaling algorithm as proposed in [10, 25]

Parameters:

a (array-like (dim_a,)) – Unnormalized histogram of dimension dim_a If a is an empty list or array ([]), then a is set to uniform distribution.
b (array-like (dim_b,)) – One or multiple unnormalized histograms of dimension dim_b. If b is an empty list or array ([]), then b is set to uniform distribution. If many, compute all the OT costs \((\mathbf{a}, \mathbf{b}_i)_i\)
M (array-like (dim_a, dim_b)) – loss matrix
reg (float) – Entropy regularization term > 0
reg_m (float or indexable object of length 1 or 2) – Marginal relaxation term. If \(\mathrm{reg_{m}}\) is a scalar or an indexable object of length 1, then the same \(\mathrm{reg_{m}}\) is applied to both marginal relaxations. The entropic balanced OT can be recovered using \(\mathrm{reg_{m}}=float("inf")\). For semi-relaxed case, use either \(\mathrm{reg_{m}}=(float("inf"), scalar)\) or \(\mathrm{reg_{m}}=(scalar, float("inf"))\). If \(\mathrm{reg_{m}}\) is an array, it must have the same backend as input arrays (a, b, M).
method (str) – method used for the solver either ‘sinkhorn’, ‘sinkhorn_stabilized’ or ‘sinkhorn_reg_scaling’, see those function for specific parameters
reg_type (string, optional) – Regularizer term. Can take two values: + Negative entropy: ‘entropy’: \(\Omega(\gamma) = \sum_{i,j} \gamma_{i,j} \log(\gamma_{i,j}) - \sum_{i,j} \gamma_{i,j}\). This is equivalent (up to a constant) to \(\Omega(\gamma) = \text{KL}(\gamma, 1_{dim_a} 1_{dim_b}^T)\). + Kullback-Leibler divergence: ‘kl’: \(\Omega(\gamma) = \text{KL}(\gamma, \mathbf{a} \mathbf{b}^T)\).
c (array-like (dim_a, dim_b), optional (default=None)) – Reference measure for the regularization. If None, then use \(\mathbf{c} = \mathbf{a} \mathbf{b}^T\). If \(\texttt{reg_type}='entropy'\), then \(\mathbf{c} = 1_{dim_a} 1_{dim_b}^T\).
warmstart (tuple of arrays, shape (dim_a, dim_b), optional) – Initialization of dual potentials. If provided, the dual potentials should be given (that is the logarithm of the u, v sinkhorn scaling vectors).
tau (float) – threshold for max value in u or v for log scaling
numItermax (int, optional) – Max number of iterations
stopThr (float, optional) – Stop threshold on error (>0)
verbose (bool, optional) – Print information along iterations
log (bool, optional) – record log if True

Returns:

if n_hists == 1 –
- gamma(dim_a, dim_b) array-like
  Optimal transportation matrix for the given parameters
- logdict
  log dictionary returned only if log is True
else –
- ot_cost(n_hists,) array-like
  the OT cost between \(\mathbf{a}\) and each of the histograms \(\mathbf{b}_i\)
- logdict
  log dictionary returned only if log is True

Examples

>>> import ot
>>> import numpy as np
>>> a=[.5, .5]
>>> b=[.5, .5]
>>> M=[[0., 1.],[1., 0.]]
>>> np.round(ot.unbalanced.sinkhorn_stabilized_unbalanced(a, b, M, 1., 1.), 7)
array([[0.3220536, 0.1184769],
       [0.1184769, 0.3220536]])

References

See also

ot.lp.emd: Unregularized OT
ot.optim.cg: General regularized OT

ot.unbalanced.sinkhorn_unbalanced(a, b, M, reg, reg_m, method='sinkhorn', reg_type='kl', c=None, warmstart=None, numItermax=1000, stopThr=1e-06, verbose=False, log=False, **kwargs)[source]

Solve the unbalanced entropic regularization optimal transport problem and return the OT plan

The function solves the following optimization problem:

\[ \begin{align}\begin{aligned}W = \arg \min_\gamma \ \langle \gamma, \mathbf{M} \rangle_F + \mathrm{reg} \cdot \mathrm{KL}(\gamma, \mathbf{c}) + \mathrm{reg_{m1}} \cdot \mathrm{KL}(\gamma \mathbf{1}, \mathbf{a}) + \mathrm{reg_{m2}} \cdot \mathrm{KL}(\gamma^T \mathbf{1}, \mathbf{b})\\s.t. \gamma \geq 0\end{aligned}\end{align} \]

where :

\(\mathbf{M}\) is the (dim_a, dim_b) metric cost matrix
\(\mathbf{a}\) and \(\mathbf{b}\) are source and target unbalanced distributions
\(\mathbf{c}\) is a reference distribution for the regularization
KL is the Kullback-Leibler divergence

The algorithm used for solving the problem is the generalized Sinkhorn-Knopp matrix scaling algorithm as proposed in [10, 25]

Parameters:

a (array-like (dim_a,)) – Unnormalized histogram of dimension dim_a If a is an empty list or array ([]), then a is set to uniform distribution.
b (array-like (dim_b,)) – One or multiple unnormalized histograms of dimension dim_b. If b is an empty list or array ([]), then b is set to uniform distribution. If many, compute all the OT costs \((\mathbf{a}, \mathbf{b}_i)_i\)
M (array-like (dim_a, dim_b)) – loss matrix
reg (float) – Entropy regularization term > 0
reg_m (float or indexable object of length 1 or 2) – Marginal relaxation term. If \(\mathrm{reg_{m}}\) is a scalar or an indexable object of length 1, then the same \(\mathrm{reg_{m}}\) is applied to both marginal relaxations. The entropic balanced OT can be recovered using \(\mathrm{reg_{m}}=float("inf")\). For semi-relaxed case, use either \(\mathrm{reg_{m}}=(float("inf"), scalar)\) or \(\mathrm{reg_{m}}=(scalar, float("inf"))\). If \(\mathrm{reg_{m}}\) is an array, it must have the same backend as input arrays (a, b, M).
method (str) – method used for the solver either ‘sinkhorn’, ‘sinkhorn_stabilized’, ‘sinkhorn_translation_invariant’ or ‘sinkhorn_reg_scaling’, see those function for specific parameters
reg_type (string, optional) – Regularizer term. Can take two values: + Negative entropy: ‘entropy’: \(\Omega(\gamma) = \sum_{i,j} \gamma_{i,j} \log(\gamma_{i,j}) - \sum_{i,j} \gamma_{i,j}\). This is equivalent (up to a constant) to \(\Omega(\gamma) = \text{KL}(\gamma, 1_{dim_a} 1_{dim_b}^T)\). + Kullback-Leibler divergence: ‘kl’: \(\Omega(\gamma) = \text{KL}(\gamma, \mathbf{a} \mathbf{b}^T)\).
c (array-like (dim_a, dim_b), optional (default=None)) – Reference measure for the regularization. If None, then use \(\mathbf{c} = \mathbf{a} \mathbf{b}^T\). If \(\texttt{reg_type}='entropy'\), then \(\mathbf{c} = 1_{dim_a} 1_{dim_b}^T\).
warmstart (tuple of arrays, shape (dim_a, dim_b), optional) – Initialization of dual potentials. If provided, the dual potentials should be given (that is the logarithm of the u, v sinkhorn scaling vectors).
numItermax (int, optional) – Max number of iterations
stopThr (float, optional) – Stop threshold on error (>0)
verbose (bool, optional) – Print information along iterations
log (bool, optional) – record log if True

Returns:

if n_hists == 1 –
- gamma(dim_a, dim_b) array-like
  Optimal transportation matrix for the given parameters
- logdict
  log dictionary returned only if log is True
else –
- ot_distance(n_hists,) array-like
  the OT distance between \(\mathbf{a}\) and each of the histograms \(\mathbf{b}_i\)
- logdict
  log dictionary returned only if log is True

Examples

>>> import ot
>>> import numpy as np
>>> a=[.5, .5]
>>> b=[.5, .5]
>>> M=[[0., 1.], [1., 0.]]
>>> np.round(ot.sinkhorn_unbalanced(a, b, M, 1, 1), 7)
array([[0.3220536, 0.1184769],
       [0.1184769, 0.3220536]])

References

See also

ot.unbalanced.sinkhorn_knopp_unbalanced: Unbalanced Classic Sinkhorn [10]
ot.unbalanced.sinkhorn_stabilized_unbalanced: Unbalanced Stabilized sinkhorn [9, 10]
ot.unbalanced.sinkhorn_reg_scaling_unbalanced: Unbalanced Sinkhorn with epsilon scaling [9, 10]
ot.unbalanced.sinkhorn_unbalanced_translation_invariant: Translation Invariant Unbalanced Sinkhorn [73]

ot.unbalanced.sinkhorn_unbalanced2(a, b, M, reg, reg_m, method='sinkhorn', reg_type='kl', c=None, warmstart=None, returnCost='linear', numItermax=1000, stopThr=1e-06, verbose=False, log=False, **kwargs)[source]

Solve the entropic regularization unbalanced optimal transport problem and return the cost

The function solves the following optimization problem:

\[ \begin{align}\begin{aligned}\min_\gamma \quad \langle \gamma, \mathbf{M} \rangle_F + \mathrm{reg} \cdot \mathrm{KL}(\gamma, \mathbf{c}) + \mathrm{reg_{m1}} \cdot \mathrm{KL}(\gamma \mathbf{1}, \mathbf{a}) + \mathrm{reg_{m2}} \cdot \mathrm{KL}(\gamma^T \mathbf{1}, \mathbf{b})\\s.t. \gamma\geq 0\end{aligned}\end{align} \]

where :

\(\mathbf{M}\) is the (dim_a, dim_b) metric cost matrix
\(\mathbf{a}\) and \(\mathbf{b}\) are source and target unbalanced distributions
\(\mathbf{c}\) is a reference distribution for the regularization
KL is the Kullback-Leibler divergence

The algorithm used for solving the problem is the generalized Sinkhorn-Knopp matrix scaling algorithm as proposed in [10, 25]

Parameters:

a (array-like (dim_a,)) – Unnormalized histogram of dimension dim_a If a is an empty list or array ([]), then a is set to uniform distribution.
b (array-like (dim_b,)) – One or multiple unnormalized histograms of dimension dim_b. If b is an empty list or array ([]), then b is set to uniform distribution. If many, compute all the OT costs \((\mathbf{a}, \mathbf{b}_i)_i\)
M (array-like (dim_a, dim_b)) – loss matrix
reg (float) – Entropy regularization term > 0
reg_m (float or indexable object of length 1 or 2) – Marginal relaxation term. If \(\mathrm{reg_{m}}\) is a scalar or an indexable object of length 1, then the same \(\mathrm{reg_{m}}\) is applied to both marginal relaxations. The entropic balanced OT can be recovered using \(\mathrm{reg_{m}}=float("inf")\). For semi-relaxed case, use either \(\mathrm{reg_{m}}=(float("inf"), scalar)\) or \(\mathrm{reg_{m}}=(scalar, float("inf"))\). If \(\mathrm{reg_{m}}\) is an array, it must have the same backend as input arrays (a, b, M).
method (str) – method used for the solver either ‘sinkhorn’, ‘sinkhorn_stabilized’, ‘sinkhorn_translation_invariant’ or ‘sinkhorn_reg_scaling’, see those function for specific parameters
reg_type (string, optional) – Regularizer term. Can take two values: + Negative entropy: ‘entropy’: \(\Omega(\gamma) = \sum_{i,j} \gamma_{i,j} \log(\gamma_{i,j}) - \sum_{i,j} \gamma_{i,j}\). This is equivalent (up to a constant) to \(\Omega(\gamma) = \text{KL}(\gamma, 1_{dim_a} 1_{dim_b}^T)\). + Kullback-Leibler divergence: ‘kl’: \(\Omega(\gamma) = \text{KL}(\gamma, \mathbf{a} \mathbf{b}^T)\).
c (array-like (dim_a, dim_b), optional (default=None)) – Reference measure for the regularization. If None, then use \(\mathbf{c} = \mathbf{a} \mathbf{b}^T\). If \(\texttt{reg_type}='entropy'\), then \(\mathbf{c} = 1_{dim_a} 1_{dim_b}^T\).
warmstart (tuple of arrays, shape (dim_a, dim_b), optional) – Initialization of dual potentials. If provided, the dual potentials should be given (that is the logarithm of the u,v sinkhorn scaling vectors).
returnCost (string, optional (default = "linear")) – If returnCost = “linear”, then return the linear part of the unbalanced OT loss. If returnCost = “total”, then return the total unbalanced OT loss.
numItermax (int, optional) – Max number of iterations
stopThr (float, optional) – Stop threshold on error (>0)
verbose (bool, optional) – Print information along iterations
log (bool, optional) – record log if True

Returns:

ot_cost ((n_hists,) array-like) – the OT cost between \(\mathbf{a}\) and each of the histograms \(\mathbf{b}_i\)
log (dict) – log dictionary returned only if log is True

Examples

>>> import ot
>>> import numpy as np
>>> a=[.5, .10]
>>> b=[.5, .5]
>>> M=[[0., 1.],[1., 0.]]
>>> np.round(ot.unbalanced.sinkhorn_unbalanced2(a, b, M, 1., 1.), 8)
0.19600125

References

See also

ot.unbalanced.sinkhorn_knopp: Unbalanced Classic Sinkhorn [10]
ot.unbalanced.sinkhorn_stabilized: Unbalanced Stabilized sinkhorn [9, 10]
ot.unbalanced.sinkhorn_reg_scaling: Unbalanced Sinkhorn with epsilon scaling [9, 10]
ot.unbalanced.sinkhorn_unbalanced_translation_invariant: Translation Invariant Unbalanced Sinkhorn [73]

ot.unbalanced.sinkhorn_unbalanced_translation_invariant(a, b, M, reg, reg_m, reg_type='kl', c=None, warmstart=None, numItermax=1000, stopThr=1e-06, verbose=False, log=False, **kwargs)[source]

Solve the entropic regularization unbalanced optimal transport problem and return the OT plan

The function solves the following optimization problem:

\[ \begin{align}\begin{aligned}W = \arg \min_\gamma \ \langle \gamma, \mathbf{M} \rangle_F + \mathrm{reg} \cdot \mathrm{KL}(\gamma, \mathbf{c}) + \mathrm{reg_{m1}} \cdot \mathrm{KL}(\gamma \mathbf{1}, \mathbf{a}) + \mathrm{reg_{m2}} \cdot \mathrm{KL}(\gamma^T \mathbf{1}, \mathbf{b})\\s.t. \gamma \geq 0\end{aligned}\end{align} \]

where :

\(\mathbf{M}\) is the (dim_a, dim_b) metric cost matrix
\(\Omega\) is the entropic regularization term,KL divergence
\(\mathbf{a}\) and \(\mathbf{b}\) are source and target unbalanced distributions
KL is the Kullback-Leibler divergence

The algorithm used for solving the problem is the translation invariant Sinkhorn algorithm as proposed in [73]

Parameters:

a (array-like (dim_a,)) – Unnormalized histogram of dimension dim_a
b (array-like (dim_b,) or array-like (dim_b, n_hists)) – One or multiple unnormalized histograms of dimension dim_b If many, compute all the OT distances (a, b_i)
M (array-like (dim_a, dim_b)) – loss matrix
reg (float) – Entropy regularization term > 0
reg_m (float or indexable object of length 1 or 2) – Marginal relaxation term. If reg_m is a scalar or an indexable object of length 1, then the same reg_m is applied to both marginal relaxations. The entropic balanced OT can be recovered using reg_m=float(“inf”). For semi-relaxed case, use either reg_m=(float(“inf”), scalar) or reg_m=(scalar, float(“inf”)). If reg_m is an array, it must have the same backend as input arrays (a, b, M).
reg_type (string, optional) – Regularizer term. Can take two values: ‘entropy’ (negative entropy) \(\Omega(\gamma) = \sum_{i,j} \gamma_{i,j} \log(\gamma_{i,j}) - \sum_{i,j} \gamma_{i,j}\), or ‘kl’ (Kullback-Leibler) \(\Omega(\gamma) = \text{KL}(\gamma, \mathbf{a} \mathbf{b}^T)\).
c (array-like (dim_a, dim_b), optional (default=None)) – Reference measure for the regularization. If None, then use \(\mathbf{c} = \mathbf{a} \mathbf{b}^T\). If \(\texttt{reg_type}='entropy'\), then \(\mathbf{c} = 1_{dim_a} 1_{dim_b}^T\).
warmstart (tuple of arrays, shape (dim_a, dim_b), optional) – Initialization of dual potentials. If provided, the dual potentials should be given (that is the logarithm of the u,v sinkhorn scaling vectors).
numItermax (int, optional) – Max number of iterations
stopThr (float, optional) – Stop threshold on error (> 0)
verbose (bool, optional) – Print information along iterations
log (bool, optional) – record log if True

Returns:

if n_hists == 1 –
- gamma(dim_a, dim_b) array-like
  Optimal transportation matrix for the given parameters
- logdict
  log dictionary returned only if log is True
else –
- ot_distance(n_hists,) array-like
  the OT distance between \(\mathbf{a}\) and each of the histograms \(\mathbf{b}_i\)
- logdict
  log dictionary returned only if log is True

Examples

>>> import ot
>>> a=[.5, .5]
>>> b=[.5, .5]
>>> M=[[0., 1.],[1., 0.]]
>>> ot.unbalanced.sinkhorn_unbalanced_translation_invariant(a, b, M, 1., 1.)
array([[0.32205357, 0.11847689],
       [0.11847689, 0.32205357]])

References