# ot.bregman

Bregman projection solvers for entropic regularized OT

## Functions

ot.bregman.barycenter(A, M, reg, weights=None, method='sinkhorn', numItermax=10000, stopThr=0.0001, verbose=False, log=False, warn=True, **kwargs)[source]

Compute the entropic regularized Wasserstein barycenter of distributions $$\mathbf{A}$$

The function solves the following optimization problem:

$\mathbf{a} = \mathop{\arg \min}_\mathbf{a} \quad \sum_i W_{reg}(\mathbf{a},\mathbf{a}_i)$

where :

• $$W_{reg}(\cdot,\cdot)$$ is the entropic regularized Wasserstein distance (see ot.bregman.sinkhorn()) if method is sinkhorn or sinkhorn_stabilized or sinkhorn_log.

• $$\mathbf{a}_i$$ are training distributions in the columns of matrix $$\mathbf{A}$$

• reg and $$\mathbf{M}$$ are respectively the regularization term and the cost matrix for OT

The algorithm used for solving the problem is the Sinkhorn-Knopp matrix scaling algorithm as proposed in [3].

Parameters
• A (array-like, shape (dim, n_hists)) – n_hists training distributions $$\mathbf{a}_i$$ of size dim

• M (array-like, shape (dim, dim)) – loss matrix for OT

• reg (float) – Regularization term > 0

• method (str (optional)) – method used for the solver either ‘sinkhorn’ or ‘sinkhorn_stabilized’ or ‘sinkhorn_log’

• weights (array-like, shape (n_hists,)) – Weights of each histogram $$\mathbf{a}_i$$ on the simplex (barycentric coordinates)

• numItermax (int, optional) – Max number of iterations

• stopThr (float, optional) – Stop threshold on error (>0)

• verbose (bool, optional) – Print information along iterations

• log (bool, optional) – record log if True

• warn (bool, optional) – if True, raises a warning if the algorithm doesn’t converge.

Returns

• a ((dim,) array-like) – Wasserstein barycenter

• log (dict) – log dictionary returned only if log==True in parameters
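Example (a minimal usage sketch added for illustration, not from the library docs; it assumes the POT helpers ot.datasets.make_1D_gauss and ot.utils.dist):

```python
# Barycenter of two 1D Gaussian histograms on a shared support grid.
import numpy as np
import ot

n = 50
x = np.arange(n, dtype=np.float64).reshape(-1, 1)
a1 = ot.datasets.make_1D_gauss(n, m=15, s=5)   # first histogram
a2 = ot.datasets.make_1D_gauss(n, m=35, s=5)   # second histogram
A = np.vstack([a1, a2]).T                      # shape (dim, n_hists)

M = ot.utils.dist(x)                           # squared Euclidean cost
M /= M.max()                                   # normalize for numerical stability

bary = ot.bregman.barycenter(A, M, reg=1e-2, weights=np.array([0.5, 0.5]))
```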

References

[3] Benamou, J. D., Carlier, G., Cuturi, M., Nenna, L., & Peyré, G. (2015). Iterative Bregman projections for regularized transportation problems. SIAM Journal on Scientific Computing, 37(2), A1111-A1138.

ot.bregman.barycenter_debiased(A, M, reg, weights=None, method='sinkhorn', numItermax=10000, stopThr=0.0001, verbose=False, log=False, warn=True, **kwargs)[source]

Compute the debiased Sinkhorn barycenter of distributions $$\mathbf{A}$$

The function solves the following optimization problem:

$\mathbf{a} = \mathop{\arg \min}_\mathbf{a} \quad \sum_i S_{reg}(\mathbf{a},\mathbf{a}_i)$

where :

• $$S_{reg}(\cdot,\cdot)$$ is the debiased Sinkhorn divergence (see ot.bregman.empirical_sinkhorn_divergence())

• $$\mathbf{a}_i$$ are training distributions in the columns of matrix $$\mathbf{A}$$

• reg and $$\mathbf{M}$$ are respectively the regularization term and the cost matrix for OT

The algorithm used for solving the problem is the debiased Sinkhorn algorithm as proposed in [37].

Parameters
• A (array-like, shape (dim, n_hists)) – n_hists training distributions $$\mathbf{a}_i$$ of size dim

• M (array-like, shape (dim, dim)) – loss matrix for OT

• reg (float) – Regularization term > 0

• method (str (optional)) – method used for the solver either ‘sinkhorn’ or ‘sinkhorn_log’

• weights (array-like, shape (n_hists,)) – Weights of each histogram $$\mathbf{a}_i$$ on the simplex (barycentric coordinates)

• numItermax (int, optional) – Max number of iterations

• stopThr (float, optional) – Stop threshold on error (>0)

• verbose (bool, optional) – Print information along iterations

• log (bool, optional) – record log if True

• warn (bool, optional) – if True, raises a warning if the algorithm doesn’t converge.

Returns

• a ((dim,) array-like) – Wasserstein barycenter

• log (dict) – log dictionary returned only if log==True in parameters
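Example (a minimal sketch added for illustration; the same toy histograms as for ot.bregman.barycenter() above, here with the log-domain solver):

```python
import numpy as np
import ot

n = 50
A = np.vstack([ot.datasets.make_1D_gauss(n, m=15, s=5),
               ot.datasets.make_1D_gauss(n, m=35, s=5)]).T
M = ot.utils.dist(np.arange(n, dtype=np.float64).reshape(-1, 1))
M /= M.max()

# Debiased barycenter; typically sharper than the entropic barycenter
# for the same regularization strength.
bary_deb = ot.bregman.barycenter_debiased(A, M, reg=1e-2, method='sinkhorn_log')
```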

References

[37] Janati, H., Cuturi, M., Gramfort, A. (2020). Debiased Sinkhorn barycenters. Proceedings of the 37th International Conference on Machine Learning, PMLR 119:4692-4701.


ot.bregman.barycenter_sinkhorn(A, M, reg, weights=None, numItermax=1000, stopThr=0.0001, verbose=False, log=False, warn=True)[source]

Compute the entropic regularized Wasserstein barycenter of distributions $$\mathbf{A}$$

The function solves the following optimization problem:

$\mathbf{a} = \mathop{\arg \min}_\mathbf{a} \quad \sum_i W_{reg}(\mathbf{a},\mathbf{a}_i)$

where :

• $$W_{reg}(\cdot,\cdot)$$ is the entropic regularized Wasserstein distance (see ot.bregman.sinkhorn())

• $$\mathbf{a}_i$$ are training distributions in the columns of matrix $$\mathbf{A}$$

• reg and $$\mathbf{M}$$ are respectively the regularization term and the cost matrix for OT

The algorithm used for solving the problem is the Sinkhorn-Knopp matrix scaling algorithm as proposed in [3].

Parameters
• A (array-like, shape (dim, n_hists)) – n_hists training distributions $$\mathbf{a}_i$$ of size dim

• M (array-like, shape (dim, dim)) – loss matrix for OT

• reg (float) – Regularization term > 0

• weights (array-like, shape (n_hists,)) – Weights of each histogram $$\mathbf{a}_i$$ on the simplex (barycentric coordinates)

• numItermax (int, optional) – Max number of iterations

• stopThr (float, optional) – Stop threshold on error (>0)

• verbose (bool, optional) – Print information along iterations

• log (bool, optional) – record log if True

• warn (bool, optional) – if True, raises a warning if the algorithm doesn’t converge.

Returns

• a ((dim,) array-like) – Wasserstein barycenter

• log (dict) – log dictionary returned only if log==True in parameters

References

[3] Benamou, J. D., Carlier, G., Cuturi, M., Nenna, L., & Peyré, G. (2015). Iterative Bregman projections for regularized transportation problems. SIAM Journal on Scientific Computing, 37(2), A1111-A1138.

ot.bregman.barycenter_stabilized(A, M, reg, tau=10000000000.0, weights=None, numItermax=1000, stopThr=0.0001, verbose=False, log=False, warn=True)[source]

Compute the entropic regularized Wasserstein barycenter of distributions $$\mathbf{A}$$ with stabilization.

The function solves the following optimization problem:

$\mathbf{a} = \mathop{\arg \min}_\mathbf{a} \quad \sum_i W_{reg}(\mathbf{a},\mathbf{a}_i)$

where :

• $$W_{reg}(\cdot,\cdot)$$ is the entropic regularized Wasserstein distance (see ot.bregman.sinkhorn())

• $$\mathbf{a}_i$$ are training distributions in the columns of matrix $$\mathbf{A}$$

• reg and $$\mathbf{M}$$ are respectively the regularization term and the cost matrix for OT

The algorithm used for solving the problem is the Sinkhorn-Knopp matrix scaling algorithm as proposed in [3].

Parameters
• A (array-like, shape (dim, n_hists)) – n_hists training distributions $$\mathbf{a}_i$$ of size dim

• M (array-like, shape (dim, dim)) – loss matrix for OT

• reg (float) – Regularization term > 0

• tau (float) – threshold for max value in $$\mathbf{u}$$ or $$\mathbf{v}$$ for log scaling

• weights (array-like, shape (n_hists,)) – Weights of each histogram $$\mathbf{a}_i$$ on the simplex (barycentric coordinates)

• numItermax (int, optional) – Max number of iterations

• stopThr (float, optional) – Stop threshold on error (>0)

• verbose (bool, optional) – Print information along iterations

• log (bool, optional) – record log if True

• warn (bool, optional) – if True, raises a warning if the algorithm doesn’t converge.

Returns

• a ((dim,) array-like) – Wasserstein barycenter

• log (dict) – log dictionary returned only if log==True in parameters

References

[3] Benamou, J. D., Carlier, G., Cuturi, M., Nenna, L., & Peyré, G. (2015). Iterative Bregman projections for regularized transportation problems. SIAM Journal on Scientific Computing, 37(2), A1111-A1138.

ot.bregman.convolutional_barycenter2d(A, reg, weights=None, method='sinkhorn', numItermax=10000, stopThr=0.0001, verbose=False, log=False, warn=True, **kwargs)[source]

Compute the entropic regularized Wasserstein barycenter of distributions $$\mathbf{A}$$ where $$\mathbf{A}$$ is a collection of 2D images.

The function solves the following optimization problem:

$\mathbf{a} = \mathop{\arg \min}_\mathbf{a} \quad \sum_i W_{reg}(\mathbf{a},\mathbf{a}_i)$

where :

• $$W_{reg}(\cdot,\cdot)$$ is the entropic regularized Wasserstein distance (see ot.bregman.sinkhorn())

• $$\mathbf{a}_i$$ are training distributions (2D images) in the last two dimensions of matrix $$\mathbf{A}$$

• reg is the regularization strength scalar value

The algorithm used for solving the problem is the Sinkhorn-Knopp matrix scaling algorithm as proposed in [21] and [37].

Parameters
• A (array-like, shape (n_hists, width, height)) – n distributions (2D images) of size width x height

• reg (float) – Regularization term >0

• weights (array-like, shape (n_hists,)) – Weights of each image on the simplex (barycentric coordinates)

• method (string, optional) – method used for the solver either ‘sinkhorn’ or ‘sinkhorn_log’

• numItermax (int, optional) – Max number of iterations

• stopThr (float, optional) – Stop threshold on error (> 0)

• stabThr (float, optional) – Stabilization threshold to avoid numerical precision issue

• verbose (bool, optional) – Print information along iterations

• log (bool, optional) – record log if True

• warn (bool, optional) – if True, raises a warning if the algorithm doesn’t converge.

Returns

• a (array-like, shape (width, height)) – 2D Wasserstein barycenter

• log (dict) – log dictionary returned only if log==True in parameters
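Example (a minimal sketch added for illustration with made-up images; the inputs only need to be nonnegative and normalized to sum to 1):

```python
import numpy as np
import ot

rng = np.random.RandomState(0)
A = rng.rand(2, 40, 40)                    # (n_hists, width, height)
A /= A.sum(axis=(1, 2), keepdims=True)     # normalize each image to sum to 1

bary_img = ot.bregman.convolutional_barycenter2d(A, reg=0.004)
```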

References

[21] Solomon, J., De Goes, F., Peyré, G., Cuturi, M., Butscher, A., Nguyen, A. & Guibas, L. (2015). Convolutional Wasserstein distances: Efficient optimal transportation on geometric domains. ACM Transactions on Graphics (TOG), 34(4), 66.

[37] Janati, H., Cuturi, M., Gramfort, A. (2020). Debiased Sinkhorn barycenters. Proceedings of the 37th International Conference on Machine Learning, PMLR 119:4692-4701.


ot.bregman.convolutional_barycenter2d_debiased(A, reg, weights=None, method='sinkhorn', numItermax=10000, stopThr=0.001, verbose=False, log=False, warn=True, **kwargs)[source]

Compute the debiased Sinkhorn barycenter of distributions $$\mathbf{A}$$ where $$\mathbf{A}$$ is a collection of 2D images.

The function solves the following optimization problem:

$\mathbf{a} = \mathop{\arg \min}_\mathbf{a} \quad \sum_i S_{reg}(\mathbf{a},\mathbf{a}_i)$

where :

• $$S_{reg}(\cdot,\cdot)$$ is the debiased entropic regularized Wasserstein distance (see ot.bregman.barycenter_debiased())

• $$\mathbf{a}_i$$ are training distributions (2D images) in the last two dimensions of matrix $$\mathbf{A}$$

• reg is the regularization strength scalar value

The algorithm used for solving the problem is the debiased Sinkhorn scaling algorithm as proposed in [37].

Parameters
• A (array-like, shape (n_hists, width, height)) – n distributions (2D images) of size width x height

• reg (float) – Regularization term >0

• weights (array-like, shape (n_hists,)) – Weights of each image on the simplex (barycentric coordinates)

• method (string, optional) – method used for the solver either ‘sinkhorn’ or ‘sinkhorn_log’

• numItermax (int, optional) – Max number of iterations

• stopThr (float, optional) – Stop threshold on error (> 0)

• stabThr (float, optional) – Stabilization threshold to avoid numerical precision issue

• verbose (bool, optional) – Print information along iterations

• log (bool, optional) – record log if True

• warn (bool, optional) – if True, raises a warning if the algorithm doesn’t converge.

Returns

• a (array-like, shape (width, height)) – 2D Wasserstein barycenter

• log (dict) – log dictionary returned only if log==True in parameters

References

[37] Janati, H., Cuturi, M., Gramfort, A. (2020). Debiased Sinkhorn barycenters. Proceedings of the 37th International Conference on Machine Learning, PMLR 119:4692-4701.


ot.bregman.empirical_sinkhorn(X_s, X_t, reg, a=None, b=None, metric='sqeuclidean', numIterMax=10000, stopThr=1e-09, isLazy=False, batchSize=100, verbose=False, log=False, warn=True, **kwargs)[source]

Solve the entropic regularization optimal transport problem and return the OT matrix from empirical data

The function solves the following optimization problem:

$\gamma = \mathop{\arg \min}_\gamma \quad \langle \gamma, \mathbf{M} \rangle_F + \mathrm{reg} \cdot \Omega(\gamma)$

$\text{s.t.} \quad \gamma \mathbf{1} = \mathbf{a}, \quad \gamma^T \mathbf{1} = \mathbf{b}, \quad \gamma \geq 0$

where :

• $$\mathbf{M}$$ is the (n_samples_a, n_samples_b) metric cost matrix

• $$\Omega$$ is the entropic regularization term $$\Omega(\gamma)=\sum_{i,j} \gamma_{i,j}\log(\gamma_{i,j})$$

• $$\mathbf{a}$$ and $$\mathbf{b}$$ are source and target weights (sum to 1)

Parameters
• X_s (array-like, shape (n_samples_a, dim)) – samples in the source domain

• X_t (array-like, shape (n_samples_b, dim)) – samples in the target domain

• reg (float) – Regularization term >0

• a (array-like, shape (n_samples_a,)) – samples weights in the source domain

• b (array-like, shape (n_samples_b,)) – samples weights in the target domain

• numIterMax (int, optional) – Max number of iterations

• stopThr (float, optional) – Stop threshold on error (>0)

• isLazy (boolean, optional) – If True, then only calculate the cost matrix by block and return the dual potentials only (to save memory). If False, calculate full cost matrix and return outputs of sinkhorn function.

• batchSize (int or tuple of 2 int, optional) – Size of the batches used to compute the sinkhorn update without memory overhead. When a tuple is provided it sets the size of the left/right batches.

• verbose (bool, optional) – Print information along iterations

• log (bool, optional) – record log if True

• warn (bool, optional) – if True, raises a warning if the algorithm doesn’t converge.

Returns

• gamma (array-like, shape (n_samples_a, n_samples_b)) – Regularized optimal transportation matrix for the given parameters

• log (dict) – log dictionary returned only if log==True in parameters

Examples

>>> import numpy as np
>>> from ot.bregman import empirical_sinkhorn
>>> n_samples_a = 2
>>> n_samples_b = 2
>>> reg = 0.1
>>> X_s = np.reshape(np.arange(n_samples_a, dtype=np.float64), (n_samples_a, 1))
>>> X_t = np.reshape(np.arange(0, n_samples_b, dtype=np.float64), (n_samples_b, 1))
>>> empirical_sinkhorn(X_s, X_t, reg=reg, verbose=False)
array([[4.99977301e-01,  2.26989344e-05],
       [2.26989344e-05,  4.99977301e-01]])
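A hedged sketch of the lazy mode described above (assuming, per the isLazy parameter description, that only the pair of dual potentials is returned when log=False):

```python
import numpy as np
import ot

rng = np.random.RandomState(0)
X_s = rng.randn(1000, 2)   # source samples
X_t = rng.randn(1500, 2)   # target samples

# The cost matrix is computed block by block (batchSize), so memory stays
# O(batchSize * n) instead of O(n^2).
f, g = ot.bregman.empirical_sinkhorn(X_s, X_t, reg=1.0,
                                     isLazy=True, batchSize=200)
# f: (1000,) source dual potential, g: (1500,) target dual potential
```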


References

[2] M. Cuturi, Sinkhorn Distances: Lightspeed Computation of Optimal Transport, Advances in Neural Information Processing Systems (NIPS) 26, 2013.

[9] Schmitzer, B. (2016). Stabilized Sparse Scaling Algorithms for Entropy Regularized Transport Problems. arXiv preprint arXiv:1610.06519.

[10] Chizat, L., Peyré, G., Schmitzer, B., & Vialard, F. X. (2016). Scaling algorithms for unbalanced transport problems. arXiv preprint arXiv:1607.05816.


ot.bregman.empirical_sinkhorn2(X_s, X_t, reg, a=None, b=None, metric='sqeuclidean', numIterMax=10000, stopThr=1e-09, isLazy=False, batchSize=100, verbose=False, log=False, warn=True, **kwargs)[source]

Solve the entropic regularization optimal transport problem from empirical data and return the OT loss

The function solves the following optimization problem:

$W = \min_\gamma \quad \langle \gamma, \mathbf{M} \rangle_F + \mathrm{reg} \cdot \Omega(\gamma)$

$\text{s.t.} \quad \gamma \mathbf{1} = \mathbf{a}, \quad \gamma^T \mathbf{1} = \mathbf{b}, \quad \gamma \geq 0$

where :

• $$\mathbf{M}$$ is the (n_samples_a, n_samples_b) metric cost matrix

• $$\Omega$$ is the entropic regularization term $$\Omega(\gamma)=\sum_{i,j} \gamma_{i,j}\log(\gamma_{i,j})$$

• $$\mathbf{a}$$ and $$\mathbf{b}$$ are source and target weights (sum to 1)

Parameters
• X_s (array-like, shape (n_samples_a, dim)) – samples in the source domain

• X_t (array-like, shape (n_samples_b, dim)) – samples in the target domain

• reg (float) – Regularization term >0

• a (array-like, shape (n_samples_a,)) – samples weights in the source domain

• b (array-like, shape (n_samples_b,)) – samples weights in the target domain

• numIterMax (int, optional) – Max number of iterations

• stopThr (float, optional) – Stop threshold on error (>0)

• isLazy (boolean, optional) – If True, then only calculate the cost matrix by block and return the dual potentials only (to save memory). If False, calculate full cost matrix and return outputs of sinkhorn function.

• batchSize (int or tuple of 2 int, optional) – Size of the batches used to compute the sinkhorn update without memory overhead. When a tuple is provided it sets the size of the left/right batches.

• verbose (bool, optional) – Print information along iterations

• log (bool, optional) – record log if True

• warn (bool, optional) – if True, raises a warning if the algorithm doesn’t converge.

Returns

• W ((n_hists) array-like or float) – Optimal transportation loss for the given parameters

• log (dict) – log dictionary returned only if log==True in parameters

Examples

>>> import numpy as np
>>> from ot.bregman import empirical_sinkhorn2
>>> n_samples_a = 2
>>> n_samples_b = 2
>>> reg = 0.1
>>> X_s = np.reshape(np.arange(n_samples_a, dtype=np.float64), (n_samples_a, 1))
>>> X_t = np.reshape(np.arange(0, n_samples_b, dtype=np.float64), (n_samples_b, 1))
>>> b = np.full((n_samples_b, 3), 1/n_samples_b)
>>> empirical_sinkhorn2(X_s, X_t, b=b, reg=reg, verbose=False)
array([4.53978687e-05, 4.53978687e-05, 4.53978687e-05])


References

[2] M. Cuturi, Sinkhorn Distances: Lightspeed Computation of Optimal Transport, Advances in Neural Information Processing Systems (NIPS) 26, 2013.

[9] Schmitzer, B. (2016). Stabilized Sparse Scaling Algorithms for Entropy Regularized Transport Problems. arXiv preprint arXiv:1610.06519.

[10] Chizat, L., Peyré, G., Schmitzer, B., & Vialard, F. X. (2016). Scaling algorithms for unbalanced transport problems. arXiv preprint arXiv:1607.05816.

ot.bregman.empirical_sinkhorn_divergence(X_s, X_t, reg, a=None, b=None, metric='sqeuclidean', numIterMax=10000, stopThr=1e-09, verbose=False, log=False, warn=True, **kwargs)[source]

Compute the Sinkhorn divergence loss from empirical data

The function solves the following optimization problems and returns the Sinkhorn divergence $$S$$:

$W = \min_\gamma \quad \langle \gamma, \mathbf{M} \rangle_F + \mathrm{reg} \cdot \Omega(\gamma)$

$W_a = \min_{\gamma_a} \quad \langle \gamma_a, \mathbf{M_a} \rangle_F + \mathrm{reg} \cdot \Omega(\gamma_a)$

$W_b = \min_{\gamma_b} \quad \langle \gamma_b, \mathbf{M_b} \rangle_F + \mathrm{reg} \cdot \Omega(\gamma_b)$

$S = W - \frac{W_a + W_b}{2}$

subject to:

$\gamma \mathbf{1} = \mathbf{a}, \quad \gamma^T \mathbf{1} = \mathbf{b}, \quad \gamma \geq 0$

$\gamma_a \mathbf{1} = \mathbf{a}, \quad \gamma_a^T \mathbf{1} = \mathbf{a}, \quad \gamma_a \geq 0$

$\gamma_b \mathbf{1} = \mathbf{b}, \quad \gamma_b^T \mathbf{1} = \mathbf{b}, \quad \gamma_b \geq 0$

where :

• $$\mathbf{M}$$ (resp. $$\mathbf{M_a}$$, $$\mathbf{M_b}$$) is the (n_samples_a, n_samples_b) metric cost matrix (resp (n_samples_a, n_samples_a) and (n_samples_b, n_samples_b))

• $$\Omega$$ is the entropic regularization term $$\Omega(\gamma)=\sum_{i,j} \gamma_{i,j}\log(\gamma_{i,j})$$

• $$\mathbf{a}$$ and $$\mathbf{b}$$ are source and target weights (sum to 1)

Parameters
• X_s (array-like, shape (n_samples_a, dim)) – samples in the source domain

• X_t (array-like, shape (n_samples_b, dim)) – samples in the target domain

• reg (float) – Regularization term >0

• a (array-like, shape (n_samples_a,)) – samples weights in the source domain

• b (array-like, shape (n_samples_b,)) – samples weights in the target domain

• numIterMax (int, optional) – Max number of iterations

• stopThr (float, optional) – Stop threshold on error (>0)

• verbose (bool, optional) – Print information along iterations

• log (bool, optional) – record log if True

• warn (bool, optional) – if True, raises a warning if the algorithm doesn’t converge.

Returns

• W ((1,) array-like) – Optimal transportation symmetrized loss for the given parameters

• log (dict) – log dictionary returned only if log==True in parameters

Examples

>>> import numpy as np
>>> from ot.bregman import empirical_sinkhorn_divergence
>>> n_samples_a = 2
>>> n_samples_b = 4
>>> reg = 0.1
>>> X_s = np.reshape(np.arange(n_samples_a, dtype=np.float64), (n_samples_a, 1))
>>> X_t = np.reshape(np.arange(0, n_samples_b, dtype=np.float64), (n_samples_b, 1))
>>> empirical_sinkhorn_divergence(X_s, X_t, reg)
1.499887176049052


References

[23] Aude Genevay, Gabriel Peyré, Marco Cuturi, Learning Generative Models with Sinkhorn Divergences, Proceedings of the Twenty-First International Conference on Artificial Intelligence and Statistics (AISTATS), 2018.

ot.bregman.geometricBar(weights, alldistribT)[source]

Return the weighted geometric mean of distributions.

ot.bregman.geometricMean(alldistribT)[source]

Return the geometric mean of distributions.
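A sketch of what these two helpers compute (log-domain averaging over the columns of alldistribT, shape (dim, n_hists); the library implementation may add numerical safeguards):

```python
import numpy as np

def geometric_bar_sketch(weights, alldistribT):
    # weighted geometric mean across distributions: exp(sum_k w_k log a_k)
    return np.exp(np.log(alldistribT) @ weights)

def geometric_mean_sketch(alldistribT):
    # unweighted geometric mean across distributions
    return np.exp(np.log(alldistribT).mean(axis=1))
```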

ot.bregman.greenkhorn(a, b, M, reg, numItermax=10000, stopThr=1e-09, verbose=False, log=False, warn=True)[source]

Solve the entropic regularization optimal transport problem and return the OT matrix

The algorithm used is based on the paper [22], which proposes a greedy version of the Sinkhorn-Knopp algorithm [2].

The function solves the following optimization problem:

$\gamma = \mathop{\arg \min}_\gamma \quad \langle \gamma, \mathbf{M} \rangle_F + \mathrm{reg} \cdot \Omega(\gamma)$

$\text{s.t.} \quad \gamma \mathbf{1} = \mathbf{a}, \quad \gamma^T \mathbf{1} = \mathbf{b}, \quad \gamma \geq 0$

where :

• $$\mathbf{M}$$ is the (dim_a, dim_b) metric cost matrix

• $$\Omega$$ is the entropic regularization term $$\Omega(\gamma)=\sum_{i,j} \gamma_{i,j}\log(\gamma_{i,j})$$

• $$\mathbf{a}$$ and $$\mathbf{b}$$ are source and target weights (histograms, both sum to 1)

Parameters
• a (array-like, shape (dim_a,)) – samples weights in the source domain

• b (array-like, shape (dim_b,) or array-like, shape (dim_b, n_hists)) – samples weights in the target domain; if $$\mathbf{b}$$ is a matrix, Sinkhorn is computed with multiple targets and a fixed $$\mathbf{M}$$ (the OT loss and dual variables are then returned in log)

• M (array-like, shape (dim_a, dim_b)) – loss matrix

• reg (float) – Regularization term >0

• numItermax (int, optional) – Max number of iterations

• stopThr (float, optional) – Stop threshold on error (>0)

• log (bool, optional) – record log if True

• warn (bool, optional) – if True, raises a warning if the algorithm doesn’t converge.

Returns

• gamma (array-like, shape (dim_a, dim_b)) – Optimal transportation matrix for the given parameters

• log (dict) – log dictionary returned only if log==True in parameters

Examples

>>> import ot
>>> a=[.5, .5]
>>> b=[.5, .5]
>>> M=[[0., 1.], [1., 0.]]
>>> ot.bregman.greenkhorn(a, b, M, 1)
array([[0.36552929, 0.13447071],
       [0.13447071, 0.36552929]])


References

[2] M. Cuturi, Sinkhorn Distances: Lightspeed Computation of Optimal Transport, Advances in Neural Information Processing Systems (NIPS) 26, 2013.

[22] J. Altschuler, J. Weed, P. Rigollet: Near-linear time approximation algorithms for optimal transport via Sinkhorn iteration, Advances in Neural Information Processing Systems (NIPS) 31, 2017.

See also

ot.lp.emd – Unregularized OT

ot.optim.cg – General regularized OT

ot.bregman.jcpot_barycenter(Xs, Ys, Xt, reg, metric='sqeuclidean', numItermax=100, stopThr=1e-06, verbose=False, log=False, warn=True, **kwargs)[source]

Joint OT and proportion estimation for multi-source target shift as proposed in [27].

The function solves the following optimization problem:

$\mathbf{h} = \mathop{\arg \min}_{\mathbf{h}} \quad \sum_{k=1}^{K} \lambda_k W_{reg}((\mathbf{D}_2^{(k)} \mathbf{h})^T, \mathbf{a})$

$\text{s.t.} \quad \forall k, \ \mathbf{D}_1^{(k)} \gamma_k \mathbf{1}_n = \mathbf{h}$

where :

• $$\lambda_k$$ is the weight of k-th source domain

• $$W_{reg}(\cdot,\cdot)$$ is the entropic regularized Wasserstein distance (see ot.bregman.sinkhorn())

• $$\mathbf{D}_2^{(k)}$$ is a matrix of weights related to k-th source domain defined as in [p. 5, 27], its expected shape is $$(n_k, C)$$ where $$n_k$$ is the number of elements in the k-th source domain and C is the number of classes

• $$\mathbf{h}$$ is a vector of estimated proportions in the target domain of size C

• $$\mathbf{a}$$ is a uniform vector of weights in the target domain of size n

• $$\mathbf{D}_1^{(k)}$$ is a matrix of class assignments defined as in [p. 5, 27], its expected shape is $$(n_k, C)$$

The problem consists in solving a Wasserstein barycenter problem to estimate the proportions $$\mathbf{h}$$ in the target domain.

The algorithm used for solving the problem is the Iterative Bregman projections algorithm with two sets of marginal constraints related to the unknown vector $$\mathbf{h}$$ and uniform target distribution.

Parameters
• Xs (list of K array-like, shape (nsk, d)) – features of all source domains’ samples

• Ys (list of K array-like, shape (nsk,)) – labels of all source domains’ samples

• Xt (array-like, shape (nt, d)) – samples in the target domain

• reg (float) – Regularization term > 0

• metric (string, optional (default="sqeuclidean")) – The ground metric for the Wasserstein problem

• numItermax (int, optional) – Max number of iterations

• stopThr (float, optional) – Stop threshold on relative change in the barycenter (>0)

• verbose (bool, optional (default=False)) – Controls the verbosity of the optimization algorithm

• log (bool, optional) – record log if True

• warn (bool, optional) – if True, raises a warning if the algorithm doesn’t converge.

Returns

• h ((C,) array-like) – proportion estimation in the target domain

• log (dict) – log dictionary returned only if log==True in parameters
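Example (a toy sketch added for illustration, with made-up data: two source domains with binary labels and one unlabeled target):

```python
import numpy as np
import ot

rng = np.random.RandomState(0)
Xs = [rng.randn(20, 2), rng.randn(30, 2) + 1]         # K = 2 source domains
Ys = [rng.randint(0, 2, 20), rng.randint(0, 2, 30)]   # labels per source domain
Xt = rng.randn(25, 2)                                  # unlabeled target samples

h = ot.bregman.jcpot_barycenter(Xs, Ys, Xt, reg=1.0)   # class proportions, shape (C,)
```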

References

[27] Ievgen Redko, Nicolas Courty, Rémi Flamary, Devis Tuia, “Optimal transport for multi-source domain adaptation under target shift”, International Conference on Artificial Intelligence and Statistics (AISTATS), 2019.

ot.bregman.projC(gamma, q)[source]

Return the KL projection on the column constraints.

ot.bregman.projR(gamma, p)[source]

Return the KL projection on the row constraints.
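A sketch of the two KL projections (rescaling the rows or columns of $$\gamma$$ to match the target marginal; the library version may add numerical safeguards):

```python
import numpy as np

def projR_sketch(gamma, p):
    # scale each row i of gamma so that the row sums equal p
    return (gamma.T * (p / np.maximum(gamma.sum(axis=1), 1e-10))).T

def projC_sketch(gamma, q):
    # scale each column j of gamma so that the column sums equal q
    return gamma * (q / np.maximum(gamma.sum(axis=0), 1e-10))
```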

ot.bregman.screenkhorn(a, b, M, reg, ns_budget=None, nt_budget=None, uniform=False, restricted=True, maxiter=10000, maxfun=10000, pgtol=1e-09, verbose=False, log=False)[source]

Screening Sinkhorn Algorithm for Regularized Optimal Transport

The function solves an approximate dual of the Sinkhorn divergence [2], which is written as the following optimization problem:

$(\mathbf{u}, \mathbf{v}) = \mathop{\arg \min}_{\mathbf{u}, \mathbf{v}} \quad \mathbf{1}_{ns}^T \mathbf{B}(\mathbf{u}, \mathbf{v}) \mathbf{1}_{nt} - \langle \kappa \mathbf{u}, \mathbf{a} \rangle - \langle \frac{1}{\kappa} \mathbf{v}, \mathbf{b} \rangle$

where:

$\mathbf{B}(\mathbf{u}, \mathbf{v}) = \mathrm{diag}(e^\mathbf{u}) \mathbf{K} \mathrm{diag}(e^\mathbf{v}), \quad \text{with } \mathbf{K} = e^{-\mathbf{M} / \mathrm{reg}}$

$\text{s.t.} \quad e^{u_i} \geq \epsilon / \kappa \ \forall i \in \{1, \ldots, ns\}, \qquad e^{v_j} \geq \epsilon \kappa \ \forall j \in \{1, \ldots, nt\}$

The parameters $$\kappa$$ and $$\epsilon$$ are determined with respect to the point budgets (ns_budget, nt_budget); see Equation (5) in [26].

Parameters
• a (array-like, shape=(ns,)) – samples weights in the source domain

• b (array-like, shape=(nt,)) – samples weights in the target domain

• M (array-like, shape=(ns, nt)) – Cost matrix

• reg (float) – Level of the entropy regularisation

• ns_budget (int, default=None) – Number budget of points to be kept in the source domain. If it is None then 50% of the source sample points will be kept

• nt_budget (int, default=None) – Number budget of points to be kept in the target domain. If it is None then 50% of the target sample points will be kept

• uniform (bool, default=False) – If True, the source and target distributions are assumed to be uniform, i.e., $$a_i = 1 / ns$$ and $$b_j = 1 / nt$$

• restricted (bool, default=True) – If True, use a warm-start initialization for the L-BFGS-B solver computed with a restricted Sinkhorn algorithm (at most 5 iterations)

• maxiter (int, default=10000) – Maximum number of iterations in LBFGS solver

• maxfun (int, default=10000) – Maximum number of function evaluations in LBFGS solver

• pgtol (float, default=1e-09) – Final objective function accuracy in LBFGS solver

• verbose (bool, default=False) – If True, display information about the cardinals of the active sets and the parameters kappa and epsilon

Dependency

To gain more efficiency, ot.bregman.screenkhorn() needs to call the “Bottleneck” package (https://pypi.org/project/Bottleneck/) in the screening pre-processing step.

If Bottleneck isn’t installed, the following error message appears:

“Bottleneck module doesn’t exist. Install it from https://pypi.org/project/Bottleneck/”

Returns

• gamma (array-like, shape=(ns, nt)) – Screened optimal transportation matrix for the given parameters

• log (dict, default=False) – Log dictionary returned only if log==True in parameters
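Example (a toy sketch added for illustration; it requires the optional Bottleneck package noted above):

```python
import numpy as np
import ot

n = 50
a = ot.unif(n)                  # uniform source weights
b = ot.unif(n)                  # uniform target weights
rng = np.random.RandomState(0)
M = ot.utils.dist(rng.randn(n, 2), rng.randn(n, 2))
M /= M.max()

# Keep a budget of 25 points on each side of the 50x50 problem.
G_screen = ot.bregman.screenkhorn(a, b, M, reg=1e-1,
                                  ns_budget=25, nt_budget=25)
```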

References

[2] M. Cuturi, Sinkhorn Distances: Lightspeed Computation of Optimal Transport, Advances in Neural Information Processing Systems (NIPS) 26, 2013.

[26] Alaya M. Z., Bérar M., Gasso G., Rakotomamonjy A. (2019). Screening Sinkhorn Algorithm for Regularized Optimal Transport, Advances in Neural Information Processing Systems (NIPS) 33, 2019.


ot.bregman.sinkhorn(a, b, M, reg, method='sinkhorn', numItermax=1000, stopThr=1e-09, verbose=False, log=False, warn=True, **kwargs)[source]

Solve the entropic regularization optimal transport problem and return the OT matrix

The function solves the following optimization problem:

$\gamma = \mathop{\arg \min}_\gamma \quad \langle \gamma, \mathbf{M} \rangle_F + \mathrm{reg} \cdot \Omega(\gamma)$

$\text{s.t.} \quad \gamma \mathbf{1} = \mathbf{a}, \quad \gamma^T \mathbf{1} = \mathbf{b}, \quad \gamma \geq 0$

where :

• $$\mathbf{M}$$ is the (dim_a, dim_b) metric cost matrix

• $$\Omega$$ is the entropic regularization term $$\Omega(\gamma)=\sum_{i,j} \gamma_{i,j}\log(\gamma_{i,j})$$

• $$\mathbf{a}$$ and $$\mathbf{b}$$ are source and target weights (histograms, both sum to 1)

Note

This function is backend-compatible and will work on arrays from all compatible backends.

The algorithm used for solving the problem is the Sinkhorn-Knopp matrix scaling algorithm as proposed in [2].

Choosing a Sinkhorn solver

By default, and when using a regularization parameter that is not too small, the default sinkhorn solver should be enough. If you need a small regularization to get sharper OT matrices, you should use the ot.bregman.sinkhorn_stabilized() solver, which will avoid numerical errors. This last solver can be very slow in practice and might not even converge to a reasonable OT matrix in finite time. This is why ot.bregman.sinkhorn_epsilon_scaling(), which relies on iterating over the value of the regularization (with warm starts), sometimes leads to better solutions. Note that the greedy version of Sinkhorn, ot.bregman.greenkhorn(), can also lead to a speedup, and the screening version, ot.bregman.screenkhorn(), aims at providing a fast approximation of the Sinkhorn problem. For GPU use and gradient computation with a small number of iterations, we strongly recommend the ot.bregman.sinkhorn_log() solver, which does not need to check for numerical problems.
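A short illustration of this guidance (a sketch on a toy problem; the solver is selected via the method argument):

```python
import numpy as np
import ot

a = b = ot.unif(5)
M = ot.utils.dist(np.arange(5, dtype=np.float64).reshape(-1, 1))
M /= M.max()

G = ot.sinkhorn(a, b, M, reg=1e-1)                                     # default solver
G_stab = ot.sinkhorn(a, b, M, reg=1e-3, method='sinkhorn_stabilized')  # small reg
G_log = ot.sinkhorn(a, b, M, reg=1e-3, method='sinkhorn_log')          # GPU / gradients
```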

Parameters
• a (array-like, shape (dim_a,)) – samples weights in the source domain

• b (array-like, shape (dim_b,) or ndarray, shape (dim_b, n_hists)) – samples weights in the target domain; if $$\mathbf{b}$$ is a matrix, Sinkhorn is computed with multiple targets and a fixed $$\mathbf{M}$$ (the OT loss and dual variables are then returned in log)

• M (array-like, shape (dim_a, dim_b)) – loss matrix

• reg (float) – Regularization term >0

• method (str) – method used for the solver either ‘sinkhorn’,’sinkhorn_log’, ‘greenkhorn’, ‘sinkhorn_stabilized’ or ‘sinkhorn_epsilon_scaling’, see those function for specific parameters

• numItermax (int, optional) – Max number of iterations

• stopThr (float, optional) – Stop threshold on error (>0)

• verbose (bool, optional) – Print information along iterations

• log (bool, optional) – record log if True

• warn (bool, optional) – if True, raises a warning if the algorithm doesn’t converge.

Returns

• gamma (array-like, shape (dim_a, dim_b)) – Optimal transportation matrix for the given parameters

• log (dict) – log dictionary returned only if log==True in parameters

Examples

>>> import ot
>>> a=[.5, .5]
>>> b=[.5, .5]
>>> M=[[0., 1.], [1., 0.]]
>>> ot.sinkhorn(a, b, M, 1)
array([[0.36552929, 0.13447071],
       [0.13447071, 0.36552929]])


References

[2] M. Cuturi, Sinkhorn Distances: Lightspeed Computation of Optimal Transport, Advances in Neural Information Processing Systems (NIPS) 26, 2013.

[9] Schmitzer, B. (2016). Stabilized Sparse Scaling Algorithms for Entropy Regularized Transport Problems. arXiv preprint arXiv:1610.06519.

[10] Chizat, L., Peyré, G., Schmitzer, B., & Vialard, F. X. (2016). Scaling algorithms for unbalanced transport problems. arXiv preprint arXiv:1607.05816.

[34] Feydy, J., Séjourné, T., Vialard, F. X., Amari, S. I., Trouvé, A., & Peyré, G. (2019). Interpolating between optimal transport and MMD using Sinkhorn divergences. In The 22nd International Conference on Artificial Intelligence and Statistics (pp. 2681-2690). PMLR.

ot.bregman.sinkhorn2(a, b, M, reg, method='sinkhorn', numItermax=1000, stopThr=1e-09, verbose=False, log=False, warn=False, **kwargs)[source]

Solve the entropic regularization optimal transport problem and return the loss

The function solves the following optimization problem:

$W = \min_\gamma \quad \langle \gamma, \mathbf{M} \rangle_F + \mathrm{reg} \cdot \Omega(\gamma)$

$\text{s.t.} \quad \gamma \mathbf{1} = \mathbf{a}, \quad \gamma^T \mathbf{1} = \mathbf{b}, \quad \gamma \geq 0$

where :

• $$\mathbf{M}$$ is the (dim_a, dim_b) metric cost matrix

• $$\Omega$$ is the entropic regularization term $$\Omega(\gamma)=\sum_{i,j} \gamma_{i,j}\log(\gamma_{i,j})$$

• $$\mathbf{a}$$ and $$\mathbf{b}$$ are source and target weights (histograms, both sum to 1)

Note

This function is backend-compatible and will work on arrays from all compatible backends.

The algorithm used for solving the problem is the Sinkhorn-Knopp matrix scaling algorithm as proposed in [2].

Choosing a Sinkhorn solver

By default, and when using a regularization parameter that is not too small, the default sinkhorn solver should be enough. If you need a small regularization to get sharper OT matrices, you should use the ot.bregman.sinkhorn_log() solver, which will avoid numerical errors. This last solver can be very slow in practice and might not even converge to a reasonable OT matrix in finite time. This is why ot.bregman.sinkhorn_epsilon_scaling(), which relies on iterating over the value of the regularization (with warm starts), sometimes leads to better solutions. Note that the greedy version of Sinkhorn, ot.bregman.greenkhorn(), can also lead to a speedup, and the screening version, ot.bregman.screenkhorn(), aims at providing a fast approximation of the Sinkhorn problem. For GPU use and gradient computation with a small number of iterations, we strongly recommend the ot.bregman.sinkhorn_log() solver, which does not need to check for numerical problems.

Parameters
• a (array-like, shape (dim_a,)) – samples weights in the source domain

• b (array-like, shape (dim_b,) or ndarray, shape (dim_b, n_hists)) – samples weights in the target domain; if $$\mathbf{b}$$ is a matrix, Sinkhorn is computed with multiple targets and a fixed $$\mathbf{M}$$ (the OT loss and dual variables are then returned in log)

• M (array-like, shape (dim_a, dim_b)) – loss matrix

• reg (float) – Regularization term >0

• method (str) – method used for the solver either ‘sinkhorn’,’sinkhorn_log’, ‘sinkhorn_stabilized’, see those function for specific parameters

• numItermax (int, optional) – Max number of iterations

• stopThr (float, optional) – Stop threshold on error (>0)

• verbose (bool, optional) – Print information along iterations

• log (bool, optional) – record log if True

• warn (bool, optional) – if True, raises a warning if the algorithm doesn’t converge.

Returns

• W ((n_hists) float/array-like) – Optimal transportation loss for the given parameters

• log (dict) – log dictionary returned only if log==True in parameters

Examples

>>> import ot
>>> a=[.5, .5]
>>> b=[.5, .5]
>>> M=[[0., 1.], [1., 0.]]
>>> ot.sinkhorn2(a, b, M, 1)
0.26894142136999516


References

[2] M. Cuturi, Sinkhorn Distances: Lightspeed Computation of Optimal Transport, Advances in Neural Information Processing Systems (NIPS) 26, 2013.

[9] Schmitzer, B. (2016). Stabilized Sparse Scaling Algorithms for Entropy Regularized Transport Problems. arXiv preprint arXiv:1610.06519.

[10] Chizat, L., Peyré, G., Schmitzer, B., & Vialard, F. X. (2016). Scaling algorithms for unbalanced transport problems. arXiv preprint arXiv:1607.05816.

[21] Altschuler J., Weed J., Rigollet P.: Near-linear time approximation algorithms for optimal transport via Sinkhorn iteration, Advances in Neural Information Processing Systems (NIPS) 31, 2017.

[34] Feydy, J., Séjourné, T., Vialard, F. X., Amari, S. I., Trouvé, A., & Peyré, G. (2019). Interpolating between optimal transport and MMD using Sinkhorn divergences. In The 22nd International Conference on Artificial Intelligence and Statistics (pp. 2681-2690). PMLR.

ot.bregman.sinkhorn_epsilon_scaling(a, b, M, reg, numItermax=100, epsilon0=10000.0, numInnerItermax=100, tau=1000.0, stopThr=1e-09, warmstart=None, verbose=False, print_period=10, log=False, warn=True, **kwargs)[source]

Solve the entropic regularization optimal transport problem with log stabilization and epsilon scaling. The function solves the following optimization problem:

$\gamma = \mathop{\arg \min}_\gamma \quad \langle \gamma, \mathbf{M} \rangle_F + \mathrm{reg} \cdot \Omega(\gamma)$

$\text{s.t.} \quad \gamma \mathbf{1} = \mathbf{a}, \quad \gamma^T \mathbf{1} = \mathbf{b}, \quad \gamma \geq 0$

where :

• $$\mathbf{M}$$ is the (dim_a, dim_b) metric cost matrix

• $$\Omega$$ is the entropic regularization term $$\Omega(\gamma)=\sum_{i,j} \gamma_{i,j}\log(\gamma_{i,j})$$

• $$\mathbf{a}$$ and $$\mathbf{b}$$ are source and target weights (histograms, both sum to 1)

The algorithm used for solving the problem is the Sinkhorn-Knopp matrix scaling algorithm as proposed in [2], but with the log stabilization proposed in [10] and the epsilon scaling proposed in [9] (Algorithm 3.2).

Parameters
• a (array-like, shape (dim_a,)) – samples weights in the source domain

• b (array-like, shape (dim_b,)) – samples weights in the target domain

• M (array-like, shape (dim_a, dim_b)) – loss matrix

• reg (float) – Regularization term >0

• tau (float) – threshold for max value in $$\mathbf{u}$$ or $$\mathbf{v}$$ for log scaling

• warmstart (tuple of vectors) – if given then starting values for alpha and beta log scalings

• numItermax (int, optional) – Max number of iterations

• numInnerItermax (int, optional) – Max number of iterations in the inner log-stabilized Sinkhorn

• epsilon0 (float, optional) – first epsilon regularization value (then exponential decrease to reg)

• stopThr (float, optional) – Stop threshold on error (>0)

• verbose (bool, optional) – Print information along iterations

• log (bool, optional) – record log if True

• warn (bool, optional) – if True, raises a warning if the algorithm doesn’t converge.

Returns

• gamma (array-like, shape (dim_a, dim_b)) – Optimal transportation matrix for the given parameters

• log (dict) – log dictionary returned only if log==True in parameters

Examples

>>> import ot
>>> a=[.5, .5]
>>> b=[.5, .5]
>>> M=[[0., 1.], [1., 0.]]
>>> ot.bregman.sinkhorn_epsilon_scaling(a, b, M, 1)
array([[0.36552929, 0.13447071],
       [0.13447071, 0.36552929]])


References

[2] M. Cuturi, Sinkhorn Distances: Lightspeed Computation of Optimal Transport, Advances in Neural Information Processing Systems (NIPS) 26, 2013.

[9] Schmitzer, B. (2016). Stabilized Sparse Scaling Algorithms for Entropy Regularized Transport Problems. arXiv preprint arXiv:1610.06519.

[10] Chizat, L., Peyré, G., Schmitzer, B., & Vialard, F. X. (2016). Scaling algorithms for unbalanced transport problems. arXiv preprint arXiv:1607.05816.

See also

ot.lp.emd – Unregularized OT

ot.optim.cg – General regularized OT

ot.bregman.sinkhorn_knopp(a, b, M, reg, numItermax=1000, stopThr=1e-09, verbose=False, log=False, warn=True, **kwargs)[source]

Solve the entropic regularization optimal transport problem and return the OT matrix

The function solves the following optimization problem:

$\gamma = \mathop{\arg \min}_\gamma \quad \langle \gamma, \mathbf{M} \rangle_F + \mathrm{reg} \cdot \Omega(\gamma)$

$\text{s.t.} \quad \gamma \mathbf{1} = \mathbf{a}, \quad \gamma^T \mathbf{1} = \mathbf{b}, \quad \gamma \geq 0$

where :

• $$\mathbf{M}$$ is the (dim_a, dim_b) metric cost matrix

• $$\Omega$$ is the entropic regularization term $$\Omega(\gamma)=\sum_{i,j} \gamma_{i,j}\log(\gamma_{i,j})$$

• $$\mathbf{a}$$ and $$\mathbf{b}$$ are source and target weights (histograms, both sum to 1)

The algorithm used for solving the problem is the Sinkhorn-Knopp matrix scaling algorithm as proposed in [2].

Parameters
• a (array-like, shape (dim_a,)) – samples weights in the source domain

• b (array-like, shape (dim_b,) or array-like, shape (dim_b, n_hists)) – samples weights in the target domain; if $$\mathbf{b}$$ is a matrix, Sinkhorn is computed with multiple targets and a fixed $$\mathbf{M}$$ (the OT loss and dual variables are then returned in log)

• M (array-like, shape (dim_a, dim_b)) – loss matrix

• reg (float) – Regularization term >0

• numItermax (int, optional) – Max number of iterations

• stopThr (float, optional) – Stop threshold on error (>0)

• verbose (bool, optional) – Print information along iterations

• log (bool, optional) – record log if True

• warn (bool, optional) – if True, raises a warning if the algorithm doesn’t converge.

Returns

• gamma (array-like, shape (dim_a, dim_b)) – Optimal transportation matrix for the given parameters

• log (dict) – log dictionary returned only if log==True in parameters

Examples

>>> import ot
>>> a=[.5, .5]
>>> b=[.5, .5]
>>> M=[[0., 1.], [1., 0.]]
>>> ot.sinkhorn(a, b, M, 1)
array([[0.36552929, 0.13447071],
       [0.13447071, 0.36552929]])


References

[2] M. Cuturi, Sinkhorn Distances: Lightspeed Computation of Optimal Transport, Advances in Neural Information Processing Systems (NIPS) 26, 2013.

See also

ot.lp.emd – Unregularized OT

ot.optim.cg – General regularized OT

ot.bregman.sinkhorn_log(a, b, M, reg, numItermax=1000, stopThr=1e-09, verbose=False, log=False, warn=True, **kwargs)[source]

Solve the entropic regularization optimal transport problem in log space and return the OT matrix

The function solves the following optimization problem:

$\gamma = \mathop{\arg \min}_\gamma \quad \langle \gamma, \mathbf{M} \rangle_F + \mathrm{reg} \cdot \Omega(\gamma)$

$\text{s.t.} \quad \gamma \mathbf{1} = \mathbf{a}, \quad \gamma^T \mathbf{1} = \mathbf{b}, \quad \gamma \geq 0$

where :

• $$\mathbf{M}$$ is the (dim_a, dim_b) metric cost matrix

• $$\Omega$$ is the entropic regularization term $$\Omega(\gamma)=\sum_{i,j} \gamma_{i,j}\log(\gamma_{i,j})$$

• $$\mathbf{a}$$ and $$\mathbf{b}$$ are source and target weights (histograms, both sum to 1)

The algorithm used for solving the problem is the Sinkhorn-Knopp matrix scaling algorithm [2], with the log-domain implementation from [34].

Parameters
• a (array-like, shape (dim_a,)) – samples weights in the source domain

• b (array-like, shape (dim_b,) or array-like, shape (dim_b, n_hists)) – samples weights in the target domain; if $$\mathbf{b}$$ is a matrix, Sinkhorn is computed with multiple targets and a fixed $$\mathbf{M}$$ (the OT loss and dual variables are then returned in log)

• M (array-like, shape (dim_a, dim_b)) – loss matrix

• reg (float) – Regularization term >0

• numItermax (int, optional) – Max number of iterations

• stopThr (float, optional) – Stop threshold on error (>0)

• verbose (bool, optional) – Print information along iterations

• log (bool, optional) – record log if True

• warn (bool, optional) – if True, raises a warning if the algorithm doesn’t converge.

Returns

• gamma (array-like, shape (dim_a, dim_b)) – Optimal transportation matrix for the given parameters

• log (dict) – log dictionary returned only if log==True in parameters

Examples

>>> import ot
>>> a=[.5, .5]
>>> b=[.5, .5]
>>> M=[[0., 1.], [1., 0.]]
>>> ot.bregman.sinkhorn_log(a, b, M, 1)
array([[0.36552929, 0.13447071],
       [0.13447071, 0.36552929]])


References

[2] M. Cuturi, Sinkhorn Distances: Lightspeed Computation of Optimal Transport, Advances in Neural Information Processing Systems (NIPS) 26, 2013.

[34] Feydy, J., Séjourné, T., Vialard, F. X., Amari, S. I., Trouvé, A., & Peyré, G. (2019). Interpolating between optimal transport and MMD using Sinkhorn divergences. In The 22nd International Conference on Artificial Intelligence and Statistics (pp. 2681-2690). PMLR.

See also

ot.lp.emd – Unregularized OT

ot.optim.cg – General regularized OT

ot.bregman.sinkhorn_stabilized(a, b, M, reg, numItermax=1000, tau=1000.0, stopThr=1e-09, warmstart=None, verbose=False, print_period=20, log=False, warn=True, **kwargs)[source]

Solve the entropic regularization OT problem with log stabilization

The function solves the following optimization problem:

$\gamma = \mathop{\arg \min}_\gamma \quad \langle \gamma, \mathbf{M} \rangle_F + \mathrm{reg} \cdot \Omega(\gamma)$

$\text{s.t.} \quad \gamma \mathbf{1} = \mathbf{a}, \quad \gamma^T \mathbf{1} = \mathbf{b}, \quad \gamma \geq 0$

where :

• $$\mathbf{M}$$ is the (dim_a, dim_b) metric cost matrix

• $$\Omega$$ is the entropic regularization term $$\Omega(\gamma)=\sum_{i,j} \gamma_{i,j}\log(\gamma_{i,j})$$

• $$\mathbf{a}$$ and $$\mathbf{b}$$ are source and target weights (histograms, both sum to 1)

The algorithm used for solving the problem is the Sinkhorn-Knopp matrix scaling algorithm as proposed in [2], but with the log stabilization proposed in [10] and defined in [9] (Algorithm 3.1).

Parameters
• a (array-like, shape (dim_a,)) – samples weights in the source domain

• b (array-like, shape (dim_b,)) – samples weights in the target domain

• M (array-like, shape (dim_a, dim_b)) – loss matrix

• reg (float) – Regularization term >0

• tau (float) – threshold for max value in $$\mathbf{u}$$ or $$\mathbf{v}$$ for log scaling

• warmstart (tuple of vectors) – if given then starting values for alpha and beta log scalings

• numItermax (int, optional) – Max number of iterations

• stopThr (float, optional) – Stop threshold on error (>0)

• verbose (bool, optional) – Print information along iterations

• log (bool, optional) – record log if True

• warn (bool, optional) – if True, raises a warning if the algorithm doesn’t converge.

Returns

• gamma (array-like, shape (dim_a, dim_b)) – Optimal transportation matrix for the given parameters

• log (dict) – log dictionary returned only if log==True in parameters

Examples

>>> import ot
>>> a=[.5,.5]
>>> b=[.5,.5]
>>> M=[[0.,1.],[1.,0.]]
>>> ot.bregman.sinkhorn_stabilized(a, b, M, 1)
array([[0.36552929, 0.13447071],
       [0.13447071, 0.36552929]])


References

[2] M. Cuturi, Sinkhorn Distances: Lightspeed Computation of Optimal Transport, Advances in Neural Information Processing Systems (NIPS) 26, 2013.

[9] Schmitzer, B. (2016). Stabilized Sparse Scaling Algorithms for Entropy Regularized Transport Problems. arXiv preprint arXiv:1610.06519.

[10] Chizat, L., Peyré, G., Schmitzer, B., & Vialard, F. X. (2016). Scaling algorithms for unbalanced transport problems. arXiv preprint arXiv:1607.05816.

See also

ot.lp.emd – Unregularized OT

ot.optim.cg – General regularized OT

ot.bregman.unmix(a, D, M, M0, h0, reg, reg0, alpha, numItermax=1000, stopThr=0.001, verbose=False, log=False, warn=True)[source]

Compute the unmixing of an observation with a given dictionary using Wasserstein distance

The function solves the following optimization problem:

$\mathbf{h} = \mathop{\arg \min}_\mathbf{h} \quad (1 - \alpha) W_{\mathbf{M}, \mathrm{reg}}(\mathbf{a}, \mathbf{Dh}) + \alpha W_{\mathbf{M_0}, \mathrm{reg}_0}(\mathbf{h}_0, \mathbf{h})$

where :

• $$W_{M,reg}(\cdot,\cdot)$$ is the entropic regularized Wasserstein distance with $$\mathbf{M}$$ loss matrix (see ot.bregman.sinkhorn())

• $$\mathbf{D}$$ is a dictionary of n_atoms atoms of dimension dim_a, its expected shape is (dim_a, n_atoms)

• $$\mathbf{h}$$ is the estimated unmixing of dimension n_atoms

• $$\mathbf{a}$$ is an observed distribution of dimension dim_a

• $$\mathbf{h}_0$$ is a prior on $$\mathbf{h}$$ of dimension dim_prior

• reg and $$\mathbf{M}$$ are respectively the regularization term and the cost matrix (dim_a, dim_a) for OT data fitting

• reg$$_0$$ and $$\mathbf{M_0}$$ are respectively the regularization term and the cost matrix (dim_prior, n_atoms) for the OT regularization with respect to $$\mathbf{h}_0$$

• $$\alpha$$ weights the data fitting term against the regularization

The optimization problem is solved following the algorithm described in [4].

Parameters
• a (array-like, shape (dim_a)) – observed distribution (histogram, sums to 1)

• D (array-like, shape (dim_a, n_atoms)) – dictionary matrix

• M (array-like, shape (dim_a, dim_a)) – loss matrix

• M0 (array-like, shape (n_atoms, dim_prior)) – loss matrix

• h0 (array-like, shape (n_atoms,)) – prior on the estimated unmixing h

• reg (float) – Regularization term >0 (Wasserstein data fitting)

• reg0 (float) – Regularization term >0 (Wasserstein reg with h0)

• alpha (float) – How much should we trust the prior ([0,1])

• numItermax (int, optional) – Max number of iterations

• stopThr (float, optional) – Stop threshold on error (>0)

• verbose (bool, optional) – Print information along iterations

• log (bool, optional) – record log if True

• warn (bool, optional) – if True, raises a warning if the algorithm doesn’t converge.

Returns

• h (array-like, shape (n_atoms,)) – estimated unmixing of the observation

• log (dict) – log dictionary returned only if log==True in parameters
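Example (a toy sketch added for illustration, with a made-up two-atom dictionary and observation):

```python
import numpy as np
import ot

n = 20
x = np.arange(n, dtype=np.float64).reshape(-1, 1)
d1 = ot.datasets.make_1D_gauss(n, m=5, s=2)
d2 = ot.datasets.make_1D_gauss(n, m=15, s=2)
D = np.vstack([d1, d2]).T                  # dictionary, shape (dim_a, n_atoms)
a = 0.3 * d1 + 0.7 * d2                    # observation to unmix

M = ot.utils.dist(x)                        # data-fitting cost, shape (dim_a, dim_a)
M /= M.max()
M0 = np.array([[0.0, 1.0], [1.0, 0.0]])     # cost between atoms for the prior term
h0 = np.array([0.5, 0.5])                   # uniform prior on the unmixing

h = ot.bregman.unmix(a, D, M, M0, h0, reg=1e-1, reg0=1e-1, alpha=0.5)
```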

References

[4] S. Nakhostin, N. Courty, R. Flamary, D. Tuia, T. Corpetti, Supervised planetary unmixing with optimal transport, Workshop on Hyperspectral Image and Signal Processing: Evolution in Remote Sensing (WHISPERS), 2016.