ot.bregman
Bregman projection solvers for entropic regularized OT
Functions
- ot.bregman.barycenter(A, M, reg, weights=None, method='sinkhorn', numItermax=10000, stopThr=0.0001, verbose=False, log=False, warn=True, **kwargs)[source]
Compute the entropic regularized Wasserstein barycenter of distributions \(\mathbf{A}\)
The function solves the following optimization problem:
\[\mathbf{a} = \mathop{\arg \min}_\mathbf{a} \quad \sum_i W_{reg}(\mathbf{a},\mathbf{a}_i)\]
where:
\(W_{reg}(\cdot,\cdot)\) is the entropic regularized Wasserstein distance (see ot.bregman.sinkhorn()) if method is ‘sinkhorn’, ‘sinkhorn_stabilized’ or ‘sinkhorn_log’
\(\mathbf{a}_i\) are training distributions in the columns of matrix \(\mathbf{A}\)
reg and \(\mathbf{M}\) are respectively the regularization term and the cost matrix for OT
The algorithm used for solving the problem is the Sinkhorn-Knopp matrix scaling algorithm as proposed in [3]
- Parameters
A (array-like, shape (dim, n_hists)) – n_hists training distributions \(\mathbf{a}_i\) of size dim
M (array-like, shape (dim, dim)) – loss matrix for OT
reg (float) – Regularization term > 0
method (str (optional)) – method used for the solver either ‘sinkhorn’ or ‘sinkhorn_stabilized’ or ‘sinkhorn_log’
weights (array-like, shape (n_hists,)) – Weights of each histogram \(\mathbf{a}_i\) on the simplex (barycentric coordinates)
numItermax (int, optional) – Max number of iterations
stopThr (float, optional) – Stop threshold on error (>0)
verbose (bool, optional) – Print information along iterations
log (bool, optional) – record log if True
warn (bool, optional) – if True, raises a warning if the algorithm doesn’t converge.
- Returns
a ((dim,) array-like) – Wasserstein barycenter
log (dict) – log dictionary returned only if log==True in parameters
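Examples
A minimal sketch; the Gaussian histograms built with ot.datasets.make_1D_gauss and the grid cost from ot.utils.dist0 are illustrative choices, not requirements:
>>> import numpy as np
>>> import ot
>>> n = 50                                        # size of the support grid (dim)
>>> a1 = ot.datasets.make_1D_gauss(n, m=15, s=5)  # first histogram
>>> a2 = ot.datasets.make_1D_gauss(n, m=35, s=5)  # second histogram
>>> A = np.vstack((a1, a2)).T                     # shape (dim, n_hists)
>>> M = ot.utils.dist0(n)                         # squared-distance cost on the grid
>>> M = M / M.max()                               # normalize so reg is on a comparable scale
>>> bary = ot.bregman.barycenter(A, M, reg=1e-2, weights=np.array([0.5, 0.5]))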
References
- 3
Benamou, J. D., Carlier, G., Cuturi, M., Nenna, L., & Peyré, G. (2015). Iterative Bregman projections for regularized transportation problems. SIAM Journal on Scientific Computing, 37(2), A1111-A1138.
- ot.bregman.barycenter_debiased(A, M, reg, weights=None, method='sinkhorn', numItermax=10000, stopThr=0.0001, verbose=False, log=False, warn=True, **kwargs)[source]
Compute the debiased Sinkhorn barycenter of distributions A
The function solves the following optimization problem:
\[\mathbf{a} = \mathop{\arg \min}_\mathbf{a} \quad \sum_i S_{reg}(\mathbf{a},\mathbf{a}_i)\]
where:
\(S_{reg}(\cdot,\cdot)\) is the debiased Sinkhorn divergence (see ot.bregman.empirical_sinkhorn_divergence())
\(\mathbf{a}_i\) are training distributions in the columns of matrix \(\mathbf{A}\)
reg and \(\mathbf{M}\) are respectively the regularization term and the cost matrix for OT
The algorithm used for solving the problem is the debiased Sinkhorn algorithm as proposed in [37]
- Parameters
A (array-like, shape (dim, n_hists)) – n_hists training distributions \(\mathbf{a}_i\) of size dim
M (array-like, shape (dim, dim)) – loss matrix for OT
reg (float) – Regularization term > 0
method (str (optional)) – method used for the solver either ‘sinkhorn’ or ‘sinkhorn_log’
weights (array-like, shape (n_hists,)) – Weights of each histogram \(\mathbf{a}_i\) on the simplex (barycentric coordinates)
numItermax (int, optional) – Max number of iterations
stopThr (float, optional) – Stop threshold on error (>0)
verbose (bool, optional) – Print information along iterations
log (bool, optional) – record log if True
warn (bool, optional) – if True, raises a warning if the algorithm doesn’t converge.
- Returns
a ((dim,) array-like) – Wasserstein barycenter
log (dict) – log dictionary returned only if log==True in parameters
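Examples
The call mirrors ot.bregman.barycenter(); a sketch reusing the A and M built in the example for ot.bregman.barycenter() above:
>>> bary = ot.bregman.barycenter_debiased(A, M, reg=1e-2, method='sinkhorn')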
References
- 37
Janati, H., Cuturi, M., Gramfort, A. Debiased Sinkhorn Barycenters. Proceedings of the 37th International Conference on Machine Learning, PMLR 119:4692-4701, 2020
- ot.bregman.barycenter_sinkhorn(A, M, reg, weights=None, numItermax=1000, stopThr=0.0001, verbose=False, log=False, warn=True)[source]
Compute the entropic regularized Wasserstein barycenter of distributions \(\mathbf{A}\)
The function solves the following optimization problem:
\[\mathbf{a} = \mathop{\arg \min}_\mathbf{a} \quad \sum_i W_{reg}(\mathbf{a},\mathbf{a}_i)\]
where:
\(W_{reg}(\cdot,\cdot)\) is the entropic regularized Wasserstein distance (see ot.bregman.sinkhorn())
\(\mathbf{a}_i\) are training distributions in the columns of matrix \(\mathbf{A}\)
reg and \(\mathbf{M}\) are respectively the regularization term and the cost matrix for OT
The algorithm used for solving the problem is the Sinkhorn-Knopp matrix scaling algorithm as proposed in [3].
- Parameters
A (array-like, shape (dim, n_hists)) – n_hists training distributions \(\mathbf{a}_i\) of size dim
M (array-like, shape (dim, dim)) – loss matrix for OT
reg (float) – Regularization term > 0
weights (array-like, shape (n_hists,)) – Weights of each histogram \(\mathbf{a}_i\) on the simplex (barycentric coordinates)
numItermax (int, optional) – Max number of iterations
stopThr (float, optional) – Stop threshold on error (>0)
verbose (bool, optional) – Print information along iterations
log (bool, optional) – record log if True
warn (bool, optional) – if True, raises a warning if the algorithm doesn’t converge.
- Returns
a ((dim,) array-like) – Wasserstein barycenter
log (dict) – log dictionary returned only if log==True in parameters
References
- 3
Benamou, J. D., Carlier, G., Cuturi, M., Nenna, L., & Peyré, G. (2015). Iterative Bregman projections for regularized transportation problems. SIAM Journal on Scientific Computing, 37(2), A1111-A1138.
- ot.bregman.barycenter_stabilized(A, M, reg, tau=10000000000.0, weights=None, numItermax=1000, stopThr=0.0001, verbose=False, log=False, warn=True)[source]
Compute the entropic regularized Wasserstein barycenter of distributions \(\mathbf{A}\) with stabilization.
The function solves the following optimization problem:
\[\mathbf{a} = \mathop{\arg \min}_\mathbf{a} \quad \sum_i W_{reg}(\mathbf{a},\mathbf{a}_i)\]
where:
\(W_{reg}(\cdot,\cdot)\) is the entropic regularized Wasserstein distance (see ot.bregman.sinkhorn())
\(\mathbf{a}_i\) are training distributions in the columns of matrix \(\mathbf{A}\)
reg and \(\mathbf{M}\) are respectively the regularization term and the cost matrix for OT
The algorithm used for solving the problem is the Sinkhorn-Knopp matrix scaling algorithm as proposed in [3]
- Parameters
A (array-like, shape (dim, n_hists)) – n_hists training distributions \(\mathbf{a}_i\) of size dim
M (array-like, shape (dim, dim)) – loss matrix for OT
reg (float) – Regularization term > 0
tau (float) – threshold for max value in \(\mathbf{u}\) or \(\mathbf{v}\) for log scaling
weights (array-like, shape (n_hists,)) – Weights of each histogram \(\mathbf{a}_i\) on the simplex (barycentric coordinates)
numItermax (int, optional) – Max number of iterations
stopThr (float, optional) – Stop threshold on error (>0)
verbose (bool, optional) – Print information along iterations
log (bool, optional) – record log if True
warn (bool, optional) – if True, raises a warning if the algorithm doesn’t converge.
- Returns
a ((dim,) array-like) – Wasserstein barycenter
log (dict) – log dictionary returned only if log==True in parameters
References
- 3
Benamou, J. D., Carlier, G., Cuturi, M., Nenna, L., & Peyré, G. (2015). Iterative Bregman projections for regularized transportation problems. SIAM Journal on Scientific Computing, 37(2), A1111-A1138.
- ot.bregman.convolutional_barycenter2d(A, reg, weights=None, method='sinkhorn', numItermax=10000, stopThr=0.0001, verbose=False, log=False, warn=True, **kwargs)[source]
Compute the entropic regularized Wasserstein barycenter of distributions \(\mathbf{A}\) where \(\mathbf{A}\) is a collection of 2D images.
The function solves the following optimization problem:
\[\mathbf{a} = \mathop{\arg \min}_\mathbf{a} \quad \sum_i W_{reg}(\mathbf{a},\mathbf{a}_i)\]
where:
\(W_{reg}(\cdot,\cdot)\) is the entropic regularized Wasserstein distance (see ot.bregman.sinkhorn())
\(\mathbf{a}_i\) are training distributions (2D images) in the last two dimensions of matrix \(\mathbf{A}\)
reg is the regularization strength scalar value
The algorithm used for solving the problem is the Sinkhorn-Knopp matrix scaling algorithm as proposed in [21]
- Parameters
A (array-like, shape (n_hists, width, height)) – n distributions (2D images) of size width x height
reg (float) – Regularization term >0
weights (array-like, shape (n_hists,)) – Weights of each image on the simplex (barycentric coordinates)
method (string, optional) – method used for the solver either ‘sinkhorn’ or ‘sinkhorn_log’
numItermax (int, optional) – Max number of iterations
stopThr (float, optional) – Stop threshold on error (> 0)
stabThr (float, optional) – Stabilization threshold to avoid numerical precision issues
verbose (bool, optional) – Print information along iterations
log (bool, optional) – record log if True
warn (bool, optional) – if True, raises a warning if the algorithm doesn’t converge.
- Returns
a (array-like, shape (width, height)) – 2D Wasserstein barycenter
log (dict) – log dictionary returned only if log==True in parameters
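Examples
A minimal sketch with synthetic images (the random data and grid size are illustrative); each 2D slice of A must be normalized to sum to 1:
>>> import numpy as np
>>> import ot
>>> rng = np.random.RandomState(0)
>>> A = rng.rand(2, 40, 40)                    # two 2D images, shape (n_hists, width, height)
>>> A = A / A.sum(axis=(1, 2), keepdims=True)  # normalize each image to a distribution
>>> bary = ot.bregman.convolutional_barycenter2d(A, reg=0.004)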
References
- 21
Solomon, J., De Goes, F., Peyré, G., Cuturi, M., Butscher, A., Nguyen, A. & Guibas, L. (2015). Convolutional wasserstein distances: Efficient optimal transportation on geometric domains. ACM Transactions on Graphics (TOG), 34(4), 66
- 37
Janati, H., Cuturi, M., Gramfort, A. Debiased Sinkhorn Barycenters. Proceedings of the 37th International Conference on Machine Learning, PMLR 119:4692-4701, 2020
- ot.bregman.convolutional_barycenter2d_debiased(A, reg, weights=None, method='sinkhorn', numItermax=10000, stopThr=0.001, verbose=False, log=False, warn=True, **kwargs)[source]
Compute the debiased Sinkhorn barycenter of distributions \(\mathbf{A}\) where \(\mathbf{A}\) is a collection of 2D images.
The function solves the following optimization problem:
\[\mathbf{a} = \mathop{\arg \min}_\mathbf{a} \quad \sum_i S_{reg}(\mathbf{a},\mathbf{a}_i)\]
where:
\(S_{reg}(\cdot,\cdot)\) is the debiased entropic regularized Wasserstein distance (see ot.bregman.barycenter_debiased())
\(\mathbf{a}_i\) are training distributions (2D images) in the last two dimensions of matrix \(\mathbf{A}\)
reg is the regularization strength scalar value
The algorithm used for solving the problem is the debiased Sinkhorn scaling algorithm as proposed in [37]
- Parameters
A (array-like, shape (n_hists, width, height)) – n distributions (2D images) of size width x height
reg (float) – Regularization term >0
weights (array-like, shape (n_hists,)) – Weights of each image on the simplex (barycentric coordinates)
method (string, optional) – method used for the solver either ‘sinkhorn’ or ‘sinkhorn_log’
numItermax (int, optional) – Max number of iterations
stopThr (float, optional) – Stop threshold on error (> 0)
stabThr (float, optional) – Stabilization threshold to avoid numerical precision issues
verbose (bool, optional) – Print information along iterations
log (bool, optional) – record log if True
warn (bool, optional) – if True, raises a warning if the algorithm doesn’t converge.
- Returns
a (array-like, shape (width, height)) – 2D Wasserstein barycenter
log (dict) – log dictionary returned only if log==True in parameters
References
- 37
Janati, H., Cuturi, M., Gramfort, A. Debiased Sinkhorn Barycenters. Proceedings of the 37th International Conference on Machine Learning, PMLR 119:4692-4701, 2020
- ot.bregman.empirical_sinkhorn(X_s, X_t, reg, a=None, b=None, metric='sqeuclidean', numIterMax=10000, stopThr=1e-09, isLazy=False, batchSize=100, verbose=False, log=False, warn=True, **kwargs)[source]
Solve the entropic regularization optimal transport problem and return the OT matrix from empirical data
The function solves the following optimization problem:
\[ \begin{align}\begin{aligned}\gamma = \mathop{\arg \min}_\gamma \quad \langle \gamma, \mathbf{M} \rangle_F + \mathrm{reg} \cdot\Omega(\gamma)\\s.t. \ \gamma \mathbf{1} &= \mathbf{a}\\ \gamma^T \mathbf{1} &= \mathbf{b}\\ \gamma &\geq 0\end{aligned}\end{align} \]
where:
\(\mathbf{M}\) is the (n_samples_a, n_samples_b) metric cost matrix
\(\Omega\) is the entropic regularization term \(\Omega(\gamma)=\sum_{i,j} \gamma_{i,j}\log(\gamma_{i,j})\)
\(\mathbf{a}\) and \(\mathbf{b}\) are source and target weights (sum to 1)
- Parameters
X_s (array-like, shape (n_samples_a, dim)) – samples in the source domain
X_t (array-like, shape (n_samples_b, dim)) – samples in the target domain
reg (float) – Regularization term >0
a (array-like, shape (n_samples_a,)) – samples weights in the source domain
b (array-like, shape (n_samples_b,)) – samples weights in the target domain
numIterMax (int, optional) – Max number of iterations
stopThr (float, optional) – Stop threshold on error (>0)
isLazy (boolean, optional) – If True, then only calculate the cost matrix by block and return the dual potentials only (to save memory). If False, calculate the full cost matrix and return the outputs of the sinkhorn function. See the lazy-mode sketch in the Examples below.
batchSize (int or tuple of 2 int, optional) – Size of the batches used to compute the sinkhorn update without memory overhead. When a tuple is provided it sets the size of the left/right batches.
verbose (bool, optional) – Print information along iterations
log (bool, optional) – record log if True
warn (bool, optional) – if True, raises a warning if the algorithm doesn’t converge.
- Returns
gamma (array-like, shape (n_samples_a, n_samples_b)) – Regularized optimal transportation matrix for the given parameters
log (dict) – log dictionary returned only if log==True in parameters
Examples
>>> n_samples_a = 2
>>> n_samples_b = 2
>>> reg = 0.1
>>> X_s = np.reshape(np.arange(n_samples_a, dtype=np.float64), (n_samples_a, 1))
>>> X_t = np.reshape(np.arange(0, n_samples_b, dtype=np.float64), (n_samples_b, 1))
>>> empirical_sinkhorn(X_s, X_t, reg=reg, verbose=False)
array([[4.99977301e-01, 2.26989344e-05],
       [2.26989344e-05, 4.99977301e-01]])
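A sketch of the lazy mode on synthetic samples (the random data is illustrative); as described for isLazy above, only the dual potentials are returned, so the full (n_samples_a, n_samples_b) cost matrix is never stored:
>>> import numpy as np
>>> import ot
>>> rng = np.random.RandomState(0)
>>> X_s = rng.randn(1000, 2)   # source samples
>>> X_t = rng.randn(1000, 2)   # target samples
>>> f, g = ot.bregman.empirical_sinkhorn(X_s, X_t, reg=0.1, isLazy=True, batchSize=128)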
References
- 2
M. Cuturi, Sinkhorn Distances : Lightspeed Computation of Optimal Transport, Advances in Neural Information Processing Systems (NIPS) 26, 2013
- 9
Schmitzer, B. (2016). Stabilized Sparse Scaling Algorithms for Entropy Regularized Transport Problems. arXiv preprint arXiv:1610.06519.
- 10
Chizat, L., Peyré, G., Schmitzer, B., & Vialard, F. X. (2016). Scaling algorithms for unbalanced transport problems. arXiv preprint arXiv:1607.05816.
- ot.bregman.empirical_sinkhorn2(X_s, X_t, reg, a=None, b=None, metric='sqeuclidean', numIterMax=10000, stopThr=1e-09, isLazy=False, batchSize=100, verbose=False, log=False, warn=True, **kwargs)[source]
Solve the entropic regularization optimal transport problem from empirical data and return the OT loss
The function solves the following optimization problem:
\[ \begin{align}\begin{aligned}W = \min_\gamma \quad \langle \gamma, \mathbf{M} \rangle_F + \mathrm{reg} \cdot\Omega(\gamma)\\s.t. \ \gamma \mathbf{1} &= \mathbf{a}\\ \gamma^T \mathbf{1} &= \mathbf{b}\\ \gamma &\geq 0\end{aligned}\end{align} \]
where:
\(\mathbf{M}\) is the (n_samples_a, n_samples_b) metric cost matrix
\(\Omega\) is the entropic regularization term \(\Omega(\gamma)=\sum_{i,j} \gamma_{i,j}\log(\gamma_{i,j})\)
\(\mathbf{a}\) and \(\mathbf{b}\) are source and target weights (sum to 1)
- Parameters
X_s (array-like, shape (n_samples_a, dim)) – samples in the source domain
X_t (array-like, shape (n_samples_b, dim)) – samples in the target domain
reg (float) – Regularization term >0
a (array-like, shape (n_samples_a,)) – samples weights in the source domain
b (array-like, shape (n_samples_b,)) – samples weights in the target domain
numIterMax (int, optional) – Max number of iterations
stopThr (float, optional) – Stop threshold on error (>0)
isLazy (boolean, optional) – If True, then only calculate the cost matrix by block and return the dual potentials only (to save memory). If False, calculate the full cost matrix and return the outputs of the sinkhorn function.
batchSize (int or tuple of 2 int, optional) – Size of the batches used to compute the sinkhorn update without memory overhead. When a tuple is provided it sets the size of the left/right batches.
verbose (bool, optional) – Print information along iterations
log (bool, optional) – record log if True
warn (bool, optional) – if True, raises a warning if the algorithm doesn’t converge.
- Returns
W ((n_hists) array-like or float) – Optimal transportation loss for the given parameters
log (dict) – log dictionary returned only if log==True in parameters
Examples
>>> n_samples_a = 2
>>> n_samples_b = 2
>>> reg = 0.1
>>> X_s = np.reshape(np.arange(n_samples_a, dtype=np.float64), (n_samples_a, 1))
>>> X_t = np.reshape(np.arange(0, n_samples_b, dtype=np.float64), (n_samples_b, 1))
>>> b = np.full((n_samples_b, 3), 1/n_samples_b)
>>> empirical_sinkhorn2(X_s, X_t, b=b, reg=reg, verbose=False)
array([4.53978687e-05, 4.53978687e-05, 4.53978687e-05])
References
- 2
M. Cuturi, Sinkhorn Distances : Lightspeed Computation of Optimal Transport, Advances in Neural Information Processing Systems (NIPS) 26, 2013
- 9
Schmitzer, B. (2016). Stabilized Sparse Scaling Algorithms for Entropy Regularized Transport Problems. arXiv preprint arXiv:1610.06519.
- 10
Chizat, L., Peyré, G., Schmitzer, B., & Vialard, F. X. (2016). Scaling algorithms for unbalanced transport problems. arXiv preprint arXiv:1607.05816.
- ot.bregman.empirical_sinkhorn_divergence(X_s, X_t, reg, a=None, b=None, metric='sqeuclidean', numIterMax=10000, stopThr=1e-09, verbose=False, log=False, warn=True, **kwargs)[source]
Compute the sinkhorn divergence loss from empirical data
The function solves the following optimization problems and returns the Sinkhorn divergence \(S\):
\[ \begin{align}\begin{aligned}W &= \min_\gamma \quad \langle \gamma, \mathbf{M} \rangle_F + \mathrm{reg} \cdot\Omega(\gamma)\\W_a &= \min_{\gamma_a} \quad \langle \gamma_a, \mathbf{M_a} \rangle_F + \mathrm{reg} \cdot\Omega(\gamma_a)\\W_b &= \min_{\gamma_b} \quad \langle \gamma_b, \mathbf{M_b} \rangle_F + \mathrm{reg} \cdot\Omega(\gamma_b)\\S &= W - \frac{W_a + W_b}{2}\end{aligned}\end{align} \]
\[ \begin{align}\begin{aligned}s.t. \ \gamma \mathbf{1} &= \mathbf{a}\\ \gamma^T \mathbf{1} &= \mathbf{b}\\ \gamma &\geq 0\\ \gamma_a \mathbf{1} &= \mathbf{a}\\ \gamma_a^T \mathbf{1} &= \mathbf{a}\\ \gamma_a &\geq 0\\ \gamma_b \mathbf{1} &= \mathbf{b}\\ \gamma_b^T \mathbf{1} &= \mathbf{b}\\ \gamma_b &\geq 0\end{aligned}\end{align} \]
where:
\(\mathbf{M}\) (resp. \(\mathbf{M_a}\), \(\mathbf{M_b}\)) is the (n_samples_a, n_samples_b) metric cost matrix (resp (n_samples_a, n_samples_a) and (n_samples_b, n_samples_b))
\(\Omega\) is the entropic regularization term \(\Omega(\gamma)=\sum_{i,j} \gamma_{i,j}\log(\gamma_{i,j})\)
\(\mathbf{a}\) and \(\mathbf{b}\) are source and target weights (sum to 1)
- Parameters
X_s (array-like, shape (n_samples_a, dim)) – samples in the source domain
X_t (array-like, shape (n_samples_b, dim)) – samples in the target domain
reg (float) – Regularization term >0
a (array-like, shape (n_samples_a,)) – samples weights in the source domain
b (array-like, shape (n_samples_b,)) – samples weights in the target domain
numIterMax (int, optional) – Max number of iterations
stopThr (float, optional) – Stop threshold on error (>0)
verbose (bool, optional) – Print information along iterations
log (bool, optional) – record log if True
warn (bool, optional) – if True, raises a warning if the algorithm doesn’t converge.
- Returns
W ((1,) array-like) – Optimal transportation symmetrized loss for the given parameters
log (dict) – log dictionary returned only if log==True in parameters
Examples
>>> n_samples_a = 2
>>> n_samples_b = 4
>>> reg = 0.1
>>> X_s = np.reshape(np.arange(n_samples_a, dtype=np.float64), (n_samples_a, 1))
>>> X_t = np.reshape(np.arange(0, n_samples_b, dtype=np.float64), (n_samples_b, 1))
>>> empirical_sinkhorn_divergence(X_s, X_t, reg)
1.499887176049052
References
- 23
Aude Genevay, Gabriel Peyré, Marco Cuturi, Learning Generative Models with Sinkhorn Divergences, Proceedings of the Twenty-First International Conference on Artificial Intelligence and Statistics, (AISTATS) 21, 2018
- ot.bregman.geometricBar(weights, alldistribT)[source]
Return the weighted geometric mean of distributions
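Concretely, for distributions stored as the columns of alldistribT and simplex weights \(\mathbf{w}\), this computes the entrywise product \(\prod_i \mathbf{a}_i^{w_i}\) in log space. A small sketch:
>>> import numpy as np
>>> import ot
>>> A = np.array([[0.5, 0.25],
...               [0.5, 0.75]])        # two distributions as columns
>>> w = np.array([0.5, 0.5])           # uniform weights on the simplex
>>> g = ot.bregman.geometricBar(w, A)  # entrywise sqrt(a1 * a2) here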
- ot.bregman.greenkhorn(a, b, M, reg, numItermax=10000, stopThr=1e-09, verbose=False, log=False, warn=True)[source]
Solve the entropic regularization optimal transport problem and return the OT matrix
The algorithm used is based on the paper [22], which proposes a greedy version of the Sinkhorn-Knopp algorithm [2]
The function solves the following optimization problem:
\[ \begin{align}\begin{aligned}\gamma = \mathop{\arg \min}_\gamma \quad \langle \gamma, \mathbf{M} \rangle_F + \mathrm{reg}\cdot\Omega(\gamma)\\s.t. \ \gamma \mathbf{1} &= \mathbf{a}\\ \gamma^T \mathbf{1} &= \mathbf{b}\\ \gamma &\geq 0\end{aligned}\end{align} \]
where:
\(\mathbf{M}\) is the (dim_a, dim_b) metric cost matrix
\(\Omega\) is the entropic regularization term \(\Omega(\gamma)=\sum_{i,j} \gamma_{i,j}\log(\gamma_{i,j})\)
\(\mathbf{a}\) and \(\mathbf{b}\) are source and target weights (histograms, both sum to 1)
- Parameters
a (array-like, shape (dim_a,)) – samples weights in the source domain
b (array-like, shape (dim_b,) or array-like, shape (dim_b, n_hists)) – samples weights in the target domain; if \(\mathbf{b}\) is a matrix, compute Sinkhorn with multiple targets and a fixed \(\mathbf{M}\) (return OT loss + dual variables in log)
M (array-like, shape (dim_a, dim_b)) – loss matrix
reg (float) – Regularization term >0
numItermax (int, optional) – Max number of iterations
stopThr (float, optional) – Stop threshold on error (>0)
log (bool, optional) – record log if True
warn (bool, optional) – if True, raises a warning if the algorithm doesn’t converge.
- Returns
gamma (array-like, shape (dim_a, dim_b)) – Optimal transportation matrix for the given parameters
log (dict) – log dictionary returned only if log==True in parameters
Examples
>>> import ot
>>> a = [.5, .5]
>>> b = [.5, .5]
>>> M = [[0., 1.], [1., 0.]]
>>> ot.bregman.greenkhorn(a, b, M, 1)
array([[0.36552929, 0.13447071],
       [0.13447071, 0.36552929]])
References
- 2
M. Cuturi, Sinkhorn Distances : Lightspeed Computation of Optimal Transport, Advances in Neural Information Processing Systems (NIPS) 26, 2013
- 22
J. Altschuler, J. Weed, P. Rigollet : Near-linear time approximation algorithms for optimal transport via Sinkhorn iteration, Advances in Neural Information Processing Systems (NIPS) 31, 2017
See also
ot.lp.emd
Unregularized OT
ot.optim.cg
General regularized OT
- ot.bregman.jcpot_barycenter(Xs, Ys, Xt, reg, metric='sqeuclidean', numItermax=100, stopThr=1e-06, verbose=False, log=False, warn=True, **kwargs)[source]
Joint OT and proportion estimation for multi-source target shift as proposed in [27]
The function solves the following optimization problem:
\[ \begin{align}\begin{aligned}\mathbf{h} = \mathop{\arg \min}_{\mathbf{h}} \quad \sum_{k=1}^{K} \lambda_k W_{reg}((\mathbf{D}_2^{(k)} \mathbf{h})^T, \mathbf{a})\\s.t. \ \forall k, \mathbf{D}_1^{(k)} \gamma_k \mathbf{1}_n= \mathbf{h}\end{aligned}\end{align} \]
where:
\(\lambda_k\) is the weight of the k-th source domain
\(W_{reg}(\cdot,\cdot)\) is the entropic regularized Wasserstein distance (see ot.bregman.sinkhorn())
\(\mathbf{D}_2^{(k)}\) is a matrix of weights related to the k-th source domain defined as in [p. 5, 27], its expected shape is \((n_k, C)\) where \(n_k\) is the number of elements in the k-th source domain and C is the number of classes
\(\mathbf{h}\) is a vector of estimated proportions in the target domain of size C
\(\mathbf{a}\) is a uniform vector of weights in the target domain of size n
\(\mathbf{D}_1^{(k)}\) is a matrix of class assignments defined as in [p. 5, 27], its expected shape is \((n_k, C)\)
The problem consists in solving a Wasserstein barycenter problem to estimate the proportions \(\mathbf{h}\) in the target domain.
The algorithm used for solving the problem is the Iterative Bregman projections algorithm with two sets of marginal constraints related to the unknown vector \(\mathbf{h}\) and uniform target distribution.
- Parameters
Xs (list of K array-like(nsk,d)) – features of all source domains’ samples
Ys (list of K array-like(nsk,)) – labels of all source domains’ samples
Xt (array-like (nt,d)) – samples in the target domain
reg (float) – Regularization term > 0
metric (string, optional (default="sqeuclidean")) – The ground metric for the Wasserstein problem
numItermax (int, optional) – Max number of iterations
stopThr (float, optional) – Stop threshold on relative change in the barycenter (>0)
verbose (bool, optional (default=False)) – Controls the verbosity of the optimization algorithm
log (bool, optional) – record log if True
warn (bool, optional) – if True, raises a warning if the algorithm doesn’t converge.
- Returns
h ((C,) array-like) – proportion estimation in the target domain
log (dict) – log dictionary returned only if log==True in parameters
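Examples
A sketch on synthetic data (two source domains with binary labels; all values are illustrative):
>>> import numpy as np
>>> import ot
>>> rng = np.random.RandomState(42)
>>> Xs = [rng.randn(20, 2), rng.randn(30, 2) + 1.0]       # samples from two source domains
>>> Ys = [rng.randint(0, 2, 20), rng.randint(0, 2, 30)]   # class labels in each source domain
>>> Xt = rng.randn(25, 2)                                 # unlabeled target samples
>>> h = ot.bregman.jcpot_barycenter(Xs, Ys, Xt, reg=1.0)  # estimated target class proportions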
References
- 27
Ievgen Redko, Nicolas Courty, Rémi Flamary, Devis Tuia “Optimal transport for multi-source domain adaptation under target shift”, International Conference on Artificial Intelligence and Statistics (AISTATS), 2019.
- ot.bregman.screenkhorn(a, b, M, reg, ns_budget=None, nt_budget=None, uniform=False, restricted=True, maxiter=10000, maxfun=10000, pgtol=1e-09, verbose=False, log=False)[source]
Screening Sinkhorn Algorithm for Regularized Optimal Transport
The function solves an approximate dual of Sinkhorn divergence [2] which is written as the following optimization problem:
\[(\mathbf{u}, \mathbf{v}) = \mathop{\arg \min}_{\mathbf{u}, \mathbf{v}} \quad \mathbf{1}_{ns}^T \mathbf{B}(\mathbf{u}, \mathbf{v}) \mathbf{1}_{nt} - \langle \kappa \mathbf{u}, \mathbf{a} \rangle - \langle \frac{1}{\kappa} \mathbf{v}, \mathbf{b} \rangle\]where:
\[\mathbf{B}(\mathbf{u}, \mathbf{v}) = \mathrm{diag}(e^\mathbf{u}) \mathbf{K} \mathrm{diag}(e^\mathbf{v}) \text{, with } \mathbf{K} = e^{-\mathbf{M} / \mathrm{reg}} \text{ and}\]
\[ \begin{align}\begin{aligned}s.t. \ e^{u_i} &\geq \epsilon / \kappa, \forall i \in \{1, \ldots, ns\}\\ e^{v_j} &\geq \epsilon \kappa, \forall j \in \{1, \ldots, nt\}\end{aligned}\end{align} \]
The parameters \(\kappa\) and \(\epsilon\) are determined with respect to the budget of points to keep (ns_budget, nt_budget); see Equation (5) in [26].
- Parameters
a (array-like, shape=(ns,)) – samples weights in the source domain
b (array-like, shape=(nt,)) – samples weights in the target domain
M (array-like, shape=(ns, nt)) – Cost matrix
reg (float) – Level of the entropy regularisation
ns_budget (int, default=None) – Number budget of points to be kept in the source domain. If it is None then 50% of the source sample points will be kept
nt_budget (int, default=None) – Number budget of points to be kept in the target domain. If it is None then 50% of the target sample points will be kept
uniform (bool, default=False) – If True, the source and target distribution are supposed to be uniform, i.e., \(a_i = 1 / ns\) and \(b_j = 1 / nt\)
restricted (bool, default=True) – If True, a warm-start initialization for the L-BFGS-B solver is computed using a restricted Sinkhorn algorithm with at most 5 iterations
maxiter (int, default=10000) – Maximum number of iterations in LBFGS solver
maxfun (int, default=10000) – Maximum number of function evaluations in LBFGS solver
pgtol (float, default=1e-09) – Final objective function accuracy in LBFGS solver
verbose (bool, default=False) – If True, display information about the cardinalities of the active sets and the parameters kappa and epsilon
Dependency
To gain more efficiency, ot.bregman.screenkhorn() needs the “Bottleneck” package (https://pypi.org/project/Bottleneck/) in the screening pre-processing step. If Bottleneck isn’t installed, the following error message appears:
“Bottleneck module doesn’t exist. Install it from https://pypi.org/project/Bottleneck/”
- Returns
gamma (array-like, shape=(ns, nt)) – Screened optimal transportation matrix for the given parameters
log (dict, default=False) – Log dictionary returned only if log==True in parameters
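Examples
A sketch (requires the Bottleneck package noted above; the random samples and budgets are illustrative):
>>> import numpy as np
>>> import ot
>>> rng = np.random.RandomState(0)
>>> a, b = ot.unif(50), ot.unif(50)                  # uniform marginals
>>> M = ot.dist(rng.randn(50, 2), rng.randn(50, 2))  # squared Euclidean cost
>>> G = ot.bregman.screenkhorn(a, b, M, reg=1.0, ns_budget=25, nt_budget=25)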
References
- 2
M. Cuturi, Sinkhorn Distances : Lightspeed Computation of Optimal Transport, Advances in Neural Information Processing Systems (NIPS) 26, 2013
- 26
Alaya M. Z., Bérar M., Gasso G., Rakotomamonjy A. Screening Sinkhorn Algorithm for Regularized Optimal Transport, Advances in Neural Information Processing Systems (NIPS) 33, 2019
- ot.bregman.sinkhorn(a, b, M, reg, method='sinkhorn', numItermax=1000, stopThr=1e-09, verbose=False, log=False, warn=True, **kwargs)[source]
Solve the entropic regularization optimal transport problem and return the OT matrix
The function solves the following optimization problem:
\[ \begin{align}\begin{aligned}\gamma = \mathop{\arg \min}_\gamma \quad \langle \gamma, \mathbf{M} \rangle_F + \mathrm{reg}\cdot\Omega(\gamma)\\s.t. \ \gamma \mathbf{1} &= \mathbf{a}\\ \gamma^T \mathbf{1} &= \mathbf{b}\\ \gamma &\geq 0\end{aligned}\end{align} \]
where:
\(\mathbf{M}\) is the (dim_a, dim_b) metric cost matrix
\(\Omega\) is the entropic regularization term \(\Omega(\gamma)=\sum_{i,j} \gamma_{i,j}\log(\gamma_{i,j})\)
\(\mathbf{a}\) and \(\mathbf{b}\) are source and target weights (histograms, both sum to 1)
Note
This function is backend-compatible and will work on arrays from all compatible backends.
The algorithm used for solving the problem is the Sinkhorn-Knopp matrix scaling algorithm as proposed in [2]
Choosing a Sinkhorn solver
By default, and when using a regularization parameter that is not too small, the default sinkhorn solver should be enough. If you need a small regularization to get sharper OT matrices, you should use the ot.bregman.sinkhorn_stabilized() solver, which avoids numerical errors. That solver can however be very slow in practice and might not even converge to a reasonable OT matrix in a finite time. This is why ot.bregman.sinkhorn_epsilon_scaling(), which iterates on the value of the regularization (with warm starts), sometimes leads to better solutions. Note that the greedy version of Sinkhorn, ot.bregman.greenkhorn(), can also provide a speedup, and the screening version, ot.bregman.screenkhorn(), aims at a fast approximation of the Sinkhorn problem. For GPU use and gradient computation with a small number of iterations we strongly recommend the ot.bregman.sinkhorn_log() solver, which does not need to check for numerical problems. A short sketch of switching solvers follows the examples below.
- Parameters
a (array-like, shape (dim_a,)) – samples weights in the source domain
b (array-like, shape (dim_b,) or ndarray, shape (dim_b, n_hists)) – samples weights in the target domain; if \(\mathbf{b}\) is a matrix, compute Sinkhorn with multiple targets and a fixed \(\mathbf{M}\) (return OT loss + dual variables in log)
M (array-like, shape (dim_a, dim_b)) – loss matrix
reg (float) – Regularization term >0
method (str) – method used for the solver either ‘sinkhorn’, ‘sinkhorn_log’, ‘greenkhorn’, ‘sinkhorn_stabilized’ or ‘sinkhorn_epsilon_scaling’, see those functions for specific parameters
numItermax (int, optional) – Max number of iterations
stopThr (float, optional) – Stop threshold on error (>0)
verbose (bool, optional) – Print information along iterations
log (bool, optional) – record log if True
warn (bool, optional) – if True, raises a warning if the algorithm doesn’t converge.
- Returns
gamma (array-like, shape (dim_a, dim_b)) – Optimal transportation matrix for the given parameters
log (dict) – log dictionary returned only if log==True in parameters
Examples
>>> import ot
>>> a = [.5, .5]
>>> b = [.5, .5]
>>> M = [[0., 1.], [1., 0.]]
>>> ot.sinkhorn(a, b, M, 1)
array([[0.36552929, 0.13447071],
       [0.13447071, 0.36552929]])
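A sketch of switching solvers through the method argument, with the same toy inputs as above (the reg value is illustrative):
>>> G = ot.sinkhorn(a, b, M, reg=0.01, method='sinkhorn_log')          # robust to small reg
>>> G2 = ot.sinkhorn(a, b, M, reg=0.01, method='sinkhorn_stabilized')  # log stabilization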
References
- 2
M. Cuturi, Sinkhorn Distances : Lightspeed Computation of Optimal Transport, Advances in Neural Information Processing Systems (NIPS) 26, 2013
- 9
Schmitzer, B. (2016). Stabilized Sparse Scaling Algorithms for Entropy Regularized Transport Problems. arXiv preprint arXiv:1610.06519.
- 10
Chizat, L., Peyré, G., Schmitzer, B., & Vialard, F. X. (2016). Scaling algorithms for unbalanced transport problems. arXiv preprint arXiv:1607.05816.
- 34
Feydy, J., Séjourné, T., Vialard, F. X., Amari, S. I., Trouvé, A., & Peyré, G. (2019, April). Interpolating between optimal transport and MMD using Sinkhorn divergences. In The 22nd International Conference on Artificial Intelligence and Statistics (pp. 2681-2690). PMLR.
See also
ot.lp.emd
Unregularized OT
ot.optim.cg
General regularized OT
ot.bregman.sinkhorn_knopp
Classic Sinkhorn [2]
ot.bregman.sinkhorn_stabilized
ot.bregman.sinkhorn_epsilon_scaling
- ot.bregman.sinkhorn2(a, b, M, reg, method='sinkhorn', numItermax=1000, stopThr=1e-09, verbose=False, log=False, warn=False, **kwargs)[source]
Solve the entropic regularization optimal transport problem and return the loss
The function solves the following optimization problem:
\[ \begin{align}\begin{aligned}W = \min_\gamma \quad \langle \gamma, \mathbf{M} \rangle_F + \mathrm{reg}\cdot\Omega(\gamma)\\s.t. \ \gamma \mathbf{1} &= \mathbf{a}\\ \gamma^T \mathbf{1} &= \mathbf{b}\\ \gamma &\geq 0\end{aligned}\end{align} \]
where:
\(\mathbf{M}\) is the (dim_a, dim_b) metric cost matrix
\(\Omega\) is the entropic regularization term \(\Omega(\gamma)=\sum_{i,j} \gamma_{i,j}\log(\gamma_{i,j})\)
\(\mathbf{a}\) and \(\mathbf{b}\) are source and target weights (histograms, both sum to 1)
Note
This function is backend-compatible and will work on arrays from all compatible backends.
The algorithm used for solving the problem is the Sinkhorn-Knopp matrix scaling algorithm as proposed in [2]
Choosing a Sinkhorn solver
By default, and when using a regularization parameter that is not too small, the default sinkhorn solver should be enough. If you need a small regularization to get sharper OT matrices, you should use the ot.bregman.sinkhorn_stabilized() solver, which avoids numerical errors. That solver can however be very slow in practice and might not even converge to a reasonable OT matrix in a finite time. This is why ot.bregman.sinkhorn_epsilon_scaling(), which iterates on the value of the regularization (with warm starts), sometimes leads to better solutions. Note that the greedy version of Sinkhorn, ot.bregman.greenkhorn(), can also provide a speedup, and the screening version, ot.bregman.screenkhorn(), aims at a fast approximation of the Sinkhorn problem. For GPU use and gradient computation with a small number of iterations we strongly recommend the ot.bregman.sinkhorn_log() solver, which does not need to check for numerical problems.
- Parameters
a (array-like, shape (dim_a,)) – samples weights in the source domain
b (array-like, shape (dim_b,) or ndarray, shape (dim_b, n_hists)) – samples weights in the target domain; if \(\mathbf{b}\) is a matrix, compute Sinkhorn with multiple targets and a fixed \(\mathbf{M}\) (return OT loss + dual variables in log)
M (array-like, shape (dim_a, dim_b)) – loss matrix
reg (float) – Regularization term >0
method (str) – method used for the solver either ‘sinkhorn’, ‘sinkhorn_log’ or ‘sinkhorn_stabilized’, see those functions for specific parameters
numItermax (int, optional) – Max number of iterations
stopThr (float, optional) – Stop threshold on error (>0)
verbose (bool, optional) – Print information along iterations
log (bool, optional) – record log if True
warn (bool, optional) – if True, raises a warning if the algorithm doesn’t converge.
- Returns
W ((n_hists) float/array-like) – Optimal transportation loss for the given parameters
log (dict) – log dictionary returned only if log==True in parameters
Examples
>>> import ot
>>> a = [.5, .5]
>>> b = [.5, .5]
>>> M = [[0., 1.], [1., 0.]]
>>> ot.sinkhorn2(a, b, M, 1)
0.26894142136999516
References
- 2
M. Cuturi, Sinkhorn Distances : Lightspeed Computation of Optimal Transport, Advances in Neural Information Processing Systems (NIPS) 26, 2013
- 9
Schmitzer, B. (2016). Stabilized Sparse Scaling Algorithms for Entropy Regularized Transport Problems. arXiv preprint arXiv:1610.06519.
- 10
Chizat, L., Peyré, G., Schmitzer, B., & Vialard, F. X. (2016). Scaling algorithms for unbalanced transport problems. arXiv preprint arXiv:1607.05816.
- 21
Altschuler J., Weed J., Rigollet P. : Near-linear time approximation algorithms for optimal transport via Sinkhorn iteration, Advances in Neural Information Processing Systems (NIPS) 31, 2017
- 34
Feydy, J., Séjourné, T., Vialard, F. X., Amari, S. I., Trouvé, A., & Peyré, G. (2019, April). Interpolating between optimal transport and MMD using Sinkhorn divergences. In The 22nd International Conference on Artificial Intelligence and Statistics (pp. 2681-2690). PMLR.
See also
ot.lp.emd
Unregularized OT
ot.optim.cg
General regularized OT
ot.bregman.sinkhorn_knopp
Classic Sinkhorn [2]
ot.bregman.greenkhorn
Greenkhorn [21]
ot.bregman.sinkhorn_stabilized
- ot.bregman.sinkhorn_epsilon_scaling(a, b, M, reg, numItermax=100, epsilon0=10000.0, numInnerItermax=100, tau=1000.0, stopThr=1e-09, warmstart=None, verbose=False, print_period=10, log=False, warn=True, **kwargs)[source]
Solve the entropic regularization optimal transport problem with log stabilization and epsilon scaling. The function solves the following optimization problem:
\[ \begin{align}\begin{aligned}\gamma = \mathop{\arg \min}_\gamma \quad \langle \gamma, \mathbf{M} \rangle_F + \mathrm{reg}\cdot\Omega(\gamma)\\s.t. \ \gamma \mathbf{1} &= \mathbf{a}\\ \gamma^T \mathbf{1} &= \mathbf{b}\\ \gamma &\geq 0\end{aligned}\end{align} \]
where:
\(\mathbf{M}\) is the (dim_a, dim_b) metric cost matrix
\(\Omega\) is the entropic regularization term \(\Omega(\gamma)=\sum_{i,j} \gamma_{i,j}\log(\gamma_{i,j})\)
\(\mathbf{a}\) and \(\mathbf{b}\) are source and target weights (histograms, both sum to 1)
The algorithm used for solving the problem is the Sinkhorn-Knopp matrix scaling algorithm as proposed in [2], but with the log stabilization proposed in [10] and the epsilon scaling proposed in [9] (Algorithm 3.2)
- Parameters
a (array-like, shape (dim_a,)) – samples weights in the source domain
b (array-like, shape (dim_b,)) – samples in the target domain
M (array-like, shape (dim_a, dim_b)) – loss matrix
reg (float) – Regularization term >0
tau (float) – threshold for max value in \(\mathbf{u}\) or \(\mathbf{v}\) for log scaling
warmstart (tuple of vectors) – if given then starting values for alpha and beta log scalings
numItermax (int, optional) – Max number of iterations
numInnerItermax (int, optional) – Max number of iterations in the inner log-stabilized Sinkhorn
epsilon0 (float, optional) – first epsilon regularization value (then exponential decrease to reg)
stopThr (float, optional) – Stop threshold on error (>0)
verbose (bool, optional) – Print information along iterations
log (bool, optional) – record log if True
warn (bool, optional) – if True, raises a warning if the algorithm doesn’t converge.
- Returns
gamma (array-like, shape (dim_a, dim_b)) – Optimal transportation matrix for the given parameters
log (dict) – log dictionary returned only if log==True in parameters
Examples
>>> import ot
>>> a = [.5, .5]
>>> b = [.5, .5]
>>> M = [[0., 1.], [1., 0.]]
>>> ot.bregman.sinkhorn_epsilon_scaling(a, b, M, 1)
array([[0.36552929, 0.13447071],
       [0.13447071, 0.36552929]])
References
- 2
M. Cuturi, Sinkhorn Distances : Lightspeed Computation of Optimal Transport, Advances in Neural Information Processing Systems (NIPS) 26, 2013
- 9
Schmitzer, B. (2016). Stabilized Sparse Scaling Algorithms for Entropy Regularized Transport Problems. arXiv preprint arXiv:1610.06519.
- 10
Chizat, L., Peyré, G., Schmitzer, B., & Vialard, F. X. (2016). Scaling algorithms for unbalanced transport problems. arXiv preprint arXiv:1607.05816.
See also
ot.lp.emd
Unregularized OT
ot.optim.cg
General regularized OT
- ot.bregman.sinkhorn_knopp(a, b, M, reg, numItermax=1000, stopThr=1e-09, verbose=False, log=False, warn=True, **kwargs)[source]
Solve the entropic regularization optimal transport problem and return the OT matrix
The function solves the following optimization problem:
\[ \begin{align}\begin{aligned}\gamma = \mathop{\arg \min}_\gamma \quad \langle \gamma, \mathbf{M} \rangle_F + \mathrm{reg}\cdot\Omega(\gamma)\\s.t. \ \gamma \mathbf{1} &= \mathbf{a}\\ \gamma^T \mathbf{1} &= \mathbf{b}\\ \gamma &\geq 0\end{aligned}\end{align} \]
where:
\(\mathbf{M}\) is the (dim_a, dim_b) metric cost matrix
\(\Omega\) is the entropic regularization term \(\Omega(\gamma)=\sum_{i,j} \gamma_{i,j}\log(\gamma_{i,j})\)
\(\mathbf{a}\) and \(\mathbf{b}\) are source and target weights (histograms, both sum to 1)
The algorithm used for solving the problem is the Sinkhorn-Knopp matrix scaling algorithm as proposed in [2]
- Parameters
a (array-like, shape (dim_a,)) – samples weights in the source domain
b (array-like, shape (dim_b,) or array-like, shape (dim_b, n_hists)) – samples weights in the target domain; if \(\mathbf{b}\) is a matrix, compute Sinkhorn with multiple targets and a fixed \(\mathbf{M}\) (return OT loss + dual variables in log)
M (array-like, shape (dim_a, dim_b)) – loss matrix
reg (float) – Regularization term >0
numItermax (int, optional) – Max number of iterations
stopThr (float, optional) – Stop threshold on error (>0)
verbose (bool, optional) – Print information along iterations
log (bool, optional) – record log if True
warn (bool, optional) – if True, raises a warning if the algorithm doesn’t converge.
- Returns
gamma (array-like, shape (dim_a, dim_b)) – Optimal transportation matrix for the given parameters
log (dict) – log dictionary returned only if log==True in parameters
Examples
>>> import ot
>>> a = [.5, .5]
>>> b = [.5, .5]
>>> M = [[0., 1.], [1., 0.]]
>>> ot.sinkhorn(a, b, M, 1)
array([[0.36552929, 0.13447071],
       [0.13447071, 0.36552929]])
References
- 2
M. Cuturi, Sinkhorn Distances : Lightspeed Computation of Optimal Transport, Advances in Neural Information Processing Systems (NIPS) 26, 2013
See also
ot.lp.emd
Unregularized OT
ot.optim.cg
General regularized OT
- ot.bregman.sinkhorn_log(a, b, M, reg, numItermax=1000, stopThr=1e-09, verbose=False, log=False, warn=True, **kwargs)[source]
Solve the entropic regularization optimal transport problem in log space and return the OT matrix
The function solves the following optimization problem:
\[ \begin{align}\begin{aligned}\gamma = \mathop{\arg \min}_\gamma \quad \langle \gamma, \mathbf{M} \rangle_F + \mathrm{reg}\cdot\Omega(\gamma)\\s.t. \ \gamma \mathbf{1} &= \mathbf{a}\\ \gamma^T \mathbf{1} &= \mathbf{b}\\ \gamma &\geq 0\end{aligned}\end{align} \]
where:
\(\mathbf{M}\) is the (dim_a, dim_b) metric cost matrix
\(\Omega\) is the entropic regularization term \(\Omega(\gamma)=\sum_{i,j} \gamma_{i,j}\log(\gamma_{i,j})\)
\(\mathbf{a}\) and \(\mathbf{b}\) are source and target weights (histograms, both sum to 1)
The algorithm used for solving the problem is the Sinkhorn-Knopp matrix scaling algorithm [2] with the implementation from [34]
- Parameters
a (array-like, shape (dim_a,)) – samples weights in the source domain
b (array-like, shape (dim_b,) or array-like, shape (dim_b, n_hists)) – samples weights in the target domain; if \(\mathbf{b}\) is a matrix, compute Sinkhorn with multiple targets and a fixed \(\mathbf{M}\) (return OT loss + dual variables in log)
M (array-like, shape (dim_a, dim_b)) – loss matrix
reg (float) – Regularization term >0
numItermax (int, optional) – Max number of iterations
stopThr (float, optional) – Stop threshold on error (>0)
verbose (bool, optional) – Print information along iterations
log (bool, optional) – record log if True
warn (bool, optional) – if True, raises a warning if the algorithm doesn’t converge.
- Returns
gamma (array-like, shape (dim_a, dim_b)) – Optimal transportation matrix for the given parameters
log (dict) – log dictionary returned only if log==True in parameters
Examples
>>> import ot
>>> a = [.5, .5]
>>> b = [.5, .5]
>>> M = [[0., 1.], [1., 0.]]
>>> ot.sinkhorn(a, b, M, 1)
array([[0.36552929, 0.13447071],
       [0.13447071, 0.36552929]])
References
- 2
M. Cuturi, Sinkhorn Distances : Lightspeed Computation of Optimal Transport, Advances in Neural Information Processing Systems (NIPS) 26, 2013
- 34
Feydy, J., Séjourné, T., Vialard, F. X., Amari, S. I., Trouvé, A., & Peyré, G. (2019, April). Interpolating between optimal transport and MMD using Sinkhorn divergences. In The 22nd International Conference on Artificial Intelligence and Statistics (pp. 2681-2690). PMLR.
See also
ot.lp.emd
Unregularized OT
ot.optim.cg
General regularized OT
- ot.bregman.sinkhorn_stabilized(a, b, M, reg, numItermax=1000, tau=1000.0, stopThr=1e-09, warmstart=None, verbose=False, print_period=20, log=False, warn=True, **kwargs)[source]
Solve the entropic regularization OT problem with log stabilization
The function solves the following optimization problem:
\[ \begin{align}\begin{aligned}\gamma = \mathop{\arg \min}_\gamma \quad \langle \gamma, \mathbf{M} \rangle_F + \mathrm{reg}\cdot\Omega(\gamma)\\s.t. \ \gamma \mathbf{1} &= \mathbf{a}\\ \gamma^T \mathbf{1} &= \mathbf{b}\\ \gamma &\geq 0\end{aligned}\end{align} \]
where:
\(\mathbf{M}\) is the (dim_a, dim_b) metric cost matrix
\(\Omega\) is the entropic regularization term \(\Omega(\gamma)=\sum_{i,j} \gamma_{i,j}\log(\gamma_{i,j})\)
\(\mathbf{a}\) and \(\mathbf{b}\) are source and target weights (histograms, both sum to 1)
The algorithm used for solving the problem is the Sinkhorn-Knopp matrix scaling algorithm as proposed in [2], but with the log stabilization proposed in [10] and defined in [9] (Algorithm 3.1).
- Parameters
a (array-like, shape (dim_a,)) – samples weights in the source domain
b (array-like, shape (dim_b,)) – samples in the target domain
M (array-like, shape (dim_a, dim_b)) – loss matrix
reg (float) – Regularization term >0
tau (float) – threshold for max value in \(\mathbf{u}\) or \(\mathbf{v}\) for log scaling
warmstart (tuple of vectors) – if given then starting values for alpha and beta log scalings
numItermax (int, optional) – Max number of iterations
stopThr (float, optional) – Stop threshold on error (>0)
verbose (bool, optional) – Print information along iterations
log (bool, optional) – record log if True
warn (bool, optional) – if True, raises a warning if the algorithm doesn’t converge.
- Returns
gamma (array-like, shape (dim_a, dim_b)) – Optimal transportation matrix for the given parameters
log (dict) – log dictionary returned only if log==True in parameters
Examples
>>> import ot
>>> a = [.5, .5]
>>> b = [.5, .5]
>>> M = [[0., 1.], [1., 0.]]
>>> ot.bregman.sinkhorn_stabilized(a, b, M, 1)
array([[0.36552929, 0.13447071],
       [0.13447071, 0.36552929]])
References
- 2
M. Cuturi, Sinkhorn Distances : Lightspeed Computation of Optimal Transport, Advances in Neural Information Processing Systems (NIPS) 26, 2013
- 9
Schmitzer, B. (2016). Stabilized Sparse Scaling Algorithms for Entropy Regularized Transport Problems. arXiv preprint arXiv:1610.06519.
- 10
Chizat, L., Peyré, G., Schmitzer, B., & Vialard, F. X. (2016). Scaling algorithms for unbalanced transport problems. arXiv preprint arXiv:1607.05816.
See also
ot.lp.emd
Unregularized OT
ot.optim.cg
General regularized OT
- ot.bregman.unmix(a, D, M, M0, h0, reg, reg0, alpha, numItermax=1000, stopThr=0.001, verbose=False, log=False, warn=True)[source]
Compute the unmixing of an observation with a given dictionary using Wasserstein distance
The function solves the following optimization problem:
\[\mathbf{h} = \mathop{\arg \min}_\mathbf{h} \quad (1 - \alpha) W_{\mathbf{M}, \mathrm{reg}}(\mathbf{a}, \mathbf{Dh}) + \alpha W_{\mathbf{M_0}, \mathrm{reg}_0}(\mathbf{h}_0, \mathbf{h})\]
where:
\(W_{\mathbf{M}, \mathrm{reg}}(\cdot,\cdot)\) is the entropic regularized Wasserstein distance with \(\mathbf{M}\) loss matrix (see ot.bregman.sinkhorn())
\(\mathbf{D}\) is a dictionary of n_atoms atoms of dimension dim_a, its expected shape is (dim_a, n_atoms)
\(\mathbf{h}\) is the estimated unmixing of dimension n_atoms
\(\mathbf{a}\) is an observed distribution of dimension dim_a
\(\mathbf{h}_0\) is a prior on \(\mathbf{h}\) of dimension dim_prior
reg and \(\mathbf{M}\) are respectively the regularization term and the cost matrix (dim_a, dim_a) for OT data fitting
reg\(_0\) and \(\mathbf{M_0}\) are respectively the regularization term and the cost matrix (dim_prior, n_atoms) for the regularization with respect to \(\mathbf{h}_0\)
\(\alpha\) weights the data fitting and the regularization terms
The optimization problem is solved following the algorithm described in [4]
- Parameters
a (array-like, shape (dim_a)) – observed distribution (histogram, sums to 1)
D (array-like, shape (dim_a, n_atoms)) – dictionary matrix
M (array-like, shape (dim_a, dim_a)) – loss matrix
M0 (array-like, shape (n_atoms, dim_prior)) – loss matrix
h0 (array-like, shape (n_atoms,)) – prior on the estimated unmixing h
reg (float) – Regularization term >0 (Wasserstein data fitting)
reg0 (float) – Regularization term >0 (Wasserstein reg with h0)
alpha (float) – How much should we trust the prior ([0,1])
numItermax (int, optional) – Max number of iterations
stopThr (float, optional) – Stop threshold on error (>0)
verbose (bool, optional) – Print information along iterations
log (bool, optional) – record log if True
warn (bool, optional) – if True, raises a warning if the algorithm doesn’t converge.
- Returns
h (array-like, shape (n_atoms,)) – Wasserstein barycenter
log (dict) – log dictionary returned only if log==True in parameters
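Examples
A sketch with a two-atom dictionary; the atoms, costs, and parameter values are illustrative:
>>> import numpy as np
>>> import ot
>>> n = 20
>>> atom1 = ot.datasets.make_1D_gauss(n, m=5, s=2)
>>> atom2 = ot.datasets.make_1D_gauss(n, m=15, s=2)
>>> D = np.vstack((atom1, atom2)).T     # dictionary, shape (dim_a, n_atoms)
>>> a = 0.3 * atom1 + 0.7 * atom2       # observed mixture to unmix
>>> M = ot.utils.dist0(n)
>>> M = M / M.max()                     # data-fitting cost, shape (dim_a, dim_a)
>>> M0 = ot.dist(np.eye(2), np.eye(2))  # cost between atoms for the prior term
>>> h0 = np.array([0.5, 0.5])           # uniform prior on the unmixing
>>> h = ot.bregman.unmix(a, D, M, M0, h0, reg=0.1, reg0=0.1, alpha=0.5)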
References
- 4
S. Nakhostin, N. Courty, R. Flamary, D. Tuia, T. Corpetti, Supervised planetary unmixing with optimal transport, Workshop on Hyperspectral Image and Signal Processing: Evolution in Remote Sensing (WHISPERS), 2016.