\(W_{reg}(\cdot,\cdot)\) is the entropic regularized Wasserstein
distance (see ot.bregman.sinkhorn())
if method is 'sinkhorn', 'sinkhorn_stabilized' or 'sinkhorn_log'.
\(\mathbf{a}_i\) are training distributions in the columns of matrix
\(\mathbf{A}\)
reg and \(\mathbf{M}\) are respectively the regularization term and
the cost matrix for OT
The algorithm used for solving the problem is the Sinkhorn-Knopp matrix scaling
algorithm as proposed in [3]
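For illustration, a minimal sketch of a barycenter computation (the 1D Gaussian histograms, grid, and regularization value are illustrative assumptions, not part of the API):
>>> import numpy as np
>>> import ot
>>> n = 100  # size of the common support
>>> a1 = ot.datasets.make_1D_gauss(n, m=30, s=10)  # first training distribution
>>> a2 = ot.datasets.make_1D_gauss(n, m=60, s=8)   # second training distribution
>>> A = np.vstack((a1, a2)).T                      # shape (dim, n_hists)
>>> x = np.arange(n, dtype=np.float64).reshape((n, 1))
>>> M = ot.dist(x, x)   # squared Euclidean cost on the grid
>>> M /= M.max()        # normalize the cost so reg is on a stable scale
>>> bary = ot.bregman.barycenter(A, M, reg=1e-2, weights=np.array([0.5, 0.5]))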
Parameters:
A (array-like, shape (dim, n_hists)) – n_hists training distributions \(\mathbf{a}_i\) of size dim
M (array-like, shape (dim, dim)) – loss matrix for OT
a (array-like, shape (n_samples_a,)) – samples weights in the source domain
b (array-like, shape (n_samples_b,)) – samples weights in the target domain
numItermax (int, optional) – Max number of iterations
stopThr (float, optional) – Stop threshold on error (>0)
isLazy (boolean, optional) – If True, compute the cost matrix block by block and return only the dual potentials (to save memory). If False, compute the full cost matrix and return the outputs of the sinkhorn function.
batchSize (int or tuple of 2 int, optional) – Size of the batches used to compute the sinkhorn update without memory overhead.
When a tuple is provided it sets the size of the left/right batches.
verbose (bool, optional) – Print information along iterations
warn (bool, optional) – If True, raises a warning if the algorithm doesn't converge.
warmstart (tuple of arrays, shape (dim_a, dim_b), optional) – Initialization of the dual potentials. If provided, the dual potentials should be given in log scale (that is, the logarithms of the u, v sinkhorn scaling vectors).
Returns:
gamma (array-like, shape (n_samples_a, n_samples_b)) – Regularized optimal transportation matrix for the given parameters
log (dict) – log dictionary, returned only if log==True in parameters
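For instance, a hedged sketch of a call on raw samples (the random data and reg value are illustrative; with a and b left to None, uniform weights are used):
>>> import numpy as np
>>> import ot
>>> rng = np.random.RandomState(0)
>>> X_s = rng.randn(50, 2)   # source samples
>>> X_t = rng.randn(60, 2)   # target samples
>>> G = ot.bregman.empirical_sinkhorn(X_s, X_t, reg=1.0)  # (50, 60) transport plan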
a (array-like, shape (n_samples_a,)) – samples weights in the source domain
b (array-like, shape (n_samples_b,)) – samples weights in the target domain
numItermax (int, optional) – Max number of iterations
stopThr (float, optional) – Stop threshold on error (>0)
isLazy (boolean, optional) – If True, compute the cost matrix block by block and return only the dual potentials (to save memory). If False, compute the full cost matrix and return the outputs of the sinkhorn function.
batchSize (int or tuple of 2 int, optional) – Size of the batches used to compute the sinkhorn update without memory overhead.
When a tuple is provided it sets the size of the left/right batches.
verbose (bool, optional) – Print information along iterations
warn (bool, optional) – If True, raises a warning if the algorithm doesn't converge.
warmstart (tuple of arrays, shape (dim_a, dim_b), optional) – Initialization of the dual potentials. If provided, the dual potentials should be given in log scale (that is, the logarithms of the u, v sinkhorn scaling vectors).
Returns:
W ((n_hists) array-like or float) – Optimal transportation loss for the given parameters
log (dict) – log dictionary, returned only if log==True in parameters
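The loss variant can be sketched the same way (again with illustrative random data); it returns the transport loss rather than the plan:
>>> import numpy as np
>>> import ot
>>> rng = np.random.RandomState(0)
>>> X_s, X_t = rng.randn(50, 2), rng.randn(60, 2)
>>> loss = ot.bregman.empirical_sinkhorn2(X_s, X_t, reg=1.0)  # scalar OT loss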
\(C\) is the cost matrix such that \(C_{i,j}=d(x_i^s,x_j^t)\) and
\(d\) is a metric.
\(\Omega\) is the entropic regularization term
\(\Omega(\gamma)=\sum_{i,j}\gamma_{i,j}\log(\gamma_{i,j})-\gamma_{i,j}+1\)
\(a\) and \(b\) are source and target weights (sum to 1)
The algorithm used for solving the problem is the Sinkhorn-Knopp matrix
scaling algorithm as proposed in [2], computed in log space for
better stability and with epsilon-scaling. The solution is computed in a lazy way
using the Geomloss [60] and KeOps [61] libraries.
Parameters:
X_s (array-like, shape (n_samples_a, dim)) – samples in the source domain
X_t (array-like, shape (n_samples_b, dim)) – samples in the target domain
a (array-like, shape (n_samples_a,), default=None) – samples weights in the source domain
b (array-like, shape (n_samples_b,), default=None) – samples weights in the target domain
metric (str, default='sqeuclidean') – Metric used for the cost matrix computation. The only accepted values are 'sqeuclidean' and 'euclidean'.
scaling (float, default=0.95) – Scaling parameter used for epsilon scaling. Values close to one promote precision, while values close to zero promote speed.
warn (bool, optional) – If True, raises a warning if the algorithm doesn't converge.
warmstart (tuple of arrays, shape (dim_a, dim_b), optional) – Initialization of the dual potentials. If provided, the dual potentials should be given in log scale (that is, the logarithms of the u, v sinkhorn scaling vectors).
Returns:
W ((1,) array-like) – Optimal transportation symmetrized loss for the given parameters
log (dict) – log dictionary, returned only if log==True in parameters
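A sketch of how the lazy solver might be called, assuming a recent POT build that exposes ot.bregman.empirical_sinkhorn2_geomloss() and that the optional geomloss and pykeops dependencies are installed (the random data and reg value are illustrative):
>>> import numpy as np
>>> import ot
>>> rng = np.random.RandomState(0)
>>> X_s, X_t = rng.randn(1000, 3), rng.randn(1200, 3)
>>> # lazy evaluation: the (1000, 1200) cost matrix is never materialized
>>> W = ot.bregman.empirical_sinkhorn2_geomloss(X_s, X_t, reg=1.0)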
Solves the free support (locations of the barycenters are optimized, not the weights) regularized Wasserstein barycenter problem (i.e. the weighted Frechet mean for the 2-Sinkhorn divergence), formally:
\(\mathbf{b} \in \mathbb{R}^{k}\) is the desired weights vector of the barycenter
This problem is considered in [20] (Algorithm 2).
There are three differences with the following codes:
we do not optimize over the weights
we do not perform a line search for the location updates, i.e. we use \(\theta = 1\) in
[20] (Algorithm 2). This can be seen as a discrete
implementation of the fixed-point algorithm of
[43] proposed in the continuous setting.
at each iteration, instead of solving an exact OT problem, we use the Sinkhorn algorithm to compute the
transport plan in [20] (Algorithm 2).
Parameters:
measures_locations (list of N (k_i,d) array-like) – The discrete support of a measure supported on \(k_i\) locations of a d-dimensional space
(\(k_i\) can be different for each element of the list)
measures_weights (list of N (k_i,) array-like) – Numpy arrays where each array has \(k_i\) non-negative values summing to one,
representing the weights of each discrete input measure
X_init ((k,d) array-like) – Initialization of the support locations (on k atoms) of the barycenter
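A hedged sketch, assuming the signature suggested by the parameter list above (toy 2D measures; reg is illustrative):
>>> import numpy as np
>>> import ot
>>> rng = np.random.RandomState(0)
>>> measures_locations = [rng.randn(30, 2), rng.randn(40, 2)]  # k_i may differ
>>> measures_weights = [ot.unif(30), ot.unif(40)]              # each sums to one
>>> X_init = rng.randn(10, 2)  # barycenter supported on k=10 atoms
>>> X = ot.bregman.free_support_sinkhorn_barycenter(
...     measures_locations, measures_weights, X_init, reg=1.0)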
\(\mathbf{M}\) is the (dim_a, dim_b) metric cost matrix
\(\Omega\) is the entropic regularization term
\(\Omega(\gamma)=\sum_{i,j} \gamma_{i,j}\log(\gamma_{i,j})\)
\(\mathbf{a}\) and \(\mathbf{b}\) are source and target
weights (histograms, both sum to 1)
Parameters:
a (array-like, shape (dim_a,)) – samples weights in the source domain
b (array-like, shape (dim_b,) or array-like, shape (dim_b, n_hists)) – samples weights in the target domain; if \(\mathbf{b}\) is a matrix,
compute sinkhorn with multiple targets and a fixed \(\mathbf{M}\)
(returns OT loss + dual variables in log)
M (array-like, shape (dim_a, dim_b)) – loss matrix
warn (bool, optional) – If True, raises a warning if the algorithm doesn't converge.
warmstart (tuple of arrays, shape (dim_a, dim_b), optional) – Initialization of the dual potentials. If provided, the dual potentials should be given in log scale (that is, the logarithms of the u, v sinkhorn scaling vectors).
Returns:
gamma (array-like, shape (dim_a, dim_b)) – Optimal transportation matrix for the given parameters
log (dict) – log dictionary, returned only if log==True in parameters
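For example, a minimal sketch on a 2x2 problem (reg is illustrative):
>>> import numpy as np
>>> import ot
>>> a, b = np.array([.5, .5]), np.array([.5, .5])
>>> M = np.array([[0., 1.], [1., 0.]])
>>> G = ot.bregman.greenkhorn(a, b, M, reg=1.0)  # greedy coordinate updates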
\(W_{reg}(\cdot,\cdot)\) is the entropic regularized Wasserstein distance
(see ot.bregman.sinkhorn())
\(\mathbf{D}_2^{(k)}\) is a matrix of weights related to the k-th source domain,
defined as in [p. 5, 27]; its expected shape
is \((n_k, C)\), where \(n_k\) is the number of elements in the k-th source
domain and C is the number of classes
\(\mathbf{h}\) is a vector of estimated proportions in the target domain of size C
\(\mathbf{a}\) is a uniform vector of weights in the target domain of size n
\(\mathbf{D}_1^{(k)}\) is a matrix of class assignments defined as in
[p. 5, 27], its expected shape is \((n_k, C)\)
The problem consists in solving a Wasserstein barycenter problem to estimate
the proportions \(\mathbf{h}\) in the target domain.
The algorithm used for solving the problem is the Iterative Bregman projections algorithm
with two sets of marginal constraints related to the unknown vector
\(\mathbf{h}\) and uniform target distribution.
Parameters:
Xs (list of K array-like, shape (nsk, d)) – features of all source domains' samples
Ys (list of K array-like, shape (nsk,)) – labels of all source domains' samples
Xt (array-like, shape (nt, d)) – samples in the target domain
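A hedged sketch of the expected call, assuming ot.bregman.jcpot_barycenter() with the signature suggested by the parameter list above (two toy source domains; labels and reg are illustrative):
>>> import numpy as np
>>> import ot
>>> rng = np.random.RandomState(0)
>>> Xs = [rng.randn(20, 2), rng.randn(30, 2)]            # K=2 source domains
>>> Ys = [rng.randint(0, 2, 20), rng.randint(0, 2, 30)]  # labels in {0, 1}
>>> Xt = rng.randn(25, 2)                                # target samples
>>> h = ot.bregman.jcpot_barycenter(Xs, Ys, Xt, reg=1.0)  # estimated proportions, size C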
The parameters kappa and epsilon are determined w.r.t. the budgeted number
of points (ns_budget, nt_budget); see Equation (5)
in [26]
Parameters:
a (array-like, shape=(ns,)) – samples weights in the source domain
b (array-like, shape=(nt,)) – samples weights in the target domain
M (array-like, shape=(ns, nt)) – Cost matrix
reg (float) – Level of the entropy regularisation
ns_budget (int, default=None) – Budgeted number of points to be kept in the source domain.
If None, 50% of the source sample points will be kept
nt_budget (int, default=None) – Budgeted number of points to be kept in the target domain.
If None, 50% of the target sample points will be kept
uniform (bool, default=False) – If True, the source and target distributions are assumed to be uniform,
i.e., \(a_i = 1 / ns\) and \(b_j = 1 / nt\)
restricted (bool, default=True) – If True, a warm-start initialization is computed for the L-BFGS-B solver
using a restricted Sinkhorn algorithm with at most 5 iterations
maxiter (int, default=10000) – Maximum number of iterations in LBFGS solver
maxfun (int, default=10000) – Maximum number of function evaluations in LBFGS solver
pgtol (float, default=1e-09) – Final objective function accuracy in LBFGS solver
verbose (bool, default=False) – If True, display information about the cardinals of the active sets
and the parameters kappa and epsilon
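For instance, a sketch with illustrative budgets keeping roughly half of the points in each domain:
>>> import numpy as np
>>> import ot
>>> rng = np.random.RandomState(0)
>>> ns, nt = 100, 120
>>> a, b = ot.unif(ns), ot.unif(nt)
>>> M = ot.dist(rng.randn(ns, 2), rng.randn(nt, 2))  # squared Euclidean cost
>>> G = ot.bregman.screenkhorn(a, b, M, reg=1.0, ns_budget=50, nt_budget=60)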
\(\mathbf{M}\) is the (dim_a, dim_b) metric cost matrix
\(\Omega\) is the entropic regularization term
\(\Omega(\gamma)=\sum_{i,j} \gamma_{i,j}\log(\gamma_{i,j})\)
\(\mathbf{a}\) and \(\mathbf{b}\) are source and target
weights (histograms, both sum to 1)
Note
This function is backend-compatible and will work on arrays
from all compatible backends.
The algorithm used for solving the problem is the Sinkhorn-Knopp matrix
scaling algorithm as proposed in [2]
Choosing a Sinkhorn solver
By default, and when using a regularization parameter that is not too small,
the default sinkhorn solver should be enough. If you need to use a small
regularization to get sharper OT matrices, you should use the
ot.bregman.sinkhorn_stabilized() solver, which will avoid numerical
errors. This last solver can be very slow in practice, however, and might not even
converge to a reasonable OT matrix in finite time. This is why
ot.bregman.sinkhorn_epsilon_scaling(), which relies on iterating over the value
of the regularization (and using warm starts), sometimes leads to better
solutions. Note that the greedy version of the sinkhorn,
ot.bregman.greenkhorn(), can also lead to a speedup, and the screening
version of the sinkhorn, ot.bregman.screenkhorn(), aims at providing a
fast approximation of the Sinkhorn problem. For GPU use and gradient
computation with a small number of iterations, we strongly recommend the
ot.bregman.sinkhorn_log() solver, which does not need to check for
numerical problems.
Parameters:
a (array-like, shape (dim_a,)) – samples weights in the source domain
b (array-like, shape (dim_b,) or ndarray, shape (dim_b, n_hists)) – samples weights in the target domain; if \(\mathbf{b}\) is a matrix,
compute sinkhorn with multiple targets and a fixed \(\mathbf{M}\)
(returns OT loss + dual variables in log)
M (array-like, shape (dim_a, dim_b)) – loss matrix
method (str) – method used for the solver, either 'sinkhorn', 'sinkhorn_log',
'greenkhorn', 'sinkhorn_stabilized' or 'sinkhorn_epsilon_scaling'; see
those functions for specific parameters
numItermax (int, optional) – Max number of iterations
stopThr (float, optional) – Stop threshold on error (>0)
verbose (bool, optional) – Print information along iterations
warn (bool, optional) – If True, raises a warning if the algorithm doesn't converge.
warmstart (tuple of arrays, shape (dim_a, dim_b), optional) – Initialization of the dual potentials. If provided, the dual potentials should be given in log scale (that is, the logarithms of the u, v sinkhorn scaling vectors).
Returns:
gamma (array-like, shape (dim_a, dim_b)) – Optimal transportation matrix for the given parameters
log (dict) – log dictionary, returned only if log==True in parameters
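For example, on a 2x2 toy problem (reg values are illustrative; the second call shows how to select the log solver for a small regularization):
>>> import numpy as np
>>> import ot
>>> a, b = np.array([.5, .5]), np.array([.5, .5])
>>> M = np.array([[0., 1.], [1., 0.]])
>>> G = ot.bregman.sinkhorn(a, b, M, reg=1.0)                           # default solver
>>> G2 = ot.bregman.sinkhorn(a, b, M, reg=0.01, method='sinkhorn_log')  # small reg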
\(\mathbf{M}\) is the (dim_a, dim_b) metric cost matrix
\(\Omega\) is the entropic regularization term
\(\Omega(\gamma)=\sum_{i,j} \gamma_{i,j}\log(\gamma_{i,j})\)
\(\mathbf{a}\) and \(\mathbf{b}\) are source and target
weights (histograms, both sum to 1)
and returns \(\langle \gamma^*, \mathbf{M} \rangle_F\) (without
the entropic contribution).
Note
This function is backend-compatible and will work on arrays
from all compatible backends.
The algorithm used for solving the problem is the Sinkhorn-Knopp matrix
scaling algorithm as proposed in [2]
Choosing a Sinkhorn solver
By default, and when using a regularization parameter that is not too small,
the default sinkhorn solver should be enough. If you need to use a small
regularization to get sharper OT matrices, you should use the
ot.bregman.sinkhorn_stabilized() solver, which will avoid numerical
errors. This last solver can be very slow in practice, however, and might not even
converge to a reasonable OT matrix in finite time. This is why
ot.bregman.sinkhorn_epsilon_scaling(), which relies on iterating over the value
of the regularization (and using warm starts), sometimes leads to better
solutions. Note that the greedy version of the sinkhorn,
ot.bregman.greenkhorn(), can also lead to a speedup, and the screening
version of the sinkhorn, ot.bregman.screenkhorn(), aims at providing a
fast approximation of the Sinkhorn problem. For GPU use and gradient
computation with a small number of iterations, we strongly recommend the
ot.bregman.sinkhorn_log() solver, which does not need to check for
numerical problems.
Parameters:
a (array-like, shape (dim_a,)) – samples weights in the source domain
b (array-like, shape (dim_b,) or ndarray, shape (dim_b, n_hists)) – samples weights in the target domain; if \(\mathbf{b}\) is a matrix,
compute sinkhorn with multiple targets and a fixed \(\mathbf{M}\)
(returns OT loss + dual variables in log)
M (array-like, shape (dim_a, dim_b)) – loss matrix
warn (bool, optional) – If True, raises a warning if the algorithm doesn't converge.
warmstart (tuple of arrays, shape (dim_a, dim_b), optional) – Initialization of the dual potentials. If provided, the dual potentials should be given in log scale (that is, the logarithms of the u, v sinkhorn scaling vectors).
Returns:
W ((n_hists) float/array-like) – Optimal transportation loss for the given parameters
log (dict) – log dictionary, returned only if log==True in parameters
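For example, on the same 2x2 toy problem; the returned value is the transport cost \(\langle \gamma^*, \mathbf{M} \rangle_F\) without the entropic term:
>>> import numpy as np
>>> import ot
>>> a, b = np.array([.5, .5]), np.array([.5, .5])
>>> M = np.array([[0., 1.], [1., 0.]])
>>> W = ot.bregman.sinkhorn2(a, b, M, reg=1.0)  # scalar loss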
Solve the entropic regularization optimal transport problem with log
stabilization and epsilon scaling.
The function solves the following optimization problem:
\(\gamma = \mathop{\arg \min}_\gamma \quad \langle \gamma, \mathbf{M} \rangle_F + \mathrm{reg} \cdot \Omega(\gamma)\)
s.t. \(\gamma \mathbf{1} = \mathbf{a}\), \(\gamma^T \mathbf{1} = \mathbf{b}\), \(\gamma \geq 0\)
where:
\(\mathbf{M}\) is the (dim_a, dim_b) metric cost matrix
\(\Omega\) is the entropic regularization term
\(\Omega(\gamma)=\sum_{i,j} \gamma_{i,j}\log(\gamma_{i,j})\)
\(\mathbf{a}\) and \(\mathbf{b}\) are source and target weights (histograms, both sum to 1)
The algorithm used for solving the problem is the Sinkhorn-Knopp matrix
scaling algorithm as proposed in [2]
but with the log stabilization
proposed in [10] and the log scaling
proposed in [9] (Algorithm 3.2).
Parameters:
a (array-like, shape (dim_a,)) – samples weights in the source domain
b (array-like, shape (dim_b,)) – samples weights in the target domain
M (array-like, shape (dim_a, dim_b)) – loss matrix
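A minimal sketch (toy data; the target reg value is illustrative, and the intermediate regularization values follow the epsilon-scaling schedule):
>>> import numpy as np
>>> import ot
>>> a, b = np.array([.5, .5]), np.array([.5, .5])
>>> M = np.array([[0., 1.], [1., 0.]])
>>> G = ot.bregman.sinkhorn_epsilon_scaling(a, b, M, reg=0.01)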
\(\mathbf{M}\) is the (dim_a, dim_b) metric cost matrix
\(\Omega\) is the entropic regularization term
\(\Omega(\gamma)=\sum_{i,j} \gamma_{i,j}\log(\gamma_{i,j})\)
\(\mathbf{a}\) and \(\mathbf{b}\) are source and target
weights (histograms, both sum to 1)
The algorithm used for solving the problem is the Sinkhorn-Knopp
matrix scaling algorithm as proposed in [2]
Parameters:
a (array-like, shape (dim_a,)) – samples weights in the source domain
b (array-like, shape (dim_b,) or array-like, shape (dim_b, n_hists)) – samples weights in the target domain; if \(\mathbf{b}\) is a matrix,
compute sinkhorn with multiple targets and a fixed \(\mathbf{M}\)
(returns OT loss + dual variables in log)
M (array-like, shape (dim_a, dim_b)) – loss matrix
warn (bool, optional) – If True, raises a warning if the algorithm doesn't converge.
warmstart (tuple of arrays, shape (dim_a, dim_b), optional) – Initialization of the dual potentials. If provided, the dual potentials should be given in log scale (that is, the logarithms of the u, v sinkhorn scaling vectors).
Returns:
gamma (array-like, shape (dim_a, dim_b)) – Optimal transportation matrix for the given parameters
log (dict) – log dictionary, returned only if log==True in parameters
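For example (toy problem; this is the solver selected by ot.bregman.sinkhorn() with method='sinkhorn'):
>>> import numpy as np
>>> import ot
>>> a, b = np.array([.5, .5]), np.array([.5, .5])
>>> M = np.array([[0., 1.], [1., 0.]])
>>> G = ot.bregman.sinkhorn_knopp(a, b, M, reg=1.0)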
\(\mathbf{M}\) is the (dim_a, dim_b) metric cost matrix
\(\Omega\) is the entropic regularization term
\(\Omega(\gamma)=\sum_{i,j} \gamma_{i,j}\log(\gamma_{i,j})\)
\(\mathbf{a}\) and \(\mathbf{b}\) are source and target weights (histograms, both sum to 1)
The algorithm used for solving the problem is the Sinkhorn-Knopp matrix
scaling algorithm [2] with the
implementation from [34]
Parameters:
a (array-like, shape (dim_a,)) – samples weights in the source domain
b (array-like, shape (dim_b,) or array-like, shape (dim_b, n_hists)) – samples weights in the target domain; if \(\mathbf{b}\) is a matrix,
compute sinkhorn with multiple targets and a fixed \(\mathbf{M}\) (returns OT loss + dual variables in log)
M (array-like, shape (dim_a, dim_b)) – loss matrix
warn (bool, optional) – If True, raises a warning if the algorithm doesn't converge.
warmstart (tuple of arrays, shape (dim_a, dim_b), optional) – Initialization of the dual potentials. If provided, the dual potentials should be given in log scale (that is, the logarithms of the u, v sinkhorn scaling vectors).
Returns:
gamma (array-like, shape (dim_a, dim_b)) – Optimal transportation matrix for the given parameters
log (dict) – log dictionary, returned only if log==True in parameters
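For example (toy problem; the log-domain updates remain stable even for a small reg):
>>> import numpy as np
>>> import ot
>>> a, b = np.array([.5, .5]), np.array([.5, .5])
>>> M = np.array([[0., 1.], [1., 0.]])
>>> G = ot.bregman.sinkhorn_log(a, b, M, reg=0.01)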
\(\mathbf{M}\) is the (dim_a, dim_b) metric cost matrix
\(\Omega\) is the entropic regularization term
\(\Omega(\gamma)=\sum_{i,j} \gamma_{i,j}\log(\gamma_{i,j})\)
\(\mathbf{a}\) and \(\mathbf{b}\) are source and target
weights (histograms, both sum to 1)
The algorithm used for solving the problem is the Sinkhorn-Knopp matrix
scaling algorithm as proposed in [2]
but with the log stabilization
proposed in [10] and defined in
[9] (Algorithm 3.1).
Parameters:
a (array-like, shape (dim_a,)) – samples weights in the source domain
b (array-like, shape (dim_b,)) – samples weights in the target domain
M (array-like, shape (dim_a, dim_b)) – loss matrix
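For example (toy problem; the stabilized updates absorb large dual values into log-domain buffers, so a small reg is safe):
>>> import numpy as np
>>> import ot
>>> a, b = np.array([.5, .5]), np.array([.5, .5])
>>> M = np.array([[0., 1.], [1., 0.]])
>>> G = ot.bregman.sinkhorn_stabilized(a, b, M, reg=0.01)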