solver (string, optional) – the solver used, default ‘interior-point’ use the lp solver from
scipy.optimize. None, or ‘glpk’ or ‘mosek’ use the solver from cvxopt.
Returns:
a ((d,) ndarray) – Wasserstein barycenter
log (dict) – log dictionary return only if log==True in parameters
Computes the Wasserstein distance on the circle using the Binary search algorithm proposed in [44].
Samples need to be in \(S^1\cong [0,1[\). If they are on \(\mathbb{R}\),
takes the value modulo 1.
If the values are on \(S^1\subset\mathbb{R}^2\), it is required to first find the coordinates
using e.g. the atan2 function.
Compute the discrete multi-marginal optimal transport of distributions A.
This function operates on distributions whose supports are real numbers on
the real line.
The algorithm solves both primal and dual d-MMOT programs concurrently to
produce the optimal transport plan as well as the total (minimal) cost.
The cost is a ground cost, and the solution is independent of
which Monge cost is desired.
The algorithm accepts \(d\) distributions (i.e., histograms)
\(a_{1}, \ldots, a_{d} \in \mathbb{R}_{+}^{n}\) with \(e^{\prime}
a_{j}=1\) for all \(j \in[d]\). Although the algorithm states that all
histograms have the same number of bins, the algorithm can be easily
adapted to accept as inputs \(a_{i} \in \mathbb{R}_{+}^{n_{i}}\)
with \(n_{i} \neq n_{j}\) [50].
The function solves the following optimization problem[51]:
A (nx.ndarray, shape (dim, n_hists)) – The input ndarray containing distributions of n bins in d dimensions.
verbose (bool, optional) – If True, print debugging information during execution. Default=False.
log (bool, optional) – If True, record log. Default is False.
Returns:
obj (float) – the value of the primal objective function evaluated at the solution.
log (dict) – A dictionary containing the log of the discrete mmot problem:
- ‘A’: a dictionary that maps tuples of indices to the corresponding
primal variables. The tuples are the indices of the entries that are
set to their minimum value during the algorithm.
- ‘primal objective’: a float, the value of the objective function
evaluated at the solution.
- ‘dual’: a list of arrays, the dual variables corresponding to
the input arrays. The i-th element of the list is the dual variable
corresponding to the i-th dimension of the input arrays.
- ‘dual objective’: a float, the value of the dual objective function
evaluated at the solution.
Minimize the d-dimensional EMD using gradient descent.
Discrete Multi-Marginal Optimal Transport (d-MMOT): Let \(a_1, \ldots,
a_d\in\mathbb{R}^n_{+}\) be discrete probability distributions. Here,
the d-MMOT is the LP,
where the indices in the constraints include all \(i_j\in[n]\), :math:
jin[d]. Denote by \(\phi(a_1,\ldots,a_d)\), the optimal objective
value of the LP in d-MMOT problem. Let \(z^*\) be an optimal solution
to the dual program. Then,
Using these dual variables naturally provided by the algorithm in
ot.lp.dmmot_monge_1dgrid_loss, gradient steps move each input distribution
to minimize their d-mmot distance.
Parameters:
A (nx.ndarray, shape (dim, n_hists)) – The input ndarray containing distributions of n bins in d dimensions.
niters (int, optional (default=100)) – The maximum number of iterations for the optimization algorithm.
lr_init (float, optional (default=1e-5)) – The initial learning rate (step size) for the optimization algorithm.
lr_decay (float, optional (default=0.995)) – The learning rate decay rate in each iteration.
print_rate (int, optional (default=100)) – The rate at which to print the objective value and gradient norm
during the optimization algorithm.
verbose (bool, optional) – If True, print debugging information during execution. Default=False.
log (bool, optional) – If True, record log. Default is False.
Returns:
a (list of ndarrays, each of shape (n,)) – The optimal solution as a list of n approximate barycenters, each of
length vecsize.
log (dict) – log dictionary return only if log==True in parameters
\(\mathbf{a}\) and \(\mathbf{b}\) are the sample weights
Warning
Note that the \(\mathbf{M}\) matrix in numpy needs to be a C-order
numpy.array in float64 format. It will be converted if not in this
format
Note
This function is backend-compatible and will work on arrays
from all compatible backends. But the algorithm uses the C++ CPU backend
which can lead to copy overhead on GPU arrays.
Note
This function will cast the computed transport plan to the data type
of the provided input with the following priority: \(\mathbf{a}\),
then \(\mathbf{b}\), then \(\mathbf{M}\) if marginals are not provided.
Casting to an integer tensor might result in a loss of precision.
If this behaviour is unwanted, please make sure to provide a
floating point input.
Note
An error will be raised if the vectors \(\mathbf{a}\) and \(\mathbf{b}\) do not sum to the same value.
a ((ns,) array-like, float) – Source histogram (uniform weight if empty list)
b ((nt,) array-like, float) – Target histogram (uniform weight if empty list)
M ((ns,nt) array-like, float) – Loss matrix (c-order array in numpy with type float64)
numItermax (int, optional (default=100000)) – The maximum number of iterations before stopping the optimization
algorithm if it has not converged.
log (bool, optional (default=False)) – If True, returns a dictionary containing the cost and dual variables.
Otherwise returns only the optimal transportation matrix.
center_dual (boolean, optional (default=True)) – If True, centers the dual potential using function
ot.lp.center_ot_dual().
numThreads (int or "max", optional (default=1, i.e. OpenMP is not used)) – If compiled with OpenMP, chooses the number of threads to parallelize.
“max” selects the highest number possible.
check_marginals (bool, optional (default=True)) – If True, checks that the marginals mass are equal. If False, skips the
check.
Returns:
gamma (array-like, shape (ns, nt)) – Optimal transportation matrix for the given
parameters
log (dict, optional) – If input log is true, a dictionary containing the
cost and dual variables and exit status
Examples
Simple example with obvious solution. The function emd accepts lists and
perform automatic conversion to numpy arrays
\(\mathbf{a}\) and \(\mathbf{b}\) are the sample weights
Note
This function is backend-compatible and will work on arrays
from all compatible backends. But the algorithm uses the C++ CPU backend
which can lead to copy overhead on GPU arrays.
Note
This function will cast the computed transport plan and
transportation loss to the data type of the provided input with the
following priority: \(\mathbf{a}\), then \(\mathbf{b}\),
then \(\mathbf{M}\) if marginals are not provided.
Casting to an integer tensor might result in a loss of precision.
If this behaviour is unwanted, please make sure to provide a
floating point input.
Note
An error will be raised if the vectors \(\mathbf{a}\) and \(\mathbf{b}\) do not sum to the same value.
a ((ns,) array-like, float64) – Source histogram (uniform weight if empty list)
b ((nt,) array-like, float64) – Target histogram (uniform weight if empty list)
M ((ns,nt) array-like, float64) – Loss matrix (for numpy c-order array with type float64)
processes (int, optional (default=1)) – Nb of processes used for multiple emd computation (deprecated)
numItermax (int, optional (default=100000)) – The maximum number of iterations before stopping the optimization
algorithm if it has not converged.
log (boolean, optional (default=False)) – If True, returns a dictionary containing dual
variables. Otherwise returns only the optimal transportation cost.
return_matrix (boolean, optional (default=False)) – If True, returns the optimal transportation matrix in the log.
center_dual (boolean, optional (default=True)) – If True, centers the dual potential using function
ot.lp.center_ot_dual().
numThreads (int or "max", optional (default=1, i.e. OpenMP is not used)) – If compiled with OpenMP, chooses the number of threads to parallelize.
“max” selects the highest number possible.
check_marginals (bool, optional (default=True)) – If True, checks that the marginals mass are equal. If False, skips the
check.
Returns:
W (float, array-like) – Optimal transportation loss for the given parameters
log (dict) – If input log is true, a dictionary containing dual
variables and exit status
Examples
Simple example with obvious solution. The function emd accepts lists and
perform automatic conversion to numpy arrays
x_a (ndarray of float64, shape (ns,) or (ns, 1)) – Source dirac locations (on the real line)
x_b (ndarray of float64, shape (nt,) or (ns, 1)) – Target dirac locations (on the real line)
a (ndarray of float64, shape (ns,), optional) – Source histogram (default is uniform weight)
b (ndarray of float64, shape (nt,), optional) – Target histogram (default is uniform weight)
metric (str, optional (default='sqeuclidean')) – Metric to be used. Only works with either of the strings
‘sqeuclidean’, ‘minkowski’, ‘cityblock’, or ‘euclidean’.
p (float, optional (default=1.0)) – The p-norm to apply for if metric=’minkowski’
dense (boolean, optional (default=True)) – If True, returns \(\gamma\) as a dense ndarray of shape (ns, nt).
Otherwise returns a sparse representation using scipy’s coo_matrix
format. Only used if log is set to True. Due to implementation details,
this function runs faster when dense is set to False.
log (boolean, optional (default=False)) – If True, returns a dictionary containing the transportation matrix.
Otherwise returns only the loss.
Returns:
loss (float) – Cost associated to the optimal transportation
log (dict) – If input log is True, a dictionary containing the Optimal transportation
matrix for the given parameters
Examples
Simple example with obvious solution. The function emd2_1d accepts lists and
performs automatic conversion to numpy arrays
x_a (ndarray of float64, shape (ns,) or (ns, 1)) – Source dirac locations (on the real line)
x_b (ndarray of float64, shape (nt,) or (ns, 1)) – Target dirac locations (on the real line)
a (ndarray of float64, shape (ns,), optional) – Source histogram (default is uniform weight)
b (ndarray of float64, shape (nt,), optional) – Target histogram (default is uniform weight)
metric (str, optional (default='sqeuclidean')) – Metric to be used. Only works with either of the strings
‘sqeuclidean’, ‘minkowski’, ‘cityblock’, or ‘euclidean’.
p (float, optional (default=1.0)) – The p-norm to apply for if metric=’minkowski’
dense (boolean, optional (default=True)) – If True, returns \(\gamma\) as a dense ndarray of shape (ns, nt).
Otherwise returns a sparse representation using scipy’s coo_matrix
format. Due to implementation details, this function runs faster when
‘sqeuclidean’, ‘minkowski’, ‘cityblock’, or ‘euclidean’ metrics
are used.
log (boolean, optional (default=False)) – If True, returns a dictionary containing the cost.
Otherwise returns only the optimal transportation matrix.
check_marginals (bool, optional (default=True)) – If True, checks that the marginals mass are equal. If False, skips the
check.
Returns:
gamma (ndarray, shape (ns, nt)) – Optimal transportation matrix for the given parameters
log (dict) – If input log is True, a dictionary containing the cost
Examples
Simple example with obvious solution. The function emd_1d accepts lists and
performs automatic conversion to numpy arrays
u ((ns,) ndarray, float64) – Source dirac locations (on the real line)
v ((nt,) ndarray, float64) – Target dirac locations (on the real line)
metric (str, optional (default='sqeuclidean')) – Metric to be used. Only works with either of the strings
‘sqeuclidean’, ‘minkowski’, ‘cityblock’, or ‘euclidean’.
p (float, optional (default=1.0)) – The p-norm to apply for if metric=’minkowski’
Returns:
gamma ((n, ) ndarray, float64) – Values in the Optimal transportation matrix
indices ((n, 2) ndarray, int64) – Indices of the values stored in gamma for the Optimal transportation
matrix
cost – cost associated to the optimal transportation
Solves the free support (locations of the barycenters are optimized, not the weights) Wasserstein barycenter problem (i.e. the weighted Frechet mean for the 2-Wasserstein distance), formally:
\(\mathbf{b} \in \mathbb{R}^{k}\) is the desired weights vector of the barycenter
This problem is considered in [20] (Algorithm 2).
There are two differences with the following codes:
we do not optimize over the weights
we do not do line search for the locations updates, we use i.e. \(\theta = 1\) in
[20] (Algorithm 2). This can be seen as a discrete
implementation of the fixed-point algorithm of
[43] proposed in the continuous setting.
Parameters:
measures_locations (list of N (k_i,d) array-like) – The discrete support of a measure supported on \(k_i\) locations of a d-dimensional space
(\(k_i\) can be different for each element of the list)
measures_weights (list of N (k_i,) array-like) – Numpy arrays where each numpy array has \(k_i\) non-negatives values summing to one
representing the weights of each discrete input measure
X_init ((k,d) array-like) – Initialization of the support locations (on k atoms) of the barycenter
b ((k,) array-like) – Initialization of the weights of the barycenter (non-negatives, sum to 1)
weights ((N,) array-like) – Initialization of the coefficients of the barycenter (non-negatives, sum to 1)
numItermax (int, optional) – Max number of iterations
stopThr (float, optional) – Stop threshold on error (>0)
verbose (bool, optional) – Print information along iterations
numThreads (int or "max", optional (default=1, i.e. OpenMP is not used)) – If compiled with OpenMP, chooses the number of threads to parallelize.
“max” selects the highest number possible.
Returns:
X – Support locations (on k atoms) of the barycenter
Solves the free support generalized Wasserstein barycenter problem: finding a barycenter (a discrete measure with
a fixed amount of points of uniform weights) whose respective projections fit the input measures.
More formally:
\(\gamma = \sum_{l=1}^n b_l\delta_{y_l}\) is the desired barycenter with each \(y_l \in \mathbb{R}^d\)
\(\mathbf{b} \in \mathbb{R}^{n}\) is the desired weights vector of the barycenter
The input measures are \(\nu_i = \sum_{j=1}^{k_i}a_{i,j}\delta_{x_{i,j}}\)
The \(\mathbf{a}_i \in \mathbb{R}^{k_i}\) are the respective empirical measures weights (on the simplex)
The \(\mathbf{X}_i \in \mathbb{R}^{k_i, d_i}\) are the respective empirical measures atoms locations
\(w = (w_1, \cdots w_p)\) are the barycenter coefficients (on the simplex)
Each \(\mathbf{P}_i \in \mathbb{R}^{d, d_i}\), and \(P_i\#\nu_i = \sum_{j=1}^{k_i}a_{i,j}\delta_{P_ix_{i,j}}\)
As show by [42],
this problem can be re-written as a Wasserstein Barycenter problem,
which we solve using the free support method [20]
(Algorithm 2).
Parameters:
X_list (list of p (k_i,d_i) array-like) – Discrete supports of the input measures: each consists of \(k_i\) locations of a d_i-dimensional space
(\(k_i\) can be different for each element of the list)
a_list (list of p (k_i,) array-like) – Measure weights: each element is a vector (k_i) on the simplex
P_list (list of p (d_i,d) array-like) – Each \(P_i\) is a linear map \(\mathbb{R}^{d} \rightarrow \mathbb{R}^{d_i}\)
n_samples_bary (int) – Number of barycenter points
Y_init ((n_samples_bary,d) array-like) – Initialization of the support locations (on k atoms) of the barycenter
b ((n_samples_bary,) array-like) – Initialization of the weights of the barycenter measure (on the simplex)
weights ((p,) array-like) – Initialization of the coefficients of the barycenter (on the simplex)
numItermax (int, optional) – Max number of iterations
stopThr (float, optional) – Stop threshold on error (>0)
verbose (bool, optional) – Print information along iterations
numThreads (int or "max", optional (default=1, i.e. OpenMP is not used)) – If compiled with OpenMP, chooses the number of threads to parallelize.
“max” selects the highest number possible.
eps (Stability coefficient for the change of variable matrix inversion) – If the \(\mathbf{P}_i^T\) matrices don’t span \(\mathbb{R}^d\), the problem is ill-defined and a matrix
inversion will fail. In this case one may set eps=1e-8 and get a solution anyway (which may make little sense)
Returns:
Y – Support locations (on n_samples_bary atoms) of the barycenter
Computes the closed-form for the 2-Wasserstein distance between samples and a uniform distribution on \(S^1\)
Samples need to be in \(S^1\cong [0,1[\). If they are on \(\mathbb{R}\),
takes the value modulo 1.
If the values are on \(S^1\subset\mathbb{R}^2\), it is required to first find the coordinates
using e.g. the atan2 function.
Computes the 1 dimensional OT loss [15] between two (batched) empirical
distributions
It is formally the p-Wasserstein distance raised to the power p.
We do so in a vectorized way by first building the individual quantile functions then integrating them.
This function should be preferred to emd_1d whenever the backend is
different to numpy, and when gradients over
either sample positions or weights are required.
Parameters:
u_values (array-like, shape (n, ...)) – locations of the first empirical distribution
v_values (array-like, shape (m, ...)) – locations of the second empirical distribution
u_weights (array-like, shape (n, ...), optional) – weights of the first empirical distribution, if None then uniform weights are used
v_weights (array-like, shape (m, ...), optional) – weights of the second empirical distribution, if None then uniform weights are used
p (int, optional) – order of the ground metric used, should be at least 1 (see [2, Chap. 2], default is 1
require_sort (bool, optional) – sort the distributions atoms locations, if False we will consider they have been sorted prior to being passed to
the function, default is True
Computes the Wasserstein distance on the circle using either [45] for p=1 or
the binary search algorithm proposed in [44] otherwise.
Samples need to be in \(S^1\cong [0,1[\). If they are on \(\mathbb{R}\),
takes the value modulo 1.
If the values are on \(S^1\subset\mathbb{R}^2\), it requires to first find the coordinates
using e.g. the atan2 function.