Solve the unbalanced optimal transport problem and return the OT plan using the L-BFGS-B algorithm.
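The function solves the following optimization problem:
\[\gamma^* = \mathop{\arg \min}_{\gamma \geq 0} \quad \langle \gamma, \mathbf{M} \rangle_F + \mathrm{reg} \cdot \mathrm{div}(\gamma, \mathbf{c}) + \mathrm{reg_{m1}} \cdot \mathrm{div_m}(\gamma \mathbf{1}, \mathbf{a}) + \mathrm{reg_{m2}} \cdot \mathrm{div_m}(\gamma^T \mathbf{1}, \mathbf{b})\]
where: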
\(\mathbf{M}\) is the (dim_a, dim_b) metric cost matrix
\(\mathbf{a}\) and \(\mathbf{b}\) are source and target unbalanced distributions
\(\mathbf{c}\) is a reference distribution for the regularization
\(\mathrm{div_m}\) is a divergence, either Kullback-Leibler divergence,
or half-squared \(\ell_2\) divergence, or Total variation
\(\mathrm{div}\) is a divergence, either Kullback-Leibler divergence,
or half-squared \(\ell_2\) divergence
Note
This function is backend-compatible and works on arrays
from all compatible backends. It first converts all arrays into NumPy arrays,
then uses the L-BFGS-B algorithm from scipy.optimize to solve the optimization problem.
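As an illustration of what a quasi-Newton solver such as L-BFGS-B needs, here is a minimal NumPy sketch of the objective and its gradient for the case where both divergences are Kullback-Leibler. The names `kl_div` and `uot_value_and_grad` are hypothetical helpers, not part of the library, and using a single `reg_m` for both marginals is an assumption made for brevity.

```python
import numpy as np


def kl_div(x, y):
    """Generalized KL divergence sum(x * log(x / y) - x + y) for positive arrays."""
    return float(np.sum(x * np.log(x / y) - x + y))


def uot_value_and_grad(g_flat, a, b, M, c, reg, reg_m):
    """Value and gradient of
    <G, M> + reg * KL(G, c) + reg_m * KL(G 1, a) + reg_m * KL(G^T 1, b).

    G is passed flattened, as a generic solver would expect, and is
    assumed strictly positive so that every logarithm is finite.
    """
    G = g_flat.reshape(M.shape)
    row, col = G.sum(axis=1), G.sum(axis=0)
    value = (
        float(np.sum(G * M))
        + reg * kl_div(G, c)
        + reg_m * (kl_div(row, a) + kl_div(col, b))
    )
    grad = (
        M
        + reg * np.log(G / c)
        + reg_m * (np.log(row / a)[:, None] + np.log(col / b)[None, :])
    )
    return value, grad.ravel()


# Toy problem (illustrative values only).
rng = np.random.default_rng(0)
a = np.array([0.5, 0.7])
b = np.array([0.3, 0.4, 0.6])
M = rng.random((2, 3))
c = a[:, None] * b[None, :]   # reference measure a b^T
g0 = c.ravel().copy()         # start from the product measure
value, grad = uot_value_and_grad(g0, a, b, M, c, reg=0.1, reg_m=1.0)
```

A function of this shape, returning the value together with the flattened gradient, is what `scipy.optimize.minimize` accepts with `jac=True` and `method="L-BFGS-B"`; nonnegativity of the plan can be expressed through its `bounds` argument.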
Parameters:
a (array-like (dim_a,)) – Unnormalized histogram of dimension dim_a
If a is an empty list or array ([]),
then a is set to uniform distribution.
b (array-like (dim_b,)) – Unnormalized histogram of dimension dim_b
If b is an empty list or array ([]),
then b is set to uniform distribution.
reg_m (float or indexable object of length 1 or 2) – Marginal relaxation term: must be nonnegative (0 is allowed) and finite.
If \(\mathrm{reg_{m}}\) is a scalar or an indexable object of length 1,
then the same \(\mathrm{reg_{m}}\) is applied to both marginal relaxations.
If \(\mathrm{reg_{m}}\) is an array, it must be a NumPy array.
c (array-like (dim_a, dim_b), optional (default = None)) – Reference measure for the regularization.
If None, then use \(\mathbf{c} = \mathbf{a} \mathbf{b}^T\).
reg_div (string or pair of callable functions, optional (default = 'kl')) – Divergence used for regularization.
Can be one of three strings, ‘entropy’ (negative entropy),
‘kl’ (Kullback-Leibler) or ‘l2’ (half-squared), or a tuple
of two callable functions returning the regularization term and its derivative.
The callable functions should accept NumPy arrays
rather than backend tensors; otherwise the functions are converted to work on NumPy arrays,
which adds computational overhead.
regm_div (string, optional (default = 'kl')) – Divergence to quantify the difference between the marginals.
Can take three values: ‘kl’ (Kullback-Leibler), ‘l2’ (half-squared) or ‘tv’ (Total Variation).
G0 (array-like (dim_a, dim_b), optional (default = None)) – Initialization of the transport matrix. None corresponds to uniform product.
numItermax (int, optional) – Max number of iterations
stopThr (float, optional) – Stop threshold on error (> 0)
verbose (bool, optional) – Print information along iterations
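The `reg_m` convention described in the parameters above (a scalar or a length-1 object applies to both marginals, a length-2 object gives one value per marginal) can be sketched as follows. `split_reg_m` is a hypothetical helper written for illustration, and the validation mirrors the stated constraint for this solver: nonnegative and finite.

```python
import math


def split_reg_m(reg_m):
    """Return (reg_m1, reg_m2) from a scalar or an indexable of length 1 or 2."""
    try:
        values = [float(r) for r in reg_m]   # indexable object
    except TypeError:
        values = [float(reg_m)]              # plain scalar
    if len(values) == 1:
        values = values * 2                  # same value for both marginals
    if len(values) != 2:
        raise ValueError("reg_m must be a scalar or have length 1 or 2")
    if any(r < 0 or not math.isfinite(r) for r in values):
        raise ValueError("reg_m must be nonnegative and finite")
    return values[0], values[1]


print(split_reg_m(2))          # same relaxation on both marginals
print(split_reg_m((1, 0.5)))   # one relaxation per marginal
```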
\(\mathbf{M}\) is the (dim_a, dim_b) metric cost matrix
\(\mathbf{a}\) and \(\mathbf{b}\) are source and target unbalanced distributions
\(\mathbf{c}\) is a reference distribution for the regularization
\(\mathrm{div_m}\) is a divergence, either Kullback-Leibler divergence,
or half-squared \(\ell_2\) divergence, or Total variation
\(\mathrm{div}\) is a divergence, either Kullback-Leibler divergence,
or half-squared \(\ell_2\) divergence
Note
This function is backend-compatible and works on arrays
from all compatible backends. It first converts all arrays into NumPy arrays,
then uses the L-BFGS-B algorithm from scipy.optimize to solve the optimization problem.
Parameters:
a (array-like (dim_a,)) – Unnormalized histogram of dimension dim_a
If a is an empty list or array ([]),
then a is set to uniform distribution.
b (array-like (dim_b,)) – Unnormalized histogram of dimension dim_b
If b is an empty list or array ([]),
then b is set to uniform distribution.
reg_m (float or indexable object of length 1 or 2) – Marginal relaxation term: must be nonnegative (0 is allowed) and finite.
If \(\mathrm{reg_{m}}\) is a scalar or an indexable object of length 1,
then the same \(\mathrm{reg_{m}}\) is applied to both marginal relaxations.
If \(\mathrm{reg_{m}}\) is an array, it must be a NumPy array.
c (array-like (dim_a, dim_b), optional (default = None)) – Reference measure for the regularization.
If None, then use \(\mathbf{c} = \mathbf{a} \mathbf{b}^T\).
reg_div (string or pair of callable functions, optional (default = 'kl')) – Divergence used for regularization.
Can be one of three strings, ‘entropy’ (negative entropy),
‘kl’ (Kullback-Leibler) or ‘l2’ (half-squared), or a tuple
of two callable functions returning the regularization term and its derivative.
The callable functions should accept NumPy arrays
rather than backend tensors; otherwise the functions are converted to work on NumPy arrays,
which adds computational overhead.
regm_div (string, optional (default = 'kl')) – Divergence to quantify the difference between the marginals.
Can take three values: ‘kl’ (Kullback-Leibler), ‘l2’ (half-squared) or ‘tv’ (Total Variation).
G0 (array-like (dim_a, dim_b), optional (default = None)) – Initialization of the transport matrix. None corresponds to uniform product.
returnCost (string, optional (default = "linear")) – If returnCost = “linear”, then return the linear part of the unbalanced OT loss.
If returnCost = “total”, then return the total unbalanced OT loss.
numItermax (int, optional) – Max number of iterations
stopThr (float, optional) – Stop threshold on error (> 0)
verbose (bool, optional) – Print information along iterations
\(\mathbf{M}\) is the (dim_a, dim_b) metric cost matrix
\(\mathbf{a}\) and \(\mathbf{b}\) are source and target
unbalanced distributions
\(\mathbf{c}\) is a reference distribution for the regularization
\(\mathrm{div}\) is a divergence, either Kullback-Leibler or half-squared \(\ell_2\) divergence
The algorithm used for solving the problem is a
majorization-minimization (MM) algorithm as proposed in [41]
Parameters:
a (array-like (dim_a,)) – Unnormalized histogram of dimension dim_a
If a is an empty list or array ([]),
then a is set to uniform distribution.
b (array-like (dim_b,)) – Unnormalized histogram of dimension dim_b
If b is an empty list or array ([]),
then b is set to uniform distribution.
M (array-like (dim_a, dim_b)) – loss matrix
reg_m (float or indexable object of length 1 or 2) – Marginal relaxation term: nonnegative but cannot be infinity.
If \(\mathrm{reg_{m}}\) is a scalar or an indexable object of length 1,
then the same \(\mathrm{reg_{m}}\) is applied to both marginal relaxations.
If \(\mathrm{reg_{m}}\) is an array,
it must have the same backend as input arrays (a, b, M).
reg (float, optional (default = 0)) – Regularization term >= 0.
By default, solve the unregularized problem
c (array-like (dim_a, dim_b), optional (default = None)) – Reference measure for the regularization.
If None, then use \(\mathbf{c} = \mathbf{a} \mathbf{b}^T\).
div (string, optional) – Divergence to quantify the difference between the marginals.
Can take two values: ‘kl’ (Kullback-Leibler) or ‘l2’ (half-squared)
G0 (array-like (dim_a, dim_b)) – Initialization of the transport matrix
numItermax (int, optional) – Max number of iterations
stopThr (float, optional) – Stop threshold on error (> 0)
verbose (bool, optional) – Print information along iterations
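To make the MM-type iteration concrete, here is a minimal NumPy sketch of a multiplicative update for the Kullback-Leibler case without regularization (reg = 0) and with the same `reg_m` on both marginals. The update rule is written from the usual derivation for this setting and is an assumption about the general scheme, not a copy of the library's implementation; `mm_kl_sketch` and `uot_kl_objective` are hypothetical names.

```python
import numpy as np


def kl_div(x, y):
    """Generalized KL divergence sum(x * log(x / y) - x + y) for positive arrays."""
    return float(np.sum(x * np.log(x / y) - x + y))


def uot_kl_objective(G, a, b, M, reg_m):
    """<G, M> + reg_m * KL(G 1, a) + reg_m * KL(G^T 1, b)."""
    return float(np.sum(G * M)) + reg_m * (
        kl_div(G.sum(axis=1), a) + kl_div(G.sum(axis=0), b)
    )


def mm_kl_sketch(a, b, M, reg_m, n_iter=200):
    """Multiplicative updates G <- K * G / sqrt((G 1)(G^T 1)^T)."""
    G = a[:, None] * b[None, :]   # positive starting plan (product measure)
    K = np.sqrt(a[:, None] * b[None, :]) * np.exp(-M / (2.0 * reg_m))
    for _ in range(n_iter):
        row = G.sum(axis=1, keepdims=True)
        col = G.sum(axis=0, keepdims=True)
        G = K * G / np.sqrt(row * col)
    return G


a = np.array([0.5, 0.5])
b = np.array([0.2, 0.3, 0.5])
M = np.array([[0.0, 1.0, 2.0], [2.0, 1.0, 0.0]])
G = mm_kl_sketch(a, b, M, reg_m=1.0)
print(uot_kl_objective(G, a, b, M, reg_m=1.0))
```

Each update keeps the plan nonnegative without any projection step, which is the main appeal of a multiplicative scheme here.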
\(\mathbf{M}\) is the (dim_a, dim_b) metric cost matrix
\(\mathbf{a}\) and \(\mathbf{b}\) are source and target
unbalanced distributions
\(\mathbf{c}\) is a reference distribution for the regularization
\(\mathrm{div}\) is a divergence, either Kullback-Leibler or half-squared \(\ell_2\) divergence
The algorithm used for solving the problem is a
majorization-minimization (MM) algorithm as proposed in [41]
Parameters:
a (array-like (dim_a,)) – Unnormalized histogram of dimension dim_a
If a is an empty list or array ([]),
then a is set to uniform distribution.
b (array-like (dim_b,)) – Unnormalized histogram of dimension dim_b
If b is an empty list or array ([]),
then b is set to uniform distribution.
M (array-like (dim_a, dim_b)) – loss matrix
reg_m (float or indexable object of length 1 or 2) – Marginal relaxation term: nonnegative but cannot be infinity.
If \(\mathrm{reg_{m}}\) is a scalar or an indexable object of length 1,
then the same \(\mathrm{reg_{m}}\) is applied to both marginal relaxations.
If \(\mathrm{reg_{m}}\) is an array,
it must have the same backend as input arrays (a, b, M).
reg (float, optional (default = 0)) – Entropy regularization term >= 0.
By default, solve the unregularized problem
c (array-like (dim_a, dim_b), optional (default = None)) – Reference measure for the regularization.
If None, then use \(\mathbf{c} = \mathbf{a} \mathbf{b}^T\).
div (string, optional) – Divergence to quantify the difference between the marginals.
Can take two values: ‘kl’ (Kullback-Leibler) or ‘l2’ (half-squared)
G0 (array-like (dim_a, dim_b)) – Initialization of the transport matrix
returnCost (string, optional (default = "linear")) – If returnCost = “linear”, then return the linear part of the unbalanced OT loss.
If returnCost = “total”, then return the total unbalanced OT loss.
numItermax (int, optional) – Max number of iterations
stopThr (float, optional) – Stop threshold on error (> 0)
verbose (bool, optional) – Print information along iterations
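The difference between the two `returnCost` options can be illustrated with a short sketch. It assumes the Kullback-Leibler marginal divergence, no regularization (reg = 0) and one `reg_m` for both marginals; `unbalanced_cost` is a hypothetical helper, not a library function.

```python
import numpy as np


def kl_div(x, y):
    """Generalized KL divergence sum(x * log(x / y) - x + y) for positive arrays."""
    return float(np.sum(x * np.log(x / y) - x + y))


def unbalanced_cost(G, a, b, M, reg_m, return_cost="linear"):
    """Linear part <G, M>, or the total loss including the marginal penalties."""
    linear = float(np.sum(G * M))
    if return_cost == "linear":
        return linear
    if return_cost == "total":
        penalty = reg_m * (kl_div(G.sum(axis=1), a) + kl_div(G.sum(axis=0), b))
        return linear + penalty
    raise ValueError("return_cost must be 'linear' or 'total'")


a = np.array([0.5, 0.5])
b = np.array([0.25, 0.75])
M = np.array([[0.0, 1.0], [1.0, 0.0]])
G_half = 0.5 * a[:, None] * b[None, :]   # a plan that transports half of the mass

print(unbalanced_cost(G_half, a, b, M, reg_m=1.0, return_cost="linear"))
print(unbalanced_cost(G_half, a, b, M, reg_m=1.0, return_cost="total"))
```

When the plan's marginals match `a` and `b` exactly, the penalty vanishes and both options return the same number.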
\(\mathbf{M}\) is the (dim_a, dim_b) metric cost matrix
\(\mathbf{a}\) and \(\mathbf{b}\) are source and target unbalanced distributions
\(\mathbf{c}\) is a reference distribution for the regularization
KL is the Kullback-Leibler divergence
The algorithm used for solving the problem is the generalized Sinkhorn-Knopp matrix scaling algorithm as proposed in [10, 25]
Warning
Starting from version 0.9.5, the default value of reg_type has been changed from ‘entropy’ to ‘kl’. This makes the function more consistent with the literature
and the other solvers. To use entropy regularization, set reg_type=‘entropy’ explicitly.
Parameters:
a (array-like, shape (dim_a,)) – Unnormalized histogram of dimension dim_a
If a is an empty list or array ([]),
then a is set to uniform distribution.
b (array-like, shape (dim_b,)) – One or multiple unnormalized histograms of dimension dim_b.
If b is an empty list or array ([]),
then b is set to uniform distribution.
If many, compute all the OT costs \((\mathbf{a}, \mathbf{b}_i)_i\)
M (array-like, shape (dim_a, dim_b)) – loss matrix
reg_m (float or indexable object of length 1 or 2) – Marginal relaxation term.
If \(\mathrm{reg_{m}}\) is a scalar or an indexable object of length 1,
then the same \(\mathrm{reg_{m}}\) is applied to both marginal relaxations.
The entropic balanced OT can be recovered using \(\mathrm{reg_{m}}=float("inf")\).
For semi-relaxed case, use either
\(\mathrm{reg_{m}}=(float("inf"), scalar)\) or
\(\mathrm{reg_{m}}=(scalar, float("inf"))\).
If \(\mathrm{reg_{m}}\) is an array,
it must have the same backend as input arrays (a, b, M).
reg_type (string, optional) –
Regularizer term. Can take two values:
Negative entropy: ‘entropy’:
\(\Omega(\gamma) = \sum_{i,j} \gamma_{i,j} \log(\gamma_{i,j}) - \sum_{i,j} \gamma_{i,j}\).
This is equivalent (up to a constant) to \(\Omega(\gamma) = \text{KL}(\gamma, 1_{dim_a} 1_{dim_b}^T)\).
Kullback-Leibler divergence (default): ‘kl’:
\(\Omega(\gamma) = \text{KL}(\gamma, \mathbf{a} \mathbf{b}^T)\).
c (array-like, shape (dim_a, dim_b), optional (default=None)) – Reference measure for the regularization.
If None, then use \(\mathbf{c} = \mathbf{a} \mathbf{b}^T\).
If \(\texttt{reg_type}=\)’entropy’, then \(\mathbf{c} = 1_{dim_a} 1_{dim_b}^T\).
warmstart (tuple of arrays, shape (dim_a, dim_b), optional) – Initialization of the dual potentials. If provided, the dual potentials themselves should be given,
that is, the logarithms of the u, v Sinkhorn scaling vectors.
numItermax (int, optional) – Max number of iterations
stopThr (float, optional) – Stop threshold on error (> 0)
verbose (bool, optional) – Print information along iterations
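The scaling iterations behind a generalized Sinkhorn-Knopp scheme can be sketched in a few lines of NumPy. This is a simplified illustration for the ‘kl’ regularizer with reference measure \(\mathbf{a} \mathbf{b}^T\), a single target histogram and a fixed number of iterations; the function name, the default iteration count and the absence of a stopping test or numerical stabilization are all assumptions of the sketch.

```python
import math

import numpy as np


def sinkhorn_unbalanced_sketch(a, b, M, reg, reg_m, n_iter=1000):
    """Return the plan diag(u) K diag(v) after n_iter scaling updates."""
    reg_m1, reg_m2 = reg_m
    # Exponent of each scaling update; an infinite relaxation term gives
    # exponent 1, i.e. the corresponding marginal is enforced exactly.
    f1 = 1.0 if math.isinf(reg_m1) else reg_m1 / (reg_m1 + reg)
    f2 = 1.0 if math.isinf(reg_m2) else reg_m2 / (reg_m2 + reg)
    c = a[:, None] * b[None, :]   # 'kl' reference measure a b^T
    K = c * np.exp(-M / reg)
    u = np.ones_like(a)
    v = np.ones_like(b)
    for _ in range(n_iter):
        u = (a / (K @ v)) ** f1
        v = (b / (K.T @ u)) ** f2
    return u[:, None] * K * v[None, :]


a = np.array([0.5, 0.5])
b = np.array([0.2, 0.3, 0.5])
M = np.array([[0.0, 1.0, 2.0], [2.0, 1.0, 0.0]])

G_relaxed = sinkhorn_unbalanced_sketch(a, b, M, reg=0.5, reg_m=(1.0, 1.0))
G_semi = sinkhorn_unbalanced_sketch(a, b, M, reg=0.5, reg_m=(1.0, math.inf))
print(G_relaxed.sum(), G_semi.sum(axis=0))
```

In the semi-relaxed call the second marginal of the plan matches `b`, while the first is only penalized; passing `(math.inf, math.inf)` would correspond to the balanced entropic problem.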
\(\mathbf{M}\) is the (dim_a, dim_b) metric cost matrix
\(\mathbf{a}\) and \(\mathbf{b}\) are source and target unbalanced distributions
\(\mathbf{c}\) is a reference distribution for the regularization
KL is the Kullback-Leibler divergence
The algorithm used for solving the problem is the generalized
Sinkhorn-Knopp matrix scaling algorithm as proposed in [10, 25]
Parameters:
a (array-like, shape (dim_a,)) – Unnormalized histogram of dimension dim_a
If a is an empty list or array ([]),
then a is set to uniform distribution.
b (array-like, shape (dim_b,)) – One or multiple unnormalized histograms of dimension dim_b.
If b is an empty list or array ([]),
then b is set to uniform distribution.
If many, compute all the OT costs \((\mathbf{a}, \mathbf{b}_i)_i\)
M (array-like, shape (dim_a, dim_b)) – loss matrix
reg_m (float or indexable object of length 1 or 2) – Marginal relaxation term.
If \(\mathrm{reg_{m}}\) is a scalar or an indexable object of length 1,
then the same \(\mathrm{reg_{m}}\) is applied to both marginal relaxations.
The entropic balanced OT can be recovered using \(\mathrm{reg_{m}}=float("inf")\).
For semi-relaxed case, use either
\(\mathrm{reg_{m}}=(float("inf"), scalar)\) or
\(\mathrm{reg_{m}}=(scalar, float("inf"))\).
If \(\mathrm{reg_{m}}\) is an array,
it must have the same backend as input arrays (a, b, M).
method (str) – Method used for the solver, either ‘sinkhorn’, ‘sinkhorn_stabilized’ or
‘sinkhorn_reg_scaling’; see those functions for specific parameters
reg_type (string, optional) –
Regularizer term. Can take two values:
Negative entropy: ‘entropy’:
\(\Omega(\gamma) = \sum_{i,j} \gamma_{i,j} \log(\gamma_{i,j}) - \sum_{i,j} \gamma_{i,j}\).
This is equivalent (up to a constant) to \(\Omega(\gamma) = \text{KL}(\gamma, 1_{dim_a} 1_{dim_b}^T)\).
Kullback-Leibler divergence (default): ‘kl’:
\(\Omega(\gamma) = \text{KL}(\gamma, \mathbf{a} \mathbf{b}^T)\).
c (array-like, shape (dim_a, dim_b), optional (default=None)) – Reference measure for the regularization.
If None, then use \(\mathbf{c} = \mathbf{a} \mathbf{b}^T\).
If \(\texttt{reg_type}=\)’entropy’, then \(\mathbf{c} = 1_{dim_a} 1_{dim_b}^T\).
warmstart (tuple of arrays, shape (dim_a, dim_b), optional) – Initialization of the dual potentials. If provided, the dual potentials themselves should be given,
that is, the logarithms of the u, v Sinkhorn scaling vectors.
tau (float) – threshold for max value in u or v for log scaling
numItermax (int, optional) – Max number of iterations
stopThr (float, optional) – Stop threshold on error (>0)
verbose (bool, optional) – Print information along iterations
Warning
Starting from version 0.9.5, the default value of reg_type has been changed from ‘entropy’ to ‘kl’. This makes the function more consistent with the literature
and the other solvers. To use entropy regularization, set reg_type=‘entropy’ explicitly.
Returns:
if n_hists == 1 –
gamma (array-like, shape (dim_a, dim_b)) – Optimal transportation matrix for the given parameters
log (dict) – log dictionary returned only if log is True
else –
ot_cost (array-like, shape (n_hists,)) – the OT cost between \(\mathbf{a}\) and each of the histograms \(\mathbf{b}_i\)
\(\mathbf{M}\) is the (dim_a, dim_b) metric cost matrix
\(\mathbf{a}\) and \(\mathbf{b}\) are source and target unbalanced distributions
\(\mathbf{c}\) is a reference distribution for the regularization
KL is the Kullback-Leibler divergence
The algorithm used for solving the problem is the generalized
Sinkhorn-Knopp matrix scaling algorithm as proposed in [10, 25]
Warning
Starting from version 0.9.5, the default value of reg_type has been changed from ‘entropy’ to ‘kl’. This makes the function more consistent with the literature
and the other solvers. To use entropy regularization, set reg_type=‘entropy’ explicitly.
Parameters:
a (array-like, shape (dim_a,)) – Unnormalized histogram of dimension dim_a
If a is an empty list or array ([]),
then a is set to uniform distribution.
b (array-like, shape (dim_b,)) – One or multiple unnormalized histograms of dimension dim_b.
If b is an empty list or array ([]),
then b is set to uniform distribution.
If many, compute all the OT costs \((\mathbf{a}, \mathbf{b}_i)_i\)
M (array-like, shape (dim_a, dim_b)) – loss matrix
reg_m (float or indexable object of length 1 or 2) – Marginal relaxation term.
If \(\mathrm{reg_{m}}\) is a scalar or an indexable object of length 1,
then the same \(\mathrm{reg_{m}}\) is applied to both marginal relaxations.
The entropic balanced OT can be recovered using \(\mathrm{reg_{m}}=float("inf")\).
For semi-relaxed case, use either
\(\mathrm{reg_{m}}=(float("inf"), scalar)\) or
\(\mathrm{reg_{m}}=(scalar, float("inf"))\).
If \(\mathrm{reg_{m}}\) is an array,
it must have the same backend as input arrays (a, b, M).
method (str) – Method used for the solver, either ‘sinkhorn’, ‘sinkhorn_stabilized’, ‘sinkhorn_translation_invariant’ or
‘sinkhorn_reg_scaling’; see those functions for specific parameters
reg_type (string, optional) –
Regularizer term. Can take two values:
Negative entropy: ‘entropy’:
\(\Omega(\gamma) = \sum_{i,j} \gamma_{i,j} \log(\gamma_{i,j}) - \sum_{i,j} \gamma_{i,j}\).
This is equivalent (up to a constant) to \(\Omega(\gamma) = \text{KL}(\gamma, 1_{dim_a} 1_{dim_b}^T)\).
Kullback-Leibler divergence (default): ‘kl’:
\(\Omega(\gamma) = \text{KL}(\gamma, \mathbf{a} \mathbf{b}^T)\).
c (array-like, shape (dim_a, dim_b), optional (default=None)) – Reference measure for the regularization.
If None, then use \(\mathbf{c} = \mathbf{a} \mathbf{b}^T\).
If \(\texttt{reg_type}=\)’entropy’, then \(\mathbf{c} = 1_{dim_a} 1_{dim_b}^T\).
warmstart (tuple of arrays, shape (dim_a, dim_b), optional) – Initialization of the dual potentials. If provided, the dual potentials themselves should be given,
that is, the logarithms of the u, v Sinkhorn scaling vectors.
numItermax (int, optional) – Max number of iterations
stopThr (float, optional) – Stop threshold on error (>0)
verbose (bool, optional) – Print information along iterations
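The statement that the negative entropy equals \(\text{KL}(\gamma, 1_{dim_a} 1_{dim_b}^T)\) up to a constant can be checked numerically. In the sketch below the constant is the number of entries of the plan, which follows from the generalized KL definition \(\sum x \log(x / y) - x + y\); the helper names are hypothetical.

```python
import numpy as np


def neg_entropy(G):
    """Omega(G) = sum G log G - sum G."""
    return float(np.sum(G * np.log(G) - G))


def kl_to_ones(G):
    """Generalized KL divergence between G and the all-ones matrix."""
    ones = np.ones_like(G)
    return float(np.sum(G * np.log(G / ones) - G + ones))


G = np.array([[0.1, 0.2, 0.3], [0.4, 0.5, 0.6]])
# The gap does not depend on the values in G, only on its size.
gap = kl_to_ones(G) - neg_entropy(G)
print(gap, G.size)
```

Because the gap is independent of the plan, both regularizers lead to the same minimizer; only the reported objective value shifts.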
\(\mathbf{M}\) is the (dim_a, dim_b) metric cost matrix
\(\mathbf{a}\) and \(\mathbf{b}\) are source and target unbalanced distributions
\(\mathbf{c}\) is a reference distribution for the regularization
KL is the Kullback-Leibler divergence
The algorithm used for solving the problem is the generalized
Sinkhorn-Knopp matrix scaling algorithm as proposed in [10, 25]
Warning
Starting from version 0.9.5, the default value of reg_type has been changed from ‘entropy’ to ‘kl’. This makes the function more consistent with the literature
and the other solvers. To use entropy regularization, set reg_type=‘entropy’ explicitly.
Parameters:
a (array-like, shape (dim_a,)) – Unnormalized histogram of dimension dim_a
If a is an empty list or array ([]),
then a is set to uniform distribution.
b (array-like, shape (dim_b,)) – One or multiple unnormalized histograms of dimension dim_b.
If b is an empty list or array ([]),
then b is set to uniform distribution.
If many, compute all the OT costs \((\mathbf{a}, \mathbf{b}_i)_i\)
M (array-like, shape (dim_a, dim_b)) – loss matrix
reg_m (float or indexable object of length 1 or 2) – Marginal relaxation term.
If \(\mathrm{reg_{m}}\) is a scalar or an indexable object of length 1,
then the same \(\mathrm{reg_{m}}\) is applied to both marginal relaxations.
The entropic balanced OT can be recovered using \(\mathrm{reg_{m}}=float("inf")\).
For semi-relaxed case, use either
\(\mathrm{reg_{m}}=(float("inf"), scalar)\) or
\(\mathrm{reg_{m}}=(scalar, float("inf"))\).
If \(\mathrm{reg_{m}}\) is an array,
it must have the same backend as input arrays (a, b, M).
method (str) – Method used for the solver, either ‘sinkhorn’, ‘sinkhorn_stabilized’, ‘sinkhorn_translation_invariant’ or
‘sinkhorn_reg_scaling’; see those functions for specific parameters
reg_type (string, optional) –
Regularizer term. Can take two values:
Negative entropy: ‘entropy’:
\(\Omega(\gamma) = \sum_{i,j} \gamma_{i,j} \log(\gamma_{i,j}) - \sum_{i,j} \gamma_{i,j}\).
This is equivalent (up to a constant) to \(\Omega(\gamma) = \text{KL}(\gamma, 1_{dim_a} 1_{dim_b}^T)\).
Kullback-Leibler divergence (default): ‘kl’:
\(\Omega(\gamma) = \text{KL}(\gamma, \mathbf{a} \mathbf{b}^T)\).
c (array-like, shape (dim_a, dim_b), optional (default=None)) – Reference measure for the regularization.
If None, then use \(\mathbf{c} = \mathbf{a} \mathbf{b}^T\).
If \(\texttt{reg_type}=\)’entropy’, then \(\mathbf{c} = 1_{dim_a} 1_{dim_b}^T\).
warmstart (tuple of arrays, shape (dim_a, dim_b), optional) – Initialization of the dual potentials. If provided, the dual potentials themselves should be given,
that is, the logarithms of the u, v Sinkhorn scaling vectors.
returnCost (string, optional (default = "linear")) – If returnCost = “linear”, then return the linear part of the unbalanced OT loss.
If returnCost = “total”, then return the total unbalanced OT loss.
numItermax (int, optional) – Max number of iterations
stopThr (float, optional) – Stop threshold on error (>0)
verbose (bool, optional) – Print information along iterations
\(\mathbf{M}\) is the (dim_a, dim_b) metric cost matrix
\(\Omega\) is the entropic regularization term (KL divergence)
\(\mathbf{a}\) and \(\mathbf{b}\) are source and target unbalanced distributions
KL is the Kullback-Leibler divergence
The algorithm used for solving the problem is the translation invariant Sinkhorn algorithm as proposed in [73]
Parameters:
a (array-like, shape (dim_a,)) – Unnormalized histogram of dimension dim_a
b (array-like, shape (dim_b,) or (dim_b, n_hists)) – One or multiple unnormalized histograms of dimension dim_b
If many, compute all the OT distances (a, b_i)
M (array-like, shape (dim_a, dim_b)) – loss matrix
reg_m (float or indexable object of length 1 or 2) – Marginal relaxation term.
If reg_m is a scalar or an indexable object of length 1,
then the same reg_m is applied to both marginal relaxations.
The entropic balanced OT can be recovered using reg_m=float(“inf”).
For semi-relaxed case, use either
reg_m=(float(“inf”), scalar) or reg_m=(scalar, float(“inf”)).
If reg_m is an array, it must have the same backend as input arrays (a, b, M).
reg_type (string, optional) – Regularizer term. Can take two values:
‘entropy’ (negative entropy)
\(\Omega(\gamma) = \sum_{i,j} \gamma_{i,j} \log(\gamma_{i,j}) - \sum_{i,j} \gamma_{i,j}\), or
‘kl’ (Kullback-Leibler)
\(\Omega(\gamma) = \text{KL}(\gamma, \mathbf{a} \mathbf{b}^T)\).
c (array-like, shape (dim_a, dim_b), optional (default=None)) – Reference measure for the regularization.
If None, then use \(\mathbf{c} = \mathbf{a} \mathbf{b}^T\).
If \(\texttt{reg_type}=\)’entropy’, then \(\mathbf{c} = 1_{dim_a} 1_{dim_b}^T\).
warmstart (tuple of arrays, shape (dim_a, dim_b), optional) – Initialization of the dual potentials. If provided, the dual potentials themselves should be given,
that is, the logarithms of the u, v Sinkhorn scaling vectors.
numItermax (int, optional) – Max number of iterations
stopThr (float, optional) – Stop threshold on error (> 0)
verbose (bool, optional) – Print information along iterations
Compute the Sliced Unbalanced Optimal Transport (SUOT) between two empirical distributions.
The 1D UOT problem is computed with KL regularization and solved with a Frank-Wolfe algorithm, see [82].
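The Sliced Unbalanced Optimal Transport (SUOT) is defined as
\[\mathrm{SUOT}(\mu, \nu) = \int_{S^{d-1}} \mathrm{UOT}(P^\theta_\# \mu, P^\theta_\# \nu) \, \mathrm{d}\lambda(\theta)\]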
with \(P^\theta(x)=\langle x,\theta\rangle\) and \(\lambda\) the uniform distribution on the unit sphere.
Warning
This function only works in PyTorch or JAX as it uses automatic differentiation to solve the 1D UOT problems. The JAX version is not maintained.
Parameters:
X_s (ndarray, shape (n_samples_a, dim)) – samples in the source domain
X_t (ndarray, shape (n_samples_b, dim)) – samples in the target domain
reg_m (float or indexable object of length 1 or 2) – Marginal relaxation term.
If reg_m is a scalar or an indexable object of length 1,
then the same reg_m is applied to both marginal relaxations.
The balanced OT can be recovered using reg_m=float(“inf”).
For semi-relaxed case, use either reg_m=(float(“inf”), scalar) or reg_m=(scalar, float(“inf”)).
If reg_m is an array, it must have the same backend as input arrays (X_s, X_t).
a (ndarray, shape (n_samples_a,), optional) – samples weights in the source domain
b (ndarray, shape (n_samples_b,), optional) – samples weights in the target domain
n_projections (int, optional) – Number of projections used for the Monte-Carlo approximation
p (float, optional, default = 2) – Power p used for computing the sliced Wasserstein distance
projections (shape (dim, n_projections), optional) – Projection matrix (n_projections and seed are not used in this case)
seed (int or RandomState or None, optional) – Seed used for random number generator
log (bool, optional) – if True, returns the projections used and their associated UOTs and reweighted marginals.
Returns:
loss (float/array-like, shape (…)) – SUOT
log (dict, optional) – If log is True, then returns a dictionary containing the projection directions used, the projected UOTs, and reweighted marginals on each slices.
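The Monte-Carlo approximation over projection directions can be illustrated as follows: directions are drawn uniformly on the unit sphere and the samples are projected onto each of them, after which one 1D problem per column would be solved and the results averaged. The helper `sample_projections` and its use of `numpy.random.default_rng` are assumptions of this sketch; only the `(dim, n_projections)` layout comes from the parameter description above.

```python
import numpy as np


def sample_projections(dim, n_projections, seed=None):
    """Draw directions uniformly on the unit sphere, one per column."""
    rng = np.random.default_rng(seed)
    theta = rng.standard_normal((dim, n_projections))
    # Normalizing isotropic Gaussian vectors yields uniform directions.
    return theta / np.linalg.norm(theta, axis=0, keepdims=True)


X_s = np.random.default_rng(0).standard_normal((5, 3))   # (n_samples_a, dim)
projections = sample_projections(dim=3, n_projections=4, seed=1)
X_s_proj = X_s @ projections                              # (n_samples_a, n_projections)
print(X_s_proj.shape)
```

Passing a precomputed matrix through the `projections` argument makes the result reproducible and, as noted above, means `n_projections` and `seed` are ignored.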
The USOT problem is solved with a Frank-Wolfe algorithm as proposed in [82].
Warning
This function only works in PyTorch or JAX as it uses automatic differentiation to compute the 1D potentials. The JAX version is not maintained.
Parameters:
X_s (ndarray, shape (n_samples_a, dim)) – samples in the source domain
X_t (ndarray, shape (n_samples_b, dim)) – samples in the target domain
reg_m (float or indexable object of length 1 or 2) – Marginal relaxation term.
If reg_m is a scalar or an indexable object of length 1,
then the same reg_m is applied to both marginal relaxations.
The balanced OT can be recovered using reg_m=float(“inf”).
For semi-relaxed case, use either reg_m=(float(“inf”), scalar) or reg_m=(scalar, float(“inf”)).
If reg_m is an array, it must have the same backend as input arrays (X_s, X_t).
a (ndarray, shape (n_samples_a,), optional) – samples weights in the source domain
b (ndarray, shape (n_samples_b,), optional) – samples weights in the target domain
n_projections (int, optional) – Number of projections used for the Monte-Carlo approximation
p (float, optional, default = 2) – Power p used for computing the sliced Wasserstein distance
projections (shape (dim, n_projections), optional) – Projection matrix (n_projections and seed are not used in this case)
seed (int or RandomState or None, optional) – Seed used for random number generator
log (bool, optional) – if True, returns the sot loss, the projections used, their associated EMD and the full mass of the reweighted marginals.
Returns:
a_reweighted (array-like shape (n, …)) – First marginal reweighted
b_reweighted (array-like shape (m, …)) – Second marginal reweighted
loss (float/array-like, shape (…)) – USOT
log (dict, optional) – If log is True, then returns a dictionary containing the projection directions used, the 1D OT losses, the SOT loss and the full mass of reweighted marginals.
Solves the 1D unbalanced OT problem with KL regularization.
The function implements the Frank-Wolfe algorithm to solve the dual problem,
as proposed in [73].
This function only works in PyTorch or JAX as it uses automatic differentiation to compute the potentials. The JAX version is not maintained.
Parameters:
u_values (array-like, shape (n, ...)) – locations of the first empirical distribution
v_values (array-like, shape (m, ...)) – locations of the second empirical distribution
reg_m (float or indexable object of length 1 or 2) – Marginal relaxation term.
If reg_m is a scalar or an indexable object of length 1,
then the same reg_m is applied to both marginal relaxations.
The balanced OT can be recovered using reg_m=float(“inf”).
For semi-relaxed case, use either reg_m=(float(“inf”), scalar) or reg_m=(scalar, float(“inf”)).
If reg_m is an array, it must have the same backend as input arrays (u_values, v_values).
u_weights (array-like, shape (n, ...), optional) – weights of the first empirical distribution, if None then uniform weights are used
v_weights (array-like, shape (m, ...), optional) – weights of the second empirical distribution, if None then uniform weights are used
p (int, optional) – order of the ground metric used, should be at least 1, default is 2
require_sort (bool, optional) – Sort the atom locations of the distributions. If False, they are assumed to have been sorted before being passed to
the function. Default is True.
returnCost (string, optional (default = "linear")) – If returnCost = “linear”, then return the linear part of the unbalanced OT loss.
If returnCost = “total”, then return the total unbalanced OT loss.
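When `require_sort=False` is used, the caller is responsible for sorting the atoms, and the weights have to be reordered with the same permutation. A minimal sketch for a single 1D distribution, with uniform weights when none are given, could look like this; `sort_atoms` is a hypothetical helper and the normalization of the uniform weights to total mass 1 is an assumption.

```python
import numpy as np


def sort_atoms(values, weights=None):
    """Sort 1D atom locations and apply the same permutation to the weights."""
    values = np.asarray(values, dtype=float)
    if weights is None:
        # Uniform weights (assumed here to sum to 1).
        weights = np.full(values.shape[0], 1.0 / values.shape[0])
    weights = np.asarray(weights, dtype=float)
    order = np.argsort(values)
    return values[order], weights[order]


u_sorted, w_sorted = sort_atoms([3.0, 1.0, 2.0], [0.1, 0.2, 0.7])
print(u_sorted, w_sorted)
```

Sorting the locations without permuting the weights would silently attach each mass to the wrong atom, which is why both arrays are returned together.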