Computes the Expected Sliced cost and plan between two datasets X_s and
X_t of shapes (ns, d) and (nt, d). Given a set of n_projections projection
directions, the expected sliced plan is obtained by averaging the n_projections
1d optimal transport plans between the projections of X_s and X_t on each
direction. Expected Sliced was introduced in [87] and further studied in
[86].
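The averaging described above can be illustrated with a minimal NumPy sketch. It assumes uniform weights, ns == nt, a squared Euclidean ground cost and uniform averaging over directions; the function name `expected_sliced_plan_sketch` is hypothetical and this is not the library's implementation.

```python
import numpy as np


def expected_sliced_plan_sketch(X_s, X_t, n_projections=50, seed=0):
    """Average the 1D OT plans over random directions (uniform weights, ns == nt)."""
    rng = np.random.default_rng(seed)
    n, d = X_s.shape
    assert X_t.shape == (n, d), "this sketch only handles ns == nt"

    # Random unit directions stored as columns: shape (d, n_projections).
    thetas = rng.normal(size=(d, n_projections))
    thetas /= np.linalg.norm(thetas, axis=0, keepdims=True)

    plan = np.zeros((n, n))
    for k in range(n_projections):
        # For uniform measures with the same number of atoms, the 1D optimal
        # plan matches the i-th smallest source projection to the i-th
        # smallest target projection.
        idx_s = np.argsort(X_s @ thetas[:, k], kind="stable")
        idx_t = np.argsort(X_t @ thetas[:, k], kind="stable")
        plan[idx_s, idx_t] += 1.0 / n
    plan /= n_projections

    # Cost of the averaged plan under the squared Euclidean ground cost.
    M = ((X_s[:, None, :] - X_t[None, :, :]) ** 2).sum(axis=-1)
    cost = float((plan * M).sum())
    return plan, cost


rng = np.random.default_rng(1)
X_s = rng.normal(size=(6, 3))
X_t = rng.normal(size=(6, 3))
plan, cost = expected_sliced_plan_sketch(X_s, X_t, n_projections=20)
```

Because every 1D plan has uniform marginals, the averaged plan keeps them as well, so it remains a valid coupling between the two uniform measures.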
Note
The computation ignores potential ambiguities in the projections: if
two points from the same measure have the same projection on a direction,
then multiple sorting permutations are possible. To avoid combinatorial
explosion, only one permutation is retained, which departs from the theory in
pathological cases.
Warning
TensorFlow and JAX only return dense plans, as they do not support sparse matrices well.
Parameters:
X_s (array-like, shape (ns, d)) – The first set of vectors.
X_t (array-like, shape (nt, d)) – The second set of vectors.
a (ndarray of float64, shape (ns,), optional) – Source histogram (default is uniform weight)
b (ndarray of float64, shape (nt,), optional) – Target histogram (default is uniform weight)
projections (shape (dim, n_projections), optional) – Projection matrix (n_projections and seed are not used in this case).
Default is None
metric (str, optional (default='sqeuclidean')) – Metric to be used. Only works with either of the strings
‘sqeuclidean’, ‘minkowski’, ‘cityblock’, or ‘euclidean’.
p (float, optional (default=2)) – The p-norm to apply if metric=‘minkowski’
n_projections (int, optional) – The number of projection directions. Required if projections is None.
seed (int, optional) – The seed for the random number generator for sampling projections, in case
projections is None. Default is None.
beta (float, optional) – Inverse-temperature parameter which weights each projection’s
contribution to the expected plan. Default is 0 (uniform weighting).
dense (boolean, optional (default=True)) – If True, returns \(\gamma\) as a dense ndarray of shape (ns, nt).
Otherwise returns a sparse representation using scipy’s coo_matrix
format.
batch_size (int, optional) – If specified, compute the distance in batches of size batch_size to
avoid memory issues for large datasets. Default is None (no batching).
log (bool, optional) – If True, returns additional logging information. Default is False.
Returns:
plan (ndarray, shape (ns, nt) or coo_matrix if dense is False) – Optimal transportation matrix for the given parameters.
cost (float) – The cost associated with the optimal permutation.
log_dict (dict, optional) – A dictionary containing intermediate computations for logging purposes.
Returned only if log is True.
Generates n_projections samples from the uniform distribution on the Stiefel manifold of dimension \(d\times 2\): \(\mathbb{V}_{d,2}=\{X \in \mathbb{R}^{d\times 2}, X^TX=I_2\}\)
where \(\mu,\nu\in\mathcal{P}(S^{d-1})\) are two probability measures on the sphere, \(\mathrm{LCOT}_2\) is the linear circular optimal transport distance,
and \(P^U_\# \mu\) stands for the pushforward of \(\mu\) by the projection \(\forall x\in S^{d-1},\ P^U(x) = \frac{U^Tx}{\|U^Tx\|_2}\).
Parameters:
X_s (ndarray, shape (n_samples_a, dim)) – Samples in the source domain
X_t (ndarray, shape (n_samples_b, dim), optional) – Samples in the target domain. If None, computes the distance against
the uniform distribution on the sphere.
a (ndarray, shape (n_samples_a,), optional) – sample weights in the source domain
b (ndarray, shape (n_samples_b,), optional) – sample weights in the target domain
n_projections (int, optional) – Number of projections used for the Monte-Carlo approximation
projections (shape (n_projections, dim, 2), optional) – Projection matrix (n_projections and seed are not used in this case)
seed (int or RandomState or None, optional) – Seed used for random number generator
log (bool, optional) – if True, linear_sliced_wasserstein_sphere returns the projections used
and their associated LCOT.
Returns:
cost (float) – Linear Spherical Sliced Wasserstein Cost
log (dict, optional) – log dictionary, returned only if log==True in parameters
\(\theta_\# \mu\) stands for the pushforward of \(\mu\) by the projection \(\mathbb{R}^d \ni X \mapsto \langle \theta, X \rangle\)
Parameters:
X_s (ndarray, shape (n_samples_a, dim)) – samples in the source domain
X_t (ndarray, shape (n_samples_b, dim)) – samples in the target domain
a (ndarray, shape (n_samples_a,), optional) – sample weights in the source domain
b (ndarray, shape (n_samples_b,), optional) – sample weights in the target domain
n_projections (int, optional) – Number of projections used for the Monte-Carlo approximation
p (float, optional) – Power p used for computing the sliced Wasserstein
projections (shape (dim, n_projections), optional) – Projection matrix (n_projections and seed are not used in this case)
seed (int or RandomState or None, optional) – Seed used for random number generator
log (bool, optional) – if True, sliced_wasserstein_distance returns the projections used and their associated EMD.
scaler (None, object with .transform(), or callable, optional) –
Preprocessing applied to X_s and X_t before computing the distance.
Useful for normalizing inputs when features have very different scales.
None : no preprocessing (default)
Object with .transform() method : e.g. an ot.utils.DataScaler
fitted on a representative sample. This is the recommended way to get
stable, consistent normalization across multiple calls (e.g. when
using SWD as a loss in mini-batch training).
Callable : any function, lambda, or PyTorch transform applied
directly as scaler(X_s) and scaler(X_t).
See ot.utils.DataScaler for a backend-aware scaler that supports
joint fitting on multiple distributions.
Returns:
cost (float) – Sliced Wasserstein Cost
log (dict, optional) – log dictionary, returned only if log==True in parameters
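As an illustration of the Monte-Carlo approximation and of the three accepted forms of scaler, here is a minimal NumPy sketch. It assumes uniform weights and the same number of samples in both domains; the names `apply_scaler_sketch` and `sliced_wasserstein_sketch` are hypothetical and do not reproduce the library code.

```python
import numpy as np


def apply_scaler_sketch(scaler, X):
    """Dispatch on the three documented cases: None, .transform() object, callable."""
    if scaler is None:
        return X
    if hasattr(scaler, "transform"):
        return scaler.transform(X)
    return scaler(X)


def sliced_wasserstein_sketch(X_s, X_t, n_projections=100, p=2, seed=0, scaler=None):
    """Monte-Carlo sliced Wasserstein estimate (uniform weights, equal sample counts)."""
    X_s = apply_scaler_sketch(scaler, X_s)
    X_t = apply_scaler_sketch(scaler, X_t)
    rng = np.random.default_rng(seed)
    d = X_s.shape[1]

    # Random unit directions stored as columns: shape (dim, n_projections).
    thetas = rng.normal(size=(d, n_projections))
    thetas /= np.linalg.norm(thetas, axis=0, keepdims=True)

    # In 1D, optimal transport between equal-size uniform samples pairs sorted values.
    proj_s = np.sort(X_s @ thetas, axis=0)
    proj_t = np.sort(X_t @ thetas, axis=0)
    w_pp = np.mean(np.abs(proj_s - proj_t) ** p, axis=0)  # one W_p^p per direction

    # Average over directions, then take the 1/p power.
    return float(np.mean(w_pp) ** (1.0 / p))


X = np.random.default_rng(2).normal(size=(20, 3))
shifted = X + np.array([1.0, 0.0, 0.0])
d_raw = sliced_wasserstein_sketch(X, shifted)
d_scaled = sliced_wasserstein_sketch(X, shifted, scaler=lambda Z: Z / 10.0)
```

With a fixed seed the same directions are drawn in both calls, so dividing the inputs by 10 divides the estimate by 10, which shows how strongly the feature scale drives the value.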
Computes the cost and permutation associated to the min-Pivot Sliced
Discrepancy (introduced as min-SWGG in [85] and studied further in [86]). Given
the supports X_s and X_t of two discrete uniform measures with ns and nt
atoms in dimension d, the min-Pivot Sliced Discrepancy goes through
n_projections different projections of the measures on random directions, and
retains the coupling that yields the lowest cost between X_s and X_t
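(compared in \(\mathbb{R}^d\)). When ns=nt, it gives
\[\min_{k \in \{1, \dots, K\}} \frac{1}{n} \sum_{i=1}^{n} c\big(X_s[i, :], X_t[\sigma_k(i), :]\big)\]
where \(n\) = ns = nt, \(K\) = n_projections, \(c\) is the ground cost selected by metric, and \(\sigma_k\) is a permutation such that ordering the projections
on the axis projections[k, :] matches \(X_s[i, :]\) to \(X_t[\sigma_k(i), :]\).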
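The selection over projections can be sketched in a few lines of NumPy. The sketch assumes ns == nt, uniform weights and the squared Euclidean metric; `min_pivot_sliced_sketch` is a hypothetical name, not the library function.

```python
import numpy as np


def min_pivot_sliced_sketch(X_s, X_t, n_projections=50, seed=0):
    """Keep the sorting permutation with the lowest cost measured in R^d (ns == nt)."""
    rng = np.random.default_rng(seed)
    n, d = X_s.shape
    assert X_t.shape == (n, d), "this sketch only handles ns == nt"

    # Random unit directions stored as columns: shape (d, n_projections).
    thetas = rng.normal(size=(d, n_projections))
    thetas /= np.linalg.norm(thetas, axis=0, keepdims=True)

    best_perm, best_cost = None, np.inf
    for k in range(n_projections):
        idx_s = np.argsort(X_s @ thetas[:, k], kind="stable")
        idx_t = np.argsort(X_t @ thetas[:, k], kind="stable")
        # perm[i] is the index of the target point matched to X_s[i].
        perm = np.empty(n, dtype=int)
        perm[idx_s] = idx_t
        # The matching comes from 1D sorting, but its cost is evaluated in R^d.
        cost = np.mean(np.sum((X_s - X_t[perm]) ** 2, axis=1))
        if cost < best_cost:
            best_perm, best_cost = perm, cost
    return best_perm, float(best_cost)


rng = np.random.default_rng(3)
X_s = rng.normal(size=(8, 3))
X_t = X_s[rng.permutation(8)]  # target is a shuffled copy of the source
perm, cost = min_pivot_sliced_sketch(X_s, X_t, n_projections=10)
```

In this usage example the target is a shuffled copy of the source, so sorting along any direction without ties recovers the shuffle and the retained cost is zero.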
Note
The computation ignores potential ambiguities in the projections: if
two points from the same measure have the same projection on a direction,
then multiple sorting permutations are possible. To avoid combinatorial
explosion, only one permutation is retained, which departs from the theory in
pathological cases.
Warning
TensorFlow and JAX only return dense plans, as they do not support sparse matrices well.
Parameters:
X_s (array-like, shape (ns, d)) – The first set of vectors.
X_t (array-like, shape (nt, d)) – The second set of vectors.
a (ndarray of float64, shape (ns,), optional) – Source histogram (default is uniform weight)
b (ndarray of float64, shape (nt,), optional) – Target histogram (default is uniform weight)
projections (shape (dim, n_projections), optional) – Projection matrix (n_projections and seed are not used in this case).
Default is None
metric (str, optional (default='sqeuclidean')) – Metric to be used. Only works with either of the strings
‘sqeuclidean’, ‘minkowski’, ‘cityblock’, or ‘euclidean’.
p (float, optional (default=1.0)) – The p-norm to apply if metric=‘minkowski’
n_projections (int, optional) – The number of projection directions. Required if projections is None.
seed (int, optional) – The seed for the random number generator for sampling projections, in case
projections is None. Default is None.
batch_size (int, optional) – If specified, compute the distance in batches of size batch_size to
avoid memory issues for large datasets. Default is None (no batching).
dense (boolean, optional (default=True)) – If True, returns \(\gamma\) as a dense ndarray of shape (ns, nt).
Otherwise returns a sparse representation using scipy’s coo_matrix
format.
log (bool, optional) – If True, returns additional logging information. Default is False.
Returns:
plan (ndarray, shape (ns, nt) or coo_matrix if dense is False) – Optimal transportation matrix for the given parameters.
cost (float) – The cost associated with the optimal permutation.
log_dict (dict, optional) – A dictionary containing intermediate computations for logging purposes.
Returned only if log is True.
Projection of \(x\in S^{d-1}\) on circles using coordinates on [0,1[.
To get the projection on the circle, we use the following formula:
\[P^U(x) = \frac{U^Tx}{\|U^Tx\|_2}\]
where \(U\) is a random matrix sampled from the uniform distribution on the Stiefel manifold of dimension \(d\times 2\): \(\mathbb{V}_{d,2}=\{X \in \mathbb{R}^{d\times 2}, X^TX=I_2\}\)
and \(x\) is a point on the sphere. Then, we apply the function get_coordinate_circle to get the coordinates on \([0,1[\).
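The two steps above can be sketched with NumPy as follows. The Stiefel sampling uses the QR factorization of a Gaussian matrix with a sign correction, which is a standard way to draw uniformly from \(\mathbb{V}_{d,2}\). The angle convention used to map the circle to \([0,1[\) is an assumption here and may differ from the one in get_coordinate_circle; both function names are hypothetical.

```python
import numpy as np


def sample_stiefel_sketch(d, n_projections, seed=0):
    """Draw n_projections matrices U of shape (d, 2) with U^T U = I_2."""
    rng = np.random.default_rng(seed)
    U = np.empty((n_projections, d, 2))
    for k in range(n_projections):
        Q, R = np.linalg.qr(rng.normal(size=(d, 2)))
        # Fixing the signs of diag(R) makes the distribution of Q uniform.
        U[k] = Q * np.sign(np.diag(R))
    return U


def project_sphere_to_circle_sketch(x, U):
    """Apply P^U(x) = U^T x / ||U^T x||_2, then return coordinates in [0, 1[."""
    # proj[p, n, :] = U[p]^T x[n], shape (n_projections, n_samples, 2).
    proj = np.einsum("nd,pdk->pnk", x, U)
    proj /= np.linalg.norm(proj, axis=-1, keepdims=True)
    # Assumed convention: angle in (-pi, pi] divided by 2*pi, wrapped modulo 1.
    angle = np.arctan2(proj[..., 1], proj[..., 0])
    return np.mod(angle / (2 * np.pi), 1.0)


rng = np.random.default_rng(4)
x = rng.normal(size=(7, 4))
x /= np.linalg.norm(x, axis=1, keepdims=True)  # put the samples on the sphere
U = sample_stiefel_sketch(d=4, n_projections=5)
coords = project_sphere_to_circle_sketch(x, U)  # shape (n_projections, n_samples)
```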
Parameters:
x (ndarray, shape (n_samples, dim)) – samples on the sphere
n_projections (int, optional) – Number of projections used for the Monte-Carlo approximation
projections (shape (n_projections, dim, 2), optional) – Projection matrix (n_projections and seed are not used in this case)
seed (int or RandomState or None, optional) – Seed used for random number generator
backend – Backend to use for random generation
Returns:
Xp_coords – Coordinates of the projections on the circle
Computes all the permutations that sort the projections of two datasets
X_s and X_t, of shapes (ns, d) and (nt, d), on the directions projections.
Each permutation perm[:, k] is such that each \(X_s[i, :]\) is matched
to X_t[perm[i, k], :] when projected on projections[k, :].
Parameters:
X_s (array-like, shape (ns, d)) – The first set of vectors.
X_t (array-like, shape (nt, d)) – The second set of vectors.
a (ndarray of float64, shape (ns,), optional) – Source histogram (default is uniform weight)
b (ndarray of float64, shape (nt,), optional) – Target histogram (default is uniform weight)
metric (str, optional (default='sqeuclidean')) – Metric to be used. Only works with either of the strings
‘sqeuclidean’, ‘minkowski’, ‘cityblock’, or ‘euclidean’.
p (float, optional (default=1.0)) – The p-norm to apply if metric=‘minkowski’
projections (shape (dim, n_projections), optional) – Projection matrix (n_projections and seed are not used in this case)
n_projections (int, optional) – The number of projection directions. Required if projections is None.
seed (int, optional) – The seed for the random number generator for sampling projections, in case
projections is None.
Default is None.
batch_size (int, optional) – If specified, compute the distance in batches of size batch_size to
avoid memory issues for large datasets. Default is None (no batching).
log (bool, optional) – If True, returns additional logging information. Default is False.
Returns:
plan (list of dictionaries) – The optimal transport plans, one dictionary per projection, each containing
the rows, cols and data of the non-zero elements of the transportation matrix.
costs (list of float) – The cost associated to each projection.
log_dict (dict, optional) – A dictionary containing intermediate computations for logging purposes.
Returned only if log is True.
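To illustrate the sparse output format, the sketch below builds one 1D optimal plan per direction for weighted samples by sorting the projections and filling mass greedily in sorted order. The dictionary keys "rows", "cols" and "data" and both function names are assumptions made for this example, not the library's exact output.

```python
import numpy as np


def ot_1d_sparse_plan_sketch(u, v, a, b):
    """Sparse 1D OT plan between weighted samples (u, a) and (v, b)."""
    idx_u = np.argsort(u, kind="stable")
    idx_v = np.argsort(v, kind="stable")
    # Fancy indexing copies, so the caller's weights are left untouched.
    a_rem = np.asarray(a, dtype=float)[idx_u]
    b_rem = np.asarray(b, dtype=float)[idx_v]

    rows, cols, data = [], [], []
    i = j = 0
    while i < len(a_rem) and j < len(b_rem):
        src, tgt = int(idx_u[i]), int(idx_v[j])
        if a_rem[i] <= b_rem[j]:
            # The current source atom is exhausted first: move to the next one.
            mass = a_rem[i]
            b_rem[j] -= mass
            i += 1
        else:
            # The current target atom is filled first: move to the next one.
            mass = b_rem[j]
            a_rem[i] -= mass
            j += 1
        if mass > 0:
            rows.append(src)
            cols.append(tgt)
            data.append(float(mass))
    return {"rows": np.array(rows), "cols": np.array(cols), "data": np.array(data)}


def sliced_plans_sketch(X_s, X_t, a, b, projections):
    """One sparse plan per column of projections, of shape (dim, n_projections)."""
    return [
        ot_1d_sparse_plan_sketch(X_s @ projections[:, k], X_t @ projections[:, k], a, b)
        for k in range(projections.shape[1])
    ]


a = np.array([0.5, 0.5])
b = np.array([0.25, 0.25, 0.5])
plan = ot_1d_sparse_plan_sketch(np.array([0.0, 1.0]), np.array([2.0, 0.0, 1.0]), a, b)
```

Each plan holds at most ns + nt - 1 non-zero entries, which is why a sparse (rows, cols, data) representation is a natural fit when many projections are stored.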
\(\theta_\# \mu\) stands for the pushforward of \(\mu\) by the projection \(X \in \mathbb{R}^d \mapsto \langle \theta, X \rangle\)
Parameters:
X_s (ndarray, shape (n_samples_a, dim)) – samples in the source domain
X_t (ndarray, shape (n_samples_b, dim)) – samples in the target domain
a (ndarray, shape (n_samples_a,), optional) – sample weights in the source domain
b (ndarray, shape (n_samples_b,), optional) – sample weights in the target domain
n_projections (int, optional) – Number of projections used for the Monte-Carlo approximation
p (float, optional) – Power p used for computing the sliced Wasserstein
projections (shape (dim, n_projections), optional) – Projection matrix (n_projections and seed are not used in this case)
seed (int or RandomState or None, optional) – Seed used for random number generator
log (bool, optional) – if True, sliced_wasserstein_distance returns the projections used and their associated EMD.
scaler (None, object with .transform(), or callable, optional) –
Preprocessing applied to X_s and X_t before computing the distance.
Useful for normalizing inputs when features have very different scales.
None : no preprocessing (default)
Object with .transform() method : e.g. an ot.utils.DataScaler
fitted on a representative sample. This is the recommended way to get
stable, consistent normalization across multiple calls (e.g. when
using SWD as a loss in mini-batch training).
Callable : any function, lambda, or PyTorch transform applied
directly as scaler(X_s) and scaler(X_t).
See ot.utils.DataScaler for a backend-aware scaler that supports
joint fitting on multiple distributions.
Returns:
cost (float) – Sliced Wasserstein Cost
log (dict, optional) – log dictionary, returned only if log==True in parameters