ot.lowrank

Low rank OT solvers

Functions

ot.lowrank.compute_lr_sqeuclidean_matrix(X_s, X_t, rescale_cost, nx=None)[source]

Compute the low rank decomposition of a squared euclidean distance matrix. This function won’t work for other distance metrics.

See “Section 3.5, proposition 1”

Parameters:

X_s (array-like, shape (n_samples_a, dim)) – samples in the source domain
X_t (array-like, shape (n_samples_b, dim)) – samples in the target domain
rescale_cost (bool) – Rescale the low rank factorization of the sqeuclidean cost matrix
nx (default None) – POT backend

Returns:

M1 (array-like, shape (n_samples_a, dim+2)) – First low rank decomposition of the distance matrix
M2 (array-like, shape (n_samples_b, dim+2)) – Second low rank decomposition of the distance matrix

References

ot.lowrank.lowrank_sinkhorn(X_s, X_t, a=None, b=None, reg=0, rank=None, alpha=1e-10, rescale_cost=True, init='random', reg_init=0.1, seed_init=49, gamma_init='rescale', numItermax=2000, stopThr=1e-07, warn=True, log=False)[source]

Solve the entropic regularization optimal transport problem under low-nonnegative rank constraints on the couplings.

The function solves the following optimization problem:

\[\mathop{\inf_{(\mathbf{Q},\mathbf{R},\mathbf{g}) \in \mathcal{C}(\mathbf{a},\mathbf{b},r)}} \langle \mathbf{C}, \mathbf{Q}\mathrm{diag}(1/\mathbf{g})\mathbf{R}^\top \rangle - \mathrm{reg} \cdot H((\mathbf{Q}, \mathbf{R}, \mathbf{g}))\]

where :

\(\mathbf{C}\) is the (dim_a, dim_b) metric cost matrix
\(H((\mathbf{Q}, \mathbf{R}, \mathbf{g}))\) is the values of the three respective entropies evaluated for each term.
\(\mathbf{Q}\) and \(\mathbf{R}\) are the low-rank matrix decomposition of the OT plan
\(\mathbf{g}\) is the weight vector for the low-rank decomposition of the OT plan
\(\mathbf{a}\) and \(\mathbf{b}\) are source and target weights (histograms, both sum to 1)
\(r\) is the rank of the OT plan
\(\mathcal{C}(\mathbf{a}, \mathbf{b}, r)\) are the low-rank couplings of the OT problem

Parameters:

X_s (array-like, shape (n_samples_a, dim)) – samples in the source domain
X_t (array-like, shape (n_samples_b, dim)) – samples in the target domain
a (array-like, shape (n_samples_a,)) – samples weights in the source domain
b (array-like, shape (n_samples_b,)) – samples weights in the target domain
reg (float, optional) – Regularization term >0
rank (int, optional. Default is None. (>0)) – Nonnegative rank of the OT plan. If None, min(ns, nt) is considered.
alpha (int, optional. Default is 1e-10. (>0 and <1/r)) – Lower bound for the weight vector g.
rescale_cost (bool, optional. Default is False) – Rescale the low rank factorization of the sqeuclidean cost matrix
init (str, optional. Default is 'random'.) – Initialization strategy for the low rank couplings. ‘random’, ‘deterministic’ or ‘kmeans’
reg_init (float, optional. Default is 1e-1. (>0)) – Regularization term for a ‘kmeans’ init. If None, 1 is considered.
seed_init (int, optional. Default is 49. (>0)) – Random state for a ‘random’ or ‘kmeans’ init strategy.
gamma_init (str, optional. Default is "rescale".) – Initialization strategy for gamma. ‘rescale’, or ‘theory’ Gamma is a constant that scales the convergence criterion of the Mirror Descent optimization scheme used to compute the low-rank couplings (Q, R and g)
numItermax (int, optional. Default is 2000.) – Max number of iterations for the Dykstra algorithm
stopThr (float, optional. Default is 1e-7.) – Stop threshold on error (>0) in Dykstra
warn (bool, optional) – if True, raises a warning if the algorithm doesn’t convergence.
log (bool, optional) – record log if True

Returns:

Q (array-like, shape (n_samples_a, r)) – First low-rank matrix decomposition of the OT plan
R (array-like, shape (n_samples_b, r)) – Second low-rank matrix decomposition of the OT plan
g (array-like, shape (r, )) – Weight vector for the low-rank decomposition of the OT
log (dict (lazy_plan, value and value_linear)) – log dictionary return only if log==True in parameters

References

ot.lowrank.compute_lr_sqeuclidean_matrix(X_s, X_t, rescale_cost, nx=None)[source]

Compute the low rank decomposition of a squared euclidean distance matrix. This function won’t work for other distance metrics.

See “Section 3.5, proposition 1”

Parameters:

X_s (array-like, shape (n_samples_a, dim)) – samples in the source domain
X_t (array-like, shape (n_samples_b, dim)) – samples in the target domain
rescale_cost (bool) – Rescale the low rank factorization of the sqeuclidean cost matrix
nx (default None) – POT backend

Returns:

M1 (array-like, shape (n_samples_a, dim+2)) – First low rank decomposition of the distance matrix
M2 (array-like, shape (n_samples_b, dim+2)) – Second low rank decomposition of the distance matrix

References

ot.lowrank.lowrank_sinkhorn(X_s, X_t, a=None, b=None, reg=0, rank=None, alpha=1e-10, rescale_cost=True, init='random', reg_init=0.1, seed_init=49, gamma_init='rescale', numItermax=2000, stopThr=1e-07, warn=True, log=False)[source]

Solve the entropic regularization optimal transport problem under low-nonnegative rank constraints on the couplings.

The function solves the following optimization problem:

\[\mathop{\inf_{(\mathbf{Q},\mathbf{R},\mathbf{g}) \in \mathcal{C}(\mathbf{a},\mathbf{b},r)}} \langle \mathbf{C}, \mathbf{Q}\mathrm{diag}(1/\mathbf{g})\mathbf{R}^\top \rangle - \mathrm{reg} \cdot H((\mathbf{Q}, \mathbf{R}, \mathbf{g}))\]

where :

\(\mathbf{C}\) is the (dim_a, dim_b) metric cost matrix
\(H((\mathbf{Q}, \mathbf{R}, \mathbf{g}))\) is the values of the three respective entropies evaluated for each term.
\(\mathbf{Q}\) and \(\mathbf{R}\) are the low-rank matrix decomposition of the OT plan
\(\mathbf{g}\) is the weight vector for the low-rank decomposition of the OT plan
\(\mathbf{a}\) and \(\mathbf{b}\) are source and target weights (histograms, both sum to 1)
\(r\) is the rank of the OT plan
\(\mathcal{C}(\mathbf{a}, \mathbf{b}, r)\) are the low-rank couplings of the OT problem

Parameters:

X_s (array-like, shape (n_samples_a, dim)) – samples in the source domain
X_t (array-like, shape (n_samples_b, dim)) – samples in the target domain
a (array-like, shape (n_samples_a,)) – samples weights in the source domain
b (array-like, shape (n_samples_b,)) – samples weights in the target domain
reg (float, optional) – Regularization term >0
rank (int, optional. Default is None. (>0)) – Nonnegative rank of the OT plan. If None, min(ns, nt) is considered.
alpha (int, optional. Default is 1e-10. (>0 and <1/r)) – Lower bound for the weight vector g.
rescale_cost (bool, optional. Default is False) – Rescale the low rank factorization of the sqeuclidean cost matrix
init (str, optional. Default is 'random'.) – Initialization strategy for the low rank couplings. ‘random’, ‘deterministic’ or ‘kmeans’
reg_init (float, optional. Default is 1e-1. (>0)) – Regularization term for a ‘kmeans’ init. If None, 1 is considered.
seed_init (int, optional. Default is 49. (>0)) – Random state for a ‘random’ or ‘kmeans’ init strategy.
gamma_init (str, optional. Default is "rescale".) – Initialization strategy for gamma. ‘rescale’, or ‘theory’ Gamma is a constant that scales the convergence criterion of the Mirror Descent optimization scheme used to compute the low-rank couplings (Q, R and g)
numItermax (int, optional. Default is 2000.) – Max number of iterations for the Dykstra algorithm
stopThr (float, optional. Default is 1e-7.) – Stop threshold on error (>0) in Dykstra
warn (bool, optional) – if True, raises a warning if the algorithm doesn’t convergence.
log (bool, optional) – record log if True

Returns:

Q (array-like, shape (n_samples_a, r)) – First low-rank matrix decomposition of the OT plan
R (array-like, shape (n_samples_b, r)) – Second low-rank matrix decomposition of the OT plan
g (array-like, shape (r, )) – Weight vector for the low-rank decomposition of the OT
log (dict (lazy_plan, value and value_linear)) – log dictionary return only if log==True in parameters

References