ot.stochastic

Stochastic solvers for regularized OT.

Functions

ot.stochastic.averaged_sgd_entropic_transport(a, b, M, reg, numItermax=300000, lr=None)[source]

Compute the ASGD algorithm to solve the regularized semi continous measures optimal transport max problem

The function solves the following optimization problem:

\[ \begin{align}\begin{aligned}\gamma = \mathop{\arg \min}_\gamma \quad \langle \gamma, \mathbf{M} \rangle_F + \mathrm{reg}\cdot\Omega(\gamma)\\s.t. \gamma \mathbf{1} = \mathbf{a}\\ \gamma^T \mathbf{1} = \mathbf{b}\\ \gamma \geq 0\end{aligned}\end{align} \]

Where :

  • \(\mathbf{M}\) is the (ns, nt) metric cost matrix

  • \(\Omega\) is the entropic regularization term with \(\Omega(\gamma)=\sum_{i,j} \gamma_{i,j}\log(\gamma_{i,j})\)

  • \(\mathbf{a}\) and \(\mathbf{b}\) are source and target weights (sum to 1)

The algorithm used for solving the problem is the ASGD algorithm as proposed in [18] [alg.2]

Parameters
  • b (ndarray, shape (nt,)) – target measure

  • M (ndarray, shape (ns, nt)) – cost matrix

  • reg (float) – Regularization term > 0

  • numItermax (int) – Number of iteration.

  • lr (float) – Learning rate.

Returns

ave_v – dual variable

Return type

ndarray, shape (nt,)

Examples

>>> import ot
>>> np.random.seed(0)
>>> n_source = 7
>>> n_target = 4
>>> a = ot.utils.unif(n_source)
>>> b = ot.utils.unif(n_target)
>>> X_source = np.random.randn(n_source, 2)
>>> Y_target = np.random.randn(n_target, 2)
>>> M = ot.dist(X_source, Y_target)
>>> ot.stochastic.solve_semi_dual_entropic(a, b, M, reg=1, method="ASGD", numItermax=300000)
array([[2.53942342e-02, 9.98640673e-02, 1.75945647e-02, 4.27664307e-06],
       [1.21556999e-01, 1.26350515e-02, 1.30491795e-03, 7.36017394e-03],
       [3.54070702e-03, 7.63581358e-02, 6.29581672e-02, 1.32812798e-07],
       [2.60578198e-02, 3.35916645e-02, 8.28023223e-02, 4.05336238e-04],
       [9.86808864e-03, 7.59774324e-04, 1.08702729e-02, 1.21359007e-01],
       [2.17218856e-02, 9.12931802e-04, 1.87962526e-03, 1.18342700e-01],
       [4.14237512e-02, 2.67487857e-02, 7.23016955e-02, 2.38291052e-03]])

References

18

Genevay, A., Cuturi, M., Peyré, G. & Bach, F. (2016) Stochastic Optimization for Large-scale Optimal Transport. Advances in Neural Information Processing Systems (2016).

ot.stochastic.batch_grad_dual(a, b, M, reg, alpha, beta, batch_size, batch_alpha, batch_beta)[source]

Computes the partial gradient of the dual optimal transport problem.

For each \((i,j)\) in a batch of coordinates, the partial gradients are :

\[ \begin{align}\begin{aligned}\partial_{\mathbf{u}_i} F = \frac{b_s}{l_v} \mathbf{u}_i - \sum_{j \in B_v} \mathbf{a}_i \mathbf{b}_j \exp\left( \frac{\mathbf{u}_i + \mathbf{v}_j - \mathbf{M}_{i,j}}{\mathrm{reg}} \right)\\\partial_{\mathbf{v}_j} F = \frac{b_s}{l_u} \mathbf{v}_j - \sum_{i \in B_u} \mathbf{a}_i \mathbf{b}_j \exp\left( \frac{\mathbf{u}_i + \mathbf{v}_j - \mathbf{M}_{i,j}}{\mathrm{reg}} \right)\end{aligned}\end{align} \]

Where :

  • \(\mathbf{M}\) is the (ns, nt) metric cost matrix

  • \(\mathbf{u}\), \(\mathbf{v}\) are dual variables in \(\mathbb{R}^{ns} \times \mathbb{R}^{nt}\)

  • reg is the regularization term

  • \(B_u\) and \(B_v\) are lists of index

  • \(b_s\) is the size of the batches \(B_u\) and \(B_v\)

  • \(l_u\) and \(l_v\) are the lengths of \(B_u\) and \(B_v\)

  • \(\mathbf{a}\) and \(\mathbf{b}\) are source and target weights (sum to 1)

The algorithm used for solving the dual problem is the SGD algorithm as proposed in [19] [alg.1]

Parameters
  • a (ndarray, shape (ns,)) – source measure

  • b (ndarray, shape (nt,)) – target measure

  • M (ndarray, shape (ns, nt)) – cost matrix

  • reg (float) – Regularization term > 0

  • alpha (ndarray, shape (ns,)) – dual variable

  • beta (ndarray, shape (nt,)) – dual variable

  • batch_size (int) – size of the batch

  • batch_alpha (ndarray, shape (bs,)) – batch of index of alpha

  • batch_beta (ndarray, shape (bs,)) – batch of index of beta

Returns

grad – partial grad F

Return type

ndarray, shape (ns,)

Examples

>>> import ot
>>> np.random.seed(0)
>>> n_source = 7
>>> n_target = 4
>>> a = ot.utils.unif(n_source)
>>> b = ot.utils.unif(n_target)
>>> X_source = np.random.randn(n_source, 2)
>>> Y_target = np.random.randn(n_target, 2)
>>> M = ot.dist(X_source, Y_target)
>>> sgd_dual_pi, log = ot.stochastic.solve_dual_entropic(a, b, M, reg=1, batch_size=3, numItermax=30000, lr=0.1, log=True)
>>> log['alpha']
array([0.71759102, 1.57057384, 0.85576566, 0.1208211 , 0.59190466,
       1.197148  , 0.17805133])
>>> log['beta']
array([0.49741367, 0.57478564, 1.40075528, 2.75890102])
>>> sgd_dual_pi
array([[2.09730063e-02, 8.38169324e-02, 7.50365455e-03, 8.72731415e-09],
       [5.58432437e-03, 5.89881299e-04, 3.09558411e-05, 8.35469849e-07],
       [3.26489515e-03, 7.15536035e-02, 2.99778211e-02, 3.02601593e-10],
       [4.05390622e-02, 5.31085068e-02, 6.65191787e-02, 1.55812785e-06],
       [7.82299812e-02, 6.12099102e-03, 4.44989098e-02, 2.37719187e-03],
       [5.06266486e-02, 2.16230494e-03, 2.26215141e-03, 6.81514609e-04],
       [6.06713990e-02, 3.98139808e-02, 5.46829338e-02, 8.62371424e-06]])

References

19

Seguy, V., Bhushan Damodaran, B., Flamary, R., Courty, N., Rolet, A.& Blondel, M. Large-scale Optimal Transport and Mapping Estimation. International Conference on Learning Representation (2018)

ot.stochastic.c_transform_entropic(b, M, reg, beta)[source]

The goal is to recover u from the c-transform.

The function computes the c-transform of a dual variable from the other dual variable:

\[\mathbf{u} = \mathbf{v}^{c,reg} = - \mathrm{reg} \sum_j \mathbf{b}_j \exp\left( \frac{\mathbf{v} - \mathbf{M}}{\mathrm{reg}} \right)\]

Where :

  • \(\mathbf{M}\) is the (ns, nt) metric cost matrix

  • \(\mathbf{u}\), \(\mathbf{v}\) are dual variables in \(\mathbb{R}^{ns} \times \mathbb{R}^{nt}\)

  • reg is the regularization term

It is used to recover an optimal u from optimal v solving the semi dual problem, see Proposition 2.1 of [18]

Parameters
  • b (ndarray, shape (nt,)) – Target measure

  • M (ndarray, shape (ns, nt)) – Cost matrix

  • reg (float) – Regularization term > 0

  • v (ndarray, shape (nt,)) – Dual variable.

Returns

u – Dual variable.

Return type

ndarray, shape (ns,)

Examples

>>> import ot
>>> np.random.seed(0)
>>> n_source = 7
>>> n_target = 4
>>> a = ot.utils.unif(n_source)
>>> b = ot.utils.unif(n_target)
>>> X_source = np.random.randn(n_source, 2)
>>> Y_target = np.random.randn(n_target, 2)
>>> M = ot.dist(X_source, Y_target)
>>> ot.stochastic.solve_semi_dual_entropic(a, b, M, reg=1, method="ASGD", numItermax=300000)
array([[2.53942342e-02, 9.98640673e-02, 1.75945647e-02, 4.27664307e-06],
       [1.21556999e-01, 1.26350515e-02, 1.30491795e-03, 7.36017394e-03],
       [3.54070702e-03, 7.63581358e-02, 6.29581672e-02, 1.32812798e-07],
       [2.60578198e-02, 3.35916645e-02, 8.28023223e-02, 4.05336238e-04],
       [9.86808864e-03, 7.59774324e-04, 1.08702729e-02, 1.21359007e-01],
       [2.17218856e-02, 9.12931802e-04, 1.87962526e-03, 1.18342700e-01],
       [4.14237512e-02, 2.67487857e-02, 7.23016955e-02, 2.38291052e-03]])

References

18

Genevay, A., Cuturi, M., Peyré, G. & Bach, F. (2016) Stochastic Optimization for Large-scale Optimal Transport. Advances in Neural Information Processing Systems (2016).

ot.stochastic.coordinate_grad_semi_dual(b, M, reg, beta, i)[source]

Compute the coordinate gradient update for regularized discrete distributions for \((i, :)\)

The function computes the gradient of the semi dual problem:

\[\max_\mathbf{v} \ \sum_i \mathbf{a}_i \left[ \sum_j \mathbf{v}_j \mathbf{b}_j - \mathrm{reg} \cdot \log \left( \sum_j \mathbf{b}_j \exp \left( \frac{\mathbf{v}_j - \mathbf{M}_{i,j}}{\mathrm{reg}} \right) \right) \right]\]

Where :

  • \(\mathbf{M}\) is the (ns, nt) metric cost matrix

  • \(\mathbf{v}\) is a dual variable in \(\mathbb{R}^{nt}\)

  • reg is the regularization term

  • \(\mathbf{a}\) and \(\mathbf{b}\) are source and target weights (sum to 1)

The algorithm used for solving the problem is the ASGD & SAG algorithms as proposed in [18] [alg.1 & alg.2]

Parameters
  • b (ndarray, shape (nt,)) – Target measure.

  • M (ndarray, shape (ns, nt)) – Cost matrix.

  • reg (float) – Regularization term > 0.

  • v (ndarray, shape (nt,)) – Dual variable.

  • i (int) – Picked number i.

Returns

coordinate gradient

Return type

ndarray, shape (nt,)

Examples

>>> import ot
>>> np.random.seed(0)
>>> n_source = 7
>>> n_target = 4
>>> a = ot.utils.unif(n_source)
>>> b = ot.utils.unif(n_target)
>>> X_source = np.random.randn(n_source, 2)
>>> Y_target = np.random.randn(n_target, 2)
>>> M = ot.dist(X_source, Y_target)
>>> ot.stochastic.solve_semi_dual_entropic(a, b, M, reg=1, method="ASGD", numItermax=300000)
array([[2.53942342e-02, 9.98640673e-02, 1.75945647e-02, 4.27664307e-06],
       [1.21556999e-01, 1.26350515e-02, 1.30491795e-03, 7.36017394e-03],
       [3.54070702e-03, 7.63581358e-02, 6.29581672e-02, 1.32812798e-07],
       [2.60578198e-02, 3.35916645e-02, 8.28023223e-02, 4.05336238e-04],
       [9.86808864e-03, 7.59774324e-04, 1.08702729e-02, 1.21359007e-01],
       [2.17218856e-02, 9.12931802e-04, 1.87962526e-03, 1.18342700e-01],
       [4.14237512e-02, 2.67487857e-02, 7.23016955e-02, 2.38291052e-03]])

References

18

Genevay, A., Cuturi, M., Peyré, G. & Bach, F. (2016) Stochastic Optimization for Large-scale Optimal Transport. Advances in Neural Information Processing Systems (2016).

ot.stochastic.loss_dual_entropic(u, v, xs, xt, reg=1, ws=None, wt=None, metric='sqeuclidean')[source]

Compute the dual loss of the entropic OT as in equation (6)-(7) of [19]

This loss is backend compatible and can be used for stochastic optimization of the dual potentials. It can be used on the full dataset (beware of memory) or on minibatches.

Parameters
  • u (array-like, shape (ns,)) – Source dual potential

  • v (array-like, shape (nt,)) – Target dual potential

  • xs (array-like, shape (ns,d)) – Source samples

  • xt (array-like, shape (ns,d)) – Target samples

  • reg (float) – Regularization term > 0 (default=1)

  • ws (array-like, shape (ns,), optional) – Source sample weights (default unif)

  • wt (array-like, shape (ns,), optional) – Target sample weights (default unif)

  • metric (string, callable) – Ground metric for OT (default quadratic). Can be given as a callable function taking (xs,xt) as parameters.

Returns

dual_loss – Dual loss (to maximize)

Return type

array-like

References

19

Seguy, V., Bhushan Damodaran, B., Flamary, R., Courty, N., Rolet, A.& Blondel, M. Large-scale Optimal Transport and Mapping Estimation. International Conference on Learning Representation (2018)

Examples using ot.stochastic.loss_dual_entropic

ot.stochastic.loss_dual_quadratic(u, v, xs, xt, reg=1, ws=None, wt=None, metric='sqeuclidean')[source]

Compute the dual loss of the quadratic regularized OT as in equation (6)-(7) of [19]

This loss is backend compatible and can be used for stochastic optimization of the dual potentials. It can be used on the full dataset (beware of memory) or on minibatches.

Parameters
  • u (array-like, shape (ns,)) – Source dual potential

  • v (array-like, shape (nt,)) – Target dual potential

  • xs (array-like, shape (ns,d)) – Source samples

  • xt (array-like, shape (ns,d)) – Target samples

  • reg (float) – Regularization term > 0 (default=1)

  • ws (array-like, shape (ns,), optional) – Source sample weights (default unif)

  • wt (array-like, shape (ns,), optional) – Target sample weights (default unif)

  • metric (string, callable) – Ground metric for OT (default quadratic). Can be given as a callable function taking (xs,xt) as parameters.

Returns

dual_loss – Dual loss (to maximize)

Return type

array-like

References

19

Seguy, V., Bhushan Damodaran, B., Flamary, R., Courty, N., Rolet, A.& Blondel, M. Large-scale Optimal Transport and Mapping Estimation. International Conference on Learning Representation (2018)

Examples using ot.stochastic.loss_dual_quadratic

ot.stochastic.plan_dual_entropic(u, v, xs, xt, reg=1, ws=None, wt=None, metric='sqeuclidean')[source]

Compute the primal OT plan the entropic OT as in equation (8) of [19]

This loss is backend compatible and can be used for stochastic optimization of the dual potentials. It can be used on the full dataset (beware of memory) or on minibatches.

Parameters
  • u (array-like, shape (ns,)) – Source dual potential

  • v (array-like, shape (nt,)) – Target dual potential

  • xs (array-like, shape (ns,d)) – Source samples

  • xt (array-like, shape (ns,d)) – Target samples

  • reg (float) – Regularization term > 0 (default=1)

  • ws (array-like, shape (ns,), optional) – Source sample weights (default unif)

  • wt (array-like, shape (ns,), optional) – Target sample weights (default unif)

  • metric (string, callable) – Ground metric for OT (default quadratic). Can be given as a callable function taking (xs,xt) as parameters.

Returns

G – Primal OT plan

Return type

array-like

References

19

Seguy, V., Bhushan Damodaran, B., Flamary, R., Courty, N., Rolet, A.& Blondel, M. Large-scale Optimal Transport and Mapping Estimation. International Conference on Learning Representation (2018)

Examples using ot.stochastic.plan_dual_entropic

ot.stochastic.plan_dual_quadratic(u, v, xs, xt, reg=1, ws=None, wt=None, metric='sqeuclidean')[source]

Compute the primal OT plan the quadratic regularized OT as in equation (8) of [19]

This loss is backend compatible and can be used for stochastic optimization of the dual potentials. It can be used on the full dataset (beware of memory) or on minibatches.

Parameters
  • u (array-like, shape (ns,)) – Source dual potential

  • v (array-like, shape (nt,)) – Target dual potential

  • xs (array-like, shape (ns,d)) – Source samples

  • xt (array-like, shape (ns,d)) – Target samples

  • reg (float) – Regularization term > 0 (default=1)

  • ws (array-like, shape (ns,), optional) – Source sample weights (default unif)

  • wt (array-like, shape (ns,), optional) – Target sample weights (default unif)

  • metric (string, callable) – Ground metric for OT (default quadratic). Can be given as a callable function taking (xs,xt) as parameters.

Returns

G – Primal OT plan

Return type

array-like

References

19

Seguy, V., Bhushan Damodaran, B., Flamary, R., Courty, N., Rolet, A.& Blondel, M. Large-scale Optimal Transport and Mapping Estimation. International Conference on Learning Representation (2018)

Examples using ot.stochastic.plan_dual_quadratic

ot.stochastic.sag_entropic_transport(a, b, M, reg, numItermax=10000, lr=None)[source]

Compute the SAG algorithm to solve the regularized discrete measures optimal transport max problem

The function solves the following optimization problem:

\[ \begin{align}\begin{aligned}\gamma = \mathop{\arg \min}_\gamma \quad \langle \gamma, \mathbf{M} \rangle_F + \mathrm{reg} \cdot\Omega(\gamma)\\s.t. \ \gamma \mathbf{1} = \mathbf{a}\\ \gamma^T \mathbf{1} = \mathbf{b}\\ \gamma \geq 0\end{aligned}\end{align} \]

Where :

  • \(\mathbf{M}\) is the (ns, nt) metric cost matrix

  • \(\Omega\) is the entropic regularization term with \(\Omega(\gamma)=\sum_{i,j} \gamma_{i,j}\log(\gamma_{i,j})\)

  • \(\mathbf{a}\) and \(\mathbf{b}\) are source and target weights (sum to 1)

The algorithm used for solving the problem is the SAG algorithm as proposed in [18] [alg.1]

Parameters
  • a (ndarray, shape (ns,),) – Source measure.

  • b (ndarray, shape (nt,),) – Target measure.

  • M (ndarray, shape (ns, nt),) – Cost matrix.

  • reg (float) – Regularization term > 0

  • numItermax (int) – Number of iteration.

  • lr (float) – Learning rate.

Returns

v – Dual variable.

Return type

ndarray, shape (nt,)

Examples

>>> import ot
>>> np.random.seed(0)
>>> n_source = 7
>>> n_target = 4
>>> a = ot.utils.unif(n_source)
>>> b = ot.utils.unif(n_target)
>>> X_source = np.random.randn(n_source, 2)
>>> Y_target = np.random.randn(n_target, 2)
>>> M = ot.dist(X_source, Y_target)
>>> ot.stochastic.solve_semi_dual_entropic(a, b, M, reg=1, method="ASGD", numItermax=300000)
array([[2.53942342e-02, 9.98640673e-02, 1.75945647e-02, 4.27664307e-06],
       [1.21556999e-01, 1.26350515e-02, 1.30491795e-03, 7.36017394e-03],
       [3.54070702e-03, 7.63581358e-02, 6.29581672e-02, 1.32812798e-07],
       [2.60578198e-02, 3.35916645e-02, 8.28023223e-02, 4.05336238e-04],
       [9.86808864e-03, 7.59774324e-04, 1.08702729e-02, 1.21359007e-01],
       [2.17218856e-02, 9.12931802e-04, 1.87962526e-03, 1.18342700e-01],
       [4.14237512e-02, 2.67487857e-02, 7.23016955e-02, 2.38291052e-03]])

References

18

Genevay, A., Cuturi, M., Peyré, G. & Bach, F. (2016) Stochastic Optimization for Large-scale Optimal Transport. Advances in Neural Information Processing Systems (2016).

ot.stochastic.sgd_entropic_regularization(a, b, M, reg, batch_size, numItermax, lr)[source]

Compute the sgd algorithm to solve the regularized discrete measures optimal transport dual problem

The function solves the following optimization problem:

\[ \begin{align}\begin{aligned}\gamma = \mathop{\arg \min}_\gamma \quad \langle \gamma, \mathbf{M} \rangle_F + \mathrm{reg} \cdot\Omega(\gamma)\\s.t. \ \gamma \mathbf{1} = \mathbf{a}\\ \gamma^T \mathbf{1} = \mathbf{b}\\ \gamma \geq 0\end{aligned}\end{align} \]

Where :

  • \(\mathbf{M}\) is the (ns, nt) metric cost matrix

  • \(\Omega\) is the entropic regularization term with \(\Omega(\gamma)=\sum_{i,j} \gamma_{i,j}\log(\gamma_{i,j})\)

  • \(\mathbf{a}\) and \(\mathbf{b}\) are source and target weights (sum to 1)

Parameters
  • a (ndarray, shape (ns,)) – source measure

  • b (ndarray, shape (nt,)) – target measure

  • M (ndarray, shape (ns, nt)) – cost matrix

  • reg (float) – Regularization term > 0

  • batch_size (int) – size of the batch

  • numItermax (int) – number of iteration

  • lr (float) – learning rate

Returns

  • alpha (ndarray, shape (ns,)) – dual variable

  • beta (ndarray, shape (nt,)) – dual variable

Examples

>>> import ot
>>> n_source = 7
>>> n_target = 4
>>> reg = 1
>>> numItermax = 20000
>>> lr = 0.1
>>> batch_size = 3
>>> log = True
>>> a = ot.utils.unif(n_source)
>>> b = ot.utils.unif(n_target)
>>> rng = np.random.RandomState(0)
>>> X_source = rng.randn(n_source, 2)
>>> Y_target = rng.randn(n_target, 2)
>>> M = ot.dist(X_source, Y_target)
>>> sgd_dual_pi, log = ot.stochastic.solve_dual_entropic(a, b, M, reg, batch_size, numItermax, lr, log)
>>> log['alpha']
array([0.64171798, 1.27932201, 0.78132257, 0.15638935, 0.54888354,
       1.03663469, 0.20595781])
>>> log['beta']
array([0.51207194, 0.58033189, 1.28922676, 2.26859736])
>>> sgd_dual_pi
array([[1.97276541e-02, 7.81248547e-02, 6.22136048e-03, 4.95442423e-09],
       [4.23494310e-03, 4.43286263e-04, 2.06927079e-05, 3.82389139e-07],
       [3.07542414e-03, 6.67897769e-02, 2.48904999e-02, 1.72030247e-10],
       [4.26271990e-02, 5.53375455e-02, 6.16535024e-02, 9.88812650e-07],
       [7.60423265e-02, 5.89585256e-03, 3.81267087e-02, 1.39458256e-03],
       [4.37557504e-02, 1.85189176e-03, 1.72335760e-03, 3.55491279e-04],
       [6.33096109e-02, 4.11683954e-02, 5.02962051e-02, 5.43097516e-06]])

References

19

Seguy, V., Bhushan Damodaran, B., Flamary, R., Courty, N., Rolet, A.& Blondel, M. Large-scale Optimal Transport and Mapping Estimation. International Conference on Learning Representation (2018)

ot.stochastic.solve_dual_entropic(a, b, M, reg, batch_size, numItermax=10000, lr=1, log=False)[source]

Compute the transportation matrix to solve the regularized discrete measures optimal transport dual problem

The function solves the following optimization problem:

\[ \begin{align}\begin{aligned}\gamma = \mathop{\arg \min}_\gamma \quad \langle \gamma, \mathbf{M} \rangle_F + \mathrm{reg} \cdot\Omega(\gamma)\\s.t. \ \gamma \mathbf{1} = \mathbf{a}\\ \gamma^T \mathbf{1} = \mathbf{b}\\ \gamma \geq 0\end{aligned}\end{align} \]

Where :

  • \(\mathbf{M}\) is the (ns, nt) metric cost matrix

  • \(\Omega\) is the entropic regularization term with \(\Omega(\gamma)=\sum_{i,j} \gamma_{i,j}\log(\gamma_{i,j})\)

  • \(\mathbf{a}\) and \(\mathbf{b}\) are source and target weights (sum to 1)

Parameters
  • a (ndarray, shape (ns,)) – source measure

  • b (ndarray, shape (nt,)) – target measure

  • M (ndarray, shape (ns, nt)) – cost matrix

  • reg (float) – Regularization term > 0

  • batch_size (int) – size of the batch

  • numItermax (int) – number of iteration

  • lr (float) – learning rate

  • log (bool, optional) – record log if True

Returns

  • pi (ndarray, shape (ns, nt)) – transportation matrix

  • log (dict) – log dictionary return only if log==True in parameters

Examples

>>> import ot
>>> n_source = 7
>>> n_target = 4
>>> reg = 1
>>> numItermax = 20000
>>> lr = 0.1
>>> batch_size = 3
>>> log = True
>>> a = ot.utils.unif(n_source)
>>> b = ot.utils.unif(n_target)
>>> rng = np.random.RandomState(0)
>>> X_source = rng.randn(n_source, 2)
>>> Y_target = rng.randn(n_target, 2)
>>> M = ot.dist(X_source, Y_target)
>>> sgd_dual_pi, log = ot.stochastic.solve_dual_entropic(a, b, M, reg, batch_size, numItermax, lr, log)
>>> log['alpha']
array([0.64057733, 1.2683513 , 0.75610161, 0.16024284, 0.54926534,
       1.0514201 , 0.19958936])
>>> log['beta']
array([0.51372571, 0.58843489, 1.27993921, 2.24344807])
>>> sgd_dual_pi
array([[1.97377795e-02, 7.86706853e-02, 6.15682001e-03, 4.82586997e-09],
       [4.19566963e-03, 4.42016865e-04, 2.02777272e-05, 3.68823708e-07],
       [3.00379244e-03, 6.56562018e-02, 2.40462171e-02, 1.63579656e-10],
       [4.28626062e-02, 5.60031599e-02, 6.13193826e-02, 9.67977735e-07],
       [7.61972739e-02, 5.94609051e-03, 3.77886693e-02, 1.36046648e-03],
       [4.44810042e-02, 1.89476742e-03, 1.73285847e-03, 3.51826036e-04],
       [6.30118293e-02, 4.12398660e-02, 4.95148998e-02, 5.26247246e-06]])

References

19

Seguy, V., Bhushan Damodaran, B., Flamary, R., Courty, N., Rolet, A.& Blondel, M. Large-scale Optimal Transport and Mapping Estimation. International Conference on Learning Representation (2018)

Examples using ot.stochastic.solve_dual_entropic

ot.stochastic.solve_semi_dual_entropic(a, b, M, reg, method, numItermax=10000, lr=None, log=False)[source]

Compute the transportation matrix to solve the regularized discrete measures optimal transport max problem

The function solves the following optimization problem:

\[ \begin{align}\begin{aligned}\gamma = \mathop{\arg \min}_\gamma \quad \langle \gamma, \mathbf{M} \rangle_F + \mathrm{reg} \cdot\Omega(\gamma)\\s.t. \ \gamma \mathbf{1} = \mathbf{a}\\ \gamma^T \mathbf{1} = \mathbf{b}\\ \gamma \geq 0\end{aligned}\end{align} \]

Where :

  • \(\mathbf{M}\) is the (ns, nt) metric cost matrix

  • \(\Omega\) is the entropic regularization term with \(\Omega(\gamma)=\sum_{i,j} \gamma_{i,j}\log(\gamma_{i,j})\)

  • \(\mathbf{a}\) and \(\mathbf{b}\) are source and target weights (sum to 1)

The algorithm used for solving the problem is the SAG or ASGD algorithms as proposed in [18]

Parameters
  • a (ndarray, shape (ns,)) – source measure

  • b (ndarray, shape (nt,)) – target measure

  • M (ndarray, shape (ns, nt)) – cost matrix

  • reg (float) – Regularization term > 0

  • methode (str) – used method (SAG or ASGD)

  • numItermax (int) – number of iteration

  • lr (float) – learning rate

  • n_source (int) – size of the source measure

  • n_target (int) – size of the target measure

  • log (bool, optional) – record log if True

Returns

  • pi (ndarray, shape (ns, nt)) – transportation matrix

  • log (dict) – log dictionary return only if log==True in parameters

Examples

>>> import ot
>>> np.random.seed(0)
>>> n_source = 7
>>> n_target = 4
>>> a = ot.utils.unif(n_source)
>>> b = ot.utils.unif(n_target)
>>> X_source = np.random.randn(n_source, 2)
>>> Y_target = np.random.randn(n_target, 2)
>>> M = ot.dist(X_source, Y_target)
>>> ot.stochastic.solve_semi_dual_entropic(a, b, M, reg=1, method="ASGD", numItermax=300000)
array([[2.53942342e-02, 9.98640673e-02, 1.75945647e-02, 4.27664307e-06],
       [1.21556999e-01, 1.26350515e-02, 1.30491795e-03, 7.36017394e-03],
       [3.54070702e-03, 7.63581358e-02, 6.29581672e-02, 1.32812798e-07],
       [2.60578198e-02, 3.35916645e-02, 8.28023223e-02, 4.05336238e-04],
       [9.86808864e-03, 7.59774324e-04, 1.08702729e-02, 1.21359007e-01],
       [2.17218856e-02, 9.12931802e-04, 1.87962526e-03, 1.18342700e-01],
       [4.14237512e-02, 2.67487857e-02, 7.23016955e-02, 2.38291052e-03]])

References

18

Genevay, A., Cuturi, M., Peyré, G. & Bach, F. (2016) Stochastic Optimization for Large-scale Optimal Transport. Advances in Neural Information Processing Systems (2016).

Examples using ot.stochastic.solve_semi_dual_entropic