Stochastic examples
This example shows how to use the stochastic optimization algorithms in the POT library for discrete and semi-continuous measures.
[18] Genevay, A., Cuturi, M., Peyré, G. & Bach, F. Stochastic Optimization for Large-scale Optimal Transport. Advances in Neural Information Processing Systems (2016).
[19] Seguy, V., Bhushan Damodaran, B., Flamary, R., Courty, N., Rolet, A. & Blondel, M. Large-scale Optimal Transport and Mapping Estimation. International Conference on Learning Representations (2018).
# Author: Kilian Fatras <kilian.fatras@gmail.com>
#
# License: MIT License
import matplotlib.pylab as pl
import numpy as np
import ot
import ot.plot
Compute the Transportation Matrix for the Semi-Dual Problem
Discrete case
Sample two discrete measures and compute their cost matrix M.
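The setup code for this step is not shown here. As a minimal sketch, the block below builds the inputs with plain NumPy: uniform weights for both measures and a squared Euclidean cost matrix, which is the default metric of `ot.dist`. The sizes, the seed, and `reg` mirror the setup used in the dual section of this example. `numItermax = 1000` is an assumed value, not taken from the text.

```python
import numpy as np

n_source = 7
n_target = 4
reg = 1
numItermax = 1000  # assumption: the iteration budget is not given in the text

# Uniform weights on the source and target points (what ot.utils.unif returns).
a = np.full(n_source, 1.0 / n_source)
b = np.full(n_target, 1.0 / n_target)

# Support points: 2D Gaussian samples drawn with a fixed seed.
rng = np.random.RandomState(0)
X_source = rng.randn(n_source, 2)
Y_target = rng.randn(n_target, 2)

# Squared Euclidean cost, M[i, j] = ||X_source[i] - Y_target[j]||^2.
diff = X_source[:, None, :] - Y_target[None, :, :]
M = np.sum(diff**2, axis=2)

print(M.shape)
```

The resulting `a`, `b`, `M`, `reg`, and `numItermax` have the shapes and meaning that the solver calls in this example expect.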
Call the “SAG” method to find the transportation matrix in the discrete case
method = "SAG"
sag_pi = ot.stochastic.solve_semi_dual_entropic(a, b, M, reg, method, numItermax)
print(sag_pi)
[[2.55553509e-02 9.96395660e-02 1.76579142e-02 4.31178196e-06]
[1.21640234e-01 1.25357448e-02 1.30225078e-03 7.37891338e-03]
[3.56123975e-03 7.61451746e-02 6.31505947e-02 1.33831456e-07]
[2.61515202e-02 3.34246014e-02 8.28734709e-02 4.07550428e-04]
[9.85500870e-03 7.52288517e-04 1.08262628e-02 1.21423583e-01]
[2.16904253e-02 9.03825797e-04 1.87178503e-03 1.18391107e-01]
[4.15462212e-02 2.65987989e-02 7.23177216e-02 2.39440107e-03]]
Semi-continuous case
Sample one general measure a and one discrete measure b for the semi-continuous case, define the points that support the source and target measures, and compute the cost matrix M.
Call the “ASGD” method to find the transportation matrix in the semicontinuous case.
[3.76510592 7.64094845 3.78917596 2.57007572 1.65543745 3.4893295
2.70623359] [-2.50319213 -2.25852474 -0.82688144 5.5885983 ]
[[2.19802712e-02 1.03838786e-01 1.70349712e-02 3.11402024e-06]
[1.20269164e-01 1.50177118e-02 1.44418382e-03 6.12608330e-03]
[3.05271739e-03 7.90868636e-02 6.07174656e-02 9.63289956e-08]
[2.33574229e-02 3.61718564e-02 8.30222147e-02 3.05648858e-04]
[1.12749105e-02 1.04283861e-03 1.38926617e-02 1.16646732e-01]
[2.49295484e-02 1.25865775e-03 2.41297662e-03 1.14255960e-01]
[3.78279732e-02 2.93440562e-02 7.38545201e-02 1.83059335e-03]]
Compare the results with the Sinkhorn algorithm
sinkhorn_pi = ot.sinkhorn(a, b, M, reg)
print(sinkhorn_pi)
[[2.55553508e-02 9.96395661e-02 1.76579142e-02 4.31178193e-06]
[1.21640234e-01 1.25357448e-02 1.30225079e-03 7.37891333e-03]
[3.56123974e-03 7.61451746e-02 6.31505947e-02 1.33831455e-07]
[2.61515201e-02 3.34246014e-02 8.28734709e-02 4.07550425e-04]
[9.85500876e-03 7.52288523e-04 1.08262629e-02 1.21423583e-01]
[2.16904255e-02 9.03825804e-04 1.87178504e-03 1.18391107e-01]
[4.15462212e-02 2.65987989e-02 7.23177217e-02 2.39440105e-03]]
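To put a number on the comparison, the sketch below copies the three matrices printed in this section by hand and reports the largest entrywise gap between each stochastic plan and the Sinkhorn plan. The arrays are transcriptions of the printed output, not fresh solver results.

```python
import numpy as np

sag_pi = np.array([
    [2.55553509e-02, 9.96395660e-02, 1.76579142e-02, 4.31178196e-06],
    [1.21640234e-01, 1.25357448e-02, 1.30225078e-03, 7.37891338e-03],
    [3.56123975e-03, 7.61451746e-02, 6.31505947e-02, 1.33831456e-07],
    [2.61515202e-02, 3.34246014e-02, 8.28734709e-02, 4.07550428e-04],
    [9.85500870e-03, 7.52288517e-04, 1.08262628e-02, 1.21423583e-01],
    [2.16904253e-02, 9.03825797e-04, 1.87178503e-03, 1.18391107e-01],
    [4.15462212e-02, 2.65987989e-02, 7.23177216e-02, 2.39440107e-03],
])
asgd_pi = np.array([
    [2.19802712e-02, 1.03838786e-01, 1.70349712e-02, 3.11402024e-06],
    [1.20269164e-01, 1.50177118e-02, 1.44418382e-03, 6.12608330e-03],
    [3.05271739e-03, 7.90868636e-02, 6.07174656e-02, 9.63289956e-08],
    [2.33574229e-02, 3.61718564e-02, 8.30222147e-02, 3.05648858e-04],
    [1.12749105e-02, 1.04283861e-03, 1.38926617e-02, 1.16646732e-01],
    [2.49295484e-02, 1.25865775e-03, 2.41297662e-03, 1.14255960e-01],
    [3.78279732e-02, 2.93440562e-02, 7.38545201e-02, 1.83059335e-03],
])
sinkhorn_pi = np.array([
    [2.55553508e-02, 9.96395661e-02, 1.76579142e-02, 4.31178193e-06],
    [1.21640234e-01, 1.25357448e-02, 1.30225079e-03, 7.37891333e-03],
    [3.56123974e-03, 7.61451746e-02, 6.31505947e-02, 1.33831455e-07],
    [2.61515201e-02, 3.34246014e-02, 8.28734709e-02, 4.07550425e-04],
    [9.85500876e-03, 7.52288523e-04, 1.08262629e-02, 1.21423583e-01],
    [2.16904255e-02, 9.03825804e-04, 1.87178504e-03, 1.18391107e-01],
    [4.15462212e-02, 2.65987989e-02, 7.23177217e-02, 2.39440105e-03],
])

# Largest absolute entrywise difference to the Sinkhorn plan.
sag_gap = np.abs(sag_pi - sinkhorn_pi).max()
asgd_gap = np.abs(asgd_pi - sinkhorn_pi).max()
print("SAG  vs Sinkhorn:", sag_gap)
print("ASGD vs Sinkhorn:", asgd_gap)
```

On these printed values, SAG agrees with Sinkhorn to the displayed precision, while ASGD already differs in the third significant digit of the larger entries.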
Plot Transportation Matrices
For SAG
pl.figure(4, figsize=(5, 5))
ot.plot.plot1D_mat(a, b, sag_pi, "semi-dual : OT matrix SAG")
pl.show()
For ASGD
pl.figure(4, figsize=(5, 5))
ot.plot.plot1D_mat(a, b, asgd_pi, "semi-dual : OT matrix ASGD")
pl.show()
For Sinkhorn
pl.figure(4, figsize=(5, 5))
ot.plot.plot1D_mat(a, b, sinkhorn_pi, "OT matrix Sinkhorn")
pl.show()
Compute the Transportation Matrix for the Dual Problem
Semi-continuous case
Sample one general measure a and one discrete measure b for the semi-continuous case, and compute the cost matrix M.
n_source = 7
n_target = 4
reg = 1
numItermax = 100000
lr = 0.1
batch_size = 3
log = True
a = ot.utils.unif(n_source)
b = ot.utils.unif(n_target)
rng = np.random.RandomState(0)
X_source = rng.randn(n_source, 2)
Y_target = rng.randn(n_target, 2)
M = ot.dist(X_source, Y_target)
Call the “SGD” dual method to find the transportation matrix in the semi-continuous case
sgd_dual_pi, log_sgd = ot.stochastic.solve_dual_entropic(
a, b, M, reg, batch_size, numItermax, lr, log=log
)
print(log_sgd["alpha"], log_sgd["beta"])
print(sgd_dual_pi)
[0.91732819 2.7799397 1.07406199 0.01970121 0.60717156 1.80910257
0.10902398] [0.34639291 0.47463643 1.57482501 4.92047485]
[[2.20200322e-02 9.25938748e-02 1.09047347e-02 9.25518158e-08]
[1.60917795e-02 1.78850969e-03 1.23469888e-04 2.43170724e-05]
[3.49209980e-03 8.05271170e-02 4.43815515e-02 3.26915633e-09]
[3.15043415e-02 4.34264205e-02 7.15531236e-02 1.22305749e-05]
[6.82992713e-02 5.62286712e-03 5.37746045e-02 2.09630346e-02]
[8.02712798e-02 3.60737409e-03 4.96463916e-03 1.09144850e-02]
[4.86875958e-02 3.36173252e-02 6.07394894e-02 6.98997703e-05]]
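The printed potentials and the printed plan are linked by the entropic primal-dual relation pi[i, j] = a[i] * b[j] * exp((alpha[i] + beta[j] - M[i, j]) / reg). As a quick consistency check, the sketch below rebuilds the plan from the printed `alpha` and `beta` using NumPy alone. The cost matrix is recomputed as a squared Euclidean distance, which is the default metric of `ot.dist`. The potentials and the reference row are copied by hand from the output above.

```python
import numpy as np

reg = 1
a = np.full(7, 1 / 7)
b = np.full(4, 1 / 4)

# Same seed and draw order as the setup of this section.
rng = np.random.RandomState(0)
X_source = rng.randn(7, 2)
Y_target = rng.randn(4, 2)
M = np.sum((X_source[:, None, :] - Y_target[None, :, :]) ** 2, axis=2)

# Potentials copied from the printed log_sgd["alpha"] and log_sgd["beta"].
alpha = np.array([0.91732819, 2.7799397, 1.07406199, 0.01970121,
                  0.60717156, 1.80910257, 0.10902398])
beta = np.array([0.34639291, 0.47463643, 1.57482501, 4.92047485])

# Plan implied by the dual potentials.
pi_rebuilt = (np.exp((alpha[:, None] + beta[None, :] - M) / reg)
              * a[:, None] * b[None, :])

# First row of the printed sgd_dual_pi, for comparison.
printed_first_row = np.array([2.20200322e-02, 9.25938748e-02,
                              1.09047347e-02, 9.25518158e-08])
print(pi_rebuilt[0])
print(np.allclose(pi_rebuilt[0], printed_first_row, rtol=1e-5))
```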
Compare the results with the Sinkhorn algorithm
Call the Sinkhorn algorithm from POT
sinkhorn_pi = ot.sinkhorn(a, b, M, reg)
print(sinkhorn_pi)
[[2.55553508e-02 9.96395661e-02 1.76579142e-02 4.31178193e-06]
[1.21640234e-01 1.25357448e-02 1.30225079e-03 7.37891333e-03]
[3.56123974e-03 7.61451746e-02 6.31505947e-02 1.33831455e-07]
[2.61515201e-02 3.34246014e-02 8.28734709e-02 4.07550425e-04]
[9.85500876e-03 7.52288523e-04 1.08262629e-02 1.21423583e-01]
[2.16904255e-02 9.03825804e-04 1.87178504e-03 1.18391107e-01]
[4.15462212e-02 2.65987989e-02 7.23177217e-02 2.39440105e-03]]
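A transport plan should have row sums equal to `a` and column sums equal to `b`, so marginal violations are a simple way to see how far the SGD result is from the Sinkhorn one. The helper below (the name `marginal_errors` is made up for this sketch) measures the largest violation, applied to the SGD matrix printed in this section and copied by hand.

```python
import numpy as np


def marginal_errors(pi, a, b):
    """Return the largest row-sum and column-sum deviation from a and b."""
    row_err = np.abs(pi.sum(axis=1) - a).max()
    col_err = np.abs(pi.sum(axis=0) - b).max()
    return row_err, col_err


a = np.full(7, 1 / 7)
b = np.full(4, 1 / 4)

# Transcription of the printed sgd_dual_pi.
sgd_dual_pi = np.array([
    [2.20200322e-02, 9.25938748e-02, 1.09047347e-02, 9.25518158e-08],
    [1.60917795e-02, 1.78850969e-03, 1.23469888e-04, 2.43170724e-05],
    [3.49209980e-03, 8.05271170e-02, 4.43815515e-02, 3.26915633e-09],
    [3.15043415e-02, 4.34264205e-02, 7.15531236e-02, 1.22305749e-05],
    [6.82992713e-02, 5.62286712e-03, 5.37746045e-02, 2.09630346e-02],
    [8.02712798e-02, 3.60737409e-03, 4.96463916e-03, 1.09144850e-02],
    [4.86875958e-02, 3.36173252e-02, 6.07394894e-02, 6.98997703e-05],
])

row_err, col_err = marginal_errors(sgd_dual_pi, a, b)
print("max row marginal error:", row_err)
print("max column marginal error:", col_err)
print("total mass:", sgd_dual_pi.sum())
```

For the printed matrix, the second row and the last column fall well short of their targets of 1/7 and 1/4, and the total mass is roughly 0.81 rather than 1. This suggests the dual SGD run shown here had not converged, which is worth keeping in mind when reading the plots.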
Plot Transportation Matrices
For SGD
pl.figure(4, figsize=(5, 5))
ot.plot.plot1D_mat(a, b, sgd_dual_pi, "dual : OT matrix SGD")
pl.show()
For Sinkhorn
pl.figure(4, figsize=(5, 5))
ot.plot.plot1D_mat(a, b, sinkhorn_pi, "OT matrix Sinkhorn")
pl.show()