ot.partial

Partial OT solvers

Functions

ot.partial.entropic_partial_gromov_wasserstein(C1, C2, p, q, reg, m=None, G0=None, numItermax=1000, tol=1e-07, log=False, verbose=False)[source]

Returns the partial Gromov-Wasserstein transport between \((\mathbf{C_1}, \mathbf{p})\) and \((\mathbf{C_2}, \mathbf{q})\)

The function solves the following optimization problem:

\[\gamma = \mathop{\arg \min}_{\gamma} \quad \sum_{i,j,k,l} L(\mathbf{C_1}_{i,k}, \mathbf{C_2}_{j,l})\cdot \gamma_{i,j}\cdot\gamma_{k,l} + \mathrm{reg} \cdot\Omega(\gamma)\]

\[ \begin{align}\begin{aligned}s.t. \ \gamma &\geq 0\\ \gamma \mathbf{1} &\leq \mathbf{a}\\ \gamma^T \mathbf{1} &\leq \mathbf{b}\\ \mathbf{1}^T \gamma^T \mathbf{1} = m &\leq \min\{\|\mathbf{a}\|_1, \|\mathbf{b}\|_1\}\end{aligned}\end{align} \]

where :

\(\mathbf{C_1}\) is the metric cost matrix in the source space
\(\mathbf{C_2}\) is the metric cost matrix in the target space
\(\mathbf{p}\) and \(\mathbf{q}\) are the sample weights
L: quadratic loss function
\(\Omega\) is the entropic regularization term, \(\Omega=\sum_{i,j} \gamma_{i,j}\log(\gamma_{i,j})\)
m is the amount of mass to be transported

The formulation of the GW problem has been proposed in [12] and the partial GW in [29]

Note

This function will be deprecated in a near future, please use

ot.gromov.entropic_partial_gromov_wasserstein instead.

Parameters:

C1 (ndarray, shape (ns, ns)) – Metric cost matrix in the source space
C2 (ndarray, shape (nt, nt)) – Metric cost matrix in the target space
p (ndarray, shape (ns,)) – Distribution in the source space
q (ndarray, shape (nt,)) – Distribution in the target space
reg (float) – entropic regularization parameter
m (float, optional) – Amount of mass to be transported (default: \(\min\{\|\mathbf{p}\|_1, \|\mathbf{q}\|_1\}\))
G0 (ndarray, shape (ns, nt), optional) – Initialization of the transportation matrix
numItermax (int, optional) – Max number of iterations
tol (float, optional) – Stop threshold on error (>0)
log (bool, optional) – return log if True
verbose (bool, optional) – Print information along iterations

Examples

>>> import ot
>>> import scipy as sp
>>> a = np.array([0.25] * 4)
>>> b = np.array([0.25] * 4)
>>> x = np.array([1,2,100,200]).reshape((-1,1))
>>> y = np.array([3,2,98,199]).reshape((-1,1))
>>> C1 = sp.spatial.distance.cdist(x, x)
>>> C2 = sp.spatial.distance.cdist(y, y)
>>> np.round(entropic_partial_gromov_wasserstein(C1, C2, a, b, 50), 2)
array([[0.12, 0.13, 0.  , 0.  ],
       [0.13, 0.12, 0.  , 0.  ],
       [0.  , 0.  , 0.25, 0.  ],
       [0.  , 0.  , 0.  , 0.25]])
>>> np.round(entropic_partial_gromov_wasserstein(C1, C2, a, b, 50,0.25), 2)
array([[0.02, 0.03, 0.  , 0.03],
       [0.03, 0.03, 0.  , 0.03],
       [0.  , 0.  , 0.03, 0.  ],
       [0.02, 0.02, 0.  , 0.03]])

Returns:

math: gamma : (dim_a, dim_b) ndarray – Optimal transportation matrix for the given parameters
log (dict) – log dictionary returned only if log is True

References

See also

ot.partial.partial_gromov_wasserstein: exact Partial Gromov-Wasserstein

ot.partial.entropic_partial_gromov_wasserstein2(C1, C2, p, q, reg, m=None, G0=None, numItermax=1000, tol=1e-07, log=False, verbose=False)[source]

Returns the partial Gromov-Wasserstein discrepancy between \((\mathbf{C_1}, \mathbf{p})\) and \((\mathbf{C_2}, \mathbf{q})\)

The function solves the following optimization problem:

\[GW = \min_{\gamma} \quad \sum_{i,j,k,l} L(\mathbf{C_1}_{i,k}, \mathbf{C_2}_{j,l})\cdot \gamma_{i,j}\cdot\gamma_{k,l} + \mathrm{reg} \cdot\Omega(\gamma)\]

\[ \begin{align}\begin{aligned}s.t. \ \gamma &\geq 0\\ \gamma \mathbf{1} &\leq \mathbf{a}\\ \gamma^T \mathbf{1} &\leq \mathbf{b}\\ \mathbf{1}^T \gamma^T \mathbf{1} = m &\leq \min\{\|\mathbf{a}\|_1, \|\mathbf{b}\|_1\}\end{aligned}\end{align} \]

where :

\(\mathbf{C_1}\) is the metric cost matrix in the source space
\(\mathbf{C_2}\) is the metric cost matrix in the target space
\(\mathbf{p}\) and \(\mathbf{q}\) are the sample weights
L : quadratic loss function
\(\Omega\) is the entropic regularization term, \(\Omega=\sum_{i,j} \gamma_{i,j}\log(\gamma_{i,j})\)
m is the amount of mass to be transported

The formulation of the GW problem has been proposed in [12] and the partial GW in [29]

Note

This function will be deprecated in a near future, please use

ot.gromov.entropic_partial_gromov_wasserstein2 instead.

Parameters:

C1 (ndarray, shape (ns, ns)) – Metric cost matrix in the source space
C2 (ndarray, shape (nt, nt)) – Metric cost matrix in the target space
p (ndarray, shape (ns,)) – Distribution in the source space
q (ndarray, shape (nt,)) – Distribution in the target space
reg (float) – entropic regularization parameter
m (float, optional) – Amount of mass to be transported (default: \(\min\{\|\mathbf{p}\|_1, \|\mathbf{q}\|_1\}\))
G0 (ndarray, shape (ns, nt), optional) – Initialization of the transportation matrix
numItermax (int, optional) – Max number of iterations
tol (float, optional) – Stop threshold on error (>0)
log (bool, optional) – return log if True
verbose (bool, optional) – Print information along iterations

Returns:

partial_gw_dist (float) – Gromov-Wasserstein distance
log (dict) – log dictionary returned only if log is True

Examples

>>> import ot
>>> import scipy as sp
>>> a = np.array([0.25] * 4)
>>> b = np.array([0.25] * 4)
>>> x = np.array([1,2,100,200]).reshape((-1,1))
>>> y = np.array([3,2,98,199]).reshape((-1,1))
>>> C1 = sp.spatial.distance.cdist(x, x)
>>> C2 = sp.spatial.distance.cdist(y, y)
>>> np.round(entropic_partial_gromov_wasserstein2(C1, C2, a, b,50), 2)
1.87

References

ot.partial.entropic_partial_wasserstein(a, b, M, reg, m=None, numItermax=1000, stopThr=1e-100, verbose=False, log=False)[source]

Solves the partial optimal transport problem and returns the OT plan

The function considers the following problem:

\[ \begin{align}\begin{aligned}\gamma = \mathop{\arg \min}_\gamma \quad \langle \gamma, \mathbf{M} \rangle_F + \mathrm{reg} \cdot\Omega(\gamma)\\\begin{split}s.t. \gamma \mathbf{1} &\leq \mathbf{a} \\ \gamma^T \mathbf{1} &\leq \mathbf{b} \\ \gamma &\geq 0 \\ \mathbf{1}^T \gamma^T \mathbf{1} = m &\leq \min\{\|\mathbf{a}\|_1, \|\mathbf{b}\|_1\} \\\end{split}\end{aligned}\end{align} \]

where :

\(\mathbf{M}\) is the metric cost matrix
\(\Omega\) is the entropic regularization term, \(\Omega=\sum_{i,j} \gamma_{i,j}\log(\gamma_{i,j})\)
\(\mathbf{a}\) and \(\mathbf{b}\) are the sample weights
m is the amount of mass to be transported

The formulation of the problem has been proposed in [3] (prop. 5)

Parameters:

a (np.ndarray (dim_a,)) – Unnormalized histogram of dimension dim_a
b (np.ndarray (dim_b,)) – Unnormalized histograms of dimension dim_b
M (np.ndarray (dim_a, dim_b)) – cost matrix
reg (float) – Regularization term > 0
m (float, optional) – Amount of mass to be transported
numItermax (int, optional) – Max number of iterations
stopThr (float, optional) – Stop threshold on error (>0)
verbose (bool, optional) – Print information along iterations
log (bool, optional) – record log if True

Returns:

gamma ((dim_a, dim_b) ndarray) – Optimal transportation matrix for the given parameters
log (dict) – log dictionary returned only if log is True

Examples

>>> import ot
>>> a = [.1, .2]
>>> b = [.1, .1]
>>> M = [[0., 1.], [2., 3.]]
>>> np.round(entropic_partial_wasserstein(a, b, M, 1, 0.1), 2)
array([[0.06, 0.02],
       [0.01, 0.  ]])

References

See also

ot.partial.partial_wasserstein: exact Partial Wasserstein

Examples using `ot.partial.entropic_partial_wasserstein`

Partial Wasserstein and Gromov-Wasserstein example

ot.partial.gwgrad_partial(C1, C2, T)[source]

Compute the GW gradient. Note: we can not use the trick in [12] as the marginals may not sum to 1.

Note

This function will be deprecated in a near future, please use

ot.gromov.gwggrad instead.

Parameters:

C1 (array of shape (n_p,n_p)) – intra-source (P) cost matrix
C2 (array of shape (n_u,n_u)) – intra-target (U) cost matrix
T (array of shape(n_p+nb_dummies, n_u) (default: None)) – Transport matrix

Returns:

gradient

Return type:

numpy.array of shape (n_p+nb_dummies, n_u)

References

ot.partial.gwloss_partial(C1, C2, T)[source]

Compute the GW loss.

Note

This function will be deprecated in a near future, please use

ot.gromov.gwloss instead.

Parameters:

C1 (array of shape (n_p,n_p)) – intra-source (P) cost matrix
C2 (array of shape (n_u,n_u)) – intra-target (U) cost matrix
T (array of shape(n_p+nb_dummies, n_u) (default: None)) – Transport matrix

Return type:

GW loss

ot.partial.partial_gromov_wasserstein(C1, C2, p, q, m=None, nb_dummies=1, G0=None, thres=1, numItermax=1000, tol=1e-07, log=False, verbose=False, **kwargs)[source]

Solves the partial optimal transport problem and returns the OT plan

The function considers the following problem:

\[\gamma = \mathop{\arg \min}_\gamma \quad \langle \gamma, \mathbf{M} \rangle_F\]

\[ \begin{align}\begin{aligned}s.t. \ \gamma \mathbf{1} &\leq \mathbf{a}\\ \gamma^T \mathbf{1} &\leq \mathbf{b}\\ \gamma &\geq 0\\ \mathbf{1}^T \gamma^T \mathbf{1} = m &\leq \min\{\|\mathbf{a}\|_1, \|\mathbf{b}\|_1\}\end{aligned}\end{align} \]

where :

\(\mathbf{M}\) is the metric cost matrix
\(\Omega\) is the entropic regularization term, \(\Omega(\gamma) = \sum_{i,j} \gamma_{i,j}\log(\gamma_{i,j})\)
\(\mathbf{a}\) and \(\mathbf{b}\) are the sample weights
m is the amount of mass to be transported

The formulation of the problem has been proposed in [29]

Note

This function will be deprecated in a near future, please use

ot.gromov.partial_gromov_wasserstein instead.

Parameters:

C1 (ndarray, shape (ns, ns)) – Metric cost matrix in the source space
C2 (ndarray, shape (nt, nt)) – Metric costfr matrix in the target space
p (ndarray, shape (ns,)) – Distribution in the source space
q (ndarray, shape (nt,)) – Distribution in the target space
m (float, optional) – Amount of mass to be transported (default: \(\min\{\|\mathbf{p}\|_1, \|\mathbf{q}\|_1\}\))
nb_dummies (int, optional) – Number of dummy points to add (avoid instabilities in the EMD solver)
G0 (ndarray, shape (ns, nt), optional) – Initialization of the transportation matrix
thres (float, optional) – quantile of the gradient matrix to populate the cost matrix when 0 (default: 1)
numItermax (int, optional) – Max number of iterations
tol (float, optional) – tolerance for stopping iterations
log (bool, optional) – return log if True
verbose (bool, optional) – Print information along iterations
**kwargs (dict) – parameters can be directly passed to the emd solver

Returns:

gamma ((dim_a, dim_b) ndarray) – Optimal transportation matrix for the given parameters
log (dict) – log dictionary returned only if log is True

Examples

>>> import ot
>>> import scipy as sp
>>> a = np.array([0.25] * 4)
>>> b = np.array([0.25] * 4)
>>> x = np.array([1,2,100,200]).reshape((-1,1))
>>> y = np.array([3,2,98,199]).reshape((-1,1))
>>> C1 = sp.spatial.distance.cdist(x, x)
>>> C2 = sp.spatial.distance.cdist(y, y)
>>> np.round(partial_gromov_wasserstein(C1, C2, a, b),2)
array([[0.  , 0.25, 0.  , 0.  ],
       [0.25, 0.  , 0.  , 0.  ],
       [0.  , 0.  , 0.25, 0.  ],
       [0.  , 0.  , 0.  , 0.25]])
>>> np.round(partial_gromov_wasserstein(C1, C2, a, b, m=0.25),2)
array([[0.  , 0.  , 0.  , 0.  ],
       [0.  , 0.  , 0.  , 0.  ],
       [0.  , 0.  , 0.25, 0.  ],
       [0.  , 0.  , 0.  , 0.  ]])

References

ot.partial.partial_gromov_wasserstein2(C1, C2, p, q, m=None, nb_dummies=1, G0=None, thres=1, numItermax=1000, tol=1e-07, log=False, verbose=False, **kwargs)[source]

Solves the partial optimal transport problem and returns the partial Gromov-Wasserstein discrepancy

The function considers the following problem:

\[GW = \min_\gamma \quad \langle \gamma, \mathbf{M} \rangle_F\]

\[ \begin{align}\begin{aligned}s.t. \ \gamma \mathbf{1} &\leq \mathbf{a}\\ \gamma^T \mathbf{1} &\leq \mathbf{b}\\ \gamma &\geq 0\\ \mathbf{1}^T \gamma^T \mathbf{1} = m &\leq \min\{\|\mathbf{a}\|_1, \|\mathbf{b}\|_1\}\end{aligned}\end{align} \]

where :

\(\mathbf{M}\) is the metric cost matrix
\(\Omega\) is the entropic regularization term, \(\Omega(\gamma) = \sum_{i,j} \gamma_{i,j}\log(\gamma_{i,j})\)
\(\mathbf{a}\) and \(\mathbf{b}\) are the sample weights
m is the amount of mass to be transported

The formulation of the problem has been proposed in [29]

Note

This function will be deprecated in a near future, please use

ot.gromov.partial_gromov_wasserstein2 instead.

Parameters:

C1 (ndarray, shape (ns, ns)) – Metric cost matrix in the source space
C2 (ndarray, shape (nt, nt)) – Metric cost matrix in the target space
p (ndarray, shape (ns,)) – Distribution in the source space
q (ndarray, shape (nt,)) – Distribution in the target space
m (float, optional) – Amount of mass to be transported (default: \(\min\{\|\mathbf{p}\|_1, \|\mathbf{q}\|_1\}\))
nb_dummies (int, optional) – Number of dummy points to add (avoid instabilities in the EMD solver)
G0 (ndarray, shape (ns, nt), optional) – Initialization of the transportation matrix
thres (float, optional) – quantile of the gradient matrix to populate the cost matrix when 0 (default: 1)
numItermax (int, optional) – Max number of iterations
tol (float, optional) – tolerance for stopping iterations
log (bool, optional) – return log if True
verbose (bool, optional) – Print information along iterations
**kwargs (dict) – parameters can be directly passed to the emd solver

Warning

When dealing with a large number of points, the EMD solver may face some instabilities, especially when the mass associated to the dummy point is large. To avoid them, increase the number of dummy points (allows a smoother repartition of the mass over the points).

Returns:

partial_gw_dist (float) – partial GW discrepancy
log (dict) – log dictionary returned only if log is True

Examples

>>> import ot
>>> import scipy as sp
>>> a = np.array([0.25] * 4)
>>> b = np.array([0.25] * 4)
>>> x = np.array([1,2,100,200]).reshape((-1,1))
>>> y = np.array([3,2,98,199]).reshape((-1,1))
>>> C1 = sp.spatial.distance.cdist(x, x)
>>> C2 = sp.spatial.distance.cdist(y, y)
>>> np.round(partial_gromov_wasserstein2(C1, C2, a, b),2)
1.69
>>> np.round(partial_gromov_wasserstein2(C1, C2, a, b, m=0.25),2)
0.0

References

ot.partial.partial_wasserstein(a, b, M, m=None, nb_dummies=1, log=False, **kwargs)[source]

Solves the partial optimal transport problem for the quadratic cost and returns the OT plan

The function considers the following problem:

\[\gamma = \mathop{\arg \min}_\gamma \quad \langle \gamma, \mathbf{M} \rangle_F\]

\[ \begin{align}\begin{aligned}s.t. \ \gamma \mathbf{1} &\leq \mathbf{a}\\ \gamma^T \mathbf{1} &\leq \mathbf{b}\\ \gamma &\geq 0\\ \mathbf{1}^T \gamma^T \mathbf{1} = m &\leq \min\{\|\mathbf{a}\|_1, \|\mathbf{b}\|_1\}\end{aligned}\end{align} \]

where :

\(\mathbf{M}\) is the metric cost matrix
\(\mathbf{a}\) and \(\mathbf{b}\) are source and target unbalanced distributions
m is the amount of mass to be transported

Parameters:

a (np.ndarray (dim_a,)) – Unnormalized histogram of dimension dim_a
b (np.ndarray (dim_b,)) – Unnormalized histograms of dimension dim_b
M (np.ndarray (dim_a, dim_b)) – cost matrix for the quadratic cost
m (float, optional) – amount of mass to be transported
nb_dummies (int, optional, default:1) – number of reservoir points to be added (to avoid numerical instabilities, increase its value if an error is raised)
log (bool, optional) – record log if True
**kwargs (dict) – parameters can be directly passed to the emd solver

Warning

When dealing with a large number of points, the EMD solver may face some instabilities, especially when the mass associated to the dummy point is large. To avoid them, increase the number of dummy points (allows a smoother repartition of the mass over the points).

Returns:

gamma ((dim_a, dim_b) ndarray) – Optimal transportation matrix for the given parameters
log (dict) – log dictionary returned only if log is True

Examples

>>> import ot
>>> a = [.1, .2]
>>> b = [.1, .1]
>>> M = [[0., 1.], [2., 3.]]
>>> np.round(partial_wasserstein(a,b,M), 2)
array([[0.1, 0. ],
       [0. , 0.1]])
>>> np.round(partial_wasserstein(a,b,M,m=0.1), 2)
array([[0.1, 0. ],
       [0. , 0. ]])

References

See also

ot.partial.partial_wasserstein_lagrange: Partial Wasserstein with

regularization

ot.partial.entropic_partial_wasserstein: Partial Wasserstein with a

entropic

Examples using `ot.partial.partial_wasserstein`

2D examples of exact and entropic unbalanced optimal transport

Partial Wasserstein and Gromov-Wasserstein example

ot.partial.partial_wasserstein2(a, b, M, m=None, nb_dummies=1, log=False, **kwargs)[source]

Solves the partial optimal transport problem for the quadratic cost and returns the partial GW discrepancy

The function considers the following problem:

\[\gamma = \min_\gamma \quad \langle \gamma, \mathbf{M} \rangle_F\]

\[ \begin{align}\begin{aligned}s.t. \ \gamma \mathbf{1} &\leq \mathbf{a}\\ \gamma^T \mathbf{1} &\leq \mathbf{b}\\ \gamma &\geq 0\\ \mathbf{1}^T \gamma^T \mathbf{1} = m &\leq \min\{\|\mathbf{a}\|_1, \|\mathbf{b}\|_1\}\end{aligned}\end{align} \]

where :

\(\mathbf{M}\) is the metric cost matrix
\(\mathbf{a}\) and \(\mathbf{b}\) are source and target unbalanced distributions
m is the amount of mass to be transported

Parameters:

a (np.ndarray (dim_a,)) – Unnormalized histogram of dimension dim_a
b (np.ndarray (dim_b,)) – Unnormalized histograms of dimension dim_b
M (np.ndarray (dim_a, dim_b)) – cost matrix for the quadratic cost
m (float, optional) – amount of mass to be transported
nb_dummies (int, optional, default:1) – number of reservoir points to be added (to avoid numerical instabilities, increase its value if an error is raised)
log (bool, optional) – record log if True
**kwargs (dict) – parameters can be directly passed to the emd solver

Warning

When dealing with a large number of points, the EMD solver may face some instabilities, especially when the mass associated to the dummy point is large. To avoid them, increase the number of dummy points (allows a smoother repartition of the mass over the points).

Returns:

GW (float) – partial GW discrepancy
log (dict) – log dictionary returned only if log is True

Examples

>>> import ot
>>> a=[.1, .2]
>>> b=[.1, .1]
>>> M=[[0., 1.], [2., 3.]]
>>> np.round(partial_wasserstein2(a, b, M), 1)
0.3
>>> np.round(partial_wasserstein2(a,b,M,m=0.1), 1)
0.0

References

ot.partial.partial_wasserstein_lagrange(a, b, M, reg_m=None, nb_dummies=1, log=False, **kwargs)[source]

Solves the partial optimal transport problem for the quadratic cost and returns the OT plan

The function considers the following problem:

\[\gamma = \mathop{\arg \min}_\gamma \quad \langle \gamma, (\mathbf{M} - \lambda) \rangle_F\]

\[ \begin{align}\begin{aligned}s.t. \ \gamma \mathbf{1} &\leq \mathbf{a}\\ \gamma^T \mathbf{1} &\leq \mathbf{b}\\ \gamma &\geq 0\\ \mathbf{1}^T \gamma^T \mathbf{1} = m & \leq \min\{\|\mathbf{a}\|_1, \|\mathbf{b}\|_1\}\end{aligned}\end{align} \]

or equivalently (see Chizat, L., Peyré, G., Schmitzer, B., & Vialard, F. X. (2018). An interpolating distance between optimal transport and Fisher–Rao metrics. Foundations of Computational Mathematics, 18(1), 1-44.)

\[ \begin{align}\begin{aligned}\gamma = \mathop{\arg \min}_\gamma \quad \langle \gamma, \mathbf{M} \rangle_F + \sqrt{\frac{\lambda}{2} (\|\gamma \mathbf{1} - \mathbf{a}\|_1 + \|\gamma^T \mathbf{1} - \mathbf{b}\|_1)}\\s.t. \ \gamma \geq 0\end{aligned}\end{align} \]

where :

\(\mathbf{M}\) is the metric cost matrix
\(\mathbf{a}\) and \(\mathbf{b}\) are source and target unbalanced distributions
\(\lambda\) is the lagrangian cost. Tuning its value allows attaining a given mass to be transported m

The formulation of the problem has been proposed in [28]

Parameters:

a (np.ndarray (dim_a,)) – Unnormalized histogram of dimension dim_a
b (np.ndarray (dim_b,)) – Unnormalized histograms of dimension dim_b
M (np.ndarray (dim_a, dim_b)) – cost matrix for the quadratic cost
reg_m (float, optional) – Lagrangian cost
nb_dummies (int, optional, default:1) – number of reservoir points to be added (to avoid numerical instabilities, increase its value if an error is raised)
log (bool, optional) – record log if True
**kwargs (dict) – parameters can be directly passed to the emd solver

Warning

When dealing with a large number of points, the EMD solver may face some instabilities, especially when the mass associated to the dummy point is large. To avoid them, increase the number of dummy points (allows a smoother repartition of the mass over the points).

Returns:

gamma ((dim_a, dim_b) ndarray) – Optimal transportation matrix for the given parameters
log (dict) – log dictionary returned only if log is True

Examples

>>> import ot
>>> a = [.1, .2]
>>> b = [.1, .1]
>>> M = [[0., 1.], [2., 3.]]
>>> np.round(partial_wasserstein_lagrange(a,b,M), 2)
array([[0.1, 0. ],
       [0. , 0.1]])
>>> np.round(partial_wasserstein_lagrange(a,b,M,reg_m=2), 2)
array([[0.1, 0. ],
       [0. , 0. ]])

References

See also

ot.partial.partial_wasserstein: Partial Wasserstein with fixed mass

ot.partial.entropic_partial_gromov_wasserstein(C1, C2, p, q, reg, m=None, G0=None, numItermax=1000, tol=1e-07, log=False, verbose=False)[source]

Returns the partial Gromov-Wasserstein transport between \((\mathbf{C_1}, \mathbf{p})\) and \((\mathbf{C_2}, \mathbf{q})\)

The function solves the following optimization problem:

\[\gamma = \mathop{\arg \min}_{\gamma} \quad \sum_{i,j,k,l} L(\mathbf{C_1}_{i,k}, \mathbf{C_2}_{j,l})\cdot \gamma_{i,j}\cdot\gamma_{k,l} + \mathrm{reg} \cdot\Omega(\gamma)\]

\[ \begin{align}\begin{aligned}s.t. \ \gamma &\geq 0\\ \gamma \mathbf{1} &\leq \mathbf{a}\\ \gamma^T \mathbf{1} &\leq \mathbf{b}\\ \mathbf{1}^T \gamma^T \mathbf{1} = m &\leq \min\{\|\mathbf{a}\|_1, \|\mathbf{b}\|_1\}\end{aligned}\end{align} \]

where :

\(\mathbf{C_1}\) is the metric cost matrix in the source space
\(\mathbf{C_2}\) is the metric cost matrix in the target space
\(\mathbf{p}\) and \(\mathbf{q}\) are the sample weights
L: quadratic loss function
\(\Omega\) is the entropic regularization term, \(\Omega=\sum_{i,j} \gamma_{i,j}\log(\gamma_{i,j})\)
m is the amount of mass to be transported

The formulation of the GW problem has been proposed in [12] and the partial GW in [29]

Note

This function will be deprecated in a near future, please use

ot.gromov.entropic_partial_gromov_wasserstein instead.

Parameters:

C1 (ndarray, shape (ns, ns)) – Metric cost matrix in the source space
C2 (ndarray, shape (nt, nt)) – Metric cost matrix in the target space
p (ndarray, shape (ns,)) – Distribution in the source space
q (ndarray, shape (nt,)) – Distribution in the target space
reg (float) – entropic regularization parameter
m (float, optional) – Amount of mass to be transported (default: \(\min\{\|\mathbf{p}\|_1, \|\mathbf{q}\|_1\}\))
G0 (ndarray, shape (ns, nt), optional) – Initialization of the transportation matrix
numItermax (int, optional) – Max number of iterations
tol (float, optional) – Stop threshold on error (>0)
log (bool, optional) – return log if True
verbose (bool, optional) – Print information along iterations

Examples

>>> import ot
>>> import scipy as sp
>>> a = np.array([0.25] * 4)
>>> b = np.array([0.25] * 4)
>>> x = np.array([1,2,100,200]).reshape((-1,1))
>>> y = np.array([3,2,98,199]).reshape((-1,1))
>>> C1 = sp.spatial.distance.cdist(x, x)
>>> C2 = sp.spatial.distance.cdist(y, y)
>>> np.round(entropic_partial_gromov_wasserstein(C1, C2, a, b, 50), 2)
array([[0.12, 0.13, 0.  , 0.  ],
       [0.13, 0.12, 0.  , 0.  ],
       [0.  , 0.  , 0.25, 0.  ],
       [0.  , 0.  , 0.  , 0.25]])
>>> np.round(entropic_partial_gromov_wasserstein(C1, C2, a, b, 50,0.25), 2)
array([[0.02, 0.03, 0.  , 0.03],
       [0.03, 0.03, 0.  , 0.03],
       [0.  , 0.  , 0.03, 0.  ],
       [0.02, 0.02, 0.  , 0.03]])

Returns:

math: gamma : (dim_a, dim_b) ndarray – Optimal transportation matrix for the given parameters
log (dict) – log dictionary returned only if log is True

References

See also

ot.partial.partial_gromov_wasserstein: exact Partial Gromov-Wasserstein

ot.partial.entropic_partial_gromov_wasserstein2(C1, C2, p, q, reg, m=None, G0=None, numItermax=1000, tol=1e-07, log=False, verbose=False)[source]

Returns the partial Gromov-Wasserstein discrepancy between \((\mathbf{C_1}, \mathbf{p})\) and \((\mathbf{C_2}, \mathbf{q})\)

The function solves the following optimization problem:

\[GW = \min_{\gamma} \quad \sum_{i,j,k,l} L(\mathbf{C_1}_{i,k}, \mathbf{C_2}_{j,l})\cdot \gamma_{i,j}\cdot\gamma_{k,l} + \mathrm{reg} \cdot\Omega(\gamma)\]

\[ \begin{align}\begin{aligned}s.t. \ \gamma &\geq 0\\ \gamma \mathbf{1} &\leq \mathbf{a}\\ \gamma^T \mathbf{1} &\leq \mathbf{b}\\ \mathbf{1}^T \gamma^T \mathbf{1} = m &\leq \min\{\|\mathbf{a}\|_1, \|\mathbf{b}\|_1\}\end{aligned}\end{align} \]

where :

\(\mathbf{C_1}\) is the metric cost matrix in the source space
\(\mathbf{C_2}\) is the metric cost matrix in the target space
\(\mathbf{p}\) and \(\mathbf{q}\) are the sample weights
L : quadratic loss function
\(\Omega\) is the entropic regularization term, \(\Omega=\sum_{i,j} \gamma_{i,j}\log(\gamma_{i,j})\)
m is the amount of mass to be transported

The formulation of the GW problem has been proposed in [12] and the partial GW in [29]

Note

This function will be deprecated in a near future, please use

ot.gromov.entropic_partial_gromov_wasserstein2 instead.

Parameters:

C1 (ndarray, shape (ns, ns)) – Metric cost matrix in the source space
C2 (ndarray, shape (nt, nt)) – Metric cost matrix in the target space
p (ndarray, shape (ns,)) – Distribution in the source space
q (ndarray, shape (nt,)) – Distribution in the target space
reg (float) – entropic regularization parameter
m (float, optional) – Amount of mass to be transported (default: \(\min\{\|\mathbf{p}\|_1, \|\mathbf{q}\|_1\}\))
G0 (ndarray, shape (ns, nt), optional) – Initialization of the transportation matrix
numItermax (int, optional) – Max number of iterations
tol (float, optional) – Stop threshold on error (>0)
log (bool, optional) – return log if True
verbose (bool, optional) – Print information along iterations

Returns:

partial_gw_dist (float) – Gromov-Wasserstein distance
log (dict) – log dictionary returned only if log is True

Examples

>>> import ot
>>> import scipy as sp
>>> a = np.array([0.25] * 4)
>>> b = np.array([0.25] * 4)
>>> x = np.array([1,2,100,200]).reshape((-1,1))
>>> y = np.array([3,2,98,199]).reshape((-1,1))
>>> C1 = sp.spatial.distance.cdist(x, x)
>>> C2 = sp.spatial.distance.cdist(y, y)
>>> np.round(entropic_partial_gromov_wasserstein2(C1, C2, a, b,50), 2)
1.87

References

ot.partial.entropic_partial_wasserstein(a, b, M, reg, m=None, numItermax=1000, stopThr=1e-100, verbose=False, log=False)[source]

Solves the partial optimal transport problem and returns the OT plan

The function considers the following problem:

\[ \begin{align}\begin{aligned}\gamma = \mathop{\arg \min}_\gamma \quad \langle \gamma, \mathbf{M} \rangle_F + \mathrm{reg} \cdot\Omega(\gamma)\\\begin{split}s.t. \gamma \mathbf{1} &\leq \mathbf{a} \\ \gamma^T \mathbf{1} &\leq \mathbf{b} \\ \gamma &\geq 0 \\ \mathbf{1}^T \gamma^T \mathbf{1} = m &\leq \min\{\|\mathbf{a}\|_1, \|\mathbf{b}\|_1\} \\\end{split}\end{aligned}\end{align} \]

where :

\(\mathbf{M}\) is the metric cost matrix
\(\Omega\) is the entropic regularization term, \(\Omega=\sum_{i,j} \gamma_{i,j}\log(\gamma_{i,j})\)
\(\mathbf{a}\) and \(\mathbf{b}\) are the sample weights
m is the amount of mass to be transported

The formulation of the problem has been proposed in [3] (prop. 5)

Parameters:

a (np.ndarray (dim_a,)) – Unnormalized histogram of dimension dim_a
b (np.ndarray (dim_b,)) – Unnormalized histograms of dimension dim_b
M (np.ndarray (dim_a, dim_b)) – cost matrix
reg (float) – Regularization term > 0
m (float, optional) – Amount of mass to be transported
numItermax (int, optional) – Max number of iterations
stopThr (float, optional) – Stop threshold on error (>0)
verbose (bool, optional) – Print information along iterations
log (bool, optional) – record log if True

Returns:

gamma ((dim_a, dim_b) ndarray) – Optimal transportation matrix for the given parameters
log (dict) – log dictionary returned only if log is True

Examples

>>> import ot
>>> a = [.1, .2]
>>> b = [.1, .1]
>>> M = [[0., 1.], [2., 3.]]
>>> np.round(entropic_partial_wasserstein(a, b, M, 1, 0.1), 2)
array([[0.06, 0.02],
       [0.01, 0.  ]])

References

See also

ot.partial.partial_wasserstein: exact Partial Wasserstein

ot.partial.gwgrad_partial(C1, C2, T)[source]

Compute the GW gradient. Note: we can not use the trick in [12] as the marginals may not sum to 1.

Note

This function will be deprecated in a near future, please use

ot.gromov.gwggrad instead.

Parameters:

C1 (array of shape (n_p,n_p)) – intra-source (P) cost matrix
C2 (array of shape (n_u,n_u)) – intra-target (U) cost matrix
T (array of shape(n_p+nb_dummies, n_u) (default: None)) – Transport matrix

Returns:

gradient

Return type:

numpy.array of shape (n_p+nb_dummies, n_u)

References

ot.partial.gwloss_partial(C1, C2, T)[source]

Compute the GW loss.

Note

This function will be deprecated in a near future, please use

ot.gromov.gwloss instead.

Parameters:

C1 (array of shape (n_p,n_p)) – intra-source (P) cost matrix
C2 (array of shape (n_u,n_u)) – intra-target (U) cost matrix
T (array of shape(n_p+nb_dummies, n_u) (default: None)) – Transport matrix

Return type:

GW loss

ot.partial.partial_gromov_wasserstein(C1, C2, p, q, m=None, nb_dummies=1, G0=None, thres=1, numItermax=1000, tol=1e-07, log=False, verbose=False, **kwargs)[source]

Solves the partial optimal transport problem and returns the OT plan

The function considers the following problem:

\[\gamma = \mathop{\arg \min}_\gamma \quad \langle \gamma, \mathbf{M} \rangle_F\]

\[ \begin{align}\begin{aligned}s.t. \ \gamma \mathbf{1} &\leq \mathbf{a}\\ \gamma^T \mathbf{1} &\leq \mathbf{b}\\ \gamma &\geq 0\\ \mathbf{1}^T \gamma^T \mathbf{1} = m &\leq \min\{\|\mathbf{a}\|_1, \|\mathbf{b}\|_1\}\end{aligned}\end{align} \]

where :

\(\mathbf{M}\) is the metric cost matrix
\(\Omega\) is the entropic regularization term, \(\Omega(\gamma) = \sum_{i,j} \gamma_{i,j}\log(\gamma_{i,j})\)
\(\mathbf{a}\) and \(\mathbf{b}\) are the sample weights
m is the amount of mass to be transported

The formulation of the problem has been proposed in [29]

Note

This function will be deprecated in a near future, please use

ot.gromov.partial_gromov_wasserstein instead.

Parameters:

C1 (ndarray, shape (ns, ns)) – Metric cost matrix in the source space
C2 (ndarray, shape (nt, nt)) – Metric costfr matrix in the target space
p (ndarray, shape (ns,)) – Distribution in the source space
q (ndarray, shape (nt,)) – Distribution in the target space
m (float, optional) – Amount of mass to be transported (default: \(\min\{\|\mathbf{p}\|_1, \|\mathbf{q}\|_1\}\))
nb_dummies (int, optional) – Number of dummy points to add (avoid instabilities in the EMD solver)
G0 (ndarray, shape (ns, nt), optional) – Initialization of the transportation matrix
thres (float, optional) – quantile of the gradient matrix to populate the cost matrix when 0 (default: 1)
numItermax (int, optional) – Max number of iterations
tol (float, optional) – tolerance for stopping iterations
log (bool, optional) – return log if True
verbose (bool, optional) – Print information along iterations
**kwargs (dict) – parameters can be directly passed to the emd solver

Returns:

gamma ((dim_a, dim_b) ndarray) – Optimal transportation matrix for the given parameters
log (dict) – log dictionary returned only if log is True

Examples

>>> import ot
>>> import scipy as sp
>>> a = np.array([0.25] * 4)
>>> b = np.array([0.25] * 4)
>>> x = np.array([1,2,100,200]).reshape((-1,1))
>>> y = np.array([3,2,98,199]).reshape((-1,1))
>>> C1 = sp.spatial.distance.cdist(x, x)
>>> C2 = sp.spatial.distance.cdist(y, y)
>>> np.round(partial_gromov_wasserstein(C1, C2, a, b),2)
array([[0.  , 0.25, 0.  , 0.  ],
       [0.25, 0.  , 0.  , 0.  ],
       [0.  , 0.  , 0.25, 0.  ],
       [0.  , 0.  , 0.  , 0.25]])
>>> np.round(partial_gromov_wasserstein(C1, C2, a, b, m=0.25),2)
array([[0.  , 0.  , 0.  , 0.  ],
       [0.  , 0.  , 0.  , 0.  ],
       [0.  , 0.  , 0.25, 0.  ],
       [0.  , 0.  , 0.  , 0.  ]])

References

ot.partial.partial_gromov_wasserstein2(C1, C2, p, q, m=None, nb_dummies=1, G0=None, thres=1, numItermax=1000, tol=1e-07, log=False, verbose=False, **kwargs)[source]

Solves the partial optimal transport problem and returns the partial Gromov-Wasserstein discrepancy

The function considers the following problem:

\[GW = \min_\gamma \quad \langle \gamma, \mathbf{M} \rangle_F\]

\[ \begin{align}\begin{aligned}s.t. \ \gamma \mathbf{1} &\leq \mathbf{a}\\ \gamma^T \mathbf{1} &\leq \mathbf{b}\\ \gamma &\geq 0\\ \mathbf{1}^T \gamma^T \mathbf{1} = m &\leq \min\{\|\mathbf{a}\|_1, \|\mathbf{b}\|_1\}\end{aligned}\end{align} \]

where :

\(\mathbf{M}\) is the metric cost matrix
\(\Omega\) is the entropic regularization term, \(\Omega(\gamma) = \sum_{i,j} \gamma_{i,j}\log(\gamma_{i,j})\)
\(\mathbf{a}\) and \(\mathbf{b}\) are the sample weights
m is the amount of mass to be transported

The formulation of the problem has been proposed in [29]

Note

This function will be deprecated in a near future, please use

ot.gromov.partial_gromov_wasserstein2 instead.

Parameters:

C1 (ndarray, shape (ns, ns)) – Metric cost matrix in the source space
C2 (ndarray, shape (nt, nt)) – Metric cost matrix in the target space
p (ndarray, shape (ns,)) – Distribution in the source space
q (ndarray, shape (nt,)) – Distribution in the target space
m (float, optional) – Amount of mass to be transported (default: \(\min\{\|\mathbf{p}\|_1, \|\mathbf{q}\|_1\}\))
nb_dummies (int, optional) – Number of dummy points to add (avoid instabilities in the EMD solver)
G0 (ndarray, shape (ns, nt), optional) – Initialization of the transportation matrix
thres (float, optional) – quantile of the gradient matrix to populate the cost matrix when 0 (default: 1)
numItermax (int, optional) – Max number of iterations
tol (float, optional) – tolerance for stopping iterations
log (bool, optional) – return log if True
verbose (bool, optional) – Print information along iterations
**kwargs (dict) – parameters can be directly passed to the emd solver

Warning

When dealing with a large number of points, the EMD solver may face some instabilities, especially when the mass associated to the dummy point is large. To avoid them, increase the number of dummy points (allows a smoother repartition of the mass over the points).

Returns:

partial_gw_dist (float) – partial GW discrepancy
log (dict) – log dictionary returned only if log is True

Examples

>>> import ot
>>> import scipy as sp
>>> a = np.array([0.25] * 4)
>>> b = np.array([0.25] * 4)
>>> x = np.array([1,2,100,200]).reshape((-1,1))
>>> y = np.array([3,2,98,199]).reshape((-1,1))
>>> C1 = sp.spatial.distance.cdist(x, x)
>>> C2 = sp.spatial.distance.cdist(y, y)
>>> np.round(partial_gromov_wasserstein2(C1, C2, a, b),2)
1.69
>>> np.round(partial_gromov_wasserstein2(C1, C2, a, b, m=0.25),2)
0.0

References

ot.partial.partial_wasserstein(a, b, M, m=None, nb_dummies=1, log=False, **kwargs)[source]

Solves the partial optimal transport problem for the quadratic cost and returns the OT plan

The function considers the following problem:

\[\gamma = \mathop{\arg \min}_\gamma \quad \langle \gamma, \mathbf{M} \rangle_F\]

\[ \begin{align}\begin{aligned}s.t. \ \gamma \mathbf{1} &\leq \mathbf{a}\\ \gamma^T \mathbf{1} &\leq \mathbf{b}\\ \gamma &\geq 0\\ \mathbf{1}^T \gamma^T \mathbf{1} = m &\leq \min\{\|\mathbf{a}\|_1, \|\mathbf{b}\|_1\}\end{aligned}\end{align} \]

where :

\(\mathbf{M}\) is the metric cost matrix
\(\mathbf{a}\) and \(\mathbf{b}\) are source and target unbalanced distributions
m is the amount of mass to be transported

Parameters:

a (np.ndarray (dim_a,)) – Unnormalized histogram of dimension dim_a
b (np.ndarray (dim_b,)) – Unnormalized histograms of dimension dim_b
M (np.ndarray (dim_a, dim_b)) – cost matrix for the quadratic cost
m (float, optional) – amount of mass to be transported
nb_dummies (int, optional, default:1) – number of reservoir points to be added (to avoid numerical instabilities, increase its value if an error is raised)
log (bool, optional) – record log if True
**kwargs (dict) – parameters can be directly passed to the emd solver

Warning

When dealing with a large number of points, the EMD solver may face some instabilities, especially when the mass associated to the dummy point is large. To avoid them, increase the number of dummy points (allows a smoother repartition of the mass over the points).

Returns:

gamma ((dim_a, dim_b) ndarray) – Optimal transportation matrix for the given parameters
log (dict) – log dictionary returned only if log is True

Examples

>>> import ot
>>> a = [.1, .2]
>>> b = [.1, .1]
>>> M = [[0., 1.], [2., 3.]]
>>> np.round(partial_wasserstein(a,b,M), 2)
array([[0.1, 0. ],
       [0. , 0.1]])
>>> np.round(partial_wasserstein(a,b,M,m=0.1), 2)
array([[0.1, 0. ],
       [0. , 0. ]])

References

See also

ot.partial.partial_wasserstein_lagrange: Partial Wasserstein with

regularization

ot.partial.entropic_partial_wasserstein: Partial Wasserstein with a

entropic

ot.partial.partial_wasserstein2(a, b, M, m=None, nb_dummies=1, log=False, **kwargs)[source]

Solves the partial optimal transport problem for the quadratic cost and returns the partial GW discrepancy

The function considers the following problem:

\[\gamma = \min_\gamma \quad \langle \gamma, \mathbf{M} \rangle_F\]

\[ \begin{align}\begin{aligned}s.t. \ \gamma \mathbf{1} &\leq \mathbf{a}\\ \gamma^T \mathbf{1} &\leq \mathbf{b}\\ \gamma &\geq 0\\ \mathbf{1}^T \gamma^T \mathbf{1} = m &\leq \min\{\|\mathbf{a}\|_1, \|\mathbf{b}\|_1\}\end{aligned}\end{align} \]

where :

\(\mathbf{M}\) is the metric cost matrix
\(\mathbf{a}\) and \(\mathbf{b}\) are source and target unbalanced distributions
m is the amount of mass to be transported

Parameters:

a (np.ndarray (dim_a,)) – Unnormalized histogram of dimension dim_a
b (np.ndarray (dim_b,)) – Unnormalized histograms of dimension dim_b
M (np.ndarray (dim_a, dim_b)) – cost matrix for the quadratic cost
m (float, optional) – amount of mass to be transported
nb_dummies (int, optional, default:1) – number of reservoir points to be added (to avoid numerical instabilities, increase its value if an error is raised)
log (bool, optional) – record log if True
**kwargs (dict) – parameters can be directly passed to the emd solver

Warning

When dealing with a large number of points, the EMD solver may face some instabilities, especially when the mass associated to the dummy point is large. To avoid them, increase the number of dummy points (allows a smoother repartition of the mass over the points).

Returns:

GW (float) – partial GW discrepancy
log (dict) – log dictionary returned only if log is True

Examples

>>> import ot
>>> a=[.1, .2]
>>> b=[.1, .1]
>>> M=[[0., 1.], [2., 3.]]
>>> np.round(partial_wasserstein2(a, b, M), 1)
0.3
>>> np.round(partial_wasserstein2(a,b,M,m=0.1), 1)
0.0

References

ot.partial.partial_wasserstein_lagrange(a, b, M, reg_m=None, nb_dummies=1, log=False, **kwargs)[source]

Solves the partial optimal transport problem for the quadratic cost and returns the OT plan

The function considers the following problem:

\[\gamma = \mathop{\arg \min}_\gamma \quad \langle \gamma, (\mathbf{M} - \lambda) \rangle_F\]

\[ \begin{align}\begin{aligned}s.t. \ \gamma \mathbf{1} &\leq \mathbf{a}\\ \gamma^T \mathbf{1} &\leq \mathbf{b}\\ \gamma &\geq 0\\ \mathbf{1}^T \gamma^T \mathbf{1} = m & \leq \min\{\|\mathbf{a}\|_1, \|\mathbf{b}\|_1\}\end{aligned}\end{align} \]

or equivalently (see Chizat, L., Peyré, G., Schmitzer, B., & Vialard, F. X. (2018). An interpolating distance between optimal transport and Fisher–Rao metrics. Foundations of Computational Mathematics, 18(1), 1-44.)

\[ \begin{align}\begin{aligned}\gamma = \mathop{\arg \min}_\gamma \quad \langle \gamma, \mathbf{M} \rangle_F + \sqrt{\frac{\lambda}{2} (\|\gamma \mathbf{1} - \mathbf{a}\|_1 + \|\gamma^T \mathbf{1} - \mathbf{b}\|_1)}\\s.t. \ \gamma \geq 0\end{aligned}\end{align} \]

where :

\(\mathbf{M}\) is the metric cost matrix
\(\mathbf{a}\) and \(\mathbf{b}\) are source and target unbalanced distributions
\(\lambda\) is the lagrangian cost. Tuning its value allows attaining a given mass to be transported m

The formulation of the problem has been proposed in [28]

Parameters:

a (np.ndarray (dim_a,)) – Unnormalized histogram of dimension dim_a
b (np.ndarray (dim_b,)) – Unnormalized histograms of dimension dim_b
M (np.ndarray (dim_a, dim_b)) – cost matrix for the quadratic cost
reg_m (float, optional) – Lagrangian cost
nb_dummies (int, optional, default:1) – number of reservoir points to be added (to avoid numerical instabilities, increase its value if an error is raised)
log (bool, optional) – record log if True
**kwargs (dict) – parameters can be directly passed to the emd solver

Warning

When dealing with a large number of points, the EMD solver may face some instabilities, especially when the mass associated to the dummy point is large. To avoid them, increase the number of dummy points (allows a smoother repartition of the mass over the points).

Returns:

gamma ((dim_a, dim_b) ndarray) – Optimal transportation matrix for the given parameters
log (dict) – log dictionary returned only if log is True

Examples

>>> import ot
>>> a = [.1, .2]
>>> b = [.1, .1]
>>> M = [[0., 1.], [2., 3.]]
>>> np.round(partial_wasserstein_lagrange(a,b,M), 2)
array([[0.1, 0. ],
       [0. , 0.1]])
>>> np.round(partial_wasserstein_lagrange(a,b,M,reg_m=2), 2)
array([[0.1, 0. ],
       [0. , 0. ]])

References

See also

ot.partial.partial_wasserstein: Partial Wasserstein with fixed mass

ot.partial

Functions

Examples using ot.partial.entropic_partial_wasserstein

Examples using ot.partial.partial_wasserstein

Examples using `ot.partial.entropic_partial_wasserstein`

Examples using `ot.partial.partial_wasserstein`