The optimal solution has the form \(\mathbf{T} = \text{diag}(\exp(\mathbf{u})) \exp(\mathbf{K}) \text{diag}(\exp(\mathbf{v}))\),
where the dual variables \(\mathbf{u}\) and \(\mathbf{v}\) are found iteratively using:
The optimal solution has the form \(\mathbf{T} = \text{diag}(\mathbf{f}) \mathbf{K} \text{diag}(\mathbf{g})\),
where the dual variables \(\mathbf{u}\) and \(\mathbf{v}\) are found iteratively using:
Computes the gromov-wasserstein cost given a cost tensor and transport plan. Batched version.
Parameters:
L (dict) – Cost tensor as returned by tensor_batch.
T (array-like, shape (B, n, m)) – Transport plan.
recompute_const (bool, optional) – Whether to recompute the constant term. Default is False. This should be set to True if T does not satisfy the marginal constraints.
symmetric (bool, optional) – Whether to use symmetric version. Default is True.
nx (module, optional) – Backend to use. Default is None.
Examples
>>> importnumpyasnp>>> fromot.batchimporttensor_batch,loss_quadratic_batch>>> # Create batch of cost matrices>>> C1=np.random.rand(3,5,5)# 3 problems, 5x5 source matrices>>> C2=np.random.rand(3,4,4)# 3 problems, 4x4 target matrices>>> a=np.ones((3,5))/5# Uniform source distributions>>> b=np.ones((3,4))/4# Uniform target distributions>>> L=tensor_batch(a,b,C1,C2,loss='sqeuclidean')>>> # Use the uniform transport plan for testing>>> T=np.ones((3,5,4))/(5*4)>>> loss=loss_quadratic_batch(L,T,recompute_const=True)>>> loss.shape(3,)
Computes the gromov-wasserstein for samples C1, C2 and transport plan. Batched version.
Parameters:
a (array-like, shape (B, n)) – Source distributions.
b (array-like, shape (B, m)) – Target distributions.
C1 (array-like, shape (B, n, n) or (B, n, n, d)) – Source cost matrices.
C2 (array-like, shape (B, m, m) or (B, n, n, d)) – Target cost matrices.
T (array-like, shape (B, n, m)) – Transport plan.
loss (str, optional) – Loss function to use. Supported values: ‘sqeuclidean’, ‘kl’.
Default is ‘sqeuclidean’.
recompute_const (bool, optional) – Whether to recompute the constant term. Default is False. This should be set to True if T does not satisfy the marginal constraints.
symmetric (bool, optional) – Whether to use symmetric version. Default is True.
nx (module, optional) – Backend to use. Default is None.
Examples
>>> importnumpyasnp>>> fromot.batchimportloss_quadratic_samples_batch>>> # Create batch of cost matrices>>> C1=np.random.rand(3,5,5)# 3 problems, 5x5 source matrices>>> C2=np.random.rand(3,4,4)# 3 problems, 4x4 target matrices>>> a=np.ones((3,5))/5# Uniform source distributions>>> b=np.ones((3,4))/4# Uniform target distributions>>> # Use the uniform transport plan for testing>>> T=np.ones((3,5,4))/(5*4)>>> loss=loss_quadratic_samples_batch(a,b,C1,C2,T,recompute_const=True)>>> loss.shape(3,)
solver (str) – Solver to use, either ‘log_sinkhorn’ or ‘sinkhorn’. Default is “log_sinkhorn” which is more stable.
reg_type (str, optional) – Type of regularization \(R\) either “KL”, or “entropy”. Default is “entropy”.
grad (str, optional) – Type of gradient computation, either or ‘autodiff’, ‘envelope’ or ‘last_step’ used only for
Sinkhorn solver. By default ‘autodiff’ provides gradients wrt all
outputs (plan, value, value_linear) but with important memory cost.
‘envelope’ provides gradients only for value and and other outputs are
detached. This is useful for memory saving when only the value is needed. ‘last_step’ provides
gradients only for the last iteration of the Sinkhorn solver, but provides gradient for both the OT plan and the objective values.
‘detach’ does not compute the gradients for the Sinkhorn solver.
Returns:
res – Result of the optimization problem. The information can be obtained as follows:
res.plan : OT plan \(\mathbf{T}\)
res.potentials : OT dual potentials
res.value : Optimal value of the optimization problem
res.value_linear : Linear OT loss with the optimal OT plan
>>> importnumpyasnp>>> fromot.batchimportsolve_batch,dist_batch>>> X=np.random.randn(5,10,3)# 5 batches of 10 samples in 3D>>> Y=np.random.randn(5,15,3)# 5 batches of 15 samples in 3D>>> M=dist_batch(X,Y,metric="euclidean")# Compute cost matrices>>> reg=0.1>>> result=solve_batch(M,reg)>>> result.plan.shape# Optimal transport plans for each batch(5, 10, 15)>>> result.value.shape# Optimal transport values for each batch(5,)
Writing the objective as \((1-\alpha) \langle \mathbf{M}, \mathbf{T} \rangle + \alpha \langle \mathcal{L} \otimes \mathbf{T}, \mathbf{T} \rangle\),
the solver uses proximal gradient descent where each iteration is:
where \(H(\mathbf{T})\) is the entropy of \(\mathbf{T}\). Thus each iteration can be solved using the Bregman projection solver implemented in bregman_log_projection_batch.
Note that the inner optimization problem does not need to be solved exactly. In practice it sufficient to set max_iter_inner to a low value (e.g. 20) and/or tol_inner to a high value (e.g. 1e-2).
Parameters:
C1 (array-like, shape (B, n, n, d) or (B, n, n)) – Samples affinity matrices from source distribution
C2 (array-like, shape (B, n, n, d) or (B, n, n)) – Samples affinity matrices from target distribution
a (array-like, shape (B, n), optional) – Marginal distribution of the source samples. If None, uniform distribution is used.
b (array-like, shape (B, m), optional) – Marginal distribution of the target samples. If None, uniform distribution is used.
loss (str, optional) – Type of loss function, can be ‘sqeuclidean’ or ‘kl’ or a QuadraticMetric instance.
symmetric (bool, optional) – Either C1 and C2 are to be assumed symmetric or not.
If let to its default None value, a symmetry test will be conducted.
Else if set to True (resp. False), C1 and C2 will be assumed symmetric (resp. asymmetric).
M (array-like, shape (dim_a, dim_b), optional) – Linear cost matrix for Fused Gromov-Wasserstein (default is None).
alpha (float, optional) – Weight the quadratic term (alpha*Gromov) and the linear term
((1-alpha)*Wass) in the Fused Gromov-Wasserstein problem. Not used for
Gromov problem (when M is not provided). By default alpha=None
corresponds to alpha=1 for Gromov problem (M==None) and
alpha=0.5 for Fused Gromov-Wasserstein problem (M!=None)
epsilon (float, optional) – Regularization parameter for proximal gradient descent. Default is 1e-2.
T_init (array-like, shape (B, n, m), optional) – Initial transport plan. If None, it is initialized to uniform distribution.
max_iter (int, optional) – Maximum number of iterations for the proximal gradient descent. Default is 50.
tol (float, optional) – Tolerance for convergence of the proximal gradient descent. Default is 1e-5.
max_iter_inner (int, optional) – Maximum number of iterations for the inner Bregman projection. Default is 50.
tol_inner (float, optional) – Tolerance for convergence of the inner Bregman projection. Default is 1e-5.
grad (str, optional) – Type of gradient computation, either or ‘autodiff’, ‘envelope’ or ‘detach’. ‘autodiff’ provides gradients wrt all
outputs (plan, value, value_linear) but with important memory cost.
‘envelope’ provides gradients only for (value, value_linear)`. detach` is the fastest option but
provides no gradients. Default is ‘detach’.
assume_inner_convergence (bool, optional) – If True, assumes that the inner Bregman projection always converged i.e. the transport plan satisfies the marginal constraints.
This enables faster computations of the tensor product but might results in inaccurate results (e.g. negative values of the loss).
Default is True.
Returns:
res – Result of the optimization problem. The information can be obtained as follows:
res.plan : OT plan \(\mathbf{T}\)
res.potentials : OT dual potentials
res.value : Optimal value of the optimization problem
res.value_linear : Linear OT loss with the optimal OT plan
res.value_quad : Quadratic OT loss with the optimal OT plan
Non-batched solver for Gromov-Wasserstein. Note that the non-batched solver uses a different algorithm (conditional gradient) and might not give the same results.
solver (str) – Solver to use, either ‘log_sinkhorn’ or ‘sinkhorn’. Default is “log_sinkhorn” which is more stable.
reg_type (str, optional) – Type of regularization \(R\) either “KL”, or “entropy”. Default is “entropy”.
grad (str, optional) – Type of gradient computation, either or ‘autodiff’, ‘envelope’ or ‘last_step’ used only for
Sinkhorn solver. By default ‘autodiff’ provides gradients wrt all
outputs (plan, value, value_linear) but with important memory cost.
‘envelope’ provides gradients only for value and and other outputs are
detached. This is useful for memory saving when only the value is needed. ‘last_step’ provides
gradients only for the last iteration of the Sinkhorn solver, but provides gradient for both the OT plan and the objective values.
‘detach’ does not compute the gradients for the Sinkhorn solver.
Returns:
res – Result of the optimization problem. The information can be obtained as follows:
res.plan : OT plan \(\mathbf{T}\)
res.potentials : OT dual potentials
res.value : Optimal value of the optimization problem
res.value_linear : Linear OT loss with the optimal OT plan
where \(\mathcal{L}\) is a 4D tensor with elements \(\mathcal{L}[i,j,k,l] = \ell(C_{1,ij}, C_{2,kl})\).
For loss functions of the form \(\ell(a,b) = f_1(a) + f_2(b) - \langle h_1(a), h_2(b) \rangle\),
the tensor product \(\mathcal{L} \times \mathbf{T}\) can be computed efficiently without explicitly computing \(\mathcal{L}\) [12].
This function precomputes all matrices that implicitly define mathcal{L} for various loss functions.
Parameters:
a (array-like, shape (B, n)) – Source distributions for each problem in the batch.
b (array-like, shape (B, m)) – Target distributions for each problem in the batch.
C1 (array-like, shape (B, n, n) or (B, n, n, d)) – Source cost matrices for each problem. Can be a 3D array for scalar costs or a 4D array for vector-valued costs (edge features).
C2 (array-like, shape (B, m, m) or (B, n, n, d)) – Target cost matrices for each problem. Can be a 3D array for scalar costs or a 4D array for vector-valued costs (edge features).
symmetric (bool, optional) – Whether the cost matrices are symmetric. Default is True.
nx (backend object, optional) – Numerical backend to use for computations. If None, the default backend is used.
loss (str, optional) – Loss function to use. Supported values: ‘sqeuclidean’, ‘kl’.
Default is ‘sqeuclidean’.
logits (bool, optional) – For KL divergence, whether inputs are logits (unnormalized log probabilities).
If True, inputs are treated as logits. Default is None.
Returns:
Dictionary containing:
- constC : array-like, shape (B, n, m)