ot.batch

Batch operations for optimal transport.

ot.batch.bregman_log_projection_batch(K, a=None, b=None, nx=None, max_iter=10000, tol=1e-05, grad='detach')[source]

Apply Bregman projection to a batch of affinity matrices \(\exp(\mathbf{K})\).

The function solves the following optimization problem:

\[\begin{split}\begin{aligned} \mathbf{T} = \mathop{\arg \min}_\mathbf{T} \quad & \text{KL}(\mathbf{T} \| \exp(\mathbf{K})) \\ \text{s.t.} \quad & \mathbf{T} \mathbf{1} = \mathbf{a} \\ & \mathbf{T}^T \mathbf{1} = \mathbf{b} \\ & \mathbf{T} \geq 0 \end{aligned}\end{split}\]

This is equivalent to:

\[\begin{split}\begin{aligned} \mathbf{T} = \mathop{\arg \max}_\mathbf{T} \quad & \langle \mathbf{T}, \mathbf{K} \rangle_F \\ \text{s.t.} \quad & \mathbf{T} \mathbf{1} = \mathbf{a} \\ & \mathbf{T}^T \mathbf{1} = \mathbf{b} \\ & \mathbf{T} \geq 0 \end{aligned}\end{split}\]

The optimal solution has the form \(\mathbf{T} = \text{diag}(\exp(\mathbf{u})) \exp(\mathbf{K}) \text{diag}(\exp(\mathbf{v}))\), where the dual variables \(\mathbf{u}\) and \(\mathbf{v}\) are found iteratively using:

\[ \begin{align}\begin{aligned}\mathbf{u}^{(k+1)} = \log(\mathbf{a}) - \text{LSE}(\mathbf{K} + \mathbf{v}^{(k)})\\\mathbf{v}^{(k+1)} = \log(\mathbf{b}) - \text{LSE}(\mathbf{K}^T + \mathbf{u}^{(k)})\end{aligned}\end{align} \]

where LSE denotes the log-sum-exp operation. The iterations are performed using the log-sum-exp trick to avoid numerical issues.

Parameters:

K (array-like, shape (B, n, m)) – Affinity matrix for each problem in the batch.
a (array-like, shape (B, n), optional) – Source distribution for each problem. If None, uniform distribution is used.
b (array-like, shape (B, m), optional) – Target distribution for each problem. If None, uniform distribution is used.
nx (backend object, optional) – Numerical backend to use for computations. If None, the default backend is used.
max_iter (int, optional) – Maximum number of iterations.
tol (float, optional) – Tolerance for convergence. The solver stops when the maximum change in the dual variables is below this value.
grad (str, optional) – Gradient computation mode: ‘detach’, ‘autodiff’, or ‘last_step’.

Returns:

Dictionary containing: - ‘T’ : array-like, shape (B, n, m)

Optimal transport plan for each problem.

’log_T’array-like, shape (B, n, m)
Logarithm of the optimal transport plan for each problem.
’potentials’tuple of array-like, shapes ((B, n), (B, m))
Log-scaling factors (u, v).
’n_iters’int
Number of iterations performed.

Return type:

dict

Examples

>>> import numpy as np
>>> from ot.batch import bregman_log_projection_batch
>>> # Create batch of affinity matrices
>>> K = np.random.randn(5, 10, 15)  # 5 problems, 10x15 cost matrices
>>> result = bregman_log_projection_batch(K)
>>> T = result['T']  # Shape (5, 10, 15)

See also

ot.batch.bregman_projection_batch: standard Bregman projection.

ot.batch.bregman_projection_batch(K, a=None, b=None, nx=None, max_iter=10000, tol=1e-05, grad='detach')[source]

Apply Bregman projection to a batch of affinity matrices \(\mathbf{K}\).

The function solves the following optimization problem:

\[\begin{split}\begin{aligned} \mathbf{T} = \mathop{\arg \min}_\mathbf{T} \quad & \text{KL}(\mathbf{T} \| \mathbf{K}) \\ \text{s.t.} \quad & \mathbf{T} \mathbf{1} = \mathbf{a} \\ & \mathbf{T}^T \mathbf{1} = \mathbf{b} \\ & \mathbf{T} \geq 0 \end{aligned}\end{split}\]

This is equivalent to:

\[\begin{split}\begin{aligned} \mathbf{T} = \mathop{\arg \max}_\mathbf{T} \quad & \langle \mathbf{T}, \log(\mathbf{K}) \rangle_F \\ \text{s.t.} \quad & \mathbf{T} \mathbf{1} = \mathbf{a} \\ & \mathbf{T}^T \mathbf{1} = \mathbf{b} \\ & \mathbf{T} \geq 0 \end{aligned}\end{split}\]

The optimal solution has the form \(\mathbf{T} = \text{diag}(\mathbf{f}) \mathbf{K} \text{diag}(\mathbf{g})\), where the dual variables \(\mathbf{u}\) and \(\mathbf{v}\) are found iteratively using:

\[ \begin{align}\begin{aligned}\mathbf{f}^{(k+1)} = \frac{\mathbf{a}}{\sum \mathbf{K} \mathbf{g}^{(k)}}\\\mathbf{g}^{(k+1)} = \frac{\mathbf{b}}{\sum \mathbf{K}^T \mathbf{f}^{(k)}}\end{aligned}\end{align} \]

Parameters:

K (array-like, shape (B, n, m)) – Affinity matrix for each problem in the batch.
a (array-like, shape (B, n), optional) – Source distribution for each problem. If None, uniform distribution is used.
b (array-like, shape (B, m), optional) – Target distribution for each problem. If None, uniform distribution is used.
nx (backend object, optional) – Numerical backend to use for computations. If None, the default backend is used.
max_iter (int, optional) – Maximum number of iterations.
tol (float, optional) – Tolerance for convergence. The solver stops when the maximum change in the dual variables is below this value.
grad (str, optional) – Gradient computation mode: ‘detach’, ‘autodiff’, or ‘last_step’.

Returns:

Dictionary containing: - ‘T’ : array-like, shape (B, n, m)

Optimal transport plan for each problem.

’potentials’tuple of array-like, shapes ((B, n), (B, m))
Scaling factors (f, g).
’n_iters’int
Number of iterations performed.

Return type:

dict

Examples

>>> import numpy as np
>>> from ot.batch import bregman_projection_batch
>>> # Create batch of affinity matrices
>>> K = np.random.randn(5, 10, 15)  # 5 problems, 10x15 cost matrices
>>> result = bregman_projection_batch(K)
>>> T = result['T']  # Shape (5, 10, 15)

See also

ot.batch.bregman_log_projection_batch: Bregman projection in the log-domain.

ot.batch.dist_batch(X1, X2=None, metric='sqeuclidean', p=2, nx=None)[source]

Batched version of ot.dist, use it to compute many distance matrices in parallel.

Parameters:

X1 (array-like, shape (b,n1,d)) – b matrices with n1 samples of size d
X2 (array-like, shape (b,n2,d), optional) – b matrices with n2 samples of size d (if None then \(\mathbf{X_2} = \mathbf{X_1}\))
metric (str, optional) – ‘sqeuclidean’, ‘euclidean’, ‘minkowski’ or ‘kl’
p (float, optional) – p-norm for the Minkowski metrics. Default value is 2.
nx (Backend, optional) – Backend to perform computations on. If omitted, the backend defaults to that of x1.

Returns:

M – distance matrix computed with given metric

Return type:

array-like, shape (b, n1, n2)

Examples

>>> import numpy as np
>>> from ot.batch import dist_batch
>>> X1 = np.random.randn(5, 10, 3)
>>> X2 = np.random.randn(5, 15, 3)
>>> M = dist_batch(X1, X2, metric="euclidean")
>>> M.shape
(5, 10, 15)

See also

ot.dist: equivalent non-batched function.

ot.batch.entropy_batch(T, nx=None, eps=1e-16)[source]

Computes the entropy of a batch of transport plans T.

\[H(T)_b = - \sum_{i,j} T_{b,i,j} \log(T_{b,i,j})\]

Parameters:

T (array-like, shape (b,n1,n2)) – b transport of size n1 x n2.
eps (float, optional) – Small constant to avoid numerical issues with log(0). Default is 1e-16.
nx (Backend, optional) – Backend to perform computations on. If omitted, the backend defaults to that of T.

ot.batch.loss_linear_batch(M, T, nx=None)[source]

Computes the linear optimal transport loss given a batch cost matrices and transport plans.

\[L(T, M)_b = \langle T_b, M_b \rangle_F\]

Parameters:

M (array-like, shape (B, ns, nt)) – Cost matrix
T (array-like, shape (B, ns, nt)) – Transport plan

Returns:

loss – Loss value for each batch element

Return type:

array-like, shape (B,)

See also

ot.batch.dist_batch: batched cost matrix computation for computing M.
ot.batch.solve_batch: solver for computing the optimal T.

ot.batch.loss_linear_samples_batch(X, Y, T, metric='l2')[source]

Computes the linear optimal transport loss given samples and transport plan. This is the equivalent of calling dist_batch and then loss_linear_batch.

Parameters:

X (array-like, shape (B, ns, d)) – Samples from source distribution
Y (array-like, shape (B, nt, d)) – Samples from target distribution
T (array-like, shape (B, ns, nt)) – Transport plan
metric (str, optional) – ‘sqeuclidean’, ‘euclidean’, ‘minkowski’ or ‘kl’

Returns:

loss – Loss value for each batch element

Return type:

array-like, shape (B,)

See also

ot.batch.dist_batch: batched cost matrix computation for computing M.
ot.batch.solve_batch: solver for computing the optimal T.

ot.batch.loss_quadratic_batch(L, T, recompute_const=False, symmetric=True, nx=None)[source]

Computes the gromov-wasserstein cost given a cost tensor and transport plan. Batched version.

Parameters:

L (dict) – Cost tensor as returned by tensor_batch.
T (array-like, shape (B, n, m)) – Transport plan.
recompute_const (bool, optional) – Whether to recompute the constant term. Default is False. This should be set to True if T does not satisfy the marginal constraints.
symmetric (bool, optional) – Whether to use symmetric version. Default is True.
nx (module, optional) – Backend to use. Default is None.

Examples

>>> import numpy as np
>>> from ot.batch import tensor_batch, loss_quadratic_batch
>>> # Create batch of cost matrices
>>> C1 = np.random.rand(3, 5, 5)  # 3 problems, 5x5 source matrices
>>> C2 = np.random.rand(3, 4, 4)  # 3 problems, 4x4 target matrices
>>> a = np.ones((3, 5)) / 5  # Uniform source distributions
>>> b = np.ones((3, 4)) / 4  # Uniform target distributions
>>> L = tensor_batch(a, b, C1, C2, loss='sqeuclidean')
>>> # Use the uniform transport plan for testing
>>> T = np.ones((3, 5, 4)) / (5 * 4)
>>> loss = loss_quadratic_batch(L, T, recompute_const=True)
>>> loss.shape
(3,)

See also

ot.batch.tensor_batch: From computing the cost tensor L.
ot.batch.solve_gromov_batch: For finding the optimal transport plan T.

ot.batch.loss_quadratic_samples_batch(a, b, C1, C2, T, loss='sqeuclidean', symmetric=None, nx=None, logits=None, recompute_const=False)[source]

Computes the gromov-wasserstein for samples C1, C2 and transport plan. Batched version.

Parameters:

a (array-like, shape (B, n)) – Source distributions.
b (array-like, shape (B, m)) – Target distributions.
C1 (array-like, shape (B, n, n) or (B, n, n, d)) – Source cost matrices.
C2 (array-like, shape (B, m, m) or (B, n, n, d)) – Target cost matrices.
T (array-like, shape (B, n, m)) – Transport plan.
loss (str, optional) – Loss function to use. Supported values: ‘sqeuclidean’, ‘kl’. Default is ‘sqeuclidean’.
recompute_const (bool, optional) – Whether to recompute the constant term. Default is False. This should be set to True if T does not satisfy the marginal constraints.
symmetric (bool, optional) – Whether to use symmetric version. Default is True.
nx (module, optional) – Backend to use. Default is None.

Examples

>>> import numpy as np
>>> from ot.batch import loss_quadratic_samples_batch
>>> # Create batch of cost matrices
>>> C1 = np.random.rand(3, 5, 5)  # 3 problems, 5x5 source matrices
>>> C2 = np.random.rand(3, 4, 4)  # 3 problems, 4x4 target matrices
>>> a = np.ones((3, 5)) / 5  # Uniform source distributions
>>> b = np.ones((3, 4)) / 4  # Uniform target distributions
>>> # Use the uniform transport plan for testing
>>> T = np.ones((3, 5, 4)) / (5 * 4)
>>> loss = loss_quadratic_samples_batch(a, b, C1, C2, T, recompute_const=True)
>>> loss.shape
(3,)

See also

ot.batch.tensor_batch: From computing the cost tensor L.
ot.batch.solve_gromov_batch: For finding the optimal transport plan T.

ot.batch.solve_batch(M, reg, a=None, b=None, max_iter=1000, tol=1e-05, solver='log_sinkhorn', reg_type='entropy', grad='envelope')[source]

Batched version of ot.solve, use it to solve many entropic OT problems in parallel.

Parameters:

M (array-like, shape (B, ns, nt)) – Cost matrix
reg (float) – Regularization parameter for entropic regularization
a (array-like, shape (B, ns)) – Source distribution (optional). If None, uniform distribution is used.
b (array-like, shape (B, nt)) – Target distribution (optional). If None, uniform distribution is used.
max_iter (int) – Maximum number of iterations
tol (float) – Tolerance for convergence
solver (str) – Solver to use, either ‘log_sinkhorn’ or ‘sinkhorn’. Default is “log_sinkhorn” which is more stable.
reg_type (str, optional) – Type of regularization \(R\) either “KL”, or “entropy”. Default is “entropy”.
grad (str, optional) – Type of gradient computation, either or ‘autodiff’, ‘envelope’ or ‘last_step’ used only for Sinkhorn solver. By default ‘autodiff’ provides gradients wrt all outputs (plan, value, value_linear) but with important memory cost. ‘envelope’ provides gradients only for value and and other outputs are detached. This is useful for memory saving when only the value is needed. ‘last_step’ provides gradients only for the last iteration of the Sinkhorn solver, but provides gradient for both the OT plan and the objective values. ‘detach’ does not compute the gradients for the Sinkhorn solver.

Returns:

res – Result of the optimization problem. The information can be obtained as follows:

res.plan : OT plan \(\mathbf{T}\)
res.potentials : OT dual potentials
res.value : Optimal value of the optimization problem
res.value_linear : Linear OT loss with the optimal OT plan

See OTResult for more information.

Return type:

OTResult()

Examples

>>> import numpy as np
>>> from ot.batch import solve_batch, dist_batch
>>> X = np.random.randn(5, 10, 3)  # 5 batches of 10 samples in 3D
>>> Y = np.random.randn(5, 15, 3)  # 5 batches of 15 samples in 3D
>>> M = dist_batch(X, Y, metric="euclidean")  # Compute cost matrices
>>> reg = 0.1
>>> result = solve_batch(M, reg)
>>> result.plan.shape  # Optimal transport plans for each batch
(5, 10, 15)
>>> result.value.shape  # Optimal transport values for each batch
(5,)

See also

ot.batch.dist_batch: batched cost matrix computation for computing M.
ot.solve: non-batched version of the OT solver.

ot.batch.solve_gromov_batch(C1, C2, reg=0.01, a=None, b=None, loss='sqeuclidean', symmetric=None, M=None, alpha=None, T_init=None, max_iter=50, tol=1e-05, max_iter_inner=50, tol_inner=1e-05, grad='envelope', logits=None)[source]

Solves a batch of Gromov-Wasserstein optimal transport problems using proximal gradient [12, 81]. For each problem in the batch, solves:

\[\begin{split}\begin{aligned} \min_{\mathbf{T} \geq 0} \quad & \sum_{i,j,k,l} L(\mathbf{C_1}_{i,k}, \mathbf{C_2}_{j,l}) \mathbf{T}_{i,j} \mathbf{T}_{k,l} \\ \text{s.t.} \quad & \mathbf{T} \mathbf{1} = \mathbf{a} \\ & \mathbf{T}^T \mathbf{1} = \mathbf{b} \\ & \mathbf{T} \geq 0 \end{aligned}\end{split}\]

If \(\mathbf{M}\) and \(\alpha\) are given, solves the more general fused Gromov-Wasserstein problem:

\[\begin{split}\begin{aligned} \min_{\mathbf{T} \geq 0} \quad & (1-\alpha) \sum_{i,j} M_{i,j} \mathbf{T}_{i,j} + \alpha \sum_{i,j,k,l} L(\mathbf{C_1}_{i,k}, \mathbf{C_2}_{j,l}) \mathbf{T}_{i,j} \mathbf{T}_{k,l} \\ \text{s.t.} \quad & \mathbf{T} \mathbf{1} = \mathbf{a} \\ & \mathbf{T}^T \mathbf{1} = \mathbf{b} \\ & \mathbf{T} \geq 0 \end{aligned}\end{split}\]

Writing the objective as \((1-\alpha) \langle \mathbf{M}, \mathbf{T} \rangle + \alpha \langle \mathcal{L} \otimes \mathbf{T}, \mathbf{T} \rangle\), the solver uses proximal gradient descent where each iteration is:

\[\begin{split}\begin{aligned} \mathbf{T}_{k+1} = \mathop{\arg \min}_{\mathbf{T} \geq 0} \quad & \langle \mathbf{M}_k, \mathbf{T} \rangle + \epsilon \, \text{KL}(\mathbf{T} \| \mathbf{T}_k) \\ \text{where} \quad & \mathbf{M}_k = (1 - \alpha) \mathbf{M} + 2 \alpha \mathcal{L} \otimes \mathbf{T}_k \end{aligned}\end{split}\]

This can be rewritten as:

\[\begin{aligned} \mathbf{T}_{k+1} = \mathop{\arg \min}_{\mathbf{T} \geq 0} \quad & \langle \mathbf{M}_k - \epsilon \log(\mathbf{T}_k), \mathbf{T} \rangle - \epsilon H(\mathbf{T}) \end{aligned}\]

where \(H(\mathbf{T})\) is the entropy of \(\mathbf{T}\). Thus each iteration can be solved using the Bregman projection solver implemented in bregman_log_projection_batch.

Note that the inner optimization problem does not need to be solved exactly. In practice it sufficient to set max_iter_inner to a low value (e.g. 20) and/or tol_inner to a high value (e.g. 1e-2).

Parameters:

C1 (array-like, shape (B, n, n, d) or (B, n, n)) – Samples affinity matrices from source distribution
C2 (array-like, shape (B, n, n, d) or (B, n, n)) – Samples affinity matrices from target distribution
a (array-like, shape (B, n), optional) – Marginal distribution of the source samples. If None, uniform distribution is used.
b (array-like, shape (B, m), optional) – Marginal distribution of the target samples. If None, uniform distribution is used.
loss (str, optional) – Type of loss function, can be ‘sqeuclidean’ or ‘kl’ or a QuadraticMetric instance.
symmetric (bool, optional) – Either C1 and C2 are to be assumed symmetric or not. If let to its default None value, a symmetry test will be conducted. Else if set to True (resp. False), C1 and C2 will be assumed symmetric (resp. asymmetric).
M (array-like, shape (dim_a, dim_b), optional) – Linear cost matrix for Fused Gromov-Wasserstein (default is None).
alpha (float, optional) – Weight the quadratic term (alpha*Gromov) and the linear term ((1-alpha)*Wass) in the Fused Gromov-Wasserstein problem. Not used for Gromov problem (when M is not provided). By default alpha=None corresponds to alpha=1 for Gromov problem (M==None) and alpha=0.5 for Fused Gromov-Wasserstein problem (M!=None)
epsilon (float, optional) – Regularization parameter for proximal gradient descent. Default is 1e-2.
T_init (array-like, shape (B, n, m), optional) – Initial transport plan. If None, it is initialized to uniform distribution.
max_iter (int, optional) – Maximum number of iterations for the proximal gradient descent. Default is 50.
tol (float, optional) – Tolerance for convergence of the proximal gradient descent. Default is 1e-5.
max_iter_inner (int, optional) – Maximum number of iterations for the inner Bregman projection. Default is 50.
tol_inner (float, optional) – Tolerance for convergence of the inner Bregman projection. Default is 1e-5.
grad (str, optional) – Type of gradient computation, either or ‘autodiff’, ‘envelope’ or ‘detach’. ‘autodiff’ provides gradients wrt all outputs (plan, value, value_linear) but with important memory cost. ‘envelope’ provides gradients only for (value, value_linear)`. detach` is the fastest option but provides no gradients. Default is ‘detach’.
assume_inner_convergence (bool, optional) – If True, assumes that the inner Bregman projection always converged i.e. the transport plan satisfies the marginal constraints. This enables faster computations of the tensor product but might results in inaccurate results (e.g. negative values of the loss). Default is True.

Returns:

res – Result of the optimization problem. The information can be obtained as follows:

res.plan : OT plan \(\mathbf{T}\)
res.potentials : OT dual potentials
res.value : Optimal value of the optimization problem
res.value_linear : Linear OT loss with the optimal OT plan
res.value_quad : Quadratic OT loss with the optimal OT plan

See OTResult for more information.

Return type:

OTResult()

See also

ot.batch.tensor_batch: From computing the cost tensor L.
ot.solve_gromov: Non-batched solver for Gromov-Wasserstein. Note that the non-batched solver uses a different algorithm (conditional gradient) and might not give the same results.

References

ot.batch.solve_sample_batch(X_a, X_b, reg, a=None, b=None, metric='sqeuclidean', p=2, max_iter=1000, tol=1e-05, solver='log_sinkhorn', reg_type='entropy', grad='envelope')[source]

Batched version of ot.solve, use it to solve many entropic OT problems in parallel.

Parameters:

M (array-like, shape (B, ns, nt)) – Cost matrix
reg (float) – Regularization parameter for entropic regularization
metric (str, optional) – ‘sqeuclidean’, ‘euclidean’, ‘minkowski’ or ‘kl’
p (float, optional) – p-norm for the Minkowski metrics. Default value is 2.
a (array-like, shape (B, ns)) – Source distribution (optional). If None, uniform distribution is used.
b (array-like, shape (B, nt)) – Target distribution (optional). If None, uniform distribution is used.
max_iter (int) – Maximum number of iterations
tol (float) – Tolerance for convergence
solver (str) – Solver to use, either ‘log_sinkhorn’ or ‘sinkhorn’. Default is “log_sinkhorn” which is more stable.
reg_type (str, optional) – Type of regularization \(R\) either “KL”, or “entropy”. Default is “entropy”.
grad (str, optional) – Type of gradient computation, either or ‘autodiff’, ‘envelope’ or ‘last_step’ used only for Sinkhorn solver. By default ‘autodiff’ provides gradients wrt all outputs (plan, value, value_linear) but with important memory cost. ‘envelope’ provides gradients only for value and and other outputs are detached. This is useful for memory saving when only the value is needed. ‘last_step’ provides gradients only for the last iteration of the Sinkhorn solver, but provides gradient for both the OT plan and the objective values. ‘detach’ does not compute the gradients for the Sinkhorn solver.

Returns:

res – Result of the optimization problem. The information can be obtained as follows:

res.plan : OT plan \(\mathbf{T}\)
res.potentials : OT dual potentials
res.value : Optimal value of the optimization problem
res.value_linear : Linear OT loss with the optimal OT plan

See OTResult for more information.

Return type:

OTResult()

See also

ot.batch.solve_batch: solver for computing the optimal T from arbitrary cost matrix M.

ot.batch.tensor_batch(a, b, C1, C2, symmetric=True, nx=None, loss='sqeuclidean', logits=None)[source]

Compute the Gromov-Wasserstein cost tensor for a batch of problems.

The Gromov-Wasserstein distance can be expressed as:

\[\text{GW}(\mathbf{T}, \mathbf{C}_1, \mathbf{C}_2) = \sum_{ijkl} T_{ik} T_{jl} \ell(C_{1,ij}, C_{2,kl}) = \langle \mathcal{L} \times \mathbf{T}, \mathbf{T} \rangle\]

where \(\mathcal{L}\) is a 4D tensor with elements \(\mathcal{L}[i,j,k,l] = \ell(C_{1,ij}, C_{2,kl})\).

For loss functions of the form \(\ell(a,b) = f_1(a) + f_2(b) - \langle h_1(a), h_2(b) \rangle\), the tensor product \(\mathcal{L} \times \mathbf{T}\) can be computed efficiently without explicitly computing \(\mathcal{L}\) [12].

This function precomputes all matrices that implicitly define mathcal{L} for various loss functions.

Parameters:

a (array-like, shape (B, n)) – Source distributions for each problem in the batch.
b (array-like, shape (B, m)) – Target distributions for each problem in the batch.
C1 (array-like, shape (B, n, n) or (B, n, n, d)) – Source cost matrices for each problem. Can be a 3D array for scalar costs or a 4D array for vector-valued costs (edge features).
C2 (array-like, shape (B, m, m) or (B, n, n, d)) – Target cost matrices for each problem. Can be a 3D array for scalar costs or a 4D array for vector-valued costs (edge features).
symmetric (bool, optional) – Whether the cost matrices are symmetric. Default is True.
nx (backend object, optional) – Numerical backend to use for computations. If None, the default backend is used.
loss (str, optional) – Loss function to use. Supported values: ‘sqeuclidean’, ‘kl’. Default is ‘sqeuclidean’.
logits (bool, optional) – For KL divergence, whether inputs are logits (unnormalized log probabilities). If True, inputs are treated as logits. Default is None.

Returns:

Dictionary containing: - constC : array-like, shape (B, n, m)

Constant term in the tensor product.

hC1 : array-like, shape (B, n, n, d) or (B, n, n)
hC2 : array-like, shape (B, m, m, d) or (B, m, m)
fC1 : array-like, shape (B, n, n)
fC2 : array-like, shape (B, m, m)

Return type:

dict

Supported loss functions:

Squared Euclidean loss:

\[\ell(a, b) = \|a - b\|_2^2 = \sum_i (a_i - b_i)^2\]

KL divergence:

\[\ell(a, b) = \sum_i a_i \log\left(\frac{a_i}{b_i}\right)\]

If logits=True, the entries of C1 are treated as logits (unnormalized log probabilities) and the loss becomes:

\[\ell(x, y) = \sum_i y_i (\log(y_i) - x_i)\]

Examples

>>> import numpy as np
>>> from ot.batch import tensor_batch
>>> # Create batch of cost matrices
>>> C1 = np.random.rand(3, 5, 5)  # 3 problems, 5x5 source matrices
>>> C2 = np.random.rand(3, 4, 4)  # 3 problems, 4x4 target matrices
>>> a = np.ones((3, 5)) / 5  # Uniform source distributions
>>> b = np.ones((3, 4)) / 4  # Uniform target distributions
>>> L = tensor_batch(a, b, C1, C2, loss='sqeuclidean')

References