Returns the Fused Gromov-Wasserstein transport between \((\mathbf{C_1}, \mathbf{Y_1}, \mathbf{p})\) and \((\mathbf{C_2}, \mathbf{Y_2}, \mathbf{q})\)
with pairwise distance matrix \(\mathbf{M}\) between node feature matrices \(\mathbf{Y_1}\) and \(\mathbf{Y_2}\),
estimated using the Bregman Alternated Projected Gradient method.
If marginal_loss=True, the function solves the following Fused Gromov-Wasserstein
optimization problem:
Else, the function solves an equivalent problem [63, 64], where constant terms depending
only on the marginals \(\mathbf{p}\) and \(\mathbf{q}\) are
discarded, assuming that L decomposes as in Proposition 1 in [12]:
\(\mathbf{M}\): pairwise relation matrix between features across domains
\(\mathbf{C_1}\): Metric cost matrix in the source space
\(\mathbf{C_2}\): Metric cost matrix in the target space
\(\mathbf{p}\): distribution in the source space
\(\mathbf{q}\): distribution in the target space
L: loss function to account for the misfit between the similarity and feature matrices
satisfying \(L(a, b) = f_1(a) + f_2(b) - h_1(a) h_2(b)\)
\(\alpha\): trade-off parameter
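For instance, the decomposition of L assumed above holds for the square loss with \(f_1(a) = a^2\), \(f_2(b) = b^2\), \(h_1(a) = a\), \(h_2(b) = 2b\), and for the KL loss with \(f_1(a) = a \log a - a\), \(f_2(b) = b\), \(h_1(a) = a\), \(h_2(b) = \log b\). A quick numerical sketch of both identities (illustration only, not part of the API):

```python
import numpy as np

rng = np.random.RandomState(0)
a = rng.rand(4) + 0.1   # keep values positive for the KL case
b = rng.rand(4) + 0.1

# square loss: (a - b)^2 = f1(a) + f2(b) - h1(a) h2(b)
assert np.allclose((a - b) ** 2, a ** 2 + b ** 2 - a * (2 * b))

# KL loss: a log(a/b) - a + b = f1(a) + f2(b) - h1(a) h2(b)
assert np.allclose(a * np.log(a / b) - a + b,
                   (a * np.log(a) - a) + b - a * np.log(b))
```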
Note
By algorithmic design the optimal coupling \(\mathbf{T}\)
returned by this function does not necessarily satisfy the marginal
constraints \(\mathbf{T}\mathbf{1}=\mathbf{p}\) and
\(\mathbf{T}^T\mathbf{1}=\mathbf{q}\). So the returned Fused
Gromov-Wasserstein loss does not necessarily satisfy distance
properties and may be negative.
Parameters:
M (array-like, shape (ns, nt)) – Pairwise relation matrix between features across domains
C1 (array-like, shape (ns, ns)) – Metric cost matrix in the source space
C2 (array-like, shape (nt, nt)) – Metric cost matrix in the target space
p (array-like, shape (ns,), optional) – Distribution in the source space.
If left to its default value None, the uniform distribution is taken.
q (array-like, shape (nt,), optional) – Distribution in the target space.
If left to its default value None, the uniform distribution is taken.
loss_fun (string, optional (default='square_loss')) – Loss function used for the solver, either ‘square_loss’ or ‘kl_loss’
epsilon (float, optional) – Regularization term >0
symmetric (bool, optional) – Whether C1 and C2 are to be assumed symmetric or not.
If left to its default None value, a symmetry test will be conducted.
Else if set to True (resp. False), C1 and C2 will be assumed symmetric (resp. asymmetric).
G0 (array-like, shape (ns,nt), optional) – If None the initial transport plan of the solver is pq^T.
Otherwise G0 will be used as the initial transport plan of the solver. G0 is not
required to satisfy the marginal constraints, but we strongly recommend that it does
in order to correctly estimate the GW distance.
max_iter (int, optional) – Max number of iterations
tol (float, optional) – Stop threshold on error (>0)
marginal_loss (bool, optional. Default is False.) – Include constant marginal terms or not in the objective function.
verbose (bool, optional) – Print information along iterations
Returns the Fused Gromov-Wasserstein loss between \((\mathbf{C_1}, \mathbf{Y_1}, \mathbf{p})\) and \((\mathbf{C_2}, \mathbf{Y_2}, \mathbf{q})\)
with pairwise distance matrix \(\mathbf{M}\) between node feature matrices \(\mathbf{Y_1}\) and \(\mathbf{Y_2}\),
estimated using the Bregman Alternated Projected Gradient method.
If marginal_loss=True, the function solves the following Fused Gromov-Wasserstein
optimization problem:
Else, the function solves an equivalent problem [63, 64], where constant terms depending
only on the marginals \(\mathbf{p}\) and \(\mathbf{q}\) are
discarded, assuming that L decomposes as in Proposition 1 in [12]:
\(\mathbf{M}\): metric cost matrix between features across domains
\(\mathbf{C_1}\): Metric cost matrix in the source space
\(\mathbf{C_2}\): Metric cost matrix in the target space
\(\mathbf{p}\): distribution in the source space
\(\mathbf{q}\): distribution in the target space
L: loss function to account for the misfit between the similarity and feature matrices
satisfying \(L(a, b) = f_1(a) + f_2(b) - h_1(a) h_2(b)\)
\(\alpha\): trade-off parameter
Note
By algorithmic design the optimal coupling \(\mathbf{T}\)
returned by this function does not necessarily satisfy the marginal
constraints \(\mathbf{T}\mathbf{1}=\mathbf{p}\) and
\(\mathbf{T}^T\mathbf{1}=\mathbf{q}\). So the returned Fused
Gromov-Wasserstein loss does not necessarily satisfy distance
properties and may be negative.
Parameters:
M (array-like, shape (ns, nt)) – Metric cost matrix between features across domains
C1 (array-like, shape (ns, ns)) – Metric cost matrix in the source space
C2 (array-like, shape (nt, nt)) – Metric cost matrix in the target space
p (array-like, shape (ns,), optional) – Distribution in the source space.
If left to its default value None, the uniform distribution is taken.
q (array-like, shape (nt,), optional) – Distribution in the target space.
If left to its default value None, the uniform distribution is taken.
loss_fun (string, optional (default='square_loss')) – Loss function used for the solver, either ‘square_loss’ or ‘kl_loss’
epsilon (float, optional) – Regularization term >0
symmetric (bool, optional) – Whether C1 and C2 are to be assumed symmetric or not.
If left to its default None value, a symmetry test will be conducted.
Else if set to True (resp. False), C1 and C2 will be assumed symmetric (resp. asymmetric).
G0 (array-like, shape (ns,nt), optional) – If None the initial transport plan of the solver is pq^T.
Otherwise G0 will be used as the initial transport plan of the solver. G0 is not
required to satisfy the marginal constraints, but we strongly recommend that it does
in order to correctly estimate the GW distance.
max_iter (int, optional) – Max number of iterations
tol (float, optional) – Stop threshold on error (>0)
marginal_loss (bool, optional. Default is False.) – Include constant marginal terms or not in the objective function.
verbose (bool, optional) – Print information along iterations
Returns the Gromov-Wasserstein transport between \((\mathbf{C_1}, \mathbf{p})\) and \((\mathbf{C_2}, \mathbf{q})\)
estimated using the Bregman Alternated Projected Gradient method.
If marginal_loss=True, the function solves the following Gromov-Wasserstein
optimization problem:
Else, the function solves an equivalent problem [63], where constant terms depending
only on the marginals \(\mathbf{p}\) and \(\mathbf{q}\) are
discarded, assuming that L decomposes as in Proposition 1 in [12]:
\(\mathbf{C_1}\): Metric cost matrix in the source space
\(\mathbf{C_2}\): Metric cost matrix in the target space
\(\mathbf{p}\): distribution in the source space
\(\mathbf{q}\): distribution in the target space
L: loss function to account for the misfit between the similarity matrices
satisfying \(L(a, b) = f_1(a) + f_2(b) - h_1(a) h_2(b)\)
Note
By algorithmic design the optimal coupling \(\mathbf{T}\)
returned by this function does not necessarily satisfy the marginal
constraints \(\mathbf{T}\mathbf{1}=\mathbf{p}\) and
\(\mathbf{T}^T\mathbf{1}=\mathbf{q}\). So the returned
Gromov-Wasserstein loss does not necessarily satisfy distance
properties and may be negative.
Parameters:
C1 (array-like, shape (ns, ns)) – Metric cost matrix in the source space
C2 (array-like, shape (nt, nt)) – Metric cost matrix in the target space
p (array-like, shape (ns,), optional) – Distribution in the source space.
If left to its default value None, the uniform distribution is taken.
q (array-like, shape (nt,), optional) – Distribution in the target space.
If left to its default value None, the uniform distribution is taken.
loss_fun (string, optional (default='square_loss')) – Loss function used for the solver, either ‘square_loss’ or ‘kl_loss’
epsilon (float, optional) – Regularization term >0
symmetric (bool, optional) – Whether C1 and C2 are to be assumed symmetric or not.
If left to its default None value, a symmetry test will be conducted.
Else if set to True (resp. False), C1 and C2 will be assumed symmetric (resp. asymmetric).
G0 (array-like, shape (ns,nt), optional) – If None the initial transport plan of the solver is pq^T.
Otherwise G0 will be used as the initial transport plan of the solver. G0 is not
required to satisfy the marginal constraints, but we strongly recommend that it does
in order to correctly estimate the GW distance.
max_iter (int, optional) – Max number of iterations
tol (float, optional) – Stop threshold on error (>0)
marginal_loss (bool, optional. Default is False.) – Include constant marginal terms or not in the objective function.
verbose (bool, optional) – Print information along iterations
Returns the Gromov-Wasserstein loss \(\mathbf{GW}\) between \((\mathbf{C_1}, \mathbf{p})\) and \((\mathbf{C_2}, \mathbf{q})\)
estimated using the Bregman Alternated Projected Gradient method.
If marginal_loss=True, the function solves the following Gromov-Wasserstein
optimization problem:
Else, the function solves an equivalent problem [63, 64], where constant terms depending
only on the marginals \(\mathbf{p}\) and \(\mathbf{q}\) are
discarded, assuming that L decomposes as in Proposition 1 in [12]:
\(\mathbf{C_1}\): Metric cost matrix in the source space
\(\mathbf{C_2}\): Metric cost matrix in the target space
\(\mathbf{p}\): distribution in the source space
\(\mathbf{q}\): distribution in the target space
L: loss function to account for the misfit between the similarity matrices
satisfying \(L(a, b) = f_1(a) + f_2(b) - h_1(a) h_2(b)\)
Note
By algorithmic design the optimal coupling \(\mathbf{T}\)
returned by this function does not necessarily satisfy the marginal
constraints \(\mathbf{T}\mathbf{1}=\mathbf{p}\) and
\(\mathbf{T}^T\mathbf{1}=\mathbf{q}\). So the returned
Gromov-Wasserstein loss does not necessarily satisfy distance
properties and may be negative.
Parameters:
C1 (array-like, shape (ns, ns)) – Metric cost matrix in the source space
C2 (array-like, shape (nt, nt)) – Metric cost matrix in the target space
p (array-like, shape (ns,), optional) – Distribution in the source space.
If left to its default value None, the uniform distribution is taken.
q (array-like, shape (nt,), optional) – Distribution in the target space.
If left to its default value None, the uniform distribution is taken.
loss_fun (string, optional (default='square_loss')) – Loss function used for the solver, either ‘square_loss’ or ‘kl_loss’
epsilon (float, optional) – Regularization term >0
symmetric (bool, optional) – Whether C1 and C2 are to be assumed symmetric or not.
If left to its default None value, a symmetry test will be conducted.
Else if set to True (resp. False), C1 and C2 will be assumed symmetric (resp. asymmetric).
G0 (array-like, shape (ns,nt), optional) – If None the initial transport plan of the solver is pq^T.
Otherwise G0 will be used as the initial transport plan of the solver. G0 is not
required to satisfy the marginal constraints, but we strongly recommend that it does
in order to correctly estimate the GW distance.
max_iter (int, optional) – Max number of iterations
tol (float, optional) – Stop threshold on error (>0)
marginal_loss (bool, optional. Default is False.) – Include constant marginal terms or not in the objective function.
verbose (bool, optional) – Print information along iterations
ot.gromov.GW_distance_estimation(C1, C2, p, q, loss_fun, T, nb_samples_p=None, nb_samples_q=None, std=True, random_state=None)[source]
Returns an approximation of the Gromov-Wasserstein loss between \((\mathbf{C_1}, \mathbf{p})\) and \((\mathbf{C_2}, \mathbf{q})\)
with a fixed transport plan \(\mathbf{T}\). To recover an approximation of the Gromov-Wasserstein distance as defined in [13] compute \(d_{GW} = \frac{1}{2} \sqrt{\mathbf{GW}}\).
The function gives an unbiased approximation of the following equation:
\(\mathbf{C_1}\): Metric cost matrix in the source space
\(\mathbf{C_2}\): Metric cost matrix in the target space
L : Loss function to account for the misfit between the similarity matrices
\(\mathbf{T}\): Matrix with marginal \(\mathbf{p}\) and \(\mathbf{q}\)
Parameters:
C1 (array-like, shape (ns, ns)) – Metric cost matrix in the source space
C2 (array-like, shape (nt, nt)) – Metric cost matrix in the target space
p (array-like, shape (ns,)) – Distribution in the source space
q (array-like, shape (nt,)) – Distribution in the target space
loss_fun (function: \(\mathbb{R} \times \mathbb{R} \mapsto \mathbb{R}\)) – Loss function used for the distance; the transport plan does not depend on the loss function
T (csr or array-like, shape (ns, nt)) – Transport plan matrix, either a sparse csr or a dense matrix
nb_samples_p (int, optional) – Number of samples (without replacement) along the first dimension of \(\mathbf{T}\)
nb_samples_q (int, optional) – Number of samples along the second dimension of \(\mathbf{T}\), for each sample along the first
std (bool, optional) – Whether to also return the standard deviation associated with the prediction of the Gromov-Wasserstein cost
random_state (int or RandomState instance, optional) – Fix the seed for reproducibility
Returns the Fused Gromov-Wasserstein barycenters of S measurable networks with node features \((\mathbf{C}_s, \mathbf{Y}_s, \mathbf{p}_s)_{1 \leq s \leq S}\)
estimated using Fused Gromov-Wasserstein transports from Sinkhorn projections.
The function solves the following optimization problem:
Ys (list of array-like, each element has shape (ns,d)) – Features of all samples
Cs (list of S array-like of shape (ns,ns)) – Metric cost matrices
ps (list of S array-like of shape (ns,), optional) – Sample weights in the S spaces.
If left to its default value None, uniform distributions are taken.
p (array-like, shape (N,), optional) – Weights in the targeted barycenter.
If left to its default value None, the uniform distribution is taken.
lambdas (list of float, optional) – List of the S spaces’ weights.
If left to its default value None, uniform weights are taken.
loss_fun (string, optional (default='square_loss')) – Loss function used for the solver either ‘square_loss’ or ‘kl_loss’
epsilon (float, optional) – Regularization term >0
symmetric (bool, optional) – Whether the structures are to be assumed symmetric or not. Default value is True.
If set to True (resp. False), all structures will be assumed symmetric (resp. asymmetric).
max_iter (int, optional) – Max number of iterations
tol (float, optional) – Stop threshold on error (>0)
stop_criterion (str, optional. Default is 'barycenter'.) – Stop criterion taking values in [‘barycenter’, ‘loss’]. If set to ‘barycenter’,
uses absolute norm variations of estimated barycenters. Else if set to ‘loss’,
uses the relative variations of the loss.
warmstartT (bool, optional) – Whether to perform warmstart of transport plans in the successive
Fused Gromov-Wasserstein transport problems.
verbose (bool, optional) – Print information along iterations.
init_C (bool | array-like, shape (N, N)) – Initial value for the \(\mathbf{C}\) matrix provided by the user.
If not set, a random initialization is used.
init_Y (array-like, shape (N,d), optional) – Initialization for the barycenters’ features. If not set, a
random initialization is used.
fixed_structure (bool, optional) – Whether to fix the structure of the barycenter during the updates.
fixed_features (bool, optional) – Whether to fix the features of the barycenter during the updates
random_state (int or RandomState instance, optional) – Fix the seed for reproducibility
**kwargs (dict) – Parameters that can be passed directly to the ot.entropic_fused_gromov_wasserstein solver.
Returns:
Y (array-like, shape (N, d)) – Feature matrix in the barycenter space (permuted arbitrarily)
C (array-like, shape (N, N)) – Similarity matrix in the barycenter space (permuted as Y’s rows)
log (dict) – Only returned when log=True. It contains the keys:
\(\mathbf{T}\): list of (N, ns) transport matrices
\(\mathbf{p}\): (N,) barycenter weights
\((\mathbf{M}_s)_s\): all distance matrices between the feature of the barycenter and the other features \((dist(\mathbf{X}, \mathbf{Y}_s))_s\) shape (N, ns)
Returns the Fused Gromov-Wasserstein transport between \((\mathbf{C_1}, \mathbf{Y_1}, \mathbf{p})\) and \((\mathbf{C_2}, \mathbf{Y_2}, \mathbf{q})\)
with pairwise distance matrix \(\mathbf{M}\) between node feature matrices \(\mathbf{Y_1}\) and \(\mathbf{Y_2}\),
estimated using Sinkhorn projections.
If solver=”PGD”, the function solves the following entropic-regularized
Fused Gromov-Wasserstein optimization problem using Projected Gradient Descent [12]:
\(\mathbf{M}\): metric cost matrix between features across domains
\(\mathbf{C_1}\): Metric cost matrix in the source space
\(\mathbf{C_2}\): Metric cost matrix in the target space
\(\mathbf{p}\): distribution in the source space
\(\mathbf{q}\): distribution in the target space
L: loss function to account for the misfit between the similarity and feature matrices
H: entropy
\(\alpha\): trade-off parameter
Note
If the inner solver ot.sinkhorn did not converge, the
optimal coupling \(\mathbf{T}\) returned by this function does not
necessarily satisfy the marginal constraints
\(\mathbf{T}\mathbf{1}=\mathbf{p}\) and
\(\mathbf{T}^T\mathbf{1}=\mathbf{q}\). So the returned
Fused Gromov-Wasserstein loss does not necessarily satisfy distance
properties and may be negative.
Parameters:
M (array-like, shape (ns, nt)) – Metric cost matrix between features across domains
C1 (array-like, shape (ns, ns)) – Metric cost matrix in the source space
C2 (array-like, shape (nt, nt)) – Metric cost matrix in the target space
p (array-like, shape (ns,), optional) – Distribution in the source space.
If left to its default value None, the uniform distribution is taken.
q (array-like, shape (nt,), optional) – Distribution in the target space.
If left to its default value None, the uniform distribution is taken.
loss_fun (string, optional (default='square_loss')) – Loss function used for the solver, either ‘square_loss’ or ‘kl_loss’
epsilon (float, optional) – Regularization term >0
symmetric (bool, optional) – Whether C1 and C2 are to be assumed symmetric or not.
If left to its default None value, a symmetry test will be conducted.
Else if set to True (resp. False), C1 and C2 will be assumed symmetric (resp. asymmetric).
G0 (array-like, shape (ns,nt), optional) – If None the initial transport plan of the solver is pq^T.
Otherwise G0 will be used as the initial transport plan of the solver. G0 is not
required to satisfy the marginal constraints, but we strongly recommend that it does
in order to correctly estimate the GW distance.
max_iter (int, optional) – Max number of iterations
tol (float, optional) – Stop threshold on error (>0)
solver (string, optional) – Solver to use either ‘PGD’ for Projected Gradient Descent or ‘PPA’
for Proximal Point Algorithm.
Default value is ‘PGD’.
warmstart (bool, optional) – Whether to perform warmstart of the dual potentials in the successive
Sinkhorn projections.
verbose (bool, optional) – Print information along iterations
**kwargs (dict) – Parameters that can be passed directly to the ot.sinkhorn solver,
such as numItermax and stopThr to control its estimation precision;
e.g., [51] suggests using numItermax=1.
Returns the Fused Gromov-Wasserstein distance between \((\mathbf{C_1}, \mathbf{Y_1}, \mathbf{p})\) and \((\mathbf{C_2}, \mathbf{Y_2}, \mathbf{q})\)
with pairwise distance matrix \(\mathbf{M}\) between node feature matrices \(\mathbf{Y_1}\) and \(\mathbf{Y_2}\),
estimated using Sinkhorn projections.
If solver=”PGD”, the function solves the following entropic-regularized
Fused Gromov-Wasserstein optimization problem using Projected Gradient Descent [12]:
\(\mathbf{M}\): metric cost matrix between features across domains
\(\mathbf{C_1}\): Metric cost matrix in the source space
\(\mathbf{C_2}\): Metric cost matrix in the target space
\(\mathbf{p}\): distribution in the source space
\(\mathbf{q}\): distribution in the target space
L: loss function to account for the misfit between the similarity and feature matrices
H: entropy
\(\alpha\): trade-off parameter
Note
If the inner solver ot.sinkhorn did not converge, the
optimal coupling \(\mathbf{T}\) returned by this function does not
necessarily satisfy the marginal constraints
\(\mathbf{T}\mathbf{1}=\mathbf{p}\) and
\(\mathbf{T}^T\mathbf{1}=\mathbf{q}\). So the returned
Fused Gromov-Wasserstein loss does not necessarily satisfy distance
properties and may be negative.
Parameters:
M (array-like, shape (ns, nt)) – Metric cost matrix between features across domains
C1 (array-like, shape (ns, ns)) – Metric cost matrix in the source space
C2 (array-like, shape (nt, nt)) – Metric cost matrix in the target space
p (array-like, shape (ns,), optional) – Distribution in the source space.
If left to its default value None, the uniform distribution is taken.
q (array-like, shape (nt,), optional) – Distribution in the target space.
If left to its default value None, the uniform distribution is taken.
loss_fun (string, optional (default='square_loss')) – Loss function used for the solver, either ‘square_loss’ or ‘kl_loss’
epsilon (float, optional) – Regularization term >0
symmetric (bool, optional) – Whether C1 and C2 are to be assumed symmetric or not.
If left to its default None value, a symmetry test will be conducted.
Else if set to True (resp. False), C1 and C2 will be assumed symmetric (resp. asymmetric).
G0 (array-like, shape (ns,nt), optional) – If None the initial transport plan of the solver is pq^T.
Otherwise G0 will be used as the initial transport plan of the solver. G0 is not
required to satisfy the marginal constraints, but we strongly recommend that it does
in order to correctly estimate the GW distance.
max_iter (int, optional) – Max number of iterations
tol (float, optional) – Stop threshold on error (>0)
verbose (bool, optional) – Print information along iterations
Returns the Gromov-Wasserstein barycenters of S measured similarity matrices \((\mathbf{C}_s)_{1 \leq s \leq S}\)
estimated using Gromov-Wasserstein transports from Sinkhorn projections.
The function solves the following optimization problem:
Cs (list of S array-like of shape (ns,ns)) – Metric cost matrices
ps (list of S array-like of shape (ns,), optional) – Sample weights in the S spaces.
If left to its default value None, uniform distributions are taken.
p (array-like, shape (N,), optional) – Weights in the targeted barycenter.
If left to its default value None, the uniform distribution is taken.
lambdas (list of float, optional) – List of the S spaces’ weights.
If left to its default value None, uniform weights are taken.
loss_fun (string, optional (default='square_loss')) – Loss function used for the solver either ‘square_loss’ or ‘kl_loss’
epsilon (float, optional) – Regularization term >0
symmetric (bool, optional) – Whether the structures are to be assumed symmetric or not. Default value is True.
If set to True (resp. False), all structures will be assumed symmetric (resp. asymmetric).
max_iter (int, optional) – Max number of iterations
tol (float, optional) – Stop threshold on error (>0)
stop_criterion (str, optional. Default is 'barycenter'.) – Convergence criterion taking values in [‘barycenter’, ‘loss’]. If set to ‘barycenter’
uses absolute norm variations of estimated barycenters. Else if set to ‘loss’
uses the relative variations of the loss.
warmstartT (bool, optional) – Whether to perform warmstart of transport plans in the successive
Gromov-Wasserstein transport problems.
verbose (bool, optional) – Print information along iterations.
Returns the Gromov-Wasserstein transport between \((\mathbf{C_1}, \mathbf{p})\) and \((\mathbf{C_2}, \mathbf{q})\)
estimated using Sinkhorn projections.
If solver=”PGD”, the function solves the following entropic-regularized
Gromov-Wasserstein optimization problem using Projected Gradient Descent [12]:
\(\mathbf{C_1}\): Metric cost matrix in the source space
\(\mathbf{C_2}\): Metric cost matrix in the target space
\(\mathbf{p}\): distribution in the source space
\(\mathbf{q}\): distribution in the target space
L: loss function to account for the misfit between the similarity matrices
H: entropy
Note
If the inner solver ot.sinkhorn did not converge, the
optimal coupling \(\mathbf{T}\) returned by this function does not
necessarily satisfy the marginal constraints
\(\mathbf{T}\mathbf{1}=\mathbf{p}\) and
\(\mathbf{T}^T\mathbf{1}=\mathbf{q}\). So the returned
Gromov-Wasserstein loss does not necessarily satisfy distance
properties and may be negative.
Parameters:
C1 (array-like, shape (ns, ns)) – Metric cost matrix in the source space
C2 (array-like, shape (nt, nt)) – Metric cost matrix in the target space
p (array-like, shape (ns,), optional) – Distribution in the source space.
If left to its default value None, the uniform distribution is taken.
q (array-like, shape (nt,), optional) – Distribution in the target space.
If left to its default value None, the uniform distribution is taken.
loss_fun (string, optional (default='square_loss')) – Loss function used for the solver, either ‘square_loss’ or ‘kl_loss’
epsilon (float, optional) – Regularization term >0
symmetric (bool, optional) – Whether C1 and C2 are to be assumed symmetric or not.
If left to its default None value, a symmetry test will be conducted.
Else if set to True (resp. False), C1 and C2 will be assumed symmetric (resp. asymmetric).
G0 (array-like, shape (ns,nt), optional) – If None the initial transport plan of the solver is pq^T.
Otherwise G0 will be used as the initial transport plan of the solver. G0 is not
required to satisfy the marginal constraints, but we strongly recommend that it does
in order to correctly estimate the GW distance.
max_iter (int, optional) – Max number of iterations
tol (float, optional) – Stop threshold on error (>0)
solver (string, optional) – Solver to use either ‘PGD’ for Projected Gradient Descent or ‘PPA’
for Proximal Point Algorithm.
Default value is ‘PGD’.
warmstart (bool, optional) – Whether to perform warmstart of the dual potentials in the successive
Sinkhorn projections.
verbose (bool, optional) – Print information along iterations
**kwargs (dict) – Parameters that can be passed directly to the ot.sinkhorn solver,
such as numItermax and stopThr to control its estimation precision;
e.g., [51] suggests using numItermax=1.
Returns the Gromov-Wasserstein loss \(\mathbf{GW}\) between \((\mathbf{C_1}, \mathbf{p})\) and \((\mathbf{C_2}, \mathbf{q})\)
estimated using Sinkhorn projections. To recover the Gromov-Wasserstein distance as defined in [13] compute \(d_{GW} = \frac{1}{2} \sqrt{\mathbf{GW}}\).
If solver=”PGD”, the function solves the following entropic-regularized
Gromov-Wasserstein optimization problem using Projected Gradient Descent [12]:
\(\mathbf{C_1}\): Metric cost matrix in the source space
\(\mathbf{C_2}\): Metric cost matrix in the target space
\(\mathbf{p}\): distribution in the source space
\(\mathbf{q}\): distribution in the target space
L: loss function to account for the misfit between the similarity matrices
H: entropy
Note
If the inner solver ot.sinkhorn did not converge, the
optimal coupling \(\mathbf{T}\) returned by this function does not
necessarily satisfy the marginal constraints
\(\mathbf{T}\mathbf{1}=\mathbf{p}\) and
\(\mathbf{T}^T\mathbf{1}=\mathbf{q}\). So the returned
Gromov-Wasserstein loss does not necessarily satisfy distance
properties and may be negative.
Parameters:
C1 (array-like, shape (ns, ns)) – Metric cost matrix in the source space
C2 (array-like, shape (nt, nt)) – Metric cost matrix in the target space
p (array-like, shape (ns,), optional) – Distribution in the source space.
If left to its default value None, the uniform distribution is taken.
q (array-like, shape (nt,), optional) – Distribution in the target space.
If left to its default value None, the uniform distribution is taken.
loss_fun (string, optional (default='square_loss')) – Loss function used for the solver, either ‘square_loss’ or ‘kl_loss’
epsilon (float, optional) – Regularization term >0
symmetric (bool, optional) – Whether C1 and C2 are to be assumed symmetric or not.
If left to its default None value, a symmetry test will be conducted.
Else if set to True (resp. False), C1 and C2 will be assumed symmetric (resp. asymmetric).
G0 (array-like, shape (ns,nt), optional) – If None the initial transport plan of the solver is pq^T.
Otherwise G0 will be used as the initial transport plan of the solver. G0 is not
required to satisfy the marginal constraints, but we strongly recommend that it does
in order to correctly estimate the GW distance.
max_iter (int, optional) – Max number of iterations
tol (float, optional) – Stop threshold on error (>0)
solver (string, optional) – Solver to use either ‘PGD’ for Projected Gradient Descent or ‘PPA’
for Proximal Point Algorithm.
Default value is ‘PGD’.
warmstart (bool, optional) – Whether to perform warmstart of the dual potentials in the successive
Sinkhorn projections.
verbose (bool, optional) – Print information along iterations
**kwargs (dict) – Parameters that can be passed directly to the ot.sinkhorn solver,
such as numItermax and stopThr to control its estimation precision;
e.g., [51] suggests using numItermax=1.
Computes the entropic-regularized semi-relaxed FGW transport between two graphs (see [48])
estimated using a Mirror Descent algorithm following the KL geometry.
\(\mathbf{M}\): metric cost matrix between features, of shape (ns, nt)
\(\mathbf{C_1}\): Metric cost matrix in the source space
\(\mathbf{C_2}\): Metric cost matrix in the target space
\(\mathbf{p}\): source weights (summing to 1)
L: loss function to account for the misfit between the similarity matrices
Note
This function is backend-compatible and will work on arrays
from all compatible backends. However, not all steps of the solver
are differentiable.
The algorithm used for solving the problem is mirror descent, as discussed in [48]
Parameters:
M (array-like, shape (ns, nt)) – Metric cost matrix between features across domains
C1 (array-like, shape (ns, ns)) – Metric cost matrix representative of the structure in the source space
C2 (array-like, shape (nt, nt)) – Metric cost matrix representative of the structure in the target space
p (array-like, shape (ns,), optional) – Distribution in the source space.
If left to its default value None, the uniform distribution is taken.
loss_fun (str) – Loss function used for the solver, either ‘square_loss’ or ‘kl_loss’.
symmetric (bool, optional) – Whether C1 and C2 are to be assumed symmetric or not.
If left to its default None value, a symmetry test will be conducted.
Else if set to True (resp. False), C1 and C2 will be assumed symmetric (resp. asymmetric).
G0 (array-like, shape (ns,nt), optional) – If None the initial transport plan of the solver is pq^T.
Otherwise G0 must satisfy marginal constraints and will be used as initial transport of the solver.
max_iter (int, optional) – Max number of iterations
tol (float, optional) – Stop threshold on error computed on transport plans
Computes the entropic-regularized semi-relaxed FGW divergence between two graphs (see [48])
estimated using a Mirror Descent algorithm following the KL geometry.
\(\mathbf{M}\): metric cost matrix between features, of shape (ns, nt)
\(\mathbf{C_1}\): Metric cost matrix in the source space
\(\mathbf{C_2}\): Metric cost matrix in the target space
\(\mathbf{p}\): source weights (summing to 1)
L: loss function to account for the misfit between the similarity matrices
Note
This function is backend-compatible and will work on arrays
from all compatible backends. However, not all steps of the solver
are differentiable.
The algorithm used for solving the problem is mirror descent, as discussed in [48]
Parameters:
M (array-like, shape (ns, nt)) – Metric cost matrix between features across domains
C1 (array-like, shape (ns, ns)) – Metric cost matrix representative of the structure in the source space.
C2 (array-like, shape (nt, nt)) – Metric cost matrix representative of the structure in the target space.
p (array-like, shape (ns,), optional) – Distribution in the source space.
If left to its default value None, the uniform distribution is taken.
loss_fun (str, optional) – Loss function used for the solver, either ‘square_loss’ or ‘kl_loss’.
symmetric (bool, optional) – Whether C1 and C2 are to be assumed symmetric or not.
If left to its default None value, a symmetry test will be conducted.
Else if set to True (resp. False), C1 and C2 will be assumed symmetric (resp. asymmetric).
G0 (array-like, shape (ns,nt), optional) – If None the initial transport plan of the solver is pq^T.
Otherwise G0 must satisfy marginal constraints and will be used as initial transport of the solver.
max_iter (int, optional) – Max number of iterations
tol (float, optional) – Stop threshold on error computed on transport plans
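In the semi-relaxed problem only the source marginal \(\mathbf{p}\) is enforced; the second marginal is left free and can be read off the optimal plan. A small NumPy sketch on toy data (the plan below is hand-built for illustration, not a solver output):

```python
import numpy as np

rng = np.random.default_rng(0)
p = np.full(5, 1 / 5)

# Illustrative plan whose rows sum to p, as the semi-relaxed constraint requires.
T = rng.random((5, 3))
T = T / T.sum(axis=1, keepdims=True) * p[:, None]

# The target weights are implied by the plan: q = T^T 1.
q_implied = T.sum(axis=0)
assert np.allclose(T.sum(axis=1), p)
assert np.isclose(q_implied.sum(), 1.0)
```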
Returns the entropic-regularized semi-relaxed Gromov-Wasserstein divergence
transport plan from \((\mathbf{C_1}, \mathbf{p})\) to \(\mathbf{C_2}\)
estimated using a Mirror Descent algorithm following the KL geometry.
The function solves the following optimization problem:
\(\mathbf{C_1}\): Metric cost matrix in the source space
\(\mathbf{C_2}\): Metric cost matrix in the target space
\(\mathbf{p}\): distribution in the source space
L: loss function to account for the misfit between the similarity matrices
Note
This function is backend-compatible and will work on arrays
from all compatible backends. However, not all the steps in the conditional
gradient are differentiable.
Parameters:
C1 (array-like, shape (ns, ns)) – Metric cost matrix in the source space
C2 (array-like, shape (nt, nt)) – Metric cost matrix in the target space
p (array-like, shape (ns,), optional) – Distribution in the source space.
If left to its default value None, a uniform distribution is used.
loss_fun (str) – loss function used for the solver either ‘square_loss’ or ‘kl_loss’.
symmetric (bool, optional) – Whether C1 and C2 are assumed symmetric.
If left to its default value None, a symmetry test is conducted.
If set to True (resp. False), C1 and C2 are assumed symmetric (resp. asymmetric).
verbose (bool, optional) – Print information along iterations
G0 (array-like, shape (ns,nt), optional) – If None the initial transport plan of the solver is pq^T.
Otherwise G0 must satisfy marginal constraints and will be used as initial transport of the solver.
max_iter (int, optional) – Max number of iterations
tol (float, optional) – Stop threshold on error computed on transport plans
Returns the entropic-regularized semi-relaxed Gromov-Wasserstein divergence
from \((\mathbf{C_1}, \mathbf{p})\) to \(\mathbf{C_2}\)
estimated using a Mirror Descent algorithm following the KL geometry.
The function solves the following optimization problem:
\(\mathbf{C_1}\): Metric cost matrix in the source space
\(\mathbf{C_2}\): Metric cost matrix in the target space
\(\mathbf{p}\): distribution in the source space
L: loss function to account for the misfit between the similarity matrices
Note that when using backends, this loss function is differentiable with respect to the
matrices (C1, C2), but not yet with respect to the weights p.
Note
This function is backend-compatible and will work on arrays
from all compatible backends. However, not all the steps in the conditional
gradient are differentiable.
Parameters:
C1 (array-like, shape (ns, ns)) – Metric cost matrix in the source space
C2 (array-like, shape (nt, nt)) – Metric cost matrix in the target space
p (array-like, shape (ns,), optional) – Distribution in the source space.
If left to its default value None, a uniform distribution is used.
loss_fun (str) – loss function used for the solver either ‘square_loss’ or ‘kl_loss’.
symmetric (bool, optional) – Whether C1 and C2 are assumed symmetric.
If left to its default value None, a symmetry test is conducted.
If set to True (resp. False), C1 and C2 are assumed symmetric (resp. asymmetric).
verbose (bool, optional) – Print information along iterations
G0 (array-like, shape (ns,nt), optional) – If None the initial transport plan of the solver is pq^T.
Otherwise G0 must satisfy marginal constraints and will be used as initial transport of the solver.
max_iter (int, optional) – Max number of iterations
tol (float, optional) – Stop threshold on error computed on transport plans
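A single mirror-descent step in the KL geometry can be sketched as a multiplicative update followed by a KL projection onto the row-marginal constraint. This is a simplified illustration, not the library's implementation; the regularization eps and the gradient G are placeholders:

```python
import numpy as np

def kl_mirror_step(T, G, p, eps):
    """One KL mirror-descent step for the semi-relaxed problem:
    multiplicative update, then KL projection onto {T : T 1 = p}."""
    T = T * np.exp(-G / eps)                            # step in KL geometry
    T = T / T.sum(axis=1, keepdims=True) * p[:, None]   # restore row marginals
    return T

rng = np.random.default_rng(0)
p = np.full(4, 1 / 4)
T = np.outer(p, np.full(3, 1 / 3))            # product-coupling initialization
T1 = kl_mirror_step(T, rng.random((4, 3)), p, eps=0.1)
assert np.allclose(T1.sum(axis=1), p)
assert (T1 >= 0).all()
```

The multiplicative form keeps the plan nonnegative by construction, which is why the KL geometry is a natural fit here.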
Returns the Fused Gromov-Wasserstein barycenters of S measurable networks with node features \((\mathbf{C}_s, \mathbf{Y}_s, \mathbf{p}_s)_{1 \leq s \leq S}\)
(see eq (5) in [24]), estimated using Fused Gromov-Wasserstein transports from Conditional Gradient solvers.
The function solves the following optimization problem:
N (int) – Desired number of samples of the target barycenter
Ys (list of array-like, each element has shape (ns,d)) – Features of all samples
Cs (list of array-like, each element has shape (ns,ns)) – Structure matrices of all samples
ps (list of array-like, each element has shape (ns,), optional) – Masses of all samples.
If left to its default value None, uniform distributions are used.
lambdas (list of float, optional) – List of the S spaces’ weights.
If left to its default value None, uniform weights are used.
alpha (float, optional) – Alpha parameter for the fgw distance.
fixed_structure (bool, optional) – Whether to fix the structure of the barycenter during the updates.
fixed_features (bool, optional) – Whether to fix the feature of the barycenter during the updates
p (array-like, shape (N,), optional) – Weights in the targeted barycenter.
If left to its default value None, a uniform distribution is used.
loss_fun (str, optional) – Loss function used for the solver either ‘square_loss’ or ‘kl_loss’
armijo (bool, optional) – If True, the step of the line-search is found via an Armijo line-search.
Else a closed form is used. If there are convergence issues, use False.
symmetric (bool, optional) – Whether the structures are assumed symmetric. Default value is True.
If set to False, they are assumed asymmetric.
max_iter (int, optional) – Max number of iterations
tol (float, optional) – Stop threshold on relative error (>0)
stop_criterion (str, optional. Default is 'barycenter'.) – Stop criterion taking values in [‘barycenter’, ‘loss’]. If set to ‘barycenter’
uses absolute norm variations of estimated barycenters. Else if set to ‘loss’
uses the relative variations of the loss.
warmstartT (bool, optional) – Whether to warmstart the transport plans in the successive
fused Gromov-Wasserstein transport problems.
verbose (bool, optional) – Print information along iterations.
init_C (array-like, shape (N,N), optional) – Initialization for the barycenters’ structure matrix. If not set
a random init is used.
init_X (array-like, shape (N,d), optional) – Initialization for the barycenters’ features. If not set a
random init is used.
random_state (int or RandomState instance, optional) – Fix the seed for reproducibility
Returns:
X (array-like, shape (N, d)) – Barycenters’ features
C (array-like, shape (N, N)) – Barycenters’ structure matrix
log (dict) – Only returned when log=True. It contains the keys:
\(\mathbf{T}\): list of (N, ns) transport matrices
\(\mathbf{p}\): (N,) barycenter weights
\((\mathbf{M}_s)_s\): all distance matrices between the feature of the barycenter and the other features \((dist(\mathbf{X}, \mathbf{Y}_s))_s\) shape (N, ns)
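For the square loss, the barycenter structure update performed between the successive transport solves admits a well-known closed form, \(\mathbf{C} = \sum_s \lambda_s \mathbf{T}_s^T \mathbf{C}_s \mathbf{T}_s / (\mathbf{p}\mathbf{p}^T)\). A toy NumPy sketch (not the library's exact code; the orientation of the plans depends on the implementation):

```python
import numpy as np

def update_structure(lambdas, Ts, Cs, p):
    """Square-loss barycenter structure update:
    C = sum_s lambda_s T_s^T C_s T_s / (p p^T)."""
    N = len(p)
    C = np.zeros((N, N))
    for lam, T, C_s in zip(lambdas, Ts, Cs):
        C += lam * T.T @ C_s @ T
    return C / np.outer(p, p)

# Two toy input structures coupled to a 3-point barycenter by product plans.
p = np.full(3, 1 / 3)
ps = np.full(3, 1 / 3)
Cs = [np.ones((3, 3)), 2 * np.ones((3, 3))]
Ts = [np.outer(ps, p), np.outer(ps, p)]
C = update_structure([0.5, 0.5], Ts, Cs, p)
# With these toy inputs every entry equals 1.5, the mean of the two levels.
assert np.allclose(C, 1.5)
```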
Format an attributed graph \((\mathbf{C}, \mathbf{F}, \mathbf{p})\)
with structure matrix \(\mathbf{C} \in \mathbb{R}^{n \times n}\), feature matrix
\(\mathbf{F} \in \mathbb{R}^{n \times d}\) and node relative importance
\(\mathbf{p} \in \Sigma_n\), into a partitioned attributed graph
taking into account partitions and representants
\(\mathcal{P} = \left\{ (\mathbf{P_i}, \mathbf{r_i}) \right\}_i\).
Parameters:
C (array-like, shape (n, n)) – Structure matrix.
p (array-like, shape (n,)) – Node distribution.
part (array-like, shape (n,)) – Array of partition assignment for each node.
rep_indices (list of array-like of ints, shape (npart,)) – Indices of the representative node of each partition, sorted according to
partition identifiers.
F (array-like, shape (n, d), optional. Default is None) – Optional feature matrix aligned with the graph structure.
M (array-like, shape (n, n), optional. Default is None) – Optional pairwise similarity matrix between features.
alpha (float, optional. Default is 1.) – Trade-off parameter in \(]0, 1]\) between structure and features.
If alpha = 1, features are ignored. This trade-off is taken into account
in the outputted relations between nodes and representants.
nx (backend, optional) – POT backend.
Returns:
CR (array-like, shape (npart, npart)) – Structure matrix between partition representants.
list_R (list of npart arrays) – List of relations between a representant and the nodes in its partition,
for each partition.
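The coarse structure matrix \(\mathbf{CR}\) between representants can be recovered by restricting \(\mathbf{C}\) to the representative indices, e.g. (a simplified sketch with toy data, not the library's code):

```python
import numpy as np

# Toy structure matrix over n = 4 nodes, two partitions whose
# representative nodes are 0 and 2.
C = np.arange(16, dtype=float).reshape(4, 4)
rep_indices = [0, 2]

# Structure between representants: keep only the representative rows/columns.
CR = C[np.ix_(rep_indices, rep_indices)]
assert CR.shape == (2, 2)
assert CR[0, 1] == C[0, 2]
```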
Format an attributed graph \((\mathbf{D}(\mathbf{X}), \mathbf{F}, \mathbf{p})\)
with Euclidean structure matrix \(\mathbf{D}(\mathbf{X}) \in \mathbb{R}^{n \times n}\),
feature matrix \(\mathbf{F} \in \mathbb{R}^{n \times d}\) and node relative importance
\(\mathbf{p} \in \Sigma_n\), into a partitioned attributed graph
taking into account partitions and representants
\(\mathcal{P} = \left\{ (\mathbf{P_i}, \mathbf{r_i}) \right\}_i\).
Parameters:
X (array-like, shape (n, d)) – Sample matrix.
p (array-like, shape (n,)) – Node distribution.
part (array-like, shape (n,)) – Array of partition assignment for each node.
rep_indices (list of array-like of ints, shape (npart,)) – Indices of the representative node of each partition, sorted according to
partition identifiers.
F (array-like, shape (n, d), optional. Default is None) – Optional feature matrix aligned with the samples.
alpha (float, optional. Default is 1.) – Trade-off parameter in \(]0, 1]\) between structure and features.
If alpha = 1, features are ignored. This trade-off is taken into account
in the outputted relations between nodes and representants.
nx (backend, optional) – POT backend.
Returns:
CR (array-like, shape (npart, npart)) – Structure matrix between partition representants.
list_R (list of npart arrays) – List of relations between a representant and the nodes in its partition,
for each partition.
Returns the Fused Gromov-Wasserstein transport between \((\mathbf{C_1}, \mathbf{Y_1}, \mathbf{p})\) and \((\mathbf{C_2}, \mathbf{Y_2}, \mathbf{q})\)
with pairwise distance matrix \(\mathbf{M}\) between node feature matrices \(\mathbf{Y_1}\) and \(\mathbf{Y_2}\) (see [24]).
The function solves the following optimization problem using Conditional Gradient:
\(\mathbf{M}\): metric cost matrix between features across domains
\(\mathbf{C_1}\): Metric cost matrix in the source space
\(\mathbf{C_2}\): Metric cost matrix in the target space
\(\mathbf{p}\): distribution in the source space
\(\mathbf{q}\): distribution in the target space
L: loss function to account for the misfit between the similarity and feature matrices
\(\alpha\): trade-off parameter
Note
This function is backend-compatible and will work on arrays
from all compatible backends. But the algorithm uses the C++ CPU backend
which can lead to copy overhead on GPU arrays.
Note
All computations in the conjugate gradient solver are done with
numpy to limit memory overhead.
Note
This function will cast the computed transport plan to the data
type of the provided input \(\mathbf{M}\). Casting to an integer
tensor might result in a loss of precision. If this behaviour is
unwanted, please make sure to provide a floating point input.
Parameters:
M (array-like, shape (ns, nt)) – Metric cost matrix between features across domains
C1 (array-like, shape (ns, ns)) – Metric cost matrix representative of the structure in the source space
C2 (array-like, shape (nt, nt)) – Metric cost matrix representative of the structure in the target space
p (array-like, shape (ns,), optional) – Distribution in the source space.
If left to its default value None, a uniform distribution is used.
q (array-like, shape (nt,), optional) – Distribution in the target space.
If left to its default value None, a uniform distribution is used.
loss_fun (str, optional) – Loss function used for the solver
symmetric (bool, optional) – Whether C1 and C2 are assumed symmetric.
If left to its default value None, a symmetry test is conducted.
If set to True (resp. False), C1 and C2 are assumed symmetric (resp. asymmetric).
armijo (bool, optional) – If True, the step of the line-search is found via an Armijo line-search. Else a closed form is used.
If there are convergence issues, use False.
G0 (array-like, shape (ns,nt), optional) – If None the initial transport plan of the solver is pq^T.
Otherwise G0 must satisfy marginal constraints and will be used as initial transport of the solver.
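For a fixed coupling, the FGW objective combines a linear feature term with a quadratic structure term. A brute-force NumPy sketch for the square loss on toy data (illustrative only; the solver uses a much more efficient factorization of the quadratic term):

```python
import numpy as np

def fgw_cost(M, C1, C2, T, alpha):
    """FGW objective at a fixed coupling T (square loss), by brute force:
    (1-alpha) <M, T> + alpha * sum_{ijkl} (C1[i,k]-C2[j,l])^2 T[i,j] T[k,l]."""
    lin = np.sum(M * T)
    diff2 = (C1[:, None, :, None] - C2[None, :, None, :]) ** 2
    quad = (diff2 * T[:, :, None, None] * T[None, None, :, :]).sum()
    return (1 - alpha) * lin + alpha * quad

# Identical spaces with a diagonal coupling give zero structure cost.
C = np.array([[0.0, 1.0], [1.0, 0.0]])
T = np.eye(2) / 2
M = np.zeros((2, 2))
assert np.isclose(fgw_cost(M, C, C, T, alpha=0.5), 0.0)
```

This only scales to tiny examples; it is meant to make the objective concrete, not to be used as a solver.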
Returns the Fused Gromov-Wasserstein distance between \((\mathbf{C_1}, \mathbf{Y_1}, \mathbf{p})\) and \((\mathbf{C_2}, \mathbf{Y_2}, \mathbf{q})\)
with pairwise distance matrix \(\mathbf{M}\) between node feature matrices \(\mathbf{Y_1}\) and \(\mathbf{Y_2}\) (see [24]).
The function solves the following optimization problem using Conditional Gradient:
\(\mathbf{M}\): metric cost matrix between features across domains
\(\mathbf{C_1}\): Metric cost matrix in the source space
\(\mathbf{C_2}\): Metric cost matrix in the target space
\(\mathbf{p}\): distribution in the source space
\(\mathbf{q}\): distribution in the target space
L: loss function to account for the misfit between the similarity and feature matrices
\(\alpha\): trade-off parameter
Note that when using backends, this loss function is differentiable with respect to the
matrices (C1, C2, M) and weights (p, q) for the quadratic loss, using the gradients from [38].
Note
This function is backend-compatible and will work on arrays
from all compatible backends. But the algorithm uses the C++ CPU backend
which can lead to copy overhead on GPU arrays.
Note
All computations in the conjugate gradient solver are done with
numpy to limit memory overhead.
Note
This function will cast the computed transport plan to the data
type of the provided input \(\mathbf{M}\). Casting to an integer
tensor might result in a loss of precision. If this behaviour is
unwanted, please make sure to provide a floating point input.
Parameters:
M (array-like, shape (ns, nt)) – Metric cost matrix between features across domains
C1 (array-like, shape (ns, ns)) – Metric cost matrix representative of the structure in the source space.
C2 (array-like, shape (nt, nt)) – Metric cost matrix representative of the structure in the target space.
p (array-like, shape (ns,), optional) – Distribution in the source space.
If left to its default value None, a uniform distribution is used.
q (array-like, shape (nt,), optional) – Distribution in the target space.
If left to its default value None, a uniform distribution is used.
loss_fun (str, optional) – Loss function used for the solver.
symmetric (bool, optional) – Whether C1 and C2 are assumed symmetric.
If left to its default value None, a symmetry test is conducted.
If set to True (resp. False), C1 and C2 are assumed symmetric (resp. asymmetric).
armijo (bool, optional) – If True, the step of the line-search is found via an Armijo line-search.
Else a closed form is used. If there are convergence issues, use False.
G0 (array-like, shape (ns,nt), optional) – If None the initial transport plan of the solver is pq^T.
Otherwise G0 must satisfy marginal constraints and will be used as initial transport of the solver.
Infer Fused Gromov-Wasserstein linear dictionary \(\{ (\mathbf{C_{dict}[d]}, \mathbf{Y_{dict}[d]}, \mathbf{q}) \}_{d \in [D]}\) from the list of S attributed structures \(\{ (\mathbf{C_s}, \mathbf{Y_s},\mathbf{p_s}) \}_s\)
nt (int) – Number of samples within each dictionary atoms
alpha (float) – Trade-off parameter of Fused Gromov-Wasserstein
reg (float, optional) – Coefficient of the negative quadratic regularization used to promote sparsity of w. The default is 0.
ps (list of S array-like, shape (ns,), optional) – Distribution in each source space C of Cs. Default is None and corresponds to uniform distributions.
q (array-like, shape (nt,), optional) – Distribution in the embedding space whose structure will be learned. Default is None and corresponds to uniform distributions.
epochs (int, optional) – Number of epochs used to learn the dictionary. Default is 32.
batch_size (int, optional) – Batch size for each stochastic gradient update of the dictionary. Set to the dataset size if the provided batch_size is higher than the dataset size. Default is 32.
learning_rate_C (float, optional) – Learning rate used for the stochastic gradient descent on Cdict. Default is 1.
learning_rate_Y (float, optional) – Learning rate used for the stochastic gradient descent on Ydict. Default is 1.
Cdict_init (list of D array-like with shape (nt, nt), optional) – Used to initialize the dictionary structures Cdict.
If set to None (default), the dictionary is initialized randomly.
Else Cdict_init must have shape (D, nt, nt).
Ydict_init (list of D array-like with shape (nt, d), optional) – Used to initialize the dictionary features Ydict.
If set to None, the dictionary features are initialized randomly.
Else Ydict_init must have shape (D, nt, d), where d is the feature dimension of the inputs Ys.
projection (str, optional) – If ‘nonnegative’ and/or ‘symmetric’ is in projection, the corresponding projection is performed at each stochastic update of the dictionary.
Otherwise the set of atoms is \(\mathbb{R}^{nt \times nt}\). Default is ‘nonnegative_symmetric’.
log (bool, optional) – If set to True, losses evolution by batches and epochs are tracked. Default is False.
use_adam_optimizer (bool, optional) – If set to True, the Adam optimizer with default settings is used as the adaptive learning rate strategy.
Else SGD with a fixed learning rate is performed. Default is True.
tol_outer (float, optional) – Solver precision for the BCD algorithm, measured by absolute relative error on consecutive losses. Default is \(10^{-5}\).
tol_inner (float, optional) – Solver precision for the Conjugate Gradient algorithm used to get optimal w at a fixed transport, measured by absolute relative error on consecutive losses. Default is \(10^{-5}\).
max_iter_outer (int, optional) – Maximum number of iterations for the BCD. Default is 20.
max_iter_inner (int, optional) – Maximum number of iterations for the Conjugate Gradient. Default is 200.
verbose (bool, optional) – Print the reconstruction loss every epoch. Default is False.
random_state (int, RandomState instance or None, default=None) – Determines random number generation. Pass an int for reproducible
output across multiple function calls.
Returns:
Cdict_best_state (D array-like, shape (D,nt,nt)) – Metric/Graph cost matrices composing the dictionary.
The dictionary leading to the best loss over an epoch is saved and returned.
Ydict_best_state (D array-like, shape (D,nt,d)) – Feature matrices composing the dictionary.
The dictionary leading to the best loss over an epoch is saved and returned.
log (dict) – If use_log is True, contains loss evolutions by batches and epochs.
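The learned dictionary models each structure linearly in the atoms: the embedding of a weight vector w is \(\sum_d w_d \mathbf{C_{dict}}[d]\). A toy NumPy sketch with hypothetical atoms (not learned ones):

```python
import numpy as np

# Two toy dictionary atoms of size nt = 3 and a weight vector w on the simplex.
Cdict = np.stack([np.eye(3), np.ones((3, 3))])
w = np.array([0.25, 0.75])

# Embedded structure: C(w) = sum_d w_d * Cdict[d] (einsum contracts over d).
C_embedded = np.einsum('d,dij->ij', w, Cdict)
assert C_embedded.shape == (3, 3)
assert np.isclose(C_embedded[0, 1], 0.75)  # off-diagonal: only the ones atom
assert np.isclose(C_embedded[0, 0], 1.0)   # diagonal: 0.25 * 1 + 0.75 * 1
```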
Returns the Fused Gromov-Wasserstein linear unmixing of \((\mathbf{C},\mathbf{Y},\mathbf{p})\) onto the attributed dictionary atoms \(\{ (\mathbf{C_{dict}[d]},\mathbf{Y_{dict}[d]}, \mathbf{q}) \}_{d \in [D]}\)
\(\mathbf{C}\) is a (ns,ns) pairwise similarity matrix of variable size ns.
\(\mathbf{Y}\) is a (ns,d) features matrix of variable size ns and fixed dimension d.
\(\mathbf{C_{dict}}\) is a (D, nt, nt) tensor of D pairwise similarity matrix of fixed size nt.
\(\mathbf{Y_{dict}}\) is a (D, nt, d) tensor of D features matrix of fixed size nt and fixed dimension d.
\(\mathbf{p}\) is the source distribution corresponding to \(\mathbf{C_s}\)
\(\mathbf{q}\) is the target distribution assigned to every structure in the embedding space.
\(\alpha\) is the trade-off parameter of Fused Gromov-Wasserstein
reg is the regularization coefficient.
The algorithm used for solving the problem is a Block Coordinate Descent as discussed in [38], Algorithm 6.
Parameters:
C (array-like, shape (ns, ns)) – Metric/Graph cost matrix.
Y (array-like, shape (ns, d)) – Feature matrix.
Cdict (D array-like, shape (D,nt,nt)) – Metric/Graph cost matrices composing the dictionary on which to embed (C,Y).
Ydict (D array-like, shape (D,nt,d)) – Feature matrices composing the dictionary on which to embed (C,Y).
alpha (float,) – Trade-off parameter of Fused Gromov-Wasserstein.
reg (float, optional) – Coefficient of the negative quadratic regularization used to promote sparsity of w. The default is 0.
p (array-like, shape (ns,), optional) – Distribution in the source space C. Default is None and corresponds to uniform distribution.
q (array-like, shape (nt,), optional) – Distribution in the space depicted by the dictionary. Default is None and corresponds to uniform distribution.
tol_outer (float, optional) – Solver precision for the BCD algorithm.
tol_inner (float, optional) – Solver precision for the Conjugate Gradient algorithm used to get optimal w at a fixed transport. Default is \(10^{-5}\).
max_iter_outer (int, optional) – Maximum number of iterations for the BCD. Default is 20.
max_iter_inner (int, optional) – Maximum number of iterations for the Conjugate Gradient. Default is 200.
Returns:
w (array-like, shape (D,)) – fused Gromov-Wasserstein linear unmixing of (C,Y,p) onto the span of the dictionary.
Cembedded (array-like, shape (nt,nt)) – embedded structure of \((\mathbf{C},\mathbf{Y}, \mathbf{p})\) onto the dictionary, \(\sum_d w_d\mathbf{C_{dict}[d]}\).
Yembedded (array-like, shape (nt,d)) – embedded features of \((\mathbf{C},\mathbf{Y}, \mathbf{p})\) onto the dictionary, \(\sum_d w_d\mathbf{Y_{dict}[d]}\).
T (array-like (ns,nt)) – Fused Gromov-Wasserstein transport plan between \((\mathbf{C},\mathbf{p})\) and \((\sum_d w_d\mathbf{C_{dict}[d]}, \sum_d w_d\mathbf{Y_{dict}[d]},\mathbf{q})\).
Partition a given graph with structure matrix \(\mathbf{C} \in \mathbb{R}^{n \times n}\)
into npart partitions, either ‘random’, or using one of the {‘louvain’, ‘fluid’}
algorithms from networkx, or ‘spectral’ clustering from scikit-learn,
or (Fused) Gromov-Wasserstein projections from POT.
Parameters:
C (array-like, shape (n, n)) – Structure matrix.
npart (int,) – number of partitions/clusters smaller than the number of nodes in
\(\mathbf{C}\).
part_method (str, optional. Default is 'random'.) – Partitioning algorithm to use among {‘random’, ‘louvain’, ‘fluid’, ‘spectral’, ‘GW’, ‘FGW’}.
‘random’ for random sampling of points; ‘louvain’ and ‘fluid’ for graph
partitioning algorithms that work well on adjacency matrices (if the
louvain algorithm is used, npart is ignored); ‘spectral’ for spectral
clustering; ‘(F)GW’ for (F)GW projection using sr(F)GW solvers.
F (array-like, shape (n, d), optional. (Default is None)) – Optional feature matrix aligned with the graph structure. Only used if
part_method=”FGW”.
alpha (float, optional. (Default is 1.)) – Trade-off parameter between feature and structure matrices, taking
values in [0, 1] and only used if F is not None and part_method=”FGW”.
random_state (int, optional) – Random seed for the partitioning algorithm.
nx (backend, optional) – POT backend.
Returns:
part – Array of partition assignment for each node.
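The ‘random’ option simply draws one of the npart labels uniformly for each node, e.g. (an illustrative sketch, not the library's exact sampling code):

```python
import numpy as np

rng = np.random.default_rng(42)
n, npart = 10, 3

# 'random' partitioning: one label in {0, ..., npart - 1} per node.
part = rng.integers(0, npart, size=n)
assert part.shape == (n,)
assert part.min() >= 0 and part.max() < npart
```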
Get a representative node for each partition given by \(\mathbf{part} \in \mathbb{R}^{n}\)
of a graph with structure matrix \(\mathbf{C} \in \mathbb{R}^{n \times n}\).
Selection is done either randomly or using the ‘pagerank’ algorithm from networkx.
Parameters:
C (array-like, shape (n, n)) – structure matrix.
part (array-like, shape (n,)) – Array of partition assignment for each node.
rep_method (str, optional. Default is 'pagerank'.) – Selection method for the representant in each partition. Can be either ‘random’,
i.e. random sampling within each partition, or ‘pagerank’ to select a
node with maximal pagerank.
random_state (int, optional) – Random seed for the partitioning algorithm
nx (backend, optional) – POT backend
Returns:
rep_indices – indices for representative node of each partition sorted
according to partition identifiers.
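A simplified selection rule in the spirit of the ‘pagerank’ option: within each partition, keep the node with the largest total connectivity (row sum of \(\mathbf{C}\)), used below as a crude stand-in for an actual pagerank score:

```python
import numpy as np

def representants_by_score(C, part):
    """Pick, per partition, the node maximizing the row sum of C
    (a crude stand-in for a pagerank score), sorted by partition id."""
    score = C.sum(axis=1)
    rep_indices = []
    for pid in np.unique(part):                  # partition ids in sorted order
        nodes = np.flatnonzero(part == pid)
        rep_indices.append(nodes[np.argmax(score[nodes])])
    return rep_indices

# Toy symmetric structure with partitions {0, 1} and {2, 3}.
C = np.array([[0, 1, 0, 0],
              [1, 0, 1, 0],
              [0, 1, 0, 2],
              [0, 0, 2, 0]], dtype=float)
part = np.array([0, 0, 1, 1])
reps = representants_by_score(C, part)
assert reps == [1, 2]  # the best-connected node of each partition
```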
Compute npart partitions and representants over samples \(\mathbf{X} \in \mathbb{R}^{n \times d}\)
using either a random or a k-means algorithm.
Parameters:
X (array-like, shape (n, d)) – Samples endowed with a Euclidean geometry.
npart (int,) – number of partitions smaller than the number of samples in
\(\mathbf{X}\).
method (str, optional. Default is 'kmeans'.) – Partitioning and representant selection algorithms to use among
{‘random’, ‘kmeans’}. ‘random’ for random sampling of points; ‘kmeans’
for k-means clustering using scikit-learn implementation where closest
points to centroids are considered as representants.
random_state (int, optional) – Random seed for the partitioning algorithm.
nx (backend, optional) – POT backend.
Returns:
part (array-like, shape (npart,)) – Array of partition assignment for each node.
rep_indices (list, shape (npart,)) – indices for representative node of each partition sorted
according to partition identifiers.
Cs (list of S array-like of shape (ns, ns)) – Metric cost matrices
ps (list of S array-like of shape (ns,), optional) – Sample weights in the S spaces.
If left to its default value None, uniform distributions are used.
p (array-like, shape (N,), optional) – Weights in the targeted barycenter.
If left to its default value None, a uniform distribution is used.
lambdas (list of float, optional) – List of the S spaces’ weights.
If left to its default value None, uniform weights are used.
loss_fun (callable, optional) – tensor-matrix multiplication function based on specific loss function
symmetric (bool, optional.) – Whether the structures are assumed symmetric. Default value is True.
If set to False, they are assumed asymmetric.
armijo (bool, optional) – If True, the step of the line-search is found via an Armijo line-search.
Else a closed form is used. If there are convergence issues, use False.
max_iter (int, optional) – Max number of iterations
tol (float, optional) – Stop threshold on relative error (>0)
stop_criterion (str, optional. Default is 'barycenter'.) – Stop criterion taking values in [‘barycenter’, ‘loss’]. If set to ‘barycenter’
uses absolute norm variations of estimated barycenters. Else if set to ‘loss’
uses the relative variations of the loss.
warmstartT (bool, optional) – Whether to warmstart the transport plans in the successive
fused Gromov-Wasserstein transport problems.
verbose (bool, optional) – Print information along iterations.
\(\mathbf{C_1}\): Metric cost matrix in the source space.
\(\mathbf{C_2}\): Metric cost matrix in the target space.
\(\mathbf{p}\): Distribution in the source space.
\(\mathbf{q}\): Distribution in the target space.
L: Loss function to account for the misfit between the similarity matrices.
Note
This function is backend-compatible and will work on arrays
from all compatible backends. But the algorithm uses the C++ CPU backend
which can lead to copy overhead on GPU arrays.
Note
All computations in the conjugate gradient solver are done with
numpy to limit memory overhead.
Note
This function will cast the computed transport plan to the data
type of the provided input \(\mathbf{C}_1\). Casting to an integer
tensor might result in a loss of precision. If this behaviour is
unwanted, please make sure to provide a floating point input.
Parameters:
C1 (array-like, shape (ns, ns)) – Metric cost matrix in the source space.
C2 (array-like, shape (nt, nt)) – Metric cost matrix in the target space.
p (array-like, shape (ns,), optional) – Distribution in the source space.
If left to its default value None, a uniform distribution is used.
q (array-like, shape (nt,), optional) – Distribution in the target space.
If left to its default value None, a uniform distribution is used.
loss_fun (str, optional) – Loss function used for the solver either ‘square_loss’ or ‘kl_loss’.
symmetric (bool, optional) – Whether C1 and C2 are assumed symmetric.
If left to its default value None, a symmetry test is conducted.
If set to True (resp. False), C1 and C2 are assumed symmetric (resp. asymmetric).
verbose (bool, optional) – Print information along iterations.
armijo (bool, optional) – If True, the step of the line-search is found via an armijo search. Else closed form is used.
If there are convergence issues, use False.
G0 (array-like, shape (ns,nt), optional) – If None, the initial transport plan of the solver is pq^T.
Otherwise G0 must satisfy marginal constraints and will be used as initial transport of the solver.
max_iter (int, optional) – Max number of iterations.
tol_rel (float, optional) – Stop threshold on relative error (>0).
tol_abs (float, optional) – Stop threshold on absolute error (>0).
**kwargs (dict) – Parameters can be directly passed to the ot.optim.cg solver.
Returns the Gromov-Wasserstein loss \(\mathbf{GW}\) between \((\mathbf{C_1}, \mathbf{p})\) and \((\mathbf{C_2}, \mathbf{q})\).
To recover the Gromov-Wasserstein distance as defined in [13] compute \(d_{GW} = \frac{1}{2} \sqrt{\mathbf{GW}}\).
The function solves the following optimization problem using Conditional Gradient:
\(\mathbf{C_1}\): Metric cost matrix in the source space
\(\mathbf{C_2}\): Metric cost matrix in the target space
\(\mathbf{p}\): distribution in the source space
\(\mathbf{q}\): distribution in the target space
L: loss function to account for the misfit between the similarity matrices
Note that when using backends, this loss function is differentiable with respect to the
matrices (C1, C2) and weights (p, q) for the quadratic loss, using the gradients from [38].
Note
This function is backend-compatible and will work on arrays
from all compatible backends. But the algorithm uses the C++ CPU backend
which can lead to copy overhead on GPU arrays.
Note
All computations in the conjugate gradient solver are done with
numpy to limit memory overhead.
Note
This function will cast the computed transport plan to the data
type of the provided input \(\mathbf{C}_1\). Casting to an integer
tensor might result in a loss of precision. If this behaviour is
unwanted, please make sure to provide a floating point input.
Parameters:
C1 (array-like, shape (ns, ns)) – Metric cost matrix in the source space
C2 (array-like, shape (nt, nt)) – Metric cost matrix in the target space
p (array-like, shape (ns,), optional) – Distribution in the source space.
If left to its default value None, a uniform distribution is used.
q (array-like, shape (nt,), optional) – Distribution in the target space.
If left to its default value None, a uniform distribution is used.
loss_fun (str) – loss function used for the solver either ‘square_loss’ or ‘kl_loss’
symmetric (bool, optional) – Whether C1 and C2 are assumed symmetric.
If left to its default value None, a symmetry test is conducted.
If set to True (resp. False), C1 and C2 are assumed symmetric (resp. asymmetric).
verbose (bool, optional) – Print information along iterations
armijo (bool, optional) – If True, the step of the line-search is found via an Armijo line-search. Else a closed form is used.
If there are convergence issues, use False.
G0 (array-like, shape (ns,nt), optional) – If None the initial transport plan of the solver is pq^T.
Otherwise G0 must satisfy marginal constraints and will be used as initial transport of the solver.
max_iter (int, optional) – Max number of iterations
tol_rel (float, optional) – Stop threshold on relative error (>0)
tol_abs (float, optional) – Stop threshold on absolute error (>0)
**kwargs (dict) – parameters can be directly passed to the ot.optim.cg solver
Returns:
gw_dist (float) – Gromov-Wasserstein distance
log (dict) – convergence information and Coupling matrix
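The conditional-gradient solver relies on the decomposition \(L(a, b) = f_1(a) + f_2(b) - h_1(a) h_2(b)\); for the square loss, \(f_1(a) = a^2\), \(f_2(b) = b^2\), \(h_1(a) = a\), \(h_2(b) = 2b\), so the quartic-looking cost reduces to matrix products. A NumPy sketch checking the identity on toy data (illustrative, not the library's implementation):

```python
import numpy as np

rng = np.random.default_rng(1)
ns, nt = 4, 3
C1, C2 = rng.random((ns, ns)), rng.random((nt, nt))
p, q = np.full(ns, 1 / ns), np.full(nt, 1 / nt)
T = np.outer(p, q)  # coupling with exact marginals p and q

# Brute force: sum_{ijkl} (C1[i,k] - C2[j,l])^2 T[i,j] T[k,l]
diff2 = (C1[:, None, :, None] - C2[None, :, None, :]) ** 2
brute = (diff2 * T[:, :, None, None] * T[None, None, :, :]).sum()

# Decomposition L(a,b) = a^2 + b^2 - 2ab: the squared terms only involve the
# marginals, while the cross term is the matrix product h1(C1) T h2(C2)^T.
const = (C1 ** 2 @ p) @ p + (C2 ** 2 @ q) @ q
cross = np.sum((C1 @ T @ (2 * C2).T) * T)
assert np.isclose(brute, const - cross)
```

This is why discarding the constant terms, as in the semi-relaxed solvers above, changes the value but not the minimizer when the marginals are fixed.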
Infer Gromov-Wasserstein linear dictionary \(\{ (\mathbf{C_{dict}[d]}, q) \}_{d \in [D]}\) from the list of structures \(\{ (\mathbf{C_s},\mathbf{p_s}) \}_s\)
nt (int) – Number of samples within each dictionary atoms
reg (float, optional) – Coefficient of the negative quadratic regularization used to promote sparsity of w. The default is 0.
ps (list of S array-like, shape (ns,), optional) – Distribution in each source space C of Cs. Default is None and corresponds to uniform distributions.
q (array-like, shape (nt,), optional) – Distribution in the embedding space whose structure will be learned. Default is None and corresponds to uniform distributions.
epochs (int, optional) – Number of epochs used to learn the dictionary. Default is 32.
batch_size (int, optional) – Batch size for each stochastic gradient update of the dictionary. Set to the dataset size if the provided batch_size is higher than the dataset size. Default is 32.
learning_rate (float, optional) – Learning rate used for the stochastic gradient descent. Default is 1.
Cdict_init (list of D array-like with shape (nt, nt), optional) – Used to initialize the dictionary.
If set to None (default), the dictionary will be initialized randomly.
Otherwise Cdict_init must have shape (D, nt, nt), i.e. match the provided shapes.
projection (str, optional) – If ‘nonnegative’ and/or ‘symmetric’ is in projection, the corresponding projection will be performed at each stochastic update of the dictionary.
Otherwise the set of atoms is \(\mathbb{R}^{nt \times nt}\). Default is ‘nonnegative_symmetric’.
log (bool, optional) – If set to True, losses evolution by batches and epochs are tracked. Default is False.
use_adam_optimizer (bool, optional) – If set to True, the Adam optimizer with default settings is used as an adaptive learning-rate strategy.
Otherwise SGD with a fixed learning rate is performed. Default is True.
tol_outer (float, optional) – Solver precision for the BCD algorithm, measured by absolute relative error on consecutive losses. Default is \(10^{-5}\).
tol_inner (float, optional) – Solver precision for the Conjugate Gradient algorithm used to get optimal w at a fixed transport, measured by absolute relative error on consecutive losses. Default is \(10^{-5}\).
max_iter_outer (int, optional) – Maximum number of iterations for the BCD. Default is 20.
max_iter_inner (int, optional) – Maximum number of iterations for the Conjugate Gradient. Default is 200.
verbose (bool, optional) – Print the reconstruction loss every epoch. Default is False.
random_state (int, RandomState instance or None, default=None) – Determines random number generation. Pass an int for reproducible
output across multiple function calls.
Returns:
Cdict_best_state (D array-like, shape (D,nt,nt)) – Metric/Graph cost matrices composing the dictionary.
The dictionary leading to the best loss over an epoch is saved and returned.
log (dict) – If log is True, contains loss evolutions by batches and epochs.
Returns the Gromov-Wasserstein linear unmixing of \((\mathbf{C},\mathbf{p})\) onto the dictionary \(\{ (\mathbf{C_{dict}[d]}, \mathbf{q}) \}_{d \in [D]}\).
\(\mathbf{C}\) is the (ns,ns) pairwise similarity matrix.
\(\mathbf{C_{dict}}\) is a (D, nt, nt) tensor of D pairwise similarity matrices of size nt.
\(\mathbf{p}\) and \(\mathbf{q}\) are source and target weights.
reg is the regularization coefficient.
The algorithm used for solving the problem is a Block Coordinate Descent as discussed in [38], Algorithm 1.
Parameters:
C (array-like, shape (ns, ns)) – Metric/Graph cost matrix.
Cdict (D array-like, shape (D,nt,nt)) – Metric/Graph cost matrices composing the dictionary on which to embed C.
reg (float, optional.) – Coefficient of the negative quadratic regularization used to promote sparsity of w. Default is 0.
p (array-like, shape (ns,), optional) – Distribution in the source space C. Default is None and corresponds to uniform distribution.
q (array-like, shape (nt,), optional) – Distribution in the space depicted by the dictionary. Default is None and corresponds to uniform distribution.
tol_outer (float, optional) – Solver precision for the BCD algorithm.
tol_inner (float, optional) – Solver precision for the Conjugate Gradient algorithm used to get optimal w at a fixed transport. Default is \(10^{-5}\).
max_iter_outer (int, optional) – Maximum number of iterations for the BCD. Default is 20.
max_iter_inner (int, optional) – Maximum number of iterations for the Conjugate Gradient. Default is 200.
Returns:
w (array-like, shape (D,)) – Gromov-Wasserstein linear unmixing of \((\mathbf{C},\mathbf{p})\) onto the span of the dictionary.
Cembedded (array-like, shape (nt,nt)) – embedded structure of \((\mathbf{C},\mathbf{p})\) onto the dictionary, \(\sum_d w_d\mathbf{C_{dict}[d]}\).
T (array-like (ns, nt)) – Gromov-Wasserstein transport plan between \((\mathbf{C},\mathbf{p})\) and \((\sum_d w_d\mathbf{C_{dict}[d]}, \mathbf{q})\)
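The returned Cembedded is simply the linear model evaluated at the unmixing weights. A minimal numpy sketch with illustrative values (not a call to the library itself):

```python
import numpy as np

# Illustrative dictionary of D atoms, each an (nt, nt) structure matrix
rng = np.random.default_rng(0)
D, nt = 3, 4
Cdict = rng.random((D, nt, nt))
w = np.array([0.5, 0.3, 0.2])          # unmixing weights on the simplex

# Embedded structure: sum_d w_d * Cdict[d]
Cembedded = np.einsum('d,dij->ij', w, Cdict)
```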
ot.gromov.init_matrix(C1, C2, p, q, loss_fun='square_loss', nx=None)[source]
Return loss matrices and tensors for Gromov-Wasserstein fast computation
Returns the value of \(\mathcal{L}(\mathbf{C_1}, \mathbf{C_2}) \otimes \mathbf{T}\) with the
selected loss function as the loss function of Gromov-Wasserstein discrepancy.
The matrices are computed as described in Proposition 1 in [12]
Where :
\(\mathbf{C_1}\): Metric cost matrix in the source space
\(\mathbf{C_2}\): Metric cost matrix in the target space
\(\mathbf{T}\): A coupling between those two spaces
The square-loss function \(L(a, b) = |a - b|^2\) decomposes with \(f_1(a) = a^2\), \(f_2(b) = b^2\), \(h_1(a) = a\), \(h_2(b) = 2b\).
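Under this decomposition the tensor product can be evaluated with a few matrix products instead of a quartic sum over all index pairs. A minimal numpy sketch of the construction for the square loss (an illustration of Proposition 1 in [12], not the library's implementation):

```python
import numpy as np

def init_matrix_square_loss(C1, C2, p, q):
    """Decomposition for L(a, b) = (a - b)**2, i.e.
    f1(a) = a**2, f2(b) = b**2, h1(a) = a, h2(b) = 2 * b."""
    ns, nt = len(p), len(q)
    constC = (np.outer((C1 ** 2) @ p, np.ones(nt))     # f1(C1) p 1_nt^T
              + np.outer(np.ones(ns), (C2 ** 2) @ q))  # 1_ns (f2(C2) q)^T
    return constC, C1, 2 * C2                          # constC, hC1, hC2

def tensor_product(constC, hC1, hC2, T):
    # L(C1, C2) (x) T = constC - hC1 T hC2^T, valid when T has marginals (p, q)
    return constC - hC1 @ T @ hC2.T

# Sanity check against the brute-force definition of the tensor product
rng = np.random.default_rng(0)
ns, nt = 4, 3
C1 = rng.random((ns, ns))
C2 = rng.random((nt, nt))
p, q = np.full(ns, 1 / ns), np.full(nt, 1 / nt)
T = np.outer(p, q)                                     # feasible coupling
constC, hC1, hC2 = init_matrix_square_loss(C1, C2, p, q)
tens = tensor_product(constC, hC1, hC2, T)
```

The Gromov-Wasserstein loss is then the Frobenius inner product between this tensor and the coupling T.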
Return loss matrices and tensors for semi-relaxed Gromov-Wasserstein fast computation
Returns the value of \(\mathcal{L}(\mathbf{C_1}, \mathbf{C_2}) \otimes \mathbf{T}\) with the
selected loss function as the loss function of semi-relaxed Gromov-Wasserstein discrepancy.
The matrices are computed as described in Proposition 1 in [12]
and adapted to the semi-relaxed problem where the second marginal is not a constant anymore.
Where :
\(\mathbf{C_1}\): Metric cost matrix in the source space
\(\mathbf{C_2}\): Metric cost matrix in the target space
\(\mathbf{T}\): A coupling between those two spaces
The square-loss function \(L(a, b) = |a - b|^2\) decomposes with \(f_1(a) = a^2\), \(f_2(b) = b^2\), \(h_1(a) = a\), \(h_2(b) = 2b\).
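In the semi-relaxed problem only the term depending on the fixed first marginal can be precomputed; the second-marginal term must be recomputed from the current coupling. A numpy sketch for the square loss (illustrative names, not the library's code):

```python
import numpy as np

rng = np.random.default_rng(0)
ns, nt = 4, 3
C1 = rng.random((ns, ns))
C2 = rng.random((nt, nt))
p = np.full(ns, 1.0 / ns)
T = rng.random((ns, nt))
T *= (p / T.sum(axis=1))[:, None]        # enforce only T 1 = p

# Precomputable part (square loss: f1(a) = a**2)
constC = np.outer((C1 ** 2) @ p, np.ones(nt))
# Marginal-dependent part, recomputed at each iteration (f2(b) = b**2)
qT = T.sum(axis=0)                       # current second marginal T^T 1
marginal_term = np.outer(np.ones(ns), (C2 ** 2) @ qT)
# Full tensor product with hC1 = C1, hC2 = 2 * C2
tens = constC + marginal_term - C1 @ T @ (2 * C2).T
```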
rank (int, optional. Default is None. (>0)) – Nonnegative rank of the OT plan. If None, min(ns, nt) is considered.
alpha (float, optional. Default is 1e-10. (>0 and <1/r)) – Lower bound for the weight vector g.
rescale_cost (bool, optional. Default is False) – Rescale the low rank factorization of the sqeuclidean cost matrix
seed_init (int, optional. Default is 49. (>0)) – Random state for the ‘random’ initialization of low rank couplings
gamma_init (str, optional. Default is "rescale".) – Initialization strategy for gamma. ‘rescale’, or ‘theory’
Gamma is a constant that scales the convergence criterion of the Mirror Descent
optimization scheme used to compute the low-rank couplings (Q, R and g)
numItermax (int, optional. Default is 1000.) – Max number of iterations for Low Rank GW
stopThr (float, optional. Default is 1e-4.) – Stop threshold on error (>0) for Low Rank GW
The error is the sum of Kullback Divergences computed for each low rank
coupling (Q, R and g) and scaled using gamma.
numItermax_dykstra (int, optional. Default is 2000.) – Max number of iterations for the Dykstra algorithm
stopThr_dykstra (float, optional. Default is 1e-7.) – Stop threshold on error (>0) in Dykstra
cost_factorized_Xs (tuple, optional. Default is None) – Tuple with two pre-computed low rank decompositions (A1, A2) of the source cost
matrix. Both matrices should have a shape of (n_samples_a, dim_Xs + 2).
If None, the low rank cost matrices will be computed as sqeuclidean cost matrices.
cost_factorized_Xt (tuple, optional. Default is None) – Tuple with two pre-computed low rank decompositions (B1, B2) of the target cost
matrix. Both matrices should have a shape of (n_samples_b, dim_Xt + 2).
If None, the low rank cost matrices will be computed as sqeuclidean cost matrices.
warn (bool, optional) – If True, raises a warning if the low-rank GW algorithm doesn’t converge.
warn_dykstra (bool, optional) – If True, raises a warning if the Dykstra algorithm doesn’t converge.
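The (n_samples, dim + 2) shapes come from the classical exact low-rank factorization of the squared Euclidean cost matrix. A numpy sketch of that construction (an illustration of the expected factor shapes, not the function's actual interface):

```python
import numpy as np

def lowrank_sqeuclidean(X, Y):
    """Exact factorization M = A @ B.T of the squared Euclidean cost
    M[i, j] = ||X[i] - Y[j]||**2, with A of shape (n, d + 2) and
    B of shape (m, d + 2), matching the shapes described above."""
    nx2 = np.sum(X ** 2, axis=1, keepdims=True)   # ||x_i||^2, shape (n, 1)
    ny2 = np.sum(Y ** 2, axis=1, keepdims=True)   # ||y_j||^2, shape (m, 1)
    A = np.hstack([nx2, np.ones_like(nx2), -2.0 * X])
    B = np.hstack([np.ones_like(ny2), ny2, Y])
    return A, B

rng = np.random.default_rng(0)
X = rng.standard_normal((5, 3))
Y = rng.standard_normal((4, 3))
A, B = lowrank_sqeuclidean(X, Y)
```

Each entry of A @ B.T expands to \(\|x_i\|^2 + \|y_j\|^2 - 2 x_i \cdot y_j\), i.e. the squared Euclidean distance.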
Returns the Gromov-Wasserstein transport between \((\mathbf{C_1}, \mathbf{p})\) and \((\mathbf{C_2}, \mathbf{q})\) using a stochastic Frank-Wolfe.
This method has a \(\mathcal{O}(\mathrm{max\_iter} \times PN^2)\) time complexity with P the number of Sinkhorn iterations.
The function solves the following optimization problem:
\(\mathbf{C_1}\): Metric cost matrix in the source space
\(\mathbf{C_2}\): Metric cost matrix in the target space
\(\mathbf{p}\): distribution in the source space
\(\mathbf{q}\): distribution in the target space
L: loss function to account for the misfit between the similarity matrices
Parameters:
C1 (array-like, shape (ns, ns)) – Metric cost matrix in the source space
C2 (array-like, shape (nt, nt)) – Metric cost matrix in the target space
p (array-like, shape (ns,)) – Distribution in the source space
q (array-like, shape (nt,)) – Distribution in the target space
loss_fun (function: \(\mathbb{R} \times \mathbb{R} \mapsto \mathbb{R}\)) – Loss function used for the distance; the transport plan does not depend on the loss function
alpha (float) – Step of the Frank-Wolfe algorithm, should be between 0 and 1
max_iter (int, optional) – Max number of iterations
threshold_plan (float, optional) – Entries of the transport plan smaller than this value are deleted. If above zero, the marginal constraints may be violated.
verbose (bool, optional) – Print information along iterations
log (bool, optional) – If True, returns the estimated distance and its standard deviation
random_state (int or RandomState instance, optional) – Fix the seed for reproducibility
Returns the quantized Fused Gromov-Wasserstein transport between
\((\mathbf{C_1}, \mathbf{F_1}, \mathbf{p})\) and \((\mathbf{C_2},
\mathbf{F_2}, \mathbf{q})\), whose samples are assigned to partitions and
representants \(\mathcal{P_1} = \{(\mathbf{P_{1, i}}, \mathbf{r_{1, i}})\}_{i \leq npart1}\)
and \(\mathcal{P_2} = \{(\mathbf{P_{2, j}}, \mathbf{r_{2, j}})\}_{j \leq npart2}\).
The function estimates the following optimization problem:
using a two-step strategy computing: i) a global alignment \(\mathbf{T}^{g}\)
between representants across joint structure and feature spaces;
ii) local alignments \(\mathbf{T}^{(i, j)}\) between partitions
\(\mathbf{P_{1, i}}\) and \(\mathbf{P_{2, j}}\) seen as 1D measures.
Where :
\(\mathbf{C_1}\): Metric cost matrix in the source space
\(\mathbf{C_2}\): Metric cost matrix in the target space
\(\mathbf{F_1}\): Feature matrix in the source space
\(\mathbf{F_2}\): Feature matrix in the target space
\(\mathbf{D}(\mathbf{F_1}, \mathbf{F_2})\): Pairwise euclidean distance matrix between features
\(\mathbf{p}\): distribution in the source space
\(\mathbf{q}\): distribution in the target space
\(L\): quadratic loss function to account for the misfit between the similarity matrices
Note
This function is backend-compatible and will work on arrays
from all compatible backends. But the algorithm uses the C++ CPU backend
which can lead to copy overhead on GPU arrays.
Note
All computations in the conjugate gradient solver are done with
numpy to limit memory overhead.
Parameters:
C1 (array-like, shape (ns, ns)) – Structure matrix in the source space.
C2 (array-like, shape (nt, nt)) – Structure matrix in the target space.
npart1 (int,) – Number of partitions in the source space.
npart2 (int,) – Number of partitions in the target space.
p (array-like, shape (ns,), optional) – Distribution in the source space.
If left to its default value None, uniform distribution is taken.
q (array-like, shape (nt,), optional) – Distribution in the target space.
If left to its default value None, uniform distribution is taken.
C1_aux (array-like, shape (ns, ns), optional. Default is None.) – Auxiliary structure matrix in the source space to perform the partitioning
and representant selection.
C2_aux (array-like, shape (nt, nt), optional. Default is None.) – Auxiliary structure matrix in the target space to perform the partitioning
and representant selection.
F1 (array-like, shape (ns, d), optional. Default is None.) – Feature matrix in the source space.
F2 (array-like, shape (nt, d), optional. Default is None.) – Feature matrix in the target space
alpha (float, optional. Default is 1.) – FGW trade-off parameter in \(]0, 1]\) between structure and features.
If alpha = 1, features are ignored, hence computing qGW; if alpha = 0,
structures are ignored and the quantized Wasserstein transport is computed.
part_method (str, optional. Default is 'spectral'.) – Partitioning algorithm to use among {‘random’, ‘louvain’, ‘fluid’,
‘spectral’, ‘louvain_fused’, ‘fluid_fused’, ‘spectral_fused’, ‘GW’, ‘FGW’}.
If part_method in {‘louvain_fused’, ‘fluid_fused’, ‘spectral_fused’},
corresponding graph partitioning algorithm {‘louvain’, ‘fluid’, ‘spectral’}
will be used on the modified structure matrix
\(\alpha \mathbf{C} + (1 - \alpha) \mathbf{D}(\mathbf{F})\) where
\(\mathbf{D}(\mathbf{F})\) is the pairwise euclidean matrix between features.
If part_method in {‘GW’, ‘FGW’}, a (F)GW projection is used.
If the louvain algorithm is used, the requested number of partitions is
ignored.
rep_method (str, optional. Default is 'pagerank'.) – Selection method for the node representant in each partition.
Can be either ‘random’, i.e. random sampling within each partition, or
{‘pagerank’, ‘pagerank_fused’} to select the node with maximal PageRank w.r.t.
\(\mathbf{C}\) or \(\alpha \mathbf{C} + (1 - \alpha) \mathbf{D}(\mathbf{F})\) respectively.
verbose (bool, optional) – Print information along iterations
armijo (bool, optional) – If True, the step of the line search is found via an Armijo search. Otherwise the closed form is used.
If there are convergence issues, use False.
max_iter (int, optional) – Max number of iterations
tol_rel (float, optional) – Stop threshold on relative error (>0)
tol_abs (float, optional) – Stop threshold on absolute error (>0)
random_state (int, optional) – Random seed for the partitioning algorithm
**kwargs (dict) – parameters can be directly passed to the ot.optim.cg solver
Ts_local (dict of local OT matrices.) – Dictionary with keys \((i, j)\) corresponding to 1D OT between
\(\mathbf{P_{1, i}}\) and \(\mathbf{P_{2, j}}\) if \(T^{g}_{ij} \neq 0\).
T (array-like, shape (ns, nt)) – Coupling between the two spaces.
log (dict) – Convergence information for inner problems and qGW loss.
Returns the quantized Fused Gromov-Wasserstein transport between
\((\mathbf{C_1}, \mathbf{F_1}, \mathbf{p})\) and \((\mathbf{C_2},
\mathbf{F_2}, \mathbf{q})\), whose samples are assigned to partitions and representants
\(\mathcal{P_1} = \{(\mathbf{P_{1, i}}, \mathbf{r_{1, i}})\}_{i \leq npart1}\)
and \(\mathcal{P_2} = \{(\mathbf{P_{2, j}}, \mathbf{r_{2, j}})\}_{j \leq npart2}\).
The latter must be precomputed and encoded, e.g. for the source as: \(\mathbf{CR_1}\)
structure matrix between representants; list_R1 a list of relations between
representants and their associated samples; list_p1 a list of nodes
distribution within each partition; \(\mathbf{FR_1}\) feature matrix
of representants.
The function estimates the following optimization problem:
using a two-step strategy computing: i) a global alignment \(\mathbf{T}^{g}\)
between representants across joint structure and feature spaces; ii) local alignments
\(\mathbf{T}^{(i, j)}\) between partitions \(\mathbf{P_{1, i}}\)
and \(\mathbf{P_{2, j}}\) seen as 1D measures.
Where :
\(\mathbf{C_1}\): Metric cost matrix in the source space
\(\mathbf{C_2}\): Metric cost matrix in the target space
\(\mathbf{F_1}\): Feature matrix in the source space
\(\mathbf{F_2}\): Feature matrix in the target space
\(\mathbf{M}\): Pairwise similarity matrix between features
\(\mathbf{p}\): distribution in the source space
\(\mathbf{q}\): distribution in the target space
\(L\): quadratic loss function to account for the misfit between the similarity matrices
Note
This function is backend-compatible and will work on arrays
from all compatible backends. But the algorithm uses the C++ CPU backend
which can lead to copy overhead on GPU arrays.
Note
All computations in the Gromov-Wasserstein conjugate gradient solver
are done with numpy to limit memory overhead.
Parameters:
CR1 (array-like, shape (npart1, npart1)) – Structure matrix between partition representants in the source space.
CR2 (array-like, shape (npart2, npart2)) – Structure matrix between partition representants in the target space.
list_R1 (list of npart1 arrays,) – List of relations between representants and their associated samples in the source space.
list_R2 (list of npart2 arrays,) – List of relations between representants and their associated samples in the target space.
list_p1 (list of npart1 arrays,) – List of node distributions within each partition of the source space.
list_p2 (list of npart2 arrays,) – List of node distributions within each partition of the target space.
MR (array-like, shape (npart1, npart2), optional. (Default is None)) – Metric cost matrix between features of representants across spaces.
alpha (float, optional. Default is None.) – FGW trade-off parameter in \(]0, 1]\) between structure and features.
If alpha = 1, features are ignored, hence computing qGW.
build_OT (bool, optional. Default is False) – Whether to build the OT plan between the non-partitioned structures.
log (bool, optional. Default is False) – Record log if True
armijo (bool, optional) – If True, the step of the line search is found via an Armijo search. Otherwise the closed form is used.
If there are convergence issues, use False.
max_iter (int, optional) – Max number of iterations
tol_rel (float, optional) – Stop threshold on relative error (>0)
tol_abs (float, optional) – Stop threshold on absolute error (>0)
nx (backend, optional) – POT backend
**kwargs (dict) – parameters can be directly passed to the ot.optim.cg solver
Ts_local (dict of local OT matrices.) – Dictionary with keys \((i, j)\) corresponding to 1D OT between
\(\mathbf{P_{1, i}}\) and \(\mathbf{P_{2, j}}\) if \(T^{g}_{ij} \neq 0\).
T (array-like, shape (ns, nt)) – Coupling between the two spaces if build_OT=True else None.
log (dict, if log=True.) – Convergence information and losses of inner OT problems.
Returns the quantized Fused Gromov-Wasserstein transport between samples
endowed with their respective Euclidean geometries \((\mathbf{D}(\mathbf{X_1}), \mathbf{F_1}, \mathbf{p})\)
and \((\mathbf{D}(\mathbf{X_2}), \mathbf{F_2}, \mathbf{q})\), whose samples are assigned to partitions and
representants \(\mathcal{P_1} = \{(\mathbf{P_{1, i}}, \mathbf{r_{1, i}})\}_{i \leq npart1}\)
and \(\mathcal{P_2} = \{(\mathbf{P_{2, j}}, \mathbf{r_{2, j}})\}_{j \leq npart2}\).
The function estimates the following optimization problem:
using a two-step strategy computing: i) a global alignment \(\mathbf{T}^{g}\)
between representants across joint structure and feature spaces;
ii) local alignments \(\mathbf{T}^{(i, j)}\) between partitions
\(\mathbf{P_{1, i}}\) and \(\mathbf{P_{2, j}}\) seen as 1D measures.
Where :
\(\mathbf{X_1}\): Samples in the source space
\(\mathbf{X_2}\): Samples in the target space
\(\mathbf{F_1}\): Feature matrix in the source space
\(\mathbf{F_2}\): Feature matrix in the target space
\(\mathbf{D}(\mathbf{F_1}, \mathbf{F_2})\): Pairwise euclidean distance matrix between features
\(\mathbf{p}\): distribution in the source space
\(\mathbf{q}\): distribution in the target space
\(L\): quadratic loss function to account for the misfit between the similarity matrices
Note
This function is backend-compatible and will work on arrays
from all compatible backends. But the algorithm uses the C++ CPU backend
which can lead to copy overhead on GPU arrays.
Note
All computations in the conjugate gradient solver are done with
numpy to limit memory overhead.
Parameters:
X1 (array-like, shape (ns, ds)) – Samples in the source space.
X2 (array-like, shape (nt, dt)) – Samples in the target space.
npart1 (int,) – Number of partitions in the source space.
npart2 (int,) – Number of partitions in the target space.
p (array-like, shape (ns,), optional) – Distribution in the source space.
If left to its default value None, uniform distribution is taken.
q (array-like, shape (nt,), optional) – Distribution in the target space.
If left to its default value None, uniform distribution is taken.
F1 (array-like, shape (ns, d), optional. Default is None.) – Feature matrix in the source space.
F2 (array-like, shape (nt, d), optional. Default is None.) – Feature matrix in the target space
alpha (float, optional. Default is 1.) – FGW trade-off parameter in \(]0, 1]\) between structure and features.
If alpha = 1, features are ignored, hence computing qGW; if alpha = 0,
structures are ignored and the quantized Wasserstein transport is computed.
method (str, optional. Default is 'kmeans'.) – Partitioning and representant selection algorithms to use among
{‘random’, ‘kmeans’, ‘kmeans_fused’}.
If method == ‘kmeans_fused’, kmeans is performed on the augmented
samples \([\alpha \mathbf{X}; (1 - \alpha) \mathbf{F}]\).
verbose (bool, optional) – Print information along iterations
armijo (bool, optional) – If True, the step of the line search is found via an Armijo search. Otherwise the closed form is used.
If there are convergence issues, use False.
max_iter (int, optional) – Max number of iterations
tol_rel (float, optional) – Stop threshold on relative error (>0)
tol_abs (float, optional) – Stop threshold on absolute error (>0)
random_state (int, optional) – Random seed for the partitioning algorithm
**kwargs (dict) – parameters can be directly passed to the ot.optim.cg solver
Ts_local (dict of local OT matrices.) – Dictionary with keys \((i, j)\) corresponding to 1D OT between
\(\mathbf{P_{1, i}}\) and \(\mathbf{P_{2, j}}\) if \(T^{g}_{ij} \neq 0\).
T (array-like, shape (ns, nt)) – Coupling between the two spaces.
log (dict) – Convergence information for inner problems and qGW loss.
Returns the Gromov-Wasserstein transport between \((\mathbf{C_1}, \mathbf{p})\) and \((\mathbf{C_2}, \mathbf{q})\) using a 1-stochastic Frank-Wolfe.
This method has a \(\mathcal{O}(\mathrm{max\_iter} \times N \log(N))\) time complexity by relying on the 1D Optimal Transport solver.
The function solves the following optimization problem:
\(\mathbf{C_1}\): Metric cost matrix in the source space
\(\mathbf{C_2}\): Metric cost matrix in the target space
\(\mathbf{p}\): distribution in the source space
\(\mathbf{q}\): distribution in the target space
L: loss function to account for the misfit between the similarity matrices
Parameters:
C1 (array-like, shape (ns, ns)) – Metric cost matrix in the source space
C2 (array-like, shape (nt, nt)) – Metric cost matrix in the target space
p (array-like, shape (ns,)) – Distribution in the source space
q (array-like, shape (nt,)) – Distribution in the target space
loss_fun (function: \(\mathbb{R} \times \mathbb{R} \mapsto \mathbb{R}\)) – Loss function used for the distance; the transport plan does not depend on the loss function
nb_samples_grad (int) – Number of samples to approximate the gradient
epsilon (float) – Weight of the Kullback-Leibler regularization
max_iter (int, optional) – Max number of iterations
verbose (bool, optional) – Print information along iterations
log (bool, optional) – If True, returns the estimated distance and its standard deviation
random_state (int or RandomState instance, optional) – Fix the seed for reproducibility
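The \(N \log(N)\) term in the complexity comes from the 1D OT subproblems, which are solved by sorting. A self-contained illustration with uniform weights and equal sample sizes (a sketch of the principle, not POT's internal solver):

```python
import numpy as np

def ot_1d_uniform(x, y):
    """1D OT plan between two empirical uniform measures of equal size n:
    sorting both samples and matching them in order is optimal for convex
    costs such as |x - y|**2, in O(n log n)."""
    n = len(x)
    T = np.zeros((n, n))
    # pair the k-th smallest of x with the k-th smallest of y
    T[np.argsort(x), np.argsort(y)] = 1.0 / n
    return T

rng = np.random.default_rng(0)
x, y = rng.random(5), rng.random(5)
T = ot_1d_uniform(x, y)
cost = np.sum(T * (x[:, None] - y[None, :]) ** 2)
```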
L is a loss function to account for the misfit between the similarity matrices
Note
This function is backend-compatible and will work on arrays
from all compatible backends. However, not all steps in the conditional
gradient are differentiable.
The algorithm used for solving the problem is conditional gradient as discussed in [48]
Parameters:
M (array-like, shape (ns, nt)) – Metric cost matrix between features across domains
C1 (array-like, shape (ns, ns)) – Metric cost matrix representative of the structure in the source space
C2 (array-like, shape (nt, nt)) – Metric cost matrix representative of the structure in the target space
p (array-like, shape (ns,), optional) – Distribution in the source space.
If left to its default value None, uniform distribution is taken.
loss_fun (str) – Loss function used for the solver, either ‘square_loss’ or ‘kl_loss’.
symmetric (bool, optional) – Whether C1 and C2 are assumed symmetric.
If left to its default value None, a symmetry test will be conducted.
If set to True (resp. False), C1 and C2 are assumed symmetric (resp. asymmetric).
G0 (array-like, shape (ns,nt), optional) – If None, the initial transport plan of the solver is \(\mathbf{p}\mathbf{q}^T\).
Otherwise G0 must satisfy the marginal constraints and will be used as the initial transport plan of the solver.
L is a loss function to account for the misfit between the similarity matrices
The algorithm used for solving the problem is conditional gradient as
discussed in [48]
Note that when using backends, this loss function is differentiable w.r.t. the
matrices (C1, C2) but not yet w.r.t. the weights p.
Note
This function is backend-compatible and will work on arrays
from all compatible backends. However, not all steps in the conditional
gradient are differentiable.
Parameters:
M (array-like, shape (ns, nt)) – Metric cost matrix between features across domains
C1 (array-like, shape (ns, ns)) – Metric cost matrix representative of the structure in the source space.
C2 (array-like, shape (nt, nt)) – Metric cost matrix representative of the structure in the target space.
p (array-like, shape (ns,)) – Distribution in the source space.
If left to its default value None, uniform distribution is taken.
loss_fun (str, optional) – Loss function used for the solver, either ‘square_loss’ or ‘kl_loss’.
symmetric (bool, optional) – Whether C1 and C2 are assumed symmetric.
If left to its default value None, a symmetry test will be conducted.
If set to True (resp. False), C1 and C2 are assumed symmetric (resp. asymmetric).
G0 (array-like, shape (ns,nt), optional) – If None, the initial transport plan of the solver is \(\mathbf{p}\mathbf{q}^T\).
Otherwise G0 must satisfy the marginal constraints and will be used as the initial transport plan of the solver.
\(\mathbf{C_1}\): Metric cost matrix in the source space
\(\mathbf{C_2}\): Metric cost matrix in the target space
\(\mathbf{p}\): distribution in the source space
L: loss function to account for the misfit between the similarity matrices
Note
This function is backend-compatible and will work on arrays
from all compatible backends. However, not all steps in the conditional
gradient are differentiable.
Parameters:
C1 (array-like, shape (ns, ns)) – Metric cost matrix in the source space
C2 (array-like, shape (nt, nt)) – Metric cost matrix in the target space
p (array-like, shape (ns,), optional) – Distribution in the source space.
If left to its default value None, uniform distribution is taken.
loss_fun (str) – Loss function used for the solver, either ‘square_loss’ or ‘kl_loss’.
symmetric (bool, optional) – Whether C1 and C2 are assumed symmetric.
If left to its default value None, a symmetry test will be conducted.
If set to True (resp. False), C1 and C2 are assumed symmetric (resp. asymmetric).
verbose (bool, optional) – Print information along iterations
G0 (array-like, shape (ns,nt), optional) – If None, the initial transport plan of the solver is \(\mathbf{p}\mathbf{q}^T\).
Otherwise G0 must satisfy the marginal constraints and will be used as the initial transport plan of the solver.
max_iter (int, optional) – Max number of iterations
tol_rel (float, optional) – Stop threshold on relative error (>0)
tol_abs (float, optional) – Stop threshold on absolute error (>0)
**kwargs (dict) – parameters can be directly passed to the ot.optim.cg solver
\(\mathbf{C_1}\): Metric cost matrix in the source space
\(\mathbf{C_2}\): Metric cost matrix in the target space
\(\mathbf{p}\): distribution in the source space
L: loss function to account for the misfit between the similarity
matrices
Note that when using backends, this loss function is differentiable w.r.t. the
matrices (C1, C2) but not yet w.r.t. the weights p.
Note
This function is backend-compatible and will work on arrays
from all compatible backends. However, not all steps in the conditional
gradient are differentiable.
Parameters:
C1 (array-like, shape (ns, ns)) – Metric cost matrix in the source space
C2 (array-like, shape (nt, nt)) – Metric cost matrix in the target space
p (array-like, shape (ns,), optional) – Distribution in the source space.
If left to its default value None, uniform distribution is taken.
loss_fun (str) – Loss function used for the solver, either ‘square_loss’ or ‘kl_loss’.
symmetric (bool, optional) – Whether C1 and C2 are assumed symmetric.
If left to its default value None, a symmetry test will be conducted.
If set to True (resp. False), C1 and C2 are assumed symmetric (resp. asymmetric).
verbose (bool, optional) – Print information along iterations
G0 (array-like, shape (ns,nt), optional) – If None, the initial transport plan of the solver is \(\mathbf{p}\mathbf{q}^T\).
Otherwise G0 must satisfy the marginal constraints and will be used as the initial transport plan of the solver.
max_iter (int, optional) – Max number of iterations
tol_rel (float, optional) – Stop threshold on relative error (>0)
tol_abs (float, optional) – Stop threshold on absolute error (>0)
**kwargs (dict) – parameters can be directly passed to the ot.optim.cg solver
C1 (array-like (ns,ns), optional) – Transformed structure matrix in the source domain.
For ‘square_loss’ and ‘kl_loss’, we provide hC1 from ot.gromov.init_matrix
C2 (array-like (nt,nt), optional) – Transformed structure matrix in the target domain.
For ‘square_loss’ and ‘kl_loss’, we provide hC2 from ot.gromov.init_matrix
M (array-like (ns,nt)) – Cost matrix between the features.
alpha_min (float, optional) – Minimum value for alpha
alpha_max (float, optional) – Maximum value for alpha
nx (backend, optional) – If left to its default value None, a backend test will be conducted.
symmetric (bool, optional) – Whether the structures are assumed symmetric. Default is False.
If set to True (resp. False), C1 and C2 are assumed symmetric (resp. asymmetric).
Returns:
alpha (float) – The optimal step size of the FW
fc (int) – Number of function calls. Unused here.
cost_G (float) – The value of the cost for the next iteration
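For a quadratic objective, the exact Frank-Wolfe line search reduces to minimizing a scalar quadratic over [alpha_min, alpha_max]. A generic sketch of that reduction (an illustration, not POT's exact routine; the coefficients a and b would come from the current iterate and descent direction):

```python
def quad_linesearch(a, b, alpha_min=0.0, alpha_max=1.0):
    """Minimize f(t) = a * t**2 + b * t over [alpha_min, alpha_max]."""
    if a > 0:
        # convex case: clip the unconstrained optimum -b / (2a) to the interval
        return min(alpha_max, max(alpha_min, -b / (2 * a)))
    # concave or linear case: the minimum is attained at an endpoint
    f = lambda t: a * t ** 2 + b * t
    return alpha_min if f(alpha_min) <= f(alpha_max) else alpha_max
```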
C1 (array-like (ns,ns), optional) – Transformed structure matrix in the source domain.
Note that for ‘square_loss’ and ‘kl_loss’, we provide hC1 from ot.gromov.init_matrix_semirelaxed
C2 (array-like (nt,nt), optional) – Transformed structure matrix in the target domain.
Note that for ‘square_loss’ and ‘kl_loss’, we provide hC2 from ot.gromov.init_matrix_semirelaxed
ones_p (array-like (ns,1)) – Array of ones of size ns
M (array-like (ns,nt)) – Cost matrix between the features.
fC2t (array-like (nt,nt), optional) – Transformed structure matrix in the target domain.
Note that for ‘square_loss’ and ‘kl_loss’, we provide fC2t from ot.gromov.init_matrix_semirelaxed.
If fC2t is not provided, it defaults to the one corresponding to ‘square_loss’.
alpha_min (float, optional) – Minimum value for alpha
alpha_max (float, optional) – Maximum value for alpha
nx (backend, optional) – If left to its default value None, a backend test will be conducted.
Returns:
alpha (float) – The optimal step size of the FW
fc (int) – Number of function calls. Unused here.
cost_G (float) – The value of the cost for the next iteration
Updates \(\mathbf{C}\) according to the KL Loss kernel with the S
\(\mathbf{T}_s\) couplings calculated at each iteration of the GW
barycenter problem in [12]:
Updates \(\mathbf{C}\) according to the L2 Loss kernel with the S
\(\mathbf{T}_s\) couplings calculated at each iteration of the GW
barycenter problem in [12]:
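For the square loss, this update admits the closed form \(\mathbf{C} \leftarrow \big(\sum_s \lambda_s \mathbf{T}_s^\top \mathbf{C}_s \mathbf{T}_s\big) \oslash (\mathbf{p}\mathbf{p}^\top)\) with element-wise division. A numpy sketch under that assumption (illustrative, not the library's function signature):

```python
import numpy as np

def update_square_loss(p, lambdas, Ts, Cs):
    """Closed-form barycenter structure update for the square loss:
    C = (sum_s lambda_s * T_s^T C_s T_s) / (p p^T), element-wise."""
    tmpsum = sum(l * T.T @ C @ T for l, T, C in zip(lambdas, Ts, Cs))
    return tmpsum / np.outer(p, p)

# With a single input structure and the diagonal coupling T = diag(p),
# the update recovers that structure exactly.
n = 4
p = np.full(n, 1.0 / n)
rng = np.random.default_rng(0)
C = rng.random((n, n))
C = C + C.T                                  # symmetric toy structure
C_new = update_square_loss(p, [1.0], [np.diag(p)], [C])
```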