Compute the CO-Optimal Transport between two matrices.
Return the sample and feature transport plans between
\((\mathbf{X}, \mathbf{w}_{xs}, \mathbf{w}_{xf})\) and
\((\mathbf{Y}, \mathbf{w}_{ys}, \mathbf{w}_{yf})\).
The function solves the following CO-Optimal Transport (COOT) problem:
\(\mathbf{w}_{xs}\): Distribution of the samples in the source space
\(\mathbf{w}_{xf}\): Distribution of the features in the source space
\(\mathbf{w}_{ys}\): Distribution of the samples in the target space
\(\mathbf{w}_{yf}\): Distribution of the features in the target space
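The objective itself is not reproduced above. As a hedged reconstruction from the parameter descriptions below, and assuming the squared loss between entries of X and Y (an assumption, since the formula is missing here), the problem can be sketched as follows, where \(\mathbf{P}\) and \(\mathbf{Q}\) denote the sample and feature couplings, \(\mathbf{M}^{(s)}\) and \(\mathbf{M}^{(f)}\) correspond to M_samp and M_feat, and \((\alpha_s, \alpha_f)\) and \((\varepsilon_s, \varepsilon_f)\) correspond to alpha and epsilon:

```latex
\min_{\mathbf{P} \in \Pi(\mathbf{w}_{xs}, \mathbf{w}_{ys}),\; \mathbf{Q} \in \Pi(\mathbf{w}_{xf}, \mathbf{w}_{yf})}
\quad \sum_{i,j,k,l} (\mathbf{X}_{i,k} - \mathbf{Y}_{j,l})^2 \, \mathbf{P}_{i,j} \mathbf{Q}_{k,l}
+ \alpha_s \sum_{i,j} \mathbf{P}_{i,j} \mathbf{M}^{(s)}_{i,j}
+ \alpha_f \sum_{k,l} \mathbf{Q}_{k,l} \mathbf{M}^{(f)}_{k,l}
+ \varepsilon_s \, \mathrm{KL}(\mathbf{P} \,|\, \mathbf{w}_{xs} \mathbf{w}_{ys}^T)
+ \varepsilon_f \, \mathrm{KL}(\mathbf{Q} \,|\, \mathbf{w}_{xf} \mathbf{w}_{yf}^T)
```

Here \(\Pi(\mathbf{a}, \mathbf{b})\) is the set of couplings with marginals \(\mathbf{a}\) and \(\mathbf{b}\). With epsilon = 0 and alpha = 0 only the first term remains.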
Note
This function allows epsilon to be zero.
In that case, the ot.lp.emd solver of POT will be used.
Parameters:
X ((n_sample_x, n_feature_x) array-like, float) – First input matrix.
Y ((n_sample_y, n_feature_y) array-like, float) – Second input matrix.
wx_samp ((n_sample_x, ) array-like, float, optional (default = None)) – Histogram assigned on rows (samples) of matrix X.
Uniform distribution by default.
wx_feat ((n_feature_x, ) array-like, float, optional (default = None)) – Histogram assigned on columns (features) of matrix X.
Uniform distribution by default.
wy_samp ((n_sample_y, ) array-like, float, optional (default = None)) – Histogram assigned on rows (samples) of matrix Y.
Uniform distribution by default.
wy_feat ((n_feature_y, ) array-like, float, optional (default = None)) – Histogram assigned on columns (features) of matrix Y.
Uniform distribution by default.
epsilon (scalar or indexable object of length 2, float or int, optional (default = 0)) – Regularization parameters for the entropic approximation of the sample and feature couplings.
epsilon may contain 0; in that case, the EMD solver is used instead of the
Sinkhorn solver. If epsilon is a scalar, the same value is applied to
the regularization of both the sample and feature couplings.
alpha (scalar or indexable object of length 2, float or int, optional (default = 0)) – Coefficient parameter of linear terms with respect to the sample and feature couplings.
If alpha is scalar, then the same alpha is applied to both linear terms.
M_samp ((n_sample_x, n_sample_y), float, optional (default = None)) – Sample matrix with respect to the linear term on sample coupling.
M_feat ((n_feature_x, n_feature_y), float, optional (default = None)) – Feature matrix with respect to the linear term on feature coupling.
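To make the roles of alpha, M_samp and M_feat concrete, here is a minimal NumPy sketch that evaluates the unregularized cost for two given couplings. It assumes the squared loss between entries of X and Y; the helper name `coot_cost` is hypothetical and is not part of POT.

```python
import numpy as np

def coot_cost(X, Y, pi_samp, pi_feat, alpha=(0.0, 0.0), M_samp=None, M_feat=None):
    """Evaluate the unregularized COOT cost for fixed couplings (squared loss assumed)."""
    # diff2[i, j, k, l] = (X[i, k] - Y[j, l]) ** 2
    diff2 = (X[:, None, :, None] - Y[None, :, None, :]) ** 2
    # Bilinear term: sum over i, j, k, l of diff2 * pi_samp[i, j] * pi_feat[k, l]
    cost = np.einsum("ijkl,ij,kl->", diff2, pi_samp, pi_feat)
    # Optional linear terms, weighted by alpha[0] (samples) and alpha[1] (features)
    if M_samp is not None:
        cost += alpha[0] * np.sum(M_samp * pi_samp)
    if M_feat is not None:
        cost += alpha[1] * np.sum(M_feat * pi_feat)
    return float(cost)

X = np.array([[0.0, 1.0], [2.0, 3.0]])
Y = X.copy()
identity_plan = np.eye(2) / 2          # matches index i to index i, uniform marginals
uniform_plan = np.full((2, 2), 0.25)   # independent coupling of uniform marginals

cost_identity = coot_cost(X, Y, identity_plan, identity_plan)
cost_uniform = coot_cost(X, Y, uniform_plan, uniform_plan)
```

Because Y is a copy of X, matching every row and column to itself gives a cost of zero, while the uniform couplings give a strictly positive cost.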
warmstart (dictionary, optional (default = None)) – Contains 4 keys:
- “duals_sample” and “duals_feature”, whose values are
tuples of 2 vectors with lengths (n_sample_x, n_sample_y) and (n_feature_x, n_feature_y), respectively.
Initialization of the sample and feature dual vectors
if the Sinkhorn algorithm is used. Zero vectors by default.
- “pi_sample” and “pi_feature”, whose values are matrices
of size (n_sample_x, n_sample_y) and (n_feature_x, n_feature_y), respectively.
Initialization of the sample and feature couplings.
Uniform distributions by default.
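A minimal sketch of building such an initialization dictionary with NumPy, filled with the default values described above (zero dual vectors, uniform couplings). The variable name `init` and the sizes are illustrative assumptions, and uniform histograms are assumed.

```python
import numpy as np

# Illustrative sizes (assumptions, not taken from the documentation)
n_sample_x, n_sample_y = 4, 3
n_feature_x, n_feature_y = 5, 2

init = {
    # One dual vector per marginal of each coupling
    "duals_sample": (np.zeros(n_sample_x), np.zeros(n_sample_y)),
    "duals_feature": (np.zeros(n_feature_x), np.zeros(n_feature_y)),
    # Uniform couplings: every entry equal, total mass 1
    "pi_sample": np.full((n_sample_x, n_sample_y), 1.0 / (n_sample_x * n_sample_y)),
    "pi_feature": np.full((n_feature_x, n_feature_y), 1.0 / (n_feature_x * n_feature_y)),
}
```

Reusing the couplings and duals of a previous solve in place of these defaults is the typical reason to pass such a dictionary.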
nits_bcd (int, optional (default = 100)) – Number of Block Coordinate Descent (BCD) iterations to solve COOT.
tol_bcd (float, optional (default = 1e-7)) – Tolerance of the BCD scheme. If the L1-norm of the difference between the current and previous
sample couplings falls below this threshold, the BCD scheme stops.
eval_bcd (int, optional (default = 1)) – Interval, in iterations, at which the COOT cost is evaluated. For example,
if eval_bcd = 8, then the cost is calculated at iterations 8, 16, 24, etc.
nits_ot (int, optional (default = 100)) – Number of iterations to solve each of the
two optimal transport problems in each BCD iteration.
tol_sinkhorn (float, optional (default = 1e-7)) – Stopping tolerance of the Sinkhorn algorithm for the
entropic optimal transport problem (if any) in each BCD iteration.
Only used when the Sinkhorn solver is used.
method_sinkhorn (string, optional (default = "sinkhorn")) – Method used in POT’s ot.sinkhorn solver.
Only “sinkhorn” and “sinkhorn_log” are supported.
early_stopping_tol (float, optional (default = 1e-6)) – Tolerance for the early stopping. If the absolute difference between
the last 2 recorded COOT distances is under this tolerance, then stop BCD scheme.
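The three BCD controls (tol_bcd, eval_bcd and early_stopping_tol) interact, so a small control-loop sketch may help. This is an illustration under stated assumptions, not the library's implementation: `step` and `cost_fn` are hypothetical callables standing in for one BCD update and the cost evaluation, and the order of the checks is a guess.

```python
import numpy as np

def run_bcd(step, cost_fn, pi_init, nits_bcd=100, tol_bcd=1e-7,
            eval_bcd=1, early_stopping_tol=1e-6):
    """Illustrative BCD control loop; `step` and `cost_fn` are hypothetical."""
    pi = pi_init
    costs = []
    it = 0
    for it in range(1, nits_bcd + 1):
        pi_prev = pi
        pi = step(pi)
        # tol_bcd: stop when the L1-norm of the change in the coupling is small
        if np.abs(pi - pi_prev).sum() < tol_bcd:
            break
        # eval_bcd: evaluate the cost only at iterations eval_bcd, 2 * eval_bcd, ...
        if it % eval_bcd == 0:
            costs.append(cost_fn(pi))
            # early_stopping_tol: compare the last two recorded costs
            if len(costs) >= 2 and abs(costs[-1] - costs[-2]) < early_stopping_tol:
                break
    return pi, costs, it

# Toy problem: each step moves the coupling halfway toward a fixed target.
target = np.full((2, 2), 0.25)
pi_final, costs, n_iter = run_bcd(
    step=lambda pi: 0.5 * (pi + target),
    cost_fn=lambda pi: float(((pi - target) ** 2).sum()),
    pi_init=np.eye(2) / 2,
    eval_bcd=8,
)
```

With eval_bcd = 8 the cost is recorded only at iterations 8 and 16 in this toy run, so the early-stopping test compares values that are 8 iterations apart rather than consecutive ones.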
log (bool, optional (default = False)) – If True then the cost and 4 dual vectors, including
2 from sample and 2 from feature couplings, are recorded.
verbose (bool, optional (default = False)) – If True then print the COOT cost at every iteration that is a multiple of eval_bcd.
Compute the CO-Optimal Transport distance between two measures.
Returns the CO-Optimal Transport distance between
\((\mathbf{X}, \mathbf{w}_{xs}, \mathbf{w}_{xf})\) and
\((\mathbf{Y}, \mathbf{w}_{ys}, \mathbf{w}_{yf})\).
The function solves the following CO-Optimal Transport (COOT) problem:
\(\mathbf{w}_{xs}\): Distribution of the samples in the source space
\(\mathbf{w}_{xf}\): Distribution of the features in the source space
\(\mathbf{w}_{ys}\): Distribution of the samples in the target space
\(\mathbf{w}_{yf}\): Distribution of the features in the target space
Note
This function allows epsilon to be zero.
In that case, the ot.lp.emd solver of POT will be used.
Parameters:
X ((n_sample_x, n_feature_x) array-like, float) – First input matrix.
Y ((n_sample_y, n_feature_y) array-like, float) – Second input matrix.
wx_samp ((n_sample_x, ) array-like, float, optional (default = None)) – Histogram assigned on rows (samples) of matrix X.
Uniform distribution by default.
wx_feat ((n_feature_x, ) array-like, float, optional (default = None)) – Histogram assigned on columns (features) of matrix X.
Uniform distribution by default.
wy_samp ((n_sample_y, ) array-like, float, optional (default = None)) – Histogram assigned on rows (samples) of matrix Y.
Uniform distribution by default.
wy_feat ((n_feature_y, ) array-like, float, optional (default = None)) – Histogram assigned on columns (features) of matrix Y.
Uniform distribution by default.
epsilon (scalar or indexable object of length 2, float or int, optional (default = 0)) – Regularization parameters for the entropic approximation of the sample and feature couplings.
epsilon may contain 0; in that case, the EMD solver is used instead of the
Sinkhorn solver. If epsilon is a scalar, the same value is applied to
the regularization of both the sample and feature couplings.
alpha (scalar or indexable object of length 2, float or int, optional (default = 0)) – Coefficient parameter of linear terms with respect to the sample and feature couplings.
If alpha is scalar, then the same alpha is applied to both linear terms.
M_samp ((n_sample_x, n_sample_y), float, optional (default = None)) – Sample matrix with respect to the linear term on sample coupling.
M_feat ((n_feature_x, n_feature_y), float, optional (default = None)) – Feature matrix with respect to the linear term on feature coupling.
warmstart (dictionary, optional (default = None)) – Contains 4 keys:
- “duals_sample” and “duals_feature”, whose values are
tuples of 2 vectors with lengths (n_sample_x, n_sample_y) and (n_feature_x, n_feature_y), respectively.
Initialization of the sample and feature dual vectors
if the Sinkhorn algorithm is used. Zero vectors by default.
- “pi_sample” and “pi_feature”, whose values are matrices
of size (n_sample_x, n_sample_y) and (n_feature_x, n_feature_y), respectively.
Initialization of the sample and feature couplings.
Uniform distributions by default.
nits_bcd (int, optional (default = 100)) – Number of Block Coordinate Descent (BCD) iterations to solve COOT.
tol_bcd (float, optional (default = 1e-7)) – Tolerance of the BCD scheme. If the L1-norm of the difference between the current and previous
sample couplings falls below this threshold, the BCD scheme stops.
eval_bcd (int, optional (default = 1)) – Interval, in iterations, at which the COOT cost is evaluated. For example,
if eval_bcd = 8, then the cost is calculated at iterations 8, 16, 24, etc.
nits_ot (int, optional (default = 100)) – Number of iterations to solve each of the
two optimal transport problems in each BCD iteration.
tol_sinkhorn (float, optional (default = 1e-7)) – Stopping tolerance of the Sinkhorn algorithm for the
entropic optimal transport problem (if any) in each BCD iteration.
Only used when the Sinkhorn solver is used.
method_sinkhorn (string, optional (default = "sinkhorn")) – Method used in POT’s ot.sinkhorn solver.
Only “sinkhorn” and “sinkhorn_log” are supported.
early_stopping_tol (float, optional (default = 1e-6)) – Tolerance for the early stopping. If the absolute difference between
the last 2 recorded COOT distances is under this tolerance, then stop BCD scheme.
log (bool, optional (default = False)) – If True then the cost and 4 dual vectors, including
2 from sample and 2 from feature couplings, are recorded.
verbose (bool, optional (default = False)) – If True then print the COOT cost at every iteration that is a multiple of eval_bcd.
Returns:
float – CO-Optimal Transport distance.
dict – Contains logged information from the co_optimal_transport solver.
Only returned if the log parameter is True.