.. DO NOT EDIT. .. THIS FILE WAS AUTOMATICALLY GENERATED BY SPHINX-GALLERY. .. TO MAKE CHANGES, EDIT THE SOURCE PYTHON FILE: .. "auto_examples/others/plot_learning_weights_with_COOT.py" .. LINE NUMBERS ARE GIVEN BELOW. .. only:: html .. note:: :class: sphx-glr-download-link-note :ref:`Go to the end ` to download the full example code .. rst-class:: sphx-glr-example-title .. _sphx_glr_auto_examples_others_plot_learning_weights_with_COOT.py: =============================================================== Learning sample marginal distribution with CO-Optimal Transport =============================================================== In this example, we illustrate how to estimate the sample marginal distribution which minimizes the CO-Optimal Transport distance [47]_ between two matrices. More precisely, given a source data :math:`(X, \mu_x^{(s)}, \mu_x^{(f)})` and a target matrix :math:`Y` associated with a fixed histogram on features :math:`\mu_y^{(f)}`, we want to solve the following problem .. math:: \min_{\mu_y^{(s)} \in \Delta} \text{COOT}\left( (X, \mu_x^{(s)}, \mu_x^{(f)}), (Y, \mu_y^{(s)}, \mu_y^{(f)}) \right) where :math:`\Delta` is the probability simplex. This minimization is done with a simple projected gradient descent in PyTorch. We use the automatic backend of POT that allows us to compute the CO-Optimal Transport distance with :func:`ot.coot.co_optimal_transport2` with differentiable losses. .. [49] Redko, I., Vayer, T., Flamary, R., and Courty, N. (2020). `CO-Optimal Transport `_. Advances in Neural Information Processing Systems, 33. .. GENERATED FROM PYTHON SOURCE LINES 24-40 .. code-block:: Python # Author: Remi Flamary # Quang Huy Tran # License: MIT License from matplotlib.patches import ConnectionPatch import torch import numpy as np import matplotlib.pyplot as pl import ot from ot.coot import co_optimal_transport as coot from ot.coot import co_optimal_transport2 as coot2 .. GENERATED FROM PYTHON SOURCE LINES 41-49 Generate data ------------- The source and clean target matrices are generated by :math:`X_{i,j} = \cos(\frac{i}{n_1} \pi) + \cos(\frac{j}{d_1} \pi)` and :math:`Y_{i,j} = \cos(\frac{i}{n_2} \pi) + \cos(\frac{j}{d_2} \pi)`. The target matrix is then contaminated by adding 5 row outliers. Intuitively, we expect that the estimated sample distribution should ignore these outliers, i.e. their weights should be zero. .. GENERATED FROM PYTHON SOURCE LINES 49-83 .. code-block:: Python np.random.seed(182) n1, d1 = 20, 16 n2, d2 = 10, 8 n = 15 X = ( torch.cos(torch.arange(n1) * torch.pi / n1)[:, None] + torch.cos(torch.arange(d1) * torch.pi / d1)[None, :] ) # Generate clean target data mixed with outliers Y_noisy = torch.randn((n, d2)) * 10.0 Y_noisy[:n2, :] = ( torch.cos(torch.arange(n2) * torch.pi / n2)[:, None] + torch.cos(torch.arange(d2) * torch.pi / d2)[None, :] ) Y = Y_noisy[:n2, :] X, Y_noisy, Y = X.double(), Y_noisy.double(), Y.double() fig, axes = pl.subplots(nrows=1, ncols=3, figsize=(12, 5)) axes[0].imshow(X, vmin=-2, vmax=2) axes[0].set_title('$X$') axes[1].imshow(Y, vmin=-2, vmax=2) axes[1].set_title('Clean $Y$') axes[2].imshow(Y_noisy, vmin=-2, vmax=2) axes[2].set_title('Noisy $Y$') pl.tight_layout() .. image-sg:: /auto_examples/others/images/sphx_glr_plot_learning_weights_with_COOT_001.png :alt: $X$, Clean $Y$, Noisy $Y$ :srcset: /auto_examples/others/images/sphx_glr_plot_learning_weights_with_COOT_001.png :class: sphx-glr-single-img .. GENERATED FROM PYTHON SOURCE LINES 84-86 Optimize the COOT distance with respect to the sample marginal distribution --------------------------------------------------------------------------- .. GENERATED FROM PYTHON SOURCE LINES 86-112 .. code-block:: Python losses = [] lr = 1e-3 niter = 1000 b = torch.tensor(ot.unif(n), requires_grad=True) for i in range(niter): loss = coot2(X, Y_noisy, wy_samp=b, log=False, verbose=False) losses.append(float(loss)) loss.backward() with torch.no_grad(): b -= lr * b.grad # gradient step b[:] = ot.utils.proj_simplex(b) # projection on the simplex b.grad.zero_() # Estimated sample marginal distribution and training loss curve pl.plot(losses[10:]) pl.title('CO-Optimal Transport distance') print(f"Marginal distribution = {b.detach().numpy()}") .. image-sg:: /auto_examples/others/images/sphx_glr_plot_learning_weights_with_COOT_002.png :alt: CO-Optimal Transport distance :srcset: /auto_examples/others/images/sphx_glr_plot_learning_weights_with_COOT_002.png :class: sphx-glr-single-img .. rst-class:: sphx-glr-script-out .. code-block:: none Marginal distribution = [0.07507868 0.08001347 0.09469872 0.1001999 0.10001527 0.10001687 0.09999904 0.09979829 0.11466591 0.13551386 0. 0. 0. 0. 0. ] .. GENERATED FROM PYTHON SOURCE LINES 113-117 Visualizing the row and column alignments with the estimated sample marginal distribution ----------------------------------------------------------------------------------------- Clearly, the learned marginal distribution completely and successfully ignores the 5 outliers. .. GENERATED FROM PYTHON SOURCE LINES 117-151 .. code-block:: Python X, Y_noisy = X.numpy(), Y_noisy.numpy() b = b.detach().numpy() pi_sample, pi_feature = coot(X, Y_noisy, wy_samp=b, log=False, verbose=True) fig = pl.figure(4, (9, 7)) pl.clf() ax1 = pl.subplot(2, 2, 3) pl.imshow(X, vmin=-2, vmax=2) pl.xlabel('$X$') ax2 = pl.subplot(2, 2, 2) ax2.yaxis.tick_right() pl.imshow(np.transpose(Y_noisy), vmin=-2, vmax=2) pl.title("Transpose(Noisy $Y$)") ax2.xaxis.tick_top() for i in range(n1): j = np.argmax(pi_sample[i, :]) xyA = (d1 - .5, i) xyB = (j, d2 - .5) con = ConnectionPatch(xyA=xyA, xyB=xyB, coordsA=ax1.transData, coordsB=ax2.transData, color="black") fig.add_artist(con) for i in range(d1): j = np.argmax(pi_feature[i, :]) xyA = (i, -.5) xyB = (-.5, j) con = ConnectionPatch( xyA=xyA, xyB=xyB, coordsA=ax1.transData, coordsB=ax2.transData, color="blue") fig.add_artist(con) .. image-sg:: /auto_examples/others/images/sphx_glr_plot_learning_weights_with_COOT_003.png :alt: Transpose(Noisy $Y$) :srcset: /auto_examples/others/images/sphx_glr_plot_learning_weights_with_COOT_003.png :class: sphx-glr-single-img .. rst-class:: sphx-glr-script-out .. code-block:: none CO-Optimal Transport cost at iteration 1: 0.010389716046318498 .. rst-class:: sphx-glr-timing **Total running time of the script:** (0 minutes 2.806 seconds) .. _sphx_glr_download_auto_examples_others_plot_learning_weights_with_COOT.py: .. only:: html .. container:: sphx-glr-footer sphx-glr-footer-example .. container:: sphx-glr-download sphx-glr-download-jupyter :download:`Download Jupyter notebook: plot_learning_weights_with_COOT.ipynb ` .. container:: sphx-glr-download sphx-glr-download-python :download:`Download Python source code: plot_learning_weights_with_COOT.py ` .. only:: html .. rst-class:: sphx-glr-signature `Gallery generated by Sphinx-Gallery `_