Optimal Transport theory enables the definition of a distance on the set of measures over any given space. This Wasserstein distance naturally accounts for geometric warping between measures (including, but not limited to, images). We introduce a new Optimal Transport-based representation learning method, in close analogy with the usual Dictionary Learning problem. Dictionary Learning typically relies on a matrix dot-product between the learned dictionary and the codes making up the new representation; the relationship between atoms and data is thus ultimately linear. By instead reconstructing our data as Wasserstein barycenters of learned atoms, our approach yields a representation that makes full use of the Wasserstein distance's attractive properties and allows for non-linear relationships between the dictionary atoms and the datapoints.
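To illustrate the core building block, here is a minimal, self-contained sketch of an entropy-regularized Wasserstein barycenter of 1-D histogram "atoms", computed with iterative Bregman projections. This is a standard Sinkhorn-type scheme for illustration only, not the paper's actual solver; all names (`sinkhorn_barycenter`, the grid, the regularization `reg`) are assumptions for the example.

```python
import numpy as np

def sinkhorn_barycenter(atoms, weights, reg=1e-2, n_iter=200):
    """Entropic Wasserstein barycenter of 1-D histograms.

    atoms   : (n, k) array, each column a histogram summing to 1 (the "dictionary")
    weights : (k,) barycentric weights (the "code"), non-negative, summing to 1
    This is a generic iterative-Bregman-projection sketch, not the paper's solver.
    """
    n, k = atoms.shape
    grid = np.linspace(0.0, 1.0, n)
    M = (grid[:, None] - grid[None, :]) ** 2   # squared Euclidean ground cost
    K = np.exp(-M / reg)                       # Gibbs kernel
    v = np.ones((n, k))
    for _ in range(n_iter):
        u = atoms / (K @ v)                    # match each atom's marginal
        Ktu = K.T @ u
        # barycenter = weighted geometric mean of the projected marginals
        b = np.exp((weights * np.log(Ktu)).sum(axis=1))
        v = b[:, None] / Ktu
    return b / b.sum()

# Toy usage: two Gaussian bumps as atoms; with equal weights, the barycenter
# is a single bump displaced to the middle (geometric interpolation),
# unlike the bimodal result a linear (Euclidean) average would give.
n = 100
x = np.linspace(0.0, 1.0, n)
bump = lambda m: np.exp(-(x - m) ** 2 / (2 * 0.03 ** 2))
a1, a2 = bump(0.25), bump(0.75)
atoms = np.stack([a1 / a1.sum(), a2 / a2.sum()], axis=1)
bary = sinkhorn_barycenter(atoms, np.array([0.5, 0.5]))
```

The contrast with the linear case is the point: averaging the two histograms entry-wise keeps both peaks, whereas the Wasserstein barycenter moves mass along the geometry of the underlying space, which is the behavior exploited here for PSF morphing.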
We apply our method to a dataset of simulated Euclid-like PSFs (Point Spread Functions). ESA's Euclid mission will cover a large area of the sky in order to accurately measure the shapes of billions of galaxies. PSF estimation and correction is one of the main sources of systematic error in those galaxy shape measurements. PSF variations across the field of view and with the incoming light's wavelength can be highly non-linear, while still retaining strong geometrical structure, making Optimal Transport distances an attractive tool. We show that our representation does indeed succeed in capturing the PSF's variations.