Optimal Transport Tools (OTT)
OTT is a JAX package that bundles
a few utilities to compute, and differentiate as needed, the solution to optimal
transport (OT) problems, taken in a fairly wide sense. For instance, OTT can
compute Wasserstein (or Gromov-Wasserstein) distances between weighted
clouds of points (or histograms) in a wide variety of scenarios, but also
estimate Monge maps and Wasserstein barycenters, and help with simpler tasks such
as differentiable approximations to ranking or even clustering.
To achieve this, OTT rests on two families of tools:
The first family consists of discrete solvers computing transport between point clouds, using the Sinkhorn [Cuturi, 2013] and low-rank Sinkhorn [Scetbon et al., 2021] algorithms, and moving up towards Gromov-Wasserstein [Mémoli, 2011, Peyré et al., 2016];
the second family consists of continuous solvers, using suitable neural architectures such as an MLP or an input-convex neural network [Amos et al., 2017], coupled with SGD-like estimators [Amos, 2023, Korotin et al., 2021, Makkuva et al., 2020].
Install OTT from PyPI as:
pip install ott-jax
or with the neural OT dependencies:
pip install 'ott-jax[neural]'
or using conda as:
conda install -c conda-forge ott-jax
OTT is designed with the following choices:
Split geometry from OT solvers in the discrete case: We argue that there should be one, and only one, implementation of every major OT algorithm (Sinkhorn, Gromov-Wasserstein, barycenters, etc.), regardless of the geometric setup that is considered. To give a concrete example, any speedup one may obtain by using a specific cost (e.g. Sinkhorn being faster when run on a separable cost on histograms supported on a separable grid [Solomon et al., 2015]) should not require a separate reimplementation of a Sinkhorn routine.
As a consequence, and to minimize code duplication, we use object hierarchies as often as possible, and interleave outer solvers (such as quadratic, a.k.a. Gromov-Wasserstein, solvers) with inner solvers (e.g., low-rank Sinkhorn). This choice ensures that speedups achieved at lower computation levels (e.g. low-rank factorization of squared Euclidean distances) propagate seamlessly and automatically to higher-level calls (e.g. updates in Gromov-Wasserstein), without requiring any attention from the user.
ott.geometry contains classes that instantiate the ground cost matrix used to specify OT problems. Here, cost matrix can be understood in a literal sense (by actually passing a matrix) or in an abstract sense (by passing information that is sufficient to recreate that matrix, apply all or parts of it, or apply its kernel). A typical example of the latter case arises when comparing two point clouds, paired with a cost function. Geometry objects are used to describe OT problems, which are then solved by solvers.
ott.problems is used to describe linear, quadratic or barycenter OT problems.
ott.initializers implements simple strategies to initialize solvers. When the problem is convex, as is the case for a LinearProblem solved with a Sinkhorn solver, the initialization is mostly useful to speed up convergence. When the problem is not convex, which is the case for most other uses of this toolbox, the initialization can play a decisive role in reaching a useful solution.
ott.neural provides tools to parameterize optimal transport maps, couplings or conditional probabilities as neural networks.
ott.tools provides an interface to exploit OT solutions, as produced by solvers from the ott.solvers module. Such tasks include computing approximations to Wasserstein distances [Genevay et al., 2018, Séjourné et al., 2019], approximating OT between GMMs, or computing differentiable sort and quantile operations [Cuturi et al., 2019].
ott.math holds low-level miscellaneous mathematical primitives, such as an implementation of the matrix square-root.
ott.utils provides miscellaneous helper functions.