Optimal Transport Tools (OTT)
OTT is a JAX package that bundles
a few utilities to compute, and differentiate as needed, the solution to optimal
transport (OT) problems, taken in a fairly wide sense. For instance, OTT can
of course compute Wasserstein (or Gromov-Wasserstein) distances between weighted
clouds of points (or histograms) in a wide variety of scenarios, but also
estimate Monge maps, Wasserstein barycenters, and help with simpler tasks such
as differentiable approximations to ranking or even clustering.
To achieve this, OTT rests on two families of tools:
The first family consists of discrete solvers that compute transport between point clouds, using the Sinkhorn [Cuturi, 2013] and low-rank Sinkhorn [Scetbon et al., 2021] algorithms, and moving up towards Gromov-Wasserstein [Mémoli, 2011, Peyré et al., 2016].
The second family consists of continuous solvers, which use suitable neural architectures such as an MLP or an input-convex neural network [Amos et al., 2017], coupled with SGD-like estimators [Amos, 2023, Korotin et al., 2021, Makkuva et al., 2020].
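To make the first family concrete, the Sinkhorn iterations between two weighted point clouds can be sketched in a few lines of plain JAX. This is an illustrative sketch only, not OTT's implementation: the function name `sinkhorn_log`, its parameters, the squared Euclidean cost, and the fixed iteration count are all assumptions made for the example.

```python
import jax
import jax.numpy as jnp
from jax.scipy.special import logsumexp


def sinkhorn_log(x, y, a, b, epsilon=0.5, num_iters=200):
    """Entropic OT between (x, a) and (y, b), using log-domain updates.

    Hypothetical helper written for illustration; not part of OTT.
    """
    # Squared Euclidean cost matrix of shape (n, m).
    cost = jnp.sum((x[:, None, :] - y[None, :, :]) ** 2, axis=-1)
    log_a, log_b = jnp.log(a), jnp.log(b)

    def body(_, potentials):
        f, g = potentials
        # Update f so that the row marginals of the coupling match a.
        f = -epsilon * logsumexp(
            (g[None, :] - cost) / epsilon + log_b[None, :], axis=1
        )
        # Update g so that the column marginals of the coupling match b.
        g = -epsilon * logsumexp(
            (f[:, None] - cost) / epsilon + log_a[:, None], axis=0
        )
        return f, g

    init = (jnp.zeros_like(a), jnp.zeros_like(b))
    f, g = jax.lax.fori_loop(0, num_iters, body, init)

    # Coupling P_ij = a_i * b_j * exp((f_i + g_j - C_ij) / epsilon).
    log_coupling = (
        (f[:, None] + g[None, :] - cost) / epsilon
        + log_a[:, None]
        + log_b[None, :]
    )
    coupling = jnp.exp(log_coupling)
    return coupling, jnp.sum(coupling * cost)


# Two small point clouds in the unit square, with uniform weights.
key_x, key_y = jax.random.split(jax.random.PRNGKey(0))
x = jax.random.uniform(key_x, (5, 2))
y = jax.random.uniform(key_y, (6, 2))
a = jnp.full(5, 1 / 5)
b = jnp.full(6, 1 / 6)

coupling, cost_value = sinkhorn_log(x, y, a, b)
print(coupling.shape, float(cost_value))
```

A fixed number of iterations keeps the sketch short; a practical solver would instead monitor the marginal error and stop once it falls below a threshold.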
Install OTT from PyPI as:
pip install ott-jax
or with conda via conda-forge as:
conda install -c conda-forge ott-jax
OTT is designed with the following choices:
Take advantage, whenever possible, of JAX features such as just-in-time (JIT) compilation, auto-vectorization (VMAP), and both automatic and, most importantly, implicit differentiation.
Split geometry from OT solvers in the discrete case: We argue that there should be one, and only one, implementation of every major OT algorithm (Sinkhorn, Gromov-Wasserstein, barycenters, etc.), regardless of the geometric setup that is considered. To give a concrete example, any speedup one may benefit from by using a specific cost (e.g. Sinkhorn being faster when run on a separable cost on histograms supported on a separable grid [Solomon et al., 2015]) should not require a separate reimplementation of a Sinkhorn routine.
As a consequence, and to minimize code copy/pasting, use object hierarchies as often as possible, and interleave outer solvers (such as quadratic, aka Gromov-Wasserstein, solvers) with inner solvers (e.g., low-rank Sinkhorn). This choice ensures that speedups achieved at lower computation levels (e.g. low-rank factorization of squared Euclidean distances) propagate seamlessly and automatically to higher-level calls (e.g. updates in Gromov-Wasserstein), without requiring any attention from the user.
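The separation between geometry and solver described above can be illustrated with a small sketch. Everything in it is hypothetical and written for this example only (the classes `DenseGeometry` and `GridGeometry`, the method `apply_kernel`, and the function `sinkhorn_scalings` are not OTT's API): a single Sinkhorn routine only ever calls `apply_kernel`, so a geometry that exploits a separable cost on a grid speeds up the same solver without any change to it. The last lines show the same routine composed with `jax.jit` and `jax.vmap`.

```python
import jax
import jax.numpy as jnp


class DenseGeometry:
    """Hypothetical geometry that stores the full kernel matrix."""

    def __init__(self, cost, epsilon):
        self.kernel = jnp.exp(-cost / epsilon)

    def apply_kernel(self, v, transpose=False):
        k = self.kernel.T if transpose else self.kernel
        return k @ v


class GridGeometry:
    """Hypothetical geometry for a separable squared-distance cost on a 2D grid."""

    def __init__(self, xs, ys, epsilon):
        # One small kernel per axis; the full kernel is their Kronecker product.
        self.kx = jnp.exp(-((xs[:, None] - xs[None, :]) ** 2) / epsilon)
        self.ky = jnp.exp(-((ys[:, None] - ys[None, :]) ** 2) / epsilon)

    def apply_kernel(self, v, transpose=False):
        # The kernel is symmetric here, so `transpose` has no effect.
        grid = v.reshape(self.kx.shape[0], self.ky.shape[0])
        return (self.kx @ grid @ self.ky.T).reshape(-1)


def sinkhorn_scalings(geom, a, b, num_iters=200):
    """Single Sinkhorn routine; it only touches the geometry via apply_kernel."""

    def body(_, scalings):
        u, v = scalings
        u = a / geom.apply_kernel(v)
        v = b / geom.apply_kernel(u, transpose=True)
        return u, v

    init = (jnp.ones_like(a), jnp.ones_like(b))
    return jax.lax.fori_loop(0, num_iters, body, init)


# A 4 x 5 grid, flattened in row-major order to 20 points.
epsilon = 0.5
xs, ys = jnp.linspace(0.0, 1.0, 4), jnp.linspace(0.0, 1.0, 5)
points = jnp.stack(jnp.meshgrid(xs, ys, indexing="ij"), axis=-1).reshape(-1, 2)
cost = jnp.sum((points[:, None, :] - points[None, :, :]) ** 2, axis=-1)

dense_geom = DenseGeometry(cost, epsilon)
grid_geom = GridGeometry(xs, ys, epsilon)

key_a, key_batch = jax.random.split(jax.random.PRNGKey(0))
a = jax.nn.softmax(jax.random.normal(key_a, (20,)))
b = jnp.full(20, 1 / 20)

# Same solver, two geometries: the results agree up to rounding.
u_dense, v_dense = sinkhorn_scalings(dense_geom, a, b)
u_grid, v_grid = sinkhorn_scalings(grid_geom, a, b)

# The same routine can be compiled and vectorized over a batch of histograms.
batch_a = jax.nn.softmax(jax.random.normal(key_batch, (3, 20)), axis=-1)
batched = jax.jit(jax.vmap(lambda a_i: sinkhorn_scalings(grid_geom, a_i, b)[0]))
batched_u = batched(batch_a)
print(batched_u.shape)
```

The grid geometry never materializes the 20 x 20 kernel; it applies two small per-axis kernels instead, which is the kind of speedup that a geometry can provide while the solver stays untouched.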
ott.geometry contains classes that instantiate the ground cost matrix used to specify OT problems. Here, cost matrix can be understood in a literal sense (by actually passing a matrix) or in an abstract sense (by passing information that is sufficient to recreate that matrix, apply all or parts of it, or apply its kernel). A typical example of the latter arises when comparing two point clouds, paired with a cost function. Geometry objects are used to describe OT problems, which are then solved by solvers.
ott.problems are used to describe linear, quadratic or barycenter OT problems.
ott.solvers solve a problem instantiated with ott.problems using one of the implemented techniques.
ott.initializers are used to speed up the resolution of OT solvers.
ott.tools provides an interface to exploit OT solutions, as produced by solvers from the ott.solvers module. Such tasks include computing approximations to Wasserstein distances [Genevay et al., 2018, Séjourné et al., 2019], approximating OT between GMMs, or computing differentiable sort and quantile operations [Cuturi et al., 2019].
ott.math holds low-level mathematical primitives.
ott.utils provides miscellaneous helper functions.