
Optimal Transport Tools (OTT)


OTT is a JAX package that bundles a few utilities to compute, and differentiate as needed, the solution to optimal transport (OT) problems, taken in a fairly wide sense. For instance, OTT can of course compute Wasserstein (or Gromov-Wasserstein) distances between weighted clouds of points (or histograms) in a wide variety of scenarios, but also estimate Monge maps, Wasserstein barycenters, and help with simpler tasks such as differentiable approximations to ranking or even clustering.

To achieve this, OTT rests on two families of tools:


Install OTT from PyPI as:

pip install ott-jax

or with the neural OT dependencies:

pip install 'ott-jax[neural]'

or using conda as:

conda install -c conda-forge ott-jax

Design Choices

OTT is designed with the following choices:

  • Take advantage, whenever possible, of JAX features such as just-in-time (JIT) compilation, auto-vectorization (VMAP), and both automatic and, most importantly, implicit differentiation.

  • Split geometry from OT solvers in the discrete case: we argue that there should be one, and only one, implementation of every major OT algorithm (Sinkhorn, Gromov-Wasserstein, barycenters, etc.), regardless of the geometric setup considered. To give a concrete example, any speedup one may obtain by using a specific cost (e.g. Sinkhorn being faster when run with a separable cost on histograms supported on a separable grid [Solomon et al., 2015]) should not require a separate reimplementation of the Sinkhorn routine.

  • As a consequence, and to minimize code duplication, use object hierarchies as often as possible, and interleave outer solvers (such as quadratic, a.k.a. Gromov-Wasserstein, solvers) with inner solvers (e.g. low-rank Sinkhorn). This choice ensures that speedups achieved at lower computation levels (e.g. low-rank factorization of squared Euclidean distances) propagate automatically to higher-level calls (e.g. updates in Gromov-Wasserstein), without requiring any attention from the user.
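The split between geometry and solver can be illustrated with a small, self-contained JAX sketch. This is not OTT's implementation: the function sinkhorn, the helpers apply_dense and apply_grid, and all parameter values below are hypothetical names chosen for illustration. The single Sinkhorn routine only sees the cost through kernel products, so a dense kernel and a separable grid kernel (in the spirit of the separable-grid example above) share the same solver code, and JIT and VMAP apply to it unchanged.

```python
import jax
import jax.numpy as jnp


def sinkhorn(apply_k, apply_kt, a, b, num_iters=200):
    """Sinkhorn scalings (u, v); the cost is only accessed via kernel products."""
    def body(_, uv):
        _, v = uv
        u = a / apply_k(v)    # match the first marginal
        v = b / apply_kt(u)   # match the second marginal
        return u, v
    init = (jnp.ones_like(a), jnp.ones_like(b))
    return jax.lax.fori_loop(0, num_iters, body, init)


# Squared Euclidean cost on a 5 x 6 grid: the Gibbs kernel factorizes as k1 (x) k2.
epsilon = 0.5
gx = jnp.linspace(0.0, 1.0, 5)
gy = jnp.linspace(0.0, 1.0, 6)
k1 = jnp.exp(-((gx[:, None] - gx[None, :]) ** 2) / epsilon)  # (5, 5)
k2 = jnp.exp(-((gy[:, None] - gy[None, :]) ** 2) / epsilon)  # (6, 6)

# "Geometry" 1: materialize the full 30 x 30 kernel.
k_dense = jnp.kron(k1, k2)
apply_dense = lambda v: k_dense @ v


# "Geometry" 2: apply the same kernel separably, never forming the 30 x 30 matrix.
def apply_grid(v):
    return (k1 @ v.reshape(5, 6) @ k2.T).reshape(-1)


a = jax.nn.softmax(jax.random.normal(jax.random.PRNGKey(0), (30,)))
b = jnp.full((30,), 1.0 / 30)

# The kernel is symmetric, so the same function serves as its transpose.
u_dense, v_dense = sinkhorn(apply_dense, apply_dense, a, b)
u_grid, v_grid = jax.jit(lambda a, b: sinkhorn(apply_grid, apply_grid, a, b))(a, b)

# The same routine is vectorized over a batch of source histograms with vmap.
a_batch = jax.nn.softmax(
    jax.random.normal(jax.random.PRNGKey(1), (4, 30)), axis=-1
)
u_batch, v_batch = jax.vmap(
    lambda a_: sinkhorn(apply_grid, apply_grid, a_, b)
)(a_batch)
```

Both calls run the same solver code. Only the way the kernel is applied differs, which is where a geometry-specific speedup would live.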


  • ott.geometry contains classes that instantiate the ground cost matrix used to specify OT problems. Here, "cost matrix" can be understood in a literal sense (by actually passing a matrix) or in an abstract sense (by passing information sufficient to recreate that matrix, apply all or parts of it, or apply its kernel). A typical example of the latter arises when comparing two point clouds paired with a cost function. Geometry objects describe OT problems, which are then solved by solvers.

  • ott.problems is used to describe linear, quadratic, or barycenter OT problems.

  • ott.solvers solves a problem instantiated with ott.problems using one of the implemented techniques.

  • ott.initializers implements simple strategies to initialize solvers. When the problem is convex, such as a LinearProblem solved with a Sinkhorn solver, the initialization is mostly useful to speed up convergence. When the problem is not convex, which is the case for most other uses of this toolbox, the initialization can play a decisive role in reaching a useful solution.

  • ott.neural provides tools to parameterize optimal transport maps, couplings or conditional probabilities as neural networks.

  • provides an interface to exploit OT solutions, as produced by solvers from the ott.solvers module. Such tasks include computing approximations to Wasserstein distances [Genevay et al., 2018, Séjourné et al., 2019], approximating OT between GMMs, or computing differentiable sort and quantile operations [Cuturi et al., 2019].

  • ott.math holds low-level miscellaneous mathematical primitives, such as an implementation of the matrix square-root.

  • ott.utils provides miscellaneous helper functions.