ott.geometry.geometry.Geometry

class ott.geometry.geometry.Geometry(cost_matrix=None, kernel_matrix=None, epsilon=None, relative_epsilon=None, scale_cost=1.0)

Base class to define ground costs/kernels used in optimal transport.

Optimal transport problems are intrinsically geometric: they compute an optimal way to transport mass from one configuration onto another. Defining what makes a transport optimal requires a ground cost, which quantifies how costly it is to move mass from any one source location to any one target location. These source and target locations can be described as points in vector spaces or on grids, or more generally through a (dissimilarity) cost matrix or, almost equivalently, a (similarity) kernel matrix. This class describes such a geometry and provides several methods to exploit it.

Parameters:
  • cost_matrix (Optional[Array]) – Cost matrix of shape [n, m].

  • kernel_matrix (Optional[Array]) – Kernel matrix of shape [n, m].

  • epsilon (Union[float, Epsilon, None]) –

    Regularization parameter or a scheduler:

    • if epsilon = None and relative_epsilon = None, \(0.05 \cdot \text{std}(\text{cost\_matrix})\) is used.

    • if epsilon is a float and relative_epsilon = None, it directly corresponds to the regularization strength.

    • otherwise, epsilon multiplies the mean_cost_matrix or std_cost_matrix, depending on the value of relative_epsilon.

    If epsilon = None, the value DEFAULT_EPSILON_SCALE = 0.05 will be used.

  • relative_epsilon (Optional[Literal['mean', 'std']]) – Whether epsilon refers to a fraction of the mean_cost_matrix or std_cost_matrix.

  • scale_cost (Union[float, Literal['mean', 'max_cost', 'median', 'std']]) – Option to rescale the cost matrix. Implemented scalings are ‘median’, ‘mean’, ‘std’ and ‘max_cost’. Alternatively, a float factor can be given to rescale the cost such that cost_matrix /= scale_cost.
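
The scale_cost options can be illustrated with a minimal NumPy sketch. It does not call the library: rescale_cost is a hypothetical helper, and it assumes that each named option divides the cost matrix by the corresponding statistic, in the same way a float factor does.

```python
import numpy as np

# Hypothetical helper, not part of ott: mimics the documented scale_cost options.
# Assumption: a named option divides the cost by that statistic of the matrix.
_STATISTICS = {
    "mean": np.mean,
    "median": np.median,
    "std": np.std,
    "max_cost": np.max,
}


def rescale_cost(cost, scale_cost=1.0):
    """Return cost / factor, where factor is a float or a named statistic."""
    if isinstance(scale_cost, str):
        factor = _STATISTICS[scale_cost](cost)
    else:
        factor = float(scale_cost)
    return cost / factor


cost = np.array([[0.0, 2.0], [4.0, 6.0]])
scaled = rescale_cost(cost, "mean")  # the rescaled matrix has mean 1
```

Rescaling by a statistic of the matrix makes a given epsilon comparable across problems whose raw costs live on very different scales.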

Note

When defining a Geometry through a cost_matrix, it is important to select an epsilon regularization parameter that is meaningful. That parameter can be provided by the user, or assigned a default value through a simple rule, using for instance the mean_cost_matrix or the std_cost_matrix.
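
The rule for choosing epsilon described above can be sketched in plain NumPy. This is an illustration of the documented behavior, not the library's code: resolve_epsilon is a hypothetical name, and the handling of epsilon = None together with a non-None relative_epsilon (falling back to the 0.05 multiplier) is an assumption.

```python
import numpy as np

DEFAULT_EPSILON_SCALE = 0.05  # value quoted in the parameter description


def resolve_epsilon(cost, epsilon=None, relative_epsilon=None):
    """Hypothetical helper: turn (epsilon, relative_epsilon) into a float."""
    if relative_epsilon not in (None, "mean", "std"):
        raise ValueError("relative_epsilon must be None, 'mean' or 'std'")
    if epsilon is None and relative_epsilon is None:
        # Default rule: 0.05 times the standard deviation of the cost.
        return float(DEFAULT_EPSILON_SCALE * np.std(cost))
    if relative_epsilon is None:
        # A plain float is used directly as the regularization strength.
        return float(epsilon)
    # Otherwise epsilon is a multiplier of the mean or std of the cost.
    # Assumption: a missing epsilon falls back to the default multiplier.
    multiplier = DEFAULT_EPSILON_SCALE if epsilon is None else epsilon
    statistic = np.mean(cost) if relative_epsilon == "mean" else np.std(cost)
    return float(multiplier * statistic)


cost = np.array([[0.0, 2.0], [4.0, 6.0]])
default_eps = resolve_epsilon(cost)
relative_eps = resolve_epsilon(cost, epsilon=0.1, relative_epsilon="mean")
```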

Methods

apply_cost(arr[, axis, fn, is_linear])

Apply cost_matrix to array (vector or matrix).

apply_kernel(vec[, eps, axis])

Apply kernel_matrix on positive scaling vector.

apply_lse_kernel(f, g, eps[, vec, axis])

Apply kernel_matrix in log domain.

apply_square_cost(arr[, axis])

Apply elementwise-square of cost matrix to array (vector or matrix).

apply_transport_from_potentials(f, g, vec[, ...])

Apply transport matrix computed from potentials to a (batched) vec.

apply_transport_from_scalings(u, v, vec[, axis])

Apply transport matrix computed from scalings to a (batched) vec.

copy_epsilon(other)

Copy the epsilon parameters from another geometry.

marginal_from_potentials(f, g[, axis])

Output marginal of transportation matrix from potentials.

marginal_from_scalings(u, v[, axis])

Output marginal of transportation matrix from scalings.

potential_from_scaling(scaling)

Compute dual potential vector from scaling vector.

prepare_divergences(*args[, static_b])

Instantiate 2 (or 3) geometries to compute a Sinkhorn divergence.

scaling_from_potential(potential)

Compute scaling vector from dual potential.

set_scale_cost(scale_cost)

Modify how the cost_matrix is rescaled.

subset([row_ixs, col_ixs])

Subset rows or columns of a geometry.

to_LRCGeometry([rank, tol, rng, scale])

Factorize the cost matrix using either SVD (full) or [Indyk et al., 2019].

transport_from_potentials(f, g)

Output transport matrix from potentials.

transport_from_scalings(u, v)

Output transport matrix from pair of scalings.

update_potential(f, g, log_marginal[, ...])

Carry out one Sinkhorn update for potentials, i.e. in log space.

update_scaling(scaling, marginal[, ...])

Carry out one Sinkhorn update for scalings, using kernel directly.
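
Several of the methods above convert between scalings, potentials and transport matrices. The following NumPy sketch shows how these objects relate under the usual entropic optimal transport conventions, assumed here: kernel K = exp(-C / eps), scaling u = exp(f / eps), and transport matrix P = diag(u) K diag(v). The functions are standalone stand-ins that borrow the method names for readability; they are not the library's implementation and their signatures differ from the methods listed.

```python
import numpy as np


def kernel_from_cost(cost, eps):
    """Gibbs kernel K = exp(-C / eps)."""
    return np.exp(-cost / eps)


def scaling_from_potential(potential, eps):
    """Scaling u = exp(f / eps)."""
    return np.exp(potential / eps)


def potential_from_scaling(scaling, eps):
    """Potential f = eps * log(u)."""
    return eps * np.log(scaling)


def transport_from_scalings(kernel, u, v):
    """P_ij = u_i * K_ij * v_j."""
    return u[:, None] * kernel * v[None, :]


def transport_from_potentials(cost, f, g, eps):
    """P_ij = exp((f_i + g_j - C_ij) / eps)."""
    return np.exp((f[:, None] + g[None, :] - cost) / eps)


def sinkhorn_scalings(kernel, a, b, num_iters=200):
    """Alternate the two scaling updates for a fixed number of iterations."""
    u, v = np.ones_like(a), np.ones_like(b)
    for _ in range(num_iters):
        u = a / (kernel @ v)    # match the row marginal a
        v = b / (kernel.T @ u)  # match the column marginal b
    return u, v


eps = 1.0
cost = np.array([[0.0, 1.0, 2.0], [1.0, 0.0, 1.0]])
a = np.array([0.5, 0.5])
b = np.array([0.2, 0.3, 0.5])

kernel = kernel_from_cost(cost, eps)
u, v = sinkhorn_scalings(kernel, a, b)
f, g = potential_from_scaling(u, eps), potential_from_scaling(v, eps)

P = transport_from_scalings(kernel, u, v)
P_from_potentials = transport_from_potentials(cost, f, g, eps)
```

Both routes give the same transport matrix. The potential (log-domain) form is generally preferred for small eps, where the entries of the kernel can underflow.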

Attributes

can_LRC

Check quickly if casting geometry as LRC makes sense.

cost_matrix

Cost matrix, recomputed from kernel if only kernel was specified.

cost_rank

Output rank of cost matrix, if any was provided.

diag_cost

Diagonal of the cost matrix.

dtype

The data type.

epsilon

Epsilon regularization value.

epsilon_scheduler

Epsilon scheduler.

inv_scale_cost

Compute and return inverse of scaling factor for cost matrix.

is_online

Whether geometry cost/kernel should be recomputed on the fly.

is_square

Whether geometry cost/kernel is a square matrix.

is_squared_euclidean

Whether cost is computed by taking squared Euclidean distance.

is_symmetric

Whether geometry cost/kernel is a symmetric matrix.

kernel_matrix

Kernel matrix.

mean_cost_matrix

Mean of the cost_matrix.

median_cost_matrix

Median of the cost_matrix.

shape

Shape of the geometry.

std_cost_matrix

Standard deviation of all values stored in cost_matrix.
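
As a rough illustration of what several of these attributes report, the NumPy sketch below computes the analogous quantities for a small cost matrix. It does not use the library. The cost-from-kernel line assumes the kernel is exp(-cost / eps), so the cost is recovered as -eps * log(kernel); the symmetry check via np.allclose is likewise an assumption about how such a test could be done.

```python
import numpy as np

eps = 0.5
cost = np.array([
    [0.0, 1.0, 3.0],
    [1.0, 0.0, 2.0],
    [3.0, 2.0, 0.0],
])

# Assumed kernel/cost relation: K = exp(-C / eps), hence C = -eps * log(K).
kernel = np.exp(-cost / eps)
recovered_cost = -eps * np.log(kernel)

is_square = cost.shape[0] == cost.shape[1]
summary = {
    "shape": cost.shape,
    "is_square": is_square,
    "is_symmetric": is_square and bool(np.allclose(cost, cost.T)),
    "diag_cost": np.diag(cost),
    "mean_cost_matrix": float(np.mean(cost)),
    "median_cost_matrix": float(np.median(cost)),
    "std_cost_matrix": float(np.std(cost)),
}
```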