ott.geometry.pointcloud.PointCloud

class ott.geometry.pointcloud.PointCloud(x, y=None, cost_fn=None, power=2.0, batch_size=None, scale_cost=1.0, **kwargs)

Defines a geometry for two point clouds (possibly one against itself) using a CostFn.

Creates a geometry from a cost function passed as a CostFn object. When the number of points is large, setting batch_size to a positive integer enables online mode: the cost and kernel matrices used to update potentials or scalings are recomputed on the fly rather than stored in memory. More precisely, in online mode the cost function is partially cached by storing the norm of each point in both point clouds, but the pairwise cost evaluations are not stored. The sum of the norms and the pairwise cost term is raised to power.
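The split between cached norms and an uncached pairwise term can be illustrated with a small NumPy sketch. It assumes a Euclidean-type cost, where the norm of a point is its squared Euclidean norm and the pairwise term is -2<x, y>. These conventions and all variable names are assumptions made for illustration, not the library's internals.

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(size=(5, 3))  # n x d source points
y = rng.normal(size=(4, 3))  # m x d target points

# Per-point norms: cheap to cache, O(n + m) memory.
norm_x = np.sum(x**2, axis=1)
norm_y = np.sum(y**2, axis=1)

# Pairwise term: the only O(n * m) piece, so it is the part that an
# online mode would recompute instead of storing (assumed: -2 <x, y>).
pairwise = -2.0 * x @ y.T

# Sum of norms plus the pairwise term gives squared Euclidean distances.
cost = norm_x[:, None] + norm_y[None, :] + pairwise

# Reference: direct computation from pairwise differences.
direct = np.sum((x[:, None, :] - y[None, :, :]) ** 2, axis=-1)
print(np.allclose(cost, direct))
```

Only norm_x and norm_y need to persist between calls. The pairwise block can be rebuilt whenever it is needed.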

Parameters
  • x (ndarray) – n x d array of n d-dimensional vectors

  • y (Optional[ndarray]) – m x d array of m d-dimensional vectors. If None, use x.

  • cost_fn (Optional[CostFn]) – a CostFn function between two points in dimension d.

  • power (float) – the power to which (norm(x) + norm(y) + cost(x, y)) is raised.

  • batch_size (Optional[int]) – When None, the cost matrix for this point cloud is computed, stored, and reused at each call to apply_lse_kernel(). When batch_size is a positive integer, computations are done online: the cost matrix is recomputed at each call to apply_lse_kernel(), batch_size rows at a time, applied to a vector, and discarded. Online computation is particularly useful for large point clouds whose cost matrix does not fit in memory.

  • scale_cost (Union[bool, int, float, Literal['mean', 'max_norm', 'max_bound', 'max_cost', 'median']]) – option to rescale the cost matrix. Implemented scalings are 'median', 'mean', 'max_cost', 'max_norm' and 'max_bound'. Alternatively, a float factor can be given to rescale the cost such that cost_matrix /= scale_cost. If True, use 'mean'.

  • kwargs (Any) – other optional parameters to be passed on to superclass initializer, notably those related to epsilon regularization.
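The online behaviour described for batch_size can be sketched in plain NumPy. The example computes a row-wise log-sum-exp of (f_i + g_j - C_ij) / eps, once with a stored cost matrix and once with blocks of batch_size rows that are built, used, and discarded. The helper names lse_rows and online_lse, the squared Euclidean cost, and the exact quantity being reduced are assumptions for illustration. This is not the code behind apply_lse_kernel().

```python
import numpy as np

def lse_rows(cost_block, f_block, g, eps):
    """Row-wise eps * logsumexp_j((f_i + g_j - C_ij) / eps), computed stably."""
    z = (f_block[:, None] + g[None, :] - cost_block) / eps
    z_max = z.max(axis=1, keepdims=True)
    return eps * (z_max[:, 0] + np.log(np.exp(z - z_max).sum(axis=1)))

def online_lse(x, y, f, g, eps, batch_size):
    """Process batch_size rows at a time; never hold the full n x m cost."""
    out = np.empty(x.shape[0])
    for start in range(0, x.shape[0], batch_size):
        stop = start + batch_size
        xb = x[start:stop]
        # Cost block for this batch only (squared Euclidean, an assumption).
        block = np.sum((xb[:, None, :] - y[None, :, :]) ** 2, axis=-1)
        out[start:stop] = lse_rows(block, f[start:stop], g, eps)
    return out

rng = np.random.default_rng(0)
x, y = rng.normal(size=(7, 2)), rng.normal(size=(5, 2))
f, g = rng.normal(size=7), rng.normal(size=5)

# Stored mode: build the full cost matrix once and reuse it.
full_cost = np.sum((x[:, None, :] - y[None, :, :]) ** 2, axis=-1)
stored = lse_rows(full_cost, f, g, eps=0.5)

# Online mode: same result, at most batch_size x m entries alive at a time.
online = online_lse(x, y, f, g, eps=0.5, batch_size=3)
print(np.allclose(stored, online))
```

The trade-off is memory against recomputation: the batched version redoes the cost evaluation on every call, but its peak memory scales with batch_size * m rather than n * m.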

Methods

apply_cost(arr[, axis, fn, is_linear])

Apply cost matrix to array (vector or matrix).

apply_kernel(scaling[, eps, axis])

Apply kernel_matrix on positive scaling vector.

apply_lse_kernel(f, g, eps[, vec, axis])

Apply kernel_matrix in log domain on a pair of dual potential variables.

apply_square_cost(arr[, axis])

Apply elementwise-square of cost matrix to array (vector or matrix).

apply_transport_from_potentials(f, g, vec[, ...])

Apply transport matrix computed from potentials to a (batched) vec.

apply_transport_from_scalings(u, v, vec[, axis])

Apply transport matrix computed from scalings to a (batched) vec.

barycenter(weights)

Compute barycenter of points in self.x using weights, valid for power=2.0.

copy_epsilon(other)

Copy the epsilon parameters from another geometry.

marginal_from_potentials(f, g[, axis])

Output marginal of transportation matrix from potentials.

marginal_from_scalings(u, v[, axis])

Output marginal of transportation matrix from scalings.

mask(src_mask, tgt_mask[, mask_value])

Mask rows or columns of a geometry.

potential_from_scaling(scaling)

Compute dual potential vector from scaling vector.

prepare_divergences(x, y[, static_b, ...])

Instantiate the geometries used for a divergence computation.

scaling_from_potential(potential)

Compute scaling vector from dual potential.

subset(src_ixs, tgt_ixs, **kwargs)

Subset rows or columns of a geometry.

to_LRCGeometry([scale])

Convert a squared-Euclidean PointCloud to an LRCGeometry.

transport_from_potentials(f, g)

Output transport matrix from potentials.

transport_from_scalings(u, v)

Output transport matrix from pair of scalings.

update_potential(f, g, log_marginal[, ...])

Carry out one Sinkhorn update for potentials, i.e. in log space.

update_scaling(scaling, marginal[, ...])

Carry out one Sinkhorn update for scalings, using kernel directly.

vec_apply_cost(arr[, axis, fn])

Apply the geometry's cost matrix in a vectorised way.
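Several of the methods above come in pairs, one working with dual potentials (log domain) and one with scalings (kernel domain). The NumPy sketch below shows how the two views relate under the usual entropic optimal transport conventions: kernel = exp(-cost / eps), scaling = exp(potential / eps), and transport = diag(u) @ kernel @ diag(v). These conventions are assumed here for illustration and are not stated on this page. The code does not call the library.

```python
import numpy as np

rng = np.random.default_rng(0)
n, m, eps = 4, 3, 0.5
cost = rng.uniform(size=(n, m))
kernel = np.exp(-cost / eps)

# Uniform marginals.
a = np.full(n, 1.0 / n)
b = np.full(m, 1.0 / m)

# Sinkhorn iterations on scalings, using the kernel directly.
u, v = np.ones(n), np.ones(m)
for _ in range(200):
    u = a / (kernel @ v)
    v = b / (kernel.T @ u)

# Scalings to potentials (and back: u = exp(f / eps)).
f, g = eps * np.log(u), eps * np.log(v)

# The transport matrix can be formed from either representation.
transport_scalings = u[:, None] * kernel * v[None, :]
transport_potentials = np.exp((f[:, None] + g[None, :] - cost) / eps)

print(np.allclose(transport_scalings, transport_potentials))
print(np.allclose(transport_potentials.sum(axis=0), b))  # column marginal
```

Working with potentials avoids forming exp(-cost / eps) explicitly, which matters numerically when eps is small. The scaling form is cheaper per iteration when the kernel is well conditioned.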

Attributes

batch_size

Batch size for online mode.

cost_matrix

Cost matrix, recomputed from kernel if only kernel was specified.

cost_rank

Output rank of cost matrix, if any was provided.

epsilon

Epsilon regularization value.

inv_scale_cost

Compute and return inverse of scaling factor for cost matrix.

is_online

Whether cost_matrix or kernel_matrix is computed on-the-fly.

is_squared_euclidean

Whether cost is computed by taking the squared Euclidean distance.

is_symmetric

Whether geometry cost/kernel is a symmetric matrix.

kernel_matrix

Kernel matrix, either provided by user or recomputed from cost_matrix.

mean_cost_matrix

Mean of the cost_matrix.

median_cost_matrix

Median of the cost_matrix.

scale_epsilon

Compute the scale of the epsilon, potentially based on data.

shape

Shape of the geometry.

src_mask

Mask of shape [num_a,] to compute cost_matrix statistics.

tgt_mask

Mask of shape [num_b,] to compute cost_matrix statistics.
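The relation between scale_cost, the cost statistics, and inv_scale_cost can be sketched as follows. The helper inv_scale is hypothetical. It covers only the 'mean', 'median' and 'max_cost' options (with 'max_cost' assumed to be the largest entry of the cost matrix) plus a numeric factor, and it treats False as "no rescaling", which is also an assumption.

```python
import numpy as np

def inv_scale(cost, scale_cost):
    """Hypothetical helper: inverse of the factor the cost is divided by."""
    if isinstance(scale_cost, bool):
        # True is documented as an alias for 'mean'; False is assumed
        # here to mean no rescaling.
        scale_cost = "mean" if scale_cost else 1.0
    if isinstance(scale_cost, str):
        stats = {"mean": np.mean, "median": np.median, "max_cost": np.max}
        return 1.0 / stats[scale_cost](cost)
    # Numeric factor: cost_matrix /= scale_cost.
    return 1.0 / float(scale_cost)

cost = np.array([[1.0, 2.0], [3.0, 6.0]])

# Rescaling by the mean yields a cost matrix whose mean is 1.
scaled = cost * inv_scale(cost, "mean")
print(scaled.mean())
```

Rescaling of this kind makes a given epsilon comparable across datasets whose raw costs live on different scales.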