ott.geometry.pointcloud.PointCloud

class ott.geometry.pointcloud.PointCloud(x, y=None, cost_fn=None, batch_size=None, scale_cost=1.0, **kwargs)

Defines a geometry for two point clouds (possibly one point cloud compared with itself).

Creates a geometry whose cost function is passed as a CostFn object. When the number of points is large, setting batch_size means that the cost and kernel matrices used to update potentials or scalings are recomputed on the fly rather than stored in memory. More precisely, when batch_size is set, the cost function is partially cached: the norm of each point in both point clouds is stored, but the pairwise cost evaluations are not.
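The online mode described above can be illustrated with a minimal, dependency-free sketch. It assumes a squared Euclidean cost, which splits into per-point norms (cached once) and a cross term (recomputed per batch). The names sq_norm and apply_cost_online are hypothetical helpers written for this illustration; they are not part of the library, and the actual implementation may differ.

```python
def sq_norm(p):
    """Squared Euclidean norm of one point (a list of floats)."""
    return sum(v * v for v in p)


def apply_cost_online(x, y, vec, batch_size):
    """Compute C @ vec for C[i][j] = ||x_i - y_j||^2 without storing C.

    Norms are cached once per point; cross terms are recomputed
    batch_size rows at a time, applied to vec, and discarded.
    """
    x_norms = [sq_norm(p) for p in x]  # cached, length n
    y_norms = [sq_norm(q) for q in y]  # cached, length m
    out = []
    for start in range(0, len(x), batch_size):
        stop = min(start + batch_size, len(x))
        # Temporary block holding at most batch_size rows of the cost matrix.
        block = [
            [
                x_norms[i] + y_norms[j]
                - 2.0 * sum(a * b for a, b in zip(x[i], y[j]))
                for j in range(len(y))
            ]
            for i in range(start, stop)
        ]
        out.extend(sum(c * v for c, v in zip(row, vec)) for row in block)
    return out


x = [[0.0, 0.0], [1.0, 0.0], [0.0, 2.0]]
y = [[1.0, 1.0], [2.0, 0.0]]
vec = [1.0, 1.0]
result = apply_cost_online(x, y, vec, batch_size=2)
```

The result does not depend on the batch size; only the peak memory used for the temporary block does.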

Parameters
  • x (Array) – n x d array of n d-dimensional vectors

  • y (Optional[Array]) – m x d array of m d-dimensional vectors. If None, use x.

  • cost_fn (Optional[CostFn]) – a CostFn function between two points in dimension d.

  • batch_size (Optional[int]) – When None, the cost matrix for the point cloud is computed once, stored, and reused at each application of apply_lse_kernel(). When batch_size is a positive integer, computations are done online: the cost matrix is recomputed at each call of apply_lse_kernel(), batch_size rows at a time, applied to a vector, and discarded. Online computation is particularly useful for large point clouds whose cost matrix does not fit in memory.

  • scale_cost (Union[bool, int, float, Literal[‘mean’, ‘max_norm’, ‘max_bound’, ‘max_cost’, ‘median’]]) – option to rescale the cost matrix. Implemented scalings are ‘median’, ‘mean’, ‘max_cost’, ‘max_norm’ and ‘max_bound’. Alternatively, a float factor can be given to rescale the cost such that cost_matrix /= scale_cost. If True, use ‘mean’.

  • kwargs (Any) – other optional parameters to be passed on to superclass initializer, notably those related to epsilon regularization.
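As a rough illustration of the scale_cost options listed above, the following plain-Python sketch rescales a small cost matrix by a number or by a statistic of its entries. The helper rescale_cost is hypothetical and not part of the library. It covers only a numeric factor, True, 'mean', 'median' and 'max_cost'; 'max_norm' and 'max_bound' are left out because their definitions are not given here.

```python
import statistics


def rescale_cost(cost, scale_cost=1.0):
    """Return cost divided by a number or by a statistic of its entries."""
    flat = [c for row in cost for c in row]
    if scale_cost is True:  # per the parameter description, True means 'mean'
        scale_cost = "mean"
    if scale_cost == "mean":
        factor = sum(flat) / len(flat)
    elif scale_cost == "median":
        factor = statistics.median(flat)
    elif scale_cost == "max_cost":
        factor = max(flat)
    elif isinstance(scale_cost, (int, float)) and not isinstance(scale_cost, bool):
        factor = float(scale_cost)  # cost_matrix /= scale_cost
    else:
        # Other values (including False) are not covered by this sketch.
        raise ValueError(f"unsupported scale_cost: {scale_cost!r}")
    return [[c / factor for c in row] for row in cost]


cost = [[2.0, 4.0], [1.0, 1.0]]
by_mean = rescale_cost(cost, "mean")      # mean of the entries is 2.0
by_median = rescale_cost(cost, "median")  # median of the entries is 1.5
by_max = rescale_cost(cost, "max_cost")   # largest entry is 4.0
by_factor = rescale_cost(cost, 2.0)
```

Rescaling by a statistic makes the regularization strength less dependent on the units of the input points, which is one common reason to prefer it over a fixed factor.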

Methods

apply_cost(arr[, axis, fn, is_linear])

Apply cost matrix to array (vector or matrix).

apply_kernel(scaling[, eps, axis])

Apply kernel_matrix on positive scaling vector.

apply_lse_kernel(f, g, eps[, vec, axis])

Apply kernel_matrix in log domain on a pair of dual potential variables.

apply_square_cost(arr[, axis])

Apply elementwise-square of cost matrix to array (vector or matrix).

apply_transport_from_potentials(f, g, vec[, ...])

Apply transport matrix computed from potentials to a (batched) vec.

apply_transport_from_scalings(u, v, vec[, axis])

Apply transport matrix computed from scalings to a (batched) vec.

barycenter(weights)

Compute barycenter of points in self.x using weights.

copy_epsilon(other)

Copy the epsilon parameters from another geometry.

marginal_from_potentials(f, g[, axis])

Output marginal of transportation matrix from potentials.

marginal_from_scalings(u, v[, axis])

Output marginal of transportation matrix from scalings.

mask(src_mask, tgt_mask[, mask_value])

Mask rows or columns of a geometry.

potential_from_scaling(scaling)

Compute dual potential vector from scaling vector.

prepare_divergences(x, y[, static_b, ...])

Instantiate the geometries used for a divergence computation.

scaling_from_potential(potential)

Compute scaling vector from dual potential.

subset(src_ixs, tgt_ixs, **kwargs)

Subset rows or columns of a geometry.

to_LRCGeometry([scale])

Convert point cloud to low-rank geometry.

transport_from_potentials(f, g)

Output transport matrix from potentials.

transport_from_scalings(u, v)

Output transport matrix from pair of scalings.

update_potential(f, g, log_marginal[, ...])

Carry out one Sinkhorn update for potentials, i.e. in log space.

update_scaling(scaling, marginal[, ...])

Carry out one Sinkhorn update for scalings, using kernel directly.

vec_apply_cost(arr[, axis, fn])

Apply the geometry's cost matrix in a vectorised way.
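Several of the methods above revolve around dual potentials f and g. The sketch below shows the textbook log-domain Sinkhorn iteration and the transport matrix it induces, using only the standard library. It is an illustration under stated assumptions, not the library's code: the function names (logsumexp, sinkhorn_update_f, sinkhorn_update_g, plan_from_potentials) are hypothetical, and the exact conventions used by update_potential() and apply_lse_kernel() may differ.

```python
import math


def logsumexp(values):
    """Numerically stable log(sum(exp(v) for v in values))."""
    top = max(values)
    return top + math.log(sum(math.exp(v - top) for v in values))


def sinkhorn_update_f(cost, g, log_a, eps):
    """One log-domain update of the row potential f, given g."""
    return [
        eps * (log_a[i]
               - logsumexp([(g[j] - cost[i][j]) / eps for j in range(len(g))]))
        for i in range(len(cost))
    ]


def sinkhorn_update_g(cost, f, log_b, eps):
    """One log-domain update of the column potential g, given f."""
    return [
        eps * (log_b[j]
               - logsumexp([(f[i] - cost[i][j]) / eps for i in range(len(f))]))
        for j in range(len(cost[0]))
    ]


def plan_from_potentials(cost, f, g, eps):
    """Transport matrix P[i][j] = exp((f[i] + g[j] - cost[i][j]) / eps)."""
    return [
        [math.exp((f[i] + g[j] - cost[i][j]) / eps) for j in range(len(g))]
        for i in range(len(f))
    ]


cost = [[0.0, 1.0], [2.0, 0.5]]
a, b = [0.3, 0.7], [0.6, 0.4]  # source and target marginals
eps = 0.5
log_a = [math.log(w) for w in a]
log_b = [math.log(w) for w in b]

f, g = [0.0, 0.0], [0.0, 0.0]
for _ in range(200):
    f = sinkhorn_update_f(cost, g, log_a, eps)
    g = sinkhorn_update_g(cost, f, log_b, eps)

plan = plan_from_potentials(cost, f, g, eps)
row_sums = [sum(row) for row in plan]
col_sums = [sum(plan[i][j] for i in range(2)) for j in range(2)]
```

After convergence the row and column sums of the plan match the marginals a and b. Scalings and potentials are related by u = exp(f / eps) and v = exp(g / eps), which is the correspondence that scaling_from_potential() and potential_from_scaling() refer to.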

Attributes

batch_size

Batch size for online mode.

can_LRC

Check quickly if casting geometry as LRC makes sense.

cost_matrix

Cost matrix, recomputed from kernel if only kernel was specified.

cost_rank

Output rank of cost matrix, if any was provided.

dtype

The data type.

epsilon

Epsilon regularization value.

inv_scale_cost

Compute and return inverse of scaling factor for cost matrix.

is_online

Whether cost_matrix or kernel_matrix is computed on-the-fly.

is_squared_euclidean

Whether cost is computed by taking squared Euclidean distance.

is_symmetric

Whether geometry cost/kernel is a symmetric matrix.

kernel_matrix

Kernel matrix, either provided by user or recomputed from cost_matrix.

mean_cost_matrix

Mean of the cost_matrix.

median_cost_matrix

Median of the cost_matrix.

scale_epsilon

Compute the scale of the epsilon, potentially based on data.

shape

Shape of the geometry.

src_mask

Mask of shape [num_a,] to compute cost_matrix statistics.

tgt_mask

Mask of shape [num_b,] to compute cost_matrix statistics.