ott.tools.soft_sort.multivariate_cdf_quantile_maps

ott.tools.soft_sort.multivariate_cdf_quantile_maps(inputs, target_sampler=None, rng=None, num_target_samples=None, cost_fn=None, epsilon=None, input_weights=None, target_weights=None, **kwargs)

Returns multivariate CDF and quantile maps, given input samples.

Implements the multivariate generalizations for CDF and quantiles proposed in [Chernozhukov et al., 2017]. The reference measure is assumed to be the uniform measure by default, but can be modified. For consistency, the reference measure should be symmetrically centered around \((\tfrac{1}{2},\cdots,\tfrac{1}{2})\) and supported on \([0, 1]^d\).

The implementation returns two entropic map estimators, one for the CDF map and one for the quantile map.

Parameters:
  • inputs (Array) – 2D array of [n, d] vectors.

  • target_sampler (Optional[Callable[[Array, Tuple[int, int]], Array]]) – Callable that takes an rng key and an [m, d] shape. m is set by num_target_samples, and the dimension d is inferred from the shape of inputs. Defaults to uniform(); any other random sampler can be used, provided it is wrapped to match the signature above.

  • rng (Optional[Array]) – rng key used by target_sampler.

  • num_target_samples (Optional[int]) – number m of points generated in the target distribution.

  • cost_fn (Optional[CostFn]) – Cost function, used to compare inputs and targets. Passed on to instantiate a PointCloud object. If None, SqEuclidean is used.

  • epsilon (Optional[float]) – entropic regularization parameter used to instantiate the PointCloud object.

  • input_weights (Optional[Array]) – [n,] vector of weights for input measure. Assumed to be uniform by default.

  • target_weights (Optional[Array]) – [m,] vector of weights for target measure. Assumed to be uniform by default.

  • kwargs (Any) – keyword arguments passed on to the solve() function, which solves the OT problem between inputs and targets using the Sinkhorn algorithm.
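The target_sampler signature described above can be illustrated with a small wrapper. This is a minimal sketch, not part of the library: beta_sampler is a hypothetical name, and Beta(2, 2) is chosen only because it is supported on \([0, 1]\) and symmetric around \(\tfrac{1}{2}\) in each coordinate, which matches the consistency requirement stated earlier.

```python
import jax
import jax.numpy as jnp


def beta_sampler(rng, shape):
    """Hypothetical sampler with the (rng, shape) signature.

    Draws i.i.d. Beta(2, 2) coordinates, so every sample lies in [0, 1]^d
    and the distribution is symmetric around (1/2, ..., 1/2).
    """
    return jax.random.beta(rng, 2.0, 2.0, shape=shape)


rng = jax.random.PRNGKey(0)
# [m, d] shape: m plays the role of num_target_samples, d matches the inputs.
targets = beta_sampler(rng, (256, 3))
```

Under these assumptions, such a callable could be supplied as target_sampler=beta_sampler in place of the default uniform sampler.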

Return type:

Tuple[Callable[[Array], Array], Callable[[Array], Array]]

Returns:

  • The multivariate CDF map, taking a [b, d] batch of vectors in the range of the inputs, and mapping each vector within the range of the reference measure (assumed by default to be \([0, 1]^d\)).

  • The quantile map, taking a [b, d] batch of multivariate quantile vectors in the range of the reference measure (assumed by default to be \([0, 1]^d\)), and mapping each vector onto a vector within the range of the inputs.