ott.tools.k_means.k_means
- ott.tools.k_means.k_means(geom, k, weights=None, init='k-means++', n_init=10, n_local_trials=None, tol=0.0001, min_iterations=0, max_iterations=300, store_inner_errors=False, rng=None)
K-means clustering using Lloyd’s algorithm [Lloyd, 1982].
- Parameters:
  - geom (Union[Array, PointCloud]) – Point cloud of shape [n, ndim] to cluster. If passed as an array, SqEuclidean cost is assumed.
  - k (int) – The number of clusters.
  - weights (Optional[Array]) – The weights of input points. These weights are considered when computing the centroids and inertia. If None, use uniform weights.
  - init (Union[Literal['k-means++', 'random'], Callable[[PointCloud, int, Array], Array]]) – Initialization method. Can be one of the following:
    - 'k-means++' - select initial centroids that are \(\mathcal{O}(\log k)\)-optimal [Arthur and Vassilvitskii, 2007].
    - 'random' - randomly select k points from the geom.
    - callable() - a function which takes the point cloud, the number of clusters and a random key and returns the centroids as an array of shape [k, ndim].
  - n_init (int) – Number of times k-means will run with different initial seeds.
  - n_local_trials (Optional[int]) – Number of local trials when init = 'k-means++'. If None, \(2 + \lfloor \log(k) \rfloor\) is used.
  - tol (float) – Relative tolerance with respect to the Frobenius norm of the centroids’ shift between two consecutive iterations.
  - min_iterations (int) – Minimum number of iterations.
  - max_iterations (int) – Maximum number of iterations.
  - store_inner_errors (bool) – Whether to store the errors (inertia) at each iteration.
  - rng (Optional[Array]) – Random key for seeding the initializations.
- Return type:
- Returns:
The k-means clustering.
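The interaction between weights, tol, min_iterations and max_iterations can be sketched with a plain NumPy version of Lloyd's iterations. This is a hedged illustration, not the library's implementation: the function name lloyd_sketch, the handling of empty clusters, and the exact normalization of the relative-tolerance check (shift norm divided by the norm of the previous centroids) are assumptions, and the initial centroids are passed in directly.

```python
import numpy as np


def lloyd_sketch(x, init_centroids, weights=None, tol=1e-4,
                 min_iterations=0, max_iterations=300):
    """Weighted Lloyd iterations on an [n, ndim] array (illustrative only)."""
    n = x.shape[0]
    # Uniform weights when none are given, as described for `weights`.
    w = np.full(n, 1.0 / n) if weights is None else np.asarray(weights, float)
    centroids = np.asarray(init_centroids, float).copy()
    k = centroids.shape[0]
    assignment = None
    errors = []  # weighted inertia recorded at each iteration

    for it in range(max_iterations):
        # Assignment step: nearest centroid under squared Euclidean cost.
        cost = np.sum((x[:, None, :] - centroids[None, :, :]) ** 2, axis=2)
        assignment = np.argmin(cost, axis=1)
        errors.append(float(np.sum(w * cost[np.arange(n), assignment])))

        # Update step: weighted mean of the points in each cluster.
        new_centroids = centroids.copy()
        for j in range(k):
            mask = assignment == j
            if w[mask].sum() > 0:  # keep the old centroid for empty clusters
                new_centroids[j] = np.average(x[mask], axis=0, weights=w[mask])

        # Stopping rule (assumed form): Frobenius norm of the centroid shift
        # relative to the Frobenius norm of the previous centroids.
        shift = np.linalg.norm(new_centroids - centroids)
        scale = max(np.linalg.norm(centroids), 1e-12)
        centroids = new_centroids
        if it + 1 >= min_iterations and shift / scale <= tol:
            break

    return centroids, assignment, errors
```

In this sketch the returned errors list corresponds to what store_inner_errors would record, and running the function from several different initializations and keeping the lowest final inertia mirrors the role of n_init.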