K-Means
Partition new data points into K clusters based on distance to cluster centroids
K-Means inference assigns each new data point to the nearest cluster centroid learned during training. K-Means is one of the most widely used clustering algorithms: it is fast, scales well to large datasets, and works best when clusters are well separated and roughly spherical.
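The assignment step can be sketched in a few lines of NumPy. This is a minimal illustration, not the component's actual implementation: the function name `assign_to_nearest_centroid`, the Euclidean distance metric, and the centroid values are all assumptions made for the example.

```python
import numpy as np

def assign_to_nearest_centroid(points, centroids):
    """Return, for each row of `points`, the index of the closest centroid."""
    points = np.asarray(points, dtype=float)
    centroids = np.asarray(centroids, dtype=float)
    # Pairwise squared Euclidean distances, shape (n_points, n_centroids).
    diffs = points[:, None, :] - centroids[None, :, :]
    sq_dists = np.sum(diffs ** 2, axis=2)
    # The argmin over centroids is the cluster label, in the range 0..K-1.
    return np.argmin(sq_dists, axis=1)

# Hypothetical centroids standing in for values learned during training.
centroids = np.array([[0.0, 0.0], [10.0, 10.0], [0.0, 10.0]])
new_points = np.array([[1.0, -1.0], [9.0, 11.0], [1.0, 8.0]])

labels = assign_to_nearest_centroid(new_points, centroids)
print(labels)  # [0 1 2]
```

Squared distances are enough here because taking the square root does not change which centroid is closest.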
When to use:
- Customer segmentation, product categorization, or document grouping
- When the number of clusters is known in advance
- Well-separated, roughly spherical clusters of similar size
Input: Tabular data with the feature columns defined during training
Output: Cluster label (0 to K-1) for each row
Model Settings (set during training, used at inference)
N Clusters (default: 8)
Number of clusters (K). Training learns one centroid per cluster, and those centroids are used to assign new points.
Init (default: k-means++)
Centroid initialization method. k-means++ spreads the initial centroids apart, which makes it a reliable default.
Max Iter (default: 300)
Maximum number of training iterations. Training stops earlier if the centroids converge.
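The sketch below shows how these three settings might be fixed at training time and then reused unchanged at inference. It assumes they correspond to the `n_clusters`, `init`, and `max_iter` parameters of scikit-learn's `KMeans`; this page does not name the underlying library, so treat that mapping as an assumption. The synthetic data, `n_init=10`, and `random_state=0` are illustrative choices only.

```python
import numpy as np
from sklearn.cluster import KMeans

# Synthetic training data: three well-separated, roughly spherical blobs.
rng = np.random.default_rng(0)
blob_centers = np.array([[0.0, 0.0], [10.0, 10.0], [0.0, 10.0]])
X_train = np.vstack(
    [c + rng.normal(scale=0.5, size=(50, 2)) for c in blob_centers]
)

# Settings chosen at training time (assumed mapping to scikit-learn names).
model = KMeans(
    n_clusters=3,       # N Clusters
    init="k-means++",   # Init
    max_iter=300,       # Max Iter
    n_init=10,          # illustrative: number of restarts, not a setting on this page
    random_state=0,     # illustrative: fixed seed for reproducibility
)
model.fit(X_train)

# Inference: no further settings, only the trained centroids are used.
X_new = np.array([[0.5, 0.2], [9.7, 10.1]])
labels = model.predict(X_new)
print(labels)
print(model.n_iter_)  # iterations actually run, at most max_iter
```

Cluster label numbers are arbitrary: which blob becomes cluster 0 depends on initialization, so downstream logic should not rely on a specific numbering across retrained models.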
Inference Settings
No dedicated inference-time settings. Each point is assigned to its nearest trained centroid.