Bisecting K-Means
Hierarchical K-Means via recursive bisection of the largest cluster
Bisecting K-Means builds a hierarchical cluster tree by repeatedly splitting one cluster into two with standard K-Means until the target number of clusters is reached. On some datasets it yields more evenly sized clusters than running K-Means directly.
When to use:
- When cluster hierarchy is useful alongside flat assignments
- When you want more evenly sized clusters than standard K-Means tends to produce
- Large datasets where hierarchical agglomerative clustering is too slow
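The recursive-bisection idea can be sketched in a few lines. This is an illustrative implementation, not the platform's actual one: it starts with all rows in one cluster and repeatedly 2-means-splits the cluster chosen by the bisecting strategy (the function name `bisecting_kmeans` and its signature are assumptions for this sketch).

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs

def bisecting_kmeans(X, n_clusters=8, strategy="biggest_inertia", random_state=0):
    """Illustrative sketch: recursively bisect one cluster until n_clusters remain."""
    clusters = [np.arange(len(X))]  # start with every row in a single cluster
    inertias = [np.sum((X - X.mean(axis=0)) ** 2)]
    while len(clusters) < n_clusters:
        # Pick which cluster to bisect next
        if strategy == "biggest_inertia":
            i = int(np.argmax(inertias))          # most within-cluster variance
        else:  # "largest_cluster"
            i = int(np.argmax([len(c) for c in clusters]))  # most points
        idx = clusters.pop(i)
        inertias.pop(i)
        # Split the chosen cluster into two with plain K-Means
        km = KMeans(n_clusters=2, n_init=10, random_state=random_state).fit(X[idx])
        for label in (0, 1):
            sub = idx[km.labels_ == label]
            clusters.append(sub)
            center = X[sub].mean(axis=0)
            inertias.append(np.sum((X[sub] - center) ** 2))
    # Flatten the leaves into one label per row
    labels = np.empty(len(X), dtype=int)
    for lab, idx in enumerate(clusters):
        labels[idx] = lab
    return labels

X, _ = make_blobs(n_samples=300, centers=5, random_state=42)
labels = bisecting_kmeans(X, n_clusters=5)
```

Because each step only ever runs 2-means on one cluster, the cost per split stays low even when the dataset is large, which is what makes this approach practical where agglomerative clustering is not.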
Input: Tabular data with the feature columns defined during training
Output: Cluster label for each row
Model Settings (set during training, used at inference)
N Clusters (default: 8)
Final number of clusters.
Init (default: random)
Centroid initialization per bisection step. k-means++ can improve quality at higher cost.
Max Iter (default: 300)
Maximum iterations per bisection step.
Bisecting Strategy (default: biggest_inertia)
Which cluster to bisect next. biggest_inertia splits the cluster with the most within-cluster variance; largest_cluster splits the largest cluster by size.
Inference Settings
No dedicated inference-time settings. Each point is assigned to its nearest trained centroid.