Bisecting K-Means
Divisive hierarchical clustering that recursively splits clusters into two, combining aspects of hierarchical and K-Means clustering.
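The recursive splitting described above can be sketched in pure Python. This is an illustrative sketch, not the implementation behind this model: the function names (`two_means`, `bisecting_kmeans`) are hypothetical, and the seeding step (one random point plus the point farthest from it) is a simplification chosen so that every split yields two non-empty halves.

```python
import random

def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def mean(points):
    """Coordinate-wise mean of a non-empty list of points."""
    return tuple(sum(c) / len(points) for c in zip(*points))

def inertia(points):
    """Sum of squared distances from each point to the cluster mean."""
    center = mean(points)
    return sum(sq_dist(p, center) for p in points)

def two_means(points, rng, max_iter=300):
    """Split one cluster into two halves with K-Means iterations (k = 2)."""
    # Simplified seeding (an assumption of this sketch): a random point
    # and the point farthest from it.
    first = rng.choice(points)
    second = max(points, key=lambda p: sq_dist(p, first))
    centers = [first, second]
    halves = None
    for _ in range(max_iter):
        new_halves = ([], [])
        for p in points:
            nearer_second = sq_dist(p, centers[1]) < sq_dist(p, centers[0])
            new_halves[int(nearer_second)].append(p)
        # Stop when assignments no longer change (or a half would be empty).
        if new_halves == halves or not all(new_halves):
            break
        halves = new_halves
        centers = [mean(h) for h in halves]
    return halves

def bisecting_kmeans(points, n_clusters, strategy="biggest_inertia", seed=42):
    """Start with one cluster and bisect until n_clusters exist."""
    rng = random.Random(seed)
    clusters = [[tuple(p) for p in points]]
    score = inertia if strategy == "biggest_inertia" else len
    while len(clusters) < n_clusters:
        # Pick the next cluster to split according to the strategy.
        i = max(range(len(clusters)), key=lambda j: score(clusters[j]))
        if len(set(clusters[i])) < 2:
            break  # chosen cluster has no two distinct points to separate
        clusters.extend(two_means(clusters.pop(i), rng))
    return clusters

# Three small, well-separated groups of 2-D points.
points = [(0, 0), (0, 1), (1, 0),
          (10, 10), (10, 11), (11, 10),
          (20, 0), (20, 1), (21, 0)]
clusters = bisecting_kmeans(points, n_clusters=3)
for cluster in clusters:
    print(sorted(cluster))
```

Because each step only runs a two-cluster K-Means on a single existing cluster, the sequence of splits forms a binary tree, which is where the hierarchy comes from.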
When to use:
- Want hierarchical structure with K-Means quality
- Need more consistent results than agglomerative
- Large datasets where regular hierarchical is too slow
- Want balance between speed and hierarchy
Strengths: More consistent than regular K-Means, faster than agglomerative hierarchical clustering, creates a hierarchy, good for text clustering
Weaknesses: Still requires specifying k, slower than regular K-Means, assumes spherical clusters
Model Parameters
N Clusters (default: 8, required) Number of clusters to form.
Init Method (default: "random") How to initialize cluster centers:
- random: Random initialization (default for bisecting)
- k-means++: Picks initial centers that are spread far apart, which usually speeds up convergence
Max Iterations (default: 300) Maximum number of K-Means iterations for each bisection.
Bisecting Strategy (default: "biggest_inertia") How to choose which cluster to split next:
- biggest_inertia: Split the cluster with the largest inertia, i.e. the largest sum of squared distances from its points to its center (default)
- largest_cluster: Split the cluster that contains the most points
Random State (default: 42) Seed for reproducibility.