K-Means
Fast and scalable algorithm that partitions data into k clusters by minimizing within-cluster variance.
When to use:
- Know approximately how many clusters to expect
- Clusters are roughly spherical and of similar size
- Need fast results on large datasets
- Good starting point for exploration
Strengths: very fast, scalable to large datasets, simple and interpretable, consistent results.
Weaknesses: must specify k in advance, assumes spherical clusters, sensitive to outliers, poor with varying cluster sizes.
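A minimal sketch of the basic workflow, assuming scikit-learn's `KMeans` (this document does not name a specific library; the class and parameter names below are that library's API):

```python
import numpy as np
from sklearn.cluster import KMeans

# Synthetic data: three well-separated, roughly spherical blobs
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(loc, 0.3, size=(50, 2)) for loc in (0, 5, 10)])

# Fit with k=3; random_state makes the run reproducible
km = KMeans(n_clusters=3, n_init=10, random_state=42)
labels = km.fit_predict(X)

# km.cluster_centers_ holds one center per cluster,
# km.inertia_ the final within-cluster sum of squares
```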
Model Parameters
N Clusters (default: 8, required) Number of clusters to form. This is the most important parameter.
- Too low: Merges distinct groups
- Too high: Splits natural groups
- Use elbow method or silhouette analysis to find optimal k
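Silhouette analysis, mentioned above, can be sketched as a sweep over candidate k values (again assuming scikit-learn; `silhouette_score` is its metric helper):

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.metrics import silhouette_score

# Synthetic data with three obvious groups
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(loc, 0.3, size=(50, 2)) for loc in (0, 5, 10)])

# Higher silhouette score (closer to 1) means better-separated clusters
scores = {}
for k in range(2, 7):
    labels = KMeans(n_clusters=k, n_init=10, random_state=42).fit_predict(X)
    scores[k] = silhouette_score(X, labels)

best_k = max(scores, key=scores.get)
```

For the elbow method, plot `KMeans(...).fit(X).inertia_` against k instead and look for the bend where adding clusters stops paying off.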
Init Method (default: "k-means++") How to initialize cluster centers:
- k-means++: Smart initialization (default, better convergence)
- random: Random initialization (faster but may give poor results)
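The two init methods can be compared directly; a sketch assuming scikit-learn's `init` parameter, using `inertia_` (final within-cluster variance) as the quality measure:

```python
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)
X = np.vstack([rng.normal(loc, 0.3, size=(50, 2)) for loc in (0, 5, 10)])

# n_init=1 isolates the effect of a single initialization;
# in practice you would keep several restarts (n_init > 1)
km_pp = KMeans(n_clusters=3, init="k-means++", n_init=1, random_state=0).fit(X)
km_rand = KMeans(n_clusters=3, init="random", n_init=1, random_state=0).fit(X)

# k-means++ spreads initial centers apart, so it typically converges
# faster and is less likely to land in a poor local minimum
```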
Max Iterations (default: 300) Maximum number of iterations for convergence.
- 100-300: Usually sufficient
- 500+: For difficult datasets or large k
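Whether the cap was actually reached can be checked after fitting; in scikit-learn (assumed here) the fitted model exposes `n_iter_`, the number of iterations actually run:

```python
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)
X = np.vstack([rng.normal(loc, 0.3, size=(50, 2)) for loc in (0, 5, 10)])

km = KMeans(n_clusters=3, max_iter=300, n_init=10, random_state=42).fit(X)
# If km.n_iter_ equals max_iter, the run was cut off before converging
# and max_iter should be raised
```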
Random State (default: 42) Seed for reproducibility. Keep consistent for comparable results.
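Reproducibility is easy to verify: two fits with the same seed produce identical centers (a sketch assuming scikit-learn's `random_state` parameter):

```python
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)
X = np.vstack([rng.normal(loc, 0.3, size=(50, 2)) for loc in (0, 5, 10)])

# Same seed, same data -> same initialization -> same final centers
km1 = KMeans(n_clusters=3, n_init=10, random_state=42).fit(X)
km2 = KMeans(n_clusters=3, n_init=10, random_state=42).fit(X)
```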