Documentation (English)

Mini Batch K-Means

Fast variant of K-Means that uses mini-batches of data to reduce computation time.

When to use:

  • Very large datasets (>10k samples)
  • Need faster training than K-Means
  • Can accept slightly lower quality for speed
  • Memory constraints

Strengths:

  • Much faster than K-Means
  • Lower memory usage
  • Good for large datasets
  • Similar quality to K-Means

Weaknesses:

  • Slightly less accurate than K-Means
  • More sensitive to initialization
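As a minimal sketch of how this trade-off plays out, assuming the scikit-learn implementation (`sklearn.cluster.MiniBatchKMeans`; this tool's internal API may differ), the parameters described below map directly onto constructor arguments:

```python
import numpy as np
from sklearn.datasets import make_blobs
from sklearn.cluster import MiniBatchKMeans

# Synthetic "large" dataset: 20,000 samples around 5 centers.
X, _ = make_blobs(n_samples=20_000, centers=5, random_state=42)

# Each update step touches only batch_size samples instead of all 20k,
# which is where the speed and memory savings come from.
model = MiniBatchKMeans(n_clusters=5, batch_size=1024, random_state=42)
labels = model.fit_predict(X)

print(model.cluster_centers_.shape)  # (5, 2)
print(labels.shape)
```

With well-separated blobs the resulting centers are typically very close to those of full K-Means, at a fraction of the training time.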

Model Parameters

N Clusters (default: 8, required) Number of clusters to form.

Init Method (default: "k-means++") How to initialize cluster centers:

  • k-means++: Spreads initial centers apart (usually better quality)
  • random: Picks random samples as centers (faster, but less stable)
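A quick way to compare the two initialization methods is to fit both and look at the inertia (sum of squared distances to the nearest center; lower is better). This sketch assumes the scikit-learn implementation; k-means++ usually wins, but with a fixed seed either can come out ahead on a given dataset:

```python
from sklearn.datasets import make_blobs
from sklearn.cluster import MiniBatchKMeans

X, _ = make_blobs(n_samples=5_000, centers=8, random_state=0)

inertias = {}
for init in ("k-means++", "random"):
    m = MiniBatchKMeans(n_clusters=8, init=init, random_state=0).fit(X)
    inertias[init] = m.inertia_
    print(init, round(m.inertia_, 1))
```

Both runs converge; the printed inertias show how much the initialization choice affects the final solution quality on this data.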

Max Iterations (default: 100) Maximum number of full passes over the dataset.

  • 50-100: Usually sufficient for large data
  • 200+: For better convergence

Batch Size (default: 1024) Size of mini batches for training.

  • 256-512: Small batches (more frequent, noisier updates; slower overall)
  • 1024: Good default
  • 2048+: Large batches (fewer, more stable updates; faster overall)

Random State (default: 42) Seed for reproducibility.
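Because both initialization and mini-batch sampling are randomized, fixing the seed is what makes runs repeatable. A small sketch, again assuming the scikit-learn implementation:

```python
import numpy as np
from sklearn.datasets import make_blobs
from sklearn.cluster import MiniBatchKMeans

X, _ = make_blobs(n_samples=2_000, centers=4, random_state=7)

# Two fits with the same seed on the same data give identical results.
a = MiniBatchKMeans(n_clusters=4, random_state=42).fit(X)
b = MiniBatchKMeans(n_clusters=4, random_state=42).fit(X)

same = np.allclose(a.cluster_centers_, b.cluster_centers_)
print(same)  # True
```

Changing `random_state` (or leaving it unset) can yield different centers, which is why the sensitivity to initialization noted above matters more here than for full K-Means.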


