Documentation (English)

BIRCH

Scalable incremental clustering for large datasets

BIRCH (Balanced Iterative Reducing and Clustering using Hierarchies) builds a compact CF (clustering feature) tree that summarizes the data, then clusters the summary instead of the raw rows. It is designed for very large datasets that do not fit in memory.
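To make the idea of a CF summary concrete, here is a minimal sketch in plain Python. It assumes the textbook definition of a clustering feature (point count, linear sum, sum of squared norms); the class name CF and its methods are illustrative and not part of this product's API.

```python
from dataclasses import dataclass
from math import sqrt


@dataclass
class CF:
    """Clustering feature: a fixed-size summary of a set of points."""
    n: int      # number of points summarized
    ls: list    # per-dimension linear sum of the points
    ss: float   # sum of squared norms of the points

    @classmethod
    def from_point(cls, p):
        return cls(1, list(p), sum(x * x for x in p))

    def merge(self, other):
        # CFs are additive: the summary of a union is the sum of the summaries.
        return CF(self.n + other.n,
                  [a + b for a, b in zip(self.ls, other.ls)],
                  self.ss + other.ss)

    def centroid(self):
        return [s / self.n for s in self.ls]

    def radius(self):
        # Root-mean-square distance of the points from the centroid,
        # computed from the summary alone (no raw points needed).
        c2 = sum(c * c for c in self.centroid())
        return sqrt(max(self.ss / self.n - c2, 0.0))


points = [(0.0, 0.0), (2.0, 0.0), (0.0, 2.0), (2.0, 2.0)]
cf = CF.from_point(points[0])
for p in points[1:]:
    cf = cf.merge(CF.from_point(p))

print(cf.n, cf.centroid(), cf.radius())
```

Because merging is plain addition, a summary can absorb new rows one at a time without revisiting earlier data, which is what makes incremental updates cheap.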

When to use:

  • Very large datasets where other algorithms are too slow
  • Streaming or incremental data where the model needs updating without full retraining
  • When memory efficiency is critical

Input: Tabular data with the feature columns defined during training

Output: Cluster label for each row

Model Settings (set during training, used at inference)

N Clusters (default: 3) Number of final clusters after the optional refinement step. Set to null to skip refinement and return the CF subclusters directly.
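The sketch below illustrates what a refinement step of this kind could look like. It is an assumption, not the product's method: the documentation does not say which algorithm groups the subclusters, so a simple agglomerative merge of subcluster centroids stands in here, and it ignores subcluster sizes. The function name refine and the sample centroids are hypothetical.

```python
def refine(centroids, n_clusters):
    """Group subcluster centroids until n_clusters groups remain.

    Returns a list of groups, each a list of subcluster indices.
    With n_clusters=None, every subcluster stays its own group.
    """
    groups = [[i] for i in range(len(centroids))]
    if n_clusters is None:
        return groups

    dims = len(centroids[0])

    def mean(group):
        # Unweighted mean of the member centroids (a simplification).
        return [sum(centroids[i][d] for i in group) / len(group)
                for d in range(dims)]

    def dist2(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    while len(groups) > n_clusters:
        # Find the closest pair of groups and merge them.
        pairs = [(dist2(mean(groups[a]), mean(groups[b])), a, b)
                 for a in range(len(groups))
                 for b in range(a + 1, len(groups))]
        _, a, b = min(pairs)
        groups[a] = groups[a] + groups[b]
        del groups[b]
    return groups


# Hypothetical subcluster centroids produced by the tree-building phase.
centroids = [(0.0, 0.0), (0.5, 0.5), (5.0, 5.0), (5.4, 4.8), (10.0, 0.0)]

merged = refine(centroids, n_clusters=3)
raw = refine(centroids, n_clusters=None)
print(merged)
print(len(raw))
```

With N Clusters set to 3, the five subclusters collapse into three groups; with null, all five are returned unchanged.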

Threshold (default: 0.5) Maximum radius of subclusters in the CF tree. Smaller values create more, finer subclusters.
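The effect of the threshold can be seen in a simplified, flat sketch. This is an illustration under stated assumptions rather than the actual implementation: it keeps subclusters in a plain list instead of a CF tree, ignores the branching factor, and uses the root-mean-square radius. The function names are hypothetical.

```python
from math import sqrt


def radius(n, ls, ss):
    """RMS distance from the centroid for a summary (count, linear sum, squared sum)."""
    c2 = sum((s / n) ** 2 for s in ls)
    return sqrt(max(ss / n - c2, 0.0))


def build_subclusters(points, threshold):
    """Flat stand-in for CF-tree insertion (no tree, no node splits)."""
    subclusters = []  # each entry is [n, ls, ss]
    for p in points:
        p_ss = sum(x * x for x in p)
        # Find the subcluster whose centroid is closest to the new point.
        best, best_d = None, float("inf")
        for sc in subclusters:
            n, ls, _ = sc
            d = sum((s / n - x) ** 2 for s, x in zip(ls, p))
            if d < best_d:
                best, best_d = sc, d
        if best is not None:
            n, ls, ss = best
            merged = (n + 1, [s + x for s, x in zip(ls, p)], ss + p_ss)
            # Absorb the point only if the merged radius stays within the threshold.
            if radius(*merged) <= threshold:
                best[0], best[1], best[2] = merged
                continue
        # Otherwise the point starts a new subcluster.
        subclusters.append([1, list(p), p_ss])
    return subclusters


data = [(0.0, 0.0), (0.2, 0.0), (0.4, 0.0), (5.0, 5.0), (5.2, 5.0)]
coarse = build_subclusters(data, threshold=0.5)
fine = build_subclusters(data, threshold=0.05)
print(len(coarse), len(fine))
```

On this toy data the default threshold of 0.5 yields two subclusters, while 0.05 leaves every point in its own subcluster.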

Branching Factor (default: 50) Maximum CF entries per node in the tree. A node that would exceed this limit is split, so smaller values produce a deeper, narrower tree.

Inference Settings

No dedicated inference-time settings. New points traverse the CF tree to find their nearest subcluster.
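As a rough illustration of inference, the sketch below assigns each new row the label of its nearest subcluster centroid. It assumes a flat search over centroids instead of a real tree traversal, and the trained state (centroids and their final cluster labels) is made up for the example.

```python
def predict(points, centroids, subcluster_labels):
    """Assign each point the label of its nearest subcluster centroid."""
    labels = []
    for p in points:
        nearest = min(
            range(len(centroids)),
            key=lambda i: sum((c - x) ** 2 for c, x in zip(centroids[i], p)),
        )
        labels.append(subcluster_labels[nearest])
    return labels


# Hypothetical trained state: four subcluster centroids mapped to two final clusters.
centroids = [(0.0, 0.0), (0.5, 0.5), (5.0, 5.0), (5.5, 4.5)]
subcluster_labels = [0, 0, 1, 1]

new_rows = [(0.4, 0.3), (5.2, 4.9), (3.0, 3.0)]
labels = predict(new_rows, centroids, subcluster_labels)
print(labels)
```

Every row receives a label, including rows that lie far from all subclusters, because assignment depends only on which centroid is closest.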
