Random Forest
Ensemble of decision trees for robust, high-accuracy classification
Random Forest trains multiple decision trees on random subsets of data and features, then aggregates their votes. This ensemble approach reduces variance and generalizes well without extensive tuning.
When to use:
- General-purpose classification with good out-of-the-box accuracy
- Datasets with mixed feature types and potential nonlinear interactions
- When feature importance rankings are needed alongside predictions
Input: Tabular data with the feature columns defined during training
Output: Predicted class label and class probabilities
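A minimal training-and-inference sketch, assuming scikit-learn's `RandomForestClassifier` (the platform's own API may differ) and a hypothetical synthetic dataset. The keyword arguments mirror the default settings documented below.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

# Hypothetical tabular dataset: 500 rows, 8 feature columns, 2 classes
X, y = make_classification(n_samples=500, n_features=8, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)

# Defaults from the settings below: 100 trees, unlimited depth,
# sqrt(n_features) candidates per split, gini impurity
clf = RandomForestClassifier(
    n_estimators=100,
    max_depth=None,
    max_features="sqrt",
    criterion="gini",
    random_state=42,
)
clf.fit(X_train, y_train)

labels = clf.predict(X_test)        # predicted class labels
probs = clf.predict_proba(X_test)   # per-class probabilities, rows sum to 1
```

Each prediction row in `probs` is a probability distribution over the classes, matching the "class probabilities" output described above.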
Model Settings (set during training, used at inference)
N Estimators (default: 100)
Number of trees in the forest. More trees improve stability at the cost of slower inference.
Max Depth (default: null — unlimited)
Maximum depth of each tree. Shallower trees reduce overfitting; deeper trees capture more complex patterns.
Max Features (default: sqrt)
Number of features to consider at each split. sqrt of total features is a reliable default for classification.
Criterion (default: gini)
Impurity measure for split evaluation. gini is slightly faster; entropy can produce marginally better splits on some datasets.
Class Weight (default: null)
Set to balanced to weight classes inversely to their frequency, which helps on imbalanced datasets.
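The settings above map directly onto scikit-learn keyword arguments (an assumption; setting names on the platform may be spelled differently). This sketch also shows the feature importance ranking mentioned in the use cases, on a hypothetical imbalanced dataset:

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

# Hypothetical skewed dataset: roughly 90/10 class split, 6 features
X, y = make_classification(n_samples=300, n_features=6,
                           weights=[0.9, 0.1], random_state=0)

# One-to-one mapping from the documented settings to sklearn parameters
settings = {
    "n_estimators": 200,        # N Estimators
    "max_depth": 10,            # Max Depth
    "max_features": "sqrt",     # Max Features
    "criterion": "entropy",     # Criterion
    "class_weight": "balanced", # Class Weight, for the skewed labels above
}
clf = RandomForestClassifier(**settings, random_state=0).fit(X, y)

# Importance scores: one nonnegative value per feature, summing to 1;
# argsort descending gives the importance ranking
ranking = clf.feature_importances_.argsort()[::-1]
```

`feature_importances_` is the mean impurity decrease attributed to each feature across all trees, so the ranking comes for free after training.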
Inference Settings
No dedicated inference-time settings. Predictions aggregate all trained trees.
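To illustrate the aggregation, this sketch (assuming scikit-learn) checks that the forest's probabilities equal the average of the individual trees' probabilities, which is how `RandomForestClassifier` combines its members:

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

X, y = make_classification(n_samples=200, n_features=5, random_state=1)
clf = RandomForestClassifier(n_estimators=50, random_state=1).fit(X, y)

# Each fitted tree is available via clf.estimators_; the forest's
# predict_proba is the mean of the per-tree probability estimates
per_tree = np.stack([tree.predict_proba(X[:5]) for tree in clf.estimators_])
averaged = per_tree.mean(axis=0)  # matches clf.predict_proba(X[:5])
```

Because the aggregation is fixed at training time, there is nothing to configure at inference: every trained tree contributes equally to the final prediction.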