K-Nearest Neighbors
Classify by finding the most similar training examples
K-Nearest Neighbors (KNN) classifies each new point by majority vote among its K closest training examples. It makes no explicit model assumptions, but inference requires comparing against all training points.
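The majority-vote rule can be sketched in a few lines. This is a minimal illustration with hypothetical toy data, not the platform's implementation:

```python
import numpy as np
from collections import Counter

def knn_predict(X_train, y_train, x, k=3):
    # Euclidean distance from the query point to every training point
    dists = np.linalg.norm(X_train - x, axis=1)
    # Indices of the k closest training examples
    nearest = np.argsort(dists)[:k]
    # Majority vote among their labels
    return Counter(y_train[nearest]).most_common(1)[0][0]

# Hypothetical toy data: two clusters, two classes
X_train = np.array([[0.0, 0.0], [0.1, 0.2], [1.0, 1.0], [0.9, 1.1]])
y_train = np.array(["a", "a", "b", "b"])

print(knn_predict(X_train, y_train, np.array([0.95, 1.0])))  # → b
```

Note that "training" here only stores the data; all the work happens at prediction time.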
When to use:
- Small-to-medium datasets where similarity-based reasoning is appropriate
- Local pattern recognition where decision boundaries are complex
- Baseline model with minimal hyperparameter tuning
Input: Tabular data with the feature columns defined during training
Output: Predicted class label and class probabilities

Model Settings (set during training, used at inference)
N Neighbors (default: 5)
Number of nearest neighbors to consider. Smaller values capture local patterns; larger values smooth the decision boundary.
Weights (default: uniform)
uniform — all neighbors vote equally. distance — closer neighbors have more influence.
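A small numeric example (hypothetical distances) shows why the choice matters: one very close neighbor can be outvoted under uniform weighting but win under distance weighting.

```python
import numpy as np
from collections import Counter

# Hypothetical neighborhood: one close neighbor of class "a",
# two distant neighbors of class "b".
dists = np.array([0.1, 1.0, 1.0])
labels = np.array(["a", "b", "b"])

# uniform: each neighbor casts one equal vote -> "b" wins 2-1
uniform_winner = Counter(labels).most_common(1)[0][0]

# distance: votes weighted by 1/d -> "a" wins 10.0 to 2.0
weights = 1.0 / dists
scores = {c: weights[labels == c].sum() for c in set(labels)}
distance_winner = max(scores, key=scores.get)

print(uniform_winner, distance_winner)  # → b a
```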
Metric (default: minkowski)
Distance metric for neighbor search. euclidean and manhattan are common alternatives.
Algorithm (default: auto)
Nearest neighbor search algorithm. auto chooses among brute-force, KD-tree, and ball-tree search based on the data's size and dimensionality.
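The setting names map directly onto scikit-learn's `KNeighborsClassifier` parameters; assuming that backend, the defaults above correspond to:

```python
from sklearn.neighbors import KNeighborsClassifier

# Each argument mirrors a Model Setting above and is a standard
# KNeighborsClassifier parameter (assumes a scikit-learn backend).
model = KNeighborsClassifier(
    n_neighbors=5,       # N Neighbors
    weights="uniform",   # or "distance"
    metric="minkowski",  # "euclidean" and "manhattan" also accepted
    algorithm="auto",    # picks brute, kd_tree, or ball_tree
)

# Hypothetical toy data: two clusters on a unit square
X = [[0, 0], [0, 1], [1, 0], [1, 1], [0, 0.5], [1, 0.5]]
y = [0, 0, 1, 1, 0, 1]
model.fit(X, y)
print(model.predict([[0.9, 0.9]]))  # → [1]
```

Keep n_neighbors no larger than the training set size, or fitting a query will fail.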
Inference Settings
No dedicated inference-time settings. Each prediction searches the stored training set, so inference cost grows with the number of training points.
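The class probabilities in the Output are the neighbors' vote shares: with uniform weights, the probability of a class is simply the fraction of the K neighbors carrying that label. A sketch with hypothetical one-dimensional data, again assuming a scikit-learn backend:

```python
from sklearn.neighbors import KNeighborsClassifier

# Five training points, three of class 0 and two of class 1.
# With n_neighbors=5, every point is a neighbor of any query,
# so predicted probabilities are the raw class fractions.
X = [[0.0], [1.0], [2.0], [3.0], [4.0]]
y = [0, 0, 0, 1, 1]
model = KNeighborsClassifier(n_neighbors=5).fit(X, y)
print(model.predict_proba([[2.5]]))  # → [[0.6 0.4]]
```

With weights set to distance, these probabilities become inverse-distance-weighted vote shares instead.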