Polynomial Regression
Extending linear regression to model non-linear relationships
Polynomial regression models non-linear relationships by creating new features from the original feature. Take a single feature x and create x², x³, etc. These become additional features in a linear regression model.
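As a concrete illustration, here is a minimal scikit-learn sketch (the synthetic data, degree, and seed are made up for demonstration): the raw feature is scaled, expanded into polynomial terms, and passed to an ordinary linear regression.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures, StandardScaler

# Synthetic non-linear data: y = 0.5*x^2 + x + 2 + noise (illustrative only)
rng = np.random.default_rng(42)
X = rng.uniform(-3, 3, size=(100, 1))
y = 0.5 * X[:, 0] ** 2 + X[:, 0] + 2 + rng.normal(0, 1, size=100)

# Still linear regression: the model is linear in the engineered features x, x^2
model = make_pipeline(
    StandardScaler(),                                   # scale the raw feature first
    PolynomialFeatures(degree=2, include_bias=False),   # add the x^2 column
    LinearRegression(),
)
model.fit(X, y)
print(model.predict([[1.5]]))  # prediction at a new point
```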
The visualization shows fits of different polynomial degrees:
- Degree 1 (straight line): Underfits
- Degree 2-3: Often the sweet spot for smoothly curved data
- High degree: May fit the training data perfectly, but this is overfitting—the model memorizes noise instead of learning the true pattern
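The same comparison can be reproduced numerically without a plot: fit a few degrees to one noisy sample and compare training error with held-out error (a rough sketch; the data, seed, and degrees below are arbitrary).

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures

rng = np.random.default_rng(7)
X = rng.uniform(-3, 3, size=(40, 1))
y = np.sin(X[:, 0]) + rng.normal(0, 0.3, size=40)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

for degree in (1, 3, 15):
    model = make_pipeline(PolynomialFeatures(degree), LinearRegression())
    model.fit(X_tr, y_tr)
    train_mse = mean_squared_error(y_tr, model.predict(X_tr))  # keeps shrinking
    test_mse = mean_squared_error(y_te, model.predict(X_te))   # eventually blows up
    print(f"degree {degree:2d}: train MSE {train_mse:.3f}, test MSE {test_mse:.3f}")
```

Typically the degree-1 fit shows high error on both splits (underfitting), while the degree-15 fit drives training error toward zero yet does much worse on the held-out points (overfitting).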
Key points:
- It's still linear regression—just with engineered features (x, x², x³)
- High-degree polynomials can pass through every training point but perform poorly on new data
- A reliable way to diagnose overfitting or underfitting is to plot a learning curve showing training vs. validation error
- Use cross-validation to choose the degree (see the sketch after this list)
- Always scale features before creating polynomial terms, so that high powers don't produce extreme values
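Here is one way to do that degree search with scikit-learn, assuming a single feature X and target y like those above (the degree range and fold count are arbitrary choices):

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures, StandardScaler

rng = np.random.default_rng(0)
X = rng.uniform(-3, 3, size=(150, 1))
y = 0.5 * X[:, 0] ** 2 + X[:, 0] + 2 + rng.normal(0, 1, size=150)

cv_mse = {}
for degree in range(1, 10):
    model = make_pipeline(
        StandardScaler(),                                       # scale before the expansion
        PolynomialFeatures(degree=degree, include_bias=False),  # engineered features
        LinearRegression(),
    )
    # 5-fold cross-validated mean squared error (lower is better)
    scores = cross_val_score(model, X, y, cv=5, scoring="neg_mean_squared_error")
    cv_mse[degree] = -scores.mean()

best_degree = min(cv_mse, key=cv_mse.get)
print(f"best degree: {best_degree} (CV MSE {cv_mse[best_degree]:.3f})")
```

For the learning-curve diagnostic mentioned above, scikit-learn's `sklearn.model_selection.learning_curve` returns the training and validation scores needed for that plot.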
For evaluation metrics, see Regression Evaluation Metrics.
Bias, Variance, and Irreducible Error
For squared-error loss, the expected prediction error can be decomposed into three components:
Bias: Error from overly simple models that miss the true pattern. Low-degree polynomials have high bias—they can't capture complex relationships.
Variance: Error from models that are too sensitive to training data fluctuations. High-degree polynomials have high variance—small changes in training data cause large changes in the fitted curve.
Irreducible Error: Error from noise in the data itself. No model can eliminate this—it's the fundamental limit of prediction.
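For squared-error loss this decomposition has a standard textbook form; writing f for the true function, f-hat for the fitted model, and σ² for the noise variance (notation chosen here for illustration):

```latex
\mathbb{E}\big[(y - \hat{f}(x))^2\big]
  = \underbrace{\big(\mathbb{E}[\hat{f}(x)] - f(x)\big)^2}_{\text{bias}^2}
  + \underbrace{\mathbb{E}\Big[\big(\hat{f}(x) - \mathbb{E}[\hat{f}(x)]\big)^2\Big]}_{\text{variance}}
  + \underbrace{\sigma^2}_{\text{irreducible error}}
```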
The goal is to minimize the reducible error: squared bias plus variance. As polynomial degree increases:
- Bias decreases: Model becomes more flexible
- Variance increases: Model becomes more sensitive to noise
- Irreducible error stays constant: It's inherent in the data
The optimal degree balances bias and variance, achieving the lowest total error on unseen data.
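This trade-off can be checked empirically: refit polynomials of a few degrees on many freshly drawn training sets, then measure how far the average prediction sits from the true function (bias²) and how much predictions fluctuate across training sets (variance). Everything below (true function, noise level, sample sizes) is a made-up illustration.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures

rng = np.random.default_rng(1)
noise_sd = 0.3                       # source of the irreducible error
x_test = np.linspace(-3, 3, 50)

def true_f(x):
    return np.sin(x)                 # the underlying "true pattern"

for degree in (1, 3, 12):
    preds = []
    # Refit on 200 independent noisy training sets of 30 points each
    for _ in range(200):
        x_train = rng.uniform(-3, 3, 30)
        y_train = true_f(x_train) + rng.normal(0, noise_sd, 30)
        model = make_pipeline(PolynomialFeatures(degree), LinearRegression())
        model.fit(x_train.reshape(-1, 1), y_train)
        preds.append(model.predict(x_test.reshape(-1, 1)))
    preds = np.array(preds)                                   # shape (200, 50)
    bias_sq = ((preds.mean(axis=0) - true_f(x_test)) ** 2).mean()
    variance = preds.var(axis=0).mean()
    print(f"degree {degree:2d}: bias^2 = {bias_sq:.3f}, variance = {variance:.3f}")
```

Typically the low degree shows the largest bias² and the high degree the largest variance, while the noise term (here σ² = 0.3²) stays the same no matter which model is fit.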