Polynomial Regression
Extending linear regression to model non-linear relationships
Polynomial regression models non-linear relationships by creating new features from the original feature. Take a single feature x and create x², x³, etc. These become additional features in a linear regression model.
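As a concrete illustration, here is a minimal scikit-learn sketch (the synthetic data, degree, and seed are made up for demonstration): the raw feature is scaled, expanded into polynomial terms, and passed to an ordinary linear regression.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures, StandardScaler

# Synthetic non-linear data: y = 0.5*x^2 + x + 2 + noise (illustrative only)
rng = np.random.default_rng(42)
X = rng.uniform(-3, 3, size=(100, 1))
y = 0.5 * X[:, 0] ** 2 + X[:, 0] + 2 + rng.normal(0, 1, size=100)

# Still linear regression: the model is linear in the engineered features x, x^2
model = make_pipeline(
    StandardScaler(),                                   # scale the raw feature first
    PolynomialFeatures(degree=2, include_bias=False),   # add the x^2 column
    LinearRegression(),
)
model.fit(X, y)
print(model.predict([[1.5]]))  # prediction at a new point
```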
The visualization shows fits of different polynomial degrees:
- Degree 1 (straight line): Underfits
- Degree 2-3: Often the sweet spot for smoothly curved data
- High degree: May fit the training data perfectly, but this is overfitting—the model memorizes noise instead of learning the true pattern
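The same comparison can be reproduced numerically without a plot: fit a few degrees to one noisy sample and compare training error with held-out error (a rough sketch; the data, seed, and degrees below are arbitrary).

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures

rng = np.random.default_rng(7)
X = rng.uniform(-3, 3, size=(40, 1))
y = np.sin(X[:, 0]) + rng.normal(0, 0.3, size=40)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

for degree in (1, 3, 15):
    model = make_pipeline(PolynomialFeatures(degree), LinearRegression())
    model.fit(X_tr, y_tr)
    train_mse = mean_squared_error(y_tr, model.predict(X_tr))  # keeps shrinking
    test_mse = mean_squared_error(y_te, model.predict(X_te))   # eventually blows up
    print(f"degree {degree:2d}: train MSE {train_mse:.3f}, test MSE {test_mse:.3f}")
```

Typically the degree-1 fit shows high error on both splits (underfitting), while the degree-15 fit drives training error toward zero yet does much worse on the held-out points (overfitting).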
Key points:
- It's still linear regression—just with engineered features (x, x², x³)
- High-degree polynomials can pass through every training point but perform poorly on new data
- A reliable way to diagnose overfitting or underfitting is to plot a learning curve showing training vs. validation error
- Use cross-validation to choose the degree (see the sketch after this list)
- Always scale features before creating polynomial terms, so that high powers don't produce extreme values
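Here is one way to do that degree search with scikit-learn, assuming a single feature X and target y like those above (the degree range and fold count are arbitrary choices):

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures, StandardScaler

rng = np.random.default_rng(0)
X = rng.uniform(-3, 3, size=(150, 1))
y = 0.5 * X[:, 0] ** 2 + X[:, 0] + 2 + rng.normal(0, 1, size=150)

cv_mse = {}
for degree in range(1, 10):
    model = make_pipeline(
        StandardScaler(),                                       # scale before the expansion
        PolynomialFeatures(degree=degree, include_bias=False),  # engineered features
        LinearRegression(),
    )
    # 5-fold cross-validated mean squared error (lower is better)
    scores = cross_val_score(model, X, y, cv=5, scoring="neg_mean_squared_error")
    cv_mse[degree] = -scores.mean()

best_degree = min(cv_mse, key=cv_mse.get)
print(f"best degree: {best_degree} (CV MSE {cv_mse[best_degree]:.3f})")
```

For the learning-curve diagnostic mentioned above, scikit-learn's `sklearn.model_selection.learning_curve` returns the training and validation scores needed for that plot.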
For evaluation metrics, see Regression Evaluation Metrics.
Bias, Variance, and Irreducible Error
For squared-error loss, the expected prediction error can be decomposed into three components:
Bias: Error from overly simple models that miss the true pattern. Low-degree polynomials have high bias—they can't capture complex relationships.
Variance: Error from models that are too sensitive to training data fluctuations. High-degree polynomials have high variance—small changes in training data cause large changes in the fitted curve.
Irreducible Error: Error from noise in the data itself. No model can eliminate this—it's the fundamental limit of prediction.
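For squared-error loss this decomposition has a standard textbook form; writing f for the true function, f-hat for the fitted model, and σ² for the noise variance (notation chosen here for illustration):

```latex
\mathbb{E}\big[(y - \hat{f}(x))^2\big]
  = \underbrace{\big(\mathbb{E}[\hat{f}(x)] - f(x)\big)^2}_{\text{bias}^2}
  + \underbrace{\mathbb{E}\Big[\big(\hat{f}(x) - \mathbb{E}[\hat{f}(x)]\big)^2\Big]}_{\text{variance}}
  + \underbrace{\sigma^2}_{\text{irreducible error}}
```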
The goal is to minimize the reducible error: squared bias plus variance. As polynomial degree increases:
- Bias decreases: Model becomes more flexible
- Variance increases: Model becomes more sensitive to noise
- Irreducible error stays constant: It's inherent in the data
The optimal degree balances bias and variance, achieving the lowest total error on unseen data.
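This trade-off can be checked empirically: refit polynomials of a few degrees on many freshly drawn training sets, then measure how far the average prediction sits from the true function (bias²) and how much predictions fluctuate across training sets (variance). Everything below (true function, noise level, sample sizes) is a made-up illustration.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures

rng = np.random.default_rng(1)
noise_sd = 0.3                       # source of the irreducible error
x_test = np.linspace(-3, 3, 50)

def true_f(x):
    return np.sin(x)                 # the underlying "true pattern"

for degree in (1, 3, 12):
    preds = []
    # Refit on 200 independent noisy training sets of 30 points each
    for _ in range(200):
        x_train = rng.uniform(-3, 3, 30)
        y_train = true_f(x_train) + rng.normal(0, noise_sd, 30)
        model = make_pipeline(PolynomialFeatures(degree), LinearRegression())
        model.fit(x_train.reshape(-1, 1), y_train)
        preds.append(model.predict(x_test.reshape(-1, 1)))
    preds = np.array(preds)                                   # shape (200, 50)
    bias_sq = ((preds.mean(axis=0) - true_f(x_test)) ** 2).mean()
    variance = preds.var(axis=0).mean()
    print(f"degree {degree:2d}: bias^2 = {bias_sq:.3f}, variance = {variance:.3f}")
```

Typically the low degree shows the largest bias² and the high degree the largest variance, while the noise term (here σ² = 0.3²) stays the same no matter which model is fit.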