ARIMA
AutoRegressive Integrated Moving Average model for univariate time series forecasting
ARIMA (AutoRegressive Integrated Moving Average) is a popular statistical method for time series forecasting that combines autoregression, differencing, and moving averages. It's particularly effective for univariate time series data with trends but without complex seasonal patterns.
When to Use ARIMA
ARIMA is best suited for:
- Univariate time series forecasting (single target variable)
- Data with trends that need to be made stationary through differencing
- Short to medium-term forecasts
- When you need interpretable model coefficients
- Scenarios where statistical rigor is preferred over black-box models
Strengths
- Well-established statistical foundation, with confidence intervals for the estimated coefficients
- Interpretable parameters and diagnostic statistics
- Effective for capturing linear trends and autocorrelation patterns
- Works well with relatively small datasets
- No need for external features (univariate)
- Provides prediction intervals for uncertainty quantification
Weaknesses
- Limited to univariate forecasting (single target only)
- Cannot handle seasonal patterns directly (use SARIMA instead)
- Requires manual tuning of p, d, q parameters (or use Auto ARIMA)
- Assumes linear relationships in the data
- Performance degrades for long-term forecasts
- Sensitive to outliers and structural breaks
Parameters
Common Time Series Parameters
All time series models share these parameters:
- Timestamp Column (required): Column containing dates/times that orders your observations
- Target Column (required): Numeric value to forecast
- Frequency (optional): Time spacing (D=daily, H=hourly, W=weekly, M=monthly). Auto-inferred if not specified
- Forecast Steps (required, default=1): How many periods ahead to predict
- Aggregation (default='mean'): Method to handle duplicate timestamps (mean, sum, first, last, max, min)
- Fill Method (default='interpolate'): How to handle missing values (interpolate, ffill, bfill)
- Smoothing (default=false): Whether to apply smoothing to the time series data
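To picture what the Aggregation and Fill Method options do, here is a minimal sketch in plain Python. It is not the tool's implementation: the helper names `aggregate_duplicates` and `forward_fill` are hypothetical, and only the 'mean' aggregation and the 'ffill' fill method are shown.

```python
from collections import defaultdict
from statistics import mean


def aggregate_duplicates(rows):
    """Collapse rows that share a timestamp into one value using the mean."""
    buckets = defaultdict(list)
    for timestamp, value in rows:
        buckets[timestamp].append(value)
    # Sort by timestamp so the observations stay in time order.
    return {ts: mean(vals) for ts, vals in sorted(buckets.items())}


def forward_fill(values):
    """Replace each None with the most recent observed value (ffill)."""
    filled, last = [], None
    for v in values:
        if v is not None:
            last = v
        filled.append(last)
    return filled


# Two readings on 2024-01-01 are averaged into a single observation.
rows = [("2024-01-01", 10.0), ("2024-01-01", 14.0), ("2024-01-02", 11.0)]
daily = aggregate_duplicates(rows)

# Gaps are filled with the last known value.
filled = forward_fill([10.0, None, None, 13.0])
```

Note that forward fill leaves leading gaps as None, since there is no earlier value to carry forward.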
ARIMA-Specific Parameters
AR Order (p)
- Type: Integer
- Default: 1
- Description: Number of lag observations included in the model (autoregressive term). Higher values capture more historical dependencies
- Typical Range: 0-5
- Example: p=2 means the model uses the previous 2 time points to predict the next value
Differencing Order (d)
- Type: Integer
- Default: 0
- Description: Degree of differencing needed to make the series stationary. d=1 removes linear trends, d=2 removes quadratic trends
- Typical Range: 0-2
- Example: d=1 transforms the series to differences between consecutive values
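The effect of d can be checked by hand on toy data. The sketch below uses a hypothetical helper, `difference`, to show that one round of differencing flattens a linear trend while a quadratic trend needs two.

```python
def difference(series, d=1):
    """Apply consecutive-value differencing d times."""
    out = list(series)
    for _ in range(d):
        out = [b - a for a, b in zip(out, out[1:])]
    return out


linear = [3 * t + 5 for t in range(6)]      # 5, 8, 11, 14, 17, 20
quadratic = [t * t for t in range(6)]       # 0, 1, 4, 9, 16, 25

linear_d1 = difference(linear, d=1)         # constant, so the trend is gone
quadratic_d1 = difference(quadratic, d=1)   # still increasing
quadratic_d2 = difference(quadratic, d=2)   # constant after a second pass
```

Each round of differencing shortens the series by one observation, which is one reason to avoid a larger d than the data needs.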
MA Order (q)
- Type: Integer
- Default: 0
- Description: Number of lagged forecast errors included in the model (moving average term). Despite the name, this is not a rolling-average window over the data. Captures short-term fluctuations
- Typical Range: 0-5
- Example: q=1 means the model considers the previous forecast error
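To make the roles of p and q concrete, the following sketch computes a one-step-ahead forecast from an already-differenced series. The function name `arma_one_step` and the coefficient values are illustrative assumptions; a real fit estimates the coefficients from data, and this sketch simply treats pre-sample values and errors as zero.

```python
def arma_one_step(y, phi, theta, c=0.0):
    """One-step-ahead ARMA(p, q) forecast for an already-differenced series.

    phi holds the p autoregressive coefficients, theta the q moving-average
    coefficients. Past forecast errors are rebuilt recursively.
    """
    p, q = len(phi), len(theta)
    errors = []
    for t in range(len(y) + 1):
        # AR part: weighted sum of the previous p observations.
        ar = sum(phi[i] * y[t - 1 - i] for i in range(p) if t - 1 - i >= 0)
        # MA part: weighted sum of the previous q forecast errors.
        ma = sum(theta[j] * errors[t - 1 - j] for j in range(q) if t - 1 - j >= 0)
        pred = c + ar + ma
        if t == len(y):
            return pred  # forecast for the first unseen time point
        errors.append(y[t] - pred)


y = [1.0, 2.0, 3.0]
ar2_forecast = arma_one_step(y, phi=[0.5, 0.25], theta=[])  # uses the last 2 values
ma1_forecast = arma_one_step(y, phi=[], theta=[0.5])        # uses the last error
```

With p=2 the forecast depends only on the two most recent observations; with q=1 it depends only on how wrong the previous forecast was.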
Hyperparameter Tuning Parameters
- Tune Hyperparameters (default=false): Enable automatic optimization of p, d, q values
- Tuning Method (default='grid'): Strategy for parameter search (grid or random)
- Number of Trials (default=20): Number of random search iterations (when tuning_method='random')
- CV Folds (default=3, min=2): Number of time series cross-validation folds for tuning
- Scoring Metric (default='rmse'): Metric to minimize during tuning (rmse, mae, mape)
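The tuning options can be read as three ingredients: a set of candidate orders, time-ordered validation folds, and an error metric. The sketch below shows one plausible form of each; the function names, the expanding-window fold layout, and the grid bounds are assumptions for illustration, not a description of how the tool implements tuning.

```python
from itertools import product
from math import sqrt


def candidate_orders(max_p=2, max_d=1, max_q=2):
    """Enumerate every (p, d, q) combination for a grid search."""
    return list(product(range(max_p + 1), range(max_d + 1), range(max_q + 1)))


def expanding_window_splits(n_obs, n_folds=3, horizon=1):
    """Yield (train_end, test_end) index pairs.

    Fold k trains on observations [0, train_end) and tests on
    [train_end, test_end), so training data always precedes test data.
    """
    first_train_end = n_obs - n_folds * horizon
    for k in range(n_folds):
        train_end = first_train_end + k * horizon
        yield train_end, train_end + horizon


def rmse(actual, predicted):
    return sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))


def mae(actual, predicted):
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def mape(actual, predicted):
    """Mean absolute percentage error; undefined if any actual value is 0."""
    return 100 * sum(abs((a - p) / a) for a, p in zip(actual, predicted)) / len(actual)


grid = candidate_orders()
splits = list(expanding_window_splits(n_obs=12, n_folds=3, horizon=2))
```

Unlike ordinary k-fold cross-validation, the folds are never shuffled: shuffling would let the model train on observations that come after the ones it is tested on.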
Configuration Tips
Finding the Right Order Parameters
- Start Simple: Begin with (1,1,1) and evaluate performance
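- Check ACF/PACF Plots: Use autocorrelation plots to guide p and q selection. As a rule of thumb, a sharp cutoff in the PACF suggests a value for p, and a sharp cutoff in the ACF suggests a value for q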
- Enable Tuning: Set "Tune Hyperparameters" to true to automatically find optimal (p,d,q)
- Stationarity Test: Use d=1 if your data has a clear trend, d=0 if already stationary. A unit-root test such as the Augmented Dickey-Fuller test can confirm what the plot suggests
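For readers who want to see what an ACF plot is built from, here is a minimal sketch of the sample autocorrelation function in plain Python. The helper name `acf` is hypothetical; in practice a library such as statsmodels provides `plot_acf` and `plot_pacf`. An autocorrelation that stays high across lags is a common sign that the series needs differencing.

```python
def acf(series, max_lag):
    """Sample autocorrelation for lags 0..max_lag."""
    n = len(series)
    mean = sum(series) / n
    centered = [x - mean for x in series]
    denom = sum(c * c for c in centered)  # zero for a constant series
    return [
        sum(centered[t] * centered[t - k] for t in range(k, n)) / denom
        for k in range(max_lag + 1)
    ]


trending = [float(t) for t in range(50)]                      # steady upward trend
alternating = [1.0 if t % 2 == 0 else -1.0 for t in range(50)]  # flips every step

trend_acf = acf(trending, max_lag=3)       # stays close to 1 at short lags
alt_acf = acf(alternating, max_lag=1)      # strongly negative at lag 1
```

The function divides by the series variance, so it raises a ZeroDivisionError on a constant series.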
When to Increase Each Parameter
- Increase p: When current values are strongly correlated with many past values
- Increase d: When one level of differencing doesn't remove the trend
- Increase q: When you see patterns in forecast errors over time
Frequency Configuration
Make sure your frequency matches your data:
- Daily sales → 'D'
- Hourly sensors → 'H'
- Monthly revenue → 'M'
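A quick way to sanity-check the frequency setting is to look at the typical gap between consecutive timestamps. The sketch below is one assumed heuristic, not the tool's auto-inference logic; `infer_frequency` is a hypothetical name, and it treats any median gap of 28 to 31 days as monthly.

```python
from datetime import datetime, timedelta
from statistics import median


def infer_frequency(timestamps):
    """Guess a frequency code from the median gap between timestamps."""
    ordered = sorted(timestamps)
    gap_hours = median(
        (b - a).total_seconds() for a, b in zip(ordered, ordered[1:])
    ) / 3600
    if gap_hours == 1:
        return "H"
    if gap_hours == 24:
        return "D"
    if gap_hours == 24 * 7:
        return "W"
    if 28 * 24 <= gap_hours <= 31 * 24:
        return "M"
    return None  # irregular or unsupported spacing


start = datetime(2024, 1, 1)
daily = [start + timedelta(days=i) for i in range(10)]
hourly = [start + timedelta(hours=i) for i in range(10)]
monthly = [datetime(2024, m, 1) for m in range(1, 5)]

daily_code = infer_frequency(daily)
hourly_code = infer_frequency(hourly)
monthly_code = infer_frequency(monthly)
```

Using the median rather than the mean keeps a few missing days from changing the answer.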
Model Selection Strategy
- Enable hyperparameter tuning for initial exploration
- Review the selected (p,d,q) values
- Fine-tune manually, if needed, based on your domain knowledge
- Compare with Auto ARIMA for best results
Common Issues and Solutions
Issue: Poor Long-Term Forecasts
Solution: ARIMA is designed for short-term forecasting. For long horizons, consider:
- Using Prophet or TBATS for better trend extrapolation
- Reducing forecast_steps
- Adding exogenous variables with SARIMAX
Issue: Model Won't Converge
Solution:
- Check for missing values in your time series
- Try reducing p, d, or q parameters
- Ensure your data has sufficient history (at least 50 observations)
- Consider removing extreme outliers
Issue: Predictions Flatten Out
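Solution: This is expected behavior for ARIMA. With d=0, multi-step forecasts revert toward the series mean; with d=1 and no drift term, they level off at a constant. To maintain trends: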
- Verify d parameter is appropriate for your trend
- Consider using Prophet which has explicit trend modeling
- Use SARIMAX with exogenous trend variables
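The flattening can be reproduced with two small closed-form cases. The sketch below uses hypothetical function names and made-up coefficients: a stationary AR(1) model whose forecasts decay toward the mean, and an ARIMA(0,1,0) model whose forecasts stay flat unless a drift term is included.

```python
def forecast_ar1(last_value, phi, mean, steps):
    """Multi-step forecast for a stationary AR(1) model (d=0)."""
    forecasts, current = [], last_value
    for _ in range(steps):
        # Each step pulls the forecast a fraction phi of the way back from the mean.
        current = mean + phi * (current - mean)
        forecasts.append(current)
    return forecasts


def forecast_random_walk(last_value, drift, steps):
    """Multi-step forecast for ARIMA(0,1,0) with an optional drift term."""
    return [last_value + drift * h for h in range(1, steps + 1)]


reverting = forecast_ar1(last_value=20.0, phi=0.5, mean=10.0, steps=5)
flat = forecast_random_walk(last_value=20.0, drift=0.0, steps=3)
trending = forecast_random_walk(last_value=20.0, drift=2.0, steps=3)
```

Here `reverting` moves from 15.0 toward 10.0, `flat` repeats the last observed value, and only `trending` keeps climbing, because the drift term adds a fixed increment per step.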
Issue: High Errors on Recent Data
Solution:
- Your series may have structural breaks or regime changes
- Try fitting on more recent data only
- Consider models like Prophet that handle changepoints
Issue: Need Seasonal Patterns
Solution: ARIMA doesn't model seasonality. Use:
- SARIMA for seasonal ARIMA
- Prophet for automatic seasonality detection
- Seasonal decomposition before ARIMA
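One simple way to strip a repeating pattern before fitting ARIMA is seasonal differencing: subtract the value from one full season earlier. This is a simpler technique than a full seasonal decomposition, and it is the same operation SARIMA applies through its seasonal differencing order. The helper names and the toy weekly data below are illustrative assumptions.

```python
def seasonal_difference(series, period):
    """Subtract the value one full season earlier (lag-`period` differencing)."""
    return [series[t] - series[t - period] for t in range(period, len(series))]


def invert_seasonal_difference(history, diffs_forecast, period):
    """Convert forecasts of the differenced series back to original units."""
    extended = list(history)
    for diff in diffs_forecast:
        # Add the forecast change to the value from one season earlier.
        extended.append(extended[-period] + diff)
    return extended[len(history):]


weekly_pattern = [10, 12, 11, 13, 15, 20, 18]
# Three weeks of data: the same weekly shape, rising by 1 each week.
series = [v + week for week in range(3) for v in weekly_pattern]

diffed = seasonal_difference(series, period=7)   # weekly shape removed
restored = invert_seasonal_difference(series, [1, 1], period=7)
```

After differencing, the series loses its first `period` observations, and any forecast made on the differenced scale has to be inverted before it is reported.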
Example Use Cases
- Daily website traffic: (p=7, d=1, q=0) approximates a weekly pattern through seven AR lags; for strong weekly seasonality, SARIMA is usually the better choice
- Monthly sales with trend: (p=1, d=1, q=1) removes trend and smooths noise
- Stock prices: (p=0, d=1, q=0) with a constant term is a random walk with drift; (p=1, d=1, q=0) adds one autoregressive lag on the differences
- Temperature forecasting: (p=2, d=0, q=1) for stationary weather patterns