The AI Forecast Operator uses historical time series data to generate forecasts for
future trends.
This operator simplifies and speeds up the data science process by automating model selection,
hyperparameter tuning, and feature identification for a specific prediction task.
The Operator is easy to use and extend, and as powerful as a team of data scientists. To get
started with a forecast, use the following YAML
configuration:
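A minimal configuration might look like the sketch below. The key names reflect the operator's YAML schema as understood at the time of writing and should be checked against the installed version. The file name `data.csv` and the column names `ds` and `y` are placeholders chosen for illustration.

```yaml
# Minimal sketch of a forecast operator configuration.
# data.csv, ds, and y are hypothetical placeholders for your own data.
kind: operator
type: forecast
version: v1
spec:
  historical_data:
    url: data.csv        # path or URL to the historical time series
  datetime_column:
    name: ds             # column holding the timestamps
  target_column: y       # column holding the values to forecast
  horizon: 3             # number of future periods to forecast
```

With no model specified, the Operator chooses how to model the data itself, so this is usually the smallest file that produces a forecast.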
No perfect model exists. A core feature of the Operator is the ability to select from various
model frameworks. For enterprise AI, typically one or two frameworks perform best for the
problem space. Each model is optimized for different assumptions, such as dataset size,
frequency, complexity, and seasonality. The best way to decide which framework is correct for
you is through empirical testing. Based on experience with several enterprise forecasting
problems, the ADS team has found the following frameworks to be the most effective, ranging
from traditional statistical models to complex machine learning and deep neural networks:
Prophet
ARIMA
LightGBM
NeuralProphet
AutoTS
Note
AutoTS isn't a single modeling framework but a combination of many. AutoTS
algorithms include (v0.6.15): ConstantNaive, LastValueNaive, AverageValueNaive, GLS, GLM,
ETS, ARIMA, FBProphet, RollingRegression, GluonTS, SeasonalNaive, UnobservedComponents,
VECM, DynamicFactor, MotifSimulation, WindowRegression, VAR, DatepartRegression,
UnivariateRegression, UnivariateMotif, MultivariateMotif, NVAR, MultivariateRegression,
SectionalMotif, Theta, ARDL, NeuralProphet, DynamicFactorMQ, PytorchForecasting, ARCH,
RRVAR, MAR, TMF, LATC, KalmanStateSpace, MetricMotif, Cassandra, SeasonalityMotif,
MLEnsemble, PreprocessingRegression, FFT, BallTreeMultivariateMotif, TiDE, NeuralForecast,
DMD.
Auto-Select
For users new to forecasting, the Operator also has an auto-select option. This is the most
computationally expensive option because it splits the training data into several validation
sets, evaluates each framework on them, and selects the best-performing one. However,
auto-select isn't guaranteed to find the best model and isn't recommended as the default
configuration for end users because of its complexity.
Specify the Model
You can manually select the required model from the list in Modeling Options and set it as
the value of the model parameter. For example:
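As a hedged sketch, the configuration below pins the framework to Prophet. The lowercase value `prophet` is assumed to be the identifier for the Prophet entry in the list. The data file and column names are placeholders, and the exact spelling of each model name should be confirmed against the installed version.

```yaml
# Sketch: selecting one framework explicitly through the model parameter.
# data.csv, ds, and y are hypothetical placeholders.
kind: operator
type: forecast
version: v1
spec:
  historical_data:
    url: data.csv
  datetime_column:
    name: ds
  target_column: y
  horizon: 3
  model: prophet         # assumed identifier for the Prophet framework
```

Changing only the `model` value and rerunning on the same data is a simple way to carry out the empirical comparison between frameworks.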
When extra data is provided, the Operator can optionally generate explanations for these
features (columns) using SHAP values. You can enable explanations in the YAML
file:
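The sketch below shows one way this might look. The `generate_explanations` and `additional_data` keys are given as understood from the operator's schema and should be verified against the installed version. Both file names and the column names are placeholders.

```yaml
# Sketch: enabling SHAP-based explanations for the extra feature columns.
# data.csv, additional.csv, ds, and y are hypothetical placeholders.
kind: operator
type: forecast
version: v1
spec:
  historical_data:
    url: data.csv
  additional_data:
    url: additional.csv  # extra feature columns covering history and horizon
  datetime_column:
    name: ds
  target_column: y
  horizon: 3
  model: prophet
  generate_explanations: True
```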
With large datasets, SHAP values can be expensive to generate. Enterprise applications differ
in how much numerical accuracy they need relative to computational cost. Therefore, the
Operator offers several options:
FAST_APPROXIMATE (default)
Generated SHAP values are typically within 1% of the true values and take about 1% of the
time needed to compute the true values.
BALANCED
Generated SHAP values are typically within 0.1% of the true values and take about 10% of
the time needed to compute the true values.