Introduction
Time series forecasting is a common customer need, so building forecasting models is a core requirement of many projects. Amazon Forecast, which is based on the same technology used at Amazon.com, accelerates this process.
Amazon Forecast predictors are models trained on your target time series along with any related time series, item metadata, and other datasets you include. Once a predictor has been trained on your time-series data, you use it to generate forecasts.
By default, Amazon Forecast creates an AutoPredictor, which selects the optimal combination of algorithms for each time series in your dataset.
Predictions created with an AutoPredictor are typically more accurate than those produced with AutoML or manual algorithm selection. Predictor retraining and Forecast Explainability are only available for predictors created with AutoPredictor.

To train a predictor, Amazon Forecast requires the following inputs:
- Forecast horizon – The number of time steps being forecasted.
- Forecast frequency – The frequency at which forecasts are produced (for example, hourly, daily, or weekly).
- Dataset group – A group of datasets that includes a target time series dataset. Target time-series datasets contain an item identifier (item_id), a timestamp, and the target value to forecast, along with any dimensions. Related time series and item metadata datasets are optional.
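
To make these inputs concrete, here is a minimal boto3 sketch of creating an AutoPredictor. It is a sketch only, not a complete workflow: the predictor name, 14-step horizon, daily frequency, and dataset group ARN are placeholder values you would replace with your own.

```python
import boto3

forecast = boto3.client("forecast")

# Minimal sketch: the name, horizon, frequency, and ARN below are placeholders.
response = forecast.create_auto_predictor(
    PredictorName="demand_auto_predictor",
    ForecastHorizon=14,        # forecast 14 future time steps
    ForecastFrequency="D",     # produce daily forecasts
    DataConfig={
        "DatasetGroupArn": "arn:aws:forecast:us-east-1:123456789012:dataset-group/demand"
    },
)
print(response["PredictorArn"])
```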

When you create a predictor with Amazon Forecast, you can also include the Weather Index and Holidays: the Weather Index incorporates weather information, and Holidays incorporates information about national holidays.
Evaluating Predictor Accuracy
You can use Amazon Forecast's accuracy metrics to evaluate predictors and choose which one to use. Forecast assesses predictors with the Root Mean Square Error (RMSE), Weighted Quantile Loss (wQL), Mean Absolute Percentage Error (MAPE), Mean Absolute Scaled Error (MASE), and Weighted Absolute Percentage Error (WAPE) metrics.
Forecast produces accuracy metrics and optimizes parameters through backtesting. When Forecast backtests your model, it automatically splits your time-series data into a training set and a testing set, trains the model on the training set, and uses it to forecast the data points in the testing set. Accuracy is then assessed by comparing the forecasted values with the observed values in the testing set.
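Once training completes, the backtest metrics can be retrieved programmatically. The snippet below is a hedged sketch using the GetAccuracyMetrics call; the predictor ARN is a placeholder, and the exact shape of the response shown here is an assumption to verify against the API reference.

```python
import boto3

forecast = boto3.client("forecast")

# Placeholder ARN; use the ARN of your trained predictor.
metrics = forecast.get_accuracy_metrics(
    PredictorArn="arn:aws:forecast:us-east-1:123456789012:predictor/demand_auto_predictor"
)

# The response contains the evaluation metrics computed for each backtest window.
print(metrics["PredictorEvaluationResults"])
```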
Weighted Quantile Loss (wQL)
A model's Weighted Quantile Loss (wQL) measures its accuracy at a given quantile. This is especially useful when underpredicting and overpredicting carry different costs. By setting the weight (τ) of the Weighted Quantile Loss function, you can apply different penalties to overprediction and underprediction.
The loss function is calculated as follows:

wQL(τ) = 2 · Σi,t [ τ · max(yi,t − qi,t(τ), 0) + (1 − τ) · max(qi,t(τ) − yi,t, 0) ] / Σi,t |yi,t|
Where:
τ - a quantile in the set {0.01, 0.02, ..., 0.99}
qi,t(τ) - the τ-quantile that the model predicts.
yi,t - the observed value at point (i,t)
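For intuition, here is a minimal NumPy sketch of this loss (the function name and inputs are illustrative, not part of the Forecast API): it penalizes underprediction by τ and overprediction by 1 − τ, then normalizes by the total absolute observed demand.

```python
import numpy as np

def weighted_quantile_loss(y_true, q_pred, tau):
    """wQL at quantile tau: weighted under/over-prediction penalties,
    scaled by 2 and normalized by the sum of absolute observed values."""
    y_true, q_pred = np.asarray(y_true, float), np.asarray(q_pred, float)
    under = tau * np.maximum(y_true - q_pred, 0.0)          # model predicted too low
    over = (1.0 - tau) * np.maximum(q_pred - y_true, 0.0)   # model predicted too high
    return 2.0 * (under + over).sum() / np.abs(y_true).sum()

# Example: evaluate a P90 forecast against observed demand.
print(weighted_quantile_loss([10, 12, 8], [11, 13, 9], tau=0.9))
```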
Weighted Absolute Percentage Error (WAPE)
The Weighted Absolute Percentage Error (WAPE) measures how far forecasted values deviate from observed values. It is calculated as the sum of the absolute errors divided by the sum of the absolute observed values. Smaller values indicate better accuracy.

WAPE = Σi,t |yi,t − ŷi,t| / Σi,t |yi,t|
Where:
yi,t - observed value at point (i,t)
ŷi,t - predicted value at point (i,t)
Due to its use of the absolute error instead of the squared error, WAPE is more robust to outliers than RMSE.
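A minimal NumPy sketch of WAPE, following the definition above (names are illustrative):

```python
import numpy as np

def wape(y_true, y_pred):
    """Sum of absolute errors divided by the sum of absolute observed values."""
    y_true, y_pred = np.asarray(y_true, float), np.asarray(y_pred, float)
    return np.abs(y_true - y_pred).sum() / np.abs(y_true).sum()

print(wape([10, 12, 8], [11, 13, 9]))  # 3 / 30 = 0.1
```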
Root Mean Square Error (RMSE)
Root Mean Square Error (RMSE) is the square root of the average of the squared errors, which makes it sensitive to outliers. The lower the RMSE, the better the model.

RMSE = √( (1 / nT) · Σi,t (yi,t − ŷi,t)² )
Where:
yi,t - observed value at point (i,t)
ŷi,t - predicted value at point (i,t)
nT - number of data points in a testing set
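A corresponding NumPy sketch of RMSE (illustrative names only):

```python
import numpy as np

def rmse(y_true, y_pred):
    """Square root of the mean of squared errors over the testing set."""
    y_true, y_pred = np.asarray(y_true, float), np.asarray(y_pred, float)
    return np.sqrt(np.mean((y_true - y_pred) ** 2))

print(rmse([10, 12, 8], [11, 13, 9]))  # 1.0
```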
Mean Absolute Percentage Error (MAPE)
The Mean Absolute Percentage Error (MAPE) is calculated by taking the absolute difference between the observed and predicted values at each time step, dividing it by the observed value, and averaging those ratios over all time steps. Lower values indicate better accuracy.

MAPE = (1/n) · Σt |(At − Ft) / At|
Where:
At - observed value at point t
Ft - predicted value at point t
n - number of data points in the time series
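And a NumPy sketch of MAPE, again with illustrative names; note that observed values of zero would need special handling, which this sketch omits:

```python
import numpy as np

def mape(actual, forecast):
    """Average of |actual - forecast| / |actual| across all time steps."""
    actual, forecast = np.asarray(actual, float), np.asarray(forecast, float)
    return np.mean(np.abs((actual - forecast) / actual))

print(mape([10, 12, 8], [11, 13, 9]))  # ≈ 0.103
```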
Choosing Forecast Types
To create predictions and evaluate predictors, Amazon Forecast uses forecast types. The types are as follows:
- Mean forecast type – An expected value based on the mean. This type of forecast is typically used as a point forecast for a specific period.
- Quantile forecast type – A forecast at a given quantile (percentile). It is typically used to provide a range of possible values that accounts for forecast uncertainty. For example, a forecast at the 0.75 quantile estimates a value that is higher than the observed value 75% of the time.
Quantiles can provide an upper and a lower bound for forecasts. For example, using the forecast types 0.15 (P15) and 0.9 (P90) provides a range of values known as a 75% confidence interval; the shaded region in the figure below depicts this range. The observed value is expected to fall below the P15 value only 15% of the time and above the P90 value only 10% of the time, so by generating forecasts at P15 and P90 you can expect the actual value to fall between those bounds 75% of the time.
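
The 75% coverage of a P15–P90 interval can be sanity-checked with a quick simulation; this sketch uses a hypothetical normally distributed demand, not Forecast output.

```python
import numpy as np

rng = np.random.default_rng(0)
demand = rng.normal(loc=100, scale=20, size=10_000)  # hypothetical demand samples

# Empirical P15 and P90 bounds and the share of observations between them.
p15, p90 = np.percentile(demand, [15, 90])
coverage = np.mean((demand >= p15) & (demand <= p90))
print(f"P15={p15:.1f}, P90={p90:.1f}, coverage={coverage:.1%}")  # coverage ≈ 75%
```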

Setting Backtesting Parameters
Forecast accuracy metrics are calculated using backtesting. When you run multiple backtests, Forecast averages each metric across all backtest windows. By default, Forecast computes one backtest, with the size of the backtest window (testing set) equal to the length of the forecast horizon (prediction window). You can set both the backtest window length and the number of backtest windows when training a predictor. The backtest window should be at least as large as the forecast horizon but smaller than half the length of the entire time-series dataset, and you can run between one and five backtests. As a general rule, more backtests produce more reliable accuracy metrics.
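
To illustrate how backtest windows relate to the forecast horizon (a conceptual sketch, not Forecast's internal implementation), the helper below slides a testing window of one horizon length back through the series for each additional backtest:

```python
import numpy as np

def backtest_splits(series, horizon, n_backtests):
    """Yield (training, testing) pairs; each testing window is `horizon` steps long
    and moves one horizon earlier for every additional backtest."""
    series = np.asarray(series, float)
    for k in range(n_backtests):
        split = len(series) - horizon * (k + 1)
        yield series[:split], series[split:split + horizon]

# Example: 3 backtests over two years of daily data with a 14-day horizon.
daily_series = np.arange(730, dtype=float)
for train, test in backtest_splits(daily_series, horizon=14, n_backtests=3):
    print(len(train), len(test))  # training shrinks by 14 each backtest; testing stays 14
```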
