Code360 powered by Coding Ninjas X
Table of contents
Evaluating Predictor Accuracy
Weighted Quantile Loss (wQL)
Weighted Absolute Percentage Error (WAPE)
Root Mean Square Error (RMSE)
Mean Absolute Percentage Error (MAPE)
Choosing Forecast Types
Setting Backtesting Parameters
Retraining Predictors
Weather Index
Holidays Featurization
Predictor Explainability
Interpreting Impact Scores
Frequently Asked Questions
Last Updated: Mar 27, 2024

Training Predictors of Amazon Forecast

Author: Vishal Teotia

Creating time series forecasting models is a crucial requirement of many projects, since forecasting is a common customer need. Amazon Forecast, which is based on the same machine learning technology used at Amazon.com, accelerates this process.

Amazon Forecast predictors are models trained on your target time series, plus any related time series, item metadata, and other datasets you include. Once trained, a predictor can generate forecasts from your time-series data.

By default, Amazon Forecast creates an AutoPredictor, which selects the optimal combination of algorithms for each time series in your dataset.

Typically, predictions created with AutoPredictor are more accurate than those made using AutoML or manual selection. Predictor retraining and Forecast Explainability are only available for predictors created with AutoPredictor.


To train a predictor, Amazon Forecast requires the following inputs:

  • Forecast horizon – the number of future time steps to be forecasted.
  • Forecast frequency – the frequency with which you produce forecasts (hourly, daily, weekly, etc.).
  • Dataset group – a group of datasets that includes a target time series dataset. A target time series dataset contains the item identifier (item_id), timestamp, and target value attributes, along with any dimensions. Related time series and item metadata datasets are optional.

When training a predictor with Amazon Forecast, you can also include the Weather Index and Holidays: the Weather Index incorporates weather information, and Holidays incorporates information on national holidays.
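To make these inputs concrete, here is a minimal sketch of a predictor configuration shaped like a CreateAutoPredictor request body. Every name and ARN below is a made-up placeholder, not a real resource, and the exact request schema should be checked against the AWS documentation.

```python
# Hypothetical predictor configuration; all names/ARNs are placeholders.
predictor_request = {
    "PredictorName": "retail_demand_predictor",     # hypothetical name
    "ForecastHorizon": 14,                          # forecast 14 future time steps
    "ForecastFrequency": "D",                       # daily forecast frequency
    "ForecastTypes": ["0.15", "0.50", "0.90"],      # quantiles to train for
    "DataConfig": {
        # Placeholder ARN for the dataset group holding the target time series
        "DatasetGroupArn": "arn:aws:forecast:us-east-1:123456789012:dataset-group/demo",
        "AdditionalDatasets": [
            {"Name": "weather"},                    # enables the Weather Index
            {"Name": "holiday",
             "Configuration": {"CountryCode": ["US"]}},  # enables Holidays
        ],
    },
}
```

Passing a dictionary like this to the Forecast API would train an AutoPredictor with both the Weather Index and the US holiday calendar enabled.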

Evaluating Predictor Accuracy

You can use Amazon Forecast's accuracy metrics to evaluate predictors and choose which one to use. Forecast assesses predictors using the Root Mean Square Error (RMSE), Weighted Quantile Loss (wQL), Mean Absolute Percentage Error (MAPE), Mean Absolute Scaled Error (MASE), and Weighted Absolute Percentage Error (WAPE) metrics.

Backtesting on Amazon Forecast produces accuracy metrics and optimizes parameters. When Forecast backtests your model, time-series data is automatically separated into training and testing sets. A model is trained on the training set to forecast data points from the testing set. Analyzing the model's accuracy involves comparing the Forecast with the observed values in the testing set.

Weighted Quantile Loss (wQL)

A model's Weighted Quantile Loss (wQL) measures its accuracy at a given quantile. It is especially useful when underpredicting and overpredicting carry different costs. By setting the weight (τ) of the Weighted Quantile Loss function, you can automatically apply different penalties for overpredicting and underpredicting.

The loss function is calculated as follows:

wQL[τ] = 2 · Σi,t [ τ · max(yi,t − qi,t(τ), 0) + (1 − τ) · max(qi,t(τ) − yi,t, 0) ] / Σi,t |yi,t|

τ – a quantile in the set {0.01, 0.02, ..., 0.99}

qi,t(τ) – the τ-quantile that the model predicts at point (i,t)

yi,t – the observed value at point (i,t)
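The asymmetry is easiest to see numerically. The sketch below, using made-up numbers, implements the standard pinball-loss form of wQL described above: a high τ (e.g. 0.9) penalises underprediction more heavily, while a low τ penalises overprediction.

```python
# Toy weighted quantile loss (wQL), assuming the standard pinball-loss
# formulation; the observed/predicted values are made-up numbers.
def wql(y, q, tau):
    """wQL at quantile tau for observed values y and quantile predictions q."""
    num = sum(tau * max(yi - qi, 0) + (1 - tau) * max(qi - yi, 0)
              for yi, qi in zip(y, q))
    return 2 * num / sum(abs(yi) for yi in y)

observed  = [10, 12, 8, 11]
predicted = [9, 13, 8, 10]   # two underpredictions, one overprediction

# With tau = 0.9 the two underpredictions dominate the loss; with
# tau = 0.1 the single overprediction dominates instead.
loss_p90 = wql(observed, predicted, tau=0.9)   # 2 * 1.9 / 41
loss_p10 = wql(observed, predicted, tau=0.1)   # 2 * 1.1 / 41
```

Because this toy series underpredicts more often than it overpredicts, its P90 loss comes out larger than its P10 loss.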

Weighted Absolute Percentage Error (WAPE)

The weighted absolute percentage error (WAPE) measures how far forecasted values deviate from observed values. It is calculated as the sum of the absolute errors divided by the sum of the absolute observed values. Smaller values indicate better accuracy.

WAPE = Σi,t |yi,t − ŷi,t| / Σi,t |yi,t|

yi,t – observed value at point (i,t)

ŷi,t – predicted value at point (i,t)

Due to its use of the absolute error instead of the squared error, WAPE is more robust to outliers than RMSE.
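A minimal sketch of the WAPE definition above, using made-up numbers:

```python
# Toy WAPE: sum of absolute errors over sum of absolute observed values.
def wape(y, yhat):
    return sum(abs(a - f) for a, f in zip(y, yhat)) / sum(abs(a) for a in y)

observed  = [10, 12, 8, 11]
predicted = [9, 13, 8, 10]
err = wape(observed, predicted)   # (1 + 1 + 0 + 1) / 41
```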

Root Mean Square Error (RMSE)

Root Mean Square Error (RMSE) is the square root of the average of the squared errors. Because the errors are squared, it is sensitive to outliers. The lower the RMSE, the better the model.

RMSE = √( (1/nT) · Σi,t (yi,t − ŷi,t)² )

yi,t – observed value at point (i,t)

ŷi,t – predicted value at point (i,t)

nT – number of data points in the testing set
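The outlier sensitivity is visible in a toy computation of the definition above (made-up numbers): a single large error dominates the metric because errors are squared before averaging.

```python
import math

# Toy RMSE matching the definition above.
def rmse(y, yhat):
    n = len(y)
    return math.sqrt(sum((a - f) ** 2 for a, f in zip(y, yhat)) / n)

small_errors = rmse([10, 12, 8, 11], [9, 13, 8, 10])    # every error <= 1
one_outlier  = rmse([10, 12, 8, 11], [10, 12, 8, 21])   # one error of 10
```

Here `small_errors` is below 1, while `one_outlier` jumps to 5.0 even though three of the four predictions are perfect.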

Mean Absolute Percentage Error (MAPE)

Mean absolute percentage error (MAPE) is calculated by taking the absolute percentage difference between the predicted and observed values at each time step and averaging those values. Lower values indicate better accuracy.

MAPE = (1/n) · Σt |At − Ft| / |At|

At – observed value at point t

Ft – predicted value at point t

n – number of data points in the time series
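A minimal sketch of the MAPE definition above, with made-up numbers (note the observed values must be non-zero, since each error is divided by the actual):

```python
# Toy MAPE: average of |actual - forecast| / |actual| per time step.
def mape(actual, forecast):
    n = len(actual)
    return sum(abs((a - f) / a) for a, f in zip(actual, forecast)) / n

err = mape([10, 20, 40], [11, 18, 40])   # (0.1 + 0.1 + 0.0) / 3
```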

Choosing Forecast Types

To create predictions and evaluate predictors, Amazon Forecast uses forecast types. The types are as follows:

  • Mean forecast type – an expected value based on the mean. This type of forecast is typically used as a point forecast for a specific period.
  • Quantile forecast type – a forecast at a certain quantile. It is typically used to provide a range of possible values that accounts for forecast uncertainty. For example, the observed value is expected to fall below a forecast at the 0.75 quantile 75% of the time.

Quantiles can provide an upper and a lower bound for forecasts. For example, using the forecast types 0.15 (P15) and 0.90 (P90) provides a range of values known as a 75% confidence interval. The observed value is expected to be lower than the P15 value only 15% of the time, and higher than the P90 value only 10% of the time, so by generating forecasts at P15 and P90 you can expect the actual value to fall between those bounds 75% of the time.
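A quick empirical check of this interpretation, using a synthetic distribution rather than a real forecaster: when the P15 and P90 "forecasts" are the true quantiles of the data, roughly 75% of observations land inside the band.

```python
import random

# Draw samples from a known distribution and take its empirical quantiles
# as stand-ins for P15/P90 forecasts (synthetic demonstration only).
random.seed(0)
samples = sorted(random.gauss(100, 15) for _ in range(100_000))
p15 = samples[int(0.15 * len(samples))]
p90 = samples[int(0.90 * len(samples))]

# Share of observations falling inside the P15-P90 band: ~0.75.
inside = sum(p15 <= x <= p90 for x in samples) / len(samples)
```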

Setting Backtesting Parameters

Forecast accuracy metrics are calculated using backtesting. When you run multiple backtests, Forecast averages each metric across all backtest windows. By default, Forecast computes one backtest, with the size of the backtest window (testing set) equal to the length of the forecast horizon (prediction window). You can set both the backtest window length and the number of backtest scenarios when training a predictor. Ideally, the backtest window should be at least as large as the forecast horizon, but smaller than half the length of the time-series dataset. You can perform between 1 and 5 backtests; as a general rule, more backtests produce more reliable accuracy metrics.
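The splitting scheme can be sketched as follows. This is a simplified illustration, not Forecast's internal implementation: each backtest holds out one trailing window equal in length to the forecast horizon, and additional backtests roll the window back through the history.

```python
# Sketch of carving (train, test) backtest splits from a time series.
# Each testing window has length `horizon`; backtest 1 is the most recent.
def backtest_splits(series, horizon, n_backtests):
    splits = []
    for k in range(n_backtests):
        cut = len(series) - horizon * (k + 1)
        splits.append((series[:cut], series[cut:cut + horizon]))
    return splits

series = list(range(30))     # 30 daily observations (made-up)
splits = backtest_splits(series, horizon=5, n_backtests=3)
# Backtest 1 trains on the first 25 points and tests on the last 5;
# backtest 2 shifts the window back by one horizon, and so on.
```

Averaging a metric such as WAPE over these splits gives the cross-window accuracy figure that Forecast reports.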


Retraining Predictors

You can retrain your predictors with updated datasets to stay up to date. A predictor's configuration settings are preserved when retraining it with Amazon Forecast. A retrained predictor receives a separate ARN from the original one, and the original predictor remains active after retraining.

Retraining a predictor can improve forecasting accuracy in two ways:

  1. More current data: The retrained predictor will incorporate more current data when training.
  2. Predictor improvements: Updates and improvements to the Amazon Forecast algorithms and additional datasets will be integrated into your retrained predictor.

The process of retraining a predictor can be up to 50% faster than creating a new one from scratch.

Note: Retraining is only available for predictors created with AutoPredictor.

Weather Index

With the Amazon Forecast Weather Index, you can incorporate historical and projected weather data into your models. It is especially useful in retail applications, where temperature and precipitation affect product demand.


When the Weather Index is enabled during predictor training, Forecast applies the weather feature only to the time series whose accuracy it improves. If adding weather information does not improve predictive accuracy during backtesting, the Weather Index is not applied to that time series.

Holidays Featurization

There is a built-in feature called Holidays that incorporates a feature-engineered dataset of national holidays into your model. This feature supports the national holidays of 66 countries. In retail, where public holidays can significantly affect demand, the Holidays feature can be especially useful.

Amazon Forecast includes the Holidays feature as an Additional Dataset, which is enabled before training a model. After you choose a country, Holidays applies that country's holiday calendar to every item in your dataset.

Predictor Explainability

Using Predictor Explainability, you can understand how the attributes in your dataset affect the target variable. Forecast relies on a metric called impact scores to quantify how much each attribute impacts forecast values and whether it increases or decreases them. Predictor Explainability can be enabled when you include related time series, item metadata, or additional datasets such as Holidays and the Weather Index.

Interpreting Impact Scores

The impact score measures how strongly an attribute affects forecast values. For example, if the ‘price’ attribute has twice the impact score of the ‘store location’ attribute, we can conclude that the price of an item impacts forecast values twice as much as the store location.

In addition, impact scores provide information on whether attributes increase or decrease forecast values. The two graphs on the console indicate this. Blue bars indicate increasing forecast values, while red bars indicate decreasing forecast values. 

Impact scores range from 0 to 1, where a score of 0 indicates no impact, and a score close to 1 indicates a significant impact. SDKs assign Impact scores ranging from -1 to 1, where the sign indicates the impact direction.
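The two conventions are easy to translate between. The sketch below uses made-up attribute names and scores to convert SDK-style signed scores (-1 to 1) into the console-style view of magnitude plus direction:

```python
# Hypothetical SDK impact scores: sign encodes direction of the impact.
sdk_scores = {"price": -0.8, "store_location": 0.4, "weather": 0.1}

# Console-style view: magnitude in [0, 1] plus an explicit direction.
console_view = {
    attr: {"impact": abs(s), "direction": "decreases" if s < 0 else "increases"}
    for attr, s in sdk_scores.items()
}

# Relative comparison as in the text: here price has twice the impact
# of store_location, and it decreases forecast values.
ratio = console_view["price"]["impact"] / console_view["store_location"]["impact"]
```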

Frequently Asked Questions

1. Is Predictor Explainability only available to predictors created with AutoPredictor?

Yes, explainability cannot be enabled for legacy predictors that were created using AutoML or manually.

2. What is AutoML?

AutoML is the legacy option in which Forecast evaluates several algorithms on your entire dataset and applies the single best-performing one.

3. Can we export accuracy metrics?

Yes, Forecast allows you to export both the forecasted values and the accuracy metrics generated during backtesting.


In this blog, we learned about predictors in Amazon Forecast and the data required to train them. We also learned about the different accuracy metrics and the differences between them, and saw how backtesting works and why it is helpful in training robust predictors.

Check out this link if you want to explore more about Big Data.

Check out the Amazon Interview Experience to learn about Amazon’s hiring process.

Recommended Readings:

If you are preparing for the upcoming Campus Placements, don't worry. Coding Ninjas has your back. Visit this data structure link for cracking the best product companies.
