Large-scale forecasting: Self-supervised learning framework for hyperparameter tuning

April 5, 2021

What the research is:

A new self-supervised learning framework for model selection (SSL-MS) and hyperparameter tuning (SSL-HPT), which provides accurate forecasts with less computational time and resources. This SSL-HPT algorithm estimates hyperparameters 6-20x faster when compared with baseline search-based algorithms, while producing comparably accurate forecasting results in various applications.

In time series analysis (used to find trends or forecast future values), slight differences in hyperparameters could lead to very different forecast results for a given model. Therefore, it’s very important to select optimal hyperparameter values. Most existing hyperparameter tuning methods (such as grid search, random search, and Bayesian optimal search) are based on one key component: search. Because of this, they are computationally expensive and cannot be applied to fast, scalable time series hyperparameter tuning. Our framework, SSL-HPT, uses time series features as inputs and produces optimal hyperparameters in less time without sacrificing accuracy.

How it works:

We developed the self-supervised learning framework with two main tasks in mind in the forecasting domain: SSL-MS and SSL-HPT.

SSL-MS: The self-supervised learning framework for SSL-MS consists of three steps, as shown below:

  1. Offline training data preparation: We obtain (a) time series features for each time series and (b) the best-performing model for each time series via offline exhaustive hyperparameter tuning.

  2. Offline training: A classifier (self-supervised learner) is trained with the data from step 1, where the input features (predictors) are the time series features and the label is the best-performing model from step 1.

  3. Online model prediction: In our online services, for new time series data, we extract features and then run inference with our pretrained classifier (for example, a random forest model) to predict the best model.

The workflow of SSL-MS.
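
The three steps above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the implementation from the paper: the two features, the three candidate models (`naive`, `mean_model`, `drift`), the synthetic corpus, and the 1-nearest-neighbor learner are all hypothetical stand-ins chosen so the example runs with the standard library only. In practice the learner could be a random forest, as mentioned in step 3.

```python
import random

def extract_features(series):
    """Toy features: lag-1 autocorrelation and trend slope scaled by the std."""
    n = len(series)
    mean = sum(series) / n
    var = sum((x - mean) ** 2 for x in series) / n
    if var == 0:
        return (0.0, 0.0)
    acf1 = sum((series[i] - mean) * (series[i - 1] - mean)
               for i in range(1, n)) / (n * var)
    t_mean = (n - 1) / 2
    slope = (sum((i - t_mean) * (x - mean) for i, x in enumerate(series))
             / sum((i - t_mean) ** 2 for i in range(n)))
    return (acf1, slope / var ** 0.5)

# Hypothetical candidate models; each maps (train, horizon) to a forecast list.
def naive(train, horizon):
    return [train[-1]] * horizon

def mean_model(train, horizon):
    return [sum(train) / len(train)] * horizon

def drift(train, horizon):
    step = (train[-1] - train[0]) / (len(train) - 1)
    return [train[-1] + step * (h + 1) for h in range(horizon)]

MODELS = {"naive": naive, "mean": mean_model, "drift": drift}

def holdout_mae(model, series, horizon):
    """Mean absolute error on the last `horizon` points."""
    train, test = series[:-horizon], series[-horizon:]
    forecast = model(train, horizon)
    return sum(abs(f - y) for f, y in zip(forecast, test)) / horizon

def best_model(series, horizon=6):
    """Step 1(b): label a series with the model that has the lowest holdout error."""
    return min(MODELS, key=lambda name: holdout_mae(MODELS[name], series, horizon))

class NearestNeighborClassifier:
    """Stand-in learner (1-nearest neighbor), not the random forest from the text."""
    def fit(self, X, y):
        self.X, self.y = list(X), list(y)
        return self

    def predict(self, x):
        dists = [sum((a - b) ** 2 for a, b in zip(x, xi)) for xi in self.X]
        return self.y[dists.index(min(dists))]

# Step 1: offline data preparation on a synthetic corpus (assumed data).
rng = random.Random(0)
corpus = []
for _ in range(20):
    slope = rng.uniform(1.0, 2.0)
    corpus.append([slope * t + rng.uniform(-0.5, 0.5) for t in range(30)])  # trending
    corpus.append([10.0 + rng.uniform(-0.5, 0.5) for _ in range(30)])       # flat, noisy
X = [extract_features(s) for s in corpus]
y = [best_model(s) for s in corpus]

# Step 2: offline training of the self-supervised learner.
clf = NearestNeighborClassifier().fit(X, y)

# Step 3: online prediction for a new series, with no model search at inference time.
new_series = [3.0 + 1.5 * t for t in range(30)]
predicted = clf.predict(extract_features(new_series))
print(predicted)
```

The expensive part (fitting and scoring every candidate model) happens only in step 1, offline. At prediction time the cost is one feature extraction plus one classifier call.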

SSL-HPT: The workflow of SSL-MS can be naturally extended to SSL-HPT. As shown in the image below, given a model, all hyperparameter settings within a predefined parameter space are explored for each time series. The most promising one will then be selected as the output Y. For input X, the time series features we used here are the same as in SSL-MS. Once the self-supervised learner is trained, we can directly predict hyperparameters and produce the forecasting results for any new time series data.

The workflow of SSL-HPT.
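
The same idea can be sketched for hyperparameters. The example below is an illustration under assumptions rather than the paper's code: the model is simple exponential smoothing with a single hyperparameter `alpha`, the predefined parameter space is the hypothetical `ALPHA_GRID`, the features and synthetic corpus are toy choices, and a 1-nearest-neighbor lookup stands in for the self-supervised learner.

```python
import random

def extract_features(series):
    """Toy features X: lag-1 autocorrelation and trend slope scaled by the std."""
    n = len(series)
    mean = sum(series) / n
    var = sum((x - mean) ** 2 for x in series) / n
    if var == 0:
        return (0.0, 0.0)
    acf1 = sum((series[i] - mean) * (series[i - 1] - mean)
               for i in range(1, n)) / (n * var)
    t_mean = (n - 1) / 2
    slope = (sum((i - t_mean) * (x - mean) for i, x in enumerate(series))
             / sum((i - t_mean) ** 2 for i in range(n)))
    return (acf1, slope / var ** 0.5)

ALPHA_GRID = [0.1, 0.3, 0.5, 0.7, 0.9]  # hypothetical predefined parameter space

def ses_forecast(train, horizon, alpha):
    """Simple exponential smoothing: a flat forecast at the last smoothed level."""
    level = train[0]
    for x in train[1:]:
        level = alpha * x + (1 - alpha) * level
    return [level] * horizon

def holdout_mae(series, horizon, alpha):
    train, test = series[:-horizon], series[-horizon:]
    forecast = ses_forecast(train, horizon, alpha)
    return sum(abs(f - y) for f, y in zip(forecast, test)) / horizon

def best_alpha(series, horizon=6):
    """Offline exhaustive search: the output Y for one training series."""
    return min(ALPHA_GRID, key=lambda a: holdout_mae(series, horizon, a))

def predict_alpha(features, X, y):
    """Stand-in learner: return the label of the nearest training feature vector."""
    dists = [sum((a - b) ** 2 for a, b in zip(features, xi)) for xi in X]
    return y[dists.index(min(dists))]

# Offline: explore every setting in the grid for each series (synthetic data).
rng = random.Random(1)
corpus = []
for _ in range(20):
    slope = rng.uniform(1.0, 2.0)
    corpus.append([slope * t + rng.uniform(-0.3, 0.3) for t in range(30)])  # trending
    corpus.append([10.0 + rng.uniform(-0.5, 0.5) for _ in range(30)])       # flat, noisy
X = [extract_features(s) for s in corpus]
y = [best_alpha(s) for s in corpus]

# Online: predict the hyperparameter directly, then fit once and forecast.
new_series = [3.0 + 1.5 * t for t in range(30)]
predicted_alpha = predict_alpha(extract_features(new_series), X, y)
forecast = ses_forecast(new_series, 6, predicted_alpha)
print(predicted_alpha, forecast[0])
```

For a new series, the search over `ALPHA_GRID` is replaced by a single lookup, so the per-series cost no longer grows with the size of the parameter space.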

Experiment results with Facebook Infrastructure data and open source data: We empirically evaluated our algorithms on both internal and external data sets and reached similar conclusions on each. SSL frameworks can dramatically improve the efficiency of model selection and hyperparameter tuning, reducing running time by 6-20x with comparable forecasting accuracy.

Why it matters:

Forecasting is one of the core data science and machine learning tasks we perform at Facebook, so providing fast, reliable, and accurate forecasting results with large amounts of time series data is important for our business. The applications for this framework include capacity planning and management, demand forecasting, energy prediction, and anomaly detection. Rapid advances in computing technologies have enabled businesses to keep track of a large number of time series data sets. Hence, the need to regularly forecast millions of time series is becoming increasingly common. It remains challenging to obtain fast and accurate forecasts for a large number of time series.

Our SSL framework offers an efficient way to provide high-quality forecasting results at low computational cost and with short running time. This approach is independent of specific forecasting models and algorithms, so we still enjoy the advantages of individual forecasting techniques, such as the interpretability of the Prophet model. Preliminary analysis shows that our framework could be extended to model recommendation and could enhance the Bayesian optimization algorithm in our in-house Ax library.

Read the full paper:

Self-supervised learning for fast and scalable time series hyperparameter tuning

This is joint work with Peiyi Zhang, PhD candidate in statistics at Purdue University, Yang Yu, Nikolay Pavlovich Laptev, Caner Komurlu, Peng Gao, and Ginger Holt. We would also like to thank Alessandro Panella and Dario Benavides for their valuable feedback.

Written By

Xiaodong Jiang

Research Data Scientist