December 02, 2019

We introduce a new framework that combines the best of traditional statistical models and neural network models for time series modeling, a task central to many important applications such as forecasting and anomaly detection. Classical models such as autoregression (AR) exploit the inherent characteristics of a time series, leading to a more concise model. This is possible because the model makes strong assumptions about the data, such as the true order of the AR process. These models, however, do not scale well to large volumes of training data, particularly if there are long-range dependencies or complex interactions.

To overcome these scalability challenges, practitioners have turned to sequence-to-sequence models, which have become popular in natural language processing. RNN-based methods, in particular, allow for a more expressive model without requiring elaborate features. While these models scale well to applications with rich data, they can be overly complex for typical time series data, resulting in a lack of interpretability. We needed a scalable, extensible, and interpretable model to bridge the statistical and deep learning-based approaches. Our proposed framework models proven classic AR methods using a feedforward neural network. The feedforward model is not only as interpretable as AR models but also scalable and easier to use.

Our basic building block, termed AR-Net, has two distinct advantages over its traditional counterpart:

AR-Net scales well to large orders, making it possible to estimate long-range dependencies (important in high-resolution monitoring applications, such as those in the data center domain).

AR-Net automatically selects and estimates the important coefficients of a sparse AR process, eliminating the need to know the true order of the AR process.

Consider a time series *y1, ..., yt*, expressed as an AR process. To predict the value *yt* at the next time step, each of the *p* preceding values *yt-1, ..., yt-p* is multiplied by a learned weight *wi* (called an AR coefficient), and the products are summed.
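
The weighted sum behind a one-step AR forecast can be sketched in a few lines of plain Python. This is only an illustration: the function name `ar_predict`, the optional `intercept` argument, and the example numbers are assumptions made for this sketch and do not come from the AR-Net code.

```python
def ar_predict(history, weights, intercept=0.0):
    """One-step AR forecast: intercept + sum over i of w_i * y_{t-i}.

    history holds past values in time order (oldest first).
    weights[0] is w_1 (applied to y_{t-1}), weights[1] is w_2, and so on.
    """
    p = len(weights)
    if len(history) < p:
        raise ValueError("need at least p past values")
    # Take the p most recent values, most recent first.
    recent = history[-p:][::-1] if p else []
    return intercept + sum(w * y for w, y in zip(weights, recent))


# Hypothetical example with order p = 2.
history = [1.0, 2.0, 3.0, 4.0]
weights = [0.5, 0.25]
forecast = ar_predict(history, weights)
print(forecast)  # 0.5 * 4.0 + 0.25 * 3.0 = 2.75
```

The order *p* is simply the number of weights, so a larger order means a longer window of past values enters each forecast.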

For a large *p* (called the order), the traditional approach can become impractically slow to train. However, a large order is required for monitoring high-resolution data sampled at millisecond or second intervals. To overcome this scalability challenge, we train a neural network with stochastic gradient descent to learn the AR coefficients. If we know the true order of the process, AR-Net learns weights nearly identical to those of classic AR implementations and is equally good at predicting the next value of the time series.
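
To make the idea of learning AR coefficients with stochastic gradient descent concrete, here is a minimal sketch that uses only the Python standard library rather than PyTorch. The helper names, the AR(2) coefficients, the learning rate, the number of epochs, and the omission of an intercept are all assumptions chosen for illustration, not settings from the AR-Net work.

```python
import random


def simulate_ar(weights, n, noise_std, rng):
    """Generate n values of an AR process with the given coefficients."""
    p = len(weights)
    y = [0.0] * p  # zero start-up values, dropped before returning
    for _ in range(n):
        past = y[-p:][::-1]  # y_{t-1}, ..., y_{t-p}
        y.append(sum(w * v for w, v in zip(weights, past))
                 + rng.gauss(0.0, noise_std))
    return y[p:]


def fit_ar_sgd(series, p, lr=0.001, epochs=20, seed=0):
    """Fit p AR coefficients by per-sample SGD on the squared error."""
    rng = random.Random(seed)
    # Each sample pairs the p previous values (most recent first) with the target.
    samples = [(series[t - p:t][::-1], series[t]) for t in range(p, len(series))]
    w = [0.0] * p
    for _ in range(epochs):
        rng.shuffle(samples)
        for lags, target in samples:
            err = sum(wi * x for wi, x in zip(w, lags)) - target
            # Gradient of 0.5 * err**2 with respect to w_i is err * x_i.
            w = [wi - lr * err * x for wi, x in zip(w, lags)]
    return w


rng = random.Random(42)
true_w = [0.6, -0.3]  # hypothetical AR(2) coefficients
series = simulate_ar(true_w, n=2000, noise_std=1.0, rng=rng)
learned_w = fit_ar_sgd(series, p=2)
print("true:   ", true_w)
print("learned:", [round(w, 3) for w in learned_w])
```

A network with no hidden layers and a linear output is exactly this weighted sum, which is why the learned weights can be read directly as AR coefficients.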

Left: AR-equivalent neural network without hidden layers (simplest form of AR-Net). Right: AR-inspired neural network with n hidden layers (general AR-Net).

If the order is unknown, AR-Net automatically learns the relevant weights, even if the underlying data is generated by a noisy and extremely sparse AR process. We achieve this by adding a small regularization penalty on the learned weights. In such a sparse setting, AR-Net clearly outperforms classic AR.
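
The effect of regularizing the weights can be illustrated with a small standard-library sketch. The text above does not spell out the regularizer, so this example substitutes a plain L1 penalty applied by soft-thresholding after each mini-batch gradient step; that choice, the helper names, the coefficients at lags 1, 3, and 10, and all hyperparameters are assumptions for illustration only.

```python
import random


def simulate_sparse_ar(coeffs, n, rng, noise_std=1.0):
    """coeffs maps lag -> weight; every other lag has weight zero."""
    max_lag = max(coeffs)
    y = [0.0] * max_lag  # zero start-up values, dropped before returning
    for _ in range(n):
        value = sum(w * y[-lag] for lag, w in coeffs.items())
        y.append(value + rng.gauss(0.0, noise_std))
    return y[max_lag:]


def soft_threshold(x, t):
    """Shrink x toward zero by t; values within [-t, t] become exactly 0."""
    if x > t:
        return x - t
    if x < -t:
        return x + t
    return 0.0


def fit_sparse_ar(series, p, lam=0.02, lr=0.01, epochs=25, batch=32, seed=0):
    """Mini-batch SGD on squared error with an L1 penalty (assumed regularizer)."""
    rng = random.Random(seed)
    samples = [(series[t - p:t][::-1], series[t]) for t in range(p, len(series))]
    w = [0.0] * p
    for _ in range(epochs):
        rng.shuffle(samples)
        for start in range(0, len(samples), batch):
            chunk = samples[start:start + batch]
            grad = [0.0] * p
            for lags, target in chunk:
                err = sum(wi * x for wi, x in zip(w, lags)) - target
                for i, x in enumerate(lags):
                    grad[i] += err * x / len(chunk)
            # Gradient step on the squared error, then shrink toward zero.
            w = [soft_threshold(wi - lr * g, lr * lam) for wi, g in zip(w, grad)]
    return w


rng = random.Random(7)
true_coeffs = {1: 0.4, 3: 0.3, 10: -0.2}  # hypothetical sparse process
series = simulate_sparse_ar(true_coeffs, n=4000, rng=rng)
learned = fit_sparse_ar(series, p=12)  # order deliberately larger than needed
for lag, w in enumerate(learned, start=1):
    print(f"lag {lag:2d}: {w:+.3f}")
```

In this sketch the fitted order (12) exceeds the largest relevant lag, and the penalty pulls the weights at the unused lags toward zero while the weights at lags 1, 3, and 10 stay clearly larger in magnitude.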

AR-Net effectively learns the sparse weights, setting the irrelevant weights to zero, whereas classic AR overestimates the irrelevant weights. Both models were fitted on data generated by a noisy AR-3 process with sparsity (only lags 1, 3, and 10 are non-zero).

Our work demonstrates that modeling time series with neural networks can be just as interpretable as doing so using classical methods. Furthermore, we make it computationally tractable and simple for the practitioner to fit a sparse AR model of a high order. This makes it possible to model temporal data without having to determine the true order of the underlying AR process, allowing the model to automatically learn accurate long-range dependencies without overfitting.

Computational time to fit a classic AR implementation (statsmodels in Python) and AR-Net (using PyTorch in Python).

We call our model AR-Net because it can seamlessly be expanded to include an arbitrary number of hidden layers. Adding layers will improve the predictive power of the model, but at the expense of interpretability. Our goal here is to show that even the simplest form of AR-Net is a strong alternative to classic AR implementations, particularly when dealing with sparse or high-order AR processes. This work paves the way for creating a deep learning model that semi-explicitly incorporates time series dynamics, such as autoregression, trend shifts, and seasonality. Building on existing open source tools such as Prophet and PyTorch will help make this feasible. We are excited about how our work may help empower time series practitioners in their daily work.

AR-Net: A simple Auto-Regressive Neural Network for time series

PhD Student at Stanford University, Sustainable Systems Lab * Sponsored in part by a research agreement between Stanford University and Total S.A.

Research Scientist at Facebook

Faculty at Stanford University, Sustainable Systems Lab