Enhancing the Locality and Breaking the Memory Bottleneck of Transformer on Time Series Forecasting

06/29/2019
by Shiyang Li, et al.

Time series forecasting is an important problem across many domains, including predictions of solar plant energy output, electricity consumption, and traffic congestion. In this paper, we propose to tackle such forecasting problems with the Transformer. Although the Transformer performed impressively in our preliminary study, we found two major weaknesses: (1) locality-agnostic attention: the point-wise dot-product self-attention in the canonical Transformer architecture is insensitive to local context, which can make the model prone to anomalies in time series; (2) memory bottleneck: the space complexity of the canonical Transformer grows quadratically with the sequence length L, making modeling of long time series infeasible. To address these two issues, we first propose convolutional self-attention, which produces queries and keys with causal convolution so that local context is better incorporated into the attention mechanism. Then, we propose the LogSparse Transformer with only O(L(log L)^2) memory cost, improving forecasting for time series of fine granularity under a constrained memory budget. Our experiments on both synthetic data and real-world datasets show that it compares favorably to the state-of-the-art.
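The two ideas in the abstract can be illustrated with a minimal sketch. The `causal_conv1d` function below shows how queries and keys could be produced by a left-padded (causal) convolution so that position t sees only positions up to t, and `logsparse_indices` shows one plausible LogSparse pattern in which each cell attends to itself and to cells at exponentially growing offsets, giving O(log L) attended positions per cell. Both function names and the exact index pattern are illustrative assumptions, not the paper's reference implementation.

```python
import numpy as np

def causal_conv1d(x, w):
    """Causal 1D convolution over a sequence.

    x: (L, d) input sequence; w: (k, d, d) kernel of width k.
    Output at step t depends only on x[t-k+1 .. t], so no future
    information leaks into the queries/keys it produces.
    """
    k, L, d = w.shape[0], x.shape[0], x.shape[1]
    # Left-pad with zeros so the receptive field never extends past t.
    xp = np.vstack([np.zeros((k - 1, d)), x])
    return np.stack(
        [sum(xp[t + j] @ w[j] for j in range(k)) for t in range(L)]
    )

def logsparse_indices(l):
    """Positions cell l may attend to under a log-sparse pattern:
    itself plus l - 2^0, l - 2^1, l - 2^2, ... (clipped at 0)."""
    idx, step = {l}, 1
    while l - step >= 0:
        idx.add(l - step)
        step *= 2
    return sorted(idx)
```

For example, `logsparse_indices(8)` yields `[0, 4, 6, 7, 8]`: only about log2(L) cells per position, so stacking O(log L) such sparse layers keeps total memory near O(L (log L)^2) while information can still flow between any two positions.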
