Combining Recurrent, Convolutional, and Continuous-time Models with Linear State-Space Layers

10/26/2021
by Albert Gu, et al.

Recurrent neural networks (RNNs), temporal convolutions, and neural differential equations (NDEs) are popular families of deep learning models for time-series data, each with unique strengths and tradeoffs in modeling power and computational efficiency. We introduce a simple sequence model inspired by control systems that generalizes these approaches while addressing their shortcomings. The Linear State-Space Layer (LSSL) maps a sequence u ↦ y by simply simulating a linear continuous-time state-space representation ẋ = Ax + Bu, y = Cx + Du. Theoretically, we show that LSSL models are closely related to the three aforementioned families of models and inherit their strengths. For example, they generalize convolutions to continuous time, explain common RNN heuristics, and share features of NDEs such as time-scale adaptation. We then incorporate and generalize recent theory on continuous-time memorization to introduce a trainable subset of structured matrices A that endow LSSLs with long-range memory. Empirically, stacking LSSL layers into a simple deep neural network obtains state-of-the-art results across time-series benchmarks for long dependencies in sequential image classification, real-world healthcare regression tasks, and speech. On a difficult speech classification task with length-16000 sequences, LSSL outperforms prior approaches by 24 accuracy points, and even outperforms baselines that use hand-crafted features on 100x shorter sequences.
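
To make the recurrent view concrete, below is a minimal NumPy sketch of the LSSL map u ↦ y, assuming a standard bilinear (Tustin) discretization of the continuous-time system ẋ = Ax + Bu, y = Cx + Du. The helper names (discretize, lssl_forward) and the toy parameters are illustrative assumptions, not the paper's API.

    # Minimal sketch (not the authors' implementation): simulate the LSSL map
    # u -> y by discretizing x' = Ax + Bu, y = Cx + Du with a bilinear
    # transform and unrolling the resulting linear recurrence.
    import numpy as np

    def discretize(A, B, dt):
        """Bilinear (Tustin) discretization of the continuous pair (A, B)."""
        I = np.eye(A.shape[0])
        inv = np.linalg.inv(I - (dt / 2.0) * A)
        A_bar = inv @ (I + (dt / 2.0) * A)   # discrete state matrix
        B_bar = inv @ (dt * B)               # discrete input matrix
        return A_bar, B_bar

    def lssl_forward(u, A, B, C, D, dt=1.0):
        """Map a scalar input sequence u to an output sequence y via the
        recurrence x_k = A_bar x_{k-1} + B_bar u_k, y_k = C x_k + D u_k."""
        A_bar, B_bar = discretize(A, B, dt)
        x = np.zeros(A.shape[0])
        y = np.empty_like(u, dtype=float)
        for k, u_k in enumerate(u):
            x = A_bar @ x + B_bar * u_k
            y[k] = C @ x + D * u_k
        return y

    # Toy usage: a random stable 2-state system applied to a length-8 signal.
    rng = np.random.default_rng(0)
    A = -np.eye(2) + 0.1 * rng.standard_normal((2, 2))
    B = rng.standard_normal(2)
    C = rng.standard_normal(2)
    print(lssl_forward(rng.standard_normal(8), A, B, C, D=0.0, dt=0.1))

Unrolling this recurrence also exposes the convolutional view mentioned in the abstract: y equals the convolution of u with the kernel (CB̄, CĀB̄, CĀ²B̄, …) plus the D·u passthrough, which is why the same layer admits both a recurrent and a convolutional computation.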


