How to Train Your HiPPO: State Space Models with Generalized Orthogonal Basis Projections

06/24/2022
by Albert Gu, et al.

Linear time-invariant state space models (SSMs) are a classical class of models from engineering and statistics that has recently been shown to be very promising in machine learning through the Structured State Space sequence model (S4). A core component of S4 is initializing the SSM state matrix to a particular matrix called a HiPPO matrix, which was empirically important for S4's ability to handle long sequences. However, the specific matrix that S4 uses was derived in previous work for a particular time-varying dynamical system, and its use in a time-invariant SSM had no known mathematical interpretation. Consequently, the theoretical mechanism by which S4 models long-range dependencies has remained unexplained. We derive a more general and intuitive formulation of the HiPPO framework, which provides a simple mathematical interpretation of S4 as a decomposition onto exponentially-warped Legendre polynomials, explaining its ability to capture long dependencies. Our generalization introduces a theoretically rich class of SSMs that also lets us derive more intuitive S4 variants for other bases such as the Fourier basis, and explains other aspects of training S4, such as how to initialize the important timescale parameter. These insights improve S4's performance to 86% on the Long Range Arena benchmark, with 96% on the most difficult Path-X task.
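
For readers unfamiliar with the HiPPO matrix mentioned above, the following is a minimal sketch of the commonly cited HiPPO-LegS construction, with entries A[n][k] = -sqrt(2n+1)*sqrt(2k+1) for n > k, -(n+1) for n = k, and 0 otherwise, and B[n] = sqrt(2n+1). The function names `hippo_legs` and `euler_step` are hypothetical, and the forward-Euler update is an assumption chosen for brevity; it is not the discretization used by S4 and is shown only to illustrate where the timescale (step size `dt`) enters.

```python
import math


def hippo_legs(n_state):
    """Build the HiPPO-LegS pair (A, B) as plain Python lists.

    A is lower triangular with negative entries on and below the diagonal.
    """
    A = [[0.0] * n_state for _ in range(n_state)]
    B = [math.sqrt(2 * n + 1) for n in range(n_state)]
    for n in range(n_state):
        for k in range(n_state):
            if n > k:
                A[n][k] = -math.sqrt(2 * n + 1) * math.sqrt(2 * k + 1)
            elif n == k:
                A[n][k] = -(n + 1.0)
    return A, B


def euler_step(A, B, x, u, dt):
    """One forward-Euler step of x' = A x + B u (illustrative assumption only)."""
    n_state = len(x)
    return [
        x[i] + dt * (sum(A[i][j] * x[j] for j in range(n_state)) + B[i] * u)
        for i in range(n_state)
    ]


if __name__ == "__main__":
    A, B = hippo_legs(4)
    state = [0.0] * 4
    # Feed a constant input for a few steps; dt plays the role of the timescale.
    for _ in range(3):
        state = euler_step(A, B, state, u=1.0, dt=0.01)
    print(state)
```

A smaller `dt` makes the state respond more slowly to each input, which is one way to see why the initialization of the timescale parameter matters for long sequences.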

research
06/23/2022

On the Parameterization and Initialization of Diagonal State Space Models

State space models (SSM) have recently been shown to be very effective a...
research
10/31/2021

Efficiently Modeling Long Sequences with Structured State Spaces

A central goal of sequence modeling is designing a single principled mod...
research
07/04/2022

Fiedler Linearizations of Multivariable State-Space System and its Associated System Matrix

Linearization is a standard method in the computation of eigenvalues and...
research
09/26/2022

Liquid Structural State-Space Models

A proper parametrization of state transition matrices of linear state-sp...
research
12/30/2022

PAC-Bayesian-Like Error Bound for a Class of Linear Time-Invariant Stochastic State-Space Models

In this paper we derive a PAC-Bayesian-Like error bound for a class of s...
research
02/27/2023

Diagonal State Space Augmented Transformers for Speech Recognition

We improve on the popular conformer architecture by replacing the depthw...
research
05/16/2023

Counterfactual Outcome Prediction using Structured State Space Model

Counterfactual outcome prediction in longitudinal data has recently gain...
