Sequential Multi-Dimensional Self-Supervised Learning for Clinical Time Series

by Aniruddh Raghu, et al.

Self-supervised learning (SSL) for clinical time series data has received significant attention in recent literature, since these data are rich and provide important information about a patient's physiological state. However, most existing SSL methods for clinical time series are designed for unimodal time series, such as a sequence of structured features (e.g., lab values and vital signs) or an individual high-dimensional physiological signal (e.g., an electrocardiogram). These methods cannot be readily extended to multimodal time series, in which both structured features and high-dimensional data are recorded at each timestep in the sequence. In this work, we address this gap and propose a new SSL method, Sequential Multi-Dimensional SSL, in which an SSL loss is applied both at the level of the entire sequence and at the level of the individual high-dimensional data points in the sequence, in order to better capture information at both scales. Our strategy is agnostic to the specific form of loss function used at each level: it can be contrastive, as in SimCLR, or non-contrastive, as in VICReg. We evaluate our method on two real-world clinical datasets, where the time series contain sequences of (1) high-frequency electrocardiograms and (2) structured data from lab values and vital signs. Our experimental results indicate that pre-training with our method and then fine-tuning on downstream tasks improves performance over baselines on both datasets, and in several settings can lead to improvements across different self-supervised loss functions.
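The two-level idea in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the function names (`nt_xent`, `two_level_loss`), the `weight` parameter, the additive combination of the two terms, and the choice to treat every timestep in the batch as its own contrastive example are all assumptions made for illustration. A SimCLR-style NT-Xent loss is used at both levels here, though the abstract notes that a non-contrastive loss such as VICReg could be substituted.

```python
import numpy as np

def nt_xent(z1, z2, temperature=0.1):
    """SimCLR-style NT-Xent loss for paired embeddings z1[i] <-> z2[i]."""
    n = z1.shape[0]
    z = np.concatenate([z1, z2], axis=0)
    z = z / np.linalg.norm(z, axis=1, keepdims=True)  # cosine similarity
    sim = z @ z.T / temperature
    np.fill_diagonal(sim, -np.inf)  # a sample is never its own negative
    # Index of the positive partner for each of the 2n rows.
    pos = np.concatenate([np.arange(n, 2 * n), np.arange(0, n)])
    # Numerically stable log-sum-exp over each row.
    row_max = sim.max(axis=1, keepdims=True)
    lse = row_max[:, 0] + np.log(np.exp(sim - row_max).sum(axis=1))
    return float(np.mean(lse - sim[np.arange(2 * n), pos]))

def two_level_loss(seq_a, seq_b, comp_a, comp_b, weight=1.0):
    """Hypothetical combination of a sequence-level and a component-level loss.

    seq_a, seq_b:   (batch, d) embeddings of two augmented views of each sequence.
    comp_a, comp_b: (batch, T, d) embeddings of two augmented views of the
                    high-dimensional signal (e.g. an ECG) at each timestep.
    weight:         assumed trade-off between the two terms.
    """
    d = comp_a.shape[-1]
    seq_term = nt_xent(seq_a, seq_b)
    # Assumption: flatten timesteps so each one is a separate example.
    comp_term = nt_xent(comp_a.reshape(-1, d), comp_b.reshape(-1, d))
    return seq_term + weight * comp_term
```

In practice the embeddings would come from encoders trained by backpropagating through this loss in an automatic-differentiation framework; the NumPy version only shows how the two terms are formed and combined.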


Leveraging Time Irreversibility with Order-Contrastive Pre-training

Label-scarce, high-dimensional domains such as healthcare present a chal...

Multi-view self-supervised learning for multivariate variable-channel time series

Labeling of multivariate biomedical time series data is a laborious and ...

Evaluating Contrastive Learning on Wearable Timeseries for Downstream Clinical Outcomes

Vast quantities of person-generated health data (wearables) are collecte...

Large Scale Time-Series Representation Learning via Simultaneous Low and High Frequency Feature Bootstrapping

Learning representation from unlabeled time series data is a challenging...

ShapeWordNet: An Interpretable Shapelet Neural Network for Physiological Signal Classification

Physiological signals are high-dimensional time series of great practica...

VIbCReg: Variance-Invariance-better-Covariance Regularization for Self-Supervised Learning on Time Series

Self-supervised learning for image representations has recently had many...

DuETT: Dual Event Time Transformer for Electronic Health Records

Electronic health records (EHRs) recorded in hospital settings typically...
