Missing Value Imputation on Multidimensional Time Series

by Parikshit Bansal, et al.

We present DeepMVI, a deep learning method for missing value imputation in multidimensional time-series datasets. Missing values are commonplace in decision support platforms that aggregate data over long time stretches from disparate sources, and reliable data analytics calls for careful handling of missing data. One strategy is to impute the missing values, and a wide variety of algorithms exist, spanning simple interpolation, matrix factorization methods like SVD, statistical models like Kalman filters, and recent deep learning methods. We show that these often provide worse results on aggregate analytics than simply excluding the missing data. DeepMVI uses a neural network to combine fine-grained and coarse-grained patterns along a time series with trends from related series across categorical dimensions. After failing with off-the-shelf neural architectures, we design our own network that includes a temporal transformer with a novel convolutional window feature, and kernel regression with learned embeddings. The parameters and their training are designed carefully to generalize across different placements of missing blocks and data characteristics. Experiments across nine real datasets and four different missing scenarios, comparing against seven existing methods, show that DeepMVI is significantly more accurate, reducing error by more than 50% in more than half the cases, compared to the best existing method. Although DeepMVI is slower than simpler matrix factorization methods, we justify the increased time overhead by showing that it is the only option that provided overall more accurate analytics than dropping missing values.
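To make the phrase "kernel regression with learned embeddings" concrete, the following is a minimal sketch of the general idea and not the authors' implementation. It estimates a missing value as a kernel-weighted average of related series observed at the same time step. The function name, the RBF kernel, the `gamma` parameter, and the hand-set embeddings are all assumptions made for illustration; in a trained model the embeddings would be learned rather than fixed.

```python
import numpy as np

def kernel_regression_impute(values, mask, embeddings, target, t, gamma=1.0):
    """Estimate values[target, t] from sibling series observed at time t.

    values:     (n_series, n_steps) array; entries where mask is False are ignored
    mask:       (n_series, n_steps) boolean array, True where a value is observed
    embeddings: (n_series, d) array; fixed here, learned in a trained model
    gamma:      hypothetical RBF bandwidth parameter
    """
    # Candidate siblings: every other series with an observed value at time t.
    siblings = [j for j in range(values.shape[0]) if j != target and mask[j, t]]
    if not siblings:
        return float("nan")
    # RBF kernel weight from the squared distance between embeddings.
    dists = np.sum((embeddings[siblings] - embeddings[target]) ** 2, axis=1)
    weights = np.exp(-gamma * dists)
    # Nadaraya-Watson estimate: weighted average of the sibling values.
    return float(np.sum(weights * values[siblings, t]) / np.sum(weights))

# Toy data: series 0 is missing at t=1; series 1 shares its embedding,
# series 2 is far away in embedding space and gets a negligible weight.
values = np.array([[1.0, np.nan], [2.0, 10.0], [3.0, 30.0]])
mask = ~np.isnan(values)
embeddings = np.array([[0.0, 0.0], [0.0, 0.0], [10.0, 10.0]])
estimate = kernel_regression_impute(values, mask, embeddings, target=0, t=1)
```

In this toy setup the estimate is dominated by series 1, because its embedding coincides with that of the target series.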




