Multistage Large Segment Imputation Framework Based on Deep Learning and Statistic Metrics

by JinSheng Yang, et al.

Missing values are a common and unavoidable problem in sensor data, and researchers have made numerous attempts at missing value imputation, particularly with deep learning models. However, the specific data distribution and data periods of real sensor data are rarely considered, which makes it difficult to choose appropriate evaluation indexes and models for different sensors. To address this issue, this study proposes an adaptive multistage imputation framework based on deep learning. The framework introduces a mixture measurement index that combines low- and higher-order statistics to capture the data distribution, offering a new perspective on imputation performance metrics that is more adaptive and effective than the traditional mean squared error. To account for data periods, a multistage imputation strategy and a dynamic data length are introduced into the imputation process. Experimental results on different types of sensor data show that the multistage imputation strategy and the mixture index are superior and that imputation quality improves to some extent, particularly for the large segment imputation problem. The code and experimental results have been uploaded to GitHub.
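To make the idea of a mixture index of low- and higher-order statistics concrete, the following is a minimal sketch, not the authors' definition. It assumes the index is a weighted sum of absolute differences in mean, standard deviation, skewness, and excess kurtosis between the reference and imputed series; the function names (`moments`, `mixture_index`, `mse`), the equal weights, and the synthetic sine signal are all hypothetical choices made for illustration.

```python
import numpy as np

def moments(x):
    """Return [mean, std, skewness, excess kurtosis] of a 1-D series."""
    x = np.asarray(x, dtype=float)
    mu = x.mean()
    sigma = x.std()
    if sigma == 0.0:
        # A constant series has no defined shape statistics; report zeros.
        return np.array([mu, 0.0, 0.0, 0.0])
    z = (x - mu) / sigma
    return np.array([mu, sigma, (z ** 3).mean(), (z ** 4).mean() - 3.0])

def mixture_index(reference, imputed, weights=(0.25, 0.25, 0.25, 0.25)):
    """Hypothetical mixture index: weighted sum of absolute moment gaps.

    The equal weights are an assumption, not taken from the paper.
    """
    gap = np.abs(moments(reference) - moments(imputed))
    return float(np.dot(np.asarray(weights, dtype=float), gap))

def mse(reference, imputed):
    """Traditional mean squared error, for comparison."""
    r = np.asarray(reference, dtype=float)
    p = np.asarray(imputed, dtype=float)
    return float(((r - p) ** 2).mean())

# Synthetic periodic "sensor" signal with one large missing segment.
t = np.linspace(0.0, 4.0 * np.pi, 400)
signal = np.sin(t)
segment = slice(100, 200)

# Candidate A: fill the whole segment with the series mean (flattens it).
mean_fill = signal.copy()
mean_fill[segment] = signal.mean()

# Candidate B: keep the shape of the segment but add a small offset.
offset_fill = signal.copy()
offset_fill[segment] = signal[segment] + 0.01

for name, candidate in [("mean fill", mean_fill), ("offset fill", offset_fill)]:
    print(name, "mse =", mse(signal, candidate),
          "mixture index =", mixture_index(signal, candidate))
```

In this sketch the mean fill distorts the spread and shape of the series, so its moment gaps are large, while the offset fill leaves the distribution almost unchanged. Inspecting the individual entries of `moments` shows which statistic an imputation has disturbed, which a single pointwise error such as MSE does not reveal.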




Are deep learning models superior for missing data imputation in large surveys? Evidence from an empirical comparison

Multiple imputation (MI) is the state-of-the-art approach for dealing wi...

Transformed Distribution Matching for Missing Value Imputation

We study the problem of imputing missing values in a dataset, which has ...

Multiple Imputation for Biomedical Data using Monte Carlo Dropout Autoencoders

Due to complex experimental settings, missing values are common in biome...

Online Missing Value Imputation and Correlation Change Detection for Mixed-type Data via Gaussian Copula

Most data science algorithms require complete observations, yet many dat...

Controllable Missingness from Uncontrollable Missingness: Joint Learning Measurement Policy and Imputation

Due to the cost or interference of measurement, we need to control measu...

Goodness (of fit) of Imputation Accuracy: The GoodImpact Analysis

In statistical survey analysis, (partial) non-responders are integral el...

CDR: Conservative Doubly Robust Learning for Debiased Recommendation

In recommendation systems (RS), user behavior data is observational rath...
