Scalable inference in functional linear regression with streaming data

by Jinhan Xie, et al.

Traditional static functional data analysis is facing new challenges due to streaming data, where data constantly flow in. A major challenge is that storing such an ever-increasing amount of data in memory is nearly impossible. In addition, existing inferential tools in online learning are mainly developed for finite-dimensional problems, while inference methods for functional data are focused on the batch learning setting. In this paper, we tackle these issues by developing functional stochastic gradient descent algorithms and proposing an online bootstrap resampling procedure to systematically study the inference problem for functional linear regression. In particular, the proposed estimation and inference procedures use only one pass over the data; thus they are easy to implement and suitable for settings where data arrive in a streaming manner. Furthermore, we establish the convergence rate as well as the asymptotic distribution of the proposed estimator. Meanwhile, the perturbed estimator from the bootstrap procedure is shown to enjoy the same theoretical properties, which provides the theoretical justification for our online inference tool. As far as we know, this is the first inference result on the functional linear regression model with streaming data. Simulation studies are conducted to investigate the finite-sample performance of the proposed procedure. An application is illustrated with the Beijing multi-site air-quality data.
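The abstract does not state the update rule, so the following is only a minimal sketch of the general idea, not the authors' algorithm. It assumes a scalar-on-function model y = ∫ x(t) β(t) dt + ε with curves observed on a common equally spaced grid, a plain L2 gradient step with step size step0 · n^(-decay), a running average of the iterates, and bootstrap copies whose gradients are multiplied by independent Exponential(1) weights (mean 1, variance 1). All function names, parameters, and the simulated data are hypothetical.

```python
import numpy as np

def simulate_stream(n, grid, beta_true, rng, noise_sd=0.1):
    """Yield (x, y) one pair at a time; x is a random curve sampled on `grid`."""
    dt = grid[1] - grid[0]
    k = np.arange(1, 6)
    # Five cosine basis functions evaluated on the grid (assumed design).
    basis = np.sqrt(2.0) * np.cos(np.pi * np.outer(k, grid))
    for _ in range(n):
        scores = rng.normal(size=k.size) / k       # decaying score variances
        x = scores @ basis
        y = np.sum(x * beta_true) * dt + noise_sd * rng.normal()
        yield x, y

def online_fsgd_bootstrap(stream, grid, n_boot=50, step0=0.5, decay=0.5, seed=0):
    """One pass over `stream`: averaged SGD estimate plus perturbed copies."""
    rng = np.random.default_rng(seed)
    dt = grid[1] - grid[0]
    m = grid.size
    beta = np.zeros(m)                 # current SGD iterate
    beta_bar = np.zeros(m)             # running average of iterates
    boot = np.zeros((n_boot, m))       # perturbed iterates, one row per copy
    boot_bar = np.zeros((n_boot, m))   # running averages of perturbed iterates
    for n, (x, y) in enumerate(stream, start=1):
        gamma = step0 * n ** (-decay)
        # Gradient step on the squared loss; the integral is a Riemann sum.
        resid = y - np.sum(x * beta) * dt
        beta = beta + gamma * resid * x
        beta_bar += (beta - beta_bar) / n
        # Same step for every bootstrap copy, scaled by a random weight.
        w = rng.exponential(1.0, size=n_boot)
        resid_b = y - (boot @ x) * dt
        boot = boot + gamma * (w * resid_b)[:, None] * x[None, :]
        boot_bar += (boot - boot_bar) / n
    return beta_bar, boot_bar

# Usage on simulated data; each observation is seen once and then discarded.
grid = np.linspace(0.0, 1.0, 101)
beta_true = np.sqrt(2.0) * np.cos(np.pi * grid)
data_rng = np.random.default_rng(1)
beta_hat, beta_boot = online_fsgd_bootstrap(
    simulate_stream(2000, grid, beta_true, data_rng), grid)
pointwise_se = beta_boot.std(axis=0)   # bootstrap spread at each grid point
```

Memory use here is fixed at (n_boot + 1) pairs of grid-length arrays regardless of how many observations arrive, which is the one-pass property the abstract emphasizes. Whether the spread of the perturbed copies is a valid proxy for the estimator's variability depends on conditions the paper establishes and this sketch does not check.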



Testing exogeneity in the functional linear regression model

We propose a novel test statistic for testing exogeneity in the function...

On Scalable Inference with Stochastic Gradient Descent

In many applications involving large dataset or online updating, stochas...

Optimal One-pass Nonparametric Estimation Under Memory Constraint

For nonparametric regression in the streaming setting, where data consta...

Functional L-Optimality Subsampling for Massive Data

Massive data bring the big challenges of memory and computation for anal...

Inference on High-dimensional Single-index Models with Streaming Data

Traditional statistical methods are faced with new challenges due to str...

Online Debiased Lasso

We propose an online debiased lasso (ODL) method for statistical inferen...

Online Dual Coordinate Ascent Learning

The stochastic dual coordinate-ascent (S-DCA) technique is a useful alte...
