Optimal subsampling for quantile regression in big data

01/28/2020
by HaiYing Wang, et al.

We investigate optimal subsampling for quantile regression. We derive the asymptotic distribution of a general subsampling estimator and then derive two versions of optimal subsampling probabilities. One version minimizes the trace of the asymptotic variance-covariance matrix of a linearly transformed parameter estimator; the other minimizes that of the original parameter estimator. The former does not depend on the densities of the responses given the covariates and is easy to implement. We propose algorithms based on the optimal subsampling probabilities and establish the asymptotic distributions and asymptotic optimality of the resulting estimators. Furthermore, we propose an iterative subsampling procedure, based on the optimal subsampling probabilities for the linearly transformed parameter estimator, that scales well and can make use of available computational resources. In addition, this procedure yields standard errors for the parameter estimators without estimating the densities of the responses given the covariates. We provide numerical examples based on both simulated and real data to illustrate the proposed method.
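
As a rough illustration of the density-free subsampling idea described above, the following Python sketch computes subsampling probabilities from a pilot estimate, draws a subsample with replacement, and evaluates an inverse-probability-weighted check loss on it. The abstract does not state the form of the probabilities; the form used here, proportional to |tau - 1{residual < 0}| times the norm of the covariate vector, is an assumption made for illustration, as are all function names and the toy data. Minimizing the weighted loss over beta (for example with a linear programming solver) is left out.

```python
import math
import random


def check_loss(u, tau):
    """Quantile check loss rho_tau(u) = u * (tau - 1{u < 0})."""
    return u * (tau - (1.0 if u < 0 else 0.0))


def l_opt_probabilities(X, y, beta_pilot, tau):
    """Probabilities proportional to |tau - 1{residual < 0}| * ||x_i||.

    Assumed form (not given in the abstract); note that it needs no
    estimate of the conditional density of the response.
    """
    scores = []
    for x_i, y_i in zip(X, y):
        resid = y_i - sum(b * x for b, x in zip(beta_pilot, x_i))
        indicator = 1.0 if resid < 0 else 0.0
        norm = math.sqrt(sum(x * x for x in x_i))
        scores.append(abs(tau - indicator) * norm)
    total = sum(scores)
    return [s / total for s in scores]


def draw_subsample(probs, r, rng):
    """Draw r indices with replacement according to probs."""
    return rng.choices(range(len(probs)), weights=probs, k=r)


def weighted_subsample_loss(beta, X, y, idx, probs, tau):
    """Inverse-probability-weighted check loss over the subsample.

    Each sampled point gets weight 1 / (n * r * pi_i), so the result is an
    unbiased estimate of the full-data average check loss.
    """
    n, r = len(y), len(idx)
    total = 0.0
    for i in idx:
        resid = y[i] - sum(b * x for b, x in zip(beta, X[i]))
        total += check_loss(resid, tau) / (n * r * probs[i])
    return total


if __name__ == "__main__":
    rng = random.Random(0)
    # Toy data: intercept plus one covariate (hypothetical, for illustration).
    X = [[1.0, rng.gauss(0.0, 1.0)] for _ in range(1000)]
    y = [1.0 + 2.0 * x[1] + rng.gauss(0.0, 1.0) for x in X]
    pilot = [0.8, 1.9]  # stand-in for a pilot estimate from a small uniform subsample
    probs = l_opt_probabilities(X, y, pilot, tau=0.5)
    idx = draw_subsample(probs, r=100, rng=rng)
    print(weighted_subsample_loss(pilot, X, y, idx, probs, tau=0.5))
```

Sampling with replacement and reweighting by the inverse probabilities keeps the subsample objective unbiased for the full-data objective, which is why the weights cannot be dropped once the sampling is non-uniform.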

research
05/05/2022

Optimal subsampling for functional quantile regression

Subsampling is an efficient method to deal with massive data. In this pa...
research
01/06/2023

Optimal subsampling algorithm for composite quantile regression with distributed data

For massive data stored at multiple machines, we propose a distributed s...
research
06/14/2021

Improving Bridge estimators via f-GAN

Bridge sampling is a powerful Monte Carlo method for estimating ratios o...
research
10/10/2022

Approximating Partial Likelihood Estimators via Optimal Subsampling

With the growing availability of large-scale biomedical data, it is ofte...
research
06/18/2018

Optimal Subsampling Algorithms for Big Data Generalized Linear Models

To fast approximate the maximum likelihood estimator with massive data, ...
research
10/17/2018

On mean decomposition for summarizing conditional distributions

We propose a summary measure defined as the expected value of a random v...
research
11/11/2020

Maximum sampled conditional likelihood for informative subsampling

Subsampling is a computationally effective approach to extract informati...
