Minimax Bounds for Distributed Logistic Regression

10/03/2019
by Leighton Pate Barnes, et al.

We consider a distributed logistic regression problem where labeled data pairs (X_i,Y_i)∈R^d×{-1,1} for i=1,...,n are distributed across multiple machines in a network and must be communicated to a centralized estimator using at most k bits per labeled pair. We assume that the data X_i come independently from some distribution P_X, and that the distribution of Y_i conditioned on X_i follows a logistic model with some parameter θ∈R^d. By using a Fisher information argument, we give minimax lower bounds for estimating θ under different assumptions on the tail of the distribution P_X. We consider both ℓ^2 and logistic losses, and show that for the logistic loss our sub-Gaussian lower bound is order-optimal and cannot be improved.
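The setting above can be illustrated with a small simulation. The sketch below (not the authors' estimator; dimensions, sample size, and the parameter value are arbitrary assumptions) draws data from a logistic model with parameter θ and evaluates the empirical logistic loss at the true θ versus at θ = 0, showing that the true parameter achieves a smaller loss:

```python
import numpy as np

# Illustrative sketch of the problem setting, not the paper's method.
# d, n, and theta below are hypothetical choices for demonstration.
rng = np.random.default_rng(0)
d, n = 3, 5000
theta = np.array([0.5, -1.0, 0.25])   # hypothetical true parameter

X = rng.standard_normal((n, d))       # P_X = N(0, I_d), a sub-Gaussian choice
p = 1.0 / (1.0 + np.exp(-X @ theta))  # P(Y_i = 1 | X_i) under the logistic model
Y = np.where(rng.random(n) < p, 1, -1)

def logistic_loss(est, X, Y):
    # Empirical version of E[log(1 + exp(-Y <est, X>))]
    return np.mean(np.log1p(np.exp(-Y * (X @ est))))

loss_true = logistic_loss(theta, X, Y)
loss_zero = logistic_loss(np.zeros(d), X, Y)  # equals log 2 exactly
print(loss_true, loss_zero)
```

In the paper's communication-constrained version of this problem, each pair (X_i, Y_i) would additionally be compressed to at most k bits before reaching the estimator; the lower bounds quantify how much this quantization must inflate the loss.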
