Learning and Covering Sums of Independent Random Variables with Unbounded Support

10/24/2022
by Alkis Kalavasis, et al.

We study the problem of covering and learning sums X = X_1 + ⋯ + X_n of independent integer-valued random variables X_i (SIIRVs) with unbounded, or even infinite, support. De et al. (FOCS 2018) showed that the maximum value of the collective support of the X_i's necessarily appears in the sample complexity of learning X. In this work, we address two questions: (i) Are there general families of SIIRVs with unbounded support that can be learned with sample complexity independent of both n and the maximal element of the support? (ii) Are there general families of SIIRVs with unbounded support that admit proper sparse covers in total variation (TV) distance? For question (i), we provide a set of simple conditions under which an unbounded SIIRV can be learned with sample complexity poly(1/ϵ), bypassing the aforementioned lower bound. We address question (ii) in the general setting where each variable X_i has a unimodal probability mass function and is a different member of some, possibly multi-parameter, exponential family ℰ that satisfies some structural properties. These properties allow ℰ to contain heavy-tailed and non-log-concave distributions. Moreover, we show that for every ϵ > 0 and every k-parameter family ℰ that satisfies some structural assumptions, there exists an algorithm that uses Õ(k) · poly(1/ϵ) samples and learns a sum of n arbitrary members of ℰ to within ϵ in TV distance. The output of the learning algorithm is itself a sum of random variables whose distributions lie in the family ℰ, so the learner is proper. En route, we prove that any discrete unimodal exponential family with bounded constant-degree central moments can be approximated by the family corresponding to a bounded subset of the initial (unbounded) parameter space.
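To make the objects in the abstract concrete, the following is a minimal Python sketch, not the paper's algorithm. It builds the probability mass function of a sum of independent integer-valued variables by convolution and measures total variation distance. The choice of Poisson summands, the truncation cutoff, and all function names (poisson_pmf, convolve, sum_pmf, tv_distance) are illustrative assumptions that do not appear in the paper.

```python
import math

def poisson_pmf(lam, cutoff):
    """Poisson(lam) PMF on {0, ..., cutoff-1}; the truncation is an assumption
    made only so that the infinite support fits in a finite list."""
    pmf = [math.exp(-lam)]
    for k in range(1, cutoff):
        pmf.append(pmf[-1] * lam / k)
    return pmf

def convolve(p, q):
    """PMF of X + Y for independent X ~ p and Y ~ q on the nonnegative integers."""
    out = [0.0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out

def sum_pmf(pmfs):
    """PMF of X_1 + ... + X_n for independent summands."""
    total = [1.0]  # point mass at 0
    for p in pmfs:
        total = convolve(total, p)
    return total

def tv_distance(p, q):
    """Total variation distance: half the L1 distance between the two PMFs."""
    n = max(len(p), len(q))
    p = p + [0.0] * (n - len(p))
    q = q + [0.0] * (n - len(q))
    return 0.5 * sum(abs(a - b) for a, b in zip(p, q))

# Three different members of one exponential family (Poisson), each with
# infinite support, truncated here at 40 purely for computation.
rates = [0.5, 1.0, 1.5]
x_pmf = sum_pmf([poisson_pmf(lam, 40) for lam in rates])

# A single Poisson with the summed rate is a proper hypothesis from the same family.
candidate = poisson_pmf(sum(rates), len(x_pmf))
tv = tv_distance(x_pmf, candidate)
print(tv)
```

In this toy case the sum of Poisson variables is again Poisson, so one parameter describes the whole sum regardless of n. The paper's setting is more general, and this sketch only illustrates the definitions of an SIIRV and of TV distance.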


