Continual Mean Estimation Under User-Level Privacy
We consider the problem of continually releasing a user-level differentially private (DP) estimate of the population mean of a stream of samples. At each time instant, a user contributes a sample, and the users can arrive in arbitrary order. Until now, the requirements of continual release and user-level privacy have been considered in isolation. In practice, however, the two arise together, since users often contribute data repeatedly and multiple queries are made. We provide an algorithm that outputs a mean estimate at every time instant t such that the overall release is user-level ε-DP and has the following error guarantee: denoting by M_t the maximum number of samples contributed by a user, as long as Ω̃(1/ε) users have M_t/2 samples each, the error at time t is Õ(1/√t + √(M_t)/(tε)). This is a universal error guarantee that is valid for all arrival patterns of the users. Furthermore, it (almost) matches the existing lower bounds for the single-release setting at all time instants when users have contributed an equal number of samples.
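To make the stated guarantee concrete, the following minimal sketch evaluates the two terms of the bound, 1/√t and √(M_t)/(tε), for given values. It is only an illustration of the formula, not the paper's algorithm: the function name `error_bound_terms` and the example values are hypothetical, and the constants and logarithmic factors hidden by Õ are dropped.

```python
import math


def error_bound_terms(t, max_samples_per_user, epsilon):
    """Return (statistical_term, privacy_term) of the stated error bound.

    Assumption: constants and polylog factors hidden by the O-tilde
    notation are ignored, so these are orders of magnitude only.
    """
    if t <= 0 or max_samples_per_user <= 0 or epsilon <= 0:
        raise ValueError("t, max_samples_per_user and epsilon must be positive")
    statistical_term = 1.0 / math.sqrt(t)                            # 1/sqrt(t)
    privacy_term = math.sqrt(max_samples_per_user) / (t * epsilon)   # sqrt(M_t)/(t*eps)
    return statistical_term, privacy_term


# Hypothetical values: t = 10,000 samples so far, M_t = 100, epsilon = 1.
stat, priv = error_bound_terms(10_000, 100, 1.0)
print(f"statistical term: {stat:.4f}, privacy term: {priv:.4f}")
```

Comparing the two terms shows that, up to the dropped factors, the privacy term is no larger than the statistical term whenever M_t ≤ tε², which holds for the values used here.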