A Simple Data Structure for Maintaining a Discrete Probability Distribution
We revisit the following problem: given a set of indices S = {1, …, n} and weights w_1, …, w_n ∈ ℝ_{>0}, provide samples from S with distribution p(i) = w_i / W, where W = ∑_j w_j is the normalizing constant. In the static setting, there is a simple data structure due to Walker, called the Alias Table, that allows samples to be drawn in constant time. A more challenging task is to maintain the distribution in a dynamic setting, where elements may be added or removed, or weights may change over time; here, existing solutions restrict the permissible weights, require rebuilding the associated data structure after a number of updates, or are rather complex. In this paper, we describe, analyze, and engineer a simple data structure for maintaining a discrete probability distribution in the dynamic setting. Construction of the data structure for an arbitrary distribution takes time O(n), sampling takes expected time O(1), and updates of size Δ = O(W / n) can be processed in time O(1). To evaluate the efficiency of the data structure, we conduct an experimental study. The results suggest that the dynamic sampling performance is comparable to that of the static Alias Table, with a minor slowdown.
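For orientation, the static baseline mentioned above can be sketched as follows. This is a minimal Python illustration of Walker's Alias Table using the linear-time construction commonly attributed to Vose; it is not the dynamic data structure of this paper, and the function names build_alias_table and sample are hypothetical, chosen only for this example.

```python
import random


def build_alias_table(weights):
    """Build a static alias table in O(n) time (illustrative sketch).

    Returns (prob, alias): bucket i keeps index i with probability
    prob[i] and otherwise yields alias[i].
    """
    n = len(weights)
    total = sum(weights)
    # Scale the weights so that the average bucket has mass exactly 1.
    scaled = [w * n / total for w in weights]
    prob = [1.0] * n
    alias = list(range(n))
    small = [i for i, s in enumerate(scaled) if s < 1.0]
    large = [i for i, s in enumerate(scaled) if s >= 1.0]
    while small and large:
        s = small.pop()
        l = large.pop()
        # Bucket s keeps its own mass and is topped up from item l.
        prob[s] = scaled[s]
        alias[s] = l
        scaled[l] -= 1.0 - scaled[s]
        (small if scaled[l] < 1.0 else large).append(l)
    # Buckets never paired (mass 1 up to rounding) keep prob 1.0.
    return prob, alias


def sample(prob, alias, rng=random):
    """Draw one index in O(1): pick a bucket, then keep it or take its alias."""
    i = rng.randrange(len(prob))
    return i if rng.random() < prob[i] else alias[i]


if __name__ == "__main__":
    prob, alias = build_alias_table([1.0, 2.0, 3.0, 4.0])
    draws = [sample(prob, alias) for _ in range(100_000)]
    # Empirical frequencies should be roughly proportional to the weights.
    print([round(draws.count(i) / len(draws), 2) for i in range(4)])
```

In this sketch, changing a single weight invalidates the pairing of small and large buckets, so the table has to be rebuilt in O(n) time. That rebuild cost is the limitation of the static setting that the dynamic data structure is meant to avoid.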