Online Forecasting of Total-Variation-bounded Sequences
We consider the problem of online forecasting of sequences of length n with total variation at most C_n, using observations contaminated by independent σ-subgaussian noise. We design an O(n log n)-time algorithm that achieves a cumulative squared error of Õ(n^{1/3} C_n^{2/3} σ^{4/3}) with high probability. The result is rate-optimal, as it matches the known minimax rate for offline nonparametric estimation over the same class [Mammen and van de Geer, 1997]. To the best of our knowledge, this is the first polynomial-time algorithm that optimally forecasts total-variation-bounded sequences. Our proof techniques leverage the localized structure of the Haar wavelet basis and the adaptivity of classical wavelet smoothing to an unknown smoothness parameter [Donoho et al., 1998]. We also compare our model to the rich literature on dynamic regret minimization and nonstationary stochastic optimization, where our problem can be treated as a special case. We show that the workhorses in those settings, online gradient descent and its variants with a fixed restarting schedule, are instances of a class of linear forecasters that incur a suboptimal regret of Ω̃(√n). This implies that more adaptive algorithms are necessary to obtain the optimal rate.
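To make the setting concrete, the following is a minimal Python sketch of the problem setup and of a linear forecaster baseline. It is not the algorithm proposed in the paper. The function names, the piecewise-constant test sequence, the Gaussian noise (one instance of σ-subgaussian noise), and the step size are all illustrative assumptions.

```python
import random


def total_variation(theta):
    """Total variation of a sequence: sum over t of |theta[t] - theta[t-1]|."""
    return sum(abs(b - a) for a, b in zip(theta, theta[1:]))


def ogd_forecast(observations, step_size):
    """Online gradient descent on the loss 0.5 * (x - y)^2, started at 0.

    The forecast for round t is recorded before y[t] is revealed, then
    updated by x <- x - step_size * (x - y[t]). Every forecast is therefore
    a fixed linear combination of the past observations.
    """
    forecasts = []
    x = 0.0
    for y in observations:
        forecasts.append(x)
        x -= step_size * (x - y)
    return forecasts


def cumulative_squared_error(forecasts, theta):
    """Sum of squared differences between forecasts and the true sequence."""
    return sum((f - t) ** 2 for f, t in zip(forecasts, theta))


# Hypothetical example: a piecewise-constant sequence with a single jump.
n, sigma = 1024, 0.5
theta = [0.0] * (n // 2) + [2.0] * (n // 2)

rng = random.Random(0)  # fixed seed so the run is reproducible
y = [t + rng.gauss(0.0, sigma) for t in theta]  # noisy observations

forecasts = ogd_forecast(y, step_size=0.1)
print("total variation C_n:", total_variation(theta))
print("cumulative squared error:", cumulative_squared_error(forecasts, theta))
```

A fixed step size trades off noise averaging against reaction speed at the jump. That trade-off is the kind of non-adaptivity that the lower bound for linear forecasters formalizes.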