Robust Hypothesis Testing with Wasserstein Uncertainty Sets
We consider a data-driven robust hypothesis test in which the optimal test minimizes the worst-case performance over distributions that are close to the empirical distributions in Wasserstein distance. This leads to a new non-parametric hypothesis testing framework based on distributionally robust optimization, which is more robust when there are limited samples for one or both hypotheses. Such a scenario often arises in applications such as health care, online change-point detection, and anomaly detection. We study the computational and statistical properties of the proposed test: we present a tractable convex reformulation of the original infinite-dimensional variational problem that exploits the structure of the Wasserstein distance, and we characterize how to select the radii of the uncertainty sets. We also demonstrate the good performance of our method on synthetic and real data.
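To make the worst-case viewpoint concrete, the sketch below (not the paper's exact reformulation) evaluates a fixed randomized test against the least favorable distribution in a Wasserstein ball around an empirical distribution on a finite support, using an off-the-shelf convex solver. All names and parameter choices (`phi`, `p_hat`, `cost`, `radius`, the toy threshold test) are hypothetical placeholders introduced for illustration only.

```python
# Minimal sketch: worst-case type-I error of a fixed test phi over a
# Wasserstein ball of radius `radius` around the empirical distribution p_hat.
# This is an illustrative LP, not the paper's minimax test construction.
import numpy as np
import cvxpy as cp

rng = np.random.default_rng(0)

# Finite support of size n and a pairwise ground cost (here |x_i - x_j|).
n = 20
support = rng.normal(size=(n, 1))
cost = np.abs(support - support.T)

# Empirical distribution under H0 and a fixed randomized test phi in [0, 1],
# where phi(x) is the probability of rejecting H0 at x (toy threshold test).
p_hat = np.full(n, 1.0 / n)
phi = (support.ravel() > 0).astype(float)

radius = 0.1  # assumed Wasserstein radius for the uncertainty set

# Worst-case type-I error: maximize E_P[phi] over P in the Wasserstein ball.
P = cp.Variable(n, nonneg=True)        # candidate worst-case distribution
Pi = cp.Variable((n, n), nonneg=True)  # transport plan between p_hat and P
constraints = [
    cp.sum(Pi, axis=1) == p_hat,              # first marginal equals p_hat
    cp.sum(Pi, axis=0) == P,                  # second marginal equals P
    cp.sum(cp.multiply(cost, Pi)) <= radius,  # transport cost within budget
    cp.sum(P) == 1,
]
prob = cp.Problem(cp.Maximize(phi @ P), constraints)
prob.solve()

print("nominal type-I error:   ", float(phi @ p_hat))
print("worst-case type-I error:", prob.value)
```

The gap between the nominal and worst-case errors illustrates why a test optimized only for the empirical distributions can degrade when samples are limited, which is the regime the proposed distributionally robust framework targets.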