Structure preservation via the Wasserstein distance

09/15/2022

We show that under minimal assumptions on a random vector $X \in \mathbb{R}^d$, and with high probability, given $m$ independent copies $X_1,\dots,X_m$ of $X$, the coordinate distribution of each vector $(\langle X_i,\theta\rangle)_{i=1}^m$ is dictated by the distribution of the true marginal $\langle X,\theta\rangle$. Formally, we show that with high probability, $\sup_{\theta \in S^{d-1}} \left( \frac{1}{m}\sum_{i=1}^m \left| \langle X_i,\theta\rangle^\sharp - \lambda^\theta_i \right|^2 \right)^{1/2} \leq c \left( \frac{d}{m} \right)^{1/4}$, where $\lambda^\theta_i = m\int_{(\frac{i-1}{m}, \frac{i}{m}]} F_{\langle X,\theta\rangle}^{-1}(u)\,du$ and $a^\sharp$ denotes the monotone non-decreasing rearrangement of $a$. The proof follows from the optimal estimate on the worst Wasserstein distance between a marginal of $X$ and its empirical counterpart, $\frac{1}{m}\sum_{i=1}^m \delta_{\langle X_i,\theta\rangle}$. We then use this accurate information on the structure of the vectors $(\langle X_i,\theta\rangle)_{i=1}^m$ to construct the first non-gaussian ensemble that yields the optimal estimate in the Dvoretzky-Milman Theorem: the ensemble exhibits almost Euclidean sections in arbitrary normed spaces of the same dimension as the gaussian embedding, despite being very far from gaussian (in fact, it happens to be heavy-tailed).
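As a purely illustrative sketch (not taken from the paper), the quantity inside the supremum can be evaluated numerically in a simple special case. The code below assumes that X is a standard gaussian vector, so that every marginal ⟨X,θ⟩ with θ on the unit sphere is N(0,1), and it takes λ^θ_i to be m times the integral of the quantile function F^{-1} over ((i-1)/m, i/m], which is the reading under which the bound follows from a Wasserstein estimate by Jensen's inequality. For N(0,1) this integral has the closed form φ(Φ^{-1}(a)) − φ(Φ^{-1}(b)). The function names are hypothetical, and the supremum over the sphere is replaced by a maximum over finitely many random directions, which only gives a lower bound on the true supremum.

```python
import math
import random
from statistics import NormalDist

STD_NORMAL = NormalDist()  # N(0,1): the marginal of a standard gaussian vector


def gaussian_lambdas(m):
    """lambda_i = m * integral of the N(0,1) quantile function over ((i-1)/m, i/m]."""
    def pdf_at_quantile(u):
        # phi(Phi^{-1}(u)), using the limiting value 0 at u = 0 and u = 1
        if u <= 0.0 or u >= 1.0:
            return 0.0
        return STD_NORMAL.pdf(STD_NORMAL.inv_cdf(u))

    # Closed form: integral_a^b Phi^{-1}(u) du = phi(Phi^{-1}(a)) - phi(Phi^{-1}(b))
    return [
        m * (pdf_at_quantile((i - 1) / m) - pdf_at_quantile(i / m))
        for i in range(1, m + 1)
    ]


def random_unit_vector(d, rng):
    """Uniform random direction on the sphere S^{d-1}."""
    v = [rng.gauss(0.0, 1.0) for _ in range(d)]
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def rearrangement_error(sample, theta, lambdas):
    """( (1/m) * sum_i |<X_i, theta>^sharp - lambda_i|^2 )^(1/2) for one direction."""
    # Sorting gives the monotone non-decreasing rearrangement of the projections.
    projections = sorted(sum(x * t for x, t in zip(row, theta)) for row in sample)
    m = len(projections)
    return math.sqrt(sum((p - l) ** 2 for p, l in zip(projections, lambdas)) / m)


def worst_error_over_directions(d, m, n_directions=20, seed=0):
    """Max of the error over random directions (a lower bound on the supremum)."""
    rng = random.Random(seed)
    sample = [[rng.gauss(0.0, 1.0) for _ in range(d)] for _ in range(m)]
    lambdas = gaussian_lambdas(m)
    return max(
        rearrangement_error(sample, random_unit_vector(d, rng), lambdas)
        for _ in range(n_directions)
    )


if __name__ == "__main__":
    d, m = 3, 2000
    print("max error over sampled directions:", worst_error_over_directions(d, m))
    print("(d/m)^(1/4):", (d / m) ** 0.25)
```

Comparing the two printed numbers gives only a rough sanity check in the gaussian case: the constant c in the bound is unspecified, and the heavy-tailed settings that the abstract highlights would need a different quantile function in place of `STD_NORMAL.inv_cdf`.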
