A Robust Framework for Graph-based Two-Sample Tests Using Weights

07/23/2023
by Yichuan Bai, et al.
Graph-based tests are a class of non-parametric two-sample tests useful for analyzing high-dimensional data. The framework offers both flexibility and power in a wide range of testing scenarios. The test statistics are constructed from similarity graphs (such as K-nearest neighbor graphs), and consequently their performance is sensitive to the structure of the graph. When the graph has problematic structures, as is common for high-dimensional data, existing graph-based tests can perform poorly or unstably. We address this challenge and develop graph-based test statistics that are robust to problematic structures of the graph. The limiting null distribution of the robust test statistics is derived. We illustrate the new tests via simulation studies and a real-world application on Chicago taxi trip data.
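
To make the idea of building a test statistic from a similarity graph concrete, the following is a minimal sketch of the classical between-sample edge-count test on a K-nearest neighbor graph, calibrated by permutation. It is not the robust weighted statistic proposed in the paper, whose form the abstract does not give. The function names (`knn_edges`, `between_sample_edge_count`, `permutation_p_value`), the Euclidean distance, and the default values of `k` and `n_perm` are illustrative assumptions.

```python
import numpy as np


def knn_edges(points, k):
    """Return the sorted undirected edge list of the k-nearest-neighbor graph.

    Assumption: Euclidean distance; an edge (i, j) is kept if j is among the
    k nearest neighbors of i or vice versa.
    """
    diffs = points[:, None, :] - points[None, :, :]
    dist = np.sqrt((diffs ** 2).sum(axis=-1))
    np.fill_diagonal(dist, np.inf)  # a point is never its own neighbor
    neighbors = np.argsort(dist, axis=1)[:, :k]
    edges = set()
    for i, row in enumerate(neighbors):
        for j in row:
            j = int(j)
            edges.add((min(i, j), max(i, j)))
    return sorted(edges)


def between_sample_edge_count(edges, labels):
    """Count edges whose endpoints carry different sample labels."""
    return int(sum(labels[i] != labels[j] for i, j in edges))


def permutation_p_value(x, y, k=5, n_perm=1000, seed=0):
    """Permutation p-value for the between-sample edge count.

    The graph is built once on the pooled data; only the labels are permuted.
    Few between-sample edges is treated as evidence against the null.
    """
    rng = np.random.default_rng(seed)
    pooled = np.vstack([x, y])
    labels = np.array([0] * len(x) + [1] * len(y))
    edges = knn_edges(pooled, k)
    observed = between_sample_edge_count(edges, labels)
    perm_counts = np.array([
        between_sample_edge_count(edges, rng.permutation(labels))
        for _ in range(n_perm)
    ])
    p_value = (1 + np.sum(perm_counts <= observed)) / (n_perm + 1)
    return observed, float(p_value)


if __name__ == "__main__":
    rng = np.random.default_rng(1)
    x = rng.normal(0.0, 1.0, size=(40, 20))
    y = rng.normal(0.5, 1.0, size=(40, 20))  # hypothetical mean shift
    print(permutation_p_value(x, y, k=5, n_perm=500))
```

Because the statistic depends on the data only through the edge list, any irregularity in the graph (for example, a few nodes that attract many edges) feeds directly into the count, which is the sensitivity the abstract refers to.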

research
08/17/2021

Limiting distributions of graph-based test statistics

Two-sample tests utilizing a similarity graph on observations are useful...
research
12/24/2021

RISE: Rank in Similarity Graph Edge-Count Two-Sample Test

Two-sample hypothesis testing for high-dimensional data is ubiquitous no...
research
07/22/2022

Graph-Based Tests for Multivariate Covariate Balance Under Multi-Valued Treatments

We propose the use of non-parametric, graph-based tests to assess the di...
research
12/03/2020

Hotspot identification for Mapper graphs

Mapper algorithm can be used to build graph-based representations of hig...
research
08/28/2021

Feature Selection in High-dimensional Space Using Graph-Based Methods

High-dimensional feature selection is a central problem in a variety of ...
research
07/01/2021

Two edge-count tests and relevance analysis in k high-dimensional samples

For the task of relevance analysis, the conventional Tukey's test may be...
research
10/30/2019

Learning pairwise Markov network structures using correlation neighborhoods

Markov networks are widely studied and used throughout multivariate stat...
