Cumulative differences between paired samples

by Isabel Kloumann, et al.

The simplest, most common paired samples consist of observations from two populations, with each observed response from one population corresponding to an observed response from the other population at the same value of an ordinal covariate. The pair of observed responses (one from each population) at the same value of the covariate is known as a "matched pair" (with the matching based on the value of the covariate). A graph of cumulative differences between the two populations reveals differences in responses as a function of the covariate. Indeed, the slope of the secant line connecting two points on the graph is the average difference over the interval of values of the covariate between the two points; that is, the slope of the graph is the average difference in responses. ("Average" refers to the weighted average if the samples are weighted.) Moreover, a simple statistic known as the Kuiper metric summarizes the overall differences over all values of the covariate in a single scalar. The Kuiper metric is the absolute value of the total difference in responses between the two populations, totaled over the interval of values of the covariate for which the absolute value of the total is greatest. The total should be normalized so that it becomes the (weighted) average over all values of the covariate when the interval over which the total is taken is the entire range of the covariate (i.e., the sum for the total is divided by the total number of observations if the samples are unweighted, or by the total weight if the samples are weighted). This cumulative approach is fully nonparametric and uniquely defined (with only one right way to construct the graphs and scalar summary statistics), unlike traditional methods such as reliability diagrams or parametric or semi-parametric regressions, whose parameter settings typically obscure significant differences.
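The construction described above can be illustrated with a minimal Python sketch. This is not the authors' code. The function names (`cumulative_differences`, `kuiper_metric`, `secant_slope`) and the toy data are hypothetical, chosen only for illustration, and the sketch assumes that each matched pair has a distinct covariate value and that the horizontal axis of the graph is the cumulative normalized weight.

```python
from itertools import accumulate


def cumulative_differences(covariate, resp_a, resp_b, weights=None):
    """Return (abscissae, cumulative differences), both starting at 0.

    Pairs are sorted by covariate. Each difference resp_a - resp_b is
    weighted and divided by the total weight, so the final cumulative
    value is the (weighted) average difference over all pairs.
    """
    n = len(covariate)
    if weights is None:
        weights = [1.0] * n  # unweighted samples: divide by the count
    order = sorted(range(n), key=lambda i: covariate[i])
    total = sum(weights)
    absc = [0.0] + list(accumulate(weights[i] / total for i in order))
    cum = [0.0] + list(
        accumulate(weights[i] * (resp_a[i] - resp_b[i]) / total for i in order)
    )
    return absc, cum


def kuiper_metric(cum):
    """Largest absolute total difference over any contiguous interval."""
    # The total over the interval (i, k] is cum[k] - cum[i], so the
    # largest absolute value is the range of the cumulative sums.
    return max(cum) - min(cum)


def secant_slope(absc, cum, i, k):
    """Weighted average difference over sorted pairs i+1, ..., k."""
    return (cum[k] - cum[i]) / (absc[k] - absc[i])


# Toy data (hypothetical): four matched pairs with binary responses.
covariate = [3, 1, 2, 4]
resp_a = [1, 0, 1, 0]
resp_b = [0, 0, 1, 1]

absc, cum = cumulative_differences(covariate, resp_a, resp_b)
metric = kuiper_metric(cum)
slope_all = secant_slope(absc, cum, 0, 4)    # average over every pair
slope_third = secant_slope(absc, cum, 2, 3)  # the pair at covariate 3
```

Plotting `cum` against `absc` gives the graph of cumulative differences. Taking the range of the cumulative sums (maximum minus minimum, with the initial 0 included) is equivalent to maximizing the absolute value of the total over all contiguous intervals of the covariate, which matches the description of the Kuiper metric above.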




Calibration of P-values for calibration and for deviation of a subpopulation from the full population

The author's recent research papers, "Cumulative deviation of a subpopul...

Covariate Balancing Methods for Randomized Controlled Trials Are Not Adversarially Robust

The first step towards investigating the effectiveness of a treatment is...

Model-free selective inference under covariate shift via weighted conformal p-values

This paper introduces weighted conformal p-values for model-free selecti...

Contextualizing E-values for Interpretable Sensitivity to Unmeasured Confounding Analyses

The strength of evidence provided by epidemiological and observational s...

Plotting the cumulative deviation of a subgroup from the full population as a function of score

Assessing whether a subgroup of a full population is getting treated equ...

Cumulative differences between subpopulations

Comparing the differences in outcomes (that is, in "dependent variables"...

Ties in ranking scores can be treated as weighted samples

Prior proposals for cumulative statistics suggest making tiny random per...
