Log-Paradox: Necessary and sufficient conditions for confounding statistically significant pattern reversal under the log-transform
The log-transform is a common tool in statistical analysis: it reduces the impact of extreme values, compresses the range of reported values for improved visualization, enables the use of parametric statistical tests that require normally distributed data, and enables linear models on non-linear data. Practitioners are rarely aware that log-transforming the data can reverse findings: a hypothesis test without the transform can show a negative trend, while with the log-transform it can show a positive trend, both statistically significant. We derive necessary and sufficient conditions underlying this paradoxical pattern reversal using finite difference notation. We show that biomedical image quantification is very susceptible to these conditions. Using a novel heuristic that maximizes the reversal, we show that statistical significance of the paradoxical pattern reversal can be easily induced by changing as little as 5% of objects. We further show that proportional data, especially where object sizes capture underlying creation and destruction dynamics, satisfies the precondition for the paradox. We discuss recommendations on proper use of the log-transform, discuss methods to explore the underlying patterns robustly, and emphasize that any transformed result should always be accompanied by its non-transformed source equivalent to exclude accidentally confounded findings.
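The kind of reversal described above can be illustrated with a minimal sketch. The data and the helper `mean_difference` are hypothetical and not taken from the paper; the example only shows that the sign of a difference in means can flip under the log-transform, and it does not perform a significance test or implement the paper's heuristic.

```python
import math
from statistics import mean

# Hypothetical toy data, chosen for illustration only (not from the paper).
group_a = [1.0, 1.0, 1.0, 1.0, 100.0]     # mostly small values, one extreme value
group_b = [10.0, 10.0, 10.0, 10.0, 10.0]  # uniformly moderate values


def mean_difference(xs, ys, transform=lambda v: v):
    """Return mean(transform(xs)) - mean(transform(ys))."""
    return mean(transform(x) for x in xs) - mean(transform(y) for y in ys)


# Difference of means on the raw scale and on the log scale.
raw_diff = mean_difference(group_a, group_b)
log_diff = mean_difference(group_a, group_b, math.log)

# The pattern is "reversed" when the two differences have opposite signs.
reversed_sign = (raw_diff > 0) != (log_diff > 0)

print(f"raw difference: {raw_diff:+.3f}")
print(f"log difference: {log_diff:+.3f}")
print(f"sign reversed:  {reversed_sign}")
```

On the raw scale the single extreme value dominates the mean of `group_a`, while on the log scale the four small values do, so the two comparisons point in opposite directions. Reporting both scales side by side, as the abstract recommends, makes such a flip visible.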