A novel, computationally tractable algorithm flags in big matrices every column associated in any way with others or a dependent variable, with much higher power when columns a

02/21/2022
by Marcos A. Antezana, et al.

Exhaustively scanning a big data matrix (DM) for subsets of independent variables (IVs) that are associated with a dependent variable (DV) is computationally tractable only for 1- and 2-IV effects. I present a highly computationally tractable Participation-In-Association Score (PAS) that, in a DM with markers, flags every column that is strongly associated with others. PAS examines no column subsets, and its computational cost grows linearly with the number of DM columns, remaining reasonable even in million-column DMs. PAS exploits how associations of markers in DM rows cause associations of matches in the rows' pairwise comparisons. For every such comparison with a match at a tested column, PAS computes the other matches by modifying the comparison's total matches (scored once per DM), yielding a distribution of conditional matches that is perturbed by associations of the tested column. Equally tractable is dvPAS, which flags DV-associated IVs by permuting the markers in the DV. P values are obtained by permutation and Sidak-corrected for multiple tests, bypassing model selection. Simulations show that PAS and dvPAS i) generate uniform-(0,1)-distributed type I error in null DMs and ii) detect randomly encountered binary and trinary models of significant n-column association and of n-IV association with a binary DV, respectively, with power of the same order of magnitude as that of exhaustive evaluation and with false positives that are uniform-(0,1)-distributed or straightforwardly tuned to be so. Power to detect 2-way DV-associated runs of 100 or more markers is non-parametrically ultimate, but power to detect pure n-column associations and pure n-IV DV associations sinks exponentially as n increases. Power increases about twofold in trinary vs. binary DMs, and increases substantially when there are background associations such as those between mutations in chromosomes, especially in trinary DMs, where dvPAS filters said background most effectively.
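The conditional-match idea described above can be illustrated with a minimal Python sketch. It is not the author's implementation: the function names (`pairwise_total_matches`, `pas_pvalue`, `sidak`) are hypothetical, the abstract does not say how the perturbation of the conditional-match distribution is measured, so the mean of the conditional matches is used here purely as an assumed test statistic, and the permutation of the tested column is an assumed way of building the null. The sketch enumerates all row pairs explicitly, which is fine for a toy DM but is not meant to reflect the scaling claimed in the abstract.

```python
import numpy as np


def pairwise_total_matches(dm):
    """Total number of matching markers for every pair of rows.

    Computed once per data matrix and reused for every tested column.
    """
    i, j = np.triu_indices(dm.shape[0], k=1)
    return (dm[i] == dm[j]).sum(axis=1)


def pas_pvalue(dm, col, totals, n_perm=99, seed=0):
    """Permutation P value for one tested column (illustrative only).

    ASSUMPTION: the statistic is the mean number of matches at the
    other columns among row pairs that match at the tested column.
    """
    rng = np.random.default_rng(seed)
    i, j = np.triu_indices(dm.shape[0], k=1)
    column = dm[:, col]

    # Matches at the tested column, one flag per row pair.
    hit = column[i] == column[j]
    # "Other" matches: the precomputed total minus the tested column's match.
    others = totals - hit.astype(int)
    observed = others[hit].mean()

    # Null distribution: shuffle the tested column's markers across rows.
    exceed = 0
    for _ in range(n_perm):
        perm = rng.permutation(column)
        perm_hit = perm[i] == perm[j]
        if others[perm_hit].mean() >= observed:
            exceed += 1
    return (1 + exceed) / (1 + n_perm)


def sidak(p, m):
    """Sidak correction of a single P value for m tests."""
    return 1.0 - (1.0 - p) ** m
```

A typical use would compute `totals` once, call `pas_pvalue` for each column, and pass each result through `sidak` with `m` set to the number of columns tested. A one-sided comparison is used because, under this assumed statistic, an associated column should raise the number of co-occurring matches.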
