Wilson Wen Bin Goh, Limsoon Wong.
Abstract
The Anna Karenina effect is a manifestation of the theory-practice gap that exists when theoretical statistics are applied on real-world data. In the course of analyzing biological data for differential features such as genes or proteins, it derives from the situation where the null hypothesis is rejected for extraneous reasons (or confounders), rather than because the alternative hypothesis is relevant to the disease phenotype. The mechanics of applying statistical tests therefore must address and resolve confounders. It is inadequate to simply rely on manipulating the P-value. We discuss three mechanistic elements (hypothesis statement construction, null distribution appropriateness, and test-statistic construction) and suggest how they can be designed to foil the Anna Karenina effect to select phenotypically relevant biological features.
Keywords: Statistics; biomarker; feature selection; generalizability; reproducibility
Year: 2018 PMID: 29475622 DOI: 10.1016/j.tibtech.2018.01.013
Source DB: PubMed Journal: Trends Biotechnol ISSN: 0167-7799 Impact factor: 19.536