Effects of categorization method, regression type, and variable distribution on the inflation of Type-I error rate when categorizing a confounding variable.

Jean-Louis Barnwell-Ménard, Qing Li, Alan A Cohen.

Abstract

The loss of signal associated with categorizing a continuous variable is well known, and previous studies have demonstrated that this can lead to an inflation of Type-I error when the categorized variable is a confounder in a regression analysis estimating the effect of an exposure on an outcome. However, it is not known how the Type-I error may vary under different circumstances, including logistic versus linear regression, different distributions of the confounder, and different categorization methods. Here, we analytically quantified the effect of categorization and then performed a series of 9600 Monte Carlo simulations to estimate the Type-I error inflation associated with categorization of a confounder under different regression scenarios. We show that Type-I error is unacceptably high (>10% in most scenarios and often 100%). The only exception was when the variable categorized was a continuous mixture proxy for a genuinely dichotomous latent variable, where both the continuous proxy and the categorized variable are error-ridden proxies for the dichotomous latent variable. As expected, error inflation was also higher with larger sample size, fewer categories, and stronger associations between the confounder and the exposure or outcome. We provide online tools that can help researchers estimate the potential error inflation and understand how serious a problem this is.
Copyright © 2014 John Wiley & Sons, Ltd.
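The kind of simulation the abstract describes can be illustrated with a minimal sketch. This is not the authors' code or design: the function name `type1_error_rate`, the effect sizes `a` and `b`, the sample size, the number of replicates, and the median-split rule are all assumptions chosen for illustration. The sketch covers only linear regression with a normally distributed confounder, and it uses a normal approximation (|t| > 1.96) in place of an exact t-test.

```python
import numpy as np


def type1_error_rate(n=500, n_sims=400, a=0.5, b=0.5, categorize=True, seed=0):
    """Estimate how often a null exposure effect is declared significant.

    Hypothetical setup (not taken from the paper): confounder c ~ N(0, 1),
    exposure x = a*c + noise, outcome y = b*c + noise. The exposure has no
    direct effect on the outcome, so every rejection is a Type-I error.
    """
    rng = np.random.default_rng(seed)
    rejections = 0
    for _ in range(n_sims):
        c = rng.normal(size=n)
        x = a * c + rng.normal(size=n)
        y = b * c + rng.normal(size=n)  # y depends on c only, never on x

        # Adjust for either a median-split version of c or c itself.
        adj = (c > np.median(c)).astype(float) if categorize else c

        # Ordinary least squares of y on [intercept, x, adjuster].
        design = np.column_stack([np.ones(n), x, adj])
        beta, *_ = np.linalg.lstsq(design, y, rcond=None)
        resid = y - design @ beta
        sigma2 = resid @ resid / (n - design.shape[1])
        cov = sigma2 * np.linalg.inv(design.T @ design)

        # Test the exposure coefficient at a nominal 5% level
        # (normal approximation to the t distribution; n is large).
        t_stat = beta[1] / np.sqrt(cov[1, 1])
        rejections += abs(t_stat) > 1.96
    return rejections / n_sims


if __name__ == "__main__":
    print("dichotomized confounder:", type1_error_rate(categorize=True))
    print("continuous confounder:  ", type1_error_rate(categorize=False))
```

Under these assumed settings, adjusting for the dichotomized confounder leaves residual confounding, so the rejection rate for the exposure should sit well above the nominal 5%, while adjusting for the continuous confounder should keep it close to 5%. Raising `n`, `a`, or `b` should widen the gap, in line with the trends the abstract reports.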

Keywords:  Type-I error; categorization; confounding; dichotomization; distribution; simulation

Year: 2014; PMID: 25504513; DOI: 10.1002/sim.6387

Source DB: PubMed; Journal: Stat Med; ISSN: 0277-6715; Impact factor: 2.373


