Tristan R Grogan, David A Elashoff.
Abstract
Classification models can demonstrate apparent prediction accuracy even when there is no underlying relationship between the predictors and the response. Variable selection procedures can lead to false positive variable selections and overestimation of true model performance. A simulation study was conducted using logistic regression with forward stepwise, best subsets, and LASSO variable selection methods with varying total sample sizes (20, 50, 100, 200) and numbers of random noise predictor variables (3, 5, 10, 15, 20, 50). Using our critical values can help reduce needless follow-up on variables having no true association with the outcome.
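The idea of a null-distribution critical value for AUC can be illustrated with a minimal sketch. This is not the authors' simulation code. It replaces forward stepwise, best subsets, and LASSO logistic regression with a much simpler stand-in: picking the single noise predictor with the highest apparent AUC. The function names (`auc`, `best_null_auc`, `null_critical_value`), the balanced outcome, the standard normal predictors, the number of replicates, and the 95% level are all assumptions made for illustration.

```python
import random


def auc(scores, labels):
    """Mann-Whitney estimate of AUC: P(score_pos > score_neg), ties count 1/2."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > q) + 0.5 * (p == q) for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


def best_null_auc(n, p, rng):
    """Largest apparent AUC among p pure-noise predictors in one null dataset."""
    # Assumption: a balanced binary outcome, so both classes are always present.
    labels = [1] * (n // 2) + [0] * (n - n // 2)
    best = 0.5
    for _ in range(p):
        # Each predictor is independent standard normal noise, unrelated to labels.
        x = [rng.gauss(0.0, 1.0) for _ in range(n)]
        a = auc(x, labels)
        # The sign of the effect is chosen after seeing the data, as a fitted
        # model would do, so AUCs below 0.5 are reflected above 0.5.
        best = max(best, a, 1.0 - a)
    return best


def null_critical_value(n, p, reps=500, level=0.95, seed=0):
    """Empirical quantile of the selected AUC when no predictor is truly associated."""
    rng = random.Random(seed)
    sims = sorted(best_null_auc(n, p, rng) for _ in range(reps))
    return sims[min(reps - 1, int(level * reps))]


if __name__ == "__main__":
    # Grid taken from a subset of the sample sizes and noise-variable counts above.
    for n in (20, 50, 100):
        for p in (3, 10, 20):
            print(n, p, round(null_critical_value(n, p, reps=200), 3))
```

Under these assumptions, an observed AUC below the simulated critical value for the matching sample size and number of candidate predictors is no better than what selection among pure noise would be expected to produce. A multivariable selection method can combine several noise predictors, so its null AUC would likely be inflated further than this single-predictor sketch suggests.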
Keywords: AUC; Logistic Regression; Simulation Study; Validation methods; Variable selection
Year: 2016; PMID: 29225408; PMCID: PMC5722241; DOI: 10.1080/03610918.2016.1230216
Source DB: PubMed; Journal: Commun Stat Simul Comput; ISSN: 0361-0918; Impact factor: 1.162