What you see may not be what you get: a brief, nontechnical introduction to overfitting in regression-type models.

Michael A Babyak

Abstract

OBJECTIVE: Statistical models, such as linear or logistic regression or survival analysis, are frequently used as a means to answer scientific questions in psychosomatic research. Many who use these techniques, however, apparently fail to appreciate fully the problem of overfitting, ie, capitalizing on the idiosyncrasies of the sample at hand. Overfitted models will fail to replicate in future samples, thus creating considerable uncertainty about the scientific merit of the finding. The present article is a nontechnical discussion of the concept of overfitting and is intended to be accessible to readers with varying levels of statistical expertise. The notion of overfitting is presented in terms of asking too much from the available data. Given a certain number of observations in a data set, there is an upper limit to the complexity of the model that can be derived with any acceptable degree of uncertainty. Complexity arises as a function of the number of degrees of freedom expended (the number of predictors including complex terms such as interactions and nonlinear terms) against the same data set during any stage of the data analysis. Theoretical and empirical evidence--with a special focus on the results of computer simulation studies--is presented to demonstrate the practical consequences of overfitting with respect to scientific inference. Three common practices--automated variable selection, pretesting of candidate predictors, and dichotomization of continuous variables--are shown to pose a considerable risk for spurious findings in models. The dilemma between overfitting and exploring candidate confounders is also discussed. Alternative means of guarding against overfitting are discussed, including variable aggregation and the fixing of coefficients a priori. Techniques that account and correct for complexity, including shrinkage and penalization, also are introduced.
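The risk that pretesting of candidate predictors poses can be illustrated with a small simulation. The sketch below is not taken from the article; it is a minimal example under stated assumptions: the outcome and all candidate predictors are pure noise, the sample size (50), the number of candidates (20), and the function names `pearson` and `simulate` are all hypothetical choices made for illustration. In each trial the predictor with the largest absolute correlation in the sample is "selected", and that apparent correlation is compared with the correlation obtained in a fresh sample.

```python
import random
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def noise(rng, n):
    """Draw n independent standard normal values."""
    return [rng.gauss(0.0, 1.0) for _ in range(n)]


def simulate(n_obs=50, n_candidates=20, n_trials=500, seed=0):
    """Return (mean apparent |r|, mean replication |r|) under a pure-noise model.

    Assumption: no predictor has any true relationship with the outcome,
    so every nonzero correlation is an idiosyncrasy of the sample.
    """
    rng = random.Random(seed)
    apparent, replicated = [], []
    for _ in range(n_trials):
        y = noise(rng, n_obs)
        # Pretesting: screen all candidates and keep the strongest one.
        best = max(abs(pearson(noise(rng, n_obs), y)) for _ in range(n_candidates))
        apparent.append(best)
        # Replication: because the selected predictor is unrelated to the
        # outcome, a new sample is simply an independent draw of both.
        replicated.append(abs(pearson(noise(rng, n_obs), noise(rng, n_obs))))
    return sum(apparent) / n_trials, sum(replicated) / n_trials


if __name__ == "__main__":
    apparent, replicated = simulate()
    print(f"mean |r| of selected predictor, original sample: {apparent:.3f}")
    print(f"mean |r| of selected predictor, fresh sample:    {replicated:.3f}")
```

Under these assumptions the selected predictor should look noticeably stronger in the original sample than in the fresh one, even though nothing real is being measured. Increasing `n_candidates` or decreasing `n_obs` should widen the gap, which is one way to see the abstract's point about asking too much of the available data.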

Year:  2004        PMID: 15184705     DOI: 10.1097/01.psy.0000127692.23278.a9

Source DB:  PubMed          Journal:  Psychosom Med        ISSN: 0033-3174            Impact factor:   4.312


  417 in total

1.  Self-efficacy and health locus of control: relationship to occupational disability among workers with back pain.

Authors:  Sylvie Richard; Clermont E Dionne; Arie Nouwen
Journal:  J Occup Rehabil       Date:  2011-09

2.  Beliefs and intentions for skin protection and UV exposure in young adults.

Authors:  Carolyn J Heckman; Sharon L Manne; Jacqueline D Kloss; Sarah Bauerle Bass; Bradley Collins; Stuart R Lessin
Journal:  Am J Health Behav       Date:  2011-11

3.  Reporting recommendations for tumor marker prognostic studies (REMARK): explanation and elaboration.

Authors:  Douglas G Altman; Lisa M McShane; Willi Sauerbrei; Sheila E Taube
Journal:  BMC Med       Date:  2012-05-29       Impact factor: 8.775

4.  Reporting Recommendations for Tumor Marker Prognostic Studies (REMARK): explanation and elaboration.

Authors:  Douglas G Altman; Lisa M McShane; Willi Sauerbrei; Sheila E Taube
Journal:  PLoS Med       Date:  2012-05-29       Impact factor: 11.069

5.  Association of anhedonia with recurrent major adverse cardiac events and mortality 1 year after acute coronary syndrome.

Authors:  Karina W Davidson; Matthew M Burg; Ian M Kronish; Daichi Shimbo; Lucia Dettenborn; Roxana Mehran; David Vorchheimer; Lynn Clemow; Joseph E Schwartz; Francois Lespérance; Nina Rieckmann
Journal:  Arch Gen Psychiatry       Date:  2010-05

6.  Patterns and correlates of adherence to self-monitoring in lung transplant recipients during the first 12 months after discharge from transplant.

Authors:  Lu Hu; Annette DeVito Dabbs; Mary Amanda Dew; Susan M Sereika; Jennifer H Lingler
Journal:  Clin Transplant       Date:  2017-06-11       Impact factor: 2.863

7.  Effect of continuous positive airway pressure on day/night rhythm of prothrombotic markers in obstructive sleep apnea.

Authors:  Roland von Känel; Loki Natarajan; Sonia Ancoli-Israel; Paul J Mills; Tanya Wolfson; Anthony C Gamst; José S Loredo; Joel E Dimsdale
Journal:  Sleep Med       Date:  2012-10-01       Impact factor: 3.492

8.  Effects of HIV viremia on the gastrointestinal microbiome of young MSM.

Authors:  Ryan R Cook; Jennifer A Fulcher; Nicole H Tobin; Fan Li; David Lee; Marjan Javanbakht; Ron Brookmeyer; Steve Shoptaw; Robert Bolan; Grace M Aldrovandi; Pamina M Gorbach
Journal:  AIDS       Date:  2019-04-01       Impact factor: 4.177

9.  Network Analysis as an Alternative Approach to Conceptualizing Eating Disorders: Implications for Research and Treatment. (Review)

Authors:  Cheri A Levinson; Irina A Vanzhula; Leigh C Brosof; Kelsie Forbush
Journal:  Curr Psychiatry Rep       Date:  2018-08-06       Impact factor: 5.285

10.  Usual source of care and outcomes following acute myocardial infarction.

Authors:  Erica S Spatz; Sameer D Sheth; Kensey L Gosch; Mayur M Desai; John A Spertus; Harlan M Krumholz; Joseph S Ross
Journal:  J Gen Intern Med       Date:  2014-02-20       Impact factor: 5.128
