
Foundations of Machine Learning-Based Clinical Prediction Modeling: Part II-Generalization and Overfitting.

Julius M Kernbach, Victor E Staartjes.

Abstract

We review the concept of overfitting, which is a well-known concern within the machine learning community but is less established in the clinical community. Overfitted models may lead to inadequate conclusions that may wrongly or even harmfully shape clinical decision-making. Overfitting can be defined as the difference between discriminative performance on the training data and on the test data. While it is normal for out-of-sample performance to be equal to or slightly worse than training performance for any adequately fitted model, a massively worse out-of-sample performance suggests relevant overfitting. We delve into resampling methods, specifically recommending k-fold cross-validation and bootstrapping to arrive at realistic estimates of out-of-sample error during training. We also encourage the use of regularization techniques such as L1 or L2 regularization, and the choice of a level of algorithm complexity appropriate for the type of dataset used. Data leakage is addressed, and we discuss the importance of external validation for assessing true out-of-sample performance and, upon successful external validation, for releasing the model into clinical practice. Finally, for high-dimensional datasets, the concepts of feature reduction using principal component analysis (PCA) and feature elimination using recursive feature elimination (RFE) are elucidated.
© 2022. The Author(s), under exclusive license to Springer Nature Switzerland AG.
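
The gap between training and out-of-sample performance described in the abstract can be illustrated with a minimal sketch. It is not the authors' code: the toy dataset, the 1-nearest-neighbour classifier, the 20% label-noise rate, and all function names (`make_data`, `predict_1nn`, `k_fold_accuracy`) are assumptions chosen for illustration. The classifier memorizes the training data, so its training accuracy is perfect, while 5-fold cross-validation gives a more realistic estimate.

```python
import random


def make_data(n, noise, rng):
    """Hypothetical toy data: one feature x, label 1 if x > 0, some labels flipped."""
    data = []
    for _ in range(n):
        x = rng.uniform(-1.0, 1.0)
        y = 1 if x > 0 else 0
        if rng.random() < noise:
            y = 1 - y  # label noise that a very flexible model can memorize
        data.append((x, y))
    return data


def predict_1nn(train, x):
    """Return the label of the training point closest to x."""
    nearest = min(train, key=lambda point: abs(point[0] - x))
    return nearest[1]


def accuracy(train, test):
    """Share of test points whose label the 1-NN model predicts correctly."""
    hits = sum(1 for x, y in test if predict_1nn(train, x) == y)
    return hits / len(test)


def k_fold_accuracy(data, k, rng):
    """Mean held-out accuracy over k folds (each point is tested exactly once)."""
    shuffled = data[:]
    rng.shuffle(shuffled)
    folds = [shuffled[i::k] for i in range(k)]
    scores = []
    for i in range(k):
        test = folds[i]
        train = [p for j, fold in enumerate(folds) if j != i for p in fold]
        scores.append(accuracy(train, test))
    return sum(scores) / k


rng = random.Random(0)  # fixed seed so the sketch is reproducible
data = make_data(200, 0.2, rng)

train_acc = accuracy(data, data)        # evaluated on the data it was fit on
cv_acc = k_fold_accuracy(data, 5, rng)  # evaluated on held-out folds

print(f"training accuracy:  {train_acc:.2f}")
print(f"5-fold CV accuracy: {cv_acc:.2f}")
```

A large difference between the two printed values is the pattern the abstract calls relevant overfitting. A less flexible model, or a regularized one, would be expected to narrow the gap on data like this.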

Keywords:  Artificial intelligence; Clinical prediction model; Machine intelligence; Machine learning; Prediction; Prognosis

Year:  2022        PMID: 34862523     DOI: 10.1007/978-3-030-85292-4_3

Source DB:  PubMed          Journal:  Acta Neurochir Suppl        ISSN: 0065-1419


