Overfitting in prediction models - is it a problem only in high dimensions?

Jyothi Subramanian, Richard Simon.

Abstract

The growing recognition that human diseases are molecularly heterogeneous has stimulated interest in the development of prognostic and predictive classifiers for patient selection and stratification. In the process of classifier development, it has been repeatedly emphasized that when the number of candidate predictor variables is much larger than the number of observations, the apparent (training set, resubstitution) accuracy of the classifiers can be highly optimistically biased, and hence classification accuracy should be reported based on evaluation of the classifier on a separate test set or using complete cross-validation. Such evaluation methods have, however, not been the norm for low-dimensional (p < n) data, which arise, for example, in clinical trials when a classifier is developed on a combination of clinico-pathological variables and a small number of genetic biomarkers selected from an understanding of the biology of the disease. We undertook simulation studies to investigate the existence and extent of the problem of overfitting with low-dimensional data. The results indicate that overfitting can be a serious problem even for low-dimensional data, especially if the relationship of the outcome to the set of predictor variables is not strong. We hence encourage the adoption of either a separate test set or complete cross-validation to evaluate classifier accuracy, even when the number of candidate predictor variables is substantially smaller than the number of cases.
© 2013.
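
The gap between apparent and cross-validated accuracy described in the abstract can be illustrated with a small simulation. The sketch below is not the authors' simulation design: the nearest-centroid classifier, the sample size (n = 40), the number of predictors (p = 10), the pure-noise predictors, and all function names are assumptions chosen only to keep the example short and dependency-free.

```python
import random
import statistics


def centroid_fit(X, y):
    """Return the mean predictor vector of each class (labels 0 and 1)."""
    centroids = {}
    for label in (0, 1):
        rows = [x for x, t in zip(X, y) if t == label]
        centroids[label] = [statistics.fmean(col) for col in zip(*rows)]
    return centroids


def centroid_predict(centroids, x):
    """Assign x to the class whose centroid is nearest (squared Euclidean)."""
    dist = {
        label: sum((a - b) ** 2 for a, b in zip(x, c))
        for label, c in centroids.items()
    }
    return min(dist, key=dist.get)


def accuracy(centroids, X, y):
    hits = sum(centroid_predict(centroids, x) == t for x, t in zip(X, y))
    return hits / len(y)


def simulate_once(rng, n=40, p=10):
    """One low-dimensional (p < n) data set with NO true signal (assumed setup)."""
    X = [[rng.gauss(0.0, 1.0) for _ in range(p)] for _ in range(n)]
    y = [i % 2 for i in range(n)]  # balanced labels, independent of X

    # Apparent (resubstitution) accuracy: fit and evaluate on the same cases.
    resub = accuracy(centroid_fit(X, y), X, y)

    # Leave-one-out cross-validation: the model is refit from scratch
    # for every held-out case, so the held-out case never informs the fit.
    hits = 0
    for i in range(n):
        X_train, y_train = X[:i] + X[i + 1:], y[:i] + y[i + 1:]
        hits += centroid_predict(centroid_fit(X_train, y_train), X[i]) == y[i]
    return resub, hits / n


def run(reps=100, seed=0):
    """Average both accuracy estimates over repeated simulated data sets."""
    rng = random.Random(seed)
    results = [simulate_once(rng) for _ in range(reps)]
    mean_resub = statistics.fmean(r for r, _ in results)
    mean_cv = statistics.fmean(c for _, c in results)
    return mean_resub, mean_cv


if __name__ == "__main__":
    mean_resub, mean_cv = run()
    print(f"mean resubstitution accuracy: {mean_resub:.3f}")
    print(f"mean leave-one-out CV accuracy: {mean_cv:.3f}")
```

Because the simulated predictors carry no information about the outcome, the true accuracy of any classifier here is 0.5. Under these assumptions the resubstitution estimate should sit clearly above 0.5 while the cross-validated estimate stays close to it. This sketch has no variable-selection step; in a complete cross-validation, any such step would also have to be repeated inside each fold.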

Keywords:  Classifiers; Clinical trials; Overfitting; Patient selection; Prediction accuracy

Year:  2013        PMID: 23811117     DOI: 10.1016/j.cct.2013.06.011

Source DB:  PubMed          Journal:  Contemp Clin Trials        ISSN: 1551-7144            Impact factor:   2.226

