
Prediction error estimation: a comparison of resampling methods.

Annette M Molinaro, Richard Simon, Ruth M Pfeiffer.

Abstract

MOTIVATION: In genomic studies, thousands of features are collected on relatively few samples. One of the goals of these studies is to build classifiers to predict the outcome of future observations. There are three inherent steps to this process: feature selection, model selection and prediction assessment. With a focus on prediction assessment, we compare several methods for estimating the 'true' prediction error of a prediction model in the presence of feature selection.
RESULTS: For small studies where features are selected from thousands of candidates, the resubstitution and simple split-sample estimates are seriously biased. In these small samples, leave-one-out cross-validation (LOOCV), 10-fold cross-validation (CV) and the .632+ bootstrap have the smallest bias for diagonal discriminant analysis, nearest neighbor and classification trees. LOOCV and 10-fold CV have the smallest bias for linear discriminant analysis. Additionally, LOOCV, 5- and 10-fold CV, and the .632+ bootstrap have the lowest mean square error. The .632+ bootstrap is quite biased in small sample sizes with strong signal-to-noise ratios. Differences in performance among resampling methods are reduced as the number of specimens available increases.
SUPPLEMENTARY INFORMATION: A complete compilation of results and R code for simulations and analyses are available in Molinaro et al. (2005) (http://linus.nci.nih.gov/brb/TechReport.htm).
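
The contrast between resubstitution and cross-validation in the presence of feature selection can be illustrated with a small simulation. The sketch below is not the authors' R code. It is a minimal Python illustration under stated assumptions: pure-noise data (so the true error is 0.5), a hypothetical filter that keeps the features with the largest difference in class means, and a nearest-centroid rule as a simplified stand-in for diagonal discriminant analysis. All function names and parameter values (`select_features`, `cv_error`, 20 samples, 1000 features, 10 selected features) are chosen for illustration only.

```python
import random

def select_features(X, y, k):
    """Return indices of the k features with the largest absolute
    difference in class means (a crude univariate filter)."""
    n0, n1 = y.count(0), y.count(1)
    scores = []
    for j in range(len(X[0])):
        m0 = sum(row[j] for row, lab in zip(X, y) if lab == 0) / n0
        m1 = sum(row[j] for row, lab in zip(X, y) if lab == 1) / n1
        scores.append((abs(m1 - m0), j))
    scores.sort(reverse=True)
    return [j for _, j in scores[:k]]

def fit_centroids(X, y, feats):
    """Class centroids restricted to the selected features."""
    cents = {}
    for lab in (0, 1):
        rows = [row for row, l in zip(X, y) if l == lab]
        cents[lab] = [sum(r[j] for r in rows) / len(rows) for j in feats]
    return cents

def predict(cents, row, feats):
    """Assign the class whose centroid is closest in squared Euclidean distance."""
    def dist(lab):
        return sum((row[j] - c) ** 2 for j, c in zip(feats, cents[lab]))
    return 0 if dist(0) <= dist(1) else 1

def resubstitution_error(X, y, k):
    """Select features, train and test on the same samples."""
    feats = select_features(X, y, k)
    cents = fit_centroids(X, y, feats)
    wrong = sum(predict(cents, row, feats) != lab for row, lab in zip(X, y))
    return wrong / len(y)

def cv_error(X, y, k, n_folds, rng):
    """K-fold CV with feature selection repeated inside every training fold."""
    idx = list(range(len(y)))
    rng.shuffle(idx)
    folds = [idx[i::n_folds] for i in range(n_folds)]
    wrong = 0
    for test in folds:
        held_out = set(test)
        train = [i for i in idx if i not in held_out]
        X_tr = [X[i] for i in train]
        y_tr = [y[i] for i in train]
        feats = select_features(X_tr, y_tr, k)   # uses training samples only
        cents = fit_centroids(X_tr, y_tr, feats)
        wrong += sum(predict(cents, X[i], feats) != y[i] for i in test)
    return wrong / len(y)

# Assumed toy setting: 20 samples, 1000 noise features, no real class signal.
rng = random.Random(0)
n, p, k = 20, 1000, 10
X = [[rng.gauss(0.0, 1.0) for _ in range(p)] for _ in range(n)]
y = [0] * (n // 2) + [1] * (n // 2)

resub = resubstitution_error(X, y, k)
cv10 = cv_error(X, y, k, 10, rng)
loocv = cv_error(X, y, k, n, rng)   # n folds of size 1 = leave-one-out
print(f"resubstitution: {resub:.2f}  10-fold CV: {cv10:.2f}  LOOCV: {loocv:.2f}")
```

Because the labels carry no information here, an honest estimate should sit near 0.5. The resubstitution estimate is expected to fall far below that, since the same samples both pick the features and score the classifier, while the cross-validated estimates repeat the selection step within each training fold.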

Year:  2005        PMID: 15905277     DOI: 10.1093/bioinformatics/bti499

Source DB:  PubMed          Journal:  Bioinformatics        ISSN: 1367-4803            Impact factor:   6.937


  262 in total

1.  Biomarker-calibrated dietary energy and protein intake associations with diabetes risk among postmenopausal women from the Women's Health Initiative.

Authors:  Lesley F Tinker; Gloria E Sarto; Barbara V Howard; Ying Huang; Marian L Neuhouser; Yasmin Mossavar-Rahmani; Jeannette M Beasley; Karen L Margolis; Charles B Eaton; Lawrence S Phillips; Ross L Prentice
Journal:  Am J Clin Nutr       Date:  2011-11-09       Impact factor: 7.045

2.  Probabilistic classifiers with high-dimensional data.

Authors:  Kyung In Kim; Richard Simon
Journal:  Biostatistics       Date:  2010-11-17       Impact factor: 5.899

3.  Relationship Between Quantitative Adverse Plaque Features From Coronary Computed Tomography Angiography and Downstream Impaired Myocardial Flow Reserve by 13N-Ammonia Positron Emission Tomography: A Pilot Study.

Authors:  Damini Dey; Mariana Diaz Zamudio; Annika Schuhbaeck; Luis Eduardo Juarez Orozco; Yuka Otaki; Heidi Gransar; Debiao Li; Guido Germano; Stephan Achenbach; Daniel S Berman; Aloha Meave; Erick Alexanderson; Piotr J Slomka
Journal:  Circ Cardiovasc Imaging       Date:  2015-10       Impact factor: 7.792

Review 4.  Statistical considerations on prognostic models for glioma.

Authors:  Annette M Molinaro; Margaret R Wrensch; Robert B Jenkins; Jeanette E Eckel-Passow
Journal:  Neuro Oncol       Date:  2015-12-08       Impact factor: 12.300

5.  Reliable selection of the number of fascicles in diffusion images by estimation of the generalization error.

Authors:  Benoit Scherrer; Maxime Taquet; Simon K Warfield
Journal:  Inf Process Med Imaging       Date:  2013

6.  Improved MR-based characterization of engineered cartilage using multiexponential T2 relaxation and multivariate analysis.

Authors:  David A Reiter; Onyi Irrechukwu; Ping-Chang Lin; Somaieh Moghadam; Sarah Von Thaer; Nancy Pleshko; Richard G Spencer
Journal:  NMR Biomed       Date:  2012-01-29       Impact factor: 4.044

7.  Deep Learning for Prediction of Obstructive Disease From Fast Myocardial Perfusion SPECT: A Multicenter Study.

Authors:  Julian Betancur; Frederic Commandeur; Mahsaw Motlagh; Tali Sharir; Andrew J Einstein; Sabahat Bokhari; Mathews B Fish; Terrence D Ruddy; Philipp Kaufmann; Albert J Sinusas; Edward J Miller; Timothy M Bateman; Sharmila Dorbala; Marcelo Di Carli; Guido Germano; Yuka Otaki; Balaji K Tamarappoo; Damini Dey; Daniel S Berman; Piotr J Slomka
Journal:  JACC Cardiovasc Imaging       Date:  2018-03-14

8.  Prediction of revascularization after myocardial perfusion SPECT by machine learning in a large population.

Authors:  Reza Arsanjani; Damini Dey; Tigran Khachatryan; Aryeh Shalev; Sean W Hayes; Mathews Fish; Rine Nakanishi; Guido Germano; Daniel S Berman; Piotr Slomka
Journal:  J Nucl Cardiol       Date:  2014-12-06       Impact factor: 5.952

9.  Blood gene expression signatures predict exposure levels.

Authors:  P R Bushel; A N Heinloth; J Li; L Huang; J W Chou; G A Boorman; D E Malarkey; C D Houle; S M Ward; R E Wilson; R D Fannin; M W Russo; P B Watkins; R W Tennant; R S Paules
Journal:  Proc Natl Acad Sci U S A       Date:  2007-11-02       Impact factor: 11.205

Review 10.  Combining a molecular profile with a clinical and pathological profile: biostatistical considerations.

Authors:  Richard J Sylvester
Journal:  Scand J Urol Nephrol Suppl       Date:  2008-09