Cross-validation failure: Small sample sizes lead to large error bars.

Gaël Varoquaux

Abstract

Predictive models ground many state-of-the-art developments in statistical brain image analysis: decoding, MVPA, searchlight, or extraction of biomarkers. The principled approach to establish their validity and usefulness is cross-validation, testing prediction on unseen data. Here, I would like to raise awareness of the error bars of cross-validation, which are often underestimated. Simple experiments show that the sample sizes of many neuroimaging studies inherently lead to large error bars, e.g. ±10% for 100 samples. The standard error across folds strongly underestimates them. These large error bars compromise the reliability of conclusions drawn with predictive models, such as biomarkers or methods development, where, unlike with cognitive neuroimaging MVPA approaches, more samples cannot be acquired by repeating the experiment across many subjects. Solutions to increase sample size must be investigated, tackling possible increases in heterogeneity of the data.
Copyright © 2017 Elsevier Inc. All rights reserved.
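
The order of magnitude quoted in the abstract (±10% for 100 samples) can be illustrated with a minimal sketch. It assumes a simplified model that is not taken from the record itself: each test prediction is an independent draw that is correct with a fixed probability, so the number of correct predictions follows a binomial law. The function name `binomial_accuracy_interval` and its parameters are hypothetical, introduced only for this illustration.

```python
from math import comb


def binomial_accuracy_interval(n_samples, true_accuracy=0.5, coverage=0.95):
    """Range of observed accuracies expected from sampling noise alone.

    Assumption (illustrative only): the n_samples test predictions are
    independent and each is correct with probability true_accuracy.
    Intended for small n_samples (up to roughly 1000), because the
    probabilities are computed directly in floating point.
    """
    tail = (1.0 - coverage) / 2.0
    cumulative = 0.0
    low = high = None
    for k in range(n_samples + 1):
        # Probability of exactly k correct predictions out of n_samples.
        cumulative += (comb(n_samples, k)
                       * true_accuracy ** k
                       * (1.0 - true_accuracy) ** (n_samples - k))
        if low is None and cumulative >= tail:
            low = k  # lower quantile of the number of correct predictions
        if cumulative >= 1.0 - tail:
            high = k  # upper quantile of the number of correct predictions
            break
    return low / n_samples, high / n_samples


for n in (30, 100, 300, 1000):
    low, high = binomial_accuracy_interval(n)
    print(f"n={n:5d}: observed accuracy between {low:.3f} and {high:.3f}")
```

Under this assumption, a classifier performing at chance (50%) on 100 test samples yields an observed accuracy between 40% and 60% in about 95% of cases, which is in line with the ±10% figure above. The interval narrows only slowly as the sample size grows.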

Keywords:  Biomarkers; Cross-validation; Decoding; MVPA; Model selection; Statistics; fMRI

Year:  2017        PMID: 28655633     DOI: 10.1016/j.neuroimage.2017.06.061

Source DB:  PubMed          Journal:  Neuroimage        ISSN: 1053-8119            Impact factor:   6.556


  111 in total

Review 1.  Big-Data Science in Porous Materials: Materials Genomics and Machine Learning.

Authors:  Kevin Maik Jablonka; Daniele Ongari; Seyed Mohamad Moosavi; Berend Smit
Journal:  Chem Rev       Date:  2020-06-10       Impact factor: 60.622

2.  Classification of Alzheimer's Disease, Mild Cognitive Impairment and Normal Control Subjects Using Resting-State fMRI Based Network Connectivity Analysis.

Authors:  Zhe Wang; Yu Zheng; David C Zhu; Andrea C Bozoki; Tongtong Li
Journal:  IEEE J Transl Eng Health Med       Date:  2018-10-15       Impact factor: 3.316

Review 3.  Machine learning in resting-state fMRI analysis.

Authors:  Meenakshi Khosla; Keith Jamison; Gia H Ngo; Amy Kuceyeski; Mert R Sabuncu
Journal:  Magn Reson Imaging       Date:  2019-06-05       Impact factor: 2.546

4.  Good practice in food-related neuroimaging.

Authors:  Paul A M Smeets; Alain Dagher; Todd A Hare; Stephanie Kullmann; Laura N van der Laan; Russell A Poldrack; Hubert Preissl; Dana Small; Eric Stice; Maria G Veldhuizen
Journal:  Am J Clin Nutr       Date:  2019-03-01       Impact factor: 7.045

5.  Reply to: Towards increasing the clinical applicability of machine learning biomarkers in psychiatry.

Authors:  Marcel Adam Just; Vladimir L Cherkassky; David Brent
Journal:  Nat Hum Behav       Date:  2021-04-05

6.  Towards Algorithmic Analytics for Large-scale Datasets.

Authors:  Danilo Bzdok; Thomas E Nichols; Stephen M Smith
Journal:  Nat Mach Intell       Date:  2019-07-09

Review 7.  The Heterogeneity Problem: Approaches to Identify Psychiatric Subtypes.

Authors:  Eric Feczko; Oscar Miranda-Dominguez; Mollie Marr; Alice M Graham; Joel T Nigg; Damien A Fair
Journal:  Trends Cogn Sci       Date:  2019-05-29       Impact factor: 20.229

8.  Classification Accuracy of Neuroimaging Biomarkers in Attention-Deficit/Hyperactivity Disorder: Effects of Sample Size and Circular Analysis.

Authors:  Alfredo A Pulini; Wesley T Kerr; Sandra K Loo; Agatha Lenartowicz
Journal:  Biol Psychiatry Cogn Neurosci Neuroimaging       Date:  2018-06-27

9.  Toward Robust Anxiety Biomarkers: A Machine Learning Approach in a Large-Scale Sample.

Authors:  Emily A Boeke; Avram J Holmes; Elizabeth A Phelps
Journal:  Biol Psychiatry Cogn Neurosci Neuroimaging       Date:  2019-06-21

10.  Establishment of Best Practices for Evidence for Prediction: A Review.

Authors:  Russell A Poldrack; Grace Huckins; Gael Varoquaux
Journal:  JAMA Psychiatry       Date:  2020-05-01       Impact factor: 21.596
