Analysis of sampling techniques for imbalanced data: An n = 648 ADNI study.

Rashmi Dubey, Jiayu Zhou, Yalin Wang, Paul M Thompson, Jieping Ye.

Abstract

Many neuroimaging applications deal with imbalanced imaging data. For example, in the Alzheimer's Disease Neuroimaging Initiative (ADNI) dataset, the mild cognitive impairment (MCI) cases eligible for the study are nearly twice as numerous as the Alzheimer's disease (AD) patients for the structural magnetic resonance imaging (MRI) modality and six times as numerous as the control cases for the proteomics modality. Constructing an accurate classifier from imbalanced data is challenging: traditional classifiers that aim to maximize overall prediction accuracy tend to classify all data into the majority class. In this paper, we study an ensemble system of feature selection and data sampling for the class imbalance problem. We systematically analyze sampling techniques by examining the efficacy of different rates and types of undersampling, oversampling, and combinations of the two. We examine six widely used feature selection algorithms to identify significant biomarkers and thereby reduce the complexity of the data. The efficacy of the ensemble techniques is evaluated with two classifiers, Random Forest and Support Vector Machines, based on classification accuracy, area under the receiver operating characteristic curve (AUC), sensitivity, and specificity. Our experimental results show that, for various problem settings in ADNI, (1) a balanced training set obtained with K-Medoids-based undersampling gives the best overall performance compared with the other data sampling techniques and with no sampling; and (2) sparse logistic regression with stability selection achieves competitive performance among the feature selection algorithms. Experiments with various settings show that our proposed ensemble model of multiple undersampled datasets yields stable and promising results.
© 2013 Elsevier Inc. All rights reserved.
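
The K-Medoids-based undersampling mentioned in the abstract can be illustrated with a minimal sketch. It assumes one common reading of the technique: cluster the majority class into as many clusters as there are minority samples, then keep only the cluster medoids so the training set is balanced. This is not the authors' implementation; the function names (`k_medoids`, `undersample_majority`), the Euclidean distance, the alternating update rule, and the toy data are all assumptions made for illustration.

```python
import random
from math import dist  # Euclidean distance, Python 3.8+


def k_medoids(points, k, n_iter=50, seed=0):
    """Simple alternating k-medoids; returns indices of k medoids in `points`."""
    rng = random.Random(seed)
    medoids = rng.sample(range(len(points)), k)
    for _ in range(n_iter):
        # Assignment step: each medoid anchors its own cluster,
        # every other point joins the cluster of its nearest medoid.
        clusters = {m: [m] for m in medoids}
        for i, p in enumerate(points):
            if i in clusters:
                continue
            nearest = min(medoids, key=lambda m: dist(p, points[m]))
            clusters[nearest].append(i)
        # Update step: the new medoid of a cluster is the member with the
        # smallest total distance to all members of that cluster.
        new_medoids = [
            min(members,
                key=lambda c: sum(dist(points[c], points[j]) for j in members))
            for members in clusters.values()
        ]
        if set(new_medoids) == set(medoids):
            break  # converged
        medoids = new_medoids
    return sorted(medoids)


def undersample_majority(X, y, majority_label, seed=0):
    """Return a balanced (X, y): all minority samples plus majority medoids."""
    majority = [i for i, label in enumerate(y) if label == majority_label]
    minority = [i for i, label in enumerate(y) if label != majority_label]
    # One medoid per minority sample gives a 1:1 class ratio.
    medoid_positions = k_medoids([X[i] for i in majority], len(minority), seed=seed)
    keep = sorted([majority[p] for p in medoid_positions] + minority)
    return [X[i] for i in keep], [y[i] for i in keep]


# Toy two-feature data (hypothetical): 6 majority "MCI" vs 3 minority "AD".
X = [[0.0, 0.0], [0.1, 0.0], [5.0, 5.0], [5.1, 5.0], [10.0, 0.0], [10.1, 0.0],
     [1.0, 1.0], [2.0, 2.0], [3.0, 3.0]]
y = ["MCI"] * 6 + ["AD"] * 3

X_bal, y_bal = undersample_majority(X, y, majority_label="MCI")
print(len(X_bal), y_bal.count("MCI"), y_bal.count("AD"))
```

Because the retained majority samples are actual data points (medoids) rather than synthetic centroids, every row of the balanced set corresponds to a real subject. Repeating the call with different `seed` values is one simple way to obtain several undersampled datasets for an ensemble, though the abstract does not specify how the authors generated theirs.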

Keywords:  Alzheimer's disease; Classification; Feature selection; Imbalanced data; Oversampling; Undersampling

Year:  2013        PMID: 24176869      PMCID: PMC3946903          DOI: 10.1016/j.neuroimage.2013.10.005

Source DB:  PubMed          Journal:  Neuroimage        ISSN: 1053-8119            Impact factor:   6.556


  33 in total

1.  Delirium Prediction using Machine Learning Models on Preoperative Electronic Health Records Data.

Authors:  Anis Davoudi; Ashkan Ebadi; Parisa Rashidi; Tazcan Ozrazgat-Baslanti; Azra Bihorac; Alberto C Bursian
Journal:  Proc IEEE Int Symp Bioinformatics Bioeng       Date:  2018-01-11

2.  Reproducible Evaluation of Diffusion MRI Features for Automatic Classification of Patients with Alzheimer's Disease.

Authors:  Junhao Wen; Jorge Samper-González; Simona Bottani; Alexandre Routier; Ninon Burgos; Thomas Jacquemont; Sabrina Fontanella; Stanley Durrleman; Stéphane Epelbaum; Anne Bertrand; Olivier Colliot
Journal:  Neuroinformatics       Date:  2021-01

3.  An empirical study of a hybrid imbalanced-class DT-RST classification procedure to elucidate therapeutic effects in uremia patients.

Authors:  You-Shyang Chen
Journal:  Med Biol Eng Comput       Date:  2016-04-06       Impact factor: 2.602

Review 4.  Recent publications from the Alzheimer's Disease Neuroimaging Initiative: Reviewing progress toward improved AD clinical trials.

Authors:  Michael W Weiner; Dallas P Veitch; Paul S Aisen; Laurel A Beckett; Nigel J Cairns; Robert C Green; Danielle Harvey; Clifford R Jack; William Jagust; John C Morris; Ronald C Petersen; Andrew J Saykin; Leslie M Shaw; Arthur W Toga; John Q Trojanowski
Journal:  Alzheimers Dement       Date:  2017-03-22       Impact factor: 21.566

5.  An ensemble learning method for asthma control level detection with leveraging medical knowledge-based classifier and supervised learning.

Authors:  Roghaye Khasha; Mohammad Mehdi Sepehri; Seyed Alireza Mahdaviani
Journal:  J Med Syst       Date:  2019-04-26       Impact factor: 4.460

Review 6.  2014 Update of the Alzheimer's Disease Neuroimaging Initiative: A review of papers published since its inception.

Authors:  Michael W Weiner; Dallas P Veitch; Paul S Aisen; Laurel A Beckett; Nigel J Cairns; Jesse Cedarbaum; Robert C Green; Danielle Harvey; Clifford R Jack; William Jagust; Johan Luthman; John C Morris; Ronald C Petersen; Andrew J Saykin; Leslie Shaw; Li Shen; Adam Schwarz; Arthur W Toga; John Q Trojanowski
Journal:  Alzheimers Dement       Date:  2015-06       Impact factor: 21.566

7.  Automated gene expression pattern annotation in the mouse brain.

Authors:  Tao Yang; Xinlin Zhao; Binbin Lin; Tao Zeng; Shuiwang Ji; Jieping Ye
Journal:  Pac Symp Biocomput       Date:  2015

8.  Melancholic depression prediction by identifying representative features in metabolic and microarray profiles with missing values.

Authors:  Zhi Nie; Tao Yang; Yashu Liu; Qingyang Li; Vaibhav A Narayan; Gayle Wittenberg; Jieping Ye
Journal:  Pac Symp Biocomput       Date:  2015

9.  Identifying a clinical signature of suicidality among patients with mood disorders: A pilot study using a machine learning approach.

Authors:  Ives Cavalcante Passos; Benson Mwangi; Bo Cao; Jane E Hamilton; Mon-Ju Wu; Xiang Yang Zhang; Giovana B Zunta-Soares; Joao Quevedo; Marcia Kauer-Sant'Anna; Flávio Kapczinski; Jair C Soares
Journal:  J Affect Disord       Date:  2016-01-01       Impact factor: 4.839

10.  Cortical thickness patterns as state biomarker of anorexia nervosa.

Authors:  Luca Lavagnino; Benson Mwangi; Bo Cao; Megan E Shott; Jair C Soares; Guido K W Frank
Journal:  Int J Eat Disord       Date:  2018-02-07       Impact factor: 4.861
