
Class-imbalanced classifiers for high-dimensional data.

Wei-Jiun Lin, James J Chen.

Abstract

A class-imbalanced classifier is a decision rule that predicts the class membership of new samples from an available data set in which the class sizes differ considerably. When the class sizes are very different, most standard classification algorithms tend to favor the larger (majority) class, resulting in poor prediction accuracy for the minority class. A class-imbalanced classifier typically modifies a standard classifier with a correction strategy, or incorporates a new strategy in the training phase, to account for the differing class sizes. This article reviews and evaluates some of the most important methods for class prediction of high-dimensional imbalanced data. The evaluation addresses the fundamental issues of the class-imbalanced classification problem: imbalance ratio, small disjuncts and overlap complexity, lack of data, and feature selection. Four class-imbalanced classifiers are considered: three standard classification algorithms, each coupled with an ensemble correction strategy, and one support vector machine (SVM)-based correction classifier. The three algorithms are (i) diagonal linear discriminant analysis (DLDA), (ii) random forests (RFs) and (iii) SVMs. The SVM-based correction classifier is SVM threshold adjustment (SVM-THR). A Monte Carlo simulation and five genomic data sets were used to illustrate the analysis and address the issues. The SVM-ensemble classifier appears to perform best when the class imbalance is not too severe. SVM-THR performs well if the imbalance is severe and the predictors are highly correlated. DLDA with feature selection can perform well without the ensemble correction.
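
As a rough illustration of what coupling a standard algorithm with an ensemble correction can look like, the sketch below trains several DLDA classifiers, each on a balanced random undersample of the data, and combines them by majority vote. The abstract does not specify the ensemble scheme, so the undersampling-plus-voting design, the equal-prior DLDA rule, and all function names (`fit_dlda`, `predict_dlda`, `fit_balanced_ensemble`, `predict_ensemble`) are assumptions made for illustration, not the authors' implementation. Feature selection is omitted.

```python
import random
from statistics import mean

def fit_dlda(X, y):
    """Fit diagonal LDA: per-class feature means plus a pooled per-feature variance."""
    classes = sorted(set(y))
    p = len(X[0])
    means = {c: [mean(x[j] for x, lab in zip(X, y) if lab == c) for j in range(p)]
             for c in classes}
    # Pooled within-class variance per feature; covariances between features are ignored.
    dof = max(len(X) - len(classes), 1)
    var = [sum((x[j] - means[lab][j]) ** 2 for x, lab in zip(X, y)) / dof + 1e-8
           for j in range(p)]
    return means, var

def predict_dlda(model, x):
    """Assign x to the class whose mean is nearest in variance-scaled squared distance."""
    means, var = model
    return min(means, key=lambda c: sum((xj - m) ** 2 / v
                                        for xj, m, v in zip(x, means[c], var)))

def fit_balanced_ensemble(X, y, n_members=25, seed=0):
    """Assumed correction scheme: each member sees an equal-sized random sample per class."""
    rng = random.Random(seed)
    by_class = {}
    for i, lab in enumerate(y):
        by_class.setdefault(lab, []).append(i)
    n_min = min(len(ids) for ids in by_class.values())  # minority class size
    members = []
    for _ in range(n_members):
        idx = [i for ids in by_class.values() for i in rng.sample(ids, n_min)]
        members.append(fit_dlda([X[i] for i in idx], [y[i] for i in idx]))
    return members

def predict_ensemble(members, x):
    """Majority vote over the ensemble members."""
    votes = [predict_dlda(m, x) for m in members]
    return max(set(votes), key=votes.count)

# Synthetic imbalanced data (10:1), purely for demonstration.
demo_rng = random.Random(1)
X_demo = ([[demo_rng.gauss(0, 1) for _ in range(5)] for _ in range(200)]
          + [[demo_rng.gauss(2, 1) for _ in range(5)] for _ in range(20)])
y_demo = [0] * 200 + [1] * 20
ensemble = fit_balanced_ensemble(X_demo, y_demo)
print(predict_ensemble(ensemble, [2.0] * 5), predict_ensemble(ensemble, [0.0] * 5))
```

Because every member is trained on equal class sizes, no single fit is dominated by the majority class, while repeated resampling lets the ensemble use most of the majority samples. An odd `n_members` avoids tied votes in the two-class case.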

Year:  2012        PMID: 22408190     DOI: 10.1093/bib/bbs006

Source DB:  PubMed          Journal:  Brief Bioinform        ISSN: 1467-5463            Impact factor:   11.622


  47 in total

1.  Predicting Inpatient Acute Kidney Injury over Different Time Horizons: How Early and Accurate?

Authors:  Peng Cheng; Lemuel R Waitman; Yong Hu; Mei Liu
Journal:  AMIA Annu Symp Proc       Date:  2018-04-16

2.  Prediction of Future Chronic Opioid Use Among Hospitalized Patients.

Authors:  S L Calcaterra; S Scarbro; M L Hull; A D Forber; I A Binswanger; K L Colborn
Journal:  J Gen Intern Med       Date:  2018-02-05       Impact factor: 5.128

3.  A systematic review on metabolomics-based diagnostic biomarker discovery and validation in pancreatic cancer.

Authors:  Nguyen Phuoc Long; Sang Jun Yoon; Nguyen Hoang Anh; Tran Diem Nghi; Dong Kyu Lim; Yu Jin Hong; Soon-Sun Hong; Sung Won Kwon
Journal:  Metabolomics       Date:  2018-08-10       Impact factor: 4.290

4.  Computer-Assisted Diagnosis System for Breast Cancer in Computed Tomography Laser Mammography (CTLM).

Authors:  Afsaneh Jalalian; Syamsiah Mashohor; Rozi Mahmud; Babak Karasfi; M Iqbal Saripan; Abdul Rahman Ramli
Journal:  J Digit Imaging       Date:  2017-12       Impact factor: 4.056

5.  Differential distribution improves gene selection stability and has competitive classification performance for patient survival.

Authors:  Dario Strbenac; Graham J Mann; Jean Y H Yang; John T Ormerod
Journal:  Nucleic Acids Res       Date:  2016-05-17       Impact factor: 16.971

6.  A guide to automated apoptosis detection: How to make sense of imaging flow cytometry data.

Authors:  Dennis Pischel; Jörn H Buchbinder; Kai Sundmacher; Inna N Lavrik; Robert J Flassig
Journal:  PLoS One       Date:  2018-05-16       Impact factor: 3.240

7.  Transcriptome marker diagnostics using big data.

Authors:  Henry Han; Ying Liu
Journal:  IET Syst Biol       Date:  2016-02       Impact factor: 1.615

8.  Deep neural network with weight sparsity control and pre-training extracts hierarchical features and enhances classification performance: Evidence from whole-brain resting-state functional connectivity patterns of schizophrenia.

Authors:  Junghoe Kim; Vince D Calhoun; Eunsoo Shim; Jong-Hwan Lee
Journal:  Neuroimage       Date:  2015-05-15       Impact factor: 6.556

9.  Improved shrunken centroid classifiers for high-dimensional class-imbalanced data.

Authors:  Rok Blagus; Lara Lusa
Journal:  BMC Bioinformatics       Date:  2013-02-23       Impact factor: 3.169

10.  Prediction of suicidal ideation and attempt in 9 and 10 year-old children using transdiagnostic risk features.

Authors:  Gareth Harman; Dakota Kliamovich; Angelica M Morales; Sydney Gilbert; Deanna M Barch; Michael A Mooney; Sarah W Feldstein Ewing; Damien A Fair; Bonnie J Nagel
Journal:  PLoS One       Date:  2021-05-25       Impact factor: 3.240
