A Random Forests Quantile Classifier for Class Imbalanced Data.

Robert O'Brien, Hemant Ishwaran.

Abstract

Extending previous work on quantile classifiers (q-classifiers), we propose the q*-classifier for the class imbalance problem. The classifier assigns a sample to the minority class if the minority class conditional probability exceeds a threshold q*, with 0 < q* < 1, where q* equals the unconditional probability of observing a minority class sample. The motivation for q*-classification stems from a density-based approach and leads to the useful property that the q*-classifier maximizes the sum of the true positive and true negative rates. Moreover, because the procedure can be equivalently expressed as a cost-weighted Bayes classifier, it also minimizes weighted risk. Because of this dual optimization, the q*-classifier can achieve near-zero risk in imbalance problems while simultaneously optimizing true positive and true negative rates. We use random forests to apply q*-classification. This new method, which we call RFQ, is shown to outperform or be competitive with existing techniques with respect to G-mean performance and variable selection. Extensions to the multiclass imbalanced setting are also considered.
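
The thresholding rule described above can be illustrated with a minimal sketch. This is not the authors' RFQ implementation: the function names `q_star`, `q_classify`, and `g_mean` are hypothetical, the probabilities are made-up toy values standing in for the output of any probability estimator (such as a random forest), and the G-mean is taken here to be the geometric mean of the true positive and true negative rates.

```python
from math import sqrt

def q_star(labels, minority=1):
    """Estimate q* as the observed fraction of minority-class samples."""
    return sum(1 for y in labels if y == minority) / len(labels)

def q_classify(probs, q):
    """Predict the minority class (1) when the estimated minority
    probability exceeds the threshold q, otherwise the majority class (0)."""
    return [1 if p > q else 0 for p in probs]

def g_mean(labels, preds, minority=1):
    """Geometric mean of the true positive and true negative rates."""
    tp = sum(1 for y, z in zip(labels, preds) if y == minority and z == minority)
    tn = sum(1 for y, z in zip(labels, preds) if y != minority and z != minority)
    n_pos = sum(1 for y in labels if y == minority)
    n_neg = len(labels) - n_pos
    return sqrt((tp / n_pos) * (tn / n_neg))

# Toy data (assumed for illustration): 1 minority sample out of 10.
labels = [1, 0, 0, 0, 0, 0, 0, 0, 0, 0]
# Hypothetical estimated minority-class probabilities for each sample.
probs = [0.30, 0.05, 0.02, 0.20, 0.01, 0.03, 0.04, 0.06, 0.02, 0.01]

q = q_star(labels)                       # minority prevalence, 0.1 here
default_preds = q_classify(probs, 0.5)   # usual 0.5 threshold
qstar_preds = q_classify(probs, q)       # q*-threshold

print("q* =", q)
print("G-mean at 0.5:", g_mean(labels, default_preds))
print("G-mean at q*: ", g_mean(labels, qstar_preds))
```

On this toy data the 0.5 threshold never predicts the minority class, so its true positive rate and G-mean are zero, whereas thresholding at q* recovers the minority sample at the cost of one false positive.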

Keywords:  Class Imbalance; Minority Class; Random Forests; Response-based Sampling; Weighted Bayes Classifier

Year:  2019        PMID: 30765897      PMCID: PMC6370055          DOI: 10.1016/j.patcog.2019.01.036

Source DB:  PubMed          Journal:  Pattern Recognit        ISSN: 0031-3203            Impact factor:   7.740


