Improving predictions in imbalanced data using Pairwise Expanded Logistic Regression.

Xiaoqian Jiang, Robert El-Kareh, Lucila Ohno-Machado.

Abstract

Building classifiers for medical problems often involves dealing with rare, but important events. Imbalanced datasets pose challenges to ordinary classification algorithms such as Logistic Regression (LR) and Support Vector Machines (SVM). The lack of effective strategies for dealing with imbalanced training data often results in models that exhibit poor discrimination. We propose a novel approach to estimate class memberships based on the evaluation of pairwise relationships in the training data. The method we propose, Pairwise Expanded Logistic Regression, improved discrimination and had higher accuracy when compared to existing methods in two imbalanced datasets, thus showing promise as a potential remedy for this problem.
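The abstract does not spell out the algorithm, so the following is only an illustrative sketch of the general idea of learning from pairwise relationships, not the authors' Pairwise Expanded Logistic Regression. It assumes one simple form of pairwise expansion: every (positive, negative) pair becomes two feature-difference rows, which yields a perfectly balanced training set regardless of how rare the positive class is. The function names, toy data, and hyperparameters are all hypothetical.

```python
import math
from itertools import product


def pairwise_expand(X, y):
    """Turn each (positive, negative) pair into two difference rows.

    Row (pos - neg) gets label 1 and row (neg - pos) gets label 0, so the
    expanded set is balanced even when positives are rare.
    """
    pos = [x for x, label in zip(X, y) if label == 1]
    neg = [x for x, label in zip(X, y) if label == 0]
    rows, labels = [], []
    for p, n in product(pos, neg):
        diff = [a - b for a, b in zip(p, n)]
        rows.append(diff)
        labels.append(1)
        rows.append([-d for d in diff])
        labels.append(0)
    return rows, labels


def fit_logistic(rows, labels, lr=0.1, epochs=200):
    """Plain batch gradient descent for logistic regression without intercept."""
    w = [0.0] * len(rows[0])
    for _ in range(epochs):
        grad = [0.0] * len(w)
        for x, t in zip(rows, labels):
            z = sum(wi * xi for wi, xi in zip(w, x))
            p = 1.0 / (1.0 + math.exp(-z))
            for j, xj in enumerate(x):
                grad[j] += (p - t) * xj
        w = [wi - lr * g / len(rows) for wi, g in zip(w, grad)]
    return w


def concordance(scores, y):
    """Fraction of (positive, negative) pairs ranked correctly (ties count 0.5)."""
    pos = [s for s, label in zip(scores, y) if label == 1]
    neg = [s for s, label in zip(scores, y) if label == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p, n in product(pos, neg))
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: 2 rare positives among 10 cases.
X = [[2.0, 1.0], [1.8, 0.5], [0.1, 0.9], [0.3, 0.2], [0.5, 1.1],
     [0.2, 0.4], [0.4, 0.7], [0.0, 0.1], [0.6, 0.3], [0.3, 1.0]]
y = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]

rows, labels = pairwise_expand(X, y)   # 2 * 8 pairs, two rows per pair
w = fit_logistic(rows, labels)
scores = [sum(wi * xi for wi, xi in zip(w, x)) for x in X]
auc = concordance(scores, y)
print(len(rows), sum(labels), auc)
```

The concordance function is the pairwise reading of discrimination: it equals the area under the ROC curve, which is the probability that a randomly chosen positive case scores higher than a randomly chosen negative one.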

Year:  2011        PMID: 22195118      PMCID: PMC3243279     

Source DB:  PubMed          Journal:  AMIA Annu Symp Proc        ISSN: 1559-4076


References: 16 in total (first 10 listed)

1.  Feedforward neural network models for handling class overlap and class imbalance.

Authors:  Ralf Kretzschmar; Nicolaos B Karayiannis; Fritz Eggimann
Journal:  Int J Neural Syst       Date:  2005-10       Impact factor: 5.866

2.  Learning from imbalanced data in surveillance of nosocomial infection.

Authors:  Gilles Cohen; Mélanie Hilario; Hugo Sax; Stéphane Hugonnet; Antoine Geissbuhler
Journal:  Artif Intell Med       Date:  2005-10-17       Impact factor: 5.326

3.  Exploratory undersampling for class-imbalance learning.

Authors:  Xu-Ying Liu; Jianxin Wu; Zhi-Hua Zhou
Journal:  IEEE Trans Syst Man Cybern B Cybern       Date:  2008-12-16

4.  Evolutionary undersampling for classification with imbalanced datasets: proposals and taxonomy.

Authors:  Salvador García; Francisco Herrera
Journal:  Evol Comput       Date:  2009       Impact factor: 3.277

5.  Incidence and predictors of microbiology results returning postdischarge and requiring follow-up.

Authors:  Robert El-Kareh; Christopher Roy; Gregor Brodsky; Molly Perencevich; Eric G Poon
Journal:  J Hosp Med       Date:  2011-05       Impact factor: 2.960

6.  The meaning and use of the area under a receiver operating characteristic (ROC) curve.

Authors:  J A Hanley; B J McNeil
Journal:  Radiology       Date:  1982-04       Impact factor: 11.105

7.  Prognostic impact of pulmonary arterial hypertension: a population-based analysis.

Authors:  Melinda Carrington; Niamh F Murphy; Geoff Strange; Andrew Peacock; John J V McMurray; Simon Stewart
Journal:  Int J Cardiol       Date:  2007-04-11       Impact factor: 4.164

8.  Risks and costs of end-stage renal disease after heart transplantation.

Authors:  J Hornberger; J Best; J Geppert; M McClellan
Journal:  Transplantation       Date:  1998-12-27       Impact factor: 4.939

9.  A particle swarm based hybrid system for imbalanced medical data sampling.

Authors:  Pengyi Yang; Liang Xu; Bing B Zhou; Zili Zhang; Albert Y Zomaya
Journal:  BMC Genomics       Date:  2009-12-03       Impact factor: 3.969

10.  Semi-automated screening of biomedical citations for systematic reviews.

Authors:  Byron C Wallace; Thomas A Trikalinos; Joseph Lau; Carla Brodley; Christopher H Schmid
Journal:  BMC Bioinformatics       Date:  2010-01-26       Impact factor: 3.169

Cited by: 4 in total

1.  Analysis of sampling techniques for imbalanced data: An n = 648 ADNI study.

Authors:  Rashmi Dubey; Jiayu Zhou; Yalin Wang; Paul M Thompson; Jieping Ye
Journal:  Neuroimage       Date:  2013-10-29       Impact factor: 6.556

2.  Predicting complications of percutaneous coronary intervention using a novel support vector method.

Authors:  Gyemin Lee; Hitinder S Gurm; Zeeshan Syed
Journal:  J Am Med Inform Assoc       Date:  2013-04-18       Impact factor: 4.497

3.  Grid Binary LOgistic REgression (GLORE): building shared models without sharing data.

Authors:  Yuan Wu; Xiaoqian Jiang; Jihoon Kim; Lucila Ohno-Machado
Journal:  J Am Med Inform Assoc       Date:  2012-04-17       Impact factor: 4.497

4.  A Systematic Machine Learning Based Approach for the Diagnosis of Non-Alcoholic Fatty Liver Disease Risk and Progression.

Authors:  Sajida Perveen; Muhammad Shahbaz; Karim Keshavjee; Aziz Guergachi
Journal:  Sci Rep       Date:  2018-02-01       Impact factor: 4.379
