Variable selection in Logistic regression model with genetic algorithm.

Zhongheng Zhang, Victor Trevino, Sayed Shahabuddin Hoseini, Smaranda Belciug, Arumugam Manivanna Boopathi, Ping Zhang, Florin Gorunescu, Velappan Subha, Songshi Dai.

Abstract

Variable (or feature) selection is one of the most important steps in model specification. In medical decision-making especially, using a medical database directly, without prior analysis and preprocessing, is often counterproductive. Variable selection is the process of choosing the most relevant attributes from the database in order to build robust learning models and thereby improve the performance of the models used in the decision process. In biomedical research, its purpose is to retain clinically important and statistically significant variables while excluding unrelated or noise variables. A variety of methods exist for variable selection, but none is without limitations. For example, the widely used stepwise approach adds the best variable in each cycle and generally produces an acceptable set of variables; however, it is commonly trapped in local optima. The best subset approach can systematically search the entire covariate pattern space, but the solution pool can be extremely large when there are tens to hundreds of variables, as is common in today's clinical data. Genetic algorithms (GA) are heuristic optimization approaches that can be used for variable selection in multivariable regression models. This tutorial paper aims to provide a step-by-step approach to the use of GA in variable selection. The R code provided in the text can be extended and adapted to other data analysis needs.
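
To make the idea in the abstract concrete, the following is a minimal sketch of GA-based variable selection for logistic regression. It is not the paper's R code and does not use the galgo API: it is written in plain Python for illustration, and every function name (`neg_aic`, `ga_select`, `make_data`), the use of AIC as the fitness measure, the GA settings, and the synthetic data are assumptions made here. Each candidate solution is a 0/1 mask over the variables; fitter masks are more likely to be recombined and mutated into the next generation.

```python
import math
import random


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def neg_aic(X, y, mask, steps=60, lr=1.0):
    """Fitness of a variable subset: -AIC of a logistic model fit by gradient ascent."""
    cols = [j for j, bit in enumerate(mask) if bit]
    rows = [[1.0] + [x[j] for j in cols] for x in X]  # leading 1.0 is the intercept
    k, n = len(cols) + 1, len(rows)
    w = [0.0] * k
    for _ in range(steps):
        grad = [0.0] * k
        for r, t in zip(rows, y):
            err = t - sigmoid(sum(wi * ri for wi, ri in zip(w, r)))
            for j in range(k):
                grad[j] += err * r[j]
        w = [wi + lr * g / n for wi, g in zip(w, grad)]
    loglik = 0.0
    for r, t in zip(rows, y):
        p = sigmoid(sum(wi * ri for wi, ri in zip(w, r)))
        p = min(max(p, 1e-12), 1.0 - 1e-12)  # guard against log(0)
        loglik += math.log(p) if t else math.log(1.0 - p)
    return 2.0 * loglik - 2.0 * k  # equals -AIC, so larger is better


def ga_select(X, y, pop_size=20, generations=15, p_mut=0.1, seed=0):
    """Search over 0/1 inclusion masks with a basic generational GA."""
    rng = random.Random(seed)
    n_vars = len(X[0])
    cache = {}  # each distinct mask is fitted only once

    def fitness(mask):
        key = tuple(mask)
        if key not in cache:
            cache[key] = neg_aic(X, y, key)
        return cache[key]

    def tournament(pop):
        a, b = rng.sample(pop, 2)
        return a if fitness(a) >= fitness(b) else b

    pop = [[rng.randint(0, 1) for _ in range(n_vars)] for _ in range(pop_size)]
    for _ in range(generations):
        elite = max(pop, key=fitness)
        children = [elite[:]]  # elitism: carry the best mask forward unchanged
        while len(children) < pop_size:
            mum, dad = tournament(pop), tournament(pop)
            cut = rng.randint(1, n_vars - 1)  # one-point crossover
            child = mum[:cut] + dad[cut:]
            child = [1 - b if rng.random() < p_mut else b for b in child]  # bit-flip mutation
            children.append(child)
        pop = children
    best = max(pop, key=fitness)
    return best, fitness(best)


def make_data(n=120, n_vars=6, seed=1):
    """Synthetic data (an assumption): only variables 0 and 1 affect the outcome."""
    rng = random.Random(seed)
    X = [[rng.gauss(0, 1) for _ in range(n_vars)] for _ in range(n)]
    y = [1 if rng.random() < sigmoid(2.5 * x[0] - 2.5 * x[1]) else 0 for x in X]
    return X, y


X, y = make_data()
best_mask, best_fitness = ga_select(X, y)
print("selected variables:", [j for j, bit in enumerate(best_mask) if bit])
```

With only six candidate variables an exhaustive search over all 64 subsets would also be feasible; the GA becomes worthwhile when the number of variables makes the subset space too large to enumerate. In practice a fitted-model routine from a statistics library would replace the hand-written gradient ascent used here.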

Keywords:  Logistic regression; galgo; genetic algorithm (GA); variable selection

Year:  2018        PMID: 29610737      PMCID: PMC5879502          DOI: 10.21037/atm.2018.01.15

Source DB:  PubMed          Journal:  Ann Transl Med        ISSN: 2305-5839


