Efficient and sparse feature selection for biomedical text classification via the elastic net: Application to ICU risk stratification from nursing notes.

Ben J Marafino, W John Boscardin, R Adams Dudley.

Abstract

BACKGROUND AND SIGNIFICANCE: Sparsity is often a desirable property of statistical models, and various feature selection methods exist to yield sparser, more interpretable models. However, their application to biomedical text classification, particularly to mortality risk stratification among intensive care unit (ICU) patients, has not been thoroughly studied.
OBJECTIVE: To develop and characterize sparse classifiers based on the free text of nursing notes in order to predict ICU mortality risk and to discover text features most strongly associated with mortality.
METHODS: We selected nursing notes from the first 24h of ICU admission for 25,826 adult ICU patients from the MIMIC-II database. We then developed a pair of stochastic gradient descent-based classifiers with elastic-net regularization. We also studied the performance-sparsity tradeoffs of both classifiers as their regularization parameters were varied.
RESULTS: The best-performing classifier achieved a 10-fold cross-validated AUC of 0.897 under the log loss function and full L2 regularization, while full L1 regularization used just 0.00025% of candidate input features and resulted in an AUC of 0.889. Using the log loss (range of AUCs 0.889-0.897) yielded better performance compared to the hinge loss (0.850-0.876), but the latter yielded even sparser models.
DISCUSSION: Most features selected by both classifiers appear clinically relevant and correspond to predictors already present in existing ICU mortality models. The sparser classifiers were also able to discover a number of informative - albeit nonclinical - features.
CONCLUSION: The elastic-net-regularized classifiers perform reasonably well and are capable of reducing the number of features required by over a thousandfold, with only a modest impact on performance.
Copyright © 2015 Elsevier Inc. All rights reserved.
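
ILLUSTRATIVE SKETCH: The abstract describes stochastic gradient descent classifiers with elastic-net regularization and a tradeoff between sparsity and performance as the regularization is varied. The Python sketch below shows one way such a classifier could look for the log loss, using bag-of-words counts and a proximal (soft-thresholding) step for the L1 part of the penalty. It is not the authors' implementation. The toy notes, the labels, the function names, and the parameters alpha, l1_ratio, lr, and epochs are all assumptions made for illustration; l1_ratio = 0 is meant to mirror "full L2" and l1_ratio = 1 "full L1" regularization.

```python
import math
import random
from collections import Counter

# Invented toy notes for illustration only; not drawn from MIMIC-II.
NOTES = [
    "pt intubated on pressors family at bedside",
    "pt intubated unresponsive on pressors",
    "pt on pressors unresponsive family meeting",
    "pt alert and oriented tolerating diet",
    "pt alert ambulating tolerating diet family at bedside",
    "pt oriented ambulating pain controlled",
]
LABELS = [1, 1, 1, 0, 0, 0]  # hypothetical outcome: 1 = died, 0 = survived


def featurize(note):
    """Bag-of-words counts for one note (lowercased, whitespace-tokenized)."""
    return Counter(note.lower().split())


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def soft_threshold(w, t):
    """Shrink w toward zero by t; anything within [-t, t] becomes exactly 0."""
    if w > t:
        return w - t
    if w < -t:
        return w + t
    return 0.0


def train_elastic_net_sgd(notes, labels, alpha=0.01, l1_ratio=0.5,
                          lr=0.1, epochs=50, seed=0):
    """Log-loss SGD with the (assumed) penalty
    alpha * (l1_ratio * |w|_1 + (1 - l1_ratio) / 2 * |w|_2^2)."""
    rng = random.Random(seed)
    data = [(featurize(n), y) for n, y in zip(notes, labels)]
    vocab = sorted({tok for x, _ in data for tok in x})
    w = {tok: 0.0 for tok in vocab}
    b = 0.0
    for _ in range(epochs):
        rng.shuffle(data)
        for x, y in data:
            z = b + sum(w[tok] * c for tok, c in x.items())
            g = sigmoid(z) - y  # derivative of the log loss w.r.t. z
            b -= lr * g         # the intercept is left unpenalized
            for tok in vocab:
                # Gradient step on the loss plus the smooth L2 term ...
                grad = g * x.get(tok, 0) + alpha * (1.0 - l1_ratio) * w[tok]
                # ... then a proximal step for the L1 term, which can set
                # weights to exactly zero.
                w[tok] = soft_threshold(w[tok] - lr * grad,
                                        lr * alpha * l1_ratio)
    return w, b


def predict_proba(w, b, note):
    """Predicted probability of the positive class for one note."""
    x = featurize(note)
    return sigmoid(b + sum(w.get(tok, 0.0) * c for tok, c in x.items()))


def count_nonzero(w):
    """Number of features the model actually uses."""
    return sum(1 for v in w.values() if v != 0.0)


if __name__ == "__main__":
    # Sweep from full L2 (0.0) to full L1 (1.0) and report model size.
    for ratio in (0.0, 0.5, 1.0):
        w, b = train_elastic_net_sgd(NOTES, LABELS, alpha=0.05, l1_ratio=ratio)
        print(f"l1_ratio={ratio}: {count_nonzero(w)} of {len(w)} features nonzero")
```

In this sketch the weights of the surviving features can be read off directly from the returned dictionary, which is the kind of inspection that makes sparse text models interpretable. Swapping the log loss for the hinge loss would only change the line that computes g.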

Keywords:  Elastic net; Feature selection; ICU; Machine learning; Risk stratification; Text mining

Year:  2015        PMID: 25700665     DOI: 10.1016/j.jbi.2015.02.003

Source DB:  PubMed          Journal:  J Biomed Inform        ISSN: 1532-0464            Impact factor:   6.317

