Effect of incremental feature enrichment on healthcare text classification system: A machine learning paradigm.

Saurabh Kumar Srivastava¹, Sandeep Kumar Singh², Jasjit S Suri³.

Abstract

BACKGROUND AND OBJECTIVE: Healthcare tweets are particularly challenging to classify because of their sparse layout and limited character count. Compared with previous methods based on the "bag of words" (BOW) model, this study identifies an enrichment protocol and examines how semantically different aspects of feature selection, namely BOW (feature F0), term frequency-inverse document frequency (TF-IDF, feature F1), and latent semantic indexing (LSI, feature F2), improve overall performance when applied sequentially with a classifier.
METHODS: To study this enrichment concept, our machine learning (ML) model was tested on two diverse data sets: (i) D1: a disease data set of tweets related to conjunctivitis, diarrhea, stomach ache, cough, and nausea, and (ii) D2: the WebKB4 data set, using three kinds of classifiers: (a) C1: support vector machine with radial basis function (SVMR), (b) C2: multi-layer perceptron (MLP), and (c) C3: random forest (RF). A partition protocol (K10) was adopted with different performance metrics to evaluate the ML system.
RESULTS: Using the combination of F1, C1, D1, and K10, ML accuracy was 94%, while with F2, C1, D1, and K10, ML accuracy was 97%. With incremental feature enrichment from F0 to F2 under the K10 protocol, F1 improved over F0 by 4.98% on the Disease data set, while F2 improved over F0 by 11.78% on the WebKB4 data set. We demonstrated generalization over memorization in our ML design. The system was tested for stability and reliability.
CONCLUSIONS: We conclude that semantically different aspects of feature selection, when applied sequentially, lead to improved ML accuracy for healthcare data sets. We validated the system using non-healthcare data sets.
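
The first enrichment step described in the abstract (F0 to F1) and the partition protocol can be illustrated with a minimal sketch. This is not the authors' implementation. The toy tweets, the function names, and the smoothed IDF formula are assumptions made for illustration, and K10 is assumed here to mean 10-fold cross-validation, which the abstract does not spell out. LSI (F2) is left out because it needs a singular value decomposition; with scikit-learn, one option would be to apply TruncatedSVD to the TF-IDF matrix.

```python
import math
from collections import Counter

# Toy corpus: invented for illustration, not taken from the D1 data set.
docs = [
    "dry cough and sore throat",
    "cough cough all night",
    "nausea after lunch",
    "stomach ache and nausea",
]

def bow(doc):
    """F0: raw term counts for one document (bag of words)."""
    return Counter(doc.lower().split())

def tfidf(counts_per_doc):
    """F1: reweight F0 counts by a smoothed inverse document frequency."""
    n_docs = len(counts_per_doc)
    # Iterating a Counter yields each distinct term once per document,
    # so this counts how many documents contain the term.
    doc_freq = Counter(term for counts in counts_per_doc for term in counts)
    idf = {t: math.log((1 + n_docs) / (1 + df)) + 1 for t, df in doc_freq.items()}
    return [{t: c * idf[t] for t, c in counts.items()} for counts in counts_per_doc]

def k_fold_indices(n_samples, k=10):
    """Yield (train, test) index lists; each sample is held out exactly once."""
    indices = list(range(n_samples))
    for fold in range(k):
        test = indices[fold::k]
        held_out = set(test)
        train = [i for i in indices if i not in held_out]
        yield train, test

f0 = [bow(d) for d in docs]          # F0 features
f1 = tfidf(f0)                       # F1 features, enriched from F0
folds = list(k_fold_indices(20, k=10))  # assumed K10 protocol on 20 samples
```

In this sketch, a term that occurs in fewer documents (such as "dry") receives a higher F1 weight than a more widespread term (such as "cough") at equal raw count, which is the extra signal TF-IDF adds on top of BOW. A classifier such as an RBF-kernel SVM would then be trained on each fold's training indices and scored on the held-out indices.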

Keywords: Feature enrichment; Healthcare text classification; MLP; Machine learning; Performance; RF; SVMR; Twitter

Year: 2019 PMID: 30902126 DOI: 10.1016/j.cmpb.2019.01.011

Source DB: PubMed Journal: Comput Methods Programs Biomed ISSN: 0169-2607 Impact factor: 5.428

Cited

4 in total

1. Multiclass machine learning vs. conventional calculators for stroke/CVD risk assessment using carotid plaque predictors with coronary angiography scores as gold standard: a 500 participants study.

Authors: Ankush D Jamthikar; Deep Gupta; Laura E Mantella; Luca Saba; John R Laird; Amer M Johri; Jasjit S Suri
Journal: Int J Cardiovasc Imaging Date: 2020-11-12 Impact factor: 2.357

2. Ultrasound-based stroke/cardiovascular risk stratification using Framingham Risk Score and ASCVD Risk Score based on "Integrated Vascular Age" instead of "Chronological Age": a multi-ethnic study of Asian Indian, Caucasian, and Japanese cohorts.

Authors: Ankush Jamthikar; Deep Gupta; Elisa Cuadrado-Godia; Anudeep Puvvula; Narendra N Khanna; Luca Saba; Klaudija Viskovic; Sophie Mavrogeni; Monika Turk; John R Laird; Gyan Pareek; Martin Miner; Petros P Sfikakis; Athanasios Protogerou; George D Kitas; Chithra Shankar; Andrew Nicolaides; Vijay Viswanathan; Aditya Sharma; Jasjit S Suri
Journal: Cardiovasc Diagn Ther Date: 2020-08

3. COVLIAS 2.0-cXAI: Cloud-Based Explainable Deep Learning System for COVID-19 Lesion Localization in Computed Tomography Scans.

Authors: Jasjit S Suri; Sushant Agarwal; Gian Luca Chabert; Alessandro Carriero; Alessio Paschè; Pietro S C Danna; Luca Saba; Armin Mehmedović; Gavino Faa; Inder M Singh; Monika Turk; Paramjit S Chadha; Amer M Johri; Narendra N Khanna; Sophie Mavrogeni; John R Laird; Gyan Pareek; Martin Miner; David W Sobel; Antonella Balestrieri; Petros P Sfikakis; George Tsoulfas; Athanasios D Protogerou; Durga Prasanna Misra; Vikas Agarwal; George D Kitas; Jagjit S Teji; Mustafa Al-Maini; Surinder K Dhanjil; Andrew Nicolaides; Aditya Sharma; Vijay Rathore; Mostafa Fatemi; Azra Alizad; Pudukode R Krishnan; Ferenc Nagy; Zoltan Ruzsa; Mostafa M Fouda; Subbaram Naidu; Klaudija Viskovic; Mannudeep K Kalra
Journal: Diagnostics (Basel) Date: 2022-06-16

4. Brain Tumor Characterization Using Radiogenomics in Artificial Intelligence Framework. (Review)

Authors: Biswajit Jena; Sanjay Saxena; Gopal Krishna Nayak; Antonella Balestrieri; Neha Gupta; Narinder N Khanna; John R Laird; Manudeep K Kalra; Mostafa M Fouda; Luca Saba; Jasjit S Suri
Journal: Cancers (Basel) Date: 2022-08-22 Impact factor: 6.575
