Jiewu Leng, Dewen Wang, Xin Ma, Pengjiu Yu, Li Wei, Wenge Chen.
Abstract
OBJECTIVE: The high incidence of respiratory diseases dramatically increased the medical burden during the COVID-19 pandemic in 2020. It is therefore of considerable significance to use a new generation of information technology to raise the level of artificial intelligence in respiratory disease diagnosis.
Keywords: Acute respiratory diseases; Artificial intelligence; Chinese clinical named entity recognition; Deep learning; Risk classification
Year: 2022 PMID: 35221528 PMCID: PMC8861621 DOI: 10.1007/s10489-022-03222-y
Source DB: PubMed Journal: Appl Intell (Dordr) ISSN: 0924-669X Impact factor: 5.019
Artificial intelligence models for disease diagnosis and prediction
| Function | Theme | Disease type | Artificial intelligence models | Data Type | Ref. |
|---|---|---|---|---|---|
| Information extraction | Extract valuable information | Myocardial infarction | Gaussian naïve Bayes-based active balancing mechanism | Imbalanced electrocardiogram data | [ |
| Information extraction | Medical data processing | Beta-lactam allergy | Fast incremental decision tree | EMRs | [ |
| Classification & diagnosis | Classification of chronic diseases | Chronic diseases | Hybrid deep learning | EMRs | [ |
| Classification & diagnosis | Classification of cardiac disorder | Cardiac disorder | Adaptive neuro-fuzzy inference system | Electrocardiogram signals | [ |
| Classification & diagnosis | Detect COVID-19 disease | COVID-19 disease | CNN | Chest X-ray images | [ |
| Classification & diagnosis | Infer illness and predict outcomes | Diabetes and mental health | LSTM | EMRs | [ |
| Classification & diagnosis | Brain disease prognosis | Brain disease | Weakly-supervised CNN | MRI and clinical scores | [ |
| Tendency judgment | Infection rates of COVID-19 | COVID-19 | LSTM | EMRs | [ |
| Tendency judgment | Rehabilitation progress | Rehabilitation | CNN | Movement data | [ |
| Tendency judgment | Dynamic changes in disease | Congenital heart disease | Bayesian classification | Cardiopathy data | [ |
| Tendency judgment | Transcriptional effects of mutations | Mutations | Hybrid deep learning | DNA sequence | [ |
| Tendency judgment | Mortality detection in ICU | Unspecified | Deep learning and rule-based reasoning | EMRs | [ |
| Occurrence prediction | Cognitive impairment conversion prediction | Dementia | Hybrid CNN | MRI | [ |
| Occurrence prediction | Predict postoperative morbidity | Heart disease | Ensemble model | EMRs | [ |
| Occurrence prediction | Predict the occurrence of a disease | Multicategory-multifactorial disease | Generalized artificial intelligence strategy | EMRs | [ |
| Risk prediction | Predict the risk level of the disease | Multivariate disease | Deep learning model | EMRs | [ |
| Risk prediction | Stratify the clinical risks of acute coronary syndrome | Acute coronary syndrome | Regularized stacked denoising auto-encoder model | EMRs | [ |
| Risk prediction | Disease risk prediction | Unspecified | Multimodal data-based recurrent CNN | Semi-structured EMRs | [ |
| Risk prediction | Multiple disease risk prediction | Multiple diseases | Directed disease network and recommendation system | EMRs | [ |
Fig. 1 The bi-level workflow of the risk classification of acute respiratory diseases
Fig. 2 Data modeling and encoding of the unstructured text data in the CEMRs
Fig. 3 The network structure of the “BiLSTM+Dilated Convolution+3D Attention+CRF” model for CCNER
Notations used in the four-layer BiLSTM module
| Notations | Implications |
|---|---|
| | The state of the current BiLSTM cell |
| | The new information getting into the BiLSTM cell state |
| | The Sigmoid function |
| | The forgotten information in the BiLSTM cell state |
| | The input of the current BiLSTM cell |
| | The bias in three gates of the current BiLSTM cell |
| | The weights in three gates of the current BiLSTM cell |
| | The output information in the BiLSTM cell state |
| | The output of the previous and current BiLSTM cell |
| | Two vectors (forward and backward) to form the hidden state of a BiLSTM network |
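
For orientation, the descriptions in this table match the quantities of a standard LSTM cell. The textbook gate equations are sketched below. The symbols ($f_t$, $i_t$, $\tilde{C}_t$, $C_t$, $o_t$, $h_t$, $W$, $b$, $\sigma$) are the conventional ones and are an assumption here, since the record does not show the authors' own notation.

```latex
\begin{align}
f_t &= \sigma\left(W_f \, [h_{t-1}, x_t] + b_f\right) && \text{(forget gate: information to discard)} \\
i_t &= \sigma\left(W_i \, [h_{t-1}, x_t] + b_i\right) && \text{(input gate)} \\
\tilde{C}_t &= \tanh\left(W_C \, [h_{t-1}, x_t] + b_C\right) && \text{(candidate new information)} \\
C_t &= f_t \odot C_{t-1} + i_t \odot \tilde{C}_t && \text{(updated cell state)} \\
o_t &= \sigma\left(W_o \, [h_{t-1}, x_t] + b_o\right) && \text{(output gate)} \\
h_t &= o_t \odot \tanh(C_t) && \text{(cell output)} \\
h_t^{\mathrm{Bi}} &= \left[\overrightarrow{h_t} \, ; \, \overleftarrow{h_t}\right] && \text{(forward and backward states concatenated)}
\end{align}
```

In a bidirectional network the same recurrence is run once left to right and once right to left, and the two hidden vectors are concatenated at each position.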
Fig. 4 The computation logic of the self-attention mechanism
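
As a rough illustration of what a self-attention computation generally involves, the sketch below implements standard scaled dot-product self-attention in plain Python. It is not the paper's 3D attention module, whose details are not given in this record; the function names, the toy input, and the identity projection matrices are all hypothetical.

```python
import math


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def transpose(m):
    """Swap rows and columns of a matrix."""
    return [list(col) for col in zip(*m)]


def softmax(row):
    """Numerically stable softmax over one list of scores."""
    peak = max(row)
    exps = [math.exp(v - peak) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(x, w_q, w_k, w_v):
    """Scaled dot-product self-attention over a sequence x (one row per token).

    Returns the attended outputs and the attention weight matrix.
    """
    q = matmul(x, w_q)  # queries
    k = matmul(x, w_k)  # keys
    v = matmul(x, w_v)  # values
    d_k = len(k[0])
    scores = matmul(q, transpose(k))
    weights = [softmax([s / math.sqrt(d_k) for s in row]) for row in scores]
    return matmul(weights, v), weights


# Toy example (assumed data): two tokens with 2-dimensional embeddings.
tokens = [[1.0, 0.0], [0.0, 1.0]]
identity = [[1.0, 0.0], [0.0, 1.0]]
outputs, attn = self_attention(tokens, identity, identity, identity)
```

Each row of `attn` sums to 1 and says how strongly one token attends to every other token; `outputs` is the corresponding weighted mix of the value vectors.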
Fig. 6 Definition of risk level for early warning of acute respiratory disease
Fig. 5 The data structure of training samples in the risk classification model
The implementation detail of the customized XGBoost model
Fig. 7 Screenshots and training samples in the China Hospital Pharmacovigilance System
Fig. 8 Recognized entities from unstructured CEMR and the model response time
Comparison of experimental CCNER models
| Models | Precision | Recall | F1 |
|---|---|---|---|
| BBLC | 88.40 | 86.71 | 87.55 |
| BLDAC | 81.45 | 79.60 | 80.51 |
| BLDAC+transfer learning | 90.20 | 87.42 | 88.78 |
| BLDAC+transfer learning+word vector+character vector | 92.42 | 88.55 | 90.44 |
| HDL (BLDAC +transfer learning+word vector+character vector+semi-supervised learning) | 94.12 | 90.38 | 92.21 |
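
The F1 column is the harmonic mean of precision and recall, F1 = 2PR / (P + R). The short sketch below recomputes it from the precision and recall values reported in this table as a consistency check. The function and variable names are illustrative, and the 0.05 tolerance is an assumption chosen to absorb rounding in the reported figures.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall (same units as the inputs)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (precision, recall, reported F1) copied from the CCNER model comparison table.
REPORTED = {
    "BBLC": (88.40, 86.71, 87.55),
    "BLDAC": (81.45, 79.60, 80.51),
    "BLDAC+transfer learning": (90.20, 87.42, 88.78),
    "BLDAC+transfer learning+word vector+character vector": (92.42, 88.55, 90.44),
    "HDL": (94.12, 90.38, 92.21),
}

TOLERANCE = 0.05  # assumed allowance for rounding in the published values

for model, (p, r, reported_f1) in REPORTED.items():
    recomputed = f1_score(p, r)
    status = "ok" if abs(recomputed - reported_f1) < TOLERANCE else "mismatch"
    print(f"{model}: recomputed {recomputed:.2f}, reported {reported_f1:.2f} ({status})")
```

The same check can be applied to the per-entity and per-disease tables that follow by swapping in their rows.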
CCNER results for different entity types
| Entity Type | Precision | Recall | F1 |
|---|---|---|---|
| Examination | 93.40 | 91.30 | 92.34 |
| Symptoms | 92.50 | 92.72 | 92.61 |
| Disease | 93.25 | 85.97 | 89.46 |
| Anatomy | 95.88 | 90.45 | 93.1 |
| Treatment | 95.57 | 91.46 | 93.47 |
Identification results of three types of acute respiratory diseases
| Disease Type | Precision | Recall | F1 |
|---|---|---|---|
| Lung cancer | 94.30 | 87.10 | 90.54 |
| Severe pneumonia | 92.50 | 85.14 | 88.67 |
| Severe asthma | 92.95 | 85.69 | 89.17 |
Performance comparison of different model combinations
| Type | Methods | Error rate | AUC | F1 |
|---|---|---|---|---|
| BBLC-based | Logistic Regression | 0.3563 | 0.7005 | 0.7465 |
| BBLC-based | Support Vector Machine | 0.2706 | 0.7292 | 0.7568 |
| BBLC-based | Random Forest | 0.2419 | 0.7534 | 0.7815 |
| BBLC-based | Customized XGBoost | 0.2131 | 0.7992 | 0.8075 |
| HDL-based | Logistic Regression | 0.2546 | 0.7378 | 0.7863 |
| HDL-based | Support Vector Machine | 0.2234 | 0.7681 | 0.8077 |
| HDL-based | Random Forest | 0.1623 | 0.8357 | 0.8548 |
| HDL-based | Customized XGBoost | 0.1012 | 0.8639 | 0.8927 |
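
For readers unfamiliar with the metrics in this table, the sketch below shows how an error rate and an AUC are commonly computed for a binary classifier. It is a generic illustration, not the authors' evaluation code: the record does not say how the risk levels were binarized or averaged, and the labels and scores used here are made-up toy data.

```python
def error_rate(y_true, y_pred):
    """Fraction of samples whose predicted label differs from the true label."""
    return sum(t != p for t, p in zip(y_true, y_pred)) / len(y_true)


def roc_auc(y_true, scores):
    """AUC as the probability that a random positive outranks a random negative.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative sample")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy data (assumed): 1 = high risk, 0 = low risk.
labels = [0, 0, 1, 1]
risk_scores = [0.1, 0.4, 0.35, 0.8]
predictions = [1 if s >= 0.5 else 0 for s in risk_scores]  # 0.5 threshold is an assumption

toy_error = error_rate(labels, predictions)
toy_auc = roc_auc(labels, risk_scores)
```

A lower error rate and a higher AUC both indicate better separation of risk levels, which is the direction of improvement the table shows from the BBLC-based to the HDL-based pipelines.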
Risk classification in different disease subgroups
| Disease Type | Error rate | AUC | F1 |
|---|---|---|---|
| Lung cancer | 0.1011 | 0.8576 | 0.8940 |
| Severe pneumonia | 0.1028 | 0.8380 | 0.8597 |
| Severe asthma | 0.1041 | 0.8058 | 0.8315 |