Kaixiang Su, Jiao Wu, Dongxiao Gu, Shanlin Yang, Shuyuan Deng, Aida K Khakimova.
Abstract
Machine learning methods are increasingly applied to aid diagnosis, with good results. However, complex models can confuse physicians because they are difficult to understand, and data differences across diagnostic tasks and institutions can cause model performance to fluctuate. To address these challenges, we combined the Deep Ensemble Model (DEM) with the tree-structured Parzen Estimator (TPE) and proposed an adaptive deep ensemble learning method (TPE-DEM) for dynamically evolving diagnostic task scenarios. Unlike previous research, which focuses on achieving better performance with a fixed-structure model, our proposed model uses TPE to efficiently aggregate simple models that are more easily understood by physicians and require less training data. In addition, based on the data distribution and characteristics of the current diagnostic task, our proposed model can choose the number of layers and the type and number of basic learners that achieve the best performance in that scenario. We tested our model on one dataset constructed with a partner hospital and five UCI public datasets with different characteristics and volumes, covering various diagnostic tasks. Our performance evaluation results show that our proposed model outperforms other baseline models on the different datasets. Our study provides a novel approach to building simple and understandable machine learning models for tasks with variable datasets and feature sets, and the findings have important implications for the application of machine learning models in computer-aided diagnosis.
Keywords: adaptive deep ensemble learning; dynamic evolving diagnosis; intelligent health knowledge discovery; personalized health management
Year: 2021 PMID: 34943525 PMCID: PMC8700766 DOI: 10.3390/diagnostics11122288
Source DB: PubMed Journal: Diagnostics (Basel) ISSN: 2075-4418
Figure 1. Framework of our proposed methods.
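The abstract describes TPE choosing the number of layers and the type and number of basic learners. The following is a minimal sketch of that idea, not the authors' implementation: it runs a simplified categorical TPE over a small search space. The choices in `SPACE`, the function names, and in particular `toy_score` (a hypothetical stand-in for training and cross-validating an ensemble) are assumptions made for illustration.

```python
import random
from collections import Counter

# Search space named in the abstract: number of layers, type and number of
# base learners. The concrete choices below are illustrative assumptions.
SPACE = {
    "n_layers": [1, 2, 3, 4],
    "learner": ["decision_tree", "logistic_regression", "naive_bayes"],
    "n_learners": [2, 4, 6, 8],
}


def toy_score(cfg):
    """Hypothetical stand-in for the cross-validated score of one ensemble."""
    bonus = {"decision_tree": 0.02, "logistic_regression": 0.01, "naive_bayes": 0.0}
    return (0.90
            - 0.03 * abs(cfg["n_layers"] - 2)
            - 0.005 * abs(cfg["n_learners"] - 6)
            + bonus[cfg["learner"]])


def smoothed_density(values, choices, prior=1.0):
    """Categorical Parzen estimate: observed counts plus a uniform prior."""
    counts = Counter(values)
    total = len(values) + prior * len(choices)
    return {c: (counts[c] + prior) / total for c in choices}


def tpe_suggest(history, rng, gamma=0.25, n_candidates=24):
    """Propose the sampled candidate that maximizes l(x) / g(x)."""
    ranked = sorted(history, key=lambda item: item[1], reverse=True)
    n_good = max(1, int(gamma * len(ranked)))
    good, bad = ranked[:n_good], ranked[n_good:]
    # l models the best trials, g models the rest; parameters are independent.
    l_dens = {k: smoothed_density([c[k] for c, _ in good], v) for k, v in SPACE.items()}
    g_dens = {k: smoothed_density([c[k] for c, _ in bad], v) for k, v in SPACE.items()}

    best_cfg, best_ratio = None, float("-inf")
    for _ in range(n_candidates):
        # Draw each parameter from the "good" density l.
        cand = {k: rng.choices(v, weights=[l_dens[k][c] for c in v])[0]
                for k, v in SPACE.items()}
        ratio = 1.0
        for k in SPACE:
            ratio *= l_dens[k][cand[k]] / g_dens[k][cand[k]]
        if ratio > best_ratio:
            best_cfg, best_ratio = cand, ratio
    return best_cfg


def tpe_search(n_startup=8, n_trials=40, seed=0):
    """Random warm-up trials followed by TPE-guided trials."""
    rng = random.Random(seed)
    history = []
    for i in range(n_trials):
        if i < n_startup:
            cfg = {k: rng.choice(v) for k, v in SPACE.items()}
        else:
            cfg = tpe_suggest(history, rng)
        history.append((cfg, toy_score(cfg)))
    return max(history, key=lambda item: item[1]), history


(best_cfg, best_score), history = tpe_search()
print(best_cfg, round(best_score, 4))
```

In a real setting, `toy_score` would be replaced by a function that builds the ensemble described by `cfg` and returns its validation score on the current diagnostic dataset. A maintained TPE implementation would normally be preferable to this hand-written one.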
Overview of the six datasets.
| Dataset Name | Volume | Distribution | Number of Features |
|---|---|---|---|
| Breast Cancer Prediction | 334 | 170 positive and 164 negative | 10 |
| Z-Alizadeh Sani | 303 | 216 positive and 87 negative | 54 |
| Indian Liver Patient | 583 | 416 positive and 167 negative | 10 |
| Breast Cancer Wisconsin | 569 | 212 positive and 357 negative | 32 |
| Cervical Cancer | 858 | 55 positive and 803 negative | 36 |
| Thyroid Disease | 7200 | 6644 positive and 556 negative | 21 |
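The class distributions in this overview differ widely, which affects how the accuracy figures for each dataset should be read. As a small illustration, the sketch below computes the positive fraction and the accuracy of a trivial majority-class predictor from the counts in the table. The names `DATASETS` and `class_balance` are introduced here for illustration only.

```python
# (positive, negative) counts copied from the dataset overview table.
DATASETS = {
    "Breast Cancer Prediction": (170, 164),
    "Z-Alizadeh Sani": (216, 87),
    "Indian Liver Patient": (416, 167),
    "Breast Cancer Wisconsin": (212, 357),
    "Cervical Cancer": (55, 803),
    "Thyroid Disease": (6644, 556),
}


def class_balance(pos, neg):
    """Return (positive fraction, accuracy of always predicting the majority class)."""
    total = pos + neg
    return pos / total, max(pos, neg) / total


for name, (pos, neg) in DATASETS.items():
    pos_frac, baseline = class_balance(pos, neg)
    print(f"{name:26s} positive={pos_frac:7.2%}  majority baseline={baseline:7.2%}")
```

For Cervical Cancer and Thyroid Disease the majority baseline is already about 93.6% and 92.3% respectively, so precision, F-measure, and AUC are likely more informative than accuracy alone on those two datasets.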
Features of Breast Cancer Prediction.
| Attribute | Type | Description of Attribute |
|---|---|---|
| Age | Continuous | Patient’s age |
| Location | Discrete | Location of the patient’s mass |
| Node | Continuous | Number of metastatic lymph nodes |
| Density | Discrete | Density of the patient’s mass |
| Clarity | Discrete | Clarity of the patient’s mass margin |
| Area | Continuous | Area of the patient’s mass |
| Regulation | Discrete | Regulation of the patient’s mass border |
| Surface Smoothness | Discrete | Smoothness of the patient’s mass surface |
| Nipple | Discrete | Whether a woman with breast tumor has nipple discharge |
| Family_History | Discrete | Whether the patient has a family history of breast cancer |
Features of Z-Alizadeh Sani dataset.
| Feature Type | Feature Name | Data Type |
|---|---|---|
| Demographic | Age | Real number |
| | Weight | Real number |
| | Sex | Categorical |
| | Length | Real number |
| | Body mass index | Real number |
| | Diabetes mellitus | Categorical |
| | Hypertension | Categorical |
| | Current smoker | Categorical |
| | Ex-smoker | Categorical |
| | Family history | Categorical |
| | Obesity | Categorical |
| | Chronic renal failure | Categorical |
| | Cerebrovascular accident | Categorical |
| | Airway disease | Categorical |
| | Thyroid disease | Categorical |
| | Congestive heart failure | Categorical |
| | Dyslipidemia | Categorical |
| Symptom and examination | Blood pressure (mm Hg) | Real number |
| | Pulse rate (ppm) | Real number |
| | Edema | Categorical |
| | Weak peripheral pulse | Categorical |
| | Lung rales | Categorical |
| | Systolic murmur | Categorical |
| | Diastolic murmur | Categorical |
| | Typical chest pain | Categorical |
| | Dyspnea | Categorical |
| | Function class | Real number |
| | Atypical | Categorical |
| | Nonanginal chest pain | Categorical |
| | Exertional chest pain | Categorical |
| | Low-threshold angina | Categorical |
| ECG | Rhythm | Categorical |
| | Q wave | Categorical |
| | ST depression | Categorical |
| | T inversion | Categorical |
| | Left ventricular hypertrophy | Categorical |
| | Poor R-wave progression | Categorical |
| Laboratory and echo | Fasting blood sugar (mg/dL) | Real number |
| | Creatine (mg/dL) | Real number |
| | Triglyceride (mg/dL) | Real number |
| | Low-density lipoprotein (mg/dL) | Real number |
| | High-density lipoprotein (mg/dL) | Real number |
| | Blood urea nitrogen (mg/dL) | Real number |
| | Erythrocyte sedimentation rate (mm/h) | Real number |
| | Hemoglobin (g/dL) | Real number |
| | K (mEq/lit) | Real number |
| | Na (mEq/lit) | Real number |
| | White blood cell (cells/mL) | Real number |
| | Lymphocyte (%) | Real number |
| | Neutrophil (%) | Real number |
| | Platelet (1000/mL) | Real number |
| | Ejection fraction (%) | Real number |
| | Region with RWMA | Real number |
| | Valvular heart disease | Categorical |
Results of comparison with classification models (Breast Cancer Prediction dataset).
| Model | Precision | F-Measure | Accuracy | AUC |
|---|---|---|---|---|
| Random Forest | 91.83% | 89.49% | 89.58% | 95.04% |
| AdaBoost | 84.07% * | 83.45% * | 83.30% * | 91.85% * |
| ExtraTrees | 88.73% * | 84.95% * | 85.33% * | 92.80% |
| GBDT | 92.81% | 89.69% | 89.92% | 95.24% |
| TPE-Voting | 87.57% * | 86.52% * | 86.42% * | 94.04% * |
| DEM | 92.79% * | 88.01% * | 88.51% * | 94.93% * |
| TPE-DEM | 95.36% | 90.91% | 91.26% | 96.08% |
* p-values are significant at α = 0.05.
Results of comparison with classification models (Z-Alizadeh Sani dataset).
| Model | Precision | F-Measure | Accuracy | AUC |
|---|---|---|---|---|
| Random Forest | 88.86% | 91.14% | 86.95% | 92.72% |
| AdaBoost | 87.88% * | 88.94% * | 84.07% * | 88.05% * |
| ExtraTrees | 90.88% * | 90.35% * | 86.33% * | 90.83% * |
| GBDT | 90.02% | 91.84% | 88.05% | 92.45% |
| TPE-Voting | 90.05% * | 90.51% * | 86.33% * | 91.55% * |
| DEM | 89.11% * | 90.12% * | 85.73% * | 91.84% * |
| TPE-DEM | 91.03% | 92.76% | 89.43% | 92.99% |
* p-values are significant at α = 0.05.
Results of comparison with classification models (Indian Liver Patient dataset).
| Model | Precision | F-Measure | Accuracy | AUC |
|---|---|---|---|---|
| Random Forest | 87.04% * | 73.84% * | 90.15% | 71.46% * |
| AdaBoost | 78.30% * | 72.08% * | 86.65% * | 71.36% |
| ExtraTrees | 85.40% * | 75.62% | 89.63% | 71.53% |
| GBDT | 85.33% * | 73.71% * | 89.34% | 69.53% |
| TPE-Voting | 85.86% | 74.03% | 90.15% | 73.21% |
| DEM | 82.47% | 73.16% | 85.44% | 73.21% |
| TPE-DEM | 87.11% | 75.48% | 90.44% | 75.22% |
* p-values are significant at α = 0.05.
Results of comparison with classification models (Breast Cancer Wisconsin dataset).
| Model | Precision | F-Measure | Accuracy | AUC |
|---|---|---|---|---|
| Random Forest | 96.18% | 94.67% | 96.14% | 98.11% |
| AdaBoost | 96.30% | 94.77% * | 96.13% * | 98.12% * |
| ExtraTrees | 97.16% | 95.86% | 97.01% | 98.15% |
| GBDT | 95.80% * | 94.52% ** | 95.96% * | 98.33% * |
| TPE-Voting | 94.34% | 92.79% | 94.73% | 98.36% |
| DEM | 97.59% | 95.42% | 97.02% | 98.36% |
| TPE-DEM | 97.63% | 95.90% | 97.35% | 98.38% |
* p-values are significant at α = 0.05. ** p-values are significant at α = 0.01.
Results of comparison with classification models (Cervical Cancer dataset).
| Model | Precision | F-Measure | Accuracy | AUC |
|---|---|---|---|---|
| Random Forest | 71.45% | 59.72% * | 95.46% | 97.00% |
| AdaBoost | 58.95% * | 50.93% ** | 94.18% * | 88.21% * |
| ExtraTrees | 68.67% * | 64.47% * | 95.46% | 95.46% |
| GBDT | 70.50% | 66.36% | 95.81% | 96.01% |
| TPE-Voting | 62.17% * | 59.16% * | 94.87% | 92.90% * |
| DEM | 70.04% * | 64.82% * | 95.34% | 92.90% * |
| TPE-DEM | 76.02% | 67.02% | 95.58% | 97.01% * |
* p-values are significant at α = 0.05. ** p-values are significant at α = 0.01.
Results of comparison with classification models (Thyroid Disease dataset).
| Model | Precision | F-Measure | Accuracy | AUC |
|---|---|---|---|---|
| Random Forest | 99.83% | 99.76% | 99.55% | 98.92% |
| AdaBoost | 99.66% | 99.74% | 99.52% | 98.31% |
| ExtraTrees | 98.11% * | 98.96% | 98.06% * | 98.80% |
| GBDT | 99.83% | 99.76% | 99.55% | 98.92% |
| TPE-Voting | 96.36% * | 97.94% * | 96.13% * | 97.80% * |
| DEM | 98.22% | 98.97% | 98.09% * | 97.80% |
| TPE-DEM | 99.86% | 99.81% | 99.66% | 98.94% |
* p-values are significant at α = 0.05.
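The comparison tables report precision, F-measure, accuracy, and AUC. As a reminder of how these quantities relate, here is a minimal sketch that computes them for binary labels using the standard textbook definitions. The record does not state the exact evaluation protocol (for example, the decision threshold or any averaging over folds), so the helper names and the example data below are assumptions for illustration only.

```python
def confusion_counts(y_true, y_pred):
    """Count true/false positives and negatives for binary 0/1 labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    return tp, fp, fn, tn


def classification_metrics(y_true, y_pred):
    """Precision, recall, F-measure (F1), and accuracy from hard predictions."""
    tp, fp, fn, tn = confusion_counts(y_true, y_pred)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f_measure = 2 * precision * recall / denom if denom else 0.0
    accuracy = (tp + tn) / len(y_true)
    return {"precision": precision, "recall": recall,
            "f_measure": f_measure, "accuracy": accuracy}


def auc(y_true, scores):
    """AUC as the probability that a random positive outranks a random negative.

    Ties count as half a win. Requires at least one positive and one negative.
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Made-up example data, thresholded at 0.5 to obtain hard predictions.
y_true = [1, 1, 1, 0, 0, 0, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.6, 0.3, 0.2, 0.7, 0.1, 0.55, 0.05]
y_pred = [1 if s >= 0.5 else 0 for s in scores]

metrics = classification_metrics(y_true, y_pred)
metrics["auc"] = auc(y_true, scores)
print({name: round(value, 4) for name, value in metrics.items()})
```

Accuracy and precision depend on the chosen threshold, whereas AUC depends only on how the scores rank the cases, which is one reason the two can diverge on imbalanced datasets.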