Dengqing Zhang, Yunyi Chen, Yuxuan Chen, Shengyi Ye, Wenyu Cai, Junxue Jiang, Yechuan Xu, Gongfeng Zheng, Ming Chen.
Abstract
Heart disease has become a serious threat to public health in recent decades because of its prevalence and high risk of death. Predicting it at an early stage from simple physical indicators collected during regular physical examinations is therefore valuable. Clinically, sensitivity to these indicators is essential for making predictions and for providing a reliable basis for further diagnosis, but the large amount of data makes manual analysis and prediction arduous. This paper aims to predict heart disease accurately and quickly from such indicators. We propose a heart disease prediction algorithm that combines an embedded feature selection method with a deep neural network (DNN). The feature selection method is based on the LinearSVC algorithm and uses the L1 norm as the penalty term to choose a subset of features significantly associated with heart disease. These features are fed into the DNN, whose weights are initialized with the He initializer to prevent vanishing or exploding gradients and thereby improve the predictor's performance. The model is tested on the heart disease dataset obtained from Kaggle and evaluated by accuracy, recall, precision, and F1-score, achieving 98.56%, 99.35%, 97.84%, and 0.983, respectively. The average AUC of the model reaches 0.983, indicating that the proposed method is efficient and reliable for predicting heart disease.
Year: 2021 PMID: 34630991 PMCID: PMC8494559 DOI: 10.1155/2021/6260022
Source DB: PubMed Journal: J Healthc Eng ISSN: 2040-2295 Impact factor: 2.682
Figure 1. Proposed heart disease prediction system structure.
Description of features.
| SN | Attribute | Description |
|---|---|---|
| 1 | age | Age in years |
| 2 | sex | 1 = male; 0 = female |
| 3 | cp | Chest pain type |
| 4 | trestbps | Resting blood pressure (in mm Hg on admission to the hospital) |
| 5 | chol | Serum cholesterol in mg/dl |
| 6 | fbs | Fasting blood sugar >120 mg/dl (1 = true; 0 = false) |
| 7 | restecg | Resting electrocardiographic results |
| 8 | thalach | Maximum heart rate achieved |
| 9 | exang | Exercise-induced angina (1 = yes; 0 = no) |
| 10 | oldpeak | ST depression induced by exercise relative to rest |
| 11 | slope | The slope of the peak exercise ST segment |
| 12 | ca | Number of major vessels (0–3) colored by fluoroscopy |
| 13 | thal | 1 = normal; 2 = fixed defect; 3 = reversible defect |
| 14 | target | 1 = disease; 0 = no disease |
Figure 2. Boxplots before and after outlier removal using IQR. (a) Raw data. (b) Results of outlier removal.
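The IQR-based outlier removal illustrated in Figure 2 can be sketched as follows. This is a minimal illustration rather than the authors' code: the 1.5 × IQR fence multiplier, the function name `iqr_filter`, and the sample cholesterol readings are all assumptions made for the example.

```python
import statistics

def iqr_filter(values, k=1.5):
    """Keep values inside [Q1 - k*IQR, Q3 + k*IQR].

    k=1.5 is the conventional fence multiplier (an assumption here;
    the source only states that IQR is used).
    """
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if low <= v <= high]

# Hypothetical serum cholesterol readings (mg/dl) with one extreme value.
chol = [204, 212, 233, 236, 250, 263, 269, 564]
cleaned = iqr_filter(chol)
print(cleaned)
```

In this sketch the extreme reading falls above the upper fence and is dropped, while the remaining values are kept.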
Figure 3. The structure of our deep neural network.
Results of the proposed method.
| Metric | Class 0 (%) | Class 1 (%) |
|---|---|---|
| Accuracy | 98.56 | 98.56 |
| Recall | 97.84 | 99.35 |
| Precision | 99.35 | 97.84 |
Figure 4. Confusion matrix on test data.
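The evaluation indicators named in the abstract (accuracy, recall, precision, and F1-score) all derive from the four cells of a binary confusion matrix. The sketch below shows the standard definitions; the function name `classification_metrics` and the counts passed to it are hypothetical and are not the values behind Figure 4.

```python
def classification_metrics(tp, fp, fn, tn):
    """Return accuracy, recall, precision and F1 for the positive class."""
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    recall = tp / (tp + fn)        # share of actual positives that were found
    precision = tp / (tp + fp)     # share of predicted positives that are correct
    f1 = 2 * precision * recall / (precision + recall)  # harmonic mean
    return {
        "accuracy": accuracy,
        "recall": recall,
        "precision": precision,
        "f1": f1,
    }

# Hypothetical counts for illustration only.
m = classification_metrics(tp=90, fp=5, fn=10, tn=95)
for name, value in m.items():
    print(f"{name}: {value:.4f}")
```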
Figure 5. ROC curve and AUC of the proposed algorithm.
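The AUC reported with the ROC curve can be read as the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative case. The sketch below computes it directly from that pairwise definition; the function name `roc_auc` and the labels and scores are hypothetical, and the quadratic loop is chosen for clarity rather than speed.

```python
def roc_auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    A tie between a positive and a negative score counts as half a pair.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Hypothetical labels (1 = disease) and predicted probabilities.
labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.5, 0.3, 0.1]
auc = roc_auc(labels, scores)
print(round(auc, 3))
```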
Importance value of features.
| SN | Attribute | Importance value |
|---|---|---|
| 1 | age | −0.00144 |
| 2 | sex | −0.56535 |
| 3 | cp | 0.288000 |
| 4 | trestbps | −0.005696 |
| 5 | chol | −0.001637 |
| 6 | fbs | 0.0 |
| 7 | restecg | 0.1190487 |
| 8 | thalach | 0.0079078 |
| 9 | exang | −0.332790 |
| 10 | oldpeak | −0.199450 |
| 11 | slope | 0.1589541 |
| 12 | ca | −0.252994 |
| 13 | thal | −0.2927249 |
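
An L1 penalty tends to drive the weights of uninformative features to exactly zero, so an embedded selector can keep only the features with nonzero weight. The sketch below applies that rule to the values in the table above, assuming they are the fitted LinearSVC coefficients (the source does not say so explicitly). The function name `select_features` is hypothetical, and the `1e-5` threshold is an assumption that mirrors the default scikit-learn's `SelectFromModel` uses for L1-penalized estimators.

```python
# Importance values copied from the table above.
importance = {
    "age": -0.00144, "sex": -0.56535, "cp": 0.288000,
    "trestbps": -0.005696, "chol": -0.001637, "fbs": 0.0,
    "restecg": 0.1190487, "thalach": 0.0079078, "exang": -0.332790,
    "oldpeak": -0.199450, "slope": 0.1589541, "ca": -0.252994,
    "thal": -0.2927249,
}

def select_features(weights, threshold=1e-5):
    """Keep features whose absolute weight exceeds the threshold."""
    return [name for name, w in weights.items() if abs(w) > threshold]

selected = select_features(importance)
dropped = [name for name in importance if name not in selected]

# Rank the kept features by absolute weight, strongest first.
ranked = sorted(selected, key=lambda name: abs(importance[name]), reverse=True)

print("dropped:", dropped)
print("strongest:", ranked[:3])
```

Under these assumptions only `fbs` is discarded, and `sex`, `exang`, and `thal` carry the largest absolute weights.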
Figure 6. Results of different initializers.
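He initialization draws each weight from a zero-mean distribution with variance 2 / fan_in, which keeps the scale of activations roughly stable across ReLU layers. The sketch below shows the Gaussian variant in plain Python. The function name `he_normal`, the seed, and the layer sizes are hypothetical; the network in the source may use different widths or a framework-provided initializer.

```python
import math
import random

def he_normal(fan_in, fan_out, seed=0):
    """Sample a fan_out x fan_in weight matrix from N(0, 2 / fan_in)."""
    rng = random.Random(seed)
    std = math.sqrt(2.0 / fan_in)
    return [[rng.gauss(0.0, std) for _ in range(fan_in)]
            for _ in range(fan_out)]

# Hypothetical layer: 13 inputs feeding 64 hidden units.
weights = he_normal(fan_in=13, fan_out=64)

# The empirical variance should be close to the target 2 / fan_in.
flat = [w for row in weights for w in row]
empirical_var = sum(w * w for w in flat) / len(flat)
print(round(empirical_var, 3), round(2 / 13, 3))
```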
Figure 7. Comparison of results with and without the batch normalization layer.
Comparison of classification performance of the proposed method with others.
| Authors | Methods | Accuracy (%) | Recall (%) | Precision (%) |
|---|---|---|---|---|
| Ramprakash et al. [ | | 94.0 | 93.00 | — |
| Gao et al. [ | Bagging ensemble method with decision tree | 98.6 | 99.0 | 97.8 |
| Gao et al. [ | PCA + decision tree | 99.0 | 97.0 | 98.0 |
| Ali et al. [ | MLP | 97.95 | 98 | 98 |
| Proposed | LinearSVC + DNN | 98.56 | 99.35 | 97.84 |