Sheikh Mohammed Shariful Islam, Ashis Talukder, Md Abdul Awal, Md Muhammad Umer Siddiqui, Md Martuza Ahamad, Benojir Ahammed, Lal B Rawal, Roohallah Alizadehsani, Jemal Abawajy, Liliana Laranjo, Clara K Chow, Ralph Maddison.
Abstract
Background: Hypertension is the most common modifiable risk factor for cardiovascular diseases in South Asia. Machine learning (ML) models have been shown to outperform conventional statistical methods in clinical risk prediction, but studies using ML to predict hypertension at the population level are lacking. This study applied ML approaches to a dataset from three South Asian countries to predict hypertension and its associated factors, and compared the models' performance.
Keywords: Demographic and Health Survey; South Asia; algorithms; artificial intelligence; blood pressure; cardiovascular diseases; risk factors
Year: 2022 PMID: 35433854 PMCID: PMC9008259 DOI: 10.3389/fcvm.2022.839379
Source DB: PubMed Journal: Front Cardiovasc Med ISSN: 2297-055X
Background characteristics of the participants (N = 8,18,603).
| Characteristic | Total, n (column %) | Non-hypertensive, n (row %) | Hypertensive, n (row %) | |
|---|---|---|---|---|
| Age (years) | | | | |
| 15–24 | 2,75,719 (33.68) | 2,67,605 (97.06) | 8,114 (2.94) | |
| 25–34 | 2,36,583 (28.90) | 2,18,453 (92.34) | 18,130 (7.66) | |
| 35–44 | 1,95,200 (23.85) | 1,64,615 (84.33) | 30,585 (15.67) | |
| 45+ | 1,11,102 (13.57) | 85,183 (76.67) | 25,919 (23.33) | |
| Total | 8,18,604 (100.00) | 7,35,856 (89.89) | 82,748 (10.11) | |
| BMI category | | | | |
| Thin | 1,78,284 (21.78) | 1,69,855 (95.27) | 8,429 (4.73) | |
| Normal | 4,79,687 (58.60) | 4,37,626 (91.23) | 42,061 (8.77) | |
| Overweight | 1,22,414 (14.95) | 99,659 (81.41) | 22,755 (18.59) | |
| Obese | 38,218 (4.67) | 28,715 (75.13) | 9,503 (24.86) | |
| Total | 8,18,603 (100.00) | 7,35,855 (89.89) | 82,748 (10.11) | |
| Education | | | | |
| No education | 2,12,323 (25.94) | 1,84,683 (86.98) | 27,640 (13.02) | |
| Primary | 1,12,656 (13.76) | 99,095 (87.96) | 13,561 (12.04) | |
| Secondary | 3,88,283 (47.43) | 3,55,180 (91.47) | 33,103 (8.53) | |
| Higher | 1,05,342 (12.87) | 96,898 (91.98) | 8,444 (8.02) | |
| Total | 8,18,604 (100.00) | 7,35,856 (89.89) | 82,748 (10.11) | |
| Wealth index | | | | |
| Poor | 3,67,661 (44.91) | 3,38,073 (91.95) | 29,588 (8.05) | |
| Middle | 1,62,088 (19.80) | 1,45,454 (89.74) | 16,634 (10.26) | |
| Rich | 2,88,854 (35.29) | 2,52,329 (87.36) | 36,525 (12.64) | |
| Total | 8,18,603 (100.00) | 7,35,856 (89.89) | 82,747 (10.11) | |
| (variable label missing from source) | | | | |
| No | 3,14,579 (39.72) | 2,88,775 (91.80) | 25,804 (8.20) | |
| Yes | 4,77,383 (60.28) | 4,24,845 (88.99) | 52,538 (11.01) | |
| Total | 7,91,962 (100.00) | 7,13,620 (90.11) | 78,342 (9.89) | |
| (variable label missing from source) | | | | |
| No | 7,22,313 (91.21) | 6,53,345 (90.45) | 68,968 (9.55) | |
| Yes | 69,648 (8.79) | 60,275 (86.54) | 9,373 (13.46) | |
| Total | 7,91,961 (100.00) | 7,13,620 (90.11) | 78,341 (9.89) | |
| (variable label missing from source) | | | | |
| No | 7,66,754 (96.82) | 6,92,265 (90.29) | 74,489 (9.71) | |
| Yes | 25,208 (3.18) | 21,356 (84.72) | 3,852 (15.28) | |
| Total | 7,91,962 (100.00) | 7,13,621 (90.11) | 78,341 (9.89) | |
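The percentages in the last column of the table can be reproduced directly from the counts. The following Python sketch recomputes hypertension prevalence by age group, assuming that the last numeric column holds the hypertensive participants and that its percentage is a share of the row total. The variable and function names are illustrative and do not come from the study; counts are written without the lakh-style digit grouping used in the table (2,75,719 becomes 275719).

```python
# Counts copied from the age rows of the table above:
# age group -> (total participants, hypertensive participants).
# Assumption: the last numeric column of the table is the hypertensive group.
age_counts = {
    "15-24": (275_719, 8_114),
    "25-34": (236_583, 18_130),
    "35-44": (195_200, 30_585),
    "45+": (111_102, 25_919),
}


def prevalence_pct(total: int, cases: int) -> float:
    """Return cases as a percentage of total, rounded to two decimals."""
    return round(100 * cases / total, 2)


prevalence = {group: prevalence_pct(t, c) for group, (t, c) in age_counts.items()}

for group, pct in prevalence.items():
    print(f"{group}: {pct:.2f}% hypertensive")

# Crude (unadjusted) ratio of prevalence in the oldest vs the youngest group.
ratio_oldest_to_youngest = prevalence["45+"] / prevalence["15-24"]
print(f"45+ vs 15-24 prevalence ratio: {ratio_oldest_to_youngest:.1f}")
```

The recomputed values match the tabulated 2.94, 7.66, 15.67 and 23.33 percent, and the crude ratio between the oldest and youngest groups is roughly 7.9. This is a descriptive check only and does not adjust for any other characteristic.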
Performance indicators of all selected machine learning algorithms.
| Algorithm | Accuracy | Precision | Recall | F1-score | Log loss |
|---|---|---|---|---|---|
| Random forest | 0.89 | 0.90 | 0.99 | 0.94 | 3.63 |
| Decision tree | 0.83 | 0.91 | 0.90 | 0.90 | 5.92 |
| XGB | 0.90 | 0.90 | 1.00 | 0.95 | 3.52 |
| GBM | 0.90 | 0.90 | 1.00 | 0.95 | 3.33 |
| LR | 0.90 | 0.90 | 1.00 | 0.95 | 3.55 |
| LDA | 0.90 | 0.90 | 1.00 | 0.95 | 3.57 |
XGB, XGBoost; GBM, Gradient Boosting Machine; LR, Logistic Regression; LDA, Linear Discriminant Analysis.
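As a consistency check on the table above, the F1-score is the harmonic mean of precision and recall, F1 = 2PR / (P + R). The sketch below recomputes it for each algorithm, assuming that the third, fourth and fifth columns of the table are precision, recall and F1-score respectively. Because the inputs are already rounded to two decimals, the recomputed value can differ from the reported one in the last digit. The names `reported` and `f1_score` are illustrative, not taken from the study.

```python
# (precision, recall, reported F1) per algorithm, read from the table above.
# Assumption: columns 3-5 of the table are precision, recall, and F1-score.
reported = {
    "Random forest": (0.90, 0.99, 0.94),
    "Decision tree": (0.91, 0.90, 0.90),
    "XGB": (0.90, 1.00, 0.95),
    "GBM": (0.90, 1.00, 0.95),
    "LR": (0.90, 1.00, 0.95),
    "LDA": (0.90, 1.00, 0.95),
}


def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (0.0 if both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


recomputed = {name: f1_score(p, r) for name, (p, r, _) in reported.items()}

for name, (p, r, f1_reported) in reported.items():
    print(f"{name}: recomputed F1 = {recomputed[name]:.3f}, reported = {f1_reported:.2f}")
```

Under that column assignment, every recomputed value agrees with the reported F1-score to within rounding, which supports reading the three columns in this order.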
Figure 1. Violin plot of the 10-fold cross-validation (violin plots represent the entire distribution).
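For readers unfamiliar with 10-fold cross-validation, here is a minimal pure-Python sketch of how records are partitioned: the data are shuffled once, split into ten folds of near-equal size, and each fold serves as the held-out set exactly once. The function name `k_fold_indices`, the fixed seed and the 25-record example are illustrative assumptions; the source does not describe the authors' implementation.

```python
import random
from typing import Iterator, List, Tuple


def k_fold_indices(
    n_samples: int, k: int = 10, seed: int = 0
) -> Iterator[Tuple[List[int], List[int]]]:
    """Yield (train_indices, test_indices) for each of k folds."""
    if not 2 <= k <= n_samples:
        raise ValueError("k must be between 2 and n_samples")
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)  # shuffle once, reproducibly
    # Fold i takes every k-th shuffled index starting at i,
    # so fold sizes differ by at most one.
    folds = [indices[i::k] for i in range(k)]
    for i, test in enumerate(folds):
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, test


# Example with a small hypothetical dataset of 25 records.
splits = list(k_fold_indices(25, k=10))
for fold_number, (train, test) in enumerate(splits, start=1):
    print(f"fold {fold_number}: {len(train)} train, {len(test)} test")
```

In a setup like this, each model would be fitted on `train` and scored on `test` ten times, giving ten scores per model. A violin plot of cross-validation results would typically display the distribution of those per-fold scores.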
Figure 2. Significant features for hypertension in three South Asian countries.