Soo See Chai, Whye Lian Cheah, Kok Luong Goh, Yee Hui Robin Chang, Kwan Yong Sim, Kim On Chin.
Abstract
This study outlines and develops a multilayer perceptron (MLP) neural network model for adolescent hypertension classification, focusing on the use of simple anthropometric and sociodemographic data collected in a cross-sectional research study in Sarawak, Malaysia. Of the 2,461 records collected, 741 were hypertensive (30.1%) and 1,720 were normal (69.9%). During the data gathering process, eleven anthropometric measurements and sociodemographic data were collected. The variable selection procedure in the proposed methodology selected five parameters as inputs to the network model: weight, waist-to-height ratio (WHtR), age, sex, and ethnicity. The developed MLP model, with a single hidden layer of 50 neurons, achieved a sensitivity of 0.41, specificity of 0.91, precision of 0.65, F-score of 0.50, accuracy of 0.76, and Area Under the Receiver Operating Characteristic (ROC) Curve (AUC) of 0.75 using the imbalanced data set. Analysis of the performance metrics obtained from the training, validation, and testing data sets shows that the developed network model generalizes well. By Bayes' Theorem, an adolescent classified as hypertensive by this model has a 66.2% likelihood of having hypertension in the Sarawak adolescent population, which has a hypertension prevalence of 30.1%. If the prevalence of hypertension were 50%, a positive classification would correspond to an 82.0% chance of hypertension, whereas at a prevalence of 10% it would correspond to only a 33.6% chance. If the sensitivity of the model were increased to 65% or 90% while retaining a specificity of 91%, the probability that an adolescent classified as hypertensive is truly hypertensive would be 75.7% or 81.2%, respectively, according to Bayes' Theorem.
The findings show that simple anthropometric measurements paired with sociodemographic data can feasibly be used to classify hypertension in adolescents with the developed MLP model in the Sarawak adolescent population, which has a modest hypertension prevalence. However, a model with higher sensitivity and specificity is required for a better positive predictive value when the prevalence is low. We conclude that the developed classification model could serve as a quick and easy preliminary warning tool for screening adolescents at high risk of developing hypertension.
Year: 2021 PMID: 34917164 PMCID: PMC8670914 DOI: 10.1155/2021/2794888
Source DB: PubMed Journal: Comput Math Methods Med ISSN: 1748-670X Impact factor: 2.238
Anthropometric data of the participants.
| | Male (n = 1033) | | | | Female (n = 1428) | | | |
|---|---|---|---|---|---|---|---|---|
| | Min | Max | Mean | Std | Min | Max | Mean | Std |
| Weight (kg) | 24.4 | 121.8 | 55.5 | 14.78 | 21.2 | 109.4 | 51.0 | 12.80 |
| Height (m) | 1.3 | 1.8 | 1.6 | 0.08 | 1.24 | 1.78 | 1.5 | 0.06 |
| BMI (kg/m²) | 13.3 | 43.1 | 21.3 | 4.72 | 13.1 | 43.5 | 21.6 | 4.78 |
| WC (cm) | 51.5 | 125.0 | 71.3 | 11.56 | 50.0 | 655.0 | 70.2 | 18.46 |
| WHtR | 0.3 | 0.7 | 0.4 | 0.07 | 0.3 | 4.1 | 0.5 | 0.11 |
| | 0.8 | 1.4 | 1.1 | 0.07 | 0.8 | 11.0 | 1.1 | 0.27 |
Blood pressure profile of the participants.
| Blood pressure | Male (n = 1033) | | Female (n = 1428) | |
|---|---|---|---|---|
| | n | % | n | % |
| Prehypertension | 199 | 19.3 | 125 | 8.8 |
| Hypertension | 232 | 22.5 | 185 | 13.0 |
| Normal | 602 | 58.3 | 1118 | 78.3 |
Figure 1. Overall classification procedures implemented in this study.
Distribution of data using a stratified sampling method according to the ratio of hypertensive and normal groups.
| Partition | Normal | | Hypertensive | | Total (n = 2461) | |
|---|---|---|---|---|---|---|
| | n | % | n | % | n | % |
| Training | 1032 | 69.9 | 445 | 30.1 | 1477 | 60.0 |
| Validation | 516 | 69.9 | 222 | 30.1 | 738 | 30.0 |
| Testing | 172 | 69.9 | 74 | 30.1 | 246 | 10.0 |
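The 60/30/10 stratified partition in the table above can be illustrated with a minimal sketch. It is not the authors' procedure: the function name `stratified_split`, the seed, and the rule of rounding each class share and giving the remainder to the last partition are all assumptions made for illustration. Only the class totals (1,720 normal, 741 hypertensive) come from the source.

```python
import random


def stratified_split(labels, fractions, seed=0):
    """Split indices into partitions, preserving class proportions in each.

    `fractions` gives the share of every partition except the last, which
    receives whatever remains so that no record is dropped.
    """
    rng = random.Random(seed)
    by_class = {}
    for idx, label in enumerate(labels):
        by_class.setdefault(label, []).append(idx)

    partitions = [[] for _ in range(len(fractions) + 1)]
    for indices in by_class.values():
        rng.shuffle(indices)
        start = 0
        for part, frac in zip(partitions, fractions):
            size = round(len(indices) * frac)  # assumed rounding rule
            part.extend(indices[start:start + size])
            start += size
        partitions[-1].extend(indices[start:])
    return partitions


# Class totals from the source; the records themselves are placeholders.
labels = ["normal"] * 1720 + ["hypertensive"] * 741
train, valid, test = stratified_split(labels, fractions=[0.6, 0.3])

for name, part in [("training", train), ("validation", valid), ("testing", test)]:
    n_hyp = sum(labels[i] == "hypertensive" for i in part)
    print(name, len(part) - n_hyp, n_hyp, len(part))
```

Under these assumptions the partition sizes come out the same as in the table (1477, 738, and 246 records, with 445, 222, and 74 hypertensive cases), although which individual records land in each partition depends on the seed.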
Proportion of variance explained for the five selected parameters through the Fast Supervised Selection method.
| Parameter | Proportion of variance explained |
|---|---|
| Weight | 0.2314 |
| Sex | 0.2540 |
| Ethnicity | 0.2614 |
| WHtR | 0.2657 |
| Age | 0.2682 |
Figure 2. Multilayer perceptron model developed in this research study.
Multilayer perceptron parameter settings.
| Parameter | Value |
|---|---|
| Input dimension | 4 |
| Number of output classes | 2 |
| Number of hidden layers | 1 |
| Hidden layer dimension | 50 |
| Hidden layer activation function | tanh |
| Momentum | 0 |
| Learning rate | 0.0010 |
| Optimization method | LBFGS (Limited-Memory Broyden Fletcher Goldfarb Shanno) |
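To make the architecture in the table concrete, the following is a minimal sketch of a forward pass through a network of that shape: 4 inputs, one hidden layer of 50 tanh units, and 2 output classes. Several things are assumptions rather than details from the source: the softmax output layer, the random untrained weights, the helper names, and the placeholder input vector. Training with LBFGS is omitted entirely.

```python
import math
import random

# Architecture values taken from the parameter table above.
N_INPUTS, N_HIDDEN, N_CLASSES = 4, 50, 2


def init_layer(n_in, n_out, rng):
    """Return (weights, biases) with small random weights (illustrative only)."""
    weights = [[rng.uniform(-0.1, 0.1) for _ in range(n_in)] for _ in range(n_out)]
    biases = [0.0] * n_out
    return weights, biases


def dense(x, weights, biases):
    """Affine map: one output per row of `weights`."""
    return [sum(w * xi for w, xi in zip(row, x)) + b for row, b in zip(weights, biases)]


def softmax(z):
    """Numerically stable softmax (assumed output activation)."""
    m = max(z)
    exps = [math.exp(v - m) for v in z]
    total = sum(exps)
    return [e / total for e in exps]


def forward(x, hidden, output):
    """Forward pass: input -> tanh hidden layer -> softmax class probabilities."""
    h = [math.tanh(v) for v in dense(x, *hidden)]
    return softmax(dense(h, *output))


rng = random.Random(0)
hidden = init_layer(N_INPUTS, N_HIDDEN, rng)
output = init_layer(N_HIDDEN, N_CLASSES, rng)

# Four placeholder feature values; real inputs would be the preprocessed predictors.
probs = forward([0.5, 1.0, -0.3, 0.0], hidden, output)
print(probs)
```

Because the weights are random, the printed probabilities carry no clinical meaning; the sketch only shows how the listed dimensions and activation function fit together.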
Confusion matrix obtained using training data.
| Prediction | Actual: Hypertensive | Actual: Normal |
|---|---|---|
| Hypertensive | 204 | 66 |
| Normal | 241 | 966 |
Confusion matrix obtained using validation data.
| Prediction | Actual: Hypertensive | Actual: Normal |
|---|---|---|
| Hypertensive | 99 | 44 |
| Normal | 123 | 472 |
Confusion matrix obtained using testing data.
| Prediction | Actual: Hypertensive | Actual: Normal |
|---|---|---|
| Hypertensive | 30 | 16 |
| Normal | 44 | 156 |
Classification results obtained for training, validation, and testing data sets.
| Performance metrics | Training | Validation | Testing |
|---|---|---|---|
| Sensitivity | 0.46 | 0.45 | 0.41 |
| Specificity | 0.94 | 0.91 | 0.91 |
| Precision | 0.76 | 0.69 | 0.65 |
| F-score | 0.57 | 0.54 | 0.50 |
| Accuracy | 0.79 | 0.77 | 0.76 |
| Misclassification rate | 0.21 | 0.23 | 0.24 |
| AUC | 0.82 | 0.79 | 0.75 |
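All of the metrics in this table except AUC follow directly from the three confusion matrices. The sketch below recomputes them from the reported counts, treating "hypertensive" as the positive class. The function and variable names are illustrative. AUC is left out because it depends on the model's continuous scores, which a confusion matrix does not retain.

```python
def classification_metrics(tp, fp, fn, tn):
    """Compute standard binary metrics with 'hypertensive' as the positive class."""
    total = tp + fp + fn + tn
    sensitivity = tp / (tp + fn)   # true positive rate (recall)
    specificity = tn / (tn + fp)   # true negative rate
    precision = tp / (tp + fp)     # positive predictive value on this sample
    f_score = 2 * precision * sensitivity / (precision + sensitivity)
    accuracy = (tp + tn) / total
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "precision": precision,
        "f_score": f_score,
        "accuracy": accuracy,
        "misclassification": 1 - accuracy,
    }


# (TP, FP, FN, TN) read from the confusion matrices reported above.
confusion = {
    "training": (204, 66, 241, 966),
    "validation": (99, 44, 123, 472),
    "testing": (30, 16, 44, 156),
}

results = {name: classification_metrics(*counts) for name, counts in confusion.items()}
for name, metrics in results.items():
    print(name, {key: round(value, 2) for key, value in metrics.items()})
```

Rounded to two decimals, the recomputed values agree with the table for all three partitions.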
Figure 3. ROC of the training data set.
Figure 4. ROC of the validation data.
Figure 5. ROC of the testing data.
Variable importance.
| Variable | Variable importance | Relative variable importance |
|---|---|---|
| PC1 | 246.67 | 1.00 |
| Sex | 37.88 | 0.15 |
| Age | 33.02 | 0.13 |
| Ethnicity | 32.44 | 0.13 |
Performance metrics comparison.
| Model | Sensitivity | Specificity | Precision | F-score | Accuracy | AUC |
|---|---|---|---|---|---|---|
| Our model | 0.41 | 0.91 | 0.65 | 0.50 | 0.76 | 0.75 |
| López-Martínez et al. | 0.40 | 0.87 | 0.58 | 0.47 | 0.73 | 0.77 |
| Bani-Salameh et al. | 0.69 | - | 0.68 | 0.68 | 0.68 | 0.62 |
| Sakr et al. | 0.31 | 0.88 | 0.57 | 0.39 | - | 0.67 |
Model reliability testing using Bayes' Theorem for different prevalence and sensitivity levels. The model reliability on current hypertension prevalence in Sarawak adolescents is highlighted.
| Prevalence | Sensitivity | Specificity | TTP | TFP | TP | P(positive test) | Positive predictive value |
|---|---|---|---|---|---|---|---|
| 10% | 41% | 91% | 8205 | 16210 | 24415 | 0.1220 | 33.6% |
| 50% | 41% | 91% | 41026 | 9005 | 50031 | 0.2500 | 82.0% |
| 30.1% | 65% | 91% | 39155 | 12590 | 51745 | 0.2586 | 75.7% |
| 30.1% | 90% | 91% | 54215 | 12590 | 66805 | 0.3338 | 81.2% |
| **30.1%** | **41%** | **91%** | **24698** | **12590** | **37288** | **0.1863** | **66.2%** |
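The last two columns of this table can be reproduced from prevalence, sensitivity, and specificity alone by applying Bayes' Theorem. The sketch below does so for the five scenarios listed. The function names are illustrative and it works with probabilities directly, without the population counts in the TTP, TFP, and TP columns.

```python
def positive_rate(prevalence, sensitivity, specificity):
    """P(test positive): true positives plus false positives, as probabilities."""
    return sensitivity * prevalence + (1 - specificity) * (1 - prevalence)


def positive_predictive_value(prevalence, sensitivity, specificity):
    """P(hypertensive | test positive) by Bayes' Theorem."""
    return sensitivity * prevalence / positive_rate(prevalence, sensitivity, specificity)


# (prevalence, sensitivity, specificity) for each row of the table above.
scenarios = [
    (0.10, 0.41, 0.91),
    (0.50, 0.41, 0.91),
    (0.301, 0.65, 0.91),
    (0.301, 0.90, 0.91),
    (0.301, 0.41, 0.91),
]

rows = []
for prev, sens, spec in scenarios:
    p_pos = positive_rate(prev, sens, spec)
    ppv = positive_predictive_value(prev, sens, spec)
    rows.append((round(p_pos, 4), round(ppv * 100, 1)))
    print(f"prevalence={prev:.1%} sensitivity={sens:.0%} "
          f"P(positive)={p_pos:.4f} PPV={ppv:.1%}")
```

The strong dependence on prevalence is visible in the first two scenarios: with sensitivity and specificity held fixed, moving from 10% to 50% prevalence changes the positive predictive value from about one in three to about four in five.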
| | n | % |
|---|---|---|
| Sex | | |
| Male (M) | 1033 | 42.0 |
| Female (F) | 1428 | 58.0 |
| Ethnicity | | |
| Iban | 737 | 29.9 |
| Malay | 681 | 27.7 |
| Chinese | 475 | 19.3 |
| Bidayuh | 256 | 10.4 |
| Other | 312 | 12.7 |
| | | |
| Urban | 634 | 25.8 |
| Rural | 1827 | 74.2 |
| | | |
| One of the parents | 448 | 18.2 |
| Both parents | 80 | 3.3 |
| No | 1933 | 78.5 |

| Age | Min | Max | Mean | Standard deviation |
|---|---|---|---|---|
| Male (M) | 12 | 17 | 14.4 | 1.48 |
| Female (F) | 12 | 17 | 14.5 | 1.51 |