Michael T Mapundu, Chodziwadziwa W Kabudula, Eustasius Musenge, Victor Olago, Turgay Celik.
Abstract
Computer Coded Verbal Autopsy (CCVA) algorithms are commonly used to determine the cause of death (CoD) from questionnaire responses extracted from verbal autopsies (VAs). However, they can only operate on structured data and cannot effectively harness information from unstructured VA narratives. Machine Learning (ML) algorithms have also been applied successfully to determine the CoD from VA narratives, allowing the use of auxiliary information that CCVA algorithms cannot directly utilize. However, most ML-based studies use only responses from the structured questionnaire, and their results lack generalisability and comparability across studies. We present a comparative performance evaluation of ML methods and CCVA algorithms on South African VA narratives data from the Agincourt Health and Demographic Surveillance Site (HDSS), with physicians' classifications as the gold standard. The data were collected from 1993 to 2015 and comprise 16,338 cases. The random forest and extreme gradient boosting classifiers outperformed the other classifiers on the combined dataset, each attaining an accuracy of 96%, with statistically significant differences in algorithmic performance (p < 0.0001). All our models attained an Area Under Receiver Operating Characteristics (AUROC) greater than 0.884. The InterVA CCVA attained 83% Cause Specific Mortality Fraction accuracy and an Overall Chance-Corrected Concordance of 0.36. We demonstrate that ML models could accurately determine the cause of death from VA narratives. Additionally, through mortality trend and pattern analysis, we found that in the first decade of the civil registration system in South Africa, the average life expectancy was approximately 50 years. In the second decade, however, life expectancy dropped significantly, and the population was dying at a much younger average age of 40 years, mostly from the leading HIV-related causes. 
Interestingly, in the third decade, we see a gradual improvement in life expectancy, possibly attributable to effective health intervention programmes. Through a structural and semantic analysis of narratives where experts disagree, we also identify the most frequent terms, which relate to traditional healer consultations and visits. The comparative approach also makes this study a baseline for future research, supporting generalization and comparability. Future work will explore deep learning models for CoD classification.
Keywords: CCVA; Verbal Autopsy; algorithms; cause of death; machine learning
Year: 2022 PMID: 36238252 PMCID: PMC9552851 DOI: 10.3389/fpubh.2022.990838
Source DB: PubMed Journal: Front Public Health ISSN: 2296-2565
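
The abstract reports Cause Specific Mortality Fraction (CSMF) accuracy and an overall chance-corrected concordance for InterVA without giving formulas. The sketch below uses the definitions that are standard in the VA validation literature, which is an assumption about how the reported values were computed. The function names and the ten toy cases are hypothetical and are not taken from the Agincourt data.

```python
from collections import Counter


def csmf_accuracy(true, pred):
    """CSMF accuracy: 1 - sum|true_j - pred_j| / (2 * (1 - min_j true_j))."""
    n = len(true)
    causes = sorted(set(true) | set(pred))
    true_counts, pred_counts = Counter(true), Counter(pred)
    abs_error = sum(abs(true_counts[c] / n - pred_counts[c] / n) for c in causes)
    min_true_fraction = min(true_counts[c] / n for c in causes)
    return 1 - abs_error / (2 * (1 - min_true_fraction))


def overall_ccc(true, pred):
    """Mean over causes of (sensitivity_j - 1/N) / (1 - 1/N), N = number of causes."""
    causes = sorted(set(true))
    n_causes = len(causes)
    per_cause = []
    for cause in causes:
        total = sum(1 for t in true if t == cause)
        hits = sum(1 for t, p in zip(true, pred) if t == cause and p == cause)
        per_cause.append((hits / total - 1 / n_causes) / (1 - 1 / n_causes))
    return sum(per_cause) / n_causes


# Toy data (hypothetical): ten deaths, one HIV/TB case misassigned to Neoplasms.
true = ["HIV/TB"] * 6 + ["Cardiovascular"] * 3 + ["Neoplasms"]
pred = ["HIV/TB"] * 5 + ["Neoplasms"] + ["Cardiovascular"] * 3 + ["Neoplasms"]

print(round(csmf_accuracy(true, pred), 4))
print(round(overall_ccc(true, pred), 4))
```

CSMF accuracy compares population-level cause fractions, while chance-corrected concordance scores individual-level agreement, so the two can diverge substantially, as the 83% and 0.36 figures in the abstract illustrate.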
Twelve disease classes and the number of data samples before and after data balancing.
| Disease class | Class label | Samples before balancing | Samples after balancing |
|---|---|---|---|
| HIV/TB | 0 | 3,388 | 3,388 |
| Other infectious | 1 | 964 | 3,388 |
| Metabolic | 2 | 242 | 3,388 |
| Cardiovascular | 3 | 140 | 3,388 |
| Indeterminate | 4 | 1,468 | 3,388 |
| Maternal and Neonatal | 5 | 121 | 3,388 |
| Abdominal | 6 | 117 | 3,388 |
| Neoplasms | 7 | 93 | 3,388 |
| External causes | 8 | 89 | 3,388 |
| Neurological | 9 | 57 | 3,388 |
| Respiratory | 10 | 46 | 3,388 |
| Other NCD | 11 | 21 | 3,388 |
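
The table shows every class raised to the majority-class count of 3,388 after balancing. The record does not name the balancing technique, so the sketch below assumes plain random oversampling with replacement. It is one possible way to reach those counts, not necessarily the authors' method, and `oversample_to_majority` is a hypothetical helper.

```python
import random
from collections import Counter


def oversample_to_majority(labels, seed=0):
    """Return sample indices in which every class matches the majority class size.

    Assumption: minority classes are topped up by drawing existing samples
    with replacement (random oversampling).
    """
    rng = random.Random(seed)
    indices_by_class = {}
    for index, label in enumerate(labels):
        indices_by_class.setdefault(label, []).append(index)
    target = max(len(indices) for indices in indices_by_class.values())
    balanced = []
    for indices in indices_by_class.values():
        balanced.extend(indices)  # keep every original sample
        balanced.extend(rng.choices(indices, k=target - len(indices)))
    return balanced


# Class label -> sample count before balancing, copied from the table above.
counts_before = {0: 3388, 1: 964, 2: 242, 3: 140, 4: 1468, 5: 121,
                 6: 117, 7: 93, 8: 89, 9: 57, 10: 46, 11: 21}
labels = [label for label, count in counts_before.items() for _ in range(count)]

balanced_indices = oversample_to_majority(labels)
counts_after = Counter(labels[i] for i in balanced_indices)
print(sorted(counts_after.items()))
```

With oversampling of this kind, the smallest class (Other NCD, 21 samples) is repeated many times over, so balancing would normally be applied to the training split only to avoid duplicated cases leaking into the test set.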
Figure 1. Schematic diagram of the ML process followed.
Model optimal hyperparameters.
| Model | Optimal hyperparameters |
|---|---|
| XGBoost | L1, max_depth=10, objective=multi:softmax, learning_rate=0.1, alpha=0 |
| RF | gini, max_depth=10, n_estimators=100, min_samples_leaf=1 |
| ANN | relu, alpha=0.0001, solver=adam |
| KNN | minkowski, n_neighbors=5, p=2 |
| SVM | gamma=scale, kernel=rbf, C=1.0 |
| Bagging | KNN, max_samples, max_features |
| DT | gini, min_samples_split=2, min_samples_leaf=1 |
| LR | |
| BC | alpha=1.0, fit_prior=True, class_prior=None |
XGBoost, eXtreme Gradient Boosting; RF, Random Forest; ANN, Artificial Neural Network; KNN, K-Nearest Neighbor; SVM, Support Vector Machine; BG, Bagging; DT, Decision Tree; LR, Logistic Regression; BC, Bayes Classifier.
Comparison of nine ML models using narratives only.
| Model | | | | | AUROCMIA | AUROCMAA |
|---|---|---|---|---|---|---|
| XGBoost | 96 | 96 | 96 | 96 | 0.927 | 0.906 |
| RF | 96 | 96 | 96 | 96 | 0.998 | 0.996 |
| ANN | 94 | 94 | 94 | 94 | 0.982 | 0.964 |
| KNN | 93 | 93 | 93 | 92 | 0.989 | 0.987 |
| SVM | 92 | 92 | 92 | 92 | 0.917 | 0.917 |
| Bagging | 91 | 91 | 91 | 91 | 0.997 | 0.995 |
| DT | 85 | 84 | 85 | 84 | 0.910 | 0.910 |
| LR | 82 | 82 | 82 | 82 | 0.977 | 0.959 |
| BC | 71 | 75 | 71 | 72 | 0.921 | 0.920 |
XGBoost, eXtreme Gradient Boosting; RF, Random Forest; ANN, Artificial Neural Network; KNN, K-Nearest Neighbor; SVM, Support Vector Machine; BG, Bagging; DT, Decision Tree; LR, Logistic Regression; BC, Bayes Classifier; AUROCMIA, Area Under Receiver Operating Characteristics Micro Average; AUROCMAA, Area Under Receiver Operating Characteristics Macro Average.
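
The footnote above distinguishes micro-averaged (AUROCMIA) from macro-averaged (AUROCMAA) AUROC. As a minimal sketch of that distinction, assuming a one-vs-rest treatment of the classes (the record does not say how the averages were computed), macro averaging takes the mean of the per-class AUROCs, while micro averaging pools every (sample, class) pair into one binary problem. The function names and the four toy predictions are hypothetical.

```python
def binary_auroc(y_true, scores):
    """AUROC as the probability that a positive outranks a negative (ties count 0.5)."""
    positives = [s for y, s in zip(y_true, scores) if y == 1]
    negatives = [s for y, s in zip(y_true, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in positives for n in negatives)
    return wins / (len(positives) * len(negatives))


def macro_micro_auroc(labels, probabilities, n_classes):
    """Return (macro, micro) one-vs-rest AUROC for multiclass predictions."""
    one_hot = [[1 if y == k else 0 for k in range(n_classes)] for y in labels]
    per_class = [
        binary_auroc([row[k] for row in one_hot], [p[k] for p in probabilities])
        for k in range(n_classes)
    ]
    macro = sum(per_class) / n_classes
    # Micro: flatten all (sample, class) indicator/score pairs into one problem.
    flat_true = [v for row in one_hot for v in row]
    flat_scores = [v for p in probabilities for v in p]
    micro = binary_auroc(flat_true, flat_scores)
    return macro, micro


# Toy predictions (hypothetical): 4 samples, 3 classes, one row of scores each.
labels = [0, 1, 2, 1]
probabilities = [
    [0.7, 0.2, 0.1],
    [0.2, 0.5, 0.3],
    [0.1, 0.3, 0.6],
    [0.5, 0.4, 0.1],
]
macro, micro = macro_micro_auroc(labels, probabilities, n_classes=3)
print(macro, micro)
```

In this toy case every class is perfectly separated on its own, yet the pooled micro average is lower because scores are compared across classes. This is one reason the two columns can differ for the same model.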
Comparison of nine ML models using questionnaire responses only.
| Model | | | | | AUROCMIA | AUROCMAA |
|---|---|---|---|---|---|---|
| XGBoost | 100 | 100 | 100 | 100 | 1 | 1 |
| ANN | 99 | 99 | 99 | 99 | 1 | 1 |
| Bagging | 98 | 98 | 98 | 98 | 0.998 | 0.998 |
| KNN | 98 | 98 | 98 | 98 | 0.997 | 0.997 |
| RF | 97 | 97 | 97 | 97 | 0.999 | 0.998 |
| DT | 97 | 97 | 97 | 97 | 0.976 | 0.976 |
| SVM | 94 | 94 | 94 | 94 | 0.990 | 0.988 |
| LR | 83 | 83 | 83 | 83 | 0.990 | 0.980 |
| BC | 74 | 77 | 74 | 75 | 0.869 | 0.884 |
XGBoost, eXtreme Gradient Boosting; RF, Random Forest; ANN, Artificial Neural Network; KNN, K-Nearest Neighbor; SVM, Support Vector Machine; BG, Bagging; DT, Decision Tree; LR, Logistic Regression; BC, Bayes Classifier; AUROCMIA, Area Under Receiver Operating Characteristics Micro Average; AUROCMAA, Area Under Receiver Operating Characteristics Macro Average.
Comparison of nine ML models using combined narratives and questionnaire responses.
| Model | | | | | AUROCMIA | AUROCMAA |
|---|---|---|---|---|---|---|
| XGBoost | 96 | 96 | 96 | 96 | 0.994 | 0.990 |
| RF | 96 | 96 | 96 | 96 | 0.998 | 0.996 |
| ANN | 96 | 95 | 96 | 95 | 0.995 | 0.991 |
| Bagging | 93 | 92 | 93 | 92 | 0.994 | 0.994 |
| KNN | 91 | 91 | 91 | 90 | 0.982 | 0.981 |
| DT | 87 | 87 | 87 | 87 | 0.928 | 0.928 |
| LR | 76 | 76 | 76 | 76 | 0.985 | 0.973 |
| BC | 72 | 73 | 72 | 73 | 0.910 | 0.907 |
| SVM | 68 | 68 | 68 | 66 | 0.969 | 0.958 |
XGBoost, eXtreme Gradient Boosting; RF, Random Forest; ANN, Artificial Neural Network; KNN, K-Nearest Neighbor; SVM, Support Vector Machine; BG, Bagging; DT, Decision Tree; LR, Logistic Regression; BC, Bayes Classifier; AUROCMIA, Area Under Receiver Operating Characteristics Micro Average; AUROCMAA, Area Under Receiver Operating Characteristics Macro Average.
Figure 2. Area Under Receiver Operating Characteristics (AUROC) of our nine classifiers using combined questionnaire responses and narratives. (A) AUROC for ANN. (B) AUROC for KNN. (C) AUROC for RF. (D) AUROC for DT. (E) AUROC for SVM. (F) AUROC for LR. (G) AUROC for XGBoost. (H) AUROC for BG. (I) AUROC for BC. XGBoost, eXtreme Gradient Boosting; RF, Random Forest; ANN, Artificial Neural Network; KNN, K-Nearest Neighbor; SVM, Support Vector Machine; BG, Bagging; DT, Decision Tree; LR, Logistic Regression; BC, Bayes Classifier.
Statistical tests of our nine models.
| Model | | | |
|---|---|---|---|
| XGBoost | 0.9622614 | 0.003209 | 836.00 |
| RF | 0.9566394 | 0.0030548 | 735.50 |
| ANN | 0.9530553 | 0.0025771 | 663.50 |
| Bagging | 0.9216445 | 2.91e+07 | 585.00 |
| KNN | 0.9015075 | 0.0033769 | 447.00 |
| DT | 0.8671503 | 0.003984 | 255.00 |
| LR | 0.7509405 | 0.0124037 | 155.00 |
| BC | 0.698092 | 0.0081906 | 55.00 |
| SVM | 0.6783361 | 0.0054433 | 50.00 |
XGBoost, eXtreme Gradient Boosting; RF, Random Forest; ANN, Artificial Neural Network; KNN, K-Nearest Neighbor; SVM, Support Vector Machine; BG, Bagging; DT, Decision Tree; LR, Logistic Regression; BC, Bayes Classifier.
Figure 3. Top 12 CoD diseases.
Figure 4. Computer Coded Verbal Autopsy (CCVA) mortality trends based on age, population, and gender. (A) Cause of death by sex. (B) Percentage of deaths by age group. (C) Yearly mortality trends by gender.
Figure 5. Gender and age group count graphs. (A) Gender count. (B) Age group count per gender. (C) Age group count.
Figure 6. Mortality trends across age groups. (A) Number of deaths over time. (B) Age at death count. (C) Yearly death count across age groups. (D) Age group CoD count. (E) CoD and age at death. (F) Age at death per year.
Figure 7. Yearly CoD based on gender. (A) CoD based on gender. (B) Yearly CoD based on gender.
Figure 8. Tri-gram model showing frequently occurring contradicting cases.
Figure 9. Bi-gram model of our best model predictors.
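
Figures 8 and 9 refer to tri-gram and bi-gram models of narrative text. As a rough illustration of what such models count, the sketch below tallies word n-grams with whitespace tokenization. The tokenization, the `ngram_counts` helper, and the two example sentences are all assumptions made for illustration, not the authors' preprocessing or data.

```python
from collections import Counter


def ngram_counts(texts, n):
    """Count word n-grams across texts using lowercase whitespace tokens."""
    counts = Counter()
    for text in texts:
        tokens = text.lower().split()
        # zip over n shifted copies of the token list yields consecutive n-grams.
        counts.update(zip(*(tokens[i:] for i in range(n))))
    return counts


# Invented narratives (hypothetical), echoing the traditional healer theme.
narratives = [
    "she visited a traditional healer",
    "he consulted a traditional healer first",
]

bigrams = ngram_counts(narratives, 2)
trigrams = ngram_counts(narratives, 3)
print(bigrams.most_common(2))
print(trigrams.most_common(1))
```

Ranking the resulting counts is what surfaces frequent phrases such as those about traditional healer consultations mentioned in the abstract.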