Syed Shariyar Murtaza1, Patrycja Kolpak2, Ayse Bener1, Prabhat Jha2,3.
Abstract
Verbal autopsy (VA) uses post-mortem surveys to retrospectively assign causes of death (COD), mostly in low- and middle-income countries where the majority of deaths occur at home rather than in a hospital; the resulting COD data support evidence-based health system strengthening. Automated algorithms for VA COD assignment have been developed, and their performance has been assessed against physician and clinical diagnoses. Because the performance of automated classification methods remains low, we aimed to enhance the Naïve Bayes Classifier (NBC) algorithm to produce better ranked COD classifications on 26,766 deaths from four globally diverse VA datasets than some of the leading VA classification methods, namely Tariff, InterVA-4, InSilicoVA and NBC.

We used a different strategy: training multiple NBC algorithms using the one-against-all approach (OAA-NBC). To compare performance, we computed the cumulative cause-specific mortality fraction (CSMF) accuracies for population-level agreement from rank one to rank five COD classifications. To assess individual-level COD assignments, cumulative partially chance-corrected concordance (PCCC) and sensitivity were measured for up to five ranked classifications.

Overall, the results show that, based on cumulative CSMF accuracy, PCCC and sensitivity scores, OAA-NBC consistently assigns CODs that most closely resemble physician and clinical COD assignments compared with some of the leading algorithms. Our approach improves classification performance (sensitivity) by between 6% and 8% compared with other VA algorithms. Population-level agreements for OAA-NBC and NBC were similar to or higher than those of the other algorithms used in the experiments.

Although OAA-NBC still requires improvement for individual-level COD assignment, the one-against-all approach improved its ability to assign CODs that more closely resemble physician or clinical COD classifications compared with some of the other leading VA classifiers.
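The one-against-all strategy described in the abstract can be illustrated with a minimal sketch. It assumes binary symptom indicators, a Bernoulli naive Bayes model with Laplace smoothing, and ranking of causes by log-odds; none of these details is given in the text above, and the function names and toy data are hypothetical. It is not the authors' implementation.

```python
import math


def fit_binary_nb(rows, labels, n_features, alpha=1.0):
    """Fit a Bernoulli naive Bayes model for one binary target (True/False labels)."""
    model = {}
    for cls in (True, False):
        subset = [r for r, y in zip(rows, labels) if y == cls]
        # Laplace-smoothed class prior and per-symptom presence probabilities.
        prior = (len(subset) + alpha) / (len(rows) + 2 * alpha)
        probs = [
            (sum(r[j] for r in subset) + alpha) / (len(subset) + 2 * alpha)
            for j in range(n_features)
        ]
        model[cls] = (math.log(prior), probs)
    return model


def log_odds(model, row):
    """Log-odds that `row` belongs to the target cause rather than to any other cause."""
    def log_joint(cls):
        log_prior, probs = model[cls]
        return log_prior + sum(
            math.log(p if x else 1.0 - p) for x, p in zip(row, probs)
        )
    return log_joint(True) - log_joint(False)


def fit_one_against_all(rows, causes):
    """Train one binary classifier per cause: that cause against all the others."""
    n_features = len(rows[0])
    return {
        cause: fit_binary_nb(rows, [c == cause for c in causes], n_features)
        for cause in sorted(set(causes))
    }


def rank_causes(models, row, top_k=5):
    """Return up to `top_k` causes, most likely first."""
    ranked = sorted(models, key=lambda c: log_odds(models[c], row), reverse=True)
    return ranked[:top_k]


# Hypothetical toy data: columns are cough, fever, loose stools, trauma.
train_rows = [
    [1, 1, 0, 0],
    [1, 0, 0, 0],
    [0, 0, 1, 0],
    [0, 1, 1, 0],
    [0, 0, 0, 1],
    [0, 0, 1, 1],
]
train_causes = [
    "respiratory", "respiratory",
    "diarrhoeal", "diarrhoeal",
    "injury", "injury",
]

models = fit_one_against_all(train_rows, train_causes)
ranking = rank_causes(models, [1, 1, 0, 0], top_k=3)
print(ranking)
```

Each binary model only has to separate one cause from the rest, and the ranked list it produces is what rank 1 to rank 5 evaluations operate on.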
Keywords: COD classification; CSMF Accuracy; VA algorithms; performance assessment; sensitivity
Year: 2019 PMID: 31131367 PMCID: PMC6480413 DOI: 10.12688/gatesopenres.12891.2
Source DB: PubMed Journal: Gates Open Res ISSN: 2572-4754
Verbal autopsy (VA) datasets used in the study *.
| | MDS | Agincourt | Matlab | PHMRC- | PHMRC- | PHMRC- | PHMRC- |
|---|---|---|---|---|---|---|---|
| Region | India | South Africa | Bangladesh | Multiple [1] | Multiple | Andhra | Andhra |
| # of deaths | 12,225 | 5,823 | 2,000 | 4,654 | 2,064 | 1,233 | 948 |
| Ages | 1–59 months | 15–64 years | 20–64 years | 12–69 years | 28 days– | 12–69 years | 28 days– |
| # of grouped | 15 | 16 | 15 | 13 | 9 | 13 | 9 |
| # of | 90 | 88 | 214 | 224 | 133 | 224 | 133 |
| Physician | Dual | Dual | Two level | Hospital | Hospital | Hospital | Hospital |

[1] Six sites in total: Andhra Pradesh and Uttar Pradesh (India), Distrito Federal (Mexico), Bohol (Philippines) and Dar es Salaam and Pemba (Tanzania); applicable to both adult and child age group specific datasets.
*MDS, Agincourt and Matlab had CODs assigned by physician review of VA datasets and PHMRC is based on physician review of clinical diagnostic criteria.
Cause of death (COD) list with absolute death counts by VA dataset *.
| Groups | Causes | Agincourt | Matlab | MDS | PHMRC | PHMRC | PHMRC | PHMRC |
|---|---|---|---|---|---|---|---|---|
| 1 | Acute respiratory | 110 | 11 | 3392 | 304 | 81 | 532 | 141 |
| 2 | HIV/AIDS | 2012 | NA | 5 | NA | NA | NA | NA |
| 3 | Diarrhoeal | 66 | 29 | 2711 | 101 | 41 | 256 | 112 |
| 4 | Pulmonary TB | 690 | 43 | 78 | 177 | 21 | NA | |
| 5 | Other and | 432 | 79 | 2514 | 622 | 174 | 376 | 187 |
| 6 | Neoplasms (cancer) | 244 | 352 | 96 | 497 | 19 | 28 | 15 |
| 7 | Nutrition and | 70 | 90 | 372 | NA | NA | NA | NA |
| 8 | Cardiovascular | 381 | 714 | 18 | 928 | 242 | 76 | 25 |
| 9 | Chronic Respiratory | 27 | 129 | 21 | 84 | 52 | NA | NA |
| 10 | Liver cirrhosis | 89 | 100 | 112 | 234 | 59 | NA | NA |
| 11 | Other non- | 221 | 244 | 1345 | 697 | 125 | 186 | 80 |
| 12 | Neonatal conditions | NA | NA | 410 | NA | NA | NA | NA |
| 13 | Road and transport | 219 | 49 | 95 | 124 | 32 | 92 | 64 |
| 14 | Other injuries | 366 | 68 | 659 | 471 | 218 | 324 | 259 |
| 15 | Ill-defined | 711 | 35 | 397 | NA | NA | 194 | 65 |
| 16 | Suicide | 125 | 34 | NA | 70 | 33 | NA | NA |
| 17 | Maternal | 60 | 23 | NA | 345 | 136 | NA | NA |
*MDS, Agincourt and Matlab had CODs assigned by physician review of VA datasets and PHMRC is based on physician review of clinical diagnostic criteria.
Figure 1. Overview of one-against-all approach.
Figure 2. One-against-all approach for ensemble learning.
Figure 3. Ranks 1 and 5 cause-specific mortality fraction (CSMF) accuracies (agreement) across VA datasets and algorithms.
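For readers unfamiliar with the metric named in Figure 3, the following is a minimal sketch of rank 1 CSMF accuracy using the formula commonly used in the VA literature: one minus the summed absolute CSMF error, divided by twice one minus the smallest true CSMF. The function names and toy cause labels are hypothetical, and the sketch does not cover how the cumulative rank 1 to 5 variant is computed.

```python
from collections import Counter


def csmf(causes, cause_list):
    """Cause-specific mortality fractions: share of deaths assigned to each cause."""
    counts = Counter(causes)
    total = len(causes)
    return {c: counts.get(c, 0) / total for c in cause_list}


def csmf_accuracy(true_causes, predicted_causes):
    """CSMF accuracy = 1 - sum|true - predicted| / (2 * (1 - min(true)))."""
    cause_list = sorted(set(true_causes) | set(predicted_causes))
    true_frac = csmf(true_causes, cause_list)
    pred_frac = csmf(predicted_causes, cause_list)
    abs_error = sum(abs(true_frac[c] - pred_frac[c]) for c in cause_list)
    # The denominator is the largest possible total error, so the score lies in [0, 1].
    worst_case = 2 * (1 - min(true_frac.values()))
    return 1 - abs_error / worst_case


# Hypothetical toy data: 10 deaths across three causes.
truth = ["a"] * 5 + ["b"] * 3 + ["c"] * 2
predicted = ["a"] * 6 + ["b"] * 3 + ["c"] * 1

accuracy = csmf_accuracy(truth, predicted)
print(round(accuracy, 3))
```

Because the score compares only the cause distributions, individual deaths can be misclassified while population-level agreement stays high.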
Cumulative sensitivity of rank 1 and rank 5 (1-5) for COD (cause of death) classifications by VA (Verbal Autopsy) datasets and algorithms.
| | VA dataset, rank, cumulative sensitivity (%) | | | | | | | | | | | | | |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| | MDS | | Matlab | | Agincourt | | PHMRC | | PHMRC | | PHMRC | | PHMRC | |
| Algorithm | Rank 1 | Rank 1-5 | Rank 1 | Rank 1-5 | Rank 1 | Rank 1-5 | Rank 1 | Rank 1-5 | Rank 1 | Rank 1-5 | Rank 1 | Rank 1-5 | Rank 1 | Rank 1-5 |
| Tariff | 31.5 | 71.4 | 40.7 | 75.3 | 27.5 | 72.1 | 35.9 | 74.7 | 44.0 | 79.4 | 37.0 | 83.7 | 39.5 | 86.3 |
| InterVA-4 | 48.8 | 82.7 | 34.8 | 79.3 | 46.3 | 78.8 | 36.3 | 82.2 | 41.1 | 84.6 | 45.1 | 91.8 | 51.2 | 93.0 |
| InSilicoVA | 45.6 | 85.9 | 35.6 | 80.8 | 35.8 | 80.3 | 35.0 | 79.5 | 50.3 | 87.3 | 43.3 | 89.6 | 49.4 | 92.4 |
| NBC | 56.0 | 90.1 | 50.7 | 87.2 | 48.2 | 87.4 | 47.7 | 88.1 | 54.8 | 86.1 | 51.5 | 93.1 | 58.6 | 92.4 |
| OAA-NBC | 61.1 | 94.3 | 57.9 | 91.2 | 55.5 | 93.1 | 53.1 | 91.0 | 60.1 | 93.1 | 54.6 | 93.4 | 63.0 | 94.7 |
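The cumulative (rank 1 to k) sensitivity reported above, and the partially chance-corrected concordance mentioned in the abstract, can be sketched as follows. The sketch assumes the commonly cited definition PCCC(k) = (C - k/N) / (1 - k/N), where C is the fraction of deaths whose true cause appears among the top k ranked causes and N is the number of causes. The function names and toy data are hypothetical.

```python
def cumulative_sensitivity(true_causes, ranked_predictions, k):
    """Fraction of deaths whose true cause appears among the top-k ranked causes."""
    hits = sum(
        1
        for truth, ranked in zip(true_causes, ranked_predictions)
        if truth in ranked[:k]
    )
    return hits / len(true_causes)


def pccc(true_causes, ranked_predictions, k, n_causes):
    """Partially chance-corrected concordance for the top-k ranked causes."""
    concordance = cumulative_sensitivity(true_causes, ranked_predictions, k)
    chance = k / n_causes  # expected hit rate when k causes are picked at random
    return (concordance - chance) / (1 - chance)


# Hypothetical toy data: four deaths, three ranked causes each, five causes overall.
truth = ["a", "b", "c", "a"]
ranked = [
    ["a", "b", "c"],
    ["a", "b", "c"],
    ["a", "b", "c"],
    ["b", "c", "a"],
]

for k in (1, 2, 3):
    sens = cumulative_sensitivity(truth, ranked, k)
    print(k, sens, round(pccc(truth, ranked, k, n_causes=5), 4))
```

Sensitivity can only grow as k increases, which is why the rank 1-5 columns are always at least as large as the rank 1 columns; the chance correction offsets part of that growth.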
Top ranked (most likely) sensitivity scores per COD (cause of death) by VA (verbal autopsy) dataset and algorithm with physician assigned COD distributions.
| | | Cause, sensitivity (%) | | | | | | | | | | | | | | | | |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| VA Dataset | Algorithm | Acute | HIV/AIDS | Diarrhoeal | Tuberculosis | Other & | Cancers | Nutrition & | Cardiovascular | Chronic Respiratory | Liver cirrhosis | Other NCDs | Neonatal | Road & transport | Other injuries | Ill-defined | Suicide | Maternal |
| MDS | | | | | | | | | | | | | | | | | | |
| | Tariff | 36.1 | 10.0 | 47.5 | 42.5 | 19.7 | 16.7 | 31.7 | 5.0 | 0.0 | 23.9 | 3.0 | 11.2 | 84.3 | 57.3 | 25.7 | - | - |
| | InterVA-4 | 78.0 | 0.0 | 55.3 | 51.0 | 43.6 | 9.5 | 0.9 | 8.3 | 3.3 | 29.2 | 1.0 | 15.4 | 70.7 | 71.5 | 10.4 | - | - |
| | InSilicoVA | 61.5 | 0.0 | 55.7 | 50.0 | 32.4 | 0.6 | 42.3 | 0.0 | 0.0 | 21.0 | 6.8 | 13.9 | 82.1 | 69.6 | 63.3 | - | - |
| | NBC | 74.9 | 0.0 | 70.4 | 31.6 | 46.5 | 4.1 | 41.3 | 0.0 | 1.7 | 18.0 | 22.6 | 15.2 | 73.0 | 80.1 | 49.0 | - | - |
| | OAA-NBC | 85.2 | 0.0 | 78.5 | 17.9 | 51.5 | 0.0 | 25.3 | 0.0 | 0.0 | 4.5 | 23.0 | 11.0 | 79.8 | 80.6 | 25.7 | - | - |
| Matlab | | | | | | | | | | | | | | | | | | |
| | Tariff | 15.0 | - | 53.3 | 55.0 | 15.0 | 41.0 | 61.1 | 38.1 | 79.8 | 50.0 | 9.9 | - | 57.0 | 51.2 | 13.3 | 70.8 | 16.7 |
| | InterVA-4 | 0.0 | - | 26.7 | 51.0 | 29.8 | 48.6 | 21.1 | 32.1 | 42.6 | 61.0 | 7.4 | - | 81.5 | 37.1 | 0.0 | 70.8 | 15.0 |
| | InSilicoVA | 20.0 | - | 50.0 | 34.5 | 11.4 | 17.1 | 34.4 | 47.9 | 71.3 | 53.0 | 8.2 | - | 91.5 | 19.0 | 13.3 | 86.7 | 8.3 |
| | NBC | 10.0 | - | 21.7 | 42.5 | 15.4 | 55.4 | 43.3 | 64.1 | 66.5 | 57.0 | 20.0 | - | 83.5 | 15.0 | 21.7 | 76.7 | 5.0 |
| | OAA-NBC | 20.0 | - | 51.7 | 30.5 | 7.5 | 67.6 | 38.9 | 75.3 | 75.8 | 53.0 | 23.8 | - | 96.0 | 39.5 | 2.5 | 75.8 | 5.0 |
| Agincourt | | | | | | | | | | | | | | | | | | |
| | Tariff | 44.3 | 21.4 | 39.8 | 53.3 | 7.2 | 24.6 | 69.3 | 24.7 | 30.8 | 50.0 | 19.6 | - | 80.8 | 41.0 | 3.0 | 14.0 | 60.3 |
| | InterVA-4 | 36.1 | 74.5 | 34.7 | 59.9 | 12.5 | 28.1 | 25.8 | 13.7 | 43.3 | 50.7 | 9.9 | - | 78.4 | 64.7 | 0.0 | 21.9 | 29.2 |
| | InSilicoVA | 53.1 | 29.3 | 31.2 | 60.9 | 11.4 | 26.2 | 32.8 | 14.8 | 35.8 | 41.4 | 18.5 | - | 81.5 | 52.7 | 29.7 | 79.8 | 52.1 |
| | NBC | 41.2 | 59.4 | 27.9 | 60.8 | 26.6 | 35.3 | 33.2 | 28.3 | 33.3 | 39.1 | 16.6 | - | 79.3 | 63.3 | 27.1 | 69.2 | 53.3 |
| | OAA-NBC | 39.1 | 77.9 | 24.3 | 48.0 | 52.3 | 28.7 | 42.9 | 44.1 | 3.3 | 35.8 | 19.0 | - | 82.6 | 82.0 | 26.7 | 6.4 | 48.3 |
| PHMRC - | | | | | | | | | | | | | | | | | | |
| | Tariff | 26.0 | - | 28.6 | 47.4 | 26.8 | 48.7 | - | 30.3 | 19.3 | 64.0 | 5.8 | - | 64.0 | 37.8 | - | 22.9 | 89.9 |
| | InterVA-4 | 14.5 | - | 5.9 | 14.6 | 45.8 | 47.7 | - | 32.6 | 45.4 | 87.2 | 13.0 | - | 29.2 | 40.8 | - | 25.7 | 61.8 |
| | InSilicoVA | 16.1 | - | 36.7 | 22.6 | 27.6 | 39.4 | - | 25.2 | 32.1 | 46.9 | 13.1 | - | 76.1 | 59.0 | - | 35.7 | 80.3 |
| | NBC | 26.7 | - | 31.7 | 30.0 | 40.7 | 60.0 | - | 49.4 | 41.7 | 60.6 | 21.3 | - | 61.4 | 69.6 | - | 35.7 | 84.1 |
| | OAA-NBC | 22.7 | - | 22.8 | 20.3 | 52.1 | 64.2 | - | 64.6 | 27.4 | 62.4 | 26.3 | - | 59.7 | 74.3 | - | 18.6 | 90.1 |
| PHMRC - | | | | | | | | | | | | | | | | | | |
| | Tariff | 28.9 | - | 56.2 | - | 20.5 | 6.7 | - | 14.5 | - | - | 22.7 | - | 67.8 | 62.2 | 36.4 | - | - |
| | InterVA-4 | 69.9 | - | 45.3 | - | 25.8 | 43.3 | - | 5.0 | - | - | 8.6 | - | 78.4 | 63.5 | 18.4 | - | - |
| | InSilicoVA | 39.3 | - | 45.6 | - | 26.9 | 35.0 | - | 10.4 | - | - | 17.2 | - | 87.1 | 86.4 | 29.2 | - | - |
| | NBC | 60.5 | - | 48.4 | - | 45.5 | 10.0 | - | 15.7 | - | - | 12.9 | - | 83.9 | 85.5 | 27.2 | - | - |
| | OAA-NBC | 71.0 | - | 53.4 | - | 46.6 | 0.0 | - | 0.0 | - | - | 8.5 | - | 90.4 | 91.0 | 23.0 | - | - |
*Proportion of deaths assigned for each COD by physician(s) review of VA datasets (MDS, Agincourt and Matlab) or by physician’s clinical diagnoses (PHMRC).
Complete mapping of ICD-10 (International Classification of Diseases, 10th revision) and WHO (World Health Organization) cause labels to the cause list used for performance assessments.
| No. | Cause of Death | WHO list of Causes | ICD-10 Range |
|---|---|---|---|
| 1 | Acute respiratory | Acute resp infect incl pneumonia, Neonatal | H65-H68, H70-H71, J00-J22, J32, J36, |
| 2 | HIV/AIDS | HIV/AIDS related deaths | B20-B24 |
| 3 | Diarrhoeal | Diarrhoeal diseases | A00-A09 |
| 4 | Pulmonary TB | Pulmonary tuberculosis | A15-A16, B90, J65 |
| 5 | Other and | Sepsis (non-obstetric), Malaria, Measles, Meningitis | A17-A33, A35-A99, B00-B17, B19, |
| 6 | Neoplasms | Oral neoplasms, Digestive neoplasms, Respiratory | C00-C26, C30-C45, C47-C58, C60-C97, |
| 7 | Nutrition and | Severe anaemia, Severe malnutrition | D50-D53, E00-E02, E40-E46, E50-E64, |
| 8 | Cardiovascular | Diabetes mellitus, Acute cardiac disease, Stroke, | E10-E14, G45-G46, G81-G83, I60-I69, |
| 9 | Chronic | Chronic obstructive pulmonary dis, Asthma | J30-J31, J33-J35, J37-J64, J66-J84, |
| 10 | Liver cirrhosis | Liver cirrhosis | B18, F10, K70-K77, R16-R18, X45, Y15, |
| 11 | Other non- | Sickle cell with crisis, Acute abdomen, Renal | D55-D63, D65-D83, D86, D89, E03-E07, |
| 12 | Neonatal | Cause of death unknown, Prematurity, Birth | C76, D64, G40, O60, P00, P01, P02-P03, |
| 13 | Road and | Road traffic accident, Other transport accident | V01-V99, Y85 |
| 14 | Other injuries | Accid fall, Accid drowning and submersion, | S00-S99, T00-T99, W00-W99, X00-X44, |
| 15 | Ill-defined | NA | P96, R02, R07-R09, R25, R51-R54, |
| 16 | Suicide | Intentional self-harm | X60-X84 |
| 17 | Maternal | Ectopic pregnancy, Abortion-related death, | A34, F53, O00-O08, O10-O16, O20-O99 |
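A lookup from an ICD-10 code to a cause group, as tabulated above, might be sketched like this. Only rows whose cause name and ICD-10 ranges appear complete in the table are included (HIV/AIDS, Diarrhoeal, Pulmonary TB, Suicide and Maternal). The parsing rules, function names and the handling of a decimal suffix are assumptions made for illustration.

```python
import re

# Range specifications copied from the complete rows of the mapping table.
RANGES = {
    "HIV/AIDS": "B20-B24",
    "Diarrhoeal": "A00-A09",
    "Pulmonary TB": "A15-A16, B90, J65",
    "Suicide": "X60-X84",
    "Maternal": "A34, F53, O00-O08, O10-O16, O20-O99",
}


def parse_code(code):
    """Split a code such as 'B20' or 'A09.0' into (letter, two-digit number)."""
    match = re.fullmatch(r"([A-Z])(\d{2})(?:\.\d+)?", code.strip().upper())
    if match is None:
        raise ValueError(f"unrecognised ICD-10 code: {code!r}")
    return match.group(1), int(match.group(2))


def expand(spec):
    """Turn 'A15-A16, B90' into a list of (start, end) pairs of parsed codes."""
    spans = []
    for part in spec.split(","):
        start, _, end = part.strip().partition("-")
        start_key = parse_code(start)
        end_key = parse_code(end) if end else start_key
        spans.append((start_key, end_key))
    return spans


LOOKUP = {cause: expand(spec) for cause, spec in RANGES.items()}


def cause_for(code):
    """Return the cause group covering `code`, or None if no listed range matches."""
    key = parse_code(code)
    for cause, spans in LOOKUP.items():
        # Tuples compare letter first, then number, so ranges within one letter work.
        if any(low <= key <= high for low, high in spans):
            return cause
    return None


for example in ("B22", "A09.0", "J65", "O09"):
    print(example, cause_for(example))
```

Codes outside the listed ranges, such as O09, return `None` here; in the full mapping they would fall into one of the cause groups whose ranges are not reproduced in this sketch.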
Comparison of cumulative sensitivity and cause-specific mortality fraction (CSMF) accuracy of rank 1 and 5 classifications using Dirichlet distribution on MDS and Matlab data.
| Algorithm | VA dataset, rank, cumulative sensitivity and CSMF accuracy (%) | | | | | | | |
|---|---|---|---|---|---|---|---|---|
| | MDS | | | | Matlab | | | |
| | Sensitivity | | CSMF accuracy | | Sensitivity | | CSMF accuracy | |
| | Rank 1 | Rank 1-5 | Rank 1 | Rank 1-5 | Rank 1 | Rank 1-5 | Rank 1 | Rank 1-5 |
| Tariff | 29.0 | 64.7 | 53.7 | 74.6 | 45.2 | 79.0 | 54.6 | 80.8 |
| InterVA-4 | 33.6 | 63.9 | 49.4 | 70.7 | 33.4 | 71.5 | 51.6 | 75.1 |
| InSilicoVA | 38.1 | 75.9 | 57.2 | 80.5 | 37.7 | 81.4 | 59.4 | 85.8 |
| NBC | 41.7 | 74.7 | 60.4 | 79.6 | 38.7 | 73.7 | 57.6 | 76.7 |
| OAA-NBC | 41.0 | 75.0 | 59.8 | 79.2 | 45.6 | 86.2 | 60.4 | 88.3 |
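The caption above does not say how the Dirichlet distribution was applied. One procedure often used in VA validation studies is to draw a random cause composition from an uninformative Dirichlet distribution and resample the test set with replacement to match it; the sketch below illustrates that idea only, under that assumption. The function names, record layout and seeds are hypothetical.

```python
import random


def dirichlet_sample(n_causes, rng, alpha=1.0):
    """Draw one point from a symmetric Dirichlet by normalising gamma variates."""
    draws = [rng.gammavariate(alpha, 1.0) for _ in range(n_causes)]
    total = sum(draws)
    return [d / total for d in draws]


def resample_test_set(records, rng, size=None):
    """Resample (features, cause) records so the cause mix follows a Dirichlet draw."""
    by_cause = {}
    for record in records:
        by_cause.setdefault(record[1], []).append(record)
    causes = sorted(by_cause)
    weights = dirichlet_sample(len(causes), rng)
    size = len(records) if size is None else size
    # Pick a cause for each slot, then a random death with that cause (with replacement).
    chosen = rng.choices(causes, weights=weights, k=size)
    return [rng.choice(by_cause[cause]) for cause in chosen]


# Hypothetical toy records: (symptom vector, cause label).
records = [((1, 0), "a"), ((0, 1), "b"), ((1, 1), "c"), ((0, 0), "a")]
resampled = resample_test_set(records, random.Random(1), size=50)
print(len(resampled))
```

Repeating the draw many times yields test sets with widely varying cause compositions, so a method cannot score well merely by reproducing the cause distribution of its training data.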