Nikita Desai, Lukasz Aleksandrowicz, Pierre Miasnikof, Ying Lu, Jordana Leitao, Peter Byass, Stephen Tollman, Paul Mee, Dewan Alam, Suresh Kumar Rathi, Abhishek Singh, Rajesh Kumar, Faujdar Ram, Prabhat Jha.
Abstract
BACKGROUND: Physician-coded verbal autopsy (PCVA) is the most widely used method to determine causes of death (CODs) in countries where medical certification of death is uncommon. Computer-coded verbal autopsy (CCVA) methods have been proposed as a faster and cheaper alternative to PCVA, though they have not been widely compared to PCVA or to each other.
Year: 2014 PMID: 24495855 PMCID: PMC3912488 DOI: 10.1186/1741-7015-12-20
Source DB: PubMed Journal: BMC Med ISSN: 1741-7015 Impact factor: 8.775
Dataset specifications
| Region | China | N/A^a | India | South Africa | Bangladesh |
|---|---|---|---|---|---|
| Sample size | 1,502 | 1,556 | 12,225 | 5,823 | 3,270 |
| Ages | 15+ years | 15 to 105 years | 1 to 59 months | 15 to 64 years | 20 to 64 years |
| Number of CODs | 31 | 32 | 15 | 17 | 17 |
| Population | Hospital deaths | Hospital deaths | Community deaths | Community deaths | Community deaths |
| Proportion ill-defined deaths^b | 0% | 0% | 3% | 12% | 2% |
| Physician coding | Coding by a panel of three physicians assisted with medical records and diagnostic tests | Coding by one physician assisted with medical records and diagnostic tests | Dual, independent coding of VA records, disagreements resolved by reconciliation, and for remaining cases by adjudication by a third physician | Dual, independent coding of VA records, disagreements resolved by third physician. | Single physician re-coding of VA records after initial coding by another physician. |
All VA data in the Million Death Study, Agincourt and Matlab studies were collected by non-medical field staff, and coded by medical staff. ^a The full IHME hospital-based dataset includes 12,000 VA records from India, Philippines, Tanzania and Mexico and was released after this paper went to press; correspondence with the study team suggested these data were from Bangladesh but the full details of the 1,556 deaths are not published. ^b Ill-defined deaths are International Classification of Diseases-10 codes R95-R99. VA, verbal autopsy.
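As an illustration of footnote b, the proportion of ill-defined deaths can be computed by flagging ICD-10 codes in the R95-R99 range. This is a minimal sketch: the function names and the example codes are hypothetical and are not taken from the study data.

```python
from typing import Iterable

# ICD-10 three-character categories counted as ill-defined (R95-R99, per footnote b).
ILL_DEFINED_PREFIXES = ("R95", "R96", "R97", "R98", "R99")


def is_ill_defined(icd10_code: str) -> bool:
    """Return True if the code falls in the R95-R99 range (subcodes such as R96.0 included)."""
    return icd10_code.strip().upper().startswith(ILL_DEFINED_PREFIXES)


def proportion_ill_defined(codes: Iterable[str]) -> float:
    """Fraction of deaths whose assigned code is ill-defined."""
    codes = list(codes)
    if not codes:
        raise ValueError("at least one code is required")
    return sum(is_ill_defined(code) for code in codes) / len(codes)


# Hypothetical codes for five deaths; two of them are ill-defined.
example_codes = ["I64", "R99", "A16.9", "R96.0", "J44"]
example_proportion = proportion_ill_defined(example_codes)
```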
Figure 1. Comparison of open-source random forest to IHME random forest. The IHME random forest was tested on a set of IHME hospital-based data, both with and without health care experience (HCE) variables. HCE variables are binary questions on previous medically diagnosed conditions (including high blood pressure, tuberculosis, cancer), and details transcribed from the respondents’ medical records. Our IHME subset contained some, but not all, HCE variables. The ORF performed similarly to the IHME random forest method on the full hospital-based dataset without HCE variables, but less well than the IHME method when HCE variables were included [12]. HCE, health care experience; IHME, Institute for Health Metrics and Evaluation; ORF, open-source random forest.
Figure 2. Comparison of open-source tariff method to IHME tariff method. The IHME tariff method was tested on a set of IHME hospital-based data, both with and without health care experience (HCE) variables. The OTM was tested on a subset of the full IHME data, containing some, but not all, HCE variables. The OTM performed almost exactly as well as the corresponding IHME method on the full hospital-based dataset without HCE variables (for the top cause), but less well than the same IHME analysis with HCE variables. Note that results for the full IHME dataset without HCE were only available for the top assigned cause [13]. HCE, health care experience; IHME, Institute for Health Metrics and Evaluation; OTM, open-source tariff method.
Description of testing on multiple computer-coded verbal autopsy methods and datasets
| | Training / testing cases | | | | InterVA-4 |
|---|---|---|---|---|---|
| | 1100 / 400 | 48 | 48 | 48 | N/A |
| | 1100 / 400 | 96 | 96 | 96 | N/A |
| | 1100 / 400 | 89 | 89 | 89 | N/A |
| | 1100 / 1100 | 89 | 89 | 89 | N/A |
| | 6100 / 6100^a | 89 | 89 | 89 | 245 |
| | 1100 / 400 | 104 | 104 | 104 | 245^b |
| | 1100 / 1100 | 104 | 104 | 104 | 245 |
| | 2900 / 2900 | 104 | 104 | 104 | 245 |
| | 1100 / 400 | 224 | 224 | 224 | 245 |
| | 1100 / 1100 | 224 | 224 | 224 | 245 |
| | 1600 / 1600 | 224 | 224 | 224 | 245 |
Only the numbers of test cases apply to the InterVA-4 analyses, as this method does not require any training cases. InterVA-4 takes 245 diagnostic indicators as input; however, many of these were not available in the given datasets, so the number of usable variables was lower than 245. ^a The MDS dataset used for InterVA-4 contained 552 cases, for which we extracted additional InterVA-4 indicators from the narratives. ^b Each CCVA method ran 30 resamples for each training/testing split within each dataset, except InterVA-4, which used the following numbers of resamples: 1 for MDS data; 8, 7 and 6 for Agincourt data splits of 400, 1100 and 2900 test cases; and 10, 10 and 10 for Matlab data splits of 400, 1100 and 1600 test cases, respectively.
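The repeated training/testing splits described in footnote b can be sketched as follows. The sampling scheme is an assumption: this section does not say how the resamples were drawn, so the sketch draws disjoint training and testing sets without replacement, which is feasible for every split size listed given the dataset sample sizes. The function name `resample_splits` and the seed are hypothetical.

```python
import random
from typing import Iterator, List, Tuple


def resample_splits(
    n_records: int,
    n_train: int,
    n_test: int,
    n_resamples: int = 30,
    seed: int = 0,
) -> Iterator[Tuple[List[int], List[int]]]:
    """Yield (train_indices, test_indices) pairs, one per resample.

    Assumption: each resample draws disjoint training and testing sets
    uniformly at random, without replacement, from the full dataset.
    """
    if n_train + n_test > n_records:
        raise ValueError("training plus testing cases exceed the dataset size")
    rng = random.Random(seed)
    indices = list(range(n_records))
    for _ in range(n_resamples):
        chosen = rng.sample(indices, n_train + n_test)
        yield chosen[:n_train], chosen[n_train:]


# Example: 30 resamples of an 1100 / 400 split from a dataset of 1,502 records.
splits = list(resample_splits(n_records=1502, n_train=1100, n_test=400))
```

Each method would then be trained and scored once per split, and the per-split metrics averaged.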
Positive predictive values of computer-coded verbal autopsy methods versus physician-coded verbal autopsy reference standards
| Test cases | Top cause | Top 3 | Top cause | Top 3 | InterVA-4, top cause | InterVA-4, top 3 |
|---|---|---|---|---|---|---|
| 400 | 35 | 57 | 36 | 70 | N/A | N/A |
| 400 | 33 | 55 | 34 | 53 | N/A | N/A |
| 6100 | 58 | 82 | 52 | 76 | 42^a | 63^a |
| 2900 | 45 | 77 | 42 | 69 | 42 | 58 |
| 1600 | 49 | 74 | 52 | 74 | 48 | 64 |
Top cause represents accuracy of the CCVA method’s most probable cause matching the cause assigned by PCVA; Top 3 represents whether CCVA’s three most probable causes contain the cause assigned by PCVA. Averages calculated across CCVA methods only use results for the top cause. ^a The Million Death Study dataset used for InterVA-4 contained a sample of 552 cases, in which we extracted additional InterVA-4 indicators from the narratives.
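The "Top cause" and "Top 3" agreement described in the note above can be sketched as a single top-k function. This is an illustrative sketch only: the function name, the cause labels and the example rankings are hypothetical, and it assumes each CCVA method returns a list of causes ordered from most to least probable for each death.

```python
from typing import Sequence


def top_k_agreement(
    ranked_causes: Sequence[Sequence[str]],
    physician_causes: Sequence[str],
    k: int = 1,
) -> float:
    """Fraction of deaths whose physician-assigned cause is among the k most probable CCVA causes."""
    if len(ranked_causes) != len(physician_causes):
        raise ValueError("one ranking is required per physician-coded death")
    if not physician_causes:
        raise ValueError("at least one death is required")
    hits = sum(
        reference in ranking[:k]
        for ranking, reference in zip(ranked_causes, physician_causes)
    )
    return hits / len(physician_causes)


# Hypothetical rankings for four deaths (most probable cause first).
ranked = [
    ["stroke", "ihd", "copd"],
    ["tb", "pneumonia", "hiv"],
    ["injury", "stroke", "ihd"],
    ["copd", "tb", "pneumonia"],
]
physician = ["stroke", "pneumonia", "ihd", "cancer"]

top1 = top_k_agreement(ranked, physician, k=1)  # only the first death matches
top3 = top_k_agreement(ranked, physician, k=3)  # the first three deaths match
```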
Partial chance-corrected concordance of computer-coded verbal autopsy methods versus physician-coded verbal autopsy reference standards
| Test cases | Top cause | Top 3 | Top cause | Top 3 | InterVA-4, top cause | InterVA-4, top 3 |
|---|---|---|---|---|---|---|
| 400 | 33 | 55 | 32 | 64 | N/A | N/A |
| 400 | 31 | 54 | 32 | 48 | N/A | N/A |
| 6100 | 55 | 81 | 48 | 70 | 38^a | 60^a |
| 2900 | 42 | 75 | 38 | 62 | 39 | 56 |
| 1600 | 45 | 72 | 48 | 68 | 45 | 59 |
Top cause represents accuracy of the CCVA method’s most probable cause matching the cause assigned by PCVA; Top 3 represents whether CCVA’s three most probable causes contain the cause assigned by PCVA. Averages calculated across CCVA methods only use results for the top cause. ^a The Million Death Study dataset used for InterVA-4 contained a sample of 552 cases, in which we attempted to extract additional InterVA-4 indicators from the narratives.
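This section does not state the formula for partial chance-corrected concordance (PCCC). A definition commonly used in the verbal autopsy validation literature is PCCC(k) = (C - k/N) / (1 - k/N), where C is the fraction of deaths whose reference cause is among the top k assigned causes and N is the number of causes. The sketch below assumes that definition; the function name and example values are hypothetical.

```python
def pccc(concordance: float, k: int, n_causes: int) -> float:
    """Partial chance-corrected concordance under the assumed definition.

    concordance: fraction of deaths whose reference cause is in the top k.
    k:           number of top-ranked causes considered (1 for "Top cause").
    n_causes:    number of causes in the cause list.
    """
    if not 0 < k < n_causes:
        raise ValueError("k must be at least 1 and smaller than the number of causes")
    if not 0.0 <= concordance <= 1.0:
        raise ValueError("concordance must lie between 0 and 1")
    chance = k / n_causes  # agreement expected from random assignment
    return (concordance - chance) / (1 - chance)


# Hypothetical example: 50% top-cause agreement on a 17-cause list.
example_top1 = pccc(concordance=0.5, k=1, n_causes=17)
# Agreement equal to chance is corrected to zero.
example_chance = pccc(concordance=3 / 17, k=3, n_causes=17)
```

Under this definition the correction is larger for short cause lists and for larger k, which is why PCCC values sit slightly below the corresponding raw agreement figures.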
Cause-specific mortality fraction accuracy of computer-coded verbal autopsy methods versus physician-coded verbal autopsy reference standards
| Test cases | | | | InterVA-4 |
|---|---|---|---|---|
| 400 | 84 | 79 | 75 | N/A |
| 400 | 88 | 73 | 63 | N/A |
| 6100 | 96 | 64 | 33 | 70^a |
| 2900 | 94 | 72 | 38 | 75 |
| 1600 | 95 | 69 | 59 | 72 |
^a The Million Death Study dataset used for InterVA-4 contained a sample of 552 cases, in which we attempted to extract additional InterVA-4 indicators from the narratives.
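Cause-specific mortality fraction (CSMF) accuracy is not defined in this section. The sketch below assumes the definition commonly used in the verbal autopsy literature: one minus the summed absolute CSMF error, divided by its maximum possible value, 2 × (1 − the smallest reference CSMF). The function names and the example data are hypothetical.

```python
from collections import Counter
from typing import Dict, Sequence


def csmf(causes: Sequence[str]) -> Dict[str, float]:
    """Fraction of deaths assigned to each cause."""
    if not causes:
        raise ValueError("at least one death is required")
    counts = Counter(causes)
    return {cause: count / len(causes) for cause, count in counts.items()}


def csmf_accuracy(reference: Sequence[str], predicted: Sequence[str]) -> float:
    """CSMF accuracy of predicted causes against reference (physician-coded) causes."""
    true_fractions = csmf(reference)
    predicted_fractions = csmf(predicted)
    if len(true_fractions) < 2:
        raise ValueError("the reference data must contain at least two causes")
    all_causes = set(true_fractions) | set(predicted_fractions)
    total_abs_error = sum(
        abs(true_fractions.get(c, 0.0) - predicted_fractions.get(c, 0.0))
        for c in all_causes
    )
    worst_case_error = 2 * (1 - min(true_fractions.values()))
    return 1 - total_abs_error / worst_case_error


# Hypothetical example with ten deaths and three causes.
reference = ["a"] * 5 + ["b"] * 3 + ["c"] * 2
predicted = ["a"] * 4 + ["b"] * 4 + ["c"] * 2
example_accuracy = csmf_accuracy(reference, predicted)
```

Because this metric compares population-level fractions rather than individual assignments, a method can score high on CSMF accuracy while agreeing less well with physicians on individual deaths.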