Ghulam Mujtaba, Liyana Shuib, Ram Gopal Raj, Retnagowri Rajandram, Khairunisa Shaikh, Mohammed Ali Al-Garadi.
Abstract
OBJECTIVES: Widespread implementation of electronic databases has improved the accessibility of plaintext clinical information for supplementary use. Numerous machine learning techniques, such as supervised machine learning approaches or ontology-based approaches, have been employed to obtain useful information from plaintext clinical data. This study proposes an automatic multi-class classification system to predict accident-related causes of death from plaintext autopsy reports through expert-driven feature selection with supervised automatic text classification decision models.
Year: 2017 PMID: 28166263 PMCID: PMC5293233 DOI: 10.1371/journal.pone.0170242
Source DB: PubMed Journal: PLoS One ISSN: 1932-6203 Impact factor: 3.240
Distribution of the dataset across all nine classes.
| S. No. | Cause of Death | ICD-10 Code | No. of Records | Gender | Age in years | Nationality | Total Distribution |
|---|---|---|---|---|---|---|---|
| 1 | Multiple Injury | T07 | 260 | Male: 87% Female: 13% | Minimum: 14 Maximum: 87 Average: 39 | Malay: 24% Chinese: 19% Indonesian: 7% Indian: 24% Pakistani: 5% Bangladeshi: 19% Philippines: 2% | 11.8% |
| 2 | Craniocerebral Injury | S06 | 260 | Male: 84% Female: 16% | Minimum: 6 Maximum: 86 Average: 41 | Malay: 31% Chinese: 35% Indian: 24% Indonesian: 6% Bangladeshi: 2% Philippines: 2% | 11.8% |
| 3 | Abdominal Injury | S38 | 260 | Male: 92% Female: 8% | Minimum: 20 Maximum: 50 Average: 30 | Malay: 42% Indian: 15% Indonesian: 29% Pakistani: 14% | 11.8% |
| 4 | Neck Injury | S17 | 260 | Male: 80% Female: 20% | Minimum: 15 Maximum: 50 Average: 30 | Malay: 40% Indian: 40% Pakistani: 20% | 11.8% |
| 5 | Chest Injury | S28 | 250 | Male: 89% Female: 11% | Minimum: 23 Maximum: 50 Average: 31 | Malay: 34% Indian: 16% Indonesian: 34% Pakistani: 16% | 11.4% |
| 6 | Liver Rupture | S36 | 250 | Male: 67% Female: 33% | Minimum: 20 Maximum: 55 Average: 39 | Malay: 33% Chinese: 33% Indian: 17% Pakistani: 17% | 11.4% |
| 7 | Asphyxiation | T71 | 220 | Male: 86% Female: 14% | Minimum: 5 Maximum: 50 Average: 24 | Malay: 34% Chinese: 17% Indian: 16% Indonesian: 17% Pakistani: 16% | 10.0% |
| 8 | Electrocution | T75 | 220 | Male: 88% Female: 12% | Minimum: 5 Maximum: 44 Average: 24 | Malay: 38% Chinese: 37% Indonesian: 25% | 10.0% |
| 9 | Epileptic Seizure | G40 | 220 | Male: 81% Female: 19% | Minimum: 17 Maximum: 50 Average: 33 | Malay: 17% Chinese: 33% Indian: 17% Indonesian: 33% | 10.0% |
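The "Total Distribution" column can be reproduced from the "No. of Records" column alone. The following is a minimal sketch of that arithmetic; the names `records` and `class_distribution` are illustrative and do not come from the paper.

```python
# Record counts per class, copied from the "No. of Records" column above.
records = {
    "Multiple Injury": 260,
    "Craniocerebral Injury": 260,
    "Abdominal Injury": 260,
    "Neck Injury": 260,
    "Chest Injury": 250,
    "Liver Rupture": 250,
    "Asphyxiation": 220,
    "Electrocution": 220,
    "Epileptic Seizure": 220,
}


def class_distribution(counts):
    """Return each class's share of all records, in percent, to one decimal."""
    total = sum(counts.values())
    return {name: round(100 * n / total, 1) for name, n in counts.items()}


total_records = sum(records.values())
distribution = class_distribution(records)

print(f"Total records: {total_records}")
for name, share in distribution.items():
    print(f"{name}: {share}%")
```

Under these counts the dataset holds 2,200 reports, and the shares fall between 10.0% and 11.8%, so the nine classes are close to balanced.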
Fig 1. Algorithm of expert-driven feature selection approach.
Fig 2. The complete flow of this research study.
Fig 3. Overall accuracy across feature selection schemes, subsets of features and classifier.
Fig 4. Macro precision across feature selection schemes, subsets of features and classifier.
Fig 5. Macro recall across feature selection schemes, subsets of features and classifier.
Fig 6. Macro F-measure across feature selection schemes, subsets of features and classifier.
Fig 7. Area under the ROC curve for all nine classes.
Fig 8. Computational time analysis of decision models and feature selection approaches.
Fig 9. Decision model accuracy versus number of autopsy reports.
Comparison of accuracy (%) of the baseline approaches and the proposed approach.
| Classifier | Baseline 1 | Baseline 2 | Baseline 3 | Baseline 4 | Proposed Approach |
|---|---|---|---|---|---|
| NB | 71.59 | 70.13 | 67.72 | 72.59 | 80.31 |
| SVM | 69.81 | 68.90 | 68.40 | 72.81 | 82.31 |
| KNN | 70.45 | 70.45 | 70.45 | 70.50 | 82.59 |
| J48 | 72.45 | 72.68 | 73.18 | 73.85 | 83.90 |
| RF | 70.72 | 70.72 | 70.72 | 73.18 | 83.18 |
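To make the comparison easier to read, the sketch below computes, for each classifier, the gain of the proposed approach over the strongest of the four baselines. The values are copied from the table above; the names `accuracy` and `gain_over_best_baseline` are illustrative and not taken from the paper.

```python
# Accuracy per classifier: (Baseline 1..4, Proposed Approach), from the table above.
accuracy = {
    "NB": ([71.59, 70.13, 67.72, 72.59], 80.31),
    "SVM": ([69.81, 68.90, 68.40, 72.81], 82.31),
    "KNN": ([70.45, 70.45, 70.45, 70.50], 82.59),
    "J48": ([72.45, 72.68, 73.18, 73.85], 83.90),
    "RF": ([70.72, 70.72, 70.72, 73.18], 83.18),
}


def gain_over_best_baseline(table):
    """Return proposed accuracy minus the best baseline accuracy, per classifier."""
    return {
        clf: round(proposed - max(baselines), 2)
        for clf, (baselines, proposed) in table.items()
    }


gains = gain_over_best_baseline(accuracy)
best_classifier = max(accuracy, key=lambda clf: accuracy[clf][1])

for clf, gain in gains.items():
    print(f"{clf}: +{gain:.2f} points over the best baseline")
print(f"Highest accuracy with the proposed approach: {best_classifier}")
```

In this table Baseline 4 is the strongest baseline for every classifier, and the proposed approach improves on it by roughly 8 to 12 percentage points.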