Young Seob Jeong, Minjun Jeon, Joung Ha Park, Min Chul Kim, Eunyoung Lee, Se Yoon Park, Yu Mi Lee, Sungim Choi, Seong Yeon Park, Ki Ho Park, Sung Han Kim, Min Huok Jeon, Eun Ju Choo, Tae Hyong Kim, Mi Suk Lee, Tark Kim.
Abstract
BACKGROUND: Tuberculous meningitis (TBM) is the most severe form of tuberculosis, but it is difficult to differentiate from viral meningitis (VM). We therefore developed machine-learning modules for differentiating TBM from VM.
Keywords: Diagnosis; Machine learning; Meningitis; Tuberculosis; Virus
Year: 2020 PMID: 33538134 PMCID: PMC8032912 DOI: 10.3947/ic.2020.0104
Source DB: PubMed Journal: Infect Chemother ISSN: 1598-8112
Parameter settings of the machine-learning models
| Model | Setting |
|---|---|
| Random forest | - Maximum number of trees: 100 |
| Naïve Bayes | - No kernel estimator, so it uses a normal distribution |
| Logistic regression | - Ridge: 1.0 × 10⁻⁸ |
| | - Training algorithm: Broyden–Fletcher–Goldfarb–Shanno |
| Support vector machine | - Training algorithm: Sequential minimal optimization |
| | - C: 1.0 |
| | - Epsilon: 1.0 × 10⁻¹² |
| | - Kernel: PolyKernel (exponent: 1.0) |
| Artificial neural network | - Hidden layers: [20, 5] |
| | - Activation function: ReLU |
| | - Number of epochs: 250 |
| | - Training algorithm: Adam (initial learning rate: 0.001) |
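
As a rough illustration of the artificial neural network settings, the sketch below runs one forward pass through two ReLU hidden layers of 20 and 5 units. Only the layer sizes and the activation function come from the table. The input size of 8 (one input per feature in the feature comparison table), the single sigmoid output, the random untrained weights, and all function names are assumptions made for illustration; training with Adam is not shown.

```python
import math
import random

HIDDEN_LAYERS = [20, 5]  # from the parameter table
N_FEATURES = 8           # assumption: one input per compared feature


def init_layer(n_in, n_out, rng):
    """Return random weights and zero biases (untrained, illustrative only)."""
    scale = math.sqrt(2.0 / n_in)
    weights = [[rng.gauss(0.0, scale) for _ in range(n_in)] for _ in range(n_out)]
    biases = [0.0] * n_out
    return weights, biases


def dense(x, weights, biases):
    """Affine transform: one weighted sum plus bias per output unit."""
    return [sum(w * v for w, v in zip(row, x)) + b for row, b in zip(weights, biases)]


def relu(z):
    return [max(0.0, v) for v in z]


def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))


def build_network(rng):
    """Layers: input -> 20 -> 5 -> 1 (the output layer is an assumption)."""
    sizes = [N_FEATURES] + HIDDEN_LAYERS + [1]
    return [init_layer(a, b, rng) for a, b in zip(sizes, sizes[1:])]


def forward(network, x):
    """Apply ReLU after each hidden layer and a sigmoid after the output layer."""
    *hidden, output = network
    for weights, biases in hidden:
        x = relu(dense(x, weights, biases))
    weights, biases = output
    return sigmoid(dense(x, weights, biases)[0])


rng = random.Random(0)
network = build_network(rng)
features = [0.5] * N_FEATURES  # hypothetical standardized feature vector
p_tbm = forward(network, features)
print(f"Untrained score for TBM: {p_tbm:.3f}")
```

The printed score is meaningless until the weights are trained; the point is only the shape of the network implied by `[20, 5]` and ReLU.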
Figure 1. Flow chart of the study.
Patients whose clinical presentation was indicative of meningitis and who had a positive CSF PCR result for HSV, VZV, or enterovirus were classified as having confirmed viral meningitis. Patients whose clinical presentation was indicative of CNS infection were classified as having confirmed TBM if the CSF specimens were positive for Mycobacterium tuberculosis by culture or PCR assay. Patients whose clinical presentation was indicative of CNS infection and who had a culture of another body fluid positive for M. tuberculosis, without other known etiologies of meningitis, were classified as having probable TBM. True positive means a correct diagnosis of tuberculous meningitis and true negative means a correct diagnosis of viral meningitis.
ANN, artificial neural network; RF, random forest; NB, naïve Bayes; LR, logistic regression; SVM, support vector machine; ID, infectious diseases; TBM, tuberculous meningitis; TP, true positive; VM, viral meningitis; FN, false negative; FP, false positive; TN, true negative.
Comparison of features between tuberculous and viral meningitis
| Features | All (N = 203) | Tuberculous (N = 60) | Viral (N = 143) | P value |
|---|---|---|---|---|
| Median age, years (IQR) | 37 (29 - 58) | 49 (33 - 64) | 34 (29 - 55) | <0.001 |
| Median symptom duration before the visit, days (IQR) | 5 (3 - 7) | 9 (6 - 15) | 4 (2 - 6) | <0.001 |
| Vomiting (%) | 80 (39.4) | 28 (46.7) | 52 (36.4) | 0.21 |
| Neurologic symptoms and signs (%) | 70 (34.5) | 39 (65.0) | 31 (21.5) | <0.001 |
| Median serum sodium, mmol/L (IQR) | 137 (134 - 139) | 133 (128 - 136) | 138 (136 - 140) | <0.001 |
| Median CSF glucose, mg/dl (IQR) | 53.3 (45.1 - 66.0) | 41.6 (28.8 - 61.5) | 57.6 (49.0 - 67.0) | <0.001 |
| Median CSF protein, mg/dl (IQR) | 117.0 (67.9 - 169.6) | 175.5 (118.7 - 317.1) | 101 (56.3 - 141.3) | <0.001 |
| Median CSF ADA, IU/L (IQR) | 7 (3 - 12) | 14 (8 - 21) | 5 (3 - 8) | <0.001 |
IQR, interquartile range; CSF, cerebrospinal fluid; ADA, adenosine deaminase.
Diagnostic performances of various machine-learning algorithms for differentiating tuberculous from viral meningitis
| Machine-learning algorithm | Matrix completion | TP | FP | TN | FN | Sensitivity (% [95% CI]) | Specificity (% [95% CI]) | Accuracy (% [95% CI]) | AUC (95% CI) |
|---|---|---|---|---|---|---|---|---|---|
| Artificial neural network | IterativeImputer | 46 | 11 | 132 | 14 | 76.7 (63.9 - 86.6) | 92.3 (86.7 - 96.1) | 87.7 (82.4 - 91.9) | 0.85 (0.79 - 0.89) |
| | SoftImputer | 41 | 20 | 123 | 19 | 68.3 (55.0 - 79.7) | 86.0 (79.2 - 91.2) | 80.8 (74.7 - 86.0) | 0.77 (0.71 - 0.83) |
| | KnnImputer (K = 1) | 43 | 11 | 132 | 17 | 71.7 (58.6 - 82.5) | 92.3 (86.7 - 96.1) | 86.2 (80.7 - 90.6) | 0.82 (0.76 - 0.87) |
| | KnnImputer (K = 2) | 43 | 10 | 133 | 17 | 71.7 (58.6 - 82.5) | 93.0 (87.5 - 96.6) | 86.7 (81.2 - 91.0) | 0.82 (0.76 - 0.87) |
| | KnnImputer (K = 3) | 42 | 12 | 131 | 18 | 70.0 (56.8 - 81.2) | 91.6 (85.8 - 95.6) | 85.2 (79.6 - 89.8) | 0.81 (0.75 - 0.86) |
| | KnnImputer (K = 4) | 42 | 12 | 131 | 18 | 70.0 (56.8 - 81.2) | 91.6 (85.8 - 95.6) | 85.2 (79.6 - 89.8) | 0.81 (0.75 - 0.86) |
| Random forest | IterativeImputer | 42 | 11 | 132 | 18 | 70.0 (56.8 - 81.2) | 92.3 (86.7 - 96.1) | 85.7 (80.1 - 90.2) | 0.81 (0.75 - 0.86) |
| | SoftImputer | 38 | 9 | 134 | 22 | 63.3 (49.9 - 75.4) | 93.7 (88.4 - 97.1) | 84.7 (79.0 - 89.4) | 0.79 (0.72 - 0.84) |
| | KnnImputer (K = 1) | 40 | 13 | 130 | 20 | 66.7 (53.3 - 78.3) | 90.9 (85.0 - 95.1) | 83.7 (77.9 - 88.5) | 0.79 (0.73 - 0.84) |
| | KnnImputer (K = 2) | 41 | 11 | 132 | 19 | 68.3 (55.0 - 79.7) | 92.3 (86.7 - 96.1) | 85.2 (79.5 - 89.8) | 0.80 (0.74 - 0.85) |
| | KnnImputer (K = 3) | 42 | 14 | 129 | 18 | 70.0 (56.8 - 81.2) | 90.2 (84.1 - 94.5) | 84.2 (78.5 - 89.0) | 0.80 (0.74 - 0.85) |
| | KnnImputer (K = 4) | 40 | 11 | 132 | 20 | 66.7 (53.3 - 78.3) | 92.3 (86.7 - 96.1) | 84.7 (79.0 - 89.4) | 0.80 (0.73 - 0.85) |
| Naïve Bayes | IterativeImputer | 36 | 11 | 132 | 24 | 60.0 (46.5 - 72.4) | 92.3 (86.7 - 96.1) | 82.8 (76.8 - 87.7) | 0.76 (0.70 - 0.82) |
| | SoftImputer | 48 | 24 | 119 | 12 | 80.0 (67.7 - 89.2) | 83.2 (76.1 - 88.9) | 82.3 (76.3 - 87.3) | 0.82 (0.76 - 0.87) |
| | KnnImputer (K = 1) | 38 | 13 | 130 | 22 | 63.3 (49.9 - 75.4) | 90.9 (85.0 - 95.1) | 82.8 (76.9 - 87.7) | 0.77 (0.71 - 0.83) |
| | KnnImputer (K = 2) | 39 | 13 | 130 | 21 | 65.0 (51.6 - 76.9) | 90.9 (85.0 - 95.1) | 83.3 (77.4 - 88.1) | 0.78 (0.72 - 0.84) |
| | KnnImputer (K = 3) | 38 | 13 | 130 | 22 | 63.3 (49.9 - 75.4) | 90.9 (85.0 - 95.1) | 82.8 (76.9 - 87.7) | 0.77 (0.71 - 0.83) |
| | KnnImputer (K = 4) | 38 | 13 | 130 | 22 | 63.3 (49.9 - 75.4) | 90.9 (85.0 - 95.1) | 82.8 (76.9 - 87.7) | 0.77 (0.71 - 0.83) |
| Logistic regression | IterativeImputer | 44 | 9 | 134 | 16 | 73.3 (60.3 - 83.9) | 93.7 (88.4 - 97.1) | 87.7 (82.4 - 91.9) | 0.84 (0.78 - 0.88) |
| | SoftImputer | 43 | 15 | 128 | 17 | 71.7 (58.6 - 82.5) | 89.5 (83.3 - 94.0) | 84.2 (78.5 - 89.0) | 0.81 (0.75 - 0.86) |
| | KnnImputer (K = 1) | 42 | 9 | 134 | 18 | 70.0 (56.8 - 81.2) | 93.7 (88.4 - 97.1) | 86.7 (81.2 - 91.0) | 0.82 (0.76 - 0.87) |
| | KnnImputer (K = 2) | 42 | 7 | 136 | 18 | 70.0 (56.8 - 81.2) | 95.1 (90.2 - 98.0) | 87.7 (82.4 - 91.9) | 0.83 (0.77 - 0.88) |
| | KnnImputer (K = 3) | 42 | 7 | 136 | 18 | 70.0 (56.8 - 81.2) | 95.1 (90.2 - 98.0) | 87.7 (82.4 - 91.9) | 0.83 (0.77 - 0.88) |
| | KnnImputer (K = 4) | 41 | 7 | 136 | 19 | 68.3 (55.0 - 79.7) | 95.1 (90.2 - 98.0) | 87.2 (81.8 - 91.5) | 0.82 (0.76 - 0.87) |
| Support vector machine | IterativeImputer | 34 | 6 | 137 | 26 | 56.7 (43.2 - 69.4) | 95.8 (91.1 - 98.5) | 84.2 (78.5 - 89.0) | 0.76 (0.70 - 0.82) |
| | SoftImputer | 45 | 17 | 126 | 15 | 75.0 (62.1 - 85.3) | 88.1 (81.6 - 92.9) | 84.2 (78.5 - 89.0) | 0.82 (0.76 - 0.87) |
| | KnnImputer (K = 1) | 33 | 4 | 139 | 27 | 55.0 (41.6 - 67.9) | 97.2 (93.0 - 99.2) | 84.7 (79.0 - 89.4) | 0.76 (0.70 - 0.82) |
| | KnnImputer (K = 2) | 34 | 8 | 135 | 26 | 56.7 (43.2 - 69.4) | 94.4 (89.3 - 97.6) | 83.3 (77.4 - 88.1) | 0.76 (0.69 - 0.81) |
| | KnnImputer (K = 3) | 34 | 8 | 135 | 26 | 56.7 (43.2 - 69.4) | 94.4 (89.3 - 97.6) | 83.3 (77.4 - 88.1) | 0.76 (0.69 - 0.81) |
| | KnnImputer (K = 4) | 35 | 7 | 136 | 25 | 58.3 (44.9 - 70.9) | 95.1 (90.2 - 98.0) | 84.2 (78.5 - 89.0) | 0.77 (0.70 - 0.82) |
Testing was conducted using leave-one-out cross-validation.
True positive means a correct diagnosis of tuberculous meningitis and true negative means a correct diagnosis of viral meningitis.
TP, true positive; FP, false positive; TN, true negative; FN, false negative; AUC, area under the receiver operating characteristics curve; 95% CI, 95% confidence interval.
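The sensitivity, specificity, and accuracy columns follow directly from the TP, FP, TN, and FN counts. The sketch below recomputes them for one row of the table. The interval method is not stated in this section, so the exact binomial (Clopper-Pearson) interval used here is an assumption, and the function names are illustrative.

```python
from math import comb


def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))


def exact_ci(x, n, alpha=0.05):
    """Clopper-Pearson interval for x successes in n trials, found by bisection."""

    def solve(k, target):
        # binom_cdf(k, n, p) decreases as p grows; find the p where it hits target.
        lo, hi = 0.0, 1.0
        for _ in range(60):
            mid = (lo + hi) / 2
            if binom_cdf(k, n, mid) > target:
                lo = mid
            else:
                hi = mid
        return (lo + hi) / 2

    lower = 0.0 if x == 0 else solve(x - 1, 1 - alpha / 2)
    upper = 1.0 if x == n else solve(x, alpha / 2)
    return lower, upper


def diagnostic_metrics(tp, fp, tn, fn):
    """Return {metric: (estimate, ci_lower, ci_upper)} as proportions."""
    counts = {
        "sensitivity": (tp, tp + fn),   # correctly identified TBM / all TBM
        "specificity": (tn, tn + fp),   # correctly identified VM / all VM
        "accuracy": (tp + tn, tp + fp + tn + fn),
    }
    return {name: (x / n, *exact_ci(x, n)) for name, (x, n) in counts.items()}


# Counts from the row "Artificial neural network, IterativeImputer"
metrics = diagnostic_metrics(tp=46, fp=11, tn=132, fn=14)
for name, (est, lo, hi) in metrics.items():
    print(f"{name}: {100 * est:.1f} ({100 * lo:.1f} - {100 * hi:.1f})")
```

The point estimates depend only on the four counts, so any row can be checked the same way. The intervals should land close to the reported ones only if the authors used the same exact method.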
Diagnostic performance of humans for differentiating tuberculous from viral meningitis
| Clinician | TP | FP | TN | FN | Sensitivity (% [95% CI]) | Specificity (% [95% CI]) | Accuracy (% [95% CI]) | AUC (95% CI) | P1ᵃ (vs. ANN with IterativeImputer) | P2ᵇ (vs. ANN with IterativeImputer) |
|---|---|---|---|---|---|---|---|---|---|---|
| Resident #1 | 32 | 20 | 123 | 28 | 53.3 (40.0 - 66.3) | 86.0 (79.2 - 91.2) | 76.4 (69.9 - 82.0) | 0.70 (0.63 - 0.76) | <0.001 | 0.0002 |
| Resident #2 | 23 | 7 | 136 | 37 | 38.3 (26.1 - 51.8) | 95.1 (90.2 - 98.0) | 78.3 (72.0 - 83.8) | 0.67 (0.60 - 0.73) | <0.001 | <0.001 |
| Resident #3 | 31 | 20 | 123 | 29 | 51.7 (38.4 - 64.8) | 86.0 (79.2 - 91.2) | 75.9 (69.4 - 81.6) | 0.69 (0.62 - 0.75) | <0.001 | 0.0001 |
| Resident #4 | 30 | 9 | 134 | 30 | 50.0 (36.8 - 63.2) | 93.7 (88.4 - 97.1) | 80.8 (74.7 - 86.0) | 0.72 (0.65 - 0.78) | <0.001 | 0.0004 |
| ID specialist #1 | 39 | 18 | 125 | 21 | 65.0 (51.6 - 76.9) | 87.4 (80.8 - 92.4) | 80.8 (74.7 - 86.0) | 0.76 (0.70 - 0.82) | <0.001 | 0.03 |
| ID specialist #2 | 46 | 26 | 117 | 14 | 76.7 (64.0 - 86.6) | 81.8 (74.5 - 87.8) | 80.3 (74.2 - 85.6) | 0.79 (0.73 - 0.85) | <0.001 | 0.16 |
ᵃ Cohen’s kappa statistic was used to test the diagnostic agreement between machine-learning and human judgment.
ᵇ Comparison of the AUC of the machine-learning model with that of human judgment.
True positive means a correct diagnosis of tuberculous meningitis and true negative means a correct diagnosis of viral meningitis.
TP, true positive; FP, false positive; TN, true negative; FN, false negative; AUC, area under the receiver operating characteristics curve; 95% CI, 95% confidence interval; ID, infectious disease.
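Cohen's kappa measures how far two raters agree beyond what chance alone would produce. The sketch below computes it for two lists of per-patient labels. The labels are made-up toy data, not study data, and the function name is illustrative. Only the kappa value is computed, not the P value reported in the table.

```python
def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa for two equal-length sequences of categorical labels."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both raters must label the same non-empty set of cases")
    n = len(rater_a)
    labels = set(rater_a) | set(rater_b)
    # Observed agreement: fraction of cases given the same label by both raters.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement: product of each rater's marginal label frequencies.
    expected = sum(
        (rater_a.count(label) / n) * (rater_b.count(label) / n) for label in labels
    )
    if expected == 1.0:
        return 1.0  # both raters used a single identical label throughout
    return (observed - expected) / (1 - expected)


# Hypothetical judgments for ten patients (toy data for illustration only)
model = ["TBM", "TBM", "VM", "VM", "VM", "TBM", "VM", "VM", "VM", "VM"]
human = ["TBM", "VM", "VM", "VM", "VM", "TBM", "VM", "TBM", "VM", "VM"]
kappa = cohens_kappa(model, human)
print(f"kappa = {kappa:.3f}")
```

A kappa of 1 means perfect agreement, and a kappa of 0 means agreement no better than chance.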
Figure 2. A plot of the diagnostic performance of machine learning and clinicians for differentiating tuberculous from viral meningitis.
The areas under the receiver operating characteristic curves (AUCs) did not differ significantly among the residents, nor between ID specialist #1 and ID specialist #2 (P = 0.38). The ID specialists had higher AUCs than the residents, although the differences were statistically significant only between ID specialist #1 and resident #2 (P = 0.01), ID specialist #2 and resident #1 (P = 0.02), ID specialist #2 and resident #2 (P = 0.003), and ID specialist #2 and resident #3 (P = 0.01). The AUC of the ANN model was significantly higher than those of all the residents. It was also significantly higher than that of ID specialist #1 (P = 0.03) and comparable to that of ID specialist #2 (P = 0.16).
ANN, artificial neural network; LR, logistic regression; ID, infectious diseases.