Simon Podnar, Matjaž Kukar, Gregor Gunčar, Mateja Notar, Nina Gošnjak, Marko Notar.
Abstract
Routine blood test results are assumed to contain much more information than is usually recognised even by the most experienced clinicians. Using routine blood tests from 15,176 neurological patients we built a machine learning predictive model for the diagnosis of brain tumours. We validated the model by retrospective analysis of 68 consecutive brain tumour and 215 control patients presenting to the neurological emergency service. Only patients with head imaging and routine blood test data were included in the validation sample. The sensitivity and specificity of the adapted tumour model in the validation group were 96% and 74%, respectively. Our data demonstrate the feasibility of brain tumour diagnosis from routine blood tests using machine learning. The reported diagnostic accuracy is comparable and possibly complementary to that of imaging studies. The presented machine learning approach opens a completely new avenue in the diagnosis of these grave neurological diseases and demonstrates the utility of valuable information obtained from routine blood tests.
Year: 2019 PMID: 31597942 PMCID: PMC6785553 DOI: 10.1038/s41598-019-51147-3
Source DB: PubMed Journal: Sci Rep ISSN: 2045-2322 Impact factor: 4.379
Figure 1. Flow chart of patients included in the validation process.
Characteristics of data used for the training and validation of the machine learning model.
| Characteristic | Training group | Validation group: tumour | Validation group: control |
|---|---|---|---|
| Recruitment period | 2012/03–2017/12 | 2018/01–2018/07 | 2016/01–2016/02 |
| Inclusion criteria | Blood tests & diagnoses | Blood tests & neuroimaging | Blood tests & neuroimaging |
| Number of patients | 15,176 | 68 | 215 |
| Number of men (%) | 7285 (48%) | 34 (50%) | 115 (53%) |
| Age range (median) years | 18–103 (67) | 19–86 (66) | 24–97 (74) |
| The most frequent ICD diagnosis (I63 – cerebral infarction) | 34% | 0% | 49% |
| Number of tumour patients | 701 | 68 | 0 |
| Number of blood parameters | 295 | 110 | 138 |
| Missing parameter values | 83% | 81% | 84% |
| **Tumour diagnoses** | | | |
| Malignant neoplasm of brain - C71 | 496 (70.8%) | 56 (82.4%) | 0 (0%) |
| Benign neoplasm of the brain and other parts of the CNS - D33 | 26 (3.7%) | 0 (0%) | 0 (0%) |
| Neoplasm of uncertain behaviour of the brain and CNS - D43 | 51 (7.3%) | 0 (0%) | 0 (0%) |
| Malignant neoplasm of the meninges - C70 | 36 (5.1%) | 0 (0%) | 0 (0%) |
| Benign neoplasm of the meninges - D32 | 63 (9.0%) | 9 (13.2%) | 0 (0%) |
| Neoplasm of uncertain behaviour of the meninges - D42 | 5 (0.7%) | 0 (0%) | 0 (0%) |
| Malignant neoplasm of the spinal cord, cranial nerves and other parts of the CNS - C72 | 24 (3.4%) | 3 (4.4%) | 0 (0%) |
Basic performance indicators with Agresti-Coull binomial confidence intervals (CIs) as obtained from the training dataset (n = 15,176 neurological patients) with ten-fold cross-validation and in the validation dataset (68 brain tumour patients and 215 control patients).
| k | Sensitivity@k ± binomial CI (training dataset) | Sensitivity@k ± binomial CI (validation dataset) | Specificity@k ± binomial CI (training dataset) | Specificity@k ± binomial CI (validation dataset) |
|---|---|---|---|---|
| 1 | 0.51 ± 0.037 | 0.51 ± 0.116 | 0.98 ± 0.002 | 0.99 ± 0.018 |
| 3 | 0.66 ± 0.035 | 0.69 ± 0.108 | 0.93 ± 0.004 | 0.93 ± 0.035 |
| 5 | 0.78 ± 0.031 | 0.79 ± 0.097 | 0.83 ± 0.006 | 0.82 ± 0.052 |
| 10 | 0.95 ± 0.016 | 0.99 ± 0.043 | 0.38 ± 0.008 | 0.23 ± 0.056 |
@k – prediction is correct if the actual tumour diagnosis is within the first k predicted diagnoses.
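The two quantities in the table above, sensitivity@k and the Agresti-Coull interval half-width, can be illustrated with a minimal sketch. It is not the authors' code: the function names, the toy rankings, the use of a single ICD code as the target, and the count of 35 hits out of 68 (chosen only because 35/68 rounds to the reported 0.51) are all assumptions made for illustration.

```python
from math import sqrt
from statistics import NormalDist


def agresti_coull(successes, n, confidence=0.95):
    """Return (adjusted centre, half-width) of the Agresti-Coull interval."""
    z = NormalDist().inv_cdf(0.5 + confidence / 2)  # about 1.96 for 95%
    n_adj = n + z * z                               # adjusted sample size
    p_adj = (successes + z * z / 2) / n_adj         # adjusted proportion
    half_width = z * sqrt(p_adj * (1 - p_adj) / n_adj)
    return p_adj, half_width


def sensitivity_at_k(ranked_predictions, target, k):
    """Fraction of patients whose first k predicted diagnoses contain target.

    Every entry of ranked_predictions is assumed to belong to a patient who
    truly has the target diagnosis.
    """
    hits = sum(1 for ranked in ranked_predictions if target in ranked[:k])
    return hits / len(ranked_predictions)


# Hypothetical ranked predictions (ICD codes) for four tumour patients.
toy_rankings = [
    ["C71", "I63", "G40"],
    ["I63", "C71", "G40"],
    ["I63", "G40", "C71"],
    ["G40", "I63", "G35"],
]
sens_at_1 = sensitivity_at_k(toy_rankings, "C71", 1)  # 1 of 4 patients
sens_at_3 = sensitivity_at_k(toy_rankings, "C71", 3)  # 3 of 4 patients

# Assumed count: 35 correct out of the 68 validation tumour patients.
centre, half_width = agresti_coull(35, 68)
print(f"sens@1={sens_at_1:.2f}, sens@3={sens_at_3:.2f}")
print(f"Agresti-Coull: {centre:.3f} ± {half_width:.3f}")
```

Under the assumed count, the half-width comes out close to the ± 0.116 reported for validation sensitivity at k = 1, which suggests the table's intervals are 95% intervals. The much narrower training-set intervals follow from the larger sample sizes.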
Figure 2. ROC curves for tumour diagnosis calculated from (a) the training data using ten-fold stratified cross-validation and from (b) the validation data. The blue dot depicts the point corresponding to the selected threshold value.
Figure 3. Group median values of the 40 most important routine blood parameters, centred and scaled to reference intervals. Groups (tumour/no tumour) within the datasets (training/validation) were evaluated by the Anderson-Darling test. The significance levels (0.95 and 0.99) of the test results are depicted at the bottom.
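The caption does not give the formula used for centring and scaling. One plausible reading, shown here purely as an assumption, maps each parameter's reference interval onto [-1, 1], so that 0 is the middle of the interval and values beyond ±1 fall outside it. The function name, the measurements and the reference limits below are hypothetical and not taken from the study.

```python
from statistics import median


def scale_to_reference(value, low, high):
    """Map a measurement so that the reference interval [low, high] becomes [-1, 1].

    Assumed transformation; the source only says "centred and scaled".
    """
    midpoint = (low + high) / 2
    half_range = (high - low) / 2
    return (value - midpoint) / half_range


# Hypothetical measurements of one blood parameter in one patient group,
# with a hypothetical reference interval of 120 to 160 (arbitrary units).
values = [118.0, 125.0, 131.0, 140.0, 152.0]
low, high = 120.0, 160.0

scaled = [scale_to_reference(v, low, high) for v in values]
group_median = median(scaled)
print(f"group median on the reference scale: {group_median:.2f}")
```

Putting every parameter on a common scale of this kind is what allows group medians of parameters with different units to be drawn side by side in a single figure.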