Oliver Profant, Zbyněk Bureš, Zuzana Balogová, Jan Betka, Zdeněk Fík, Martin Chovanec, Jan Voráček.
Abstract
Decision making on the treatment of vestibular schwannoma (VS) is based mainly on the symptoms, tumor size, the patient's preference, and the experience of the medical team. Here we provide objective tools to support the decision process by answering two questions: can a single checkup predict the need for active treatment, and which attributes of VS development are important in the decision on active treatment? Using a machine-learning analysis of the medical records of 93 patients, these questions were addressed through two classification tasks: time-independent case-based reasoning (CBR), in which each medical record was treated as independent, and personalized dynamic analysis (PDA), in which we analyzed the development of each patient's state over time. Using the CBR method we found that the Koos classification of tumor size, the speech reception threshold, and pure tone audiometry collectively predict the need for active treatment with approximately 90% accuracy; in the PDA task, the increases in Koos classification and VS size alone were sufficient. Our results indicate that the need for VS treatment may be reliably predicted using only a small set of basic parameters, even without knowledge of the individual development, which may help simplify VS treatment strategies, reduce the number of examinations, and increase cost effectiveness.
Year: 2021 PMID: 34526580 PMCID: PMC8443556 DOI: 10.1038/s41598-021-97819-x
Source DB: PubMed Journal: Sci Rep ISSN: 2045-2322 Impact factor: 4.379
Figure 1. An outline of the methodological process used in the analysis. After cleaning the data, the problem was solved in two parallel tasks (CBR and PDA). Using several feature selection methods followed by expert evaluation, the most important predictors of active VS treatment were identified. The identified set of predictors was processed by several classification methods to create models capable of predicting active VS treatment based on the predictor values. The performance of the models was analyzed using various performance metrics.
Figure 2. Diagnostic data of the patients included in the analysis. The bars represent the number of subjects having a certain characteristic. N/A not available; n ABR/DPOAE response not present; p ABR/DPOAE response present; r ABR with signs of retrocochlear lesion; l ABR with prolonged latencies; Yes+ annoying tinnitus. Gray bars: actively treated patients; white bars: wait-and-scan patients.
Figure 3. Hearing thresholds and tumor sizes of the patients included in the analysis. (A) Average audiograms of the healthy and diseased ears recorded during the initial examination in wait-and-scan patients and in patients later switched to active treatment, plus the average audiogram of the diseased ear in actively treated subjects recorded immediately before the switch from wait-and-scan to active treatment. (B) Histogram of Koos grades in the actively treated and wait-and-scan patients at the initial examination, and in the actively treated subjects immediately before the switch from wait-and-scan to active treatment; bars represent the number of subjects with a given Koos grade.
Predictors, extracted from CBR data, ordered according to their significance for each applied dimensionality reduction method.
| Ord | Decision tree | Random forest | Gradient boosting | Logistic regression | LASSO | Expert (CBREXP): initial | Num | Avg | Final |
|---|---|---|---|---|---|---|---|---|---|
| 1 | PTAVSSR8 | Koos | Koos | Koos | Koos | Koos | 5 | 1.2 | Koos |
| 2 | Koos | PTAVSSR8 | PTAVSSR8 | Size | PTAHSR8 | SRT | 5 | 3.8 | SRT |
| 3 | PTADAR4 | SRT | PTAHSR8 | SRT | Size | PTAVSSR8 | 4 | 3.3 | PTAVSSR8 |
| 4 | SRT | PTAVSAR4 | SRT | PTAVS0.25 | PTAVSSR4 | PTAHSR8 | 4 | 4.0 | PTAHSR8 |
| 5 | PTAHSR8 | PTAVS3 | Size | PTAH8 | SRT | Size | 4 | 5.0 | - |
| 6 | PTAVSSR4 | PTAHSR8 | PTAVSIR8 | PTAH0.5 | PTAHIR8 | PTAVSSR4 | 3 | 5.7 | - |
| 7 | PTADAR8 | PTAVSSR4 | PTADAR4 | PTAH2 | SDS | PTAVS0.25 | 3 | 7.0 | - |
| 8 | PTAHIR8 | PTAVS0.25 | PTADAR8 | PTAVS6 | PTAVSSR8 | PTAHIR8 | 3 | 7.7 | - |
| 9 | PTAVS3 | MDL | PTAHIR8 | PTAVS8 | PTAVS0.25 | PTADAR4 | 2 | 5.0 | PTADAR4 |
| 10 | Size | PTAH8 | PTAVS1 | SDS | MDR | - | - | - | - |
Irrelevant variables were excluded from the expert selection. Variables and metrics are denoted as follows (see “Methods”): (a) suffix 4 denotes the basic frequency range, suffix 8 the full frequency range; (b) subscripts H and VS denote the healthy and diseased ear, respectively, and subscript D the difference of averaged PTA values between the two ears; (c) trailing abbreviations have the following meaning: AR average row-wise, IR intercept row-wise, SR slope row-wise.
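The Num and Avg columns of the expert selection can be reproduced directly from the five method rankings in the table above. A minimal sketch (rankings transcribed from the table; the consensus rule, count of top-10 appearances plus mean position, is a reconstruction consistent with the reported values for Koos and SRT):

```python
# Top-10 predictor rankings per feature-selection method (from the table above).
RANKINGS = [
    ["PTAVSSR8", "Koos", "PTADAR4", "SRT", "PTAHSR8",
     "PTAVSSR4", "PTADAR8", "PTAHIR8", "PTAVS3", "Size"],      # decision tree
    ["Koos", "PTAVSSR8", "SRT", "PTAVSAR4", "PTAVS3",
     "PTAHSR8", "PTAVSSR4", "PTAVS0.25", "MDL", "PTAH8"],      # random forest
    ["Koos", "PTAVSSR8", "PTAHSR8", "SRT", "Size",
     "PTAVSIR8", "PTADAR4", "PTADAR8", "PTAHIR8", "PTAVS1"],   # gradient boosting
    ["Koos", "Size", "SRT", "PTAVS0.25", "PTAH8",
     "PTAH0.5", "PTAH2", "PTAVS6", "PTAVS8", "SDS"],           # logistic regression
    ["Koos", "PTAHSR8", "Size", "PTAVSSR4", "SRT",
     "PTAHIR8", "SDS", "PTAVSSR8", "PTAVS0.25", "MDR"],        # LASSO
]

def consensus(feature, rankings=RANKINGS):
    """Num = number of methods ranking the feature in their top 10;
    Avg = its mean position among those methods."""
    positions = [r.index(feature) + 1 for r in rankings if feature in r]
    return len(positions), round(sum(positions) / len(positions), 1)
```

For example, `consensus("Koos")` gives (5, 1.2) and `consensus("SRT")` gives (5, 3.8), matching the Expert selection columns.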
Performance of the gradually reduced expert set of variables for CBR data. All values are in %; train-and-validation columns are averages.
| Set | Number | ACC (train/val) | PPV (train/val) | TPR (train/val) | TNR (train/val) | AUC (train/val) | ASE (train/val) | ACC (test) | PPV (test) | TPR (test) | TNR (test) | AUC (test) | ASE (test) |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| CBR | 38 | 78 | 75 | 84 | 72 | 87 | 15 | 72 | 69 | 79 | 65 | 81 | 19 |
| CBREXPINI | 9 | 82 | 79 | 88 | 76 | 84 | 15 | 76 | 72 | 83 | 69 | 76 | 19 |
| CBREXPFIN | 5 | 84 | 81 | 90 | 78 | 88 | 14 | 81 | 78 | 88 | 74 | 78 | 17 |
| CBRCLASS | 38 | 86 | 84 | 91 | 82 | 92 | 11 | 88 | 85 | 93 | 82 | 87 | 12 |
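The reported metrics follow the standard confusion-matrix definitions; a minimal sketch (generic textbook formulas, not code from the paper):

```python
def confusion_metrics(tp, fp, tn, fn):
    """Standard classification metrics, in percent, from confusion-matrix counts."""
    acc = 100 * (tp + tn) / (tp + fp + tn + fn)   # accuracy
    ppv = 100 * tp / (tp + fp)                    # positive predictive value (precision)
    tpr = 100 * tp / (tp + fn)                    # true positive rate (sensitivity)
    tnr = 100 * tn / (tn + fp)                    # true negative rate (specificity)
    return acc, ppv, tpr, tnr
```

Here the positive class is "active treatment", so TPR is the fraction of actively treated patients correctly flagged and TNR the fraction of wait-and-scan patients correctly left alone.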
Performance of applied combinations of dimensionality reduction and classification techniques on the test set for CBR and CBREXPFIN data. Values are classification accuracies (ACC, %).
| Feature selection method | Decision tree | Random forest | Gradient boosting | Logistic regression | Support vector machine | Neural network | Avg |
|---|---|---|---|---|---|---|---|
| Decision tree (CBR) | 76 | 79 | 84 | 82 | 79 | 76 | 79 |
| Random forest (CBR) | 76 | 79 | 87 | 76 | 76 | 68 | 77 |
| Gradient boosting (CBR) | 84 | 82 | 86 | 84 | 76 | 89 | 84 |
| Logistic regression (CBR) | 67 | 78 | 68 | 86 | 83 | 85 | 78 |
| LASSO (CBR) | 74 | 79 | 79 | 76 | 74 | 82 | 77 |
| Expert (CBREXPFIN) | 76 | 76 | 87 | 79 | 76 | 74 | 78 |
| Avg | 76 | 79 | 82 | 81 | 77 | 79 | 79 |
Detailed metrics for the three best performing classifiers for CBR data. All values are in %; train-and-validation columns are averages.
| Classification (feature selection) | ACC (train/val) | PPV (train/val) | TPR (train/val) | TNR (train/val) | AUC (train/val) | ASE (train/val) | ACC (test) | PPV (test) | TPR (test) | TNR (test) | AUC (test) | ASE (test) |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Neural network (Gradient boosting) | 84 | 82 | 88 | 80 | 93 | 10 | 89 | 87 | 93 | 85 | 92 | 8 |
| Gradient boosting (CBREXPFIN) | 84 | 81 | 89 | 79 | 89 | 13 | 87 | 84 | 94 | 80 | 86 | 15 |
| Gradient boosting (random forest) | 91 | 89 | 96 | 86 | 94 | 10 | 87 | 84 | 93 | 81 | 84 | 14 |
Figure 4. CBR decision tree: a decision tree for CBREXPFIN variables, applied to CBR data.
Tabular representation of the decision tree for CBREXPFIN variables, applied to CBR data.
| Node ID | Samples | Node rule | No | Yes |
|---|---|---|---|---|
| 1 | 184 | PTAVSSR8 ≥ 2.3 or N/A | 2 | 3 |
| 2 | 61 | SRT ≥ 27.5 or N/A | 4 | 5 |
| 3 | 123 | PTAHSR8 ≥ 6.7 | 6 | 7 |
| 4 | 9 | No | ||
| 5 | 52 | Yes | ||
| 6 | 86 | PTADAR4 ≥ 18.1 or N/A | 8 | 9 |
| 7 | 37 | No | ||
| 8 | 17 | Yes | ||
| 9 | 69 | PTAVSSR8 ≥ 7.1 | 10 | 11 |
| 10 | 39 | No | ||
| 11 | 30 | Koos = 2 | 12 | 13 |
| 12 | 18 | No | ||
| 13 | 12 | Yes |
Inferences for selected sample CBR records using a decision tree learned from CBREXPFIN variables.
| ID | Koos | SRT | PTAVSSR8 | PTAHSR8 | PTADAR4 | Real target | Nodes visited | Prediction |
|---|---|---|---|---|---|---|---|---|
| 1 | 1 | 57 | 11.3 | 1.3 | 28.8 | No | 1, 3, 6, 9, 11, 12 | No |
| 2 | 1 | 30 | 4.7 | 0.2 | 5.0 | Yes | 1, 3, 6, 8 | Yes |
| 3 | 2 | 110 | 10.7 | 7.6 | 43.8 | No | 1, 3, 7 | No |
| 4 | 2 | 37 | 2.1 | 0.1 | 16.3 | Yes | 1, 2, 5 | Yes |
| 5 | 3 | 60 | 4.6 | 9.8 | 17.5 | No | 1, 3, 7 | No |
| 6 | 3 | 110 | 2.1 | 3.3 | 26.3 | Yes | 1, 2, 5 | Yes |
| 7 | 4 | 110 | 5.2 | 2.3 | 68.8 | No | 1, 3, 6, 9, 10 | No |
| 8 | 4 | 110 | 0 | 3.0 | 98.8 | Yes | 1, 2, 5 | Yes |
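The tabulated tree above can be read as a chain of threshold tests and transcribed directly; a sketch (rules taken verbatim from the decision-tree table; the "or N/A" branches assume a missing value is encoded as None, and PTAVSSR8 is assumed numeric wherever node 9 is reached):

```python
def predict_cbr(koos, srt, pta_vs_sr8, pta_h_sr8, pta_d_ar4):
    """Transcription of the CBREXPFIN decision tree.
    Returns "Yes" (active treatment) or "No" (wait and scan)."""
    if pta_vs_sr8 is None or pta_vs_sr8 >= 2.3:          # node 1
        if pta_h_sr8 >= 6.7:                             # node 3
            return "No"                                  # node 7
        if pta_d_ar4 is None or pta_d_ar4 >= 18.1:       # node 6
            if pta_vs_sr8 >= 7.1:                        # node 9
                return "Yes" if koos == 2 else "No"      # nodes 13 / 12
            return "No"                                  # node 10
        return "Yes"                                     # node 8
    if srt is None or srt >= 27.5:                       # node 2
        return "Yes"                                     # node 5
    return "No"                                          # node 4
```

Applied to the sample records in the inference table above (e.g. record 1: Koos 1, SRT 57, PTAVSSR8 11.3, PTAHSR8 1.3, PTADAR4 28.8), the function reproduces the published paths and predictions.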
Figure 5. Receiver operating characteristic (ROC) curves of the three best performing classifiers for CBR data. (A) Averaged ROC curves for the training and validation sets; (B) ROC curves for the test sets.
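The AUC values reported alongside the ROC curves can be interpreted as the probability that a randomly chosen actively treated patient receives a higher classifier score than a randomly chosen wait-and-scan patient. A minimal rank-based sketch of that computation (the generic definition, not the paper's code):

```python
def roc_auc(scores, labels):
    """Rank-based AUC: fraction of (positive, negative) pairs in which the
    positive example outscores the negative one; ties count as half."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))
```

Perfectly separated scores give an AUC of 1.0; chance-level ranking gives 0.5.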
Predictors, extracted from the PDA data set, ordered according to their significance for each dimensionality reduction method.
| Ord | Decision tree | Random forest | Gradient boosting | Logistic regression | LASSO | Expert (PDAEXP): initial | Num | Avg | Final |
|---|---|---|---|---|---|---|---|---|---|
| 1 | Size_LD | Size_LD | Koos_LD | Size_LD | Koos_LD | Koos_LD | 5 | 2 | Koos_LD |
| 2 | Koos_LD | Koos_LD | Size_SC | PTAVSAR4_SC | PTADAR4_IC | Size_LD | 4 | 1.5 | - |
| 3 | Size_AC | Size_SC | Size_LD | Size_SC | SRT_IC | Size_SC | 4 | 4.5 | Size_SC |
| 4 | PTAVSAR4_AC | SRT_LD | PTAVSSC4 | Koos_LD | SRT_LD | Koos_TD | 3 | 6.7 | - |
| 5 | Koos_TD | Koos_TD | SRT_SC | PTADAR8_SC | PTAVSAR8_IC | PTADAR4_SC | 3 | 7.7 | PTADAR4_SC |
| 6 | PTAVSAR8_AC | PTAVSAR8_IC | PTAVSAR8_SC | PTAVSAR8_SC | PTADAR8_LD | PTAVSAR4_SC | 3 | 5.3 | PTAVSAR4_SC |
| 7 | PTADAR8_IC | PTADAR4_SC | PTAVSAR4_LD | PTAVSAR8_SC | PTADAR8_IC | PTADAR8_IC | 3 | 7.7 | – |
| 8 | PTADAR4_SC | PTADAR4_LD | PTADAR8_SC | PTADAR4_SC | PTAVSAR4_IC | Size_AC | 3 | 7.3 | – |
| 9 | SRT_AC | PTADAR8_AC | PTADAR8_LD | PTADAR8_IC | Size_AC | – | – | – | – |
| 10 | Size_SC | PTAVSAR4_SC | Size_AC | Koos_TD | PTAVSAR4_AC | – | – | – | – |
Irrelevant variables were excluded from the expert selection. Variables and metrics are denoted as follows (see “Methods”): (a) suffix 4 denotes the basic frequency range, suffix 8 the full frequency range; (b) subscripts H and VS denote the healthy and diseased ear, respectively, and subscript D the difference of averaged PTA values between the two ears; (c) trailing abbreviations have the following meaning: AR average row-wise, IR intercept row-wise, SR slope row-wise, AC average column-wise, IC intercept column-wise, SC slope column-wise, LD last difference, TD total difference.
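A plausible reconstruction of the per-variable temporal (column-wise) features for one patient follows. The exact formulas are given in the paper's Methods, not here; treating LD as the difference between the last two checkups, TD as the total change from first to last checkup, and SC as the least-squares slope over the checkup index are assumptions for illustration:

```python
def temporal_features(values):
    """Hypothetical reconstruction of temporal features for one variable
    observed over successive checkups (assumed definitions, see Methods)."""
    n = len(values)
    ld = values[-1] - values[-2]                       # LD: last difference
    td = values[-1] - values[0]                        # TD: total difference
    xs = list(range(n))                                # checkup index as time axis
    mx, mv = sum(xs) / n, sum(values) / n
    sc = (sum((x - mx) * (v - mv) for x, v in zip(xs, values))
          / sum((x - mx) ** 2 for x in xs))            # SC: least-squares slope
    return ld, td, sc
```

For a steadily growing value such as [1.0, 2.0, 3.0, 4.0], all three features reflect the unit-per-checkup increase.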
Performance of the gradually reduced set of variables for the PDA data set. All values are in %; train-and-validation columns are averages.
| Set | Number | ACC (train/val) | PPV (train/val) | TPR (train/val) | TNR (train/val) | AUC (train/val) | ASE (train/val) | ACC (test) | PPV (test) | TPR (test) | TNR (test) | AUC (test) | ASE (test) |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| PDA | 24 | 87 | 84 | 92 | 82 | 88 | 12 | 82 | 79 | 89 | 75 | 81 | 16 |
| PDAEXPINI | 8 | 87 | 84 | 92 | 82 | 88 | 12 | 82 | 79 | 89 | 75 | 81 | 16 |
| PDAEXPFIN | 4 | 87 | 85 | 92 | 82 | 90 | 11 | 82 | 79 | 89 | 75 | 81 | 16 |
| PDACLASS | 24 | 85 | 82 | 91 | 79 | 83 | 14 | 90 | 88 | 95 | 86 | 100 | 10 |
Figure 6. PDA decision tree: a decision tree for PDAEXPFIN variables, applied to PDA data.
Tabular representation of a decision tree for PDAEXPFIN data set.
| Node ID | Samples | Node rule | Yes | No |
|---|---|---|---|---|
| 1 | 42 | Koos_LD < 0.01 | 2 | 3 |
| 2 | 31 | Size_SC < 0.006 | 4 | 5 |
| 3 | 11 | Yes | ||
| 4 | 24 | No | ||
| 5 | 7 | Yes |
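The PDA tree reduces to two threshold tests (note that the Yes/No child columns are swapped relative to the CBR table); a direct transcription:

```python
def predict_pda(koos_ld, size_sc):
    """Transcription of the PDAEXPFIN decision tree.
    Returns "Yes" (active treatment) or "No" (wait and scan)."""
    if koos_ld < 0.01:            # node 1: Koos grade essentially unchanged
        if size_sc < 0.006:       # node 2: negligible tumor-growth slope
            return "No"           # node 4: stay on wait-and-scan
        return "Yes"              # node 5
    return "Yes"                  # node 3: Koos grade increased
```

In words: any increase in Koos grade, or a non-negligible growth slope of the tumor size, predicts active treatment.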
Performance of applied combinations of dimensionality reduction and classification techniques on the test set for PDA and PDAEXPFIN data. Values are classification accuracies (ACC, %).
| Feature selection method | Decision tree | Random forest | Gradient boosting | Logistic regression | Support vector machine | Neural network | Avg |
|---|---|---|---|---|---|---|---|
| Decision tree (PDA) | 80 | 80 | 80 | 90 | 80 | 70 | 80 |
| Random forest (PDA) | 80 | 70 | 80 | 70 | 90 | 70 | 77 |
| Gradient boosting (PDA) | 80 | 80 | 80 | 90 | 80 | 80 | 82 |
| Logistic regression (PDA) | 80 | 60 | 80 | 80 | 90 | 90 | 80 |
| LASSO (PDA) | 90 | 90 | 90 | 90 | 80 | 90 | 88 |
| Expert (PDAEXPFIN) | 79 | 79 | 79 | 63 | 63 | 68 | 72 |
| Avg | 82 | 77 | 82 | 81 | 81 | 78 | 80 |
Detailed metrics for the three best performing classifiers on the PDA data set. All values are in %; train-and-validation columns are averages.
| Classification (feature selection) | ACC (train/val) | PPV (train/val) | TPR (train/val) | TNR (train/val) | AUC (train/val) | ASE (train/val) | ACC (test) | PPV (test) | TPR (test) | TNR (test) | AUC (test) | ASE (test) |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Neural network (logistic regression) | 85 | 82 | 91 | 79 | 88 | 14 | 90 | 88 | 95 | 86 | 100 | 10 |
| Logistic regression (decision tree) | 85 | 82 | 91 | 79 | 81 | 14 | 90 | 88 | 95 | 86 | 100 | 10 |
| Logistic regression (gradient boosting) | 85 | 82 | 91 | 79 | 80 | 15 | 90 | 88 | 95 | 86 | 100 | 10 |