Alexandros Laios, Alexandros Gryparis, Diederick DeJong, Richard Hutson, Georgios Theophilou, Chris Leach.
Abstract
BACKGROUND: The foundation of modern ovarian cancer care is cytoreductive surgery to remove all macroscopic disease (R0). Identifying R0 resection patients may help individualise treatment. Machine learning and AI have been shown to be effective systems for classification and prediction. For a disease as heterogeneous as ovarian cancer, they could potentially outperform conventional predictive algorithms for routine clinical use. We investigated the performance of an AI system, the k-nearest neighbor (k-NN) classifier, to predict R0, comparing it with logistic regression. Patients diagnosed with advanced-stage, high-grade serous ovarian, tubal and primary peritoneal cancer who underwent surgical cytoreduction from 2015 to 2019 were selected from the ovarian database. Performance variables included age, BMI, Charlson Comorbidity Index, timing of surgery, surgical complexity and disease score. The k-NN algorithm classified R0 vs non-R0 patients using 3-20 nearest neighbors. Prediction accuracy was estimated as the percentage of observations in the training set correctly classified.
Keywords: Artificial intelligence; Cytoreduction; Machine learning; Ovarian Cancer; Predictive factors
Year: 2020 PMID: 32993745 PMCID: PMC7526140 DOI: 10.1186/s13048-020-00700-0
Source DB: PubMed Journal: J Ovarian Res ISSN: 1757-2215 Impact factor: 4.234
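The k-NN classification described in the abstract can be illustrated with a minimal sketch. It assumes z-score scaling of the predictors, Euclidean distance and an unweighted majority vote; none of these choices is confirmed by the text, and the helper names (`fit_scaler`, `scale`, `knn_predict`) and the toy rows are invented for illustration and carry no clinical meaning.

```python
import math
from collections import Counter

def fit_scaler(rows):
    """Return per-column means and standard deviations of the training rows."""
    cols = list(zip(*rows))
    means = [sum(c) / len(c) for c in cols]
    # A zero standard deviation falls back to 1.0 so constant columns do not divide by zero.
    sds = [math.sqrt(sum((v - m) ** 2 for v in c) / len(c)) or 1.0
           for c, m in zip(cols, means)]
    return means, sds

def scale(row, means, sds):
    """Convert one row to z-scores using the training means and deviations."""
    return [(v - m) / s for v, m, s in zip(row, means, sds)]

def knn_predict(train_x, train_y, query, k):
    """Majority vote among the k training points closest to the query."""
    ranked = sorted(zip(train_x, train_y),
                    key=lambda pair: math.dist(pair[0], query))
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]

# Hypothetical toy rows (two made-up numeric predictors), not study data.
toy_x = [[50, 22.0], [52, 24.0], [55, 23.0], [70, 33.0], [72, 35.0], [75, 34.0]]
toy_y = ["R0", "R0", "R0", "non-R0", "non-R0", "non-R0"]

means, sds = fit_scaler(toy_x)
train = [scale(row, means, sds) for row in toy_x]

pred_a = knn_predict(train, toy_y, scale([53, 23.0], means, sds), k=3)
pred_b = knn_predict(train, toy_y, scale([71, 34.0], means, sds), k=3)
print(pred_a, pred_b)
```

Scaling matters here because predictors measured on very different ranges would otherwise dominate the distance; whether the authors scaled their predictors is not stated in this record.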
Descriptive statistics for the continuous variables, by group and overall
| Variable | R0 vs non R0 | N | Mean | Median | SD | Min | Max | p-value |
|---|---|---|---|---|---|---|---|---|
| Age | non R0 | 96 | 63.42 | 65.00 | 10.777 | 41 | 88 | 0.185 |
| | R0 | 58 | 65.93 | 67.00 | 9.942 | 46 | 90 | |
| | Total | 154 | 64.36 | 66.00 | 10.509 | 41 | 90 | |
| BMI | non R0 | 96 | 27.466 | 26.600 | 5.6198 | 18.3 | 51.4 | 0.474 |
| | R0 | 58 | 26.884 | 25.800 | 6.0783 | 15.4 | 58.0 | |
| | Total | 154 | 27.247 | 26.400 | 5.7840 | 15.4 | 58.0 | |
| Pre-treatment CA125 | non R0 | 92 | 2202.95 | 732.50 | 4335.145 | 27 | 28,000 | 0.530 |
| | R0 | 57 | 1560.68 | 586.00 | 2284.097 | 40 | 11,100 | |
| | Total | 149 | 1957.25 | 710.00 | 3691.556 | 27 | 28,000 | |
Absolute and relative frequencies for the categorical variables, by group and overall
| Levels | Non R0 | R0 | Totals | |
|---|---|---|---|---|
| | 51 | 38 | 89 | – |
| | 44 | 32 | 76 | |
| | 30 | 12 | 41 | |
| | 14 | 9 | 23 | |
| | 7 | 3 | 10 | |
| | 1 | 1 | 2 | |
| | – | 1 | 1 | |
| | 96 | 58 | 153 | |
| | 54 | 37 | 91 | |
| | 30 | 17 | 47 | |
| | 12 | 4 | 16 | |
| | 96 | 58 | 154 | |
| | 17 | 14 | 31 | |
| | 79 | 44 | 123 | |
| | 96 | 58 | 154 | |
| | 14 (14.6%) | 2 | 16 (10.3%) | |
| | 60 | 28 | 88 | |
| | 22 | 28 | 50 | |
| | 96 | 58 | 154 | |
Fig. 1 k-NN modelling framework flowchart. The framework for building the predictive model comprised three steps: data pre-processing, model training and performance evaluation. TP: true positive, FP: false positive, TN: true negative, FN: false negative
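The performance-evaluation step rests on the four confusion-matrix cells named in the caption. The sketch below shows one way to count them and turn them into percentages, assuming R0 is treated as the positive class and that the TP, FN, TN and FP figures are per-class rates (TP and FN share the actual-R0 denominator, TN and FP the actual non-R0 denominator). Both assumptions, the function names and the toy labels are illustrative only.

```python
def confusion_counts(actual, predicted, positive="R0"):
    """Count TP, FP, TN, FN with `positive` as the positive class."""
    tp = fp = tn = fn = 0
    for a, p in zip(actual, predicted):
        if p == positive:
            if a == positive:
                tp += 1
            else:
                fp += 1
        elif a == positive:
            fn += 1
        else:
            tn += 1
    return tp, fp, tn, fn

def pct(numerator, denominator):
    """Percentage, or NaN when the denominator is zero."""
    return 100.0 * numerator / denominator if denominator else float("nan")

def summarize(tp, fp, tn, fn):
    """Overall accuracy plus per-class rates, all in percent."""
    return {
        "accuracy": pct(tp + tn, tp + fp + tn + fn),
        "tp_rate": pct(tp, tp + fn),   # share of actual R0 predicted as R0
        "fn_rate": pct(fn, tp + fn),   # share of actual R0 predicted as non-R0
        "tn_rate": pct(tn, tn + fp),   # share of actual non-R0 predicted as non-R0
        "fp_rate": pct(fp, tn + fp),   # share of actual non-R0 predicted as R0
    }

# Hypothetical labels for five patients, not study data.
actual = ["R0", "R0", "non-R0", "non-R0", "non-R0"]
predicted = ["R0", "non-R0", "non-R0", "non-R0", "R0"]

counts = confusion_counts(actual, predicted)
summary = summarize(*counts)
print(counts, summary)
```

Under this reading, the TP and FN rates always sum to 100%, as do the TN and FP rates.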
Predictive accuracy of the k-NN model for different numbers of nearest neighbors over 500 replications, and comparison with conventional logistic regression. Accuracy selects the best number of neighbors within a larger range and uses predictor importance when calculating distances.
| Number of nearest neighbors | Mean predictive accuracy (%) | Minimum predictive accuracy (%) | Maximum predictive accuracy (%) | Mean accuracy of TPs (%) | Mean accuracy of TNs (%) | Mean accuracy of FPs (%) | Mean accuracy of FNs (%) |
|---|---|---|---|---|---|---|---|
| 3 | 57.5 | 41.2 | 72.5 | 37.0 | 70.7 | 29.3 | 63.0 |
| 4 | 57.8 | 39.2 | 78.4 | | 70.6 | 29.4 | |
| 5 | 60.7 | 43.1 | 74.5 | 36.4 | 76.3 | 23.7 | 63.6 |
| 6 | 60.8 | 41.2 | 82.4 | 37.5 | 75.8 | 24.2 | 62.5 |
| 7 | 62.9 | 45.1 | 78.4 | 35.2 | 80.6 | 19.4 | 64.8 |
| 8 | 62.8 | 43.1 | 78.4 | 35.7 | 80.2 | 19.8 | 64.3 |
| 9 | 64.5 | 47.1 | 80.4 | 34.5 | 83.6 | 16.4 | 65.5 |
| 10 | 64.4 | 47.1 | 78.4 | 35.0 | 83.2 | 16.8 | 65.0 |
| 11 | 65.2 | 45.1 | 78.4 | 34.6 | 84.7 | 15.3 | 65.4 |
| 12 | 64.7 | 43.1 | | 34.8 | 83.9 | 16.1 | 65.2 |
| 13 | 65.5 | 45.1 | 80.4 | 34.1 | 85.6 | 14.4 | 65.9 |
| 14 | 65.3 | 45.1 | 80.4 | 33.7 | 85.5 | 14.5 | 66.3 |
| 15 | | 47.1 | | 32.5 | 87.1 | 12.9 | 67.5 |
| 16 | 65.4 | 47.1 | 80.4 | 32.0 | 86.8 | 13.2 | 68.0 |
| 17 | 65.5 | | 80.4 | 30.5 | 88.0 | 12.0 | 69.5 |
| 18 | 65.5 | 47.1 | 76.5 | 29.8 | 88.3 | 11.7 | 70.2 |
| 19 | | 47.1 | 80.4 | 28.8 | 89.4 | 10.6 | 71.2 |
| 20 | 65.6 | 43.1 | 80.4 | 28.1 | | | 71.9 |
| – | 63.4 | 2.0 | 80.4 | 42.7 | 76.7 | 23.1 | 57.1 |
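The mean, minimum and maximum accuracies per number of neighbors can be produced by repeating a random split many times for each k. The sketch below assumes a random hold-out split in each replication, Euclidean distance and an unweighted vote; the split scheme, the `test_fraction` value, the function names and the synthetic data are all assumptions, and the importance-weighted distances mentioned in the caption are not implemented.

```python
import math
import random
from collections import Counter

def knn_predict(train, query, k):
    """Majority vote among the k nearest (features, label) pairs.
    With even k, a tied vote goes to the label seen first among the nearest."""
    nearest = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]

def holdout_accuracy(data, k, test_fraction, rng):
    """Accuracy (%) on one random hold-out split."""
    shuffled = data[:]
    rng.shuffle(shuffled)
    n_test = max(1, int(len(shuffled) * test_fraction))
    test, train = shuffled[:n_test], shuffled[n_test:]
    hits = sum(knn_predict(train, x, k) == y for x, y in test)
    return 100.0 * hits / n_test

def replicate(data, k, n_reps=500, test_fraction=1 / 3, seed=0):
    """Mean, minimum and maximum hold-out accuracy over n_reps random splits."""
    rng = random.Random(seed)
    scores = [holdout_accuracy(data, k, test_fraction, rng) for _ in range(n_reps)]
    return {"mean": sum(scores) / len(scores), "min": min(scores), "max": max(scores)}

# Synthetic two-predictor data with overlapping classes, not study data.
gen = random.Random(42)
data = ([([gen.gauss(0, 1), gen.gauss(0, 1)], "non-R0") for _ in range(40)]
        + [([gen.gauss(1, 1), gen.gauss(1, 1)], "R0") for _ in range(30)])

# The demo uses 100 replications for speed; the table reports 500.
results = {k: replicate(data, k, n_reps=100) for k in range(3, 21)}
for k, stats in results.items():
    print(k, round(stats["mean"], 1), round(stats["min"], 1), round(stats["max"], 1))
```

Reporting the minimum and maximum alongside the mean shows how much a single split can mislead, which is the reason for averaging over many replications.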
Fig. 2 Variable importance chart. Misclassification error for each predictor from the k-NN model for k = 15. Selected predictors are weighted by their relative importance for R0 resection prediction in the training set
Comparison of R0 resection predictions from the k-NN model and logistic regression in a random subcohort of 20 patients from the training set. All of these patients underwent primary or interval debulking surgery, and complete cytoreduction was achieved
| Patient | Age | BMI | Charlson Comorbidity Index | Type of surgery | Surgical Complexity Score categories | Pre-treatment CA125 | Disease score | k-NN model | Logistic regression |
|---|---|---|---|---|---|---|---|---|---|
| | 52 | 26.4 | 5 | Interval | Standard | 10,156 | 1 | R0 | R0 |
| | 66 | 28.1 | 5 | Interval | Standard | 466 | 1 | R0 | R0 |
| | 77 | 18.8 | 4 | Interval | Standard | 1876 | 2 | R0 | R0 |
| | 58 | 28.7 | 4 | Interval | Radical | 1900 | 2 | R0 | R0 |
| | 64 | 32.6 | 5 | Interval | Ultraradical | 1211 | 3 | R0 | R0 |
| | 51 | 23.8 | 4 | Interval | Radical | 1121 | 2 | R0 | R0 |
| | 52 | 39 | 4 | Interval | Standard | 586 | 2 | R0 | R0 |
| | 68 | 24.8 | 5 | Interval | Standard | 488 | 2 | R0 | R0 |
| | 48 | 31.9 | 5 | Interval | Standard | 3300 | 2 | R0 | R0 |
| | 88 | 23 | 5 | Interval | Standard | 286 | 1 | Non-R0 | R0 |
| | 57 | 26.9 | 4 | Primary | Standard | 875 | 2 | R0 | R0 |
| | 57 | 38.8 | 5 | Interval | Radical | 1765 | 2 | R0 | R0 |
| | 69 | 28 | 4 | Interval | Standard | 1417 | 1 | R0 | R0 |
| | 68 | 28.7 | 6 | Interval | Standard | 52 | 2 | Non-R0 | Non-R0 |
| | 50 | 33.9 | 4 | Interval | Standard | 2454 | 2 | R0 | R0 |
| | 55 | 33.2 | 4 | Interval | Standard | 1612 | 2 | R0 | R0 |
| | 68 | 21.6 | 4 | Interval | Ultraradical | 1800 | 2 | R0 | R0 |
| | 75 | 33.3 | 6 | Interval | Standard | 87 | 2 | Non-R0 | Non-R0 |
| | 73 | 24.7 | 4 | Primary | Radical | 181 | 2 | R0 | R0 |
| | 57 | 19.5 | 4 | Primary | Ultraradical | 1176 | 2 | R0 | R0 |