Berardino De Bari, Mauro Vallati, Roberto Gatta, Laëtitia Lestrade, Stefania Manfrida, Christian Carrie, Vincenzo Valentini.
Abstract
INTRODUCTION: The role of prophylactic inguinal irradiation (PII) in the treatment of anal cancer patients is controversial. We developed an innovative algorithm based on machine learning (ML) that allows the prescription of PII to be tailored to the individual patient.
Keywords: anal canal cancer; machine learning; predictive models; prophylactic inguinal irradiation; radiochemotherapy
Year: 2016 PMID: 29312547 PMCID: PMC5752460 DOI: 10.18632/oncotarget.10749
Source DB: PubMed Journal: Oncotarget ISSN: 1949-2553
Performance of the 3 proposed machine learning techniques in identifying patients who would relapse (results on the training set and on the testing set are shown)
| Data set | AI approaches | False + (FP) | False - (FN) | True + (TP) | True - (TN) | Specificity % | Sensitivity % | Accuracy % |
|---|---|---|---|---|---|---|---|---|
| Training set* | J48 | 39 | 41 | 29 | 121 | 75.6 | 41.4 | 65.2 |
| | Random Tree | 31 | 4 | 66 | 129 | 80.6 | 94.3 | 84.8 |
| | Random Forest | 16 | 5 | 65 | 144 | 90.0 | 92.9 | 90.9 |
| Testing set** | J48 | 8 | 3 | 3 | 51 | 86.4 | 50.0 | 83.1 |
| | Random Tree | 12 | 5 | 1 | 47 | 79.7 | 16.7 | 73.9 |
| | Random Forest | 9 | 4 | 2 | 50 | 84.8 | 33.3 | 80.0 |
| | LR | 6 | 2 | 1 | 54 | 90.1 | 33.3 | 87.0 |
| | Always Negative | 60 | 3 | 0 | 60 | 100.0 | 0.0 | 95.2 |
*The total number of patients in this table differs from the actual total of 194 patients only because oversampling was applied. The real population always comprised 194 patients.
**The total number of patients in the test set is 65.
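The percentage columns in the table above follow from the four confusion-matrix counts. The sketch below is a minimal illustration, not the authors' code: the class name `ConfusionCounts` and its method names are hypothetical, and relapse is assumed to be the positive class. It recomputes the first Random Forest row and the second J48 row, then adds a hypothetical "always negative" predictor built from the class sizes implied by that J48 row (6 relapses, 59 non-relapses) to show why accuracy alone can be misleading when relapses are rare.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionCounts:
    """Counts of a binary confusion matrix (hypothetical helper; relapse = positive)."""
    fp: int
    fn: int
    tp: int
    tn: int

    def sensitivity(self) -> float:
        # Share of true relapses that were predicted as relapses, in percent.
        positives = self.tp + self.fn
        return 100.0 * self.tp / positives if positives else float("nan")

    def specificity(self) -> float:
        # Share of non-relapsing patients predicted as non-relapsing, in percent.
        negatives = self.tn + self.fp
        return 100.0 * self.tn / negatives if negatives else float("nan")

    def accuracy(self) -> float:
        # Share of all predictions that were correct, in percent.
        total = self.fp + self.fn + self.tp + self.tn
        return 100.0 * (self.tp + self.tn) / total if total else float("nan")


# Counts copied from the table: first Random Forest row and second J48 row.
training_rf = ConfusionCounts(fp=16, fn=5, tp=65, tn=144)
testing_j48 = ConfusionCounts(fp=8, fn=3, tp=3, tn=51)

# Hypothetical baseline (an assumption, not a table row): predict "no relapse"
# for every patient, using the class sizes implied by the J48 row above.
always_negative = ConfusionCounts(fp=0, fn=6, tp=0, tn=59)

for name, counts in [
    ("Random Forest (first row)", training_rf),
    ("J48 (second row)", testing_j48),
    ("Always negative (hypothetical)", always_negative),
]:
    print(
        f"{name}: specificity={counts.specificity():.1f}% "
        f"sensitivity={counts.sensitivity():.1f}% "
        f"accuracy={counts.accuracy():.1f}%"
    )
```

The hypothetical baseline never identifies a relapse, yet its accuracy is higher than that of the J48 row, which is why sensitivity and specificity are reported next to accuracy.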
Figure 1a-b. Screenshots taken from the PrediWeb website. Figure 1a shows how the variables are entered (the same variables that were used to build the algorithms, see Table 4), and Figure 1b shows an example of the results returned by the website once all the parameters have been entered.
Features considered for the development of the predictive model
| Variable | Accepted values |
|---|---|
| Performance Status | From 0 to 4 |
| Age at diagnosis | ≥ 18 years |
| Initial level of SCC antigen | All values ≥ 0.1 |
| RT+/-CT after a not-curative surgical resection | Yes/No |
| Histologic type | |
| Symptoms at diagnosis | No symptoms, rectal bleeding, anal/rectal pain, anal swelling/hemorrhoids, positive inguinal nodes, rectal syndrome, defecation troubles, other. |
| Method used for the histological definition | Only biopsy, R0 surgical excision, R1 surgical excision, R2 surgical excision. |
| T site | Anal canal, anorectal junction, anal margin, anal canal with rectal extension |
| T stage | From 1 to 4 |
| N stage | From 0 to 3 |
| Staging methods | Clinical examination only, echoendoscopy, MRI |
| uTNM stage | Depending on the T and N stage established with echography |
| cTNM stage | Depending on the T and N stage established with clinical examination and staging examinations |
| Neoadjuvant CT | Yes/No |
| Concomitant CT | Yes/No |
| Type of concomitant CT | 5FU-CDDP |
| Type of inguinal irradiation | Curative/prophylactic |
Legend: SCC = squamous cell carcinoma; ADK = adenocarcinoma; CDDP = cisplatin; 5-FU = 5-fluorouracil; MMC = mitomycin.
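The accepted values listed in the table above can be read as input constraints for a prediction tool. The sketch below is only an illustration under stated assumptions: the field names, the `RULES` mapping and the `validate` function are hypothetical and do not describe how PrediWeb is implemented. Rows whose accepted values are empty or possibly incomplete in the table (histologic type, type of concomitant CT) and the free-text categorical rows are left out on purpose.

```python
from typing import Any, Callable, Dict, List


def is_int(value: Any) -> bool:
    # bool is a subclass of int in Python, so exclude it explicitly.
    return isinstance(value, int) and not isinstance(value, bool)


def is_number(value: Any) -> bool:
    return isinstance(value, (int, float)) and not isinstance(value, bool)


# Hypothetical field names; the ranges are transcribed from the table above.
RULES: Dict[str, Callable[[Any], bool]] = {
    "performance_status": lambda v: is_int(v) and 0 <= v <= 4,
    "age_at_diagnosis": lambda v: is_number(v) and v >= 18,
    "scc_antigen": lambda v: is_number(v) and v >= 0.1,
    "t_stage": lambda v: is_int(v) and 1 <= v <= 4,
    "n_stage": lambda v: is_int(v) and 0 <= v <= 3,
    "neoadjuvant_ct": lambda v: isinstance(v, bool),
    "concomitant_ct": lambda v: isinstance(v, bool),
    "inguinal_irradiation": lambda v: v in {"curative", "prophylactic"},
}


def validate(record: Dict[str, Any]) -> List[str]:
    """Return a list of problems; an empty list means every rule passed."""
    problems: List[str] = []
    for field, is_accepted in RULES.items():
        if field not in record:
            problems.append(f"{field}: missing")
        elif not is_accepted(record[field]):
            problems.append(f"{field}: value {record[field]!r} not accepted")
    return problems


# Made-up example records, for illustration only (not patient data).
valid_record = {
    "performance_status": 1,
    "age_at_diagnosis": 63,
    "scc_antigen": 2.0,
    "t_stage": 2,
    "n_stage": 0,
    "neoadjuvant_ct": False,
    "concomitant_ct": True,
    "inguinal_irradiation": "prophylactic",
}
invalid_record = dict(valid_record, t_stage=5, age_at_diagnosis=17)

print(validate(valid_record))
print(validate(invalid_record))
```

Keeping the rules in a single mapping means that a change to the accepted values of one variable is made in one place.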
The TRIPOD checklist (adapted from [12])
| Section/Topic | Item | Checklist Item |
|---|---|---|
| Title | 1 | Identify the study as developing and/or validating a multivariable prediction model, the target population, and the outcome to be predicted (D;V)*. |
| Abstract | 2 | Provide a summary of objectives, study design, setting, participants, sample size, predictors, outcome, statistical analysis, results, and conclusions (D;V). |
| Background and objectives | 3a | Explain the medical context (including whether diagnostic or prognostic) and rationale for developing or validating the multivariable prediction model, including references to existing models (D;V). |
| | 3b | Specify the objectives, including whether the study describes the development or validation of the model or both (D;V). |
| Source of data | 4a | Describe the study design or source of data (e.g., randomized trial, cohort, or registry data), separately for the development and validation data sets, if applicable (D;V). |
| | 4b | Specify the key study dates, including start of accrual; end of accrual; and, if applicable, end of follow-up (D;V). |
| Participants | 5a | Specify key elements of the study setting (e.g., primary care, secondary care, general population) including number and location of centres (D;V). |
| | 5b | Describe eligibility criteria for participants (D;V). |
| | 5c | Give details of treatments received, if relevant (D;V). |
| Outcome | 6a | Clearly define the outcome that is predicted by the prediction model, including how and when assessed (D;V). |
| | 6b | Report any actions to blind assessment of the outcome to be predicted (D;V). |
| Predictors | 7a | Clearly define all predictors used in developing or validating the multivariable prediction model, including how and when they were measured (D;V). |
| | 7b | Report any actions to blind assessment of predictors for the outcome and other predictors. |
| Sample size | 8 | Explain how the study size was arrived at (D;V). |
| Missing data | 9 | Describe how missing data were handled (e.g., complete-case analysis, single imputation, multiple imputation) with details of any imputation method (D;V). |
| Statistical analysis methods | 10a | Describe how predictors were handled in the analyses (D). |
| | 10b | Specify type of model, all model-building procedures (including any predictor selection), and method for internal validation (D). |
| | 10c | For validation, describe how the predictions were calculated (V). |
| | 10d | Specify all measures used to assess model performance and, if relevant, to compare multiple models (D;V). |
| | 10e | Describe any model updating (e.g., recalibration) arising from the validation, if done (V). |
| Risk groups | 11 | Provide details on how risk groups were created, if done (D;V). |
| Development vs. validation | 12 | For validation, identify any differences from the development data in setting, eligibility criteria, outcome, and predictors (V). |
| Participants | 13a | Describe the flow of participants through the study, including the number of participants with and without the outcome and, if applicable, a summary of the follow-up time. A diagram may be helpful (D;V). |
| | 13b | Describe the characteristics of the participants (basic demographics, clinical features, available predictors), including the number of participants with missing data for predictors and outcome (D;V). |
| | 13c | For validation, show a comparison with the development data of the distribution of important variables (demographics, predictors and outcome) (V). |
| Model development | 14a | Specify the number of participants and outcome events in each analysis (D). |
| | 14b | If done, report the unadjusted association between each candidate predictor and outcome (D). |
| Model specification | 15a | Present the full prediction model to allow predictions for individuals (i.e., all regression coefficients, and model intercept or baseline survival at a given time point) (D). |
| | 15b | Explain how to use the prediction model (D). |
| Model performance | 16 | Report performance measures (with CIs) for the prediction model (D;V). |
| Model-updating | 17 | If done, report the results from any model updating (i.e., model specification, model performance) (V). |
| Limitations | 18 | Discuss any limitations of the study (such as nonrepresentative sample, few events per predictor, missing data) (D;V). |
| Interpretation | 19a | For validation, discuss the results with reference to performance in the development data, and any other validation data (V). |
| | 19b | Give an overall interpretation of the results, considering objectives, limitations, results from similar studies, and other relevant evidence (D;V). |
| Implications | 20 | Discuss the potential clinical use of the model and implications for future research (D;V). |
| Supplementary information | 21 | Provide information about the availability of supplementary resources, such as study protocol, Web calculator, and data sets (D;V). |
| Funding | 22 | Give the source of funding and the role of the funders for the present study (D;V). |
*Items relevant only to the development of a prediction model are denoted by “D”, items relating solely to a validation of a prediction model are denoted by “V”, and items relating to both are denoted “D;V”.
Description of the clinical and therapeutic features of the populations
| | Training set | Training set | Testing set | Testing set |
|---|---|---|---|---|
| | n. | % | n. | % |
| | 194 | 100 | 65 | 100 |
| | 32 | 17 | 13 | 20 |
| | 63.6 | - | 60.2 | - |
| | 40 | 21 | 51 | 78 |
| | 6 | 3 | 0 | 57 |
| | 130 | 67 | 21 | 33 |
| | 44 | 23 | 35 | 54 |
| | 98 | 51 | 27 | 42 |
| | 21 | 11 | 6 | 10 |
| | 4 | 2 | 0 | 44 |
| | 140 | 72 | 31 | 48 |
| | 19 | 10 | 6 | 10 |
| | 182 | 94 | 53 | 81 |
| Squamous Cell Carcinoma Antigen* | 2 | | 5.6 | |
| * | | | | |
| | 45 Gy [36–56] | | 55 [30.6–58.5] | |
| | 152 | 78 | 12 | 18 |
| | 151 | 78 | 4 | 6 |
| | 187 | 96 | 65 | 100 |
| | 147 | 75 | 50 | 77 |
| | 25 [12–30] | | 27 [17–34] | |
| | 36 [15–63] | | 56 [22–88] | |
| | 32 | | - | - |
| | 143 | 74 | - | - |
| | 18 [10–31.7] | | - | - |
| | 22 [11–77] | | - | - |
| | 6 [4–12] | | - | - |
| | 5 [4–9] | | - | - |
| | 64 [54–76.7] | | - | - |
| | 52 | 27 | 5 | 8 |
| | 102 | 72 | 5 | 8 |
| | 18 | 9 | 2 | 3 |