Arthur Jochems, Timo M Deist, Issam El Naqa, Marc Kessler, Chuck Mayo, Jackson Reeves, Shruti Jolly, Martha Matuszak, Randall Ten Haken, Johan van Soest, Cary Oberije, Corinne Faivre-Finn, Gareth Price, Dirk de Ruysscher, Philippe Lambin, Andre Dekker.
Abstract
PURPOSE: Tools for survival prediction for non-small cell lung cancer (NSCLC) patients treated with chemoradiation or radiation therapy are of limited quality. In this work, we developed a predictive model of survival at 2 years. The model is based on a large volume of historical patient data and serves as a proof of concept to demonstrate the distributed learning approach.

METHODS AND MATERIALS: Clinical data from 698 lung cancer patients, treated with curative intent with chemoradiation or radiation therapy alone, were collected and stored at 2 different cancer institutes (559 patients at Maastro clinic [Netherlands] and 139 at Michigan University [United States]). The model was further validated on 196 patients originating from The Christie (United Kingdom). A Bayesian network model was adapted for distributed learning (the animation can be viewed at https://www.youtube.com/watch?v=ZDJFOxpwqEA). Two-year posttreatment survival was chosen as the endpoint. The Maastro clinic cohort data are publicly available at https://www.cancerdata.org/publication/developing-and-validating-survival-prediction-model-nsclc-patients-through-distributed, and the developed models can be found at www.predictcancer.org.
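The abstract does not spell out how the Bayesian network was adapted for distributed learning. As a rough illustration of the general idea, and not the authors' implementation, the sketch below estimates one conditional probability table from per-site counts, so that only aggregate counts (never patient records) leave a site. The variable names (`stage`, `alive_2y`), the toy records, and the Laplace smoothing constant `alpha` are all assumptions made for this example.

```python
from collections import Counter


def local_counts(records, child, parents):
    """Run at one site: count (parent values, child value) combinations."""
    counts = Counter()
    for r in records:
        key = (tuple(r[p] for p in parents), r[child])
        counts[key] += 1
    return counts


def merge_and_estimate(site_counts, child_values, alpha=1.0):
    """Run at the coordinator: sum the per-site counts and normalize them
    into a conditional probability table with Laplace smoothing (alpha)."""
    total = Counter()
    for c in site_counts:
        total.update(c)
    parent_configs = {pc for pc, _ in total}
    cpt = {}
    for pc in parent_configs:
        denom = sum(total[(pc, v)] for v in child_values) + alpha * len(child_values)
        cpt[pc] = {v: (total[(pc, v)] + alpha) / denom for v in child_values}
    return cpt


# Hypothetical toy records held at two sites (not data from the study).
site_a = [
    {"stage": "I", "alive_2y": "yes"},
    {"stage": "III", "alive_2y": "no"},
    {"stage": "III", "alive_2y": "no"},
]
site_b = [
    {"stage": "I", "alive_2y": "yes"},
    {"stage": "III", "alive_2y": "yes"},
]

counts = [local_counts(s, "alive_2y", ["stage"]) for s in (site_a, site_b)]
cpt = merge_and_estimate(counts, ["yes", "no"])

# The same table computed on pooled records, for comparison.
pooled = merge_and_estimate(
    [local_counts(site_a + site_b, "alive_2y", ["stage"])], ["yes", "no"]
)
```

Because the estimate depends on the data only through summed counts, `cpt` and `pooled` are identical in this toy setting. That property is what makes count-based parameter learning easy to distribute; learning the network structure itself would need more than this.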
Year: 2017 PMID: 28871984 PMCID: PMC5575360 DOI: 10.1016/j.ijrobp.2017.04.021
Source DB: PubMed Journal: Int J Radiat Oncol Biol Phys ISSN: 0360-3016 Impact factor: 7.038
Overview of patient characteristics per hospital
| Characteristic | Maastro clinic (n=559) | Michigan University (n=139) | The Christie (n=196) |
|---|---|---|---|
| Age | | | |
| Mean, y | 68 | 66 | 66 |
| SD, y | 10 | 10 | 10 |
| Missing, n | 0 (0%) | 0 (0%) | 2 (1%) |
| Sex, n | | | |
| Male | 370 (62%) | 107 (77%) | 89 (45%) |
| Female | 189 (32%) | 32 (23%) | 117 (60%) |
| Missing | 0 (0%) | 0 (0%) | 1 (1%) |
| ECOG performance status, n | | | |
| 0 | 102 (17%) | 16 (12%) | 40 (20%) |
| 1 | 301 (50%) | 100 (72%) | 103 (53%) |
| 2 | 11 (2%) | 21 (15%) | 45 (23%) |
| 3 | 21 (4%) | 1 (1%) | 4 (2%) |
| 4 | 4 (1%) | 0 (0%) | 0 (0%) |
| Missing | 120 (20%) | 1 (1%) | 4 (2%) |
| T stage, n | | | |
| 0 | 83 (14%) | 25 (18%) | 1 (1%) |
| 1 | 154 (26%) | 33 (24%) | 13 (7%) |
| 2 | 89 (15%) | 40 (29%) | 54 (28%) |
| 3 | 198 (33%) | 40 (29%) | 51 (26%) |
| 4 | 0 (0%) | 0 (0%) | 70 (36%) |
| Missing | 35 (6%) | 1 (1%) | 7 (4%) |
| N stage, n | | | |
| 0 | 150 (25%) | 33 (24%) | 51 (26%) |
| 1 | 32 (5%) | 17 (12%) | 14 (7%) |
| 2 | 214 (36%) | 58 (42%) | 86 (44%) |
| 3 | 136 (23%) | 31 (22%) | 40 (20%) |
| Missing | 27 (5%) | 0 (0%) | 5 (3%) |
| M stage, n | | | |
| 0 | 505 (84%) | 139 (100%) | 191 (97%) |
| 1 | 0 (0%) | 0 (0%) | 0 (0%) |
| Missing | 54 (9%) | 0 (0%) | 5 (3%) |
| Chemotherapy timing, n | | | |
| No chemotherapy | 119 (20%) | 25 (18%) | 73 (37%) |
| Sequential | 53 (9%) | 0 (0%) | 62 (32%) |
| Concurrent | 279 (47%) | 114 (82%) | 60 (31%) |
| Missing | 108 (18%) | 0 (0%) | 0 (0%) |
| Stage group, n | | | |
| IA | 44 (8%) | 9 (6%) | 7 (4%) |
| IB | 26 (5%) | 8 (6%) | 12 (6%) |
| IIA | 18 (3%) | 0 (0%) | 0 (0%) |
| IIB | 53 (9%) | 16 (11%) | 14 (7%) |
| IIIA | 350 (63%) | 105 (76%) | 153 (78%) |
| IIIB | 0 (0%) | 0 (0%) | 0 (0%) |
| Missing | 68 (12%) | 1 (1%) | 10 (5%) |
| 2-y survival, n | | | |
| No | 339 (57%) | 72 (52%) | 151 (77%) |
| Yes | 220 (37%) | 67 (48%) | 45 (23%) |
| Missing | 0 (0%) | 0 (0%) | 0 (0%) |
Abbreviation: ECOG = Eastern Cooperative Oncology Group.
Fig. 1. Bayesian network structures. The blue nodes represent outcome. (A) Network structure based on expert opinion. (B) Network structure built using algorithmic approach. (C) Network structure adapted from Jayasurya et al (9). Abbreviations: Chemo = chemotherapy; GTV = gross tumor volume.
Fig. 2. Receiver operating characteristic curves for the Bayesian network model learned using centralized learning and distributed learning. Validation was done on the institute 3 cohort. Abbreviation: AUC = area under curve.
Fig. 3. Kaplan-Meier curves for risk group stratification by the model. A log-rank test indicated that these curves were significantly different (P<.01).
Fig. 4. Receiver operating characteristic curves of models compared in this study. The expert Bayesian network structure is shown in Figure 1A. The Bayesian network structure of Jayasurya et al (9) is shown in Figure 1C. The Bayesian network structure built with the PC algorithm is shown in Figure 1B. Validation was done on the institute 3 cohort. Abbreviations: AUC = area under curve; SVM = support vector machine.
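Figures 2 and 4 compare models by AUC. For readers unfamiliar with the metric, the following is a minimal sketch of one standard way to compute it: the fraction of (survivor, non-survivor) pairs in which the survivor receives the higher predicted probability, with ties counted as half. The function name `roc_auc` and the toy labels and scores are assumptions for illustration; they are not values from the study.

```python
def roc_auc(labels, scores):
    """AUC as the probability that a randomly chosen positive case
    is scored higher than a randomly chosen negative case."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0   # positive ranked above negative
            elif p == n:
                wins += 0.5   # ties count as half
    return wins / (len(pos) * len(neg))


# Hypothetical example: 1 = alive at 2 years, 0 = not alive.
labels = [1, 1, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]
auc = roc_auc(labels, scores)
```

The pairwise loop is quadratic in the cohort size, which is fine for cohorts of a few hundred patients; a rank-based formulation gives the same value more efficiently on larger data.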