Zhenwei Shi, Ivan Zhovannik, Alberto Traverso, Frank J W M Dankers, Timo M Deist, Petros Kalendralis, René Monshouwer, Johan Bussink, Rianne Fijten, Hugo J W L Aerts, Andre Dekker, Leonard Wee.
Abstract
Prediction modelling with radiomics is a rapidly developing research topic that requires access to vast amounts of imaging data. Because of concerns about patient privacy, methods that work on decentralized data are urgently needed. Previously published computed tomography medical image sets with gross tumour volume (GTV) outlines for non-small cell lung cancer have been updated with extended follow-up. In a previous study, these were referred to as Lung1 (n = 421) and Lung2 (n = 221). The Lung1 dataset is publicly accessible via The Cancer Imaging Archive (TCIA; https://www.cancerimagingarchive.net ). We performed a decentralized multi-centre study in which a radiomic signature (hereafter "ZS2019") was developed in one institution and validated in an independent institution without any exchange of data, and we compared this with an analysis in which all data were centralized. The performance of ZS2019 for 2-year overall survival in the distributed validation was not statistically different from that in the centralized validation (AUC 0.61 vs 0.61; p = 0.52). Although the data and methods differed slightly, no statistically significant difference in performance was observed between the new signature and previous work (c-index 0.58 vs 0.65; p = 0.37). Our objective was not to develop a new signature with the best performance, but to suggest an approach for distributed radiomics. We therefore used a method similar to that of an earlier study. We foresee that the Lung1 dataset can be further re-used for testing radiomic models and investigating feature reproducibility.
Year: 2019 PMID: 31641134 PMCID: PMC6805885 DOI: 10.1038/s41597-019-0241-0
Source DB: PubMed Journal: Sci Data ISSN: 2052-4463 Impact factor: 6.444
Clinical characteristics of the training cohort (Lung1) and the validation cohort (Lung2). Abbreviations: GTV, gross tumour volume delineated on the radiotherapy treatment planning computed tomography image; Clinical T, tumour stage; Clinical N, node stage; Clinical M, metastasis stage, each according to the TNM tumour classification system.
| | Lung1 | Lung2 |
|---|---|---|
| Median age (range) at diagnosis in years | 68.5 (34–92) | 66.0 (36–87) |
| Median GTV size (range) in cm³ | 39 (0–660) | 88 (1–860) |
| Clinical T stage | 249 (59%) 171 (41%) 1 (0%) | 119 (54%) 85 (38%) 17 (8%) |
| Clinical N stage | 170 (40%) 22 (5%) 229 (55%) 0 (0%) | 49 (22%) 16 (7%) 137 (62%) 19 (9%) |
| Clinical M stage | 416 (99%) 5 (1%) | 200 (90%) 21 (10%) |
| Histology | 51 (12%) 143 (34%) 152 (36%) 63 (15%) 12 (3%) | 64 (29%) 22 (10%) 82 (37%) 47 (21%) 6 (3%) |
| Outcomes | 546 478 40% | 595 500 41% |
Fig. 1 The performance of radiomic signature ZS2019 according to Kaplan-Meier survival analysis. The signature was developed in Lung1 (MAASTRO; black line) and then validated in a distributed manner in Lung2 (Radboudumc; red line). The upper and lower survival curves were split at the median of the Cox regression linear predictor computed from the Lung1 data; the same threshold was applied to both the Lung1 and Lung2 data. The Harrell concordance index in the test cohort was 0.58, the log-rank test yielded a p-value of 0.09 and the Wilcoxon test gave a p-value < 0.0001.
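The two quantities named in this caption, the median split of the linear predictor and the Harrell concordance index, can be illustrated with a minimal pure-Python sketch. It is not the authors' code: the function names `harrell_c_index` and `split_by_threshold` are hypothetical, and all numbers are made-up toy values rather than Lung1 or Lung2 data.

```python
from statistics import median


def harrell_c_index(times, events, risk_scores):
    """Harrell's concordance index for right-censored survival data.

    A pair is comparable when the subject with the shorter follow-up time
    had an observed event (event == 1). The pair is concordant when that
    subject also has the higher risk score; tied scores count as 0.5.
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        for j in range(n):
            if times[i] < times[j] and events[i] == 1:
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


def split_by_threshold(linear_predictors, threshold):
    """Label each subject 'high' or 'low' risk relative to a fixed threshold."""
    return ["high" if lp > threshold else "low" for lp in linear_predictors]


# Toy linear predictors (illustrative values only).
train_lp = [0.2, -0.5, 1.1, 0.4, -0.1]
valid_lp = [0.9, -0.3, 0.2, 0.1]
valid_times = [5, 20, 12, 3]   # follow-up time per subject
valid_events = [1, 0, 1, 1]    # 1 = death observed, 0 = censored

# The threshold comes from the training cohort only and is reused unchanged.
threshold = median(train_lp)
valid_groups = split_by_threshold(valid_lp, threshold)
valid_c_index = harrell_c_index(valid_times, valid_events, valid_lp)
```

Fixing the threshold on the training cohort, as the caption describes, means the validation cohort is stratified without its own data influencing the cut-off.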
Fig. 2 A schematic diagram of the primary methodology for survival analysis used in this study. Details are provided in the text. Briefly, radiomics features were extracted locally by each institution and then labelled with the radiomics ontology. A Cox regression model was trained on Lung1 (MAASTRO) and then validated on Lung2 (Radboudumc) by distributing the learning algorithm through the Varian Learning Portal (VLP). Only the event coordinates required to plot a Kaplan-Meier survival curve were returned to MAASTRO, without any identifiable patient-level data.
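To make the last step concrete, the sketch below shows one way a site could compute Kaplan-Meier step coordinates locally and return only those coordinates. It assumes a plain product-limit estimator; the function name `kaplan_meier_coordinates` and the toy data are hypothetical, and nothing here represents the Varian Learning Portal interface.

```python
def kaplan_meier_coordinates(times, events):
    """Return the (time, survival probability) steps of a Kaplan-Meier curve.

    times  : follow-up time per subject
    events : 1 if the event (death) was observed, 0 if censored
    Only the aggregated step coordinates are returned, not per-subject rows.
    """
    at_risk = len(times)
    survival = 1.0
    coords = [(0.0, 1.0)]
    for t in sorted(set(times)):
        deaths = sum(1 for ti, ei in zip(times, events) if ti == t and ei == 1)
        leaving = sum(1 for ti in times if ti == t)  # events plus censorings at t
        if deaths > 0:
            # Product-limit update: multiply by the fraction surviving time t.
            survival *= 1.0 - deaths / at_risk
            coords.append((float(t), survival))
        at_risk -= leaving
    return coords


# Toy cohort held locally at the validating site (illustrative values only).
local_times = [2, 3, 3, 5, 8]
local_events = [1, 1, 0, 1, 0]

# This list of step coordinates is what would be sent back to the other site.
curve = kaplan_meier_coordinates(local_times, local_events)
```

In this sketch, censored subjects reduce the number at risk but add no step to the curve, so the returned list contains one entry per distinct event time plus the starting point.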