Arunabha Karmakar1, Apeksha Kumtakar2, Himanshu Sehgal3, Savith Kumar4, Arjun Kalyanpur5. 1. Image Core Labs division, Teleradiology Solutions Pvt. Ltd., Plot # 7G, Opp Graphite India Whitefield, Bengaluru, Karnataka 560048, India. Electronic address: Arunabha.Karmakar@gmail.com. 2. Image Core Labs division, Teleradiology Solutions Pvt. Ltd., Plot # 7G, Opp Graphite India Whitefield, Bengaluru, Karnataka 560048, India. Electronic address: Apeksha7689@gmail.com. 3. Image Core Labs division, Teleradiology Solutions Pvt. Ltd., Plot # 7G, Opp Graphite India Whitefield, Bengaluru, Karnataka 560048, India. Electronic address: HIMANSHU.s2208@gmail.com. 4. Teleradiology Solutions Pvt. Ltd., Bengaluru, Karnataka, India. Electronic address: Major.Savith@gmail.com. 5. Teleradiology Solutions Pvt. Ltd., Bengaluru, Karnataka, India. Electronic address: Arjun.Kalyanpur@telradsol.com.
Abstract
PURPOSE: Response Evaluation Criteria in Solid Tumors (RECIST 1.1) is the gold standard for imaging response evaluation in cancer trials. We sought to evaluate the consistency of applying RECIST 1.1 between 2 conventionally trained radiologists, designated as A and B; identify reasons for variation; and reconcile these differences for future studies. METHODS: The study was approved as an institutional quality check exercise. Since no identifiable patient data were collected or used, a waiver of informed consent was granted. Imaging case report forms of a concluded multicentric breast cancer trial were retrospectively reviewed. Cohen's kappa was used to rate interobserver agreement in Response Evaluation Data (target response, nontarget response, new lesions, overall response). Significant variations were reassessed by a senior radiologist to extrapolate reasons for disagreement. Methods to improve agreement were similarly ascertained. RESULTS: Sixty-one cases with a total of 82 data-pairs were evaluated (35 data-pairs in visit 5, 47 in visit 9). Both radiologists showed moderate agreement in target response (n = 82; κ = 0.477; 95% confidence interval [CI]: 0.314-0.640), nontarget response (n = 82; κ = 0.578; 95% CI: 0.213-0.944) and overall response evaluation in both visits (n = 82; κ = 0.510; 95% CI: 0.344-0.676). Further assessment demonstrated a "prevalence effect" of kappa in some cases, which led to underestimation of agreement. Percent agreement of overall response was 74.4%, while percent variation was 25.6%. Differences in interpreting RECIST 1.1 and in radiological image interpretation were the primary sources of variation. The commonest overall response was "Partial Response" (Rad A: 45/82; Rad B: 63/82). CONCLUSION: Despite moderate interobserver agreement, qualitative interpretation differences in some cases increased interobserver variability.
Protocols such as adjudication, which reduce easily avoidable inconsistencies, are or should be part of the standard operating procedure in imaging institutions. Based on our findings, a standard checklist has been developed to help reduce the interpretation error margin for future studies. Such checklists may improve interobserver agreement in the preadjudication phase, thereby improving the quality of results and reducing the adjudication-per-case ratio. CLINICAL RELEVANCE: Improving data reliability when using RECIST 1.1 will be reflected in better cancer clinical trial outcomes. A checklist can be of use to imaging centers to assess and improve their own processes.
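The "prevalence effect" noted in the results refers to Cohen's kappa correcting for chance agreement using each rater's marginal category frequencies: when one response category (e.g., Partial Response) dominates, expected agreement is high, and kappa can be low even with high percent agreement. A minimal Python sketch with illustrative data (not the trial's ratings) shows this behavior:

```python
from collections import Counter

def cohens_kappa(ratings_a, ratings_b):
    """Cohen's kappa for two raters over the same cases."""
    assert len(ratings_a) == len(ratings_b)
    n = len(ratings_a)
    # Observed agreement: fraction of cases where both raters agree.
    p_o = sum(a == b for a, b in zip(ratings_a, ratings_b)) / n
    # Expected (chance) agreement from each rater's marginal frequencies.
    ca, cb = Counter(ratings_a), Counter(ratings_b)
    categories = set(ca) | set(cb)
    p_e = sum((ca[c] / n) * (cb[c] / n) for c in categories)
    return (p_o - p_e) / (1 - p_e)

# Hypothetical ratings dominated by one category ("PR"):
# 17/20 cases agree (85% agreement), yet kappa is near zero,
# because chance agreement is already ~86% given the marginals.
rad_a = ["PR"] * 17 + ["PR", "SD", "PD"]
rad_b = ["PR"] * 17 + ["SD", "PR", "PR"]
print(round(cohens_kappa(rad_a, rad_b), 3))
```

Here the percent agreement (0.85) far exceeds kappa, mirroring how the study's 74.4% agreement coexisted with only moderate kappa values in prevalence-skewed subsets.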