
QIN Benchmarks for Clinical Translation of Quantitative Imaging Tools.

Keyvan Farahani¹, Darrell Tata¹, Robert J Nordstrom¹.

Abstract

The Quantitative Imaging Network of the National Cancer Institute is in its 10th year of operation, and research teams within the network are developing and validating clinical decision support software tools to measure or predict the response of cancers to various therapies. As projects progress from development activities to validation of quantitative imaging tools and methods, it is important to evaluate the performance and clinical readiness of the tools before committing them to prospective clinical trials. A variety of tests, including special challenges and tool benchmarking, have been instituted within the network to prepare the quantitative imaging tools for service in clinical trials. This article highlights the benchmarking process and provides a current evaluation of several tools in their transition from development to validation.


Keywords: benchmarks; cancer imaging; clinical translation; oncology; quantitative imaging


Year: 2019 PMID: 30854436 PMCID: PMC6403038 DOI: 10.18383/j.tom.2018.00045

Source DB: PubMed Journal: Tomography ISSN: 2379-1381

Introduction

A distinguishing advantage of any research network is the opportunity for the ensemble of member teams to collaborate in areas of shared interest, to address common scientific or technological problems, to compare individual approaches, and ultimately to build consensus. As a result, the ensemble of teams in a research network is often greater than the sum of its parts. For the past 10 years, the National Cancer Institute (NCI) Quantitative Imaging Network (QIN) has provided a network environment for the development and validation of quantitative imaging (QI) analysis software tools designed to measure or predict response to cancer therapies in clinical trials. The motivating hypothesis for the QIN has been that clinical trials in systemic or targeted chemo-, radiation-, or immunotherapies can benefit from the inclusion of QI methods in the treatment protocols. These methods involve the extraction of measurable information from medical images to assess the status or change of a disease.

To date, 36 multidisciplinary teams from academic institutions across the United States and Canada have participated in the NCI-funded research program. The network currently supports 20 teams. These research teams discuss and resolve common challenges such as imaging informatics activities, clinical trial design and validation planning, and data acquisition and analysis issues, to name only a few. At the same time, each team is required to make technical and clinical progress on its individual research project.

The interest in QI as a method to gauge tumor progression or predict response to therapy predates the QIN. An early attempt at extracting numeric information from clinical images came in the form of RECIST (Response Evaluation Criteria in Solid Tumors) in 2000 (1, 2), based on earlier guidelines first published by the World Health Organization in 1981 (3). The RECIST criteria used a single straight line drawn across the widest dimension in a tumor image to provide a quantitative measure of tumor size. Size, suitability, and the number of lesions to be measured were stated in the original guidelines and later revised in version 1.1 (4). Response criteria, measured by the change in linear dimension, were established to determine whether the tumor showed complete response, partial response, stable disease, or progressive disease.

Although tumor shrinkage is an obvious desirable response to cancer therapy, it is not the only response that can occur, and in some cases the response may be delayed in appearing (5). Furthermore, in a metastatic cancer setting, a limited set of target lesions, as prescribed in RECIST 1.1, may not represent the overall tumor burden or response to therapy (6). These limitations restrict the usefulness of RECIST in some clinical trials. Immunotherapy trials, for example, often show that complete response or stable response can occur after an initial increase in tumor burden (7, 8). Applying conventional RECIST criteria early in therapy runs the risk of labeling this initial increase as tumor progression, failing to account for the delayed onset of the antitumor T cell response. Thus, a therapy under study in a clinical trial can be seen as failing. This has led to the development of iRECIST guidelines for response criteria in immunotherapy trials (9).

QI tools being developed and validated by QIN research teams measure far more than simple unidimensional tumor size, and the articles in this special issue of Tomography highlight a number of them. Physical attributes of tumors such as heterogeneity, diffusion and perfusion, and metabolic activity are being added to the more traditional size and shape measurements of QI to determine response to therapy. These attributes have been used in machine-based modeling studies driven by imaging data to characterize tumor growth (10–13). In addition, machine learning radiomics approaches for high-throughput extraction and analysis of quantitative image features are providing an even richer set of image parameters. These include intensity, texture, kurtosis, and skewness, from which measurement and prediction information on tumor progression can be extracted (14–16).
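As a rough illustration of the first-order image parameters named above (intensity statistics, skewness, and kurtosis), the following Python sketch computes them from a flat list of voxel intensities. It is not taken from any QIN tool: the function name, the sample values, and the choice of population moments with excess kurtosis are assumptions made for this example, and real radiomics pipelines add preprocessing and texture features that are omitted here.

```python
import math


def first_order_features(intensities):
    """Return mean, standard deviation, skewness, and excess kurtosis.

    Uses population central moments (dividing by n); this convention is
    an assumption of the sketch, and other packages may apply bias
    corrections that give slightly different values.
    """
    n = len(intensities)
    if n < 2:
        raise ValueError("need at least two voxel intensities")
    mean = sum(intensities) / n
    # Second, third, and fourth central moments.
    m2 = sum((x - mean) ** 2 for x in intensities) / n
    m3 = sum((x - mean) ** 3 for x in intensities) / n
    m4 = sum((x - mean) ** 4 for x in intensities) / n
    if m2 == 0:
        raise ValueError("identical intensities: skewness and kurtosis are undefined")
    return {
        "mean": mean,
        "std": math.sqrt(m2),
        "skewness": m3 / m2 ** 1.5,
        "excess_kurtosis": m4 / m2 ** 2 - 3.0,
    }


# Hypothetical voxel intensities inside a segmented lesion.
roi = [1.0, 2.0, 3.0, 4.0, 5.0]
features = first_order_features(roi)
```

For a symmetric set of intensities such as the hypothetical `roi` above, the skewness is zero; a lesion with a bright or dark tail in its intensity histogram would shift it away from zero.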

Background

If QI is to be useful in clinical trials as a method to measure or predict response to therapy, the methods must be developed on clinically available platforms such that the final validated tools would have value in multicenter clinical trials. To this end, the NCI QIN program was initiated in 2008. The support mechanism chosen for this effort was the cooperative agreement U01 mechanism. Here, successful applicants agree to collaborations and conditions established by NCI program staff. In the case of the QIN, these conditions include participation in a network of teams, joining in monthly teleconference meetings, and collaborating in several working groups. Applications to the QIN are subject to the NIH peer-review process conducted 3 times each year. As a result, the network teams enter the program at different times and are thus at different stages in their tool development and validation at any given point in time. This creates a need to qualify the degree of development and validation each quantitative tool has attained. Accordingly, a system of benchmarking to assess tool maturity has been implemented.

Clinical Translation

The process of translating ideas and products from laboratory demonstration to clinical utility is the exercise of transferring stated features of the idea or product into realized benefits to the user. For example, the stated feature of improved sensitivity or specificity in an imaging protocol can translate into improved personalized care in the clinic. The tool developer must be aware of the nature of the clinical need for such a tool. Likewise, the clinical user must be realistic regarding the performance characteristics needed in a clinical decision support tool. To ensure a strong connection between developer and clinical user, each QIN team is required to have a multidisciplinary composition that brings expertise in imaging physics and radiology along with informatics, oncology, statistics, and clinical requirements to the cancer problem being addressed. This gives each team multiple perspectives on the challenges of advancing decision support tools through the development and verification stages and on to the clinical validation stage.

Translation is not a simple move from bench to bedside. It requires a constant check on progress with a compass heading set by clinical need. There must be a set of guiding milestones to point the way through the translation landscape and to measure progress along the way. This, however, can be very difficult in a network of research teams, where each team is focused on a different imaging modality or approach and cancer problem.

A guiding pathway for QIN teams in this translation process continues to be the use of benchmarks for measuring progress toward clinical utility. Even though each team is working on a different application of QI for measurement or prediction of response to cancer therapy, they all share the challenges of bringing tools and methods into clinical utility. The benchmarks offer a common pathway for all teams to move toward clinical workflow. As such, the benchmarks measure the tasks on the development side of the translation. There is no doubt that a set of benchmarks could be established for monitoring progress on the clinical side of the translation issue, but that is not a part of the QIN mission. Figure 1 shows a schematic pathway from initial concept and development of tools and methods for clinical decision support all the way to final clinical use. The demarcations show that the benchmark grades represent milestones in the development toward clinical use. The details of the benchmarks and the requirements to achieve each are given in the next section.

Figure 1.

Quantitative Imaging Network (QIN) benchmarks, described in the text and in Figure 2, designate key milestones toward the clinical translation of quantitative imaging (QI) tools from laboratory prototype (A) to scale up and optimization (B) to clinical use (C).

Figure 2.

Five levels of QI benchmark for labeling of QI products. *In addition to the requirements listed for each level, candidates for benchmarks must have fulfilled the requirements for the prior-level benchmark, but not necessarily obtained that benchmark, to be considered for the current benchmark level.

Benchmarking

For each team, the transition from tool development to clinical performance validation is a central part of the research, but it does not occur in a single step. There is a period in which prototype tools are tested against retrospective image data from archives such as The Cancer Imaging Archive (TCIA) (http://www.cancerimagingarchive.net/) or other data sources to objectively assess tool performance. The benchmarking initiative gives investigators the opportunity to adjust their algorithms before committing to a specific prospective clinical trial.

Another initiative embraced by QIN team members during the initial verification of tool performance has been team challenges. Here, several teams with sufficiently developed tools that share similar quantitative measurement functions (segmentation, volume metrics, volume transfer constant [Ktrans] measurements, etc.) use a common data source, divided into training and test data sets, to determine and compare task-specific tool performance related to determining or predicting therapeutic response. Within the QIN, these activities are referred to as challenges and collaborative projects (CCPs) (17) and have proven very useful in guiding the development of QI tools and analytic methods in preparation for more complete clinical validation studies. CCPs have been conducted at various points along the development pipeline, from basic concept to technical verification and preliminary clinical validation. Descriptions of CCP tasks, project design, and results have been disseminated through several peer-reviewed scientific publications (18–28).

The CCP activities highlighted the need for a method to gauge the degree of development a tool had attained at any specific time point. Such a method would help in evaluating challenge results when tools at widely different levels of development participated. To gauge the level of development for tools in the QIN, a benchmarking process was developed.
A Task Force, comprising QIN members, was charged with developing a system to stratify the level of progress made by teams in their efforts to develop QI tools for clinical workflow. In the context of QIN activities, a tool can be a software algorithm, a physical phantom, or a digital reference object used in the production or analysis of QI biomarkers for diagnosis and staging of cancer and for the prediction or measurement of response to therapy. The Task Force developed QI Benchmarks as standard labels that signify the development, validation, and clinical translation of quantitative tools through a 5-tier benchmark system, as shown in Figure 2 (29): pre-benchmark (level 1), basic benchmark (level 2), technical test benchmark (level 3), clinical trial benchmark (level 4), and clinical use benchmark (level 5).

In general, each benchmark designation requires a peer-reviewed publication in which the scientific goals, methods, and results of the QI biomarker development or analysis are described. A benchmark is not automatically conferred on a QIN tool. The developer must submit an application that includes the required information for that benchmark and discusses the objective performance claim for the benchmark, best practices, and current limitations of the tool. In addition, candidates for each benchmark must have fulfilled the requirements for the prior-level benchmark, but need not have obtained it. The Coordinating Committee of QIN, consisting of the chairs of each of the Network Working Groups (30) and certain NCI program staff, reviews each benchmark application. If an application for a benchmark is rejected, the applicant is allowed to address the concerns and resubmit the application.

The establishment of this benchmarking process will help to advance the field of QI in oncology by recognizing QI tools entering QIN (benchmark level 1), encouraging QIN investigators to participate in objective performance evaluation of their tools and methods (benchmark level 2), streamlining validation through dissemination of appropriately developed tools and methods to test sites (benchmark level 3), and promoting participation in oncology clinical trials (benchmark level 4) by providing objective evaluation of tool development to allow more accurate assessment or prediction of response to cancer therapies and eventual clinical use (benchmark level 5). It is anticipated that this initiative will help in proper placement of advanced tools and methods into prospective clinical trials and will streamline the process of translating such tools into the broader clinical community with adoption by industry.
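The 5-tier system and its prerequisite rule can be sketched as a small data structure. The Python below is only an illustration under stated assumptions: the enum member names paraphrase the level names in the text, and `may_be_considered` is a hypothetical helper that encodes the single rule that the requirements of the prior-level benchmark must have been fulfilled, whether or not that benchmark was formally awarded. The actual review by the Coordinating Committee involves far more than this check.

```python
from enum import IntEnum


class Benchmark(IntEnum):
    """The five QI benchmark levels named in the text."""
    PRE_BENCHMARK = 1
    BASIC = 2
    TECHNICAL_TEST = 3
    CLINICAL_TRIAL = 4
    CLINICAL_USE = 5


def may_be_considered(target, fulfilled_requirements):
    """Check the prior-level prerequisite for a benchmark application.

    `fulfilled_requirements` is the set of levels whose requirements the
    tool has met, regardless of whether those benchmarks were awarded.
    The lowest level has no prerequisite.
    """
    if target == Benchmark.PRE_BENCHMARK:
        return True
    prior_level = Benchmark(target - 1)
    return prior_level in fulfilled_requirements


# Hypothetical tool that has met the level 3 requirements without
# ever applying for the technical test benchmark.
met = {Benchmark.PRE_BENCHMARK, Benchmark.BASIC, Benchmark.TECHNICAL_TEST}
level4_ok = may_be_considered(Benchmark.CLINICAL_TRIAL, met)
level5_ok = may_be_considered(Benchmark.CLINICAL_USE, met)
```

In this hypothetical case the tool may be considered for level 4 but not for level 5, because the level 4 requirements have not yet been fulfilled.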

Results

The current catalog of QIN tools contains 67 clinical decision support tools in various stages of development. Because of the staggered entrance of teams into the network, progress in development is not uniform across the network. This has created the need for benchmarking as a measurable way to evaluate tool development status. Of the tools listed in the catalog, ∼12 are at the point of entering the clinical domain and qualifying for benchmark level 4 or 5.

Image segmentation of tumor from surrounding tissue is an important tool function and serves as a first step in determining treatment planning regimens in oncology and in many quantitative measurements of tumor status. Several QIN teams are developing segmentation tools for various applications. One such tool, developed at Columbia University (New York, NY), performs image segmentation on solid tumors and has been demonstrated as a semiautomatic software tool in lung, liver, and lymph nodes. The segmentation of magnetic resonance imaging (MRI) and/or computed tomography (CT) images across multiple slices yields quantitative information on tumor volume (31–33) and has been used in several clinical trials. This tool can be integrated into diagnostics, radiation-treatment planning, and tumor response assessment on commercial workstations.

A tool for volumetric measurement of breast cancer tumors using dynamic contrast-enhanced MRI has been developed by the QIN team at the University of California at San Francisco (San Francisco, CA). The tool is an image processing and analysis package based on dynamic contrast-enhanced MRI contrast kinetics and has been approved on a commercial platform. It has proven useful in clinical trials performed by several groups in the NCI clinical trials network (34, 35). In addition to the analysis of algorithm performance, the validation of a breast phantom design has been reported (36). Features of the software package include image reconstruction, image registration, segmentation, and viewer/visualization. A commercial version is being used in ∼20 I-SPY clinical sites.

Auto-PERCIST (Positron Emission Tomography [PET] Response Criteria in Solid Tumors) is a software package for PET imaging and can provide clinical decision support through image segmentation, viewer/visualization, and response assessment. Similar to RECIST, the PERCIST package focuses on analyzing fludeoxyglucose-PET scans and evaluates whether the study was performed properly from a technical standpoint. It establishes the appropriate threshold for the standardized uptake value corrected for lean body mass (SUL) evaluation of the lesion at baseline. Auto-PERCIST has been used to provide clinical assessment of therapy response in multicenter evaluations both in the United States and in Korea, and a release of Auto-PERCIST for European oncology trials is planned. Although not completely developed under the QIN program, many of the features found in Auto-PERCIST were created and validated in the QIN program by teams originally at the Johns Hopkins University (Baltimore, MD) and currently at Washington University (St. Louis, MO). This tool has been used in several multicenter clinical trials, and details of its performance can be found in several publications (37–40).

Clinical support for evaluating tumor response can come in many forms. It can be the algorithm, phantom, or digital reference object for direct analysis of images, and it can also be the workspace in which the software operates. Such is the case for ePAD, a Stanford University (Palo Alto, CA) web-based image viewing and annotation platform that enables deployment of QI biomarkers into clinical trial workflow (41). It supports applications such as data collection, data mining, image annotation, image metadata archiving, and response assessment. This publicly available platform predates QIN, but many of the current quantitative functionalities of ePAD have been installed and validated under QIN support.
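The baseline SUL threshold mentioned in the description of Auto-PERCIST can be illustrated with a short Python sketch. The article does not state the rule itself; the formula below (1.5 times the mean liver SUL plus 2 standard deviations) is an assumption taken from the published PERCIST 1.0 criteria, and the function names and sample values are hypothetical. This is not the Auto-PERCIST implementation.

```python
def baseline_sul_threshold(liver_mean_sul, liver_sd_sul):
    """Minimum lesion SUL for a measurable baseline lesion.

    Assumes the PERCIST 1.0 convention of 1.5 x liver mean + 2 x SD,
    measured in a reference region of normal liver.
    """
    return 1.5 * liver_mean_sul + 2.0 * liver_sd_sul


def is_measurable_at_baseline(lesion_peak_sul, liver_mean_sul, liver_sd_sul):
    """Return True if the lesion peak SUL meets the assumed threshold."""
    return lesion_peak_sul >= baseline_sul_threshold(liver_mean_sul, liver_sd_sul)


# Hypothetical liver reference values and lesion measurements.
threshold = baseline_sul_threshold(2.0, 0.25)
hot_lesion = is_measurable_at_baseline(4.0, 2.0, 0.25)
faint_lesion = is_measurable_at_baseline(3.0, 2.0, 0.25)
```

Tying the threshold to each patient's own liver uptake, rather than to a fixed number, is what lets the check double as a technical quality control on the scan.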

Conclusions

The list of benchmarked tools in QIN is growing. Constant updates are being made to the catalog as new QIN teams enter the network and existing teams progress in their development and validation of their QI tools in support of clinical trials (42, 43). This issue of Tomography highlights several QI tools and studies in the QIN. As the network moves forward, it has begun to focus on coordinated ways to approach clinical trial groups and interested commercial parties.

41 in total

1. Variations of dynamic contrast-enhanced magnetic resonance imaging in evaluation of breast cancer therapy response: a multicenter data analysis challenge.

Authors: Wei Huang; Xin Li; Yiyi Chen; Xia Li; Ming-Ching Chang; Matthew J Oborski; Dariya I Malyarenko; Mark Muzi; Guido H Jajamovich; Andriy Fedorov; Alina Tudorica; Sandeep N Gupta; Charles M Laymon; Kenneth I Marro; Hadrien A Dyvorne; James V Miller; Daniel P Barbodiak; Thomas L Chenevert; Thomas E Yankeelov; James M Mountz; Paul E Kinahan; Ron Kikinis; Bachir Taouli; Fiona Fennessy; Jayashree Kalpathy-Cramer
Journal: Transl Oncol Date: 2014-02-01 Impact factor: 4.243

2. Use of ¹⁸F-FDG PET/CT to predict short-term outcomes early in the course of chemoradiotherapy in stage III adenocarcinoma of the lung.

Authors: Xiang-Rong Zhao; Yong Zhang; Yong-Hua Yu
Journal: Oncol Lett Date: 2018-05-18 Impact factor: 2.967

3. Semi-automated pulmonary nodule interval segmentation using the NLST data.

Authors: Yoganand Balagurunathan; Andrew Beers; Jayashree Kalpathy-Cramer; Michael McNitt-Gray; Lubomir Hadjiiski; Bensheng Zhao; Jiangguo Zhu; Hao Yang; Stephen S F Yip; Hugo J W L Aerts; Sandy Napel; Dmitrii Cherezov; Kenny Cha; Heang-Ping Chan; Carlos Flores; Alberto Garcia; Robert Gillies; Dmitry Goldgof
Journal: Med Phys Date: 2018-02-19 Impact factor: 4.071

4. Letter to cancer center directors: Progress in quantitative imaging as a means to predict and/or measure tumor response in cancer therapy trials.

Authors: James M Mountz; Thomas E Yankeelov; Daniel L Rubin; John M Buatti; Bradley J Erikson; Fiona M Fennessy; Robert J Gillies; Wie Huang; Michael A Jacobs; Paul E Kinahan; Charles M Laymon; Hannah M Linden; David A Mankoff; Lawrence H Schwartz; Hyunsuk Shim; Richard L Wahl
Journal: J Clin Oncol Date: 2014-05-27 Impact factor: 44.544

Review 5. Vascularity assessment of breast lesions with gadolinium-enhanced MR imaging.

Authors: N M Hylton
Journal: Magn Reson Imaging Clin N Am Date: 1999-05 Impact factor: 2.266

Review 6. Some statistical considerations in the clinical development of cancer immunotherapies.

Authors: Bo Huang
Journal: Pharm Stat Date: 2017-11-02 Impact factor: 1.894

Review 7. Radiomics: extracting more information from medical images using advanced feature analysis.

Authors: Philippe Lambin; Emmanuel Rios-Velazquez; Ralph Leijenaar; Sara Carvalho; Ruud G P M van Stiphout; Patrick Granton; Catharina M L Zegers; Robert Gillies; Ronald Boellard; André Dekker; Hugo J W L Aerts
Journal: Eur J Cancer Date: 2012-01-16 Impact factor: 9.162

Review 8. Quantitative Imaging in Cancer Clinical Trials.

Authors: Thomas E Yankeelov; David A Mankoff; Lawrence H Schwartz; Frank S Lieberman; John M Buatti; James M Mountz; Bradley J Erickson; Fiona M M Fennessy; Wei Huang; Jayashree Kalpathy-Cramer; Richard L Wahl; Hannah M Linden; Paul E Kinahan; Binsheng Zhao; Nola M Hylton; Robert J Gillies; Laurence Clarke; Robert Nordstrom; Daniel L Rubin
Journal: Clin Cancer Res Date: 2016-01-15 Impact factor: 12.531

9. Measuring response in solid tumors: comparison of RECIST and WHO response criteria.

Authors: Joon Oh Park; Soon Il Lee; Seo Young Song; Kihyun Kim; Won Seog Kim; Chul Won Jung; Young Suk Park; Young-Hyuk Im; Won Ki Kang; Mark Hong Lee; Kyung Soo Lee; Keunchil Park
Journal: Jpn J Clin Oncol Date: 2003-10 Impact factor: 3.019

10. Toward uniform implementation of parametric map Digital Imaging and Communication in Medicine standard in multisite quantitative diffusion imaging studies.

Authors: Dariya Malyarenko; Andriy Fedorov; Laura Bell; Melissa Prah; Stefanie Hectors; Lori Arlinghaus; Mark Muzi; Meiyappan Solaiyappan; Michael Jacobs; Maggie Fung; Amita Shukla-Dave; Kevin McManus; Michael Boss; Bachir Taouli; Thomas E Yankeelov; Christopher Chad Quarles; Kathleen Schmainda; Thomas L Chenevert; David C Newitt
Journal: J Med Imaging (Bellingham) Date: 2017-10-30
