
Radiomic analysis in prediction of Human Papilloma Virus status.

Kaixian Yu, Youyi Zhang, Yang Yu, Chao Huang, Rongjie Liu, Tengfei Li, Liuqing Yang, Jeffrey S Morris, Veerabhadran Baladandayuthapani, Hongtu Zhu.

Abstract

Human Papilloma Virus (HPV) has been associated with oropharyngeal cancer prognosis. Traditionally, HPV status is tested through invasive lab tests. Recently, the rapid development of statistical image analysis techniques has enabled precise quantitative analysis of medical images. Quantitative analysis of Computed Tomography (CT) provides a non-invasive way to assess HPV status for oropharynx cancer patients. We designed a statistical radiomics approach that analyzes CT images to predict HPV status. Various radiomic features were extracted from CT scans and analyzed using statistical feature selection and prediction methods. Our approach ranked highest in the 2016 Medical Image Computing and Computer Assisted Intervention (MICCAI) grand challenge: Oropharynx Cancer (OPC) Radiomics Challenge, Human Papilloma Virus (HPV) Status Prediction. Further analysis of the radiomic features most relevant to distinguishing HPV-positive from HPV-negative subjects suggested that HPV-positive patients usually have smaller and simpler tumors.


Keywords:  CT image; HPV status; Oropharynx cancer; Radiomics; Statistical method

Year:  2017        PMID: 29594229      PMCID: PMC5862639          DOI: 10.1016/j.ctro.2017.10.001

Source DB:  PubMed          Journal:  Clin Transl Radiat Oncol        ISSN: 2405-6308


Introduction

Oropharyngeal cancer prognosis is often linked with Human Papilloma Virus (HPV), especially HPV type 16. Patients with HPV-associated oropharynx cancer have been shown to have longer survival and better tumor control with radiotherapy than patients with non-HPV-associated disease [1], [2], [3]. Typically, HPV status is tested using immunohistochemistry for the protein p16 or in situ hybridization for viral DNA. However, lab testing usually requires collecting a biospecimen from the patient, so it is invasive and may pose a risk to the patient. A non-invasive yet accurate way to assess HPV status is therefore important. One possible solution is the low-dose computed tomography (CT) scan, which is performed routinely for screening, diagnosis, and treatment guidance. A low-dose CT scan is non-invasive and is less likely to impose extra risk on the patient. Beyond being non-invasive, imaging could help the physician collect more information at diagnosis, which in turn could improve treatment design and make treatment more precise for each patient. Recently, the rapid development of radiomics has enabled more meaningful and precise quantitative analysis of medical images at various body sites. Magnetic Resonance Imaging (MRI) of the brain has been studied extensively over the past two decades, and much effort has gone into quantifying the relationship between radiomic features of MRI and the diagnosis of Alzheimer’s disease [4], [5], [6], [7], [8] as well as the risk of autism [9]. CT is another important imaging modality that has been widely studied to establish its connection with clinical outcomes. CT scans play a critical role in early detection and prognosis for different types of cancer [10], [11], [12], [13]. However, only a few studies have tried to connect CT images with HPV status quantitatively, especially in oropharynx cancer.
Cantrell and colleagues [14] studied the radiomic differences in CT images between HPV+ and HPV− patients, but their study did not focus on predicting HPV status. Buch and colleagues [15] analyzed the differences in 42 space-invariant texture features between HPV+ and HPV− patients; again, the study did not produce a predictive model of HPV status. Bogowicz and colleagues [16] built a logistic regression model that distinguishes HPV+ from HPV− patients using radiomic features and achieved an AUC of 0.78 in the validation cohort, but the number of radiomic features they used was relatively small and not comprehensive, and the discovery cohort in their study was also relatively small. In this article, we describe an approach that uses only CT images to predict HPV status for oropharynx cancer patients. Our method won first place among 9 participating teams in the 2016 Medical Image Computing and Computer Assisted Intervention (MICCAI) grand challenge: Oropharynx Cancer (OPC) Radiomics Challenge, Human Papilloma Virus (HPV) Status Prediction, with an Area Under the Curve (AUC) of 0.91549 on the held-out evaluation data set. Our approach is based on statistical analysis of radiomic features of the upper chest CT. The feature selection results showed that, among all the radiomic features extracted, MeanBreadth and SphericalDisproportion were the most important, capturing the most predictive radiomic information about HPV status. The results also indicate that HPV-associated patients usually have smaller and simpler tumors. It is also worth mentioning that our method had a higher AUC on the private leaderboard than on the public leaderboard, which suggests that the proposed method generalizes fairly well.

Materials and methods

The analysis is based on radiomic features extracted from upper chest CT images, and prediction is assessed via statistical predictive models. The main challenges include extracting meaningful and predictive radiomic features, combining information across multiple Regions Of Interest (ROIs), selecting the most relevant features, and constructing powerful predictive models. We adopted an existing state-of-the-art feature extraction method [17] to obtain a set of radiomic features covering various aspects of the image, for example, shape, texture, and grayscale intensities. To handle the multi-ROI situation, we computed a “consensus” ROI that can be regarded as a representative ROI for each feature and for each individual subject. Statistical methods, e.g. generalized linear models (GLM), random forests, and tree-based models, were employed to select the relevant features and make the final prediction. The general procedure is shown in Fig. 1.
Fig. 1

The overall procedure of the winning method.

Clinical and imaging data

In this challenge, contrast-enhanced Computed Tomography (CT) scans of the upper chest for 315 oropharynx cancer patients were provided as the radiomic dataset (the detailed challenge setting and information on data acquisition, processing, and inclusion/exclusion can be found in the challenge summary paper [18]). 150 randomly selected subjects with annotated HPV status were given as the training cohort, with 128 HPV-positive and 22 HPV-negative patients. The remaining 165 subjects were held out as a validation cohort (in the actual challenge, one half of the validation subjects were scored throughout the challenge as the public measure of performance, and the other half were held out for a private score that was not released until the challenge concluded). The human papilloma virus (HPV) status defined by p16, which was tested and provided by the organizers, is the main interest of this work. Two types of ROIs, Gross Primary and Gross Nodal Tumor volumes (GTVp/GTVn), are considered, and the ROI segmentation was done manually by board-certified physicians at the organizers’ institution (University of Texas MD Anderson Cancer Center). In this challenge, we did not distinguish between GTVp and GTVn, per the organizers’ suggestion.

Pre-processing

Several randomly selected 2D slices (5–10) of the CT images of each subject were manually inspected to ensure consistent image quality. Subjects with low-quality images, including over-exposed or heavily blurred images, were removed from the training cohort. We removed one subject (id: 88), since their CT scans were heavily blurred in the main section of interest and the annotated ROIs had a volume of 0.

Radiomic feature extraction

1683 radiomic features in 5 categories were extracted using IBEX [17]: gray-level co-occurrence matrix (GLCM, 2D and 3D), gray-level run length matrix, intensity (and histogram), neighbor intensity difference (2D and 3D), and shape (for more details on parameters and feature types in each category, please see Supplementary file 1). Quality control was done to ensure there were no missing values in the extracted features, and possible outliers, identified by Grubbs’ test [19], were discarded. One critical problem in this dataset was that one subject often had more than one ROI (GTVp or GTVn); therefore, a proper way to “choose” a representative ROI was essential to prediction accuracy. Treating each ROI as an individual subject raises scientific and practical problems, because not all ROIs of a specific subject directly reflect HPV status. In this challenge, we designed a “consensus” ROI for each subject. If a subject had only one ROI, we used that ROI to represent the subject. If a subject had more than one ROI, we created a virtual ROI with the same set of features, in which the values of the features did not necessarily come from the same ROI. Instead, for each feature, we took the value that was most extreme (in terms of magnitude) relative to the robust median of all ROIs of all subjects in the cohort (an example is given in Fig. 2).
Fig. 2

The creation of a consensus ROI for a subject with 3 ROIs.
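The consensus-ROI rule described above can be sketched in a few lines. This is a minimal illustration rather than the authors' code: the function name `consensus_roi`, the array layout, and the use of a plain median as the "robust median" are all assumptions made for the example.

```python
import numpy as np


def consensus_roi(subject_rois, cohort_median):
    """Build one virtual ROI from all ROIs of a single subject.

    subject_rois  : array of shape (n_rois, n_features)
    cohort_median : array of shape (n_features,), the median of each
                    feature over all ROIs of all subjects (assumed here
                    to be a plain median)
    """
    rois = np.atleast_2d(np.asarray(subject_rois, dtype=float))
    if rois.shape[0] == 1:
        # A single ROI represents the subject directly.
        return rois[0].copy()
    # For each feature, find the ROI whose value lies farthest from the
    # cohort median, and take that ROI's value for the feature.
    deviation = np.abs(rois - np.asarray(cohort_median, dtype=float))
    pick = np.argmax(deviation, axis=0)
    return rois[pick, np.arange(rois.shape[1])]


# Hypothetical subject with 3 ROIs and 2 features.
cohort_median = np.array([10.0, 0.5])
rois = np.array([[12.0, 0.40],
                 [7.0, 0.55],
                 [11.0, 0.90]])
virtual = consensus_roi(rois, cohort_median)
```

In this toy case the first feature is taken from the second ROI and the second feature from the third ROI, so the virtual ROI mixes values from different ROIs, as the text describes. Ties are resolved in favor of the first ROI, which is an arbitrary choice of this sketch.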

Statistical analysis

The general statistical analysis is outlined as follows:

1. Homogeneity testing between the training and validation cohorts was done with the Wilcoxon rank-sum test [20], [21] for each feature. Features with p-value <0.05 were discarded because their inhomogeneity would affect the predictive model built from the training cohort.
2. Preliminary feature screening was done with the Kolmogorov–Smirnov (KS) test [22], [23]. For each feature, we calculated the KS statistic between HPV+ and HPV− subjects in the training cohort and kept only the features with p-value <0.05, that is, the features able to distinguish the two groups.
3. Further feature screening was done through correlation with HPV status. A biserial correlation between HPV status and each radiomic feature was calculated, and only the features with absolute biserial correlation >0.3 were kept to achieve clinical relevance.
4. The remaining features were ranked by their marginal Area Under Curve (AUC) obtained by a 10-time random split. The marginal AUC was assessed by building a model with only one feature. Each random split randomly sampled 50% of the training cohort to train the model and evaluated it on the held-out data. We tested various statistical models, including the generalized linear model (GLM) [24], pdfCluster [25], predictive tree model [26], random forest [27], and Support Vector Machine (SVM) [28]. Only the top 10 features with the highest marginal AUC were kept in this step.
5. The final features in the model were selected by forward selection: we added features one by one according to their marginal AUC rank, from high to low, until the model AUC stopped increasing. The model AUC was assessed by a 10-time random split as well.
6. The appropriate statistical model (tree, GLM, SVM, …) was selected by submitting to the public leaderboard of this challenge, as was the tuning of some parameters, e.g. the number of trees in the random forest.
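The three screening filters (homogeneity, KS screening, correlation) can be sketched with SciPy as follows. This is an illustrative outline under stated assumptions, not the authors' implementation: the function name `screen_features` and the toy data are hypothetical, and `scipy.stats.pointbiserialr` (point-biserial correlation) is used as a stand-in for the biserial correlation named in the text, which SciPy does not provide directly.

```python
import numpy as np
from scipy import stats


def screen_features(X_train, y_train, X_valid, alpha=0.05, min_abs_corr=0.3):
    """Return the column indices that pass all three screening filters."""
    X_train = np.asarray(X_train, dtype=float)
    X_valid = np.asarray(X_valid, dtype=float)
    y_train = np.asarray(y_train).astype(int)
    kept = []
    for j in range(X_train.shape[1]):
        x = X_train[:, j]
        # 1. Homogeneity: discard features whose distribution differs
        #    between the training and validation cohorts.
        if stats.ranksums(x, X_valid[:, j]).pvalue < alpha:
            continue
        # 2. KS screening: keep only features that separate the groups.
        if stats.ks_2samp(x[y_train == 1], x[y_train == 0]).pvalue >= alpha:
            continue
        # 3. Correlation with HPV status (point-biserial as a stand-in).
        r, _ = stats.pointbiserialr(y_train, x)
        if abs(r) <= min_abs_corr:
            continue
        kept.append(j)
    return kept


# Entirely synthetic toy data: 50 "HPV+" (1) and 50 "HPV-" (0) subjects.
base = np.linspace(0.0, 1.0, 50)
y = np.array([1] * 50 + [0] * 50)
separating = np.concatenate([base + 2.0, base])   # differs by group
uninformative = np.concatenate([base, base])      # identical in both groups
X_train = np.column_stack([separating, uninformative, separating])
# The third feature is shifted in the validation cohort (inhomogeneous).
X_valid = np.column_stack([separating, uninformative, separating + 10.0])
kept = screen_features(X_train, y, X_valid)
```

Only the first toy feature survives: the second fails the KS screen and the third fails the homogeneity test. The thresholds 0.05 and 0.3 are those stated in the text.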

Results

Eventually, we selected the best-performing model, a logistic regression with two features, MeanBreadth and SphericalDisproportion (Table 1). The performance metric used in this challenge was Area Under Curve (AUC). The model obtained a mean AUC of 0.753 from the 10-random-split on the training cohort (Fig. 3), 0.86667 on the public leaderboard, and 0.91549 on the private one. We were the only team with a higher private leaderboard score than public one, which is a good indicator of model generalizability. Because the challenge did not release the ground truth for the public and private leaderboard cohorts, we are not able to provide ROC curves for either leaderboard cohort.
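For a single feature, the repeated random-split AUC estimate can be sketched without fitting a model. The sketch below is a simplification and not the authors' code: because a one-feature logistic regression produces scores that are monotone in the feature, its held-out AUC is the rank-based AUC of the feature itself, with the direction of effect taken from the training half. The names `rank_auc` and `marginal_auc`, the seed, and the toy data are assumptions.

```python
import numpy as np


def rank_auc(scores, labels):
    """AUC via the Mann-Whitney U statistic (ties get average ranks)."""
    scores = np.asarray(scores, dtype=float)
    labels = np.asarray(labels).astype(bool)
    order = np.argsort(scores, kind="mergesort")
    ranks = np.empty(len(scores))
    ranks[order] = np.arange(1, len(scores) + 1)
    for value in np.unique(scores):
        tied = scores == value
        ranks[tied] = ranks[tied].mean()
    n_pos = labels.sum()
    n_neg = (~labels).sum()
    return (ranks[labels].sum() - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)


def marginal_auc(x, y, n_splits=10, train_frac=0.5, seed=0):
    """Mean held-out AUC of one feature over repeated random splits."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y).astype(int)
    rng = np.random.default_rng(seed)
    n_train = int(round(train_frac * len(y)))
    aucs = []
    for _ in range(n_splits):
        perm = rng.permutation(len(y))
        train, test = perm[:n_train], perm[n_train:]
        # Skip degenerate splits in which a half lacks one of the classes.
        if len(np.unique(y[train])) < 2 or len(np.unique(y[test])) < 2:
            continue
        # Direction of effect is learned from the training half only.
        pos_mean = x[train][y[train] == 1].mean()
        neg_mean = x[train][y[train] == 0].mean()
        sign = 1.0 if pos_mean >= neg_mean else -1.0
        aucs.append(rank_auc(sign * x[test], y[test]))
    return float(np.mean(aucs))


# Synthetic, perfectly separating feature: every split scores AUC = 1.
toy_y = np.array([1] * 50 + [0] * 50)
toy_x = np.concatenate([np.linspace(2.0, 3.0, 50), np.linspace(0.0, 1.0, 50)])
toy_auc = marginal_auc(toy_x, toy_y)
```

For a multi-feature model such as the final two-feature logistic regression, the same split loop would wrap an actual model fit instead of the sign rule used here.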
Table 1

Final logistic regression.

Variable                 Estimated odds ratio   95% CI            p-value
MeanBreadth              0.926                  [0.895, 0.958]    <0.0001
SphericalDisproportion   2.045                  [1.833, 2.280]    <0.0001
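To relate the odds ratios in Table 1 to logistic regression coefficients, the following sketch converts between the two scales. It assumes a Wald-type interval (coefficient plus or minus a normal quantile times the standard error); the source does not state how the intervals were computed, so the recovered standard errors are approximate, and the function names are hypothetical.

```python
import math

Z_95 = 1.959963984540054  # two-sided 95% quantile of the standard normal


def odds_ratio_ci(beta, se, z=Z_95):
    """Odds ratio and Wald confidence limits from a coefficient and its SE."""
    return math.exp(beta), math.exp(beta - z * se), math.exp(beta + z * se)


def coef_from_odds_ratio(odds_ratio, ci_low, ci_high, z=Z_95):
    """Approximate coefficient and SE back-calculated from a reported OR and CI."""
    beta = math.log(odds_ratio)
    se = (math.log(ci_high) - math.log(ci_low)) / (2.0 * z)
    return beta, se


# Values for MeanBreadth as reported in Table 1.
beta_mb, se_mb = coef_from_odds_ratio(0.926, 0.895, 0.958)
or_mb, low_mb, high_mb = odds_ratio_ci(beta_mb, se_mb)
```

An odds ratio below 1 corresponds to a negative coefficient, and one above 1 to a positive coefficient; each unit increase in the feature multiplies the odds of the modeled outcome by the odds ratio, holding the other feature fixed.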
Fig. 3

The testing ROC curves of each fold of the 10-random-split of the training cohort.

MeanBreadth measures the mean “width” of the ROI; therefore, it is closely related to the size of the tumor. We observed that HPV-positive subjects tend to have a smaller MeanBreadth (Fig. 4A), which indicates that tumors are smaller in HPV-associated subjects.
Fig. 4

Feature comparison between HPV+ and HPV− subjects. A. HPV+ patients have a relatively smaller MeanBreadth than HPV− patients. B. Similarly, HPV− subjects have a larger SphericalDisproportion than HPV+ subjects.

SphericalDisproportion, on the other hand, measures the ratio of the surface area of the image ROI to the surface area of a sphere with the same volume as the image ROI. This feature describes how complicated the shape of the tumor is, since, at equal volume, a simpler shape usually has a smaller surface area than a more complicated one. HPV-positive subjects also have a smaller SphericalDisproportion (Fig. 4B), which could imply that the tumor shape is less complex in subjects carrying HPV than in subjects not carrying HPV.
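The definition of SphericalDisproportion given above translates directly into a formula: with ROI volume V and surface area A, the equal-volume sphere has radius R = (3V / 4π)^(1/3), and the feature is A / (4πR²). A minimal sketch follows; the function name is hypothetical, and how IBEX measures the ROI surface area and volume is not specified here, so both are simply taken as inputs.

```python
import math


def spherical_disproportion(surface_area, volume):
    """Ratio of a shape's surface area to that of a sphere of equal volume."""
    # Radius of the sphere that has the same volume as the ROI.
    radius = (3.0 * volume / (4.0 * math.pi)) ** (1.0 / 3.0)
    return surface_area / (4.0 * math.pi * radius ** 2)


# A sphere of radius 2 gives the minimum possible value, 1.
sphere_value = spherical_disproportion(16.0 * math.pi, 32.0 * math.pi / 3.0)
# A unit cube (surface area 6, volume 1) is less compact, so the value exceeds 1.
cube_value = spherical_disproportion(6.0, 1.0)
```

Because the sphere minimizes surface area for a given volume, the value is at least 1 and grows as the shape becomes more irregular, which is why the text reads it as a measure of shape complexity.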

Discussion

In this article, we reported our winning strategy in the 2016 Medical Image Computing and Computer Assisted Intervention (MICCAI) grand challenge: Oropharynx Cancer (OPC) Radiomics Challenge, Human Papilloma Virus (HPV) Status Prediction. The goal of the challenge was to predict HPV-16 status from annotated CT images. Our approach involves image quality checking, feature extraction, ROI reconstruction, and variable selection. Throughout the process, we tried various statistical models, including pdfCluster, random forest, decision tree, and SVM (linear and non-linear kernels). In the winning submission, we used the generalized linear model (GLM) since it had the best public leaderboard score. Besides the statistical models, we tried deep learning as well. The network we used was GoogLeNet, and the input was the 2D center slice of the ROI. Although the results were not as satisfactory as those of the statistical models, it is worth mentioning that even with only 149 subjects (in fact only 75, since the random split takes only half of the data as the training cohort), deep learning achieved an AUC of around 0.744 on a random split test (0.753 for the winning algorithm on the same set). The difference in AUC is marginal, and we expect that, given more training subjects, deep learning will perform as well as, if not better than, the statistical models. Several clinical variables were also provided, but we did not use this information in our model, mainly for two reasons. First, our intention was to assess the performance of using images alone and to evaluate how precisely the image itself can predict HPV status. Second, including clinical information would bring extra uncertainty into the model. For example, TNM grade requires further testing, and the grading system is discrete while tumors develop in a continuous fashion; therefore, a cutoff has to be chosen to classify the tumor into the grading system.
This mandatory discretization may introduce bias and uncertainty. In addition, some self-reported items, e.g. smoking and sexual frequency, may not be accurate. Therefore, whether the clinical parameters would help improve the prediction is still debatable. One study [29] showed that the clinical parameters alone achieved a prediction AUC of 0.84 (lower than that of our model). It also showed that the radiomic features did not improve on the clinical-only model. However, that study had only a few vaguely defined imaging features, while in our study much more comprehensive radiomic features were extracted. Unfortunately, since the challenge has concluded, we no longer have access to the leaderboard, so we cannot provide results on this question at this time. Whether adding clinical parameters improves the overall prediction accuracy is worth further study. One uncommon fact is that our model has a higher AUC on the private cohort than on the public cohort. On one hand, we believe it indicates good generalizability of our model; on the other hand, it could also be because the validation/test cohort is relatively small. Because we did not receive the ground truth for either the public or the private cohort, we are not able to give more information on this rare situation. From the results, we found that HPV-positive subjects usually have smaller and geometrically simpler tumors, which are therefore more manageable. These findings may partially explain the results in the current literature that HPV-positive subjects have an overall survival advantage [1], [2], [3]. In this study, we explored the correlation between radiomic features and one clinical outcome, HPV16 status. As an exploratory study, we think its value lies in showing the high predictive ability of radiomic features.
Beyond HPV16 status, we believe there are more connections between the underlying biology and imaging, for example quantifying tumor heterogeneity through imaging. In this scenario, clonal/subclonal composition could be identified through imaging, without taking an actual biopsy. This would help physicians make better treatment decisions and potentially increase patients’ chance of survival. In conclusion, we have designed a statistical framework that analyzes CT images to predict HPV status, and it achieved first place in the 2016 MICCAI grand challenge: Oropharynx Cancer (OPC) Radiomics Challenge, Human Papilloma Virus (HPV) Status Prediction.
  19 in total

1.  Prediction of AD with MRI-based hippocampal volume in mild cognitive impairment.

Authors:  C R Jack; R C Petersen; Y C Xu; P C O'Brien; G E Smith; R J Ivnik; B F Boeve; S C Waring; E G Tangalos; E Kokmen
Journal:  Neurology       Date:  1999-04-22       Impact factor: 9.910

2.  IBEX: an open infrastructure software platform to facilitate collaborative work in radiomics.

Authors:  Lifei Zhang; David V Fried; Xenia J Fave; Luke A Hunter; Jinzhong Yang; Laurence E Court
Journal:  Med Phys       Date:  2015-03       Impact factor: 4.071

3.  Computed Tomography Radiomics Predicts HPV Status and Local Tumor Control After Definitive Radiochemotherapy in Head and Neck Squamous Cell Carcinoma.

Authors:  Marta Bogowicz; Oliver Riesterer; Kristian Ikenberg; Sonja Stieb; Holger Moch; Gabriela Studer; Matthias Guckenberger; Stephanie Tanadini-Lang
Journal:  Int J Radiat Oncol Biol Phys       Date:  2017-06-15       Impact factor: 7.038

4.  Prediction of MCI to AD conversion, via MRI, CSF biomarkers, and pattern classification.

Authors:  Christos Davatzikos; Priyanka Bhatt; Leslie M Shaw; Kayhan N Batmanghelich; John Q Trojanowski
Journal:  Neurobiol Aging       Date:  2010-07-01       Impact factor: 4.673

5.  Automatic classification of patients with Alzheimer's disease from structural MRI: a comparison of ten methods using the ADNI database.

Authors:  Rémi Cuingnet; Emilie Gerardin; Jérôme Tessieras; Guillaume Auzias; Stéphane Lehéricy; Marie-Odile Habert; Marie Chupin; Habib Benali; Olivier Colliot
Journal:  Neuroimage       Date:  2010-06-11       Impact factor: 6.556

6.  Predicting Alzheimer's Disease Using Combined Imaging-Whole Genome SNP Data.

Authors:  Dehan Kong; Kelly S Giovanello; Yalin Wang; Weili Lin; Eunjee Lee; Yong Fan; P Murali Doraiswamy; Hongtu Zhu
Journal:  J Alzheimers Dis       Date:  2015       Impact factor: 4.472

7.  Using Texture Analysis to Determine Human Papillomavirus Status of Oropharyngeal Squamous Cell Carcinomas on CT.

Authors:  K Buch; A Fujita; B Li; Y Kawashima; M M Qureshi; O Sakai
Journal:  AJNR Am J Neuroradiol       Date:  2015-04-02       Impact factor: 3.825

8.  Structural MRI biomarkers for preclinical and mild Alzheimer's disease.

Authors:  Christine Fennema-Notestine; Donald J Hagler; Linda K McEvoy; Adam S Fleisher; Elaine H Wu; David S Karow; Anders M Dale
Journal:  Hum Brain Mapp       Date:  2009-10       Impact factor: 5.038

9.  Improved survival of patients with human papillomavirus-positive head and neck squamous cell carcinoma in a prospective clinical trial.

Authors:  Carole Fakhry; William H Westra; Sigui Li; Anthony Cmelak; John A Ridge; Harlan Pinto; Arlene Forastiere; Maura L Gillison
Journal:  J Natl Cancer Inst       Date:  2008-02-12       Impact factor: 13.506

10.  Matched computed tomography segmentation and demographic data for oropharyngeal cancer radiomics challenges.

Authors: 
Journal:  Sci Data       Date:  2017-07-04       Impact factor: 6.444
