Yangyang Hao, Quan-Yang Duh, Richard T Kloos, Joshua Babiarz, R Mack Harrell, S Thomas Traweek, Su Yeon Kim, Grazyna Fedorowicz, P Sean Walsh, Peter M Sadow, Jing Huang, Giulia C Kennedy.
Abstract
BACKGROUND: Identifying Hürthle cell cancers by non-operative fine-needle aspiration biopsy (FNAB) of thyroid nodules is challenging. As a result, non-cancerous Hürthle lesions have conventionally been distinguished from Hürthle cell cancers by histopathological examination of tissue following surgical resection. Reliance on histopathological evaluation requires patients to undergo surgery to obtain a diagnosis, even though most of these lesions are non-cancerous. It is highly desirable to avoid surgery by accurately classifying nodules as benign or malignant from FNAB preoperatively. Our first-generation algorithm, the Gene Expression Classifier (GEC), addressed this goal by applying machine learning (ML) to gene expression features. That classifier is sensitive but not specific, due in part to the presence of non-neoplastic benign Hürthle cells in many FNABs.
Keywords: Algorithm; Genomic; Hürthle; Machine learning; Personalized healthcare; RNA-seq; Thyroid cancer
Year: 2019 PMID: 30952205 PMCID: PMC6450053 DOI: 10.1186/s12918-019-0693-z
Source DB: PubMed Journal: BMC Syst Biol ISSN: 1752-0509
Fig. 1. The Afirma GSC Algorithm Workflow. (a) A diagram of the Afirma GSC workflow with the validation cohort outcomes listed. (b) Nested strategy for Hürthle classification. Samples are first examined by the HI classifier. HI+ samples are passed to the NI classifier. NI- samples are subject to an adjusted threshold for the main B/S classifier
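The nested strategy in panel b of Fig. 1 can be read as a small decision procedure. The following is a minimal sketch of that reading, not the Afirma GSC implementation: the `Thresholds` class, the `classify` function, all cutoff values, and the convention that a higher score means "more positive" or "more suspicious" are assumptions made for illustration only.

```python
from dataclasses import dataclass


@dataclass
class Thresholds:
    # All cutoff values are hypothetical placeholders, not published values.
    hi_cutoff: float           # Hürthle Index: score >= cutoff means HI+
    ni_cutoff: float           # Neoplasm Index: score >= cutoff means NI+
    bs_cutoff: float           # standard cutoff of the main B/S classifier
    bs_cutoff_adjusted: float  # adjusted B/S cutoff applied to HI+, NI- samples


def classify(bs_score: float, hi_score: float, ni_score: float, t: Thresholds) -> str:
    """Return 'Suspicious' or 'Benign' following the nested HI -> NI -> B/S order."""
    if hi_score < t.hi_cutoff:
        # HI-: the sample never reaches the NI classifier.
        cutoff = t.bs_cutoff
    elif ni_score >= t.ni_cutoff:
        # HI+, NI+: the standard B/S cutoff still applies.
        cutoff = t.bs_cutoff
    else:
        # HI+, NI-: the main B/S classifier uses the adjusted cutoff.
        cutoff = t.bs_cutoff_adjusted
    return "Suspicious" if bs_score >= cutoff else "Benign"


# Toy thresholds chosen only to make the three branches visible.
toy = Thresholds(hi_cutoff=0.5, ni_cutoff=0.5, bs_cutoff=0.0, bs_cutoff_adjusted=1.0)
```

With these toy thresholds, the same B/S score of 0.5 is called Suspicious for an HI- sample and for an HI+, NI+ sample, but Benign for an HI+, NI- sample, because only that last group is compared against the adjusted cutoff.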
Fig. 2. Mitochondrial expression in cytopathology Hürthle positive (H+) and negative (H-) cohorts. Shown are the 13 mitochondrial transcripts present in RNA-seq data. For each transcript, boxplots show expression values for Hürthle negative (H-) and Hürthle positive (H+) samples, respectively
Fig. 3. Loss of Heterozygosity in Hürthle positive samples. (a) Affymetrix CytoScan array data on Hürthle tissues vs. non-Hürthle normal tissues. Each column represents one chromosome and each row represents one sample. The value in each cell is the proportion of the chromosome displaying LOH for a given sample. The samples are sorted in descending order for genome-wide LOH. (b) Chromosome-level LOH data from RNA-Seq Hürthle positive and Hürthle negative samples. Above the horizontal dashed line are Hürthle positive samples, below the dashed line are Hürthle negative samples. (c) Chromosome-level LOH data from RNA-Seq Hürthle positive, Neoplasm positive or Neoplasm negative samples. Above the horizontal dashed line are Hürthle positive, Neoplasm positive samples, below the dashed line are Hürthle positive, but Neoplasm negative samples. For both (b) and (c) the LOH scale is to the right, with red indicating more LOH
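The heatmap layout described for Fig. 3 (one cell per sample and chromosome holding the proportion of that chromosome in LOH, rows ordered by genome-wide LOH) can be illustrated with a short sketch. The function names, the toy chromosome lengths, and the definition of genome-wide LOH as total LOH length divided by total chromosome length are assumptions for illustration; the caption does not state how the genome-wide value was computed.

```python
def loh_proportions(loh_bp: dict, chrom_bp: dict) -> tuple:
    """Return (per-chromosome LOH proportions, genome-wide LOH proportion).

    loh_bp maps chromosome name -> bases in LOH (missing chromosomes count as 0).
    chrom_bp maps chromosome name -> chromosome length in bases.
    """
    per_chrom = {c: loh_bp.get(c, 0) / chrom_bp[c] for c in chrom_bp}
    # Assumed definition: length-weighted fraction of the genome in LOH.
    genome_wide = sum(loh_bp.get(c, 0) for c in chrom_bp) / sum(chrom_bp.values())
    return per_chrom, genome_wide


def sort_by_genome_wide_loh(samples: dict, chrom_bp: dict) -> list:
    """Order sample names by descending genome-wide LOH, as the heatmap rows are."""
    return sorted(
        samples,
        key=lambda name: loh_proportions(samples[name], chrom_bp)[1],
        reverse=True,
    )


# Toy data: two made-up chromosomes and three made-up samples.
toy_chrom_bp = {"chr1": 100, "chr2": 50}
toy_samples = {
    "A": {"chr1": 10},
    "B": {"chr1": 50, "chr2": 50},
    "C": {},
}
row_order = sort_by_genome_wide_loh(toy_samples, toy_chrom_bp)
```

Each row of the heatmap would then be the `per_chrom` dictionary of one sample, with rows emitted in `row_order`.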
Fig. 4. Hürthle Classifier Cross Validation Performance. (a) Samples used in the Classifier development. Hürthle and Neoplasm labels are defined by cytopathology. (b) Volcano plots of differential expression. Fold-change (log2 scale) is plotted on the x-axis, and FDR-adjusted p-values are plotted on the y-axis. Mitochondrial genes are shown in purple. (c) Hürthle Index Score. Red dashed line indicates the cut-off for HI+ vs. HI-. The green boxplot represents the score for cytopathology Hürthle negative samples and the purple boxplot represents cytopathology Hürthle positive samples. (d) ROC curve showing classifier performance. The red-dashed lines indicate performance at the selected cutoff
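The "performance at the selected cutoff" marked on a ROC curve is a single sensitivity and specificity pair obtained by thresholding the classifier score. The sketch below shows that calculation under the assumption that scores at or above the cutoff are called positive; the function name and the toy scores and labels are made up and are not data from the study.

```python
def sensitivity_specificity(scores, labels, cutoff):
    """Compute (sensitivity, specificity) when score >= cutoff is called positive.

    labels are booleans: True for reference-positive samples, False otherwise.
    """
    pairs = list(zip(scores, labels))
    tp = sum(1 for s, y in pairs if y and s >= cutoff)
    fn = sum(1 for s, y in pairs if y and s < cutoff)
    tn = sum(1 for s, y in pairs if not y and s < cutoff)
    fp = sum(1 for s, y in pairs if not y and s >= cutoff)
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one positive and one negative sample")
    return tp / (tp + fn), tn / (tn + fp)


# Toy example: three reference-positive and two reference-negative samples.
toy_scores = [0.9, 0.8, 0.4, 0.3, 0.2]
toy_labels = [True, True, True, False, False]
sens, spec = sensitivity_specificity(toy_scores, toy_labels, cutoff=0.5)
```

Sweeping `cutoff` over all observed scores and plotting sensitivity against 1 - specificity traces the full ROC curve; the dashed lines in the figure correspond to one point on it.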
Fig. 5. Neoplasm Classifier Cross Validation Performance. (a) Samples used in the Classifier development. Hürthle and Neoplasm labels are defined by cytopathology. Note that all samples are Hürthle positive. (b) Volcano plots of differential expression. Fold-change (log2 scale) is plotted on the x-axis, and FDR-adjusted p-values are plotted on the y-axis. Mitochondrial genes are shown in purple. (c) Neoplasm Index Score. Red dashed line indicates the cut-off for NI+ vs. NI-. Cyan triangles indicate genome-wide LOH positive. (d) ROC curve showing classifier performance. The red-dashed lines indicate performance at the selected cutoff
Fig. 6. Hürthle and Neoplasm Scores from the Afirma GSC Validation Cohort. (a) HI classifier scores for the validation cohort. Red dashed line indicates cutoff for HI+ vs. HI-. The HI score distribution is plotted as a boxplot with individual sample values for each of the four groups: Hürthle Cell Carcinoma “HCC”, Hürthle Cell Adenoma “HCA”, non-Hürthle histopathology malignant “nonHürthle Malignant”, non-Hürthle histopathology benign “nonHürthle Benign”. (b) The combination of the main B/S, Hürthle, and Neoplasm Indices. Gray points are HI- samples. Purple points are HI+, NI+. Green points are HI+, NI-. Blue dots are HI+, NI- samples that were subject to the adjusted cutoff. (c) NI classifier scores for HI+ samples from the validation cohort