Huajing Teng1, Fengbiao Mao1, Jialong Liang1, Meiying Xue1,2, Wenqing Wei1,2, Xianfeng Li1, Kun Zhang3, Dongdong Feng4, Baoguo Liu4, Zhongsheng Sun1. 1. Beijing Institutes of Life Science, Chinese Academy of Sciences, Beijing, China. 2. University of Chinese Academy of Sciences, Beijing, China. 3. Institute of Genomic Medicine, Wenzhou Medical University, Wenzhou, China. 4. Department of Head and Neck Surgery, Key Laboratory of Carcinogenesis and Translational Research (Ministry of Education), Peking University Cancer Hospital and Institute, Beijing, China.
Abstract
Papillary thyroid carcinoma (PTC) is the fastest-growing disease caused by numerous molecular alterations in addition to previously reported DNA mutations. There is a compelling need to identify novel transcriptomic alterations that are associated with the pathogenesis of PTC with potential diagnostic and prognostic implications. Methods: We gathered and compared 242 expression profiles between paired PTC and adjacent normal tissues and identified and validated the coding and long non-coding RNAs (lncRNAs) associated with the extrathyroidal extension (ETE) of 655 PTC patients in two independent cohorts, followed by predicting their interactions with drugs. Co-expression, RNA interaction, Kaplan-Meier survival and multivariate Cox proportional regression analyses were performed to identify dysregulated lncRNAs and genes that correlated with clinical outcomes of PTC. Alternative splicing (AS), RNA circularization, and editing were also compared between transcriptomes to expand the repertoire of molecular alterations in PTC. Results: Numerous genes related to cellular microenvironment and steroid hormone response were associated with the ETE of PTC. Drug susceptibility predictions of the expression signature revealed two highly ranked compounds, 6-bromoindirubin-3'-oxime and lovastatin. Co-expression and RNA interaction analysis revealed the essential role of lncRNAs in PTC pathogenesis by modulating extracellular matrix and cell adhesion. Eight genes and two novel lncRNAs were identified that correlated with the aggressive nature and disease-free survival of PTC. Furthermore, this study provided the transcriptome-wide landscape of circRNAs in PTC and uncovered dissimilar expression profiles among circRNAs originating from the same host gene, suggesting the functional complexity of circRNAs in PTC carcinogenesis. The newly identified AS events in the SERPINA1 and FN1 genes may improve the sensitivity and specificity of these diagnostic biomarkers. Conclusions: Our study uncovered a comprehensive transcriptomic signature associated with the carcinogenesis and aggressive behavior of PTC, as well as presents a catalog of 10 potential biomarkers, which would facilitate PTC prognosis and development of new therapeutic strategies for this cancer.
Papillary thyroid carcinoma (PTC) is the fastest-growing disease caused by numerous molecular alterations in addition to previously reported DNA mutations. There is a compelling need to identify novel transcriptomic alterations that are associated with the pathogenesis of PTC with potential diagnostic and prognostic implications. Methods: We gathered and compared 242 expression profiles between paired PTC and adjacent normal tissues and identified and validated the coding and long non-coding RNAs (lncRNAs) associated with the extrathyroidal extension (ETE) of 655 PTC patients in two independent cohorts, followed by predicting their interactions with drugs. Co-expression, RNA interaction, Kaplan-Meier survival and multivariate Cox proportional regression analyses were performed to identify dysregulated lncRNAs and genes that correlated with clinical outcomes of PTC. Alternative splicing (AS), RNA circularization, and editing were also compared between transcriptomes to expand the repertoire of molecular alterations in PTC. Results: Numerous genes related to cellular microenvironment and steroid hormone response were associated with the ETE of PTC. Drug susceptibility predictions of the expression signature revealed two highly ranked compounds, 6-bromoindirubin-3'-oxime and lovastatin. Co-expression and RNA interaction analysis revealed the essential role of lncRNAs in PTC pathogenesis by modulating extracellular matrix and cell adhesion. Eight genes and two novel lncRNAs were identified that correlated with the aggressive nature and disease-free survival of PTC. Furthermore, this study provided the transcriptome-wide landscape of circRNAs in PTC and uncovered dissimilar expression profiles among circRNAs originating from the same host gene, suggesting the functional complexity of circRNAs in PTC carcinogenesis. The newly identified AS events in the SERPINA1 and FN1 genes may improve the sensitivity and specificity of these diagnostic biomarkers. Conclusions: Our study uncovered a comprehensive transcriptomic signature associated with the carcinogenesis and aggressive behavior of PTC, as well as presents a catalog of 10 potential biomarkers, which would facilitate PTC prognosis and development of new therapeutic strategies for this cancer.
Entities:
Keywords:
RNA editing; alternative splicing; circular RNAs; drug susceptibility; long non-coding RNAs; papillary thyroid carcinoma
The incidence of thyroid cancer has tripled over the past three decades, attributed to the rapid rise of the papillary subtype 1-3. Large-scale genomic characterization of papillary thyroid carcinoma (PTC) uncovered the critical role of genetic alterations in the tumorigenesis of this disease. Somatic mutations and gene fusions, which activate the mitogen-activated protein kinase (MAPK) and phosphoinositide 3-kinase AKT (PI3K-AKT) pathways, were identified as the primary molecular aberrations for PTC 4-9. However, the carcinogenesis of PTC is a complex biological process characterized by various molecular abnormalities, and the reason for its high prevalence remains poorly understood. Extrathyroidal extension (ETE) is a prognostic factor that has been strongly associated with recurrence and survival in PTC patients 10, 11, but the molecular mechanisms underlying the progression of ETE remain mostly uncharacterized, limiting the ability to predict and treat patients with high recurrence risk 11, 12. Recently, mutations in thyroglobulin (TG), not exclusive to mutations in the RAS-MAP kinase pathway, were associated with significantly worse clinical outcomes, suggesting their pathogenic role beyond driving initial oncogenesis 13. Thus, for diagnosis and prognosis, there remains a compelling need to decode novel molecular targets and/or processes that underlie the pathogenesis and progression of PTC 14-18.RNAs, the direct output of the genetic information that is encoded by genomes, play essential roles in cellular functions, and their regulation can diversify the genetic output. For example, the function of non-coding RNAs is indispensable for fine-tuning the genetic network, and its dysregulation is tightly associated with the pathogenesis of cancers 19-21. In comparison with the well-defined role of microRNAs in the pathogenesis of thyroid cancer 22, the role of long noncoding RNAs (lncRNAs) in this process is still largely uncharacterized. The aberrant expression of several lncRNAs in PTC indicates their contribution to carcinogenesis 23-27; however, key targets of these lncRNAs and their clinical roles remain to be elucidated. Emerging evidence indicates that several other processes, such as alternative splicing (AS), RNA editing, and RNA circularization, expand the complexity of the transcriptome, and provide a repertoire of candidate genes contributing to the pathogenesis of cancers 28, 29. So far, little information is available on the transcriptomic signatures of AS and circular RNAs (circRNAs) in thyroid cancer. Although an overall increase in the A-to-I editing level has been detected in multiple cancer types including thyroid cancers 28, the specific sites that were altered in cancer tissues and their pathogenic role remain elusive.Next-generation sequencing permits the investigation of an entire transcriptome, including alterations of coding/noncoding RNA expression, AS, and editing with unprecedented resolution and throughput. In this study, we collected 242 paired PTC and adjacent normal tissues, analyzed their expression profiles, and provided a comprehensive transcriptomic signature of lncRNAs, circRNAs, mRNAs, AS, and RNA editing associated with the carcinogenesis and aggressive behavior of PTC. More importantly, our study identified a catalog of 10 mRNAs and lncRNAs with prognostic significance for PTC. This study provides new insight into the pathogenesis and aggressiveness of PTC, which may facilitate the prognosis and development of new therapeutic strategies for this disease.
Methods
Patients and samples
PTC and adjacent normal tissues were obtained from 11 patients undergoing total thyroidectomy at the Head and Neck Surgery Department of the Beijing Cancer Hospital. The clinical characteristics of the patients are listed in Table . All tumor samples were reviewed by two experienced pathologists, who confirmed the diagnosis of PTC and ensured that the cancer foci contained more than 60% tumor cellularity. The TNM classification of American Joint Committee on Cancer (AJCC) ranging from stages I to IV, according to the primary tumor (T), regional lymph nodes (N), and distant metastasis (M), was used for PTC staging 30. ETE, the risk factor strongly associated with recurrence and aggressive features of PTC 10, 31, was used to describe the aggressiveness of PTC. Three different degrees of ETE, none, minimal (extension to the sternothyroid muscle or perithyroidal soft tissue) and gross (extension to the subcutaneous soft tissue, larynx, trachea, esophagus, or recurrent laryngeal nerve), were defined according to the range of extension in the AJCC TNM system. The samples were snap-frozen in liquid nitrogen and stored at -80 °C until used for analysis of differential expression, circRNAs, AS, and RNA-editing after RNA sequencing. The study was approved by the Ethical Review Board of the hospital, and clinical data were collected after patients provided informed consent.The clinical data of 12 paired PTC and noncancerous tissues (Table ) and 24 transcriptome datasets of total RNA were downloaded from ArrayExpress (https://www.ebi.ac.uk/arrayexpress/experiments/E-GEOD-83520/) 23. We also downloaded the clinical data of 125 PTC patients (Table ) and 165 transcriptome data including 80 paired cancerous and noncancerous transcriptomes from the European Nucleotide Archive database with accession number PRJEB11591 (http://www.ebi.ac.uk/ena/data/view/PRJEB11591) 16. The clinical data of 58 PTC patients, each containing both tumor and normal tissues (Table ) and their matched 116 RNA-seq read count data were downloaded from the TCGA THCA cohort (https://portal.gdc.cancer.gov).
cDNA library preparation and sequencing
Total RNA was isolated using an RNeasy Mini Kit (Qiagen, Shanghai, China). The RNA samples with an RNA Integrity Number (RIN) greater than eight, as assessed by Agilent 2100 Bioanalyzer (Agilent, Santa Clara, CA, USA), were used for RNA library construction. cDNA libraries were prepared according to the manufacturer's protocol (Illumina, San Diego, CA, USA). Each library was sequenced using the Illumina HiSeq™ 2000 platform and generated 100 bp paired-end (PE) reads. The raw sequencing datasets were deposited in the Sequence Read Archive of NCBI (http://www.ncbi.nlm.nih.gov/sra) under accession number PRJNA431763.
Identification of differentially expressed genes (DEGs) and lncRNAs
After removing adapters and low-quality bases, sequencing reads of 211 transcriptomes from 148 PTC patients in our collected cohorts (Table ) were aligned to the human reference genome (UCSC hg19) using STAR 32 with default parameters. Next, HTSeq 33 was used to count the reads according to the hg19 coordinates of protein-coding genes and lncRNAs in GENCODE (V22), and differential expression between the paired normal and cancer samples was identified using the Bioconductor package edgeR 34 with a cut-off of 5% false discovery rate (FDR). To identify the prominently altered genes and RNAs, the differentially expressed lncRNAs and genes with greater than two-fold expression changes were considered. Hierarchical clustering of the prominent altered gene and lncRNA expression profiles were performed using Cluster 3.0 (http://bonsai.hgc.jp/~mdehoon/software/cluster/). After taking into consideration the differences between tumor and adjacent normal tissues, an ANOVA-like test for differential expression across different degrees of ETE in our collected and TCGA THCA cohorts was performed independently using edgeR 34 to identify lncRNAs and genes associated with the progression of PTC. The gene expression profiles in response to ETE were compared with drug response signatures listed in the Connectivity Map (CMAP) build 02 (Broad Institute) 35. A protein-protein interaction network of DEGs associated with PTC aggressiveness was also constructed based on text mining and the experimentally determined protein-protein interaction database of String 36.
Co-expression analysis of prominently altered lncRNAs and genes
We used the list of differentially expressed lncRNAs to analyze their correlation with the DEGs detected in PTC tissues. The expression level of lncRNAs and genes in each tumor or normal tissue was used to calculate the Pearson correlation coefficient and probability. Subsequently, gene pairs with Pearson correlation coefficient > 0.7 and P < 0.05 were selected and constructed into a co-expression network using Cytoscape. To further identify the interaction partners, lncRNA-mRNA interactions were predicted for each lncRNA based on the seed-and-extension approach and RNA secondary structure energy model using Riblast according to default parameters 37.
Survival analysis of dysregulated lncRNAs and genes
We downloaded RNA-seq normalized data for 493 THCA patients with clinical data from TCGA (https://portal.gdc.cancer.gov) (Table ). Kaplan-Meier survival analysis, univariate and multivariate Cox proportional regression analyses were employed with the R survival package to analyze the association between the expression of a specific molecule and clinical outcome. Age, gender, TNM staging, mutation of BRAFV600E, histological subtypes (classical, follicular, and tall cell), and history of other malignancies were taken into account in the multivariate Cox regression analysis. The information on Cox proportional hazard ratios and the 95% confidence interval were also obtained.
circRNA analysis
Sequencing reads of paired PTC and noncancerous tissues in our collected cohorts were aligned to a human reference genome (UCSC hg19) using BWA-MEM 38 with default parameters. Subsequently, circRNAs were identified using CIRI_v2.0.6 39, 40 with at least two supporting reads. Differentially expressed circRNAs between normal and tumor tissues were further identified using CircTest 41. Regulatory elements on circRNA, including microRNA response elements and RNA-binding protein binding sites, were predicted using CircView 42.
Detection of AS events
rMATS uses a modified generalized linear mixed model to detect differential AS events from RNA-seq data with replicates 43. We used rMATS to detect five types of AS events 43, including skipped exon (SE), retained intron (RI), mutually exclusive exons (MXE) and alternative 5′ and 3′ splice sites (A5SS, A3SS), from the RNA-seq data.
Motif enrichment analysis
We used rMAPs 44 to identify binding sites of RNA-binding proteins that were significantly enriched in differential exon skipping events between cancer and normal tissues. Motif occurrences were scanned separately in exons or 250 bp upstream or downstream intronic sequences, and alternative exons without splicing changes (rMATS FDR>50%, maxPSI>15%, and minPSI<85%) were used as a control. P-values and FDR for motif enrichment were calculated for each motif after counting the number of occurrences in the differentially spliced exons and the control exons.
RNA-editing analysis
After aligning the sequencing reads to the human reference genome using STAR 32, processes, including duplicate removal, reassign mapping qualities, and base quality recalibration, were performed using PICARD (http://picard.sourceforge.net) and the Genome Analysis Toolkit (GATK). Single nucleotide variants (SNV) were called using GATK HaplotypeCaller. RNA editing sites were identified using GIREMI 45 with the obtained SNV as the training set. RNA editing sets were compared with the known RNA-edited sites available from public database, Rigorously Annotated Database of A-to-I RNA Editing (RADAR) 46. The overall editing levels across all edited sites were compared between normal and tumor tissues using Wilcoxon rank-sum test. For each of the sites, differences in editing level between paired PTC and adjacent normal tissues were calculated using Fisher's exact test followed by 5% FDR multiple-testing correction.
Gene set enrichment analysis
Functional enrichment analysis of the genes was performed on Cytoscape plugin BINGO 47.
Results
Differentially expressed genes (DEGs) and lncRNAs associated with carcinogenesis and aggressive disposition of PTC
PTC was reported to have a low frequency of somatic mutations and low intratumor heterogeneity 8. To identify the transcriptome-wide alterations of PTC, differential expression analysis of lncRNAs and coding genes was first performed by comparing 242 expression profiles between paired PTC and adjacent normal tissues. Surprisingly, 52.73% (10,427/19,776) of the coding genes and 32.83% (5,192/15,817) of the lncRNAs exhibited differential expression (FDR<0.05) between tumor and normal tissues, indicating the widespread alterations in both lncRNA and mRNA expression in the cancer known to have a low frequency of mutations. When a greater than two-fold change in the expression levels was considered significant, 1,437 lncRNAs (464 upregulated and 973 downregulated) and 1,582 coding genes (672 upregulated and 910 downregulated) with differential expression were identified (Figure ). Correlation (R2) of the expression differences in PTC versus normal tissues for 38 DEGs between our study and a previous meta-analysis 48 was 0.9352 (Figure ), suggesting a high degree of reliability of the DEG analysis. Hierarchical clustering analysis using the identified DEGs and lncRNAs revealed that expression levels of these entities could define the majority of the malignant and adjacent benign thyroid tissues (Figure ).We further compared the transcriptomes of different degrees (none, minimal, and gross) of ETE to identify transcripts associated with the aggressive features of PTC. We obtained 264 DEGs and 95 differentially expressed lncRNAs across various degrees of ETE after considering the differences between tumor and adjacent normal tissues (Table ), and found that 35 genes and 7 lncRNAs were differentially expressed between all analyzed groups. The correlation between expression levels of these identified transcripts and ETE was validated in PTC data from the TCGA cohort (Table ). After Kaplan-Meier survival, univariate and multivariate Cox regression analyses, a total of 8 genes were predicted to be correlated with the disease-free survival of PTC patients (Table ), thus expanding the number of prognostic biomarkers in this disease. For example, decreased mRNA expression of HIGD1B and SDPR was observed in minimal and gross ETEtumor tissues in both cohorts (Figure ), which was significantly associated with worse disease-free survival of PTC patients after adjusting for multiple testing (Figure ) and multiple clinical factors including age, gender, TNM staging, mutation of BRAFV600E, histological subtypes, and history of other malignancies (Figure ). In addition to several tumor microenvironment-related categories, such as “proteinaceous extracellular matrix,” and “blood vessel morphogenesis”, we found two other categories enriched in the identified gene set associated with ETE, “response to oxygen levels” and “response to steroid hormone stimulus” (Figure ), indicating their essential role in the aggressive behavior of PTC. Comparative analysis between the gene expression profiles in response to ETE and drug response signatures listed in the CMAP database 35 identified 6-bromoindirubin-3'-oxime and lovastatin. 6-bromoindirubin-3'-oxime, a pan-JAK inhibitor to induce apoptosis of humanmelanoma cells 49, was found to be the highly ranked compound with antagonistic effects on genes associated with ETE (enrichment score: -0.811; Table ). The compound ranked second is lovastatin, a cholesterol-lowering drug with dual effects on the proliferation of anaplastic thyroid cancer 50, which could induce the biological state encoded in the signature associated with ETE (enrichment score: 0.925; Table ). The identification of these compounds might be beneficial for PTC patients with high risk of progression.
Co-expression networks reveal the critical role of lncRNAs in PTC carcinogenesis
Gene expression can be organized into modules, which results in co-expression variations across conditions. We investigated the co-expression relationships between the differentially expressed lncRNAs and coding genes associated with the carcinogenesis of PTC. Following the identification of significantly co-expressed lncRNA-mRNA pairs and RNA-RNA interactions, we obtained several networks that consisted of 1,528 co-expression relationships between 349 lncRNAs and 491 mRNAs (Table ). The maximum connected subgraph of the network contained 120 lncRNAs and 103 mRNAs (Figure ). Highly connected nodes were likely to be the hubs with more competing interactions. Higher degrees of interactions were observed in the lncRNA nodes than the coding-gene nodes (Figure ), and the top 10 high-degree nodes were all of the lncRNAs. We observed significantly enriched processes relevant for extracellular matrix microenvironments among the genes impacted by lncRNAs, including “cell adhesion”, “proteinaceous extracellular matrix”, and “transmembrane receptor protein tyrosine kinase activity”, which have been proposed to be important in the pathogenesis of thyroid cancer 48. Some of the identified coding genes impacted by lncRNAs have already been described to be involved in thyroid cancer development, such as MET, KRT19, and SERPINA1
48. These results suggested the functional importance of lncRNAs in the PTC carcinogenesis.
Dysregulated lncRNAs are markers of aggressive behavior and survival in PTC
Co-expression networks suggested the essential role of lncRNAs in PTC pathogenesis, which prompted us to assess whether the expression levels of these co-expressed lncRNAs correlated with disease progression and survival of PTC patients. We examined the co-expression matrix and the repository of lncRNAs identified to be correlated with progression of ETE in the two independent cohorts (Table ). After performing Kaplan-Meier survival analysis and taking into consideration the confounding clinical factors in multivariate Cox proportional hazards regression analysis, two dysregulated lncRNAs, PACERR (Figure ) and CYP4A22-AS1 (Figure ), were for the first time identified to be correlated with the disease-free survival of PTC patients (Table ). Increased expression of PACERR (Figure ) and CYP4A22-AS1 (Figure ) was observed during the progression of ETE in two independent cohorts, and their increase was associated with worse disease-free survival of PTC patients in the Kaplan-Meier survival analysis after controlling for multiple hypothesis tests (Figure ). Furthermore, the association between the increased expression of the two lncRNAs and the disease-free survival of PTC patients remained statistically significant after accounting for age, gender, TNM staging, mutation of BRAFV600E, histological subtypes, and history of other malignancies in the multivariate Cox regression analysis (Figure ). Thus, these two dysregulated lncRNAs provide novel prognostic biomarkers for PTC.
Differentially expressed circRNAs between PTC and adjacent normal tissues
Increasing evidence has suggested the crucial functions of another class of noncoding RNAs, circRNAs, in regulating gene expression and tumorigenesis 51-54. Our search for these noncoding RNAs identified in total 17,864 circRNAs in thyroid tissues, 19.29% of which were shared by more than five samples (Figure ). By comparing each of these circRNAs between normal and tumor tissues, we identified 146 differentially expressed circRNAs generated from 128 annotated host genes (Figure and Table ). We observed significantly enriched GO terms based on host genes of these circRNAs, including “intracellular part” and “cell-cell adherens junction”, which are functional processes that are reported to be relevant to the tumorigenesis of PTC 55. Interestingly, some genes yielded multiple distinct circularized RNAs that were observed in our study. It is of note that six distinct circRNAs were observed in the TG gene, whose mutations were associated with significantly worse clinical outcomes of PTC 13 (Figure ). Some circRNAs had different acceptors but shared the same donor, indicating the existence of alternative splicing events (Figure ). An abundance of circRNA isoforms revealed dissimilar expression profiles among different circRNAs originating from the same host gene (Figure ). Multiple binding sites of miRNA and RNA binding proteins were predicted in the identified circRNAs (Figure ), which implied that these circRNAs could function as miRNA sponges to regulate mRNA expression 52, 53. Collectively, the above results suggested the role of circRNAs in the carcinogenesis of PTC, and that the alternative back-splicing contributed to the diversity and functional complexity of PTC circRNAs.
The landscape of AS in PTC
Emerging evidence indicates the important role of AS in the pathophysiology of many humancancers 56, 57. We obtained a total of 107 highly reliable differential AS events between normal and tumor tissues (Figure and Table ), including 57 SE, six RI, 31 MXE, seven A5SS and six A3SS events. Among the genes with identified AS events, some had already been proven to be crucial for thyroid cancer development. For instance, SERPINA1 exhibited differential exon skipping between tumor and normal tissues (Figure ), and a mutually exclusive exon of the FN1 gene was shown with its flanking exons (Figure ). RNA binding proteins (RBPs) can regulate AS through binding to regulatory RNA sequence motifs in the exons and/or flanking intronic sequences of alternatively spliced exons 58. Analyses of binding motifs in the exons and flanking intronic sequences of SE revealed 14 significantly enriched RBPs (Table ). For example, IGF2BP3 binding sites were observed downstream of the included exons (Figure ). The newly identified AS events are expected to facilitate a better understanding of this disease.
Differential RNA-editing pattern in PTC
RNA editing may supplement alterations in genomic DNA to drive tumorigenesis 28, 59, 60. In total, we detected 113,884 RNA editing sites in thyroid tissues, and approximately 71.02% (71,663 of 100,905) of A-to-I (G) editing sites were annotated in the RADAR 46. To determine whether editing is altered in PTC, the mean editing levels across all edited sites were compared between normal and tumor tissues. An elevated RNA editing level was observed in tumor compared to normal tissues (Figure ). To identify specific sites that were altered in cancers, we next compared the editing level of each site between paired cancer and normal tissues. Overall, we detected 2,469 differentially edited sites in tumors compared with normal tissues, and there was a significant enrichment of the A-to-G variation (Figure and Table ). When considering the distribution of RNA editing variants in different genomic regions, we observed that 33.35% of the differentially edited sites resided in the 3′ untranslated regions (UTR) derived from 506 genes (Figure ). The most enriched functional categories for the differentially edited 3' UTR variants fell into “intracellular membrane-bounded organelle”, “cell surface”, and “posttranscriptional regulation of gene expression”, which are all core functional processes in the cell. RNA editing at 3' UTRs could regulate the stability of target transcripts by creating or eliminating miRNA targeting sites 61, 62. We obtained 36 sites within 3' UTRs of 27 genes that showed altered complementary seed sites of miRNA targets. For example, in the 3' UTR of METTL7A, which is a tumor suppressor in some cancers, we found two A-to-G editing sites (chr12:51324436; chr12:51324506) in four samples (Figure ). The site (chr12:51324436) was a complementary seed site of multiple miRNAs, including miR-320, miR-4251, miR-4329, miR-4663, miR-6761-5p, miR-7160-5p, and miR-8077. Notably, miR-320a, one of the members of miR-320, was downregulated in thyroid cancer of the TCGA cohort, ovarian cancer 63, colon cancer 64, and medulloblastoma 65. These data shed new light on the functions of RNA editing variants in the tumorigenesis of PTC.
Discussion
PTC is the fastest-growing disease caused by a series of molecular alterations 2, 3, 8. Previous studies attempting to identify molecular alterations underpinning PTC have focused primarily on characterizing aberrations at the DNA level, such as mutations and structural variations 8, 13. Besides identifying variations at the genetic level, a detailed transcriptomic characterization is expected to provide a better understanding of this disease, as well as facilitate the prognosis and development of new therapeutic strategies. Here we report a comprehensive transcriptome-wide analysis of the noncoding RNAs and mRNAs, AS, and RNA editing, revealing a clinically significant relationship between mRNAs/lncRNAs and PTC prognosis.
Genes associated with the progression of PTC
Although several genes have been implicated in PTC carcinogenesis, the molecular markers associated with the progression of this disease remain largely unknown, which limits early diagnosis and treatment of patients with high recurrence risk 66. ETE, a prognostic factor strongly associated with recurrence and aggressive features of PTC 10, 31, was used to assess the progression of PTC in our study. We obtained and validated 264 genes associated with ETE of PTC in two independent cohorts, and identified 8 genes correlated with the disease-free survival of PTC patients after adjusting for multiple clinical factors, thus expanding the repertoire of prognostic biomarkers in this disease. Some of them have been reported to be associated with survival of other types of cancers. For example, SDPR could function as a metastasis suppressor in breast cancer by promoting apoptosis, and its loss of expression correlated with significantly reduced distant-metastasis-free and relapse-free survival in breast cancerpatients 67. It has been demonstrated that the expression of GALNT9, another gene identified in our survival analysis, was associated with high overall survival in neuroblastomapatients, independent of the standard risk-stratification covariates 68. The enrichment of the genes associated with ETE of PTC revealed the pivotal role of the cellular microenvironment in the progression of this disease. Interestingly, we found that some of the genes were related to steroid hormone response, indicating the essential role of this function in PTC progression. Further comparison between the expression profiles in response to ETE and drug response signatures in the CMAP database 35 revealed two highly ranked compounds, 6-bromoindirubin-3'-oxime and lovastatin. As an inhibitor of 3-hydroxy-3-methylglutaryl (HMG)-CoA reductase, lovastatin can inhibit the conversion of mevalonate from HMG-CoA. It was previously reported that it could suppress invasiveness of anaplastic thyroid cancer cells by inhibiting Rho geranylgeranylation and RhoA/ROCK signaling 69. Recently, dual effects for lovastatin in anaplastic thyroid cancer, inducing neoplastic proliferation at low doses but antineoplastic effects at high doses, have been reported 50. A positive correlation between signatures associated with ETE and effect of lovastatin (10 μM) was observed in our study. These results suggested a need for the careful evaluation of dose effects before treatment of thyroid cancerpatients with lovastatin. In another example, 6-bromoindirubin-3'-oxime, a pan-JAK inhibitor that could induce apoptosis in humanmelanoma cells 49, was found to have antagonistic effects on genes associated with ETE. Thus, it is possible that these compounds provide an alternative strategy for treatment of patients with a high recurrence risk of PTC.
Dysregulation of non-coding RNAs associated with aggressive behavior of PTC
Non-coding RNAs can have regulatory functions to diversify the genetic output and, due to their high stability and association with the pathogenesis of humancancers, are promising diagnostic and prognostic biomarkers 19-21. They may also function to amplify the malignancy and progression of cancer induced by other genomic alterations 70. Although several lncRNAs are aberrantly expressed in PTC 23-27, their key targets have not yet been identified. In this study, our co-expression network analysis of DEGs and differentially expressed lncRNAs suggested the essential role of lncRNAs in PTC pathogenesis by modulating extracellular matrix and cell adhesion. The survival analysis identified two co-expressed lncRNAs, PACERR and CYP4A22-AS1, which were associated with disease-free survival of thyroid cancerpatients after considering the confounding clinical factors including age, gender, TNM staging, mutation of BRAFV600E, histological subtypes, and history of other malignancies. It has been reported that PACERR could activate the expression of COX-2, a gene whose deregulated expression is linked to progression and outcome of several types of cancers, by occluding repressive NF-kB complexes, and play a role in inflammation and cancer 71. Our study also identified the association of CYP4A22-AS1 with the survival of PTC patients. The expression of CYP4A22-AS1 has been reported to be enriched in nuclear fractions of several cancer cell lines including K562, HelaS3, and HepG2 72, and its expression levels could discriminate clear cell renal cell carcinoma from normal renal tissues 73. These two dysregulated lncRNAs can potentially be used as novel prognostic biomarkers in PTC.
AS and RBPs involved in the carcinogenesis and development of PTC
AS events have been linked to the pathophysiology of many humancancers 56, 57. This study has cataloged numerous AS events in thyroid cancer. The expression of SERPINA1 could be used to identify PTC against benign nodules with 90% accuracy 74. Most encouragingly, in our study, we detected differential exon skipping of SERPINA1 between tumor and normal tissues. Overexpression of the FN1 gene was regarded as an important determinant of thyroid cancer aggressiveness 75. A mutually exclusive exon of this gene was observed in our study. The newly discovered AS events of SERPINA1 and FN1 here have the potential to improve the sensitivity and specificity of these biomarkers. Analysis of binding motifs in the exons and flanking intronic sequences of SE revealed 14 significantly enriched RBPs, including IGF2BP3 and PCBP2. In thyroid tumors, IGF2BP3 could activate IGF2 translation and IGF1 receptor signaling via PI3K and MAPK cascades, as well as promote cell proliferation, invasion, and transformation 76. It is speculated that the alternatively spliced transcripts regulated by RBPs are potentially involved in the carcinogenesis and development of PTC, although further studies are warranted to define the binding activity of these RBPs.In conclusion, we have provided a comprehensive transcriptomic signature associated with the carcinogenesis and aggressiveness of PTC, and present a catalog of 10 lncRNAs and mRNAs for PTC prognosis. The findings in this study provide new insights into the pathogenesis of PTC, and are expected to facilitate the prognosis and development of new therapeutic strategies for the disease.
Authors: Sait Ozturk; Panagiotis Papageorgis; Chen Khuan Wong; Arthur W Lambert; Hamid M Abdolmaleky; Arunthathi Thiagalingam; Herbert T Cohen; Sam Thiagalingam Journal: Proc Natl Acad Sci U S A Date: 2016-01-06 Impact factor: 11.205
Authors: Shihao Shen; Juw Won Park; Zhi-xiang Lu; Lan Lin; Michael D Henry; Ying Nian Wu; Qing Zhou; Yi Xing Journal: Proc Natl Acad Sci U S A Date: 2014-12-05 Impact factor: 11.205
Authors: Damian Szklarczyk; John H Morris; Helen Cook; Michael Kuhn; Stefan Wyder; Milan Simonovic; Alberto Santos; Nadezhda T Doncheva; Alexander Roth; Peer Bork; Lars J Jensen; Christian von Mering Journal: Nucleic Acids Res Date: 2016-10-18 Impact factor: 16.971