Leigha D Rock, Brenda C Minatel, Erin A Marshall, Florian Guisier, Adam P Sage, Mateus Camargo Barros-Filho, Greg L Stewart, Cathie Garnis, Wan L Lam.
Abstract
Head and neck squamous cell carcinoma (HNSCC) has a poor survival rate, mainly due to late-stage diagnosis and recurrence. Despite genomic efforts to identify driver mutations and changes in protein-coding gene expression, developing effective diagnostic and prognostic biomarkers remains a priority to guide disease management and improve patient outcome. Recent reports of previously unannotated microRNAs (miRNAs) from multiple somatic tissues have raised the possibility of HNSCC-specific miRNAs. In this study, we applied a customized in silico analysis pipeline to identify novel miRNAs from raw small-RNA sequencing datasets from public repositories. We discovered 146 previously unannotated sequences expressed in head and neck samples that share structural properties highly characteristic of miRNAs. The combined expression of the novel miRNAs revealed tissue- and context-specific patterns. Furthermore, comparison of tumor with non-malignant tissue samples (n = 43 pairs) revealed 135 of these miRNAs as differentially expressed, most of which were overexpressed or exclusively found in tumor samples. Additionally, a subset of novel miRNAs was significantly associated with HPV infection status and patient outcome. A prognostic model combining novel and known miRNAs was developed (multivariate Cox regression analysis), leading to improved death and relapse risk stratification (log rank p < 1e-7). The presence of these miRNAs was corroborated both in an independent dataset and by RT-qPCR analysis, supporting their potential involvement in HNSCC. In this study, we report the discovery of 146 novel miRNAs in head and neck tissues and demonstrate their potential biological significance and clinical relevance to head and neck cancer, providing a new resource for the study of HNSCC.
Keywords: computational biology; gene expression profiling; head and neck cancer; microRNAs; non-coding RNA
Year: 2019 PMID: 31828039 PMCID: PMC6890850 DOI: 10.3389/fonc.2019.01305
Source DB: PubMed Journal: Front Oncol ISSN: 2234-943X Impact factor: 6.244
Clinicopathological information of the HNSCC patients from TCGA.
| Histology | Malignant | 523 |
| Anatomical site | Oral cavity | 316 (60.4) |
| | Pharynx | 90 (17.2) |
| | Larynx | 117 (22.4) |
| Age | Range | 19–90 |
| | Median | 61 |
| Gender | Male | 382 (73.0) |
| | Female | 141 (27.0) |
| Smoking status | Never smoker | 121 (23.1) |
| | Former smoker | 211 (40.3) |
| | Current smoker | 176 (33.7) |
| | Not determined | 15 (2.9) |
| Disease stage | I | 21 (4.0) |
| | II | 97 (18.5) |
| | III | 105 (20.1) |
| | IVA, IVB, and IVC | 286 (54.7) |
| | Not determined | 14 (2.7) |
| HPV status | Positive | 73 (14.0) |
| | Negative | 40 (7.6) |
| | Not determined | 410 (78.4) |
Information retrieved August 2018 from UCSC Xena.
Column percentage.
Age data missing for one patient.
Determined by p16 testing.
Figure 1. Study flow chart. High-throughput small RNA-sequencing data from head and neck squamous cell carcinoma (HNSCC) (n = 523, dataset A) and matched non-malignant tissue (n = 43, dataset B) were obtained from The Cancer Genome Atlas (TCGA). Raw sequence data (BAM files) were converted into unaligned reads (FASTQ) and input into miRMaster for miRNA detection and quantification. A threshold criterion of ≥1 read per million (RPM) in ≥10% of samples per group was employed. To determine whether these novel sequences have potential biological relevance, group comparison and association analyses were performed. Tissue specificity of the novel candidate sequences was assessed by comparing non-malignant samples (dataset B) with those from 12 other non-malignant tissue types from the TCGA Pan-Cancer Atlas (dataset C) using non-linear t-Distributed Stochastic Neighbor Embedding (t-SNE). Differentially expressed novel miRNAs were detected by comparing tumor and matched non-malignant samples (dataset D). Clinicopathological features of the novel miRNA transcripts (n = 130) that were found to be expressed exclusively in tumor samples (dataset A) were compared. Survival analysis was performed to further characterize the novel sequences. Cox regression analysis showed that candidate novel miRNA sequences behave similarly to known miRNAs and may have prognostic value. Validation was performed on an independent dataset (Gene Expression Omnibus GSE52633) (dataset E) and by performing RT-qPCR of the most relevant miRNA candidates in formalin-fixed paraffin-embedded (FFPE) tissues (dataset F).
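The expression threshold named in the Figure 1 caption (≥1 RPM in ≥10% of samples per group) can be illustrated with a minimal Python sketch. This is not the authors' pipeline or miRMaster's implementation: the function names, the toy read counts, and the choice to keep a sequence when it passes in at least one group are all assumptions made for illustration.

```python
def rpm(counts):
    """Convert one sample's raw read counts to reads per million (RPM)."""
    total = sum(counts.values())
    return {seq: (c * 1_000_000 / total if total else 0.0)
            for seq, c in counts.items()}


def passes_threshold(rpm_samples, seq, min_rpm=1.0, min_fraction=0.10):
    """True if seq reaches min_rpm in at least min_fraction of the samples."""
    n_expressed = sum(1 for s in rpm_samples if s.get(seq, 0.0) >= min_rpm)
    return n_expressed >= min_fraction * len(rpm_samples)


def filter_candidates(groups, min_rpm=1.0, min_fraction=0.10):
    """Keep sequences that pass the threshold in at least one group.

    Assumption: "per group" is read here as "in any single group";
    the source does not spell out this detail.
    """
    kept = set()
    for samples in groups.values():
        rpm_samples = [rpm(s) for s in samples]
        seqs = {seq for s in samples for seq in s}
        kept.update(q for q in seqs
                    if passes_threshold(rpm_samples, q, min_rpm, min_fraction))
    return sorted(kept)


# Hypothetical toy data: 10 tumor and 4 non-malignant samples.
tumor = [{"known": 1_999_997, "cand_a": 2, "cand_b": 1}] + [{"known": 2_000_000}] * 9
normal = [{"known": 1_000_000}] * 4
kept = filter_candidates({"tumor": tumor, "normal": normal})
print(kept)
```

In the toy data, `cand_a` reaches exactly 1 RPM in one of ten tumor samples and is kept, while `cand_b` reaches only 0.5 RPM and is dropped.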
Description of clinical data sets.
| A | HNSCC samples obtained from TCGA ( |
| B | Non-malignant head and neck samples obtained from TCGA ( |
| C | Non-malignant samples from different organs |
| D | Matched HNSCC and non-malignant samples from TCGA ( |
| E | OSCC from the GEO (GSE52633) ( |
| F | FFPE OSCC tissue ( |
| A and B | MiRNA discovery |
| B and C | Tissue specificity |
| D | Differential expression between non-malignant samples and HNSCC |
| A | Association of miRNAs with clinical features |
| A | Survival analysis |
| E | Detection of novel miRNAs in an independent cohort |
| F | Experimental validation of most relevant miRNA by RT-qPCR in FFPE tissues |
HNSCC, head and neck squamous cell carcinoma; TCGA, The Cancer Genome Atlas; OSCC, oral squamous cell carcinoma; FFPE, formalin-fixed paraffin-embedded.
bile duct (n = 9), bladder (n = 19), brain (n = 5), cervix (n = 3), colon (n = 9), kidney (n = 71), liver (n = 47), lung (n = 91), pancreas (n = 4), prostate (n = 52), stomach (n = 45), and thyroid (n = 59).
Figure 2. (A) Venn diagram summarizing the relative proportion of novel vs. previously identified miRNAs expressed to the same levels in the TCGA cohort compared to the current annotation of miRNA repositories. The addition of 146 novel miRNAs to the 583 previously annotated sequences expressed to the same level in the TCGA cohort substantially expands the miRNA transcriptome of head and neck tissues. (B) Venn diagram of novel miRNAs identified in head and neck squamous cell carcinoma tumor tissue (n = 523) and non-malignant tissue (n = 43). Our results revealed 146 novel miRNA candidates; 80 and 16 were observed exclusively in non-malignant and tumor tissues, respectively, with 50 miRNA candidates detected in both groups. (C) Circos plot displaying the genomic localization of the novel miRNAs. The outermost circle displays the human autosomal chromosomes, and the inner layers show the expression fold changes (logged) of the novel miRNAs in head and neck squamous cell carcinoma tumors in relation to matched non-malignant tissue [created by ClicO FS, an interactive web-based service of Circos (42)].
Figure 3. Tissue-specific expression patterns of unannotated miRNA transcripts. t-Distributed Stochastic Neighbor Embedding (t-SNE) analysis shows the tissue specificity of non-malignant head and neck tissue (n = 43) compared to other non-malignant tissues from The Cancer Genome Atlas (TCGA): bile duct (n = 9), bladder (n = 19), brain (n = 5), cervix (n = 3), colon (n = 9), kidney (n = 71), liver (n = 47), lung (n = 91), pancreas (n = 4), prostate (n = 52), stomach (n = 45), and thyroid (n = 59).
Figure 4. Unsupervised hierarchical clustering analysis comprising 39 HNnov-miRs expressed in both tumors and non-malignant tissue. The dendrogram shows two clusters, the first enriched in non-neoplastic samples (novel miRNA expression predominantly low) and the second in tumor samples (novel miRNA expression predominantly high). Heatmap annotation bars show some of the clinical parameters associated with each tissue sample, including gender, disease site and stage, smoking history, and tissue type.
Figure 5. Expression of HNnov-miR-2 and HNnov-miR-30 is significantly associated with negative HPV status in tumors (Mann-Whitney U test).
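As a rough illustration of the Mann-Whitney U test named in the Figure 5 caption, the sketch below computes the U statistic from average ranks and a two-sided p-value from the normal approximation. It is a simplified stand-in, not the authors' analysis: the expression values are hypothetical, and the approximation omits tie and continuity corrections, so p-values from a statistics package will differ for small samples.

```python
import math


def average_ranks(values):
    """Return 1-based ranks, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def mann_whitney_u(x, y):
    """Return (U, two-sided p) using the normal approximation.

    Simplification: no tie correction and no continuity correction.
    """
    n1, n2 = len(x), len(y)
    ranks = average_ranks(list(x) + list(y))
    u1 = sum(ranks[:n1]) - n1 * (n1 + 1) / 2
    u = min(u1, n1 * n2 - u1)
    mu = n1 * n2 / 2
    sigma = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    z = (u - mu) / sigma
    p = math.erfc(abs(z) / math.sqrt(2))  # equals 2 * (1 - Phi(|z|))
    return u, p


# Hypothetical expression values (RPM) for one novel miRNA.
hpv_negative = [4.0, 5.0, 6.0]
hpv_positive = [1.0, 2.0, 3.0]
u_stat, p_value = mann_whitney_u(hpv_negative, hpv_positive)
print(u_stat, p_value)
```

With fully separated groups, as in the toy values, U is 0, the smallest value it can take.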