Literature DB >> 27035672

Characterization of long non-coding RNA expression profiles in lymph node metastasis of early-stage cervical cancer.

Chunliang Shang1, Wenhui Zhu2, Tianyu Liu1, Wei Wang1, Guangxin Huang3, Jiaming Huang1, Peizhen Zhao4, Yunhe Zhao1, Shuzhong Yao1.   

Abstract

Pelvic lymph node metastasis (PLNM) is an independent prognostic parameter and determines the treatment strategies of cervical cancer. Increasing evidence indicates that long non-coding RNAs (lncRNAs) play a crucial role in the process of tumor biological functions. This study aimed to mine lymph node metastasis-associated lncRNAs and investigate their potential pathophysiological mechanism in cervical cancer lymph node metastasis. We applied the lncRNA-mining approach to identify lncRNA transcripts represented on Affymetrix human genome U133 plus 2.0 microarrays from Gene Expression Omnibus (GEO) and then by validation in clinical specimens. The biological role and molecular mechanism of these lncRNAs were predicted by bioinformatic analysis. Subsequently, a receiver operating characteristic (ROC) curve and survival curve were conducted to evaluate the diagnostic and prognostic value of candidate lncRNAs. In total, 234 differentially expressed lncRNAs were identified to significantly associate with pelvic lymph node metastasis in early-stage cervical cancer. Our qRT-PCR results were consistent with the mining analysis (P<0.05). The functional enrichment analysis suggested that these lncRNAs may be involved in the biological process of lymph node metastasis. The ROC curves demonstrated satisfactory discrimination power of MIR100HG and AC024560.2 with areas under the curve of 0.801 and 0.837, respectively. Survival curve also indicated that patients with high MIR100HG expression had a tendency of poor prognosis. This is the first study to successfully mine the lncRNA expression patterns in PLNM of early-stage cervical cancer. MIR100HG and AC024560.2 may be a potential biomarkers of PLNM and these lncRNAs may provide broader perspective for combating cervical cancer metastasis.

Entities:  

Mesh:

Substances:

Year:  2016        PMID: 27035672      PMCID: PMC4869942          DOI: 10.3892/or.2016.4715

Source DB:  PubMed          Journal:  Oncol Rep        ISSN: 1021-335X            Impact factor:   3.906


Introduction

Uterine cervical cancer is the third most common cancer and the second leading cause of cancer death in women between 20 and 39 years of age worldwide (1). The National Comprehensive Cancer Network (NCCN) (the version 1.2014) guidelines recommend the primary treatment for early-stage cervical cancer (FIGO IB and IIA) as either surgery or chemo radiotherapy. The primary chemoradiotherapy is recommended as the preferred treatment for patients with high-risk positive lymph node (2). However, accurate and efficient clinical detection methods for lymph node metastasis is difficult, so many patients receive initial unnecessary surgery with adjuvant chemoradiotherapy (3). The combination of surgery and chemoradiotherapy carries the worse morbidity, particularly long-term complications (4). Furthermore, pelvic lymph node metastasis (PLNM) has been identified as the strongest key prognostic parameter in cervical cancer, particularly early-stage cervical cancer (5); therefore, effective PLNM detection is essential to select an optimal therapy. In clinical practice, imaging diagnostics, including magnetic resonance imaging (MRI), positron emission tomography (PET), and computed tomography (CT) scans, are conventional methods of PLNM detection before treatment. Nevertheless, additional statistical analyses demonstrate that these methods have poor sensitivity for detecting lymph node metastasis (6,7). Recently, Sentinel LN (SLN) biopsy in early-stage cervical cancer yields high diagnostic value for the status of lymph node (8), but this still need general anesthesia. Thus, a non-invasive and more accurate evaluation method is urgently needed. Rapid advances in molecular biotechnology have increased attention to biomacromolecules or small RNAs, including microRNAs and short interfering RNAs. High-throughput technologies have been devoted to identifying lymph node-associated biomarkers at genomic levels (9) and the protein (10). Several recent studies have evaluated the gene expression profiles of PLMN in uterine cervical cancers (11–13) but neglected long non-coding RNAs (lncRNAs), which are longer than 200 nt and do not encode proteins. LncRNAs have emerged as potentially powerful regulators involved in various biological processes. Accumulating evidence indicates that aberrantly expressed lncRNAs participate in the carcinogenesis and development of malignant tumors through binding proteins or modulating other short regulatory RNAs (14). In addition, lncRNAs have been identified as oncogenic RNAs that promote tumor cell invasion in uterine cervical cancer (15) and as independent predictors of overall survival (16,17). Hence, lncRNAs are expected to be excellent biomarkers for PLNM in cervical cancer. In this present study, we used a data mining method and bioinformatics analysis to determine the PLNM-associated lncRNA profiles, which was fortuitously represented on the commonly used microarray platform. Using bioinformatics software and methods, we performed an initial exploration of the potential functional enrichment and pathway mechanisms of these lncRNAs. The candidate lncRNAs show promise to be biomarkers of diagnosis and prognosis in cervical cancer. This study provides a new frontier for uncovering the potential metastasis mechanisms to lymph node in uterine cervical cancer.

Materials and methods

GEO gene expression data

The raw PLNM gene expression data and corresponding related clinical parameters were downloaded from the publicly available GEO (http://www.ncbi.nlm.nih.gov/geo/) including GSE26511 and partial GSE2109 data. GSE26511 comprises 20 cervical cancer specimens without PLNM and 19 with PLNM. For these 39 samples, primary treatment consisted of type 3 radical hysterectomy and pelvic lymph node dissection (18). GSE2109 summarized 2,158 samples included in the Expression Project for Oncology (exp0). Based on TNM stage and primary tumor site, we selected 20 cervical cancer specimens from GSE2109, including 15 without PLNM and 5 with PLNM. All samples in these two panels were hybridized to Affymetrix human genome U133 plus 2.0 microarrays.

GeneChip Probe Re-annotation

Based on the lncRNA classification pipeline constructed in a previous study (19), we identified a number of lncRNAs represented on the Affymetrix microarrays. First, the latest version of NetAffx Annotation File (HG-U133_Plus_2 Annotations, CSV format, Release 34, 30 MB, 10/24/13) was obtained from the Affymetrix official website. This annotation file was mapped to the HG-U133_Plus_2 probe sets ID. Second, for the probe sets from the Refseq database, those IDs beginning with 'NR' were retained, and transcript IDs labeled with 'NP' were deleted. For the probe sets from the Ensembl database, the online software BioMart was applied to convert Affymetrix microarray IDs to Ensembl IDs together with the corresponding gene type. We only retained genes annotated as 'lincRNA', 'sense_intronic', 'processed_transcript', 'anti-sense', 'sense_overlapping', '3prime_overlapping_ncrna', or 'misc_RNA'. Next, based on the above two steps, we removed probe set IDs annotated as 'microRNA', 'snoRNAs', ' pseudo-genes' and other small RNAs.

Differential expression analysis

The expression level of lncRNAs is lower than that of protein-coding RNAs, and the robust multichip average (RMA) has higher detection efficiency for lncRNAs (20). Therefore, these raw CEL files were background-adjusted, normalized, and log-transformed using RMA rather than Microarray Analysis Suite 5.0 (MAS 5.0) using RMAexpress software (Windows version 1.1.0, written by Ben Bolstad). The differentially expressed probe sets were identified using a parametric two-sample t-test (with a random variance model) with a significance threshold of P<0.05 and validated using permutation testing across samples in BRB-Array Tools v4.4.0 Beta 1 (http://linus.nci.nih.gov/BRB-ArrayTools.html). We also entered the probe set data into Cluster3.0&TreeView (originally developed by Michael Eisen, Stanford University) to process the Hierarchical Clustering Analysis. To further verify the outcome of the microarray analysis, we adopted the training-validation strategy explored by Michiels et al (21) to classify the microarray-based datasets. This method indicated that the percentage of misclassification would decrease as the number of samples in the training set increased. Thus, we defined the data set GSE26511 as the training group to identify the differentially expressed lncRNAs and GSE2109 as the validation group to evaluate misclassification.

Construction of the lncRNA-mRNA coexpression network

The Pearson correlation coefficient (PCC) and P-value were considered in the construction of an lncRNA-mRNA co expression network (22). In view of the above hierachical cluster analysis resulted from GSE26511 and significant changes, we selected top 20 differential expression lncRNAs and 189 mRNAs (fold change >1.5) to form the coexpression network based on the Pearson correlation coefficient (PCC, PCC≥0.60, P<0.05). The PCC was calculated using the coding and non-coding RNAs. For the same mRNA with different probe sets or transcripts, we used the mean value as the final gene expression value. The lncRNA-mRNA coexpression network was created using Cytoscape software (v2.6.3).

GO and KEGG pathway analysis of lncRNA-coexpressed mRNAs

Gene ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway analyses were performed using Molecule Annotation System (MAS) V3.01 (CapitalBio Corp., Beijing, China). MAS provides a series of comprehensive functional annotation tools to elucidate the biological meaning of differentially expressed genes (23). GO is the product of a collaborative effort to address the need for consistent descriptions of gene products among several databases and covers the following three domains: Biological Process, Cellular Component and Molecular Function. The Fisher's exact test P-value and EASE-score were used to denote the significance of the GO enrichment terms correlated with the conditions, and the Fisher P-value was used for pathways. The lower the P-value, the more significant the GO term or pathway (24).

Gene set enrichment analysis (GSEA)

Gene set enrichment analysis (GSEA) is used to interpret gene expression data by determining the statistical significance of differences in predefined gene sets between biological states (25). In addition, GSEA can be used to find pathways that correlate to the expression of the gene. To probe the biological mechanisms of the differentially expressed lncRNAs MIR100HG and AC024560.2, all coding-gene mRNAs and these two lncRNAs were separately used to generate the expression data set. With the help of GSEA software V2.1.0 (Broad Institute, MIT, Cambridge, MA, USA), we constructed a 32,619 (genes) x39 (samples) expression matrix to perform GSEA. The predefined gene set 'c2.all.v4.0.symbols.gmt' is one of 7 major collections from the Molecular Signatures Database (MSigDB). A normalized enrichment score (NES) was calculated as the primary statistic of GSEA. In addition, the chromosome location of the lncRNAs MIR100HG and AC024560.2 was visualized using the UCSC Genome Browser (http://genome.ucsc.edu/).

Patients and tumor specimens

A total of 35 samples of fresh cervical cancer tissues were randomly collected from patients who underwent surgery at the First Affiliated Hospital of Sun Yat-sen University from May 2013 to December 2014 and provided written informed consent. All surgical specimens were immediately frozen in liquid nitrogen and stored at −80°C until RNA extraction. The clinical study was approved by the Medical Ethics Committees at the First Affiliated Hospital of Sun Yat-sen University. All samples were confirmed pathologically, and the clinical characteristics are presented in Table I.
Table I

Clinicopathological characteristics of early-stage cervical cancer patients included in the qRT-PCR analysis.

No.Age (years)FIGO stageTumor size (cm)Number of Positive PLNNumber of removed PLNLVSIDifferentiationPathological typeStromal invasion
Early-stage cervical cancer with PLNM
1  58Ib11.5138NegativeG2SCC>1/2
2  50Ib22   521NegativeG2SCC<1/2
3  52Ib11   156PositiveG1SCC<1/2
4  42IIa25   219NegativeG1SCC>1/2
5  35IIa25   215NegativeG1SCC>1/2
6  59IIa12   221NegativeG2SCC>1/2
7  62IIa13   322NegativeG1SCC>1/2
8  46Ib14   338PositiveG1SCC>1/2
9  48IIa14   136NegativeG1SCC>1/2
1058IIa13   153NegativeG1SCC>1/2
1135IIa12   172PositiveG1SCC>1/2
1243IIa12   221NegativeG2SCC<1/2
1343Ib13   217PositiveG1SCC<1/2
1449Ib25   416PositiveG2SCC>1/2
1534IIa11   217PositiveG1SCC>1/2
Early-stage cervical cancer without PLNM
1  50Ib13   017NegativeG1SCC<1/2
2  56Ib11   033NegativeG3SCC>1/2
3  44Ib13   0  9NegativeG2SCC>1/2
4  48Ib21.3041NegativeG2SCC<1/2
5  46IIa12.5044NegativeG1SCC>1/2
6  54IIa12.5036NegativeG2SCC>1/2
7  37Ib12   020NegativeG2SCC<1/2
8  39Ib13   022NegativeG1SCC<1/2
9  47Ib12   038NegativeG1SCC>1/2
1050IIa12   021NegativeG2SCC>1/2
1152Ib13   022NegativeG2SCC>1/2
1236Ib25   021NegativeG1SCC<1/2
1343Ib12   016NegativeG1SCC<1/2
1440IIa12.5012NegativeG1SCC>1/2
1541Ib13   019NegativeG1SCC<1/2
1640Ib12   025NegativeG2SCC>1/2
1740IIa12   015NegativeG1SCC>1/2
1838Ib13   0  5NegativeG1SCC<1/2
1952Ib14   020NegativeG1SCC>1/2
2036Ib12   015NegativeG2SCC<1/2

RNA extraction and quantitative real-time polymerase chain reaction (qRT-PCR) validation

Total RNA from fresh frozen tissues was extracted using RNAiso Plus reagent (Takara, Dalian, China), and complementary DNA was reverse-transcribed using PrimeScript RT Master Mix (Takara) according to the manufacturer's instructions. Quantitative real-time polymerase chain reaction (qRT-PCR) was performed with SYBR Premix Ex Taq (Takara). All qRT-PCRs were performed in a 7500 Fast Real-time PCR System (Applied Biosystems, Carlsbad, CA). The RNA primers used in qPCR are presented in Table II.
Table II

RNA primer sequences used in our study.

PrimerSequence (5′ to 3′)
LINC01139F: TTCTCTCACCCTTCAAACAGC
R: ACCAAAGATGTCGCAGGACT
MIR100HGF: GGCGACATCAGACAGACAGA
R: AGGACCAGCTGAAAGGAACA
AC024560.2F: TGGGTCGCTCTGTATCTCTG
R: CGGTGGCTGTGAGTATGAAG
LINC01503F: TGGATTTTCATGCCTGCTG
R: GGCTGCATTACCAGAAAGGT

F, Forward; R, Reverse.

TCGA cohorts

In order to indicate the prognosis significance of MIR100HG in early-stage cervical cancer, we used the MIR100HG expression data and clinical information from the Cancer Genome Atlas (TCGA, https://tcga-data.nci.Nih.gov/tcga/). Patients without intact follow-up and pathological data were excluded. The clinical data of all 131 patients from TCGA cohort are shown in Table III.
Table III

Clinical parameters of TCGA cohort enrolled in our study.

ParametersData
Total number of patients131
Age (years)
 Median45
 Range20–80
FIGO stage
 IA3
 IB111
 IIA17
Overall survival
 Deaths23 (17.5%)
Follow-up (years)
 Median0.65
 Range0–11.6

Statistical analysis

Statistical analyses were performed using IBM SPSS Statistics 19.0 for Windows (Released 2010; IBM Corp., Armonk, NY, USA). Receiver operating characteristic (ROC) analysis was performed to assess the sensitivity and specificity of the measured markers. The cut-point of MIR100HG expression was defied as the median. Kaplan-Meier and the two-sided log-rank test were used to calculate the survival curves. Significance was defined at a P-value of <0.05.

Results

Gene expression data characteristics

The GSE26511 and GSE2109 series were used in our study. To ensure consistent clinical data of patients included in these two series, we listed the age and FIGO stage of patients in Table IV. Primary analysis was focused on GSE26511, which was a published dataset consisting of the largest number of specimens. The entire workflow of this process is presented in Fig. 1A.
Table IV

Characteristics of the gene expression data covariates (all squamous cell carcinoma).

GSE26511Pelvic lymph nodesAge at diagnosisFIGO stageGSE2109Pelvic lymph nodesAge at diagnosisFIGO stage
GSM651831Negative56.41b1GSM46919Negative60–701b1
GSM651832Negative45.81b1GSM46942Negative30–401b1
GSM651833Negative49.51b1GSM117626Negative60–701b1
GSM651834Negative34.72a  GSM152580Negative20–301b2
GSM651835Negative55.51b1GSM152635Negative20–301b1
GSM651836Negative38.51b1GSM152667Negative30–401b1
GSM651837Negative34.91b1GSM152719Negative30–401b1
GSM651838Negative47.41b1GSM179853Negative30–401b1
GSM651839Negative42.31b1GSM179907Negative50–601b1
GSM651840Negative35.81b2GSM179956Negative30–401b1
GSM651841Negative51.62a  GSM203742Negative50–601b1
GSM651842Negative72  1b2GSM325809Negative40–491b1
GSM651843Negative71  1b2GSM325835Negative30–391b1
GSM651844Negative35.91b2GSM89042Negative40–501b1
GSM651845Negative68.92a  GSM102476Negative40–501b1
GSM651846Negative47.41b2GSM76614Positive 40–501b1
GSM651847Negative31.51b1GSM89050Positive 30–401b1
GSM651848Negative72.72a  GSM152723Positive 50–602a  
GSM651849Negative39.91b1GSM152751Positive 40–501b2
GSM651850Negative50.71b1GSM203622Positive 50–601b2
GSM651851Positive 56.21b1
GSM651852Positive 29.11b1
GSM651853Positive 32.22a  
GSM651854Positive 60.61b1
GSM651855Positive 49.92a  
GSM651856Positive 34.91b2
GSM651857Positive 32.71b2
GSM651858Positive 40.41b1
GSM651859Positive 48.51b2
GSM651860Positive 37.41b1
GSM651861Positive 37  1b2
GSM651862Positive 32  1b1
GSM651863Positive 37.41b1
GSM651864Positive 45.51b2
GSM651865Positive 72.51b1
GSM651866Positive 42.31b1
GSM651867Positive 46.31b1
GSM651868Positive 34.21b2
GSM651869Positive 50.52a  
Figure 1

Work flow for re-annotation and hierarchical clustering. (A) Graphic diagram of the work flow. (B) Heatmaps were produced from the hierarchical cluster analysis to show differentially expressed lncRNAs. The color scale on the right shows the relative expression levels of lncRNAs across all samples: red represents expression >0, and green represents expression <0. The analysis was first performed using the training group (GSE26511) and (C) then validated with the validation group (GSE2109).

Differentially expressed lncRNA profile

We identified 3,432 probe sets (matching with 2,803 lncRNAs) via re-annotation of the Affymetrix human genome U133 plus 2.0 microarrays based on the Refseq and Ensembl databases. For the GSE26511 training group, 249 probe sets (matching with 234 lncRNAs) that were differentially expressed in cervical cancer specimens of varying lymph node metastasis status were identified using the method described above (data not shown). Of these 249 probe sets, 40 probe sets (31 lncRNAs) were upregulated in cervical cancer tissues without PLNM compared to the lymph node metastasis group, and 209 probe sets (203 lncRNAs) were downregulated. Hierarchical clustering maps of all samples from the GSE26511 training group and the GSE2109 validation group were constructed from these differentially expressed lncRNAs. GSE2109 was used to reduce the error due to the collection of specimens from cervical cancer tissues with or without lymph node metastasis (Fig. 1B and C).

LncRNA classification and distribution

The genetic location of biomolecules plays an important role in diverse biological and molecular functions (14). Using the UCSC genome browser and Ensembl database, the differentially expressed lncRNAs were characterized as lincRNA, 3prime_overlapping_ncRNA, antisense, processed_transcript, and sense_intronic.sense_overlapping based on the correlation between lncRNAs and their associated coding genes. A total of 100 and 17 lncRNAs were included in the up- and down-regulated lncRNA classifications, respectively. Each chromosome had different numbers of up- and down-regulated lncRNAs (Fig. 2E and F).
Figure 2

Results of Molecule Annotation System (MAS) analysis. (A–C) GO terms identified in the GO analysis for correlated coding genes in the categories biological process, cellular component, and molecular function with 10 minimum P-value. (D) Biological pathways from KEGG analysis with 10 minimum P-value. (E and F) Chromosome distribution and classification of differentially expressed lncRNAs.

The overview of mRNA profile

Similar to the lncRNA analysis method, 2,980 probe sets (matching 228 lncRNAs) were identified as differentially expressed in cervical cancer specimens with PLNM. Among these mRNAs, 1,641 probe sets (matching 1,170 mRNAs) were upregulated in cervical cancer tissues without PLNM compared to the lymph node metastasis group, and 1,339 probe sets (matching with 1,111 mRNAs) were downregulated. In the subsequent analysis, we selected 335 mRNAs with a differential expression fold change >1.5 to construct a clear and significant network.

LncRNA-mRNA coexpression network

Although lncRNA functions have been implicated in many diseases, the biological functions of a large proportion of lncRNAs remain unknown. The tumor metastasis is a multifactor, multistep and multigene interactive process. Affymetrix human genome U133 plus 2.0 microarrays provided the expression levels of lncRNAs and mRNAs, and we constructed an lncRNA-mRNA coexpression network to interpret the potential biological roles of PLNM-associated lncRNAs in early-stage cervical cancer (26). The coexpression network revealed that one mRNA could target several lncRNAs and vice versa, suggesting a potential role of lncRNA and mRNA interactions in the process of lymph node metastasis in cervical cancer (Fig. 3).
Figure 3

LncRNA-mRNA coexpression network. The round nodes represent mRNA, and the square nodes represent lncRNA. A positive correlation is represented by a solid line, and a negative correlation is represented by a dotted line between the lncRNA and mRNA nodes.

GO and pathway analysis

To further evaluate the differentially expressed lncRNAs, we performed GO and pathway analyses of mRNAs co-expressed with lncRNAs. For biological process (Fig. 2C), i) signal transduction; ii) development; iii) regulation of transcription, DNA dependence; iv) cell adhesion; v) transcription; vi) cell differentiation; vii) cell cycle arrest; viii) positive regulation of cell proliferation; ix) nervous system development; x) and the Wnt receptor signaling pathway through β-catenin were involved. For cellular component and molecular function, nucleus and protein binding were included separately (Fig. 2A and B). KEGG pathway analysis indicated that the mRNAs co-expressed with the lncRNAs were mainly involved in the following: i) Gap junction; ii) Axon guidance; iii) melanoma; iv) Regulation of actin cytoskeleton; v) Phosphatidylinositol signaling system; vi) Adherens junction; vii) Prostate cancer; viii) MAPK signaling pathway; ix) Wnt signaling pathway; x) and Purine metabolism (Fig. 2D). The gap junction pathway plays a physiologically relevant regulator role in cervical tumor cells (27). We also identified the Wnt receptor signaling pathway through β-catenin, in agreement with the results of Noordhuis et al (18). These data imply that PLNM-lncRNAs may contribute to the lymph node metastasis process through the above mechanisms and pathways.

Bioinformatics analysis of lncRNA-MIR100HG and AC024560.2

To identify lncRNAs that play an important role in lymph node metastasis in early-stage cervical cancer, we analyzed the detailed information for two of the top lncRNAs from the upregulated group and downregulated groups (Table V). The lncRNA linc01503 had 4 transcripts, increasing the uncertainty of the primer and transcription products. MIR100HG was co-expressed with more coding genes than linc01139 in the upregulated group. In light of these findings, we selected MIR100HG and AC024560.2 as promising lncRNAs for further investigation.
Table V

The top four differentially expressed lncRNAs in cervical cancer specimens with lymph node metastasis.

ProbeSetGene symbolRegulation in N+Fold-changeParametric P-valueCo-expression coding genesTranscriptsBiotypebpChr
235599_atLINC01139Down2.050.0001141141lincRNA1540Chr1
225381_atMIR100HGDown1.96  0.002493471sense_overlapping3082Chr11
238804_atAC024560.2Up1.610.000290931lincRNA1471Chr3
229296_atLINC01503Up1.500.009936354lincRNA901Chr9
The mir-100-let-7a-2 cluster host gene, MIR100HG, is located on chromosome 11q24.1 and is a regulator of hematopoiesis and oncogenes in the development of myeloid leukemia (28). We performed GSEA using the expression level of MIR100HG and the entire mRNA expression dataset. Based on the GSEA results, we identified 1,127 and 2,362 gene sets that were positively or negatively associated, respectively, with MIR100HG among the 3,489 'curated gene sets'. The curated gene sets were collected from various sources such as online pathway databases, publications in PubMed, and knowledge of domain experts. Gene sets with the highest NES in the positive and negative correlation groups included RICKMAN_METASTASIS_DN and, DELYS_THYROID_ CANCER_DN, respectively (Fig. 4A and B). We also estimate that MIR100HG participates in the gap junction pathway and TGF-β pathway, with NES values of 1.6079853 and −2.2482386, respectively (Fig. 4C and D). Disruption of the TGF-β/Smad signaling pathway may be conducive to the malignant progression of cervical dysplasia in human cervical cancer (29). These analysis results are consistent with our earlier data and further indicate that MIR100HG may participate in the regulation of lymph node metastasis in early-stage cervical cancer through several pathways.
Figure 4

Bioinformatics analysis of MIR100HG. (A, negatively; B–D, positively) Identification of gene sets enriched in phenotypes correlated with MIR100HG by GSEA.

AC024560.2 is an lncRNA located on chromosome 3 and is described as Homo sapiens long non-coding RNA OTTHUMT00000340266.1. We used the expression level of AC024560.2 in GSEA using the entire mRNA expression dataset. GSEA resulted in the identification of 2,823 and 666 gene sets that were positively or negatively associated, respectively, with AC024560.2 among the 3,489 'curated gene sets'. Gene sets with the highest NES in the positive and negative correlation groups included WILLERT_WNT_SIGNALING (NES=1.9029536) and KEGG_LINOLEIC_ACID_METABOLISM (NES= −1.7962813), respectively (Fig. 5A and B). A review of the literature revealed that the activation of the Wnt/β-catenin pathway promotes proliferation and tumor formation in cervical cancer cells (30), consistent with the GSEA report.
Figure 5

Bioinformatics analysis of AC024560.2. (A, negatively; B, positively) Identification of gene sets enriched in phenotypes correlated with AC024560.2 by GSEA.

Clinical relevance

To validate the results of data mining and determine the clinical relevance of lncRNA dysregulation, we detected the expression levels of the above four lncRNAs in 35 clinical tissues with and without PLNM by qRT-PCR (Fig. 6). The qRT-PCR result was consistent with our analysis, in that expression of all 4 lncRNAs had statistical difference with the same trend (P<0.05). Next, we used existing data to access the discriminatory power for lncRNA MIR100HG and AC024560.2 by constructing ROC curves. Areas under ROC curves of two lncRNA signatures were 0.801 and 0.837, respectively (Fig. 7A and B). This suggested that MIR100HG and AC024560.2 achieve a fine diagnostic accuracy in diagnosing lymph node metastasis of cervical cancer. According to the TCGA cohort, the Kaplan-Meier analysis demonstrated that the patients with lncRNA MIR100HG high expression had poor prognosis (Fig. 7C). The prognosis value of AC024560.2 is not shown due to the lack of exact genetic match information in TCGA cohort.
Figure 6

Expression patterns of candidate lncRNAs in early-stage cervical cancer. LN+, cervical cancer tissues with PLNM; LN−, cervical cancer tissues without PLNM. The expression levels of MIR100HG (A), AC024560.2 (B), LINC01139 (C) and LINC01503 (D) were significantly different.

Figure 7

The discriminatory power of two candidate lncRNAs in differentiating PLNM and Kaplan-Meier estimates of overall survival in early-stage cervical cancer. (A) The ROC curve of MIR100HG with AUC=0.801. (B) The ROC curve of AC024560.2 with AUC=0.837. (C) MIR100HG expression and overall survival (log-rank tests, P=0.18). The tick marks on the Kaplan-Meier curves represent the censored subjects.

Discussion

PLNM has critically significant implications for individual therapy in early-stage cervical cancer. The evaluation of PLNM status using an accurate, noninvasive method is a major focus of research. As part of the development of a molecular diagnosis for PLNM, various types of proteins and microRNAs that participate in the regulation of PLNM through various pathways have been identified. However, the clinical application of these molecules has been limited by sensitivity, sample size, regions and other factors. LncRNAs are emerging as crucial regulators of genomic activity and the expression of protein-coding genes and other non-coding transcripts, including microRNAs (14). A role of lncRNAs in oncogenesis, tumor progression, tumor cell apoptosis and cell period arrest has been confirmed in several human cancers (31). Therefore, lncRNA function may be relevant to predicting PLNM. The present study is the first to comprehensively investigate the lncRNA expression profile associated with PLNM in early-stage cervical cancer patients. Although gene expression chip and protein microarrays have been widely used by researchers, the design of lncRNA expression microarrays is not mature. Therefore, mining lncRNA information from the most commonly used commercial microarrays in human cancer profiling, including the Affymetrix HG-U133 Plus 2.0 array, is an important supplement for this field. This approach is accurate, feasible, and low cost and has been adopted in several studies (19,32). Herein, we used this method for reference and successfully mined the ideal lncRNA expression profile associated with PLNM. After screening, 2,803 lncRNA transcripts were filtered from the Affymetrix microarray, and 234 lncRNAs that were significantly associated with pelvic lymph node metastasis were identified. We validated these results by qRT-PCR. Using the lncRNA-mRNA CNC network and molecular analysis system, we assessed the major biological functions and molecular mechanisms in which PLNM-associated lncRNAs might be involved. lncRNA MIR100HG and AC024560.2 were further analyzed based on their location, co-expression with coding genes, gene set enrichment and clinical discriminatory power. In our MAS report, the enrichment score of gap junctions was highest. Gap junctions are a specialized intercellular connection between two cells composed of proteins from the connexin family in vertebrates. Gap junctions allow various molecules, ions and electrical impulses to directly pass through the channel (33). The relationship between the aberrant expression of the gap junction protein connexin and lymph node metastasis has been verified for several cancers, including ovarian adenocarcinoma, human ductal breast cancer, colorectal cancer, and oral squamous cell carcinoma. Some membrane connexins are independent prognostic factors in various cancers. In uterine cervical cancers, some connexin proteins that mediate gap junction intercellular communication have been associated with carcinogenesis and tumor progression (34). The expression levels of gap junction beta-3 protein (GJA1) and gap junction alpha-10 protein (GJA10) were also significantly different in our mRNA analysis (P<0.05). These findings provide a theoretical basis to further explore this biological pathway. The lncRNA MIR100HG was originally identified as highly expressed in acute megakaryoblastic leukemia (AMKL). MIR100HG acts as a mediator of hematopoiesis and oncogenes in the progression of AMKL, an aggressive form of hematological cancer (28). Moreover, there is an intronic coding region (BLID) in MIR100HG gene, which functions as a proapoptotic molecule through a caspase-dependent mitochondrial pathway of cell death (35). This activity provides a starting point to explore the expression pattern of MIR100HG in cervical cancer. GSEA revealed that the gene set RICKMAN_METASTASIS_UP had a higher normal enrichment score and positive correlation with the profile. This result confirms the key value of MIR100HG in cancer lymph node metastasis. Furthermore, MIR100HG is closely correlated with the gap junction and TGF-β pathways. Most of the mRNAs that were co-expressed with MIR100HG participate in the gap junction pathway. The TGF-β pathway is an important PLNM-associated pathway in cervical cancer and was analyzed by Noordhuis et al (18). Thus, our findings are consistent with previous research results. We explained the biological pathway of PLNM in cervical cancer from a novel molecular mechanism perspective and established a more comprehensive understanding based on molecular profile information. lncRNA-AC024560.2 is a novel lncRNA that has not been previously associated with cancer. GSEA indicated an association of AC024560.2 with the linoleic acid metabolism pathway. Downregulation of peroxisome proliferator-activated receptors (PPARs) acting as nuclear receptors for linoleic acid metabolites participating in the apoptosis signaling pathway in colorectal cancer was reported as early as 2003 (36). The role of the Wnt signaling pathway, which is related to AC024560.2, was reported previously in cancer (18). Therefore, as regulators of biological function for certain proteins and microRNAs, these lncRNAs may play an important and valuable role in PLNM of early-stage cervical cancer. To increase the clinical significance of our findings, we analyzed the diagnostic capacity of MIR100HG and AC024560.2 using ROC curves. Our results provide strong evidence for the prediction of lymph node status in cervical cancer using these two lncRNAs. However, there are limitations to our study. First, the sample size of the GSE2109 validation group was small, and thus hierarchical clustering was less obvious. Second, the lncRNAs selected here might not represent the entire lncRNA profile involved in PLNM, because the Affymetrix human genome U133 plus 2.0 microarrays did not include all the lncRNAs present. Third, only 131 patients of early-stage cervical squamous cell carcinoma from TCGA cohort were accorded with the inclusive criteria, so that the difference between survival curves was not significant. In addition, only one pathological type of cervical cancer (squamous cell cancer) was studied because of the limitation of GEO data. Finally and unfortunately, we had no more experimental evidence to further prove our analysis report. The focus of our study was the value of bioinformatics analysis in addressing important clinical topics and discovering potential role of lncRNAs in lymph node metastasis. In summary, we successfully identified 234 differentially expressed PLNM-associated lncRNAs in early-stage cervical cancer using the data mining method. The qRT-PCR was carried out to further detect the lncRNA expression patterns in clinical cervical cancer tissues. Using the LncRNA-mRNA Coexpression network, we detailed the possible function of these lncRNAs from different perspectives, including molecular mechanism and biological pathways. Two promising lncRNAs MIR100HG and AC024560.2 were evaluated using GSEA reports and other databases to uncover their location, biological function and discrimination power. Our results fully revealed the significance of bioinformatics in analyzing clinical issues and will serve as a guide for future research. Our study also increases the understanding of lncRNAs in pathogenesis of lymph node metastasis in early-stage cervical cancer and may be a reference basis for the treatment of cervical cancer metastasis.
  36 in total

1.  Gene ontology: tool for the unification of biology. The Gene Ontology Consortium.

Authors:  M Ashburner; C A Ball; J A Blake; D Botstein; H Butler; J M Cherry; A P Davis; K Dolinski; S S Dwight; J T Eppig; M A Harris; D P Hill; L Issel-Tarver; A Kasarskis; S Lewis; J C Matese; J E Richardson; M Ringwald; G M Rubin; G Sherlock
Journal:  Nat Genet       Date:  2000-05       Impact factor: 38.330

2.  BRCC2, a novel BH3-like domain-containing protein, induces apoptosis in a caspase-dependent manner.

Authors:  Constantinos G Broustas; Prafulla C Gokhale; Aquilur Rahman; Anatoly Dritschilo; Imran Ahmad; Usha Kasid
Journal:  J Biol Chem       Date:  2004-04-06       Impact factor: 5.157

3.  Prediction of cancer outcome with microarrays: a multiple random validation strategy.

Authors:  Stefan Michiels; Serge Koscielny; Catherine Hill
Journal:  Lancet       Date:  2005 Feb 5-11       Impact factor: 79.321

4.  Gene expression profiling for the prediction of lymph node metastasis in patients with cervical cancer.

Authors:  Tae-Joong Kim; Jung-Joo Choi; Woo Young Kim; Chel Hun Choi; Jeong-Won Lee; Duk-Soo Bae; Dae-Soon Son; Jhingook Kim; Byung Kwan Park; Geunghwan Ahn; Eun Yoon Cho; Byoung-Gie Kim
Journal:  Cancer Sci       Date:  2007-11-06       Impact factor: 6.716

5.  Gene expression patterns in advanced human cervical cancer.

Authors:  P W Grigsby; M Watson; M A Powell; Z Zhang; J S Rader
Journal:  Int J Gynecol Cancer       Date:  2006 Mar-Apr       Impact factor: 3.437

6.  Phosphorylation of the gap junction protein Connexin43 in CIN III lesions and cervical carcinomas.

Authors:  Inga Steinhoff; Kerstin Leykauf; Uwe Bleyl; Matthias Dürst; Angel Alonso
Journal:  Cancer Lett       Date:  2005-06-14       Impact factor: 8.679

7.  Gene set enrichment analysis: a knowledge-based approach for interpreting genome-wide expression profiles.

Authors:  Aravind Subramanian; Pablo Tamayo; Vamsi K Mootha; Sayan Mukherjee; Benjamin L Ebert; Michael A Gillette; Amanda Paulovich; Scott L Pomeroy; Todd R Golub; Eric S Lander; Jill P Mesirov
Journal:  Proc Natl Acad Sci U S A       Date:  2005-09-30       Impact factor: 11.205

8.  Randomised study of radical surgery versus radiotherapy for stage Ib-IIa cervical cancer.

Authors:  F Landoni; A Maneo; A Colombo; F Placa; R Milani; P Perego; G Favini; L Ferri; C Mangioni
Journal:  Lancet       Date:  1997-08-23       Impact factor: 79.321

Review 9.  The effects of connexin phosphorylation on gap junctional communication.

Authors:  Paul D Lampe; Alan F Lau
Journal:  Int J Biochem Cell Biol       Date:  2004-07       Impact factor: 5.085

10.  The 15-lipoxygenase-1 product 13-S-hydroxyoctadecadienoic acid down-regulates PPAR-delta to induce apoptosis in colorectal cancer cells.

Authors:  Imad Shureiqi; Wei Jiang; Xiangsheng Zuo; Yuanqing Wu; Julie B Stimmel; Lisa M Leesnitzer; Jeffrey S Morris; Hui-Zhen Fan; Susan M Fischer; Scott M Lippman
Journal:  Proc Natl Acad Sci U S A       Date:  2003-08-08       Impact factor: 11.205

View more
  27 in total

1.  Identification of functional long non-coding RNAs in gastric cancer by bioinformatics analysis.

Authors:  Yang Li; Dongyang Ma; Tang Li; Yuan Yin
Journal:  Int J Exp Pathol       Date:  2020-07-01       Impact factor: 1.925

2.  MicroRNA-Related Genetic Variants Associated with Survival of Head and Neck Squamous Cell Carcinoma.

Authors:  Owen M Wilkins; Alexander J Titus; Lucas A Salas; Jiang Gui; Melissa Eliot; Rondi A Butler; Erich M Sturgis; Guojun Li; Karl T Kelsey; Brock C Christensen
Journal:  Cancer Epidemiol Biomarkers Prev       Date:  2018-06-07       Impact factor: 4.254

3.  Long non-coding RNA NKILA serves as a biomarker in the early diagnosis and prognosis of patients with colorectal cancer.

Authors:  Peng Jiang; Xiaoting Han; Yingnan Zheng; Jianchao Sui; Weiping Bi
Journal:  Oncol Lett       Date:  2019-06-24       Impact factor: 2.967

4.  Downregulation of MIR100HG Induces Apoptosis in Human Megakaryoblastic Leukemia Cells.

Authors:  Parisa Bagheri; Mohammadreza Sharifi; Ava Ghadiri
Journal:  Indian J Hematol Blood Transfus       Date:  2020-07-24       Impact factor: 0.900

5.  Functional prediction of long non-coding RNAs in ovarian cancer-associated fibroblasts indicate a potential role in metastasis.

Authors:  Fatemeh Vafaee; Emily K Colvin; Samuel C Mok; Viive M Howell; Goli Samimi
Journal:  Sci Rep       Date:  2017-09-04       Impact factor: 4.379

6.  Upregulated plasmacytoma variant translocation 1 promotes cell proliferation, invasion and metastasis in colorectal cancer.

Authors:  Chaoyu Wang; Xin Zhu; Chibin Pu; Xuan Song
Journal:  Mol Med Rep       Date:  2018-03-01       Impact factor: 2.952

7.  lncRNA MIR100HG-derived miR-100 and miR-125b mediate cetuximab resistance via Wnt/β-catenin signaling.

Authors:  Yuanyuan Lu; Xiaodi Zhao; Qi Liu; Cunxi Li; Ramona Graves-Deal; Zheng Cao; Bhuminder Singh; Jeffrey L Franklin; Jing Wang; Huaying Hu; Tianying Wei; Mingli Yang; Timothy J Yeatman; Ethan Lee; Kenyi Saito-Diaz; Scott Hinger; James G Patton; Christine H Chung; Stephan Emmrich; Jan-Henning Klusmann; Daiming Fan; Robert J Coffey
Journal:  Nat Med       Date:  2017-10-16       Impact factor: 53.440

8.  High Expression of KIF20A Is Associated with Poor Overall Survival and Tumor Progression in Early-Stage Cervical Squamous Cell Carcinoma.

Authors:  Weijing Zhang; Weiling He; Yongjie Shi; Haifeng Gu; Min Li; Zhimin Liu; Yanling Feng; Nianzhen Zheng; Chuanmiao Xie; Yanna Zhang
Journal:  PLoS One       Date:  2016-12-12       Impact factor: 3.240

Review 9.  Biomarkers could facilitate prediction of worse clinical outcome of cancer with special insight to cervical cancer.

Authors:  Bartosz Czerniak; Dorota Olszewska-Słonina
Journal:  Contemp Oncol (Pozn)       Date:  2018-04-03

10.  LncRNA MIR100HG promotes cell proliferation in triple-negative breast cancer through triplex formation with p27 loci.

Authors:  Shaowei Wang; Hao Ke; Honglei Zhang; Yujie Ma; Lei Ao; Li Zou; Qin Yang; Hao Zhu; Jianyun Nie; Chunlian Wu; Baowei Jiao
Journal:  Cell Death Dis       Date:  2018-07-24       Impact factor: 8.469

View more

北京卡尤迪生物科技股份有限公司 © 2022-2023.