
Identification of candidate RNA signatures in triple-negative breast cancer by the construction of a competing endogenous RNA network with integrative analyses of Gene Expression Omnibus and The Cancer Genome Atlas data.

Ping Yan, Lingfeng Tang, Li Liu, Gang Tu.

Abstract

Triple-negative breast cancer (TNBC) is a subtype of breast cancer that is characterized by aggressive and metastatic clinical characteristics and generally leads to earlier distant recurrence and poorer prognosis than other molecular subtypes. Accumulating evidence has demonstrated that long non-coding RNAs (lncRNAs) serve a crucial role in a wide variety of biological processes by interacting with microRNAs (miRNAs) as competing endogenous RNAs (ceRNAs) and, thus, affect the expression of target genes in multiple types of cancer. Seven datasets from the Gene Expression Omnibus (GEO) database, including 444 tumor and 88 healthy tissue samples, were utilized to investigate the underlying mechanisms of TNBC and identify prognostic biomarkers. Differentially expressed genes (DEGs) were further validated in The Cancer Genome Atlas database and the associations between their expression levels and clinical information were analyzed to identify prognostic values. A potential lncRNA-miRNA-mRNA ceRNA network was also constructed. Finally, 69 mRNAs from the integrated Gene Expression Omnibus datasets were identified as DEGs using the robust rank aggregation method with |log2FC|>1 and adjusted P<0.01 set as the significance cut-off levels. In addition, 29 lncRNAs, 21 miRNAs and 27 mRNAs were included in the construction of the ceRNA network. The present study elucidated the mechanisms underlying the progression of TNBC and identified novel prognostic biomarkers for TNBC. Copyright: © Yan et al.

Keywords:  competing endogenous RNA; long non-coding RNA; microRNA; triple-negative breast cancer

Year:  2020        PMID: 32194687      PMCID: PMC7039180          DOI: 10.3892/ol.2020.11292

Source DB:  PubMed          Journal:  Oncol Lett        ISSN: 1792-1074            Impact factor:   2.967


Introduction

Triple-negative breast cancer (TNBC) is a typical molecular subtype of breast cancer that lacks the expression of estrogen receptor (ER), progesterone receptor (PR) and human epidermal growth factor receptor 2 (HER2) and accounts for 10–20% of all types of breast cancer (1,2). TNBC is also well known for its aggressive and metastatic clinical characteristics and generally leads to early distant recurrence and poor prognosis (3,4). Currently, no specific targeted therapy is available for TNBC (5). Therefore, it is crucial to identify potential biomarkers and novel therapeutic targets for the development of a more efficient treatment. Emerging evidence has indicated that long non-coding RNAs (lncRNAs) play a vital role in a large variety of biological processes, including genetic transcription, chromosome modification, cell cycle, cell differentiation and migration (6–8). Various studies have indicated that specific miRNAs may participate in tumor progression and function as oncogenes or tumor suppressor genes (9–11). Moreover, the competing endogenous RNA (ceRNA) hypothesis, which suggests that non-coding RNAs and pseudogene transcripts are able to compete for the same miRNA response elements in order to regulate each other and communicate with mRNAs, has gained increasing interest (12). Thereafter, this hypothesis was validated experimentally by further studies (13,14). However, published studies with large sample sizes and specific biomarkers for TNBC are limited. Therefore, the ceRNA network of TNBC has not yet been fully investigated and requires further exploration. In the present study, published microarray and sequencing data were searched in the Gene Expression Omnibus (GEO) and The Cancer Genome Atlas (TCGA) databases and gene expression profiling data were collected from a large sample size of patients with TNBC, in order to identify candidate RNA signatures in TNBC. 
A predictable ceRNA network was also constructed based on the ceRNA hypothesis, in order to identify the TNBC-specific RNAs involved in the ceRNA crosstalk. These integrated analyses aimed to detect novel lncRNA/miRNA/mRNA biomarkers of TNBC and reveal the underlying molecular regulatory mechanisms of TNBC pathogenesis and progression.

Materials and methods

Data mining

Microarray datasets, including GSE38959 (15), GSE61723 (16), GSE61724 (16), GSE76250 (17), GSE86945 (18) and GSE86946 (18), and one dataset obtained by expression profiling via high-throughput sequencing, namely GSE58135 (19), were downloaded from the GEO database (https://www.ncbi.nlm.nih.gov/geo/). In order to be included, all datasets had to consist of at least 20 samples and employ tissue samples collected from patients with TNBC and corresponding adjacent or healthy tissues. Large sample sizes are considered to provide more reliable results in the screening of differentially expressed genes (DEGs). It has been reported that small sample sizes are one of the major challenges in microarray analysis and, thus, recent integrated bioinformatics studies tend to use datasets with relatively large sample sizes (20). Therefore, GEO datasets containing at least 20 samples were chosen for further analysis in the present study. RNA-sequencing (RNA-seq) and microRNA-sequencing (miRNA-seq) data of patients with TNBC, comprising 116 patients with TNBC with complete expression profiles and clinical information, were downloaded from the TCGA database (https://portal.gdc.cancer.gov/). Finally, 126 RNA-seq samples, including 115 tumor tissues and 11 healthy tissues, and 122 miRNA-seq samples, containing 113 tumor tissues and 9 healthy tissues, were obtained. The present study was performed in accordance with the publication guidelines provided by TCGA (http://cancergenome.nih.gov/publications/publicationguidelines). A flowchart of the data collection process and method implementation is presented in Fig. 1.
Figure 1.

Flowchart of data collection and method implementation in this study. GEO, Gene Expression Omnibus; KEGG, Kyoto Encyclopedia of Genes and Genomes; FDR, false discovery rate; TCGA, The Cancer Genome Atlas; DAVID, Database for Annotation, Visualization and Integrated Discovery; STRING, Search Tool for the Retrieval of Interacting Genes; KOBAS, KEGG Orthology Based Annotation System; FC, fold change; miRNAs, microRNAs; lncRNAs, long non-coding RNAs; ceRNA, competing endogenous RNA.

Analysis of DEGs

The limma package of R software (version 3.5.2) (21) was used to normalize and log2-transform the microarray data from the GEO datasets and to screen for DEGs between tumor and healthy tissues. Three datasets, GSE76250, GSE86945 and GSE86946, were merged into one group for normalization since they were all profiled on the same platform, the Affymetrix Human Transcriptome Array 2.0 (GPL17586; https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GPL17586). A boxplot of this group after normalization and log2 transformation is presented in Fig. S1. For the sequencing dataset GSE58135, differential expression analysis between tumor and healthy tissues was performed using the DESeq2 package (version 1.26.0; http://www.bioconductor.org/packages/release/bioc/html/DESeq2.html) of R software (version 3.5.2). The DEGs identified from the seven GEO datasets were then integrated using the RobustRankAggreg package (version 1.1) (22), which is based on a robust rank aggregation (RRA) method. Following RRA analysis, genes were regarded as DEGs if they met the following significance cut-off levels: Adjusted P<0.01 and |log2FC|>1. The RRA method screens for genes that are ranked consistently better than expected under the null hypothesis of uncorrelated inputs (22). Because the method operates on ranked gene lists, the gene expression values of samples from different datasets were not merged in this analysis. In accordance with a number of published studies based on the RobustRankAggreg package (23,24), batch effect correction was not conducted in the present study. For the TCGA datasets, differential expression analyses of mRNAs, lncRNAs and miRNAs between tumor and healthy tissues were performed using the DESeq2 package (version 1.26.0) of R software (version 3.5.2). An adjusted P<0.01 and |log2FC|>1 were set as the cut-off criteria.
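The core idea of rank aggregation described above can be sketched in a few lines. The following Python example is a simplified illustration of an RRA-style score, not the RobustRankAggreg implementation used in the study: it assumes that each dataset contributes a ranked gene list, that a gene absent from a list receives a normalized rank of 1, and that the score is the minimum order-statistic P-value under a uniform null multiplied by the number of lists. The gene names and lists are hypothetical.

```python
from math import comb


def normalized_ranks(gene, ranked_lists):
    """Rank of `gene` in each list divided by the list length (1.0 if absent; an assumption)."""
    ranks = []
    for genes in ranked_lists:
        if gene in genes:
            ranks.append((genes.index(gene) + 1) / len(genes))
        else:
            ranks.append(1.0)
    return ranks


def beta_scores(norm_ranks):
    """For each order statistic, P(k-th smallest of n uniform ranks <= observed value)."""
    r = sorted(norm_ranks)
    n = len(r)
    scores = []
    for k, x in enumerate(r, start=1):
        # Equivalent to P(at least k of n uniform draws fall at or below x).
        p = sum(comb(n, j) * x**j * (1 - x) ** (n - j) for j in range(k, n + 1))
        scores.append(p)
    return scores


def rho_score(norm_ranks):
    """Minimum order-statistic P-value, Bonferroni-corrected by the number of lists."""
    n = len(norm_ranks)
    return min(1.0, min(beta_scores(norm_ranks)) * n)


# Hypothetical ranked DEG lists from three datasets (best-ranked gene first).
ranked_lists = [
    ["geneA", "geneB", "geneC", "geneD"],
    ["geneA", "geneC", "geneB", "geneD"],
    ["geneB", "geneA", "geneD", "geneC"],
]
scores = {g: rho_score(normalized_ranks(g, ranked_lists)) for g in ranked_lists[0]}
for gene, score in sorted(scores.items(), key=lambda kv: kv[1]):
    print(gene, round(score, 4))
```

A gene that sits near the top of every list receives a small score, whereas a gene ranked inconsistently is pushed towards 1. Only ranks enter the calculation, which is why expression values from different platforms do not need to be placed on a common scale.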

Functional enrichment analysis

Enrichment analysis of the DEGs screened from the GEO datasets by the RRA method was conducted using Gene Ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) analyses. GO enrichment analysis and functional annotation of the DEGs, covering the molecular function (MF), biological process (BP) and cellular component (CC) categories, were performed using the Database for Annotation, Visualization and Integrated Discovery (DAVID; version 6.8; http://david.abcc.ncifcrf.gov/) (25). The KEGG Orthology Based Annotation System (KOBAS; version 3.0; http://kobas.cbi.pku.edu.cn/) was used to evaluate the statistical enrichment of DEGs in KEGG pathways. P<0.01 was considered to indicate significant enrichment.
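As a rough illustration of what such an enrichment P-value measures, the sketch below computes a one-sided hypergeometric test in Python. This is a generic over-representation test and is not claimed to reproduce the exact statistics of DAVID or KOBAS. The DEG list size (69) and the overlap count (6) are taken from the study, whereas the background size and pathway size are hypothetical placeholders.

```python
from math import comb


def hypergeom_enrichment_p(k, n, K, N):
    """P(X >= k) for X ~ Hypergeometric(N background genes, K pathway genes, n DEGs drawn)."""
    upper = min(n, K)
    total = comb(N, n)
    # math.comb returns 0 when the lower index exceeds the upper one,
    # so impossible overlaps contribute nothing to the sum.
    favourable = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return favourable / total


# 6 of the 69 DEGs fall in the pathway (values from the study);
# a pathway of 124 genes in a background of 20,000 genes is assumed for illustration.
p_example = hypergeom_enrichment_p(k=6, n=69, K=124, N=20000)
print(f"{p_example:.3g}")
```

With these assumed sizes fewer than one pathway gene would be expected among 69 DEGs by chance, so an overlap of six yields a very small P-value.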

Protein-protein interaction (PPI) network analysis

The Search Tool for the Retrieval of Interacting Genes (STRING; version 11.0; https://string-db.org/) database provides information on predicted and experimentally determined protein interactions (26) and was therefore used to identify the protein-protein interactions among the identified DEGs. In the present study, the DEGs were mapped to the STRING PPI data and a combined score of ≥0.4 was set as the cut-off level. Cytoscape software (version 3.7.1) was then used to construct the PPI networks (27).
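The filtering step described above amounts to keeping interactions whose combined score reaches the cut-off and then counting the neighbours of each node. A minimal Python sketch follows; the gene names appear in the study, but the interaction scores are hypothetical and are given on a 0 to 1 scale.

```python
from collections import defaultdict


def build_ppi(edges, min_score=0.4):
    """Keep undirected edges with combined score >= min_score; return an adjacency map."""
    adjacency = defaultdict(set)
    for a, b, score in edges:
        if a != b and score >= min_score:
            adjacency[a].add(b)
            adjacency[b].add(a)
    return dict(adjacency)


def degrees(adjacency):
    """Degree of a node = number of distinct interaction partners."""
    return {node: len(partners) for node, partners in adjacency.items()}


# Hypothetical combined scores, for illustration only.
edges = [
    ("PLK1", "CCNB2", 0.99),
    ("PLK1", "BUB1", 0.95),
    ("CCNB2", "BUB1", 0.97),
    ("PGR", "ESR1", 0.90),
    ("PGR", "PLK1", 0.15),  # below the cut-off, so it is dropped
]
ppi = build_ppi(edges)
print(degrees(ppi))
```

Genes with no interaction above the cut-off never enter the adjacency map, which corresponds to removing isolated nodes before visualization.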

Construction of the ceRNA network

The ceRNA network was constructed from three gene sets: the DEmRNAs shared between the GEO and TCGA databases, whose intersection was obtained using the VennDiagram package (version 1.6.20; http://CRAN.R-project.org/package=VennDiagram), and the DElncRNAs and DEmiRNAs from the TCGA database. The miRcode database (version 11.0; http://www.mircode.org/) was used to collect predicted and experimentally validated lncRNA-targeted miRNAs. MiRTarBase (version 6.0; http://mirtarbase.mbc.nctu.edu.tw/php/index.php) and TargetScan (version 7.2; http://www.targetscan.org/vert_72/) were used in combination to determine miRNA-targeted mRNAs. Based on these lncRNA-miRNA and miRNA-mRNA interactions, the lncRNA-miRNA-mRNA network was visualized using Cytoscape.
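The way the two interaction layers are joined can be illustrated with a short Python sketch. It assumes that the lncRNA-miRNA and miRNA-mRNA pairs are already available as dictionaries and simply links them through shared miRNAs. The entries shown are a small subset of the pairs reported in Tables III and IV, chosen for illustration; the function name is hypothetical.

```python
def build_cerna_triples(lnc_to_mir, mir_to_mrna):
    """Join lncRNA-miRNA and miRNA-mRNA pairs into (lncRNA, miRNA, mRNA) triples."""
    triples = []
    for lnc, mirs in sorted(lnc_to_mir.items()):
        for mir in sorted(mirs):
            # miRNAs without any mRNA target contribute no triple.
            for mrna in sorted(mir_to_mrna.get(mir, ())):
                triples.append((lnc, mir, mrna))
    return triples


# Subset of lncRNA-miRNA pairs (Table III).
lnc_to_mir = {
    "KIRREL3-AS1": {"miR-144", "miR-338"},
    "RERG-IT1": {"miR-21"},
    "MIR22HG": {"miR-32"},
    "C6orf99": {"miR-140", "miR-338"},
}
# Subset of miRNA-mRNA pairs (Table IV).
mir_to_mrna = {
    "miR-144": {"LIFR", "EZH2"},
    "miR-21": {"TOP2A", "TRIM59", "LIFR", "TP63", "HMGB3"},
    "miR-32": {"AURKA"},
}

triples = build_cerna_triples(lnc_to_mir, mir_to_mrna)
retained_lncrnas = {lnc for lnc, _, _ in triples}
print(len(triples), sorted(retained_lncrnas))
```

In this subset, a lncRNA whose miRNAs have no listed mRNA target produces no triple and therefore drops out of the network, mirroring the pruning of miRNAs and their corresponding lncRNAs.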

Survival analysis

Gene expression data and survival information obtained from the TCGA database were assessed by Kaplan-Meier survival analysis with a log-rank test, performed using the survival package (version 2.44-1.1; https://www.rdocumentation.org/packages/survival/versions/2.44-1.1) in R software. For each DEG, all samples were categorized into high and low expression groups based on the median expression level of that DEG. For the log-rank test, P<0.05 was considered to indicate a statistically significant difference.
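The median split and the two-group log-rank test can be sketched in plain Python as follows. This is an illustrative re-implementation under simple assumptions, not the survival package code used in the study: samples with expression equal to the median are assigned to the low group, and the P-value is taken from a chi-square distribution with one degree of freedom. All sample values are hypothetical.

```python
from math import erfc, sqrt
from statistics import median


def split_by_median(expression):
    """Split sample IDs into (high, low) groups; ties at the median go to 'low' (assumption)."""
    cut = median(expression.values())
    high = [s for s, v in expression.items() if v > cut]
    low = [s for s, v in expression.items() if v <= cut]
    return high, low


def logrank_test(times_a, events_a, times_b, events_b):
    """Two-group log-rank test; returns (chi-square statistic, P-value with 1 df)."""
    data = [(t, e, 0) for t, e in zip(times_a, events_a)]
    data += [(t, e, 1) for t, e in zip(times_b, events_b)]
    event_times = sorted({t for t, e, _ in data if e})
    observed_a = expected_a = variance = 0.0
    for t in event_times:
        at_risk_a = sum(1 for ti, _, g in data if ti >= t and g == 0)
        at_risk_b = sum(1 for ti, _, g in data if ti >= t and g == 1)
        deaths_a = sum(1 for ti, e, g in data if ti == t and e and g == 0)
        deaths_b = sum(1 for ti, e, g in data if ti == t and e and g == 1)
        n, d = at_risk_a + at_risk_b, deaths_a + deaths_b
        observed_a += deaths_a
        expected_a += d * at_risk_a / n
        if n > 1:
            variance += d * (at_risk_a / n) * (at_risk_b / n) * (n - d) / (n - 1)
    if variance == 0:
        return 0.0, 1.0
    chi2 = (observed_a - expected_a) ** 2 / variance
    # Survival function of a chi-square variable with 1 degree of freedom.
    return chi2, erfc(sqrt(chi2 / 2))


# Hypothetical expression values, follow-up times (months) and event flags (1 = death).
expression = {"s1": 2.0, "s2": 8.5, "s3": 3.1, "s4": 9.2, "s5": 1.4, "s6": 7.7}
follow_up = {"s1": (12, 1), "s2": (60, 0), "s3": (20, 1), "s4": (72, 0), "s5": (9, 1), "s6": (55, 1)}

high, low = split_by_median(expression)
chi2, p_value = logrank_test(
    [follow_up[s][0] for s in high], [follow_up[s][1] for s in high],
    [follow_up[s][0] for s in low], [follow_up[s][1] for s in low],
)
print(sorted(high), sorted(low), round(chi2, 3), round(p_value, 4))
```

Censored samples (event flag 0) remain in the risk set until their last follow-up time but never count as events, which is the property that distinguishes this test from a simple comparison of mean survival times.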

Results

Identification of DEGs

Information of the seven datasets that were downloaded from the GEO database and included in the current study is shown in Table I. Volcano plots revealed the number of differentially expressed genes identified from each dataset according to the set cut-off levels (P<0.01 and |log2FC|>1) (Fig. 2). Based on different cut-off levels (adjusted P<0.01 and |log2FC|>1), a total of 69 differentially expressed mRNAs (DEmRNAs), consisting of 16 downregulated and 53 upregulated genes, were identified following integrated analysis of these GEO datasets using the RobustRankAggreg package and further analyzed (Fig. 3).
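The cut-off logic used for calling DEGs can be written out explicitly. The Python sketch below assumes that the adjusted P-value is a Benjamini-Hochberg adjustment, which is the default in both limma and DESeq2 but is not stated in the text; the gene names and values are hypothetical.

```python
def benjamini_hochberg(pvalues):
    """Benjamini-Hochberg adjusted P-values, returned in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P-value down so that monotonicity is enforced.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def select_degs(rows, fc_cut=1.0, p_cut=0.01):
    """rows: (gene, log2FC, raw P). Returns {gene: 'up' or 'down'} for genes passing both cut-offs."""
    padj = benjamini_hochberg([p for _, _, p in rows])
    selected = {}
    for (gene, log2fc, _), q in zip(rows, padj):
        if q < p_cut and abs(log2fc) > fc_cut:
            selected[gene] = "up" if log2fc > 0 else "down"
    return selected


# Hypothetical results for four genes.
rows = [
    ("geneA", 2.3, 1e-6),
    ("geneB", -1.8, 2e-4),
    ("geneC", 0.4, 1e-5),   # significant but fold change too small
    ("geneD", 1.5, 0.2),    # large fold change but not significant
]
degs = select_degs(rows)
print(degs)
```

Both conditions must hold at once: a gene with a large fold change but a weak P-value, or the reverse, is not reported as a DEG.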
Table I.

Details of datasets from the Gene Expression Omnibus database.

| Author, year | PMID | Record | Tissue | Platform | Healthy | Cancer | (Refs.) |
|---|---|---|---|---|---|---|---|
| Komatsu et al, 2013 | 23254957 | GSE38959 | Triple-negative breast cancer | GPL4133 Agilent-014850 Whole Human Genome Microarray 4×44K G4112F (Feature Number version) | 13 | 30 | (15) |
| Mathe et al, 2015 | 26537449 | GSE61723 | Triple-negative breast cancer | GPL16686 [HuGene-2_0-st] Affymetrix Human Gene 2.0 ST Array [transcript (gene) version] | 17 | 33 | (16) |
| Mathe et al, 2015 | 26537449 | GSE61724 | Triple-negative breast cancer | GPL6244 [HuGene-1_0-st] Affymetrix Human Gene 1.0 ST Array [transcript (gene) version] | 4 | 16 | (16) |
| Liu et al, 2016 | 26813360 | GSE76250 | Triple-negative breast cancer | GPL17586 [HTA-2_0] Affymetrix Human Transcriptome Array 2.0 [transcript (gene) version] | 33 | 165 | (17) |
| Romero-Cordoba et al, 2018 | 30115973 | GSE86945 | Triple-negative breast cancer | GPL17586 [HTA-2_0] Affymetrix Human Transcriptome Array 2.0 [transcript (gene) version] | 0 | 100 | (18) |
| Romero-Cordoba et al, 2018 | 30115973 | GSE86946 | Triple-negative breast cancer | GPL17586 [HTA-2_0] Affymetrix Human Transcriptome Array 2.0 [transcript (gene) version] | 0 | 58 | (18) |
| Varley et al, 2014 | 24929677 | GSE58135 | Triple-negative breast cancer | GPL11154 Illumina HiSeq 2000 (Homo sapiens) | 21 | 42 | (19) |

Figure 2.

Volcano plots of differentially expressed genes in each Gene Expression Omnibus dataset.

Figure 3.

Log2FC heatmap of the image data of each expression microarray. The abscissa corresponds to the GEO ID, while the ordinate corresponds to the gene name. Red represents log2FC>0, green represents log2FC<0, and the values represent the log2FC values in each GEO dataset. FC, fold change; GEO, Gene Expression Omnibus.

In the TCGA database, a total of 1,964 upregulated and 1,131 downregulated mRNAs, 352 differentially expressed lncRNAs (DElncRNAs) and 141 differentially expressed miRNAs (DEmiRNAs) were identified based on the aforementioned cut-off levels. Heatmaps of the top 200 DEmRNAs and DElncRNAs on the basis of adjusted P-value and overall 141 DEmiRNAs are demonstrated in Figs. S2–S4, respectively. Following gene integration analysis, GO term functional enrichment analysis of the 69 DEGs was conducted using DAVID with P<0.01 set as the significance cut-off level. The following categories or ontologies were included in the analysis: i) Biological processes; ii) cellular component; and iii) molecular function. In terms of the biological processes, the DEGs were significantly enriched in ‘mitotic nuclear division’ and ‘cell division’. In the cellular component analysis, the DEGs were markedly enriched in the ‘nucleoplasm’ and ‘nucleus’. According to the molecular function, the DEGs were significantly enriched in ATP binding and protein binding (Fig. 4). With respect to KEGG pathway enrichment analysis, performed using KOBAS with P<0.01 set as the cut-off level, the 69 genes were found to be mainly enriched in ‘cell cycle’, ‘progesterone-mediated oocyte maturation’, ‘oocyte meiosis’ and ‘microRNAs in cancer’ terms (Table II).
Figure 4.

Gene Ontology terms of the differentially expressed genes identified from the GEO database, including biological process, cellular component and molecular function. GEO, Gene Expression Omnibus.

Table II.

Enriched Kyoto Encyclopedia of Genes and Genomes pathways of the differentially expressed genes.

| Term | Count | P-value | FDR | Genes |
|---|---|---|---|---|
| hsa04110: Cell cycle | 6 | 1.55×10^-6 | 1.01×10^-4 | PLK1, CCNB2, BUB1, CCNA2, CHEK1, E2F1 |
| hsa04914: Progesterone-mediated oocyte maturation | 5 | 9.06×10^-6 | 2.94×10^-4 | CCNB2, BUB1, PLK1, PGR, CCNA2 |
| hsa04114: Oocyte meiosis | 5 | 2.43×10^-5 | 5.26×10^-4 | CCNB2, BUB1, PLK1, AURKA, PGR |
| hsa05206: MicroRNAs in cancer | 5 | 1.00×10^-3 | 1.33×10^-2 | STMN1, TP63, EZH2, E2F1, KIF23 |
| hsa04115: P53 signaling pathway | 3 | 1.03×10^-3 | 1.33×10^-2 | CCNB2, RRM2, CHEK1 |
| hsa05222: Small cell lung cancer | 3 | 2.02×10^-3 | 2.18×10^-2 | FN1, E2F1, CKS2 |
| hsa05161: Hepatitis B | 3 | 8.31×10^-3 | 7.71×10^-2 | E2F1, BIRC5, CCNA2 |

Term, enriched Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway; count, the number of differentially expressed genes of each term; P-value, P-value of enrichment analysis; FDR, false discovery rate.

PPI network analysis

The DEGs in TNBC were examined using the STRING database to construct PPI networks. A total of 69 DEGs were identified from the GEO database using the RRA method, including 53 upregulated and 16 downregulated DEGs. Following removal of the isolated and partially connected nodes, a complex PPI network was formulated consisting of 58 nodes and 926 interactions, which were obtained with a combined score >0.4 (Fig. 5).
Figure 5.

Protein-protein interaction network of the differentially expressed genes identified from the Gene Expression Omnibus database. Pink nodes represent upregulated genes and green nodes represent downregulated genes. Node size is positively associated with degree, which is the number of DEGs the node/genes can interact with.

A total of 67 DEmRNAs from the intersections of the GEO and TCGA databases and 352 DElncRNAs and 141 DEmiRNAs from the TCGA database were selected in order to construct a ceRNA network. A Venn diagram demonstrated the intersections of DEmRNAs between the GEO and TCGA databases (Fig. 6). To further understand the functions of the DElncRNAs, DEmiRNAs and DEmRNAs, a ceRNA network was constructed. In Table III, it is shown that 31 DElncRNAs interacted with 29 DEmiRNAs retrieved from the miRcode database. Subsequently, DEmRNAs were searched in the miRTarBase and TargetScan databases based on the 29 DEmiRNAs. According to results from both databases, a total of 27 DEmRNAs capable of interacting with 21 of the 29 DEmiRNAs were chosen (Table IV). Following removal of the remaining eight DEmiRNAs and the corresponding lncRNAs, 29 DElncRNAs, 21 DEmiRNAs and 27 DEmRNAs were used to establish a ceRNA network (Fig. 7).
Figure 6.

Venn diagram of the intersections of differentially expressed mRNAs between the Gene Expression Omnibus and The Cancer Genome Atlas databases. (A) Intersection of upregulated mRNAs; (B) Intersection of downregulated mRNAs. GEO, Gene Expression Omnibus.

Table III.

miRNAs that may be targeted by specific lncRNAs in triple-negative breast cancer.

| lncRNA | miRNA |
|---|---|
| XIST | miR-503, miR-301b, miR-454, miR-93, miR-106a, miR-96, miR-137, miR-140, miR-141, miR-200a, miR-144, miR-155, miR-195, miR-497, miR-17, miR-192, miR-215, miR-429, miR-204, miR-21, miR-22, miR-32, miR-338 |
| MEG3 | miR-551a, miR-301b, miR-454, miR-93, miR-106a, miR-96, miR-140, miR-141, miR-200a, miR-143, miR-144, miR-145, miR-155, miR-195, miR-497, miR-17, miR-192, miR-215, miR-429, miR-204, miR-21, miR-22, miR-338 |
| MALAT1 | miR-503, miR-93, miR-106a, miR-96, miR-140, miR-141, miR-200a, miR-143, miR-144, miR-145, miR-155, miR-195, miR-497, miR-17, miR-192, miR-215, miR-429, miR-204, miR-21, miR-22, miR-32, miR-338 |
| NEAT1 | miR-503, miR-301b, miR-454, miR-93, miR-106a, miR-96, miR-140, miR-141, miR-200a, miR-143, miR-144, miR-195, miR-497, miR-17, miR-183, miR-429, miR-204, miR-22, miR-338 |
| MAGI2-AS3 | miR-503, miR-93, miR-106a, miR-137, miR-141, miR-200a, miR-143, miR-144, miR-145, miR-155, miR-195, miR-497, miR-429, miR-204, miR-210, miR-22, miR-32 |
| PVT1 | miR-503, miR-551a, miR-93, miR-106a, miR-140, miR-143, miR-145, miR-195, miR-497, miR-17, miR-183, miR-187, miR-21 |
| SNHG1 | miR-503, miR-137, miR-140, miR-141, miR-200a, miR-143, miR-144, miR-145, miR-195, miR-497, miR-204, miR-21, miR-32 |
| EPB41L4A-AS1 | miR-503, miR-93, miR-106a, miR-141, miR-200a, miR-195, miR-497, miR-17, miR-183, miR-429, miR-22, miR-338 |
| DLEU2 | miR-551a, miR-96, miR-137, miR-141, miR-200a, miR-143, miR-144, miR-21, miR-32 |
| CYB561D2 | miR-503, miR-93, miR-106a, miR-140, miR-144, miR-22, miR-338 |
| HOTAIR | miR-301b, miR-454, miR-143, miR-17, miR-93, miR-204, miR-21 |
| SNHG | miR-93, miR-106a, miR-141, miR-200a, miR-17, miR-338 |
| MIR210HG | miR-551a, miR-93, miR-106a, miR-145, miR-195, miR-497 |
| TPRG1-AS1 | miR-93, miR-106a, miR-17, miR-210, miR-32, miR-338 |
| SNHG6 | miR-137, miR-144, miR-429, miR-204, miR-22 |
| EMX2OS | miR-503, miR-143, miR-183, miR-210, miR-22 |
| ATP1B3-AS1 | miR-93, miR-106a, miR-96, miR-204 |
| LINC00393 | miR-93, miR-106a, miR-192, miR-215 |
| LINC00460 | miR-503, miR-143, miR-429, miR-338 |
| GRIK1-AS1 | miR-145, miR-204, miR-338 |
| LINC00504 | miR-140, miR-32, miR-338 |
| MIR155HG | miR-155, miR-204, miR-338 |
| TMEM9B-AS1 | miR-144, miR-145, miR-22 |
| AGBL5-IT1 | miR-145, miR-204 |
| C6orf99 | miR-140, miR-338 |
| KIRREL3-AS1 | miR-144, miR-338 |
| LINC00392 | miR-183, miR-32 |
| UCA1 | miR-96, miR-143 |
| ARHGAP31-AS1 | miR-137 |
| MIR22HG | miR-32 |
| RERG-IT1 | miR-21 |

lncRNA, long non-coding RNA; miRNA/miR, microRNA.

Table IV.

mRNAs that may be targeted by specific miRNAs in triple-negative breast cancer.

| miRNA | mRNA |
|---|---|
| miR-192 | CAB39L, CEP55, MCM10, TRIM59, CENPA, HJURP, TRIP13, DTL |
| miR-17 | KIF23, HMGB3, RRM2, E2F1, MKI67, MELK |
| miR-93 | RRM2, E2F1, KIF23, HMGB3, BIRC5, MELK |
| miR-21 | TOP2A, TRIM59, LIFR, TP63, HMGB3 |
| miR-155 | TRIP13, RRM2, CAB39L, AMIGO2, KIF14 |
| miR-215 | TRIP13, CENPA, MCM10, DTL |
| miR-195 | KIF23, CHEK1, CEP55, BIRC5 |
| miR-497 | CHEK1, KIF23, CEP55, BIRC5 |
| miR-106a | RRM2, HMGB3, E2F1, KIF23 |
| miR-454 | ESR1, CEP55, CCNA2 |
| miR-503 | KIF23, CHEK1 |
| miR-144 | LIFR, EZH2 |
| miR-32 | AURKA |
| miR-145 | ESR1 |
| miR-183 | AURKA |
| miR-200a | EZH2 |
| miR-187 | CENPA |
| miR-429 | LMNB1 |
| miR-137 | AURKA |
| miR-210 | STMN1 |
| miR-22 | ESR1 |

miRNA/miR, microRNA; mRNA, messenger RNA.

Figure 7.

The lncRNA-miRNA-mRNA competing endogenous RNA network for triple-negative breast cancer. Diamonds represent lncRNAs; rectangles represent miRNAs; ellipses represent mRNAs; red indicates upregulated genes and green represents downregulated genes. lncRNA, long non-coding RNA; miRNA, microRNA.

To determine whether the expression levels of the 69 DEmRNAs identified from the GEO database and the DEGs included in the ceRNA network were related to the overall survival of patients with TNBC, Kaplan-Meier curves were generated based on the gene expression data and survival information retrieved from the TCGA database. As a result, three DEmRNAs, tripartite motif containing 59 (TRIM59), exonuclease 1 (EXO1) and RAD51-associated protein 1 (RAD51AP1), one DElncRNA, KIRREL3-antisense RNA 1 (KIRREL3-AS1), and one DEmiRNA, hsa-mir-106a, were found to be significantly associated with the prognosis of patients with TNBC (P<0.05; Fig. 8). Moreover, TRIM59, KIRREL3-AS1 and hsa-mir-106a were found to be involved in the ceRNA network. All five survival-related genes were upregulated and the high expression levels of all five genes were related to better prognosis.
Figure 8.

Kaplan-Meier curves of three DEmRNAs, TRIM59, EXO1 and RAD51AP1, one DElncRNA, KIRREL3-AS1, and one DEmiRNA, hsa-mir-106a, associated with overall survival. DE, differentially expressed; lncRNAs, long non-coding RNAs; TRIM59, tripartite motif containing 59; EXO1, exonuclease 1; RAD51AP1, RAD51-associated protein 1; KIRREL3-AS1, KIRREL3-antisense RNA 1.

Discussion

Five distinct intrinsic breast cancer subtypes have been identified based on gene expression: i) Luminal A; ii) luminal B; iii) HER2 overexpressing; iv) basal-like; and v) healthy breast tissue-like (28–30). Prat et al (31) have also identified a new breast cancer intrinsic subtype known as claudin-low or mesenchymal-like. These six subtypes of breast cancer defined by immunohistochemistry (IHC) demonstrate distinct differences in terms of breast cancer survival (32). TNBC is regarded as an aggressive subtype of breast cancer, since it is characterized by higher rates of progression, metastasis and recurrence than the other molecular subtypes. This subtype is more commonly diagnosed in young women (33). The main therapeutic strategies for the treatment of patients with TNBC are surgery and chemotherapy. One of the primary factors that contribute to poor prognosis of patients with TNBC is that few effective therapeutic targets are available. Therefore, studies on the underlying mechanisms of the pathogenesis and progression of TNBC are required for the development of more effective therapeutic strategies. Compared with the original long-term fundamental experimental studies, integrated mining of microarray data and high-throughput sequencing from public databases appears to be more comprehensive and highly informative. Furthermore, since a number of studies have indicated the involvement of ceRNA crosstalk in diverse biological processes, including tumorigenesis, progression and metastasis, the comprehensive analysis of lncRNA-miRNA-mRNA ceRNA networks has become more widely used in the prediction of candidate RNA signatures in various cancer types (34–36), including TNBC (37–39). Nonetheless, non-coding RNA-associated ceRNA networks based on whole-genome gene expression profiling with large-scale microarray and sequencing data of patients with TNBC have not been described. 
In the current study, seven datasets from studies on TNBC were downloaded from the publicly available GEO database. The published original studies, from which the data were obtained, are discussed in the following text. Komatsu et al (15) identified abnormal spindle microtubule assembly and centromere protein K as novel molecular targets for TNBC therapy, since the absence of these genes was found to cause an arrest in the G2/M and G0/G1 phases of the cell cycle, respectively, and subsequently induced cell death in TNBC cells. Mathe et al (16) compared the genes found to be TNBC-specific in their own cohort of patients with TNBC with those in an external validation cohort of patients with TNBC from The Cancer Genome Atlas (TCGA) database, and found that four of the genes, namely ankyrin repeat domain 30A, acidic nuclear phosphoprotein 32 family member E, desmocollin 2 and interleukin 6 signal transducer (IL6ST), were common to both TNBC cohorts. The survival curves revealed that high expression of IL6ST was significantly associated with improved survival outcomes. Liu et al (17) successfully developed an mRNA and an integrated mRNA-lncRNA signature based on eight mRNAs and two lncRNAs. The data from those two signatures suggested that the lncRNAs HIST2H2BC and SNRPEP4 promoted cell proliferation and invasion, thus contributing to paclitaxel resistance in TNBC cells. Romero-Cordoba et al (18) identified a set of altered miRNAs and experimentally confirmed that a specific miRNA, hsa-mir-342-3p, was downregulated in TNBC compared with other phenotypes. In addition, loss of function of miR-342-3p resulted in monocarboxylate transporter 1 overexpression and contributed to oncogenic metabolic reprogramming in TNBC. Varley et al (19) analyzed sequencing data in order to identify breast cancer-associated read-through fusion transcripts.
This analysis led to the identification of SCNN1A-TNFRSF1A and CTSD-IFITM10, two recurrent read-through fusion transcripts found to involve membrane proteins, which raised the possibility of them being breast cancer-specific cell surface markers. In order to fully mine the information of these datasets, multistep processing and integrated bioinformatics were applied to reveal DEGs in TNBC. More importantly, a robust rank aggregation (RRA) method was used to identify stable DEGs among different studies. This method analyzes prioritized gene lists and finds commonly overlapping genes, which are ranked consistently better than expected by chance (40). Finally, 69 dysregulated mRNAs were identified from the GEO datasets, including 53 upregulated and 16 downregulated genes, which are demonstrated in Fig. 3. Subsequently, GO and KEGG functional enrichment analyses were performed and a PPI network of these DEGs was constructed in order to demonstrate their characteristics and specific biological significance. DEGs were further validated in the TCGA database. Eventually, 29 lncRNAs, 21 miRNAs and 27 mRNAs were selected to construct a potential lncRNA-miRNA-mRNA ceRNA network by biological prediction in order to elucidate the interactions and regulatory mechanisms of DEGs. Finally, Kaplan-Meier curves were generated using the gene expression data and survival information provided by the TCGA database to detect prognostic indicators for patients with TNBC. As a result, three DEmRNAs, namely TRIM59, EXO1 and RAD51AP1, one DElncRNA, KIRREL3-AS1 and one DEmiRNA, hsa-mir-106a, were considered to be closely associated with the prognosis of patients with TNBC. Moreover, TRIM59, KIRREL3-AS1 and hsa-mir-106a were involved in the ceRNA network. The expression levels of all five genes were found to be upregulated and related to longer survival times.
Nevertheless, the determination of the nature of protective factors or risk factors depends on gene expression at both the transcription and translation levels. Certain DEGs identified in the present study have been reported to be associated with the prognosis of breast cancer. As observed by gene expression profiling, IHC analysis has revealed that the expression of certain proliferation markers, including Ki67 and Aurora A kinase, is associated with poor prognosis in ER+ disease (41,42). In addition, as assessed by IHC, BCL2 expression is a powerful predictor of favorable prognosis in breast cancer across different molecular subtypes (43). Despite these findings, the routine assessment of IHC markers in addition to ER, PR and HER2 has yet to be implemented in standard clinical treatment guidelines. Therefore, the remaining survival-related genes identified in the current study should be further explored in future studies. In the present study, miRcode was used to collect predicted and experimentally validated miRNAs that are targeted by lncRNAs. In addition, TargetScan and miRTarBase were cooperatively utilized to determine mRNAs that are targeted by miRNAs. According to accumulating biological data, numerous computational models for potential miRNA-disease association inference have been developed, which are highly useful in research of the underlying molecular mechanism of human diseases and the development of new drugs for disease treatment (44–46). In particular, Chen et al (47) proposed the model of the Ensemble of Decision Tree-based MiRNA-Disease Association prediction (EDTMDA) for the identification of miRNA-disease associations by inputting features that were extracted from integrated miRNA similarity, disease similarity and known miRNA-disease associations. It is believed that EDTMDA is able to make reliable predictions and guide experiments to reveal further miRNA-disease associations. 
However, limited studies on computational models for the prediction of potential lncRNA-disease associations are available. The involvement of lncRNAs in numerous biological processes and their important roles in a variety of complex human diseases have been demonstrated. Therefore, the development of more effective computational models to identify potential relationships between lncRNAs and diseases may aid further understanding of disease mechanisms at the lncRNA molecular level. Furthermore, the development of a computational model capable of directly determining mRNAs that are targeted by lncRNAs would notably reduce the time required for analysis. Taken together, research on computational models for the prediction of potential disease associations at the level of gene expression regulation, post-translational protein modification and cellular environment may promote the diagnosis, treatment, prognosis and prevention of human diseases, including TNBC. Of note, in the present study gene expression profiling was conducted using large-scale microarray and sequencing data of patients with TNBC. A robust rank aggregation (RRA) method was applied to comprehensively screen stable DEGs among the different studies. Furthermore, a network of predicted ceRNA interactions derived from integrative bioinformatics analysis was successfully constructed. The reliability of the results was supported by the use of reasonable screening criteria, sufficient sample size and appropriate visualization of the results. There are also limitations to the current study. First, it would be more appropriate to include paired samples in order to reduce inter-patient variation, since the expression level of each gene varied substantially among patients. Secondly, the complex ceRNA network was constructed by biological prediction based on the hypothesis that lncRNAs may serve as ceRNAs and, therefore, affect the expression of target genes by binding miRNAs.
Therefore, further research is required to verify the results of the present study and to better understand the molecular mechanisms of the identified RNA signatures in TNBC.
In conclusion, the present study identified a large number of cancer-specific DEGs and several survival-related DEGs by integrated analysis of large-scale gene expression profiles from the GEO and TCGA databases. The predicted lncRNA-miRNA-mRNA ceRNA network may provide guidance for further studies on the molecular pathogenesis and progression mechanisms underlying TNBC.
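The robust rank aggregation step used to integrate the GEO datasets can be illustrated with a small, self-contained sketch. It follows the general idea of RRA (score each gene by how unlikely its ranks would be if the lists were random), using the fact that the k-th smallest of n uniform ranks follows a Beta(k, n-k+1) distribution, whose CDF equals a binomial tail sum. This is an illustration under stated assumptions, not the authors' analysis: the function names `beta_tail`, `rho_score` and `aggregate` are hypothetical, the gene labels are placeholders, genes missing from a list are assumed to receive the worst normalized rank of 1, and the multiplication by n is a Bonferroni-style bound rather than an exact correction.

```python
from math import comb


def beta_tail(x, k, n):
    """P(k-th smallest of n uniform(0, 1) draws <= x), as a binomial tail sum."""
    return sum(comb(n, j) * x**j * (1 - x) ** (n - j) for j in range(k, n + 1))


def rho_score(normalized_ranks):
    """Smallest order-statistic p-value, with a Bonferroni-style bound (assumption)."""
    ranks = sorted(normalized_ranks)
    n = len(ranks)
    min_p = min(beta_tail(x, k, n) for k, x in enumerate(ranks, start=1))
    return min(1.0, min_p * n)


def aggregate(ranked_lists):
    """Score every gene across several ranked lists (best gene first in each).

    Returns (gene, score) pairs sorted by ascending score, ties broken by name.
    """
    universe = set().union(*ranked_lists)
    n_genes = len(universe)
    scores = {}
    for gene in universe:
        ranks = [
            # Normalized rank in (0, 1]; a missing gene gets 1.0 (assumption).
            (lst.index(gene) + 1) / n_genes if gene in lst else 1.0
            for lst in ranked_lists
        ]
        scores[gene] = rho_score(ranks)
    return sorted(scores.items(), key=lambda item: (item[1], item[0]))


# Three toy rankings of placeholder genes, e.g. one per dataset.
lists = [["A", "B", "C", "D"], ["A", "C", "B", "D"], ["B", "A", "D", "C"]]
result = aggregate(lists)
for gene, score in result:
    print(gene, round(score, 4))
```

A gene that is ranked near the top in most lists receives a small score, whereas a gene that is ranked highly in only one list is not rewarded, which is why the method is described as robust to outlying studies.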
  47 in total

1.  Identification of functional lncRNAs in gastric cancer by integrative analysis of GEO and TCGA data.

Authors:  Xianqin Zhang; Wanfeng Zhang; Yuyou Jiang; Kun Liu; Longke Ran; Fangzhou Song
Journal:  J Cell Biochem       Date:  2019-05-28       Impact factor: 4.429

2.  Triple-negative breast cancer: clinical features and patterns of recurrence.

Authors:  Rebecca Dent; Maureen Trudeau; Kathleen I Pritchard; Wedad M Hanna; Harriet K Kahn; Carol A Sawka; Lavina A Lickley; Ellen Rawlinson; Ping Sun; Steven A Narod
Journal:  Clin Cancer Res       Date:  2007-08-01       Impact factor: 12.531

3.  Integrative analysis of lncRNAs and miRNAs with coding RNAs associated with ceRNA crosstalk network in triple negative breast cancer.

Authors:  Naijun Yuan; Guijuan Zhang; Yurong Wang; Fengjie Bie; Min Ma; Yi Ma; Xuefeng Jiang; Xiaoqian Hao
Journal:  Onco Targets Ther       Date:  2017-12-12       Impact factor: 4.147

4.  Molecular features of triple negative breast cancer cells by genome-wide gene expression profiling analysis.

Authors:  Masato Komatsu; Tetsuro Yoshimaru; Taisuke Matsuo; Kazuma Kiyotani; Yasuo Miyoshi; Toshihito Tanahashi; Kazuhito Rokutan; Rui Yamaguchi; Ayumu Saito; Seiya Imoto; Satoru Miyano; Yusuke Nakamura; Mitsunori Sasa; Mitsuo Shimada; Toyomasa Katagiri
Journal:  Int J Oncol       Date:  2012-12-18       Impact factor: 5.650

5.  Subtype-specific micro-RNA expression signatures in breast cancer progression.

Authors:  Vilde D Haakensen; Vegard Nygaard; Liliana Greger; Miriam R Aure; Bastian Fromm; Ida R K Bukholm; Torben Lüders; Suet-Feung Chin; Anna Git; Carlos Caldas; Vessela N Kristensen; Alvis Brazma; Anne-Lise Børresen-Dale; Eivind Hovig; Åslaug Helland
Journal:  Int J Cancer       Date:  2016-05-09       Impact factor: 7.396

6.  A meta-analysis of microRNA expression in liver cancer.

Authors:  Jingcheng Yang; Shuai Han; Wenwen Huang; Ting Chen; Yang Liu; Shangling Pan; Shikang Li
Journal:  PLoS One       Date:  2014-12-09       Impact factor: 3.240

7.  Ki67 index, HER2 status, and prognosis of patients with luminal B breast cancer.

Authors:  Maggie C U Cheang; Stephen K Chia; David Voduc; Dongxia Gao; Samuel Leung; Jacqueline Snider; Mark Watson; Sherri Davies; Philip S Bernard; Joel S Parker; Charles M Perou; Matthew J Ellis; Torsten O Nielsen
Journal:  J Natl Cancer Inst       Date:  2009-05-12       Impact factor: 13.506

Review 8.  Long non-coding RNAs and complex diseases: from experimental results to computational models.

Authors:  Xing Chen; Chenggang Clarence Yan; Xu Zhang; Zhu-Hong You
Journal:  Brief Bioinform       Date:  2017-07-01       Impact factor: 11.622

9.  MDHGI: Matrix Decomposition and Heterogeneous Graph Inference for miRNA-disease association prediction.

Authors:  Xing Chen; Jun Yin; Jia Qu; Li Huang
Journal:  PLoS Comput Biol       Date:  2018-08-24       Impact factor: 4.475

10.  Identification of potential prognostic long non‑coding RNA signatures based on a competing endogenous RNA network in lung adenocarcinoma.

Authors:  Xiaojuan Wang; Yawen Ding; Bangming Da; Yan Fei; Gang Feng
Journal:  Oncol Rep       Date:  2018-09-21       Impact factor: 3.906

  3 in total

1.  ceRNA network development and tumour-infiltrating immune cell analysis of metastatic breast cancer to bone.

Authors:  Shuzhong Liu; An Song; Xi Zhou; Zhen Huo; Siyuan Yao; Bo Yang; Yong Liu; Yipeng Wang
Journal:  J Bone Oncol       Date:  2020-07-20       Impact factor: 4.072

2.  Combined mRNAs and clinical factors model on predicting prognosis in patients with triple-negative breast cancer.

Authors:  Yanjun Hu; Dehong Zou
Journal:  PLoS One       Date:  2021-12-29       Impact factor: 3.240

Review 3.  Modulatory Role of microRNAs in Triple Negative Breast Cancer with Basal-Like Phenotype.

Authors:  Andrea Angius; Paolo Cossu-Rocca; Caterina Arru; Maria Rosaria Muroni; Vincenzo Rallo; Ciriaco Carru; Paolo Uva; Giovanna Pira; Sandra Orrù; Maria Rosaria De Miglio
Journal:  Cancers (Basel)       Date:  2020-11-07       Impact factor: 6.639

