Xinquan Wu1, Mingji Ding1, Jianqin Lin1. 1. Department of Thyroid and Breast Surgery, The Second Affiliated Hospital of Fujian Medical University, Quanzhou, Fujian 362000, P.R. China.
Breast cancer is one of the most common malignant tumors in females (1). Breast cancer is classified into several subtypes according to the expression of various receptors, such as the estrogen receptor (ER), progesterone receptor (PR) and humanepidermal growth factor receptor 2 (HER-2) (2). In triple-negative breast cancer (TNBC), tumor cells do not express PR, ER or HER-2 (3). The majority of the treatments available to patients with middle- and late-stage TNBC are ineffective (4). Therefore, in order to improve the 5-year survival rate, further research to identify effective TNBC treatments is required (5). It is important to identify individuals at high risk for TNBC and provide appropriate treatment at an early stage (6).MicroRNAs (miRNAs/miRs) are a family of non-coding RNAs that were discovered in 1993 (7). miRNAs consist of 22–26 nucleotides but have a complex structure that confers the ability induce cleavage or translational repression of target mRNAs (7). miRNAs are therefore important regulators of gene expression (8). Breast cancer arises through the accumulation of genetic mutations and epigenetic modifications; therefore, miRNAs are important factors in breast carcinogenesis (9). A number of miRNAs are expressed at higher or lower levels in tumor tissues compared with normal tissues and may serve as tumor markers in breast cancer (10,11). In fact, miRNAs have attracted great interest as cancer biomarkers in the last decade (12–14).Certain miRNAs have been identified in previous research as breast cancer biomarkers, including miR-30a, miR-361-5p, miR-27a and miR-193b (12,15–18). However, research based on large sample sizes is lacking, particularly research employing clinical data (19–21). The Cancer Genome Atlas (TCGA) project, which was launched in 2006 by the National Cancer Institute and the National Human Genome Research Institute, contains data on 33 types of cancer from >10,000 patients (22,23). TNBC data available from TCGA consist of tissue miRNA-sequencing (seq) data and clinical information. The present study analyzed the aforementioned data in four steps: i) Identification of differentially expressed miRNAs between breast cancer and adjacent normal tissues; ii) screening of the miRNAs obtained in the first step using the Kaplan-Meier survival method, which yielded a total of six specific miRNAs; iii) identification of three of these six miRNAs that predicted patient survival rate based on statistical analysis; and iv) identification of the biological pathways regulated by these three miRNAs.
Materials and methods
TNBC miRNA-seq dataset and clinical information
The data analyzed in the present study were downloaded from TCGA (up to January 28, 2016). The clinical data for each patient were derived from a variety of methods used to detect HER-2 levels; therefore, ‘patientHER2 immunohistochemistry receptor status’ was used as a standard to determine patientHER-2 status (24). When confirming ER, PR and HER status, only patients with ‘positive’ and ‘negative’ data were enrolled, whereas patients with data deemed ‘close’ or ‘not available’ were excluded. The TNBC miRNA-seq data for miRNA differential expression consisted of data on tumor tissues (n=187) and matched normal tissues (n=12). All of the miRNA expression data were reported as ‘reads-per-million-miRNA-mapped’ and were normalized by log2 transformation. The inclusion criteria regarding clinical information for survival analysis were as follows: i) Complete follow-up data were available for 1–60 months (30-1,825 days); ii) the clinical data were complete (patients with uncertain or missing clinical data, with the exception of metastasis state, which contains various Mx stages, were excluded) and iii) miRNA-seq data integrity was validated (patients without individual miRNA values were exluded). Finally, a total of 151 patients who met these criteria were enrolled in the present study. In order to incorporate as much data as possible, miRNA differential expression and survival analyses were performed independently.
Screening of differentially expressed miRNAs
The Linear Models for Microarray Data (limma) package (version 3.36.5; http://bioconductor.org/packages/release/bioc/html/limma.html) in R (version 3.5.1; http://www.r-project.org/) was used to analyze differentially expressed miRNAs. The comparisons between tumor and normal tissues performed in limma were not conducted using a 1:1 ratio of tumor to healthy tissue, as healthy tissue specimens were not obtained from the majority of patients. Therefore, the tumor tissue from each patient was compared with all healthy tissue specimens. The fold-change (FC) was used to indicate the difference in miRNA expression between tumor and normal tissues. Log|FC|>1.5 and P<0.05 following false discovery rate adjustment were established as the cut-off criteria.
Survival analysis and the miRNA prognostic model
Survival analyses were performed using the survival package (version 2.43-1; http://cran.r-project.org/web/packages/survival/index.html) in R and SPS (version 22.0; IBM Corp.). The samples were sorted according to miRNA expression level and divided into high- and low-expression groups based on the median expression level. Each miRNA was evaluated individually using the Kaplan-Meier method in R. A total of 77 miRNAs that were closely associated with overall survival (OS) time (P<0.05) were identified. The intersection of the 77 miRNAs and 194 differentially expressed miRNAs revealed six miRNAs. After log-rank tests were performed, two miRNAs were eliminated as there was no significant difference in survival time between the low- and high-expression groups. The prognostic value of the remaining four miRNAs was evaluated with a Cox proportional hazards model using multiple logistic regression with forward stepwise selection of variables. Finally, three miRNAs were considered independent risk factors associated with OS time in the model. A risk score was calculated for each patient based on the expression levels of the three miRNAs as determined by the Cox regression coefficients (25). In total, 151 patients were divided into either the high-risk group (n=76) or low-risk group (n=75) based on risk score. The predictive power of the three-miRNA signature and clinical features together was analyzed. Univariate/multivariate Cox regression and the Kaplan-Meier method were used to assess the associations and influence on survival rate of the selected miRNAs. P<0.05 was considered to indicate a statistically significant difference. Data are expressed as the mean ± standard deviation.
Kyoto Encyclopedia of Genes and Genomes (KEGG) and Gene Ontology (GO) analyses
Two online miRNA databases were employed to predict the target genes of the three prognostic miRNAs: TargetScan (version 7.2) (26) and miRDB (27,28). The overlapping set of target genes was depicted in Venn diagrams for subsequent analysis. The overlapping genes were analyzed using the Database for Annotation, Visualization and Integrated Discovery (DAVID; version 6.8; david.ncifcrf.gov). DAVID is a bioinformatics database that integrates biological data and analysis tools, provides systematic and comprehensive bioinformatics annotations for large-scale gene or protein lists and aids users to extract bioinformatics data. The tools in DAVID perform numerous functions, including gene function enrichment, which improves the understanding of the biological roles of specific gene sets (29,30). GO (http://geneontology.org/) and KEGG (https://www.genome.jp/kegg; Release 88.0) pathway enrichment analyses were performed to identify the biological processes and pathways involving target genes. The cut-off criteria were P<0.05 and gene count ≥3 (31).
Results
Identification and analysis of differentially expressed miRNAs in TNBC
The TNBC miRNA-seq data in the present study included data on 187 tumor tissues and 12 matched normal tissues. A total of 194 differentially expressed miRNAs were identified using the cut-off criteria of P<0.05 and |log2FC|>2. Of these miRNAs, 65 were downregulated and 129 were upregulated (Fig. 1; Table SI).
Figure 1.
Volcano plot of the differentially expressed miRNAs. The dots in the gray regions represent upregulated or downregulated miRNAs. The white spots represent miRNA which has no statistical difference in expression between tumor tissue and normal tissue. miRNA, microRNA.
Identification of three miRNAs associated with OS time in TNBC
A total of 151 patients with validated data were enrolled in the present study. A total of 77 miRNAs significantly associated with OS time were identified from the survival analysis (Table SII). The intersection of the 77 miRNAs and 194 differentially expressed miRNAs revealed six miRNAs, including miR-4742-3p, miR-21-3p, miR-659-5p, miR-500b-3p, miR-429 and miR-200b-5p. Log-rank tests were performed to evaluate the survival curves. miR-4742-3p and miR-500b-3p were eliminated based on the cut-off criteria (P<0.05; Table SIII). The prognostic value of each of the remaining four miRNAs was evaluated using the Cox proportional hazards model, and miR-21-3p, miR-659-5p and miR-200b-5p were identified as independent risk factors associated with OS time (Fig. 2A-C). Among these miRNAs, miR-200b-5p was identified as a risk factor, whereas miR-21-3p and miR-659-5p were identified as protective factors. Additionally, the association between the aforementioned three miRNAs and clinical features was investigated (Table I). The results revealed that miR-218-1 was associated with metastasis state (P=0.031). A risk score based on the expression levels of the three miRNAs determined using the Cox regression coefficient for each miRNA was calculated as follows: Risk score=(0.627 × expression of miR-200b-5p)-(1.181 × expression of miR-659-5p)-(0.889 × expression of miR-21-3p). The risk score was used to divide the 151 patients into a high-risk group (n=76; risk score=−9.05~-5.99) and a low-risk group (n=75; risk score=−12.96~-9.06). Compared with the low-risk group, the high-risk group exhibited poor survival rate outcomes as assessed using the Kaplan-Meier method (Fig. 2D).
Figure 2.
Kaplan-Meier curve and log-rank test for each miRNA and the three-miRNA signature. (A) miR-21-3p, (B) miR-200b-5p, (C) miR-659-5p and (D) the three-miRNA signature. A total of 151 patients were divided into a high-risk group (n=76; risk score=−9.05~-5.99) and a low-risk group (n=75; risk score=−12.96~-9.06) based on the risk score. The risk score was calculated as follows: (0.627 × expression of miR-200b-5p)-(1.181 × expression of miR-659-5p)-(0.889 × expression of miR-21-3p). miRNA/miR, microRNA.
Table I.
Associations of the three microRNAs with the clinical features of patients with triple-negative breast cancer. The chi-square and t-tests were performed to assess the relationship between miRNA expression and clinical features. Data are expressed as the mean ± standard deviation.
Variable
Number of patients
miR-659-5p
P-value
miR-21-3p
P-value
miR-200-5p
P-value
Patient age at diagnosis
0.08
<60
100
1.40±0.70
0.81
11.29±0.77
0.19
4.52±1.10
≥60
51
1.43±0.64
11.10±0.92
4.16±1.34
AJCC clinical stage
0.33
GI–II
123
1.43±0.64
0.56
11.23±0.81
0.89
4.45±1.17
GIII–IV
28
1.34±0.83
11.21±0.91
4.20±1.31
AJCC T stage
0.84
T1-2
134
1.42±0.65
0.86
11.25±0.81
0.35
4.39±1.22
T3-4
17
1.38±0.90
11.05±0.93
4.45±0.99
AJCC N stage
0.52
N0-1
130
1.43±0.66
0.32
11.23±0.80
0.81
4.43±1.17
N2-3
21
1.27±0.77
11.18±0.98
4.24±1.36
AJCC M stage
0.03
M0
132
1.41±0.67
0.88
11.21±0.86
0.47
4.48±1.20
Mx
19
1.39±0.74
11.35±0.54
3.84±1.02
Histological type
0.39
IDC
136
1.41±0.70
0.93
11.26±0.81
0.12
4.43±1.21
Non-IDC
15
1.40±0.43
10.91±0.95
4.15±1.04
miR, microRNA; AJCC, American Joint Committee on Cancer; G, grade; T, tumor; N, node; M, metastasis; IDC, invasive ductile carcinoma.
Univariate and multivariate Cox regression analyses were performed to test the efficacy of the three-miRNA signature (high vs. low risk) in predicting OS time by considering the following clinical features: Age at initial pathological diagnosis (<60 years vs. ≥60 years), histological type [invasive ductile carcinoma (IDC) vs. non-IDC], tumor size [T in the American Joint Committee on Cancer (AJCC; http://cancerstaging.org/) Tumor, Node, Metastasis (TNM) staging system (T1-2 vs. T3-4)], lymph node status (N in the AJCC TNM staging system, N0-1 vs. N2-3), metastasis (M in AJCC TNM staging system, M0 vs. Mx) and AJCC pathological stage (G1-2 vs. G3-4). In the univariate analysis, the three-miRNA signature [hazard ratio (HR)=6.893; P=0.012; 95% confidence interval (CI) 1.52–30.69], pathological stage (HR=5.953; P=0.001; 95% CI 2.04–17.34) and lymph node status (HR=7.850; P=0.001; 95% CI 2.39–25.76) were associated with OS time in patients with TNBC (Table II). In the multivariate analysis, only the three-miRNA signature was identified as an independent prognostic factor (HR=7.396; P=0.011; 95% CI 1.59–34.41; Table II).
Table II.
Univariate and multivariate Cox regression analysis results for patients with triple-negative breast cancer.
Univariate analysis
Multivariate analysis
Variable
P-value
HR (95% CI)
P-value
HR (95% CI)
Age (≥60 vs. <60)
0.538
1.494 (0.416–5.366)
0.740
1.256 (0.327–4.831)
Histological type (Non-IDC vs. IDC)
0.686
1.366 (0.301–6.190)
0.969
1.034 (0.193–5.545)
Tumor size (T3-4 vs. T1-2)
0.200
2.307 (0.643–8.276)
0.735
1.336 (0.250–7.140)
Lymph node status (N2-3 vs. N0-1)
0.001
7.850 (2.392–25.761)
0.051
14.666 (0.989–217.523)
Metastasis (Mx vs. M0)
0.477
0.042 (0.000–257.128)
0.984
<0.001
Pathological stage (G3-4 vs. G1-2)
0.001
5.953 (2.044–17.335)
0.950
0.918 (0.063–13.415)
Three-miRNA signature (high-vs. low-risk)
0.012
6.893 (1.524–30.690)
0.011
7.396 (1.590–34.411)
HR, hazard ratio; CI, confidence interval; IDC, invasive ductile carcinoma; T, tumor; N, node; M, metastasis; G, grade.
KEGG and GO analyses of the target genes of the miRNAs in the three-miRNA signature
A total of 459 overlapping target genes for miR-21-3p, 52 overlapping target genes for miR-200b-5p and 74 overlapping target genes for miR-659-5p were identified (Fig. 3). The GO results revealed that these overlapping genes were enriched in 79 GO accessions and were mainly involved in ‘cell protein metabolism’, ‘RNA transcriptional regulation’ and ‘cell migration’ (Fig. 4). KEGG pathway analysis revealed enrichment in six pathways, including ‘ubiquitin-mediated proteolysis’, ‘long-term potentiation’, ‘MAPK signaling pathway’, ‘ErbB signaling pathway’, ‘prolactin signaling pathway’ and ‘adherens junction’ (Fig. 5).
Figure 3.
Venn diagrams displaying the overlapping target genes predicted using TargetScan and miRDB online tools. The grey circles represent the target genes of the three prognostic miRNAs predicted by TargetScan, and the circles with the diagonal lines represent the target genes of the three prognostic miRNAs predicted by miRDB. The overlapping areas between the two circles represent the overlapping set of target genes. miR, microRNA.
Figure 4.
GO analysis of the target genes of the miRNAs in the three-mRNA signature. The colors of the bands represent P-values, and the length of the bands represents the number of genes enriched in a particular GO term. GO, Gene Ontology; miRNA, microRNA.
Figure 5.
KEGG pathway analysis of the three-miRNA signature target genes. The color of the dots represents the -log10(P-value), and the dot size represents the number of genes enriched in a particular KEGG pathway. The x-axis represents the fold enrichment of target genes in a particular KEGG pathway. KEGG, Kyoto Encyclopedia of Genes and Genomes.
Discussion
The present study identified a three-miRNA signature associated with the survival rate of patients with TNBC. Patients were divided into two groups according to their risk score based on the expression pattern of the three miRNAs (miR-21-3p, miR-659-5p and miR-200b-5p). Significant differences in OS time were observed between the two groups. In the multivariate Cox model, the three-miRNA signature was identified as an independent prognostic factor of the OS time of patients with TNBC [HR=7.396; 95% CI, 1.59–34.41]. KEGG and GO analyses of the target genes of the three miRNAs indicated that they serve important roles in cell protein metabolism, RNA transcriptional regulation and cell migration, suggesting the aforementioned three miRNAs are implicated in the pathogenesis, progression and prognosis of TNBC. The 95% CI exhibited a wide range, possibly due to the small sample size (n=151). In a similar study, a wide 95% CI interval was also obtained (20). Future studies investigating larger sample sizes are required to assess the accuracy of the three-miRNA signature established in the present study.The present study investigated the associations between clinical factors, such as age, tumor size and clinical stage, and miRNA levels. However, factors such as co-morbidities, smoking status and treatment received were not taken into account. Databases are unlikely to have records of all of the clinical factors affecting miRNA levels, therefore it is not possible to consider the impact of these clinical factors on patients. In order to circumvent this problem, the limma package in R was used to analyze differentially expressed miRNAs in the present study. The comparisons between tumor and normal tissues performed in limma were not conducted using a 1:1 ratio of tumor to healthy tissue, as healthy tissue specimens were not obtained from the majority of patients. Therefore, the tumor tissue from each patient was compared with all healthy tissue specimens. This approach eliminates the impacts of individual patient treatment options or other diseases (32).miR-21 is one of the most studied miRNAs in breast cancer, but research focusing on the association between miR-21-3p and TNBC is lacking (33–35). In the present study, miR-21-3p expression was increased in tumor tissues compared with healthy tissues, which is consistent with previous studies (36,37). However, there is conflicting evidence regarding the association between miR-21-3p expression in tumor tissues and OS time. Previous studies revealed that miR-21-3p is an oncogenic factor in breast cancer, which upregulates the expression levels of phosphorylated protein kinase B (AKT) and L1 cell adhesion molecule (37,38). However, another in vitro study suggested that miR-21-3p reduces cancer cell growth (39). Furthermore, Jiao et al (40) reported that miR-21-3p functions as a tumor suppressor; however, further studies are required to investigate the effect of miR-21-3p on the OS time. Jiao et al (40) proposed that miR-21-3p causes transcript decay of miR-21-5p, a typical oncogene. The present study demonstrated that miR-21-3p was a protective factor in TNBC. However, this result requires further corroboration using a larger sample size.Aberrant expression of miR-200 family members, including miR-200-5p, has been observed in several cancer studies (13,41). The present study revealed that miR-200-5p expression was increased in TNBC tissues compared with healthy tissues, consistent with previous studies (19,21), and that miR-200-5p is a risk factor for OS time. However, previous studies have demonstrated that miR-200b/c activates the focal adhesion kinase/phosphoinositide 3-kinase/AKT/nuclear factor-κB signaling pathway and zinc-finger E-box-binding homeobox factor to inhibit cancer cell invasion and migration (42–44). The aforementioned studies all involved in vitro cell experiments, and one study reported a different conclusion using TCGA data (21). The aforementioned study reported no statistically significant difference in OS time between patients with low vs. high expression of miR-200b/c, highlighting the complex biology of cancer (21). There may be two possibilities to explain the discrepancy between the present study and previously published studies: i) The present study did not analyze a sufficient number of samples to generalize the entire population of patients with TNBC, and ii) in vitro experiments do not reflect the complex microenvironment in vivo, and some unknown factor may influence the function of miR-200-5p in vivo. These key issues require further investigation in future studies.The present study demonstrated that miR-659-5p is an independent protective factor associated with OS time in patients with TNBC. To the best of our knowledge, the present study was the first to describe an association between miR-659 and breast cancer. Due to the lack of systematic research, little is known about the role of miR-659. One study demonstrated that miR-659 targets the mitogen-activated protein kinase 9 (MAPK9) gene. In patients with nasopharyngeal carcinoma, MAPK9 expression levels influence polymorphic lactotransferrin(LTF) haplotypes, which are positively associated with the incidence of cancer (45). Other studies have focused on the role of miR-659 in muscle development and nervous system and metabolic diseases, such as frontotemporal dementia and diabetes mellitus (46–48). Thus, the mechanism of miR-659-5p in breast cancer requires further investigation.The present study investigated the biological activities of the selected three miRNAs using KEGG and GO analyses and identified six enriched pathways. Enhanced ubiquitin-mediated proteolysis causes the rapid hydrolysis of proteins, including tumor protein p53 and interferon α inducible protein 27, which are closely associated with invasiveness, malignancy, clinical stage and poor prognosis in breast cancer (49). Ubiquitin-mediated proteolysis has been suggested to be a marker of the risk of breast cancer (50). Zhang et al (51) reported that mitogen-activated protein kinase (MAPK) expression levels in TNBC tissues were significantly increased compared with normal adjacent tissues, indicating an association between MAPK expression and TNBC. In addition, studies have indicated that MAPK protein expression is associated with clinical stage and lymph node metastasis in TNBC (52,53). The epidermal growth factor receptor (EGFR) serves important roles in TNBC (54). Previous studies have revealed the upregulation of EGFR, which affects carcinogenesis, progression and metastasis of breast cancer, in 70% of patients with TNBC (55,56). A previous study reported enrichment of the prolactin (PRL) signaling pathway in TNBC (57). López-Ozuna et al (58) reported that the PRL signaling pathway may be used to predict the response to prodifferentiation therapy in TNBC. The adherens junction pathway is another pathway that was reported to be enriched in TNBC-associated processes (59). Vascular endothelial-cadherin, heparan and the probe for protein kinase C α have previously been implicated in the adherens junction pathway (60–62). The main component of the cell-cell adherens junction is E-cadherin, which plays a major role in maintaining normal breast epithelial cell morphology (63). Breast cancer has been found to downregulate or interfere with the expression of E-cadherin, which is closely associated with the epithelial-to-mesenchymal transition (EMT) (63), a key step in the invasion and metastasis of breast cancer (64,65). The induction of EMT requires the synergy of the MAPK signaling pathway (66,67). The aforementioned observations indicate an association between the MAPK signaling and adherens junction pathways, which requires further investigation.The present study had a number of limitations. These included a relatively small sample size and a lack of validation experiments, such as reverse-transcription quantitative polymerase chain reaction analysis, for the selected three miRNA transcripts at the tissue and plasma levels. Further research based on a larger cohort of patients with TNBC is required to validate the findings obtained.In conclusion, the present study identified a three-miRNA signature as a potential prognostic predictor of patients with TNBC. Due to the complex biology of cancer, the molecular mechanisms involving miRNA signatures warrant further investigation.
Authors: Izidore S Lossos; Debra K Czerwinski; Ash A Alizadeh; Mark A Wechser; Rob Tibshirani; David Botstein; Ronald Levy Journal: N Engl J Med Date: 2004-04-29 Impact factor: 91.245
Authors: M M Javle; J F Gibbs; K K Iwata; Y Pak; P Rutledge; J Yu; J D Black; D Tan; T Khoury Journal: Ann Surg Oncol Date: 2007-09-19 Impact factor: 5.344
Authors: Vanessa M López-Ozuna; Ibrahim Y Hachim; Mahmood Y Hachim; Jean-Jacques Lebrun; Suhad Ali Journal: Sci Rep Date: 2016-08-02 Impact factor: 4.379
Authors: Arsalan Amirfallah; Hildur Knutsdottir; Adalgeir Arason; Bylgja Hilmarsdottir; Oskar T Johannsson; Bjarni A Agnarsson; Rosa B Barkardottir; Inga Reynisdottir Journal: PLoS One Date: 2021-11-19 Impact factor: 3.240
Authors: Yuquan Qian; Jimmy Daza; Timo Itzel; Johannes Betge; Tianzuo Zhan; Frederik Marmé; Andreas Teufel Journal: Cells Date: 2021-03-15 Impact factor: 6.600