Breast cancer, the most common type of cancer in women, presents a challenge for global research (1). Molecular research on breast cancer was at a bottleneck until a correlation between BRCA1/2 DNA repair associated and breast cancer was identified, which marked great progress in the research, treatment and prognosis of breast cancer (2,3). Following this, genetic susceptibility has become the focus of breast cancer research (4). However, in the context of big data, with the development of high-throughput sequencing technology, it is not difficult to find susceptibility genes.
The Homeobox (HOX) genes encode a large class of transcription factors that play an important role in embryogenesis and oncogenesis, as well as in the distribution of fat and hair on the body (5–7). In humans, the HOX gene family contains 39 HOX genes located on 4 different chromosomes (7p15, 17q21.2, 12q13 and 2q31) (8). The 39 HOX genes are divided into 4 clusters (HOXA, HOXB, HOXC and HOXD) (9). Each HOX gene contains a well-conserved DNA sequence, known as the homeobox (10). The unique expression pattern of the HOX genes, including mutation, and their dependent mechanisms regulate, to some extent, the embryonic development of vertebrates (11,12). When HOX protein expression is upregulated, it may lead to cancer (5). It has also been reported that the HOXC gene family is highly expressed in certain solid tumors, including lung, prostate and colon cancer (13,14). The HOXA and HOXB gene families are similarly expressed in breast cancer, which is derived from the ectoderm. Whether the expression level of the HOX genes follows the germ layer of origin in cancer requires further investigation. A study has reported that HOXC13, a member of the HOXC gene family, is highly expressed in the MCF-7 cell line (15). Thus, the present study aimed to explore the expression and significance of HOXC13 in breast cancer.
In the present study, the Oncomine and tumor public databases (bc-GenExMiner v4.2; GOBO database; CCLE) were used to analyze the expression level of HOXC13 in different types of cancer, including breast cancer. HOXC13 was then further investigated in breast cancer. The expression and co-expression of HOXC13 in breast cancer were re-analyzed using the University of California, Santa Cruz (UCSC) cancer genomics browser. Finally, the clinical significance of HOXC13 in breast cancer was further explored.
Materials and methods
Oncomine database verification
The Oncomine (www.oncomine.org) database is a public bioinformatics database containing gene expression datasets that has become an industry-standard tool cited in >1,100 peer-reviewed journal articles (16,17). The Oncomine platform has been used as a foundation for ground-breaking discoveries, with unique features that include scalability, high quality, consistency and standardized analysis (18). In order to screen out the most meaningful RNA probes, the paired Student's t-test was used to generate P-values to compare expression differences between cancer and adjacent non-cancerous tissues. The following cut-off values were used: P<1×10−4, fold change >4 and gene ranking in the top 10%. Moreover, the Oncomine database was used for co-expression analysis of HOXC13 in breast cancer. The search term ‘HOXC13’ was used, followed by co-expression analysis with breast cancer selected as the cancer type. Lastly, the TCGA breast dataset was chosen, and the same cut-off values as aforementioned were applied.
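The three simultaneous cut-offs described above can be illustrated with a minimal Python sketch. This is not the Oncomine implementation: the `Analysis` record, its field names and the example values are hypothetical and are used only to show how the P-value, fold change and gene rank criteria combine.

```python
from dataclasses import dataclass


@dataclass
class Analysis:
    """One cancer vs. normal comparison (hypothetical record layout)."""
    dataset: str          # made-up dataset label
    p_value: float        # P-value of the comparison
    fold_change: float    # cancer / normal expression ratio
    gene_rank_pct: float  # gene rank percentile (1.0 means top 1%)


def passes_thresholds(a, p_max=1e-4, fc_min=4.0, rank_max=10.0):
    """Return True only when all three cut-offs are met at the same time."""
    return (a.p_value < p_max
            and a.fold_change > fc_min
            and a.gene_rank_pct <= rank_max)


# Illustrative values only; these are not results from the study.
analyses = [
    Analysis("dataset_1", 2.0e-12, 5.1, 1.0),   # meets all three cut-offs
    Analysis("dataset_2", 3.0e-3, 6.0, 4.0),    # fails the P-value cut-off
    Analysis("dataset_3", 1.0e-8, 2.5, 3.0),    # fails the fold change cut-off
    Analysis("dataset_4", 5.0e-6, 4.8, 25.0),   # fails the gene rank cut-off
]

selected = [a.dataset for a in analyses if passes_thresholds(a)]
print(selected)
```

Because the criteria are joined with a logical AND, a dataset with a very small P-value is still excluded when its fold change or gene rank falls outside the thresholds.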
Cancer cell line encyclopedia (CCLE) verification
The CCLE (portals.broadinstitute.org) provides public access to genomic data, analysis and visualization for >1,100 cell lines from various tumors, such as the gastric cancer cell line AGS and the intestinal cancer cell line SW480 (19,20). Each gene of the human genome has multiple datasets and data identifiers, obtained by high-throughput sequencing. The 5 major dataset types are copy number, mRNA expression (Affymetrix), reverse phase protein array, reduced representation bisulfite sequencing, and mRNA expression (RNA sequencing). The expression and methylation levels of HOXC13 were analyzed in each tumor cell line in the CCLE using the search term ‘HOXC13’.
Gene expression-based Outcome for Breast cancer Online (GOBO) analysis
The GOBO database (version 1.0.3; co.bmc.lu.se/gobo/gsa_cellines.pl) is a user-friendly public database. It allows for rapid assessment of gene expression levels, identification of co-expressed genes and association with outcome for single genes, gene sets or gene signatures in a 1,881-sample breast cancer dataset (21). The most important functionality of the GOBO database is the possibility of investigating gene expression levels in breast cancer subgroups and cell lines for gene sets (22). Breast cancer subtypes are classified into basal A, basal B and luminal in the GOBO database. A correlation map is a square table in which each row and each column represents a gene. Each cell represents an interaction between two genes and is colored according to the value of the Pearson's correlation coefficient between these two genes, from dark blue (coefficient=−1) to dark red (coefficient=1). Cells on the diagonal of the correlation map represent the interaction of a gene with itself and are colored black.
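The structure of such a correlation map can be sketched in a few lines of Python. The following is an illustrative example only, not the code used by the database: the gene labels (`GENE_A`, `GENE_B`, `GENE_C`) and expression values are made up, and the functions `pearson` and `correlation_map` are hypothetical helpers.

```python
import math


def pearson(x, y):
    """Pearson's correlation coefficient between two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


def correlation_map(expr):
    """Square table: one row and one column per gene, one coefficient per cell."""
    genes = list(expr)
    return {g: {h: pearson(expr[g], expr[h]) for h in genes} for g in genes}


# Made-up expression values for three placeholder genes across four samples.
expr = {
    "GENE_A": [1.0, 2.0, 3.0, 4.0],
    "GENE_B": [2.0, 4.0, 6.0, 8.0],   # rises with GENE_A
    "GENE_C": [4.0, 3.0, 2.0, 1.0],   # falls as GENE_A rises
}

cmap = correlation_map(expr)
for gene, row in cmap.items():
    print(gene, [round(r, 2) for r in row.values()])
```

In this toy table, the cell for GENE_A and GENE_B sits at the dark red end of the scale (coefficient close to 1), while the cell for GENE_A and GENE_C sits at the dark blue end (coefficient close to −1). The table is symmetric about its diagonal.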
UCSC cancer genomics browser analysis
UCSC Xena (xena.ucsc.edu) is an online exploration tool for public and private multi-omics functional genomics and clinical/phenotype data (23). Data from The Cancer Genome Atlas (TCGA; http://www.cancer.gov/about-nci/organization/ccg/research/structural-genomics/tcga) were used in the present study. Using the UCSC database, a heat map of HOXC13 expression in various subtypes of breast cancer, such as luminal A, luminal B, normal-like, basal-like and HER2-enriched, was generated. At the same time, the co-expression of HOXC13 in breast cancer was analyzed.
cBioPortal database analysis
The public cBioPortal site (www.cbioportal.org; last accessed, 28th March 2019) is hosted by the Center for Molecular Oncology at Memorial Sloan Kettering Cancer Center. The cBioPortal database has been recognized by multiple studies as a way to verify gene mutations (24–26,28). This database can also analyze gene expression in different tumors and different studies (23,27). The online cBioPortal for Cancer Genomics was used to obtain the mutations of HOXC13 (including nonstart, missense and truncation) and its expression in different studies (28). The cancer types and datasets used were breast (TCGA2015), breast (TCGA) and breast invasive carcinoma breast (TCGA PanCan) (29–34).
Breast cancer gene-expression miner (bc-GenExMiner) analysis
bc-GenExMiner (version 4.2) is a statistical mining tool for published annotated genomic data (35,36). The statistical analyses are grouped into three modules: Expression, prognosis and correlation. The co-expression of HOXC13 in breast cancer was explored using this database; each result was validated across multiple databases to avoid discrepancies between individual databases. At the same time, the effects of HOXC13 and HOX transcript antisense RNA (HOTAIR) on the survival prognosis in breast cancer were also analyzed.
Statistical analysis
Pearson's correlation test and Spearman's rank test were used to evaluate co-expression. The analysis criteria selected were as follows: Gene, nodal and estrogen receptor status of the cohorts to be explored, the event on which survival analysis was based and a splitting criterion of median HOXC13 expression. To analyze the prognostic value of HOXC13 and HOTAIR, the patient samples were split into two groups according to the median expression of HOXC13 and HOTAIR. The two patient cohorts were compared using a Kaplan-Meier survival plot, and the hazard ratio with 95% confidence intervals and the log-rank P-value were calculated. Significance was determined by the P-value provided by each database.
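The median split, Kaplan-Meier estimate and log-rank comparison described above can be illustrated with a minimal pure-Python sketch. It is not the code used by the databases: the function names, the toy expression values and the follow-up times are all hypothetical, and the hazard ratio with its confidence interval (which would normally come from a Cox model) is deliberately omitted.

```python
import math
from statistics import median


def median_split(expression):
    """Label a sample 'high' if its expression exceeds the cohort median."""
    cut = median(expression)
    return ["high" if v > cut else "low" for v in expression]


def kaplan_meier(times, events):
    """Return [(time, survival)] at each event time (event=1, censored=0)."""
    curve, surv = [], 1.0
    for t in sorted({t for t, e in zip(times, events) if e == 1}):
        at_risk = sum(1 for u in times if u >= t)
        n_events = sum(1 for u, e in zip(times, events) if u == t and e == 1)
        surv *= 1.0 - n_events / at_risk
        curve.append((t, surv))
    return curve


def logrank_chi2(times, events, groups):
    """Two-group log-rank chi-square statistic (1 degree of freedom)."""
    obs_minus_exp, var = 0.0, 0.0
    rows = list(zip(times, events, groups))
    for t in sorted({u for u, e, _ in rows if e == 1}):
        n = sum(1 for u, _, _ in rows if u >= t)
        n1 = sum(1 for u, _, g in rows if u >= t and g == "high")
        d = sum(1 for u, e, _ in rows if u == t and e == 1)
        d1 = sum(1 for u, e, g in rows if u == t and e == 1 and g == "high")
        obs_minus_exp += d1 - d * n1 / n
        if n > 1:
            var += d * (n1 / n) * (1 - n1 / n) * (n - d) / (n - 1)
    return obs_minus_exp ** 2 / var


# Toy cohort: made-up expression values, follow-up times and event flags.
expression = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
times = [10, 12, 14, 3, 5, 7]
events = [1, 0, 1, 1, 1, 1]

groups = median_split(expression)
km_high = kaplan_meier(
    [t for t, g in zip(times, groups) if g == "high"],
    [e for e, g in zip(events, groups) if g == "high"],
)
chi2 = logrank_chi2(times, events, groups)
p_value = math.erfc(math.sqrt(chi2 / 2))  # chi-square tail with 1 df
print(groups, km_high, round(chi2, 3), round(p_value, 4))
```

In this toy cohort, every sample in the high-expression group has an event earlier than any sample in the low-expression group, so the log-rank statistic exceeds the 1-degree-of-freedom critical value of 3.84 at the 0.05 level.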
Results
HOXC13 mRNA expression in human cancer
HOXC13 has been reported to be highly expressed in digestive tract-derived tumors (37–39). However, to the best of our knowledge, there have been no reports on the expression of HOXC13 in breast cancer. Therefore, the expression levels of HOXC13 in various human tumors were determined using the Oncomine database and CCLE. A simultaneous fold change of >4, gene rank in the top 10% and P<1×10−4 was set as the threshold. To our surprise, HOXC13 was found by high-throughput sequencing and biological gene chip technology to be abnormally highly expressed in breast cancer (Fig. 1A). Only breast cancer yielded significant unique analyses in the Oncomine database when the fold change was >4.
Figure 1.
HOXC13 mRNA expression level in different types of human cancer. This image shows the expression of HOXC13 in human tumors. (A) Graph showing the number of datasets with a statistically significant mRNA HOXC13 overexpression, based on the cut-off value of P<1×10−4 and fold change >4 in the Oncomine database. The cell color is determined by the best gene rank percentile for the analyses within the cell. Blue represents low expression in tumors, red represents high expression in tumors and white represents no difference between tumor tissues and normal tissues. As shown in the figure, breast cancer has three datasets showing high expression, based on the cut-off value of P<1×10−4, fold change >4 and gene ranking in the top 10%. (B) mRNA expression of HOXC13 in different cancer cell lines. The expression of HOXC13 ranks highest in breast cancer using Affy gene chip data in the Cancer Cell Line Encyclopedia. (C) The mRNA expression of HOXC13 in breast cancer ranks fourth highest among different tumor cell lines in RNA-seq data, behind that of Ewing's sarcoma, esophagus and upper aerodigestive tract. Affy, Affymetrix; RNA seq, RNA sequencing; CNS, central nervous system; HOXC13, homeobox C13.
HOXC13 mRNA expression in breast cancer
The high expression of HOXC13 in breast cancer was preliminarily confirmed in the Oncomine database, GOBO database and CCLE. In the CCLE database, the expression level of HOXC13 in breast cancer ranked fourth at the RNA-seq level and first at the Affy level. However, there is no report on the specific expression of HOXC13 in breast cancer. Therefore, its specific expression in various subtypes of breast cancer was further explored. HOXC13 was analyzed in various tumor types using the Oncomine and GOBO databases, and in various breast cancer cell lines using the GOBO database and CCLE. HOXC13 was found to be more highly expressed in invasive and luminal-like breast cancer than in any other subtype (Figs. 2 and 3). Such expression characteristics were consistent between breast cancer cell lines and tissues (Figs. 2D and 3). Using the UCSC cancer genomics browser analysis, the heat map of the gene and exon expression of HOXC13 in various subtypes of breast cancer was obtained (Fig. 4A). Furthermore, the expression of HOXC13 in the different datasets, including breast (TCGA 2015), breast (TCGA) and breast invasive carcinoma breast (TCGA PanCan), was explored using the cBioPortal database. The characteristics of HOXC13 in breast cancer across multiple datasets, such as amplification and missense mutations, were demonstrated (Fig. 4B).
Figure 2.
HOXC13 expression analysis in different breast cancer subtypes. Box plots represent the expression of HOXC13 in invasive breast cancer from the Oncomine database. (A) Invasive BC (invasive lobular breast carcinoma): P-value=7.04×10−12. (B) Invasive BC (invasive breast carcinoma): P-value=1.05×10−16. (C) Invasive BC (invasive ductal breast carcinoma): P-value=2.67×10−23. (D-E) Red represents breast cancer subtype basal A, gray represents breast cancer subtype basal B, and blue represents breast cancer subtype luminal. (D) Using GOBO analysis, in various subtypes of breast cancer, the HOXC13 expression was significantly higher in luminal-like breast cancer: P=0.00017. (E) GOBO analysis showing the expression of HOXC13 in each breast cancer cell line. BC, breast carcinoma; GOBO, Gene expression-based Outcome for Breast cancer Online; HOXC13, homeobox C13. The neve expression refers to the base two logarithm of the expression of the gene in each cell, and the neve intensity refers to the expression level of each breast cancer cell relative to the expression of the internal reference.
Figure 3.
HOXC13 expression analysis according to SCM1 breast cancer subtypes. (A) The box plot of HOXC13 expression according to SCM1 subtypes from bc-GenExMiner version 4.2. (B) Group comparison in each subgroup of breast cancer, and corresponding statistical values in SCM1 subtypes. HER2, human epidermal growth factor receptor 2; Lum, luminal; HOXC13, homeobox C13; SCM, subtype clustering model.
Figure 4.
Analysis of the expression of HOXC13 in breast cancer subtypes. (A) HOXC13 expression heatmap of each breast cancer subtype in TCGA from the University of California, Santa Cruz Xena browser. Null data indicates samples that have no gene expression data. (B) HOXC13 expression analysis and mutation status in breast cancer using cBioPortal. HOXC13 was significantly amplified in breast cancer subtypes, including breast invasive carcinoma. The datasets included were breast (TCGA2015), breast (TCGA) and breast invasive carcinoma breast (TCGA PanCan). TCGA, The Cancer Genome Atlas; RNA seq, RNA sequencing; HER2, human epidermal growth factor receptor 2; RPKM, reads per kilobase of transcript per million mapped reads; HOXC13, homeobox C13; VUS, variants of uncertain significance; CNA, copy number alteration.
HOXC13 methylation and mutation in human breast cancer
The bubble chart shows the methylation level of HOXC13 in breast cancer cell lines from the CCLE (Fig. 5A). Based on the methylation and coverage data, HOXC13 is highly methylated at three sites (positions 54,330,731, 54,330,950 and 54,330,957) on chromosome 12 (Fig. 5A). The discovery of CpG island methylation further supported the high expression of HOXC13 in breast cancer. cBioPortal was used to assess the frequency of HOXC13 mutations in breast cancer (Figs. 4B and 5). HOXC13 carries multiple alterations in breast cancer, such as amplification, gain, missense and truncation (Fig. 4B). Missense and truncation are the two major forms of mutation (Fig. 5).
Figure 5.
Methylation and mutation status of HOXC13 in breast cancer. (A) Methylation status of HOXC13 in breast cancer cell lines. The x-axis displays the cell line in which methylation was measured. The y-axis displays the position of the methylation data. The number before the colon is the chromosome, and the number after the colon is the position. Red, orange and blue represent high, medium and low levels of methylation, respectively. The size of the bubble represents coverage. (B) Mutation status of HOXC13 in breast cancer from cBioPortal. The mutation types include nonstart, missense and truncation, and missense is shown as 1, 2, 3 and 4. The boxes in green and red represent the open reading frame. The scale shows where the mutation site is located in the sequence. The corresponding protein changes include M1, S242Y, X_246splice and S242Y. The data used included studies on breast cancer (29–34).
Co-expression of HOXC13 mRNA in breast cancer
To investigate the reason for the high expression of HOXC13 in breast cancer, bc-GenExMiner version 4.2 was used to analyze the potential co-expression of HOXC13 in breast cancer. It was found that HOTAIR and HOXC13 are highly co-expressed in breast cancer (Fig. 6A). Furthermore, to verify the co-expression of HOXC13 and HOTAIR in breast cancer, their co-expression heat maps were obtained and mined (Fig. 6B) and correlation analysis was performed using the UCSC Xena browser (Fig. 6C).
Figure 6.
Co-expression analysis of HOXC13 and HOTAIR. (A) A correlation map illustrating pairwise correlations between HOXC13 and HOTAIR. The groups are, respectively, all patients, ER+, ER-, luminal A, luminal B and basal-like. (B) The co-expression heat map of HOXC13 and HOTAIR in TCGA derived from the UCSC Xena browser. (C) Association between HOXC13 and HOTAIR gene expression in TCGA breast cancer derived from the UCSC Xena browser. The Pearson's value was 0.7369 and the Spearman's value was 0.7048. HOXC13, homeobox C13; HOTAIR, HOX transcript antisense RNA; UCSC, University of California Santa Cruz; ER, estrogen receptor; -, negative; +, positive; TCGA, The Cancer Genome Atlas; RNA seq, RNA sequencing; HER2, human epidermal growth factor receptor 2; PAM, prediction analysis of microarray.
Impact of HOXC13 and HOTAIR on the prognosis of patients with breast cancer
To verify the impact of the high expression of HOXC13 and HOTAIR on patients with breast cancer, prognostic analysis of HOXC13 and HOTAIR in breast cancer was performed and HOXC13 and HOTAIR were found to have a negative impact on the prognosis of patients with tumor and lymph node metastasis (Fig. 7).
Figure 7.
Prognostic value of HOXC13 and HOTAIR in breast cancer. (A) Prognostic analysis of HOXC13 with regard to positive nodal status. The higher the expression of HOXC13, the shorter the AE-free survival time. (B) Prognostic analysis of HOXC13 with regard to metastatic relapse. The higher the expression of HOXC13, the shorter the MR-free survival time. (C) Prognostic analysis of HOTAIR with regard to positive nodal status. The higher the expression of HOTAIR, the shorter the AE-free survival time. (D) Prognostic analysis of HOTAIR with regard to metastatic relapse. The higher the expression of HOTAIR, the shorter the MR-free survival time. HOTAIR, HOX transcript antisense RNA; AE, any event; MR, metastatic relapse.
Discussion
The mammary gland is an appendage of the ectoderm, whose formation begins during embryonic development (40). Moreover, breast cancer cells remain highly associated with ectoderm cells (41). Studies have shown that the HOX genes play an important role in embryonic development and tumor formation (9,42–44). However, to the best of our knowledge, no research has reported the exact role of HOXC13 in breast cancer so far.
The HOXC13 gene is considered an element of hair morphogenesis, and the HOXC13 protein is a member of the human replication complex in growing cells (45,46). This is also the case in the breast adenocarcinoma MCF7 cell line (46). HOXC13 can promote the expression of a series of proto-oncogenes, including topoisomerase I and II, and allow these expression products to form a replication complex (47,48). Therefore, HOXC13 is involved in the development of tumors. In addition, HOXC13 increases the metastasis of ectodermal-derived melanoma (49). HOTAIR is transcribed from the antisense strand of the mammalian HOXC gene cluster on chromosome 12q13.13 (50–52). HOTAIR predicts poor prognosis in tumor cell cycle and metastasis (53). It has been reported that HOTAIR upregulates HOX in colon cancer (39). A strong co-expression of HOTAIR and HOXC13 was confirmed in proximal and distal colon cancer, suggesting that HOTAIR and HOXC13 could promote tumor and lymph node metastasis (39,54–56).
In the present study, HOXC13 was identified for the first time as being highly expressed in breast cancer at both the cellular and tissue levels. This finding came from the co-verification of data from TCGA (via Oncomine) and GSE (via bc-GenExMiner). It was also found for the first time that HOXC13 is most highly expressed in the luminal-like subtype of breast cancer. This lays the foundation for the future study of the relationship between HOXC13 and breast cancer hormone receptors (estrogen receptor and progesterone receptor).
In order to explore the high expression of HOXC13 in breast cancer, its methylation and mutation status were investigated. To our surprise, three sites on chromosome 12 were found to be consistently highly methylated in different breast cancer cell lines. Certain studies have reported that HOX gene methylation regulates hereditary breast cancer (57–59). In addition, the methylation level of HOXA11 is significantly higher in breast cancer compared with that in normal tissues, and is positively associated with family history and lymph node metastasis in breast cancer (59). Furthermore, the methylation level of HOXD13 in breast cancer is almost identical to that of HOXA11, and leads to shorter survival time (58). Therefore, the present findings on HOXC13 are consistent with the trends in methylation levels and the prognostic effects reported for other HOX gene family members in breast cancer. However, further studies on the specific mechanism by which HOXC13 methylation regulates HOXC13 transcription are required. The present study is the first to report the mutation of HOXC13 in breast cancer. Missense mutation may be an important cause of the high expression of HOXC13 in breast cancer.
The lncRNA HOTAIR is derived from the region between HOXC11 and HOXC12 (51). It has been shown that HOXC10, HOXC11, HOXC12 and HOXC13 are adjacent to each other in the HOXC gene cluster (52). In addition, the HOXC distal enhancer exerts non-specific enhancer activity on HOXC10 and HOTAIR, promoting the expression of HOXC10 and HOTAIR (60). On the other hand, specific intergenic non-coding RNAs (including HOTAIR) in the HOX loci can directly modulate the expression of the HOX genes in normal and cancer states (61). This has been confirmed in colon cancer (39,55). Therefore, the mechanism linking HOXC13 and HOTAIR will be explored further in this respect. In the present study, the co-expression of HOTAIR and HOXC13 provided a new direction for studying the function of HOTAIR in breast cancer.
In addition, a meta-analysis has identified that HOTAIR has a statistically significant effect on lymph node and distant metastasis in various types of cancer, including breast cancer, gastric cancer and colorectal cancer, which is consistent with our conclusion (62). The present data showed that HOTAIR and HOXC13 were significantly associated with lymph node metastasis and distant metastatic recurrence. Furthermore, they were shown to have a significant impact on survival time.
In conclusion, the present study was performed using public databases and revealed the expression and clinical significance of HOXC13 in breast cancer. However, the specific interactions and mechanisms involved require further experimental verification, which will be performed in future studies.