James R W McMullen1, Ubaldo Soto2.
Abstract
Breast cancer (BrC) is a common malignancy with genetically diverse subtypes. There is evidence that specific BrC subtypes originate from particular normal mammary cell populations. However, the cell populations that give rise to most BrC subtypes remain unidentified. Several human breast scRNAseq datasets are available. In this research, we used a robust human scRNAseq dataset to identify population-specific marker genes and then examined the expression of these marker genes in specific BrC subtypes. In humans, several BrC subtypes, namely HER2-enriched, basal-like, and triple-negative (TN), are more common in women who have had children. This observation suggests that cell populations that originate during pregnancy give rise to these BrCs. The current human datasets have few normal parous samples, so we supplemented this research with mouse datasets, which contain mammary cells from various developmental stages. This research identified two novel normal breast cell populations that may be the origin of the basal-like and HER2-overexpressing subtypes, respectively. A stem cell-like population, SC, that expresses gestation-specific genes has gene expression patterns similar to those of basal-like BrCs. A novel luminal progenitor cell population and HER2-overexpressing BrCs are both marked by S100A7, S100A8, and S100A9 expression. We bolstered our findings by examining SC gene expression in TN BrC scRNAseq datasets and S100A7, S100A8, and S100A9 expression in BrC cell lines. We found that several potential cancer stem cell populations in TN BrCs highly express most of the SC genes, and we confirmed S100A8 and S100A9 overexpression in a HER2-overexpressing BrC cell line. In summary, normal SC and the novel luminal progenitor cell population likely give rise to basal-like and HER2-overexpressing BrCs, respectively. Characterizing these normal cell populations may facilitate a better understanding of specific BrC subtypes.
Keywords: Basal-like breast cancer; Breast cancer; Breast neoplasms; HER2 breast cancer; Mammary gland development
Year: 2022 PMID: 35633393 PMCID: PMC9148339 DOI: 10.1007/s12672-022-00500-6
Source DB: PubMed Journal: Discov Oncol ISSN: 2730-6011
Fig. 1 Human adult mammary cell populations. A t-SNE plot of seven normal breast scRNA-seq datasets. Individual dots correspond to cells and dot color indicates tissue sample. B Four cell clusters, luminal progenitor (LP), luminal differentiated (LD), basal (B), and stromal (Str) are identified at 0.05 cluster resolution in a t-SNE plot. C Eight cell clusters, three luminal progenitors (LP1-3), two luminal differentiated (LD1-2), a basal (B1), a stem cell-like population (SC), a transition (T), and a stromal (Str) cell cluster are identified at 0.14 cluster resolution in a t-SNE plot. D Expression of LP, LD, and B marker genes was analyzed in t-SNE plots. Grey indicates low or no gene expression while red indicates high expression. E Heatmaps of the top 20 and 10 differentially expressed genes at the 0.05 and 0.14 cluster resolution, respectively. Red indicates high expression while blue indicates low expression
Fig. 2 Gene expression in human adult mammary cell populations. Dot plots show the expression of population-specific genes in the cell populations at 0.14 cluster resolution. A Mammary luminal progenitor, luminal differentiated, and basal cell marker expression in human adult mammary cell populations. B SC gene expression in human mammary cells, as well as expression of basal (KRT5), luminal (KRT18), epithelial (EPCAM), and mesenchymal (VIM) markers. C LP2 and D LP3 marker gene expression in human mammary cells
Genes highly expressed in the SC, LP2, and LP3 populations, and their expression in mouse mammary populations and human breast cancer
| Gene marker | Identified human cell populations | Identified mouse cell populations | GENT2 data: BrC subtype with highest gene expression | GENT2 data: highest log2 fold change between BrC subtypes |
|---|---|---|---|---|
| SC genes | | | | |
| BIRC5 | SC | Very hi in fMaSC, med in AD1, AP1, B1 | Basal | 1.994*** |
| CDK6 | SC | Medium in all LP, AP, AD, B1, B2, B4, and B5; low in fMaSC | Basal | 1.831*** |
| CENPF | SC | AD1, AP1, B1; low in fMaSC | Basal | 1.858*** |
| CENPW | SC | Medium in fMaSC, AD1, AP1, B1, low in rest | Basal | 2.248*** |
| FDCSP | Hi in LP3, low in rest | – | Basal | 4.458*** |
| HIST1H4C | SC, low in rest | Low in fMaSC | Triple-negative | 3.214*** |
| HMGB2 | SC, low in rest | Very hi in fMaSC, AP1, and AD1, medium in rest | Triple-negative (second highest in Basal) | 1.833*** |
| STMN1 | Hi in SC, low in rest | Very hi in fMaSC, hi in AP1, AD1, AP2, B1, low in rest | Triple-negative (second highest in Basal) | 2.208*** |
| TOP2A | SC | Hi in fMaSC, medium in AP1, AD1, B1 | Basal | 1.634*** |
| TPX2 | SC | fMaSC, AP1, AD1, B1 | Basal | 2.264*** |
| TYMS | SC | Hi in fMaSC, medium in AP1, AD1, B1, low in rest | Basal | 1.899*** |
| UBE2C | SC | Very hi in fMaSC, hi in AP1, AD1, B1 | Basal | 1.949*** |
| UBE2S | Hi in SC, low in rest | Medium-low in all adult cell populations | Basal | 1.988*** |
| LP2 genes | | | | |
| CXCL5 | LP2 | Hi in LP2, medium in LP1 and LP4 | Triple-negative | 2.142*** |
| LCN2 | Hi in LP2, low in rest of LP | Very hi in LP2, medium-low in rest of adult cell populations | Basal | 2.587*** |
| S100A7 | LP2 | Low in LP1 & B3 | HER2 | 5.443*** |
| S100A8 | LP2 | LP1, LP2 | HER2 | 4.348*** |
| S100A9 | Hi in LP2, low in rest of LP | None | HER2 | 4.637*** |
| SAA1 | Hi in LP2, low in rest of LP, B1 | AD2, LP2 | Triple-negative | 2.659*** |
| SAA2 | LP2, low in LP3 | AD2 | – | – |
| SERPINB4 | LP2 | – | Luminal | 1.503*** |
| SLPI | Hi in LP2, low in rest of LP and SC | Hi in B2, low in rest of adult cell populations | Basal | 2.681*** |
| LP3 genes | | | | |
| CHI3L1 | LP2, LP3 | – | Basal | 2.780*** |
| CYP1B1 | LP3, low in LP1 and LP2 | – | Triple-negative | 1.410*** |
| FDCSP | LP3 | – | Basal | 4.458*** |
| LTF | LP3, low in LP1 and LP2 | Hi in AP2, LP2, medium in LP1, LP3, and LP4 | HER2 | 2.825*** |
| RARRES1 | LP2, LP3, low in LP1 | None | Basal | 3.189*** |
GENT2 data come from n = 2164 patient samples in the GEO database that were profiled by microarray
fMaSC: fetal mammary stem cells; B: basal; AD: differentiated alveolar cells; AP: alveolar progenitor cells; LP: luminal progenitor cells; LD: differentiated luminal cells; SC: stem cell; –: gene not in dataset
*** p < 0.001
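The fold-change column in the table is on a log2 scale, so the corresponding linear ratio is recovered by exponentiation. The following is a minimal sketch, assuming each reported value is a log2 ratio of expression between two BrC subtypes; the helper names `log2fc_to_fold` and `fold_to_log2fc` are hypothetical and are not part of GENT2.

```python
import math


def log2fc_to_fold(log2fc: float) -> float:
    """Convert a log2 fold change into a linear fold change (a ratio)."""
    return 2.0 ** log2fc


def fold_to_log2fc(fold: float) -> float:
    """Convert a positive linear fold change back into a log2 fold change."""
    if fold <= 0:
        raise ValueError("fold change must be positive")
    return math.log2(fold)


# Log2 fold changes copied from the table for the three S100 genes.
table_log2fc = {"S100A7": 5.443, "S100A8": 4.348, "S100A9": 4.637}

for gene, value in table_log2fc.items():
    # Print the linear ratio implied by each log2 value.
    print(f"{gene}: log2FC = {value:.3f}, linear fold = {log2fc_to_fold(value):.1f}")
```

A log2 fold change of 1 corresponds to a doubling, so values above 4 in the table imply more than a 16-fold difference on the linear scale.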
Fig. 3 Mammary cell populations in the adult mouse in nulliparous, gestational, lactating, and post-involution stages. A The 15 cell populations colored by cell type: basal (purple), differentiated alveolar (blue), alveolar progenitor (light blue), luminal progenitor (orange), and luminal differentiated (yellow). B The 15 cell populations colored by developmental stage: nulliparous (teal), gestational (orange), lactating (red), and post-involution (turquoise). Based on the work of Bach et al. [9]
Fig. 4 SC gene expression in eight human triple-negative breast cancer cell datasets. A t-SNE plot of eight triple-negative breast scRNA-seq datasets. Individual dots correspond to cells and dot color indicates individual tissue sample. B The cells from the t-SNE plot in A were divided into eleven cell clusters at 0.25 cluster resolution based on shared gene expression patterns. The dot color and number represent cell clusters. C Dot plot showing the expression of SC genes in the breast cancer cell populations, as well as basal (KRT5), luminal (KRT18), epithelial (EPCAM), and mesenchymal (VIM) markers
Fig. 5 LP2 gene expression in breast cancer cell lines. RT-qPCR bar graphs of relative normalized RNA expression of LP2 genes in BrC cell lines. Expression of three LP2 genes (S100A7, S100A8, S100A9) was examined in MCF-7 (Luminal A, light blue), BT474 (Luminal B, orange), SKBR3 (HER2-overexpressing, turquoise), HS578T (TN claudin-low, brown), and MB231 (TN claudin-low, green). The error bars show standard error of the mean. Three independent experiments are shown. * indicates p < 0.050 compared to SKBR3
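The error bars in Fig. 5 are standard errors of the mean over three independent experiments. As a minimal sketch of that calculation, the helper below (a hypothetical name, `mean_and_sem`) divides the sample standard deviation by the square root of the number of replicates. The input values are illustrative placeholders and are not data from the study.

```python
import math
import statistics


def mean_and_sem(values):
    """Return (mean, standard error of the mean) for replicate measurements."""
    if len(values) < 2:
        raise ValueError("need at least two replicates to estimate the SEM")
    mean = statistics.mean(values)
    # SEM = sample standard deviation (n - 1 denominator) / sqrt(n)
    sem = statistics.stdev(values) / math.sqrt(len(values))
    return mean, sem


# Hypothetical relative normalized expression values from three experiments;
# these numbers are made up for illustration only.
replicates = [1.0, 1.2, 0.8]
mean, sem = mean_and_sem(replicates)
print(f"mean = {mean:.3f}, SEM = {sem:.3f}")
```

With only three replicates the SEM is itself a rough estimate, which is worth keeping in mind when comparing bars.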