
Mammary gland development from a single cell 'omics view.

Alecia-Jane Twigger1, Walid T Khaled2.   

Abstract

Understanding the complexity and heterogeneity of mammary cell subpopulations is vital to delineate the mechanisms behind breast cancer development, progression and prevention. Increasingly sophisticated tools for investigating these cell subtypes have led to a greater understanding of the subtypes themselves, the complex interplay between certain subtypes and their developmental potential. Of note, the increasing accessibility and affordability of single cell technologies have led to a plethora of published studies containing data on mammary cell subtypes and their differentiation potential in both mouse and human data sets. Here, we review the different types of single cell technologies and how they have been used to improve our understanding of mammary gland development.
Copyright © 2021 The Authors. Published by Elsevier Ltd. All rights reserved.


Keywords:  Breast cancer; Mammary gland; Single cell technologies; mass cytometry; scATAC-seq; scDNA-seq; scRNA-seq


Year:  2021        PMID: 33810979      PMCID: PMC8158430          DOI: 10.1016/j.semcdb.2021.03.013

Source DB:  PubMed          Journal:  Semin Cell Dev Biol        ISSN: 1084-9521            Impact factor:   7.727


Introduction

Whilst the key motivating factor to study the development and function of the mammary gland is to better understand the tissue of origin for the deadliest cancer in females worldwide [1], under normal homoeostatic conditions it is the only secretory organ to mature in adulthood. Fundamentally, the organ is a bilayer branched ductal tree tipped by alveoli consisting of inner luminal cells and outer contractile myoepithelial basal cells which interface directly with surrounding supportive stromal cells. The main stages of mammary gland development occur during embryogenesis, puberty and in the adult in the case of pregnancy, lactation and subsequent involution [2]. Dramatic restructuring of the mammary gland occurs at each of these developmental stages, where only during lactation is the mammary gland considered functional and able to fulfil its purpose of milk production. Another source of dramatic reconstruction occurs in the case of breast cancer. Historically, breast cancer patients were typically stratified for treatment based on the clinical parameters of age, node status, tumour stage and histological grade together with the presence or absence of key hormone receptors for oestrogen (ER), progesterone (PR) and human epidermal growth factor receptor 2 (HER2) [3]. However, since then, molecular portraits of human breast cancers have been identified [4] and have been developed to separate tumours into six major subtypes: Luminal subtype A, Luminal subtype B, Luminal subtype C, ERBB2+, basal-like and normal-breast like [5], [6]. Due to the clear differences between these cell types, questions subsequently arose as to whether these diverse breast cancer phenotypes arose from the same “cell of origin” or from different subpopulations. 
The molecular profiles of these tumour types were compared to those of normal human mammary subpopulations, which can be enriched using fluorescence activated cell sorting (FACS) with the markers integrin alpha-6 (CD49f) and epithelial cell adhesion molecule (EpCAM) [7]. Following stromal depletion of CD45+ hematopoietic and CD31+ endothelial cells, enrichment and transcriptomic profiling of basal/mammary stem cells (EpCAM-/CD49f+), mature luminal cells (EpCAM+/CD49f-) and luminal progenitor cells (EpCAM+/CD49f+) revealed that each normal cell type resembled different breast cancer subtypes [7]. Despite these findings and the fact that the molecular subtypes can be used to help predict patient outcomes, we still cannot make accurate predictions as to patients’ response to treatment and indeed still do not know conclusively from which mammary cell subpopulation breast cancer is derived [8]. The major hindrance to answering these questions is that studying the human mammary gland is a highly static ex vivo process, where only opportunistic samples from donated tissue can be studied. This means that in any given study, it is only possible to determine cell types and possible cellular hierarchies from the single developmental time point at which the genetically unique individual donates their tissue. Examining the murine mammary gland not only offers the possibility to examine different stages of development at selected time points in genetically identical animals but also provides the opportunity to conduct extensive in vivo studies. Previous mammary repopulation studies, which involve the transplantation of one or more cells into a cleared murine mammary fat pad, identified a bipotent stem cell within the basal compartment of the mammary gland which was able to generate a fully functional mammary gland [9], [10], [11].
However, due to the nature of this experiment that involves radical changes in the microenvironment of the cell, it has been suggested that this bipotent mammary cell plasticity may be an induced effect that would not necessarily exist in normal biology [12]. Lineage tracing on the other hand allows for the tracking of individual cells and their progeny in vivo, by inducing cells expressing a specific marker to concurrently express a fluorescent and/or colorimetric protein that can be easily identified using a fluorescence microscope [13]. Whilst this technique would seemingly shed light on whether bipotent or unipotent lineage-restricted progenitors drive glandular tissue expansion, this technique is not without its flaws. Many of the lineage tracing studies conducted in the mammary gland have produced contradictory findings where some suggest the presence of multipotent progenitors [14], [15], [16], [17], whilst others find only unipotent progenitors contributing to mammary gland development postpartum [18], [19], [20], [21], [22], [23], [24], [25]. Discrepancies in these studies may have arisen from a number of factors that could be attributed to the lineage tracing technique or non-specific lineage marker selection. The result of these studies is that the evidence suggesting the presence of mammary multipotent stem cells postpartum remains inconclusive and better tools to identify different cell populations in both the mouse and the human are required to answer this question. With the rise of single cell technologies, we are able to discern cell subpopulations and state from different tissues on an unprecedented level.

Single cell technologies

Technology to molecularly characterise single cells on a protein, transcriptomic, genomic or epigenomic level has rapidly developed over the past decade and allows for a better understanding of cell subpopulations and differentiation potential. With the development of single cell RNA-sequencing (scRNA-seq) and single cell ATAC-seq (scATAC-seq), cells dissociated from healthy or diseased tissue can be grouped by similar transcriptomes or epigenomes to determine cell subpopulations and maturation states in an unbiased fashion. On the other hand, single cell technologies such as single cell DNA-sequencing (scDNA-seq) can be used to determine copy number variations which provide important insights into cancer cell heterogeneity and tracking of tumour cell evolution [26].

Single cell DNA-sequencing

Clonal diversity is an important feature of human tumours and can be explored through the use of scDNA-sequencing. After initial isolation of single nuclei from single cells, whole genome amplification is conducted before next generation sequencing is used to determine genome wide copy number profiles of single cells. Initial methodologies allowed for ~10% physical coverage of a single cell genome, allowing for identification of copy number aberrations alone [27]. Methodologies such as BGI [28], nuc-seq [29] and SNES [30], which use a Phi29 enzyme to perform multiple-displacement amplification, generated greater than 90% coverage of the single cell genome, allowing for mutations at a base pair level to be detected [31]. Another early amplification-based scDNA-seq technique, multiple annealing and looping based amplification cycles (MALBAC), also allows for the identification of single-nucleotide variations (SNVs) due to the high genome coverage (93%) offered by this method [32]. More recent transposition-based methodologies, direct library preparation (DLP) [33] and DLP+ [34], do not require preamplification of DNA and therefore overcome disadvantages of coverage and polymerase bias. Careful extraction of single cells using laser-capture microdissection and catapulting, combined with adaption of highly multiplexed single nucleus sequencing (HM-SNS) [35] and tissue section image analysis, allows for topographic single cell sequencing (TSCS). TSCS allows for the capture of the genomic profile of single cells whilst preserving their cellular position and morphology within the tissue of interest [36]. This technique, along with the others described, is integral to better understanding tumour cell copy number mutations and breast cancer development, progression and evolution.
However, scDNA-seq analysis alone cannot provide information on cell state or type and hence other single cell techniques are required to comprehensively understand mammary gland development and pathogenesis.
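
As a rough illustration of how a genome wide copy number profile might be derived from the sequencing reads of one cell, the sketch below normalises read counts in fixed genomic bins to the cell's median bin count and scales by an assumed baseline ploidy. The function name, the toy bin counts and the diploid baseline are illustrative assumptions and are not taken from any of the cited protocols; published pipelines typically also correct for GC content and mappability and segment the resulting profile.

```python
from statistics import median

def copy_number_profile(bin_counts, ploidy=2):
    """Estimate an integer copy number per genomic bin for one cell.

    bin_counts: reads mapped to each fixed-width genomic bin (hypothetical input).
    ploidy: assumed baseline copy number of the cell (assumption, default diploid).
    """
    baseline = median(bin_counts)
    if baseline == 0:
        raise ValueError("too few reads to estimate a baseline")
    # The ratio to the median bin approximates copy number relative to the baseline.
    return [round(ploidy * count / baseline) for count in bin_counts]

# Toy cell: mostly diploid bins, one amplified region and one single-copy loss.
counts = [100, 98, 102, 205, 198, 100, 51, 49, 101, 99]
profile = copy_number_profile(counts)
```

In this toy cell the two bins with roughly double the median read depth are called as four copies and the two bins with roughly half the depth as one copy. At the ~10% physical coverage of the early methods, a profile of this kind is all that can be recovered, which is why those methods identify copy number aberrations rather than base pair level mutations.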

Flow and mass cytometry

Flow cytometry is one of the most widely used techniques of single cell analysis, using fluorescently labelled antibodies to conduct multiparameter profiling of individual cells. This technique uses polychromatic lasers and filters to detect cell protein expression profiles and is generally limited to 10 simultaneous measurements [37]. Building on similar concepts to flow cytometry, mass cytometry was developed to measure cells that have been labelled with isotope-conjugated antibodies, which can then be examined with mass spectrometry [38]. Each labelled cell is sprayed in a single droplet into inductively coupled argon plasma which vaporises the cell. During this process the cell’s atomic constituents are ionised and the resulting elemental ions can be sampled using time-of-flight (TOF) mass spectrometry and quantified. The result is that the cell is permanently destroyed; however, this technique allows for characterisation of a large number of proteins, where up to 100 parameters per cell can be measured [37], [38], [39]. Whilst this technique is well suited to cost-effective high throughput use to investigate rare cell subpopulations, it relies on a priori knowledge of cell markers, limiting its use for novel cell type discoveries.
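
The reliance of cytometry on a priori marker knowledge can be illustrated with the EpCAM/CD49f scheme described in the Introduction. The sketch below assigns cells to the three epithelial populations from two marker intensities; it assumes CD45+ and CD31+ cells have already been depleted. The function name, the intensity values and the gate thresholds are hypothetical, as real gates are set per experiment from appropriate controls.

```python
def classify_epithelial_cell(epcam, cd49f, epcam_gate=1.0, cd49f_gate=1.0):
    """Assign a cell to a population from two marker intensities.

    The gate thresholds are hypothetical placeholders, not published values.
    """
    epcam_pos = epcam >= epcam_gate
    cd49f_pos = cd49f >= cd49f_gate
    if not epcam_pos and cd49f_pos:
        return "basal/mammary stem"   # EpCAM-/CD49f+
    if epcam_pos and not cd49f_pos:
        return "mature luminal"       # EpCAM+/CD49f-
    if epcam_pos and cd49f_pos:
        return "luminal progenitor"   # EpCAM+/CD49f+
    return "ungated"                  # EpCAM-/CD49f-, not assigned by this scheme

# Toy (EpCAM, CD49f) intensities for four cells.
cells = [(0.2, 3.1), (2.5, 0.3), (2.2, 1.8), (0.1, 0.2)]
labels = [classify_epithelial_cell(e, c) for e, c in cells]
```

A cell can only ever receive one of the labels written into the gating logic, which is the sense in which marker-based approaches are poorly suited to discovering novel cell types.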

Single cell RNA-sequencing

Another single cell tool that has been widely adopted is scRNA-sequencing which has been predominantly developed to discover and unbiasedly characterise cell subpopulations from a large number of tissue dissociated cells. Since the development of the first scRNA-seq protocol in 2009 [40], many different protocols have been published, where major differences between them include: how the cells are separated and whether mRNA transcripts are amplified to generate full length cDNA or partial coverage. Many of these techniques, along with bioinformatic tools required to analyse them, have been previously expertly reviewed [41], [42], [43], [44]. Hence, for the purposes of this review, we will focus on the major techniques that have been adopted in the field of mammary gland biology. These techniques include SMART-Seq 2 and the Fluidigm C1 platform as well as Drop-seq utilised by the Chromium platform from 10x Genomics. Switching mechanism at the 5’ end of the RNA template (SMART-Seq) was first developed in 2012 [45] and subsequent chemical refinements of the system led to the development of SMART-Seq2 [46], which has an increased yield and length of cDNA libraries generated for each individual cell. Cells tagged with multi fluorophore antibodies are sorted using a FACS machine into individual wells of a 96- or 384-well plate, allowing for optional index sorting of the cells. Confirmation of single cells per individual well can be subsequently conducted with a microscope before proceeding with cell lysis and downstream library preparation. cDNA is then generated from the full length of the mRNA using specially designed primers which also add sequences including a cell barcode and Illumina sequence, which allow for subsequent amplifications and analysis including multiplexing of up to 96 samples [47]. Library preparation is then conducted before pooling samples. 
Generation of full-length cDNA provides good read coverage across the entire transcript and allows for detection of gene isoforms or allele-specific expression by determining single nucleotide polymorphisms (SNPs). SMART-Seq 2 provides a cost-effective method to analyse hundreds of cells with high transcript coverage; however, it is a laborious process involving many pipetting steps which limits the number of cells that can be easily sequenced. SMART-Seq technology has been incorporated into an automated microfluidic system, Fluidigm C1, which allows for the capture, lysis and reverse transcription of up to 96 cells on an integrated fluidic chip. As with the manual method of SMART-Seq, one can verify the capture of a single cell on the chip using a microscope. However, in this case the user is restricted to only preparing 96 cells per cartridge on the machine, limiting the number of cells that can be analysed. This problem has been somewhat overcome by the introduction of C1 mRNA Seq HT allowing for up to 800 cells to be captured in a single run. This method, however, compromises on read coverage across the entire transcript due to utilising 3’ end counting mRNA sequencing [41]. The technique of droplet barcoding (or Drop-seq) revolutionised the field of scRNA-seq and allowed this technology to become a more accessible tool for molecular biology. The first protocols captured single cells in nanolitre droplets together with DNA-barcoded beads using a microfluidic system [48], [49]. Within these droplets, cells can be lysed, barcoded and cDNA generated. Subsequently, the cDNA can be pooled for PCR amplification and sheared to allow for short-read sequencing, permitting partial coverage of cDNA sequences that can be aligned to infer gene expression profiles. After development of this technique, a commercially available version was provided by 10x Genomics called the Chromium controller platform.
This platform allows for the capture and analysis of thousands of cells on a single cell level from 8 different samples [50] and is a cost-effective technique when calculated per cell compared to many other scRNA-seq techniques. Evidently, it is not possible to check the morphology of the cells prior to sequencing them, nor verify single cell rather than doublet capture, although this has been optimised previously [50]. With the development of cellular indexing of transcriptomes and epitopes by sequencing (CITE-seq), detection of multiplexed protein markers can be done in conjunction with unbiased transcriptome profiling of the cells using this platform among others [51].
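
To make the role of the cell barcode concrete, the following minimal sketch collapses pooled, already aligned reads into a per-cell, per-gene count table. It assumes each read also carries a unique molecular identifier (UMI), which droplet protocols use to tell PCR duplicates apart but which is not detailed in the text above. The function name, barcodes, UMIs and gene names are hypothetical placeholders.

```python
from collections import defaultdict

def count_matrix(reads):
    """Collapse aligned reads into per-cell, per-gene molecule counts.

    reads: iterable of (cell_barcode, umi, gene) tuples (hypothetical input).
    Reads sharing all three fields are treated as PCR duplicates of one
    original transcript and are counted once.
    """
    molecules = defaultdict(set)
    for cell_barcode, umi, gene in reads:
        molecules[(cell_barcode, gene)].add(umi)
    counts = defaultdict(dict)
    for (cell_barcode, gene), umis in molecules.items():
        counts[cell_barcode][gene] = len(umis)
    return dict(counts)

reads = [
    ("AAAC", "u1", "GeneA"), ("AAAC", "u1", "GeneA"),  # PCR duplicate
    ("AAAC", "u2", "GeneA"),
    ("AAAC", "u3", "GeneB"),
    ("TTTG", "u1", "GeneB"),
]
matrix = count_matrix(reads)
```

Because the barcode travels with every read, the cDNA from thousands of droplets can be pooled for amplification and sequencing and still be assigned back to its cell of origin. The same logic explains why an undetected doublet is problematic: two cells sharing a droplet share a barcode and are merged into a single profile.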

Single-cell ATAC-Seq

Whilst scRNA-sequencing provides in depth information of cell subpopulation gene expression profiles at a snapshot in time, only through examining gene promoter and enhancer chromatin accessibility profiles can we understand a cell’s potential for different cell states. Many of the techniques of single cell capture including array-based technologies, droplet microfluidics and combinatorial indexing through split pooling have allowed for the development of multiple single-cell Assay for Transposase Accessible Chromatin using sequencing (scATAC-seq) protocols [52]. The first method of scATAC-seq involved microfluidic capture of single cells using the Fluidigm system, and subsequent transposition using Tn5 transposases to tag regulatory regions by inserting sequencing adaptors into accessible regions of the genome [53]. Combinatorial cellular indexing through split pooling is a technique involving two rounds of nuclei barcoding, which allows for scATAC-seq analysis to be conducted on pooled single cells, without the need to first physically separate them [54]. An alternative method of scATAC-seq, which allows for the profiling of an even greater number of cells has also been developed by 10x Genomics [55]. Nuclei of cells are isolated from a bulk single cell suspension and transposed before being loaded onto the microfluidic chip. Within the chip, gel beads in emulsion (GEMs) are generated and contain a single nucleus and a bead. The bead contains: the reagents to lyse the nuclei, oligonucleotide sequencing adaptor and a priming sequence which during the linear amplification reaction incorporates a unique barcode into the transposed DNA [55]. Barcoded DNA can then be amplified before sequencing. A major disadvantage of scATAC-seq is inherent sparsity of the generated data, which arises due to the fact that diploid cells have low copy numbers. 
Despite the powerful computational tools developed to analyse such sparse data, it is thought that only 1–10% of the total accessible peaks are detected [52], where much of the scATAC-seq analysis heavily relies on single cell transcriptomic data to infer cell types.
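
The sparsity described above can be quantified as the fraction of a reference peak set that is detected in each cell. The sketch below is a toy calculation under stated assumptions: the size of the reference peak set, the cell names and the detected peak indices are invented for illustration, and the function is not part of any published scATAC-seq package.

```python
def fraction_detected(cell_peaks, n_total_peaks):
    """Fraction of the reference peak set with at least one read in a cell."""
    return len(set(cell_peaks)) / n_total_peaks

n_peaks = 1000  # hypothetical size of the aggregate peak set
cells = {
    "cell_1": range(0, 50),   # 50 accessible peaks detected
    "cell_2": range(40, 60),  # 20 accessible peaks detected
}
per_cell = {name: fraction_detected(peaks, n_peaks) for name, peaks in cells.items()}
# Overall sparsity: the average fraction of peaks with no signal in a cell.
sparsity = 1 - sum(per_cell.values()) / len(per_cell)
```

Both toy cells fall within the 1–10% detection range quoted above, leaving more than 95% of the cell-by-peak matrix empty.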

Limitations of single cell tools

Evidently for each of the above single cell analysis techniques, it is possible to gain invaluable insights into cellular behaviour or differentiation potential, however no one technique is omnipotent. Whilst scDNA-seq provides information about mutational states of cells through copy number variations or SNVs, it may be limited in its coverage and does not provide any information on cell type, function or state. On the other hand, techniques that aim to characterise immediate cell subpopulations either by examining the direct transcriptional profiles (scRNA-seq) or gene expression products (flow/mass cytometry), are not always in agreement due to post-transcriptional modifications. Mass cytometry allows for identification of lowly expressed proteins which otherwise might be undetected in scRNA-seq data (particularly droplet-based platforms) due to dropout of low-level transcripts. Unlike scRNA-seq which unbiasedly classifies cells based on their whole measured transcriptional profile, mass cytometry relies on appropriate marker selection and antibody availability, without which cells might be misidentified. As mentioned above, only a fraction of the total chromatin accessible sites can be extracted using scATAC-seq, where identification of different cell subpopulations relies on mapping this limited data onto available transcriptomic information. To overcome the limitations of these different single cell techniques, multi-omic approaches aiming to combine different single cell technologies (such as CITE-seq) are emerging and have recently been expertly reviewed [56]. Along with emerging techniques to collect data from different single cell profiling methods, it is important that methods to analyse the increasing amounts of data involving machine learning and novel algorithms continue to evolve.

Single cell analysis tools

Development of machine learning bioinformatic tools to analyse single cell data is as important as improving the techniques themselves to advance our understanding of tissue subpopulations and differentiation trajectories. As demonstrated above, each single cell technology is unique and often requires its own analysis pipeline and tools. Many of these have been previously expertly reviewed for analysis of scDNA-seq [57], mass cytometry [58], [59], scRNA-seq [60], [61], [62] and scATAC-seq [52] data. Visualisation of single cell datasets has been improved as dimension reduction tools have evolved from principal component analysis to t-distributed stochastic neighbour embedding (tSNE) [63] to uniform manifold approximation and projection for dimension reduction (UMAP) [64] to diffusion map graphical representation [65]. Tools such as diffusion pseudotime analysis use input from different single cell techniques and construct a temporal order of differentiating cells by measuring transitions between cells using diffusion-like random walks [66], allowing cellular hierarchies to be inferred. Other interesting tools for scRNA-seq data include scGen, which uses machine learning to model perturbation and infection response of cells across cell types, studies and species [67], and Markov affinity-based graph imputation of cells (MAGIC), which determines gene interactions [68]. On the other hand, ChromVar uses scATAC-seq data to predict transcription factor associated accessibility and allow clustering of epigenomically profiled cells [69], whereas more recent tools such as Model-based AnalysEs of Transcriptome and RegulOme (MAESTRO) directly integrate data from both scRNA-seq and scATAC-seq experiments [70]. These are a few of the hundreds of tools available that can be used to aid in the biological interpretation of the big data generated by single cell analysis.
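
The idea of a diffusion-like random walk between cells can be sketched as a row-normalised similarity matrix, in which each entry gives the probability of stepping from one cell to another. This is only the first building block and is not the published diffusion pseudotime algorithm. The Gaussian kernel, the fixed kernel width `sigma`, the function name and the toy one-dimensional profiles are all assumptions made for illustration.

```python
import math

def transition_matrix(cells, sigma=1.0):
    """Row-normalised Gaussian kernel: a random walk between similar cells.

    cells: list of equal-length numeric profiles (hypothetical toy data).
    sigma: kernel width (assumed fixed here; real tools choose it adaptively).
    """
    n = len(cells)
    kernel = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i != j:  # no self-transitions
                d2 = sum((a - b) ** 2 for a, b in zip(cells[i], cells[j]))
                kernel[i][j] = math.exp(-d2 / (2 * sigma ** 2))
    # Divide each row by its sum so that it becomes a probability distribution.
    return [[k / sum(row) for k in row] for row in kernel]

# Three toy cells placed along a one-dimensional differentiation axis.
cells = [[0.0], [1.0], [2.0]]
T = transition_matrix(cells)
```

In this toy example a walk starting at the first cell is far more likely to step to the neighbouring intermediate cell than directly to the most distant one, so distances accumulated over many such steps order the cells along the axis.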

Using single cell technologies to understand mammary gland biology

Since the publication of the first single cell manuscripts characterising the postnatal mammary gland [71], [72], an increasing number of studies (Table 1) have focused on comparing mammary cells across different stages of normal development (Fig. 1). The focus of these studies is to comprehensively characterise mammary cell subpopulations and differentiation trajectories using mass cytometry, scRNA-seq and scATAC-seq, to answer key questions such as whether a rare population of bipotent progenitors exists past embryonic development.
Table 1

Single cell datasets of normal mammary and tumour cells.

| Reference | Lab | Species | Tissue type (normal/cancer/both) | Sample time points | Mammary fraction (epithelial/stroma-immune/both) | Preparation of cells prior to single cell sequencing | Technique + technology | Number of cells/nuclei profiled |
| Andersson et al. bioRxiv 2020 (preprint) | Lundeberg | Human | Cancer | 8x HER2+ breast tumours | Both | Cells spatially selected from fixed tissue sections | Spatial transcriptomics developed by Ståhl et al. Science 2016 | 1007 spots (where spot refers to a small neighbourhood populated by multiple cells) |
| Azizi et al. Cell 2018 | Pe’er and Rudensky | Human | Both | 8x primary breast carcinomas tumour and matched normal breast tissue, peripheral blood and lymph nodes | Stroma-immune | CD45+ FACS sorted cells | scRNA-seq using the inDrop platform | 47,016 cells |
| Bach et al. Nature Comm 2017 | Marioni and Khaled | Mouse | Normal | Mammary cells taken during pregnancy (day 14.5), lactation (day 6), involution (day 11) and from virgin 8 week old C57BL/6N mice | Epithelial | Live lineage negative EpCAM+ cells sorted using FACS | scRNA-seq using Chromium 10x Drop-Seq platform | 23,184 cells |
| Bach and Pensa et al. Nature Comm 2021 | Khaled and Marioni | Mouse | Both | Tumourogenesis data set: 13x Blg-Cre; Brca1f/f;p53+/- mice (aged 30–48 weeks, nulliparous) and 2x C57BL/6N mice (aged 36–40 weeks old). Pregnancy data set: 9x C57BL/6N mice at 4.5/9.5/14.5 days gestation as well as 3x C57BL/6N mice (aged 12 weeks old) | Both | Viable cells were isolated using MACS Dead Cell Removal Kit | scRNA-seq using Chromium 10x Drop-Seq platform | 102,829 cells |
| Bartoschek et al. Nature Comm 2018 | Pietras | Mouse | Cancer | Tumours from 14 week old MMTV-PyMT mice | Stroma-immune | EpCAM-/CD45-/CD31-/NG2- FACS sorted mesenchymal cells | scRNA-seq using Smart-Seq2 | 768 cells |
| Baslan et al. eLife 2020 | Hicks | Human | Cancer | Fresh pre-treatment core biopsies were taken from 16 patients enrolled in phase II clinical trials conducted by the Brown University Oncology Group (BrUOG) | Both | Nuclei were isolated and FACS sorted for ploidy | Single-nuclei copy number analysis developed by Baslan et al. Nature Protocols 2012 | Mean of 116 single-nuclei per tumour |
| Carli et al. J Mammary Gland Biol Neoplasia 2020 | Rudolph | Human | Normal | Human milk cells from two participants | Both | Viable cells were isolated using FACS | scRNA-seq using Chromium 10x Drop-Seq platform | 3740 cells |
| Casasent et al. Cell 2018 | Edgerton and Navin | Human | Cancer | 10x ductal carcinoma in situ (DCIS) and invasive ductal carcinoma (IDC) tumour samples | Both | Cells selected using single cell laser dissection | Topographic single cell sequencing (TSCS, scDNA sequencing) | On average 129 cells per patient |
| Chung et al. Nature Comm 2017 | Park | Human | Cancer | 11x tumour cells from different breast cancer subtypes | Both | Dead cells were removed using Ficoll-Paque PLUS | scATAC-seq using the Fluidigm C1 platform | 515 cells |
| Chung et al. Cell Reports 2019 | Wahl | Mouse | Normal | Mammary cells from E18 foetal and 8-week old adult CD1 mice | Epithelial | EpCAM+/Lin- FACS sorted cells | snATAC-seq | 7846 high quality single nuclei |
| Engelbrecht and Twigger et al. bioRxiv 2020 (preprint) | Scheel and Khaled | Human | Normal | Mammary tissue from one participant was digested 3 ways (3 h vs. 16 h digest and 10 rpm vs. 100 rpm shaking speed) | Both | Viable cells were isolated using MACS Dead Cell Removal Kit | scRNA-seq using Chromium 10x Drop-Seq | 11,191 cells |
| Gao et al. Nature Comm 2017 | Navin | Human | Cancer | Breast cancer cell lines for technique validation and a single triple negative breast tumour sample | Both | Nuclei isolated from frozen tumour and cell lines | Nanogrid snRNA-seq | 796 primary cell nuclei |
| Giraddi et al. Cell Rep 2019 | Wahl and Spike | Mouse | Normal | Mammary cells isolated from day 16/18 embryonic (E16, E18), postnatal day 10 (P10) and 10–16-week-old adult C57BL/6 mice | Epithelial | EpCAM+ FACS sorted cells | scRNA-seq using Chromium 10x Drop-Seq and Fluidigm C1 platforms | 6060 cells using 10x and 262 cells using Fluidigm C1 |
| Gkountela et al. Cell 2019 | Aceto | Human | Cancer | Blood samples containing circulating tumour cells (CTCs) from 43 patients with progressive breast cancer | Both | Live CTCs were stained for EpCAM, HER2 and EGFR and sorted for CTCs | Single Cell Whole-genome Bisulfite Sequencing developed by Farlik et al. Cell Rep. 2015 | 89 single CTCs and 71 CTC clusters from patients and xenografts |
| Grosselin et al. Nature Genetics 2019 | Griffiths, Vallot and Gérard | Mouse and Human | Cancer | Patient derived xenograft mouse models of luminal and treatment resistant breast cancers | Both | Viable cells were isolated using MACS Dead Cell Removal Kit | scChIP-seq and scRNA-seq | 2728 cells profiled by scRNA-seq |
| Han et al. Cell 2018 | Guo | Mouse | Normal | Mammary cells taken during pregnancy, lactation, involution and from virgin C57BL/6 mice | Both* | Cells from all major mice organs including mammary gland. *However, no mention was made of removal of the mammary lymph nodes prior to mammary gland dissociation. | scRNA-seq using microwell-seq | 61,196 mammary gland cells |
| Kanaya et al. Commun Biol 2019 | Chen | Mouse | Normal | Mammary cells from 9-week-old BALB/cj mice which underwent ovariectomy (surgical menopause) and were treated with vehicle, E2 and/or PBDE | Both | Dead cells were removed using microbeads | scRNA-seq using Chromium 10x Drop-Seq platform | 14,856 cells |
| Karaayvaz et al. Nature Comm 2018 | Ellisen | Human | Cancer | 6x primary triple negative breast tumours | Both | Viable tumour cells were isolated by FACS sorting | scRNA-seq using Smart-seq2 | 1189 cells |
| Knapp et al. Cell Reports 2017 | Eaves | Human | Both | 7x breast cancer cell lines and 8x normal primary breast tissue samples | Both | N/A | Mass cytometry using 35 markers (16 extracellular and 19 intracellular) | Not disclosed |
| Li et al. Cell Reports 2020 | Brugge | Mouse | Normal | Mammary cells from young (3–4 month, n = 3) and aged (13–14 month, n = 4) virgin C57BL/6J mice | Both | None | scRNA-seq using Chromium 10x Drop-Seq platform | 13,684 cells |
| Lo et al. Cancers 2020 | Zhou | Mouse | Normal | Mammary cells were isolated from normal and high fat diet (HFD) fed 2 month old C57BL/6 Brca1-/-;p53+/- mice | Stroma-immune | EpCAM- FACS sorted viable stromal and immune cells | scRNA-seq using Chromium 10x Drop-Seq platform | 3892 cells |
| Murrow et al. bioRxiv 2020 (preprint v4) | Gartner | Human | Normal | Mammary cells isolated from 28x premenopausal women of varying BMI and age | Both | Live (DAPI-) cells were sorted using CD31-/CD45-/EpCAM+/-/CD49f+/- and CD45+ | scRNA-seq using Chromium 10x Drop-Seq platform | 87,793 cells |
| Nguyen et al. Nature Comms 2018 | Kessenbrock | Human | Normal | Mammary cells isolated from 7x individuals who underwent reduction mammoplasties | Epithelial | CD31-/CD45- epithelial cells enriched by FACS sorting using CD49f and EpCAM | scRNA-seq using Chromium 10x Drop-Seq and Fluidigm C1 platforms | 24,646 cells using 10x and 868 cells using Fluidigm C1 |
| Pal et al. Nature Comms 2017 | Visvader | Mouse | Normal | Mammary cells were isolated from early postnatal (2 week old), mid-pubertal (5 weeks old) and mature adult (10 weeks, either from virgin or pregnant) FVB/NJ mice | Epithelial | CD45-/CD31-/CD24+ epithelial cells were FACS enriched | scRNA-seq using Chromium 10x Drop-Seq and Fluidigm C1 platforms | 3308 cells using 10x and 460 cells using Fluidigm C1 |
| Pervolarakis et al. Cell Reports 2020 | Watanabe and Kessenbrock | Mouse | Normal | Mammary cells were isolated from 10-week-old FVB/NJ mice | Both | FACS sorting using markers CD31, CD45, EpCAM and CD49f separated live cells for scRNA-seq and basal/luminal cells for scATAC-seq | scRNA-seq and scATAC-seq were performed using 10x Genomics Chromium platforms | 26,859 cells were profiled using scRNA-seq and 23,338 cells were profiled using scATAC-seq |
| Salmén et al. bioRxiv 2018 (preprint) | Lundeberg | Human | Cancer | Tumour tissue sections from 10 patients diagnosed with HER2+ breast cancer | Both | N/A | Spatial Transcriptomics developed by Ståhl et al. Science 2016 and others | 1007 spatial spots |
| Sebastian et al. Cancers 2020 | Loots | Mouse | Cancer | Mammary cells were extracted from 10-week-old BALB/c mice with 4T1 derived mammary tumours (a syngeneic model for triple negative breast cancer) | Stroma | Using magnetic cell separation, CD45 depleted or CD45/CD90.1 depleted cells were sequenced, as well as bulk cells and CD140a+/EpCAM-/CD45-/7AAD- FACS sorted fibroblasts. | scRNA-seq using Chromium 10x Drop-Seq platform | 6420 cells |
| Sun et al. J. Biol. Chem. 2018 | Deng | Mouse | Normal | Mammary cells were isolated from 3 to 4 month-old virgin or day 12.5 pregnant FVB mice | Epithelial | Lineage positive (Lin+, endothelial or immune) cells were excluded using the EasySep mouse epithelial cell enrichment kit. Luminal and basal cells were FACS sorted using CD24 and CD29 | scRNA-seq using Fluidigm C1 platform | 239 cells |
| Thong et al. Front. Cell Dev. Biol. 2020 | Colacino | Human | Normal | Mammary cells were isolated from 3x normal mammoplasty tissue samples. Additionally, mammary cells from the 3 patients were conditionally reprogrammed in vitro and also sequenced. | Both | None | scRNA-seq using drop-seq protocols developed by Macosko et al. 2015 | Not disclosed |
| The Tabula Muris Consortium Nature 2020 | | Mouse | Normal | Mammary cells taken at 3, 18 and 21 months from C57BL/6JN mice | Both* | Cells from all major mice organs including mammary gland. *However, no mention was made of removal of the mammary lymph nodes prior to mammary gland dissociation. | scRNA-seq using Chromium 10x Drop-Seq platform | 15,577 mammary cells |
| Tognetti et al. bioRxiv 2020 (preprint) | Bodenmiller | Human cell lines | Both | 62x breast cancer cell lines and 5x normal cell lines | N/A | N/A | Mass cytometry using 34 markers | > 80 million cells |
| Twigger et al. bioRxiv 2020 (preprint) | Khaled and Scheel | Human | Normal | 4x samples from participants donating mammoplasty tissue and 4x human milk samples from lactating women | Both | None | scRNA-seq using Chromium 10x Drop-Seq platform | 24,666 non-lactating breast cells and 27,023 human milk cells |
| Vatter et al. Cell Reports 2018 | Bodenmiller, LaBarge and Lorens | Human | Normal | 57 samples were profiled consisting of cultured cells from 44 women and 13 uncultured breast epithelia samples | Epithelial | N/A | Mass cytometry using 29 markers | 880,000 cells |
| Wagner et al. Cell 2019 | Bodenmiller | Human | Both | 144x human breast tumours and 50x normal breast tissue samples | Both | N/A | Mass cytometry using 73 markers | 26 million cells |
| Wang et al. eLife 2020 | Schwertfeger | Mouse | Normal | Mammary cells isolated from 10-week-old dioestrus FVB/NJ mice | Stroma-immune | CD45+ cells were enriched for using magnetic cell separation | scRNA-seq using Chromium 10x Drop-Seq platform | 13,000 cells were targeted |
| Wu et al. EMBO J 2020 | Swarbrick | Human | Cancer | 5x triple negative breast tumours | Both | Viable cells were enriched for using EasySep Dead Cell Removal Kit | scRNA-seq using Chromium 10x Drop-Seq platform | 24,271 cells |
| Wuidart et al. Nat Cell Biol 2018 | Blanpain | Mouse | Normal | Mammary cells were isolated from embryonic day 14 (E14) and > 8-week-old adult mice | Epithelial | Embryonic CD49fHi/Lgr5Hi cells, adult CD24+/CD29Hi basal and adult CD24+/CD29Lo luminal cells were FACS enriched | scRNA-seq using Smart-seq2 | 193 cells |
Fig. 1

Overview of mouse and human mammary development and cancer single cell studies.


Embryonic mammary gland composition

Recent studies have compared embryonic and postnatal mammary cell subpopulations (Fig. 1) using scRNA-seq and ATAC-seq (Table 1) in an attempt to answer whether bipotent progenitors giving rise to both luminal and basal cells exist postnatally. Using multicolour lineage tracing and scRNA-seq, Wuidart et al. identified that at embryonic day 14 (E14) a single population of embryonic multipotent progenitors (EMPs) exists in the mouse mammary gland. These cells exhibit a hybrid transcriptional signature comprising marker genes from both the luminal and basal lineages [23]. This study also found that p63 is a master regulator of basal cells, as its overexpression in luminal cells is enough to convert them into a basal phenotype. One limitation of this study is that only a small number of cells were profiled by single cell analysis and that comparisons between different developmental time points were limited (69 EMPs compared to 51 adult basal cells and 73 adult luminal cells). However, within the same year, Giraddi et al. published a study comparing the single cell profiles of cells during embryogenesis (E16 and E18), postnatally at day 4 (P4) and from the adult mouse mammary gland. As was observed in the previous study, mammary cells from E16 and E18 generated single separate clusters of foetal cells, whereas postnatally both luminal and basal cells could be identified, which were further separated in the adult gland into mature and alveolar luminal clusters and a separate basal cluster [73]. Original bulk ATAC-seq analysis by Dravis et al., referenced in Giraddi et al., showed that E18 foetal mammary stem cells (fMaSCs) presented open features at distal enhancer and proximal promoter regions of both luminal and basal genes [74]. 
Using single nucleus ATAC-seq (snATAC-seq), follow-up studies confirmed that during late embryogenesis (E18), individual cells displayed either a basal-like (Krt5, Acta2) or pan-luminal/luminal progenitor-like (Krt8, Krt18, Kit) chromatin accessibility profile together with foetal-enriched genes such as Sox10 and Sox21 [75]. Integration of the snATAC-seq data with scRNA-seq data from Giraddi et al. to examine the developmental trajectory showed that both E16 and E18 foetal cells remain tightly clustered. Whilst clustered cells at E18 were indistinguishable at the transcriptomic level, snATAC-seq revealed lineage priming, where cells displayed accessibility consistent with either a luminal or a basal cell fate [75]. Subsequently, only lineage-restricted luminal or basal cells were present postnatally, and upon onset of puberty the luminal cells further split into luminal alveolar and hormone responsive luminal cells (see Table 2 for a summary of the commonly used nomenclature for these populations).
Table 2

Summary of nomenclature of adult mammary epithelial cell subtypes.


Column groups: Basal (column "Basal"); Luminal (columns "Luminal progenitor", "Secretory alveolar", "Hormone responsive")
Reference | Organism | Technique | Basal | Luminal progenitor | Secretory alveolar | Hormone responsive
Bach et al. 2017 | Mouse | scRNA-seq | Basal | Luminal progenitor | Differentiated secretory alveolar | Hormone sensing
Chung et al. 2019 | Mouse | snATAC-seq | Basal | Luminal progenitor | N.A. (a) | Mature luminal
Engelbrecht et al. 2020 | Human | scRNA-seq | Basal/myoepithelial (BA) | Luminal hormone receptor negative progenitors (LHR-) | N.A. (a) | Luminal hormone-receptor positive mature cells (LHR+)
Giraddi et al. 2018 | Mouse | scRNA-seq | Basal | Alveolar precursor | N.A. (a) | Mature luminal
Han et al. 2018 | Mouse | scRNA-seq | Myoepithelial cells | Luminal progenitor | Ductal luminal | Secretory alveoli | N.A.
Knapp et al. 2017 | Human | Mass cytometry | Basal cells (BCs) | Luminal progenitors (LPs) | Luminal cells (LCs)
Li et al. 2020 | Mouse | scRNA-seq | Myoepithelial | HS-AV | Alveolar (AV) | N.A. (a) | Hormone sensing (HS)
Murrow et al. 2020 | Human | scRNA-seq | Basal/Myoepithelial | Secretory luminal | N.A. (a) | Hormone responsive (HR+)
Nguyen et al. 2017 | Human | scRNA-seq | Basal | Myoepithelial | Secretory L1-type | N.A. (a) | Hormone responsive L2-type
Pal et al. 2017 | Mouse | scRNA-seq | Basal/Myoepithelial | Mammary stem cells (MaSCs) | Luminal progenitor | Luminal intermediate (Lum Int) | Alveolar | Mature luminal/Ductal
Pervolarakis et al. 2020 | Mouse | scRNA-seq/scATAC-seq | Myoepithelial | Luminal secretory (L-sec) | Hormone responsive
Luminal progenitor | Lactation progenitor/precursor | Secretory alveolar
Regan and Smalley 2020 | N.A. | N.A. | Myoepithelial cells | Oestrogen receptor negative (ER-) ductal cells | Secretory alveolar cells | Hormone-responsive oestrogen receptor-positive (ER+) cells
Sun et al. 2018 | Mouse | scRNA-seq | Myoepithelial/Basal | Proliferative luminal cells (PLCs) | Keratinised luminal cells (KLCs) | Mature luminal cells (Mature LCs) | Lipid biosynthetic luminal cells (LBLCs) | Stimulus-responsive luminal cells (SRLCs)
Proliferative basal cells (PBs) | Wnt signalling-responsive basal cells (WRBCs) | Mammary stem cells (MaSCs)
Thong et al. 2020 | Human | scRNA-seq | Myoepithelial | Luminal 1 (L1) | N.A. (a) | Luminal 2 (L2)
Twigger et al. 2020 | Human | scRNA-seq | Basal cells (BA) | Luminal progenitor (LP) | Secretory alveolar | Hormone responsive (HR)
Luminal 1 (LC1) | Luminal 2 (LC2)
Vatter et al. 2018 | Human | Mass cytometry | Basal myoepithelium (MEP) | Luminal epithelium (LEP)
Wuidart et al. 2018 | Mouse | scRNA-seq | Basal cells (BCs) | Luminal cells (LCs)

(a) N.A.: the pregnancy/lactation time point was not included in this study, hence secretory alveolar cells were not described.


Adult mammary gland epithelial cell composition

Even through the dramatic changes occurring during pregnancy, lactation and involution in the adult mammary gland, scRNA-seq findings from Bach et al. support the view that only lineage-restricted progenitors exist postnatally. Epithelial cells taken from virgin, pregnant, lactating and involuting mice were characterised to find distinct subpopulations of cells from either the luminal or basal lineages [71]. These cell clusters were ordered along a pseudotime trajectory and it was found that luminal and basal cells did not have a common progenitor [71]. The luminal cells, however, were found to form a continuum of differentiation stemming from Aldh1a3+ luminal progenitors and extending to the two major lineages of secretory alveolar and hormone responsive cells. Replicating such experiments in humans would clearly prove difficult; hence studies conducting scRNA-seq of human milk cells have provided novel findings on the maturation of functional human mammary cells during lactation [76], [77]. Recent work from our lab examining milk cells from 4 donors and 4 non-lactating breast tissue samples found two populations of secretory luminal milk-derived cells which are transcriptionally similar to luminal progenitor cells from non-lactating breast [76]. This work suggests that luminal progenitors may give rise to secretory alveolar cells in humans, as well as in mice. Furthermore, another recently published study has found that aberrant differentiation of luminal progenitor cells, resulting in expression of key milk proteins Lalba, Csn2 and Wap, may provide an indication of early tumorigenesis in Brca/p53 mouse models [78]. This highlights the importance of resolving the role luminal progenitors play in development to better understand breast cancer development. Findings of a common luminal progenitor in the adult mammary gland are in agreement with findings from Giraddi et al. 
but are in contrast to the recent lineage tracing studies which suggest that these two luminal cell lineages are unipotent in adult mice [22], [79], [80], [81], [82]. It is possible that the bipotent luminal cells identified in these scRNA-seq studies are active during early embryonic and prepubertal stages and that later in development, the two luminal lineages are maintained by their respective progenitors. Interestingly, it was found that parity-induced Aldh1a3 luminal progenitor cells exist post involution; these cells were seemingly primed towards the alveolar lineage and expressed many lactation-associated genes such as Lipa and Xdh and casein genes such as Csn2 and Csn3 [71]. Findings from a recent study that combined scATAC-seq and scRNA-seq to study 10-week-old adult mice corroborated findings from the Bach study, identifying a single myoepithelial population as well as a Foxa1 hormone responsive luminal population and Elf5/Kit luminal cells (termed L-sec for luminal secretory, see Table 2) [83]. The L-sec cells were further divided into Rspo1/Aldh1a3 luminal progenitor and Lalba/Csn2/Lipa lactation progenitor cells, and pseudotemporal analysis suggested that the former gives rise to the latter cell subtype [83]. A limitation of this study was that analysis was only conducted at one time point; comparisons between the gene expression profiles of the discovered lactation progenitor and previously described secretory alveolar cells [71] would have developmentally contextualised these findings. Similar cell populations were reported in the adult human mammary gland at a single time point [84], [85]. Mass cytometry of normal primary mammary tissue identified epithelial subsets of basal, luminal (corresponding to mature luminal, see Table 2) and luminal progenitor cells [85]. 
Strikingly, within the latter population, a small subset of cells displayed an elevated content of active caspase-3 whilst maintaining clonogenicity, suggesting potentially greater genomic instability and increased risk of oncogenic transformation in this set of luminal progenitor cells [85]. Nguyen et al. also identified 3 major epithelial cell types using scRNA-seq, which they termed myoepithelial, luminal 1 (L1) and luminal 2 (L2) cells (see Table 2). The SLPI L1 population was split into ELF5/KIT L1.2 (resembling luminal progenitor cells, Table 2) and LTF+ L1.1 cells (resembling secretory alveolar cells, Table 2). L1.2 luminal progenitor cells were found to sit above the other luminal cell types in the differentiation hierarchy. L2 cells expressed the marker ANKRD30A together with hormone receptors ESR1, PGR and AR, indicative of a hormone responsive subpopulation (Table 2). Interestingly, upon closer examination of the basal cluster identified using 10x Genomics scRNA-seq, a subcluster of cells highly expressed contractile genes (ACTA2, TAGLN, KRT14), which the authors subsequently identified as specific myoepithelial cells [84]. However, data from a recent pre-print suggest that differential expression of adhesion markers may be an artefact of long digestion duration rather than due to distinct subclusters in human mammary cell subtypes [86]. Within the Nguyen paper, the authors manually set the start of the pseudotime to be within the basal cell type and found that the resulting trajectory differentiated from the basal cells to a myoepithelial branch and a bifurcated luminal branch, topped by L1.2 luminal progenitors [84]. Whilst the studies described thus far appear to represent a concordant view of mammary cell subpopulations in the embryonic and postnatal mammary gland of both mice and humans, contradictory findings suggesting that bipotent progenitors exist postnatally have been published [72], [87], [88], [89] and reviewed [90], [91].

Multipotent lineage progenitors in the adult mammary gland

Similar to Bach et al., another study published at the same time [72] investigated postnatal mammary cell subpopulations in mice at slightly different developmental time points (pre-puberty, puberty, adult and pregnancy as well as different stages of the oestrus cycle) but came to different conclusions. Findings from this study suggest that prior to puberty the majority of epithelial cells have a basal phenotype, and that a rare basal-like population expressing some luminal features is positive for the marker Cd55 [72]. During puberty this population seemingly expands, and after puberty the Cd55 cells predominantly reside in the luminal cell compartment. Further luminal subpopulations were identified, including intermediate (sharing markers of both luminal progenitor and hormone responsive cells) and mixed (expressing both Kit/Elf5 luminal markers and Acta2/Krt14 basal markers) cell subpopulations [72]. Detailed discussion of this study, together with other early mammary scRNA-seq studies that classified new cell types based on expression of markers such as c-Kit (Kit), can be found in [92]. A more recent study in mice also found a hybrid luminal cell subpopulation, termed HS-AV cells, which expressed genes characteristic of both hormone sensing (HS) and alveolar progenitor (AV) cells and which decreased with age [93]. Both the described “intermediate” [72] and “HS-AV” [93] luminal cells expressed higher levels of Prlr and Cited than luminal progenitor/alveolar cells, and higher levels of Csn3 and Trf than hormone responsive cells. Li et al. went further and confirmed the presence of these double luminal lineage cells using immunofluorescence, illustrating co-staining of progesterone receptor with milk fat globule-EGF factor 8 (PR+/MFGE8+) and oestrogen receptor with lactoferrin (ER+/LTF+) [93]. Despite the identification of new luminal cell types, Pal et al. 
did not observe a separate rare population of mammary stem cells positive for Epcam and Procr or Lgr5 and Tspan8 that was expected according to data from previous studies [72]. Similarly, Sun et al. examined epithelial cells taken from the adult mouse in the virgin and pregnant state and also noted an absence of a unique population of cells that expressed Lgr5 and Tspan8 [87]. This study observed a Cdh5/Procr population of cells that resided in the basal compartment, proliferated in vitro and had in vivo mammary repopulation potential [87]. Analysis of developmental trajectories suggested that this cell type was at the top of a mammary epithelial cell hierarchy during pregnancy; however, this was not the case in the virgin gland. Follow-up studies utilised novel bioinformatic tools to compare across different data sets in an attempt to reconcile the differences found between the single cell studies to date. One such study developed a model called Landscape of Single Cell Entropy (LandSCENT), which provides a potency score for single cells without relying on lineage markers; the authors then used this score to define root states for lineage trajectory algorithms [88]. Using this method, they reanalysed an adult human data set [84] and identified high potency cells mapping to the periphery of the basal and immature alveolar clusters. Cells that sit in this bipotent state overexpressed transcription factors YBX1 and ENO1, which have been implicated in basal breast cancer risk and were suggested as potential markers identifying multipotent progenitors [88]. However, throughout this paper it was noted that data from the Nguyen et al. study were not of sufficient quality to identify stem cell populations without this technique and that future studies should investigate more cells to a greater depth. 
A more recent study also conducted retrospective comparisons between many data sets and, in addition, produced a novel dataset characterising normal human mammoplasty cells from 3 donors together with cultured cells [89]. Whilst this study’s main focus was on the epithelial compartment of mammary cells, the authors captured all noted epithelial populations together with stromal and immune cell subtypes [89]. Primary mammary cells were cultured in 2D on top of irradiated fibroblasts to conditionally reprogramme the cells [94]. After reprogramming, the authors identified different “hybrid” epithelial cell populations emerging, including basal/luminal (KRT14/KRT18), epithelial/mesenchymal (EPCAM/VIM) and some that expressed all 4 markers (quadruple positive cells, KRT14/KRT18/EPCAM/VIM). In particular, by comparing their data with previous studies [71], [73], [84], they found that normal human mammary tissue mapped to the adult mammary tissue of mice, whereas conditionally reprogrammed cells mapped closer to the stem-like population of embryonic mouse cells. After comparison with all these time points, they noted that the “hybrid” cells existed at different time points, predominantly during the in utero period, gestation and lactation. This paper provides a first look at all the mammary cell subpopulations (including stroma) that exist in the adult human mammary gland but does not take into account differences in marker expression profiles between mouse and human cells, namely that KRT14 expression has been routinely observed by pathologists in both luminal and basal cells in the adult human [95]. Accurate use of markers to define different cell types is important when describing whether cell subpopulations contain markers from both luminal and basal cells. In addition, it is important to remember that cells in these studies are captured at a single snapshot in time and that their temporal trajectories are inferred only from these single time points. 
Therefore, conclusive statements of bipotency must be further verified using in vivo models, as discussed by Watson and Khaled [96], and may take advantage of emerging technologies such as scGESTALT/LINNAEUS [97], [98] that combine unbiased lineage tracing with scRNA-seq-detectable genetic scarring.
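The idea of scoring potency without lineage markers, as described above for LandSCENT [88], can be illustrated with a deliberately simplified stand-in. The sketch below is not the LandSCENT method (which computes signalling entropy over a protein interaction network); it assumes that a plain normalised Shannon entropy of a cell's expression profile is an acceptable toy proxy for how "uncommitted" the profile is. The function name `expression_entropy`, the four-gene panel and all counts are hypothetical.

```python
import math

def expression_entropy(counts):
    """Normalised Shannon entropy (0 to 1) of one cell's expression profile.

    Simplified proxy only: LandSCENT itself uses network-based signalling
    entropy, which is not reproduced here.
    """
    total = sum(counts)
    if total == 0 or len(counts) < 2:
        return 0.0
    probs = [c / total for c in counts if c > 0]
    entropy = -sum(p * math.log(p) for p in probs)
    # Divide by the maximum possible entropy so profiles of equal length are comparable
    return entropy / math.log(len(counts))

# Hypothetical toy counts over the gene panel (Krt5, Krt14, Krt8, Krt18)
cells = {
    "basal_like": [50, 40, 1, 1],
    "luminal_like": [1, 1, 45, 55],
    "mixed_profile": [25, 25, 25, 25],
}

# Rank cells from highest to lowest entropy; the top cell is a candidate root state
ranked = sorted(cells, key=lambda name: expression_entropy(cells[name]), reverse=True)
root_candidate = ranked[0]
print(root_candidate)
```

A high score here only indicates an evenly spread profile; as noted above, doublets or technical noise could produce the same pattern, so any such root state would still need in vivo validation.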

Challenges remaining in the face of generating a unified mammary cell atlas

Before we attempt to understand the cells driving breast cancer formation, it is becoming increasingly clear that we need to unify findings from mammary development studies to be able to describe which mammary cell subpopulations universally exist and which species-specific markers they express, and to define commonly used nomenclature for each of these cell subtypes. To this end, we have attempted to summarise the major mammary single cell studies (Table 1) and common nomenclature used for the four major adult epithelial subtypes (Table 2). Not all studies identify the same mammary subtypes; however, most studies agree that cell subpopulations such as adipocytes or secretory alveolar cells during lactation (see Bach et al.) have been incompletely profiled due to difficulties in isolating intact cells during dissociation. Future efforts to build a mammary cell atlas must explore different methods to dissociate tissue/isolate cells, such as modifying digestion duration [86] or temperature of enzymes [99], to overcome certain cell types being disproportionately enriched or depleted during sample preparation. As described above, some subclusters classified as epithelial cells were each observed in only a single study, including Cd55+ [72] and Cdh5+ [87] subclusters or ZEB1/TCF4 [84], VIM/EPCAM and KRT18/KRT14 cells [89]. Cd55+ and Cdh5+ clusters highly express genes such as Axl and Sparc or Cd36 and Pecam1, which have been associated with fibroblast and endothelial stromal subpopulations [100]. This suggests that the CD24/CD29 sorting strategy used may not have been sufficient to exclude contaminating stromal cells. Similarly, a cluster of Procr+ cells classified as basal cells also expressed some but not all markers of pericytes and hence the authors conceded that these cells may be contaminating non-epithelial cells [71]. 
Further rare epithelial cell subtypes such as ZEB1/TCF4 cells [84], VIM/EPCAM cells and KRT18/KRT14 cells [89] were identified by marker expression alone, and these cells did not form separate clusters in clustering or dimension reduction analysis. As such, without orthogonal validation (such as tissue staining) it is difficult to determine the biological relevance of these results due to potential technical error such as doublet artefacts or detection of spurious transcripts. Clearly, it is remarkably difficult to characterise mammary cell subpopulations based on a limited number of markers. Hence, moving forward, cells profiled using single cell technologies should be classified into subpopulations using the cell’s global transcriptome, preferentially utilising cell signatures made up of an abundance of different validated genes. Indeed, the issue of unifying diverse cell types across different organs is a universal problem of cell biology and various consortiums have begun to catalogue different cell types using single cell technologies. A recent study by Han et al. characterised a range of different tissues, including the mammary gland, lung, kidney, testis and placenta, using scRNA-seq as a resource in developing the mouse cell atlas [101]. Specifically, in regard to the mammary gland, tissue was examined from virgin, pregnant, lactating and involuting mice. Whilst many of the briefly described findings fit what has been described in studies above [101], because the lymph nodes were not removed prior to tissue digestion it is hard to distinguish whether the reported immune cells are lymph node-resident or mammary tissue-resident. Another similar study conducted by the Tabula Muris consortium attempted to discern how different organs age [102]. Cells of different organs were extracted and sequenced from mice aged between 1 month and 2.5 years. 
Interestingly, it was noted that for the mammary gland, there was a significant decline in T cells (which could be confounded by the lymph nodes potentially not being removed prior to gland dissociation), and overall there was an upregulation of the AP-1 transcription factor family (Junb, Jund, Fos) with age [102]. Findings from these studies highlight that it is important to consider factors outside of normal development, such as age, that might influence mammary cell subpopulations and gene expression profiles.
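The recommendation above to classify cells using multi-gene signatures rather than a limited number of markers can be sketched as follows. This is a minimal illustration under stated assumptions, not a published classifier: the gene sets are small illustrative panels assembled from mouse markers mentioned in this review (not validated signatures), the expression values are invented, and `signature_scores`, `classify` and the `margin` threshold are hypothetical names and choices.

```python
def signature_scores(cell, signatures):
    """Mean expression of each signature's genes in one cell (missing genes count as 0)."""
    return {
        name: sum(cell.get(gene, 0.0) for gene in genes) / len(genes)
        for name, genes in signatures.items()
    }

def classify(cell, signatures, margin=0.5):
    """Return (label, scores); label is 'ambiguous' when the top two scores differ by less than margin."""
    scores = signature_scores(cell, signatures)
    ordered = sorted(scores.items(), key=lambda item: item[1], reverse=True)
    (best, best_score), (_, second_score) = ordered[0], ordered[1]
    if best_score - second_score < margin:
        return "ambiguous", scores
    return best, scores

# Illustrative gene panels only (assumption), taken from markers named in the text
SIGNATURES = {
    "basal": ["Krt5", "Krt14", "Acta2"],
    "luminal_progenitor": ["Kit", "Elf5", "Aldh1a3"],
    "hormone_responsive": ["Esr1", "Pgr", "Foxa1"],
}

# Hypothetical log-normalised expression values
basal_cell = {"Krt5": 3.1, "Krt14": 2.8, "Acta2": 2.5, "Kit": 0.1}
krt14_positive_luminal = {"Krt14": 2.9, "Kit": 2.7, "Elf5": 2.4, "Aldh1a3": 2.2}
mixed_cell = {"Krt5": 2.0, "Krt14": 2.1, "Acta2": 1.9, "Kit": 2.0, "Elf5": 2.2, "Aldh1a3": 1.8}

for name, cell in [("basal_cell", basal_cell),
                   ("krt14_positive_luminal", krt14_positive_luminal),
                   ("mixed_cell", mixed_cell)]:
    print(name, classify(cell, SIGNATURES)[0])
```

In this toy setup, a cell expressing a single basal marker alongside a full luminal progenitor panel is still assigned to the luminal progenitor signature, whereas a cell scoring equally on two signatures is flagged as ambiguous rather than declared a hybrid; such cells could equally be doublets and would need orthogonal validation.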

The impact aging, menopause, parity and obesity have on the adult mammary gland

The effect ageing, menopause, parity and obesity have on mammary cell subpopulation proportions has been investigated across a number of studies [93], [100], [103], [104]. An article examining primary mammary cells of healthy women using mass cytometry observed an age-related increase in certain luminal cell subpopulations with a corresponding decrease in myoepithelial cell proportions [104]. Similarly, a recent article comparing the mammary glands of young and aged mice found that the aged gland contains altered proportions and gene expression profiles of both epithelial and stromal cells [93]. Interestingly, the authors found an increase in the proportion of epithelial cells and, specifically, an increase in the numbers of alveolar cells. It was also noted that in myoepithelial cells, there was a decrease in the expression levels of genes associated with basement membrane protein synthesis as well as an increase in inflammatory cytokine production by the myoepithelial, vascular endothelial and macrophage cells [93]. These findings suggest dramatic changes occurring in the stroma with ageing that may provide a pro-tumourigenic microenvironment. Menopause, a consequence of ageing, is a risk factor for breast cancer development, and changes in the mammary cell subpopulations may similarly provide a mechanism behind this phenomenon. In the context of surgical menopause (through an ovariectomy), the impact that oestrogen (17β-oestradiol, E2) supplementation, with or without exposure to the harmful endocrine-disrupting environmental contaminant PBDE (polybrominated diphenyl ethers), has on mammary cell subpopulation proportions was investigated [100]. This study found that in the menopausal vehicle-treated mouse there is an abundance of fibroblasts and stromal cells with a lower proportion of epithelial cells. However, the use of E2 induces the regrowth of terminal end bud-like structures, which is enhanced in the presence of PBDE. 
Administering E2 and PBDE together induced expression of progesterone receptor (PR) independent of oestrogen receptor (ERα, Esr1-) cells in two luminal cell populations as well as an increase in M2 macrophage populations, which likely contribute to the mammary tissue remodelling and production of a pro-tumour microenvironment. Another set of factors thought to change the risk of developing breast cancer includes parity and obesity. Murrow et al., in a recently updated preprint, explored differences in the human mammary gland composition as a result of either obesity or parity [103]. It was found that parity increased the number of myoepithelial cells and the transcriptional response of hormone responsive cells in the mammary gland, as well as altering the size of alveoli. Interestingly, they found that luminal progenitor proportions do not seem to correlate with parity and that hormone responsive cells decrease in number in obese women. Coming from a slightly different perspective, another study examined the impact that high fat diet (HFD)-induced obesity had on the stromal compartment of the mammary glands of Brca1-/-, p53+/- breast cancer model mice [105]. This study found that HFD induced an increase in the expression of extracellular matrix genes (Col3a1, Col6a3, Eln and Sparc) and pro-tumourigenic M2 macrophage markers in monocytes, suggesting an alteration in stromal cell function and microenvironment in obese breast cancer-prone mice. From these studies, it is clear that certain risk factors such as age, menopause and obesity alter both the cellular states and proportions of epithelial and stromal subtypes, suggesting that careful characterisation of stromal cells is essential to understand breast cancer development.
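Many of the comparisons above come down to how the proportion of each annotated subtype shifts between two conditions. A minimal sketch of that calculation is given below; the labels and counts are invented for illustration and are not taken from any of the cited studies, and the helper names `subtype_proportions` and `proportion_shift` are hypothetical.

```python
from collections import Counter

def subtype_proportions(labels):
    """Fraction of cells carrying each subtype label; empty input gives an empty dict."""
    counts = Counter(labels)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {subtype: n / total for subtype, n in counts.items()}

def proportion_shift(labels_a, labels_b):
    """Change in proportion (condition B minus condition A) for every subtype seen in either."""
    prop_a = subtype_proportions(labels_a)
    prop_b = subtype_proportions(labels_b)
    subtypes = sorted(set(prop_a) | set(prop_b))
    return {s: prop_b.get(s, 0.0) - prop_a.get(s, 0.0) for s in subtypes}

# Invented per-cell annotations for two hypothetical samples (100 cells each)
young = ["myoepithelial"] * 40 + ["alveolar"] * 20 + ["hormone_sensing"] * 40
aged = ["myoepithelial"] * 30 + ["alveolar"] * 35 + ["hormone_sensing"] * 35

shift = proportion_shift(young, aged)
for subtype, delta in shift.items():
    print(f"{subtype}: {delta:+.2f}")
```

Because proportions are compositional, an apparent gain in one subtype necessarily lowers the others, and dissociation bias can shift these values before any biology is considered.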

Understanding mammary stroma diversity is integral to understanding breast cancer development

Theories proposing that the mammary stroma induces a wound healing phenotype, elicited in breasts with a high mammographic density, during obesity and in post-lactational remodelling, have been put forward as a mechanism for breast cancer development [90]. As such, it is important to carefully characterise normal stromal cell compositions and how these might change as a mammary gland acquires a tumourigenic phenotype. Indeed, an early scRNA-seq study that profiled cells from breast tumours found that many of the cells with a normal phenotype (as identified by estimates of copy number variations) were immune cells [106]. Increasing interest in the macrophage proportion of mammary immune cells led one group to investigate the CD45+ population of 10-week-old murine mammary glands. Using scRNA-seq, they identified a Lyve-1 subpopulation of tissue-resident macrophages which expressed high levels of extracellular matrix remodelling-associated genes [107]. In human tissue, from either breast tumour or matched normal breast tissue, scRNA-seq analysis of the CD45+ compartment found that immune cell subpopulations identified in normal tissue were only a small subset of those identified in tumours [108]. Interestingly, the authors often found M1 and M2 macrophage-associated genes in the same cell, which were positively correlated, suggesting that each state might not be as mutually exclusive as previously thought and that cell state might instead exist along a continuum. Another key stromal cell type of interest is the mammary fibroblast, in particular its similarity to cancer associated fibroblasts (CAFs). In sorted fibroblasts extracted from precancer MMTV-PyMT mouse mammary cells, three major populations of CAFs have been identified, including vascular (vCAF, with a subset of cycling CAFs, cCAFs), matrix (mCAF) and developmental (dCAF) associated CAFs [109]. 
These cells were found to be transcriptionally and spatially distinct: vCAFs were found to come from a perivascular location, resident fibroblasts gave rise to mCAFs, and dCAFs are malignant cells that have undergone epithelial to mesenchymal transition [109]. On the other hand, a subsequent study investigating CAFs and normal fibroblasts in triple-negative breast cancer (TNBC)-modelling BALB/c-derived 4T1 mammary tumours identified 6 CAF subpopulations [110]. These subpopulations included Ly6c1high, α-SMAhigh, Cd53high, Crabp1high, Cd74high and cycling CAFs, which were also identified in the above study. This study further compared its findings to those of Bartoschek et al. and found that mCAFs were enriched for Crabp1 and other markers expressed by Crabp1high cells, suggesting that these cells may represent similar subpopulations. A recent study by Wu et al. examined the epithelial and stromal compartments of primary TNBC tumours from 5 patients and found, aside from the expected normal mammary subpopulations, a basal epithelial cancer cluster, two CAF subpopulations and two perivascular-like (PVL) subpopulations [111]. The noted subpopulations included myofibroblast-like CAFs (myCAFs), inflammatory CAFs (iCAFs) and mature and immature PVL cells. Here the authors suggest that the cells described as vCAFs in the Bartoschek et al. study may be the PVL cells identified by this study [111]. Not all CAF subsets were identified across studies, highlighting the need to harmonise findings as much as possible to determine translatable cell subpopulations.
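The observation above that M1- and M2-associated genes were positively correlated within the same macrophages [108] rests on a per-cell correlation between two signature scores. A minimal sketch of that calculation follows, assuming each cell has already been given one M1 score and one M2 score; the scores below are invented and the helper name `pearson` is hypothetical.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length sequences with at least 2 values")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_x * var_y)

# Invented per-cell signature scores for five hypothetical macrophages
m1_scores = [0.2, 0.5, 0.9, 1.4, 1.8]
m2_scores = [0.1, 0.6, 1.0, 1.2, 1.9]

r = pearson(m1_scores, m2_scores)
print(f"M1 vs M2 score correlation: r = {r:.2f}")
```

If M1 and M2 were mutually exclusive states, a negative correlation across cells would be expected; a positive value, as in this toy example, is the pattern that would be read as support for a continuum.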

Profiling breast cancer tumours

Many single cell techniques have been utilised to examine human mammary tumours, and together these studies may be considered the beginning of a breast cancer cell atlas. A recent study by Wagner et al. used mass cytometry to examine the expression of 73 proteins across 144 human breast tumours and 50 non-tumour tissue samples to determine different cell subpopulations [112]. It was found that, regardless of tumour subtype, there was a significant amount of cellular variation within individual tumours. Common to all breast cancer subtypes, however, were PD-1+ T-cells and PD-L1+ tumour associated macrophages (TAMs). Another recent mass cytometry study also generated a cell atlas but instead examined 62 breast cancer cell lines and five lines from healthy tissue [113]. Comparisons between breast cancer cell lines and primary breast cancer tumours were made by Gao et al. using a novel technique of single-nucleus RNA-sequencing [114]. Similar to Wagner et al., different subpopulations of cells could be found within a single tumour. However, through examining all populations together, a rare subpopulation of highly proliferative cells was identified that upregulated breast cancer associated genes [114]. Focusing more closely on six triple negative breast cancer tumours, Karaayvaz et al. found that although most clusters arose from separate tumours from different patients, one cluster was contributed to by all tumour samples [115]. This subcluster had a gene expression profile associated with multiple signatures of treatment resistance and metastasis and was characterised by activated glycosphingolipid metabolism and associated innate immunity pathways [115]. Using spatial transcriptomics, Salmén et al. instead examined another breast cancer subtype, HER2-positive tumours, and found by integrating topographical information from the tumour section images and spatial spots that immune cells infiltrated into the invasive regions of the cancers [116]. In contrast, Andersson et al. 
who also used spatial transcriptomics but instead integrated image analysis with scRNA-seq data, found that epithelial cells remained segregated from the cross-talking stromal cells [117]. In many tumours, plasma cells were spatially segregated from B-cells, which were instead found to colocalise near T-cells in several patients [117]. Spatial information has also been integrated with scDNA-sequencing, where analysis of breast tissue from patients with ductal carcinoma in situ (DCIS) together with invasive ductal carcinoma (IDC) found that genome evolution occurs in the ducts before tumour cells escape the basement membrane and begin generation of the IDC tissue [36]. Analysis of the tumour cell genome and epigenome is essential to identifying how cancer cells acquire mutations and progress to become metastatic. Use of HM-SNS (see single cell DNA-sequencing section) revealed that copy number aberrations (CNA) were acquired in the earliest stages of triple negative breast cancer evolution in short, punctuated bursts rather than through gradual evolution, as was previously thought [35]. Interestingly, Wang et al. found that triple-negative tumour cells had an increased mutation rate compared to ER+ cells [29]. Further to the findings of Gao et al. [35], Wang et al. found that aneuploid arrangements arose early in tumour development and remained stable across clonal expansion [29]. A more recent study using single-cell copy number analysis found that pseudo-diploid single cells existed across different PAM50 subtypes and contained cancer specific alterations (such as 1q gain and 16q loss), suggesting intrinsic genomic stability and providing a potential identifying feature that could be used in early disease stage risk assessment [118]. Analysis of chromatin states in patient-derived treatment-resistant xenograft models found that a subset of cells across different tumours shared a common chromatin signature consisting of a loss of the H3K27me3 chromatin mark [119]. 
Another study examining the epigenome of breast cancer cells focused particularly on the DNA methylation patterns of circulating tumour cells and found that binding sites for stemness and proliferation-associated transcripts are specifically hypomethylated [120]. Utilisation of many different single cell tools provides new understandings about breast tumour cell diversity and growth potential; however, in many cases normal mammary cells may not have been profiled in the same way and thus cannot be compared. Where similar technologies have compared tumour and normal mammary cells, it is often the case that they have vastly different profiles, making it difficult to identify intermediate cells and discern the cell of origin for cancer. Increasing numbers of studies are now focused on examining normal mammary tissue from individuals with high breast cancer risk that may assist in identifying intermediate pre-clinical cells that can be targeted for early detection of disease.

Conclusions

As is evident from the increasing number of studies utilising these tools, single cell technologies are playing an increasingly important role in understanding the cellular composition of the mammary gland. Here we presented an overview of many of the findings from the emerging studies, which together are beginning to give us a clearer picture of the different mammary subpopulations that exist in both the human and murine gland and how they change in normal development and cancer. Increasingly, there is a need to better unify findings from different studies through the development of consortiums such as the Human Cell Atlas. It is important to remember that disparate isolation techniques may influence which cell subpopulations are found, and that computational tools may be required to correct for batch or indeed cellular isolation effects that may have produced discordant views of cell subpopulations in previous literature. Whilst computational tools may be required to overcome the limitations of different cell isolation techniques, in silico modelling also has its limitations. All single cell technologies to date only allow for the capture and examination of single cell profiles at a single snapshot in time, so any differentiation trajectories in the cell populations have only been inferred in silico. Hence, it is important that other in vitro and in vivo techniques, such as organoid modelling or lineage tracing experiments, are used to validate findings inferred from single cell analysis.

This is, however, an exciting era for molecular biology, and future studies should take full advantage of current and emerging techniques. Topographical/spatial transcriptomics techniques such as Slide-seq [121] permit scRNA-seq of cells from known positions in ex vivo tissue sections. Such techniques allow cellular identity determined from a transcriptomic profile to be verified by examining the cell's position within a tissue, as well as its interaction partners such as surrounding cells or extracellular matrix. Techniques that combine two single cell technologies, such as CITE-seq, allow for verification of cell subpopulation identity by combining cell surface marker analysis with scRNA-seq generated transcriptomic profiles. Similarly, SHARE-seq allows for the prediction of future cell state by integrating scRNA-seq and scATAC-seq in the same cells. This technique provides information on cell identity and function through transcriptomic profiling and on cell state through chromatin accessibility. Together, techniques such as these will allow us to answer some of the most fundamental and interesting questions of mammary gland biology research, such as how the mammary gland is arranged, which definitive mammary subpopulations exist and what the differentiation potential of different cell types is.
References (113 in total)

1.  Diffusion maps for high-dimensional single-cell analysis of differentiation data.

Authors:  Laleh Haghverdi; Florian Buettner; Fabian J Theis
Journal:  Bioinformatics       Date:  2015-05-21       Impact factor: 6.937

2.  High-throughput single-cell ChIP-seq identifies heterogeneity of chromatin states in breast cancer.

Authors:  Kevin Grosselin; Adeline Durand; Justine Marsolier; Adeline Poitou; Elisabetta Marangoni; Fariba Nemati; Ahmed Dahmani; Sonia Lameiras; Fabien Reyal; Olivia Frenoy; Yannick Pousse; Marcel Reichen; Adam Woolfe; Colin Brenan; Andrew D Griffiths; Céline Vallot; Annabelle Gérard
Journal:  Nat Genet       Date:  2019-05-31       Impact factor: 38.330

3.  In situ identification of bipotent stem cells in the mammary gland.

Authors:  Anne C Rios; Nai Yang Fu; Geoffrey J Lindeman; Jane E Visvader
Journal:  Nature       Date:  2014-01-26       Impact factor: 49.962

4.  Single-cell RNA-Seq reveals cell heterogeneity and hierarchy within mouse mammary epithelia.

Authors:  Heng Sun; Zhengqiang Miao; Xin Zhang; Un In Chan; Sek Man Su; Sen Guo; Chris Koon Ho Wong; Xiaoling Xu; Chu-Xia Deng
Journal:  J Biol Chem       Date:  2018-04-17       Impact factor: 5.157

5.  Genome-wide detection of single-nucleotide and copy-number variations of a single human cell.

Authors:  Chenghang Zong; Sijia Lu; Alec R Chapman; X Sunney Xie
Journal:  Science       Date:  2012-12-21       Impact factor: 47.728

Review 6.  Single Cell RNA Sequencing of Human Milk-Derived Cells Reveals Sub-Populations of Mammary Epithelial Cells with Molecular Signatures of Progenitor and Mature States: a Novel, Non-invasive Framework for Investigating Human Lactation Physiology.

Authors:  Jayne F Martin Carli; G Devon Trahan; Kenneth L Jones; Nicole Hirsch; Kristy P Rolloff; Emily Z Dunn; Jacob E Friedman; Linda A Barbour; Teri L Hernandez; Paul S MacLean; Jenifer Monks; James L McManaman; Michael C Rudolph
Journal:  J Mammary Gland Biol Neoplasia       Date:  2020-11-20       Impact factor: 2.673

7.  Luminal progenitors restrict their lineage potential during mammary gland development.

Authors:  Veronica Rodilla; Alessandro Dasti; Mathilde Huyghe; Daniel Lafkas; Cécile Laurent; Fabien Reyal; Silvia Fre
Journal:  PLoS Biol       Date:  2015-02-17       Impact factor: 8.029

8.  Global Trend of Breast Cancer Mortality Rate: A 25-Year Study.

Authors:  Nasrindokht Azamjah; Yasaman Soltan-Zadeh; Farid Zayeri
Journal:  Asian Pac J Cancer Prev       Date:  2019-07-01

9.  Time-resolved single-cell analysis of Brca1 associated mammary tumourigenesis reveals aberrant differentiation of luminal progenitors.

Authors:  Karsten Bach; Sara Pensa; Marija Zarocsinceva; Katarzyna Kania; Julie Stockis; Silvain Pinaud; Kyren A Lazarus; Mona Shehata; Bruno M Simões; Alice R Greenhalgh; Sacha J Howell; Robert B Clarke; Carlos Caldas; Timotheus Y F Halim; John C Marioni; Walid T Khaled
Journal:  Nat Commun       Date:  2021-03-09       Impact factor: 14.919

Review 10.  Single-cell multiomics: technologies and data analysis methods.

Authors:  Jeongwoo Lee; Do Young Hyeon; Daehee Hwang
Journal:  Exp Mol Med       Date:  2020-09-15       Impact factor: 8.718

