
An atlas of transcriptional, chromatin accessibility, and surface marker changes in human mesoderm development.

Pang Wei Koh1, Rahul Sinha2, Amira A Barkal2, Rachel M Morganti2, Angela Chen2, Irving L Weissman2, Lay Teng Ang3, Anshul Kundaje1, Kyle M Loh2.   

Abstract

Mesoderm is the developmental precursor to myriad human tissues including bone, heart, and skeletal muscle. Unravelling the molecular events through which these lineages become diversified from one another is integral to developmental biology and understanding changes in cellular fate. To this end, we developed an in vitro system to differentiate human pluripotent stem cells through primitive streak intermediates into paraxial mesoderm and its derivatives (somites, sclerotome, dermomyotome) and separately, into lateral mesoderm and its derivatives (cardiac mesoderm). Whole-population and single-cell analyses of these purified populations of human mesoderm lineages through RNA-seq, ATAC-seq, and high-throughput surface marker screens illustrated how transcriptional changes co-occur with changes in open chromatin and surface marker landscapes throughout human mesoderm development. This molecular atlas will facilitate study of human mesoderm development (which cannot be interrogated in vivo due to restrictions on human embryo studies) and provides a broad resource for the study of gene regulation in development at the single-cell level, knowledge that might one day be exploited for regenerative medicine.

Year:  2016        PMID: 27996962      PMCID: PMC5170597          DOI: 10.1038/sdata.2016.109

Source DB:  PubMed          Journal:  Sci Data        ISSN: 2052-4463            Impact factor:   6.444


Background & Summary

A longstanding goal of regenerative medicine has been to efficiently differentiate stem cells into pure, functional populations of desired cell types. This has been challenging to achieve in practice: many extant differentiation methods take weeks or months to complete and result in heterogeneous mixtures of the target lineage and other contaminating lineages. Difficulties in differentiating stem cells into desired cell types in vitro might stem from incomplete knowledge of how stem cells naturally develop into these lineages during embryonic development. We focus here on human mesoderm development, which starts with the differentiation of pluripotent stem cells into the primitive streak (PS) and then into paraxial and lateral mesoderm[1-3]. Paraxial mesoderm subsequently buds off into tissue segments known as somites[4], with dorsal somites (dermomyotome) giving rise to brown fat, skeletal muscle, and dorsal dermis, and ventral somites (sclerotome) yielding the bone and cartilage of the spine and ribs[5]. Separately, lateral mesoderm goes on to form limb bud mesoderm[6] and cardiac mesoderm[7], the latter of which generates cardiomyocytes and other heart constituents. Our related publication[8] delineated a comprehensive roadmap for human mesoderm development that outlined key intermediate stages and defined the minimal combinations of extrinsic signals sufficient to induce differentiation at each stage. To elicit differentiation at defined stages, we identified the necessary inductive cues at each stage (as is typical) and also identified pathways leading to ‘unwanted’ cell fates, which we systematically repressed at each lineage branchpoint. We used this strategy to efficiently differentiate pluripotent stem cells, through anterior and mid primitive streak, into paraxial and lateral mesoderm, and subsequently into somites, sclerotome, dermomyotome, and cardiac mesoderm (Fig. 1). The identity and purity of these cell types were assessed by transplantation into mouse models and by single-cell gene expression profiling, respectively[8].
Figure 1

A schematic of human mesoderm development.

We differentiate and profile each of the 10 cell types shown in color here, starting with pluripotent stem cells and ending in dermomyotome, sclerotome, and cardiac mesoderm.

Here we describe in detail the materials and methods used to generate and profile these distinct cell types, with an eye towards promoting reproducibility and reuse of our data. We focus on the biological methods used to generate the data; the computational pre- and post-processing of the data; and the technical validation of the quality of our data. In contrast, our related publication[8] focused on experimentally validating the biological function and purity of the differentiated cell types and on extracting developmental insights from the data. Our dataset comprises three main types of data -- gene expression, chromatin accessibility, and surface marker expression -- across 10 different cell types (pluripotent stem cells, anterior PS, mid PS, paraxial mesoderm, somitomeres, somites, sclerotome, dermomyotome, lateral mesoderm and cardiac mesoderm). For expression, we performed bulk-population RNA-seq as well as single-cell RNA-seq (using the Fluidigm C1 system) on a total of 651 cells spanning all lineages. Chromatin accessibility across the genome was measured by ATAC-seq[9]. For each lineage, two to six biological replicates were assayed for bulk-population RNA-seq and ATAC-seq. Finally, the expression of 332 cell-surface markers was ascertained on most lineages by means of high-throughput antibody screening. Taken together, this dataset will constitute a useful resource for the study of human mesoderm development. For example, this dataset enabled us to identify novel marker genes in somitogenesis (a transient process which cannot be observed in vivo due to restrictions on the use of human embryos); identify the putative cell-of-origin for different subtypes of congenital scoliosis; and infer the activity of transcription factors at each stage of mesodermal development[8]. The data from the high-throughput surface marker screen will also be helpful in purifying desired cell types for transplantation or further study. 
Moreover, we believe that this dataset will be useful as a broader resource for the analysis of time-course data, e.g., as a testing ground for algorithms that aim to reconstruct developmental paths from single-cell RNA-seq data[10,11], or for the study of how changes in chromatin accessibility are correlated with, and are ultimately causative of, changes in gene expression across developmental time and space.

Methods

We reproduce here the experimental protocols included in our related publication[8], with added detail on our computational processing steps, RNA library construction, and surface marker screening. A list of all experiments reported here, together with accession codes of the corresponding data, can be found in Table 1 (available online only).
Table 1

Overall experimental metadata briefly describing each of the data sets available, with links to the appropriate data repository

Source (Cell type) | Sample ID | Protocol 1 | Protocol 2 | Protocol 3 | Data
D0 hESC | H7_hESC_ATAC1 | Nuclei isolation | ATAC-seq |  | SRR3689759
D0 hESC | H7_hESC_ATAC2 | Nuclei isolation | ATAC-seq |  | SRR3689760
D0 hESC | H7hESC_1 | RNA extraction | Bulk RNA-seq |  | SRR3439477
D0 hESC | H7hESC_2 | RNA extraction | Bulk RNA-seq |  | SRR3439478
D0 hESC | H7hESC_3 | RNA extraction | Bulk RNA-seq |  | SRR3439480
D0 hESC | H7Trzl | RNA extraction | Bulk RNA-seq |  | SRR3439481
D0 hESC | multiple | Single-cell capture | Single-cell RNA-seq |  | SRX1977195
D0 hESC | H7 hESC | Surface marker screening |  |  | 10.6084/m9.figshare.3505817
D0 hESC | (processed peak calls) | Nuclei isolation | ATAC-seq |  | GSM2257291
D1 Anterior Primitive Streak | APS_ATAC3 | Differentiation | Nuclei isolation | ATAC-seq | SRR3689761
D1 Anterior Primitive Streak | APS_ATAC4 | Differentiation | Nuclei isolation | ATAC-seq | SRR3689762
D1 Anterior Primitive Streak | APS_1 | Differentiation | RNA extraction | Bulk RNA-seq | SRR3439429
D1 Anterior Primitive Streak | APS_2 | Differentiation | RNA extraction | Bulk RNA-seq | SRR3439430
D1 Anterior Primitive Streak | APS_3 | Differentiation | RNA extraction | Bulk RNA-seq | SRR3439431
D1 Anterior Primitive Streak | multiple | Differentiation | Single-cell capture | Single-cell RNA-seq | SRX1977196
D1 Anterior Primitive Streak | Ant Primitive Streak (MIXL1-GFP+) | Differentiation | Surface marker screening |  | 10.6084/m9.figshare.3505817
D1 Anterior Primitive Streak | (processed peak calls) | Differentiation | Nuclei isolation | ATAC-seq | GSM2257292
D1 Mid Primitive Streak | MPS_ATAC5 | Differentiation | Nuclei isolation | ATAC-seq | SRR3689763
D1 Mid Primitive Streak | MPS_ATAC6 | Differentiation | Nuclei isolation | ATAC-seq | SRR3689764
D1 Mid Primitive Streak | MPS_1 | Differentiation | RNA extraction | Bulk RNA-seq | SRR3439482
D1 Mid Primitive Streak | MPS_2 | Differentiation | RNA extraction | Bulk RNA-seq | SRR3439485
D1 Mid Primitive Streak | MPS_3 | Differentiation | RNA extraction | Bulk RNA-seq | SRR3439486
D1 Mid Primitive Streak | MPS_4 | Differentiation | RNA extraction | Bulk RNA-seq | SRR3439487
D1 Mid Primitive Streak | multiple | Differentiation | Single-cell capture | Single-cell RNA-seq | SRX1977197
D1 Mid Primitive Streak | (processed peak calls) | Differentiation | Nuclei isolation | ATAC-seq | GSM2257293
D2 DLL1− Paraxial Mesoderm | DLL1nD2nonPXM_1 | Differentiation | RNA extraction | Bulk RNA-seq | SRR3439468
D2 DLL1− Paraxial Mesoderm | DLL1nD2nonPXM_2 | Differentiation | RNA extraction | Bulk RNA-seq | SRR3439469
D2 DLL1+ Paraxial Mesoderm | DLL1pPXm_ATAC7 | Differentiation | Nuclei isolation | ATAC-seq | SRR3689781
D2 DLL1+ Paraxial Mesoderm | DLL1pPXm_ATAC8 | Differentiation | Nuclei isolation | ATAC-seq | SRR3689915
D2 DLL1+ Paraxial Mesoderm | DLL1pPXM_3 | Differentiation | RNA extraction | Bulk RNA-seq | SRR3439471
D2 DLL1+ Paraxial Mesoderm | multiple | Differentiation | Single-cell capture | Single-cell RNA-seq | SRX1977198
D2 DLL1+ Paraxial Mesoderm | DLLpPXM_1 | Differentiation | RNA extraction | Bulk RNA-seq | SRR3439472
D2 DLL1+ Paraxial Mesoderm | DLL1pPXM_2 | Differentiation | RNA extraction | Bulk RNA-seq | SRR3439470
D2 DLL1+ Paraxial Mesoderm | (processed peak calls) | Differentiation | Nuclei isolation | ATAC-seq | GSM2257294
D2 Lateral Mesoderm | D2Ltm_ATAC10 | Differentiation | Nuclei isolation | ATAC-seq | SRR3689918
D2 Lateral Mesoderm | D2Ltm_ATAC9 | Differentiation | Nuclei isolation | ATAC-seq | SRR3689916
D2 Lateral Mesoderm | D2LtM_1 | Differentiation | RNA extraction | Bulk RNA-seq | SRR3439434
D2 Lateral Mesoderm | D2LtM_2 | Differentiation | RNA extraction | Bulk RNA-seq | SRR3439437
D2 Lateral Mesoderm | multiple | Differentiation | Single-cell capture | Single-cell RNA-seq | SRX1977202
D2 Lateral Mesoderm | (processed peak calls) | Differentiation | Nuclei isolation | ATAC-seq | GSM2257298
D2 Paraxial Mesoderm | Paraxial Mesoderm | Differentiation | Surface marker screening |  | 10.6084/m9.figshare.3505817
D2.25 Somitomeres | Smtmrs_ATAC21 | Differentiation | Nuclei isolation | ATAC-seq | SRR3689991
D2.25 Somitomeres | Smtmrs_ATAC22 | Differentiation | Nuclei isolation | ATAC-seq | SRR3690220
D2.25 Somitomeres | multiple | Differentiation | Single-cell capture | Single-cell RNA-seq | SRX1977204
D2.25 Somitomeres | (processed peak calls) | Differentiation | Nuclei isolation | ATAC-seq | GSM2257300
D3 Cardiac Mesoderm | Cardiac Mesoderm (NKX2.5-GFP+) | Differentiation | Surface marker screening |  | 10.6084/m9.figshare.3505817
D3 Early Somite | ESMT_ATAC13 | Differentiation | Nuclei isolation | ATAC-seq | SRR3689931
D3 Early Somite | ESMT_ATAC14 | Differentiation | Nuclei isolation | ATAC-seq | SRR3689932
D3 Early Somite | Smt_1 | Differentiation | RNA extraction | Bulk RNA-seq | SRR3439490
D3 Early Somite | Smt_2 | Differentiation | RNA extraction | Bulk RNA-seq | SRR3439491
D3 Early Somite | D3EarlySmt_1 | Differentiation | RNA extraction | Bulk RNA-seq | SRR3439438
D3 Early Somite | D3EarlySmt_2 | Differentiation | RNA extraction | Bulk RNA-seq | SRR3439440
D3 Early Somite | multiple | Differentiation | Single-cell capture | Single-cell RNA-seq | SRX1977199
D3 Early Somite | Smt_4 | Differentiation | RNA extraction | Bulk RNA-seq | SRR3439494
D3 Early Somite | Smt_3 | Differentiation | RNA extraction | Bulk RNA-seq | SRR3439493
D3 Early Somite | Early Somite | Differentiation | Surface marker screening |  | 10.6084/m9.figshare.3505817
D3 Early Somite | (processed peak calls) | Differentiation | Nuclei isolation | ATAC-seq | GSM2257295
D3 GARP+ Cardiac Mesoderm | D3CrdcM_ATAC15 | Differentiation | Nuclei isolation | ATAC-seq | SRR3689933
D3 GARP+ Cardiac Mesoderm | D3CrdcM_ATAC16 | Differentiation | Nuclei isolation | ATAC-seq | SRR3689934
D3 GARP+ Cardiac Mesoderm | D3GARPpCrdcM_1 | Differentiation | RNA extraction | Bulk RNA-seq | SRR3439441
D3 GARP+ Cardiac Mesoderm | D3GARPpCrdcM_2 | Differentiation | RNA extraction | Bulk RNA-seq | SRR3439442
D3 GARP+ Cardiac Mesoderm | multiple | Differentiation | Single-cell capture | Single-cell RNA-seq | SRX1977203
D3 GARP+ Cardiac Mesoderm | (processed peak calls) | Differentiation | Nuclei isolation | ATAC-seq | GSM2257299
D5 Dermomyotome | Drmmtm_ATAC19 | Differentiation | Nuclei isolation | ATAC-seq | SRR3689935
D5 Dermomyotome | Drmmtm_ATAC20 | Differentiation | Nuclei isolation | ATAC-seq | SRR3689936
D5 Dermomyotome | Drmmtm_1 | Differentiation | RNA extraction | Bulk RNA-seq | SRR3439474
D5 Dermomyotome | Drmmtm_2 | Differentiation | RNA extraction | Bulk RNA-seq | SRR3439475
D5 Dermomyotome | D5CentralDrmmtm | Differentiation | RNA extraction | Bulk RNA-seq | SRR3439443
D5 Dermomyotome | multiple | Differentiation | Single-cell capture | Single-cell RNA-seq | SRX1977201
D5 Dermomyotome | Drmmtm_3 | Differentiation | RNA extraction | Bulk RNA-seq | SRR3439476
D5 Dermomyotome | Dermomyotome | Differentiation | Surface marker screening |  | 10.6084/m9.figshare.3505817
D5 Dermomyotome | (processed peak calls) | Differentiation | Nuclei isolation | ATAC-seq | GSM2257297
D6 PDGFRA+ Sclerotome | D6Sclrtm_ATAC11 | Differentiation | Nuclei isolation | ATAC-seq | SRR3689921
D6 PDGFRA+ Sclerotome | D6Sclrtm_ATAC12 | Differentiation | Nuclei isolation | ATAC-seq | SRR3689923
D6 PDGFRA+ Sclerotome | Sclrtm_1 | Differentiation | RNA extraction | Bulk RNA-seq | SRR3439488
D6 PDGFRA+ Sclerotome | Sclrtm_2 | Differentiation | RNA extraction | Bulk RNA-seq | SRR3439489
D6 PDGFRA+ Sclerotome | D6PDGFRApSclrtm_1 | Differentiation | RNA extraction | Bulk RNA-seq | SRR3439456
D6 PDGFRA+ Sclerotome | multiple | Differentiation | Single-cell capture | Single-cell RNA-seq | SRX1977200
D6 PDGFRA+ Sclerotome | (processed peak calls) | Differentiation | Nuclei isolation | ATAC-seq | GSM2257296
D6 Sclerotome | Sclerotome | Differentiation | Surface marker screening |  | 10.6084/m9.figshare.3505817

Bulk-population RNA-seq

RNA extraction, library preparation and sequencing

For bulk-population RNA-seq, RNA was extracted from either whole cell populations or cell subsets purified by fluorescence-activated cell sorting (FACS). In brief, RNA was obtained from undifferentiated H7 hESCs (day 0 of in vitro differentiation), H7-derived anterior primitive streak populations (day 1), H7-derived mid primitive streak populations (day 1), H7-derived lateral mesoderm (day 2), H7-derived FACS-purified GARP+ cardiac mesoderm (day 3), H7-derived FACS-purified DLL1+ paraxial mesoderm populations (day 2), H7-derived early somite progenitor populations (day 3), H7-derived dermomyotome populations (day 5, treated with BMP4+CHIR99021+Vismodegib on days 4–5), and H7-derived FACS-purified PDGFRα+ sclerotome populations (day 6). Total RNA from the above cell populations was isolated using Trizol (Thermo Fisher) as per the manufacturer's recommendations, with the additional use of linear polyacrylamide (Sigma) as a carrier to facilitate RNA precipitation. Purified total RNA was treated with 4 units of RQ1 RNase-free DNase (Promega) at 37 degrees Celsius for 1 h to remove trace amounts of genomic DNA. The DNase-treated total RNA was cleaned up using the RNeasy Micro Kit (Qiagen). Subsequently, the integrity of extracted RNA was assayed by on-chip electrophoresis (Agilent Bioanalyzer), and only samples with a high RNA integrity number (RIN) were used for subsequent cDNA library preparation. Purified total RNA (10–50 ng) was reverse-transcribed into cDNA and amplified using the Ovation RNA-seq System V2 (NuGEN). Amplified cDNA was sheared using the Covaris S2 (Covaris) with the following settings: total volume 120 μl, duty cycle 10%, intensity 5, cycle/burst 100 and total time 2 min. The sheared cDNA was cleaned up using Agencourt Ampure XP beads (Beckman Coulter) to obtain cDNA fragments >=400 base pairs (bp). 
500 ng of sheared and size-selected cDNA was used as input for library preparation using the NEBNext Ultra DNA Library Prep Kit for Illumina (New England BioLabs) as per the manufacturer's recommendations. Resulting libraries (fragment distribution: 300–700 bp; peak 500–550 bp) were pooled (multiplexed) and sequenced using either a HiSeq 4000 or NextSeq 500 (Illumina) at the Stanford Functional Genomics Facility to obtain 2×150 bp paired-end reads. For each RNA-seq library, the effectiveness of adapter ligation and the effective library concentration were determined by qPCR and Bioanalyzer (Agilent) prior to pooling and loading onto the sequencers. Each sample in our data constitutes a separate biological replicate. Bulk-population RNA-seq libraries were prepared in three batches (Table 2).
Table 2

Bulk-population RNA-seq metadata and mapping statistics.

Sample ID | Cell type | Batch | Number of uniquely mapped reads | Percentage of uniquely mapping reads
H7hESC_1 | D0 H7 hESC | 1 | 29077933 | 77.64
H7hESC_2 | D0 H7 hESC | 1 | 32593810 | 72.39
H7hESC_3 | D0 H7 hESC | 1 | 30800393 | 74.51
H7Trzl | D0 H7 hESC | 3 | 60399642 | 76.01
APS_1 | D1 Anterior Primitive Streak | 1 | 29444532 | 71.43
APS_2 | D1 Anterior Primitive Streak | 1 | 30585671 | 72.73
APS_3 | D1 Anterior Primitive Streak | 3 | 62559303 | 74.76
MPS_1 | D1 Mid Primitive Streak | 1 | 32195105 | 74.5
MPS_2 | D1 Mid Primitive Streak | 1 | 29668483 | 76.07
MPS_3 | D1 Mid Primitive Streak | 3 | 64236835 | 73.91
MPS_4 | D1 Mid Primitive Streak | 3 | 76385886 | 72.9
DLL1nD2nonPXM_1 | D2 DLL1− Paraxial Mesoderm | 2 | 28555523 | 64.46
DLL1nD2nonPXM_2 | D2 DLL1− Paraxial Mesoderm | 2 | 24999466 | 66.51
DLL1pPXM_3 | D2 DLL1+ Paraxial Mesoderm | 2 | 11948788 | 50.27
DLLpPXM_1 | D2 DLL1+ Paraxial Mesoderm | 2 | 29595403 | 73.74
DLL1pPXM_2 | D2 DLL1+ Paraxial Mesoderm | 2 | 25504394 | 71.61
D2LtM_1 | D2 Lateral Mesoderm | 3 | 79623847 | 72.3
D2LtM_2 | D2 Lateral Mesoderm | 3 | 85281213 | 70.89
D3GARPpCrdcM_1 | D3 GARP+ Cardiac Mesoderm | 3 | 87530457 | 69.6
D3GARPpCrdcM_2 | D3 GARP+ Cardiac Mesoderm | 3 | 94587346 | 74.36
Smt_1 | D3 Somite | 1 | 15272975 | 57.37
Smt_2 | D3 Somite | 1 | 24702016 | 60.63
D3EarlySmt_1 | D3 Somite | 3 | 60723730 | 65.19
D3EarlySmt_2 | D3 Somite | 3 | 64467176 | 64.97
Smt_4 | D3 Somite | 2 | 14235841 | 62.14
Smt_3 | D3 Somite | 2 | 20701016 | 62.06
Drmmtm_1 | D5 Dermomyotome | 1 | 32318123 | 73.43
Drmmtm_2 | D5 Dermomyotome | 1 | 18890789 | 63.49
D5CentralDrmmtm | D5 Dermomyotome | 3 | 126799726 | 69.06
Drmmtm_3 | D5 Dermomyotome | 2 | 10583923 | 65.92
Sclrtm_1 | D6 PDGFRA+ Sclerotome | 1 | 20751549 | 58.61
Sclrtm_2 | D6 PDGFRA+ Sclerotome | 1 | 23117071 | 66.89
D6PDGFRApSclrtm_1 | D6 PDGFRA+ Sclerotome | 3 | 102352286 | 74.86

Quantification and processing

RNA-seq reads were trimmed for base call quality (PHRED score >=21) and for adapter sequences (using Skewer[12]), and were then processed using a slightly modified version of the ENCODE long RNA-seq pipeline for quantification of mRNA expression (https://www.encodeproject.org/rna-seq/long-rnas/)[13]. Specifically, reads were aligned to hg38 using STAR 2.4 (ref. 14); gene-level expression was then quantified using RSEM 1.2.21 (ref. 15). We only kept samples with at least 10,000,000 uniquely mapping reads and with at least 50% of reads uniquely mapping, which meant rejecting one sample (from sclerotome) out of 34. The numbers and percentages of uniquely mapping reads for each sample are listed in Table 2. The full parameter settings used can be found in our versions of STAR_RSEM.sh and STAR_RSEM_prep.py (see Code Availability below). To facilitate global comparisons of gene expression levels across cell types, we first took the log2TPM (transcripts per million) values for each gene and then filtered out all genes for which the difference between the cell types with the highest and lowest expression was less than 2 (in log2TPM units, i.e., a 4-fold difference in expression). Next, we used ComBat with non-parametric priors[16] (as implemented through the sva R package[17]) to correct for batch effects. This sometimes left small negative values for the expression of some genes, which we set to 0. The R Markdown script implementing this batch correction is bulkDataViz.Rmd. For ease of use, we also prepared a spreadsheet with TPM values for each gene, augmented with the following information on each gene: 1) whether the gene product is present on the cell surface (GO code GO:0009986); 2) for each pair of adjacent conditions, whether the gene was differentially expressed between those conditions; and 3) the shrunken log-fold-change for that gene between those conditions. 
We provide (1) as a convenience to help in finding potential surface markers that were not included in our high-throughput screen (e.g., because an antibody was not available). (2) and (3) were calculated by DESeq2 (ref. 18) using batch information; genes were called as differentially expressed at a false discovery rate (FDR) of 0.1. The raw data from the bulk-population RNA-seq can be found in [Data Citation 1]. A spreadsheet of TPM values can be found in [Data Citation 2]. The annotated spreadsheet, as described in the previous paragraph, is in [Data Citation 3].
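The gene-level filtering and clamping steps described above can be illustrated with a minimal Python sketch. It is not the authors' code (their processing is in bulkDataViz.Rmd), and it makes several assumptions that the text does not state: a pseudocount of 1 is added before taking log2, replicates are averaged within each cell type before comparing the highest and lowest cell types, and the function names, gene names and sample names are hypothetical. Batch correction with ComBat, which sits between the two steps, is not reproduced here.

```python
import math
from collections import defaultdict


def log2_tpm(tpm, pseudocount=1.0):
    """log2-transform a TPM value (the pseudocount of 1 is an assumption)."""
    return math.log2(tpm + pseudocount)


def filter_by_dynamic_range(tpm_by_gene, cell_type_of, min_range=2.0):
    """Keep genes whose mean log2TPM differs by at least `min_range`
    between the highest- and lowest-expressing cell types.

    tpm_by_gene:  {gene: {sample: TPM}}
    cell_type_of: {sample: cell type}
    Averaging replicates within a cell type is an assumption of this sketch.
    """
    kept = {}
    for gene, sample_tpm in tpm_by_gene.items():
        per_type = defaultdict(list)
        for sample, tpm in sample_tpm.items():
            per_type[cell_type_of[sample]].append(log2_tpm(tpm))
        means = [sum(vals) / len(vals) for vals in per_type.values()]
        if max(means) - min(means) >= min_range:
            kept[gene] = {s: log2_tpm(t) for s, t in sample_tpm.items()}
    return kept


def clamp_negative(expr_by_gene):
    """Set small negative values (e.g., left by batch correction) to 0."""
    return {
        gene: {s: max(0.0, v) for s, v in sample_vals.items()}
        for gene, sample_vals in expr_by_gene.items()
    }


# Toy input: two hypothetical genes, two cell types with two replicates each.
cell_type_of = {"s1": "hESC", "s2": "hESC", "s3": "somite", "s4": "somite"}
tpm_by_gene = {
    "GENE_A": {"s1": 0.0, "s2": 0.0, "s3": 7.0, "s4": 7.0},  # range of 3 log2 units
    "GENE_B": {"s1": 3.0, "s2": 3.0, "s3": 7.0, "s4": 7.0},  # range of 1 log2 unit
}
kept = filter_by_dynamic_range(tpm_by_gene, cell_type_of)
```

In this toy input only GENE_A survives the filter, since its expression spans more than 2 log2 units across the two cell types.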

Single-cell RNA-seq

Library preparation and sequencing

Cells were briefly washed (DMEM/F12), dissociated (TrypLE Express), strained (100 μm filter), pelleted and re-suspended in DMEM/F12 for counting. Before single-cell capture, two quality control steps were implemented. First, cell size was estimated in order to determine whether cells should be loaded onto C1 capture arrays of either 10–17 μm or 17–25 μm size. Arrays were chosen for each lineage by estimating the median cell size of each given population on a flow cytometer on the basis of the FSC-W signal[19] and choosing an array with an appropriate pore size to accommodate such cells. Second, to ensure the high viability of in vitro-differentiated cells prior to commencing single-cell RNA-seq, for each population a separate aliquot of cells was stained with 1.1 μM DAPI and analyzed by flow cytometry; for all cell populations that were used for single-cell RNA-seq, >98% of cells were viable (i.e., DAPI negative). For single-cell capture, cells were diluted to a concentration of 1000 cells per μl, diluted in a 3:2 mixture of C1 Cell Suspension Reagent and DMEM/F12, and then loaded onto a Fluidigm C1 single-cell capture array chip for automated capture on a Fluidigm C1 Machine (Stanford Stem Cell Institute Genomics Core). 10–17 μm array chips were used for hESCs, day 1 anterior PS, day 2 sorted DLL1+ paraxial mesoderm, day 2.25 somitomeres, day 3 early somites, day 2 lateral mesoderm, day 3 sorted GARP+ cardiac mesoderm, day 5 central dermomyotome, and day 6 sorted PDGFRA+ sclerotome while a 17–25 μm array chip was used for day 1 mid PS. After loading, the efficiency of single-cell capture was verified using an automated microscope that imaged each captured cell on the chip. 
Subsequent cell lysis, cDNA synthesis, and amplification were executed within each microfluidic chamber in the array chip in an automated fashion with the Fluidigm C1 machine using the reagents from the SMARTer Ultra Low RNA Kit (Clontech, 634833), as per the manufacturers' instructions (Fluidigm, PN 100–7168 Rev. A2). The amplified cDNA from individual cells was harvested into a nuclease-free 96-well plate and diluted using the C1 harvesting reagent (Fluidigm). The concentration and integrity of amplified cDNA were assessed using a Fragment Analyzer (Advanced Analytical) in 96-well plate format. Only amplified cDNAs that (1) were not degraded and (2) originated from wells that were manually verified by microscopy to contain a single cell were carried forward for subsequent library construction. Because of this manual verification, we were able to effectively rule out doublets captured in the medium (10–17 μm) or the large (17–25 μm) array chips. A single-channel liquid handling robot, Mosquito X1 (TTP Labtech), was used to simultaneously 1) dilute amplified cDNAs from single cells from all lineages to a concentration range of 0.05–0.16 ng per μl with C1 Harvest Reagent (Fluidigm) as a diluent and 2) consolidate the diluted cDNA into 384-well plates. The diluted single-cell cDNAs were tagmented and converted to sequencing libraries in the 384-well plates using the Nextera XT DNA Sample Prep Kit (Illumina, FC-131–1096) in an automated fashion using a 16-channel pipetting robot, Mosquito HTS (TTP Labtech), and 384 distinct Illumina-compatible molecular barcodes. The resulting sequencing libraries from a single such 384-well plate were then pooled and cleaned up using Agencourt AMPure XP beads (Beckman Coulter). 
The pooled libraries were then analyzed for quality and concentration using Bioanalyzer (Agilent) and qPCR and loaded on a single lane of NextSeq 500 or two lanes of HiSeq 4000 to obtain 1–2 million 2×150 bp reads per cell. The reads obtained were trimmed for base call quality (PHRED score >=21) and the presence of adapter sequences using Skewer[12]. We quantified single-cell gene expression using the ENCODE long RNA-seq pipeline (with the same parameter settings as employed for analysis of bulk-population RNA-seq). We only kept samples with at least 1 million uniquely mapped reads and at least 70% of reads uniquely mapping, which meant keeping data from 498 single cells out of 651. The numbers and percentages of uniquely mapping reads for each cell are listed in Table 3 (available online only).
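The inclusion rule stated above (at least 1 million uniquely mapped reads and at least 70% of reads uniquely mapping) can be written as a short Python sketch. The function name and argument names are hypothetical, and this is an illustration of the stated thresholds rather than the authors' script; the example values are rows taken from Tables 2 and 3.

```python
def passes_qc(uniquely_mapped_reads, pct_uniquely_mapped,
              min_reads=1_000_000, min_pct=70.0):
    """Return True if a library meets both mapping thresholds.

    The defaults follow the single-cell criteria in the text; the
    bulk-population criteria (10,000,000 reads, 50%) can be passed instead.
    """
    return uniquely_mapped_reads >= min_reads and pct_uniquely_mapped >= min_pct


# Rows from Table 3 (single-cell RNA-seq).
aps_included = passes_qc(1496268, 81.21)    # APS-p1c10r1
aps_low_depth = passes_qc(631285, 79.63)    # APS-p1c10r8, fewer than 1 million reads
pxm_low_rate = passes_qc(2983359, 66.13)    # DLL1PXM-p8c6r1, below 70% uniquely mapping

# A row from Table 2 (bulk RNA-seq), checked against the bulk thresholds.
bulk_included = passes_qc(11948788, 50.27, min_reads=10_000_000, min_pct=50.0)  # DLL1pPXM_3
```

The three single-cell results agree with the "Included in analysis?" column of Table 3 for those cells (TRUE, FALSE and FALSE, respectively).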
Table 3

Single-cell RNA-seq metadata and mapping statistics

Sample ID | Number of uniquely mapped reads | Percentage of uniquely mapping reads | Included in analysis?
APS-p1c10r1 | 1496268 | 81.21 | TRUE
APS-p1c10r3 | 1635980 | 79.9 | TRUE
APS-p1c10r4 | 1462666 | 79.58 | TRUE
APS-p1c10r6 | 1015763 | 79.94 | TRUE
APS-p1c10r7 | 1649241 | 80.35 | TRUE
APS-p1c10r8 | 631285 | 79.63 | FALSE
APS-p1c11r2 | 2006570 | 79.83 | TRUE
APS-p1c11r3 | 1571051 | 80.36 | TRUE
APS-p1c11r6 | 1595606 | 80.29 | TRUE
APS-p1c11r7 | 1611080 | 80.05 | TRUE
APS-p1c12r2 | 1910166 | 80.12 | TRUE
APS-p1c12r3 | 1453044 | 81.01 | TRUE
APS-p1c12r4 | 1662275 | 80.47 | TRUE
APS-p1c12r5 | 1804505 | 80.55 | TRUE
APS-p1c12r6 | 1810735 | 76.06 | TRUE
APS-p1c12r7 | 2065268 | 79.42 | TRUE
APS-p1c1r2 | 1727374 | 80.73 | TRUE
APS-p1c1r5 | 1756430 | 80.18 | TRUE
APS-p1c1r8 | 1569290 | 77.97 | TRUE
APS-p1c2r1 | 885464 | 80.93 | FALSE
APS-p1c2r2 | 1926022 | 80.48 | TRUE
APS-p1c2r3 | 1941685 | 80.73 | TRUE
APS-p1c2r4 | 1700490 | 79.24 | TRUE
APS-p1c2r6 | 1614915 | 79.99 | TRUE
APS-p1c2r7 | 543563 | 76.21 | FALSE
APS-p1c3r1 | 1600354 | 81.01 | TRUE
APS-p1c3r3 | 1101602 | 81.42 | TRUE
APS-p1c3r4 | 1620729 | 80.95 | TRUE
APS-p1c3r5 | 1809789 | 80.39 | TRUE
APS-p1c3r6 | 1820410 | 80.26 | TRUE
APS-p1c4r2 | 1870031 | 80.81 | TRUE
APS-p1c4r5 | 1767798 | 79.67 | TRUE
APS-p1c4r6 | 1525333 | 79.93 | TRUE
APS-p1c4r7 | 1816027 | 80.58 | TRUE
APS-p1c4r8 | 1821569 | 80.37 | TRUE
APS-p1c5r1 | 1800248 | 81.67 | TRUE
APS-p1c5r2 | 1891607 | 81.05 | TRUE
APS-p1c5r3 | 1492013 | 80.99 | TRUE
APS-p1c5r4 | 1730157 | 80.21 | TRUE
APS-p1c5r7 | 1701332 | 80.69 | TRUE
APS-p1c5r8 | 1812781 | 80.93 | TRUE
APS-p1c6r2 | 2181845 | 80.94 | TRUE
APS-p1c6r3 | 1308024 | 81.28 | TRUE
APS-p1c6r4 | 1733343 | 80.84 | TRUE
APS-p1c6r5 | 1421945 | 77.27 | TRUE
APS-p1c6r6 | 1719273 | 80.1 | TRUE
APS-p1c6r8 | 1911832 | 74.68 | TRUE
APS-p1c7r2 | 1545812 | 81.26 | TRUE
APS-p1c7r3 | 1524138 | 80.8 | TRUE
APS-p1c7r4 | 1072955 | 80.44 | TRUE
APS-p1c7r5 | 1526998 | 80.09 | TRUE
APS-p1c7r6 | 1577051 | 79.48 | TRUE
APS-p1c7r8 | 840333 | 76.82 | FALSE
APS-p1c8r1 | 1787534 | 81.16 | TRUE
APS-p1c8r3 | 1353928 | 80.91 | TRUE
APS-p1c8r4 | 1607464 | 80.79 | TRUE
APS-p1c8r5 | 1536405 | 80.24 | TRUE
APS-p1c8r8 | 418551 | 72.48 | FALSE
APS-p1c9r1 | 1406401 | 80.75 | TRUE
APS-p1c9r2 | 1706509 | 81.23 | TRUE
APS-p1c9r3 | 1194409 | 81.22 | TRUE
APS-p1c9r4 | 1679021 | 80.14 | TRUE
APS-p1c9r6 | 1484868 | 80.4 | TRUE
APS-p1c9r7 | 1760868 | 79.9 | TRUE
cDM-p4c10r1 | 2019689 | 81.41 | TRUE
cDM-p4c10r2 | 1965420 | 80.01 | TRUE
cDM-p4c10r4 | 1870848 | 81.12 | TRUE
cDM-p4c10r5 | 1884639 | 81.67 | TRUE
cDM-p4c10r6 | 1847885 | 81.15 | TRUE
cDM-p4c11r1 | 1911203 | 79.96 | TRUE
cDM-p4c11r3 | 1717116 | 81.56 | TRUE
cDM-p4c11r5 | 1662303 | 80.64 | TRUE
cDM-p4c11r7 | 2072860 | 80.03 | TRUE
cDM-p4c11r8 | 1910380 | 81.22 | TRUE
cDM-p4c12r2 | 1915054 | 80.57 | TRUE
cDM-p4c12r3 | 1933069 | 79.73 | TRUE
cDM-p4c12r4 | 1817113 | 81.03 | TRUE
cDM-p4c12r5 | 1996977 | 81.25 | TRUE
cDM-p4c12r6 | 2111421 | 82.11 | TRUE
cDM-p4c12r7 | 2230099 | 79.93 | TRUE
cDM-p4c1r1 | 1828075 | 79.63 | TRUE
cDM-p4c1r2 | 1994440 | 80.47 | TRUE
cDM-p4c1r3 | 1984577 | 80.11 | TRUE
cDM-p4c1r4 | 1978378 | 81.57 | TRUE
cDM-p4c1r5 | 1894520 | 80.67 | TRUE
cDM-p4c1r7 | 2191642 | 80.94 | TRUE
cDM-p4c1r8 | 1889495 | 79.06 | TRUE
cDM-p4c2r2 | 1838478 | 79.98 | TRUE
cDM-p4c2r3 | 1829101 | 80.64 | TRUE
cDM-p4c2r4 | 1876122 | 80.46 | TRUE
cDM-p4c2r5 | 1757262 | 80.67 | TRUE
cDM-p4c2r6 | 1779703 | 80.34 | TRUE
cDM-p4c2r7 | 2062879 | 81.3 | TRUE
cDM-p4c3r1 | 1904076 | 80.17 | TRUE
cDM-p4c3r3 | 1995857 | 78.14 | TRUE
cDM-p4c3r6 | 1762614 | 80.09 | TRUE
cDM-p4c3r8 | 2011670 | 80.68 | TRUE
cDM-p4c4r1 | 1746007 | 80.77 | TRUE
cDM-p4c4r2 | 1748403 | 80.22 | TRUE
cDM-p4c4r3 | 1765581 | 81.91 | TRUE
cDM-p4c4r4 | 1741191 | 79.69 | TRUE
cDM-p4c4r5 | 1817639 | 80.9 | TRUE
cDM-p4c4r6 | 1655302 | 79.89 | TRUE
cDM-p4c4r8 | 2141525 | 80.15 | TRUE
cDM-p4c5r2 | 1932752 | 80.97 | TRUE
cDM-p4c5r3 | 1829667 | 79.91 | TRUE
cDM-p4c5r6 | 1657866 | 81.54 | TRUE
cDM-p4c5r7 | 1936430 | 78.26 | TRUE
cDM-p4c6r2 | 1862491 | 80.72 | TRUE
cDM-p4c6r3 | 1869951 | 81.21 | TRUE
cDM-p4c6r7 | 1993252 | 77.67 | TRUE
cDM-p4c7r1 | 1788721 | 80.26 | TRUE
cDM-p4c7r3 | 1745311 | 81.19 | TRUE
cDM-p4c7r4 | 1583398 | 80.37 | TRUE
cDM-p4c7r5 | 1867644 | 80.28 | TRUE
cDM-p4c7r6 | 1437072 | 82.28 | TRUE
cDM-p4c7r7 | 1973276 | 81.6 | TRUE
cDM-p4c7r8 | 2539638 | 80.51 | TRUE
cDM-p4c8r1 | 2232466 | 80.51 | TRUE
cDM-p4c8r2 | 1924463 | 80.94 | TRUE
cDM-p4c8r3 | 1966729 | 79.69 | TRUE
cDM-p4c8r4 | 1833986 | 81.25 | TRUE
cDM-p4c8r6 | 1739467 | 81.13 | TRUE
cDM-p4c8r7 | 1702524 | 80.64 | TRUE
cDM-p4c9r1 | 2089258 | 80.94 | TRUE
cDM-p4c9r2 | 1599948 | 79.86 | TRUE
cDM-p4c9r3 | 1817174 | 79.3 | TRUE
cDM-p4c9r4 | 1992609 | 80.64 | TRUE
cDM-p4c9r5 | 1983037 | 80.94 | TRUE
cDM-p4c9r6 | 1726058 | 81.48 | TRUE
cDM-p4c9r7 | 1961458 | 81.39 | TRUE
D2_25somitomere-p9c10r2 | 1523221 | 83.23 | TRUE
D2_25somitomere-p9c10r3 | 1281308 | 82.57 | TRUE
D2_25somitomere-p9c10r4 | 1169199 | 82.93 | TRUE
D2_25somitomere-p9c10r5 | 40654 | 78.75 | FALSE
D2_25somitomere-p9c10r6 | 1489521 | 83.89 | TRUE
D2_25somitomere-p9c10r7 | 1493339 | 82.92 | TRUE
D2_25somitomere-p9c10r8 | 1351704 | 83.22 | TRUE
D2_25somitomere-p9c11r1 | 1014330 | 83.67 | TRUE
D2_25somitomere-p9c11r2 | 1444322 | 83.34 | TRUE
D2_25somitomere-p9c11r3 | 1209061 | 81.38 | TRUE
D2_25somitomere-p9c11r4 | 1236037 | 82.49 | TRUE
D2_25somitomere-p9c11r6 | 1422020 | 81.21 | TRUE
D2_25somitomere-p9c11r7 | 1519685 | 83.89 | TRUE
D2_25somitomere-p9c11r8 | 1437974 | 83.12 | TRUE
D2_25somitomere-p9c12r1 | 1123428 | 82.9 | TRUE
D2_25somitomere-p9c12r2 | 1249940 | 82.57 | TRUE
D2_25somitomere-p9c12r3 | 1092531 | 81.6 | TRUE
D2_25somitomere-p9c12r4 | 1323472 | 83.09 | TRUE
D2_25somitomere-p9c12r5 | 720869 | 79.78 | FALSE
D2_25somitomere-p9c12r6 | 1464125 | 83.25 | TRUE
D2_25somitomere-p9c12r7 | 1381278 | 83.13 | TRUE
D2_25somitomere-p9c1r1 | 1258841 | 82.63 | TRUE
D2_25somitomere-p9c1r2 | 1361435 | 83.28 | TRUE
D2_25somitomere-p9c1r3 | 1072989 | 83.87 | TRUE
D2_25somitomere-p9c1r5 | 1142438 | 83.3 | TRUE
D2_25somitomere-p9c1r7 | 1571532 | 83.97 | TRUE
D2_25somitomere-p9c2r1 | 1392293 | 83.59 | TRUE
D2_25somitomere-p9c2r2 | 1381840 | 82.08 | TRUE
D2_25somitomere-p9c2r3 | 861177 | 81.35 | FALSE
D2_25somitomere-p9c2r4 | 1255387 | 83.14 | TRUE
D2_25somitomere-p9c2r5 | 1126615 | 82.97 | TRUE
D2_25somitomere-p9c2r6 | 1184860 | 82.84 | TRUE
D2_25somitomere-p9c2r7 | 1363339 | 82.25 | TRUE
D2_25somitomere-p9c2r8 | 1132088 | 83.75 | TRUE
D2_25somitomere-p9c3r1 | 1124784 | 79.67 | TRUE
D2_25somitomere-p9c3r2 | 1419998 | 83.21 | TRUE
D2_25somitomere-p9c3r3 | 663489 | 81.92 | FALSE
D2_25somitomere-p9c3r4 | 1256006 | 84.04 | TRUE
D2_25somitomere-p9c3r5 | 871952 | 81.84 | FALSE
D2_25somitomere-p9c3r6 | 1200864 | 81.86 | TRUE
D2_25somitomere-p9c4r1 | 1444673 | 83.59 | TRUE
D2_25somitomere-p9c4r2 | 1610883 | 82.27 | TRUE
D2_25somitomere-p9c4r3 | 1297370 | 81.6 | TRUE
D2_25somitomere-p9c4r4 | 1343121 | 82.28 | TRUE
D2_25somitomere-p9c4r5 | 1572373 | 82.5 | TRUE
D2_25somitomere-p9c4r6 | 1847242 | 81.79 | TRUE
D2_25somitomere-p9c4r7 | 1350912 | 82.4 | TRUE
D2_25somitomere-p9c5r1 | 1412254 | 82.56 | TRUE
D2_25somitomere-p9c5r2 | 1548146 | 81.79 | TRUE
D2_25somitomere-p9c5r3 | 1136419 | 80.33 | TRUE
D2_25somitomere-p9c5r4 | 1237110 | 82.51 | TRUE
D2_25somitomere-p9c5r5 | 1461058 | 82.54 | TRUE
D2_25somitomere-p9c5r6 | 1528959 | 83.1 | TRUE
D2_25somitomere-p9c5r7 | 1611756 | 83.15 | TRUE
D2_25somitomere-p9c5r8 | 1349367 | 83.32 | TRUE
D2_25somitomere-p9c6r1 | 1050349 | 83.15 | TRUE
D2_25somitomere-p9c6r2 | 839879 | 82.34 | FALSE
D2_25somitomere-p9c6r3 | 1105600 | 81.23 | TRUE
D2_25somitomere-p9c6r4 | 914458 | 83.24 | FALSE
D2_25somitomere-p9c6r5 | 1326280 | 83.65 | TRUE
D2_25somitomere-p9c6r6 | 1204228 | 80.85 | TRUE
D2_25somitomere-p9c6r7 | 1024907 | 80.51 | TRUE
D2_25somitomere-p9c6r8 | 993224 | 81.11 | FALSE
D2_25somitomere-p9c7r1 | 1147234 | 82.9 | TRUE
D2_25somitomere-p9c7r2 | 496473 | 82.5 | FALSE
D2_25somitomere-p9c7r3 | 1473842 | 82.22 | TRUE
D2_25somitomere-p9c7r4 | 888574 | 80.67 | FALSE
D2_25somitomere-p9c7r5 | 1187268 | 83.82 | TRUE
D2_25somitomere-p9c7r6 | 1196961 | 82.93 | TRUE
D2_25somitomere-p9c7r7 | 1311356 | 82.88 | TRUE
D2_25somitomere-p9c7r8 | 1573973 | 80.19 | TRUE
D2_25somitomere-p9c8r1 | 1596476 | 80.48 | TRUE
D2_25somitomere-p9c8r2 | 1004266 | 82.24 | TRUE
D2_25somitomere-p9c8r3 | 1259905 | 80.02 | TRUE
D2_25somitomere-p9c8r4 | 729526 | 79.27 | FALSE
D2_25somitomere-p9c8r5 | 1457363 | 83.58 | TRUE
D2_25somitomere-p9c8r6 | 1550426 | 82.39 | TRUE
D2_25somitomere-p9c8r7 | 1400347 | 82.37 | TRUE
D2_25somitomere-p9c8r8 | 1504859 | 82.74 | TRUE
D2_25somitomere-p9c9r1 | 1100181 | 81.06 | TRUE
D2_25somitomere-p9c9r2 | 1413423 | 82.7 | TRUE
D2_25somitomere-p9c9r3 | 1304785 | 82.72 | TRUE
D2_25somitomere-p9c9r4 | 1314260 | 82.47 | TRUE
D2_25somitomere-p9c9r5 | 1235620 | 82.91 | TRUE
D2_25somitomere-p9c9r6 | 1486911 | 82.98 | TRUE
D2_25somitomere-p9c9r7 | 1338839 | 82.78 | TRUE
D2_25somitomere-p9c9r8 | 1379544 | 83.4 | TRUE
DLL1PXM-p8c10r1 | 3999241 | 75.5 | TRUE
DLL1PXM-p8c10r2 | 3788265 | 75.88 | TRUE
DLL1PXM-p8c10r3 | 2031405 | 79.22 | TRUE
DLL1PXM-p8c10r4 | 1325684 | 81.3 | TRUE
DLL1PXM-p8c10r6 | 1205931 | 82.67 | TRUE
DLL1PXM-p8c10r7 | 1366471 | 82.66 | TRUE
DLL1PXM-p8c10r8 | 1375044 | 82.09 | TRUE
DLL1PXM-p8c11r2 | 3714947 | 74.72 | TRUE
DLL1PXM-p8c11r3 | 1981586 | 79.89 | TRUE
DLL1PXM-p8c11r4 | 1319333 | 81.84 | TRUE
DLL1PXM-p8c11r5 | 1232579 | 82.76 | TRUE
DLL1PXM-p8c11r7 | 1390915 | 82.28 | TRUE
DLL1PXM-p8c11r8 | 1401231 | 82.85 | TRUE
DLL1PXM-p8c12r2 | 2980869 | 75.57 | TRUE
DLL1PXM-p8c12r3 | 1429174 | 81.76 | TRUE
DLL1PXM-p8c12r4 | 1208221 | 82.74 | TRUE
DLL1PXM-p8c12r5 | 1252460 | 81.07 | TRUE
DLL1PXM-p8c12r6 | 906562 | 82.4 | FALSE
DLL1PXM-p8c1r1 | 3526553 | 75.99 | TRUE
DLL1PXM-p8c1r2 | 3325434 | 76.19 | TRUE
DLL1PXM-p8c1r3 | 1078123 | 79.54 | TRUE
DLL1PXM-p8c1r4 | 1643660 | 80.34 | TRUE
DLL1PXM-p8c1r6 | 1399270 | 81.6 | TRUE
DLL1PXM-p8c1r7 | 1239893 | 83.18 | TRUE
DLL1PXM-p8c1r8 | 1237398 | 82.31 | TRUE
DLL1PXM-p8c2r1 | 3629648 | 76.36 | TRUE
DLL1PXM-p8c2r2 | 3978623 | 73.86 | TRUE
DLL1PXM-p8c2r3 | 1577386 | 76.8 | TRUE
DLL1PXM-p8c2r4 | 1582398 | 81.74 | TRUE
DLL1PXM-p8c2r5 | 1156589 | 82.48 | TRUE
DLL1PXM-p8c2r6 | 1361632 | 83.53 | TRUE
DLL1PXM-p8c2r7 | 1264376 | 83.2 | TRUE
DLL1PXM-p8c3r1 | 3728869 | 74.18 | TRUE
DLL1PXM-p8c3r2 | 4131313 | 74.81 | TRUE
DLL1PXM-p8c3r3 | 1310539 | 81.06 | TRUE
DLL1PXM-p8c3r4 | 2148413 | 79.67 | TRUE
DLL1PXM-p8c3r6 | 901761 | 80.09 | FALSE
DLL1PXM-p8c3r8 | 481938 | 79.15 | FALSE
DLL1PXM-p8c4r1 | 1061118 | 79.8 | TRUE
DLL1PXM-p8c4r3 | 2194501 | 79.64 | TRUE
DLL1PXM-p8c4r4 | 1314632 | 82.26 | TRUE
DLL1PXM-p8c4r5 | 1353872 | 82.44 | TRUE
DLL1PXM-p8c4r6 | 549267 | 77.18 | FALSE
DLL1PXM-p8c4r8 | 1037181 | 83.18 | TRUE
DLL1PXM-p8c5r1 | 2785354 | 72.09 | TRUE
DLL1PXM-p8c5r2 | 4124086 | 75.87 | TRUE
DLL1PXM-p8c5r3 | 1662736 | 81.18 | TRUE
DLL1PXM-p8c5r4 | 1160772 | 83.29 | TRUE
DLL1PXM-p8c5r5 | 1171809 | 82.23 | TRUE
DLL1PXM-p8c5r6 | 976060 | 82.23 | FALSE
DLL1PXM-p8c5r8 | 1018066 | 82.28 | TRUE
DLL1PXM-p8c6r1 | 2983359 | 66.13 | FALSE
DLL1PXM-p8c6r3 | 1487686 | 80.86 | TRUE
DLL1PXM-p8c6r4 | 604655 | 77.84 | FALSE
DLL1PXM-p8c6r5 | 1376485 | 82.34 | TRUE
DLL1PXM-p8c6r6 | 1211696 | 83.04 | TRUE
DLL1PXM-p8c6r7 | 1191425 | 81.53 | TRUE
DLL1PXM-p8c7r1 | 3571845 | 70.49 | TRUE
DLL1PXM-p8c7r2 | 3704175 | 71.41 | TRUE
DLL1PXM-p8c7r3 | 3163571 | 76.32 | TRUE
DLL1PXM-p8c7r4 | 1157615 | 82.54 | TRUE
DLL1PXM-p8c7r7 | 1233164 | 82.7 | TRUE
DLL1PXM-p8c7r8 | 1563606 | 81.47 | TRUE
DLL1PXM-p8c8r1 | 3776149 | 72.95 | TRUE
DLL1PXM-p8c8r2 | 3701594 | 74.95 | TRUE
DLL1PXM-p8c8r399934081.42FALSE
DLL1PXM-p8c8r5138400382.52TRUE
DLL1PXM-p8c8r6144656182.68TRUE
DLL1PXM-p8c8r7116453681.7TRUE
DLL1PXM-p8c8r8154602182.63TRUE
DLL1PXM-p8c9r2362485673.49TRUE
DLL1PXM-p8c9r4104398983.14TRUE
DLL1PXM-p8c9r5135171382.38TRUE
DLL1PXM-p8c9r6121051983.69TRUE
DLL1PXM-p8c9r8133945782.17TRUE
Earlysomite-p10c10r177481973.46FALSE
Earlysomite-p10c10r2142327480.76TRUE
Earlysomite-p10c10r3140718081.52TRUE
Earlysomite-p10c10r5160296381.25TRUE
Earlysomite-p10c10r7214541080.91TRUE
Earlysomite-p10c10r878749283.13FALSE
Earlysomite-p10c11r1153260982.03TRUE
Earlysomite-p10c11r2167351782.76TRUE
Earlysomite-p10c11r3139527080.9TRUE
Earlysomite-p10c11r496224283.59FALSE
Earlysomite-p10c11r5113753182.47TRUE
Earlysomite-p10c11r6109241879.58TRUE
Earlysomite-p10c11r76348277.21FALSE
Earlysomite-p10c12r2139172580.16TRUE
Earlysomite-p10c12r384284378.69FALSE
Earlysomite-p10c12r4137892080.65TRUE
Earlysomite-p10c12r6121724279.83TRUE
Earlysomite-p10c2r496287982.33FALSE
Earlysomite-p10c2r8144336080.14TRUE
Earlysomite-p10c3r2106747475.63TRUE
Earlysomite-p10c3r3123246281.96TRUE
Earlysomite-p10c4r1142336381.59TRUE
Earlysomite-p10c4r2134016181.07TRUE
Earlysomite-p10c4r397137582.38FALSE
Earlysomite-p10c4r4170990078.35TRUE
Earlysomite-p10c4r5134361482.21TRUE
Earlysomite-p10c4r6166251282.42TRUE
Earlysomite-p10c4r76449276.25FALSE
Earlysomite-p10c4r8112115881.28TRUE
Earlysomite-p10c5r1108245978.95TRUE
Earlysomite-p10c5r496869882.09FALSE
Earlysomite-p10c5r5171213482.37TRUE
Earlysomite-p10c5r6146236881.79TRUE
Earlysomite-p10c5r7193144680.42TRUE
Earlysomite-p10c5r8151221282.03TRUE
Earlysomite-p10c6r1139244581.23TRUE
Earlysomite-p10c6r2121862582.46TRUE
Earlysomite-p10c6r3108610582.44TRUE
Earlysomite-p10c6r4135858582.64TRUE
Earlysomite-p10c6r5118554983.27TRUE
Earlysomite-p10c6r6110033082.46TRUE
Earlysomite-p10c6r7196572181.13TRUE
Earlysomite-p10c6r8202881380.93TRUE
Earlysomite-p10c7r1146512680.68TRUE
Earlysomite-p10c9r5131468581.83TRUE
GARP-p6c10r31145477.13FALSE
GARP-p6c10r72553978.51FALSE
GARP-p6c10r82695878.53FALSE
GARP-p6c11r11176878.14FALSE
GARP-p6c11r31034977.35FALSE
GARP-p6c11r62729379.3FALSE
GARP-p6c12r11428077.94FALSE
GARP-p6c12r21276677.57FALSE
GARP-p6c12r63250178.95FALSE
GARP-p6c1r11116277.14FALSE
GARP-p6c1r2980376.78FALSE
GARP-p6c1r31157378.04FALSE
GARP-p6c1r51413578.16FALSE
GARP-p6c1r83349578.12FALSE
GARP-p6c2r11255877.71FALSE
GARP-p6c2r31159777.57FALSE
GARP-p6c2r51633678.32FALSE
GARP-p6c2r61262977.23FALSE
GARP-p6c2r72279278.73FALSE
GARP-p6c3r11513078.47FALSE
GARP-p6c3r21370677.69FALSE
GARP-p6c3r41466678.7FALSE
GARP-p6c3r51624878.41FALSE
GARP-p6c3r72984177.87FALSE
GARP-p6c3r82925478.35FALSE
GARP-p6c4r41782778.78FALSE
GARP-p6c4r51430377.57FALSE
GARP-p6c4r72581078.01FALSE
GARP-p6c4r82340177.92FALSE
GARP-p6c5r21150776.33FALSE
GARP-p6c5r31437177.13FALSE
GARP-p6c5r51450078.36FALSE
GARP-p6c5r61474977.74FALSE
GARP-p6c5r73050678.86FALSE
GARP-p6c6r31262776.86FALSE
GARP-p6c6r61710178.41FALSE
GARP-p6c6r82702878.21FALSE
GARP-p6c7r11249277.77FALSE
GARP-p6c7r21309077.52FALSE
GARP-p6c7r31055176.94FALSE
GARP-p6c7r51825978.01FALSE
GARP-p6c7r61454878.42FALSE
GARP-p6c7r72983378.89FALSE
GARP-p6c8r11106677.66FALSE
GARP-p6c8r31167176.37FALSE
GARP-p6c8r41280378.41FALSE
GARP-p6c8r61305778.01FALSE
GARP-p6c8r72622978.36FALSE
GARP-p6c9r31359777.83FALSE
GARP-p6c9r41674177.67FALSE
GARP-p6c9r51614177.49FALSE
GARP-p6c9r83213679.22FALSE
H7hESC-p7c10r11521277.81FALSE
H7hESC-p7c10r21304277.38FALSE
H7hESC-p7c10r3304017678.93TRUE
H7hESC-p7c10r414087669.78FALSE
H7hESC-p7c10r5281294779.05TRUE
H7hESC-p7c10r783129079.19FALSE
H7hESC-p7c10r8360105674.05TRUE
H7hESC-p7c11r4260718377.86TRUE
H7hESC-p7c11r6300047376TRUE
H7hESC-p7c11r7336351778.13TRUE
H7hESC-p7c12r11341077.49FALSE
H7hESC-p7c12r21203777.87FALSE
H7hESC-p7c12r312409671.79FALSE
H7hESC-p7c12r410448772.21FALSE
H7hESC-p7c12r621369250.66FALSE
H7hESC-p7c12r7333822174.7TRUE
H7hESC-p7c1r12483778.81FALSE
H7hESC-p7c1r21335277.58FALSE
H7hESC-p7c1r312992977.24FALSE
H7hESC-p7c1r4311079874.28TRUE
H7hESC-p7c1r5315598976.56TRUE
H7hESC-p7c1r6279244477.2TRUE
H7hESC-p7c1r7334449674.61TRUE
H7hESC-p7c1r8351100773.82TRUE
H7hESC-p7c2r11361878.42FALSE
H7hESC-p7c2r3219730372.9TRUE
H7hESC-p7c2r4316587179.03TRUE
H7hESC-p7c2r5341420073.2TRUE
H7hESC-p7c2r6282164875.67TRUE
H7hESC-p7c2r7262963375.39TRUE
H7hESC-p7c2r8354914375.69TRUE
H7hESC-p7c3r11610977.62FALSE
H7hESC-p7c3r3310716775.44TRUE
H7hESC-p7c3r4320079877.93TRUE
H7hESC-p7c3r5304698175.99TRUE
H7hESC-p7c3r6205218277.7TRUE
H7hESC-p7c3r7310432172.47TRUE
H7hESC-p7c4r21658077.12FALSE
H7hESC-p7c4r3300645278.27TRUE
H7hESC-p7c4r656601178.16FALSE
H7hESC-p7c4r7318368478.29TRUE
H7hESC-p7c4r8328565674.13TRUE
H7hESC-p7c5r21474777.51FALSE
H7hESC-p7c5r3356813977.64TRUE
H7hESC-p7c5r4248085979.42TRUE
H7hESC-p7c5r5238407578.8TRUE
H7hESC-p7c5r693667267.61FALSE
H7hESC-p7c5r8351947776.97TRUE
H7hESC-p7c6r11122877.48FALSE
H7hESC-p7c6r21203977.25FALSE
H7hESC-p7c6r3292649675.73TRUE
H7hESC-p7c6r4125845178.46TRUE
H7hESC-p7c6r5260614679.8TRUE
H7hESC-p7c6r6238751678.12TRUE
H7hESC-p7c6r7287684677.01TRUE
H7hESC-p7c6r8368746376.39TRUE
H7hESC-p7c7r11554877.71FALSE
H7hESC-p7c7r3319154878.21TRUE
H7hESC-p7c7r5295293678.61TRUE
H7hESC-p7c7r6305834277.42TRUE
H7hESC-p7c7r7285103477.88TRUE
H7hESC-p7c7r8323971275.89TRUE
H7hESC-p7c8r11333977.57FALSE
H7hESC-p7c8r21323477.22FALSE
H7hESC-p7c8r3310587473.86TRUE
H7hESC-p7c8r5324983776.01TRUE
H7hESC-p7c8r6317197977.12TRUE
H7hESC-p7c8r7292089674.23TRUE
H7hESC-p7c8r8387931574.84TRUE
H7hESC-p7c9r11508877.63FALSE
H7hESC-p7c9r21611179.54FALSE
H7hESC-p7c9r3208716479.75TRUE
H7hESC-p7c9r5286460878.63TRUE
H7hESC-p7c9r6222502378.87TRUE
H7hESC-p7c9r7283205478.63TRUE
H7hESC-p7c9r8338854975.62TRUE
LatM-p3c10r1184864580.26TRUE
LatM-p3c10r3171154278.22TRUE
LatM-p3c10r6122584582.13TRUE
LatM-p3c11r1203662980.18TRUE
LatM-p3c11r3185319680.2TRUE
LatM-p3c11r4162088380.38TRUE
LatM-p3c11r5175847379.52TRUE
LatM-p3c11r8182067480.73TRUE
LatM-p3c12r2178720581.83TRUE
LatM-p3c12r4147314581.97TRUE
LatM-p3c12r5172836780.65TRUE
LatM-p3c12r7175443480.59TRUE
LatM-p3c1r1177627881.49TRUE
LatM-p3c1r4179018681.03TRUE
LatM-p3c1r6158154480.21TRUE
LatM-p3c1r7171108280.03TRUE
LatM-p3c2r1178640880.57TRUE
LatM-p3c2r2188086080.72TRUE
LatM-p3c2r3176016580.96TRUE
LatM-p3c2r5175135480.27TRUE
LatM-p3c2r6162302380.2TRUE
LatM-p3c2r7185878479.49TRUE
LatM-p3c3r2144255979.79TRUE
LatM-p3c3r8170527880.16TRUE
LatM-p3c4r1182954180.95TRUE
LatM-p3c4r2193128981.29TRUE
LatM-p3c4r4188107180.69TRUE
LatM-p3c4r5165095778.83TRUE
LatM-p3c4r6166858980.08TRUE
LatM-p3c4r7189062879.95TRUE
LatM-p3c4r8195155779.25TRUE
LatM-p3c5r1193778581.57TRUE
LatM-p3c5r2185856779.88TRUE
LatM-p3c5r4184233279.91TRUE
LatM-p3c5r6138225481.5TRUE
LatM-p3c5r8184386681.03TRUE
LatM-p3c6r1186043778.79TRUE
LatM-p3c6r4181822480.47TRUE
LatM-p3c6r7201463679.86TRUE
LatM-p3c6r8192247879.64TRUE
LatM-p3c7r1177399881.46TRUE
LatM-p3c7r2186616280.69TRUE
LatM-p3c7r3183711381.17TRUE
LatM-p3c7r5162546979.99TRUE
LatM-p3c7r6161064279.42TRUE
LatM-p3c8r2185842580.39TRUE
LatM-p3c8r3190265980.99TRUE
LatM-p3c8r7208998878.8TRUE
LatM-p3c8r8185852880.44TRUE
LatM-p3c9r1168888082.07TRUE
LatM-p3c9r2179828381.47TRUE
LatM-p3c9r3188954079.88TRUE
LatM-p3c9r4171831080.47TRUE
LatM-p3c9r5155563879.93TRUE
LatM-p3c9r7198370079.74TRUE
MPS3-p5c10r42038176.7FALSE
MPS3-p5c10r51055578.21FALSE
MPS3-p5c10r61352278.05FALSE
MPS3-p5c10r71469977.16FALSE
MPS3-p5c10r81563576.64FALSE
MPS3-p5c11r42055876.27FALSE
MPS3-p5c11r51233978.61FALSE
MPS3-p5c11r61212577.2FALSE
MPS3-p5c11r71055977.66FALSE
MPS3-p5c11r81377377.09FALSE
MPS3-p5c12r1194875680.33TRUE
MPS3-p5c12r22202467.32FALSE
MPS3-p5c12r32098075.67FALSE
MPS3-p5c12r4203077180.97TRUE
MPS3-p5c12r51257778.13FALSE
MPS3-p5c12r6927076.62FALSE
MPS3-p5c1r1247745778.55TRUE
MPS3-p5c1r3231394580.07TRUE
MPS3-p5c1r4163735478.79TRUE
MPS3-p5c1r71067277.48FALSE
MPS3-p5c2r1248080879.2TRUE
MPS3-p5c2r2226396580.32TRUE
MPS3-p5c2r3220861479.38TRUE
MPS3-p5c2r4167283079.15TRUE
MPS3-p5c2r5202339880.15TRUE
MPS3-p5c2r61075077.89FALSE
MPS3-p5c2r81432776.66FALSE
MPS3-p5c3r1201278780.76TRUE
MPS3-p5c3r81236377.12FALSE
MPS3-p5c4r12557675.89FALSE
MPS3-p5c4r2188235281.27TRUE
MPS3-p5c4r52046277.17FALSE
MPS3-p5c4r71329978.95FALSE
MPS3-p5c4r81424278.48FALSE
MPS3-p5c5r1140507580.67TRUE
MPS3-p5c5r2216534679.87TRUE
MPS3-p5c5r3233267780.73TRUE
MPS3-p5c5r51076278.73FALSE
MPS3-p5c6r1233467278.86TRUE
MPS3-p5c6r32390177.09FALSE
MPS3-p5c6r41974575.9FALSE
MPS3-p5c6r51304578.17FALSE
MPS3-p5c6r71100577.97FALSE
MPS3-p5c7r1167573781.22TRUE
MPS3-p5c7r2219169279.27TRUE
MPS3-p5c7r32670376.97FALSE
MPS3-p5c7r4207737380.51TRUE
MPS3-p5c7r81404177.49FALSE
MPS3-p5c8r1182377081.26TRUE
MPS3-p5c8r32116077.06FALSE
MPS3-p5c8r4184192779.76TRUE
MPS3-p5c8r5890777.51FALSE
MPS3-p5c8r7987178.68FALSE
MPS3-p5c8r81227077.43FALSE
MPS3-p5c9r32405777.4FALSE
MPS3-p5c9r4207555480.11TRUE
MPS3-p5c9r51251777.45FALSE
MPS3-p5c9r61079478.06FALSE
MPS3-p5c9r71250678.54FALSE
MPS3-p5c9r81281677.5FALSE
Sclerotome-p2c10r1180295581.48TRUE
Sclerotome-p2c10r2172102380.74TRUE
Sclerotome-p2c10r3180031481.44TRUE
Sclerotome-p2c10r4211066579.43TRUE
Sclerotome-p2c10r5188231180.88TRUE
Sclerotome-p2c10r6163038780.23TRUE
Sclerotome-p2c10r7184578280.28TRUE
Sclerotome-p2c10r8196701680.1TRUE
Sclerotome-p2c11r1197030480.91TRUE
Sclerotome-p2c11r3177898980.41TRUE
Sclerotome-p2c11r4178280279.6TRUE
Sclerotome-p2c11r6153158380.34TRUE
Sclerotome-p2c11r7157730279.73TRUE
Sclerotome-p2c12r12459974.18FALSE
Sclerotome-p2c12r3184070080.81TRUE
Sclerotome-p2c12r5208371080.26TRUE
Sclerotome-p2c1r1133419880.66TRUE
Sclerotome-p2c1r2194378479.47TRUE
Sclerotome-p2c1r4189041480.28TRUE
Sclerotome-p2c1r5191259980.49TRUE
Sclerotome-p2c1r6179914081.08TRUE
Sclerotome-p2c1r7177735381.04TRUE
Sclerotome-p2c2r2156854481TRUE
Sclerotome-p2c2r373759981.34FALSE
Sclerotome-p2c2r6191920580.04TRUE
Sclerotome-p2c2r897314980.54FALSE
Sclerotome-p2c3r1174962481.93TRUE
Sclerotome-p2c3r2190839380.17TRUE
Sclerotome-p2c3r3188462780.5TRUE
Sclerotome-p2c3r4163573980.97TRUE
Sclerotome-p2c3r6184094181.35TRUE
Sclerotome-p2c3r7180146879.94TRUE
Sclerotome-p2c3r8185637880.14TRUE
Sclerotome-p2c4r1203302581.18TRUE
Sclerotome-p2c4r3188122980.21TRUE
Sclerotome-p2c4r4173346380.65TRUE
Sclerotome-p2c4r5183461679.73TRUE
Sclerotome-p2c4r6141485680.68TRUE
Sclerotome-p2c4r7172387979.71TRUE
Sclerotome-p2c4r8178459877.18TRUE
Sclerotome-p2c5r1190114481.28TRUE
Sclerotome-p2c5r3178593379.08TRUE
Sclerotome-p2c5r4161114077.72TRUE
Sclerotome-p2c5r6157068881.79TRUE
Sclerotome-p2c5r7198282080.76TRUE
Sclerotome-p2c6r3180958680.19TRUE
Sclerotome-p2c6r5173909878.21TRUE
Sclerotome-p2c6r6166062281.57TRUE
Sclerotome-p2c6r7177520179.93TRUE
Sclerotome-p2c6r8183551381.13TRUE
Sclerotome-p2c7r1198712081.32TRUE
Sclerotome-p2c7r2134744880.36TRUE
Sclerotome-p2c7r3164853580.24TRUE
Sclerotome-p2c7r472342880.49FALSE
Sclerotome-p2c7r5185094580.51TRUE
Sclerotome-p2c7r6168525078.77TRUE
Sclerotome-p2c8r2159242980.56TRUE
Sclerotome-p2c8r3167830680.05TRUE
Sclerotome-p2c8r4116289281.04TRUE
Sclerotome-p2c8r5190133981.21TRUE
Sclerotome-p2c8r6173147779.41TRUE
Sclerotome-p2c8r7160929778.87TRUE
Sclerotome-p2c8r8174270877.69TRUE
Sclerotome-p2c9r1184284981.26TRUE
Sclerotome-p2c9r2185815079.13TRUE
Sclerotome-p2c9r3158288381.39TRUE
Sclerotome-p2c9r4178212580.89TRUE
Sclerotome-p2c9r5181143279.69TRUE
Sclerotome-p2c9r6170909080.25TRUE
We next filtered out genes with low or undetectable expression by considering only genes for which at least 20 cells (across all 498 retained cells) showed a log2(TPM+1) value of at least 10. As with the bulk-population RNA-seq data, when comparing cell types to one another, we additionally filtered out genes whose log2(TPM+1) values did not differ by at least 2 (i.e., a 4-fold difference in expression) between the cell types with the highest and lowest expression. The raw data from the single-cell RNA-seq can be found in [Data Citation 1]. A spreadsheet of TPM values can be found in [Data Citation 2].
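The two gene filters described above can be sketched as follows. This is a minimal illustration, not the authors' code: the function name `filter_genes`, its parameter names, and the use of the per-cell-type mean of log2(TPM+1) as the expression level of a cell type are all assumptions, since the text does not state how single-cell values were summarized per cell type.

```python
import numpy as np

def filter_genes(log_tpm, cell_types, min_cells=20, min_level=10.0, min_range=2.0):
    """Return two boolean masks over genes (hypothetical helper).

    log_tpm    : genes x cells array of log2(TPM+1) values
    cell_types : one label per cell (column of log_tpm)
    """
    log_tpm = np.asarray(log_tpm, dtype=float)
    cell_types = np.asarray(cell_types)

    # Filter 1: keep genes detected at >= min_level in at least min_cells cells.
    expressed = (log_tpm >= min_level).sum(axis=1) >= min_cells

    # Filter 2: keep genes whose level differs by >= min_range between the
    # highest- and lowest-expressing cell types.
    # ASSUMPTION: a cell type's level is the mean log2(TPM+1) of its cells.
    labels = np.unique(cell_types)
    means = np.stack(
        [log_tpm[:, cell_types == label].mean(axis=1) for label in labels],
        axis=1,
    )
    variable = (means.max(axis=1) - means.min(axis=1)) >= min_range

    return expressed, expressed & variable

# Toy data: 3 genes x 4 cells, two cell types; min_cells lowered for the demo.
toy = np.array([
    [11.0, 11.0, 11.0, 11.0],  # expressed, but flat across cell types
    [12.0, 12.0, 10.0, 10.0],  # expressed and 2 log2 units apart
    [0.0, 0.0, 3.0, 3.0],      # never reaches the expression threshold
])
toy_types = ["a", "a", "b", "b"]
expressed_mask, comparison_mask = filter_genes(toy, toy_types, min_cells=2)
```

The first mask corresponds to the general expression filter; the second adds the between-cell-type variation filter used only for cross-cell-type comparisons.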

ATAC-seq

ATAC-seq was performed as described previously[9], with minor modifications. In brief, for each replicate, 50,000 cells were lysed in lysis buffer containing 0.01% IGEPAL CA-630 (Sigma, I8896) to obtain nuclei, which were used directly in the Tn5 transposition reaction (reagents from Nextera DNA Sample Preparation Kit; Illumina, FC-121-1030). Immediately following transposition, DNA fragments were purified (MinElute Kit, Qiagen) and PCR-amplified for a total of 12–13 cycles using previously designed primers that included Illumina-compatible adapters and barcodes[9]. The resulting ATAC-seq libraries were purified (MinElute Kit, Qiagen) and pooled, and final library-pool concentrations were assessed (Bioanalyzer) prior to next-generation sequencing. The quality of ATAC-seq libraries was confirmed by a shallow sequencing run on a MiSeq v3 (Stanford Functional Genomics Facility, 2×75 bp reads) before deep sequencing was performed on a NextSeq 500 (2×75 bp reads). Two biological replicates were assayed by ATAC-seq for each cell type.

We used the ATAqC pipeline[20] to process the ATAC-seq reads, starting with adapter trimming followed by alignment to hg19 (Bowtie2 (ref. 21)). Although we used hg38 for RNA-seq alignment, we opted for hg19 for ATAC-seq because a curated blacklist of artifactual regions is available for hg19 (ref. 13). We then filtered out unmapped reads, mate-unmapped reads, secondary alignments, duplicates (using Picard's MarkDuplicates[22]), multi-mapping reads (MAPQ<30), and mitochondrial reads, retaining only high-quality, properly paired reads. As the post-filtering sequencing depth varied between replicates and cell types, we subsampled each replicate to a maximum of 35 M uniquely mapping reads (post-filtering) to improve comparability between samples.

We next used MACS2 (ref. 23) to call peaks for each replicate, with a relaxed false discovery rate (FDR) threshold of 0.01, and then created a unified peak list for each cell type by selecting only peaks that were reproducible between both replicates. This was done through an irreproducible discovery rate (IDR) analysis[24], similar to what was previously described by the ENCODE Consortium[25]. In brief, the IDR method takes in peak calls from a pair of replicates, filters out all peaks that appear in only one replicate, and then uses a copula mixture model to model the remaining peaks as belonging to either a reproducible ‘signal’ population or an irreproducible ‘noise’ population. We used an IDR threshold of 0.1, i.e., we retained only peaks that were deemed to have come from the ‘signal’ population with a probability of more than 0.9 after a multiple testing correction. Finally, we filtered out all peaks that appeared in the aforementioned blacklist of artifactual regions in hg19 (https://www.encodeproject.org/annotations/ENCSR636HFF/). We note that this ATAC-seq analysis pipeline is an improved version of the one used for analysis in our related publication[8]. In particular, here we adjusted the IDR threshold, the shift size parameter for MACS2, and a multi-mapping parameter, resulting in increased sensitivity for peak detection.

To obtain a universal list of peaks across all cell types, we used BEDtools[26] to merge the lists of filtered, reproducible peaks for each cell type, resulting in a total of 166,256 peaks. For each cell type, we then pooled its two biological replicates together and called peaks (MACS2) on the pooled reads. To obtain a single measure of confidence at each peak P in the universal list for each cell type C, we took the highest −log10 P-value out of all peaks in the pooled replicates for C that intersected with P. The raw ATAC-seq data can be found in [Data Citation 1]. The peak calls can be found in [Data Citation 2].
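The construction of the universal peak list and the per-cell-type confidence score can be illustrated with a small pure-Python sketch. It is not the authors' pipeline: the text names BEDtools but not the exact command or options, so merging overlapping and book-ended intervals is an assumption, as are the helper names `merge_peaks` and `peak_confidence`, the half-open (BED-style) coordinates, and the score of 0.0 for universal peaks with no intersecting pooled peak.

```python
from collections import defaultdict

def merge_peaks(peak_lists):
    """Union of (chrom, start, end) peaks from several cell types.

    ASSUMPTION: overlapping or book-ended intervals are merged into one.
    """
    by_chrom = defaultdict(list)
    for peaks in peak_lists:
        for chrom, start, end in peaks:
            by_chrom[chrom].append((start, end))

    merged = []
    for chrom in sorted(by_chrom):
        cur_start, cur_end = None, None
        for start, end in sorted(by_chrom[chrom]):
            if cur_start is None:
                cur_start, cur_end = start, end
            elif start <= cur_end:            # overlaps or touches current run
                cur_end = max(cur_end, end)
            else:                             # gap: close the current run
                merged.append((chrom, cur_start, cur_end))
                cur_start, cur_end = start, end
        merged.append((chrom, cur_start, cur_end))
    return merged

def peak_confidence(universal, pooled_peaks, missing=0.0):
    """For each universal peak, the highest -log10 P-value among the pooled
    peaks (chrom, start, end, score) of one cell type that intersect it.

    ASSUMPTION: `missing` is returned when nothing intersects.
    """
    scores = []
    for chrom, start, end in universal:
        hits = [
            score
            for p_chrom, p_start, p_end, score in pooled_peaks
            if p_chrom == chrom and p_start < end and start < p_end
        ]
        scores.append(max(hits) if hits else missing)
    return scores

# Toy example with two cell types (coordinates and scores are made up).
cell_a = [("chr1", 100, 200), ("chr1", 500, 600)]
cell_b = [("chr1", 150, 250), ("chr2", 10, 50)]
universal = merge_peaks([cell_a, cell_b])

pooled_a = [("chr1", 90, 160, 12.5), ("chr1", 180, 260, 30.0), ("chr1", 550, 560, 4.0)]
confidence_a = peak_confidence(universal, pooled_a)
```

The quadratic scan in `peak_confidence` is chosen for readability; at the scale of the real peak list an interval-based tool or a sorted sweep would be preferable.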
ATAC-seq metadata is tabulated in Table 4 (available online only).
Table 4

ATAC-seq metadata and quality control statistics

Sample ID | Cell type | Date ATAC was performed | Date of library prep | Read count from sequencer | Read count successfully aligned | Read count after filtering for mapping quality | Read count after removing duplicate reads | Read count after removing mitochondrial reads | Non-Redundant Fraction (NRF) | PBC1 | PBC2 | Fraction of reads in NFR | NFR/mono-nuc reads | Presence of NFR peak | Presence of Mono-Nuc peak | Raw peaks | IDR peaks | TSS enrichment | Number of reads in universal DHS regions | Number of reads in promoter regions | Number of reads in enhancer regions | Number of reads in called peak regions
H7_hESC_ATAC1D0 hESC11/8/201513/8/201593261696894459436342735548401487330572400.8741730.8773917.9637260.6721455314.48815287OKOK348998721075.6775326557501297218121894200557113974
H7_hESC_ATAC2D0 hESC11/8/201513/8/201515182517814574137410314122074914035460803460.7843570.7809654.3557910.5949059883.111463078OKOK254898721075.41797104680937222271676103090496258218
APS_ATAC3D1 APS4/8/201513/8/201588401812839527976055844742312884236411400.5895250.5498981.9402110.5911699913.182374684OKOK345846591256.4546793325428052162559566524265366166
APS_ATAC4D1 APS4/8/201513/8/201566716160634820355152947244238724367107840.7881870.7786574.2784580.6473511364.050393391OKOK243565591255.61585889478853502329280101273896313436
MPS_ATAC5D1 MPS4/8/201513/8/20151291253401237080678673344760812283343691500.7371180.7275533.4167040.5942715233.19534847OKOK359273579456.995176668210077252538497909418632701
MPS_ATAC6D1 MPS4/8/201513/8/20151288020541222293419645386281570579662555160.8235560.8189315.3067970.6787162884.711491841OKOK258726579454.58023748677129542117515103073945862319
DLL1pPXm_ATAC7D2 DLL1+ PXM6/8/201513/8/201519170280818371819613241465999511940660606140.8702620.8750247.8317890.66194574.286335569OKOK2763911092507.628156666972610130702261112509710862818
DLL1pPXm_ATAC8D2 DLL1+ PXM6/8/201513/8/201514729164214078866910016709374377015481389880.8763850.8823278.3270580.6486519464.078567409OKOK2772301092506.92349180990282932770033107548129077103
D2Ltm_ATAC9D2 LatM13/8/201526/8/201595742814891610275599826938367024204662380.8096050.814175.1680490.6450428363.413532025OKOK2761758310910.032752935519359181970761173215451176
D2Ltm_ATAC10D2 LatM13/8/201526/8/201519213422017973393711307022675469447372847300.7389650.7352843.5603570.6186959623.008369035OKOK271235831099.29431008693828743039301107695099234041
D6Sclrtm_ATAC11D6 Sclerotome13/8/201526/8/2015328044572305242787181293318101400785205593680.356590.2756791.0940470.6202698393.299242828OKOK25347210059017.341390676701318311027255758669432698
D6Sclrtm_ATAC12D6 Sclerotome13/8/201526/8/20151389045581304983637729640045578855134716300.5916140.5705172.0650990.6222103623.106016386OKOK27370210059022.349780455050293237678438036507543825
ESMT_ATAC13D3 Somite10/8/201526/8/201597028570902939616333967950618923375393500.8987270.90377610.2791850.6038383632.810326908OKOK28281311631212.471821071105551841847171099933914172891
ESMT_ATAC14D3 Somite10/8/201526/8/201591342558840653145837777746807655349190480.904160.9095510.965640.638698633.207448669OKOK26995211631211.902482741034663538118731054184012560254
D3CrdcM_ATAC15D3 Cardiac10/8/201526/8/201519177872018389739112506680480344503351144740.8178960.8423316.4750160.6577944313.578035603OKOK31945818038020.140991521598886857487621393890226578992
D3CrdcM_ATAC16D3 Cardiac10/8/201526/8/20151348718861296496828778097657145039261514820.8395050.8633657.5201290.6704422273.796855645OKOK28685118038019.139673341130356340068431007080516347057
Drmmtm_ATAC19D5 Dermomyotome12/8/201526/8/201515297883614456639010853615884652575600777940.798360.8050275.2083830.7018942815.092762813OKOK263326489168.86285214883002672617001106959317189919
Drmmtm_ATAC20D5 Dermomyotome12/8/201526/8/201571382368675026655029667240140535296559140.8653410.8712567.8411010.6880260994.642126486OKOK285309489168.8735654596769259214435286810735847726
Smtmrs_ATAC21D2.25 Somitomere6/8/201526/8/20154254778764070976683100858612385352861655843320.7919690.7979974.9735260.7301574975.929894484OKOK280874835459.60268717691018612824194111787188781283
Smtmrs_ATAC22D2.25 Somitomere6/8/201526/8/2015178465724169037246129385205104336662784898980.865890.8707847.7491380.706938234.987984435OKOK277098835459.44297811190997132875813111466959079702
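The library-complexity columns in Table 4 (NRF, PBC1, PBC2) are not defined in the text. Assuming they follow the commonly used ENCODE definitions, they can be computed from the genomic positions of aligned reads as sketched below. The function name `library_complexity` and the toy positions are hypothetical, and the real pipeline may define a read "position" differently (for example, by both ends of a read pair).

```python
from collections import Counter

def library_complexity(read_positions):
    """Compute (NRF, PBC1, PBC2) under the assumed ENCODE-style definitions.

    NRF  = distinct positions / total reads
    PBC1 = positions with exactly one read / distinct positions
    PBC2 = positions with exactly one read / positions with exactly two reads
    """
    counts = Counter(read_positions)
    total_reads = sum(counts.values())
    distinct = len(counts)
    one_read = sum(1 for c in counts.values() if c == 1)
    two_reads = sum(1 for c in counts.values() if c == 2)

    nrf = distinct / total_reads
    pbc1 = one_read / distinct
    # PBC2 is undefined when no position has exactly two reads.
    pbc2 = one_read / two_reads if two_reads else float("inf")
    return nrf, pbc1, pbc2

# Toy example: 7 reads over 5 positions, two of which are hit twice.
toy_positions = [
    ("chr1", 100, "+"), ("chr1", 100, "+"),
    ("chr1", 250, "-"),
    ("chr2", 40, "+"),
    ("chr2", 90, "-"), ("chr2", 90, "-"),
    ("chr3", 7, "+"),
]
nrf, pbc1, pbc2 = library_complexity(toy_positions)
```

Under these definitions, values of NRF and PBC1 closer to 1 and larger values of PBC2 indicate a less duplicated, more complex library.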

High-throughput surface marker screening

High-throughput, antibody-based screening of surface markers expressed on various mesodermal progenitors was performed as described in our related publication[8] and is explained in further detail here. The following lineages, derived from the indicated embryonic stem cell lines, were screened using this approach: undifferentiated H7 hESCs (‘undifferentiated hESCs’), H7-derived day 2 paraxial mesoderm (‘paraxial mesoderm’), H7-derived day 3 early somite progenitors (‘early somite’), H7-derived day 5 dermomyotome (‘dermomyotome’), H7-derived day 6 sclerotome (‘sclerotome’), MIXL1-GFP reporter HES3 hESC-derived day 1 anterior primitive streak (‘primitive streak’) and finally, NKX2.5-GFP reporter HES3 hESC-derived day 3 cardiac mesoderm (‘cardiac mesoderm’). 10–70 million cells of each lineage were used in each surface-marker screen. Due to limited resources, we did not include mid primitive streak and lateral mesoderm in this screen. Prior to antibody staining, hESCs or their differentiated mesodermal progeny were dissociated by brief incubation at 37 °C in TrypLE Express (Gibco). TrypLE Express was chosen as the dissociation reagent because it has previously been shown to minimally cleave cell-surface epitopes[27]; epitope cleavage would otherwise confound surface marker screening data. After detachment, cells were washed off plates in a large excess of DMEM/F12 to neutralize the dissociation reagent, filtered to remove large cell clumps, pelleted by centrifugation, and re-suspended in approximately 30 ml of Cell Suspension Buffer (Biolegend). To conduct antibody screening, a multichannel pipette was used to plate the cell suspension into individual wells of four 96-well plates, with each well containing a distinct PE-conjugated antibody against a human cell-surface antigen, for a total of 332 unique cell-surface markers (LEGENDScreen PE-Conjugated Human Antibody Plates; Biolegend, 700001). 
Cells were stained with their respective antibodies for 30 min at 4 °C, washed twice with Cell Staining Buffer, and finally re-suspended in Cell Staining Buffer containing 1.1 μM DAPI (Biolegend) as a viability dye before analysis on an LSR Fortessa (Stanford Stem Cell Institute FACS Core). Stained cells were not fixed prior to FACS analysis. The percentage of viable (DAPI-negative) cells of each lineage that expressed each surface marker was determined by gating the PE fluorescent signal stringently, such that no more than several percent of negative control cells (unstained cells or cells stained with an isotype control antibody directed against no known cellular antigen) were regarded as positive. For analysis of surface-marker expression on MIXL1-GFP reporter HES3 hESC-derived primitive streak or NKX2.5-GFP reporter HES3 hESC-derived day 3 cardiac mesoderm, cells were pre-gated on the MIXL1-GFP+ and NKX2.5-GFP+ fractions, respectively, before analysis of PE signal intensity. Multicolor compensation was conducted to control for fluorescent bleedthrough between the PE and GFP channels. A table with the percentage of viable cells in each lineage that expressed each given surface marker can be found in [Data Citation 4]. Metadata for the surface marker screen is tabulated in Table 5 (available online only).
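The gating logic described above (set the PE gate from negative controls, then score viable cells) can be sketched as follows. This is only an illustration under stated assumptions: the text says "no more than several percent" of control cells were regarded as positive without giving a number, so the 2% default here is hypothetical, as are the function names `pe_gate` and `percent_positive` and the toy intensities. Actual gating was done on flow cytometry data rather than with this code.

```python
import numpy as np

def pe_gate(control_pe, max_control_positive=0.02):
    """PE threshold above which at most `max_control_positive` of the
    negative-control cells fall (ASSUMED value; not given in the text)."""
    control_pe = np.asarray(control_pe, dtype=float)
    return float(np.quantile(control_pe, 1.0 - max_control_positive))

def percent_positive(pe, dapi_negative, gate):
    """Percentage of viable (DAPI-negative) cells with PE strictly above the gate."""
    pe = np.asarray(pe, dtype=float)
    viable = np.asarray(dapi_negative, dtype=bool)
    if viable.sum() == 0:
        return float("nan")  # no viable cells to score
    return 100.0 * float((pe[viable] > gate).mean())

# Toy control population: PE intensities 0..99 (made-up numbers).
control = np.arange(100, dtype=float)
gate = pe_gate(control)

# Toy stained sample: four cells, one of which is DAPI-positive (non-viable).
sample_pe = [10.0, 200.0, 300.0, 50.0]
sample_viable = [True, True, False, True]
pct = percent_positive(sample_pe, sample_viable, gate=100.0)
```

For the GFP reporter lines, the same calculation would apply after restricting the cells to the pre-gated GFP+ fraction, for example by combining that mask with the viability mask.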
Table 5

Surface marker screening metadata

Well | Antigen | Antibody Clone | Antibody Type | Catalog No. (Biolegend)
LEGENDSCREEN PLATE #1
 A1Blank   
 A2CD1aHI149Mouse IgG1, κ300106
 A3CD1bSN13 (K5- 1B8)Mouse IgG1, κ329108
 A4CD1cL161Mouse IgG1, κ331506
 A5CD1d51.1Mouse IgG2b, κ350306
 A6CD2RPA-2.10Mouse IgG1, κ300208
 A7CD3HIT3aMouse IgG2a, κ300308
 A8CD4RPA-T4Mouse IgG1, κ300508
 A9CD5UCHT2Mouse IgG1, κ300608
 A10CD6BL-CD6Mouse IgG1, κ313906
 A11CD7CD7-6B7Mouse IgG2a, κ343106
 A12CD8aHIT8aMouse IgG1, κ300908
 B1CD9HI9aMouse IgG1, κ312106
 B2CD10HI10aMouse IgG1, κ312204
 B3CD11aHI111Mouse IgG1, κ301208
 B4CD11bICRF44Mouse IgG1, κ301306
 B5CD11b (activated)CBRM1/5Mouse IgG1, κ301406
 B6CD11c3.9Mouse IgG1, κ301606
 B7CD13WM15Mouse IgG1, κ301704
 B8CD14M5E2Mouse IgG2a, κ301806
 B9CD15 (SSEA-1)W6D3Mouse IgG1, κ323006
 B10CD163G8Mouse IgG1, κ302008
 B11CD18TS1/18Mouse IgG1, κ302108
 B12CD19HIB19Mouse IgG1, κ302208
 C1CD202H7Mouse IgG2b, κ302306
 C2CD21Bu32Mouse IgG1, κ354904
 C3CD22HIB22Mouse IgG1, κ302506
 C4CD23EBVCS-5Mouse IgG1, κ338508
 C5CD24ML5Mouse IgG2a, κ311106
 C6CD25BC96Mouse IgG1, κ302606
 C7CD26BA5bMouse IgG2a, κ302706
 C8CD27O323Mouse IgG1, κ302808
 C9CD28CD28.2Mouse IgG1, κ302908
 C10CD29TS2/16Mouse IgG1, κ303004
 C11CD30BY88Mouse IgG1, κ333906
 C12CD31WM59Mouse IgG1, κ303106
 D1CD32FUN-2Mouse IgG2b, κ303206
 D2CD33WM53Mouse IgG1, κ303404
 D3CD34581Mouse IgG1, κ343506
 D4CD35E11Mouse IgG1, κ333406
 D5CD365–271Mouse IgG2a, κ336206
 D6CD38HIT2Mouse IgG1, κ303506
 D7CD39A1Mouse IgG1, κ328208
 D8CD40HB14Mouse IgG1, κ313006
 D9CD41HIP8Mouse IgG1, κ303706
 D10CD42bHIP1Mouse IgG1, κ303906
 D11CD43CD43-10G7Mouse IgG1, κ343204
 D12CD44BJ18Mouse IgG1, κ338808
 E1CD45HI30Mouse IgG1, κ304008
 E2CD45RAHI100Mouse IgG2b, κ304108
 E3CD45RBMEM-55Mouse IgG2b, κ310204
 E4CD45ROUCHL1Mouse IgG2a, κ304206
 E5CD46TRA-2–10Mouse IgG1352402
 E6CD47CC2C6Mouse IgG1, κ323108
 E7CD48BJ40Mouse IgG1, κ336708
 E8CD49aTS2/7Mouse IgG1, κ328304
 E9CD49cASC-1Mouse IgG1, κ343804
 E10CD49d9F10Mouse IgG1, κ304304
 E11CD49eNKI-SAM-1Mouse IgG2b, κ328010
 E12CD49fGoH3Rat IgG2a, κ313612
 F1CD50 (ICAM-3)CBR-IC3/1Mouse IgG1, κ330005
 F2CD51NKI-M9Mouse IgG2a, κ327910
 F3CD51/6123C6Mouse IgG1, κ304406
 F4CD52HI186Mouse IgG2b, κ316006
 F5CD53HI29Mouse IgG1, κ325406
 F6CD54HA58Mouse IgG1, κ353106
 F7CD55JS11Mouse IgG1, κ311308
 F8CD56 (NCAM)HCD56Mouse IgG1, κ318306
 F9CD57HCD57Mouse IgM, κ322312
 F10CD58TS2/9Mouse IgG1, κ330905
 F11CD59p282 (H19)Mouse IgG2a, κ304708
 F12CD61VI-PL2Mouse IgG1, κ336406
 G1CD62EHAE-1fMouse IgG1, κ336008
 G2CD62LDREG-56Mouse IgG1, κ304806
 G3CD62P (P-Selectin)AK4Mouse IgG1, κ304906
 G4CD63H5C6Mouse IgG1, κ353004
 G5CD6410.1Mouse IgG1, κ305008
 G6CD66a/c/eASL-32Mouse IgG2b, κ342304
 G7CD66bG10F5Mouse IgM, κ305106
 G8CD69FN50Mouse IgG1, κ310906
 G9CD70113-16Mouse IgG1, κ355104
 G10CD71CY1G4Mouse IgG2a, κ334106
 G11CD73AD2Mouse IgG1, κ344004
 G12CD74LN2Mouse IgG1, κ326808
 H1CD79bCB3-1Mouse IgG1, κ341404
 H2CD802D10Mouse IgG1, κ305208
 H3CD815A6Mouse IgG1, κ349506
 H4CD82ASL-24Mouse IgG1, κ342104
 H5CD83HB15eMouse IgG1, κ305308
 H6CD84CD84.1.21Mouse IgG2a, κ326008
 H7CD85a (ILT5)MKT5.1Rat IgG2a, κ337704
 H8CD85d (ILT4)42D1Rat IgG2a, κ338706
 H9CD85g (ILT7)17G10.2Mouse IgG1, κ326408
 H10CD85h (ILT1)24Mouse IgG2b, κ337904
 H11CD85j (ILT2)GHI/75Mouse IgG2b, κ333708
 H12CD85k (ILT3)ZM4.1Mouse IgG1, κ333008
     
LEGENDSCREEN PLATE #2
 A1Blank   
 A2CD86IT2.2Mouse IgG2b, κ305406
 A3CD87VIM5Mouse IgG1, κ336906
 A4CD88S5/1Mouse IgG2a, κ344304
 A5CD89A59Mouse IgG1, κ354104
 A6CD90 (Thy1)5E10Mouse IgG1, κ328110
 A7CD93VIMD2Mouse IgG1, κ336108
 A8CD94DX22Mouse IgG1, κ305506
 A9CD95DX2Mouse IgG1, κ305608
 A10CD96NK92.39Mouse IgG1, κ338406
 A11CD97VIM3bMouse IgG1, κ336308
 A12CD99HCD99Mouse IgG2a, κ318008
 B1CD100A8Mouse IgG1, κ328408
 B2CD101 (BB27)BB27Mouse IgG1, κ331006
 B3CD102CBR-IC2/2Mouse IgG2a, κ328506
 B4CD103Ber-ACT8Mouse IgG1, κ350206
 B5CD10458XB4Mouse IgG2a, κ327808
 B6CD10543A3Mouse IgG1, κ323206
 B7CD106STAMouse IgG1, κ305806
 B8CD107a (LAMP-1)H4A3Mouse IgG1, κ328608
 B9CD108MEM-150Mouse IgM, κ315704
 B10CD109W7C5Mouse IgG1, κ323306
 B11CD111R1.302Mouse IgG1, κ340404
 B12CD112 (Nectin-2)TX31Mouse IgG1, κ337410
 C1CD114LMM741Mouse IgG1, κ346106
 C2CD1159-4D2-1E4Rat IgG1, κ347304
 C3CD1164H1Mouse IgG1, κ305908
 C4CD117 (c-kit)104D2Mouse IgG1, κ313204
 C5CD119 (IFN-g R α chain)GIR-208Mouse IgG1, κ308606
 C6CD122TU27Mouse IgG1, κ339006
 C7CD1236H6Mouse IgG1, κ306006
 C8CD124G077F6Mouse IgG2a, κ355004
 C9CD126 (IL-6Rα)UV4Mouse IgG1, κ352804
 C10CD127 (IL-7Rα)A019D5Mouse IgG1, κ351304
 C11CD129 (IL-9 R)AH9R7Mouse IgG2b, κ310404
C12 | CD131 | 1C1 | Mouse IgG1, κ | 306104
D1 | CD132 | TUGh4 | Rat IgG2b, κ | 338606
D2 | CD134 | Ber-ACT35 (ACT35) | Mouse IgG1, κ | 350004
D3 | CD135 | BV10A4H2 | Mouse IgG1, κ | 313306
D4 | CD137 (4-1BB) | 4B4-1 | Mouse IgG1, κ | 309804
D5 | CD137L (4-1BB Ligand) | 5F4 | Mouse IgG1, κ | 311504
D6 | CD138 | DL-101 | Mouse IgG1, κ | 352306
D7 | CD140a | 16A1 | Mouse IgG1, κ | 323506
D8 | CD140b | 18A2 | Mouse IgG1, κ | 323606
D9 | CD141 | M80 | Mouse IgG1, κ | 344104
D10 | CD143 | 5–369 | Mouse IgG1, κ | 344204
D11 | CD144 | BV9 | Mouse IgG2a, κ | 348506
D12 | CD146 | SHM-57 | Mouse IgG2a, κ | 342004
E1 | CD148 | A3 | Mouse IgG1, κ | 328708
E2 | CD150 (SLAM) | A12 (7D4) | Mouse IgG1, κ | 306308
E3 | CD152 | L3D10 | Mouse IgG1, κ | 349906
E4 | CD154 | 24–31 | Mouse IgG1, κ | 310806
E5 | CD155 (PVR) | SKII.4 | Mouse IgG1, κ | 337610
E6 | CD156c (ADAM10) | SHM14 | Mouse IgG1, κ | 352704
E7 | CD158a/h | HP-MA4 | Mouse IgG2b, κ | 339506
E8 | CD158b (KIR2DL2/L3, NKAT2) | DX27 | Mouse IgG2a, κ | 312606
E9 | CD158d | mAb 33 (33) | Mouse IgG1, κ | 347006
E10 | CD158e1 (KIR3DL1, NKB1) | DX9 | Mouse IgG1, κ | 312708
E11 | CD158f | UP-R1 | Mouse IgG1, κ | 341304
E12 | CD161 | HP-3G10 | Mouse IgG1, κ | 339904
F1 | CD162 | KPL-1 | Mouse IgG1, κ | 328806
F2 | CD163 | GHI/61 | Mouse IgG1, κ | 333606
F3 | CD164 | 67D2 | Mouse IgG1, κ | 324808
F4 | CD165 | SN2 (N6-D11) | Mouse IgG1, κ | 329010
F5 | CD166 | 3A6 | Mouse IgG1, κ | 343904
F6 | CD167a (DDR1) | 51D6 | Mouse IgG3, κ | 334006
F7 | CD169 | 7–239 | Mouse IgG1, κ | 346004
F8 | CD170 (Siglec-5) | 1A5 | Mouse IgG1, κ | 352004
F9 | CD172a (SIRPa) | SE5A5 | Mouse IgG1, κ | 323806
F10 | CD172b (SIRPb) | B4B6 | Mouse IgG1, κ | 323906
F11 | CD172g (SIRPg) | LSB2.20 | Mouse IgG1, κ | 336606
F12 | CD178 (Fas-L) | NOK-1 | Mouse IgG1, κ | 306407
G1 | CD179a | HSL96 | Mouse IgG1, κ | 347404
G2 | CD179b | HSL11 | Mouse IgG1, κ | 349804
G3 | CD180 (RP105) | MHR73-11 | Mouse IgG1, κ | 312906
G4 | CD181 (CXCR1) | 8F1/CXCR1 | Mouse IgG2b, κ | 320608
G5 | CD182 (CXCR2) | 5E8/CXCR2 | Mouse IgG1, κ | 320706
G6 | CD183 | G025H7 | Mouse IgG1, κ | 353706
G7 | CD184 (CXCR4) | 12G5 | Mouse IgG2a, κ | 306506
G8 | CD193 (CCR3) | 5E8 | Mouse IgG2b, κ | 310706
G9 | CD195 (CCR5) | T21/8 | Mouse IgG1, κ | 321406
G10 | CD196 | G034E3 | Mouse IgG2b, κ | 353410
G11 | CD197 (CCR7) | G043H7 | Mouse IgG2a, κ | 353204
G12 | CD200 (OX2) | OX-104 | Mouse IgG1, κ | 329206
H1 | CD200 R | OX-108 | Mouse IgG1, κ | 329306
H2 | CD201 (EPCR) | RCR-401 | Rat IgG1, κ | 351904
H3 | CD202b (Tie2/Tek) | 33.1 (Ab33) | Mouse IgG1, κ | 334206
H4 | CD203c (E-NPP3) | NP4D6 | Mouse IgG1, κ | 324606
H5 | CD205 (DEC-205) | HD30 | Mouse IgG1, κ | 342204
H6 | CD206 (MMR) | 15-2 | Mouse IgG1, κ | 321106
H7 | CD207 (Langerin) | 10E2 | Mouse IgG1, κ | 352204
H8 | CD209 (DC-SIGN) | 9E9A8 | Mouse IgG2a, κ | 330106
H9 | CD210 (IL-10 R) | 3F9 | Rat IgG2a, κ | 308804
H10 | CD213a2 | SHM38 | Mouse IgG1, κ | 354404
H11 | CD215 (IL-15Rα) | JM7A4 | Mouse IgG2b, κ | 330208
H12 | CD218a (IL-18Rα) | H44 | Mouse IgG1, κ | 313808
     
LEGENDSCREEN PLATE #3
A1 | Blank
A2 | CD220 | B6.220 | Mouse IgG2b, κ | 352604
A3 | CD221 (IGF-1R) | 1H7/CD221 | Mouse IgG1, κ | 351806
A4 | CD226 (DNAM-1) | 11A8 | Mouse IgG1, κ | 338306
A5 | CD229 (Ly-9) | HLy-9.1.25 | Mouse IgG1, κ | 326108
A6 | CD231 (TALLA) | SN1a (M3-3D9) | Mouse IgG1, κ | 329406
A7 | CD235ab | HIR2 | Mouse IgG2b, κ | 306604
A8 | CD243 | UIC2 | Mouse IgG2a, κ | 348606
A9 | CD244 (2B4) | C1.7 | Mouse IgG1, κ | 329508
A10 | CD245 (p220/240) | DY12 | Mouse IgG1, κ | Inquire
A11 | CD252 (OX40L) | 11C3.1 | Mouse IgG1, κ | 326308
A12 | CD253 (Trail) | RIK-2 | Mouse IgG1, κ | 308206
B1 | CD254 | MIH24 | Mouse IgG1, κ | 347504
B2 | CD255 (TWEAK) | CARL-1 | Mouse IgG3, κ | 308305
B3 | CD257 (BAFF, BLYS) | T7–241 | Mouse IgG1, κ | 318606
B4 | CD258 (LIGHT) | T5–39 | Mouse IgG2a, κ | 318706
B5 | CD261 (DR4, TRAIL-R1) | DJR1 | Mouse IgG1, κ | 307206
B6 | CD262 (DR5, TRAIL-R2) | DJR2–4 (7–8) | Mouse IgG1, κ | 307406
B7 | CD263 (DcR1, TRAIL-R3) | DJR3 | Mouse IgG1, κ | 307006
B8 | CD266 (Fn14, TWEAK Receptor) | ITEM-1 | Mouse IgG1, κ | 314004
B9 | CD267 (TACI) | 1A1 | Rat IgG2a, κ | 311906
B10 | CD268 (BAFF-R, BAFFR) | 11C1 | Mouse IgG1, κ | 316906
B11 | CD270 (HVEM) | 122 | Mouse IgG1, κ | 318806
B12 | CD271 | ME20.4 | Mouse IgG1, κ | 345106
C1 | CD273 (B7-DC, PD-L2) | 24F.10C12 | Mouse IgG2a, κ | 329606
C2 | CD274 (B7-H1, PD-L1) | 29E.2A3 | Mouse IgG2b, κ | 329706
C3 | CD275 (B7-H2, B7-RP1, ICOSL) | 9F.8A4 | Mouse IgG1, κ | 329806
C4 | CD276 | MIH42 | Mouse IgG1, κ | 351004
C5 | CD277 | BT3.1 | Mouse IgG1, κ | 342704
C6 | CD278 (ICOS) | C398.4A | Arm. Hamster IgG | 313508
C7 | CD279 (PD-1) | EH12.2H7 | Mouse IgG1, κ | 329906
C8 | CD282 (TLR2) | TL2.1 | Mouse IgG2a, κ | 309708
C9 | CD284 (TLR4) | HTA125 | Mouse IgG2a, κ | 312806
C10 | CD286 (TLR6) | TLR6.127 | Mouse IgG1, κ | 334708
C11 | CD290 | 3C10C5 | Mouse IgG1, κ | 354604
C12 | CD294 | BM16 | Rat IgG2a, κ | 350106
D1 | CD298 | LNH-94 | Mouse IgG1, κ | 341704
D2 | CD300e (IREM-2) | UP-H2 | Mouse IgG1, κ | 339704
D3 | CD300F | UP-D2 | Mouse IgG1, κ | 340604
D4 | CD301 | H037G3 | Mouse IgG2a, κ | 354704
D5 | CD303 | 201A | Mouse IgG2a, κ | 354204
D6 | CD304 | 12C2 | Mouse IgG2a, κ | 354504
D7 | CD307 | 509f6 | Mouse IgG2a, κ | 340304
D8 | CD307d (FcRL4) | 413D12 | Mouse IgG2b, κ | 340204
D9 | CD314 (NKG2D) | 1D11 | Mouse IgG1, κ | 320806
D10 | CD317 | RS38E | Mouse IgG1, κ | 348406
D11 | CD318 (CDCP1) | CUB1 | Mouse IgG2b, κ | 324006
D12 | CD319 (CRACC) | 162.1 | Mouse IgG2b, κ | 331806
E1 | CD324 (E-Cadherin) | 67A4 | Mouse IgG1, κ | 324106
E2 | CD325 | 8C11 | Mouse IgG1, κ | 350805
E3 | CD326 (Ep-CAM) | 9C4 | Mouse IgG2b, κ | 324206
E4 | CD328 (Siglec-7) | 6–434 | Mouse IgG1, κ | 339204
E5 | CD334 (FGFR4) | 4FR6D3 | Mouse IgG1, κ | 324306
E6 | CD335 (NKp46) | 9E2 | Mouse IgG1, κ | 331908
E7 | CD336 (NKp44) | P44-8 | Mouse IgG1, κ | 325108
E8 | CD337 (NKp30) | P30-15 | Mouse IgG1, κ | 325208
E9 | CD338 (ABCG2) | 5D3 | Mouse IgG2b, κ | 332008
E10 | CD340 (erbB2/HER-2) | 24D2 | Mouse IgG1, κ | 324406
E11 | CD344 (Frizzled-4) | CH3A4A7 | Mouse IgG1, κ | 326606
E12 | CD351 | TX61 | Mouse IgG1, κ | 137306
F1 | CD352 (NTB-A) | NT-7 | Mouse IgG1, κ | 317208
F2 | CD354 (TREM-1) | TREM-26 | Mouse IgG1, κ | 314906
F3 | CD355 (CRTAM) | Cr24.1 | Mouse IgG2a, κ | 339106
F4 | CD357 (GITR) | 621 | Mouse IgG1, κ | 311604
F5 | CD360 (IL-21R) | 2G1-K12 | Mouse IgG1, κ | 347806
F6 | β2-microglobulin | 2M2 | Mouse IgG1, κ | 316306
F7 | BTLA | MIH26 | Mouse IgG2a, κ | 344506
F8 | C3AR | hC3aRZ8 | Mouse IgG2b | 345804
F9 | C5L2 | 1D9-M12 | Mouse IgG2a, κ | 342404
F10 | CCR10 | 6588-5 | Arm. Hamster IgG | 341504
F11 | CLEC12A | 50C1 | Mouse IgG2a, κ | 353604
F12 | CLEC9A | 8F9 | Mouse IgG2a, κ | 353804
G1 | CX3CR1 | 2A9-1 | Rat IgG2b, κ | 341604
G2 | CXCR7 | 8F11-M16 | Mouse IgG2b, κ | 331104
G3 | δ-Opioid Receptor | DOR7D2A4 | Mouse IgG2b, κ | 327206
G4 | DLL1 | MHD1–314 | Mouse IgG1, κ | 346404
G5 | DLL4 | MHD4–46 | Mouse IgG1, κ | 346506
G6 | DR3 (TRAMP) | JD3 | Mouse IgG1, κ | 307106
G7 | EGFR | AY13 | Mouse IgG1, κ | 352904
G8 | erbB3/HER-3 | 1B4C3 | Mouse IgG2a, κ | 324706
G9 | FcεRIα | AER-37 (CRA-1) | Mouse IgG2b, κ | 334610
G10 | FcRL6 | 2H3 | Mouse IgG2b, κ | Inquire
G11 | Galectin-9 | 9M1-3 | Mouse IgG1, κ | 348906
G12 | GARP (LRRC32) | 7B11 | Mouse IgG2b, κ | 352504
H1 | HLA-A,B,C | W6/32 | Mouse IgG2a, κ | 311406
H2 | HLA-A2 | BB7.2 | Mouse IgG2b, κ | 343306
H3 | HLA-DQ | HLADQ1 | Mouse IgG1, κ | 318106
H4 | HLA-DR | L243 | Mouse IgG2a, κ | 307606
H5 | HLA-E | 3D12 | Mouse IgG1, κ | 342604
H6 | HLA-G | 87G | Mouse IgG2a, κ | 335906
H7 | IFN-g R b chain | 2HUB-159 | Hamster IgG | 308504
H8 | Ig light chain k | MHK-49 | Mouse IgG1, κ | 316508
H9 | Ig light chain λ | MHL-38 | Mouse IgG2a, κ | 316608
H10 | IgD | IA6-2 | Mouse IgG2a, κ | 348204
H11 | IgM | MHM-88 | Mouse IgG1, κ | 314508
H12 | IL-28RA | MHLICR2a | Mouse IgG2a, κ | 337804
     
LEGENDSCREEN PLATE #4
A1 | Blank
A2 | Integrin α9β1 | Y9A2 | Mouse IgG1, κ | 351606
A3 | integrin β5 | AST-3T | Mouse IgG2a, κ | 345204
A4 | integrin β7 | FIB504 | Rat IgG2a, κ | 321204
A5 | Jagged 2 | MHJ2–523 | Mouse IgG1, κ | 346904
A6 | LAP | TW4-6H10 | Mouse IgG1, κ | 349704
A7 | Lymphotoxin b Receptor (LT-bR) | 31G4D8 | Mouse IgG2b, κ | 322008
A8 | Mac-2 (Galectin-3) | Gal397 | Mouse IgG1, κ | 126705
A9 | MAIR-II | TX45 | Mouse IgG1, κ | 334804
A10 | MICA/MICB | 6D4 | Mouse IgG2a, κ | 320906
A11 | MSC (W3D5) | W3D5 | Mouse IgG2a, κ | 327506
A12 | MSC (W5C5) | W5C5 | Mouse IgG1, κ | 327406
B1 | MSC (W7C6) | W7C6 | Mouse IgG1, κ | 327606
B2 | MSC and NPC (W4A5) | W4A5 | Mouse IgG1, κ | 330806
B3 | MSCA-1 (MSC, W8B2) | W8B2 | Mouse IgG1, κ | 327306
B4 | NKp80 | 5D12 | Mouse IgG1, κ | 346706
B5 | Notch 1 | MHN1–519 | Mouse IgG1, κ | 352106
B6 | Notch 2 | MHN2–25 | Mouse IgG2a, κ | 348304
B7 | Notch 3 | MHN3–21 | Mouse IgG1, κ | 345406
B8 | Notch 4 | MHN4-2 | Mouse IgG1, κ | 349004
B9 | NPC (57D2) | 57D2 | Mouse IgG1, κ | 327706
B10 | Podoplanin | NC-08 | Rat IgG2a, λ | 337004
B11 | Pre-BCR | HSL2 | Mouse IgG1, κ | 347904
B12 | PSMA | LNI-17 | Mouse IgG1, κ | 342504
C1 | Siglec-10 | 5G6 | Mouse IgG1, κ | 347604
C2 | Siglec-8 | 7C9 | Mouse IgG1, κ | 347104
C3 | Siglec-9 | K8 | Mouse IgG1, κ | 351504
C4 | SSEA-1 | MC-480 | Mouse IgM, κ | 125606
C5 | SSEA-3 | MC-631 | Rat IgM, κ | 330312
C6 | SSEA-4 | MC-813-70 | Mouse IgG3, κ | 330406
C7 | SSEA-5 | 8e11 | Mouse IgG1, κ | 355204
C8 | TCR g/d | B1 | Mouse IgG1, κ | 331210
C9 | TCR Vβ13.2 | H132 | Mouse IgG1, κ | 333108
C10 | TCR Vβ23 | αHUT7 | Mouse IgG1, κ | 349406
C11 | TCR Vβ8 | JR2 (JR.2) | Mouse IgG2b, κ | 348104
C12 | TCR Vβ9 | MKB1 | Mouse IgG2b, κ | 349204
D1 | TCR Vδ2 | B6 | Mouse IgG1, κ | 331408
D2 | TCR Vg9 | B3 | Mouse IgG1, κ | 331308
D3 | TCR Vα24-Jα18 | 6B11 | Mouse IgG1, κ | 342904
D4 | TCR Vα7.2 | 3C10 | Mouse IgG1, κ | 351706
D5 | TCR α/β | IP26 | Mouse IgG1, κ | 306708
D6 | Tim-1 | 1D12 | Mouse IgG1, κ | 353904
D7 | Tim-3 | F38-2E2 | Mouse IgG1, κ | 345006
D8 | Tim-4 | 9F4 | Mouse IgG1, κ | 354004
D9 | TLT-2 | MIH61 | Mouse IgG1, κ | 351104
D10 | TRA-1-60-R | TRA-1-60-R | Mouse IgM, κ | 330610
D11 | TRA-1–81 | TRA-1–81 | Mouse IgM, κ | 330708
D12 | TSLPR (TSLP-R) | 1B4 | Mouse IgG1, κ | 322806
E1 | Ms IgG1, κ ITCL | MOPC-21 | Mouse IgG1, κ | 400112
E2 | Ms IgG2a, κ ITCL | MOPC-173 | Mouse IgG2a, κ | 400212
E3 | Ms IgG2b, κ ITCL | MPC-11 | Mouse IgG2b, κ | 400314
E4 | Ms IgG3, κ ITCL | MG3–35 | Mouse IgG3, κ | 401320
E5 | Ms IgM, κ ITCL | MM-30 | Mouse IgM, κ | 401609
E6 | Rat IgG1, κ ITCL | RTK2071 | Rat IgG1, κ | 400408
E7 | Rat IgG2a, κ ITCL | RTK2758 | Rat IgG2a, κ | 400508
E8 | Rat IgG2b, κ ITCL | RTK4530 | Rat IgG2b, κ | 400636
E9 | Rat IgM, κ ITCL | RTK2118 | Rat IgM, κ | 400808
E10 | AH IgG, ITCL | HTK888 | Arm. Hamster IgG | 400907
E11 | Blank
E12 | Blank
F1 | Blank
F2 | Blank
F3 | Blank
F4 | Blank
F5 | Blank
F6 | Blank
F7 | Blank
F8 | Blank
F9 | Blank
F10 | Blank
F11 | Blank
F12 | Blank
G1 | Blank
G2 | Blank
G3 | Blank
G4 | Blank
G5 | Blank
G6 | Blank
G7 | Blank
G8 | Blank
G9 | Blank
G10 | Blank
G11 | Blank
G12 | Blank
H1 | Blank
H2 | Blank
H3 | Blank
H4 | Blank
H5 | Blank
H6 | Blank
H7 | Blank
H8 | Blank
H9 | Blank
H10 | Blank
H11 | Blank
H12 | Blank

Code availability

All custom code used in this work is available at https://github.com/kundajelab/mesoderm. This includes R Markdown files that reproduce the figures in this paper. For RNA-seq processing and quantification, we used STAR 2.4 (ref. 14), RSEM 1.2.21 (ref. 15), and Skewer 0.1.127 (ref. 12). The full parameter settings for STAR and RSEM can be found in STAR_RSEM.sh and STAR_RSEM_prep.py in the GitHub repository above.

For bulk-population RNA-seq read processing, we used the following parameters for Skewer:

-x AGATCGGAAGAGCACACGTCTGAACTCCAGTCACNNNNNNATCTCGTATGCCGTCTTCTGCTTG -y AGATCGGAAGAGCGTCGTGTAGGGAAAGAGTGTAGATCTCGGTGGTCGCCGTATCATT -t 16 -q 21 -l 21 -n -u -f sanger

For single-cell RNA-seq read processing, we used the following parameters for Skewer:

-x CTGTCTCTTATACACATCTCCGAGCCCACGAGACNNNNNNNNATCTCGTATGCCGTCTTCTGCTTG -y CTGTCTCTTATACACATCTGACGCTGCCGACGANNNNNNNNGTGTAGATCTCGGTGGTCGCCGTATCATT -t 16 -q 21 -l 21 -n -u -f sanger

For ATAC-seq processing, we used commit 9077b9... of the ATAqC pipeline[20]. In turn, this used MACS2 2.1.0 (ref. 23) and Bowtie2 2.2.6 (ref. 21).
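For readers who script their own preprocessing, the bulk-population Skewer settings listed above can be assembled into a command line programmatically. The following is only a sketch: the FASTQ file names are hypothetical, placing the two read files after the options is an assumption about how Skewer is invoked, and the authoritative settings remain the scripts in the repository.

```python
import shlex

# Adapter sequences copied from the bulk-population settings quoted above.
BULK_ADAPTER_X = "AGATCGGAAGAGCACACGTCTGAACTCCAGTCACNNNNNNATCTCGTATGCCGTCTTCTGCTTG"
BULK_ADAPTER_Y = "AGATCGGAAGAGCGTCGTGTAGGGAAAGAGTGTAGATCTCGGTGGTCGCCGTATCATT"


def skewer_command(read1, read2, adapter_x=BULK_ADAPTER_X,
                   adapter_y=BULK_ADAPTER_Y, threads=16):
    """Build (but do not run) a Skewer argument list for one paired-end sample.

    read1 and read2 are hypothetical FASTQ paths; appending them after the
    options is an assumption, not something stated in the text.
    """
    return [
        "skewer",
        "-x", adapter_x,
        "-y", adapter_y,
        "-t", str(threads),
        "-q", "21",
        "-l", "21",
        "-n", "-u",
        "-f", "sanger",
        read1, read2,
    ]


if __name__ == "__main__":
    # Print a shell-quoted command for inspection; nothing is executed.
    print(shlex.join(skewer_command("sample_R1.fastq", "sample_R2.fastq")))
```

Passing the single-cell adapter pair as adapter_x and adapter_y would give the corresponding single-cell command.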

Differentiation

Human pluripotent stem cell culture

H7, MIXL1-GFP HES3, NKX2.5-GFP HES3, SOX17-mCherry H9, pCAG-GFP H7, EF1A-BCL2-2A-GFP H9 and UBC-Luciferase-2A-tdTomato; EF1A-BCL2-2A-GFP H9 hESCs and BJC1 hiPSCs were routinely propagated feeder-free in mTeSR1 medium (StemCell Technologies)+1% penicillin/streptomycin (Gibco) on cell culture plastics coated with Geltrex basement membrane matrix (Gibco). Undifferentiated human pluripotent stem cells (hPSCs) were maintained at high quality with particular care to avoid any spontaneous differentiation, which would confound downstream differentiation. Unless otherwise indicated, the majority of experiments performed in this study were conducted using H7 hESCs, including all bulk-population RNA-seq, single-cell RNA-seq, and ATAC-seq experiments.

Directed differentiation in defined medium

Partially-confluent wells of undifferentiated hPSCs were dissociated into very fine clumps using Accutase (Gibco) and sparsely passaged 1:12–1:20 onto new Geltrex-coated cell culture plates in mTeSR1 supplemented with 1 μM thiazovivin (Tocris; a ROCK inhibitor to prevent cell death after dissociation) overnight. Seeding hPSCs sparsely prior to differentiation was critical to prevent cellular overgrowth during differentiation, especially during long-duration differentiation. The following morning, they were briefly washed (in DMEM/F12) before the addition of differentiation medium. All differentiation was conducted in serum-free, feeder-free and monolayer conditions in chemically-defined CDM2 basal medium. The composition of CDM2 basal medium[28] was as follows: 50% IMDM (+GlutaMAX, +HEPES, +Sodium Bicarbonate; Gibco, 31980-097) + 50% F12 (+GlutaMAX; Gibco, 31765-092) + 1 mg ml−1 polyvinyl alcohol (Sigma, P8136-250G) + 1% v/v concentrated lipids (Gibco, 11905-031) + 450 μM monothioglycerol (Sigma, M6145) + 0.7 μg ml−1 insulin (Roche, 1376497) + 15 μg ml−1 transferrin (Roche, 652202) + 1% v/v penicillin/streptomycin (Gibco). Polyvinyl alcohol was brought into solution by gentle warming and magnetic stirring in IMDM/F12 media before the other culture supplements were added.
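As a worked illustration of the recipe above, the sketch below scales the supplements that are given per unit volume to an arbitrary batch size. It assumes that the v/v percentages refer to the final batch volume, which the text does not state, and it omits monothioglycerol because converting 450 μM to a volume requires stock properties that are not given here.

```python
# Per-millilitre amounts of the mass-based CDM2 supplements, taken from the text.
CDM2_MASS_PER_ML = {
    "polyvinyl alcohol": (1.0, "mg"),
    "insulin": (0.7, "ug"),
    "transferrin": (15.0, "ug"),
}

# Supplements specified as % v/v (assumed here to be relative to final volume).
CDM2_VOLUME_PERCENT = {
    "concentrated lipids": 1.0,
    "penicillin/streptomycin": 1.0,
}


def cdm2_batch(volume_ml):
    """Return {component: (amount, unit)} for a batch of the given volume."""
    if volume_ml <= 0:
        raise ValueError("volume_ml must be positive")
    amounts = {}
    for name, (per_ml, unit) in CDM2_MASS_PER_ML.items():
        amounts[name] = (per_ml * volume_ml, unit)
    for name, percent in CDM2_VOLUME_PERCENT.items():
        amounts[name] = (volume_ml * percent / 100.0, "ml")
    return amounts


if __name__ == "__main__":
    for component, (amount, unit) in cdm2_batch(500).items():
        print(f"{component}: {amount:g} {unit}")
```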

Primitive streak induction

As previously described[8], after overnight plating, hPSCs were briefly washed (with DMEM/F12) and then differentiated into either anterior primitive streak (30 ng ml−1 Activin A + 4 μM CHIR99021 + 20 ng ml−1 FGF2 + 100 nM PIK90; for subsequent paraxial mesoderm induction) or mid primitive streak (30 ng ml−1 Activin A + 40 ng ml−1 BMP4 + 6 μM CHIR99021 + 20 ng ml−1 FGF2 + 100 nM PIK90; for subsequent cardiac mesoderm induction) for 24 h. Though both types of primitive streak broadly expressed pan-primitive streak markers (e.g., MIXL1 and BRACHYURY), anterior and mid primitive streak lineages were distinguished by expression of distinct region-specific markers and differing developmental competence to develop into downstream lineages[8]. Subsequently, day 1 anterior primitive streak was briefly washed (DMEM/F12) and differentiated towards day 2 paraxial mesoderm for 24 h (1 μM A-83-01 + 3 μM CHIR99021 + 250 nM LDN-193189 [DM3189] + 20 ng ml−1 FGF2). Separately, day 1 mid primitive streak was differentiated towards day 2 lateral mesoderm for 24 h (1 μM A-83-01 + 30 ng ml−1 BMP4 + 1 μM C59; with 2 μM SB-505124 sometimes used instead of A-83-01)[8].

Paraxial mesoderm downstream differentiation

Day 2 paraxial mesoderm was briefly washed (DMEM/F12) and further differentiated into day 3 early somite precursors for 24 h (1 μM A-83-01 + 250 nM LDN-193189 + 1 μM C59 + 500 nM PD0325901). Subsequently, day 3 early somites were dorsoventrally patterned into either ventral somites/sclerotome (5 nM 21K + 1 μM C59) or dorsal somites/dermomyotome (3 μM CHIR99021 + 150 nM Vismodegib). Sclerotome induction was conducted for 48–72 h (leading to day 5–6 ventral somite progenitors). Dermomyotome was sometimes induced in the presence of 50 ng ml−1 BMP4, which upregulated PAX7 after 48 h of BMP4+CHIR99021+Vismodegib differentiation (leading to day 5 dermomyotome progenitors)[8]. Medium was changed every 24 h for all steps. The small-molecule Hedgehog agonist 21K[29] was commercially synthesized.

Lateral/cardiac downstream differentiation

Day 2 lateral mesoderm was differentiated into day 4 cardiac mesoderm by treating it with 1 μM A-83-01 + 30 ng ml−1 BMP4 + 1 μM C59 + 20 ng ml−1 FGF2 for 48 h, or alternatively, with 1 μM A-83-01 + 30 ng ml−1 BMP4 + 20 ng ml−1 FGF2 for 24 h followed by 25 ng ml−1 Activin + 30 ng ml−1 BMP4 + 1 μM C59 for the next 24 h. Subsequently, day 4 cardiac mesoderm was briefly washed (DMEM/F12) and treated with 30 ng ml−1 BMP4 + 1 μM XAV939 + 200 μg ml−1 2-phospho-ascorbic acid (Sigma) for 48–96 h to yield day 6–8 cardiomyocyte-containing populations. Spontaneously contracting cardiomyocyte foci were evident from day 8 onwards[8].

Data Records

The raw RNA-seq data (bulk-population and single-cell) and ATAC-seq data can be found at SRA under BioProject PRJNA319573 (accession number SRP073808) [Data Citation 1]. Reproducible peak calls on our ATAC-seq data, as well as transcript per million (TPM) values for each gene and sample in our bulk-population and single-cell RNA-seq data, can be found at GEO under accession number GSE85066 [Data Citation 2]. Bulk-population RNA-seq metadata and mapping statistics can be found in Table 2, while single-cell RNA-seq mapping statistics are in Table 3 (available online only). ATAC-seq metadata can be found in Table 4 (available online only). For ease of use, the collated bulk-population RNA-seq data can be viewed at http://cs.stanford.edu/~zhenghao/mesoderm_gene_atlas. As described above, an augmented spreadsheet with TPM values for each gene (for bulk-population RNA-seq data) and additional annotations about whether each gene corresponds to a potential cell surface marker and whether the gene was differentially expressed between conditions can be found on Figshare with DOI 10.6084/m9.figshare.3842835 [Data Citation 3]. Processed surface marker data (a table with the percentage of cells expressing each marker in each cell type) can be found on Figshare with DOI 10.6084/m9.figshare.3505817 [Data Citation 4]. Surface marker screening metadata is in Table 5 (available online only). The full set of ATAC-seq quality control graphs for all of our samples can be found on Figshare with DOI 10.6084/m9.figshare.3507167 [Data Citation 5].

Technical Validation

As mentioned above (see Methods), we only analyzed samples with at least 10,000,000 uniquely mapping reads and with at least 50% of reads uniquely mapping. On average, each sample had 45 M uniquely mapping reads with 69% of reads uniquely mapping; full numbers and percentages are in Table 2. We used FastQC[30] to measure the per-base sequence quality for each of our bulk-population RNA-seq experiments. All of the samples passed this quality check (i.e., for each base, the distribution of quality scores had a lower quartile of more than 10 and a median of more than 25). We show a representative FastQC plot (of the first sample we assayed) in Fig. 2a.
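The inclusion rule stated above (at least 10,000,000 uniquely mapping reads and at least 50% of reads uniquely mapping) is simple to express in code. The sketch below is illustrative; the record type, field names and sample names are hypothetical, and the counts would in practice come from the aligner's mapping summary.

```python
from dataclasses import dataclass


@dataclass
class MappingStats:
    sample: str                 # hypothetical sample label
    total_reads: int            # reads submitted to the aligner
    uniquely_mapped_reads: int  # reads with exactly one alignment


def passes_qc(stats, min_unique_reads=10_000_000, min_unique_fraction=0.50):
    """Apply the two thresholds quoted in the text for bulk-population samples."""
    if stats.total_reads <= 0:
        return False
    fraction = stats.uniquely_mapped_reads / stats.total_reads
    return (stats.uniquely_mapped_reads >= min_unique_reads
            and fraction >= min_unique_fraction)


if __name__ == "__main__":
    # Made-up numbers, for illustration only.
    examples = [
        MappingStats("example_pass", 60_000_000, 42_000_000),
        MappingStats("example_low_depth", 12_000_000, 9_000_000),
        MappingStats("example_low_fraction", 40_000_000, 15_000_000),
    ]
    for s in examples:
        print(s.sample, passes_qc(s))
```

The same function covers the single-cell thresholds given later in this section by calling it with min_unique_reads=1_000_000 and min_unique_fraction=0.70.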
Figure 2

RNA-seq data quality and visualization.

(a). Bulk-population RNA-seq FastQC quality scores across read position, shown for a representative sample (D0 hESC). (b). PCA plot of bulk-population RNA-seq data, based on the top 500 genes by variance across all samples, and using log2 TPM values. (c). Single-cell RNA-seq FastQC quality scores across read position, shown for a representative sample (D2.25 somitomeres). (d). PCA plot of single-cell RNA-seq data, based on the top 500 genes by variance across all cells, and using log2 TPM values. (e). Black: Plot of density against standardized log2 TPM for single-cell RNA-seq data across all genes in all cells, after removing zeroes. Red: Fitted normal distribution. (f). Plot of single-cell variability (s.d.) against mean expression value for each gene, shown for a representative cell type (paraxial mesoderm).
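Panels b and d rest on selecting the 500 genes with the highest variance in log2 TPM space before running PCA. The gene-selection step can be sketched as follows. The pseudocount added before the logarithm is an assumption (the caption does not say how zero TPM values were handled), and the function name and input layout are illustrative rather than taken from the authors' R Markdown scripts.

```python
import math
import statistics


def top_variable_genes(tpm_by_gene, n_genes=500, pseudocount=1.0):
    """Rank genes by the variance of log2(TPM + pseudocount) across samples.

    tpm_by_gene maps a gene name to its list of TPM values, one per sample.
    Returns up to n_genes gene names, most variable first.
    """
    variances = {}
    for gene, values in tpm_by_gene.items():
        log_values = [math.log2(v + pseudocount) for v in values]
        variances[gene] = statistics.pvariance(log_values)
    ranked = sorted(variances, key=lambda g: variances[g], reverse=True)
    return ranked[:n_genes]


if __name__ == "__main__":
    # Toy values, not real expression data.
    toy = {
        "flat_gene": [5.0, 5.0, 5.0],
        "variable_gene": [0.0, 10.0, 1000.0],
        "mild_gene": [4.0, 5.0, 6.0],
    }
    print(top_variable_genes(toy, n_genes=2))
```

The selected genes would then be passed to any standard PCA implementation.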

We also used principal component analysis (PCA) to visually inspect how the samples were distributed in log2(TPM) space. Applying PCA to the 500 genes with highest variance across all samples revealed the presence of batch effects. After correcting for batch effects (see Methods), the PCA plot showed tight clustering (Fig. 2b) among samples and implicitly suggested the developmental trajectory of the cells, starting from human embryonic stem cells in the bottom left and moving upwards towards cardiac mesoderm and somites and their derivatives. An R Markdown script to reproduce Fig. 2b is provided in bulkDataViz.Rmd in our GitHub repository.

Lastly, in our related publication[8], we independently validated our RNA-seq results by qPCR. Specifically, we conducted qPCR to measure the mRNA expression levels of key genes known to be lineage markers for the various cell types in our study (e.g., TBX6 and MSGN1 for paraxial mesoderm; PARAXIS, MEOX1, and FOXC2 in the somites). These qPCR expression patterns corroborated our RNA-seq results[8].

Before sequencing, we used an automated microscope to image each of the cell-capture wells on our Fluidigm C1 chips and manually inspected each image; for subsequent single-cell RNA-seq library construction we only used libraries from wells that contained exactly one cell. After sequencing, we filtered out cells with fewer than 1 million uniquely mapping reads or with fewer than 70% of reads uniquely mapping. Unfortunately, under these stringent selection criteria, all cardiac mesoderm cell RNA-seq libraries were discarded; we ultimately retained 498 single cells out of 651. Full statistics of the cells are provided in Table 3 (available online only). As with the bulk-population RNA-seq data, we used FastQC[30] to check the per-base sequence quality of each experiment. All of the cells passed this quality check (i.e., for each base, the distribution of quality scores had a lower quartile of more than 10 and a median of more than 25), with a representative FastQC plot in Fig. 2c.

To visualize the distribution of single cells, we once again used PCA on the 500 genes with highest variance (Fig. 2d) in log2(TPM) space. As expected, the single-cell RNA-seq libraries separated by cell type, with cell types that are closer to each other biologically (and temporally) tending to cluster together.

We note that each cell type was loaded onto a different Fluidigm C1 chip, and due to resource constraints we were only able to use one chip per cell type. This means that cell type is perfectly confounded with chip in our single-cell RNA-seq experiments, and in particular, we cannot tell from the PCA the degree to which batch/chip effects are responsible for the observed separation between cell types. To tackle this problem, for each cell type, we measured the overall Pearson correlation between the average expression in the single cells and the corresponding average expression in the bulk-population RNA-seq experiments, all in log2 TPM units. On average, correlation was 0.82, varying from 0.76 to 0.87 depending on cell type. To ensure that this behavior was not driven solely by housekeeping genes, we looked at key marker genes expressed across our cell types (e.g., MIXL1 and BRACHYURY in primitive streak; MSGN1 and DLL3 in paraxial mesoderm; HAND1 and FOXF1 in lateral mesoderm; HOPX in somitomeres; FOXC2 and PAX9 in sclerotome). Single-cell RNA-seq expression patterns of these archetypic marker genes were consistent with independent measures from bulk-population RNA-seq, qPCR, flow cytometry, and immunostaining (data in ref. 8).

As technical checks, we also examined the distribution of TPM values across all genes and cells. This followed a roughly log-normal distribution (Fig. 2e) after removing zeros, as expected. Finally, for each cell type, we plotted the standard deviation of each gene against its mean expression value (shown for paraxial mesoderm in Fig. 2f), obtaining for each cell type an expected curve where standard deviation is lowest when average expression is very low (because the expression of the gene in each cell is close to zero) or very high (because high expression translates into a large number of reads, allowing us to reduce technical variation from sampling error). The script to reproduce Fig. 2d–f is provided in scDataViz.Rmd. The correlation between average expression in single cells and the bulk population can be analyzed by running scAverageCorrelation.r.

Through the ATAqC pipeline[20], we calculated a variety of quality metrics to validate our ATAC-seq data. First, we looked at how many reads remained in each replicate after removing reads that did not successfully align, multi-mapping reads, duplicate reads, and mitochondrial reads. We had two replicates per cell type, and on average, each replicate had 46 M reads remaining, enough to robustly call peaks. We then looked at the fragment length distribution of the remaining reads; we show a representative plot from lateral mesoderm in Fig. 3a. A ‘good’ ATAC-seq experiment will have a majority of reads falling in the nucleosome-free region (NFR), with a mono-nucleosomal peak representing reads that cut on both sides of a nucleosome (≈200 bp in length). All of our samples displayed a mono-nucleosome peak, with 60–70% of reads falling in the NFR.
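Returning to the single-cell validation described earlier in this section, the comparison between averaged single-cell expression and bulk-population expression can be sketched as below. This is not the authors' scAverageCorrelation.r. It is a minimal illustration that assumes the single-cell average is the mean of per-cell log2(TPM + 1) values (the text only says the comparison was made in log2 TPM units), and the input layout is hypothetical.

```python
import math


def log2_tpm(value, pseudocount=1.0):
    """log2 transform with a pseudocount (the pseudocount is an assumption)."""
    return math.log2(value + pseudocount)


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


def single_cell_vs_bulk(single_cell_tpm, bulk_tpm):
    """Correlate mean single-cell log2 TPM with bulk log2 TPM for one cell type.

    single_cell_tpm: list of cells, each a list of TPM values per gene.
    bulk_tpm: list of bulk TPM values for the same genes, in the same order.
    """
    n_cells = len(single_cell_tpm)
    if n_cells == 0:
        raise ValueError("no single cells provided")
    mean_sc = [
        sum(log2_tpm(cell[g]) for cell in single_cell_tpm) / n_cells
        for g in range(len(bulk_tpm))
    ]
    bulk = [log2_tpm(v) for v in bulk_tpm]
    return pearson(mean_sc, bulk)
```

Restricting the gene list to lineage markers before calling single_cell_vs_bulk would mirror the marker-gene check described in the text.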
Figure 3

ATAC-seq data quality metrics.

(a). Enrichment of ATAC-seq signal around transcription start sites (TSS), shown for a representative sample (lateral mesoderm). Top: enrichment around individual TSSs. Bottom: aggregated enrichment around all TSSs. (b). Fragment length distribution of ATAC-seq reads from a representative sample (lateral mesoderm). Most of the reads fall into the nucleosome-free region (<150 bp) and a clear mono-nucleosome peak can be seen. (c). Irreproducible discovery rate (IDR) analysis of ATAC-seq peaks from lateral mesoderm. The scatter plot shows one point for every peak, with its location representing its rank in each replicate. For downstream analysis, we only consider peaks shown in black (reproducible at an IDR of 0.1), which have ranks that are consistent between replicates.
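The nucleosome-free fraction referred to in panel b can be computed directly from a list of fragment lengths. The sketch below uses the <150 bp cutoff given in the caption; the function name is illustrative and the ATAqC pipeline may define its own metric differently.

```python
def nfr_fraction(fragment_lengths, nfr_max_bp=150):
    """Fraction of fragments shorter than nfr_max_bp (nucleosome-free region)."""
    lengths = list(fragment_lengths)
    if not lengths:
        raise ValueError("no fragment lengths provided")
    in_nfr = sum(1 for length in lengths if length < nfr_max_bp)
    return in_nfr / len(lengths)


if __name__ == "__main__":
    # Toy fragment lengths in bp, not real data.
    print(nfr_fraction([50, 80, 120, 200, 210]))
```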

We also studied the enrichment of reads falling into transcription start sites (TSS), as TSS are known to be open chromatin sites (Fig. 3b; lateral mesoderm). On average, the enrichment of reads at TSS was 10.4x, with a range from 4.6x to 22.3x. Next, we looked at the number of peaks called across each replicate. Because of the variability in quality across the experiments (e.g., some experiments had a higher TSS enrichment and/or more reads), we first subsampled each replicate to have a maximum of 35 M reads (post-filtering). We then used MACS2 (ref. 23) to call peaks on each replicate independently, before using an IDR analysis[24] to identify peaks that were reproducible between the two replicates for each cell type (Fig. 3c). Using an IDR threshold of 0.1, we found an average of 91 K reproducible peaks per cell type. Full statistics and metadata for each replicate are provided in Table 4 (available online only), including additional quality metrics such as library complexity metrics, the fraction of NFR to mono-nucleosome reads, and the number of reads falling in universal DNase-I hypersensitive regions, promoter regions, enhancer regions, and called peak regions. To compute these, we used putative promoter, enhancer, and DHS annotations from 127 cell types and tissues from the Roadmap Epigenomics Project. These annotations are provided in the flagship Roadmap Epigenomics Project publication[31] and are available from the supplementary website http://compbio.mit.edu/roadmap in the ‘DNase-I accessible regulatory regions’ section. In brief, DNase-seq based chromatin accessible regions were labeled as promoter or enhancer based on chromatin state maps learned using 5 core histone modifications across the 127 cell types and tissues. The graphs in Fig. 3 were taken from the output of the ATAqC pipeline for one representative sample (lateral mesoderm). The full set of graphs for all of our samples can be found in [Data Citation 5].
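The subsampling step described above (capping each replicate at 35 M post-filtering reads before peak calling) can be illustrated with a small, reproducible sampler. This is a conceptual sketch only: it holds reads in memory, which is impractical at real sequencing depth, and the pipeline itself would operate on alignment files with dedicated tools.

```python
import random


def subsample_reads(reads, max_reads=35_000_000, seed=0):
    """Return at most max_reads reads, sampled without replacement.

    Replicates that are already at or below the cap are returned unchanged,
    so only deeper replicates are reduced. The seed makes the draw repeatable.
    """
    reads = list(reads)
    if len(reads) <= max_reads:
        return reads
    rng = random.Random(seed)
    return rng.sample(reads, max_reads)


if __name__ == "__main__":
    # Toy example: 100 read identifiers capped at 10.
    toy_reads = [f"read_{i}" for i in range(100)]
    print(len(subsample_reads(toy_reads, max_reads=10)))
```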
For validation, we focused on surface markers with lineage-specific expression (Fig. 4), and chose two surface markers for in-depth in vivo and in vitro validation: DLL1 (a marker of paraxial mesoderm) and GARP (a marker of cardiac mesoderm). In situ hybridization of the zebrafish homologs of these genes (deltaC, the homolog of human DLL1, and lrrc32, the homolog of human GARP) was conducted in zebrafish embryos, which revealed fairly specific expression of deltaC in paraxial mesoderm and of lrrc32 in the developing heart tube in vivo[8]. Additionally, fluorescence-activated cell sorting (FACS) of DLL1+ cells from hESC-derived day 2 paraxial mesoderm cultures followed by bulk-population and single-cell RNA-seq revealed that essentially all DLL1+ cells expressed paraxial mesoderm transcription factors at the single-cell level[8]. Collectively, this reaffirmed that DLL1 and GARP respectively mark human paraxial and cardiac mesoderm.
Figure 4

High-throughput surface marker screening.

We show here a heatmap of all surface markers whose expression varied considerably across cell types, filtering out markers where less than 30% or more than 70% of cells across all cell types expressed the marker. The % refers to the percentage of cells of a given type that expressed a marker.
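One way to read the filter described in this caption is that a marker is dropped when it is expressed by fewer than 30% of cells in every cell type, or by more than 70% of cells in every cell type. The sketch below implements that reading. This interpretation is an assumption, and the marker and cell-type names are placeholders rather than values from the screen.

```python
def variable_markers(percent_positive, low=30.0, high=70.0):
    """Keep markers whose percent-positive values are not uniformly low or high.

    percent_positive maps marker -> {cell_type: percent of cells expressing it}.
    """
    kept = {}
    for marker, by_type in percent_positive.items():
        values = list(by_type.values())
        if not values:
            continue
        uniformly_low = all(v < low for v in values)
        uniformly_high = all(v > high for v in values)
        if not (uniformly_low or uniformly_high):
            kept[marker] = by_type
    return kept


if __name__ == "__main__":
    # Placeholder numbers for illustration only.
    toy = {
        "marker_A": {"type_1": 5.0, "type_2": 85.0},   # varies: kept
        "marker_B": {"type_1": 2.0, "type_2": 10.0},   # always low: dropped
        "marker_C": {"type_1": 95.0, "type_2": 99.0},  # always high: dropped
    }
    print(sorted(variable_markers(toy)))
```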

Our related publication[8] focused on establishing the identity and function of the derived cell types, and we refer readers interested in those details to that manuscript. In brief, we verified cellular function through in vivo transplantation experiments and we assessed cellular identity and purity through molecular analyses of marker expression (RNA-seq, ATAC-seq, immunostaining, and flow cytometry).

On the molecular side, for each cell type we identified archetypic genes and surface markers based on biological knowledge and prior literature. We confirmed that the key genes were expressed at a population level through bulk RNA-seq and qPCR; then, through single-cell RNA-seq, immunostaining, or flow cytometry, we verified that the population was suitably homogeneous for those genes and surface markers[8]. On the basis of those metrics, the cell populations we derived were generally between 80 and 99% pure. Motif enrichment analysis of the open chromatin regions in each cell type (as measured by ATAC-seq) also yielded results consistent with their cellular identity, e.g., GATA motifs were significantly enriched in lateral and cardiac mesoderm.

We conducted transplantation experiments in immunodeficient mice to further verify the function of two human mesodermal cell-types derived from our differentiation process, namely sclerotome and cardiac mesoderm[8]. Sclerotome cells subcutaneously injected into immunodeficient mice self-organized to form an ectopic human bone, undergoing ossification, displaying spatial structure expected of human bone, and even attracting and becoming vascularized by mouse blood vessels. For the cardiac mesoderm, we first further differentiated the cells into cardiomyocytes through WNT blockade and BMP inhibition for four days and engineered them to constitutively express a luciferase reporter gene. To further test the functionality of these ESC-derived human cardiomyocytes, we developed an experimental system wherein ventricular fragments from week 15–17 human fetal heart[32] were subcutaneously implanted into the mouse ear. We then transplanted ESC-derived cardiomyocytes directly into the human fetal heart graft and found that they engrafted the human heart tissue for at least 10 weeks, as measured by bioluminescence imaging of luciferase-expressing cardiomyocytes in vivo.

Usage Notes

Researchers studying the single-cell RNA-seq data reported herein should note that inferences made from global comparisons (e.g., PCA or clustering) may be limited by experimental design, as each individual cell-type was processed on a separate Fluidigm C1 chip. Hence when comparing single-cell RNA-seq data from different cell-types it is difficult to account for batch effects arising from different chips. Our analysis, including comparisons of key lineage marker genes known to vary between distinct cell lineages, shows that the data are still valid. However, care should be taken in global comparisons that involve aggregating large numbers of genes, as the noise from batch effects could be substantial in that context; the bulk-population RNA-seq data could be used to verify results from such comparisons.

In our related publication[8], we applied principal component analysis to the single-cell RNA-seq data to reconstruct the differentiation trajectory of paraxial mesoderm, somitomeres, and early somites in ‘pseudotime’, a concept first introduced in the context of single-cell RNA-seq by other groups (ref. 10 and ref. 33). Interested readers might want to apply these, and other more sophisticated trajectory reconstruction methods, to our single-cell data.

The variance in data quality across the ATAC-seq experiments, due to technical reasons (e.g., different numbers of starting reads or varying cell lysis) and biological reasons (e.g., distinct cell types may have different amounts of open chromatin), means that care must also be taken when conducting global comparisons of ATAC-seq data. We found that rank-normalization (or, at one extreme, binarization) makes it easier to compare ATAC-seq data across cell-types, as opposed to using P-values, local IDR values, or measures of signal intensity to score each peak. Sub-sampling reads before peak-calling, as we did in our analysis, should only be done for global comparisons; researchers who are doing an in-depth study of one cell type should use all available reads that pass the filtering criteria for maximal information.

We are currently using this dataset to study the temporal changes in alternative splicing and long non-coding RNA expression as differentiation progresses. In addition, we are actively exploring the expression of repeat elements, including dormant retrotransposons, interspersed nuclear elements, Alu elements, and human endogenous retrovirus elements that may have a role in early human embryonic development. Readers who are interested in similar questions are welcome to contact us to discuss methods and collaborations.
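The rank-normalization and binarization suggested above for cross-cell-type ATAC-seq comparisons can be sketched as follows. The exact scheme used by the authors is not specified in this section, so the choices here (average ranks for ties, scaling ranks to the interval (0, 1], and a user-chosen threshold for binarization) are assumptions made for illustration.

```python
def rank_normalize(scores):
    """Replace each peak score by its rank divided by the number of peaks.

    The highest score maps to 1.0; tied scores share their average rank.
    """
    n = len(scores)
    order = sorted(range(n), key=lambda i: scores[i])
    ranks = [0.0] * n
    i = 0
    while i < n:
        j = i
        # Extend j over the run of scores tied with position i.
        while j + 1 < n and scores[order[j + 1]] == scores[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1  # 1-based average rank of the tie group
        for k in range(i, j + 1):
            ranks[order[k]] = average_rank / n
        i = j + 1
    return ranks


def binarize(scores, threshold):
    """Mark each peak as present (1) or absent (0) at a chosen score threshold."""
    return [1 if s >= threshold else 0 for s in scores]


if __name__ == "__main__":
    # Toy peak scores for one cell type, not real data.
    print(rank_normalize([10.0, 30.0, 20.0]))
    print(binarize([10.0, 30.0, 20.0], threshold=15.0))
```

Because ranks depend only on the ordering of peaks within a sample, this transformation is insensitive to differences in overall signal strength between experiments, which is the property the text relies on.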

Additional Information

How to cite this article: Koh, P. W. et al. An atlas of transcriptional, chromatin accessibility, and surface marker changes in human mesoderm development. Sci. Data 3:160109 doi: 10.1038/sdata.2016.109 (2016). Publisher’s note: Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
