Literature DB >> 33795864

Single-cell chromatin accessibility identifies pancreatic islet cell type- and state-specific regulatory programs of diabetes risk.

Joshua Chiou1, Chun Zeng2,3, Zhang Cheng4, Jee Yun Han4, Michael Schlichting2,3, Michael Miller4, Robert Mendez4, Serina Huang2, Jinzhao Wang2,3, Yinghui Sui2,3, Allison Deogaygay2, Mei-Lin Okino2, Yunjiang Qiu3, Ying Sun2, Parul Kudtarkar2, Rongxin Fang3, Sebastian Preissl4, Maike Sander5,6,7, David U Gorkin8,9,10, Kyle J Gaulton11,12.   

Abstract

Single-nucleus assay for transposase-accessible chromatin using sequencing (snATAC-seq) creates new opportunities to dissect cell type-specific mechanisms of complex diseases. Since pancreatic islets are central to type 2 diabetes (T2D), we profiled 15,298 islet cells by using combinatorial barcoding snATAC-seq and identified 12 clusters, including multiple alpha, beta and delta cell states. We cataloged 228,873 accessible chromatin sites and identified transcription factors underlying lineage- and state-specific regulation. We observed state-specific enrichment of fasting glucose and T2D genome-wide association studies for beta cells and enrichment for other endocrine cell types. At T2D signals localized to islet-accessible chromatin, we prioritized variants with predicted regulatory function and co-accessibility with target genes. A causal T2D variant rs231361 at the KCNQ1 locus had predicted effects on a beta cell enhancer co-accessible with INS and genome editing in embryonic stem cell-derived beta cells affected INS levels. Together our findings demonstrate the power of single-cell epigenomics for interpreting complex disease genetics.

Entities:  

Mesh:

Substances:

Year:  2021        PMID: 33795864      PMCID: PMC9037575          DOI: 10.1038/s41588-021-00823-0

Source DB:  PubMed          Journal:  Nat Genet        ISSN: 1061-4036            Impact factor:   41.307


INTRODUCTION

Gene regulatory programs are largely orchestrated by cis-regulatory elements that direct the expression of genes in response to specific developmental and environmental cues. Genetic variants associated with complex disease are highly enriched within putative cis-regulatory elements[1]. The activity of regulatory elements is often restricted to specific cell types and/or cell states, limiting the ability of ATAC-seq and other “ensemble” (or “bulk”) epigenomic technologies to map regulatory elements in individual cell types within disease-relevant tissues. To overcome this limitation, new approaches to obtain ATAC-seq profiles from single nuclei allow for the disaggregation of open chromatin from heterogenous samples into component cell types and subtypes[2-4]. These developments create opportunities to dissect the molecular mechanisms that underlie genetic risk of disease. However, to date snATAC-seq data from disease-relevant human tissues are limited[5-8]. Type 2 diabetes (T2D) is a multifactorial disease with a highly polygenic inheritance[9]. Pancreatic islets are central to genetic risk of T2D, as evidenced by shared association between T2D risk and quantitative measures of islet function[10-13] and enrichment of T2D risk variants in islet regulatory sites[14-18]. Islets are comprised of multiple endocrine cell types with distinct functions[19-21] and are heterogeneous[22-24] in gene expression and other molecular signatures which likely reflect different functional cell states[22,25,26]. Heterogeneity in the epigenome of islet cell types has not been described, however, which is necessary to understand islet regulation and interpret the molecular mechanisms of non-coding T2D risk variants. In this study, we map accessible chromatin profiles of individual islet cells using snATAC-seq, define the regulatory programs of islet cell types and cell states, describe their relationship to T2D risk and fasting glycemia, and predict the molecular mechanisms of T2D risk variants.

RESULTS

Islet snATAC-seq reveals 12 distinct cell clusters

We performed snATAC-seq on human pancreatic islets from three donors using a combinatorial barcoding snATAC-seq approach optimized for use on tissues[2,4] (Supplementary Table 1). To confirm library quality, we first analyzed the data as ensemble ATAC-seq by aggregating all high-quality mapped reads irrespective of barcode. Ensemble snATAC-seq from all three samples showed the expected insert size distribution (Extended Data Figure 1a), strong enrichment of signal at transcription start sites (TSS) (Extended Data Figure 1b), and high concordance of signal with published islet ATAC-seq data[14,27-29] (Extended Data Figure 1c).
Extended Data Fig. 1

Quality control metrics and aggregate comparison to bulk islet ATAC.

(a) Insert size distribution for aggregate reads from each snATAC-seq experiment. (b) Aggregated read coverage from each snATAC-seq experiment in a ±2 kb window around individual promoters (top) and averaged across all promoters (bottom). (c) Spearman correlation between normalized read coverage within a merged set of peaks from 3 aggregated islet snATAC-seq, 42 bulk islet ATAC-seq, and 4 bulk pancreas ATAC-seq datasets. Names of samples are from the original sources of the data. (d) Binned log10 read depth distribution for each experiment.

To obtain high-quality single cell profiles, we first filtered out barcodes with fewer than 500 reads (Extended Data Figure 1d), resulting in a total of 18,544 cells across the three samples. We then clustered accessible chromatin profiles from these cells, making key modifications to previous approaches[4] (Extended Data Figure 2). After filtering out cells with aberrant quality metrics, we retained 15,298 cells which mapped to 12 clusters (Figure 1a). To determine the cell type represented by each cluster, we examined chromatin accessibility at the promoter region of the cognate hormone genes for endocrine cells and known marker genes for non-endocrine cell types. We identified clusters representing beta (INS-IGF2/insulin), alpha (GCG/glucagon), delta (SST/somatostatin), gamma (PPY/pancreatic polypeptide), ductal (CFTR), acinar (REG1A), immune (NCF2)[30], stellate (PDGFRB)[30], and endothelial (CD93)[31] cells (Figure 1b-c). We defined a broader set of marker genes for each cluster by identifying promoters with accessibility most specific to each cluster (Supplementary Table 2). We observed highly specific correlations between marker genes defining cell types in snATAC-seq and islet scRNA-seq[23] (Figure 1d, Extended Data Figure 3a-e).
Extended Data Fig. 2

Flowchart of the snATAC-seq data processing pipeline

(a) Flowchart summarizing key steps of the snATAC-seq processing pipeline, including the various steps where cells were filtered out. Samples were first processed individually. All samples were then combined using a batch correction method. Clusters corresponding to cells from low quality cells, including those with low read depth in highly variable windows and low fraction of reads in peaks were then removed. After re-clustering, iterative subclustering of the main clusters at high resolution was used to identify and remove doublet subclusters. The final clusters are not driven by potential confounders such as donor of origin. Boxplot center lines, limits, and whiskers represent median, quartiles, and 1.5 IQR respectively.

Figure 1.

Pancreatic islet cell type accessible chromatin defined using snATAC-seq.

(a) Clustering of accessible chromatin profiles from 15,298 pancreatic islet cells identifies 12 distinct clusters plotted on UMAP coordinates. The number of cells for each cluster is listed in parenthesis next to the cluster label. (b) Promoter accessibility in a 1 kb window around the TSS for selected marker genes. (c) Aggregate read density (counts per 1×105) at hormone marker genes: GCG (alpha), INS-IGF2 (beta), SST (delta), and PPY (gamma). (d) Spearman correlation between t-statistics of cluster-specific genes based on promoter accessibility (snATAC-seq) and gene expression (scRNA-seq). (e) Row-normalized chromVAR motif enrichment z-scores for 141 TF sequence motifs with variable enrichment across clusters (left). Cell types with multiple clusters are collapsed into a single cluster (e.g. beta 1 + beta 2 into beta). Enrichment z-scores for FOXA1 and PDX1 motifs for each cell projected onto UMAP coordinates (right). (f) Pearson correlation of TF motif enrichment z-scores between endocrine and exocrine cell types (g) FDR-corrected p-values from two-sided two sample T-tests of differential chromVAR motif enrichment comparison between delta and beta cells for 366 TF motifs. (h) Enrichment z-scores for SCRT1 and MAFB motifs in 7,598 beta and 710 delta cells projected onto UMAP coordinates (top) and shown as violin distributions (bottom; lines represent median and quartiles).

Extended Data Fig. 3

Analysis of islet single cell gene expression data

(a) log10 transformed read depth or (b) total number of genes expressed compared with number of marker genes expressed per cell from scRNA-seq data. Boxplot center lines, limits, and whiskers represent median, quartiles, and 1.5 IQR respectively. Cells expressing more than one marker gene (defined by mixture models) were marked as doublets and filtered out. (c) Clusters of islet cells from single cell RNA-seq data plotted on UMAP coordinates. quies. stellate, quiescent stellate. activ. stellate, activated stellate. (d) Selected marker gene log2(expression) for each cluster plotted on UMAP coordinates. (e) Row-normalized t-statistics of marker gene specificity showing the most specific genes (t-statistic>20) for each cluster.

To characterize regulatory programs in each cell type, we aggregated reads for cells within each cluster and identified accessible chromatin sites for the cluster using MACS2[32]. In total, we identified 228,873 accessible chromatin sites merged across the 12 clusters (Supplementary Data 1). We next used chromVAR[33] to identify transcription factor (TF) motifs from JASPAR[34] enriched within accessible chromatin of each cell. Analysis of motif enrichments averaged across cells for each cell type revealed cell type-specific patterns of motif enrichment (Figure 1e, Supplementary Table 3). For example, we observed enrichment of PDX1 in beta and delta cells[35], MAF in alpha and beta cells[36-38], IRF in immune cells[39] and ETS in endothelial cells[40] (Figure 1e). Hierarchical clustering of cell types based on motif enrichment revealed that regulatory programs of beta and delta cells were closely related as were alpha and gamma cells (Figure 1f), consistent with single cell expression data[30,41,42]. Motifs highly enriched in delta cell chromatin relative to beta cells included SCRT (SCRT1 -log10(FDR)=86.49) and ELF TFs (ELF5 -log10(FDR)=79.41), and motifs enriched in gamma cell chromatin relative to alpha cells included HOX (Hoxa9 -log10(FDR)=20.92) and IRF TFs (IRF1 -log10(FDR)=20.22) (Figure 1h,g, Extended Data Figure 4, Supplementary Table 4-5).
Extended Data Fig. 4

Comparison of motif enrichment between alpha and gamma cells

Differential enrichment of motifs between alpha cell open chromatin regions and gamma cell open chromatin regions as measured by a 2-sided T-test, with FDR calculated by the Benjamini-Hochberg procedure. Examples are highlighted of motifs enriched in alpha cells and gamma cells, respectively (MAFG, HOXA9). UMAP plots show enrichment z-scores for the indicated motifs in alpha and gamma cells. Violin plots below show the distribution of enrichment z-scores across alpha or gamma cells, where the lines represent median and quartiles.

Heterogeneity in islet endocrine cell regulatory programs

A major strength of single cell approaches is the ability to reveal heterogeneity within a cell type. Indeed, our initial clustering showed that alpha, beta and delta cells segregated into sub-clusters. We identified gene promoters with variable accessibility between sub-clusters (Methods, Supplementary Data 2). Notably, INS had amongst the most variable promoter accessibility between beta cell sub-clusters (INS-IGF2 beta OR=4.74, two-sided Fisher’s exact p=1.78×10−40), and therefore for clarity we renamed the clusters INShigh and INSlow beta cells, respectively (Figure 1b,c; Figure 2a). Similarly, GCG promoter accessibility was variable between alpha cell sub-clusters (GCG alpha OR=3.67, p=3.45×10−22), which we renamed GCGhigh and GCGlow alpha cells, and SST promoter accessibility was variable between delta cell sub-clusters (SST delta OR=1.86 p=0.02), which we renamed SSThigh and SSTlow delta cells (Figure 1b-c; Figure 2a).
Figure 2.

Heterogeneity in endocrine cell accessible chromatin and regulatory programs.

(a) Gene promoters with significantly differential chromatin accessibility between sub-clusters of alpha cells (left), beta cells (middle), and delta cells (right). (b) Enrichment of gene sets using ranked gene lists from the differential promoter analyses. Panels include genes with differential promoter accessibility between hormone-high or hormone-low states (first subpanel from left); genes expressed in beta cell sub-clusters from islet scRNA-seq (second and third subpanels); genes positively and negatively correlated with exocytosis from islet Patch-seq (fourth subpanel). (c) Enrichment of gene ontology terms related to glucose response, hormone secretion, stress response, and cell cycle among genes with differential promoter accessibility between endocrine cell states. (d) Row-normalized motif enrichments for 215 TF motifs with variable enrichment across endocrine cell states. Single cell motif enrichment z-scores for a representative RFX (RFX3) and FOS/JUN (FOS::JUN) motif are projected onto UMAP coordinates (right), and violin plots below show motif enrichment distribution within endocrine cell states (lines represent median and quartiles). (e) Ordering of alpha, beta and delta cells across pseudostate trajectories using high GCG/INS-IGF2/SST promoter accessibility as the reference point. Across each trajectory, the percentage of cells in the hormone-high state and the binary cluster call of individual cells are shown above the heatmaps, which show row-normalized motif enrichment for variable motifs between cell states. (f) Promoter accessibility for genes in the FOS/JUN motif family across pseudostate trajectories. Genes with matching promoter accessibility and motif enrichment patterns (ρ>0.5) are bolded.

We found significant overlap in the genes that distinguish hormone-high from hormone-low alpha, beta and delta cells by gene set enrichment analysis (GSEA) (Figure 2b). Genes with increased promoter accessibility in hormone-high states were enriched for hormone secretion and glucose response; in contrast, genes with increased promoter accessibility for hormone-low states were enriched for stress-induced signaling response (Figure 2a,c, Supplementary Table 6). We also observed enriched TF motifs that distinguished different states (Figure 2d, Supplementary Table 7). For example, RFX family motifs were enriched in hormone-high but not in hormone-low states (Rfx1 mean enrichment INShigh=0.36, INSlow=−0.95, p=0; GCGhigh=0.52, GCGlow=−1.16, p=7.3×10−260; SSThigh=0.76, SSTlow=−1.24, p=3.9×10−58) (Figure 2d). In contrast, FOS/JUN family motifs were prominently enriched in hormone-low but not hormone-high states (FOS::JUN mean enrichment INShigh=−1.78, INSlow=3.90, p=0; GCGhigh=−2.86, GCGlow=5.50, p=0; SSThigh=−0.21, SSTlow=7.62, p=1.1×10−121) (Figure 2d). These data reveal epigenomic differences between endocrine cell states among genes involved in hormone production and stress-induced signaling responses and point to an underlying commonality in regulatory networks that govern state-specific functions of endocrine cell types. We next sought to determine whether the observed heterogeneity in the epigenome of endocrine cells correlated with heterogeneity in islet gene expression and function. We first compared our states to beta cell sub-clusters from a previous scRNA-seq study[23]. Genes with increased promoter accessibility in hormone-low cells were enriched in a beta cell sub-cluster (β-sub.4) associated with ER stress and protein folding and low INS expression, whereas genes with increased promoter accessibility in hormone-high cells were enriched in the other beta cell sub-clusters (β-sub.1–3) (Figure 2b). We further found enrichment of genes with differential promoter accessibility among gene sets preferentially expressed in beta cell sub-clusters from a recent scRNA-seq meta-analysis[43] (Figure 2b). Finally, we found significant overlap in genes with differential promoter accessibility between states and genes that correlate with electrophysiological measures of beta cell function from a recent Patch-seq study[44] (Figure 2b). These results thus provide a link between epigenomic heterogeneity in endocrine cells and heterogeneity in gene expression and electrophysiological function. To explore potential gradations among endocrine cells as a continuum rather than as binary states[23,45], we used Cicero[8] to order cells from each cell type along trajectories based on chromatin accessibility. We observed cells on a gradient between hormone-high and hormone-low states (Figure 2e, Extended Data Figure 5a-c). These trajectories allowed us to examine gene promoter accessibility and TF motif enrichment as a function of pseudo-state (Figure 2e, Extended Data Figure 5d). Consistent with binary sub-clusters, lineage-specifying genes and TF family motif enrichments such as RFX and NFAT decreased along the trajectory from hormone-high to -low cells, whereas motif enrichment in TF families such as FOS/JUN increased (Figure 2e). Structurally-related TFs often have similar motifs, and thus to assign motifs to specific TFs we correlated promoter-accessibility of TFs within the structural subfamily with motif enrichments across the state trajectory. Motif enrichment for the FOS/JUN family correlated with the promoter accessibility of FOSL1, FOSL2 and JUN (Figure 2f), supporting a role for these specific TFs in hormone-low cell regulation.
Extended Data Fig. 5

Differentially accessible promoters across pseudo-states

(a) Pseudo-state (trajectory) values for alpha cells plotted on UMAP coordinates (left) and percentage of cells with GCG promoter accessibility decreases across 10 bins along the alpha (α) cell trajectory (right). (b) Pseudo-state (trajectory) values for beta (β) cells plotted on UMAP coordinates (left) and percentage of cells with INS promoter accessibility decreases across 10 bins along the beta cell trajectory (right). (c) Pseudo-state (trajectory) values for delta (δ) cells plotted on UMAP coordinates (left) and percentage of cells with SST promoter accessibility decreases across 10 bins along the beta cell trajectory (right). (d) Heatmaps showing promoters with dynamic accessibility across trajectories for alpha (top), beta (middle) and delta (bottom) cell trajectories. Gene promoters are clustered into 4 groups for each trajectory with k-medoids clustering. Enriched gene ontology for each k-medoid cluster (left) and selected genes present in at least one enriched gene ontology.

Islet cell type enrichment for diabetes and glycemia GWAS

Genetic variants influencing diabetes and fasting glucose level are enriched in pancreatic islet regulatory elements[14,17,18,46]. Using our islet cell type- and state-resolved accessible chromatin profiles, we determined the enrichment of variants associated with diabetes[9,47] and related quantitative phenotypes[11,12,48-51] as well as other complex traits[52-59]. We first determined the enrichment of variants in accessible chromatin sites for each cell type and state using stratified LD score regression[60-62]. We observed significant enrichment (FDR<0.1) of fasting glucose (FG) level association in INShigh beta cells and T2D association for both INShigh and INSlow beta cell states (FG INShigh Z=3.58 FDR=0.013; T2D INShigh Z=4.41 FDR=0.001, INSlow Z=4.19 FDR=0.002) (Figure 3a). We also observed more nominal evidence for enrichment of T2D association for GCGlow alpha cells and both delta cell states, as well as multiple glycemic traits for endocrine cells (Figure 3a).
Figure 3.

Enrichment of islet accessible chromatin for diabetes and fasting glycemia.

(a) Stratified LD score regression enrichment z-scores for diabetes-related quantitative endophenotypes (top), type 1 and 2 diabetes (middle), and control traits (bottom) for islet cell types. **FDR<0.01 *FDR<0.1. (b) Single cell enrichment z-scores for fasting glucose level (FG) and T2D projected onto UMAP coordinates (left), enrichment per cell type (middle panels), and beta cell enrichment split into 10 trajectory bins (right). Boxplot center line, limits, and whiskers represent median, quartiles, and 1.5 interquartile range respectively. (c) Enrichment (estimate ± 95% CI by fgwas) of variants at loci associated with both T2D and FG (T2D/FG) within beta cell accessible chromatin. (d) Candidate causal T2D variant rs11708067 overlaps an enhancer active in INShigh beta cells at the ADCY5 locus, consistent with beta cell enrichment patterns for T2D/FG loci. (e) Enrichment (estimate ± 95% CI by fgwas) of variants at T2D loci in accessible chromatin for non-beta endocrine cells after removing beta accessible chromatin. (f) Candidate causal T2D variant rs1111875 overlaps a delta cell-specific site at the HHEX locus. (g) Correlation between single cell FG and TF motif enrichments across all 14.2k cells (left) and 7.2k beta cells (right). Across all cells, FG has positive correlations with beta-enriched TF families such as PDX, NKX6 and PAX. Within beta cells, FG has positive correlations with INShigh beta-enriched TF families such as RFX, MAF/NRL, and FOXA. (h) Enrichment (effect±SE) of FG-associated variants directly overlapping sequence motifs for those either positively or negatively correlated with FG in beta cells.

In these analyses, we observed differences in enrichments between INShigh and INSlow beta cells for fasting glucose (Figure 3a). To further resolve the heterogeneity of genetic association enrichment, we tested enrichment of T2D and fasting glucose as well as several negative control traits within single cell profiles (Figure 3b, Extended Data Figure 6a). We observed marked heterogeneity among beta cells for fasting glucose-associated variants, where INShigh cells had significantly stronger enrichment (INShigh median Z=2.58, INSlow median Z=0.68, p=1.19×10−225) (Figure 3b). We further calculated the average enrichment for cells binned across ‘pseudostate’ (Figure 2), which revealed decreasing enrichment from INShigh to INSlow beta cells (Figure 3b). In contrast, for T2D we observed consistent enrichment across INShigh and INSlow beta cells (INShigh median Z=0.93, INSlow median Z=0.91, p=0.44) (Figure 3b). As many variants affecting fasting glucose also affect T2D[12,13,63], we grouped T2D loci associated with fasting glucose (‘T2D/FG’ loci) and tested these loci for enrichment of INShigh and INSlow beta cell sites using fgwas[64]. We observed strong enrichment of T2D/FG loci for INShigh beta cells only (INShigh beta ln(enrichment)=4.54, INSlow beta ln(enrichment)=−25.7), suggesting that T2D loci affecting glucose levels have state-specific effects (Figure 3c). For example, at the ADCY5 locus associated with both fasting glucose and T2D, variant rs11708067:A>G (PPA=0.79) overlapped a site specific to INShigh beta cells (Figure 3d).
Extended Data Fig. 6

Single cell GWAS enrichment and correlation with TF motifs

(a) Single cell GWAS enrichment z-scores for Major depressive disorder and Systemic lupus erythematosus projected onto UMAP coordinates (left panels), z-score enrichment distribution per cell type and state (middle panels) and z-score enrichment distribution split into 10 bins based on beta cell trajectory values (right panels). Boxplot center lines, limits, and whiskers represent median, quartiles, and 1.5 IQR respectively. (b) Correlation between single cell GWAS enrichment z-scores for Type 2 Diabetes and chromVAR TF motif enrichment z-scores across either all cells (left) or beta cells (right). Inset scatterplots highlight the top correlated motifs in either direction. (c) Variants mapping directly in sequence motifs positively correlated with T2D risk in beta cells are enriched for T2D association, whereas variants mapping in motifs negatively correlated with T2D risk in beta cells show no such enrichment. Values represent effect size and SE.

Outside of beta cells we also observed evidence for enrichment of T2D association for variants in chromatin sites for other endocrine cell states including GCGlow alpha cells and SSThigh and SSTlow delta cells (Figure 3a). In order to understand the potential role of these cell states in T2D risk, we performed additional enrichment analysis at known T2D risk loci using fgwas[64]. We tested alpha, delta and gamma sites not overlapping beta cell sites for enrichment at fine-mapped T2D loci from DIAMANTE. Here we again observed enrichment of T2D association for GCGlow alpha cells (ln(enrichment)=1.75) as well as SSThigh and SSTlow delta cells (ln(enrichment)=0.86, 1.30) (Figure 3e). In further support of the likely role of these cell types in T2D, several fine-mapped risk variants overlapped sites specific to these cell types; for example, rs1111875:C>T (PPA=0.16) mapped in a delta cell-specific site at the HHEX locus (Figure 3f). Given our ability to map both complex trait and TF motif enrichments to single cells, we reasoned that joint analysis could provide insight into TFs regulating trait-relevant chromatin. We correlated single cell fasting glucose and T2D enrichment z-scores with TF motif enrichments from chromVAR[33]. Across all 15.3k cells, we observed positive correlation between fasting glucose and T2D enrichment and motifs for beta cell TFs such as PDX (Figure 3g, Extended Data Figure 6b, Supplementary Table 9). Across the 7.6k beta cells only, we observed strongest positive correlation between fasting glucose and motifs in TF families enriched for INShigh beta cells such as bHLH (NEUROD1 ρ=0.21, p=9.43×10−78) and RFX (Rfx1 ρ=0.21, p=8.83×10−74) (Figure 3g, Supplementary Table 9). For T2D, strongest positive correlations included motifs for TF families such as RFX (Rfx1 ρ=0.052, p=3.72×10−9), NFAT (NFATC2 ρ=0.047, p=4.69×10−5), and MEF2 (MEF2A ρ=0.062, p=6.99×10−8) (Extended Data Figure 6b, Supplementary Table 9). We further determined whether TF motifs preferentially harbored associated variants directly. For fasting glucose, we identified strongest enrichment for INShigh state-specific TF motifs, most notably RFX (RFX2 p=1.3×10−10) and NEUROD (p=0.049). For T2D, we observed enrichment for TF motifs positively correlated with T2D association including RFX (Rfx1 p=4.0×10−15), NFAT (NFATC2 p=2.2×10−4), and MEF2 (MEF2D p=1.9×10−3) (Extended Data Figure 6c). These motifs remained significantly enriched for T2D when considering only variants predicted to disrupt the motif (Rfx1 p=1.1×10−9, NFATC2 p=2.2×10−4, MEF2D p=7.7×10−5). Together these results provide state-resolved insight into the role of beta cells and their TFs in both T2D risk and fasting glucose level and implicate other endocrine cell types in T2D risk.

Predictions of variant effects on islet cell type chromatin

Predicting the effects of non-coding genetic variants on regulatory activity remains a major challenge, in large part because the sequence vocabularies that encode regulatory function differ between cell types and states. We therefore used deltaSVM[65] to predict the effects of genetic variants from the Haplotype Reference Consortium panel[66] on chromatin accessibility in each endocrine cell type and cell state. We identified 432,072 variants genome-wide with predicted allelic effects (FDR<0.1), encompassing between 115k-161k variants (7.8%−10.9% of tested variants) per cell type or state (Figure 4a, Supplementary Data 3).
Figure 4.

Genetic variants with islet cell type- and state-specific effects on chromatin accessibility.

(a) Percentage of HRC variants in any endocrine cell type peak (n=1,411,387 variants) that had significant deltaSVM effects (FDR<0.1) for the reference (ref.) or alternative (alt.) allele. (b) Spearman correlation between deltaSVM Z-score and chromatin accessibility allelic imbalance Z-scores for variants with predicted effects in alpha and beta states. (c) Relative luciferase reporter activity (mean ± 95% CI; n=3 replicates) for five T2D variants with predicted beta cell effects. The allele with predicted effect is circled. p-values by two-sided Student’s T-tests. (d) Enrichment of islet caQTLs for variants with predicted effects in alpha and beta cells (left) and stratified based on shared, cell type- and state-specific effects (right). p-values by two-sided Fisher’s exact test, ns, not significant. (e) Examples of variants with predicted effects in alpha and beta cells (left). TF motif families enriched in sequences surrounding the effect allele relative to the non-effect allele (middle). Promoter accessibility patterns for genes in enriched TF motif families (right). Genes with matching promoter accessibility and TF motif enrichment patterns are highlighted. (f) Enrichment (estimate ± 95% CI) of low frequency and rare variants with predicted effects on islet chromatin at different T2D association thresholds. p-values by two-sided binomial test. (g) Low-frequency T2D variant rs78840640 at the IGF2BP3 locus with high causal probability (PPA=0.33) has predicted effects in beta cells.

To validate that our predictions captured true allelic effects on islet chromatin accessibility, we first compared alpha and beta cell predictions to direct measurements of allelic effects on chromatin accessibility. We found significant correlations between predicted allelic effects and allelic imbalance estimates for all alpha and beta cell states (GCGhigh Spearman ρ=0.255, p=1.20×10−34, GCGlow ρ=0.214, p=2.35×10−7, INShigh ρ=0.275, p=1.03×10−34, INSlow ρ=0.334, p=4.73×10−38) (Figure 4b). We further validated five likely causal T2D variants predicted to affect beta cell chromatin which had directionally consistent effects on enhancer activity using gene reporter assays in the MIN6 beta cell line (Figure 4c). We also compared predictions to islet chromatin accessibility quantitative trait loci (caQTLs)[28], and observed significant enrichment of caQTLs among variants with predicted effects in alpha or beta cells (observed=38.4%, expected=19.7%, two-sided Fisher’s exact p=2.78×10−97) (Figure 4d). When sub-dividing predictions based on shared, cell type-specific or state-specific effects we observed significant enrichment of caQTLs only among shared effect variants (Figure 4d). We further characterized genetic variants predicted to have cell type- and state-dependent effects on islet chromatin. Variants with state-specific effects tended to disrupt motifs for TF families such as NEUROD and RFX for hormone-high states (-log10(p)=59.0, 24.5) (Figure 4e). Similarly, variants with alpha or beta cell-specific effects tended to disrupt motifs for lineage-defining TF families including GATA for alpha cells (-log10(p)=21.6), and NKX6 and PDX1 for beta cells (-log10(p)=11.2, 10.3) (Figure 4e). To assign motifs to specific TFs, we examined promoter-accessibility of TFs within the structural TF subfamily[67]. Among GATA subfamily members only GATA6 had high promoter accessibility in alpha cells (GCGhigh=1.00, GCGlow=0.96, INShigh=0.22, INSlow=0.14), suggesting that GATA6 binding is likely disrupted in alpha cells. Similarly, among RFX family members RFX6 had promoter accessibility in hormone-high state cells (GCGhigh=0.92, GCGlow=0.70, INShigh=0.93, INSlow=0.80) (Figure 4e). We evaluated whether our predictions could prioritize lower frequency (defined as minor allele frequency MAF<0.05) functional variants involved in T2D risk. We observed enrichment of genome-wide significant T2D associations among lower frequency variants with predicted effects in any endocrine cell type compared to background (Figure 4f). When considering each cell type, we observed enrichment of T2D association among variants with predicted effects in beta and delta cells, even at sub-genome-wide significant p-values (Figure 4f). For example, at the IGF2BP3 locus, rs78840640:C>G (MAF=0.02) had allelic effects on beta cell chromatin (INShigh beta FDR=1.15×10−4; INSlow beta FDR=6.93×10−3), and fine-mapping supported a causal role in T2D (PPA=0.33) (Figure 4g). This variant affected enhancer activity in gene reporter assays where the alternate (and T2D risk) allele G had reduced activity (Figure 4c). These results reveal that cell type-specific chromatin can provide accurate functional predictions of lower frequency variants.

Co-accessibility links regulatory variants to target genes

Defining the genes affected by regulatory element activity remains a major challenge, as enhancers can regulate gene activity over large, non-adjacent distances[68]. A number of approaches have been developed to link regulatory elements to target genes[69,70], but are not typically cell type-resolved[27,71]. Recently, a new approach was developed to link regulatory elements at cell type resolution based on co-accessibility across single cells[8]. We thus leveraged single cell accessible chromatin profiles to define co-accessibility between accessible chromatin sites in alpha, beta and delta cells. To calibrate the extent to which co-accessibility reflected physical interactions between regulatory elements, we performed a distance-matched comparison between co-accessible sites stratified by co-accessibility threshold and chromatin loops identified from Hi-C and promoter capture Hi-C (pcHi-C) in primary islets[27,71]. We observed strong enrichment for sites with co-accessibility scores >0.05 for islet chromatin loops compared to non-co-accessible sites (Figure 5a, Extended Data Figure 7a-e). We therefore used this threshold (0.05) to define co-accessible sites (Supplementary Data 4-6). Among co-accessible sites were 47,871 (alpha), 46,036 (beta) and 42,234 (delta) distal sites co-accessible with a gene promoter (Extended Data Figure 7f-g), and the majority (71.9%) were cell type-specific (Extended Data Figure 8a-c). For example, the PDX1 promoter was co-accessible with 31 sites in beta and 39 sites in delta cells, including sites that directly coincided with islet pcHi-C, only 7 of which were in alpha cells (Figure 5b).
Figure 5.

Chromatin co-accessibility links diabetes risk variants to target genes.

(a) Distance-matched odds that co-accessible sites in beta cells overlap islet pcHi-C interactions at different threshold bins. (b) Single cell co-accessibility and islet pcHi-C interactions at the PDX1 promoter. (c) Enhancer harboring T2D variant rs231361 shows distal beta cell co-accessibility to the INS promoter and other non-promoter sites. The enhancer is accessible in INShigh but not INSlow beta and has decreasing accessibility across the beta cell trajectory. rs231361 disrupts an RFX motif and has predicted effects in INShigh beta. *FDR<0.1. CRISPR/Cas9 deletion of a 2.6 kb region (highlighted, ∆Enh) around the enhancer. (d) Expression (transcripts per million, TPM±SEM) of genes within 2 Mb of the enhancer in beta cell stage cultures for ∆Enh (n=6; 3 clones × 2 differentiations) and control (n=2; 1 clone × 2 differentiations), p-values from DESeq2, ns not significant. (e) Representative immunofluorescence staining for INS (green), NKX6–1 (red), and DAPI (blue) in beta cell stage cultures for ∆Enh (n=3 independent experiments) and control (n=3 experiments) (f) Quantification of INS median fluorescence intensity (MFI) and (g) INS content in beta cell stage cultures for ∆Enh (n=9; 3 clones × 3 differentiations) and control (n=6; 2 clones × 3 differentiations). (h) Relative expression of INS and CDKN1C, (i) Quantification of INS MFI, and (j) INS content in beta cell stage cultures for KCNQ1A/A (n=9; 3 clones × 3 differentiations) and KCNQ1G/G (n=6; 2 clones × 3 differentiations). Data shown are mean±SEM, p-values by two-sided Student’s T-test, ns, not significant.

Extended Data Fig. 7

Single cell co-accessibility analyses in islet cell types

(a) Distance-matched odds that delta cell co-accessibility links overlap islet pcHi-C chromatin loops at different co-accessibility threshold bins in 0.05 intervals demonstrate that co-accessible links are enriched for chromatin interactions. (b) Same analysis as in (a) but with alpha cell co-accessibility. (c) Same analysis as in (a) but with beta cell co-accessibility and Hi-C loops. (d) Same analysis as in (a) but with delta cell co-accessibility and Hi-C loops. (e) Same analysis as in (a) but with alpha cell co-accessibility and Hi-C loops. (f) Number of distal sites linked to each promoter peak for alpha, beta, and delta cells. (g) Number of promoters linked to each distal site for alpha, beta, and delta cells.

Extended Data Fig. 8

Cell type-specific and shared co-accessible sites

(a) An example of co-accessibility anchored at the promoter for the delta cell identity TF HHEX. Co-accessibility for beta, delta, and alpha cells are shown compared to high-confidence pcHi-C loops from ensemble islets. Genome browser plots scale: 0–10. (b) An example of co-accessibility anchored at the promoter for the alpha cell identity TF ARX. (c) An example of shared co-accessibility anchored at the promoter for the shared islet identity TF NEUROD1.

Distal sites with co-accessibility links to gene promoters harbored risk variants for T2D at many loci. At the KCNQ1 locus, an islet chromatin site located in intron 3 of KCNQ1 had beta cell-specific co-accessibility with the INS promoter over 500 kb distal and harbored a causal T2D variant rs231361:G>A (PPA=1) (Figure 5c). The site containing rs231361 was more accessible in INShigh compared to INSlow beta cells, and rs231361 was predicted to have state-specific effects on beta cell chromatin accessibility (INShigh beta FDR=0.060; INSlow beta FDR=0.40). Furthermore, rs231361 disrupted an RFX motif, which itself was enriched in INShigh beta cells (Figure 5c). Published 4C data from the EndoC-βH1 human beta cell line[72] revealed physical proximity between this site and the INS promoter (Extended Data Figure 9), but there was not similar evidence in Hi-C or other 3C-based data from EndoC-βH1 and primary islets[27,71,73].
Extended Data Fig. 9

3D chromatin interactions at the T2D-associated KCNQ1 locus

Top panels show Hi-C contact matrices from hESC-derived beta cells, visualized at 25 kb resolution. Region shown is chr11:500,00–4,500,000, hg19. Black arrows indicate putative interaction point of INS TSS and KCNQ1 enhancer. Genome browser plot below shows a zoomed view of chr11:1,750,000–3,250,000. Data from 4C-seq anchored on the INS promoter in EndoC-βH1 cells (Jian & Felsenfeld 2018) is shown, as analyzed with the 4C-ker package. Normalized read counts are shown in black from 3 biological replicates. Significant interactions from INS promoter are shown as arcs below read counts tracks. Interactions calls are from data pooled across 3 replicates are shown here. The region containing the KCNQ1 enhancer was called as a significant interaction region with INS promotor independently in each 4C replicate. Virtual 4C plots in green show log(normalized Hi-C interaction frequency) from INS promoter.

Although we observed physical proximity between rs231361 and the INS promoter in beta cells, the absence of a canonical chromatin loop necessitated further validation. We therefore deleted a 2.6 kb region flanking the site in hESCs by CRISPR/Cas9-mediated genome editing in three bi-allelic clones (KCNQ1∆Enh) (Figure 5c, Extended Data Figure 10a,b), and differentiated KCNQ1∆Enh clones and unedited control clones into beta cells using a modified version of an established protocol[74]. Analysis of cultures at the beta cell stage revealed similar numbers of INS+/NKX6–1+ cells in KCNQ1∆Enh and control clones (47.1±13.4% vs 56.5±7.6%) (Extended Data Figure 10c), suggesting that the enhancer deletion had no effect on beta cell differentiation. We determined effects of the enhancer deletion on expression of all genes within 2 Mb of the enhancer and observed a significant decrease in the expression of INS (p=3.02×10−4; FDR=0.066) and CDKN1C (p=1.96×10−4; FDR=0.059) in KCNQ1∆Enh compared to control cells, and not for other genes (all P>.05) (Figure 5d). Analysis of INS protein levels by immunofluorescence staining, flow cytometry, and ELISA further revealed reduced INS protein abundance in KCNQ1∆Enh beta cells (Figure 5e-g).
Extended Data Fig. 10

Genome editing of the KCNQ1 locus in hESCs

(a) Schematic of the workflow and (b) Sanger sequencing for KCNQ1 enhancer deletion in three independent hESC clones. (c) Representative figures of flow cytometry analysis for NKX6–1 and INS comparing control and KCNQ1ΔEnh cells (left). Quantification of the percentage of NKX6–1+/INS+ cells in beta cell stage cultures from control (n=6; 2 clones × 3 differentiations) and KCNQ1∆Enh (n=9; 3 clones × 3 differentiations) cells (right). Values represent mean and SEM. ns, not significant by two-sided Student’s T-test without adjustment for multiple comparisons. (d) Schematic of the workflow and (e) Sanger sequencing for two independent KCNQ1G/G clones and three KCNQ1A/A clones. (f) Representative figures of flow cytometry analysis for NKX6–1 and INS comparing KCNQ1G/G and KCNQ1A/A clones (left). of the percentage of NKX6–1+/INS+ cells in beta cell stage cultures from KCNQ1G/G (n=6; 2 clones × 3 differentiations) and KCNQ1A/A (n=9; 3 clones × 3 differentiations) cells (right). ns, not significant by two-sided Student’s T-test without adjustment for multiple comparisons. Values represent mean and SEM.

To next determine whether rs231261 itself had distal effects on INS regulation in beta cells, we used targeted base editing to generate hESC lines that were homozygous for either the major allele G (KCNQ1, two clones) or the minor (and T2D risk) allele A (KCNQ1A/A, three clones) (Extended Data Figure 10d-f). We then differentiated the KCNQ1 and KCNQ1 clones into beta cells using the same protocol as for KCNQ1∆Enh with additional modifications. We determined effects of the variant alleles on expression of the two genes INS and CDKN1C with significant changes in the enhancer deletion (in Figure 5d) using qPCR. We observed a significant difference in INS expression between alleles (p=0.022), as well as evidence for a difference in CDKN1C expression although this was not significant (p>0.05) (Figure 5h). Analysis of insulin protein levels by flow cytometry and ELISA revealed significant differences in insulin abundance between variant alleles (Figure 5i,j).

DISCUSSION

Over 400 known risk signals for T2D have been identified, yet only a handful have been characterized molecularly[16,17,27,75-81]. Our findings provide a roadmap demonstrating how single cell accessible chromatin derived from disease-relevant primary tissue can be utilized to define cell types, cell states, cis regulatory elements and genes involved in the genetic basis of T2D and other complex disease. The KCNQ1 locus has a complex contribution to T2D involving at least 10 independent signals[9]. Among these was causal T2D variant rs231361, where genome editing in hESC beta cell models revealed effects on insulin transcript and protein levels. Chromatin conformation from 4C in EndoC-βH1 cells[72] revealed physical proximity between the enhancer and INS promoter, although we did not find corresponding evidence for an interaction in other 3C-based data from EndoC-βH1 cells[73], hESC beta cells or primary islets[27,71]. Thus, while our results support a possible cis-regulatory effect of rs231361 on INS expression, we cannot currently rule out that the observed effects instead occur in trans, secondary to other effects. We anticipate that future studies to resolve phase between variant alleles and INS allelic expression in heterozygous samples will further clarify the nature of these effects. Single cell accessible chromatin uncovered heterogeneity in the regulatory programs of endocrine cell types, pointing to TFs that likely drive cell state-specific functions. Integrating single cell heterogeneity with large-scale genetic association data revealed that genetic variants modulating fasting glucose levels likely act through the high insulin-producing beta cell state, whereas genetic risk of T2D is mediated through effects on both the high insulin-producing state and other functional beta cell state(s) likely related to stress and signaling responses. Moreover, given similar heterogeneity in the epigenomes of alpha and delta cells, our results reveal that endocrine cell regulation involves both lineage-specific programs as well as an additional layer of state-specific programs common across endocrine cell types. Our results also implicate these other endocrine cell types in genetic risk of T2D independent of beta cells, most prominently delta cells. Previous studies have characterized heterogeneity in beta cell physiological function, cell surface markers, and gene expression[22,44,82-84]. We found that heterogeneity in the beta cell epigenome mapped to cellular states related to insulin production and stress-related signaling response and was tightly linked to heterogeneity in beta cell gene expression[23] and electrophysiology[44]. However, there is often not perfect correspondence between sub-clusters identified by different techniques and/or studies, and we anticipate that using multi-omic methods will help to further clarify these differences. Regardless of the technology used, heterogeneity defined from single cell data is by nature dependent on computational clustering or ordering. Separating true heterogeneity from other sub-structure therefore ultimately requires experimental validation, for example by profiling cell populations sorted using sub-type markers. As the sub-clusters described in our study have not been linked directly to functional differences, experiments will be necessary to determine the relationship between epigenomic heterogeneity and cellular function. Furthermore, while we observed heterogeneity in endocrine cells from cryopreserved tissue in addition to purified islets, determining the true extent and nature of heterogeneity in vivo will require more extensive studies across a broader range of samples and conditions. In summary, here we present a detailed characterization of islet cell type and state regulatory programs. When combined with genetic fine-mapping and whole genome sequencing, as well as additional cell type-specific data in islets[85], this resource will greatly enhance efforts to define molecular mechanisms of T2D risk. More broadly, our study provides a framework for using single cell chromatin from disease-relevant tissues to interpret the genetic mechanisms of complex disease.

ONLINE METHODS

Islet processing and nuclei isolation

We obtained islet preparations for three donors from the Integrated Islet Distribution Program (Supplementary Table 1). Islet preparations were further enriched using zinc-dithizone staining followed by hand picking. For experiments performed with whole pancreas tissue, a cryopreserved tissue sample was obtained from the Network for Pancreatic Organ Donors with Diabetes (nPOD) biorepository. Studies were given exempt status by the Institutional Review Board (IRB) of the University of California San Diego.

Generation of snATAC-seq libraries

Combinatorial barcoding single nucleus ATAC-seq was performed as described previously[2,4] with several modifications (described in Supplementary Note). Libraries were quantified using a Qubit fluorimeter (Life technologies) and the nucleosomal pattern was verified using a Tapestation (High Sensitivity D1000, Agilent). Libraries were sequenced on a HiSeq 2500 sequencer (Illumina) using custom sequencing primers, 25% spike-in library and following read lengths: 50+43+40+50 (Read1+Index1+Index2+Read2). Pancreas tissue preparation is described in the Supplementary Note. Droplet-based single cell ATAC-seq libraries were generated using the Chromium Chip E Single Cell ATAC kit (10x Genomics, 1000086) and indexes (Chromium i7 Multiplex Kit N, Set A, 10x Genomics, 1000084) following manufacturer instructions. Final libraries were quantified using a Qubit fluorimeter (Life Technologies) and the nucleosomal pattern was verified using a Tapestation (High Sensitivity D1000, Agilent). Libraries were sequenced on a NextSeq 500 and HiSeq 4000 sequencer (Illumina) with following read lengths: 50+8+16+50 (Read1+Index1+Index2+Read2).

Raw data processing and quality control

For each read, we first appended the cell barcode metadata to the read name. The cell barcode consisted of four pieces (P7, I7, I5, P5) which were derived from the index read files. We first corrected for sequencing errors by calculating the Levenshtein distance between each of the four pieces and a whitelist of possible sequences. If the piece did not perfectly match a whitelisted sequence, we took the best matching sequence if it was within 2 edits and the next matching sequence was at least 2 additional edits away. If none of these conditions were met, we discarded the read from further analyses. Sequence processing steps provided in the Supplementary Note.

Cluster analysis for snATAC-seq

We split the genome into 5 kb windows and removed windows overlapping blacklisted regions (v2) from ENCODE[86,87]. For each experiment, we created a sparse m x n matrix containing read depth for m cells passing read depth thresholds at n windows. Using scanpy[88] (v.1.4.4.post1), we extracted highly variable windows using mean read depth and normalized dispersion (‘min_mean=0.01, min_disp=0.25’). After normalization to uniform read depth and log-transformation, for each experiment, we regressed out the log-transformed read depth within highly variable windows for each cell. We then performed principal component analysis (PCA) and extracted the top 50 principal components. We used Harmony[24] to correct the principal components and remove batch effects across experiments, using donor-of-origin as a covariate. We used Harmony-corrected components to calculate the nearest 30 neighbors using the cosine metric, which were subsequently used for UMAP dimensionality reduction (‘min_dist=0.3’) and Leiden clustering[89] (‘resolution=1.5’). We performed iterative clustering to identify and remove cells with abnormal features prior to the final clustering results (see Supplementary Note). After removing these cells, we ended up with 15,298 cells mapping to 12 clusters. We used chromatin accessibility at windows overlapping promoters for marker hormones to assign cell types for the endocrine islet cell types and chromatin accessibility at windows around marker genes from scRNA-seq to assign cluster labels for non-endocrine islet clusters.

Comparison to bulk and sorted islet ATAC-seq

We processed sequence data of bulk islet ATAC-seq[14,27-29] and bulk pancreas from ENCODE[86] (see Supplementary Note). We calculated the Spearman correlation between normalized read coverages and used hierarchical clustering to assess similarity between bulk islet samples. To check peak call overlap between aggregated snATAC-seq and bulk ATAC-seq, we split peaks based into promoter proximal (±500 bp from TSS) and distal peaks based on promoter overlap. For each cluster, we calculated the percentage of aggregate peaks that overlapped merged autosomal bulk peaks and individual sample-level autosomal bulk peaks. We processed raw sequence data of ATAC-seq from flow-sorted pancreatic cells from two prior studies[90,91] (see Supplementary Note). We calculated Spearman correlations between read coverages and used hierarchical clustering to assess similarity between sorted and snATAC-seq samples.

Identifying marker peaks of chromatin accessibility

To identify peaks for each cell type, we aggregated reads for all cells within a cluster. We shifted reads aligning to the positive strand +4bp and reads aligning to the negative strand −5 bp, extended reads to 200 bp, and centered the reads. We used MACS2[32] to call peaks for each cluster (‘--nomodel --keep-dup all’). We removed peaks that overlapped the ENCODE blacklist v2[86,87]. We then used bedtools[92] (v.2.26.0) to merge peaks from clusters and create a set of merged peaks. We generated a sparse m x n matrix containing binary overlap between m peaks in the merged set of islet regulatory peaks and n cells. We calculated t-statistics of peak specificity for each cluster using linear regression models. For each peak and each cluster, we used binary encoding of read overlap with the peak as the outcome and whether a cell was in the cluster as the predictor, and included the log read depth of each cell as a covariate in the model.

Matching islet snATAC-seq with scRNA-seq clusters

To verify that clusters definitions and labels from single cell chromatin accessibility data matched those from single cell expression data, we obtained published scRNA-seq data from 12 non-diabetic islet donors[23]. Because cluster definitions for all cell types were not available, we re-analyzed the data and performed clustering analyses. We used the Spearman correlation between t-statistics from islet snATAC-seq and islet scRNA-seq data to verify cluster labels. See Supplementary Note for additional details.

Single cell motif enrichment

We used chromVAR[33] (v.1.5.0) to estimate TF motif enrichment z-scores for each cell. First, we constructed a merged peak by cell sparse binary matrix as the input for chromVAR. We corrected for GC bias based on hg19 (BSgenome.Hsapiens.UCSC.hg19) using the ‘addGCBias’ function. For TF motifs within the non-redundant JASPAR 2018 CORE vertebrate set, we calculated bias-corrected deviation z-scores for each cell. Across all cell types, we selected variable TF motifs (N=141) with variability>1.2. For each cell type, we then computed the average TF motif enrichment z-score across single cells, collapsing values for cell types with more than one state. We compared motif enrichment z-scores between single cells using Benjamini-Hochberg corrected p-values (FDR<0.01) from 2 sample T-tests.

Ordering alpha, beta, and delta cells along a pseudo-state trajectory

We used Cicero[8] (v.1.3.3 with Monocle 3) to order alpha, beta, and delta cells along separate trajectories. Starting from the merged peak by cell sparse binary matrix, we extracted beta cells and filtered out peaks that were not present in beta cells. We then preprocessed the data with latent semantic indexing (LSI) and continued onto dimensionality reduction, cell clustering, and trajectory graph learning using default parameters. We then chose the root state (i.e. the start of the trajectory) based on the highest average INS promoter accessibility. We repeated the same procedure for beta and delta cells, instead choosing the root state based on GCG or SST promoter accessibility respectively.

Comparison of endocrine cell states

To identify TF motifs variable between cell states, we performed one-sided Student’s T-test on motif z-scores between cells in each state. We adjusted p-values with the Benjamini-Hochberg procedure and defined motifs with FDR<0.05 and absolute difference in z-score>0.5 as differential. To analyze differential promoter accessibility between cell states, we performed two-sided Fisher’s exact tests between hormone-high and hormone-low states for each promoter against the null hypothesis that the promoter had similar accessibility across states. We used the Benjamini-Hochberg adjusted p-values (FDR<0.01 for alpha or beta cells and FDR<0.1 for delta cells) to identify gene promoters with differential accessibility across states. Differentially-accessible promoters were used to perform GO term enrichment on biological processes (2018 version) using Enrichr[93] (v.1.0). We filtered for terms that contained <150 genes. Details of trajectory analysis of motifs and promoters provided in Supplementary Note. We collected gene sets from Xin et al.[23], Mawla & Huising[43], and Camunas-Soler et al[44] (details in Supplementary Note). For each gene list, we performed gene set enrichment analysis[94] (GSEA) using log2 odds ratios from previous Fisher’s exact tests. We also performed GSEA using significantly differential promoters from Figure 2a as the gene sets to assess whether cell states showed concordant differences across cell types.

GWAS enrichment genome-wide with aggregate peak annotations

We used stratified LD score regression[60-62] (v.1.0.1) to calculate enrichment for GWAS traits. We obtained GWAS summary statistics for quantitative traits[11,12,48-51], diabetes[9,47], and control traits[52-59]. To create custom LD score files, we annotated variants using peaks for each cluster as a binary annotation. In addition to the annotations included in the baseline-LD model v2.2, we included LD scores estimated from merged peaks across all clusters as the background. For each trait, we used stratified LD score regression to estimate the enrichment z-scores of each annotation relative to background. We computed one-sided p-values for enrichment based on the z-scores and corrected for multiple tests using the Benjamini-Hochberg procedure.

GWAS enrichment with single cell annotations

We determined genetic enrichment of accessible chromatin profiles in individual cells for fasting glucose level[12], T2D[10], and control traits major depressive disorder[56] and systemic lupus erythematosus[54] GWAS using polyTest[95]. Additional details are provided in the Supplementary Note. To identify TFs correlated with trait enrichments, we regressed out log read depth, fraction of reads in peaks, and fraction of promoters used from the single cell enrichments. We calculated the Spearman correlation between the residuals of fasting glucose or T2D enrichment z-scores and motif enrichment z-scores using all cells or only beta cells. We used Bonferroni correction to correct p-values for multiple tests. To identify motifs directly enriched for T2D or FG association in beta cells, we identified all variants mapping in a beta cell site and using fimo[96] predicted instances of each motif in JASPAR[34] using sequence surrounding each allele. We considered variants disrupting motifs where there was a motif prediction for only one allele. We tested for enrichment of variants in a predicted motif or disrupting the motif for T2D or FG using polyTest[95].

GWAS enrichment at known T2D loci using aggregate peak annotations

We identified primary T2D risk signals[10] where the highest causal probability variant was associated with fasting glucose level[12] at genome-wide significance. We annotated variants at each signal with INShigh and INSlow beta cell sites and tested for enrichment using fgwas[64] (v0.3.6) [71] (‘-fine’). For alpha, delta and gamma cells, we retained sites that did not overlap a beta cell site and annotated variants at all T2D risk signals[9]. To exclude the possibility that enrichments could be driven by other relevant tissues, we annotated variants in liver and adipose ATAC-seq from ENCODE[86]. We tested for enrichment using fgwas[64] (‘-fine’) including other tissue annotations in the model. We considered annotations with positive enrichment estimates enriched for T2D risk.

Predicting genetic variant effects on chromatin accessibility

We used deltaSVM[65] to predict the effects of non-coding variants on chromatin accessibility in each cell type and cell state (details in Supplementary Note). From variant z-scores we calculated p-values and FDR and considered variants significant at FDR<0.1.

Luciferase gene reporter assays

We cloned sequences containing reference alleles in the forward orientation upstream of the minimal promoter of firefly luciferase vector pGL4.23 (Promega) using KpnI and SacI restriction sites. For rs34584161:A>G, we cloned the alternative allele in the same manner as the reference alleles. For other variants, we introduced the alternative alleles via site-directed mutagenesis (SDM) using the NEB Q5 Site-Directed Mutagenesis kit (New England Biolabs). SDM and reporter assay details provided in Supplementary Note. We normalized Firefly activity to Renilla activity and compared it to empty vector, and normalized results were expressed as fold change compared to control per allele. We used a two-sided Student’s T-test to compare the two alleles.

Mapping allelic imbalance within clusters

We extracted genomic DNA either from spare islet nuclei (donors 1 and 2), or acinar cells (donor 3). We used the DNeasy Blood & Tissue Kits (Qiagen) according to manufacturer’s protocol for purification of total DNA and performed genotyping on the Infinium Omni2.5–8 v1.4 array (Illumina). Additional details provided in Supplementary Note. Using cluster assignments for each cell, we split mapped reads for each donor into cluster-specific reads. We used WASP[97] (v.0.3.0) to correct for reference mapping bias at heterozygous variants. We then used a two-sided binomial test to assess imbalance assuming a null hypothesis where both alleles were equally likely to be observed. For each variant, we combined imbalance z-scores across donors using Stouffer’s z-score method with sequencing depth as a weight for each sample.

Grouping of predicted effect variants and enrichment for islet caQTLs

We categorized variants with predicted effects on alpha or beta cells based on effects across cell type and states: “alpha” (n=8,552), “beta” (n=11,650), “hormone-high” (n=12,874), and “hormone-low” (n=10,808), and “shared” (n=27,140). We also determined the concordance in the direction of effect for variants across cell states. For the set of variants with significant effects in each state, we calculated the fraction of variants where the effect allele had predicted effect in other states. We determined significance using a two-sided binomial test assuming an expected fraction of 50%. We assessed enrichment of predicted effect variants in alpha or beta cell states for islet caQTLs[28] compared to any caQTL in alpha or beta cell sites using two-sided Fisher’s exact tests. We stratified variants with predicted effects by category and assessed enrichment of caQTLs with predicted effects within each category with two-sided Fisher’s exact tests.

TF motif enrichment within predicted effect variant categories

For each cell- or state-resolved category of variants with predicted effects, we extracted 29bp sequences (± 14bp around variant) corresponding to the higher or lower predicted effect allele. We used AME[96] in MEME (v.4.12.0) to predict motif enrichment, using position weight matrices from JASPAR 2018[34]. We used sequences from the higher or lower effect alleles as the test or background set respectively. We used TFClass (http://tfclass.bioinf.med.uni-goettingen.de/) to group motifs by family. To determine TFs most likely driving the enrichment, we checked normalized promoter accessibility within family members in alpha or beta cells.

Enrichment of predicted variants for lower frequency variants

We obtained summary statistics of T2D[9] and performed LD pruning (details in Supplementary Note). We identified variants that had significant effects in any endocrine cell type, as well as for each cell type for either state. We created a background set of variants without significant effects in any endocrine cell type (FDR>0.1). We created thresholds based on T2D association p-values (5×10−8, 1×10−7, 1×10−6, 1×10−5, 1×10−4, 1×10−3). For each threshold, we calculated fold enrichment of predicted effect variants as compared to the background and determined significance using two-sided binomial tests.

Single cell chromatin co-accessibility

We used Cicero[8] to calculate co-accessibility for pairs of peaks for alpha, beta, and delta cells. As in the trajectory analysis, we started from the merged peak by cell sparse binary matrix, extracted beta cells, and filtered out peaks that were not present in beta cells. We used ‘make_cicero_cds’ to aggregate cells based on 50 nearest neighbors. We then used Cicero to calculate co-accessibility using a window size of 1 Mb and a distance constraint of 500 kb. We repeated the same procedure for alpha and delta cells. We used co-accessibility>0.05 to define pairs of peaks as co-accessible.

Enrichment of islet Hi-C and pcHi-C loops in co-accessible peaks

We obtained islet promoter capture Hi-C (pcHi-C) and merged Hi-C loops[27,71]. For both datasets, we used coordinates for anchors directly from the loop calls. To compare co-accessibility with pcHi-C or Hi-C, we used direct overlap of peaks with anchors. For binned thresholds of co-accessibility in 0.05 increments, we calculated distance-matched odds ratios for alpha, beta or delta cell co-accessible peaks containing pcHi-C or Hi-C loops versus non-co-accessible peaks (co-accessibility<0). We used two-sided Fisher’s exact tests to assess significance.

Hi-C library construction and data analysis

We performed in situ Hi-C as previously described using MboI on two batches of hESC-derived beta cells using the differentiation protocol described below cultured in high glucose (20mM mg/mL) or low glucose (5mM mg/mL). Hi-C libraries were sequenced to read counts of 1,509,428,732 and 1,918,698,012, respectively. We analyzed Hi-C using Juicer[98] with default settings and visualized with HiGlass[99]. Unless otherwise indicated, interaction frequencies were normalized using iterative matrix balancing. We generated virtual 4C tracks by extracting normalized interaction frequencies from an anchor bin of interest from the contact matrix. We performed Aggregate Peak Analysis (APA) using juicer tools with settings “-u -r 10000”.

Annotating fine-mapped diabetes risk variants

We annotated risk signals in compiled fine-mapping data for T2D (additional details in Supplementary Note). For each signal, we identified variants that were in the 99% credible set with PPA>0.01. We intersected these candidate variants with sites for each islet cell type and cell state, and then identified variants with predicted effects on the overlapping cell types/states. We annotated variants based on overlap with sites co-accessible to gene promoters. For target genes linked to diabetes risk variants we determined enriched gene sets using GSEA[94].

Analysis of INS promoter 4C data

We downloaded and re-analyzed published 4C of the INS promoter for EndoC-βH1[72] with 4C-ker[100]. We created a reduced genome using 25bp flanking sequences of BglII cutting sites. For the 3 replicates, we aligned reads to this reduced genome using bowtie2[101] (v.2.2.9; ‘-N 0 −5 20’). We extracted counts for each fragment after removing self-ligated and undigested fragments and input bedGraph files to R.4Cker. We generated normalized counts and called high interaction regions using ‘nearBaitAnalysis’ (‘k=10’).

CRISPR/Cas9-mediated genome editing in human embryonic stem cells

H1 hESCs (WA01; purchased from WiCell; NIH registration number: 0043) were seeded onto Matrigel®-coated six-well plates at a density of 50,000 cells/cm2 and maintained in mTeSR1 media (StemCell Technologies) for 3–4 days with media changed daily. Details for the generation of clonal hESC lines for enhancer deletion and allele-specific are provided in the Supplementary Note. sgRNA oligos and genotyping primers are listed in Supplementary Table 12. hESC research was approved by the University of California, San Diego, Institutional Review Board and Embryonic Stem Cell Research Oversight Committee.

Pancreatic differentiation of hESC clones

KCNQ1 enhancer-deleted hESC lines were differentiated in a suspension-based format using rotational culture with modifications to a published protocol[74] KCNQ1 base-edited hESC lines were differentiated in a suspension-based format using rotational culture with modifications to a published protocol[102]. Details on modifications are provided in the Supplementary Note. For the experimental analysis of clones, we performed flow cytometry, immunofluorescence staining, mRNA sequencing, quantitative PCR, and insulin content measurement (details of each assay in Supplementary Note).

Quality control metrics and aggregate comparison to bulk islet ATAC.

(a) Insert size distribution for aggregate reads from each snATAC-seq experiment. (b) Aggregated read coverage from each snATAC-seq experiment in a ±2 kb window around individual promoters (top) and averaged across all promoters (bottom). (c) Spearman correlation between normalized read coverage within a merged set of peaks from 3 aggregated islet snATAC-seq, 42 bulk islet ATAC-seq, and 4 bulk pancreas ATAC-seq datasets. Names of samples are from the original sources of the data. (d) Binned log10 read depth distribution for each experiment.

Flowchart of the snATAC-seq data processing pipeline

(a) Flowchart summarizing key steps of the snATAC-seq processing pipeline, including the various steps where cells were filtered out. Samples were first processed individually. All samples were then combined using a batch correction method. Clusters corresponding to cells from low quality cells, including those with low read depth in highly variable windows and low fraction of reads in peaks were then removed. After re-clustering, iterative subclustering of the main clusters at high resolution was used to identify and remove doublet subclusters. The final clusters are not driven by potential confounders such as donor of origin. Boxplot center lines, limits, and whiskers represent median, quartiles, and 1.5 IQR respectively.

Analysis of islet single cell gene expression data

(a) log10 transformed read depth or (b) total number of genes expressed compared with number of marker genes expressed per cell from scRNA-seq data. Boxplot center lines, limits, and whiskers represent median, quartiles, and 1.5 IQR respectively. Cells expressing more than one marker gene (defined by mixture models) were marked as doublets and filtered out. (c) Clusters of islet cells from single cell RNA-seq data plotted on UMAP coordinates. quies. stellate, quiescent stellate. activ. stellate, activated stellate. (d) Selected marker gene log2(expression) for each cluster plotted on UMAP coordinates. (e) Row-normalized t-statistics of marker gene specificity showing the most specific genes (t-statistic>20) for each cluster.

Comparison of motif enrichment between alpha and gamma cells

Differential enrichment of motifs between alpha cell open chromatin regions and gamma cell open chromatin regions as measured by a 2-sided T-test, with FDR calculated by the Benjamini-Hochberg procedure. Examples are highlighted of motifs enriched in alpha cells and gamma cells, respectively (MAFG, HOXA9). UMAP plots show enrichment z-scores for the indicated motifs in alpha and gamma cells. Violin plots below show the distribution of enrichment z-scores across alpha or gamma cells, where the lines represent median and quartiles.

Differentially accessible promoters across pseudo-states

(a) Pseudo-state (trajectory) values for alpha cells plotted on UMAP coordinates (left) and percentage of cells with GCG promoter accessibility decreases across 10 bins along the alpha (α) cell trajectory (right). (b) Pseudo-state (trajectory) values for beta (β) cells plotted on UMAP coordinates (left) and percentage of cells with INS promoter accessibility decreases across 10 bins along the beta cell trajectory (right). (c) Pseudo-state (trajectory) values for delta (δ) cells plotted on UMAP coordinates (left) and percentage of cells with SST promoter accessibility decreases across 10 bins along the beta cell trajectory (right). (d) Heatmaps showing promoters with dynamic accessibility across trajectories for alpha (top), beta (middle) and delta (bottom) cell trajectories. Gene promoters are clustered into 4 groups for each trajectory with k-medoids clustering. Enriched gene ontology for each k-medoid cluster (left) and selected genes present in at least one enriched gene ontology.

Single cell GWAS enrichment and correlation with TF motifs

(a) Single cell GWAS enrichment z-scores for Major depressive disorder and Systemic lupus erythematosus projected onto UMAP coordinates (left panels), z-score enrichment distribution per cell type and state (middle panels) and z-score enrichment distribution split into 10 bins based on beta cell trajectory values (right panels). Boxplot center lines, limits, and whiskers represent median, quartiles, and 1.5 IQR respectively. (b) Correlation between single cell GWAS enrichment z-scores for Type 2 Diabetes and chromVAR TF motif enrichment z-scores across either all cells (left) or beta cells (right). Inset scatterplots highlight the top correlated motifs in either direction. (c) Variants mapping directly in sequence motifs positively correlated with T2D risk in beta cells are enriched for T2D association, whereas variants mapping in motifs negatively correlated with T2D risk in beta cells show no such enrichment. Values represent effect size and SE.

Single cell co-accessibility analyses in islet cell types

(a) Distance-matched odds that delta cell co-accessibility links overlap islet pcHi-C chromatin loops at different co-accessibility threshold bins in 0.05 intervals demonstrate that co-accessible links are enriched for chromatin interactions. (b) Same analysis as in (a) but with alpha cell co-accessibility. (c) Same analysis as in (a) but with beta cell co-accessibility and Hi-C loops. (d) Same analysis as in (a) but with delta cell co-accessibility and Hi-C loops. (e) Same analysis as in (a) but with alpha cell co-accessibility and Hi-C loops. (f) Number of distal sites linked to each promoter peak for alpha, beta, and delta cells. (g) Number of promoters linked to each distal site for alpha, beta, and delta cells.

Cell type-specific and shared co-accessible sites

(a) An example of co-accessibility anchored at the promoter for the delta cell identity TF HHEX. Co-accessibility for beta, delta, and alpha cells are shown compared to high-confidence pcHi-C loops from ensemble islets. Genome browser plots scale: 0–10. (b) An example of co-accessibility anchored at the promoter for the alpha cell identity TF ARX. (c) An example of shared co-accessibility anchored at the promoter for the shared islet identity TF NEUROD1.

3D chromatin interactions at the T2D-associated KCNQ1 locus

Top panels show Hi-C contact matrices from hESC-derived beta cells, visualized at 25 kb resolution. Region shown is chr11:500,00–4,500,000, hg19. Black arrows indicate putative interaction point of INS TSS and KCNQ1 enhancer. Genome browser plot below shows a zoomed view of chr11:1,750,000–3,250,000. Data from 4C-seq anchored on the INS promoter in EndoC-βH1 cells (Jian & Felsenfeld 2018) is shown, as analyzed with the 4C-ker package. Normalized read counts are shown in black from 3 biological replicates. Significant interactions from INS promoter are shown as arcs below read counts tracks. Interactions calls are from data pooled across 3 replicates are shown here. The region containing the KCNQ1 enhancer was called as a significant interaction region with INS promotor independently in each 4C replicate. Virtual 4C plots in green show log(normalized Hi-C interaction frequency) from INS promoter.

Genome editing of the KCNQ1 locus in hESCs

(a) Schematic of the workflow and (b) Sanger sequencing for KCNQ1 enhancer deletion in three independent hESC clones. (c) Representative figures of flow cytometry analysis for NKX6–1 and INS comparing control and KCNQ1ΔEnh cells (left). Quantification of the percentage of NKX6–1+/INS+ cells in beta cell stage cultures from control (n=6; 2 clones × 3 differentiations) and KCNQ1∆Enh (n=9; 3 clones × 3 differentiations) cells (right). Values represent mean and SEM. ns, not significant by two-sided Student’s T-test without adjustment for multiple comparisons. (d) Schematic of the workflow and (e) Sanger sequencing for two independent KCNQ1G/G clones and three KCNQ1A/A clones. (f) Representative figures of flow cytometry analysis for NKX6–1 and INS comparing KCNQ1G/G and KCNQ1A/A clones (left). of the percentage of NKX6–1+/INS+ cells in beta cell stage cultures from KCNQ1G/G (n=6; 2 clones × 3 differentiations) and KCNQ1A/A (n=9; 3 clones × 3 differentiations) cells (right). ns, not significant by two-sided Student’s T-test without adjustment for multiple comparisons. Values represent mean and SEM.
  101 in total

1.  Chromatin stretch enhancer states drive cell-specific gene regulation and harbor human disease risk variants.

Authors:  Stephen C J Parker; Michael L Stitzel; D Leland Taylor; Jose Miguel Orozco; Michael R Erdos; Jennifer A Akiyama; Kelly Lammerts van Bueren; Peter S Chines; Narisu Narisu; Brian L Black; Axel Visel; Len A Pennacchio; Francis S Collins
Journal:  Proc Natl Acad Sci U S A       Date:  2013-10-14       Impact factor: 11.205

Review 2.  Transcriptional enhancers: from properties to genome-wide predictions.

Authors:  Daria Shlyueva; Gerald Stampfel; Alexander Stark
Journal:  Nat Rev Genet       Date:  2014-03-11       Impact factor: 53.242

Review 3.  Beta cell heterogeneity: an evolving concept.

Authors:  Dana Avrahami; Agnes Klochendler; Yuval Dor; Benjamin Glaser
Journal:  Diabetologia       Date:  2017-06-08       Impact factor: 10.122

4.  Cicero Predicts cis-Regulatory DNA Interactions from Single-Cell Chromatin Accessibility Data.

Authors:  Hannah A Pliner; Jonathan S Packer; José L McFaline-Figueroa; Darren A Cusanovich; Riza M Daza; Delasa Aghamirzaie; Sanjay Srivatsan; Xiaojie Qiu; Dana Jackson; Anna Minkina; Andrew C Adey; Frank J Steemers; Jay Shendure; Cole Trapnell
Journal:  Mol Cell       Date:  2018-08-02       Impact factor: 17.970

Review 5.  Transcriptional control of endothelial cell development.

Authors:  Sarah De Val; Brian L Black
Journal:  Dev Cell       Date:  2009-02       Impact factor: 12.270

6.  Enrichr: a comprehensive gene set enrichment analysis web server 2016 update.

Authors:  Maxim V Kuleshov; Matthew R Jones; Andrew D Rouillard; Nicolas F Fernandez; Qiaonan Duan; Zichen Wang; Simon Koplev; Sherry L Jenkins; Kathleen M Jagodnik; Alexander Lachmann; Michael G McDermott; Caroline D Monteiro; Gregory W Gundersen; Avi Ma'ayan
Journal:  Nucleic Acids Res       Date:  2016-05-03       Impact factor: 16.971

7.  Genome-wide association identifies nine common variants associated with fasting proinsulin levels and provides new insights into the pathophysiology of type 2 diabetes.

Authors:  Rona J Strawbridge; Josée Dupuis; Inga Prokopenko; Adam Barker; Emma Ahlqvist; Denis Rybin; John R Petrie; Mary E Travers; Nabila Bouatia-Naji; Antigone S Dimas; Alexandra Nica; Eleanor Wheeler; Han Chen; Benjamin F Voight; Jalal Taneera; Stavroula Kanoni; John F Peden; Fabiola Turrini; Stefan Gustafsson; Carina Zabena; Peter Almgren; David J P Barker; Daniel Barnes; Elaine M Dennison; Johan G Eriksson; Per Eriksson; Elodie Eury; Lasse Folkersen; Caroline S Fox; Timothy M Frayling; Anuj Goel; Harvest F Gu; Momoko Horikoshi; Bo Isomaa; Anne U Jackson; Karen A Jameson; Eero Kajantie; Julie Kerr-Conte; Teemu Kuulasmaa; Johanna Kuusisto; Ruth J F Loos; Jian'an Luan; Konstantinos Makrilakis; Alisa K Manning; María Teresa Martínez-Larrad; Narisu Narisu; Maria Nastase Mannila; John Ohrvik; Clive Osmond; Laura Pascoe; Felicity Payne; Avan A Sayer; Bengt Sennblad; Angela Silveira; Alena Stancáková; Kathy Stirrups; Amy J Swift; Ann-Christine Syvänen; Tiinamaija Tuomi; Ferdinand M van 't Hooft; Mark Walker; Michael N Weedon; Weijia Xie; Björn Zethelius; Halit Ongen; Anders Mälarstig; Jemma C Hopewell; Danish Saleheen; John Chambers; Sarah Parish; John Danesh; Jaspal Kooner; Claes-Göran Ostenson; Lars Lind; Cyrus C Cooper; Manuel Serrano-Ríos; Ele Ferrannini; Tom J Forsen; Robert Clarke; Maria Grazia Franzosi; Udo Seedorf; Hugh Watkins; Philippe Froguel; Paul Johnson; Panos Deloukas; Francis S Collins; Markku Laakso; Emmanouil T Dermitzakis; Michael Boehnke; Mark I McCarthy; Nicholas J Wareham; Leif Groop; François Pattou; Anna L Gloyn; George V Dedoussis; Valeriya Lyssenko; James B Meigs; Inês Barroso; Richard M Watanabe; Erik Ingelsson; Claudia Langenberg; Anders Hamsten; Jose C Florez
Journal:  Diabetes       Date:  2011-08-26       Impact factor: 9.461

8.  Pancreatic islet enhancer clusters enriched in type 2 diabetes risk-associated variants.

Authors:  Lorenzo Pasquali; Kyle J Gaulton; Santiago A Rodríguez-Seguí; Loris Mularoni; Irene Miguel-Escalada; İldem Akerman; Juan J Tena; Ignasi Morán; Carlos Gómez-Marín; Martijn van de Bunt; Joan Ponsa-Cobas; Natalia Castro; Takao Nammo; Inês Cebola; Javier García-Hurtado; Miguel Angel Maestro; François Pattou; Lorenzo Piemonti; Thierry Berney; Anna L Gloyn; Philippe Ravassard; José Luis Gómez Skarmeta; Ferenc Müller; Mark I McCarthy; Jorge Ferrer
Journal:  Nat Genet       Date:  2014-01-12       Impact factor: 38.330

9.  MEME SUITE: tools for motif discovery and searching.

Authors:  Timothy L Bailey; Mikael Boden; Fabian A Buske; Martin Frith; Charles E Grant; Luca Clementi; Jingyuan Ren; Wilfred W Li; William S Noble
Journal:  Nucleic Acids Res       Date:  2009-05-20       Impact factor: 16.971

10.  Large-scale association analyses identify new loci influencing glycemic traits and provide insight into the underlying biological pathways.

Authors:  Robert A Scott; Vasiliki Lagou; Ryan P Welch; Eleanor Wheeler; May E Montasser; Jian'an Luan; Reedik Mägi; Rona J Strawbridge; Emil Rehnberg; Stefan Gustafsson; Stavroula Kanoni; Laura J Rasmussen-Torvik; Loïc Yengo; Cecile Lecoeur; Dmitry Shungin; Serena Sanna; Carlo Sidore; Paul C D Johnson; J Wouter Jukema; Toby Johnson; Anubha Mahajan; Niek Verweij; Gudmar Thorleifsson; Jouke-Jan Hottenga; Sonia Shah; Albert V Smith; Bengt Sennblad; Christian Gieger; Perttu Salo; Markus Perola; Nicholas J Timpson; David M Evans; Beate St Pourcain; Ying Wu; Jeanette S Andrews; Jennie Hui; Lawrence F Bielak; Wei Zhao; Momoko Horikoshi; Pau Navarro; Aaron Isaacs; Jeffrey R O'Connell; Kathleen Stirrups; Veronique Vitart; Caroline Hayward; Tõnu Esko; Evelin Mihailov; Ross M Fraser; Tove Fall; Benjamin F Voight; Soumya Raychaudhuri; Han Chen; Cecilia M Lindgren; Andrew P Morris; Nigel W Rayner; Neil Robertson; Denis Rybin; Ching-Ti Liu; Jacques S Beckmann; Sara M Willems; Peter S Chines; Anne U Jackson; Hyun Min Kang; Heather M Stringham; Kijoung Song; Toshiko Tanaka; John F Peden; Anuj Goel; Andrew A Hicks; Ping An; Martina Müller-Nurasyid; Anders Franco-Cereceda; Lasse Folkersen; Letizia Marullo; Hanneke Jansen; Albertine J Oldehinkel; Marcel Bruinenberg; James S Pankow; Kari E North; Nita G Forouhi; Ruth J F Loos; Sarah Edkins; Tibor V Varga; Göran Hallmans; Heikki Oksa; Mulas Antonella; Ramaiah Nagaraja; Stella Trompet; Ian Ford; Stephan J L Bakker; Augustine Kong; Meena Kumari; Bruna Gigante; Christian Herder; Patricia B Munroe; Mark Caulfield; Jula Antti; Massimo Mangino; Kerrin Small; Iva Miljkovic; Yongmei Liu; Mustafa Atalay; Wieland Kiess; Alan L James; Fernando Rivadeneira; Andre G Uitterlinden; Colin N A Palmer; Alex S F Doney; Gonneke Willemsen; Johannes H Smit; Susan Campbell; Ozren Polasek; Lori L Bonnycastle; Serge Hercberg; Maria Dimitriou; Jennifer L Bolton; Gerard R Fowkes; Peter Kovacs; Jaana Lindström; Tatijana Zemunik; Stefania Bandinelli; Sarah H Wild; Hanneke V Basart; Wolfgang Rathmann; Harald Grallert; Winfried Maerz; Marcus E Kleber; Bernhard O Boehm; Annette Peters; Peter P Pramstaller; Michael A Province; Ingrid B Borecki; Nicholas D Hastie; Igor Rudan; Harry Campbell; Hugh Watkins; Martin Farrall; Michael Stumvoll; Luigi Ferrucci; Dawn M Waterworth; Richard N Bergman; Francis S Collins; Jaakko Tuomilehto; Richard M Watanabe; Eco J C de Geus; Brenda W Penninx; Albert Hofman; Ben A Oostra; Bruce M Psaty; Peter Vollenweider; James F Wilson; Alan F Wright; G Kees Hovingh; Andres Metspalu; Matti Uusitupa; Patrik K E Magnusson; Kirsten O Kyvik; Jaakko Kaprio; Jackie F Price; George V Dedoussis; Panos Deloukas; Pierre Meneton; Lars Lind; Michael Boehnke; Alan R Shuldiner; Cornelia M van Duijn; Andrew D Morris; Anke Toenjes; Patricia A Peyser; John P Beilby; Antje Körner; Johanna Kuusisto; Markku Laakso; Stefan R Bornstein; Peter E H Schwarz; Timo A Lakka; Rainer Rauramaa; Linda S Adair; George Davey Smith; Tim D Spector; Thomas Illig; Ulf de Faire; Anders Hamsten; Vilmundur Gudnason; Mika Kivimaki; Aroon Hingorani; Sirkka M Keinanen-Kiukaanniemi; Timo E Saaristo; Dorret I Boomsma; Kari Stefansson; Pim van der Harst; Josée Dupuis; Nancy L Pedersen; Naveed Sattar; Tamara B Harris; Francesco Cucca; Samuli Ripatti; Veikko Salomaa; Karen L Mohlke; Beverley Balkau; Philippe Froguel; Anneli Pouta; Marjo-Riitta Jarvelin; Nicholas J Wareham; Nabila Bouatia-Naji; Mark I McCarthy; Paul W Franks; James B Meigs; Tanya M Teslovich; Jose C Florez; Claudia Langenberg; Erik Ingelsson; Inga Prokopenko; Inês Barroso
Journal:  Nat Genet       Date:  2012-08-12       Impact factor: 38.330

View more
  19 in total

1.  Variant to function mapping at single-cell resolution through network propagation.

Authors:  Fulong Yu; Liam D Cato; Chen Weng; L Alexander Liggett; Soyoung Jeon; Keren Xu; Charleston W K Chiang; Joseph L Wiemels; Jonathan S Weissman; Adam J de Smith; Vijay G Sankaran
Journal:  Nat Biotechnol       Date:  2022-06-06       Impact factor: 54.908

2.  A multi-omics roadmap of β-cell failure in type 2 diabetes mellitus.

Authors:  Gaowei Wang; Maike Sander
Journal:  Nat Rev Endocrinol       Date:  2021-11       Impact factor: 43.330

Review 3.  Characterizing cis-regulatory elements using single-cell epigenomics.

Authors:  Sebastian Preissl; Kyle J Gaulton; Bing Ren
Journal:  Nat Rev Genet       Date:  2022-07-15       Impact factor: 59.581

4.  Polygenic enrichment distinguishes disease associations of individual cells in single-cell RNA-seq data.

Authors:  Martin Jinye Zhang; Kangcheng Hou; Kushal K Dey; Saori Sakaue; Karthik A Jagadeesh; Kathryn Weinand; Aris Taychameekiatchai; Poorvi Rao; Angela Oliveira Pisco; James Zou; Bruce Wang; Michael Gandal; Soumya Raychaudhuri; Bogdan Pasaniuc; Alkes L Price
Journal:  Nat Genet       Date:  2022-09-01       Impact factor: 41.307

5.  Single-nucleus chromatin accessibility profiling highlights regulatory mechanisms of coronary artery disease risk.

Authors:  Adam W Turner; Shengen Shawn Hu; Jose Verdezoto Mosquera; Wei Feng Ma; Chani J Hodonsky; Doris Wong; Gaëlle Auguste; Yipei Song; Katia Sol-Church; Emily Farber; Soumya Kundu; Anshul Kundaje; Nicolas G Lopez; Lijiang Ma; Saikat Kumar B Ghosh; Suna Onengut-Gumuscu; Euan A Ashley; Thomas Quertermous; Aloke V Finn; Nicholas J Leeper; Jason C Kovacic; Johan L M Björkegren; Chongzhi Zang; Clint L Miller
Journal:  Nat Genet       Date:  2022-05-19       Impact factor: 41.307

6.  Single-cell multiomics analysis reveals regulatory programs in clear cell renal cell carcinoma.

Authors:  Zhilin Long; Chengfang Sun; Min Tang; Yin Wang; Jiayan Ma; Jichuan Yu; Jingchao Wei; Jianzhu Ma; Bohan Wang; Qi Xie; Jiaming Wen
Journal:  Cell Discov       Date:  2022-07-19       Impact factor: 38.079

7.  Interpreting type 1 diabetes risk with genetics and single-cell epigenomics.

Authors:  Joshua Chiou; Ryan J Geusz; Mei-Lin Okino; Jee Yun Han; Michael Miller; Rebecca Melton; Elisha Beebe; Paola Benaglio; Serina Huang; Katha Korgaonkar; Sandra Heller; Alexander Kleger; Sebastian Preissl; David U Gorkin; Maike Sander; Kyle J Gaulton
Journal:  Nature       Date:  2021-05-19       Impact factor: 49.962

8.  Heterogenous impairment of α cell function in type 2 diabetes is linked to cell maturation state.

Authors:  Xiao-Qing Dai; Joan Camunas-Soler; Linford J B Briant; Theodore Dos Santos; Aliya F Spigelman; Emily M Walker; Rafael Arrojo E Drigo; Austin Bautista; Robert C Jones; Dana Avrahami; James Lyon; Aifang Nie; Nancy Smith; Yongneng Zhang; Janyne Johnson; Jocelyn E Manning Fox; Evangelos D Michelakis; Peter E Light; Klaus H Kaestner; Seung K Kim; Patrik Rorsman; Roland W Stein; Stephen R Quake; Patrick E MacDonald
Journal:  Cell Metab       Date:  2022-02-01       Impact factor: 27.287

Review 9.  Epigenome rewiring in human pluripotent stem cells.

Authors:  Jielin Yan; Danwei Huangfu
Journal:  Trends Cell Biol       Date:  2021-12-23       Impact factor: 20.808

Review 10.  Beyond GWAS: from simple associations to functional insights.

Authors:  Kazuyoshi Ishigaki
Journal:  Semin Immunopathol       Date:  2021-10-04       Impact factor: 9.623

View more

北京卡尤迪生物科技股份有限公司 © 2022-2023.