Zhanying Feng, Zhana Duren, Ziyi Xiong, Sijia Wang, Fan Liu, Wing Hung Wong, Yong Wang.
Abstract
Cranial Neural Crest Cells (CNCC) originate at the cephalic region from forebrain, midbrain and hindbrain, migrate into the developing craniofacial region, and subsequently differentiate into multiple cell types. The entire specification, delamination, migration, and differentiation process is highly regulated, and abnormalities during this craniofacial development cause birth defects. To better understand the molecular networks underlying CNCC, we integrate paired gene expression and chromatin accessibility data and reconstruct the genome-wide human Regulatory network of CNCC (hReg-CNCC). Consensus optimization predicts high-quality regulations and reveals the architecture of upstream, core, and downstream transcription factors that are associated with functions of neural plate border, specification, and migration. hReg-CNCC allows us to annotate genetic variants of human facial GWAS and disease traits with associated cis-regulatory modules, transcription factors, and target genes. For example, we reveal the distal and combinatorial regulation of multiple SNPs to core TF ALX1 and associations to facial distances and cranial rare disease. In addition, hReg-CNCC connects the DNA sequence differences in evolution, such as ultra-conserved elements and human accelerated regions, with gene expression and phenotype. hReg-CNCC provides a valuable resource to interpret genetic variants as early as gastrulation during embryonic development. The network resources are available at https://github.com/AMSSwanglab/hReg-CNCC.
Year: 2021 PMID: 33824393 PMCID: PMC8024315 DOI: 10.1038/s42003-021-01970-0
Source DB: PubMed Journal: Commun Biol ISSN: 2399-3642
Fig. 1 Schematic overview of inferring the human regulatory network of CNCC (hReg-CNCC) based on paired gene expression and chromatin accessibility data.
High-quality hReg-CNCC is reconstructed by a two-step framework. Step 1: PECA2 infers context-specific regulatory networks from biological replicates (see “Methods” for details). Specifically, the CNCC replicates, each with paired RNA-seq and ATAC-seq data, are input into PECA2, which outputs R regulatory networks. We define the CRM associated with a TF-TG pair as the set of REs bound by the TF to regulate the TG. Each network is described by its TF-TG regulatory strengths and its CRMs, i.e., the TF-CRM-TG triplets. Step 2: Consensus optimization integrates the R regulatory networks and outputs hReg-CNCC with reliable regulatory strengths and reproducible CRMs. hReg-CNCC serves as a valuable resource to interpret genetic variants from face GWAS, comparative genomics, and disease studies.
Fig. 2 Validation of hReg-CNCC by independent data sources shows that consensus optimization outperforms the alternative methods.
a Consensus optimization achieves significantly higher precision, recall, and F1 measure than single networks. The improvement is robust to parameter choices. N = 6 for single networks and N = 6 for consensus optimization. A one-tailed t-test is conducted to obtain P-values. b Consensus optimization outperforms the naive union and intersection methods in precision, recall, and F1 measure. c hReg-CNCC predicts the ChIP-seq binding sites of two master regulators (TFAP2A and NR2F1) better than single networks. d Using human-biased differentially expressed genes as the gold standard, hReg-CNCC predicts the target genes of human-biased enhancers more accurately than the ABC model and the proximity-based method. e hReg-CNCC predicts 33% of enhancer-gene relationships as distal regulation for human-biased enhancers, which cannot be found by the ABC model or the proximity-based method. f RNA-seq, ATAC-seq, and H3K27ac ChIP-seq tracks around ROBO3. hReg-CNCC predicts ROBO3 as the target gene of a distal human-biased enhancer (compare the human and chimpanzee ATAC-seq and H3K27ac tracks), which is located near the gene body of HEPACAM. The expression pattern of ROBO3 supports this target assignment (compare the human and chimpanzee RNA-seq tracks). g Capture-C and RE tracks around SOX9. Two REs are predicted by hReg-CNCC to regulate SOX9. One RE is in the promoter of SOX9, and the distal RE is validated by a Capture-C loop anchored at SOX9.
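The naive union and intersection baselines in panel b, and the precision, recall, and F1 measures used throughout this figure, can be sketched as follows. This is a minimal illustration and not the authors' code: it assumes each replicate network is reduced to a set of (TF, TG) edges and that the gold standard is also a set of edges. The function names are hypothetical, and the consensus optimization itself is not reproduced here.

```python
from typing import Iterable, Set, Tuple

Edge = Tuple[str, str]  # (TF, TG); assumed representation, not from the paper


def union_network(networks: Iterable[Set[Edge]]) -> Set[Edge]:
    """Naive union: keep an edge if it appears in any replicate network."""
    result: Set[Edge] = set()
    for net in networks:
        result |= net
    return result


def intersection_network(networks: Iterable[Set[Edge]]) -> Set[Edge]:
    """Naive intersection: keep an edge only if every replicate has it."""
    nets = list(networks)
    if not nets:
        return set()
    result = set(nets[0])
    for net in nets[1:]:
        result &= net
    return result


def precision_recall_f1(predicted: Set[Edge], gold: Set[Edge]) -> Tuple[float, float, float]:
    """Score a predicted edge set against a gold-standard edge set."""
    true_positives = len(predicted & gold)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1
```

On toy data, the union typically gains recall at the cost of precision and the intersection does the opposite, which is the trade-off a consensus method aims to improve on.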
Fig. 3 Architecture of hReg-CNCC reveals the regulatory hierarchy in CNCC.
a Heatmap of the hReg-CNCC regulatory strength adjacency matrix shows two TF-TG modules with different regulatory patterns. The x-axis denotes TGs and the y-axis denotes TFs. TFs in Module 1 tend to regulate a large number of TGs. TFs in Module 2 specifically regulate a subset of TGs. b The dense network extracted from hReg-CNCC for the TFs in Module 1 and Module 2 shows a clear hierarchical structure. TFs in Module 1 tend to be upstream and core regulators. TFs in Module 2 tend to be downstream TFs associated with CNCC-specific development, differentiation, and migration. The hierarchy is consistent with GO function enrichment results. c Heatmap of the TF-TF regulatory strength matrix shows a consistent two-module structure of hReg-CNCC. d Overlap of TFs in Module 1 and Module 2 with the known CNCC pathways. A hypergeometric test is conducted to obtain the P-values. e Overlap of TFs, TGs, REs, and TF-TG regulations between hReg-CNCC-H9 and hReg-CNCC. f Heatmap of hReg-CNCC-H9 reveals a two-module architecture. g The TFs of the two modules are significantly shared by hReg-CNCC-H9 and hReg-CNCC. A hypergeometric test is conducted to obtain the P-values.
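The hypergeometric overlap tests in panels d and g can be illustrated with a short standard-library sketch. It assumes the usual one-sided (enrichment) form, where the P-value is the probability of seeing at least the observed overlap by chance. The choice of background population (for example, all TFs considered) is an assumption here, since the caption does not state it, and the function name is hypothetical.

```python
from math import comb


def hypergeom_enrichment_p(overlap: int, population: int, successes: int, draws: int) -> float:
    """One-sided hypergeometric P-value, P(X >= overlap).

    population: size of the background set (assumed, e.g. all TFs considered)
    successes:  background items carrying the annotation (e.g. pathway TFs)
    draws:      size of the tested set (e.g. TFs in one module)
    overlap:    annotated items observed in the tested set
    """
    lower = max(overlap, 0)
    upper = min(successes, draws)
    total = comb(population, draws)
    # Sum the probability mass for every overlap at least as large as observed.
    tail = sum(
        comb(successes, k) * comb(population - successes, draws - k)
        for k in range(lower, upper + 1)
    )
    return tail / total
```

For instance, with a background of 10 items of which 4 are annotated, drawing 3 and observing 2 annotated items gives a P-value of 1/3, so such an overlap would not count as enriched.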
TFs in Module 1 are annotated as CNCC markers, pathways and other CNCC TFs.
| TFs in Module 1 | Annotation | |||
|---|---|---|---|---|
| Other CNCC TFs | ||||
TFs in Module 2 are enriched in CNCC-specific developmental terms.
| Function | −log(P-value) | Associated TFs in Module 2 |
|---|---|---|
| Chordate embryonic development | 29.93 | |
| Mesenchyme development | 21.28 | |
| Skeletal system development | 21.18 | |
| Tissue morphogenesis | 18.91 | |
| Response to growth factor | 18.49 | |
| Sensory organ development | 26.85 | |
| Vasculature development | 14.86 | |
| Ossification | 14.18 | |
| Regulation of animal organ morphogenesis | 13.78 | |
| Regulation of neuron differentiation | 12.72 | |
| Muscle structure development | 12.01 | |
| Appendage morphogenesis | 11.87 | |
| Heart development | 11.84 | |
| Mesenchyme morphogenesis | 11.78 | |
Genes in bold are markers of the function listed in the first column.
Fig. 4 hReg-CNCC identifies causal regulations for genetic variants and reveals biological insights for genotype and phenotype mapping.
a Face GWAS SNPs are more enriched in the CRMs of hReg-CNCC than in CNCC ATAC-seq peaks, as measured by fold change and –log(P-value). The fold change for other tissues is the mean of 27 samples (Supplementary Data 4). b Procedure to associate TF-CRM-TG triplets with significant SNPs passing the threshold. If a SNP is located in a CRM, the TF-CRM-TG triplet is linked to this SNP and forms a SNP-associated TF-CRM-TG triplet. c The face SNP-associated TF-CRM-TG network. In the network, the REs associated with SNPs are shown instead of the whole CRM. The colors of SNPs, REs, and TGs indicate the different facial sub-phenotypes illustrated in the right corner. d Ranking the regulatory strength in the face SNP-associated network shows that ALX1 is a pivotal regulator. e RNA-seq, ATAC-seq, H3K27ac, and RE tracks around ALX1. Two SNPs are located in REs of ALX1: one in the promoter, the other in a cis-regulatory region 97 kb upstream. The SNP in the promoter changes the binding affinity of the TCF cluster, and the SNP in the distal RE changes the binding affinity of IRX3 and FOXM1. These changes causally alter the expression level of ALX1 and, in turn, the face phenotype, given that ALX1 is the master regulator of CNCC migration. f, g Examples of two types of multi-trait effects of SNPs in the SNP-associated network. h Example of cooperation among multiple SNPs in the SNP-associated network. EnR Right Endocanthion, EnL Left Endocanthion, Prn Pronasale, AlL Left Alare.
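The linking procedure in panel b can be sketched as a point-in-interval lookup. This is a minimal illustration under stated assumptions, not the authors' pipeline: REs are taken to be 0-based half-open intervals (BED-like), a CRM is a tuple of REs, and the significance cutoff `p_threshold` defaults to the conventional genome-wide value of 5e-8, which the caption does not specify. All class and function names are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class RegulatoryElement:
    chrom: str
    start: int  # 0-based, inclusive (assumed convention)
    end: int    # exclusive


@dataclass(frozen=True)
class Triplet:
    tf: str
    tg: str
    crm: Tuple[RegulatoryElement, ...]  # REs bound by the TF to regulate the TG


@dataclass(frozen=True)
class Snp:
    rsid: str
    chrom: str
    pos: int       # 0-based position (assumed convention)
    pvalue: float  # GWAS association P-value


def link_snps_to_triplets(
    snps: List[Snp], triplets: List[Triplet], p_threshold: float = 5e-8
) -> List[Tuple[str, str, RegulatoryElement, str]]:
    """Return (rsid, TF, RE, TG) for each significant SNP lying inside an RE of a CRM."""
    links = []
    for snp in snps:
        if snp.pvalue > p_threshold:
            continue  # keep only SNPs passing the significance threshold
        for triplet in triplets:
            for element in triplet.crm:
                if element.chrom == snp.chrom and element.start <= snp.pos < element.end:
                    links.append((snp.rsid, triplet.tf, element, triplet.tg))
    return links
```

Returning the specific RE that contains the SNP, rather than the whole CRM, mirrors how panel c displays only the SNP-associated REs. For genome-scale inputs, an interval tree or sorted sweep would replace the nested loops.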
Fig. 5 hReg-CNCC provides a mechanistic understanding of human face-related diseases.
a 18 face-associated traits in the GWAS Catalog are scanned, and 6 (1/3) can be explained by hReg-CNCC with an associated RE and TF-TG regulation. TFs were filtered to select the top TFs (details in “Methods”). b rs11609609 is associated with frontonasal distances. c Detailed regulation of rs11609609 and ALX1, which influences the frontonasal face and “Monobrow”. d The upstream TFs and CRM of BAZ1B in hReg-CNCC support BAZ1B as a causal gene associated with the rare disease Williams-Beuren Syndrome (WBS). Two SNPs, rs73134905 and rs62466263, are located in the downstream enhancer and the promoter (two REs in the CRM) of BAZ1B. These two SNPs are most strongly associated with the face width phenotype in GWAS. The face width phenotype is consistent with the wider face symptom of WBS patients.
Fig. 6 hReg-CNCC interprets the DNA difference in evolution and uncovers important regulatory elements and genes.
a The scheme to extract the subnetwork in hReg-CNCC associated with human evolutionarily important elements from comparative genomics. If a human evolutionarily important element overlaps a CRM, the TF-CRM-TG triplet is extracted and pooled into a subnetwork. b The UCE-associated network. Instead of the whole CRM, only the REs associated with evolutionary elements are shown. Vista enhancer and literature evidence are annotated and support their importance in face development.
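The extraction scheme in panel a amounts to an interval-overlap filter over the triplets. The sketch below is illustrative only: it assumes intervals are (chrom, start, end) tuples with half-open coordinates, so that merely touching intervals do not count as overlapping, and it keeps only the overlapping REs, as panel b does. The function names are hypothetical.

```python
from typing import List, Tuple

Interval = Tuple[str, int, int]  # (chrom, start, end), half-open (assumed convention)
TripletRow = Tuple[str, str, List[Interval]]  # (TF, TG, REs of the CRM)


def intervals_overlap(a: Interval, b: Interval) -> bool:
    """True if two half-open intervals on the same chromosome share at least one base."""
    return a[0] == b[0] and a[1] < b[2] and b[1] < a[2]


def extract_subnetwork(triplets: List[TripletRow], elements: List[Interval]) -> List[TripletRow]:
    """Keep triplets whose CRM overlaps any element, retaining only the overlapping REs."""
    subnetwork = []
    for tf, tg, crm in triplets:
        hits = [re_ for re_ in crm if any(intervals_overlap(re_, el) for el in elements)]
        if hits:
            subnetwork.append((tf, tg, hits))
    return subnetwork
```

The same function applies unchanged to any element class given as intervals, such as ultra-conserved elements or human accelerated regions.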