David Bray1,2,3,4, Heather Hook1,2,4, Rose Zhao1,2, Jessica L Keenan1,2,3, Ashley Penvose1,2, Yemi Osayame1,2, Nima Mohaghegh1,2, Xiaoting Chen5, Sreeja Parameswaran5, Leah C Kottyan5,6,7, Matthew T Weirauch5,8,9,6, Trevor Siggers1,2,10,11.
Abstract
Non-coding DNA variants (NCVs) impact gene expression by altering binding sites for regulatory complexes. New high-throughput methods are needed to characterize the impact of NCVs on regulatory complexes. We developed CASCADE (Customizable Approach to Survey Complex Assembly at DNA Elements), an array-based high-throughput method to profile cofactor (COF) recruitment. CASCADE identifies DNA-bound transcription factor-cofactor (TF-COF) complexes in nuclear extracts and quantifies the impact of NCVs on their binding. We demonstrate CASCADE sensitivity in characterizing condition-specific recruitment of COFs p300 and RBBP5 (MLL subunit) to the CXCL10 promoter in lipopolysaccharide (LPS)-stimulated human macrophages and quantify the impact of all possible NCVs. To demonstrate applicability to NCV screens, we profile TF-COF binding to ~1,700 single-nucleotide polymorphism quantitative trait loci (SNP-QTLs) in human macrophages and identify perturbed ETS domain-containing complexes. CASCADE will facilitate high-throughput testing of molecular mechanisms of NCVs for diverse biological applications.
Year: 2022 PMID: 35252945 PMCID: PMC8896503 DOI: 10.1016/j.xgen.2022.100098
Source DB: PubMed Journal: Cell Genom ISSN: 2666-979X
Figure 1. Customizable approach to survey complex assembly at DNA elements (CASCADE) method and applications
(A) Cofactors (COFs) affect transcription and chromatin state.
(B) COF recruitment to DNA is assayed using a DNA microarray on nuclear extracts from a cell type of interest. COF recruitment is assayed to a “seed” probe (e.g., genomic-derived TF binding site sequence) and all single variant (SV) probes. As shown in the confetti plots, COF recruitment to single variant probes yields nucleotide preferences along DNA sequence. Preferences are transformed to a COF recruitment motif (i.e., a COF recruitment logo). COF recruitment logos are matched to TF motif databases to infer TF identity.
(C) Overview of CASCADE applications. CASCADE can be applied to cis-regulatory elements (CREs) or to single-nucleotide polymorphism (SNP) pairs (reference [REF] and non-reference [non-REF] probes). Reference probes relate to the genomic consensus nucleotide sequence and non-reference to the phenotype-associated nucleotide variant. For CREs, tiling probes are used to span the genomic region and COF motifs for each tiling probe are integrated into a CRE-wide COF motif. For SNP pairs, COF recruitment motifs are determined for both and compared. IRF, interferon response factor; ETS, Erythroblast transformation specific.
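The seed and single-variant (SV) probe design in (B) can be illustrated with a minimal Python sketch. It enumerates every SV probe of a seed sequence and turns per-probe recruitment scores into per-position nucleotide preferences. The function names, the score dictionary, and the mean-centering step are assumptions made for illustration; the caption does not specify how CASCADE computes preferences.

```python
from typing import Dict, List, Tuple

BASES = "ACGT"


def single_variant_probes(seed: str) -> List[Tuple[int, str, str]]:
    """Return (position, substituted base, probe sequence) for every
    single-nucleotide variant of the seed sequence."""
    variants = []
    for pos, ref_base in enumerate(seed):
        for base in BASES:
            if base != ref_base:
                variants.append((pos, base, seed[:pos] + base + seed[pos + 1:]))
    return variants


def preference_matrix(seed: str, scores: Dict[str, float]) -> List[Dict[str, float]]:
    """Per-position nucleotide preferences from probe scores.

    `scores` is a hypothetical mapping of probe sequence to a recruitment
    score (for example a Z score) and must include the seed itself.
    Mean-centering each column is an assumption, not the published method.
    """
    matrix = []
    for pos in range(len(seed)):
        # Score of each base at this position, all other positions as in the seed.
        column = {b: scores[seed[:pos] + b + seed[pos + 1:]] for b in BASES}
        mean = sum(column.values()) / len(BASES)
        matrix.append({b: s - mean for b, s in column.items()})
    return matrix
```

A seed of length n yields 3n SV probes, so the preference matrix needs 3n + 1 scores in total.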
Figure 2. CASCADE-based characterization of COF recruitment to the CXCL10 promoter
(A) Schematic of LPS-inducible recruitment of p300 to the CXCL10 promoter in macrophages.
(B) CRE-wide p300 recruitment motif and TFs IRF3 and p65/RELA across the CXCL10 promoter. Experiments using extracts from LPS-stimulated or untreated (UT) macrophages are indicated with colored bars. p300 recruitment motifs are shown for biological replicate experiments (Replicate 1 and 2).
(C) Schematic of condition-independent recruitment of RBBP5 to CXCL10 promoter.
(D) CRE-wide motifs for COF RBBP5 and TF IRF2 across the CXCL10 promoter segment. Experimental conditions as in (B). See also Table S1 and Figures S1, S2, and S7.
Figure 3. CASCADE-based analysis of SNP-QTLs in human macrophages
(A) Overview of two-step CASCADE-based approach to characterize 1,712 SNP-QTLs. (1) Step 1: screen for differential COF recruitment to SNP probe pairs by comparing recruitment to reference (REF) and non-reference (non-REF) alleles. The number of probe pairs in each QTL class for which significant COF recruitment was identified in at least one experiment is indicated. (2) Step 2: CASCADE-based motifs are generated for SNPs identified as significantly bound. COF motifs are compared against TF-motif databases to infer TF identity.
(B) Comparison of p300 differential recruitment across biological replicates. Comparison of q-values for replicates is shown (left). The q-values represent the statistical significance of the difference between p300 recruitment measured at REF and non-REF probes within a SNP pair adjusted for multiple hypothesis testing (see STAR Methods). Comparison of statistical significance (q-values) against the difference in p300 recruitment Z score between REF and non-REF probes in each SNP pair in technical replicates 1 (middle) and 2 (right) is shown. QTL class for each SNP is indicated.
(C) Comparison of differential COF recruitment across biological replicates is shown for candidate COFs and the TF PU.1. eQTL, expression quantitative trait loci; caQTL, chromatin accessibility quantitative trait loci. See also Table S1 and Figures S3, S4, S5, S6, and S7.
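The q-values in (B) are described as significance values adjusted for multiple hypothesis testing, with the actual procedure given in the STAR Methods (not reproduced here). As a hedged illustration only, the sketch below applies the Benjamini-Hochberg adjustment, a common choice that is assumed here rather than confirmed by the source; the function name and input list are hypothetical.

```python
from typing import List


def benjamini_hochberg(p_values: List[float]) -> List[float]:
    """Return Benjamini-Hochberg adjusted p-values (q-values), in input order.

    Assumed adjustment method; the source does not state which one is used.
    """
    n = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(n), key=lambda i: p_values[i])
    q = [0.0] * n
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotone q-values.
    for rank in range(n, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * n / rank)
        q[i] = running_min
    return q
```

In this setting each input p-value would correspond to one REF versus non-REF probe pair, and a pair would be called differentially bound when its q-value falls below a chosen threshold.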
Figure 4. CASCADE-determined motifs at SNP loci
COF recruitment motifs for p300, SMARCA4, TBL1XR1, GCN5, and RBBP5 are shown for 10 SNP-QTL loci. PU.1 binding motifs at each locus are also shown. Position of the SNP location within each motif is shown with a shaded rectangle. (*) denotes a motif match obtained at lower stringency (see STAR Methods). QTL type of each SNP is indicated (left-hand side, colored dots). Only sites that met an imposed seed Z score threshold were plotted (see STAR Methods). Corresponding reference and SNP are shown beneath each rsID. (rc) denotes a site plotted as its reverse complement relative to the reference strand. For these sites, the reference and non-REF alleles are also indicated as their complementary nucleotides. See also Table S1 and Figures S4, S5, S6, and S7.
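The (rc) convention can be made concrete with a short sketch: when a site is plotted as its reverse complement, the SNP position is mirrored and both alleles are reported as their complementary nucleotides. The helper names and the example inputs are hypothetical and are not taken from the CASCADE scripts.

```python
from typing import Tuple

# Translation table mapping each nucleotide to its complement.
COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def flip_site(seq: str, snp_index: int, ref: str, alt: str) -> Tuple[str, int, str, str]:
    """Re-express a site, its 0-based SNP index, and its REF and non-REF
    alleles on the opposite strand."""
    flipped_index = len(seq) - 1 - snp_index  # position mirrors on the other strand
    return (
        reverse_complement(seq),
        flipped_index,
        reverse_complement(ref),
        reverse_complement(alt),
    )
```

For example, `flip_site("AAGT", 1, "A", "G")` returns the site `"ACTT"` with the SNP at index 2 and the alleles `"T"` and `"C"`.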
Figure 5. Constructing models with CASCADE for SNP-eQTLs
(A) Left column: CASCADE-determined COF recruitment motifs for p300, SMARCA4, TBL1XR1, GCN5, and RBBP5 at the local genomic region surrounding rs11950944. PU.1 binding motif is also shown. Right column: TF binding motif with the strongest association to each corresponding CASCADE COF recruitment motif. Statistical significance (p value) for TF matching is shown below each TF motif (see STAR Methods). Position of the SNP location within each motif is shown in the shaded area. QTL type and inferred TF category are indicated by the same color scheme as in Figure 4.
(B) Same as in (A) but for the local genomic region surrounding rs10833823. Only sites that met an imposed Z score threshold were plotted and used for motif analysis (see STAR Methods).
(C) Integrative model for COF recruitment changes at SNP-eQTL rs11950944.
(D) Same as in (C) but for SNP-eQTL rs10833823. See also Table S1 and Figures S5, S6, and S7.
KEY RESOURCES TABLE
| REAGENT or RESOURCE | SOURCE | IDENTIFIER |
|---|---|---|
| Antibodies | ||
| Anti-p300 | Abcam | Cat#ab14984; RRID: AB_301550 |
| Anti-Brg-1 | Santa Cruz Biotechnology | Cat#sc17796; RRID: AB_626762 |
| Anti-Gcn5 | Santa Cruz Biotechnology | Cat#sc365321x; RRID: AB_10846182 |
| Anti-RBBP5 | Bethyl Laboratories | Cat#A300-109A; RRID: AB_210551 |
| Anti-TBL1XR1 | Santa Cruz Biotechnology | Cat#sc100908; RRID: AB_1130006 |
| Anti-HDAC1 | Abcam | Cat#ab7028; RRID: AB_305705 |
| Anti-MED1 | Bethyl Laboratories | Cat#A300-793A; RRID: AB_577241 |
| Anti-BRD4 | Bethyl Laboratories | Cat#A301-985A50; RRID: AB_1576498 |
| Anti-NFkB p65 | Santa Cruz Biotechnology | Cat#sc372; RRID: AB_632037 |
| Anti-NFkB p65 | Santa Cruz Biotechnology | Cat#sc8008; RRID: AB_628017 |
| Anti-ICSBP | Santa Cruz Biotechnology | Cat#sc6058x; RRID: AB_649510 |
| Anti-IRF3 | Cell Signaling Technology | Cat#D83B9; RRID: AB_1904036 |
| Anti-IRF2 | Santa Cruz Biotechnology | Cat#sc374327; RRID: AB_10990717 |
| Anti-PU.1 | Santa Cruz Biotechnology | Cat#sc352; RRID: AB_632289 |
| Donkey anti-Goat IgG (H+L) | ThermoFisher Scientific | Cat#A-11055; RRID: AB_2534102 |
| Goat anti-Mouse IgG (H+L) Highly Cross-Adsorbed Secondary Antibody, | ThermoFisher Scientific | Cat#A-11029; RRID: AB_138404 |
| Goat anti-Rabbit IgG (H+L) Highly Cross-Adsorbed Secondary Antibody, | ThermoFisher Scientific | Cat#A-11034; RRID: AB_2576217 |
| Goat anti-Mouse IgG (H+L) Highly Cross-Adsorbed Secondary Antibody, | ThermoFisher Scientific | Cat#A32728; RRID: AB_2633277 |
| Goat anti-Rabbit IgG (H+L) Highly Cross-Adsorbed Secondary Antibody, | ThermoFisher Scientific | Cat#A32733; RRID: AB_2633282 |
| Goat anti-Rabbit IgG (H+L) | ThermoFisher Scientific | Cat#G-21234; RRID: AB_2536530 |
| Goat anti-Mouse IgG (H+L) Cross-Adsorbed Secondary Antibody, HRP | ThermoFisher Scientific | Cat#G-21040; RRID: AB_2536527 |
| Chemicals, peptides, and recombinant proteins | ||
| Lipopolysaccharides from | Sigma-Aldrich | Cat#L3024 |
| Human IFN-gamma Recombinant Protein | ThermoFisher Scientific | Cat#PHC4031 |
| Protease Inhibitor Cocktail | Sigma-Aldrich | Cat#P8340 |
| Phosphatase Inhibitor Cocktail A | Santa Cruz Biotechnology | Cat#sc45055 |
| Phorbol 12-myristate 13-acetate | Sigma-Aldrich | Cat#P8139 |
| Deposited data | ||
| RBBP5 ChIP-seq data, performed in K562 cells | ENCODE | ENCFF666PCE |
| EP300 ChIP-seq data, performed in K562 cells | ENCODE | ENCFF755HCK |
| SMARCA4 ChIP-seq data, performed in K562 cells | ENCODE | ENCFF267OGF |
| TBL1XR1 ChIP-seq data, performed in K562 cells | ENCODE | ENCFF868SWL |
| CASCADE microarray data | GEO | GSE148945 |
| Experimental models: Cell lines | ||
| THP-1 | ATCC | Cat#TIB-202 |
| Oligonucleotides | ||
| Protein Binding Microarray Double-stranding Primer | Eurofins | 5′ - CAGCAGCGCTCAAGGAATCAAGAC - 3′ |
| Software and algorithms | ||
| RColorBrewer, version 1.1.2 | Neuwirth | N/A |
| Cowplot, version 1.1.1 | Wilke | N/A |
| ggseqlogo R package, version 0.1 | Wagih | N/A |
| TOMTOM (MEME Suite), version 5.0.3 | Patwardhan et al. | N/A |
| FASTQC, version 0.11.8 | Kalita et al. | N/A |
| Trim Galore, version 0.4.2 | Goodwin et al. | N/A |
| cutadapt, version 1.9.1 | Bentley et al. | N/A |
| bowtie2, version 2.3.4.1 | Langmead and Salzberg | N/A |
| samtools, version 1.8.0 | Patwardhan et al. | N/A |
| IMPUTE2, version 2.3.0 | Verma et al. | N/A |
| cnvPartition, version 3.2.1 | Illumina Genome Studio | N/A |
| MARIO, version 3.9.3 | Verma et al. | N/A |
| UpSetR Package, version 1.4.0 | Conway et al. | N/A |
| CASCADE R-scripts, version 1.0.0 | | N/A |
| ggplot2, version 3.3.5 | Wickham | N/A |