| Literature DB >> 33626351 |
Nargess Farhangdoost1, Cynthia Horth1, Bo Hu1, Eric Bareke1, Xiao Chen2, Yinglu Li2, Mariel Coradin3, Benjamin A Garcia4, Chao Lu2, Jacek Majewski5.
Abstract
Chromatin dysregulation has emerged as an important mechanism of oncogenesis. To develop targeted treatments, it is important to understand the transcriptomic consequences of mutations in chromatin modifier genes. Recently, mutations in the histone methyltransferase gene nuclear receptor binding SET domain protein 1 (NSD1) have been identified in a subset of common and deadly head and neck squamous cell carcinomas (HNSCCs). Here, we use genome-wide approaches and genome editing to dissect the downstream effects of loss of NSD1 in HNSCC. We demonstrate that NSD1 mutations are responsible for loss of intergenic H3K36me2 domains, followed by loss of DNA methylation and gain of H3K27me3 in the affected genomic regions. In addition, those regions are enriched in cis-regulatory elements, and subsequent loss of H3K27ac correlates with reduced expression of their target genes. Our analysis identifies genes and pathways affected by the loss of NSD1 and paves the way to further understanding the interplay among chromatin modifications in cancer.Entities:
Keywords: cis-regulatory elements; epigenomics; head and neck squamous cell carcinoma; histone H3 lysine 36 dimethylation; histone modifications; nuclear receptor binding SET domain protein 1
Mesh:
Substances:
Year: 2021 PMID: 33626351 PMCID: PMC8006058 DOI: 10.1016/j.celrep.2021.108769
Source DB: PubMed Journal: Cell Rep Impact factor: 9.423
Figure 1.Epigenomic characterization of NSD1-WT and MT HNSCC cell lines
(A) Genome-wide prevalence of modifications based on mass spectrometry; diamonds represent within-condition averages; p values obtained using Welch’s t test: H3K36me2 (p = 0.07) and H3K27me3 (p = 0.240).
(B) Genome-browser tracks displaying individual samples as lines and condition averages as area plots in a lighter shade; ChIP-seq signals shown are mass spectrometry (MS)-normalized logCPM and beta values are used for WGBS; regions of noticeable difference are highlighted.
(C) Heatmaps showing H3K36me2 (MS-normalized logCPM (log counts per million)) enrichment patterns ± 20 kb flanking intergenic regions (IGRs); n = 10,630. Numbers displayed at the bottom of aggregate plots correspond to the intergenic/genic ratio where transcription start sites (TSS)/transcription end sites (TES) and outer edges are excluded.
(D) Relative enrichment of signal within intergenic regions over those of flanking genes; CPM values are used for ChIP-seq and beta values for WGBS; ***Wilcoxon rank sum test p < 1e-5. In the box plots, the lower and upper hinge correspond to the first and third quartile; the upper whiskers extend to the largest value ≤ 1.5 * interquartile range (IQR) and vice versa for the lower whiskers.
(E) Distribution of DNA methylation beta values within actively transcribed genes (zFPKM > −3; Hart et al., 2013) compared against those in intergenic regions. *** represents p value < 1e-5 based on the difference-in-difference estimator of differential methylation between genic and intergenic regions. In the violin plots, the white dots correspond to the median and the lines span the IQR. In the box plots, the lower and upper hinge correspond to the first and third quartile; the upper whiskers extend to the largest value ≤ 1.5 * IQR and vice versa for the lower whiskers.
Figure 2.Epigenomic characterization of NSD1-WT and KO HNSCC cell lines
(A) Genome-wide prevalence of modifications based on mass spectrometry; diamonds represent within-condition averages; p values obtained using Welch’s t test: H3K36me2 p = 0.04; H3K27me3 p = 0.16. In this plot and all the following ones, Cal27-KO corresponds to replicate 1, Det562-KO to replicate 2, and FaDu-KO to replicate 1, unless otherwise specified.
(B) Genome-browser tracks displaying individual cell-line differences (KO-PA) as lines and condition averages as area plots in a lighter shade; ChIP-seq signals shown are MS-normalized logCPM and beta values are used for WGBS; regions of noticeable difference in Figure 1B are highlighted.
(C) Heatmaps showing H3K36me2 enrichment patterns centered on IGRs; n = 10,630. Numbers displayed at the bottom of aggregate plots correspond to the intergenic/genic ratio where TSS/TES and outer edges are excluded.
(D) Distribution of differential beta values within actively transcribed genes, all genes, intergenic regions, and regions depleted of H3K36me2 (corresponding to regions defined in Figure 3A as “cluster B”); median values are shown at the top. For the boxplots, the lower and upper hinge correspond to the first and third quartile; the upper whiskers extend to the largest value ≤ 1.5 * IQR and vice versa for the lower whiskers.
(E) Spearman correlation of differential enrichment between NSD1-WT versus KO and WT versus MT.
Figure 3.Loss of NSD1 preferentially impacts distal intergenic cis-regulatory elements
(A) Scatterplots of H3K36me2 enrichment (10-kb resolution) comparing a representative WT parental sample (Cal27) against its NSD1-KO counterpart (replicate 1; see also Figure S3A for other cell lines).
(B) Overlap enrichment result of Ensembl annotations with bins consistently labeled as cluster B (i.e., identified as B in all three paired WT versus KO comparisons). Stratification is applied to only focus on intergenic regions to avoid spurious associations to annotations confounded by their predominantly intergenic localization. The size of the dots corresponds to number of bins overlapping the corresponding annotation. *** represents p-value < 1e-5 based on Fisher’s exact test of bins overlapping a specific class of annotated regions versus a background of all non-quiescent bins, meaning >10 reads in at least one mark in one sample.
(C) Aggregate plots of differential signal enrichment centered around CREs overlapping (n = 5,193) consistent cluster B bins. Values are averaged across all three WT versus KO comparisons.
(D) Log fold change of H3K27ac normalized enrichment values comparing all differentially bound sites to those overlapping consistent cluster B bins. For the box plots, the lower and upper hinge correspond to the first and third quartile; the upper whiskers extend to the largest value ≤ 1.5 * IQR and vice versa for the lower whiskers.
(E) Distribution of genomic compartments overlapping various subsets of H3K27ac peaks categorized by differential binding status.
Figure 4.Loss of H3K36me2 domains and enhancer H3K27ac affects expression of target genes
(A) Log fold-change (LFC) of various subsets of DEGs. The lower and upper hinge in the box plots correspond to the first and third quartile; the upper whiskers extend to the largest value ≤ 1.5 * IQR and vice versa for the lower whiskers.
(B) LFC of putative target genes for various differential binding (DB) site subsets. For the box plots, the lower and upper hinge correspond to the first and third quartile; the upper whiskers extend to the largest value ≤ 1.5 * IQR and vice versa for the lower whiskers.
(C) Logistic regression model outputs for expression downregulation status based on its distance to and/or whether it shares a TAD with a cluster B bin.
(D) Permutation test on downregulated genes’ tendency to share a TAD with cluster B bin, controlling for distance.
(E) Example loci illustrating genome-wide phenomenon using differential signal tracks in which enrichment values of the respective parental line were subtracted from the corresponding KOs, after which the average across lines was taken.
(F) Schematic model of epigenetic dysregulation resulting from the absence of NSD1 (created with https://biorender.com). Note that, in the absence of NSD1, PRC2 deposits H3K27me3 in the same intergenic regions where H3K36me2 was depleted. In addition, H3K27ac decreases around distal enhancers located in these H3K36me2-depleted regions.
Figure 5.Changes in transcriptome and pathways resulted from loss of NSD1 and reduced H3K36me2 levels
(A) GSEA enrichment plot of hallmark gene sets significantly associated with the aggregated ranking of differentially expressed genes and genes targeted by differentially acetylated enhancers using their test statistics.
(B) Motifs exhibiting differential activity between up- versus downregulated peaks, with dots representing the strength of association in each direction and triangles their difference.
(C) Aggregated motif density plots around differential H3K27ac sites for the most significant differentially associated motifs in each direction.
(D) Stratified rank-rank hypergeometric overlap plot of gene expression differences between NSD1-WT versus MT and WT versus KO.
(E) Distribution of expression changes for leading edge genes of hallmark gene sets significantly associated with the aggregated ranking of differential gene expression for both NSD1-WT versus MT and WT versus KO. The error bars indicate the interquartile range (IQR).
Figure 6.Validation of cell-line-based observations in primary tumors from TCGA
(A) TCGA-HNSC samples were ranked by relative similarity to NSD1-WT and KO cell line samples (top), NSD1 mutational status was tabulated (middle), and enrichment of NSD1 mutational groups within quantiles was computed (bottom).
(B) Differential CpG methylation across all sites is contrasted against those located in regions depleted of H3K36me2 upon NSD1-KO (i.e., cluster B bins). For the box plots, the lower and upper hinge correspond to the first and third quartile; the upper whiskers extend to the largest value ≤ 1.5 * IQR and vice versa for the lower whiskers.
(C) Rank-rank hypergeometric overlap of genes ranked by LFC in cell line system and TCGA; see Figure 5D.
(D) Most significant results from GSEA on genes ranked by concordant upregulation (or downregulation) of gene expression and enhancer accessibility and DNA methylation, with the values associated with constituent genes shown to the right. The error bars indicate the interquartile range (IQR).
KEY RESOURCES TABLE
| REAGENT or RESOURCE | SOURCE | IDENTIFIER |
|---|---|---|
| Antibodies | ||
| Rabbit monoclonal anti-H3K36me2 | Cell Signaling Technology | Cat # 2901, RRID:AB_1030983 |
| Rabbit monoclonal anti-H3K27me3 | Cell Signaling Technology | Cat # 9733; RRID:AB_2616029 |
| Rabbit polyclonal anti-H3K27ac | Diagenode | Cat # C15410196 RRID:AB_2637079 |
| Mouse monoclonal anti-NSD1 (N312/10) | NeuroMab | Cat # 75–280, RRID:AB_11001827 |
| Chemicals, peptides, and recombinant proteins | ||
| Fetal Bovine Serum, qualified, Canada | ThermoFisher | Cat # 12483020 |
| Hydrocortisone | SigmaAldrich | Cat # H0888–5G |
| DMEM:F12 | ThermoFisher | Cat # 11320033 |
| Schneider’s Drosophila medium | ThermoFisher | Cat # 21720024 |
| Critical commercial assays | ||
| iDeal ChIP-seq Kit for Histones | Diagenode | Cat # C01010051 |
| Alt-R® CRISPR-Cas9 tracrRNA, ATTO 550 | Integrated DNA Technologies | Cat # 1075927 |
| Alt-R® S.p. Cas9 Nuclease V3 | Integrated DNA Technologies | Cat # 1081058 |
| Kapa Hyper Prep Kit | Roche | Cat # 07962363001 |
| Kapa HiFi Uracil+ Kit | Roche | Cat # 07959079001 |
| EZ-DNA Methylation Gold Kit | Zymo Research | Cat # D5006 |
| NxSeq AmpFREE Low DNA Library Kit | Lucigen | Cat #14000 |
| AllPrep DNA/RNA/miRNA Universal Kit | QIAGEN | Cat # 80224 |
| TruSeq Stranded Total RNA Library Prep Kit with Ribo-Zero Human/Mouse/Rat | Illumina | Cat # RS-122–2201 |
| MinElute PCR purification kit | QIAGEN | Cat # 28006 |
| Trans-Blot® Turbo RTA Mini LF PVDF Transfer Kit | Bio-Rad | Cat # 1704274 |
| Deposited data | ||
| WGBS, RNA-seq, and ChIP-seq for histone H3 post-translational modifications in human head and neck squamous cell carcinoma cell lines | This paper | GEO: GSE149670 |
| Experimental models: cell lines | ||
| Cal27 | ATCC | CRL-2095 |
| Detroit562 | ATCC | CCL-138 |
| FaDu | ATCC | HTB-43 |
| BICR 78 | ECACC | Cat # 04072111 |
| SCC4 | ATCC | CRL-1624 |
| SKN-3 | JCRB Cell Bank | JCRB1039 |
| Oligonucleotides | ||
| NSD1 crRNA targeting PWWP region: GCCCTATCGGCAGTACTACG | Integrated DNA Technologies | N/A |
| NSD1 crRNA targeting SET region: GTGAATGGAGATACCCGTGT | Integrated DNA Technologies | N/A |
| Primer to screen PWWP target region, without Miseq adaptors: F- TGTTTCCAGACAGTCTTCTTTGG | Integrated DNA Technologies | N/A |
| Primer to screen PWWP target region, without Miseq adaptors: R- AAAGCCTTTTTCGTTTCCTACC | Integrated DNA Technologies | N/A |
| Primer to screen SET target region, without Miseq adapters: F- CACAGCAGAGGTCTCAGGAA | Integrated DNA Technologies | N/A |
| Primer to screen SET target region, without Miseq adapters: R- GTGGTGATGGTTGCACAAAA | Integrated DNA Technologies | N/A |
| Software and algorithms | ||
| R v3.6.1 | The R Project for Statistical Computing | |
| Python v3.7.5 | ||
| ggplot2 v3.3.0 | ||
| matplotlib v3.2.1 | ||
| pyGenomeTracks v3.2.1 | ||
| IGV v2.8.2 | ||
| BWA v0.7.17 | ||
| BBTools v38.73 | Sourceforge | |
| BISCUIT v0.3.12 | GitHub | |
| MethylDackel v0.4.0 | GitHub | |
| STAR v2.7.3a | ||
| Salmon v1.1.0 | ||
| GATK v4.1.5.0 | ||
| bedtools v2.29.0 | ||
| EpiProfile v2.0 | ||
| deepTools v3.3.1 | ||
| HDBSCAN v0.8.24 | ||
| LOLA v1.16.0 | ||
| MACS v2.2.6 | ||
| DiffBind v2.14.0 | ||
| ChIPseeker v1.22.1 | ||
| limma v3.42.2 | ||
| dmrff v1.0.0 | ||
| SpectralTAD v1.2.0 | ||
| DEseq2 v1.26.0 | ||
| apeglm v1.8.0 | ||
| RRHO2 v1.0 | ||
| zFPKM v1.8.0 | ||
| RobustRankAggreg v1.1 | ||
| fgsea v1.12.0 | ||