Klaus Schmitz-Abe, Guzman Sanchez-Schmitz, Ryan N Doan, R Sean Hill, Maria H Chahrour, Bhaven K Mehta, Sarah Servattalab, Bulent Ataman, Anh-Thu N Lam, Eric M Morrow, Michael E Greenberg, Timothy W Yu, Christopher A Walsh, Kyriacos Markianos.
Abstract
More than 98% of the human genome is made up of non-coding DNA, but techniques to ascertain its contribution to human disease have lagged far behind our understanding of protein coding variations. Autism spectrum disorder (ASD) has been mostly associated with coding variations via de novo single nucleotide variants (SNVs), recessive/homozygous SNVs, or de novo copy number variants (CNVs); however, most ASD cases continue to lack a genetic diagnosis. We analyzed 187 consanguineous ASD families for biallelic CNVs. Recessive deletions were significantly enriched in affected individuals relative to their unaffected siblings (17% versus 4%, p < 0.001). Only a small subset of biallelic deletions were predicted to result in coding exon disruption. In contrast, biallelic deletions in individuals with ASD were enriched for overlap with regulatory regions, with 23/28 CNVs disrupting histone peaks in ENCODE (p < 0.009). Overlap with regulatory regions was further demonstrated by comparisons to the 127-epigenome dataset released by the Roadmap Epigenomics project, with enrichment for enhancers found in primary brain tissue and neuronal progenitor cells. Our results suggest a novel noncoding mechanism of ASD, describe a powerful method to identify important noncoding regions in the human genome, and emphasize the potential significance of gene activation and regulation in cognitive and social function.
Year: 2020 PMID: 32820185 PMCID: PMC7441318 DOI: 10.1038/s41598-020-70656-0
Source DB: PubMed Journal: Sci Rep ISSN: 2045-2322 Impact factor: 4.379
Figure 1. Homozygosity and de novo CNV rates in three different ASD collections. (a) Observed homozygosity in the HMCA, AGRE, and SSC: Distribution of recent homozygosity (homozygous intervals 5 cM or longer in autosomes) within individuals from each cohort. For display purposes, samples with no homozygosity are not shown. (b) Burden of rare de novo CNV events in three ASD collections. De novo copy number events are observed more frequently in affected individuals across all three cohorts, although within individual cohorts the difference reaches statistical significance only in the SSC (Fisher test, one sided). Within the HMCA, high homozygosity families do not show an excess of de novo copy number mutation. Families with low homozygosity show a trend towards excess, but this does not reach significance, likely due to sample size. Results are presented in a stacked bar plot (CNV1 bottom, CNV3 top, Probes ≥ 25). Numbers of samples and ratios for each comparison are shown in Table S2.
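The one-sided Fisher test mentioned in the caption can be illustrated with a minimal sketch. The function name `fisher_one_sided` and the counts in the usage lines are hypothetical; the actual counts for each comparison are in the supplementary tables, which are not reproduced here. The sketch computes the upper tail of the hypergeometric distribution, that is, the probability of seeing at least as many carriers among affected individuals as observed if carrier status were unrelated to affection status.

```python
from math import comb


def fisher_one_sided(a, b, c, d):
    """One-sided Fisher exact test for the 2x2 table [[a, b], [c, d]].

    a = affected carriers,   b = affected non-carriers,
    c = unaffected carriers, d = unaffected non-carriers.
    Returns P(X >= a) under the hypergeometric null (enrichment in affected).
    """
    n_carriers = a + c
    n_affected = a + b
    n_total = a + b + c + d
    # Sum exact integer counts of tables at least as extreme as the observed one.
    tail = 0
    for k in range(a, min(n_carriers, n_affected) + 1):
        tail += comb(n_carriers, k) * comb(n_total - n_carriers, n_affected - k)
    return tail / comb(n_total, n_affected)


# Hypothetical counts, for illustration only (not taken from the study).
p_value = fisher_one_sided(20, 80, 5, 95)
print(f"one-sided p = {p_value:.3g}")
```

Exact integer arithmetic with `math.comb` avoids an external dependency; a statistics library would normally be used for real cohort sizes.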
Table 1. Summary of data sets used in this study: Homozygosity Mapping Collaborative of Autism (HMCA), Autism Genetic Resource Exchange (AGRE), Simons Simplex Collection (SSC) and HapMap (control samples).
| | HMCA collection | AGRE | Simons Simplex | HapMap | Total |
|---|---|---|---|---|---|
| # of families | 187 | 740 | 1,027 | 801 | 2,755 |
| # of samples | 790 | 2,985 | 3,881 | 1,251 | 8,907 |
| affected individuals (offspring) [Unaffected siblings] | 255 [169] | 1,463 [94] | 1,027 [798] | 0 [856] | 2,745 [1,917] |
| affected parents | 13 | 4 | 0 | ||
| % of families with both parents | 84% | 85% | 100% | 20.10% | |
| % of consanguineous families | 66% | 0.40% | 0% | 1.50% | |
| % of multiplex families | 22% | 87% | 0% | 0% | |
| male/female ratio (affected) [Unaffected siblings] | 3.63 [0.76] | 3.69 [0.71] | 6.55 [0.83] | n/a [1.03] | |
| SNP array technology | Affy 6.0 & 500 K | Affy 5.0 | Illumina 1 M | Affy 6.0 & 500 K | |
For each dataset, the table presents the fraction of families with both parents, with consanguinity, and with 2 or more affected children (multiplex families). At the bottom of the table, we show the male/female ratio for both affected individuals (offspring) and unaffected siblings. Additional information can be found in Methods (Description of datasets).
Figure 2. Homozygous deletions and relation to histone modification marks. (a) Selection of homozygous deletions (CNV0). We used a series of increasingly stringent selection criteria to compare the CNV0 rate in affected individuals versus unaffected siblings and to evaluate the overlap between biallelic deletions and ENCODE histone peaks. (b) Burden of rare homozygous deletions (CNV0) in three ASD collections (Table 1). Percentage of affected individuals and unaffected siblings with one or more rare biallelic events. Affected individuals show an elevated rate of biallelic deletions in all datasets. The difference is significant only in the HMCA collection (Fisher test, one sided) and is driven by consanguineous families (high homozygosity). The corresponding number of samples and ratios are shown in Table S4a. (c) Example of a non-coding biallelic deletion (AU-16801, Table 2). This particular homozygous deletion is approximately 7 kb in size, and it removes an H3K4Me3 histone modification mark in the vicinity of the BRINP3/FAM5C gene. The ENCODE profile shown represents the cell line profiles available from UCSC. (d) Empirical distribution of the number of coincidences in HMCA families between biallelic deletions and 3 histone modification marks (H3K4Me1, H3K4Me3, and H3K27Ac) as defined by the ENCODE project. We randomized the locations of qPCR confirmed biallelic deletions. For the events observed in the HMCA families (Table 2), the joint probability of observing such an enrichment/depletion pattern is p < 0.009. ENCODE regions are defined using a score ≥ 20, and conclusions are robust with respect to the threshold (Table S6). Simulations excluded sex chromosomes and low marker coverage regions (Methods: J,K).
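The randomization described in panel (d) can be sketched as a simple permutation test. This is a simplified illustration under stated assumptions, not the authors' pipeline: it uses a single linear region instead of the autosomes, does not mask low-coverage regions, tests enrichment only (not the joint enrichment/depletion pattern), and all names (`count_overlaps`, `permutation_p_value`) and coordinates are hypothetical. Intervals are half-open `(start, end)`.

```python
import random


def overlaps_any(start, end, peaks):
    """True if the half-open interval [start, end) overlaps any peak."""
    return any(p_start < end and start < p_end for p_start, p_end in peaks)


def count_overlaps(deletions, peaks):
    """Number of deletions (start, size) that overlap at least one peak."""
    return sum(overlaps_any(s, s + size, peaks) for s, size in deletions)


def permutation_p_value(deletions, peaks, region_length, n_perm=10000, seed=0):
    """Empirical one-sided p-value for overlap enrichment.

    Each permutation keeps the deletion sizes but draws new start
    positions uniformly within [0, region_length - size].
    """
    rng = random.Random(seed)
    observed = count_overlaps(deletions, peaks)
    at_least_as_extreme = 0
    for _ in range(n_perm):
        shuffled = [(rng.randrange(0, region_length - size + 1), size)
                    for _, size in deletions]
        if count_overlaps(shuffled, peaks) >= observed:
            at_least_as_extreme += 1
    # Add-one correction so the estimate is never exactly zero.
    return observed, (at_least_as_extreme + 1) / (n_perm + 1)


# Toy coordinates, for illustration only (not taken from Table 2).
region_length = 1_000_000
peaks = [(100_000, 105_000), (400_000, 420_000), (750_000, 752_000)]
deletions = [(101_000, 7_000), (405_000, 20_000), (600_000, 5_000)]

observed, p_value = permutation_p_value(deletions, peaks, region_length)
print(f"observed overlaps = {observed}, empirical p = {p_value:.4f}")
```

Keeping deletion sizes fixed while shuffling positions preserves the size distribution, so larger deletions are not unfairly credited for overlapping more peaks by chance.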
Table 2. List of all qPCR confirmed rare biallelic deletions (CNV0) among individuals with ASD in the HMCA collection.
| # of CNV | chr | Start | Size (Kb) | Histone peak: ENCODE project | Histone peak: Primary neuron | Histone peak: Roadmap neuron | Histone peak: Roadmap brain | Gene location: Exonic | Gene location: Intronic | Gene location: On the left | Gene location: On the right |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | 1 | 189,959,475 | 6.9 | Y | |||||||
| 1 | 1 | 191,473,007 | 179.0 | Y | Y | Y | Y | ||||
| 1 | 2 | 167,346,017 | 47.1 | Y | Y | Y | Y | ||||
| 1 | 2 | 184,794,451 | 8.0 | Y | |||||||
| 1 | 2 | 227,341,510 | 6.1 | Y | Y | Y | Y | ||||
| 1 | 2 | 242,915,454 | 119.2 | Y | Y | Y | |||||
| 1 | 3 | 1,782,524 | 5.1 | ||||||||
| 1 | 3 | 75,394,265 | 149.8 | Y | Y | Y | Y | ||||
| 1 | 3 | 143,637,504 | 853.2 | Y | Y | Y | |||||
| 1 | 4 | 134,871,302 | 321.4 | Y | Y | Y | Y | ||||
| 1 | 5 | 9,904,421 | 20.6 | Y | |||||||
| 1 | 6 | 154,121,271 | 10.1 | Y | Y | ||||||
| 3 | 7 | 16,900,135 | 15.3 | Y | |||||||
| 1 | 7 | 80,157,064 | 141.8 | Y | Y | Y | Y | ||||
| 1 | 7 | 159,049,219 | 13.1 | No data | Y | ||||||
| 1 | 8 | 15,937,585 | 88.5 | Y | Y | ||||||
| 1 | 8 | 18,852,675 | 9.6 | Y | Y | Y | Y | ||||
| 1 | 8 | 34,800,058 | 43.0 | Y | Y | ||||||
| 1 | 10 | 81,512,254 | 85.7 | Y | Y | Y | Y | ||||
| 1 | 12 | 112,432,874 | 5.6 | Y | Y | Y | |||||
| 1 | 14 | 28,475,766 | 25.0 | Y | Y | Y | Y | ||||
| 1 | 14 | 47,966,854 | 2.6 | ||||||||
| 2 | 20 | 52,643,162 | 20.2 | Y | Y | Y | Y | ||||
| 1 | 21 | 18,802,512 | 19.7 | Y | Y | Y | |||||
| 1 | 22 | 30,336,496 | 30.3 | Y | Y | Y | Y | ||||
The table notes overlap with histone peaks as defined by the ENCODE Project[33], by ChIP-seq data from Primary Neuron[34], and by both Brain and Neuron epigenomes from the Roadmap Project (ChromHMM state model[36]). Neighboring genes are shown, and genes with bibliographic evidence linking them to neurodevelopmental disorders are noted in bold. Table S5a lists rare homozygous deletions (CNV0) for unaffected siblings.
Figure 4. Examples of overlap between non-coding biallelic deletions and histone peaks as defined by the ENCODE project. In addition, we show histone peaks derived from ChIP-seq data from Primary Neuron culture. ENCODE or Primary Neuron profiles shown in the figures represent the union of all cell lines available. Additional examples are presented in Figures S5a-c. (a) Non-coding biallelic deletion for sample AU-8101. The homozygous deletion removes the SCN7A promoter as defined by RNA-Seq data. (b) Non-coding biallelic deletion for sample AU-18101. Published chromatin interaction data obtained from human fibroblasts demonstrate that one broadly active element directly interacts with the IRS1 gene promoter. (c) Non-coding biallelic deletion for sample AU-19401 upstream of UNC5D, a gene encoding a receptor implicated in neuronal axon guidance and cell survival. (d) Non-coding biallelic deletion for sample AU-18301. The homozygous deletion interrupts a non-coding gene (NUTM2B-AS1), a broadly expressed antisense transcript on the opposite strand of NUTM2B.
Figure 3. Overlap of homozygous deletions with regulatory regions defined by the Epigenome Roadmap Project. Illustrated are p-values for coincidence between non-coding homozygous deletions and epigenetic marks. The most significant correlations are observed among primary brain cell and neuronal profiles. We use 127 profiles provided by the Epigenome Roadmap Project (Table S7a) and the 15-state ChromHMM model to test enrichment/depletion of coincidences in affected/unaffected individuals (noncoding CNVs defined by KnownGene annotation). Similar results can be found in Supplemental Information using alternative gene annotations (RefGene and Ensembl, Figures S6a-b, Tables S7b-e).
Figure 5. Protein–Protein Interactions between genes in proximity to homozygous deletions and 30 ASC genes[16,20]. STRING identifies interactions between 21/30 ASC genes and 16/76 genes (11 affected individuals) in the neighborhood of qPCR validated biallelic deletions from Table 2 (p < 6e-5, see Figure S7c). In contrast, STRING predicts only one interaction between ASC genes and the 22 genes from 5 unaffected siblings (p = NS, Figure S7d). For display clarity, disconnected genes from individuals are excluded from the figure.