Jian Li, R Alan Harris, Sau Wai Cheung, Cristian Coarfa, Mira Jeong, Margaret A Goodell, Lisa D White, Ankita Patel, Sung-Hae Kang, Chad Shaw, A Craig Chinault, Tomasz Gambin, Anna Gambin, James R Lupski, Aleksandar Milosavljevic.
Abstract
The hotspots of structural polymorphisms and structural mutability in the human genome remain to be explained mechanistically. We examine associations of structural mutability with germline DNA methylation and with non-allelic homologous recombination (NAHR) mediated by low-copy repeats (LCRs). Combined evidence from four human sperm methylome maps, human genome evolution, structural polymorphisms in the human population, and previous genomic and disease studies consistently points to a strong association of germline hypomethylation and genomic instability. Specifically, methylation deserts, the ~1% fraction of the human genome with the lowest methylation in the germline, show a tenfold enrichment for structural rearrangements that occurred in the human genome since the branching of chimpanzee and are highly enriched for fast-evolving loci that regulate tissue-specific gene expression. Analysis of copy number variants (CNVs) from 400 human samples identified using a custom-designed array comparative genomic hybridization (aCGH) chip, combined with publicly available structural variation data, indicates that association of structural mutability with germline hypomethylation is comparable in magnitude to the association of structural mutability with LCR-mediated NAHR. Moreover, rare CNVs occurring in the genomes of individuals diagnosed with schizophrenia, bipolar disorder, and developmental delay and de novo CNVs occurring in those diagnosed with autism are significantly more concentrated within hypomethylated regions. These findings suggest a new connection between the epigenome, selective mutability, evolution, and human disease.
Year: 2012 PMID: 22615578 PMCID: PMC3355074 DOI: 10.1371/journal.pgen.1002692
Source DB: PubMed Journal: PLoS Genet ISSN: 1553-7390 Impact factor: 5.917
Figure 1. Association between methylation deserts and human-specific structural rearrangements.
(A) Locations of human-specific structural rearrangements (black), 100 Kbp windows with methylation index value 0 (violet), and 100 Kbp windows with the lowest 1% sperm methylation at 15× coverage (green) and 2.5× coverage (red) for three representative chromosomes (see Figure S18 for a whole-genome view). (B) Cumulative sperm methylation distribution and the Kolmogorov-Smirnov statistics for 100 Kbp windows containing rearrangements (solid line) and the rest of the windows (dashed line) at 15× coverage (green) and at 2.5× coverage (red). (C) Simulation test of the extent of hypomethylation in the regions flanking human-specific structural rearrangements. The distribution of methylation levels for 10 Kbp regions sampled at increasing distances (from 10 Kbp to 100 Kbp) from the 522 human-specific structural rearrangements is compared to the distribution of methylation levels of randomly picked segments with matching sizes within the same chromosome (100 random samplings for each rearrangement). The same analysis is performed for methylomes at 15× coverage (green) and 2.5× coverage (red). The statistic D and its p-value were determined using the Kolmogorov-Smirnov test.
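As an illustration of the D statistic reported in panels B and C, the following is a minimal sketch of the two-sample Kolmogorov-Smirnov statistic: the largest vertical gap between two empirical cumulative distributions. The function name `ks_statistic` and the methylation values are hypothetical and are not taken from the study; the p-value calculation is omitted.

```python
from bisect import bisect_right


def ks_statistic(sample_a, sample_b):
    """Two-sample Kolmogorov-Smirnov statistic D.

    D is the largest absolute difference between the empirical
    cumulative distribution functions (ECDFs) of the two samples.
    """
    a = sorted(sample_a)
    b = sorted(sample_b)
    if not a or not b:
        raise ValueError("both samples must be non-empty")
    d = 0.0
    # The ECDFs only change at observed values, so checking those suffices.
    for x in set(a) | set(b):
        cdf_a = bisect_right(a, x) / len(a)  # fraction of a that is <= x
        cdf_b = bisect_right(b, x) / len(b)  # fraction of b that is <= x
        d = max(d, abs(cdf_a - cdf_b))
    return d


# Hypothetical methylation levels per 100 Kbp window (illustrative only).
windows_with_rearrangements = [0.35, 0.42, 0.50, 0.75, 0.80]
other_windows = [0.72, 0.78, 0.81, 0.85, 0.90]
d_value = ks_statistic(windows_with_rearrangements, other_windows)
```

A larger D means the two cumulative curves are further apart, which in this setting would correspond to windows with rearrangements being shifted toward lower methylation.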
Figure 2. Statistical risk analysis of structural mutability due to hypomethylation and DP-LCRs.
(A) Venn diagram of 100 Kbp windows classified into one or more of the following three categories: (i) windows containing human-specific structural rearrangements; (ii) windows within methylation deserts (windows with lowest 1% methylation at 2.5× or 15× coverage); and (iii) windows containing regions between DP-LCRs. Numbers within the circle areas indicate fraction (per mil) of the genome occupied by the specific groups of windows. (B) Statistical relative risk (RR) and statistical attributable risk (AR) of structural instability for hypomethylation and DP-LCRs (the first row corresponds to A).
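The relative risk in panel B can be illustrated with a short sketch over window counts. The caption does not state which definition of attributable risk was used, so the sketch assumes the simple risk difference between exposed and unexposed windows; the function names and all counts are hypothetical.

```python
def relative_risk(events_exposed, total_exposed, events_unexposed, total_unexposed):
    """Risk in the exposed group divided by risk in the unexposed group."""
    risk_exposed = events_exposed / total_exposed
    risk_unexposed = events_unexposed / total_unexposed
    return risk_exposed / risk_unexposed


def attributable_risk(events_exposed, total_exposed, events_unexposed, total_unexposed):
    """Risk difference (assumed definition): exposed risk minus unexposed risk."""
    return events_exposed / total_exposed - events_unexposed / total_unexposed


# Hypothetical counts of 100 Kbp windows (not from the study):
# "exposed" = windows inside methylation deserts,
# "event"   = window contains a human-specific rearrangement.
rr = relative_risk(30, 300, 297, 29700)
ar = attributable_risk(30, 300, 297, 29700)
```

With these made-up counts, desert windows carry ten times the risk of the remaining windows; the same two functions apply if "exposed" is redefined as windows between DP-LCRs.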
Figure 3. Major patterns of CNVs in relation to LCRs (arrows with the same texture indicate paralogous LCRs).
(A) CNVs involving whole regions between DP-LCRs. (B) Scattered CNVs (CNVs covering <40% of the distance between LCRs) between DP-LCRs. (C) CNVs involving whole regions between non-paralogous LCRs. (D) Scattered CNVs between non-paralogous LCRs. (E) Complex patterns of CNVs extending over various LCR groups and intervening regions. (F) CNVs overlapping LCRs. (G–H) Contingency tables summarizing the counts of CNVs observed between LCRs, corresponding to A, B, C and D. The CNVs between paralogous LCRs tend to involve the whole region (as illustrated in A, corresponding to counts in top left cells in G and H), a signature of NAHR involving paralogous LCRs.
Figure 4. Structural mutability assessed by structural heterozygosity.
(A) Under the infinite allele model, assuming structural mutations are neutral and at drift-mutation equilibrium, mutation rates are proportional to heterozygosity rates. (B) Comparison of average CNV heterozygosity rates (data from four studies) within (black for methylomes at 15× coverage, gray for methylomes at 2.5× coverage) and outside (white) methylation deserts. Error bars represent standard deviation of CNV heterozygosity rates in corresponding regions.
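The proportionality stated in panel A can be made explicit with the standard infinite-alleles expectation for a diploid population at drift-mutation equilibrium. The symbols below ($H$ for expected heterozygosity, $N_e$ for effective population size, $\mu$ for the per-locus mutation rate) are introduced here for illustration and do not appear in the caption.

```latex
H = \frac{\theta}{1 + \theta}, \qquad \theta = 4 N_e \mu
\quad\Longrightarrow\quad
\mu = \frac{H}{4 N_e \,(1 - H)} \;\approx\; \frac{H}{4 N_e} \qquad (H \ll 1).
```

When heterozygosity is low, the mutation rate is therefore approximately proportional to heterozygosity, so for a shared $N_e$ the ratio of heterozygosity rates inside and outside methylation deserts approximates the ratio of mutation rates.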
Enrichment of structural mutability in hypomethylated regions determined by the germline methylation index (MI = 0) and whole-genome bisulfite sequencing of human sperm DNA (at 2.5× and 15×).
| Enrichment fold (p-value) | Windows with MI = 0 | Sperm lowest 5% (2.5×) | Sperm lowest 5% (15×) | MI>0 & sperm<5% (2.5×) | MI>0 & sperm<5% (15×) | MI = 0 & sperm>5% (2.5×) | MI = 0 & sperm>5% (15×) | MI = 0 & sperm<5% (2.5×) | MI = 0 & sperm<5% (15×) |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Human evolution | | | | | | | | | |
| | 10.2 (3.9e-106) | 5.1 (2.0e-77) | 7.1 (3.7e-226) | 3.6 (2.6e-35) | 5.2 (1.3e-78) | 6.0 (1.3e-28) | 2.6 (7.0e-3) | 12.5 (5.1e-76) | 12.1 (4.0e-134) |
| Structural polymorphisms | | | | | | | | | |
| | 2.7 (4.2e-13) | 1.8 (5.7e-8) | 2.3 (1.6e-26) | 1.7 (6.5e-5) | 2.4 (2.0e-22) | 2.5 (2.8e-7) | 2.1 (3.2e-3) | 3.4 (2.1e-9) | 3.2 (1.2e-13) |
| | 2.0 (1.6e-12) | 1.3 (6.0e-4) | 1.5 (1.9e-16) | 1.3 (6.3e-4) | 1.4 (8.6e-8) | 1.8 (5.9e-8) | 1.3 (1.9e-2) | 2.0 (3.1e-6) | 2.3 (1.5e-17) |
| | 2.6 (2.3e-24) | 1.6 (1.1e-9) | 2.2 (3.0e-56) | 1.5 (3.1e-7) | 2.2 (1.5e-34) | 2.5 (1.8e-17) | 1.8 (1.1e-3) | 2.5 (6.3e-9) | 3.2 (1.3e-31) |
| | 3.1 (4.2e-10) | 2.6 (4.1e-6) | 2.1 (1.1e-10) | 2.5 (4.4e-6) | 1.6 (2.3e-12) | 3.9 (1.4e-5) | 1.7 (6.8e-3) | 2.8 (3.8e-3) | 2.9 (4.2e-10) |
| Disease studies | | | | | | | | | |
| | 4.0 (2.5e-5) | 2.7 (1.2e-2) | 2.0 (2.7e-2) | 2.6 (4.7e-2) | 1.8 (7.3e-2) | 3.8 (5.0e-3) | 5.3 (3.5e-3) | 5.7 (4.5e-3) | 3.5 (2.0e-3) |
| | 3.9 (1.1e-3) | 4.1 (1.3e-4) | 3.8 (5.0e-3) | 4.5 (1.8e-5) | 2.1 (7.3e-5) | 3.5 (1.8e-2) | 2.7 (1.4e-3) | 8.2 (1.7e-5) | 8.4 (5.5e-3) |
| | 1.7 (8.3e-2) | 2.3 (1.2e-2) | 2.0 (9.9e-2) | 1.8 (1.8e-2) | 1.1 (9.9e-1) | 0.6 (7.3e-1) | 0.5 (4.7e-1) | 1.4 (1.1e-1) | 2.7 (9.3e-3) |
| | 2.3 (2.4e-83) | 2.9 (4.5e-302) | 3.3 (9.9e-80) | 2.5 (1.1e-298) | 1.9 (1.6e-190) | 3.4 (5.8e-231) | 4.8 (1.4e-276) | 1.6 (1.4e-3) | 2.1 (3.2e-3) |
P-values are calculated using the chi-square test. Significance of enrichment for hypomethylation in rows marked “Human evolution” and “Structural polymorphisms” was calculated relative to randomly selected windows throughout the genome. For rows marked “Disease studies”, significance of enrichment for hypomethylation was calculated using the following controls: for the schizophrenia study, control-specific rare CNVs; for the autism study, inherited rare CNVs found in cases; for the bipolar study, control-specific singleton deletions. For the developmental delay study, significance of enrichment for hypomethylation in windows containing rare (<1% population frequency) CNVs found in cases was established using the CNVs found in the control group as controls.
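For readers who want to see how a chi-square p-value of this kind arises from window counts, here is a minimal sketch of the Pearson chi-square test on a 2×2 contingency table with one degree of freedom. It assumes no continuity correction, which the footnote does not specify; the function name `chi_square_2x2` and the counts are hypothetical.

```python
import math


def chi_square_2x2(a, b, c, d):
    """Pearson chi-square test for the 2x2 table [[a, b], [c, d]].

    Example layout (an assumption for illustration):
        a = hypomethylated windows with a variant, b = hypomethylated without,
        c = other windows with a variant,          d = other windows without.
    Returns (statistic, p_value) for 1 degree of freedom.
    """
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    if 0 in (row1, row2, col1, col2):
        raise ValueError("every row and column total must be non-zero")
    statistic = n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)
    # Upper tail of the chi-square distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(statistic / 2.0))
    return statistic, p_value


# Hypothetical counts, not taken from the table above.
stat, p = chi_square_2x2(20, 10, 10, 20)
```

For extremely strong associations the p-value computed this way can underflow to zero in double precision, so very small values are better handled on a log scale.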
Enrichment of various regulatory features in methylation deserts detected using permutation test or chi-square test. Enrichments for an expanded set of regulatory features are included in Table S6.
| Regulatory features | Fold-enrichment in methylation deserts | p-value |
| --- | --- | --- |
| | 15 | <1e-3 |
| | 12 | <1e-10 |
| | 9.2 | 1.42e-43 |
| | 33 | 1.41e-146 |
| | 37.6 | <1e-4 |
| | 4 | <1e-3 |
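A permutation test of feature enrichment in methylation deserts can be sketched as follows. This is an illustrative scheme under stated assumptions, not the authors' procedure: the genome is treated as numbered windows, features are relocated uniformly at random, and the function name `overlap_permutation_test`, its parameters, and the example data are all hypothetical.

```python
import random


def overlap_permutation_test(desert_windows, feature_windows, n_windows,
                             n_permutations=1000, seed=0):
    """Estimate fold enrichment and a one-sided p-value by permutation.

    Windows are integer indices in range(n_windows). Each permutation
    places the same number of features at random windows and counts how
    many land inside methylation deserts.
    """
    rng = random.Random(seed)
    deserts = set(desert_windows)
    features = set(feature_windows)
    observed = len(deserts & features)

    null_counts = []
    for _ in range(n_permutations):
        shuffled = rng.sample(range(n_windows), len(features))
        null_counts.append(sum(1 for w in shuffled if w in deserts))

    expected = sum(null_counts) / n_permutations
    fold = observed / expected if expected > 0 else float("inf")
    # Add-one correction so the estimated p-value is never exactly zero.
    as_extreme = sum(1 for c in null_counts if c >= observed)
    p_value = (as_extreme + 1) / (n_permutations + 1)
    return fold, p_value


# Hypothetical example: 10 desert windows out of 1000, and 10 features,
# 8 of which fall inside deserts.
fold, p_value = overlap_permutation_test(
    desert_windows=range(10),
    feature_windows=list(range(8)) + [500, 600],
    n_windows=1000,
)
```

In this sketch the smallest p-value that can be reported is 1/(n_permutations + 1), so reaching smaller bounds requires more permutations or an analytic test.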