Klaasjan G Ouwens, Rick Jansen, Bas Tolhuis, P Eline Slagboom, Brenda W J H Penninx, Dorret I Boomsma.
Abstract
Postzygotic mutations are DNA changes acquired from the zygote stage onwards throughout the lifespan. These changes lead to differences in DNA sequence among cells of an individual, potentially contributing to the etiology of complex disorders. Here we compared whole-genome DNA sequence data of two monozygotic twin pairs, 40 and 100 years old, to detect somatic mosaicism. DNA samples were sequenced twice on two Illumina platforms (13X and 40X read depth) for increased specificity. Comparing allelic ratios between co-twins yielded sets of 1,720 and 1,739 putative postzygotic mutations in the 40-year-old and 100-year-old twin pairs, respectively, for subsequent enrichment analysis. These sets of putative mutations were strongly (p < 4.37e-91) enriched in both twin pairs for regulatory elements. The corresponding genes were significantly enriched for genes that are alternatively spliced and for genes involved in GTPase activity. This research shows that somatic mosaicism can be detected in monozygotic twin pairs by using allelic ratios calculated from DNA sequence data, and that the mutations found by this approach are not randomly distributed throughout the genome.
Keywords: genetics; mutations; next-generation sequencing; postzygotic mutations; somatic mosaicism
Year: 2018 PMID: 29980163 PMCID: PMC6175188 DOI: 10.1002/humu.23586
Source DB: PubMed Journal: Hum Mutat ISSN: 1059-7794 Impact factor: 4.878
Figure 1. The result of a possible postzygotic mutation during early development in a monozygotic twin pair. Circles represent somatic cells, with cells containing a postzygotic mutation in black. A locus that was heterozygous before the twinning event may show different allele fractions between co‐twins posttwinning. NGS variant calling software would generally call both co‐twins heterozygous at this locus.
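The allele-fraction comparison described in this caption can be illustrated with a minimal Python sketch. It assumes that the allelic ratio at a site is the fraction of reads carrying the alternate allele and that the allelic ratio difference (ARD) is the absolute difference of that fraction between co-twins; the exact definition used in the study is not given here. The class and function names are hypothetical, and the 0.25 cutoff is taken from the "ARD > 0.25" column of the variant-type table.

```python
from dataclasses import dataclass


@dataclass
class AlleleCounts:
    """Read counts at one heterozygous site in one twin (hypothetical container)."""
    ref: int  # reads supporting the reference allele
    alt: int  # reads supporting the alternate allele

    def alt_fraction(self) -> float:
        depth = self.ref + self.alt
        if depth == 0:
            raise ValueError("no reads at this site")
        return self.alt / depth


def allelic_ratio_difference(twin_a: AlleleCounts, twin_b: AlleleCounts) -> float:
    """Absolute difference in alternate-allele fraction between co-twins (assumed ARD definition)."""
    return abs(twin_a.alt_fraction() - twin_b.alt_fraction())


def is_putative_mosaic(twin_a: AlleleCounts, twin_b: AlleleCounts,
                       threshold: float = 0.25) -> bool:
    """Flag a site whose ARD exceeds the cutoff (0.25 as in the table column)."""
    return allelic_ratio_difference(twin_a, twin_b) > threshold


# Illustrative counts only, not data from the study.
twin_a = AlleleCounts(ref=20, alt=20)   # alternate fraction 0.5
twin_b = AlleleCounts(ref=32, alt=8)    # alternate fraction 0.2
ard = allelic_ratio_difference(twin_a, twin_b)
print(f"ARD = {ard:.2f}, putative mosaic: {is_putative_mosaic(twin_a, twin_b)}")
```

Both twins in this example would still be called heterozygous by a standard genotype caller, which is why the comparison works on read-level fractions rather than on genotype calls.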
Figure 2. The chance of finding a postzygotic mutation because of sampling error in 40X and 13X data, based on 1,000 simulations of 893,581 heterozygote sites. The red dotted line indicates 5% false positives.
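A simulation of this kind might look like the following Python sketch. It is a simplified illustration, not the authors' procedure: it assumes every site has exactly the nominal read depth in both twins, that the true allele fraction is 0.5 in both, and that a site counts as a false positive when the allele-fraction difference exceeds 0.25. The function name and the reduced number of sites are choices made here to keep the run short.

```python
import random


def simulate_false_positive_rate(depth: int, n_sites: int,
                                 threshold: float = 0.25, seed: int = 0) -> float:
    """Fraction of truly identical heterozygous sites whose allele-fraction
    difference between co-twins exceeds `threshold` purely by sampling."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_sites):
        # Each read carries the alternate allele with probability 0.5, so the
        # number of set bits in `depth` random bits is a Binomial(depth, 0.5) draw.
        alt_a = bin(rng.getrandbits(depth)).count("1")
        alt_b = bin(rng.getrandbits(depth)).count("1")
        if abs(alt_a - alt_b) / depth > threshold:
            hits += 1
    return hits / n_sites


# 20,000 sites per run for speed; the caption reports 893,581 sites and 1,000 runs.
rate_13x = simulate_false_positive_rate(depth=13, n_sites=20_000)
rate_40x = simulate_false_positive_rate(depth=40, n_sites=20_000)
print(f"13X: {rate_13x:.3f}  40X: {rate_40x:.3f}")
```

Under these assumptions the low-depth data produce chance differences far more often than the 40X data, which is consistent with the motivation for sequencing each sample on both platforms.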
Figure 3. The fraction of putative mosaic sites found in both Illumina sets for both twin pairs. Asterisks indicate significance of binomial test of matching sites versus nonmatching sites. *p < 1e–5; **p < 1e–10; ***p < 1e–20.
Enrichment of variant types in the set of putative postzygotic mutations
| | 40‐year‐old twin pair, ARD > 0 | 40‐year‐old twin pair, ARD > 0.25 | Enrichment p‐value | Enrichment FDR | 100‐year‐old twin pair, ARD > 0 | 100‐year‐old twin pair, ARD > 0.25 | Enrichment p‐value | Enrichment FDR |
|---|---|---|---|---|---|---|---|---|
| Total variants | 226,945 | 1,720 | | | 225,010 | 1,739 | | |
| Intronic | 55.4% | 59.7% | | 1.66e–04 | 55.2% | 59.6% | | |
| Intergenic | 35.1% | 24.5% | 1 | 1 | 35.4% | 24.6% | 1 | 1 |
| Modifier | 99.7% | 99.8% | 0.338 | 0.364 | 99.8% | 99.7% | 0.840 | 0.905 |
| Low impact | 0.9% | 2.6% | | 5.01e–09 | 0.9% | 2.0% | | |
| Moderate impact | 0.6% | 1.5% | | 3.14e–05 | 0.6% | 1.3% | | |
| High impact | 0.03% | 0.1% | 0.0978 | 0.114 | 0.03% | 0.1% | 0.0915 | 0.107 |
| Noncoding | 32.4% | 35.6% | | 2.13e | 32.1% | 35.8% | | |
| Synonymous | 0.7% | 1.5% | | 3.20e | 0.7% | 1.4% | | |
| Missense | 0.6% | 1.5% | | 3.14e | 0.6% | 1.3% | | |
| Regulatory | 9.1% | 25.7% | | 6.12e | 9.1% | 25.8% | | |
| TF binding | 0.7% | 2.9% | | 3.78e | 0.7% | 2.7% | | |
| Protein coding | 49.2% | 57.6% | | 1.582e | 49.4% | 59.1% | | |
| 3′ UTR | 1.7% | 2.4% | 0.0118 | 0.015 | 1.7% | 2.8% | | |
| 5′ UTR | 0.4% | 4.8% | | 8.05e | 0.4% | 3.3% | | |
| 1 KG median | 31.6 | 32.7 | | | 31.3% | 33.9 | | |
| 1 KG mean | 34.8 | 35.6 | | | 34.6% | 36.1 | | |
| Freq < 0.1 | 16.0% | 13.4% | | | 16.6% | 14.4% | | |
| Percentage in 1000G | 96.6% | 82.6% | | | 96.7% | 82.5% | | |
Notes. ARD = allelic ratio difference; FDR = false discovery rate. For an explanation of the different annotation terms, see https://www.ensembl.org/info/genome/variation/predicted_data.html#consequences. p‐values in italics are significant after correction for multiple testing.
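To make the enrichment comparison concrete, the following Python sketch tests whether one annotation is over-represented among the ARD > 0.25 sites relative to all sites. The statistical test used in the study is not stated in this record, so a one-sided binomial test is assumed here purely for illustration. The inputs are the 40-year-old pair's regulatory row (9.1% at ARD > 0, 25.7% of 1,720 sites at ARD > 0.25), and the observed count is reconstructed from the rounded percentage, so the result is only approximate. The function name is hypothetical.

```python
import math


def binomial_upper_tail(k: int, n: int, p: float) -> float:
    """P(X >= k) for X ~ Binomial(n, p), summed term by term in log space."""
    log_p = math.log(p)
    log_q = math.log1p(-p)
    total = 0.0
    for i in range(k, n + 1):
        log_pmf = (math.lgamma(n + 1) - math.lgamma(i + 1) - math.lgamma(n - i + 1)
                   + i * log_p + (n - i) * log_q)
        total += math.exp(log_pmf)  # negligible terms underflow harmlessly to 0.0
    return total


n_selected = 1720                         # putative postzygotic mutations (ARD > 0.25)
background_rate = 0.091                   # regulatory fraction among all sites (ARD > 0)
observed = round(0.257 * n_selected)      # regulatory sites, rebuilt from a rounded percentage
p_value = binomial_upper_tail(observed, n_selected, background_rate)
print(f"{observed} of {n_selected} sites regulatory, one-sided p = {p_value:.3g}")
```

A hypergeometric or Fisher's exact test would be an equally plausible choice, since the selected sites are a subset of the background set.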
Significant results from functional annotation using DAVID
| Database | Term | 40‐year‐old twin pair p‐value | 40‐year‐old twin pair FDR | 100‐year‐old twin pair p‐value | 100‐year‐old twin pair FDR |
|---|---|---|---|---|---|
| UP | Alternative splicing | 7.47e | 1.05e | 2.43e | 3.45e |
| UP | Splice variant | 1.09e | 1.92e | 7.73e | 1.39e |
| UP | Polymorphism | 3.55e | 4.992647e | 1.19e | 1.69e |
| KEGG | Inflammatory mediator regulation of TRP channels | 1.11e | 1.45e | n.s. | n.s. |
| GO | Intracellular signal transduction | 1.53e | 2.75e | n.s. | n.s. |
| GO | Positive regulation of synapse assembly | 7.95e | 0.0143 | n.s. | n.s. |
| GO | Positive regulation of GTPase activity | 1.01e | 0.0182 | n.s. | n.s. |
| UP | Cell junction | 1.50e | 0.0211 | 8.96e | 1.27e |
| UP | Sequence variant | 1.36e | 0.0241 | 1.19e | 0.0214 |
| UP | Synapse | n.s. | n.s. | 1.74e | 2.47e |
| UP | Ion channel | n.s. | n.s. | 2.00e | 2.84e |
| UP | Epidermal growth factor‐like domain | n.s. | n.s. | 5.35e | 8.92e |
| INTERPRO | Axon guidance | n.s. | n.s. | 5.99e | 0.011 |
| GO | EGF‐like domain | n.s. | n.s. | 1.26e | 0.0178 |
| UP | Ig‐like C2‐type 5 | n.s. | n.s. | 9.95e | 0.0179 |
| UP | GTPase activator activity | n.s. | n.s. | 1.40e | 0.0220 |
| GO | Membrane | n.s. | n.s. | 1.73e | 0.0245 |
| UP | Metal‐binding | n.s. | n.s. | 1.79e | 0.0254 |
| UP | Metal ion binding | n.s. | n.s. | 2.45e | 0.0386 |
| GO | EGF‐like 2 | n.s. | n.s. | 2.26e | 0.0405 |
Notes. FDR = false discovery rate; GO = Gene Ontology; KEGG = Kyoto Encyclopedia of Genes and Genomes; UP = UniProt.