Shoji Tatsumoto, Yasuhiro Go, Kentaro Fukuta, Hideki Noguchi, Takashi Hayakawa, Masaki Tomonaga, Hirohisa Hirai, Tetsuro Matsuzawa, Kiyokazu Agata, Asao Fujiyama.
Abstract
Mutations generate genetic variation and are a major driving force of evolution. Therefore, examining mutation rates and modes is essential for understanding the genetic basis of the physiology and evolution of organisms. Here, we aim to identify germline de novo mutations through a whole-genome survey of Mendelian inheritance error sites (MIEs), that is, sites where the offspring's genotype cannot be explained by Mendelian inheritance from either parent, using ultra-deep whole-genome sequences (>150-fold) from a chimpanzee parent-offspring trio. We identified 889 such MIEs and classified them into four categories based on the pattern of inheritance and the sequence read depth: [i] de novo single nucleotide variants (SNVs), [ii] copy number neutral inherited variants, [iii] hemizygous deletion inherited variants, and [iv] de novo copy number variants (CNVs). From the de novo SNV candidates, we estimated a germline de novo SNV mutation rate of 1.48 × 10⁻⁸ per site per generation, or 0.62 × 10⁻⁹ per site per year. In summary, this study demonstrates the significance of ultra-deep whole-genome sequencing not only for the direct estimation of mutation rates but also for discerning various mutation modes, including de novo allelic conversion and de novo CNVs, by identifying MIEs through the transmission of genomes from parents to offspring.
Year: 2017 PMID: 29093469 PMCID: PMC5666008 DOI: 10.1038/s41598-017-13919-7
Source DB: PubMed Journal: Sci Rep ISSN: 2045-2322 Impact factor: 4.379
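The abstract reports the de novo SNV rate both per site per generation and per site per year. Dividing the first by the second gives the generation time that the two figures imply. The short sketch below does only that arithmetic; the generation time is not stated in this text, so the result is an inferred value, and the variable names are illustrative.

```python
# Rates as reported in the abstract.
rate_per_generation = 1.48e-8   # mutations per site per generation
rate_per_year = 0.62e-9         # mutations per site per year

# Assumption: the per-year rate was obtained by dividing the
# per-generation rate by a single average generation time.
implied_generation_time = rate_per_generation / rate_per_year

print(f"Implied generation time: {implied_generation_time:.1f} years")
```

Under that assumption the two reported rates correspond to a generation time of roughly 24 years.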
Figure 1. Whole-genome sequencing (WGS) and workflow of variant discovery. (A) Pipeline for mapping and variant detection. The offspring’s data are shown in the box. (B) Distribution of the read depth within the datasets from the chimpanzee trio. The lower and upper read depths shown in each histogram indicate ±3σ from the mean, and reads in the outlier regions were excluded from the subsequent analyses.
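The ±3σ depth filter described for Figure 1B can be illustrated with a minimal sketch. It assumes per-site depths are available as a plain list and uses the population standard deviation; the caption does not say which estimator was used or how depths were binned, so `depth_bounds` and `filter_by_depth` are hypothetical helpers rather than the authors' pipeline.

```python
from statistics import mean, pstdev

def depth_bounds(depths, k=3.0):
    """Return (lower, upper) bounds at mean ± k standard deviations."""
    mu = mean(depths)
    sigma = pstdev(depths)  # assumption: population standard deviation
    return mu - k * sigma, mu + k * sigma

def filter_by_depth(depths, k=3.0):
    """Keep only depths that fall inside the mean ± k·σ window."""
    lower, upper = depth_bounds(depths, k)
    return [d for d in depths if lower <= d <= upper]

# Toy data: 99 sites at depth 100 and one extreme outlier.
toy_depths = [100] * 99 + [1000]
kept = filter_by_depth(toy_depths)
print(len(toy_depths), "->", len(kept))
```

In this toy input the single site at depth 1000 lies above the upper bound and is dropped, while the 99 sites at depth 100 are kept.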
Summary of SNVs.
| Individual | Father | Mother | Offspring |
|---|---|---|---|
| | 27 ≤ depth ≤ 251 | 29 ≤ depth ≤ 199 | 34 ≤ depth ≤ 201 |
| | 977,567 | 968,196 | 975,445 |
| | 1,748,513 | 1,767,067 | 1,751,084 |
| | 2,726,080 | 2,735,263 | 2,726,529 |
| | 0.118 | 0.118 | 0.118 |
| | 0.076 | 0.076 | 0.076 |
| | 1,810,503 | 1,818,242 | 1,811,915 |
| | 915,577 | 917,021 | 914,614 |
| | 1.98 | 1.98 | 1.98 |
| | 89.16 | | |
| | 93.05 | | |
Figure 2. Classification of the MIEs. When the variant alleles were identified only in the offspring, they were classified as [i] de novo SNVs. Inherited MIEs are classified into [ii] copy-number neutral inherited variants (CNIVs), [iii] hemizygous-deletion inherited variants (HDIVs), and [iv] de novo CNVs, according to the relative depth of the read coverage among the trio’s sequences. Black circles indicate the sites of SNVs. The vertical columns in the right panel represent schematics of the read coverage and their relative ratios.
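The four-way classification in Figure 2 can be pictured as a small decision rule. The sketch below is only an illustration: the caption gives no numeric thresholds, so the tolerance `tol`, the use of depth relative to each individual's mean, and the function names `near` and `classify_mie` are all assumptions, not the published criteria.

```python
def near(value, target, tol=0.2):
    """True if a relative depth is within tol of the target ratio."""
    return abs(value - target) <= tol

def classify_mie(variant_in_parent, father, mother, offspring):
    """Classify one MIE site (hypothetical rule, assumed thresholds).

    variant_in_parent: True if the offspring's variant allele is also
        seen in at least one parent (an inherited MIE).
    father, mother, offspring: read depth at the site divided by that
        individual's mean depth (about 1.0 for two copies, 0.5 for one).
    """
    if not variant_in_parent:
        return "de novo SNV"
    parents_normal = near(father, 1.0) and near(mother, 1.0)
    if parents_normal and near(offspring, 1.0):
        return "CNIV"          # copy-number neutral inherited variant
    if parents_normal:
        return "de novo CNV"   # depth change seen only in the offspring
    if near(offspring, 0.5) and (near(father, 0.5) or near(mother, 0.5)):
        return "HDIV"          # hemizygous deletion shared with a parent
    return "unclassified"

print(classify_mie(False, 1.0, 1.0, 1.0))
print(classify_mie(True, 1.02, 0.97, 1.05))
print(classify_mie(True, 0.5, 1.0, 0.52))
print(classify_mie(True, 1.0, 1.0, 0.5))
```

The `unclassified` branch is kept deliberately, since real depth profiles will not always match one of the four idealized patterns.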
Figure 3. Representative region of hemizygous deletion and a de novo CNV on chromosome 22. Blue, red, and green lines represent the average depth of the read coverage for the corresponding regions in the father, mother, and offspring, respectively.
Figure 4. Representative Sanger sequencing electropherograms at the positions of de novo SNVs. (A) An example of a germline de novo SNV identified on chromosome 12, where the parents’ genotypes are homozygous and those of the blood and hair follicle DNAs of the offspring are heterozygous (red arrow). (B) A somatic de novo SNV identified on chromosome 3, where only the blood-derived DNA of the offspring is heterozygous (red arrow).
False positive SNVs identified from the DNAs of blood and hair follicle cells using NGS and Sanger sequencing.
| chr | position | panTro4 | Father Blood NGS | Mother Blood NGS | Offspring Blood NGS | Father Blood Sanger | Mother Blood Sanger | Offspring Blood Sanger | Offspring Hair Sanger | Call* |
|---|---|---|---|---|---|---|---|---|---|---|
| chr1 | 13552700 | C | CC | CC | CT | CC | CC | CC | CC | FP |
| chr2A | 102577476# | C | CT | CC | CG | CGT# | CC | CGT# | CGT# | FP |
| chr6 | 7711997 | A | AG | AA | AT | AG | AA | AG | AG | FP |
| chr6 | 12022852 | G | GG | GG | GT | GG | GG | GG | GG | FP |
| chr6 | 33261071 | T | TT | TT | TA | TA | TA | TA | TA | FP |
| chr12 | 14055837 | T | TT | TT | TA | TA | TT | TA | TA | FP |
| chr12 | 28800658 | C | CC | CC | CT | CC | CC | CC | CC | FP |
| chr22 | 22163245# | G | GC | GG | GA | GA | GG | GA | GA | FP |
*FP: false positive, #Known segmental duplication regions in chimpanzees[30].
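The false-positive calls in the table above can be read as sites where the NGS genotypes violate Mendelian inheritance but the Sanger genotypes do not. A minimal sketch of that check follows, assuming unphased diploid genotypes written as two-letter strings; `is_mendelian_consistent` is a hypothetical helper, and the rows are copied from four biallelic entries of the table (offspring blood Sanger used for the child).

```python
def is_mendelian_consistent(father, mother, child):
    """True if the child's two alleles can be drawn one from each parent."""
    a, b = child
    return (a in father and b in mother) or (b in father and a in mother)

# (chr, position, NGS trio, Sanger trio); each trio is (father, mother, offspring).
rows = [
    ("chr1", 13552700, ("CC", "CC", "CT"), ("CC", "CC", "CC")),
    ("chr6", 7711997, ("AG", "AA", "AT"), ("AG", "AA", "AG")),
    ("chr6", 33261071, ("TT", "TT", "TA"), ("TA", "TA", "TA")),
    ("chr12", 14055837, ("TT", "TT", "TA"), ("TA", "TT", "TA")),
]

for chrom, pos, ngs, sanger in rows:
    ngs_mie = not is_mendelian_consistent(*ngs)
    sanger_mie = not is_mendelian_consistent(*sanger)
    print(chrom, pos, "NGS MIE:", ngs_mie, "Sanger MIE:", sanger_mie)
```

For these four rows the NGS genotypes each form an MIE, while the Sanger genotypes are consistent with Mendelian inheritance, which matches their FP calls.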
De novo SNVs identified only by DeNovoGear and genotypes determined by NGS and Sanger sequencing.
| chr | position | panTro4 | Father Blood NGS | Mother Blood NGS | Offspring Blood NGS | Father Blood Sanger | Mother Blood Sanger | Offspring Blood Sanger | Call* |
|---|---|---|---|---|---|---|---|---|---|
| chr1 | 2694332 | T | CC | CC | CT | CC | CC | CC | TN |
| chr3 | 201706151 | T | TT | TT | CT | TT | TT | TT | TN |
| chr6 | 73652409 | A | AA | AA | AC | AA | AC | AC | TN |
| chr8 | 29532927 | A | AA | AA | AG | AG | AA | AG | TN |
| chr15 | 21679458 | T | TT | TT | CT | CT | TT | CT | TN |
| chr17 | 34345520 | A | AA | AA | AC | AC | AA | AC | TN |
| chr19 | 56060294 | T | TT | TT | CT | TT | TT | TT | TN |
*TN: true negative.
Figure 5. Number of candidate de novo SNV sites across four sequencing coverage depths (30×, 60×, 90×, 120×). (A) Venn diagram of de novo SNVs shared among the four coverage datasets. The low- and middle-coverage data (30× and 60×) in particular have many non-shared de novo SNVs. (B) Comparison of the shared and specific de novo SNVs between the 90× and 120× coverage data. The result shows that 90× coverage is not sufficient for accurate de novo SNV calling.