Lakshmi Chaitanya1, Arwin Ralf1, Mannis van Oven1, Tomasz Kupiec2, Joseph Chang3, Robert Lagacé3, Manfred Kayser1.
Abstract
Whole mitochondrial (mt) genome analysis enables a considerable increase in analysis throughput, and improves the discriminatory power to the maximum possible phylogenetic resolution. Most established protocols on the different massively parallel sequencing (MPS) platforms, however, invariably involve the PCR amplification of large fragments, typically several kilobases in size, which may fail due to mtDNA fragmentation in the available degraded materials. We introduce an MPS tiling approach for simultaneous whole human mt genome sequencing using 161 short overlapping amplicons (average 200 bp) with the Ion Torrent Personal Genome Machine. We illustrate the performance of this new method by sequencing 20 DNA samples belonging to different worldwide mtDNA haplogroups. Additional quality control, particularly regarding the potential detection of nuclear insertions of mtDNA (NUMTs), was performed by comparative MPS analysis using the conventional long-range amplification method. Preliminary sensitivity testing revealed that detailed haplogroup inference was feasible with 100 pg genomic input DNA. Complete mt genome coverage was achieved from DNA samples experimentally degraded down to genomic fragment sizes of about 220 bp, and up to 90% coverage from naturally degraded samples. Overall, we introduce a new approach for whole mt genome MPS analysis from degraded and nondegraded materials, suitable for resolving and inferring maternal genetic ancestry at complete resolution in anthropological, evolutionary, medical, and forensic applications.
Keywords: MPS; NGS; massively parallel sequencing; mitochondria; mtDNA; next-generation sequencing
Year: 2015 PMID: 26387877 PMCID: PMC5057296 DOI: 10.1002/humu.22905
Source DB: PubMed Journal: Hum Mutat ISSN: 1059-7794 Impact factor: 4.878
Performance Summary of the 20 Geographically Diverse DNA Samples for Whole mt Genome Sequencing with the MPS Tiling Approach via 161 Short Overlapping Amplicons
| Sample ID | Total reads | Aligned reads | Percent of aligned reads | Maximum coverage | Average coverage | Percent of coverage |
|---|---|---|---|---|---|---|
| 1 | 641,486 | 535,003 | 83.40 | 67,847 | 5,593 | 100% |
| 2 | 502,725 | 427,031 | 84.94 | 54,668 | 4,852 | 100% |
| 3 | 268,055 | 224,250 | 83.66 | 68,677 | 4,494 | 100% |
| 4 | 408,522 | 343,567 | 84.10 | 44,394 | 4,285 | 100% |
| 5 | 355,691 | 305,619 | 85.92 | 50,539 | 3,804 | 100% |
| 6 | 493,881 | 485,970 | 98 | 32,082 | 3,143 | 100% |
| 7 | 289,696 | 258,720 | 89.31 | 43,687 | 2,607 | 100% |
| 8 | 264,828 | 232,310 | 87.72 | 23,412 | 1,488 | 100% |
| 9 | 264,231 | 204,023 | 77.21 | 47,124 | 3,777 | 100% |
| 10 | 352,588 | 279,159 | 79.17 | 53,959 | 3,496 | 100% |
| 11 | 476,939 | 426,050 | 89.33 | 66,262 | 5,637 | 100% |
| 12 | 268,968 | 235,011 | 87.38 | 37,458 | 2,560 | 100% |
| 13 | 249,338 | 198,790 | 79.73 | 46,631 | 1,840 | 100% |
| 14 | 433,566 | 389,219 | 89.77 | 52,840 | 3,358 | 99.59% |
| 15 | 541,295 | 532,291 | 98.34 | 44,460 | 3,488 | 100% |
| 16 | 211,908 | 194,369 | 91.72 | 25,927 | 1,302 | 100% |
| 17 | 307,441 | 278,527 | 90.60 | 30,230 | 1,865 | 100% |
| 18 | 324,954 | 299,098 | 92.04 | 48,897 | 2,006 | 100% |
| 19 | 420,730 | 416,278 | 98.94 | 38,839 | 3,644 | 100% |
| 20 | 249,701 | 241,699 | 96.80 | 37,051 | 2,628 | 99.62% |
| Average | 366,327 | 325,349 | 88 | 45,749 | 3,293 | |
The 9‐bp deletion at np 8,281–8,289 in sample 17 is the defining mutation of haplogroup B. Hence, the coverage for sample 17 was considered to be 100% despite the 9‐bp deletion.
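The "Percent of aligned reads" column appears to be the ratio of aligned reads to total reads, expressed as a percentage. The following minimal sketch recomputes it for three rows of the table above; the function name `percent_aligned` is hypothetical, and rounding to two decimals is an assumption chosen to match how most rows are printed.

```python
def percent_aligned(total_reads: int, aligned_reads: int) -> float:
    """Return aligned reads as a percentage of total reads, rounded to 2 decimals."""
    if total_reads <= 0:
        raise ValueError("total_reads must be positive")
    return round(100.0 * aligned_reads / total_reads, 2)


# (total reads, aligned reads) copied from the table for samples 1, 2, and 9.
rows = {
    1: (641_486, 535_003),
    2: (502_725, 427_031),
    9: (264_231, 204_023),
}

for sample_id, (total, aligned) in rows.items():
    print(sample_id, percent_aligned(total, aligned))
```

The same check can be run on any other row to spot transcription errors in the read counts.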
Figure 1. The average amplicon coverage across all 20 samples tested with the newly developed MPS tiling approach for whole mt genome sequencing is presented in three plots. A: Amplicons 1–53. B: Amplicons 54–106. C: Amplicons 107–161.
Differences Between the Previously Developed Long‐Range Amplification MPS Approach and the Newly Introduced MPS Tiling Approach for Whole mt Genome Analysis Both Obtained via the PGM in Sample 2
NGS data from NextGENe® software:

| Method | Variant | %A | %C | %G | %T | %Insertions | %Deletions |
|---|---|---|---|---|---|---|---|
| Long range | m.10664C>Y | 0 | 17.46 | 0 | 78.31 | 0 | 4.23 |
| Tiling | m.10664C>T | 0 | 0.34 | 0 | 99.35 | 0 | 0.3 |
| Long range | m.8251G>R | 71.43 | 0 | 28.57 | 0 | 0 | 0 |
| Long range | m.8252C>M | 29.37 | 70.63 | 0 | 0 | 2.1 | 0 |
| Tiling | m.8251G>A | 97.27 | 0 | 2.1 | 0 | 0 | 0.63 |
Human mitochondrial genome, rCRS (GenBank NC_012920.1).
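The variant names in the table use IUPAC ambiguity codes (Y = C or T, R = A or G, M = A or C) where the long-range data show a mixture of bases. A minimal sketch of how such a code could be derived from the base percentages follows. The 10% minor-base threshold and the function name `consensus_code` are assumptions for illustration only, not the settings used by NextGENe or by the authors.

```python
# IUPAC codes for single bases and two-base mixtures.
IUPAC = {
    frozenset("A"): "A", frozenset("C"): "C",
    frozenset("G"): "G", frozenset("T"): "T",
    frozenset("AG"): "R", frozenset("CT"): "Y",
    frozenset("AC"): "M", frozenset("GT"): "K",
    frozenset("CG"): "S", frozenset("AT"): "W",
}


def consensus_code(base_pct: dict, min_pct: float = 10.0) -> str:
    """Map base percentages to an IUPAC code.

    Bases at or above min_pct (an assumed threshold) are treated as present.
    Mixtures of three or more bases, or no base at all, return "N" here as
    a simplification.
    """
    present = frozenset(b for b, p in base_pct.items() if p >= min_pct)
    return IUPAC.get(present, "N")


# Base percentages copied from the table (insertions and deletions ignored).
long_range_10664 = {"A": 0, "C": 17.46, "G": 0, "T": 78.31}
tiling_10664 = {"A": 0, "C": 0.34, "G": 0, "T": 99.35}
long_range_8251 = {"A": 71.43, "C": 0, "G": 28.57, "T": 0}

print(consensus_code(long_range_10664))  # mixed C/T call
print(consensus_code(tiling_10664))      # single-base call
print(consensus_code(long_range_8251))   # mixed A/G call
```

Under this assumed threshold, the long-range rows yield the ambiguity codes shown in the table, while the tiling rows resolve to a single base.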
Haplogroups of the 20 DNA Samples Whole mt Genome Sequenced with the MPS Tiling Tool and Interpreted Using MitoTool
| Sample ID | Haplogroup | Broad haplogroup | Known sampling region | Main geographic region of the (broad) haplogroup origin | References |
|---|---|---|---|---|---|
| 1 | L3d3a1a | L3*(xM,N) | South Africa | Africa, West Asia | Behar et al. ( |
| 2 | L0d1a1a (199 missing) | L0 | South Africa | Southern Africa | Behar et al. ( |
| 3 | H1c | H | Russia (Caucasus) | West Eurasia, Northern Africa | Loogväli et al. ( |
| 4 | U7a2 | U7 | Israel | West Eurasia, Central Asia, Southern Asia | Palanichamy et al. ( |
| 5 | P1d1 | P | New Guinea | Oceania (Papuan, Melanesian, and Australian Aborigines) | Friedlaender et al. ( |
| 6 | V | HV*(xH) | Algeria | West Eurasia, Northern Africa | Achilli et al. ( |
| 7 | A1a1 (missing 235) | A*(xA2) | China | East Asia | Kong et al. ( |
| 8 | J2b1a | J | Italy | West Eurasia | Pala et al. ( |
| 9 | M7c1a2a | M*(xM1,C,D) | China | South Asia, East Asia, Southeast Asia | Kong et al., |
| 10 | F4a1a | R9 | China | East Asia, Southeast Asia | Kong et al. ( |
| 11 | X2b5 | X | Orkney Islands | West Eurasia, Northern Africa, Americas | Reidla et al. ( |
| 12 | J2b1 (J2b1a1: missing 16278) | J | Italy | West Eurasia | Pala et al. ( |
| 13 | P | P | Australia (Aborigine) | Oceania (Papuan, Melanesian, and Australian Aborigines) | Friedlaender et al. ( |
| 14 | M72a | M*(xM1,C,D) | Thailand | South Asia, East Asia, Southeast Asia | Tabbada et al. ( |
| 15 | T2b | T | Italy | West Eurasia | Pala et al. ( |
| 16 | K1a12 | U | The Netherlands | West Eurasia | Achilli et al. ( |
| 17 | B5a1a | B5 | The Netherlands | East Asia | Kong et al. ( |
| 18 | W5a1a | W | The Netherlands | West Eurasia | Finnilä et al. ( |
| 19 | V3c | HV*(xH) | The Netherlands | West Eurasia, Northern Africa | Achilli et al. ( |
| 20 | H23 | H | The Netherlands | West Eurasia, Northern Africa | Loogväli et al. ( |
Figure 2. STR profiles from the AmpFlSTR® Identifiler® PCR Amplification Kit (Applied Biosystems) targeting 15 autosomal STRs plus amelogenin to illustrate the degree of DNA fragmentation for the sample treated with DNase at different time intervals: 5, 10, 15, 20, 30, and 45 min.
Performance Summary of PGM‐Based Whole mt Genome Sequencing of DNase‐Treated Sample at Different Time Intervals
| Sample group | Treatment / sample | Total reads | Aligned reads | Percent of aligned reads | Maximum coverage | Average coverage | Percent of coverage | Haplogroup |
|---|---|---|---|---|---|---|---|---|
| DNase‐treated sample at different time intervals | 5 min | 463,854 | 452,505 | 97.55% | 51,989 | 6,810 | 100% | M35a1 |
| | 10 min | 242,109 | 234,156 | 96.72% | 45,167 | 3,886 | 100% | |
| | 15 min | 163,854 | 152,505 | 93.07% | 38,249 | 3,007 | 100% | |
| | 20 min | 60,988 | 59,085 | 96.88% | 31,262 | 2,073 | 94.40% | |
| | 30 min | 83,565 | 80,367 | 96.17% | 10,731 | 1,183 | 82.60% | |
| | 45 min | 15,642 | 13,606 | 86.98% | 9,625 | 487 | 19.25% | |
| Sample 15 | UV for 30 min | 258,179 | 255,277 | 98.88% | 21,901 | 1,362 | 88.65% | T2b |
| | Enzymatic shearing | 315,030 | 278,648 | 88.45% | 5,827 | 975 | 89.90% | |
| Ancient and degraded bone and teeth samples | Sample 1 | 542,889 | 525,140 | 96.73% | 9,572 | 4,042.52 | 87.87% | U5b2b |
| | Sample 2 | 95,396 | 58,311 | 61.13% | 2,379 | 225.57 | 50.36% | H4a1 |
| | Sample 3 | 538,548 | 512,826 | 95.22% | 7,273 | 3,321.23 | 90.12% | H |
| | Sample 4 | 529,929 | 480,856 | 90.74% | 6,048 | 3,141.47 | 88.32% | T2b |
| | Sample 5 | 477,578 | 450,506 | 94.33% | 8,218 | 2,924.11 | 59.59% | U4a2 |
| | Sample 6 | 175,642 | 140,021 | 79.72% | 3,535 | 612.48 | 75.25% | T1a |
Sample 15 subjected to three different degradation methods and six different highly degraded bone and teeth samples.
Figure 3. The average mtDNA amplicon coverage across all amplicons for the DNase‐treated sample at the different time intervals as obtained with the MPS tiling approach.
Figure 4. A: The amplicon coverage (number of times amplicons were observed in the MPS data) of all 161 amplicons used to obtain complete mt genome coverage with our MPS approach, arranged by amplicon length from the shortest amplicon used (144 bp) on the left‐hand side to the longest amplicon used (230 bp) on the right‐hand side, for the different time intervals of enzymatic DNA degradation: 5, 10, 15, 20, 30, and 45 min. All amplicons plotted below X have coverage below the 50‐read threshold. B: Additional zoomed‐in image of the longer amplicons, from amplicon lengths of 206 bp (left‐hand side) to 230 bp (right‐hand side).
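The 50-read threshold mentioned in the Figure 4 caption can be applied to per-amplicon read counts to summarize how many amplicons remain usable as degradation proceeds. The sketch below is illustrative only: the function name `percent_amplicons_covered` and the example read counts are hypothetical, and this amplicon-level percentage is not necessarily how the "Percent of coverage" column in the tables was computed.

```python
def percent_amplicons_covered(read_counts, min_reads=50):
    """Return the percentage of amplicons with at least min_reads reads.

    min_reads defaults to 50, the threshold named in the Figure 4 caption.
    """
    if not read_counts:
        raise ValueError("read_counts must not be empty")
    covered = sum(1 for n in read_counts if n >= min_reads)
    return 100.0 * covered / len(read_counts)


# Hypothetical read counts for four amplicons (not data from the study).
example_counts = [500, 49, 50, 0]
print(percent_amplicons_covered(example_counts))
```

An amplicon with exactly 50 reads counts as covered here; whether the threshold is inclusive in the original analysis is not stated in the source.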