Alessandro Achilli, Ugo A Perego, Claudio M Bravi, Michael D Coble, Qing-Peng Kong, Scott R Woodward, Antonio Salas, Antonio Torroni, Hans-Jürgen Bandelt.
Abstract
Only a limited number of complete mitochondrial genome sequences belonging to Native American haplogroups were available until recently, which left America as the continent with the least amount of information about sequence variation of entire mitochondrial DNAs. In this study, a comprehensive overview of all available complete mitochondrial DNA (mtDNA) genomes of the four pan-American haplogroups A2, B2, C1, and D1 is provided by revising the information scattered throughout GenBank and the literature, and adding 14 novel mtDNA sequences. The phylogenies of haplogroups A2, B2, C1, and D1 reveal a large number of sub-haplogroups but suggest that the ancestral Beringian population(s) contributed only six (successful) founder haplotypes to these haplogroups. The derived clades are overall starlike with coalescence times ranging from 18,000 to 21,000 years (with one exception) using the conventional calibration. The average of about 19,000 years somewhat contrasts with the corresponding lower age of about 13,500 years that was recently proposed by employing a different calibration and estimation approach. Our estimate indicates a human entry and spread of the pan-American haplogroups into the Americas right after the peak of the Last Glacial Maximum and comfortably agrees with the undisputed ages of the earliest Paleoindians in South America. In addition, the phylogenetic approach also indicates that the pathogenic status proposed for various mtDNA mutations, which actually define branches of Native American haplogroups, was based on insufficient grounds.
Year: 2008 PMID: 18335039 PMCID: PMC2258150 DOI: 10.1371/journal.pone.0001764
Source DB: PubMed Journal: PLoS One ISSN: 1932-6203 Impact factor: 3.240
Figure 1. Basal tree encompassing the roots of Native American mtDNA haplogroups.
The tree is rooted on the haplogroup L3 founder and the position of the revised Cambridge reference sequence (rCRS) [49] is indicated for reading off sequence motifs. Closely related Asian branches are indicated in green. Detailed phylogenies for the four pan-American haplogroups (A2, B2, C1, and D1, highlighted in red) are shown in the corresponding figures. The complete sequences that are currently available for the other four Native American haplogroups (X2a, C4c, D2a, and D4h3, highlighted in red) are also displayed. Haplogroup D3 is common among Inuit populations [16], but all complete sequences available are from Siberia [13], [18]. As for A2a, the HVS-I motif (16111 16192 16223 16233 16290 16319 16331) of the reported sequence (no. 1) is common in Na-Dené groups [5]. Sequence no. 2 has been revised taking into account that the originally reported transitions at 4732 and 5147 [8] were artifacts due to a sample mix-up, while sequence no. 6 represents the shared motif of six Aleutian mitochondrial genomes [13]. Mutations are transitions unless specified: suffixes indicate transversions (to A, G, C, or T) or indels (+, d). Mutations back to the rCRS nucleotide are prefixed with @. Recurrent mutational events are underlined. Mutations in italics are either disease-causing or heteroplasmic or likely erroneous (and do not enter age calculations). We have followed the recent guidelines for standardization of the alignment in long C stretches [50], but disregarded any length variation in the C stretches that would then be scored at 309 or 16193 (which is often subject to considerable heteroplasmy). A number flagging a circled haplotype indicates the number of individuals sharing the corresponding haplotype (if >1). Additional information is provided in Text S4, while Table S1 lists the source of the complete genomes.
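The labelling conventions in this legend can be expressed as a small parser. The Python sketch below is illustrative only: the grammar (an optional @, a position, then an optional base, d, or + followed by inserted bases) is an assumption inferred from the legend, the names `Mutation` and `parse_label` are hypothetical, and the example labels are invented rather than read off the tree.

```python
import re
from dataclasses import dataclass

# Assumed label grammar (inferred from the legend, not taken from the paper):
#   [@] position [suffix]
# suffix: a base letter (transversion), "d" (deletion), or "+" plus bases (insertion).
_LABEL = re.compile(r"^(?P<back>@)?(?P<pos>\d+)(?P<suffix>[ACGT]|d|\+[ACGT]*)?$")


@dataclass(frozen=True)
class Mutation:
    position: int
    kind: str           # "transition", "transversion", "insertion" or "deletion"
    detail: str         # target base or inserted bases; "" when not applicable
    back_to_rcrs: bool  # True when the label carries the "@" prefix


def parse_label(label: str) -> Mutation:
    """Parse one mutation label under the assumed grammar above."""
    match = _LABEL.match(label.strip())
    if match is None:
        raise ValueError(f"unrecognised mutation label: {label!r}")
    suffix = match.group("suffix") or ""
    if suffix == "":
        kind, detail = "transition", ""      # no suffix: transition by default
    elif suffix == "d":
        kind, detail = "deletion", ""
    elif suffix.startswith("+"):
        kind, detail = "insertion", suffix[1:]
    else:
        kind, detail = "transversion", suffix
    back = match.group("back") is not None
    return Mutation(int(match.group("pos")), kind, detail, back)


# Invented example labels, one per category.
for text in ["16223", "@16111", "100A", "200d", "300+C"]:
    print(text, parse_label(text))
```

Underlining and italics carry further information in the figure (recurrence, exclusion from age calculations) that a plain-text label cannot encode, so a fuller representation would need extra fields for those flags.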
Figure 2. Phylogeny of complete mtDNA sequences belonging to haplogroup A2.
The sequencing procedure for the novel complete sequences and the phylogeny construction were performed as described elsewhere [47]. Recurrent mutational events within the haplogroup are underlined, while mutations in italics are either disease-causing or heteroplasmic or likely erroneous, and were not used for age calculations. Table S1 lists the source of the complete genomes. For additional information, see the legend for Figure 1.
Figure 3. Phylogeny of complete mtDNA sequences belonging to haplogroups B2 (A), C1 (B) and D1 (C).
For additional information, see the legends for Figures 1 and 2.
Haplogroup coalescence time estimates
| Haplogroup | No. of mtDNAs | No. of base substitutions | Average no. of substitutions from root | Standard error | Starlikeness | T (years) | ΔT (years) |
|---|---|---|---|---|---|---|---|
|  | 96 | 321+3 | 3.340 | 0.322 | 0.332 | 17,200 | 1,700 |
|  | 86+1 | 304+3 | 3.529 | 0.348 | 0.335 | 18,100 | 1,800 |
|  | 27+16 | 116+61 | 4.116 | 0.463 | 0.447 | 21,200 | 2,400 |
|  | 42+13 | 198+57 | 4.636 | 0.836 | 0.121 | 23,800 | 4,300 |
|  | 21+4 | 86+14 | 4.000 | 1.150 | 0.121 | 20,600 | 5,900 |
|  | 15+7 | 63+23 | 3.909 | 0.695 | 0.368 | 20,100 | 3,600 |
|  | 6+2 | 13+4 | 2.125 | 0.573 | 0.809 | 10,900 | 2,900 |
|  | 17+17 | 67+56 | 3.618 | 0.441 | 0.547 | 18,600 | 2,300 |
|  | 172+47 | 684+177 | 3.932 | 0.311 | 0.186 | 20,200 | 1,600 |
|  | 172+47 | 649+161 | 3.699 | 0.274 | 0.225 | 19,000 | 1,400 |
First summand refers to the complete mtDNA sequences displayed in Figures 2 and 3 and second summand refers to additional entire coding-region sequences [1]–[3]. Three C to G transversions (at positions 14974, 15439, and 15499) [1] – likely candidates for phantom mutations [2] that went undetected – were disregarded.
The average number of base substitutions in the mtDNA coding region (between positions 577 and 16023) from the root sequence type.
Standard error calculated from an estimate of the genealogy [4].
Starlikeness (“effective star size” [4]) can take values between 1/n (single haplotype representing n mtDNAs) and 1 (perfect star phylogeny).
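As a rough cross-check of the table, the sketch below recomputes the average number of substitutions and the starlikeness for the second data row (86+1 mtDNAs, 304+3 substitutions, standard error 0.348). The notes do not spell out the starlikeness formula; the code assumes it equals rho / (n × sigma²), that is, an effective star size of rho / sigma² divided by the sample size, which appears to reproduce the tabulated values to rounding. The names `rho`, `sigma`, and `starlikeness` are introduced here for illustration.

```python
def rho(substitutions: int, n: int) -> float:
    """Average number of base substitutions from the root sequence type."""
    return substitutions / n


def starlikeness(rho_value: float, sigma: float, n: int) -> float:
    """Assumed relation: effective star size (rho / sigma**2) divided by n."""
    return rho_value / (n * sigma ** 2)


# Second data row of the table: both summands are added together.
n = 86 + 1
r = rho(304 + 3, n)
print(round(r, 3))                          # 3.529
print(round(starlikeness(r, 0.348, n), 3))  # 0.335
```

Because the tabulated standard error is itself rounded to three decimals, small discrepancies in the last digit are to be expected for other rows.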
Estimate of the time to the most recent common ancestor of each cluster, using an evolutionary rate estimate of (1.26 ± 0.08) × 10⁻⁸ base substitutions per nucleotide per year in the coding region [5], corresponding to 5,140 years per substitution in the whole coding region.
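The conversion from substitutions to years can be retraced with a few lines of arithmetic. This Python sketch makes two assumptions not stated in the notes: the coding region is counted inclusively from position 577 to 16023, and the uncertainty in years is taken as the standard error multiplied by the same 5,140-year factor, without propagating the ± 0.08 on the rate. The function names are hypothetical.

```python
RATE = 1.26e-8                         # substitutions per nucleotide per year
CODING_START, CODING_END = 577, 16023  # coding-region bounds quoted in the notes
YEARS_PER_SUBSTITUTION = 5140          # rounded value quoted in the notes


def years_per_substitution(rate: float = RATE) -> float:
    """Years for one substitution to accumulate across the whole coding region."""
    length = CODING_END - CODING_START + 1  # assumption: inclusive count
    return 1.0 / (rate * length)


def age_estimate(rho: float, sigma: float) -> tuple[int, int]:
    """Return (T, delta T) in years, rounded to the nearest hundred."""
    t = rho * YEARS_PER_SUBSTITUTION
    dt = sigma * YEARS_PER_SUBSTITUTION  # assumption: rate error not propagated
    return int(round(t, -2)), int(round(dt, -2))


print(int(round(years_per_substitution(), -1)))  # 5140
print(age_estimate(3.340, 0.322))                # (17200, 1700)
print(age_estimate(3.699, 0.274))                # (19000, 1400)
```

The two example calls use the first and last data rows of the table and match the tabulated T and ΔT values.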
This includes one Apache A2a mtDNA (#1 in Table S1) and 9 Siberian mtDNAs (four A2a and five A2b) [6], [7].
Without A2a and A2b mtDNAs.