Anna E White1, Toni de-Dios1,2, Pablo Carrión1, Gian Luca Bonora3, Laia Llovera1, Elisabetta Cilli4, Esther Lizano1,5, Maral K Khabdulina6, Daniyar T Tleugabulov6, Iñigo Olalde1,7, Tomàs Marquès-Bonet1,5,8,9, François Balloux10, Davide Pettener11, Lucy van Dorp10, Donata Luiselli4, Carles Lalueza-Fox1.
Abstract
The Asian Central Steppe, consisting of current-day Kazakhstan and Russia, has acted as a highway for major migrations throughout history. Therefore, describing the genetic composition of past populations in Central Asia is valuable for understanding human mobility in this pivotal region. In this study, we analyse paleogenomic data generated from five humans from Kuygenzhar, Kazakhstan. These individuals date to the early to mid-18th century, shortly after the founding of the Kazakh Khanate, a union of nomadic tribes of Mongol Golden Horde and Turkic origins. Genomic analysis shows that these individuals carry varying proportions of East Asian ancestry, indicating a recent admixture event from East Asia. The high amounts of DNA from the anaerobic Gram-negative bacterium Tannerella forsythia, a periodontal pathogen, recovered from their teeth suggest they may have suffered from periodontitis. Genomic analysis of this bacterium identified recently evolved virulence and glycosylation genes, as well as antibiotic resistance genes predating the antibiotic era. This study provides an integrated analysis of individuals with a diet mostly based on meat (mainly horse and lamb), milk, and dairy products and their oral microbiome.
Keywords: Central Asian steppe; ancient pathogens; bacteria; paleogenomics; red complex
Year: 2021 PMID: 34943238 PMCID: PMC8698332 DOI: 10.3390/biology10121324
Source DB: PubMed Journal: Biology (Basel) ISSN: 2079-7737
Mapping statistics for the Kuygenzhar samples that were mapped against the human reference genome GRCh37. B2_B1 was removed from further analyses due to insufficient data.
| Nuclear DNA | Number of Sequenced Paired Reads | Mapped Reads | Reads with a Mapping Quality > 30 | Average Coverage | Covered Positions (%) |
|---|---|---|---|---|---|
| B2_B1 | 51,212,164 | 6715 | 5140 | 0.0002x | 0.02% |
| B2_B3 | 86,182,644 | 13,108,305 | 11,448,368 | 0.3977x | 31.59% |
| B2_B5 | 82,504,751 | 16,069,879 | 13,838,347 | 0.4788x | 36.18% |
| B2_B6 | 54,440,203 | 13,936,061 | 11,947,290 | 0.3379x | 26.49% |
| B2_B7 | 40,649,429 | 3,347,196 | 2,961,184 | 0.0973x | 9.12% |
| Mitochondrial DNA | | | | | |
| B2_B1 | 51,212,164 | 41 | 39 | 0.2187x | 19.05% |
| B2_B3 | 86,182,644 | 8799 | 6847 | 43.8743x | 99.99% |
| B2_B5 | 82,504,751 | 18,689 | 12,439 | 82.3307x | 100% |
| B2_B6 | 54,440,203 | 7853 | 6212 | 33.8299x | 99.98% |
| B2_B7 | 40,649,429 | 2215 | 1991 | 11.0503x | 99.79% |
Figure 1. (A) Principal component analysis (PCA) of the genetic variation in Eurasia (grey points), with ancient samples projected onto it. The Kuygenzhar individuals (golden triangles) cluster near present-day Kazakhs. (B) Ancestry modelling performed with qpAdm, showing the fitting two-way model (p-value > 0.05) for each individual and for the individuals grouped.
Mapping statistics of Kuygenzhar non-human reads mapped against the Tannerella forsythia reference genome (NC_016610.1).
| Sample Name | Human Free Reads | Mapped Reads | Reads with a Mapping Quality > 30 | Average Coverage | Covered Positions (%) |
|---|---|---|---|---|---|
| B2_B1 | 148,214,698 | 273 | 132 | 0.003x | 0.31% |
| B2_B3 | 87,420,490 | 537,695 | 391,726 | 11.416x | 66.29% |
| B2_B5 | 84,107,411 | 4275 | 3366 | 0.081x | 7.23% |
| B2_B6 | 114,507,988 | 15,450 | 12,304 | 0.328x | 23.18% |
| B2_B7 | 117,513,937 | 428,178 | 318,145 | 8.889x | 83.23% |
Figure 2. (A,B) Post-mortem damage patterns at the 5′ and 3′ ends for the four samples with more than 0.05x coverage mapped against the T. forsythia reference genome (B2_B3 yellow, B2_B5 green, B2_B6 cyan, and B2_B7 blue). (C) Expected percentage of the T. forsythia reference genome recovered against the percentage actually recovered for the analysed samples. (D–G) Edit distance of the sequences from each sample mapped against the T. forsythia reference genome (B2_B3, B2_B5, B2_B6, and B2_B7, in that order, following the same colour codes).
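The comparison in panel C can be approximated from the mapping table above. The caption does not say how the expected percentage was derived, so the sketch below assumes the common Poisson model, in which reads fall uniformly at random and a genome sequenced to mean depth c has an expected covered fraction of 1 − e^(−c). The function names are illustrative and not taken from the study.

```python
import math

# Average coverage (x) and observed covered positions (%) for the four samples
# shown in Figure 2, copied from the T. forsythia mapping table.
SAMPLES = {
    "B2_B3": (11.416, 66.29),
    "B2_B5": (0.081, 7.23),
    "B2_B6": (0.328, 23.18),
    "B2_B7": (8.889, 83.23),
}


def expected_breadth(mean_depth: float) -> float:
    """Expected % of the genome covered at least once.

    ASSUMPTION: reads are placed uniformly at random (Poisson model).
    This is a textbook approximation, not necessarily the authors' method.
    """
    return 100.0 * (1.0 - math.exp(-mean_depth))


def breadth_report(samples):
    """Return (name, expected %, observed %, observed/expected) per sample."""
    rows = []
    for name, (depth, observed) in samples.items():
        expected = expected_breadth(depth)
        rows.append((name, expected, observed, observed / expected))
    return rows


if __name__ == "__main__":
    for name, expected, observed, ratio in breadth_report(SAMPLES):
        print(f"{name}: expected {expected:.2f}%, observed {observed:.2f}%, "
              f"ratio {ratio:.2f}")
```

Under this assumption the two low-depth samples (B2_B5, B2_B6) fall close to the expectation, while the two high-depth samples (B2_B3, B2_B7) recover noticeably less than the near-100% the model predicts. A shortfall of that kind would be consistent with reference regions that are absent from the ancient strains or difficult to map, although this sketch cannot distinguish between those explanations.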
Figure 3. Mapping of the four 18th century Kuygenzhar samples against the Tannerella forsythia reference genome (NC_016610.1). The outer ring (blue) shows the reference genome mapped against itself, the presence of genes is indicated in orange, and the four Kuygenzhar samples form the inner rings in the order: B2_B3 (dark green), B2_B5 (light green), B2_B6 (light blue), and B2_B7 (dark blue).
Figure 4. Phylogenetic tree of the high-quality alignment covering 129,641 SNPs for Tannerella forsythia. Isolates present in the tree include 13 modern human isolates from Europe, America, and Japan; seven ancient samples from Africa, Europe, and America; our two isolates from Kazakhstan (triangles); and a modern isolate recovered from a dog, used as a phylogenetic outgroup.
Figure 5. Tip-calibrated phylogeny of Tannerella forsythia. (A) Root-to-tip distance analysis of the analysed samples after recombination and homoplasy pruning, plotting each isolate's distance to the root (y-axis) against its collection time (x-axis). (B) The ML phylogenetic tree built from 1494 recombination- and homoplasy-free positions, with temporal annotation data (highest posterior densities) for each node.
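A root-to-tip analysis of the kind described in this caption is usually an ordinary least-squares regression of each isolate's distance to the root on its sampling date: the slope estimates the substitution rate and the x-intercept estimates the date of the root. The sketch below illustrates that calculation only. The function name is hypothetical, and the dates and distances are invented placeholder values chosen to lie on a straight line; they are not data from the study.

```python
def root_to_tip_regression(dates, distances):
    """Least-squares fit of root-to-tip distance against sampling date.

    Returns (slope, intercept, r_squared, root_date), where slope is the
    estimated rate (substitutions/site/year) and root_date is the x-intercept.
    """
    n = len(dates)
    if n < 2 or n != len(distances):
        raise ValueError("need two or more matched (date, distance) pairs")
    mean_x = sum(dates) / n
    mean_y = sum(distances) / n
    sxx = sum((x - mean_x) ** 2 for x in dates)
    syy = sum((y - mean_y) ** 2 for y in distances)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(dates, distances))
    if sxx == 0:
        raise ValueError("all sampling dates are identical")
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r_squared = (sxy ** 2) / (sxx * syy) if syy > 0 else 0.0
    # The x-intercept is only meaningful when there is temporal signal.
    root_date = -intercept / slope if slope != 0 else None
    return slope, intercept, r_squared, root_date


# HYPOTHETICAL values for illustration only (calendar years; negative = BCE).
HYPOTHETICAL_DATES = [-1000, 500, 1750, 2000]
HYPOTHETICAL_DISTANCES = [0.002, 0.0035, 0.00475, 0.005]

if __name__ == "__main__":
    slope, intercept, r2, root = root_to_tip_regression(
        HYPOTHETICAL_DATES, HYPOTHETICAL_DISTANCES
    )
    print(f"rate={slope:.3e} per site per year, R^2={r2:.3f}, root date={root:.0f}")
```

A positive slope with a reasonably high R² is generally taken as evidence of enough temporal signal to justify tip calibration. Real data scatter around the line far more than these placeholder points do.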