Guido Alberto Gnecchi-Ruscone1, Anna Szécsényi-Nagy2, István Koncz3, Gergely Csiky4, Zsófia Rácz3, A B Rohrlach5, Guido Brandt6, Nadin Rohland7, Veronika Csáky2, Olivia Cheronet8, Bea Szeifert2, Tibor Ákos Rácz9, András Benedek10, Zsolt Bernert11, Norbert Berta12, Szabolcs Czifra11, János Dani13, Zoltán Farkas12, Tamara Hága13, Tamás Hajdu14, Mónika Jászberényi9, Viktória Kisjuhász15, Barbara Kolozsi13, Péter Major12, Antónia Marcsik16, Bernadett Ny Kovacsóczy17, Csilla Balogh18, Gabriella M Lezsák19, János Gábor Ódor20, Márta Szelekovszky13, Tamás Szeniczey14, Judit Tárnoki21, Zoltán Tóth22, Eszter K Tutkovics23, Balázs G Mende2, Patrick Geary24, Walter Pohl25, Tivadar Vida3, Ron Pinhasi8, David Reich26, Zuzana Hofmanová27, Choongwon Jeong28, Johannes Krause29.
Abstract
The Avars settled the Carpathian Basin in 567/68 CE, establishing an empire lasting over 200 years. Who they were and where they came from is highly debated. Contemporaries have disagreed about whether they were, as they claimed, the direct successors of the Mongolian Steppe Rouran empire that was destroyed by the Turks in ∼550 CE. Here, we analyze new genome-wide data from 66 pre-Avar and Avar-period Carpathian Basin individuals, including the 8 richest Avar-period burials and further elite sites from the Avar empire's core region. Our results provide support for a rapid long-distance trans-Eurasian migration of Avar-period elites. These individuals carried Northeast Asian ancestry matching the profile of preceding Mongolian Steppe populations, particularly a genome available from the Rouran period. Some of the later elite individuals carried an additional non-local ancestry component broadly matching the steppe, which could point to a later migration or reflect greater genetic diversity within the initial migrant population.
Keywords: Avars; Carpathian Basin; Pannonia; ancient DNA; early medieval; human migration; migration period; population genomics; steppe nomads
Year: 2022 PMID: 35366416 PMCID: PMC9042794 DOI: 10.1016/j.cell.2022.03.007
Source DB: PubMed Journal: Cell ISSN: 0092-8674 Impact factor: 66.850
Figure 1. Geographic and temporal locations of ancient individuals in this study
(A) A map of Eurasia with geographic coordinates of the ancient individuals analyzed in this study marked by color-filled shapes. Dark yellow shades mark the steppe ecoregion. Newly produced genomes from the Carpathian Basin pre-Avar period are highlighted with white outlined symbols.
(B) A zoomed-in map of the Carpathian Basin showing geographic coordinates of the newly analyzed ancient samples from the Avar period, with symbols referring to specific archaeological and social categories and colors referring to regions, as defined in the bottom left and bottom right legends, respectively.
(C) A rough timeline of the Carpathian Basin and Mongolia from 200 BC to 950 AD.
See also Table S1.
Figure S2. Summary of imputation quality control and post-imputation analyses
(A) Scatter plot of pairwise mismatch rate (pMMR) for the pseudohaploid data (x axis) versus the pairwise mismatch rate for the imputed data (y axis). Individuals are filtered to have the proposed coverage cutoff of 1.43×. Points are colored by the number of overlapping SNPs in each pairwise mismatch rate calculation. The red line is the y = x line. Transparent points indicate pMMR values for individuals that were not included because they fell below our coverage threshold.
(B) Two-way qpAdm models and PCAs for the West Eurasian component (i.e., masking East Asian ancestry tracts) of DTI late Avar period individuals with Mosaic (left) and RFMix (right).
(C) West Eurasian (left) and East Eurasian (right) PCAs projecting the masked local ancestry tracts of DTI late individuals performed with both methods (Mosaic and RFMix) and pseudohaploid data of individuals representative of local (Sarmatian period) and non-local (North_Caucasus_7C) individuals, related to Figure 3 and Table S1.
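The pairwise mismatch rate plotted in panel (A) can be illustrated with a minimal sketch. This is not the pMMRCalculator implementation used in the study; the function name, the use of `None` for missing calls, and the toy genotype vectors are assumptions made for illustration only.

```python
import math

def pairwise_mismatch_rate(calls_a, calls_b):
    """Return (mismatch rate, number of overlapping SNPs) for two individuals.

    Each input is a sequence of pseudohaploid allele calls, one per SNP,
    with None marking a missing call (an assumed encoding).
    """
    overlap = 0
    mismatches = 0
    for a, b in zip(calls_a, calls_b):
        if a is None or b is None:
            continue  # only SNPs covered in both individuals are compared
        overlap += 1
        if a != b:
            mismatches += 1
    # With no overlapping SNPs the rate is undefined.
    rate = mismatches / overlap if overlap else math.nan
    return rate, overlap

# Toy example: four SNPs, one of them missing in the first individual.
rate, n_snps = pairwise_mismatch_rate([0, 1, None, 1], [0, 0, 1, 1])
print(rate, n_snps)
```

Returning the overlap count alongside the rate mirrors the plot, where points are colored by the number of overlapping SNPs so that poorly supported estimates can be recognized.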
Figure 2. Principal component analyses
(A) Pre-Avar Eurasian PCA (top) and west Eurasian PCA (bottom). New data are highlighted with white outlined and filled symbols.
(B) Eurasian PCA of newly produced Avar-period individuals. Colors refer to key regions within the Carpathian Basin. Filled symbols are individuals retrieved from elite contexts. Specific archaeological categories discussed in the text are shown with different symbol shapes.
See also Figure S1A.
Figure S1. West Eurasian and Eurasian PCAs and most relevant qpAdm models of DTI Avar-period elite/elite-associated individuals
(A and B) (A) West Eurasian PCA and (B) Eurasian PCA. The symbols and color scheme are the same as in Figure 2.
(C) From left to right qpAdm models for: DTI early Avar period elite individuals; DTI middle Avar period elite individuals; DTI late Avar period elite individuals. Black boxes highlight the outlier infant (A1817, DTI early Avar period) and child (I18744, DTI middle Avar period).
(D) Three-way competing models of DTI late-Avar-period individuals contrasting local + non-local sources in the same model. A transparency factor is added to the models presenting poor fits (p < 0.05), related to Figures 1, 2, and 3 and Tables S1 and S2.
Figure S5. qpAdm models for the non-DTI Avar-period individuals and the Sarmatian-period groups
(A) From left to right: two-way models for: early-Avar-period Transtisza group elite associated individuals; early-Avar-period Transtisza group non-elite associated individuals; late-Avar-period non-elite associated individuals; early-Avar-period elite associated (Kölked-Feketekapu site) individuals.
(B) Two-way individual-based qpAdm models contrasting different local sources for individuals left unresolved by the two-way eastern + western proxy models. Models for I6750 and I18185 are still non-optimal, as they all have infeasible admixture proportions (>>100% for a single source) and large SE, despite some having p values > 0.05. Nevertheless, Szolad_south_6c as a unique source yields the least deviant models overall.
(C) qpAdm models for the two Sarmatian period groups: LS_P_DTI_4-5c and LS_P_Transtisza_4-5c. LS_P_Transtisza_4-5c can be modeled without any extra component from the steppe and matches the Szolad_others_6c profile, while LS_P_DTI_4-5c requires additional gene flow from the steppe with different surrogate proxies providing working models, related to Figure 3 and Tables S2 and S4.
Figure 3. Ancestry deconvolution performed with qpWave/qpAdm
Representative two- or three-way admixture models for the Avar-period individuals, with the early Avar period on the left and the middle-late Avar period on the right. The figure reports the overall best models resulting from the evaluation of all the individual-based, group-based, and local ancestry-based qpAdm models, following the rationale detailed in the STAR Methods section. Three possible sources of ancestry are tested: an eastern Asian steppe source (AR_Xianbei_P_2c is used for all models reported in the figure), a Carpathian Basin source (blue side, represented by either one of the three Longobard-period Szólád groups or the Sarmatian-period group), and a Pontic-to-Kazakh steppe source (green side, represented by either one of the Iron Age groups from the steppe or North_Caucasus_7c). Figures S1, S4, and S5 show the specific sources within these geographic categories that provided fitting models for each of the tested individuals.
See also Figures S1, S4, and S5 and Tables S2 and S4.
Figure S4. Assessment of genetic inbreeding and relatedness
(A) hapROH analyses reporting only individuals showing runs of homozygosity (ROH) tracts longer than 4 cM grouped in 4 length bins. On the right, the result of simulated ROH patterns corresponding to recent inbreeding or small population sizes as provided by the hapROH pipeline.
(B) Comparison between ROH on imputed diploid calls run with PLINK “--homozyg” function and hapROH including also the 2–4-cM bins and only overlapping individuals for comparison.
(C) Genetic relatedness estimated with READ (left) and lcMLkin (right), related to Figures 1 and 2.
p values of the group-based qpWave/qpAdm models of the two DTI_late_elite groups
Columns give the second source of each two-way model (the first source is DTI_early_elite); cells are p values.

| Target | North_Caucasus_7c | Szolad_north_6c | LS_P_Transtisza_4-5c | LS_P_DTI_4-5c |
|---|---|---|---|---|
| DTI_late_elite1 | 0.66 | 5.5 × 10⁻⁸ | 0.0004 | 0.0007 |
| DTI_late_elite2 | 0.008 | 0.701 | 0.89 | 0.57 |
DTI_late_elite1 (A1809, A1813, A1814, A1815, I18222) includes individuals with a predominantly non-local signal, while DTI_late_elite2 (A1810, I18225) includes individuals with a predominantly local signal in the three-way models of Figure S1D and Table S3.
Models with p values > 0.05 are treated as fitting.
Figure 4. Admixture dating obtained with DATES
Admixture dates and respective error bars obtained with DATES (between a western and an eastern Eurasian source) plotted against the Euclidean distance from the PC1 and PC2 coordinates of the Rouran_P_6c genome (PCA of Figure 2B). The colored bands mark the chronology of the three Avar periods and the Rouran period, used to derive the dates of admixture from the estimated number of generations since admixture. All dates shown are individual-based except for the DTI elites (early, middle, and late), for which the group-based date estimates are shown; in this case, the individuals’ median Euclidean distance from Rouran_P_6c is used.
See also Figure S3.
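The two quantities plotted in Figure 4 can be illustrated with a minimal sketch: the Euclidean distance of an individual from the Rouran_P_6c genome in PC1/PC2 space, and the conversion of an admixture estimate in generations into a calendar year. The function names, the 29-year generation time, and all example numbers are assumptions for illustration; they are not values reported in the study.

```python
import math
import statistics

def pc_distance(pc_individual, pc_reference):
    """Euclidean distance between two (PC1, PC2) coordinate pairs."""
    return math.hypot(pc_individual[0] - pc_reference[0],
                      pc_individual[1] - pc_reference[1])

def group_pc_distance(pcs_group, pc_reference):
    """Median distance of a group of individuals from the reference genome."""
    return statistics.median(pc_distance(pc, pc_reference) for pc in pcs_group)

def admixture_year(generations, burial_year_ce, years_per_generation=29.0):
    """Calendar year (CE) of admixture, counted back from the burial date.

    years_per_generation is an assumed value, not taken from the source.
    """
    return burial_year_ce - generations * years_per_generation

# Hypothetical coordinates and dates, for illustration only.
reference = (0.0, 0.0)
print(pc_distance((3.0, 4.0), reference))                      # 5.0
print(group_pc_distance([(3.0, 4.0), (6.0, 8.0)], reference))  # 7.5
print(admixture_year(generations=10, burial_year_ce=650))      # 360.0
```

The group function reflects the caption's note that group-based estimates are plotted at the median distance of the group's individuals rather than at a single individual's position.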
Figure S3. All individual-based Avar-period admixture dates obtained with DATES plotted against Euclidean distance from PC1 and PC2 of the Rouran-period individual
(A) Late Avar period.
(B) Middle Avar period.
(C) Early Avar period. The Rouran-period genome is also shown with its estimated admixture date and at 0 distance from itself on the x axis.
(D–F) Ancestry covariance decay plot for the group-based analyses (WE = western, EA = eastern sources), related to Figure 4.
| REAGENT or RESOURCE | SOURCE | IDENTIFIER |
|---|---|---|
| Ancient skeletal element | This study | A1801 |
| Ancient skeletal element | This study | A1802 |
| Ancient skeletal element | This study | A1803 |
| Ancient skeletal element | This study | A1804 |
| Ancient skeletal element | This study | A1805 |
| Ancient skeletal element | This study | A1806 |
| Ancient skeletal element | This study | A1807 |
| Ancient skeletal element | This study | A1808 |
| Ancient skeletal element | This study | A1809 |
| Ancient skeletal element | This study | A1810 |
| Ancient skeletal element | This study | A1811 |
| Ancient skeletal element | This study | A1812 |
| Ancient skeletal element | This study | A1813 |
| Ancient skeletal element | This study | A1814 |
| Ancient skeletal element | This study | A1815 |
| Ancient skeletal element | This study | A1816 |
| Ancient skeletal element | This study | A1817 |
| Ancient skeletal element | This study | A1818 |
| Ancient skeletal element | This study | A1819 |
| Ancient skeletal element | This study | A1820 |
| Ancient skeletal element | This study | A1821 |
| Ancient skeletal element | This study | A1822 |
| Ancient skeletal element | This study | A1823 |
| Ancient skeletal element | This study | A1824 |
| Ancient skeletal element | This study | A1825 |
| Ancient skeletal element | This study | I20801 |
| Ancient skeletal element | This study | I20800 |
| Ancient skeletal element | This study | I20798 |
| Ancient skeletal element | This study | I20799 |
| Ancient skeletal element | This study | I16812 |
| Ancient skeletal element | This study | I16741 |
| Ancient skeletal element | This study | I18742 |
| Ancient skeletal element | This study | I18743 |
| Ancient skeletal element | This study | I18744 |
| Ancient skeletal element | This study | I18224 |
| Ancient skeletal element | This study | I18223 |
| Ancient skeletal element | This study | I18225 |
| Ancient skeletal element | This study | I18222 |
| Ancient skeletal element | This study | I16759 |
| Ancient skeletal element | This study | I18174 |
| Ancient skeletal element | This study | I18184 |
| Ancient skeletal element | This study | I18185 |
| Ancient skeletal element | This study | I16743 |
| Ancient skeletal element | This study | I16744 |
| Ancient skeletal element | This study | I16753 |
| Ancient skeletal element | This study | I16752 |
| Ancient skeletal element | This study | I16751 |
| Ancient skeletal element | This study | I16750 |
| Ancient skeletal element | This study | A181013 |
| Ancient skeletal element | This study | A181014 |
| Ancient skeletal element | This study | A181015 |
| Ancient skeletal element | This study | A181016 |
| Ancient skeletal element | This study | A181017 |
| Ancient skeletal element | This study | A181018 |
| Ancient skeletal element | This study | A181019 |
| Ancient skeletal element | This study | A181020 |
| Ancient skeletal element | This study | A181021 |
| Ancient skeletal element | This study | A181022 |
| Ancient skeletal element | This study | A181023 |
| Ancient skeletal element | This study | A181024 |
| Ancient skeletal element | This study | A181025 |
| Ancient skeletal element | This study | A181026 |
| Ancient skeletal element | This study | A181027 |
| Ancient skeletal element | This study | A181028 |
| Ancient skeletal element | This study | I20802 |
| Ancient skeletal element | This study | A181029 |
| Distilled Water DNA free, UltraPure™ | Thermo Fisher Scientific | Cat# 10977035 |
| 0.5 M EDTA pH 8.0 | Thermo Fisher Scientific | Cat# AM9261 |
| Proteinase K | Thermo Fisher Scientific | Cat# AM2548 |
| Isopropanol | Sigma Aldrich | Cat# I9516 |
| Guanidine hydrochloride | Sigma Aldrich | Cat# G4505 |
| Sodium Acetate Solution (3 M), pH 5.2 | Thermo Fisher Scientific | Cat# R1181 |
| Tween-20 | Sigma Aldrich | Cat# P2287 |
| Buffer PE | Qiagen | Cat# 19065 |
| Buffer PB | Qiagen | Cat# 19066 |
| Tris-EDTA buffer solution | Sigma Aldrich | Cat# 93283 |
| 10x Buffer Tango | Thermo Fisher Scientific | Cat# BY5 |
| ATP 100 mM | Thermo Fisher Scientific | Cat# R0441 |
| BSA 20mg/mL | Roche | Cat# 10711454001 |
| dNTP Mix | Thermo Fisher Scientific | Cat# R1121 |
| USER enzyme | New England Biolabs | Cat# M5505 |
| Uracil Glycosylase inhibitor (UGI) | New England Biolabs | Cat# M0281 |
| T4 Polynucleotide Kinase | New England Biolabs | Cat# M0201 |
| T4 DNA Polymerase | New England Biolabs | Cat# M0203 |
| Bst DNA Polymerase, large fragment | New England Biolabs | Cat# M0275L |
| Ethanol | Merck | Cat# 1009831000 |
| 10x T4 Ligase Buffer | Thermo Fisher Scientific | Cat# EL0011 |
| T4 DNA Ligase | Thermo Fisher Scientific | Cat# EL0011 |
| 10x Thermopol Buffer | New England Biolabs | Cat# B9004S |
| Ampure XP | Bioscience | Cat# BCI-A63881 |
| Agilent D1000 ScreenTapes | Agilent Technologies | Cat# 5067-5582 |
| Agilent D1000 Ladder | Agilent Technologies | Cat# 5067-5586 |
| Agilent D1000 Reagents | Agilent Technologies | Cat# 5067-5583 |
| Agarose | Lonza | Cat# 50004 |
| HyperLadder™ 25bp (formerly HyperLadder V), | Bioline | Cat# BIO-33057 |
| ECO Safe Nucleic Acid Staining Solution 20,000X | Thermo Fisher Scientific | Cat# 3910001 |
| 2X Hi-RPM Hybridization Buffer | Agilent Technologies | Cat# 5190-0403 |
| PfuTurbo Cx Hotstart DNA Polymerase | Agilent Technologies | Cat# 600412 |
| Herculase II Fusion DNA Polymerase | Agilent Technologies | Cat# 600679 |
| Sodium hydroxide pellets | Fisher Scientific | Cat# 10306200 |
| Sera-Mag Magnetic Speed-beads. Carboxylate-Modified (1 mm, 3EDAC/PA5) | GE LifeScience | Cat# 65152105050250 |
| Dynabeads MyOne Streptavidin | Thermo Fisher Scientific | Cat# 65602 |
| SSC Buffer (20x) | Thermo Fisher Scientific | Cat# AM9770 |
| GeneAmp 10x PCR Gold Buffer | Thermo Fisher Scientific | Cat# 4379874 |
| Salmon sperm DNA | Thermo Fisher Scientific | Cat# 15632-011 |
| Human Cot-I DNA | Thermo Fisher Scientific | Cat# 15279011 |
| 5M NaCl | Sigma Aldrich | Cat# S5150 |
| 1M NaOH | Sigma Aldrich | Cat# 71463 |
| 1 M Tris-HCl pH 8.0 | Sigma Aldrich | Cat# AM9856 |
| 50x Denhardt’s solution | Thermo Fisher Scientific | Cat# 750018 |
| Methanol, certified ACS | VWR | Cat# EM-MX0485-3 |
| Acetone, certified ACS | VWR | Cat# BDH1101-4LP |
| Dichloromethane, certified ACS | VWR | Cat# EMD-DX0835-3 |
| Hydrochloric acid, 6N, 0.5N & 0.01N | VWR | Cat# EMD-HX0603-3 |
| 5 M Sodium chloride solution | Sigma-Aldrich | Cat# S5150-1L |
| 20% SDS | Serva | Cat# 39575.01 |
| PEG-4000 | Thermo Fisher Scientific | Cat# EL0011 |
| PEG-8000 | Promega | Cat# V3011 |
| MinElute PCR Purification Kit | QIAGEN | Cat# 28006 |
| TwistAmp Basic Kit | TwistDX | Cat# TABAS03kit |
| Qubit® dsDNA HS Assay Kit, 500 assays | Thermo Fisher Scientific | Cat# Q32854 |
| High Pure Extender Assembly from the Roche High Pure Viral Nucleic Acid Large Volume Kit,40 reactions | Roche | Cat# 5114403001 |
| MiSeq Reagent Kit v3 (150 cycle) | Illumina | Cat# MS-102-3001 |
| NextSeq® 500/550 High Output Kit v2 (150 cycles) | Illumina | Cat# FC-404-2002 |
| HiSeq Cluster Kit SR | Illumina | Cat# GD-410-1001 |
| HiSeq 4000 SBS Kit (50/75 cycles) | Illumina | Cat# FC-410-1001/2 |
| DyNAmo Flash SYBR Green qPCR Kit | Thermo Fisher Scientific | Cat# F415L |
| Maxima SYBR Green kit | Thermo Fisher Scientific | Cat# K0251 |
| Oligo aCGH/Chip-on-Chip Hybridization Kit | Agilent Technologies | Cat# 5188-5220 |
| Raw and analyzed data (European nucleotide archive) | This study | ENA: PRJEB50368 |
| 1240K Genotype data (Edmond Data | This study | |
| EAGER 1.92.55 | ||
| AdapterRemoval 2.2.0 | ||
| BWA 0.7.12 | ||
| DeDup 0.12.2 | ||
| mapDamage 2.0.6 | ||
| bamUtil 1.0.13 | ||
| CircularMapper | ||
| ANGSD 0.910 | ||
| Schmutzi | ||
| SAMtools 1.3 | ||
| pileupCaller | ||
| GATK v3.5 | ||
| GeneImp 1.4 | ||
| SHAPEIT v2.r790 | ||
| pMMRCalculator | ||
| HaploGrep2 | ||
| Yleaf v1.0 | ||
| READ | ||
| lcMLkin | ||
| EIGENSOFT v6.0.1 | ||
| ADMIXTOOLS 5.1 | ||
| DATES v753 | ||
| MOSAIC v1.3 | ||
| RFMix v2.03 | ||
| PLINK v. 1.9 | ||
| hapROH v0.3 | ||
| R v4.0.5 | ||