Nina Marchi, Laura Winkelbach, Ilektra Schulz, Maxime Brami, Zuzana Hofmanová, Jens Blöcher, Carlos S Reyna-Blanco, Yoan Diekmann, Alexandre Thiéry, Adamandia Kapopoulou, Vivian Link, Valérie Piuz, Susanne Kreutzer, Sylwia M Figarska, Elissavet Ganiatsou, Albert Pukaj, Travis J Struck, Ryan N Gutenkunst, Necmi Karul, Fokke Gerritsen, Joachim Pechtl, Joris Peters, Andrea Zeeb-Lanz, Eva Lenneis, Maria Teschler-Nicola, Sevasti Triantaphyllou, Sofija Stefanović, Christina Papageorgopoulou, Daniel Wegmann, Joachim Burger, Laurent Excoffier.
Abstract
The precise genetic origins of the first Neolithic farming populations in Europe and Southwest Asia, as well as the processes and the timing of their differentiation, remain largely unknown. Demogenomic modeling of high-quality ancient genomes reveals that the early farmers of Anatolia and Europe emerged from a multiphase mixing of a Southwest Asian population with a strongly bottlenecked western hunter-gatherer population after the last glacial maximum. Moreover, the ancestors of the first farmers of Europe and Anatolia went through a period of extreme genetic drift during their westward range expansion, contributing highly to their genetic distinctiveness. This modeling elucidates the demographic processes at the root of the Neolithic transition and leads to a spatial interpretation of the population history of Southwest Asia and Europe during the late Pleistocene and early Holocene.
Keywords: Neolithic transition; ancient genomics; demogenomic modeling; demographic inference; demographic processes; human evolution; population admixture; upper Palaeolithic
Year: 2022 PMID: 35561686 PMCID: PMC9166250 DOI: 10.1016/j.cell.2022.04.008
Source DB: PubMed Journal: Cell ISSN: 0092-8674 Impact factor: 66.850
Figure 1. Spatial and temporal distribution of the ancient genomes analyzed in this study
(A) Location of archaeological sites with newly sequenced genomes and additional genomes used for modeling: Neolithic (black) and Mesolithic or Palaeolithic (red); different chronological phases of Neolithic expansion (colored areas) and archaeological cultures (blue) along the Danubian route of Neolithization; geographical areas (purple).
(B) Chronological distribution of the 25 genomes analyzed in this study, with the 15 newly sequenced genomes in bold, and the previously published genomes in italics (details in Tables 1 and S3). We also list the cultural groups (EFs, early farmers; HGs, hunter-gatherers), the regions and the archaeological sites where ancient individuals were sampled. The chronological interval at 2 sigma (95.4% probability) is shown for each directly 14C-dated sample, except for Stuttgart and Ess7, for which approximate dates are given based on the archaeological context.
Archaeological and genetic information for newly sequenced genomes
| Individual | Period (culture) | Site | Country | Age (cal. BP) | Mean depth (X) | Genetic sex | Haplogroups mtDNA | Haplogroups Y |
|---|---|---|---|---|---|---|---|---|
| VLASA7 | LM | Vlasac | Serbia | 8,764–8,340 | 15.21 | M | U5a2a | I2 |
| VLASA32 | LM | Vlasac | Serbia | 9,741–9,468 | 12.65 | M | U5a2a | R1b1 |
| AKT16 | EN | Aktopraklık | Turkey | 8,635–8,460 | 12.25 | F | K1a3 | – |
| Bar25 | EN | Barcın | Turkey | 8,384–8,205 | 12.65 | M | N1a1a1 | G2a2b2a1 |
| Nea3 | EN | Nea Nikomedeia | Greece | 8,327–8,040 | 11.57 | F | K1a2c | – |
| Nea2 | EN | Nea Nikomedeia | Greece | 8,173–8,023 | 12.51 | F | K1a | – |
| LEPE48 | TEN | Lepenski Vir | Serbia | 8,012–7,867 | 10.92 | M | K1a1 | C1a2b |
| LEPE52 | E-MN | Lepenski Vir | Serbia | 7,931–7,693 | 12.37 | M | H3 | G2a2b2a1a1c |
| STAR1 | EN (Starčevo) | Grad-Starčevo | Serbia | 7,589–7,476 | 10.55 | F | T2e2 | – |
| VC3-2 | EN (Starčevo) | Vinča-Belo Brdo | Serbia | 7,565–7,426 | 11.22 | M | HV-16311 | G2a2a1a3 |
| Asp6 | EN (LBK) | Asparn-Schletz | Austria | 7,575–7,474 | 12.11 | M | U5a1c1 | G2a2b2a3 |
| Klein7 | EN (LBK) | Kleinhadersdorf | Austria | 7,244–7,000 | 11.30 | F | W1-119 | – |
| Dil16 | EN (LBK) | Dillingen-Steinheim | Germany | 7,235–6,998 | 10.60 | M | J1c6 | C1a2b |
| Ess7 | EN (LBK) | Essenbach-Ammerbreite | Germany | (7,050–6,900 BP) | 12.34 | M | U5b2c1 | G2a2b2a1a1 |
| Herx | EN (LBK) | Herxheim | Germany | 7,164–6,993 | 11.46 | F | K1a4a1i | – |
LM, late Mesolithic; EN, early Neolithic; TEN, transformational/early Neolithic; E-MN, early-middle Neolithic; LBK, Linearbandkeramik. Samples with genetic sex determined as XX and XY are noted as F and M, respectively.
The samples’ ages are based on 14C dating (95.4% probability), except Ess7, for which an approximate date is given based on the archaeological context.
Figure 2. Genetic structure and affinities of ancient individuals
(A) Multidimensional scaling (MDS) analysis performed on the neutrally evolving portion of ancient (n = 25; large filled circles and squares: early farmers; triangles: hunter-gatherers) and modern (n = 65, shown as small circles) genomes from Europe and SW Asia. Ellipses highlight two clusters of ancient individuals: the European hunter-gatherers (HGs) and the European and Anatolian early farmers (i.e., western EFs).
(B) Admixture analyses (K = 3) performed on 22 ancient genomes (the three genomes with the lowest quality were discarded: Bichon, Bon002, and Bar8). Note that AKT16 in NW Anatolia is more admixed than a neighboring genome from the Barcın site (Bar25), in keeping with f-statistics analyses (see Figure S1), which led us to consider them as originating from independent populations in our demographic modeling.
(C) Heterozygosity computed at neutral sites in ancient genomes (HGs in blue, EFs in green).
(D) Runs of homozygosity (ROHs) computed on imputed ancient whole genomes for intermediate ROHs (2–10 Mb, dark color) or long ROHs (>10 Mb, light color), indicative of background relatedness in small populations and close inbreeding, respectively.
See also Figure S1.
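The two ROH length classes in panel (D) can be tallied per genome with a few lines of code. The following is a minimal sketch, assuming ROH segments are already available as a list of lengths in base pairs; the function name `summarize_roh` and the output keys are hypothetical, and treating a segment of exactly 10 Mb as "intermediate" is an assumption based on the class boundaries given in the legend.

```python
def summarize_roh(lengths_bp):
    """Sum ROH lengths (in Mb) per class used in the legend.

    intermediate: 2-10 Mb (background relatedness in small populations)
    long:         >10 Mb  (close inbreeding)
    Segments shorter than 2 Mb are ignored.
    """
    lengths_mb = [length / 1e6 for length in lengths_bp]
    intermediate = sum(x for x in lengths_mb if 2 <= x <= 10)
    long_ = sum(x for x in lengths_mb if x > 10)
    return {"intermediate_mb": intermediate, "long_mb": long_}


# Hypothetical segments: one too short, one intermediate, one long
print(summarize_roh([1_500_000, 3_000_000, 12_000_000]))
```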
Figure S1. Population grouping verified with f-statistics, related to Figure 2 and Methods S1
These analyses were performed on the 1240k dataset.
(A) We first tested if some individuals of a specific group had significantly more shared ancestry with individuals of a different group using f-statistics of the form D(Ind1 from a population, Ind2 from the same population; Test, Mbuti [outgroup]). We only found three significant absolute Z scores (>3.0, yellow). For Austria, Asp6 appears to share more ancestry with VLASA7 than Klein7. Since variation in European HG ancestry is expected in EF populations due to the ongoing process of admixture and since these samples did not show variation in their affinities to other EF samples, modeling them as a single population seems justified. For NWAnatolia, AKT16 was found to share significantly more ancestry with both Loschbour and Bichon, which we further investigated in (B).
(B) To shed more light on the variation in European HG ancestry among Anatolian and Greek samples, we calculated f-statistics of the form D(NGreece/SGreece/NWAnatolia/CAnatolia, CAnatolia; HG_west, Mbuti [outgroup]), where we use HG_west to denote both West 1 and West 2 European HGs. This test indicates whether the tested individual/population from NGreece, SGreece, NWAnatolia, or CAnatolia (left) shares more (orange, Z score > 0) or less (blue, Z score < 0) ancestry with the tested HG_west individual (bottom) than the tested individual from CAnatolia (top). Significant Z scores above 3.0 or below −3.0 are shown with more intense colors. Among the CAnatolian individuals, Pınarbaşı and Boncuklu_N appear to share excess drift with HG_west when compared with individuals from Greece and NWAnatolia, in contrast to Tepecik-Çiftlik_N. Among the individuals from NGreece, SGreece, and NWAnatolia, AKT16 appears closest to HG_west, and much closer than Bar25. In light of these results, we modeled AKT16 and Bar25 independently in the demographic inferences.
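To make the form D(P1, P2; P3, Outgroup) and the |Z| > 3 threshold concrete, here is a minimal sketch of a D-statistic computed from per-site allele frequencies, with a standard error from a delete-one block jackknife. It is an illustration only, not the ADMIXTOOLS implementation used for this figure: the function names are hypothetical, the blocks are equal-sized runs of consecutive sites rather than genomic windows, and the jackknife is unweighted. Under the sign convention assumed here, a positive D means P1 shares more alleles with P3 than P2 does.

```python
import numpy as np


def _terms(p1, p2, p3, po):
    """Per-site numerator and denominator terms of D(P1, P2; P3, O)."""
    p1, p2, p3, po = (np.asarray(a, dtype=float) for a in (p1, p2, p3, po))
    num = (p1 - p2) * (p3 - po)
    den = (p1 + p2 - 2 * p1 * p2) * (p3 + po - 2 * p3 * po)
    return num, den


def d_statistic(p1, p2, p3, po):
    """D from allele frequencies in [0, 1], one value per site."""
    num, den = _terms(p1, p2, p3, po)
    return num.sum() / den.sum()


def d_with_z(p1, p2, p3, po, n_blocks=20):
    """Return (D, Z) with Z = D / SE from a delete-one block jackknife."""
    num, den = _terms(p1, p2, p3, po)
    d_all = num.sum() / den.sum()
    # Assumption: contiguous equal-sized blocks of sites stand in for
    # the genomic blocks a real analysis would use.
    blocks = np.array_split(np.arange(num.size), n_blocks)
    d_loo = np.array([
        (num.sum() - num[b].sum()) / (den.sum() - den[b].sum())
        for b in blocks
    ])
    n = len(blocks)
    se = np.sqrt((n - 1) / n * np.sum((d_loo - d_loo.mean()) ** 2))
    return d_all, d_all / se
```

A test would be reported as significant in the sense of this legend when the absolute value of the returned Z exceeds 3.0.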
Figure 3. Demographic scenario inferred from genomic modeling
This demographic history was obtained by compiling the best models of all tested scenarios (see Methods S1— Demogenomic inference with fastsimcoal2—Final model). Times of the events (y axis) and population ages (shown below their symbols) are indicated in ky BP. Under each population name, we indicate their sampled genomes, their associated inbreeding coefficients (Fis), and their diploid effective population sizes (Ne). Unfilled symbols indicate ancestral populations that we simulated after or before key events (split times or admixture events). The X symbols indicate bottlenecks that occurred on ancestral branches, modeled as a one-generation bottleneck through a population with its effective size shown in italics. Admixture proportions >10% from the Western metapopulation are indicated by blue arrows.
See also Figure S2 for effective population sizes inferred by MSMC2.
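The effect of a one-generation bottleneck, as used for the X symbols in this figure, can be illustrated with the standard Wright-Fisher expectation that heterozygosity decays by a factor of 1 − 1/(2N) per generation in a diploid population of effective size N. The sketch below only illustrates that textbook relationship; it is not taken from the fastsimcoal2 model files, and the function name and example values are hypothetical.

```python
def heterozygosity_after_bottleneck(h_before, ne_bottleneck):
    """Expected heterozygosity after one generation at diploid size Ne.

    Uses the Wright-Fisher expectation H' = H * (1 - 1 / (2 * Ne)).
    """
    if ne_bottleneck <= 0:
        raise ValueError("effective size must be positive")
    return h_before * (1.0 - 1.0 / (2.0 * ne_bottleneck))


# Hypothetical values: the smaller the bottleneck size, the larger the loss
for ne in (5, 50, 500):
    print(ne, heterozygosity_after_bottleneck(0.5, ne))
```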
Figure S2. MSMC2 effective population size estimates, related to Figure 3 and Methods S1
These were obtained for populations with either four haplotypes, or two haplotypes when only one ancient individual was available for the population (shown in the legend with ∗; details in Table M1_3; population sizes estimated from single ancient and modern genomes are shown in Figure M1_12). We used a mutation rate of 1.25 × 10⁻⁸ per generation per site and a generation time of 29 years. The analysis suggests smaller effective population sizes in the most recent times for HGs compared with EFs.
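The mutation rate and generation time enter through the usual rescaling of MSMC2 output, which reports time in units of the per-generation mutation rate and a scaled coalescence rate per time segment. A minimal sketch of that conversion follows, assuming the conventional relations years = (scaled time / μ) × generation time and Ne = 1 / (2 μ λ); the function names and the example inputs are hypothetical, and only the two constants come from the legend.

```python
MU = 1.25e-8     # mutation rate per generation per site (from the legend)
GEN_YEARS = 29   # generation time in years (from the legend)


def scaled_time_to_years(t_scaled, mu=MU, gen_years=GEN_YEARS):
    """Convert a mutation-scaled time boundary to calendar years."""
    return t_scaled / mu * gen_years


def coal_rate_to_ne(lam, mu=MU):
    """Convert a scaled coalescence rate to a diploid effective size."""
    return 1.0 / (2.0 * mu * lam)


# Hypothetical segment: scaled time 1e-4 and scaled coalescence rate 4000
print(scaled_time_to_years(1e-4))  # about 232,000 years
print(coal_rate_to_ne(4000))       # about 10,000
```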
Figure 4. Evolutionary insights gained from the demographic scenario shown in Figure 3
(A) MDS analysis done on 12 populations used in the demogenomic analyses and on simulated ancestral populations (unfilled symbols) sampled at key moments of their history, as defined in Figure 3: (1) on the ancestral branch before the split between Western and Eastern metapopulations 25.6 kya; (2) on the Central metapopulation branch just before and after the admixture occurring 14.2 kya; (3) on the Western metapopulation branch just before this admixture; (4) on the Eastern metapopulation branch at the time of split of the Iranian population 13.6 kya; (5) at the top of the western EFs ancestors branch just after its admixture with the Western metapopulation (12.9 kya), and then every 25 generations until the split of the Aegean populations 9.3 kya. Arrows indicate the trajectory of the populations caused by important demographic events (i.e., admixture events, bottlenecks, episodes of drift).
(B) Admixture plot for K = 3 performed on sampled and ancestral populations.
See also Figure S3 for corresponding admixture plots done on observed and simulated individuals.
Figure S3. Comparison of observed and simulated Admixture plots, related to Figure 4 and Methods S1
Admixture plot for K = 2 (left panel) and K = 3 (right panel) carried out on (A) observed data for 16 ancient genomes included in the fastsimcoal2 demographic inferences (909,688 sites without missing data; Bichon, Bar8, and Bon002 were not included because of lower quality) and on (B) data simulated according to our final model (shown in Figure 3), for the same subset of individuals as in (A).
Figure 5. A spatiotemporal interpretation of population differentiation in SW Asia and Europe based on our model and the geographic distribution of the genomes
For a Figure360 author presentation of this figure, see https://doi.org/10.1016/j.cell.2022.04.008.
Colored shaded areas indicate approximate putative distributions of populations at different time points. The letters (A) to (H) indicate the chronological order of events; see main text for a detailed description. Note that warmer periods (Bølling and Allerød interstadials, Holocene) correspond to population range expansions while colder periods (LGM, Older Dryas) are associated with contractions.
See also Figure S4 for additional f-statistics analyses supporting alternative connections between the Levant and the Aegean/Greece area.
Figure S4. Patterns of population admixture revealed by f-statistics, related to Figure 5 and Methods S1
Relationship of Anatolian and Greek Neolithics with the Levant using f-statistics on the 1240k dataset of the form D(CAnatolia/NWAnatolia/NGreece/SGreece/CEurope,Test; Levant,Mbuti [outgroup]). This test indicates whether Neolithic individuals from CAnatolia, NWAnatolia, Greece, or CEurope (left) share less (blue, Z score < 0) or more (orange, Z score > 0) ancestry with samples from the Levant, namely individuals from contemporary Israel associated with Natufian culture (Israel_Natufian), Pre-Pottery Neolithic B (Israel_PPNB), and Chalcolithic (Israel_C; all bottom) than the Test individuals/populations from Greece and CAnatolia. Significant Z scores above 3.0 or below −3.0 are shown with more intense colors. We find a strongly significant excess of shared drift between populations from the Levant and NGreece, SGreece, NWAnatolia, and CEurope when contrasted to Boncuklu. This signal was, however, not replicated when representing CAnatolia by Pınarbaşı. In contrast, the SGreece populations Diros_EN and Peloponnese_N appear to share excess drift with populations from the Levant, and in particular the Chalcolithic Israel_C, when contrasted to samples from NGreece, NWAnatolia, or CEurope.
| REAGENT or RESOURCE | SOURCE | IDENTIFIER |
|---|---|---|
| Ancient human bone material | This study | AKT16; ERS10598167 |
| Ancient human bone material | This study | Bar25; ERS10598177 |
| Ancient human bone material | This study | Nea2; ERS10598179 |
| Ancient human bone material | This study | Nea3; ERS10598178 |
| Ancient human bone material | This study | VLASA32; ERS10598181 |
| Ancient human bone material | This study | VLASA7; ERS10598180 |
| Ancient human bone material | This study | LEPE48; ERS10598171 |
| Ancient human bone material | This study | LEPE52; ERS10598168 |
| Ancient human bone material | This study | STAR1; ERS10598170 |
| Ancient human bone material | This study | VC3-2; ERS10598169 |
| Ancient human bone material | This study | Asp6; ERS10598172 |
| Ancient human bone material | This study | Klein7; ERS10598176 |
| Ancient human bone material | This study | Ess7; ERS10598174 |
| Ancient human bone material | This study | Dil16; ERS10598175 |
| Ancient human bone material | This study | Herx; ERS10598173 |
| AccuPrime™ Pfx SuperMix | Invitrogen | Cat#12344040 |
| Bst Polymerase, Large Fragment (8 U/μl) | New England Biolabs GmbH | Cat#M0275S |
| dNTPs (each 10 mM) | QIAGEN, Hilden, Germany | Cat#201901 |
| EDTA (0.5 M, pH 8.0) | Ambion/Applied Biosystems, Life Technologies | Cat#AM9262 |
| Lauroylsarcosine, Sodium Salt | Merck Millipore, Merck KGaA, Darmstadt, Germany | Cat#428010 |
| NEBNext End Repair Enzyme Mix | New England Biolabs GmbH | Cat#E6050L |
| NEBNext End Repair Reaction Buffer (10X) | New England Biolabs GmbH | Cat#E6050L |
| Nuclease-free H2O | Life Technologies | Cat#AM9932 |
| PEG-4000 | Thermo Scientific | Cat#EL0011 |
| Proteinase K | Roche Diagnostics, Mannheim, Germany | Cat#3115828001 |
| T4 DNA Ligase (5 U/μl) | Thermo Scientific | Cat#EL0011 |
| T4 DNA Ligase Buffer (10X) | Thermo Scientific | Cat#EL0011 |
| ThermoPol Buffer (10X) | New England Biolabs GmbH | Cat#M0275S |
| Tris-EDTA | Sigma-Aldrich | Cat#T9285 |
| Tris-HCl (1M, pH 8.0) | Life Technologies | Cat#15568025 |
| USER™ enzyme | New England Biolabs GmbH | Cat#M5505L |
| Agilent 2100 Expert Bioanalyzer System and High Sensitivity DNA Analysis Kit | Agilent Technologies | Cat#5067-4626 (kit) |
| Qubit Fluorometric quantitation and dsDNA HS Assay Kit | Invitrogen | Cat#Q32854 (kit) |
| Sequencing data and aligned BAMfiles | This study | ENA: PRJEB50857 |
| Code and input-files connected to this study | This study | |
| Ancient genome - Demultiplexed FASTQ files kindly provided by the authors (contact: Zuzana Hofmanová, co-author of this study) | ( | Bar8 |
| Ancient genome - Demultiplexed FASTQ files | ( | Bichon; ERR1078331 - ERR1078351 |
| Ancient genome - FASTQ files (produced from BAMfiles by ENA) | ( | Bon002; ERR1514027 |
| Ancient genome - Demultiplexed FASTQ files kindly provided by the authors (contact: Kay Prüfer) | ( | Loschbour |
| Ancient genome - Raw sequencing FASTQ file kindly provided by the authors (contact: Lara Cassidy) | ( | NE1 |
| Ancient genome - Aligned BAM file | ( | SF12; ERR2060277 |
| Ancient genome - Demultiplexed FASTQ files kindly provided by the authors (contact: Kay Prüfer) | ( | Stuttgart |
| Ancient genome - Demultiplexed FASTQ file kindly provided by the authors (contact: Yoan Diekmann, co-author of this study) | ( | CarsPas1 |
| Ancient genome - Demultiplexed FASTQ files kindly provided by the authors (contact: Jens Blöcher, co-author of this study) | ( | WC1 |
| Ancient genome - Demultiplexed FASTQ files | ( | KK1; ERR1078321-ERR1078325 |
| Modern genomes - Aligned BAMfiles from 77 modern individuals | The Simons Genome Diversity Project (SGDP) | |
| 1000 Genomes phase3 per chromosome VCFs | ( | |
| Allen Ancient DNA Resource v42.4 and v37.2 | Reich lab public data release | |
| Assembly gaps and centromeres | UCSC genome browser | |
| Chimpanzee hg19 nucleotide states | UCSC genome browser | |
| Chimpanzee Reference Genome (panTro4) | GenBank assembly accession: GCA_000001515.4 | |
| CpG islands list | UCSC genome browser | (CpG Islands (cpgIslandExt) Track) |
| Ensembl Compara 71 genome FASTA files | Ensembl Compara | |
| Gorilla hg19 nucleotide states | UCSC genome browser | |
| Haplotype Reference Consortium dataset | ( | accession number EGAD00001002729 on the European Genome-phenome Archive |
| HapMap file for chrX (ANGSD) HapMapChrx.gz | ( | |
| HapMap phase II b37 genetic map | N/A | |
| Human reference sequence hs37d5 | ( | |
| Known InDel positions | ( | |
| Known InDel positions | ( | |
| Per chromosome mappability mask for human reference genome hs37d5 | The Simons Genome Diversity Project (SGDP) | |
| Recombination Map for YRI population | ( | |
| Ultraconserved sites for recalibration step | ( | |
| P5 and P7 adapters | ( | N/A |
| P5 and P7 indexing primers with index sequences (8 bp) from the NexteraXT index Kit v2 | ( | N/A |
| ADMIXTOOLS package v7300 | ( | |
| ANGSD - version 0.917 | ( | |
| ANNOVAR | ( | |
| ATLAS - commits 6bd2482 & 7cfc900 | ( | |
| ATLAS-Pipeline, commit 6df90e7 | Wegmann lab, Ilektra Schulz | |
| bcftools versions: 1.9 and 0.1.15 | ( | |
| bwa - Burrows-Wheeler Alignment Tool - versions 0.7.15 and 0.7.17 | ( | bio-bwa.sourceforge.net |
| BEDOPS v2.4.40 | ( | |
| Bedtools 2.25.0 | ( | |
| ContamMix - version 1.0 | ( | |
| dadi | ( | |
| fastsimcoal2.7 | ( | |
| fastqc - version 0.11.5 | Babraham Bioinformatics | |
| GATK - version 3.7 | ( | |
| HIrisPlex-S webtool | ( | |
| IBDSeq v. r1206 | ( | |
| LEA R package v.2.6.0 | ( | |
| mafft - version 7.31 | ( | |
| MIA (Mapping Iterative Assembler) - version 1.0 | MPI EVA Bioinformatics | |
| MSMC2 | ( | |
| MSMC-tools, commit 07bc8a9 | ( | |
| phy-mer | ( | |
| Picard-tools - version 2.9 | Broad Institute | |
| R - version 4.0, 3.7, and 3.6.1 | | |
| SAMtools - version 1.3 | ( | |
| Samtools 1.9 | ( | |
| seqtk - version 1.2 | N/A | |
| SHAPEIT4 v1.2 | ( | |
| Snakemake - version 4.0 | ( | |
| Trim Galore! - version 0.4.3 | Babraham Bioinformatics | |
| Yjasc_3752_ry_compute.py, version 0.4 | ( | |
| Yleaf | ( | |
| Agencourt® AMPure® XP beads | Beckman Coulter | Cat#A63880 |
| Amicon Ultra-4 Centrifugal Filter Units, 30kDa | Merck Millipore, Darmstadt, Germany | Cat#UFC803096 |
| Hydroxyapatite | Sigma-Aldrich | Cat#21223 |
| MinElute PCR Purification Kit | QIAGEN, Hilden, Germany | Cat#28006 |
| MSB® Spin PCRapace | Invitek Molecular GmbH, Berlin, Germany | Cat#1020220400 |
| Spezial-Edelkorund (EW60/250μ) | Harnisch+Rieth, Winterbach, Germany | Cat#75250 |
| Spezial-Edelkorund (30B/50μ) | Harnisch+Rieth, Winterbach, Germany | Cat#75308 |