Mateja Hajdinjak1, Qiaomei Fu2,3,4, Alexander Hübner1, Martin Petr1, Fabrizio Mafessoni1, Steffi Grote1, Pontus Skoglund5, Vagheesh Narasimham5, Hélène Rougier6, Isabelle Crevecoeur7, Patrick Semal8, Marie Soressi9,10, Sahra Talamo10, Jean-Jacques Hublin10, Ivan Gušić11, Željko Kućan11, Pavao Rudan11, Liubov V Golovanova12, Vladimir B Doronichev12, Cosimo Posth13,14, Johannes Krause13,14, Petra Korlević1, Sarah Nagel1, Birgit Nickel1, Montgomery Slatkin15, Nick Patterson5,16, David Reich5,16,17, Kay Prüfer1, Matthias Meyer1, Svante Pääbo1, Janet Kelso1. 1. Department of Evolutionary Genetics, Max Planck Institute for Evolutionary Anthropology, D-04103 Leipzig, Germany. 2. Key Laboratory of Vertebrate Evolution and Human Origins of Chinese Academy of Sciences, IVPP, CAS, Beijing 100044, China. 3. CAS Center for Excellence in Life and Paleoenvironment, Beijing 100044, China. 4. University of Chinese Academy of Sciences, Beijing 100049, China. 5. Department of Genetics, Harvard Medical School, Boston, Massachusetts 02115, USA. 6. Department of Anthropology, California State University Northridge, Northridge, California 91330-8244, USA. 7. Université de Bordeaux, CNRS, UMR 5199-PACEA, 33615 Pessac Cedex, France. 8. Royal Belgian Institute of Natural Sciences, 1000 Brussels, Belgium. 9. Faculty of Archaeology, Leiden University, 2300 RA Leiden, The Netherlands. 10. Department of Human Evolution, Max Planck Institute for Evolutionary Anthropology, D-04103 Leipzig, Germany. 11. Croatian Academy of Sciences and Arts, HR-10000 Zagreb, Croatia. 12. ANO Laboratory of Prehistory 14 Linia 3-11, St Petersburg 1990 34, Russia. 13. Max Planck Institute for the Science of Human History, 07745 Jena, Germany. 14. Institute for Archaeological Sciences, University of Tübingen, Rümelin Strasse 23, 72070 Tübingen, Germany. 15. Department of Integrative Biology, University of California, Berkeley, California 94720-3140, USA. 16. Broad Institute of Harvard and MIT, Cambridge, Massachusetts 02142, USA. 17. Howard Hughes Medical Institute, Harvard Medical School, Boston, Massachusetts 02115, USA.
Abstract
Although it has previously been shown that Neanderthals contributed DNA to modern humans, not much is known about the genetic diversity of Neanderthals or the relationship between late Neanderthal populations at the time at which their last interactions with early modern humans occurred and before they eventually disappeared. Our ability to retrieve DNA from a larger number of Neanderthal individuals has been limited by poor preservation of endogenous DNA and contamination of Neanderthal skeletal remains by large amounts of microbial and present-day human DNA. Here we use hypochlorite treatment of as little as 9 mg of bone or tooth powder to generate between 1- and 2.7-fold genomic coverage of five Neanderthals who lived around 39,000 to 47,000 years ago (that is, late Neanderthals), thereby doubling the number of Neanderthals for which genome sequences are available. Genetic similarity among late Neanderthals is well predicted by their geographical location, and comparison to the genome of an older Neanderthal from the Caucasus indicates that a population turnover is likely to have occurred, either in the Caucasus or throughout Europe, towards the end of Neanderthal history. We find that the bulk of Neanderthal gene flow into early modern humans originated from one or more source populations that diverged from the Neanderthals that were studied here at least 70,000 years ago, but after they split from a previously sequenced Neanderthal from Siberia around 150,000 years ago. Although four of the Neanderthals studied here post-date the putative arrival of early modern humans into Europe, we do not detect any recent gene flow from early modern humans in their ancestry.
Although it has previously been shown that Neanderthals contributed DNA to modern humans, not much is known about the genetic diversity of Neanderthals or the relationship between late Neanderthal populations at the time at which their last interactions with early modern humans occurred and before they eventually disappeared. Our ability to retrieve DNA from a larger number of Neanderthal individuals has been limited by poor preservation of endogenous DNA and contamination of Neanderthal skeletal remains by large amounts of microbial and present-day human DNA. Here we use hypochlorite treatment of as little as 9 mg of bone or tooth powder to generate between 1- and 2.7-fold genomic coverage of five Neanderthals who lived around 39,000 to 47,000 years ago (that is, late Neanderthals), thereby doubling the number of Neanderthals for which genome sequences are available. Genetic similarity among late Neanderthals is well predicted by their geographical location, and comparison to the genome of an older Neanderthal from the Caucasus indicates that a population turnover is likely to have occurred, either in the Caucasus or throughout Europe, towards the end of Neanderthal history. We find that the bulk of Neanderthal gene flow into early modern humans originated from one or more source populations that diverged from the Neanderthals that were studied here at least 70,000 years ago, but after they split from a previously sequenced Neanderthal from Siberia around 150,000 years ago. Although four of the Neanderthals studied here post-date the putative arrival of early modern humans into Europe, we do not detect any recent gene flow from early modern humans in their ancestry.
The Middle to Upper Palaeolithic transition in Europe was characterized by major cultural and biological changes that coincided with the arrival of early modern humans and the disappearance of Neandertals8,9. Analysis of the first Neandertal genomes provided evidence for gene flow from Neandertals into modern humans between 50,000 and 60,000 years ago, resulting in around 2% of Neandertal DNA in the genomes of non-Africans today1,2,10. Additionally, genetic analyses of a ~39,000-42,000-year-old modern human from Romania, Oase 1, showed that interbreeding between Neandertals and modern humans also happened in Europe at a later point in time11. However, little is known about the diversity of late Neandertal populations across Europe and West Asia shortly before their disappearance, or about their relationship to the population that admixed with early modern humans. To date, only a handful of Neandertal remains have been identified with a sufficiently high content of endogenous DNA and low enough levels of microbial and human DNA contamination3 to allow analysis of larger parts of their genomes1,2,7, limiting our ability to study their genetic history.In an attempt to make more Neandertal genomes available for population analyses, we identified five specimens with sufficient preservation of endogenous DNA to explore the possibility of nuclear genome sequencing (Fig. 1A and Supplementary Information 1): a fragment of a right femur (Goyet Q56-1) dated to 43,000-42,080 cal BP from the Troisième caverne of Goyet in Belgium12; an upper right molar (Spy 94a) that is associated with a maxillary fragment dated to 39,150-37,880 cal BP from the neighbouring Spy cave13; a tooth (Les Cottés Z4-1514) dated to 43,740-42,720 cal BP from Les Cottés cave14 in France; an undiagnostic bone fragment found in Vindija cave in Croatia (Vindija 87) dated to >44,000 uncalibrated BP; and a skull fragment of an infant from Mezmaiskaya cave (Mezmaiskaya 2) in the Russian Caucasus dated to 44,600-42,960 cal BP15.
Figure 1
Specimen information and the effects of 0.5% hypochlorite treatment.
a, Location and age of the five late Neandertal specimens analysed in this study (new), and other Neandertal sites from which genome-wide data was obtained previously (old; map source Vectormapcollection). b, Proportion of DNA fragments aligned to the human reference genome in untreated bone/tooth powder and in powder treated with 0.5% sodium hypochlorite (Supplementary Table S2.1). c, Proportion of present-day human contamination (with 95% binomial confidence intervals) inferred from mitochondrial DNA fragments in treated and untreated sample material. Pearson's chi-squared test (two-sided) used for calculating significant differences (denoted with **, α << 0.001) (Supplementary Table S2.5).
We extracted DNA from between 9 and 58 mg of bone or tooth powder. Approximately half of the powder from each specimen was treated with 0.5% hypochlorite solution prior to DNA extraction in order to remove present-day human and microbial DNA contamination6. Hypochlorite treatment increased the proportion of DNA fragments mapping to the human reference genome between 5.6- and 161-fold (Fig. 1B; Supplementary Information 2), and reduced present-day human contamination in four of the specimens between 2- and 18-fold (Fig. 1C; Supplementary Information 2). This substantial increase in the proportion of informative fragments made whole genome sequencing of these previously inaccessible specimens feasible.We generated additional single-stranded DNA libraries from selected extracts and sequenced them to an average genomic coverage of 2.7-fold for Les Cottés Z4-1514, 2.2-fold for Goyet Q56-1, 1.7-fold for Mezmaiskaya 2, 1.3-fold for Vindija 87 and 1-fold for Spy 94a (Extended Data Table 1; Supplementary Information 3). Estimates of mitochondrial and nuclear DNA contamination ranged from 0.52% to 5.06% and from 0.18% to 1.75%, respectively (Supplementary Information 4). Therefore, we restricted analyses to fragments carrying cytosine (C) to thymine (T) substitutions at their ends as these derive from deamination of cytosine to uracil and indicate that DNA molecules are of ancient origin16,17 (Extended Data Fig. 1 and 2). This reduced the mtDNA contamination to 0.39-1.61% and the autosomal contamination to 0-0.81% (Supplementary Information 4). To mitigate the influence of deamination on genetic inferences, we further restricted the analyses to only transversion polymorphisms (see Supplementary Information 6).
Extended Data Table 1
Amount of data generated for Les Cottés Z4-1514, Goyet Q56-1, Mezmaiskaya 2, Vindija 87 and Spy 94a.
Reported is the number of fragments after merging all of the sequencing libraries together. The obtained coverage of the nuclear genomes is determined by counting the number of bases with the base quality of at least 30 in fragments longer than 35 bp with the mapping quality of at least 25 that overlapped highly mappable regions of autosomes of the human genome and dividing that number by the total length of these regions.
Specimen
All fragments
Fragments with terminal C-to-T substitutions to the reference genome
Number of sequenced fragments
Number of fragments ≥ 35 bp
Number of mapped fragments ≥ 35 bp, MQ ≥ 25, Map35_100%
Number of unique fragments ≥ 35 bp, MQ ≥ 25, Map35_100%
Obtained nuclear coverage
Number of mapped fragments ≥ 35 bp, MQ ≥ 25, Map35_100%
Number of unique fragments ≥ 35 bp, MQ ≥ 25, Map35_100%
Obtained nuclear coverage
Les Cottés Z4-1514
1,776,108,129
978,981,412
253,134,472
121,336,498
2.71
59,777,476
41,804,332
1.00
Goyet Q56-1
917,798,995
575,597,226
124,582,445
80,515,019
2.18
13,585,646
9,859,280
0.27
Mezmaikaya 2
2,571,128,011
1,356,302,134
125,815,832
74,407,074
1.74
32,838,281
23,521,855
0.56
Vindija 87
1,353,546,508
678,005,556
110,385,510
60,034,976
1.27
32,597,516
22,828,056
0.49
Spy 94a
733,212,063
278,595,622
72,962,863
46,836,553
1.00
14,301,192
10,003,560
0.22
Extended Data Figure 1
Frequency of nucleotide substitutions at the beginning and the end of nuclear alignments for the final dataset of Les Cottés Z4-1514, Goyet Q56-1, Mezmaiskaya 2, Vindija 87 and Spy 94a.
Only fragments of at least 35 base pairs (bp) that mapped to the human reference genome with a mapping quality of at least 25 (MQ ≥ 25) were used for this analysis. Solid lines depict all fragments and dashed lines the fragments that have a C-to-T substitution at the opposing end (‘conditional’ C-to-T substitutions). All other types of substitutions are marked in grey.
Extended Data Figure 2
Fragment size distribution of fragments longer than 35 base pairs (bp) mapped to the human reference genome with the mapping quality of at least 25 (MQ≥25) for each of the five late Neandertals.
All fragments are depicted in solid lines and fragments with C-to-T substitutions to the reference genome (putatively deaminated fragments) are depicted with dashed lines.
A phylogenetic tree (Fig. 2A) of the reconstructed complete mitochondrial genomes places these five specimens within the Neandertal mtDNA variation. The mtDNA of Les Cottés Z4-1514 is more closely related to the mtDNAs of Neandertals from Okladnikov cave and Denisova cave located at the Eastern edge of the Neandertal range (Supplementary Information 5), whereas the mtDNA of Mezmaiskaya 2 is most closely related to Feldhofer 2 from Germany, challenging the previously proposed division between Eastern and Western mtDNAs in late-surviving Neandertals18. As inferred from the sequence coverage of the X chromosome and the autosomes, Goyet Q56-1, Les Cottés Z4-1514 and Vindija 87 were females, whereas Mezmaiskaya 2 and Spy 94a were males (Extended Data Fig. 3). The Y chromosome sequences of both male individuals fall outside the known variation of present-day human Y chromosomes (Fig. 2B; Supplementary Information 5), as is the case for the Y chromosome of a Neandertal from El Sidrón, Spain19.
Figure 2
Phylogenetic relationships of late Neandertals.
a, Bayesian phylogenetic tree of mitochondrial genomes of 23 Neandertals, 3 Denisovans, 64 modern humans and a hominin from Sima de los Huesos. Reported are posterior probabilities for the branches. b, Neighbour-joining tree of Y chromosome sequences of Mezmaiskaya 2, Spy 94a, 175 present-day humans21 and two present-day humans carrying the A00 haplogroup30. Numbers of substitutions are denoted above the branches.
c, Neighbour-joining tree of nuclear genomes based on autosomal transversions among late Neandertals, Vindija 33.19, Mezmaiskaya 1, Altai Neandertal, Denisovan and 12 present-day humans. Reported are bootstrap support values after 1000 replications.
Extended Data Figure 3
Sex determination based on the number of fragments aligning to the X chromosome and the autosomes.
The expected ratios of X to (X + autosomal) fragments for a female and a male individual are depicted as dashed lines. The results were concordant for all fragments (in red) and for deaminated fragments only (in grey).
We analysed the genomes of the five late Neandertals with the previously published high-quality genomes of a ~120,000-year-old Neandertal from Siberia (AltaiNeandertal)2 and a >45,000-year-old Neandertal from Croatia (Vindija 33.19)7, the low-coverage genome of a ~60,000-70,000-year-old Neandertal from Mezmaiskaya cave (Mezmaiskaya 1)2,7, the composite low-coverage genome of three Neandertals from Croatia1, the high-coverage genome of a Denisovan individual20, and a world-wide panel of present-day humans2,21. Based on sharing of derived alleles with the Vindija 33.19 genome7, we find that the Vindija 87 specimen originates from the same individual as Vindija 33.19. We therefore excluded Vindija 87 from subsequent analyses (Supplementary Information 7).A neighbour-joining tree based on the number of pairwise transversions between individuals shows that all Neandertals form a monophyletic clade relative to the Denisovan individual (Fig. 2C; Supplementary Information 7 and 8). The tree reflects an apparent age-related division among the Neandertals with the oldest specimen, the AltaiNeandertal branching off first, followed by Mezmaiskaya 1, while the late Neandertals form a clade. To assess the relationship of the late Neandertals to the high coverage genomes of the Altai and Vindija 33.19 Neandertals, we used two related statistics that measure the fraction of derived allele-sharing between the genomes. We find that all late Neandertals and Mezmaiskaya 1 share significantly more derived alleles with Vindija 33.19 than with the AltaiNeandertal (-32.1 ≤ Z ≤ -58.1; Extended Data Table 2; Supplementary Information 9) and that late Neandertals share on average 49.0% (95% CI: 44.2-54.2%) of the derived alleles separating Vindija 33.19 from other high-coverage genomes in the analysis, whereas they share 17.2% (95% CI: 15.9-18.3%) of the alleles derived in the AltaiNeandertal (Extended Data Table 3; Supplementary Information 7).
Extended Data Table 2
Relationship of the late Neandertals and Mezmaiskaya 1 to the Altai and Vindija 33.19 calculated as D (Altai, Vindija 33.19, Neandertal, outgroup) for all fragments and deaminated fragments, restricted to transversions.
All late Neandertals and Mezmaiskaya 1 are significantly closer to Vindija 33.19 than to the Altai Neandertal, irrespective of the used outgroup. Blue denotes Z-score << -2. The total number of informative sites that are transversions among Neandertals and where at least one late Neandertal has coverage is 1,567,449.
D (Altai, Vindija 33.19, Neandertal, outgroup)
All fragments
Deaminated fragments
W
X
Y
Z
% D
Z-score
BABA
ABBA
% D
Z-score
BABA
ABBA
Altai
Vindija33.19
Les Cottés Z4-1514
Dinka
48.87
-55.31
30,276
10,399
49.08
-52.83
19,048
6,506
Altai
Vindija33.19
Les Cottés Z4-1514
Mbuti
49.08
-57.83
30,340
10,362
49.34
-55.25
19,077
6,471
Altai
Vindija33.19
Les Cottés Z4-1514
Chimp
49.16
-59.83
29,427
10,028
49.38
-56.69
18,597
6,302
Altai
Vindija33.19
Les Cottés Z4-1514
Orang
48.88
-60.25
28,149
9,664
49.15
-57.21
17,775
6,060
Altai
Vindija33.19
Les Cottés Z4-1514
Rhesus
48.35
-58.96
25,856
9,003
48.38
-55.77
16,281
5,664
Altai
Vindija33.19
Goyet Q56-1
Dinka
55.87
-66.92
33,208
9,403
56.32
-56.88
8,673
2,423
Altai
Vindija33.19
Goyet Q56-1
Mbuti
56.00
-68.27
33,269
9,383
56.32
-58.37
8,689
2,428
Altai
Vindija33.19
Goyet Q56-1
Chimp
55.96
-68.97
32,188
9,090
56.76
-59.25
8,485
2,340
Altai
Vindija33.19
Goyet Q56-1
Orang
55.68
-68.11
30,735
8,749
56.79
-59.39
8,125
2,239
Altai
Vindija33.19
Goyet Q56-1
Rhesus
55.40
-67.04
28,312
8,125
56.31
-58.09
7,483
2,092
Altai
Vindija33.19
Mezmaiskaya1
Dinka
36.58
-36.63
20,392
9,468
37.09
-32.69
8,379
3,845
Altai
Vindija33.19
Mezmaiskaya1
Mbuti
36.80
-37.33
20,516
9,479
37.43
-33.41
8,438
3,842
Altai
Vindija33.19
Mezmaiskaya1
Chimp
36.70
-38.66
19,860
9,197
37.00
-33.45
8,126
3,736
Altai
Vindija33.19
Mezmaiskaya1
Orang
36.18
-37.38
18,904
8,860
36.77
-32.27
7,758
3,586
Altai
Vindija33.19
Mezmaiskaya1
Rhesus
36.00
-37.77
17,433
8,204
36.36
-32.05
7,129
3,327
Altai
Vindija33.19
Mezmaiskaya2
Dinka
47.44
-42.97
26,802
9,553
46.84
-40.59
12,477
4,516
Altai
Vindija33.19
Mezmaiskaya2
Mbuti
47.83
-47.39
26,781
9,451
47.13
-44.43
12,471
4,481
Altai
Vindija33.19
Mezmaiskaya2
Chimp
47.92
-51.11
26,019
9,160
47.37
-47.06
12,233
4,369
Altai
Vindija33.19
Mezmaiskaya2
Orang
47.86
-53.48
24,921
8,788
47.07
-47.53
11,700
4,210
Altai
Vindija33.19
Mezmaiskaya2
Rhesus
47.33
-49.32
22,908
8,189
46.65
-44.74
10,799
3,929
Altai
Vindija33.19
Spy 94a
Dinka
54.86
-56.62
21,416
6,242
55.17
-50.39
6,558
1,895
Altai
Vindija33.19
Spy 94a
Mbuti
55.18
-59.76
21,453
6,196
55.53
-52.89
6,567
1,878
Altai
Vindija33.19
Spy 94a
Chimp
55.32
-60.58
20,814
5,988
55.39
-50.67
6,354
1,824
Altai
Vindija33.19
Spy 94a
Orang
55.01
-60.26
19,793
5,746
55.33
-50.10
6,052
1,740
Altai
Vindija33.19
Spy 94a
Rhesus
54.46
-58.94
18,179
5,360
54.93
-48.52
5,584
1,624
Extended Data Table 3
The fraction of derived alleles among putatively deaminated fragments that each of the low coverage individuals shares with the Altai Neandertal, Vindija 33.19, Denisovan, and a present day human genome.
The state of DNA fragments overlapping the positions at which the high coverage genomes of the Altai Neandertal, the Vindija 33.19 Neandertal, the Denisovan individual and a present-day African (Mbuti, HGDP00982) differ from those of the great apes were investigated. Fragments longer than 35bp with mapping quality of at least 25 and within the highly mappable regions of the genome that had terminal C-to-T substitutions reported in the Table S3.2 were utilized. 95% binomial confidence intervals are provided in brackets.
Human (%)
Neandertal (%)
Altai Neandertal (%)
Vindija 33.19 (%)
Denisovan (%)
Neandertal-Denisova (%)
Human-Neandertal (%)
Human-Denisova (%)
Les Cottés Z4-1514
0.67[0.65-0.70]
93.01[92.89-93.13]
18.00[17.69-18.32]
46.41[46.06-46.77]
0.86[0.82-0.89]
97.90[97.83-97.98]
97.30[97.17-97.43]
3.22[3.07-3.38]
Goyet Q56-1
0.60[0.56-0.65]
94.38[94.19-94.57]
16.47[15.93-17.02]
53.55[52.91-54.18]
0.80[0.74-0.86]
98.17[98.04-98.29]
97.72[97.49-97.92]
2.52[2.29-2.78]
Spy 94a
0.71[0.66-0.77]
93.89[93.65-94.12]
16.52[15.90-17.17]
51.45[50.70-52.21]
0.80[0.73-0.87]
97.83[97.66-97.99]
97.62[97.35-97.87]
3.06[2.75-3.39]
Vindija 87
0.36[0.33-0.38]
97.08[96.97-97.19]
6.81[6.53-7.09]
75.93[75.49-76.36]
0.37[0.34-0.40]
99.05[98.98-99.12]
98.91[98.79-99.02]
1.38[1.24-1.53]
Mezmaiskaya 2
0.65[0.62-0.69]
92.78[92.62-92.93]
17.91[17.51-18.31]
44.65[44.19-45.12]
0.83[0.79-0.88]
97.74[97.63-97.84]
97.53[97.36-97.68]
3.38[3.18-3.59]
Mezmaiskaya 1
0.76[0.72-0.80]
91.87[91.67-92.06]
20.30[19.81-20.80]
38.38[37.85-38.91]
0.97[0.91-1.03]
97.61[97.48-97.73]
96.80[96.58-97.00]
3.73[3.49-4.00]
Vindija 33.16
3.60[3.50-3.71]
93.39[93.18-93.59]
17.09[16.55-17.65]
58.99[58.36-59.63]
3.94[3.81-4.08]
96.14[95.95-96.32]
95.46[95.15-95.74]
4.89[4.56-5.24]
Vindija 33.25
3.04[2.94-3.15]
94.17[93.95-94.37]
16.69[16.10-17.30]
60.81[60.11-61.49]
3.41[3.28-3.55]
96.53[96.33-96.72]
95.79[95.46-96.09]
3.77[3.46-4.12]
Vindija 33.26
3.49[3.38-3.62]
93.54[93.31-93.77]
17.04[16.43-17.67]
59.65[58.94-60.36]
3.70[3.55-3.85]
96.25[96.04-96.45]
95.40[95.04-95.72]
4.56[4.21-4.95]
Feldhofer 1
4.71[2.40-9.01]
94.38[87.51-97.58]
20[8.86-39.13]
50.00[34.07-65.93]
1.80[0.50-6.33]
90.63[81.02-95.63]
96.00[80.46-99.29]
0[0-12.87]
El Sidron 1253
1.29[0.44-3.72]
90.99[84.21-95.03]
17.50[8.75-31.95]
42.86[30.02-56.73]
4.35[2.22-8.34]
95.24[88.39-98.13]
97.44[86.82-99.55]
2.33[0.41-12.06]
Denisova 4
2.63[1.51-4.54]
2.31[1.06-4.94]
2.13[0.59-7.43]
2.04[0.56-7.14]
71.43[65.87-76.40]
96.15[91.86-98.23]
10.00[5.15-18.51]
90.32[80.45-95.49]
Denisova 8
1.60[1.39-1.85]
6.75[6.13-7.43]
1.75[1.23-2.48]
1.83[1.37-2.45]
60.02[58.92-61.12]
92.53[91.66-93.32]
15.32[13.72-17.06]
87.72[86.01-89.25]
Obtaining genome-wide data of multiple late Neandertals from a broad geographical range allowed us to determine for the first time whether relatedness among Neandertals is correlated to their geographical proximity, as is the case for present-day humans22. In support of this, we find that the two Neandertals from Belgium share more derived alleles with each other than with any other Neandertal (-3.65 ≤ Z ≤ -8.47; Supplementary Information 9), and in turn more derived alleles with Neandertals from France and Croatia than with the late Neandertal from the Caucasus. Similarly, the four Neandertals from Vindija cave that come from a relatively narrow time range share more derived alleles with each other than with other Neandertals (-2.2 ≤ Z ≤ -14.5; Supplementary Information 9). Furthermore, specimens of similar age with the largest geographic distance between them (Les Cottés Z4-1514 and Mezmaiskaya 2) shared the fewest derived alleles (Supplementary Information 9). In contrast, Mezmaiskaya 2 shared more derived alleles with the other late Neandertals than with Mezmaiskaya 1 (-2.13 ≤ Z ≤ -9.56; Supplementary Information 9), suggesting that there was a population turnover towards the end of Neandertal history. This turnover may have been the result of a population related to Western Neandertals replacing earlier Neandertals in the Caucasus, or the replacement of Neandertals in the West by a population related to Mezmaiskaya 2. The timing of this turnover coincides with dramatic climatic fluctuations during Marine Isotope Stage 3 between 60,000 and 24,000 years ago23, when extreme cold periods in Northern Europe may have triggered local extinction of Neandertal populations and subsequent re-colonization from refugia in South Europe or Western Asia24,25.We estimated the population split times between each of the low-coverage Neandertals and the two high-coverage Neandertals by determining the fraction of sites at which each of the low coverage Neandertals shares a derived allele that occurs in the heterozygote state in one of the high-quality genomes (F(A|B) statistics1,20). This fraction was then used to estimate the population split times for each pair of Neandertals using previous inferences of how the Neandertal population sizes changed over time2,7. Due to the uncertainties in the mutation rate and generation times, we caution that while the times presented are likely to accurately reflect the relative ages of the population split times, the absolute estimates in years are approximate. We estimate that all late Neandertals separated from a common ancestor with the AltaiNeandertal ~150,000 years ago (kya) (95% CI: 142-186 kya), and from a common ancestor with Vindija 33.19 ~70 kya (95% CI: 58-72 kya; Extended Data Table 4, Supplementary Information 8). The estimates of the population split times from the common ancestors shared with the Denisovan and with modern humans are ~400 kya (95% CI: 367-484 kya) and ~530 kya (95% CI: 503-565 kya; Extended Data Table 4, Supplementary Information 8), respectively, consistent with previous estimates using the Altai and Vindija 33.19 Neandertal genomes1,2,20.
Extended Data Table 4
Time of separation of late Neandertals and Mezmaiskaya 1 (A) from the ancestor with the high-coverage genomes of Altai and Vindija 33.19 Neandertals, Denisovan individual and a present-day human (B), when measured in terms of time of split from the B individual (split A-B), or time from present (split-time + branch shortening (bs)).
Results reported for all fragments and deaminated fragments only (Tables S3.1 and S3.2). 95% confidence intervals for F(A|B) values are estimated via Weighted Block Jackknife, using blocks of 5 Mb. In order to obtain estimates of split times from present, we correct "split A-B" by adding the age of the high coverage B-individual estimated from branch-shortening (column split time + bs (kya)).
All fragments
Deaminated fragments
A
B
% F(A|B)
Split A-B (ky)
Split time + bs (kya)
95% CI (kya)
% F(A|B)
Split A-B (ky)
Split time + bs (kya)
95% CI (kya)
Les Cottés Z4-1514
Altai
35.2
35.0
157.5
144.6-160.7
34.7
37.8
160.2
156.4-163.5
Goyet Q56-1
Altai
35.8
22.3
144.7
142.8-157.6
35.9
22.2
144.6
141.5-159.3
Mezmaiskaya 2
Altai
35.4
33.8
156.2
143.9-159.7
34.2
40.2
162.6
158.7-166.5
Spy 94a
Altai
34.4
39.2
161.7
158.2-165.2
33.6
43.1
165.5
160.4-186.2
Vindija 87
Altai
35.2
35.4
157.8
144.7-161.1
34.6
38.3
160.8
156.2-164.8
Mezmaiskaya 1
Altai
34.8
37.4
159.9
156.1-163.0
34.4
42.9
165.3
161.1-178.6
Les Cottés Z4-1514
Vindija 33.19
35.3
15.1
66.9
66.1-68.7
34.3
18.3
70.0
67.9-72.4
Goyet Q56-1
Vindija 33.19
37.8
8.3
60.1
58.8-61.7
38.0
8.1
59.9
57.8-61.9
Mezmaiskaya 2
Vindija 33.19
35.6
14.3
66.1
64.3-67.7
34.3
18.3
70.0
67.4-74.5
Spy 94a
Vindija 33.19
36.9
10.4
62.2
60.6-64.3
37.2
9.9
62.7
59.1-64.8
Vindija 87
Vindija 33.19
46.8
0.0
51.8
51.8-51.8
45.0
0.0
61.7
51.8-51.8
Mezmaiskaya 1
Vindija 33.19
31.6
42.0
93.8
81.5-96.9
30.6
46.9
98.7
94.5-102.9
Les Cottés Z4-1514
Denisovan
12.6
333.1
405.1
381.9-448.2
12.2
346.7
418.7
390.4-470.8
Goyet Q56-1
Denisovan
12.8
328.6
400.6
377.3-438.0
13.1
322.4
394.3
367.1-434.0
Mezmaiskaya 2
Denisovan
12.4
338.2
410.2
387.0-458.4
11.9
362.0
434.0
398.9-483.8
Spy 94a
Denisovan
12.5
336.0
407.9
384.7-451.6
12.4
339.4
411.3
382.4-470.8
Vindija 87
Denisovan
12.5
337.1
409.1
385.8-455.0
12.2
346.7
418.7
390.9-470.2
Mezmaiskaya 1
Denisovan
12.3
341.6
413.6
389.2-463.5
12.3
345.0
417.0
388.7-471.4
Les Cottés Z4-1514
Mbuti
17.5
525.9
525.9
511.3-539.8
17.2
549.3
549.3
533.2-565.4
Goyet Q56-1
Mbuti
17.7
515.0
515.0
501.8-528.9
17.6
520.8
520.8
502.6-540.6
Mezmaiskaya 2
Mbuti
17.6
521.6
521.6
506.9-536.2
17.4
535.4
535.4
517.9-553.0
Spy 94a
Mbuti
18.0
502.6
502.6
488.7-517.2
17.3
542.7
542.7
518.6-566.1
Vindija 87
Mbuti
17.6
523.0
523.0
509.1-537.6
17.3
542.0
542.0
524.5-558.8
Mezmaiskaya 1
Mbuti
17.8
512.8
512.8
498.9-528.1
17.1
552.2
552.2
534.7-569.8
To investigate whether any of the Neandertals sequenced to date is more closely related to the Neandertal population that contributed genetic material to modern humans, we compared the Neandertal genomes from this and previous studies to the genomes of 263 present-day humans21 as well as a number of early modern humans10,11,26,27. We find that all late Neandertals and the older Mezmaiskaya 1 Neandertal share significantly more derived alleles with the introgressing Neandertals than the AltaiNeandertal does (-2.4 ≤ Z ≤ -5.6; Fig. 3A and Supplementary Information 10), with no significant differences among them (-0.1 ≤ Z ≤ 1.8; Fig. 3B; Supplementary Information 10). Interestingly, this is also true for a ~45,000-year-old modern human from Siberia (Ust’-Ishim) (Fig. 3; Supplementary Information 10), who was contemporaneous with late Neandertals, but is not a direct ancestor of present-day humans10. Thus, the majority of gene flow into early modern humans appears to have originated from one or more Neandertal populations that diverged from other late Neandertals after their split from the AltaiNeandertal about 150 kya, but before the split from Mezmaiskaya 1 at least 90 kya (Extended Data Table 4). Due to the paucity of overlapping genetic data from Oase 1, whose genome revealed an unusually high percentage of Neandertal ancestry11, we were unable to resolve whether one of these late Neandertals was significantly closer than others to the introgressing Neandertal in Oase 1.
Figure 3
Proximity to the introgressing Neandertal populations in present-day and ancient humans calculated from the D (Neandertal1, Neandertal2; non-African, African).
Three Mbuti individuals from SGDP were used as an outgroup and standard errors were calculated using a weighted block jackknife (Supplementary Information 10). Shaded grey region corresponds to the D-values < 0. a, All late Neandertals and the older Mezmaiskaya 1 are significantly closer to the introgressing Neandertal population(s) than the Altai Neandertal
b, There is no significant difference between late Neandertals, Mezmaiskaya 1 and Vindija 33.19 in their proximity to the introgressing Neandertal population(s) in present-day and ancient humans.
Interbreeding between Neandertals and early modern humans is likely to have occurred intermittently, presumably resulting in gene flow in both directions28. However, when we applied an approach that utilizes the extended length of haplotypes expected from recent introgression to the late Neandertals analysed, we find no indications of recent gene flow from early modern humans (Supplementary Information 11). We caution that given the small number of Neandertals analysed we cannot exclude that such gene flow occurred. However, it is striking that Oase 1, one of the two early modern humans that overlapped in time with late Neandertals, showed evidence for recent additional Neandertal introgression10,11 whereas none of the late Neandertals analysed here do. This may indicate that gene flow affected the ancestry of modern human populations more than it did Neandertals29. Further work is necessary to determine if this was the case. Our work demonstrates that the generation of genome sequences from a large number of archaic human individuals is now technically feasible, and opens the possibility to study Neandertal populations across their temporal and geographical range.
Online Methods
DNA extraction and library preparation
All specimens were sampled in clean room facilities dedicated to the analysis of ancient DNA. Between 28 mg and 104 mg of tooth or bone powder was obtained by drilling once into the physically cleaned part of the specimen and split evenly (Supplementary Information 2; Table S2.1). Approximately half of the powder was directly subjected to DNA extraction using a silica-based method31 as implemented by Korlević et al.6 (“untreated” sample), whereas the second half was treated with 0.5% sodium hypochlorite solution6 prior to the DNA extraction in an attempt to remove some of the microbial and present-day human DNA contamination3–5. Five or ten µL of each extract were converted into single-stranded DNA libraries32 with the modifications as in Korlevic et al.6. The initial libraries of Vindija 87, Goyet Q56-1 and Les Cottés Z4-1514 were treated with E. coliuracil-DNA-glycosylase (UDG) and E. coli endonuclease VIII (Endo VIII)15,20 to excise uracils, while all other libraries were prepared without this enzymatic treatment (Supplementary Information 2). The libraries were amplified into plateau33 and tagged with two sample-specific indices6,34. An aliquot of each amplified library was additionally enriched for the hominin mitochondrial DNA using a bead-based hybridization method and modern human mtDNA as a bait35–37. After analysing the first set of libraries, we selected the extracts with the highest proportion of endogenous DNA and the lowest levels of present-day human DNA contamination to produce final set of single-stranded DNA libraries (Supplementary Information 2; Table S2.6)6,32.
Genome sequencing and data processing
All libraries were initially sequenced together on Illumina’s MiSeq and HiSeq 2500 platforms to determine their suitability for whole genome sequencing. Subsequently, 23 libraries from five Neandertal specimens were selected and sequenced on 50 lanes of the Illumina HiSeq 2500 platform in rapid mode, using double index configuration (2x76 bp)34 (Supplementary Information 2). Base calling was done using Bustard (Illumina) for the MiSeq runs and FreeIbis38 for the HiSeq runs. Adapters were trimmed and overlapping paired-end reads were merged into single sequences using leeHom39. The Burrows-Wheeler Aligner (BWA, version: 0.5.10-evan.9-1-g44db244; https://github.com/mpieva/network-aware-bwa)40 was used to align the shotgun data to the modified human reference GRCh37 (ftp://ftp.1000genomes.ebi.ac.uk/vol1/ftp/technical/reference/phase2_reference_assembly_sequence/) and to align the human mitochondrial DNA capture data to the revised Cambridge Reference Sequence (NC_01290) with parameters adjusted for ancient DNA (“-n 0.01 –o 2 –l 16500”)20. We developed a two-step algorithm called jivebunny (https://bioinf.eva.mpg.de/jivebunny) for de-multiplexing of the sequencing runs (Supplementary Information 3), retaining only fragments that were assigned to the correct library based on their index sequences for all of the downstream analyses. PCR duplicates were removed using bam-rmdup (version: 0.6.3; https://bitbucket.org/ustenzel/biohazard) and fragments were filtered for read length (≥ 35 bp) and mapping quality (MQ≥25) using SAMtools (version: 1.3.1)41.The deamination of cytosine (C) to uracil (U) residues leaves characteristic C-to-T substitution patterns in ancient DNA molecules, which are particularly close to the alignment ends15 and thus provide evidence for the presence of authentic ancient DNA in specimens5,42 (Supplementary Information 2 and 3; Extended Data Fig. 1). We evaluated the frequency of these nucleotide substitution patterns characteristic of ancient DNA using an in-house Perl script. For determining the coverage of the nuclear genomes, we counted the number of bases with a base quality of at least 30 (BQ ≥ 30) in the fragments that overlapped highly mappable regions of the autosomes of the human genome (Map35_100% of Prüfer et al.2) and divided it by the total length of those regions. We determined the sex of the Neandertal individuals by counting the number of fragments that aligned to the X chromosome and the autosomes (Extended Data Fig. 3).
Contamination estimates
We used four methods to estimate the proportion of present-day human DNA contamination in the final dataset (Supplementary Information 4). We estimated the proportion of mtDNA contamination by present-day human DNA by using two different sets of positions. We first re-aligned the shotgun data of all HiSeq runs to the revised Cambridge Reference Sequence (rCRS, NC_012920) using BWA40. We then counted the number of fragments overlapping 63 positions where 18 published Neandertal mitochondrial genomes1,2,12,43–46 differ from those of 311 present-day humans43. In the second approach, we determined positions in the reconstructed mtDNA genomes of each of the five Neandertals that are specific for that Neandertal when compared to 311 present-day humans. To mitigate the effect of deamination for both approaches we ignored the alignments on the forward or reverse strands at positions where the informative base was a C or a G17.We estimated the extent of present-day human DNA contamination on the autosomes in the five late Neandertals using a maximum likelihood approach described in Green et al.1, based on all sites covering informative positions in the nuclear genome where humans carry a fixed derived variant when compared to the great apes. Additionally, we estimated levels of present-day human DNA contamination using an ancestry model where each low-coverage Neandertal traces a portion of its genome either from a high-coverage uncontaminated Neandertal, or from a modern human population related to present-day non-Africans. We built a two-source qpAdm47 model where one part of the ancestry was modelled as being closely related to the high-coverage Vindija 33.19 Neandertal, and the other source of ancestry was modelled as being most closely related to the Dinka population of the Simons Genome Diversity Project (SGDP)21. We estimate the proportion of male contamination for the three female individuals, Les Cottés Z4-1514, Goyet Q56-1 and Vindija 87 by counting the number of fragments aligning to the unique regions of the Y chromosome and divided it with the number of fragments that would be expected if the individual was a male (Supplementary Information 4; Table S4.1).
mtDNA and Y chromosomes of Neandertals
We re-aligned the shotgun data of all HiSeq runs to the Vindija 33.16 mitochondrial genome (AM948965)43 using BWA40 to reconstruct complete mitochondrial genomes of five Neandertals. A consensus base was called at each position covered by at least three fragments and where at least two-thirds of fragments had an identical base17 (Supplementary Information 5). We aligned the reconstructed mitochondrial genomes of the five late Neandertals to the mtDNA genomes of 18 Neandertals2,12,43–46,48,49, 311 present-day humans43, 10 ancient modern humans5,10,36,50–52, three Denisovans53–55, a hominin from Sima de los Huesos17 and a chimpanzee56 using MAFFT57 and calculated the number of pairwise differences among mitochondrial genomes using MEGA658. Bayesian phylogenetic analysis was performed using BEAST v2.4.559 to infer the time to the most recent common ancestor (TMRCA) of all Neandertal mitochondrial genomes (Supplementary Information 5). Radiocarbon dates of Neandertals and ancient modern humans were used as calibration points for the molecular clock and we used jModelTest260 and marginal likelihood estimation (MLE) analysis61,62 in order to choose the best fitting substitution, clock, and tree model (Supplementary Information 5).For reconstructing the phylogeny of the Neandertal Y chromosomes, we compared the two male Neandertal individuals, Mezmaiskaya 2 and Spy 94a, to 175 present-day human males from the SGDP21 and two present-day humans with haplogroup A0030. For processing the Y chromosomal data of Neandertal individuals presented here, we followed the processing of the previously published Neandertal Y chromosome analysis19 and used the described parameters to call bases and infer genotypes (Supplementary Information 5).
Set of filters for nuclear data analyses
As the data presented in this study were of low-coverage and contained elevated C-to-T substitutions, we investigated spurious correlations stemming from the properties of the data using the D-statistics1,54,63 (Supplementary Information 6). All of the sampling schemes, except random read sampling and then restricting to transversion polymorphisms, as well as simulation of UDG treatment, resulted in significant spurious correlations between individuals (Supplementary Information 6; Table S6.1). Therefore, we applied random read sampling at each position in the genome that was covered for the low-coverage individuals. To diminish the impact of present-day human DNA contamination and enrich for the endogenous fragments17, we further selected the fragments that showed C-to-T substitutions relative to the human reference genome at the first three and/or the last three positions, i.e. putatively deaminated fragments. The newly generated sequencing data of Mezmaiskaya 17, a ~60,000-70,000-year-old Neandertal from Russia, were processed in exactly the same way as the data of the other low-coverage individuals. For the high-coverage genomes of Altai2 and Vindija 33.197 Neandertals, as well as the Denisovan20 individual, snpAD genotype calls (http://cdna.eva.mpg.de/neandertal/Vindija/VCF/) were used. For comparison to modern humans, we used hetfa files of 263 present-day human genomes of the SGDP21 and snpAD genotype calls of three high quality ancient modern humans, Ust’-Ishim, a ~45,000-year-old modern human from Siberia10; Loschbour, a ~8,000-year-old hunter-gatherer from Luxembourg26; and LBK, a ~7,000-year-old farmer from Stuttgart26. Variant sites across all genomes were extracted (https://bitbucket.org/ustenzel/heffalump) and converted into an input format for AdmixTools (version 4.1)63 or exported into combined VCF files. We further restricted all analyses to bi-allelic sites in the genome covered by at least one low-coverage Neandertal and to transversion polymorphisms.
Principal Component Analysis
We carried out a Principal Component Analysis (PCA)64,65 using genomes of Vindija 33.197, Altai Neandertal2 and Denisovan individual20 to estimate the eigenvectors of the genetic variation and then projected the low-coverage Neandertals onto the defined plane. This allowed us to explore the relationship of low-coverage Neandertals relative to the high-coverage Neandertals (Extended Data Fig. 4).
Extended Data Figure 4
A Principal Component Analysis (PCA) of the genomes of Vindija 33.19, Altai, Denisova, five late Neandertals and Mezmaiskaya 1.
Genomes of the high coverage archaics were used to estimate the eigenvectors of the genetic variation and low coverage Neandertals were projected onto the plane. Only transversion polymorphisms and bi-allelic sites were considered for the analysis, amounting to a total of 1,010,417 sites as defined by the high coverage genomes.
Lineage attribution and average sequence divergence of late Neandertals
We followed an approach introduced in Meyer et al.66 based on the sharing of derived alleles with a certain hominin group in order to determine more precisely to which hominin group the nuclear genomes of Les Cottés Z4-1514, Goyet Q56-1, Spy 94a, Mezmaiskaya 2 and Vindija 87 are most closely related to. We investigated the state of DNA fragments overlapping the positions at which the high-coverage genomes of the Altai Neandertal2, the Vindija 33.19 Neandertal7, the Denisovan individual20 and a present-day African (Mbuti, HGDP00982)2 differ from those of the great apes (chimpanzee, bonobo, gorilla and orangutan). We then determined the proportion of fragments for each of the low-coverage Neandertals that supported the derived state of each of the branches in the phylogenetic tree relating the four high-coverage genomes66 (Extended Data Table 3; Supplementary Information 7; Table S7.1).We estimated the divergence of the five late Neandertal genomes along the lineage from the ancestor shared with the chimpanzee and the high-coverage genomes of the Altai and Vindija 33.19 Neandertals, the Denisovan individual, or a present-day human from the B-panel of Prüfer et al.2, using the triangulation method previously applied to a number of ancient genomes1,2,20,54,55. We calculated how many of the substitutions inferred to have occurred from the human-chimpanzee ancestor to the high-coverage genomes occurred after the split from the low-coverage genome (Supplementary Information 7; Tables S7.2 and S7.3). Standard errors were computed by a Weighted Block Jackknife67 with a block size of 5 million base pairs (5 Mb) across all autosomes.
Split times of late Neandertals and the neighbour-joining tree of nuclear genomes
We conditioned on the heterozygous sites in the two high-coverage Neandertals, and then computed the fraction of sites that show the same derived allele in randomly sampled fragments of the low-coverage individuals in order to estimate the split times between the low-coverage Neandertals and the Altai2 and Vindija 33.197 Neandertals (Extended Data Table 4; Supplementary Information 8). In this F(A|B) statistic1,2,20, expected value depends only on the demography of the high-coverage individuals from which heterozygous sites are determined. We performed these analyses on all randomly sampled fragments, and on deaminated fragments only (Extended Data Table 4; Supplementary Information 8).We used the low-coverage nuclear genomes of Les Cottés Z4-1514, Goyet Q56-1, Spy 94a, Mezmaiskaya 2 and Mezmaiskaya 1, the high-coverage genomes of Vindija 33.19, AltaiNeandertal, Denisovan individual and 12 present-day humans from Prüfer et al.2 for constructing a neighbour-joining tree. We counted the total number of transversions between all pairs of individuals and the human-chimpanzee common ancestor2,54. We constructed a neighbour-joining tree68 based on the pairwise number of transversions in windows of 5 Mb across all autosomes between all individuals with 1000 bootstrap replications (Supplementary Information 8). The tree was constructed as implemented in the R-package phangorn69 and visualized with FigTree (version: v1.4.2) (http://tree.bio.ed.ac.uk/software/figtree/).
Inferring the relationships between Neandertals
We used D-statistics1,54,63 to investigate the population relationships among Neandertal individuals. We co-analysed all low-coverage Neandertal genomes with the high-coverage genomes of the Altai and Vindija 33.197 Neandertals. We used the whole genome alignments of the chimpanzee, orangutan and rhesus macaque to the human reference genome70–72 to infer the ancestral states for the analyses of D(Altai, Vindija 33.19; Neandertal, outgroup). Furthermore, we used the genomes of the Dinka and Mbuti individuals from the SGDP21 as outgroups for the statistics of D(Neandertal. The standard errors were computed using a Weighted Block Jackknife63,67 with equally sized blocks of 5 Mb over all autosomes. We further restricted these analyses to bi-allelic sites in the genome covered by at least one low-coverage Neandertal and transversion polymorphisms.
Inferring the relationship to the introgressed Neandertals in present-day and ancient modern humans
We analysed the low-coverage late Neandertal genomes together with the high-coverage genomes of the Altai2 and Vindija 33.19 Neandertals7, the high-coverage genomes of the Denisovan individual20 and 263 present-day humans of the SGDP21 (Supplementary Information 10). We included Ust’-Ishim, a ~45,000-year-old modern human from Siberia10; Loschbour, a ~8,000-year-old hunter-gatherer from Luxembourg26; and LBK, a ~7,000-year-old farmer from Stuttgart26 to study the differences among Neandertals in their proximity to the introgressed Neandertal DNA detected in ancient modern humans. Analyses were restricted to transversion polymorphisms and to bi-allelic sites in the genome covered by at least one low-coverage Neandertal. As above, the D-statistics was used to infer the relationships between individuals1,54,63 and standard errors were computed using a Weighted Block Jackknife63,67 over all autosomes (block size: 5 Mb).
Early modern human gene flow into late Neandertals
We modelled admixture from modern humans into Neandertals with an ascertainment scheme in which both the Denisova and Altai Neanderal were fixed for the ancestral allele and at least half of the alleles in present-day African populations are derived. We applied the method as described previously73 and estimated a date of a recent early modern human (EMH) admixture into Neandertals to be ~10-100 generations ago. At each SNP in the genome, we considered data from all Yoruba individuals from the 1000 Genomes Project74 covered by at least three fragments that passed a pre-defined set of filters. Furthermore, we restricted the analysis to sites in the genome where ≥24 Yoruba individuals as well as the AltaiNeandertal and Denisovan individual had allele calls (Map35_50% filter from Prüfer et al.2). The ancestral states were taken from the inferred ancestor of humans and chimpanzees (Ensembl Compara v64)75,76. We introduced a more complex demographic history that is loosely based on the model described in Gravel et al.77 (details in the Supplementary Information 11).
Data availability
The aligned sequences have been deposited in the European Nucleotide Archive under accession numbers PRJEB21870 (Goyet Q56-1), PRJEB21875 (Les Cottés Z4-1514), PRJEB21881 (Mezmaiskaya 2), PRJEB21882 (Vindija 87) and PRJEB21883 (Spy 94a). The mitochondrial consensus sequences reported in this paper are available in GenBank with the accession codes MG025536-MG025540.
Code availability
All software packages used for analysis are cited and all packages are publicly available. Algorithms and packages are freely available at https://bioinf.eva.mpg.de/.
Frequency of nucleotide substitutions at the beginning and the end of nuclear alignments for the final dataset of Les Cottés Z4-1514, Goyet Q56-1, Mezmaiskaya 2, Vindija 87 and Spy 94a.
Only fragments of at least 35 base pairs (bp) that mapped to the human reference genome with a mapping quality of at least 25 (MQ ≥ 25) were used for this analysis. Solid lines depict all fragments and dashed lines the fragments that have a C-to-T substitution at the opposing end (‘conditional’ C-to-T substitutions). All other types of substitutions are marked in grey.
Fragment size distribution of fragments longer than 35 base pairs (bp) mapped to the human reference genome with the mapping quality of at least 25 (MQ≥25) for each of the five late Neandertals.
All fragments are depicted in solid lines and fragments with C-to-T substitutions to the reference genome (putatively deaminated fragments) are depicted with dashed lines.
Sex determination based on the number of fragments aligning to the X chromosome and the autosomes.
The expected ratios of X to (X + autosomal) fragments for a female and a male individual are depicted as dashed lines. The results were concordant for all fragments (in red) and for deaminated fragments only (in grey).
A Principal Component Analysis (PCA) of the genomes of Vindija 33.19, Altai, Denisova, five late Neandertals and Mezmaiskaya 1.
Genomes of the high coverage archaics were used to estimate the eigenvectors of the genetic variation and low coverage Neandertals were projected onto the plane. Only transversion polymorphisms and bi-allelic sites were considered for the analysis, amounting to a total of 1,010,417 sites as defined by the high coverage genomes.
Amount of data generated for Les Cottés Z4-1514, Goyet Q56-1, Mezmaiskaya 2, Vindija 87 and Spy 94a.
Reported is the number of fragments after merging all of the sequencing libraries together. The obtained coverage of the nuclear genomes is determined by counting the number of bases with the base quality of at least 30 in fragments longer than 35 bp with the mapping quality of at least 25 that overlapped highly mappable regions of autosomes of the human genome and dividing that number by the total length of these regions.
Relationship of the late Neandertals and Mezmaiskaya 1 to the Altai and Vindija 33.19 calculated as D (Altai, Vindija 33.19, Neandertal, outgroup) for all fragments and deaminated fragments, restricted to transversions.
All late Neandertals and Mezmaiskaya 1 are significantly closer to Vindija 33.19 than to the AltaiNeandertal, irrespective of the used outgroup. Blue denotes Z-score << -2. The total number of informative sites that are transversions among Neandertals and where at least one late Neandertal has coverage is 1,567,449.
The fraction of derived alleles among putatively deaminated fragments that each of the low coverage individuals shares with the Altai Neandertal, Vindija 33.19, Denisovan, and a present day human genome.
The state of DNA fragments overlapping the positions at which the high coverage genomes of the AltaiNeandertal, the Vindija 33.19 Neandertal, the Denisovan individual and a present-day African (Mbuti, HGDP00982) differ from those of the great apes were investigated. Fragments longer than 35bp with mapping quality of at least 25 and within the highly mappable regions of the genome that had terminal C-to-T substitutions reported in the Table S3.2 were utilized. 95% binomial confidence intervals are provided in brackets.
Time of separation of late Neandertals and Mezmaiskaya 1 (A) from the ancestor with the high-coverage genomes of Altai and Vindija 33.19 Neandertals, Denisovan individual and a present-day human (B), when measured in terms of time of split from the B individual (split A-B), or time from present (split-time + branch shortening (bs)).
Results reported for all fragments and deaminated fragments only (Tables S3.1 and S3.2). 95% confidence intervals for F(A|B) values are estimated via Weighted Block Jackknife, using blocks of 5 Mb. In order to obtain estimates of split times from present, we correct "split A-B" by adding the age of the high coverage B-individual estimated from branch-shortening (column split time + bs (kya)).
Supplementary Material
Supplementary Information is available in the online version of the paper at www.nature.com/nature.
Authors: Alkes L Price; Nick J Patterson; Robert M Plenge; Michael E Weinblatt; Nancy A Shadick; David Reich Journal: Nat Genet Date: 2006-07-23 Impact factor: 38.330
Authors: Johannes Krause; Adrian W Briggs; Tomislav Maricic; Udo Stenzel; Martin Kircher; Nick Patterson; Richard E Green; Heng Li; Weiwei Zhai; Markus Hsi-Yang Fritz; Nancy F Hansen; Eric Y Durand; Anna-Sapfo Malaspinas; Jeffrey D Jensen; Tomas Marques-Bonet; Can Alkan; Kay Prüfer; Matthias Meyer; Hernán A Burbano; Jeffrey M Good; Rigo Schultz; Ayinuer Aximu-Petri; Anne Butthof; Barbara Höber; Barbara Höffner; Madlen Siegemund; Antje Weihmann; Chad Nusbaum; Eric S Lander; Carsten Russ; Nathaniel Novod; Jason Affourtit; Michael Egholm; Christine Verna; Pavao Rudan; Dejana Brajkovic; Željko Kucan; Ivan Gušic; Vladimir B Doronichev; Liubov V Golovanova; Carles Lalueza-Fox; Marco de la Rasilla; Javier Fortea; Antonio Rosas; Ralf W Schmitz; Philip L F Johnson; Evan E Eichler; Daniel Falush; Ewan Birney; James C Mullikin; Montgomery Slatkin; Rasmus Nielsen; Janet Kelso; Michael Lachmann; David Reich; Svante Pääbo Journal: Science Date: 2010-05-07 Impact factor: 47.728
Authors: Guy Baele; Philippe Lemey; Trevor Bedford; Andrew Rambaut; Marc A Suchard; Alexander V Alekseyenko Journal: Mol Biol Evol Date: 2012-03-07 Impact factor: 16.240
Authors: Kay Prüfer; Fernando Racimo; Nick Patterson; Flora Jay; Sriram Sankararaman; Susanna Sawyer; Anja Heinze; Gabriel Renaud; Peter H Sudmant; Cesare de Filippo; Heng Li; Swapan Mallick; Michael Dannemann; Qiaomei Fu; Martin Kircher; Martin Kuhlwilm; Michael Lachmann; Matthias Meyer; Matthias Ongyerth; Michael Siebauer; Christoph Theunert; Arti Tandon; Priya Moorjani; Joseph Pickrell; James C Mullikin; Samuel H Vohr; Richard E Green; Ines Hellmann; Philip L F Johnson; Hélène Blanche; Howard Cann; Jacob O Kitzman; Jay Shendure; Evan E Eichler; Ed S Lein; Trygve E Bakken; Liubov V Golovanova; Vladimir B Doronichev; Michael V Shunkov; Anatoli P Derevianko; Bence Viola; Montgomery Slatkin; David Reich; Janet Kelso; Svante Pääbo Journal: Nature Date: 2013-12-18 Impact factor: 49.962
Authors: Iosif Lazaridis; Nick Patterson; Alissa Mittnik; Gabriel Renaud; Swapan Mallick; Karola Kirsanow; Peter H Sudmant; Joshua G Schraiber; Sergi Castellano; Mark Lipson; Bonnie Berger; Christos Economou; Ruth Bollongino; Qiaomei Fu; Kirsten I Bos; Susanne Nordenfelt; Heng Li; Cesare de Filippo; Kay Prüfer; Susanna Sawyer; Cosimo Posth; Wolfgang Haak; Fredrik Hallgren; Elin Fornander; Nadin Rohland; Dominique Delsate; Michael Francken; Jean-Michel Guinet; Joachim Wahl; George Ayodo; Hamza A Babiker; Graciela Bailliet; Elena Balanovska; Oleg Balanovsky; Ramiro Barrantes; Gabriel Bedoya; Haim Ben-Ami; Judit Bene; Fouad Berrada; Claudio M Bravi; Francesca Brisighelli; George B J Busby; Francesco Cali; Mikhail Churnosov; David E C Cole; Daniel Corach; Larissa Damba; George van Driem; Stanislav Dryomov; Jean-Michel Dugoujon; Sardana A Fedorova; Irene Gallego Romero; Marina Gubina; Michael Hammer; Brenna M Henn; Tor Hervig; Ugur Hodoglugil; Aashish R Jha; Sena Karachanak-Yankova; Rita Khusainova; Elza Khusnutdinova; Rick Kittles; Toomas Kivisild; William Klitz; Vaidutis Kučinskas; Alena Kushniarevich; Leila Laredj; Sergey Litvinov; Theologos Loukidis; Robert W Mahley; Béla Melegh; Ene Metspalu; Julio Molina; Joanna Mountain; Klemetti Näkkäläjärvi; Desislava Nesheva; Thomas Nyambo; Ludmila Osipova; Jüri Parik; Fedor Platonov; Olga Posukh; Valentino Romano; Francisco Rothhammer; Igor Rudan; Ruslan Ruizbakiev; Hovhannes Sahakyan; Antti Sajantila; Antonio Salas; Elena B Starikovskaya; Ayele Tarekegn; Draga Toncheva; Shahlo Turdikulova; Ingrida Uktveryte; Olga Utevska; René Vasquez; Mercedes Villena; Mikhail Voevoda; Cheryl A Winkler; Levon Yepiskoposyan; Pierre Zalloua; Tatijana Zemunik; Alan Cooper; Cristian Capelli; Mark G Thomas; Andres Ruiz-Linares; Sarah A Tishkoff; Lalji Singh; Kumarasamy Thangaraj; Richard Villems; David Comas; Rem Sukernik; Mait Metspalu; Matthias Meyer; Evan E Eichler; Joachim Burger; Montgomery Slatkin; Svante Pääbo; Janet Kelso; David Reich; Johannes Krause Journal: Nature Date: 2014-09-18 Impact factor: 49.962
Authors: Goncalo R Abecasis; Adam Auton; Lisa D Brooks; Mark A DePristo; Richard M Durbin; Robert E Handsaker; Hyun Min Kang; Gabor T Marth; Gil A McVean Journal: Nature Date: 2012-11-01 Impact factor: 49.962
Authors: Anders Bergström; Chris Stringer; Mateja Hajdinjak; Eleanor M L Scerri; Pontus Skoglund Journal: Nature Date: 2021-02-10 Impact factor: 49.962
Authors: Jennifer K Wagner; Chip Colwell; Katrina G Claw; Anne C Stone; Deborah A Bolnick; John Hawks; Kyle B Brothers; Nanibaa' A Garrison Journal: Am J Hum Genet Date: 2020-08-06 Impact factor: 11.025
Authors: Laura L Colbran; Eric R Gamazon; Dan Zhou; Patrick Evans; Nancy J Cox; John A Capra Journal: Nat Ecol Evol Date: 2019-10-07 Impact factor: 15.460
Authors: Lukas Bokelmann; Mateja Hajdinjak; Stéphane Peyrégne; Selina Brace; Elena Essel; Cesare de Filippo; Isabelle Glocke; Steffi Grote; Fabrizio Mafessoni; Sarah Nagel; Janet Kelso; Kay Prüfer; Benjamin Vernot; Ian Barnes; Svante Pääbo; Matthias Meyer; Chris Stringer Journal: Proc Natl Acad Sci U S A Date: 2019-07-15 Impact factor: 11.205
Authors: Thibaut Devièse; Grégory Abrams; Mateja Hajdinjak; Stéphane Pirson; Isabelle De Groote; Kévin Di Modica; Michel Toussaint; Valentin Fischer; Dan Comeskey; Luke Spindler; Matthias Meyer; Patrick Semal; Tom Higham Journal: Proc Natl Acad Sci U S A Date: 2021-03-08 Impact factor: 11.205
Authors: Rodrigo S Lacruz; Chris B Stringer; William H Kimbel; Bernard Wood; Katerina Harvati; Paul O'Higgins; Timothy G Bromage; Juan-Luis Arsuaga Journal: Nat Ecol Evol Date: 2019-04-15 Impact factor: 15.460
Authors: Kay Prüfer; Cosimo Posth; He Yu; Alexander Stoessel; Maria A Spyrou; Thibaut Deviese; Marco Mattonai; Erika Ribechini; Thomas Higham; Petr Velemínský; Jaroslav Brůžek; Johannes Krause Journal: Nat Ecol Evol Date: 2021-04-07 Impact factor: 15.460