Literature DB >> 32094538

Emergence of human-adapted Salmonella enterica is linked to the Neolithization process.

Felix M Key^1,2,3, Cosimo Posth⁴, Luis R Esquivel-Gomez⁵, Ron Hübler⁴, Maria A Spyrou⁴, Gunnar U Neumann⁴, Anja Furtwängler⁶, Susanna Sabin⁴, Marta Burri⁴, Antje Wissgott⁴, Aditya Kumar Lankapalli⁴, Åshild J Vågene⁴, Matthias Meyer⁷, Sarah Nagel⁷, Rezeda Tukhbatova^4,8, Aleksandr Khokhlov⁹, Andrey Chizhevsky¹⁰, Svend Hansen¹¹, Andrey B Belinsky¹², Alexey Kalmykov¹², Anatoly R Kantorovich¹³, Vladimir E Maslov¹⁴, Philipp W Stockhammer^4,15, Stefania Vai¹⁶, Monica Zavattaro¹⁷, Alessandro Riga¹⁶, David Caramelli¹⁶, Robin Skeates¹⁸, Jessica Beckett¹⁸, Maria Giuseppina Gradoli¹⁹, Noah Steuri²⁰, Albert Hafner²⁰, Marianne Ramstein²¹, Inga Siebke²², Sandra Lösch²², Yilmaz Selim Erdal²³, Nabil-Fareed Alikhan²⁴, Zhemin Zhou²⁴, Mark Achtman²⁴, Kirsten Bos⁴, Sabine Reinhold¹¹, Wolfgang Haak⁴, Denise Kühnert⁵, Alexander Herbig²⁵, Johannes Krause²⁶.

Abstract

It has been hypothesized that the Neolithic transition towards an agricultural and pastoralist economy facilitated the emergence of human-adapted pathogens. Here, we recovered eight Salmonella enterica subsp. enterica genomes from human skeletons of transitional foragers, pastoralists and agropastoralists in western Eurasia that were up to 6,500 yr old. Despite the high genetic diversity of S. enterica, all ancient bacterial genomes clustered in a single previously uncharacterized branch that contains S. enterica adapted to multiple mammalian species. All ancient bacterial genomes from prehistoric (agro-)pastoralists fall within a part of this branch that also includes the human-specific S. enterica Paratyphi C, illustrating the evolution of a human pathogen over a period of 5,000 yr. Bacterial genomic comparisons suggest that the earlier ancient strains were not host specific, differed in pathogenic potential and experienced convergent pseudogenization that accompanied their downstream host adaptation. These observations support the concept that the emergence of human-adapted S. enterica is linked to human cultural transformations.

Entities: Chemical

Mesh：

Year: 2020 PMID： 32094538 PMCID： PMC7186082 DOI： 10.1038/s41559-020-1106-9

Source DB: PubMed Journal: Nat Ecol Evol ISSN： 2397-334X Impact factor: 15.460

The transition in human lifestyle from foraging to agriculture and pastoralism during the Neolithic revolution represents possibly the biggest cultural change in human history. The transition began in the Near Eastern fertile crescent around 10,000 years ago and subsequently Neolithic economies spread across western Eurasia[1]. This Neolithization process was accompanied by a change in mobility, introduction of domesticated animals, and increased contact to animal and human excrements, which facilitated a more constant and recurrent exposure to pathogens and possibly the emergence of zoonotic disease[2-5]. However, direct molecular evidence, like ancient DNA, in support of this hypothesis is currently missing. Microbial paleogenomics provides a unique window into the past human infectious disease burden, and promises to elucidate the deep evolutionary history of clinically relevant pathogens[6,7]. Recently, ancient DNA analysis identified the human-specific bacterial pathogen Salmonella enterica Paratyphi C in 450 year old skeletons from Mexico and 800 year old skeletons from Norway[8,9]. However, Salmonella enterica is largely absent from the historical record because it has nonspecific clinical symptoms and does not cause distinctive skeletal lesions. The bacterial pathogen Salmonella enterica encompasses over 2,500 serovars. Today 99% of the cases of salmonellosis in mammals, including up to 200,000 annual fatal human infections are associated with S. enterica subsp. enterica (S. enterica) comprising 60% of all serovars[10,11]. S. enterica that cause systemic infection are often restricted to single host species, or have a narrow host range, while S. enterica causing gastroenteritis are host generalists, and only rarely cause systemic infections[12,13]. S. enterica Paratyphi C causes systemic disease specifically in humans but is a member of a phylogenetic lineage (“Para C Lineage”) that also includes serovars invasive for pigs (Typhisuis, Choleraesuis). Choleraesuis also rarely infects humans, and previous analyses posit that Paratyphi C arose as a zoonosis in Europe within the last 4,000 years, possibly by a host jump of Choleraesuis from domesticated pigs to humans[9,14]. The evolution of ancient Salmonella genomes might thus have been influenced by shifts in human cultural practices during the Neolithization process.

Transect of eight ancient S. enterica across western Eurasia

In order to investigate these changes, we screened for S. enterica DNA in 2,739 ancient metagenomes extracted from human skeletons spanning primarily the Eurasian Neolithization process until the Middle Ages. Twenty-four metagenomes yielded multiple reads mapping to different S. enterica serovars. Each sample was enriched for DNA using a targeted in-solution S. enterica-specific hybridization capture assay, which allowed the reconstruction of eight genomes with a coverage of 0.7- to 24-fold (Table 1). The human teeth from which these genomes were assembled were up to 6,500 years old, and their locations ranged from Russia to Turkey to Switzerland (Figure 1). Teeth are vascularized during life, and the successful retrieval of S. enterica DNA suggests bacteria were present in high levels in the blood at the time of death[15]. Thus, our results suggest systemic S. enterica infections were geographically widespread across prehistoric western Eurasia.

Table 1

Overview of S. enterica positive samples.

Sample	Site	Country	Date (cal BP)	Mapped Reads	% endog. DNA	Coverage	% Reference covered
MUR009	Murzihinskiy	Russia	6500 - 6350	783,054	2.8	8.8	88.9
MUR019	Murzihinskiy	Russia	6490 - 6320	1,405,983	7.8	16.5	90.1
IV3002	Ipatovo	Russia	5580 - 5080	761,643	8.9	7.0	88.9
OBP001	Oberbipp	Switzerland	5320 - 5070	138,083	0.5	1.2	59.3
IKI003	Ikiztepe	Turkey	5290 - 5050	840,560	2.7	8.3	90.0
SUA004	Seulo	Italy	4300 - 4010	2,517,870	11.5	24.0	93.2
MK3001	Marinskaja	Russia	2990 - 2870	79,534	0.8	0.7	37.8
ETR001	Chiusi	Italy	1810 – 1620	1,977,148	24.5	17.8	93.7

Date is in calibrated years before present (95% confidence interval), based on direct AMS dating. Number of mapped reads, percent endogenous DNA (% endog. DNA), mean coverage, and percent bases covered at least once (% Reference covered) are based on the alignment to Paratyphi C RKS4594.

Figure 1

Geographic location and radiocarbon age of ancient human individuals infected with S. enterica.

Previously published ancient genomes from 13th century Norway (Ragna) and 16th century Mexico (Tepos) are also shown.

During the last 8,000 years the Neolithic lifestyle spread across western Eurasia and adapted to regionally different conditions[16]. We investigated differences in human subsistence practices and genomic ancestry of the ancient S. enterica positive individuals using the archaeological information as well as human ancient DNA (see also Supplementary Note 1). The oldest burial site in Milicejskiy (western Russia), with two 6,500 year old individuals (MUR009 and MUR019) is archeologically assigned to the local Eneolithic, a transitional period dominated by a foraging economy that relied on hunting, with first evidence for the adoption of pastoralism[17]. In addition, genomic ancestry components of MUR009 and the mitochondrial DNA from MUR019 resemble those of previously sequenced eastern European foragers (Supplementary Table 1, Extended Data Fig. 1 and 2). In contrast, the 5,500 year old individuals IV3002 (southern Russia) and IKI003 (Turkey) were from a regional Bronze Age culture that primarily practiced a pastoralist economy linked to the use of domesticated sheep, goat and cattle. The other individuals from the Neolithic (OBP001) in Switzerland, the Bronze Age (SUA004) in Sardinia, the Iron Age (MK3001) in Russia, and the late Roman Empire (ETR001) from Italy were associated with archaeological evidence for an agro-pastoralist economy (Supplementary Note 1). Ancient human DNA analysis for these individuals (except OBP001 with insufficient recoverable DNA) infers genomic compositions consistent with (agro-)pastoralist economies based on previously published populations with similar ancestries (Supplementary Note 1, Extended Data Fig. 1 and 2).

Extended Data Fig. 1

Ancient human population genetic analysis.

(a) PCA of newly reported ancient individuals with sufficient data (in red) and selected published ancient and modern individuals are projected onto principal components built with present-day West Eurasian populations (grey dots). (b) ADMIXTURE analysis (K=10) of newly reported ancient individuals and relevant published ancient and modern individuals sorted by genetic clusters. Overview ancient human genetic data Supplementary Table 1 and further analysis Extended Data Fig. 4. EHG, Eastern hunter gatherer; E, Early; M, Middle; HG, hunter–gatherer; N, Neolithic; C, Caucasus; S, Scandinavian; W, Western; BA, Bronze Age.

Extended Data Fig. 2

Summary human genetic analysis.

(a) ADMIXTURE analysis (K = 3 - 16) of newly reported ancient individuals (bold horizontal text) and published ancient and modern individuals sorted by genetic clusters and geographic origin (Europe, Near East and Caucasus, Asia, America, Africa). Each K was run five times and the replicate with the highest likelihood is reported. Ancient MK3001 shows Asian genetic ancestry components represented by Nganasan, Kankanaey, Atayal, and Ami. (b) Box plot of five cross-validations (CV) values for every K calculated in ADMIXTURE. EHG, Eastern hunter gatherer; E, Early; M, Middle; HG, hunter–gatherer; N, Neolithic; C, Caucasus; S, Scandinavian; W, Western; BA, Bronze Age.

In conclusion, the eight bacterial genomes transect 4,700 years of S. enterica infections during the spread of an (agro-)pastoralist subsistence, and can provide an unprecedented view into the diversity and evolution of S. enterica during this transformative time.

Ancient S. enterica genomes form a previously uncharacterized super branch

We assessed the relationships of the ancient S. enterica genomes to the modern diversity by leveraging a representative set of 2,961 genomes from EnteroBase[9,10]. For phylogenetic reconstruction, we built a multi-sequence alignment against Paratyphi C, a close relative to several ancient genomes in our screen. To minimize any reference bias we retained only positions covered across all genomes and excluded ancient genomes with less than 5-fold coverage to ensure reliable SNP calling. Surprisingly, all the ancient genomes clustered in a single phylogenetic branch containing a limited number of serovars (Figure 2A), which we designate the Ancient Eurasian Super Branch (AESB). Only 60 out of the 2,961 representative, modern S. enterica genomes are part of the AESB (Figure 2A) and the majority of the representative genomes belong to multiple other branches[9,10] that were not found among the ancient genomes. Those results suggest that all detected prehistoric S. enterica infections across western Eurasia were caused by a sub-group of serovars within the much larger S. enterica diversity known today.

Figure 2

Phylogenetic relationships of reconstructed ancient and modern S. enterica core genomes.

(a) Maximum likelihood tree of the ancient genomes (>5X coverage) and 2,961 modern S. enterica genomes, including 182,645 SNP positions in the core genome. Selected branches are identified based on predicted serotype provided by EnteroBase and coloured according to host specificity (blue/orange), if they include ancient genomes (red), or not specified (black). (b) Maximum likelihood tree of the AESB including the ancient genomes (>5X coverage) and 463 S. enterica genomes, considering 37,040 SNP positions in the core genome. New ancient genomes are shown in red, and previously reported ancient genomes (Ragna, Tepos) in pink. Ancient human economy is indicated for all newly presented genomes based on archaeological and ancient human genetic information. Low coverage ancient genomes (coverage <5X) are phylogenetically placed (red dashed line) based on all SNP positions covered once: MK3001: 17,324; OBP001: 26,657; and Ragna: 35,465. Modern genomes are collapsed based on their predicted serovar, eBurst group (closely related sequence types), or available metadata in EnteroBase. Host adapted serovars are coloured orange (incl. a pictogram of the host species). Bootstrap values are shown in black at each node (1,000 bootstraps). Black dashed rectangles show extends of Para C Lineage and hierarchical cluster HC2600_1272. Enteritidis P125109 is used as the outgroup.

Emergence of S. enterica cluster HC2600_1272 during the Neolithization process

In order to provide greater statistical power for detailed analyses of the AESB, we identified 403 additional modern genomes within EnteroBase that belong to the AESB, and included them in subsequent analyses. A stringent phylogenetic analysis of the AESB was performed with 37,040 SNPs, which could be called in all 460 modern and six ancient genomes. The results demonstrate that the ancient genomes fall on multiple distinct lineages within the AESB (Figure 2B). The same topology for the ancient genomes was found with a more relaxed phylogenetic analysis based on 130,036 SNPs that could be called from at least 95% of the genomes, except for MUR019 and a few modern clades whose topological associations varied (Extended Data Fig. 3). Such variable topological assignments might result from homologous recombination, which is common in S. enterica[18]. Indeed, we detect high levels of recombination for multiple deep lineages within the AESB, including the sub-branch on which MUR019 is located (Extended Data Fig. 4), suggesting that recombination might cause the topological differences between the stringent and relaxed phylogenies. The topology of the other ancient genomes did not differ between the two analyses, and only these were used for further phylogenetic analyses. However, excluding MUR019 reduces our observation of ancient S. enterica genomes from a transitional forager economy to a single example, MUR009.

Extended Data Fig. 3

Maximum likelihood phylogeny of the AESB based on SNPs in positions present in 95% of strains.

Maximum likelihood tree of the AESB including the high coverage ancient genomes and 463 S. enterica genomes, considering all SNPs covered in at least 95% of strains (130,036 SNPs). New ancient genomes are shown in red, and previously reported ancient genomes (Tepos) in pink.

Extended Data Fig. 4

Recombination rate estimates for the AESB.

Estimated recombination rate is shown as recombination event per mutation event (r/m) and indicated on top of branch and by branch color. Recombination events have been inferred using all positions shared by 95% of strains from the AESB and are here reported for the SNPs shared by all strains on the AESB (correspond to maximum likelihood phylogeny shown in Figure 2B). Maximum likelihood tree including all SNPs shared by at least 95% of strains from the AESB is shown in Extended Data Fig. 3.

The MUR009 S. enterica genome from an individual associated to a transitional forager economy is most closely related to a rare lineage containing diverse isolates (Figure 2B). That lineage includes serovar Abortusequi, which can cause miscarriage in horses[14], as well as a group of ST416/417 strains which are possibly adapted to marine mammals as they were exclusively recovered from the lungs of stranded harbour porpoises[19]. The branch also includes serovar Bispebjerg, which has been isolated from turtles and humans, and serovar Abortusovis, which can cause miscarriage in sheep[14]. All S. enterica genomes from ancient (agro-)pastoralists are phylogenetically related to the previously designated “Para C Lineage”[9]. This lineage includes the invasive serovars Paratyphi C, Typhisuis, Choleraesuis, as well as the rare serovar Lomita (Figure 2B)[9]. Of these serovars, Choleraesuis and Typhisuis are adapted to pigs. However, Paratyphi C is a major cause of enteric fever in humans, and Choleraesuis and Lomita can also cause systemic diseases in humans[20]. The previously described “Tepos”[8] (Mexico, 500 BP) and Ragna[9] (Norway, 800 BP) genomes are at the base of modern Paratyphi C. Of the newly reported ancient genomes, the 1,700 BP genome from Italy (ETR001) defines a novel fourth sub-branch of Choleraesuis. The early Bronze Age genomes SUA004 (Italy; 4,200 BP) and Neolithic OBP001 (Switzerland; 5,200 BP) define a new branch that is basal to the entire Para C Lineage. The other ancient genomes from (agro)-pastoralists (IV3002, IKI003 and MK3001) fall instead outside the Para C Lineage. The original description[9] of the Para C Lineage noted that genomes most closest related to that lineage were in serovar Birkenhead, a rare pathogen of humans[21]. Hierarchical clustering of core-genome MLST sequence types shows that the Para C Lineage and Birkenhead are both in cluster HC2600_1272, which also includes all ancient genomes from (agro-)pastoralists. Among those the oldest genomes IV3002 (Caucasus; 5,300 BP; early Bronze Age) and IKI003 (Turkey; 5,200 BP; late Chalcolithic) define a novel branch closely related to Birkenhead (Fig. 2B). Notably, despite the large geographic distance between both archaeological sites, the genomes differ by only 170 SNPs (out of 2.4M aligned positions), suggesting a relatively fast dissemination during the Bronze Age, which is also confirmed by other aspects of the archaeological record (Supplementary Note 1). Finally, a low coverage ancient S. enterica genome from southern Russia, MK3001 (2,900 BP), falls basal to the branch subtending the HC2600_1272 cluster (Figure 2B). These data show that S. enterica strains that are part of the HC2600_1272 cluster have been infecting human pastoralists and agro-pastoralists for more than 5,000 years across western Eurasia.

Molecular dating of the ancient S. enterica HC2600_1272 cluster

The broad temporal transect of ancient bacterial genomes presented here has the potential to yield a better understanding of the evolutionary timing of S. enterica diversification. Overall, the AESB represents a very deep split within the S. enterica diversity (Figure 2A), possibly even corresponding to a time to the most recent common ancestor (tMRCA) pre-dating the arrival of humans in Eurasia. We did not attempt to date the entire AESB due to a weak temporal signal[22] (correlation coefficient R2 between phylogenetic distance and sampling times = 0.04). This may be due to problems in distinguishing single step mutations from homoplasies across the long timescale of the AESB. In addition, our analyses revealed high levels of recombination on several internal branches of the AESB, which may affect branch lengths and can interfere with molecular dating[23] (Extended Data Fig. 4). Hence, we restricted our dating analysis to the HC2600_1272 cluster, which yielded a higher correlation, a significant degree of temporal signal (R2 = 0.16), and successfully passed a date randomization test[24] (Extended Data Fig. 5). Bayesian phylogenetic molecular dating with BEAST2[25] confirmed the close relationship of IV3002 and IKI003 with a tMRCA of 5,680 (5,990 – 5,450) years ago (Supplementary Table 2, Figure 3A). The split between SUA004 and the Para C Lineage was estimated around 11,000 (16,600 – 6,470). We also confirm the prior estimates[9] of around 4,000 years ago for the emergence of the direct progenitors of Paratyphi C. We note, however, that these estimates have large confidence intervals and that we did not date the entire HC2600_1272 as estimates are potentially affected by non-identified recombination events.

Extended Data Fig. 5

Temporal signal analysis.

Results of the date randomization test for two subsets of the HC2600_1272 cluster (agro-pastoralist branch). Circles represent mean substitution rate estimations with error bars representing 95% highest posterior density (HPD) intervals. For each subset 10 date randomizations were done. Significant temporal signal is indicated by non-overlapping HPD intervals between real data (red) and the randomizations (black), which is the case for both subsets.

Figure 3

AESB topology, divergence times and gain-loss events.

(a) Topology of the AESB highlighting the hierarchical cluster HC2600_1272 (yellow), with symbols indicating ancient human economy of S. enterica positive samples. Selected divergence time estimates for the HC2600_1272 cluster are shown in years BP (95% highest posterior density intervals, see also Supplementary Table 2). (b) Gain-loss results for all SPI’s and selected genes. For SPI’s, colour gradient relates to the average of %-genes covered over 95% across all strains per branch. For genes, colour gradient according to mean percentage covered across all strains per branch. Tepos: 16th century Mexican.

Genomic architecture differentiates transitional forager and (agro-)pastoralist strains

Differences in S. enterica disease manifestation and virulence are not fully understood and have been linked to variation in genomic content, such as Salmonella pathogenicity islands (SPI) and the virulence plasmid (virP)[14]. The ancient and modern strains forming the AESB largely show genomic stability, with consistency in SPI presence or absence (Figure 3B, Supplementary Note 2). One exception is SPI-6, which encodes several putative virulence factors like fimbriae gene clusters saf and tcf, or the Type 6 Secretion System (T6SS)[26]. Those show variable gene content across different modern and ancient serovars (Figure 3B, Supplementary Note 2). Further, the virulent phenotype of virP is conferred by the spv locus, which is involved in bacteremia[27,28] and is present primarily in the HC2600_1272 cluster (Figure 3B, Supplementary Note 2). IV3002 and IKI003 carry in addition the conjugal transfer operon tra, which likely facilitated the original horizontal transfer of virP into the HC2600_1272 cluster[29]. Lastly, the virulence gene rck, which confers complement resistance[30], is present in the ancient S. enterica genomes IV3002, IKI003, and SUA004 but is absent in most modern genomes possibly due to natural selection. Overall, the differential mobile gene content likely contributed to variation in pathogenicity between the HC2600_1272 cluster and other S. enterica within the AESB, as well as between ancient and modern S. enterica within the HC2600_1272 cluster.

Host range of ancient S. enterica strains and evolution of host specificity

Understanding the host range of the ancient S. enterica is informative about different evolutionary trajectories towards human and mammalian host adaptation across the AESB. Among all ancient S. enterica, only ETR001 is phylogenetically confined within a group of host adapted modern S. enterica (pig/human adapted Choleraesuis), whereas all other ancient genomes are basal to S. enterica adapted to different hosts or basal to non-adapted lineages, suggesting that also those early ancient strains were non-host adapted (host generalists). Accumulation of pseudogenes has been linked to host adaptation in various S. enterica serovars, including Paratyphi C and Choleraesuis[31-34]. Thus, the pseudogene frequency observed in the ancient S. enterica can provide additional evidence about their host range even though we cannot predict definitely host specificity from genomes alone. We used an unbiased set of over 8,000 S. enterica pan genes to infer pseudogenes and observed, as expected, an increase in pseudogenization from host generalists to host adapted serovars on the AESB (Figure 4A; Supplemental Data 1). Strikingly, all newly reported ancient genomes are in the range of host generalists and the oldest genomes, MUR009 (14), IV3002 (15), and MUR019 (18) have the lowest pseudogene frequency within the entire AESB (Figure 4A). Among the ancient genomes, ETR001 has in accordance with its archaeological age and phylogenetic placement the highest pseudogene frequency, which is close to the frequency of Paratyphi C strains and indicates it might have been host adapted. Interestingly, the accumulation of pseudogenes correlates with archaeological age (R2 = 0.48, Extended Data Fig. 6). Using linear regression we calculate the rate of pseudogenization across the ancient samples to be approximately one gene per 75 years, which extrapolates towards frequencies observed today in human adapted Paratyphi C. Taken together, the phylogenetic placement and low frequency of pseudogenes in early ancient genomes suggests that prehistoric systemic S. enterica infections in humans were caused by host generalists.

Figure 4

Pseudogenes and evolution of host adaptation across the AESB.

(a) Relative frequency of pseudogenes for each strain of the AESB. Ancient genomes are identified and host generalists are shown in blue and host adapted in orange. (b) Illustration of model used to infer evolution towards host adaptation. (c) Simulated expectation of candidate genes (phoN in green, ydcK in beige) to harbour randomly distributed pseudogenization events using 10,000 simulations. (d) Host adapted S. enterica serovars that harbour a phoN or ydcK pseudogene. BA: Bronze Age, Tepos: 16th century Mexican.

Extended Data Fig. 6

Correlation between pseudogene frequency and time for all ancient genomes with mean genome-wide coverage above 5X.

The AESB harbours six different host adapted S. enterica serovars, several serovars that are host unrestricted, and the predicted host unrestricted ancient strains older than 3,000 years. This provides an opportunity to study the evolution of bacterial host adaptation within the last six millennia. Notably, all modern host adapted serovars carry largely distinct sets of pseudogenes irrespective of relatedness (Extended Data Fig. 7), which is in line with previous results in S. enterica and suggests a primarily independent evolution of pseudogenes[31,32]. However, shared pseudogenes that are formed independently likely represent key changes necessary for host adaptation. Here we harness the vast serovar diversity present on the AESB to test for pseudogenization events that may have facilitated the adaptation towards different mammalian hosts (Figure 4B). We identified 29 candidate genes that harbour two or more independent pseudogenization events, which have different functional roles including metabolism, transcription, and biofilm formation (Supplementary Table 3). Out of those, two genes harbour an excess of pseudogenization events based on random simulations (Bonferroni corrected p-value of 0.006 (phoN) and 0.02 (ydcK); Figure 4C; Supplementary Table 4) and are lost in four host adapted serovars (Figure 4D). Notably, the top candidate phoN is regulated by PhoPQ and can induce a strong antibody response during systemic Salmonella infections in humans and mice, even though its exact role is not yet understood[35]. Altogether, the results indicate that convergent evolution in terms of pseudogene formation contributed to S. enterica host adaptation.

Extended Data Fig. 7

Proportion of shared pseudogenes between strains across the AESB.

Proportion of pseudo Temporal signal analysis.gene-sharing (0-100%) between strains on the AESB is shown in tones of red. Strains are ordered by phylogenetic branch and coloured accordingly.

Discussion

Here we elucidate the evolutionary history of S. enterica interwoven with the dramatic changes in human lifestyle during the spread of Neolithic economies[1] (graphical abstract in Extended Data Fig. 8). Therefore we harness ancient bacterial and human DNA obtained from fossils together with their archaeological record. We present eight ancient Salmonella genomes from as early as 6,500 years ago isolated from ancient pastoralists, agro-pastoralists, and transitional foragers. All are phylogenetically confined within the AESB, suggesting it represented a common yet diverse group of S. enterica that infected humans in prehistory. Apart from the ancient genomes, the AESB contains the human adapted serovar Paratyphi C and the majority of known animal adapted S. enterica serovars.

Extended Data Fig. 8

Graphical abstract.

All ancient genomes were recovered from the pulp chamber of human teeth. The pulp chamber is supplied by blood during life and after death well insulated from environmental conditions, which allows to conclude that human pathogenic bacteria found in the pulp chamber likely originate from systemic disease present at time of death[15]. Thus, our findings highlight systemic S. enterica infection from strains part of the AESB as a human health concern during the last six millennia of human prehistory. It is notable that none of the current predominant causes of invasive salmonellosis in humans were found, such as serovars Typhi and Paratyphi A. Possibly these were less common causes of systemic human disease across western Eurasia in the past. The newly reported ancient Salmonella strains most likely caused systemic salmonellosis despite their lack of SPI-7, a pathogenicity island contributing[36] to paratyphoid fever by Paratyphi C and typhoid fever by Typhi. Moreover, the early S. enterica older than 3,000 years were possibly host generalists and were not specifically adapted to humans. The suggestion that they were generalists is supported by the ability of the modern descendants of some of their phylogenetic neighbours to cause disease in a variety of mammals, and by the relatively low frequency of pseudogenes, which are thought to increase in number upon host adaptation[31-34]. The Neolithization process has caused the first epidemiological transition, where the introduction of domesticated animals, intensification of human-animal co-residence, and changes in mobility are thought to have led to the emergence of new zoonotic diseases in humans through increased exposure to pathogens[2,3,5,37,38]. We reconstructed two S. enterica genomes from ancient transitional foragers excavated in a single site in Russia but are only able to confidently place one of them in the phylogeny, which precludes general conclusions about the diversity of S. enterica infections in transitional forager populations and its geographic spread. However, we reconstructed six ancient S. enterica genomes from (agro-)pastoralists spanning over five millennia of human cultural evolution across western Eurasia and all are within the HC2600_1272 cluster that includes human adapted Paratyphi C. This suggests that progenitors to Paratyphi C evolved within pastoral and agro-pastoral societies during the Neolithization process and provides evidence that the first epidemiological transition facilitated the emergence of human-specific Paratyphi C. However, we note that our ancient metagenomic data screened for S. enterica is biased towards human samples from the Neolithization process and that the HC2600_1272 cluster emerged possibly before the Neolithic transition in a currently unknown context. In order to better understand the origin and range of hosts infected with early strains from the HC2600_1272 cluster, more ancient metagenomic data from early human groups as well as faunal remains are necessary. The origin of human-specific Paratyphi C[9] was previously thought to be a result of a spillover event from pigs to humans around 4,000 years ago due to the close relationship between human- and pig-adapted lineages. Our identification of putative host generalist strains in humans as early as 5,500 years ago (IV3002, IKI003, SUA004), that are phylogenetically basal to human- and pig-adapted serovars, renders the pig-origin hypothesis unlikely. Our results rather support two different hypotheses: (i) that pig-adapted serovars resulted from an anthroponosis, i.e. a spillover from humans to pigs[39], or (ii) that adaptations to humans and pigs occurred via independent processes during the Neolithization within a permissive environment that led to continuous S. enterica exposure and subsequent infection[40]. An independent adaptive history is further supported by the observed small overlap of pseudogenes between host-adapted serovars. Ultimately, additional sampling, including ancient animal remains, will be required to lend credence to either hypothesis and to further elucidate the evolutionary history of S. enterica and other pathogens.

Materials and Methods

Sequencing library preparation

Sample processing took place in dedicated ancient DNA facilities in the Max Planck Institute for the Science of Human History in Jena and the Palaeogenetics laboratory at the University of Tübingen. Teeth were cut at the enamel-dentin junction and sampled in the pulp chamber with a dentist drill. Tooth powder was extracted[41] and the resulting DNA extract was transformed in either double or single stranded (only IKI003) genetic libraries with the use of full, partial or no UDG (uracil DNA-glycosylase) treatment in order to fully, partially or not remove the characteristic C to T substitutions towards both ends of ancient DNA molecules, respectively[42-45]. Genetic libraries were double indexed and amplified before shotgun sequencing. Those libraries identified as positive for S. enterica after screening were further captured using an in-solution target enrichment of Salmonella enterica subsp. enterica DNA[8]. The same libraries were used for a human genome-wide capture that targeted ~390,000 SNPs for ETR001 and ~1.2 million SNPs for all the other samples[46]. In addition, negative controls were taken along initial library preparation and S. enterica capture. All sequencing runs were performed on Illumina platforms in either single or paired-end mode.

Detection, authentication and genome reconstruction

For screening we used the software package HOPS[47], which utilizes MALT[8], a software that aligns sequencing data to a custom-made reference database of all 6,247 complete bacterial genomes from NCBI (obtained December 2016) using spaced seeds. Reads were mapped with at least 85% nucleotide identity, and for taxonomic placement via the LCA (lowest common ancestor) algorithm we retained only alignments within the top 1% of all alignments per read (max. 100). For candidate detection, we interrogated the Salmonella enterica subsp. enterica taxonomic node, and required a minimum of 10 assigned sequencing reads, which show no or only few mismatches (a declining edit distance distribution). This step is crucial to avoid false positives due to incomplete genomic representation of unknown environmental bacteria in our database[7]. Further, we required presence of typical ancient DNA damage[48], i.e. C to T or G to A mismatches at the end of sequencing reads. We screened 2,739 ancient metagenomic datasets obtained from human remains excavated in Eurasia and South America that were preliminary archeologically classified into the following coarse groups: Eneolithic/Chalcolithic (95), Iron Age (61), Late Neolithic Bronze Age (1,027), Medieval (382), Mesolithic (135), Neolithic (548), Paleolithic (183), Post-Columbian (46), Pre-Columbian (127) and other (135). From those, 24 candidates had positive hits on the Salmonella enterica subsp. enterica taxonomic node. We enriched all candidate libraries for S. enterica DNA using a previously presented in-solution capture reagent based on 67 S. enterica reference genomes[8]. Each library was sequenced for about 5M reads and mapped to the Paratyphi C RKS4594 genome (Acc#: NC_012125) using BWA[49] with non-stringent criteria (-n 0.01) within the software package EAGER[50]. We use the Paratyphi C RKS4594 reference, because it regularly occurred among the top hits in the screening analysis, and the overall little divergence (max. 1.3% genome-wide) among S. enterica reference core genomes suggests no major effects by reference bias. The resulting data was evaluated for a minimum genome-wide coverage of 0.3X with 15% of the reference covered. For samples with higher genome-wide coverage the %-reference covered scaled up linearly. In addition, DNA damage (>10% C>T mismatches at 3’ end) had to be observable, which led to eight samples positive for S. enterica (Extended Data Fig. 9). All eight archaeological specimen are shown in Extended Data Figure 10. The remaining 16 libraries had either too little endogenous S. enterica DNA or were false positives due to the presence of closely related species harbouring genomic elements aligned to the Salmonella enterica subsp. enterica node by MALT. All negative controls were negative. For all positive samples (except IKI003), we generated two UDG full treated libraries (removing damage) from the initial ancient DNA extracts, which provides high quality data for further analysis.

Extended Data Fig. 9

Mismatch distribution along positions at the 5’- and 3’- end of mapped sequencing reads.

C to T changes indicated in red and G to A changes in blue, all other substitutions in grey. IV3002 and MK3001 are UDG-half treated, which leads to observable damage only in the terminal positions. Plots generated with mapDamage2 (Jónsson H. et al, Bioinformatics 2013).

Extended Data Fig. 10

Photographs of archaeological specimens that harboured ancient S. enterica DNA.

(a) MUR009; (b) OBP001; (c) MUR019; (d) IKI003; (e) IV3002; (f) ETR001; (g) SUA004; (h) MK3001.

Library preparation of IKI003 was performed in the clean room facilities dedicated to ancient DNA work at the MPI EVA in Leipzig. Two single-stranded Illumina sequencing libraries[44] were generated from 30 µl extract each according to an automated version of the protocol on an Agilent Technologies Bravo NGS Workstation without previous uracil-DNA-glycosylase (UDG) treatment. Both libraries were double-indexed with a unique combination of 7 base pair indices[51]. An in-solution S. enterica capture was repeated for each library and all libraries were sequenced to exhaustion (between 23 to 133M reads). For ETR001, SUA004, OBP001, MUR009, MUR019 only the UDG-treated data was used for downstream analysis. In order to maximize genome-wide coverage we merged for IV3002 and MK3001 the UDG data with the initial UDG-half capture data trimmed at the 3’ and 5’ tail by two bases in order to cleave ancient DNA damage. IKI003 underwent single-stranded library preparation without any UDG treatment, and each sequencing read was trimmed by two bases to minimize impact of ancient DNA damage in all subsequent analyses. Through this approach we generated high quality data with as few as possible alignment mismatches due to ancient DNA damage. We reconstructed all genomes through a mapping strategy to the Paratyphi C reference genome. Again, all data was processed through EAGER, but with stringent BWA mapping parameters (-n 0.1) leading to eight reconstructed genomes, with six of them having an average coverage of 7X and above.

Modern S. enterica genomes and phylogenetic analysis

For initial phylogenetic placement of the ancient genomes we used a recently published set of 2,961 S. enterica genome assemblies from EnteroBase[9,10], where each genome assembly represents a random pick of a unique ribosomal sequence type (rST). The rST is a multilocus sequence type based on 53 genes encoding ribosomal proteins, used to efficiently capture S. enterica diversity[52]. EnteroBase contains over 140,000 S. enterica genomes and this approach allows a drastic reduction in the number of genomes, and hence a reduction in computational resources necessary for a comprehensive analysis. Each genome was split into kmers of 100 bases with a step size of 1 base, and mapped against the Paratyphi C RKS4594 reference following the same stringent criteria used for the ancient genomes (see above). In addition, we include two previously published 16th century Mexican (Tepos) and one medieval Norwegian (Ragna) genome using the publicly available raw reads[8,9]. An alignment of all variable sites was built for all modern and ancient genomes using the tool multivcfanalyzer (v. 0.87-alpha)[53]. Repetitive and highly conserved regions of the Paratyphi C RKS4594 reference were excluded from SNP calling to avoid spurious read mapping[8]. In order to avoid spurious SNP calls every site reported per genome had to have at least 5-fold coverage and a genotype support of at least 90%. We define the core genome of all alignments by using only sites shared across all genomes (complete deletion), which requires to remove ancient genomes with a genome-wide coverage below 5-fold (MK3001, OBP001, and Ragna) to avoid excessive loss of positions, leading to an alignment with overall 182,645 SNPs. A maximum-likelihood tree was built with RAxML[54] using the GTRCAT model and 100 bootstraps (Figure 2A). We use the CAT approximation of rate heterogeneity compared to GAMMA as it is computational much more tractable with large datasets and produces comparable results[55]. The ancient genomes are phylogenetically placed in the AESB, which harbours merely 60 out of the 2,961 genomes. Notably, we observe uncertainty for very deep lineages based on bootstrap support and high levels of recombination, including MUR019, which thus cannot be placed with high confidence within the vast S. enterica genomic diversity. For detailed analysis, we obtained from the 140,000 genomes available on EnteroBase (late March 2018) all genome assemblies that have one of the unique rST’s identified on the AESB, leading to a total of 469 genome assemblies from EnteroBase (http://enterobase.warwick.ac.uk/species/senterica/search_strains?query=workspace:12971, which represents the known genetic diversity of the AESB. Those include several host generalists recovered from various animal species (Tallahassee, Uzaramo, Goverdhan) and/or by food safety surveillance laboratories (e.g. eBG 409 and 426)[56-61]. In addition, we add the Enteritidis P125109 reference genome as an outgroup, because it is phylogenetically close to the AESB. Data processing and filtering followed the same guidelines as above leading to a SNP alignment with 37,040 positions. A maximum-likelihood tree was built with RAxML[54] using the GTRCAT model and 1000 bootstraps (Figure 2B). Due to their ambiguous placement, we removed two genomes: FDA00002391 and FDA00002392. We estimated the phylogenetic placement of the three ancient genomes filtered due to coverage (MK3001, OBP001, and Ragna) by adding them to the existent SNP alignment with relaxed SNP calling criteria. We required only 1X per site to call a genotype, which lead to the following number of callable sites: MK3001 - 17,324; OBP001 - 26,657; and Ragna - 35,465 out of the 37,040 SNPs. Missing or ambiguous sites were typed as N. In addition, we used a more relaxed filtering approach by using all SNPs shared by at least 95% of all sequences, which led to 130,036 SNPs and is shown in Extended Data Fig. 3. Hierarchical clustering of core-genome MLST sequence types at multiple levels of pair-wise linkage distances was done using EnteroBase[62].

Quantification of recombination across the AESB

The alignment based on all SNPs shared by at least 95% of all sequences were used for recombination detection, which covered 83% of the sites in the Paratyphi C RKS4594 reference genome. A total of 130,597 SNPs were identified in these relaxed core genomic sites, and assigned using the EToKi-phylo[62] onto branches in the tree using a maximum likelihood method with a symmetric transition model[63]. We found 49,101 sites (38%) were mutated on multiple independent occasions (homoplasies) resulting in a total of 241,275 substitution events. RecHMM[64] was then applied on these substitutions to estimate the number of recombination events per substitution (Extended Data Fig. 4).

Human genetic analysis

Using EAGER[50], all sequencing reads from the human genome-wide capture data were trimmed, paired-end data merged, and mapped against the human reference genome, using a mapping quality filtered (MQ>30) and duplicates removed. Typical ancient DNA damage pattern was assessed using mapDamage2.0[65]. Individual mitochondrial consensus sequences were reconstructed using schmutzi (base quality > 20)[66], and mitochondrial DNA haplogroups were assigned using HaploGrep[67], and further used for mitochondrial DNA contamination estimation. Nuclear contamination for male individuals was estimated based on heterozygosity at X-chromosome SNPs using ANGSD[68]. Sex determination was by comparing the coverage at targeted SNPs on X- and Y-chromosomes relative to SNPs on the autosomes (Supplementary Table 1). For nuclear DNA analyses (all samples except IKI003) we minimized spurious SNP calls by minimizing the amount of ancient DNA damage in the data. Sequencing reads of fully UDG treated libraries were kept untrimmed, sequencing reads from partially UDG treated libraries were trimmed by 2bp on both ends, and sequencing reads from non-UDG treated libraries were trimmed of 10bp on both ends. The targeted set of SNPs was genotyped by pseudohaploid calls with PileupCaller (https://github.com/stschiff/sequenceTools/tree/master/src-pileupCaller). For IKI003, the only single stranded libraries, genotyping was performed on both, 10bp trimmed and untrimmed sequences retaining calls deriving from untrimmed data except for C to T mutations, which presence was inspected in the trimmed data to avoid mistaking DNA damage as real mutations. The genotype data was merged with the Human Origins dataset[69] comprising ~7000 ancient and modern individuals. For nuclear DNA analysis, we require at least 15,000 SNPs overlapping with the Human Origins dataset in each sample, which was not fulfilled for two individuals (OBP001 and MUR019), and thus excluded for nuclear DNA analyses. A principal component analyses (PCA) was performed using smartpca[70]. Principal components were computed on 65 present-day West Eurasian populations[69] onto which ancient individuals were projected (parameters lsqproject: yes, shrinkmode: yes). Model-based clustering analysis using ADMIXTURE[71] was done using the complete dataset (parameters: –s time –cv) with K=3 to K=16 and each repeated five times (Extended Data Fig. 2). The run with highest likelihood was reported for each K (Extended Data Fig. 2).

Molecular dating

The temporal signal of the full AESB was evaluated with the software TempEst[22], which employs a root to tip regression of genetic distances on sampling times. Due to the lack of a signal on the entire AESB, further analyses were done using a reduced datasets[9]. In total, we analyse two subsets of reduced replicates of the agro-pastoralist branch. Both contain ETR001, CS-3, the two 16th century Mexican samples (Tepos), Lomita, IV3002, IKI003, SUA004, all five sequences from the serovar Birkenhead, 38 randomly sampled sequences, and a sequence from the serovar Bispebjerg as the outgroup. All major lineages of the HC2600_1272 cluster were represented in the subsets The TempEst analysis yielded a temporal signal for the pastoralist subsets (R2 = 0.17 and 0.16). The presence of temporal signal in the subsets was further evaluated with a date randomization test[72]. We generated 10 replicates with randomized sampling dates for each alignment, and the substitution rates were estimated with the software BEAST2[25], using a relaxed lognormal molecular clock, a general time reversible model of nucleotide substitution with six gamma categories (GTR+γ6) and a Bayesian birth-death tree prior[73] with constant-through-time birth and death rates and an upper bound of 100,000 years. This tree prior allowed explicit modelling of the fact that the proportion of samples is lower in the past than near the present. Through a monophyly constraint the respective outgroup was enforced to be an outgroup in the analysis. The two subsets of the agro-pastoralist branch passed the date randomization test (Extended Data Fig. 5), and the dating analysis was conducted with BEAST2 as described above. As only SNP alignments were used, we corrected for ascertainment bias by specifying the number of invariable sites per base in relation to the Paratyphi C RKS4594 reference genome. For each dataset, multiple BEAST replicates with 400 million MCMC steps were done and subsequently combined with the software LogCombiner, after removing a 10% burn-in. This yielded effective sample sizes above the standard threshold of 200 for all parameters. The posterior estimates for the main internal node dates and branch rates are summarized in Supplementary Table 2. The files specifying the BEAST2 analysis, the posterior distributions of all parameters and the maximum clade consensus trees are are available on figshare, with a link provided in the Data Availability Statement for this paper.

14C Dating

Collagen was extracted from the bone or tooth sample, purified by ultrafiltration (fraction >30kD) and freeze-dried. Collagen was combusted to CO2 in an Elemental Analyzer and converted catalytically to graphite. Accelerator Mass Spectrometry dating was performed to estimate the 14C age of each sample. The 14C ages are given in BP (before present) meaning years before 1950. In order to provide absolute calendar ages the 14C ages need to be calibrated. The calibration was done using the dataset INTCAL13[74] und the software SwissCal 1.0 (L. Wacker, ETH-Zürich). The results of the calibration were reported as “Cal 1-sigma” and “Cal 2-sigma” using the 1-sigma and 2-sigma uncertainty of the 14C ages, respectively. We report all ages in the format: “Cal 2-sigma start – end BP (uncalibrated radiocarbon years, laboratory number).

Genomic architecture

GenBank files for SPI-1 to SPI-12 and SGI-1/2 were obtained from PAIDB[75], and all annotated CDS sequences extracted. Additional sequences for SPI-13 to SPI-21 were obtained from respective references[76-80], as well as a SPI-6 annotation from the Para C Lineage[9]. GenBank files for plasmid annotations were obtained from NCBI. We mapped each genome (fragmented in kmer of 50b with step-size 1) and the capture probes to each reference using BWA mem[81], and filtered all alignments using picard tools (CleanSam, MarkDuplicates). For each annotation, we obtained the number of bases covered at least once using bedtools (genomecov -1)[82]. For analysis we considered only genes that were covered by at least 95% with the DNA capture probe set.

Pseudogene analysis

Pseudogenes were inferred based on premature stop codons or frameshift mutations relative to an intact gene across the S. enterica pan genome. The previously published pan genome of S. enterica (http://enterobase.warwick.ac.uk/schemes/Salmonella.wgMLST/exemplar.alleles.fasta.gz) is based on all CDS from 537 representative genomes, which were grouped based on sequence identity and cleaned for paralogs yielding 21,065 genes[10]. We filtered the complete pan genome for genes that were 100% covered by the probe set used for DNA capture, leading to 8,726 genes used for inference. For each strain present on the AESB, we aligned the genome to the refined pan gene set using BWA mem[81], which allows for overhang alignments (soft clipping) at the end of the reference, which is necessary for correct alignment to single genes. All alignments were filtered for duplicates, a mapping quality of 37 or above, and realigned (indel realignment) before SNPs and insertion/deletion were called using GATK v3.5[83]. For each strain and each gene, we built a consensus sequence using GenConS (-total_coverage 2, -major_allele_coverage 2, -consensus_ratio 0.75, -punishment_ratio 0.8)[84] and inferred pseudogenes as genes with premature stop mutation or frame-shift mutation. For each genome, we consider only genes that were covered by at least 90% for analysis. We report in Figure 4A the frequency of pseudogenes relative to the absolute number of genes analysed in each respective genome. To test for signatures of convergent evolution in host adapted serovars we assess for each gene all observed independent pseudogenization events. We select candidate pseudogenes associated with convergent evolution towards host adaptation on the AESB by filtering pseudogenes for the following criteria: (i) having at least two fixed parallel (independent) pseudogenization events across the host adapted serovars (Abortusovis, Abortusequi, Porpoise, Typhisuis, Choleraesuis, Paratyphi C), and (ii) it is not a pseudogene in the ancient host generalist from the Eneolithic and Bronze Age (MUR019, MUR009, IV3002, IKI003, SUA004) as well as modern host generalist groups (Bispebjerg, Goverdhan_eBG426, eBG409, Tallahassee, Birkenhead). The probability to observe the respective number of pseudogenization events (or more) per gene with length L was simulated (N = 10,000) by randomly distributing the observed number of pseudogenization mutations in host-adapted serovars (1,432) across the average pan-genome (size 3,828,459 bp (σ = 69,974), results shown in Supplementary Table 3). Bonferroni correction was applied to all simulation-derived probabilities to correct for multiple testing.

Ancient human population genetic analysis.

Summary human genetic analysis.

Maximum likelihood phylogeny of the AESB based on SNPs in positions present in 95% of strains.

Recombination rate estimates for the AESB.

Temporal signal analysis.

Proportion of shared pseudogenes between strains across the AESB.

Proportion of pseudo Temporal signal analysis.gene-sharing (0-100%) between strains on the AESB is shown in tones of red. Strains are ordered by phylogenetic branch and coloured accordingly.

Mismatch distribution along positions at the 5’- and 3’- end of mapped sequencing reads.

Photographs of archaeological specimens that harboured ancient S. enterica DNA.

(a) MUR009; (b) OBP001; (c) MUR019; (d) IKI003; (e) IV3002; (f) ETR001; (g) SUA004; (h) MK3001.

63 in total

Review 1. Host adaptation and the emergence of infectious disease: the Salmonella paradigm.

Authors: R A Kingsley; A J Bäumler
Journal: Mol Microbiol Date: 2000-06 Impact factor: 3.501

2. Infectious diseases in ancient populations.

Authors: T A Cockburn
Journal: Curr Anthropol Date: 1971

3. Bioarchaeology of Neolithic Çatalhöyük reveals fundamental transitions in health, mobility, and lifestyle in early farmers.

Authors: Clark Spencer Larsen; Christopher J Knüsel; Scott D Haddow; Marin A Pilloud; Marco Milella; Joshua W Sadvari; Jessica Pearson; Christopher B Ruff; Evan M Garofalo; Emmy Bocaege; Barbara J Betz; Irene Dori; Bonnie Glencross
Journal: Proc Natl Acad Sci U S A Date: 2019-06-17 Impact factor: 11.205

Review 4. Mining Metagenomic Data Sets for Ancient DNA: Recommended Protocols for Authentication.

Authors: Felix M Key; Cosimo Posth; Johannes Krause; Alexander Herbig; Kirsten I Bos
Journal: Trends Genet Date: 2017-07-05 Impact factor: 11.639

5. Epidemic multiple drug resistant Salmonella Typhimurium causing invasive disease in sub-Saharan Africa have a distinct genotype.

Authors: Robert A Kingsley; Chisomo L Msefula; Nicholas R Thomson; Samuel Kariuki; Kathryn E Holt; Melita A Gordon; David Harris; Louise Clarke; Sally Whitehead; Vartul Sangal; Kevin Marsh; Mark Achtman; Malcolm E Molyneux; Martin Cormican; Julian Parkhill; Calman A MacLennan; Robert S Heyderman; Gordon Dougan
Journal: Genome Res Date: 2009-11-09 Impact factor: 9.043

6. Salmonella enterica genomes from victims of a major sixteenth-century epidemic in Mexico.

Authors: Åshild J Vågene; Alexander Herbig; Michael G Campana; Nelly M Robles García; Christina Warinner; Susanna Sabin; Maria A Spyrou; Aida Andrades Valtueña; Daniel Huson; Noreen Tuross; Kirsten I Bos; Johannes Krause
Journal: Nat Ecol Evol Date: 2018-01-15 Impact factor: 15.460

Review 7. A genomic overview of the population structure of Salmonella.

Authors: Nabil-Fareed Alikhan; Zhemin Zhou; Martin J Sergeant; Mark Achtman
Journal: PLoS Genet Date: 2018-04-05 Impact factor: 5.917

Review 8. Ancient pathogen genomics as an emerging tool for infectious disease research.

Authors: Maria A Spyrou; Kirsten I Bos; Alexander Herbig; Johannes Krause
Journal: Nat Rev Genet Date: 2019-06 Impact factor: 53.242

Review 9. World Health Organization Estimates of the Global and Regional Disease Burden of 22 Foodborne Bacterial, Protozoal, and Viral Diseases, 2010: A Data Synthesis.

Authors: Martyn D Kirk; Sara M Pires; Robert E Black; Marisa Caipo; John A Crump; Brecht Devleesschauwer; Dörte Döpfer; Aamir Fazil; Christa L Fischer-Walker; Tine Hald; Aron J Hall; Karen H Keddy; Robin J Lake; Claudio F Lanata; Paul R Torgerson; Arie H Havelaar; Frederick J Angulo
Journal: PLoS Med Date: 2015-12-03 Impact factor: 11.069

10. Pan-genome Analysis of Ancient and Modern Salmonella enterica Demonstrates Genomic Stability of the Invasive Para C Lineage for Millennia.

Authors: Zhemin Zhou; Inge Lundstrøm; Alicia Tran-Dien; Sebastián Duchêne; Nabil-Fareed Alikhan; Martin J Sergeant; Gemma Langridge; Anna K Fotakis; Satheesh Nair; Hans K Stenøien; Stian S Hamre; Sherwood Casjens; Axel Christophersen; Christopher Quince; Nicholas R Thomson; François-Xavier Weill; Simon Y W Ho; M Thomas P Gilbert; Mark Achtman
Journal: Curr Biol Date: 2018-07-19 Impact factor: 10.834

19 in total

1. Getting sick in the Neolithic.

Authors: Anne C Stone
Journal: Nat Ecol Evol Date: 2020-03 Impact factor: 15.460

2. Ancient history.

Authors: Andrea Du Toit
Journal: Nat Rev Microbiol Date: 2020-04 Impact factor: 60.633

3. The source of the Black Death in fourteenth-century central Eurasia.

Authors: Maria A Spyrou; Lyazzat Musralina; Guido A Gnecchi Ruscone; Arthur Kocher; Pier-Giorgio Borbone; Valeri I Khartanovich; Alexandra Buzhilova; Leyla Djansugurova; Kirsten I Bos; Denise Kühnert; Wolfgang Haak; Philip Slavin; Johannes Krause
Journal: Nature Date: 2022-06-15 Impact factor: 69.504

4. Mass burial genomics reveals outbreak of enteric paratyphoid fever in the Late Medieval trade city Lübeck.

Authors: Magdalena Haller; Kimberly Callan; Julian Susat; Anna Lena Flux; Alexander Immel; Andre Franke; Alexander Herbig; Johannes Krause; Anne Kupczok; Gerhard Fouquet; Susanne Hummel; Dirk Rieger; Almut Nebel; Ben Krause-Kyora
Journal: iScience Date: 2021-04-20

Review 5. Life History Transitions at the Origins of Agriculture: A Model for Understanding How Niche Construction Impacts Human Growth, Demography and Health.

Authors: Jonathan C K Wells; Jay T Stock
Journal: Front Endocrinol (Lausanne) Date: 2020-05-21 Impact factor: 5.555

6. Genetic Markers in S. Paratyphi C Reveal Primary Adaptation to Pigs.

Authors: Satheesh Nair; Maria Fookes; Craig Corton; Nicholas R Thomson; John Wain; Gemma C Langridge
Journal: Microorganisms Date: 2020-04-30

7. Genomic Serotyping, Clinical Manifestations, and Antimicrobial Resistance of Nontyphoidal Salmonella Gastroenteritis in Hospitalized Children in Ho Chi Minh City, Vietnam.

Authors: Vu Thuy Duong; Hao Chung The; Tran Do Hoang Nhu; Ha Thanh Tuyen; James I Campbell; Pham Van Minh; Hoang Le Phuc; Tran Thi Hong Chau; Nguyen Minh Ngoc; Lu Lan Vi; Alison E Mather; Stephen Baker
Journal: J Clin Microbiol Date: 2020-11-18 Impact factor: 5.948

Review 8. Salmonella versus the Microbiome.

Authors: Andrew W L Rogers; Renée M Tsolis; Andreas J Bäumler
Journal: Microbiol Mol Biol Rev Date: 2020-12-23 Impact factor: 11.056

Review 9. The Recovery, Interpretation and Use of Ancient Pathogen Genomes.

Authors: Sebastián Duchêne; Simon Y W Ho; Ann G Carmichael; Edward C Holmes; Hendrik Poinar
Journal: Curr Biol Date: 2020-10-05 Impact factor: 10.834

10. Metagenomics of the modern and historical human oral microbiome with phylogenetic studies on Streptococcus mutans and Streptococcus sobrinus.

Authors: Mark Achtman; Zhemin Zhou
Journal: Philos Trans R Soc Lond B Biol Sci Date: 2020-10-05 Impact factor: 6.237