Literature DB >> 34648730

Hallstatt miners consumed blue cheese and beer during the Iron Age and retained a non-Westernized gut microbiome until the Baroque period.

Frank Maixner¹, Mohamed S Sarhan², Kun D Huang³, Adrian Tett⁴, Alexander Schoenafinger⁵, Stefania Zingale², Aitor Blanco-Míguez⁶, Paolo Manghi⁶, Jan Cemper-Kiesslich⁷, Wilfried Rosendahl⁸, Ulrike Kusebauch⁹, Seamus R Morrone⁹, Michael R Hoopmann⁹, Omar Rota-Stabelli¹⁰, Thomas Rattei¹¹, Robert L Moritz⁹, Klaus Oeggl¹², Nicola Segata⁶, Albert Zink², Hans Reschreiter¹³, Kerstin Kowarik¹⁴.

Abstract

We subjected human paleofeces dating from the Bronze Age to the Baroque period (18th century AD) to in-depth microscopic, metagenomic, and proteomic analyses. The paleofeces were preserved in the underground salt mines of the UNESCO World Heritage site of Hallstatt in Austria. This allowed us to reconstruct the diet of the former population and gain insights into their ancient gut microbiome composition. Our dietary survey identified bran and glumes of different cereals as some of the most prevalent plant fragments. This highly fibrous, carbohydrate-rich diet was supplemented with proteins from broad beans and occasionally with fruits, nuts, or animal food products. Due to these traditional dietary habits, all ancient miners up to the Baroque period have gut microbiome structures akin to modern non-Westernized individuals whose diets are also mainly composed of unprocessed foods and fresh fruits and vegetables. This may indicate a shift in the gut community composition of modern Westernized populations due to quite recent dietary and lifestyle changes. When we extended our microbial survey to fungi present in the paleofeces, in one of the Iron Age samples, we observed a high abundance of Penicillium roqueforti and Saccharomyces cerevisiae DNA. Genome-wide analysis indicates that both fungi were involved in food fermentation and provides the first molecular evidence for blue cheese and beer consumption in Iron Age Europe.

Entities: Chemical

Keywords: Hallstatt; beer; cheese; diet; fermented food; microbiome; paleofeces; protohistory; salt mine

Mesh：

Year: 2021 PMID： 34648730 PMCID： PMC8660109 DOI： 10.1016/j.cub.2021.09.031

Source DB: PubMed Journal: Curr Biol ISSN： 0960-9822 Impact factor: 10.834

Introduction

Paleofeces are naturally preserved ancient feces found in dry caves, desert areas, waterlogged environments, and frozen habitats. Specific environmental processes such as desiccation or freezing prevent their deterioration in mummies, ancient latrines, bogs, and soils. Previous studies have shown that paleofecal material still contains plant macro- and microfossils, parasite eggs, and even ancient biomolecules (DNA, proteins, metabolites). Ancient paleofeces have therefore recently been used as a source of information to study prehistoric nutrition patterns3, 4, 5 and health, and to analyze single representatives, or the overall composition of the intestinal microbiome of our ancestors.10, 11, 12 One of the few archaeological sites where well-preserved paleofeces can be found is the protohistoric salt mines of the Austrian UNESCO World Heritage area Hallstatt-Dachstein/Salzkammergut. Protohistoric salt mines offer ideal preservation conditions for organic materials. The high salt concentrations and the constant annual temperature of around 8°C inside the isolated mine workings preserve organic artifacts very well. The Hallstatt salt mines located in the Eastern Alps (Figure 1) offer one of the world’s oldest and most continuous record of underground salt mining.13, 14, 15 Large-scale underground mining in the Hallstatt salt mountains dates back at least to the 14th century BC (late Bronze Age). Several protohistoric (Bronze, Iron Age) and historic (14th century AD to present) mining phases are well documented. The site also gave name to the early period of the Iron Age in Europe, the so-called Hallstatt Period (800 to 400 BC). Dense layers of production waste reaching several meters of thickness were excavated from the protohistoric Bronze Age and Iron Age mine workings of Hallstatt, uncovering thousands of wooden tools and construction elements, implements made from fur, rawhide, hundreds of woolen textile fragments, grass, bast ropes, and human excrements. These objects provide insights into the daily life of a Bronze Age and Iron Age mining community ranging from mining technology, organization of production, and resource management to human health, dietary habits, social organization of production processes, and social status within a mining system. These aspects have been studied extensively in Hallstatt based on a combination of data sources encompassing the protohistoric mine working, Bronze Age meat-curing facilities, and large Iron Age cemetery.,,

Figure 1

The Hallstatt salt mine and radiocarbon-dated paleofeces samples used in this study

(A) The salt mines are located in Upper Austria.

(B) Finding sites of the four paleofeces samples in the Bronze Age, Iron Age, and Baroque mining area. The symbol color corresponds to the radiocarbon date of the paleofeces.

(C) Macroscopic appearance of the four paleofeces samples. The scale bar corresponds to 1 cm of length. The sample description includes the sample ID, the mine workings name, and the radiocarbon date. The provided radiocarbon date range corresponds to the Cal 2-sigma values with the highest probability.

(D) Temporal assignment of the radiocarbon-dated paleofeces to the major European time periods from the Bronze Age onward.

See also Figure S1 for details of the salt crystals surrounding sample 2612. Data S1A and S1B provide additional information about the samples and the radiocarbon dating.

The Hallstatt salt mine and radiocarbon-dated paleofeces samples used in this study (A) The salt mines are located in Upper Austria. (B) Finding sites of the four paleofeces samples in the Bronze Age, Iron Age, and Baroque mining area. The symbol color corresponds to the radiocarbon date of the paleofeces. (C) Macroscopic appearance of the four paleofeces samples. The scale bar corresponds to 1 cm of length. The sample description includes the sample ID, the mine workings name, and the radiocarbon date. The provided radiocarbon date range corresponds to the Cal 2-sigma values with the highest probability. (D) Temporal assignment of the radiocarbon-dated paleofeces to the major European time periods from the Bronze Age onward. See also Figure S1 for details of the salt crystals surrounding sample 2612. Data S1A and S1B provide additional information about the samples and the radiocarbon dating. Here we focus on the question of the structure and evolution of dietary habits as well as the human gut microbiome in one of Europe’s most important early production communities. We used microscopic, metagenomic, and proteomic analysis to characterize nutrition patterns of the protohistoric (Bronze Age, early Iron Age) and historic (Baroque period, 18th century AD) miners and metagenomic analysis to determine the structure and evolution of the gut microbiome. Our findings will enhance the understanding of early European dietary habits (especially the production and consumption of processed foodstuffs) and provide further evidence of the recent change in gut microbiome structure as result of industrialization and Westernization processes.

Results

Paleofeces from Bronze Age to Baroque Period contain ancient endogenous DNA

In this study, we initially subjected four paleofeces samples, collected from Bronze Age and Iron Age Hallstatt mine workings, to radiocarbon dating, then to in-depth microscopic and molecular analysis (Figures 1A–1C; STAR Methods; Data S1A). The four paleofeces samples can be macroscopically differentiated into three samples containing a high amount of fibrous plant material (2610, 2604, 2611) and one more homogeneous sample (2612) that does not contain any visible larger plant fragments (Figures 1 andS1A). Radiocarbon analyses date the roughly structured samples to the Late Bronze Age (2610) and Iron Age (2604, 2611), which is in perfect accordance with the proposed period of usage of the mine workings where the paleofeces have been found. In contrast, the fine-textured paleofeces 2612 sampled in an Iron Age minedates to the Baroque period (18th century AD) (Figure 1C; Data S1B). For this part of the salt mine, however, it is historically documented that the mine workings had started to be reused from the beginning of 18th century onward. Independently of the paleofeces’ age, their storage time since excavation (some samples were recovered in the year 1983), or the mode of excavation (wet sieving versus direct sampling) (Figure S1), we could retrieve biomolecules (DNA and protein) from all samples for the subsequent molecular analysis (Data S1 and S2; STAR Methods). Proteomics analysis provided the first evidence for the presence of endogenous biomolecules in the paleofeces material. The most abundant peptides were assigned to human intestinal tract proteins that are involved in food digestive processes (Data S2B, S2D, S2F, and S2H). The DNA of the paleofeces material was further subjected to a deep shotgun sequencing approach resulting in 57,130,584 to 221,314,691 quality-filtered reads (Data S1C). A first taxonomic overview using DIAMOND against the NCBI NR database revealed that the majority of reads in the samples are assigned to Bacteria (93.9% to 78.9% of all assigned reads), with Firmicutes and Bacteroidetes being the most abundant phyla of this kingdom (Figures S2A–S2D). Less than 7.5% of the reads were eukaryotic, with up to 6.7% fungal reads in sample 2604. The Metazoa and Viridiplantae reads, important for the molecular reconstruction of the diet, comprised 0.5% to 0.01% of all assigned reads. Further analysis of the human DNA in the paleofeces revealed an endogenous DNA content between 0.26% and 0.06%, sufficient for molecular sex and mitochondrial haplogroup assignment (Data S1D). The highly fragmented human reads display a very low deamination pattern at the 5′ ends (Figures S2E–S2H). In the most recent sample (2612), the reads appear even less fragmented and display almost no DNA damage. Considering the age of these samples, the DNA damage is exceptionally low. This high preservation is most likely due to the rapid desiccation of the samples in the salt mine, which may result in reduced hydrolytic damage of the biomolecules. Our analyses show that the four paleofeces come from male individuals that carry distinct mitogenomes with low contamination estimates (1% to 2%), indicating that each sample represents unique unmixed ancient feces.

Ancient paleofeces display a gut microbiome structure similar to modern non-Westernized individuals

We compared the microbiome structure of the paleofeces to a large number of contemporary metagenomes (n = 823) (Data S1E). Principal coordinate analysis (PCoA) performed on a species-level taxonomic composition shows that the paleofeces from the Bronze Age to the Baroque period cluster with stool samples from contemporary non-Westernized individuals (Figure 2A) with diets mainly consisting of unprocessed foods and fresh fruits and vegetables. This clustering is similarly observed for encoded metabolic pathways (Figure 2B). All the paleofecal samples were distinct from the oral and, more importantly, from the soil samples, suggesting little evidence of soil contamination, which is sometimes observed in ancient metagenomics studies. The source prediction analysis further supports the sample preservation (Data S1F).

Figure 2

Overview of microbial composition and metabolic pathways of paleofeces samples in comparison to a large collection of contemporary metagenomes

(A) Principal coordinate analysis (PCoA) based on microbial abundance profiled using MetaPhlAn 3.0 between four paleofeces samples and 823 contemporary samples characterized by sampling environment, body site, and non-Westernized lifestyle.

(B) Principal coordinate analysis (PCoA) based on metabolic pathway abundance profiled using HUMAnN 3.0 between four paleofeces samples and the same contemporary samples used in (A).

(C) Prevalence of the top 15 most enriched species of four paleofeces samples in non-Westernized and Westernized datasets comprising 8,968 stool samples from healthy adult individuals. Asterisk indicates species that is likely from external contamination.

(D) Relative abundance of P. copri four clades estimated using MetaPhlAn 3.0 in each paleofecal sample.

See also Figure S1 for additional microbial profiles in the DNA “wash-out” experiment. Data S1 contains additional information about the comparative datasets and the results obtained by the prevalence and abundance analysis.

Overview of microbial composition and metabolic pathways of paleofeces samples in comparison to a large collection of contemporary metagenomes (A) Principal coordinate analysis (PCoA) based on microbial abundance profiled using MetaPhlAn 3.0 between four paleofeces samples and 823 contemporary samples characterized by sampling environment, body site, and non-Westernized lifestyle. (B) Principal coordinate analysis (PCoA) based on metabolic pathway abundance profiled using HUMAnN 3.0 between four paleofeces samples and the same contemporary samples used in (A). (C) Prevalence of the top 15 most enriched species of four paleofeces samples in non-Westernized and Westernized datasets comprising 8,968 stool samples from healthy adult individuals. Asterisk indicates species that is likely from external contamination. (D) Relative abundance of P. copri four clades estimated using MetaPhlAn 3.0 in each paleofecal sample. See also Figure S1 for additional microbial profiles in the DNA “wash-out” experiment. Data S1 contains additional information about the comparative datasets and the results obtained by the prevalence and abundance analysis. To further assess the paleofeces samples, we analyzed the prevalence of the top 15 most abundant species in the paleofeces compared to 8,968 gut microbiomes of healthy Westernized and non-Westernized adults (Data S1G). As a result, 13 out of the 15 most abundant species were identified to be associated with human gut environment, of which 11 species were found to be more prevalent in modern non-Westernized compared to Westernized populations. Five of these species, Bifidobacterium angulatum, Lactobacillus ruminis, Catenibacterium mitsuokai, Prevotella copri, and Clostridium ventriculi, were over twice as prevalent in non-Westernized populations (Figure 2C; Data S1H). One of the two species not associated with the human gut is the halophilic archaeon Halococcus morrhuae, which survives on a high concentration of salt. It was observed in low abundance in the paleofeces sample 2612, the only sample that was not subjected to wet sieving. Therefore, we assume that the archaeon was introduced from the environment via the salt crystals (Figure S1B). In the paleofeces samples 2604 and 2611, we also identified Clostridium perfringens, a known intestinal foodborne pathogen, that also occurs free living in the soil and other environments. Since an infection with C. perfringens causes acute diarrhea and the paleofeces does not indicate any characteristics pointing to that disease, we assume that the presence of this bacterium is due to an environmental contaminant rather than a remnant of food spoilage in the miners’ gut. When all paleofecal microbiome members were considered in population prevalence analysis, 100 out of 158 species were found in ≥5% stool samples from modern healthy adult individuals and 65% of these species are overrepresented in non-Westernized populations in comparison with the Westernized equivalent (Data S1I). A similar result was obtained when we decreased the prevalence threshold to 1% (Data S1J). The Prevotella copri complex, which is highly prevalent in non-Westernized populations and prevalent in previously investigated ancient samples, was identified in all paleofecal samples, representing, on average, 7.3% (∼1.6%–14.7%) of the relative abundance (Figure 2C; Data S1K). Consistent with previous findings, we found multiple clades of the complex to be present in each of the paleofeces samples (Figure 2D) with the exception of Clade D, which was barely detectable in sample 2604 and 2612. All other clades were detected with relative abundances > 0.01% in all samples (Figure 2D; Data S1L). Of note, in contrast to other samples, the sample 2604 displayed higher abundance of bacterial species such as Lactobacillus brevis, Bifidobacterium merycicum, Bifidobacterium angulatum, and Lactobacillus plantarum (Data S1K) that are known to be of probiotic activities or involved in processing of dairy products.

Microscopic and molecular reconstruction of the Hallstatt miners' diet

Next, we aimed to reconstruct the dietary components in the paleofeces using both a microscopic and a molecular survey. The above-mentioned structural differences between the paleofeces became even more evident in the microscopic analyses. The Baroque period sample 2612 was much finer textured than all other samples from protohistory (Figure 1C). This was also reflected in the macro-remain composition of the paleofeces, showing that samples 2610, 2604, and 2611 contained a lot of seeds contrary to 2612, which consisted of frequent tissues of fruit husks and seed coats (Figure 3A; Data S1M). Generally, all samples displayed a predominance of cereal remains.

Figure 3

Microscopic and molecular dietary analysis of the Hallstatt paleofeces

(A) Plant macro-remains microscopically detected in the four paleofeces samples. The scale bar indicates 1 mm of length. The heatmap shows the log-scale macro-remain counts normalized to 3.7 g sample. The sample with asterisk was assessed in a semiquantitative manner. For further details, please refer to Data S1.

(B) Most abundant taxa (plants, nematodes, animals, fungi) detected in the four paleofeces metagenomes and proteomes. The circle size and circle color correspond to log10 “normalized” number of reads per million at genus and species levels, respectively. The asterisks in the proteome heatmap mean the peptides were assigned only to genus level.

(C) Phylogenetic assignment of two partial Triticum chloroplast genomes in the 2604 and 2610 metagenomes. The comparative dataset included complete chloroplast genomes of selected members of the Triticeae tribe (NCBI accession numbers are provided in the figure). The tree was calculated using the maximum-likelihood algorithm (PhyML) based on 136,160 informative positions. Black circles symbolize parsimony and neighbor joining bootstrap support (>90%) based on 100 and 1,000 iterations, respectively. The scale bar indicates 10% estimated sequence divergence.

(D) Wheat subgenome (A, B, and D) representation in the 2604 and 2610 metagenomes (Data S1), aligned to the modern hexaploid bread wheat reference genome (accession number GCA_900519105). Both the wheat chloroplast and nuclear reads were highly fragmented and display aDNA-specific damage patterns (Figures S3H–S3K).

See also Figure S3 for details about the comparative analysis, phylogenetic assignment, and damage pattern of selected plant, animal, and parasite DNA. Data S1 provides further details of the macro-remains, comparative datasets, dietary DNA, and mapping statistics. Data S2 provides additional information about the comparative datasets and proteomics results.

Microscopic and molecular dietary analysis of the Hallstatt paleofeces (A) Plant macro-remains microscopically detected in the four paleofeces samples. The scale bar indicates 1 mm of length. The heatmap shows the log-scale macro-remain counts normalized to 3.7 g sample. The sample with asterisk was assessed in a semiquantitative manner. For further details, please refer to Data S1. (B) Most abundant taxa (plants, nematodes, animals, fungi) detected in the four paleofeces metagenomes and proteomes. The circle size and circle color correspond to log10 “normalized” number of reads per million at genus and species levels, respectively. The asterisks in the proteome heatmap mean the peptides were assigned only to genus level. (C) Phylogenetic assignment of two partial Triticum chloroplast genomes in the 2604 and 2610 metagenomes. The comparative dataset included complete chloroplast genomes of selected members of the Triticeae tribe (NCBI accession numbers are provided in the figure). The tree was calculated using the maximum-likelihood algorithm (PhyML) based on 136,160 informative positions. Black circles symbolize parsimony and neighbor joining bootstrap support (>90%) based on 100 and 1,000 iterations, respectively. The scale bar indicates 10% estimated sequence divergence. (D) Wheat subgenome (A, B, and D) representation in the 2604 and 2610 metagenomes (Data S1), aligned to the modern hexaploid bread wheat reference genome (accession number GCA_900519105). Both the wheat chloroplast and nuclear reads were highly fragmented and display aDNA-specific damage patterns (Figures S3H–S3K). See also Figure S3 for details about the comparative analysis, phylogenetic assignment, and damage pattern of selected plant, animal, and parasite DNA. Data S1 provides further details of the macro-remains, comparative datasets, dietary DNA, and mapping statistics. Data S2 provides additional information about the comparative datasets and proteomics results. Microscopic analysis revealed that the Bronze Age sample (2610) consisted more or less exclusively of cereal remains, which originated from barley (Hordeum vulgare), spelt (Triticum spelta), some emmer (Triticum dicoccum), proso millet (Panicum miliaceum), and a few weeds, e.g., corn cockle (Agrostemma githago) and poison parsley (Aethusa cynapium). The Iron Age samples (2604, 2611) were characterized by a predominance of cereal remains from barley (Hordeum vulgare), spelt (Triticum spelta), millets (P. miliaceum, Setaria italica), and a little emmer (T. dicoccum). Furthermore, in sample 2611, testa remains of broad beans (Vicia faba) and seeds of opium poppies (Papaver somniferum) were observed. Crab apples (Malus sylvestris) and bilberries/cranberries (Vaccinium myrtillus/vitis-idea) in sample 2604 document the consumption of gathered wild fruits. Striking in the Iron Age sample 2604 was the contamination with weeds, in particular corn cockle (Agrostemma githago). In the sample 2612 from the Baroque time, the microscopic pattern was notably different from the other samples. The plant material was finely ground, and entire fruits were missing apart from a digested mericarp of anise (Pimpinella anisum). The plant tissue belonged to bran (fragments of cereal testa, pericarp, hairs, hilum, endosperm) of wheat (Triticum sp.). The precise species was unidentifiable, but according to the rare occurrence of tube cells in the pericarp fragments, a member of the tetra- or hexaploid wheat group is suggested. Bran of barley (H. vulgare) was also observed in minor quantities. Furthermore, the consumption of legumes is documented by testa remains of garden bean (Phaseolus vulgare) in this sample. In addition to the microscopic analysis, we subjected paleofeces biomolecules (DNA and proteins) to molecular dietary analyses. Both metagenomic and proteomic analyses included a homology search against different databases, followed by strict filtering steps of the obtained hits and a subsequent in-depth analysis of selected identified taxa (STAR Methods; Figure S3; Data S1N, S1O, S2B, S2D, S2F, and S2H). For the plant diet, we could confirm the presence of the most abundant domesticated plant macro-remains, including broomcorn millet (P. milliaceum), barley (H. vulgare), and wheat (Triticum spp.) (Figure 3B). In addition, we found DNA-based evidence of the presence of walnut (Juglans regia) in the sample 2604 and protein-based support for the occurrence of opium poppy seeds (P. somniferum) in the sample 2611. In addition to the foxtail millet (S. italica), which appeared with high grain number, all the low abundant wild plants unveiled by the microscopic investigation were not identified in our molecular survey, which has undergone strict filtering to minimize the false positives (STAR Methods). Further phylogenetic analysis assigned the Triticum spp. chloroplast genomes of the samples 2604 and 2610 closest to the chloroplasts of tetraploid (emmer, durum) and hexaploid (spelt wheat, bread wheat) wheat varieties, respectively (Figures 3C, S3A, and S3B; Data S1P). Additional comparison with the bread wheat genome revealed an equal subgenome (A, B, and D) representation in the 2604 and 2610 metagenomic reads, which suggests, in combination with the microscopic identification of numerous characteristic grains, glumes, and spikelets, the presence of hexaploid spelt wheat (T. spelta) in these paleofeces (Figure 3D). Beside the plant diet, we obtained molecular evidence for the consumption of cattle (Bos taurus) and swine meats (Sus scrofa) throughout all investigated time periods (Figures 3B and S3D). Interestingly, the most abundant cattle proteins in sample 2611 (hemoglobin and coagulation proteins) indicate the plant diet was supplemented by blood-rich animal tissues (e.g., muscle, liver) (Data S2F). The molecular analyses revealed in addition, that individuals from both the Iron Age (2611) and the Baroque (2612) suffered from intestinal infections of whipworms (Trichuris trichuria) and roundworm (Ascaris spp.) (Figure 3B and S3D–S3G). Finally, all samples showed a continuous low background with fungal DNA mainly coming from different Ascomycota.

Molecular evidence for blue cheese and beer consumption during the Iron Age

In contrast to all other samples, the Iron Age sample 2604 displayed an exceptionally high abundance of Penicillium roqueforti and Saccharomyces cerevisiae proteins (Data S2D) and DNA (Data S1N), making up to 7%–22% of total eukaryotic reads. This was characteristic of this sample as compared with the other samples that did not show such prevalence—even the sample 2611, which was taken from a similar context and dated back to the same time point. To authenticate the data and gain further insights into their potential ecological significance, we mapped the high-quality reads of sample 2604 against the reference genomes of these two fungi (Figure S4A; Data S1O). With 11–13× coverage, we were able to reconstruct >92% of both genomes, displaying even coverage and SNP distribution. To confirm whether these two fungi are of ancient origin and not modern contaminants, we initially checked the ancient DNA damage pattern of the mapped reads. Both fungi displayed typical ancient DNA damage patterns, with levels comparable to the human endogenous DNA (Figures S4B and S4C). Hence, and considering their extraordinarily high abundance and exclusive incidence in this sample, we assumed their endogenous originality to the coprolite microbial community. Additionally, both fungi are commonly used nowadays in food processing: P. roqueforti is used for cheese fermentation, and S. cerevisiae is used for fermenting bread and alcoholic beverages including beer, mead, and wine. Therefore, we assume that they could have been involved in food processing at that time. To test this assumption, we used the reconstructed genomes for further comparative phylogeny and population genetic analyses to infer whether they had been truly involved in food processing or were just transient environmental microbes. First, we compared our putative P. roqueforti strain to 33 other sequenced modern P. roqueforti strains coming from different functional niches. The comparative dataset included 18 cheese-fermenting and 15 non-cheese-fermenting strains (Data S1Q), in addition to Penicillium psychrosexualis and Penicillium carneum as an outgroup. After mapping the raw reads of all strains to the reference P. roqueforti genome FM164 and data filtering, we resolved 120,337 SNPs, which were used for inferring maximum likelihood (ML) phylogenetic relationships among the tested strains (STAR Methods). Consistent with the original publication of Dumas and colleagues, the resulting phylogeny revealed four distinct clades: a Roquefort cheese clade, a non-Roquefort cheese clade containing blue cheeses others than Roquefort, a silage/food spoilage clade, and a wood/food spoilage clade (Figure 4A). Initially, the phylogenetic analysis separated the non-Roquefort cheese clade from the other, then the Roquefort cheese clade was diverged from the other food spoilage clades. The ancient P. roqueforti strain showed highest similarity to the non-Roquefort cheese strains, being clustered together with their corresponding clade as an earlier divergent. The reason behind such early divergence might be attributed to the recent acquisition of some genomic regions—most importantly, CheesyTer and Wallaby—by the non-Roquefort cheese strains. This gene acquisition most likely happened via repeated multiplication of selected spores of the best cheeses on bread used as a growth medium in the late 19th century and early 20th century before the advent of microbiological in vitro culturing techniques. Thereby, modern non-Roquefort strains were exposed to extensive selection coupled to horizontal gene transfer events from other cheese-producing Penicillium spp. or even other genera.,27, 28, 29 Importantly, our Iron Age strain did not contain any of those recently acquired fragments, which comes in congruence with the hypothesis that such domestication events occurred during the last two centuries.

Figure 4

Genome-wide SNP analysis of ancient fungal “strains” versus modern industrial and wild/environmental strains

(A) Maximum likelihood (ML) phylogenetic analysis of the Penicillium roqueforti genome assembled from the sample 2604 in addition to other previously published P. roqueforti genomes. A total number of 120,359 SNP positions were used for the analysis. P. roqueforti FM164 was used as a reference, while P. carneum and P. psychrosexualis were used as outgroups. The scale bar depicts 0.1 substitutions per residue. Colored strips indicate the P. roqueforti population as previously inferred. For further information on the comparison dataset, please refer to Data S1Q.

(B) Population structure analysis of P. roqueforti 2604 with the same previous dataset, considering 3 ancestries (K = 3 with lowest cross-validation error), based on 120,337 SNPs. The order of labels corresponds to the clustering in panel (A).

(C) Wooden containers that have been found among other archeological findings in the mines and assumed to be used as cheese strainers

(D) ML phylogenetic analysis of Saccharomyces cerevisiae genome assembled from the sample 2604 compared with other published S. cerevisiae genomes. The dataset for the analysis included 375,629 SNPs. The Saccharomyces paradoxus CBS432 was used as an outgroup. The scale bar depicts 0.1 substitutions per residue. The colored strips indicate the clade/origin as reported previously. The colored dots at the tree edges refer to the presence/absence of functional marker genes. Blue dots in (A) and (C) indicate bootstrap support >80% based on 1,000 bootstrap replicates.

(E) Principal component analysis based on 136,712 SNP, of the S. cerevisiae strains.

For additional information on the coverage and SNP density, DNA damage, and ADMIXTURE, please refer to Figure S4. Data S1 provides further details about the comparative datasets, mapping results, and functional marker analysis.

Genome-wide SNP analysis of ancient fungal “strains” versus modern industrial and wild/environmental strains (A) Maximum likelihood (ML) phylogenetic analysis of the Penicillium roqueforti genome assembled from the sample 2604 in addition to other previously published P. roqueforti genomes. A total number of 120,359 SNP positions were used for the analysis. P. roqueforti FM164 was used as a reference, while P. carneum and P. psychrosexualis were used as outgroups. The scale bar depicts 0.1 substitutions per residue. Colored strips indicate the P. roqueforti population as previously inferred. For further information on the comparison dataset, please refer to Data S1Q. (B) Population structure analysis of P. roqueforti 2604 with the same previous dataset, considering 3 ancestries (K = 3 with lowest cross-validation error), based on 120,337 SNPs. The order of labels corresponds to the clustering in panel (A). (C) Wooden containers that have been found among other archeological findings in the mines and assumed to be used as cheese strainers (D) ML phylogenetic analysis of Saccharomyces cerevisiae genome assembled from the sample 2604 compared with other published S. cerevisiae genomes. The dataset for the analysis included 375,629 SNPs. The Saccharomyces paradoxus CBS432 was used as an outgroup. The scale bar depicts 0.1 substitutions per residue. The colored strips indicate the clade/origin as reported previously. The colored dots at the tree edges refer to the presence/absence of functional marker genes. Blue dots in (A) and (C) indicate bootstrap support >80% based on 1,000 bootstrap replicates. (E) Principal component analysis based on 136,712 SNP, of the S. cerevisiae strains. For additional information on the coverage and SNP density, DNA damage, and ADMIXTURE, please refer to Figure S4. Data S1 provides further details about the comparative datasets, mapping results, and functional marker analysis. We further used the tool ADMIXTURE to infer the degree of admixture among the strains. By assuming the presence of 3 ancestries (K = 3), we could clearly distinguish the non-Roquefort, Roquefort, and food spoilage strains (Figure 4B). Our putative strain displayed ∼70% cheese-producing ancestry (60% of the non-Roquefort and 10% of Roquefort cheese) and ∼30% food-spoiler ancestry. Both the phylogenetic placement and ADMIXTURE profile indicate that the ancient P. roqueforti has already been under positive selection toward the non-Roquefort cheese cluster, a selection process that most likely occurred during the process of cheese production. Some archeological findings excavated from the mines might have been used for that purpose (Figure 4C), as they showed some traces of fatty food products. Next, we compared the ancient S. cerevisiae genome to 157 recent strains coming from different ecological niches, i.e., food, alcoholic beverages (e.g., beer, wine, sake, and spirits), biofuels, and laboratories, as well as wild strains (Data S1R). ML phylogenetic analysis, based on 375,629 SNP positions, distinguished 2 main clades. The first main clade splits into two subclades, with one containing most of the beer strains (beer 1 clade) that show a successive sub-clustering based on the origin of the strains (Figure 4D). The other subclade (henceforth referred to as “mixed” clade) included a mixture of bread, wine, beer, and spirit strains. The second main clade is composed of two subclades: a wine clade and another beer clade (beer 2). All other wild, laboratory, and sake strains fall to the base of the whole phylogeny. The ancient S. cerevisiae strain clustered basal to the second main clade, which includes the wine and beer 2 strains. Further population structure analysis displayed high admixture in our putative strain, resembling primarily the wine ancestral population (47%), followed by 29.2% beer ancestries (Figure S4D) and only 19% wild strain ancestry. Therefore, and considering the ML phylogenetic assignment, we assume that the possibility of our strain to be of wild origin is unlikely. The results rather indicate higher similarity to wine and beer strains. Principal component analysis (PCA) provided further indication for the domestication of our strain in alcoholic beverage fermentation. Along the PC1 that explains 25.42% of the variation, our strain clustered closer to the strains of beer 2 than to the strains of the wine clade (Figure 4E). This was further supported with proteomic analysis (Data S2D) that unveiled that most of the peptides assigned to the genus Saccharomyces derived from proteins involved in alcohol fermentation pathways (e.g., glycolysis). To further narrow down the possible routes of domestication, we decided to differentiate the strains based on functional marker genes (Data S1S). According to recent literature,,, the genes RTM1, BIO1/BIO6, and the chromosomal regions A/B/C can be used to differentiate yeast strains based on their functional niches. The gene RTM1 is a strong domestication marker responsible for conferring resistance against the toxicity of molasses and other rich-sugar substrates and is assumed to be positively selected in beer yeast strains., The genes BIO1 and BIO6, which are involved in de novo biosynthesis of biotin, are highly selected in sake fermenting yeasts, due to lack of biotin in the fermentation substrates, such as rice. The regions A, B, and C are horizontally transferred genomic regions from other yeast genera, e.g., Kluyveromyces, Pichia, and Zygosaccharomyces. These regions contain 39 genes distributed over 3 different chromosomes and are assumed to play a role in wine fermentation. Therefore, we searched the presence of these marker genes in our comparative dataset, including our ancient strain. In accordance with the literature, almost all beer strains—either of clade 1 or clade 2—were positive for RTM1, while the wine clade was mainly positive for the genomic regions A/B/C. The mixed clade contained both RTM1 and the regions A/B/C. The sake clade exclusively contained the BIO1/BIO6 genes and partially the RTM1 gene. The wild strains, isolated from cacao in Africa, clustered in the basal clade and did not contain any of these marker genes (Figure 4D; Data S1T), contrary to our strain that contained the RTM1 and lacked the BIO1/BIO6 genes and the regions A/B/C. Based on the previous findings—i.e., (1) the ancient DNA damage profile, (2) the high prevalence of S. cerevisiae reads, (3) the presence of fermentable cereal substrates such as wheat and barley, (4) the phylogenetic assignment of the ancient yeast strain, (5) the yeast admixture profile, and (6) the distribution of marker genes—we assume that this yeast is of ancient origin and has been involved in beer fermentation, although the mode of fermentation is unknown (i.e., bottom, top, or spontaneous fermentation).

Discussion

Our interdisciplinary analyses of the samples have given detailed insight into the microbiome evolution and dietary habits and food processing techniques of the Hallstatt miners over the past three millennia. Molecular and microscopic investigations revealed that the miner’s diet was mainly composed of cereals, such as domesticated wheats (emmer and spelt), barley, common millets, and foxtail millets. This carbohydrate-rich diet was supplemented with proteins from broad beans and occasionally with fruits, nuts, or animal products. The food remains in the protohistoric sample with abundant entire fruits and seeds were less processed than those of the Baroque sample, which consisted of finely ground wheat. This suggests that the protohistoric miners consumed the cereals and legumes in a sort of gruel or porridge, whereas miners in the 18th century AD ate their cereals in a more processed form, e.g., as a bread or biscuit. In general, such carbohydrate-rich fibrous dietary components as observed in the Bronze Age and Iron Age samples are typical for traditional communities and are considered to be the main drivers of the non-Westernized microbiome structure., Consistent with this observation, our analysis showed that the Hallstatt paleofecal samples contain microbial features similar to gut microbiomes of modern non-Westernized populations (Figures 2A and 2B). Species identified in the samples, such as Lactobacillus ruminis, Catenibacterium mitsuokai, and Prevotella copri, were also found to be highly prevalent in present-day individuals with a more traditional lifestyle (Figure 2C). Furthermore, our paleofecal samples were rich in the P. copri complex (Figure 2D), including the four clades that are nearly ubiquitous and co-present in non-Westernized populations. Of particular interest, P. copri members have been shown to be associated with the digestion of complex carbohydrates,,40, 41, 42 which are the major component of an unprocessed fibrous plant diet. Finding paleofeces highly resembling that of non-Westernized individuals, in terms of microbiome structure, supports previous observations., It also adds weight to the hypothesis that the modern industrialized human gut microbiome has diverged from an ancestral state, probably due to modern lifestyle, diet, or medical advances. Interestingly, this non-Westernized microbiome structure has been observed in all four paleofeces dating from the Bronze Age to the Baroque period, which would indicate quite a recent change in the gut community. However, to spot the critical time points when this shift in the human gut microbiome began requires more ancient samples spanning a wider time range; of particular interest would be samples from the past two or three centuries, when major dietary and medical changes occurred. Overall, our results support the theory that the shift from traditional to an industrial Westernized lifestyle might be the driving force for changing the human gut microbiome from its ancestral state.,43, 44, 45 In one of the Iron Age samples (2604), the molecular analyses indicated consumption of fermented food and beverages. The fungal analysis revealed a high prevalence of Penicillium roqueforti and Saccharomyces cerevisiae, which are nowadays involved in fermenting blue cheeses and alcoholic beverages, respectively, with clear signs of domestication. Following rapidly in the wake of ruminant animal domestication (mainly cattle, sheep, and goat), cheese production represents one of the oldest and widespread food preservation techniques developed by humans. The oldest evidence for milk use dates back to the Neolithic in the Fertile Crescent yet provides only indirect evidence for fermentation. The oldest reported chemical evidence for processing of milk into fermented products (i.e., kefir dairy) is dated back to the Early Bronze Age in Western China. Other indications, including actual preserved pieces of cheese, whey strainers, and recipes for cheese production, were found in Northern Europe, the Middle and Near East, and the Mediterranean basin.49, 50, 51, 52 Here, we report evidence for the domestication of the fungus Penicillium roqueforti in the course of food processing in the 1st millennium BC that would likely produce a cheese resembling a blue cheese (non-Roquefort cheese clade, in Figure 4A). To our knowledge, this represents the earliest known evidence for directed cheese ripening and affinage in Europe, adding a crucial aspect to an emerging picture of highly sophisticated culinary traditions in European protohistory. Importantly, the production of blue cheese today involves a surface application of dry salt; therefore, it is characterized by a high salt content of up to 7.5% (w/w). The cheese curd could have been collected, desiccated, and inoculated with the fungi in wooden cheese containers like the ones excavated in the Hallstatt mines (Figure 4C). The presence of P. roqueforti indicates a major step in ruminant milk processing from fresh to ripened cheese, which could have offered, in addition to new flavors, several advantages to the Hallstatt miners including longer storage (i.e., months) and less lactose content in the fermented dairy product. The reduced lactose content may have helped the ancient minors to better digest milk products, living in a time when lactose persistence frequencies only started to rise in Europe. The presence of salt as well as the constant temperature (8°C) and humidity inside the Hallstatt mine workings represent ideal conditions for blue cheese production, following the current cheese production standards. It is noteworthy that the early discovery of the Roquefort cheese was linked to Roquefort-sur-Soulzon caves in France, which maintain a temperature of 10°C and ∼90% humidity over the year. With such conditions protecting the cheese from desiccation, these caves have been used exclusively for centuries for ripening and aging of the “Roquefort” cheese., Indications for the production of fermented alcoholic beverages in protohistory are abundant, albeit frequently ambiguous, and can be found in the Near East, Middle East, Far East, and Europe.60, 61, 62, 63, 64, 65, 66, 67 Evidence for the production of grape wine in Europe and viniculture in the Near East dates back to the 6th and 7th millennium BC. Such evidence was mainly based on chemical residue analysis or archaeobotanical analysis or indicated in ancient inscriptions. Recently scientists claimed that they were able to revive an ancient yeast strain from Egyptian potteries and used it to ferment beer. Here we were able to reconstruct >90% of the S. cerevisiae genome from an Iron Age-dated paleofecal sample. We used different molecular analysis at the genome level to infer the possible routes of domestication for this yeast. Our results suggested it was used in beer fermentation. Together with the results of the dietary analysis that showed presence of different fermentable cereals, e.g., wheat, barley, and millets, we can envisage how the fermentation was carried out. It might be assumed that the fermentation was carried out in a spontaneous manner—i.e., by adding water to wort and allowing the fermentation process to take place by the wild air-borne yeasts or the constitutional microbiota of the used cereals. We do not see, however, indications for other yeasts species, such as Brettanomyces bruxellensis, that co-occur in spontaneously fermented beers. In addition, we see clear indications of domestication and continuous supply of new admixture components to this yeast, which might suggest that fermentation vessels were repeatedly used for this purpose or the inoculation of the fermentation batches has been done by back-slopping (i.e., inoculation of new fermentation batches with portions of previous batches). Albeit varied evidence for beer production in protohistoric Europe exists,,65, 66, 67 these beers could not be preserved for longer time periods and would have had to be consumed rapidly after production, which also presupposes that the beer would have had to be produced either in Hallstatt itself or in the very near surroundings. Considering the constant temperature of 8°C inside the Hallstatt mines, it might be expected that this yeast was used for production of lager-like beer, when fermentation is carried out at low temperatures (also known as bottom-fermentation) and results in a beer that can be stored for longer time periods. Historically, however, the bottom-fermentation was most likely developed after the year 1553, when the Duke of Bavaria Albrecht V forbade brewing during summer months. Additionally, Gonçalves and colleagues demonstrated that Saccharomyces pastorianus strains, which are hybrids of S. cerevisiae and another Saccharomyces species and are used for production of lager beers, belong to the main beer clade (Figure 4D). Therefore, we postulate that the beer produced at that time is similar to what would nowadays be known as pale beer, produced mainly by top-fermenting S. cerevisiae strains. Paleofeces material displays an archaeological information source that provides insights into the diet and gut microbiome composition of ancestors. Here, we had access to four paleofeces samples from the Hallstatt salt mines dating from the Bronze Age to the Baroque period. The constant low annual temperature and high salt concentrations inside the mine preserved both plant macro-remains and biomolecules (DNA and protein) in the paleofeces. We demonstrate the indispensable complementarity of using microscopic and molecular approaches in resolving the paleofecal dietary residual components and to reconstruct the ancient gut microbiome. Furthermore, we extended our paleofeces microbiome analysis to focusing on key microbes that are involved in food processing, which opens new avenues in understanding fermentation history. In the future, additional samples from different time points will provide a more fine-scaled diachronic picture, which may help us to understand the role of dietary changes in shaping our gut microbiome and how much this was further influenced by modern lifestyles or medical advances recently introduced through industrialization and Westernization.

STAR★Methods

Key resources table

Resource availability

Lead contact

Further information on materials, datasets, and protocols should be directed to and will be fulfilled by the lead contact, Frank Maixner (frank.maixner@eurac.edu).

Materials availability

This study did not generate new unique reagents.

Experimental model and subject details

In this study, we subjected four paleofeces samples to in-depth microscopic and molecular analysis. The paleofeces material stems from Bronze Age and Iron Age mine workings in the Hallstatt salt mines in Upper Austria (Figures 1A–1C). See Data S1A for additional details to the samples used in this study.

Method details

Paleofeces samples, radiocarbon dating

Three paleofeces (2610, 2604, 2611) were recovered in 1983, 1989 (2604) and 2003 (2610) through wet-sieving of larger blocks of protohistoric production debris excavated in the mine workings. One additional paleofeces (2612) was sampled in 2019 in situ at the site Edlersbergwerk-oben with sterile sampling tools. This sample was not subjected to wet-sieving. All four samples (using approx. 500 mg each) were subjected to radiocarbon dating at the Curt-Engelhorn-Centre for Archaeometry in Mannheim, Germany (Figure 1C; Data S1B). The 14C content was measured with an AMS system of the MICADAS type. The isotopic ratios 14C/12C and 13C/12C of the samples, the calibration standards (oxalic acid II), the blanks and the control standards were measured simultaneously in the AMS system. The determined 14C ages are based on 13C = −25‰ and were calibrated to calendar ages with the dataset INTCAL20 and the software SwissCal (L.Wacker, ETH-Zurich). The remaining material has been used for microscopic and molecular analyses.

Microscopic analysis of the paleofeces

Before chemical treatment of the paleofeces the surface of each sample was stripped off to avoid contamination. Then rehydration in a 0.5% solution of trisodium phosphate for 72 h. After olfactory and visual testing, the liquid was screened through 500, 250, 125, and 63 μm steel meshes, and the outwash was kept for further microfossil studies. Plant macro-remains were picked out of the fractionated residues in the steel meshes and then identified and quantified under a stereo microscope with magnifications up x63. The identification of the plant remains was conducted with identification keys108, 109, 110, 111 and the reference collection of the Department of Botany, Innsbruck University, was consulted for comparative purposes.

Molecular analysis of the paleofeces

DNA extraction, library preparation and sequencing

The molecular analysis of the paleofeces samples was conducted at the ancient DNA laboratory of the EURAC Institute for Mummy Studies in Bolzano, Italy. Sample documentation, sample preparation and DNA extraction were performed in a dedicated pre-PCR area following the strict procedures required for studies of ancient DNA: use of protective clothing, UV-light exposure of the equipment and bleach sterilization of surfaces, use of PCR workstations and filtered pipette tips. The DNA was extracted from the paleofeces samples (200 mg) using a chloroform-based DNA extraction method according to the protocol of Tang and colleagues with minor modifications. To test whether the wet-sieving step during the classical archaeological excavation results in an “wash-out” effect of DNA from the paleofeces, we re-hydrated additional 200 mg material of two samples (2611, 2612) in 1ml of 0.5N Na3PO4 (tri-sodium phosphate)-buffer for 18 h at RT. After re-hydration, the samples were centrifuged (5000 g, 5 min) and both the supernatant (2611RW, 2612RW) and the pelleted paleofeces (2611R, 2612R) were subjected to DNA extraction as described above. From all DNA extracts double-indexed libraries were generated for the sequencing runs with a modified protocol for Illumina multiplex sequencing., Libraries were first shallow sequenced on an Illumina MiSeq platform and then deep sequenced on Illumina HiSeq2500 and HiSeqX10 platforms using 101–base pair (HiSeq2500) and 151–base pair (MiSeq, HiSeqX) paired-end sequencing kits. For details to the metagenomic shotgun datasets please refer to Data S1C. Data are available from the European Nucleotide Archive under accession number PRJEB44507.

Pre-processing of the sequence data, general taxonomic overview, human DNA analysis

The paired Illumina reads were first quality-checked and processed (adaptor removal and read merging) as previously described in Maixner et al. Initially we tested for the above-described possible DNA wash-out effect by assigning taxonomically the microbial reads using MetaPhlAn 3.0 in the shallow sequenced MiSeq datasets of the untreated samples (2611, 2612), the washed and pelleted paleofeces (2611R, 2612R), and the supernatant (2611RW, 2612RW). For all subsequent analyses, we used the combined sequencing data available for each sample (Data S1C). First, we assessed a general taxonomic profile of the sequencing reads using DIAMOND (v2.0.7) blastx search against the NCBI nr database (Release 237, April 2020). The DIAMOND tables were converted to rma6 (blast2rma tool) format (–minPercentIdentity 97), imported into MEGAN6 software, and subsequently visualized using the Krona tool. Next, we assessed the endogenous human DNA content in the paleofeces samples by aligning the sequence reads against the human genome (build hg19) and the human mtDNA reference genome (rCRS) using BWA with a seed length of 1,000. The minimum mapping and base quality were both 30. To deduplicate the mapped reads, we used the DeDup tool (https://github.com/apeltzer/DeDup). For details to the mapping results please refer to Data S1D. The resulting bam files were used to check for characteristic aDNA nucleotide misincorporation frequency patterns and for the fragment length distribution using the DamageProfiler tool. Mitochondrial human contamination rates were assessed using Schmutzi. The sex of the mapped human reads was assigned using a Maximum likelihood method, based on the karyotype frequency of X and Y chromosomal reads. Variants in the mitochondrial genome were called using SAMtools mpileup and bcftoools with stringent filtering options (quality > 30). The haplogroup was identified by submitting the variant calling file to the HaploGrep website.

Comparing paleofeces microbiome structure to contemporary metagenomic datasets

To compare the paleofeces microbiome structure with contemporary individuals’, we downloaded 9,368 publicly available shotgun metagenomes representing modern-day human populations (n = 9,207), and, as a control, environmental soil sample (n = 161). Each metagenome was characterized by source (human or soil), and in the case of human samples, by body-site, and whether from a Westernized or non-Westernized population. The term “non-Westernized” describes a population that follows a traditional non-urbanized lifestyle encompassing factors such as diet, hygiene, and with no or limited access to medical healthcare and pharmaceuticals (e.g., antibiotics) as previously described.,, Afterward, we performed profiling of the microbial composition of each metagenomic sample including the four paleofeces samples from this study using MetaPhlAn 3.0 using default settings. To survey paleofeces microbiome members in different populations, microbial abundance profiles of four paleofeces were first merged using merge_metaphlan_tables.py, a utility script in MetaPhlAn 3.0. The merged abundance table was then visualized in the form of a hierarchically-clustered heatmap using hclust2 (https://github.com/SegataLab/hclust2) with parameters–ftop 15–f_dist_f braycurtis–s_dist_f braycurtis, with a result of displaying only 15 most abundant microbial species in the four paleofeces samples. We further analyzed the prevalence of these top 15 ancient-sample enriched species in the context of global populations using a subset (n = 8,968) of shotgun metagenomes that are stool samples from healthy adult individuals characterized by non-Westernization (Data S1H). A species was counted as presence in a sample if the relative abundance was above 0.01%. We repeated the prevalence analysis for all species detected in the four paleofeces samples to take low-abundance members into account as well (Data S1I and S1J). Next, we randomly selected 662 metagenomes (132 oral cavity and 530 stool samples) from all publicly available human metagenomes, along with 161 soil metagenomes, for profiling of the metabolic pathways using HUMAnN 3.0 with default settings. The Python script humann_renorm_table.py as part of the pipeline was used to normalize the metabolic pathways to relative abundance, followed by converting individual profiles into a merged abundance table using the utility script humann_join_tables.py. For the same selected metagenomic samples, the individual profiles of microbial composition generated by MetaPhlAn 3.0 as described above were similarly merged into a single abundance table using a python script merge_metaphlan_tables.py. Subsequently, the merged abundance tables were used for calculating Bray-Curtis’s distance with logarithmic transformation based on which a Principal Coordinates Analysis was performed using python package scikit-bio (version 0.5.6; http://scikit-bio.org/). Lastly, to predict the source of microbial communities in the paleofeces samples SourceTracker2 was used as described in a recent study, comparing the paleofeces samples’ abundance profiles with those of different sources (a subset of samples randomly selected from the aforementioned Principal Coordinates Analysis: 16 non-Westernized oral samples, 20 non-Westernized stool samples, 26 soil samples, 10 Westernized oral samples and 30 Westernized stool samples). For details to the comparative datasets please refer to Data S1E and S1G.

Molecular characterization of ancient diet

The merged reads were searched for homology, using BLAST search, against different specific DNA barcoding databases (Data S1N). Specifically, the databases of plant Internal transcribed spacers, RuBisCo-large subunit and Maturase K genes (rbcL/matK, https://v3.boldsystems.org/), and chloroplast genomes (https://www.ncbi.nlm.nih.gov/genome/organelle) were used to identify the plant dietary components. The BOLD database of the cytochrome c oxidase subunit I gene (COI, https://v3.boldsystems.org/) was used to specifically target the animal diet and intestinal parasites. The UNITE database of the fungal ITS gene database (https://unite.ut.ee/) was used to identify potential food processing fungi. Finally, to target all the eukaryotic dietary and diet-related components, the reads were aligned against the currently available full mitochondrial and chloroplast genomes of the NCBI database using BWA with default parameters. Subsequently, to obtain a taxonomic overview of the aligned reads we performed a sequence similarity search using BLASTn with default parameters against the complete NCBI-nt database. Both the resulting BLAST and the previously created DIAMOND tables were converted to rma6 (blast2rma tool) format and imported into MEGAN6 software. The read counts at genus- and species-level were exported as csv files and imported into R-Studio software for further analysis. To distinguish the true hits from the false hits, we applied two filters as follows: i) For the BLAST and DIAMOND tables, we only considered hits of ≥90% identity and ≥90% coverage of each query read; then ii) We considered the sample positive for a certain component only if it shows incidence in majority of databases of that component, e.g., the sample 2604 was considered positive for Bos taurus, because it contained true hits for mitochondrial-, COI-, and nr-databases; and finally iii) for the plant dietary components, we used the plantiTS database as a primary filter, since it is highly curated and recently updated. So, we initially filtered out all the plantiTS negative taxa then considered the same majority rule. Finally, we used transformed sums of counts of the genera and species that passes the aforementioned filters into normalized log base 2 (log10 reads million-1) and plotted them as dot-plots proportional to their abundances. The dendrogram to the left of the dot-plot represent the phylogeny based on NCBI taxonomy (https://www.ncbi.nlm.nih.gov/Taxonomy/Browser/wwwtax.cgi). Selected identified taxa were further subjected to in-depth analysis by mapping the quality-filtered sequence reads against organellar (mitochondrial and chloroplast) and nuclear genomes using bowtie2 (v1.2.1.1) and the parameter “end-to-end” (for details to the references used please refer to the Data S1O). In two cases where the previously described molecular dietary analysis resulted in an assignment down to the genus level (Triticum spp., Ascaris spp.) we assessed an appropriate reference sequence by mapping the sequence reads against selected chloroplast and mitochondrial genomes using BWA (with default parameters) implemented in the program FastQ Screen (https://github.com/StevenWingett/FastQ-Screen), and selected as reference the species that belongs to one of these two genera and has the most specific hits. After the bowtie2 alignment against the reference sequences the mapped reads were deduplicated and checked for damage patterns as described above for the human DNA. For details to the mapping results please refer to Data S1O and S1P. Organellar sequences that showed higher than 70% sequence coverage were further subjected to phylogenetic analysis as previously detailed. In brief, a consensus FASTA sequence was generated using the ANGSD tool and together with other comparative datasets subjected to a multiple alignment using the MAFFT multiple sequence alignment program. Phylogenetic analysis were performed by applying distance-matrix, maximum-parsimony, and maximum-likelihood methods implemented in the ARB software package: neighbor-joining (using the Jukes-Cantor algorithm for nucleic acid correction with 1000 bootstrap iterations), DNA parsimony (PHYLIP version 3.66 with 100 bootstrap iterations), and DNA maximum-likelihood [PhyML with the HKY substitution model]. The number of informative nucleotide positions used for the phylogenetic analysis and the bootstrap support is indicated in the respective figure captions.

Genome-level analysis of ancient fungi - Variant calling

The sample 2604 showed high prevalence of Penicillium roqueforti and Saccharomyces cerevisiae DNA allowing further genome-level comparative analysis with modern datasets (Data S1Q and S1R). As comparative datasets, we included the recently published genomic data of 34 P. roqueforti as well as 157 S. cerevisiae. We mapped the quality-filtered reads against the reference genomes of Penicillium roqueforti FM164 and Saccharomyces cerevisiae S288c, independently, using bowtie2 (v1.2.1.1) and the parameter “end-to-end.” Using samtools, the mapping qualities of <30 were filtered out and the bam files were sorted and indexed. The read duplicates were marked and removed using the DeDup tool (https://github.com/apeltzer/DeDup/) and new read-groups were assigned using the “AddOrReplaceReadGroups” of Picard tools (https://broadinstitute.github.io/picard/). Then, we used the Genome Analysis Toolkit 4 (GATK4) (https://gatk.broadinstitute.org/hc) to call genome variants, following the best practices workflow as follows: i) Variants within each of the single bam files were called using the tool “HaplotypeCaller”; ii) The resulting VCF files were combined into a single file using the tool “CombineGVCFs”; iii) The combined VCF file was then genotyped using the tool “GenotypeGVCFs”; iv) InDels were filtered out and only single nucleotide polymorphisms (SNPs) were kept using the tool “SelectVariants”; and finally v) the VCF file was filtered to include SNPs with minor allele frequency (MAF) of ≥0.05-0.1 and ≤10% missing data. A total of 120359 and 375629 SNPs were resolved for P. roqueforti and S. cerevisiae, respectively. The total number of positions with alternative alleles to the reference genome accounted for 0.97% in case of P. roqueforti and 10.05% in case of S. cerevisiae. More than 99% of the positions with alternative alleles have a major allele frequency of > 90%, which minimize the likelihood of having chimeric alignments. In addition, we applied majority rule selection for the positions with multiple alleles, thereby analyzing only the major allele type.

Phylogenetic- and population structure analysis

To analyze the phylogenetic relationship of the strains, we used the vcf-kit to convert the SNP datasets (vcf files) to fastA alignment format. Then we used the Fasta2phylip.pl script to create PHYLIP alignment files (https://mullinslab.microbiol.washington.edu/), which were then subjected to maximum likelihood (ML) phylogenetic analysis using RAxML (v8.2.12), following generalized time reversible substitution model (GTR) and GAMMA model of rate heterogeneity. Best-scoring ML trees were searched after 1000 bootstrap replicates and visualized and annotated using the interactive tree of life, iToL tool (https://itol.embl.de). To infer the degree of admixture and the number of populations in the analyzed datasets (120,337 SNPs for P. roqueforti and 136,712 S. cerevisiae), we carried out unsupervised population structure analysis using ADMIXTURE (v1.3.0), testing K in the range of 3 to 22. The best K values were determined based on the lowest cross-validation error (cv) and maximum likelihood.

Functional marker genes

To infer the potential functions of the modern and ancient Saccharomyces cerevisiae yeast strains based on their genome sequence, we searched for the presence of functional marker genes (RTM1, BIO1/BIO6, and Regions A/B/C) that were recently described by Cheeseman et al. The RTM1 gene is responsible for conferring resistance to the toxicity of molasses, and therefore it is assumed to be positively selected in beer yeast strains. While the genes BIO1 and BIO6, which are involved in de novo synthesis of biotin, are highly selected in sake fermenting yeasts. The regions A, B, and C are horizontally transferred genomic regions from other yeast genera, e.g., Kluyveromyces, Pichia, and Zygosaccharomyces. These regions contain 39 genes distributed over 3 different chromosomes (Data S1S) and are assumed to play a role in wine fermentation. To search for these marker genes, we used bowtie2 to map the quality filtered reads, of our sample as well as the recently published S. cerevisiae comparative genomic data, against the individual genes. Then, we considered a gene as present if it is covered ≥90% with 3x depth (Data S1T). For phylogenetic tree annotation (Figure 4D), the regions A/B/C were considered present if any of the genes was positive, while BIO1/BIO6 were considered positive if both genes were positive.

Proteomic analysis

Sample preparation

Paleofeces samples (ID 2610, 2604, 2611, 2612) and a blank control sample to highlight any contamination were resuspended in 800 μL 5% (w/v) sodium dodecyl sulfate (SDS), 50 mM triethylammonium bicarbonate (TEAB, pH 8.5) and disrupted at 4°C using three 2.8 mm ceramic beads (QIAGEN, USA) and a Precellys 24 homogenizer (Bertin Corp, USA) at 6500 rpm for 30 s. Samples were transferred to Eppendorf tubes and centrifuged at 4,000 rpm for 7 min, the supernatant removed and centrifuged twice at 13,000 g for 10 min. The supernatant was subjected to the S-Trap mini spin column digestion protocol (ProtiFi, USA). Briefly, samples were acidified with 12% phosphoric acid in water (pH ≤1). Binding/wash buffer (100 mM TEAB (final) in 90% methanol) was added to each sample at 6.4 x the volume of each sample. Samples were vortexed and applied to the S-Trap column in ≤500 μL aliquots until the entire sample was loaded (J. Wilson, personal communication). After each sample load the S-Trap column was centrifuged at 4,000 g for 30 s. Samples were washed to remove any residual SDS by adding 3x 400 μL binding/wash buffer and centrifugation at 4,000 g for 30 s after each wash, and 1 min after the last wash. Proteins were digested on the S-Trap at 37°C for 13 h by adding 125 μL digestion buffer containing trypsin gold (Promega, USA) in 125 mM NH4HCO3. The S-Trap was rehydrated with 100 μL 125 mM NH4HCO3 and peptides eluted with 80 μL 125 mM NH4HCO3, 80 μL 0.2% (v/v) formic acid in water, and 80 μL 50% (v/v) acetonitrile in water. Peptide eluates were pooled and dried under centrifugal evaporation (Savant, Thermo-Fisher Scientific, USA).

Liquid chromatography - mass spectrometry analysis (LC-MS/MS)

Samples were analyzed by high-resolution nano LC-MS/MS on an Orbitrap-Eclipse Tribrid mass spectrometer (Thermo-Fisher Scientific, USA) equipped with an Easy-nLC 1000 (Thermo-Fisher Scientific, USA). Peptides were resolubilized in 50 μL 0.1% (v/v) formic acid in water. 5 μL of each sample were loaded onto a 2 cm Acclaim PepMap 100 trap column (75 um ID, C18 3 μm (dp), Thermo-Fisher Scientific, USA) and separated using a 50 cm C18 2 μm (dp) Easy Spray column (ES903, Thermo-Fisher Scientific, USA) using a 120-min gradient. Mobile phase A consisted of 0.1% (v/v) formic acid in water, and mobile phase B consisted of 0.1% (v/v) formic acid in acetonitrile. The gradient used 5% to 22% mobile phase B over 105 min, followed by separation from 22% to 32% mobile phase B over the remaining 15 min. The column was washed and equilibrated at 95% and 5% mobile phase B, respectively, over 40 min following each run. The flow rate was set at 300 nL/min and the column was heated at 45°C. Mass spectra were acquired on the Orbitrap-Eclipse mass spectrometer using data dependent acquisition (DDA) with dynamic exclusion. The precursor scan range was from 375-1500 m/z at 120,000 resolution and 100% normalized target AGC (4e5) with a 50 ms maximum injection time. The duty cycle was set to acquire as many MS/MS spectra as possible over a three second period following each precursor scan. A 1.2 m/z selection window was used to acquire MS/MS spectra at 30,000 resolution, a normalized AGC target of 100% (5e4), 54 ms maximum injection time, and fragmented using HCD with a normalized collision energy of 30. Dynamic exclusion was set to 60 s, with peptide match set to preferred and isotope exclusion turned on. Charge exclusion was set to 1 and greater than 7.

Protein data analysis

The raw mass spectra files were converted to mzML format using msConvert from ProteoWizard and analyzed using the Trans-Proteomic Pipeline (TPP v6.0.0-rc16 Noctilucent). The analysis pipeline consisted of database searching with Comet (version 2018.01 rev. 4) against a 17-species UniProt FASTA database (Data S2A) and shuffled decoy sequences. The 17 species were selected based on the microscopic and metagenomics analyses described above. An equal number of decoy sequences were appended to the FASTA database for downstream statistical analysis and validation of any observed peptide sequences, as described below. For details to the comparative datasets used in this study and to the protein analysis results please refer to Data S2. Comet parameters included variable modifications of +15.994915 Da (Met, Cys, His, Tyr, Trp, and Phe), and +0.984016 (Asn and Gln), as these types of protein degradations are commonly found in aged samples. A precursor tolerance of 20 ppm was set, a fragment bin tolerance of 0.2 and fragment bin offset of 0. Semi-enzymatic cleavage with up to 3 missed cleavages was allowed, and an isotope error tolerance of 3. Peptide-spectrum matches (PSMs) were validated using PeptideProphet and iProphet. Protein inference was performed using ProteinProphet. Relative quantitation of protein groups was performed using StPeter. Observed protein groups for each sample were derived from the set of protein groups validated with ProteinProphet at an estimated 1% false discovery rate (FDR). Those protein groups were then curated by removing any group that intersected with the set of protein groups identified in the blank control analysis. A list of observed peptides was created from the peptide sequences that were used to infer the final set of protein groups. The peptide list was then filtered to contain only those peptides with a probability below a 1% FDR threshold as determined from the iProphet analysis. The mass spectrometry proteomics data have been deposited to the ProteomeXchange Consortium via the PRIDE partner repository with the dataset identifier PXD027613. The identified peptides, evaluated by statistical analysis to 1% false discovery rate, were further taxonomically screened for homologous peptides and subsequently filtered for peptide sequences that can be unambiguously assigned to certain plants, animals, and fungi. Therefore, the peptide sequences of each sample were imported into Unipept 4.0 Desktop version, and subjected to LCA assignment, and filtered out the following: i) peptides assigned to bacteria; ii) peptides assigned to multiple species; iii) single peptide incidences that uniquely infer a species presence (Data S2). Finally, functional annotation of the samples was carried out on different taxonomic levels to infer the molecular function of the obtained peptides.

Quantification and statistical analysis

Phylogenetic analyses were carried out based on the maximum-likelihood method using the tool Randomized Axelerated Maximum Likelihood (RAxML) with 1000 bootstrap replicates. For the dietary analysis, majority rule-based selection was implemented using in-house R-script in R-Studio (https://www.rstudio.com/).

REAGENT or RESOURCE	SOURCE	IDENTIFIER
Biological samples

Paleofeces sample from the Bronze Age	This study	2610
Paleofeces sample from the Iron Age	This study	2604
Paleofeces sample from the Iron Age	This study	2611
Paleofeces sample from the Baroque time	This study	2612

Chemicals, peptides, and recombinant proteins

Sodium phosphate	Sigma-Aldrich	Cat #342483
Trypsin gold	Promega	Cat # V528A

Critical commercial assays

S-Trap	ProtiFi	Cat #K02-mini-10

Deposited data

Paleofeces metagenomic shotgun datasets	This study	ENA: PRJEB44507
NCBI-nr database	⁷⁵	https://www.ncbi.nlm.nih.gov/protein/
NCBI-nt database	⁷⁵	https://www.ncbi.nlm.nih.gov/nucleotide/
PlantiTS database	⁷⁶	https://github.com/apallavicini/PLANiTS
Chloroplast genome database	N/A	https://www.ncbi.nlm.nih.gov/genome/organelle
Fungal ITS database	⁷⁷	https://unite.ut.ee/
BOLD system databases	⁷⁸	https://v3.boldsystems.org/

Software and algorithms

MetaPhlAn	²¹	https://github.com/biobakery/MetaPhlAn/wiki/MetaPhlAn-3.0
DIAMOND	⁷⁹	https://github.com/bbuchfink/diamond
MEGAN6	⁸⁰	https://www.wsi.uni-tuebingen.de/lehrstuehle/algorithms-in-bioinformatics/software/megan6
Krona tool	⁸¹	https://github.com/marbl/Krona/wiki
BWA	⁸²	http://bio-bwa.sourceforge.net/
DeDup tool	N/A	https://github.com/apeltzer/DeDup
DamageProfiler	⁸³	https://damageprofiler.readthedocs.io/en/latest/index.html
Schmutzi	⁸⁴	https://github.com/grenaud/schmutzi
Molecular sex determination	⁸⁵	https://github.com/pontussk/ry_compute
SAMtools	⁸⁶	http://samtools.github.io/
HaploGrep	⁸⁷	https://haplogrep.i-med.ac.at/
HUMAnN	²¹	https://github.com/biobakery/humann
python package scikit-bio	N/A	http://scikit-bio.org/
bowtie2	⁸⁸	http://bowtie-bio.sourceforge.net/bowtie2/
FastQ Screen	N/A	https://github.com/StevenWingett/FastQ-Screen
ANGSD tool	⁸⁹	https://github.com/ANGSD/angsd
MAFFT	⁹⁰	https://mafft.cbrc.jp/alignment/software/
ARB software package	⁹¹	http://www.arb-home.de/
PhyML	⁹²	https://github.com/stephaneguindon/phyml
BLASTn	⁹³	N/A
Picard tools	N/A	https://broadinstitute.github.io/picard/
GATK4	N/A	https://gatk.broadinstitute.org/hc
vcf-kit	⁹⁴	https://vcf-kit.readthedocs.io/en/latest/
RAxML	⁹⁵	https://github.com/stamatak/standard-RAxML
Interactive Tree of Life (iToL)	⁹⁶	https://itol.embl.de/
ADMIXTURE	⁹⁷	http://dalexander.github.io/admixture/download.html
R-Studio	N/A	https://www.rstudio.com/
ProteoWizard	⁹⁸	https://proteowizard.sourceforge.io/
Trans-Proteomic Pipeline	⁹⁹	http://tools.proteomecenter.org/software.php
PeptideProphet	¹⁰⁰	http://peptideprophet.sourceforge.net/
ProteinProphet	¹⁰¹	http://proteinprophet.sourceforge.net/
StPeter	¹⁰²	http://tools.proteomecenter.org/wiki/index.php?title=Software:StPeter
Comet	¹⁰³	http://comet-ms.sourceforge.net/
Unipept 4.0	¹⁰⁴	https://unipept.ugent.be/

88 in total

Review 1. Evolutionary biology through the lens of budding yeast comparative genomics.

Authors: Souhir Marsit; Jean-Baptiste Leducq; Éléonore Durand; Axelle Marchant; Marie Filteau; Christian R Landry
Journal: Nat Rev Genet Date: 2017-07-17 Impact factor: 53.242

Review 2. Trans-Proteomic Pipeline, a standardized data processing pipeline for large-scale reproducible proteomics informatics.

Authors: Eric W Deutsch; Luis Mendoza; David Shteynberg; Joseph Slagel; Zhi Sun; Robert L Moritz
Journal: Proteomics Clin Appl Date: 2015-04-02 Impact factor: 3.494

3. HYPERsol: High-Quality Data from Archival FFPE Tissue for Clinical Proteomics.

Authors: Dylan M Marchione; Ilyana Ilieva; Kyle Devins; Danielle Sharpe; Darryl J Pappin; Benjamin A Garcia; John P Wilson; John B Wojcik
Journal: J Proteome Res Date: 2020-01-14 Impact factor: 4.466

4. Distinct Polysaccharide Utilization Determines Interspecies Competition between Intestinal Prevotella spp.

Authors: Eric J C Gálvez; Aida Iljazovic; Lena Amend; Till Robin Lesker; Thibaud Renault; Sophie Thiemann; Lianxu Hao; Urmi Roy; Achim Gronow; Emmanuelle Charpentier; Till Strowig
Journal: Cell Host Microbe Date: 2020-10-27 Impact factor: 21.023

5. Eukaryote-to-eukaryote gene transfer events revealed by the genome sequence of the wine yeast Saccharomyces cerevisiae EC1118.

Authors: Maite Novo; Frédéric Bigey; Emmanuelle Beyne; Virginie Galeote; Frédérick Gavory; Sandrine Mallet; Brigitte Cambon; Jean-Luc Legras; Patrick Wincker; Serge Casaregola; Sylvie Dequin
Journal: Proc Natl Acad Sci U S A Date: 2009-09-09 Impact factor: 11.205

6. RAxML version 8: a tool for phylogenetic analysis and post-analysis of large phylogenies.

Authors: Alexandros Stamatakis
Journal: Bioinformatics Date: 2014-01-21 Impact factor: 6.937

7. Revealing the constituents of Egypt's oldest beer using infrared and mass spectrometry.

Authors: Mohamed A Farag; Moamen M Elmassry; Masahiro Baba; Renée Friedman
Journal: Sci Rep Date: 2019-11-07 Impact factor: 4.379

8. PLANiTS: a curated sequence reference dataset for plant ITS DNA metabarcoding.

Authors: Elisa Banchi; Claudio G Ametrano; Samuele Greco; David Stanković; Lucia Muggia; Alberto Pallavicini
Journal: Database (Oxford) Date: 2020-01-01 Impact factor: 3.451

9. Dig out, Dig in! Plant-based diet at the Late Bronze Age copper production site of Prigglitz-Gasteil (Lower Austria) and the relevance of processed foodstuffs for the supply of Alpine Bronze Age miners.

Authors: Andreas G Heiss; Thorsten Jakobitsch; Silvia Wiesinger; Peter Trebsche
Journal: PLoS One Date: 2021-03-24 Impact factor: 3.240

10. Adaptive Horizontal Gene Transfers between Multiple Cheese-Associated Fungi.

Authors: Jeanne Ropars; Ricardo C Rodríguez de la Vega; Manuela López-Villavicencio; Jérôme Gouzy; Erika Sallet; Émilie Dumas; Sandrine Lacoste; Robert Debuchy; Joëlle Dupont; Antoine Branca; Tatiana Giraud
Journal: Curr Biol Date: 2015-09-24 Impact factor: 10.834

1 in total

1. Positive selection acts on regulatory genetic variants in populations of European ancestry that affect ALDH2 gene expression.

Authors: Helmut Schaschl; Tobias Göllner; David L Morris
Journal: Sci Rep Date: 2022-03-16 Impact factor: 4.379

1 in total