Maxime Borry1, Bryan Cordova1, Angela Perri2,3, Marsha Wibowo4,5,6, Tanvi Prasad Honap7,8, Jada Ko9, Jie Yu10, Kate Britton3,11, Linus Girdland-Flink11,12, Robert C Power3,13, Ingelise Stuijts14, Domingo C Salazar-García15,16, Courtney Hofman7,8, Richard Hagan1, Thérèse Samdapawindé Kagoné17, Nicolas Meda17, Helene Carabin18, David Jacobson7,8, Karl Reinhard19, Cecil Lewis7,8, Aleksandar Kostic4,5,6, Choongwon Jeong1,20, Alexander Herbig1, Alexander Hübner1, Christina Warinner1,9,21.
Abstract
Shotgun metagenomics applied to archaeological feces (paleofeces) can bring new insights into the composition and functions of human and animal gut microbiota from the past. However, paleofeces often undergo physical distortions in archaeological sediments, making their source species difficult to identify on the basis of fecal morphology or microscopic features alone. Here we present a reproducible and scalable pipeline using both host and microbial DNA to infer the host source of fecal material. We apply this pipeline to newly sequenced archaeological specimens and show that we are able to distinguish morphologically similar human and canine paleofeces, as well as non-fecal sediments, from a range of archaeological contexts.
Keywords: Archeology; Coprolite; Dog; Endogenous DNA; Gut; Human; Machine learning; Microbiome; Nextflow; Paleofeces
Year: 2020 PMID: 32337106 PMCID: PMC7169968 DOI: 10.7717/peerj.9001
Source DB: PubMed Journal: PeerJ ISSN: 2167-8359 Impact factor: 2.984
Figure 1. Examples of archaeological paleofeces analyzed in this study.
(A) H29-3, from Anhui Province, China, Neolithic period; (B) Zape 2, from Durango, Mexico, ca. 1300 BP; (C) Zape 28, from Durango, Mexico, ca. 1300 BP. Paleofeces ranged from slightly mineralized intact pieces (A) to more fragmentary organic states (B and C), and color ranged from pale gray (A) to dark brown (C).
Modern reference microbiome datasets.
| Metagenome source | Food production | Samples (n) | Analysis | Source |
|---|---|---|---|---|
| Homo sapiens | WHU | 36 | microbiome | |
| Homo sapiens | WHU and NWHR | 19 | microbiome | |
| Homo sapiens | NWHR | 20 | microbiome | |
| Homo sapiens | NWHR | 110 | microbiome | |
| Homo sapiens | NWHR | 3 | microbiome | |
| Homo sapiens | NWHR | 12 | microbiome | |
| Homo sapiens | NWHR | 38 | microbiome | |
| Homo sapiens | NWHR | 24 | microbiome | |
| Homo sapiens | WHU | 49 | host DNA | This study |
| Homo sapiens | NWHR | 69 | host DNA | This study |
| Canis familiaris | – | 150 | microbiome and host DNA | |
| Soil | – | 16 | microbiome | |
| Soil | – | 2 | microbiome | |
| Soil | – | 2 | microbiome | |
Archaeological samples.
| Archaeological ID | Laboratory ID | Site Name | Region | Period | Sample type | Archaeologically suspected species | Plot ID |
|---|---|---|---|---|---|---|---|
| Zape 2 | ZSM002 | Cueva de los Muertos Chiquitos | Mexico | 1300 BP | Paleofeces | HUMAN | 01 |
| Zape 5 | ZSM005 | Cueva de los Muertos Chiquitos | Mexico | 1300 BP | Paleofeces | HUMAN | 02 |
| Zape 23 | ZSM023 | Cueva de los Muertos Chiquitos | Mexico | 1300 BP | Paleofeces | HUMAN or CANID | 03 |
| Zape 25 | ZSM025 | Cueva de los Muertos Chiquitos | Mexico | 1300 BP | Paleofeces | HUMAN | 04 |
| Zape 27 | ZSM027 | Cueva de los Muertos Chiquitos | Mexico | 1300 BP | Paleofeces | HUMAN | 05 |
| Zape 28 | ZSM028 | Cueva de los Muertos Chiquitos | Mexico | 1300 BP | Paleofeces | HUMAN | 06 |
| Zape 29 | ZSM029 | Cueva de los Muertos Chiquitos | Mexico | 1300 BP | Paleofeces | HUMAN | 07 |
| Zape 31 | ZSM031 | Cueva de los Muertos Chiquitos | Mexico | 1300 BP | Paleofeces | HUMAN | 08 |
| H29-1 | AHP001 | Xiaosungang | China | Neolithic 7200–6800 BP | Paleofeces | CANID or CERVID | 09 |
| H35-1 | AHP002 | Xiaosungang | China | Neolithic 7200–6800 BP | Paleofeces | CANID or CERVID | 10 |
| H29-2 | AHP003 | Xiaosungang | China | Neolithic 7200–6800 BP | Paleofeces | CANID or CERVID | 11 |
| H29-3 | AHP004 | Xiaosungang | China | Neolithic 7200–6800 BP | Paleofeces | CANID or CERVID | 12 |
| LG 4560.69 | YRK001 | Surrey | UK | Post-Medieval | Paleofeces | HUMAN | 13 |
| AP3-C197S163 | DRL001.A | Derragh | Ireland | Mesolithic | Midden Sediment | – | 14 |
| AP4-A6-2860 | CBA001.A | Cabeço das Amoreiras | Portugal | Mesolithic | Midden Sediment | – | 15 |
| AP5-798-162 | BRF001.A | Binchester Roman Fort | England | Roman | Midden Sediment | – | 16 |
| AP6-LPZ702 | LEI010.A | Leipzig | Germany | 10th–11th century AD | Midden Sediment | – | 17 |
| AP7-6-28353 | ECO004.D | El Collado | Spain | Mesolithic | Pelvic Sediment | – | 18 |
| AP8-CMN-M1 | CMN001.D | Cingle del Mas Nou | Spain | Mesolithic | Pelvic Sediment | – | 19 |
| AP9-17590 | MLP001.A | Molpir | Slovakia | 7th century BC | Pelvic Sediment | – | 20 |
Note:
Metagenomic data were previously published in Hagan et al. (2020).
Figure 2. Workflow schematic of the coproID pipeline.
CoproID consists of five steps: Preprocessing (orange), Mapping (blue), Computing host DNA content for each metagenome (red), Metagenomic profiling (green), and Reporting (violet). Individual programs (square boxes) are colored by category (rounded boxes).
Figure 3. Gut microbiome host DNA content.
The median percentage of host DNA in the gut microbiome and the number of samples in each group are displayed beside each boxplot.
Statistical comparison of reference gut host DNA content.
Mann–Whitney U test for independent observations. H0: the distributions of both populations are equal.
| Comparison | Mann–Whitney U | p-value |
|---|---|---|
| Dog vs NWHR | 3327.0 | <0.0001 |
| Dog vs WHU | 41.0 | <0.0001 |
| NWHR vs WHU | 370.0 | <0.0001 |
| Dog vs Human | 3368.0 | <0.0001 |
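The test named above can be sketched in plain Python. This is a minimal illustration, not the code used in the study: the function name `mann_whitney_u` is hypothetical, the p-value uses the large-sample normal approximation with a tie correction and no continuity correction, and the statistic returned is U for the first sample, which may differ from the convention behind the table values.

```python
import math


def mann_whitney_u(x, y):
    """Two-sided Mann-Whitney U test (normal approximation, tie-corrected).

    Returns (U statistic of sample x, approximate two-sided p-value).
    """
    n1, n2 = len(x), len(y)
    # Pool both samples, remembering which group each value came from.
    pooled = sorted([(v, 0) for v in x] + [(v, 1) for v in y])
    ranks = [0.0] * len(pooled)
    tie_term = 0.0
    i = 0
    while i < len(pooled):
        j = i
        while j < len(pooled) and pooled[j][0] == pooled[i][0]:
            j += 1
        # Tied values share the average of the ranks i+1 .. j.
        avg_rank = (i + 1 + j) / 2.0
        for k in range(i, j):
            ranks[k] = avg_rank
        t = j - i
        tie_term += t ** 3 - t
        i = j
    rank_sum_x = sum(r for r, (_, group) in zip(ranks, pooled) if group == 0)
    u_x = rank_sum_x - n1 * (n1 + 1) / 2.0
    n = n1 + n2
    mean_u = n1 * n2 / 2.0
    var_u = n1 * n2 / 12.0 * ((n + 1) - tie_term / (n * (n - 1)))
    if var_u == 0:
        return u_x, 1.0  # all values identical: no evidence against H0
    z = (u_x - mean_u) / math.sqrt(var_u)
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return u_x, p_value
```

For group sizes as small as a handful of samples, an exact test is usually preferable to the normal approximation used here.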
Figure 4. The effect of filtering for damaged reads using PMD.
The log2 of the human NormalizedHostDNA is graphed against the log2 of the dog NormalizedHostDNA. Squares represent samples before filtering by PMD, whereas crosses represent samples after filtering by PMD. Dotted lines show the correspondence between samples. The red diagonal line marks the boundary between the two species, and the grey shaded area indicates a zone of species uncertainty (±1 log2FC) due to insufficient genetic information.
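The boundary and uncertainty zone described in this caption can be expressed as a simple rule on the log2 fold change between the two normalized host DNA values. The sketch below is an illustration under assumptions: the function name `call_host_species`, the treatment of zero values, and the strict inequality at the zone edge are hypothetical choices, not taken from coproID.

```python
import math


def call_host_species(human_norm, dog_norm, margin=1.0):
    """Label a sample 'human', 'dog', or 'uncertain'.

    human_norm, dog_norm: normalized host DNA values for each candidate host.
    margin: half-width of the uncertainty zone in log2 fold-change units.
    """
    if human_norm <= 0 and dog_norm <= 0:
        return "uncertain"  # assumption: no host signal means no call
    if dog_norm <= 0:
        return "human"  # assumption: signal from only one host is a clear call
    if human_norm <= 0:
        return "dog"
    log2_fc = math.log2(human_norm) - math.log2(dog_norm)
    if log2_fc > margin:
        return "human"
    if log2_fc < -margin:
        return "dog"
    return "uncertain"  # inside the grey zone around the diagonal


print(call_host_species(8.0, 1.0))
print(call_host_species(2.0, 1.5))
```

A sample on the diagonal has a log2 fold change of zero, so a `margin` of 1.0 corresponds to one host value being at most twice the other.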
Figure 5. Embedding of reference modern gut microbiomes.
(A) t-SNE embedding of the species composition based on pairwise weighted UniFrac distances for modern gut microbiome training samples. Samples are colored by their actual source. (B) t-SNE embedding of the species composition based on pairwise weighted UniFrac distances for source prediction of modern test samples. The outer circle color is the actual source of a sample, while the inner circle color is the source predicted by Sourcepredict.
Figure 6. Prediction of archaeological sample sources and t-SNE embedding by Sourcepredict.
t-SNE embedding of archaeological (crosses) and modern (hexagons) samples. The color of the modern samples is based on their actual source, while the color of the archaeological samples is based on the source predicted by Sourcepredict. Archaeological samples are labelled with their Plot ID (Table 2).
Figure 7. coproID source prediction.
Predicted human proportion graphed versus predicted canine proportion. Samples are colored by their predicted source proportions. Samples with low canine and human proportions are not annotated.
Figure 8. Host DNA and Sourcepredict source prediction for paleofeces samples.
Predictions for human (A) and canine (B) sources. The vertical bar represents the proportion predicted by host DNA (lighter fill) or by Sourcepredict (darker fill). The horizontal dashed line represents the confidence threshold to assign a source to a sample.
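One way to read the confidence threshold in this figure is as a cut-off that both lines of evidence must clear before a source is assigned. The sketch below illustrates that reading only: the function name `assign_source`, the default `threshold` of 0.5, and the rule that both proportions must pass are assumptions, since the caption does not state the threshold value or the exact decision logic.

```python
def assign_source(host_dna_prop, sourcepredict_prop, threshold=0.5):
    """Assign a source only when both predictions clear the threshold.

    host_dna_prop, sourcepredict_prop: dicts mapping a source name
    (for example "human" or "dog") to its predicted proportion.
    threshold: hypothetical confidence cut-off; not a value from the source.
    """
    confident = [
        source
        for source, proportion in host_dna_prop.items()
        if proportion >= threshold
        and sourcepredict_prop.get(source, 0.0) >= threshold
    ]
    # Exactly one source must pass on both lines of evidence.
    return confident[0] if len(confident) == 1 else "uncertain"


print(assign_source({"human": 0.9, "dog": 0.1}, {"human": 0.8, "dog": 0.1}))
print(assign_source({"human": 0.9, "dog": 0.1}, {"human": 0.2, "dog": 0.7}))
```

Requiring agreement keeps samples with conflicting host DNA and microbiome signals, or with little signal of either kind, out of the confidently assigned set.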