Raphael Eisenhofer, Laura Susan Weyrich.
Abstract
The field of palaeomicrobiology (the study of ancient microorganisms) is growing rapidly due to recent methodological and technological advances. It is now possible to obtain vast quantities of DNA data from ancient specimens in a high-throughput manner and to use this information to investigate the dynamics and evolution of past microbial communities. However, we still know very little about how the characteristics of ancient DNA influence our ability to accurately assign microbial taxonomies (i.e. identify species) within ancient metagenomic samples. Here, we use both simulated and published metagenomic data sets to investigate how ancient DNA characteristics affect alignment-based taxonomic classification. We find that nucleotide-to-nucleotide, rather than nucleotide-to-protein, alignments are preferable when assigning taxonomies to the short DNA fragments (<60 bp) routinely found in ancient specimens. We determine that deamination (a form of ancient DNA damage) and random sequence substitutions corresponding to ∼100,000 years of genomic divergence have minimal impact on alignment-based classification. We also test four different reference databases and find that database choice can significantly bias the results of alignment-based taxonomic classification in ancient metagenomic studies. Finally, we reanalyse previously published ancient dental calculus data, increasing the number of taxonomically assigned microbial DNA sequences by an average of 64.2-fold and identifying microbial species that were not identified in the original study. Overall, this study enhances our understanding of how ancient DNA characteristics influence alignment-based taxonomic classification of ancient microorganisms and provides recommendations for future palaeomicrobiological studies.
Keywords: Alignment; Ancient DNA; Bioinformatics; Microbiology; Microbiome; Palaeomicrobiology; Shotgun metagenomics; Taxonomic classification
Year: 2019; PMID: 30886779; PMCID: PMC6420809; DOI: 10.7717/peerj.6594
Source DB: PubMed; Journal: PeerJ; ISSN: 2167-8359; Impact factor: 2.984
Figure 1. General overview of simulated data construction and analysis.
Figure 2. Percentage of reads assigned taxonomy using simulated metagenomes of empirical ancient DNA fragment length against different MALT databases.
Percentages of total reads assigned at different taxonomic levels with different read length cut-offs.
| Fragment length and database | Reads assigned total (%) | Reads assigned genus (%) | Reads assigned species (%) |
|---|---|---|---|
| 30 bp_MALTn-Genome | 100 | 100 | 97 |
| 30 bp_MALTn-CDS | 86 | 86 | 83 |
| 30 bp_MALTx | 0 | 0 | 0 |
| 50 bp_MALTn-Genome | 100 | 100 | 98 |
| 50 bp_MALTn-CDS | 88 | 88 | 86 |
| 50 bp_MALTx | 0 | 0 | 0 |
| 70 bp_MALTn-Genome | 100 | 100 | 98 |
| 70 bp_MALTn-CDS | 90 | 90 | 88 |
| 70 bp_MALTx | 33 | 31 | 25 |
| 90 bp_MALTn-Genome | 100 | 100 | 98 |
| 90 bp_MALTn-CDS | 91 | 91 | 89 |
| 90 bp_MALTx | 82 | 75 | 55 |
| Empirical_MALTn-Genome | 99 | 98 | 97 |
| Empirical_MALTn-CDS | 87 | 87 | 86 |
| Empirical_MALTx | 16 | 14 | 10 |
Figure 3. Species-level taxonomic classification of the empirical fragment length simulated metagenome.
Species coloured black were not used as input for constructing the simulated metagenomes and are misclassifications.
Effects of deamination on taxonomic classification of empirical ancient DNA read-length distribution.
| Database and deamination setting (δs) | Reads assigned total (%) | Reads assigned genus (%) | Reads assigned species (%) |
|---|---|---|---|
| MALTn-genome_0δs | 98.6 | 98.4 | 96.6 |
| MALTn-genome_10δs | 98.4 | 98.2 | 96.5 |
| MALTn-genome_20δs | 98.5 | 98.3 | 96.5 |
| MALTn-genome_50δs | 97.7 | 97.5 | 95.7 |
| MALTn-CDS_0δs | 87.4 | 87.1 | 85.5 |
| MALTn-CDS_10δs | 87.2 | 86.9 | 85.3 |
| MALTn-CDS_20δs | 87.2 | 86.9 | 85.3 |
| MALTn-CDS_50δs | 86.5 | 86.2 | 84.6 |
| MALTx_0δs | 15.8 | 14.2 | 9.7 |
| MALTx_10δs | 15.2 | 13.7 | 9.4 |
| MALTx_20δs | 15.0 | 13.6 | 9.2 |
| MALTx_50δs | 14.5 | 13.1 | 8.9 |
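Deamination converts cytosine to uracil, which sequencers read as thymine, so damaged reads carry C-to-T substitutions. The following is a minimal sketch of that effect only, not the simulation used in the study: the function name `deaminate`, the uniform per-base probability `rate`, and the example read are all assumptions made for illustration. Real ancient DNA damage is concentrated near fragment ends, and the δs settings in the table above are not reproduced here.

```python
import random


def deaminate(read, rate, rng):
    """Return a copy of `read` in which each C becomes T with probability `rate`.

    Hypothetical helper: a uniform per-base rate is a simplifying assumption.
    """
    return "".join(
        "T" if base == "C" and rng.random() < rate else base
        for base in read
    )


rng = random.Random(0)   # fixed seed so the sketch is repeatable
read = "ACGTCCGATC"      # hypothetical read, not taken from the study

print(deaminate(read, 0.0, rng))  # rate 0: read is unchanged
print(deaminate(read, 1.0, rng))  # rate 1: every C is replaced by T
```

Intermediate rates produce a random mix of damaged and undamaged cytosines, which is the kind of perturbation an aligner has to tolerate when classifying ancient reads.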
Figure 4. Percentage of reads assigned taxonomy using divergent and deaminated simulated metagenomes of typical ancient DNA fragment length.
Figure 5. Percentage of reads assigned taxonomy to different taxonomic ranks for deeply sequenced published data.
Clustered columns represent samples analysed using different reference databases. Colours indicate specificity of assignments.
Number of genera and species identified in each MALT database.
| Sample (genus-level) | 2014nr | 2017nt | HOMD | RefSeqGCS |
|---|---|---|---|---|
| CHIMP | 46 | 57 | 35 | 52 |
| ELSIDRON1 | 49 | 50 | 42 | 48 |
| MODERN | 23 | 32 | 28 | 29 |
| SPYII | 64 | 64 | 54 | 62 |
| Average | 46 | 51 | 40 | 48 |
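The Average row can be checked by taking the mean of each database column over the four samples. The sketch below assumes the averages were rounded half up to the nearest integer; that rounding convention is an assumption, chosen because it reproduces the tabulated values.

```python
import math

# Genus counts per database, in sample order CHIMP, ELSIDRON1, MODERN, SPYII
# (values copied from the table above).
genus_counts = {
    "2014nr": [46, 49, 23, 64],
    "2017nt": [57, 50, 32, 64],
    "HOMD": [35, 42, 28, 54],
    "RefSeqGCS": [52, 48, 29, 62],
}


def rounded_mean(values):
    """Mean rounded half up to the nearest integer (assumed convention)."""
    return math.floor(sum(values) / len(values) + 0.5)


averages = {db: rounded_mean(v) for db, v in genus_counts.items()}
print(averages)
# {'2014nr': 46, '2017nt': 51, 'HOMD': 40, 'RefSeqGCS': 48}
```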
Figure 6. UPGMA tree of species-level Bray–Curtis dissimilarities calculated from the microbial composition of each sample.
The branch scale bar represents the Bray–Curtis dissimilarity between samples.
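For two samples with species abundances a and b, the Bray–Curtis dissimilarity is the sum of absolute abundance differences divided by the total abundance of both samples. It ranges from 0 (identical composition) to 1 (no shared species). A minimal sketch follows; the function name `bray_curtis` and the toy abundance profiles are hypothetical, and any normalisation applied in the study before computing dissimilarities is not reproduced here.

```python
def bray_curtis(a, b):
    """Bray-Curtis dissimilarity between two {species: abundance} profiles."""
    species = set(a) | set(b)
    diff = sum(abs(a.get(s, 0) - b.get(s, 0)) for s in species)
    total = sum(a.get(s, 0) + b.get(s, 0) for s in species)
    if total == 0:
        raise ValueError("both profiles are empty")
    return diff / total


# Hypothetical abundance profiles, not data from the study.
sample_1 = {"species_x": 6, "species_y": 4}
sample_2 = {"species_x": 2, "species_y": 8}

print(bray_curtis(sample_1, sample_2))  # (4 + 4) / 20 = 0.4
```

UPGMA is average-linkage hierarchical clustering, so a tree like the one in Figure 6 would be built by applying that clustering to the matrix of pairwise values such a function returns.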