H Soon Gweon, Liam P Shaw, Jeremy Swann, Nicola De Maio, Manal AbuOun, Rene Niehus, Alasdair T M Hubbard, Mike J Bowes, Mark J Bailey, Tim E A Peto, Sarah J Hoosdally, A Sarah Walker, Robert P Sebra, Derrick W Crook, Muna F Anjum, Daniel S Read, Nicole Stoesser.
Abstract
BACKGROUND: Shotgun metagenomics is increasingly used to characterise microbial communities, particularly for the investigation of antimicrobial resistance (AMR) in different animal and environmental contexts. There are many different approaches for inferring the taxonomic composition and AMR gene content of complex community samples from shotgun metagenomic data, but there has been little work establishing the optimum sequencing depth, data processing and analysis methods for these samples. In this study we used shotgun metagenomics and sequencing of cultured isolates from the same samples to address these issues. We sampled three potential environmental AMR gene reservoirs (pig caeca, river sediment, effluent) and sequenced samples with shotgun metagenomics at high depth (~200 million reads per sample). Alongside this, we cultured single-colony isolates of Enterobacteriaceae from the same samples and used hybrid sequencing (short- and long-reads) to create high-quality assemblies for comparison to the metagenomic data. To automate data processing, we developed an open-source software pipeline, 'ResPipe'.
Keywords: Antimicrobial resistance (AMR); Enterobacteriaceae; Metagenomics; One health
Year: 2019 PMID: 33902704 PMCID: PMC8204541 DOI: 10.1186/s40793-019-0347-1
Source DB: PubMed Journal: Environ Microbiome ISSN: 2524-6372
Fig. 1 Schematic overview of the study. For each sample, we used both a metagenomics and culture-based approach. We developed a software pipeline (‘ResPipe’) for the metagenomic data. For more details on each step of the workflow, see Methods
Fig. 2 Rarefaction curve at various sequencing depths for (a) AMR gene families and (b) AMR gene allelic variants. Colours indicate sample type. For each sampling depth, sequences were randomly subsampled 10 times, with each point representing a different subsampling. Lines connect the means (large circles) of these points for each sample type
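The subsampling procedure described in the Fig. 2 caption can be sketched roughly as follows. This is a minimal illustration, not the ResPipe implementation: the function name `rarefaction_curve`, the input `read_labels` (one AMR gene family or allelic variant label per mapped read) and the choice of sampling without replacement are all assumptions made for the example.

```python
import random
from statistics import mean


def rarefaction_curve(read_labels, depths, n_replicates=10, seed=0):
    """Count distinct labels seen at each subsampling depth.

    read_labels  -- hypothetical list with one label per mapped read
    depths       -- subsampling depths (number of reads) to evaluate
    n_replicates -- subsamples per depth (10 in the figure caption)
    """
    rng = random.Random(seed)
    curve = {}
    for depth in depths:
        if depth > len(read_labels):
            raise ValueError(f"depth {depth} exceeds {len(read_labels)} reads")
        # Assumption: reads are drawn without replacement.
        points = [
            len(set(rng.sample(read_labels, depth)))
            for _ in range(n_replicates)
        ]
        # "points" are the individual subsamplings; "mean" is what the
        # caption describes as the large circle joined by the line.
        curve[depth] = {"points": points, "mean": mean(points)}
    return curve


if __name__ == "__main__":
    toy_reads = ["famA"] * 50 + ["famB"] * 30 + ["famC"] * 20
    for depth, result in rarefaction_curve(toy_reads, [5, 20, 100]).items():
        print(depth, result["mean"])
```

At a depth equal to the full read set every replicate sees every label, so the curve necessarily ends at the total observed richness; what matters for choosing a sequencing depth is how early it flattens.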
Fig. 3 The most common AMR gene families and gene allelic variants in each sample. Left panel: the top 20 AMR gene families from effluent, pig caeca and upstream sediment by number of reads (top to bottom), with the top three most abundant highlighted in colour (hue indicates sample type) for comparison with the right-hand panel. Right panel: the most abundant AMR gene allelic variants within these top three most abundant gene families (left to right), sorted by abundance. For more information on the definitions of ‘AMR gene family’ and ‘allelic variant’, see Methods: ‘AMR gene profiling’
Fig. 4 The effect of normalisation on the most common AMR gene allelic variants from each sample. Shown are the top 20 AMR gene allelic variants from each sample (effluent, pig caeca and upstream sediment), and the effect of different normalisations (left: raw count, middle: normalisation by gene length, right: further normalisation by Thermus thermophilus count). Arrows show the changing rank of each variant with normalisation. Note that a different x-axis is used for upstream sediment in all three panels. Asterisks denote AMR allelic variants that do not have a “protein homolog” detection model in CARD (see Methods: ‘AMR gene profiling’)
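To illustrate why ranks can shift between the three panels of Fig. 4, the sketch below applies the two normalisation steps in sequence to toy numbers. The caption does not give the exact formula, so simple division by gene length and then by the sample's Thermus thermophilus read count is an assumption, as are the names `normalise_counts`, `rank_by` and `reference_count`.

```python
def normalise_counts(raw_counts, gene_lengths_bp, reference_count):
    """Return raw, length-normalised and reference-normalised values.

    raw_counts      -- hypothetical dict: allelic variant -> read count
    gene_lengths_bp -- hypothetical dict: allelic variant -> length in bp
    reference_count -- Thermus thermophilus read count for the sample
    """
    if reference_count <= 0:
        raise ValueError("reference_count must be positive")
    result = {}
    for variant, count in raw_counts.items():
        per_bp = count / gene_lengths_bp[variant]  # assumed: reads per bp
        result[variant] = {
            "raw": count,
            "per_bp": per_bp,
            "per_bp_per_reference": per_bp / reference_count,
        }
    return result


def rank_by(normalised, key):
    """Variants sorted from most to least abundant under one measure."""
    return sorted(normalised, key=lambda v: normalised[v][key], reverse=True)


if __name__ == "__main__":
    # Toy values only; not data from the study.
    counts = {"variantA": 1000, "variantB": 600}
    lengths = {"variantA": 2000, "variantB": 600}
    norm = normalise_counts(counts, lengths, reference_count=50)
    print(rank_by(norm, "raw"))
    print(rank_by(norm, "per_bp"))
```

Within a single sample, dividing by one reference count rescales every variant equally, so under this reading only the gene-length step can reorder variants; the reference step matters when comparing across samples.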
Fig. 5 Taxonomic classification of metagenomes by method. Resulting taxonomic composition of effluent (E), pig caeca (P) and upstream sediment (U) metagenomes using Kraken, Centrifuge and classification by in silico 16S rRNA extraction (16S). (a) Domain-level classification. (b) Relative abundance of bacterial phyla. (c) Relative abundance of Enterobacteriaceae
Fig. 6 Impact of sequencing depth on genus-level richness. Three methods are shown: (a) Kraken, (b) Centrifuge and (c) in silico 16S rRNA extraction
Details of cultured isolates and assembled genomes. For more details on isolate sequencing, see Additional file 6: Table S4
| Sample | Isolate number | Species | Genome size (bp) | Number of contigs (number circularized) |
|---|---|---|---|---|
| Effluent | 1 | | 5,213,846 | 4 (4) |
| Effluent | 2 | | 5,590,302 | 9 (7) |
| Effluent | 3 | | 5,465,276 | 7 (5) |
| Effluent | 4 | | 5,393,186 | 8 (4) |
| Pig caeca | 1 | | 4,898,477 | 5 (5) |
| Pig caeca | 2 | | 4,967,077 | 2 (2) |
| Upstream sediment | 1 | | 4,839,493 | 1 (1) |
Fig. 7 Metagenomic read coverage of assembled genetic structures from isolates cultured from each sample. (a) Effluent isolates: E1-E4. (b) Pig caeca isolates: P1-P2. (c) Upstream sediment isolate: U1. Genetic structures are coloured by size. Note the different y-axis scale for the upstream sediment sample