Fanny Berglund, Tobias Österlund, Fredrik Boulund, Nachiket P Marathe, D G Joakim Larsson, Erik Kristiansson.
Abstract
BACKGROUND: Environmental and commensal bacteria maintain a diverse and largely unknown collection of antibiotic resistance genes (ARGs) that, over time, may be mobilized and transferred to pathogens. Metagenomics enables cultivation-independent characterization of bacterial communities, but the resulting data are noisy and highly fragmented, which severely hampers the identification of previously undescribed ARGs. We have therefore developed fARGene, a method for identification and reconstruction of ARGs directly from shotgun metagenomic data.
Keywords: Antibiotic resistance; Beta-lactamases; Environmental sequencing; Gene assembly; Microbiome; Resistome
Year: 2019 PMID: 30935407 PMCID: PMC6444489 DOI: 10.1186/s40168-019-0670-1
Source DB: PubMed Journal: Microbiome ISSN: 2049-2618 Impact factor: 14.650
Fig. 1 A schematic overview of fARGene. fARGene takes paired-end metagenomic data as input. The reads are passed to an ARG model, which classifies each read as coming from a resistance gene or not (panel 1). The paired-end sequences of the positively classified reads are extracted, quality controlled, and then assembled into full-length genes (panel 2). The reconstructed gene sequences are classified once more by the ARG model (panel 3). The output consists of nucleotide and amino acid sequences of the reconstructed ARGs. The method can also be applied directly to whole genomes and metagenomic contigs, in which case the classification, extraction, and assembly of reads are skipped.
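The three panels described in Fig. 1 can be read as a filter, assemble, filter pipeline. The following is a minimal sketch of that control flow only, not the fARGene implementation: `reconstruct_genes`, `toy_score`, `toy_assemble`, the thresholds, and the "drop pairs containing N" quality check are all hypothetical stand-ins, since the record above does not specify the model, the quality control, or the assembler.

```python
from typing import Callable, List, Sequence, Tuple

ReadPair = Tuple[str, str]


def reconstruct_genes(
    pairs: Sequence[ReadPair],
    score: Callable[[str], float],
    assemble: Callable[[List[ReadPair]], List[str]],
    read_threshold: float,
    gene_threshold: float,
) -> List[str]:
    """Sketch of the classify -> extract/assemble -> re-classify flow."""
    # Panel 1: classify each read; keep the whole pair if either mate passes.
    hits = [p for p in pairs if max(score(p[0]), score(p[1])) >= read_threshold]
    # Panel 2: placeholder quality control (assumption: drop pairs with
    # ambiguous bases), followed by assembly of the remaining pairs.
    clean = [p for p in hits if "N" not in p[0] and "N" not in p[1]]
    candidates = assemble(clean)
    # Panel 3: classify the assembled sequences again with the same model.
    return [gene for gene in candidates if score(gene) >= gene_threshold]


def toy_score(seq: str) -> float:
    """Hypothetical stand-in for an ARG model: 1.0 if a motif is present."""
    return 1.0 if "GATTACA" in seq else 0.0


def toy_assemble(pairs: List[ReadPair]) -> List[str]:
    """Hypothetical stand-in for an assembler: joins the two mates."""
    return [first + second for first, second in pairs]


pairs = [("AAGATTACA", "TTTT"), ("CCCC", "GGGG"), ("GATTACAN", "AAAA")]
genes = reconstruct_genes(pairs, toy_score, toy_assemble, 0.5, 0.5)
print(genes)
```

Passing the model and the assembler in as callables keeps the sketch independent of any particular tool. When the input is already a whole genome or a set of contigs, only the final classification step would apply.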
Model performance
| Model | Reference genes | Sensitivity, full-length | Sensitivity, reads (100 nt) | Specificity, full-length | Specificity, reads (100 nt) |
|---|---|---|---|---|---|
| Class A | 71 | 1.000 | 0.897 | 1.000 | 0.990 |
| Subclass B1 + B2 | 35 | 1.000 | 0.811 | 1.000 | 0.962 |
| Subclass B3 | 11 | 1.000 | 0.722 | 1.000 | 0.921 |
| Class C | 22 | 1.000 | 0.939 | 1.000 | 0.991 |
| Class D1 | 9 | 1.000 | 0.904 | 1.000 | 0.986 |
| Class D2 | 20 | 1.000 | 0.901 | 1.000 | 0.981 |
Fig. 2a–f Results from the optimization of six ARG models covering the four classes of β-lactamases. Each panel shows how well the model classifies fragments as ARGs. The green curve shows the sensitivity, i.e. the fraction of fragments from true resistance genes that are correctly classified, while the orange curve shows 1-specificity, i.e. the fraction of fragments from genes without a resistance phenotype that are incorrectly classified as ARGs. A well-performing model has high sensitivity and low 1-specificity. The dashed line marks the model threshold, selected to give high sensitivity at an acceptable specificity.
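To make the two curves in Fig. 2 concrete, the sketch below computes sensitivity and 1-specificity at a given score threshold and then picks a threshold. The scores are invented for illustration, and the selection rule (maximize sensitivity subject to a cap on 1-specificity) is an assumption; the caption says only that the threshold was chosen for high sensitivity and acceptable specificity.

```python
from typing import Optional, Sequence, Tuple


def sensitivity_and_fpr(
    pos_scores: Sequence[float], neg_scores: Sequence[float], threshold: float
) -> Tuple[float, float]:
    """Return (sensitivity, 1-specificity) when calling score >= threshold an ARG."""
    true_pos = sum(s >= threshold for s in pos_scores)
    false_pos = sum(s >= threshold for s in neg_scores)
    return true_pos / len(pos_scores), false_pos / len(neg_scores)


def pick_threshold(
    pos_scores: Sequence[float], neg_scores: Sequence[float], max_fpr: float
) -> Optional[Tuple[float, float, float]]:
    """Assumed rule: highest sensitivity among thresholds with 1-specificity <= max_fpr."""
    best = None
    for t in sorted(set(pos_scores) | set(neg_scores)):
        sens, fpr = sensitivity_and_fpr(pos_scores, neg_scores, t)
        if fpr <= max_fpr and (best is None or sens > best[1]):
            best = (t, sens, fpr)
    return best


# Hypothetical model scores for fragments from true resistance genes (pos)
# and from genes without a resistance phenotype (neg).
pos = [12.0, 15.5, 9.0, 20.1, 7.5]
neg = [3.0, 8.0, 5.5, 10.0, 2.0]

best = pick_threshold(pos, neg, max_fpr=0.25)
print(best)
```

Lowering the threshold raises both curves together, which is why a single cut-off has to trade sensitivity against specificity.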
Datasets used in this study
| Dataset | Size (nt) | # reads | Avg. read length | Reference |
|---|---|---|---|---|
| HMP* | 4.69 × 10^12 | 4.41 × 10^10 | 96 | [ |
| Human gut | 2.80 × 10^11 | 3.50 × 10^9 | 75 | [ |
| Oil spill | 3.36 × 10^11 | 3.33 × 10^9 | 101 | [ |
| Polluted lake | 6.76 × 10^9 | 6.69 × 10^7 | 101 | [ |
| Wadden Sea | 8.42 × 10^9 | 5.23 × 10^7 | 161 | [ |
*Human Microbiome Project
Results from reconstruction of ARGs
| Dataset | Class A, total | Class A, new† | Class B, total | Class B, new† | Class C, total | Class C, new† | Class D, total | Class D, new† |
|---|---|---|---|---|---|---|---|---|
| HMP* | 91 | 23 | 25 | 8 | 1 | 0 | 2 | 0 |
| Human gut | 52 | 6 | 10 | 3 | 4 | 0 | 4 | 1 |
| Oil spill | 2 | 1 | 11 | 9 | 0 | 0 | 7 | 7 |
| Polluted lake | 3 | 0 | 0 | 0 | 3 | 0 | 3 | 0 |
| Wadden Sea | 1 | 0 | 0 | 0 | 0 | 0 | 2 | 0 |
| Total | 149 | 30 | 46 | 20 | 8 | 0 | 18 | 8 |
*Human Microbiome Project
†< 70% sequence similarity against any sequence in NCBI GenBank
Fig. 3 Number of reconstructed ARGs per million reads for the four β-lactamase classes
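The normalization in Fig. 3 can be approximated from the two tables above by dividing each count of reconstructed genes by the dataset's read count in millions. This is a sketch under that assumption; the figure itself may use a slightly different denominator, and the dictionary and function names are illustrative.

```python
# Read counts from the "Datasets used in this study" table.
READS = {
    "HMP": 4.41e10,
    "Human gut": 3.50e9,
    "Oil spill": 3.33e9,
    "Polluted lake": 6.69e7,
    "Wadden Sea": 5.23e7,
}

# Total reconstructed genes per β-lactamase class, from the reconstruction table.
TOTALS = {
    "HMP": {"A": 91, "B": 25, "C": 1, "D": 2},
    "Human gut": {"A": 52, "B": 10, "C": 4, "D": 4},
    "Oil spill": {"A": 2, "B": 11, "C": 0, "D": 7},
    "Polluted lake": {"A": 3, "B": 0, "C": 3, "D": 3},
    "Wadden Sea": {"A": 1, "B": 0, "C": 0, "D": 2},
}


def args_per_million_reads(count: int, n_reads: float) -> float:
    """Reconstructed genes divided by the number of reads in millions."""
    return count / (n_reads / 1e6)


per_million = {
    dataset: {
        cls: args_per_million_reads(n, READS[dataset])
        for cls, n in by_class.items()
    }
    for dataset, by_class in TOTALS.items()
}

for dataset, by_class in per_million.items():
    row = ", ".join(f"class {cls}: {value:.4f}" for cls, value in by_class.items())
    print(f"{dataset}: {row}")
```

Normalizing this way lets the small datasets (polluted lake, Wadden Sea) be compared with the much larger HMP dataset on the same scale.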
Fig. 4 The ability of fARGene and five competing methods to correctly classify metagenomic fragments. fARGene consistently outperformed all compared methods (on average, 87% compared with 55%, 7.5%, 52%, 42%, 46%, and 0% for deepARG, Resfams, MEGAN SEED, MEGAN eggNOG, ARGs-OAP, and GROOT, respectively)
Fig. 5 The estimated specificity for fARGene and five competing methods. The specificity was estimated from simulated metagenomic fragments of genes closely related to β-lactamases