G Royer, J W Decousser, C Branger, M Dubois, C Médigue, E Denamur, D Vallenet.
Abstract
Plasmid prediction may be of great interest when studying bacteria of medical importance such as Enterobacteriaceae as well as Staphylococcus aureus or Enterococcus. Indeed, many resistance and virulence genes are located on such replicons with major impact in terms of pathogenicity and spreading capacities. Beyond strain outbreak, plasmid outbreaks have been reported in particular for some extended-spectrum beta-lactamase- or carbapenemase-producing Enterobacteriaceae. Several tools are now available to explore the 'plasmidome' from whole-genome sequences with various approaches, but none of them is able to combine high sensitivity and specificity. With this in mind, we developed PlaScope, a targeted approach to recover plasmidic sequences in genome assemblies at the species or genus level. Based on Centrifuge, a metagenomic classifier, and a custom database containing complete sequences of chromosomes and plasmids from various curated databases, PlaScope classifies contigs from an assembly according to their predicted location. Compared to other plasmid classifiers, PlasFlow and cBar, it achieves better recall (0.87), specificity (0.99), precision (0.96) and accuracy (0.98) on a dataset of 70 genomes of Escherichia coli containing plasmids. In a second part, we identified 20 of the 21 chromosomal integrations of the extended-spectrum beta-lactamase coding gene in a clinical dataset of E. coli strains. In addition, we predicted virulence gene and operon locations in agreement with the literature. We also built a database for Klebsiella and correctly assigned the location for the majority of resistance genes from a collection of 12 Klebsiella pneumoniae strains. Similar approaches could also be developed for other well-characterized bacteria.
Keywords: Escherichia coli; antimicrobial resistance; bioinformatic method; plasmid detection
Year: 2018 PMID: 30265232 PMCID: PMC6202455 DOI: 10.1099/mgen.0.000211
Source DB: PubMed Journal: Microb Genom ISSN: 2057-5858
Fig. 1. The PlaScope workflow. After read assembly using SPAdes, contigs are classified into three categories using Centrifuge (i.e. chromosome, plasmid, unclassified) with a custom database containing chromosome and plasmid sequences.
PlaScope, PlasFlow and cBar benchmark results on contigs from 70 E. coli genomes
| Metric | PlaScope | PlasFlow | cBar |
| --- | --- | --- | --- |
| True positive | 1123 | 1106 | 954 |
| True negative | 9162 | 6231 | 5570 |
| False positive | 52 | 2983 | 3644 |
| False negative | 173 | 190 | 342 |
| Recall | 0.87 | 0.85 | 0.74 |
| Precision | 0.96 | 0.27 | 0.21 |
| Specificity | 0.99 | 0.68 | 0.60 |
| Accuracy | 0.98 | 0.70 | 0.62 |
| F1 score | 0.91 | 0.41 | 0.32 |
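The derived rows of the benchmark table follow from the four contig counts using the standard confusion-matrix definitions. The Python sketch below recomputes them as a consistency check. It assumes that the three value columns correspond to PlaScope, PlasFlow and cBar in the order given in the table title, and that a plasmid contig is the positive class. The names `ConfusionCounts`, `metrics` and `COUNTS` are illustrative and are not part of PlaScope.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionCounts:
    """Contig counts for one classifier (positive class assumed to be 'plasmid')."""
    tp: int  # true positives
    tn: int  # true negatives
    fp: int  # false positives
    fn: int  # false negatives


def metrics(c: ConfusionCounts) -> dict:
    """Compute the standard metrics reported in the benchmark table."""
    recall = c.tp / (c.tp + c.fn)
    precision = c.tp / (c.tp + c.fp)
    specificity = c.tn / (c.tn + c.fp)
    accuracy = (c.tp + c.tn) / (c.tp + c.tn + c.fp + c.fn)
    f1 = 2 * precision * recall / (precision + recall)
    return {
        "recall": recall,
        "precision": precision,
        "specificity": specificity,
        "accuracy": accuracy,
        "f1": f1,
    }


# Counts copied from the table; the column-to-tool mapping is an assumption
# based on the order of the tools in the table title.
COUNTS = {
    "PlaScope": ConfusionCounts(tp=1123, tn=9162, fp=52, fn=173),
    "PlasFlow": ConfusionCounts(tp=1106, tn=6231, fp=2983, fn=190),
    "cBar": ConfusionCounts(tp=954, tn=5570, fp=3644, fn=342),
}

for tool, counts in COUNTS.items():
    summary = ", ".join(f"{name}={value:.2f}" for name, value in metrics(counts).items())
    print(f"{tool}: {summary}")
```

Rounded to two decimals, the recomputed values agree with the reported rows, which also supports the assumed column order.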
Fig. 2. PlaScope, PlasFlow and cBar performance for each genome taken individually. Recall, specificity, precision and accuracy obtained for each of the 70 genomes containing plasmids are plotted according to the method in blue, green and red for PlaScope, PlasFlow and cBar, respectively. Grey points on box plots represent values for each of these genomes.
Fig. 3. Genetic distance-based tree with PlaScope-predicted location of blaCTX-M-15, virulence genes and operons in the ST410 E. coli strains from Falgenhauer et al. [15]. Locations of the genes are displayed with coloured squares (blue: plasmid prediction, orange: chromosome prediction, grey: unclassified).