James Robertson, John H E Nash.
Abstract
Large-scale bacterial population genetics studies are now routine due to cost-effective Illumina short-read sequencing. However, analysing plasmid content remains difficult due to incomplete assembly of plasmids. Bacterial isolates can contain any number of plasmids and assembly remains complicated due to the presence of repetitive elements. Numerous tools have been developed to analyse plasmids but the performance and functionality of the tools are variable. The MOB-suite was developed as a set of modular tools for reconstruction and typing of plasmids from draft assembly data to facilitate characterization of plasmids. Using a set of closed genomes with publicly available Illumina data, the MOB-suite identified contigs of plasmid origin with both high sensitivity and specificity (95 and 88 %, respectively). In comparison, plasmidfinder demonstrated high specificity (99 %) but limited sensitivity (50 %). Using the same dataset of 377 known plasmids, MOB-recon accurately reconstructed 207 plasmids so that they were assigned to a single grouping without other plasmid or chromosomal sequences, whereas plasmidSPAdes was only able to accurately reconstruct 102 plasmids. In general, plasmidSPAdes has a tendency to merge different plasmids together, with 208 plasmids undergoing merge events. The MOB-suite reduces the number of errors but produces more hybrid plasmids, with 84 plasmids undergoing both splits and merges. The MOB-suite also provides replicon typing similar to plasmidfinder but with the inclusion of relaxase typing and prediction of conjugation potential. The MOB-suite is written in Python 3 and is available from https://github.com/phac-nml/mob-suite.
Keywords: bacterial genomes; mobile genetic elements; plasmid transmissibility; plasmids; relaxase typing; replicon benchmarking
Year: 2018 PMID: 30052170 PMCID: PMC6159552 DOI: 10.1099/mgen.0.000206
Source DB: PubMed Journal: Microb Genom ISSN: 2057-5858
Fig. 1. Flowchart outlining the major elements of the MOB-recon algorithm. Draft or complete assemblies are used as input to the software and candidate contigs of plasmid origin are identified and clustered together to produce a report file and individual fasta files for each grouping.
Fig. 2. Flowchart outlining the major elements of the MOB-typer algorithm. Draft or complete assemblies are used as input to the software and each plasmid is typed using known replicons and relaxases. Additional databases of mate-pair formation proteins and known oriT sequences are used to predict the transmissibility of the plasmid.
Fig. 3. Box-plot outlining the sensitivity and specificity for each of the tested tools.
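For reference, the two metrics reported for each tool follow the standard definitions: sensitivity is the fraction of true plasmid sequences that are called plasmid, and specificity is the fraction of chromosomal sequences that are called chromosomal. The sketch below assumes the counts are taken per contig, which is not stated here; the function names and the counts are hypothetical and are not taken from the study.

```python
def sensitivity(true_pos: int, false_neg: int) -> float:
    """Fraction of plasmid contigs that were correctly called plasmid."""
    return true_pos / (true_pos + false_neg)


def specificity(true_neg: int, false_pos: int) -> float:
    """Fraction of chromosomal contigs that were correctly called chromosomal."""
    return true_neg / (true_neg + false_pos)


# Illustrative counts only (assumed, not data from the benchmark).
example_sensitivity = sensitivity(true_pos=19, false_neg=1)
example_specificity = specificity(true_neg=22, false_pos=3)
```

A tool that labels very few contigs as plasmid can reach high specificity while missing many plasmid contigs, which is the pattern the abstract describes for plasmidfinder.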
Performance benchmarking of the ability to reconstruct plasmid groups by plasmidSPAdes and MOB-recon
Contigs were assigned to their respective plasmid by BLAST and the content of the groupings was evaluated. A plasmid is considered to have undergone a split event if its contigs are present in more than one cluster. A merge event is defined as a cluster that contains contigs from multiple plasmids. A combination event is one in which a plasmid has undergone both a split and a merge. Correctly partitioned plasmids are those that have been assigned to a single grouping and where no additional plasmid contigs have been assigned to that cluster.
| | plasmidSPAdes | MOB-recon |
|---|---|---|
| Total plasmids identified | 337 | 377 |
| Correctly partitioned plasmids | 102 | 207 |
| Plasmids split across multiple clusters | 14 | 50 |
| Plasmids merged into single clusters | 208 | 36 |
| Plasmids with a combination of splits and merges | 13 | 84 |
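The event definitions given in the table description can be sketched in a few lines of Python. This is a minimal illustration and not the evaluation code used in the study: the function name `classify_plasmids`, the two input mappings, and the example identifiers are all hypothetical, and the sketch assumes each contig has already been assigned to one source plasmid. Chromosomal contigs are not modelled.

```python
from collections import defaultdict


def classify_plasmids(contig_to_plasmid, contig_to_cluster):
    """Label each plasmid as 'correct', 'split', 'merge', or 'combination'."""
    plasmid_clusters = defaultdict(set)  # plasmid -> clusters holding its contigs
    cluster_plasmids = defaultdict(set)  # cluster -> plasmids present in it
    for contig, plasmid in contig_to_plasmid.items():
        cluster = contig_to_cluster.get(contig)
        if cluster is None:
            continue  # contig was not placed in any cluster
        plasmid_clusters[plasmid].add(cluster)
        cluster_plasmids[cluster].add(plasmid)

    labels = {}
    for plasmid, clusters in plasmid_clusters.items():
        # Split: the plasmid's contigs appear in more than one cluster.
        is_split = len(clusters) > 1
        # Merge: at least one of its clusters also holds another plasmid.
        is_merged = any(len(cluster_plasmids[c]) > 1 for c in clusters)
        if is_split and is_merged:
            labels[plasmid] = "combination"
        elif is_split:
            labels[plasmid] = "split"
        elif is_merged:
            labels[plasmid] = "merge"
        else:
            labels[plasmid] = "correct"
    return labels


# Hypothetical toy input: contig -> source plasmid, contig -> predicted cluster.
example_plasmids = {
    "c1": "pA", "c2": "pA",  # pA: all contigs in one cluster, alone
    "c3": "pB", "c4": "pC",  # pB and pC share a cluster
    "c5": "pD", "c6": "pD",  # pD: two clusters, one shared with pE
    "c7": "pE",
    "c8": "pF", "c9": "pF",  # pF: two clusters, neither shared
}
example_clusters = {
    "c1": "k1", "c2": "k1",
    "c3": "k2", "c4": "k2",
    "c5": "k3", "c6": "k4", "c7": "k4",
    "c8": "k5", "c9": "k6",
}
example_labels = classify_plasmids(example_plasmids, example_clusters)
```

In this toy input, pA is correctly partitioned, pB, pC and pE are merged, pF is split, and pD is a combination event. Each plasmid receives exactly one label, which matches the way the four outcome rows of the table sum to the total for each tool.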