Kim Judge1, Martin Hunt2, Sandra Reuter1, Alan Tracey2, Michael A Quail2, Julian Parkhill2, Sharon J Peacock3.
Abstract
Translating the Oxford Nanopore MinION sequencing technology into medical microbiology requires on-going analysis that keeps pace with technological improvements to the instrument and release of associated analysis software. Here, we use a multidrug-resistant Enterobacter kobei isolate as a model organism to compare open source software for the assembly of genome data, and relate this to the time taken to generate actionable information. Three software tools (PBcR, Canu and miniasm) were used to assemble MinION data and a fourth (SPAdes) was used to combine MinION and Illumina data to produce a hybrid assembly. All four had a similar number of contigs and were more contiguous than the assembly using Illumina data alone, with SPAdes producing a single chromosomal contig. Evaluation of the four assemblies to represent the genome structure revealed a single large inversion in the SPAdes assembly, which also incorrectly integrated a plasmid into the chromosomal contig. Almost 50 %, 80 % and 90 % of MinION pass reads were generated in the first 6, 9 and 12 h, respectively. Using data from the first 6 h alone led to a less accurate, fragmented assembly, but data from the first 9 or 12 h generated similar assemblies to that from 48 h sequencing. Assemblies were generated in 2 h using Canu, indicating that going from isolate to assembled data is possible in less than 48 h. MinION data identified that genes responsible for resistance were carried by two plasmids encoding resistance to carbapenem and to sulphonamides, rifampicin and aminoglycosides, respectively.
Keywords: MinION; antimicrobial resistance; assembly; long reads; plasmid; software
Year: 2016 PMID: 28348876 PMCID: PMC5320651 DOI: 10.1099/mgen.0.000085
Source DB: PubMed Journal: Microb Genom ISSN: 2057-5858
Comparison of assembly software: number and size of contigs, errors and time/memory requirements
| Assembly | PBcR | Canu | Miniasm | Miniasm & Nanopolish | SPAdes | Illumina | Manually finished |
|---|---|---|---|---|---|---|---|
| Number of contigs | 21 | 15 | 16 | 16 | 13 | 90 | 10 |
| Number of bases | 5490929 | 5542520 | 5843777 | 5673354 | 5576147 | 5454767 | 5586413 |
| Largest contig (bases) | 1615977 | 2782732 | 1548218 | 1504104 | 5303011 | 686305 | 5031167 |
| Mean contig (bases) | 261473 | 369501 | 365236 | 354585 | 428934 | 60608 | 620713 |
| N50* | 1197808 | 2782732 | 661959 | 641515 | 5303011 | 153115 | 5031167 |
| Total mis-assemblies | 5 | 2 | 0 | 3 | 5 | 6 | |
| Mismatches per kb | 1.0038 | 0.3494 | 6.6578 | 5.4843 | 0.0371 | 0.0355 | |
| Indels per kb | 12.1668 | 7.769 | 18.6418 | 8.987 | 0.0353 | 0.0322 | |
| Memory requirement | 7 GB | 8 GB | 3 GB | 3 GB & 4 GB | 2 GB | 4 GB | |
| Run time | 8 h | 2 h | 2 min | 2 min & | 3 h | 3 h | |
| Total CPU time† | 79728 | 54745 | 124 | 9450274 | 9164 | 12514 | |
| Number of threads | 16 | 8 | 2 | 2 & 16 | 16 | 2 | |
*N50: a weighted median statistic. Half (50 %) of the assembly is contained in contigs greater than or equal to a contig of this size.
†Total CPU (Central Processing Unit) time: The amount of time used by the CPUs actively processing instructions. Run time, or ‘real’ time, may be longer, as it includes idle time or time spent waiting for input or output, or may be shorter if the workload is shared between more than one CPU.
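
The N50 definition in the first footnote can be made concrete with a short calculation. The following is a minimal sketch, not the code used by the authors or by any of the assemblers compared here; the function name `n50` and the toy contig lengths are illustrative assumptions.

```python
def n50(contig_lengths):
    """Return the N50 of a collection of contig lengths.

    N50 is the length L of the shortest contig such that contigs of
    length >= L together contain at least half of the assembled bases.
    """
    lengths = sorted(contig_lengths, reverse=True)
    if not lengths:
        raise ValueError("at least one contig length is required")
    total = sum(lengths)
    running = 0
    # Walk from the largest contig downwards until half the bases are covered.
    for length in lengths:
        running += length
        if 2 * running >= total:
            return length


# Toy data (invented for illustration, not taken from the study):
# 300 bases in total, so the running sum must reach 150.
toy_contigs = [80, 70, 50, 40, 30, 20, 10]
print(n50(toy_contigs))  # 80 + 70 = 150 covers half, so N50 is 70
```

This also explains why N50 equals the largest contig for the Canu, SPAdes and manually finished assemblies in the table: in each of these, the largest contig alone contains more than half of the total bases.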
Fig. 1. Comparison between (a) Canu (top), manually finished (middle) and PBcR assemblies (bottom) and (b) miniasm and nanopolish (top), manually finished (middle) and SPAdes hybrid assemblies (bottom). Matches are shown where the length of the match is greater than 10 kb or 50 % of the length of the shortest sequence it matches. Forward and reverse matches are colored green and brown, respectively.