Sophie George1, Louise Pankhurst1, Alasdair Hubbard1, Antonia Votintseva1, Nicole Stoesser1, Anna E Sheppard1, Amy Mathers2, Rachel Norris3, Indre Navickaite1, Chloe Eaton4, Zamin Iqbal3, Derrick W Crook1, Hang T T Phan1.
Abstract
This study aimed to assess the feasibility of using the Oxford Nanopore Technologies (ONT) MinION long-read sequencer to reconstruct fully closed plasmid sequences from eight Enterobacteriaceae isolates of six different species with plasmid populations of varying complexity. Species represented were Escherichia coli, Klebsiella pneumoniae, Citrobacter freundii, Enterobacter cloacae, Serratia marcescens and Klebsiella oxytoca, with plasmid populations ranging from 1 to 11 plasmids with sizes of 2–330 kb. Isolates were sequenced using Illumina (short-read) and ONT's MinION (long-read) platforms, and compared with fully resolved PacBio (long-read) sequence assemblies for the same isolates. We compared the performance of different assembly approaches, including SPAdes, plasmidSPAdes, hybridSPAdes, Canu, Canu+Pilon (canuPilon) and npScarf, in recovering the plasmid structures of these isolates by comparing with the gold-standard PacBio reference sequences. Overall, canuPilon provided consistently good quality assemblies both in terms of assembly statistics (N50, number of contigs) and assembly accuracy [presence of single nucleotide polymorphisms (SNPs)/indels with respect to the reference sequence]. For plasmid reconstruction, Canu recovered 70 % of the plasmids in complete contigs, and combining three assembly approaches (Canu or canuPilon, hybridSPAdes and plasmidSPAdes) resulted in a total recovery rate of 78 % for all the plasmids. The analysis demonstrated the potential of using MinION sequencing technology to resolve important plasmid structures in Enterobacteriaceae species, both independent of and in conjunction with Illumina sequencing data. A consensus assembly derived from several assembly approaches could present significant benefit in accurately resolving the greatest number of plasmid structures.
Keywords: Gram-negative Enterobacteriaceae; MinION nanopore sequencing; plasmid assembly; plasmid reconstruction
Year: 2017 PMID: 29026658 PMCID: PMC5610714 DOI: 10.1099/mgen.0.000118
Source DB: PubMed Journal: Microb Genom ISSN: 2057-5858
Summary statistics of 8 MinION long-read sequencing runs based on 2D reads that passed quality control
| Sample | CAV1015 | CAV1016 | CAV1374 | CAV1411 | CAV1492 | CAV1596 | CAV1741 | P46212 |
|---|---|---|---|---|---|---|---|---|
| Sequencing run | 1 | 2 | 3 | 3 | 4 | 4 | 5 | 5 |
| Estimated coverage | 122.92 | 77.58 | 36.58 | 36.49 | 15.29 | 57.31 | 25.37 | 29.43 |
| Total bases (Mb) | 809.13 | 434.22 | 264.38 | 182.65 | 89.12 | 322.09 | 136.77 | 154.49 |
| No. 2D reads | 69 806 | 42 889 | 29 841 | 18 013 | 7 044 | 29 251 | 10 966 | 13 055 |
| Mean read length (bp) | 11 591 | 10 124 | 8 860 | 10 140 | 12 653 | 11 011 | 12 472 | 11 834 |
| Max. read length (bp) | 43 246 | 67 060 | 49 368 | 57 507 | 57 649 | 69 030 | 36 407 | 45 085 |
| N50 (bp) | 14 288 | 13 021 | 11 939 | 12 209 | 17 593 | 16 347 | 15 481 | 15 065 |
| | 5 591 | 2 544 | 905 | 515 | 1 326 | 4 121 | 1 212 | 1 418 |
| | 43 013 | 20 392 | 12 621 | 9 501 | 4 184 | 14 797 | 7 330 | 7 899 |
| | 58 660 | 33 592 | 20 540 | 14 740 | 5 317 | 19 576 | 9 098 | 10 403 |
| | 943 | 476 | 321 | 88 | 91 | 582 | 21 | 25 |
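
The N50 row in the table above can be illustrated with a minimal Python sketch. It assumes the usual definition of N50 (the length L such that reads of length L or longer hold at least half of all bases); the function name `n50` and the toy read lengths are hypothetical and not taken from the study. The last lines cross-check one column of the table: the number of 2D reads multiplied by the mean read length should roughly reproduce the total bases.

```python
def n50(lengths):
    """Return the N50 of a collection of read (or contig) lengths.

    Assumed definition: sort lengths from longest to shortest and return
    the length at which the running total first reaches half of all bases.
    """
    ordered = sorted(lengths, reverse=True)
    if not ordered:
        raise ValueError("n50() needs at least one length")
    half_of_bases = sum(ordered) / 2
    running_total = 0
    for length in ordered:
        running_total += length
        if running_total >= half_of_bases:
            return length


# Hypothetical toy read lengths (bp), not data from the study.
toy_reads = [2, 3, 4, 5, 6]
toy_n50 = n50(toy_reads)

# Cross-check for CAV1015 using values from the table:
# 69 806 reads x 11 591 bp mean length, expressed in Mb.
cav1015_total_mb = 69_806 * 11_591 / 1e6
```

For CAV1015 the product comes to about 809.12 Mb, which agrees with the reported 809.13 Mb to within the rounding of the mean read length.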
Fig. 1. Assembly summary statistics for sequences including assembly size (a), number of contigs (b), maximum contig size (c), mean contig length (d) and N50 (e) – the results of plasmidSPAdes are not included as it is not a complete genome assembly method. SPAdes used only Illumina short-read data, Canu used only MinION long-read data, whereas hybridSPAdes, npScarf and canuPilon used both.
Fig. 2. Comparison of assemblies with the PacBio reference genomes – the results of plasmidSPAdes are not included as it is not a complete genome assembly method.
Summary of plasmid structure recovery of different assembly approaches using Illumina and/or MinION long-read sequences
| | CAV1015 | CAV1016 | CAV1374 | CAV1411 | CAV1492 | CAV1596 | CAV1741 | P46212 | Total number of plasmids recovered | Recovery rate |
|---|---|---|---|---|---|---|---|---|---|---|
| Number of plasmids (reference PacBio assembly) | 5 (a–e) | 3 (a–c) | 11 (a–k) | 2 (a, b) | 5 (a–e) | 4 (a–d) | 6 (a–f) | 1 (a) | 37 | |
| SPAdes* | 1, c | 1, a | 0 | 0 | 0 | 1, a | 2, b, e | 0 | 5 | 14 % |
| plasmidSPAdes* | 2, a, c | 1, a | 0 | 0 | 0 | 0 | 2 | 0 | 5 | 14 % |
| hybridSPAdes† | 1, c | 1, a | 0 | 0 | 1, c | 1, a | 2, b, e | 1 | 7 | 19 % |
| npScarf† | 1, c | 1, a | 0 | 0 | 0 | 0 | 1, e | 1 | 4 | 11 % |
| Canu/canuPilon† | 4, b–e | 3, a–c | 6, c, f–i, k | 2, a, b | 5, a–e | 3, b–d | 2, d, e | 1 | 26 | 70 % |
| Aggregated results | 5 | 3 | 6 | 2 | 5 | 4 | 3 | 1 | 29 | 78 % |
*Assembly using Illumina short-read data only.
†Hybrid assembly – Illumina short-read plus MinION long-read data.
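
The aggregated row and the recovery rates in the table above can be reproduced with a short Python sketch. The plasmid labels are transcribed from the table; the dictionary layout and function names are hypothetical. Two assumptions are made: the single plasmid recovered for P46212 is taken to be plasmid a (the reference has only one), and the two plasmids recovered by plasmidSPAdes for CAV1741 are left out because the table does not name them.

```python
TOTAL_REFERENCE_PLASMIDS = 37  # from the PacBio reference row of the table

# Each string lists the plasmid labels recovered for that isolate.
recovered = {
    "SPAdes": {"CAV1015": "c", "CAV1016": "a", "CAV1596": "a", "CAV1741": "be"},
    # CAV1741 omitted here: the table gives a count of 2 but no labels.
    "plasmidSPAdes": {"CAV1015": "ac", "CAV1016": "a"},
    "hybridSPAdes": {"CAV1015": "c", "CAV1016": "a", "CAV1492": "c",
                     "CAV1596": "a", "CAV1741": "be", "P46212": "a"},
    "npScarf": {"CAV1015": "c", "CAV1016": "a", "CAV1741": "e", "P46212": "a"},
    "Canu/canuPilon": {"CAV1015": "bcde", "CAV1016": "abc", "CAV1374": "cfghik",
                       "CAV1411": "ab", "CAV1492": "abcde", "CAV1596": "bcd",
                       "CAV1741": "de", "P46212": "a"},
}


def count_recovered(per_isolate):
    """Number of plasmids one approach recovered across all isolates."""
    return sum(len(set(labels)) for labels in per_isolate.values())


def aggregate(results):
    """Union of recovered plasmid labels per isolate across all approaches."""
    union = {}
    for per_isolate in results.values():
        for isolate, labels in per_isolate.items():
            union.setdefault(isolate, set()).update(labels)
    return union


def recovery_rate(n_recovered, total=TOTAL_REFERENCE_PLASMIDS):
    """Recovery rate as a percentage of the reference plasmids."""
    return 100 * n_recovered / total


canu_count = count_recovered(recovered["Canu/canuPilon"])
aggregated_count = sum(len(labels) for labels in aggregate(recovered).values())
```

Under these assumptions the Canu/canuPilon count is 26 (about 70 %) and the union across approaches is 29 (about 78 %), matching the table. The gain from aggregation comes from the few plasmids that Canu missed but a SPAdes-based approach recovered, such as plasmid a in CAV1015 and CAV1596.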
Fig. 3. Mummerplots comparing study plasmid assemblies with reference plasmid sequences for the canuPilon, hybridSPAdes and npScarf approaches and for isolates CAV1374 and CAV1741 (x-axis: reference plasmids; y-axis: matched contigs from assemblies; red: sequence match in the forward direction; blue: match in the reverse-complement direction; x-axis legends are on the right of the subplots, ordered left to right; y-axis legends are on the left of the subplots in blue, ordered top to bottom). A resolved plasmid is indicated by overlapping ends, shown as overlapping co-ordinates of diagonal lines on the x-axis. The number of reference plasmids in the subplots differs between assembly methods because some assemblies contain no contigs matching certain reference plasmids. The mummerplot comparisons of other samples are in Figs S6 and S7.
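
The idea of overlapping ends as a sign of a closed circular plasmid can be shown with a toy Python sketch. This is not the method used in the study, which relied on mummerplot alignments against the reference. The sketch assumes error-free sequence and exact string matching, which real long-read contigs would not satisfy, and the function names and the example contig are hypothetical.

```python
def end_overlap(contig, min_len=3):
    """Length of the longest prefix of `contig` that is also its suffix.

    Returns 0 if no overlap of at least `min_len` bases exists.
    Exact matching only: a toy stand-in for alignment-based detection.
    """
    for k in range(len(contig) - 1, min_len - 1, -1):
        if contig[:k] == contig[-k:]:
            return k
    return 0


def trim_circular(contig, min_len=3):
    """Drop the duplicated end so the circular sequence appears once."""
    k = end_overlap(contig, min_len)
    return contig[:-k] if k else contig


# Hypothetical contig: a circular 8-base sequence "ACGTTGCA" assembled
# past its starting point, so the first 4 bases are repeated at the end.
toy_contig = "ACGTTGCAACGT"
toy_overlap = end_overlap(toy_contig)
toy_trimmed = trim_circular(toy_contig)
```

In this example the first four bases reappear at the end of the contig, so trimming the repeat leaves the eight-base circular sequence once. In practice an aligner that tolerates mismatches and indels would be needed to detect such overlaps in long-read assemblies.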