Literature DB >> 31361781

Nanopore sequencing for fast determination of plasmids, phages, virulence markers, and antimicrobial resistance genes in Shiga toxin-producing Escherichia coli.

Narjol González-Escalona¹, Marc A Allard¹, Eric W Brown¹, Shashi Sharma¹, Maria Hoffmann¹.

Abstract

Whole genome sequencing can provide essential public health information. However, it is now known that widely used short-read methods have the potential to miss some randomly-distributed segments of genomes. This can prevent phages, plasmids, and virulence factors from being detected or properly identified. Here, we compared assemblies of three complete Shiga toxin-producing Escherichia coli (STEC) O26:H11/H- genomes from two different sequence types (ST21 and 29), each acquired using the Nextera XT MiSeq, MinION nanopore-based sequencing, and Pacific Biosciences (PacBio) sequencing. Each closed genome consisted of a single chromosome, approximately 5.7 Mb for CFSAN027343, 5.6 Mb for CFSAN027346, and 5.4 MB for CFSAN027350. However, short-read whole genome sequencing (WGS) using Nextera XT MiSeq failed to identify some virulence genes in plasmids and on the chromosome, both of which were detected using the long-read platforms. Results from long-read MinION and PacBio allowed us to identify differences in plasmid content: a single 88 kb plasmid in CFSAN027343; a 157kb plasmid in CFSAN027350; and two plasmids in CFSAN027346 (one 95 Kb, one 72 Kb). These data enabled rapid characterization of the virulome, detection of antimicrobial genes, and composition/location of Stx phages. Taken together, positive correlations between the two long-read methods for determining plasmids, virulome, antimicrobial resistance genes, and phage composition support MinION sequencing as one accurate and economical option for closing STEC genomes and identifying specific virulence markers.

Entities: Chemical Disease Species

Mesh：

Substances：

Year: 2019 PMID： 31361781 PMCID： PMC6667211 DOI： 10.1371/journal.pone.0220494

Source DB: PubMed Journal: PLoS One ISSN： 1932-6203 Impact factor: 3.240

Introduction

Whole genome sequencing is an essential tool for characterizing and tracking pathogenic bacteria that may have contaminated the food supply as well as for identifying whether those bacteria carry virulence factors that could pose serious threats to public health [1,2]. Closed bacterial genomes provide: 1) high-quality reference material that supports pathogen source tracking during a foodborne outbreak investigation, 2) important clues about the long-term evolution of enteric pathogens, 3) key insights into mechanisms and transmission of mobile elements conferring antimicrobial resistance, and 4) critical information about the potential contribution of DNA modifications on pathogenesis [2-9]. These details are particularly important for understanding hemorrhagic pathogens such as Shiga toxin-producing E. coli (STEC) O26:H11/H- strains, which cause significant human morbidity and mortality worldwide [10-13]. The genomes of O26:H11/H- are very complex, containing many virulence genes, insertion sequences, phages, and plasmids; consequently, strains of the same lineage can possess significantly different content [7,9,10],[11,14-16]. Missing the presence of some of these elements during an investigation can have large impacts on human health [e.g. hlyA gene in enterohemorrhagic E. coli (EHECs)]. Thus, it really matters that we have reproducible WGS systems for fully capturing and sequencing these elements. Long-read sequencing platforms afford one solution to this challenge. Some systems such as Pacific Biosciences (PacBio) Sequencers RSII or Sequel (https://www.pacb.com/products-and-services/pacbio-systems/), use single-molecule real-time (SMRT) sequencing technology that allow for real-time observation of DNA synthesis through zero-mode waveguides (ZMWs) and phospho-linked nucleotides [17,18]. While comprehensive in their ability to capture entire genomes, extraneous elements included, these systems often require significant investments in machinery, space, and laboratory expertise, all of which may be obstacles to routine use. These systems also require significant quantities of DNA (i.e., 5 μg), require a more substantial preparatory time (i.e., 8 hr DNA sequencing library protocols), and produce average read lengths of about 11Kb, although reads of greater than 50 kb were possible in the laboratory at the time of this study. Alternative sequencing platforms based on nanopore technology may be able to provide high-quality libraries from long reads and produce closed bacterial genomes, while also offering several other advantages in portability and affordability. These systems are much less expensive, take little laboratory space, and can even be taken into the field for on-site sequencing. The MinION nanopore (Oxford Nanopore, Oxford, UK) system, as one example, comprises a palm-sized unit able to detect changes in ionic current when DNA or RNA passes through the nanopores, whereupon those changes are translated into base calls. Researchers have already used nanopore sequencing on plants, yeasts, viruses, and to perform de novo bacterial assembly [19-21]. Other applications have included rapid identification of viral pathogens [22,23], metagenomics [23-25], detection of antimicrobial resistance genes [26,27], comparative RNA expression levels in diverse cells [28], and detecting DNA methylation patterns [29]. Since the initial MinION Access Programme (MAP), this technology has been refined several times [30]. At least three nanopore-based sequencing tools are currently in production as of autumn 2018 [19,21,30,31]. As MinION does not require a size selection step prior to sequencing, it can obtain longer reads than other platforms [19,22,30-33]. Moreover, the system also allows for “quasi” real-time sequencing approaches [34]. The accuracy of nanopore sequencing is reported to be 90% in general, with some researchers reporting 99.96% accuracy only after read polishing (http://simpsonlab.github.io/2016/08/23/R9). Two of the main subgroups within O26:H11/H- are the enteropathogenic (EPEC), which carry the locus of enterocyte effacement (LEE) causing mild diarrhea, and enterohemorrhagic (EHEC), which carry either stx1 and/or stx2 genes in addition to the LEE, and are associated with more severe illnesses, such as hemorrhagic colitis (HC) and hemolytic uremia syndrome (HUS) [35,36]. A new clone of O26:H11/H- belonging to sequence type 29 (ST29) and that had a specific virulome (stx2a+, eae+, plasmid gene profile ehxA+, etpD+) has been recently found distributed all over Europe [10]. Serotype O26:H11/H- can be divided, based on MLST, into several STs, with ST21 and ST29 associated with disease in humans [7]. We have selected three STEC O26:H11 strains with very complex genomes belonging to different STs (21 and 29), and isolated in different years, places and sources (clinical and environmental). This makes them ideal strains to compare the three WGS capabilities. In this study, we compared the sequencing capabilities of the MinION with that of the PacBio RS II and MiSeq by sequencing 3 STEC O26:H11 strains [10] using MinION and a genome library prepared using two different library kits (i.e., 1D ligation kit and rapid sequencing kit). We then compared the results obtained by this sequencing platform (MinION). We also tested two different assembly pipelines for the de novo assembly of these genomes. Finally, we compared the results of each technology to assess their capacity for detecting virulence and antimicrobial genes, Shiga toxin phages, plasmid presence, genome synteny, and phylogenetic analysis.

Results

The 3 STEC strains employed here are listed in Table 1, along with their sources and confirmed STs. These genomes were sequenced on each of the sequencing systems, and three main differences were documented among those systems: number of contigs, ease of assembly, and detection of important genomic markers.

Table 1

Summary of the characteristics of the 3 STEC O26:H11 strains sequenced in this study.

strain	CFSAN No.	Serotype^a	ST^b	Date	location	source
99.085	CFSAN027343	O26:H11	21	1999	Argentina	clinical
99.1773	CFSAN027346	O26:H11	21	1999	USA	clinical
12.1843	CFSAN027350	O26:H11	29	2012	USA	environmental

aThese results were confirmed with in silico serotyping.

bDetermined by in silico MLST.

aThese results were confirmed with in silico serotyping. bDetermined by in silico MLST.

MiSeq sequencing and assembly

These STECs were sequenced using using Nextera XT MiSeq (Illumina) and assembled de novo. Their genomes assembled into 234 to 321 contigs (Table 2), which is within the expected range for de novo assembled STECs. Results from other drafts STEC genomes are typically around >250 contigs (based on our own and other researcher’s observations; data available from NCBI). In silico MLST identified the samples correctly as belonging to ST21 (CFSAN027343 and CFSAN027346), and ST29 (CFSAN027350). in silico serotyping confirmed that these strains were all serotype O26:H11.

Table 2

MiSeq assembly statistics for the 3 O26:H11 STECs.

CFSAN No.	ContigsMiSeq	No.of reads	Q>30Readssize	Totalbases (bp)	N50	Totalgenome (bp)	Averagecoverage X
CFSAN027346	274	2.14E+06	194	4.17E+08	100,594	5.42E+06	77
CFSAN027343	321	2.56E+06	188	4.81E+08	93,611	5.41E+06	89
CFSAN027350	234	2.45E+06	221	5.42E+08	99,163	5.28E+06	103

De novo assembly of the 3 genomes resulted in assemblies that ranged from 5.2 (CFSAN027350) to 5.4 Mb (the other two genomes) (Table 2). As stated earlier, it is notable that de novo assemblies of short-read sequencing technology produced an elevated number of contigs resulting in fractioned assemblies.

MinION sequencing and DNA sequencing library approaches

We sequenced the same three STECs using the nanopore-based MinION device. Two strains were sequenced in triplicate (CFSAN027350 and CFSAN027346); CFSAN027343 was sequenced twice. We used two different DNA library approaches: the 1D ligation kit (SQK-LSK108, expected to produce more output and longer reads) and the rapid sequencing kit (SQK-RAD002, expected to produce lower output and smaller reads, but with a much simpler and straightforward procedure for preparing the DNA library). Table 3 confirms that the 1D ligation kit produced larger outputs than the rapid sequencing kit. Even though the DNA extraction method was the same for each strain, the sequencing output was different, varying from 1 Gb (CFSAN027350a) to 8 Gb (CFSAN027343a). DNA libraries output also varied between replicates (with a higher percentage of reads > 5 kb in the case of CFSAN027346b with 355,653 reads above 5 kb long compared to the same replicate library CFSAN027346a with 139,883 reads above 5 kb (results not shown). As expected, sequencing runs prepared with the rapid kit produced lower output, with an average of 0.71 Gb (Table 3).

Table 3

STECs MinION sequencing output statistics by replicate and library kit.

Run¹	Totalreads	total output (GB)	Average coverage (X)
²CFSAN027343a	4,027,578	8.21	1127
²CFSAN027346a	692,153	3.33	585
²CFSAN027346b	611,279	6.82	1199
²CFSAN027350a	267,065	1.08	189
²CFSAN027350b	1,369,072	5.23	920
³CFSAN027343	173,937	0.93	163
³CFSAN027346	155,628	0.90	158
³CFSAN027350	131,153	0.31	54

1Runs: a) first, and b) second run.

21D Genomic DNA by ligation (SQK-LSK108)

3rapid sequencing kit (SQK-RAD002)

1Runs: a) first, and b) second run. 21D Genomic DNA by ligation (SQK-LSK108) 3rapid sequencing kit (SQK-RAD002) Nevertheless, after running our CANU assembly, we were able to produce a closed bacterial genome for each strain, including the chromosome and plasmid(s) (Table 4). Although there were still variations in chromosome sizes among the replicates, we found overall agreement that CFSAN027343 and CFSAN027350 each contained a chromosome of ~ 5,6 Mb with a single plasmid while CFSAN027346 contained two plasmids. We will describe this plasmid in more detail later in this section.

Table 4

Assembly statistics MinION.

Sample¹	Chromosome(s)contig(s)	plasmids	Chromosomesize (bp)	plasmid(s)size(s) (bp)
²CFSAN027343a	1	1	5,688,712	88,561
³CFSAN027343	1	1	5,688,145	88,702
²CFSAN027346a	1	2	5,588,947	1 (95,599);2 (72,940)
²CFSAN027346b	1	2	5,592,589	1 (95,821);2 (72,972)
³CFSAN027346	1	2	5,592,692	1 (95,696);2 (72,950)
²CFSAN027350a	1	1	5,422,984	157,276
²CFSAN027350b	3	1	5,448,646	157,340
³CFSAN027350	15	1	5,451,905	157,300

1Runs: a) first, and b) second run.

21D Genomic DNA by ligation (SQK-LSK108)

3rapid sequencing kit (SQK-RAD002)

1Runs: a) first, and b) second run. 21D Genomic DNA by ligation (SQK-LSK108) 3rapid sequencing kit (SQK-RAD002)

PacBio sequencing

We next sequenced our three STECs using the PacBio RSII system with a 20-kb insert library protocol. After the library protocol was completed, we sequenced it on three SMRT cells (Table 5). Although the same preparation was used for each isolate, slight differences were noted in the output of raw data among different SMRT cells, (Table 5). The total output of the three SMRT cells for CFSAN027343, CFSAN027346, and CFSAN027350 was 2.3, 2.84. and 2.9 Gb, respectively. The small discrepancy is common since the SMRT cells were loaded with a different binding complex for each strain. The average read of insert lengths for each individual strain CFSAN027343, CFSAN027346, and CFSAN027350 was 10,011 bp, 10,804 bp, and 11,632 bp, respectively.

Table 5

PacBio sequencing output for each SMRT cell.

Strains	SMRT Cell#	Total Reads	Total Output (Gb)	Average Read Length of Insert
CFSAN027343	1	79,102	0.79	9,965
	2	77,649	0.8	10,326
	3	75,856	0.74	9,743
CFSAN027346	1	91,744	1.03	11,252
	2	82,672	0.88	10,677
	3	87,886	0.93	10,484
CFSAN027350	1	81,504	0.95	11,715
	2	82,543	0.96	11,641
	3	87,866	1.01	11,541

To compare assembly performance from Pacbio data, we carried out de novo assembly for every strain with each SMRT cell sequenced. Only for CFSAN027350 the data from three SMRT cells were combined. The overall consensus concordance for the chromosome was between 99.91% and 99.99% and for the plasmids, between 99.95% and 99.99%. All three de novo assemblies for CFSAN027346 and CFSAN027350 generated a single contig for the chromosome and the plasmid(s) (Table 6). The three assemblies made using one SMRT cell for strain CFSAN027343 produced a single contig for the plasmid, but 2–3 contigs for the chromosome, due to the presence of a larger repeat (Table 6). To achieve a single contig for the chromosome, we had to combine the three SMRT cells and set the filter for the minimum subread length to 5000. Using these settings, the chromosome was able to be closed for CFSAN027343 with a coverage of 400X. After manual closure, a Quiver consensus algorithm was run for each consensus thereby achieving a consensus concordance of 100% with an average coverage from 130X to 170X. It is important to note that the sizes of the chromosome identified for CFSAN027346 and CFSAN027350, and the sizes of the plasmids found after three different SMRT cells runs for each strain were almost identical (variation < 0.0003%).

Table 6

Assembly statistics per SMRT cell# for PacBio data using HGAP3.0 and Quiver.

Strains^a	Chromosome contig(s)^b	Chromosome size (bp)^c	Plasmids contig(s)^d	Plasmid(s) size(s) (bp)^e	Coverage (X)^f
CFSAN027343
SMRT Cell # 1	3	5,351,371; 281,039; 81,920	1	88,847	134
SMRT Cell # 2	2	5,525,151; 164,081	1	88,848	134
SMRT Cell # 3	2	5,525,031; 164,076	1	88,847	129
CFSAN027346
SMRT Cell # 1	1	5,592,579	2	1 (96,016), 2 (73,152)	147
SMRT Cell # 2	1	5,592,570	2	1 (96,016), 2 (73,152)	156
SMRT Cell # 3	1	5,592,582	2	1 (96,017), 2 (73,152)	153
CFSAN027350
SMRT Cell # 1	1	5,436,071	1	157,534	170
SMRT Cell # 2	1	5,436,072	1	157,534	166
SMRT Cell # 3	1	5,436,082	1	157,535	176

aThe statistic is listed for each SMRT cell# per isolate.

bThe # represents the assembled contigs that belong to the chromosome.

cThe size (bp) for each assembled contig that belongs to the chromosome.

dThe # represents the assembled contigs that belong to the plasmid(s).

eThe size (bp) for each assembled contig that belongs to the plasmid(s).

fThe number represents the mean coverage for the assembly for each SMRT cell per isolate.

aThe statistic is listed for each SMRT cell# per isolate. bThe # represents the assembled contigs that belong to the chromosome. cThe size (bp) for each assembled contig that belongs to the chromosome. dThe # represents the assembled contigs that belong to the plasmid(s). eThe size (bp) for each assembled contig that belongs to the plasmid(s). fThe number represents the mean coverage for the assembly for each SMRT cell per isolate.

Detection of virulence genes

The output assemblies from our MinION/CANU assembly were used for in silico detection of 95 described virulence genes for E. coli [37], which include genes reported from all pathotypes of pathogenic E. coli, with particular focus on EHECs. To test our hypothesis that Nextera XT MiSeq library sequencing was subpar for obtaining a complete representation of the STEC genomes (chromosome and plasmids), we compared results obtained using MinION assemblies against those obtained from Nextera XT MiSeq assemblies. We confirmed those results using the PacBio assemblies as reference. The 3 O26:H11 STEC strains were positive for 18 of the 95 virulence genes tested (Table 7). Among those genes were: astA, cif, eae, ehxA (plasmid), espA, espB, espF, espJ, espP (plasmid), gad, iha, iss, lpfA, nleA, nleB, nleC, tir, and toxB (plasmid) genes. Other genes were sporadic and strain dependent: tccP gene was present in CFSAN027346 and CFSAN027350, efa1 and katP (plasmid) genes were only present in CFSAN027343 and CFSAN027346. CFSAN027350 was the only strain found to contain espI and stx2a. CFSAN027343 and CFSAN027346 carried a different Shiga toxin phage that instead contained the stx1a variant.

Table 7

Virulence genes present in the O26:H11 STEC MinION assemblies by in silico analysis.

Plasmid borne genes were: espP, toxB, katP, and ehxA.

Assemblies¹	espI	stx2a⁴	stx1a⁵	tccP	efa1	katP
²CFSAN027343a	-	-	+	-	+	+
³CFSAN027343	-	-	+	-	+	+
²CFSAN027346a	-	-	+	+	+	+
²CFSAN027346b	-	-	+	+	+	+
³CFSAN027346	-	-	+	+	+	+
²CFSAN027350a	+	+	-	+	-	-
²CFSAN027350b	+	+	-	+	-	-
³CFSAN027350	+	+	-	+	-	-

All assemblies were positive for astA, cif, eae, ehxA (plasmid), espA, espB, espF, espJ, espP (plasmid), gad, iha, iss, lpfA, nleA, nleB, nleC, tir, and toxB (plasmid) genes.

1Runs: a) first, and b) second run.

21D Genomic DNA by ligation (SQK-LSK108).

3rapid sequencing kit (SQK-RAD002).

4,5shiga toxin genes and their variants.

Virulence genes present in the O26:H11 STEC MinION assemblies by in silico analysis.

Plasmid borne genes were: espP, toxB, katP, and ehxA. All assemblies were positive for astA, cif, eae, ehxA (plasmid), espA, espB, espF, espJ, espP (plasmid), gad, iha, iss, lpfA, nleA, nleB, nleC, tir, and toxB (plasmid) genes. 1Runs: a) first, and b) second run. 21D Genomic DNA by ligation (SQK-LSK108). 3rapid sequencing kit (SQK-RAD002). 4,5shiga toxin genes and their variants. MinION assemblies were congruent and showed the presence/absence of the same genes for each strain (Table 7). We found both nanopore DNA libraries (ligation and rapid) provided sufficient data to assemble and include all virulence genes present, confirming that nanopore technology can provide fast determination of virulence potential by detecting specific virulence genes. Detection of the same virulence genes was also observed with the PacBio assemblies for each strain. However, several virulence genes (i.e., toxB, tccP, iha, and astA) were not detected in some of the genome assemblies obtained with Nextera XT MiSeq, pointing to the corresponding library prep as having been responsible for loss of some of these segments or problems with the assembly of those gene regions.

stx phage identification and location

We assessed whether MinION data could precisely locate where Shiga toxin phages and major pathogenicity islands were located on the genomes, comparing those results with PacBio data–MiSeq was unable to reconstruct individual phages from these strains. For the MinION-assembled chromosome CFSAN027343, we detected 20 prophage regions using Phaster [38], of which: 14 regions were intact, 3 regions were incomplete, and 3 regions were questionable (a sample Phaster result is shown in S3 Fig). The stx carrying phage was 57.6 Kb located in the torS-torT intergenic region gene, different from what was previously known for stx phage insertions (Table 8). The PacBio sequence gave the same number of prophage regions for this genome. Also the stx1 phage in the PacBio assembly was of the same size and in the same location as observed for the MinION assembly.

Table 8

Identification of chromosomal insertion sites for stx phages in the 3 STEC O26:H11 MinION genomes, their stx gene type, regions and coordinates in the genome, and their stx phage sizes.

Insertion of Stx phage in^a
Strains	wrbA	yehV	yecE	sbcB	torS-torT intergenic region	mlaA—ypdK intergenic region	Region phaster- stx phage (coordinates genome)	stx phage size(kb)	stx type
CFSAN027343	-	-	-	-	+	-	14 (3518081–3575683)	57.6	1a
CFSAN027346	+	-	-	-	-	-	8 (2329980–2401033)	69.3	1a
CFSAN027350	-	-	-	-	-	+	14 (3749826–3852495)	102.6	2a
11368 (NC_013361.1)	+	-	-	-	-	-	9 (2347644–2418764)	69.3	1a

a All chromosomes started at the dnaA gene.

a All chromosomes started at the dnaA gene. In the MinION-assembled chromosome in CFSAN027346, Phaster detected 17 prophage regions: 15 regions were intact, 1 region was incomplete, and 1 region was questionable. PacBio sequence for the same genome was similar, although only 16 prophages were identified (14 intact, 1 incomplete and 1 questionable). Surprisingly, akin to PacBio, that particular stx phage was at the classic wrbA gene insertion site. Nonetheless, both MinION and PacBio found the size of the phage to be 69.3 kb (Table 8). In the MinION-assembled chromosome in CFSAN027350, Phaster detected 17 prophage regions: 13 regions were intact, 1 region was incomplete, and 3 regions were questionable. PacBio sequence for the same genome identified 16 prophages (11 intact, 5 incomplete; none were questionable). Again, the phage carrying stx was located at an unusual insertion site, mlaA—ypdK intergenic region, and this stx phage was the biggest among the 3 strains– 102.6 kb (Table 8).

Detection of antimicrobial genes

In silico detection of antimicrobial resistance (AMR) genes using the output assemblies from our MinION pipeline found that only CFSAN027346 carried antimicrobial resistance genes, specifically: aph(3'')-Ib, aph(6)-Id, blaTEM-1B, sul2, tetB, and dfrA. All these genes were contained within a smaller plasmid (73 kb) (Table 4 and Table 6) (S1 Fig). Using the assembly generated by the Nextera XT MiSeq data, in silico analysis showed a similar result but missed the sul2 gene.

cgMLST SNP analysis of O26:H11 genomes–MiSeq, MinION, and PacBio

For this section, de novo assemblies were produced for the nanopore data using our analysis pipeline described in Fig 1 which includes assembly and Nextera XT polishing of the MinION/CANU assemblies. The phylogenetic relationships among E. coli O26:H11/H- strains were determined by a cgMLST SNP analysis shown in Fig 2. This cgMLST was the same as previously published [9]. The genome of E. coli strain 11368 (NC_013361.1) was used as a reference and has 4,554 genes. The resultant NJ SNP tree showed that the O26 genomes analyzed were highly diverse and polyphyletic and that the assemblies for these 3 strains (i.e., MinION polished assemblies) clustered with the assemblies of the genomes for the same strain generated by the other two technologies that are known to be more accurate (i.e., the MiSeq and PacBio data) (Fig 2A). However, the assemblies generated by our pipeline were located in longer branches in the same cluster and showed that they still have many errors that differentiated them from their counterparts of high-quality genomes generated by MiSeq or PacBio by 128–203 SNPs for CFSAN027343 (Fig 2C and S2A Fig) and 92–175 SNPs for CFSAN027346 (Fig 2C and S2B Fig).

Fig 1

Schematic representation of the analysis pipeline used in this study for assembly and polishing of the MinION sequencing output.

Fig 2

Phylogenetic analysis of the O26:H11/H- E. coli strains sequenced in this study by MiSeq, MinION, and PacBio and 195 genomes that are available at GenBank by cgMLST analysis.

The SNPs were extracted from the core loci (1303) and the SNP matrix (5089 SNPs) was used to determine the genetic relationships among the strains. The evolutionary history was inferred by using Neighbor-Joining (NJ) tree built using the genetic distance and showing the existence of high diversity and that O26:H11 strains were polyphyletic. The tree is drawn to scale, with branch lengths measured in the number of substitutions per site. The tree was rooted to E. coli O111:H- strain 11128 (NC_013364). A) The genomes generated by any of the 3 technologies still clustered together by the cgMLST analysis. Snapshot of the clusters formed by the genomes generated by the 3 technologies for B) CFSAN27343, C) CFSAN027346, and D) CFSAN027350 strains, respectively. The names of the strains can be discerned in S5 Fig.

Phylogenetic analysis of the O26:H11/H- E. coli strains sequenced in this study by MiSeq, MinION, and PacBio and 195 genomes that are available at GenBank by cgMLST analysis.

Synteny comparisons of MinION and PacBio assemblies

A comparison of genome synteny between the assemblies produced by MinION/CANU vs. PacBio was performed using Mauve aligner [39]. Overall MinION-generated chromosomes for the 3 strains have the same synteny to the ones generated by PacBio (Fig 3). (Fig 3).

Fig 3

Comparison of the synteny mapping of chromosomes generated by either MinION or PacBio, using the Sanger generated genome for STEC O26:H11 strain 11368 (AP010953) as reference with MAUVE.

Each chromosome sequence is laid out in a horizontal track. Matching colors indicate homologous segments and are connected across genomes. Respective scales show the sequence coordinates in base pairs. A colored similarity plot is shown for each genome, the height of which is proportional to the level of sequence identity in that region. Only strain CFSAN027343 synteny is shown for illustration purposes (The other two strains (CFSAN027346 and CFSAN027350) synteny can be found in S4 Fig).

Comparison of the synteny mapping of chromosomes generated by either MinION or PacBio, using the Sanger generated genome for STEC O26:H11 strain 11368 (AP010953) as reference with MAUVE.

Novel plasmid in O26:H11 (155 kb)

The virulence plasmids of CFSAN027343 and CFSAN027346 were 88 kb and 96 kb, respectively. Interestingly, the virulence plasmid for CFSAN027350 was much larger (157 kb). Both pCFSAN027343 and pCFSAN027346, however, carried these 4 virulence genes: ehxA, espP, toxB, and katP, while pCFSAN027350 carried only ehxA, espP, and toxB. Comparative analyses of the 3 plasmids showed that pCFSAN027350 was very different from the other 2 plasmids (Fig 4), possessing extensive unique regions. Based on these data, pCFSAN027350 contains 214 annotated ORFs with multiple transposons and insertion elements and has a G + C content of 48.4% (S2 Table).

Fig 4

Circular map of virulence plasmid pCFSAN027350 compared to the other two virulence plasmids (pCFSAN027343 and pCFSAN027346), generated using CGView [63].

Blue block arrows in the outer circle denote coding regions in the plasmid, indicating the ORF transcription direction. G+C content is shown in the middle circle and the deviation from average G+C content (47.71%) is displayed in the innermost circle. BLAST comparisons with the other two EHEC plasmids are shown in light red (pCFSAN027343) and green (pCFSAN027346).

Circular map of virulence plasmid pCFSAN027350 compared to the other two virulence plasmids (pCFSAN027343 and pCFSAN027346), generated using CGView [63].

Discussion

Based on MLST analyses, strains of STEC O26:H11/H- can be separated into several STs; of these, ST21 and ST29 have been associated with disease in humans [7]. Therefore, it is especially important to be able to rapidly characterize all detected STECs O26:H11/H- (both chromosome and plasmid(s), if present) to determine whether those clones or strains circulating in particular locations are likely to be threats to human health. Here, we tested the use of such a technique (i.e., Nanopore sequencing) to obtain the complete genomes of 3 unrelated STEC O26:H11 strains followed by subsequent comparisons to those obtained by PacBio and Nextera XT MiSeq WGS platforms. The 3 strains belonged to the EHEC O26 group in accordance with the O26:H11/H- cgMLST scheme reported previously by Gonzalez-Escalona et al. in 2016 [9]. We conducted Nanopore testing using a MinION device connected to a laptop using two different DNA sequencing kits (the 1D ligation kit and the rapid sequencing kit) in the study. Resultant genomes were able to be closed using both DNA sequencing kits deployed here save for CFSAN027350. For that strain we could not generate a complete closed genome using the rapid kit—the chromosome was contained in 15 contigs. Nonetheless, the virulence plasmid for that strain/run was contained in a single contig. One explanation for this discrepancy could be that the WGS output was very low at <0.14 Gb for reads above 5,000 bp (i.e., 21,759 total reads). Also, the minimum amount of reads that were necessary for closing the STEC genome was around 58,000 reads above 5,000 bp for the rapid kit. The number of reads generated by the 1D ligation kit was around 5 times higher than that for the rapid kit, which allowed us to estimate that, in principal, we could run up to 5 different strains per flow cell thus reducing the cost per genome to ~200 USD. This assumption is reinforced by the newer released RAD004 kit, which produces even higher and better-quality output than its predecessor (unpublished observations). In the last 4 years, Pacbio sequencing, combined with de novo assembly, has been an attractive method for closing bacterial chromosomes and the plasmid(s) that they may carry. Indeed, in this study, we verified that by using PacBio technology with P6-C4 chemistry and de novo assembly, the three E. coli genomes including their plasmids studied here were able to be completely sequenced and closed. The three assemblies obtained for each isolate using the raw sequence data from each SMRT cell run, respectively, showed high consistency and reproducibility with a coverage of above 100X and an average read length of 10 to 12 kb. Moreover, all chromosomes and plasmids could be completely closed with the sequence data from one SMRT cell except for the single chromosomal sequence run from CFSAN027343. Fragmented assemblies can occur when the repetitive region in some bacteria is longer than the read lengths. Pathogenic E. coli, in fact, are known for carrying repetitive regions within the chromosome and read lengths obtained from the PacBio sequencer, at times, are not long enough to overcome this challenge. Albeit, Pacbio is a stand alone instrument that does not require finishing or polishing with additional WGS data from another source. Using MinION-generated genomes, polished with Nextera XT MiSeq reads, allowed us to accurately and efficiently determine the virulence genes present in each strain, as well as which genes were located on plasmid(s) or on the chromosome (see Table 7). We also were able to determine the presence of AMR in one of the strains (CFSAN027346) and that was located on an extra plasmid carried solely by that strain (S1 Fig). The other two STEC strains carried a single virulence plasmid albeit with different sizes and gene compositions. It was also noteworthy that none of these strains carried the etpD gene (coding for a Type II secretion protein) in the plasmid, indicative of the presence of the European clone [9,10,40,41]. From the nanopore data analyzed here, the virulence profile for each strain was determined as follows: CFSAN027343 (ST21, stx+, eae-beta1+, plasmid gene profile ehxA+, espP+, katP, and toxB+), CFSAN027346 (ST21, stx+, eae-beta1+, plasmid gene profile ehxA+, espP+, katP, and toxB+), and CFSAN027350 (ST29, stx2a+, eae-beta1+, plasmid gene profile ehxA+, espP+, and toxB+). Besides differences in their Shiga toxin phage content, we detected other virulence genes unique to some of the 3 STECs analyzed here. Among them was the tccP gene, coding for an effector protein (Tir-cytoskeleton coupling protein), found commonly in O157:H7 EHEC strains [42,43] and which was present in strains CFSAN027346 and CFSAN027350. The tccP gene plays a direct role during EHEC infection inducing actin polymerization by coupling Tir to the actin cytoskeleton [44]. The presence of this gene, together with espJ, stx1a or 2a, intimin, and Tir, makes these strains highly pathogenic. Another important illustration of the mosaic of virulence factors found within STEC O26:H11 strains is that two important genes for intestinal colonization (efa1/lifA—a protein with putative glycosyltransferase activity that has an important role in intestinal colonization), and katP- (a catalase-peroxidase, which might help EHEC O157:H7 to colonize host intestines by reducing oxidative stress), were present in both CFSAN027343 and CFSAN027346 [45,46]. We also sequenced the same 3 strains using PacBio, which is considered the gold standard for closing bacterial genomes [18,47,48]. These genomes were compared to the ones generated by nanopore sequencing. Overall, there was fair agreement with the sequences in both the chromosome(s) and plasmid(s) generated from the 3 STEC strains. This was evidenced by the extremely high synteny as well as virulence genes profiles, AMR genes, and overall plasmid content. Despite this extraordinary congruence, some discrepancies were observed for CFSAN027343 where there was a fragment (~ 1.5 Mb) that appears to be reversed in direction on the chromosome between the two, influencing the resultant calling of the stx phage and with MinION stx phage being smaller (57.6 kb) than the PacBio stx-phage (75.6 kb). As observed by other authors, sequencing accuracy continued to be an issue for the genomes generated by nanopore sequencing here. That is, PacBio maintained 99% accuracy (1 SNP observed by cgMLST) while MinION genomes revealed expectedly lower accuracy (92–203 SNPs by cgMLST) (S2 Fig). Without Nextera XT MiSeq polishing, we are unable to accurately place the MinION assemblies -using this cgMLST scheme- into the phylogenetic tree as clustering relies critically on sequence calling using complete ORFs. One of the main drawbacks of MinION nanopore sequencing is that the assemblies contained numerous artifactual indels that will readily translate into incorrect allele calls. This will, in turn, affect the correct clustering of those isolates in the phylogenetic tree. Continued improvements in nanopore-based sequencing chemistry (e.g. RAD004) as well as development of alternative base-calling algorithms such as Scrappie–an open-source transducer neural network [49] may further mitigate this problem. This latter development is designed to aid in the correction of longer homopolymeric runs, one of the main challenges of the nanopore platform. On the other hand, it is equally important to note that PacBio also has continued to mitigate caveats including to time of labor, and read length with recent advances in Sequel chemistry (SMRTbell Express Template Prep Kit, Sequel Sequencing and Binding Kit 3.0 and SMRT Cell 1M v3). By using MinION sequencing we were able to easily identify the location and composition of the shiga toxin carrying phages in the STEC O26:H11 strains. Contrary to what has been observed for EHEC O157:H7 and non-O157:H7 STECs, the location on the chromosome of the stx phages [50] were in different regions of the chromomere for each strain in this study. According to Bonanno et al. (2015), nine Stx phage insertion sites have been described in STEC strains genes: wrbA, yehV, yecE, sbcB, Z2577, ssrA, prfC, argW, torS-torT intergenic region. Delannoy et al. (2017) also described the presence of a Stx phage at a different insertion site yciD for O26:H11/H- strains [51]. For the prototypic Sanger sequence STEC O26:H11 strain 11368, the Stx phage is located at the wrbA gene, a very commonly observed site for stx phage insertions in STECs [50]. The Stx phage in these 3 strains (sequenced in this study) were located known locations on the chromosome (torS-torT intergenic region–CFSAN027343, wrbA- CFSAN027346) except for CFSAN027350 which was located at a novel location in the intergenic region between genes mlaA and ypdK, highlighting once more the high diversity among STECs O26:H11/H- strains. Finally, with both MinION and PacBio, we were able to identify a large virulence plasmid in strain CFSAN027350 (~ 157 kb) (Fig 4). The presence of a large virulence plasmid in O26:H11 strains is not uncommon as was observed for strain H30 (pO26-Vir—168 kb) [14], which contained 5 additional plasmids. As was observed with pO26-Vir, pCFSAN027350 showed a mosaic structure, with many fragments of the plasmid matching other plasmids available at GenBank and contained multiple transposons and insertion sequences, evidence of the mobility of some of the regions (e.g. toxB gene was surrounded by IS elements at both ends of the gene). Nevertheless, this plasmid was highly similar (99%) to another plasmid reported for a clinical strain of O26:H11, albeit of larger size (~181 kb, strain 2013C-3277 plasmid unnamed3, CP027334.1). Overall, the high degree of correlation we found between these two long-read methods for determining plasmids, virulome, antimicrobial resistance genes, and phage composition in STEC O26 strongly indicates that the MinION sequencing technology is an excellent solution for rapidly determining STEC O26 closed genomes and performing comprehensive analysis of their genomic markers. The MinION devices are robust enough to be used to monitor viruses in remote areas [21], yet can provide most of the applications provided on the PacBio platform (methylation patterns, metagenomics, finding location of pathogenicity islands, resolving plasmids, among other applications) [24-26,29,31,32]. To be able to rapidly provide an assessment of possible virulence, antimicrobial resistance potential and disease risk posed by any STEC using this nanopore sequencing technology, is of paramount importance for the protection of the public health and for the tracking of existent and new clones in any geographical region.

Materials and methods

Bacterial strains and media

E. coli strains CFSAN027343, CFSAN027346, and CFSAN027350 (Table 1) were purchased from the E. coli Reference Center (Pennsylvania State University, University Park, PA). These strains were revived in Tryptic Soy Broth and grown overnight at 37°C.

DNA preparation

Genomic DNA from each strain was isolated from overnight cultures using the DNeasy Blood and Tissue Kit (Qiagen, Valencia, CA), following the manufacturer’s instructions. The extracted DNA was stored at -20°C until used as a template for whole genome sequencing. The concentration was determined using a Qubit double-stranded DNA HS assay kit and a Qubit 2.0 fluorometer (Thermo Scientific), according to manufacturer's instructions.

MinION nanopore whole genome sequencing and contigs assembly

The genomes of the strains were sequenced using a MinION device (Nanopore, Oxford, UK), with FLO-MIN106 (R9.4) flow cells, according to the manufacturer’s instructions, at > 50 X—average coverage. The sequencing libraries were prepared using either the 1D Genomic DNA by ligation kit (SQK-LSK108) library chemistry or the rapid sequencing kit (SQK-RAD002) according to the manufacturer’s instructions. An exception was for the 1D ligation kit where we omitted the shearing step and the initial step was the End-prep step, since the DNA extraction step already sheared the DNA. The DNA input was 1 ug per DNA library for all 1D Genomic DNA by ligation kit (SQK-LSK108) and 0.4 ug per rapid sequencing kit (SQK-RAD002) library. Each prepared library was loaded into a FLO-MIN106 flowcell (R9.4) for a 48-hour run. E. coli strains CFSAN027346 and CFSAN027350 were sequenced in duplicate using the ligation kit, and a single sequencing of each strain was conducted using the rapid kit (Table 2). The base calling was performed using Albacore software (Nanopore, Oxford, UK). The fastq files were generated from the base called sequencing fast5 reads using Poretools [52]. Genomic sequence contigs were de novo assembled using default settings within the CANU program [53] v1.6. Determination of the overlapping regions of the chromosome and plasmids were carried out using Gepard [54]. The resultant assemblies from CANU were corrected for errors using Pilon [55] and the Nextera XT MiSeq data generated from those strains. Our de novo assembly and polishing pipeline are shown in Fig 1.

PacBio whole genome sequencing and contigs assembly

The strains were sequenced on the Pacific Biosciences (PacBio) RSII Sequencer, as previously described [2,18,48]. Specifically, we prepared the library using 10 μg genomic DNA that was sheared to a size of 20kb fragments by g-tubes (Covaris, Inc., Woburn, MA) according to the manufacturer’s instruction. The SMRTbell 20-kb template library was constructed using DNA Template Prep Kit 1.0 with the 20-kb insert library protocol (Pacific Biosciences; Menlo Park, CA, USA). Size selection was performed with BluePippin (Sage Science, Beverly, MA). The library was sequenced using the P6/C4 chemistry on 3 single-molecule real-time (SMRT) cells with a 240-min collection protocol along with stage start. Analysis of the sequence reads was implemented using SMRT Analysis 2.3.0. The best de novo assembly was established with the PacBio Hierarchical Genome Assembly Process (HGAP3.0) program using the continuous-long-reads from the four SMRT cells. The assemblies outputs from HGAP contains overlapping regions at the end which can be identified using dot plots in Gepard [54]. Genomes were checked manually for even sequencing coverage. Afterwards the improved consensus sequence was uploaded in SMRT Analysis 2.3.0. to determine the final consensus and accuracy scores using Quiver consensus algorithm [48].

MiSeq whole genome sequencing, contig assembly and annotation

The genomes of the strains were sequenced using an Illumina MiSeq sequencer (Illumina, San Diego, CA), with the 2x250 bp pair-end chemistry according to manufacturer’s instructions, at approximately 80X average coverage. The genome libraries were constructed using the Nextera XT DNA sample prep kit (Illumina). Genomic sequence contigs were de novo assembled using default settings within CLC Genomics Workbench v9.5.2 (QIAGEN) with a minimum contig size threshold of 500 bp in length.

Genome and plasmid annotations

We used RAST (http://rast.nmpdr.org/rast.cgi) for all annotations performed in this study [56].

In silico serotyping

The serotype of each strain analyzed in this study was confirmed using the genes deposited in the Center for Genomic Epidemiology (http://www.genomicepidemiology.org) for E. coli as part of their web-based serotyping tool (SerotypeFinder 1.1 - https://cge.cbs.dtu.dk/services/SerotypeFinder) [57,58].

In silico MLST phylogenetic analysis

The initial analysis and identification of the strains were performed using an in silico E. coli MLST approach, based on the information available at the E. coli MLST website Enterobase (http://enterobase.warwick.ac.uk/species/index/ecoli) and using Ridom SeqSphere+ software v2.4.0 (Ridom; Münster, Germany) (http://www.ridom.com/seqsphere). Seven housekeeping genes (dnaE, gyrB, recA, dtdS, pntA, pyrC, and tnaA), described previously for E. coli [15], were used for MLST analysis.

In silico determination of virulence genes

Virulence genes were determined as previously described [9] using the genes deposited in the Center for Genomic Epidemiology (http://www.genomicepidemiology.org) for E. coli as part of their VirulenceFinder 1.5 web-based tool (https://cge.cbs.dtu.dk/services/VirulenceFinder) [58], We used Ridom to batch screen our set of genomes for known virulence genes. S1 Table shows the 93 virulence genes analyzed by this method.

In silico identification of antimicrobial resistance genes

We identified the antimicrobial resistance (AMR) genes present in our sequenced genomes as previously described by Gonzalez-Escalona [9], using genes deposited in the Center for Genomic Epidemiology (http://www.genomicepidemiology.org) as part of their web-based Resfinder 2.1 tool (https://cge.cbs.dtu.dk/services/ResFinder) [59]. We used Ridom to batch screen our set of genomes for known AMR genes.

Comparisons of genomic synteny

The genomic synteny between PacBio and MinION data was determined using Mauve [39].

stx phage and T3SS identification and location

Prophages and prophage-like elements within the sequenced O26:H11 STEC strains were initially identified using the prophage-predicting PHASTER web server [38]. Next, we used the genomic island prediction web server IslandViewer3 to detect potential pathogenicity and genomic islands [60]. Each identified prophage, and prophage-like elementwas then examined, using CLC Genomics Workbench 7.5, to locate nearby integrases and potential integration sites, which would confirm their status.

Phylogenetic relationship of the strains by cgMLST analysis

Due to intrinsic problems with the sequencing technology used by MinION sequencing–generation of many indels located mainly in areas of homopolymers tracks- we wanted to test the effectiveness of MinION genomes produced by our pipeline to estimate the phylogeny of those 3 STEC strains in a context of 155 other genomes of highly similar O26:H11 available at NCBI. All these O26 genomes from GenBank were generated by MiSeq or HighSeq (Illuimina), considered the “gold standard” for accurate genome sequence determination and SNP analyses. We used Ridom SeqSphere+ software v2.4.0 to assess the phylogenetic relationship of these strains, performing a core genome multilocus sequence typing (cgMLST) analysis as previously described for O26:H11 [9]. cgMLST uses the allele numbers of each loci to determine genetic distances and build the phylogenetic tree. We used O26:H11 strain 11368 (NC_013361.1) as the reference genome for generating the core genes for the phylogenetic tree and downloaded 195 genomes of E. coli O26:H11, available at NCBI, to build the phylogenetic tree. We used Nei’s DNA distance method [61] for calculating the matrix of genetic distance, taking only the number of same/different alleles in the core genes into consideration. A Neighbor-Joining (NJ) tree [62] using the appropriate genetic distances was built after the cgMLST analysis for initial inspection of the placement of the new sequenced genomes by either technology. The SNPs were extracted from the core loci (1303) and the SNP matrix (5089 SNPs) was used to reconstruct the NJ phylogenetic tree. The use of SNPs allowed for determining a true phylogeny and to find informative SNPs in the dataset. The NJ SNP tree was rooted to strain 11128 (O111:H-).

Nucleotide sequence accession numbers

The SRA sequences of all three E. coli strains used in our study are available in GenBank under the accession numbers: MiSeq data (SRR8333591, SRR8333592, and SRR8333590), MinION data (SRR8335317, SRR8335318, and SRR8335317), and PacBio assemblies [CFSAN027343 (CP037943 and CP037944), CFSAN027346 (CP037945, CP037946, and CP037947), and CFSAN027350 (CP037941 and CP037942)].

Annotation of plasmid 73 kb from Strain CFSAN027346 showing the antimicrobial resistance genes.

(DOCX) Click here for additional data file.

SNPs differences observed by a cgMLST analysis between the genomes generated by MiSeq, PacBio, and MinION.

A) CFSAN027343, B) CFSAN027346. (DOCX) Click here for additional data file.

Snapshot of the PHASTER [38] results for CFSAN027343 MinION chromosome.

(DOCX) Click here for additional data file. Each chromosome sequence is laid out in a horizontal track. Matching colors indicate homologous segments and are connected across genomes. Respective scales show the sequence coordinates in base pairs. A colored similarity plot is shown for each genome, the height of which is proportional to the level of sequence identity in that region. Strains CFSAN027346 and CFSAN027350 are shown. Top sanger sequenced strain 11368, middle MinION sequenced strain, and bottom the same strain sequenced by PacBio. (DOCX) Click here for additional data file.

Same phylogenetic tree as Fig 2A but on a larger scale to show the names of the strains.

(DOCX) Click here for additional data file.

E. coli virulence genes tested by in silico virulence typing.

(DOCX) Click here for additional data file.

Summary of ORFs in pCFSAN027350 as annotated by RAST [56].

(DOCX) Click here for additional data file.

62 in total

1. Mauve: multiple alignment of conserved genomic sequence with rearrangements.

Authors: Aaron C E Darling; Bob Mau; Frederick R Blattner; Nicole T Perna
Journal: Genome Res Date: 2004-07 Impact factor: 9.043

2. Circular genome visualization and exploration using CGView.

Authors: Paul Stothard; David S Wishart
Journal: Bioinformatics Date: 2004-10-12 Impact factor: 6.937

3. TccP is an enterohaemorrhagic Escherichia coli O157:H7 type III effector protein that couples Tir to the actin-cytoskeleton.

Authors: Junkal Garmendia; Alan D Phillips; Marie-France Carlier; Yuwen Chong; Stephanie Schüller; Olivier Marches; Sivan Dahan; Eric Oswald; Rob K Shaw; Stuart Knutton; Gad Frankel
Journal: Cell Microbiol Date: 2004-12 Impact factor: 3.715

Review 4. Enteropathogenic and enterohemorrhagic Escherichia coli infections: translocation, translocation, translocation.

Authors: Junkal Garmendia; Gad Frankel; Valérie F Crepin
Journal: Infect Immun Date: 2005-05 Impact factor: 3.441

5. Gepard: a rapid and sensitive tool for creating dotplots on genome scale.

Authors: Jan Krumsiek; Roland Arnold; Thomas Rattei
Journal: Bioinformatics Date: 2007-02-19 Impact factor: 6.937

6. Distribution of tccP in clinical enterohemorrhagic and enteropathogenic Escherichia coli isolates.

Authors: Junkal Garmendia; Zhihong Ren; Sharon Tennant; Monica Aparecida Midolli Viera; Yuwen Chong; Andrew Whale; Kristy Azzopardi; Sivan Dahan; Marcelo Palma Sircili; Marcia Regina Franzolin; Luiz R Trabulsi; Alan Phillips; Tânia A T Gomes; Jianguo Xu; Roy Robins-Browne; Gad Frankel
Journal: J Clin Microbiol Date: 2005-11 Impact factor: 5.948

7. The complete DNA sequence and analysis of the virulence plasmid and of five additional plasmids carried by Shiga toxin-producing Escherichia coli O26:H11 strain H30.

Authors: Pina M Fratamico; Xianghe Yan; Alfredo Caprioli; Giuseppina Esposito; David S Needleman; Tiziana Pepe; Rosangela Tozzoli; Maria Luisa Cortesi; Stefano Morabito
Journal: Int J Med Microbiol Date: 2011-01-05 Impact factor: 3.473

Review 8. Escherichia coli serogroup O26--a new look at an old adversary.

Authors: C Jenkins; J Evans; H Chart; G A Willshaw; G Frankel
Journal: J Appl Microbiol Date: 2008-01 Impact factor: 3.772

9. Sex and virulence in Escherichia coli: an evolutionary perspective.

Authors: Thierry Wirth; Daniel Falush; Ruiting Lan; Frances Colles; Patience Mensa; Lothar H Wieler; Helge Karch; Peter R Reeves; Martin C J Maiden; Howard Ochman; Mark Achtman
Journal: Mol Microbiol Date: 2006-06 Impact factor: 3.501

10. The RAST Server: rapid annotations using subsystems technology.

Authors: Ramy K Aziz; Daniela Bartels; Aaron A Best; Matthew DeJongh; Terrence Disz; Robert A Edwards; Kevin Formsma; Svetlana Gerdes; Elizabeth M Glass; Michael Kubal; Folker Meyer; Gary J Olsen; Robert Olson; Andrei L Osterman; Ross A Overbeek; Leslie K McNeil; Daniel Paarmann; Tobias Paczian; Bruce Parrello; Gordon D Pusch; Claudia Reich; Rick Stevens; Olga Vassieva; Veronika Vonstein; Andreas Wilke; Olga Zagnitko
Journal: BMC Genomics Date: 2008-02-08 Impact factor: 3.969

22 in total

1. Subtyping Evaluation of Salmonella Enteritidis Using Single Nucleotide Polymorphism and Core Genome Multilocus Sequence Typing with Nanopore Reads.

Authors: Zhihan Xian; Shaoting Li; David Ames Mann; Yixiao Huang; Feng Xu; Xingwen Wu; Silin Tang; Guangtao Zhang; Abigail Stevenson; Chongtao Ge; Xiangyu Deng
Journal: Appl Environ Microbiol Date: 2022-07-12 Impact factor: 5.005

2. Phage Genome Annotation: Where to Begin and End.

Authors: Anastasiya Shen; Andrew Millard
Journal: Phage (New Rochelle) Date: 2021-12-16

3. Evaluation of high molecular weight DNA extraction methods for long-read sequencing of Shiga toxin-producing Escherichia coli.

Authors: Sandra Jaudou; Mai-Lan Tran; Fabien Vorimore; Patrick Fach; Sabine Delannoy
Journal: PLoS One Date: 2022-07-13 Impact factor: 3.752

4. Genomic epidemiology of Escherichia coli isolates from a tertiary referral center in Lilongwe, Malawi.

Authors: Gerald Tegha; Emily J Ciccone; Robert Krysiak; James Kaphatika; Tarsizio Chikaonda; Isaac Ndhlovu; David van Duin; Irving Hoffman; Jonathan J Juliano; Jeremy Wang
Journal: Microb Genom Date: 2021-01

5. Comparative Pathogenomics of Escherichia coli: Polyvalent Vaccine Target Identification through Virulome Analysis.

Authors: J R Clark; A M Maresso
Journal: Infect Immun Date: 2021-07-15 Impact factor: 3.441

6. Genomic analyses of multidrug-resistant Salmonella Indiana, Typhimurium, and Enteritidis isolates using MinION and MiSeq sequencing technologies.

Authors: Zhao Chen; Dai Kuang; Xuebin Xu; Narjol González-Escalona; David L Erickson; Eric Brown; Jianghong Meng
Journal: PLoS One Date: 2020-07-02 Impact factor: 3.240

7. Evaluating the accuracy of Listeria monocytogenes assemblies from quasimetagenomic samples using long and short reads.

Authors: Seth Commichaux; Kiran Javkar; Padmini Ramachandran; Niranjan Nagarajan; Denis Bertrand; Yi Chen; Elizabeth Reed; Narjol Gonzalez-Escalona; Errol Strain; Hugh Rand; Mihai Pop; Andrea Ottesen
Journal: BMC Genomics Date: 2021-05-26 Impact factor: 3.969

8. Microevolution and Gain or Loss of Mobile Genetic Elements of Outbreak-Related Listeria monocytogenes in Food Processing Environments Identified by Whole Genome Sequencing Analysis.

Authors: Helen Yang; Maria Hoffmann; Marc W Allard; Eric W Brown; Yi Chen
Journal: Front Microbiol Date: 2020-05-29 Impact factor: 5.640

9. Assembly methods for nanopore-based metagenomic sequencing: a comparative study.

Authors: Adriel Latorre-Pérez; Pascual Villalba-Bermell; Javier Pascual; Cristina Vilanova
Journal: Sci Rep Date: 2020-08-12 Impact factor: 4.379

10. Closing Clostridium botulinum Group I Genomes Using a Combination of Short- and Long-Reads.

Authors: Narjol Gonzalez-Escalona; Shashi K Sharma
Journal: Front Microbiol Date: 2020-02-19 Impact factor: 5.640