
Closed Genome Sequences of Three Salmonella enterica Strains Belonging to Serovars Saintpaul, Weltevreden, and Thompson, Isolated from Mexico.

Narjol Gonzalez-Escalona¹, J R Aguirre-Sánchez², J R Ibarra-Rodríguez², C Chaidez-Quiroz², Jaime Martinez-Urtaza³.

Abstract

Here, we report the genome sequences of three Salmonella enterica strains belonging to serovars Weltevreden (CFSAN047349), Saintpaul (CFSAN047351), and Thompson (CFSAN047352), isolated from river water in Sinaloa, Mexico. The genomes were closed by a combination of long-read and short-read sequencing. The strain sequence types (STs) are ST365, ST50, and ST26, respectively.


Year: 2019 PMID: 31296688 PMCID: PMC6624771 DOI: 10.1128/MRA.00656-19

Source DB: PubMed Journal: Microbiol Resour Announc ISSN: 2576-098X

ANNOUNCEMENT

Salmonella enterica is one of the most important foodborne pathogens and is implicated in illnesses worldwide (1). Almost 200,000 Salmonella genomes are available at the NCBI, encompassing isolates from many countries. Salmonella spp. have been isolated from many sources, such as the environment, animals, and food. Water can also be a reservoir of Salmonella spp. and has often been implicated as a source of contamination of food and fresh produce (2). Several Salmonella strains were isolated from rivers in Sinaloa, Mexico, during a study of Salmonella prevalence in river waters (2008 to 2010) (3). Here, we sequenced the complete genomes of three Salmonella strains (CFSAN047349, CFSAN047351, and CFSAN047352) isolated during that period using a combination of long-read and short-read sequencing technologies. The availability of these closed genomes will be useful for future outbreak investigations.

The strains were grown overnight in Luria-Bertani (LB) medium at 35°C, and the DNA was extracted with the DNeasy blood and tissue kit (Qiagen). The long reads for each strain were generated through MinION sequencing (Oxford Nanopore Technologies, Oxford, UK). The sequencing library was prepared using the rapid barcoding sequencing kit (SQK-RBK004), in which a transposase present in the fragmentation mix fragments the DNA randomly, yielding fragments of >30 kb. This library was run in a FLO-MIN106 (R9.4.1) flow cell, according to the manufacturer’s instructions, for 48 h. The run was base called live using default settings in MinKNOW v18.12 and Guppy v1.8.7. The sequencing output was 1.6 Gb (199,000 reads); only reads above 5 kb (151,258 reads) were used for the downstream analyses, for an estimated average genome coverage of 25 to 58×.
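The read-length filter and the coverage estimate described above amount to simple arithmetic: keep reads longer than 5 kb, then divide the retained bases by the genome size. The sketch below illustrates this under stated assumptions. The function names and the toy read lengths are hypothetical, and the announcement does not specify which tool was used for filtering or for estimating coverage.

```python
def filter_by_length(read_lengths, min_len=5000):
    """Keep only reads strictly longer than min_len bases."""
    return [n for n in read_lengths if n > min_len]


def estimated_coverage(read_lengths, genome_size):
    """Average depth = total sequenced bases / genome size (in bp)."""
    return sum(read_lengths) / genome_size


# Hypothetical toy data (not from the study): five read lengths in bp
# and a 10-kb "genome" so the arithmetic is easy to follow.
toy_lengths = [1200, 5000, 8000, 12000, 30000]
kept = filter_by_length(toy_lengths)                    # drops 1200 and 5000
toy_cov = estimated_coverage(kept, genome_size=10_000)  # 50,000 / 10,000
print(kept, toy_cov)
```

For a real strain, the same calculation would use the lengths of the reads assigned to that strain's barcode and the chromosome size listed in Table 1.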
The short-read whole-genome sequence for each strain was generated by Illumina MiSeq sequencing with the MiSeq V3 kit using 2 × 250-bp paired-end chemistry (Illumina, San Diego, CA), according to the manufacturer’s instructions, at 80 to 660× coverage. The libraries were constructed from 100 ng of genomic DNA with the Nextera DNA Flex kit (Illumina), according to the manufacturer’s instructions.

The final genome was obtained using a previously described pipeline (4). Briefly, a first de novo assembly was generated from the Nanopore data with Canu v1.7 (5) using default settings. A second assembly was generated with SPAdes (6) as a hybrid assembly (with default settings) using both the Nanopore and MiSeq data generated for each strain. The final corrected assembly was generated by comparing the SPAdes hybrid and Canu assemblies using Mauve (7). The chromosomes and plasmid (if present) were confirmed to be circular and closed by identifying the overlap between the contig ends and trimming it. If the two assemblies agreed in synteny and size, the SPAdes hybrid assembly was used as the final assembly. Each Salmonella genome was of a distinct size (Table 1). The genomes were annotated using the NCBI Prokaryotic Genome Annotation Pipeline (PGAP; https://www.ncbi.nlm.nih.gov/genome/annotation_prok/) (8).
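The circularization check mentioned above (finding the overlap between contig ends and trimming it) can be illustrated with a minimal sketch. It assumes an exact match between the start and the end of the contig; the announcement does not describe how the overlap was detected, and real assemblies usually need an alignment-based comparison that tolerates sequencing errors. The function names, the `min_overlap` parameter, and the toy sequence are hypothetical.

```python
def find_end_overlap(seq, min_overlap=10):
    """Length of the longest prefix of seq that is repeated exactly at its
    end, searched from half the contig length down to min_overlap.
    Returns 0 if no such overlap exists. Quadratic time: toy use only."""
    for k in range(len(seq) // 2, min_overlap - 1, -1):
        if seq[:k] == seq[-k:]:
            return k
    return 0


def trim_circular(seq, min_overlap=10):
    """Remove the redundant copy of the overlap from the contig end."""
    k = find_end_overlap(seq, min_overlap)
    return seq[:-k] if k else seq


# Hypothetical toy contig: the first 12 bases reappear at the end,
# as happens when an assembler reads past the start of a circular molecule.
start = "ACGTACGGTTCA"
contig = start + "GGGCCCAAATTT" + start
overlap = find_end_overlap(contig)   # 12
closed = trim_circular(contig)       # 24 bases, overlap copy removed
print(overlap, len(closed))
```

A contig with no detectable end overlap is returned unchanged, which in practice would flag it as not confirmed circular rather than as linear.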

TABLE 1

Metadata for the three environmental S. enterica strains reported in this study

CFSAN no.	Chromosome GenBank accession no. (size [bp])	Plasmid GenBank accession no. (size [bp])	GC content (%)	Hybrid assembly genome coverage (×)	BioSample accession no.	SRA accession no.	Serotype	ST
CFSAN047349	CP040701 (4,994,320)	CP040702 (104,768)	52	40	SAMN10261265	SRR9099568, SRR9099595	Weltevreden	365
CFSAN047351	CP040700 (4,767,334)		52	41	SAMN10261267	SRR9099567, SRR9099594	Saintpaul	50
CFSAN047352	CP040699 (4,709,901)		52	327	SAMN10261268	SRR9099569, SRR9099596	Thompson	26

In silico multilocus sequence typing (MLST) analyses (https://enterobase.warwick.ac.uk/species/index/senterica) showed that each strain belonged to a different sequence type (ST), with CFSAN047349 belonging to ST365, CFSAN047351 belonging to ST50, and CFSAN047352 belonging to ST26. In silico serotyping using SeqSero (9) (http://www.denglab.info/SeqSero), a tool to infer serovar from the genes that determine antigenic structure, showed that the strains belonged to serovars Weltevreden, Saintpaul, and Thompson, respectively. The GC content was 52%, similar to that of other salmonellae. Only CFSAN047349 carried a plasmid (104,768 bp).
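The GC content reported above is the percentage of G and C among the called bases of an assembly. A minimal sketch of that calculation follows; the function name and the toy sequences are hypothetical, and the announcement does not state which tool produced the 52% figure.

```python
def gc_content(seq):
    """Percentage of G and C among unambiguous bases (A, C, G, T).
    Ambiguous symbols such as N are ignored."""
    seq = seq.upper()
    gc = seq.count("G") + seq.count("C")
    acgt = gc + seq.count("A") + seq.count("T")
    return 100.0 * gc / acgt if acgt else 0.0


# Hypothetical toy sequences, not taken from the reported genomes.
balanced = gc_content("GGCCAATT")   # 4 of 8 bases are G or C
with_gaps = gc_content("gcNNat")    # N ignored, case-insensitive
print(balanced, with_gaps)
```

Applied to a closed chromosome, the same function would take the full sequence string read from the corresponding GenBank record.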

Data availability.

The accession numbers for the genome sequences are listed in Table 1.
