
Draft Genome Sequences of Isolates of Diverse Host Origin from the E. coli Reference Center at Penn State University.

David W Lacher, Mark K Mammel, Jayanthi Gangiredla, Solomon T Gebru, Tammy J Barnaba, Sydney A Majowicz, Edward G Dudley.

Abstract

Escherichia coli strains present a vast genomic diversity. We report the draft genome sequences of 1,000 isolates from the E. coli Reference Center at Penn State University. These strains were originally isolated from multiple animal and environmental sources over the past 50 years.

Year:  2020        PMID: 32972949      PMCID: PMC7516160          DOI: 10.1128/MRA.01005-20

Source DB:  PubMed          Journal:  Microbiol Resour Announc        ISSN: 2576-098X


ANNOUNCEMENT

Members of the genus Escherichia, specifically Escherichia coli, include pathogenic and nonpathogenic strains. The ability to differentiate these two groups of E. coli has an impact on food safety. As part of the U.S. Food and Drug Administration’s efforts to expand state-of-the-art technology to identify pathogenic E. coli strains, we are developing an in-depth phylogenetic landscape of E. coli that parses these bacteria into different clades. To expand this landscape and give it further depth, whole-genome sequences are essential. Here, we report the draft genome sequences of 1,000 isolates from the culture collection housed at Penn State University’s E. coli Reference Center.

The diverse collection examined in this study contains isolates from animal, environmental, and food sources. E. coli is commonly found as a member of the gut microbiota of warm-blooded organisms and has been isolated from a wide range of animal hosts (1, 2). Phylogenetic analyses have shown that E. coli can be divided into several phylogroups (3, 4), with pathogenic and nonpathogenic strains seemingly randomly distributed among them. This project focuses on the whole-genome sequencing of E. coli isolates from nonhuman animal sources, as well as the environment, that may reveal lineages leading from nonpathogenic to pathogenic strains. Understanding this evolutionary path may provide molecular insight into the acquisition of virulence attributes from an environmental source.

Pure cultures for each strain were grown aerobically overnight in Luria-Bertani broth at 37°C. Total genomic DNA was extracted from 1 ml of overnight culture using the DNeasy blood and tissue kit (Qiagen, Hilden, Germany). DNA extractions were performed with the Qiagen QIAcube instrument using the manufacturer’s Gram-negative bacterium protocol.
Sequencing libraries were prepared with 1 ng DNA using the Nextera XT DNA sample prep kit (Illumina, San Diego, CA, USA) and sequenced on either the Illumina MiSeq or NextSeq platform. The resulting paired-end reads (2 × 250 bp for MiSeq, 2 × 150 bp for NextSeq) were quality assessed by FastQC v0.11.8 (5). Low-quality reads were trimmed to a quality threshold of Q > 30, and adapter sequences were removed using the NexteraPE adapter file in Trimmomatic v0.38 (6). The genomes were de novo assembled with SPAdes v3.13.0 (7) using a k-mer size of 55, and assembly quality assessment was performed with QUAST v5.0 (8). The genomes were automatically annotated using the NCBI Prokaryotic Genome Annotation Pipeline (9). Default parameters were used for all software unless otherwise specified.

The depth of coverage for the draft genomes ranged from 17× to 161×, with the genomes ranging in size from 4,291,381 to 5,764,740 bp. The number of contigs ranged from 50 to 741, while the N50 values ranged from 16,761 to 315,275 bp.

The genomes were placed into one of six categories according to their source: avian, environmental, food, mammal, reptile, or unknown (Table 1). Most (n = 629) of the strains are of mammalian origin, with bovine, porcine, and canine sources being the most common (n = 203, 168, and 92, respectively). Among the 270 isolates of avian origin, chicken and turkey were the most common sources (n = 68 and 60, respectively).

Phylogroups were assigned based on the single nucleotide polymorphisms (SNPs) present within 45 genes found in E. coli K-12 MG1655 (GenBank accession number U00096.3). Briefly, the 45 genes were extracted from each assembly and aligned to the corresponding sequences from K-12 MG1655 using BLAST. A SNP profile of 45 concatenated sites was then used to assign the phylogroup. Each of the established E. coli phylogroups is represented among the 1,000 genomes, namely, phylogroups A (n = 180), B1 (n = 438), B2 (n = 220), D (n = 69), E (n = 38), and F (n = 23).
Twenty isolates belong to one of the following four known “cryptic” lineages of Escherichia (10, 11): lineage 1 (n = 3), lineage 3 (n = 4), lineage 4 (n = 2), and lineage 5 (n = 11). The remaining 12 isolates were classified as undetermined, because their phylogroup could not be assigned using the panel of 45 SNP loci.
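The profile-based assignment described above can be illustrated with a minimal sketch. It assumes that assignment is an exact lookup of the concatenated profile, which the announcement does not state. The site names, the 5-site profiles (rather than 45), and the alleles are all hypothetical and are not the diagnostic SNPs from the figshare listing.

```python
from typing import Dict, List

# Toy reference table: 5 sites rather than 45, with invented alleles.
# The real diagnostic SNPs are in the figshare listing cited in the text.
TOY_PROFILES: Dict[str, str] = {
    "ACGTA": "A",
    "ACGTC": "B1",
    "TCGTA": "B2",
}


def build_profile(alleles: Dict[str, str], site_order: List[str]) -> str:
    """Concatenate one base per site in a fixed order; '-' marks a missing site."""
    return "".join(alleles.get(site, "-") for site in site_order)


def assign_phylogroup(profile: str, known: Dict[str, str]) -> str:
    """Exact-match lookup (an assumption); unmatched profiles are 'undetermined'."""
    return known.get(profile, "undetermined")


sites = ["s1", "s2", "s3", "s4", "s5"]  # hypothetical site names
isolate = {"s1": "A", "s2": "C", "s3": "G", "s4": "T", "s5": "C"}
print(assign_phylogroup(build_profile(isolate, sites), TOY_PROFILES))
```

In this sketch, a profile with a missing or unrecognized site falls through to "undetermined", which mirrors the category used for the 12 isolates that the 45-locus panel could not place.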
TABLE 1

Summary of 1,000 genomes from the E. coli Reference Center

Category       No. of genomes  No. of source species or types  Phylogroup(s) observed  Cryptic lineage(s) observed
Avian          270             18                              A, B1, B2, D, E, F      1, 3, 4, 5
Environmental  62              3                               A, B1, B2, D, E, F      3, 4, 5
Food           37              6                               A, B1, D                None
Mammal         629             41                              A, B1, B2, D, E, F      1
Reptile        1               1                               A                       None
Unknown        1               1                               B1                      None
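As a quick consistency check, the counts given in the text and in Table 1 can be summed. The sketch below uses only figures from the announcement; the variable names are illustrative.

```python
# Genomes per source category (Table 1).
categories = {
    "Avian": 270, "Environmental": 62, "Food": 37,
    "Mammal": 629, "Reptile": 1, "Unknown": 1,
}

# Genomes per phylogroup, cryptic lineage, and undetermined (from the text).
phylogroups = {"A": 180, "B1": 438, "B2": 220, "D": 69, "E": 38, "F": 23}
cryptic = {"lineage 1": 3, "lineage 3": 4, "lineage 4": 2, "lineage 5": 11}
undetermined = 12

category_total = sum(categories.values())
phylogroup_total = sum(phylogroups.values())
cryptic_total = sum(cryptic.values())
classification_total = phylogroup_total + cryptic_total + undetermined

print(category_total, phylogroup_total, cryptic_total, classification_total)
```

Both breakdowns account for all 1,000 genomes: 968 in established phylogroups, 20 in cryptic lineages, and 12 undetermined.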

Data availability.

The draft genome assemblies were deposited at DDBJ/ENA/GenBank through the FDA’s GenomeTrakr pipeline under BioProject accession number PRJNA357722. The versions described in this announcement are the second versions. A full listing of the source and phylogroup information for the 1,000 genomes can be found at https://doi.org/10.6084/m9.figshare.12885527.v2 (12). A list of the 45 genes and diagnostic SNPs used for phylogroup assignment can be found at https://doi.org/10.6084/m9.figshare.12899765.v1 (13).
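For readers who retrieve the assemblies, the contig counts and N50 values quoted in the announcement can be recomputed from contig lengths. The sketch below is not the QUAST implementation; it uses the common definition of N50 as the length of the shortest contig in the set of longest contigs that together cover at least half of the assembly. The function names are illustrative.

```python
from typing import Iterable, List


def contig_lengths(fasta_text: str) -> List[int]:
    """Return the length of each record in FASTA-formatted text."""
    lengths: List[int] = []
    for line in fasta_text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            lengths.append(0)          # header line starts a new contig
        elif lengths:
            lengths[-1] += len(line)   # sequence may span several lines
    return lengths


def n50(lengths: Iterable[int]) -> int:
    """Length at which the longest contigs reach half the total assembly size."""
    ordered = sorted(lengths, reverse=True)
    if not ordered:
        raise ValueError("no contigs supplied")
    half = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half:
            return length
    raise AssertionError("unreachable for a non-empty list")


toy_fasta = ">contig_1\nACGT\nAC\n>contig_2\nGG\n"  # toy input, not real data
lengths = contig_lengths(toy_fasta)
print(len(lengths), sum(lengths), n50(lengths))
```

The number of contigs is `len(lengths)` and the assembly size is `sum(lengths)`, which correspond to the other two statistics reported alongside N50.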
  10 in total

1.  SPAdes: a new genome assembly algorithm and its applications to single-cell sequencing.

Authors:  Anton Bankevich; Sergey Nurk; Dmitry Antipov; Alexey A Gurevich; Mikhail Dvorkin; Alexander S Kulikov; Valery M Lesin; Sergey I Nikolenko; Son Pham; Andrey D Prjibelski; Alexey V Pyshkin; Alexander V Sirotkin; Nikolay Vyahhi; Glenn Tesler; Max A Alekseyev; Pavel A Pevzner
Journal:  J Comput Biol       Date:  2012-04-16       Impact factor: 1.479

2.  QUAST: quality assessment tool for genome assemblies.

Authors:  Alexey Gurevich; Vladislav Saveliev; Nikolay Vyahhi; Glenn Tesler
Journal:  Bioinformatics       Date:  2013-02-19       Impact factor: 6.937

Review 3.  The "Cryptic" Escherichia.

Authors:  Seth T Walk
Journal:  EcoSal Plus       Date:  2015

Review 4.  Escherichia coli from animal reservoirs as a potential source of human extraintestinal pathogenic E. coli.

Authors:  Louise Bélanger; Amélie Garenaux; Josée Harel; Martine Boulianne; Eric Nadeau; Charles M Dozois
Journal:  FEMS Immunol Med Microbiol       Date:  2011-03-24

5.  Phylogenetic distribution of branched RNA-linked multicopy single-stranded DNA among natural isolates of Escherichia coli.

Authors:  P J Herzer; S Inouye; M Inouye; T S Whittam
Journal:  J Bacteriol       Date:  1990-11       Impact factor: 3.490

6.  Cryptic lineages of the genus Escherichia.

Authors:  Seth T Walk; Elizabeth W Alm; David M Gordon; Jeffrey L Ram; Gary A Toranzos; James M Tiedje; Thomas S Whittam
Journal:  Appl Environ Microbiol       Date:  2009-08-21       Impact factor: 4.792

7.  The Clermont Escherichia coli phylo-typing method revisited: improvement of specificity and detection of new phylo-groups.

Authors:  Olivier Clermont; Julia K Christenson; Erick Denamur; David M Gordon
Journal:  Environ Microbiol Rep       Date:  2012-12-24       Impact factor: 3.541

Review 8.  Recent Updates on Outbreaks of Shiga Toxin-Producing Escherichia coli and Its Potential Reservoirs.

Authors:  Jun-Seob Kim; Moo-Seung Lee; Ji Hyung Kim
Journal:  Front Cell Infect Microbiol       Date:  2020-06-04       Impact factor: 5.293

9.  Trimmomatic: a flexible trimmer for Illumina sequence data.

Authors:  Anthony M Bolger; Marc Lohse; Bjoern Usadel
Journal:  Bioinformatics       Date:  2014-04-01       Impact factor: 6.937

10.  NCBI prokaryotic genome annotation pipeline.

Authors:  Tatiana Tatusova; Michael DiCuccio; Azat Badretdin; Vyacheslav Chetvernin; Eric P Nawrocki; Leonid Zaslavsky; Alexandre Lomsadze; Kim D Pruitt; Mark Borodovsky; James Ostell
Journal:  Nucleic Acids Res       Date:  2016-06-24       Impact factor: 16.971
