
Draft Genome Sequences of the Escherichia coli Reference (ECOR) Collection.

Isha R Patel, Jayanthi Gangiredla, Mark K Mammel, Keith A Lampel, Christopher A Elkins, David W Lacher

Abstract

Here, we report the genomes of all 72 isolates belonging to the Escherichia coli reference (ECOR) collection. Strains in this collection were isolated from diverse hosts and geographic locations and have been used for more than 30 years to represent the phylogenetic diversity of E. coli.


Year: 2018; PMID: 30533715; PMCID: PMC6256646; DOI: 10.1128/MRA.01133-18

Source DB: PubMed; Journal: Microbiol Resour Announc; ISSN: 2576-098X


ANNOUNCEMENT

Escherichia coli has been used as a model species to analyze the processes involved in bacterial genome evolution. More than 30 years ago, a set of strains known as the E. coli reference (ECOR) collection was assembled to represent the known genetic diversity of the species (1). Subsequent phylogenetic studies have shown that pathogenic and nonpathogenic strains of E. coli are randomly distributed when classified into the ECOR phylogroups (2, 3). PCR-based methods were used later to reassign strains ECOR 35, 36, 38, 39, 40, and 41 from phylogroup D to the newly described phylogroup F (4, 5). This finding was also supported by whole-genome-based microarray data (6). Since its creation, the ECOR collection has been widely used by scientists around the world. Unfortunately, during this time, several discrepancies from the original collection have been reported (7). Other researchers have made genome sequence data available for the entire ECOR collection (8). However, we caution against use of these data since nearly half of the strains appear to be contaminated, as evidenced by the presence of multiple molecular serotyping loci within the affected assemblies. For example, the assembly for strain ECOR 46 (GenBank accession number LYCC00000000) contains the wzxO1, wzxO7, wzyO1, wzyO7, fliCH6 and fliCH45 alleles. Here, we report our version of the ECOR collection so that others may use the data to better understand the nature of their differences with the original collection (Table 1).
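The contamination argument above rests on a simple observation: a pure culture should yield one allele type per serotyping gene, so an assembly carrying both O1 and O7 versions of wzx and wzy, or both H6 and H45 versions of fliC, likely mixes more than one strain. The following is a minimal sketch of that check, not the authors' pipeline. The function names, the regular expression, and the restriction to the wzx, wzy, and fliC genes are assumptions made for illustration; real molecular serotyping schemes cover additional loci.

```python
import re
from collections import defaultdict

# Assumed naming scheme: gene name followed by the antigen type,
# e.g. "wzxO1" -> gene wzx, antigen O1; "fliCH45" -> gene fliC, antigen H45.
ALLELE_PATTERN = re.compile(r"^(wzx|wzy|fliC)([OH]\d+)$")


def serotype_calls(alleles):
    """Group the antigen types detected in an assembly by serotyping gene."""
    calls = defaultdict(set)
    for allele in alleles:
        match = ALLELE_PATTERN.match(allele)
        if match is None:
            continue  # ignore names outside the assumed scheme
        gene, antigen = match.groups()
        calls[gene].add(antigen)
    return dict(calls)


def looks_mixed(alleles):
    """Flag an assembly in which any single gene carries more than one antigen type."""
    return any(len(antigens) > 1 for antigens in serotype_calls(alleles).values())


# Alleles reported in the text for the LYCC00000000 assembly of ECOR 46.
ecor46_alleles = ["wzxO1", "wzxO7", "wzyO1", "wzyO7", "fliCH6", "fliCH45"]
print(serotype_calls(ecor46_alleles))
print(looks_mixed(ecor46_alleles))
```

For the ECOR 46 example, each of the three genes is associated with two antigen types, so the assembly is flagged. A hypothetical clean profile such as wzxO1, wzyO1, fliCH6 would not be.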
TABLE 1. Accession numbers and assembly metrics for the 72 ECOR strains

| ECOR strain | SRA run no. | No. of reads | GenBank accession no. | Average coverage (×) | No. of contigs | Genome size (bp) | N50 value (bp) | G+C content (%) | Messerer et al. (8) GenBank accession no. |
|---|---|---|---|---|---|---|---|---|---|
| 1 | SRR3989531 | 2,464,522 | QOWM00000000 | 32.5 | 79 | 4,715,492 | 430,669 | 50.6 | LYBJ00000000^a |
| 2 | SRR7819145 | 6,102,266 | QOWN00000000 | 78.9 | 234 | 5,239,177 | 104,055 | 50.7 | LYBI00000000^a |
| 3 | SRR3951465 | 6,911,676 | QOWO00000000 | 81.7 | 143 | 4,926,647 | 124,670 | 50.7 | LYBH00000000 |
| 4 | SRR3951466 | 5,536,830 | QOWP00000000 | 72.0 | 121 | 4,587,238 | 129,060 | 50.8 | LYBG00000000^a |
| 5 | SRR3951467 | 4,299,776 | QOWQ00000000 | 50.8 | 246 | 5,065,618 | 121,007 | 50.7 | LYBF00000000^a |
| 6 | SRR3951468 | 8,044,114 | QOWR00000000 | 125.0 | 144 | 4,556,862 | 69,692 | 50.9 | LYBE00000000^a |
| 7 | SRR3951469 | 6,530,276 | QOWS00000000 | 80.2 | 90 | 4,896,746 | 169,935 | 50.7 | LYBD00000000 |
| 8 | SRR3951470 | 10,733,010 | QOWT00000000 | 130.6 | 150 | 4,896,359 | 146,113 | 50.5 | LYBC00000000^a |
| 9 | SRR3951471 | 6,022,232 | QOWU00000000 | 68.6 | 287 | 5,177,520 | 67,394 | 50.9 | LYBB00000000 |
| 10 | SRR3951472 | 5,570,968 | QOWV00000000 | 68.6 | 98 | 4,751,057 | 230,922 | 50.6 | LYBA00000000^a |
| 11 | SRR3951473 | 12,657,330 | QOWW00000000 | 145.5 | 280 | 5,182,586 | 144,669 | 50.7 | LYAZ00000000 |
| 12 | SRR3951474 | 9,606,228 | QOWX00000000 | 114.4 | 218 | 5,078,970 | 128,904 | 50.7 | LYAY00000000^a |
| 13 | SRR3951475 | 9,627,106 | QOWY00000000 | 127.8 | 165 | 4,619,031 | 70,958 | 50.8 | LYAX00000000 |
| 14 | SRR3951476 | 6,464,928 | QOWZ00000000 | 79.1 | 150 | 4,941,866 | 112,447 | 50.7 | LYAW00000000^a |
| 15 | SRR3951517 | 4,867,598 | QOXA00000000 | 60.5 | 128 | 4,911,383 | 121,074 | 50.7 | LYAV00000000 |
| 16 | SRR3951477 | 4,046,036 | QOXB00000000 | 52.9 | 126 | 4,636,981 | 93,940 | 50.8 | LYAU00000000 |
| 17 | SRR3951478 | 4,786,784 | QOXC00000000 | 66.1 | 110 | 4,506,698 | 88,853 | 50.7 | LYAT00000000 |
| 18 | SRR3951479 | 9,042,146 | QOXD00000000 | 121.2 | 112 | 4,634,531 | 104,351 | 50.8 | LYAS00000000 |
| 19 | SRR3951480 | 9,318,330 | QOXE00000000 | 142.7 | 164 | 4,535,554 | 82,278 | 50.8 | LYAR00000000 |
| 20 | SRR3951481 | 10,805,000 | QOXF00000000 | 155.2 | 154 | 4,595,721 | 84,141 | 50.8 | LYAQ00000000 |
| 21 | SRR3951482 | 6,045,112 | QOXG00000000 | 85.1 | 159 | 4,568,662 | 85,429 | 50.9 | LYAP00000000 |
| 22 | SRR3951484 | 9,380,404 | QOXH00000000 | 118.8 | 84 | 4,514,994 | 211,903 | 50.8 | LYAO00000000 |
| 23 | SRR3951485 | 4,819,930 | QOXI00000000 | 56.6 | 161 | 5,093,876 | 233,304 | 50.4 | LYAN00000000 |
| 24 | SRR3951486 | 10,917,604 | QOXJ00000000 | 119.9 | 146 | 5,227,547 | 157,129 | 50.7 | LYAM00000000 |
| 25 | SRR3951488 | 7,017,374 | QOXK00000000 | 86.9 | 125 | 4,752,894 | 184,763 | 50.5 | LYAL00000000 |
| 26 | SRR3951489 | 12,549,182 | QOXL00000000 | 153.3 | 93 | 4,678,648 | 236,086 | 50.7 | LYAK00000000 |
| 27 | SRR3951490 | 10,793,212 | QOXM00000000 | 128.6 | 109 | 4,867,073 | 190,031 | 50.5 | LYAJ00000000 |
| 28 | SRR3951491 | 6,645,234 | QOXN00000000 | 81.3 | 109 | 4,925,046 | 187,237 | 50.7 | LYAI00000000 |
| 29 | SRR3951492 | 6,383,664 | QOXO00000000 | 79.1 | 112 | 4,928,564 | 177,525 | 50.6 | LYAH00000000 |
| 30 | SRR3951493 | 6,835,550 | QOXP00000000 | 85.1 | 120 | 4,825,526 | 193,320 | 50.6 | LYAG00000000^a |
| 31 | SRR3951494 | 11,223,998 | QOXQ00000000 | 126.0 | 120 | 5,302,667 | 135,887 | 50.7 | LYAF00000000 |
| 32 | SRR3951518 | 10,348,842 | QOXR00000000 | 129.2 | 129 | 4,794,190 | 185,245 | 50.7 | LYAE00000000 |
| 33 | SRR3951496 | 11,691,094 | QOXS00000000 | 152.7 | 129 | 4,795,454 | 185,228 | 50.7 | LYAD00000000^a |
| 34 | SRR3951497 | 12,857,174 | QOXT00000000 | 165.3 | 128 | 4,908,743 | 154,834 | 50.7 | LYAC00000000^a |
| 35 | SRR3951498 | 9,789,342 | QOXU00000000 | 132.4 | 220 | 5,104,518 | 79,466 | 50.6 | LYAB00000000 |
| 36 | SRR3951499 | 10,108,776 | QOXV00000000 | 134.8 | 279 | 5,231,499 | 58,260 | 50.5 | LYBO00000000 |
| 37 | SRR3951500 | 14,011,950 | QOXW00000000 | 149.3 | 313 | 5,589,959 | 97,745 | 50.3 | LYAA00000000 |
| 38 | SRR3951501 | 19,252,412 | QOXX00000000 | 211.7 | 206 | 5,240,321 | 109,902 | 50.5 | LXZZ00000000^a |
| 39 | SRR3951503 | 16,096,112 | QOXY00000000 | 170.4 | 211 | 5,284,758 | 109,902 | 50.4 | LYCJ00000000^a |
| 40 | SRR3951504 | 5,498,306 | QOXZ00000000 | 62.4 | 190 | 5,201,125 | 109,345 | 50.5 | LYCI00000000^a |
| 41 | SRR3951505 | 12,179,258 | QOYA00000000 | 131.7 | 204 | 5,242,084 | 105,640 | 50.4 | LYCH00000000 |
| 42 | SRR3951506 | 10,293,022 | QOYB00000000 | 114.7 | 111 | 5,189,763 | 467,104 | 50.5 | LYCG00000000^a |
| 43 | SRR3951508 | 10,938,788 | QOYC00000000 | 127.3 | 226 | 5,272,828 | 107,735 | 50.6 | LYCF00000000^a |
| 44 | SRR3951509 | 9,842,910 | QOYD00000000 | 112.4 | 171 | 5,240,115 | 179,757 | 50.6 | LYCE00000000^a |
| 45 | SRR3951510 | 17,558,230 | QOYE00000000 | 226.1 | 95 | 4,726,888 | 210,969 | 50.7 | LYCD00000000^a |
| 46 | SRR3987677 | 2,932,548 | QOYF00000000 | 107.7 | 151 | 5,259,340 | 90,485 | 50.5 | LYCC00000000^a |
| 47 | SRR3951512 | 18,191,986 | QOYG00000000 | 228.5 | 77 | 4,920,788 | 214,558 | 50.6 | LYCB00000000^a |
| 48 | SRR3951513 | 18,794,936 | QOYH00000000 | 218.1 | 131 | 5,333,881 | 171,963 | 50.5 | LYCA00000000^a |
| 49 | SRR3989514 | 4,399,522 | QOYI00000000 | 187.7 | 277 | 5,278,098 | 118,118 | 50.6 | LYBZ00000000 |
| 50 | SRR3989532 | 1,990,960 | QOYJ00000000 | 22.7 | 354 | 5,591,744 | 82,352 | 50.5 | LYBY00000000 |
| 51 | SRR3989533 | 3,562,062 | QOYK00000000 | 44.5 | 135 | 5,169,898 | 234,505 | 50.5 | LYYB00000000^a |
| 52 | SRR7819144 | 5,273,668 | QOYL00000000 | 72.6 | 172 | 5,097,032 | 219,132 | 50.4 | LYCT00000000 |
| 53 | SRR7819143 | 3,163,954 | QOYM00000000 | 43.3 | 141 | 5,131,093 | 244,875 | 50.4 | LYCU00000000 |
| 54 | SRR3989534 | 5,331,188 | QOYN00000000 | 70.4 | 113 | 5,035,502 | 332,014 | 50.4 | LYCV00000000 |
| 55 | SRR3989535 | 4,680,452 | QOYO00000000 | 60.0 | 122 | 5,049,489 | 284,057 | 50.6 | LYCW00000000 |
| 56 | SRR7819142 | 2,047,664 | QOYP00000000 | 28.8 | 110 | 4,956,973 | 196,112 | 50.5 | LYDK00000000 |
| 57 | SRR3989515 | 3,149,298 | QOYQ00000000 | 137.2 | 128 | 5,277,375 | 185,269 | 50.5 | LYCX00000000^a |
| 58 | SRR3989521 | 1,955,120 | QOYR00000000 | 91.7 | 103 | 4,902,658 | 120,850 | 50.6 | LYCY00000000^a |
| 59 | SRR3989507 | 2,122,200 | QOZF00000000 | 99.6 | 84 | 4,764,104 | 194,584 | 50.4 | LYCZ00000000^a |
| 60 | SRR3989522 | 1,995,476 | QOYS00000000 | 90.8 | 135 | 5,068,027 | 305,372 | 50.7 | LYDA00000000^a |
| 61 | SRR3987678 | 2,362,538 | QOYT00000000 | 99.1 | 118 | 4,885,105 | 129,207 | 50.6 | LYDB00000000^a |
| 62 | SRR3989523 | 2,518,590 | QOYU00000000 | 114.4 | 115 | 5,155,092 | 161,545 | 50.4 | LYDL00000000^a |
| 63 | SRR3989529 | 1,790,086 | QOYV00000000 | 78.0 | 162 | 5,113,907 | 113,045 | 50.5 | LYDM00000000 |
| 64 | SRR7819148 | 10,979,330 | QOYW00000000 | 143.2 | 169 | 5,105,087 | 196,230 | 50.8 | LYDC00000000 |
| 65 | SRR3989537 | 3,338,208 | QOYX00000000 | 42.2 | 102 | 4,946,845 | 237,525 | 50.7 | LYDD00000000 |
| 66 | SRR3989538 | 2,878,266 | QOYY00000000 | 39.5 | 47 | 4,739,108 | 239,490 | 50.9 | LYDE00000000^a |
| 67 | SRR3989530 | 1,916,842 | QOYZ00000000 | 170.2 | 82 | 4,725,509 | 198,152 | 50.7 | LYDN00000000 |
| 68 | SRR7819147 | 7,341,922 | QOZA00000000 | 99.6 | 156 | 5,037,255 | 191,327 | 50.7 | LYDF00000000 |
| 69 | SRR3989539 | 2,363,422 | QOZB00000000 | 33.7 | 74 | 4,641,530 | 141,371 | 50.6 | LYDG00000000 |
| 70 | SRR3989540 | 4,864,768 | QOZC00000000 | 61.8 | 135 | 5,123,298 | 225,733 | 50.8 | LYDH00000000^a |
| 71 | SRR3989541 | 4,321,280 | QOZD00000000 | 56.8 | 138 | 4,875,196 | 113,172 | 50.8 | LYDI00000000^a |
| 72 | SRR7819146 | 2,653,970 | QOZE00000000 | 39.2 | 75 | 4,725,669 | 288,629 | 50.6 | LYDJ00000000^a |

^a Possible contamination identified by the presence of multiple O and/or H molecular serotyping loci.

Pure cultures for each strain were grown aerobically overnight in Luria-Bertani broth at 37°C. Total genomic DNA was extracted from 1 ml of overnight culture using the DNeasy blood and tissue kit (Qiagen, Hilden, Germany). DNA extractions were performed with the Qiagen QIAcube instrument using the manufacturer’s protocol for Gram-negative bacteria. Sequencing libraries were prepared with 1 ng of DNA using the Nextera XT DNA sample prep kit (Illumina, San Diego, CA, USA) and sequenced on either the Illumina MiSeq or NextSeq platform. The resulting paired-end reads (2 × 250 bp for MiSeq, 2 × 150 bp for NextSeq) were quality controlled using FastQC (Q score, >30) and de novo assembled using SPAdes 3.8.2 (9) or CLC Genomics Workbench 8.2.1 (CLC bio, Aarhus, Denmark). Depth of coverage for the draft genomes ranged from 23× to 229×, with the genome sizes ranging from 4,506,698 to 5,591,744 bp. The number of contigs ranged from 47 to 354, while the N50 values ranged from 58,260 to 467,104 bp.

Preliminary phylogenetic analysis utilizing polymorphisms present within conserved core genes identified two strains as belonging to a phylogroup inconsistent with their expected ECOR designation. Phylogroup A strains ECOR 7 and ECOR 23 were found to cluster within phylogroups B1 and B2, respectively. The phylogroup F status of strains ECOR 35, 36, 38, 39, 40, and 41 was confirmed by our phylogenetic analysis.
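For readers less familiar with the assembly metrics reported in Table 1, the sketch below shows how the number of contigs, genome size, N50, and G+C content can be derived from a set of contig sequences. It uses the standard N50 definition (the length of the shortest contig in the smallest set of longest contigs that together cover at least half of the assembly). The announcement does not state which tool produced these values, so the function names and the toy sequences here are illustrative assumptions, not the authors' workflow.

```python
def n50(lengths):
    """Return the N50 of a collection of contig lengths."""
    if not lengths:
        raise ValueError("at least one contig is required")
    total = sum(lengths)
    running = 0
    # Walk from the longest contig down until half of the assembly is covered.
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length


def gc_percent(contigs):
    """Return the G+C content (%) over all bases in the contigs.

    Assumption: every character counts toward the denominator,
    including ambiguous bases such as N.
    """
    bases = "".join(contigs).upper()
    gc = bases.count("G") + bases.count("C")
    return 100.0 * gc / len(bases)


def assembly_metrics(contigs):
    """Summarize an assembly given as a list of contig sequences."""
    lengths = [len(contig) for contig in contigs]
    return {
        "contigs": len(contigs),
        "genome_size_bp": sum(lengths),
        "n50_bp": n50(lengths),
        "gc_percent": round(gc_percent(contigs), 1),
    }


# Toy data for illustration only; real draft genomes have millions of bases.
toy_contigs = ["GGCCGGCC", "ATATATAT", "ATGC"]
print(assembly_metrics(toy_contigs))
```

Because N50 depends only on the distribution of contig lengths, two assemblies of similar total size can differ widely in N50, as the range from 58,260 to 467,104 bp in this collection illustrates.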

Data availability.

The draft genome assemblies were deposited at DDBJ/ENA/GenBank under the accession numbers QOWM00000000 to QOZF00000000 and under BioProject accession number PRJNA230969. The versions described in this announcement are the first versions.
REFERENCES

1.  SPAdes: a new genome assembly algorithm and its applications to single-cell sequencing.

Authors:  Anton Bankevich; Sergey Nurk; Dmitry Antipov; Alexey A Gurevich; Mikhail Dvorkin; Alexander S Kulikov; Valery M Lesin; Sergey I Nikolenko; Son Pham; Andrey D Prjibelski; Alexey V Pyshkin; Alexander V Sirotkin; Nikolay Vyahhi; Glenn Tesler; Max A Alekseyev; Pavel A Pevzner
Journal:  J Comput Biol       Date:  2012-04-16       Impact factor: 1.479

2.  Standard reference strains of Escherichia coli from natural populations.

Authors:  H Ochman; R K Selander
Journal:  J Bacteriol       Date:  1984-02       Impact factor: 3.490

3.  The population genetics of commensal Escherichia coli.

Authors:  Olivier Tenaillon; David Skurnik; Bertrand Picard; Erick Denamur
Journal:  Nat Rev Microbiol       Date:  2010-03       Impact factor: 60.633

4.  Guide to the various phylogenetic classification schemes for Escherichia coli and the correspondence among schemes.

Authors:  Olivier Clermont; David Gordon; Erick Denamur
Journal:  Microbiology       Date:  2015-02-24       Impact factor: 2.777

5.  Assigning Escherichia coli strains to phylogenetic groups: multi-locus sequence typing versus the PCR triplex method.

Authors:  David M Gordon; Olivier Clermont; Heather Tolley; Erick Denamur
Journal:  Environ Microbiol       Date:  2008-06-02       Impact factor: 5.491

6.  The Clermont Escherichia coli phylo-typing method revisited: improvement of specificity and detection of new phylo-groups.

Authors:  Olivier Clermont; Julia K Christenson; Erick Denamur; David M Gordon
Journal:  Environ Microbiol Rep       Date:  2012-12-24       Impact factor: 3.541

7.  Investigating the global genomic diversity of Escherichia coli using a multi-genome DNA microarray platform with novel gene prediction strategies.

Authors:  Scott A Jackson; Isha R Patel; Tammy Barnaba; Joseph E LeClerc; Thomas A Cebula
Journal:  BMC Genomics       Date:  2011-07-06       Impact factor: 3.969

8.  Investigation of horizontal gene transfer of pathogenicity islands in Escherichia coli using next-generation sequencing.

Authors:  Maxim Messerer; Wolfgang Fischer; Sören Schubert
Journal:  PLoS One       Date:  2017-07-21       Impact factor: 3.240

