Audrey Segura1, Pauline Auffret1, Christophe Klopp2, Yolande Bertin1, Evelyne Forano1.
Abstract
Escherichia coli is the most abundant facultative anaerobic bacterium in the gastro-intestinal tract of mammals but can cause intestinal infection after acquiring virulence factors. Genomes of pathogenic E. coli strains are widely described, whereas those of bovine commensal E. coli strains are very scarce. Here, we report the genome sequence, annotation, and features of the commensal E. coli BG1 isolated from the gastro-intestinal tract of cattle. Whole genome sequencing analysis showed that BG1 has a chromosome of 4,782,107 bp coding for 4465 proteins and 97 RNAs. E. coli BG1 belonged to serotype O159:H21, was classified in phylogroup B1 and possessed the genetic information encoding "virulence factors" such as adherence systems, iron acquisition and flagella synthesis. A total of 12 adherence systems were detected, reflecting the potential ability of BG1 to colonize different segments of the bovine gastro-intestinal tract. E. coli BG1 is unable to assimilate ethanolamine, a compound that confers a nutritional advantage to some pathogenic E. coli in the bovine gastro-intestinal tract. Genome analysis revealed i) 34 amino acid changes due to non-synonymous SNPs among the genes encoding ethanolamine transport and assimilation, and ii) an additional predicted alpha helix inserted in cobalamin adenosyltransferase, a key enzyme required for ethanolamine assimilation. These modifications could explain the inability of BG1 to use ethanolamine. The BG1 genome can now be used as a reference (control strain) for subsequent evolution and comparative studies.
Keywords: Bovine; Commensal; Escherichia coli; Ethanolamine; Gastro-intestinal tract; Virulence factors; Whole genome sequencing
Year: 2017 PMID: 29046740 PMCID: PMC5634895 DOI: 10.1186/s40793-017-0272-0
Source DB: PubMed Journal: Stand Genomic Sci ISSN: 1944-3277
Classification and general features of E. coli BG1 [58]
| MIGS ID | Property | Term | Evidence code^a |
|---|---|---|---|
| | Classification | Domain | TAS [ |
| | | Phylum | TAS [ |
| | | Class | TAS [ |
| | | Order | TAS [ |
| | | Family | TAS [ |
| | | Genus | TAS [ |
| | | Species | TAS [ |
| | Gram stain | Negative | IDA, TAS [ |
| | Cell shape | Rod | IDA, TAS [ |
| | Motility | Motile | TAS [ |
| | Sporulation | None | TAS [ |
| | Temperature range | | TAS [ |
| | Optimum temperature | | TAS [ |
| | pH range; Optimum | | TAS [ |
| | Carbon source | Carbohydrates, amino acids | IDA, TAS [ |
| MIGS-6 | Habitat | Bovine digestive tract | IDA |
| MIGS-6.3 | Salinity | Not reported | |
| MIGS-22 | Oxygen requirement | Facultative anaerobe | TAS [ |
| MIGS-15 | Biotic relationship | Commensalism | IDA |
| MIGS-14 | Pathogenicity | Non-pathogenic | |
| MIGS-4 | Geographic location | France | |
| MIGS-5 | Sample collection | January 14, 2009 | |
| MIGS-4.1 | Latitude | Not reported | |
| MIGS-4.2 | Longitude | Not reported | |
| MIGS-4.4 | Altitude | Not reported | |
^a Evidence codes: IDA, Inferred from Direct Assay; TAS, Traceable Author Statement (i.e., a direct report exists in the literature); NAS, Non-traceable Author Statement (i.e., not directly observed for the living, isolated sample, but based on a generally accepted property for the species, or anecdotal evidence). These evidence codes are from the Gene Ontology project [68].
Fig. 1 Transmission electron micrograph of E. coli BG1. Strain BG1 is a rod-shaped bacterium with a length of 1.5–2 μm and a diameter of 1 μm. It moves via peritrichous flagella. Magnification: 20,000×. The scale bar indicates 500 nm.
Fig. 2 Phylogenetic tree highlighting the position of E. coli BG1 relative to other E. coli strains. The whole-genome SNP-based phylogeny was established with CSI Phylogeny 1.4 [28] using the genome of K71 as a reference and standard input parameters. The tree was midpoint rooted and plotted using SeaView (version 4.6.1) [56]. Each strain is identified as H (Human), B (Bovine), A (Avian), F (Food) or K12 (Laboratory strain), and its clinical or non-pathogenic (NP) status is specified.
Genome sequencing project information for E. coli BG1
| MIGS ID | Property | Term |
|---|---|---|
| MIGS-31 | Finishing quality | High quality draft |
| MIGS-28 | Libraries used | Paired-end library |
| MIGS-29 | Sequencing platforms | Illumina MiSeq |
| MIGS-31.2 | Fold coverage | 127× |
| MIGS-30 | Assemblers | SPAdes version 3.1.1 |
| MIGS-32 | Gene calling method | Prokka version 1.10 |
| | Locus Tag | BLX34 |
| | GenBank ID | |
| | GenBank Date of Release | 2017-02-24 |
| | GOLD ID | |
| | BIOPROJECT | |
| MIGS-13 | Source Material Identifier | BG1 |
| | Project relevance | Commensal |
Genome statistics
| Attribute | Value | % of Total^a |
|---|---|---|
| Genome size (bp) | 4,782,107 | 100.00 |
| DNA coding (bp) | 4,218,785 | 88.22 |
| DNA G + C (bp) | 2,424,397 | 50.70 |
| DNA scaffolds | 84 | |
| Total genes | 4562 | 100.00 |
| Protein coding genes | 4465 | 97.88 |
| RNA genes | 97 | 2.13 |
| Pseudo genes | 22 | 0.48 |
| Genes in internal clusters | 1171 | 25.67 |
| Genes with function prediction | 3831 | 83.98 |
| Genes assigned to COGs | 3814 | 83.60 |
| Genes with Pfam domains | 275 | 6.03 |
| Genes with signal peptides | 174 | 3.81 |
| Genes with transmembrane helices | 1080 | 23.67 |
| CRISPR repeats | 2 |
^a The total is based on either the size of the genome in base pairs or the total number of protein-coding genes in the annotated genome
All the information has been obtained from Prokka annotation
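The base-pair percentages in the table above follow directly from the genome size. The short Python sketch below recomputes them as a consistency check; the constant and function names are illustrative only and are not part of the original annotation pipeline.

```python
# Values copied from the genome statistics table.
GENOME_SIZE_BP = 4_782_107
CODING_BP = 4_218_785
GC_BP = 2_424_397
TOTAL_GENES = 4562
PROTEIN_CODING_GENES = 4465
RNA_GENES = 97


def percent(part: int, whole: int) -> float:
    """Return `part` as a percentage of `whole`, rounded to two decimals."""
    return round(100 * part / whole, 2)


# Share of the genome covered by coding sequence and by G + C bases.
coding_pct = percent(CODING_BP, GENOME_SIZE_BP)
gc_pct = percent(GC_BP, GENOME_SIZE_BP)

# Protein-coding and RNA genes should add up to the total gene count.
gene_counts_consistent = PROTEIN_CODING_GENES + RNA_GENES == TOTAL_GENES

print(f"coding: {coding_pct}%  G+C: {gc_pct}%  counts consistent: {gene_counts_consistent}")
```

The same `percent` helper can be applied to any other row of the table once the intended denominator is known.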
Number of genes associated with general COGs functional categories
| Code | Value | % age^a | Description |
|---|---|---|---|
| J | 250 | 6.55 | Translation, ribosomal structure and biogenesis |
| A | 2 | 0.05 | RNA processing and modification |
| K | 293 | 7.68 | Transcription |
| L | 154 | 4.04 | Replication, recombination and repair |
| B | 0 | 0.00 | Chromatin structure and dynamics |
| D | 41 | 1.07 | Cell cycle control, cell division, chromosome partitioning |
| V | 93 | 2.44 | Defense mechanisms |
| T | 176 | 4.61 | Signal transduction mechanisms |
| M | 271 | 7.11 | Cell wall/membrane/envelope biogenesis |
| N | 156 | 4.09 | Cell motility |
| U | 60 | 1.57 | Intracellular trafficking, secretion, and vesicular transport |
| O | 153 | 4.01 | Post-translational modifications, protein turnover, chaperones |
| C | 282 | 7.39 | Energy production and conversion |
| G | 381 | 9.99 | Carbohydrate transport and metabolism |
| E | 335 | 8.78 | Amino acid transport and metabolism |
| F | 101 | 2.65 | Nucleotide transport and metabolism |
| H | 169 | 4.43 | Coenzyme transport and metabolism |
| I | 119 | 3.12 | Lipid transport and metabolism |
| P | 190 | 4.98 | Inorganic ion transport and metabolism |
| Q | 53 | 1.39 | Secondary metabolites biosynthesis, transport and catabolism |
| R | 211 | 5.53 | General function prediction only |
| S | 238 | 6.24 | Function unknown |
| – | 750 | 7.43 | Not in COGs |
^a The total is based on the total number of protein-coding genes in the annotated genome
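The per-category percentages can be cross-checked with a few lines of Python. The sketch below assumes, purely for illustration, that the denominator is the 3814 genes assigned to COGs reported in the genome statistics table; under that assumption the recomputed values agree with the listed ones for the 22 lettered categories. The "Not in COGs" row is left out of the sketch, and all names are hypothetical.

```python
# (gene count, listed percentage) per COG code, copied from the table.
COG_TABLE = {
    "J": (250, 6.55), "A": (2, 0.05), "K": (293, 7.68), "L": (154, 4.04),
    "B": (0, 0.00), "D": (41, 1.07), "V": (93, 2.44), "T": (176, 4.61),
    "M": (271, 7.11), "N": (156, 4.09), "U": (60, 1.57), "O": (153, 4.01),
    "C": (282, 7.39), "G": (381, 9.99), "E": (335, 8.78), "F": (101, 2.65),
    "H": (169, 4.43), "I": (119, 3.12), "P": (190, 4.98), "Q": (53, 1.39),
    "R": (211, 5.53), "S": (238, 6.24),
}

# ASSUMPTION: denominator taken from "Genes assigned to COGs" in the
# genome statistics table.
GENES_ASSIGNED_TO_COGS = 3814


def cog_percentages(table: dict, denominator: int) -> dict:
    """Recompute each category's share of `denominator`, in percent."""
    return {
        code: round(100 * count / denominator, 2)
        for code, (count, _listed) in table.items()
    }


recomputed = cog_percentages(COG_TABLE, GENES_ASSIGNED_TO_COGS)

# Collect categories whose recomputed share differs from the listed one
# by more than a rounding step.
mismatches = {
    code: (recomputed[code], listed)
    for code, (_count, listed) in COG_TABLE.items()
    if abs(recomputed[code] - listed) > 0.011
}

print("mismatching categories:", mismatches)
```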
Characteristics of whole genome datasets of different E. coli strains
| Strain name | Phylogroup | Origin^a | Plasmid(s) | Genome size (bp) [chromosome + plasmid(s)] | G + C ratio (%) | CDS (nb) | Protein coding regions (nb) | rRNA operons (nb)^b | tRNA genes (nb) |
|---|---|---|---|---|---|---|---|---|---|
| BG1 | B1 | Bc | 0 | 4,782,107 | 50.7 | 4562 | 4465 | 8 | 86 |
| K71 | B1 | Bc | 0 | 5,115,070 | 50.7 | 5178 | 4872 | 4 | 65 |
| W26 | B1 | Bc | 0 | 5,118,532 | 50.6 | 4925 | 4852 | 4 | 66 |
| Nissle 1917 | B2 | Hc | 0 | 5,441,200 | 50.6 | 5417 | 4970 | 10 | 121 |
| SE15 | B2 | Hc | 1 | 4,839,683 [4,717,338 + 122,345] | 50.7 | 4763 | 4572 | 7 | 85 |
| NCTC86 | A | Hc | 0 | 5,111,920 | 50.6 | 5243 | 4934 | 7 | 87 |
| VL2732 | A | Bp | 0 | 4,664,032 | 50.6 | 4615 | 4363 | 4 | 71 |
| Sakai | E | Hp | 2 | 5,594,477 [5,498,450 + 92,721 + 3306] | 50.5 | 5447 | 5324 | 7 | 103 |
| NCTC9001T | B2 | Hp | 0 | 5,038,133 | 50.6 | 5154 | 4859 | 6 | 62 |
^a B: bovine; H: human; c: commensal; p: pathogen
^b Minimal number of rRNA operons based on Prokka (BG1) or GenBank (K71, W26, VL2732, NCTC86, NCTC9001T) annotation or on rrnDB (version 5.1) information [69]
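For the two plasmid-bearing strains, the table gives the genome size as the sum of chromosome and plasmid lengths. A minimal Python sketch of that bookkeeping, using only the figures listed above (the dictionary layout and function name are illustrative assumptions):

```python
# Total genome size and replicon lengths (chromosome first, then plasmids),
# copied from the strain comparison table.
REPLICONS = {
    "SE15": {"total": 4_839_683, "parts": [4_717_338, 122_345]},
    "Sakai": {"total": 5_594_477, "parts": [5_498_450, 92_721, 3_306]},
}


def replicon_sum_matches(entry: dict) -> bool:
    """Check that chromosome + plasmid lengths add up to the listed total."""
    return sum(entry["parts"]) == entry["total"]


checks = {strain: replicon_sum_matches(entry) for strain, entry in REPLICONS.items()}

# Number of plasmids is every replicon except the chromosome.
plasmid_counts = {strain: len(entry["parts"]) - 1 for strain, entry in REPLICONS.items()}

print(checks, plasmid_counts)
```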
Adherence systems encoded by the E. coli BG1 genome
| Adherence system | Gene or gene cluster | Pathotype^a | In vitro cell adherence^b | Receptor |
|---|---|---|---|---|
| Curli fimbriae | | EHEC, ETEC, aEPEC, APEC | T84 | Matrix, plasma proteins |
| EhaA autotransporter | | EHEC, EAEC, ETEC, AIEC, EPEC | Primary bovine epithelial cells (terminal rectum) | Unknown |
| EhaB autotransporter | | EHEC, UPEC, ETEC, EIEC, EAEC | NA^c | Collagen I, laminin |
| EhaC autotransporter | | EHEC, UPEC | Unknown | Unknown |
| ECP ( | | ETEC, EHEC, NMEC, EAEC, aEPEC, septicemia | HT29, Hep-2, HeLa, HTB-4 | Arabinosyl residues |
| ELF ( | | EHEC, aEPEC | HT29, Hep-2, MDBK | Laminin |
| F9 Fimbriae | | EHEC, UPEC, APEC, AIEC, EAEC, EPEC | EBL | Bovine fibronectin, Galβ1-3GlcNAc |
| EaeH adhesin | | UPEC, EHEC, ETEC, NMEC | UM-UC-3, Caco-2, CHO, HeLa, Vero | Unknown |
| HCP (Hemorrhagic Coli Pilus) | | ETEC, EHEC, aEPEC, APEC | T84, Caco-2, HeLa, Hep-2, MDBK, cow colon explants | Laminin, fibronectin |
| Stg fimbriae | | APEC, UPEC | UM-UC-3, INT 407 | Unknown |
| T1P (Type I pili) | | UPEC, aEPEC, EAEC, APEC, STEC | HeLa, REC, colonic and ileal enterocytes | Mannose |
| UpaG autotransporter | | UPEC | T24 | Fibronectin, laminin |
^a See the “Abbreviations” paragraph
^b Cell lines: T84 (human colonic adenocarcinoma), HT29 (human colorectal adenocarcinoma), Hep-2 (epithelial cells from epidermoid carcinoma of the human larynx), HeLa (human cervix epithelial carcinoma), HTB-4 (human bladder transitional carcinoma), MDBK (Madin-Darby bovine kidney), EBL (embryonic bovine lung), UM-UC-3 (human bladder carcinoma), Caco-2 (human colon carcinoma), CHO (Chinese hamster ovary), Vero (kidney epithelial cells from an African green monkey), INT 407 (HeLa derivative), REC (human B cell lymphoma), T24 (human bladder transitional carcinoma)
^c NA: no adherence to the cell lines tested
Fig. 3 Predicted secondary structure of the EutT protein of E. coli BG1, modeled with Phyre version 2.0 [57]