
Data on complete genome sequence and annotation of two multidrug resistant atypical enteropathogenic Escherichia coli O177 serotype isolated from cattle faeces.

Peter Kotsoana Montso, Victor Mlambo, Collins Njie Ateba.

Abstract

Atypical enteropathogenic E. coli (aEPEC) belonging to serotype O177 is a rare strain found in ruminants, especially cattle. Compared with Shiga toxin-producing E. coli (STEC) O157 and non-O157 STEC serotypes (O26, O45, O103, O104, O111, O121, and O145), the antimicrobial resistance, virulence factors, and genomic structure of E. coli O177 are poorly understood. In this article, we therefore present whole genome sequence data for two aEPEC O177 isolates (E. coli O177_CF-154-A and E. coli O177_CF-335-B) generated on the Illumina MiSeq platform. The raw reads were cleaned with Trimmomatic and assembled with SPAdes. The assembled genomes were 5,112,402 and 5,460,435 bp in size, comprising 101 and 191 contigs with GC contents of 50.7% and 50.5% for E. coli O177_CF-154-A and E. coli O177_CF-335-B, respectively. The Prokaryotic Genome Annotation Pipeline (PGAP) and Rapid Annotation using Subsystem Technology (RAST) showed that the complete genome of E. coli O177_CF-154-A contained 5040 coding sequences (CDS), 5146 genes, 4896 proteins, 90 RNAs, and 78 tRNAs, while that of E. coli O177_CF-335-B contained 5463 CDS, 5570 genes, 5230 proteins, 92 RNAs, and 80 tRNAs. A total of 426 and 425 subsystem features with 5190 and 5662 CDS were obtained for E. coli O177_CF-154-A and E. coli O177_CF-335-B, respectively. Several genes encoding virulence and antimicrobial resistance were identified in both genomes. Complete genome sequence data of both isolates have been deposited in the National Center for Biotechnology Information (NCBI) GenBank under accession numbers VMKH00000000 (E. coli O177_CF-154-A) and VMKG00000000 (E. coli O177_CF-335-B). These data can be used as a reference for determining virulence and antimicrobial resistance in E. coli O177 isolates from different sample sources.
© 2022 The Author(s). Published by Elsevier Inc.


Keywords:  Escherichia coli O177; Genome annotation; Genomic data; Virulence and Antimicrobial resistance genes; Whole genome sequence

Year:  2022        PMID: 35510268      PMCID: PMC9058948          DOI: 10.1016/j.dib.2022.108167

Source DB:  PubMed          Journal:  Data Brief        ISSN: 2352-3409


Specifications Table

Value of the Data

These data describe the genomic features of the E. coli O177 serotype. They also give extensive information on the virulence and antimicrobial resistance profile of this serotype, which may improve scientific understanding of this pathogenic strain. Researchers may use the data to develop new methods for detecting the E. coli O177 serotype in different environmental samples. In public health, the data can inform policy frameworks and strategies intended to curb antimicrobial resistance, especially in humans. The genomes can also serve as references, particularly for comparative genomic and epidemiological studies.

Data Description

Two atypical enteropathogenic E. coli O177 isolates (E. coli O177_CF-154-A and E. coli O177_CF-335-B) were obtained from cattle faeces in the North West province, South Africa (27° 00′ 0.00″ S, 26° 00′ 0.00″ E; Fig. 1). Genome sequencing was performed on the Illumina MiSeq platform, yielding 576.5 Mb (CF-154 genome) and 794.3 Mb (CF-335 genome) of raw data. The genome characteristics of the two isolates are summarised in Table 1 and Fig. 2. The genome sizes were 5,112,402 and 5,460,435 bp, comprising 101 and 191 contigs with GC contents of 50.7% and 50.5% for E. coli O177_CF-154-A and E. coli O177_CF-335-B, respectively. The E. coli O177_CF-154-A genome contained 5040 coding sequences (CDS), 5146 genes, 4896 proteins, 90 RNAs, and 78 tRNAs, while the E. coli O177_CF-335-B genome contained 5463 CDS, 5570 genes, 5230 proteins, 92 RNAs, and 80 tRNAs. Both genomes contained 2 CRISPR arrays. Based on RAST annotation, there were 426 and 425 subsystem features with 5190 and 5662 CDS in E. coli O177_CF-154-A and E. coli O177_CF-335-B, respectively. As depicted in Fig. 2, the most abundant subsystem features in both genomes were carbohydrates; amino acids and derivatives; stress response; respiration; DNA metabolism; protein metabolism; membrane transport; and cofactors, vitamins, prosthetic groups, and pigments. The circular draft genome map shown in Fig. 3 was constructed using CGView [1]. The Virulence and Resistance Gene Identifier revealed that both genomes contained several virulence and antimicrobial resistance genes (Figs. 3-7 and supplementary Excel sheets S1 and S2).
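The reported annotation counts can be cross-checked with a little arithmetic. The sketch below uses only figures stated in this article (including the rRNA and ncRNA counts from Table 1) and assumes, as is usual for PGAP summaries, that the total gene count equals total CDS plus RNA genes. Interpreting the difference between total CDS and protein-coding CDS as CDS without a protein product (for example, pseudogenes) is likewise an assumption, not something the article states.

```python
# Annotation counts as reported in the text and Table 1.
REPORTED = {
    "CF-154-A": {"genes": 5146, "cds": 5040, "coding": 4896,
                 "rrna": (8, 4, 6), "trna": 78, "ncrna": 10},
    "CF-335-B": {"genes": 5570, "cds": 5463, "coding": 5230,
                 "rrna": (7, 4, 6), "trna": 80, "ncrna": 10},
}


def check_counts(counts):
    """Return (rna_genes, genes_match, cds_minus_coding) for one isolate.

    genes_match is True when total CDS + RNA genes equals the total gene
    count (assumed PGAP convention).
    """
    rna_genes = sum(counts["rrna"]) + counts["trna"] + counts["ncrna"]
    genes_match = counts["cds"] + rna_genes == counts["genes"]
    return rna_genes, genes_match, counts["cds"] - counts["coding"]


for name, counts in REPORTED.items():
    rna, ok, cds_without_protein = check_counts(counts)
    print(f"{name}: RNA genes={rna}, CDS+RNA==genes: {ok}, "
          f"CDS without protein={cds_without_protein}")
```

Under that assumption the totals are internally consistent for both isolates.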
Fig. 1

An illustration of the North West province map. https://municipalities.co.za/provinces/view/8/north-west.

Table 1

Features of draft genomes of two E. coli O177 isolates obtained from cattle faeces.

                         Sample ID
Features                 E. coli O177_CF-154-A        E. coli O177_CF-335-B
Genome size              5,112,402 bp                 5,460,435 bp
Genome coverage depth    124.7x                       162.128x
Total length             5111092 bp                   5459908 bp
GC content (%)           50.7                         50.5
Number of contigs        101                          191
Contigs N50              127249                       113919
Contigs L50              14                           15
Number of scaffolds      101                          -
Scaffold N50             130301                       -
Scaffold L50             13                           -
Coding genes             4896                         5230
Total genes              5146                         5570
Total CDSs               5040                         5463
Total proteins           4896                         5230
rRNA (5S, 16S, 23S)      8, 4, 6                      7, 4, 6
tRNA                     78                           80
ncRNA                    10                           10
CRISPR arrays            2                            2
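For readers unfamiliar with the contiguity metrics in Table 1, the following is a minimal sketch of how total length, contig count, GC content, N50 and L50 are conventionally computed from a set of contigs. It is not the authors' code (the article's figures come from the assembly and annotation tools), and the function name `assembly_stats` and the toy sequences are illustrative only.

```python
def assembly_stats(contigs):
    """Compute basic assembly metrics from a list of contig sequences.

    N50 is the length of the contig at which the running total of
    lengths (longest first) reaches half of the assembly; L50 is the
    number of contigs needed to get there.
    """
    lengths = sorted((len(seq) for seq in contigs), reverse=True)
    total = sum(lengths)
    gc = sum(seq.upper().count("G") + seq.upper().count("C") for seq in contigs)

    running, n50, l50 = 0, 0, 0
    for rank, length in enumerate(lengths, start=1):
        running += length
        if running * 2 >= total:
            n50, l50 = length, rank
            break

    return {
        "total_length": total,
        "num_contigs": len(lengths),
        # Ambiguous bases (e.g. N) count towards the denominator here.
        "gc_percent": round(100 * gc / total, 1) if total else 0.0,
        "n50": n50,
        "l50": l50,
    }


# Toy contigs, not data from the isolates.
toy = ["GGCC" * 3, "ATAT" * 2, "GATC", "AT"]
print(assembly_stats(toy))
```

Applied to a real assembly, the same function would take the sequences parsed from the contigs FASTA file.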
Fig. 2

Frequency distribution of gene categories in genomes of two E. coli O177 isolates obtained from cattle faeces.

Fig. 3

The circular genome map of E. coli O177 isolates (CF-154-A and CF-335-B) obtained from cattle faeces. Circles display, from inside to outside: GC skew (light orange), GC content (light purple), drug targets (black), transporters (blue), virulence factor genes (yellow), antimicrobial resistance genes (red), non-CDS features (turquoise blue), CDS on the reverse strand (light purple) and CDS on the forward strand (green).

Fig. 4

Distribution of antimicrobial resistance genes in genomes of two E. coli O177 isolates obtained from cattle faeces.

Fig. 5

Distribution of antimicrobial resistance gene families in genomes of two E. coli O177 isolates obtained from cattle faeces.

Fig. 6

Antimicrobial drug classes in genomes of two E. coli O177 isolates from cattle faeces.

Fig. 7

Resistance mechanisms in two E. coli O177 isolates from cattle faeces.
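Summaries of resistance genes by drug class and by resistance mechanism, such as those described in the resistance figure captions, can be tallied from a Resistance Gene Identifier (RGI) result table. The sketch below is one way to do this, not the authors' procedure. It assumes a tab-separated RGI output with the columns `Cut_Off`, `Drug Class` and `Resistance Mechanism`, and with multiple values in a cell separated by semicolons; check these names against your own output file. The rows in `TOY_RGI` are placeholders, not results from the isolates.

```python
import csv
import io
from collections import Counter


def tally_rgi(handle, column, keep=("Perfect", "Strict")):
    """Count values in `column` across RGI hits whose Cut_Off is in `keep`.

    Cells holding several semicolon-separated values contribute one
    count per value.
    """
    counts = Counter()
    for row in csv.DictReader(handle, delimiter="\t"):
        if row.get("Cut_Off") not in keep:
            continue
        for value in (row.get(column) or "").split(";"):
            value = value.strip()
            if value:
                counts[value] += 1
    return counts


# Placeholder rows for illustration only (hypothetical genes and classes).
TOY_RGI = (
    "Best_Hit_ARO\tCut_Off\tDrug Class\tResistance Mechanism\n"
    "geneA\tStrict\tclass X; class Y\tantibiotic efflux\n"
    "geneB\tPerfect\tclass X\tantibiotic inactivation\n"
    "geneC\tLoose\tclass Z\tantibiotic efflux\n"
)

print(tally_rgi(io.StringIO(TOY_RGI), "Drug Class"))
print(tally_rgi(io.StringIO(TOY_RGI), "Resistance Mechanism"))
```

Filtering on Perfect and Strict hits mirrors the default criteria named in the methods; Loose hits are excluded.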

Experimental Design, Materials and Methods

Bacterial strain

Two atypical enteropathogenic E. coli O177 isolates were obtained from the Antimicrobial Resistance and Phage Biocontrol Laboratory, Department of Microbiology. The isolates were selected based on their virulence and antimicrobial resistance profiles, as described in previous studies [2,3]. The stock cultures were removed from −80 °C storage and revived on MacConkey agar. The plates were incubated at 37 °C for 24 hours. After incubation, a single colony of each isolate was transferred into a 15 mL Falcon tube containing 10 mL nutrient broth. The tubes were incubated in a shaking incubator (150 rpm) at 37 °C for 24 hours.

Genomic DNA extraction and Sequencing

Genomic DNA was extracted from overnight cultures using the Zymo Research Genomic DNA™-Tissue MiniPrep Kit (Biolab, South Africa) following the manufacturer's instructions. The DNA concentration was determined using the NanoDrop™-Lite 1,000 spectrophotometer (Thermo Fisher Scientific, Waltham, MA, USA). After fragmentation, DNA libraries were constructed using the Nextera XT DNA library prep kit (Illumina, USA) following the manufacturer's instructions. The fragmented DNA was amplified using 12 cycles of PCR, which adds the index sequences [index 1 (i7) and index 2 (i5)]. The PCR products were purified using 0.6 × Agencourt AMPure XP beads (Beckman Coulter), and their quality was checked on a 1.5% (w/v) agarose gel. Each library was diluted to 12 pmol. Samples were normalized to 4 nM using Nextera XT Library Normalization Beads (Illumina). Normalized libraries were pooled, and 150-base paired-end sequencing was performed with MiSeq Reagent V3 600-cycle kits on the MiSeq instrument (Illumina).
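To illustrate what a 4 nM target means in practice, the sketch below converts a double-stranded DNA library concentration from ng/µL to nM and works out a dilution with C1·V1 = C2·V2. The article reports bead-based normalisation, so this manual calculation is only an illustrative equivalent. The 660 g/mol per base pair figure is the usual approximation for dsDNA, and the example concentration (2.5 ng/µL), mean fragment length (600 bp) and final volume (20 µL) are hypothetical values, not measurements from this study.

```python
AVG_BP_MASS = 660.0  # approximate g/mol per base pair of double-stranded DNA


def library_molarity_nm(conc_ng_per_ul, mean_fragment_bp):
    """Convert a dsDNA library concentration (ng/uL) to molarity (nM)."""
    return conc_ng_per_ul / (AVG_BP_MASS * mean_fragment_bp) * 1e6


def dilution_volumes(stock_nm, target_nm, final_ul):
    """Return (library_ul, diluent_ul) needed to reach target_nm in final_ul."""
    if stock_nm < target_nm:
        raise ValueError("stock is already below the target concentration")
    library_ul = target_nm * final_ul / stock_nm
    return library_ul, final_ul - library_ul


# Hypothetical library: 2.5 ng/uL with a 600 bp mean fragment length.
stock = library_molarity_nm(2.5, 600)
library_ul, diluent_ul = dilution_volumes(stock, target_nm=4.0, final_ul=20.0)
print(f"stock = {stock:.2f} nM; mix {library_ul:.2f} uL library "
      f"with {diluent_ul:.2f} uL diluent")
```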

Genome assembly, annotation and data analysis

Raw sequence data were generated and FASTQ files were obtained. Read quality was assessed using FASTQC (v.0.11.5), and low-quality reads and adapter regions were removed using Trimmomatic (v.0.36) [4,5]. De novo genome assembly was carried out using SPAdes (v.3.13) [5]. Genome annotation was performed using NCBI PGAP (v.5.0), the Prokka pipeline (v.2.1.1), the RAST server (v.2.0) and the PATRIC online server (v.3.6.2) [6-10]. Antimicrobial resistance genes were further mined using the Resistance Gene Identifier online tool of the Comprehensive Antibiotic Resistance Database (CARD; https://card.mcmaster.ca/analyze/rgi) with all parameters (‘Perfect and Strict hits’ and ‘High quality/Coverage’) set at default [11].
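The quality control, trimming and assembly steps above can be sketched as a command-line workflow. The article states that the tools were run on the KBase platform and does not give parameters, so everything below is an assumption for illustration: the file names, the Trimmomatic jar path, the `NexteraPE-PE.fa` adapter file and the trimming settings (taken from the example in the Trimmomatic manual) are not the authors' settings. The function only builds the commands; running them requires the tools to be installed and the output directories to exist.

```python
import subprocess


def build_commands(r1, r2, outdir,
                   trimmomatic_jar="trimmomatic-0.36.jar",
                   adapters="NexteraPE-PE.fa"):
    """Return FastQC, Trimmomatic and SPAdes commands as argument lists.

    All paths and trimming parameters are illustrative assumptions.
    """
    p1, u1 = f"{outdir}/R1.paired.fq.gz", f"{outdir}/R1.unpaired.fq.gz"
    p2, u2 = f"{outdir}/R2.paired.fq.gz", f"{outdir}/R2.unpaired.fq.gz"

    # Read-quality report for the raw reads (output directory must exist).
    fastqc = ["fastqc", r1, r2, "-o", f"{outdir}/fastqc"]

    # Paired-end adapter and quality trimming.
    trimmomatic = [
        "java", "-jar", trimmomatic_jar, "PE", r1, r2, p1, u1, p2, u2,
        f"ILLUMINACLIP:{adapters}:2:30:10",
        "LEADING:3", "TRAILING:3", "SLIDINGWINDOW:4:15", "MINLEN:36",
    ]

    # De novo assembly of the surviving read pairs.
    spades = ["spades.py", "-1", p1, "-2", p2, "-o", f"{outdir}/spades"]

    return [fastqc, trimmomatic, spades]


def run_all(commands):
    """Run each command in order, stopping at the first failure."""
    for cmd in commands:
        subprocess.run(cmd, check=True)


for cmd in build_commands("CF-154-A_R1.fastq.gz", "CF-154-A_R2.fastq.gz", "out"):
    print(" ".join(cmd))
```

Keeping command construction separate from execution makes it easy to inspect or log the exact commands before anything is run.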

Ethics Statements

This study did not involve the use of human subjects or animal experiments.

CRediT authorship contribution statement

Peter Kotsoana Montso: Conceptualization, Methodology, Data curation, Writing – original draft, Visualization, Investigation, Software, Validation, Writing – review & editing. Victor Mlambo: Conceptualization, Methodology, Supervision, Software, Validation, Writing – review & editing. Collins Njie Ateba: Conceptualization, Methodology, Supervision, Software, Validation, Writing – review & editing.

Declaration of Competing Interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Specifications Table

Subject: Microbiology
Specific subject area: Molecular Microbiology and Bioinformatics
Type of data: Table, Figures, Excel sheets
How the data were acquired: Whole genome sequencing was performed using the Illumina MiSeq platform. The FASTQ files were obtained and imported into the KBase platform (https://kbase.us/). The files were subjected to FASTQC (v.0.11.5) to assess read quality. Subsequently, raw data were processed using Trimmomatic (v0.36). Assembly was carried out using SPAdes (v3.13.0), and genome annotation was performed using the Prokaryotic Genome Annotation Pipeline (PGAP), Rapid Annotation using Subsystem Technology (RAST) and the Pathosystems Resource Integration Center (PATRIC).
Data format: Raw, filtered and analysed.
Description of data collection: Genomic DNA was extracted from two aEPEC O177 isolates (CF-154-A and CF-335-B) obtained from the Department of Microbiology at NWU. The gDNA was sequenced using the Illumina MiSeq platform. After sequencing, FASTQ files were obtained. Raw reads were cleaned and assembled into contigs using FASTQC (v.0.11.5) and SPAdes (v3.13.0), respectively. Genome annotation was carried out using PGAP (v.2.0) and RAST (v.2.0). The genome maps were drafted using PATRIC (v.3.6.2).
Data source location: Institution: North-West University; City/Town/Region: North-West Province; Country: South Africa
Data accessibility: Repository name: National Center for Biotechnology Information (NCBI), GenBank, and figshare. Data identification numbers: VMKH00000000 (E. coli O177_CF-154-A) and VMKG00000000 (E. coli O177_CF-335-B); PRJNA555014 and PRJNA554852, SAMN12288806 and SAMN12285021 (for E. coli O177_CF-154-A and E. coli O177_CF-335-B, respectively). Direct URL to data: https://www.ncbi.nlm.nih.gov/nuccore/VMKH00000000, https://www.ncbi.nlm.nih.gov/nuccore/VMKG00000000, https://figshare.com/s/e3a60e4a3d918527b572
Related research article: P. K. Montso, C. C. Bezuidenhout, C. Mienie, Y. M. Somorin, O. A. Odeyemi, V. Mlambo, C. N. Ateba. Genetic diversity and whole genome sequence analysis data of multidrug resistant atypical enteropathogenic E. coli O177 strains: An assessment of food safety and public health implications. Int J Food Microbiol. 2022, https://doi.org/10.1016/j.ijfoodmicro.2022.109555.


1.  SPAdes: a new genome assembly algorithm and its applications to single-cell sequencing.

Authors:  Anton Bankevich; Sergey Nurk; Dmitry Antipov; Alexey A Gurevich; Mikhail Dvorkin; Alexander S Kulikov; Valery M Lesin; Sergey I Nikolenko; Son Pham; Andrey D Prjibelski; Alexey V Pyshkin; Alexander V Sirotkin; Nikolay Vyahhi; Glenn Tesler; Max A Alekseyev; Pavel A Pevzner
Journal: J Comput Biol. Date: 2012-04-16

2.  RefSeq: expanding the Prokaryotic Genome Annotation Pipeline reach with protein family model curation.

Authors:  Wenjun Li; Kathleen R O'Neill; Daniel H Haft; Michael DiCuccio; Vyacheslav Chetvernin; Azat Badretdin; George Coulouris; Farideh Chitsaz; Myra K Derbyshire; A Scott Durkin; Noreen R Gonzales; Marc Gwadz; Christopher J Lanczycki; James S Song; Narmada Thanki; Jiyao Wang; Roxanne A Yamashita; Mingzhang Yang; Chanjuan Zheng; Aron Marchler-Bauer; Françoise Thibaud-Nissen
Journal: Nucleic Acids Res. Date: 2020-12-03

3.  Prokka: rapid prokaryotic genome annotation.

Authors:  Torsten Seemann
Journal: Bioinformatics. Date: 2014-03-18

4.  Genetic diversity and whole genome sequence analysis data of multidrug resistant atypical enteropathogenic Escherichia coli O177 strains: An assessment of food safety and public health implications.

Authors:  Peter Kotsoana Montso; Cornelius Carlos Bezuidenhout; Charlotte Mienie; Yinka M Somorin; Olumide A Odeyemi; Victor Mlambo; Collins Njie Ateba
Journal: Int J Food Microbiol. Date: 2022-01-26

5.  The CGView Server: a comparative genomics tool for circular genomes.

Authors:  Jason R Grant; Paul Stothard
Journal: Nucleic Acids Res. Date: 2008-04-14

6.  Improvements to PATRIC, the all-bacterial Bioinformatics Database and Analysis Resource Center.

Authors:  Alice R Wattam; James J Davis; Rida Assaf; Sébastien Boisvert; Thomas Brettin; Christopher Bun; Neal Conrad; Emily M Dietrich; Terry Disz; Joseph L Gabbard; Svetlana Gerdes; Christopher S Henry; Ronald W Kenyon; Dustin Machi; Chunhong Mao; Eric K Nordberg; Gary J Olsen; Daniel E Murphy-Olson; Robert Olson; Ross Overbeek; Bruce Parrello; Gordon D Pusch; Maulik Shukla; Veronika Vonstein; Andrew Warren; Fangfang Xia; Hyunseung Yoo; Rick L Stevens
Journal: Nucleic Acids Res. Date: 2016-11-29

7.  The First Isolation and Molecular Characterization of Shiga Toxin-Producing Virulent Multi-Drug Resistant Atypical Enteropathogenic Escherichia coli O177 Serogroup From South African Cattle.

Authors:  Peter Kotsoana Montso; Victor Mlambo; Collins Njie Ateba
Journal: Front Cell Infect Microbiol. Date: 2019-09-24

8.  The SEED and the Rapid Annotation of microbial genomes using Subsystems Technology (RAST).

Authors:  Ross Overbeek; Robert Olson; Gordon D Pusch; Gary J Olsen; James J Davis; Terry Disz; Robert A Edwards; Svetlana Gerdes; Bruce Parrello; Maulik Shukla; Veronika Vonstein; Alice R Wattam; Fangfang Xia; Rick Stevens
Journal: Nucleic Acids Res. Date: 2013-11-29

9.  Trimmomatic: a flexible trimmer for Illumina sequence data.

Authors:  Anthony M Bolger; Marc Lohse; Bjoern Usadel
Journal: Bioinformatics. Date: 2014-04-01

10.  CARD 2020: antibiotic resistome surveillance with the comprehensive antibiotic resistance database.

Authors:  Brian P Alcock; Amogelang R Raphenya; Tammy T Y Lau; Kara K Tsang; Mégane Bouchard; Arman Edalatmand; William Huynh; Anna-Lisa V Nguyen; Annie A Cheng; Sihan Liu; Sally Y Min; Anatoly Miroshnichenko; Hiu-Ki Tran; Rafik E Werfalli; Jalees A Nasir; Martins Oloni; David J Speicher; Alexandra Florescu; Bhavya Singh; Mateusz Faltyn; Anastasia Hernandez-Koutoucheva; Arjun N Sharma; Emily Bordeleau; Andrew C Pawlowski; Haley L Zubyk; Damion Dooley; Emma Griffiths; Finlay Maguire; Geoff L Winsor; Robert G Beiko; Fiona S L Brinkman; William W L Hsiao; Gary V Domselaar; Andrew G McArthur
Journal: Nucleic Acids Res. Date: 2020-01-08

