
Data on complete genome sequence and annotation of two multidrug resistant atypical enteropathogenic Escherichia coli O177 serotype isolated from cattle faeces.

Peter Kotsoana Montso^1,2, Victor Mlambo^3, Collins Njie Ateba^1,2.

Abstract

Atypical enteropathogenic E. coli (aEPEC) belonging to serotype O177 is a rare strain found in ruminants, especially cattle. Compared to Shiga toxin-producing E. coli (STEC) O157 and non-O157 STEC (O26, O45, O103, O104, O111, O121, and O145) serotypes, the antimicrobial resistance, virulence factors, and genomic structure of E. coli O177 are poorly understood. Therefore, in this article, we present the whole genome sequence data of two aEPEC E. coli O177 isolates (E. coli O177_CF-154-A and E. coli O177_CF-335-B) generated using the Illumina MiSeq platform. The raw reads were cleaned using Trimmomatic and assembled using SPAdes. The assembled genomes were 5,112,402 and 5,460,435 bp in size, comprising 101 and 191 contigs with GC contents of 50.7% and 50.5% for E. coli O177_CF-154-A and E. coli O177_CF-335-B, respectively. Prokaryotic Genome Annotation Pipeline (PGAP) and Rapid Annotation using Subsystem Technology (RAST) showed that the complete genome of E. coli O177_CF-154-A contained 5040 coding sequences (CDS), 5146 genes, 4896 proteins, 90 RNAs, and 78 tRNAs, while that of E. coli O177_CF-335-B contained 5463 CDS, 5570 genes, 5230 proteins, 92 RNAs, and 80 tRNAs. A total of 426 and 425 subsystem features with 5190 and 5662 CDS were obtained for E. coli O177_CF-154-A and E. coli O177_CF-335-B, respectively. Several genes encoding virulence and antimicrobial resistance were identified in both genomes. Complete genome sequence data of both isolates have been deposited in the National Center for Biotechnology Information (NCBI) GenBank under accession numbers VMKH00000000 (E. coli O177_CF-154-A) and VMKG00000000 (E. coli O177_CF-335-B). These data can be used as a reference for determining virulence and antimicrobial resistance in E. coli O177 isolates from different sample sources.


Keywords: Escherichia coli O177; Genome annotation; Genomic data; Virulence and Antimicrobial resistance genes; Whole genome sequence

Year: 2022 PMID: 35510268 PMCID: PMC9058948 DOI: 10.1016/j.dib.2022.108167

Source DB: PubMed Journal: Data Brief ISSN: 2352-3409

Specifications Table

Value of the Data

These data describe the genomic features of the E. coli O177 serotype. They also give extensive information on the virulence and antimicrobial resistance profile of this serotype, which may improve scientific understanding of this pathogenic strain. Researchers may use the data to develop new methods for detecting the E. coli O177 serotype in different environmental samples. In addition, the data can be used in public health to establish policy frameworks and strategies intended to curb antimicrobial resistance, especially in humans. These genomes can be used as a reference, especially for comparative genomic and epidemiological studies.

Data Description

Two atypical enteropathogenic E. coli O177 isolates (E. coli O177_CF-154-A and E. coli O177_CF-335-B) were obtained from cattle faeces in the North West province, South Africa (27° 00′ 0.00″ S, 26° 00′ 0.00″ E; Fig. 1). Genome sequencing was performed using the Illumina MiSeq platform, and a total of 576.5 Mb (CF-154-A) and 794.3 Mb (CF-335-B) of raw data were obtained. The genome characteristics of the two isolates are summarised in Table 1 and Fig. 2. The genome sizes were 5,112,402 and 5,460,435 bp, comprising 101 and 191 contigs with GC contents of 50.7% and 50.5% for E. coli O177_CF-154-A and E. coli O177_CF-335-B, respectively. The E. coli O177_CF-154-A genome contained 5040 coding sequences (CDS), 5146 genes, 4896 proteins, 90 RNAs, and 78 tRNAs, while the E. coli O177_CF-335-B genome contained 5463 CDS, 5570 genes, 5230 proteins, 92 RNAs, and 80 tRNAs. Furthermore, both genomes contained 2 CRISPR arrays. Based on RAST annotation, there were 426 and 425 subsystem feature counts with 5190 and 5662 CDS in E. coli O177_CF-154-A and E. coli O177_CF-335-B, respectively. As depicted in Fig. 2, the most abundant subsystem features in both genomes were carbohydrates; amino acids and derivatives; stress response; respiration; DNA metabolism; protein metabolism; membrane transport; and cofactors, vitamins, prosthetic groups, pigments. The circular draft genome map shown in Fig. 3 was constructed using CGView [1]. The Virulence and Resistance Gene Identifier revealed that both genomes contained several virulence and antimicrobial resistance genes (Figs. 3-7 and supplementary Excel sheets S1 and S2).

Fig. 1

Map of the North West province. Source: https://municipalities.co.za/provinces/view/8/north-west.

Table 1

Features of draft genomes of two E. coli O177 isolates obtained from cattle faeces.

	Sample ID
Features	E. coli O177_CF-154-A	E. coli O177_CF-335-B
Genome size	5,112,402 bp	5,460,435 bp
Genome coverage depth	124.7x	162.128x
Total length	5111092 bp	5459908 bp
GC content (%)	50.7	50.5
Number of contigs	101	191
Contigs N50	127249	113919
Contigs L50	14	15
Number of scaffolds	101	-
Scaffold N50	130301	-
Scaffold L50	13	-
Coding genes	4896	5230
Total genes	5146	5570
Total CDSs	5040	5463
Total proteins	4896	5230
rRNA	8, 4, 6 (5S, 16S, 23S)	7, 4, 6 (5S, 16S, 23S)
tRNA	78	80
ncRNA	10	10
CRISPR Arrays	2	2
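
The contig-level metrics in Table 1 (total length, number of contigs, N50, L50 and GC content) follow standard definitions. The sketch below shows one way they could be computed from a list of contig sequences. It is only an illustration: the function name `assembly_stats` and the toy contigs are hypothetical, and the values in Table 1 came from the authors' own tools rather than from this code.

```python
from typing import Dict, List


def assembly_stats(contigs: List[str]) -> Dict[str, float]:
    """Summarise an assembly given its contig sequences."""
    lengths = sorted((len(c) for c in contigs), reverse=True)
    total = sum(lengths)

    # N50: length of the contig at which the running sum of the
    # length-sorted contigs first reaches half of the total length.
    # L50: number of contigs needed to reach that point.
    running = 0
    n50 = l50 = 0
    for rank, length in enumerate(lengths, start=1):
        running += length
        if running * 2 >= total:
            n50, l50 = length, rank
            break

    # GC content: percentage of G and C bases over all bases.
    gc = sum(c.upper().count("G") + c.upper().count("C") for c in contigs)

    return {
        "total_length": total,
        "num_contigs": len(lengths),
        "n50": n50,
        "l50": l50,
        "gc_percent": round(100.0 * gc / total, 1) if total else 0.0,
    }


# Toy contigs, made up for illustration (not from either isolate).
toy = ["GC" * 50, "AT" * 30, "GATC" * 10, "A" * 20]
stats = assembly_stats(toy)
print(stats)
```

In practice the contig list would be read from the assembled FASTA file; the toy input only keeps the example self-contained.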

Fig. 2

Frequency distribution of gene categories in genomes of two E. coli O177 isolates obtained from cattle faeces.

Fig. 3

The circular genome map of E. coli O177 isolates (CF-154-A and CF-335-B) obtained from cattle faeces. Circles display, from inside to outside: GC skew (light orange), GC content (light purple), drug targets (black), transporters (blue), virulence factor genes (yellow), antimicrobial resistance genes (red), non-CDS features (turquoise blue), CDS reverse strand (light purple) and CDS forward strand (green).

Fig. 4

Distribution of antimicrobial resistance genes in genomes of two E. coli O177 isolates obtained from cattle faeces.

Fig. 5

Distribution of antimicrobial resistance gene families in genomes of two E. coli O177 isolates obtained from cattle faeces.

Fig. 6

Antimicrobial drug classes in genomes of two E. coli O177 isolates from cattle faeces.

Fig. 7

Resistance mechanisms in two E. coli O177 isolates from cattle faeces.
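
Summaries of drug classes and resistance mechanisms like those in the figures can be tallied from the tabular output of the Resistance Gene Identifier. The sketch below is a minimal illustration and not the authors' procedure. It assumes a tab-separated file with `Drug Class` and `Resistance Mechanism` columns, with multiple values in a cell separated by semicolons; check these column names against your own RGI output. The function `tally_rgi` and the placeholder rows (geneA, geneB, geneC) are hypothetical and do not represent findings for either isolate.

```python
import csv
import io
from collections import Counter


def tally_rgi(tsv_text: str, column: str) -> Counter:
    """Count how often each category appears in one column of RGI output."""
    counts: Counter = Counter()
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    for row in reader:
        # A single hit may list several categories in one cell.
        for value in row[column].split(";"):
            value = value.strip()
            if value:
                counts[value] += 1
    return counts


# Placeholder rows for illustration only (not real results).
header = ["Best_Hit_ARO", "Cut_Off", "Drug Class", "Resistance Mechanism"]
rows = [
    ["geneA", "Strict", "tetracycline antibiotic", "antibiotic efflux"],
    ["geneB", "Perfect",
     "fluoroquinolone antibiotic; tetracycline antibiotic",
     "antibiotic efflux"],
    ["geneC", "Strict", "aminoglycoside antibiotic",
     "antibiotic inactivation"],
]
toy_tsv = "\n".join("\t".join(fields) for fields in [header] + rows)

drug_classes = tally_rgi(toy_tsv, "Drug Class")
mechanisms = tally_rgi(toy_tsv, "Resistance Mechanism")
print(drug_classes.most_common())
print(mechanisms.most_common())
```

Counting per category rather than per gene means a gene that confers resistance to two drug classes contributes to both bars of a drug-class chart.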

Experimental Design, Materials and Methods

Bacterial strain

Two atypical enteropathogenic E. coli O177 isolates were obtained from the Antimicrobial Resistance and Phage Biocontrol Laboratory, Department of Microbiology. The isolates were selected based on their virulence and antimicrobial resistance profiles as described in previous studies [2,3]. The stock cultures were removed from −80 °C storage and revived on MacConkey agar. The plates were incubated at 37 °C for 24 hours. After incubation, a single colony was transferred into a 15 mL Falcon tube containing 10 mL nutrient broth. The tubes were incubated in a shaking incubator (150 rpm) at 37 °C for 24 hours.

Genomic DNA extraction and Sequencing

Genomic DNA was extracted from overnight cultures using the Zymo Research Genomic DNA™ Tissue MiniPrep Kit (Biolab, South Africa) following the manufacturer's instructions. The DNA concentration was determined using the NanoDrop™-Lite 1,000 spectrophotometer (Thermo Fisher Scientific, Waltham, MA, USA). After fragmentation, DNA libraries were constructed using the Nextera XT DNA library prep kit (Illumina, USA) following the manufacturer's instructions. The fragmented DNA was amplified using 12 cycles of PCR, which adds the index sequences [index 1 (i7) and index 2 (i5)]. The PCR products were purified using 0.6× Agencourt AMPure XP beads (Beckman Coulter), and their quality was checked on a 1.5% (w/v) agarose gel. Each library was diluted to 12 pmol. Samples were normalized to 4 nM using Nextera XT Library Normalization Beads (Illumina). Normalized libraries were pooled, and 150-base paired-end sequencing was performed with MiSeq Reagent V3 600-cycle kits on the MiSeq instrument (Illumina).

Genome assembly, annotation and data analysis

Sequencing produced raw reads in FASTQ format. Read quality was assessed using FASTQC (v.0.11.5), and low-quality reads and adapter regions were filtered out using Trimmomatic (v.0.36) [4,5]. De novo genome assembly was carried out using SPAdes (v.3.13) [5]. Genome annotation was performed using NCBI PGAP (v.5.0), the Prokka pipeline (v.2.1.1), the RAST server (v.2.0) and the PATRIC online server (v.3.6.2) [6-10]. Antimicrobial resistance genes were further mined using the Resistance Gene Identifier online tool of the Comprehensive Antibiotic Resistance Database (CARD) (https://card.mcmaster.ca/analyze/rgi) with all parameters (‘Perfect and Strict hits’ and ‘High quality/Coverage’) set at default [11].
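
The quality control, trimming and assembly steps were run on the KBase platform, but a comparable local workflow can be sketched with the command-line versions of the same tools. The example below only builds and prints the commands; it does not execute them. The file names, the Trimmomatic jar path, the adapter file and the trimming thresholds are all assumptions chosen for illustration, since the article does not state the parameters used.

```python
import shlex
from typing import List


def build_commands(r1: str, r2: str, outdir: str,
                   trimmomatic_jar: str = "trimmomatic-0.36.jar",
                   adapters: str = "NexteraPE-PE.fa") -> List[List[str]]:
    """Return FastQC, Trimmomatic and SPAdes commands for one read pair."""
    # Trimmomatic writes paired and unpaired outputs for each mate.
    p1, u1 = f"{outdir}/R1.paired.fq.gz", f"{outdir}/R1.unpaired.fq.gz"
    p2, u2 = f"{outdir}/R2.paired.fq.gz", f"{outdir}/R2.unpaired.fq.gz"

    # 1. Quality report on the raw reads.
    fastqc = ["fastqc", "-o", outdir, r1, r2]

    # 2. Adapter and quality trimming (thresholds are assumed values).
    trim = ["java", "-jar", trimmomatic_jar, "PE", "-phred33",
            r1, r2, p1, u1, p2, u2,
            f"ILLUMINACLIP:{adapters}:2:30:10",
            "SLIDINGWINDOW:4:15", "MINLEN:36"]

    # 3. De novo assembly of the surviving read pairs.
    spades = ["spades.py", "-1", p1, "-2", p2, "-o", f"{outdir}/spades"]

    return [fastqc, trim, spades]


# Hypothetical file names for one isolate.
commands = build_commands("CF-154-A_R1.fastq.gz",
                          "CF-154-A_R2.fastq.gz", "out")
for cmd in commands:
    print(shlex.join(cmd))
```

Keeping each command as a list of arguments makes it straightforward to pass to `subprocess.run` once the paths and thresholds have been checked against the installed tool versions.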

Ethics Statements

This study did not involve the use of human subjects or animal experiments.

CRediT authorship contribution statement

Peter Kotsoana Montso: Conceptualization, Methodology, Data curation, Writing – original draft, Visualization, Investigation, Software, Validation, Writing – review & editing. Victor Mlambo: Conceptualization, Methodology, Supervision, Software, Validation, Writing – review & editing. Collins Njie Ateba: Conceptualization, Methodology, Supervision, Software, Validation, Writing – review & editing.

Declaration of Competing Interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Subject	Microbiology
Specific subject area	Molecular Microbiology and Bioinformatics
Type of data	Table, Figures, Excel Sheets
How the data were acquired	Whole genome sequencing was performed using the Illumina MiSeq platform. The FASTQ files were obtained and imported into the KBase platform (https://kbase.us/). The files were subjected to FASTQC (v.0.11.5) to assess read quality. Subsequently, raw data were processed using Trimmomatic (v0.36). Assembly was carried out using SPAdes (v3.13.0), and genome annotation was performed using Prokaryotic Genome Annotation Pipeline (PGAP), Rapid Annotation using Subsystem Technology (RAST) and Pathosystems Resource Integration Center (PATRIC).
Data format	Raw, filtered and analysed.
Description of data collection	Genomic DNA was extracted from two aEPEC O177 isolates (CF-154-A and CF-335-B) obtained from the Department of Microbiology at NWU. The gDNA was sequenced using the Illumina MiSeq platform. After sequencing, FASTQ files were obtained. Raw reads were cleaned and assembled into contigs using FASTQC (v.0.11.5) and SPAdes (v3.13.0), respectively. Genome annotation was carried out using PGAP (v.2.0) and RAST (v.2.0). The genome maps were drafted using PATRIC (v.3.6.2).
Data source location	Institution: North-West University; City/Town/Region: North-West Province; Country: South Africa
Data accessibility	Repository name: National Center for Biotechnology Information (NCBI), GenBank, and figshare. Data identification numbers: VMKH00000000 (E. coli O177_CF-154-A) and VMKG00000000 (E. coli O177_CF-335-B); BioProject PRJNA555014 and PRJNA554852, and BioSample SAMN12288806 and SAMN12285021 (for E. coli O177_CF-154-A and E. coli O177_CF-335-B, respectively). Direct URL to data: https://www.ncbi.nlm.nih.gov/nuccore/VMKH00000000, https://www.ncbi.nlm.nih.gov/nuccore/VMKG00000000, https://figshare.com/s/e3a60e4a3d918527b572
Related research article	P. K. Montso, C. C. Bezuidenhout, C. Mienie, Y. M. Somorin, O. A. Odeyemi, V. Mlambo, C. N. Ateba. Genetic diversity and whole genome sequence analysis data of multidrug resistant atypical enteropathogenic E. coli O177 strains: An assessment of food safety and public health implications. Int J Food Microbiol. 2022, https://doi.org/10.1016/j.ijfoodmicro.2022.109555.

10 in total

1. SPAdes: a new genome assembly algorithm and its applications to single-cell sequencing.

Authors: Anton Bankevich; Sergey Nurk; Dmitry Antipov; Alexey A Gurevich; Mikhail Dvorkin; Alexander S Kulikov; Valery M Lesin; Sergey I Nikolenko; Son Pham; Andrey D Prjibelski; Alexey V Pyshkin; Alexander V Sirotkin; Nikolay Vyahhi; Glenn Tesler; Max A Alekseyev; Pavel A Pevzner
Journal: J Comput Biol Date: 2012-04-16 Impact factor: 1.479

2. RefSeq: expanding the Prokaryotic Genome Annotation Pipeline reach with protein family model curation.

Authors: Wenjun Li; Kathleen R O'Neill; Daniel H Haft; Michael DiCuccio; Vyacheslav Chetvernin; Azat Badretdin; George Coulouris; Farideh Chitsaz; Myra K Derbyshire; A Scott Durkin; Noreen R Gonzales; Marc Gwadz; Christopher J Lanczycki; James S Song; Narmada Thanki; Jiyao Wang; Roxanne A Yamashita; Mingzhang Yang; Chanjuan Zheng; Aron Marchler-Bauer; Françoise Thibaud-Nissen
Journal: Nucleic Acids Res Date: 2020-12-03 Impact factor: 16.971

3. Prokka: rapid prokaryotic genome annotation.

Authors: Torsten Seemann
Journal: Bioinformatics Date: 2014-03-18 Impact factor: 6.937

4. Genetic diversity and whole genome sequence analysis data of multidrug resistant atypical enteropathogenic Escherichia coli O177 strains: An assessment of food safety and public health implications.

Authors: Peter Kotsoana Montso; Cornelius Carlos Bezuidenhout; Charlotte Mienie; Yinka M Somorin; Olumide A Odeyemi; Victor Mlambo; Collins Njie Ateba
Journal: Int J Food Microbiol Date: 2022-01-26 Impact factor: 5.277

5. The CGView Server: a comparative genomics tool for circular genomes.

Authors: Jason R Grant; Paul Stothard
Journal: Nucleic Acids Res Date: 2008-04-14 Impact factor: 16.971

6. Improvements to PATRIC, the all-bacterial Bioinformatics Database and Analysis Resource Center.

Authors: Alice R Wattam; James J Davis; Rida Assaf; Sébastien Boisvert; Thomas Brettin; Christopher Bun; Neal Conrad; Emily M Dietrich; Terry Disz; Joseph L Gabbard; Svetlana Gerdes; Christopher S Henry; Ronald W Kenyon; Dustin Machi; Chunhong Mao; Eric K Nordberg; Gary J Olsen; Daniel E Murphy-Olson; Robert Olson; Ross Overbeek; Bruce Parrello; Gordon D Pusch; Maulik Shukla; Veronika Vonstein; Andrew Warren; Fangfang Xia; Hyunseung Yoo; Rick L Stevens
Journal: Nucleic Acids Res Date: 2016-11-29 Impact factor: 16.971

7. The First Isolation and Molecular Characterization of Shiga Toxin-Producing Virulent Multi-Drug Resistant Atypical Enteropathogenic Escherichia coli O177 Serogroup From South African Cattle.

Authors: Peter Kotsoana Montso; Victor Mlambo; Collins Njie Ateba
Journal: Front Cell Infect Microbiol Date: 2019-09-24 Impact factor: 5.293

8. The SEED and the Rapid Annotation of microbial genomes using Subsystems Technology (RAST).

Authors: Ross Overbeek; Robert Olson; Gordon D Pusch; Gary J Olsen; James J Davis; Terry Disz; Robert A Edwards; Svetlana Gerdes; Bruce Parrello; Maulik Shukla; Veronika Vonstein; Alice R Wattam; Fangfang Xia; Rick Stevens
Journal: Nucleic Acids Res Date: 2013-11-29 Impact factor: 16.971

9. Trimmomatic: a flexible trimmer for Illumina sequence data.

Authors: Anthony M Bolger; Marc Lohse; Bjoern Usadel
Journal: Bioinformatics Date: 2014-04-01 Impact factor: 6.937

10. CARD 2020: antibiotic resistome surveillance with the comprehensive antibiotic resistance database.

Authors: Brian P Alcock; Amogelang R Raphenya; Tammy T Y Lau; Kara K Tsang; Mégane Bouchard; Arman Edalatmand; William Huynh; Anna-Lisa V Nguyen; Annie A Cheng; Sihan Liu; Sally Y Min; Anatoly Miroshnichenko; Hiu-Ki Tran; Rafik E Werfalli; Jalees A Nasir; Martins Oloni; David J Speicher; Alexandra Florescu; Bhavya Singh; Mateusz Faltyn; Anastasia Hernandez-Koutoucheva; Arjun N Sharma; Emily Bordeleau; Andrew C Pawlowski; Haley L Zubyk; Damion Dooley; Emma Griffiths; Finlay Maguire; Geoff L Winsor; Robert G Beiko; Fiona S L Brinkman; William W L Hsiao; Gary V Domselaar; Andrew G McArthur
Journal: Nucleic Acids Res Date: 2020-01-08 Impact factor: 16.971
