Shuhaila Mat-Sharani, Suhaila Sulaiman, Nik Yusnoraini Yusof.
Abstract
Escherichia coli strain INF32/16/A is a Gram-negative bacterium that produces extended-spectrum beta-lactamases (ESBLs). ESBLs are enzymes that make bacteria resistant to existing antibiotics such as extended-spectrum penicillins and cephalosporins, and they threaten the ability to treat infections. Genome analysis can therefore provide insight into how this bacterium has evolved, and the information obtained can help in designing new antibiotics. The genome of E. coli strain INF32/16/A was sequenced using Illumina MiSeq, and the raw genome sequence has been submitted to the NCBI SRA database (SRR15334628) under BioProject accession number PRJNA726861.
Keywords: Escherichia coli; Extended-spectrum beta-lactamase; Genome sequencing; Pathogenic
Year: 2021 PMID: 34901351 PMCID: PMC8639416 DOI: 10.1016/j.dib.2021.107640
Source DB: PubMed Journal: Data Brief ISSN: 2352-3409
Pre-processing statistics for the genome reads, comprising forward (32-16-A_R1.fastq) and reverse (32-16-A_R2.fastq) reads.
| Metric | R1 | R2 | Total |
|---|---|---|---|
| Total Raw Reads | 796,067 | 796,067 | 1,592,134 |
| Total Raw Reads Bases | 190,618,311 | 190,972,166 | 381,590,477 |
| Total Clean Reads | 424,248 | 424,248 | 848,496 |
| Total Clean Reads Bases | 85,679,126 | 57,343,443 | 143,022,569 |
| Clean Reads (%) | 53.29 | 53.29 | 53.29 (average) |
| GC Content Clean Reads (%) | 50 | 51 | - |
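The totals and the retained-read percentage in the table above follow from simple arithmetic on the per-file counts. The sketch below recomputes them in Python; the variable and function names are illustrative and not taken from the original work, and the mean clean read length is a derived quantity that the table does not report.

```python
# Per-file counts copied from the pre-processing statistics table.
raw_reads = {"R1": 796_067, "R2": 796_067}
clean_reads = {"R1": 424_248, "R2": 424_248}
raw_bases = {"R1": 190_618_311, "R2": 190_972_166}
clean_bases = {"R1": 85_679_126, "R2": 57_343_443}


def percent_retained(clean: int, raw: int) -> float:
    """Share of the raw count that survives trimming, in percent (2 d.p.)."""
    return round(100 * clean / raw, 2)


total_raw_reads = sum(raw_reads.values())
total_clean_reads = sum(clean_reads.values())
total_raw_bases = sum(raw_bases.values())
total_clean_bases = sum(clean_bases.values())

# Percentage of read pairs kept after trimming (the "Clean Reads (%)" row).
clean_read_pct = percent_retained(total_clean_reads, total_raw_reads)

# Derived value, not in the table: mean length of a clean read per file.
mean_clean_length = {
    name: clean_bases[name] / clean_reads[name] for name in clean_reads
}

print(total_raw_reads, total_clean_reads, total_raw_bases, total_clean_bases)
print(clean_read_pct)
print(mean_clean_length)
```

The same helper can be applied to bases instead of reads, for example `percent_retained(total_clean_bases, total_raw_bases)`, to see how much sequence (rather than how many reads) was kept.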
The main assembly statistics of the Medusa-assembled draft genome of E. coli strain INF32/16/A.
| Attributes | Value |
|---|---|
| Number of scaffolds | 97 |
| Total size of scaffolds (nt) | 5,212,612 |
| Longest scaffold (nt) | 3,201,741 |
| Shortest scaffold (nt) | 220 |
| Number of scaffolds > 1K nt | 62 (63.9%) |
| Number of scaffolds > 10K nt | 18 (18.6%) |
| Number of scaffolds > 100K nt | 4 (4.1%) |
| Number of scaffolds > 1M nt | 1 (1.0%) |
| Number of scaffolds > 10M nt | 0 (0.0%) |
| Mean scaffold size (nt) | 53,738 |
| Median scaffold size (nt) | 1,667 |
| N50 scaffold length (nt) | 3,201,741 |
| L50 scaffold count | 1 |
| GC Content | 50.32% |
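N50 is the length of the scaffold at which the cumulative size, counted from the longest scaffold down, first reaches half of the total assembly size; L50 is the number of scaffolds needed to get there. The Python sketch below uses this standard definition and is not the tool that produced the table. The individual scaffold lengths are not listed in this record, so the second example lumps everything except the longest scaffold into one placeholder value (5,212,612 − 3,201,741 = 2,010,871), which is an assumption made purely for illustration.

```python
from typing import Iterable, Tuple


def n50_l50(lengths: Iterable[int]) -> Tuple[int, int]:
    """Return (N50, L50) for a collection of scaffold lengths."""
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for count, length in enumerate(ordered, start=1):
        running += length
        if running >= half_total:
            # This scaffold pushes the cumulative size past 50% of the total.
            return length, count
    raise ValueError("no scaffold lengths given")


# Toy example: total 28, half 14; 10 + 8 = 18 reaches it at the 2nd scaffold.
print(n50_l50([10, 8, 5, 3, 2]))

# Longest scaffold from the table plus a single placeholder for the other 96.
# Because the longest scaffold alone exceeds half of the total assembly size,
# N50 equals its length and L50 is 1, matching the table.
print(n50_l50([3_201_741, 2_010_871]))
```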
Fig. 1. Genome completeness of the assembled genome of E. coli strain INF32/16/A, assessed using the BUSCO tool with the enterobacterales_odb10 lineage.
Fig. 2. Functional classification based on Gene Ontology for 99.2% of the gene models predicted by Prodigal.
| Item | Description |
|---|---|
| Subject | Health and medical sciences |
| Specific subject area | Microbiology and genomics. |
| Type of data | Table |
| How data were acquired | Paired-end reads of extended spectrum beta lactamase (ESBL)-producing |
| Data format | Raw and analyzed. |
| Parameters for data collection | Genomic DNA from pure culture. 10 µg/ng of DNA was utilized for a 251 bp paired-end sequencing library using an Illumina paired-end DNA sample preparation kit. |
| Description of data collection | Whole-genome sequencing was performed on the Illumina MiSeq system. Raw reads were trimmed using BBDuk (BBTools v36) and assembled using SPAdes v3.9.0. Scaffolding was conducted using Medusa v1.6. The completeness of the assembled genome was assessed using the BUSCO tool. |
| Data source location | Institution: Institute for Research in Molecular Medicine (INFORMM) |
| Data accessibility | The data is hosted on a public repository. |
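
The data-collection steps (trimming with BBDuk, assembly with SPAdes, scaffolding with Medusa, completeness assessment with BUSCO) can be sketched as a list of command lines. The Python snippet below only builds and prints the commands; it does not run them. The exact parameters used by the authors are not given in this record, so the quality-trimming threshold (`trimq=20`), the working directory, the output file names, the `medusa.jar` location and the `reference_genomes` directory are all assumptions chosen for illustration.

```python
import shlex
from pathlib import Path
from typing import List


def build_pipeline(r1: str, r2: str, workdir: str, reference_dir: str,
                   lineage: str = "enterobacterales_odb10") -> List[List[str]]:
    """Build illustrative command lines for the four steps (not executed)."""
    work = Path(workdir)
    clean_r1 = work / "clean_R1.fastq"      # hypothetical output names
    clean_r2 = work / "clean_R2.fastq"
    spades_out = work / "spades"
    contigs = spades_out / "scaffolds.fasta"  # SPAdes writes this file
    medusa_out = work / "medusa_scaffolds.fasta"

    return [
        # 1. Quality trimming of both read files (threshold is an assumption).
        ["bbduk.sh", f"in1={r1}", f"in2={r2}",
         f"out1={clean_r1}", f"out2={clean_r2}", "qtrim=rl", "trimq=20"],
        # 2. De novo assembly of the cleaned paired-end reads.
        ["spades.py", "-1", str(clean_r1), "-2", str(clean_r2),
         "-o", str(spades_out)],
        # 3. Reference-guided scaffolding; reference genomes are hypothetical.
        ["java", "-jar", "medusa.jar", "-f", reference_dir,
         "-i", str(contigs), "-o", str(medusa_out)],
        # 4. Completeness check against the lineage named in Fig. 1.
        ["busco", "-i", str(medusa_out), "-l", lineage,
         "-m", "genome", "-o", "busco_out"],
    ]


commands = build_pipeline("32-16-A_R1.fastq", "32-16-A_R2.fastq",
                          "work", "reference_genomes")
for cmd in commands:
    print(shlex.join(cmd))
```

Keeping each step as an argument list makes it straightforward to pass to `subprocess.run` later, once the real parameters and paths are known.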