Fan Zhang1, Sanbao Su2, Gaoming Yu3, Beiwen Zheng4, Fuchang Shu2, Zhengliang Wang2, Tingsheng Xiang2, Hao Dong5, Zhongzhi Zhang5, DuJie Hou1, Yuehui She2.
Abstract
Enterobacter mori strain 5-4 is a Gram-negative, motile, rod-shaped, facultatively anaerobic bacterium isolated from a mixture of formation water (also known as oil-reservoir water) and crude oil in the Karamay oilfield, China. To date, only one E. mori genome has been sequenced, and very little is known about how E. mori adapts to the petroleum reservoir. Here, we report the second E. mori genome sequence and its annotation, together with a description of the organism's features. The 4,621,281 bp assembled genome has a G + C content of 56.24% and contains 4,317 protein-coding genes and 65 RNA genes, including 5 rRNA genes.
Keywords: Enterobacter mori strain 5–4; Formation water; Genome; Hydrocarbon degradation
Year: 2015 PMID: 27408680 PMCID: PMC4940761 DOI: 10.1186/1944-3277-10-9
Source DB: PubMed Journal: Stand Genomic Sci ISSN: 1944-3277
Classification and general features of strain 5–4 according to the MIGS recommendations [14]
| MIGS ID | Property | Term | Evidence code a |
|---|---|---|---|
| | Classification | Domain | TAS [ |
| | | Phylum | TAS [ |
| | | Class | TAS [ |
| | | Order | TAS [ |
| | | Family | TAS [ |
| | | Genus | TAS [ |
| | | Species | |
| | | Strain: Strain 5-4 | IDA |
| | Gram stain | Negative | IDA |
| | Cell shape | Rod | IDA |
| | Motility | Motile | IDA |
| | Sporulation | Non-sporulating | IDA |
| | Temperature range | 4-45°C | IDA |
| | Optimum temperature | 35°C | IDA |
| | pH range; Optimum | Unknown | IDA |
| | Carbon source | Sorbitol, glycerol, tetradecane and hexadecane | IDA |
| MIGS-6 | Habitat | Environment | IDA |
| MIGS-6.3 | Salinity | Growth in 0% ~ 7% NaCl | IDA |
| MIGS-22 | Oxygen requirement | Aerobic | IDA |
| MIGS-15 | Biotic relationship | Free living | IDA |
| MIGS-14 | Pathogenicity | Unknown | IDA |
| MIGS-4 | Geographic location | Karamay, China | IDA |
| MIGS-5 | Sample collection | 2012 | IDA |
| MIGS-4.1 | Latitude | 45°62’N | IDA |
| MIGS-4.2 | Longitude | 85°02’E | |
| MIGS-4.4 | Altitude | 460 m | IDA |
a Evidence codes: IDA, Inferred from Direct Assay; TAS, Traceable Author Statement (i.e., a direct report exists in the literature); NAS, Non-traceable Author Statement (i.e., not directly observed for the living, isolated sample, but based on a generally accepted property for the species, or anecdotal evidence). These evidence codes are from the Gene Ontology project [25].
Figure 1. Scanning electron micrograph of cells of Enterobacter mori strain 5–4. Scale bar: 2.0 μm.
Figure 2. Phylogenetic tree highlighting the position of E. mori 5–4 relative to other type strains within the genus Enterobacter. The strains and their corresponding GenBank accession numbers for 16S rRNA genes are shown following the organism names. Bootstrap consensus trees were inferred from 100 replicates; only bootstrap values > 50% are indicated. Xenorhabdus poinarii DSM 4768T was used as an outgroup. Scale bar: 0.0005 substitutions per nucleotide position.
Project information
| MIGS ID | Property | Term |
|---|---|---|
| MIGS-31 | Finishing quality | High-quality draft |
| MIGS-28 | Libraries used | One paired-end 450 bp library |
| MIGS-29 | Sequencing platforms | Illumina HiSeq 2000 |
| MIGS-31.2 | Fold coverage | 358.0 × (based on 450 bp library) |
| MIGS-30 | Assemblers | Velvet 1.2.07 |
| MIGS-32 | Gene calling method | Glimmer 3.0 |
| | Locus Tag | AA74 |
| | GenBank ID | JFHW00000000 |
| | GenBank Date of Release | April 2, 2014 |
| | GOLD ID | Gi0064796 |
| | BIOPROJECT | PRJNA224116 |
| | Project relevance | Industrial |
| MIGS-13 | Source Material Identifier | CGMCC9982 |
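The fold coverage in the table can be related to the amount of sequence data with a short calculation. The sketch below assumes the usual definition, coverage = total sequenced bases / genome size, and uses the 4,621,281 bp assembly size from the genome statistics as the denominator. The source does not state how the 358.0 × figure was computed, so both the definition and the function names here are illustrative assumptions.

```python
# Values taken from the project information and genome statistics tables.
GENOME_SIZE_BP = 4_621_281
FOLD_COVERAGE = 358.0


def implied_sequenced_bases(genome_size_bp: int, fold: float) -> int:
    """Total bases implied by a fold coverage (assumed: bases = fold * size)."""
    return round(genome_size_bp * fold)


def fold_coverage(total_bases: int, genome_size_bp: int) -> float:
    """Fold coverage under the same assumed definition."""
    return total_bases / genome_size_bp


total_bases = implied_sequenced_bases(GENOME_SIZE_BP, FOLD_COVERAGE)
print(f"Implied sequenced bases: {total_bases:,}")
print(f"Round-trip coverage: {fold_coverage(total_bases, GENOME_SIZE_BP):.1f} x")
```

Under this assumption, the reported coverage corresponds to roughly 1.65 Gb of sequence data.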
Genome statistics
| Attribute | Value | % of total a |
|---|---|---|
| Genome size (bp) | 4,621,281 | 100.00 |
| DNA Coding region (bp) | 4,117,467 | 89.10 |
| DNA G + C content (bp) | 2,599,117 | 56.24 |
| DNA scaffolds | 36 | |
| Total genes | 4,322 | 100.00 |
| Protein-coding genes | 4,317 | 99.88 |
| RNA genes | 65 | 1.51 |
| Pseudo genes | 17 | 0.39 |
| Genes with function prediction | 980 | 22.67 |
| Genes assigned to COGs | 3,625 | 83.87 |
| Genes assigned to Pfam domains | 3,995 | 92.43 |
| Genes with signal peptides | 420 | 9.72 |
| Genes with transmembrane helices | 1,085 | 25.10 |
| CRISPR repeats | 1 | 0.023 |
a The total is based on either the size of the genome in base pairs or the total number of protein coding genes in the annotated genome.
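As a quick check on the base-pair rows of the table, the percentages can be recomputed from the reported counts. This is a minimal sketch: the constant and function names are illustrative, and only the rows whose denominator is unambiguously the genome size are included.

```python
# Base-pair counts copied from the genome statistics table.
GENOME_SIZE_BP = 4_621_281
CODING_BP = 4_117_467
GC_BP = 2_599_117


def percent(part: int, total: int) -> float:
    """Return part as a percentage of total, rounded to two decimals."""
    return round(100.0 * part / total, 2)


coding_pct = percent(CODING_BP, GENOME_SIZE_BP)
gc_pct = percent(GC_BP, GENOME_SIZE_BP)

print(f"Coding region: {coding_pct:.2f}% of genome")
print(f"G + C content: {gc_pct:.2f}% of genome")
```

Both results agree with the values reported in the table (89.10 and 56.24).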
Number of genes associated with the general COG functional categories
| Code | Value | % of total | Description |
|---|---|---|---|
| J | 202 | 4.68 | Translation, ribosomal structure and biogenesis |
| A | 1 | 0.02 | RNA processing and modification |
| K | 400 | 9.27 | Transcription |
| L | 149 | 3.45 | Replication, recombination and repair |
| B | 1 | 0.02 | Chromatin structure and dynamics |
| D | 59 | 1.37 | Cell cycle control, mitosis and meiosis |
| V | 146 | 3.38 | Defense mechanisms |
| T | 228 | 5.28 | Signal transduction mechanisms |
| M | 266 | 6.16 | Cell wall/membrane biogenesis |
| N | 136 | 3.15 | Cell motility |
| U | 130 | 3.01 | Intracellular trafficking and secretion |
| O | 176 | 4.08 | Posttranslational modification, protein turnover, chaperones |
| C | 295 | 6.83 | Energy production and conversion |
| G | 499 | 11.56 | Carbohydrate transport and metabolism |
| E | 604 | 13.99 | Amino acid transport and metabolism |
| F | 94 | 2.18 | Nucleotide transport and metabolism |
| H | 230 | 5.33 | Coenzyme transport and metabolism |
| I | 120 | 2.78 | Lipid transport and metabolism |
| P | 421 | 9.75 | Inorganic ion transport and metabolism |
| Q | 134 | 3.10 | Secondary metabolites biosynthesis, transport and catabolism |
| R | 720 | 16.68 | General function prediction only |
| S | 361 | 8.36 | Function unknown |
| - | 333 | 7.71 | Not in COGs |
The total is based on the total number of protein coding genes in the annotated genome.
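The percentage column can be reproduced by dividing each category count by the 4,317 protein-coding genes. The sketch below does this and ranks the categories by count; the variable names are illustrative, and the counts are copied from the table as given.

```python
# Gene counts per COG category, copied from the table ("-" = not in COGs).
PROTEIN_CODING_GENES = 4_317
COG_COUNTS = {
    "J": 202, "A": 1, "K": 400, "L": 149, "B": 1, "D": 59, "V": 146,
    "T": 228, "M": 266, "N": 136, "U": 130, "O": 176, "C": 295,
    "G": 499, "E": 604, "F": 94, "H": 230, "I": 120, "P": 421,
    "Q": 134, "R": 720, "S": 361, "-": 333,
}

# Percentage of protein-coding genes in each category, two decimals.
cog_percent = {
    code: round(100.0 * count / PROTEIN_CODING_GENES, 2)
    for code, count in COG_COUNTS.items()
}

# Categories ordered from most to least populated.
ranked = sorted(COG_COUNTS, key=COG_COUNTS.get, reverse=True)

for code in ranked[:3]:
    print(f"{code}: {COG_COUNTS[code]} genes ({cog_percent[code]:.2f}%)")
```

The three largest categories are R (general function prediction only), E (amino acid transport and metabolism), and G (carbohydrate transport and metabolism).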
Figure 3. Genome comparison between strains 5–4 and LMG 25706. (A) Alignment is represented as local collinear blocks (colored) filled with a similarity plot. The height of the similarity plot indicates the nucleotide identity between the two assemblies. (B) Numbers inside the Venn diagram indicate the number of genes shared among the indicated genomes.