Linghua Xu1,2, Wanxia Shi1, Xian-Chun Zeng1, Ye Yang1, Lingli Zhou1, Yao Mu1, Yichen Liu1.
Abstract
Arthrobacter sp. B6 is a Gram-positive, non-motile, facultatively aerobic bacterium isolated from arsenic-contaminated aquifer sediment in the Datong basin, China. This strain displays high resistance to arsenic and can dynamically transform arsenic under aerobic conditions. Here, we describe the high-quality draft genome sequence, annotation, and features of Arthrobacter sp. B6. The genome is 4,663,437 bp in size, has a G + C content of 64.67%, and is arranged in 8 scaffolds that contain 25 contigs. From the sequences, 3956 protein-coding genes, 264 pseudo genes and 89 tRNA/rRNA-encoding genes were identified. Analysis of this genome helps to better understand the mechanism by which the microbe efficiently tolerates arsenic in an arsenic-contaminated environment.
Keywords: Arsenate reduction; Arthrobacter sp. B6; Datong basin; Genome; High-arsenic sediment
Year: 2017 PMID: 28138355 PMCID: PMC5259909 DOI: 10.1186/s40793-017-0231-9
Source DB: PubMed Journal: Stand Genomic Sci ISSN: 1944-3277
Fig. 1 Images of Arthrobacter sp. B6 obtained by scanning electron microscopy (left) and colony morphology on 0.1× Trypticase Soy Broth solid medium (right)
Classification and general features of Arthrobacter sp. B6 [19]
| MIGS ID | Property | Term | Evidence code (a) |
|---|---|---|---|
| | Classification | Domain | TAS [ |
| | | Phylum | TAS [ |
| | | Class | TAS [ |
| | | Order | TAS [ |
| | | Family | TAS [ |
| | | Genus | TAS [ |
| | | Species undetermined | - |
| | | Strain: B6 | IDA |
| | Gram stain | Positive | IDA |
| | Cell shape | Polymorphic: rod to coccus shaped | IDA |
| | Motility | Non-motile | IDA |
| | Sporulation | Non-sporulating | IDA |
| | Temperature range | 4–37 °C | IDA |
| | Optimum temperature | 30 °C | IDA |
| | pH range; Optimum | 6.0–8.5; 7 | IDA |
| | Carbon source | Dextrin, Tween 40, D-fructose, Gentiobiose, α-D-glucose, Lactulose, Maltotriose, D-mannose, D-mannitol, D-melezitose, Palatinose, D-psicose, D-raffinose, L-rhamnose, D-ribose, D-sorbitol, Sucrose, Turanose, α-hydroxybutyric acid, α-ketoglutaric acid, L-malic acid, Pyruvic acid, D-alanine, L-alanine, L-serine, Glycerol, Adenosine, 2-deoxy adenosine, Inosine | IDA |
| MIGS-6 | Habitat | Soil, sediment | IDA |
| MIGS-6.3 | Salinity | 1–7% NaCl (w/v) | IDA |
| MIGS-22 | Oxygen requirement | Aerobic | IDA |
| MIGS-15 | Biotic relationship | Free-living | IDA |
| MIGS-14 | Pathogenicity | Non-pathogen | NAS |
| MIGS-4 | Geographic location | Datong basin, Shanxi, China | IDA |
| MIGS-5 | Sample collection | August 2011 | IDA |
| MIGS-4.1 | Latitude | 39.4899 | IDA |
| MIGS-4.2 | Longitude | 112.915 | IDA |
| MIGS-4.4 | Altitude | Not recorded | |
(a) Evidence codes - IDA: Inferred from Direct Assay; TAS: Traceable Author Statement (i.e., a direct report exists in the literature); NAS: Non-traceable Author Statement (i.e., not directly observed for the living, isolated sample, but based on a generally accepted property for the species, or anecdotal evidence). These evidence codes are from the Gene Ontology project [30]
Fig. 2 Phylogenetic tree based on 16S rRNA gene sequences showing the phylogenetic position of Arthrobacter sp. B6 (●). Sequences were aligned with CLUSTAL W, and the tree was constructed using the maximum-likelihood method implemented in MEGA 6.0 [17, 18]. GenBank accession numbers are listed in parentheses. Type strains are indicated with a superscript T. Strains with published genomes are shown in bold. Bootstrap support values above 50%, based on 1000 replications, are shown near the nodes. The scale bar indicates 0.05 nucleotide substitutions per nucleotide position
Project information
| MIGS ID | Property | Term |
|---|---|---|
| MIGS-31 | Finishing quality | High-Quality Permanent Draft |
| MIGS-28 | Libraries used | Illumina Std. shotgun library |
| MIGS-29 | Sequencing platforms | Illumina HiSeq 2000 |
| MIGS-31.2 | Fold coverage | 161× |
| MIGS-30 | Assemblers | SOAPdenovo v2.04 |
| MIGS-32 | Gene calling method | Glimmer v3.02 |
| | Locus Tag | AU175 |
| | GenBank ID | LQAP01000000 |
| | GenBank Date of Release | Jun 15, 2016 |
| | GOLD ID | Gs0118476 |
| | BIOPROJECT | PRJNA306410 |
| MIGS-13 | Source Material Identifier | CGMCC 1.15656 |
| | Project relevance | Biotechnological, Environmental |
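
To give a sense of scale for the 161× fold coverage, the sketch below applies the usual definition of coverage (total sequenced bases divided by genome size) to the reported genome size. It is only an illustration: the record does not state the read count or read length, so the total base count it derives is an estimate under that definition, and the function names are hypothetical.

```python
# Minimal sketch: relate fold coverage to total sequenced bases.
# Assumption: coverage = total sequenced bases / genome size.
# Function names are illustrative, not taken from the source.

GENOME_SIZE_BP = 4_663_437   # reported genome size
FOLD_COVERAGE = 161          # reported fold coverage


def bases_for_coverage(genome_size_bp: int, fold: int) -> int:
    """Total bases needed to reach the given fold coverage."""
    return genome_size_bp * fold


def coverage(total_bases: int, genome_size_bp: int) -> float:
    """Fold coverage implied by a total number of sequenced bases."""
    return total_bases / genome_size_bp


total_bases = bases_for_coverage(GENOME_SIZE_BP, FOLD_COVERAGE)
print(f"Approximate sequenced bases: {total_bases:,} bp")
print(f"Implied coverage: {coverage(total_bases, GENOME_SIZE_BP):.0f}x")
```

Under this definition, 161× over a 4.66 Mb genome corresponds to roughly 0.75 Gb of sequence data.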
Genome statistics
| Attribute | Value | % of Total |
|---|---|---|
| Genome size (bp) | 4,663,437 | 100.00 |
| DNA coding (bp) | 4,100,739 | 87.93 |
| DNA G + C (bp) | 3,015,845 | 64.67 |
| DNA scaffolds | 8 | 100.00 |
| Total genes | 4309 | 100.00 |
| Protein coding genes | 3956 | 91.81 |
| RNA genes | 89 | 2.07 |
| Pseudo genes | 264 | 6.12 |
| Genes in internal clusters | 4250 | 98.63 |
| Genes with function prediction | 3527 | 81.85 |
| Genes assigned to COGs | 2210 | 51.29 |
| Genes with Pfam domains | 3464 | 80.39 |
| Genes with signal peptides | 220 | 5.11 |
| Genes with transmembrane helices | 249 | 5.78 |
| CRISPR repeats | 125 | 2.90 |
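
The "% of Total" column can be recomputed from the counts: base-pair rows are divided by the genome size and gene rows by the total gene count. The sketch below does this for a few rows; the variable and function names are illustrative. With conventional rounding, the pseudo gene row comes out as 6.13 rather than the tabulated 6.12, which suggests that value was truncated rather than rounded.

```python
# Minimal sketch: recompute the "% of Total" column from the raw counts.
# Names are illustrative; the counts are copied from the table.

GENOME_SIZE_BP = 4_663_437
TOTAL_GENES = 4309

bp_rows = {
    "DNA coding (bp)": 4_100_739,
    "DNA G + C (bp)": 3_015_845,
}

gene_rows = {
    "Protein coding genes": 3956,
    "RNA genes": 89,
    "Pseudo genes": 264,
    "Genes assigned to COGs": 2210,
}


def pct(part: int, whole: int) -> float:
    """Percentage of `whole` represented by `part`, rounded to 2 decimals."""
    return round(100 * part / whole, 2)


for name, value in bp_rows.items():
    print(f"{name}: {pct(value, GENOME_SIZE_BP)}")

for name, value in gene_rows.items():
    print(f"{name}: {pct(value, TOTAL_GENES)}")

# Protein-coding, RNA and pseudo genes together account for all genes.
assert 3956 + 89 + 264 == TOTAL_GENES
```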
Number of genes associated with general COG functional categories
| Code | Value | %age | Description |
|---|---|---|---|
| J | 145 | 6.56 | Translation, ribosomal structure and biogenesis |
| A | 1 | 0.05 | RNA processing and modification |
| K | 162 | 7.33 | Transcription |
| L | 110 | 4.98 | Replication, recombination and repair |
| B | 1 | 0.05 | Chromatin structure and dynamics |
| D | 12 | 0.54 | Cell cycle control, Cell division, chromosome partitioning |
| V | 26 | 1.18 | Defense mechanisms |
| T | 58 | 2.62 | Signal transduction mechanisms |
| M | 72 | 3.26 | Cell wall/membrane biogenesis |
| N | 0 | 0 | Cell motility |
| U | 18 | 0.81 | Intracellular trafficking and secretion |
| O | 65 | 2.94 | Posttranslational modification, protein turnover, chaperones |
| C | 168 | 7.60 | Energy production and conversion |
| G | 225 | 10.18 | Carbohydrate transport and metabolism |
| E | 272 | 12.31 | Amino acid transport and metabolism |
| F | 71 | 3.21 | Nucleotide transport and metabolism |
| H | 111 | 5.02 | Coenzyme transport and metabolism |
| I | 103 | 4.66 | Lipid transport and metabolism |
| P | 127 | 5.75 | Inorganic ion transport and metabolism |
| Q | 66 | 2.99 | Secondary metabolites biosynthesis, transport and catabolism |
| R | 266 | 12.04 | General function prediction only |
| S | 131 | 5.93 | Function unknown |
| - | 2099 | 48.71 | Not in COGs |
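Percentages for the individual COG categories are relative to the 2210 genes assigned to COGs; the percentage for "Not in COGs" is relative to the 4309 total genes in the genome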
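
As a consistency check on the COG table, the sketch below sums the per-category counts and recomputes the percentages. The counts are copied from the table; the variable names are illustrative. The category counts sum to 2210, and dividing each count by that sum reproduces the tabulated percentages, whereas the "Not in COGs" value matches 2099 divided by the 4309 total genes.

```python
# Minimal sketch: check the denominators behind the COG percentages.
# Counts are copied from the table; names are illustrative.

cog_counts = {
    "J": 145, "A": 1, "K": 162, "L": 110, "B": 1, "D": 12, "V": 26,
    "T": 58, "M": 72, "N": 0, "U": 18, "O": 65, "C": 168, "G": 225,
    "E": 272, "F": 71, "H": 111, "I": 103, "P": 127, "Q": 66,
    "R": 266, "S": 131,
}
NOT_IN_COGS = 2099

assigned = sum(cog_counts.values())    # genes assigned to COGs
total_genes = assigned + NOT_IN_COGS   # all annotated genes

# Per-category percentage relative to the COG-assigned genes.
cog_pct = {code: round(100 * n / assigned, 2) for code, n in cog_counts.items()}
# "Not in COGs" percentage relative to all genes.
not_in_cogs_pct = round(100 * NOT_IN_COGS / total_genes, 2)

print(f"Genes assigned to COGs: {assigned}")
print(f"Total genes: {total_genes}")
print(f"Category E: {cog_pct['E']}%  Not in COGs: {not_in_cogs_pct}%")
```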
Fig. 3 A graphical circular map of the genome generated with the CGView Comparison Tool [31]. From outside to center, rings 1 and 4 show protein-coding genes oriented in the forward and reverse directions, respectively (both colored by COG category). Rings 2 and 3 denote genes on the forward and reverse strands; ring 5 shows the G + C content plot, and the innermost ring shows GC skew, with purple indicating negative values and olive indicating positive values
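
The two innermost rings plot quantities that are simple to compute per window: G + C content is the fraction of G and C bases, and GC skew is conventionally defined as (G − C)/(G + C). The sketch below is a minimal illustration of both on a made-up sequence. The window and step sizes are arbitrary assumptions, since the record does not state the parameters used for the map, and the function names are hypothetical.

```python
# Minimal sketch: sliding-window G + C content and GC skew.
# Assumptions: GC skew = (G - C) / (G + C); window/step sizes and the
# toy sequence are arbitrary and not taken from the source.

from typing import Iterator, Tuple


def gc_content(seq: str) -> float:
    """Percentage of G and C bases in `seq` (0.0 for an empty string)."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return 100 * (seq.count("G") + seq.count("C")) / len(seq)


def gc_skew(seq: str) -> float:
    """(G - C) / (G + C); 0.0 when the window has no G or C."""
    seq = seq.upper()
    g, c = seq.count("G"), seq.count("C")
    return (g - c) / (g + c) if (g + c) else 0.0


def sliding_windows(seq: str, window: int, step: int) -> Iterator[Tuple[int, str]]:
    """Yield (start, subsequence) for each full window along `seq`."""
    for start in range(0, len(seq) - window + 1, step):
        yield start, seq[start:start + window]


toy_sequence = "GGGCATGCCCATGGGGCCAT"  # hypothetical example sequence
for start, win in sliding_windows(toy_sequence, window=8, step=4):
    print(start, f"GC%={gc_content(win):.1f}", f"skew={gc_skew(win):+.2f}")
```

In a map like this one, positive skew windows would be drawn in one color and negative skew windows in the other.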