Fang Chen, Hui Wang, Yajing Cao, Xiangyang Li, Gejiao Wang.
Abstract
Arenimonas donghaensis is the type species of the genus Arenimonas, which belongs to the family Xanthomonadaceae within the Gammaproteobacteria. In this study, a total of five type strains of Arenimonas were sequenced. The draft genome of A. donghaensis DSM 18148T is described and compared with the four other Arenimonas genomes. The genome of A. donghaensis DSM 18148T is 2,977,056 bp in size, distributed across 51 contigs, and contains 2685 protein-coding genes and 49 RNA genes.
Keywords: Arenimonas; Arenimonas donghaensis; Comparative genomics; Genome sequence; Xanthomonadaceae
Year: 2015 PMID: 26380644 PMCID: PMC4572611 DOI: 10.1186/s40793-015-0055-4
Source DB: PubMed Journal: Stand Genomic Sci ISSN: 1944-3277
Fig. 1 A phylogenetic tree based on 16S rRNA gene sequences, highlighting the position of A. donghaensis HO3-R19T (shown in bold) relative to the other strains of Arenimonas. The GenBank accession numbers are shown in parentheses. Sequences were aligned using CLUSTALW, and phylogenetic inferences were obtained using the neighbor-joining method within the MEGA 5.05 software [8]. Numbers at the nodes represent bootstrap percentages obtained by repeating the analysis 1000 times to generate a majority consensus tree. The scale bar indicates 0.02 nucleotide changes per nucleotide position
Fig. 2 A scanning electron micrograph of A. donghaensis DSM 18148T cells
Classification and general features of A. donghaensis strain DSM 18148T according to the MIGS recommendations [21]
| MIGS ID | Property | Term | Evidence code (a) |
|---|---|---|---|
| | Classification | Domain | TAS [ |
| | | Phylum | TAS [ |
| | | Class | TAS [ |
| | | Order | TAS [ |
| | | Family | TAS [ |
| | | Genus | TAS [ |
| | | Species | TAS [ |
| | | Type strain: HO3-R19T (= KACC 11381T = DSM 18148T) | |
| | Gram stain | negative | TAS [ |
| | Cell shape | straight or slightly curved rod | TAS [ |
| | Motility | motile | TAS [ |
| | Sporulation | non-spore-forming | TAS [ |
| | Temperature range | 4–37 °C | TAS [ |
| | Optimum temperature | 28 °C | TAS [ |
| | pH range; Optimum | 7.0–9.0; 8.0 | TAS [ |
| | Carbon source | casein, tyrosine and gelatin; β-hydroxybutyric acid, L-alaninamide, L-glutamic acid and glycyl-L-glutamic acid | IDA |
| MIGS-6 | Habitat | seashore sand | TAS [ |
| MIGS-6.3 | Salinity | 0–3 % NaCl (w/v) | TAS [ |
| MIGS-22 | Oxygen requirement | aerobic | TAS [ |
| MIGS-15 | Biotic relationship | free-living | NAS |
| MIGS-14 | Pathogenicity | non-pathogen | NAS |
| MIGS-4 | Geographic location | Pohang city, Korea | TAS [ |
| MIGS-5 | Sample collection | not reported | |
| MIGS-4.1 | Latitude | not reported | |
| MIGS-4.2 | Longitude | not reported | |
| MIGS-4.4 | Altitude | not reported | |
(a) Evidence codes: IDA, Inferred from Direct Assay; TAS, Traceable Author Statement (i.e., a direct report exists in the literature); NAS, Non-traceable Author Statement (i.e., not directly observed for the living, isolated sample, but based on a generally accepted property for the species, or anecdotal evidence). These evidence codes are from the Gene Ontology project [27]
Project information
| MIGS ID | Property | Term |
|---|---|---|
| MIGS-31 | Finishing quality | High-quality draft |
| MIGS-28 | Libraries used | Illumina paired-end library (300 bp insert size) |
| MIGS-29 | Sequencing platforms | Illumina HiSeq 2000 |
| MIGS-31.2 | Fold coverage | 332.4× |
| MIGS-30 | Assemblers | SOAPdenovo v1.05 |
| MIGS-32 | Gene calling method | GeneMarkS+ |
| | Locus Tag | N788 |
| | GenBank ID | AVCJ00000000 |
| | GenBank Date of Release | 2014/08/25 |
| | GOLD ID | Gi0067066 |
| | BIOPROJECT | PRJNA214575 |
| MIGS-13 | Source Material Identifier | DSM 18148 |
| | Project relevance | Genome comparison |
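The fold coverage in the project table can be related to sequencing yield with simple arithmetic. The sketch below assumes the usual definition (total sequenced bases divided by genome size) and combines the reported 332.4× with the 2,977,056 bp genome size from the abstract; the total read yield is not reported in this record, so `implied_bases` is a back-calculated estimate, not a published figure.

```python
# Values taken from this record: genome size (abstract) and fold coverage (project table).
GENOME_SIZE_BP = 2_977_056
REPORTED_FOLD_COVERAGE = 332.4


def fold_coverage(total_sequenced_bases: float, genome_size_bp: int) -> float:
    """Assumed definition: total sequenced bases divided by genome size."""
    return total_sequenced_bases / genome_size_bp


# Back-calculate the sequencing yield implied by the reported coverage.
# This is an estimate derived here, not a number stated in the source.
implied_bases = REPORTED_FOLD_COVERAGE * GENOME_SIZE_BP

print(f"Implied sequencing yield: {implied_bases / 1e6:.1f} Mb")
print(f"Round trip: {fold_coverage(implied_bases, GENOME_SIZE_BP):.1f}x")
```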
Fig. 3 Graphical circular map of the A. donghaensis DSM 18148T genome. From outside to center: rings 1 and 4 show protein-coding genes colored by COG category on the forward and reverse strands; rings 2 and 3 denote genes on the forward and reverse strands; ring 5 shows the G + C content plot; and the innermost ring shows the GC skew
Genome statistics
| Attribute | Value | % of Total |
|---|---|---|
| Genome size (bp) | 2,977,056 | 100.00 |
| DNA coding (bp) | 2,722,012 | 91.43 |
| DNA G + C (bp) | 2,046,559 | 68.74 |
| DNA scaffolds | 49 | |
| Total genes | 2735 | 100.00 |
| Protein coding genes | 2685 | 98.17 |
| RNA genes | 49 | 1.79 |
| Pseudo genes | 1 | 0.04 |
| Genes in internal clusters | | |
| Genes with function prediction | 472 | 17.26 |
| Genes assigned to COGs | 2244 | 82.05 |
| Genes with Pfam domains | 2194 | 80.22 |
| Genes with signal peptides | 362 | 13.24 |
| Genes with transmembrane helices | 717 | 26.22 |
| CRISPR repeats | 0 | 0.00 |
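The "% of Total" column can be reproduced from the raw values. The following sketch assumes, as the table layout suggests, that base-pair rows are divided by the genome size and gene rows by the total gene count; the helper name `percent` is introduced here for illustration only.

```python
# Denominators copied from the genome statistics table.
GENOME_SIZE_BP = 2_977_056
TOTAL_GENES = 2735


def percent(part: int, whole: int) -> float:
    """Return part/whole as a percentage rounded to two decimals."""
    return round(100.0 * part / whole, 2)


# Rows measured in base pairs (assumed denominator: genome size).
bp_rows = {"DNA coding (bp)": 2_722_012, "DNA G + C (bp)": 2_046_559}

# Rows measured in genes (assumed denominator: total genes).
gene_rows = {
    "Protein coding genes": 2685,
    "RNA genes": 49,
    "Pseudo genes": 1,
    "Genes assigned to COGs": 2244,
}

bp_percent = {name: percent(v, GENOME_SIZE_BP) for name, v in bp_rows.items()}
gene_percent = {name: percent(v, TOTAL_GENES) for name, v in gene_rows.items()}

for name, value in {**bp_percent, **gene_percent}.items():
    print(f"{name}: {value:.2f} %")
```

Under these assumptions the computed values agree with the tabulated ones, and the three gene classes (2685 + 49 + 1) add up to the 2735 total genes.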
Number of genes associated with general COG functional categories
| Code | Value | % of total | Description |
|---|---|---|---|
| J | 163 | 6.07 | Translation, ribosomal structure and biogenesis |
| A | 1 | 0.04 | RNA processing and modification |
| K | 127 | 4.73 | Transcription |
| L | 107 | 3.99 | Replication, recombination and repair |
| B | 1 | 0.04 | Chromatin structure and dynamics |
| D | 28 | 1.04 | Cell cycle control, cell division, chromosome partitioning |
| V | 56 | 2.09 | Defense mechanisms |
| T | 171 | 6.37 | Signal transduction mechanisms |
| M | 155 | 5.77 | Cell wall/membrane biogenesis |
| N | 37 | 1.38 | Cell motility |
| U | 68 | 2.53 | Intracellular trafficking and secretion |
| O | 114 | 4.25 | Posttranslational modification, protein turnover, chaperones |
| C | 146 | 5.44 | Energy production and conversion |
| G | 57 | 2.12 | Carbohydrate transport and metabolism |
| E | 173 | 6.44 | Amino acid transport and metabolism |
| F | 55 | 2.05 | Nucleotide transport and metabolism |
| H | 111 | 4.13 | Coenzyme transport and metabolism |
| I | 102 | 3.80 | Lipid transport and metabolism |
| P | 96 | 3.58 | Inorganic ion transport and metabolism |
| Q | 49 | 1.82 | Secondary metabolites biosynthesis, transport and catabolism |
| R | 241 | 8.98 | General function prediction only |
| S | 186 | 6.93 | Function unknown |
| - | 441 | 16.42 | Not in COGs |
The total is based on the total number of protein coding genes in the genome
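As a consistency check on the table above, the percentages can be recomputed with the 2685 protein-coding genes as the denominator. This is a minimal sketch using only the counts listed in the table; the variable names are illustrative.

```python
# Denominator stated in the table footnote: protein-coding genes in the genome.
PROTEIN_CODING_GENES = 2685

# COG category counts copied from the table.
cog_counts = {
    "J": 163, "A": 1, "K": 127, "L": 107, "B": 1, "D": 28, "V": 56,
    "T": 171, "M": 155, "N": 37, "U": 68, "O": 114, "C": 146, "G": 57,
    "E": 173, "F": 55, "H": 111, "I": 102, "P": 96, "Q": 49, "R": 241,
    "S": 186,
}
NOT_IN_COGS = 441

# Percentage of protein-coding genes per category, rounded as in the table.
cog_percent = {
    code: round(100.0 * n / PROTEIN_CODING_GENES, 2)
    for code, n in cog_counts.items()
}

# Genes assigned to any COG category, and the grand total including unassigned genes.
assigned = sum(cog_counts.values())
grand_total = assigned + NOT_IN_COGS

print(f"Assigned to COGs: {assigned}")
print(f"Assigned + not in COGs: {grand_total}")
```

In this table the category counts sum to the 2244 genes assigned to COGs reported in the genome statistics, and adding the 441 unassigned genes gives the 2685 protein-coding genes.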
General features of the five Arenimonas genomes
| Strains | Source | Size (Mb) | CDSs | rRNA clusters | tRNAs | Genome status (draft/finished) | Contigs | Contigs N50 (bp) | GenBank no. |
|---|---|---|---|---|---|---|---|---|---|
| | Compost | 3.16 | 2849 | 3 | 45 | Draft | 95 | 81,415 | AWXU00000000 |
| A. donghaensis DSM 18148T | Seashore sand | 2.98 | 2685 | 4 | 45 | Draft | 51 | 159,562 | AVCJ00000000 |
| | Oil-contaminated soil | 3.11 | 2861 | 5 | 44 | Draft | 221 | 29,626 | AVCH00000000 |
| | Iron mine | 3.06 | 2775 | 2 | 44 | Draft | 65 | 99,300 | AVCK00000000 |
| | Rice rhizosphere | 3.09 | 2897 | 3 | 45 | Draft | 45 | 441,364 | AVCI00000000 |
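The contig N50 values in the table summarize assembly contiguity. The record does not define the statistic, so the sketch below assumes the common definition: the length of the contig at which the cumulative length, counted from the longest contig downward, first reaches half of the total assembly length. The contig lengths in the example are made up for illustration and are not taken from any of the five assemblies.

```python
from typing import Iterable


def n50(contig_lengths: Iterable[int]) -> int:
    """Return the N50 of a set of contig lengths (common definition, assumed here)."""
    lengths = sorted(contig_lengths, reverse=True)
    if not lengths:
        raise ValueError("at least one contig length is required")
    half_total = sum(lengths) / 2
    running = 0
    for length in lengths:
        running += length
        # The first contig that pushes the running sum to half the total is the N50 contig.
        if running >= half_total:
            return length
    return lengths[-1]  # not reached for non-empty input; kept for completeness


# Hypothetical contig lengths in bp (illustrative only).
example_lengths = [100, 80, 60, 40, 20]
example_n50 = n50(example_lengths)
print(example_n50)
```

For the hypothetical lengths the total is 300 bp, and the cumulative sum reaches 150 bp at the second contig, so the N50 is 80 bp.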
Fig. 4 Genome comparison among the five Arenimonas species. The Venn diagram illustrates the number of genes unique to or shared among the five Arenimonas genomes
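The shared and unique gene counts in such a Venn diagram can be derived from ortholog-group membership with plain set operations. The sketch below is illustrative only: the genome labels and ortholog-group identifiers are hypothetical, and it uses three toy genomes rather than the five analyzed here.

```python
# Hypothetical ortholog-group membership per genome (labels and IDs are made up).
genomes = {
    "genome_1": {"OG1", "OG2", "OG3"},
    "genome_2": {"OG1", "OG2", "OG4"},
    "genome_3": {"OG1", "OG5"},
}

# Core genome: ortholog groups present in every genome.
core = set.intersection(*genomes.values())

# Strain-specific groups: present in one genome and absent from all others.
unique = {}
for name, groups in genomes.items():
    others = set().union(*(g for n, g in genomes.items() if n != name))
    unique[name] = groups - others

print("core:", sorted(core))
for name, groups in unique.items():
    print(name, "unique:", sorted(groups))
```

Groups that fall in neither set (here `OG2`, shared by two of the three toy genomes) correspond to the partially overlapping regions of the diagram.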
Number of genes in the core genome of the five analyzed Arenimonas genomes associated with general COG functional categories
| Code | Value | % of total | Description |
|---|---|---|---|
| A | 1 | 0.10 | RNA processing and modification |
| C | 88 | 8.68 | Energy production and conversion |
| D | 16 | 1.58 | Cell cycle control, cell division, chromosome partitioning |
| E | 88 | 8.68 | Amino acid transport and metabolism |
| F | 42 | 4.14 | Nucleotide transport and metabolism |
| G | 20 | 1.97 | Carbohydrate transport and metabolism |
| H | 59 | 5.82 | Coenzyme transport and metabolism |
| I | 52 | 5.13 | Lipid transport and metabolism |
| J | 126 | 12.43 | Translation, ribosomal structure and biogenesis |
| K | 44 | 4.34 | Transcription |
| L | 53 | 5.23 | Replication, recombination and repair |
| M | 60 | 5.92 | Cell wall/membrane/envelope biogenesis |
| N | 12 | 1.18 | Cell motility |
| O | 64 | 6.31 | Posttranslational modification, protein turnover, chaperones |
| P | 32 | 3.16 | Inorganic ion transport and metabolism |
| Q | 22 | 2.17 | Secondary metabolites biosynthesis, transport and catabolism |
| R | 85 | 8.38 | General function prediction only |
| S | 74 | 7.30 | Function unknown |
| T | 54 | 5.33 | Signal transduction mechanisms |
| U | 27 | 2.66 | Intracellular trafficking, secretion, and vesicular transport |
| V | 11 | 1.08 | Defense mechanisms |
| - | 0 | 0.00 | Not in COGs |
The total is based on the total number of protein coding genes in the core genome
Number of strain-specific genes of A. donghaensis DSM 18148T associated with general COG functional categories
| Code | Value | % of total | Description |
|---|---|---|---|
| C | 15 | 2.50 | Energy production and conversion |
| D | 3 | 0.50 | Cell cycle control, cell division, chromosome partitioning |
| E | 17 | 2.83 | Amino acid transport and metabolism |
| F | 3 | 0.50 | Nucleotide transport and metabolism |
| G | 6 | 1.00 | Carbohydrate transport and metabolism |
| H | 15 | 2.50 | Coenzyme transport and metabolism |
| I | 9 | 1.50 | Lipid transport and metabolism |
| J | 7 | 1.16 | Translation, ribosomal structure and biogenesis |
| K | 38 | 6.32 | Transcription |
| L | 11 | 1.83 | Replication, recombination and repair |
| M | 25 | 4.16 | Cell wall/membrane/envelope biogenesis |
| N | 4 | 0.67 | Cell motility |
| O | 10 | 1.66 | Posttranslational modification, protein turnover, chaperones |
| P | 18 | 3.00 | Inorganic ion transport and metabolism |
| Q | 6 | 1.00 | Secondary metabolites biosynthesis, transport and catabolism |
| R | 51 | 8.49 | General function prediction only |
| S | 44 | 7.32 | Function unknown |
| T | 54 | 8.99 | Signal transduction mechanisms |
| U | 7 | 1.16 | Intracellular trafficking, secretion, and vesicular transport |
| V | 16 | 2.66 | Defense mechanisms |
| - | 242 | 40.27 | Not in COGs |
The total is based on the total number of strain-specific genes of A. donghaensis DSM 18148T
Fig. 5 A phylogenetic tree highlighting the phylogenetic position of A. donghaensis DSM 18148T. Conserved proteins were identified using OrthoMCL with a match cutoff of 50 % and an E-value cutoff of 1e-5 [15]. The phylogenetic tree was constructed from the 1014 single-copy conserved proteins shared among the fifteen genomes. The phylogenies were inferred in MEGA 5.05 with the NJ algorithm [8], and 1000 bootstrap repetitions were computed to estimate the reliability of the tree. The genome accession numbers of the strains are shown in parentheses
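The two cutoffs quoted in the caption act as filters on pairwise protein similarity hits. The sketch below shows that kind of threshold filter in isolation; it is not OrthoMCL's implementation. The `Hit` record, its field names, the example hits, and the choice of inclusive comparisons are all assumptions made for illustration.

```python
from typing import NamedTuple


class Hit(NamedTuple):
    """Hypothetical pairwise similarity hit (field names are illustrative)."""
    query: str
    subject: str
    percent_match: float  # percentage of the sequence covered by the match
    evalue: float


# Cutoffs quoted in the figure caption.
PERCENT_MATCH_CUTOFF = 50.0
EVALUE_CUTOFF = 1e-5


def passes_cutoffs(hit: Hit) -> bool:
    """Keep hits that meet both thresholds (inclusive comparison is an assumption)."""
    return hit.percent_match >= PERCENT_MATCH_CUTOFF and hit.evalue <= EVALUE_CUTOFF


# Made-up hits: one passes both cutoffs, one fails on match, one fails on E-value.
hits = [
    Hit("a1", "b1", 92.0, 1e-80),
    Hit("a2", "b7", 35.0, 1e-20),
    Hit("a3", "b3", 75.0, 1e-3),
]
kept = [h for h in hits if passes_cutoffs(h)]
print([h.query for h in kept])
```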