Chun-Xiao Wang, Song-Ling Zhu, Xiao-Yu Wang, Ye Feng, Bailiang Li, Yong-Guo Li, Randal N Johnston, Gui-Rong Liu, Jin Zhou, Shu-Lin Liu.
Abstract
Salmonella arizonae (also called Salmonella subgroup IIIa) is a Gram-negative, non-spore-forming, motile, rod-shaped, facultatively anaerobic bacterium. S. arizonae strain RKS2983 was isolated from a human in California, USA. S. arizonae occupies an intermediate position between Salmonella subgroups I (human pathogens) and V (also called S. bongori; usually non-pathogenic to humans), which makes it an ideal model organism for studying bacterial evolution from non-human pathogens to human pathogens. We therefore sequenced the genome of RKS2983 to look for clues to the genomic events that might have led to the divergence and speciation of Salmonella into distinct lineages with diverse host ranges and pathogenic features. The 4,574,836 bp complete genome contains 4,203 protein-coding genes, 82 tRNA genes and 7 rRNA operons. It has several characteristics not reported to date in Salmonella subgroup I or V and may provide information about the genetic divergence of Salmonella pathogens.
Keywords: Facultative anaerobe; Genomic evolution; Host-adapted; S. enterica subspecies arizonae RKS2983; Salmonella pathogenicity islands
Year: 2015 PMID: 26203341 PMCID: PMC4511000 DOI: 10.1186/s40793-015-0015-z
Source DB: PubMed Journal: Stand Genomic Sci ISSN: 1944-3277
Classification and general features of RKS2983
| MIGS ID | Property | Term | Evidence code (a) |
| | Current classification | Domain | TAS [ |
| | | Phylum | TAS [ |
| | | Class | TAS [ |
| | | Order | TAS [ |
| | | Family | TAS [ |
| | | Genus | TAS [ |
| | | Species | TAS [ |
| | | Subspecies | TAS [ |
| | | Strain RKS2983 | TAS [ |
| | | Serovar 62:z36:- | TAS [ |
| | Gram stain | Negative | IDA |
| | Cell shape | Rod-shaped | IDA |
| | Motility | Motile | IDA |
| | Sporulation | Non-sporulating | IDA |
| | Temperature range | Mesophilic | IDA |
| | Optimum temperature | 35°C–37°C | IDA |
| | pH | 7.2–7.6 | IDA |
| | Carbon source | Glucose | IDA |
| MIGS-6 | Habitat | Human | TAS [ |
| MIGS-6.3 | Salinity | Medium | IDA |
| MIGS-22 | Oxygen requirement | Facultative anaerobe | IDA |
| MIGS-15 | Biotic relationship | Endophyte | IDA |
| MIGS-14 | Pathogenicity | Pathogenic | IDA |
| MIGS-4 | Geographic location | California, USA | TAS [ |
| MIGS-5 | Sample collection time | 1985 | TAS [ |
| MIGS-4.1 | Latitude | Not reported | NAS |
| MIGS-4.2 | Longitude | Not reported | NAS |
| MIGS-4.3 | Depth | Not reported | NAS |
| MIGS-4.4 | Altitude | Not reported | NAS |
a.) Evidence codes - IDA: Inferred from Direct Assay; TAS: Traceable Author Statement (i.e., a direct report exists in the literature); NAS: Non-traceable Author Statement (i.e., not directly observed for the living, isolated sample, but based on a generally accepted property for the species, or anecdotal evidence). These evidence codes are from the Gene Ontology project [43].
Project information
| MIGS ID | Property | Term |
| MIGS-31 | Finishing quality | Finished |
| MIGS-28 | Libraries used | Illumina paired-end library and SOLiD mate-pair library (2 × 50 bp) |
| MIGS-29 | Sequencing platforms | Illumina HiSeq 2000 and SOLiD 3.0 |
| MIGS-31.2 | Fold coverage | 100× |
| MIGS-30 | Assemblers | SOAPdenovo v1.05 |
| MIGS-32 | Gene calling method | Glimmer, as used in the RAST pipeline |
| | GenBank ID | CP006693.1 |
| | GenBank date of release | September 22, 2014 |
| | GOLD ID | GI686507741 |
| | BIOPROJECT | PRJNA215272 |
| MIGS-13 | Source material identifier | CDC 409–85 |
| | Project relevance | Evolution in bacteria |
Figure 1. Graphical circular map of the S. arizonae RKS2983 genome. From the outside to the center: genes on the forward strand (colored by COG category), genes on the reverse strand (colored by COG category), GC content, and GC skew. The map was generated with the CGviewer software.
Nucleotide content and gene count levels of the genome
| Attribute | Value | % of total (a) |
| Genome Size (bp) | 4,574,836 | |
| G + C content (bp) | 2,356,040 | 51.50 |
| Coding region (bp) | 3,924,843 | 85.79 |
| Total genes | 4,390 | |
| rRNA genes | 22 | 0.50 |
| tRNA genes | 82 | 1.87 |
| Protein-coding genes | 4,203 | 95.70 |
| Pseudogenes | 98 | 2.23 |
| Frameshifted Genes | 78 | 1.78 |
| Genes assigned to COGs | 3,383 | 77.06 |
a.) The total is based on either the size of the genome in base pairs or the total number of protein coding genes in the annotated genome.
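The percentage column above is plain arithmetic on the reported counts, and a short script makes the denominators explicit. The following is a minimal sketch, not the authors' workflow: the helper `percent` and the dictionary names are hypothetical, and the choice of denominators is an assumption inferred from the numbers (base-pair rows divided by the genome size, gene-count rows divided by the total gene count of 4,390). Only a subset of rows is shown.

```python
# Values copied from the genome statistics table above.
GENOME_SIZE_BP = 4_574_836
TOTAL_GENES = 4_390


def percent(part, whole):
    """Return part as a percentage of whole, rounded to two decimals."""
    return round(100 * part / whole, 2)


# Rows measured in base pairs (assumed denominator: genome size).
base_pair_rows = {
    "G + C content": 2_356_040,
    "Coding region": 3_924_843,
}

# Rows measured in gene counts (assumed denominator: total genes).
gene_rows = {
    "rRNA genes": 22,
    "tRNA genes": 82,
    "Pseudogenes": 98,
    "Frameshifted genes": 78,
    "Genes assigned to COGs": 3_383,
}

bp_percentages = {name: percent(bp, GENOME_SIZE_BP) for name, bp in base_pair_rows.items()}
gene_percentages = {name: percent(n, TOTAL_GENES) for name, n in gene_rows.items()}

for name, value in {**bp_percentages, **gene_percentages}.items():
    print(f"{name}: {value:.2f}%")
```

The same `percent` helper could be applied to the COG category counts if they are expressed against the same total gene count.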
Number of genes associated with the 25 general COG functional categories
| Code | Value | % of total (a) | Description |
| J | 168 | 3.83 | Translation, ribosomal structure and biogenesis |
| A | 1 | 0.02 | RNA processing and modification |
| K | 0 | 0.00 | Transcription |
| L | 216 | 4.92 | Replication, recombination and repair |
| B | 0 | 0.00 | Chromatin structure and dynamics |
| D | 32 | 0.73 | Cell cycle control, mitosis and meiosis |
| Y | 0 | 0.00 | Nuclear structure |
| V | 44 | 1.00 | Defense mechanisms |
| T | 106 | 2.41 | Signal transduction mechanisms |
| M | 223 | 5.08 | Cell wall/membrane biogenesis |
| N | 89 | 2.03 | Cell motility |
| Z | 0 | 0.00 | Cytoskeleton |
| W | 0 | 0.00 | Extracellular structures |
| U | 44 | 1.00 | Intracellular trafficking and secretion |
| O | 137 | 3.12 | Posttranslational modification, protein turnover, chaperones |
| C | 240 | 5.47 | Energy production and conversion |
| G | 307 | 6.99 | Carbohydrate transport and metabolism |
| E | 314 | 7.15 | Amino acid transport and metabolism |
| F | 76 | 1.73 | Nucleotide transport and metabolism |
| H | 142 | 3.23 | Coenzyme transport and metabolism |
| I | 88 | 2.00 | Lipid transport and metabolism |
| P | 182 | 4.15 | Inorganic ion transport and metabolism |
| Q | 53 | 1.21 | Secondary metabolites biosynthesis, transport and catabolism |
| R | 311 | 7.08 | General function prediction only |
| S | 348 | 7.93 | Function unknown |
| - | 1007 | 22.94 | Not in COGs |
a.) The total is based on the total number of protein coding genes in the annotated genome.
Figure 2. Phylogenetic tree highlighting the position of S. arizonae RKS2983 (shown in bold) relative to strains of other Salmonella lineages. The corresponding GenBank accession numbers are displayed in parentheses. The tree was built from concatenated nucleotide sequences of 945 genes conserved in all strains. Individual orthologous sequences were aligned with the MAFFT program [32] and concatenated. The phylogenetic tree was constructed with the MEGA 4.0 software [33] using the Neighbor-Joining method. Bootstrap values are shown at branch points.
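The concatenation step described in the Figure 2 caption, and the pairwise distances that a Neighbor-Joining method takes as input, can be illustrated with a small sketch. This is not the authors' pipeline (they used MAFFT and MEGA 4.0): the function names, the toy strains and sequences, and the use of a simple p-distance (proportion of differing sites) are all assumptions made for illustration.

```python
from itertools import combinations


def concatenate_alignments(gene_alignments):
    """Join per-gene alignments (dict: strain -> aligned sequence) into one
    concatenated alignment, keeping the gene order identical for every strain."""
    strains = sorted(gene_alignments[0])
    for alignment in gene_alignments:
        if sorted(alignment) != strains:
            raise ValueError("every gene must be present in every strain")
        if len({len(seq) for seq in alignment.values()}) != 1:
            raise ValueError("sequences in one alignment must have equal length")
    return {s: "".join(aln[s] for aln in gene_alignments) for s in strains}


def p_distance(seq_a, seq_b):
    """Proportion of differing sites, skipping columns with a gap in either sequence."""
    compared = differing = 0
    for a, b in zip(seq_a, seq_b):
        if a == "-" or b == "-":
            continue
        compared += 1
        differing += a != b
    return differing / compared if compared else 0.0


def distance_matrix(alignment):
    """Pairwise p-distances for all strain pairs, keyed by (strain, strain)."""
    return {
        (a, b): p_distance(alignment[a], alignment[b])
        for a, b in combinations(sorted(alignment), 2)
    }


# Toy data: two already-aligned "genes" for three hypothetical strains.
toy_genes = [
    {"strainA": "ATGGCA", "strainB": "ATGGCT", "strainC": "ATGACT"},
    {"strainA": "TTG-AC", "strainB": "TTGAAC", "strainC": "TTGAAT"},
]
supermatrix = concatenate_alignments(toy_genes)
distances = distance_matrix(supermatrix)
```

A tree-building tool would then consume a distance matrix of this kind; the choice of substitution model in a real analysis affects the distances and is not specified here.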
Figure 3. Venn diagram showing the core genes in S. arizonae RKS2983, S. bongori NCTC 12419 and S. typhimurium LT2. Core genes were identified using BLAST with the thresholds set at “>70% DNA identity and >0.7 gene length ratio”.
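The two BLAST thresholds quoted in the Figure 3 caption can be expressed as a simple filter. The sketch below is a hedged illustration only: the `Hit` record, the function names and the toy hits are hypothetical, and it assumes that "gene length ratio" means alignment length divided by query gene length, which the source does not define.

```python
from dataclasses import dataclass

MIN_IDENTITY = 70.0      # percent DNA identity, from the caption
MIN_LENGTH_RATIO = 0.7   # gene length ratio, from the caption


@dataclass(frozen=True)
class Hit:
    """One BLAST hit of a reference gene against another genome (hypothetical record)."""
    query_gene: str
    subject_genome: str
    percent_identity: float
    alignment_length: int
    query_length: int


def passes_thresholds(hit):
    # Assumption: length ratio = aligned length / query gene length.
    ratio = hit.alignment_length / hit.query_length
    return hit.percent_identity > MIN_IDENTITY and ratio > MIN_LENGTH_RATIO


def core_genes(hits, other_genomes):
    """Genes with a qualifying hit in every one of the other genomes."""
    matched = {}
    for hit in hits:
        if passes_thresholds(hit):
            matched.setdefault(hit.query_gene, set()).add(hit.subject_genome)
    required = set(other_genomes)
    return sorted(gene for gene, genomes in matched.items() if genomes >= required)


# Toy hits for three hypothetical genes compared against two other genomes.
toy_hits = [
    Hit("geneA", "genome2", 92.0, 950, 1000),
    Hit("geneA", "genome3", 88.5, 900, 1000),
    Hit("geneB", "genome2", 95.0, 600, 1000),  # fails the length ratio
    Hit("geneB", "genome3", 91.0, 980, 1000),
    Hit("geneC", "genome2", 65.0, 990, 1000),  # fails the identity threshold
]
core = core_genes(toy_hits, ["genome2", "genome3"])
print(core)
```

Genes that pass in only some genomes would fall into the pairwise or genome-specific regions of a Venn diagram rather than the shared core.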
Distribution of known SPIs in four representative genomes of the genus Salmonella
| SPI-1 | + | + | + | + |
| SPI-2 | - | + | + | + |
| SPI-3 | + | - | + | + |
| SPI-4 | + | + | + | + |
| SPI-5 | + | + | + | + |
| SPI-6 | - | - | + | + |
| SPI-7 | - | - | - | + |
| SPI-8 | - | - | - | + |
| SPI-9 | + | + | + | + |
| SPI-10 | - | - | - | + |
| SPI-11 | + | + | + | + |
| SPI-12 | - | - | + | + |
| SPI-13 | + | + | + | - |
| SPI-14 | - | + | + | - |
| SPI-15 | - | - | - | + |
| SPI-16 | - | - | + | + |
| SPI-17 | - | - | - | + |
| SPI-18 | - | - | - | + |
| SPI-19 | - | - | - | - |
| SPI-20 | + | + | - | - |
| SPI-21 | + | + | - | - |
| SPI-22 | - | - | - | - |
+ means SPI is present in the serotype.
- means SPI is absent in the serotype.