David L Bernick, Kevin Karplus, Lauren M Lui, Joanna K C Coker, Julie N Murphy, Patricia P Chan, Aaron E Cozen, Todd M Lowe.
Abstract
Pyrobaculum oguniense TE7 is an aerobic hyperthermophilic crenarchaeon isolated from a hot spring in Japan. Here we describe its main chromosome of 2,436,033 bp, with three large-scale inversions and an extra-chromosomal element of 16,887 bp. We have annotated 2,800 protein-coding genes and 145 RNA genes in this genome, including nine H/ACA-like small RNA, 83 predicted C/D box small RNA, and 47 transfer RNA genes. Comparative analyses with the closest known relative, the anaerobe Pyrobaculum arsenaticum from Italy, reveal unexpectedly high synteny and nucleotide identity between these two geographically distant species. Deep sequencing of a mixture of genomic DNA from multiple cells has illuminated some of the genome dynamics potentially shared with other species in this genus.
Keywords: Crenarchaea; Pyrobaculum arsenaticum; Pyrobaculum oguniense; inversion
Year: 2012 PMID: 23407329 PMCID: PMC3558965 DOI: 10.4056/sigs.2645906
Source DB: PubMed Journal: Stand Genomic Sci ISSN: 1944-3277
Figure 1. Phylogenetic tree of the known Pyrobaculum species based on 16S ribosomal RNA sequence. Accession numbers and associated culture collection identifiers (when available) for 16S ribosomal RNA genes are: (NC_003364.1, DSM 7523); (NC_009073.1, DSM 21063); (NC_008701.1, DSM 4184); (NC_009376.1, DSM 13514); (CP003316, DSM 13380); (NC_010525.1, DSM 2338); P.sp.1860 (CP003098.1); (AB304846.1, DSM 4185); P.sp.CBA1503 (HM594679.1); P.sp.M0H (AB302407.1); P.sp.AQ1.S2 (DQ778007.1); P.WIJ3 (AJ277125.1); ‘P. neutrophilum’ (X81886). Sequences were aligned using MAFFT v.6 [7], followed by manual curation [8] to remove 16S ribosomal introns and all terminal gap columns caused by missing sequence. The maximum likelihood tree was constructed using Tree-Puzzle v. 5.2 [9] with exact parameter estimates, 10,000 quartets and 1000 puzzling steps. (NC_016070.1, DSM 2078) was included as an outgroup. Numbered branches show bootstrap percentages and branch lengths depict nucleotide mutation rate (see scale bar, upper right).
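The caption describes removing terminal gap columns by manual curation. As an illustration only, the sketch below shows one way that step could be automated for an already aligned set of sequences. The function name `trim_terminal_gap_columns` and the toy sequences are hypothetical and are not taken from the study. Intron removal is not attempted here.

```python
def trim_terminal_gap_columns(alignment: dict, gap: str = "-") -> dict:
    """Keep only the columns that lie inside every sequence's non-gap extent.

    Leading and trailing gap runs usually reflect missing sequence rather
    than true deletions, so columns touching them are dropped. Internal
    gaps are left untouched.
    """
    if not alignment:
        raise ValueError("alignment is empty")
    if len({len(seq) for seq in alignment.values()}) != 1:
        raise ValueError("all aligned sequences must have the same length")

    # Index of the first non-gap character in each sequence.
    starts = [len(seq) - len(seq.lstrip(gap)) for seq in alignment.values()]
    # Index one past the last non-gap character in each sequence.
    ends = [len(seq.rstrip(gap)) for seq in alignment.values()]

    left, right = max(starts), min(ends)
    if left >= right:
        raise ValueError("no column is covered by every sequence")
    return {name: seq[left:right] for name, seq in alignment.items()}


# Toy alignment (hypothetical data, not real 16S sequences).
toy = {
    "seqA": "--ACGTAC--",
    "seqB": "GGACGTACTT",
    "seqC": "-GACG-ACT-",
}
trimmed = trim_terminal_gap_columns(toy)
print(trimmed)
```

In the toy data the internal gap in `seqC` survives, because only columns affected by terminal gap runs are removed.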
Classification and general features of Pyrobaculum oguniense TE7 according to the MIGS recommendations [10].
| MIGS ID | Property | Term | Evidence code |
|---|---|---|---|
| | Current classification | Domain Archaea | TAS |
| | | Phylum Crenarchaeota | TAS |
| | | Class Thermoprotei | TAS |
| | | Order Thermoproteales | TAS |
| | | Family Thermoproteaceae | TAS |
| | | Genus Pyrobaculum | TAS |
| | | Species Pyrobaculum oguniense | TAS |
| | | Type strain TE7 | |
| | Cell shape | rods, 0.6–1 µm × 2–10 µm | TAS |
| | Motility | | |
| | Sporulation | no | |
| | Temperature range | 70–97°C | |
| | Optimum temperature | 90–94°C | |
| | Carbon source | heterotroph (1 g/L yeast extract, or 0.5 g/L yeast extract with 0.5 g/L tryptone) | TAS |
| | Energy source | (see carbon source) | TAS |
| | Terminal electron acceptor | O2, sulfur compounds; no growth on NO3 or NO2 | TAS |
| MIGS-6 | Habitat | hot spring | TAS |
| MIGS-6.3 | Salinity | 0–1.5% (w/v); 0% optimal | TAS |
| MIGS-22 | Oxygen | facultative aerobe | TAS |
| MIGS-15 | Biotic relationship | free-living | NAS |
| MIGS-14 | Pathogenicity | none | NAS |
| MIGS-4 | Geographic location | Tsuetate hot spring, Oguni-cho, Kumamoto prefecture, Japan | TAS |
| MIGS-5 | Sample collection time | June 1997 | NAS |
| MIGS-4.1 | Latitude | 33.186 | NAS |
| MIGS-4.2 | Longitude | 131.031 | NAS |
| MIGS-4.3 | Depth | hot-spring sediment / fluid | NAS |
| MIGS-4.4 | Altitude | 300 m | NAS |
Evidence codes - TAS: Traceable Author Statement; NAS: Non-traceable Author Statement. These evidence codes are from the Gene Ontology project [22].
Project information
| MIGS ID | Property | Term |
|---|---|---|
| MIGS-31 | Finishing quality | Finished |
| MIGS-28 | Libraries used | Roche 454 Titanium library, SOLiD 2×25 Mate-pair (1k-3.5k insert) |
| MIGS-29 | Sequencing platforms | 454 GS FLX Titanium, ABI SOLiD |
| MIGS-31.2 | Fold coverage | 59× 454, 500× SOLiD |
| MIGS-30 | Assemblers | Newbler 2.0.01.14, Custom |
| MIGS-32 | Gene calling method | Prodigal, tRNAScan-SE |
| | Genome Database release | GenBank |
| | GenBank ID | 379005763 |
| | GenBank Date of Release | 2012-02-12 |
| | GOLD ID | Gi05801 |
| | Project relevance | Biotechnology |
Nucleotide content and gene count levels of the main chromosome (a)
| Attribute | Value | % of total |
|---|---|---|
| Genome size (bp) | 2,436,033 | 100 |
| DNA Coding region (bp) | 2,164,251 | 88.84 |
| DNA G+C content (bp) | 1,341,816 | 55.08 |
| Total genes | 2,980 | 100 |
| RNA genes | 145 | 4.74 |
| rRNA operons | 1 | |
| Protein-coding genes | 2,800 | 93.96 |
| Genes in paralog clusters | 1,214 | 40.74 |
| Genes assigned to COGs | 1,797 | 60.30 |
| Genes assigned PFAM domains | 1,719 | 57.68 |
| Genes with signal peptides | 794 | 26.64 |
| Genes with transmembrane helices | 646 | 21.68 |
| CRISPR arrays | 5 | |
(a) The ECE (16,887 bp) contains 35 genes, has a 50.58% G+C content, and is excluded from this table. Total gene count includes 35 pseudogenes.
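As a quick consistency check, the percentage column can be recomputed from the raw counts. The sketch below does this for three rows, using the chromosome size given in the abstract (2,436,033 bp) as the denominator for the base-pair rows and the total gene count for the gene rows. The helper name `percent` is illustrative.

```python
# Values copied from the abstract and the table above (main chromosome only).
GENOME_SIZE_BP = 2_436_033
CODING_BP = 2_164_251
GC_BP = 1_341_816
TOTAL_GENES = 2_980
PROTEIN_CODING_GENES = 2_800


def percent(part: int, whole: int) -> float:
    """Return part as a percentage of whole, rounded to two decimals."""
    return round(100.0 * part / whole, 2)


coding_pct = percent(CODING_BP, GENOME_SIZE_BP)
gc_pct = percent(GC_BP, GENOME_SIZE_BP)
protein_pct = percent(PROTEIN_CODING_GENES, TOTAL_GENES)

print(f"coding: {coding_pct}%  G+C: {gc_pct}%  protein-coding: {protein_pct}%")
```

These three values reproduce the tabulated 88.84, 55.08 and 93.96.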
Number of genes associated with the 25 general COG functional categories
| Code | Value | % of total | Description |
|---|---|---|---|
| J | 163 | 8.53 | Translation |
| A | 5 | 0.26 | RNA processing and modification |
| K | 112 | 5.86 | Transcription |
| L | 100 | 5.23 | Replication, recombination and repair |
| B | 4 | 0.21 | Chromatin structure and dynamics |
| D | 22 | 1.15 | Cell cycle control, mitosis and meiosis |
| Y | NA | | Nuclear structure |
| V | 15 | 0.78 | Defense mechanisms |
| T | 45 | 2.35 | Signal transduction mechanisms |
| M | 47 | 2.46 | Cell wall/membrane biogenesis |
| N | 4 | 0.21 | Cell motility |
| Z | 1 | 0.05 | Cytoskeleton |
| W | NA | | Extracellular structures |
| U | 22 | 1.15 | Intracellular trafficking and secretion |
| O | 87 | 4.55 | Post-translational modification, protein turnover, chaperones |
| C | 182 | 9.52 | Energy production and conversion |
| G | 82 | 4.29 | Carbohydrate transport and metabolism |
| E | 159 | 8.32 | Amino acid transport and metabolism |
| F | 58 | 3.04 | Nucleotide transport and metabolism |
| H | 115 | 6.02 | Coenzyme transport and metabolism |
| I | 60 | 3.14 | Lipid transport and metabolism |
| P | 83 | 4.34 | Inorganic ion transport and metabolism |
| Q | 26 | 1.36 | Secondary metabolites biosynthesis, transport and catabolism |
| R | 323 | 16.90 | General function prediction only |
| S | 196 | 10.26 | Function unknown |
| - | 1144 | | Not in COGs |
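The percentages in this table appear to use the sum of all category counts as the denominator, rather than the number of genes, since one gene can carry more than one COG assignment. The sketch below recomputes them under that assumption. The counts are copied from the table and the variable names are illustrative.

```python
# Per-category COG assignment counts copied from the table above.
# Categories reported as NA (Y, W) and "Not in COGs" are left out.
COG_COUNTS = {
    "J": 163, "A": 5, "K": 112, "L": 100, "B": 4, "D": 22,
    "V": 15, "T": 45, "M": 47, "N": 4, "Z": 1, "U": 22,
    "O": 87, "C": 182, "G": 82, "E": 159, "F": 58, "H": 115,
    "I": 60, "P": 83, "Q": 26, "R": 323, "S": 196,
}

# Denominator: total number of COG assignments, not number of genes.
total_assignments = sum(COG_COUNTS.values())

cog_percent = {
    code: round(100.0 * count / total_assignments, 2)
    for code, count in COG_COUNTS.items()
}

print("total assignments:", total_assignments)
for code, pct in cog_percent.items():
    print(f"{code}: {pct:.2f}%")
```

The counts sum to 1,911 assignments, and with that denominator category J comes out at 8.53% and category R at 16.90%, as tabulated.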
Sixteen largest regions present in P. oguniense and absent in P. arsenaticum
| | | |
|---|---|---|
| 2,420 - 0,020 | paREP2 | |
| 420 - 440 | paREP1/8 | |
| 485 - 530 | paREP2 | |
| 682 - 695 | paREP2 | |
| 887 - 900 | ThiW | |
| 955 - 985 | paREP1/8 | CRISPR cassette |
| 1,090 - 1,120 | paREP1 | Cobalamin biosynthesis cassette |
| 1,160 - 1,180 | CO dehydrogenase | |
| 1,235 - 1,250 | paREP1/8 | |
| 1,440 - 1,460 | paREP1/8 | |
| 1,540 - 1,565 | aerobic terminal cytochromes | |
| 1,672 - 1,690 | paREP6 | |
| 1,715 - 1,735 | CO dehydrogenase | |
| 1,780 - 1,795 | paREP1 | |
| 1,825 - 1,870 | paREP2 | |
| 2,300 - 2,385 | ThiC | |
Summary of genome: one chromosome and one extra-chromosomal element
| Label | Size (bp) | Topology | INSDC identifier |
|---|---|---|---|
| Chromosome (Chr) | 2,436,033 | circular | NC_016885.1 |
| Extra-chromosomal Element (ECE) | 16,887 | circular | NC_016886.1 |
Genomic inversions present within the sampled population
| Inversion name | Start coordinate | End coordinate | Length (bp) | Minority frequency (a) |
|---|---|---|---|---|
| GluDH | 50,930 | 223,540 | 172,611 | 0.17 |
| RAMP/paREP | 932,090 | 955,719 | 23,630 | 0.18 |
| C8 | 1,686,376 | 1,708,299 | 21,924 | 0.35 |
(a) Minority inversion frequency established as described previously [24].
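The length of each inversion in the table matches the inclusive span between its two coordinates, that is, end minus start plus one. The sketch below checks this, assuming the two coordinate columns are the start and end positions on the main chromosome. The function name `span_length` is illustrative.

```python
# (name, start, end, reported_length) copied from the inversion table.
INVERSIONS = [
    ("GluDH", 50_930, 223_540, 172_611),
    ("RAMP/paREP", 932_090, 955_719, 23_630),
    ("C8", 1_686_376, 1_708_299, 21_924),
]


def span_length(start: int, end: int) -> int:
    """Length in bp of the closed interval [start, end]."""
    return end - start + 1


computed_lengths = {
    name: span_length(start, end) for name, start, end, _ in INVERSIONS
}

for name, _start, _end, reported in INVERSIONS:
    status = "matches" if computed_lengths[name] == reported else "differs from"
    print(f"{name}: computed {computed_lengths[name]} bp {status} reported {reported} bp")
```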
(a) For the COG functional categories table: the total is based on the 1,911 COG assignments made across 1,701 protein-coding genes with at least one COG assignment. The Not in COGs category is made up of 1,099 hypothetical protein-coding genes and 145 RNA genes. The 35 genes in the ECE are excluded from this analysis.
Figure 2. Genomic alignment of P. oguniense with P. arsenaticum. Outer ring: (+ strand); Inner ring: (- strand). Inter-species alignment blocks are shown in light blue and gold (inverted orientation). Intra-species genomic inversions are shown as arcs of different colors along the outer ring: C8 inversion (red); Glutamate Dehydrogenase (GluDH) inversion (green); RAMP/paREP inversion (blue). Positions of paREP elements are shown as ticks inside the outer ring: paREP1 (red); paREP2b (blue); paREP7 (green). Positions of selected genes that are present in P. oguniense and missing in P. arsenaticum are shown in text inside the outer ring: thiamine biosynthesis genes (ThiW and ThiC); CRISPR cassette (CAS); cobalamin cluster; CO dehydrogenase (COdh); and the aerobic cytochrome clusters (Cyto-c). Aligned regions smaller than 500 nucleotides have been removed for clarity.