Hiromi Nishida, Choong-Soo Yun.
Abstract
Although the bacterium Symbiobacterium thermophilum has a genome with a high guanine-cytosine (GC) content (69%), it belongs to a low GC content bacterial group. We detected only 18 low GC content regions, each consisting of 5 or more consecutive genes with GC contents below 65%, in the genome of this organism. S. thermophilum has 66 transposase genes, which are markers of transposable genetic elements, and 38 (58%) of them are located in the low GC content regions, suggesting that Symbiobacterium has a gene silencing system similar to that of Salmonella. The top hit (best match) analyses for each Symbiobacterium protein showed that putative horizontally transferred genes and vertically inherited genes are scattered across the genome. Approximately 25% of the 3338 Symbiobacterium proteins have the highest similarity with a protein from a phylogenetically distant organism. The putative horizontally transferred genes also have a high GC content, suggesting that Symbiobacterium gained many DNA fragments from phylogenetically distant organisms during the early stage of Firmicutes evolution. After acquiring these genes, Symbiobacterium increased their GC content and thereby maintained a genome with a high GC content.
Year: 2010 PMID: 21350632 PMCID: PMC3039409 DOI: 10.4061/2011/634505
Source DB: PubMed Journal: Int J Evol Biol ISSN: 2090-052X
Figure 1. Pie chart of the categories of the 3338 Symbiobacterium protein-coding genes. A BLAST search was conducted against all proteins from 147 eukaryotes, 1047 bacteria, and 84 archaea in the KEGG database (http://www.kegg.jp), using the parameter values given on the GenomeNet website (http://www.genome.jp). Each protein of Symbiobacterium thermophilum was used as the query amino acid sequence, and the top hit (best match) for each protein was recorded. If the top hit was absent or its E-value exceeded 0.1, the Symbiobacterium protein was considered to have no similar protein (category “No hit”). We categorized the 3338 Symbiobacterium proteins into the following 17 categories: “Actinobacteria,” “Aquificae,” “Archaea/Eukaryota,” “Bacilli,” “Bacteroidetes/Chlorobi,” “Chloroflexi,” “Clostridia,” “Cyanobacteria,” “Deinococcus-Thermus,” “Dictyoglomi,” “Fibrobacteres/Acidobacteria,” “No hit,” “Proteobacteria,” “Symbiobacterium,” “Thermobaculum,” “Thermotogae,” and “Other bacteria.” If the top hit was another Symbiobacterium protein, the query protein was assigned to the category “Symbiobacterium.”
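The categorization rule in the Figure 1 caption can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' pipeline: the function names `categorize` and `tally`, and the input format (one pair of top-hit category label and E-value per query protein, with the query's hit to itself already excluded), are hypothetical. Only the E-value cutoff of 0.1 and the “No hit” rule come from the caption.

```python
from collections import Counter
from typing import Iterable, Optional, Tuple

# From the caption: a top hit whose E-value exceeds 0.1 counts as "No hit".
E_VALUE_CUTOFF = 0.1


def categorize(top_hit_category: Optional[str], e_value: Optional[float]) -> str:
    """Assign one query protein to a category from its top BLAST hit.

    top_hit_category is the (hypothetical) taxonomic label of the best match,
    e.g. "Clostridia" or "Symbiobacterium", or None when no hit was found.
    """
    if top_hit_category is None or e_value is None or e_value > E_VALUE_CUTOFF:
        return "No hit"
    return top_hit_category


def tally(hits: Iterable[Tuple[Optional[str], Optional[float]]]) -> Counter:
    """Count proteins per category, i.e. the numbers behind a pie chart."""
    return Counter(categorize(label, e) for label, e in hits)
```

For example, `tally([("Clostridia", 1e-40), (None, None), ("Bacilli", 0.5)])` counts one “Clostridia” protein and two “No hit” proteins, because the third top hit exceeds the cutoff.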
Figure 2. Plots of the location and category of the Symbiobacterium protein-coding genes. X-axis: STH gene number. Y-axis: 0: category “No hit” (Symbiobacterium-specific genes); 1: category “Symbiobacterium” (multiple copied genes); 2: category “Clostridia”; 3: category “Bacilli”; 4: categories “Actinobacteria,” “Aquificae,” “Bacteroidetes/Chlorobi,” “Chloroflexi,” “Cyanobacteria,” “Deinococcus-Thermus,” “Dictyoglomi,” “Fibrobacteres/Acidobacteria,” “Proteobacteria,” “Thermobaculum,” “Thermotogae,” and “Other bacteria”; 5: category “Archaea/Eukaryota.” The italicized numbers indicate 52 clusters (pink) containing 5 or more consecutive genes belonging to the category “Clostridia.”
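A cluster as defined in the Figure 2 caption is a run of 5 or more consecutive genes that all fall in the category “Clostridia.” The sketch below shows one way such runs could be located, assuming the per-gene categories are available as a list in genome (STH gene number) order. The function name `category_clusters` and the 0-based index convention are assumptions for illustration.

```python
from itertools import groupby
from typing import List, Sequence, Tuple


def category_clusters(
    categories: Sequence[str], target: str = "Clostridia", min_len: int = 5
) -> List[Tuple[int, int]]:
    """Return (start, end) positions of runs of `target` with length >= min_len.

    Positions are 0-based and inclusive, and refer to the order of `categories`.
    """
    clusters = []
    pos = 0
    # groupby yields maximal runs of identical adjacent labels.
    for label, run in groupby(categories):
        length = sum(1 for _ in run)
        if label == target and length >= min_len:
            clusters.append((pos, pos + length - 1))
        pos += length
    return clusters
```

A run of four “Clostridia” genes interrupted by a single gene from another category is not counted, which matches the strict “consecutive” wording of the caption.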
Figure 3. Plots of the location and GC content of the Symbiobacterium protein-coding genes. X-axis: STH gene number. Y-axis: GC content (%) of gene. Red indicates the putative transposase-coding genes, and blue indicates the group II intron-coding maturase genes. The italicized numbers indicate 18 low GC content regions containing 5 or more consecutive genes whose GC contents are below 65%.
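The low GC content regions in the Figure 3 caption follow from two simple steps: compute the GC content of each gene, then look for stretches of 5 or more consecutive genes below 65%. The sketch below illustrates both steps. The function names `gc_content` and `low_gc_regions` are hypothetical, and the input is assumed to be gene sequences or GC values listed in genome order. The 65% threshold and minimum length of 5 are taken from the caption.

```python
from typing import List, Sequence, Tuple


def gc_content(seq: str) -> float:
    """GC content of a nucleotide sequence, as a percentage."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return 100.0 * (seq.count("G") + seq.count("C")) / len(seq)


def low_gc_regions(
    gc_values: Sequence[float], threshold: float = 65.0, min_len: int = 5
) -> List[Tuple[int, int]]:
    """Return (start, end) positions of runs of genes with GC below `threshold`.

    Only runs of at least `min_len` consecutive genes are reported.
    Positions are 0-based and inclusive.
    """
    regions = []
    start = None  # start position of the current low GC run, if any
    for i, gc in enumerate(gc_values):
        if gc < threshold:
            if start is None:
                start = i
        else:
            if start is not None and i - start >= min_len:
                regions.append((start, i - 1))
            start = None
    # Close a run that extends to the last gene.
    if start is not None and len(gc_values) - start >= min_len:
        regions.append((start, len(gc_values) - 1))
    return regions
```

A gene with a GC content of exactly 65% ends a run, since the caption specifies values below 65%.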