Hiromi Kochiwa, Masaru Tomita, Akio Kanai.
Abstract
BACKGROUND: A theoretical model of genetic redundancy has proposed that the fates of redundant genes depend on the degree of functional redundancy, and that functionally redundant genes will not be inherited together. However, no example of actual gene evolution has been reported that can be used to test this model. Here, we analyzed the molecular evolution of the ribonuclease H (RNase H) family in prokaryotes and used the results to examine the implications of functional redundancy for gene evolution.
Year: 2007 PMID: 17663799 PMCID: PMC1950709 DOI: 10.1186/1471-2148-7-128
Source DB: PubMed Journal: BMC Evol Biol ISSN: 1471-2148 Impact factor: 3.260
Table 1. Number of RNase H genes identified in the archaeal and bacterial species.
| Kingdom | Type | No. of species | Species with 1 gene | Species with 2 genes | Species with 3 genes |
| Archaea | RNase HI | 9 | 6 | 3 | 0 |
| Archaea | RNase HII | 27 | 27 | 0 | 0 |
| Archaea | RNase HIII | 1 | 1 | 0 | 0 |
| Bacteria | RNase HI | 210 | 173 | 32 | 5 |
| Bacteria | RNase HII | 220 | 219 | 1 | 0 |
| Bacteria | RNase HIII | 40 | 40 | 0 | 0 |
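The counts in this table can be checked arithmetically: for each row, the species with one, two, or three gene copies should add up to the species total, and the weighted sum gives the total number of genes of that type. The following is a minimal sketch of that check; the names `TABLE1`, `total_genes`, and `summarize` are hypothetical and introduced only for illustration.

```python
# Rows transcribed from the table above:
# (kingdom, RNase H type, no. of species, (species with 1, 2, 3 gene copies))
TABLE1 = [
    ("Archaea", "RNase HI", 9, (6, 3, 0)),
    ("Archaea", "RNase HII", 27, (27, 0, 0)),
    ("Archaea", "RNase HIII", 1, (1, 0, 0)),
    ("Bacteria", "RNase HI", 210, (173, 32, 5)),
    ("Bacteria", "RNase HII", 220, (219, 1, 0)),
    ("Bacteria", "RNase HIII", 40, (40, 0, 0)),
]


def total_genes(copy_counts):
    """Weighted sum: (species with k copies) * k, for k = 1, 2, 3."""
    return sum(copies * n for copies, n in enumerate(copy_counts, start=1))


def summarize(rows):
    """Verify each row total and return the gene count per (kingdom, type)."""
    summary = {}
    for kingdom, rnase_type, n_species, copy_counts in rows:
        if sum(copy_counts) != n_species:
            raise ValueError(f"Row total mismatch for {kingdom} {rnase_type}")
        summary[(kingdom, rnase_type)] = total_genes(copy_counts)
    return summary


if __name__ == "__main__":
    for key, genes in summarize(TABLE1).items():
        print(key, genes)
```

Under this reading, the 210 bacterial species carrying RNase HI hold 173 + 2 × 32 + 3 × 5 = 252 RNase HI genes in total.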
Figure 1. Potential combinations of the three RNase H genes. The Venn diagram consists of three circles (HI, RNase HI; HII, RNase HII; HIII, RNase HIII) that represent the possible combinations of RNase H genes as Groups A to H. The 353 prokaryotic genomes were classified according to this system, and the results are presented in Additional File 1. The numbers under Groups A to H indicate the number of archaeal (left) and bacterial (right) species identified in each of the groups.
Table 2. Combinations of the three RNase H genes found in the archaeal and bacterial genomes.
| Kingdom | Subdivision | No. of species | Group A | Group B | Group C | Group D | Group E | Group F | Group G |
| Archaea | Crenarchaeota | 5 | 0 | 3 | 0 | 0 | 0 | 2 | 0 |
| Archaea | Euryarchaeota | 21 | 0 | 6 | 1 | 0 | 0 | 14 | 0 |
| Archaea | Nanoarchaeota | 1 | 0 | 0 | 0 | 0 | 0 | 1 | 0 |
| Archaea | Total | 27 | 0 | 9 | 1 | 0 | 0 | 17 | 0 |
| Bacteria | Acidobacteria | 1 | 0 | 1 | 0 | 0 | 0 | 0 | 0 |
| Bacteria | Actinobacteria | 20 | 0 | 18 | 1 | 0 | 1 | 0 | 0 |
| Bacteria | Aquificae | 1 | 0 | 0 | 1 | 0 | 0 | 0 | 0 |
| Bacteria | Bacteroidetes | 4 | 0 | 4 | 0 | 0 | 0 | 0 | 0 |
| Bacteria | Chlamydiae | 7 | 0 | 0 | 7 | 0 | 0 | 0 | 0 |
| Bacteria | Chlorobi | 3 | 0 | 3 | 0 | 0 | 0 | 0 | 0 |
| Bacteria | Chloroflexi | 2 | 0 | 2 | 0 | 0 | 0 | 0 | 0 |
| Bacteria | Cyanobacteria | 8 | 0 | 8 | 0 | 0 | 0 | 0 | 0 |
| Bacteria | Deinococcus-Thermus | 3 | 0 | 3 | 0 | 0 | 0 | 0 | 0 |
| Bacteria | Firmicutes | 49 | 15 | 18 | 7 | 0 | 0 | 0 | 9 |
| Bacteria | Fusobacteria | 1 | 0 | 1 | 0 | 0 | 0 | 0 | 0 |
| Bacteria | Planctomycetes | 1 | 0 | 0 | 0 | 0 | 1 | 0 | 0 |
| Bacteria | Alphaproteobacteria | 38 | 0 | 38 | 0 | 0 | 0 | 0 | 0 |
| Bacteria | Betaproteobacteria | 23 | 0 | 23 | 0 | 0 | 0 | 0 | 0 |
| Bacteria | Deltaproteobacteria | 11 | 0 | 11 | 0 | 0 | 0 | 0 | 0 |
| Bacteria | Epsilonproteobacteria | 5 | 0 | 5 | 0 | 0 | 0 | 0 | 0 |
| Bacteria | Gammaproteobacteria | 52 | 0 | 49 | 0 | 0 | 3 | 0 | 0 |
| Bacteria | Spirochaetes | 5 | 0 | 4 | 0 | 0 | 1 | 0 | 0 |
| Bacteria | Thermotogae | 1 | 0 | 1 | 0 | 0 | 0 | 0 | 0 |
| Bacteria | Total | 235 | 15 | 189 | 16 | 0 | 6 | 0 | 9 |
All species were classified on the basis of the RNase H combinations shown in the Venn diagram in Figure 1.
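The group totals can be cross-checked against the per-type species counts in the first table. The sketch below does this under a loudly stated assumption: the Venn diagram itself is not reproduced here, so the mapping from Groups A to G to gene combinations (`GROUP_MEMBERS`) is inferred from the table totals and is not taken from the figure. Group D contains no species, so its membership in particular cannot be confirmed from the numbers. All identifiers are hypothetical.

```python
# ASSUMPTION: group-to-gene mapping inferred from the table totals,
# not read from the Venn diagram in Figure 1.
GROUP_MEMBERS = {
    "A": {"HI", "HII", "HIII"},
    "B": {"HI", "HII"},
    "C": {"HII", "HIII"},
    "D": {"HI", "HIII"},  # empty group; membership not verifiable here
    "E": {"HI"},
    "F": {"HII"},
    "G": {"HIII"},
}

# "Total" rows of the group table (species per group).
GROUP_TOTALS = {
    "Archaea": {"A": 0, "B": 9, "C": 1, "D": 0, "E": 0, "F": 17, "G": 0},
    "Bacteria": {"A": 15, "B": 189, "C": 16, "D": 0, "E": 6, "F": 0, "G": 9},
}

# "No. of species" column of the per-type table.
SPECIES_PER_TYPE = {
    "Archaea": {"HI": 9, "HII": 27, "HIII": 1},
    "Bacteria": {"HI": 210, "HII": 220, "HIII": 40},
}


def species_per_type(group_counts):
    """Sum group sizes over every group that contains each RNase H type."""
    totals = {"HI": 0, "HII": 0, "HIII": 0}
    for group, n_species in group_counts.items():
        for rnase_type in GROUP_MEMBERS[group]:
            totals[rnase_type] += n_species
    return totals


if __name__ == "__main__":
    for kingdom, counts in GROUP_TOTALS.items():
        derived = species_per_type(counts)
        print(kingdom, derived, derived == SPECIES_PER_TYPE[kingdom])
```

With this mapping, the derived counts match both kingdoms: for example, bacterial RNase HI species are 15 + 189 + 0 + 6 = 210.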
Figure 2. Bayesian phylogenetic tree of 49 species in the Firmicutes. A phylogenetic tree was constructed with Bayesian inference on the basis of the gyrB sequence alignments of 49 species in the Firmicutes. Numbers at the nodes represent posterior probabilities. The scale bar represents 0.1 substitutions per site. Letters A through G next to the species names represent the combinations of RNase H genes defined in Figure 1; B' indicates the presence of dsRHbd.
Figure 3. Amino acid sequence alignments for the RNase HI domains. RNase HI protein sequences were derived from the Bacillales and Lactobacillales listed in Additional File 2 and were aligned by using the Clustal method. Arrows and rectangles indicate beta-strands and alpha-helices, respectively. The upper and lower secondary structures were generated on the basis of the RNase HI domains of E. coli [33] and B. halodurans [29], respectively. Dark and light shadings indicate highly conserved and similar amino acid residues, respectively. Asterisks denote amino acid residues that are involved in the catalytic function of RNase HI. The boxed region below the label for alpha-helix 3 forms a basic protrusion handle in the E. coli RNase HI structure. The combinations of RNase H genes are represented to the right of the sequences (see the text for details). The symbols † and § indicate active and inactive RNase H, respectively.
Table 3. List of genes containing dsRHbd in complete genomes.
| Kingdom | Species | Accession No. | ORF | Direction | Domain |
| Archaea | | | 832383–832988 | complement | dsRHbd (4–46), RNH (65–201) |
| Bacteria | | | 207338–207967 | complement | dsRHbd (6–48), RNH (79–197) |
| Bacteria | | | 253987–254616 | complement | dsRHbd (6–48), RNH (79–197) |
| Bacteria | | | 4371820–4372455 | complement | dsRHbd (6–48), RNH (81–199) |
| Bacteria | | | 1292573–1293223 | direct | dsRHbd (5–47), RNH (85–199) |
| Bacteria | | | 3808044–3808685 | complement | RNH (1–138), dsRHbd (156–197) |
| Bacteria | | | 1399495–1400094 | direct | dsRHbd (6–48), RNH (74–183) |
| Bacteria | | | 933504–934094 | direct | dsRHbd (6–48), RNH (69–192) |
| Bacteria | | | 2659515–2660237 | complement | dsRHbd (6–48), RNH (103–238) |
| Bacteria | | | 1707913–1708542 | complement | dsRHbd (5–49), RNH (70–207) |
| Bacteria | | | 2281472–2282092 | complement | dsRHbd (5–47), RNH (68–203) |
| Bacteria | | | 2075150–2075770 | complement | dsRHbd (8–50), RNH (71–203) |
| Bacteria | | | 116459–117205 | direct | dsRHbd (3–45), RNH (91–245) |
| Bacteria | | | 146497–147264 | direct | dsRHbd (4–45), RNH (93–253) |
| Bacteria | | | 978609–979283 | complement | Resolvase (1–106), dsRHbd (127–168) |
| Bacteria | | | 118550–119281 | direct | dsRHbd (3–45), RNH (86–240) |
| Bacteria | | | 2310574–2311470 | direct | dsRHbd (5–47), RNH (70–225) |
| Bacteria | | | 459722–460381 | direct | dsRHbd (4–46), RNH (59–216) |
| Bacteria | | | 559864–560484 | complement | dsRHbd (3–45), RNH (63–199) |
| Bacteria | | | 382602–383222 | direct | dsRHbd (5–47), RNH (63–198) |
| Bacteria | | | 375791–376408 | direct | dsRHbd (5–47), RNH (62–198) |
| Bacteria | | | 1267458–1268120 | complement | dsRHbd (5–47), RNH (73–216) |
| Bacteria | | | 1651474–1652124 | complement | dsRHbd (6–48), RNH (70–215) |
| Bacteria | | | 3036447–3037238 | complement | RNH (18–164), dsRHbd (207–248) |
| Bacteria | | | 1023218–1024003 | direct | dsRHbd (28–70), RNH (105–247) |
| Bacteria | | | 2628106–2628873 | complement | RNH (3–154), dsRHbd (199–240) |
| Bacteria | | | 1743847–1744665 | complement | dsRHbd (5–47), RNH (98–247) |
| Bacteria | | | 2161121–2161870 | complement | dsRHbd (5–47), RNH (77–225) |
| Bacteria | | | 82187–82945 | complement | dsRHbd (5–47), RNH (81–228) |
| Bacteria | | | 880428–881219 | direct | dsRHbd (5–47), RNH (91–239) |
| Bacteria | | | 897096–897740 | complement | dsRHbd (19–62), RNH (74–211) |
| Bacteria | | | 899069–899668 | complement | dsRHbd (4–47), RNH (88–196) |
| Bacteria | | | 1322788–1323459 | complement | dsRHbd (7–49), RNH (63–197) |
dsRHbd sequences were found in 32 complete genomes (31 species, one with two strains). Open reading frame (ORF) numbers indicate the genomic positions of the genes that encode dsRHbd. Domain numbers indicate the amino acid positions relative to the start of each protein sequence. RNH represents the RNase H group.
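Because ORF numbers are genomic nucleotide positions and domain numbers are amino acid positions, the two columns can be related: an ORF spanning N nucleotides encodes at most N / 3 codons, so every domain range should fit within that length. The sketch below parses one table row and runs this check. It assumes the ORF coordinates are 1-based and inclusive, which the table does not state, and the helper names (`parse_orf`, `parse_domains`, `check_row`) are hypothetical.

```python
import re

EN_DASH = "\u2013"  # the range separator used in the table

# Matches entries such as "dsRHbd (4–46)" or "RNH (65–201)".
DOMAIN_RE = re.compile(rf"(\w+) \((\d+){EN_DASH}(\d+)\)")


def parse_orf(orf):
    """Split an ORF string like '832383–832988' into integer start and end."""
    start, end = (int(part) for part in orf.split(EN_DASH))
    return start, end


def parse_domains(text):
    """Return (name, first_residue, last_residue) for each listed domain."""
    return [(name, int(a), int(b)) for name, a, b in DOMAIN_RE.findall(text)]


def check_row(orf, domains):
    """Relate nucleotide span to codon count and test that domains fit.

    ASSUMPTION: coordinates are 1-based and inclusive.
    """
    start, end = parse_orf(orf)
    length_nt = end - start + 1
    codons, remainder = divmod(length_nt, 3)
    parsed = parse_domains(domains)
    fits = remainder == 0 and all(1 <= a <= b <= codons for _, a, b in parsed)
    return codons, parsed, fits


if __name__ == "__main__":
    # The archaeal entry from the table.
    orf = f"832383{EN_DASH}832988"
    domains = f"dsRHbd (4{EN_DASH}46), RNH (65{EN_DASH}201)"
    print(check_row(orf, domains))
```

For the archaeal entry, the span is 606 nucleotides, or 202 codons, and both domain ranges fall inside it.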
Figure 4. Amino acid sequence alignments of dsRHbds from prokaryotes. Arrows and rectangles indicate beta-strands and alpha-helices, respectively. These secondary structures were generated on the basis of the N-terminal domain of S. cerevisiae RNase H1 [35]. Dark and light shadings indicate highly conserved and similar amino acid residues, respectively. The asterisk marks an identical amino acid residue. The symbol † indicates that the gene containing the resolvase domain encodes the amino acid sequence of dsRHbd. For detailed information, see Table 3.
Figure 5. Bayesian phylogenetic tree for the RNase HI domains. A phylogenetic tree was constructed by using Bayesian inference on the basis of the alignments of RNase HI domain sequences from 12 species in the Gammaproteobacteria, as listed in Additional File 4. Species names followed by Arabic numerals are used to distinguish multiple forms of RNase HI in a genome; names not followed by Arabic numerals indicate the presence of a single RNase HI in a genome. Numbers at the nodes represent posterior probabilities. The scale bar equals 0.1 substitutions per site.
Figure 6. Possible evolutionary models for the three RNase H genes. Two models can explain the mutually exclusive evolution of RNase HIII and RNase HI with dsRHbd (Model A) and of RNase HI with and without dsRHbd (Model B). The F labels indicate the different functions of the RNase H enzymes.