Céline Brochier, Patrick Forterre, Simonetta Gribaldo.
Abstract
BACKGROUND: Phylogenetic analysis of the Archaea has been mainly established by 16S rRNA sequence comparison. With the accumulation of completely sequenced genomes, it is now possible to test alternative approaches by using large sequence datasets. We analyzed archaeal phylogeny using two concatenated datasets consisting of 14 proteins involved in transcription and 53 ribosomal proteins (3,275 and 6,377 positions, respectively).
Year: 2004 PMID: 15003120 PMCID: PMC395767 DOI: 10.1186/gb-2004-5-3-r17
Source DB: PubMed Journal: Genome Biol ISSN: 1474-7596 Impact factor: 13.583
Figure 1. Unrooted neighbor-joining phylogenetic tree of the RNA polymerase subunit H computed from a Γ-corrected matrix of distances. Numbers close to nodes are bootstrap proportions. The scale bar represents the number of changes per position per unit branch length. For each taxon, the portion of the alignment from positions 57 to 83 is displayed. For clarity, amino acids identical between each taxon and the first taxon (Aeropyrum pernix) are indicated by dashes, whereas stars correspond to missing amino acids.
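The display convention described in this caption (dashes for residues identical to the first taxon, stars for missing residues) can be sketched as follows. This is a minimal illustration, not the authors' tool: the function name `render_against_reference`, the toy sequences, and the use of `-` as the gap character in the input alignment are all assumptions.

```python
def render_against_reference(alignment, start=1, end=None, gap="-"):
    """Render an alignment window relative to its first sequence.

    alignment: list of (name, sequence) pairs of equal length; the first
    pair is the reference taxon. start/end are 1-based, inclusive columns.
    Output convention: '-' = same residue as the reference, '*' = missing.
    """
    ref_name, ref_full = alignment[0]
    end = len(ref_full) if end is None else end
    ref = ref_full[start - 1:end]
    # The reference row keeps its residues; only its gaps become stars.
    rows = [(ref_name, ref.replace(gap, "*"))]
    for name, full in alignment[1:]:
        seq = full[start - 1:end]
        out = []
        for r, s in zip(ref, seq):
            if s == gap:
                out.append("*")   # residue missing in this taxon
            elif s == r:
                out.append("-")   # identical to the reference
            else:
                out.append(s)     # differing residue is shown as-is
        rows.append((name, "".join(out)))
    return rows


# Hypothetical toy alignment (not data from the paper).
toy = [("ref", "MKV-L"), ("t2", "MRV-L"), ("t3", "MK-AL")]
for name, row in render_against_reference(toy):
    print(f"{name:5s} {row}")
```

Under these assumptions, the window shown in the figure would correspond to calling the function with `start=57, end=83`.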
Figure 2. Unrooted maximum likelihood (ML) phylogenetic trees obtained from the transcription and translation datasets. (a) Transcription; (b) translation. The best tree and the branch lengths were calculated using the program PUZZLE with a Γ-law correction. Numbers at the nodes are ML bootstrap supports computed with the RELL method using the MOLPHY program without correction for among-site variation. The scale bars represent the number of changes per position per unit branch length.
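The RELL (resampling of estimated log-likelihoods) idea mentioned in this caption can be illustrated with a short sketch. It is not the MOLPHY implementation: the function name `rell_support`, the input layout (per-site log-likelihoods keyed by a tree label), and the toy numbers are assumptions. The sketch returns support per candidate tree; node supports such as those shown in the figure would then be obtained by summing over the trees that contain a given node.

```python
import random


def rell_support(site_lnl, replicates=1000, seed=0):
    """Estimate bootstrap support for candidate trees by RELL.

    site_lnl maps a tree label to its list of per-site log-likelihoods.
    Sites are resampled with replacement and the log-likelihoods are
    re-summed, so no tree is re-optimised on the resampled data.
    """
    trees = list(site_lnl)
    n_sites = len(site_lnl[trees[0]])
    if any(len(site_lnl[t]) != n_sites for t in trees):
        raise ValueError("all trees need the same number of sites")
    rng = random.Random(seed)
    wins = dict.fromkeys(trees, 0)
    for _ in range(replicates):
        # One bootstrap replicate: n_sites column indices with replacement.
        sample = [rng.randrange(n_sites) for _ in range(n_sites)]
        totals = {t: sum(site_lnl[t][i] for i in sample) for t in trees}
        wins[max(totals, key=totals.get)] += 1
    return {t: wins[t] / replicates for t in trees}


# Hypothetical per-site log-likelihoods for two candidate topologies.
toy = {"T1": [-1.0, -2.0, -1.5], "T2": [-1.2, -2.5, -1.6]}
print(rell_support(toy, replicates=200))
```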
Figure 3. Unrooted neighbor-joining phylogenetic tree of the RNA polymerase subunits A' and A" computed from a Γ-corrected matrix of distances. (a) Polymerase A'; (b) polymerase A". Numbers close to nodes are bootstrap proportions. The scale bars represent the number of changes per position per unit branch length.
Figure 4. Comparison between the percentages of differences observed in the transcription and ribosomal datasets for each pair of taxa. The x-axis represents the percentage of amino-acid differences observed between two taxa for the concatenated transcription dataset. The y-axis represents the same percentage for the concatenated ribosomal dataset. Each circle corresponds to one pair of taxa. The majority of circles lie close to the diagonal, indicating a strong correlation (R = 0.88) between the differences observed in the two concatenated datasets. White circles represent the comparisons of Methanopyrus kandleri with other taxa.
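The comparison plotted in this figure can be sketched in a few lines: compute the percentage of differing amino acids for every pair of taxa in each concatenated dataset, then correlate the two series. Several details here are assumptions rather than statements from the source: positions with a gap (`-`) in either sequence are skipped, R is taken to be the Pearson coefficient, and the function names and toy sequences are hypothetical.

```python
from itertools import combinations
from math import sqrt


def percent_difference(a, b, gap="-"):
    """Percentage of differing residues over positions ungapped in both."""
    pairs = [(x, y) for x, y in zip(a, b) if x != gap and y != gap]
    if not pairs:
        raise ValueError("no comparable positions")
    return 100.0 * sum(x != y for x, y in pairs) / len(pairs)


def pairwise_percent_differences(dataset):
    """Map each unordered pair of taxa (sorted by name) to its % difference."""
    return {
        (s, t): percent_difference(dataset[s], dataset[t])
        for s, t in combinations(sorted(dataset), 2)
    }


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length series."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation undefined for a constant series")
    return sxy / sqrt(sxx * syy)


# Hypothetical toy datasets sharing the same taxon names.
transcription = {"A": "MKVL", "B": "MKVI", "C": "MRAI"}
ribosomal = {"A": "AAAAAAAA", "B": "AAAAAAAT", "C": "AAAATTTT"}

x = pairwise_percent_differences(transcription)
y = pairwise_percent_differences(ribosomal)
pairs = sorted(x)
print(pearson_r([x[p] for p in pairs], [y[p] for p in pairs]))
```

A taxon whose points fall well off the diagonal, as the caption singles out for Methanopyrus kandleri, would show up as pairs whose two percentages diverge.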
Figure 5. Unrooted neighbor-joining phylogenetic tree of the RNA polymerase subunit B computed from a Γ-corrected distance matrix. Numbers close to nodes are bootstrap proportions. The scale bar represents the number of changes per position for a unit branch length. In Methanococcus maripaludis, Methanocaldococcus jannaschii, Methanopyrus kandleri, Methanothermobacter thermoautotrophicus, Archaeoglobus fulgidus, Thermoplasmatales, Methanosarcinales and Halobacteriales genomes, the gene for the RNA polymerase subunit B is split in two parts: B' and B". The black and white boxes correspond to the B' and B" parts of the gene, respectively. S and F represent the split and fusion event hypotheses of the B' and B" parts of the gene.
Indels in the 12 subunits of RNA polymerase
| Taxon | Total number of indels | Number of specific indels | Percentage of specific indels |
|  | 38 | 4 | 10.53 |
|  | 33 | 5 | 15.15 |
|  | 28 | 1 | 3.57 |
|  | 30 | 3 | 10.00 |
|  | 17 | 2 | 11.76 |
|  | 23 | 2 | 13.04 |
|  | 24 | 3 | 12.50 |
|  | 24 | 7 | 29.17 |
|  | 22 | 6 | 27.27 |
|  | 57 | 27 | 47.37 |
| Methanosarcinales | 10 | 2 | 20.00 |
|  | 17 | 2 | 11.76 |
| Thermococcales | 19 | 2 | 10.53 |
| Thermoplasmatales | 36 | 8 | 22.22 |
For each species, regions containing insertions/deletions (indels) have been counted for the 12 RNA polymerase subunits (A', A", B, D, E', E", F, H, K, L, N, P), TFS, NusA and NusG. We use the term 'indel region' because if two species exhibit indels in the same region, even if the indels differ in size, we count this region as a shared indel region. For each species, the number and percentage of specific regions containing indels (that is, regions exclusive to that species and not shared by any other species) are indicated. As they share exactly the same indels, the three Pyrococcus species, the three Methanosarcina species and the two Thermoplasma species plus Ferroplasma are grouped in Thermococcales, Methanosarcinales and Thermoplasmatales, respectively. Consequently, the specific indels are those specific to the group.
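The counting rule described above can be sketched as follows: an indel region is specific to a taxon when no other taxon has an indel in that region, and the percentage is the specific count divided by the taxon's total. The data layout is an assumption made for illustration: each region is identified by a hypothetical `(protein, region_index)` tuple, and the taxon labels and region sets below are invented, not taken from the paper.

```python
from collections import Counter


def specific_indel_regions(regions_by_taxon):
    """Return {taxon: (total, specific, percent_specific)}.

    regions_by_taxon maps a taxon (or group) name to the set of indel
    regions it contains. Indels of different sizes in the same region are
    expected to share one identifier, so they count as a shared region.
    """
    # How many taxa carry an indel in each region.
    carriers = Counter(
        region
        for regions in regions_by_taxon.values()
        for region in set(regions)
    )
    summary = {}
    for taxon, regions in regions_by_taxon.items():
        regions = set(regions)
        specific = sum(1 for region in regions if carriers[region] == 1)
        total = len(regions)
        percent = round(100.0 * specific / total, 2) if total else 0.0
        summary[taxon] = (total, specific, percent)
    return summary


# Hypothetical toy input: (protein, region index) identifiers.
toy = {
    "taxon_X": {("A'", 1), ("B", 2), ("H", 1)},
    "taxon_Y": {("A'", 1), ("B", 3)},
    "taxon_Z": {("A'", 1)},
}
for taxon, row in specific_indel_regions(toy).items():
    print(taxon, row)
```

The same ratio applied to the Thermococcales row of the table, 2 specific regions out of 19, gives 100 × 2 / 19 ≈ 10.53.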
Figure 6. An example of an indel being flanked by divergent regions in Methanopyrus kandleri. The portion of the alignment corresponds to positions 1,281 to 1,340 in our RNA polymerase subunit A' dataset. For clarity, identical amino acids shared by each taxon and the first taxon (Sulfolobus tokodaii) are indicated by dashes, whereas stars correspond to missing amino acids.