Kira S Makarova, Alexander V Sorokin, Pavel S Novichkov, Yuri I Wolf, Eugene V Koonin.
Abstract
BACKGROUND: An evolutionary classification of genes from sequenced genomes that distinguishes between orthologs and paralogs is indispensable for genome annotation and evolutionary reconstruction. Shortly after multiple genome sequences of bacteria, archaea, and unicellular eukaryotes became available, an attempt at such a classification was implemented in Clusters of Orthologous Groups of proteins (COGs). Rapid accumulation of genome sequences creates opportunities for refining COGs but also represents a challenge because of error amplification. One of the practical strategies involves construction of refined COGs for phylogenetically compact subsets of genomes.
Year: 2007 PMID: 18042280 PMCID: PMC2222616 DOI: 10.1186/1745-6150-2-33
Source DB: PubMed Journal: Biol Direct ISSN: 1745-6150 Impact factor: 4.540
Table 1. The 41 archaeal genomes included in the arCOGs
| Division | Lineage | Abbreviation | Genome size, Mb | Number of annotated protein-coding genes | OGT^a | Lifestyle and other features | Ref^b | GenBank accession | |
| Crenarchaeota | Desulfurococcales | Aerpe | 1.7 | 1700 | 90°C | Aerobic chemoorganotroph, sulfur enhances growth | [60] | ||
| Crenarchaeota | Thermoproteales | Calma | 2 | 1943 | 90°C | Moderate acidophile, heterotroph, anaerobe or microaerophile | |||
| Crenarchaeota | Cenarchaeales | Censy | 2 | 2017 | ~10°C | Moderate psychrophile, uncultivated symbiont of sponges | [33] | ||
| Crenarchaeota | Desulfurococcales | Hypbu | 1.7 | 1602 | >100°C | Hyperthermophilic neutrophile, anaerobe | [61] | ||
| Crenarchaeota | Thermoproteales | Pyrae | 2.2 | 2605 | 100°C | Facultative nitrate-reducing anaerobe | [62] | ||
| Crenarchaeota | Thermoproteales | Pyrca | 2 | 2149 | 100°C | Same as Pyrae | NA | ||
| Crenarchaeota | Thermoproteales | Pyris | 1.8 | 1978 | 100°C | Same as Pyrae | NA | ||
| Crenarchaeota | Desulfurococcales | Stama | 1.6 | 1570 | 80°C | Anaerobic submarine heterotroph | NA | ||
| Crenarchaeota | Sulfolobales | Sulac | 2.2 | 2223 | 80°C | Aerobic thermoacidophile | [63] | ||
| Crenarchaeota | Sulfolobales | Sulso | 3 | 2977 | 80°C | Sulfur-metabolizing chemoorganotroph, thermoacidophilic, motile aerobe | [64] | ||
| Crenarchaeota | Sulfolobales | Sulto | 2.7 | 2825 | 80°C | Same as Sulso | [65] | ||
| Crenarchaeota | Thermoproteales | Thepe | 1.8 | 1876 | 92°C | Acidophilic anaerobe | NA | ||
| Crenarchaeota | Thermoproteales | Thete | 1.8 | 2021 | 96°C | Facultative hydrogen-sulfur autotroph, anaerobe | NA | n/a | |
| Euryarchaeota | Archaeoglobales | Arcfu | 2.2 | 2420 | 83°C | Motile, anaerobic, sulfate-reducing chemolitho- or chemoorgano-autotroph | [66] | ||
| Euryarchaeota | Halobacteriales | Halma | 4.3 | 4240 | 37°C | Chemoorganotrophic obligate halophile | [67] | ||
| Euryarchaeota | Halobacteriales | Halsp | 2.6 | 2622 | 37°C | Aerobic chemoorganotroph, obligate halophile, proteolytic, motile, with cell envelope; 2 extrachromosomal elements | [68] | ||
| Euryarchaeota | Halobacteriales | Halwa | 3.2 | 2646 | 37°C | Halophilic, aerobic heterotroph | [69] | ||
| Euryarchaeota | Methanobacteriales | Metth | 1.8 | 1873 | 65°C | Chemolithoautotroph, strict anaerobe, nitrogen-fixing methanogen | [70] | ||
| Euryarchaeota | Methanosarcinales | Metbu | 2.6 | 2273 | 23°C | Psychrotolerant, strictly anaerobic, slightly halophilic methylotroph | NA | ||
| Euryarchaeota | Methanococcales | Metja | 1.7 | 1786 | 85°C | Chemolithoautotrophic, strictly anaerobic, motile methanogen; 2 extrachromosomal elements | [71] | ||
| Euryarchaeota | Methanococcales | MetmC | 1.8 | 1822 | 37°C | Mesophilic hydrogenotrophic, nitrogen-fixing methanogen | [72] | ||
| Euryarchaeota | Methanococcales | Metmp | 1.7 | 1722 | 37°C | Same as MetmC | NA | ||
| Euryarchaeota | Methanomicrobiales | Metla | 1.8 | 1739 | 37°C | Strictly anaerobic, CO2-fixing methanogen | NA | ||
| Euryarchaeota | Methanomicrobiales | Metcu | 2.5 | 2489 | 37°C | Strictly anaerobic methanogen | NA | ||
| Euryarchaeota | Methanopyrales | Metka | 1.7 | 1687 | 110°C | Chemolithoautotrophic, strictly anaerobic methanogen, high intracellular salt concentration | [41] | ||
| Euryarchaeota | Methanosarcinales | Metsa | 1.9 | 1696 | 60°C | Strictly anaerobic methanogen | NA | ||
| Euryarchaeota | Methanosarcinales | Metac | 5.8 | 4540 | 37°C | Chemolithoautotrophic, anaerobic, nitrogen-fixing, versatile methanogen, motile, forms multicellular structures | [73] | ||
| Euryarchaeota | Methanosarcinales | Metba | 4.8 | 3624 | 37°C | Same as Metac | [74] | ||
| Euryarchaeota | Methanosarcinales | Metma | 4.1 | 3370 | 37°C | Same as Metac | [75] | ||
| Euryarchaeota | Methanobacteriales | Metst | 1.8 | 1534 | 37°C | Methanogen, human intestinal inhabitant | [76] | ||
| Euryarchaeota | Methanomicrobiales | Methu | 3.5 | 3139 | 37°C | Strictly anaerobic methanogen | NA | ||
| Euryarchaeota | Halobacteriales | Natph | 2.8 | 2822 | 37°C | Extreme haloalkaliphile | [77] | ||
| Euryarchaeota | Thermoplasmales | Picto | 1.6 | 1535 | 65°C | Extremely acidophilic moderate thermophile | [78] | ||
| Euryarchaeota | Thermococcales | Pyrab | 1.8 | 1898 | 96°C | Same as Pyrho | [79] | ||
| Euryarchaeota | Thermococcales | Pyrfu | 1.9 | 2125 | 96°C | Same as Pyrho | [80] | ||
| Euryarchaeota | Thermococcales | Pyrho | 1.7 | 1955 | 96°C | Anaerobic, motile heterotroph | [81] | ||
| Euryarchaeota | Thermococcales | Theko | 2.1 | 2306 | 85°C | Anaerobic heterotroph | [82] | ||
| Euryarchaeota | Thermoplasmales | Theac | 1.6 | 1482 | 59°C | Chemoorganotrophic, thermoacidophilic, motile facultative anaerobe | [83] | ||
| Euryarchaeota | Thermoplasmales | Thevo | 1.6 | 1499 | 60°C | Same as Theac | [84] | ||
| Euryarchaeota | ? | Uncme | 3.2 | 3085 | 37°C | Uncultured methanogenic archaeon; methanogen isolated from rice rhizosphere | NA | ||
| Nanoarchaeota | ? | Naneq | 0.5 | 536 | 80°C | Obligate symbiont of the crenarchaeon | [30] |
^a OGT, optimal growth temperature.
^b Only the references that report the complete genome of the respective species and its initial analysis are cited.
Figure 1. A flow chart of the procedure employed for the construction of the arCOGs. See Materials and Methods for the description of each step.
Figure 2. Coverage of archaeal genomes with arCOGs and COGs. Cyan, arCOGs; purple, COGs. Abbreviations are as in Table 1.
Figure 3. Distribution of the number of species in arCOGs: three classes of archaeal genes. A semi-logarithmic plot fitted with a sum of three exponentials.
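As a rough illustration of the fit mentioned in the Figure 3 caption, the sketch below evaluates a sum of exponential terms of the form a·exp(b·k), where k is the number of species in an arCOG. The function name and the (a, b) pairs are hypothetical and chosen only for illustration; they are not the fitted values behind the figure. A single exponential term plots as a straight line on a semi-logarithmic scale, which is why a curve with distinct regimes is naturally described by a sum of several such terms.

```python
import math

def sum_of_exponentials(k, components):
    """Evaluate sum(a * exp(b * k)) over (a, b) pairs at species count k."""
    return sum(a * math.exp(b * k) for a, b in components)

# Hypothetical (amplitude, rate) pairs, for illustration only.
# A negative rate decays with k; a positive rate rises toward large k.
components = [(5000.0, -1.0), (600.0, -0.15), (0.001, 0.3)]

# Values on the log10 scale, as they would appear on a semi-log plot.
for k in (1, 10, 20, 41):
    value = sum_of_exponentials(k, components)
    print(k, round(math.log10(value), 3))
```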
Figure 4. Distribution of phyletic patterns by the number of arCOGs. A log-log plot.
The 10 most common phyletic patterns in the arCOGs
| Lineage | Species^a | Number of arCOGs |
| Methanosarcinales | Metac, Metba, Metma | 239 |
| Halobacteriales | Halma, Halsp, Halwa, Natph | 204 |
| Sulfolobales | Sulac, Sulso, Sulto | 192 |
| Thermoproteales | Pyrae, Pyrca, Pyris, Thete | 162 |
| Thermococcales | Pyrab, Pyrfu, Pyrho, Theko | 142 |
| Methanosarcinales | Metac, Metba | 126 |
| Methanococcales | MetmC, Metmp | 105 |
| Halobacteriales | Halma, Halwa | 99 |
| Thermoplasmales | Picto, Theac, Thevo | 96 |
^a Abbreviations are as in Table 1.
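A phyletic pattern is the set of species represented in an arCOG, so a tally like the one in the table above can be obtained by grouping arCOGs by their species sets. The following is a minimal sketch under that reading; the arCOG identifiers and memberships are made up for illustration and do not come from the actual data set.

```python
from collections import Counter

def phyletic_pattern_counts(arcog_species):
    """Count how many arCOGs share each distinct set of species."""
    return Counter(frozenset(species) for species in arcog_species.values())

# Hypothetical arCOG identifiers and species memberships, for illustration only.
toy_arcogs = {
    "arCOG_A": {"Metac", "Metba", "Metma"},
    "arCOG_B": {"Metma", "Metac", "Metba"},
    "arCOG_C": {"Sulac", "Sulso", "Sulto"},
    "arCOG_D": {"Metac", "Metba"},
}

counts = phyletic_pattern_counts(toy_arcogs)

# List patterns from most to least common.
for pattern, n_arcogs in counts.most_common():
    print(", ".join(sorted(pattern)), n_arcogs)
```

Using `frozenset` makes the pattern independent of the order in which species are listed, so the first two toy arCOGs fall into the same pattern.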
Figure 5. Functional breakdown of the entire set of arCOGs and the three core sets. EA, Euryarchaeota; CA, Crenarchaeota.
Figure 6. The gene-content tree of archaea constructed on the basis of the phyletic patterns of arCOGs. The species abbreviations are as in Table 1. Cren, Crenarchaeota; Eury, Euryarchaeota.
Figure 7. A reconstruction of gene gain and loss in archaea. Each branch is labeled with three numbers: black, the (inferred) number of arCOGs in the node to which the given branch leads; blue, the number of arCOGs lost along the branch; red, the number of arCOGs gained along the branch. The red circles on branches denote hyperthermophiles, and the blue circles denote mesophiles and moderate thermophiles.
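The three numbers on each branch in Figure 7 can be read as simple set arithmetic once the arCOG content of the parent and child nodes is known. The sketch below assumes those node contents have already been inferred by some reconstruction method, which is not shown here; the function name and the toy identifiers are hypothetical.

```python
def branch_labels(parent_arcogs, child_arcogs):
    """Return the node size and the loss/gain counts for one tree branch."""
    parent, child = set(parent_arcogs), set(child_arcogs)
    return {
        "present": len(child),         # black: arCOGs in the node the branch leads to
        "lost": len(parent - child),   # blue: in the parent but not in the child
        "gained": len(child - parent), # red: in the child but not in the parent
    }

# Hypothetical node contents, for illustration only.
ancestor = {"arCOG_1", "arCOG_2", "arCOG_3"}
descendant = {"arCOG_2", "arCOG_3", "arCOG_4", "arCOG_5"}

print(branch_labels(ancestor, descendant))
```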
Figure 8. Lower-bound reconstructions for ancestral archaeal forms: genomes close in size to modern hyperthermophiles. Each column shows the total number of annotated protein-coding genes in the respective archaeal species; the colored portions (green for Crenarchaeota, blue for Euryarchaeota, and cyan for Nanoarchaeota) show genes included in arCOGs. The hatched columns show the number of arCOGs assigned to the Last Archaeal Common Ancestor (LACA), the Last CrenArchaeal Common Ancestor (LCACA), and the Last EuryArchaeal Common Ancestor (LEACA).
Major features of the reconstructed gene set of LACA
| Complete translation system and essentially complete set of enzymes for tRNA and rRNA modification | |||
| including: | 61 | Ribosomal proteins | |
| 21 | aaRS and related enzymes | ||
| Moderately sophisticated transcription control | |||
| including: | 22 | Transcription regulators | |
| 13 | RNA polymerase subunits | ||
| Advanced DNA replication and repair system | |||
| including: | 6 | Topoisomerases | |
| 4 | DNA polymerase subunits | ||
| Membrane-based redox bioenergetics; partial TCA cycle | |||
| including: | 13 | Pyruvate oxidation | |
| 9 | TCA cycle | ||
| 9 | NADH dehydrogenase or Na+/H+ antiporter | ||
| 8 | V-type ATPase-ATP synthase | ||
| Moderately sophisticated sugar metabolism | |||
| including: | 8 | Glycolysis/Gluconeogenesis | |
| Enzymes for the biosynthesis of all amino acids | |||
| including: | 72 | Amino acid biosynthesis | |
| Enzymes for the biosynthesis of all nucleotides | |||
| including: | 29 | Nucleotide biosynthesis | |
| 6 | Nucleotide salvage | ||
| Enzymes for the biosynthesis of all essential cofactors | |||
| including: | 60 | Cofactor biosynthesis | |
| Fully developed membrane | |||
| including: | 19 | Lipid biosynthesis | |
| Fully developed cell wall | |||
| Sophisticated ion uptake system | |||
| Limited or unknown | |||
| Limited motility and/or conjugation | |||
| Sophisticated system of protein fate control | |||
| including: | 2 | Proteasome | |
| Limited or unknown | |||
| Limited use of bacterial type signal transduction system; original signal transduction | |||
| including: | 3 | Serine/threonine kinase | |
| Fully developed secretion system | |||
| including: | 3 | Preprotein translocase | |
| Viruses abundant at LACA times | |||
| including: | 6 | CASS proteins | |
Figure 9. Taxonomic affinities of arCOGs with bacteria and eukaryotes. For the criteria of taxonomic assignments, see Materials and Methods. A, archaea; B, bacteria; E, eukaryotes.