Igor V Merkeev, Pavel S Novichkov, Andrey A Mironov.
Abstract
BACKGROUND: Orthologs and paralogs are widely used terms in modern comparative genomics. Existing procedures for resolving orthologous/paralogous relationships are often based on manual revision of clusters of orthologous groups and/or lack any rigorous evolutionary base. DESCRIPTION: We developed a completely automated procedure that creates clusters of orthologous groups at each node of the taxonomy tree (PHOGs: Phylogenetic Orthologous Groups). As a result of this procedure, a tree of orthologous groups was obtained. Each cluster is a "supergene" and it is represented by an "ancestral" sequence obtained from the multiple alignment of orthologous and paralogous genes. The procedure has been applied to the taxonomy tree of organisms from all three domains of life. Protein complements from 50 bacterial, archaeal and eukaryotic species were used to create PHOGs at all tree nodes. 51367 PHOGs were obtained at the root node.
Year: 2006 PMID: 16792803 PMCID: PMC1523204 DOI: 10.1186/1471-2148-6-52
Source DB: PubMed Journal: BMC Evol Biol ISSN: 1471-2148 Impact factor: 3.260
Figure 1. Evolution by gene duplication. Nodes N1, N2, N3 represent speciation events resulting in orthologs. Filled circles (●) mark gene duplication events resulting in paralogs.
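The distinction drawn in Figure 1 follows the standard definition: two genes are orthologs if their last common ancestor in the gene tree is a speciation event, and paralogs if it is a duplication event. The following is a minimal Python sketch of that rule, assuming a hypothetical toy gene tree; the node names (`N1`, `D1`, `N2a`, `N2b`) and gene labels are illustrative and are not taken from the figure.

```python
# Hypothetical toy gene tree (child -> parent); not the tree drawn in Figure 1.
PARENT = {
    "D1": "N1", "c1": "N1",     # N1: speciation splitting off species C
    "N2a": "D1", "N2b": "D1",   # D1: duplication producing two gene copies
    "a1": "N2a", "b1": "N2a",   # N2a: speciation of copy 1 into species A and B
    "a2": "N2b", "b2": "N2b",   # N2b: speciation of copy 2 into species A and B
}
# Event type assigned to each internal node (assumed for illustration).
EVENT = {
    "N1": "speciation",
    "D1": "duplication",
    "N2a": "speciation",
    "N2b": "speciation",
}


def ancestors(node):
    """Return the path from a node up to the root, starting with the node itself."""
    path = [node]
    while node in PARENT:
        node = PARENT[node]
        path.append(node)
    return path


def relationship(gene1, gene2):
    """Classify two distinct leaf genes by the event at their last common ancestor."""
    seen = set(ancestors(gene1))
    # The first ancestor of gene2 that is also an ancestor of gene1 is the LCA.
    lca = next(node for node in ancestors(gene2) if node in seen)
    return "orthologs" if EVENT[lca] == "speciation" else "paralogs"


print(relationship("a1", "b1"))  # copies separated by speciation N2a
print(relationship("a1", "a2"))  # copies separated by duplication D1
```

In this toy tree, `a1` and `b2` are also classified as paralogs, because the duplication `D1` predates the speciation that separated species A and B.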
Figure 2. One connected component contains two orthologous groups A1A2A3A4 and B1B2B3B4. The false BBH bridge A1B3 connects both orthologous groups.
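The effect described in Figure 2 can be reproduced on a small graph: if genes are nodes and bidirectional best hits (BBHs) are edges, a single false BBH is enough to fuse two orthologous groups into one connected component. The sketch below is an illustration only. The within-group edges are assumed (the figure's exact edge set is not given in the caption), and `connected_components` is a plain depth-first search, not the authors' clustering procedure.

```python
from collections import defaultdict


def connected_components(edges):
    """Return the connected components of an undirected graph as sorted lists."""
    graph = defaultdict(set)
    for u, v in edges:
        graph[u].add(v)
        graph[v].add(u)

    seen, components = set(), []
    for start in sorted(graph):
        if start in seen:
            continue
        # Iterative depth-first search from an unvisited node.
        stack, component = [start], set()
        while stack:
            node = stack.pop()
            if node in component:
                continue
            component.add(node)
            stack.extend(graph[node] - component)
        seen |= component
        components.append(sorted(component))
    return components


# Assumed BBH edges inside each orthologous group (hypothetical).
group_a = [("A1", "A2"), ("A2", "A3"), ("A3", "A4")]
group_b = [("B1", "B2"), ("B2", "B3"), ("B3", "B4")]
# The false BBH bridge named in the caption.
false_bridge = [("A1", "B3")]

separate = connected_components(group_a + group_b)
merged = connected_components(group_a + group_b + false_bridge)

print(len(separate))  # two groups without the bridge
print(len(merged))    # one component once the bridge is added
```

This is why connected components of the BBH graph alone cannot be taken as orthologous groups: one spurious edge merges otherwise well-separated clusters.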
Figure 3. The taxonomy tree of organisms used to build the PHOG database. The number of PHOGs at each node of the tree is shown in parentheses.
Number of PHOGs obtained at the nodes of the taxonomy tree for the lineage leading from the Universal Common Ancestor to Escherichia coli O157. Columns run from the Escherichia coli O157 end of the lineage (leftmost) to the root node (rightmost, 51367 PHOGs). For each node, ancestral PHOGs (Na) contain two or more PHOGs from its child nodes that were declared orthologs, and possibly some PHOGs from child nodes that were declared paralogs (Np). The ratio Np/Na indicates how many paralogs evolved from the Na ancestral genes. Node-specific PHOGs (Nns) are the ancestral PHOGs that found no match when the procedure was run at the nodes lying higher in the taxonomy tree.
| Total number of PHOGs, N | 5196 | 5276 | 11868 | 17058 | 23915 | 38234 | 51367 |
| Number of node-specific PHOGs, Nns | 578 | 161 | 1327 | 934 | 996 | 2079 | 2055 |
| Number of ancestral PHOGs, Na | 5196 | 3780 | 5190 | 3766 | 3104 | 3827 | 2055 |
| Number of paralogs, Np | 0 | 629 | 3101 | 2373 | 1576 | 2453 | 1620 |
| Np/Na | 0 | 0.166 | 0.597 | 0.63 | 0.507 | 0.64 | 0.788 |
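The last row of the table is simply the quotient of the two rows above it. As a quick consistency check, the short Python sketch below recomputes Np/Na from the tabulated Na and Np values; the variable names are illustrative, and the numbers are copied from the table.

```python
# Values copied from the table rows, in the same column order.
NA = [5196, 3780, 5190, 3766, 3104, 3827, 2055]  # ancestral PHOGs, Na
NP = [0, 629, 3101, 2373, 1576, 2453, 1620]      # paralogs, Np
TABLE_RATIO = [0, 0.166, 0.597, 0.63, 0.507, 0.64, 0.788]

# Recompute the ratio column by column.
ratios = [n_p / n_a for n_p, n_a in zip(NP, NA)]

for n_a, n_p, computed, reported in zip(NA, NP, ratios, TABLE_RATIO):
    print(f"Na={n_a:5d}  Np={n_p:5d}  Np/Na={computed:.4f}  (table: {reported})")
```

The recomputed ratios agree with the tabulated ones to within about 0.001; the small differences are consistent with the table values having been truncated or rounded to two or three decimals.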
Figure 4. A possible evolutionary scenario for PHOG16006. Dashed lines indicate gene losses.