Andreas Heger, Chris P Ponting.
Abstract
The genome sequences of a large number of metazoan species are now known. As multiple closely related genomes are sequenced, comparative studies that previously focussed on only pairs of genomes can now be extended over whole clades. The orthologous and paralogous transcripts in clades (OPTIC) database currently provides sets of gene predictions and orthology assignments for three clades: (i) amniotes, including human, dog, mouse, opossum, platypus and chicken (17 443 orthologous groups); (ii) a Drosophila clade of 12 species (12 889 orthologous groups) and (iii) a nematode clade of four species (13 626 orthologous groups). Gene predictions, multiple alignments and phylogenetic trees are freely available to browse and download from http://genserv.anat.ox.ac.uk/clades. Further genomes and clades will be added in the future.
Year: 2007 PMID: 17933761 PMCID: PMC2238935 DOI: 10.1093/nar/gkm852
Source DB: PubMed Journal: Nucleic Acids Res ISSN: 0305-1048 Impact factor: 16.971
Gene sets and orthology assignments in three clades
| Species | Genes | Genes with orthologs (%) | Orphaned genes (%) |
|---|---|---|---|
| | 13 836 | 13 563 (98) | 273 (2) |
| | 13 203 | 12 318 (93) | 885 (7) |
| | 15 467 | 14 356 (93) | 1111 (7) |
| | 14 199 | 13 471 (95) | 728 (5) |
| | 14 971 | 14 218 (95) | 753 (5) |
| | 14 337 | 13 205 (92) | 1132 (8) |
| | 12 304 | 11 609 (94) | 695 (6) |
| | 12 973 | 11 876 (92) | 1097 (8) |
| | 13 144 | 11 360 (86) | 1784 (14) |
| | 12 017 | 11 096 (92) | 921 (8) |
| | 11 717 | 10 883 (93) | 834 (7) |
| | 11 800 | 11 011 (93) | 789 (7) |
| | 20 093 | 14 037 (70) | 6056 (30) |
| | 18 137 | 14 961 (82) | 3176 (18) |
| | 21 931 | 17 759 (81) | 4172 (19) |
| | 18 388 | 13 460 (73) | 4928 (27) |
| | 22 611 | 19 339 (86) | 3272 (14) |
| | 24 442 | 20 758 (85) | 3684 (15) |
| | 19 314 | 18 066 (94) | 1248 (6) |
| | 19 597 | 18 123 (92) | 1474 (8) |
| | 18 596 | 15 312 (82) | 3284 (18) |
| | 16 715 | 13 893 (83) | 2822 (17) |
Gene sets marked with an asterisk (*) were obtained from ENSEMBL, whereas all others were predicted by the pipeline. Orphans are genes that have no ortholog in any of the other genomes in the clade. They may reflect heuristic failures in our ortholog prediction pipeline or errors in gene prediction, as well as true gene losses.
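The relationship between the table's columns can be checked with a few lines of arithmetic. The sketch below is only an illustration: the function name `orphan_summary` and the returned dictionary keys are hypothetical and are not part of the OPTIC pipeline. It assumes that orphaned genes are simply the genes left over once those with orthologs are subtracted, and that the percentages are rounded to whole numbers.

```python
def orphan_summary(total_genes: int, genes_with_orthologs: int) -> dict:
    """Split a gene set into ortholog-bearing and orphaned genes.

    Hypothetical helper: percentages are relative to the total gene count
    and rounded to whole numbers, as they appear in the table.
    """
    if total_genes <= 0 or not 0 <= genes_with_orthologs <= total_genes:
        raise ValueError("gene counts must satisfy 0 <= with_orthologs <= total, total > 0")
    orphans = total_genes - genes_with_orthologs
    return {
        "orphans": orphans,
        "ortholog_pct": round(100 * genes_with_orthologs / total_genes),
        "orphan_pct": round(100 * orphans / total_genes),
    }


# Figures taken from the first row of the table: 13 836 genes, 13 563 with orthologs.
summary = orphan_summary(13836, 13563)
print(summary)
```

For the first row this reproduces the 273 orphaned genes (2%) reported in the table. Because each percentage is rounded independently, the two values need not always sum to exactly 100.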
Figure 1. Browsing the orthology database. A sample session starts with a query for all simple 1:1 ortholog sets (bottom left). It continues with a list of all simple 1:1 ortholog sets containing all six species from the amniotic clade, then with the selection of one particular ortholog set (number 114), and finally with a view of the gene-based multiple sequence alignment.
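The kind of query shown in the sample session can be sketched offline. The example below is a minimal illustration, assuming that a simple 1:1 ortholog set is a group containing exactly one gene from each of the six amniote species. The data layout (a dictionary mapping a group number to `(species, gene)` pairs), the gene identifiers and the function name `is_simple_one_to_one` are all hypothetical; the actual OPTIC schema and query interface are not described here.

```python
from collections import Counter

# The six amniote species named in the abstract.
AMNIOTES = {"human", "dog", "mouse", "opossum", "platypus", "chicken"}


def is_simple_one_to_one(members, species=AMNIOTES):
    """Return True if the group holds exactly one gene from every species."""
    counts = Counter(sp for sp, _gene in members)
    return set(counts) == set(species) and all(n == 1 for n in counts.values())


# Hypothetical toy data: group number -> list of (species, gene id) pairs.
groups = {
    114: [(sp, f"{sp}_gene_a") for sp in sorted(AMNIOTES)],    # one gene per species
    115: [("human", "h1"), ("human", "h2"), ("mouse", "m1")],  # duplication in human
    116: [("human", "h3"), ("mouse", "m2")],                   # four species missing
}

one_to_one_ids = [gid for gid, members in groups.items() if is_simple_one_to_one(members)]
print(one_to_one_ids)
```

In this toy data only group 114 passes the filter, because the other two groups either contain paralogs or lack some of the six species.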