John E Karro, Yangpan Yan, Deyou Zheng, Zhaolei Zhang, Nicholas Carriero, Philip Cayting, Paul Harrison, Mark Gerstein.
Abstract
The Pseudogene.org knowledgebase serves as a comprehensive repository for pseudogene annotation. The definition of a pseudogene varies within the literature, resulting in significantly different approaches to the problem of identification. Consequently, it is difficult to maintain a consistent collection of pseudogenes in the detail necessary for their effective use. Our database is designed to address this issue. It integrates a variety of heterogeneous resources and supports a subset structure that highlights specific groups of pseudogenes that are of interest to the research community. Tools are provided for the comparison of sets and the creation of layered set unions, enabling researchers to derive a current 'consensus' set of pseudogenes. Additional features include versatile search, the capacity for robust interaction with other databases, the ability to reconstruct older versions of the database (accounting for changing genome builds) and an underlying object-oriented interface designed for researchers with a minimal knowledge of programming. At the present time, the database contains more than 100,000 pseudogenes spanning 64 prokaryote and 11 eukaryote genomes, including a collection of human annotations compiled from 16 sources.
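The idea of a layered set union can be pictured with a short Python sketch. It rests on assumptions that the abstract does not state: the input sets are ranked by priority, and an entry from a lower layer is added only when nothing equivalent is already in the consensus. The names `layered_union` and `any_overlap`, and the coordinate tuples, are hypothetical illustrations, not the database's actual interface.

```python
from typing import Callable, List, Sequence, Tuple

# Hypothetical record shape: (chromosome, start, end), coordinates inclusive.
Locus = Tuple[str, int, int]


def any_overlap(a: Locus, b: Locus) -> bool:
    """Placeholder equivalence test: same chromosome and intersecting intervals."""
    return a[0] == b[0] and a[1] <= b[2] and b[1] <= a[2]


def layered_union(
    layers: Sequence[Sequence[Locus]],
    equivalent: Callable[[Locus, Locus], bool],
) -> List[Locus]:
    """Merge annotation sets, highest-priority layer first (assumed semantics).

    An entry is kept only if no equivalent entry has already been kept.
    """
    consensus: List[Locus] = []
    for layer in layers:
        for candidate in layer:
            if not any(equivalent(candidate, kept) for kept in consensus):
                consensus.append(candidate)
    return consensus


# Made-up example data: the second layer repeats one locus and adds another.
layers = [
    [("chr1", 100, 200)],
    [("chr1", 150, 250), ("chr2", 100, 200)],
]
consensus = layered_union(layers, any_overlap)
print(consensus)  # [('chr1', 100, 200), ('chr2', 100, 200)]
```

Passing the equivalence test as a parameter keeps the merge logic independent of how two annotations are judged to describe the same pseudogene.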
Year: 2006 PMID: 17099229 PMCID: PMC1669708 DOI: 10.1093/nar/gkl851
Source DB: PubMed Journal: Nucleic Acids Res ISSN: 0305-1048 Impact factor: 16.971
Contents of the pseudogene database at time of submission (June 2006)
| Genome | Number of pseudogenes |
|---|---|
| Eukaryotes | |
| | 31 768 |
| | 8355 |
| | 15 320 |
| | 10 750 |
| | 2802 |
| | 4179 |
| | 15 779 |
| | 1713 |
| | 484 |
| | 5179 |
| | 3250 |
| Eukaryote total | 99 579 |
| Prokaryotes (sample) | |
| | 37 |
| | 10 |
| | 187 |
| | 134 |
| | 18 |
| | 203 |
| | 11 |
| | 39 |
| | 35 |
| | 172 |
| Prokaryote total (including 54 genomes not shown) | 6890 |
| Database total | 106 469 |
All eukaryotic organisms in the database are displayed; the listing of prokaryotes is limited to 10 of the 64 contained in the database.
Figure 1. A diagram of the Pseudogene.org search page (Eukaryote section), illustrating two ways a user might search for all processed pseudogenes on chromosome 22 that were created by the protein with Ensembl accession number ENSP00000268661. In (a) the user could choose to search all human pseudogenes, resulting in the search page shown in (b), which can then be configured as shown. Alternatively, the user could browse all pre-computed sets as shown in (d) and choose the set corresponding to the Zheng et al. analysis of chromosome 22, resulting in the search page shown in (e). In this case both methods produce the same list, shown in (c); choosing an individual pseudogene displays its specific details, as shown in (f).
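The two search routes in Figure 1 can be mimicked with a small in-memory filter. This is only a sketch: the `Record` fields, the `search` helper, and every record below are hypothetical stand-ins, and only the accession ENSP00000268661 comes from the caption. It shows why filtering all human pseudogenes and filtering a pre-computed chromosome 22 set would be expected to return the same list.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass(frozen=True)
class Record:
    """Hypothetical pseudogene record; field names are assumptions."""
    pseudogene_id: str
    organism: str
    chromosome: str
    kind: str            # e.g. "processed" or "duplicated"
    parent_protein: str  # accession of the parent protein


def search(records: Iterable[Record], **criteria: str) -> List[Record]:
    """Return the records whose fields equal every given criterion."""
    return [
        r for r in records
        if all(getattr(r, field) == value for field, value in criteria.items())
    ]


TARGET = "ENSP00000268661"  # accession taken from the figure caption

# Made-up records for illustration only.
records = [
    Record("PG_A", "human", "22", "processed", TARGET),
    Record("PG_B", "human", "22", "duplicated", TARGET),
    Record("PG_C", "human", "1", "processed", TARGET),
    Record("PG_D", "human", "22", "processed", "ENSP_PLACEHOLDER"),
]

# Route 1: start from all human pseudogenes and apply every filter.
route_a = search(records, organism="human", chromosome="22",
                 kind="processed", parent_protein=TARGET)

# Route 2: start from a stand-in for a pre-computed chromosome 22 set.
chr22_set = search(records, organism="human", chromosome="22")
route_b = search(chr22_set, kind="processed", parent_protein=TARGET)

print([r.pseudogene_id for r in route_a])  # ['PG_A']
print(route_a == route_b)                  # True
```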
Figure 2. Venn diagrams representing the intersections between the sets corresponding to the PseudoPipe pipeline, the Torrents identification method and the Hoppsigen method (not drawn to scale). We define two pseudogenes as equivalent if there is more than a 90% overlap between them.
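The 90% equivalence rule can be sketched in Python as follows. The caption does not say which length the overlap is measured against, so this sketch assumes the fraction is taken relative to the longer of the two annotations, and it ignores strand. The `Pseudogene` tuple, the function names, and the coordinates are hypothetical.

```python
from typing import Iterable, NamedTuple


class Pseudogene(NamedTuple):
    """Hypothetical annotation: chromosome plus inclusive coordinates."""
    chrom: str
    start: int
    end: int


def overlap_fraction(a: Pseudogene, b: Pseudogene) -> float:
    """Shared length divided by the longer annotation's length (assumed)."""
    if a.chrom != b.chrom:
        return 0.0
    shared = min(a.end, b.end) - max(a.start, b.start) + 1
    if shared <= 0:
        return 0.0
    longer = max(a.end - a.start + 1, b.end - b.start + 1)
    return shared / longer


def equivalent(a: Pseudogene, b: Pseudogene, threshold: float = 0.90) -> bool:
    """True when the overlap strictly exceeds the threshold."""
    return overlap_fraction(a, b) > threshold


def shared_count(set_a: Iterable[Pseudogene], set_b: Iterable[Pseudogene]) -> int:
    """Count members of set_a that have at least one equivalent in set_b."""
    others = list(set_b)
    return sum(any(equivalent(a, b) for b in others) for a in set_a)


# Made-up coordinates for illustration only.
p1 = Pseudogene("chr22", 1000, 1999)   # 1000 bases long
p2 = Pseudogene("chr22", 1050, 2009)   # shares 950 bases with p1
p3 = Pseudogene("chr22", 1500, 2499)   # shares 500 bases with p1

print(overlap_fraction(p1, p2))      # 0.95
print(equivalent(p1, p2))            # True
print(equivalent(p1, p3))            # False
print(shared_count([p1], [p2, p3]))  # 1
```

Counts of this kind, computed for each pair of methods, are the sort of numbers a Venn diagram of set intersections would display. Measuring against the shorter annotation instead would make the rule more permissive.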