Endre Barta, Endre Sebestyén, Tamás B Pálfy, Gábor Tóth, Csaba P Ortutay, László Patthy.
Abstract
DoOP (http://doop.abc.hu/) is a database of eukaryotic promoter sequences (upstream regions) aiming to facilitate the recognition of regulatory sites conserved between species. The annotated first exons of human and Arabidopsis thaliana genes were used as queries in BLAST searches to collect the most closely related orthologous first exon sequences from Chordata and Viridiplantae species. Up to 3000 bp DNA segments upstream from these first exons constitute the clusters in the chordate and plant sections of the Database of Orthologous Promoters. Release 1.0 of DoOP contains 21,061 chordate clusters from 284 different species and 7,548 plant clusters from 269 different species. The database can be used to find and retrieve promoter sequences of a given gene from various species and it is also suitable to see the most trivial conserved sequence blocks in the orthologous upstream regions. Users can search DoOP with either sequence or text (annotation) to find promoter clusters of various genes. In addition to the sequence data, the positions of the conserved sequence blocks derived from multiple alignments, the positions of repetitive elements and the positions of transcription start sites known from the Eukaryotic Promoter Database (EPD) can be viewed graphically.
Year: 2005 PMID: 15608291 PMCID: PMC540051 DOI: 10.1093/nar/gki097
Source DB: PubMed Journal: Nucleic Acids Res ISSN: 0305-1048 Impact factor: 16.971
Figure 1. The data flow of the generation of the chordate DoOP database. The same method is used in the case of the plant DoOP database, except the source BLAST database comes from all Viridiplantae sequences and the query sequences are generated based on the NCBI A.thaliana annotation.
Figure 2. Different types of genes according to the positions of the first mRNA and coding (cds) exons. The types 5 and 6 fall into different subcategories based on the number of the first coding exon. If it is the second as in this figure, we call it type 52 or 62, but otherwise we are referring to them generally as 5n or 6n. The positions of the query sequences relative to the first exons are marked with green boxes, while the 500, 1000 and 3000 bp upstream regions that have been put into the database are marked with red boxes.
Figure 3. Examples of the DoOP dataviews. In the picture of the cluster the boxes numbered from m1 to m13 show the conserved motifs, the black box shows a predicted repetitive element, while the blue box shows the 5′-UTR.
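
The abstract and the Figure 2 caption describe taking DNA segments of up to 500, 1000 or 3000 bp upstream of a first exon. The Python sketch below illustrates one way such a segment could be cut from a genomic sequence once the first exon has been located. It is not the DoOP pipeline itself: the function names, the 0-based half-open coordinate convention and the handling of the minus strand are assumptions made for illustration only.

```python
# Hypothetical helpers; names and coordinate convention are assumptions,
# not taken from the DoOP implementation.

_COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def upstream_region(genomic_seq: str, exon_start: int, exon_end: int,
                    strand: str, length: int = 3000) -> str:
    """Return up to `length` bp located upstream of a first exon.

    Coordinates are 0-based, half-open [exon_start, exon_end) on the
    forward strand. The result is given 5'->3' relative to the gene and
    is shorter than `length` if the sequence ends first.
    """
    if strand == "+":
        begin = max(0, exon_start - length)
        return genomic_seq[begin:exon_start]
    if strand == "-":
        # On the minus strand, "upstream" lies to the right of the exon
        # on the forward strand and must be reverse-complemented.
        end = min(len(genomic_seq), exon_end + length)
        return reverse_complement(genomic_seq[exon_end:end])
    raise ValueError("strand must be '+' or '-'")


if __name__ == "__main__":
    toy = "AACCGGTTACGT"  # toy sequence, not real data
    for size in (500, 1000, 3000):  # region sizes named in Figure 2
        print(size, upstream_region(toy, 4, 6, "+", size))
```

Capping the slice at the sequence boundary mirrors the "up to 3000 bp" wording: a first exon close to the start of a contig simply yields a shorter upstream segment.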