Laura Elnitski, Belinda Giardine, Prachi Shah, Yi Zhang, Cathy Riemer, Matthew Weirauch, Richard Burhans, Webb Miller, Ross C Hardison.
Abstract
We describe improvements to two databases that give access to information on genomic sequence similarities, functional elements in DNA and experimental results that demonstrate those functions. GALA, the database of Genome ALignments and Annotations, is now a set of interlinked relational databases for five vertebrate species: human, chimpanzee, mouse, rat and chicken. For each species, GALA records pairwise and multiple sequence alignments, scores derived from those alignments that reflect the likelihood of being under purifying selection or being a regulatory element, and extensive annotations such as genes, gene expression patterns and transcription factor binding sites. The user interface supports simple and complex queries, including operations such as subtraction and intersection as well as clustering and finding elements in proximity to features. dbERGE II, the database of Experimental Results on Gene Expression, contains experimental data from a variety of functional assays. Both databases are now run on the DB2 database management system. Improved hardware and tuning have reduced response times and increased querying capacity, while simplified query interfaces will help direct new users through the querying process. Links are available at http://www.bx.psu.edu/.
Year: 2005 PMID: 15608239 PMCID: PMC539999 DOI: 10.1093/nar/gki045
Source DB: PubMed Journal: Nucleic Acids Res ISSN: 0305-1048 Impact factor: 16.971
Figure 1. Use of GALA to identify a known regulatory element using features of alignments and annotations. The results of four queries on GALA are displayed as custom tracks in the UCSC Genome Browser (1,2), along with genes and the phyloHMM conservation track (21). The distal major regulatory element for the alpha-like globin genes is marked by an arrow. The track ‘phyloHMMcons>=0.4 no exons’ shows all the DNA segments with a phyloHMMcons score of at least 0.4, which is a rather stringent threshold, after subtracting all the segments exceeding the threshold that overlapped an exon. The track ‘conserved GATA-1 binding site’ shows the matches to GATA-1 binding site weight matrices in TRANSFAC (24) that are also conserved in human, mouse and rat. These match the weight matrices in those species as well. The track ‘cluster of consvd GATA-1 binding sites’ shows the conserved GATA-1 binding sites that have another one within 100 bp. The track ‘Hi phyloHMMcons, no exons, cluster consvd GATA-1 binding sites’ shows all the segments from the first data track that include one of the clusters on the third data track.
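The chain of queries in Figure 1 can be read as a sequence of interval operations: threshold, subtract, cluster, then intersect. The Python sketch below illustrates that logic only; it is not GALA's implementation or query interface. The function names, the half-open coordinate convention and all coordinates and scores are hypothetical, and "include one of the clusters" is interpreted here as simple overlap, which is an assumption.

```python
from typing import List, Tuple

# Half-open [start, end) interval on one chromosome (assumed convention).
Interval = Tuple[int, int]


def overlaps(a: Interval, b: Interval) -> bool:
    """True if the two intervals share at least one base."""
    return a[0] < b[1] and b[0] < a[1]


def gap(a: Interval, b: Interval) -> int:
    """Distance in bp between two intervals (0 if they overlap or touch)."""
    return max(0, max(a[0], b[0]) - min(a[1], b[1]))


def subtract_overlapping(segments: List[Interval],
                         exons: List[Interval]) -> List[Interval]:
    """Drop every segment that overlaps an exon (the whole segment is removed)."""
    return [s for s in segments if not any(overlaps(s, e) for e in exons)]


def clustered_sites(sites: List[Interval], max_gap: int = 100) -> List[Interval]:
    """Keep sites that have at least one other site within max_gap bp."""
    return [s for i, s in enumerate(sites)
            if any(j != i and gap(s, t) <= max_gap
                   for j, t in enumerate(sites))]


def segments_with_cluster(segments: List[Interval],
                          clusters: List[Interval]) -> List[Interval]:
    """Keep segments that overlap at least one clustered site."""
    return [s for s in segments if any(overlaps(s, c) for c in clusters)]


# Hypothetical inputs: (start, end, conservation score), exons, binding sites.
scored = [(100, 400, 0.9), (1000, 1300, 0.6), (2000, 2200, 0.45), (3000, 3100, 0.2)]
exons = [(350, 500)]
gata_sites = [(1050, 1060), (1120, 1130), (2100, 2110), (5000, 5010)]

# Track 1: score of at least 0.4, then remove segments overlapping an exon.
conserved = [(start, end) for start, end, score in scored if score >= 0.4]
no_exons = subtract_overlapping(conserved, exons)

# Track 3: conserved binding sites with another site within 100 bp.
clusters = clustered_sites(gata_sites, max_gap=100)

# Track 4: track-1 segments that overlap a clustered site.
candidates = segments_with_cluster(no_exons, clusters)
print(candidates)
```

With these made-up inputs, only the segment at 1000-1300 survives all four steps: it passes the score threshold, overlaps no exon, and contains two binding sites that lie 60 bp apart. The pairwise scans are quadratic and chosen for readability; a sorted sweep or an interval tree would be the usual choice at genome scale.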