Carlos J Madrid-Aliste, Joseph M Dybas, Ruth Hogue Angeletti, Louis M Weiss, Kami Kim, Istvan Simon, Andras Fiser.
Abstract
BACKGROUND: High-throughput proteomics experiments are useful for analyzing the protein expression of an organism, identifying the correct gene structure of a genome, or locating possible post-translational modifications within proteins. High-throughput methods require publicly accessible, easily queried databases that store, display, and analyze the large volume of data efficiently and logically.
DESCRIPTION: EPICDB is a publicly accessible, queryable, relational database that organizes and displays experimental, high-throughput proteomics data for Toxoplasma gondii and Cryptosporidium parvum. Along with detailed information on mass spectrometry experiments, the database provides antibody experimental results and analyses of functional annotations, comparative genomics, and aligned expressed sequence tag (EST) and genomic open reading frame (ORF) sequences. The database contains all available alternative gene datasets for each organism, which together comprise a complete theoretical proteome for that organism, and all data are referenced to these sequences. The database is structured around clusters of protein sequences, which allows for the evaluation of redundancy, protein prediction discrepancies, and possible splice variants. The database can be expanded to include genomes of other organisms for which proteome-wide experimental data are available.
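To make the cluster-centric organization described above more concrete, here is a minimal sketch of how such a relational layout might look, written with Python's standard `sqlite3` module. The record does not describe EPICDB's actual schema, so every table name, column name, identifier, sequence, and peptide below is a hypothetical placeholder chosen only for illustration.

```python
import sqlite3

# Hypothetical schema: these tables and columns are illustrative assumptions,
# not EPICDB's real design.
SCHEMA = """
CREATE TABLE cluster (
    cluster_id INTEGER PRIMARY KEY,
    organism   TEXT NOT NULL
);
CREATE TABLE protein (
    protein_id     TEXT PRIMARY KEY,
    cluster_id     INTEGER NOT NULL REFERENCES cluster(cluster_id),
    source_dataset TEXT NOT NULL,   -- which alternative gene dataset predicted it
    sequence       TEXT NOT NULL
);
CREATE TABLE peptide_hit (
    peptide    TEXT NOT NULL,
    protein_id TEXT NOT NULL REFERENCES protein(protein_id),
    experiment TEXT NOT NULL
);
"""


def build_demo_db():
    """Create an in-memory database holding one made-up cluster."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    conn.execute("INSERT INTO cluster VALUES (1, 'Toxoplasma gondii')")
    # Two competing predictions of the same protein, grouped in one cluster.
    conn.executemany(
        "INSERT INTO protein VALUES (?, ?, ?, ?)",
        [
            ("predA_0001", 1, "gene_set_A", "MKTLLVAGAVKR"),
            ("predB_0042", 1, "gene_set_B", "MKTLLVAGAVKRSSE"),
        ],
    )
    # The same observed peptide is assigned to both predictions.
    conn.executemany(
        "INSERT INTO peptide_hit VALUES (?, ?, ?)",
        [
            ("TLLVAGAVK", "predA_0001", "ms_run_1"),
            ("TLLVAGAVK", "predB_0042", "ms_run_1"),
        ],
    )
    return conn


def cluster_summary(conn, cluster_id):
    """Return (protein_id, source_dataset, peptide_count) for each cluster member."""
    return conn.execute(
        """
        SELECT p.protein_id, p.source_dataset, COUNT(h.peptide)
        FROM protein AS p
        LEFT JOIN peptide_hit AS h ON h.protein_id = p.protein_id
        WHERE p.cluster_id = ?
        GROUP BY p.protein_id, p.source_dataset
        ORDER BY p.protein_id
        """,
        (cluster_id,),
    ).fetchall()
```

Grouping alternative predictions under a shared cluster key is what would let a single query place redundant or conflicting gene models side by side with their experimental support, which is the kind of evaluation the abstract attributes to the cluster-based structure.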
Year: 2009 PMID: 19159464 PMCID: PMC2652494 DOI: 10.1186/1471-2164-10-38
Source DB: PubMed Journal: BMC Genomics ISSN: 1471-2164 Impact factor: 3.969
Figure 1. Screenshot of the EPICDB query front page. This page is where the user chooses the organism to be studied and queries the database for "Experimental Characterizations", "Annotations", "Comparative Genomics", or "Gene predictions and experimental datasets" or searches the database for a specific sequence.
Figure 2. Screenshot of the query results page. The sequence clusters that contain a protein matching the user-defined query are displayed along with a summary of their corresponding experimental and computational data.
Figure 3. Sequence cluster image. The image shows a cluster of protein sequences (red lines) with their assigned mass spectrometry peptides (black boxes on the protein sequences). Below the protein sequences are the aligned ESTs (blue lines with directionality indicated) and ORFs (black lines). A unique identifier is included for each protein, EST, and ORF sequence.
Figure 4. Screenshot of the "MassSpec" data page. The page lists the data from every mass spectrometry experiment in which a peptide was assigned to the respective protein.