Yanling Zhang, Yong Zhang, Jun Adachi, Jesper V Olsen, Rong Shi, Gustavo de Souza, Erica Pasini, Leonard J Foster, Boris Macek, Alexandre Zougman, Chanchal Kumar, Jacek R Wisniewski, Wang Jun, Matthias Mann.
Abstract
Mass spectrometry (MS)-based proteomics has become a powerful technology to map the protein composition of organelles, cell types and tissues. In our department, a large-scale effort to map these proteomes is complemented by the Max-Planck Unified (MAPU) proteome database. MAPU contains several body fluid proteomes, including plasma, urine and cerebrospinal fluid. Cell lines have been mapped to a depth of several thousand proteins, and the red blood cell proteome has also been analyzed in depth. The liver proteome is represented with 3200 proteins. By employing high-resolution MS and stringent validation criteria, false-positive identification rates in MAPU are lower than 1:1000. Thus, MAPU datasets can serve as reference proteomes in biomarker discovery. MAPU contains the peptides identifying each protein, measured masses, scores and intensities, and is freely available at http://www.mapuproteome.com using a clickable interface of cell or body parts. Proteome data can be queried across proteomes by protein name, accession number, sequence similarity, peptide sequence and annotation information. More than 4500 mouse and 2500 human proteins have already been identified in at least one proteome. Basic annotation information and links to other public databases are provided in MAPU, and we plan to add further analysis tools.
Year: 2006 PMID: 17090601 PMCID: PMC1781136 DOI: 10.1093/nar/gkl784
Source DB: PubMed Journal: Nucleic Acids Res ISSN: 0305-1048 Impact factor: 16.971
Figure 1. Workflow for protein identification and validation.
Figure 2. Four search facilities in the MAPU database. (a) Search sections in ORMD. At the top of the section, a ‘List all data’ button runs a list query. The left side holds the advanced search section, which offers several search terms, some of them specific to individual sub-databases. The right side holds the batch search module, which is available only in ORMD. The BLAST search section is at the bottom; the input protein sequence should be in FASTA format, and the E-value is 1e−10. (b) Sub-cellular map of the cell in ORMD. The picture is clickable and shows the name of a sub-cellular location when the mouse moves over it. All selectable sub-cellular locations in ORMD are also listed on the right side, and users can click them directly to go to the protein list report. The same idea will be applied to other sub-databases, such as the body fluid database, in the future.
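The BLAST input requirements in the Figure 2 caption (FASTA-formatted protein sequence, E-value of 1e−10) can be illustrated with a minimal sketch. This is not MAPU's own code: the function names `parse_fasta` and `filter_hits`, the accepted residue alphabet, and the reading of 1e−10 as an upper cutoff on hit E-values are all assumptions made for illustration.

```python
from typing import Iterable, List, Tuple

# Assumption: the caption's "E-value is 1e-10" is treated here as an upper cutoff.
E_VALUE_CUTOFF = 1e-10

# Assumption: standard amino-acid letters plus common ambiguity/gap/stop symbols.
VALID_RESIDUES = set("ACDEFGHIKLMNPQRSTVWYBXZUO*-")


def parse_fasta(text: str) -> List[Tuple[str, str]]:
    """Parse FASTA text into (header, sequence) pairs; raise ValueError if malformed."""
    records: List[Tuple[str, str]] = []
    header = None
    chunks: List[str] = []
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue  # ignore blank lines
        if line.startswith(">"):
            if header is not None:
                records.append((header, "".join(chunks)))
            header = line[1:].strip()
            chunks = []
        else:
            if header is None:
                raise ValueError("sequence data found before the first '>' header")
            seq = line.upper()
            bad = set(seq) - VALID_RESIDUES
            if bad:
                raise ValueError(f"unexpected characters: {sorted(bad)}")
            chunks.append(seq)
    if header is None:
        raise ValueError("no FASTA record found")
    records.append((header, "".join(chunks)))
    for name, seq in records:
        if not seq:
            raise ValueError(f"record '{name}' has no sequence")
    return records


def filter_hits(
    hits: Iterable[Tuple[str, float]], cutoff: float = E_VALUE_CUTOFF
) -> List[Tuple[str, float]]:
    """Keep (hit name, E-value) pairs whose E-value does not exceed the cutoff."""
    return [(name, e) for name, e in hits if e <= cutoff]
```

A query would first pass through `parse_fasta` to reject malformed input, and the hits returned by whatever BLAST tool is used would then be narrowed with `filter_hits`; a lower cutoff keeps fewer, more significant matches.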
Figure 3. Report pages in the MAPU database. Protein list report page (in ORMD, Seminal fluid Database and Red Blood Cell Database). All proteins in our proteome database are hyperlinked in the BLAST result page. Users can navigate to the protein report in the relevant sub-database using these hyperlinks.
Figure 4. Data workflow in the MAPU database. To generate the branch databases more easily and flexibly, we developed dataset parsers and database generators as helper tools. The workflow runs from the original data to the data in the database, passing through several functional modules provided by our parser and generator tools and common templates.