Yunjia Chen, Yong Zhang, Yanbin Yin, Ge Gao, Songgang Li, Ying Jiang, Xiaocheng Gu, Jingchu Luo.
Abstract
Using an improved secreted protein prediction approach and comprehensive data sources, including Swiss-Prot, TrEMBL, RefSeq, Ensembl and CBI-Gene, we have constructed the secretomes of human, mouse and rat, comprising a total of 18 152 secreted proteins. All entries are ranked according to prediction confidence. They were further annotated via a proteome annotation pipeline that we developed. We also set up a secreted protein classification pipeline and classified the predicted secreted proteins into functional categories. To make the dataset more reliable and comprehensive, nine reference datasets were also integrated, such as the secreted proteins from the Gene Ontology Annotation (GOA) system at the European Bioinformatics Institute and the vertebrate secreted proteins from Swiss-Prot. All of these entries were grouped via a TribeMCL-based clustering pipeline. We have constructed a web-based secreted protein database (SPD), which is publicly available at http://spd.cbi.pku.edu.cn. Users can browse the database via a GO-assignment-based or chromosomal-location-based interface. Text query and sequence similarity search are also provided, and the sequence and annotation data can be downloaded freely from the SPD website.
Year: 2005 PMID: 15608170 PMCID: PMC540047 DOI: 10.1093/nar/gki093
Source DB: PubMed Journal: Nucleic Acids Res ISSN: 0305-1048 Impact factor: 16.971
Figure 1. The pipeline we used to collect secreted proteins. Sequence data were downloaded by December 2003. Non-redundant: identical sequences were excluded via pairwise BLASTP. Sec-HMMER (CJ-SPHMM/TMHMM): the tool used to predict signal peptides and transmembrane regions. Set1 and Set2: proteins predicted by PSORT and Sec-HMMER, respectively; Set3: proteins matched by Sec-HMMER only. Rank0: known secreted proteins in Swiss-Prot; Rank1: predicted by both PSORT and Sec-HMMER; Rank2: predicted by either PSORT or Sec-HMMER; and Rank3: predicted by Sec-HMMER only and with a signal peptide >70 amino acids.
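The rank definitions in the Figure 1 caption can be read as a small decision rule. The following is a minimal sketch of one such reading, not the authors' implementation: the `Prediction` record, its field names and the `assign_rank` function are hypothetical, and the order in which the rules are checked (in particular, testing the Rank3 condition before Rank2, since the two overlap as worded) is an assumption.

```python
from dataclasses import dataclass
from typing import Optional

# Threshold from the caption: Rank3 requires a signal peptide >70 amino acids.
LONG_SIGNAL_PEPTIDE = 70


@dataclass
class Prediction:
    """Hypothetical per-protein summary of the evidence named in the caption."""
    known_secreted_in_swissprot: bool
    psort_secreted: bool
    sec_hmmer_secreted: bool
    signal_peptide_length: int = 0  # in amino acids


def assign_rank(p: Prediction) -> Optional[int]:
    """Return 0-3 following the caption, or None if no rule applies.

    The precedence of the checks is an assumption, not stated in the source.
    """
    if p.known_secreted_in_swissprot:
        return 0  # Rank0: known secreted protein in Swiss-Prot
    if p.psort_secreted and p.sec_hmmer_secreted:
        return 1  # Rank1: predicted by both tools
    if (p.sec_hmmer_secreted and not p.psort_secreted
            and p.signal_peptide_length > LONG_SIGNAL_PEPTIDE):
        return 3  # Rank3: Sec-HMMER only, with a long signal peptide
    if p.psort_secreted or p.sec_hmmer_secreted:
        return 2  # Rank2: predicted by exactly one tool
    return None   # not predicted as secreted by either tool
```

Under this reading, a protein predicted by Sec-HMMER alone with an 85-residue signal peptide would receive Rank3, while the same protein with a 25-residue signal peptide would receive Rank2.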
Figure 2. Screen snapshots of the SPD database, only partially drawn (A and B) for clarity. (A) An example page of the chromosomal location of human protease secreted proteins. The clickable ‘+’ and ‘−’ symbols denote the plus and minus strands, as indicated in the browser at the SPD website. (B) An example of an SPD core dataset entry (TAFA 3.2) showing four divisions of the entry format and various fields with links to detailed information and the original data source. (C) An example of an SPD reference dataset entry (AY359017) showing two divisions with general information.