Sys-BodyFluid: a systematical database for human body fluid proteome research.

Su-Jun Li, Mao Peng, Hong Li, Bo-Shu Liu, Chuan Wang, Jia-Rui Wu, Yi-Xue Li, Rong Zeng.

Abstract

Body fluids have recently become an important target for proteomic research, and proteomic studies have produced a growing volume of body fluid protein data. A database is needed to collect and analyze these proteome data, so we developed Sys-BodyFluid, a web-based body fluid proteome database. It covers eleven kinds of body fluid proteomes: plasma/serum, urine, cerebrospinal fluid, saliva, bronchoalveolar lavage fluid, synovial fluid, nipple aspirate fluid, tear fluid, seminal fluid, human milk and amniotic fluid. Sys-BodyFluid contains over 10,000 proteins. It provides detailed protein annotations, including protein description, Gene Ontology, domain information, protein sequence and involved pathways. The proteome data can be retrieved by protein name, protein accession number or sequence similarity. In addition, users can query across the different body fluids to compare protein identification information. Sys-BodyFluid can serve as a reference database for body fluid proteomics and disease proteomics research. It is available at http://www.biosino.org/bodyfluid/.

Year:  2008        PMID: 18978022      PMCID: PMC2686600          DOI: 10.1093/nar/gkn849

Source DB:  PubMed          Journal:  Nucleic Acids Res        ISSN: 0305-1048            Impact factor:   16.971


INTRODUCTION

In the post-genome era, proteomic technology has rapidly developed into a powerful platform for research on human physiology. It can be applied to identify potential novel biomarkers for prognosis, diagnosis and therapy (1,2). In recent years, body fluids have become one of the important targets of proteomics research (3). Body fluids include plasma/serum, urine, cerebrospinal fluid, saliva, bronchoalveolar lavage fluid, synovial fluid, nipple aspirate fluid, tear fluid, amniotic fluid and others. Analysis of the protein composition of body fluids can improve our understanding of human disease proteomics. Hu et al. (3) reviewed advances in body fluid proteome analysis, focusing on applications to human disease biomarker discovery. The importance of body fluids has also been recognized in recent proteomics work (4). The database ‘MAPU: Max-Planck Unified database of organellar, cellular, tissue and body fluid’ (5), published in 2007, reflects the close attention that proteome researchers pay to body fluids. MAPU stores data from the authors' own laboratory and covers several kinds of body fluids, such as urine and tear fluid. We constructed Sys-BodyFluid to collect more curated proteomics data from the body fluid literature, to provide comprehensive protein annotation, and to explore the relationships between different body fluids. Abundant proteomics data and in-depth protein annotation make Sys-BodyFluid a reference database for body fluid and clinical proteomics research.

DATABASE CONSTRUCTION

The Sys-BodyFluid database was implemented in the MySQL relational database (http://www.mysql.com). The web graphical user interface was built with JavaServer Pages technology (http://java.sun.com/products/jsp/). The manually curated body fluid protein data were imported into the MySQL database by a Java program. Protein annotation data were downloaded from the International Protein Index (IPI) database, Gene Ontology (6), the GOA database (7) and the KEGG (8) pathway database. JFreeChart (http://www.jfree.org/jfreechart/), an open source Java library distributed under the LGPL, was used to plot the statistics shown on the website.
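The text does not describe the actual table layout, so the following is only a minimal sketch of how a relational store linking proteins, body fluids and papers might be organized. It uses Python's built-in sqlite3 module in place of MySQL, and every table name, column name and demo record is hypothetical.

```python
import sqlite3

# Hypothetical schema: names are illustrative and NOT taken from Sys-BodyFluid.
SCHEMA = """
CREATE TABLE protein (
    ipi_id      TEXT PRIMARY KEY,
    description TEXT,
    sequence    TEXT
);
CREATE TABLE body_fluid (
    fluid_id INTEGER PRIMARY KEY,
    name     TEXT UNIQUE NOT NULL
);
CREATE TABLE paper (
    paper_id INTEGER PRIMARY KEY,
    title    TEXT
);
-- One row per (protein, body fluid, paper) identification.
CREATE TABLE identification (
    ipi_id   TEXT    NOT NULL REFERENCES protein(ipi_id),
    fluid_id INTEGER NOT NULL REFERENCES body_fluid(fluid_id),
    paper_id INTEGER NOT NULL REFERENCES paper(paper_id),
    PRIMARY KEY (ipi_id, fluid_id, paper_id)
);
"""


def build_demo_db():
    """Create an in-memory database filled with made-up demo records."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    conn.executemany(
        "INSERT INTO protein VALUES (?, ?, ?)",
        [("IPI_DEMO_1", "demo protein A", "MKT"),
         ("IPI_DEMO_2", "demo protein B", "MSA")],
    )
    conn.executemany(
        "INSERT INTO body_fluid VALUES (?, ?)",
        [(1, "urine"), (2, "saliva")],
    )
    conn.execute("INSERT INTO paper VALUES (?, ?)", (1, "demo paper"))
    conn.executemany(
        "INSERT INTO identification VALUES (?, ?, ?)",
        [("IPI_DEMO_1", 1, 1), ("IPI_DEMO_1", 2, 1), ("IPI_DEMO_2", 1, 1)],
    )
    return conn


def fluids_for_protein(conn, ipi_id):
    """Return the sorted names of the body fluids in which a protein was reported."""
    rows = conn.execute(
        "SELECT DISTINCT f.name FROM identification i "
        "JOIN body_fluid f ON f.fluid_id = i.fluid_id "
        "WHERE i.ipi_id = ? ORDER BY f.name",
        (ipi_id,),
    ).fetchall()
    return [row[0] for row in rows]
```

Keeping identifications in a separate link table is one way to let a single protein record be shared by several body fluids and papers, which is what the cross-fluid queries described later would rely on.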

DATA SOURCE AND DATABASE CONTENTS

We searched PubMed and manually curated 50 related peer-reviewed publications published online before May 2008. The primary sequences of the proteins were retrieved from the corresponding databases using the original IDs given in these publications. Because of database updates, protein sequences reported in the literature may have been changed or deleted in the current databases. These protein sequences were therefore manually validated before being imported into the database. To unify protein IDs in Sys-BodyFluid, each protein was mapped to the IPI database by a BLAST search of its sequence against Human IPI Version 3.44 (E-value cutoff 10^-8, BLAST HSP coverage >0.9). Thus, each protein has a corresponding IPI ID in the Sys-BodyFluid database. The numbers of unique proteins and papers for the 11 kinds of body fluids in our database are summarized in Table 1. For example, our database holds 13 papers and 7748 proteins from plasma/serum research. Users can obtain these statistics through the ‘DATABASE’ link on the website http://www.biosino.org/bodyfluid.
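The ID-mapping filter can be sketched as follows. This is an illustration under stated assumptions, not the authors' pipeline: it assumes BLAST results in the standard 12-column tabular format, takes "HSP coverage" to mean the fraction of the query sequence spanned by the alignment, and keeps the highest-scoring hit per query. The function name and the `query_lengths` argument are hypothetical.

```python
E_VALUE_CUTOFF = 1e-8  # from the text
MIN_COVERAGE = 0.9     # from the text; the comparison is strict (> 0.9)


def best_ipi_hits(tabular_lines, query_lengths):
    """Map each query ID to its best-scoring subject ID that passes both thresholds.

    tabular_lines: iterable of tab-separated BLAST lines with the columns
        qseqid sseqid pident length mismatch gapopen qstart qend sstart send evalue bitscore
    query_lengths: dict of query ID -> sequence length (assumed to be known separately)
    """
    best = {}
    for line in tabular_lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and comment lines
        fields = line.rstrip("\n").split("\t")
        query, subject = fields[0], fields[1]
        qstart, qend = int(fields[6]), int(fields[7])
        evalue, bitscore = float(fields[10]), float(fields[11])
        # Assumed definition: share of the query covered by this HSP.
        coverage = (abs(qend - qstart) + 1) / query_lengths[query]
        if evalue > E_VALUE_CUTOFF or coverage <= MIN_COVERAGE:
            continue
        if query not in best or bitscore > best[query][1]:
            best[query] = (subject, bitscore)
    return {query: subject for query, (subject, _) in best.items()}
```

A query with no hit that passes both thresholds is simply absent from the result, which would flag it for manual inspection.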
Table 1.

The data summary in Sys-BodyFluid database

Body fluid name (references)            Protein number   Paper number
Plasma/Serum (11–23)                    7748             13
Saliva (24–31)                          2161             8
Urine (32–40)                           1941             9
Cerebrospinal fluid (41–46)             1286             6
Seminal fluid (47,48)                   916              2
Amniotic fluid (49–51)                  899              3
Tear (52,53)                            509              2
Bronchoalveolar lavage fluid (54,55)    411              2
Milk (56,57)                            175              2
Synovial fluid (58)                     114              1
Nipple aspiration fluid (59,60)         84               2
Total                                   10,138           50

DATA AVAILABILITY

Sys-BodyFluid is accessed through a graphical web interface (http://www.biosino.org/bodyfluid/), and the data are available for download as a text file through the ‘DOWNLOAD’ link on the website. Users can select the body fluid data they wish to download.

DATABASE UTILITY

Through the DATABASE link, Sys-BodyFluid reports current statistics for each body fluid, namely the number of papers and the number of unique proteins. As shown in Figure 1, Sys-BodyFluid offers a search function that supports searching by protein ID, name and sequence similarity (SEARCH link, Figure 1A). The browse option allows users to run comparison analyses between two or more body fluids (Browse link, Figure 1B). For each protein in Sys-BodyFluid, we provide detailed annotation, including protein description, involved body fluids, paper information, domain, Gene Ontology, pathway, sequence and so on (Figure 1C). Users can choose the body fluid they are interested in to browse or download, and a web page describing each body fluid gives specific information about it. Furthermore, pathway analysis helps users investigate the differences between body fluids in terms of the metabolism and signal transduction pathways involved (Pathway link, Figure 1D). Proteins in our database are colored red on the pathway view. The number of body fluids and the number of papers in which each protein is reported are also shown on the web page.
Figure 1.

The web graphical user interface of the Sys-BodyFluid database. (A) Search page and options. Users can search for proteins by protein ID, protein name and sequence similarity. (B) Browse page. Users can browse proteins by body fluid and by paper of interest. Proteins present in two body fluids can be viewed, and multiple body fluids can be compared. (C) Protein annotation page. The database holds detailed information for each protein, including description, domain, Gene Ontology term, sequence and so on. (D) Pathway page. Proteins from the different body fluids are colored red on the pathways they are involved in. The number of body fluids and the number of papers are also shown on the web page.

RESULTS AND DISCUSSION

To gain a more comprehensive understanding of the relationships between body fluids, we compared the protein composition of the different body fluids. The result is shown in Figure 2A. There are 2928 proteins present in at least two body fluids and 1359 proteins present in at least three. Only 15 proteins are found in all 11 body fluids. For the 2928 proteins, GO annotation was obtained and enrichment analysis was performed using BiNGO (9) and Cytoscape (10). Each node in Figure 2B represents a GO term. Node size is scaled by protein number and node color shows the P-value of the enrichment analysis. Edges denote parent–child relationships between nodes. This analysis shows that molecular functions such as ‘protein binding’ and ‘enzyme regulator activity’ are over-represented in this dataset, as are biological processes such as ‘transport’ and ‘secretion’. The cellular component ‘extracellular region’ is significantly enriched.
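The two calculations behind this comparison can be illustrated with a short sketch. The first function counts how many proteins occur in at least k body fluids. The second computes a one-sided hypergeometric P-value, a common choice for over-representation analysis; the test settings actually used in BiNGO are not stated in the text, so this choice is an assumption, and no multiple-testing correction is included. Function names and inputs are hypothetical.

```python
from collections import Counter
from math import comb


def count_shared(fluid_to_proteins):
    """Return {k: number of proteins found in at least k body fluids}."""
    fluids_per_protein = Counter()
    for proteins in fluid_to_proteins.values():
        fluids_per_protein.update(set(proteins))  # count each fluid once per protein
    n_fluids = len(fluid_to_proteins)
    return {
        k: sum(1 for count in fluids_per_protein.values() if count >= k)
        for k in range(1, n_fluids + 1)
    }


def hypergeom_pvalue(k, n, K, N):
    """P(X >= k) for a sample of n proteins drawn from N, of which K carry the GO term.

    k: proteins in the sample annotated with the term
    n: sample size (for example, the shared proteins)
    K: proteins in the background annotated with the term
    N: background size
    """
    upper = min(n, K)
    tail = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return tail / comb(N, n)
```

Applied to a mapping from each of the 11 body fluids to its set of IPI IDs, `count_shared` would yield the "at least two", "at least three" and "all 11" counts of the kind reported above.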
Figure 2.

(A) Data comparison between different body fluids. There are 2928 proteins present in at least two body fluids and 1359 proteins present in at least three. Only 15 proteins are found in all 11 body fluids. (B) Statistical analysis of Gene Ontology annotation for the 2928 proteins present in at least two body fluids.

Human body fluid proteome analysis remains a challenge because of the dynamic range and complexity of body fluid protein composition. It is important to construct a body fluid reference database dedicated to biomarker discovery research. Previous work such as MAPU is a major effort that integrates data from the authors' own laboratory and aims to provide a ‘gold standard’ reference proteome database. It is still necessary, however, to draw on proteomic data from the wider literature. For this reason, Sys-BodyFluid was built as a database complementary to MAPU, with the aim of giving users more information about body fluids together with abundant protein annotation. Our database also focuses on the relationships between different body fluids. Users can access the database at http://www.biosino.org/bodyfluid.

PERSPECTIVES

Because body fluid proteome data are being produced at an increasing rate, we plan to update the Sys-BodyFluid database every 6 months. New body fluid proteome data produced in each interval will be added to the database. More annotation, such as protein interaction data, will also be included. In the future, we will collect more body fluid proteome data from disease proteomics research, for example cancer and diabetes proteome data. If possible, tissue proteomics data will also be included so that the crosstalk between tissue proteins and body fluid proteins can be examined.

FUNDING

Basic Research Foundation (2006CB910700); CAS Project (KSCX2-YW-R-106, KSCX2-YW-R-112, KGCX1-YW-13); High-technology Project (2007AA02Z334). Funding for open access charge: CAS project KSCX2-YW-R-106.

Conflict of interest statement. None declared.