Plasma Proteome Database as a resource for proteomics research: 2014 update.

Vishalakshi Nanjappa, Joji Kurian Thomas, Arivusudar Marimuthu, Babylakshmi Muthusamy, Aneesha Radhakrishnan, Rakesh Sharma, Aafaque Ahmad Khan, Lavanya Balakrishnan, Nandini A Sahasrabuddhe, Satwant Kumar, Binit Nitinbhai Jhaveri, Kaushal Vinaykumar Sheth, Ramesh Kumar Khatana, Patrick G Shaw, Srinivas Manda Srikanth, Premendu P Mathur, Subramanian Shankar, Dindagur Nagaraja, Rita Christopher, Suresh Mathivanan, Rajesh Raju, Ravi Sirdeshmukh, Aditi Chatterjee, Richard J Simpson, H C Harsha, Akhilesh Pandey, T S Keshava Prasad.

Abstract

Plasma Proteome Database (PPD; http://www.plasmaproteomedatabase.org/) was initially described in 2005 as part of the Human Proteome Organization's (HUPO's) pilot initiative on the Human Plasma Proteome Project. Since then, improvements in proteomic technologies and increased throughput have led to the identification of a large number of novel plasma proteins. To keep up with this increase in data, we have significantly enriched the proteomic information in PPD. The database currently contains information on 10,546 proteins detected in serum/plasma, of which 3784 have been reported in two or more studies. The latest version of the database also incorporates mass spectrometry-derived data, including experimentally verified proteotypic peptides used for multiple reaction monitoring assays. Other novel features include published plasma/serum concentrations for 1278 proteins and a separate category of plasma-derived extracellular vesicle proteins. As plasma proteins have become a major thrust in the field of biomarkers, we have enabled a batch-based query tool, designated Plasma Proteome Explorer, which allows users to screen a list of proteins or peptides against known plasma proteins to assess the novelty of their data set. We believe that PPD will facilitate both clinical and basic research by serving as a comprehensive reference of plasma proteins in humans and will accelerate biomarker discovery and translation efforts.

Year:  2013        PMID: 24304897      PMCID: PMC3965042          DOI: 10.1093/nar/gkt1251

Source DB:  PubMed          Journal:  Nucleic Acids Res        ISSN: 0305-1048            Impact factor:   16.971


INTRODUCTION

The plasma proteome represents an important subproteome, as it harbors proteins secreted from almost all tissues (1,2). In addition to classical blood proteins, plasma contains proteins secreted by various cells, glands and tissues along with proteins derived from commensal and infectious organisms and parasites residing inside the body (1). The plasma proteome comprises 22 highly abundant proteins including albumin, immunoglobulins, transferrin and haptoglobin, which make up 99% of total protein abundance in plasma. The remaining fraction is composed of proteins of much lower abundance including proteolytically cleaved protein fragments (3). This wide dynamic range of protein abundance, greater than 10 orders of magnitude, makes the plasma proteome challenging to analyze. Plasma proteins also undergo various types of post-translational modifications such as glycosylation, which add to its complexity (4,5). An additional level of heterogeneity is brought about by inter- and intra-individual variation due to age, gender and genetic factors. All these factors make plasma the most complex and diverse body fluid, and its analysis poses a challenge for the proteomic community. Although plasma is not straightforward to analyze, it is the most investigated body fluid in clinical diagnostics. Components of plasma including circulating tumor cells, cell-free RNA and DNA, metabolites, electrolytes and proteins are all considered as molecular markers for early detection of diseases, disease monitoring and prognosis (6–9). When compared with other body fluids such as cerebrospinal fluid, gastric juice, bile and synovial fluid, plasma is more readily accessible and requires a simple collection procedure (1,10,11). Thus, detection and quantitation of endogenous or foreign antigens, or of antibodies directed against these antigens, in plasma could be used to determine the physiological and pathological states of an individual (4).
Several plasma or serum proteins have already been identified as potential biomarkers of diseases including cardiovascular diseases, autoimmune diseases, infectious diseases and neurological disorders (12–19). Owing to the importance of plasma proteins, several proteomic efforts have been carried out to explore human plasma proteins. In 2002, Anderson and Anderson compiled immunoassay- and 2D gel electrophoresis-based investigations of the plasma proteome and reported the presence of 289 proteins (1). Adkins et al. used two different separation techniques followed by mass spectrometry (MS) for characterization of proteins from depleted serum and reported 490 proteins (20). The Human Proteome Organization (HUPO) initiated the pilot phase of a major community initiative, the Human Plasma Proteome Project (HPPP), in 2002 to determine the human plasma or serum protein constituents (21). With the involvement of 35 laboratories across the globe, this led to the identification of 3020 plasma proteins (22). As a part of this initiative, we developed a web-based resource called the Plasma Proteome Database (PPD) (23). The recent incorporation of depletion strategies to remove high-abundance proteins, together with multiple fractionation strategies coupled to LC-MS/MS approaches and high-resolution MS, has resulted in a substantial increase in the number of proteins identified from plasma. For example, a study by Liu et al. coupled two different fractionation methods to MS and identified 9087 proteins in plasma, which is the largest data set on plasma proteins reported thus far (24). Farrah et al. analyzed raw MS/MS data from plasma/serum submitted to the Proteomics Identifications Database (PRIDE) and Human Plasma PeptideAtlas using the Trans-Proteomic Pipeline and reported a set of 1929 high-confidence proteins (25–27). We systematically documented these newly described human plasma proteins and made them available to the biomedical community through the updated version of PPD.

RESULTS

Documentation of plasma proteins in PPD

The plasma proteome occupies an important niche at the interface of proteomics, diagnostics and medicine. To enable sharing of data across laboratories involved in HPPP, we developed the PPD as a web-based resource for plasma proteins (23). To keep up with the exponential increase in plasma proteome data published in recent times, we systematically documented newly described human plasma proteins and curated information pertaining to them in PPD. As a first step, we carried out a PubMed-based literature search for groups that have extensively contributed to plasma or serum proteomics. Additional PubMed searches were carried out to fetch scientific articles by using keywords pertaining to plasma proteins. The information from articles was annotated in PPD by experienced researchers at the Institute of Bioinformatics (IOB) along with clinical investigators from collaborating clinical centers in India. Currently, PPD contains 10 546 proteins linked to 509 scientific articles. Because the experimental data were obtained from diverse experimental platforms and published as peer-reviewed research articles, we did not attempt any post hoc determination of false positives in the database. Instead, to increase the confidence of identification of proteins in plasma or serum, we have included the number of articles reporting the detection of each protein in plasma or serum. This is listed as ‘Total number of studies’ under ‘Experimental evidence’ in the molecule page of the respective protein, as shown in Figure 1. In this database, the detection of 12 proteins in plasma is supported by >50 publications and that of 167 proteins is supported by >10 publications. Of the remaining 10 367 proteins, 2199 are supported by two publications, whereas 6762 proteins are supported by a single publication. A histogram depicting the distribution of proteins with the corresponding number of publications is shown in Figure 2.
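The ‘Total number of studies’ value described above is, in essence, a count of distinct articles per protein. The following is a minimal sketch of how such a tally and the resulting distribution could be computed; it is not the PPD implementation, and the function names, gene symbols and PubMed identifiers are hypothetical placeholders chosen for illustration.

```python
from collections import Counter, defaultdict


def count_supporting_studies(annotations):
    """Map each protein to the number of distinct articles reporting it."""
    articles_by_protein = defaultdict(set)
    for protein, pubmed_id in annotations:
        # A set ensures the same article is counted once per protein.
        articles_by_protein[protein].add(pubmed_id)
    return {protein: len(ids) for protein, ids in articles_by_protein.items()}


def support_histogram(study_counts):
    """Count how many proteins are backed by 1, 2, 3, ... studies."""
    return dict(Counter(study_counts.values()))


# Hypothetical toy annotations: (gene symbol, placeholder PubMed ID) pairs.
toy_annotations = [
    ("HP", "111"), ("HP", "222"), ("HP", "222"),  # duplicate pair counted once
    ("ALB", "111"), ("ALB", "222"), ("ALB", "333"),
    ("TF", "333"),
]

study_counts = count_supporting_studies(toy_annotations)
histogram = support_histogram(study_counts)
print(study_counts)
print(histogram)
```

Counting distinct articles rather than raw annotation rows keeps a protein that is annotated several times from the same paper from appearing better supported than it is.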
In the current update, we have included peptide data for proteins identified from MS-based studies, which is one of the newly added features of PPD. A comparison of previous and current versions of PPD provided in Table 1 shows a 3-fold increase of protein information in this 2014 update. From our analysis, we observed that 4668 and 1972 proteins were unique to PPD when compared with data from other publicly available data resources on plasma or serum proteins—the Sys-BodyFluid and Healthy Human Individual's Integrated Plasma Proteome (HIP2), respectively (28,29). The data present in the Sys-BodyFluid database were annotated from 15 articles, whereas the HIP2 database contained plasma proteome data cataloged from 4 articles. In contrast to other databases that provide plasma/serum protein catalogs generated from a limited number of studies, PPD includes data annotated from 509 peer-reviewed publications.
Figure 1.

Data analysis workflow using Plasma Proteome Explorer and a screenshot of a ‘Molecule Page’ for Haptoglobin in PPD. When queried using UniProt IDs (as shown in A), the plasma proteome explorer displays two different types of results (shown in B). These correspond to (i) proteins present in PPD, which are hyperlinked to their corresponding molecule pages and (ii) proteins not present in PPD, which are linked to an external database, UniProt. Clicking the molecule leads the user to the respective molecule page (shown in C). The graphical representation shows domains and motifs found in the protein. The molecule page also displays the alternate localization of the protein and its associated biological process and molecular function. In addition, the plasma concentration reported in healthy individuals, along with corresponding PubMed identifiers and any other information regarding presence in plasma EVs, is displayed. Further, MRM data are provided with information on proteotypic peptides, peptide m/z values, charge, collision energy, transitions determined, type of fragment ions and the mass spectrometer used, along with a link to the corresponding publication (shown in D).

Figure 2.

Distribution of proteins annotated in PPD according to number of corresponding publications. The histogram shows that a high number of proteins (8793 proteins) in plasma have been reported only in a single study. Of the 509 articles annotated, 426 were found to support the presence of ≤10 proteins in plasma or serum.

Table 1.

A comparison of data in the current version of PPD with the initial database in 2005

| Data Features | PPD-2005 | PPD-2014 | Number of articles annotated in the current update |
|---|---|---|---|
| Total number of proteins | 3778 | 10 546 | 509 |
| Total number of proteins with plasma levels | Not annotated | 1278 | 276 |
| Total number of proteins derived from plasma extracellular vesicles | Not annotated | 318 | 9 |
| Total number of proteins with MRM data | Not included | 279 | 24 |

In addition to documentation of plasma proteins and peptide information, for each protein in PPD, we have provided external links to Human Protein Reference Database (HPRD) (30), NetPath (31), Entrez Gene (32) and UniProt (33) (Figure 1).
For the benefit of users, a collection of important published research articles and reviews describing high-throughput investigations on plasma/serum proteome has been provided. We are seeking participation of the scientific community by inviting investigators to submit relevant articles for inclusion in PPD and have created a new portal in PPD to submit such articles. To facilitate engagement with the community, there is a ‘feedback’ option, which allows users to submit their comments or suggestions. Finally, we have provided an option to download PPD data in XML and Microsoft Excel formats (http://www.plasmaproteomedatabase.org/download).

Plasma concentration of proteins

An important step in plasma biomarker discovery is the relative quantification of proteins in healthy and disease conditions. Determination of relative and absolute concentrations of plasma proteins is helpful for monitoring disease predisposition, onset, diagnosis and progression. Several studies have been carried out in this direction. An early effort was reported by Anderson and Anderson in 2002, where they cataloged plasma concentration ranges for 70 proteins from published literature (1). Subsequently, in 2005 and 2008, their group documented plasma levels for 177 and 211 proteins, respectively, which were reported to be associated with cardiovascular disease, stroke and cancer based on literature evidence (34,35). Recently, Farrah et al. provided an estimate of plasma levels for 1200 proteins based on spectral counting (26). We have incorporated plasma protein levels in this update of PPD and documented plasma protein concentrations for 1278 proteins from 276 articles. This information should be useful to clinicians and researchers alike. To document plasma protein levels, we searched PubMed using Gene symbols OR Protein name OR alternate name(s) of the protein AND Plasma level* OR Serum level* as queries. We engaged four medical students (S.K., B.N.J., K.V.S. and R.K.K., who are also co-authors) to document the plasma protein levels in normal individuals from the literature. The plasma protein concentration data in PPD can be browsed as shown in Figure 1, along with the method used for the measurement in each case. We also developed a web-based submission portal (http://www.plasmaproteomedatabase.org/protein_concentration) for submission of data pertaining to protein concentration in plasma/serum by the clinical/analytical community. Researchers can submit the information obtained from their studies through this portal, and the information will be processed in PPD along with its source.
The ‘Plasma protein concentration data’ file contains a list of plasma proteins sorted based on their concentration in plasma/serum. This file can be downloaded at http://www.plasmaproteomedatabase.org/download.
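The literature search described in this section follows a simple template: protein names joined by OR, combined by AND with the plasma/serum level terms. A minimal sketch of building such a query string is given below. The function name, the quoting of names and the explicit parentheses are assumptions made for illustration; the source gives only the template and does not specify how terms were grouped.

```python
def build_pubmed_query(gene_symbol, protein_name, alternate_names=()):
    """Build a query string for plasma/serum levels of one protein.

    The parenthesized grouping and the quoting are assumptions of this
    sketch, not details stated in the article.
    """
    names = [gene_symbol, protein_name, *alternate_names]
    # Quote each name so multi-word names stay together as phrases.
    name_clause = " OR ".join(f'"{name}"' for name in names if name)
    level_clause = "plasma level* OR serum level*"
    return f"({name_clause}) AND ({level_clause})"


query = build_pubmed_query("HP", "haptoglobin")
print(query)
```

Grouping the name terms before applying AND avoids relying on the search engine's default operator precedence, which is why the sketch adds parentheses.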

Incorporation of data from multiple reaction monitoring experiments

Traditionally, enzyme-linked immunosorbent assay and western blotting have been used for protein identification and quantitation in plasma. The success of these conventional methods relies heavily on the availability of specific antibodies against each protein. These methods also provide limited throughput, as each assay can monitor one protein at a time. Recent advances in MS have enabled simultaneous identification and quantitation of hundreds of proteins in plasma in a single experiment (36–38). Targeted approaches like multiple reaction monitoring (MRM) have enabled quantitation of plasma proteins with high specificity as well as increased throughput (36,38). MRM assays use multiple parameters to identify and quantify specific protein/s in complex samples. Proteotypic peptides that uniquely represent a protein are first selected, and assays are developed, which take into account the retention time of the peptides, their m/z values and measurement of specific fragment ions (39). This multi-step process provides superior specificity and allows for accurate quantification of protein levels over a wide dynamic range of abundance (36,40). MRM analysis has thus revolutionized biomarker research and is being widely used to determine altered protein levels across various disease phenotypes. A compilation of proteotypic peptides of plasma proteins that generate good signals in MS experiments along with their transitions will be immensely beneficial in developing MRM assays. In PPD, we have annotated 591 reported peptides corresponding to 279 proteins for MRM analysis. This includes details of precursor m/z, fragment ion m/z, collision energy, charge of the precursor ion and instrument used for the analysis (Figure 1).
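The MRM fields listed above map naturally onto a small record type. The sketch below shows one possible representation together with the standard relation between a peptide's neutral monoisotopic mass M, its charge z and the precursor m/z, namely m/z = (M + z × proton mass) / z. The class name, field names and all example values are hypothetical placeholders, not entries taken from PPD.

```python
from dataclasses import dataclass

PROTON_MASS = 1.007276466  # mass of a proton in daltons


@dataclass(frozen=True)
class MrmTransition:
    """One precursor-to-fragment transition for a proteotypic peptide."""
    peptide: str
    precursor_mz: float
    precursor_charge: int
    fragment_mz: float
    fragment_ion: str        # e.g. "y5" or "b3"
    collision_energy: float  # instrument-specific units
    instrument: str


def mz_from_neutral_mass(neutral_mass, charge):
    """Return m/z of a positively charged ion formed by adding protons."""
    if charge < 1:
        raise ValueError("charge must be a positive integer")
    return (neutral_mass + charge * PROTON_MASS) / charge


# Placeholder values for illustration only.
precursor_mz = mz_from_neutral_mass(1000.0, 2)
transition = MrmTransition(
    peptide="PEPTIDEK",
    precursor_mz=precursor_mz,
    precursor_charge=2,
    fragment_mz=600.3,
    fragment_ion="y5",
    collision_energy=25.0,
    instrument="hypothetical triple quadrupole",
)
print(round(transition.precursor_mz, 6))
```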

Proteins identified from extracellular vesicles in plasma

Extracellular vesicles (EVs) are membranous sacs released by both normal and diseased cell types and are also reported in body fluids such as blood, urine, saliva and synovial fluid (41–44). EVs include ectosomes, exosomes and apoptotic bodies. They facilitate communication between cells by transfer of membrane and cytosolic proteins and have also been shown to play a role in mediating immune response and antigen presentation (44). EVs have been shown to be present in plasma from healthy individuals and from patients with certain cancers, including glioblastoma and colon cancer (45,46). Therefore, identification and characterization of EVs in plasma should aid in disease diagnosis and discovery of biomarkers. Considering the importance of EVs, we have curated and included plasma EV proteins in PPD. There are 318 EV proteins annotated in PPD, and each is linked to Exocarta and Vesiclepedia, both manually curated resources for proteins, messenger RNA, micro RNA and lipids reported from EVs including exosomes (42,43).

Plasma Proteome Explorer: a batch query interface in PPD

We have enabled a batch query utility called ‘Plasma Proteome Explorer’, which allows users to query the plasma proteome by providing a list of peptides, gene symbols, Entrez Gene IDs, RefSeq accessions or UniProt IDs. The data analysis workflow of Plasma Proteome Explorer is represented in Figure 1. The results are displayed in a web page with the following features: PPD ID (hyperlinked to the PPD ‘molecule page’), gene symbol and Entrez Gene ID. Plasma Proteome Explorer will enable biologists with limited bioinformatics skills to compare their own experimental data set with the plasma proteome. In a single step, users can differentiate between novel plasma proteins, novel MS-derived peptides detected from plasma, and known plasma proteins and peptides. This will enable researchers in biomarker development to screen, from a larger set of candidate protein biomarkers identified in the discovery phase, those molecules that can be detected in plasma and are worth pursuing for validation. We believe that Plasma Proteome Explorer will occupy the intermediate niche between the ‘biomarker discovery’ and ‘biomarker validation’ phases. We have also enabled a separate query tab, which allows querying of proteins based on protein name, gene symbol, Entrez Gene ID, UniProt ID, sample type, experimental method and MS-based platform. In addition, search by protein sequence can be carried out in PPD. The ‘Browse’ option allows the user to browse proteins categorized on the basis of their protein name, biological process, molecular function, availability of MRM data and presence among plasma exosomal proteins.
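The batch screening described above amounts to partitioning a user's identifier list into entries already known in plasma and entries that are potentially novel. The sketch below illustrates that logic locally. It does not call any PPD interface; the function name is hypothetical, and the reference set is a toy stand-in for a list of identifiers such as one might extract from the PPD download files.

```python
def screen_against_known(query_ids, known_plasma_ids):
    """Split query identifiers into known and potentially novel ones.

    Input order is preserved; blank entries and repeats are skipped.
    """
    known_set = set(known_plasma_ids)
    known, novel = [], []
    seen = set()
    for identifier in query_ids:
        identifier = identifier.strip()
        if not identifier or identifier in seen:
            continue
        seen.add(identifier)
        if identifier in known_set:
            known.append(identifier)
        else:
            novel.append(identifier)
    return known, novel


# Toy reference set and query list (placeholders, not actual PPD content).
reference_ids = {"P00738", "P02768"}
user_ids = ["P02768", " P00738 ", "X_NOVEL_1", "", "P02768"]

known, novel = screen_against_known(user_ids, reference_ids)
print("known:", known)
print("novel:", novel)
```

Exact string matching is used here for simplicity; a practical screen would first map all identifiers to a single namespace (for example UniProt accessions) before comparing.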

CONCLUSIONS

The main goal of the PPD is to foster research in the area of clinical proteomics, especially the discovery of plasma biomarkers, to diagnose and monitor human diseases. We have introduced a number of new features to expand the utility of this database, particularly in the field of biomarker discovery and validation. Plasma proteome explorer will prove useful for selecting secreted candidate biomarkers from a larger set of differentially expressed genes/proteins in any disease or perturbed biological conditions. As the database develops, we foresee the need for the incorporation of information on disease association and availability of reagents such as enzyme-linked immunosorbent assay kits. We anticipate that PPD will provide a platform to bring together the communities of biomedical researchers engaged in biomarker discovery and validation along with proteomic researchers developing analytical solutions to overcome problems in plasma proteome research. We foresee that PPD will serve as a centralized reference repository for such a community approach.

FUNDING

Funding for open access charge: Waived by Oxford University Press. Conflict of interest statement. None declared.
REFERENCES (46 in total)

Review 1.  Serum proteomics in cancer diagnosis and management.

Authors:  Kevin P Rosenblatt; Peter Bryant-Greenwood; J Keith Killian; Arpita Mehta; David Geho; Virginia Espina; Emanuel F Petricoin; Lance A Liotta
Journal:  Annu Rev Med       Date:  2004       Impact factor: 13.739

Review 2.  Therapeutic potential of the plasma proteome.

Authors:  Julia Tait Lathrop; N Leigh Anderson; Norman G Anderson; David J Hammond
Journal:  Curr Opin Mol Ther       Date:  2003-06

Review 3.  Exosomes: proteomic insights and diagnostic potential.

Authors:  Richard J Simpson; Justin We Lim; Robert L Moritz; Suresh Mathivanan
Journal:  Expert Rev Proteomics       Date:  2009-06       Impact factor: 3.940

4.  Use of mass spectrometry to identify protein biomarkers of disease severity in the synovial fluid and serum of patients with rheumatoid arthritis.

Authors:  Hua Liao; Jiang Wu; Eric Kuhn; Wendy Chin; Betty Chang; Michael D Jones; Steve O'Neil; Karl R Clauser; Johann Karl; Fritz Hasler; Ronenn Roubenoff; Werner Zolg; Brad C Guild
Journal:  Arthritis Rheum       Date:  2004-12

5.  Proteome analysis of serum from type 2 diabetics with nephropathy.

Authors:  Hyun-Jung Kim; Eun-Hee Cho; Ji-Hye Yoo; Pan-Kyeom Kim; Jun-Seop Shin; Mi-Ryung Kim; Chan-Wha Kim
Journal:  J Proteome Res       Date:  2007-02       Impact factor: 4.466

6.  Reproducible quantification of cancer-associated proteins in body fluids using targeted proteomics.

Authors:  Ruth Hüttenhain; Martin Soste; Nathalie Selevsek; Hannes Röst; Atul Sethi; Christine Carapito; Terry Farrah; Eric W Deutsch; Ulrike Kusebauch; Robert L Moritz; Emma Niméus-Malmström; Oliver Rinner; Ruedi Aebersold
Journal:  Sci Transl Med       Date:  2012-07-11       Impact factor: 17.956

7.  Multiplexed MRM-based quantitation of candidate cancer biomarker proteins in undepleted and non-enriched human plasma.

Authors:  Andrew J Percy; Andrew G Chambers; Juncong Yang; Christoph H Borchers
Journal:  Proteomics       Date:  2013-06-06       Impact factor: 3.984

8.  Mapping the human plasma proteome by SCX-LC-IMS-MS.

Authors:  Xiaoyun Liu; Stephen J Valentine; Manolo D Plasencia; Sarah Trimpin; Stephen Naylor; David E Clemmer
Journal:  J Am Soc Mass Spectrom       Date:  2007-04-24       Impact factor: 3.109

9.  High sensitivity detection of plasma proteins by multiple reaction monitoring of N-glycosites.

Authors:  Jianru Stahl-Zeng; Vinzenz Lange; Reto Ossola; Katrin Eckhardt; Wilhelm Krek; Ruedi Aebersold; Bruno Domon
Journal:  Mol Cell Proteomics       Date:  2007-07-20       Impact factor: 5.911

10.  A list of candidate cancer biomarkers for targeted proteomics.

Authors:  Malu Polanski; N Leigh Anderson
Journal:  Biomark Insights       Date:  2007-02-07