Construction and characterization of the Korean whole saliva proteome to determine ethnic differences in human saliva proteome.

Ha Ra Cho1, Han Sol Kim1, Jun Seo Park1, Seung Cheol Park2, Kwang Pyo Kim2, Troy D Wood3, Yong Seok Choi1.   

Abstract

As the first step toward discovering protein disease biomarkers in saliva, global analyses of the saliva proteome have been carried out since the early 2000s, and more than 3,000 proteins have been identified in human saliva. Recently, ethnic differences in the human plasma proteome have been reported, but corresponding studies on human saliva have not. Thus, to determine ethnic differences in the human saliva proteome, a Korean whole saliva (WS) proteome catalogue indexing 480 proteins was built and characterized for the first time through nLC-Q-IMS-TOF analyses of WS samples collected from eleven healthy South Korean male adult volunteers. The identification of 226 distinct Korean WS proteins not observed in the integrated human saliva protein dataset, together with significant differences in gene ontology distribution between the Korean WS proteome and the integrated human saliva proteome, strongly supports ethnic differences in the human saliva proteome. Additionally, the potential value of ethnicity-specific human saliva proteins as biomarkers for diseases highly prevalent in that ethnic group was confirmed by finding 35 distinct Korean WS proteins likely to be associated with the top 10 deadliest diseases in South Korea. Finally, the present Korean WS protein list can serve as a first-level reference for future proteomic studies, including disease biomarker studies, on Korean saliva.

Year:  2017        PMID: 28742128      PMCID: PMC5524414          DOI: 10.1371/journal.pone.0181765

Source DB:  PubMed          Journal:  PLoS One        ISSN: 1932-6203            Impact factor:   3.240


Introduction

Saliva is secreted from the salivary glands, including three major glands (parotid, submandibular, and sublingual glands) and minor glands. Saliva has various functions: it maintains oral cavity homeostasis, lubricates oral tissues, promotes chewing, swallowing, digestion, and speaking, and protects the oral cavity against microorganisms [1–4]. It is composed of water, proteins, peptides, lipids, other small molecules, and minerals. Healthy adults are known to produce 500–1500 mL of saliva daily at a rate of about 0.5 mL/min [1, 4]. Most organic compounds in saliva are produced in the salivary glands, but some are transferred from blood through various mechanisms, including diffusion, active transport, and ultrafiltration [4]. Moreover, saliva collection is non-invasive, and saliva samples are easy to collect and store [5]. These characteristics make saliva a good alternative to blood for diagnosis. For example, major systemic infections by viruses such as human immunodeficiency virus, hepatitis C virus, and human papillomavirus have been successfully tested by saliva-based diagnostic methods [6]. Thus, clinical diagnosis using saliva specimens is an emerging field. Among the various constituents of saliva, proteins have gained the most interest as probable disease biomarkers because numerous proteins are known to be present in saliva and many of them are believed to reflect the progress of diseases [4]. As the first step toward discovering protein disease biomarkers in saliva, global analyses of the saliva proteome have been carried out since the early 2000s. As a result, more than 3,000 proteins have been identified in human saliva [1, 7–11]. Some of them are accessible through public databases such as the Human Salivary Proteome Central Repository (1,166 proteins) and the Sys-BodyFluid Database (2,161 proteins) [12, 13].
Additionally, systematic comparisons of the human saliva and plasma proteomes have been carried out, and several interesting points have been reported for the saliva proteome [1, 2]. First, only about 27% of proteins identified in human whole saliva (WS) are found in plasma, indicating that it is possible to discover totally novel biomarkers from saliva [2]. In addition, the human saliva and plasma proteomes are over-represented in the categories of response-to-stimulus and response-to-stress compared to the total human proteome. This indicates that both fluids (saliva and plasma) might play important roles in the defense system of the human body and points to their potential for disease diagnosis [1, 2]. These points have been supported by the discovery of many protein disease biomarker candidates from saliva for oral diseases and systemic diseases [2, 5, 14–17]. Moreover, about 58% of immunoglobulins (Igs) identified in human saliva are found in plasma, and the abundance of these overlapping Igs in saliva and plasma shows a high correlation (r of at least 0.87) [1, 2]. This indicates that it is possible to adapt antibody-based diagnostic methods that use blood into methods that use saliva; an excellent example is the commercial saliva HIV test kit [6]. Recently, ethnic differences in the human plasma proteome have been reported. Jeong et al. confirmed 100 unique proteins out of 185 proteins in Korean plasma compared to the 3,380 proteins in the HUPO Plasma Proteome Project dataset [18], and Kim et al. observed plasma level differences of some cardiovascular disease protein marker candidates between African-American and non-Hispanic White individuals [19]. These results indicate that there is a fundamental need to determine ethnic differences in the human saliva proteome. Unique proteins might be found only in ethnicity-specific saliva samples, and they might be useful as novel biomarkers for diseases prevalent in that ethnic group.
Therefore, the Korean WS proteome catalogue indexing 480 proteins was built and characterized in this study through proteomic analyses of WS samples collected from eleven healthy Korean male adult volunteers for the first time. It was then compared to the integrated human saliva proteome including 3,449 proteins to determine ethnic differences in human saliva proteome. Confirmed differences of protein identities and GO category distributions between the two proteomes strongly support that there are ethnic differences in the human saliva proteome. In addition, some distinct proteins in the Korean WS are likely to be associated with highly prevalent diseases in South Korea, demonstrating the high diagnostic potential of ethnicity-specific human saliva proteins for diseases highly prevalent within an ethnic group. Finally, the present list of Korean WS proteins can serve as the first level reference for future proteomic studies including disease biomarker studies on Korean saliva.

Materials and methods

Sample collection and preparation

This study was approved by the Dankook University Institutional Review Board. Participants were recruited between August 1, 2014 and August 31, 2014 by posting notices containing a brief summary of the study around Dankook University, Cheonan, Chungnam, South Korea. A total of 15 volunteers applied to participate in this study, but 4 of them were excluded based on the disease history reported in their introductory screening surveys. The remaining eleven healthy South Korean male adults (25.9±2.3 years old; range, 22–30 years) were enrolled, and each participant signed an informed consent form. Specific baseline demographic characteristics of the study population were not available, because only basic personal information and disease history were obtained through the introductory screening survey. WS (15 mL/person) was collected from each volunteer at 9:30 am, prior to eating and after rinsing the mouth with water. A protease inhibitor cocktail solution (Sigma-Aldrich, St. Louis, MO) was spiked into the WS samples (final volume ratio of 1:100) immediately after sample collection. These protease inhibitor-spiked samples were centrifuged at 12,000 rpm and 4°C for 10 min. Each supernatant was stored at -70°C until use. Prior to protein digestion, 4 mL of the thawed protease inhibitor-spiked sample supernatant was applied to a 3 kDa cutoff filter unit (Amicon Ultra-4, Merck Millipore, Billerica, MA) for buffer exchange with water. The filter unit was centrifuged at 3,500 rpm and the retentate was dried by vacuum centrifugation. The dried residue was resuspended in 500 μL of water and its total protein concentration was determined by BCA assay (Pierce BCA Protein Assay Kit, Thermo Scientific, Waltham, MA).
An appropriate portion of the resuspended solution (equivalent to 1 mg of total protein) was then dried again by vacuum centrifugation, and the resulting residue was processed as described previously with slight modifications (S1 Appendix) [20, 21]. A portion of the final sample solution was subjected to nanoliquid chromatography-quadrupole-ion mobility spectroscopy-time of flight (nLC-Q-IMS-TOF) analysis. For the pooled Korean WS sample, 1 mL of each thawed protease inhibitor-spiked sample supernatant was mixed, and the mixture was processed by the same method described above. A portion of the final pooled sample solution was subjected to nLC-Q-IMS-TOF analysis and nLC-Q-orbitrap analysis.

Separation and analysis

All nLC-Q-IMS-TOF analyses were carried out on a Waters nanoACQUITY UPLC system (Waters, Milford, MA) and a Waters SYNAPT G2-S HDMS system. The prepared sample was injected into a Waters nanoACQUITY UPLC Symmetry C18 trap column (5 μm, 0.18×20 mm). It was desalted with 99% mobile phase A (0.1% formic acid in water) and 1% mobile phase B (0.1% formic acid in acetonitrile) for 5 min at a flow rate of 10 μL/min. Trap column-retained peptides were eluted into a Waters nanoACQUITY UPLC BEH300 C18 column (1.7 μm, 0.075×250 mm) and separated by a linear gradient of mobile phase B from 1 to 60% over 120 min at a flow rate of 250 nL/min. Peptides eluted from the analytical column were delivered into the mass spectrometer through a nanoelectrospray ionization (nESI) source operating in positive ion mode. Mass spectrometry of peptide ions was performed in resolution data-independent acquisition mode (MSE). Prior to fragmentation, IMS was carried out to separate similar precursor ions. Parameters related to mass spectrometry are listed in S1 Appendix.
All nLC-Q-orbitrap analyses were carried out on a Thermo Scientific Easy-nLC 1000 system (Waltham, MA) and a Thermo Scientific Q Exactive system. The prepared sample was desalted with a Top Tip (Glygen, Columbia, MD) following the manufacturer's instructions, and the desalted sample was separated on an in-house analytical column (0.075×250 mm) packed with C18 resin (Jupiter, 3 μm, Phenomenex, Torrance, CA) by a linear gradient of mobile phase B from 1 to 80% over 120 min at a flow rate of 300 nL/min. Peptides eluted from the column were delivered into the mass spectrometer through an nESI source operating in positive ion mode. Mass spectrometry of peptide ions was performed in data-dependent product ion scan (MS2) mode. Parameters related to mass spectrometry are listed in S1 Appendix.
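As a rough illustration of the linear gradients described above, the following sketch computes the nominal percentage of mobile phase B at a given time. It assumes ideal linear interpolation between the stated start and end compositions and ignores instrument dwell volume; the function name `percent_b` and the clamping outside the gradient window are illustrative assumptions, not part of the reported method.

```python
def percent_b(t_min, start=1.0, end=60.0, duration_min=120.0):
    """Nominal %B at time t_min for a linear gradient.

    Defaults follow the nLC-Q-IMS-TOF gradient in the text (1 to 60% B
    over 120 min). Times outside the gradient window are clamped
    (an assumption made for this sketch).
    """
    if duration_min <= 0:
        raise ValueError("duration_min must be positive")
    fraction = min(max(t_min / duration_min, 0.0), 1.0)
    return start + (end - start) * fraction


# Nominal composition at the gradient midpoint on each platform
print(percent_b(60))                        # nLC-Q-IMS-TOF gradient (1 to 60% B)
print(percent_b(60, start=1.0, end=80.0))   # nLC-Q-orbitrap gradient (1 to 80% B)
```

Such a helper is only useful for sanity-checking where in the run a peptide eluted; the actual composition at the column depends on the system's delay volume.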

Protein identification and bioinformatics

Raw data from nLC-Q-IMS-TOF and nLC-Q-orbitrap were analyzed with Waters ProteinLynx Global Server (PLGS) v3.0.2 and Thermo Proteome Discoverer v2.1, respectively. Peptides and proteins were identified by database search against the IPI human database v3.87; the database search parameters are listed in S1 Appendix. All database search results were verified manually. For GO analysis of saliva proteomes, the Generic GO term mapper was used [22]. The significance of differences in individual GO categories between the Korean WS proteome and the integrated human saliva proteome was tested by the chi-square method [23]. A database of disease-related biomarkers was used to check for probable associations between diseases and the distinct proteins observed in the Korean WS proteome but not in the integrated human saliva proteome [24].
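The per-category chi-square comparison can be sketched as a 2×2 test of "annotated with the GO term" versus "not annotated" in each proteome. The source does not specify the exact table layout or whether a continuity correction was applied, so the Pearson form without correction is an assumption here, and the in-category counts below are hypothetical (only the totals of 480 and 3,449 come from the text).

```python
import math


def chi_square_2x2(a, b, c, d):
    """Pearson chi-square (no continuity correction) for [[a, b], [c, d]].

    Returns (statistic, p_value) for 1 degree of freedom.
    """
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    if 0 in (row1, row2, col1, col2):
        raise ValueError("every row and column total must be non-zero")
    stat = n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)
    # With 1 degree of freedom, P(X > stat) = erfc(sqrt(stat / 2)).
    p_value = math.erfc(math.sqrt(stat / 2.0))
    return stat, p_value


# Hypothetical in-category counts (NOT from the paper); totals are from the text.
korean_in, korean_total = 120, 480
integrated_in, integrated_total = 690, 3449

stat, p = chi_square_2x2(
    korean_in, korean_total - korean_in,
    integrated_in, integrated_total - integrated_in,
)
print(f"chi-square = {stat:.2f}, p = {p:.4f}")
```

A category would be called significantly different at the p < 0.05 level used in the Results if `p` falls below 0.05.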

Results

The Korean WS proteome

Based on nLC-Q-IMS-TOF analyses of Korean WS samples, the Korean WS proteome was built successfully for the first time. To enhance the credibility of the protein identification results, the following criteria were set: 1) any identification derived from only one unique peptide was rejected, 2) FDR was kept at no more than 1%, 3) only protein identifications with at least 95% probability in the PLGS results were accepted, and 4) all results that passed the above criteria were verified manually. These criteria were applied to all downstream protein identifications. As a result, a total of 480 proteins were identified (S1 Table). The distributions of theoretical molecular weight and isoelectric point (pI) of the Korean WS proteome were also examined (Fig 1A for molecular weight and Fig 1B for pI). As shown in Fig 1A, a large portion (82.3%) of the Korean WS proteome is composed of proteins with a molecular weight of less than 60 kDa. There is a roughly inverse correlation between the proportion of proteins and their molecular weight in the range of 60–160 kDa. In the case of pI distribution, the Korean WS proteome is composed of 16.5, 37.5, 30.0, and 16.0% of proteins with pI values lower than 5.0, between 5.0 and 7.0, between 7.0 and 9.0, and higher than 9.0, respectively (Fig 1B). The average molecular weight and pI value of the Korean WS proteome were calculated to be 42 kDa and 6.95, respectively.
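The four acceptance criteria listed above can be sketched as a simple filter. The record layout (`ProteinHit` and its field names) and the example accessions are hypothetical, and FDR is treated here as a per-record value for simplicity, whereas in practice it is usually controlled at the dataset level by the search software.

```python
from dataclasses import dataclass


@dataclass
class ProteinHit:
    """Hypothetical record for one protein identification."""
    accession: str
    unique_peptides: int
    fdr: float              # fraction, 0.01 == 1%
    probability: float      # fraction, 0.95 == 95%
    manually_verified: bool


def passes_criteria(hit: ProteinHit) -> bool:
    """Apply the four criteria from the text to one identification."""
    return (
        hit.unique_peptides >= 2        # 1) reject single-unique-peptide hits
        and hit.fdr <= 0.01             # 2) FDR of no more than 1%
        and hit.probability >= 0.95     # 3) at least 95% probability
        and hit.manually_verified       # 4) manual verification
    )


# Illustrative records only; the accessions are placeholders.
hits = [
    ProteinHit("EXAMPLE_A", 3, 0.005, 0.99, True),
    ProteinHit("EXAMPLE_B", 1, 0.005, 0.99, True),   # fails criterion 1
    ProteinHit("EXAMPLE_C", 4, 0.020, 0.99, True),   # fails criterion 2
    ProteinHit("EXAMPLE_D", 2, 0.010, 0.90, True),   # fails criterion 3
    ProteinHit("EXAMPLE_E", 2, 0.010, 0.95, False),  # fails criterion 4
]
accepted = [h.accession for h in hits if passes_criteria(h)]
print(accepted)
```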
Fig 1

Theoretical molecular weight (A) and isoelectric point (B) distribution of the Korean whole saliva proteome.

Comparison of protein lists from the Korean WS proteome and the integrated human saliva proteome

To determine ethnic differences in the human saliva proteome, the present Korean WS proteome was compared to the updated human saliva protein list of Sivadasan et al. (a total of 3,449 proteins), which was built by integrating their own list with five previously reported human saliva protein lists [1, 7–11]. For accurate comparison between proteomes, all available information on a protein, including IPI accession number, SwissProt number, gene symbol, amino acid sequence, molecular weight, and brief description, was used for its UniProt KB search. Search results for similar proteins in the various proteomes were then carefully compared with one another to determine whether they were the same protein. As shown in Fig 2, 226 of the 480 proteins (47.1%) in the Korean WS protein list are not included in the integrated human saliva protein list. These distinct Korean WS proteins are summarized in Table 1 and S2 Table.
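Once each protein has been resolved to a single identifier, the comparison above reduces to set operations. The sketch below is only an illustration under that assumption: the authors describe a manual, multi-field UniProt KB comparison rather than a plain identifier match, and the function name and toy identifiers are hypothetical.

```python
def compare_protein_lists(study_ids, reference_ids):
    """Return (distinct, shared, percent_distinct) for two identifier lists.

    `distinct` holds identifiers found only in the study list; the
    percentage is relative to the size of the study list.
    """
    study, reference = set(study_ids), set(reference_ids)
    distinct = study - reference
    shared = study & reference
    percent = 100.0 * len(distinct) / len(study) if study else 0.0
    return distinct, shared, percent


# Toy identifiers, not real accessions
korean = {"P1", "P2", "P3", "P4"}
integrated = {"P3", "P4", "P5"}
distinct, shared, percent = compare_protein_lists(korean, integrated)
print(sorted(distinct), sorted(shared), percent)

# The proportion reported in the text: 226 distinct proteins out of 480
reported_percent = round(100.0 * 226 / 480, 1)
print(reported_percent)
```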
Fig 2

Venn diagram illustrating the total number of proteins specific to either the Korean whole saliva proteome or the integrated human saliva proteome and those observed in both proteomes.

Table 1

Distinct proteins observed in Korean whole saliva but not in other human saliva.

1, response to stimulus; 2, response to stress; 3, cell communication; 4, protein metabolic; 5, other primary metabolic; 6, transport; 7, organization and biogenesis; 8, catabolic process; 9, cell homeostasis; 10, regulation of biological process; 11, nucleic acid binding; 12, protein binding; 13, other binding; 14, transporter activity; 15, signal transducer activity; 16, catalytic activity; 17, motor activity; 18, structural regulator; 19, transcription regulator; 20, antioxidant activity; 21, enzyme regulator activity.

Accession number | Description | Biological process | Molecular function | Biomarker candidates
IPI00000149 | ISOFORM 1 OF CASPASE-8 | 1, 2, 3, 4, 6, 7, 8, 10 | 12, 13, 15, 16 | o
IPI00003842 | ISOFORM 1 OF MICROTUBULE-ASSOCIATED PROTEIN 2 | 7 | 12, 18 | o
IPI00006079 | ISOFORM 1 OF BCL-2-ASSOCIATED TRANSCRIPTION FACTOR 1 | 2, 3, 4, 5, 7, 10 | 11, 12 | x
IPI00011528 | CLEAVAGE STIMULATION FACTOR SUBUNIT 1 | 4, 5 | 11, 12 | x
IPI00011651 | ISOFORM 1 OF RECEPTOR-TYPE TYROSINE-PROTEIN PHOSPHATASE GAMMA | 3, 4, 10 | 12, 15, 16, 18 | x
IPI00012865 | ISOFORM PLD1A OF PHOSPHOLIPASE D1 | 1, 3, 4, 5, 6, 7, 8, 10 | 12, 13, 16 | x
IPI00015894 | CDC42 EFFECTOR PROTEIN 4 | 3, 7, 10 | 12, 21 | x
IPI00016472 | ISOFORM 2 OF ZINC FINGER CCCH DOMAIN-CONTAINING PROTEIN 13 | unknown | 12, 13 | x
IPI00017290 | ISOFORM 5 OF SERINE/THREONINE-PROTEIN PHOSPHATASE 4 REGULATORY SUBUNIT 3A | unknown | unknown | x
IPI00018214 | ISOFORM 1 OF PROTEIN MAX | 1, 2, 3, 4, 5, 7, 10 | 11, 12, 19 | o
IPI00019226 | ISOFORM 1 OF BROMODOMAIN-CONTAINING PROTEIN 8 | 3, 4, 5, 7, 10 | 11, 12, 15, 18, 19 | o
IPI00019794 | SH3 AND MULTIPLE ANKYRIN REPEAT DOMAINS PROTEIN 3 | 1, 3, 4, 6, 7, 9, 10 | 12, 13, 18 | x
IPI00020019 | ADIPONECTIN | 1, 2, 3, 4, 5, 6, 7, 8, 10 | 12, 13 | o
IPI00022434 | UNCHARACTERIZED PROTEIN (SIMILAR TO SERUM ALBUMIN) | unknown | unknown | o
IPI00024134 | IG KAPPA CHAIN V-I REGION WALKER | 3, 4, 6, 10 | 13, 16 | x
IPI00026126 | MAMMAGLOBIN-B | 1, 3 | 12 | o
IPI00027838 | RNA-BINDING PROTEIN 4B | 1, 5, 10 | 11, 12, 13 | x
IPI00029403 | SORTING NEXIN-4 | 1, 2, 3, 6, 7, 10 | 12, 13 | x
IPI00030144 | PEPTIDYL-PROLYL CIS-TRANS ISOMERASE A-LIKE 4A/B/C | 4 | 16 | x
IPI00032896 | ISOFORM 1 OF DISRUPTED IN SCHIZOPHRENIA 1 PROTEIN | 1, 3, 4, 7, 8, 9, 10 | 12, 18 | x
IPI00033892 | GDNF-INDUCIBLE ZINC FINGER PROTEIN 1 | 4, 5, 7, 10 | 11, 13, 19 | x
IPI00064474 | ZINC FINGER CCCH DOMAIN-CONTAINING PROTEIN 10 | unknown | 12, 13 | x
IPI00102979 | ORAL CANCER-OVEREXPRESSED PROTEIN 1 | unknown | unknown | x
IPI00103636 | ISOFORM 2 OF WAP FOUR-DISULFIDE CORE DOMAIN PROTEIN 2 | 4, 10 | 21 | o
IPI00107502 | ISOFORM 1 OF WD REPEAT AND SOCS BOX-CONTAINING PROTEIN 1 | 3, 10 | 12, 16 | x
IPI00150554 | UNCHARACTERIZED PROTEIN (SIMILAR TO CELL DIVISION CYCLE-ASSOCIATED 7-LIKE PROTEIN) | unknown | unknown | o
IPI00158280 | FORMYLTETRAHYDROFOLATE DEHYDROGENASE ISOFORM A VARIANT | unknown | unknown | x
IPI00158847 | DNAJ HOMOLOG SUBFAMILY C MEMBER 14 | 6 | 13 | x
IPI00160396 | ISOFORM 4 OF PUTATIVE HOMEODOMAIN TRANSCRIPTION FACTOR 2 | 4, 5, 10 | 11 | x
IPI00169307 | RHO GTPASE-ACTIVATING PROTEIN 21 | 3, 6, 7 | 12, 21 | x
IPI00170867 | ISOFORM 1 OF DEATH DOMAIN-ASSOCIATED PROTEIN 6 | 1, 2, 3, 4, 5, 7, 10 | 12, 13, 15, 21 | x
IPI00181753 | CDNA FLJ13286 FIS, CLONE OVARC1001154, HIGHLY SIMILAR TO HOMO SAPIENS CLONE 24720 EPITHELIN 1 AND 2 MRNA | unknown | unknown | o
IPI00185027 | ISOFORM 1 OF ARGININE-GLUTAMIC ACID DIPEPTIDE REPEATS PROTEIN | 4, 5, 6, 7, 10 | 11, 12, 13, 19 | x
IPI00216773 | UNCHARACTERIZED PROTEIN (SIMILAR TO SERUM ALBUMIN) | unknown | unknown | o
IPI00217345 | ISOFORM 2 OF UDP-GLCNAC:BETAGAL BETA-1,3-N-ACETYLGLUCOSAMINYLTRANSFERASE 2 | 1, 4, 5, 7 | 16 | x
IPI00217979 | UNCHARACTERIZED PROTEIN MLL5 | 3, 4, 5, 7, 10 | 12, 13, 16 | x
IPI00218573 | ISOFORM 3 OF INTERLEUKIN-1 RECEPTOR ANTAGONIST PROTEIN | 1, 2, 3, 5, 6, 7, 10 | 12, 13 | o
IPI00218824 | UBIQUITOUSLY TRANSCRIBED TETRATRICOPEPTIDE REPEAT PROTEIN Y-LINKED TRANSCRIPT VARIANT 29 | 4, 5, 7, 10 | 13, 16 | x
IPI00220644 | ISOFORM M1 OF PYRUVATE KINASE ISOZYMES M1/M2 | 1, 2, 4, 5, 7, 8, 10 | 11, 12, 13, 16 | o
IPI00238220 | UNCHARACTERIZED PROTEIN KIAA2022 | 7 | unknown | x
IPI00246842 | ISOFORM 1 OF TATA BOX-BINDING PROTEIN-ASSOCIATED FACTOR RNA POLYMERASE I SUBUNIT C | 4, 5, 10 | 11, 12 | x
IPI00291064 | ISOFORM 1 OF AN1-TYPE ZINC FINGER PROTEIN 1 | unknown | 13 | x
IPI00292753 | ISOFORM 6 OF GTPASE-ACTIVATING PROTEIN AND VPS9 DOMAIN-CONTAINING PROTEIN 1 | 3, 6, 7 | 12, 21 | x
IPI00299749 | ZINC FINGER PROTEIN 16 | 4, 5, 7, 10 | 11, 12, 13, 19 | x
IPI00300502 | MYOZENIN-1 | 7 | 12 | x
IPI00305457 | PRO2275 | unknown | unknown | o
IPI00328712 | UNCHARACTERIZED PROTEIN C2ORF53 | unknown | unknown | x
IPI00329002 | SEC14 DOMAIN AND SPECTRIN REPEAT-CONTAINING PROTEIN 1 | unknown | 12, 13 | x
IPI00329688 | PROTEIN YIPF3 | 7 | 12 | x
IPI00334190 | STOMATIN-LIKE PROTEIN 2 | 2, 3, 4, 5, 6, 7, 9, 10 | 12, 13 | x
IPI00376170 | PEPTIDYL-PROLYL CIS-TRANS ISOMERASE A-LIKE | 4, 5, 6, 7, 10 | 12, 13, 16 | o
IPI00382577 | KAPPA 1 LIGHT CHAIN VARIABLE REGION | unknown | unknown | x
IPI00384361 | ISOFORM 3 OF TRIGGERING RECEPTOR EXPRESSED ON MYELOID CELLS 2 | 1, 2, 3, 4, 6, 7, 10 | 12, 13, 18 | x
IPI00384397 | MYOSIN-REACTIVE IMMUNOGLOBULIN LIGHT CHAIN VARIABLE REGION | unknown | unknown | x
IPI00384404 | RHEUMATOID FACTOR RF-ET9 | unknown | unknown | x
IPI00385533 | ISOFORM 4 OF IODOTYROSINE DEHALOGENASE 1 | 5 | 12, 16, 20 | x
IPI00386134 | IG LAMBDA CHAIN V-I REGION BL2 | 3, 4, 6, 10 | 13, 16 | x
IPI00386136 | IG LAMBDA CHAIN V-VI REGION WLT | 3, 4, 6, 10 | 13, 16 | x
IPI00411765 | ISOFORM 2 OF 14-3-3 PROTEIN SIGMA | 2, 3, 4, 6, 7, 10 | 12, 21 | o
IPI00412407 | SERPIN PEPTIDASE INHIBITOR, CLADE B (OVALBUMIN), MEMBER 4, ISOFORM CRA_A | unknown | unknown | o
IPI00412701 | ZINC FINGER PROTEIN 517 | 4, 5, 10 | 11, 13 | x
IPI00418164 | ISOFORM 1 OF COILED-COIL DOMAIN-CONTAINING PROTEIN 30 | unknown | unknown | x
IPI00418780 | ISOFORM 3 OF REGULATOR OF G-PROTEIN SIGNALING 3 | 3, 4 | 12, 21 | x
IPI00419253 | ISOFORM 1 OF NCK-ASSOCIATED PROTEIN 5 | unknown | unknown | x
IPI00423464 | PUTATIVE UNCHARACTERIZED PROTEIN DKFZP686K03196 | unknown | unknown | o
IPI00433833 | UNCHARACTERIZED PROTEIN (SIMILAR TO THO COMPLEX SUBUNIT 3) | unknown | unknown | x
IPI00445778 | CDNA FLJ60358, HIGHLY SIMILAR TO REGULATOR OF G-PROTEIN SIGNALING 3 | unknown | unknown | x
IPI00451401 | ISOFORM 2 OF TRIOSEPHOSPHATE ISOMERASE | 4, 5, 7, 8 | 12, 16 | o
IPI00477003 | COILED-COIL DOMAIN-CONTAINING PROTEIN KIAA1407 | unknown | unknown | x
IPI00478115 | ISOFORM 2 OF DELETED IN MALIGNANT BRAIN TUMORS 1 PROTEIN | 1, 2, 3, 4, 6, 7 | 12, 15, 18 | o
IPI00479295 | GOLGIN SUBFAMILY A MEMBER 8-LIKE PROTEIN 2 | unknown | unknown | x
IPI00514669 | UNCHARACTERIZED PROTEIN (SIMILAR TO SH3 DOMAIN-BINDING GLUTAMIC ACID-RICH-LIKE PROTEIN 3) | unknown | unknown | x
IPI00556287 | PUTATIVE UNCHARACTERIZED PROTEIN | unknown | unknown | x
IPI00556391 | ACTIN-LIKE PROTEIN | unknown | unknown | x
IPI00641229 | IG ALPHA-2 CHAIN C REGION | 1, 2, 3, 4, 5, 6, 7, 10 | 13 | x
IPI00643174 | GATA-TYPE ZINC FINGER PROTEIN 1 | 4, 5, 7, 10 | 11, 12, 13, 19 | x
IPI00644220 | CDNA FLJ60139, HIGHLY SIMILAR TO HOMO SAPIENS HIV TAT SPECIFIC FACTOR 1 (HTATSF1), MRNA | unknown | unknown | x
IPI00646265 | 58 KDA PROTEIN (SIMILAR TO AMYLASE) | 5 | 12, 13, 16 | x
IPI00646771 | ISOFORM 4 OF DISKS LARGE HOMOLOG 2 | 3, 5, 7, 10 | 12, 13, 16 | x
IPI00657660 | UNCHARACTERIZED PROTEIN (SIMILAR TO HEMOGLOBIN SUBUNIT DELTA) | 6 | 13, 14 | o
IPI00657911 | UNCHARACTERIZED PROTEIN (SIMILAR TO HEMOGLOBIN SUBUNIT GAMMA-2) | 6 | 13, 14 | o
IPI00719452 | IGL@ PROTEIN | unknown | unknown | x
IPI00738655 | POTE ANKYRIN DOMAIN FAMILY MEMBER J | unknown | unknown | x
IPI00739539 | POTE ANKYRIN DOMAIN FAMILY MEMBER F | unknown | unknown | o
IPI00740545 | POTE ANKYRIN DOMAIN FAMILY MEMBER I | unknown | unknown | x
IPI00742996 | RING FINGER PROTEIN 214 | unknown | 13 | x
IPI00743194 | KAPPA LIGHT CHAIN VARIABLE REGION | unknown | unknown | x
IPI00745468 | SIMILAR TO TRANSMEMBRANE PROTEIN 90A | 1 | unknown | x
IPI00746516 | PEJVAKIN | unknown | unknown | x
IPI00759479 | ISOFORM 1 OF IQ MOTIF AND SEC7 DOMAIN-CONTAINING PROTEIN 2 | 3, 7 | unknown | x
IPI00759832 | ISOFORM SHORT OF 14-3-3 PROTEIN BETA/ALPHA | 3, 4, 5, 6, 7, 10 | 12 | o
IPI00761051 | ISOFORM 1 OF ERYTHROID DIFFERENTIATION-RELATED FACTOR 1 | 4, 5, 10 | 12 | x
IPI00783184 | IMMUNGLOBULIN HEAVY CHAIN VARIABLE REGION | unknown | unknown | x
IPI00783287 | IMMUNGLOBULIN HEAVY CHAIN VARIABLE REGION | unknown | unknown | x
IPI00783393 | IMMUNGLOBULIN HEAVY CHAIN VARIABLE REGION | unknown | unknown | x
IPI00783909 | IMMUNGLOBULIN HEAVY CHAIN VARIABLE REGION | unknown | unknown | x
IPI00794543 | CDNA FLJ75174, HIGHLY SIMILAR TO HOMO SAPIENS CALMODULIN 1 (PHOSPHORYLASE KINASE, DELTA), MRNA | unknown | unknown | x
IPI00816314 | PUTATIVE UNCHARACTERIZED PROTEIN DKFZP686I15196 | unknown | unknown | o
IPI00816799 | RHEUMATOID FACTOR D5 LIGHT CHAIN | unknown | unknown | x
IPI00827510 | HRV FAB 026-VL | unknown | unknown | x
IPI00827560 | HRV FAB N27-VL | unknown | unknown | x
IPI00827581 | VARIABLE IMMNOGLOBULIN ANTI-ESTRADIOL HEAVY CHAIN | unknown | unknown | x
IPI00827618 | VH6DJ PROTEIN | unknown | unknown | x
IPI00827690 | VH-3 FAMILY (VH26)D/J PROTEIN | unknown | unknown | x
IPI00827789 | HRV FAB N6-VL | unknown | unknown | x
IPI00827826 | COLD AGGLUTININ FS-2 L-CHAIN | unknown | unknown | x
IPI00828061 | ANTI-MUCIN1 HEAVY CHAIN VARIABLE REGION | unknown | unknown | x
IPI00828088 | VH6DJ PROTEIN | unknown | unknown | x
IPI00828099 | UGA8H | unknown | unknown | x
IPI00829590 | 13 KDA PROTEIN (SIMILAR TO IMMUNOGLOBULIN HEAVY VARIABLE 3–72) | unknown | unknown | x
IPI00829600 | 13 KDA PROTEIN (SIMILAR TO IMMUNOGLOBULIN KAPPA VAIRABLE 2D-26) | unknown | unknown | x
IPI00829701 | 13 KDA PROTEIN (SIMILAR TO IMMUNOGLOBULIN HEAVY VARIABLE 1–18) | unknown | unknown | x
IPI00829752 | MYOSIN-REACTIVE IMMUNOGLOBULIN HEAVY CHAIN VARIABLE REGION | unknown | unknown | x
IPI00829812 | 13 KDA PROTEIN (SIMILAR TO IMMUNOGLOBULIN HEAVY VARIABLE 3–13) | 3, 4, 6, 10 | 13, 16 | x
IPI00829841 | 13 KDA PROTEIN (SIMILAR TO IMMUNOGLOBULIN HEAVY VARIABLE 3–73) | unknown | unknown | x
IPI00829845 | IMMUNGLOBULIN HEAVY CHAIN VARIABLE REGION | unknown | unknown | o
IPI00829896 | HEMOGLOBIN LEPORE-BALTIMORE | unknown | unknown | o
IPI00829956 | RHEUMATOID FACTOR C6 LIGHT CHAIN | unknown | unknown | x
IPI00844168 | INTERFERON-INDUCIBLE PROTEIN AIM2 | 1, 2, 3, 4, 5, 6, 7, 10 | 11, 12 | x
IPI00844239 | IMMUNOBLOBULIN G1 FAB HEAVY CHAIN VARIABLE REGION | unknown | unknown | x
IPI00847618 | UNCHARACTERIZED PROTEIN (SIMILAR TO SH3 AND MULTIPLE ANKYRIN REPEAT DOMAINS PROTEIN 3) | 1, 3, 4, 6, 7, 9, 10 | 12, 13, 18 | x
IPI00853045 | ANTI-RHD MONOCLONAL T125 KAPPA LIGHT CHAIN | unknown | unknown | x
IPI00853455 | PROTEIN | unknown | unknown | x
IPI00853525 | UNCHARACTERIZED PROTEIN (Apolipoprotein A) | 4, 6 | 13 | o
IPI00853641 | UNCHARACTERIZED PROTEIN (SIMILAR TO HEMOGLOBIN SUBUNIT EPSILON) | 6 | 13, 14 | x
IPI00854644 | IG KAPPA CHAIN V-III REGION TI | 3, 4, 6, 10 | 13, 16 | x
IPI00854732 | SIMILAR TO MYOSIN-REACTIVE IMMUNOGLOBULIN HEAVY CHAIN VARIABLE REGION | unknown | unknown | x
IPI00854743 | IMMUNGLOBULIN HEAVY CHAIN VARIABLE REGION | unknown | unknown | x
IPI00854841 | MYOSIN-REACTIVE IMMUNOGLOBULIN HEAVY CHAIN VARIABLE REGION | unknown | unknown | x
IPI00871622 | PUTATIVE ZINC-ALPHA-2-GLYCOPROTEIN-LIKE 1 | unknown | unknown | x
IPI00871681 | PUTATIVE UNCHARACTERIZED PROTEIN ENSP00000344348 | unknown | unknown | x
IPI00878173 | CDNA FLJ39583 FIS, CLONE SKMUS2004897, HIGHLY SIMILAR TO ACTIN, ALPHA SKELETAL MUSCLE | unknown | unknown | x
IPI00878282 | 23 KDA PROTEIN (SIMILAR TO SERUM ALBUMIN) | 6 | unknown | o
IPI00884092 | ANTI-HER3 SCFV | unknown | unknown | x
IPI00888712 | PUTATIVE BETA-ACTIN-LIKE PROTEIN 3 | 6 | 11 | x
IPI00888954 | IMMUNGLOBULIN HEAVY CHAIN VARIABLE REGION | unknown | unknown | x
IPI00890703 | CRYOCRYSTALGLOBULIN CC1 KAPPA LIGHT CHAIN VARIABLE REGION | unknown | unknown | x
IPI00890754 | CRYOCRYSTALGLOBULIN CC1 HEAVY CHAIN VARIABLE REGION | unknown | unknown | x
IPI00892724 | CELL DIVISION CYCLE-ASSOCIATED 7-LIKE PROTEIN ISOFORM 2 | unknown | unknown | o
IPI00893862 | SFI1 HOMOLOG, SPINDLE ASSEMBLY ASSOCIATED | unknown | unknown | x
IPI00894523 | UNCHARACTERIZED PROTEIN (SIMILAR TO ACTIN, CYTOPLASMIC 1) | unknown | unknown | o
IPI00896380 | ISOFORM 2 OF IG MU CHAIN C REGION | 1, 2, 3, 4, 6, 7, 10 | 11, 12, 13 | x
IPI00903112 | CDNA FLJ36533 FIS, CLONE TRACH2004428, HIGHLY SIMILAR TO LACTOTRANSFERRIN | unknown | unknown | x
IPI00908653 | CDNA FLJ60121, HIGHLY SIMILAR TO UNC-13 HOMOLOG B | unknown | unknown | x
IPI00909257 | CDNA FLJ50149, HIGHLY SIMILAR TO HOMO SAPIENS ATTRACTIN-LIKE 1 (ATRNL1), MRNA | unknown | unknown | x
IPI00909649 | IG KAPPA CHAIN C REGION | 1, 2, 3, 4, 6, 7, 10 | 13, 16 | x
IPI00910380 | CDNA FLJ54278, HIGHLY SIMILAR TO SPARC-LIKE PROTEIN 1 | unknown | unknown | x
IPI00910779 | CDNA FLJ52141, HIGHLY SIMILAR TO 14-3-3 PROTEIN GAMMA | unknown | unknown | x
IPI00915869 | MALATE DEHYDROGENASE, CYTOPLASMIC ISOFORM 3 | 5 | 16 | x
IPI00916434 | ANTI-(ED-B) SCFV | unknown | unknown | x
IPI00921079 | CDNA FLJ61647, HIGHLY SIMILAR TO NUCLEOLAR COMPLEX PROTEIN 2 HOMOLOG | unknown | unknown | o
IPI00922693 | CDNA FLJ53662, HIGHLY SIMILAR TO ACTIN, ALPHA SKELETAL MUSCLE | unknown | unknown | o
IPI00924681 | UNCHARACTERIZED PROTEIN (SIMILAR TO ZINC FINGER PROTEIN 568) | unknown | unknown | x
IPI00924820 | SIMILAR TO KAPPA LIGHT CHAIN VARIABLE REGION | unknown | unknown | x
IPI00924948 | UNCHARACTERIZED PROTEIN (SIMILAR TO ZINC-ALPHA-2-GLYCOPROTEIN) | unknown | 13 | o
IPI00925547 | UNCHARACTERIZED PROTEIN (SIMILAR TO LACTOTRANSFERRIN) | 1, 2, 4, 10 | 13, 16 | x
IPI00930072 | PUTATIVE UNCHARACTERIZED PROTEIN DKFZP686E23209 | unknown | unknown | x
IPI00930124 | PUTATIVE UNCHARACTERIZED PROTEIN DKFZP686C11235 | unknown | unknown | o
IPI00930351 | HBBM FUSED GLOBIN PROTEIN | unknown | unknown | o
IPI00930404 | ISOFORM 2 OF KALLIKREIN-1 | 4 | 16 | o
IPI00930442 | PUTATIVE UNCHARACTERIZED PROTEIN DKFZP686M24218 | 1, 2, 3, 4, 6, 7, 10 | 13, 16 | x
IPI00937669 | 10 KDA PROTEIN | unknown | unknown | x
IPI00939278 | UNCHARACTERIZED PROTEIN (SIMILAR TO ZINC-ALPHA-2-GLYCOPROTEIN) | unknown | 13 | o
IPI00940245 | IMMUNOGLOBULIN HEAVY CHAIN VARIANT | unknown | unknown | x
IPI00940727 | 10 KDA PROTEIN | unknown | unknown | x
IPI00943106 | UNCHARACTERIZED PROTEIN (SIMILAR TO GOLGIN SUBFAMILY A MEMBER 8N) | unknown | unknown | x
IPI00953352 | ISOFORM 10 OF PROTEIN SFI1 HOMOLOG | 5, 10 | 12 | x
IPI00956140 | HCG1652138, ISOFORM CRA_A | unknown | unknown | x
IPI00956602 | ANTI-STREPTOCOCCAL/ANTI-MYOSIN IMMUNOGLOBULIN LAMBDA LIGHT CHAIN VARIABLE REGION | unknown | unknown | x
IPI00964049 | UNCHARACTERIZED PROTEIN (SIMILAR TO SPECKL-TYPE POZ PROTEIN) | unknown | unknown | x
IPI00965653 | 12 KDA PROTEIN (SIMILAR TO IMMUNOGLOBULIN KAPPA VARIABLE 2D-40) | 3, 4, 6, 10 | 13, 16 | x
IPI00966829 | 69 KDA PROTEIN (SIMILAR TO SERUM ALBUMIN) | 6 | unknown | o
IPI00966961 | PROTEIN (SIMILAR TO CENTROSOMAL PROTEIN 192 KDA) | 7, 10 | unknown | x
IPI00969456 | IGKC PUTATIVE UNCHARACTERIZED PROTEIN | unknown | unknown | x
IPI00969620 | LIGHT CHAIN FAB | unknown | unknown | x
IPI00972963 | LAMBDA LIGHT CHAIN OF HUMAN IMMUNOGLOBULIN SURFACE ANTIGEN-RELATED PROTEIN | unknown | unknown | x
IPI00973016 | UBIQUITOUSLY TRANSCRIBED TETRATRICOPEPTIDE REPEAT PROTEIN Y-LINKED TRANSCRIPT VARIANT 27 | unknown | unknown | x
IPI00973032 | V1-17 PROTEIN | unknown | unknown | x
IPI00973424 | IGLC1 PUTATIVE UNCHARACTERIZED PROTEIN | unknown | unknown | x
IPI00973474 | PUTATIVE UNCHARACTERIZED PROTEIN (SIMILAR TO IG GAMMA-3 CHAIN C REGION) | 1, 2, 3, 4, 6, 7, 10 | 13, 16 | o
IPI00973513 | IMUNOGLOBULIN HEAVY CHAIN | unknown | unknown | x
IPI00973531 | IGLC2 PUTATIVE UNCHARACTERIZED PROTEIN | unknown | unknown | x
IPI00974544 | ISOFORM SV OF 14-3-3 PROTEIN EPSILON | 1, 2, 3, 4, 6, 7, 10 | 12 | o
IPI00976299 | UNCHARACTERIZED PROTEIN (SIMILAR TO UTEROGLOBIN) | 3 | unknown | o
IPI00976559 | UNCHARACTERIZED PROTEIN (SIMILAR TO DIPHTHAMIDE BIOSYNTHESIS PROTEIN 2) | 4 | unknown | x
IPI00976928 | SIMILAR TO MYOSIN-REACTIVE IMMUNOGLOBULIN HEAVY CHAIN VARIABLE REGION | unknown | unknown | x
IPI00977221 | HYPOTHETICAL PROTEIN LOC100291917 | unknown | unknown | x
IPI00977405 | SIMILAR TO IG KAPPA CHAIN V-III REGION VG PRECURSOR | 3, 4, 6, 10 | 13, 16 | x
IPI00977733 | SIMILAR TO VH-7 FAMILY (N54P3)D/J PROTEIN | unknown | unknown | x
IPI00977788 | SIMILAR TO HEPATITIS B VIRUS RECEPTOR BINDING PROTEIN | unknown | unknown | x
IPI00978208 | PEPTIDYL-PROLYL CIS-TRANS ISOMERASE A-LIKE | 4, 5, 6, 7, 10 | 12, 13, 16 | o
IPI00978930 | IG ALPHA-1 CHAIN C REGION | 1, 2, 3, 4, 5, 6, 7, 10 | 13 | x
IPI00979730 | SIMILAR TO IG KAPPA CHAIN V-II REGION GM607 PRECURSOR | 3, 4, 6, 10 | 13, 16 | x
IPI00980227 | CONSERVED HYPOTHETICAL PROTEIN MUC5B | 1, 2, 4, 10 | 12 | x
IPI00981659 | SIMILAR TO COLD AGGLUTININ FS-1 H-CHAIN | unknown | unknown | x
IPI00982101 | UNCHARACTERIZED PROTEIN | unknown | 12 | o
IPI00983257 | NEURONAL ACETYLCHOLINE RECEPTOR ALPHA-4 SUBUNIT | unknown | unknown | x
IPI00983475 | SIMILAR TO HEPATITIS B VIRUS RECEPTOR BINDING PROTEIN | unknown | unknown | x
IPI00985150 | 11 KDA PROTEIN | unknown | unknown | x
IPI00985211 | SIMILAR TO VH-3 FAMILY (VH26)D/J PROTEIN | unknown | unknown | x
IPI00985251 | PROTEIN (SIMILAR TO UPF0317 PROTEIN C14ORF159) | unknown | unknown | x
IPI01010056 | 58 KDA PROTEIN (SIMILAR TO AMYLASE) | 5 | 12, 13, 16 | x
IPI01011189 | MUC5B UNCHARACTERIZED PROTEIN | unknown | unknown | x
IPI01011784 | UNCHARACTERIZED PROTEIN (SIMILAR TO E3 UBIQUITIN-PROTEIN LIGASE MDM2) | unknown | unknown | o
IPI01012504 | 6-PHOSPHOGLUCONATE DEHYDROGENASE, DECARBOXYLATING | 4, 5 | 16 | o
IPI01013214 | RHEUMATOID FACTOR RF-IP14 | unknown | unknown | x
IPI01014621 | UNCHARACTERIZED PROTEIN (SIMILAR TO ZINC FINGER PROTEIN 705A) | 4, 5, 10 | 11 | x
IPI01014804 | UNCHARACTERIZED PROTEIN | unknown | unknown | x
IPI01015266 | PUTATIVE UNCHARACTERIZED PROTEIN DKFZP686O16217 | unknown | unknown | x
IPI01017938 | IG LAMBDA-6 CHAIN C REGION | 1, 2, 3, 4, 6, 7, 10 | 13, 16 | x
IPI01018060 | IG LAMBDA-3 CHAIN C REGIONS | 1, 2, 3, 4, 6, 7, 10 | 13, 16 | x
IPI01018161 | PYRUVATE KINASE ISOZYMES M1/M2 ISOFORM C | 1, 2, 4, 5, 7, 8, 10 | 11, 12, 13, 16 | o
IPI01018257 | HYPOTHETICAL PROTEIN (SIMILAR TO POLYMERIC IMMUNOGLOBULIN RECEPTOR) | unknown | unknown | o
IPI01018712 | ACTA2 PROTEIN | unknown | unknown | x
IPI01018716 | TOSPEAK-5 | unknown | unknown | x
IPI01018887 | UBIQUITOUSLY TRANSCRIBED TETRATRICOPEPTIDE REPEAT PROTEIN Y-LINKED TRANSCRIPT VARIANT 8 | unknown | unknown | x
IPI01018897 | MUCIN-5AC | unknown | unknown | o
IPI01018949 | CDNA FLJ51266, HIGHLY SIMILAR TO VITRONECTIN | unknown | unknown | x
IPI01022126 | UNCHARACTERIZED PROTEIN (SIMILAR TO KAT8 REGULATORY NSL COMPLEX SUBUNIT 2) | unknown | unknown | x
IPI01022319 | UNCHARACTERIZED PROTEIN (SIMILAR TO TWINFILIN-1) | unknown | unknown | x
IPI01022662 | UNCHARACTERIZED PROTEIN (SIMILAR TO CARBONIC ANHYDRASE 6) | 5 | 13, 16 | x
IPI01022818 | 15 KDA PROTEIN (SIMILAR TO LYSOZYME) | 1, 2, 4 | unknown | o
IPI01022820 | AMYLOID LAMBDA 6 LIGHT CHAIN VARIABLE REGION SAR | unknown | unknown | x
IPI01024840 | PROTEIN (SIMILAR TO X-LINKED RETINITIS PIGMENTOSA GTPASE REGULATOR-INTERACTING PROTEIN 1) | 7 | unknown | x
IPI01025882 | MYOSIN-REACTIVE IMMUNOGLOBULIN LIGHT CHAIN VARIABLE REGION | unknown | unknown | x
IPI01026053 | PROTEIN (SIMILAR TO EPIDIDYMAL SECRETORY PROTEIN E1) | unknown | unknown | x

Distinct proteins observed in Korean whole saliva but not in other human saliva.

1, response to stimulus; 2, response to stress; 3, cell communication; 4, protein metabolic; 5, other primary metabolic; 6, transport; 7, organization and biogenesis; 8, catabolic process; 9, cell homeostasis; 10, regulation of biological process; 11, nucleic acid binding; 12, protein binding; 13, other binding; 14, transporter activity; 15, signal transducer activity; 16, catalytic activity; 17, motor activity; 18, structural regulator; 19, transcription regulator; 20, antioxidant activity; 21, enzyme regulator activity.

To determine the inter-platform variability of the nLC-Q-IMS-TOF system used in this study and to validate protein identities, especially those of the distinct Korean WS proteins, the pooled Korean WS sample was analyzed on both the nLC-Q-IMS-TOF system and an nLC-Q-orbitrap system, and the results were compared. In total, 141 and 208 proteins were identified on the nLC-Q-TOF platform and the nLC-Q-orbitrap platform, respectively, and 98 of the 141 proteins (69.5%) from the nLC-Q-TOF platform were also identified on the nLC-Q-orbitrap platform. Among the proteins identified in the pooled sample, 130 from the nLC-Q-TOF platform and 147 from the nLC-Q-orbitrap platform were within the Korean WS proteome index. Of these, 22 of 130 proteins (16.9%) and 29 of 147 proteins (19.7%), respectively, belonged to the distinct Korean WS proteins. Finally, 68.2% of the distinct proteins from the nLC-Q-IMS-TOF platform were also found on the nLC-Q-orbitrap platform (S3 and S4 Tables and S1 Fig). In addition to protein identities, GO annotations in terms of cellular component, biological process, and molecular function were compared between the Korean WS proteome and the integrated human saliva proteome (Fig 3).
First, in the GO cellular component categories, the Korean WS proteome was significantly over-represented in extracellular space and the plasma membrane but under-represented in organelle, intracellular, cytoplasm, and the cell compared to the integrated human saliva proteome (p < 0.05). The GO biological process categories also showed higher proportions of proteins for response to stimulus, cell communication, protein metabolism, and transport in the Korean WS proteome than in the integrated human saliva proteome (p < 0.05), whereas the opposite tendency was observed for other primary metabolic and organization and biogenesis (p < 0.05). Finally, in the GO molecular function categories, the Korean WS proteome was over-represented in other binding, catalytic activity, antioxidant activity, and enzyme regulator activity and under-represented in protein binding compared to the integrated human saliva proteome (p < 0.05). The allocation of proteins observed in the Korean WS proteome according to their GO annotation can be found in S5–S7 Tables.
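
The per-category comparisons above report p < 0.05 without naming the statistical test in this section. As a rough illustration only, the sketch below swaps in a pooled two-proportion z-test, which is one common way to compare the share of proteins in a GO category between two lists. The function name and all category counts are hypothetical; only the catalogue size of 480 comes from the text.

```python
import math


def two_proportion_z_test(hits_a, total_a, hits_b, total_b):
    """Two-sided pooled z-test for a difference between two proportions."""
    p_a = hits_a / total_a
    p_b = hits_b / total_b
    pooled = (hits_a + hits_b) / (total_a + total_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / total_a + 1 / total_b))
    if se == 0:
        raise ValueError("pooled proportion is 0 or 1; the test is undefined")
    z = (p_a - p_b) / se
    # Two-sided tail probability of the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# HYPOTHETICAL counts for one GO category (not taken from the paper):
# 120 of the 480 Korean WS proteins versus 450 of 3000 reference proteins.
z, p = two_proportion_z_test(120, 480, 450, 3000)
verdict = "over-represented" if z > 0 else "under-represented"
print(f"z = {z:.2f}, p = {p:.2g}, Korean WS proteome {verdict}")
```

A positive z with p < 0.05 would correspond to an asterisked, over-represented category in Fig 3, and a negative z to an under-represented one. With many categories tested at once, a multiple-testing correction would normally be considered as well.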
Fig 3

Relative allocation and comparison of proteins observed in the Korean whole saliva proteome and the integrated human saliva proteome according to their gene ontology annotations in terms of cellular component (A), biological process (B), and molecular function (C). *, p < 0.05.


Distinct Korean WS proteins and diseases

To evaluate the clinical applicability of ethnicity-specific human saliva proteome, 226 proteins observed in the Korean WS proteome, but not in the integrated human saliva proteome, were searched against the Database of disease-related biomarkers [24]. As shown in Table 1 and S2 Table, 50 of 226 distinct Korean WS proteins (22.1%) were found to be disease biomarker candidates. Also, Table 2 and S2 Table indicate that 7–21 of these distinct Korean WS proteins are probably associated with individual conditions representing the top 10 deadliest diseases in South Korea, 2015 (cerebrovascular disease, lung cancer, ischemic heart disease, liver cancer, diabetes mellitus, stomach cancer, colorectal cancer, pancreatic cancer, hypertension, and dementia) [25].
Table 2

Distinct Korean WS proteins probably associated with the top 10 deadliest diseases in South Korea, 2015.

Rank | Disease (A) | The distinct Korean whole saliva proteins probably associated with A
1 | Cerebrovascular disease | IPI00000149, IPI00003842, IPI00022434, IPI00216773, IPI00878282, IPI00966829, IPI00150554, IPI00892724, IPI00218573, IPI00220644, IPI01018161, IPI00305457, IPI00853525, IPI00930404
2 | Lung cancer | IPI00000149, IPI00003842, IPI00020019, IPI00022434, IPI00216773, IPI00878282, IPI00966829, IPI00103636, IPI00150554, IPI00892724, IPI00220644, IPI01018161, IPI00376170, IPI00978208, IPI00478115, IPI00976299, IPI01018257, IPI01011784, IPI01018897, IPI01022818, IPI00930404
3 | Ischemic heart disease | IPI00000149, IPI00020019, IPI00022434, IPI00216773, IPI00878282, IPI00966829, IPI00103636, IPI00150554, IPI00892724, IPI00218573, IPI00305457, IPI00376170, IPI00978208, IPI00411765, IPI00451401, IPI00739539, IPI00853525, IPI01011784, IPI00930404
4 | Liver cancer | IPI00000149, IPI00022434, IPI00216773, IPI00878282, IPI00966829, IPI00150554, IPI00892724, IPI00181753, IPI00220644, IPI01018161, IPI00305457, IPI00451401, IPI00739539, IPI00853525, IPI01018257, IPI01011784, IPI01018897, IPI01022818
5 | Diabetes mellitus | IPI00000149, IPI00018214, IPI00020019, IPI00022434, IPI00216773, IPI00878282, IPI00966829, IPI00150554, IPI00892724, IPI00305457, IPI00853525, IPI00976299, IPI00930404
6 | Stomach cancer | IPI00022434, IPI00216773, IPI00878282, IPI00966829, IPI00305457, IPI01018897, IPI01022818
7 | Colorectal cancer | IPI00000149, IPI00018214, IPI00019226, IPI00022434, IPI00216773, IPI00878282, IPI00966829, IPI00026126, IPI00150554, IPI00892724, IPI00220644, IPI01018161, IPI00739539, IPI01018257, IPI01011784, IPI01018897, IPI01022818
8 | Pancreatic cancer | IPI00022434, IPI00216773, IPI00878282, IPI00966829, IPI00103636, IPI00220644, IPI01018161, IPI00411765, IPI01018897, IPI01022818, IPI00930404
9 | Hypertension | IPI00018214, IPI00022434, IPI00216773, IPI00878282, IPI00966829, IPI00150554, IPI00892724, IPI00218573, IPI00657911, IPI00853525, IPI00976299, IPI01022818, IPI00930404
10 | Dementia | IPI00000149, IPI00003842, IPI00022434, IPI00216773, IPI00878282, IPI00966829, IPI00150554, IPI00892724, IPI00181753, IPI00218573, IPI00220644, IPI01018161, IPI00305457, IPI00376170, IPI00978208, IPI00739539, IPI00853525, IPI00924948, IPI00939278, IPI00930351, IPI00930404
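
The per-disease range (7–21 proteins) and the total of 35 distinct proteins quoted in the text can be re-derived from Table 2 itself. The sketch below transcribes the table into a dictionary and takes the union of accessions across diseases. The variable names are illustrative, and the transcription is an assumption of this sketch that should be checked against S2 Table before reuse.

```python
# Table 2 transcribed as disease -> comma-separated IPI accessions.
TABLE_2 = {
    "Cerebrovascular disease": "IPI00000149, IPI00003842, IPI00022434, IPI00216773, IPI00878282, IPI00966829, IPI00150554, IPI00892724, IPI00218573, IPI00220644, IPI01018161, IPI00305457, IPI00853525, IPI00930404",
    "Lung cancer": "IPI00000149, IPI00003842, IPI00020019, IPI00022434, IPI00216773, IPI00878282, IPI00966829, IPI00103636, IPI00150554, IPI00892724, IPI00220644, IPI01018161, IPI00376170, IPI00978208, IPI00478115, IPI00976299, IPI01018257, IPI01011784, IPI01018897, IPI01022818, IPI00930404",
    "Ischemic heart disease": "IPI00000149, IPI00020019, IPI00022434, IPI00216773, IPI00878282, IPI00966829, IPI00103636, IPI00150554, IPI00892724, IPI00218573, IPI00305457, IPI00376170, IPI00978208, IPI00411765, IPI00451401, IPI00739539, IPI00853525, IPI01011784, IPI00930404",
    "Liver cancer": "IPI00000149, IPI00022434, IPI00216773, IPI00878282, IPI00966829, IPI00150554, IPI00892724, IPI00181753, IPI00220644, IPI01018161, IPI00305457, IPI00451401, IPI00739539, IPI00853525, IPI01018257, IPI01011784, IPI01018897, IPI01022818",
    "Diabetes mellitus": "IPI00000149, IPI00018214, IPI00020019, IPI00022434, IPI00216773, IPI00878282, IPI00966829, IPI00150554, IPI00892724, IPI00305457, IPI00853525, IPI00976299, IPI00930404",
    "Stomach cancer": "IPI00022434, IPI00216773, IPI00878282, IPI00966829, IPI00305457, IPI01018897, IPI01022818",
    "Colorectal cancer": "IPI00000149, IPI00018214, IPI00019226, IPI00022434, IPI00216773, IPI00878282, IPI00966829, IPI00026126, IPI00150554, IPI00892724, IPI00220644, IPI01018161, IPI00739539, IPI01018257, IPI01011784, IPI01018897, IPI01022818",
    "Pancreatic cancer": "IPI00022434, IPI00216773, IPI00878282, IPI00966829, IPI00103636, IPI00220644, IPI01018161, IPI00411765, IPI01018897, IPI01022818, IPI00930404",
    "Hypertension": "IPI00018214, IPI00022434, IPI00216773, IPI00878282, IPI00966829, IPI00150554, IPI00892724, IPI00218573, IPI00657911, IPI00853525, IPI00976299, IPI01022818, IPI00930404",
    "Dementia": "IPI00000149, IPI00003842, IPI00022434, IPI00216773, IPI00878282, IPI00966829, IPI00150554, IPI00892724, IPI00181753, IPI00218573, IPI00220644, IPI01018161, IPI00305457, IPI00376170, IPI00978208, IPI00739539, IPI00853525, IPI00924948, IPI00939278, IPI00930351, IPI00930404",
}


def parse_accessions(cell):
    """Split one table cell into a set of IPI accessions."""
    return {accession.strip() for accession in cell.split(",")}


per_disease = {disease: parse_accessions(cell) for disease, cell in TABLE_2.items()}
counts = {disease: len(proteins) for disease, proteins in per_disease.items()}
# Proteins shared between diseases are counted once in the union.
union = set().union(*per_disease.values())

for disease, n in counts.items():
    print(f"{disease}: {n}")
print("range:", min(counts.values()), "to", max(counts.values()))
print("distinct proteins across all ten diseases:", len(union))
```

Because many accessions recur across diseases, the union is far smaller than the sum of the row counts, which is why the text reports a single total rather than adding the rows.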

Discussion

As the initial step toward determining ethnic differences in the human saliva proteome, the Korean WS proteome was constructed for the first time; Korea was chosen because it is reported to be the most ethnically homogeneous country in the world [26]. A total of 480 proteins are catalogued in the Korean WS proteome (S1 Table), including most of the commonly observed saliva proteins (amylase, cystatins, acidic proline-rich proteins, basic proline-rich proteins, mucins, lactotransferrin, carbonic anhydrase, lysozymes, peroxidases, albumin, and statherins) [11, 27]. This observation indicates that the analytical method employed in the present study performed properly. However, three groups of common saliva proteins (thymosins, defensins, and histatins) were not observed. Although their absence cannot be explained with certainty, loss during sample preparation, the under-sampling issue of mass spectrometry caused by sample complexity, cleavage of these proteins into small peptides, and/or binding of the resulting peptides to tissues may have contributed [11, 27, 28]. To determine ethnic differences in the human saliva proteome, the present Korean WS protein list was compared to the integrated human saliva protein list in two ways. First, comparison of protein identities revealed that 47.1% (226 out of 480) of the proteins were unique to the Korean WS proteome. Such a large proportion of unique proteins was expected, because a similar proportion (54.1%, 100 out of 185 proteins) was previously reported for distinct Korean plasma proteins compared to the human plasma proteome [18]. However, some of these proteins might be common proteins identified for the first time simply because a different analytical technique was employed, which would weaken the link between the distinct Korean WS proteins and ethnic differences in the human saliva proteome.
Thus, to determine the inter-platform variability of the nLC-Q-IMS-TOF system used in this study and, at the same time, to validate protein identities (especially those of the distinct Korean WS proteins), the pooled Korean WS sample was analyzed on both the nLC-Q-IMS-TOF system and an nLC-Q-orbitrap system, a platform widely used for proteomics, and the results were compared. In total, 141 and 208 proteins were identified on the nLC-Q-TOF platform and the nLC-Q-orbitrap platform, respectively, and 98 of the 141 proteins (69.5%) from the nLC-Q-TOF platform were also identified on the nLC-Q-orbitrap platform (S3 and S4 Tables and S1 Fig). Given that a standardized analysis platform shows about 70–80% repeatability (the inner-system comparison) and about 60–80% reproducibility (the inter-system comparison, including inter-platform comparison) in protein identification [29], it can be argued that inter-platform variability has no significant influence on the nLC-Q-IMS-TOF results and that the protein identities in the Korean WS proteome are highly credible. Among the proteins identified in the pooled sample, 130 from the nLC-Q-TOF platform and 147 from the nLC-Q-orbitrap platform were within the Korean WS proteome index. Of these, 22 of 130 proteins (16.9%) and 29 of 147 proteins (19.7%), respectively, belonged to the distinct Korean WS proteins (S3 and S4 Tables and S1 Fig). For every type of protein, the numbers identified in the pooled sample on each platform are smaller than their counterparts in the Korean WS proteome, likely because individual proteins were diluted by saliva pooling and because only a single sample was analyzed.
However, the proportion of the distinct proteins from the nLC-Q-IMS-TOF platform that overlap with those from the nLC-Q-orbitrap platform is still close to 70% (68.2%). Thus, no significant influence of inter-platform variability is observed in our system, lending high credibility to the protein identities in the Korean WS proteome. Interestingly, more of the distinct Korean WS proteins were identified on the nLC-Q-orbitrap platform than on the nLC-Q-IMS-TOF platform (29 vs. 22 proteins; S3 and S4 Tables and S1 Fig). This observation provides additional evidence supporting the identities of the distinct Korean WS proteins. Therefore, since the existence of the distinct Korean WS proteins is better supported and their identification as an artifact of inter-platform variability can largely be excluded, ethnic differences in the human saliva proteome, especially in the Korean WS proteome, become more evident. Although the nLC-Q-IMS-TOF system of this study did not outperform other proteomics platforms, as shown by the relatively small number of proteins identified, the identification of the distinct proteins confirmed that it still performs well for identifying distinct Korean WS proteins. To build a global protein list, most proteomic studies have employed multi-dimensional proteomics techniques to include as many proteins as possible [1, 7–11, 30, 31]. However, such techniques demand enormous analysis time and computing power for protein identification.
Therefore, we chose the combination of nLC-Q-TOF (a single dimensional technique) and IMS (an additional technique to separate ions based on their different mobility in a carrier gas) “on-line” instead of using the conventional multi-dimensional technique [28, 32]. To the best of our knowledge, this is the first study that applies IMS to saliva proteomics. From comparison of GO annotations between the Korean WS proteome and the integrated human saliva proteome, some categories in the Korean WS proteome showed over-representation or under-representation (Fig 3). Regarding their applications to biomarker-related studies, over-represented categories might be more important than under-represented ones due to the probability of finding more meaningful information from more proteins belonging to over-represented categories. In the present study, over-represented GO categories in the Korean WS proteome are as follows: extracellular and plasma membrane of cellular components (Fig 3A), response to stimulus, cell communication, protein metabolic, and transport of biological processes (Fig 3B), and other binding, catalytic activity, antioxidant activity, and enzyme regulatory activity of molecular function (Fig 3C). 
Interestingly, most of them can provide substantial information on diseases because they are connected to disease-related features: extracellular secretion for biological function (the extracellular category of cellular component), the defense system of the body (the response to stimulus category of biological process), cellular signal transduction (the cell communication category of biological process), chemical reactions and pathways involving a specific protein (the protein metabolic category of biological process), positioning of a substance or cellular entity (the transport category of biological process), non-covalent interaction of a non-protein molecule with specific site(s) on another molecule (the other binding category of molecular function), catalysis of a biochemical reaction (the catalytic activity category of molecular function), inhibition of oxidation (the antioxidant activity category of molecular function), and/or modulation (by direct binding) of the activity of an enzyme (the enzyme regulator activity category of molecular function) [1, 2, 10, 33]. Additionally, the over-representation of the protein metabolic and catalytic activity categories in the Korean WS proteome compared with the integrated human saliva proteome may be consistent with its larger proportion of proteins with a molecular weight of less than 60 kDa (82.3%, Fig 1A) than that in Yan et al.'s report (68%) [1], which partially results from the cleavage of higher-molecular-weight proteins. In line with these findings, our results suggest another clue for discovering ethnic differences in the human saliva proteome and the possibility of using such differences for early diagnosis and/or prognosis of diseases. To further evaluate the clinical applicability of the ethnicity-specific human saliva proteome, the 226 distinct proteins observed in Korean WS, but not in other human saliva, were searched against the Database of disease-related biomarkers.
As a result, 22.1% (50 out of 226) of these distinct proteins were found to be disease biomarker candidates (Table 1 and S2 Table), firmly supporting the probable value of the ethnicity-specific human saliva proteome for disease biomarker applications. Also, each of the top 10 deadliest diseases in South Korea in 2015 (cerebrovascular disease, lung cancer, ischemic heart disease, liver cancer, diabetes mellitus, stomach cancer, colorectal cancer, pancreatic cancer, hypertension, and dementia) has at least 7 disease biomarker candidates among the distinct Korean WS proteins (Table 2 and S2 Table) [25]. The total number of distinct Korean WS proteins probably associated with the top 10 deadliest diseases in South Korea is 35, representing 70.0% (35 out of 50) of the disease biomarker candidates among the distinct Korean WS proteins (Tables 1 and 2 and S2 Table). Thus, this result clearly shows that ethnicity-specific human saliva proteins have diagnostic potential for diseases highly prevalent in that ethnic group. However, this study has a couple of limitations. First, as mentioned above, it did not employ any multi-dimensional separation technique, and, as a result, a relatively small number of proteins was catalogued in the Korean WS proteome index. Interestingly, however, this limited performance must have played an important role in supporting ethnicity-related differences in human saliva, because the system did not seem to produce any significant platform-specific performance, the source of inter-platform variability. Second, since WS samples were collected from only eleven young male adult volunteers, there are concerns about gender bias as well as a lack of representativeness because of the narrow age range of the participants.
Thus, the expansion of the Korean WS proteome by analyzing more samples, including female WS and a broader range of participant ages, by using the nLC-Q-IMS-TOF system or a multi-dimensional proteomics technique is expected in the near future.
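
The percentages quoted throughout the Discussion can be rechecked from the counts printed next to them. The sketch below is only an arithmetic cross-check; the labels are paraphrases, and the count of 15 in the last entry is an assumption back-calculated from 68.2% of 22, since that count is not printed in this section.

```python
# label -> (part, whole, percentage as printed in the text)
REPORTED = {
    "distinct proteins in the 480-protein catalogue": (226, 480, 47.1),
    "distinct Korean plasma proteins [18]": (100, 185, 54.1),
    "nLC-Q-TOF proteins also found by nLC-Q-orbitrap": (98, 141, 69.5),
    "distinct proteins among nLC-Q-TOF hits in the index": (22, 130, 16.9),
    "distinct proteins among nLC-Q-orbitrap hits in the index": (29, 147, 19.7),
    "biomarker candidates among distinct proteins": (50, 226, 22.1),
    "candidates linked to the top 10 diseases": (35, 50, 70.0),
    # ASSUMPTION: 15 is inferred from the reported 68.2% of 22 proteins.
    "distinct nLC-Q-IMS-TOF proteins confirmed by orbitrap": (15, 22, 68.2),
}


def percent(part, whole):
    """Percentage rounded to one decimal place, as in the text."""
    return round(100 * part / whole, 1)


# Collect any entry whose recomputed value differs from the printed one.
mismatches = {
    label: (percent(part, whole), shown)
    for label, (part, whole, shown) in REPORTED.items()
    if percent(part, whole) != shown
}

for label, (part, whole, shown) in REPORTED.items():
    print(f"{label}: {part}/{whole} = {percent(part, whole)}% (text: {shown}%)")
print("mismatches:", mismatches)
```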

Conclusions

In this study, a Korean WS proteome catalogue indexing 480 proteins was built and characterized for the first time through nLC-Q-IMS-TOF analyses of WS samples collected from eleven healthy Korean male adult volunteers. Comparison of the Korean WS proteome with the integrated human saliva proteome in terms of protein identities and GO annotations provides evidence that strongly supports ethnic differences in the human saliva proteome. Additionally, the potential value of ethnicity-specific human saliva proteins as biomarkers for diseases highly prevalent in that ethnic group was confirmed by finding 35 distinct Korean WS proteins probably associated with the top 10 deadliest diseases in South Korea. Finally, the present Korean WS protein list can serve as the first level reference for future proteomic studies including disease biomarker studies on Korean saliva.

A total of 480 Korean whole saliva proteins identified in the present study.

Among multiple results on a certain protein from different sample and replicate analyses, only one with the highest PLGS score was selected for this table. (XLSX)

Distinct proteins observed in Korean whole saliva but not in other human saliva and their probable association with diseases including the top 10 deadliest diseases in South Korea, 2015.

(XLSX)

Proteins identified from the nLC-Q-IMS-TOF analysis of pooled Korean whole saliva.

(XLSX)

Proteins identified from the nLC-Q-orbitrap analysis of pooled Korean whole saliva.

(XLSX)

Allocation of proteins observed in the Korean whole saliva proteome according to their gene ontology annotation in terms of cellular component.

(XLSX)

Allocation of proteins observed in the Korean whole saliva proteome according to their gene ontology annotation in terms of biological process.

(XLSX)

Allocation of proteins observed in the Korean whole saliva proteome according to their gene ontology annotation in terms of molecular function.

(XLSX)

Venn diagrams illustrating the number of proteins specific to either the nLC-Q-IMS-TOF analysis of pooled Korean whole saliva or the nLC-Q-orbitrap analysis of pooled Korean whole saliva proteome and those observed in both proteomes.

Total proteins identified (A). Proteins which belong to the Korean whole saliva proteome (B). Proteins which belong to the distinct Korean whole saliva proteins (C). (PPTX)

Supplementary materials and methods.

(DOCX)
