
Computational analysis of Concanavalin A binding glycoproteins of human seminal plasma.

Anil Kumar Tomar, Balwinder Singh Sooch, Savita Yadav.   

Abstract

Glycoproteins have immense clinical importance, and comparative glycoproteomics has become a powerful tool for biomarker discovery and disease diagnosis. Seminal plasma glycoproteins participate in fertility-related processes including sperm-egg recognition, modulation of capacitation and inhibition of the acrosome reaction. Affinity chromatography with a broad-specificity lectin such as Concanavalin A (Con A) is widely used for glycoprotein enrichment. Notably, the Con A-interacting fraction of human seminal plasma has decapacitating activity, which makes this fraction critically important. In our previous study, we isolated Con A-interacting glycoproteins from human seminal plasma and identified them by mass spectrometry. Here, we report the computational analysis of these proteins using bioinformatics tools. The analysis includes prediction of glycosylation sites from sequence information (NetNGlyc 1.0), functional annotation to cluster the proteins into functional groups (InterProScan and Blast2GO) and identification of protein interaction networks (STRING database). The results indicate that these proteins are involved in various biological processes including transport, morphogenesis, metabolic processes, cell differentiation and homeostasis. The clusters reveal two major molecular functions: hydrolase activity (6 proteins) and binding of protein (4), carbohydrate (1) or lipid (1). The large interactomes of these proteins point towards their versatile roles in a wide range of biological processes.


Year:  2011        PMID: 21938208      PMCID: PMC3174039          DOI: 10.6026/97320630007069

Source DB:  PubMed          Journal:  Bioinformation        ISSN: 0973-2063


Background

Glycosylation is one of the most common post-translational modifications, and more than half of all mammalian proteins are glycosylated [1]. Studies on the isolation, discovery and subsequent identification of glycosylated proteins are becoming increasingly important in glycoproteomics and disease diagnosis [2]. In particular, differential glycosylation (e.g. missing, aberrant or additional glycans) is known to be linked to certain diseases and may be used as a marker for diagnosis and/or therapeutic monitoring [3]. Glycoproteins play essential roles in controlling biological processes in immunology, cancer, protein folding, host-pathogen interactions, human diseases and signal transduction. Broad-specificity lectins, such as Concanavalin A (Con A), are widely applied for enriching serum glycoproteins [4].

Human seminal plasma contains a large array of clinically important proteins that are needed to maintain the reproductive physiology of spermatozoa and for successful fertilization. Seminal plasma glycoproteins are known to participate in sperm-egg recognition [5], modulation of capacitation [6, 7] and acrosome reaction inhibition [8]. Moreover, the Con A-interacting fraction of human seminal plasma is reported to have decapacitating activity [9]. Functional analysis of the proteins in this fraction is therefore of great importance for a better understanding of fertility-related processes. We previously isolated glycoproteins from human seminal plasma by lectin affinity chromatography using Con A-agarose. Ten protein bands on the SDS-PAGE gel, corresponding to nine different proteins, were identified by MALDI-TOF/MS analysis, viz. aminopeptidase N precursor (ANPEP), lactoferrin (LTF), prostatic acid phosphatase (ACPP), human zinc-alpha-2-glycoprotein (AZGP1), prostate specific antigen (KLK3), progestagen-associated endometrial protein (PAEP), kinesin light chain 4 (KLC4), izumo sperm-egg fusion protein 1 (IZUMO1) and prolactin inducible protein (PIP) [10].

A number of bioinformatics tools are available for in silico analysis of proteins isolated and identified in protein chemistry laboratories. These analyses help in understanding the functional roles of new proteins in various biological processes. Hence, we report the computational analysis of the identified Con A binding glycoproteins using various bioinformatics tools. The objectives of this study are (1) to predict glycosylation sites from sequence information and compare the results with available experimental data, (2) to perform functional annotation using InterPro and Blast2GO and cluster these proteins into functional groups, and (3) to identify protein-protein interaction (PPI) networks.

Methodology

Sequence Retrieval

The amino acid sequences of the glycoproteins ANPEP (P15144), LTF (P02788), ACPP (P15309), AZGP1 (P25311), KLK3 (P07288), PAEP (Q5T6T6), KLC4 (Q9NSK0), IZUMO1 (Q8IYV9) and PIP (P12273) were retrieved in FASTA format from the UniProt Knowledgebase (UniProtKB) [11].
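As a minimal sketch of how this retrieval step could be scripted today, the following Python builds one download URL per accession and parses FASTA text into sequences. The URL pattern is an assumption about the current UniProt REST service, not the route used in the original study, and the helper names `fasta_url` and `parse_fasta` are hypothetical. No network call is made, and whether every accession still resolves is not checked here.

```python
from typing import Dict

# Accessions exactly as listed in the text.
ACCESSIONS = {
    "ANPEP": "P15144", "LTF": "P02788", "ACPP": "P15309",
    "AZGP1": "P25311", "KLK3": "P07288", "PAEP": "Q5T6T6",
    "KLC4": "Q9NSK0", "IZUMO1": "Q8IYV9", "PIP": "P12273",
}


def fasta_url(accession: str) -> str:
    """Build a FASTA download URL (assumed UniProt REST pattern)."""
    return f"https://rest.uniprot.org/uniprotkb/{accession}.fasta"


def parse_fasta(text: str) -> Dict[str, str]:
    """Map each FASTA header (without '>') to its concatenated sequence."""
    records: Dict[str, str] = {}
    header = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            header = line[1:]
            records[header] = ""
        elif header is not None:
            records[header] += line
    return records


if __name__ == "__main__":
    for gene, acc in ACCESSIONS.items():
        print(gene, fasta_url(acc))
```

The downloaded records can be concatenated into a single multi-FASTA file, which is the input form the batch annotation step expects.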

Prediction of glycosylation sites and comparison with known sites

Possible N-glycosylation sites were predicted using the NetNGlyc 1.0 program [12]. This program predicts N-glycosylation sites in human proteins using artificial neural networks that examine the sequence context of N-X-S/T (Asn-Xaa-Ser/Thr) sequons. The predicted sites were compared with the known sites in these proteins, as evidenced by direct experiments [13].
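The N-X-S/T motif mentioned above can be located with a simple pattern scan, sketched below. This is not NetNGlyc: the neural networks score each candidate site, whereas this code only lists candidate positions. Excluding proline at the X position is a standard convention assumed here rather than something stated in the text, and the function name `find_sequons` is hypothetical.

```python
import re
from typing import List

# A lookahead lets overlapping sequons (e.g. in "NNSS") each be reported.
# X is any residue except proline (assumption: standard sequon definition).
SEQUON = re.compile(r"(?=N[^P][ST])")


def find_sequons(sequence: str) -> List[int]:
    """Return 1-based positions of Asn residues that start an N-X-S/T motif."""
    return [m.start() + 1 for m in SEQUON.finditer(sequence.upper())]


if __name__ == "__main__":
    # Toy sequence, not one of the nine glycoproteins.
    print(find_sequons("MNGSANPTNNSS"))
```

In the toy sequence, the asparagine followed by proline is skipped, while the two adjacent asparagines near the end are both reported because their motifs overlap.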

Statistical analysis of amino acid content

The statistical analysis of the amino acid content of each protein was performed using the program Pepstat [14]. This basic statistical tool calculates the percentage share of individual amino acids in a protein sequence as well as the shares of nine specific amino acid groups, namely tiny, small, aliphatic, aromatic, polar, non-polar, charged, basic and acidic.
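The group shares described above amount to counting residues per group and dividing by sequence length. The sketch below uses the seven group definitions given in the Results section of this article; the "tiny" and "small" groups are left out because their membership is not stated in the text. The function name `group_mole_percent` is hypothetical, and this is an illustration rather than the Pepstat implementation.

```python
from collections import Counter
from typing import Dict

# Group membership as listed in the Results section of this article.
GROUPS = {
    "aliphatic": "AILV",
    "aromatic": "FHWY",
    "non-polar": "ACFGILMPVWY",
    "polar": "DEHKNQRSTZ",
    "charged": "BDEHKRZ",
    "basic": "HKR",
    "acidic": "BDEZ",
}


def group_mole_percent(sequence: str) -> Dict[str, float]:
    """Return the mole percentage of each residue group in a sequence."""
    counts = Counter(sequence.upper())
    total = sum(counts.values())
    if total == 0:
        raise ValueError("empty sequence")
    return {
        name: 100.0 * sum(counts[aa] for aa in members) / total
        for name, members in GROUPS.items()
    }


if __name__ == "__main__":
    # Toy sequence, not one of the nine glycoproteins.
    for name, value in group_mole_percent("AILVKDEF").items():
        print(f"{name}: {value:.2f}")
```

Running this per protein and then taking the mean and standard deviation across the nine sequences would give summary values of the kind reported in the Results.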

Functional annotations and clustering using Blast2GO and InterProScan

InterProScan is a popular program suite for protein sequence analysis and classification [15]. It classifies sequences at various levels, such as superfamily, family and subfamily, and predicts the occurrence of functional domains and repeats. InterPro analysis was performed on the glycoproteins to identify their subcellular locations and functions. Functional annotation was also carried out using Blast2GO, and the proteins were subsequently grouped into functional clusters [16]. All protein sequences were arranged in a single file in FASTA format and uploaded to the Blast2GO software suite [17] to facilitate batch handling of sequence data. The file was processed in three steps: batch-mode blastp, mapping to retrieve the GO terms associated with each blast hit, and Gene Ontology annotation. The program then assigns refined functional terms to each query based on their functions, statistical testing and InterProScan analysis. Finally, the retrieved information was used to represent the results (cellular components, biological processes and molecular functions) graphically as pie charts.
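The final grouping step, turning per-protein annotations into functional clusters, can be illustrated by inverting a protein-to-term mapping. The sketch below is not Blast2GO; it hard-codes the molecular-function assignments reported in the Results section of this article, and the names `ANNOTATIONS` and `cluster_by_term` are hypothetical. IZUMO1 is omitted because the text lists no molecular-function term for it.

```python
from collections import defaultdict
from typing import Dict, List

# Molecular-function terms per protein, as reported in the Results section.
ANNOTATIONS: Dict[str, List[str]] = {
    "ANPEP": ["hydrolase activity"],
    "LTF": ["hydrolase activity", "protein binding", "carbohydrate binding"],
    "ACPP": ["hydrolase activity"],
    "AZGP1": ["hydrolase activity", "protein binding", "lipid binding"],
    "KLK3": ["hydrolase activity"],
    "PAEP": ["hydrolase activity"],
    "KLC4": ["protein binding"],
    "PIP": ["protein binding"],
}


def cluster_by_term(annotations: Dict[str, List[str]]) -> Dict[str, List[str]]:
    """Invert protein -> terms into term -> proteins (one cluster per term)."""
    clusters: Dict[str, List[str]] = defaultdict(list)
    for protein, terms in annotations.items():
        for term in terms:
            clusters[term].append(protein)
    return dict(clusters)


if __name__ == "__main__":
    for term, members in cluster_by_term(ANNOTATIONS).items():
        print(f"{term} ({len(members)}): {', '.join(members)}")
```

The cluster sizes are the quantities a pie chart of molecular functions would display. A protein with several terms, such as LTF, is counted once in each of its clusters.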

Prediction of protein-protein interaction (PPI) networks

PPI networks for each protein were retrieved from the STRING database [18, 19]. This database consists of known and predicted protein interactions, covering both direct (physical) and indirect (functional) associations. It quantitatively integrates interaction data from four sources: genomic context, high-throughput experiments, co-expression and previous knowledge from research publications.
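One common way to integrate several independent evidence channels into a single confidence value is to treat each per-channel score as a probability and combine them as the chance that at least one channel is correct. The sketch below shows only this generic combination, under the assumption that the channels are independent. It is a simplification: STRING's actual scoring applies further corrections that are not reproduced here, and the function name `combine_evidence` is hypothetical.

```python
from typing import Iterable


def combine_evidence(scores: Iterable[float]) -> float:
    """Combine per-channel scores in [0, 1] as 1 - prod(1 - s_i).

    Assumes the channels are independent (a simplification).
    """
    all_miss = 1.0
    for s in scores:
        if not 0.0 <= s <= 1.0:
            raise ValueError(f"score out of range: {s}")
        all_miss *= 1.0 - s
    return 1.0 - all_miss


if __name__ == "__main__":
    # Hypothetical scores for the four sources named in the text:
    # genomic context, experiments, co-expression, literature.
    print(round(combine_evidence([0.2, 0.3, 0.4, 0.5]), 3))
```

Under this scheme the combined value is never lower than the strongest single channel, so several weak lines of evidence can add up to a moderately confident link.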

Results and Discussion

The predicted N-glycosylation sites in the Con A binding glycoproteins were compared with the experimentally known sites in these proteins (see Table 1 in supplementary material). AZGP1, LTF, ACPP, KLK3 and PIP have 4, 3, 3, 1 and 1 known N-glycosylation sites respectively, all of which were correctly predicted by NetNGlyc 1.0. The program predicts 11 potential N-glycosylation sites in ANPEP (N42, N128, N234, N265, N319, N527, N573, N625, N681, N735 and N818), of which six are previously known (N128, N234, N265, N573, N681 and N818). IZUMO1 has one known N-glycosylation site at N204, and the predictions indicate another potential glycosylation site at N239. PAEP and KLC4 have no known N-glycosylation sites; NetNGlyc 1.0 predicts 2 (N33, N55) and 1 (N4) glycosylation sites in them respectively.

The Pepstat results show that the nine glycoprotein sequences have the following mean mole percentages of the different residue types: Aliphatic (A+I+L+V) = 29.13 ± 6.81; Aromatic (F+H+W+Y) = 11.25 ± 2.37; Non-polar (A+C+F+G+I+L+M+P+V+W+Y) = 53.73 ± 4.12; Polar (D+E+H+K+N+Q+R+S+T+Z) = 46.16 ± 4.12; Charged (B+D+E+H+K+R+Z) = 25.46 ± 3.83; Basic (H+K+R) = 13.30 ± 1.80; Acidic (B+D+E+Z) = 12.13 ± 2.36.

InterProScan results were integrated into the Blast2GO analysis to increase the confidence of the functional clustering. The final outputs of the functional annotation are shown in Figure 1 and Table 2 (see supplementary material). The annotations indicate that these proteins are involved in various biological processes including transport (LTF, PAEP), morphogenesis (ANPEP, KLK3), metabolic processes (ANPEP, AZGP1, KLK3), cell differentiation (ANPEP) and homeostasis (LTF). ACPP and IZUMO1 have reported roles in hydrolysis and reproduction (sperm-egg fusion) respectively, but the exact biological processes in which they are involved are still unknown. The functional clusters show that the Con A binding glycoproteins have two major molecular functions: hydrolase activity (ANPEP, LTF, ACPP, AZGP1, KLK3, PAEP) and binding of protein (LTF, AZGP1, KLC4, PIP), carbohydrate (LTF) or lipid (AZGP1). The subcellular localizations of these proteins are also shown in the results, indicating that most of them originate from different cellular components. These proteins play important roles in various biological processes related to fertility and infertility, and their expression regulates processes required for successful fertilization. Their key roles in reproductive physiology have been discussed previously [10].

PPI networks for these glycoproteins are shown in Figure 2(A-I). The large interactomes of most of the proteins point towards their versatile roles in a wide range of biological processes. Thus, in-depth characterization of these proteins may reveal that they are more important and multifaceted than has long been assumed.
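The comparison of predicted and known N-glycosylation sites reported for ANPEP in this section reduces to set operations on residue positions. The sketch below uses the positions given in the text; the function name `compare_sites` and the three labels are hypothetical.

```python
from typing import Dict, List, Set

# ANPEP positions as reported in the text.
ANPEP_PREDICTED = {42, 128, 234, 265, 319, 527, 573, 625, 681, 735, 818}
ANPEP_KNOWN = {128, 234, 265, 573, 681, 818}


def compare_sites(predicted: Set[int], known: Set[int]) -> Dict[str, List[int]]:
    """Split sites into confirmed, predicted-only and known-but-missed."""
    return {
        "confirmed": sorted(predicted & known),
        "novel": sorted(predicted - known),
        "missed": sorted(known - predicted),
    }


if __name__ == "__main__":
    for label, sites in compare_sites(ANPEP_PREDICTED, ANPEP_KNOWN).items():
        print(label, [f"N{p}" for p in sites])
```

For ANPEP this yields six confirmed sites, five predicted-only candidates (N42, N319, N527, N625 and N735) and no known site that the prediction missed.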
Figure 1

Blast2GO analysis of Con A binding glycoproteins of human seminal plasma (see Table 2 in supplementary material)

Figure 2

Protein interaction networks (see Tables 3-11 in supplementary material)

Conclusion

Computational tools aid the functional characterization of biomolecules by identifying their homologs in biological databases and retrieving information from published research articles. They help researchers guide future studies towards in vivo or in vitro functional characterization. We have predicted the N-glycosylation sites of Con A binding glycoproteins isolated from human seminal plasma, clustered the proteins into functional groups and mapped their interactomes. This analysis contributes to a better understanding of the functional aspects of these proteins in reproductive physiology.
References

Review 1.  On the frequency of protein glycosylation, as deduced from analysis of the SWISS-PROT database.

Authors:  R Apweiler; H Hermjakob; N Sharon
Journal:  Biochim Biophys Acta       Date:  1999-12-06

Review 2.  Protein glycosylation and diseases: blood and urinary oligosaccharides as markers for diagnosis and therapeutic monitoring.

Authors:  G Durand; N Seta
Journal:  Clin Chem       Date:  2000-06       Impact factor: 8.327

3.  Selective isolation of glycoproteins and glycopeptides for MALDI-TOF MS detection supported by magnetic particles.

Authors:  Katrin Sparbier; Sonja Koch; Irina Kessler; Thomas Wenzel; Markus Kostrzewa
Journal:  J Biomol Tech       Date:  2005-12

4.  Boar spermadhesins AQN-1 and AQN-3: oligosaccharide and zona pellucida binding characteristics.

Authors:  J J Calvete; E Carrera; L Sanz; E Töpfer-Petersen
Journal:  Biol Chem       Date:  1996 Jul-Aug       Impact factor: 3.915

5.  Human sperm coating antigen from seminal plasma origin.

Authors:  A Iborra; C Morte; P Fuentes; V García-Framis; P Andolz; P Martínez
Journal:  Am J Reprod Immunol       Date:  1996-08       Impact factor: 3.886

6.  Efficacy of glycoprotein enrichment by microscale lectin affinity chromatography.

Authors:  Milan Madera; Benjamin Mann; Yehia Mechref; Milos V Novotny
Journal:  J Sep Sci       Date:  2008-08       Impact factor: 3.645

7.  Purification and partial characterization of acrosome reaction inhibiting glycoprotein from human seminal plasma.

Authors:  R C Drisdel; S R Mack; R A Anderson; L J Zaneveld
Journal:  Biol Reprod       Date:  1995-07       Impact factor: 4.285

8.  Identification of gp17 glycoprotein and characterization of prostatic acid phosphatase (PAP) and carboxypeptidase E (CPE) fragments in a human seminal plasma fraction interacting with concanavalin A.

Authors:  A C Marquínez; A M Andreetta; N González; C Wolfenstein-Todel; J M Scacciati de Cerezo
Journal:  J Protein Chem       Date:  2003-07

9.  Phosphatidylcholine-binding proteins of bovine seminal plasma modulate capacitation of spermatozoa by heparin.

Authors:  I Thérien; G Bleau; P Manjunath
Journal:  Biol Reprod       Date:  1995-06       Impact factor: 4.285

10.  High-throughput functional annotation and data mining with the Blast2GO suite.

Authors:  Stefan Götz; Juan Miguel García-Gómez; Javier Terol; Tim D Williams; Shivashankar H Nagaraj; María José Nueda; Montserrat Robles; Manuel Talón; Joaquín Dopazo; Ana Conesa
Journal:  Nucleic Acids Res       Date:  2008-04-29       Impact factor: 16.971
