GlycoMine: a machine learning-based approach for predicting N-, C- and O-linked glycosylation in the human proteome.

Fuyi Li¹, Chen Li¹, Mingjun Wang¹, Geoffrey I Webb¹, Yang Zhang¹, James C Whisstock², Jiangning Song³.

Abstract

MOTIVATION: Glycosylation is a ubiquitous type of protein post-translational modification (PTM) in eukaryotic cells, which plays vital roles in various biological processes (BPs) such as cellular communication, ligand recognition and subcellular recognition. It is estimated that >50% of the entire human proteome is glycosylated. However, it is still a significant challenge to identify glycosylation sites, which requires expensive/laborious experimental research. Thus, bioinformatics approaches that can predict the glycan occupancy at specific sequons in protein sequences would be useful for understanding and utilizing this important PTM.
RESULTS: In this study, we present a novel bioinformatics tool called GlycoMine, which is a comprehensive tool for the systematic in silico identification of C-linked, N-linked, and O-linked glycosylation sites in the human proteome. GlycoMine was developed using the random forest algorithm and evaluated based on a well-prepared up-to-date benchmark dataset that encompasses all three types of glycosylation sites, which was curated from multiple public resources. Heterogeneous sequences and functional features were derived from various sources, and subjected to further two-step feature selection to characterize a condensed subset of optimal features that contributed most to the type-specific prediction of glycosylation sites. Five-fold cross-validation and independent tests show that this approach significantly improved the prediction performance compared with four existing prediction tools: NetNGlyc, NetOGlyc, EnsembleGly and GPP. We demonstrated that this tool could identify candidate glycosylation sites in case study proteins and applied it to identify many high-confidence glycosylation target proteins by screening the entire human proteome.
AVAILABILITY AND IMPLEMENTATION: The webserver, Java Applet, user instructions, datasets, and predicted glycosylation sites in the human proteome are freely available at http://www.structbioinfor.org/Lab/GlycoMine/.
CONTACT: Jiangning.Song@monash.edu or James.Whisstock@monash.edu or zhangyang@nwsuaf.edu.cn
SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
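
The abstract refers to predicting glycan occupancy at specific sequons. As a rough illustration only, and not GlycoMine's actual pipeline, the sketch below scans a sequence for the canonical N-linked sequon N-X-S/T (X is any residue except proline) and extracts a fixed-width window around each candidate asparagine, the kind of input a site-level classifier could be trained on. The function name `candidate_windows`, the flank size, the padding character, and the example sequence are all assumptions made for this example. C- and O-linked sites do not follow this motif and are not covered.

```python
import re

# Canonical N-linked sequon: Asn, any residue except Pro, then Ser or Thr.
# A lookahead is used so that overlapping sequons are all reported.
SEQUON = re.compile(r"N(?=[^P][ST])")


def candidate_windows(sequence, flank=7, pad="-"):
    """Return (1-based position, window) pairs for each N-X-S/T sequon.

    `flank` and `pad` are illustrative choices, not values from the paper.
    """
    # Pad both ends so windows near the termini keep a constant width.
    padded = pad * flank + sequence + pad * flank
    windows = []
    for match in SEQUON.finditer(sequence):
        i = match.start()  # 0-based index of the Asn in `sequence`
        # In `padded`, the Asn sits at i + flank, so this slice centres it.
        window = padded[i : i + 2 * flank + 1]
        windows.append((i + 1, window))
    return windows


# Made-up sequence for demonstration; not a real protein.
example = "MKNGSAPNPTQLNNTS"
for pos, win in candidate_windows(example, flank=3):
    print(pos, win)
```

The asparagine at position 8 in the example is skipped because it is followed by proline. Matching the sequon only nominates candidate sites; whether a site is actually occupied is the question a trained predictor would have to answer.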

Entities: Chemical Species

Substances:
Amino Acids
Proteome

Year: 2015 PMID: 25568279 DOI: 10.1093/bioinformatics/btu852

Source DB: PubMed Journal: Bioinformatics ISSN: 1367-4803 Impact factor: 6.937

Cited

53 in total

1. PRISMOID: a comprehensive 3D structure database for post-translational modifications and mutations with functional impact.

Authors: Fuyi Li; Cunshuo Fan; Tatiana T Marquez-Lago; André Leier; Jerico Revote; Cangzhi Jia; Yan Zhu; A Ian Smith; Geoffrey I Webb; Quanzhong Liu; Leyi Wei; Jian Li; Jiangning Song
Journal: Brief Bioinform Date: 2020-05-21 Impact factor: 11.622

2. Critical evaluation of bioinformatics tools for the prediction of protein crystallization propensity. (Review)

Authors: Huilin Wang; Liubin Feng; Geoffrey I Webb; Lukasz Kurgan; Jiangning Song; Donghai Lin
Journal: Brief Bioinform Date: 2018-09-28 Impact factor: 11.622

3. Application of Proteomics Technologies in Oil Palm Research. (Review)

Authors: Benjamin Yii Chung Lau; Abrizah Othman; Umi Salamah Ramli
Journal: Protein J Date: 2018-12 Impact factor: 2.371

4. MULTiPly: a novel multi-layer predictor for discovering general and specific types of promoters.

Authors: Meng Zhang; Fuyi Li; Tatiana T Marquez-Lago; André Leier; Cunshuo Fan; Chee Keong Kwoh; Kuo-Chen Chou; Jiangning Song; Cangzhi Jia
Journal: Bioinformatics Date: 2019-09-01 Impact factor: 6.937

5. Twenty years of bioinformatics research for protease-specific substrate and cleavage site prediction: a comprehensive revisit and benchmarking of existing methods.

Authors: Fuyi Li; Yanan Wang; Chen Li; Tatiana T Marquez-Lago; André Leier; Neil D Rawlings; Gholamreza Haffari; Jerico Revote; Tatsuya Akutsu; Kuo-Chen Chou; Anthony W Purcell; Robert N Pike; Geoffrey I Webb; A Ian Smith; Trevor Lithgow; Roger J Daly; James C Whisstock; Jiangning Song
Journal: Brief Bioinform Date: 2019-11-27 Impact factor: 11.622

6. Computational prediction and interpretation of both general and specific types of promoters in Escherichia coli by exploiting a stacked ensemble-learning framework.

Authors: Fuyi Li; Jinxiang Chen; Zongyuan Ge; Ya Wen; Yanwei Yue; Morihiro Hayashida; Abdelkader Baggag; Halima Bensmail; Jiangning Song
Journal: Brief Bioinform Date: 2021-03-22 Impact factor: 11.622

7. Quokka: a comprehensive tool for rapid and accurate prediction of kinase family-specific phosphorylation sites in the human proteome.

Authors: Fuyi Li; Chen Li; Tatiana T Marquez-Lago; André Leier; Tatsuya Akutsu; Anthony W Purcell; A Ian Smith; Trevor Lithgow; Roger J Daly; Jiangning Song; Kuo-Chen Chou
Journal: Bioinformatics Date: 2018-12-15 Impact factor: 6.937

8. Discriminating cirRNAs from other lncRNAs using a hierarchical extreme learning machine (H-ELM) algorithm with feature selection.

Authors: Lei Chen; Yu-Hang Zhang; Guohua Huang; Xiaoyong Pan; ShaoPeng Wang; Tao Huang; Yu-Dong Cai
Journal: Mol Genet Genomics Date: 2017-09-14 Impact factor: 3.291

9. DeepCleave: a deep learning predictor for caspase and matrix metalloprotease substrates and cleavage sites.

Authors: Fuyi Li; Jinxiang Chen; André Leier; Tatiana Marquez-Lago; Quanzhong Liu; Yanze Wang; Jerico Revote; A Ian Smith; Tatsuya Akutsu; Geoffrey I Webb; Lukasz Kurgan; Jiangning Song
Journal: Bioinformatics Date: 2020-02-15 Impact factor: 6.937

10. Large-scale comparative assessment of computational predictors for lysine post-translational modification sites.

Authors: Zhen Chen; Xuhan Liu; Fuyi Li; Chen Li; Tatiana Marquez-Lago; André Leier; Tatsuya Akutsu; Geoffrey I Webb; Dakang Xu; Alexander Ian Smith; Lei Li; Kuo-Chen Chou; Jiangning Song
Journal: Brief Bioinform Date: 2019-11-27 Impact factor: 11.622