Authors: Jialin Yu, Shaoping Shi, Fang Zhang, Guodong Chen, Man Cao. Affiliation (all authors): Department of Mathematics and Numerical Simulation and High-Performance Computing Laboratory, School of Sciences, Nanchang University, Nanchang, China.
Abstract
MOTIVATION: Protein glycation is a common post-translational modification (PTM) that proceeds through a two-step non-enzymatic reaction. Glycation not only impairs the function of proteins but also changes their characteristics, and it is therefore associated with many human diseases. Systematic detection of glycation sites remains difficult because glycated residues lack clear sequence patterns. Computational approaches, which can filter putative sites prior to experimental verification, can greatly increase the efficiency of experimental work. However, the previous lysine glycation prediction method was trained on a small dataset, so the model is neither well generalized nor broadly applicable. RESULTS: By searching a new database, we collected a large dataset for Homo sapiens. PredGly, a novel tool developed by combining multiple features, predicts lysine glycation sites in H. sapiens. In addition, XGBoost was adopted to optimize the feature vectors and to improve model performance. In a comparison of various classifiers, the support vector machine achieved the best performance. On a new independent test set, PredGly outperformed other glycation tools. These results suggest that PredGly can provide more instructive guidance for further experimental research on lysine glycation. AVAILABILITY AND IMPLEMENTATION: https://github.com/yujialinncu/PredGly. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
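To make the idea of sequence-derived features for candidate lysine sites concrete, the following is a minimal sketch and not PredGly's actual implementation. The abstract does not state which features are combined or how sequence context is defined, so everything here is an assumption made for illustration: the window half-width (`flank=3`), the padding symbol `X`, the choice of amino-acid composition as the single feature type, and the helper names `lysine_windows` and `composition`. In a pipeline like the one described, vectors of this kind would presumably be filtered by a feature-selection step and then passed to a classifier.

```python
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues
PAD = "X"  # assumed placeholder for positions beyond the sequence ends


def lysine_windows(sequence, flank=3):
    """Yield (1-based position, window) for every lysine (K) in the sequence.

    Each window has length 2 * flank + 1 with the lysine at its center.
    The window size is a hypothetical choice, not taken from the paper.
    """
    padded = PAD * flank + sequence + PAD * flank
    for i, residue in enumerate(sequence):
        if residue == "K":
            # In the padded string the residue sits at index i + flank,
            # so the window starts at index i.
            yield i + 1, padded[i:i + 2 * flank + 1]


def composition(window):
    """Return a 20-dimensional amino-acid composition vector.

    Padding symbols are ignored; an all-padding window maps to zeros.
    """
    counts = Counter(r for r in window if r in AMINO_ACIDS)
    total = sum(counts.values())
    return [counts[a] / total if total else 0.0 for a in AMINO_ACIDS]


if __name__ == "__main__":
    # Toy sequence, invented for the example only.
    for position, window in lysine_windows("MKTAYIAKQR", flank=3):
        print(position, window, composition(window))
```

Each candidate lysine yields one fixed-length vector, which is the shape of input that feature-ranking methods and support vector machines generally expect; other feature types could be concatenated onto the same vector.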
Authors: Sabit Ahmed; Afrida Rahman; Md Al Mehedi Hasan; Md Khaled Ben Islam; Julia Rahman; Shamim Ahmad Journal: PLoS One Date: 2021-04-01 Impact factor: 3.240
Authors: Jared A Delmar; Jihong Wang; Seo Woo Choi; Jason A Martins; John P Mikhail Journal: Mol Ther Methods Clin Dev Date: 2019-10-01 Impact factor: 6.698