O-GlcNAcPRED-II: an integrated classification algorithm for identifying O-GlcNAcylation sites based on fuzzy undersampling and a K-means PCA oversampling technique.

Cangzhi Jia, Yun Zuo, Quan Zou.

Abstract

Motivation: Protein O-GlcNAcylation (O-GlcNAc) is an important post-translational modification of serine (S)/threonine (T) residues that is involved in multiple molecular and cellular processes. Recent studies have suggested that abnormal O-GlcNAcylation causes many diseases, such as cancer and various neurodegenerative diseases. With experimentally verified protein O-GlcNAcylation sites now available, automated methods that identify O-GlcNAcylation sites rapidly and effectively are highly desirable. Although some computational methods have been proposed, their performance has been unsatisfactory, particularly in terms of prediction sensitivity.
Results: In this study, we developed an ensemble model, O-GlcNAcPRED-II, to identify potential O-GlcNAcylation sites. A K-means principal component analysis oversampling technique (KPCA) and a fuzzy undersampling method (FUS) were first proposed and incorporated to rebalance the proportion of the original positive and negative training samples. Then, rotation forest, a type of classifier-integrated system, was adopted to divide the eight types of feature space into several subsets using four sub-classifiers: random forest, k-nearest neighbour, naive Bayes and support vector machine. O-GlcNAcPRED-II achieved a sensitivity of 81.05%, a specificity of 95.91%, an accuracy of 91.43% and a Matthews correlation coefficient of 0.7928 in five-fold cross-validation repeated 10 times. Additionally, the results obtained by O-GlcNAcPRED-II on two independent datasets indicated that the proposed predictor outperformed five published prediction tools.
Availability and implementation: http://121.42.167.206/OGlcPred/.
Supplementary information: Supplementary data are available at Bioinformatics online.
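
The four figures reported in the Results (sensitivity, specificity, accuracy and Matthews correlation coefficient) are all functions of the confusion-matrix counts. The following is a minimal sketch of those standard definitions, not the authors' evaluation code; the function name `binary_metrics` and the example counts are hypothetical and are not taken from the paper.

```python
import math


def binary_metrics(tp, fn, tn, fp):
    """Compute standard binary-classification metrics from confusion counts.

    tp/fn: positive sites predicted correctly / missed.
    tn/fp: negative sites predicted correctly / wrongly flagged.
    """
    def ratio(num, den):
        # Guard against empty classes so the function never divides by zero.
        return num / den if den else 0.0

    sensitivity = ratio(tp, tp + fn)             # true positive rate
    specificity = ratio(tn, tn + fp)             # true negative rate
    accuracy = ratio(tp + tn, tp + fn + tn + fp)
    mcc_den = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = ratio(tp * tn - fp * fn, mcc_den)      # Matthews correlation coefficient
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "accuracy": accuracy,
        "mcc": mcc,
    }


# Hypothetical counts for illustration only (not data from the study).
m = binary_metrics(tp=80, fn=20, tn=190, fp=10)
print(m)
```

Because negative sites usually far outnumber positive ones in this kind of dataset, accuracy alone can look high even when few positives are found, which is why sensitivity and the Matthews correlation coefficient are reported alongside it.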

Year: 2018    PMID: 29420699    DOI: 10.1093/bioinformatics/bty039

Source DB: PubMed    Journal: Bioinformatics    ISSN: 1367-4803    Impact factor: 6.937


