A Novel Method for Identification of Glutarylation Sites Combining Borderline-SMOTE With Tomek Links Technique in Imbalanced Data.

Qiao Ning, Xiaowei Zhao, Zhiqiang Ma.   

Abstract

Glutarylation is a post-translational modification that occurs on lysine residues and plays an essential role in various cellular functions. Identifying glutarylation sites is therefore important for understanding the molecular mechanism of glutarylation. In this study, we proposed a method named DEXGB_Glu to identify lysine glutarylation sites, using an XGBoost classifier optimized by a differential evolution algorithm. To address the imbalance between positive and negative samples, the Borderline-SMOTE method was employed to synthesize positive samples until their number equaled that of the negative samples. The Tomek links technique was then applied to filter out noisy data. Analysis of the method and its results showed that the differential evolution algorithm clearly improved performance and that the combination of Borderline-SMOTE and Tomek links effectively addressed the imbalance between positive and negative samples. Finally, the method performed much better than other methods in predicting glutarylation sites. The data and code are available at https://github.com/ningq669/DEXGB_Glu.
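The resampling step described in the abstract can be illustrated with a minimal, standard-library-only sketch. This is not the authors' code (see their repository for that). It assumes the Borderline-SMOTE1 variant, label 1 for positive (minority) samples and 0 for negative samples, Euclidean distance, and removal of both members of each Tomek link. The function names `k_nearest`, `borderline_smote`, and `remove_tomek_links` are hypothetical, chosen for illustration only.

```python
import math
import random


def k_nearest(idx, points, k):
    """Return indices of the k points closest to points[idx], excluding idx."""
    others = [j for j in range(len(points)) if j != idx]
    others.sort(key=lambda j: math.dist(points[idx], points[j]))
    return others[:k]


def borderline_smote(X, y, k=5, seed=0):
    """Oversample label-1 points near the class border until classes balance.

    Assumption: Borderline-SMOTE1, where a minority point is "in danger"
    when at least half, but not all, of its k neighbours are majority.
    """
    rng = random.Random(seed)
    minority = [i for i, label in enumerate(y) if label == 1]
    n_needed = (len(y) - len(minority)) - len(minority)
    danger = []
    for i in minority:
        n_major = sum(1 for j in k_nearest(i, X, k) if y[j] == 0)
        if k / 2 <= n_major < k:
            danger.append(i)
    X_new, y_new = list(X), list(y)
    if n_needed <= 0 or not danger or len(minority) < 2:
        return X_new, y_new  # nothing to synthesize
    min_points = [X[i] for i in minority]
    for _ in range(n_needed):
        i = rng.choice(danger)
        # Pick one of the nearest *minority* neighbours of the danger point.
        neighbours = k_nearest(minority.index(i), min_points, k)
        nb = min_points[rng.choice(neighbours)]
        gap = rng.random()
        # Interpolate between the danger point and its chosen neighbour.
        X_new.append(tuple(a + gap * (b - a) for a, b in zip(X[i], nb)))
        y_new.append(1)
    return X_new, y_new


def remove_tomek_links(X, y):
    """Drop both points of every Tomek link.

    A Tomek link is a pair of mutual nearest neighbours with different labels.
    Removing both members is an assumption; some variants drop only one.
    """
    nearest = [k_nearest(i, X, 1)[0] for i in range(len(X))]
    drop = set()
    for i, j in enumerate(nearest):
        if nearest[j] == i and y[i] != y[j]:
            drop.update((i, j))
    keep = [i for i in range(len(X)) if i not in drop]
    return [X[i] for i in keep], [y[i] for i in keep]
```

Calling `borderline_smote` and then `remove_tomek_links` on the training set mirrors the order given in the abstract: oversample first, then filter. For real feature matrices, the `BorderlineSMOTE` and `TomekLinks` classes in the imbalanced-learn package are a more practical choice than this quadratic-time sketch.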

Year: 2022    PMID: 34236968    DOI: 10.1109/TCBB.2021.3095482

Source DB: PubMed    Journal: IEEE/ACM Trans Comput Biol Bioinform    ISSN: 1545-5963    Impact factor: 3.702


  1 in total

1.  Combining Resampling Strategies and Ensemble Machine Learning Methods to Enhance Prediction of Neonates with a Low Apgar Score After Induction of Labor in Northern Tanzania.

Authors:  Clifford Silver Tarimo; Soumitra S Bhuyan; Quanman Li; Weicun Ren; Michael Johnson Mahande; Jian Wu
Journal:  Risk Manag Healthc Policy       Date:  2021-09-07