dForml(KNN)-PseAAC: Detecting formylation sites from protein sequences using K-nearest neighbor algorithm via Chou's 5-step rule and pseudo components.

Qiao Ning, Zhiqiang Ma, Xiaowei Zhao.

Abstract

Formylation is a type of post-translational modification that can occur on lysine residues and plays an irreplaceable role in organisms. To better understand its mechanism, formylation sites in proteins must be identified accurately. Computational methods are popular because they are more convenient and faster than traditional experimental methods. However, no computational method has yet been proposed for predicting lysine formylation. In this study, we developed a predictor named LFPred that identifies lysine formylation sites using sequence features (including amino acid composition (AAC), binary profile features (BPF), and amino acid index (AAI)) combined with a K-nearest neighbor algorithm as the classifier. Guided by information entropy, we chose a discrete window instead of a continuous window. In addition, we took measures to select more reliable negative samples and to address the severe imbalance between positive and negative samples. Finally, in a jackknife test, LFPred achieved a specificity of 79.9% and a sensitivity of 81.4%, indicating that our method can be a useful tool for predicting lysine formylation sites.
Copyright © 2019 Elsevier Ltd. All rights reserved.
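
The pipeline described in the abstract (sequence features, an entropy-guided discrete window, a K-nearest neighbor classifier, and a jackknife test) can be illustrated with a minimal Python sketch. This is not the authors' implementation. The function names, the rule of keeping the lowest-entropy positions, the value of k, and the toy peptide windows are all assumptions made for illustration. The AAI features are omitted because they require an external amino acid index table, and the negative-sample selection and imbalance handling mentioned in the abstract are not modeled.

```python
import math
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def aac(window):
    """Amino acid composition: relative frequency of each of the 20 residues."""
    counts = Counter(window)
    total = sum(counts[a] for a in AMINO_ACIDS) or 1
    return [counts[a] / total for a in AMINO_ACIDS]


def bpf(residues):
    """Binary profile features: one 20-dimensional one-hot vector per residue."""
    vec = []
    for r in residues:
        vec.extend(1.0 if r == a else 0.0 for a in AMINO_ACIDS)
    return vec


def position_entropy(windows, pos):
    """Shannon entropy (in bits) of the residue distribution at one position."""
    counts = Counter(w[pos] for w in windows)
    total = sum(counts.values())
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def select_positions(windows, n):
    """Assumed rule: keep the n lowest-entropy positions, skipping the central
    lysine. The resulting positions need not be contiguous (a discrete window)."""
    center = len(windows[0]) // 2
    candidates = [i for i in range(len(windows[0])) if i != center]
    ranked = sorted(candidates, key=lambda i: position_entropy(windows, i))
    return sorted(ranked[:n])


def encode(window, positions):
    """Concatenate AAC of the full window with BPF of the selected positions."""
    return aac(window) + bpf(window[i] for i in positions)


def knn_predict(train, query, k):
    """Majority vote among the k nearest training vectors (Euclidean distance)."""
    nearest = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    votes = sum(label for _, label in nearest)
    return 1 if votes * 2 > len(nearest) else 0


def jackknife(samples, k=3):
    """Leave-one-out test: predict each sample from all the others.
    Returns (sensitivity, specificity)."""
    tp = tn = fp = fn = 0
    for i, (x, y) in enumerate(samples):
        pred = knn_predict(samples[:i] + samples[i + 1:], x, k)
        if y == 1:
            tp, fn = tp + (pred == 1), fn + (pred == 0)
        else:
            tn, fp = tn + (pred == 0), fp + (pred == 1)
    sensitivity = tp / (tp + fn) if tp + fn else 0.0
    specificity = tn / (tn + fp) if tn + fp else 0.0
    return sensitivity, specificity


# Toy, made-up 5-residue windows centered on lysine (K); not real data.
positives = ["AGKLV", "AGKLI", "AGKMV"]
negatives = ["PEKDS", "PDKES", "QEKDS"]
positions = select_positions(positives, 2)
samples = [(encode(w, positions), 1) for w in positives]
samples += [(encode(w, positions), 0) for w in negatives]
print(positions, jackknife(samples, k=1))
```

In the jackknife test each sample is held out once and predicted from the remaining samples, so sensitivity and specificity are computed over every sample exactly once.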

Entities:  

Keywords:  Discrete window; Formylation; Information entropy; K-nearest neighbor algorithm; Sequence feature

Year: 2019   PMID: 30880183   DOI: 10.1016/j.jtbi.2019.03.011

Source DB: PubMed   Journal: J Theor Biol   ISSN: 0022-5193   Impact factor: 2.691


