
pLoc_bal-mGneg: Predict subcellular localization of Gram-negative bacterial proteins by quasi-balancing training dataset and general PseAAC.

Xiang Cheng, Xuan Xiao, Kuo-Chen Chou.

Abstract

Determining the subcellular localization of proteins from different organisms is one of the most active topics in molecular cell biology, because it is crucially important for both basic research and drug development. Recently, a predictor called "pLoc-mGneg" was developed for identifying the subcellular localization of Gram-negative bacterial proteins. Its performance is overwhelmingly better than that of other predictors for the same purpose, particularly in dealing with multi-label systems in which some proteins, called "multiplex proteins", may simultaneously occur in two or more subcellular locations. Although it is a very powerful predictor, further effort is needed to improve it, because pLoc-mGneg was trained on an extremely skewed dataset in which some subsets (subcellular locations) were about 5 to 70 times the size of others. It therefore cannot avoid the bias caused by such an uneven training dataset. To alleviate this bias, we have developed a new, bias-reducing predictor called pLoc_bal-mGneg by quasi-balancing the training dataset. Cross-validation tests on exactly the same experiment-confirmed dataset indicate that the new predictor is remarkably superior to pLoc-mGneg, the existing state-of-the-art predictor for identifying the subcellular localization of Gram-negative bacterial proteins. For the convenience of most experimental scientists, a user-friendly web server for the new predictor has been established at http://www.jci-bioinfo.cn/pLoc_bal-mGneg/, through which users can easily get their desired results without going through the detailed mathematics.
Copyright © 2018 Elsevier Ltd. All rights reserved.
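
The imbalance described in the abstract (some location subsets about 5 to 70 times the size of others) can be illustrated with a minimal sketch. This is not the authors' procedure: the keyword list names IHTS, which is not implemented here. The sketch swaps in naive random duplication of existing samples, and the toy protein IDs, the function names, and the `target_ratio` parameter are all hypothetical, chosen only for illustration.

```python
import random
from collections import Counter


def location_counts(dataset):
    """Count proteins per location; a multiplex protein counts once per location."""
    counts = Counter()
    for _, locations in dataset:
        counts.update(set(locations))
    return counts


def imbalance_ratio(counts):
    """Size of the largest location subset divided by the smallest."""
    return max(counts.values()) / min(counts.values())


def quasi_balance(dataset, target_ratio=2.0, max_added=100_000, seed=0):
    """Naive stand-in for quasi-balancing (an assumption, not the paper's method).

    Repeatedly duplicates a randomly chosen protein from the currently rarest
    location until largest/smallest <= target_ratio, or until max_added
    duplicates have been appended.
    """
    rng = random.Random(seed)
    balanced = list(dataset)
    counts = location_counts(balanced)
    for _ in range(max_added):
        if imbalance_ratio(counts) <= target_ratio:
            break
        rarest = min(counts, key=counts.get)
        pool = [sample for sample in dataset if rarest in sample[1]]
        sample = rng.choice(pool)
        balanced.append(sample)
        counts.update(set(sample[1]))
    return balanced


# Hypothetical toy data: (protein id, list of subcellular locations).
toy = (
    [(f"P{i}", ["cytoplasm"]) for i in range(70)]
    + [(f"Q{i}", ["periplasm"]) for i in range(10)]
    + [("R0", ["fimbrium"]), ("R1", ["fimbrium", "periplasm"])]
)

before = imbalance_ratio(location_counts(toy))
balanced = quasi_balance(toy)
after = imbalance_ratio(location_counts(balanced))
print(f"imbalance before: {before:.1f}, after: {after:.2f}")
```

Because a multiplex protein such as the hypothetical `R1` belongs to two locations, duplicating it raises both counts at once, which is one reason balancing a multi-label dataset is less straightforward than balancing a single-label one.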

Keywords:  Chou's intuitive metrics; Five-step rules; Gram-negative bacterial proteins; IHTS; ML-GKR; Multi-label system

Year:  2018        PMID: 30201434     DOI: 10.1016/j.jtbi.2018.09.005

Source DB:  PubMed          Journal:  J Theor Biol        ISSN: 0022-5193            Impact factor:   2.691


  10 in total

1.  iPhosY-PseAAC: identify phosphotyrosine sites by incorporating sequence statistical moments into PseAAC.

Authors:  Yaser Daanial Khan; Nouman Rasool; Waqar Hussain; Sher Afzal Khan; Kuo-Chen Chou
Journal:  Mol Biol Rep       Date:  2018-10-11       Impact factor: 2.316

2.  MULTiPly: a novel multi-layer predictor for discovering general and specific types of promoters.

Authors:  Meng Zhang; Fuyi Li; Tatiana T Marquez-Lago; André Leier; Cunshuo Fan; Chee Keong Kwoh; Kuo-Chen Chou; Jiangning Song; Cangzhi Jia
Journal:  Bioinformatics       Date:  2019-09-01       Impact factor: 6.937

3.  Some illuminating remarks on molecular genetics and genomics as well as drug development. (Review)

Authors:  Kuo-Chen Chou
Journal:  Mol Genet Genomics       Date:  2020-01-01       Impact factor: 3.291

4.  Identify Lysine Neddylation Sites Using Bi-profile Bayes Feature Extraction via the Chou's 5-steps Rule and General Pseudo Components.

Authors:  Zhe Ju; Shi-Yun Wang
Journal:  Curr Genomics       Date:  2019-12       Impact factor: 2.236

5.  RAACBook: a web server of reduced amino acid alphabet for sequence-dependent inference by using Chou's five-step rule.

Authors:  Lei Zheng; Shenghui Huang; Nengjiang Mu; Haoyue Zhang; Jiayu Zhang; Yu Chang; Lei Yang; Yongchun Zuo
Journal:  Database (Oxford)       Date:  2019-01-01       Impact factor: 3.451

6.  iMethylK_pseAAC: Improving Accuracy of Lysine Methylation Sites Identification by Incorporating Statistical Moments and Position Relative Features into General PseAAC via Chou's 5-steps Rule.

Authors:  Sarah Ilyas; Waqar Hussain; Adeel Ashraf; Yaser Daanial Khan; Sher Afzal Khan; Kuo-Chen Chou
Journal:  Curr Genomics       Date:  2019-05       Impact factor: 2.236

7.  iSulfoTyr-PseAAC: Identify Tyrosine Sulfation Sites by Incorporating Statistical Moments via Chou's 5-steps Rule and Pseudo Components.

Authors:  Omar Barukab; Yaser Daanial Khan; Sher Afzal Khan; Kuo-Chen Chou
Journal:  Curr Genomics       Date:  2019-05       Impact factor: 2.236

8.  Identifying Pupylation Proteins and Sites by Incorporating Multiple Methods.

Authors:  Wang-Ren Qiu; Meng-Yue Guan; Qian-Kun Wang; Li-Liang Lou; Xuan Xiao
Journal:  Front Endocrinol (Lausanne)       Date:  2022-04-26       Impact factor: 6.055

9.  A machine learning technique for identifying DNA enhancer regions utilizing CIS-regulatory element patterns.

Authors:  Ahmad Hassan Butt; Tamim Alkhalifah; Fahad Alturise; Yaser Daanial Khan
Journal:  Sci Rep       Date:  2022-09-07       Impact factor: 4.996

10.  Characterization of the relationship between FLI1 and immune infiltrate level in tumour immune microenvironment for breast cancer.

Authors:  Shiyuan Wang; Yakun Wang; Chunlu Yu; Yiyin Cao; Yao Yu; Yi Pan; Dongqing Su; Qianzi Lu; Wuritu Yang; Yongchun Zuo; Lei Yang
Journal:  J Cell Mol Med       Date:  2020-04-05       Impact factor: 5.310

