
pLoc_bal-mAnimal: predict subcellular localization of animal proteins by balancing training dataset and PseAAC.

Xiang Cheng, Wei-Zhong Lin, Xuan Xiao, Kuo-Chen Chou.

Abstract

Motivation: A cell contains numerous protein molecules. One of the fundamental goals in cell biology is to determine their subcellular locations, which can provide useful clues about their functions. Knowledge of protein subcellular localization is also indispensable for prioritizing and selecting the right targets for drug development. With the avalanche of protein sequences emerging in the post-genomic age, it is highly desirable to develop computational tools that identify subcellular localization quickly and effectively from sequence information alone. Recently, a predictor called 'pLoc-mAnimal' was developed for identifying the subcellular localization of animal proteins. Its performance is overwhelmingly better than that of other predictors for the same purpose, particularly in dealing with multi-label systems in which some proteins, called 'multiplex proteins', may simultaneously occur in two or more subcellular locations. Although it is a very powerful predictor, further effort is needed to improve it, because pLoc-mAnimal was trained on an extremely skewed dataset in which some subsets (subcellular locations) were about 128 times the size of others. Such an uneven training dataset inevitably biases the predictions.
Results: To alleviate this bias, we have developed a new, bias-reducing predictor called pLoc_bal-mAnimal by quasi-balancing the training dataset. Cross-validation tests on exactly the same experiment-confirmed dataset indicate that the proposed predictor is remarkably superior to pLoc-mAnimal, the existing state-of-the-art predictor, in identifying the subcellular localization of animal proteins.
Availability and implementation: For the convenience of experimental scientists, a user-friendly web-server for the new predictor has been established at http://www.jci-bioinfo.cn/pLoc_bal-mAnimal/, by which users can easily get their desired results without needing to go through the complicated mathematics.
Supplementary information: Supplementary data are available at Bioinformatics online.
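
Illustration (not from the paper): the abstract does not say how the training dataset was quasi-balanced, so the sketch below uses plain random oversampling as a stand-in technique, simplified to a single-label setting (the actual predictor is multi-label). The function names, the class names "nucleus" and "centrosome", and the `max_ratio` parameter are all hypothetical. It shows how a 128:1 skew can be measured and reduced to a chosen ratio.

```python
import math
import random
from collections import Counter


def imbalance_ratio(labels):
    """Size of the largest class divided by the size of the smallest class."""
    counts = Counter(labels)
    return max(counts.values()) / min(counts.values())


def oversample_to_ratio(samples, labels, max_ratio=2.0, seed=0):
    """Randomly duplicate minority-class samples (assumed technique, not the
    authors' method) until largest/smallest class size is at most max_ratio."""
    rng = random.Random(seed)
    by_class = {}
    for sample, label in zip(samples, labels):
        by_class.setdefault(label, []).append(sample)

    largest = max(len(items) for items in by_class.values())
    target = math.ceil(largest / max_ratio)  # minimum size every class must reach

    out_samples, out_labels = [], []
    for label, items in by_class.items():
        extra = max(0, target - len(items))
        picked = items + [rng.choice(items) for _ in range(extra)]
        out_samples.extend(picked)
        out_labels.extend([label] * len(picked))
    return out_samples, out_labels


if __name__ == "__main__":
    # Hypothetical toy data reproducing the 128:1 skew mentioned in the abstract.
    labels = ["nucleus"] * 128 + ["centrosome"]
    samples = list(range(len(labels)))
    print("ratio before:", imbalance_ratio(labels))
    _, balanced_labels = oversample_to_ratio(samples, labels, max_ratio=2.0)
    print("ratio after:", imbalance_ratio(balanced_labels))
```

The dataset is left "quasi" balanced rather than fully equalized: a cap on the ratio limits how many duplicates of a very small class are created, which is one plausible reason to stop short of exact balance.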

Year: 2019; PMID: 30010789; DOI: 10.1093/bioinformatics/bty628

Source DB: PubMed; Journal: Bioinformatics; ISSN: 1367-4803; Impact factor: 6.937


  16 in total

1.  iPhosY-PseAAC: identify phosphotyrosine sites by incorporating sequence statistical moments into PseAAC.

Authors:  Yaser Daanial Khan; Nouman Rasool; Waqar Hussain; Sher Afzal Khan; Kuo-Chen Chou
Journal:  Mol Biol Rep       Date:  2018-10-11       Impact factor: 2.316

Review 2.  Structural Variability in the RLR-MAVS Pathway and Sensitive Detection of Viral RNAs.

Authors:  Qiu-Xing Jiang
Journal:  Med Chem       Date:  2019       Impact factor: 2.745

Review 3.  Some illuminating remarks on molecular genetics and genomics as well as drug development.

Authors:  Kuo-Chen Chou
Journal:  Mol Genet Genomics       Date:  2020-01-01       Impact factor: 3.291

4.  Gm-PLoc: A Subcellular Localization Model of Multi-Label Protein Based on GAN and DeepFM.

Authors:  Liwen Wu; Song Gao; Shaowen Yao; Feng Wu; Jie Li; Yunyun Dong; Yunqi Zhang
Journal:  Front Genet       Date:  2022-06-15       Impact factor: 4.772

5.  Prediction of DNA binding proteins using local features and long-term dependencies with primary sequences based on deep learning.

Authors:  Guobin Li; Xiuquan Du; Xinlu Li; Le Zou; Guanhong Zhang; Zhize Wu
Journal:  PeerJ       Date:  2021-05-03       Impact factor: 2.984

6.  iPseU-CNN: Identifying RNA Pseudouridine Sites Using Convolutional Neural Networks.

Authors:  Muhammad Tahir; Hilal Tayara; Kil To Chong
Journal:  Mol Ther Nucleic Acids       Date:  2019-04-11

7.  csDMA: an improved bioinformatics tool for identifying DNA 6 mA modifications via Chou's 5-step rule.

Authors:  Ze Liu; Wei Dong; Wei Jiang; Zili He
Journal:  Sci Rep       Date:  2019-09-11       Impact factor: 4.379

8.  RAACBook: a web server of reduced amino acid alphabet for sequence-dependent inference by using Chou's five-step rule.

Authors:  Lei Zheng; Shenghui Huang; Nengjiang Mu; Haoyue Zhang; Jiayu Zhang; Yu Chang; Lei Yang; Yongchun Zuo
Journal:  Database (Oxford)       Date:  2019-01-01       Impact factor: 3.451

9.  iMethylK_pseAAC: Improving Accuracy of Lysine Methylation Sites Identification by Incorporating Statistical Moments and Position Relative Features into General PseAAC via Chou's 5-steps Rule.

Authors:  Sarah Ilyas; Waqar Hussain; Adeel Ashraf; Yaser Daanial Khan; Sher Afzal Khan; Kuo-Chen Chou
Journal:  Curr Genomics       Date:  2019-05       Impact factor: 2.236

10.  Characterization of the relationship between FLI1 and immune infiltrate level in tumour immune microenvironment for breast cancer.

Authors:  Shiyuan Wang; Yakun Wang; Chunlu Yu; Yiyin Cao; Yao Yu; Yi Pan; Dongqing Su; Qianzi Lu; Wuritu Yang; Yongchun Zuo; Lei Yang
Journal:  J Cell Mol Med       Date:  2020-04-05       Impact factor: 5.310
