pLoc_bal-mVirus: Predict Subcellular Localization of Multi-Label Virus Proteins by Chou's General PseAAC and IHTS Treatment to Balance Training Dataset.

Xuan Xiao1,2, Xiang Cheng1,2, Genqiang Chen3, Qi Mao4, Kuo-Chen Chou2,5.   

Abstract

BACKGROUND/OBJECTIVE: Knowledge of protein subcellular localization is vitally important for both basic research and drug development. Facing the avalanche of protein sequences emerging in the post-genomic age, it is urgent to develop computational tools that identify their subcellular localization in a timely and effective manner from the sequence information alone. Recently, a predictor called "pLoc-mVirus" was developed for identifying the subcellular localization of virus proteins. Its performance is overwhelmingly better than that of the other predictors for the same purpose, particularly in dealing with multi-label systems in which some proteins, known as "multiplex proteins", may simultaneously occur in, or move between, two or more subcellular location sites. Although it is a very powerful predictor, further effort is needed to improve it, because pLoc-mVirus was trained on an extremely skewed dataset in which some subsets were over 10 times the size of others. Accordingly, it cannot avoid the bias caused by such an uneven training dataset.
METHODS: Using Chou's general PseAAC (Pseudo Amino Acid Composition) approach and the IHTS (Inserting Hypothetical Training Samples) treatment to balance the training dataset, we have developed a new predictor called "pLoc_bal-mVirus" for predicting the subcellular localization of multi-label virus proteins.
RESULTS: Cross-validation tests on exactly the same experiment-confirmed dataset have indicated that the proposed new predictor is remarkably superior to pLoc-mVirus, the existing state-of-the-art predictor for the same purpose.
CONCLUSION: Its user-friendly web-server is available at http://www.jci-bioinfo.cn/pLoc_balmVirus/, by which the majority of experimental scientists can easily get their desired results without the need to go through the detailed complicated mathematics. Accordingly, pLoc_bal-mVirus will become a very useful tool for designing multi-target drugs and in-depth understanding of the biological process in a cell. Copyright © Bentham Science Publishers.
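
The abstract names two ingredients, a composition-based sequence encoding and a balancing step that inserts hypothetical training samples, without giving their details. The sketch below is only an illustration of that general idea and not the authors' implementation: it uses the plain 20-dimensional amino acid composition as a stand-in for the general PseAAC features, and a SMOTE-like linear interpolation between two members of a subset as a stand-in for IHTS. The function names, the location labels, and the toy sequences are all hypothetical.

```python
import random

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def aa_composition(seq):
    """Return the 20-D amino acid composition (fractions that sum to 1).

    Assumption: this is a simplified stand-in for the paper's general
    PseAAC encoding, which is not specified in the abstract.
    """
    residues = [c for c in seq.upper() if c in AMINO_ACIDS]
    if not residues:
        raise ValueError("sequence contains no standard residues")
    return [residues.count(a) / len(residues) for a in AMINO_ACIDS]


def balance_by_interpolation(subsets, rng):
    """Pad every subset to the size of the largest one.

    Assumption: synthetic vectors are drawn on the line segment between
    two randomly chosen members of the same subset (SMOTE-like). The
    actual IHTS procedure may differ. Every subset must be non-empty.
    """
    target = max(len(vectors) for vectors in subsets.values())
    balanced = {}
    for label, vectors in subsets.items():
        padded = list(vectors)  # keep the real samples untouched
        while len(padded) < target:
            a = rng.choice(vectors)
            b = rng.choice(vectors)
            t = rng.random()  # interpolation weight in [0, 1)
            padded.append([x + t * (y - x) for x, y in zip(a, b)])
        balanced[label] = padded
    return balanced


# Toy, made-up sequences grouped by hypothetical location labels.
rng = random.Random(0)
subsets = {
    "location_A": [aa_composition(s) for s in
                   ["MKRLLAA", "MSTNPKPQR", "MAAKRRLE", "MGSSHHHH"]],
    "location_B": [aa_composition(s) for s in
                   ["MVLSPADKTN", "MKWVTFISLL"]],
}
balanced = balance_by_interpolation(subsets, rng)
sizes = {label: len(vectors) for label, vectors in balanced.items()}
```

After balancing, every subset has as many feature vectors as the largest one, and each synthetic vector is still a valid composition because a convex combination of two compositions also sums to 1. In a multi-label setting a multiplex protein would appear in more than one subset, which this sketch handles simply by listing its vector under each of its labels.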

Keywords:  Chou's 5-step rules; Chou's general PseAAC; Chou's intuitive metrics; ML-GKR; Multi-label system; multi-target drugs; virus proteins.

Year: 2019   PMID: 30556503   DOI: 10.2174/1573406415666181217114710

Source DB: PubMed   Journal: Med Chem   ISSN: 1573-4064   Impact factor: 2.745


