Multi-iPPseEvo: A Multi-label Classifier for Identifying Human Phosphorylated Proteins by Incorporating Evolutionary Information into Chou's General PseAAC via Grey System Theory.

Wang-Ren Qiu, Quan-Shu Zheng, Bi-Qian Sun, Xuan Xiao.

Abstract

Predicting phosphorylated proteins is a challenging problem, particularly when query proteins have multi-label features, meaning that they may be phosphorylated at two or more different types of amino acid. In fact, human proteins are usually phosphorylated at serine, threonine, and tyrosine. By introducing the "multi-label learning" approach, a novel predictor has been developed that can handle systems containing both single- and multi-label phosphorylated proteins. Here we propose a predictor called Multi-iPPseEvo, built by (1) incorporating protein sequence evolutionary information into the general pseudo amino acid composition (PseAAC) via grey system theory, (2) balancing the skewed training datasets with the asymmetric bootstrap approach, and (3) constructing an ensemble predictor by fusing an array of individual random forest classifiers through a voting system. Rigorous cross-validation with a set of multi-label metrics indicates that the multi-label phosphorylation predictor is promising. The current approach represents a new strategy for dealing with multi-label biological problems, and the software is freely available for academic use at http://www.jci-bioinfo.cn/Multi-iPPseEvo.
© 2017 Wiley-VCH Verlag GmbH & Co. KGaA, Weinheim.
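
Steps (2) and (3) of the abstract can be illustrated with a minimal sketch. It is not the authors' implementation and rests on several assumptions: "asymmetric bootstrap" is taken to mean keeping every minority-class sample and drawing an equal number of majority-class samples with replacement; each label (S, T, Y) is handled as a separate binary problem; and a toy nearest-centroid learner stands in for the random forests. All function names and the toy data are hypothetical.

```python
import random

LABELS = ("S", "T", "Y")  # serine, threonine, tyrosine


def asymmetric_bootstrap(pos, neg, rng):
    """Keep all minority samples; draw as many majority samples with replacement."""
    minority, majority = (pos, neg) if len(pos) <= len(neg) else (neg, pos)
    return list(minority) + [rng.choice(majority) for _ in minority]


def fit_centroid(samples):
    """Stand-in base learner (NOT a random forest): nearest class centroid."""
    centroids = {}
    for cls in (0, 1):
        rows = [x for x, y in samples if y == cls]
        centroids[cls] = [sum(col) / len(rows) for col in zip(*rows)]

    def predict(x):
        dist = {c: sum((a - b) ** 2 for a, b in zip(x, m))
                for c, m in centroids.items()}
        return min(dist, key=dist.get)

    return predict


def majority_vote(votes):
    """Fuse binary votes; a tie counts as negative."""
    return int(2 * sum(votes) > len(votes))


def train_ensemble(X, Y, n_members=5, seed=0):
    """Y[i] is the set of labels for X[i]; returns {label: [predictors]}."""
    rng = random.Random(seed)
    ensemble = {}
    for label in LABELS:
        pos = [(x, 1) for x, ys in zip(X, Y) if label in ys]
        neg = [(x, 0) for x, ys in zip(X, Y) if label not in ys]
        ensemble[label] = [fit_centroid(asymmetric_bootstrap(pos, neg, rng))
                           for _ in range(n_members)]
    return ensemble


def predict_labels(ensemble, x):
    """Return every label whose ensemble members vote positive by majority."""
    return {label for label, members in ensemble.items()
            if majority_vote([m(x) for m in members])}


# Toy 3-D features (hypothetical): one coordinate per residue type.
X = [[1, 0, 0], [0.9, 0.1, 0], [1, 1, 0], [0, 1, 0], [0.1, 0.9, 0.1],
     [0, 0, 1], [0, 0.1, 0.9], [1, 0, 1], [0, 0, 0], [0.1, 0, 0.1]]
Y = [{"S"}, {"S"}, {"S", "T"}, {"T"}, {"T"},
     {"Y"}, {"Y"}, {"S", "Y"}, set(), set()]

ensemble = train_ensemble(X, Y)
print(predict_labels(ensemble, [1, 1, 0]))
```

In practice the stand-in learner would be replaced by a random forest trained on the PseAAC feature vectors from step (1), which this sketch does not attempt to reproduce.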

Keywords:  Ensemble classifier; Multi-label learning; Protein phosphorylation; Random Forests

Year:  2016        PMID: 27681207     DOI: 10.1002/minf.201600085

Source DB:  PubMed          Journal:  Mol Inform        ISSN: 1868-1743            Impact factor:   3.353


  6 in total

1.  Some illuminating remarks on molecular genetics and genomics as well as drug development. (Review)

Authors:  Kuo-Chen Chou
Journal:  Mol Genet Genomics       Date:  2020-01-01       Impact factor: 3.291

2.  Genome-Wide Prediction of DNA Methylation Using DNA Composition and Sequence Complexity in Human.

Authors:  Chengchao Wu; Shixin Yao; Xinghao Li; Chujia Chen; Xuehai Hu
Journal:  Int J Mol Sci       Date:  2017-02-16       Impact factor: 5.923

3.  Pse-Analysis: a python package for DNA/RNA and protein/peptide sequence analysis based on pseudo components and kernel methods.

Authors:  Bin Liu; Hao Wu; Deyuan Zhang; Xiaolong Wang; Kuo-Chen Chou
Journal:  Oncotarget       Date:  2017-02-21

4.  iRNA-AI: identifying the adenosine to inosine editing sites in RNA sequences.

Authors:  Wei Chen; Pengmian Feng; Hui Yang; Hui Ding; Hao Lin; Kuo-Chen Chou
Journal:  Oncotarget       Date:  2017-01-17

5.  ProFold: Protein Fold Classification with Additional Structural Features and a Novel Ensemble Classifier.

Authors:  Daozheng Chen; Xiaoyu Tian; Bo Zhou; Jun Gao
Journal:  Biomed Res Int       Date:  2016-08-28       Impact factor: 3.411

6.  Identifying Acetylation Protein by Fusing Its PseAAC and Functional Domain Annotation.

Authors:  Wang-Ren Qiu; Ao Xu; Zhao-Chun Xu; Chun-Hua Zhang; Xuan Xiao
Journal:  Front Bioeng Biotechnol       Date:  2019-12-06