Protein remote homology detection by combining Chou's distance-pair pseudo amino acid composition and principal component analysis.

Bin Liu, Junjie Chen, Xiaolong Wang.

Abstract

Protein remote homology detection is an important task in computational proteomics, with value for both basic research and practical applications. SVM-based discriminative methods currently show superior performance. However, existing feature vectors still do not represent protein sequences adequately, and the resulting models are often hard to interpret in terms of characteristic features. Previous studies showed that sequence-order effects and physicochemical properties are important for representing protein sequences, but how to use this information to construct predictors remains a challenging problem. In this study, to incorporate sequence-order information and physicochemical properties into the prediction, a method called disPseAAC is proposed, in which the feature vector is constructed by combining the occurrences of amino acid pairs within Chou's pseudo amino acid composition (PseAAC) approach. Principal component analysis is then employed to further improve predictive performance and reduce computational cost. Various experiments are conducted on a benchmark dataset. Experimental results show that disPseAAC achieves an ROC score of 0.922, outperforming some existing state-of-the-art methods. Furthermore, the learnt model can easily be analyzed in terms of discriminative features, and the computational cost of the proposed method is much lower than that of other profile-based methods.
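As a rough illustration of the kind of feature vector the abstract describes, the sketch below counts amino acid pairs separated by a fixed distance along the sequence and appends them to the plain amino acid composition. It is not the authors' implementation: the function name `distance_pair_features`, the `max_distance` parameter, and the normalization by the number of positions are assumptions made for this example, and the PCA and SVM steps are not shown.

```python
from itertools import product

# The 20 standard amino acids, in a fixed order that defines feature positions.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def distance_pair_features(sequence, max_distance=3):
    """Return composition plus distance-pair frequencies (illustrative only).

    Layout: 20 composition values, then 400 pair frequencies for each
    distance d = 1 .. max_distance (hypothetical parameter).
    """
    seq = sequence.upper()
    if not seq:
        raise ValueError("sequence must not be empty")

    # Amino acid composition: frequency of each standard residue.
    features = [seq.count(aa) / len(seq) for aa in AMINO_ACIDS]

    pairs = ["".join(p) for p in product(AMINO_ACIDS, repeat=2)]
    for d in range(1, max_distance + 1):
        counts = dict.fromkeys(pairs, 0)
        # Count residues at positions i and i + d; pairs that contain a
        # non-standard residue are skipped.
        for i in range(len(seq) - d):
            pair = seq[i] + seq[i + d]
            if pair in counts:
                counts[pair] += 1
        positions = max(len(seq) - d, 1)  # assumed normalization
        features.extend(counts[p] / positions for p in pairs)
    return features


vec = distance_pair_features("ACAC", max_distance=2)
print(len(vec))  # 20 + 2 * 400 values
```

Vectors of this kind grow by 400 dimensions per distance, which is one reason a dimensionality reduction step such as principal component analysis would be applied before training a classifier.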

Keywords:  Principal component analysis; Protein remote homology; Pseudo amino acid composition; Support vector machine

Year:  2015        PMID: 25896721     DOI: 10.1007/s00438-015-1044-4

Source DB:  PubMed          Journal:  Mol Genet Genomics        ISSN: 1617-4623            Impact factor:   3.291
