Remote protein homology detection using recurrence quantification analysis and amino acid physicochemical properties.

Yuchen Yang, Erwin Tantoso, Kuo-Bin Li.

Abstract

Remote homology detection refers to the detection of structural homology in evolutionarily related proteins with low sequence similarity. Supervised learning algorithms such as support vector machines (SVMs) are currently the most accurate methods. In most of these SVM-based methods, efforts have been dedicated to developing new kernels that make better use of pairwise alignment scores or sequence profiles. Moreover, the physicochemical properties of amino acids are not generally used in the feature representation of protein sequences. In this article, we present a remote homology detection method that incorporates two novel features: (1) a protein's primary sequence is represented using the physicochemical properties of its amino acids and (2) the similarity between two proteins is measured using recurrence quantification analysis (RQA). An optimization scheme was developed to select the amino acid indices (up to 10 per protein family) that best characterize the given protein family. The selected amino acid indices may enable us to draw better biological explanations of the protein family classification problem than alignment-based methods allow. An SVM-based classifier then operates on the space described by the RQA metrics. The classification scheme is named SVM-RQA. Experiments at the superfamily level of the SCOP1.53 dataset show that, without using alignment or sequence profile information, the features generated from amino acid indices produce results comparable to those obtained by the published state-of-the-art SVM kernels. In the future, better prediction accuracies can be expected by combining the alignment-based features with our amino acid property-based features. Supplementary information, including the raw dataset, the best-performing amino acid indices for each protein family and the computed RQA metrics for all protein sequences, can be downloaded from http://ym151113.ym.edu.tw/svm-rqa.
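
The two ideas in the abstract, encoding a sequence with an amino acid index and summarizing it with RQA metrics, can be illustrated with a minimal sketch. This is not the authors' implementation. The abstract does not state which indices, embedding parameters, radius, or RQA metrics were used, so the Kyte-Doolittle hydropathy scale, the function names, and every default value below are assumptions chosen only for illustration. The sketch computes two standard RQA metrics (recurrence rate and determinism) for a single sequence; how the paper compares two proteins is not specified in the abstract and is not modeled here.

```python
import math

# Kyte-Doolittle hydropathy values, used here as ONE example amino acid index.
# Assumption: the paper selects its own indices per protein family.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def encode(sequence, index=KYTE_DOOLITTLE):
    """Map a protein sequence to a numeric series via an amino acid index."""
    return [index[aa] for aa in sequence]


def embed(series, dim=3, delay=1):
    """Time-delay embedding: each point is a window of `dim` values."""
    n = len(series) - (dim - 1) * delay
    return [tuple(series[i + k * delay] for k in range(dim)) for i in range(n)]


def recurrence_matrix(vectors, radius):
    """Mark pairs of embedded points closer than `radius` (Euclidean)."""
    n = len(vectors)
    return [[math.dist(vectors[i], vectors[j]) <= radius for j in range(n)]
            for i in range(n)]


def diagonal_line_lengths(matrix):
    """Lengths of diagonal runs of recurrent points above the main diagonal."""
    n = len(matrix)
    lengths = []
    for offset in range(1, n):  # the matrix is symmetric, so one triangle suffices
        run = 0
        for i in range(n - offset):
            if matrix[i][i + offset]:
                run += 1
            else:
                if run:
                    lengths.append(run)
                run = 0
        if run:
            lengths.append(run)
    return lengths


def rqa_metrics(sequence, dim=3, delay=1, radius=1.0, min_line=2):
    """Recurrence rate and determinism for one sequence (illustrative defaults)."""
    vectors = embed(encode(sequence), dim, delay)
    lengths = diagonal_line_lengths(recurrence_matrix(vectors, radius))
    n = len(vectors)
    pairs = n * (n - 1) // 2          # point pairs above the main diagonal
    recurrent = sum(lengths)          # recurrent pairs among them
    in_lines = sum(length for length in lengths if length >= min_line)
    return {
        "recurrence_rate": recurrent / pairs if pairs else 0.0,
        "determinism": in_lines / recurrent if recurrent else 0.0,
    }


if __name__ == "__main__":
    # Hypothetical toy sequence, not taken from the SCOP1.53 dataset.
    print(rqa_metrics("MKVLAAGIVGLLLAQKVLAAGIVG"))
```

In a pipeline of the kind the abstract describes, metrics like these, computed for each selected amino acid index, would presumably be concatenated into the feature vector passed to the SVM classifier.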

Year:  2008        PMID: 18342336     DOI: 10.1016/j.jtbi.2008.01.028

Source DB:  PubMed          Journal:  J Theor Biol        ISSN: 0022-5193            Impact factor:   2.691


  13 in total

1.  Machine learning based prediction for peptide drift times in ion mobility spectrometry.

Authors:  Anuj R Shah; Khushbu Agarwal; Erin S Baker; Mudita Singhal; Anoop M Mayampurath; Yehia M Ibrahim; Lars J Kangas; Matthew E Monroe; Rui Zhao; Mikhail E Belov; Gordon A Anderson; Richard D Smith
Journal:  Bioinformatics       Date:  2010-05-21       Impact factor: 6.937

2.  Maximum margin classifier working in a set of strings.

Authors:  Hitoshi Koyano; Morihiro Hayashida; Tatsuya Akutsu
Journal:  Proc Math Phys Eng Sci       Date:  2016-03       Impact factor: 2.704

3.  Recurrence Quantification for the Analysis of Coupled Processes in Aging.

Authors:  Timothy R Brick; Allison L Gray; Angela D Staples
Journal:  J Gerontol B Psychol Sci Soc Sci       Date:  2017-12-15       Impact factor: 4.077

4.  Protein remote homology detection by combining Chou's distance-pair pseudo amino acid composition and principal component analysis.

Authors:  Bin Liu; Junjie Chen; Xiaolong Wang
Journal:  Mol Genet Genomics       Date:  2015-04-21       Impact factor: 3.291

5.  Computational Insights into the Structural and Functional Impacts of nsSNPs of Bone Morphogenetic Proteins.

Authors:  Hafiz Ishfaq Ahmad; Nabeel Ijaz; Gulnaz Afzal; Akhtar Rasool Asif; Aziz Ur Rehman; Abdur Rahman; Irfan Ahmed; Muhammad Yousaf; Abdelmotaleb Elokil; Sayyed Aun Muhammad; Sarah M Albogami; Saqer S Alotaibi
Journal:  Biomed Res Int       Date:  2022-07-04       Impact factor: 3.246

6.  Physicochemical property distributions for accurate and rapid pairwise protein homology detection.

Authors:  Bobbie-Jo M Webb-Robertson; Kyle G Ratuiste; Christopher S Oehmen
Journal:  BMC Bioinformatics       Date:  2010-03-19       Impact factor: 3.169

7.  Using amino acid physicochemical distance transformation for fast protein remote homology detection.

Authors:  Bin Liu; Xiaolong Wang; Qingcai Chen; Qiwen Dong; Xun Lan
Journal:  PLoS One       Date:  2012-09-28       Impact factor: 3.240

8.  Using distances between Top-n-gram and residue pairs for protein remote homology detection.

Authors:  Bin Liu; Jinghao Xu; Quan Zou; Ruifeng Xu; Xiaolong Wang; Qingcai Chen
Journal:  BMC Bioinformatics       Date:  2014-01-24       Impact factor: 3.169

9.  A computational approach identifies two regions of Hepatitis C Virus E1 protein as interacting domains involved in viral fusion process.

Authors:  Roberto Bruni; Angela Costantino; Elena Tritarelli; Cinzia Marcantonio; Massimo Ciccozzi; Maria Rapicetta; Gamal El Sawaf; Alessandro Giuliani; Anna Rita Ciccaglione
Journal:  BMC Struct Biol       Date:  2009-07-29

10.  An ensemble method for predicting subnuclear localizations from primary protein structures.

Authors:  Guo Sheng Han; Zu Guo Yu; Vo Anh; Anaththa P D Krishnajith; Yu-Chu Tian
Journal:  PLoS One       Date:  2013-02-27       Impact factor: 3.240
