
Exploring an alignment free approach for protein classification and structural class prediction.

P Deschavanne, P Tufféry.

Abstract

Alignment-free methods based on Chaos Game Representation (CGR), also known as sequence signature approaches, have proven of great interest for DNA sequence analysis. They have been successfully applied to sequence comparison, phylogeny, detection of horizontal transfers, and extraction of representative motifs in regulatory sequences. Transposing such methods to proteins raises several fundamental questions related to the dimensionality of the representation space. Several studies have tackled these points, but none has so far brought the application of CGRs to proteins to its full expected potential. Yet several studies have shown that techniques based on n-peptide frequencies can be relevant for proteins. Here, we investigate the effectiveness of a strategy based on the CGR approach using a fixed reverse encoding of amino acids into nucleic sequences. We first explore its relevance to protein classification into functional families. We then attempt to apply it to the prediction of protein structural classes. Our results suggest that the reverse encoding approach could be relevant in both cases. We show that it is able to classify functional families of proteins by extracting signatures close to the ProSite patterns. Applied to structural classification, the approach reaches correct classification scores close to 84%, i.e. close to the scores of related methods in the field. Various optimizations of the approach are still possible, which opens the door to future applications.
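
The general idea described in the abstract (a fixed reverse encoding of amino acids into nucleotides, followed by a CGR-style signature) can be illustrated with a minimal Python sketch. It is not the authors' implementation. The codon table REVERSE_CODE is a hypothetical choice of one valid standard-code codon per amino acid, the corner layout follows one common CGR convention, and the function names reverse_encode, cgr_points and signature are introduced here for illustration only.

```python
from collections import Counter
from itertools import product

# ASSUMPTION: one fixed codon per amino acid. Each codon is valid in the
# standard genetic code, but the choice among synonymous codons is
# arbitrary and is not the table used in the paper.
REVERSE_CODE = {
    "A": "GCT", "R": "CGT", "N": "AAT", "D": "GAT", "C": "TGT",
    "Q": "CAA", "E": "GAA", "G": "GGT", "H": "CAT", "I": "ATT",
    "L": "CTT", "K": "AAA", "M": "ATG", "F": "TTT", "P": "CCT",
    "S": "TCT", "T": "ACT", "W": "TGG", "Y": "TAT", "V": "GTT",
}

# ASSUMPTION: one common corner convention for the CGR unit square.
CORNERS = {"A": (0.0, 0.0), "C": (0.0, 1.0), "G": (1.0, 1.0), "T": (1.0, 0.0)}


def reverse_encode(protein):
    """Map each amino acid to its fixed codon and concatenate the result."""
    return "".join(REVERSE_CODE[aa] for aa in protein.upper())


def cgr_points(dna):
    """Return CGR coordinates: each point lies halfway between the
    previous point and the corner of the current nucleotide."""
    x, y = 0.5, 0.5  # start at the centre of the unit square
    points = []
    for base in dna:
        cx, cy = CORNERS[base]
        x, y = (x + cx) / 2, (y + cy) / 2
        points.append((x, y))
    return points


def signature(dna, k=3):
    """Return the frequency of every k-letter word over overlapping
    windows; this is the word-frequency table that a CGR image encodes
    at resolution k."""
    if len(dna) < k:
        raise ValueError("sequence is shorter than the word length k")
    counts = Counter(dna[i:i + k] for i in range(len(dna) - k + 1))
    total = sum(counts.values())
    words = ("".join(w) for w in product("ACGT", repeat=k))
    return {w: counts[w] / total for w in words}


if __name__ == "__main__":
    dna = reverse_encode("MKV")
    print(dna)
    print(cgr_points(dna)[:2])
    print({w: f for w, f in signature(dna, k=3).items() if f > 0})
```

In such a sketch, two proteins would be compared through a distance between their signature vectors rather than through an alignment. Which word length, distance and classifier suit a given task is left open here.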


Year:  2007        PMID: 18067866     DOI: 10.1016/j.biochi.2007.11.004

Source DB:  PubMed          Journal:  Biochimie        ISSN: 0300-9084            Impact factor:   4.079


