Literature DB >> 16527830

A generalization of the PST algorithm: modeling the sparse nature of protein sequences.

Florencia G Leonardi1.   

Abstract

MOTIVATION: A central problem in genomics is to determine the function of a protein using the information contained in its amino acid sequence. Variable length Markov chains (VLMC) are a promising class of models that can effectively classify proteins into families and they can be estimated in linear time and space.
RESULTS: We introduce a new algorithm, called Sparse Probabilistic Suffix Trees (SPST), that identifies equivalence between the contexts of a VLMC. We show that, in many cases, the identification of these equivalence can improve the classification rate of the classical Probabilistic Suffix Trees (PST) algorithm. We also show that better classification can be achieved by identifying representative fingerprints in the amino acid chains, and this variation in the SPST algorithm is called F-SPST.

Entities:  

Mesh:

Year:  2006        PMID: 16527830     DOI: 10.1093/bioinformatics/btl088

Source DB:  PubMed          Journal:  Bioinformatics        ISSN: 1367-4803            Impact factor:   6.937


  2 in total

1.  Retrieving the structure of probabilistic sequences of auditory stimuli from EEG data.

Authors:  Noslen Hernández; Aline Duarte; Guilherme Ost; Ricardo Fraiman; Antonio Galves; Claudia D Vargas
Journal:  Sci Rep       Date:  2021-02-10       Impact factor: 4.379

2.  Behavioral sequence analysis reveals a novel role for beta2* nicotinic receptors in exploration.

Authors:  Nicolas Maubourguet; Annick Lesne; Jean-Pierre Changeux; Uwe Maskos; Philippe Faure
Journal:  PLoS Comput Biol       Date:  2008-11-21       Impact factor: 4.475

  2 in total

北京卡尤迪生物科技股份有限公司 © 2022-2023.