
Discovering patterns and subfamilies in biosequences.

A Brazma, I Jonassen, E Ukkonen, J Vilo.

Abstract

We consider the problem of automatically discovering patterns and the corresponding subfamilies in a set of biosequences. The sequences are unaligned and may contain an unknown level of noise. The patterns are of the type used in the PROSITE database. In our approach we discover the patterns and their respective subfamilies simultaneously. We develop a theoretically substantiated significance measure for a set of such patterns, and an algorithm that approximates the best pattern set and the subfamilies. The approach is based on the minimum description length (MDL) principle. We report a computational experiment that correctly finds subfamilies in the family of chromo domains and reveals new strong patterns.
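To make the terms in the abstract concrete, the following is a minimal sketch of what a PROSITE-type pattern is and how a fixed set of such patterns splits sequences into subfamilies, with unmatched sequences treated as noise. It is not the authors' MDL-based discovery algorithm, which the abstract does not specify. The function names `prosite_to_regex` and `assign_subfamilies`, the first-match assignment rule, and the toy sequences are all assumptions made for illustration.

```python
import re

def prosite_to_regex(pattern: str) -> str:
    """Translate a PROSITE-style pattern into a Python regular expression.

    Handles the common syntax: '-' separators, 'x' wildcards, [..] allowed
    sets, {..} forbidden sets, (n) / (n,m) repeats, and '<' / '>' anchors.
    """
    parts = []
    for element in pattern.rstrip(".").split("-"):
        prefix = suffix = ""
        if element.startswith("<"):          # N-terminal anchor
            prefix, element = "^", element[1:]
        if element.endswith(">"):            # C-terminal anchor
            suffix, element = "$", element[:-1]
        # Split the element into its core and an optional repeat count.
        m = re.fullmatch(r"(.+?)(?:\((\d+)(?:,(\d+))?\))?", element)
        core, lo, hi = m.group(1), m.group(2), m.group(3)
        if core == "x":                      # any residue
            body = "."
        elif core.startswith("{"):           # any residue except these
            body = "[^" + core[1:-1] + "]"
        else:                                # single residue or [..] set
            body = core
        if lo is not None:
            body += "{" + lo + ("," + hi if hi else "") + "}"
        parts.append(prefix + body + suffix)
    return "".join(parts)

def assign_subfamilies(sequences, patterns):
    """Assign each sequence to the first pattern it matches (an assumed rule).

    Returns (groups, unmatched): a dict from pattern to its sequences, and
    the list of sequences matched by no pattern (treated here as noise).
    """
    compiled = [(p, re.compile(prosite_to_regex(p))) for p in patterns]
    groups = {p: [] for p in patterns}
    unmatched = []
    for seq in sequences:
        for p, rx in compiled:
            if rx.search(seq):
                groups[p].append(seq)
                break
        else:
            unmatched.append(seq)
    return groups, unmatched
```

With the toy patterns `A-x(2)-C` and `G-[ST]-x-K`, the sequences `MMAQQCMM` and `LLGSPKLL` fall into separate groups, while `WWWWWW` is left unmatched. The paper's contribution is choosing the pattern set itself by an MDL-based significance measure, which this sketch takes as given.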


Year:  1996        PMID: 8877502

Source DB:  PubMed          Journal:  Proc Int Conf Intell Syst Mol Biol        ISSN: 1553-0833


  4 in total

1.  U2AF homology motifs: protein recognition in the RRM world. (Review)

Authors:  Clara L Kielkopf; Stephan Lücke; Michael R Green
Journal:  Genes Dev       Date:  2004-07-01       Impact factor: 11.361

2.  Discovery of phosphorylation motif mixtures in phosphoproteomics data.

Authors:  Anna Ritz; Gregory Shakhnarovich; Arthur R Salomon; Benjamin J Raphael
Journal:  Bioinformatics       Date:  2008-11-07       Impact factor: 6.937

3.  A systems biology approach to transcription factor binding site prediction.

Authors:  Xiang Zhou; Pavel Sumazin; Presha Rajbhandari; Andrea Califano
Journal:  PLoS One       Date:  2010-03-26       Impact factor: 3.240

4.  Evaluating deterministic motif significance measures in protein databases.

Authors:  Pedro Gabriel Ferreira; Paulo J Azevedo
Journal:  Algorithms Mol Biol       Date:  2007-12-24       Impact factor: 1.405

