
Detecting cryptically simple protein sequences using the SIMPLE algorithm.

M Mar Albà, Roman A Laskowski, John M Hancock.

Abstract

MOTIVATION: Low-complexity or cryptically simple sequences are widespread in protein sequences, but their evolution and function are poorly understood. To date, methods for the detection of low complexity in proteins have been directed towards the filtering of such regions prior to sequence homology searches but not to the analysis of the regions per se. However, many of these regions are encoded by non-repetitive DNA sequences and may therefore result from selection acting on protein structure and/or function.
RESULTS: We have developed a new tool, based on the SIMPLE algorithm, that facilitates the quantification of the amount of simple sequence in proteins and determines the type of short motifs that show clustering above a certain threshold. By modifying the sensitivity of the program, simple sequence content can be studied at various levels, from highly organised tandem structures to complex combinations of repeats. We compare the relative amount of simplicity in different functional groups of yeast proteins and determine the level of clustering of the different amino acids in these proteins.
AVAILABILITY: The program is available on request or online at http://www.biochem.ucl.ac.uk/bsm/SIMPLE.
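
The idea of motifs "clustering above a certain threshold" can be illustrated with a minimal sketch. This is not the SIMPLE program or its scoring scheme, whose details the abstract does not give. It only assumes that clustering means a short motif recurring within a local window more often than in shuffled copies of the same sequence. The function names, the window size `flank`, the number of shuffles, and the 95% quantile are all hypothetical choices made for illustration.

```python
import random
from collections import Counter


def cluster_scores(seq, k=1, flank=10):
    """For each start position, count how many other copies of the k-mer
    found there start within `flank` positions on either side."""
    n = len(seq) - k + 1
    scores = []
    for i in range(n):
        motif = seq[i:i + k]
        lo = max(0, i - flank)
        hi = min(n, i + flank + 1)
        scores.append(
            sum(1 for j in range(lo, hi) if j != i and seq[j:j + k] == motif)
        )
    return scores


def shuffled_threshold(seq, k=1, flank=10, n_shuffles=100, quantile=0.95, seed=0):
    """Estimate a score cutoff from shuffled copies of the sequence.
    Shuffling preserves composition but destroys local clustering.
    (Illustrative assumption, not the published procedure.)"""
    rng = random.Random(seed)
    residues = list(seq)
    pooled = []
    for _ in range(n_shuffles):
        rng.shuffle(residues)
        pooled.extend(cluster_scores("".join(residues), k, flank))
    pooled.sort()
    return pooled[min(len(pooled) - 1, int(quantile * len(pooled)))]


def clustered_motifs(seq, k=1, flank=10, **kwargs):
    """Return the cutoff and a count of motifs whose local score exceeds it."""
    threshold = shuffled_threshold(seq, k, flank, **kwargs)
    scores = cluster_scores(seq, k, flank)
    hits = Counter(seq[i:i + k] for i, s in enumerate(scores) if s > threshold)
    return threshold, hits


if __name__ == "__main__":
    # Made-up sequence with a glutamine run, for demonstration only.
    example = "MKTAYLSDEVRQQQQQQQQQQQQHGLPWNTSACDEFIK"
    cutoff, motifs = clustered_motifs(example, k=1, flank=10)
    print("cutoff:", cutoff)
    print("clustered motifs:", dict(motifs))
```

Raising or lowering the quantile in this sketch plays a role loosely analogous to the sensitivity setting mentioned in the abstract: a stricter cutoff should retain only tightly repeated motifs, while a looser one also admits more diffuse clustering.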


Year:  2002        PMID: 12050063     DOI: 10.1093/bioinformatics/18.5.672

Source DB:  PubMed          Journal:  Bioinformatics        ISSN: 1367-4803            Impact factor:   6.937


  24 in total

1.  Complexity: an internet resource for analysis of DNA sequence complexity.

Authors:  Y L Orlov; V N Potapov
Journal:  Nucleic Acids Res       Date:  2004-07-01       Impact factor: 16.971

2.  Neurological proteins are not enriched for repetitive sequences.

Authors:  Melanie A Huntley; G Brian Golding
Journal:  Genetics       Date:  2004-03       Impact factor: 4.562

3.  TCP transcription factors predate the emergence of land plants.

Authors:  Olivier Navaud; Patrick Dabos; Elodie Carnus; Dominique Tremousaygue; Christine Hervé
Journal:  J Mol Evol       Date:  2007-06-12       Impact factor: 2.395

4.  HighSSR: high-throughput SSR characterization and locus development from next-gen sequencing data.

Authors:  Alexander Churbanov; Rachael Ryan; Nabeeh Hasan; Donovan Bailey; Haofeng Chen; Brook Milligan; Peter Houde
Journal:  Bioinformatics       Date:  2012-09-06       Impact factor: 6.937

5.  Functional insights from the distribution and role of homopeptide repeat-containing proteins.

Authors:  Noel G Faux; Stephen P Bottomley; Arthur M Lesk; James A Irving; John R Morrison; Maria Garcia de la Banda; James C Whisstock
Journal:  Genome Res       Date:  2005-04       Impact factor: 9.043

6.  The origin of conserved protein domains and amino acid repeats via adaptive competition for control over amino acid residues.

Authors:  Mary M Rorick; Günter P Wagner
Journal:  J Mol Evol       Date:  2009-12-19       Impact factor: 2.395

7.  Composition-modified matrices improve identification of homologs of Saccharomyces cerevisiae low-complexity glycoproteins.

Authors:  Juan E Coronado; Oliver Attie; Susan L Epstein; Wei-Gang Qiu; Peter N Lipke
Journal:  Eukaryot Cell       Date:  2006-04

8.  Discovery of Recurrent Sequence Motifs in Saccharomyces cerevisiae Cell Wall Proteins.

Authors:  Juan E Coronado; Susan L Epstein; Wei-Gang Qiu; Peter N Lipke
Journal:  Match (Mulh)       Date:  2007       Impact factor: 2.497

9.  Organization and evolution of a gene-rich region of the mouse genome: a 12.7-Mb region deleted in the Del(13)Svea36H mouse.

Authors:  Ann-Marie Mallon; Laurens Wilming; Joseph Weekes; James G R Gilbert; Jennifer Ashurst; Sandrine Peyrefitte; Lucy Matthews; Matthew Cadman; Richard McKeone; Chris A Sellick; Ruth Arkell; Marc R M Botcherby; Mark A Strivens; R Duncan Campbell; Simon Gregory; Paul Denny; John M Hancock; Jane Rogers; Steve D M Brown
Journal:  Genome Res       Date:  2004-09-13       Impact factor: 9.043

10.  Tandem and cryptic amino acid repeats accumulate in disordered regions of proteins.

Authors:  Michelle Simon; John M Hancock
Journal:  Genome Biol       Date:  2009-06-01       Impact factor: 13.583

