Detecting cryptically simple protein sequences using the SIMPLE algorithm.

M Mar Albà, Roman A Laskowski, John M Hancock.

Abstract

MOTIVATION: Low-complexity or cryptically simple sequences are widespread in proteins, but their evolution and function are poorly understood. To date, methods for detecting low complexity in proteins have been directed towards filtering such regions prior to sequence homology searches, not towards analysing the regions per se. However, many of these regions are encoded by non-repetitive DNA sequences and may therefore result from selection acting on protein structure and/or function.
RESULTS: We have developed a new tool, based on the SIMPLE algorithm, that facilitates the quantification of the amount of simple sequence in proteins and determines the type of short motifs that show clustering above a certain threshold. By modifying the sensitivity of the program, simple sequence content can be studied at various levels, from highly organised tandem structures to complex combinations of repeats. We compare the relative amount of simplicity in different functional groups of yeast proteins and determine the level of clustering of the different amino acids in these proteins.
AVAILABILITY: The program is available on request or online at http://www.biochem.ucl.ac.uk/bsm/SIMPLE.
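
The idea of scoring short motifs by how strongly they cluster, and comparing that against a randomised baseline, can be illustrated with a minimal sketch. This is not the published SIMPLE implementation: the function names, the window size, the counting rule, and the shuffling baseline below are all illustrative assumptions chosen to keep the example short.

```python
import random
from collections import Counter


def clustering_scores(seq, motif_len=2, window=10):
    """Score each motif start by how often the same motif recurs nearby.

    Hypothetical scoring rule: count identical motifs starting within
    `window` positions on either side, excluding the position itself.
    """
    n = len(seq) - motif_len + 1
    scores = []
    for i in range(max(n, 0)):
        motif = seq[i:i + motif_len]
        lo, hi = max(0, i - window), min(n, i + window + 1)
        scores.append(sum(
            1 for j in range(lo, hi)
            if j != i and seq[j:j + motif_len] == motif
        ))
    return scores


def clustered_motifs(seq, threshold, motif_len=2, window=10):
    """Tally the motifs whose clustering score reaches `threshold`."""
    scores = clustering_scores(seq, motif_len, window)
    return Counter(
        seq[i:i + motif_len] for i, s in enumerate(scores) if s >= threshold
    )


def relative_simplicity(seq, motif_len=2, window=10, n_shuffles=50, seed=0):
    """Return (observed total score, mean total score of shuffled copies).

    Shuffling keeps amino acid composition fixed, so a large gap between
    the two numbers reflects residue order rather than composition alone.
    """
    rng = random.Random(seed)
    observed = sum(clustering_scores(seq, motif_len, window))
    residues = list(seq)
    totals = []
    for _ in range(n_shuffles):
        rng.shuffle(residues)
        totals.append(sum(clustering_scores("".join(residues), motif_len, window)))
    return observed, sum(totals) / n_shuffles
```

In this sketch, raising `threshold` or shortening `window` plays the role of lowering sensitivity: only tightly repeated motifs are reported. A lower threshold or wider window also picks up looser, more cryptic combinations of repeats.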

Substances: Proteins

Year: 2002 PMID: 12050063 DOI: 10.1093/bioinformatics/18.5.672

Source DB: PubMed Journal: Bioinformatics ISSN: 1367-4803 Impact factor: 6.937

Cited

24 in total

1. Complexity: an internet resource for analysis of DNA sequence complexity.

Authors: Y L Orlov; V N Potapov
Journal: Nucleic Acids Res Date: 2004-07-01 Impact factor: 16.971

2. Neurological proteins are not enriched for repetitive sequences.

Authors: Melanie A Huntley; G Brian Golding
Journal: Genetics Date: 2004-03 Impact factor: 4.562

3. TCP transcription factors predate the emergence of land plants.

Authors: Olivier Navaud; Patrick Dabos; Elodie Carnus; Dominique Tremousaygue; Christine Hervé
Journal: J Mol Evol Date: 2007-06-12 Impact factor: 2.395

4. HighSSR: high-throughput SSR characterization and locus development from next-gen sequencing data.

Authors: Alexander Churbanov; Rachael Ryan; Nabeeh Hasan; Donovan Bailey; Haofeng Chen; Brook Milligan; Peter Houde
Journal: Bioinformatics Date: 2012-09-06 Impact factor: 6.937

5. Functional insights from the distribution and role of homopeptide repeat-containing proteins.

Authors: Noel G Faux; Stephen P Bottomley; Arthur M Lesk; James A Irving; John R Morrison; Maria Garcia de la Banda; James C Whisstock
Journal: Genome Res Date: 2005-04 Impact factor: 9.043

6. The origin of conserved protein domains and amino acid repeats via adaptive competition for control over amino acid residues.

Authors: Mary M Rorick; Günter P Wagner
Journal: J Mol Evol Date: 2009-12-19 Impact factor: 2.395

7. Composition-modified matrices improve identification of homologs of Saccharomyces cerevisiae low-complexity glycoproteins.

Authors: Juan E Coronado; Oliver Attie; Susan L Epstein; Wei-Gang Qiu; Peter N Lipke
Journal: Eukaryot Cell Date: 2006-04

8. Discovery of Recurrent Sequence Motifs in Saccharomyces cerevisiae Cell Wall Proteins.

Authors: Juan E Coronado; Susan L Epstein; Wei-Gang Qiu; Peter N Lipke
Journal: Match (Mulh) Date: 2007 Impact factor: 2.497

9. Organization and evolution of a gene-rich region of the mouse genome: a 12.7-Mb region deleted in the Del(13)Svea36H mouse.

Authors: Ann-Marie Mallon; Laurens Wilming; Joseph Weekes; James G R Gilbert; Jennifer Ashurst; Sandrine Peyrefitte; Lucy Matthews; Matthew Cadman; Richard McKeone; Chris A Sellick; Ruth Arkell; Marc R M Botcherby; Mark A Strivens; R Duncan Campbell; Simon Gregory; Paul Denny; John M Hancock; Jane Rogers; Steve D M Brown
Journal: Genome Res Date: 2004-09-13 Impact factor: 9.043

10. Tandem and cryptic amino acid repeats accumulate in disordered regions of proteins.

Authors: Michelle Simon; John M Hancock
Journal: Genome Biol Date: 2009-06-01 Impact factor: 13.583