Simple sequences are rare in the Protein Data Bank.

Abstract

Simple sequences are abundant in the proteins that have been sequenced to date. However, unusual protein features such as simple sequences are not present at the same high frequency in structural databases. A subset of these simple sequences, a group with a highly repetitive nature, has been shown to be abundant in eukaryotes but not in prokaryotes. In this study, an examination of the eukaryotic proteins in the Protein Data Bank (PDB) reveals a large deficiency of low-complexity, highly repetitive protein repeats. Using simulated databases built from similar samples of eukaryotic proteins taken from the National Center for Biotechnology Information (NCBI) database, it is shown that the PDB contains significantly fewer highly repetitive simple sequences than artificial databases of similar composition randomly derived from NCBI. When the structural data for the few PDB sequences that do contain highly repetitive simple sequences are examined in detail, the tertiary structure is, in most cases, unknown for the regions consisting of simple sequence. This lack of simple sequences, both in the PDB and in the structural information, suggests that this type of simple sequence may produce disordered structures that make structural characterization difficult. Copyright 2002 Wiley-Liss, Inc.
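
The comparison described in the abstract, a structural set measured against random samples of similar size drawn from a larger sequence database, can be illustrated with a small resampling sketch. The abstract does not state how simple sequences were detected or how the simulated databases were scored, so everything below is an assumption made for illustration: the sliding-window Shannon entropy detector, the window length, the entropy threshold, and the function names are all hypothetical and are not the authors' method.

```python
import math
import random
from collections import Counter


def shannon_entropy(window: str) -> float:
    """Shannon entropy (in bits) of the residue composition of a window."""
    counts = Counter(window)
    total = len(window)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def has_simple_sequence(seq: str, window: int = 12, threshold: float = 1.5) -> bool:
    """Return True if any window has entropy at or below the threshold.

    The window length and threshold are illustrative assumptions,
    not values taken from the paper.
    """
    if len(seq) < window:
        return False
    return any(
        shannon_entropy(seq[i:i + window]) <= threshold
        for i in range(len(seq) - window + 1)
    )


def count_simple(seqs) -> int:
    """Number of sequences that contain at least one low-entropy window."""
    return sum(has_simple_sequence(s) for s in seqs)


def resampling_p_value(structural, reference, n_trials: int = 1000, seed: int = 0) -> float:
    """Fraction of random reference samples with as few or fewer
    simple-sequence proteins than the structural set.

    Each sample has the same size as the structural set, so the reference
    set must be at least that large.
    """
    rng = random.Random(seed)
    observed = count_simple(structural)
    # Classify each reference sequence once, then resample the flags.
    flags = [has_simple_sequence(s) for s in reference]
    hits = 0
    for _ in range(n_trials):
        sample = rng.sample(flags, len(structural))
        if sum(sample) <= observed:
            hits += 1
    return hits / n_trials
```

A small returned fraction would indicate that the structural set contains fewer simple-sequence proteins than expected from random draws of the reference set. A real analysis would also need to match composition and handle redundancy between entries, which this sketch ignores.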

Substances: Proteins

Year: 2002 PMID: 12012345 DOI: 10.1002/prot.10150

Source DB: PubMed Journal: Proteins ISSN: 0887-3585

Cited by (43 in total)

1. On the properties and sequence context of structurally ambivalent fragments in proteins.

Authors: Igor B Kuznetsov; S Rackovsky
Journal: Protein Sci Date: 2003-11 Impact factor: 6.725

2. Neurological proteins are not enriched for repetitive sequences.

Authors: Melanie A Huntley; G Brian Golding
Journal: Genetics Date: 2004-03 Impact factor: 4.562

3. Natural selection drives the accumulation of amino acid tandem repeats in human proteins.

Authors: Loris Mularoni; Alice Ledda; Macarena Toll-Riera; M Mar Albà
Journal: Genome Res Date: 2010-03-24 Impact factor: 9.043

4. Unexpected features of the dark proteome.

Authors: Nelson Perdigão; Julian Heinrich; Christian Stolte; Kenneth S Sabir; Michael J Buckley; Bruce Tabor; Beth Signal; Brian S Gloss; Christopher J Hammang; Burkhard Rost; Andrea Schafferhans; Seán I O'Donoghue
Journal: Proc Natl Acad Sci U S A Date: 2015-11-17 Impact factor: 11.205

5. Effect of low-complexity regions on protein structure determination.

Authors: Ryan M Bannen; Craig A Bingman; George N Phillips
Journal: J Struct Funct Genomics Date: 2008-02-27

6. RCPdb: An evolutionary classification and codon usage database for repeat-containing proteins.

Authors: Noel G Faux; Gavin A Huttley; Khalid Mahmood; Geoffrey I Webb; Maria Garcia de la Banda; James C Whisstock
Journal: Genome Res Date: 2007-06-13 Impact factor: 9.043

7. Genome-wide evidence for selection acting on single amino acid repeats.

Authors: Wilfried Haerty; G Brian Golding
Journal: Genome Res Date: 2010-01-07 Impact factor: 9.043

8. Genome-wide analysis of histidine repeats reveals their role in the localization of human proteins to the nuclear speckles compartment.

Authors: Eulàlia Salichs; Alice Ledda; Loris Mularoni; M Mar Albà; Susana de la Luna
Journal: PLoS Genet Date: 2009-03-06 Impact factor: 5.917

9. Low-complexity regions within protein sequences have position-dependent roles.

Authors: Alain Coletta; John W Pinney; David Y Weiss Solís; James Marsh; Steve R Pettifer; Teresa K Attwood
Journal: BMC Syst Biol Date: 2010-04-13

10. Trinucleotide repeats in human genome and exome.

Authors: Piotr Kozlowski; Mateusz de Mezer; Wlodzimierz J Krzyzosiak
Journal: Nucleic Acids Res Date: 2010-03-09 Impact factor: 16.971