Non-globular domains in protein sequences: automated segmentation using complexity measures.

J C Wootton.

Abstract

Computational methods based on mathematically-defined measures of compositional complexity have been developed to distinguish globular and non-globular regions of protein sequences. Compact globular structures in protein molecules are shown to be determined by amino acid sequences of high informational complexity. Sequences of known crystal structure in the Brookhaven Protein Data Bank differ only slightly from randomly shuffled sequences in the distribution of statistical properties such as local compositional complexity. In contrast, in the much larger body of deduced sequences in the SWISS-PROT database, approximately one quarter of the residues occur in segments of non-randomly low complexity and approximately half of the entries contain at least one such segment. Sequences of proteins with known, physicochemically-defined non-globular regions have been analyzed, including collagens, different classes of coiled-coil proteins, elastins, histones, non-histone proteins, mucins, proteoglycan core proteins and proteins containing long single solvent-exposed alpha-helices. The SEG algorithm provides an effective general method for partitioning the globular and non-globular regions of these sequences fully automatically. This method is also facilitating the discovery of new classes of long, non-globular sequence segments, as illustrated by the example of the human CAN gene product involved in tumor induction.
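The idea of a local compositional complexity measure can be illustrated with a short sketch. This is not the SEG algorithm as published; it is a simplified stand-in that scores each sliding window by the Shannon entropy of its residue composition and reports the regions covered by low-scoring windows. The function names, the window length of 12 and the threshold of 2.2 bits are illustrative assumptions, not values taken from the abstract.

```python
import math
from collections import Counter


def window_entropy(window: str) -> float:
    """Shannon entropy (in bits) of the residue composition of one window."""
    counts = Counter(window)
    n = len(window)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


def low_complexity_segments(seq: str, window: int = 12, threshold: float = 2.2):
    """Return half-open (start, end) ranges covered by low-entropy windows.

    Assumption: `window` and `threshold` are illustrative defaults chosen
    for this sketch, not parameters stated in the source abstract.
    """
    flagged = [False] * len(seq)
    # Mark every residue that falls inside at least one low-entropy window.
    for i in range(len(seq) - window + 1):
        if window_entropy(seq[i:i + window]) <= threshold:
            for j in range(i, i + window):
                flagged[j] = True

    # Merge consecutive flagged residues into contiguous segments.
    segments = []
    start = None
    for i, is_low in enumerate(flagged):
        if is_low and start is None:
            start = i
        elif not is_low and start is not None:
            segments.append((start, i))
            start = None
    if start is not None:
        segments.append((start, len(seq)))
    return segments


if __name__ == "__main__":
    # Hypothetical test sequence: a poly-Q run flanked by diverse residues.
    example = "ACDEFGHIKLMN" + "Q" * 12 + "ACDEFGHIKLMN"
    print(low_complexity_segments(example))
```

A single fixed threshold is the crudest possible partitioning rule; it is used here only to show how a complexity score along the sequence can separate a compositionally biased stretch from its more diverse flanks.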

Year:  1994        PMID: 7952898     DOI: 10.1016/0097-8485(94)85023-2

Source DB:  PubMed          Journal:  Comput Chem        ISSN: 0097-8485


  182 in total

1.  Treble clef finger--a functionally diverse zinc-binding structural motif.

Authors:  N V Grishin
Journal:  Nucleic Acids Res       Date:  2001-04-15       Impact factor: 16.971

2.  A superfamily of archaeal, bacterial, and eukaryotic proteins homologous to animal transglutaminases.

Authors:  K S Makarova; L Aravind; E V Koonin
Journal:  Protein Sci       Date:  1999-08       Impact factor: 6.725

3.  Expectations from structural genomics.

Authors:  S E Brenner; M Levitt
Journal:  Protein Sci       Date:  2000-01       Impact factor: 6.725

4.  Comparative genome analysis of the pathogenic spirochetes Borrelia burgdorferi and Treponema pallidum.

Authors:  G Subramanian; E V Koonin; L Aravind
Journal:  Infect Immun       Date:  2000-03       Impact factor: 3.441

5.  Genome analysis: Assigning protein coding regions to three-dimensional structures.

Authors:  A A Salamov; M Suwa; C A Orengo; M B Swindells
Journal:  Protein Sci       Date:  1999-04       Impact factor: 6.725

6.  From complete genomes to measures of substitution rate variability within and between proteins.

Authors:  N V Grishin; Y I Wolf; E V Koonin
Journal:  Genome Res       Date:  2000-07       Impact factor: 9.043

7.  Common fold in helix-hairpin-helix proteins.

Authors:  X Shao; N V Grishin
Journal:  Nucleic Acids Res       Date:  2000-07-15       Impact factor: 16.971

Review 8.  Natively unfolded proteins: a point where biology waits for physics.

Authors:  Vladimir N Uversky
Journal:  Protein Sci       Date:  2002-04       Impact factor: 6.725

9.  Proteomics of Mycoplasma genitalium: identification and characterization of unannotated and atypical proteins in a small model genome.

Authors:  S Balasubramanian; T Schneider; M Gerstein; L Regan
Journal:  Nucleic Acids Res       Date:  2000-08-15       Impact factor: 16.971

10.  Phase transition of spindle-associated protein regulate spindle apparatus assembly.

Authors:  Hao Jiang; Shusheng Wang; Yuejia Huang; Xiaonan He; Honggang Cui; Xueliang Zhu; Yixian Zheng
Journal:  Cell       Date:  2015-09-17       Impact factor: 41.582
