Automated search of natively folded protein fragments for high-throughput structure determination in structural genomics.

Y Kuroda, K Tani, Y Matsuo, S Yokoyama.

Abstract

Structural genomic projects envision almost routine protein structure determinations, which are currently imaginable only for small proteins with molecular weights below 25,000 Da. For larger proteins, structural insight can be obtained by breaking them into small segments of amino acid sequences that can fold into native structures, even when isolated from the rest of the protein. Such segments are autonomously folding units (AFU) and have sizes suitable for fast structural analyses. Here, we propose to expand an intuitive procedure often employed for identifying biologically important domains to an automatic method for detecting putative folded protein fragments. The procedure is based on the recognition that large proteins can be regarded as a combination of independent domains conserved among diverse organisms. We thus have developed a program that reorganizes the output of BLAST searches and detects regions with a large number of similar sequences. To automate the detection process, it is reduced to a simple geometrical problem of recognizing rectangular shaped elevations in a graph that plots the number of similar sequences at each residue of a query sequence. We used our program to quantitatively corroborate the premise that segments with conserved sequences correspond to domains that fold into native structures. We applied our program to a test data set composed of 99 amino acid sequences containing 150 segments with structures listed in the Protein Data Bank, and thus known to fold into native structures. Overall, the fragments identified by our program have an almost 50% probability of forming a native structure, and comparable results are observed with sequences containing domain linkers classified in SCOP. Furthermore, we verified that our program identifies AFU in libraries from various organisms, and we found a significant number of AFU candidates for structural analysis, covering an estimated 5 to 20% of the genomic databases. 
Altogether, these results argue that methods based on sequence similarity can be useful for dissecting large proteins into small autonomously folding domains, and such methods may provide an efficient support to structural genomics projects.
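The geometric idea in the abstract (count the similar sequences covering each residue of the query, then look for rectangular-shaped elevations in that profile) can be illustrated with a minimal Python sketch. This is not the authors' program: the function names, the `(start, end)` hit format, and the `min_hits` and `min_len` thresholds are all assumptions made for illustration, and a plain threshold test stands in for whatever rectangle-recognition criterion the paper actually uses.

```python
from typing import List, Tuple


def coverage_profile(query_len: int, hits: List[Tuple[int, int]]) -> List[int]:
    """Return, for each query residue, the number of hits covering it.

    `hits` holds hypothetical (start, end) query coordinates, 1-based and
    inclusive, such as those that could be parsed from BLAST alignments.
    """
    diff = [0] * (query_len + 2)  # difference array, index 0 unused
    for start, end in hits:
        start, end = max(1, start), min(query_len, end)
        if start > end:
            continue  # hit lies entirely outside the query
        diff[start] += 1
        diff[end + 1] -= 1
    profile, running = [], 0
    for pos in range(1, query_len + 1):
        running += diff[pos]
        profile.append(running)
    return profile


def find_elevations(profile: List[int], min_hits: int, min_len: int) -> List[Tuple[int, int]]:
    """Return (start, end) segments, 1-based inclusive, where the profile
    stays at or above `min_hits` for at least `min_len` residues.

    Both thresholds are illustrative assumptions, not values from the paper.
    """
    segments = []
    seg_start = None
    for pos, count in enumerate(profile, start=1):
        if count >= min_hits:
            if seg_start is None:
                seg_start = pos  # an elevation begins here
        elif seg_start is not None:
            if pos - seg_start >= min_len:
                segments.append((seg_start, pos - 1))
            seg_start = None
    # Close an elevation that runs to the end of the query.
    if seg_start is not None and len(profile) - seg_start + 1 >= min_len:
        segments.append((seg_start, len(profile)))
    return segments


if __name__ == "__main__":
    # Toy data: three overlapping hits and one short, isolated hit.
    toy_hits = [(3, 10), (4, 10), (3, 11), (15, 16)]
    toy_profile = coverage_profile(20, toy_hits)
    print(find_elevations(toy_profile, min_hits=2, min_len=5))  # [(3, 10)]
```

In this toy case the three overlapping hits form one elevation spanning residues 3 to 10, while the isolated hit at residues 15 to 16 is ignored because it is covered by too few sequences. A real analysis would also have to decide how to treat sloping edges and nested plateaus, which this sketch does not attempt.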

Year: 2000 PMID: 11206052 PMCID: PMC2144534 DOI: 10.1110/ps.9.12.2313

Source DB: PubMed Journal: Protein Sci ISSN: 0961-8368 Impact factor: 6.725
