Availability of short amino acid sequences in proteins.

Joji M Otaki, Shunsuke Ienaka, Tomonori Gotoh, Haruhiko Yamamoto.

Abstract

Protein databases are receiving much attention as an important information source for proteome research. Although used extensively for similarity searches, the databases themselves have not been fully characterized. In a systematic attempt to identify database characteristics that could help reveal how protein chains are constructed, frequency distributions of all possible combinatorial sets of three, four, and five amino acids ("triplets," "quartets," and "pentats"; collectively called constituent sequences) were examined in the nonredundant (nr) protein database. The analysis demonstrated a nonrandom bias in their "availability" at the population level. Pentats that do not occur in the database were identified; their availability in biological proteins is low relative to their expected probabilities of occurrence. Among them, six representative ones were successfully synthesized as peptides with reasonably high yields by a conventional Fmoc method, which excludes the possibility that a putative physicochemical energy barrier to their formation is a direct cause of the low availability. They were also expressed as soluble fusion proteins in a conventional Escherichia coli BL21Star(DE3) system with reasonably high yield, which likewise excludes a possible difficulty in their biological synthesis. Together, these results suggest that information on the three-dimensional structures and functions of proteins exists in the context of connections between short constituent sequences, and that proteins are composed of evolutionarily selected constituent sequences, a selection reflected in their availability differences in the database. These results may have biological implications for protein structural studies.
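The counting step described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' method: the abstract does not define how "availability" or the expected probability of occurrence was computed, so the null model here (residues drawn independently from the overall amino acid composition) and the function names `count_kmers`, `expected_counts`, and `observed_over_expected` are assumptions made for illustration only.

```python
from collections import Counter
from itertools import product
from math import prod

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def count_kmers(sequences, k):
    """Count every overlapping window of k residues across all sequences."""
    counts = Counter()
    for seq in sequences:
        for i in range(len(seq) - k + 1):
            window = seq[i:i + k]
            # Skip windows containing non-standard letters such as X or B.
            if all(ch in AMINO_ACIDS for ch in window):
                counts[window] += 1
    return counts


def expected_counts(sequences, k):
    """Expected k-mer counts under an ASSUMED null model: residues are
    independent and follow the overall composition of the input."""
    residue_counts = Counter(
        ch for seq in sequences for ch in seq if ch in AMINO_ACIDS
    )
    total_residues = sum(residue_counts.values())
    freq = {aa: residue_counts[aa] / total_residues for aa in AMINO_ACIDS}
    total_windows = sum(count_kmers(sequences, k).values())
    return {
        "".join(kmer): total_windows * prod(freq[aa] for aa in kmer)
        for kmer in product(AMINO_ACIDS, repeat=k)
    }


def observed_over_expected(sequences, k):
    """Ratio of observed to expected counts for every k-mer whose
    expected count is non-zero (a hypothetical stand-in for a bias score)."""
    observed = count_kmers(sequences, k)
    expected = expected_counts(sequences, k)
    return {
        kmer: observed.get(kmer, 0) / exp
        for kmer, exp in expected.items()
        if exp > 0
    }
```

Calling `observed_over_expected(sequences, 5)` on a list of protein strings enumerates all 3.2 million possible pentats; those with a ratio of zero are absent from the input despite a non-zero expected count. For a database the size of nr, a streaming or disk-backed counter would be needed instead of in-memory dictionaries.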

Year:  2005        PMID: 15689510      PMCID: PMC2279279          DOI: 10.1110/ps.041092605

Source DB:  PubMed          Journal:  Protein Sci        ISSN: 0961-8368            Impact factor:   6.725


  29 in total

1.  On the significance of alternating patterns of polar and non-polar residues in beta-strands.

Authors:  Yael Mandel-Gutfreund; Lydia M Gregoret
Journal:  J Mol Biol       Date:  2002-10-25       Impact factor: 5.469

Review 2.  Biomedical informatics for proteomics.

Authors:  Mark S Boguski; Martin W McIntosh
Journal:  Nature       Date:  2003-03-13       Impact factor: 49.962

3.  Apparent loss-of-function mutant GPCRs revealed as constitutively desensitized receptors.

Authors:  Alyson M Wilbanks; Stéphane A Laporte; Laura M Bohn; Larry S Barak; Marc G Caron
Journal:  Biochemistry       Date:  2002-10-08       Impact factor: 3.162

4.  The web server of IBM's Bioinformatics and Pattern Discovery group: 2004 update.

Authors:  Tien Huynh; Isidore Rigoutsos
Journal:  Nucleic Acids Res       Date:  2004-07-01       Impact factor: 16.971

5.  Prediction of protein conformation.

Authors:  P Y Chou; G D Fasman
Journal:  Biochemistry       Date:  1974-01-15       Impact factor: 3.162

6.  Algorithms for prediction of alpha-helical and beta-structural regions in globular proteins.

Authors:  V I Lim
Journal:  J Mol Biol       Date:  1974-10-05       Impact factor: 5.469

7.  Arginine as an evolutionary intruder into protein synthesis.

Authors:  T H Jukes
Journal:  Biochem Biophys Res Commun       Date:  1973-08-06       Impact factor: 3.575

Review 8.  Conformation of polypeptides and proteins.

Authors:  G N Ramachandran; V Sasisekharan
Journal:  Adv Protein Chem       Date:  1968

9.  Non-Darwinian evolution.

Authors:  J L King; T H Jukes
Journal:  Science       Date:  1969-05-16       Impact factor: 47.728

10.  Amino acid composition of proteins: Selection against the genetic code.

Authors:  T H Jukes; R Holmquist; H Moise
Journal:  Science       Date:  1975-07-04       Impact factor: 47.728

  10 in total

1.  Forbidden penta-peptides.

Authors:  Tamir Tuller; Benny Chor; Nathan Nelson
Journal:  Protein Sci       Date:  2007-10       Impact factor: 6.725

2.  Amino acid sequence repertoire of the bacterial proteome and the occurrence of untranslatable sequences.

Authors:  Sharon Penias Navon; Guy Kornberg; Jin Chen; Tali Schwartzman; Albert Tsai; Elisabetta Viani Puglisi; Joseph D Puglisi; Noam Adir
Journal:  Proc Natl Acad Sci U S A       Date:  2016-06-15       Impact factor: 11.205

3.  Pentamers not found in the universal proteome can enhance antigen specific immune responses and adjuvant vaccines.

Authors:  Ami Patel; Jessica C Dong; Brett Trost; Jason S Richardson; Sarah Tohme; Shawn Babiuk; Anthony Kusalik; Sam K P Kung; Gary P Kobinger
Journal:  PLoS One       Date:  2012-08-24       Impact factor: 3.240

Review 4.  A frequency-based linguistic approach to protein decoding and design: Simple concepts, diverse applications, and the SCS Package.

Authors:  Kenta Motomura; Morikazu Nakamura; Joji M Otaki
Journal:  Comput Struct Biotechnol J       Date:  2013-03-29       Impact factor: 7.271

5.  Global pentapeptide statistics are far away from expected distributions.

Authors:  Jarosław Poznański; Jan Topiński; Anna Muszewska; Konrad J Dębski; Marta Hoffman-Sommer; Krzysztof Pawłowski; Marcin Grynberg
Journal:  Sci Rep       Date:  2018-10-11       Impact factor: 4.379

6.  Screening non-coding RNAs in transcriptomes from neglected species using PORTRAIT: case study of the pathogenic fungus Paracoccidioides brasiliensis.

Authors:  Roberto T Arrial; Roberto C Togawa; Marcelo de M Brigido
Journal:  BMC Bioinformatics       Date:  2009-08-04       Impact factor: 3.169

7.  Genomic DNA k-mer spectra: models and modalities.

Authors:  Benny Chor; David Horn; Nick Goldman; Yaron Levy; Tim Massingham
Journal:  Genome Biol       Date:  2009-10-08       Impact factor: 13.583

8.  Word decoding of protein amino Acid sequences with availability analysis: a linguistic approach.

Authors:  Kenta Motomura; Tomohiro Fujita; Motosuke Tsutsumi; Satsuki Kikuzato; Morikazu Nakamura; Joji M Otaki
Journal:  PLoS One       Date:  2012-11-21       Impact factor: 3.240

9.  Characterization of oligopeptide patterns in large protein sets.

Authors:  Anders Bresell; Bengt Persson
Journal:  BMC Genomics       Date:  2007-10-01       Impact factor: 3.969

10.  C-terminal motif prediction in eukaryotic proteomes using comparative genomics and statistical over-representation across protein families.

Authors:  Ryan S Austin; Nicholas J Provart; Sean R Cutler
Journal:  BMC Genomics       Date:  2007-06-26       Impact factor: 3.969
