
Potential implications of availability of short amino acid sequences in proteins: an old and new approach to protein decoding and design.

Joji M Otaki, Tomonori Gotoh, Haruhiko Yamamoto.

Abstract

The three-dimensional structure of a protein molecule is primarily determined by its amino acid sequence, and thus the elucidation of general rules embedded in amino acid sequences is of great importance in protein science and engineering. To extract valuable information from sequences, we propose an analytical method in which a protein sequence is considered to be constructed by serial superimpositions of short amino acid sequences of n amino acid sets, especially triplets (3-aa sets). Using a comprehensive nonredundant protein database, we first examined the "availability" of all 8,000 possible triplet species. The availability score was mathematically defined as an indicator of the relative "preference" or "avoidance" for a given short constituent sequence to be used in a protein chain. Availability scores of real proteins were clearly biased relative to those of randomly generated proteins. We found many triplet species that occurred in the database more often or less often than expected. This bias extended to longer sets, and we found that some species of pentats (5-aa sets) that occurred reasonably frequently in the randomly generated protein population did not occur at all in any real proteins known today. The availability score was dependent on species, potentially serving as a phylogenetic indicator. Furthermore, we suggest possibilities of various biotechnological applications of characteristic short sequences, such as human-specific and pathogen-specific short sequences obtained from availability analysis. The availability score was also dependent on secondary structures, potentially serving as a structural indicator. Availability analysis on triplets may be combined with a comprehensive data collection on the φ and ψ peptide-bond angles of the amino acid at the center of each triplet, i.e., a collection of Ramachandran plots for each triplet. These triplet characters, together with other physicochemical data, will provide us with basic information on the relationship between protein sequence and structure, by which structure prediction and engineering may be greatly facilitated. Availability analysis may also be useful in identifying word processing units in amino acid sequences based on an analogy to natural languages. Together with other approaches, availability analysis will elucidate general rules hidden in the primary sequences and eventually contribute to rebuilding the paradigm of protein science.
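The abstract does not state the formula for the availability score, so the following is only an illustrative sketch of the general idea: count overlapping short sequences in a set of proteins and compare each observed count with the count expected by chance. The log-odds score, the null model (residues drawn independently at their overall frequencies), and the function names `count_ngrams`, `log_odds_scores`, and `absent_ngrams` are all assumptions made for this example, not the authors' definitions.

```python
from collections import Counter
from itertools import product
from math import log2

# The 20 standard amino acids in one-letter code.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def count_ngrams(sequences, n=3):
    """Count overlapping n-residue windows across all sequences."""
    counts = Counter()
    for seq in sequences:
        for i in range(len(seq) - n + 1):
            counts[seq[i:i + n]] += 1
    return counts


def log_odds_scores(sequences, n=3):
    """Return {ngram: log2(observed / expected)} for every observed n-gram.

    ASSUMPTION: expected counts come from a null model in which residues are
    independent and occur at their overall frequency in the input. This is a
    stand-in for the paper's availability score, whose formula is not given
    in the abstract.
    """
    residue_counts = Counter("".join(sequences))
    total_residues = sum(residue_counts.values())
    freq = {aa: c / total_residues for aa, c in residue_counts.items()}

    observed = count_ngrams(sequences, n)
    total_windows = sum(observed.values())

    scores = {}
    for ngram, obs in observed.items():
        probability = 1.0
        for aa in ngram:
            probability *= freq[aa]
        expected = probability * total_windows
        # Positive: used more than expected; negative: used less.
        scores[ngram] = log2(obs / expected)
    return scores


def absent_ngrams(sequences, n=3):
    """List the standard-alphabet n-grams that never occur in the input."""
    observed = count_ngrams(sequences, n)
    return [
        "".join(letters)
        for letters in product(AMINO_ACIDS, repeat=n)
        if "".join(letters) not in observed
    ]


if __name__ == "__main__":
    toy_proteins = ["ACAC", "CACA"]  # toy input, not real protein data
    print(log_odds_scores(toy_proteins))
    print(len(absent_ngrams(toy_proteins)))
```

With `n=3` the search space is the 8,000 triplet species mentioned in the abstract. Setting `n=5` enumerates 3.2 million pentats, so `absent_ngrams` is still feasible on a large database but noticeably slower.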

Year:  2008        PMID: 18606361     DOI: 10.1016/S1387-2656(08)00004-5

Source DB:  PubMed          Journal:  Biotechnol Annu Rev        ISSN: 1387-2656


