Barbara Kalinowska, Artur Krzykalski, Irena Roterman.
Abstract
The Early Stage (ES) intermediate represents the starting structure in protein folding simulations based on the Fuzzy Oil Drop (FOD) model. The accuracy of FOD predictions depends strongly on the accuracy of the chosen intermediate. A suitable intermediate can be constructed using the sequence-structure relationship information contained in the so-called contingency table, which expresses the likelihood of encountering various structural motifs for each tetrapeptide fragment in the amino acid sequence. The limited accuracy with which such structures could previously be predicted motivated a more in-depth study of the contingency table itself. The Contingency Table Browser is a tool for visualizing, searching and analyzing the table. Our work presents possible applications of the Contingency Table Browser, among them the analysis of specific protein sequences from the point of view of their structural ambiguity.
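To make the idea of the contingency table concrete, the following is a minimal sketch of one way such a table could be held in memory and queried. It is not the Contingency Table Browser's implementation. The function names, the toy observations and the use of Shannon entropy as a measure of structural ambiguity are all assumptions made for illustration; the four-letter motifs simply reuse letters from the A to G code alphabet.

```python
from collections import Counter, defaultdict
from math import log2


def build_table(observations):
    """Count how often each structural motif is seen for each tetrapeptide.

    `observations` is an iterable of (tetrapeptide, motif) pairs, e.g.
    ("IGRL", "CCCC"). The result maps tetrapeptide -> Counter of motifs.
    """
    table = defaultdict(Counter)
    for tetrapeptide, motif in observations:
        table[tetrapeptide][motif] += 1
    return table


def motif_frequencies(table, tetrapeptide):
    """Return the relative frequency of each motif for one tetrapeptide."""
    counts = table.get(tetrapeptide, Counter())
    total = sum(counts.values())
    if total == 0:
        return {}
    return {motif: n / total for motif, n in counts.items()}


def ambiguity(table, tetrapeptide):
    """Shannon entropy (bits) of the motif distribution.

    Assumption: entropy is used here only as one possible ambiguity score;
    0 means a single motif was observed, higher values mean more spread.
    """
    freqs = motif_frequencies(table, tetrapeptide)
    return -sum(p * log2(p) for p in freqs.values() if p > 0)


# Toy, made-up observations (not data from the paper).
observations = [
    ("IGRL", "CCCC"), ("IGRL", "CCCC"), ("IGRL", "CCCE"), ("IGRL", "EEEE"),
    ("AAAA", "CCCC"), ("AAAA", "CCCC"),
]
table = build_table(observations)
igrl_freqs = motif_frequencies(table, "IGRL")
igrl_ambiguity = ambiguity(table, "IGRL")
aaaa_ambiguity = ambiguity(table, "AAAA")
```

With these toy counts, the tetrapeptide observed with several motifs scores higher than the one observed with a single motif, which is the kind of comparison the abstract refers to as structural ambiguity.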
Year: 2015 PMID: 26664034 PMCID: PMC4658648 DOI: 10.6026/97320630011486
Source DB: PubMed Journal: Bioinformation ISSN: 0973-2063
Figure 1: a) Subdivision of the φ,ψ conformational space (Ramachandran plot) into seven zones corresponding to seven structural codes (A to G). Grey patches indicate the most frequently occurring conformations of the protein backbone and its secondary folds (α-helices and β-strands). The elliptical path expresses the limited conformational subspace to which the early stage (ES) intermediate is assumed to belong; b) Fragment of the contingency table visualized by Contingency Table Browser. Columns correspond to individual tetrapeptide fragments in protein 2BA2 (PDB code) while rows correspond to structural motifs; c) Frequency of occurrence of each four-letter structural motif for a specific tetrapeptide (IGRL) visualized as a bar chart; d) Visualization of the entire contingency table (columns correspond to tetrapeptides while rows represent structural motifs). Despite the overwhelming volume of data, preferred conformation zones can clearly be discerned. For example, the two marked bands correspond to α-helices (1) and β-strands (2), respectively. Additionally, we have highlighted the prevalence of codes A (3) and G (4) for glycine-containing tetrapeptides, as well as the characteristic correlation between proline and code G (5).
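Panel b) shows one column per tetrapeptide fragment of a protein. A minimal sketch of how a sequence might be split into such fragments is given below. It assumes overlapping windows that advance one residue at a time, which the caption does not state explicitly; the function name and the example sequence are hypothetical.

```python
def tetrapeptides(sequence, width=4):
    """Split a one-letter amino acid sequence into overlapping fragments.

    Assumption: windows overlap and advance by one residue. Sequences
    shorter than `width` yield an empty list.
    """
    sequence = sequence.strip().upper()
    return [sequence[i:i + width] for i in range(len(sequence) - width + 1)]


# Made-up sequence used only for illustration.
fragments = tetrapeptides("MKIGRLAV")
```

Each returned fragment could then serve as a lookup key into the contingency table, giving one column of motif frequencies per position in the sequence.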