Neethu Krishna, Kunchur Guruprasad.
Abstract
Helices, strands and coils in proteins of known three-dimensional structure, corresponding to heptapeptide and large sequences ('probe' peptides), were scanned against peptide sequences of variable length, comprising seven or more residues, that adopt a different conformation ('target' peptides) in protein crystal structures available from the Protein Data Bank (PDB). Where the 'probe' and 'target' peptide sequences match exactly, they correspond to 'chameleon' sequences in protein structures. We observed ∼548 heptapeptide and large chameleon sequences, including peptides in the coil conformation, in the 53,794 PDB files analyzed. However, after excluding chameleon peptides on the basis of the quality of the protein structure data, redundancy, and association with cloning artifacts such as histidine tags, we observed only ten chameleon peptides in structurally different proteins, with a maximum length of seven amino acid residues. Our analysis suggests that the quality of protein structure data is important for identifying what are possibly the 'true chameleons' in the PDB. The majority of the chameleon sequences correspond to an entire strand in one protein that is observed as part of a helix sequence in another protein. The heptapeptide chameleons are characterized by a high propensity for alanine, leucine and valine residues. The total hydropathy values range between -11.2 and 22.9, the difference in solvent accessibility between 2.0 Å² and 373 Å², and the difference in the total number of residue neighbor contacts between 0 and 7 residues. Our work identifies for the first time heptapeptide and large sequences that correspond to a single complete helix, strand or coil and that adopt entirely different secondary structures in another protein.
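The probe-against-target scan described in the abstract can be illustrated with a minimal Python sketch. It assumes that secondary-structure segments have already been extracted as (protein ID, conformation, sequence) tuples. The function name `find_chameleons`, the one-letter conformation codes (H, E, C), the substring-match criterion, the different-protein requirement and the example sequences are all illustrative assumptions, not the authors' implementation or data.

```python
def find_chameleons(segments, min_len=7):
    """Return (sequence, probe_protein, probe_conf, target_protein, target_conf) hits.

    A probe is a complete segment of at least `min_len` residues. It is
    reported as a chameleon when its sequence occurs exactly inside a target
    segment that has a different conformation in a different protein
    (assumed criteria; the abstract only states that probe and target
    sequences must match exactly).
    """
    hits = []
    for probe_id, probe_conf, probe_seq in segments:
        if len(probe_seq) < min_len:
            continue  # shorter than a heptapeptide: not used as a probe
        for target_id, target_conf, target_seq in segments:
            if target_id == probe_id or target_conf == probe_conf:
                continue  # same protein or same conformation: skip
            if probe_seq in target_seq:  # exact contiguous match
                hits.append((probe_seq, probe_id, probe_conf,
                             target_id, target_conf))
    return hits


# Hypothetical segments, invented for illustration only.
example_segments = [
    ("protA", "E", "AVLAVLA"),      # entire strand
    ("protB", "H", "KKAVLAVLAEE"),  # helix containing the same heptapeptide
    ("protC", "C", "GSGSGSG"),      # coil with no match elsewhere
    ("protD", "E", "AVLA"),         # too short to serve as a probe
]
hits = find_chameleons(example_segments)
for hit in hits:
    print(hit)
```

The nested loop is quadratic in the number of segments, which is fine for a toy input. At the scale of the dataset reported here, an index from sequence k-mers to segments would be a more practical design.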
Year: 2011 PMID: 21569793 PMCID: PMC7124434 DOI: 10.1016/j.ijbiomac.2011.04.017
Source DB: PubMed Journal: Int J Biol Macromol ISSN: 0141-8130 Impact factor: 6.953
Dataset.
| Item | Total |
|---|---|
| PDB files analyzed | 53,794 |
| Helices | 600,421 |
| Strands | 619,173 |
| Coils | 1,256,996 |
| Non-redundant helices | 181,975 |
| Non-redundant strands | 137,953 |
| Non-redundant coils | 77,520 |
| Heptapeptide and large helices | 330,115 |
| Heptapeptide and large strands | 167,534 |
| Heptapeptide and large coils | 40,497 |
| Non-redundant heptapeptide and large helices | 132,663 |
| Non-redundant heptapeptide and large strands | 56,713 |
| Non-redundant heptapeptide and large coils | 12,916 |
| Heptapeptide and large chameleon sequences (in PDB crystal structures with resolution ≤2.5 Å) | 80 |
Fig. 1. Total (A) hydropathy, (C) solvent accessibility and (E) residue neighbor contacts for the heptapeptide and large chameleon sequences identified in this work. Panels (B), (D) and (F) show the equivalent values for heptapeptide and large chameleon sequences selected from the high-quality protein crystal structures in [9]. The values along the Y-axis in (C), (D), (E) and (F) are for the chameleon peptide sequence in the corresponding protein pairs (represented by dashed and continuous lines). (G) Amino acid propensity values for the combined heptapeptide and large chameleon sequences (this work) and for the chameleon-HS and chameleon-HE sequences selected from [9], as shown along the X-axis in (B).
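The total hydropathy of a peptide, as plotted in panels (A) and (B), can be computed by summing per-residue hydropathy values. The sketch below assumes the Kyte-Doolittle scale, since the text here does not name the scale used. The function name `total_hydropathy` and the example peptide are hypothetical.

```python
# Kyte-Doolittle hydropathy index per residue (assumed scale; the source
# text does not state which hydropathy scale was used).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def total_hydropathy(sequence):
    """Sum the hydropathy values of all residues in a one-letter sequence."""
    # A KeyError is raised for any non-standard residue code.
    return round(sum(KYTE_DOOLITTLE[residue] for residue in sequence), 1)


# Hypothetical heptapeptide rich in alanine, leucine and valine.
example_total = total_hydropathy("AVLAVLA")
print(example_total)
```

On this scale, a peptide composed mainly of alanine, leucine and valine gives a strongly positive total, while one dominated by charged residues gives a negative total.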