Analysis of protein chameleon sequence characteristics.

Amine Ghozlane, Agnel Praveen Joseph, Aurelie Bornot, Alexandre G de Brevern.

Abstract

Conversion of the local structural state of a protein from an alpha-helix to a beta-strand is usually associated with a major change in the tertiary structure. Similar changes have been observed during the self-assembly of amyloidogenic proteins into fibrils, which are implicated in severe disease conditions, e.g., Alzheimer disease. Studies have emphasized that certain protein sequence fragments, known as chameleon sequences, do not have a strong preference for either the helical or the extended conformation. Surprisingly, information on the local sequence neighborhood can be used to predict their secondary structure with high accuracy. Here we report a large-scale analysis of chameleon sequences to estimate their propensities to be associated with different local structural states such as alpha-helices, beta-strands and coils. With the help of the propensity information derived from the amino acid composition, we underline their complexity, as more than one quarter of them prefer the coil state over the regular secondary structures. About half of them show a preference for both alpha-helix and beta-sheet conformations, and either of these two states is favored by the rest.

Keywords: chameleon sequence; secondary structures; structural characteristics

Year: 2009 PMID: 19759809 PMCID: PMC2732029 DOI: 10.6026/97320630003367

Source DB: PubMed Journal: Bioinformation ISSN: 0973-2063

Background

Repetitive secondary structures like α-helices and β-strands have been viewed as key building blocks of proteins. These local protein structures are stabilized mainly by hydrogen bonds within the protein backbone. In 1984, Kabsch and Sander identified identical sequence fragments of limited length that occur in both α-helices and β-strands, namely chameleon sequences [1]. This suggests that local sequence composition and the order of amino acids alone are not sufficient to predict the secondary structure accurately [2]. The number of examples supporting this idea has increased strikingly in the recent past [3]. Elegant experimental studies have shown the importance of nonlocal interactions in guiding the formation of an α-helix or a β-strand, e.g. in the IgG-binding domain of protein G (GB1) [4]. Chameleon sequences have also been designed, e.g. in the MATa2 and MCM1 DNA complexes [5]. Studies have emphasized that these chameleon sequences have no strong preference for either α-helical or β-strand conformations [6]. Nonetheless, information on the local sequence neighborhood can be used to predict their secondary structure with high accuracy [3,7]. Here, we have analyzed chameleon sequences to estimate their propensities to form not only the regular secondary structures, α-helix and β-strand, but also coil [8].

Description

Unlike previous studies, which focused only on limited parts of the Protein DataBank [9], all the protein structures available in 2007 (∼40,000 protein structures) have been used. Secondary structures have been assigned for these proteins using the DSSP algorithm [10] with the default parameters of the program. Only those proteins with complete side-chain coordinates and without multiple breaks in the chain were considered, leading to a final number of 14,692,070 amino acid residues, each associated with a given secondary structure. The 8 secondary structure assignments made by DSSP were reduced to the 3 classical states: helix includes α-, 3₁₀- and π-helices, strand includes only the β-strand assignments, and coil covers the rest of the assignments (β-bridges, turns, bends, and coil). In the second step, we searched for chameleon sequences of length L, with L ranging from 4 to 8 amino acids. A fragment is considered a chameleon sequence if all the residues in this fragment are associated at least once with the helical conformation and also at least once with the β-strand. Numerous chameleon sequences have thus been located: 63,228 (for L = 4 residues), 34,408 (for L = 5), 2,423 (for L = 6), 179 (for L = 7) and 64 (for L = 8). As the dataset is large and complete compared to the ones used in previous studies, more examples were found, especially for the longer fragments [3]. Our main goal is to check whether chameleon sequences indeed lack any strong preference for either helical or strand conformations [6], and to extend the question to the preference of chameleon sequences for the coil state, which was not directly tackled in previous works. For this purpose, we have used a simple methodology based on a non-redundant databank containing proteins with no more than 20% pairwise sequence identity. The selected chains have X-ray crystallographic resolutions better than 1.6 Å, with an R-factor less than 0.25 (details can be found in [11]).
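The state reduction and fragment search described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names, the `(sequence, dssp_string)` input format, and the reading of the criterion (a fragment counts as helical or strand in an occurrence only when every residue of the window carries that state) are assumptions made for the example.

```python
from collections import defaultdict

# DSSP one-letter codes reduced to three states, as described in the text:
# H = helix, E = strand, C = coil. '-' and ' ' stand for "no assignment".
DSSP_TO_3 = {
    "H": "H", "G": "H", "I": "H",   # alpha-, 3-10- and pi-helices
    "E": "E",                       # extended beta-strand
    "B": "C", "T": "C", "S": "C",   # isolated bridge, turn, bend
    "-": "C", " ": "C",
}


def reduce_states(dssp_string):
    """Map an 8-state DSSP string to the 3 classical states."""
    return "".join(DSSP_TO_3.get(code, "C") for code in dssp_string)


def find_chameleons(chains, length):
    """Return fragments of the given length seen fully helical in one
    occurrence and fully strand in another.

    chains: iterable of (sequence, dssp_string) pairs of equal length
    (a hypothetical input format chosen for this sketch).
    """
    seen = defaultdict(set)
    for seq, dssp in chains:
        states = reduce_states(dssp)
        for i in range(len(seq) - length + 1):
            window = states[i:i + length]
            if window == "H" * length:
                seen[seq[i:i + length]].add("H")
            elif window == "E" * length:
                seen[seq[i:i + length]].add("E")
    return sorted(frag for frag, s in seen.items() if s == {"H", "E"})
```

For example, calling `find_chameleons(chains, 4)` on a list of chains would return every tetrapeptide observed in both conformations under this reading of the criterion; looping `length` over 4 to 8 mirrors the range of L used in the text.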
Using this non-redundant databank, the propensity of an amino acid k to be associated with a given secondary structure state i, namely p_ik, has been computed (see equation 1), where i corresponds to the α-helix, β-strand or coil state, and k corresponds to one of the 20 amino acids. Hence, each chameleon sequence XS is associated with three scores: Sα, Sβ and Scoil. As these scores are propensity products, a score Si of 1.0 corresponds to the random value. If Si is higher than 1.0, the chameleon sequence is found preferentially associated with the secondary structure state i; if it is lower, the sequence is disfavored in that state. This measure is crude but gives some basic insight into the behavior of chameleon sequences. Figure 1a shows a plot of Sα versus Sβ for the 63,228 chameleon sequences (for L = 4 residues). Adequacy scores greater than 4.0 were set to a maximum value of 4.0. The figure shows that 53.7% and 47.3% of the chameleon sequences have Sβ and Sα scores greater than 1.0, respectively. Thus, the four squares delineated by the red lines are roughly equally populated. Sβ scores reach much higher values than Sα scores, as 16% of the Sβ scores are greater than 2.0, 5.3% greater than 3.0 and 2.7% greater than 4.0, while only 5.1% of the Sα scores are greater than 2.0 and 0.2% greater than 3.0. Of the chameleon sequences, 21.6% have both Sα and Sβ scores greater than 1.0, with an average Scoil of 0.42 (i.e. less than half the random value). For 25.7% of these fragments, α-helix is statistically preferred over β-strand, with an average Scoil of 0.68, while for 24.7%, only β-strand is preferred (average Scoil of 0.65). Interestingly, 27.9% of the chameleon sequences have Sα and Sβ less than 1.0, i.e., the coil state is favored.
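The scoring scheme above can be illustrated with a short sketch. The exact form of equation 1 is not reproduced in this text, so the propensity used here is an assumption: the common ratio of the frequency of amino acid k within state i to its overall frequency. The function names and the `(amino_acid, state)` input format are likewise hypothetical. The score is the product of per-residue propensities, as stated in the text, and the classification follows the four squares delineated at 1.0.

```python
from collections import Counter
from math import prod

STATES = ("H", "E", "C")  # helix, strand, coil


def propensities(residues):
    """Compute p[(state, aa)] from a list of (amino_acid, state) pairs.

    Assumed definition (not confirmed by the source):
    p = (n_state_aa / n_state) / (n_aa / n_total).
    The reference set must contain all three states.
    """
    pair_counts = Counter(residues)
    aa_counts, state_counts = Counter(), Counter()
    for (aa, state), n in pair_counts.items():
        aa_counts[aa] += n
        state_counts[state] += n
    total = sum(pair_counts.values())
    return {
        (state, aa): (pair_counts[(aa, state)] / state_counts[state])
        / (aa_counts[aa] / total)
        for aa in aa_counts
        for state in STATES
    }


def scores(fragment, p):
    """Score a fragment for each state as a product of propensities."""
    return {s: prod(p[(s, aa)] for aa in fragment) for s in STATES}


def classify(s, threshold=1.0):
    """Assign a fragment to one of the four squares of the S(alpha)/S(beta) plot."""
    helix, strand = s["H"] > threshold, s["E"] > threshold
    if helix and strand:
        return "both"
    if helix:
        return "helix"
    if strand:
        return "strand"
    return "coil"  # neither regular state is favored
```

With propensities computed on a non-redundant reference set, `classify(scores("MLIL", p))` would place a fragment in one of the four categories discussed in the text. Because the score is a product, it tends to spread out as the fragment grows longer, which is one reason to treat it as a crude indicator only.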

Figure 1

(a) Distribution of adequacy scores S(α) and S(β) of chameleon sequence fragments of length 4. The legend gives the number of occurrences of the observed fragments. (b) Example of the chameleon sequence fragment MLIL, found (left) in a β-strand of guinea pig 11 beta-hydroxysteroid dehydrogenase type 1 (PDB code 1XSE) and (right) in an α-helix of the hyperthermophilic tungstopterin enzyme aldehyde ferredoxin oxidoreductase (PDB code 1AOR). The blue point in (a) represents the scores of example (b).

Figure 1b shows the chameleon sequence fragment MLIL, which has Sα and Sβ scores greater than 2.0 (shown as the blue dot in Figure 1a). In 11 beta-hydroxysteroid dehydrogenase type 1, this chameleon sequence forms the central β-strand of a β-sheet composed of 5 β-strands (Figure 1b left), while in the hyperthermophilic tungstopterin enzyme aldehyde ferredoxin oxidoreductase, this sequence is in the middle of a long α-helix (Figure 1b right). With this simple approach, we have underlined that chameleon sequences as a whole have no strong preference for either α- or β-conformation. We have also found that very different chameleon sequences exist: some show a higher preference for either helical or strand conformations, some show a preference for both, and some favor the coil state over the regular secondary structures. These observations again support the idea that non-local factors [2,3] have a major influence on the secondary structure that an amino acid sequence adopts. Supplementary information can be found on our website: http://www.dsimb.inserm.fr/~joseph/chameleon/

10 in total

1. The Protein Data Bank.

Authors: H M Berman; J Westbrook; Z Feng; G Gilliland; T N Bhat; H Weissig; I N Shindyalov; P E Bourne
Journal: Nucleic Acids Res Date: 2000-01-01 Impact factor: 16.971

2. Predictions of protein segments with the same aminoacid sequence and different secondary structure: a benchmark for predictive methods.

Authors: I Jacoboni; P L Martelli; P Fariselli; M Compiani; R Casadio
Journal: Proteins Date: 2000-12-01

3. Protein contacts, inter-residue interactions and side-chain modelling.

Authors: Guilhem Faure; Aurélie Bornot; Alexandre G de Brevern
Journal: Biochimie Date: 2007-11-28 Impact factor: 4.079

4. Conformational contagion in a protein: structural properties of a chameleon sequence.

Authors: Kazufumi Takano; Yoshiaki Katagiri; Atsushi Mukaiyama; Hyongi Chon; Hiroyoshi Matsumura; Yuichi Koga; Shigenori Kanaya
Journal: Proteins Date: 2007-08-15

5. Chameleon sequences in the PDB.

Authors: M Mezei
Journal: Protein Eng Date: 1998-06

6. Origins of structural diversity within sequentially identical hexapeptides.

Authors: B I Cohen; S R Presnell; F E Cohen
Journal: Protein Sci Date: 1993-12 Impact factor: 6.725

7. Context-dependent secondary structure formation of a designed protein sequence.

Authors: D L Minor; P S Kim
Journal: Nature Date: 1996-04-25 Impact factor: 49.962

8. Dictionary of protein secondary structure: pattern recognition of hydrogen-bonded and geometrical features.

Authors: W Kabsch; C Sander
Journal: Biopolymers Date: 1983-12 Impact factor: 2.505

9. On the use of sequence homologies to predict protein structure: identical pentapeptides can have completely different conformations.

Authors: W Kabsch; C Sander
Journal: Proc Natl Acad Sci U S A Date: 1984-02 Impact factor: 11.205

10. Analysis of chameleon sequences and their implications in biological processes.

Authors: Jun-Tao Guo; Jerzy W Jaromczyk; Ying Xu
Journal: Proteins Date: 2007-05-15

8 in total

Review 1. ChSeq: A database of chameleon sequences.

Authors: Wenlin Li; Lisa N Kinch; P Andrew Karplus; Nick V Grishin
Journal: Protein Sci Date: 2015-06-16 Impact factor: 6.725

Review 2. From local structure to a global framework: recognition of protein folds.

Authors: Agnel Praveen Joseph; Alexandre G de Brevern
Journal: J R Soc Interface Date: 2014-04-16 Impact factor: 4.118

3. Identification of local conformational similarity in structurally variable regions of homologous proteins using protein blocks.

Authors: Garima Agarwal; Swapnil Mahajan; Narayanaswamy Srinivasan; Alexandre G de Brevern
Journal: PLoS One Date: 2011-03-18 Impact factor: 3.240