Changsik Kim, Jiwon Choi, Seong Joon Lee, William J Welsh, Sukjoon Yoon.
Abstract
The calculation of contact-dependent secondary structure propensity (CSSP) is a unique and sensitive method that detects non-native secondary structure propensities in protein sequences. This method has applications in predicting local conformational change, which is typically observed in core sequences of protein aggregation and amyloid fibril formation. NetCSSP implements the latest version of the CSSP algorithm and provides a Flash chart-based graphic interface that enables interactive calculation of CSSP values for any user-selected region in a given protein sequence. This feature can also quantitatively estimate the effect of mutations on native or non-native secondary structure propensities in local sequences. In addition, this web tool provides precalculated non-native secondary structure propensities for over 1,400,000 fragments, each seven residues long, collected from PDB structures. The fragments can be searched for chameleon subsequences that can serve as the core of amyloid fibril formation. The NetCSSP web tool is available at http://cssp2.sookmyung.ac.kr/.
Year: 2009 PMID: 19468045 PMCID: PMC2703942 DOI: 10.1093/nar/gkp351
Source DB: PubMed Journal: Nucleic Acids Res ISSN: 0305-1048 Impact factor: 16.971
Figure 1. ROC plot validation of the NetCSSP algorithm on two data sets. The sequence potential for aggregation was calculated from CSSP-derived P(helix), P(β) and P(coil) values in the form ln(P(β)/[P(helix) × P(coil)]) (5). Test set 1 includes a total of 104 fragments of ≥10 amino acids in length, and test set 2 includes 70 fragments of 10 amino acids in length. Both test sets were retrieved from the literature (13). ROC plots represent the prioritization of aggregation-prone sequences over non-aggregating sequences based on the CSSP values. The AUC (area under the curve) is in the range of 0–1 and represents the predictive power of the method.
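The score and the AUC described in the Figure 1 caption can be illustrated with a minimal Python sketch. The function names and the toy propensity values below are hypothetical and are not taken from the NetCSSP server or its test sets; the AUC is computed with a generic pairwise-ranking definition, which may differ from the procedure the authors used.

```python
import math


def aggregation_potential(p_helix, p_beta, p_coil):
    """Return ln(P(beta) / [P(helix) * P(coil)]), the form given in the caption."""
    return math.log(p_beta / (p_helix * p_coil))


def roc_auc(scores, labels):
    """Pairwise AUC: fraction of (positive, negative) pairs in which the
    positive example scores higher; ties count as half a win."""
    pos = [s for s, y in zip(scores, labels) if y]
    neg = [s for s, y in zip(scores, labels) if not y]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Hypothetical (P(helix), P(beta), P(coil)) triples with an aggregation label.
toy = [
    ((0.10, 0.70, 0.20), True),
    ((0.20, 0.50, 0.30), True),
    ((0.60, 0.10, 0.30), False),
    ((0.30, 0.30, 0.40), False),
]
scores = [aggregation_potential(*props) for props, _ in toy]
labels = [label for _, label in toy]
auc = roc_auc(scores, labels)
print(f"AUC on toy data: {auc:.2f}")
```

An AUC of 0.5 corresponds to random ranking, while values near 1 mean that aggregation-prone fragments are consistently scored above the others.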
Figure 2. Workflow of CSSP calculation in the NetCSSP web server. Only sequence information is required for the CSSP calculation. When the 3D structure (in PDB format) is submitted, the predicted CSSP will be displayed in comparison with the native secondary structure information.
Figure 3. Output of the single ANN mode NetCSSP profile of horse myoglobin (PDB ID: 1DWR). Only the N-terminal region (sequence 4–53) is displayed. The native helical conformation is displayed in red bars. The CSSP is predicted at 20 different energy steps of the (i, i ± 4) interaction for helical, beta and coil propensities. The bottom diagram shows the sum of the energy step-wise CSSPs. The additive CSSP values for the entire sequence and the residue-average values are given in the upper panel. One can also interactively calculate the CSSPs for any user-specified residues and energy steps. The light pink box shows the CSSP values for the seventh residue, W, at an intermediate (i, i ± 4) energy level. The blue-shaded region represents a selection of the 25-GQEVLI-30 subsequence, and its CSSPs are presented in the upper panel.
Search of chameleon sequences
| Sequence | Secondary structure | PDB | Chain | SCOP | CSSP | Non-native P(helix) | Non-native P(β) |
|---|---|---|---|---|---|---|---|
| GQEVLLT | CCEEEEE | 1o89 | A | b.35.1.2 | 0.48 | 0.3 | – |
| QEVLLVQ | HHHHHHH | 1a8o | – | a.28.3.1 | 0.43 | – | 0.33 |
| QEVLLWL | HHHHHHH | 1csh | – | a.103.1.1 | 0.46 | – | 0.36 |
| TLAQEVL | HHHHHHH | 1e1o | A | d.104.1.1 | 0.52 | – | 0.23 |
| AQEVLLA | EEEEEEE | 1exs | A | b.60.1.1 | 0.28 | 0.51 | – |
| KPIQEVL | CCHHHHH | 1het | A | c.2.1.1 | 0.55 | – | 0.21 |
| QEVLKSI | HHHHHHH | 1mg7 | A | d.14.1.6 | 0.53 | – | 0.22 |
| NLQEVLG | CCCEEEC | 1n3l | A | c.26.1.1 | 0.33 | 0.41 | – |
| LQEVLNT | HHHHHHH | 1odf | A | c.37.1.6 | 0.55 | – | 0.24 |
| QEVLLPR | CEEEECC | 1ojq | A | d.166.1.1 | 0.51 | 0.19 | – |
| AHQEVLF | EEEEEEE | 1p9l | A | d.81.1.3 | 0.31 | 0.34 | – |
| IQEVLEV | HHHHHCC | 1qgu | B | c.92.2.3 | 0.53 | – | 0.34 |
| QEVLETM | HHHHHHH | 1tml | – | c.6.1.1 | 0.58 | – | 0.27 |
The subsequence (GQEVLI) in the shaded box in Figure 3 has both strong helical and beta propensities. Searching the fragment database, which includes precalculated CSSP values, shows that GQEVL and QEVL are found in both helical and beta contexts in various native proteins. The native secondary structure is represented by C (coil), E (extended β) and H (helix).
CSSP represents the calculated propensity for the native secondary structure of a seven-residue sequence. For example, when a residue adopts 'coil' in the native structure, P(coil) from the calculated CSSPs is selected.
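The chameleon search summarized in the table can be sketched in Python as follows. The rows are copied from the table above; the function name and the rule for assigning a context (a hit counts as helical if the matched positions contain H and no E, and as beta if they contain E and no H) are assumptions made for illustration and are not the NetCSSP server's actual search logic.

```python
# (sequence, native secondary structure, PDB ID) rows from the table above.
FRAGMENTS = [
    ("GQEVLLT", "CCEEEEE", "1o89"),
    ("QEVLLVQ", "HHHHHHH", "1a8o"),
    ("QEVLLWL", "HHHHHHH", "1csh"),
    ("TLAQEVL", "HHHHHHH", "1e1o"),
    ("AQEVLLA", "EEEEEEE", "1exs"),
    ("KPIQEVL", "CCHHHHH", "1het"),
    ("QEVLKSI", "HHHHHHH", "1mg7"),
    ("NLQEVLG", "CCCEEEC", "1n3l"),
    ("LQEVLNT", "HHHHHHH", "1odf"),
    ("QEVLLPR", "CEEEECC", "1ojq"),
    ("AHQEVLF", "EEEEEEE", "1p9l"),
    ("IQEVLEV", "HHHHHCC", "1qgu"),
    ("QEVLETM", "HHHHHHH", "1tml"),
]


def chameleon_contexts(query, fragments):
    """Group the fragments containing `query` by the native secondary
    structure observed over the matched positions (hypothetical rule)."""
    contexts = {"helix": [], "beta": [], "other": []}
    for sequence, structure, pdb_id in fragments:
        start = sequence.find(query)
        if start < 0:
            continue  # query does not occur in this fragment
        states = set(structure[start:start + len(query)])
        if "H" in states and "E" not in states:
            contexts["helix"].append(pdb_id)
        elif "E" in states and "H" not in states:
            contexts["beta"].append(pdb_id)
        else:
            contexts["other"].append(pdb_id)
    return contexts


hits = chameleon_contexts("QEVL", FRAGMENTS)
is_chameleon = bool(hits["helix"]) and bool(hits["beta"])
print(hits, is_chameleon)
```

Under this rule, a subsequence is flagged as a chameleon candidate when it occurs in at least one helical and at least one beta context.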
Output of search of chameleon sequences with the highest non-native P(helix) and non-native P(β) values
| Sequence | Secondary structure | PDB | Chain | SCOP | CSSP (for native structure) | Non-native P(helix) | Non-native P(β) | Relative P(helix) | Relative P(β) |
|---|---|---|---|---|---|---|---|---|---|
| LRRARAA | CCCCCCC | 1cer | O | d.81.1.1 | 0.18 | 0.69 | 0.13 | 3.93 | 0.73 |
| KQMLAKA | CCCCCCC | 1goj | A | c.37.1.9 | 0.13 | 0.68 | 0.18 | 5.10 | 1.33 |
| QEQLEKA | CCCCCCC | 1gx5 | A | e.8.1.4 | 0.17 | 0.68 | 0.13 | 3.89 | 0.77 |
| AKEAAQK | CCCCCCC | 1g9l | A | a.144.1.1 | 0.2 | 0.68 | 0.1 | 3.50 | 0.53 |
| ARAQARQ | CCCEEEE | 1omh | A | d.89.1.5 | 0.14 | 0.68 | – | 4.77 | – |
| AVIVVFD | CCCCCCC | 1bgx | T | c.120.1.2 | 0.15 | 0.22 | 0.58 | 1.45 | 3.80 |
| VTVTVFD | CCCCCCC | 1eu1 | A | b.52.2.2 | 0.3 | 0.12 | 0.57 | 0.41 | 1.88 |
| VFEVNIR | HHHHHHH | 1nxc | A | a.102.2.1 | 0.23 | – | 0.57 | – | 2.52 |
| VYWFTVE | HHHCCCC | 1toh | – | d.178.1.1 | 0.22 | – | 0.57 | – | 2.54 |
| VYVVFSV | CCCCCCC | 1vho | A | c.56.5.4 | 0.16 | 0.23 | 0.57 | 1.45 | 3.52 |
By searching the fragment DB, one can quantitatively analyze non-native secondary structure propensities in comparison with native secondary structure patterns.
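Ranking fragments by their non-native propensities, as in the table above, can be sketched in Python. The rows are a subset of that table, with `None` standing in for the "–" entries; the record layout and the `top_by` helper are hypothetical and do not represent the server's interface.

```python
from typing import NamedTuple, Optional


class Fragment(NamedTuple):
    sequence: str
    structure: str
    native_cssp: float
    nn_helix: Optional[float]  # non-native P(helix); None where the table shows "–"
    nn_beta: Optional[float]   # non-native P(β); None where the table shows "–"


# Subset of rows copied from the table above.
ROWS = [
    Fragment("LRRARAA", "CCCCCCC", 0.18, 0.69, 0.13),
    Fragment("KQMLAKA", "CCCCCCC", 0.13, 0.68, 0.18),
    Fragment("ARAQARQ", "CCCEEEE", 0.14, 0.68, None),
    Fragment("AVIVVFD", "CCCCCCC", 0.15, 0.22, 0.58),
    Fragment("VFEVNIR", "HHHHHHH", 0.23, None, 0.57),
    Fragment("VYVVFSV", "CCCCCCC", 0.16, 0.23, 0.57),
]


def top_by(rows, field, n):
    """Return the n rows with the highest value of `field`, skipping rows
    where that propensity is undefined. Ties keep their input order."""
    defined = [row for row in rows if getattr(row, field) is not None]
    return sorted(defined, key=lambda row: getattr(row, field), reverse=True)[:n]


for row in top_by(ROWS, "nn_helix", 3):
    print(row.sequence, row.structure, row.nn_helix)
for row in top_by(ROWS, "nn_beta", 3):
    print(row.sequence, row.structure, row.nn_beta)
```

Keeping undefined entries as `None` rather than zero avoids ranking a fragment on a propensity that is its native state.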