Ian Walsh, Flavio Seno, Silvio C E Tosatto, Antonio Trovato.
Abstract
The formation of amyloid aggregates upon protein misfolding is related to several devastating degenerative diseases. The propensity of different protein sequences to aggregate into amyloids, its enhancement by pathogenic mutations, the presence of aggregation hot spots that stabilize pathological interactions, and the establishment of cross-amyloid interactions between co-aggregating proteins all rely, at the molecular level, on the stability of the amyloid cross-beta structure. Our redesigned server, PASTA 2.0, provides a versatile platform where all of these features can be easily predicted on a genomic scale from input sequences. The server also provides complementary information, such as intrinsic disorder and secondary structure predictions. The PASTA 2.0 energy function evaluates the stability of putative cross-beta pairings between different sequence stretches. It was re-derived on a larger dataset of globular protein domains. The resulting algorithm was benchmarked on comprehensive peptide and protein test sets, yielding improved, state-of-the-art results, with more amyloid-forming regions correctly detected at high specificity. The PASTA 2.0 server can be accessed at http://protein.bio.unipd.it/pasta2/.
Year: 2014 PMID: 24848016 PMCID: PMC4086119 DOI: 10.1093/nar/gku399
Source DB: PubMed Journal: Nucleic Acids Res ISSN: 0305-1048 Impact factor: 16.971
Figure 1. Receiver operating characteristic (ROC) curves for PASTA 2.0 and four other methods, calculated on the Pep424 set. The low false positive rate (high specificity) zone, an important region of the curve, is marked on the x-axis. PASTA 2.0 generally outperforms all other methods compared. The area under the ROC curve (AUC) measured within the high specificity zone is given in the legend. The Tango curve stops at 12% FPR because all energies are 0 beyond this point, so the threshold cannot be varied further.
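The AUC restricted to the high specificity zone can be illustrated with a minimal sketch. It assumes an energy-like convention in which lower scores mean more aggregation-prone, and the function names (`roc_points`, `partial_auc`) and the toy data are hypothetical, not taken from the PASTA 2.0 code. Tied scores are processed together, which also shows why a method whose remaining scores are all 0 offers no further threshold to vary.

```python
from typing import List, Sequence, Tuple


def roc_points(scores: Sequence[float], labels: Sequence[int]) -> List[Tuple[float, float]]:
    """Return (FPR, TPR) points while the threshold moves from strict to lenient.

    Assumption: lower score = more aggregation-prone (energy-like convention).
    Labels are 1 for amyloid-forming and 0 for non-amyloid examples.
    """
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("need at least one positive and one negative example")
    ranked = sorted(zip(scores, labels))  # lowest energy first
    points = [(0.0, 0.0)]
    tp = fp = 0
    i = 0
    while i < len(ranked):
        # Tied scores cannot be separated by any threshold: add them as one step.
        j = i
        while j < len(ranked) and ranked[j][0] == ranked[i][0]:
            tp += ranked[j][1]
            fp += 1 - ranked[j][1]
            j += 1
        points.append((fp / n_neg, tp / n_pos))
        i = j
    return points


def partial_auc(points: Sequence[Tuple[float, float]], max_fpr: float) -> float:
    """Trapezoidal area under the ROC curve for FPR in [0, max_fpr]."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        if x0 >= max_fpr:
            break
        if x1 > max_fpr:
            # Interpolate the TPR linearly at the FPR cutoff.
            y1 = y0 + (y1 - y0) * (max_fpr - x0) / (x1 - x0)
            x1 = max_fpr
        area += (x1 - x0) * (y0 + y1) / 2.0
    return area


# Toy data (hypothetical): the last two examples share a score of 0.
toy_scores = [-5.0, -4.0, -3.0, -2.0, 0.0, 0.0]
toy_labels = [1, 1, 0, 1, 0, 0]
toy_curve = roc_points(toy_scores, toy_labels)
```

Calling `partial_auc(toy_curve, 0.2)` integrates only the region up to 20% FPR. The value is not normalized, so a perfect predictor would score `max_fpr` rather than 1.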
Performance on detecting aggregating residues from the Reg33 set
| Method | Sensitivity (%) | Specificity (%) | Q2 (%) | MCC |
|---|---|---|---|---|
| Aggrescan | 35.37 | 79.26 | 57.32 | 0.13 |
| AMYLPRED2 | 39.27 | 84.48 | 61.88 | 0.22 |
| FoldAmyloid (contacts) | 20.71 | 86.97 | 76.17 | 0.08 |
| FoldAmyloid (triple hybrid) | 19.21 | 86.22 | 75.30 | 0.06 |
| Tango | 13.67 | 95.57 | 54.62 | 0.14 |
| MetAmyl (high specificity) | 39.05 | 83.14 | 77.24 | 0.19 |
| MetAmyl (global accuracy) | 52.46 | 70.73 | 68.29 | 0.17 |
| FishAmyloid | 13.73 | 93.68 | 82.98 | 0.10 |
| PASTA 2.0 (90% specificity) | 30.24 | 90.00 | 80.23 | 0.22 |
| PASTA 2.0 (85% specificity) | 40.87 | 84.95 | 77.77 | 0.24 |
Default thresholds were used for FoldAmyloid, FishAmyloid and MetAmyl. Results for AMYLPRED2, Aggrescan and Tango are taken directly from (15).
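For readers unfamiliar with the table's columns, the sketch below computes the four measures from per-residue confusion counts. It assumes the standard textbook definitions (Q2 as two-state accuracy, MCC as the Matthews correlation coefficient); the record does not spell out the exact formulas used, and the `Confusion` class, `metrics` function and example counts are hypothetical.

```python
import math
from dataclasses import dataclass
from typing import Dict


@dataclass
class Confusion:
    """Per-residue counts; 'positive' means predicted or observed as aggregating."""
    tp: int
    fp: int
    tn: int
    fn: int


def metrics(c: Confusion) -> Dict[str, float]:
    """Sensitivity, specificity and Q2 as percentages, MCC in [-1, 1]."""
    total = c.tp + c.fp + c.tn + c.fn
    sensitivity = 100.0 * c.tp / (c.tp + c.fn)
    specificity = 100.0 * c.tn / (c.tn + c.fp)
    q2 = 100.0 * (c.tp + c.tn) / total  # assumed: fraction of residues classified correctly
    denom = math.sqrt(
        (c.tp + c.fp) * (c.tp + c.fn) * (c.tn + c.fp) * (c.tn + c.fn)
    )
    # By convention MCC is set to 0 when any marginal count is zero.
    mcc = (c.tp * c.tn - c.fp * c.fn) / denom if denom else 0.0
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "q2": q2,
        "mcc": mcc,
    }


# Hypothetical counts chosen for round numbers, not taken from the Reg33 set.
example = metrics(Confusion(tp=30, fp=10, tn=90, fn=70))
```

With these illustrative counts, sensitivity is 30%, specificity 90%, Q2 60% and MCC 0.25. Raising the decision threshold trades sensitivity for specificity in this way, as the two PASTA 2.0 rows of the table show.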
Figure 2. Sample output for nasopharyngeal carcinoma-associated proline-rich protein 4. (A) Residue assignment of disorder, α-helix, β-strand and coil, with a parallel aggregation region marked by an oval, along with the energy of the aggregation pairings and the legend. (B) Pairing and linear probability profiles as a function of residue position. The probabilities reveal an aggregation-prone region with both high helix probability and high strand probability. In addition, the protein is predicted to be completely disordered, although less so in the aggregating region. The diagonal line in the pairing probability plot indicates a parallel in-register arrangement for the aggregation-prone stretch. (C) The free energy pairing matrix and the free energy profiles. In mutation mode, the free energy profile can be used to visualize changes in aggregation potential for the mutants. Here the mutants V8D (green) and V8P (blue) both decrease the aggregation potential (higher energy).
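Mutant labels such as V8D follow the common convention of wild-type residue, 1-based position, then replacement residue. The sketch below shows how such labels could be turned into mutant sequences before submission. The helper `apply_mutation` and the toy sequence are hypothetical, and the input format the server actually expects in mutation mode is not described in this record.

```python
import re

# One-letter wild-type residue, 1-based position, one-letter mutant residue.
_MUTATION = re.compile(r"^([A-Z])(\d+)([A-Z])$")


def apply_mutation(sequence: str, mutation: str) -> str:
    """Return a copy of `sequence` with a point mutation such as 'V8D' applied."""
    match = _MUTATION.match(mutation)
    if match is None:
        raise ValueError(f"unrecognized mutation label: {mutation!r}")
    wild_type, position, mutant = match.group(1), int(match.group(2)), match.group(3)
    if not 1 <= position <= len(sequence):
        raise ValueError(f"position {position} is outside the sequence")
    if sequence[position - 1] != wild_type:
        # Guard against off-by-one errors or a label meant for another sequence.
        raise ValueError(
            f"expected {wild_type} at position {position}, "
            f"found {sequence[position - 1]}"
        )
    return sequence[: position - 1] + mutant + sequence[position:]


# Toy sequence (hypothetical) with valine at position 8.
toy_sequence = "MKLAAIAVLG"
v8d = apply_mutation(toy_sequence, "V8D")  # "MKLAAIADLG"
v8p = apply_mutation(toy_sequence, "V8P")  # "MKLAAIAPLG"
```

Checking the wild-type residue before substituting catches numbering mismatches, which are easy to introduce when a sequence has been trimmed or carries a signal peptide.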