László Keresztes, Evelin Szögi, Bálint Varga, Viktor Farkas, András Perczel, Vince Grolmusz.
Abstract
The amyloid state of proteins is widely studied because of its relevance to neurology, biochemistry, and biotechnology. In contrast with nearly amorphous aggregation, the amyloid state has a well-defined structure, consisting of parallel and antiparallel β-sheets in a periodically repeated formation. Understanding of the amyloid state is growing with the development of novel molecular imaging tools, such as cryogenic electron microscopy. Sequence-based amyloid predictors have been developed, mainly using artificial neural networks (ANNs) as the underlying computational technique. Even for a good neural-network-based predictor, it is very difficult to identify which attributes of the input amino acid sequence determine the decision of the network. Here, we present a linear Support Vector Machine (SVM)-based predictor for hexapeptides with correctness higher than 84%, i.e., it is at least as good as the best published ANN-based tools. Unlike those of artificial neural networks, the decisions of linear SVMs are much easier to analyze, and from a good predictor we can infer rich biochemical knowledge. In the Budapest Amyloid Predictor webserver, the user inputs a hexapeptide, and the server outputs a prediction for the input and for each of its 6 × 19 = 114 distance-1 neighbors, i.e., the hexapeptides that differ from the input in exactly one position.
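The count of 6 × 19 = 114 distance-1 neighbors can be checked with a short enumeration. The following is a minimal sketch, not the webserver's code: the function name `distance_one_neighbors` and the example input are illustrative assumptions, and "distance-1" is taken to mean a single amino acid substitution.

```python
# The 20 standard amino acids, one-letter codes.
AMINO_ACIDS = "ARNDCQEGHILKMFPSTWYV"


def distance_one_neighbors(hexapeptide):
    """Return all hexapeptides that differ from the input in exactly one position.

    Hypothetical helper for illustration; assumes "distance-1" means
    one substitution (Hamming distance 1).
    """
    if len(hexapeptide) != 6 or any(aa not in AMINO_ACIDS for aa in hexapeptide):
        raise ValueError("expected six one-letter codes of standard amino acids")

    neighbors = []
    for pos, original in enumerate(hexapeptide):
        for aa in AMINO_ACIDS:
            if aa != original:
                # Substitute one residue, keep the other five unchanged.
                neighbors.append(hexapeptide[:pos] + aa + hexapeptide[pos + 1:])
    return neighbors


if __name__ == "__main__":
    # Arbitrary example input; any valid hexapeptide gives the same count.
    print(len(distance_one_neighbors("VQIVYK")))
```

Each of the six positions admits 19 alternative residues, so the list always has 114 entries, and no two entries coincide.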
Keywords: Budapest Amyloid Predictor; amyloid; site-specific amyloidogenecity; support vector machines
Year: 2021 PMID: 33810341 PMCID: PMC8067080 DOI: 10.3390/biom11040500
Source DB: PubMed Journal: Biomolecules ISSN: 2218-273X
The precomputed values from Equation (1). The rows correspond to the amino acids and the columns to their positions in the hexapeptide.
| Amino acid | 1 | 2 | 3 | 4 | 5 | 6 |
|---|---|---|---|---|---|---|
| A | −0.26 | −0.32 | −0.27 | −0.14 | −0.43 | −0.22 |
| R | −0.45 | −0.41 | −0.46 | −0.33 | −0.52 | −0.35 |
| N | −0.40 | −0.34 | −0.49 | −0.27 | −0.46 | −0.30 |
| D | −0.49 | −0.43 | −0.56 | −0.41 | −0.56 | −0.36 |
| C | −0.09 | −0.21 | 0.03 | −0.05 | −0.17 | −0.05 |
| Q | −0.37 | −0.30 | −0.36 | −0.34 | −0.48 | −0.32 |
| E | −0.51 | −0.41 | −0.43 | −0.30 | −0.61 | −0.39 |
| G | −0.23 | −0.37 | −0.46 | −0.37 | −0.30 | −0.33 |
| H | −0.32 | −0.26 | −0.26 | −0.30 | −0.35 | −0.25 |
| I | −0.06 | −0.08 | 0.26 | 0.09 | −0.06 | −0.07 |
| L | −0.10 | −0.18 | 0.02 | 0.04 | −0.22 | −0.13 |
| K | −0.39 | −0.45 | −0.51 | −0.35 | −0.59 | −0.32 |
| M | −0.17 | −0.25 | −0.02 | −0.10 | −0.19 | −0.18 |
| F | −0.13 | −0.11 | 0.05 | −0.03 | −0.13 | −0.11 |
| P | −0.56 | −0.38 | −0.56 | −0.51 | −0.42 | −0.45 |
| S | −0.37 | −0.35 | −0.41 | −0.30 | −0.48 | −0.23 |
| T | −0.34 | −0.33 | −0.28 | −0.23 | −0.40 | −0.23 |
| W | −0.17 | −0.17 | −0.09 | −0.06 | −0.12 | −0.16 |
| Y | −0.23 | −0.11 | −0.13 | −0.06 | −0.18 | −0.15 |
| V | −0.05 | −0.14 | 0.19 | 0.14 | −0.19 | 0.01 |
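One way to read this table, offered as an assumption rather than as the authors' stated procedure: because the predictor is a linear SVM, its decision value for a hexapeptide plausibly decomposes into a sum of one table entry per position. Equation (1) and any bias or threshold are not reproduced in this excerpt, so the sketch below only computes the sum and makes no classification. The name `summed_contribution` is hypothetical, and only four rows of the table are copied to keep the example short.

```python
# A few rows copied from the table above (values for positions 1..6).
# The remaining amino acids would be added in the same way.
POSITION_VALUES = {
    "V": [-0.05, -0.14, 0.19, 0.14, -0.19, 0.01],
    "I": [-0.06, -0.08, 0.26, 0.09, -0.06, -0.07],
    "K": [-0.39, -0.45, -0.51, -0.35, -0.59, -0.32],
    "P": [-0.56, -0.38, -0.56, -0.51, -0.42, -0.45],
}


def summed_contribution(hexapeptide, table=POSITION_VALUES):
    """Sum the table entry for each residue at its position.

    Assumption: the linear SVM score is additive over positions. The bias
    term and decision threshold are NOT included, because they do not
    appear in this excerpt.
    """
    if len(hexapeptide) != 6:
        raise ValueError("expected a hexapeptide (six residues)")
    missing = [aa for aa in hexapeptide if aa not in table]
    if missing:
        raise ValueError(f"no table row copied for: {missing}")
    return sum(table[aa][pos] for pos, aa in enumerate(hexapeptide))


if __name__ == "__main__":
    for peptide in ("VIIVIV", "PKPKPK"):
        print(peptide, round(summed_contribution(peptide), 2))
```

Under this reading, a peptide built from V and I sums to a much higher value than one built from P and K, in line with where those residues sit in the table.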
The amyloidogenecity order of the amino acids at each of the six positions (rows), decreasing from left to right.
| Position | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 | 12 | 13 | 14 | 15 | 16 | 17 | 18 | 19 | 20 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | V | I | C | L | F | M | W | G | Y | A | H | T | S | Q | K | N | R | D | E | P |
| 2 | I | F | Y | V | W | L | C | M | H | Q | A | T | N | S | G | P | R | E | D | K |
| 3 | I | V | F | C | L | M | W | Y | H | A | T | Q | S | E | R | G | N | K | D | P |
| 4 | V | I | L | F | C | W | Y | M | A | T | N | H | E | S | R | Q | K | G | D | P |
| 5 | I | W | F | C | Y | M | V | L | G | H | T | P | A | N | Q | S | R | D | K | E |
| 6 | V | C | I | F | L | Y | W | M | A | T | S | H | N | Q | K | G | R | D | E | P |
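Each row of this ordering appears to follow from sorting the corresponding column of the value table in decreasing order. The sketch below illustrates this for position 3 only. The function name `rank_by_value` is an illustrative assumption, and because the published values are rounded to two decimals, amino acids with equal rounded values (here R/G and D/P) are ordered only by their order in the dictionary.

```python
# Position-3 column of the value table, copied as published (rounded).
POSITION_3_VALUES = {
    "A": -0.27, "R": -0.46, "N": -0.49, "D": -0.56, "C": 0.03,
    "Q": -0.36, "E": -0.43, "G": -0.46, "H": -0.26, "I": 0.26,
    "L": 0.02, "K": -0.51, "M": -0.02, "F": 0.05, "P": -0.56,
    "S": -0.41, "T": -0.28, "W": -0.09, "Y": -0.13, "V": 0.19,
}


def rank_by_value(column):
    """Return the amino acids sorted by decreasing table value.

    sorted() is stable, so tied entries keep their dictionary order;
    with rounded inputs the true order of tied residues is not recoverable.
    """
    return sorted(column, key=column.get, reverse=True)


if __name__ == "__main__":
    print(" ".join(rank_by_value(POSITION_3_VALUES)))
```

The same function applied to the other five columns would give the remaining rows, again up to ties in the rounded values.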
Figure 1. The ROC (Receiver Operating Characteristic) curve of the Budapest Amyloid Predictor. The AUC (Area Under the Curve) value is 0.89. The precision-recall curve is provided as Figure S1 in the supporting material.