Clemens Žváček, Gerald Friedrichs, Leonhard Heizinger, Rainer Merkl.
Abstract
BACKGROUND: The central element of each enzyme is the catalytic site, which commonly catalyzes a single biochemical reaction with high specificity. It was unclear to us how often sites that catalyze the same or highly similar reactions evolved on different, i.e. non-homologous, protein folds and how similar their 3D poses are. Both similarities are key criteria for assessing the usability of pose comparison for function prediction.
Year: 2015 PMID: 26538500 PMCID: PMC4634577 DOI: 10.1186/s12859-015-0807-6
Source DB: PubMed Journal: BMC Bioinformatics ISSN: 1471-2105 Impact factor: 3.169
Fig. 1 Enzymatic functions and their evolution in different SCOP superfamilies. For the dark bars, enzymes were grouped according to their four-digit EC number, and the number of SCOP superfamilies was deduced for each of these substrate-specific enzymatic functions. The light bars show the histogram deduced from a compilation of three-digit EC numbers, which subsumes enzymes with the same function but different substrates.
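The grouping behind such a histogram can be sketched in a few lines of Python. This is only an illustration of the counting step described in the caption, not the authors' pipeline: the function names and the toy annotation pairs (`SF_A`, `SF_B`, `SF_C`) are made up for this example.

```python
from collections import Counter, defaultdict


def truncate_ec(ec_number, digits):
    """Keep only the first `digits` levels of an EC number, e.g. 3.4.21.4 -> 3.4.21."""
    return ".".join(ec_number.split(".")[:digits])


def superfamilies_per_function(annotations, digits):
    """Count the distinct superfamilies observed for each (truncated) EC number."""
    groups = defaultdict(set)
    for ec_number, superfamily in annotations:
        groups[truncate_ec(ec_number, digits)].add(superfamily)
    return {ec: len(sfs) for ec, sfs in groups.items()}


def histogram(counts):
    """Map 'number of superfamilies' -> 'number of enzymatic functions'."""
    return dict(sorted(Counter(counts.values()).items()))


# Toy data (hypothetical): pairs of EC number and superfamily label.
annotations = [
    ("3.4.21.4", "SF_A"),
    ("3.4.21.4", "SF_B"),
    ("3.4.21.62", "SF_B"),
    ("1.1.1.1", "SF_C"),
    ("1.1.1.27", "SF_C"),
]

four_digit = superfamilies_per_function(annotations, 4)   # substrate-specific functions
three_digit = superfamilies_per_function(annotations, 3)  # same function, any substrate

print(histogram(four_digit))
print(histogram(three_digit))
```

Truncating to three digits merges entries that differ only in substrate, so each three-digit group can only gain superfamilies relative to its four-digit members.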
Fig. 2 Occurrence of enzymes from the six EC classes. The dark bars give the fraction of the enzymes in ENZ_SITES, the light bars their fraction in the PDB. On top of the bars, the ratio of the corresponding fraction values is printed. The value of 1.16 indicates that enzymes from EC class 1 are 16% overrepresented in ENZ_SITES with respect to their occurrence in the PDB.
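The ratio printed on top of the bars is a plain quotient of two fractions. The minimal sketch below shows the arithmetic; the two fraction values are hypothetical and were chosen only so that the quotient reproduces the 1.16 quoted in the caption.

```python
def representation_ratio(fraction_in_subset, fraction_in_reference):
    """Ratio > 1 means the class is overrepresented in the subset."""
    if fraction_in_reference <= 0:
        raise ValueError("reference fraction must be positive")
    return fraction_in_subset / fraction_in_reference


def percent_overrepresented(ratio):
    """Convert a ratio into a percentage over (positive) or under (negative) 1."""
    return (ratio - 1.0) * 100.0


# Hypothetical fractions for one EC class (not taken from the paper).
fraction_enz_sites = 0.29
fraction_pdb = 0.25

ratio = representation_ratio(fraction_enz_sites, fraction_pdb)
print(round(ratio, 2), round(percent_overrepresented(ratio), 1))
```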
Numbers of TPs and sensitivity for different representations of catalytic sites and for scores to compare their poses
| Method | TP | Sensitivity |
|---|---|---|
| Cα | 177 | 0.26 |
| Cα + Cβ | 206 | 0.30 |
| Cα + centroid | 215 | 0.31 |
|  | 216 | 0.31 |
| Stark score | 214 | 0.31 |
| Torrance score | 213 | 0.31 |
| RMSD | 211 | 0.30 |
|  | 218 | 0.31 |
|  | 220 | 0.32 |
In the upper part, the column Method lists the performance reached for different representations of the catalytic sites. Cα and Cβ indicate a representation restricted to the respective atoms, the centroid represents the full sidechain with one point in 3D. For EQUI(Y/F, I/L), Superpose3D considers the residue pairs Y, F and I, L as structurally equivalent. The second part of the table lists results deduced by using different scores for the 3D comparison of poses. For these cases, the all-atom mode of Superpose3D was used; for details see text
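To make the "Cα + centroid" representation and an RMSD-based comparison concrete, here is a minimal pure-Python sketch. It is not the Superpose3D implementation: it assumes the two poses are already superposed and that residues are listed in matching order, and the dictionary layout (`"CA"`, `"sidechain"`) is a hypothetical input format chosen for this example.

```python
import math


def centroid(points):
    """Mean position of a list of (x, y, z) coordinates."""
    if not points:
        raise ValueError("need at least one point")
    n = len(points)
    return tuple(sum(p[i] for p in points) / n for i in range(3))


def ca_plus_centroid(residue):
    """Represent a residue by its C-alpha atom plus one side-chain centroid."""
    return [residue["CA"], centroid(residue["sidechain"])]


def rmsd(coords_a, coords_b):
    """Root-mean-square deviation of paired points (no superposition is done here)."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    squared = sum(
        (x - y) ** 2 for p, q in zip(coords_a, coords_b) for x, y in zip(p, q)
    )
    return math.sqrt(squared / len(coords_a))


# Two toy residues (hypothetical coordinates); the second is shifted by 1 along y.
res_a = {"CA": (0.0, 0.0, 0.0), "sidechain": [(1.0, 0.0, 0.0), (3.0, 0.0, 0.0)]}
res_b = {"CA": (0.0, 1.0, 0.0), "sidechain": [(1.0, 1.0, 0.0), (3.0, 1.0, 0.0)]}

print(rmsd(ca_plus_centroid(res_a), ca_plus_centroid(res_b)))  # 1.0
```

A real comparison would first find the optimal rigid-body superposition and residue correspondence; this sketch covers only the reduced representation and the final distance measure.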
Classification performance for pose comparison
| Experiment | Method | MCC | TP | FP | TN | FN |
|---|---|---|---|---|---|---|
| DIFF_SF |  | 0.19 | 61 | 58 | 417 | 159 |
| DIFF_SF |  | 0.44 | 2 | 8 | 608 | 0 |
| DIFF_SF |  | 0.64 | 5 | 5 | 607 | 1 |
| DIFF_SF |  | 0.36 | 35 | 38 | 493 | 52 |
| DIFF_SF |  | 0.29 | 5 | 5 | 587 | 21 |
| DIFF_SF |  | 0.11 | 6 | 279 | 333 | 0 |
| DIFF_SF |  | 0.13 | 20 | 265 | 327 | 6 |
| ALL_ENZ_SITES |  | 0.29 | 14 | 41 | 546 | 17 |
| ALL_ENZ_SITES |  | 0.47 | 59 | 78 | 459 | 22 |
| ALL_ENZ_SITES |  | 0.57 | 167 | 73 | 325 | 53 |
| ALL_ENZ_SITES |  | 0.58 | 163 | 65 | 337 | 53 |
In all cases, the cut-offs for the comparison of poses (cu_ RMSD or cu_ RMSD ) and sets of GO terms (S ) are given under Method. The terms CC, BP, and MF indicate which combination of terms from the annotation domains were analyzed, respectively. The columns on the right give the MCC value and the number of TP, FP, TN, and FN cases, which resulted from a classification using these cut-offs. For the experiments labeled DIFF_SF, only poses for enzymes from different superfamilies were compared; for ALL_ENZ_SITES, sites from all entries of ENZ_SITES were compared
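The MCC column follows from the four confusion-matrix counts via the standard Matthews correlation coefficient formula. The sketch below applies that textbook definition to three rows of the table; the function name and the zero-denominator convention (returning 0.0) are choices made for this example.

```python
import math


def mcc(tp, fp, tn, fn):
    """Matthews correlation coefficient from confusion-matrix counts."""
    denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denominator == 0:
        return 0.0  # common convention when a row or column sum is zero
    return (tp * tn - fp * fn) / denominator


# (TP, FP, TN, FN) taken from rows of the table above.
rows = [
    (2, 8, 608, 0),
    (5, 5, 607, 1),
    (167, 73, 325, 53),
]

for tp, fp, tn, fn in rows:
    print(tp, fp, tn, fn, round(mcc(tp, fp, tn, fn), 2))
```

Rounded to two decimals, these three rows give 0.44, 0.64, and 0.57, matching the tabulated values. Unlike sensitivity alone, MCC also penalizes false positives, which explains why the rows with 279 and 265 FP cases score low despite missing few or no true positives.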