Abstract
One peptidase can usually be distinguished from another biochemically by its action on proteins, peptides and synthetic substrates. Since 1996, the MEROPS database (http://merops.sanger.ac.uk) has accumulated a collection of substrate cleavages that now numbers 66,615. At least one cleavage is known for 1700 peptidases, out of a total of 2457 different peptidases. This paper describes how the cleavages are obtained from the scientific literature, how they are annotated and how cleavages in peptides and proteins are cross-referenced to entries in the UniProt protein sequence database. Specificity profiles are shown for the 556 peptidases for which ten or more substrate cleavages are known. However, it has been proposed that at least 40 cleavages in disparate proteins are required for specificity analysis to be meaningful, and only 163 peptidases (6.6%) fulfil this criterion. Also described are the various displays shown on the website to aid the understanding of peptidase specificity, which are derived from the substrate cleavage collection. These displays include a logo, a distribution matrix, and tables that summarize which amino acids or groups of amino acids are acceptable (or not acceptable) in each substrate binding pocket. For each protein substrate, there is a display to show how it is processed and degraded. Also described are tools on the website that help to assess the physiological relevance of cleavages in a substrate. These tools rely on the hypothesis that a cleavage site that is conserved in orthologues is likely to be physiologically relevant; alignments of substrate protein sequences are made using the UniRef50 database, in which the sequences in each entry are 50% or more identical. Conservation in this case means that a substitution is permitted only if the amino acid is known to occupy the same substrate binding pocket in at least one other substrate cleaved by the same peptidase.
Keywords: Binding pocket; Cleavage; Peptidase; Scissile bond; Specificity; Substrate
Year: 2015 PMID: 26455268 PMCID: PMC4756867 DOI: 10.1016/j.biochi.2015.10.003
Source DB: PubMed Journal: Biochimie ISSN: 0300-9084 Impact factor: 4.079
Counts of different peptidases (peptidase species) by catalytic type.
| | Aspartic | Glutamic | Metallo | Cysteine | Serine | Threonine | Mixed | Asparagine lyases | Unknown | Total |
|---|---|---|---|---|---|---|---|---|---|---|
| Sequenced and characterized | 170 | 7 | 633 | 615 | 942 | 46 | 5 | 23 | 16 | 2457 |
| Sequenced only | 118 | 0 | 327 | 297 | 644 | 29 | 0 | 1 | 3 | 1419 |
| Sequence not known | 8 | 0 | 86 | 19 | 87 | 1 | 0 | 0 | 49 | 250 |
| Non-peptidase homologues | 4 | 0 | 103 | 52 | 145 | 26 | 0 | 0 | 0 | 330 |
| Pseudogenes | 24 | 0 | 5 | 21 | 17 | 3 | 0 | 0 | 0 | 70 |
| Total | 324 | 7 | 1154 | 1004 | 1835 | 105 | 5 | 24 | 68 | 4526 |
Counts of substrates per catalytic type.
| | Aspartic | Glutamic | Metallo | Cysteine | Serine | Threonine | Mixed | Asparagine lyases | Unknown | Total |
|---|---|---|---|---|---|---|---|---|---|---|
| Physiological | 2780 | 7 | 4113 | 7311 | 5877 | 33 | 2 | 60 | 81 | 20,264 |
| Pathological | 266 | 0 | 345 | 700 | 34 | 0 | 0 | 0 | 4 | 1349 |
| Non-physiological | 2893 | 70 | 8762 | 3112 | 21,301 | 84 | 1 | 3 | 3 | 36,229 |
| Synthetic | 364 | 32 | 1569 | 1179 | 2547 | 42 | 37 | 0 | 51 | 5821 |
| Theoretical | 176 | 0 | 545 | 106 | 638 | 0 | 0 | 300 | 0 | 1765 |
| Unclassified | 77 | 0 | 554 | 194 | 314 | 26 | 1 | 0 | 21 | 1187 |
| Total | 6556 | 109 | 15,888 | 12,602 | 30,711 | 185 | 41 | 363 | 160 | 66,615 |
Counts of peptidases with known substrate cleavages by catalytic type.
| | Aspartic | Glutamic | Metallo | Cysteine | Serine | Threonine | Mixed | Asparagine lyases | Unknown | Total |
|---|---|---|---|---|---|---|---|---|---|---|
| Sequenced and characterized | 114 | 4 | 477 | 401 | 520 | 18 | 2 | 17 | 33 | 1586 |
| Sequence not known | 5 | 0 | 41 | 6 | 34 | 0 | 0 | 0 | 28 | 114 |
| Total | 119 | 4 | 518 | 407 | 554 | 18 | 2 | 17 | 61 | 1700 |
Peptidases with 10 or more known cleavages and peptidase specificity derived from substrate cleavages in the MEROPS collection. Each peptidase with ten or more known substrate cleavages is shown. Peptidases are arranged by specificity (number of binding pockets, then preference in each binding pocket in the order P4 to P4′, then by amino acid in alphabetical order) and then by MEROPS identifier. For each peptidase, the MEROPS identifier, the recommended peptidase name, the number of substrate cleavages, and preferences for binding pockets P4 to P4′ are shown. The brighter the shade of green, the greater the preference; five shades are shown ranging from darkest green (50–59% of substrates) to brightest green (90% or greater of substrates). Up to two amino acids are shown in a binding pocket where a preference occurs. Where the preference is for any of a group of amino acids, or the preference for the group is greater than that for a single amino acid, the following symbols are shown: λ (aliphatic: Ile, Leu, Val), @ (aromatic: Phe, Trp, Tyr), + (acidic: Asp, Glu), − (basic: Arg, His, Lys), Σ (small: Ala, Cys, Gly, Ser) and Ω (other: Asn, Gln, Met, Pro, Thr). Where no preference for an amino acid or group of amino acids exists, and where there are 200 or more cleavages, up to two amino acids that are not acceptable in a binding pocket are shown as white text on a black background. For exopeptidases, which act at the N- and C-termini of proteins, no residue may be possible in some binding pockets, and in these cases the binding pockets are shaded grey. Binding pockets shaded black or grey are ignored for the ordering of items in the table. The “Reliability score” is the percentage difference calculated by counting all the differences between substrates for the same enzyme, dividing the total differences by the number of comparisons times the number of residues P4–P4′ considered, and multiplying by 100. Reliability scores of 75% or greater difference are highlighted in green, scores of 50% or greater in yellow, and scores of less than 50% in red. See text for details.
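The reliability score described in this legend can be illustrated with a minimal Python sketch. It assumes that every cleavage is supplied as an equal-length string of residues covering P4 to P4′, and that "comparisons" means all unordered pairs of cleavages for one peptidase; the function names and the treatment of input are illustrative assumptions, not the MEROPS implementation, and empty pockets in exopeptidase substrates are not handled.

```python
from itertools import combinations


def reliability_score(windows):
    """Percentage difference over all pairs of P4-P4' windows.

    `windows` holds one equal-length string per substrate cleavage
    (8 characters for a full P4..P4' window). Assumption: every
    unordered pair of cleavages counts as one comparison.
    """
    if len(windows) < 2:
        raise ValueError("need at least two cleavages to compare")
    width = len(windows[0])
    if any(len(w) != width for w in windows):
        raise ValueError("all windows must have the same length")

    differences = 0
    comparisons = 0
    for a, b in combinations(windows, 2):
        comparisons += 1
        # Count positions at which the two substrates differ.
        differences += sum(1 for x, y in zip(a, b) if x != y)
    return 100.0 * differences / (comparisons * width)


def highlight(score):
    """Map a score to the highlight colour given in the legend."""
    if score >= 75:
        return "green"
    if score >= 50:
        return "yellow"
    return "red"
```

Under these assumptions, a set of identical windows scores 0 (red) and a set with no residue in common scores 100 (green), so a high score indicates that the collected substrates are diverse rather than redundant.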
Fig. 1. Cleavages per peptidase. The bar chart shows the number of known substrate cleavages per peptidase on the Y axis and the count of peptidases with this number of cleavages on the X axis.
Fig. 2. Example of a specificity logo and distribution matrix. The specificity logo and distribution matrix are shown for thimet oligopeptidase. In the logo, the taller the character, the greater the preference in substrate binding pockets S4 to S4′ (numbered 1 to 8 on the X axis). The distribution matrix shows the number of times each amino acid occurs at each position in the range P4 to P4′ in substrates. The brighter the green highlighting, the greater the preference for an amino acid in that position. An amino acid that has not been observed to occupy a specific binding pocket is shown as white text on a black background. Amino acids are ordered so that those with similar properties are grouped together.
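A distribution matrix of this kind is a per-position tally of residues. The Python sketch below shows one way such a tally could be built, assuming each cleavage is given as an eight-character string for P4 to P4′ with `-` marking an empty pocket; the names `distribution_matrix` and `unobserved`, and the gap convention, are hypothetical and are not taken from MEROPS.

```python
from collections import Counter

POSITIONS = ["P4", "P3", "P2", "P1", "P1'", "P2'", "P3'", "P4'"]
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def distribution_matrix(windows):
    """Count how often each residue occupies each position P4..P4'."""
    counts = {pos: Counter() for pos in POSITIONS}
    for window in windows:
        if len(window) != len(POSITIONS):
            raise ValueError(f"expected 8 residues, got {window!r}")
        for pos, residue in zip(POSITIONS, window):
            if residue != "-":  # assumed marker for an empty pocket
                counts[pos][residue] += 1
    return counts


def unobserved(matrix, pos):
    """Residues never seen at `pos` (white on black in the display)."""
    return sorted(set(AMINO_ACIDS) - set(matrix[pos]))
```

For example, with the made-up windows `["AAGLFAAA", "AAGLYAAA", "-AGRFAAA"]`, the tally for P1 is two Leu and one Arg, and Trp is among the residues reported as unobserved at P1′. These windows are illustrative only and are not real substrate data.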
Fig. 3. Example of a substrate page. Part of the substrates page for thimet oligopeptidase is shown. For each substrate the following are shown: name; a cross-reference and link to the entry in the UniProt database where appropriate; the residue range of the substrate as used in the experiment, with reference to the numbering in the UniProt entry; a description of the cleavage, where the scissile bond is represented by the symbol ‘+’; whether the cleavage is physiological, non-physiological, pathological or in a synthetic substrate; the evidence by which the cleavage site was determined; the residues occupying positions P4 to P4′ in the substrate; the source reference; and a cross-reference and link to the CutDB database [14]. By default substrates are listed alphabetically, but the order can be changed by clicking a column heading. The results can be filtered for physiological, non-physiological or pathological cleavages, or for cleavages in synthetic substrates, by clicking on the appropriate letter in the table legend.
Fig. 4. Example of a substrate alignment. Part of the alignment for orthologues of the Ebola virus envelope glycoprotein is shown, with the known cleavage of the glycoprotein from the Zaire strain (UniProt P87671) by ADAM17 (MEROPS ID M12.217) at residue 637 highlighted in green. Residues in the range P4–P4′ are highlighted in pink if they are identical to those in the Zaire strain; substituted residues are highlighted in orange if the amino acid is known to occupy the same binding pocket in another ADAM17 substrate; and substituted residues are shown as white on black if the amino acid is not known to occupy the same binding pocket in any ADAM17 substrate.
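The three-way highlighting in this alignment follows a simple rule that can be sketched in Python. The sketch assumes the P4–P4′ windows of the reference substrate and of an orthologue are already aligned and of equal length, and that the residues known for each pocket are collected from other substrates of the same peptidase; all function names and labels are hypothetical.

```python
def accepted_residues(known_windows):
    """Per-position sets of residues seen in known substrate cleavages."""
    width = len(known_windows[0])
    return [{w[i] for w in known_windows} for i in range(width)]


def classify_window(reference, orthologue, accepted):
    """Label each aligned position of an orthologue's P4-P4' window.

    'identical'    -> same residue as the reference (pink)
    'accepted'     -> substitution seen in the same pocket in another
                      substrate of the peptidase (orange)
    'not_observed' -> substitution never seen in that pocket
                      (white on black)
    """
    labels = []
    for ref, res, pocket in zip(reference, orthologue, accepted):
        if res == ref:
            labels.append("identical")
        elif res in pocket:
            labels.append("accepted")
        else:
            labels.append("not_observed")
    return labels


def is_conserved(labels):
    """A site counts as conserved if no position is 'not_observed'."""
    return "not_observed" not in labels
```

A cleavage site would then be treated as conserved in an orthologue only when every substituted position is labelled `accepted`, which mirrors the definition of conservation used for the alignments.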
An example of a file for submission to the Analyse Substrates service.
| MEROPS ID | UniProt | Cleavage position |
|---|---|---|
| A01.004 | P05067 | 671 |
| A01.009 | P05067 | 705 |
| A01.009 | P05067 | 713 |
| A01.009 | P05067 | 714 |
| A01.009 | P05067 | 719 |
| A01.009 | P05067 | 720 |
| A01.041 | P05067 | 690 |
| A01.041 | P05067 | 691 |
| A22.001 | P05067 | 711 |
| A22.001 | P05067 | 713 |
| A22.001 | P05067 | 714 |
| C01.060 | P05067 | 704 |
| C01.060 | P05067 | 708 |
| C01.060 | P05067 | 711 |
| C01.084 | P05067 | 685 |
| C01.084 | P05067 | 685 |
| C01.084 | P05067 | 685 |
| C01.084 | P05067 | 689 |
| C01.084 | P05067 | 690 |
| C01.084 | P05067 | 690 |
| C14.003 | P05067 | 739 |
| C14.005 | P05067 | 672 |
| C14.005 | P05067 | 739 |
| M02.001 | P05067 | 711 |
| M10.003 | P05067 | 687 |
| M10.003 | P05067 | 705 |
| M10.003 | P05067 | 706 |
| M10.004 | P05067 | 687 |
| M10.004 | P05067 | 691 |
| M10.004 | P05067 | 694 |
| M10.004 | P05067 | 701 |
| M10.004 | P05067 | 704 |
| M10.004 | P05067 | 705 |
| M10.014 | P05067 | 579 |
| M10.014 | P05067 | 687 |
| M10.016 | P05067 | 463 |
| M10.016 | P05067 | 579 |
| M10.016 | P05067 | 622 |
| M10.016 | P05067 | 685 |
| M10.017 | P05067 | 685 |
| M10.017 | P05067 | 687 |
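Before submitting a file like the one above, it may help to check that each row has three fields and to see which cleavage positions are listed for each peptidase. The Python sketch below assumes a plain-text file with whitespace- or tab-separated columns and no header row; the delimiter is not stated here, so this layout, like the function names, is an assumption rather than the format required by the service.

```python
from collections import defaultdict


def parse_submission(text):
    """Parse rows of 'MEROPS ID, UniProt accession, cleavage position'.

    Assumes whitespace-separated fields and no header row.
    """
    records = []
    for line_no, line in enumerate(text.splitlines(), start=1):
        fields = line.split()
        if not fields:
            continue  # skip blank lines
        if len(fields) != 3:
            raise ValueError(
                f"line {line_no}: expected 3 fields, got {len(fields)}"
            )
        merops_id, accession, position = fields
        records.append((merops_id, accession, int(position)))
    return records


def group_cleavages(records):
    """Collect the distinct positions per (peptidase, substrate) pair."""
    grouped = defaultdict(set)
    for merops_id, accession, position in records:
        grouped[(merops_id, accession)].add(position)
    return {key: sorted(positions) for key, positions in grouped.items()}
```

Grouping in this way collapses repeated rows, such as the three entries for C01.084 at position 685, into a single position per peptidase and substrate.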
Results from the Analyse Substrates service.
| MEROPS identifier | Total cleavages known | Substrate UniProt accession | Homologues | Cleaved at | P4 count | P3 count | P2 count | P1 count | P1′ count | P2′ count | P3′ count | P4′ count |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| A01.004 | 24 | P05067 | 352 | 671 | 10 | 7 | 25 | 5 | 0 | 0 | 0 | 1 |
| A01.009 | 897 | P05067 | 352 | 705 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 |
| A01.009 | 897 | P05067 | 352 | 713 | 0 | 0 | 1 | 1 | 6 | 5 | 5 | 5 |
| A01.009 | 897 | P05067 | 352 | 714 | 0 | 1 | 1 | 6 | 5 | 5 | 5 | 5 |
| A01.009 | 897 | P05067 | 352 | 719 | 0 | 0 | 0 | 0 | 0 | 1 | 1 | 1 |
| A01.009 | 897 | P05067 | 352 | 720 | 0 | 0 | 0 | 0 | 1 | 1 | 1 | 3 |
| A01.041 | 33 | P05067 | 352 | 690 | 0 | 2 | 2 | 1 | 1 | 3 | 1 | 10 |
| A01.041 | 33 | P05067 | 352 | 691 | 2 | 2 | 1 | 1 | 1 | 1 | 1 | 1 |
| A22.001 | 16 | P05067 | 352 | 711 | 1 | 0 | 0 | 1 | 1 | 2 | 6 | 6 |
| A22.001 | 16 | P05067 | 352 | 713 | 4 | 1 | 1 | 2 | 6 | 5 | 5 | 5 |
| A22.001 | 16 | P05067 | 352 | 714 | 0 | 1 | 1 | 6 | 5 | 5 | 5 | 8 |
| C01.060 | 632 | P05067 | 352 | 704 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 |
| C01.060 | 632 | P05067 | 352 | 708 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 1 |
| C01.060 | 632 | P05067 | 352 | 711 | 1 | 0 | 0 | 0 | 1 | 1 | 5 | 5 |
| C01.084 | 19 | P05067 | 352 | 685 | 1 | 0 | 30 | 1 | 6 | 1 | 3 | 3 |
| C01.084 | 19 | P05067 | 352 | 685 | 1 | 0 | 30 | 1 | 6 | 1 | 3 | 3 |
| C01.084 | 19 | P05067 | 352 | 685 | 1 | 0 | 30 | 1 | 6 | 1 | 3 | 3 |
| C01.084 | 19 | P05067 | 352 | 689 | 0 | 1 | 3 | 2 | 2 | 6 | 4 | 2 |
| C01.084 | 19 | P05067 | 352 | 690 | 1 | 3 | 3 | 1 | 5 | 4 | 4 | 8 |
| C01.084 | 19 | P05067 | 352 | 690 | 1 | 3 | 3 | 1 | 5 | 4 | 4 | 8 |
| C14.003 | 651 | P05067 | 352 | 739 | 1 | 1 | 4 | 4 | 3 | 3 | 3 | 3 |
| C14.005 | 201 | P05067 | 352 | 672 | 7 | 5 | 4 | 16 | 0 | 0 | 0 | 0 |
| C14.005 | 201 | P05067 | 352 | 739 | 1 | 1 | 4 | 4 | 3 | 3 | 3 | 3 |
| M02.001 | 5 | P05067 | 352 | 711 | 1 | 1 | 4 | 1 | 1 | 2 | 173 | 173 |
| M10.003 | 3417 | P05067 | 352 | 687 | 2 | 0 | 0 | 0 | 0 | 1 | 1 | 1 |
| M10.003 | 3417 | P05067 | 352 | 705 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 |
| M10.003 | 3417 | P05067 | 352 | 706 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 |
| M10.004 | 369 | P05067 | 352 | 687 | 2 | 0 | 0 | 0 | 0 | 1 | 1 | 1 |
| M10.004 | 369 | P05067 | 352 | 691 | 0 | 1 | 1 | 1 | 1 | 1 | 1 | 1 |
| M10.004 | 369 | P05067 | 352 | 694 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| M10.004 | 369 | P05067 | 352 | 701 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| M10.004 | 369 | P05067 | 352 | 704 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 |
| M10.004 | 369 | P05067 | 352 | 705 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 |
| M10.014 | 132 | P05067 | 352 | 579 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| M10.014 | 132 | P05067 | 352 | 687 | 11 | 0 | 0 | 0 | 0 | 1 | 1 | 1 |
| M10.016 | 20 | P05067 | 352 | 463 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 9 |
| M10.016 | 20 | P05067 | 352 | 579 | 7 | 0 | 18 | 0 | 2 | 4 | 0 | 1 |
| M10.016 | 20 | P05067 | 352 | 622 | 13 | 6 | 19 | 2 | 1 | 6 | 11 | 6 |
| M10.016 | 20 | P05067 | 352 | 685 | 0 | 0 | 11 | 1 | 5 | 0 | 0 | 2 |
| M10.017 | 27 | P05067 | 352 | 685 | 0 | 0 | 11 | 1 | 5 | 0 | 0 | 2 |
| M10.017 | 27 | P05067 | 352 | 687 | 11 | 1 | 0 | 0 | 0 | 3 | 1 | 1 |