Neil D Rawlings, Matthew Waller, Alan J Barrett, Alex Bateman.
Abstract
Peptidases, their substrates and inhibitors are of great relevance to biology, medicine and biotechnology. The MEROPS database (http://merops.sanger.ac.uk) aims to fulfill the need for an integrated source of information about these. The database has hierarchical classifications in which homologous sets of peptidases and protein inhibitors are grouped into protein species, which are grouped into families, which are in turn grouped into clans. Recent developments include the following. A community annotation project has been instigated in which acknowledged experts are invited to contribute summaries for peptidases. Software has been written to provide an Internet-based data entry form. Contributors are acknowledged on the relevant web page. A new display showing the intron/exon structures of eukaryote peptidase genes and the phasing of the junctions has been implemented. It is now possible to filter the list of peptidases from a completely sequenced bacterial genome for a particular strain of the organism. The MEROPS filing pipeline has been altered to circumvent the restrictions imposed on non-interactive blastp searches, and a HMMER search using specially generated alignments to maximize the distribution of organisms returned in the search results has been added.
Year: 2013 PMID: 24157837 PMCID: PMC3964991 DOI: 10.1093/nar/gkt953
Source DB: PubMed Journal: Nucleic Acids Res ISSN: 0305-1048 Impact factor: 16.971
Counts of protein species, families and clans for proteolytic enzymes and protein inhibitors in the MEROPS database
| | MEROPS 9.5 Peptidases | MEROPS 9.5 Inhibitors | MEROPS 9.9 Peptidases | MEROPS 9.9 Inhibitors |
|---|---|---|---|---|
| Sequences | 192 053 | 17 451 | 413 834 | 28 502 |
| Identifiers | | | | |
| Experimentally characterized and sequenced | 2308 | 518 | 2438 | 542 |
| Hypothetical from model organisms | 1250 | 0 | 1362 | 0 |
| Not active as peptidase or inhibitor | 298 | 117 | 327 | 115 |
| Experimentally characterized but unsequenced | 145 | 0 | 148 | 0 |
| Pseudogenes | 70 | 0 | 70 | 0 |
| Compound and complex proteins | 15 | 52 | 16 | 49 |
| Total | 4086 | 687 | 4361 | 706 |
| Families | 225 | 71 | 244 | 76 |
| Clans | 44 | 34 | 55 | 39 |
The numbers in Release 9.9 of MEROPS (August 2013) are compared with those in Release 9.5 of MEROPS (July 2011). A peptidase is referred to as ‘unsequenced’ when no sequence is known, or when the known sequence fragments are insufficient to assign the peptidase to a family.
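As a quick consistency check on the table above, the identifier categories can be summed column by column and compared with the reported totals. The following is a minimal sketch; the dictionary layout and the names `IDENTIFIER_COUNTS`, `REPORTED_TOTALS` and `column_sums` are illustrative assumptions, while the numbers are transcribed from the table.

```python
# Identifier counts transcribed from the table above.
# Column order: (9.5 peptidases, 9.5 inhibitors, 9.9 peptidases, 9.9 inhibitors).
# The variable and function names are hypothetical, chosen for this sketch only.
IDENTIFIER_COUNTS = {
    "Experimentally characterized and sequenced": (2308, 518, 2438, 542),
    "Hypothetical from model organisms": (1250, 0, 1362, 0),
    "Not active as peptidase or inhibitor": (298, 117, 327, 115),
    "Experimentally characterized but unsequenced": (145, 0, 148, 0),
    "Pseudogenes": (70, 0, 70, 0),
    "Compound and complex proteins": (15, 52, 16, 49),
}

# The "Total" row as reported in the table.
REPORTED_TOTALS = (4086, 687, 4361, 706)


def column_sums(counts):
    """Sum each of the four columns across all identifier categories."""
    return tuple(sum(column) for column in zip(*counts.values()))


if __name__ == "__main__":
    # True when every column of categories adds up to its reported total.
    print(column_sums(IDENTIFIER_COUNTS) == REPORTED_TOTALS)
```

The same pattern can be reused to compare any two releases, provided the category rows are transcribed in a fixed column order.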
Information in the MEROPS database
| | MEROPS 9.5 | MEROPS 9.9 |
|---|---|---|
| Substrate cleavages: total | 54 838 | 64 022 |
| Substrate cleavages: physiological | 18 280 | 20 591 |
| Substrate cleavages: non-physiological | 28 376 | 35 897 |
| Substrate cleavages: pathological | 990 | 1166 |
| Substrate cleavages: synthetic substrates | 4229 | 4906 |
| Peptidase-inhibitor interactions: total | 4017 | 4485 |
| Peptidase-inhibitor interactions: proteins | 1220 | 1304 |
| Peptidase-inhibitor interactions: SMI | 2373 | 2562 |
| References | 43 497 | 52 600 |
Substrate cleavage totals do not include cleavages derived only from the SwissProt database (mainly removal of initiating methionines and signal peptides). A naturally occurring cleavage is described as ‘physiological’ when the peptidase and substrate are from the same organism and ‘pathological’ if the organisms differ and are pathogen and host. More than half of the cleavage positions in the MEROPS collection have been identified by mass spectrometry, of which over 4800 cleavages were obtained from the PRIDE database (4) and over 3100 from the TOPPR database (5). Over 3300 cleavages were derived from the CutDB database (6). Molecular Connections (Bangalore, India) have provided over 10 000 cleavages collected from the literature. How these data have been annotated has been described previously (7).
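The rule stated above for labelling naturally occurring cleavages can be sketched as a small function. This is only an illustration of the two definitions given in the footnote, not the MEROPS annotation procedure: the function name, its parameters and the organism strings in the usage lines are hypothetical, and cases the footnote does not define return `None`.

```python
def classify_natural_cleavage(peptidase_organism, substrate_organism,
                              is_pathogen_host_pair=False):
    """Label a naturally occurring cleavage using the two footnote definitions.

    Returns 'physiological' when peptidase and substrate come from the same
    organism, 'pathological' when the organisms differ and are pathogen and
    host, and None for any case the footnote does not define.
    """
    if peptidase_organism == substrate_organism:
        return "physiological"
    if is_pathogen_host_pair:
        return "pathological"
    return None


if __name__ == "__main__":
    # Hypothetical organism names, used only to exercise the three branches.
    print(classify_natural_cleavage("Homo sapiens", "Homo sapiens"))
    print(classify_natural_cleavage("pathogen X", "host Y", is_pathogen_host_pair=True))
    print(classify_natural_cleavage("organism A", "organism B"))
```

Whether two organisms form a pathogen and host pair is passed in as a flag here because the text does not say how that relationship is recorded.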
Example of sequences used in an alignment submitted to the HMMER server
| Organism | Phylum | MEROPS identifier | Accession | Residue range |
|---|---|---|---|---|
| Human | Chordata | A01.070 | B4DVY9 | 63–388 |
| | Arthropoda | A01.A66 | Q9VEK4 | 51–370 |
| | Hemichordata | A01.009 | XP_002731917 | 55–386 |
| | Echinodermata | A01.096 | XP_780533 | 66–310 |
| | Annelida | A01.009 | | 12–343 |
| | Nematoda | A01.A73 | CAB60913 | 56–320 |
| | Platyhelminthes | | G4VG04 | 58–336 |
| | Cnidaria | A01.006 | XP_002154870 | 92–417 |
| | Placozoa | | B3RK54 | 16–344 |
| | Porifera | | XP_003385244 | 56–379 |
| | Streptophyta | A01.A33 | O65453 | 33–335 |
| | Rhodophyta | A01.053 | | 82–406 |
| | Chlorophyta | A01.096 | Q7XB41 | 65–307, 490–578 |
| | Ochrophyta | | B7FZ37 | 86–448 |
| | Heterokontophyta | | D7FLX5 | 93–407 |
| | Oomycota | | D0N6R0 | 25–378 |
| | Basidiomycota | | A8N6S9 | 143–366 |
| | Ascomycota | A01.018 | P07267 | 78–405 |
| | Zygomycota | | I1BX70 | 57–254 |
| | Chytridiomycota | A01.018 | F4NZG7 | 69–399 |
| | Sarcomastigophora | A01.A89 | O76856 | 50–378 |
| | Parabasalidea | | A2FIM5 | 44–351 |
The identifiers for the sequences used to generate an alignment for family A1 subfamily A are shown. Where no MEROPS identifier is listed, it is because a putative peptidase was used that could not be mapped to a MEROPS identifier. Accessions cited are mainly UniProt or RefSeq or are Protein Identifiers. The sequences from Capitella capitata and Meloidogyne incognita are translations from the genes Capca1_225009 and Minc12021, respectively. The residue range of the peptidase domain is given; in the case of Q7XB41, an unrelated nested domain interrupts the peptidase domain.
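The residue ranges in the table, including the split range for Q7XB41 where a nested domain interrupts the peptidase domain, can be handled with a short parser. The sketch below assumes 1-based, inclusive residue numbering and comma-separated segments; the function names are hypothetical and are not part of any MEROPS or HMMER tooling.

```python
import re


def parse_residue_ranges(text):
    """Parse a range string such as '65–307, 490–578' into (start, end) pairs.

    Residues are assumed to be 1-based and inclusive. Both the en dash used
    in the table and a plain hyphen are accepted as separators.
    """
    ranges = []
    for part in text.split(","):
        match = re.fullmatch(r"\s*(\d+)\s*[–-]\s*(\d+)\s*", part)
        if match is None:
            raise ValueError(f"unrecognized residue range: {part!r}")
        start, end = int(match.group(1)), int(match.group(2))
        if start > end:
            raise ValueError(f"range start exceeds end: {part!r}")
        ranges.append((start, end))
    return ranges


def domain_length(ranges):
    """Total number of residues covered by all segments."""
    return sum(end - start + 1 for start, end in ranges)


def extract_domain(sequence, ranges):
    """Concatenate the segments of `sequence`, skipping any nested insert."""
    return "".join(sequence[start - 1:end] for start, end in ranges)


if __name__ == "__main__":
    split_range = parse_residue_ranges("65–307, 490–578")  # Q7XB41 row
    print(split_range)
    print(domain_length(split_range))
```

Concatenating the two segments drops the nested domain, which is one plausible way to prepare such a sequence for alignment; the text does not state how the interrupted domain was actually handled.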
Figure 1. Form for the submission of a peptidase summary for the MEROPS community annotation project. The summary for carboxypeptidase A6 (MEROPS identifier M14.018) is shown. The summary was kindly provided by Professor Lloyd Fricker.
Figure 3. Example of a gene structure. The gene structures for cathepsin E (MEROPS identifier A01.010) are shown.