The EBI enzyme portal.

Rafael Alcántara, Joseph Onwubiko, Hong Cao, Paula de Matos, Jennifer A Cham, Jules Jacobsen, Gemma L Holliday, Julia D Fischer, Syed Asad Rahman, Bijay Jassal, Mikael Goujon, Francis Rowland, Sameer Velankar, Rodrigo López, John P Overington, Gerard J Kleywegt, Henning Hermjakob, Claire O'Donovan, María Jesús Martín, Janet M Thornton, Christoph Steinbeck.

Abstract

The availability of comprehensive information about enzymes plays an important role in answering questions relevant to interdisciplinary fields such as biochemistry, enzymology, biofuels, bioengineering and drug discovery. At the EMBL European Bioinformatics Institute, we have developed an enzyme portal (http://www.ebi.ac.uk/enzymeportal) to provide this wealth of information on enzymes from multiple in-house resources addressing particular data classes: protein sequence and structure, reactions, pathways and small molecules. The fact that these data reside in separate databases makes information discovery cumbersome. The main goal of the portal is to simplify this process for end users.

Year:  2012        PMID: 23175605      PMCID: PMC3531056          DOI: 10.1093/nar/gks1112

Source DB:  PubMed          Journal:  Nucleic Acids Res        ISSN: 0305-1048            Impact factor:   16.971


INTRODUCTION

The number of databases registered in the NAR Database Issue increased by 7% during the last year, to 1380 (1), covering a great variety of fields and scopes. Within this landscape, users usually have to navigate from one resource to another to gather the information they need. Portals designed for a particular community can play an important role here, simplifying the search process and presenting views of data from different databases.

Many portals already exist that integrate data from different resources. BioMart Central Portal (2) uses the generic BioMart federation technology to present combined data from virtually any biology database. BioPortal (3) gives access to biomedical ontologies and tools to work with them. BioProject and BioSample (4) include metadata annotations in their records, which allow the aggregated search of experimental data submitted to different NCBI, EBI and DDBJ databases. Other portals focus on more specific scientific fields. The Technology Portal of PSI-SBKB (5) integrates technology summaries and tools for structural biology, along with videos and social networks. InterStoreDB (6) provides plant phenotypic and genomic data from different sequence, crop and alignment databases. The IKMC web portal is a friendly interface for searching many different BioMarts for data about targeted and trapped mouse knockout availability and structure (7).

Enzymes, biomolecules that catalyse specific chemical reactions, are of central importance in many fields of science, and their study is the foundation of the entire field of biochemistry. Enzymes have key regulatory and metabolic roles and are of great commercial and healthcare importance. The engineering of new functions, and the development of specific inhibitors for enzymes, is becoming more important in the post-genomic era.

The EBI hosts a number of resources providing data on enzymes. One of them is IntEnz, the Integrated Enzyme relational database, which provides the NC-IUBMB nomenclature and classification and offers downloads in different formats. Other EBI databases that provide enzyme data, such as UniProtKB and PDBe, are cross-referenced from IntEnz. However, these hyperlinks do not give a good overview of the information users require and make navigation very difficult. In addition, other resources (Reactome, ChEBI, ChEMBL, Rhea) contain detailed data about the biochemistry of enzymes, which could also be integrated for the benefit of users. It was with these aims in mind that the enzyme portal was launched.

The enzyme portal is a free resource that summarizes publicly available information about enzymes, including small-molecule chemistry, biochemical pathways and drug compounds. It provides a concise summary of information from: UniProt Knowledgebase (8); Protein Data Bank in Europe (PDBe) (9); Rhea, a database of enzyme-catalysed reactions (10); Reactome, a database of biochemical pathways (11); IntEnz, a resource with enzyme nomenclature information (12); ChEBI (13) and ChEMBL (14), which contain information about small-molecule chemistry and bioactivity; MACiE (15), for highly detailed, curated information about reaction mechanisms; and EFO (16), the Experimental Factor Ontology, a system for annotating experiments, from which the enzyme portal retrieves disease-related information, specifically from the children of its ‘disease’ entry. The enzyme portal collates diverse information about enzymes and displays it in an organized overview. It covers many species, including mammals, invertebrates and plants, and provides a simple way to compare orthologues.

MATERIALS AND METHODS

User-centred design was applied to improve usability and meet the expectations of users

The portal was designed by following a user-centred design lifecycle (17), whereby decisions regarding the design of the portal were made based on evidence gathered from users representing our target audience, namely, scientists working in enzyme-related research, such as biomarker discovery, enzymology and drug discovery (see also Supplementary Methods).

The portal was implemented using Java technologies

We built the web application with Java technologies (Spring Web MVC, JAXB, JAX-WS) and several web service APIs, either SOAP (EB-eye, ChEBI, CiteXplore) or REST (UniProt, Reactome, Rhea, ChEMBL, BioMart, DAS). XML schemas describe the underlying model. For quick cross-referencing and for building the search result filters, an Oracle database was populated with cross-references between UniProtKB identifiers and other databases, including those not currently cross-referenced directly in the UniProt Knowledgebase, such as ChEBI and ChEMBL. UniProtKB accessions are used throughout the enzyme portal as enzyme identifiers.
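The cross-reference store described above can be illustrated with a minimal sketch. It is written in Python rather than the Java used by the portal, it replaces the Oracle table with in-memory dictionaries, and the class name, method names and all identifiers shown are hypothetical placeholders, not real records or the portal's actual schema.

```python
from collections import defaultdict


class CrossReferenceIndex:
    """Bidirectional lookup between enzyme accessions and identifiers in other databases."""

    def __init__(self):
        # accession -> {(database, identifier), ...}
        self._by_accession = defaultdict(set)
        # (database, identifier) -> {accession, ...}
        self._by_xref = defaultdict(set)

    def add(self, accession, database, identifier):
        """Record one cross-reference in both directions."""
        self._by_accession[accession].add((database, identifier))
        self._by_xref[(database, identifier)].add(accession)

    def identifiers(self, accession, database):
        """Identifiers in `database` linked to the given accession."""
        refs = self._by_accession.get(accession, set())
        return sorted(ident for db, ident in refs if db == database)

    def accessions(self, database, identifier):
        """Accessions linked to the given identifier in `database`."""
        return sorted(self._by_xref.get((database, identifier), set()))


# Placeholder identifiers, for illustration only.
index = CrossReferenceIndex()
index.add("ACC_0001", "ChEBI", "CHEBI:EXAMPLE_1")
index.add("ACC_0001", "ChEMBL", "CHEMBL_EXAMPLE_1")
index.add("ACC_0002", "ChEBI", "CHEBI:EXAMPLE_1")

print(index.identifiers("ACC_0001", "ChEBI"))
print(index.accessions("ChEBI", "CHEBI:EXAMPLE_1"))
```

Storing both directions is what makes the same table usable for building filters (which compounds relate to these enzymes) and for searching (which enzymes relate to this compound).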

SEARCHING THE ENZYME PORTAL

Searches in the enzyme portal are based on the powerful EB-eye search engine (18), which indexes many EBI resources, updates its indexes with every release and provides a web service API. From the website homepage, a free text search can use any relevant query terms, such as enzyme names, EC numbers, UniProtKB accessions, gene names and small-molecule names. Search results are shown in a table (Figure 1), with orthologues grouped and the results split into separate pages where appropriate. The results list the enzyme name, a description of its function and any synonyms, as well as the list of species in which the enzyme is found. Any related diseases are also displayed.
Figure 1. Search results are grouped as orthologues. Note the paging (top right of the table) and the filter facets (left).

Each result shows on its left side a fully coloured thumbnail of the protein structure, or a greyed image if no structure is found for any of the grouped orthologues.
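As an illustration of the result layout just described, the following sketch groups search hits into orthologue groups and splits the groups into pages. It assumes, purely for the example, that orthologues are hits sharing the same enzyme name; the text does not state how the portal decides this. The function names and sample records are hypothetical.

```python
def group_orthologues(hits):
    """Group (enzyme_name, species, accession) hits by enzyme name.

    Groups keep the order in which their first hit appears, so the
    ranking returned by the search engine is preserved.
    """
    groups = {}
    for name, species, accession in hits:
        groups.setdefault(name, []).append((species, accession))
    return list(groups.items())


def page(groups, page_number, page_size=10):
    """Return the groups shown on a 1-based page number."""
    start = (page_number - 1) * page_size
    return groups[start:start + page_size]


# Placeholder hits, for illustration only.
hits = [
    ("enzyme A", "Homo sapiens", "ACC_1"),
    ("enzyme B", "Mus musculus", "ACC_2"),
    ("enzyme A", "Mus musculus", "ACC_3"),
]
groups = group_orthologues(hits)
first_page = page(groups, 1, page_size=1)
```

Paging is applied to groups rather than to individual hits, so all orthologues of one enzyme stay together on the same row.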

Search filters

On the left-hand side, there is a list of species, compounds and diseases which are related to the search results. Users can filter these simply by clicking the corresponding checkbox so that only enzymes matching the checked items will be shown, hiding the others.

Species

A list of the species in which the enzymes in the search results are found is shown, so that users can narrow the search to any species of interest (Figure 2).
Figure 2. Search results for ‘CFTR’ filtered to show only enzymes known in Ma’s night monkey.

Compound

Any small molecules known to interact in some way with the enzymes in the list are shown here and can also be used as filters. This includes cofactors, activators, inhibitors and drugs.

Disease

Search results can also be filtered according to any diseases associated with them. For example, checking a box labelled ‘stroke’ will display only those enzymes which have been related to this disease. Note that several filters within the same section (species, compounds or diseases) produce the union of results (boolean OR), while filters from different sections produce the intersection of results (boolean AND). For example, checking Xenopus laevis and Drosophila melanogaster will display enzymes present in either species; checking Rattus norvegicus and GMP3- will display only enzymes present in rat and known to interact with the nucleotide.
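The filter semantics described above (OR within a section, AND across sections) can be sketched as follows. This is a minimal illustration, not the portal's implementation; the dictionary layout, the function names and the enzyme records are assumptions made for the example, while the species and compound names are the ones used in the text.

```python
def matches(enzyme, selected):
    """Return True if the enzyme passes all checked filters.

    enzyme:   dict mapping a section name to the set of values annotated on the enzyme
    selected: dict mapping a section name to the set of checked values
    """
    for section, checked in selected.items():
        if not checked:
            continue  # nothing checked in this section: no restriction
        # OR within a section: one shared value is enough.
        if not (enzyme.get(section, set()) & checked):
            return False  # AND across sections: every section must match
    return True


def apply_filters(enzymes, selected):
    """Keep only the enzymes matching the checked filters."""
    return [e for e in enzymes if matches(e, selected)]


# Hypothetical enzyme records.
enzymes = [
    {"name": "enzyme 1", "species": {"Xenopus laevis"}, "compounds": set()},
    {"name": "enzyme 2",
     "species": {"Drosophila melanogaster", "Rattus norvegicus"},
     "compounds": {"GMP3-"}},
    {"name": "enzyme 3", "species": {"Rattus norvegicus"}, "compounds": set()},
]

either_species = apply_filters(
    enzymes, {"species": {"Xenopus laevis", "Drosophila melanogaster"}})
rat_and_gmp = apply_filters(
    enzymes, {"species": {"Rattus norvegicus"}, "compounds": {"GMP3-"}})

print([e["name"] for e in either_species])  # enzyme 1 and enzyme 2
print([e["name"] for e in rat_and_gmp])     # enzyme 2 only
```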

ENZYME DATA

Clicking on an enzyme name (or one of its orthologues) in the list of search results takes the user to the enzyme page, which is organized into tabs (on the left-hand side).

Enzyme summary

The first one (Figure 3) is the enzyme summary tab, which contains the description of the enzyme function, its classification in the EC hierarchy, any synonyms and some information about the protein sequence. An ‘organisms’ drop-down menu at the top can be used to switch between the known orthologues of the enzyme, from any of the available tabs.
Figure 3. Enzyme summary. Orthologues can be selected using the ‘organisms’ drop-down menu at the top. Notice the breadcrumbs at the top left, which include links to the search results and any other orthologues visited previously.

While navigating search results and orthologues, a history of the user’s navigation is kept in the form of breadcrumbs. Throughout the portal, all data sources are acknowledged and linked, so that users can access in-depth information (such as the protein sequence from UniProtKB or the enzyme EC classification from IntEnz).

Protein 3D structure

The protein structure tab (Figure 4) shows any experimental 3D models of the enzyme. If there are several of them, any one can be selected from the drop-down menu, which indicates the number of structures available.
Figure 4. 3D structure tab showing basic information about experimental models of the protein structure.

Only basic information on the model is displayed: again, users can navigate to the PDBe web pages that describe the structure.

Reactions and pathways

The reactions and pathways tab (Figure 5) shows the biochemical reaction(s) catalysed by the enzyme—with the chemical structures of the participants linked to ChEBI—as well as any metabolic pathways in which the enzyme may be involved.
Figure 5. Reactions and pathways tab: chemical structures are hyperlinks to the corresponding entities in ChEBI. Additional information about the reaction available from Rhea, Reactome and MACiE is also linked from here. This tab also includes a list of pathways in which every reaction can be involved, including descriptions and graphics when available.

Only summarized information is shown in this tab. For more information (additional data about the reaction, reaction mechanisms and context within pathways), users are referred to the data source (in this case, Rhea, MACiE and Reactome, respectively) through the provided hyperlinks, when available.

Small molecules

The small molecules tab (Figure 6) includes any available information from UniProtKB about cofactors, activators, inhibitors and drugs, and also any bioactive compounds from the ChEMBL database which have been associated with the enzyme. The chemical structures are links to navigate to ChEBI (with its focus on structure and nomenclature) or ChEMBL (with its focus on bioactivity and function).
Figure 6. Small molecules that have been associated in some way with the enzyme, including drugs, inhibitors and activators.

Diseases

The disease tab (Supplementary Figure S1) lists any diseases that are related to the selected enzyme—OMIM entries and MeSH terms cross-referenced from UniProtKB, translated into EFO identifiers, with a short description of the disease and how it relates to the enzyme.

Literature

The literature tab (Supplementary Figure S2) lists bibliographic citations relevant to the enzyme. Article titles are linked to the EBI’s CiteXplore bibliography database. If an abstract is available, clicking on the ‘toggle abstract’ link will show it. Citations can be filtered according to the different aspects of the enzyme, i.e. the other tabs.

DISCUSSION

The enzyme portal integrates many resources, most of them hosted by the EBI, along with external ones such as BioPortal. Its main goal is to provide information about enzymes in a suitable format, with a usable interface designed for its intended users. Instead of reinventing the wheel, it makes use of available and reliable resources to that end. Although some of these resources already incorporate real semantics, others do not, which makes it difficult to extract meaningful information on enzymes using existing semantic web technologies. The portal fills these gaps using the well-known relationships between the EBI databases.

The EB-eye tool (18) provides gene and protein summaries for search results. The enzyme portal complements these with additional metabolic information: catalytic activity, pathways, regulation by small molecules and related diseases. It also keeps a look and feel consistent with that of the EB-eye summaries. The original data sources are always acknowledged and linked from the enzyme portal pages. The portal offers useful overviews of enzymes, but users are referred to the original databases for in-depth information.

The data provided by the enzyme portal are live in the sense that they are not stored in a data warehouse but retrieved on demand from the different data sources. Relying on web services from the provider databases reduces maintenance to a minimum and ensures that the most recent data are shown, which is an advantage over the data warehouse approach.

The enzyme portal is a one-stop shop for enzyme-related information in resources developed at the EBI. It gathers this information and aims to present it to the scientist with a unified user experience. The enzyme portal team does not curate enzyme information, so the portal is a secondary information resource. At some point, a user interested in more detail will leave its pages and refer directly to the information in the underlying primary database (UniProtKB, PDBe, etc.).

BRENDA (19) is the most comprehensive resource about enzymes worldwide, and a great deal has been invested in its abstraction and curation of enzymes and their related information. BRENDA contains valuable information that cannot currently be found in the enzyme portal, such as substrate, kinetic, specificity, stability, application, disease-related and engineering data. As a primary resource, BRENDA could be a candidate information source for the enzyme portal in the future.
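The on-demand retrieval discussed in this section, as opposed to a data warehouse, can be pictured with the following sketch. It is an assumption-laden illustration: the function names are hypothetical, the two providers are local stubs standing in for real web service calls, and the choice to skip a failing provider rather than abort is a design decision made for the example, not something stated about the portal.

```python
def build_enzyme_summary(accession, providers):
    """Query each provider on demand and merge the answers into one summary.

    providers: dict mapping a section name to a callable that takes an
               accession and returns that section's data.
    """
    summary = {"accession": accession, "sections": {}, "unavailable": []}
    for section, fetch in providers.items():
        try:
            summary["sections"][section] = fetch(accession)
        except Exception:
            # Assumed behaviour: one unreachable provider should not
            # prevent the other sections from being shown.
            summary["unavailable"].append(section)
    return summary


# Stub providers standing in for remote web services.
def fetch_sequence(accession):
    return {"source": "UniProtKB", "accession": accession}


def fetch_structures(accession):
    raise ConnectionError("provider unreachable")


summary = build_enzyme_summary(
    "ACC_0001",  # placeholder accession
    {"sequence": fetch_sequence, "structures": fetch_structures},
)
```

Because nothing is copied into a local store, each page view reflects whatever the provider currently serves; the trade-off is that the page depends on the availability of every provider at request time.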

Future technical developments

Programmatic access through web services will be provided in the future, making use of the existing XML schemas that define the underlying object model. Other features requested during the design process were customized downloads (users will be able to save search results and enzyme summaries in the format of their choice) and side-by-side enzyme comparison. Additional means to browse the data through enzyme classification, compound classification and disease annotation have also been requested.

SUPPLEMENTARY DATA

Supplementary Data are available at NAR Online: Supplementary Figures 1 and 2, Supplementary Methods and Supplementary References [20-22].

FUNDING

The European Molecular Biology Laboratory (core funding). Funding for open access charge: European Molecular Biology Laboratory–European Bioinformatics Institute.

Conflict of interest statement. None declared.
  18 in total

1.  Fast and efficient searching of biological data resources--using EB-eye.

Authors:  Franck Valentin; Silvano Squizzato; Mickael Goujon; Hamish McWilliam; Juri Paern; Rodrigo Lopez
Journal:  Brief Bioinform       Date:  2010-02-11       Impact factor: 11.622

2.  Modeling sample variables with an Experimental Factor Ontology.

Authors:  James Malone; Ele Holloway; Tomasz Adamusiak; Misha Kapushesky; Jie Zheng; Nikolay Kolesnikov; Anna Zhukova; Alvis Brazma; Helen Parkinson
Journal:  Bioinformatics       Date:  2010-03-03       Impact factor: 6.937

3.  BRENDA, the enzyme information system in 2011.

Authors:  Maurice Scheer; Andreas Grote; Antje Chang; Ida Schomburg; Cornelia Munaretto; Michael Rother; Carola Söhngen; Michael Stelzer; Juliane Thiele; Dietmar Schomburg
Journal:  Nucleic Acids Res       Date:  2010-11-09       Impact factor: 16.971

4.  The IKMC web portal: a central point of entry to data and resources from the International Knockout Mouse Consortium.

Authors:  Martin Ringwald; Vivek Iyer; Jeremy C Mason; Kevin R Stone; Hamsa D Tadepally; James A Kadin; Carol J Bult; Janan T Eppig; Darren J Oakley; Sebastien Briois; Elia Stupka; Vincenza Maselli; Damian Smedley; Songyan Liu; Jens Hansen; Richard Baldock; Geoff G Hicks; William C Skarnes
Journal:  Nucleic Acids Res       Date:  2010-10-06       Impact factor: 16.971

5.  PDBe: Protein Data Bank in Europe.

Authors:  S Velankar; Y Alhroub; C Best; S Caboche; M J Conroy; J M Dana; M A Fernandez Montecelo; G van Ginkel; A Golovin; S P Gore; A Gutmanas; P Haslam; P M S Hendrickx; E Heuson; M Hirshberg; M John; I Lagerstedt; S Mir; L E Newman; T J Oldfield; A Patwardhan; L Rinaldi; G Sahni; E Sanz-García; S Sen; R Slowley; A Suarez-Uruena; G J Swaminathan; M F Symmons; W F Vranken; M Wainwright; G J Kleywegt
Journal:  Nucleic Acids Res       Date:  2011-11-21       Impact factor: 16.971

6.  Reorganizing the protein space at the Universal Protein Resource (UniProt).

Authors:  The UniProt Consortium
Journal:  Nucleic Acids Res       Date:  2011-11-18       Impact factor: 16.971

7.  ChEMBL: a large-scale bioactivity database for drug discovery.

Authors:  Anna Gaulton; Louisa J Bellis; A Patricia Bento; Jon Chambers; Mark Davies; Anne Hersey; Yvonne Light; Shaun McGlinchey; David Michalovich; Bissan Al-Lazikani; John P Overington
Journal:  Nucleic Acids Res       Date:  2011-09-23       Impact factor: 16.971

8.  MACiE: exploring the diversity of biochemical reactions.

Authors:  Gemma L Holliday; Claudia Andreini; Julia D Fischer; Syed Asad Rahman; Daniel E Almonacid; Sophie T Williams; William R Pearson
Journal:  Nucleic Acids Res       Date:  2011-11-03       Impact factor: 16.971

9.  BioMart Central Portal: an open database network for the biological community.

Authors:  Jonathan M Guberman; J Ai; O Arnaiz; Joachim Baran; Andrew Blake; Richard Baldock; Claude Chelala; David Croft; Anthony Cros; Rosalind J Cutts; A Di Génova; Simon Forbes; T Fujisawa; E Gadaleta; D M Goodstein; Gunes Gundem; Bernard Haggarty; Syed Haider; Matthew Hall; Todd Harris; Robin Haw; S Hu; Simon Hubbard; Jack Hsu; Vivek Iyer; Philip Jones; Toshiaki Katayama; R Kinsella; Lei Kong; Daniel Lawson; Yong Liang; Nuria Lopez-Bigas; J Luo; Michael Lush; Jeremy Mason; Francois Moreews; Nelson Ndegwa; Darren Oakley; Christian Perez-Llamas; Michael Primig; Elena Rivkin; S Rosanoff; Rebecca Shepherd; Reinhard Simon; B Skarnes; Damian Smedley; Linda Sperling; William Spooner; Peter Stevenson; Kevin Stone; J Teague; Jun Wang; Jianxin Wang; Brett Whitty; D T Wong; Marie Wong-Erasmus; L Yao; Ken Youens-Clark; Christina Yung; Junjun Zhang; Arek Kasprzyk
Journal:  Database (Oxford)       Date:  2011-09-18       Impact factor: 3.451

10.  Chemical Entities of Biological Interest: an update.

Authors:  Paula de Matos; Rafael Alcántara; Adriano Dekker; Marcus Ennis; Janna Hastings; Kenneth Haug; Inmaculada Spiteri; Steve Turner; Christoph Steinbeck
Journal:  Nucleic Acids Res       Date:  2009-10-23       Impact factor: 16.971

