Mark Davies1, Michał Nowotka1, George Papadatos1, Nathan Dedman1, Anna Gaulton1, Francis Atkinson1, Louisa Bellis1, John P Overington2.
Abstract
ChEMBL is now a well-established resource in the fields of drug discovery and medicinal chemistry research. The ChEMBL database curates and stores standardized bioactivity, molecule, target and drug data extracted from multiple sources, including the primary medicinal chemistry literature. Programmatic access to ChEMBL data has been improved by a recent update to the ChEMBL web services (version 2.0.x, https://www.ebi.ac.uk/chembl/api/data/docs), which exposes significantly more data from the underlying database and introduces new functionality. To complement the data-focused services, a utility service (version 1.0.x, https://www.ebi.ac.uk/chembl/api/utils/docs), which provides RESTful access to commonly used cheminformatics methods, has also been concurrently developed. The ChEMBL web services can be used together or independently to build applications and data processing workflows relevant to drug discovery and chemical biology.
Year: 2015 PMID: 25883136 PMCID: PMC4489243 DOI: 10.1093/nar/gkv352
Source DB: PubMed Journal: Nucleic Acids Res ISSN: 0305-1048 Impact factor: 16.971
ChEMBL web service resources
| Name | Description | Example |
|---|---|---|
| Activity | Activity values recorded in an Assay | |
| Assay | Assay details as reported in source Document/Dataset | |
| ATC | WHO ATC Classification for drugs | |
| BindingSite | Target binding site definition | |
| Biotherapeutic | Biotherapeutic molecules | |
| CellLine | Cell line information | |
| ChEMBL-IdLookup | Look up ChEMBL Id entity type | |
| Document | Document/Dataset from which Assays have been extracted | |
| Mechanism | Mechanism of action information for FDA-approved drugs | |
| Molecule | Molecule/biotherapeutics information | |
| MoleculeForm | Relationships between molecule parents and salts | |
| Target | Targets (protein and non-protein) defined in Assay | |
| TargetComponent | Target sequence information (a Target may have one or more sequences) | |
| Image | Graphical (png, svg, json) representation of Molecule | |
| ProteinClassification | Protein family classification of TargetComponents | |
| Substructure | Molecule substructure search | |
| Similarity | Molecule similarity search | |
| Source | Document/Dataset source | |
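To make the resource table concrete, the following minimal sketch builds request URLs for a resource list and for a single record. The base URL comes from the abstract and the `activity.json` endpoint appears in the Figure 4 caption; the other lowercase endpoint name, the single-record path pattern, and the identifier `CHEMBL25` are assumptions used for illustration only.

```python
# Base URL of the data services, as given in the abstract.
BASE_URL = "https://www.ebi.ac.uk/chembl/api/data"


def resource_url(resource, identifier=None, fmt="json"):
    """Build a URL for a resource list, or for one record when an identifier is given.

    The "<resource>/<identifier>.<format>" pattern is an assumption of this sketch.
    """
    path = resource if identifier is None else f"{resource}/{identifier}"
    return f"{BASE_URL}/{path}.{fmt}"


# List endpoint for the Activity resource (matches the URL in the Figure 4 caption).
print(resource_url("activity"))
# Single-record endpoint; "molecule" and "CHEMBL25" are illustrative assumptions.
print(resource_url("molecule", "CHEMBL25"))
```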
Figure 1. Interactive online SPORE documentation for the Activity resource.
Figure 2. ChEMBL web service schema diagram. The oval shapes represent ChEMBL web service resources and the line between two resources indicates that they share a common attribute. The arrow direction shows where the primary information about a resource type can be found. A dashed line indicates that the relationship between two resources behaves differently. For example, the Image resource provides a graphical representation of a Molecule.
Figure 3. ChEMBL web service filtering and sorting examples. A: Example arguments for web service filter query (note the double underscore between
Example filter types that can be used in ChEMBL web services
| Filter types | Description | Example |
|---|---|---|
| exact (iexact) | Exact match with query | |
| contains (icontains) | Wildcard search with query | |
| startswith (istartswith) | Starts with query | |
| endswith (iendswith) | Ends with query | |
| regex (iregex) | Regular expression query | |
| gt (gte) | Greater than (or equal) | |
| lt (lte) | Less than (or equal) | |
| range | Within a range of values | |
| in | Appears within list of query values | |
| isnull | Field is null | |
The ‘i’ versions of the filter types, e.g. iexact, are the case-insensitive forms. The ‘gt’ and ‘lt’ filter_type examples demonstrate how to access a field within a nested block. In these cases the full_mwt and alogp fields are contained within a ‘molecule_properties’ block. To access a field inside such a block, prefix the field name with the outer block name and a double underscore, e.g. ‘molecule_properties__full_mwt’.
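The naming rule above can be sketched as a small helper that joins block name, field name, and filter type with double underscores and then encodes the result as a query string. This assumes the filter type is appended to the field with a double underscore (the convention used by Django and Tastypie); the `molecule` endpoint name and the threshold values 500 and 2 are illustrative assumptions.

```python
from urllib.parse import urlencode

BASE_URL = "https://www.ebi.ac.uk/chembl/api/data"


def filter_key(field, filter_type=None, block=None):
    """Join block, field and filter type with double underscores, skipping empty parts."""
    return "__".join(part for part in (block, field, filter_type) if part)


def filter_url(resource, filters, fmt="json"):
    """Append URL-encoded filter arguments to a resource list endpoint."""
    return f"{BASE_URL}/{resource}.{fmt}?{urlencode(filters)}"


# Nested fields from the note above; the numeric thresholds are made up for the example.
filters = {
    filter_key("full_mwt", "lte", block="molecule_properties"): 500,
    filter_key("alogp", "gt", block="molecule_properties"): 2,
}
print(filter_url("molecule", filters))
# https://www.ebi.ac.uk/chembl/api/data/molecule.json?molecule_properties__full_mwt__lte=500&molecule_properties__alogp__gt=2
```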
Example ChEMBL web service calls, which would return protein targets that interact with drugs classified as being used in the treatment of diabetes
| Step | Description | URL |
|---|---|---|
| 1 | Return the distinct set of molecules that match ATC codes starting with ‘A10’ | |
| 2 | Return the distinct set of targets from the activities resource for previously matched molecules, where the pChEMBL value is greater than or equal to 6. For each molecule returned in Step 1, the following example URL will be requested, changing the molecule_chembl_id each time. | |
| 3 | Return additional target data e.g. name, organism, target type and accessions. For each target returned in Step 2, the following example URL will be requested, changing the target chembl_id each time. |
Additional processing of the results is required at each stage to generate the final results.
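The three steps can be sketched as one function that takes a fetcher as an argument, so the "additional processing" (collecting distinct identifiers between steps) is explicit. The filter names `atc_classifications__level5__startswith` and `pchembl_value__gte`, the response keys `molecules` and `activities`, and the `target/<id>.json` path are assumptions of this sketch, not URLs taken from the table. It also reads only the first page of each response, so it is an outline rather than a complete implementation.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

BASE_URL = "https://www.ebi.ac.uk/chembl/api/data"


def fetch_json(url):
    """Default fetcher: GET a URL and decode the JSON body."""
    with urlopen(url) as response:
        return json.load(response)


def list_url(resource, **filters):
    """Build a list-endpoint URL with filter arguments."""
    return f"{BASE_URL}/{resource}.json?{urlencode(filters)}"


def diabetes_targets(fetch=fetch_json):
    """Return target records for molecules whose ATC codes start with 'A10'."""
    # Step 1: distinct molecules matching the ATC prefix (filter name is assumed).
    step1 = fetch(list_url("molecule", atc_classifications__level5__startswith="A10"))
    molecule_ids = {m["molecule_chembl_id"] for m in step1["molecules"]}

    # Step 2: distinct targets from activities with pChEMBL >= 6, one request per molecule.
    target_ids = set()
    for molecule_id in sorted(molecule_ids):
        step2 = fetch(list_url("activity", molecule_chembl_id=molecule_id,
                               pchembl_value__gte=6))
        target_ids.update(a["target_chembl_id"] for a in step2["activities"])

    # Step 3: one request per target for the additional target data.
    return [fetch(f"{BASE_URL}/target/{target_id}.json")
            for target_id in sorted(target_ids)]
```

Passing the fetcher in keeps the control flow testable without network access; calling `diabetes_targets()` with the default fetcher would issue live requests.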
Figure 4. ChEMBL web service page_meta section from a request to https://www.ebi.ac.uk/chembl/api/data/activity.json.
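A response that carries a page_meta section can be consumed page by page. The sketch below assumes that page_meta contains a `next` entry holding a link to the following page (or null on the last page) and that the records sit under a key named after the resource, such as `activities`; `fetch` is a hypothetical callable that returns the decoded JSON for a URL.

```python
from urllib.parse import urljoin


def iter_records(first_url, records_key, fetch):
    """Yield every record across pages by following page_meta['next'].

    `fetch` is any callable mapping a URL to a decoded JSON dictionary.
    """
    url = first_url
    while url:
        page = fetch(url)
        yield from page[records_key]
        next_link = page["page_meta"].get("next")
        # A relative link is resolved against the current URL; a missing or
        # null link ends the loop.
        url = urljoin(url, next_link) if next_link else None
```

Because the function is a generator, a caller can stop early without requesting the remaining pages.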
Figure 5. Example of using the ChEMBL web service client to return protein targets that interact with drugs classified as being used in the treatment of diabetes.
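Since the figure itself is not reproduced here, the following is a rough sketch of what such a client-based query might look like, not the authors' code. It assumes the `chembl_webresource_client` package with its `new_client` interface, and the filter names are the same assumptions as in a raw URL query; `chunked` and the batch size of 50 are hypothetical choices to keep each `__in` query short.

```python
def chunked(items, size):
    """Split a list into consecutive slices of at most `size` items."""
    return [items[i:i + size] for i in range(0, len(items), size)]


def diabetes_targets_with_client(chunk_size=50):
    """Return targets hit (pChEMBL >= 6) by molecules with ATC codes starting 'A10'."""
    # Imported lazily so that the helper above works without the package installed.
    from chembl_webresource_client.new_client import new_client

    # Step 1: molecules whose ATC classification starts with 'A10'.
    molecules = new_client.molecule.filter(
        atc_classifications__level5__startswith="A10"
    ).only(["molecule_chembl_id"])
    molecule_ids = sorted({m["molecule_chembl_id"] for m in molecules})

    # Step 2: distinct targets from sufficiently potent activities, in batches.
    target_ids = set()
    for batch in chunked(molecule_ids, chunk_size):
        activities = new_client.activity.filter(
            molecule_chembl_id__in=batch, pchembl_value__gte=6
        ).only(["target_chembl_id"])
        target_ids.update(a["target_chembl_id"] for a in activities)

    # Step 3: full target records for the collected identifiers.
    targets = []
    for batch in chunked(sorted(target_ids), chunk_size):
        targets.extend(new_client.target.filter(target_chembl_id__in=batch))
    return targets
```

Calling `diabetes_targets_with_client()` issues live requests, so it needs both the package and network access.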
Figure 6. ChEMBL web service software components. The green circles correspond to Python libraries developed by the ChEMBL group, which are available on the ChEMBL GitHub site. Django (https://www.djangoproject.com/) and Tastypie (https://django-tastypie.readthedocs.org) are open source Python libraries that are core dependencies of the ChEMBL web services.
Figure 7. (A) KNIME workflow combining data and utility ChEMBL web service requests. (B) The image of the drug palbociclib is extracted from a patent document and then converted to the corresponding 2D structure via a Beaker call, shown in (C). The structure is then used as a query in a similarity search web service call against the ChEMBL database. Finally, all bioactivities for the drug and its close analogues are retrieved via a third web service call. (D) The resulting bioactivities are filtered and plotted in a heat map across compounds and their corresponding biological targets.