Anna Gaulton, Louisa J Bellis, A Patricia Bento, Jon Chambers, Mark Davies, Anne Hersey, Yvonne Light, Shaun McGlinchey, David Michalovich, Bissan Al-Lazikani, John P Overington.
Abstract
ChEMBL is an Open Data database containing binding, functional and ADMET information for a large number of drug-like bioactive compounds. These data are manually abstracted from the primary published literature on a regular basis, then further curated and standardized to maximize their quality and utility across a wide range of chemical biology and drug-discovery research problems. Currently, the database contains 5.4 million bioactivity measurements for more than 1 million compounds and 5200 protein targets. Access is available through a web-based interface, data downloads and web services at https://www.ebi.ac.uk/chembldb.
Year: 2011 PMID: 21948594 PMCID: PMC3245175 DOI: 10.1093/nar/gkr777
Source DB: PubMed Journal: Nucleic Acids Res ISSN: 0305-1048 Impact factor: 16.971
Sources of compound and bioactivity data in ChEMBL_11
| Data Source | Number of compound structures | Number of assays | Number of activity results | Number of targets | Number of protein targets | Number of organisms |
|---|---|---|---|---|---|---|
| ChEMBL literature extraction | 629 943 | 580 624 | 3 282 945 | 7 957 | 5104 | 1552 |
| PubChem BioAssayᵃ | 364 203 | 1636 | 2 079 974 | 681 | 647 | 63 |
| GSK TCAMS Malaria Data | 13 467 | 6 | 81 198 | 3 | 0 | 2 |
| PDBe Ligands | 12 337 | 0 | 0 | 0 | 0 | 0 |
| Novartis-GNF Malaria Data | 5675 | 4 | 22 788 | 3 | 0 | 2 |
| St Jude Children's Hospital Malaria Dataᵇ | 1524 | 16 | 5456 | 8 | 0 | 5 |
| Guide to Receptors and Channels | 560 | 344 | 801 | 239 | 239 | 6 |
| Sanger Institute Genomics of Drug Sensitivity in Cancer | 17 | 352 | 5984 | 352 | 0 | 1 |
ᵃ The PubChem BioAssay set includes only confirmatory/panel assays from PubChem that have dose–response end points.
ᵇ Only compounds with dose–response measurements from the St Jude malaria screening data set have been incorporated into ChEMBL, but the full high-throughput screening data can be downloaded from the ChEMBL-NTD website: https://www.ebi.ac.uk/chemblntd.
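As a quick consistency check, the per-source counts in the table can be summed and compared with the headline figures in the abstract (5.4 million bioactivity measurements, more than 1 million compounds). The sketch below is illustrative only: the dictionary and function names are our own, and summing compound structures assumes that no structure is shared between sources, which the table does not state, so that total is best read as an upper bound.

```python
# Per-source counts copied from the table above:
# (number of compound structures, number of activity results)
SOURCES = {
    "ChEMBL literature extraction": (629_943, 3_282_945),
    "PubChem BioAssay": (364_203, 2_079_974),
    "GSK TCAMS Malaria Data": (13_467, 81_198),
    "PDBe Ligands": (12_337, 0),
    "Novartis-GNF Malaria Data": (5_675, 22_788),
    "St Jude Children's Hospital Malaria Data": (1_524, 5_456),
    "Guide to Receptors and Channels": (560, 801),
    "Sanger Institute Genomics of Drug Sensitivity in Cancer": (17, 5_984),
}


def totals(sources):
    """Return (compound structures, activity results) summed over all sources.

    Assumption: compound structures are not shared between sources, so the
    compound total is an upper bound rather than a count of distinct compounds.
    """
    compounds = sum(c for c, _ in sources.values())
    activities = sum(a for _, a in sources.values())
    return compounds, activities


compounds, activities = totals(SOURCES)
# Fraction of all activity results that come from literature extraction.
literature_share = SOURCES["ChEMBL literature extraction"][1] / activities

print(f"Compound structures (summed over sources): {compounds:,}")
print(f"Activity results (summed over sources):    {activities:,}")
print(f"Share of activities from literature:       {literature_share:.1%}")
```

Under that assumption, the sums come to 1 027 726 compound structures and 5 479 146 activity results, in line with the figures quoted in the abstract.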
Figure 1. Retrieving bioactivity data with a substructure search. A choice of sketchers allows the user to enter a structure of interest and search the database for compounds similar to that structure or containing it as a substructure (a). The resulting list of compounds can then be filtered graphically according to their physicochemical properties (e.g. calculated lipophilicity, AlogP, and molecular weight) using the sliders and ‘update chart’ button (b). When a suitable compound set has been created, a drop-down menu allows the user to retrieve all relevant bioactivity results from the database, or to filter the results further by activity type (c).
Figure 2. Compound report card for Fingolimod (CHEMBL314854) showing synonyms, approved drug features (see Supplementary Figure 2), a link to retrieve clinical trial data, calculated compound properties and structure representations, and different salt forms of the molecule (in this case, a hydrochloride salt). The lower portion of the page has a series of clickable widgets showing a breakdown of the activity data for this compound by activity type (e.g. IC50, EC50), assay type (e.g. binding/functional/ADMET) or target type (e.g. enzyme, receptor). Clicking on a segment of one of the pie charts takes the user directly to the relevant bioactivity results.