Yanli Wang, Stephen H Bryant, Tiejun Cheng, Jiyao Wang, Asta Gindulyte, Benjamin A Shoemaker, Paul A Thiessen, Siqian He, Jian Zhang.
Abstract
PubChem's BioAssay database (https://pubchem.ncbi.nlm.nih.gov) has served as a public repository for small-molecule and RNAi screening data since 2004, providing the community with open access to its data content. PubChem accepts data submissions from researchers worldwide in academia, industry and government agencies. PubChem also collaborates with other chemical biology database stakeholders through data exchange. After more than a decade of development, it has become an important information resource supporting drug discovery and chemical biology research. To facilitate data discovery, PubChem is integrated with all other databases at NCBI. In this work, we provide an update on the PubChem BioAssay database, describing several recent developments, including added sources of research data, a redesigned BioAssay record page, a new BioAssay classification browser and new features in the Upload system that facilitate data sharing. Published by Oxford University Press on behalf of Nucleic Acids Research 2016. This work is written by (a) US Government employee(s) and is in the public domain in the US.
Year: 2016 PMID: 27899599 PMCID: PMC5210581 DOI: 10.1093/nar/gkw1118
Source DB: PubMed Journal: Nucleic Acids Res ISSN: 0305-1048 Impact factor: 16.971
A list of PubChem BioAssay services
| Service | Description | URL example |
|---|---|---|
| BioAssay Record Page | Access and download a bioassay record | |
| BioAssay Search | Search BioAssay Database with Entrez | |
| BioAssay Search, Advanced page | An interface for searching multiple search fields | |
| | An interface for reviewing search history and refining search results with Boolean operations | |
| PubChem Upload | Substance and BioAssay submission system | |
| BioAssay FTP | FTP for all PubChem BioAssay records and related information | |
| BioAssay Data Standard | XML Data specification for PubChem BioAssay data model | |
| BioAssay Service Home | BioAssay Service Home | |
| BioAssay Classification | Browse BioAssay classification tree | |
| Bioactivity Data Tool | Retrieve a full data table from a single bioassay record | |
| | Retrieve and download cross-assay bioactivity data for a single substance sample (SID), chemical structure (CID), protein target (GI, UniProt or GenBank accession), gene target (GeneID) or publication (PMID) | |
| Bioassay Download Tool | A flexible download interface | |
| PubChem PUG/REST/SOAP | Programmatic tools and REST API for data retrieval | |
| PubChem Widget Help | PubChem widgets enable you to display PubChem data in your pages | |
| Structure-Activity Analysis (SAR) | Analyze and visualize Structure-Activity relationship with clustering tools and a heatmap-style display | |
| Dose-response Curve Tool | Analyze bioassay test results and visualize dose-response curve | |
| Scatter Plot/Histogram | Analyze bioassay test results with histogram or scatter plot | |
| Related BioAssays | Summarize bioassay relationships by: same assay project, overlap of active compounds, overlap of active genes, target sequence similarity, deposited annotation, same publication and gene interaction | |
| BioActivity Summary - Compound-centric | Summarize and analyze bioactivity data for a set of records, presented from the compound point of view | |
| BioActivity Summary - Assay-centric | Summarize and analyze bioactivity data for a set of records, presented from the assay point of view | |
| BioActivity Summary - Target-centric | Summarize and analyze bioactivity data for a set of records, presented from the target point of view |
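
The "PubChem PUG/REST/SOAP" row above refers to programmatic retrieval. As a minimal sketch, the snippet below builds a PUG REST URL for a single BioAssay record. It assumes the publicly documented `/rest/pug/assay/aid/<AID>/<operation>/<output>` URL pattern; the function names `assay_url` and `fetch` are hypothetical helpers introduced here for illustration, not part of any PubChem library.

```python
from urllib.parse import quote
from urllib.request import urlopen

# Assumption: base path of the PUG REST service as publicly documented.
PUG_REST_BASE = "https://pubchem.ncbi.nlm.nih.gov/rest/pug"


def assay_url(aid: int, operation: str = "description", output: str = "JSON") -> str:
    """Build a PUG REST URL for one BioAssay record, identified by its AID."""
    if aid <= 0:
        raise ValueError("AID must be a positive integer")
    return f"{PUG_REST_BASE}/assay/aid/{aid}/{quote(operation)}/{quote(output)}"


def fetch(url: str, timeout: float = 30.0) -> bytes:
    """Download the raw response body (requires network access)."""
    with urlopen(url, timeout=timeout) as response:
        return response.read()


if __name__ == "__main__":
    # Only the URL is printed here; call fetch(url) to actually retrieve
    # the record when network access is available.
    print(assay_url(1510))
```

Keeping URL construction separate from the network call makes the URL logic easy to check offline; the response format and rate limits should be confirmed against the PUG REST documentation before relying on them.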
PubChem BioAssay statistics
| | Total | Chemical assays, 2004–2013 | Chemical assays, 2014–present | RNAi assays, 2004–2013 | RNAi assays, 2014–present |
|---|---|---|---|---|---|
| assay records (AID) | 1 218 687 | 737 994 | 480 616 | 57 | 48 |
| substance samples (SID) | 3 576 066 | 2 755 032 | 1 396 693 | 213 030 | 293 499 |
| chemical structures (CID) | 2 283 533 | 1 956 998 | 986 237 | - | - |
| bioactivity outcomes | 231 303 607 | 222 198 148 | 8 764 075 | 701 993 | 419 081 |
| data points | 1 514 223 504 | 1 403 289 248 | 100 451 032 | 9 404 999 | 7 473 465 |
| species | 3543 | 2730 | 1895 | 6 | 2 |
| protein targets | 10 636 | 7450 | 6972 | - | - |
| protein targets (human) | 4771 | 3378 | 3495 | - | - |
| gene targets | 55 714 | - | - | 38 694 | 52 986 |
| gene targets (human) | 24 888 | - | - | 24 460 | 22 656 |
| gene targets (phenotype) | 15 866 | - | - | 12 816 | 4524 |
Figure 1. A bioassay record (AID 1510, https://pubchem.ncbi.nlm.nih.gov/bioassay/1510). (A) The overview of the record page. The table of contents provides quick navigation to a list of sections shown on the page. Each section has an anchor and its URL can be used for widget embedding. (B) Selected sections: Data Table, Same-Project BioAssays and BioAssay Annotations.
Figure 2. The PubChem BioAssay Classification Tree (https://pubchem.ncbi.nlm.nih.gov/classification/#hid=80). The tree is shown as a hierarchical display that can be navigated and explored by clicking the triangle icon to expand sub-trees. Clicking the number on a node (the count of BioAssay records with that annotation) leads to an Entrez report for the associated assay records.
Figure 3. Collective and cross-assay bioactivity data for a specific gene target in the PubChem BioAssay database (https://pubchem.ncbi.nlm.nih.gov/assay/bioactivity.html?geneid=1956). The filters at the top of the web page may be used to drill down to a subset of interest. For example, using the filters provided under the ‘Substance Types’ section, one may retrieve the RNAi data by clicking on ‘RNAi (34)’. The ‘Primary drug (385)’ filter in that section retrieves the bioactivity data for drugs that were developed specifically to target the query protein/gene, while the ‘Drug (1646)’ filter retrieves bioactivity data for any drug tested in the assays. The latter filter allows one to identify drug molecules that show experimental evidence (based on PubChem BioAssay data) of binding or affecting the query protein/gene target, so that their potential for drug repositioning against that target may be explored further. The drug and target information supporting these two filters was obtained from annotations in DrugBank.
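
The page URLs quoted in the captions for Figures 1 and 3 follow a simple pattern, which the sketch below reproduces. It relies only on the two URL shapes shown in those captions (`/bioassay/<AID>` and `/assay/bioactivity.html?geneid=<GeneID>`); the function names are hypothetical, and query parameters for other identifier types are not assumed here.

```python
from urllib.parse import urlencode

PUBCHEM = "https://pubchem.ncbi.nlm.nih.gov"


def record_page_url(aid: int) -> str:
    """URL of the BioAssay record page for a given AID (pattern from Figure 1)."""
    return f"{PUBCHEM}/bioassay/{aid}"


def gene_bioactivity_url(gene_id: int) -> str:
    """URL of the cross-assay bioactivity page for a gene target (pattern from Figure 3)."""
    return f"{PUBCHEM}/assay/bioactivity.html?{urlencode({'geneid': gene_id})}"


if __name__ == "__main__":
    print(record_page_url(1510))
    print(gene_bioactivity_url(1956))
```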