
PubChem BioAssay: 2017 update.

Yanli Wang, Stephen H Bryant, Tiejun Cheng, Jiyao Wang, Asta Gindulyte, Benjamin A Shoemaker, Paul A Thiessen, Siqian He, Jian Zhang.

Abstract

PubChem's BioAssay database (https://pubchem.ncbi.nlm.nih.gov) has served as a public repository for small-molecule and RNAi screening data since 2004, providing the community with open access to its data content. PubChem accepts data submissions from researchers worldwide in academia, industry and government agencies. PubChem also collaborates with other chemical biology database stakeholders through data exchange. After more than a decade of development, it has become an important information resource supporting drug discovery and chemical biology research. To facilitate data discovery, PubChem is integrated with all other databases at NCBI. In this work, we provide an update on the PubChem BioAssay database, describing several recent developments, including added sources of research data, a redesigned BioAssay record page, a new BioAssay classification browser and new features in the Upload system that facilitate data sharing. Published by Oxford University Press on behalf of Nucleic Acids Research 2016. This work is written by (a) US Government employee(s) and is in the public domain in the US.

Year:  2016        PMID: 27899599      PMCID: PMC5210581          DOI: 10.1093/nar/gkw1118

Source DB:  PubMed          Journal:  Nucleic Acids Res        ISSN: 0305-1048            Impact factor:   16.971


INTRODUCTION

PubChem BioAssay (1–4) is an open access database hosted by the National Center for Biotechnology Information (NCBI), National Library of Medicine (NLM), National Institutes of Health (NIH). It was launched in 2004 as a public repository for information generated from chemogenomic, medicinal chemistry and functional genomics research. All data in the database are freely accessible to the public for searching and download. Recent reviews of the community's use of the PubChem resource (5–7) highlighted that the collection of bioactivity and toxicity data in PubChem BioAssay has greatly supported research in fields such as medicinal chemistry, drug discovery, pharmaceutical genomics and informatics. Small-molecule data in PubChem BioAssay are cross-linked to chemical structures via the samples referenced in the assay. The PubChem BioAssay database is also linked to other biomedical and literature databases hosted at NCBI, such as PubMed, Protein, Gene and Taxonomy. Metadata in the database are integrated with NCBI's search engine, Entrez, making the PubChem BioAssay database accessible by interactive keyword search through the web interface and by programmatic retrieval via E-Utilities. Assay data can also be retrieved and analyzed via web-based and programmatic tools provided by PubChem. An updated list of the services for accessing, searching, downloading and analyzing PubChem BioAssay data, with their URLs, is provided in Table 1. Most of the web-based services can also be accessed at https://pubchem.ncbi.nlm.nih.gov/assay/.
Table 1.

A list of PubChem BioAssay services

| Service | Description | URL example |
| --- | --- | --- |
| BioAssay Record Page | Access and download a bioassay record | https://pubchem.ncbi.nlm.nih.gov/bioassay/805 |
| BioAssay Search | Search BioAssay Database with Entrez | https://www.ncbi.nlm.nih.gov/pcassay/ |
| BioAssay Search, Advanced page | An interface for searching multiple search fields | https://www.ncbi.nlm.nih.gov/pcassay/limits |
| | An interface for reviewing search history and refining search results with Boolean operation | https://www.ncbi.nlm.nih.gov/pcassay/advanced |
| PubChem Upload | Substance and BioAssay submission system | https://pubchem.ncbi.nlm.nih.gov/upload/ |
| BioAssay FTP | FTP for all PubChem BioAssay records and related information | ftp://ftp.ncbi.nlm.nih.gov/pubchem/Bioassay/ |
| BioAssay Data Standard | XML Data specification for PubChem BioAssay data model | ftp://ftp.ncbi.nlm.nih.gov/pubchem/data_spec/ |
| BioAssay Service Home | BioAssay Service Home | https://pubchem.ncbi.nlm.nih.gov/assay/ |
| BioAssay Classification | Browse BioAssay classification tree | https://pubchem.ncbi.nlm.nih.gov/assay/assay.cgi?p=classification |
| Bioactivity Data Tool | Retrieve a full data table from a single bioassay record | https://pubchem.ncbi.nlm.nih.gov/assay/bioactivity.html?aid=1811 |
| | Retrieve and download cross-assay bioactivity data for a single substance sample (SID), chemical structure (CID), protein target (GI, UniProt or GenBank accession), gene target (GeneID) or publication (PMID) | https://pubchem.ncbi.nlm.nih.gov/assay/bioactivity.html?sid=103164874 |
| | | https://pubchem.ncbi.nlm.nih.gov/assay/bioactivity.html?cid=2244 |
| | | https://pubchem.ncbi.nlm.nih.gov/assay/bioactivity.html?gi=29725609 |
| | | https://pubchem.ncbi.nlm.nih.gov/assay/bioactivity.html?uniprot=P00533 |
| | | https://pubchem.ncbi.nlm.nih.gov/assay/bioactivity.html?ncbiacc=NP_005219 |
| | | https://pubchem.ncbi.nlm.nih.gov/assay/bioactivity.html?geneid=1956 |
| | | https://pubchem.ncbi.nlm.nih.gov/assay/bioactivity.html?pmid=25728019 |
| Bioassay Download Tool | A flexible download interface | https://pubchem.ncbi.nlm.nih.gov/assay/assaydownload.cgi |
| PubChem PUG/REST/SOAP | Programmatic tool and REST API for data retrieval | https://pubchem.ncbi.nlm.nih.gov/pug_rest/PUG_REST.html |
| | | https://pubchem.ncbi.nlm.nih.gov/pug/pughelp.html |
| PubChem Widget Help | PubChem widgets enable you to display PubChem data in your pages | https://pubchem.ncbi.nlm.nih.gov/widget/docs/widget_help.html |
| Structure-Activity Analysis (SAR) | Analyze and visualize Structure-Activity relationship with clustering tools and a heatmap-style display | https://pubchem.ncbi.nlm.nih.gov/assay/assay.cgi?p=heat |
| Dose-response Curve Tool | Analyze bioassay test results and visualize dose-response curve | https://pubchem.ncbi.nlm.nih.gov/assay/plot.cgi?plottype=1 |
| Scatter Plot/Histogram | Analyze bioassay test results with histogram or scatter plot | https://pubchem.ncbi.nlm.nih.gov/assay/plot.cgi?plottype=2 |
| Related BioAssays | Summarize bioassay relationship by: same assay project, overlap of active compounds, overlap of active gene, target sequence similarity, deposited annotation, same publication and gene interaction | https://pubchem.ncbi.nlm.nih.gov/bioassay/1510#section=Same-Project-BioAssays |
| | | https://pubchem.ncbi.nlm.nih.gov/bioassay/1510#section=Related-BioAssays |
| BioActivity Summary - Compound-centric | Summarize and analyze bioactivity data for a set of records, presented from the compound point of view | https://pubchem.ncbi.nlm.nih.gov/assay/bioactivity.cgi?tab=1 |
| BioActivity Summary - Assay-centric | Summarize and analyze bioactivity data for a set of records, presented from the assay point of view | https://pubchem.ncbi.nlm.nih.gov/assay/bioactivity.cgi?tab=2 |
| BioActivity Summary - Target-centric | Summarize and analyze bioactivity data for a set of records, presented from the target point of view | https://pubchem.ncbi.nlm.nih.gov/assay/bioactivity.cgi?tab=3 |
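The Bioactivity Data Tool examples in Table 1 all share one query-string pattern, with the identifier type as the parameter name. The following is a minimal Python sketch of that pattern; the function name `bioactivity_url` and the `SUPPORTED_KEYS` tuple are illustrative names introduced here, and the set of accepted keys is inferred only from the URL examples in the table, not from a formal specification.

```python
from urllib.parse import urlencode

BIOACTIVITY_BASE = "https://pubchem.ncbi.nlm.nih.gov/assay/bioactivity.html"

# Query keys inferred from the URL examples in Table 1 (an assumption,
# not an exhaustive or official list).
SUPPORTED_KEYS = ("aid", "sid", "cid", "gi", "uniprot", "ncbiacc", "geneid", "pmid")


def bioactivity_url(key, value):
    """Build a Bioactivity Data Tool URL for a single identifier."""
    if key not in SUPPORTED_KEYS:
        raise ValueError(f"unsupported identifier type: {key!r}")
    return f"{BIOACTIVITY_BASE}?{urlencode({key: value})}"


if __name__ == "__main__":
    # Identifiers reused from the Table 1 examples.
    for key, value in [("cid", 2244), ("uniprot", "P00533"), ("geneid", 1956)]:
        print(bioactivity_url(key, value))
```

The helper only assembles URLs; it does not contact PubChem, so the resulting links still need to be opened in a browser or fetched with an HTTP client.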
With continuous development in support of open data over the past 12 years, the PubChem BioAssay database is committed to meeting the community's increasing need for information archival, retrieval and mining. PubChem BioAssay remains a leading repository of research data pertaining to drug discovery by: (i) supporting broad types of bioactivity information with an optimized and flexible data model; (ii) steadily enhancing database infrastructure and scalability; (iii) utilizing new technology for data archival, viewing, indexing, search and download; (iv) enhancing the data upload system; and (v) integrating with other biomedical resources. In this work, we provide an update on several aspects of the information resource, including the growth of data content and data sources, database infrastructure consolidation, the redesigned and widgetized BioAssay record page, the new BioAssay classification browser and added features for previously provided web services. Enhanced management of assay data embargo, release and sharing of on-hold data by the PubChem Upload system is also described.

BioAssay DATA

The PubChem BioAssay database currently contains over one million records holding 230 000 000 bioactivity outcomes deposited by over 80 organizations (data sources) across the world. The data content for the periods 2004–2013 and 2014–2016 is given in Table 2. High-throughput screening (HTS) data were provided by laboratories and screening centers from academic institutes, universities, government organizations and pharmaceutical companies, including those that participated in the HTS campaigns for the development of chemical probes funded by the NIH Molecular Libraries Program (8,9). The database also contains literature-based data, including both publication-driven depositions submitted by journal authors and those contributed by PDBbind (10), IUPHAR (11), BindingDB (12), ChEMBL (13) and other literature curation projects. Additionally, third-party annotations of the assay data in PubChem that use community-adopted vocabularies and ontologies (13–15), such as assay format, assay type, detection method and cell line, are collected and linked to the pertinent data sets. These annotations are shown in the BioAssay record page and used for searching and filtering assay data in Entrez and the data analysis tools.
Table 2.

PubChem BioAssay statistics

| | Total | Chemical assays, 2004–2013 | Chemical assays, 2014–present | RNAi assays, 2004–2013 | RNAi assays, 2014–present |
| --- | --- | --- | --- | --- | --- |
| assay records (AID) | 1 218 687 | 737 994 | 480 616 | 57 | 48 |
| substance samples (SID) | 3 576 066 | 2 755 032 | 1 396 693 | 213 030 | 293 499 |
| chemical structures (CID) | 2 283 533 | 1 956 998 | 986 237 | - | - |
| bioactivity outcomes | 231 303 607 | 222 198 148 | 8 764 075 | 701 993 | 419 081 |
| data points | 1 514 223 504 | 1 403 289 248 | 100 451 032 | 9 404 999 | 7 473 465 |
| species | 3543 | 2730 | 1895 | 6 | 2 |
| protein targets | 10 636 | 7450 | 6972 | - | - |
| protein targets (human) | 4771 | 3378 | 3495 | - | - |
| gene targets | 55 714 | - | - | 38 694 | 52 986 |
| gene targets (human) | 24 888 | - | - | 24 460 | 22 656 |
| gene targets (phenotype) | 15 866 | - | - | 12 816 | 4524 |
Nearly one-third of the assay data sources were added in the past three years. The majority of these data sources deposited RNAi data to PubChem along with manuscript submissions to journals that support open access for functional genomics research. Nature Cell Biology led this effort by recommending RNAi data deposition to a public repository, and it recently also started to call for small-molecule data sharing. Other open access journals, such as PLoS One, have lately joined in recommending data sharing via public repositories, which has brought the deposition of several small-molecule data sets to PubChem (16,17). Assay data contributed by these sources are linked to the respective publications indexed in PubMed (16–36), allowing PubChem users to access the articles for additional information; conversely, PubMed users gain access to the research data in the BioAssay archive in machine-readable formats. All BioAssay depositors and the associated information, such as affiliations and summaries of data submissions, may be viewed at the PubChem Data Source page at https://pubchem.ncbi.nlm.nih.gov/sources/, where data sources are grouped by geographic location and various other categories. One may access the submissions of a specific depositor by following the Substance or BioAssay record counts presented under the ‘Data Counts by Type’ field.
The significant growth of the BioAssay database requires a robust and scalable database system. A set of relational databases and tables is set up to: (i) archive bioassay submissions, track updates and provide version control; (ii) maintain data embargo and release status; (iii) record and derive links and relationships among assays and other biomedical information; (iv) store third-party annotations and link them to the respective data sets; (v) provide search indexes; (vi) support data retrieval for web display, the REST API and data analysis tools; (vii) facilitate daily updates of the BioAssay FTP.
Considerable effort has been invested in enhancing the database infrastructure. Additional mechanisms were implemented for tracking information about BioAssay protein and gene targets. Mappings between NCBI protein GI numbers, GenBank accessions and UniProt IDs were created to facilitate data retrieval and integration. These efforts were made to facilitate: (i) bioassay submission; (ii) integration of deposited small-molecule and RNAi data; (iii) integration of biological annotations for proteins and genes from public biomedical databases; (iv) access to the bioactivity data in PubChem using non-NCBI sequence identifiers of the assay target; (v) development of new PubChem services to enhance BioAssay target search; and (vi) improved discoverability of the biological data in the database.

NEW WEB FEATURES AND SERVICES

PubChem BioAssay provides web-based and programmatic tools for data search, access, analysis and download. Several recently developed web services are described here for improving assay data search and navigation by classifying deposited metadata and third-party annotations.

BioAssay record page

PubChem provides full access to each deposited BioAssay record. The PubChem BioAssay Record page, which replaces the legacy Summary page, has been revamped to streamline data flow, support reuse of data and services and unify web page presentation across the PubChem resource. Taking advantage of new web technology, the data-driven interface was designed and optimized for both touch- and mouse-based devices, following an approach similar to that of the recent revamp of the PubChem Compound Summary page and the Substance Record page (37). The widgetized web page consists of multiple sections and automatically adapts to the available screen size through a responsive design, making it easy to navigate page content and review information on desktops, tablets and mobile phones. Furthermore, the new design makes it possible to embed any section or subsection of the page as a widget in another web page without a separate codebase, eliminating the maintenance burden on third parties, which greatly benefits non-PubChem resources that are interested in integrating PubChem BioAssay data. Information and instructions on embedding PubChem widgets are available at https://pubchem.ncbi.nlm.nih.gov/widget/docs/widget_help.html. A deposited BioAssay record can be accessed by its AID number, the primary accession. An example of a data set reporting an assay for identifying antagonists of the Sphingosine 1-Phosphate Receptor 4 (AID: 1510; https://pubchem.ncbi.nlm.nih.gov/bioassay/1510) is shown in Figure 1. The BioAssay Record page provides primary access, with version control, to the initial submission and all subsequent updates of an assay. It also provides third-party annotations and links to tools supporting data analysis and download. The ‘Download’ button at the top allows one to download depositor-provided metadata and assay results, as well as chemical structures for tested small-molecule samples.
The table of contents can be expanded for quick navigation to each individual section. The full data set is retrieved by default in the Data Table section. Additionally, the assay data table may be partitioned according to activity outcome (e.g. active, inactive, or the subset with micromolar or nanomolar activity), allowing users to quickly filter, select and download the results of interest. To facilitate hit evaluation, data comparison and target identification, the structure image of a small-molecule sample links to the bioactivity analysis tool, which shows all available cross-assay data for the compound through its CID (the PubChem Compound accession for a unique chemical structure). Similarly, for an RNAi assay (e.g. https://pubchem.ncbi.nlm.nih.gov/bioassay/1904), the gene ID in the gene target column links to a bioactivity analysis tool that summarizes all RNAi data as well as small molecules targeting the gene, allowing one to discover small-molecule tools, compare results with other research and identify biological functions suggested by other RNAi screens. The display of depositor-submitted information is followed by sections presenting related BioAssay data sets from the same or multiple projects. The presentation of depositor-provided related assay data sets helps one to track the development of an assay project and, when combined with the target- and publication-based related assay data sets derived by PubChem, facilitates data validation, interpretation and cross-assay comparison. An excerpt at the top of the record page notes the availability of such related data, highlighting the importance of data integration for assay data interpretation and reuse. Third-party annotations are presented in the last section of the page, together with an indication of their sources.
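The partitioning of a data table by activity outcome can be reproduced offline on a downloaded CSV file. The sketch below is illustrative only: it assumes the CSV carries a `PUBCHEM_ACTIVITY_OUTCOME` column with text labels such as `Active` and `Inactive`, and the rows in `SAMPLE_CSV` are invented placeholders rather than real assay results.

```python
import csv
import io
from collections import defaultdict

# Invented rows for illustration; not real PubChem data. The column names
# are an assumption about the layout of a downloaded assay CSV.
SAMPLE_CSV = """PUBCHEM_SID,PUBCHEM_CID,PUBCHEM_ACTIVITY_OUTCOME,PUBCHEM_ACTIVITY_SCORE
1001,501,Active,85
1002,502,Inactive,3
1003,503,Active,62
1004,504,Inconclusive,20
"""


def partition_by_outcome(csv_text):
    """Group data-table rows by their activity outcome label."""
    groups = defaultdict(list)
    for row in csv.DictReader(io.StringIO(csv_text)):
        groups[row["PUBCHEM_ACTIVITY_OUTCOME"]].append(row)
    return dict(groups)


if __name__ == "__main__":
    for outcome, rows in partition_by_outcome(SAMPLE_CSV).items():
        print(outcome, [row["PUBCHEM_SID"] for row in rows])
```

For a real file, the text passed to `partition_by_outcome` would come from reading the downloaded CSV; any header or descriptor rows that precede the data would need to be skipped first.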
Figure 1.

A bioassay record (AID 1510, https://pubchem.ncbi.nlm.nih.gov/bioassay/1510). (A) The overview of the record page. The table of contents provides quick navigation to a list of sections shown on the page. Each section has an anchor and its URL can be used for widget embedding. (B) Selected sections: Data Table, Same-Project BioAssays and BioAssay Annotations.


BioAssay classification tool

A hierarchical tree view has been developed on the software framework of the PubChem Classification Browser, providing an additional way to browse, search and access the BioAssay data. The tool (available at https://pubchem.ncbi.nlm.nih.gov/assay/assay.cgi?p=classification and https://pubchem.ncbi.nlm.nih.gov/classification/#hid=80) offers an overview of the assay data sets by attributes of common interest, bringing together assay metadata, third-party annotations, classifications of assay targets, taxonomy and associated publications. It organizes the BioAssay records into nodes in a hierarchy representing various classifications and annotations. This tool allows one to browse the distribution of BioAssay data among nodes in the hierarchy of interest, aggregate information for a particular sub-class and perform specific searches by data source, assay type, etc. (Figure 2). Clicking a count shown on a node in the hierarchical display leads to the assay data entries indexed in Entrez. Individual assay lists sent from the classification browser can be combined using Entrez's query refining functionality, providing a powerful way to drill down to the data sets that satisfy multiple search criteria. As examples of usage, users can drill down to IC50 or Kd data under ‘Activity Types/Potency’, or to cell-based or biochemical data under ‘Assay Types’. Users can also browse the over 7000 data sets from HTS projects under ‘HTS Projects’, including both genome-wide RNAi screens and the chemical probe development projects funded by the NIH Molecular Libraries Program.
Figure 2.

The PubChem BioAssay Classification Tree (https://pubchem.ncbi.nlm.nih.gov/classification/#hid=80). A hierarchical display is provided, which can be navigated and explored by clicking the triangle icon to expand the sub-trees. A click on the number on a node (showing the count of BioAssay records with that annotation) leads to a report in Entrez for the associated assay records.

The ‘Publications’ branch allows one to browse assay data sets by journal name and publication year of the associated primary citation. More importantly, the tool allows one to search data in BioAssay by research subject, based on the classification of the PubMed citations with the controlled vocabulary of biomedical terms provided in MeSH. As an example shown in Figure 2, one can follow the branch ‘Chemical and Drugs Category’ under the ‘MeSH Tree’ to retrieve the assay data entries reporting bioactivity for angiotensin receptor antagonists, which link to the abstracts of the publications in PubMed. For the ‘Targets’ branch in the tool, four types of protein ontologies and classifications have been incorporated for assay targets: ChEMBL, GO, IUPHAR and KEGG. The taxonomy ID, one of the key metadata fields in the BioAssay data model, enables the organization of the BioAssay data sets by biological organism in a hierarchical display based on the taxonomy classifications maintained at NCBI; the ‘Taxonomy’ tree in the classification tool is the subset of the NCBI Taxonomy tree whose nodes have assay data.
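Combining the assay lists obtained from several classification nodes amounts to Boolean set operations on AIDs. Entrez performs this server-side through its query refining functionality; the sketch below only mirrors the logic locally with Python sets. The function names and the AID values are hypothetical and chosen for illustration.

```python
def intersect_all(aid_lists):
    """Return the AIDs present in every list (Boolean AND)."""
    sets = [set(aids) for aids in aid_lists]
    if not sets:
        return set()
    return set.intersection(*sets)


def union_all(aid_lists):
    """Return the AIDs present in at least one list (Boolean OR)."""
    return set().union(*aid_lists)


if __name__ == "__main__":
    # Hypothetical AID lists, as if exported from three classification nodes.
    ic50_assays = [101, 102, 103, 104]
    cell_based_assays = [102, 104, 105]
    one_data_source = [104, 106, 102]

    both = intersect_all([ic50_assays, cell_based_assays, one_data_source])
    print(sorted(both))

    # Boolean NOT: drop assays from an unwanted list.
    print(sorted(both - {104}))
```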

New features for bioactivity analysis tool

PubChem BioAssay allows several ways to reference protein and gene targets. An assay target is often linked to many small-molecule samples tested in hundreds of assays, making it important to combine the information in order to identify chemical tools with a proper selectivity assessment. Bioactivity analysis tools were previously developed for dynamically retrieving and aggregating cross-assay bioactivity data for a protein or gene target. With the growth of BioAssay submissions, it is critical to provide selection functionality to slice and dice the information. In addition, for therapeutic targets, it is essential to indicate the drugs that were primarily developed for treating disease via the query target. Moreover, it is important to highlight all drug molecules tested against the query target, to facilitate the identification of alternative therapeutic effects of these drugs for drug repositioning. Several new selection functions for filtering bioactivity data were added to these tools, and existing ones were upgraded, by incorporating recently obtained assay annotations. They enable the retrieval of assay data from a particular detection method, the selection of toxicity results and the subsetting of bioactivity data to drug molecules (Figure 3, https://pubchem.ncbi.nlm.nih.gov/assay/bioactivity.html?geneid=1956). As another feature, a ‘Selectivity’ column has been added to assist target selectivity assessment for a tested chemical sample; it reports the total number of unique targets that the sample was tested against and the number of targets against which it was active. In addition to gene IDs, this bioactivity analysis tool also accepts a GenBank accession or UniProt ID for specifying the query target, an SID or CID for specifying the tested sample (RNAi reagent or small molecule), and a PMID for specifying the publication that contains assay data. The service is also optimized to support programmatic download of bioactivity data with filtering functionality.
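The two counts behind the ‘Selectivity’ column (unique targets tested and unique targets with activity) can be illustrated with a short Python sketch. This is not PubChem's implementation: the record layout, the rule that a target counts as active if any assay against it reports an active outcome, and all sample values are assumptions made for illustration.

```python
from collections import defaultdict


def selectivity(records):
    """Map each sample to (number of active targets, number of tested targets).

    records: iterable of (sample_id, target_id, outcome) tuples.
    Assumption: a target counts as active for a sample if at least one
    assay against that target reports the outcome "active".
    """
    tested = defaultdict(set)
    active = defaultdict(set)
    for sample_id, target_id, outcome in records:
        tested[sample_id].add(target_id)
        if outcome == "active":
            active[sample_id].add(target_id)
    return {sid: (len(active[sid]), len(tested[sid])) for sid in tested}


if __name__ == "__main__":
    # Invented records; sample and target identifiers are placeholders.
    records = [
        (11, "targetA", "active"),
        (11, "targetA", "inactive"),  # same target tested in a second assay
        (11, "targetB", "inactive"),
        (11, "targetC", "active"),
        (22, "targetA", "inactive"),
    ]
    for sample_id, (n_active, n_tested) in selectivity(records).items():
        print(f"sample {sample_id}: active on {n_active} of {n_tested} targets")
```

Using sets per sample keeps repeated tests of the same target from inflating either count.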
Figure 3.

Collective and cross-assay bioactivity data for a specific gene target in the PubChem BioAssay database (https://pubchem.ncbi.nlm.nih.gov/assay/bioactivity.html?geneid=1956). The filters at the top of the web page may be used to drill down to a subset of interest. For example, using the filters provided under the ‘Substance Types’ section, one may retrieve the RNAi data by clicking on ‘RNAi (34)’. The ‘Primary drug (385)’ filter in the section allows the retrieval of the bioactivity data for drugs that were developed to specifically target the query protein/gene, while the ‘Drug (1646)’ filter retrieves bioactivity data for any drugs that were tested in the assays. The latter filter allows one to identify drug molecules that show experimental evidence (based on PubChem BioAssay data) of binding or affecting the query protein/gene target, so that their potential for drug repositioning (against the query protein/gene target) may be further explored. Drug and target information supporting these two filters was obtained from annotations in DrugBank.


PUBLIC ACCESS, SEARCH, DOWNLOAD

PubChem provides multiple means to access, search and download the BioAssay data, including the BioAssay Record page and classification tool described above. PubChem BioAssay is indexed in Entrez (https://www.ncbi.nlm.nih.gov/pcassay/) under numerous fields to support keyword search. It is cross-linked with several other biomedical databases, for example with PubMed via the citation provided in the assay submission, and with NCBI Gene via the assay target specification. As a result, users of genomic information in Entrez may retrieve biological test results in PubChem relating to a gene target, and PubMed users can go to BioAssay to retrieve the data discussed in a publication. BioAssay data can be downloaded using: (i) download functions in the BioAssay Record page supporting ASN, XML, JSON and CSV formats; (ii) a web-based service for bulk download at https://pubchem.ncbi.nlm.nih.gov/assay/assaydownload.cgi taking an AID list and, optionally, an SID (PubChem Substance accession) list; (iii) programmatic tools provided by NCBI's E-Utilities (https://www.ncbi.nlm.nih.gov/books/NBK25497/), PubChem PUG/SOAP (https://pubchem.ncbi.nlm.nih.gov/pug/pughelp.html) and PUG/REST (https://pubchem.ncbi.nlm.nih.gov/pug_rest/PUG_REST.html) (38), which provide great flexibility for retrieving AID-specific metadata and assay results, database links and bioactivity data across multiple assays for a compound or target; (iv) the daily updated PubChem BioAssay FTP at ftp://ftp.ncbi.nlm.nih.gov/pubchem/Bioassay/, primarily providing open access to all deposited BioAssay records in ASN, XML, JSON and CSV formats.
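As a hedged illustration of the programmatic routes in item (iii), the Python sketch below assembles request URLs for an Entrez ESearch query against the `pcassay` database and for a few PUG REST assay operations. The endpoint bases and URL paths follow the public E-Utilities and PUG REST documentation as best understood here and are not taken from this article; the function names are illustrative. The code only builds URLs and performs no network requests.

```python
from urllib.parse import urlencode

# Endpoint bases are assumptions based on the public E-Utilities and
# PUG REST documentation; they do not appear in the text above.
EUTILS_ESEARCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"
PUG_REST = "https://pubchem.ncbi.nlm.nih.gov/rest/pug"


def esearch_pcassay_url(term, retmax=20):
    """URL for a keyword search of BioAssay records through Entrez."""
    params = {"db": "pcassay", "term": term, "retmax": retmax, "retmode": "json"}
    return f"{EUTILS_ESEARCH}?{urlencode(params)}"


def assay_description_url(aid):
    """URL for the depositor-provided description of one assay (JSON)."""
    return f"{PUG_REST}/assay/aid/{int(aid)}/description/JSON"


def assay_data_csv_url(aid):
    """URL for the data table of one assay (CSV)."""
    return f"{PUG_REST}/assay/aid/{int(aid)}/CSV"


def aids_for_gene_url(gene_id):
    """URL listing the AIDs whose target is the given NCBI Gene ID."""
    return f"{PUG_REST}/assay/target/geneid/{int(gene_id)}/aids/TXT"


if __name__ == "__main__":
    print(esearch_pcassay_url("sphingosine receptor antagonist", retmax=5))
    print(assay_description_url(1510))
    print(assay_data_csv_url(1510))
    print(aids_for_gene_url(1956))
```

Any HTTP client can then fetch these URLs; NCBI's usage guidelines on request rates apply to scripted access.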
New information has been added under the directory ‘Extras’ on the BioAssay FTP, including: (i) the file ‘Aid2Annotation’, providing third-party annotations in a tag/value structure; (ii) the file ‘Aid2GiGeneidAccessionUniprot’, containing AIDs and the mapping of target identifiers between protein GI, GenBank accession, UniProt ID and Gene ID; (iii) a subfolder ‘VendorCatalogs’, containing files for several RNAi product vendors with the mapping between vendor catalog ID and the PubChem SID assigned to the RNAi sample.
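A mapping file such as ‘Aid2GiGeneidAccessionUniprot’ can be turned into lookup tables from a target identifier to the assays that reference it. The Python sketch below assumes a tab-delimited layout with one header line and the column order AID, GI, GeneID, accession, UniProt ID; that layout is a guess based on the file name and should be checked against the actual file. All values in `SAMPLE_TSV` are invented placeholders.

```python
import csv
import io
from collections import defaultdict

# Invented rows; the column order and header line are assumptions.
SAMPLE_TSV = (
    "AID\tGI\tGeneID\tAccession\tUniProt\n"
    "1\t1001\t501\tACC_A\tUNI_A\n"
    "2\t1001\t501\tACC_A\tUNI_A\n"
    "3\t1002\t502\tACC_B\t\n"
)


def index_aids_by_target(tsv_text):
    """Return (AIDs by Gene ID, AIDs by UniProt ID) from the mapping text."""
    by_gene = defaultdict(set)
    by_uniprot = defaultdict(set)
    reader = csv.reader(io.StringIO(tsv_text), delimiter="\t")
    next(reader, None)  # skip the assumed header line
    for row in reader:
        if len(row) < 5:
            continue  # ignore malformed or truncated lines
        aid, _gi, gene_id, _accession, uniprot = row[:5]
        if gene_id:
            by_gene[gene_id].add(int(aid))
        if uniprot:
            by_uniprot[uniprot].add(int(aid))
    return dict(by_gene), dict(by_uniprot)


if __name__ == "__main__":
    by_gene, by_uniprot = index_aids_by_target(SAMPLE_TSV)
    print(by_gene)
    print(by_uniprot)
```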

PubChem UPLOAD FOR BioAssay SUBMISSION

For a public repository accepting submissions of complex research data, a robust and user-friendly deposition system plays a key role. The user interface of the PubChem Upload system (https://pubchem.ncbi.nlm.nih.gov/upload/) has been continuously optimized. Several features have now been added for managing data embargo, release and sharing of on-hold data among depositor-maintained user groups. PubChem allows data embargo to give depositors the time needed to get research published or to complete a patent application. Previously, only Upload account holders had full access to a BioAssay record under embargo. A newly developed mechanism now grants full access to collaborators, journal reviewers or editors via a specifically requested URL. Upload now supports bulk release of embargoed BioAssay records by taking a list of BioAssay accessions (AIDs). In addition, it streamlines the release of BioAssay records and related Substance records: once depositors request that a list of AIDs be released, the associated on-hold Substance records are looked up and released automatically by the Upload system. A FAQ section (https://pubchem.ncbi.nlm.nih.gov/upload/docs/upload_faq.html) has been added to the Upload help page (https://pubchem.ncbi.nlm.nih.gov/upload/docs/upload_help.html), providing quick tips for common questions and update operations.
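The cascade from a bulk AID release to the associated on-hold Substance records can be pictured with a toy model. The Python sketch below is purely illustrative and does not describe how the Upload system is implemented; the dictionaries, status labels and the function `release_assays` are hypothetical.

```python
def release_assays(aids, assay_status, substance_status, assay_substances):
    """Release the listed AIDs and any on-hold SIDs they reference.

    The status dictionaries are updated in place. Returns the sorted list
    of SIDs that were released as a side effect.
    """
    released_sids = set()
    for aid in aids:
        if aid not in assay_status:
            raise KeyError(f"unknown AID: {aid}")
        assay_status[aid] = "released"
        for sid in assay_substances.get(aid, ()):
            if substance_status.get(sid) == "on-hold":
                substance_status[sid] = "released"
                released_sids.add(sid)
    return sorted(released_sids)


if __name__ == "__main__":
    # Hypothetical records: two embargoed assays sharing one substance.
    assay_status = {1: "on-hold", 2: "on-hold", 3: "on-hold"}
    substance_status = {10: "on-hold", 11: "on-hold", 12: "released", 13: "on-hold"}
    assay_substances = {1: [10, 11], 2: [11, 12], 3: [13]}

    print(release_assays([1, 2], assay_status, substance_status, assay_substances))
    print(assay_status)
    print(substance_status)
```

In this toy model, a substance referenced only by an assay that stays under embargo (SID 13 here) remains on hold.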

SUMMARY

PubChem started in 2004 as a public repository for biological data from small-molecule and RNAi screening. To date, over 80 organizations and laboratories across the world have shared research data via PubChem BioAssay. There are many challenges in developing and maintaining public repositories. The community has joined forces to think critically about and define guidance for desirable data management. Recently, the FAIR (Findability, Accessibility, Interoperability and Reusability) principles have been proposed to provide guidance for managing public data, maintaining data flow and sharing analysis tools and pipelines. This effort aims to bring clarity and to encourage public data stakeholders to work toward this simple guidance together with funding agencies, researchers and publishers, in order to harmonize research data and maximize the value of scholarly digital publishing. Reviewed retrospectively, the PubChem BioAssay repository has been designed and developed in a way that largely complies with the FAIR principles. The BioAssay data model was designed for machine readability, and all data in the database are freely available to the community. The assay data can be searched using the NCBI Entrez system: https://www.ncbi.nlm.nih.gov/pcassay/. PubChem provides additional tools to support data search, access and analysis. Many add-on services and tools have been developed by the community to extend and complement the functionality of the PubChem resource and to provide additional annotations to the data content in PubChem (5). The interplay between PubChem and the community's efforts is mutually beneficial. The information platform at PubChem is under continuous development to encourage reuse of the cheminformatics, chemical biology and functional genomics research data in PubChem, and to enable and ease integration through community efforts.
PubChem will continue to improve services and tools as technology advances, integrate with third-party annotations and other public biomedical data, and work with funding agencies and publishers supporting research data archiving and reutilization. PubChem welcomes the community to share the resource and contribute to the repository.