Nikta Feizi, Sisira Kadambat Nair, Petr Smirnov, Gangesh Beri, Christopher Eeles, Parinaz Nasr Esfahani, Minoru Nakano, Denis Tkachuk, Anthony Mammoliti, Evgeniya Gorobets, Arvind Singh Mer, Eva Lin, Yihong Yu, Scott Martin, Marc Hafner, Benjamin Haibe-Kains.
Abstract
Cancer pharmacogenomics studies provide valuable insights into disease progression and associations between genomic features and drug response. PharmacoDB integrates multiple cancer pharmacogenomics datasets profiling approved and investigational drugs across cell lines from diverse tissue types. The web-application enables users to efficiently navigate across datasets, view and compare drug dose-response data for a specific drug-cell line pair. In the new version of PharmacoDB (version 2.0, https://pharmacodb.ca/), we present (i) new datasets such as NCI-60, the Profiling Relative Inhibition Simultaneously in Mixtures (PRISM) dataset, as well as updated data from the Genomics of Drug Sensitivity in Cancer (GDSC) and the Genentech Cell Line Screening Initiative (gCSI); (ii) implementation of FAIR data pipelines using ORCESTRA and PharmacoDI; (iii) enhancements to drug-response analysis such as tissue distribution of dose-response metrics and biomarker analysis; and (iv) improved connectivity to drug and cell line databases in the community. The web interface has been rewritten using a modern technology stack to ensure scalability and standardization to accommodate growing pharmacogenomics datasets. PharmacoDB 2.0 is a valuable tool for mining pharmacogenomics datasets, comparing and assessing drug-response phenotypes of cancer models.
Year: 2022 PMID: 34850112 PMCID: PMC8728279 DOI: 10.1093/nar/gkab1084
Source DB: PubMed Journal: Nucleic Acids Res ISSN: 0305-1048 Impact factor: 16.971
Details of new and updated datasets in PharmacoDB 2.0
| Dataset | Description | PSet Molecular Data | # Cell lines | # Drugs | # Tissues | Assay | Dose–response source | ORCESTRA |
|---|---|---|---|---|---|---|---|---|
| US National Cancer Institute 60 anticancer drug screen ( | The NCI-60 dataset consists of molecular profiles as well as dose–response data from screening small molecules, including approved and investigational drug compounds | RNA-seq isoforms, RNA-seq composite, Microarray, MicroRNA | 162 | 54774 | 15 | Sulforhodamine B colorimetry | | |
| The PRISM Repurposing dataset ( | The PRISM dataset consists of dose–response data from assessing the anti-cancer effects of non-oncology drugs on human cancer cell lines using the PRISM barcoding method developed by the Broad Institute of MIT and Harvard | *RNA-seq, Microarray, Mutation, CNV | **480 | 1437 | 22 | PRISM (Luminex) | | |
| Genomics of Drug Sensitivity in Cancer ( | The Genomics of Drug Sensitivity in Cancer (GDSC) Project is part of a collaboration between the Wellcome Trust Sanger Institute and the Massachusetts General Hospital Cancer Center. Both the GDSC1 and GDSC2 datasets contain dose–response as well as molecular data from screening anti-cancer therapeutics across genetically characterized human cancer cell lines | RNA-seq, Microarray, Mutation, Mutation (Exome), CNV, Fusion | 1104 | 303 | 29 | Resazurin or Syto60 | | |
| Genomics of Drug Sensitivity in Cancer ( | The Genomics of Drug Sensitivity in Cancer (GDSC) Project is part of a collaboration between the Wellcome Trust Sanger Institute and the Massachusetts General Hospital Cancer Center. Both the GDSC1 and GDSC2 datasets contain dose–response as well as molecular data from screening anti-cancer therapeutics across genetically characterized human cancer cell lines | RNA-seq, Microarray, Mutation, Mutation (Exome), CNV, Fusion | 1104 | 190 | 29 | CellTiter Glo | | |
| The Genentech Cell Line Screening Initiative ( | The gCSI data were generated and shared by Genentech as part of the Genentech Cell Line Screening Initiative. The gCSI dataset includes dose–response data as well as molecular profiles from screening drugs on independently characterized cell lines | RNA-seq, Mutation, CNV | 788 | 44 | 27 | CellTiter Glo | | |
An overview of the new and updated datasets: the types of molecular profiles included, the number of cell lines, drugs, and tissue types, and the assay type used for measuring the dose–response values in each dataset. Links to both the source of the raw dose–response data and the corresponding PSets on ORCESTRA are provided in the table.
Note: The number of drugs is reported based on the PSets' unique drug IDs (Supplementary Data).
*Cell lines are obtained directly from the Broad-Novartis Cancer Cell Line Encyclopedia (CCLE) project. Molecular profiles are accessible from the CCLE dataset on ORCESTRA (https://orcestra.ca/pset/10.5281/zenodo.3905462).
**19 out of 499 original cell lines failed the STR fingerprinting comparison tests and were not included in the PSet.
Figure 1. PharmacoDB 2.0 overview. (A) The new version of PharmacoDB includes updated and new large-scale pharmacogenomic datasets. The web-application contains enriched annotations for drugs and cell lines via connectivity to external databases. PharmacoDB 2.0 includes new analytical methods for tissue-specific and pan-cancer biomarker discovery. The new web-interface ensures scalability and simplifies maintenance. PharmacoDB 2.0 is made fully reproducible through the use of the ORCESTRA platform and automated data ingestion pipelines. (B) Bar plots showing previous (Version 1) and current (Version 2) database statistics.
Figure 2. Computational processing pipeline of raw pharmacogenomic data for ingestion into PharmacoDB. Different panels show the process of ingesting public datasets into PharmacoDB 2.0. The first panel highlights the sources of the newly added datasets, while the subsequent panels highlight the tools and technologies used for Data Processing and Standardization, Data Ingestion and Annotation, and for building the PharmacoDB 2.0 web app itself.
Figure 3. Visualization of tissue-specific drug–response and gene–drug associations. (A) Drug response (AAC) of Dabrafenib across various tissues from all datasets. (B) Differential sensitivity of skin cell lines to Dabrafenib; cell lines and datasets of interest can be highlighted in the plot by checking the boxes. (C) Forest plot of Pearson correlations between Lapatinib response and ERBB2 expression in breast tissue. Data from RNA sequencing are shown here. Significant associations (FDR < 0.05 and Pearson correlation coefficient r > 0.7) are highlighted in bright pink. (D) Manhattan plot showing the association of copy number alterations with Lapatinib response in all datasets and across all tissue types, with ERBB2 highlighted. The genomic coordinates are displayed on the x-axis, and the negative logarithm of the association P-value is displayed on the y-axis. The different colors of each block show the extent of each chromosome.
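The highlighting rule in panel (C) and the y-axis in panel (D) can be illustrated with a minimal Python sketch. It is not PharmacoDB's implementation. It assumes that the FDR is a Benjamini-Hochberg adjustment and that the logarithm is base 10, neither of which is stated in the caption. The function names, record fields, and input values are hypothetical and chosen only for illustration.

```python
import math


def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the input order.

    Assumption: the caption's FDR is a BH adjustment; the source does not say.
    """
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def flag_associations(records, fdr_cutoff=0.05, r_cutoff=0.7):
    """Annotate each record with its FDR, -log10(p), and a significance flag.

    Each record is a dict with hypothetical keys "label", "r" (Pearson
    correlation coefficient) and "p" (association p-value, assumed > 0).
    """
    fdr = benjamini_hochberg([rec["p"] for rec in records])
    annotated = []
    for rec, q in zip(records, fdr):
        annotated.append({
            **rec,
            "fdr": q,
            # Manhattan-plot style y value (base 10 is an assumption).
            "neg_log10_p": -math.log10(rec["p"]),
            # Caption rule: FDR < 0.05 and r > 0.7.
            "significant": q < fdr_cutoff and rec["r"] > r_cutoff,
        })
    return annotated


# Made-up inputs, not values taken from PharmacoDB.
example = [
    {"label": "dataset_A", "r": 0.82, "p": 0.001},
    {"label": "dataset_B", "r": 0.75, "p": 0.2},
    {"label": "dataset_C", "r": 0.30, "p": 0.002},
]
for row in flag_associations(example):
    print(row["label"], row["significant"])
```

Both conditions must hold for an association to be flagged, so a strong correlation with a weak p-value (or the reverse) is left unhighlighted.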