
Pharmacogenomic characterization of gemcitabine response--a framework for data integration to enable personalized medicine.

Michael Harris¹, Krithika Bhuvaneshwar, Thanemozhi Natarajan, Laura Sheahan, Difei Wang, Mahlet G Tadesse, Ira Shoulson, Ross Filice, Kenneth Steadman, Michael J Pishvaian, Subha Madhavan, John Deeken.

Abstract

OBJECTIVES: Response to the oncology drug gemcitabine may be variable in part due to genetic differences in the enzymes and transporters responsible for its metabolism and disposition. The aim of our in-silico study was to identify gene variants significantly associated with gemcitabine response that may help to personalize treatment in the clinic.
METHODS: We analyzed two independent data sets: (a) genotype data from NCI-60 cell lines using the Affymetrix DMET 1.0 platform combined with gemcitabine cytotoxicity data in those cell lines, and (b) genome-wide association studies (GWAS) data from 351 pancreatic cancer patients treated on an NCI-sponsored phase III clinical trial. We also performed a subset analysis on the GWAS data set for 135 patients who were given gemcitabine+placebo. Statistical and systems biology analyses were performed on each individual data set to identify biomarkers significantly associated with gemcitabine response.
RESULTS: Genetic variants in the ABC transporters (ABCC1, ABCC4) and the CYP4 family members CYP4F8 and CYP4F12, CHST3, and PPARD were found to be significant in both the NCI-60 and GWAS data sets. We report significant association between drug response and variants within members of the chondroitin sulfotransferase family (CHST) whose role in gemcitabine response is yet to be delineated.
CONCLUSION: Biomarkers identified in this integrative analysis may contribute insights into gemcitabine response variability. As genotype data become more readily available, similar studies can be conducted to gain insights into drug response mechanisms and to facilitate clinical trial design and regulatory reviews.

Year: 2014 PMID: 24401833 PMCID: PMC3888473 DOI: 10.1097/FPC.0000000000000015

Source DB: PubMed Journal: Pharmacogenet Genomics ISSN: 1744-6872 Impact factor: 2.089

Introduction

Gemcitabine (2′-deoxy-2′,2′-difluorocytidine, dFdC), an analog of deoxycytidine with proven anti-tumorigenic effects, is used in the treatment of solid tumors including pancreatic cancer. Gemcitabine is active toward many solid tumor types but has a narrow therapeutic index and variable responses ranging from lack of efficacy to severe cytotoxicity, which may be attributed to variability in drug exposure and metabolism 1. Significant variations in individual response to gemcitabine therapy are common among pancreatic cancer patients. Earlier studies based on cell lines as well as patient–control populations have demonstrated that interindividual variations in germline DNA can impact cellular response to oncology drugs. Identification of such variants with functional/regulatory impact can serve to predict toxicity and efficacy of chemotherapeutic agents 2. In this study, we have focused on genes encoding drug-metabolizing enzymes and transporters (DMETs), and their association with response to gemcitabine. The products of DMET genes play a substantial role in drug pharmacokinetics, and may have a role in predicting response and clinical outcomes in cancer patients. Some of these variants can not only impact drug metabolism and transport, but also affect the expression of cancer-related signaling proteins in downstream pathways. Several potentially actionable variants have been reported in DMET genes that affect drug toxicity and efficacy among individuals, yet few have been tested in the clinic. Several variants in genes directly involved in gemcitabine metabolism have been reported to impact gemcitabine response (e.g. deoxycytidine kinase, DCK; DNA polymerase epsilon, POLE; cytidine deaminase, CDA; and transporters: SLC28A1, SLC28A2, SLC28A3, SLC29A1, SLC29A2, ABCB1, ABCC2, and ABCC10). 
A significant association has been demonstrated between gemcitabine sensitivity and variants in one or more of the above genes 3–5 including associations between single-nucleotide polymorphism (SNP) haplotypes and gemcitabine treatment outcome in pancreatic cancer patients 6–8.

Regulatory viewpoint

Developing innovative clinical evaluation tools and advancing personalized medicine has been identified by the FDA as a core priority area in advancing regulatory science 9. Understanding the relationship between genetic markers and response to medical products, in terms of both efficacy and toxicity, is a critical component of this priority. First steps toward this understanding include modeling this relationship with a combination of in-vitro and in-vivo genomic and phenotypic markers as described in this paper. Prospective validation of such models along with consideration for regulatory approval and clinical adoption of resultant tools will be required, but work such as this helps lay a strong foundation for more personalized, effective, and safe medical product use.

Importance of data integration to determine clinically actionable variants

Understanding the genetic and molecular mechanisms underlying complex diseases such as cancer is extremely challenging. Genome-wide association studies (GWAS) have been extensively used in the past decade to discover important genetic variants. However, the identified SNPs explain only a small proportion of the phenotypic variation, and the predictive power of these SNPs remains low for many complex diseases 10. To fully elucidate genetic underpinnings of disease a systems biology approach is necessary to characterize variants, mRNA, copy number, proteins, and metabolites, as well as their cellular interactions 11. Gene set and pathway association analyses are playing an increasingly important role in explaining disease mechanisms through the identification of functional genetic interactions 12. Many gene–disease association analyses are based on SNP genotype profiling or gene expression studies. However, SNPs can influence many downstream processes including the expression levels of multiple genes and/or protein levels, and variations in expression levels can directly or indirectly impact disease progression and even drug response 13. An integrative approach combining multiple data types can more accurately capture pathway associations 12 for discovery of clinically actionable variants.

Statistical approaches commonly used to associate variants with disease and/or drug response

Fisher’s exact test (FET) is commonly used in the association of germline polymorphisms with drug response 14. The use of probabilistic networks in conjunction with traditional statistical models for mining relationships and associations from genotype–phenotype data is well established 15. Probabilistic network methods for pharmacogenomics and newer methods such as the Markov Blanket concept may be helpful to better analyze these complex genotype–phenotype associations 16. Considering the complexity of both cancer prognosis and individual drug response to chemotherapeutics, application of these association methods in conjunction with novel informatics and data integration approaches is necessary to identify clinically relevant variants for validation studies and ultimately testing in the clinic for pharmacogenomics applications.
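As an illustration of the allele-based Fisher's exact test mentioned above, the following minimal Python sketch computes a two-sided P-value for a 2x2 table of allele counts by response group. The function name and the example counts are hypothetical and are not taken from the study; the two-sided rule used here (summing all tables no more probable than the observed one) is a common convention and is assumed rather than confirmed for the software used by the authors.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test P-value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2 = a + b, c + d
    col1 = a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of x in the top-left cell with fixed margins.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    p_obs = prob(a)
    # Sum the probabilities of every table that is no more probable than the observed one.
    total = sum(
        p for p in (prob(x) for x in range(lowest, highest + 1))
        if p <= p_obs * (1 + 1e-9)
    )
    return min(1.0, total)


# Hypothetical allele counts: rows = sensitive / resistant, columns = allele A / allele B.
p_value = fisher_exact_two_sided(12, 4, 5, 15)
print(f"P = {p_value:.4f}")
```

A SNP would be flagged in this sketch when the returned P-value falls below the chosen significance threshold.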

Methods

Data sets

We analyzed SNPs in DMET genes from gemcitabine-focused studies on cell lines and patients to identify associations with drug response. The analysis workflow is summarized in Fig. 1.

Fig. 1

Analysis workflow. GWAS, genome-wide association studies; SNP, single-nucleotide polymorphism.

NCI-60 data sets: SNP data: DNA from the NCI-60 cell lines was provided by the Developmental Therapeutics Program (DTP) of the National Cancer Institute (NCI). The NCI-60 are well-characterized tumor cell lines. DNA was analyzed using the Affymetrix Targeted Human Drug Metabolizing Enzymes and Transporters (DMET) 1.0 chip 17, which determines the genotype for 1256 variants in 170 genes involved in drug disposition. An additional 14 variants in DNA repair enzymes of interest outside of the DMET chip were also genotyped due to their potential role in a number of anticancer drug pathways.

Gene expression data: This published data set consists of mRNA expression of the NCI-60 cell lines. Raw data on the Affymetrix U133A gene chip, obtained from the Gene Expression Omnibus (GEO, accession number GSE5720), were used in our analysis 18.

Drug sensitivity data: Drug sensitivity information is denoted by its GI50 value. The GI50 concentration is defined as the concentration required to achieve 50% growth inhibition. A data set containing the log10 of mean GI50 values for the NCI-60 cell lines over seven experiments at a concentration of 10⁻⁶ mol/l for the drug of interest (gemcitabine, NSC613327) was queried and downloaded from the NCI DTP website. These data have been provided as Supplementary File S1 (Supplemental digital content 1). These drug sensitivity data for cell lines are analogous to patient outcome data in response to the same drug.

GWAS: all patients data set: GWAS data from 351 patients with advanced pancreatic cancer treated on a CALGB phase III clinical trial with gemcitabine with or without bevacizumab (CALGB 80303) were downloaded from dbGaP [dbGaP: phs000250.v1.p1]. Patient outcomes were classified based on RECIST criteria, including CR (complete response), PR (partial response), SD (stable disease), PD (progressive disease), and unevaluated or NA (not applicable). Patients with clinical response reported as ‘unevaluated’ or NA were excluded from further analysis. Of the 561 466 SNPs in the data, we extracted 2847 SNPs located in genes contained on the Affymetrix DMET platform (Affymetrix Inc., Santa Clara, California, USA) for analysis. Results from the GWAS study showed no significant association between bevacizumab and overall survival 19. We therefore used the entire cohort for analysis to maximize the input data set.

GWAS: gemcitabine+placebo data set: We analyzed the GWAS data set on the subgroup of 135 patients who were given gemcitabine and placebo to remove the effects of bevacizumab on the results.

Filtering and preprocessing of data sets

The NCI-60 gene expression data were normalized using the Robust Multichip Average (RMA) method in the R Bioconductor package (Affymetrix Inc.) 20. All SNP data sets were uniformly processed. Samples with more than 60% missing genotype calls were removed. SNPs with the same genotype across all samples were filtered out since they did not contribute new information. Missing calls, denoted as NoCall (NC), PossibleRareAllele (PRA), or NotAvailable (NA) for the NCI-60 SNP data and 0/0 for the GWAS: all patients data set, were ignored. SNPs with only reference genotype and missing calls across all samples were also filtered out. After filtering, there were 59 cell lines and 432 variants remaining in the NCI-60 SNP genotype data. For the GWAS: all patients data set, patients with responses CR and PR were aggregated as they were assumed to be more ‘sensitive’ to drug therapy; similarly, patients with PD and SD were aggregated as they were assumed to be more ‘resistant’ to drug therapy. The GWAS: all patients data set had 293 patients with 2846 variants following this aggregation and filtering. After preprocessing of the GWAS: gemcitabine+placebo data set, there were 2837 variants and 135 patients for analysis.

We chose two methods to analyze these SNP data sets: FET and Probabilistic Network Analysis (PNA) using the software BayesiaLab 5.0.6. FET is a well-known nonparametric test that evaluates the association between categorical variables 21. FET was implemented in the PLINK Whole Genome Association Analysis Toolset 22, which conducts an allele-based test to identify genotype–phenotype association. A probabilistic network (PN) is a representation of a joint probability distribution 23. PNA was conducted using the genotype data and gemcitabine response values for each SNP data set. PNs can learn complex models and therefore offer an attractive option to analyze pharmacogenomic SNP data for discovery and prediction 24,25. 
They also allow for the opportunity to build clinical decision support systems in the future. Each SNP was represented as a node in the network, whose states are the genotype values observed for that SNP. Gemcitabine response was represented as a special node called the target node, whose states were the discrete gemcitabine response values (sensitive or resistant). Probabilistic network analysis requires discrete outcome variables, and hence the GI50 values from the NCI-60 gemcitabine response were dichotomized into two groups that correspond to resistant and sensitive phenotypes. We decided to use two levels to ensure adequate sample size in each group. To understand the distribution of the data, a histogram of the log10(GI50) values with the corresponding kernel density estimate was plotted (Fig. 2). The smallest antimode, observed at −6.875, was used as the cutoff value 26 between the resistant and sensitive groups, as displayed by the vertical red line. After the above filtering steps, we produced a consolidated file containing the variant data and associated drug response values. We then used this as input for the SNP analysis described below.
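The genotype filtering rules described in this section can be sketched in a few lines of Python. This is a minimal illustration, not the authors' pipeline: the data layout (a dictionary of samples, each mapping SNP identifiers to genotype strings), the function name, and the toy calls are all assumptions, and every sample is assumed to carry the same set of SNPs.

```python
# Missing-call codes named in the text (NC, PRA, NA for NCI-60; 0/0 for the GWAS data).
MISSING = {"NC", "PRA", "NA", "0/0"}


def filter_genotypes(calls, max_missing=0.60):
    """calls: dict sample -> dict snp -> genotype string. Returns a filtered copy."""
    # Step 1: drop samples whose fraction of missing calls exceeds the threshold.
    kept = {
        sample: genos
        for sample, genos in calls.items()
        if sum(g in MISSING for g in genos.values()) / len(genos) <= max_missing
    }
    if not kept:
        return {}
    # Step 2: drop SNPs that show at most one distinct non-missing genotype,
    # since they carry no information about differences between samples.
    snps = list(next(iter(kept.values())))
    informative = [
        snp for snp in snps
        if len({genos[snp] for genos in kept.values()} - MISSING) > 1
    ]
    return {s: {snp: genos[snp] for snp in informative} for s, genos in kept.items()}


# Hypothetical toy input: cell_3 is all missing; snp_a and snp_c are uninformative.
example = {
    "cell_1": {"snp_a": "A/A", "snp_b": "C/T", "snp_c": "NC"},
    "cell_2": {"snp_a": "A/A", "snp_b": "C/C", "snp_c": "G/G"},
    "cell_3": {"snp_a": "NC", "snp_b": "NC", "snp_c": "NC"},
}
filtered = filter_genotypes(example)
print(filtered)
```

Treating "same genotype across all samples" and "only reference genotype plus missing calls" as a single rule (at most one distinct non-missing genotype) is a simplification made for this sketch.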

Fig. 2

Histogram and density plot for gemcitabine.
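The antimode-based cutoff shown in Fig. 2 can be illustrated with a small kernel density sketch. Everything specific here is an assumption made for illustration: the Gaussian kernel, the bandwidth, the grid resolution, the toy log10(GI50) values, and the reading of "smallest antimode" as the antimode with the lowest log10(GI50) value. Cell lines below the cutoff are labeled sensitive on the grounds that a lower GI50 means less drug is needed for 50% growth inhibition.

```python
import math


def gaussian_kde(values, bandwidth):
    """Return a function that evaluates a Gaussian kernel density estimate."""
    norm = 1.0 / (len(values) * bandwidth * math.sqrt(2 * math.pi))

    def density(x):
        return norm * sum(math.exp(-0.5 * ((x - v) / bandwidth) ** 2) for v in values)

    return density


def antimodes(values, bandwidth, grid_points=401):
    """Grid locations where the estimated density has an interior local minimum."""
    lo, hi = min(values), max(values)
    step = (hi - lo) / (grid_points - 1)
    grid = [lo + i * step for i in range(grid_points)]
    density = gaussian_kde(values, bandwidth)
    dens = [density(x) for x in grid]
    return [
        grid[i]
        for i in range(1, grid_points - 1)
        if dens[i] <= dens[i - 1] and dens[i] < dens[i + 1]
    ]


def dichotomize(values, cutoff):
    # Lower log10(GI50) is treated as the more sensitive phenotype.
    return ["sensitive" if v < cutoff else "resistant" for v in values]


# Hypothetical bimodal log10(GI50) values; not the NCI-60 measurements.
toy_log_gi50 = [-8.0, -7.9, -7.8, -7.7, -6.1, -6.0, -5.9, -5.8]
cutoff = min(antimodes(toy_log_gi50, bandwidth=0.3))
labels = dichotomize(toy_log_gi50, cutoff)
print(round(cutoff, 3), labels)
```

The location of the antimode depends on the bandwidth, so in practice the cutoff should be checked against the plotted histogram, as was done in Fig. 2.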

Estimation of haploblocks

We estimated haploblocks using PLINK to estimate the functional impact of the SNPs and to annotate SNPs located in intergenic regions, which can be associated with SNPs in gene or coding regions if they fall in the same haplotype block. SNPs that fall outside of coding regions (e.g. intronic or UTR regions) are more likely to have a significant impact on gene function if they fall in the same haplotype block as other high-impact coding SNPs. Haplotype blocks were not used for the association analysis and were solely used to determine SNP–gene associations and for functional annotation.

SNP comparative analysis of gemcitabine response

We investigated the association between genotype data and the gemcitabine response in each data set. Genes found to be significant were considered for further downstream analysis. We also identified genes found in more than one data set as these are likely to play a critical role in gemcitabine response. For the FET and haplotype block estimation, a PLINK input file was created for each of the SNP data sets. SNPs that were not biallelic were not considered because PLINK only works on biallelic SNPs. FET results with a P-value less than 0.05 were considered significant.

Of the numerous network learning algorithms in PNA, we applied the Augmented Markov Blanket (AMB) algorithm because it has the ability to subset a limited number of SNPs that best predict the outcome 16. A Markov Blanket corresponds to a set of nodes in the network that make the target independent of all the other nodes conditional on this subset of nodes. The AMB algorithm attempts to find a Markov Blanket for the target node (in this case gemcitabine response) and then uses an unsupervised search to find direct dependencies between each variable belonging to this Markov Blanket. The AMB algorithm implemented in BayesiaLab uses some of the concepts from the Smart-Greedy+ algorithm 27. Specifically, it combines a score-based Augmented Markov Blanket learning algorithm with principles inspired by the constraint-based approaches. The AMB is a nonstochastic learning algorithm that always returns the same result given the same input data and parameters. The AMB algorithm hence created an optimal network with SNPs that were best associated with gemcitabine response 24,28,29. The genotype data were input ‘as is’ and no further coding was carried out, nor was any genetic model assumed. The parameter we controlled was the ‘structural coefficient,’ which allowed for controlling the extent of the blanket. 
We first found the optimal value of this coefficient by repeating the AMB algorithm five times with different values of the structural coefficient in a given interval between 0.1 and 1. At each iteration, the network structure was learned on the entire data set with a growing structural coefficient. This analysis, called ‘Structural coefficient analysis’, plotted the Structure/Data ratio for different values of the structural coefficient (SC). The ‘knee area’ (the area before the strong increase in the graph) gave the optimal range of the coefficient to be used in the network. From this analysis, we obtained the optimal SC of 0.3 for the NCI-60 and the GWAS: all patients data sets, and 0.35 for the GWAS: gemcitabine+placebo data set. This optimal structural coefficient was used to create the network for each data set. To confirm the stability of the networks, we ran cross-validation with data perturbation to ensure that the frequency measures for arc and node confidence were robust. The structural coefficient analysis, the final network obtained from PNA, and the cross-validation results are shown in Supplementary File S2 (Supplemental digital content 2). Once the network was created, a ‘target analysis report’ was made, which ranked the nodes according to the information they brought to the knowledge of the target variable. This is defined as ‘mutual information’. This report also displays P-values from an independence test based on the χ² distribution computed between each variable in the network and the target variable. The mutual information and the P-values were used to further prune the SNPs. Only SNPs with P-values less than 0.05 and mutual information of greater than 0 were retained inside the Augmented Markov Blanket. The resulting SNPs from each test were annotated using the Affymetrix DMET annotation file, and NCBI’s dbSNP database. The union of the significant SNPs obtained from FET and PNA was considered for further downstream analysis. 
By taking the union of the results across data sets used, we generated an accurate and comprehensive set of SNPs that are best associated with gemcitabine response.
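To make the mutual information ranking concrete, the sketch below computes the mutual information (in bits) between a discrete genotype vector and the dichotomized response, then ranks SNPs by that value. It is a generic textbook calculation under stated assumptions, not a reproduction of the BayesiaLab target analysis report; the function names, SNP labels, and toy data are hypothetical.

```python
import math
from collections import Counter


def mutual_information(xs, ys):
    """Mutual information in bits between two equally long discrete sequences."""
    n = len(xs)
    joint = Counter(zip(xs, ys))
    px, py = Counter(xs), Counter(ys)
    # Sum of p(x, y) * log2(p(x, y) / (p(x) * p(y))) over observed pairs.
    return sum(
        (count / n) * math.log2(count * n / (px[x] * py[y]))
        for (x, y), count in joint.items()
    )


def rank_by_mutual_information(genotypes, response):
    """Return (snp, mutual information) pairs, most informative first."""
    scores = {snp: mutual_information(calls, response) for snp, calls in genotypes.items()}
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Hypothetical toy data: S = sensitive, R = resistant.
response = ["S", "S", "R", "R"]
genotypes = {
    "snp_uninformative": ["AA", "AB", "AA", "AB"],
    "snp_informative": ["AA", "AA", "AB", "AB"],
}
ranking = rank_by_mutual_information(genotypes, response)
for snp, score in ranking:
    print(snp, round(score, 3))
```

In this toy case one genotype vector determines the response exactly and the other is independent of it, which gives the two extremes of the ranking.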

NCI-60 integrative analysis of SNP, gene expression, and drug response data in NCI-60 cell lines

We first associated gene expression data with gemcitabine response by applying Student’s two-sample t-tests to the normalized gene expression levels to compare sensitive samples to those that were resistant. The 58 cell lines that had both gene expression and GI50 data were considered. Next, to understand the association between SNPs and gene expression levels, we applied simple linear regression models relating each gene expression value to each SNP 30. SNPs were considered as the independent variables and the gene expression levels were the dependent variables. For each SNP, the most frequent genotype was used as the reference to avoid making genetic model assumptions. If a SNP had k genotypes, (k−1) indicator variables were introduced (k=2, 3). Finally, the above analyses were combined. Using the significant SNPs from PNA and FET, we extracted the significant gene probe sets affected by these SNPs from the regression results. Only those probe sets that were significant with respect to gemcitabine response from the t-test results were selected 7, resulting in a consolidated list of SNPs that affect expression of a gene that in turn affects gemcitabine response. This integrative analysis was performed in R. It can be repeated for any drug of interest for which GI50 data are available and can help identify SNPs and probe sets best associated with drug response. For each analysis we applied multiple testing correction using Benjamini and Hochberg false discovery rates (FDR) 31. Because of the discovery nature of this analysis, and hence to maximize the inclusion criteria, the adjusted P-values were not used as a filtering threshold for integration.
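The Benjamini and Hochberg adjustment mentioned above can be sketched as follows. This is the standard step-up procedure written in plain Python for illustration; the study itself reports using R, and the function name and example P-values here are hypothetical.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted P-values in the original order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P-value to the smallest, scaling each by m / rank
    # and keeping a running minimum so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        index = order[rank - 1]
        running_min = min(running_min, p_values[index] * m / rank)
        adjusted[index] = running_min
    return adjusted


# Hypothetical raw P-values from four tests.
raw = [0.01, 0.04, 0.03, 0.005]
adjusted = benjamini_hochberg(raw)
print([round(p, 4) for p in adjusted])
```

Because the adjusted values were not used as a filter in this analysis, a sketch like this would serve only to report the FDR alongside each raw P-value.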

Systems biology analysis

The significant genes found in each data set were validated against the literature, and pathway analysis was performed to determine their potential role in one or more of the following processes: gemcitabine metabolism, pancreatic cancer, or interaction with other proteins implicated in the response to gemcitabine or other drugs used to treat pancreatic cancer. We also used the results from the NCI-60 integrated analysis to see whether such associations could be explained at the molecular level using pathway analysis. Pathway analysis was performed using IPA (Ingenuity Systems Inc., Redwood City, California, USA) and Ariadne’s Pathway Studio (Rockville, Maryland, USA).

Results

The PLINK-based FET on the NCI-60 data set resulted in 25 significant SNPs (P<0.05). PNA analysis identified three SNPs and two of these overlapped with the FET results. In total there were 26 unique SNPs corresponding to 14 unique genes. The genes ABCC1 (rs8187858, FET, P=0.001 32, cds-synon), CHST3 (rs4148943, FET, P=0.002, UTR-3), ABCC4 (rs4148551, FET, P=0.01, UTR-3), CYP2E1 (rs3813867, FET, P=0.01 33,34, nearGene-5), and ALDH3A1 (rs887241, FET, P=0.01, missense 35) were among the most significant hits. Of the 14 genes identified, three are from the ABCC family (ABCC1, ABCC4, ABCC6), and two are from the CHST family (CHST3, CHST13). The CYP genes identified were CYP2D6, CYP2E1, CYP4F2, CYP4F8, and CYP4F12. ALDH3A1, PPARD, SLCO1B1, and VKORC1 were also identified as significant. Functional analysis of the significant SNPs identified two as damaging: rs2108622, a missense mutation in the CYP4F2 gene; and rs1056522, a missense mutation in the CHST13 gene.

The FET on the GWAS: gemcitabine+placebo data set resulted in 123 significant SNPs (P<0.05). PNA analysis identified 19 significant SNPs, of which 12 overlapped with the FET results. In total there were 130 unique SNPs corresponding to 50 unique genes. Top hits include: CYP39A1 (rs3799884, FET, P=2.0E−4, intron), ABCC4 (rs7993878, FET, P=2.3E−4, intron), SLC6A6 (rs2327896, FET, P=5.1E−4, UTR-3), SLC29A2 (rs2279861, FET, P=1.1E−3, intron), and ABCB1 (rs17327624, FET, P=1.3E−3, intron). Among the 50 genes identified were three members of the ABCC family (ABCC1, ABCC4, ABCC8) as well as two CHST family members (CHST8, CHST11). Twelve CYP genes were identified including two members of the CYP2 family (CYP2B6, CYP2J2) and four members of the CYP4 family (CYP4B1, CYP4F3, CYP4F8, CYP4F12). Functional analysis of the significant SNPs identified rs4646487, a missense mutation in the CYP4B1 gene, as damaging.

The FET on the GWAS: all patients data set resulted in 121 significant SNPs (P<0.05). 
PNA analysis identified 18 SNPs of which nine overlapped with the FET results. In total there were 130 unique SNPs corresponding to 54 unique genes. Top hits include: CYP4F10P pseudogene (rs1543284, FET, P=5.0E−5), CYP4F10P pseudogene (rs1543285, FET, P=2.2E−4), rs1063803 (no direct gene association but in the same LD block as SNPs in the CYP4F3 gene, FET, P=3.6E−4), rs3794989 (no direct gene association but in the same LD block as SNPs in the CYP4F8 gene, FET, P=4.9E−4), CHST8 (rs17325358, FET, P=1.5E−3, intron), SLC6A6 (rs2341985, P=1.5E−3, UTR-3), CYP39A1 (rs3799884, FET, P=1.8E−3, intron), CHST11 (rs312172, FET, P=2.1E−3, intron), and SLC29A2 (rs2279861, FET, P=2.1E−3, intron). The 54 genes identified contained three ABCC family members (ABCC1, ABCC4, ABCC8), five members of the CHST family (CHST3, CHST6, CHST8, CHST9, CHST11), 10 CYP genes including three members of the CYP4 family (CYP4B1, CYP4F3, CYP4F8), and 10 SLC genes including two members of the SLC7A subfamily (SLC7A7, SLC7A8) and two members of the SLC22A subfamily (SLC22A3, SLC22A6).

Comparison of the significant genes from each data set showed that ABCC1, ABCC4, and CYP4F8 were common to all three data sets. The genes CYP4F12 and PPARD were found in both the NCI-60 and the GWAS: gemcitabine+placebo data sets. CHST3 was found in both the NCI-60 and the GWAS: all patients data sets. Thirty-four genes were found in common between the GWAS: all patients and GWAS: gemcitabine+placebo data sets including three members of the UGT2 family (UGT2A1, UGT2A2, UGT2B4). Table 1 lists the top genes found in each set. Cells in the table are marked ‘Yes’ if an SNP associated with the gene was found to be significant by either FET or PNA. Detailed results of the SNP comparative analysis using PNA and FET can be found in Supplementary File S3 (Supplemental digital content 3) and Supplementary File S4 (Supplemental digital content 4), respectively.
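The unique SNP totals reported for each data set follow from inclusion-exclusion on the FET and PNA result sets (unique = FET + PNA − overlap). The short check below uses only the counts stated in this section; the function and variable names are hypothetical.

```python
def unique_count(n_fet, n_pna, n_overlap):
    """Size of the union of two result sets by inclusion-exclusion."""
    return n_fet + n_pna - n_overlap


# (FET hits, PNA hits, overlap, reported unique SNPs) as stated in the Results.
reported = {
    "NCI-60": (25, 3, 2, 26),
    "GWAS: gemcitabine+placebo": (123, 19, 12, 130),
    "GWAS: all patients": (121, 18, 9, 130),
}

for name, (fet, pna, overlap, total) in reported.items():
    # Each reported total should equal the size of the FET/PNA union.
    assert unique_count(fet, pna, overlap) == total, name
    print(name, unique_count(fet, pna, overlap))
```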

Table 1

Genes common to the two study sets after significance analysis

Haploblock analysis

The haplotype block estimation (Supplementary File S5, Supplemental digital content 5) revealed that SNP rs4148943 in the CHST3 gene, which was a top hit in the NCI-60 analysis, is found in the same haplotype block as rs4148946, which was previously reported as a top hit in an analysis of long-term survival in stage III myeloma patients (Table 2) 21. In addition, rs3842 in the ABCB1 gene has been associated with genetic susceptibility to lung cancer 36 and response to antidepressant treatment 37.

Table 2

Estimated haploblocks from NCI-60 and GWAS: all patients data sets for top hits in the association analysis

NCI-60 cell line integrative analysis of SNPs, gene expression, and drug response

SNPs in genes that affect their own expression

From the linear regression analysis of NCI-60 SNP and gene expression data, we focused on SNPs that were associated with expression of their own genes. The results, as shown in Table 3, contain 10 unique SNPs in PPARD, five in ABCC4, two in CYP2E1, and one in SLCO1B1. We annotated all variants using the SNPnexus annotation tool 38,39.

Table 3

SNPs affecting their own gene expression

SNPs that affect gene expression and drug response

The integrative analysis on SNPs, gene expression, and GI50 response revealed significant associations between several variants and the expression of multiple genes that were also associated with GI50 response. Supplementary File S6 (Supplemental digital content 6) shows 25 unique SNPs (14 unique genes) associated with 51 unique probes (39 unique genes), with a total of 129 combinations. Examples include CHST3 variants affecting SFPQ, SLCO4C1, and SLC24A3; a PPARD variant affecting SFPQ; a CYP2D6 variant affecting FMO5; and a CYP4F12 variant affecting FETUB, HNRNPU, and HNRNPA2B1.

Pathway analysis

We performed pathway analysis to understand the biological role of genes containing significant variants in a comparative study of the NCI-60 cell lines and GWAS data sets. Our study identified five significant variant genes (ABCC1, ABCC4, CYP4F8, CYP4F12, PPARD) common to both the NCI-60 and the GWAS: gemcitabine+placebo data sets with respect to gemcitabine response. Pathway analysis identified shared molecular mechanisms for some genes at both the downstream target level and regulatory level (Fig. 3a). Downstream, ABCC1 and ABCC4 exert positive regulation on other ABC transporters via shared ligands 40 or inducing agents 41. At the regulatory level, transcription factors MYCN and NFE2L2 42–45 and cytokines (IL1B) 46,47 play a major role in enhancing the expression of ABCC1 and ABCC4. Pathway analysis of significant genes from NCI-60 (Fig. 3b) and the GWAS: gemcitabine+placebo (Fig. 3c) studies pointed to genes involved in tumor cell proliferation, apoptotic, and inflammatory pathways.

Fig. 3

Pathway analysis of top genes from SNP comparative analysis. (a) Network created using genes common to the NCI-60 and GWAS: gemcitabine+placebo data sets; (b) Network using the significant genes in the NCI-60 data set; (c) Network using the significant genes in the GWAS: gemcitabine+placebo data set. The significant genes have been indicated using the blue highlight. GWAS, genome-wide association studies; SNP, single-nucleotide polymorphism.

To understand the biological associations between SNPs that affect the expression of target genes as well as drug response, we looked for molecular interactions between genes in which significant SNPs have been identified and target genes that might in turn affect drug response. The chondroitin sulfotransferase gene, CHST3, was strongly associated with multiple differentially expressed genes in the integrated data set (Fig. 4a). Pathway analysis showed that CHST3 may indirectly influence the RNA-processing gene SFPQ (PTB-associated splicing factor) via other matrix proteins and cell adhesion molecules such as versican and vitronectin, whereas SFPQ shares common pathways with genes whose expression is associated with CHST3 variants. In the PPARD network (Fig. 4b), PPARD may regulate the PTB-associated splicing factor via the transcription factor SP1. SP1 and SFPQ are known to dynamically regulate growth factor-regulated gene expression response in mammalian cells 48,49. The CYP450 network identified potential regulatory mechanisms via the xenosensors (Fig. 4c–e). The CYP2D6 network showed interaction with other proteins that regulate one of the significantly associated expression genes, Flavin Containing Monooxygenase 5 (FMO5), which is known to mediate the oxidative metabolism of several xenobiotics 50. CYP enzymes are regulated by xenosensors such as NR1I2 (PXR) and NR1H4 (FXR) 51, which also induce Fetuin-B, a tumor suppressor, and may contribute to interindividual variability in drug response 52,53 (Fig. 4d). Heterogeneous nuclear ribonucleoproteins (hnRNPU, hnRNPA, hnRNPC) positively regulate the expression of cytokines such as TNF-α 54, which in turn can reduce the expression of xenosensors and their CYP enzyme targets 55 and drug transporters such as SLCO1B1 56 (Fig. 4e).

Fig. 4

Pathway analysis on SNPs that affect gene expression and drug response. (a) a CHST3 network; (b) a PPARD network; (c–e) a CYP450 network. Entities highlighted in yellow indicate the gene where the SNP is located. Entities highlighted in orange indicate genes from the probes. The legend shows the different types of entities and relationships. GWAS, genome-wide association studies; SNP, single-nucleotide polymorphism.

Discussion

Abnormal expression and activity of drug transporters and metabolizing enzymes may arise from inherent or acquired polymorphisms and lead to undesirable drug response 2. Identification of such variants can aid in the selection of patients who may benefit from gemcitabine treatment. For the GWAS data sets we compared patients treated with gemcitabine+bevacizumab and gemcitabine+placebo as the addition of bevacizumab was shown to have no clinical benefit over gemcitabine treatment alone in these patients. Bevacizumab is an anti-angiogenesis drug that inhibits VEGF-A and its action is different from gemcitabine’s, which inhibits RRM1, a gene that encodes the regulatory subunit of ribonucleotide reductase critical to the synthesis of deoxyribonucleotides. The GWAS: gemcitabine+placebo data set is a smaller, less noisy data set compared with the larger and more diverse GWAS: all patients data set. These differences could be reflected at the molecular level and potentially explain the small difference in results between these two GWAS data sets in the SNP comparative analyses of gemcitabine response and pathway analysis. ABC transporters have the ability to transport diverse types of anticancer drugs including gemcitabine. Members of the ABC transporter family are also involved in diverse processes that impact cell proliferation and apoptotic pathways. Variants in two of these genes, ABCC1 57 and ABCC4 58, whose overexpression has been observed to contribute to gemcitabine resistance in pancreatic cancer cells, were significant in both studies. As the functional significance of the SNPs in regulating these genes is unknown, we looked at molecular interactions that might explain their effect on gemcitabine. We identified shared pathways for these genes, and suspect that one or more of these SNPs could have a combined effect on gemcitabine response. 
CYP4F12, whose role in pancreatic cancer or gemcitabine resistance is as yet unknown, is one of the enzymes induced by p53 in response to xenobiotic signals such as chemotherapy 59. Our analyses also revealed significant variants in CHST genes, which have not been reported to be involved in gemcitabine transport or metabolism. These enzymes use 3′-phospho-5′-adenylyl sulfate (PAPS) as the sulfonate donor and catalyze the transfer of sulfate to selective glycoprotein ligands such as chondroitin. The resulting chondroitin sulfates play a critical role in oncogenic HRAS signaling 60 and in diverse biological functions including cell proliferation, adhesion, migration, and differentiation 61,62. Members of the CHST family also mediate inflammation, immunity, angiogenesis, and extracellular matrix reorganization 63,64. So far, only one study 65 has reported the involvement of chondroitin sulfates in enhancing the antitumor activity of gemcitabine in human bladder cancer cells.
Looking at SNPs within the NCI-60 data that affect expression of their own gene, we found 10 unique SNPs in PPARD and one in SLCO1B1. SLCO1B1 is known to regulate the uptake of drugs and compounds in the liver, and SNP rs4149056 is predicted by SIFT and PolyPhen to be damaging. The C allele of this SNP has been linked to reduced transport activity and hence impaired drug effect 66,67. Another SLCO1B1 variant, rs4149015, known to affect the gene's own expression 68, may impact its transporting activity and, consequently, drug response. Variants in PPARG2 are known to affect mRNA expression of PPARG1 and its target genes 69,70, which play diverse roles in cellular energy metabolism and angiogenesis. Activation of PPARβ/δ, in particular, has been reported to favor angiogenesis, and is suggested to play a critical role in inducing angiogenesis in pancreatic cancer 71. Such alterations in tumor cell characteristics can consequently modify drug response.
Variants in ABCC4 have been observed to affect intracellular concentrations of some drugs even though the SNPs were synonymous or within the 3′UTR 72,73. SNPs can influence the expression levels of multiple genes through either cis-regulatory or trans-regulatory effects, and variations in gene expression levels can directly or indirectly impact drug response 74,75. To understand these effects, we integrated the gene expression data with DMET SNP and drug response data for the NCI-60 cell lines. Associations involving detoxifying proteins such as CYP enzymes and transporters such as SLCO1B1 revealed pathways closely related to drug response regulation. Associations involving other variant genes such as CHST3 and PPARD suggest a major role for cell adhesion, cell growth, and tumor-invasive processes in the alteration of drug response. Although trans-regulatory effects, if present, will have to be experimentally validated, the pathway analysis provides a potential mechanism to explain how SNPs could have a genome-wide impact on drug response, directly or indirectly, by altering tumor cell characteristics.
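The integration of SNP, expression, and drug response data described above can be illustrated with a minimal sketch. It assumes an additive genotype coding (0, 1, or 2 copies of the minor allele) and uses a plain Pearson correlation as a stand-in for the association statistic; the statistical models actually used in the study are not restated in this section. The cell-line values, variable names, and the `pearson_r` helper are all hypothetical and serve only to show the shape of the analysis.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation between two equal-length numeric sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / sqrt(var_x * var_y)


# Hypothetical toy data for six cell lines (NOT taken from the study).
genotype = [0, 0, 1, 1, 2, 2]                          # minor-allele count at one SNP
expression = [5.1, 4.9, 6.0, 6.2, 7.1, 6.9]            # log2 expression of the SNP's own gene
drug_response = [-4.0, -4.2, -5.1, -4.9, -6.0, -6.1]   # assumed log10 GI50 for the drug

# Three pairwise associations: SNP vs. expression (a cis effect),
# SNP vs. drug response, and expression vs. drug response.
r_snp_expr = pearson_r(genotype, expression)
r_snp_drug = pearson_r(genotype, drug_response)
r_expr_drug = pearson_r(expression, drug_response)

print(f"SNP vs expression:     r = {r_snp_expr:.3f}")
print(f"SNP vs drug response:  r = {r_snp_drug:.3f}")
print(f"expression vs response: r = {r_expr_drug:.3f}")
```

In a real analysis each SNP would be tested against many probes, so the resulting statistics would need multiple-testing correction before any SNP-gene pair is passed on to pathway analysis.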

Conclusion and future perspective

There is strong agreement among clinicians about the need to implement pharmacogenomic discoveries in clinical practice due to the changing needs in medicine. However, most physicians have limited familiarity with genomics and inadequate knowledge of the pharmacogenomic tests available to help their patients 76,77. Many associations have been identified between variant genotypes and drug response phenotypes, some of which are now identified in FDA-approved medical product labels 78. Despite the growing evidence of pharmacogenetic involvement in drug response, physician uptake of pharmacogenetic testing has been poor. We believe this is due in part to a lack of strong evidence of clinical utility, evolving clinical guidelines around these markers, and a lack of integrated decision support tools. Another major barrier to address is the lack of adequate physician education in the field 76. To overcome some of these challenges, adequate computational and informatics support will be necessary to analyze and interpret results from clinical studies and to help evaluate pharmacogenomic biomarker panels. Multidisciplinary teams of clinicians, geneticists, bioinformaticians, and systems biologists need to partner to apply growing pharmacogenomic knowledge in the clinic. Decision support tools must be tightly coupled to electronic health records and clinical workflows to alert healthcare providers to actionable pharmacogenetic information. Although we present an integrative systems biology and genetic analysis of ‘omics’ data for understanding variants that are strongly associated with gemcitabine response (e.g. the CHST family of proteins), this report is intended as a framework to systematically integrate varied data types and strengthen the evidence behind pharmacogenomic markers. Such frameworks can ultimately provide motivation to develop clinical decision support tools and interfaces to help clinicians use pharmacogenomic data to improve patient care.
