Literature DB >> 29249361

Pharmacogenomics of GPCR Drug Targets.

Alexander S Hauser¹, Sreenivas Chavali², Ikuo Masuho³, Leonie J Jahn⁴, Kirill A Martemyanov³, David E Gloriam⁵, M Madan Babu⁶.

Abstract

Natural genetic variation in the human genome is a cause of individual differences in responses to medications and is an underappreciated burden on public health. Although 108 G-protein-coupled receptors (GPCRs) are the targets of 475 (∼34%) Food and Drug Administration (FDA)-approved drugs and account for a global sales volume of over 180 billion US dollars annually, the prevalence of genetic variation among GPCRs targeted by drugs is unknown. By analyzing data from 68,496 individuals, we find that GPCRs targeted by drugs show genetic variation within functional regions such as drug- and effector-binding sites in the human population. We experimentally show that certain variants of μ-opioid and Cholecystokinin-A receptors could lead to altered or adverse drug response. By analyzing UK National Health Service drug prescription and sales data, we suggest that characterizing GPCR variants could increase prescription precision, improving patients' quality of life, and relieve the economic and societal burden due to variable drug responsiveness. VIDEO ABSTRACT.

Entities: Chemical

Keywords: FDA approved drugs; GPCR; clinical trial; drug response; economic burden; natural variation; opioid receptor; personalized medicine; pharmacogenomics; polymorphism

Mesh：

Substances：

Year: 2017 PMID： 29249361 PMCID： PMC5766829 DOI： 10.1016/j.cell.2017.11.033

Source DB: PubMed Journal: Cell ISSN： 0092-8674 Impact factor: 41.582

Introduction

A system of rigorous clinical trials and regulation exist to ensure that a new drug is safe and effective when reaching the market. However, natural human genetic variation(s) may cause individuals to respond differently to the same medication. For instance, genetic variation is linked to differences in response to anti-hypertensive drugs such as β-blockers, angiotensin receptor blockers, and angiotensin converting enzyme (ACE) inhibitors (Johnson, 2008, Liggett et al., 2006, Mialet Perez et al., 2003). Natural variation may also increase the propensity for adverse reaction to drugs (Roden and George, 2002). Thus, genetic variation in drug targets may alter therapeutic efficacy and safety of drugs. Inadequate accounting for adverse drug reactions cost a fiscal burden of ∼30 billion US dollars annually in the US alone (Sultana et al., 2013). Hence, understanding genetic variation in drug targets has direct bearing on tailoring drug prescriptions (i.e., personalized healthcare) to maximize efficacy and safety while reducing side effects. Many druggable targets for treatment of common diseases involve G-protein-coupled receptors (GPCRs) that mediate therapeutic effects of ∼34% of the marketed drugs (Santos et al., 2017, Rask-Andersen et al., 2014, Hauser et al., 2017). The sales of GPCR targeting drugs represent a global market share of over 27% (The IDG Knowledge Management Center, 2016). Food and Drug Administration (FDA)-approved drugs target at least 108 GPCRs (herein referred as GPCR drug targets), with an additional 66 receptors targeted by agents that are/were in clinical trials (Table S1). Some drugs act through several targets and frequently include GPCRs. Thus, GPCRs serve as primary and secondary targets and determine the pharmacological profiles of the responses (Allen and Roth, 2011, Hauser et al., 2017). Although studies have identified polymorphisms in GPCRs that lead to variable or adverse drug response (Table S2), the prevalence and impact of genetic variation among all human GPCRs that are targeted by FDA-approved drugs remain unknown. In this study, we present a comprehensive analysis and map of the pharmacogenomics landscape of GPCR drug targets (Figure 1).

Figure 1

Pharmacogenomic Landscape of GPCR Drug Targets

Schematic highlighting the different types of data analyzed in this study, ranging from data on drug targets, variants, functional effects, sequences, structures to information on prescription, and sales of drugs in the UK.

Pharmacogenomic Landscape of GPCR Drug Targets Schematic highlighting the different types of data analyzed in this study, ranging from data on drug targets, variants, functional effects, sequences, structures to information on prescription, and sales of drugs in the UK.

Prevalence and Incidence of Natural Variation in GPCR Drug Targets

What is the prevalence of GPCR drug targets to harbor a missense variation (MV) within an individual? An investigation of complete genotype information for 2,504 “healthy” individuals from the 1000 Genomes Project (Auton et al., 2015) showed that, on average, an individual harbors 68 missense variations within the coding region of one-third of the GPCR drug targets (Figure 2A). Of these, an average of 8 variants per individual have previously known clinical associations with altered drug response (Figure 2B). For instance, the heterozygous A307T variation (minor allele frequency [MAF]: 0.49) in the follicle stimulating hormone receptor (FSHR) is prevalent in women who develop polycystic ovary syndrome and is associated with a higher responsiveness to exogenous FSH (Dolfin et al., 2011). The G9S variant (MAF: 0.48) in the dopamine receptor 3 (DRD3) is linked to increased risk of gastrointestinal toxicity upon Levodopa treatment in subjects with Parkinson’s disease (Rieck et al., 2016) (Figure 2C). Analysis of 1,762 trios (father, mother, offspring) (Turner et al., 2017) revealed that 6 offsprings harbor at least one new de novo MV in a GPCR drug target. In other words, a new, non-lethal, missense, germline mutation in a GPCR drug target arises in 1 of every ∼300 newborns (Figure 2D). These observations collectively suggest that GPCR drug targets are likely to show substantial variation with new missense mutations continuing to arise within their coding region.

Figure 2

Distribution of Individuals Harboring Missense Variation in GPCR Drug Targets

(A and B) Estimates based on genotype data from 2,504 individual genomes was made per individual on (A) number of missense variants in GPCR drug targets (left) and the number of GPCR drug targets harboring a missense variation (right) and (B) number of clinically known variants that alter efficacy of drug response or toxicity in GPCR drug targets (left) and the number of affected GPCR drug targets with clinically known mutations (right).

(C) Allele frequencies of variants, known to alter drug response in 2,504 individuals (number of homozygous/heterozygous carriers) (Table S2).

(D) Analysis of 1,762 studied trios (father-mother-offspring) revealed a total of 9 de novo missense mutations in 6 GPCR drug targets.

Distribution of Individuals Harboring Missense Variation in GPCR Drug Targets (A and B) Estimates based on genotype data from 2,504 individual genomes was made per individual on (A) number of missense variants in GPCR drug targets (left) and the number of GPCR drug targets harboring a missense variation (right) and (B) number of clinically known variants that alter efficacy of drug response or toxicity in GPCR drug targets (left) and the number of affected GPCR drug targets with clinically known mutations (right). (C) Allele frequencies of variants, known to alter drug response in 2,504 individuals (number of homozygous/heterozygous carriers) (Table S2). (D) Analysis of 1,762 studied trios (father-mother-offspring) revealed a total of 9 de novo missense mutations in 6 GPCR drug targets.

Mutational Landscape of GPCR Drug Targets

In addition to MV, mutations that introduce a stop codon, cause a frameshift or affect essential splice sites constitute loss-of-function variations (LoF). The abundance of a protein-coding gene may be affected by deletions and/or duplications (copy number variation [CNV]). Such mutational events may alter the functional property and/or change the abundance of a drug target, either of which can influence drug efficacy, safety profile, and adverse reaction. How much variability is seen in the GPCR drug targets in the human population? To characterize the spectrum and prevalence of variation in GPCR drug targets, we investigated data from the exome aggregation consortium (ExAC), which contains aggregated information on MVs, LoFs, and CNVs for ∼60,000 ‘healthy’ individuals (Lek et al., 2016, Ruderfer et al., 2016). This allowed us to characterize the mutational landscape of currently druggable GPCRs in the human population. We find a total of 14,192 MVs in 108 GPCR drug targets, with a mean of 128 rare (MAF <1 × 10−3) and 3.7 common (MAF ≥ 1 × 10−3) variants per receptor (Figure 3A and S1A). On average, 25% of all positions in each of the 108 GPCRs contain a MV (Figure 3A). GPCR drug targets have, on average, a LoF mutation in 9.3 different positions per receptor (Figure 3B). Our conservative estimate suggests that on average, at least 120 of the 60,706 individuals harbor such LoF mutations (i.e., stop codon, essential splice site, and frameshift mutation) in a GPCR drug target (0.2%; STAR Methods). In fact, a minimum of one LoF variant has been observed in each of the 108 GPCRs suggesting that heterozygosity, regulatory epistasis, and buffering mechanisms such as allele-specific expression might offset the effects of these drastic mutations in healthy individuals (Lappalainen et al., 2011, Kukurba et al., 2014). Many GPCR drug targets are also susceptible to CNVs and each of the GPCRs analyzed had on average two duplications and three deletions reported in the ExAC dataset (Figure 3C).

Figure 3

Genetic Variation Landscape of GPCR Drug Targets

(A–C) Scatterplots of (A) missense variation (red), (B) loss-of-function mutations (blue), and (C) copy-number variation (purple) for GPCR drug targets. Each mutation type shows the number of observed variants (separated into deletions and duplications for CNVs) for a given GPCR drug target. Missense variation density was obtained by normalizing number of missense mutations to the receptor sequence length. Loss-of-function mutations are presented as the minimum percentage of individuals harboring at least one copy of a protein-truncating variant (STAR Methods). Correlations and mean values (μ) are shown for MVs and LoFs. Mean values (μ) for the distributions are provided. Genetic variation landscapes of GPCR drug targets that are in clinical trials are provided in Table S3 and S4). Lower half of the figure shows the distribution of top 10 (upper panels) and bottom 10 ranking GPCR drug targets (lower panels).

Putative Functional Impact of Variants in GPCR Drug Targets

Can the observed variants of a receptor affect drug response and what fraction of the variants is likely to have an impact? We analyzed each of the MVs and investigated whether they map to functionally relevant regions (i.e., ligand binding, effector binding, allosteric sodium binding pocket, activation micro-switches, and post-translational modification sites) to infer possible impact (Figure 4A; Table S4). We identified functional sites based on data from all 196 available GPCR-ligand crystal structures; published literature and 2,544 experimentally validated post-translational modification sites (PTMs). 2,036 mutations in the different receptors fall within known functional sites (14.3% of 14,192 MVs; Z score = 3.7; p = 1.5 × 10−4; permutation test; Figure 4A; Table S4).

Figure 4

Missense Variations in GPCR Drug Targets and Their Possible Functional Impact

(A) Variants predicted to have impact by SIFT or PolyPhen (dark green). MVs can affect different functional sites (light green), which were defined as ligand binding (left), post-translational-modification site (bottom), and micro-switches including allosteric sodium ion binding pocket and G-protein/arrestin interaction interface (right). The displayed structures show missense variants within 5 Å (red) of an approved drug (left and right) or MVs within 5 Å distance to the G protein or arrestin. PDB IDs are provided in the bottom of each structure sub-panel.

(B) Disease ontology (left), FDA-approved drug (middle), and variant (right) known (i.e., statistical association in clinical-genetics studies) to alter drug response or efficacy or lead to adverse drug response (Table S2). In some cases, the drug and disease are linked to reflect the clinical study design and are not drugs given to treat those diseases.

Pharmacological and Clinical Effects of Receptor Variants

Are the individuals with variant receptors likely to respond differently to drugs? To investigate this, we compiled experimental data on ligand-mutant GPCR interactions. Consistent with the findings above, of the 49 experimentally tested mutations corresponding to naturally occurring variants in 24 receptors, 32% (16 of 49) show at least a 5-fold change in affinity or potency to one of the ligands tested. Of these, 68% (11 of 16) fall within known or putative functional sites (Table S4). These observations suggest that naturally occurring variations in these drug targets can affect drug binding. We then analyzed data on drug-missense variant associations from population-based clinical studies and compiled a dataset of statistical association between mutations in 16 different positions among 11 receptors and altered response to one of 39 approved drugs (Figure 4B; Table S2). These drugs cover treatment of diverse disorders ranging from metabolic, respiratory, nervous, and reproductive system disorders to cardiovascular diseases. We categorized the diseases for which a drug is administered and linked this information with the variant, and the position on the receptor. Of the 16 variants that have been associated with an altered drug response, four variants reside within putative functional sites and have modest to high allele frequencies (MAF = 0.18 – 0.49; Figure 4B). The rest of the MVs reside outside of functional sites, suggesting that such mutations may be affecting yet-to-be characterized functional sites. Such sites may facilitate interactions with other factors such as chaperones, membrane proteins, cytosolic proteins, and membrane lipids, form a part of oligomerization interface or linear motifs within disordered regions, or may influence receptor biogenesis and expression homeostasis (Chen et al., 2014, van der Lee et al., 2014, Venkatakrishnan et al., 2014, Boucrot et al., 2015, Gupta et al., 2017).

Natural Receptor Variants Can Impact Drug Response and Bias Signaling

In order to better understand the implications of the observed variations, we experimentally investigated the impact of variations in two GPCR drug targets. We first analyzed the μ-opioid receptor, a highly variable GPCR drug target with one of the highest numbers of FDA-approved drugs (Figure S2). Previously uncharacterized polymorphisms around the ligand-binding site were selected and tested using a real-time bioluminescence resonance energy transfer (BRET) assay, which monitors state transitions in G-protein subunits upon receptor stimulation. Quantifying the onset kinetics of G-protein activation by GPCRs (KON) in this assay approximates potency, while the maximal response amplitude (RMax) reflects the efficacy (Masuho et al., 2015a) (Figures 5A, 5B, S5A, and S5B). Our assay could reliably discriminate differential effects of full agonist (morphine) from that of partial agonist (buprenorphine) and antagonist (naloxone), compared to the endogenous agonist endomorphine-1 for the wild-type receptor (Figure 5C).

Figure 5

Effects of Natural Mutations on Drug Activity and G-Protein Coupling

(A) Positions of selected missense variations of the μ-opioid receptor (MOR) near the ligand-binding pocket.

(B) Schema of the BRET assay for real-time monitoring of G-protein activation. Activating μ-opioid receptors by agonist leads to the dissociation of inactive heterotrimeric G proteins into active GTP-bound Gα and Venus-Gβγ subunits. The free Venus-Gβγ then interacts with the Gβγ effector mimetic masGRK3ct-Nluc to increase the BRET signal.

(C) Ligand/drug-induced maximum BRET amplitude (RMax) and activation rate constants (kON) by wild-type μ-opioid receptor (mean ± SEM, n = 6 wells).

(D) Real-time monitoring of ligand/drug actions on μ-opioid receptor mutants (mean response trace, n = 3 or 6).

(E) Quantification of stimulus bias (RMAX, left and kON, right) of μ-opioid receptor mutants. The values of agonist-induced responses were normalized to the reference, wild-type μ-opioid receptor (black line). The values of naloxone-induced responses were normalized to the K235N3.36x36 mutant (thickness represents the SEM over n = 3).

(F) Positions of selected missense variations of the Cholecystokinin receptor type A.

(G) Schema of the BRET assay for real-time monitoring of G-protein activity for CCKAR experiments.

(H) Agonist-induced maximum BRET amplitude (RMax) and activation rate constants (kON) by wild-type CCKAR (mean ± SEM, n = 6 wells).

(I) Real-time monitoring of G-protein activation by CCKAR mutants stimulated with caerulein (30 μM, applied at 5 s, n = 3) normalized to the maximum amplitude of the wild-type receptor.

(J) Quantification of G-protein-coupling bias (RMAX, left and kON, right) of CCKAR mutants normalized to wild-type CCKAR (black line, thickness represents the SEM over n = 3).

Drugs and Targets that Are Affected due to Functional Site Variability

Which receptors have polymorphisms in a significant fraction of their functional sites? For this, we analyzed the ExAC data and identified receptors that show most and least variation in known functional sites. On average, each receptor harbors at least one polymorphism in 23% of the known functional sites (Table S5). Among the highly variable GPCR drug targets are the somatostatin SST5 receptor (SSTR5), cholecystokinin A receptor (CCKAR), dopamine D5 receptor (DRD5), and the calcitonin receptor (CALCR), all of which display amino acid alterations in more than 40% of the known functional sites (Figure 6A). Thus, for some receptors, the high incidence of variation (i.e., fraction of known functional sites with a MV) makes drug effects more likely, whereas the drug targets with less variability could be expected to maintain the expected drug response. We then developed a score for each drug to rank them based on how likely they are to manifest altered response due to variability in the fraction of known functional site in its drug target(s) (Table S6). We find that ergoloid (for treating cardiovascular disease), olanzapine, and asenapine (both for treating schizophrenia) are drugs whose targets are most variable and thus are more likely to manifest variable response (Table S6). Drugs, GPCR Functional Site Variability, and Associated Economic Impact (A) Number of FDA-approved drugs (y axis, log-scale) against their missense mutation density within known functional sites (x axis) for GPCR drug targets. Color represents the total number of missense variants for each receptor within known functional sites as seen in the ExAC dataset. (B) Number of prescribed items by the National Health Service each month (y axis, log-scale) against the maximum number of missense variations in known and putative functional sites of its therapeutic GPCR target for each FDA-approved drug. UK sales volume is shown in million GBP per month. (C) Estimated economic burden on the National Health Service per year due to ineffective drug prescription (STAR Methods). Please see Table S7 for each drug. See also Figures S6 and S7.

Figure S7

Possible Economic Burden Due to Loss-of-Function and Copy-Number Variation Observed for GPCR Drug Targets, Related to Figure 6

(A–C) For each FDA-approved drug, a pot of the number of prescribed items by the National Health Service each month (y axis, logarithmic scale in thousands) against (B) the percentage of individuals with loss of function mutations and (C) maximum number of copy number variations in its therapeutic GPCR target is shown. Sales volume is shown in million GBP per month.

What fraction of the human population is likely to carry a variant GPCR drug target with a mutation in a known or putative functional site? We find that on average, 3.1% of the 2,504 individuals in the 1000 Genomes Project carry at least one allele with a missense variation in a known functional site in any given GPCR drug target (11.9% in known or putative functional site; Table S5). For instance, over 86% and 69% of the individuals carry at least one allele with a missense mutation in a known functional site in the cannabinoid receptor 2 (CNR2) and glucagon like peptide 1 (GLP1) receptor, respectively, that are targeted by the common antiemetic nabilone, and several anti-diabetic drugs such as exenatide. In line with what we observed for μ-opioid receptor variants, this suggests that a substantial fraction of the population might carry variant receptors and remain healthy but have the potential to display differences in drug response when treated with a drug. Using the genotype information of mutations in known and putative functional sites, we ranked drugs whose targets are most variable (i.e., fraction of population) in their functional sites and thus are more likely to manifest different response (Table S6). The information and the framework that we describe here can be used to prioritize drugs (Table S6) for pharmacovigilance investigations by regulatory bodies, post-market follow-up studies such as in drug repurposing efforts, as well as in personalizing prescriptions (e.g., dose, treatment regime, etc.). The information on the spectrum and prevalence of receptor variation (Table S5) can also be used for patient stratification of those entering clinical trials in order to maximize success in clinical outcomes.

Possible Economic Impact of Drug Target Variability

Just in the UK alone, the National Health Service spent ∼1.7 billion pounds in primary care for GPCR targeting drugs in 2016 (∼21% of all drug cost). What is the economic burden of drug cost associated with genetic variation in GPCR drug targets? For this, we investigated all the 279 FDA-approved drugs that are prescribed actively in the UK, the number of prescribed items for a 6-year duration as available in the National Health Service health records (2011–17), and the cost associated with prescriptions. Drugs that are highly prescribed bind to receptors that are polymorphic within known and putative functional sites in the human population (Figure 6B). We find the same trend when considering LoF and CNVs (Figures S7A and S7B). For example, the highly polymorphic drug target, μ-opioid receptor, mediates the effect of morphine, tramadol, codeine, buprenorphine, and fentanyl among other drugs (n = 22) that are prescribed over 4 million times each month and account for sales of more than 432 million British Pounds (GBP) per year just in the UK. Even if a small fraction (7%, from Table S5) of these prescriptions are ineffective or lead to unexpected adverse reactions, such a difference may contribute to a differentially effective treatment outcome and a significant economic burden (an estimate of ∼30 million GBP per year in the UK alone). Possible Economic Burden Due to Loss-of-Function and Copy-Number Variation Observed for GPCR Drug Targets, Related to Figure 6 (A–C) For each FDA-approved drug, a pot of the number of prescribed items by the National Health Service each month (y axis, logarithmic scale in thousands) against (B) the percentage of individuals with loss of function mutations and (C) maximum number of copy number variations in its therapeutic GPCR target is shown. Sales volume is shown in million GBP per month. To analyze each drug, their GPCR targets, polymorphisms within drug targets, National Health Service prescription and sales data, prescription statistics, affected individuals considering known and putative functional sites as well as homozygosity/heterozygosity, and economic burden estimates, we present an interactive resource at http://gpcrdb.org/mutational_landscape/economicburden. In this simplistic estimate of economic burden (STAR Methods), each prescription is treated as being made for a separate individual due to patient anonymity (i.e., we do not account for the recurrent prescription of the same individual). Furthermore, information about the dose per prescription, and how this has been altered based on patient response is not considered. While we do not independently weigh the variables in our equation, at the level of an individual, the estimates can be higher or lower depending on the exact response due to the polymorphism (i.e., known and putative functional site, homozygosity, heterozygosity, neutral, loss- or gain-of-function effects) and the patient’s alternative management by the National Health Service. Providing a broad estimate, in the UK, the possible economic burden on the National Health Service due to ineffective prescription of drugs targeting GPCRs could range between 14 million (considering % of individuals with both alleles having a mutation only in known functional sites) and 501 million (considering % of individuals with at least one allele having a mutation in known or putative functional sites) GBP annually (Figure 6C; Table S7; STAR Methods). These estimates do not take into account other sources of economic burden such as hospital prescriptions, nature of the illness (e.g., chronic v/s short-term treatment for acute disease), age of patient (i.e., life-years of treatment), mutations outside of the coding region that affect drug target expression level, associated additional patient care and additional hospital costs in the case of adverse drug reaction. The calculations suggest that polymorphisms in GPCR drug targets may constitute a substantial, unaccounted healthcare expense. Thus, investing efforts on understanding and actively incorporating the effect of drug target polymorphisms and drug response in trials and the clinic has the potential to curtail the recurrent, unaccounted expenditure on public health. Future studies should carefully collect more information and consider various factors such as the exact effect of heterozygous nature of a receptor polymorphism (i.e., loss of function, gain of function, buffering), the dose and ways in which drugs are prescribed (e.g., repeat prescriptions), efficacy is measured in clinical practice, and drug kinetics (that will vary in different individuals). These factors will be different depending on the drug, receptor, individual, disease, country and medical practice, and need to be considered for each receptor and drug independently to obtain robust estimates of economic burden due to variable drug response. We hope that the resource and analysis platform that we present here and our estimates using a simplistic model will inspire new and novel lines of investigations addressing this essential socio-economic and health problem.

Discussion

Despite the importance of GPCRs as a major family of drug targets, no receptor variants are included in the labeling information of drugs (FDA, 2014, Thompson et al., 2014, Raimondi et al., 2017). Although most variants have minor consequences and are not described in the literature, even severe adverse effects are estimated to only be reported in 1%–10% of cases (Giacomini et al., 2007), possibly due to a lack of a single post-marketing surveillance system for registering such effects. Additionally, the effects that drugs produce are often confounded by drug-drug interactions due to their combinatorial use, overlap with disease symptoms, and general health status of the patient. Disentangling the contribution of the variation with respect to the disease condition or physiological differences and drug treatment outcomes can help understand how and why certain mutations are buffered under normal physiological conditions in the healthy human population but could cause different responses to drugs (Figure 7A).

Figure 7

GPCR Variants Can Contribute to Altered Drug Response

(A) Receptor response outcome upon binding to an endogenous ligand binding (E) and upon drug treatment (D). A receptor variant may induce phenotype-altering (e.g., disease) perturbations (endogenous ligand) and/or altered drug response (and combinations thereof). The position of the missense variation (top right) renders its effect. Some mutations may have no effect on the binding of the endogenous ligand or the drug and are entirely neutral. Mutations in/near the ligand-binding interface might affect endogenous ligand signaling, drug response, or both. Mutations in the effector-binding interface (G protein/arrestin) most likely affect both endogenous ligand signaling and drug response.

(B) Differences in drug response due to different mutations between individuals in a population. The drug target variation spectrum may differently affect individual drug responses by potentially altering ligand potencies and efficacies, receptor conformation, surface expression, and/or biasing signaling. Personalized target sequencing could facilitate prognosis of a patient’s drug response. Additionally, pharmacological characterization of genetic variants that have been cataloged in humans could foster precision prescription.

STAR★Methods

Key Resources Table

Contact for Reagent and Resource Sharing

Further information and requests for resources and reagents should be directed to and will be fulfilled by the Lead Contact, M. Madan Babu (madanm@mrc-lmb.cam.ac.uk).

Experimental Model and Subject Details

Cell Lines and Tissue Culture

HEK293T/17 cells were grown in culture medium (Dulbecco’s modified Eagle’s medium supplemented with 10% fetal bovine serum, MEM non-essential amino acids (Life Technologies), 1 mM sodium pyruvate, and antibiotics (100 units/mL penicillin and 100 μg/mL streptomycin) at 37°C in a humidified incubator containing 5% CO2. This cell line is derived from a female and was purchased from ATCC.

Method Details

Use of generic residue numbering systems to compare GPCR positions

In order to make the findings presented here applicable to all GPCRs, the GPCRdb numbering system (http://gpcrdb.org) was used throughout this study (Isberg et al., 2015). The structure-based GPCRdb numbering scheme is an adaption of the sequence-based Ballesteros-Weinstein scheme with corrections for helix bulges and constrictions. GPCRdb numbers are distinguished by a unique separator x and may be used alone, e.g., 5x47, or together with one of the sequence-based schemes (such as Ballesteros-Weinstein), e.g., 5.46x47. Cross-class comparisons of generic residue positions (note, that e.g., 6x50 in Class A does not correspond to 6x50 in Class B) were conducted according to a class residue translation table (Isberg et al., 2015). The offset to the Class A x.50 to the corresponding Class B position can be deduced from following offsets: TM1-7: 4, 7, 4, 0, −4, 5, 4. Class A to C is translated with the following offset for TM1-7: 4, −4, 4, −10, 0, 2, −5. Class A to F is translated by TM1-7: −3, −1, 0, 0, 4, −1, 0. For example, the Class C 4x50 position corresponds to the 4x60 position in Class A receptors.

Datasets

FDA-approved drugs and their GPCR targets

All FDA-approved drugs and agents that are/were in clinical trials that target GPCRs were obtained from GPCRdb (Hauser et al., 2017). In this list, every single FDA-approved drug-receptor interaction has a source reference article obtained via DrugBank (ver. 5.0.7) (Law et al., 2014) and literature search. The respective references (PubMed IDs) are available as annotation. Other annotations include primary and secondary targets of approved drugs and additional targets that reached clinical trials. For the latter, information about whether it is still in clinical trials (“ongoing”) or terminated is provided. Approved agents that are close analogs of the equivalent endogenous ligands are noted. The list of drug-GPCR interactions is provided in Table S1. Please see www.gpcrdb.org/drugs/drugbrowser (for all drug-receptor pairings) and www.gpcrdb.org/drugs/drugmapping (for a GPCR drug target overview). The dataset from Hauser et al. (Hauser et al., 2017) contains 475 FDA-approved drugs that target 108 GPCRs. Additionally, it contains 532 agents in clinical trials (discontinued and ongoing) targeting a total of 165 receptors. Of these, 66 receptors do not have any FDA-approved drug (i.e., not part of the 108 GPCR drug targets).

Genotype data

To estimate the number of coding loci with missense mutations (with respect to the reference human genome GRCh37/hg19) within the GPCR drug targets in an individual, we obtained genotype data for 2,504 individuals from the 1000 Genomes Project (Auton et al., 2015). Each of the genomes of the 2,504 individuals was separately investigated for missense variants in GPCR drug targets and their clinical association with altered drug response in literature.

De novo mutations data

To estimate the de novo missense mutation rate within GPCR drug targets, we obtained de novo mutations from 1,762 control trios (without any reported pathological conditions) compiled from ten different studies from denovo-db (Turner et al., 2017).

Genetic variation data

We retrieved natural genetic variation data for all GPCRs from the Exome Aggregation Consortium (ExAC), which compiled exome-sequencing data from various large-scale cohorts spanning 60,706 unrelated individuals from 6 distinct human populations. We extracted polymorphism data for the coding-region of the 108 GPCR drug targets and GPCRs that reached clinical trials (n = 66) via their respective Ensembl transcript IDs as used in the GPCRdb (Table S5). Minor Allele frequencies (MAFs) were calculated as the ratio of allele counts of the less frequent allele to the total number of alleles at that locus. Each variant was mapped to its respective structural segment (e.g., transmembrane regions, intra-cellular loops, extra-cellular loops, etc.), generic residue position (i.e., GPCRdb numbering), receptor class, ligand-type and family classification according to the GuideToPharmacology (Southan et al., 2016) and GPCRdb (Munk et al., 2016a) using custom R and python pandas scripts. Densities (number of MVs normalized to the protein length) were calculated using the respective protein length. Missense variants were categorised into “changed” and “similar” depending on whether the amino acid substitution switched defined categories such as hydrophobic (‘A’, ‘C’, ‘F’, ‘I’, ‘L’, ‘M’, ‘V’, ‘W’, ‘Y’), aromatic (‘F’, ‘H’, ‘W’, ‘Y’), polar uncharged (‘S’, ‘T’,’N’, ‘Q’), helix breakers (‘P’, ‘G’), negative (‘D’, ‘E’) and positive (‘H’, ‘K’, ‘R’) or remained within the same category. Loss of function (LoF) mutations was extracted by filtering for essential splice sites, gained stop codon or frameshift mutations. As no genotype data were available, to calculate the number of individuals with a LoF mutation, we estimated the minimum number of individuals with LoF mutations. For this, we extracted positions within a GPCR drug target with the highest allele count for LoF mutations and corrected for homozygous individuals (i.e., the number of homozygotes were subtracted from the total allele count to arrive at the number of heterozygous individuals). This number (minimum population frequency) reflects the minimum number of heterozygous individuals, who carry at least one LoF mutation in a GPCR drug target. Copy number variations (CNVs) were analyzed from 59,898 human exomes for all GPCR drug targets (Ruderfer et al., 2016). For a total of 82 receptor genes out of the 108 GPCR drug targets, CNVs could be confidently called. We calculated the z-score for each category (number of missense variants, missense variant density, number of loss of functions, loss of function minimum population frequency, number of deletions and number of duplications) bywhere: x is the observed value for the category considered. μ is the mean of the 108 GPCR drug targets in a given category. σ is the standard deviation of the 108 GPCR drug targets in a given category. The absolute value of the z-score represents the distance between the individual observed value and the population mean in units of standard deviation. We then used hierarchical clustering ‘heatmap.2′ from ‘gplots’ in R to identify receptors that are highly polymorphic in multiple mutation categories (Figure S1B).

Estimation of putative functional impact of variants in GPCR drug targets

An analysis of the putative functional impact of variants was obtained using SIFT (Ng and Henikoff, 2003) (sorting intolerant from tolerant) and PolyPhen (Adzhubei et al., 2013) (polymorphism phenotyping). A missense variant position was classified as a putative functional site if SIFT (0-1; 0-0.05: deleterious, > 0.05: tolerated) or PolyPhen outcome (0-1; 0-0.1: benign, 0.1-0.2: possibly damaging, > 0.2: probably damaging) predicted a deleterious or possibly damaging outcome. This resulted in the identification of 9,522 variants (referred to as putative functional sites).

Engineered in vitro mutations

Data on 28,779 experimentally characterized in vitro mutations was retrieved from the GPCRdb (Pándy-Szekeres et al., 2017). From this dataset, we extracted 9240 mutants for human receptor constructs and with reported effect values on ligand affinity/potency. This includes experimental values for different ligands for a total of 1,996 distinct receptor positions. Additionally, all experimental data obtained by investigating orthologous receptors from related species such as mouse, rat, hamster, dog, pig and rhesus macaque were included if the corresponding generic structural-alignment position in the human receptor was identical in amino acid (Isberg et al., 2015). The combined dataset comprised of 13,223 mutants in 68 distinct receptors and a total of 928 unique ligands. We then looked for identical amino acid substitutions (same receptor, position, wild-type and mutant amino acid) that mimicked a naturally occurring variant in the ExAC dataset for the 108 GPCR drug targets. In cases where there were multiple data points, we considered maximum fold changes. (Table S4). A cut-off of 5-fold for the maximal absolute fold changes was applied for experimental in vitro support of impact on drug response (Munk et al., 2016b).

Clinical annotations

Clinical information about missense variant-drug pairs with reported changes in efficacy, dosage or toxicity/ADR were manually curated from the literature and PharmGKB, which aggregates the impact of human genetic variation on drug response (https://www.pharmgkb.org/). We obtained the level of evidence for each association as listed in PharmGKB. If it was not available, we assigned the evidence level based on PharmGKB criteria (https://www.pharmgkb.org/page/clinAnnLevels), so as to guide the reader in terms of the evidence that is available for a particular clinical association with a receptor variant. All figures and text referring to clinical associations have been limited to a level of evidence of 3 or above (3, 2 and 1). In this annotation scheme, 4 is the lowest level of evidence and 1 is the highest level of evidence. Disease annotations were extracted from the indications in the drugs@FDA database (https://www.accessdata.fda.gov/scripts/cder/daf/) or the actual study that reported the altered drug response. Higher disease ontology categories were assigned using the Experimental Factor Ontology (EFO) retrieved from the Open Targets database (Koscielny et al., 2017). Corresponding positions and amino acid substitutions of reported SNP identifiers were retrieved using BioMart for the “canonical transcripts” as stored in GPCRdb (Table S5) (Smedley et al., 2015). Network analysis between disease categories, drugs and missense variants were done using Cytoscape 3.4.

Allele-specific expression data

We obtained information on allele-specific expression of genes from published literature (Savova et al., 2016). Briefly, genes were classified as those with monoallelic and biallelic expression, by integrating gene-expression data and specific chromatin signatures of gene expression (co-occurrence of silencing mark on Histone H3K27me3 and active mark H3K36me3 on the gene body) in six different cell types. We assessed for enrichment of GPCR drug targets among genes with monoallelic expression using permutation test.

Fraction of receptor length with a polymorphism or population with variant receptor

For each of the GPCR drug targets we calculated: (i) the ratio of receptor length with missense variation in a known functional sites per GPCR drug target using the ExAC data (Table S5) and (ii) the fraction of affected individuals in the human population (n = 2,504; based on the 1000 Genomes Project dataset, Table S5). The fraction of affected individuals was calculated using four different criteria by considering individuals who have a variation in (i) known functional sites in both alleles (homozygous), (ii) known functional sites in at least one allele (i.e., homozygous and heterozygous), (iii) known or putative functional sites in both alleles (homozygous), and (iv) known or putative functional sites in at least one allele (i.e., homozygous and heterozygous). Known functional sites include ligand binding, effector binding, post-translational modification site, sodium binding site and micro-switches. Putative functional sites include those predicted to be deleterious based on SIFT or PolyPhen.

Drug scores for prioritization

We developed a score for each FDA-approved drug (n = 475) to rank them based on how likely they are to manifest altered response due to the prevalence of known functional site variability of its target(s) in the human population. The drug score (Table S6) based on the fraction of known functional site that are polymorphic in a drug target using the ExAC data was calculated by: The drug score (Table S6) based on prevalence of affected individual (i.e., 1000 Genomes Project) was calculated by: The fraction of affected individuals was calculated using four different criteria by considering individuals who have a variation in (i) known functional sites in both alleles (homozygous), which is the most conservative, (ii) known functional sites in at least one allele (i.e., homozygous and heterozygous), (iii) known or putative functional sites in both alleles (homozygous), and (iv) known or putative functional sites in at least one allele (i.e., homozygous and heterozygous), which is the least conservative.

National Health Service prescription data

Every month, the National Health Service (NHS) in the UK publishes anonymised data about the drugs prescribed by general practitioners. National Health Service data were retrieved from openprescribing.net (DataLab-EBM, 2017) (08/2017) for the list of drugs targeting GPCRs and mapped back to their reported target of therapeutic action. From the 475 queried FDA-approved drugs, data were available for 279 drugs targeting 92 distinct GPCRs (not all FDA-approved drugs are prescribed in the UK due to alternative treatments). Items are the number of times the drug appeared on a prescription form that month (defined by National Health Service Digital as “A prescription item is a single supply of a medicine, dressing or appliance written on a prescription form”). The actual cost is the estimated cost to the National Health Service, which is usually lower than Net Ingredient Cost (“the basic price of a drug, i.e. the price listed in the Drug Tariff or price lists”). Openprescribing.net provides the actual cost by subtracting the average percentage discount per item received by pharmacists based on the previous month from the Net Ingredient Cost, but adding in the value of a container allowance for each prescription item (DataLab-EBM, 2017). Total National Health Service spending was aggregated over each month of 2016 and compared with the total GPCR-targeting drugs to calculate the share for the GPCR targeting drugs (Figure 1). Indications were grouped according to the British National Formulary (BNF), which is a reference book containing the standard list of medicines prescribed in the UK and also includes information on indications, dosage and side effects.

Estimation of economic burden

The economic burden estimate was calculated using the following formula:where: The average National Health Service cost is the average yearly cost over a 4-year period (2013-2016) per GPCR targeting drug that is listed (n = 279). 2012 and 2017 have partial sales data and were not considered. % Individuals is the percentage of affected individuals with a missense variant in a functional site of the respective drug target(s) (n = 2,504 individuals from 1000 Genomes Project genotype data as a representative for the UK population; this data includes non-Caucasian populations as well) (Table S5). The % of affected individuals was calculated using four different criteria by considering individuals who have a variation in (i) known functional sites in both alleles (homozygous), which is the most conservative, (ii) known functional sites in at least one allele (i.e., homozygous and heterozygous), (iii) known or putative functional sites in both alleles (homozygous), and (iv) known or putative functional sites in at least one allele (i.e., homozygous and heterozygous), which is the least conservative. Known functional sites include ligand binding, effector binding, post-translational modification site, sodium binding site and micro-switches. Putative functional site include those predicted to be deleterious based on SIFT or PolyPhen (see above). More specifically, for each drug we collected the respective targets and computed economic burden using the following four criteria above: considering (i) % individuals with homozygous alleles in known functional sites, (ii) % individuals with at least one variant allele in a known functional site, (iii) % individuals with homozygous alleles in known or putative functional sites and (iv) % individuals with at least one variant allele in a known or putative functional sites. For these estimates, we have incorporated the following considerations (below). The economic burden estimates will vary if one scales/factors these variables differently: We have considered that each prescription (National Health Service data) is made for a unique individual, due to patient anonymity. Furthermore, information about the dose per prescription, and how this has been altered based on patient response is not explicitly modeled. The effect of known and putative site polymorphisms as well as homozygous/heterozygous conditions are all treated the same way. One could also obtain estimates by weighing these variables differently on a case-by-case basis for each receptor/drug. The focus has been prescription only from GPs. There might be significant additions to the economic burden if one also considers hospital prescriptions. We used the data from 1000 Genomes Project as representative of the UK population, which may vary depending on the receptor. We have not explicitly modeled the age, gender, nature of illness (chronic v/s short-term) and mutations in non-coding regions, which may affect expression level of drug targets.

Interaction interface, functional, and structural site assignment

Post-translational modification sites

We obtained publicly available experimental data on post-translational modification sites for the GPCR drug targets (including Ubiquitylation, Phosphorylation, Palmitoylation, Hydroxylation, N/O-linked Glycosylation, Sulfation, Prenylation, ADP-ribosylation, Methylation, Sumoylation, Acetylation, Disulfide bond and S-Nitrosylation) from dbPTM (Huang et al., 2016). Additionally, all post-translational modification sites from PhosphoSitePlus (03/2017) from low-throughput experimental techniques, proteomic mass spectrometry and shotgun proteomic experiments were collected and combined with the dbPTM dataset (Hornbeck et al., 2015) (Table S4). Each post-translational modification site was cross-validated for identical amino acids in both dbPTM and its corresponding GPCRdb wild-type amino acid. For each receptor, we then obtained the post-translational modification sites with reported missense variation.

Generic transmembrane ligand-interacting sites

Ligand interaction sites were extracted from all 196 available receptor-ligand crystal structure complexes featuring 43 unique GPCR ligand-receptor complexes from Chordate organisms and higher (excluding organisms such as Herpesvirus 5 and Todarodes pacificus). The union of ligand-interacting positions were aggregated for each of the 42 unique crystallized receptors including aromatic, polar, and hydrophobic interactions within 4.5 Å of the co-crystallized ligand (Pándy-Szekeres et al., 2017, Munk et al., 2016b). This led to a total set of 457 ligand-binding residues. Ligand binding site definition for receptors without structural information have been inferred from the receptor family level (based on NC-IUPHAR receptor family nomenclature obtained from GPCRdb) for which structural information was available (e.g., dopamine D5 receptor ligand-binding site is inherited from dopamine D3 receptor). To infer the ligand binding site from the crystallized receptor, we used only positions that have a generic numbering in the GPCRdb numbering scheme (Isberg et al., 2015), which excludes ligand-interacting positions for most of the loops (three ECL2 positions, including 34x52 at the top of the 7TM ligand cavity, are included). For ortholog receptors (e.g., mouse, rat, bovine), we inferred generic positions only if the corresponding human residue is identical to the ortholog receptor. This allowed ligand-binding pocket characterization for 24 families and a total of 139 receptors. We provide annotation as to whether a position is a known ligand-binding site (LB) from a crystallized structure or an inferred one from a related structure of the same receptor family in Table S4.

Arrestin binding interface

All available GPCR-arrestin complexes (PDB: 4ZWJ, PDB: 4PXF, PDB: 5DGY, all are Class A rhodopsin structures) were retrieved from the PDB and visualized using PyMol. The inter GPCR-arrestin residue contact network (RCN) for these structures was computed using Arpeggio (Jubb et al., 2017) with the maximum range of interaction set to 4.5 Å. The union of all interacting residue positions was taken and corresponding residues (with same GPCRdb numbers) were retrieved from the GPCRdb WebServices. This led to the identification of 29 arrestin contacting positions that were used for the effector interaction site analysis. Arrestin residue positions were applied to other classes by their corresponding generic position (see Use of common residue numbering systems to compare GPCR positions).

G protein binding interface

The inter GPCR-G protein residue contact network was generated as described for arrestin. For the class A interface, the β2AR-Gs (PDB: 3SN6), A2A-Gsmini (PDB: 5G53) and Rhodopsin-Gt-C-peptide (PDB: 3DQB, PDB: 3PQR, PDB: 4A4M) were used. This led to the identification of a total of 33 G-protein interacting positions on the receptor. The class B GPCR-G protein interface was calculated using the Calcitonin Receptor-Gs (PDB: 5UZ7) and the GLP-1-Gs (PDB: 5VAI) structures. For class C and class F receptors, the union of both class A and B interface positions were considered as the G protein coupling positions.

Microswitches

Residues involved in mediating the conformational transition during activation (microswitches) were obtained from the literature (Trzaskowski et al., 2012). These include: D/E3x49, K/Y7x43, R3x50, F5x47, Y5x58, E6x30, T6x34, W6x48, P6x50, N7x49, P7x50, Y7x53, I3x40. Additionally, we considered the “P-I-F” motif (P5x50, I3x40 and F6x44) as an important motif for conformational rearrangements upon receptor activation (Wacker et al., 2013). The positions 3x46, 6x37 and 7x53 were also considered as microswitches due to their importance in structural rearrangements (Venkatakrishnan et al., 2016). Micro-switches were only considered for Class A receptors.

Sodium ion pocket

Residue positions (n = 15) of the sodium pocket were extracted from Katritch et al. (Katritch et al., 2014) and comprised of N1x50, V1x53, L2x46, A2x47, A2x49, D2x50, S3x39, L3x43, F6x44, W6x48, N7x45, S7x46, N7x49, P7x50, Y7x53. The Na+ pocket was only considered for Class A receptors.

Structural site assignments

Segments of structural sites were extracted using class-specific multiple sequence alignments. Helix segments were assigned based on a residue independent generic numbering position as described in the GPCRdb (e.g., 1x50 - > TM1). Residues falling between transmembrane regions were assigned to their respective loop region (extracellular and intracellular loops 1-3). Residues before TM1 and after Helix 8 were assigned C-terminal and N-terminal, respectively.

Pharmacological validation

Selection criteria for in vitro variants

We performed functional analysis and pharmacological validation of mutations in two GPCRs for the ligand binding (OPRM) and G protein binding (CCKAR). We selected the μ-opioid receptor, due to its importance in analgesia, physical dependence and respiratory depression. Additionally, μ-opioid receptor is targeted by nearly 40 FDA-approved drugs and is one of the highly polymorphic receptor. Specifically, a structural analysis identified 5 variants of potentially damaging or altering effects upon drug binding. These include M153V3x36, K235M5.39x40, K235N5.39x40, V238I5.42x43 and V302I6x55. For the Cholecystokinin A receptor (CCKAR) we investigated the effect of polymorphisms at the G protein binding interface, which is known to couple to several G proteins to produce its physiological effects (Dufresne et al., 2006). We selected 7 variants at positions, which are potentially important to interact with the G protein, that were predicted to be deleterious in SIFT and PolyPhen and all mutations changed their residue properties (see Genetic variation data). These include R139I3x50, Q148E34x52, R150Q34x54, R150W34x54, E243K5x72, N304H6x26, V311E6x33. Experimental outcome of all selected variants is provided in Figure S5.

Genetic constructs

Human wild-type and mutant receptors were synthesized and cloned into pcDNA3.1(+) by GenScript. pCMV5 plasmids encoding Gαo, Gαq, and Gαs were gifts from Dr. Hiroshi Itoh (Nara Institute of Science and Technology, Japan). Plasmids encoding Venus 156-239-Gγ1, and Venus 1-155-Gγ2 were gifts from Dr. Nevin A. Lambert (Augusta University, USA) (Hollins et al., 2009). masGRK3ct-Nluc construct were generated as reported previously (Masuho et al., 2015a). PTX-S1 in mammalian expression vector was kindly provided by Dr. Eitan Reuveny (Weizmann Institute of Science, Israel) (Raveh et al., 2010).

Transient transfection

We coated 6-cm culture dishes during incubation for 10 min at 37°C with 2.5 mL of Matrigel solution (approximately 10 μg/mL growth factor-reduced Matrigel (BD Biosciences) in culture medium). Cells were seeded into the 6-cm dishes containing Matrigel solution at a density of 4 × 106 cells/dish. After 4 hr, expression constructs (total 10 μg/dish) were transfected into the cells using PLUS (10 μl/dish) and Lipofectamine LTX (12 μl/dish) reagents. Venus 156-239-Gγ1 (0.42 μg), Venus 1-155-Gγ2 (0.42 μg), and masGRK3ct-Nluc (0.42 μg) was transfected with different amount of Gα and receptor constructs. μ-opioid receptor (2.52 μg), cholecystoninin A receptor (2.52 μg), GαoA (0.42 μg), Gαq (0.84 μg), Gαs (2.52 μg), or Gα13 (1.68 μg) was used. Empty vector pcDNA3.1(+) was used to normalize the total amount of transfected plasmid DNA. PTX-S1 (0.42 μg) was transfected with Gαq, Gαs, and Gα13 to inhibit possible coupling to endogenous Gα belonging to Gi/o subfamily.

BRET assay for monitoring G protein activity

BRET experiments were performed as previously reported with slight modifications (Masuho et al., 2015a, Masuho et al., 2015b). Sixteen to twenty-four hours post-transfection, HEK293T/17 cells were washed once with BRET buffer (PBS containing 0.5 mM MgCl2 and 0.1% glucose) and detached by pipetting with BRET buffer gently. Cells were harvested with centrifugation at 500 g for 5 min and resuspended in BRET buffer. Approximately 50,000 to 100,000 cells per well were distributed in 96-well flat-bottomed white microplates (Greiner Bio-One). The Nluc substrate, furimazine, were purchased from Promega and used according to the manufacturer’s instruction. BRET measurements were performed using a microplate reader (POLARstar Omega or PHERAstar FSX, BMG Labtech) equipped with two emission photomultiplier tubes. All measurements were performed at room temperature. The BRET signal is determined by calculating the ratio of the light emitted by Venus-Gβ1γ2 (535 nm with a 30-nm band path width) over the light emitted by masGRK3ct-Nluc (475 nm with a 30-nm band path width). The average baseline value recorded before agonist stimulation was subtracted from BRET signal values, and the resulting difference (ΔBRET ratio) was plotted as traces. The transiently transfected cells were stimulated by endomorphin-1 (100 μM), morphine (100 μM), buprenorphine (10 μM), naloxone (100 μM) for μ-opioid receptor or caerulein (30 μM) for CCKAR at 5 s. No agonist-induced activation of G proteins was detected from the negative control that omitted CCKAR. Each trace represents the mean of the responses measured in three or six wells. The maximal value recorded upon agonist stimulation was reported as RMax. The activation rate constants (kON) were obtained by fitting a single exponential curve to the traces with Clampfit ver. 10.3 software (Molecular Devices).

Quantification and Statistical Analysis

Statistical analysis to analyze the experimental data was performed with GraphPad Prism 6. Statistical parameters are reported in Figure Legends. All graphs pertaining to the experimental data were made with SigmaPlot 12.5.

Estimation of statistical significance

Enrichment of receptors with allele-specific expression

Enrichment of GPCRs among genes with monoallelic expression was assessed with permutation tests by performing 100,000 randomizations. In each randomization, each GPCR was replaced with a random gene and the number of such randomly obtained genes that overlapped with genes with monoallelic expression was noted. From the random distribution, we computed the Z-score, which captures the distance of the actual number of observations (e.g., GPCRs with monoallelic expression) to the mean of random expectation in terms of the number of standard deviations. We estimated p value as the ratio of the number of simulations where the random observations were greater than or equal to the number of actually observed values to the total number of randomizations (100,000).

Enrichment of MVs in known functional site

To assess for enrichment of MVs affecting known functional sites, we obtained a list of random sites with similar number of sites with MV for all GPCRs and then counted the number of random sites that overlapped with known functional sites. We performed 100,000 such randomizations to obtain a distribution of random expectations. Z-score and p values were estimated as described above.

Statistical significance for overlap between known and putative functional sites

For assessing whether the overlap between known and putative functional sites (deciphered from SIFT and PolyPhen predictions) was significantly higher than expected by chance, we randomly replaced each known functional site with a random site with a missense mutation in GPCRs and noted how many of these random mutant sites overlapped with the putative functional sites. We performed 100,000 such randomizations and obtained a distribution of expected overlap by chance. Z-score and p values were estimated as described above.

Statistical significance for putative functional sites in different structural segments

We estimated statistical significance of differences in the distribution of putative functional sites and uncharacterized sites in different structural segments of GPCRs using the non-parametric Wilcoxon rank sum test (Figures S4C; absolute counts and S4D; normalized for segment length).

Data and Software Availability

Data and code availability

All relevant data are integrated into the web resource in GPCRdb (Pándy-Szekeres et al., 2017) (www.gpcrdb.org) and can also be obtained from GitHub (https://github.com/protwis/gpcrdb_data). All data that support the findings of this study have been provided as Tables S1, S2, S3, S4, S5, S6, and S7 as were deposited in Mendeley Data (https://doi.org/10.17632/pr5v9t8z36.1). The open-source code can be obtained from GitHub (https://github.com/protwis/protwis). In-house written scripts can be obtained from GitHub (https://github.com/AlexanderHauser/GPCR-Pharmacogenomics). For specific data/script requests please contact the lead contact.

Additional Resources

Missense variation mapping of specific receptors

To analyze and characterize the extent of genetic variation, we incorporated all missense variations and functional annotations into the framework of the GPCRdb (http://www.gpcrdb.org/) that allows researchers to map all missense variants for a selected receptor, using NC-IUPHAR receptor nomenclature, onto a snake–like diagram and helix-boxplot of the selected receptor residue topologies. Further, information regarding amino acid substitution, allele counts, allele frequencies, functional annotations, SIFT/PolyPhen scores and heterozygosity for each individual position is provided (www.gpcrdb.org/mutational_landscape/). A statistic page is presented for an overview of genetic variation in GPCR drug targets (www.gpcrdb.org/mutational_landscape/statistics). Possible economic impact of drug target variability can be analyzed and estimated for individual drugs and different groups of drugs via http://www.gpcrdb.org/mutational_landscape/economicburden. The National Health Service spending on each drug can be obtained from the drug-receptor pairing page, which includes specific information of the drug such as indication, target family, target category (primary/secondary), status, drug type, mechanism of action and references (www.gpcrdb.org/drugs/drugbrowser). The natural variation dataset can be downloaded in full, accessed via an extensive API (http://gpcrdb.org/services/reference/), or searched and browsed via a web interface.

Author Contributions

A.S.H. and M.M.B. designed the project, analyzed the data, interpreted the results, and wrote the manuscript. All authors read and provided inputs on the manuscript. A.S.H. compiled the datasets, wrote scripts, implemented and performed the calculations, made figures, and developed the web services. S.C. was involved in the compilation of variation data, permutation test, and calculation of allele-specific expression and making the relevant figures. L.J.J. curated clinical associations. I.M. and K.A.M. performed the experiments, analyzed the data, and contributed to writing the experimental section of the manuscript. D.E.G. provided critical inputs on the project, manuscript writing, and data analysis. M.M.B. initiated, managed, and set the direction of research for the project.

REAGENT or RESOURCE	SOURCE	IDENTIFIER
Chemicals, Peptides, and Recombinant Proteins

Endomorphin-1	Tocris	Cat# 1055
Dynorphin A	Abcam	Cat# ab120412
Morphine	Sigma	Cat# M8777
Buprenorphine	Sigma	Cat# B9275
Naloxone	Tocris	Cat# 0599
Caerulein	Toronto Research Chemicals	Cat# C070000
Nano-Glo Luciferase Assay Substrate (furimazine)	Promega	Cat# N1120

Experimental Models: Cell Lines

Human: HEK293T/17 cell	ATCC	Cat# CRL-11268

Recombinant DNA

μ-opioid receptor wt	This paper	N/A
μ-opioid receptor M153V	This paper	N/A
μ-opioid receptor K235M	This paper	N/A
μ-opioid receptor K235N	This paper	N/A
μ-opioid receptor V238I	This paper	N/A
μ-opioid receptor V302I	This paper	N/A
CCKAR wt	This paper	N/A
CCKAR R139I	This paper	N/A
CCKAR Q148E	This paper	N/A
CCKAR R150Q	This paper	N/A
CCKAR R150W	This paper	N/A
CCKAR E234K	This paper	N/A
CCKAR N304H	This paper	N/A
CCKAR V311E	This paper	N/A
Gαo	Hiroshi Itoh	N/A
Gαq	Hiroshi Itoh	N/A
Gαs	Hiroshi Itoh	N/A
Gα13	cDNA Resource Center	Cat# GNA1300001
Venus 156-239 Gβ1	PMID: 19258039	N/A
Venus 1-155 Gγ2	PMID: 19258039	N/A
masGRK3ct-Nluc	PMID: 26628681	N/A
PTX-S1	PMID: 21111235	N/A

Software and Algorithms

GPCRdb	Pándy-Szekeres et al., 2017	https://github.com/protwis/protwis
Python v2.7.13 and v3.4.3	Python Software Foundation	https://www.python.org/; RRID: SCR_008394
R v3.4.1	R Core Team	https://www.r-project.org/
Pandas v0.20.3	Wes McKinney	http://pandas.pydata.org/
Engineered in vitro mutations	Munk et al., 2016b	http://www.gpcrdb.org/; RRID: SCR_007419
PyMol v1.8.0.6	Schrodinger	http://pymol.org/2/; RRID: SCR_000305
Arpeggio	Jubb et al., 2017	http://biosig.unimelb.edu.au/arpeggioweb/
SigmaPlot v12.5	Systat Software	http://www.sigmaplot.co.uk/
Prism v6	GraphPad	https://www.graphpad.com/scientific-software/prism/
ChemDraw Professional v16.0	PerkinElmer	http://www.cambridgesoft.com/
Clampfit v10.3	Molecular Devices	http://mdc.custhelp.com/

Other

GPCRdb resource website for the GPCR Pharmacogenomics publication	This paper	https://github.com/AlexanderHauser/protwis/tree/SNP
Custom scripts for analysis	This paper	https://github.com/AlexanderHauser/GPCR-Pharmacogenomics

Deposited Data

Mendeley dataset	This paper	https://doi.org/10.17632/pr5v9t8z36.1

68 in total

1. Nonenzymatic rapid control of GIRK channel function by a G protein-coupled receptor kinase.

Authors: Adi Raveh; Ayelet Cooper; Liora Guy-David; Eitan Reuveny
Journal: Cell Date: 2010-11-24 Impact factor: 41.582

Review 2. Pharmacogenomics in the clinic.

Authors: Mary V Relling; William E Evans
Journal: Nature Date: 2015-10-15 Impact factor: 49.962

Review 3. Trends in GPCR drug discovery: new agents, targets and indications.

Authors: Alexander S Hauser; Misty M Attwood; Mathias Rask-Andersen; Helgi B Schiöth; David E Gloriam
Journal: Nat Rev Drug Discov Date: 2017-10-27 Impact factor: 84.694

4. Comparison of predicted and actual consequences of missense mutations.

Authors: Lisa A Miosge; Matthew A Field; Yovina Sontani; Vicky Cho; Simon Johnson; Anna Palkova; Bhavani Balakishnan; Rong Liang; Yafei Zhang; Stephen Lyon; Bruce Beutler; Belinda Whittle; Edward M Bertram; Anselm Enders; Christopher C Goodnow; T Daniel Andrews
Journal: Proc Natl Acad Sci U S A Date: 2015-08-12 Impact factor: 11.205

5. Crystal structure of the β2 adrenergic receptor-Gs protein complex.

Authors: Søren G F Rasmussen; Brian T DeVree; Yaozhong Zou; Andrew C Kruse; Ka Young Chung; Tong Sun Kobilka; Foon Sun Thian; Pil Seok Chae; Els Pardon; Diane Calinski; Jesper M Mathiesen; Syed T A Shah; Joseph A Lyons; Martin Caffrey; Samuel H Gellman; Jan Steyaert; Georgios Skiniotis; William I Weis; Roger K Sunahara; Brian K Kobilka
Journal: Nature Date: 2011-07-19 Impact factor: 49.962

6. The BioMart community portal: an innovative alternative to large, centralized data repositories.

Authors: Damian Smedley; Syed Haider; Steffen Durinck; Luca Pandini; Paolo Provero; James Allen; Olivier Arnaiz; Mohammad Hamza Awedh; Richard Baldock; Giulia Barbiera; Philippe Bardou; Tim Beck; Andrew Blake; Merideth Bonierbale; Anthony J Brookes; Gabriele Bucci; Iwan Buetti; Sarah Burge; Cédric Cabau; Joseph W Carlson; Claude Chelala; Charalambos Chrysostomou; Davide Cittaro; Olivier Collin; Raul Cordova; Rosalind J Cutts; Erik Dassi; Alex Di Genova; Anis Djari; Anthony Esposito; Heather Estrella; Eduardo Eyras; Julio Fernandez-Banet; Simon Forbes; Robert C Free; Takatomo Fujisawa; Emanuela Gadaleta; Jose M Garcia-Manteiga; David Goodstein; Kristian Gray; José Afonso Guerra-Assunção; Bernard Haggarty; Dong-Jin Han; Byung Woo Han; Todd Harris; Jayson Harshbarger; Robert K Hastings; Richard D Hayes; Claire Hoede; Shen Hu; Zhi-Liang Hu; Lucie Hutchins; Zhengyan Kan; Hideya Kawaji; Aminah Keliet; Arnaud Kerhornou; Sunghoon Kim; Rhoda Kinsella; Christophe Klopp; Lei Kong; Daniel Lawson; Dejan Lazarevic; Ji-Hyun Lee; Thomas Letellier; Chuan-Yun Li; Pietro Lio; Chu-Jun Liu; Jie Luo; Alejandro Maass; Jerome Mariette; Thomas Maurel; Stefania Merella; Azza Mostafa Mohamed; Francois Moreews; Ibounyamine Nabihoudine; Nelson Ndegwa; Céline Noirot; Cristian Perez-Llamas; Michael Primig; Alessandro Quattrone; Hadi Quesneville; Davide Rambaldi; James Reecy; Michela Riba; Steven Rosanoff; Amna Ali Saddiq; Elisa Salas; Olivier Sallou; Rebecca Shepherd; Reinhard Simon; Linda Sperling; William Spooner; Daniel M Staines; Delphine Steinbach; Kevin Stone; Elia Stupka; Jon W Teague; Abu Z Dayem Ullah; Jun Wang; Doreen Ware; Marie Wong-Erasmus; Ken Youens-Clark; Amonida Zadissa; Shi-Jian Zhang; Arek Kasprzyk
Journal: Nucleic Acids Res Date: 2015-04-20 Impact factor: 16.971

7. A global reference for human genetic variation.

Authors: Adam Auton; Lisa D Brooks; Richard M Durbin; Erik P Garrison; Hyun Min Kang; Jan O Korbel; Jonathan L Marchini; Shane McCarthy; Gil A McVean; Gonçalo R Abecasis
Journal: Nature Date: 2015-10-01 Impact factor: 49.962

8. denovo-db: a compendium of human de novo variants.

Authors: Tychele N Turner; Qian Yi; Niklas Krumm; John Huddleston; Kendra Hoekzema; Holly A F Stessman; Anna-Lisa Doebley; Raphael A Bernier; Deborah A Nickerson; Evan E Eichler
Journal: Nucleic Acids Res Date: 2016-10-05 Impact factor: 16.971

Review 9. A comprehensive map of molecular drug targets.

Authors: Rita Santos; Oleg Ursu; Anna Gaulton; A Patrícia Bento; Ramesh S Donadi; Cristian G Bologa; Anneli Karlsson; Bissan Al-Lazikani; Anne Hersey; Tudor I Oprea; John P Overington
Journal: Nat Rev Drug Discov Date: 2016-12-02 Impact factor: 84.694

10. Genetic variants affecting equivalent protein family positions reflect human diversity.

Authors: Francesco Raimondi; Matthew J Betts; Qianhao Lu; Asuka Inoue; J Silvio Gutkind; Robert B Russell
Journal: Sci Rep Date: 2017-10-06 Impact factor: 4.379

163 in total

1. Coupling of Human Rhodopsin to a Yeast Signaling Pathway Enables Characterization of Mutations Associated with Retinal Disease.

Authors: Benjamin M Scott; Steven K Chen; Nihar Bhattacharyya; Abdiwahab Y Moalim; Sergey V Plotnikov; Elise Heon; Sergio G Peisajovich; Belinda S W Chang
Journal: Genetics Date: 2018-12-04 Impact factor: 4.562

2. D₂ Dopamine Receptor G Protein-Biased Partial Agonists Based on Cariprazine.

Authors: Yudao Shen; John D McCorvy; Michael L Martini; Ramona M Rodriguiz; Vladimir M Pogorelov; Karen M Ward; William C Wetsel; Jing Liu; Bryan L Roth; Jian Jin
Journal: J Med Chem Date: 2019-04-18 Impact factor: 7.446

3. Naturally Occurring Missense MRGPRX2 Variants Display Loss of Function Phenotype for Mast Cell Degranulation in Response to Substance P, Hemokinin-1, Human β-Defensin-3, and Icatibant.

Authors: Ibrahim Alkanfari; Kshitij Gupta; Tahsin Jahan; Hydar Ali
Journal: J Immunol Date: 2018-05-23 Impact factor: 5.422

4. A cell-cell interaction format for selection of high-affinity antibodies to membrane proteins.

Authors: Zhuo Yang; Yue Wan; Pingdong Tao; Min Qiang; Xue Dong; Chih-Wei Lin; Guang Yang; Tianqing Zheng; Richard A Lerner
Journal: Proc Natl Acad Sci U S A Date: 2019-07-08 Impact factor: 11.205

5. Illuminating G-Protein-Coupling Selectivity of GPCRs.

Authors: Asuka Inoue; Francesco Raimondi; Francois Marie Ngako Kadji; Gurdeep Singh; Takayuki Kishi; Akiharu Uwamizu; Yuki Ono; Yuji Shinjo; Satoru Ishida; Nadia Arang; Kouki Kawakami; J Silvio Gutkind; Junken Aoki; Robert B Russell
Journal: Cell Date: 2019-05-31 Impact factor: 41.582

6. Discovery of Amantamide, a Selective CXCR7 Agonist from Marine Cyanobacteria.

Authors: Xiao Liang; Danmeng Luo; Jia-Lei Yan; Mohammad A Rezaei; Lilibeth A Salvador-Reyes; Sarath P Gunasekera; Chenglong Li; Tao Ye; Valerie J Paul; Hendrik Luesch
Journal: Org Lett Date: 2019-02-19 Impact factor: 6.005

7. Benchmarking GPCR homology model template selection in combination with de novo loop generation.

Authors: Gregory L Szwabowski; Paige N Castleman; Chandler K Sears; Lee H Wink; Judith A Cole; Daniel L Baker; Abby L Parrill
Journal: J Comput Aided Mol Des Date: 2020-07-31 Impact factor: 3.686

8. Mechanism of Hormone Peptide Activation of a GPCR: Angiotensin II Activated State of AT₁R Initiated by van der Waals Attraction.

Authors: Khuraijam Dhanachandra Singh; Hamiyet Unal; Russell Desnoyer; Sadashiva S Karnik
Journal: J Chem Inf Model Date: 2019-01-16 Impact factor: 4.956

9. Structure and Dynamics of Adrenomedullin Receptors AM₁ and AM₂ Reveal Key Mechanisms in the Control of Receptor Phenotype by Receptor Activity-Modifying Proteins.

Authors: Yi-Lynn Liang; Matthew J Belousoff; Madeleine M Fletcher; Xin Zhang; Maryam Khoshouei; Giuseppe Deganutti; Cassandra Koole; Sebastian G B Furness; Laurence J Miller; Debbie L Hay; Arthur Christopoulos; Christopher A Reynolds; Radostin Danev; Denise Wootten; Patrick M Sexton
Journal: ACS Pharmacol Transl Sci Date: 2020-03-20

10. GPCR-PEnDB: a database of protein sequences and derived features to facilitate prediction and classification of G protein-coupled receptors.

Authors: Khodeza Begum; Jonathon E Mohl; Fredrick Ayivor; Eder E Perez; Ming-Ying Leung
Journal: Database (Oxford) Date: 2020-11-20 Impact factor: 3.451