Jie Zheng, Valeriia Haberland, Denis Baird, Venexia Walker, Philip C Haycock, Mark R Hurle, Alex Gutteridge, Pau Erola, Yi Liu, Shan Luo, Jamie Robinson, Tom G Richardson, James R Staley, Benjamin Elsworth, Stephen Burgess, Benjamin B Sun, John Danesh, Heiko Runz, Joseph C Maranville, Hannah M Martin, James Yarmolinsky, Charles Laurin, Michael V Holmes, Jimmy Z Liu, Karol Estrada, Rita Santos, Linda McCarthy, Dawn Waterworth, Matthew R Nelson, George Davey Smith, Adam S Butterworth, Gibran Hemani, Robert A Scott, Tom R Gaunt.
Abstract
The human proteome is a major source of therapeutic targets. Recent genetic association analyses of the plasma proteome enable systematic evaluation of the causal consequences of variation in plasma protein levels. Here we estimated the effects of 1,002 proteins on 225 phenotypes using two-sample Mendelian randomization (MR) and colocalization. Of 413 associations supported by evidence from MR, 130 (31.5%) were not supported by results of colocalization analyses, suggesting that genetic confounding due to linkage disequilibrium is widespread in naïve phenome-wide association studies of proteins. Combining MR and colocalization evidence in cis-only analyses, we identified 111 putatively causal effects between 65 proteins and 52 disease-related phenotypes (https://www.epigraphdb.org/pqtl/). Evaluation of data from historic drug development programs showed that target-indication pairs with MR and colocalization support were more likely to be approved, evidencing the value of this approach in identifying and prioritizing potential therapeutic targets.
Year: 2020 PMID: 32895551 PMCID: PMC7610464 DOI: 10.1038/s41588-020-0682-6
Source DB: PubMed Journal: Nat Genet ISSN: 1061-4036 Impact factor: 38.330
Figure 1. Study design of this phenome-wide MR study of the plasma proteome.
The study included instrument selection and validation, outcome selection, four types of MR analyses, colocalization, sensitivity analyses, and drug target validation.
Figure 2. A demonstration of pairwise conditional and colocalization (PWCoCo) analysis.
Assume there are two conditionally independent pQTL association signals (SNP 1 and SNP 2) and two conditionally independent outcome signals (SNP 1 and SNP 3) in the tested region. A naïve colocalization analysis using marginal association statistics will return weak evidence of colocalization (shown in regional plots A and D). By conducting the analyses conditioning on SNP 2 (plot B) and SNP 1 (plot C) for the pQTLs and conditioning on SNP 1 (plot E) and SNP 3 (plot F) for the outcome phenotype, each of the nine pairwise combinations of pQTL and outcome association statistics (represented as lines with different colors in the middle of this figure) will be tested using colocalization. In this case, the combination of plot B and plot E shows evidence of colocalization but the remaining eight do not.
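The nine pairwise combinations described in the caption can be enumerated with a short sketch. The dictionary keys follow the plot letters in the caption; the variable and function names are hypothetical, and no colocalization test is run here, only the pairing step.

```python
from itertools import product

# Three pQTL association datasets, labelled by plot letter as in the caption.
pqtl_datasets = {
    "A": "pQTL, marginal statistics",
    "B": "pQTL, conditioned on SNP 2",
    "C": "pQTL, conditioned on SNP 1",
}

# Three outcome association datasets, labelled by plot letter as in the caption.
outcome_datasets = {
    "D": "outcome, marginal statistics",
    "E": "outcome, conditioned on SNP 1",
    "F": "outcome, conditioned on SNP 3",
}


def pairwise_combinations(pqtl, outcome):
    """Return every (pQTL plot, outcome plot) pair to be tested for colocalization."""
    return list(product(pqtl, outcome))


pairs = pairwise_combinations(pqtl_datasets, outcome_datasets)
for pqtl_key, outcome_key in pairs:
    print(f"{pqtl_key} x {outcome_key}: "
          f"{pqtl_datasets[pqtl_key]} vs {outcome_datasets[outcome_key]}")
```

Each pair would then be passed to a colocalization routine; in the caption's example only the B x E pair shows evidence of colocalization.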
Figure 3. Miami plot for the cis-only analysis, with circles representing the MR results for proteins on human phenotypes.
The labels refer to top MR findings with colocalization evidence, with each protein represented by one label. The color refers to top MR findings with P < 3.09 × 10⁻⁷, where red refers to immune-mediated phenotypes, blue refers to cardiovascular phenotypes, green refers to lung-related phenotypes, purple refers to bone phenotypes, orange refers to cancers, yellow refers to glycemic phenotypes, brown refers to psychiatric phenotypes, pink refers to other phenotypes and grey refers to phenotypes that showed less evidence of colocalization. The x-axis is the chromosome and position of each MR finding in the cis region. The y-axis is the -log10 P value of the MR findings. MR findings with positive effects (increased level of proteins associated with increasing the phenotype level) are represented by filled circles on the top of the Miami plot, while MR findings with negative effects (decreased level of proteins associated with increasing the phenotype level) are on the bottom of the Miami plot.
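As a rough illustration of how a point is placed on such a Miami plot, the sketch below converts an MR P value and effect direction into a signed y-coordinate. The function names are hypothetical, and placing a zero effect on the top half is an arbitrary assumption; only the threshold value comes from the caption.

```python
import math

# Significance threshold quoted in the caption for "top" MR findings.
TOP_FINDING_P = 3.09e-7


def miami_y(p_value, effect):
    """Signed height: +(-log10 P) for non-negative effects, -(-log10 P) otherwise."""
    if not 0 < p_value <= 1:
        raise ValueError("p_value must be in (0, 1]")
    height = -math.log10(p_value)
    return height if effect >= 0 else -height


def is_top_finding(p_value):
    """True when the MR P value passes the caption's threshold."""
    return p_value < TOP_FINDING_P


# Two illustrative (made-up) findings with the same P value and opposite effects.
print(miami_y(1e-8, 0.5), is_top_finding(1e-8))
print(miami_y(1e-8, -0.5), is_top_finding(1e-8))
```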
Figure 4. Regional association plots of IL23R plasma protein level and Crohn’s disease in the IL23R region.
a, b, Regional plots of IL23R protein level and Crohn’s disease without conditional analysis. The plot in b lists the sets of conditionally independent signals for Crohn’s disease in this region: rs7517847, rs7528924, rs183020189, rs7528804 (a proxy for the second IL23R hit rs3762318, r² = 0.42 in the 1000 Genomes Europeans) and rs11209026 (a proxy for the top IL23R hit rs11581607, r² = 1 in the 1000 Genomes Europeans), conditional P value < 1 × 10⁻⁷. c, Regional plot of IL23R with the joint SNP effects conditioned on the second hit (rs3762318) for IL23R. d, Regional plot of Crohn’s disease with the joint SNP effects adjusted for other independent signals except the top IL23R signal rs11581607. e, Regional plot of IL23R with the joint SNP effects conditioned on the top hit (rs11581607) for IL23R. f, Regional plot of Crohn’s disease with the joint SNP effects adjusted for other independent signals except the second IL23R signal rs3762318. The heatmap of the colocalization evidence for the IL23R association with Crohn’s disease (CD) in the IL23R region is presented in Supplementary Figure 4.
Enrichment analysis comparing target-indication pairs with or without MR and colocalization evidence
| | YES | NO |
|---|---|---|
| YES | 4 | 40 |
| NO | 0 | 147 |
The protein-phenotype association pairs were grouped into four categories: (i) pairs with both MR/colocalization indications of causality and drug trial success; (ii) pairs with MR and colocalization evidence but no drug trial evidence; (iii) pairs with no strong MR or colocalization evidence but with drug trial evidence; and (iv) pairs with no strong MR, colocalization or drug trial evidence. The cut-off for MR evidence was P < 3.5 × 10⁻⁷; the cut-off for colocalization evidence was posterior probability > 80%. The drug trial evidence was obtained from the PharmaProjects database. The MR and colocalization results used in this analysis included both tier 1 and tier 2 instruments in both cis and trans regions. More results comparing MR and trial evidence for cis-only and tier 1 instruments can be found in Supplementary Table 20.
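The grouping rule and the 2×2 table can be illustrated with a short sketch. The cut-offs and the four counts come from the table and its legend; the function names are hypothetical, and the use of a one-sided Fisher's exact test is an assumption, since this record does not state which enrichment test was applied.

```python
from math import comb

MR_P_CUTOFF = 3.5e-7      # MR evidence: P below this value (from the legend)
COLOC_PP_CUTOFF = 0.80    # colocalization: posterior probability above 80%


def has_genetic_support(mr_p, coloc_pp):
    """True when a pair passes both the MR and the colocalization cut-offs."""
    return mr_p < MR_P_CUTOFF and coloc_pp > COLOC_PP_CUTOFF


def fisher_one_sided(a, b, c, d):
    """P(top-left cell >= a) for the table [[a, b], [c, d]] with fixed margins.

    Assumed test: hypergeometric upper tail (one-sided Fisher's exact test).
    """
    row1, row2, col1 = a + b, c + d, a + c
    total = comb(row1 + row2, col1)
    p = 0.0
    for x in range(a, min(row1, col1) + 1):
        p += comb(row1, x) * comb(row2, col1 - x) / total
    return p


# Counts taken from the table above, in reading order.
p_enrichment = fisher_one_sided(4, 40, 0, 147)
print(f"one-sided Fisher P = {p_enrichment:.4g}")
```

Because the upper-tail probability is unchanged when the table is transposed, the result does not depend on which dimension is treated as rows.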
Figure 5. Enrichment of phenome-wide MR of the plasma proteome with the druggable genome.
In this figure, we only show proteins with convincing MR and colocalization evidence with at least one of the 70 phenotypes. The x-axis shows the 70 human phenotypes grouped into 8 categories: 8 autoimmune diseases (red), 3 bone phenotypes (purple), 8 cancers (orange), 12 cardiovascular phenotypes (blue), 4 glycemic phenotypes (yellow), 2 lung phenotypes (green), 4 psychiatric phenotypes (brown), and 29 other phenotypes (pink). The y-axis presents the tiers of the druggable genome (as defined by Finan et al.[39]) for the 120 proteins under analysis, classified into 4 groups based on their druggability: tier 1 contains 23 proteins that are efficacy targets of approved small molecules and biotherapeutic drugs, tier 2 contains 11 proteins closely related to approved drug targets or with associated drug-like compounds, tier 3 contains 58 secreted or extracellular proteins or proteins distantly related to approved drug targets, and 28 proteins have unknown druggable status (Unclassified). The colored cells are protein-phenotype associations with strong MR and colocalization evidence. Cells in green are associations overlapping with the tier 1 druggable genome, while cells in yellow, red or purple are associations with tier 2, tier 3 or unclassified proteins, respectively. More detailed information is shown in Supplementary Table 24.