Andrew D Bretherick, Oriol Canela-Xandri, Peter K Joshi, David W Clark, Konrad Rawlik, Thibaud S Boutin, Yanni Zeng, Carmen Amador, Pau Navarro, Igor Rudan, Alan F Wright, Harry Campbell, Veronique Vitart, Caroline Hayward, James F Wilson, Albert Tenesa, Chris P Ponting, J Kenneth Baillie, Chris Haley.
Abstract
Efficiently transforming genetic associations into drug targets requires evidence that a particular gene, and its encoded protein, contribute causally to a disease. To achieve this, we employ a three-step proteome-by-phenome Mendelian Randomization (MR) approach. In step one, 154 protein quantitative trait loci (pQTLs) were identified and independently replicated. From these pQTLs, 64 replicated locally-acting variants were used as instrumental variables for proteome-by-phenome MR across 846 traits (step two). When its assumptions are met, proteome-by-phenome MR is equivalent to simultaneously running many randomized controlled trials. Step two yielded 38 proteins that significantly predicted variation in traits and diseases in 509 instances. Step three revealed that, among the 271 instances from GeneAtlas (UK Biobank), 77 showed little evidence of pleiotropy (HEIDI) and 92 showed evidence of colocalization (eCAVIAR). Results were wide-ranging, including, for example, new evidence for a causal role of tyrosine-protein phosphatase non-receptor type substrate 1 (SHPS1; SIRPA) in schizophrenia, and a new finding that intestinal fatty acid binding protein (FABP2) abundance contributes to the pathogenesis of cardiovascular disease. We also demonstrated confirmatory evidence for the causal role of four further proteins (FGF5, IL6R, LPL, LTA) in cardiovascular disease risk.
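To make the idea of using a single pQTL as an instrumental variable concrete, the following is a minimal sketch of one common single-instrument MR estimator, the Wald ratio, with a first-order delta-method standard error. The abstract does not state which estimator the authors used, so this is an illustrative assumption; the function name and the numeric inputs are hypothetical.

```python
import math


def wald_ratio(beta_exposure, se_exposure, beta_outcome, se_outcome):
    """Single-instrument MR estimate (Wald ratio) and its approximate SE.

    beta_exposure: effect of the variant on protein level (the pQTL effect)
    beta_outcome:  effect of the same variant on the trait or disease
    The standard error uses a first-order delta-method approximation and
    assumes the two effect estimates are uncorrelated (an assumption).
    """
    if beta_exposure == 0:
        raise ValueError("the instrument must have a non-zero effect on the exposure")
    estimate = beta_outcome / beta_exposure
    variance = (se_outcome ** 2) / beta_exposure ** 2 + (
        beta_outcome ** 2 * se_exposure ** 2
    ) / beta_exposure ** 4
    return estimate, math.sqrt(variance)


# Hypothetical summary statistics, for illustration only.
estimate, se = wald_ratio(beta_exposure=0.5, se_exposure=0.05,
                          beta_outcome=0.1, se_outcome=0.02)
print(f"MR estimate = {estimate:.3f}, SE = {se:.4f}")
```

The estimate is interpretable as a causal effect only when the instrumental-variable assumptions hold, which is why the abstract follows the MR step with pleiotropy (HEIDI) and colocalization (eCAVIAR) checks.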
Year: 2020 PMID: 32628676 PMCID: PMC7337286 DOI: 10.1371/journal.pgen.1008785
Source DB: PubMed Journal: PLoS Genet ISSN: 1553-7390 Impact factor: 5.917
Fig 1. Proteome-by-phenome Mendelian Randomization.
A) Genome-wide associations of the plasma concentrations of 249 proteins from two independent European cohorts (discovery and replication) were calculated. The plot shows pQTL position against chromosomal location of the gene that encodes the protein under study for all replicated pQTLs. The area of a filled circle is proportional to its -log10(p-value) in the replication cohort. Blue circles indicate pQTLs within ±150 kb of the gene (‘local-pQTLs’); red circles indicate pQTLs more than 150 kb from the gene. B, C) Local-pQTLs of 64 proteins were taken forward for proteome-by-phenome MR analysis. These were assessed against 778 outcome phenotypes from GeneAtlas [20] (panel B; UK Biobank) and 68 phenotypes identified using Phenoscanner [21,22] (panel C). In each set of results an FDR of <0.05 was considered significant. D) Heterogeneity in dependent instruments (HEIDI [23]) testing was undertaken for MR significant results from GeneAtlas (n = 271). This test seeks to distinguish a single causal variant at a locus affecting both exposure and outcome directly (as in i) or in a causal chain (as in ii), from two causal variants in linkage disequilibrium (as in iii), one affecting the exposure and the other affecting the outcome.
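The local versus distal distinction in panel A can be sketched as a simple window test. This is an illustrative sketch only: the caption gives the ±150 kb threshold but not how the distance was measured, so measuring from the gene's start and end coordinates is an assumption, and the function name and coordinates below are hypothetical.

```python
LOCAL_WINDOW_BP = 150_000  # ±150 kb threshold taken from the caption


def classify_pqtl(variant_chrom, variant_pos, gene_chrom, gene_start, gene_end,
                  window=LOCAL_WINDOW_BP):
    """Label a pQTL 'local' if it lies within `window` bp of the encoding gene.

    Assumes the window extends from the gene's start and end coordinates;
    variants on a different chromosome are always treated as 'distal'.
    """
    if variant_chrom != gene_chrom:
        return "distal"
    if gene_start - window <= variant_pos <= gene_end + window:
        return "local"
    return "distal"


# Hypothetical coordinates, for illustration only.
print(classify_pqtl("20", 1_900_000, "20", 1_890_000, 1_940_000))  # inside the gene
print(classify_pqtl("20", 2_200_000, "20", 1_890_000, 1_940_000))  # 260 kb past the gene end
print(classify_pqtl("5", 1_900_000, "20", 1_890_000, 1_940_000))   # different chromosome
```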
Fig 2. Significant (FDR <0.05) proteome-by-phenome MR protein-outcome causal inferences: Disease subset.
MR significant (FDR<5%) protein-disease outcome results. a) All MR significant (FDR<5%) protein-disease outcome results for outcomes from the Phenoscanner [21,22] studies (see key for details). b) All MR significant (FDR<5%) protein-disease outcome results for outcomes from GeneAtlas [20]. An asterisk indicates MR estimates that are not significantly heterogeneous upon HEIDI testing (see key for details). c) Key. From the outside in: HGNC symbol of the protein (exposure); disease outcome; key color (matching the protein name in the outer ring); bar chart of the signed squared beta estimate divided by the squared standard error of the MR estimate, using pQTL data from the discovery cohort (CROATIA-Vis); bar chart of the signed squared beta estimate divided by the squared standard error of the MR estimate, using pQTL data from the replication cohort (ORCADES). Central links join identical outcomes for which more than one protein was found to be MR significant. The color of the links indicates similar outcome groups, e.g. thyroid disease. The key to the outcome descriptions is detailed further in S9 and S10 Tables. d) Example concordance (due to sample overlap) plot for all proteins with significant MR evidence in GeneAtlas for causal roles in asthma (IL1RL1, IL1RL2, IL2RA, IL4R, IL6R). GeneAtlas traits are on the left. Phenoscanner traits are on the right. Thickness of connecting lines is proportional to -log10(p-value). The Phenoscanner studies included here are derived from [24,26,27,30,38,41–43], of which [26,38,42,43] include at least some part of the UKBB data. However, [26,42,43] use only data from the first phase (~150,000 individuals) genotype release from UK Biobank.
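The quantity plotted in the bar charts described in the key, the signed squared beta estimate divided by the squared standard error, can be sketched as follows. This is a minimal illustration of that formula as worded in the caption; the function name and the input values are hypothetical.

```python
import math


def signed_chi_square(beta, se):
    """Return (beta / se)^2 carrying the sign of beta.

    The magnitude is the usual one-degree-of-freedom chi-square statistic;
    the sign records the direction of the MR estimate.
    """
    if se <= 0:
        raise ValueError("standard error must be positive")
    return math.copysign((beta / se) ** 2, beta)


# Hypothetical MR estimates, for illustration only.
print(signed_chi_square(0.3, 0.1))    # positive effect, magnitude about 9
print(signed_chi_square(-0.2, 0.04))  # negative effect, magnitude about 25
```

Keeping the sign lets a single bar show both the strength of evidence and whether higher protein abundance is associated with increased or decreased risk.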
Fig 3. Co-localization of SHPS1 (encoded by SHPS1; synonym SIRPA) and schizophrenia DNA associations.
Upper panel, LocusZoom [56] of the region surrounding SHPS1 and the associations with schizophrenia [28]; lower panel, associations with SHPS1. Lower panel inset, the relative concentration of SHPS1 across the 3 genotypes of rs4813319, the DNA variant used as the instrumental variable (IV) in the MR analysis: CC, CT, and TT.