Maya S Safarova1, Benjamin A Satterfield1, Xiao Fan1, Erin E Austin1, Zhan Ye2, Lisa Bastarache3, Neil Zheng3, Marylyn D Ritchie4, Kenneth M Borthwick5, Marc S Williams6, Eric B Larson7, Aaron Scrol7, Gail P Jarvik8, David R Crosslin8,9, Kathleen Leppig10, Laura J Rasmussen-Torvik11, Sarah A Pendergrass5, Amy C Sturm6, Bahram Namjou12, Amy Sanghavi Shah13, Robert J Carroll3, Wendy K Chung14,15, Wei-Qi Wei3, QiPing Feng16, C Michael Stein16, Dan M Roden17, Teri A Manolio18, Daniel J Schaid19, Joshua C Denny3, Scott J Hebbring20, Mariza de Andrade19, Iftikhar J Kullo1.
Abstract
We conducted an electronic health record (EHR)-based phenome-wide association study (PheWAS) to discover pleiotropic effects of variants in three lipoprotein metabolism genes: PCSK9, APOB, and LDLR. Using high-density genotype data, we tested the associations of variants in the three genes with 1232 EHR-derived binary phecodes in 51,700 European-ancestry (EA) individuals and 585 phecodes in 10,276 African-ancestry (AA) individuals; 457 PCSK9, 730 APOB, and 720 LDLR variants were filtered by imputation quality (r² > 0.4), minor allele frequency (>1%), linkage disequilibrium (r² < 0.3), and association with LDL-C levels, yielding a set of two PCSK9, three APOB, and five LDLR variants in EA but no variants in AA. Cases and controls were defined for each phecode using the PheWAS package in R. Logistic regression assuming an additive genetic model was used with adjustment for age, sex, and the first two principal components. Significant associations were tested in additional cohorts from Vanderbilt University (n = 29,713), the Marshfield Clinic Personalized Medicine Research Project (n = 9562), and UK Biobank (n = 408,455). We identified one PCSK9, two APOB, and two LDLR variants significantly associated with an examined phecode. Only one of the variants was associated with a non-lipid disease phecode ("myopia"), but this association was not significant in the replication cohorts. In this large-scale PheWAS, we did not find LDL-C-related variants in PCSK9, APOB, and LDLR to be associated with non-lipid-related phenotypes including diabetes, neurocognitive disorders, or cataracts.
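The additive genetic model mentioned in the abstract can be illustrated with a minimal sketch. It assumes the usual reading of "additive": genotypes are coded as the number of Alt alleles (0, 1, or 2), and each additional allele multiplies the odds by the same per-allele odds ratio. The function names are hypothetical and are not taken from the PheWAS R package; covariates (age, sex, principal components) are left out for brevity.

```python
import math


def alt_allele_dosage(genotype: str, alt: str) -> int:
    """Additive coding: number of Alt alleles (0, 1, or 2) in a diploid genotype."""
    if len(genotype) != 2:
        raise ValueError("expected a two-allele genotype such as 'GT'")
    return sum(1 for allele in genotype if allele == alt)


def odds_ratio_at_dosage(per_allele_or: float, dosage: int) -> float:
    """Odds ratio relative to dosage 0 under an additive log-odds model.

    The log-odds change linearly with dosage, so the odds ratio for a
    carrier of `dosage` Alt alleles is per_allele_or ** dosage.
    """
    return math.exp(dosage * math.log(per_allele_or))


# Illustrative use with a per-allele odds ratio of 0.64, the value reported
# for rs11591147 (Ref G, Alt T) and phecode 272 in the results table.
for gt in ("GG", "GT", "TT"):
    d = alt_allele_dosage(gt, alt="T")
    print(gt, d, round(odds_ratio_at_dosage(0.64, d), 4))
```

Under this assumption, a carrier of two Alt alleles would have an odds ratio of 0.64 × 0.64 ≈ 0.41 relative to a non-carrier, with other covariates held fixed.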
Year: 2019 PMID: 30774981 PMCID: PMC6370860 DOI: 10.1038/s41525-019-0078-7
Source DB: PubMed Journal: NPJ Genom Med ISSN: 2056-7944 Impact factor: 8.617
Clinical characteristics of study participants
| Variable | Discovery cohort (eMERGE Network) | Discovery cohort (eMERGE Network) | Replication cohort 1 (Marshfield PMRP) | Replication cohort 2 (BioVU) | Replication cohort 2 (BioVU) | Replication cohort 3 (UK Biobank) |
|---|---|---|---|---|---|---|
| Race | EA | AA | EA | EA | AA | EA |
| Participants (n) | 51,700 | 10,276 | 9562 | 26,582 | 3131 | 408,455 |
| Mean age (years) | 58 | 51 | 62 | 62 | 61 | 57 |
| Female (%) | 54 | 67 | 62 | 58 | 52 | 54 |
AA African-ancestry; BioVU Vanderbilt DNA biobank; EA European-ancestry; eMERGE electronic MEdical Records and GEnomics Network; PMRP Marshfield Clinic Personalized Medicine Research Project
Fig. 1 Selection of variants in the discovery cohort for the primary analysis. Collectively, individuals in the discovery cohort carried the number of variants shown for PCSK9, APOB, and LDLR. These variants were passed through quality control filters and other selection measures: imputation quality (r² > 0.4), minor allele frequency (MAF) > 1%, LDL-C association at the given thresholds for EA and AA, and linkage disequilibrium (r² < 0.3). The variants passing these filters were used in the primary analysis. The rsID for each variant is shown
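The filter sequence in the Fig. 1 caption can be sketched as follows. This is an illustration under stated assumptions, not the authors' pipeline: the `Variant` record, the function names, and the greedy pruning rule (keep the variant with the strongest LDL-C association, drop later variants in LD with it) are hypothetical, and the LDL-C threshold of 5.0 × 10⁻⁸ is the EA value given with the variant table.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Variant:
    rsid: str
    imputation_r2: float  # imputation quality
    maf: float            # minor allele frequency as a fraction (0.014 = 1.4%)
    ldl_p: float          # P value for association with LDL-C


def passes_qc(v, min_r2=0.4, min_maf=0.01, max_p=5.0e-8):
    """Imputation quality, MAF, and LDL-C association filters."""
    return v.imputation_r2 > min_r2 and v.maf > min_maf and v.ldl_p < max_p


def ld_prune(variants, pairwise_r2, max_r2=0.3):
    """Greedy pruning (an assumption): visit variants from strongest to
    weakest LDL-C association and keep one only if its LD r2 with every
    variant already kept is below max_r2. Missing pairs count as r2 = 0."""
    kept = []
    for v in sorted(variants, key=lambda x: x.ldl_p):
        if all(pairwise_r2.get(frozenset((v.rsid, k.rsid)), 0.0) < max_r2
               for k in kept):
            kept.append(v)
    return kept


def select_variants(variants, pairwise_r2):
    return ld_prune([v for v in variants if passes_qc(v)], pairwise_r2)


# Toy input with made-up identifiers and values.
candidates = [
    Variant("rsA", 0.90, 0.050, 1e-20),
    Variant("rsB", 0.90, 0.050, 1e-10),  # in LD with rsA
    Variant("rsC", 0.30, 0.200, 1e-30),  # fails imputation quality
    Variant("rsD", 0.90, 0.005, 1e-30),  # fails MAF
    Variant("rsE", 0.90, 0.300, 1e-3),   # fails LDL-C association
    Variant("rsF", 0.80, 0.100, 1e-9),
]
ld = {frozenset(("rsA", "rsB")): 0.8}
selected = [v.rsid for v in select_variants(candidates, ld)]
print(selected)
```

With this toy input, only `rsA` and `rsF` survive: three candidates fail a threshold filter and `rsB` is pruned because of its LD with `rsA`.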
Variants that passed quality control filters in the primary analysis compared with the Global Lipids Genetics Consortium
| Gene | Chr | Positionᵃ | rsID | Ref | Alt | Annotation | eMERGE cohort: MAF EA (%) | eMERGE cohort: Beta | eMERGE cohort: P | GLGC metabochip: MAF in 1kGP (%) | GLGC metabochip: Betaᵇ | GLGC metabochip: P |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| PCSK9 | 1 | 55505647 | rs11591147 | G | T | Missense | 1.4 | −12.97 | 1.3 × 10⁻²⁷ | 1.7 | −0.50 | 1.6 × 10⁻¹⁴² |
| PCSK9 | 1 | 55519015 | rs639750 | T | G | Intron | 32.7 | −1.82 | 1.0 × 10⁻⁹ | – | – | – |
| APOB | 2 | 21233972 | rs533617 | T | C | Missense | 3.8 | −4.40 | 1.3 × 10⁻⁹ | 4.9 | −0.14 | 1.7 × 10⁻²⁷ |
| APOB | 2 | 21263639 | rs531819 | G | T | Intron | 15.5 | −4.07 | 2.6 × 10⁻²⁶ | 19.1 | −0.12 | 1.3 × 10⁻⁵⁷ |
| APOB | 2 | 21263900 | rs1367117 | G | A | Missense | 31.6 | 3.52 | 6.4 × 10⁻³² | 71.2 | −0.11 | 1.4 × 10⁻⁷⁵ |
| LDLR | 19 | 11202306 | rs6511720 | G | T | Regulatory intron | 11.4 | −5.79 | 4.2 × 10⁻³⁹ | 9.8 | −0.23 | 2.8 × 10⁻¹⁵¹ |
| LDLR | 19 | 11206575 | rs6511721 | A | G | Retained intron | 48.3 | 1.73 | 5.6 × 10⁻¹⁰ | 48.8 | −0.06 | 1.5 × 10⁻²⁹ |
| LDLR | 19 | 11227480 | rs2738447 | C | A | Nonsense mediated decay | 41.5 | −1.67 | 4.0 × 10⁻⁹ | 42.9 | −0.05 | 8.4 × 10⁻¹³ |
| LDLR | 19 | 11231203 | rs72658867 | G | A | Splice regions | 1.1 | −10.20 | 2.8 × 10⁻¹⁴ | – | – | – |
| LDLR | 19 | 11243445 | rs5742911 | A | G | 3′ UTR | 30.7 | −1.79 | 3.7 × 10⁻⁹ | 26.8 | −0.06 | 5.3 × 10⁻²⁴ |
Selection criteria: imputation quality r² > 0.4; MAF > 1%; LDL-C association (threshold of 5.0 × 10⁻⁸); LD r² < 0.3
GLGC Global Lipids Genetics Consortium, Chr chromosome number, Ref reference allele, Alt alternate allele, MAF minor allele frequency, LDL-C low-density lipoprotein cholesterol, 1kGP 1000 Genomes Project
ᵃ Position in human genome assembly hg19
ᵇ The difference in Beta between eMERGE and GLGC is primarily due to differences in units of measurement: eMERGE used mg/dL while GLGC used mmol/L
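As a rough illustration of the unit difference noted in the footnote, the sketch below converts an LDL-C effect size from mg/dL to mmol/L. It assumes the standard cholesterol conversion factor (1 mmol/L ≈ 38.67 mg/dL), which is not stated in the source; the function name is hypothetical.

```python
# Cholesterol: 1 mmol/L is approximately 38.67 mg/dL (assumed standard factor).
MG_DL_PER_MMOL_L = 38.67


def ldl_mg_dl_to_mmol_l(value_mg_dl: float) -> float:
    """Convert an LDL-C value or effect size from mg/dL to mmol/L."""
    return value_mg_dl / MG_DL_PER_MMOL_L


# eMERGE Beta for rs11591147, reported in mg/dL in the table above.
emerge_beta_mg_dl = -12.97
print(round(ldl_mg_dl_to_mmol_l(emerge_beta_mg_dl), 2))
```

The two estimates come from different cohorts and analyses, so a converted eMERGE Beta should not be expected to equal the GLGC value exactly.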
Fig. 2 Study outline for primary analysis. AA African-ancestry, EA European-ancestry, EHR electronic health record, eMERGE electronic MEdical Records and GEnomics Network, LD linkage disequilibrium, PMRP Personalized Medicine Research Project, QC quality control
Significant associations in the discovery and replication cohorts
| Phecode | Description | Variant | MAF (%) | eMERGE discovery: cases | eMERGE discovery: controls | eMERGE discovery: odds ratioᵇ | eMERGE discovery: 95% CI | eMERGE discovery: P | eMERGE 5-fold cross validation: P | Marshfield replication: cases | Marshfield replication: controls | Marshfield replication: P | Vanderbilt replication: cases | Vanderbilt replication: controls | Vanderbilt replication: P | UK Biobank: P |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 272 | Disorders of lipid metabolism | rs11591147 | 1.4 | 25,298 | 17,205 | 0.64 | 0.51–0.76 | | | 8291 | 1796 | | 9076 | 14,560 | | |
| 272 | Disorders of lipid metabolism | rs531819 | 15.5 | 25,298 | 17,205 | 0.88 | 0.84–0.92 | | | | | | | | | |
| 272 | Disorders of lipid metabolism | rs1367117 | 31.5 | 25,298 | 17,205 | 1.07 | 1.04–1.10 | | 5.4 × 10⁻⁴ | | | | 9314 | 18,219 | | |
| 272 | Disorders of lipid metabolism | rs6511720 | 11.4 | 25,298 | 17,205 | 0.83 | 0.78–0.87 | | | 7852 | 1710 | | 9346 | 18,219 | | |
| 272 | Disorders of lipid metabolism | rs6511721 | 48.3 | 25,298 | 17,205 | 1.07 | 1.04–1.10 | | 4.0 × 10⁻⁴ | | | | | | | |
| 272.1 | Hyperlipidemia | rs11591147 | 1.4 | 25,168 | 17,205 | 0.64 | 0.51–0.76 | | | 7666 | 1796 | | 9050 | 14,560 | | |
| 272.1 | Hyperlipidemia | rs531819 | 15.5 | 25,168 | 17,205 | 0.88 | 0.84–0.92 | | | | | | | | | |
| 272.1 | Hyperlipidemia | rs1367117 | 31.5 | 25,168 | 17,205 | 1.07 | 1.04–1.11 | | 3.6 × 10⁻⁴ | | | | 9346 | 18,219 | | |
| 272.1 | Hyperlipidemia | rs6511720 | 11.4 | 25,168 | 17,205 | 0.83 | 0.78–0.87 | | | 7259 | 1710 | | 9314 | 18,219 | | |
| 272.1 | Hyperlipidemia | rs6511721 | 48.3 | 25,168 | 17,205 | 1.07 | 1.04–1.10 | | 4.6 × 10⁻⁴ | | | | | | | |
| 272.11 | Hypercholesterolemia | rs11591147 | 1.5 | 11,753 | 17,205 | 0.60 | 0.44–0.76 | | | 5602 | 1796 | | 3840 | 14,560 | | |
| 272.11 | Hypercholesterolemia | rs531819 | 15.7 | 11,753 | 17,205 | 0.87 | 0.81–0.92 | | | | | | | | | |
| 272.11 | Hypercholesterolemia | rs6511720 | 11.5 | 11,753 | 17,205 | 0.80 | 0.74–0.86 | | | 5316 | 1710 | | 3953 | 18,219 | | |
| 272.13 | Mixed hyperlipidemia | rs6511720 | 12.0 | 4942 | 17,205 | 0.84 | 0.76–0.91 | | | 147 | 1710 | 3.6 × 10⁻¹ | 4572 | 18,219 | | |
| 367.1 | Myopia | rs6511720 | 11.4 | 4138 | 36,272 | 0.85 | 0.77–0.92 | | ᶜ 8.8 × 10⁻⁴ | 3879 | 1868 | 4.5 × 10⁻¹ | 823 | 27,142 | 3.5 × 10⁻¹ | 4.6 × 10⁻¹ |
ICD-9 codes were extracted from individual EHRs and converted to phecodes using the PheWAS R package
CI confidence interval, LDL-C low-density lipoprotein cholesterol, MAF minor allele frequency
ᵃ Bold values are statistically significant
ᵇ Odds ratio refers to the Alt allele
ᶜ Borderline significant; other variants in LD with this variant were significant