Margaret Sunitha Selvaraj1,2,3, Xihao Li4, Zilin Li4, Akhil Pampana2, David Y Zhang5,6, Joseph Park5,6, Stella Aslibekyan7, Joshua C Bis8, Jennifer A Brody8, Brian E Cade9, Lee-Ming Chuang10, Ren-Hua Chung11, Joanne E Curran12, Lisa de Las Fuentes13,14, Paul S de Vries15, Ravindranath Duggirala12, Barry I Freedman16, Mariaelisa Graff17, Xiuqing Guo18, Nancy Heard-Costa19, Bertha Hidalgo7, Chii-Min Hwu20, Marguerite R Irvin7, Tanika N Kelly21,22, Brian G Kral23, Leslie Lange24, Xiaohui Li18, Martin Lisa25, Steven A Lubitz1,26, Ani W Manichaikul27, Preuss Michael28, May E Montasser29, Alanna C Morrison15, Take Naseri30, Jeffrey R O'Connell29, Nicholette D Palmer31, Patricia A Peyser32, Muagututia S Reupena33, Jennifer A Smith32, Xiao Sun21, Kent D Taylor18, Russell P Tracy34, Michael Y Tsai35, Zhe Wang28, Yuxuan Wang36, Wei Bao37, John T Wilkins38, Lisa R Yanek23, Wei Zhao32, Donna K Arnett39, John Blangero12, Eric Boerwinkle15, Donald W Bowden31, Yii-Der Ida Chen40, Adolfo Correa41, L Adrienne Cupples36, Susan K Dutcher42, Patrick T Ellinor1,26, Myriam Fornage43, Stacey Gabriel44, Soren Germer45, Richard Gibbs46, Jiang He21,22, Robert C Kaplan47,48, Sharon L R Kardia32, Ryan Kim49, Charles Kooperberg48, Ruth J F Loos28,50, Karine A Viaud-Martinez51, Rasika A Mathias23, Stephen T McGarvey52, Braxton D Mitchell29,53, Deborah Nickerson54, Kari E North17, Bruce M Psaty8,55,56, Susan Redline9, Alexander P Reiner48,55, Ramachandran S Vasan57,58,59, Stephen S Rich27, Cristen Willer60, Jerome I Rotter18, Daniel J Rader5,6,61, Xihong Lin2,4,62, Gina M Peloso63, Pradeep Natarajan64,65,66.
Abstract
Blood lipids are heritable modifiable causal factors for coronary artery disease. Despite well-described monogenic and polygenic bases of dyslipidemia, limitations remain in discovery of lipid-associated alleles using whole genome sequencing (WGS), partly due to limited sample sizes, ancestral diversity, and interpretation of clinical significance. Among 66,329 ancestrally diverse (56% non-European) participants, we associate 428M variants from deep-coverage WGS with lipid levels; ~400M variants were not assessed in prior lipids genetic analyses. We find multiple lipid-related genes strongly associated with blood lipids through analysis of common and rare coding variants. We discover several associated rare non-coding variants, largely at Mendelian lipid genes. Notably, we observe rare LDLR intronic variants associated with markedly increased LDL-C, similar to rare LDLR exonic variants. In conclusion, we conducted a systematic whole genome scan for blood lipids expanding the alleles linked to lipids for multiple ancestries and characterize a clinically-relevant rare non-coding variant model for lipids.
Year: 2022 PMID: 36220816 PMCID: PMC9553944 DOI: 10.1038/s41467-022-33510-7
Source DB: PubMed Journal: Nat Commun ISSN: 2041-1723 Impact factor: 17.694
Fig. 1 Overall study schematic.
The analyses were conducted using the multi-ancestral TOPMed freeze8 data to associate whole genome sequence variation with lipid phenotypes (i.e., LDL-C, HDL-C, TC, and TG). A total of 66,329 samples with quantified lipid data from five ancestry groups were analyzed. Single variant GWAS were carried out using SAIGE on the Encore platform using SNPs with MAC >20. Both trans-ancestry and ancestry-specific GWAS were conducted. Genome-wide rare variants (MAF <1%) were grouped into gene-centric and region-based aggregate sets and analyzed using STAARpipeline. Finally, single variant and rare variant associations at Mendelian dyslipidemia genes were investigated in further detail. TOPMed Trans-Omics for Precision Medicine, HDL-C high-density lipoprotein cholesterol, LDL-C low-density lipoprotein cholesterol, TC total cholesterol, TG triglycerides, GWAS genome wide association study, SAIGE Scalable and Accurate Implementation of GEneralized mixed model, MAC minor allele count, MAF minor allele frequency, SNPs single nucleotide polymorphisms.
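The two frequency filters named in this schematic can be sketched as follows. This is a minimal illustration, not the SAIGE or STAARpipeline implementation: the function names are hypothetical, and the minor allele count is approximated as 2 × N × MAF, which assumes a diploid autosomal variant with complete genotype calls.

```python
def minor_allele_count(maf: float, n_samples: int) -> float:
    """Approximate MAC for a diploid autosomal variant (assumption: no missing calls)."""
    return 2 * n_samples * maf


def eligible_analyses(maf: float, n_samples: int,
                      mac_min: float = 20, rare_maf: float = 0.01) -> list[str]:
    """Return which analyses a variant would enter under the stated thresholds.

    MAC > 20 -> single variant GWAS; MAF < 1% -> rare variant aggregate tests.
    A variant can satisfy both conditions at once.
    """
    analyses = []
    if minor_allele_count(maf, n_samples) > mac_min:
        analyses.append("single_variant")
    if maf < rare_maf:
        analyses.append("rare_aggregate")
    return analyses


# Example with the study's sample size (66,329) and a hypothetical MAF of 0.003
print(eligible_analyses(0.003, 66_329))
```

Under these assumptions, a variant with MAF 0.003 in 66,329 samples has a MAC of roughly 398, so it would pass the single variant filter while also counting as rare.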
Fig. 2 Summary of single variant genome-wide association.
Representation of the single variant GWAS results from TOPMed Freeze 8 whole genome sequenced data of 66,329 samples. Each quarter represents a different lipid phenotype, and dots extending in a clockwise direction represent variants with increasing evidence of association, as measured by −log10(p-value), which was truncated at 200. The outer three circles show the GWAS data from TOPMed freeze8, where variants are binned as nominally significant (p-value from 0.05 to 5 × 10^−07), suggestive (p-value from 5 × 10^−07 to 5 × 10^−09), and genome-wide significant (p-value < 5 × 10^−09). The inner three circles compare known significantly associated lipid loci and variants from the MVP summary statistics and the GWAS catalog with the novel genome-wide significant variants and loci identified in the current study, respectively. The figure represents the outputs from two-sided genetic association testing performed using the SAIGE-QT model, where the model was adjusted for all the covariates; see Methods. TOPMed Trans-Omics for Precision Medicine, GWAS genome wide association study, MVP million veteran program.
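The binning and truncation described in this caption can be sketched in a few lines. The function names are hypothetical, and the handling of p-values that fall exactly on a threshold is an assumption, since the caption gives only the ranges.

```python
import math


def significance_tier(p: float) -> str:
    """Bin a p-value into the tiers named in the caption.

    Assumption: each threshold is treated as a strict upper bound.
    """
    if p < 5e-9:
        return "genome-wide significant"
    if p < 5e-7:
        return "suggestive"
    if p < 0.05:
        return "nominal"
    return "not significant"


def plotted_height(p: float, cap: float = 200.0) -> float:
    """-log10(p), truncated at the cap used in the figure (200)."""
    return min(-math.log10(p), cap)


for p in (0.01, 1e-8, 4.88e-9, 1e-250):
    print(p, significance_tier(p), plotted_height(p))
```

Truncating at 200 keeps a handful of extremely strong signals from compressing the rest of the plot.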
Putative novel variants identified in TOPMed and evidence for replication
| Associated lipid phenotype | Novel variant class | Variants (Gene) | Discovery (TOPMed Freeze8): effect estimate | Discovery: p-value | Discovery: MAF | Replication meta-analysis (METASOFT): beta | Replication: p-value | Replication: Std. Err. |
|---|---|---|---|---|---|---|---|---|
| LDL-C | Novel locus | 12:97352354:T:C | −12.439 | 4.88 × 10^−09 | 0.003 | 3.316 | 3.62 × 10^−01 | 3.634 |
| LDL-C | Novel variant | 16:56957451:C:T ( | −1.568 | 2.88 × 10^−09 | 0.283 | −1.459 | 8.74 × 10^−84 | 0.075 |
| LDL-C | Novel locus | 4:176382171:C:T | −16.086 | 2.82 × 10^−09 | 0.002 | −0.980 | 7.80 × 10^−01 | 3.514 |
| TC | Novel variant | 13:113841051:T:C ( | 1.731 | 1.12 × 10^−09 | 0.278 | 1.262 | 1.29 × 10^−38 | 0.097 |
| TC | Novel variant | 7:137875053:T:C ( | −4.106 | 7.54 × 10^−11 | 0.045 | −3.538 | 7.70 × 10^−07 | 0.716 |
| TG | Novel locus | 11:69219641:C:T | 0.232 | 1.98 × 10^−09 | 0.002 | −0.030 | 6.04 × 10^−01 | 0.059 |
| TG | Novel variant | 13:107551611:C:T ( | 0.052 | 6.78 × 10^−10 | 0.045 | 0.015 | 2.20 × 10^−02 | 0.006 |
Variants identified as novel after comparing with the GWAS catalog and MVP summary statistics for associations with lipid phenotypes, including LDL-C, TC, and TG. All effect estimates are in mg/dL units, except for TG which was log-transformed in analysis thereby representing fractional change. Variants are categorized as novel loci or novel variant (i.e., known locus associated with another lipid phenotype) and the genes assigned to the variants per TOPMed whole genome sequence annotations (WGSA) are listed. Data is provided for the discovery (TOPMed freeze8) and replication cohorts (Imputed datasets from MGB Biobank, Penn Medicine Biobank and UK Biobank). Meta-analysis with the replication cohorts was carried out and the corresponding beta, p-values and standard-errors are provided. All the effect-estimates and p-values are reported from two-sided association testing with all independent samples from each cohort (Discovery-TOPMed: 66,329; Replication-MGB Biobank: 25,137; UK Biobank: 424,955; Penn Biobank: 20,079).
GWAS genome wide association study, MVP million veteran program, LDL-C low-density lipoprotein cholesterol, TC total cholesterol, TG triglycerides, TOPMed trans-omics for precision medicine, WGSA whole genome sequence annotations.
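Two quick calculations help when reading this table. The sketch below converts a log-scale TG effect into an approximate percent change and recomputes a two-sided p-value from a beta and its standard error. Both functions are hypothetical helpers, not part of SAIGE or METASOFT. They assume a natural-log transformation (the caption says only "log-transformed") and a normal approximation for the test statistic, so results will differ slightly from the published values, which were computed from unrounded inputs.

```python
import math


def percent_change_from_log_beta(beta: float) -> float:
    """Percent change implied by a beta on the natural-log scale (assumed)."""
    return (math.exp(beta) - 1.0) * 100.0


def two_sided_p(beta: float, se: float) -> float:
    """Two-sided p-value from a Wald z-statistic under a normal approximation."""
    z = abs(beta / se)
    return math.erfc(z / math.sqrt(2.0))


# TG novel locus, discovery effect estimate 0.232 on the log scale
print(round(percent_change_from_log_beta(0.232), 1))

# LDL-C novel locus 12:97352354:T:C, replication beta and standard error
print(two_sided_p(3.316, 3.634))
```

With the rounded table values, the first LDL-C replication row gives a p-value of roughly 0.36, in line with the reported 3.62 × 10^−01, which also confirms which column holds the beta and which holds the standard error.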
Fig. 3 Comparison of effect estimates for HDL-C and LDL-C among variants in the CETP locus.
The color scale of the data points is based on −log10 p-values from the HDL-C association, and the size of each data point is based on −log10 p-values from the LDL-C association. Variants that are genome-wide significant for LDL-C are labeled as chromosome:position:reference allele:alternate allele. The effect estimates and p-values were calculated from two-sided genetic association testing performed using the SAIGE-QT model, where the model was adjusted for all the covariates; see Methods. HDL-C high-density lipoprotein cholesterol, LDL-C low-density lipoprotein cholesterol.
Fig. 4 Conditional analysis of coding rare variants from the same gene and a nearby gene.
Non-coding rare variant sets significantly associated with TC and TG after the conditional analysis on known variants are shown with additional adjustment for rare coding variants. The additional adjustment for rare coding variants was carried out for the same gene as the aggregate set, and for certain gene aggregates (SPC24) the conditional analysis was carried out with a nearby Mendelian gene. After adjusting for rare coding variants and known variants, the EHD3 signal drops minimally, whereas the signals from PCSK9 (promoter-DHS, enhancer-DHS) and the LDLR locus (enhancer-DHS, SPC24 enhancer-DHS) are enhanced significantly. The APOB1, SPC24 (enhancer-CAGE), HBB and APOE signals drop after the conditional analysis on rare coding variants. The different colored dots on the plot represent the conditional STAAR-O p-values when adjusting for known variants (Set1) and rare coding variants of the same or a nearby gene. The p-values were calculated from two-sided aggregate testing performed using the STAAR gene-centric model, where the model was adjusted for all the covariates; see Methods. STAAR variant-Set Test for Association using annotation information, TC total cholesterol, TG triglycerides, CAGE cap analysis of gene expression, DHS DNase hypersensitivity.
Fig. 5 Influence of common and rare variants with hypercholesterolemia.
In addition to monogenic contributions from rare variants in Mendelian hypercholesterolemia genes, multiple genome-wide significant LDL-C-associated common variants also yield a polygenic basis for hypercholesterolemia. In the present work, we now identify rare non-coding variants in proximity of Mendelian hypercholesterolemia genes, specifically LDLR and PCSK9, that also contribute to the genetic basis of hypercholesterolemia. Parts of the figure were generated using pictures from Servier Medical Art. Servier Medical Art by Servier is licensed under a Creative Commons Attribution 3.0 Unported License (https://creativecommons.org/licenses/by/3.0/). LDL-C low-density lipoprotein cholesterol.