Identification of 64 Novel Genetic Loci Provides an Expanded View on the Genetic Architecture of Coronary Artery Disease.

Abstract

RATIONALE: Coronary artery disease (CAD) is a complex phenotype driven by genetic and environmental factors. Ninety-seven genetic risk loci have been identified to date, but the identification of additional susceptibility loci might be important to enhance our understanding of the genetic architecture of CAD.
OBJECTIVE: To expand the number of genome-wide significant loci, catalog functional insights, and enhance our understanding of the genetic architecture of CAD.
METHODS AND RESULTS: We performed a genome-wide association study in 34 541 CAD cases and 261 984 controls from the UK Biobank resource, followed by replication in 88 192 cases and 162 544 controls from CARDIoGRAMplusC4D. We identified 75 loci that replicated and were genome-wide significant (P<5×10−8) in meta-analysis, 13 of which had not been reported previously. Next, to further identify novel loci, we identified all promising (P<0.0001) loci in the CARDIoGRAMplusC4D data and performed reciprocal replication and meta-analyses with UK Biobank. This led to the identification of 21 additional novel loci reaching genome-wide significance (P<5×10−8) in meta-analysis. Finally, we performed a genome-wide meta-analysis of all available data, revealing 30 additional novel loci (P<5×10−8) without further replication. The increase in sample size provided by UK Biobank raised the share of reconstituted gene sets implicated in CAD from 4.2% to 13.9% of all gene sets. For the 64 novel loci, 155 candidate causal genes were prioritized, many without an obvious connection to CAD. Fine mapping of the 161 CAD loci generated lists of credible sets of single causal variants and genes for functional follow-up. Genetic risk variants of CAD were linked to development of atrial fibrillation, heart failure, and death.
CONCLUSIONS: We identified 64 novel genetic risk loci for CAD and performed fine mapping of all 161 risk loci to obtain a credible set of causal variants. The large expansion of reconstituted gene sets argues in favor of an expanded omnigenic model view on the genetic architecture of CAD.

Keywords: computational biology; coronary artery disease; genetics; genome-wide association study; sample size

Year: 2017 PMID: 29212778 PMCID: PMC5805277 DOI: 10.1161/CIRCRESAHA.117.312086

Source DB: PubMed Journal: Circ Res ISSN: 0009-7330 Impact factor: 17.367

Coronary artery disease (CAD) is the predominant cause of ischemic heart disease, often leading to myocardial infarction, and a leading cause of death. Globally, deaths because of ischemic heart disease increased by 16.6% from 2005 to 2015 to 8.9 million deaths. However, the age-standardized mortality rates are decreasing (fell by 12.8%)[1] because of preventive and treatment strategies established on evolving knowledge of the underlying pathophysiology of CAD.

CAD is a complex disease, resulting from numerous additive and interacting contributions in an individual’s environment and lifestyle in combination with their underlying genetic architecture. Since the first genome-wide association studies (GWAS) for CAD in 2007,[2-4] multiple additional studies with progressively larger sample sizes identified 97 genome-wide significant genetic loci associated with CAD[5-10] at the time of analysis. The continuous effort to identify additional loci associated with CAD and share these early with the scientific community is important, especially to enhance our understanding of the biological underpinnings of CAD and to catalyze the development of drugs. A comprehensive understanding of the genetic architecture of CAD is also essential to enable precision medicine approaches by identifying subgroups of patients at increased risk of CAD or its complications and might identify those with a specific driving pathophysiology in whom a particular therapeutic or preventive approach would be most useful.[11]

To further our knowledge of the genetic architecture of CAD, we performed a de novo GWAS of the UK Biobank resource and meta-analyses with CARDIoGRAMplusC4D data. Our approach led to the identification of 64 novel loci associated with CAD, expanding the grand total to 161. These loci were interrogated using bioinformatic approaches to catalog and interpret the potential biological relevance of our findings.
We also performed network and gene-set analyses and proposed the omnigenic model to explain our findings. This expanding resource is now available for other investigators to help to further elucidate the underlying biology and relevance.

Methods

The data that support the findings of this study are available from the corresponding author on reasonable request. The de novo GWAS analysis and meta-analysis have been posted on Mendeley (doi:10.17632/2zdd47c94h.1; doi:10.17632/gbbsrpx6bs.1). A summary of the methods is provided below, and a more detailed description of the experimental procedures is provided in the Online Data Supplement.

Study Design and Samples

The study design consisted of a reciprocal 2-stage sequential discovery and replication approach (Online Figure I), which provides the most robust statistical evidence, followed by an overall meta-analysis of all available data, for which no replication data were available in this study. First, using the UK Biobank resource, we conducted a GWAS to discover single-nucleotide polymorphisms (SNPs) associated with CAD. In stage 2, we took forward all promising SNPs reaching nominal significance (P<0.0001) for replication in CARDIoGRAMplusC4D data. Replicating SNPs (P<0.05 after Bonferroni adjustment) were meta-analyzed and considered true associations when they surpassed the genome-wide significance threshold (P<5×10−8). The reciprocal stage 1 entailed the identification of all promising SNPs (P<0.0001) in CARDIoGRAMplusC4D and replication in UK Biobank (P<0.05 after Bonferroni adjustment) followed by meta-analysis. Again, SNPs replicating and surpassing the genome-wide significance threshold were considered true associations. A sentinel SNP in a locus was defined as the most significant variant in a 1-Mb region that was independent from other sentinel SNPs (r2<0.1). A locus was defined as a region of 1 Mb at either side of the sentinel SNP. A locus was considered novel if the sentinel SNP was not within a 1-Mb window (at either side) of earlier reported genome-wide significant SNPs (Online Table I). Finally, we performed a genome-wide meta-analysis of the UK Biobank resource and CARDIoGRAMplusC4D to identify additional CAD-associated loci (P<5×10−8 in meta-analysis). A potential sample overlap between the UK Biobank and cohorts of CARDIoGRAMplusC4D was estimated to be <0.1%; no evidence was found that this biased the test statistics (Online Data Supplement).
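The decision rules in this paragraph can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the class and function names are hypothetical, the Bonferroni denominator (here the number of SNPs taken forward, n_tested) is an assumption, and the independence check between sentinel SNPs (r2<0.1) is omitted.

```python
from dataclasses import dataclass
from typing import List, Tuple

GENOME_WIDE_P = 5e-8    # genome-wide significance threshold stated in the text
PROMISING_P = 1e-4      # threshold for "promising" SNPs stated in the text
WINDOW_BP = 1_000_000   # 1 Mb at either side of a sentinel SNP


@dataclass(frozen=True)
class Snp:
    rsid: str             # hypothetical identifier, for illustration only
    chrom: str
    pos: int              # base-pair position
    p_discovery: float
    p_replication: float
    p_meta: float


def passes_two_stage(snp: Snp, n_tested: int) -> bool:
    """Promising in discovery, Bonferroni-significant in replication,
    and genome-wide significant in the meta-analysis."""
    # Assumption: Bonferroni adjustment multiplies by the number of SNPs tested.
    bonferroni_p = min(1.0, snp.p_replication * n_tested)
    return (snp.p_discovery < PROMISING_P
            and bonferroni_p < 0.05
            and snp.p_meta < GENOME_WIDE_P)


def is_novel(snp: Snp, known: List[Tuple[str, int]]) -> bool:
    """True if no earlier reported SNP lies within 1 Mb on the same chromosome."""
    return not any(chrom == snp.chrom and abs(pos - snp.pos) <= WINDOW_BP
                   for chrom, pos in known)
```

Each SNP passes through passes_two_stage first; is_novel is then applied only to the sentinel SNP of each locus, mirroring the order of the definitions above.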

Candidate Genes and Insights in Biology

Candidate causal genes at each of the loci were prioritized based on proximity, expression quantitative trait locus (eQTL) data, DEPICT analyses (Data-Driven Expression-Prioritized Integration for Complex Traits),[12] and long-range chromatin interactions of variants with gene promoters (Online Data Supplement).[8,13] Summary information of genes was obtained via queries in GeneCards, EntrezGene, UniProt, and Tocris. The Mouse Genomic Informatics database was used for obtaining insights into mammalian phenotypes associated with disruption of candidate genes. DEPICT was also used to test for enrichment of gene sets and identify relevant tissues and cell types. Ingenuity pathway analysis (June 2017 release) was performed to strengthen the biological relevancy of the novel loci.

Insights in Loci by Associations With Other Phenotypes

The GWAS catalog was queried and a phenome scan was performed by intersecting the identified loci with the GWAS catalog and by testing the association of the newly identified SNPs with a wide range of phenotypes using linear or logistic regression analysis in UK Biobank (Online Data Supplement). Genetic risk scores (GRS) were constructed using effect estimates obtained from the CARDIoGRAMplusC4D data as described previously.[8] Multivariable Cox proportional hazards models were fitted for quintiles of the GRS in the UK Biobank resource, to assess the extent to which the GRS could predict new-onset atrial fibrillation/flutter and heart failure.

Regulatory DNA and Fine Mapping of Probable Causal Variants

To systematically characterize the functional, cellular, and regulatory contribution of genetic variation, we used GARFIELD,[14] analyzing the enrichment of genome-wide association summary statistics in tissue-specific functional elements at given significance thresholds. Probabilistic Annotation Integrator was used to fine-map loci by integrating genetic association signal strength with genomic functional annotation data.[15] We explored the potential target genes of these candidate causal variants by determining their direct effects on protein function (missense variants) and evidence connecting a causal variant in a 3′ untranslated region (3′ UTR) to gene expression (eQTL) or physical interactions (Hi-C) with the promoter of an eQTL gene. Potential causal mechanisms of the candidate causal variants were determined based on (1) missense variation, (2) chromatin interaction between the causal variant and the promoter of a gene for which the causal variant was also significantly associated with gene expression by eQTL analyses, or (3) 3′ UTR-overlapping variants that were also significantly associated with expression of the gene to which the 3′ UTR belongs. In addition, for genes/mechanisms to be prioritized by eQTL analyses and chromatin interactions or 3′ UTR overlap, the respective causal variant was required to be in an enhancer region.
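The 3 mechanism rules and the enhancer requirement can be written as a small rule set. The sketch below is a simplified illustration under assumptions: the field names are hypothetical, and each evidence type is reduced to a single boolean that would in practice come from annotation, eQTL, and Hi-C data.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class VariantEvidence:
    is_missense: bool                   # rule (1): missense variation
    in_enhancer: bool                   # required for rules (2) and (3)
    hic_to_promoter_of_eqtl_gene: bool  # rule (2): Hi-C contact with the promoter of a gene that is also an eQTL target
    in_3utr_of_eqtl_gene: bool          # rule (3): overlaps the 3' UTR of a gene that is also an eQTL target


def causal_mechanisms(ev: VariantEvidence) -> List[str]:
    """Return the labels of all mechanisms supported for one candidate causal variant."""
    mechanisms = []
    if ev.is_missense:
        mechanisms.append("missense")
    # eQTL-based mechanisms count only when the variant lies in an enhancer region.
    if ev.in_enhancer and ev.hic_to_promoter_of_eqtl_gene:
        mechanisms.append("chromatin interaction + eQTL")
    if ev.in_enhancer and ev.in_3utr_of_eqtl_gene:
        mechanisms.append("3' UTR + eQTL")
    return mechanisms
```

A variant returning exactly one label corresponds to a locus with a single potential causal mechanism; an empty list means no mechanism could be assigned under these rules.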

Results

Genome-Wide Analyses of 34 541 Cases and 261 984 Controls

The stage 1 GWAS analysis in UK Biobank (34 541 cases and 261 984 controls; Online Table II) of 7 947 838 SNPs revealed 630 suggestive SNPs (P<0.0001) in 442 loci (Online Table III). Eighty-six independent SNPs in 75 loci both replicated (P<0.05 Bonferroni adjusted) in stage 2 in ≤88 192 cases and 162 544 controls of CARDIoGRAMplusC4D, and achieved genome-wide significance (P<5×10−8) with no evidence of heterogeneity of effects (Phet≥0.10). Thirteen of the 75 loci are not established CAD-associated loci (Table 1).
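To illustrate what combining two data sets and testing for heterogeneity of effects involves, the following Python sketch uses an inverse-variance fixed-effect meta-analysis with Cochran's Q. This model is an assumption chosen for illustration; the method actually used is described in the Online Data Supplement. The function names are hypothetical, and the heterogeneity P value shown is valid only for exactly 2 studies (1 degree of freedom).

```python
import math
from typing import List, Tuple


def fixed_effect_meta(estimates: List[Tuple[float, float]]) -> Tuple[float, float, float, float]:
    """Inverse-variance fixed-effect meta-analysis.

    estimates: one (beta, standard error) pair per study.
    Returns (pooled beta, pooled standard error, two-sided P, Cochran's Q).
    """
    weights = [1.0 / se ** 2 for _, se in estimates]
    total = sum(weights)
    beta = sum(w * b for w, (b, _) in zip(weights, estimates)) / total
    se = math.sqrt(1.0 / total)
    z = beta / se
    p = math.erfc(abs(z) / math.sqrt(2.0))  # two-sided normal P value
    q = sum(w * (b - beta) ** 2 for w, (b, _) in zip(weights, estimates))
    return beta, se, p, q


def heterogeneity_p_two_studies(q: float) -> float:
    """P value of Cochran's Q for 2 studies (chi-square with 1 degree of freedom)."""
    return math.erfc(math.sqrt(q / 2.0))
```

Under this sketch, a SNP would be retained when the pooled P value is below 5×10−8 and the heterogeneity P value is at least 0.10, matching the thresholds quoted above.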

Table 1.

Sixty-Four Novel Genome-Wide Significant CAD Loci

Next, we reanalyzed the data from the MetaboChip meta-analysis of CARDIoGRAMplusC4D,[9] the CARDIoGRAMplusC4D 1000 Genomes meta-analysis,[7] and the CARDIoGRAM Exome array data[16] to identify the promising SNPs (P<0.0001). We identified 568 promising SNPs located in 375 loci (Online Table IV). One hundred and thirteen independent SNPs in 96 loci both replicated (P<0.05 Bonferroni adjusted) in stage 2, UK Biobank, and achieved genome-wide significance in meta-analysis (P<5×10−8), including 21 additional novel loci (Table 1; Online Table V). Finally, we performed a meta-analysis of CARDIoGRAMplusC4D[9] and the CARDIoGRAMplusC4D 1000 Genomes meta-analysis[7] with UK Biobank and identified 30 additional loci for which no replication test was available (Table 1; Online Table VI), increasing the total number of genome-wide significant CAD loci to 161 (Online Figure II). The novel variants were common (minor allele frequency >5%), except for 1 (rs112635299 near SERPINA1). Online Figure III shows the regional association plot of each novel locus. For some variants, a dominant or recessive linkage model appears to be a better fit compared with an additive model (Online Table VII). Complete summary statistics of all SNPs in UK Biobank and the UK Biobank CARDIoGRAMplusC4D meta-analysis are available for download at www.cardiomics.net.

Candidate Genes and Deeper Insights Into Biology

To disentangle whether associations were driven more by acute myocardial infarction as opposed to stable CAD, we performed multinomial logistic regression analyses for all genome-wide significant (P<5×10−8) loci in UK Biobank. In total, 17 666 of 34 541 individuals with CAD were diagnosed with myocardial infarction. None of the novel loci and only 2 previously identified variants (rs9349379 and rs10947789) appear to be mainly driven by their association with myocardial infarction rather than stable CAD (false discovery rate [FDR], P<0.05; Online Table VIII). We further explored the potential biology of the 64 novel CAD-associated loci by prioritizing 155 candidate causal genes in these loci: 69 genes were in proximity (the nearest gene and any additional gene within 10 kb) of the lead variant, 9 genes contained coding genetic variation in linkage disequilibrium (r2>0.8) with the lead variant (Online Table IX), 50 genes were selected based on eQTL analyses (Online Table X), 64 genes showed significant chromatin interactions (Hi-C) between the genetic variant and promoter of the gene (Online Table XI), and 60 genes were prioritized based on DEPICT analyses (Online Table XII). Of the 155 candidate genes, 63 were prioritized by multiple methods of identification, which may be used to prioritize candidate causal genes. A summary of the current function annotation of each novel candidate gene is provided in Online Table XIII, and knowledge on pharmacological compounds and nutrients influencing these genes is provided in Online Table XIV. Next, we performed a systematic search in the Mouse Genome Informatics database to identify the effect of mutations in orthologous genes for these candidate causal genes (details in Online Table XV).
In brief, we identified 34 genes whose mutation produced at least 1 cardiovascular system phenotype (AGT, ARHGAP42, BACH1, CALCRL, CASQ2, CCM2, CDC123, CDKN1A, FIGN, FOXC1, GIT1, GNPAT, HCRT, HSD17B12, MAP1S, MAP3K1, MSANTD1, NGF, NPHP3, PCIF1, PDS5B, PLCG1, PLEKHA1, PPP2R3A, PRDM16, PRKCE, RAC1, SEMA5A, SH3PXD2A, TFPI, TIPARP, TMEM106B, VEGFA, and ZFPM2) and 34 genes that affected other potentially plausible traits linked to CAD, including metabolic/lipid/adipose/weight abnormalities (AGT, CORO6, FIGN, GIT1, KAT2A, NGF, PPP2R3A, NPHP3, SH3PXD2A, TMEM106B, VEGFA, ZHX3, OPTN, FAM213A, DNAJC7, and COPRS), abnormalities in inflammation or white blood cells (DHX58, FHL3, HNRNPD, PLCG2, PRDM16, TFPI, VEGFA, ZNF335, PRKCE, MYO1G, RAC1, and ARID4A), and abnormalities in platelets or coagulation (FHL3, PLCG2, TFPI, VEGFA, DST, and KLF4).

Novel Insights From Pathway Analyses

Ingenuity pathway analysis restricted to the 155 candidate causal genes confirmed that these are enriched for effects on the cardiovascular system and cell cycle functions (Online Table XVI). Pathway insights provided by the DEPICT framework identified 1525 reconstituted gene sets that could be captured in 156 meta gene sets (Online Table XVII). The 4 most significant metasets were complete embryonic lethality during organogenesis, blood vessel development, anemia, and SRC PPI subnetwork. The platelet α-granule lumen, SRC PPI subnetwork, blood vessel development, and hemostasis had the largest betweenness centrality, an indicator of a node’s centrality in the network. The tissue enrichment analyses by DEPICT indicated blood vessels as the most relevant tissue (P=4×10−7); 41 additional tissues or cell types were significantly enriched at FDR<0.05 (Online Table XVIII). We compared the contribution of novel information with previous work. The previous CARDIoGRAMplusC4D analysis led to 457 reconstituted gene sets (at FDR<0.05); the addition of the intermediate UK Biobank data set of 150 k individuals identified a total of 889 significant gene sets, substantially fewer than the current 1525 gene sets (Figure 1; Online Table XVII). Considering all 10 968 possible gene sets, this study represents an increase from 4.16% to 13.90% of all gene sets involved in CAD since the 1000 Genomes analysis of CARDIoGRAMplusC4D in 2015. The number of genes implicated by DEPICT at FDR<0.05 increased from 94 in the previous data to 540.
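The percentages quoted above follow directly from the gene-set counts. The short Python sketch below reproduces the arithmetic; the counts are taken from the text, whereas the dictionary labels and the function name are illustrative only.

```python
TOTAL_GENE_SETS = 10_968  # all possible gene sets, as stated in the text

# Significant reconstituted gene sets (FDR<0.05) per data set, from the text.
SIGNIFICANT_GENE_SETS = {
    "CARDIoGRAMplusC4D 1000 Genomes analysis": 457,
    "plus intermediate UK Biobank data set": 889,
    "current analysis": 1525,
}


def percent_of_all(count: int, total: int = TOTAL_GENE_SETS) -> float:
    """Share of all possible gene sets, in percent."""
    return 100.0 * count / total


if __name__ == "__main__":
    for label, count in SIGNIFICANT_GENE_SETS.items():
        print(f"{label}: {count} gene sets = {percent_of_all(count):.2f}%")
```

Note that 457 of 10 968 is exactly 1/24, which is 4.17% when rounded and 4.16% when truncated to 2 decimals, as reported in the text.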

Figure 1.

Network analyses of reconstituted gene sets. The share of all possible gene sets significantly involved in coronary artery disease (CAD) increased to 13.90% since the 1000 Genomes genome-wide association studies of CARDIoGRAMplusC4D. Clustering by modularity using Gephi software indicated that pathways specific for cardiovascular/heart development, inflammation, lipids, kidney and coagulation clustered together. PPI networks & Other indicates a remaining bin predominantly populated by protein–protein interaction networks.

To increase our understanding of potentially mediating mechanisms at the genetic variant level, we searched the GWAS catalog for previously reported variants. Of the 64 novel loci, 23 loci were in linkage disequilibrium (r2>0.6) with genetic variants previously reported to be associated with other traits surpassing the genome-wide significant (P<5×10−8) threshold (Online Table XIX). We found associations with anthropometric measurements (rs6905288, rs1591805, rs3936511, and rs840616), antineutrophil antibody-associated vasculitis (rs112635299), angiotensinogen measurements (rs699), coffee consumption (rs13723), C-reactive protein (rs667920), pulmonary function (rs61848342, rs13723, and rs112635299), fibrinogen levels (rs67920, rs16844401, and rs2074158), glomerular filtration rate (rs12500824), high-density lipoprotein cholesterol (rs667920, rs10512861, and rs6905288), low-density lipoprotein cholesterol (rs10512861), total cholesterol (rs6997340), triglycerides (rs667920, rs3936511, rs6905288, and rs6997340), diabetes mellitus (rs1591805 and rs3936511), blood pressure indices (rs260020, rs17080091, rs61776719, rs7696431, and rs1317507), transferrin levels (rs6997340), QRS amplitude (rs13723), abdominal aortic aneurysm (rs885150 and rs3827066), adiponectin measurements (rs6905288), and age at menarche (rs1591805); full details can be found in Online Table XIX.
We also explored the association of the 64 lead SNPs with a range of traits in UK Biobank resource. Consistent with the GWAS-catalog search and in keeping with earlier observations in established CAD loci, several of our novel loci were associated with hyperlipidemia, blood pressure traits, diabetes mellitus, and anthropometric traits (Figure 2). For example, rs6905288 (VEGFA) was also associated with waist-to-hip ratio and hyperlipidemia, and rs61776719 (FHL3 and UTP11L) was also closely associated with pulse pressure in UK Biobank. Interestingly, we observed that 15 of 64 loci were associated with platelet counts.

Figure 2.

Heatmap of associations in UK Biobank with novel loci. Heatmap of z scores for different diseases and phenotypes in UK Biobank, aligned to increased risk of coronary artery disease. Only significant associations (false discovery rate<0.01) are shown. The genetic risk score constructed with the known and novel loci, weighted using coefficients of CARDIoGRAMplusC4D, is highlighted by the red rectangle. BMI indicates body mass index; COPD, chronic obstructive pulmonary disease; RBC, red blood cell; and TIA, transient ischemic attack.

Genetic Risk for CAD, and Association With CAD Risk Factors and Outcome

To explore potential clinical relevance, we constructed a GRS in which each variant was weighted for its effect in CARDIoGRAMplusC4D: for each individual, the effect size of each variant was multiplied by the number of effect alleles carried and the products were summed. We then divided this GRS into quintiles. The associations with many different traits and diseases from the UK Biobank are visualized in Figure 2. The risk of a future diagnosis of atrial fibrillation and heart failure in UK Biobank participants was higher in quintile 5 individuals as compared with quintile 1 (hazard ratio, 1.18 [95% confidence interval, 1.10–1.27; P=1.2×10−6] and 1.59 [95% confidence interval, 1.43–1.77; P=3.3×10−18], respectively; Online Figure IV). In addition, all-cause mortality and especially cardiovascular mortality was higher in individuals of quintile 5 compared with quintile 1 (hazard ratio, 1.12 [95% confidence interval, 1.06–1.19; P=4×10−4] and 1.94 [95% confidence interval, 1.70–2.21; P=2×10−23], respectively; Online Figure IV).
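The construction of the weighted GRS and its division into quintiles can be sketched in Python as follows. This is a minimal illustration, not the authors' code: the function names are hypothetical, dosages are assumed to be effect-allele counts between 0 and 2, and quintiles are assigned by rank, with ties broken by input order.

```python
from typing import Dict, List


def genetic_risk_score(dosages: Dict[str, float], weights: Dict[str, float]) -> float:
    """Weighted GRS for one individual.

    dosages: effect-allele count (0 to 2) per variant for this individual.
    weights: per-variant effect size (assumed here to be on the log-odds scale).
    """
    return sum(weights[rsid] * dosages[rsid] for rsid in weights)


def assign_quintiles(scores: List[float]) -> List[int]:
    """Rank-based quintile (1 = lowest fifth, 5 = highest fifth) for each score."""
    n = len(scores)
    order = sorted(range(n), key=lambda i: scores[i])
    quintiles = [0] * n
    for rank, i in enumerate(order):
        quintiles[i] = rank * 5 // n + 1
    return quintiles
```

Individuals in quintile 5 would then be compared with those in quintile 1 in a survival model; the Cox regression itself is outside the scope of this sketch.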

Role of Regulatory DNA and Fine Mapping of Candidate Causal Variants

Across the genome, virtually all tissues showed significant enrichment of DNase I hypersensitivity sites providing limited indications for involved biology (Figure 3A and 3B). Minimal differential enrichment of functional elements for the identified genetic loci was observed in blood vessels and liver. To facilitate future functional studies directed at causal variants and molecular mechanisms, we prioritized variants via the probabilistic framework of Probabilistic Annotation Integrator. Because no clear differential enrichment was observed for tissue-specific functional elements, we focused on DNA annotations from the study of Finucane et al[17] that are not specific for tissue or cell types. Probabilistic Annotation Integrator determined the significance of each annotation to be causal (Figure 3C and 3D), and a model was constructed using linkage disequilibrium information, P value distribution, and information on coding variation, conservation and H3K4me1 sites to prioritize potential causal SNPs of all 161 (known and novel) loci. This analysis yielded 28 variants ≥95% confidence level for which we prioritized candidate genes (Online Table XX; Table 2).
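A 95% credible set is commonly built by ranking the variants in a locus by posterior probability and adding them until the cumulative probability reaches 0.95. The Python sketch below illustrates that construction under simplifying assumptions: the posterior probabilities within a locus are taken to sum to about 1 (a single causal variant), the variant identifiers are hypothetical, and the function is not part of the Probabilistic Annotation Integrator software.

```python
from typing import Dict, List


def credible_set(posteriors: Dict[str, float], level: float = 0.95) -> List[str]:
    """Smallest set of variants whose cumulative posterior probability reaches `level`.

    posteriors: posterior probability of causality per variant within one locus.
    """
    ranked = sorted(posteriors.items(), key=lambda kv: kv[1], reverse=True)
    chosen: List[str] = []
    cumulative = 0.0
    for rsid, prob in ranked:
        chosen.append(rsid)
        cumulative += prob
        if cumulative >= level:
            break
    return chosen
```

A locus whose credible set contains exactly 1 variant corresponds to the situation reported above, in which a single variant reaches the ≥95% confidence level.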

Figure 3.

The role of regulatory DNA underlying coronary artery disease (CAD)-associated single-nucleotide polymorphisms (SNPs). Enrichment of genome-wide association analysis P values in DNase I hypersensitive sites (DHS). CAD SNPs at different genome-wide association study (GWAS) thresholds were significantly enriched in DHS footprints (A) and hot spots (B) across many different tissues and cell types. The fold enrichment was highly significant for most tissues and cell types (P<1×10−8), as indicated by the 4 colored circles next to the labels; 3 colored circles indicate P<1×10−7. Label sizes of tissue types were downsized because of space limitations; tissue types may be represented by multiple samples, indicated by hash marks of the same color. C, Subsequent prioritization of potential causal annotations underlying the 161 CAD loci also suggested that regions of DHS may be underlying the associations, but coding variants, conservation, 5′ untranslated region (UTR), and H3K4me1 annotations were more likely to be causal. D, Posterior probabilities for causality for each variant in the 161 CAD loci were calculated by an empirical Bayes approach implemented in the Probabilistic Annotation Integrator framework, taking into account linkage disequilibrium (LD), association statistics, and the potentially causal annotations, and are summarized in Table 2 and Online Table XX. CTCF indicates transcriptional repressor CTCF; DGF, digital genomic footprint by DNase I hypersensitivity; FANTOM5, functional annotation of the mammalian genome V5; TFBS, transcription factor binding site; and TSS, transcription start site.

Table 2.

For 28 Loci, the 95% Credible Set of Causal Variants Consisted of a Single Coronary Artery Disease Variant

For example, rs974819 was prioritized as a causal variant and could be linked to PDGFD by Hi-C evidence and eQTL data in relevant tissues (Online Figure V).
In total, 15 of the 28 fine-mapped loci could be pinpointed to 1 single potential causal mechanism implicating a single gene. For 2 loci, there were 2 potential causal mechanisms (TRPC4AP/PROCR and MRPS6/SLC5A3) with equal evidence.

Discussion

The present study is the largest genetic association study of CAD performed to date. We report on the primary results and downstream bioinformatic analyses of the meta-analysis of de novo GWAS data derived from the UK Biobank combined with existing data from CARDIoGRAMplusC4D, leading to the inclusion of ≤122 733 cases and 424 528 controls. This study contributes to the existing literature by reporting 64 novel genetic loci, representing approximately 40% of all 161 GWAS-identified CAD loci to date.[18] For the novel loci, a detailed catalog of 155 candidate genes (based on proximity, gene-expression data, coding variation, and physical chromatin interaction) is provided. We demonstrate that the increase in significantly associated CAD loci results in a large expansion of implicated reconstituted gene networks, from 4% to almost 14%. Finally, by integrating genetic association strength, linkage disequilibrium, and functional annotation data, we performed fine mapping of all 161 CAD loci, providing a novel credible list of causal variants and plausible genes to be prioritized for functional validation. The number of novel genetic loci reported in this single article (64) is exceptionally large compared with previous articles, including those of CARDIoGRAMplusC4D and others, which reported 10 to 15 novel loci each.[2-10] Thirty-four of the 64 loci are significant in a robust reciprocal replication strategy between CARDIoGRAMplusC4D and the UK Biobank; another 30 are genome-wide significant in the overall meta-analysis, which is commonly considered sufficient evidence.[7,10] The obvious reason for the large number of novel loci is the considerable number of novel CAD cases and non-CAD controls compared with these earlier efforts, combined with less heterogeneity in samples, collection, and definitions used. By increasing the sample size, more loci can be identified, more genes can be implicated, and more gene networks or pathways can be constructed.
Not only is the increase of associated loci in the past decade rapidly outpacing functional validation, but our understanding of biological networks also seems insufficient to accommodate the increased number of GWAS hits under the conceptual polygenic model. This can be illustrated by the large increase of reconstituted gene networks observed in our study. For the first time, we show that almost 14% of all existing gene networks are involved in the complex CAD trait (Figure 1), and this will only increase when further samples are added to the GWAS, making it increasingly difficult to consider these all to be key pathways. In our data, we also observed genetic association signals to be spread across most of the genome, and many of the novel 155 candidate genes do not have an obvious connection to CAD. In addition, virtually all cell types showed significant enrichment of DNase I hypersensitivity and other functional elements. These notions are all supportive of the omnigenic model, which has recently been proposed by the Pritchard team, suggesting that prevailing conceptual models for complex diseases are incomplete. The omnigenic model hypothesizes that all gene-regulatory networks are sufficiently interconnected such that all genes expressed in disease-relevant cells can influence the function of core disease-related genes and a major proportion of heritability can be explained by effects of genes outside key pathways.[19] It is questionable whether further increasing the GWAS sample size will resolve the outstanding issues concerning our incomplete understanding of cellular regulatory networks and our ability to differentiate core genes from peripheral genes. If the omnigenic model is indeed correct, detailed mapping of cell-specific regulatory networks will be essential to understand CAD.
To facilitate functional research based on our findings, we not only provided extensive bioinformatic analyses of coding variation, gene expression, and chromatin interactions for the 64 novel loci but also performed novel fine mapping and presented statistically convincing arguments for causal genetic variants at 28 loci, linking 19 genes in the 161 CAD loci. In the known loci, these genes included APOE, PCSK9, ANGPTL4, and SORT1, all implicated as core genes in lipid metabolism. Recently, PCSK9 has been validated in clinical trials,[20] and functional studies are also supporting a key role for SORT1.[21] More recently, EDN1 has indeed been identified as the likely causal gene in the pathogenesis of CAD instead of the nearby PHACTR1.[22] In the novel loci, we found evidence for causal variants linked to FNDC3B (Fibronectin Type III Domain Containing 3B), CCM2 (CCM2 Scaffolding Protein), and TRIM5 (Tripartite Motif Containing 5). The functional link between these genes and CAD is not obvious and remains to be determined. FNDC3B has been suggested to function as a positive regulator of adipogenesis.[23] CCM2 has been implicated in abnormal vascular morphogenesis in the brain, leading to cerebral cavernous malformations,[24] but is also expressed in the heart. Although its effect in the coronary arteries has not been investigated, Ccm2 knockdown in mouse brain endothelial cells leads to increased monolayer permeability, decreased tubule formation, and reduced cell migration after wound healing.[25] TRIM5 has been suggested to promote innate immune signaling, and its activity is amplified by retroviral infections.[26] All SNP-gene mechanisms proposed in this article should be tested experimentally. Also, the analyses were restricted to variants available in the Haplotype Reference Consortium imputation panel.
Although this is the largest imputation panel to date, it only comprised SNPs; future fine-mapping efforts are necessary that include non-SNPs as well, such as indels, to cover the additional aspects of the human variation landscape. However, a 95% credible set that contains just 1 potential causal variant per locus provides a first starting point for generating new hypotheses and scientific explorations. In our current work, we validated our previous finding that these genetic variants of CAD also predict the risk of atrial fibrillation, heart failure,[8] and extended it to all-cause death. We also aimed to differentiate between stable CAD and acute myocardial infarction by performing multinomial logistic regression analyses. Most loci were not driven by 1 clinical presentation specifically. However, for 2 previously identified loci (rs9349379 [EDN1] and rs10947789 [KCNK5]), we found statistical evidence that these loci may be driven by acute myocardial infarction and not stable CAD. Also, for this observation, functional hypotheses are to be developed and tested. Our variants might be driven mainly by nonfatal CAD, and different variants might exist for fatal heart disease. Some limitations of the current work are to be acknowledged. This work is based on statistical evidence and does not provide functional experimental validation. The genetic variants identified and the genes prioritized require further direct investigations in future studies to elucidate their role, and function, in the development and progression of CAD. However, in the short term, these data open up new possibilities to improve quantitative measures of genetic risk prediction. 
Recent data suggest that, rather than operating in a deterministic fashion, high genetic risk is modifiable by lifestyle,[27] pharmacotherapy,[28] and the incorporation of genetic risk into shared decision-making sessions with patients.[29]

In conclusion, our GWAS, meta-analyses, and bioinformatic analyses provide several novel insights into the biology of CAD. We report 64 novel loci, link 155 candidate genes, and perform fine mapping of all known and novel loci, providing a credible list of causal genetic variants. With the ever-increasing sample size, our work is the first to indicate that an omnigenic model may be more appropriate than a polygenic model to accommodate the complex genetic architecture of CAD. In addition to providing an expanded view, it suggests that new methods and tools are required to further our understanding of CAD biology through genetics.
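
The notion of a 95% credible set used in the discussion above can be illustrated with a minimal Python sketch. It assumes that per-variant posterior probabilities of causality are already available for a locus and that the locus carries a single causal variant; it is not the fine-mapping pipeline used in this study. The function name `credible_set`, the variant identifiers, and the probabilities are hypothetical and chosen for illustration only.

```python
from typing import Dict, List


def credible_set(posteriors: Dict[str, float], coverage: float = 0.95) -> List[str]:
    """Return the smallest set of variants whose posterior mass reaches `coverage`."""
    if not posteriors:
        raise ValueError("posteriors must not be empty")
    if not 0.0 < coverage <= 1.0:
        raise ValueError("coverage must be in (0, 1]")
    total = sum(posteriors.values())
    if total <= 0.0:
        raise ValueError("posterior probabilities must sum to a positive value")

    # Rank variants from most to least probable (ties broken by ID for determinism).
    ranked = sorted(posteriors.items(), key=lambda kv: (-kv[1], kv[0]))

    selected: List[str] = []
    cumulative = 0.0
    for variant_id, prob in ranked:
        selected.append(variant_id)
        cumulative += prob / total  # normalise so the locus sums to 1
        if cumulative >= coverage - 1e-12:  # small tolerance for rounding error
            break
    return selected


# Hypothetical loci: identifiers and probabilities are made up for illustration.
single_variant_locus = {"variant_A": 0.97, "variant_B": 0.02, "variant_C": 0.01}
diffuse_locus = {"variant_D": 0.50, "variant_E": 0.30, "variant_F": 0.16, "variant_G": 0.04}

print(credible_set(single_variant_locus))  # ['variant_A']
print(credible_set(diffuse_locus))         # ['variant_D', 'variant_E', 'variant_F']
```

In the first hypothetical locus a single variant already carries more than 95% of the posterior mass, which corresponds to the situation described above where the credible set contains just 1 potential causal variant. In the second, three variants are needed, so the set is less specific as a starting point for functional follow-up.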

Acknowledgments

This research has been conducted using the UK Biobank resource under application numbers 12006 and 15031. We thank the CARDIoGRAMplusC4D investigators for making their data publicly available. We also thank the Center for Information Technology of the University of Groningen for their support and for providing access to the Peregrine high-performance computing cluster.

Sources of Funding

N. Verweij is supported by Marie Sklodowska-Curie GF (call: H2020-MSCA-IF-2014; project identifier: 661395) and an NWO VENI grant (016.186.125). We acknowledge the support from the Netherlands Cardiovascular Research Initiative—an initiative with support of the Dutch Heart Foundation, CVON2015-17 EARLY-SYNERGY.

Disclosures

None.

References (29 in total)

1. A common variant on chromosome 9p21 affects the risk of myocardial infarction.

Authors: Anna Helgadottir; Gudmar Thorleifsson; Andrei Manolescu; Solveig Gretarsdottir; Thorarinn Blondal; Aslaug Jonasdottir; Adalbjorg Jonasdottir; Asgeir Sigurdsson; Adam Baker; Arnar Palsson; Gisli Masson; Daniel F Gudbjartsson; Kristinn P Magnusson; Karl Andersen; Allan I Levey; Valgerdur M Backman; Sigurborg Matthiasdottir; Thorbjorg Jonsdottir; Stefan Palsson; Helga Einarsdottir; Steinunn Gunnarsdottir; Arnaldur Gylfason; Viola Vaccarino; W Craig Hooper; Muredach P Reilly; Christopher B Granger; Harland Austin; Daniel J Rader; Svati H Shah; Arshed A Quyyumi; Jeffrey R Gulcher; Gudmundur Thorgeirsson; Unnur Thorsteinsdottir; Augustine Kong; Kari Stefansson
Journal: Science Date: 2007-05-03 Impact factor: 47.728

2. Incorporating a Genetic Risk Score Into Coronary Heart Disease Risk Estimates: Effect on Low-Density Lipoprotein Cholesterol Levels (the MI-GENES Clinical Trial).

Authors: Iftikhar J Kullo; Hayan Jouni; Erin E Austin; Sherry-Ann Brown; Teresa M Kruisselbrink; Iyad N Isseh; Raad A Haddad; Tariq S Marroush; Khader Shameer; Janet E Olson; Ulrich Broeckel; Robert C Green; Daniel J Schaid; Victor M Montori; Kent R Bailey
Journal: Circulation Date: 2016-02-25 Impact factor: 29.690

3. Fad104, a positive regulator of adipogenesis, negatively regulates osteoblast differentiation.

Authors: Keishi Kishimoto; Ayumi Kato; Shigehiro Osada; Makoto Nishizuka; Masayoshi Imagawa
Journal: Biochem Biophys Res Commun Date: 2010-05-20 Impact factor: 3.575

4. Cardiovascular Efficacy and Safety of Bococizumab in High-Risk Patients.

Authors: Paul M Ridker; James Revkin; Pierre Amarenco; Robert Brunell; Madelyn Curto; Fernando Civeira; Marcus Flather; Robert J Glynn; Jean Gregoire; J Wouter Jukema; Yuri Karpov; John J P Kastelein; Wolfgang Koenig; Alberto Lorenzatti; Pravin Manga; Urszula Masiukiewicz; Michael Miller; Arend Mosterd; Jan Murin; Jose C Nicolau; Steven Nissen; Piotr Ponikowski; Raul D Santos; Pamela F Schwartz; Handrean Soran; Harvey White; R Scott Wright; Michal Vrablik; Carla Yunis; Charles L Shear; Jean-Claude Tardif
Journal: N Engl J Med Date: 2017-03-17 Impact factor: 91.245

5. 52 Genetic Loci Influencing Myocardial Mass.

Authors: Pim van der Harst; Jessica van Setten; Niek Verweij; Georg Vogler; Lude Franke; Matthew T Maurano; Xinchen Wang; Irene Mateo Leach; Mark Eijgelsheim; Nona Sotoodehnia; Caroline Hayward; Rossella Sorice; Osorio Meirelles; Leo-Pekka Lyytikäinen; Ozren Polašek; Toshiko Tanaka; Dan E Arking; Sheila Ulivi; Stella Trompet; Martina Müller-Nurasyid; Albert V Smith; Marcus Dörr; Kathleen F Kerr; Jared W Magnani; Fabiola Del Greco M; Weihua Zhang; Ilja M Nolte; Claudia T Silva; Sandosh Padmanabhan; Vinicius Tragante; Tõnu Esko; Gonçalo R Abecasis; Michiel E Adriaens; Karl Andersen; Phil Barnett; Joshua C Bis; Rolf Bodmer; Brendan M Buckley; Harry Campbell; Megan V Cannon; Aravinda Chakravarti; Lin Y Chen; Alessandro Delitala; Richard B Devereux; Pieter A Doevendans; Anna F Dominiczak; Luigi Ferrucci; Ian Ford; Christian Gieger; Tamara B Harris; Eric Haugen; Matthias Heinig; Dena G Hernandez; Hans L Hillege; Joel N Hirschhorn; Albert Hofman; Norbert Hubner; Shih-Jen Hwang; Annamaria Iorio; Mika Kähönen; Manolis Kellis; Ivana Kolcic; Ishminder K Kooner; Jaspal S Kooner; Jan A Kors; Edward G Lakatta; Kasper Lage; Lenore J Launer; Daniel Levy; Alicia Lundby; Peter W Macfarlane; Dalit May; Thomas Meitinger; Andres Metspalu; Stefania Nappo; Silvia Naitza; Shane Neph; Alex S Nord; Teresa Nutile; Peter M Okin; Jesper V Olsen; Ben A Oostra; Josef M Penninger; Len A Pennacchio; Tune H Pers; Siegfried Perz; Annette Peters; Yigal M Pinto; Arne Pfeufer; Maria Grazia Pilia; Peter P Pramstaller; Bram P Prins; Olli T Raitakari; Soumya Raychaudhuri; Ken M Rice; Elizabeth J Rossin; Jerome I Rotter; Sebastian Schafer; David Schlessinger; Carsten O Schmidt; Jobanpreet Sehmi; Herman H W Silljé; Gianfranco Sinagra; Moritz F Sinner; Kamil Slowikowski; Elsayed Z Soliman; Timothy D Spector; Wilko Spiering; John A Stamatoyannopoulos; Ronald P Stolk; Konstantin Strauch; Sian-Tsung Tan; Kirill V Tarasov; Bosco Trinh; Andre G Uitterlinden; Malou van den Boogaard; Cornelia M van Duijn; Wiek H van Gilst; Jorma S Viikari; Peter M Visscher; Veronique Vitart; Uwe Völker; Melanie Waldenberger; Christian X Weichenberger; Harm-Jan Westra; Cisca Wijmenga; Bruce H Wolffenbuttel; Jian Yang; Connie R Bezzina; Patricia B Munroe; Harold Snieder; Alan F Wright; Igor Rudan; Laurie A Boyer; Folkert W Asselbergs; Dirk J van Veldhuisen; Bruno H Stricker; Bruce M Psaty; Marina Ciullo; Serena Sanna; Terho Lehtimäki; James F Wilson; Stefania Bandinelli; Alvaro Alonso; Paolo Gasparini; J Wouter Jukema; Stefan Kääb; Vilmundur Gudnason; Stephan B Felix; Susan R Heckbert; Rudolf A de Boer; Christopher Newton-Cheh; Andrew A Hicks; John C Chambers; Yalda Jamshidi; Axel Visel; Vincent M Christoffels; Aaron Isaacs; Nilesh J Samani; Paul I W de Bakker
Journal: J Am Coll Cardiol Date: 2016-09-27 Impact factor: 24.094

6. Large-scale association analysis identifies 13 new susceptibility loci for coronary artery disease.

Authors: Heribert Schunkert; Inke R König; Sekar Kathiresan; Muredach P Reilly; Themistocles L Assimes; Hilma Holm; Michael Preuss; Alexandre F R Stewart; Maja Barbalic; Christian Gieger; Devin Absher; Zouhair Aherrahrou; Hooman Allayee; David Altshuler; Sonia S Anand; Karl Andersen; Jeffrey L Anderson; Diego Ardissino; Stephen G Ball; Anthony J Balmforth; Timothy A Barnes; Diane M Becker; Lewis C Becker; Klaus Berger; Joshua C Bis; S Matthijs Boekholdt; Eric Boerwinkle; Peter S Braund; Morris J Brown; Mary Susan Burnett; Ian Buysschaert; John F Carlquist; Li Chen; Sven Cichon; Veryan Codd; Robert W Davies; George Dedoussis; Abbas Dehghan; Serkalem Demissie; Joseph M Devaney; Patrick Diemert; Ron Do; Angela Doering; Sandra Eifert; Nour Eddine El Mokhtari; Stephen G Ellis; Roberto Elosua; James C Engert; Stephen E Epstein; Ulf de Faire; Marcus Fischer; Aaron R Folsom; Jennifer Freyer; Bruna Gigante; Domenico Girelli; Solveig Gretarsdottir; Vilmundur Gudnason; Jeffrey R Gulcher; Eran Halperin; Naomi Hammond; Stanley L Hazen; Albert Hofman; Benjamin D Horne; Thomas Illig; Carlos Iribarren; Gregory T Jones; J Wouter Jukema; Michael A Kaiser; Lee M Kaplan; John J P Kastelein; Kay-Tee Khaw; Joshua W Knowles; Genovefa Kolovou; Augustine Kong; Reijo Laaksonen; Diether Lambrechts; Karin Leander; Guillaume Lettre; Mingyao Li; Wolfgang Lieb; Christina Loley; Andrew J Lotery; Pier M Mannucci; Seraya Maouche; Nicola Martinelli; Pascal P McKeown; Christa Meisinger; Thomas Meitinger; Olle Melander; Pier Angelica Merlini; Vincent Mooser; Thomas Morgan; Thomas W Mühleisen; Joseph B Muhlestein; Thomas Münzel; Kiran Musunuru; Janja Nahrstaedt; Christopher P Nelson; Markus M Nöthen; Oliviero Olivieri; Riyaz S Patel; Chris C Patterson; Annette Peters; Flora Peyvandi; Liming Qu; Arshed A Quyyumi; Daniel J Rader; Loukianos S Rallidis; Catherine Rice; Frits R Rosendaal; Diana Rubin; Veikko Salomaa; M Lourdes Sampietro; Manj S Sandhu; Eric Schadt; Arne Schäfer; Arne Schillert; Stefan Schreiber; Jürgen Schrezenmeir; Stephen M Schwartz; David S Siscovick; Mohan Sivananthan; Suthesh Sivapalaratnam; Albert Smith; Tamara B Smith; Jaapjan D Snoep; Nicole Soranzo; John A Spertus; Klaus Stark; Kathy Stirrups; Monika Stoll; W H Wilson Tang; Stephanie Tennstedt; Gudmundur Thorgeirsson; Gudmar Thorleifsson; Maciej Tomaszewski; Andre G Uitterlinden; Andre M van Rij; Benjamin F Voight; Nick J Wareham; George A Wells; H-Erich Wichmann; Philipp S Wild; Christina Willenborg; Jaqueline C M Witteman; Benjamin J Wright; Shu Ye; Tanja Zeller; Andreas Ziegler; Francois Cambien; Alison H Goodall; L Adrienne Cupples; Thomas Quertermous; Winfried März; Christian Hengstenberg; Stefan Blankenberg; Willem H Ouwehand; Alistair S Hall; Panos Deloukas; John R Thompson; Kari Stefansson; Robert Roberts; Unnur Thorsteinsdottir; Christopher J O'Donnell; Ruth McPherson; Jeanette Erdmann; Nilesh J Samani
Journal: Nat Genet Date: 2011-03-06 Impact factor: 38.330

7. Genetic risk, coronary heart disease events, and the clinical benefit of statin therapy: an analysis of primary and secondary prevention trials.

Authors: J L Mega; N O Stitziel; S Kathiresan; M S Sabatine; J G Smith; D I Chasman; M Caulfield; J J Devlin; F Nordio; C Hyde; C P Cannon; F Sacks; N Poulter; P Sever; P M Ridker; E Braunwald; O Melander
Journal: Lancet Date: 2015-03-04 Impact factor: 79.321

8. Biological interpretation of genome-wide association studies using predicted gene functions.

Authors: Tune H Pers; Juha M Karjalainen; Yingleong Chan; Harm-Jan Westra; Andrew R Wood; Jian Yang; Julian C Lui; Sailaja Vedantam; Stefan Gustafsson; Tonu Esko; Tim Frayling; Elizabeth K Speliotes; Michael Boehnke; Soumya Raychaudhuri; Rudolf S N Fehrmann; Joel N Hirschhorn; Lude Franke
Journal: Nat Commun Date: 2015-01-19 Impact factor: 14.919

9. Integrating functional data to prioritize causal variants in statistical fine-mapping studies.

Authors: Gleb Kichaev; Wen-Yun Yang; Sara Lindstrom; Farhad Hormozdiari; Eleazar Eskin; Alkes L Price; Peter Kraft; Bogdan Pasaniuc
Journal: PLoS Genet Date: 2014-10-30 Impact factor: 5.917

10. Identification of 15 novel risk loci for coronary artery disease and genetic risk of recurrent events, atrial fibrillation and heart failure.

Authors: Niek Verweij; Ruben N Eppinga; Yanick Hagemeijer; Pim van der Harst
Journal: Sci Rep Date: 2017-06-05 Impact factor: 4.379
