Thirty-five common variants for coronary artery disease: the fruits of much collaborative labour.

Abstract

Coronary artery disease (CAD) is the leading cause of death worldwide. Affected individuals cluster in families in patterns that reflect the sharing of numerous susceptibility genes. Genome-wide and large-scale gene-centric genotyping studies that involve tens of thousands of cases and controls have now mapped common disease variants to 34 distinct loci. Some coronary disease common variants show allelic heterogeneity or copy number variation. Some of the loci include candidate genes that imply conventional or emerging risk factor-mediated mechanisms of disease pathogenesis. Quantitative trait loci associations with risk factors have been informative in Mendelian randomization studies as well as in fine-mapping of causative variants. For most loci, however, plausible mechanistic links are uncertain or obscure at present, although they provide potentially novel directions for research into this disease's pathogenesis. The common variants explain ~4% of inter-individual variation in disease risk and no more than 13% of the total heritability of coronary disease. Although many CAD genes are presently undiscovered, it is likely that larger collaborative genome-wide association studies will map further common/low-penetrance variants, and it is hoped that low-frequency or rare high-penetrance variants will also be identified in medical resequencing experiments.

Year: 2011 PMID: 21875899 PMCID: PMC3179381 DOI: 10.1093/hmg/ddr384

Source DB: PubMed Journal: Hum Mol Genet ISSN: 0964-6906 Impact factor: 6.150

INTRODUCTION

Coronary artery disease (CAD) is the most frequent cause of death in high-income countries and the second most common cause of death in medium and low-income countries (1). It most commonly presents clinically in cases of angina pectoris and myocardial infarction (heart attack), which are due to atherosclerotic plaques that develop progressively as we age and occasionally rupture. Genetic epidemiological studies of family history and twin concordance studies are consistent with an underlying multifactorial model of disease susceptibility with a significant polygenic component. This is complemented by genetic analysis of heritable conventional risk factors such as low-density lipoprotein (LDL)-cholesterol and systolic blood pressure, which collectively might explain a minor portion of coronary disease risk (2). Over the past 4 years, researchers have completed several genome-wide association studies (GWASs) to map underlying common susceptibility variants for coronary disease. In parallel with GWASs of other complex diseases, it was soon apparent that typical effect sizes for individual single-nucleotide polymorphisms (SNPs) were fairly small, so large sample sizes would be required for reliable gene mapping. This has encouraged collaboration between individual research groups and led to the formation of consortia to pool the results of GWASs using meta-analysis techniques. Progress has been facilitated by the availability of phased haplotype training sets (notably, from the HapMap project: http://hapmap.ncbi.nlm.nih.gov/downloads/phasing) and the accompanying genotype imputation software (for example, MACH: www.sph.umich.edu/csg/abecasis/MACH/index.html or IMPUTE: mathgen.stats.ox.ac.uk/impute/impute_v2.html). These population genetic resources and statistical genetic tools provide an efficient solution to the fact that individual GWASs are often carried out on different SNP arrays with variable SNP overlap. 
All this effort came to a crescendo this year with the publication of two papers from the CARDIoGRAM (3) and C4D (4) consortia that together scanned nearly 40K coronary disease cases for susceptibility gene signals. Combined with the results from other recent large-scale studies (5–7), 35 common coronary disease variants have been robustly mapped by GWASs (3–7) or gene-centric SNP arrays (8) (Table 1). These results are mostly based on cases and controls of European descent. The C4D (4) and gene-centric (8) discovery experiments included South Asians from Pakistan and India as well as Europeans; these designs were optimally powered to detect variants that were common to both ancestry groups. Wang et al. (7) carried out their GWAS discovery and replication experiments in the Chinese Han population.

Table 1.

Thirty-five common susceptibility variants for coronary artery disease

Chr	Position	Locus^a	SNP	References	EAF	OR	SNP-specific heritability (%), K_p = 2%	SNP-specific heritability (%), K_p = 5%	SNP-specific heritability (%), K_p = 10%
1	55 496 039	PCSK9	rs11206510	MIGen (36)	0.82	1.08	0.03	0.04	0.05
1	56 962 821	PPAP2B	rs17114036	CARDIoGRAM (3)	0.91	1.17	0.07	0.09	0.11
1	109 822 166	SORT1	rs599839	Samani et al. (58), MIGen (36)	0.78	1.11	0.06	0.08	0.10
1	222 823 529	MIA3	rs17465637	Samani et al. (58), MIGen (36)	0.74	1.14	0.12	0.15	0.18
2	44 072 576	ABCG8	rs4299376	HumanCVD (8)	0.29	1.09	0.04	0.05	0.07
2	203 745 885	WDR12	rs6725887	MIGen (36)	0.15	1.14	0.07	0.09	0.11
3	138 119 952	MRAS	rs2306374	Erdmann et al. (59)	0.18	1.12	0.06	0.07	0.09
5	131 867 702	IL5	rs2706399	HumanCVD (8)	0.48	1.02	0.01	0.01	0.01
6	11 774 583	C6orf105	rs6903956^b	Wang et al. (7)	0.07	1.51	0.35	0.45	0.56
6	12 927 544	PHACTR1	rs12526453	MIGen (36)	0.67	1.10	0.06	0.08	0.10
6	35 034 800	ANKS1A	rs17609940	CARDIoGRAM (3)	0.75	1.07	0.03	0.04	0.05
6	134 214 525	TCF21	rs12190287	CARDIoGRAM (3)	0.62	1.08	0.05	0.06	0.07
6	160 961 137	LPA	rs3798220	Clarke et al. (30)	0.02	1.92	0.25	0.32	0.40
6	161 010 118	LPA	rs10455872	Clarke et al. (30)	0.07	1.70	0.57	0.73	0.90
7	107 244 545	7q22	rs10953541	C4D 2011 (4)	0.80	1.08	0.05	0.06	0.08
7	129 663 496	ZC3HC1	rs11556924	CARDIoGRAM (3)	0.62	1.09	0.06	0.07	0.09
8	126 495 818	TRIB1	rs10808546	HumanCVD (8)	0.65	1.04	0.02	0.02	0.02
9	22 098 574	ANRIL/CDKN2BAS	rs4977574	WTCCC (60), McPherson et al. (61), Helgadottir et al. (62), Samani et al. (58), MIGen (36)	0.46	1.29	0.53	0.68	0.84
9	136 154 168	ABO	rs579459	CARDIoGRAM (3), Reilly et al. (6)	0.21	1.10	0.05	0.06	0.08
10	30 335 122	KIAA1462	rs2505083	C4D 2011 (4), Erdmann et al. (5)	0.38	1.07	0.05	0.06	0.08
10	44 775 824	CXCL12	rs1746048	Samani et al. (58), MIGen (36)	0.87	1.09	0.03	0.03	0.04
10	91 002 927	LIPA	rs1412444	C4D 2011 (4)	0.42	1.08	0.05	0.07	0.08
10	104 719 096	CYP17A1-NT5C2	rs12413409	CARDIoGRAM (3)	0.89	1.12	0.04	0.05	0.07
11	103 660 567	PDGFD	rs974819	C4D 2011 (4)	0.32	1.08	0.05	0.06	0.08
11	116 648 917	APOA1-C3-A4-A5	rs964184	CARDIoGRAM (3)	0.13	1.13	0.05	0.07	0.09
12	111 884 608	SH2B3	rs3184504	Soranzo et al. (63)	0.44	1.07	0.04	0.05	0.06
13	110 960 712	COL4A1-A2	rs4773144	CARDIoGRAM (3)	0.44	1.07	0.04	0.05	0.06
14	100 133 942	HHIPL1	rs2895811	CARDIoGRAM (3)	0.43	1.07	0.04	0.05	0.06
15	79 111 093	ADAMTS7	rs4380028	C4D 2011 (4), CARDIoGRAM (3), Reilly et al. (6)	0.60	1.07	0.05	0.06	0.08
17	2 126 504	SMG6-SRR	rs216172	CARDIoGRAM (3)	0.37	1.07	0.03	0.05	0.06
17	17 543 722	PEMT	rs12936587	CARDIoGRAM (3)	0.56	1.07	0.04	0.05	0.06
17	46 988 597	GIP-ATP	rs46522	CARDIoGRAM (3)	0.53	1.06	0.03	0.04	0.04
19	11 163 601	LDLR	rs1122608	MIGen (36)	0.77	1.14	0.10	0.12	0.15
19	45 395 619	APOE	rs2075650	HumanCVD (8)	0.14	1.14	0.07	0.09	0.11
21	35 599 128	MRPS6	rs9982601	MIGen (36)	0.15	1.18	0.11	0.14	0.18
Total							3.30	4.27	5.29

Chr, chromosome; Position, position (in bp) on GRCh37/hg19 (Genome Reference Consortium February 2009); EAF, effect allele frequency; OR, odds ratio; K_p, disease prevalence estimate used for the SNP-specific heritability estimate. SNP-specific heritability estimates are shown for three disease prevalence estimates; the total SNP-encoded heritability for each disease prevalence estimate is shown in the final row.

^a Most locus assignments are provisional based on proximity (see text).

^b Effect allele frequency and odds ratio are given for Chinese Han population.
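As an illustration of how an SNP-specific heritability might be derived from EAF, OR and K_p, the following is a minimal sketch under a liability-threshold model. It assumes Hardy-Weinberg proportions, a multiplicative per-allele odds ratio and unit residual liability variance; the function name and the numerical details are illustrative choices, not the calculation used to produce Table 1, so its results should only be expected to be of the same order as the tabulated values.

```python
from statistics import NormalDist


def snp_liability_heritability(eaf, odds_ratio, prevalence):
    """Approximate share of liability variance explained by one biallelic SNP.

    Assumptions (not taken from the article): Hardy-Weinberg proportions,
    a multiplicative per-allele odds ratio, unit residual liability variance.
    """
    nd = NormalDist()
    q = eaf
    # Genotype frequencies for 0, 1 and 2 copies of the effect allele
    freqs = [(1 - q) ** 2, 2 * q * (1 - q), q ** 2]

    def genotype_risks(base_odds):
        odds = [base_odds * odds_ratio ** g for g in range(3)]
        return [o / (1 + o) for o in odds]

    # Bisection on the baseline odds so that genotype risks average to K_p
    lo, hi = 1e-12, 1e3
    for _ in range(200):
        mid = (lo * hi) ** 0.5  # geometric midpoint suits an odds scale
        mean_risk = sum(f * r for f, r in zip(freqs, genotype_risks(mid)))
        if mean_risk < prevalence:
            lo = mid
        else:
            hi = mid
    risks = genotype_risks((lo * hi) ** 0.5)

    # Genotype-specific mean liabilities, measured relative to the threshold
    means = [-nd.inv_cdf(1 - r) for r in risks]
    grand = sum(f * m for f, m in zip(freqs, means))
    var_g = sum(f * (m - grand) ** 2 for f, m in zip(freqs, means))
    return var_g / (1 + var_g)


# Example: LPA rs10455872 (EAF 0.07, OR 1.70) at the three prevalences
for k_p in (0.02, 0.05, 0.10):
    h2 = snp_liability_heritability(0.07, 1.70, k_p)
    print(f"K_p = {k_p:.2f}: heritability = {100 * h2:.2f}%")
```

Under these assumptions the estimate rises with the assumed prevalence, which is the same qualitative pattern as in the three heritability columns.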

For some of the SNPs, there is circumstantial evidence to highlight an underlying gene. For example, LIPA encodes lipase A, which catalyzes the hydrolysis of cholesteryl esters and triglycerides. The lead CAD risk SNP in LIPA (4,8) is strongly correlated with an expression quantitative trait locus (eQTL) SNP, with the CAD risk allele correlated with increased expression of LIPA mRNA in monocytes (9) and liver (4), suggesting a functional relationship between the disease association signal and this candidate gene. However, for most of the SNPs mapped by GWASs, it is difficult to implicate an underlying gene. Inspection of recombination frequency maps derived from the HapMap project suggests genetic boundaries defined by recombination hotspots, which drives the traditional approach of fine-mapping using haplotype block information (10). However, there is ample evidence that cis-regulatory mechanisms in the soma can operate over tens, hundreds or even millions of base pairs and are presumably unaffected by meiotic recombination (11). It is becoming possible to define functional genomic boundaries based on chromatin architecture-related factors, such as CCCTC-binding factor sites (12) that are found in the vast majority of vertebrate insulator elements. 
We suspect that, for most coronary disease loci, it will take some time to complete the sequence of functional genomics experiments that will be required to confidently attribute disease susceptibility to a specific underlying gene. The loci shown in Table 1 should therefore be interpreted as provisional assignments based mainly on proximity (nearest coding sequence). It is noteworthy that the lists of confidently assigned genes across the recent crop of studies show little overlap, which at first glance seems unlikely to be due to differences in tagging SNP coverage, phenotypic heterogeneity or ancestry (these studies are heavily weighted towards European ancestry). We suspect that it most likely has its roots in the small effects conferred by the susceptibility genes (each allele increases risk by ∼5%). Such small gene effects would, in a GWAS discovery experiment comprising 20K cases and 20K controls, be expected to have only a 5% probability of passing even a modest stage-1 (tentative discovery) threshold of P < 0.00001. Therefore, in absolute terms, this low power to detect an individual gene means that different, and only occasionally overlapping, sets of susceptibility loci are likely to emerge from similarly sized studies (13,14). Of course, even for loci that have surpassed the de facto genome-wide significance threshold of 5 × 10^−8, there is still an appreciable chance that one or more of the 34 loci will prove to be false positives (15). We expect, based on binomial theory, that the number of false-positive associations in Table 1 will be three or fewer. Skol et al. (16) pointed out that a combined analysis of stage 1 discovery data (based on GWAS SNP arrays) and stage 2 validation data (a subset of the most promising SNPs from stage 1) is more efficient than attempting formal replication in stage 2, and such an analysis is now standard practice. 
However, some researchers, mindful of historical difficulties of interpreting complex genetic data (17), prefer to apply a cautious approach to secure robust (i.e. taking into account stage 2 multiple testing penalties) and independent (of GWAS discovery) replication data (3,4). GWAS is a ‘hypothesis-free’ approach to the study of complex diseases, which depends on mutations that occurred unpredictably in previous generations and that, purely by chance (genetic drift) or by (balancing) selection, are now common. As such, the loci that are identified provide a framework for what we know and do not know about pathogenesis from other hypothesis-led (e.g. physiological, biochemical or cell biological) experiments. In the case of coronary disease genetics, we have examples that illustrate complex genetic architectural features such as allelic heterogeneity, pleiotropy, risk factor QTLs, copy number variation (CNV) or synthetic associations, which are discussed below. There are findings that could lead to tangible clinical benefits relatively soon (LPA and SORT1). But mainly there are leads to point researchers in unanticipated directions, at least some of which we hope will provide novel biological insights into how atherosclerotic plaques develop and rupture.
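The power argument made earlier in this section (a per-allele risk increase of ∼5%, 20K cases and 20K controls, a stage-1 threshold of P < 0.00001) can be explored with a simple normal approximation to a two-sided allelic test. This is a rough sketch only: the text does not state the allele frequency or the test used, so the control allele frequencies below and the function name are assumptions.

```python
from statistics import NormalDist


def allelic_test_power(n_cases, n_controls, control_freq, odds_ratio, alpha):
    """Approximate power of a two-sided allelic (2 x 2) association test."""
    nd = NormalDist()
    p0 = control_freq
    # Case allele frequency implied by a per-allele odds ratio
    p1 = odds_ratio * p0 / (1 - p0 + odds_ratio * p0)
    # Each person contributes two alleles
    a1, a0 = 2 * n_cases, 2 * n_controls
    pooled = (a1 * p1 + a0 * p0) / (a1 + a0)
    se = (pooled * (1 - pooled) * (1 / a1 + 1 / a0)) ** 0.5
    ncp = (p1 - p0) / se  # expected value of the z statistic
    z_crit = nd.inv_cdf(1 - alpha / 2)
    return nd.cdf(ncp - z_crit) + nd.cdf(-ncp - z_crit)


# Assumed control allele frequencies; the article does not specify one
for freq in (0.1, 0.2, 0.5):
    power = allelic_test_power(20_000, 20_000, freq, 1.05, 1e-5)
    print(f"control allele frequency {freq}: power = {power:.3f}")
```

Under this approximation, power remains well below 50% at every frequency tried, which is consistent with the point that similarly sized studies would be expected to report different, only occasionally overlapping, sets of loci.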

HOW MUCH HERITABILITY HAS BEEN MAPPED?

Manolio et al. (18) have pointed out that despite the success of GWAS in mapping common susceptibility variants for many multifactorial diseases, collectively these variants typically explain a modest fraction of the total heritability of these conditions. The accuracy of such calculations depends on the fidelity of a series of locus-specific heritability estimates as well as the total (i.e. measured plus unmeasured) heritability. Locus-specific heritability estimates are based on odds ratio estimates that assume that the lead SNP at a locus accurately tags the disease-causing variant. No correction is usually made for potential biases due to the ‘winner’s curse’ (19) or to signal attenuation due to clinically unscreened control data (20). External (to the case–control data) epidemiological information on disease prevalence is required if the multifactorial threshold model is to be used to calculate locus-specific heritabilities. Prevalence estimates can vary substantially with clinical phenotype, sex, and age and are drifting over time with changing environmental risk factor exposures (www.heartstats.org); we suggest that a range of prevalences between 2 and 10% is relevant to coronary disease GWAS case series. Invaluable coronary disease heritability data are derived from the longitudinal study of over 20K twins in the Swedish Twin Registry (Karolinska Institute, Stockholm, Sweden). The total heritability of coronary disease was estimated for angina in men as 39% (95% CI 29–49%) and in women as 43% (8–51%) (21) and for death from coronary disease in men as 57% (45–69%) and in women as 38% (26–50%) (22). Genetic association studies generally use samples collected from survivors of disease, for ethical and other pragmatic reasons. For coronary disease, case series are clinically heterogeneous if they include different diagnostic subgroups (e.g. 
chronic stable angina or myocardial infarction) with subtly differing pathologies that might have an impact on susceptibility. So, we propose that a heritability estimate of 40% will encompass the clinical heterogeneity across typical GWAS case series. Taking all of these issues into account, we estimate that between 8 and 13% of the total heritability of coronary disease can be explained by the 35 common variants (Table 1: the summed SNP-specific heritabilities of 3.30–5.29%, divided by the assumed total heritability of 40%). So, the vast majority of the heritability is currently unexplained. X-linked single-gene disorders have intrinsic advantages for gene mapping, so it is a pity that the X chromosome has sometimes been overlooked in the search for coronary disease susceptibility loci (contrast the CARDIoGRAM study, which was based on imputed data, with C4D, which was based solely on genotype data). This can easily be resolved as appropriate analytic means (i.e. phased haplotype training sets and imputation software) are now freely available and have proved productive in the investigation of the X chromosome in other disease areas [e.g. type 2 diabetes (23)]. Although there was little evidence for non-additive genetic effects in the aforementioned twin studies of coronary disease, we note that classic MZ/DZ concordance studies have very limited power to identify non-additive variance components (24). Moreover, dominance and epistatic variance components are inevitably confounded in this design (25). Consequently, there seems no reason why epistatic and high-penetrance/low-frequency alleles should not explain a portion of the missing heritability. Indeed, some of the linkages detected in earlier affected-sib-pair studies (26,27) might be conferred by low-frequency alleles with or without allelic heterogeneity. 
For example, the locus on chromosome 17p reported by PROCARDIS (27) was associated with a sibling recurrence risk ratio (λsib) of 1.29 that could theoretically be conferred by a dominant, low-frequency (0.5%) allele of intermediate penetrance (17.2%) and with a 1.8% phenocopy rate. Such linkage signals are usually intractable to conventional GWAS based on common SNPs (28) but might be resolved by means of resequencing-based analyses or genotyping arrays with good coverage of low-frequency variants.
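The single-locus example above can be checked with the classical variance-components expression for the sibling recurrence risk ratio, λsib = 1 + (V_A/2 + V_D/4)/K², where K is the population prevalence. The sketch below assumes random mating and a fully dominant allele; the function name is illustrative, and the formula is a standard textbook result rather than necessarily the calculation used in the original report, so the output should be close to, but need not exactly equal, the quoted 1.29.

```python
def sibling_recurrence_ratio(allele_freq, penetrance_carrier, phenocopy_rate):
    """lambda_sib for a fully dominant single-locus model (random mating)."""
    p, q = allele_freq, 1 - allele_freq
    f_carrier, f_non = penetrance_carrier, phenocopy_rate
    carrier_freq = 1 - q ** 2
    prevalence = carrier_freq * f_carrier + (1 - carrier_freq) * f_non

    # Falconer parameterisation: genotypic values +a, d, -a about the midpoint
    a = (f_carrier - f_non) / 2
    d = a  # complete dominance: heterozygote equals risk-allele homozygote
    alpha = a + d * (q - p)         # average effect of allele substitution
    v_add = 2 * p * q * alpha ** 2  # additive variance
    v_dom = (2 * p * q * d) ** 2    # dominance variance
    # Sibs share half the additive and a quarter of the dominance variance
    return 1 + (v_add / 2 + v_dom / 4) / prevalence ** 2


# Values quoted in the text: allele frequency 0.5%, penetrance 17.2%,
# phenocopy rate 1.8%
lam = sibling_recurrence_ratio(0.005, 0.172, 0.018)
print(f"lambda_sib = {lam:.2f}")
```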

SUSCEPTIBILITY ENCODED BY CNV

The development of array-based methods to systematically study CNV has allowed researchers to study the role of this rich source of genetic variation in common multifactorial diseases (29). It is ironic that these high-throughput techniques with genome-wide coverage of CNVs have overlooked an exemplar of common disease susceptibility namely that encoded by the apolipoprotein(a) (LPA) gene. This gene includes a highly variable number of kringle IV-2 repeats (range at least 12–44) which result in numerous isoforms that can be typed by protein electrophoresis (30) or genomically quantified by qPCR (30–32). Two SNPs that tag short isoform alleles that are encoded by relatively low copy numbers of kringle IV-2 sequences show strong associations with high lipoprotein(a) levels and with coronary disease risk (OR= ∼1.5) (30). These SNPs were not included on commonly used GWAS SNP arrays (but were fortuitously included in the design of the HumanCVD gene-centric SNP array) and have been recalcitrant to genotype imputation due to their frequency or linkage disequilibrium properties. Consequently, they have not been assessed in GWAS meta-analyses of coronary disease susceptibility.

ALLELIC HETEROGENEITY AND PLEIOTROPY

Allelic heterogeneity is a regular feature of complex diseases and traits. For example, multiple independent signals were detected at 19 of 180 height QTL (33). Researchers have systematically scanned for secondary association signals of coronary disease (by conditioning on the lead SNP at each locus), an approach that identified multiple independent SNP signals in LPA (30). The coronary disease associations in PCSK9 detected by the non-synonymous R46L SNP rs11591147 (34,35) and a non-coding SNP rs11206510 (36) appear to be independent as the two SNPs show little linkage disequilibrium (r² = 0.04 in PROCARDIS Human CVD data). A possible example of allelic heterogeneity arises for SNPs rs3825807 (3) and rs1994016 (6) that map to alternate flanks of the ADAMTS7 gene on chromosome 15 to rs4380028 (4) and are in moderate linkage disequilibrium (r² ≈ 0.50). Coronary disease shows substantial clinical heterogeneity that is reflected in morphological differences in the atherosclerotic plaques (37) that might in turn reflect differences in inherited susceptibilities. Plaque rupture and subsequent coronary thrombosis cause acute coronary syndromes such as myocardial infarction, thereby motivating searches for genes that might influence plaque stability. Reilly et al. (6) undertook a GWAS of coronary disease patients with angiographic disease, contrasting those cases that had suffered myocardial infarction with those who had not. This study mapped a novel association to the ABO blood group system with SNPs that strongly tag the O allele. The CARDIoGRAM consortium (3), which studied a mixture of coronary disease cases of which two-thirds had suffered a myocardial infarction and a mixture of screened and unscreened controls, also mapped a susceptibility signal to the ABO system. Their lead SNP is only in moderate linkage disequilibrium (r² = 0.39) with the Reilly et al. (6) signal, so the two signals may reflect allelic heterogeneity or pleiotropy. 
We must await further fine-mapping studies of the ABO and ADAMTS7 loci to fully understand the details of these associations.

QTL MAPPING AND CONVENTIONAL RISK FACTORS

Conventional risk factors for coronary disease such as circulating lipid levels and blood pressure are heritable (quantitative) traits. There has been much effort in mapping common QTL for these traits using GWAS or other large-scale SNP arrays in population-based as well as case–control samples (38–40). Technical difficulties such as uncontrolled fasting status or on-treatment measurements have been largely overcome to produce a rich crop of risk factor QTL. Notable examples of overlap with coronary disease loci (Table 2) include TRIB1 (Drosophila tribbles homologue, a gene that interacts with the mitogen-activated protein kinase cascade), which has pleiotropic effects on circulating triglyceride, LDL- and HDL-cholesterol levels (40), and CYP17A1 (17-α hydroxylase gene involved in steroid hormone metabolism), which is a systolic blood pressure QTL (38,39). However, most (22 of 34) of the coronary disease loci do not show convincing risk factor QTL effects (Table 2). It may be that novel heritable intermediate phenotypes will eventually be identified that will explain some of the disease associations; this should lead to informative insights into pathological mechanisms. Indeed, expectations that coronary genes would be involved in innate immunity or thrombosis have not been borne out by large-scale genetic association studies to date.

Table 2.

Risk factor QTL and coronary artery disease

Locus^a	QTL	Lead QTL SNP	CAD Risk SNP	r²	Reference(s)
PCSK9	LDL, TC	rs2479409	rs11206510	<0.30	Kathiresan et al. (64), Teslovich et al. (40)
PPAP2B			rs17114036
SORT1	LDL, TC	rs646776	rs599839	0.91	Kathiresan et al. (64), Teslovich et al. (40)
MIA3			rs17465637
ABCG8	LDL	rs4299376	rs4299376	1.00	Teslovich et al. (40)
WDR12			rs6725887
MRAS			rs2306374
IL5			rs2706399
C6orf105			rs6903956*
PHACTR1			rs12526453
ANKS1A			rs17609940
TCF21			rs12190287
LPA	Lp(a)	rs10455872	rs10455872	1.00	Clarke et al. (30)
LPA	Lp(a)	rs3798220	rs3798220	1.00	Clarke et al. (30)
LPA	LDL, TC	rs1564348	rs10455872	<0.30	Teslovich et al. (40)
LPA	HDL	rs1084651	rs10455872	NA^b	Teslovich et al. (40)
7q22			rs10953541
ZC3HC1			rs11556924
TRIB1	TG, TC, LDL, HDL	rs2954029	rs10808546	0.96	Kathiresan et al. (64), Teslovich et al. (40)
ANRIL/CDKN2BAS			rs4977574
ABO	LDL, TC	rs9411489	rs579459	<0.30	Teslovich et al. (40)
KIAA1462			rs2505083
CXCL12			rs1746048
LIPA			rs1412444
CYP17A1-NT5C2	blood pressure	rs11191548	rs12413409	1.00	Newton-Cheh et al. (38)
PDGFD			rs974819
APOA1-C3-A4-A5	TG, HDL	rs964184	rs964184	1.00	Kathiresan et al. (64), Teslovich et al. (40)
SH2B3	blood pressure	rs3184504	rs3184504	1.00	Levy et al. (39)
COL4A1-A2			rs4773144
HHIPL1			rs2895811
ADAMTS7			rs4380028
SMG6-SRR			rs216172
PEMT			rs12936587
GIP-ATP			rs46522
LDLR	LDL, TC	rs6511720	rs1122608	<0.30	Teslovich et al. (40)
APOE	LDL, TC, HDL	rs4420638	rs2075650	0.40	Kathiresan et al. (64), Teslovich et al. (40)
MRPS6			rs9982601

r², measure of linkage disequilibrium between the lead QTL SNP and the lead risk SNP; LDL, LDL-cholesterol; HDL, HDL-cholesterol; TC, total cholesterol; TG, triglycerides.

^a Most locus assignments are provisional based on proximity (see text).

^b SNP rs1084651 had a >5% genotyping failure rate in the HapMap database Rel23.

MOVING FROM ASSOCIATED LOCUS TO CAUSATIVE GENE

Robust assignments of common susceptibility variants are to be applauded, but it may take some time to resolve the underlying molecular genetic mechanisms. For instance, the first coronary disease locus to confidently emerge from GWAS mapped to a region of chromosome 9p21 that was initially believed to be a gene desert. However, it was quickly recognized in fine-mapping studies (10) that the associated region, which was of prior interest to cancer genetics researchers (41,42), was potentially linked to neighbouring cyclin-dependent kinase inhibitor genes CDKN2A (which has multiple synonyms including p16, see www.genecards.org/cgi-bin/carddisp.pl?gene=CDKN2A for more details) and CDKN2B (p15, see www.genecards.org/cgi-bin/carddisp.pl?gene=CDKN2B). Subsequent studies of murine models have highlighted a Cdkn2a/b-mediated mechanism involving smooth muscle cell proliferation (43). Studies of human transcription enhancer elements propose that the large non-coding antisense RNA molecule CDKN2BAS, which is also known by the monikers ANRIL and CDKN2B-AS1, is involved in the long-range transcriptional regulation of several genes, including CDKN2A and CDKN2B, in vascular endothelial cell lines (44). Another notable success story followed the overlap of LDL-cholesterol QTL and coronary disease association signals on chromosome 1p. Here eQTL were particularly informative in distinguishing the role of SORT1, which encodes sortilin, from that of its neighbours (45). Parallel functional studies have implicated sortilin as a novel regulator of lipoprotein production in the liver (46), thereby providing a mechanistic link to the coronary disease susceptibility, although some details of the mechanism need to be reconciled.

DISCREPANCIES BETWEEN MEASURED AND PREDICTED GENETIC RISK

Using results from proxy SNPs that are in almost complete linkage disequilibrium with each other, the CARDIoGRAM and C4D studies provide a joint SORT1 per-allele risk estimate equal to 1.12 (1.09–1.15). The measured per-allele QTL effect on LDL-cholesterol is equal to 0.145 mmol/l (0.135–0.155) (40). Substituting the latter QTL effect into the Framingham coronary heart disease risk equation (47) predicts a per-allele relative risk equal to 1.042 (1.039–1.045). So, the coronary disease risk estimate derived from GWAS is substantially higher than that predicted from the effect on LDL-cholesterol levels derived from long-term prospective studies. A similar discrepancy was noted in a genotype risk score analysis of the Malmö Diet and Cancer study (48). Given the underlying sample sizes (>100K for the lipid QTL GWAS experiment) and the consequent precision of the risk factor effect sizes, it seems unlikely that these have been systematically underestimated. It is possible that the disease risk estimates have been overestimated [e.g. winner’s curse (19)] and/or the disease associations are only partially mediated through the accompanying risk factor QTL effect (i.e. pleiotropy). Moreover, the Framingham risk equations, which were derived from US population data from the 1970s onwards, were based on single (baseline) cholesterol measurements. Cholesterol measurements are subject to short-term (e.g. variable fasting) as well as long-term variation (e.g. changes in diet). Such within-individual variation can result in the systematic underestimation of the strength of a risk factor association with disease, an epidemiological effect known as regression dilution bias (49). Whatever the explanation, these findings emphasize that genetic epidemiological inferences from cross-sectional data need cautious interpretation and that information from prospective studies (e.g. UK BIOBANK, www.ukbiobank.ac.uk) will be very informative.
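The size of the SORT1 discrepancy is easiest to see on the log scale. The short sketch below uses only the three point estimates quoted above; the per-mmol/l coefficient is back-calculated from those numbers and is not the published Framingham coefficient, and treating the GWAS odds ratio as comparable to a relative risk is an additional simplifying assumption.

```python
from math import exp, log

observed_or = 1.12       # joint SORT1 per-allele risk estimate (from the text)
predicted_rr = 1.042     # Framingham-based per-allele prediction (from the text)
ldl_effect_mmol = 0.145  # per-allele LDL-cholesterol effect, mmol/l (from the text)

# Log relative risk per 1 mmol/l LDL-cholesterol implied by the prediction.
# Back-calculated from the two reported numbers; an assumption of this sketch.
implied_log_rr_per_mmol = log(predicted_rr) / ldl_effect_mmol

# How many times larger the observed log effect is than the predicted one
excess_ratio = log(observed_or) / log(predicted_rr)

print(f"implied RR per mmol/l LDL-cholesterol: {exp(implied_log_rr_per_mmol):.2f}")
print(f"observed / predicted log effect: {excess_ratio:.1f}")
```

On this scale the observed per-allele effect is between two and three times the predicted effect, which is the gap that explanations such as winner's curse, pleiotropy and regression dilution bias would need to account for.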

GENETIC INSIGHTS FOR EMERGING RISK FACTORS

Common variants that show quantitative genetic variation for disease risk factors or intermediate phenotypes can probe the putative causal relationship between risk factor and disease. This Mendelian randomization (MR) information (50) is particularly useful when there are no drugs that specifically modulate the exposure. For instance, Lp(a) is an LDL particle that appears to be proatherogenic in cross-sectional and prospective epidemiological studies. But drugs such as niacin that reduce Lp(a) concentrations also beneficially increase HDL levels. So, it can be difficult in randomized clinical trials (RCT) to unambiguously attribute any clinical benefit to specific mechanisms. Following a large-scale candidate gene study, two SNPs in the apolipoprotein(a) gene were shown to tag short isoform alleles that were strikingly associated with raised Lp(a) concentrations and coronary disease risk (30) (also discussed in the CNV and allelic heterogeneity sections above). Simultaneous modelling of disease risk and quantitative genetic variation was consistent with a direct causal link, thus predicting that pharmacological lowering of Lp(a) levels will be beneficial to patients. Loci that carry triglyceride QTL such as TRIB1 and APOA1-C3-A4-A5 also show QTL effects for HDL and LDL. Consequently, these pleiotropic loci will not be useful for MR probing of the role of triglyceride, a well-studied lipid that is presently not routinely included in cardiovascular risk calculations.

CANDIDATE AND POSITIONALLY CLONED GENES

Before the GWAS epoch, there was much effort expended in scanning candidate genes, those genes with known or predicted functions that might be involved in coronary disease pathogenesis, for susceptibility variants. This research was supplemented by positional cloning experiments that were unbiased in terms of gene candidature (51). In comparison with the levels of statistical support required for GWAS, the evidence for most of the candidate genes was modest despite meta-analyses involving up to 36K subjects (52). For instance, genetic variation in the apolipoprotein E gene (APOE) has well-known effects on LDL-cholesterol, and APOE is a highly plausible coronary disease candidate gene. However, a meta-analysis of 17 studies with at least 500 cases that included 21 331 cases and 47 467 controls estimated the risk of carrying the protective ɛ2 allele (versus ɛ3/ɛ3 homozygotes) as 0.80 (95% CI 0.70–0.90) (53); the significance of this association is approximately P = 0.0005, four orders of magnitude short of the genome-wide significance threshold. GWAS, even when enhanced by genotype imputation, may not accurately tag specific candidate gene variants [e.g. LPA and rs10455872 (30)]. So, the absence of a GWAS signal cannot be assumed to negate prior candidate or positionally cloned genes (8). Indeed, the design of the HumanCVD gene-centric SNP array has revealed several novel loci (LIPA, IL5, TRIB1 and ABCG5/ABCG8) as well as finally robustly confirming the candidature of APOE (8).
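The approximate P-value quoted for the APOE ɛ2 meta-analysis can be recovered from the point estimate and its confidence interval. The sketch below assumes that the interval is symmetric on the log scale and that a two-sided Wald test applies; the function name is illustrative.

```python
from math import log
from statistics import NormalDist


def p_value_from_ci(estimate, lower, upper, level=0.95):
    """Two-sided Wald P-value for a ratio estimate, given its confidence interval."""
    nd = NormalDist()
    z_ci = nd.inv_cdf(1 - (1 - level) / 2)
    # Standard error of the log ratio, from the width of the interval
    se = (log(upper) - log(lower)) / (2 * z_ci)
    z = log(estimate) / se
    return 2 * nd.cdf(-abs(z))


p = p_value_from_ci(0.80, 0.70, 0.90)
print(f"P = {p:.1e}")
print(f"ratio to the 5e-8 genome-wide threshold: {p / 5e-8:.0e}")
```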

FUTURE DIRECTIONS AND PROSPECTS

Models of the genetic architecture of complex traits (33) predict that large numbers of small effect susceptibility loci remain to be discovered, some of which should be tractable to well-powered GWAS. The momentum amongst researchers to meta-analyse GWAS data will be sustained as larger consortia (e.g. the recently merged CARDIoGRAMplusC4D consortium) are formed. They can take full advantage of initiatives such as the Metabochip project, a custom array containing 196 725 SNPs that builds on the CARDIoGRAM stage-1 discovery results. So, it is reasonable to expect that the number of common coronary disease variants will increase, although as each novel variant will be associated with increasingly tiny effects, it seems that the missing heritability gap will never be filled by common variants alone. The compilation of an exhaustive list of human genetic variation through the dbSNP (http://www.ncbi.nlm.nih.gov/projects/SNP) and HapMap (http://hapmap.ncbi.nlm.nih.gov/) projects is being systematically expanded by the 1000 Genomes project (http://www.1000genomes.org). This is particularly useful for low-frequency variants (minor allele frequency <5%), which were largely absent from the early genome-wide SNP arrays, which were designed to type common variants in GWAS. Synthetic associations due to tagging of rare variants by common SNPs or haplotypes composed of common SNPs can occasionally be detected (54); this is particularly useful if the rare variant might disrupt gene function (e.g. non-synonymous SNP). Consequently, analyses of imputed genotypes derived from the 1000 Genomes project are an immediate research priority. This is encouraged by the detection of a haplotype association with coronary disease (55) that subsequently was partially explained as a synthetic association to an LPA SNP (30). 
Imputation-based analysis will be complemented by whole-genome or exome resequencing experiments aimed at identifying low-frequency variants and unique high-penetrance mutations; together, these approaches have the potential to reveal novel disease mechanisms. Finally, as the list of mapped disease variants expands, and fine-mapping and functional genomic studies refine loci to resolve underlying genes, pathway and network analysis should prove useful to provide systems-level insights into coronary disease pathogenesis (56,57).

Conflict of Interest statement. None declared.

FUNDING

This work was supported by the European Community Sixth Framework Program (LSHM-CT-2007-037273), the British Heart Foundation, the Oxford BHF Centre of Research Excellence and the Wellcome Trust (090532/Z/09/Z). Funding to pay the Open Access publication charges for this article was provided by the Wellcome Trust.

62 in total

1. Genetic influences on CHD-death and the impact of known risk factors: comparison of two frailty models.

Authors: Slobodan Zdravkovic; Andreas Wienke; Nancy L Pedersen; Marjorie E Marenberg; Anatoli I Yashin; Ulf de Faire
Journal: Behav Genet Date: 2004-11 Impact factor: 2.805

2. Design of case-controls studies with unscreened controls.

Authors: V Moskvina; P Holmans; K M Schmidt; N Craddock
Journal: Ann Hum Genet Date: 2005-09 Impact factor: 1.670

3. Joint analysis is more efficient than replication-based analysis for two-stage genome-wide association studies.

Authors: Andrew D Skol; Laura J Scott; Gonçalo R Abecasis; Michael Boehnke
Journal: Nat Genet Date: 2006-01-15 Impact factor: 38.330

Review 4. Genetic susceptibility to coronary artery disease: from promise to progress.

Authors: Hugh Watkins; Martin Farrall
Journal: Nat Rev Genet Date: 2006-03 Impact factor: 53.242

Review 5. The regulation of INK4/ARF in cancer and aging.

Authors: William Y Kim; Norman E Sharpless
Journal: Cell Date: 2006-10-20 Impact factor: 41.582

6. A genomewide linkage study of 1,933 families affected by premature coronary artery disease: The British Heart Foundation (BHF) Family Heart Study.

Authors: Nilesh J Samani; Paul Burton; Massimo Mangino; Stephen G Ball; Anthony J Balmforth; Jenny Barrett; Timothy Bishop; Alistair Hall
Journal: Am J Hum Genet Date: 2005-10-25 Impact factor: 11.025

7. Genetic effects versus bias for candidate polymorphisms in myocardial infarction: case study and overview of large-scale evidence.

Authors: Evangelia E Ntzani; Evangelos C Rizos; John P A Ioannidis
Journal: Am J Epidemiol Date: 2007-02-10 Impact factor: 4.897

8. Characterization of a germ-line deletion, including the entire INK4/ARF locus, in a melanoma-neural system tumor family: identification of ANRIL, an antisense noncoding RNA whose expression coclusters with ARF.

Authors: Eric Pasmant; Ingrid Laurendeau; Delphine Héron; Michel Vidaud; Dominique Vidaud; Ivan Bièche
Journal: Cancer Res Date: 2007-04-15 Impact factor: 12.701

9. Genome-wide mapping of susceptibility to coronary artery disease identifies a novel replicated locus on chromosome 17.

Authors: Martin Farrall; Fiona R Green; John F Peden; Per G Olsson; Robert Clarke; Mai-Lis Hellenius; Stephan Rust; Jacob Lagercrantz; Maria Grazia Franzosi; Helmut Schulte; Alisoun Carey; Gunnar Olsson; Gerd Assmann; Gianni Tognoni; Rory Collins; Anders Hamsten; Hugh Watkins
Journal: PLoS Genet Date: 2006-05-19 Impact factor: 5.917

10. Genome-wide association study of 14,000 cases of seven common diseases and 3,000 shared controls.

Authors: Wellcome Trust Case Control Consortium
Journal: Nature Date: 2007-06-07 Impact factor: 49.962
