Genome-wide association studies in myocardial infarction and coronary artery disease.

Pier Mannuccio Mannucci, Luca A Lotta, Flora Peyvandi.

Abstract

Myocardial infarction (MI) and its major determinant, coronary artery disease (CAD), are complex diseases arising from the interaction between several genetic and environmental factors. Until recently, the genetic basis of these diseases was poorly understood. Genome-wide genetic association studies have afforded a comprehensive insight into the association between genetic variants and diseases. To date, seven genome-wide association studies have been conducted in CAD/MI, identifying thirteen genomic regions at which common genetic variants influence the predisposition to these diseases. This review article summarizes the progress achieved in the genetic basis of MI and CAD by means of genome-wide association studies and the potential clinical applications of these findings.

Keywords: Coronary artery disease; Genome-wide association study; Myocardial infarction

Year: 2010 PMID: 23074578 PMCID: PMC3466835

Source DB: PubMed Journal: J Tehran Heart Cent ISSN: 1735-5370

Introduction

Myocardial infarction (MI) and its principal determinant, coronary artery disease (CAD), are major causes of death and disability worldwide.1 MI and CAD are complex diseases arising from the interaction between several genetic and environmental factors. This distinguishes them from monogenic Mendelian disorders (e.g. thalassemia, hemophilia, and cystic fibrosis), in which mutations of a single gene are believed to account for most of the disease phenotype.

Atherothrombotic diseases: The role of genetic risk factors

Epidemiological studies have firmly established an association between several environmental risk factors and the occurrence of CAD/MI.2–6 Smoking, dyslipidemia, diabetes, obesity, physical inactivity, and hypertension are well-recognized risk factors for atherothrombosis and display a similar prevalence across geographical areas and populations of different ethnicity and income level, as highlighted by the INTERHEART study.7 In addition to, and independently of, these risk factors, a positive family history, defined as at least one first-degree relative having developed the disease before the age of 55 years in males or 65 years in females, has been consistently shown in epidemiological studies to correlate with the risk of atherothrombosis. For MI specifically, a positive family history is associated with a risk increase of a magnitude comparable to that of other established risk factors,7–9 and the risk of MI in the siblings of affected patients is 2 to 11 times higher than in the general population.10, 11 In addition, concordance for MI is higher among monozygotic than dizygotic twins.12, 13 A positive family history is also more prevalent in young patients with early-onset MI.14 On the whole, it is well established that genetic factors play an important role in the pathophysiology of MI.

Genetic association studies

A genetic variant is a DNA sequence that differs from the reference sequence of the human genome, completed in 2003 by the Human Genome Project.15 Genetic association studies compare the frequency of genetic variants between subjects with and without a disease, usually in retrospective case-control studies, with the aim of identifying associations between genetic variants and disease predisposition. Identifying variants associated with a given disease can provide pathophysiological insights (identification of biological pathways implicated in disease occurrence), improve the prediction of the risk of developing the disease in the future, and lead to the identification of a subset of diseased individuals who might benefit from a specific medical treatment. Candidate gene studies are the simplest and most common type of genetic association study. The genetic variants investigated in these studies are selected a priori, on the basis of their location in genes that encode proteins with a known function in a biological pathway putatively implicated in the pathophysiology of the disease. For instance, the frequency of variants located in genes implicated in lipoprotein metabolism or hemostasis is compared between cases with MI and healthy controls. Candidate gene studies conducted in the past were affected by several limitations. First, they were usually conducted in rather small case-control cohorts, with a high risk of false-positive findings (type 1 error), so that the results of single studies were seldom replicated. Tellingly, a large replication study published in 2007 failed to replicate any of the 84 previously reported associations between genetic variants and CAD, despite an adequately powered study population.16 Only a handful of candidate gene studies have succeeded in identifying reproducible associations. This is the case, for instance, for studies that established the association between the ɛ4 allele of the gene encoding apolipoprotein E (APOE) and Alzheimer’s disease,17, 18 between the coagulation factor V gene polymorphism called factor V Leiden (an established genetic risk factor for venous thrombosis) and early-onset MI (Spreafico M, Peyvandi F, Foco L, et al. Factor V Leiden, but not prothrombin G20210A, is associated with premature myocardial infarction. Circulation 2008;118:S-956.), and between the haplotypes of the major histocompatibility complex (MHC) and several autoimmune diseases.19 However, because they rest on an a priori hypothesis, candidate gene studies have so far failed to uncover previously unknown disease pathways.
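
The case-control comparison of variant frequencies described above can be illustrated with a small calculation. The following is a minimal sketch, not the method of any study cited here: it assumes a simple allelic test on a 2×2 table of allele counts (Pearson chi-square, 1 degree of freedom, no continuity correction). The counts and the function name `allelic_association` are hypothetical.

```python
import math


def allelic_association(case_risk, case_other, ctrl_risk, ctrl_other):
    """Return (odds_ratio, chi2, p_value) for a 2x2 table of allele counts.

    Assumption: Pearson chi-square test with 1 degree of freedom and
    no continuity correction; all four counts must be non-zero.
    """
    n = case_risk + case_other + ctrl_risk + ctrl_other
    odds_ratio = (case_risk * ctrl_other) / (case_other * ctrl_risk)
    # Shortcut formula for the Pearson chi-square of a 2x2 table.
    numerator = n * (case_risk * ctrl_other - case_other * ctrl_risk) ** 2
    denominator = (
        (case_risk + case_other)
        * (ctrl_risk + ctrl_other)
        * (case_risk + ctrl_risk)
        * (case_other + ctrl_other)
    )
    chi2 = numerator / denominator
    # Upper tail of a chi-square with 1 df: P(X > chi2) = erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return odds_ratio, chi2, p_value


# Hypothetical counts: 2,000 cases and 2,000 controls, i.e. 4,000 alleles each.
odds_ratio, chi2, p_value = allelic_association(2200, 1800, 2000, 2000)
print(f"OR = {odds_ratio:.2f}, chi2 = {chi2:.1f}, p = {p_value:.2e}")
```

With small cohorts the same odds ratio would give a much larger p value, which is one reason why results from small candidate gene studies were hard to replicate.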

Principles of genome-wide association studies

Genome-wide association studies (GWAS) became possible after the publication of the International Haplotype Map Project (HapMap)20, 21 and the development of array-based platforms that enable the investigation of up to one million variants in cases and controls for a given disease (or other phenotypic trait). The HapMap was a large collaborative project that described the frequencies of genetic variants with a minor allele frequency above 5% in four distinct populations: Han Chinese, Japanese, Black African from Nigeria, and Caucasian of European ancestry from the USA.20, 21 GWAS follow a multi-stage procedure.22 First, a complex disease with a high prevalence in the population is chosen (GWAS typically need thousands of participants), provided there is already evidence of a genetic basis for the condition under study. Cases and controls are then recruited, ideally by bias-free procedures that guard against the selection of individuals not representative of the reference population. More rarely, GWAS are conducted within prospective, population-based cohort studies. Genetic variants are then genotyped in both cases and controls, using arrays that assay up to one million variants across the human genome. Extensive quality control procedures, applied at each step of the study, check the integrity and purity of DNA samples, the presence of genotyping errors, the genetic stratification of the study population, and the reliability of the genotyping data. These procedures are critical to reduce type 1 errors, which are common in genetic association studies. Statistical analysis is thereafter performed to identify variants associated with the disease. A variant is considered to be associated with a disease when its frequency differs significantly between cases and controls (Figure 1). Since testing approximately one million variants increases the risk of false-positive findings owing to multiple testing, the conventional statistical significance threshold used to define a true association in GWAS is a p value < 5 × 10⁻⁸ (corresponding to 0.05 after adjustment for one million independent tests).22 In GWAS, only a small number of variants reach this stringent threshold, even when more than 4,000 individuals are investigated. Usually, the first genome-wide screen identifies a number of variants that show suggestive association with the disease (p value < 10⁻⁴) and that subsequently need retesting in other cohorts to confirm or exclude the association (replication stage).
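
The arithmetic behind the genome-wide threshold is a Bonferroni-style adjustment: the conventional 0.05 level divided by the number of independent tests. The sketch below reproduces that calculation and sorts p values into the two tiers mentioned above. The function names and the example p values are hypothetical, and real studies may apply additional criteria.

```python
def bonferroni_threshold(alpha=0.05, n_tests=1_000_000):
    """Per-test significance level that keeps the family-wise error rate at alpha."""
    return alpha / n_tests


GENOME_WIDE = bonferroni_threshold()  # 0.05 / 1,000,000, i.e. about 5e-8
SUGGESTIVE = 1e-4                     # screening threshold quoted in the text


def classify(p_value):
    """Assign a p value to the tiers used in a two-stage GWAS design."""
    if p_value < GENOME_WIDE:
        return "genome-wide significant"
    if p_value < SUGGESTIVE:
        return "suggestive, needs replication"
    return "not significant"


# Hypothetical p values for three variants.
for p in (1e-9, 3e-6, 0.01):
    print(f"p = {p:.0e}: {classify(p)}")
```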

Figure 1.

This Manhattan plot shows the statistical association between single nucleotide polymorphisms (SNPs) and a disease. Dots represent the SNPs, and bands in different shades of grey represent different chromosomes. The figure shows an association between SNPs on chromosome 5 and the disease (open circle).
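
In a Manhattan plot each SNP is conventionally placed at its genomic position on the x-axis and at −log10(p) on the y-axis, so that stronger associations appear as higher points. The sketch below computes those coordinates without drawing anything. The records are invented for illustration and the function name `manhattan_points` is hypothetical.

```python
import math

# Hypothetical (chromosome, position, p value) records, for illustration only.
results = [
    (1, 1_200_000, 0.40),
    (5, 33_000_000, 2e-9),
    (5, 33_050_000, 4e-7),
    (9, 22_000_000, 3e-3),
]


def manhattan_points(records):
    """Convert each p value to -log10(p), the y-axis of a Manhattan plot."""
    return [(chrom, pos, -math.log10(p)) for chrom, pos, p in records]


points = manhattan_points(results)
# The SNP with the largest -log10(p) is the strongest association signal.
top = max(points, key=lambda point: point[2])
print(f"Strongest signal: chromosome {top[0]}, -log10(p) = {top[2]:.2f}")
```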

The first GWAS to be conducted in this way was a large case-control study by the Wellcome Trust Case-Control Consortium (WTCCC), in which 14,000 cases affected by seven of the most common complex diseases (CAD, arterial hypertension, rheumatoid arthritis, Crohn’s disease, bipolar disorder, and diabetes mellitus types I and II) were compared with a set of 3,000 healthy controls.23 The WTCCC study identified 24 genetic variants associated with at least one of these complex diseases and helped to clarify key methodological issues, setting the stage for the more than 400 GWAS that were to follow. These GWAS have so far identified more than 250 loci at which common variants influence the predisposition to common diseases (e.g. diabetes, autoimmune diseases, and several types of cancer), an achievement that far outweighs that of the previous decade of genetic studies. Results are available in the catalogue of published GWAS prepared by the National Cancer Institute (NCI)-National Human Genome Research Institute (NHGRI).24 The genetic variants that can be identified by GWAS are common variants (with at least 5% frequency in the population) and have a small effect size; the conferred relative risks, expressed as odds ratios, usually range between 1.1 and 1.5. These results support the view that the genetic predisposition to common diseases consists of the combined effect of numerous common genetic variants, each of small effect size. However, it should be noted that GWAS identify regions of the genome (loci) rather than variants of specific genes. Indeed, the specific variant(s) identified by GWAS may simply mark the signal of one or more hidden variant(s) not typed on the arrays used. GWAS also have limitations. First, these studies need very large samples of cases and controls. Second, DNA and data quality control procedures and statistical analysis need to be carried out by expert centres. Third, the overall cost of GWAS, ranging from hundreds of thousands to millions of US dollars, is prohibitive for most research groups worldwide. Finally, even after all quality control procedures, there is still a chance that the results of GWAS include false positives, so that independent replication remains important even after testing thousands of individuals.
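
To make the idea of a combined effect of many small-effect variants concrete, the sketch below multiplies per-allele odds ratios across loci, which is equivalent to summing their logarithms. This is an illustrative assumption rather than a model taken from the studies reviewed here: it treats loci as independent and their effects as multiplicative. The odds ratios, the genotype, and the function name `combined_odds_ratio` are hypothetical.

```python
import math


def combined_odds_ratio(per_allele_or, risk_allele_counts):
    """Combine per-allele odds ratios under a multiplicative (log-additive) model.

    Assumption: loci are independent and each risk allele multiplies the odds
    by its per-allele odds ratio.
    """
    log_odds = sum(
        count * math.log(odds)
        for odds, count in zip(per_allele_or, risk_allele_counts)
    )
    return math.exp(log_odds)


# Hypothetical per-allele odds ratios within the 1.1 to 1.5 range quoted above.
odds_ratios = [1.1, 1.2, 1.3, 1.5]
# Number of risk alleles carried at each locus (0, 1 or 2).
carrier = [2, 1, 0, 1]

combined = combined_odds_ratio(odds_ratios, carrier)
print(f"Combined odds ratio relative to a non-carrier: {combined:.2f}")
```

Under these assumptions even a handful of modest variants can add up to a noticeable difference in odds between individuals, although such a score says nothing about how well it predicts disease in practice.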

Results of genome-wide association studies in coronary artery disease/myocardial infarction

Seven GWAS have addressed the relationship between CAD/MI and common genetic variants23, 25–30 and found 13 loci at which common genetic variation alters the predisposition to these diseases. A GWAS of circulating eosinophil counts also identified a 14th locus associated with both eosinophil numbers and MI (SH2B3 at chromosome 12q24, p value for association with MI = 8.6 × 10⁻⁸).31 All MI/CAD risk variants at these loci are characterized by a small effect size coupled with a rather high frequency in the population. Among the loci found by GWAS, there are genomic areas that encompass genes whose mutations cause familial hypercholesterolemia (LDLR and PCSK9).29 The fact that these hypothesis-free scans of the human genome were able to track disease pathways with an established role in the pathophysiology of MI/CAD confirms the reliability and the biological plausibility of the results of GWAS. The reliability of GWAS is further highlighted by the fact that all studies have identified a locus on chromosome 9p21.3 encompassing common variants that affect the risk of MI/CAD. Chromosome 9p21.3 variants are, among genome-wide variants, those with the largest effect on disease risk (odds ratios ranging from 1.28 to 1.47).25–27, 29 These variants have also been successfully replicated in several non-Caucasian populations,32–37 but the mechanisms by which they increase the risk of MI/CAD are still unclear. Interestingly, genetic variants at the same locus, where the genes CDKN2A and CDKN2B are located, alter susceptibility to other arterial diseases such as abdominal aortic aneurysm and intracranial aneurysm, suggesting that chromosome 9p21.3 has a role in the process of arterial wall remodeling.38 A few studies have addressed the potential clinical role of this widely replicated genetic locus, especially for the prediction of cardiovascular outcomes.39–41 Horne et al.,39 in a prospective study evaluating the incidence of MI in a cohort of CAD patients, found no association with incident thrombotic events, suggesting that chromosome 9p21.3 is associated with the development of CAD rather than with the development of MI in CAD patients. Consistent with this, in a population-based prospective cohort study of healthy adults, Ye et al.40 found an association of chromosome 9p21 variants with both the occurrence and the progression of CAD. Yamagishi et al.,41 within the Atherosclerosis Risk in Communities (ARIC) study (a community-based prospective study of apparently healthy adults), also found chromosome 9p21 variants to be associated with prevalent CAD, peripheral artery disease, carotid atherosclerosis, and incident heart failure. Although these findings suggest a potential role for chromosome 9p21 variants as predictive markers for the development and the progression of CAD, it is still to be established whether and to what extent the inclusion of chromosome 9p21 testing in existing prognostic algorithms would result in risk reclassification and improved risk prediction.

Conclusion

GWAS have represented a turning point in the study of genetic predisposition to common complex diseases and other heritable medical traits. The first generation of GWAS is about to be completed. Future genetic research in the field will move in two directions: (1) translation of the results of GWAS into clinical applications, and (2) further expansion of knowledge on the genetic basis of complex diseases. As for the application of genetic knowledge to the clinical field, results of GWAS may be used to refine the prediction of disease risk and to identify new therapeutic targets for drug development. There is unanimous agreement that current knowledge of genetic risk factors is still insufficient for a useful prediction of the risk of common diseases, because the genetic risk variants identified so far explain only a small part of the heritability of the most common diseases. For example, it is estimated that the MI risk variants identified thus far account for as little as 3% of the heritability of MI.29 While for most complex diseases the number of known genetic variants associated with disease risk does not exceed 10, statisticians have estimated that at least 200–300 variants would be required for adequate risk prediction, given the frequency and the disease risk burden of currently known variants.42 The potential of GWAS lies in the unbiased, hypothesis-free identification of genetic loci, and hence of proteins and biological pathways involved in the development of diseases whose pathophysiology has been poorly characterized so far.43 GWAS have indicated a number of new potential therapeutic targets. As for the expansion of our knowledge of the genetics of complex diseases, a second generation of larger GWAS, including tens of thousands of participants, is underway. Large meta-analyses of already published GWAS are also being carried out. Another expanding field is the development of high-throughput techniques for resequencing large parts of the genome (or even the entire genome) at decreasing cost. While the sequencing of the genome in the frame of the Human Genome Project took more than 5 years and 100 million dollars, the newly developed techniques of massively parallel sequencing should enable researchers to obtain an accurate sequence of the genome of an individual in a few weeks for 30,000–150,000 dollars.44, 45 The application of these techniques to the study of complex diseases, although still some way off, holds promise to change current concepts of their genetic architecture. In parallel, the ‘1000 Genomes Project’, which will produce the genome sequence of ∼2000 individuals of different ethnic groups, and the ‘Genotype-Tissue Expression Project’, which will combine genetic data with information on gene expression in different tissues, should provide the tools needed to exploit the large amount of data made available by these new sequencing platforms.

44 in total

1. Initial sequencing and analysis of the human genome.

Authors: E S Lander; L M Linton; B Birren; C Nusbaum; M C Zody; J Baldwin; K Devon; K Dewar; M Doyle; W FitzHugh; R Funke; D Gage; K Harris; A Heaford; J Howland; L Kann; J Lehoczky; R LeVine; P McEwan; K McKernan; J Meldrim; J P Mesirov; C Miranda; W Morris; J Naylor; C Raymond; M Rosetti; R Santos; A Sheridan; C Sougnez; Y Stange-Thomann; N Stojanovic; A Subramanian; D Wyman; J Rogers; J Sulston; R Ainscough; S Beck; D Bentley; J Burton; C Clee; N Carter; A Coulson; R Deadman; P Deloukas; A Dunham; I Dunham; R Durbin; L French; D Grafham; S Gregory; T Hubbard; S Humphray; A Hunt; M Jones; C Lloyd; A McMurray; L Matthews; S Mercer; S Milne; J C Mullikin; A Mungall; R Plumb; M Ross; R Shownkeen; S Sims; R H Waterston; R K Wilson; L W Hillier; J D McPherson; M A Marra; E R Mardis; L A Fulton; A T Chinwalla; K H Pepin; W R Gish; S L Chissoe; M C Wendl; K D Delehaunty; T L Miner; A Delehaunty; J B Kramer; L L Cook; R S Fulton; D L Johnson; P J Minx; S W Clifton; T Hawkins; E Branscomb; P Predki; P Richardson; S Wenning; T Slezak; N Doggett; J F Cheng; A Olsen; S Lucas; C Elkin; E Uberbacher; M Frazier; R A Gibbs; D M Muzny; S E Scherer; J B Bouck; E J Sodergren; K C Worley; C M Rives; J H Gorrell; M L Metzker; S L Naylor; R S Kucherlapati; D L Nelson; G M Weinstock; Y Sakaki; A Fujiyama; M Hattori; T Yada; A Toyoda; T Itoh; C Kawagoe; H Watanabe; Y Totoki; T Taylor; J Weissenbach; R Heilig; W Saurin; F Artiguenave; P Brottier; T Bruls; E Pelletier; C Robert; P Wincker; D R Smith; L Doucette-Stamm; M Rubenfield; K Weinstock; H M Lee; J Dubois; A Rosenthal; M Platzer; G Nyakatura; S Taudien; A Rump; H Yang; J Yu; J Wang; G Huang; J Gu; L Hood; L Rowen; A Madan; S Qin; R W Davis; N A Federspiel; A P Abola; M J Proctor; R M Myers; J Schmutz; M Dickson; J Grimwood; D R Cox; M V Olson; R Kaul; C Raymond; N Shimizu; K Kawasaki; S Minoshima; G A Evans; M Athanasiou; R Schultz; B A Roe; F Chen; H Pan; J Ramser; H Lehrach; R Reinhardt; W R McCombie; M de la Bastide; N Dedhia; H 
Blöcker; K Hornischer; G Nordsiek; R Agarwala; L Aravind; J A Bailey; A Bateman; S Batzoglou; E Birney; P Bork; D G Brown; C B Burge; L Cerutti; H C Chen; D Church; M Clamp; R R Copley; T Doerks; S R Eddy; E E Eichler; T S Furey; J Galagan; J G Gilbert; C Harmon; Y Hayashizaki; D Haussler; H Hermjakob; K Hokamp; W Jang; L S Johnson; T A Jones; S Kasif; A Kaspryzk; S Kennedy; W J Kent; P Kitts; E V Koonin; I Korf; D Kulp; D Lancet; T M Lowe; A McLysaght; T Mikkelsen; J V Moran; N Mulder; V J Pollara; C P Ponting; G Schuler; J Schultz; G Slater; A F Smit; E Stupka; J Szustakowki; D Thierry-Mieg; J Thierry-Mieg; L Wagner; J Wallis; R Wheeler; A Williams; Y I Wolf; K H Wolfe; S P Yang; R F Yeh; F Collins; M S Guyer; J Peterson; A Felsenfeld; K A Wetterstrand; A Patrinos; M J Morgan; P de Jong; J J Catanese; K Osoegawa; H Shizuya; S Choi; Y J Chen; J Szustakowki
Journal: Nature Date: 2001-02-15 Impact factor: 49.962

2. Validation of the Framingham coronary heart disease prediction scores: results of a multiple ethnic groups investigation.

Authors: R B D'Agostino; S Grundy; L M Sullivan; P Wilson
Journal: JAMA Date: 2001-07-11 Impact factor: 56.272

3. Prediction of coronary heart disease using risk factor categories.

Authors: P W Wilson; R B D'Agostino; D Levy; A M Belanger; H Silbershatz; W B Kannel
Journal: Circulation Date: 1998-05-12 Impact factor: 29.690

4. Maternal and paternal history of myocardial infarction and risk of cardiovascular disease in men and women.

Authors: H D Sesso; I M Lee; J M Gaziano; K M Rexrode; R J Glynn; J E Buring
Journal: Circulation Date: 2001-07-24 Impact factor: 29.690

5. Gene dose of apolipoprotein E type 4 allele and the risk of Alzheimer's disease in late onset families.

Authors: E H Corder; A M Saunders; W J Strittmatter; D E Schmechel; P C Gaskell; G W Small; A D Roses; J L Haines; M A Pericak-Vance
Journal: Science Date: 1993-08-13 Impact factor: 47.728

6. Family history as a risk factor for early onset myocardial infarction in young women.

Authors: Y Friedlander; P Arbogast; S M Schwartz; S M Marcovina; M A Austin; F R Rosendaal; A P Reiner; B M Psaty; D S Siscovick
Journal: Atherosclerosis Date: 2001-05 Impact factor: 5.162

7. No evidence of association between prothrombotic gene polymorphisms and the development of acute myocardial infarction at a young age.

Authors:
Journal: Circulation Date: 2003-03-04 Impact factor: 29.690

8. Familial occurrence of coronary heart disease: effect of age at diagnosis.

Authors: A M Rissanen
Journal: Am J Cardiol Date: 1979-07 Impact factor: 2.778

9. Genetic susceptibility to death from coronary heart disease in a study of twins.

Authors: M E Marenberg; N Risch; L F Berkman; B Floderus; U de Faire
Journal: N Engl J Med Date: 1994-04-14 Impact factor: 91.245

10. Apolipoprotein E: high-avidity binding to beta-amyloid and increased frequency of type 4 allele in late-onset familial Alzheimer disease.

Authors: W J Strittmatter; A M Saunders; D Schmechel; M Pericak-Vance; J Enghild; G S Salvesen; A D Roses
Journal: Proc Natl Acad Sci U S A Date: 1993-03-01 Impact factor: 11.205
