
Genetics and the general physician: insights, applications and future challenges.

Abstract

Scientific and technological advances in our understanding of the nature and consequences of human genetic variation are now allowing genetic determinants of susceptibility to common multifactorial diseases to be defined, as well as our individual response to therapy. I review how genome-wide association studies are robustly identifying new disease susceptibility loci, providing insights into disease pathogenesis and potential targets for drug therapy. Some of the remarkable advances being made using current genetic approaches in Crohn's disease, coronary artery disease and atrial fibrillation are described, together with examples from malaria, HIV/AIDS, asthma, prostate cancer and venous thrombosis which illustrate important principles underpinning this field of research. The limitations of current approaches are also noted, highlighting how much of the genetic risk remains unexplained and how difficult it remains to resolve specific functional variants. There is a need to more clearly understand the significance of rare variants and structural genomic variation in common disease, as well as epigenetic mechanisms. Specific examples from pharmacogenomics are described, including warfarin dosage and prediction of abacavir hypersensitivity, that illustrate how in some cases such knowledge is already impacting on clinical practice, while in others prospective evaluation of clinical utility and cost-effectiveness is required to define opportunities for personalized medicine. There is also a need for a broader debate about the ethical implications of current advances in genetics for medicine and society.

Year: 2009 PMID: 19737788 PMCID: PMC2766102 DOI: 10.1093/qjmed/hcp115

Source DB: PubMed Journal: QJM ISSN: 1460-2393

Introduction

The translation of recent advances in our understanding of the genetic basis of common multifactorial diseases into clinical practice remains limited. However, the extraordinary pace of change in human genetics means that this field of research is now starting to challenge how we understand and manage disease, with opportunities for new insights into pathogenesis, drug development and the tailoring of clinical care for the individual patient. This review provides an introduction to the nature of human genetic variation and its functional consequences for disease. Recent insights into the role of genetic diversity in a number of important common diseases serve to illustrate both the advances achieved to date and the challenges that lie ahead.

Approaches to defining genetic determinants of common disease

Linkage and association

Considerable success was achieved using linkage analysis and positional cloning (for a definition of these and other genetic terms, see Glossary in Appendix 1) to identify rare variants with high penetrance responsible for diseases showing a mendelian pattern of inheritance such as cystic fibrosis and haemochromatosis.[1,2] In contrast, progress in defining genetic susceptibility loci in common multifactorial diseases remained frustratingly slow until the advent of genome-wide association studies in 2005. Prior to this time, the application of a linkage-based approach to common ‘complex’ traits was recognized to be of limited value as multiple genetic loci were likely to be involved in conjunction with environmental factors; moreover, in contrast to mendelian disorders, the underlying genetic variants were of low penetrance, relatively high allele frequency and typically associated with a modest magnitude of effect. Despite this, there were some notable successes involving linkage studies such as in Crohn's disease with the demonstration of the important contribution of nucleotide-binding oligomerization domain containing 2 (NOD2) variants as will be described shortly. Studies seeking evidence of genetic association in common diseases using a candidate gene approach were in general underpowered to detect variants associated with a modest magnitude of effect and were not able to dissect the complex underlying genomic architecture of coinherited variants. This was incompletely understood until recently and typically studies analysed only a limited number of single-nucleotide polymorphisms (SNPs) as genetic markers, making resolution of specific variants difficult. These factors, together with the complexities of gene–gene and gene–environment interactions, meant that the results of candidate gene association studies were often difficult to interpret and replicate. 
There were, however, some notable successes achieved through a candidate gene approach with gene selection based on biological plausibility; for example, important genetic determinants of susceptibility to human immunodeficiency virus 1 (HIV-1) and malaria were resolved, together with resistance to new variant Creutzfeldt–Jakob prion disease and norovirus diarrhoea. An important component of many of these discoveries was initial insight based on microbial challenge experiments or observation of extreme clinical phenotypes.

Genetic variation and application of genome-wide association studies

The sequencing of the human genome provided a route map to catalogue and interpret genetic variation among individuals.[12,13] The scale of diversity is vast, ranging from microscopically visible variation at the level of individual chromosomes to variation at the deoxyribonucleic acid (DNA) sequence level with nucleotide substitutions, insertions and deletions, including >10 million SNPs (dbSNP, http://www.ncbi.nlm.nih.gov/projects/SNP/) (Figure 1). Submicroscopic structural variation, notably copy number variation, is increasingly recognized and plays a critical role in a number of different diseases. A number of major collaborative studies, such as the International HapMap Project, aim to understand the genomic architecture of human genetic variation based on very extensive genotyping of common SNPs among different ethnically diverse human populations (http://www.hapmap.org).[18,19] This work, together with other studies, enabled a much clearer understanding of how sequence variants are coinherited, with important advances in understanding linkage disequilibrium and haplotypic structure (Figure 2).[20,21]
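Linkage disequilibrium between two SNPs is commonly summarized by the statistics D, D′ and r². The following is a minimal sketch of how these can be computed from haplotype and allele frequencies; the function name and the example frequencies are illustrative assumptions and are not taken from the HapMap data or from any study cited here.

```python
def ld_stats(p_ab, p_a, p_b):
    """Return (D, D', r^2) for two biallelic SNPs.

    p_ab: frequency of the haplotype carrying allele A at SNP 1 and allele B at SNP 2
    p_a:  frequency of allele A at SNP 1
    p_b:  frequency of allele B at SNP 2
    All inputs are hypothetical population frequencies between 0 and 1.
    """
    # D measures the departure of the observed haplotype frequency
    # from that expected if the two alleles were inherited independently.
    d = p_ab - p_a * p_b

    # D' rescales D by the largest value it could take given the allele frequencies.
    if d >= 0:
        d_max = min(p_a * (1 - p_b), (1 - p_a) * p_b)
    else:
        d_max = min(p_a * p_b, (1 - p_a) * (1 - p_b))
    d_prime = d / d_max if d_max > 0 else 0.0

    # r^2 is the squared correlation between the alleles at the two SNPs.
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    r2 = d * d / denom if denom > 0 else 0.0
    return d, d_prime, r2


# Illustrative case: the two minor alleles always occur together.
d, d_prime, r2 = ld_stats(p_ab=0.2, p_a=0.2, p_b=0.2)
print(round(d_prime, 3), round(r2, 3))
```

When r² is close to 1, genotyping either SNP captures essentially the same information, which is why a disease association with one cannot by itself be distinguished from an association with the other.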

Figure 1. Classes of genetic variation. Classification of genetic variation based on size according to Scherer and colleagues with illustrations of a single-nucleotide substitution, a copy number variant and presence of an additional chromosome. Reproduced and adapted by permission from Macmillan Publishers Ltd: Nat Genet copyright (2007) and Nat Rev Genet (2004); and http://www.sanger.ac.uk/humgen/cnv/.

Figure 2. Linkage disequilibrium and haplotypes. (A) For a C>T SNP, the minor T allele shows association with cases of disease, being present twice as often as in controls. (B) However, when further variants are genotyped, a second G>C SNP shows perfect linkage disequilibrium with the initial disease associated SNP: everyone who has the T allele of the first SNP always has the C allele of the second SNP and it is not possible to say whether the disease association is due to the first or second SNP. (C) For four SNPs there are 16 possible combinations in which they could theoretically occur together in a population. However, variants occur together more often than expected by chance and the observed coinherited blocks of variants are described as haplotypes, of which three are present in this theoretical population sample. (D) Haplotype blocks break down at sites of recombination. Understanding the haplotype block structure of the genome across populations is highly informative for population genetics studies.

This has provided a remarkable resource of sequence level diversity in the form of common SNP markers and showed how considerable economies of scale could be achieved based on genotyping informative SNPs (sometimes described as ‘tag SNPs’) which define particular haplotypes. 
This, together with major advances in the technologies needed for accurate, affordable and high throughput genotyping, has enabled the successful application of genome-wide association studies in a number of common diseases, ranging from early successes with age-related macular degeneration to type 1 diabetes and Crohn's disease. There have also been very significant advances in our ability to statistically analyse and interpret such data, exemplified by the Wellcome Trust Case Control Consortium (WTCCC) study of seven common diseases. This study highlighted the utility in a UK population of common controls for multiple diseases as well as the ability of genome-wide genotyping data to resolve potential population stratification. The WTCCC also showed the value of statistical approaches to infer (by a process described as imputation) additional genotypes based on the genomic architecture from the HapMap studies to allow finer resolution of observed associations. Growing numbers of genome-wide association studies have underlined the importance of sample size to ensure adequately powered studies able to detect small effect sizes; of replication in independent samples, given the issues relating to statistical correction for multiple testing when hundreds of thousands of SNP markers are analysed for evidence of association; and of the persisting ‘winner's curse’ in association studies, in which initial studies will tend to overestimate the magnitude of association. Clear, minimally heterogeneous phenotypes and substantially increased sample sizes involving tens of thousands of cases together with meta-analysis are currently advocated to allow detection of smaller odds ratios (typically <1.5).
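The scale of the multiple testing problem can be illustrated with a Bonferroni adjustment, one simple and conservative form of correction. The sketch below is an illustration only: the marker count of 500 000 is an assumed round number, and Bonferroni is not presented here as the method used by the WTCCC or any other study described in this review.

```python
def bonferroni_threshold(alpha, n_tests):
    """Per-test significance threshold that keeps the family-wise error rate at alpha."""
    return alpha / n_tests


def expected_false_positives(alpha, n_tests):
    """Expected number of null markers passing an uncorrected threshold of alpha."""
    return alpha * n_tests


n_markers = 500_000  # assumed number of SNP markers tested (illustrative)
print(bonferroni_threshold(0.05, n_markers))      # corrected per-SNP threshold
print(expected_false_positives(0.05, n_markers))  # chance findings if uncorrected
```

Under these assumptions an uncorrected threshold of 0.05 would be expected to yield tens of thousands of chance associations, which is why stringent thresholds and independent replication are both required.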

Challenges to be addressed

A note of caution is, however, appropriate as despite the successes of genome-wide association studies, only a minority of the attributable genetic risk is typically being explained for a given disease. This may be improved by larger studies with increasing numbers of common SNP markers but there may be other explanations. For example, we are not yet able to effectively assess the contribution of low penetrance, less common alleles. There is also increasing interest in rare variants and the role of structural genomic variants which have until recently been very hard to detect and analyse. Technological advances in high throughput sequencing and microarray-based comparative genome hybridization are rapidly advancing this field, the latter allowing detection of increasingly small structural variants such that the extent of copy number variation across the human genome is now being appreciated and associated with susceptibility to a number of common diseases ranging from Parkinson's disease to HIV-1 infection. Such technologies have significant potential clinical utility, for example in relation to causes of learning disability,[31,32] although issues relating to predictive value for specific phenotypes and clinical outcomes need to be resolved before such testing could be considered evidence based. Finally, it is clear that gene–gene and gene–environment interactions play an important and potentially confounding role in current approaches, which also do not account for epigenetic effects. Recent genome-wide associations have been scientifically compelling and have revealed a number of new biological pathways in disease processes, as illustrated by studies in Crohn's disease described below, but the extent to which we are ready to use such information in a clinical arena is unclear. 
The prediction of risk is complex, and at present it has been argued that our estimation of the risk conferred by possession of particular genetic variants remains imprecise and indeed is likely to change in magnitude as more data are accumulated. Moreover, the combination of individual variants into a clinically validated ‘total genetic risk’ score remains in its infancy, and the use and interpretation of genetic testing for individual patients, particularly in a ‘direct-to-consumer’ setting, remains highly controversial.[34,35] Careful prospective clinical evaluation considering sensitivity, specificity, predictive value of testing, relationship with outcome and cost-effectiveness needs to be combined with wider public and patient education to establish clear guidelines and ethical processes for regulated implementation of genetic testing in common diseases.[36,37]

Coronary artery disease and atrial fibrillation

Among the many associations found in the WTCCC study, the association of coronary artery disease with SNP markers on chromosome 9p21.3 highlights both the excitement and the work remaining to be done inherent in current genome-wide association studies. This was the most significant association seen across the genome for this disease and has been independently replicated in different populations.[25,38-40] Meta-analysis of >12 000 cases and ∼29 000 controls shows that possession of each copy of the risk allele is associated with an odds ratio of 1.24 (95% confidence interval 1.20–1.29). The magnitude of risk is relatively modest but highly significant; the biological mechanism is unknown although nearby are CDKN2A and CDKN2B, genes encoding cyclin-dependent kinase inhibitors involved in cancer cell biology with a postulated role in atherosclerosis through modulation of transforming growth factor beta. Intriguingly, there is now evidence of a large noncoding ribonucleic acid (RNA) in the risk locus, ANRIL, specific transcripts of which are differentially expressed in the presence of the risk allele which is thought to relate to regulatory variants in a conserved enhancer element. This correlates with altered expression of CDKN2B consistent with a regulatory role for ANRIL and is postulated to modulate disease by altering cellular proliferation. Further work is required to resolve specific regulatory variants and the functional mechanisms involved. However, the disease association may prove highly informative in terms of both novel insights into pathogenesis and in the clinic. A prospective study, for example, shows a role in risk assessment for predicting the presence of angiographic coronary artery disease but not severity, which is independent of family history and other known risk factors, although a recent study of cardiovascular disease in the Women's Genome Health Study found no benefit in risk prediction. 
It is also striking that within 10 kb on a neighbouring linkage disequilibrium block is a SNP marker showing strong association with type 2 diabetes, although the specific variants and their functional consequences relating to coronary artery disease and type 2 diabetes at 9p21 remain unknown. Success has also been achieved using genome-wide association studies in atrial fibrillation with association at chromosome 4q25. Again, extensive replication has confirmed this association among individuals of North European descent with a meta-analysis showing an odds ratio of 1.9 (1.6–2.26) for the most strongly associated SNP with atrial fibrillation. In this case, the associated SNP is in a gene desert ∼50 000 bases away from the nearest gene, but that gene is particularly intriguing as it is paired-like homeodomain 2 (PITX2) which encodes a transcription factor important to cardiac development, notably the sleeve of cardiac tissue extending out over the pulmonary vein known to be important in the aetiology of atrial fibrillation and the target of therapeutic ablation therapy for this condition. Further work is needed to resolve the specific variants responsible for this association, but the result illustrates how an unbiased genetics approach has implicated a specific regulatory protein which may unlock the underlying mechanism of disease.
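Odds ratios and their 95% confidence intervals, as quoted above, can be estimated from a two-by-two table of cases and controls. The following is a minimal sketch using the standard log odds ratio (Woolf) approximation; the function name and the counts in the example are hypothetical and do not correspond to any study cited here.

```python
import math


def odds_ratio_ci(case_carriers, case_noncarriers,
                  control_carriers, control_noncarriers, z=1.96):
    """Odds ratio with an approximate 95% confidence interval (Woolf method).

    All four arguments are counts from a hypothetical case-control study
    and must be greater than zero.
    """
    odds_ratio = (case_carriers * control_noncarriers) / (
        case_noncarriers * control_carriers
    )
    # Standard error of the log odds ratio.
    se = math.sqrt(
        1 / case_carriers
        + 1 / case_noncarriers
        + 1 / control_carriers
        + 1 / control_noncarriers
    )
    lower = math.exp(math.log(odds_ratio) - z * se)
    upper = math.exp(math.log(odds_ratio) + z * se)
    return odds_ratio, lower, upper


# Hypothetical counts: 30 of 100 cases and 15 of 100 controls carry the risk allele.
or_, lower, upper = odds_ratio_ci(30, 70, 15, 85)
print(round(or_, 2), round(lower, 2), round(upper, 2))
```

Because the standard error shrinks as the counts grow, a modest odds ratio can be estimated with a narrow interval only when many thousands of cases and controls are combined, as in the meta-analyses described above.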

Insights from Crohn's disease

Research in Crohn's disease highlights many important issues related to the genetics of common disease. The pathological basis of this disease is thought to involve an environmental trigger such as an infectious agent(s) leading to disease in genetically susceptible individuals. An early striking result was achieved through genome-wide linkage analysis which identified a major genetic risk locus and subsequently specific coding SNPs within the NOD2 gene. NOD2 encodes a protein critical to the recognition of bacteria and subsequent proinflammatory response. Individuals inheriting one risk allele have an odds ratio of 2.4 (2–2.9) for disease compared to those without a copy; this increases to 17.1 (10.7–27.2) for carriage of at least two risk alleles. The advent of genome-wide association studies has dramatically increased the number of genetic susceptibility loci in Crohn's disease to over 30.[24,56] It was notable, however, that initial scans did not highlight the known role of NOD2 as the panel of SNP markers did not include the known coding variants associated with disease risk. Significant association was seen for a genotyped SNP in modest linkage disequilibrium with those variants, but the observed effect size was considerably lower (odds ratios of 1.3 and 1.9 for heterozygotes and homozygotes, respectively), showing how SNP coverage on the genotyping platform used can be very important. Overall, the effect sizes seen with the many highly significant new susceptibility loci in Crohn's disease are modest, usually of the order of 1.5-fold. This is a typical finding in genome-wide association scans of common multifactorial diseases and highlights how, even with several thousand cases, such studies are relatively underpowered.

Genomic signposts

It is important to remember that genome-wide association studies use SNPs as markers to define association. The finding of association does not imply causation by that particular variant but rather provides the first insights into the likely genomic location of functionally important variants. This could be compared to planting a signpost in a highly variable landscape to indicate the region of interest and help guide us in the right direction on the path to resolving causative variants. The ‘fine mapping’ of disease associations represents a major hurdle still to be overcome, both at the level of understanding exactly what variation is present to more clearly resolve the genetic association, and in establishing the functional significance of specific variants. Approaches currently employed in fine mapping loci identified by genome-wide association studies include denser genotyping of specific regions and local resequencing, although to uncover rare and structural genomic variants relatively large numbers of individuals may need to be analysed using sophisticated approaches. Coinheritance or linkage disequilibrium between variants adds considerably to the difficulties of fine mapping as it can be very hard to dissect the effects of individual variants. It is worth noting that functional effects may be operating at a considerable distance; in Crohn's disease, for example, strong hits of association have been found within ‘gene deserts’ in which associated SNP markers are far from any known gene.

Relationship to structural genomic variation

A further insight into genome-wide association studies in Crohn's disease is the recent striking result that a large deletion, involving 20 kb of DNA upstream of the immunity-related GTPase family, M (IRGM) gene, is responsible for the observed association with SNP markers at that gene. The presence of this deletion, which is associated with increased disease risk, significantly altered levels of gene expression. Similar structural variation, notably copy number variation, is likely to underpin many observed disease associations found using DNA sequence variants.

Insights into disease pathogenesis

The association with IRGM suggested that autophagy may play an important role in Crohn's disease pathogenesis, an unexpected result given further credence by the association with ATG16 autophagy related 16-like 1 (ATG16L1).[59,60] An additional novel insight into disease pathogenesis is the strong association with variants involving genes in the interleukin 23 receptor (IL23R) signalling pathway.[60,61] IL23 plays a key role in innate and T-cell-mediated intestinal inflammation, and the significance of the association with IL23R SNPs is substantially increased by the finding of disease association with variants in genes that are part of the IL23 signalling pathway, namely interleukin 12B (IL12B), signal transducer and activator of transcription 3 (STAT3) and Janus kinase 2 (JAK2).

Lessons from infectious diseases including HIV and malaria

Functional consequences of genetic variants in malaria

Malaria has exerted a major selective pressure on genetic diversity in human populations and provides elegant examples of how variants may exert their functional effects. It bears repetition to briefly consider again two classic examples from this field, namely sickle cell and Duffy blood group antigen, to highlight how functional consequences may operate at a structural level (changing the structure and function of the encoded protein) or by variation in noncoding DNA modulating levels of gene expression. In the former case, the selective advantage gained by individuals with sickle cell trait (having one copy of the allele encoding haemoglobin S) in areas endemic for Plasmodium falciparum has driven a monogenic disorder to high allele frequencies in particular populations, notably in sub-Saharan Africa. Haemoglobin S results from an A to T nucleotide substitution in the coding sequence of the hemoglobin beta (HBB) gene which leads to an amino acid substitution from glutamic acid to valine and a radically altered phenotype in terms of malarial disease risk, in particular protection from severe manifestations of the disease. Malaria provides a further example of the consequences of diversity at the level of gene regulation through a SNP modulating expression of the Duffy antigen receptor for chemokines gene (DARC) (also known as the FY gene). Individuals who do not have the Duffy blood group antigen on the surface of their red blood cells are completely protected from P. vivax malaria. This failure of expression is the result of a T to C nucleotide substitution in the regulatory sequence (the promoter region) found upstream of the DARC gene which disrupts the binding of the transcription factor GATA-1, a protein which binds DNA and helps regulate gene expression (Figure 3).

Figure 3. Duffy blood group antigen and P. vivax malaria. Schematic representation of how an A to G SNP disrupts a GATA-1 binding site, resulting in loss of expression of the DARC gene and hence loss of the cell surface receptor, which renders red blood cells resistant to invasion by the malarial parasite. The effect of the SNP is highly specific to red blood cells where the transcription factor GATA-1 is found. Selective advantage is thought to have driven the variant to a very high allele frequency in Africa such that it is almost universally found in most West and Central African populations. Reproduced by permission of Oxford University Press.

Consideration of these two examples serves as an introduction to the functional consequences of genetic diversity. Early work focused on variation in coding DNA, but more recently it has become clear that variation in noncoding DNA is also important. The consequences are wide ranging for the magnitude and timing of gene expression at the transcription level, as well as how RNA is alternatively spliced, its stability and processing.

HIV-1 and the chemokine (C-C motif) receptor 5 Δ32 deletion

Like malaria, research into human immunodeficiency virus (HIV)/acquired immunodeficiency syndrome (AIDS) has advanced our understanding of the nature and implications of genetic variation. Host genetic factors have now been found to operate at a number of different levels in relation to HIV-1, including how the virus gains entry into cells, barriers to infection within cells and modulation of cell-mediated and innate immunity. Sequencing the CC chemokine receptor CCR5 gene (the major host coreceptor protein for HIV-1) among individuals who were HIV-1 seronegative and whose T cells were resistant to viral infection revealed a 32 bp deletion which leads to a frameshift and dramatic truncation of the protein such that the receptor fails to be expressed on the cell surface. When present in a homozygous state, the deletion completely protects against viral entry to cells. When a single copy is inherited, the consequences are dependent on other variation involving the second allele. This highlights both the important role of structural variation in common disease and of the CCR5 coreceptor, which is now a drug target through reversible small molecule antagonists such as maraviroc.

Copy number variation and HIV-1

Research into genetic susceptibility to HIV-1 also highlighted how copy number variation may modulate susceptibility to common disease, with the discovery that possession of fewer copies of the chemokine (C-C motif) ligand 3-like 1 (CCL3L1) gene, when compared to the median for a given population, was highly significant in terms of susceptibility to HIV-1 and rate of disease progression. CCL3L1 is a natural ligand of CCR5 that inhibits infection of cells by HIV-1 strains which use this coreceptor. Increased expression seen with higher copy number is thought to result in more binding to CCR5 and reduced accessibility to HIV-1.

Genetic determinants of viral load

HIV-1 infection also provides an elegant example of how genome-wide association scans can be applied to other phenotypes, in this case to the amount of virus in plasma during the asymptomatic phase prior to AIDS (the viral set point). This has been found to vary significantly between patients with variants in the major histocompatibility complex (MHC) on chromosome 6 explaining about 15% of the observed variability in viral load. The observed associations are intriguing, as the most strongly associated SNP marker is within HLA complex P5 (HCP5), a gene itself thought to have arisen from an endogenous retroviral element. Expression of this gene may potentially interfere with the virus but more likely the association is due to linkage disequilibrium with HLA-B*5701. The second strongest association is with a SNP near to HLA-C, which is also associated with expression of HLA-C.

Human leukocyte antigen, killer-cell immunoglobulin-like receptors and HIV-1

Genetic variation in the MHC, notably in human leukocyte antigen (HLA) class I alleles such as HLA-B57 and HLA-B35, is an important determinant of disease progression. For example, individuals homozygous for HLA-B*35 progress to AIDS in half the median time of those without a copy. The situation is, however, complex as diversity in other genomic loci, specifically the highly polymorphic KIR genes encoding killer immunoglobulin-like receptors, is important, with the combination of high expressing KIR allotypes and specific HLA-B alleles most informative for AIDS progression and levels of HIV-1 RNA.

Pharmacogenomics: abacavir hypersensitivity

HLA-B*5701 and HIV-1 also provide one of the few clear examples of how knowledge of genetic variation can directly impact on clinical care in terms of drug use, in this case to avoid drug hypersensitivity. There is now robust evidence from a prospective, randomized, multicenter, double-blind study of nearly 2000 adult patients infected with HIV-1 that prospective screening for this HLA allele can eliminate immunologically proven hypersensitivity reactions to abacavir, a reverse transcriptase inhibitor used to treat HIV-1. This is thought to involve the endogenous pathway for antigen presentation and is highly specific to HLA-B*5701 and not closely related alleles, with the drug or a metabolite of it interacting directly with the antigen-binding cleft or modifying it to allow self-antigens to bind. There is also recent evidence from a genome-wide association study linking possession of HLA-B*5701 with risk of drug-induced liver injury due to flucloxacillin.

Gene expression, asthma and regulatory variants

Gene expression varies between individuals and has been successfully mapped as a quantitative trait in model organisms and more recently in humans.[82,83] Genome-wide mapping of disease association and of global gene expression offers a complementary and synergistic approach, which was elegantly demonstrated by a recent study of asthma. Here, a strong signal of disease association was found at chromosome 17q21 which replicated in different European populations; analysis of gene expression showed that the same SNPs were also those most strongly associated with expression of a neighbouring gene, ORM1-like 3 (ORMDL3) (Figure 4). The SNPs accounted for ∼30% of the variance in expression of this gene, which encodes a widely expressed transporter protein.

Figure 4. Genetic association with disease and gene expression. A genome-wide association study of asthma revealed a strong association with disease at chromosome 17q21. The same SNPs also showed evidence of association with levels of expression of the nearby gene ORMDL3. Adapted and reprinted by permission from Macmillan Publishers Ltd: Nature, copyright (2007).

Efforts to map variants associated with gene expression have highlighted the role of regulatory variants in particular classes of genes, such as those encoding cellular chaperones and those involved in the cell cycle, RNA processing and immune function. Improved understanding of gene regulation through characterization of regulatory elements across the genome by projects such as ENCODE (ENCyclopedia Of DNA Elements) (http://www.genome.gov/10005107) will provide a route map to identify functional variants. Effects are likely to be highly context specific, for example to specific cell or tissue types, developmental stages or particular cellular conditions.

Genome-wide association studies and prostate cancer

Many new susceptibility loci have been defined among common cancers occurring in European populations (breast, lung, colorectal, prostate and melanoma) through the application of genome-wide association studies over the last 3 years. These germline variants are associated with modestly increased risks (typically odds ratios <1.3), although as they are typically relatively common, the population attributable risk is higher. Moreover, for the individual, possession of multiple risk alleles may confer a relatively large risk. Genome-wide association studies using common SNP markers have highlighted the polygenic nature of the identified genetic risk for these cancers, but the identified loci, even taking into account the rare high penetrance variants (such as those involving BRCA1 and BRCA2 in breast and ovarian cancer), still explain only a minority of the familial risk. Moreover, the utility for clinical testing in cancer remains to be resolved, as the reported associations do not generally help in predicting the aggressiveness of the disease process, which is where prediction may be of greatest benefit. Analysis of somatic mutations may be more informative for this purpose, notably gene fusion events such as that between TMPRSS2 (transmembrane protease serine 2) and ERG (ETS related gene), which commonly occurs in prostate cancer and results in overexpression of oncogenic transcription factors.[86,87] The main benefit of the new genetic knowledge gained through association studies, as seen with other multifactorial diseases such as Crohn's disease, is likely to be in providing novel insights into disease pathogenesis and potential new targets for intervention. 
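The two points made above, that common variants of modest effect can carry a sizeable population attributable risk and that multiple risk alleles can combine to a larger individual risk, can be illustrated numerically. The sketch below rests on stated assumptions: it uses Levin's formula for the population attributable fraction, treats the odds ratio as an approximation to relative risk, assumes that risk alleles combine multiplicatively and independently, and uses invented frequencies and effect sizes. None of these values is taken from the studies cited.

```python
def attributable_fraction(prevalence, relative_risk):
    """Levin's population attributable fraction for an exposure (here, a risk variant).

    prevalence:    proportion of the population carrying the variant (assumed)
    relative_risk: risk in carriers relative to non-carriers (odds ratio used as proxy)
    """
    excess = prevalence * (relative_risk - 1.0)
    return excess / (1.0 + excess)


def combined_odds_ratio(per_allele_odds_ratios, allele_counts):
    """Combined odds ratio assuming independent, multiplicative per-allele effects."""
    total = 1.0
    for odds_ratio, count in zip(per_allele_odds_ratios, allele_counts):
        total *= odds_ratio ** count
    return total


# A common variant of modest effect versus a rare variant of large effect (hypothetical).
print(round(attributable_fraction(0.50, 1.3), 3))
print(round(attributable_fraction(0.01, 5.0), 3))

# An individual homozygous for the risk allele at five hypothetical loci, each with OR 1.3.
print(round(combined_odds_ratio([1.3] * 5, [2] * 5), 2))
```

Under these assumptions the common variant accounts for a larger share of cases in the population than the rare one, even though its effect on any individual carrier is much smaller. Whether a simple multiplicative model holds for real loci is an open question that such a sketch does not address.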
Prostate cancer has seen the largest number of new susceptibility loci identified from genome-wide association scans of any cancer to date.[85,86] Perhaps most striking among these are the associations with chromosome 8q24, a genomic region initially identified from linkage and association together with admixture mapping, and subsequently from genome-wide association studies. Remarkably, within a 1.2 Mb ‘gene desert’ at 8q24, three independent susceptibility loci for prostate cancer are present, containing multiple risk alleles. For one of these loci, the most significantly associated SNP is also associated with colorectal cancer; the region also contains a distinct breast cancer susceptibility locus. The basis for these associations remains unresolved, although the proto-oncogene MYC is one of the genes flanking the region. Indeed, recent evidence shows that the variant associated with colorectal and prostate cancer is located in a conserved enhancer element which interacts with MYC and shows allele-specific binding of a transcription factor activated in colorectal cancer.[96,97] Other disease-associated loci include chromosome 10q11.2, in the region of a postulated tumour suppressor gene MSMB encoding β microseminoprotein;[90,91] and chromosome 17q12, where intriguing evidence of association was found with increased risk of prostate cancer and protection from type 2 diabetes for a variant in HNF1B (TCF2) encoding the transcription factor HNF1 homeobox B.
This gene had been previously implicated in maturity onset diabetes of the young (MODY) and the association is consistent with epidemiological evidence of an inverse relationship between the risks of prostate cancer and type 2 diabetes.[100,101] Different variants involving the transcriptional repressor JAZF1 (JAZF zinc finger 1) at chromosome 7p15 have also been found to show contrasting associations with these two conditions, namely protection from prostate cancer and susceptibility to type 2 diabetes, suggesting possible commonality in disease pathogenesis and new avenues for investigation.

Venous thrombosis and warfarin dose

Venous thromboembolism is a commonly encountered clinical problem in which genetic testing for inherited causes of thrombophilia, when clinically indicated, is an important component of patient care. The genetic basis for venous thrombosis includes multiple rare variants affecting inhibitors of the pro-coagulant system, such as deficiencies of antithrombin, protein C and protein S, but together these account for only a small minority of cases of familial thrombophilia. Other variants that are common in the general population and account for a significant genetic risk have been defined, including factor V Leiden, which confers activated protein C resistance. Here, a specific SNP in exon 10 of F5, the gene encoding coagulation factor V, results in an amino acid substitution from arginine to glutamine which makes the protein less susceptible to degradation by activated protein C. The effect in terms of disease is large, with a 5- or 50-fold increased risk of venous thrombosis associated with having one or two copies of the variant, respectively. Other common variants have also been identified, notably in the coagulation factor II (F2) gene encoding prothrombin. The indications for thrombophilia screening remain controversial, but evidence relating to clinical outcomes and to absolute and relative risks is being established to aid clinical decision making.
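
Because screening decisions turn on absolute as well as relative risk, it can help to see how the two relate. The sketch below simply scales a baseline risk by the fold increases quoted above (5-fold for one copy and 50-fold for two copies of factor V Leiden). The baseline value is a hypothetical placeholder rather than a figure from this review, the function name is illustrative, and treating relative risk as a simple multiplier is itself a simplifying assumption.

```python
def absolute_risk(baseline_risk: float, relative_risk: float) -> float:
    """Scale a baseline risk by a relative risk, capped at 1.0."""
    if not 0.0 <= baseline_risk <= 1.0:
        raise ValueError("baseline_risk must lie between 0 and 1")
    if relative_risk < 0.0:
        raise ValueError("relative_risk must not be negative")
    return min(1.0, baseline_risk * relative_risk)


# Fold increases quoted in the text, keyed by copies of factor V Leiden.
FOLD_INCREASE = {0: 1.0, 1: 5.0, 2: 50.0}

# Hypothetical baseline annual risk in non-carriers (an assumption,
# chosen only to make the arithmetic visible).
HYPOTHETICAL_BASELINE = 0.001

if __name__ == "__main__":
    for copies, fold in FOLD_INCREASE.items():
        risk = absolute_risk(HYPOTHETICAL_BASELINE, fold)
        print(f"{copies} copies: absolute annual risk {risk:.3%}")
```

The same relative risk translates into very different absolute risks depending on the baseline, which is one reason the value of screening varies between patient groups.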

Pharmacogenomics of warfarin therapy

Venous thromboembolism is an important indication for warfarin therapy, but as a treatment, warfarin is associated with a high risk of adverse events. This is related to the narrow therapeutic range of warfarin and the marked inter-individual variation in response. This variability is modulated by many factors including patient age and size, together with concomitant drug use, diet and the underlying disease process. Genetic factors also play an important role and illustrate the developing field of pharmacogenomics. Cytochrome P450, family 2, subfamily C, polypeptide 9 (CYP2C9) encodes a cytochrome P450 family enzyme involved in elimination of the more active form of warfarin, and varies in activity more than 20-fold between individuals. Two common alleles of CYP2C9 are associated with reduced enzyme activity and a reduced warfarin dose. A recent prospective randomized study of warfarin dosing incorporating knowledge of CYP2C9 genotype showed a significant reduction in time to stable anticoagulation, together with reductions in minor bleeding episodes and in the number of monitoring blood tests required. Genetic variation in vitamin K epoxide reductase complex, subunit 1 (VKORC1), which encodes vitamin K epoxide reductase complex 1, has also been associated with levels of gene expression and warfarin dosage. This genetic evidence led the US Food and Drug Administration in August 2007 to approve a labelling change to the warfarin package insert such that ‘lower initiation doses should be considered for patients with certain genetic variations in CYP2C9 and VKORC1 enzymes’. Combining genetic data for these two genes with information on age, sex and drug interactions explains 59% of the variance in warfarin dose.
Dosage algorithms incorporating genetic data are available (http://www.warfarindosing.org). Initial work suggests that use of pharmacogenomic information may be cost-effective only in patients at high risk of haemorrhage, but the falling costs of genetic testing and growing feasibility of rapid ‘in-house’ tests mean that further studies to assess this, and clinical utility across different patient groups, will be important, together with a clearer understanding of molecular mechanisms and interactions. Consideration of clinical outcomes, whether in terms of warfarin use or specific disease traits, is an essential part of evaluating the clinical utility of genetic testing for evidence-based practice, but substantial challenges remain in how to acquire such data.
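
A statement that genotype and clinical covariates together explain a given percentage of the variance in warfarin dose is usually expressed through the coefficient of determination (R²) of a dose-prediction model. A minimal sketch of how that quantity is computed from observed and predicted doses follows. The dose values are invented for illustration and do not come from any study or dosing algorithm, and the function name is hypothetical.

```python
from typing import Sequence


def variance_explained(observed: Sequence[float], predicted: Sequence[float]) -> float:
    """Coefficient of determination: R^2 = 1 - SS_residual / SS_total."""
    if len(observed) != len(predicted):
        raise ValueError("observed and predicted must have the same length")
    if len(observed) < 2:
        raise ValueError("at least two observations are needed")
    mean_observed = sum(observed) / len(observed)
    # Total variation of the observed doses around their mean.
    ss_total = sum((y - mean_observed) ** 2 for y in observed)
    # Variation left unexplained by the model's predictions.
    ss_residual = sum((y - p) ** 2 for y, p in zip(observed, predicted))
    if ss_total == 0.0:
        raise ValueError("observed values are all identical")
    return 1.0 - ss_residual / ss_total


if __name__ == "__main__":
    # Invented stable doses (mg/day) and the doses a hypothetical model predicted.
    observed_doses = [2.0, 4.0, 6.0, 8.0]
    predicted_doses = [3.0, 3.0, 7.0, 7.0]
    r_squared = variance_explained(observed_doses, predicted_doses)
    print(f"proportion of dose variance explained: {r_squared:.0%}")
```

A value below 1 means that part of the variation between patients is not captured by the model, so dose titration against measured anticoagulation would still be needed.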

Conclusions

There has been massive research investment in the field of human genetics over the last 20 years and remarkable advances have been achieved (Table 1). There remains a gap, however, between this growing genetic knowledge and its translation into clinical practice, and more broadly in how as a society we choose to use and manage such information. As knowledge of the basic science and associated technologies has advanced, so has our awareness of the daunting task ahead; the scale and complexity of genetic variation in human populations is vast and still incompletely understood, and is manifested in terms of common disease risk through multiple genetic, epigenetic and environmental interactions. It remains a major hurdle to resolve current genetic associations to specific functionally important genetic variants and to understand their mechanism of action. We are only now beginning to address the role of rare and structural variants, and much of the genetic risk in common disease remains to be explained. We need to more robustly understand the clinical utility of genetic testing, to appreciate how this will impact on screening, diagnosis and management, and to more clearly establish the cost-effectiveness of these approaches. As genetic testing for the individual becomes increasingly available through the private sector, understanding of risk and of the caveats that need to be placed on its interpretation in the context of common disease is critically important if we are to use and manage such information. ‘Personalised medicine’ may not yet be a reality, but the potential of recent genetic advances is clear. How we use and regulate this new information for the benefit of patients requires a more active debate in which clinicians and other health care professionals should play a greater role.

Table 1

Summary points

There have been radical advances over the last 20 years in our understanding of the nature of the human genome, its remarkably complex regulation and the extent of genetic diversity between individuals

Genome-wide association studies using hundreds of thousands of polymorphic common genetic markers (single nucleotide polymorphisms or SNPs) across the genome are now allowing many novel disease susceptibility loci to be resolved in common multifactorial diseases

The proportion of the genetic risk currently explained by such studies remains disappointingly low; however, future studies of rarer variants enabled by recent advances in high throughput DNA sequencing, and of structural genomic variants such as copy number variation, are likely to be highly informative

Genetic variation may modulate the structure or function of the protein encoded by a gene, or how gene expression is regulated at a transcriptional or post-transcriptional level, including effects on epigenetic control mechanisms and alternative splicing

Defining functionally important variants remains challenging and requires analysis in the relevant cell or tissue type in the specific disease context

Important lessons are being learned from diseases such as Crohn's disease and HIV/AIDS where many different genetic variants have been implicated, and new insights into disease pathogenesis and potential drug targets gained, with direct implications for patient care

Much work remains to be done to establish the clinical utility of genetic testing in common diseases and there is a need for a broader debate on the ethical implications of current advances in genetics

Pharmacogenomics is likely to have the most immediate impact on the clinical practice of the general physician

Funding

Wellcome Trust (grant 074318 to J.C.K.). Conflict of interest: None declared.
