Epigenomics reveals a functional genome anatomy and a new approach to common disease.

Abstract

Epigenomics provides the context for understanding the function of genome sequence, analogous to the functional anatomy of the human body provided by Vesalius a half-millennium ago. Much of the seemingly inconclusive genetic data related to common diseases could therefore become meaningful in an epigenomic context.

Year: 2010 PMID: 20944596 PMCID: PMC2956605 DOI: 10.1038/nbt1010-1049

Source DB: PubMed Journal: Nat Biotechnol ISSN: 1087-0156 Impact factor: 54.908

New Year's Eve in 2014 will mark the 500th anniversary of the birth of Andreas van Wesel, commonly known as Vesalius, author of De humani corporis fabrica1, a treatise almost as influential in its time as was Origin of Species over three centuries later. Vesalius pioneered the rigorous study of human anatomy and introduced experimental observation into medical education as a substitute for hearsay. The late Victor McKusick, who helped to create the genome project and mapped the first human autosomal gene, called gene mapping “neo-Vesalian,”2 as it represented foundational mapping of the genome in order to exploit this information for finding genes. Vesalius was more than a mapper, though, as he debunked the dogma of both Galen and Aristotle on the anatomy and physiology of blood circulation, showing that the anatomical map meant that blood must flow through the lungs and return to the heart, not just directly between the ventricles. Similarly, the particular order of genes on chromosomes and the arrangement of the chromosomes themselves have only recently been found to be intrinsically meaningful, not just as a map. I suggest here that epigenomics, i.e. the genome science of epigenetics, has transformed genome science by showing us that the organization of the genome is as important for gene function as Vesalius showed the organization of anatomic structures to be for the function of organs. Moreover, the combination of new epigenomic tools with conventional genetics and a new mathematical language for their interface may have as much impact on understanding human disease as did Vesalius' anatomy a half millennium ago.

Epigenomics Provides a Functional Anatomy of the Genome

Epigenomics has helped to reveal several surprising large-scale functional relationships among the genes themselves and the surrounding nongenic DNA, previously hinted at by the beta-globin cluster. One is the generality of large (tens to hundreds of kb) genomic regions regulating gene expression. While the beta-globin gene cluster had been studied for decades3, linking progressive chromatin changes to globin gene switching during development4, the generality and size of multigene chromatin domains only emerged with large-scale epigenomic mapping. As increasing numbers of imprinted genes were found, it was discovered that they were organized in gene clusters, often with common regulatory elements such as CTCF binding sites5. With the advent of genome-scale mapping of histone modifications, many large regions of heterochromatin modifications were found, such as specific modifications associated with the inactive X chromosome6. Moreover, large autosomal regions of heterochromatin modification across HOX gene clusters were found to be more highly conserved across species than the underlying DNA sequence, while not being a simple reflection of exonic boundaries7. Thus, epigenomic studies revealed that the scope of genome that is apparently functional was at least an order of magnitude greater than that suspected from the sequence alone. Epigenomics provided for the genome the functional anatomy that Vesalius gave gross anatomy a half millennium ago. Another surprising large-scale genomic relationship is frequent intra-chromosomal and inter-chromosomal interactions mediated by chromatin proteins. These were discovered through chromatin capture methods, described in detail elsewhere in this issue8, designed to preserve chromatin-mediated interactions over long distances. DNA loop structures mediated by chromatin, surprisingly common and highly dynamic, were found to be associated with function.
For example, multiple interleukin genes in the 200 kb mouse TH2 cytokine locus, when transcriptionally active, are folded into numerous loops, anchored by SATB1 at their bases9. Remarkably, trans-interactions between chromosomes involve some of the same sequences that epigenetically regulate imprinted gene domains, for example the H19 differentially methylated region, and may act through transvection to regulate genes in trans10. A recent example of large-scale genomic organization mediated by chromatin is the link between long RNAs, heterochromatin modification and gene activity. At the Cold Spring Harbor Genome Biology meeting in 2005, Tom Gingeras asked for a wager on the number of genes that will ultimately be agreed upon, arguing that the nearly 50% of the genome that may be untranslated RNA will be proved functional11. Growing evidence indicates that much of this RNA mediates chromatin structure. For example, antisense RNAs appear to establish heterochromatin in mammalian genes, independent of dicer and the posttranscriptional miRNA machinery12. These regions may be >100 kb12, affect multiple genes, and involve Argonaute family proteins13. An exciting recent discovery is the role of long intergenic noncoding RNAs (lincRNAs) in establishing heterochromatin. For example, HOTAIR is a lincRNA that retargets PRC2 over HOX domains with profound changes in gene expression relevant to cancer progression14. Finally, large organized chromatin K(lysine) modifications (or LOCKs) have been shown to organize the genome into very large blocks (hundreds to thousands of kb), some of which are differentiation-specific in their location and extent and correspond to lamin-associated domains (LADs)15–17. These very large regions may provide a dynamic mechanism for functional organization of the genome and are altered in cancer15.
Additional clues that many such large-scale epigenetic networks profoundly influence cellular development and genome function come from large-scale mapping studies. For example, CTCF, which mediates H19 imprinting described earlier, appears to play a general role in defining functional gene region boundaries18. Similarly, polycomb target genes, thought to be involved in stable gene silencing, may alternate between functionally active and silent states over large gene regions19. That such networks have a general role in organizing the genome functionally is suggested by the identification of chromosome territories with closely approximated gene-rich regions20.

Epigenomics may supersede single-gene epigenetic disease research

Just as epigenomics provides a functional anatomy of the normal genome, genome-scale studies of epigenetic disease are revealing a Pandora’s box of epigenetic pathology. As cancer was the vanguard for gene-specific disease epigenetics21, genome-scale epigenetic studies of disease have also focused first on cancer, and these studies have revealed much more genetic pathology than was suggested by candidate gene approaches. For example, methylation changes can affect large genomic regions in colorectal cancer22, and widespread methylation changes are even more striking outside of the usually examined CpG islands, i.e. in shores and gene bodies23. Similarly, it came as a surprise to most when widespread alterations in histone acetylation and methylation were found in cancer24. Stem cells, a hoped-for therapeutic target for many diseases, have shown promiscuous methylation differences from somatic cells on a genome scale, surprisingly involving non-CpG sites25. Remarkably, the sites of differential methylation largely overlap, with strong statistical significance, among physiological states, such as normal vs. cancer, stem cells vs. differentiated cells, and tissues derived from differing germ layers26. Thus, the language of epigenomic organization appears to be common for normal development and for disease, just as the language of anatomy is common for normal and abnormal physiology. The increasingly appreciated importance of large-scale epigenetic control in regulating gene function has had a profound influence on how disease-based genomic studies are being organized. While published genome-scale studies represent only about 2% of cancer epigenetics, the rate of increase over the last 5 years of cancer epigenomic studies is more than double that of conventional gene-based analyses (Fig. 1).
The same relative increase in genome-scale studies also appears to be true for the nascent field of non-cancer human disease epigenetics, such as cardiovascular, immunological and neuropsychiatric disease27, 28. These differences are of course driven in part by the availability of new technology, but also by the growing realization that variation in both DNA methylation and chromatin is widespread across the genome, and may be organized into large genomic domains.

Figure 1

Greater rate of increase of genome-scale, compared to selected gene-focused publications, addressing cancer epigenetics

While published genome-scale studies represent only about 2% of cancer epigenetics, the rate of increase over the last 5 years of cancer epigenomic studies appears double that of conventional selected gene-based analyses. Numbers are approximate from PubMed citation analysis; scales are different for gene-based and genome-based plots; 2010 is extrapolated from 2/3 of 2009 plus 2010.

But another important factor driving such “disease epigenomics” is the relatively limited yield to date of conventional SNP-based genetic analysis in explaining most common human disease. As widely described in both scientific29, 30 and lay publications31, the gap between the original expectations of genetic analysis and the attributable risk of disease is much greater than anticipated a decade ago. How is epigenomics transforming the search for genetic causes of common human disease? Many have suggested that one contributing factor may be the importance of environmentally driven epigenetic variation in disease risk, particularly as a surrogate for mutational change32–34 (Table 1). But we should also consider another dimension to this epigenetic argument for common disease that has received comparatively less attention. Since the actual “genome anatomy” target is likely much larger than we realized, perhaps involving half or more of the genome, and since the understanding of the normal function of this genome anatomy requires epigenomics, perhaps much of what appears to be negative genetic data could become meaningful in an epigenomic context (Table 1). For example, most GWAS identify not genes, but nearby regions or intergenic deserts. Yet these same regions frequently harbor differentially methylated regions (DMRs) that discriminate tissue types, or distinguish cancer from normal cells. They are also the canonical regions for long intergenic noncoding RNAs (lincRNAs) that help establish chromatin structure and normal gene function. Furthermore, gene deserts may promote trans-associations of chromosomes in epigenetic regulation35. Another way in which disease-associated DNA sequence variants might affect disease risk is through their linkage to DNA sequences regulating DNA methylation or chromatin modification or binding factors. Substantial association of SNPs with DNA methylation has already been found36, 37.
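
As a rough illustration of how an association between a SNP and DNA methylation might be tested, the sketch below regresses a methylation level on allele dosage by ordinary least squares. The data values and the function name `fit_dosage_slope` are hypothetical and are not taken from the cited studies; a real analysis would also adjust for covariates such as age, cell composition and batch, and would correct for multiple testing.

```python
from statistics import mean


def fit_dosage_slope(dosages, methylation):
    """Least-squares slope and intercept of methylation (0..1) on allele dosage (0, 1, 2)."""
    if len(dosages) != len(methylation) or len(dosages) < 2:
        raise ValueError("need two equal-length samples with at least 2 points")
    x_bar = mean(dosages)
    y_bar = mean(methylation)
    # Sum of squared deviations of the genotype dosage
    sxx = sum((x - x_bar) ** 2 for x in dosages)
    if sxx == 0:
        raise ValueError("genotype is monomorphic in this sample")
    # Cross-product of dosage and methylation deviations
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(dosages, methylation))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return slope, intercept


# Hypothetical example: six individuals, one SNP, one CpG region
dosages = [0, 0, 1, 1, 2, 2]
methylation = [0.20, 0.24, 0.41, 0.39, 0.62, 0.60]
slope, intercept = fit_dosage_slope(dosages, methylation)
print(f"change in methylation per allele copy: {slope:.3f}")
```

A nonzero slope here would only suggest that the variant tracks with methylation at that region in this toy sample; it says nothing about mechanism or direction of effect.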

Table 1

How epigenomics is transforming the search for genetic causes of common human disease

Epigenome anatomy	Possible disease link	New approach to common disease search
Environmentally driven epigenetic variation	Epigenome changes in absence of sequence variant	Methylome arrays, capture bisulfite sequencing, ChIPseq
Regulatory site or expression per se	Noncoding RNAs	RNAseq and methods above
Key disease sequences unlinked to target genes	Intra- and interchromosomal interactions	Chromatin network mapping; replication timing?
Regulatory sequence distant from gene	Co-regulated gene clusters	Genome-scale methylation; chromatin mapping
Sequence-defined methylation	Sequence variants controlling epigenome	Linked GWAS and epigenome studies
New class of Variably Methylated Regions	Sequence variants controlling epigenomic variance per se	New statistics for reexamining and integrating GWAS
Domain disruption, anchoring proteins	LOCKs and LADs	Native chromatin whole-genome analysis

An intriguing additional possibility we have proposed is that DNA sequence variants might themselves affect the stochastic or environmentally influenced variance in the epigenome. According to this model, complex species would gain an evolutionary advantage from alleles that increase epigenetic variation per se, i.e. genetic alleles that increase epigenetic variance but not the mean38. This would be like an evolutionary “hedging one's bets,” and confer an advantage for genes in pathways for which the environment changes epochally, e.g. the abundance of food and water. Examining inbred mice from the same litter and living in the same cage, we identified hundreds of “variably methylated regions,” or VMRs, that were highly enriched for key genes in development and embryonic pattern formation. Thus development itself, which epigenetics regulates, likely includes a great deal of stochasticity at the epigenetic level. Genetic variants that increase this developmental plasticity at specific targets may confer an evolutionary advantage but might be deleterious to some individuals after a recent epochal change in the environment, such as the recent Western diet38. Intriguingly, several VMRs have recently been linked to body mass index39. Finally, we are only beginning to understand the role of LOCKs and LADs in functional genome organization, and their assessment in disease will require robust genome-scale approaches to native chromatin measurement, and availability of clinical specimens permitting such analyses (Table 1).
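
Returning to the variance model described above, the idea of an allele that changes epigenetic variance but not the mean can be made concrete with a small numerical sketch. The methylation values, group labels and the function `compare_mean_and_variance` below are hypothetical, chosen only to show the contrast; this is not the statistical method used in the cited work.

```python
from statistics import mean, median, variance


def compare_mean_and_variance(group_a, group_b):
    """Summarize two groups of methylation levels by mean, variance and spread."""

    def spread(values):
        # Mean absolute deviation from the median, a robust measure of
        # dispersion of the kind used in Brown-Forsythe style comparisons
        m = median(values)
        return mean(abs(v - m) for v in values)

    return {
        "mean_a": mean(group_a),
        "mean_b": mean(group_b),
        "var_a": variance(group_a),  # sample variance (n - 1 denominator)
        "var_b": variance(group_b),
        "spread_a": spread(group_a),
        "spread_b": spread(group_b),
    }


# Hypothetical methylation levels at one region, grouped by genotype
genotype_a = [0.48, 0.50, 0.52, 0.49, 0.51]
genotype_b = [0.30, 0.70, 0.50, 0.40, 0.60]

result = compare_mean_and_variance(genotype_a, genotype_b)
print(f"means: {result['mean_a']:.2f} vs {result['mean_b']:.2f}")
print(f"variances: {result['var_a']:.5f} vs {result['var_b']:.5f}")
```

A test that compares only group means would see no difference between these two genotypes, whereas a comparison of dispersion would. That is why variance-altering alleles of this kind would be missed by a conventional mean-based association analysis.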

Future technology development that could drive epigenomics

What potential areas for future technology development will fuel growth in this area? Of course, all roads lead to sequencing, including bisulfite genome-scale sequencing for DNA methylation, just as in non-epigenetic genome science. The rollout of inexpensive, comprehensive, high-throughput single-molecule sequencing has been slower than promised, and second-generation sequencing is still impractical for large-scale epidemiological studies involving thousands of patients, except for capture-based methods such as padlock probes40. The conundrum in such capture-based studies is that while they offer enormous advantages in throughput, single-base resolution, and allele-specific data, they will not reveal regions of differential methylation where we do not already know to look, which may be vast as epigenomics is applied to an ever increasing number of disease states. At the same time, high-throughput sequencing is relatively cheap now for examining chromatin modifications, but that is true only for modifications representing a relatively small fraction of the genome, purified by chromatin immunoprecipitation, for example. For large regional changes such as LOCKs, one faces cost limitations similar to whole-genome bisulfite sequencing. An important advance will come from reagents that are cheap and amenable to processing by typical university core laboratories, such as Illumina and other arrays. For example, a soon-to-be-released methylation chip will provide ~450,000 targets, including all CpG islands and shores, as well as DNase hypersensitive sites and other regions identified and curated for this purpose by a consortium of laboratories organized by Tom Hudson. While this reagent may not be next or even this year’s most comprehensive approach, last year’s isn’t bad and such cooperative approaches open epigenomic research to any general laboratory, a very exciting development.
Other exciting technological initiatives include epigenomic analysis of microdissected samples or even single cells, and enrichment of small chromosomal fragments for biochemical analysis of chromatin41. A new epigenetic epidemiology will need to be crafted. We can no longer consider genetic variation in isolation when looking for disease relationships. Samples in ongoing and future large-scale cohorts must be preserved to allow DNA methylation and chromatin analysis. But retrospectively, a great deal can be added to existing cohort studies, since DNA methylation is stable over decades. Much of the existing genetic data might be made clearer by adding epigenomic analysis to those studies. New cohort sampling should include standard sources, such as lymphocytes, but also, as much as possible, target tissues affected by the disease. Additionally, we need to develop new statistical and epidemiological tools for disease epigenomics, and for its synthesis with conventional genetic analysis. For example, unlike SNPs, epigenetic variation (e.g. levels of DNA methylation or of polycomb complex members) is inherently quantitative, and thus does not lend itself to simple allele designation. The quantitative nature of epigenome variation can help explain complex traits with a smaller number of contributing loci, since such traits do not necessarily require as many of the additive signals originally proposed by R.A. Fisher42. Such an approach is being applied, for example, to the analysis of quantitative traits associated with VMRs39. The apparent additional complexity epigenomics brings to genetics may seem daunting. But I don't think Vesalius would have been intimidated, and I know Victor would have been delighted.

38 in total

1. Genomic maps and comparative analysis of histone modifications in human and mouse.

Authors: Bradley E Bernstein; Michael Kamal; Kerstin Lindblad-Toh; Stefan Bekiranov; Dione K Bailey; Dana J Huebert; Scott McMahon; Elinor K Karlsson; Edward J Kulbokas; Thomas R Gingeras; Stuart L Schreiber; Eric S Lander
Journal: Cell Date: 2005-01-28 Impact factor: 41.582

Review 2. Genome-wide transcription and the implications for genomic organization.

Authors: Philipp Kapranov; Aarron T Willingham; Thomas R Gingeras
Journal: Nat Rev Genet Date: 2007-05-08 Impact factor: 53.242

3. Genome wide ChIP-chip analyses reveal important roles for CTCF in Drosophila genome organization.

Authors: Sheryl T Smith; Priyankara Wickramasinghe; Andrew Olson; Dmitri Loukinov; Lan Lin; Joy Deng; Yanping Xiong; John Rux; Ravi Sachidanandam; Hao Sun; Victor Lobanenkov; Jumin Zhou
Journal: Dev Biol Date: 2009-01-08 Impact factor: 3.582

4. Genomic surveys by methylation-sensitive SNP analysis identify sequence-dependent allele-specific DNA methylation.

Authors: Kristi Kerkel; Alexandra Spadola; Eric Yuan; Jolanta Kosek; Le Jiang; Eldad Hod; Kerry Li; Vundavalli V Murty; Nicole Schupf; Eric Vilain; Mitzi Morris; Fatemeh Haghighi; Benjamin Tycko
Journal: Nat Genet Date: 2008-06-22 Impact factor: 38.330

5. SATB1 packages densely looped, transcriptionally active chromatin for coordinated expression of cytokine genes.

Authors: Shutao Cai; Charles C Lee; Terumi Kohwi-Shigematsu
Journal: Nat Genet Date: 2006-10-22 Impact factor: 38.330

Review 6. Towards unravelling the Igf2/H19 imprinted domain.

Authors: S Viville; M A Surani
Journal: Bioessays Date: 1995-10 Impact factor: 4.345

7. Targeted bisulfite sequencing reveals changes in DNA methylation associated with nuclear reprogramming.

Authors: Jie Deng; Robert Shoemaker; Bin Xie; Athurva Gore; Emily M LeProust; Jessica Antosiewicz-Bourget; Dieter Egli; Nimet Maherali; In-Hyun Park; Junying Yu; George Q Daley; Kevin Eggan; Konrad Hochedlinger; James Thomson; Wei Wang; Yuan Gao; Kun Zhang
Journal: Nat Biotechnol Date: 2009-03-29 Impact factor: 54.908

8. Epigenetic silencing of tumour suppressor gene p15 by its antisense RNA.

Authors: Wenqiang Yu; David Gius; Patrick Onyango; Kristi Muldoon-Jacobs; Judith Karp; Andrew P Feinberg; Hengmi Cui
Journal: Nature Date: 2008-01-10 Impact factor: 49.962

9. The human colon cancer methylome shows similar hypo- and hypermethylation at conserved tissue-specific CpG island shores.

Authors: Rafael A Irizarry; Christine Ladd-Acosta; Andrew P Feinberg; Bo Wen; Zhijin Wu; Carolina Montano; Patrick Onyango; Hengmi Cui; Kevin Gabo; Michael Rongione; Maree Webster; Hong Ji; James Potash; Sarven Sabunciyan
Journal: Nat Genet Date: 2009-01-18 Impact factor: 38.330

10. Large histone H3 lysine 9 dimethylated chromatin blocks distinguish differentiated from embryonic stem cells.

Authors: Bo Wen; Hao Wu; Yoichi Shinkai; Rafael A Irizarry; Andrew P Feinberg
Journal: Nat Genet Date: 2009-01-18 Impact factor: 38.330
