In 1990, with the launch of a 15-year project to map and sequence the human genome, a new era of science began. However, even after the early and successful publication of the draft sequence in 2001 [1], no one could have foreseen how, only a few years later, genome sequencing would explode to become a widely applied multi-purpose tool whose applications include the mapping of epigenetic modifications and the complete assessment of both coding and non-coding RNA transcripts. The game changer behind this explosion was the transition from the classic electrophoretic Sanger sequencing method, which had limited scalability, to image-based massively parallel 'sequencing-by-synthesis' platforms.
It had already become clear in the early days of the post-genome era, before these technological breakthroughs, that there were additional layers to the primary sequence waiting to be uncovered, and a small number of pilot epigenome projects, including the Human Epigenome Consortium (HEC), were launched [2,3]. While on the right track, these early projects lacked the sequencing capacity required to tackle the multidimensional space of the epigenome. This obstacle was overcome in 2006, with the introduction of next-generation sequencing platforms, and the NIH was commendably fast to capitalize on these developments by implementing both its ENCODE and 'Roadmap Epigenomics' projects. ENCODE aimed to use the newly generated epigenome maps to help discover and assign functional elements in the genome, while the Roadmap Epigenomics Program aimed to create reference maps for the majority of normal, primary cell types [4,5].
The success of these projects has helped to popularize epigenomics and has proved somewhat contagious, with additional consortia, such as the recently funded BLUEPRINT (European) and DEEP (German), arriving on the scene; the International Human Epigenome Consortium (IHEC) now coordinates international efforts.
The core technologies used in these projects, and in general across the field, have stabilized over the years and standards are now largely agreed upon. Chromatin immunoprecipitation followed by sequencing (ChIP-seq) [6,7] remains the standard assay for determining transcription factor binding, as well as for mapping the genome-wide distribution of histone modifications. Continued efforts to increase sensitivity and resolution have resulted in some recent technical improvements to the basic ChIP-seq method, in the form of nano-ChIP-seq [8] and ChIP-exo [9], respectively. By contrast, dozens of assays exist for DNA methylation [10], although most genome-wide studies rely on just a few of these [11-13]. As costs continue to decrease, methods are converging on whole-genome bisulfite sequencing [14], which had previously been prohibitively expensive. As with exome sequencing, the subject of Genome Biology's 2011 special issue [15], the driving force behind the ongoing explosion in epigenome studies, and data, has been an increase in sequencing capacity at reduced cost.
What can epigenomics do for you?
Epigenome data are very powerful and have multiple applications that extend beyond a simple map of a particular mark or modification in a given cell type. Below, I will highlight a few selected examples of these applications, although this is a far from exhaustive list.
Genome annotation
Mammalian genomes are large and complex. Understanding such genomes is not trivial and comparative genomics based on the primary DNA sequence alone, while powerful [16], cannot provide all the answers. As demonstrated several years ago [7], and further highlighted by the many recent ENCODE publications [17], chromatin signatures enable efficient and precise genome annotation of regulatory elements, and can pinpoint functional or cell type-specific regions of interest.
Cell identity
It has become abundantly clear over the past years that epigenomic maps provide more information than can be gained from gene expression data alone [6,7,18-20]. While genes are either expressed or not, chromatin states can add further refinement to a gene's activity status, such as whether it is primed or poised, and can also describe varying degrees of repressed states that would all look the same by any gene expression measure. The precise chromatin state of these loci can have clear consequences for how they behave in both normal development and disease.
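As a rough illustration of this point, the sketch below assigns a coarse chromatin state to a promoter from two histone marks plus a binary expression call. The choice of marks (H3K4me3 and H3K27me3), the state labels and the decision rules are assumptions made for this example only; they are not taken from the studies cited above and are far simpler than real chromatin-state models.

```python
from typing import NamedTuple


class PromoterMarks(NamedTuple):
    """Hypothetical per-promoter summary (all fields are illustrative)."""
    h3k4me3: bool    # assumed marker of active or primed promoters
    h3k27me3: bool   # assumed marker of Polycomb-associated repression
    expressed: bool  # binary expression call for the gene


def chromatin_state(marks: PromoterMarks) -> str:
    """Return a coarse, illustrative state label for one promoter."""
    if marks.h3k4me3 and marks.h3k27me3:
        return "poised"              # both marks present
    if marks.h3k4me3:
        return "active" if marks.expressed else "primed"
    if marks.h3k27me3:
        return "repressed_polycomb"  # repressive mark only
    return "unmarked"                # neither mark detected


# Three made-up genes that are all silent by expression alone.
genes = {
    "geneA": PromoterMarks(h3k4me3=True, h3k27me3=True, expressed=False),
    "geneB": PromoterMarks(h3k4me3=False, h3k27me3=True, expressed=False),
    "geneC": PromoterMarks(h3k4me3=False, h3k27me3=False, expressed=False),
}
states = {name: chromatin_state(marks) for name, marks in genes.items()}

for name, state in states.items():
    print(f"{name}: expressed={genes[name].expressed}, state={state}")
```

All three hypothetical genes would be indistinguishable in an expression dataset, yet each receives a different state label once the marks are taken into account.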
Disease
As highlighted by many studies of human disease, including several in this issue [21-27], epigenomic maps can be used to trace the origin of cells, dissect affected pathways and identify predictive biomarkers. Epigenome data have also proved to be powerful in helping to pinpoint disease-relevant regulatory elements through epigenome-wide association studies, or 'EWAS', especially when integrated with data from genome-wide association studies [17,28].
Challenges
Several of the reports in this special issue expand the catalog of user-friendly tools for the visualization of epigenomic datasets [27,29-31], and much work has previously been done elsewhere in this area, including the development of advanced epigenome browsers [32,33]. Making data even more accessible will be critical if the field is to continue its rapid growth and strengthen its impact. As the number of epigenomic datasets grows into the thousands and tens of thousands, it will be essential to plan for sufficient standards to be met and for data to be navigable in well-curated, high-quality databases. An additional challenge in data integration is that much of the full complexity of the epigenome lies in uncharted waters, with many known modifications remaining unmapped and other modifications, such as hydroxymethylation [34], moving to center stage. This dynamism in the types of data being produced is sure to generate increasing demand for new, refined bioinformatic tools.
Conclusions
The overall impact of the growing number of epigenomes, including the NIH Roadmap Epigenomics Project reference maps, will likely be underestimated. For example, almost every study uses the reference genome sequence, whether it is to design primers, target constructs or align sequencing reads, yet those studies rarely acknowledge the reference genome, because it is simply there to be used.
I predict that, as investigators become more accustomed to epigenome browsers and to using the existing data for various purposes, the reference set of epigenomic maps will also become a routine resource used in many studies. Applications would include providing a quick overview of a gene or locus of interest; helping to refine a hypothesis; assisting primer design by narrowing down the exact region of dynamic regulation; forming the basis of reporter assays by selecting with precision the functional elements of an upstream region; and so forth.
This special issue covers epigenomics over a wide range of organisms, systems and methods, all of which provide an informative sampling to illuminate the possibilities for future studies in this expanding and exciting field.
Competing interests
The author declares that he has no competing interests.