ACT: aggregation and correlation toolbox for analyses of genome tracks.

Justin Jee, Joel Rozowsky, Kevin Y Yip, Lucas Lochovsky, Robert Bjornson, Guoneng Zhong, Zhengdong Zhang, Yutao Fu, Jie Wang, Zhiping Weng, Mark Gerstein.

Abstract

We have implemented the aggregation and correlation toolbox (ACT), an efficient, multifaceted toolbox for analyzing continuous signal and discrete region tracks from high-throughput genomic experiments, such as RNA-seq or ChIP-chip signal profiles from the ENCODE and modENCODE projects, or lists of single nucleotide polymorphisms from the 1000 Genomes Project. It can generate aggregate profiles of a given track around a set of specified anchor points, such as transcription start sites. It can also correlate related tracks and analyze them for saturation, i.e. how much of a certain feature is covered with each successive experiment. The ACT site contains downloadable code in a variety of formats, interactive web servers (for use on small quantities of data), example datasets, documentation and a gallery of outputs. Here, we explain the components of the toolbox in more detail and apply them in various contexts.

AVAILABILITY: ACT is available at http://act.gersteinlab.org

CONTACT: pi@gersteinlab.org

Year: 2011 PMID: 21349863 PMCID: PMC3072554 DOI: 10.1093/bioinformatics/btr092

Source DB: PubMed Journal: Bioinformatics ISSN: 1367-4803 Impact factor: 6.937

1 INTRODUCTION

There is now an abundance of genome-sized data from high-throughput genomic experiments. For instance, there are ChIP-chip, ChIP-seq and RNA-seq experiments from the ENCODE (ENCODE Project Consortium, 2007) and modENCODE (modENCODE Consortium, 2009) projects. There are also genome sequence data that can be used to generate tracks measuring sequence content, such as the densities of single nucleotide polymorphisms (SNPs) from dbSNP (Sherry et al.) and the 1000 Genomes Project. In most cases, these data are represented either as signal tracks that describe a genomic landscape or as discrete region tracks that tag portions of the genome as active. The aggregation and correlation toolbox (ACT) provides a set of programs that can be applied to any experiment producing data in these formats. The ability to analyze multiple genomic datasets is important, as demonstrated by tools like Galaxy (Giardine et al.). ACT provides functionality that complements existing methods of analysis.

2 THE ACT TOOLBOX: OVERVIEW

ACT facilitates three main types of analysis:

Aggregation: in many scenarios, it is useful to determine the distribution of signals in a signal track relative to certain genomic anchors (Fig. 1, aggregation). For example, it has recently been reported that the contribution of each transcription factor binding site to tissue-specific gene expression depends on its position relative to the transcription start site (TSS) (MacIssac et al.). It is thus useful to aggregate binding signals of transcription factors at a certain distance from the TSSs of all genes (the anchors). In general, this type of aggregation analysis helps identify proximity correlations and functional relationships between the signals and anchors. In the ENCODE pilot study (ENCODE Project Consortium, 2007), it was used to demonstrate positional relationships between chromatin features and TSSs.

Fig. 1. Uses of ACT using signal tracks from various sources. Aggregation: signal around all TSSs is aggregated to give an average signal profile, for example of Baf155 binding around TSSs (ENCODE Project). Figure made in Excel. Correlation: multiple signal tracks are correlated to show which tracks are more or less related to each other. In the selected example, a heatmap of the SNP track correlation between four individuals (dbSNP) leads to a dendrogram of their phylogenetic relationship. Figure made using Web ACT. Saturation: each additional signal track increases the number of base pairs covered. When the addition of signal tracks is considered in all possible combinations, the average increase in coverage, with error bars, can be visualized by a saturation plot. In the example, data are taken from individuals from dbSNP [with additional genomes from Ahn et al., Bentley et al., Drmanac et al. and Kim et al.]. In each box plot, the top and bottom pink bars correspond to the maximum and minimum non-outlier values; the top edge, middle line and bottom edge of the box correspond to the upper quartile, median and lower quartile; the black dot is the mean; and red circles are outliers. Figure made using the ACT downloadable saturation program.

Correlation: it is also useful to consider how multiple related signal tracks are correlated with each other. For example, a previous study (Zhang et al.) demonstrated, using whole-track correlation methods, that there was a consistent relationship among transcription factors as judged by their signal profiles across several ChIP-chip experiments. By providing a means of correlating signal tracks with each other, ACT allows for an initial comparison of different experiments to see which are more similar or related than others (Fig. 1, correlation).

Saturation: another important type of analysis is determining the number of experimental conditions required to achieve a high genomic coverage of the biological phenomenon under study. For example, using ChIP-chip or ChIP-seq experiments, one could identify a set of transcription factor binding sites from a human cell line. When the experiment is repeated using another cell line, some additional binding sites could be identified. How many cell lines need to be considered in order to reach the point of saturation, so that few new binding sites would be identified by extra experiments? ACT produces plots that help answer this type of question.
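To make the aggregation analysis outlined in this section concrete, the following is a minimal Python sketch of averaging a signal in fixed-width bins around a set of anchors. It is an illustration only, not ACT's own code: the function name `aggregate`, the sparse dictionary representation of the signal track, the treatment of missing positions as zero and the lack of strand handling are all assumptions made for brevity.

```python
def aggregate(signal, anchors, radius, n_bins):
    """Average a per-base signal in n_bins equal-width bins spanning
    [anchor - radius, anchor + radius) around every anchor.

    signal  : dict mapping position -> signal value (hypothetical sparse
              track; positions absent from the dict count as 0.0)
    anchors : list of anchor positions, e.g. TSS coordinates
    Strand is ignored in this sketch.
    """
    window = 2 * radius
    if window % n_bins != 0:
        raise ValueError("2 * radius must be divisible by n_bins")
    bin_width = window // n_bins

    totals = [0.0] * n_bins
    for anchor in anchors:
        start = anchor - radius
        for offset in range(window):
            # offset // bin_width is the bin this base falls into
            totals[offset // bin_width] += signal.get(start + offset, 0.0)

    # Mean signal per base, averaged over all anchors
    denom = bin_width * len(anchors)
    return [total / denom for total in totals]


if __name__ == "__main__":
    track = {100: 4.0, 101: 4.0, 200: 2.0}
    print(aggregate(track, anchors=[100, 200], radius=2, n_bins=2))
```

Using more bins for the same radius gives a finer profile at the cost of noisier per-bin averages.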

3 DETAILS AND USE CASES

ACT is available as a suite of downloadable scripts corresponding to the aggregation, correlation and saturation components of the toolbox. The tool is intended for Linux/Unix users with Java and Python. In addition, it is useful to have R for output visualization for the aggregation and correlation tools. There is also a compendium of other versions of the tool components written in different languages and with varied functionality. For some types of analysis, there are web components for demonstration purposes on small datasets with built-in visualization features. However, because most whole-genome signal tracks are too large to upload via standard Internet connections, users are advised to download the toolbox and run it locally. As performing these calculations on whole-genome data can be especially time intensive, the version of the tools presented here has been designed to run efficiently on large datasets.

Aggregation: the aggregation component is designed to take a signal track (.sgr or .wig) and an annotation track (.bed) as input, and compute the average signal over a certain number of base pairs upstream and downstream of (i.e. a fixed radius around) the annotations. In other words, signal values are taken from the region surrounding each annotation, and averaged over the number of annotation anchors provided. The base pair resolution of the aggregation can be specified by the number of bins (narrower bins give more data points and therefore finer granularity). Results of such a calculation can be plotted as in Figure 1 (aggregation). ACT can also compute the standard deviation, median and quartiles, which can be viewed as a boxplot. In addition, it can scale the aggregation over regions such as areas between transcription start and end sites or within exons, so that all of the aggregate signals within those regions fall into a fixed number of bins. In this case, bin size is dynamically computed for each region so that the same number of bins covers regions of different sizes.

Correlation: the correlation analysis takes a set of active genomic regions (.bed) such as a SNP track, or a genomic signal track (.wig). It then divides genomic coordinates into bins and gives each bin a value corresponding to the mean or maximum of the signal values that fall within the bin, or assigns a value based on the number of 'active regions' that fall within the bin. A final correlation matrix is created based on the Spearman, Pearson or normal score correlation between each pair of binned datasets. The results can be visualized as a heatmap or as a phylogenetic tree using programs such as PHYLIP (Felsenstein, 1996). One version of the correlation tool uses parallelization to decrease the program's overall running time. This component was written largely in Java. Examples of correlation output based on SNP tracks and ChIP-chip data are shown in Figure 1 (correlation).

Saturation: we provide an efficient implementation of a saturation plot generator. Each input file corresponds to one dataset (e.g. one new individual, in .bed format), and each line in a file specifies a genomic location that has the biological phenomenon under study (e.g. tagged SNPs). The saturation plot shows, with each new dataset (x-axis), what percentage of genomic base pairs are covered (y-axis). The program considers the various combinations in which tracks can be added, so that the increase in base pair coverage is a range of values based on all the files in the input. The resulting plot is output in PDF format (Fig. 1, saturation), in which a series of boxplots depicts increasing base pair coverage, where the boxplot at each position m on the x-axis shows the coverage values of all combinations of m conditions. Boxplots that approach a horizontal asymptote indicate that the coverage has reached saturation. Our implementation makes use of special data structures to avoid redundant counting. It normally takes less than a minute to generate the plot for up to 30 input files, each with a few thousand lines. To handle more files and files with more lines, the tool also provides an option to compute the coverage of a random sample of the input file combinations.
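The idea behind the saturation plot can be sketched in a few lines of Python. This is a brute-force illustration under stated assumptions, not the ACT implementation: it does not use the special data structures mentioned above, it recomputes every union from scratch, it assumes all intervals lie on a single chromosome, and it reports covered base counts rather than percentages. The names `covered_bases` and `saturation` are hypothetical.

```python
from itertools import combinations


def covered_bases(intervals):
    """Count distinct bases covered by half-open (start, end) intervals."""
    total = 0
    current_end = None
    for start, end in sorted(intervals):
        if current_end is None or start > current_end:
            # Disjoint from everything seen so far: count it whole
            total += end - start
            current_end = end
        elif end > current_end:
            # Overlaps or abuts: count only the newly covered part
            total += end - current_end
            current_end = end
    return total


def saturation(datasets):
    """Map m -> coverage of every combination of m datasets.

    datasets : list of interval lists, one per input file (assumed to be
               already parsed from .bed into (start, end) tuples)
    """
    result = {}
    for m in range(1, len(datasets) + 1):
        result[m] = [
            covered_bases([iv for ds in combo for iv in ds])
            for combo in combinations(datasets, m)
        ]
    return result


if __name__ == "__main__":
    data = [[(0, 10)], [(5, 15)], [(0, 10)]]
    for m, values in saturation(data).items():
        print(m, values)
```

Each list in the result holds the values that one boxplot at position m would summarize. Because the number of combinations grows exponentially with the number of datasets, exhaustive enumeration like this is only practical for a handful of inputs, which is presumably why sampling a random subset of combinations is offered for larger inputs.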

4 DISCUSSION

There are a number of additional analyses that can be done to fine-tune the output of ACT. For instance, to handle tightly clustered anchors, it is possible to use the online genomic signal aggregator (GSA), which assigns each genomic position to the nearest anchor in order to reduce the artifacts caused by subsets of anchors clustering together. Also, aggregation can be used in conjunction with genome structure correction to determine whether the enrichments of a given signal with respect to anchor points are significant relative to the non-random positioning of the anchors (ENCODE Project Consortium, 2007). This correction takes into account the fact that a 'random' distribution of anchors on the genome arises from a distinctly non-uniform distribution. Practically, this could be carried out through ACT by comparing the aggregation over anchors (e.g. TSSs) to that from 'randomized anchors', where the latter are generated by shifting anchor coordinates along the chromosome or transferring anchor coordinates from a second chromosome to the one of interest. Finally, ACT can be used as a starting point for other downstream analyses. For RNA-seq data tracks, further analysis can be conducted with RSEQtools (Habegger et al.) to, for example, determine additional similarities between two or more highly correlated tracks. The results of correlation analysis can also be fed into downstream principal component analysis, allowing for grouping of coregulating factors with their coregulated sites. This would simply involve diagonalization of the output correlation matrix from ACT. Saturation analysis can also be used to inform future experimental design.

Funding: National Institutes of Health; A.L. Williams Professorship funds.

Conflict of Interest: none declared.

12 in total

1. dbSNP: the NCBI database of genetic variation.

Authors: S T Sherry; M H Ward; M Kholodov; J Baker; L Phan; E M Smigielski; K Sirotkin
Journal: Nucleic Acids Res Date: 2001-01-01 Impact factor: 16.971

2. Galaxy: a platform for interactive large-scale genome analysis.

Authors: Belinda Giardine; Cathy Riemer; Ross C Hardison; Richard Burhans; Laura Elnitski; Prachi Shah; Yi Zhang; Daniel Blankenberg; Istvan Albert; James Taylor; Webb Miller; W James Kent; Anton Nekrutenko
Journal: Genome Res Date: 2005-09-16 Impact factor: 9.043

3. Statistical analysis of the genomic distribution and correlation of regulatory elements in the ENCODE regions.

Authors: Zhengdong D Zhang; Alberto Paccanaro; Yutao Fu; Sherman Weissman; Zhiping Weng; Joseph Chang; Michael Snyder; Mark B Gerstein
Journal: Genome Res Date: 2007-06 Impact factor: 9.043

4. The first Korean genome sequence and analysis: full genome sequencing for a socio-ethnic group.

Authors: Sung-Min Ahn; Tae-Hyung Kim; Sunghoon Lee; Deokhoon Kim; Ho Ghang; Dae-Soo Kim; Byoung-Chul Kim; Sang-Yoon Kim; Woo-Yeon Kim; Chulhong Kim; Daeui Park; Yong Seok Lee; Sangsoo Kim; Rohit Reja; Sungwoong Jho; Chang Geun Kim; Ji-Young Cha; Kyung-Hee Kim; Bonghee Lee; Jong Bhak; Seong-Jin Kim
Journal: Genome Res Date: 2009-05-26 Impact factor: 9.043

5. Inferring phylogenies from protein sequences by parsimony, distance, and likelihood methods.

Authors: J Felsenstein
Journal: Methods Enzymol Date: 1996 Impact factor: 1.600

6. Identification and analysis of functional elements in 1% of the human genome by the ENCODE pilot project.

Authors: Ewan Birney; John A Stamatoyannopoulos; Anindya Dutta; Roderic Guigó; Thomas R Gingeras; Elliott H Margulies; Zhiping Weng; Michael Snyder; Emmanouil T Dermitzakis; Robert E Thurman; Michael S Kuehn; Christopher M Taylor; Shane Neph; Christoph M Koch; Saurabh Asthana; Ankit Malhotra; Ivan Adzhubei; Jason A Greenbaum; Robert M Andrews; Paul Flicek; Patrick J Boyle; Hua Cao; Nigel P Carter; Gayle K Clelland; Sean Davis; Nathan Day; Pawandeep Dhami; Shane C Dillon; Michael O Dorschner; Heike Fiegler; Paul G Giresi; Jeff Goldy; Michael Hawrylycz; Andrew Haydock; Richard Humbert; Keith D James; Brett E Johnson; Ericka M Johnson; Tristan T Frum; Elizabeth R Rosenzweig; Neerja Karnani; Kirsten Lee; Gregory C Lefebvre; Patrick A Navas; Fidencio Neri; Stephen C J Parker; Peter J Sabo; Richard Sandstrom; Anthony Shafer; David Vetrie; Molly Weaver; Sarah Wilcox; Man Yu; Francis S Collins; Job Dekker; Jason D Lieb; Thomas D Tullius; Gregory E Crawford; Shamil Sunyaev; William S Noble; Ian Dunham; France Denoeud; Alexandre Reymond; Philipp Kapranov; Joel Rozowsky; Deyou Zheng; Robert Castelo; Adam Frankish; Jennifer Harrow; Srinka Ghosh; Albin Sandelin; Ivo L Hofacker; Robert Baertsch; Damian Keefe; Sujit Dike; Jill Cheng; Heather A Hirsch; Edward A Sekinger; Julien Lagarde; Josep F Abril; Atif Shahab; Christoph Flamm; Claudia Fried; Jörg Hackermüller; Jana Hertel; Manja Lindemeyer; Kristin Missal; Andrea Tanzer; Stefan Washietl; Jan Korbel; Olof Emanuelsson; Jakob S Pedersen; Nancy Holroyd; Ruth Taylor; David Swarbreck; Nicholas Matthews; Mark C Dickson; Daryl J Thomas; Matthew T Weirauch; James Gilbert; Jorg Drenkow; Ian Bell; XiaoDong Zhao; K G Srinivasan; Wing-Kin Sung; Hong Sain Ooi; Kuo Ping Chiu; Sylvain Foissac; Tyler Alioto; Michael Brent; Lior Pachter; Michael L Tress; Alfonso Valencia; Siew Woh Choo; Chiou Yu Choo; Catherine Ucla; Caroline Manzano; Carine Wyss; Evelyn Cheung; Taane G Clark; James B Brown; Madhavan Ganesh; Sandeep Patel; Hari Tammana; 
Jacqueline Chrast; Charlotte N Henrichsen; Chikatoshi Kai; Jun Kawai; Ugrappa Nagalakshmi; Jiaqian Wu; Zheng Lian; Jin Lian; Peter Newburger; Xueqing Zhang; Peter Bickel; John S Mattick; Piero Carninci; Yoshihide Hayashizaki; Sherman Weissman; Tim Hubbard; Richard M Myers; Jane Rogers; Peter F Stadler; Todd M Lowe; Chia-Lin Wei; Yijun Ruan; Kevin Struhl; Mark Gerstein; Stylianos E Antonarakis; Yutao Fu; Eric D Green; Ulaş Karaöz; Adam Siepel; James Taylor; Laura A Liefer; Kris A Wetterstrand; Peter J Good; Elise A Feingold; Mark S Guyer; Gregory M Cooper; George Asimenos; Colin N Dewey; Minmei Hou; Sergey Nikolaev; Juan I Montoya-Burgos; Ari Löytynoja; Simon Whelan; Fabio Pardi; Tim Massingham; Haiyan Huang; Nancy R Zhang; Ian Holmes; James C Mullikin; Abel Ureta-Vidal; Benedict Paten; Michael Seringhaus; Deanna Church; Kate Rosenbloom; W James Kent; Eric A Stone; Serafim Batzoglou; Nick Goldman; Ross C Hardison; David Haussler; Webb Miller; Arend Sidow; Nathan D Trinklein; Zhengdong D Zhang; Leah Barrera; Rhona Stuart; David C King; Adam Ameur; Stefan Enroth; Mark C Bieda; Jonghwan Kim; Akshay A Bhinge; Nan Jiang; Jun Liu; Fei Yao; Vinsensius B Vega; Charlie W H Lee; Patrick Ng; Atif Shahab; Annie Yang; Zarmik Moqtaderi; Zhou Zhu; Xiaoqin Xu; Sharon Squazzo; Matthew J Oberley; David Inman; Michael A Singer; Todd A Richmond; Kyle J Munn; Alvaro Rada-Iglesias; Ola Wallerman; Jan Komorowski; Joanna C Fowler; Phillippe Couttet; Alexander W Bruce; Oliver M Dovey; Peter D Ellis; Cordelia F Langford; David A Nix; Ghia Euskirchen; Stephen Hartman; Alexander E Urban; Peter Kraus; Sara Van Calcar; Nate Heintzman; Tae Hoon Kim; Kun Wang; Chunxu Qu; Gary Hon; Rosa Luna; Christopher K Glass; M Geoff Rosenfeld; Shelley Force Aldred; Sara J Cooper; Anason Halees; Jane M Lin; Hennady P Shulha; Xiaoling Zhang; Mousheng Xu; Jaafar N S Haidar; Yong Yu; Yijun Ruan; Vishwanath R Iyer; Roland D Green; Claes Wadelius; Peggy J Farnham; Bing Ren; Rachel A Harte; Angie S Hinrichs; Heather 
Trumbower; Hiram Clawson; Jennifer Hillman-Jackson; Ann S Zweig; Kayla Smith; Archana Thakkapallayil; Galt Barber; Robert M Kuhn; Donna Karolchik; Lluis Armengol; Christine P Bird; Paul I W de Bakker; Andrew D Kern; Nuria Lopez-Bigas; Joel D Martin; Barbara E Stranger; Abigail Woodroffe; Eugene Davydov; Antigone Dimas; Eduardo Eyras; Ingileif B Hallgrímsdóttir; Julian Huppert; Michael C Zody; Gonçalo R Abecasis; Xavier Estivill; Gerard G Bouffard; Xiaobin Guan; Nancy F Hansen; Jacquelyn R Idol; Valerie V B Maduro; Baishali Maskeri; Jennifer C McDowell; Morgan Park; Pamela J Thomas; Alice C Young; Robert W Blakesley; Donna M Muzny; Erica Sodergren; David A Wheeler; Kim C Worley; Huaiyang Jiang; George M Weinstock; Richard A Gibbs; Tina Graves; Robert Fulton; Elaine R Mardis; Richard K Wilson; Michele Clamp; James Cuff; Sante Gnerre; David B Jaffe; Jean L Chang; Kerstin Lindblad-Toh; Eric S Lander; Maxim Koriabine; Mikhail Nefedov; Kazutoyo Osoegawa; Yuko Yoshinaga; Baoli Zhu; Pieter J de Jong
Journal: Nature Date: 2007-06-14 Impact factor: 49.962

7. Human genome sequencing using unchained base reads on self-assembling DNA nanoarrays.

Authors: Radoje Drmanac; Andrew B Sparks; Matthew J Callow; Aaron L Halpern; Norman L Burns; Bahram G Kermani; Paolo Carnevali; Igor Nazarenko; Geoffrey B Nilsen; George Yeung; Fredrik Dahl; Andres Fernandez; Bryan Staker; Krishna P Pant; Jonathan Baccash; Adam P Borcherding; Anushka Brownley; Ryan Cedeno; Linsu Chen; Dan Chernikoff; Alex Cheung; Razvan Chirita; Benjamin Curson; Jessica C Ebert; Coleen R Hacker; Robert Hartlage; Brian Hauser; Steve Huang; Yuan Jiang; Vitali Karpinchyk; Mark Koenig; Calvin Kong; Tom Landers; Catherine Le; Jia Liu; Celeste E McBride; Matt Morenzoni; Robert E Morey; Karl Mutch; Helena Perazich; Kimberly Perry; Brock A Peters; Joe Peterson; Charit L Pethiyagoda; Kaliprasad Pothuraju; Claudia Richter; Abraham M Rosenbaum; Shaunak Roy; Jay Shafto; Uladzislau Sharanhovich; Karen W Shannon; Conrad G Sheppy; Michel Sun; Joseph V Thakuria; Anne Tran; Dylan Vu; Alexander Wait Zaranek; Xiaodi Wu; Snezana Drmanac; Arnold R Oliphant; William C Banyai; Bruce Martin; Dennis G Ballinger; George M Church; Clifford A Reid
Journal: Science Date: 2009-11-05 Impact factor: 47.728

8. A highly annotated whole-genome sequence of a Korean individual.

Authors: Jong-Il Kim; Young Seok Ju; Hansoo Park; Sheehyun Kim; Seonwook Lee; Jae-Hyuk Yi; Joann Mudge; Neil A Miller; Dongwan Hong; Callum J Bell; Hye-Sun Kim; In-Soon Chung; Woo-Chung Lee; Ji-Sun Lee; Seung-Hyun Seo; Ji-Young Yun; Hyun Nyun Woo; Heewook Lee; Dongwhan Suh; Seungbok Lee; Hyun-Jin Kim; Maryam Yavartanoo; Minhye Kwak; Ying Zheng; Mi Kyeong Lee; Hyunjun Park; Jeong Yeon Kim; Omer Gokcumen; Ryan E Mills; Alexander Wait Zaranek; Joseph Thakuria; Xiaodi Wu; Ryan W Kim; Jim J Huntley; Shujun Luo; Gary P Schroth; Thomas D Wu; HyeRan Kim; Kap-Seok Yang; Woong-Yang Park; Hyungtae Kim; George M Church; Charles Lee; Stephen F Kingsmore; Jeong-Sun Seo
Journal: Nature Date: 2009-07-08 Impact factor: 49.962

9. RSEQtools: a modular framework to analyze RNA-Seq data using compact, anonymized data summaries.

Authors: Lukas Habegger; Andrea Sboner; Tara A Gianoulis; Joel Rozowsky; Ashish Agarwal; Michael Snyder; Mark Gerstein
Journal: Bioinformatics Date: 2010-12-05 Impact factor: 6.937

10. Unlocking the secrets of the genome.

Authors: Susan E Celniker; Laura A L Dillon; Mark B Gerstein; Kristin C Gunsalus; Steven Henikoff; Gary H Karpen; Manolis Kellis; Eric C Lai; Jason D Lieb; David M MacAlpine; Gos Micklem; Fabio Piano; Michael Snyder; Lincoln Stein; Kevin P White; Robert H Waterston
Journal: Nature Date: 2009-06-18 Impact factor: 49.962
