metagene Profiles Analyses Reveal Regulatory Element's Factor-Specific Recruitment Patterns.

Charles Joly Beauparlant^1,2, Fabien C Lamaze^1,3, Astrid Deschênes^1, Rawane Samb^1, Audrey Lemaçon^1, Pascal Belleau^1, Steve Bilodeau^1,3,4, Arnaud Droit^1,2.

Abstract

ChIP-Sequencing (ChIP-Seq) provides a vast amount of information regarding the localization of proteins across the genome. The aggregation of ChIP-Seq enrichment signal in a metagene plot is an approach commonly used to summarize data complexity and to obtain a high level visual representation of the general occupancy pattern of a protein. Here we present the R package metagene, the graphical interface Imetagene and the companion package similaRpeak. Together, they provide a framework to integrate, summarize and compare the ChIP-Seq enrichment signal from complex experimental designs. Those packages identify and quantify similarities or dissimilarities in patterns between large numbers of ChIP-Seq profiles. We used metagene to investigate the differential occupancy of regulatory factors at noncoding regulatory regions (promoters and enhancers) in relation to transcriptional activity in GM12878 B-lymphocytes. The relationships between occupancy patterns and transcriptional activity suggest two different mechanisms of action for transcriptional control: i) a "gradient effect" where the regulatory factor occupancy levels follow transcription and ii) a "threshold effect" where the regulatory factor occupancy levels max out prior to reaching maximal transcription. metagene, Imetagene and similaRpeak are implemented in R under the Artistic license 2.0 and are available on Bioconductor.

Year: 2016 PMID: 27538250 PMCID: PMC4990179 DOI: 10.1371/journal.pcbi.1004751

Source DB: PubMed Journal: PLoS Comput Biol ISSN: 1553-734X Impact factor: 4.475

This is a PLOS Computational Biology Software paper.

Introduction

Understanding the global regulation of gene expression programs is an important goal of functional genomics studies. To this end, it is now standard procedure to survey the occupancy of regulatory proteins genome-wide using chromatin immunoprecipitation coupled with massively parallel sequencing (ChIP-Seq) [1]. The affordability and accessibility of the technique now allow more complex experimental designs containing many samples, treatments, controls, comparisons and technical replicates. Furthermore, the abundance of public datasets, such as those provided by the ENCODE [2] and Roadmap Epigenomics [3] consortiums, provides a wealth of information. Unfortunately, the integration of large amounts of ChIP-Seq information remains challenging.

In a typical ChIP-Seq analysis, reads are first aligned using an aligner of choice and peaks are called using peak calling algorithms, such as MACS [4] or PICS [5], to obtain a list of occupied regions. Then, these regions are annotated to genes [6] and/or used to search for DNA binding motifs [7]. In addition, tools were developed to quantitatively compare regions from ChIP-Seq experiments in order to define regions with differential binding between conditions [8]. These methods differ mainly in the algorithms and models used to manage background, to normalize read counts and to estimate the read distribution across the genome. While these tools allow the discovery of regions that are differentially occupied by a factor of interest, they are unable to evaluate differences in the general occupancy patterns of DNA-binding proteins. Furthermore, they rely on the peak calling step, which varies greatly depending on the algorithm and the parameters used [9].

Current approaches to compare and summarize enrichment signals for groups of regions rely on visual representations of the average enrichment at a specific position. These representations are known as metagene plots (also referred to as meta-gene [10] or aggregation plots [11]). To compare multiple samples, many tools implemented reads per million aligned [11, 12] or quantile [13, 14] normalizations. The addition of confidence intervals (represented as ribbons) based on standard errors (of mean or of percentiles) in ngs.plot [12], on bootstrap approaches in ChIPseeker [15] or as standard error in seqPlots [13] improved the estimation of the mean. However, while confidence intervals are effective tools to estimate the range within which the true mean is likely to lie, profile comparisons require statistical testing. In addition, valuable information embedded in the enrichment profiles, such as the position of the binding event inside the region or the presence of a specific pattern regardless of its amplitude, is currently ignored. Therefore, representation tools enabling a quantitative assessment and robust statistical comparisons of metagene profiles are needed.

We developed the metagene package to quantitatively compare the enrichment profiles of groups of regions. Specifically, this package is designed to 1) facilitate the integration of signal from many datasets linked by complex experimental designs, 2) statistically compare the enrichment profiles of groups of genomic regions and 3) provide visual representations of the data to facilitate interpretation.

Here we used the metagene package to investigate how regulatory factors contribute to the transcriptional output of noncoding regulatory regions. Indeed, recruitment of regulatory factors to noncoding regulatory regions, including enhancer and promoter regions, modulates the transcriptional response of each gene. Using the metagene and similaRpeak packages, we identified the similarities and dissimilarities in the recruitment patterns of these factors at enhancer and promoter regions. Our results suggest that there are two distinct mechanisms of action for transcriptional regulators. Indeed, we discovered that the level of the regulatory factors either correlates with the transcriptional activity or saturates prior to maximal transcriptional activity of the regulatory region. We termed those patterns “gradient effect” and “threshold effect”.

Design and Implementation

The metagene package builds upon Bioconductor scalable data structures for representing annotated ranges on the genome [16]. Additionally, to efficiently import large datasets, metagene supports the most common genomic file formats such as BAM, BED and narrowPeak/broadPeak. The number of files used in a single analysis is limited only by the available computer memory. To reduce memory usage, metagene produces coverages only for the genomic regions of interest and stores this information using run-length encoding. It is possible to compare multiple region groups and multiple experiments in a single analysis. To increase the analytical power, metagene uses the controls to estimate the signal-to-noise ratio and remove background signal. The datasets are also normalized for an accurate comparison. Furthermore, the directionality of the genomic regions (i.e. the strand) can be used to highlight asymmetric enrichment patterns. In the final graphical output, the metagene plot, each curve summarizes the information of multiple genomic regions (termed region groups) from a single experiment. When used with the similaRpeak package, our approach allows the comparison of multiple samples and makes it possible to statistically compare the results with metrics adapted to different profile features. The Imetagene package offers a simple graphical interface to manage complex experimental designs. A workflow of a typical metagene analysis is provided in Fig 1.
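To illustrate why run-length encoding reduces memory usage for coverage data, the following minimal sketch collapses a per-base coverage vector into (value, run length) pairs. It is written in plain Python for illustration only and is not the data structure used by the package; the function names `rle_encode` and `rle_decode` and the toy coverage values are assumptions.

```python
from itertools import groupby


def rle_encode(coverage):
    """Collapse a per-base coverage vector into (value, run_length) pairs."""
    return [(value, sum(1 for _ in run)) for value, run in groupby(coverage)]


def rle_decode(runs):
    """Expand (value, run_length) pairs back into a per-base vector."""
    decoded = []
    for value, length in runs:
        decoded.extend([value] * length)
    return decoded


# Toy coverage over an 11 bp region (hypothetical values).
coverage = [0, 0, 0, 2, 2, 5, 5, 5, 5, 0, 0]
runs = rle_encode(coverage)
print(runs)
```

Because ChIP-Seq coverage is constant over long stretches (especially zero-coverage stretches), the number of runs is usually far smaller than the number of base pairs.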

Fig 1

metagene workflow.

A metagene analysis requires 3 types of inputs: 1) a list of genomic regions (BED or GRanges formats), 2) alignment files (BAM format) and 3) a design sheet (data frame format) explaining the relations between samples. The alignment files are processed to extract the coverages of every genomic region. Afterward, the background is removed from the coverages and the signal is normalized (reads per million aligned, or RPM) to allow comparison between samples. The main output is the metagene plot. The other outputs are the curve values and confidence intervals (CI) used to produce the plot and an interactive heatmap with Imetagene. The results are compatible with similaRpeak for profile characterization.

In order to quantitatively compare different experiments, it is crucial to take into account the signal-to-noise ratio and to normalize samples. Indeed, the ChIP-Seq signal is a mixture of legitimate signal and noise. The experimental noise is influenced by biological factors such as the GC content and the chromatin structure [17] and by technical factors such as the antibody quality, the cell number, the DNA fragmentation and the library construction [18]. A common approach to separate true signal from noise is to use controls. Ideally, the controls should be normalized to fit only the noise component of the ChIP signal, since only this part of the signal follows the same distribution [19]. In order to normalize the controls before subtracting the background, metagene uses the Normalization of ChIP-seq (NCIS) approach [20] to calculate the signal-to-noise ratio. This approach performs well on ChIP-Seq datasets [19] and is readily available in R (Fig 2A and 2B show the effect of noise reduction). If multiple samples are compared together, they should be normalized to take into account the difference in library sizes. This is performed in metagene by converting the raw coverage values to reads per million aligned (RPM).
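The two steps described above (background removal followed by RPM normalization) can be sketched as follows. This is an illustration under stated assumptions, not the package's implementation: the scaling factor `noise_ratio` stands in for the value that NCIS would estimate and is simply passed in, negative values after subtraction are clamped to zero, and the function names and numbers are hypothetical.

```python
def subtract_background(chip, control, noise_ratio):
    """Subtract the scaled control coverage from the ChIP coverage.

    `noise_ratio` is assumed to be estimated elsewhere (for example by NCIS);
    clamping at zero is an assumption of this sketch.
    """
    return [max(0.0, s - noise_ratio * c) for s, c in zip(chip, control)]


def to_rpm(coverage, aligned_reads):
    """Convert raw coverage values to reads per million aligned (RPM)."""
    if aligned_reads <= 0:
        raise ValueError("aligned_reads must be positive")
    scale = 1_000_000 / aligned_reads
    return [value * scale for value in coverage]


# Hypothetical raw coverages over five positions.
chip = [10, 40, 90, 40, 10]
control = [8, 10, 12, 10, 8]

cleaned = subtract_background(chip, control, noise_ratio=0.5)
rpm = to_rpm(cleaned, aligned_reads=20_000_000)
print(cleaned)
print(rpm)
```

Dividing by the library size puts samples sequenced at different depths on a common scale, which is what makes curves from different experiments comparable on the same plot.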
It is also possible to flip the genomic regions located on the negative strand so that every region is represented in the 5’→3’ orientation. The profile of each group defined in the design is calculated using either an average or median profile, as specified by the user. A confidence interval of the estimator (mean or median) is computed at each base pair using bootstraps (1000 times by default) for each group profile. To reduce the effects of extreme coverage values, a data binning strategy with customizable bin sizes is applied before bootstrapping. Visually, the confidence interval is represented by a ribbon which includes an editable percentage (default 95%) of the sampled values (see S1 Text for more information on the bootstrap approach implemented in metagene). Using the Imetagene package, it is also possible to preview the regions as an interactive heatmap (S1 Fig).
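The binning and bootstrap steps can be sketched as below. This is a simplified percentile bootstrap of the mean profile, written under assumptions: whole regions are resampled with replacement, bins are averaged, and the function names, seed and toy data are hypothetical. The exact procedure used by metagene is described in S1 Text and may differ.

```python
import random


def bin_profile(values, bin_size):
    """Average consecutive positions into bins of at most `bin_size` values."""
    bins = []
    for start in range(0, len(values), bin_size):
        chunk = values[start:start + bin_size]
        bins.append(sum(chunk) / len(chunk))
    return bins


def bootstrap_ci(regions, n_boot=1000, level=0.95, seed=0):
    """Percentile bootstrap confidence interval of the mean at each position."""
    rng = random.Random(seed)
    n = len(regions)
    boot_means = []
    for _ in range(n_boot):
        # Resample whole regions with replacement, then average per position.
        sample = rng.choices(regions, k=n)
        boot_means.append([sum(column) / n for column in zip(*sample)])
    alpha = (1.0 - level) / 2.0
    lo_idx = int(alpha * (n_boot - 1))
    hi_idx = int((1.0 - alpha) * (n_boot - 1))
    lower, upper = [], []
    for column in zip(*boot_means):
        ordered = sorted(column)
        lower.append(ordered[lo_idx])
        upper.append(ordered[hi_idx])
    return lower, upper


# Three hypothetical regions, four positions each, binned by two.
raw_regions = [[2, 2, 4, 8], [2, 2, 6, 10], [2, 2, 5, 9]]
binned = [bin_profile(region, bin_size=2) for region in raw_regions]
lower, upper = bootstrap_ci(binned, n_boot=1000, level=0.95)
print(binned)
print(lower, upper)
```

The `lower` and `upper` vectors correspond to the edges of the ribbon drawn around each curve.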

Fig 2

Impact of noise removal and description of the pseudometrics.

Metagene plots of the BCL11A transcription factor (A) with noise removal using the NCIS algorithm and (B) without noise removal. The x-axis is centered on enhancers and promoters ±1000bp. The y-axis represents the mean occupancy normalized in reads per million (RPM). Each line represents the mean occupancy of the BCL11A replicates. Groups of transcriptional activity of enhancers or promoters are identified by different colors (red = no CAGE signal; green = low CAGE signal; blue = moderate CAGE signal; purple = high CAGE signal; see S1 Text). Ribbons represent the 95% confidence interval of the mean calculated using 1000 bootstraps. (C) Description of some of the pseudometrics implemented in the similaRpeak package.

A unique feature of the metagene package is the implementation of a statistical comparison between profiles to detect differential enrichment. The comparison is done through a permutation test using user-specified metrics and is independent of the confidence intervals calculated with bootstrapping. For each round of the permutation test, the metric value is calculated using two profiles obtained by randomly sampling the coverages used to compute the original profiles. The proportion of metric scores above the original score is used to calculate a p-value and determine if two profiles are significantly different (see S1 Text for more details). By enabling the use of a diversity of metrics, the statistical comparison can be tailored to fit custom needs.

To facilitate the identification of common patterns between two ChIP-Seq profiles, similaRpeak is proposed as a companion package to metagene. The similaRpeak package implements six pseudometrics specialized in pattern similarity detection (Fig 2C). The profile submitted to each pseudometric must respect certain editable criteria, specific to each pseudometric, to ensure that the pseudometric is only calculated in the presence of informative peaks and to limit the computation of extreme values. A description of each pseudometric is available in S2 Table.

Lastly, we developed Imetagene, a graphical user interface powered by Shiny [21]. This graphical interface was developed to facilitate the use of metagene without R programming experience. Taken together, this set of software can be used to quickly compare multiple region groups and to discover enrichment patterns that would otherwise be missed when looking at individual regions.
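The permutation test described above can be sketched as follows. This is an illustration under assumptions rather than the procedure of S1 Text: the regions of both groups are pooled and randomly reassigned at each round, the metric is a simple dissimilarity score (sum of absolute differences between mean profiles) standing in for a user-specified metric, and all names and data are hypothetical.

```python
import random


def mean_profile(regions):
    """Average the coverage of several regions position by position."""
    n = len(regions)
    return [sum(column) / n for column in zip(*regions)]


def abs_difference(profile_a, profile_b):
    """Stand-in dissimilarity metric: larger values mean more different."""
    return sum(abs(a - b) for a, b in zip(profile_a, profile_b))


def permutation_pvalue(group_a, group_b, metric=abs_difference,
                       n_perm=1000, seed=0):
    """Proportion of permuted scores at least as large as the observed score."""
    rng = random.Random(seed)
    observed = metric(mean_profile(group_a), mean_profile(group_b))
    pooled = list(group_a) + list(group_b)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)
        perm_a = pooled[:len(group_a)]
        perm_b = pooled[len(group_a):]
        if metric(mean_profile(perm_a), mean_profile(perm_b)) >= observed:
            hits += 1
    return hits / n_perm


# Two clearly different hypothetical groups of five regions each.
low_group = [[0, 0, 0] for _ in range(5)]
high_group = [[10, 10, 10] for _ in range(5)]
p_different = permutation_pvalue(low_group, high_group)

# Two identical groups: every permuted score ties the observed score.
p_same = permutation_pvalue(low_group, low_group)
print(p_different, p_same)
```

Any function taking two profiles and returning a score can be passed as `metric`, which mirrors the idea of tailoring the comparison to the profile feature of interest.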

Results

Proper spatiotemporal transcription requires the complex interplay of transcription factors, cofactors and chromatin regulators at noncoding regulatory regions [22, 23]. Indeed, enhancer and promoter regions recruit regulatory factors to modulate the recruitment, initiation, pause-release and elongation of the RNA polymerase II (Pol II) [24, 25]. During the transcriptional process, both enhancer and promoter regions are transcribed [26-28]. Here we use the metagene package to correlate the recruitment of regulatory factors at enhancer and promoter regions with their transcriptional output.

Data collection and metagene analyses

To define the contribution of transcription factors and cofactors to the transcriptional activity of promoters and enhancers, we gathered the publicly available data generated in GM12878 B-lymphocytes (106 available experiment datasets; 276 alignment files; see S1 Table). Promoter regions were obtained using the Bioconductor TxDb.Hsapiens.UCSC.hg19.knownGene package [16] and enhancers were downloaded from the FANTOM5 database [29]. Robust enhancer and promoter regions were defined as regions with at least one robust transcription start site (TSS) in the FANTOM5 database. Finally, the regions were stratified into four groups based on their cap analysis of gene expression (CAGE) levels [28]: “no expression”, “low expression”, “moderate expression” and “high expression” (see S1 Text).

Pol II and the general transcription factors levels correlate with transcriptional activity

To validate our transcriptional stratification of enhancers and promoters, we surveyed the occupancy of total Pol II and the general transcription factors (GTFs) as a function of the transcriptional activity [30-32]. As expected, transcriptional levels of enhancer and promoter regions correlated with recruitment of Pol II (Fig 3A and S2 and S3 Figs), TAF1 (Fig 3B), and TBP (S4 Fig). Histone marks associated with active enhancers (H3K27ac) and with active promoters (H3K4me3) showed a similar pattern (S5 Fig). The RATIO INTERSECT pseudometric, which calculates the ratio of the area under the intersection of two profiles to the total area, was used to compare the coverage between each group (S3 Table). The pseudometric value tends to 1 as the similarity between profiles increases. The statistical analyses confirmed that an increase in transcriptional activity correlates with an increase in the occupancy of the Pol II machinery (permutation p-value <0.001). In addition, the GTFs followed the same correlation with transcriptional activity. These results demonstrate that metagene and similaRpeak are able to distinguish patterns associated with different levels of transcriptional activity in a large number of samples by using robust metrics. Together, they offer a suitable tool to investigate the relationship between recruitment of regulatory factors and transcriptional activity.
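As an illustration of the RATIO INTERSECT idea, the sketch below computes the area shared by two profiles divided by the total area they cover. It assumes that "total area" means the area under the pointwise maximum of the two profiles (their union); the function name, the `min_total` guard against division by zero and the toy profiles are hypothetical and do not reproduce the similaRpeak code.

```python
def ratio_intersect(profile_a, profile_b, min_total=1e-12):
    """Area under the pointwise minimum divided by area under the maximum.

    Returns None when the total area is below `min_total`, so that a
    division by (almost) zero never produces an extreme value.
    """
    if len(profile_a) != len(profile_b):
        raise ValueError("profiles must have the same length")
    intersection = sum(min(a, b) for a, b in zip(profile_a, profile_b))
    total = sum(max(a, b) for a, b in zip(profile_a, profile_b))
    if total < min_total:
        return None
    return intersection / total


# Two hypothetical profiles over five positions.
moderate = [0, 2, 4, 2, 0]
high = [0, 1, 4, 3, 0]

score = ratio_intersect(moderate, high)
identical = ratio_intersect(high, high)
print(score, identical)
```

Under this definition, identical profiles score exactly 1 and profiles with no overlap score 0, which matches the statement that the value tends to 1 as similarity increases.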

Fig 3

Metagene profiles in enhancer and promoter regions.

(A) POLR2A, the largest subunit of Pol II. (B) TAF1, a general transcription factor. (C) ELF1, a transcription factor. The x-axis is centered on enhancers and promoters ±1000bp. The y-axis represents the mean occupancy normalized in reads per million (RPM). Each line represents the mean occupancy of the factor replicates. Groups of transcriptional activity of enhancers or promoters are identified by different colors (red = no CAGE signal; green = low CAGE signal; blue = moderate CAGE signal; purple = high CAGE signal). The ribbons represent the 95% confidence interval of the mean calculated using 1000 bootstraps.

Differential recruitment of regulatory factors at promoter and enhancer regions

While the activities of Pol II and the GTFs are directly linked to the transcriptional output, the importance of each individual regulatory factor for the transcription process is not well understood. To assess the quantitative recruitment of transcription factors, cofactors and chromatin regulators at cis-regulatory elements as a function of the transcriptional activity, we evaluated the occupancy of regulatory factors, histone modifications and DNase hypersensitive sites in GM12878 cells. Interestingly, we observed two distinct recruitment patterns at promoter and enhancer regions. A “gradient effect” was observed when the occupancy level of a factor correlated with the transcriptional activity (Fig 3A and 3B), while a “threshold effect” refers to factors reaching a plateau in their occupancy prior to maximal transcriptional activity (Fig 3C). We defined a “threshold effect” as a ratio between the intersection area and the total area of the two profiles (RATIO INTERSECT) greater than or equal to 0.85 between the high and moderate CAGE signal groups. Overall, 44.6% of factors showed a “threshold effect” at enhancer regions while only 19.8% did so at promoter regions (S6 Fig; p-value = 0.0048, Welch’s two-sample t-test). For example, the levels of the transcription factor ELF1 correlated with the transcriptional activity at promoter regions (RATIO INTERSECT = 0.66), but not at enhancer regions (RATIO INTERSECT = 0.88) (Fig 3C). A total of 35 regulatory factors, including IRF3 and IRF4 (involved in interleukin regulation [33, 34]) and cofactors like SMC3 and EP300 (S7 and S8 Figs), were identified with a similar dichotomy (see S3 Table for a complete list). These results highlight a differential requirement of regulatory factors at enhancer and promoter regions in relation to transcriptional activity.
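The classification rule stated above reduces to a single comparison against the 0.85 cutoff. The short sketch below applies it to the two ELF1 values quoted in the text; the function and variable names are hypothetical.

```python
THRESHOLD_CUTOFF = 0.85  # RATIO INTERSECT cutoff stated in the text


def classify_effect(ratio):
    """Label a moderate-vs-high RATIO INTERSECT score."""
    return "threshold" if ratio >= THRESHOLD_CUTOFF else "gradient"


# ELF1 scores reported in the text for each type of region.
elf1_scores = {"promoter": 0.66, "enhancer": 0.88}
elf1_classes = {region: classify_effect(score)
                for region, score in elf1_scores.items()}
print(elf1_classes)
```

A score close to 1 means the moderate and high CAGE groups have nearly identical occupancy, which is why a high score is read as a plateau.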

Threshold versus gradient effects

Differential recruitment of regulatory factors at promoter and enhancer regions raises mechanistic questions. We propose different models to explain the “gradient” and “threshold” effects. For the “threshold effect”, mostly observed at enhancer regions, the regulatory factors are potentially working as “on/off” switches. In that model, once a predetermined level is achieved for a specific transcription factor or cofactor, the transcriptional contribution is maximized (Fig 3B, S7 and S8 Figs and S3 Table). Extrapolation of this model suggests that an accumulation of different regulatory factors is required to achieve maximal transcriptional output at enhancer regions. This idea is corroborated by observations of dozens of transcription factors at enhancer regions in mammalian cells [35]. For the “gradient effect”, mostly observed at promoter regions, we are considering two models: i) the regulatory factor directly contributes to Pol II transcriptional activity or ii) the “gradient effect” corresponds to the signal accumulation of multiple enhancers connecting to a promoter region through long distance interactions. These models are not mutually exclusive, but the latter is supported by evidence of an average of 4.9 enhancers connecting per promoter [28] in addition to a positive correlation between the number of connections and the transcriptional output [36]. Taken together, our results establish different recruitment patterns of regulatory factors at enhancers and promoters.

Other applications of metagene

In addition to the current study, the metagene package can be used for multiple applications. For instance, the metagene package is suitable to study differential recruitment in different classes of regulatory elements: enhancer and promoter regions could be stratified by functional types, such as chromatin states [37], instead of expression levels. The enrichment patterns of a transcription factor following drug treatment or an infection could also be analyzed with metagene to provide molecular insights into the mechanism of action. Additionally, the dynamics of transcription factor recruitment could be studied using time course datasets. Future studies will reveal new details on the mechanisms of recruitment of regulatory factors and will help in understanding the similarities and dissimilarities between the various classes of regulatory elements.

Availability and Future Directions

The metagene package, the graphical interface Imetagene, and the companion package similaRpeak are available on Bioconductor with documentation and an example dataset. These packages perform a thorough evaluation of the similarities or dissimilarities of the aggregated signal of region groups. In the current version, the region groups are based on annotations in order to test specific scientific hypotheses. Next, we will refine the bootstrapping strategy and implement clustering algorithms (as part of a machine learning strategy) to cluster regions based directly on their occupancy patterns, which will provide an exploratory approach.

Supporting Information

S1 Fig. Imetagene interactive heatmap representation.

After the matrices are computed, the Imetagene package can be used to explore the matrix associated with each experiment to visualize the coverages of the regions. (PDF)

S2 Fig. Metagene plots of RNA Pol II phosphorylated at serine 2 (POLR2AphosphoS2) in promoters and enhancers.

The x-axis is centered on enhancers and promoters ±1000bp. The y-axis represents the mean occupancy normalized in reads per million (RPM). Each line represents the mean occupancy of POLR2AphosphoS2. Groups of transcriptional activity of enhancers or promoters are identified by different colors (red = no CAGE signal; green = low CAGE signal; blue = moderate CAGE signal; purple = high CAGE signal; see S1 Text). Ribbons represent the 95% confidence interval of the mean calculated using 1000 bootstraps. (PDF)

S3 Fig. Metagene plots of RNA Pol II phosphorylated at serine 5 (POLR2AphosphoS5) in promoters and enhancers.

The x-axis is centered on enhancers and promoters ±1000bp. The y-axis represents the mean occupancy normalized in reads per million (RPM). Each line represents the mean occupancy of POLR2AphosphoS5. Groups of transcriptional activity of enhancers or promoters are identified by different colors (red = no CAGE signal; green = low CAGE signal; blue = moderate CAGE signal; purple = high CAGE signal; see S1 Text). Ribbons represent the 95% confidence interval of the mean calculated using 1000 bootstraps. (PDF)

S4 Fig. Metagene plots of the general transcription factor TBP at promoters and enhancers.

The x-axis is centered on enhancers and promoters ±1000bp. The y-axis represents the mean occupancy normalized in reads per million (RPM). Each line represents the mean occupancy of TBP. Groups of transcriptional activity of enhancers or promoters are identified by different colors (red = no CAGE signal; green = low CAGE signal; blue = moderate CAGE signal; purple = high CAGE signal). The ribbons represent the 95% confidence interval of the mean calculated using 1000 bootstraps. (PDF)

S5 Fig. Metagene plots of H3K27ac at enhancers and H3K4me3 at promoters.

The x-axis is centered on enhancers and promoters ±1000bp. The y-axis represents the mean occupancy normalized in reads per million (RPM). Each line represents the mean occupancy of the histone mark. Groups of transcriptional activity of enhancers or promoters are identified by different colors (red = no CAGE signal; green = low CAGE signal; blue = moderate CAGE signal; purple = high CAGE signal). The ribbons represent the 95% confidence interval of the mean calculated using 1000 bootstraps. (PDF)

S6 Fig. Boxplot of RATIO INTERSECT values for 106 experiments in GM12878.

The RATIO INTERSECT was calculated using the moderate CAGE signal and high CAGE signal groups. (PDF)

S7 Fig. Metagene plots of the cofactor SMC3 at promoters and enhancers.

The x-axis is centered on enhancers and promoters ±1000bp. The y-axis represents the mean occupancy normalized in reads per million (RPM). Each line represents the mean occupancy of SMC3. Groups of transcriptional activity of enhancers or promoters are identified by different colors (red = no CAGE signal; green = low CAGE signal; blue = moderate CAGE signal; purple = high CAGE signal). The ribbons represent the 95% confidence interval of the mean calculated using 1000 bootstraps. (PDF)

S8 Fig. Metagene plots of the cofactor EP300 at promoters and enhancers.

The x-axis is centered on enhancers and promoters ±1000bp. The y-axis represents the mean occupancy normalized in reads per million (RPM). Each line represents the mean occupancy of EP300. Groups of transcriptional activity of enhancers or promoters are identified by different colors (red = no CAGE signal; green = low CAGE signal; blue = moderate CAGE signal; purple = high CAGE signal). The ribbons represent the 95% confidence interval of the mean calculated using 1000 bootstraps. (PDF)

S1 Table. Description of the 276 BAM files used in this article.

Experiment accession: unique identifier of the experiment. File accession: unique identifier of the file. Target: the name of the factor that was targeted for immunoprecipitation. Controls: the experiment accession of the recommended controls. Biosample name: the cell type. Assembly: the version of the genome used for the alignment. Href: the URL to download the file. (CSV)

S2 Table. Description of similaRpeak’s pseudometrics.

Pseudometric: the name of the pseudometric. Definition: the description of the metric. Threshold: criteria that can be set by the user to avoid calculating the value of a pseudometric that would return nonsensical results (division by zero, etc.). (XLSX)

S3 Table. Classification of GM12878 factors.

The classification of the 106 regulatory factors as “gradient” or “threshold”. Target: the name of the target. Type: enhancer or promoter. RATIO_INTERSECT: the RATIO_INTERSECT score calculated using the moderate and high CAGE signal groups. Class: “gradient” or “threshold”. (CSV)

S1 Text. Data collection: Details of the data collection procedure. Bootstrap: Description of the bootstrapping steps. Permutation: Details of the permutation procedure in metagene and similaRpeak. (PDF)

34 in total

1. ChIP-Seq: technical considerations for obtaining high-quality data.

Authors: Benjamin L Kidder; Gangqing Hu; Keji Zhao
Journal: Nat Immunol Date: 2011-09-20 Impact factor: 25.606

Review 2. The Hierarchy of Transcriptional Activation: From Enhancer to Promoter.

Authors: Douglas Vernimmen; Wendy A Bickmore
Journal: Trends Genet Date: 2015-12 Impact factor: 11.639

3. CEAS: cis-regulatory element annotation system.

Authors: Hyunjin Shin; Tao Liu; Arjun K Manrai; X Shirley Liu
Journal: Bioinformatics Date: 2009-08-18 Impact factor: 6.937

Review 4. Structural basis of transcription initiation by RNA polymerase II.

Authors: Sarah Sainsbury; Carrie Bernecky; Patrick Cramer
Journal: Nat Rev Mol Cell Biol Date: 2015-02-18 Impact factor: 94.444

Review 5. Enhancer function: new insights into the regulation of tissue-specific gene expression.

Authors: Chin-Tong Ong; Victor G Corces
Journal: Nat Rev Genet Date: 2011-03-01 Impact factor: 53.242

Review 6. Enhancer function: mechanistic and genome-wide insights come together.

Authors: Jennifer L Plank; Ann Dean
Journal: Mol Cell Date: 2014-07-03 Impact factor: 17.970

7. Genome-wide mapping of in vivo protein-DNA interactions.

Authors: David S Johnson; Ali Mortazavi; Richard M Myers; Barbara Wold
Journal: Science Date: 2007-05-31 Impact factor: 47.728

8. Ubiquitous heterogeneity and asymmetry of the chromatin environment at regulatory elements.

Authors: Anshul Kundaje; Sofia Kyriazopoulou-Panagiotopoulou; Max Libbrecht; Cheryl L Smith; Debasish Raha; Elliott E Winters; Steven M Johnson; Michael Snyder; Serafim Batzoglou; Arend Sidow
Journal: Genome Res Date: 2012-09 Impact factor: 9.043

9. The pluripotent regulatory circuitry connecting promoters to their long-range interacting elements.

Authors: Stefan Schoenfelder; Mayra Furlan-Magaril; Borbala Mifsud; Filipe Tavares-Cadete; Robert Sugar; Biola-Maria Javierre; Takashi Nagano; Yulia Katsman; Moorthy Sakthidevi; Steven W Wingett; Emilia Dimitrova; Andrew Dimond; Lucas B Edelman; Sarah Elderkin; Kristina Tabbada; Elodie Darbo; Simon Andrews; Bram Herman; Andy Higgs; Emily LeProust; Cameron S Osborne; Jennifer A Mitchell; Nicholas M Luscombe; Peter Fraser
Journal: Genome Res Date: 2015-03-09 Impact factor: 9.043

10. Integrative analysis of 111 reference human epigenomes.

Authors: Anshul Kundaje; Wouter Meuleman; Jason Ernst; Misha Bilenky; Angela Yen; Alireza Heravi-Moussavi; Pouya Kheradpour; Zhizhuo Zhang; Jianrong Wang; Michael J Ziller; Viren Amin; John W Whitaker; Matthew D Schultz; Lucas D Ward; Abhishek Sarkar; Gerald Quon; Richard S Sandstrom; Matthew L Eaton; Yi-Chieh Wu; Andreas R Pfenning; Xinchen Wang; Melina Claussnitzer; Yaping Liu; Cristian Coarfa; R Alan Harris; Noam Shoresh; Charles B Epstein; Elizabeta Gjoneska; Danny Leung; Wei Xie; R David Hawkins; Ryan Lister; Chibo Hong; Philippe Gascard; Andrew J Mungall; Richard Moore; Eric Chuah; Angela Tam; Theresa K Canfield; R Scott Hansen; Rajinder Kaul; Peter J Sabo; Mukul S Bansal; Annaick Carles; Jesse R Dixon; Kai-How Farh; Soheil Feizi; Rosa Karlic; Ah-Ram Kim; Ashwinikumar Kulkarni; Daofeng Li; Rebecca Lowdon; GiNell Elliott; Tim R Mercer; Shane J Neph; Vitor Onuchic; Paz Polak; Nisha Rajagopal; Pradipta Ray; Richard C Sallari; Kyle T Siebenthall; Nicholas A Sinnott-Armstrong; Michael Stevens; Robert E Thurman; Jie Wu; Bo Zhang; Xin Zhou; Arthur E Beaudet; Laurie A Boyer; Philip L De Jager; Peggy J Farnham; Susan J Fisher; David Haussler; Steven J M Jones; Wei Li; Marco A Marra; Michael T McManus; Shamil Sunyaev; James A Thomson; Thea D Tlsty; Li-Huei Tsai; Wei Wang; Robert A Waterland; Michael Q Zhang; Lisa H Chadwick; Bradley E Bernstein; Joseph F Costello; Joseph R Ecker; Martin Hirst; Alexander Meissner; Aleksandar Milosavljevic; Bing Ren; John A Stamatoyannopoulos; Ting Wang; Manolis Kellis
Journal: Nature Date: 2015-02-19 Impact factor: 69.504

7 in total

1. The ChIP-Seq tools and web server: a resource for analyzing ChIP-seq and other types of genomic data.

Authors: Giovanna Ambrosini; René Dreos; Sunil Kumar; Philipp Bucher
Journal: BMC Genomics Date: 2016-11-18 Impact factor: 3.969

2. Association of breast cancer risk with genetic variants showing differential allelic expression: Identification of a novel breast cancer susceptibility locus at 4q21.

Authors: Yosr Hamdi; Penny Soucy; Véronique Adoue; Kyriaki Michailidou; Sander Canisius; Audrey Lemaçon; Arnaud Droit; Irene L Andrulis; Hoda Anton-Culver; Volker Arndt; Caroline Baynes; Carl Blomqvist; Natalia V Bogdanova; Stig E Bojesen; Manjeet K Bolla; Bernardo Bonanni; Anne-Lise Borresen-Dale; Judith S Brand; Hiltrud Brauch; Hermann Brenner; Annegien Broeks; Barbara Burwinkel; Jenny Chang-Claude; Fergus J Couch; Angela Cox; Simon S Cross; Kamila Czene; Hatef Darabi; Joe Dennis; Peter Devilee; Thilo Dörk; Isabel Dos-Santos-Silva; Mikael Eriksson; Peter A Fasching; Jonine Figueroa; Henrik Flyger; Montserrat García-Closas; Graham G Giles; Mark S Goldberg; Anna González-Neira; Grethe Grenaker-Alnæs; Pascal Guénel; Lothar Haeberle; Christopher A Haiman; Ute Hamann; Emily Hallberg; Maartje J Hooning; John L Hopper; Anna Jakubowska; Michael Jones; Maria Kabisch; Vesa Kataja; Diether Lambrechts; Loic Le Marchand; Annika Lindblom; Jan Lubinski; Arto Mannermaa; Mel Maranian; Sara Margolin; Frederik Marme; Roger L Milne; Susan L Neuhausen; Heli Nevanlinna; Patrick Neven; Curtis Olswold; Julian Peto; Dijana Plaseska-Karanfilska; Katri Pylkäs; Paolo Radice; Anja Rudolph; Elinor J Sawyer; Marjanka K Schmidt; Xiao-Ou Shu; Melissa C Southey; Anthony Swerdlow; Rob A E M Tollenaar; Ian Tomlinson; Diana Torres; Thérèse Truong; Celine Vachon; Ans M W Van Den Ouweland; Qin Wang; Robert Winqvist; Wei Zheng; Javier Benitez; Georgia Chenevix-Trench; Alison M Dunning; Paul D P Pharoah; Vessela Kristensen; Per Hall; Douglas F Easton; Tomi Pastinen; Silje Nord; Jacques Simard
Journal: Oncotarget Date: 2016-12-06

3. Gut Microbiota Has a Widespread and Modifiable Effect on Host Gene Regulation.

Authors: Allison L Richards; Amanda L Muehlbauer; Adnan Alazizi; Michael B Burns; Anthony Findley; Francesco Messina; Trevor J Gould; Camilla Cascardo; Roger Pique-Regi; Ran Blekhman; Francesca Luca
Journal: mSystems Date: 2019-09-03 Impact factor: 6.496

4. GFI1 tethers the NuRD complex to open and transcriptionally active chromatin in myeloid progenitors.

Authors: Anne Helness; Jennifer Fraszczak; Charles Joly-Beauparlant; Halil Bagci; Christian Trahan; Kaifee Arman; Peiman Shooshtarizadeh; Riyan Chen; Marina Ayoub; Jean-François Côté; Marlene Oeffinger; Arnaud Droit; Tarik Möröy
Journal: Commun Biol Date: 2021-12-02

5. Association of breast cancer risk in BRCA1 and BRCA2 mutation carriers with genetic variants showing differential allelic expression: identification of a modifier of breast cancer risk at locus 11q22.3.

Authors: Yosr Hamdi; Penny Soucy; Karoline B Kuchenbaeker; Tomi Pastinen; Arnaud Droit; Audrey Lemaçon; Julian Adlard; Kristiina Aittomäki; Irene L Andrulis; Adalgeir Arason; Norbert Arnold; Banu K Arun; Jacopo Azzollini; Anita Bane; Laure Barjhoux; Daniel Barrowdale; Javier Benitez; Pascaline Berthet; Marinus J Blok; Kristie Bobolis; Valérie Bonadona; Bernardo Bonanni; Angela R Bradbury; Carole Brewer; Bruno Buecher; Saundra S Buys; Maria A Caligo; Jocelyne Chiquette; Wendy K Chung; Kathleen B M Claes; Mary B Daly; Francesca Damiola; Rosemarie Davidson; Miguel De la Hoya; Kim De Leeneer; Orland Diez; Yuan Chun Ding; Riccardo Dolcetti; Susan M Domchek; Cecilia M Dorfling; Diana Eccles; Ros Eeles; Zakaria Einbeigi; Bent Ejlertsen; Christoph Engel; D Gareth Evans; Lidia Feliubadalo; Lenka Foretova; Florentia Fostira; William D Foulkes; George Fountzilas; Eitan Friedman; Debra Frost; Pamela Ganschow; Patricia A Ganz; Judy Garber; Simon A Gayther; Anne-Marie Gerdes; Gord Glendon; Andrew K Godwin; David E Goldgar; Mark H Greene; Jacek Gronwald; Eric Hahnen; Ute Hamann; Thomas V O Hansen; Steven Hart; John L Hays; Frans B L Hogervorst; Peter J Hulick; Evgeny N Imyanitov; Claudine Isaacs; Louise Izatt; Anna Jakubowska; Paul James; Ramunas Janavicius; Uffe Birk Jensen; Esther M John; Vijai Joseph; Walter Just; Katarzyna Kaczmarek; Beth Y Karlan; Carolien M Kets; Judy Kirk; Mieke Kriege; Yael Laitman; Maïté Laurent; Conxi Lazaro; Goska Leslie; Jenny Lester; Fabienne Lesueur; Annelie Liljegren; Niklas Loman; Jennifer T Loud; Siranoush Manoukian; Milena Mariani; Sylvie Mazoyer; Lesley McGuffog; Hanne E J Meijers-Heijboer; Alfons Meindl; Austin Miller; Marco Montagna; Anna Marie Mulligan; Katherine L Nathanson; Susan L Neuhausen; Heli Nevanlinna; Robert L Nussbaum; Edith Olah; Olufunmilayo I Olopade; Kai-Ren Ong; Jan C Oosterwijk; Ana Osorio; Laura Papi; Sue Kyung Park; Inge Sokilde Pedersen; Bernard Peissel; Pedro Perez Segura; Paolo Peterlongo; Catherine M Phelan; Paolo Radice; Johanna Rantala; Christine Rappaport-Fuerhauser; Gad Rennert; Andrea Richardson; Mark Robson; Gustavo C Rodriguez; Matti A Rookus; Rita Katharina Schmutzler; Nicolas Sevenet; Payal D Shah; Christian F Singer; Thomas P Slavin; Katie Snape; Johanna Sokolowska; Ida Marie Heeholm Sønderstrup; Melissa Southey; Amanda B Spurdle; Zsofia Stadler; Dominique Stoppa-Lyonnet; Grzegorz Sukiennicki; Christian Sutter; Yen Tan; Muy-Kheng Tea; Manuel R Teixeira; Alex Teulé; Soo-Hwang Teo; Mary Beth Terry; Mads Thomassen; Laima Tihomirova; Marc Tischkowitz; Silvia Tognazzo; Amanda Ewart Toland; Nadine Tung; Ans M W van den Ouweland; Rob B van der Luijt; Klaartje van Engelen; Elizabeth J van Rensburg; Raymonda Varon-Mateeva; Barbara Wappenschmidt; Juul T Wijnen; Timothy Rebbeck; Georgia Chenevix-Trench; Kenneth Offit; Fergus J Couch; Silje Nord; Douglas F Easton; Antonis C Antoniou; Jacques Simard
Journal: Breast Cancer Res Treat Date: 2016-10-28 Impact factor: 4.624

6. Association analysis identifies 65 new breast cancer risk loci.

Authors: Kyriaki Michailidou; Sara Lindström; Joe Dennis; Jonathan Beesley; Shirley Hui; Siddhartha Kar; Audrey Lemaçon; Penny Soucy; Dylan Glubb; Asha Rostamianfar; Manjeet K Bolla; Qin Wang; Jonathan Tyrer; Ed Dicks; Andrew Lee; Zhaoming Wang; Jamie Allen; Renske Keeman; Ursula Eilber; Juliet D French; Xiao Qing Chen; Laura Fachal; Karen McCue; Amy E McCart Reed; Maya Ghoussaini; Jason S Carroll; Xia Jiang; Hilary Finucane; Marcia Adams; Muriel A Adank; Habibul Ahsan; Kristiina Aittomäki; Hoda Anton-Culver; Natalia N Antonenkova; Volker Arndt; Kristan J Aronson; Banu Arun; Paul L Auer; François Bacot; Myrto Barrdahl; Caroline Baynes; Matthias W Beckmann; Sabine Behrens; Javier Benitez; Marina Bermisheva; Leslie Bernstein; Carl Blomqvist; Natalia V Bogdanova; Stig E Bojesen; Bernardo Bonanni; Anne-Lise Børresen-Dale; Judith S Brand; Hiltrud Brauch; Paul Brennan; Hermann Brenner; Louise Brinton; Per Broberg; Ian W Brock; Annegien Broeks; Angela Brooks-Wilson; Sara Y Brucker; Thomas Brüning; Barbara Burwinkel; Katja Butterbach; Qiuyin Cai; Hui Cai; Trinidad Caldés; Federico Canzian; Angel Carracedo; Brian D Carter; Jose E Castelao; Tsun L Chan; Ting-Yuan David Cheng; Kee Seng Chia; Ji-Yeob Choi; Hans Christiansen; Christine L Clarke; Margriet Collée; Don M Conroy; Emilie Cordina-Duverger; Sten Cornelissen; David G Cox; Angela Cox; Simon S Cross; Julie M Cunningham; Kamila Czene; Mary B Daly; Peter Devilee; Kimberly F Doheny; Thilo Dörk; Isabel Dos-Santos-Silva; Martine Dumont; Lorraine Durcan; Miriam Dwek; Diana M Eccles; Arif B Ekici; A Heather Eliassen; Carolina Ellberg; Mingajeva Elvira; Christoph Engel; Mikael Eriksson; Peter A Fasching; Jonine Figueroa; Dieter Flesch-Janys; Olivia Fletcher; Henrik Flyger; Lin Fritschi; Valerie Gaborieau; Marike Gabrielson; Manuela Gago-Dominguez; Yu-Tang Gao; Susan M Gapstur; José A García-Sáenz; Mia M Gaudet; Vassilios Georgoulias; Graham G Giles; Gord Glendon; Mark S Goldberg; David E Goldgar; Anna González-Neira; Grethe I Grenaker Alnæs; Mervi Grip; Jacek Gronwald; Anne Grundy; Pascal Guénel; Lothar Haeberle; Eric Hahnen; Christopher A Haiman; Niclas Håkansson; Ute Hamann; Nathalie Hamel; Susan Hankinson; Patricia Harrington; Steven N Hart; Jaana M Hartikainen; Mikael Hartman; Alexander Hein; Jane Heyworth; Belynda Hicks; Peter Hillemanns; Dona N Ho; Antoinette Hollestelle; Maartje J Hooning; Robert N Hoover; John L Hopper; Ming-Feng Hou; Chia-Ni Hsiung; Guanmengqian Huang; Keith Humphreys; Junko Ishiguro; Hidemi Ito; Motoki Iwasaki; Hiroji Iwata; Anna Jakubowska; Wolfgang Janni; Esther M John; Nichola Johnson; Kristine Jones; Michael Jones; Arja Jukkola-Vuorinen; Rudolf Kaaks; Maria Kabisch; Katarzyna Kaczmarek; Daehee Kang; Yoshio Kasuga; Michael J Kerin; Sofia Khan; Elza Khusnutdinova; Johanna I Kiiski; Sung-Won Kim; Julia A Knight; Veli-Matti Kosma; Vessela N Kristensen; Ute Krüger; Ava Kwong; Diether Lambrechts; Loic Le Marchand; Eunjung Lee; Min Hyuk Lee; Jong Won Lee; Chuen Neng Lee; Flavio Lejbkowicz; Jingmei Li; Jenna Lilyquist; Annika Lindblom; Jolanta Lissowska; Wing-Yee Lo; Sibylle Loibl; Jirong Long; Artitaya Lophatananon; Jan Lubinski; Craig Luccarini; Michael P Lux; Edmond S K Ma; Robert J MacInnis; Tom Maishman; Enes Makalic; Kathleen E Malone; Ivana Maleva Kostovska; Arto Mannermaa; Siranoush Manoukian; JoAnn E Manson; Sara Margolin; Shivaani Mariapun; Maria Elena Martinez; Keitaro Matsuo; Dimitrios Mavroudis; James McKay; Catriona McLean; Hanne Meijers-Heijboer; Alfons Meindl; Primitiva Menéndez; Usha Menon; Jeffery Meyer; Hui Miao; Nicola Miller; Nur Aishah Mohd Taib; Kenneth Muir; Anna Marie Mulligan; Claire Mulot; Susan L Neuhausen; Heli Nevanlinna; Patrick Neven; Sune F Nielsen; Dong-Young Noh; Børge G Nordestgaard; Aaron Norman; Olufunmilayo I Olopade; Janet E Olson; Håkan Olsson; Curtis Olswold; Nick Orr; V Shane Pankratz; Sue K Park; Tjoung-Won Park-Simon; Rachel Lloyd; Jose I A Perez; Paolo Peterlongo; Julian Peto; Kelly-Anne Phillips; Mila Pinchev; Dijana Plaseska-Karanfilska; Ross Prentice; Nadege Presneau; Darya Prokofyeva; Elizabeth Pugh; Katri Pylkäs; Brigitte Rack; Paolo Radice; Nazneen Rahman; Gadi Rennert; Hedy S Rennert; Valerie Rhenius; Atocha Romero; Jane Romm; Kathryn J Ruddy; Thomas Rüdiger; Anja Rudolph; Matthias Ruebner; Emiel J T Rutgers; Emmanouil Saloustros; Dale P Sandler; Suleeporn Sangrajrang; Elinor J Sawyer; Daniel F Schmidt; Rita K Schmutzler; Andreas Schneeweiss; Minouk J Schoemaker; Fredrick Schumacher; Peter Schürmann; Rodney J Scott; Christopher Scott; Sheila Seal; Caroline Seynaeve; Mitul Shah; Priyanka Sharma; Chen-Yang Shen; Grace Sheng; Mark E Sherman; Martha J Shrubsole; Xiao-Ou Shu; Ann Smeets; Christof Sohn; Melissa C Southey; John J Spinelli; Christa Stegmaier; Sarah Stewart-Brown; Jennifer Stone; Daniel O Stram; Harald Surowy; Anthony Swerdlow; Rulla Tamimi; Jack A Taylor; Maria Tengström; Soo H Teo; Mary Beth Terry; Daniel C Tessier; Somchai Thanasitthichai; Kathrin Thöne; Rob A E M Tollenaar; Ian Tomlinson; Ling Tong; Diana Torres; Thérèse Truong; Chiu-Chen Tseng; Shoichiro Tsugane; Hans-Ulrich Ulmer; Giske Ursin; Michael Untch; Celine Vachon; Christi J van Asperen; David Van Den Berg; Ans M W van den Ouweland; Lizet van der Kolk; Rob B van der Luijt; Daniel Vincent; Jason Vollenweider; Quinten Waisfisz; Shan Wang-Gohrke; Clarice R Weinberg; Camilla Wendt; Alice S Whittemore; Hans Wildiers; Walter Willett; Robert Winqvist; Alicja Wolk; Anna H Wu; Lucy Xia; Taiki Yamaji; Xiaohong R Yang; Cheng Har Yip; Keun-Young Yoo; Jyh-Cherng Yu; Wei Zheng; Ying Zheng; Bin Zhu; Argyrios Ziogas; Elad Ziv; Sunil R Lakhani; Antonis C Antoniou; Arnaud Droit; Irene L Andrulis; Christopher I Amos; Fergus J Couch; Paul D P Pharoah; Jenny Chang-Claude; Per Hall; David J Hunter; Roger L Milne; Montserrat García-Closas; Marjanka K Schmidt; Stephen J Chanock; Alison M Dunning; Stacey L Edwards; Gary D Bader; Georgia Chenevix-Trench; Jacques Simard; Peter Kraft; Douglas F Easton
Journal: Nature Date: 2017-10-23 Impact factor: 49.962

7. SOX10-regulated promoter use defines isoform-specific gene expression in Schwann cells.

Authors: Elizabeth A Fogarty; Jacob O Kitzman; Anthony Antonellis
Journal: BMC Genomics Date: 2020-08-08 Impact factor: 3.969
