RSAT 2015: Regulatory Sequence Analysis Tools.

Alejandra Medina-Rivera¹, Matthieu Defrance², Olivier Sand³, Carl Herrmann⁴, Jaime A Castro-Mondragon⁵, Jeremy Delerce⁵, Sébastien Jaeger⁶, Christophe Blanchet⁷, Pierre Vincens⁸, Christophe Caron⁹, Daniel M Staines¹⁰, Bruno Contreras-Moreira¹¹, Marie Artufel⁵, Lucie Charbonnier-Khamvongsa⁵, Céline Hernandez⁸, Denis Thieffry⁸, Morgane Thomas-Chollier¹², Jacques van Helden¹³.

Abstract

RSAT (Regulatory Sequence Analysis Tools) is a modular software suite for the analysis of cis-regulatory elements in genome sequences. Its main applications are (i) motif discovery, suited to genome-wide data sets such as ChIP-seq, (ii) transcription factor binding motif analysis (quality assessment, comparisons and clustering), (iii) comparative genomics and (iv) analysis of regulatory variations. Nine new programs have been added to the 43 described in the 2011 NAR Web Software Issue, including a tool to extract sequences from a list of coordinates (fetch-sequences from UCSC), novel programs dedicated to the analysis of regulatory variants from GWAS or population genomics (retrieve-variation-seq and variation-scan), and a program to cluster motifs and visualize the similarities as trees (matrix-clustering). To deal with the drastic increase of sequenced genomes, RSAT public sites have been reorganized into taxon-specific servers. The suite is well documented with tutorials and published protocols. The software suite is available through Web sites, SOAP/WSDL Web services, virtual machines and stand-alone programs at http://www.rsat.eu/.

Year: 2015 PMID: 25904632 PMCID: PMC4489296 DOI: 10.1093/nar/gkv362

Source DB: PubMed Journal: Nucleic Acids Res ISSN: 0305-1048 Impact factor: 16.971

INTRODUCTION

The Regulatory Sequence Analysis Tools (RSAT) is a software suite integrating a wide variety of programs to analyse cis-regulatory elements in genomic sequences. Since its initial development in 1998 (1,2), RSAT has provided uninterrupted service and has broadened its applications (Figure 1), following advances in the field of regulatory genomics. The suite is organized in a modular way: programs can be accessed individually or interconnected into pipelines to perform more complex analyses. The Web interface combines 52 tools that support distinct types of analyses (Table 1): obtaining sequences, discovering motifs ab initio, scanning sequences to predict transcription factor (TF) binding sites, comparing and clustering motifs, analyzing conservation and divergence of TF binding sites, detecting inter-individual regulatory variations and building control sets based on a wide variety of probabilistic models. Altogether, the RSAT Web site includes nine novel programs (tagged with asterisks in Table 1) in addition to the 43 tools described in previous NAR Web software issues (3–5). In this article, we summarize the main functionalities and novelties of the toolbox, describe the supporting teaching and training facilities and explain its various modes of access.

Figure 1.

Overview of the main applications of RSAT.

Table 1.

Selection of some tools available on RSAT Web servers

Application field	Program name	Input	Output	Description
Obtaining sequences (Sequence Tools)	retrieve-seq	Gene names	Sequences	Given a set of gene names, returns upstream, downstream (relative to ORF start) or unspliced ORF sequences. Segments overlapping an upstream ORF can be excluded or included.
	* fetch-sequences (from UCSC)	Genomic coordinates	Sequences	From a set of genomic coordinates (BED file), collects the sequences from the UCSC genome browser.
	retrieve-ensembl-seq	Gene names	Sequences	Returns upstream, downstream, intronic, exonic, UTR, mRNA or CDS for a list of genes from Ensembl. Multi-genome queries enable automatic retrieval of sequences for gene orthologues.
	* retrieve-variation-seq	Identifier of variations	Sequences of the variants	Given a set of IDs for genetic variations, returns the corresponding variants and their flanking sequences. The output file can be scanned with the tool ‘variation-scan’.
Motif discovery	oligo-analysis	Sequences	Over/under-represented oligonucleotides + PSSM	Analyses oligonucleotide occurrences in a set of sequences and detects over/under-represented oligonucleotides, using various background models and scoring statistics.
	dyad-analysis	Sequences	Over/under-represented dyads + PSSM	Detects over-represented dyads (spaced pairs of oligonucleotides) within a set of sequences.
NGS ChIP-seq	peak-motifs	Sequences	Discovered motifs + predicted sites	Discovers motifs in ChIP-seq peak sequence sets and returns detailed information on sequence composition and discovered motifs, with correspondences in databases and predicted binding sites.
Pattern matching	* crer-scan	Transcription factor binding sites	Cis-regulatory enriched regions (CRER)	Given a set of cis-regulatory elements (predicted sites, annotated sites, ChIP-seq peaks), detects regions presenting a significant enrichment in CRERs.
	matrix-scan (-quick)	Sequences + PSSMs	Matching positions in input sequences	Scans sequences with one or several PSSMs to identify instances of the corresponding motifs (putative sites). Supports a variety of background models (Bernoulli, Markov chains of any order).
	* variation-scan	Variant sequences	Regulatory variants	Scans variant sequences with PSSMs and reports variations that affect the binding score, in order to predict regulatory variants.
	dna-pattern	Sequences + patterns	Matching positions in input sequences	String-based pattern matching program specialized for DNA sequences. Supports IUPAC code for partially specified nucleotides, regular expressions and simultaneous searches for multiple patterns.
Motif quality and comparisons (Matrix Tools)	matrix-quality	Motif (PSSM) + sequence set(s)	Score distribution statistics + ROC curves	Evaluates the quality of a PSSM by comparing score distributions obtained with this matrix in control sequence sets.
	compare-matrices	Two sets of PSSM	Similarity scores + matrix alignments	Compares two collections of PSSMs and returns various similarity statistics + matrix alignments.
	* matrix-clustering	One set of PSSM	Clusters of matrices + similarity trees	Clusters similar PSSMs and builds consensus matrices for each cluster.
Comparative genomics	get-orthologs	Gene names + taxon	List of homologous genes with percentage of identity, alignment length and e-value	Given a list of genes from a query organism and a reference taxon, returns the orthologues of the query gene(s) in all the organisms belonging to the reference taxon.
	footprint-discovery	Sequences	Conserved dyads + PSSM	Detects phylogenetic footprints by applying ‘dyad-analysis’ in promoters of a set of orthologous genes.
	* footprint-scan	Sequences + PSSM	Conserved motifs + binding sites	Scans promoters of orthologous genes with one or several PSSMs to detect enriched motifs and predict phylogenetically conserved target genes.
Building control sets	random-seq	Sequence specifications	Sequences	This program generates random sequences. Different probabilistic models are proposed (equiprobable nucleotides, specific alphabet utilization, Markov chains).
	random-genes	Name of an organism	Genes	Selects a random set of genes in a given genome.
	random-genome-fragments	Name of an organism	Randomly selected genome fragments	Selects a set of fragments with random positions in a given genome supported in either RSAT or Ensembl and returns their coordinates and/or sequences.
	permute-matrix	One set of PSSM	Randomized PSSMs	Randomizes a set of input matrices by permuting their columns. The resulting motifs have the same nucleotide composition and information content as the original ones.

This table only displays the most central tools available on the Web interface. See the RSAT Web site for an exhaustive list of available tools. The new tools since the 2011 Web software issue are marked with an asterisk (*).

RSAT FUNCTIONALITIES

De novo motif discovery in genome-wide data sets

RSAT core programs focus on finding putative regulatory signals by detecting exceptional motifs in a set of sequences. These sequences can correspond, for example, to regulatory regions of co-expressed genes obtained from transcriptome profiling (e.g. microarrays, RNA-seq) or regions revealed by epigenomic experiments (e.g. ChIP-seq, ChIP-exo, DNaseI, ATAC-seq) to be likely bound by a given TF or associated with open chromatin. RSAT provides tools to retrieve promoter sequences (retrieve-seq, retrieve-ensembl-seq (6); Table 1) from a list of genes. For genome-wide epigenomic data sets, a new program extracts the sequences corresponding to a list of genomic coordinates specified in BED format (fetch-sequences from UCSC). The UCSC database is used for this task, as its programmatic access for sequences via DAS is very efficient, although it does not support repeat-masked sequences. In the future, we will add support for the new programmatic access to Ensembl via REST, which does support repeat-masked sequences. Sequences are then used to perform ab initio motif discovery, based on a variety of criteria: over-represented oligonucleotides (oligo-analysis (1)) or spaced pairs (dyad-analysis (7)), positionally biased oligonucleotides (position-analysis (8), local-word-analysis) and differential motif representation between two data sets (oligo-diff). To facilitate analysis of genome-wide data sets, we provide a predefined pipeline (peak-motifs, (9,10)) that performs motif discovery with multiple algorithms, compares the predicted motifs with databases and enables visualization of putative binding sites in the UCSC genome browser. The computing efficiency of ‘peak-motifs’ enables online analysis of full data sets (several tens of megabases), without size restriction, within a few minutes. The discovered motifs are usually used in a second step to scan the original set of sequences and locate putative binding sites (‘dna-pattern’, ‘matrix-scan’ (11)). 
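The idea of detecting over-represented oligonucleotides can be illustrated with a minimal sketch. It is not the oligo-analysis implementation: it assumes a simple Bernoulli background with independent nucleotides, counts overlapping k-mers on a single strand, and uses an exact binomial tail as the scoring statistic, without any correction for multiple testing or self-overlapping words. The function names and the `alpha` threshold are hypothetical and chosen only for illustration.

```python
from collections import Counter
from math import comb


def count_kmers(sequences, k):
    """Count every overlapping k-mer (single strand) across all sequences."""
    counts = Counter()
    for seq in sequences:
        seq = seq.upper()
        for i in range(len(seq) - k + 1):
            counts[seq[i:i + k]] += 1
    return counts


def binomial_tail(n, k, p):
    """P(X >= k) for X ~ Binomial(n, p); exact sum, intended for small n."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k, n + 1))


def overrepresented(sequences, k, base_freq, alpha=0.01):
    """Return (kmer, observed, expected, p-value) tuples sorted by p-value.

    base_freq maps each nucleotide to its background probability
    (Bernoulli model: an assumption of this sketch).
    """
    counts = count_kmers(sequences, k)
    total = sum(counts.values())  # number of k-mer positions scanned
    hits = []
    for kmer, observed in counts.items():
        if any(base not in base_freq for base in kmer):
            continue  # skip words containing ambiguous letters such as N
        prior = 1.0
        for base in kmer:
            prior *= base_freq[base]
        pval = binomial_tail(total, observed, prior)
        if pval < alpha:
            hits.append((kmer, observed, total * prior, pval))
    return sorted(hits, key=lambda hit: hit[3])
```

Calling `overrepresented(peak_sequences, 6, {"A": 0.25, "C": 0.25, "G": 0.25, "T": 0.25})` on a small set of sequences lists the hexamers whose counts exceed the Bernoulli expectation. Richer background models, such as Markov chains, would replace the `prior` computation.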
TF binding sites often form clusters, potentially corresponding to enhancers. Identifying such cis-regulatory modules is achieved in RSAT by predicting cis-regulatory enriched regions (CRERs) (11). Initially embedded within ‘matrix-scan’, detection of CRERs has been re-designed as an independent program (crer-scan) to increase its computing efficiency and expand its scope. Initially limited to binding sites predicted with RSAT ‘matrix-scan’, it now takes as input any set of feature coordinates (e.g. annotated sites, ChIP-seq peaks) and detects windows significantly enriched in these features. This quicker version now enables the scanning of genome-wide data sets. Since its early development, RSAT has included several tools to build negative control sets, which can be used to assess the reliability of results obtained from predictive programs (random-seq, random-genes). For genome-wide analyses such as ChIP-seq, random data sets can be prepared by selecting sequences at random positions from a given genome (random-genome-fragments), or random controls can be performed by scanning the original sequences with permuted motifs (permute-matrix).
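The permuted-motif control can be sketched in a few lines. The example below is an illustration only, not the permute-matrix code: it assumes a count matrix stored as a dictionary mapping each nucleotide to a list of per-column counts, and the helper names are hypothetical. It shows why column permutation yields a suitable negative control: the per-nucleotide totals and the total information content are unchanged, while the order of positions (and hence the motif itself) is scrambled.

```python
import math
import random


def permute_columns(matrix, rng=random):
    """Return a copy of the matrix with its columns in random order.

    The same column order is applied to every row, so each column
    keeps its own nucleotide counts.
    """
    width = len(next(iter(matrix.values())))
    order = list(range(width))
    rng.shuffle(order)
    return {base: [row[i] for i in order] for base, row in matrix.items()}


def information_content(matrix):
    """Total information content in bits.

    Assumes an equiprobable background and no pseudo-counts
    (simplifications made for this sketch).
    """
    width = len(next(iter(matrix.values())))
    max_bits = math.log2(len(matrix))
    total = 0.0
    for i in range(width):
        column = [row[i] for row in matrix.values()]
        n = sum(column)
        total += max_bits + sum((c / n) * math.log2(c / n) for c in column if c > 0)
    return total


counts = {
    "A": [8, 0, 1, 2],
    "C": [0, 9, 1, 2],
    "G": [1, 0, 7, 3],
    "T": [1, 1, 1, 3],
}
shuffled = permute_columns(counts, random.Random(0))
```

Scanning the original sequences with `shuffled` rather than `counts` gives an empirical estimate of how many hits a motif of equal composition and information content would produce by chance.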

Comparing and clustering motifs

RSAT offers extended support for detailed analysis of motifs represented as position-specific scoring matrices (PSSMs). First, it comprises a program to assess the quality of a PSSM on user-defined sequence data sets by comparing theoretical and empirical score distributions (matrix-quality (12)). This program has also proven useful for measuring the enrichment of genomic regions (e.g. ChIP-seq peak sets) for one or several TF binding motifs. Second, there is an increasing demand for comparing motifs from various sources: discovered motifs versus collections of annotated matrices such as JASPAR (13), motifs discovered by different algorithms, or in different biological data sets. We have thus increased the computing efficiency of our motif comparison program (compare-matrices). Third, comparing multiple motifs and identifying clusters of similarities amongst them is very useful to regroup redundant matrices returned by several motif discovery algorithms, or to study the relationships between motifs bound by families of phylogenetically related TFs. For this purpose, we have implemented a new tool, ‘matrix-clustering’, which performs hierarchical clustering on a set of input motifs, draws trees to highlight the similarities between them (computed with ‘compare-matrices’), computes consensus motifs (matrices, IUPAC and logos) at each branch of the trees and generates a dynamic report enabling users to customize the graphical representations of motif similarities (Castro-Mondragon, J.A. et al., in preparation).

Detecting regulatory variations

Population genomics has given rise to large amounts of genetic variation data in human populations, in some model organisms and in several plants. Furthermore, for human, Genome-Wide Association Studies (GWAS), which aim at discovering loci and genes associated with diseases, peculiar phenotypes or quantitative traits, have produced over 10 000 single nucleotide polymorphisms (SNPs) with reported trait associations, accessible via the NHGRI catalogue (14). GWAS results are also available for other organisms (https://easygwas.tuebingen.mpg.de/). Of note, a large fraction of reported SNPs are located outside of protein coding sequences (15,16). RSAT now provides a tool to extract genetic variants together with their flanking sequences (retrieve-variation-seq), which can then be scanned with a collection of motifs to predict the impact of the variation on TF binding (variation-scan). Impact on TF binding is computed by comparing the site scores (weight score difference and P-value fold change) obtained with a PSSM on sequences containing the different reported alleles for one SNP in sliding windows (Medina-Rivera, A. et al., in preparation).
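The weight score comparison between alleles can be sketched as follows. This is a simplified illustration under stated assumptions, not the variation-scan algorithm: it scores a single strand, uses natural-log log-odds weights with a pseudo-count distributed according to the background, keeps the best-scoring window overlapping the SNP for each allele, and omits the P-value fold change. All function names are hypothetical, and sequences are assumed to contain only the letters A, C, G and T.

```python
import math


def weight_matrix(counts, background, pseudo=1.0):
    """Convert a count matrix (base -> list of counts) into log-odds weights."""
    width = len(counts["A"])
    weights = []
    for i in range(width):
        n = sum(counts[base][i] for base in "ACGT")
        column = {}
        for base in "ACGT":
            freq = (counts[base][i] + pseudo * background[base]) / (n + pseudo)
            column[base] = math.log(freq / background[base])
        weights.append(column)
    return weights


def best_window_score(weights, seq, snp_pos):
    """Best (score, start) among windows covering snp_pos (0-based)."""
    width = len(weights)
    first = max(0, snp_pos - width + 1)
    last = min(snp_pos, len(seq) - width)
    best = None
    for start in range(first, last + 1):
        score = sum(weights[i][seq[start + i]] for i in range(width))
        if best is None or score > best[0]:
            best = (score, start)
    return best


def allele_scores(weights, left_flank, alleles, right_flank):
    """Score each allele in its flanking context; returns allele -> (score, start)."""
    snp_pos = len(left_flank)
    return {
        allele: best_window_score(weights, left_flank + allele + right_flank, snp_pos)
        for allele in alleles
    }
```

The weight difference between two alleles is then `scores[a][0] - scores[b][0]`; a large positive value suggests that allele `a` is the better match to the motif at this locus.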

Comparative genomics

The high number of sequenced prokaryotic genomes makes cross-species conservation (sequences or biological features) an increasingly powerful approach to detect potentially functional genomic elements in non-coding regions. RSAT supports both motif discovery and motif scanning approaches in the context of comparative genomic applications. Starting from a gene of interest, two predefined pipelines are provided to extract promoters of orthologous genes and discover over-represented motifs (footprint-discovery (17,18), limited to Prokaryotes and Fungi), or scan them with user-specified motifs to predict phylogenetically conserved target genes for known TFs (footprint-scan, limited to Prokaryotes and Fungi).

Compatibility with other programs and resources

As far as we know, no other software suite dedicated to cis-regulation covers as wide a scope of functionalities as RSAT. The main alternative to RSAT is the MEME suite (19), which is limited to motif analyses. MEME and RSAT can be used in a complementary way, as RSAT encompasses the utility tool ‘convert-matrix’, which accepts MEME-formatted results as input. RSAT actually includes several utility tools to ensure inter-conversion between alternative formats for different types of objects: ‘convert-matrix’, ‘convert-seq’, ‘convert-features’, ‘convert-variations’, ‘convert-background-model’, ‘convert-classes’ and ‘convert-graph’. These simple programs facilitate inter-connections between RSAT and complementary methods, extending the potential usages of the tools.

RSAT 2015 NOVELTIES

In addition to the novel programs described above, we have made particular efforts to facilitate the installation of the RSAT suite, in particular by packaging it in a virtual machine (see below). The other novelties mainly concern the management of supported genomes.

Taxon-specific public web sites

In order to cope with the exponential increase of available genomes, the new RSAT release presents a reconfigured organization of the public Web sites based on five servers dedicated to specific taxonomic groups. It is important to note that some types of analyses have become taxon-specific. For example, motif discovery approaches by phylogenetic footprints have proven powerful with Bacteria and Fungi taxa, but remain delusive with Metazoa. Similarly, the methods for detecting regulatory variants depend on the availability of primary data about genetic polymorphism, which is currently available mainly for a few metazoans (human and model organisms) and one yeast (Saccharomyces cerevisiae). To address the specific needs of user communities and better guide them to the appropriate tools, we set up taxon-dedicated servers, organized according to Ensembl Genomes divisions into Prokaryotes (regrouping Bacteria and Archaea), Fungi, Protists (a multi-clade grouping), Metazoa (merging metazoan species from ensembl.org, which mainly hosts vertebrates, and ensemblgenomes.org, which hosts non-vertebrate metazoans) and Plants. These taxon-specific servers also provide adapted collections of reference motifs, extracted from specialized databases: RegulonDB for Bacteria (20), JASPAR for Metazoa (13) and footprintDB for Plants (21). Furthermore, we are dedicating one server to teaching courses and tutorials (see below), which will provide access to all tools and support representative sets of organisms from each taxon.

Extension of supported organisms

In addition to the previously available organisms imported from the NCBI and Ensembl databases, we have added support for Ensembl Genomes (22). As of January 2015, RSAT public servers support 3314 genomes (including 2941 Bacteria, 170 Archaea, 123 Fungi, 40 Metazoa, 18 Plants and 22 Protists).

LEARNING TO USE RSAT

We provide extensive material to help users become familiar with the RSAT suite.

Question-based guidance through the tools

Each server's home page now includes a dynamic menu guiding new users to the appropriate tool for their question, within a selection of the most common analyses.

Online help on the web pages

Each tool is documented by a manual and is equipped with one or several ‘Demo’ buttons to load the form with illustrative test cases. Some of the tools are also documented by online tutorials, to explain how to choose the relevant parameters and interpret the results. A tutorial given at the ECCB 2014 entitled ‘Analysis of Cis-Regulatory Motifs from High-Throughput Sequence Sets’ is also accessible to all users and constitutes a useful guide to the various access modes to RSAT.

Published protocols and tutorials

A series of protocols have been published (10,11,23–27) to cover some core applications of RSAT. These explain how to manipulate the tools and the underlying algorithms, and guide the reader to gain experience in the biological interpretation of the results.

Outreach and training

Since its initial development, the RSAT team has been committed to education, providing courses and workshops to students and the scientific community around the world. Courses include introduction to basic pattern-matching and pattern-discovery approaches as well as application to biological questions (e.g. transcriptome analysis, microbial genome regulation, comparative bacterial genomics, ChIP-seq analysis), with a particular emphasis on ‘hands on’ data analysis. Course material is available at http://teaching.rsat.eu/.

EXTENDED ACCESS MODES TO THE TOOLS

RSAT can be accessed in different ways: via the Web sites, SOAP Web services or the Unix command-line. In addition, RSAT can now be used through a virtual machine, installed either on a local server or on a computer cloud. We are currently working towards integrating RSAT within the Galaxy framework (28).

Web server

The simplest way to use the RSAT suite is via its Web sites, which provide a user-friendly interface and do not require any particular computational skills. The tools are organized in a modular way: at the end of each analysis, the result page proposes a list of buttons to send the results as input for the complementary tools. For example, tools producing sequences are automatically interconnected to all the tools taking sequences as input.

Web services

To use RSAT for repetitive tasks or to combine several tools into custom pipelines, we provide Web services implemented using the standards SOAP/WSDL (Simple Object Access Protocol/Web Services Description Language). For this programmatic access, users can write clients in any SOAP-supported language (e.g. Perl, Python, Java).

Virtual machine

For users wishing to run a local version of RSAT, the easiest option is to download the ready-to-use Virtual Machine (RSAT Download page: http://teaching.rsat.eu/download-request_form.cgi). The advantages of this solution are that it (i) runs on any operating system supporting VirtualBox (including Windows); (ii) avoids installing dependencies (libraries for the system, Perl, Python, etc.); and (iii) improves security by isolating RSAT from the host system and data space. The main drawback of this solution is the requirement of sufficient computing resources: at least 2 GB of memory allocated to the virtual machine and 6 GB of storage for the guest operating system (Linux Ubuntu 14.04), in addition to the RSAT package and genomes. The RSAT Virtual Machine can be installed on a cloud, as done for French users via the Cloud of the Institut Français de Bioinformatique (http://cloud.france-bioinformatique.fr).

Installing RSAT in user's operating system

The whole software suite can also be downloaded (RSAT Download page: http://teaching.rsat.eu/download-request_form.cgi) and installed on Unix-type operating systems (e.g. Linux, Mac OSX). With a local installation, each program can be called directly from the command-line interpreter. This presents several advantages: (i) access to more tools than presented on the Web sites; (ii) installation of custom collections of genomes; (iii) automation of analyses by integrating the tools in custom scripts; and (iv) parallelization of analyses on multi-processor configurations. The downloaded tools can also be used to set up a custom Web server, to support the needs of local communities. The drawbacks of the local installation are (i) the requirement to install several programs and libraries whilst ensuring their compatibility with other local resources, and (ii) the need for substantial disk space to store the genomes of interest.

CONCLUSIONS

RSAT is possibly the most comprehensive academic suite of programs for the analysis of cis-regulatory sequences. In addition to the core motif discovery programs, which are scalable to genome-wide analyses, RSAT has been expanded to diversify its applications, including comparison and clustering of motifs, regulatory variant analyses and comparative genomics. A key strength is its interoperability with other databases (support for many motif collections) and web tools (thanks to inter-conversions between file formats). Contrary to many programs that are dedicated to a few model organisms, RSAT offers access to thousands of genomes from all kingdoms, facilitated by a new taxon-specific organization of the public servers. Its various modes of access and comprehensive documentation suit the needs of various types of users, from experimental biologists wishing to analyse their data sets without programming skills, to bioinformaticians wishing to integrate RSAT within their automated analysis workflows.

AVAILABILITY

All public RSAT servers are accessible from the RSAT portal at http://www.rsat.eu/. RSAT Web servers can be freely accessed by all users without login requirement.

28 in total

1. Systematic localization of common disease-associated variation in regulatory DNA.

Authors: Matthew T Maurano; Richard Humbert; Eric Rynes; Robert E Thurman; Eric Haugen; Hao Wang; Alex P Reynolds; Richard Sandstrom; Hongzhu Qu; Jennifer Brody; Anthony Shafer; Fidencio Neri; Kristen Lee; Tanya Kutyavin; Sandra Stehling-Sun; Audra K Johnson; Theresa K Canfield; Erika Giste; Morgan Diegel; Daniel Bates; R Scott Hansen; Shane Neph; Peter J Sabo; Shelly Heimfeld; Antony Raubitschek; Steven Ziegler; Chris Cotsapas; Nona Sotoodehnia; Ian Glass; Shamil R Sunyaev; Rajinder Kaul; John A Stamatoyannopoulos
Journal: Science Date: 2012-09-05 Impact factor: 47.728

2. A complete workflow for the analysis of full-size ChIP-seq (and similar) data sets using peak-motifs.

Authors: Morgane Thomas-Chollier; Elodie Darbo; Carl Herrmann; Matthieu Defrance; Denis Thieffry; Jacques van Helden
Journal: Nat Protoc Date: 2012-07-26 Impact factor: 13.491

3. Genic and nongenic contributions to natural variation of quantitative traits in maize.

Authors: Xianran Li; Chengsong Zhu; Cheng-Ting Yeh; Wei Wu; Elizabeth M Takacs; Katherine A Petsch; Feng Tian; Guihua Bai; Edward S Buckler; Gary J Muehlbauer; Marja C P Timmermans; Michael J Scanlon; Patrick S Schnable; Jianming Yu
Journal: Genome Res Date: 2012-06-14 Impact factor: 9.043

4. Theoretical and empirical quality assessment of transcription factor-binding motifs.

Authors: Alejandra Medina-Rivera; Cei Abreu-Goodger; Morgane Thomas-Chollier; Heladia Salgado; Julio Collado-Vides; Jacques van Helden
Journal: Nucleic Acids Res Date: 2010-10-04 Impact factor: 16.971

5. RSAT 2011: regulatory sequence analysis tools.

Authors: Morgane Thomas-Chollier; Matthieu Defrance; Alejandra Medina-Rivera; Olivier Sand; Carl Herrmann; Denis Thieffry; Jacques van Helden
Journal: Nucleic Acids Res Date: 2011-07 Impact factor: 16.971

6. RSAT peak-motifs: motif analysis in full-size ChIP-seq datasets.

Authors: Morgane Thomas-Chollier; Carl Herrmann; Matthieu Defrance; Olivier Sand; Denis Thieffry; Jacques van Helden
Journal: Nucleic Acids Res Date: 2011-12-08 Impact factor: 16.971

7. The NHGRI GWAS Catalog, a curated resource of SNP-trait associations.

Authors: Danielle Welter; Jacqueline MacArthur; Joannella Morales; Tony Burdett; Peggy Hall; Heather Junkins; Alan Klemm; Paul Flicek; Teri Manolio; Lucia Hindorff; Helen Parkinson
Journal: Nucleic Acids Res Date: 2013-12-06 Impact factor: 16.971

8. RegulonDB v8.0: omics data sets, evolutionary conservation, regulatory phrases, cross-validated gold standards and more.

Authors: Heladia Salgado; Martin Peralta-Gil; Socorro Gama-Castro; Alberto Santos-Zavaleta; Luis Muñiz-Rascado; Jair S García-Sotelo; Verena Weiss; Hilda Solano-Lira; Irma Martínez-Flores; Alejandra Medina-Rivera; Gerardo Salgado-Osorio; Shirley Alquicira-Hernández; Kevin Alquicira-Hernández; Alejandra López-Fuentes; Liliana Porrón-Sotelo; Araceli M Huerta; César Bonavides-Martínez; Yalbi I Balderas-Martínez; Lucia Pannier; Maricela Olvera; Aurora Labastida; Verónica Jiménez-Jacinto; Leticia Vega-Alvarado; Victor Del Moral-Chávez; Alfredo Hernández-Alvarez; Enrique Morett; Julio Collado-Vides
Journal: Nucleic Acids Res Date: 2012-11-29 Impact factor: 16.971

9. Ensembl Genomes 2013: scaling up access to genome-wide data.

Authors: Paul Julian Kersey; James E Allen; Mikkel Christensen; Paul Davis; Lee J Falin; Christoph Grabmueller; Daniel Seth Toney Hughes; Jay Humphrey; Arnaud Kerhornou; Julia Khobova; Nicholas Langridge; Mark D McDowall; Uma Maheswari; Gareth Maslen; Michael Nuhn; Chuang Kee Ong; Michael Paulini; Helder Pedro; Iliana Toneva; Mary Ann Tuli; Brandon Walts; Gareth Williams; Derek Wilson; Ken Youens-Clark; Marcela K Monaco; Joshua Stein; Xuehong Wei; Doreen Ware; Daniel M Bolser; Kevin Lee Howe; Eugene Kulesha; Daniel Lawson; Daniel Michael Staines
Journal: Nucleic Acids Res Date: 2013-10-25 Impact factor: 16.971

10. JASPAR 2014: an extensively expanded and updated open-access database of transcription factor binding profiles.

Authors: Anthony Mathelier; Xiaobei Zhao; Allen W Zhang; François Parcy; Rebecca Worsley-Hunt; David J Arenillas; Sorana Buchman; Chih-yu Chen; Alice Chou; Hans Ienasescu; Jonathan Lim; Casper Shyr; Ge Tan; Michelle Zhou; Boris Lenhard; Albin Sandelin; Wyeth W Wasserman
Journal: Nucleic Acids Res Date: 2013-11-04 Impact factor: 16.971
