JASPAR 2022: the 9th release of the open-access database of transcription factor binding profiles.

Jaime A Castro-Mondragon¹, Rafael Riudavets-Puig¹, Ieva Rauluseviciute¹, Roza Berhanu Lemma¹, Laura Turchi², Romain Blanc-Mathieu², Jeremy Lucas², Paul Boddie¹, Aziz Khan³, Nicolás Manosalva Pérez^4,5, Oriol Fornes⁶, Tiffany Y Leung⁶, Alejandro Aguirre⁶, Fayrouz Hammal⁷, Daniel Schmelter⁸, Damir Baranasic^9,10, Benoit Ballester⁷, Albin Sandelin¹¹, Boris Lenhard^9,10, Klaas Vandepoele^4,5,12, Wyeth W Wasserman⁶, François Parcy², Anthony Mathelier^1,13.

Abstract

JASPAR (http://jaspar.genereg.net/) is an open-access database containing manually curated, non-redundant transcription factor (TF) binding profiles for TFs across six taxonomic groups. In this 9th release, we expanded the CORE collection with 341 new profiles (148 for plants, 101 for vertebrates, 85 for urochordates, and 7 for insects), which corresponds to a 19% expansion over the previous release. We added 298 new profiles to the Unvalidated collection when no orthogonal evidence was found in the literature. All the profiles were clustered to provide familial binding profiles for each taxonomic group. Moreover, we revised the structural classification of DNA binding domains to consider plant-specific TFs. This release introduces word clouds to represent the scientific knowledge associated with each TF. We updated the genome tracks of TFBSs predicted with JASPAR profiles in eight organisms; the human and mouse TFBS predictions can be visualized as native tracks in the UCSC Genome Browser. Finally, we provide a new tool to perform JASPAR TFBS enrichment analysis in user-provided genomic regions. All the data is accessible through the JASPAR website, its associated RESTful API, the R/Bioconductor data package, and a new Python package, pyJASPAR, that facilitates serverless access to the data.

Year: 2022 PMID: 34850907 PMCID: PMC8728201 DOI: 10.1093/nar/gkab1113

Source DB: PubMed Journal: Nucleic Acids Res ISSN: 0305-1048 Impact factor: 16.971

INTRODUCTION

Transcription factors are proteins that interact with DNA in a sequence-specific manner through recognition of their TF binding sites (TFBSs) located at cis-regulatory regions (promoters, enhancers) to regulate transcription (1). TF binding to these regions occurs through direct interactions between the DNA-binding domains (DBDs) of TFs and the DNA. DBDs are classified into structural classes and families, and TFs with related DBDs typically have similar DNA binding preferences (2). The binding of TFs to cis-regulatory regions promotes or inhibits the assembly of the transcription machinery, thereby controlling gene expression (1,3–5). Sequence-specific TF-DNA interactions at TFBSs can be experimentally determined either in vitro or in vivo. High-throughput in vitro methods include systematic evolution of ligands by exponential enrichment (SELEX) (6) and protein-binding microarrays (PBMs) (7), where TFs are exposed to synthesized DNA sequences. High-throughput in vivo assays include chromatin immunoprecipitation-based methods such as ChIP-seq (8), ChIP-exo (9) and ChIP-nexus (10), and cleavage-based methods such as cleavage under targets and tagmentation (CUT&Tag) (11) or cleavage under targets and release using nuclease (CUT&RUN) (12). These high-throughput assays (reviewed in (1)) provide unprecedented means to characterize the binding properties of individual TFs. Nevertheless, understanding how TFs interact cooperatively at regulatory elements, for instance by forming dimers, remains a challenge (13). Recently, CAP-SELEX revealed that TF pairs can bind in a DNA-dependent manner and that the combined binding of TFs can alter their individual binding specificities (14). Despite the establishment of a wide variety of experimental techniques that delineate TF-DNA binding interactions and TF binding specificities, experimentally identifying all TFBSs for all TFs in various systems and biological conditions is intractable.
To address this challenge, researchers rely on computational modeling to predict and investigate TF-DNA interactions. Such methods are helpful for investigating results of experimental methods with low resolution. For instance, ChIP-seq peaks are typically an order of magnitude larger than the actual binding sites of a targeted TF, and therefore computational methods can be used to pinpoint the binding sites within the peaks (15,16). Given the importance of understanding TF-DNA interactions in studying gene expression regulation, various computational methods have been devised to model and predict TFBSs. These methods use experimentally identified TFBSs to build models and computationally predict TFBSs in a given genomic sequence (5). They range from basic representations such as sequence consensus-based models and position frequency matrices (PFMs) to more complex representations such as Markov and deep learning-based models (reviewed in (13,17–18)). PFMs, which summarize the occurrences of each nucleotide at each position in a set of observed TF-DNA interactions, are the most commonly used representation of TF binding specificities. Unlike simple consensus-based models, PFMs can be transformed into probabilistic or energy-based models to obtain position weight matrices (PWMs) (or position-specific scoring matrices (PSSMs)) that can be used to scan any DNA sequence and predict TFBSs as the sites whose summed weights exceed a defined threshold (reviewed in (17)). Hence, TF binding preferences can be represented as PFMs, which can be interpreted as TF binding profiles or motifs. In this manuscript, we will use the terms PFM, motif, and TF binding profile interchangeably. JASPAR is a popular and regularly maintained open-access and manually curated database storing TF binding preferences as PFMs.
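The PFM-to-PWM transformation and the threshold-based scan described above can be illustrated with a minimal sketch. It is not the procedure used by JASPAR or any specific tool: the log-odds formula with a uniform background, the pseudocount of 0.8, the toy counts, and the function names `pfm_to_pwm` and `scan` are all assumptions chosen for illustration.

```python
import math

BASES = "ACGT"

def pfm_to_pwm(pfm, pseudocount=0.8, background=None):
    """Convert a PFM (dict: base -> list of counts) into a log2-odds PWM.

    The pseudocount value and the uniform background are illustrative choices.
    """
    background = background or {b: 0.25 for b in BASES}
    length = len(pfm["A"])
    pwm = {b: [] for b in BASES}
    for i in range(length):
        total = sum(pfm[b][i] for b in BASES) + pseudocount
        for b in BASES:
            # The pseudocount is distributed according to the background.
            freq = (pfm[b][i] + pseudocount * background[b]) / total
            pwm[b].append(math.log2(freq / background[b]))
    return pwm

def scan(sequence, pwm, threshold):
    """Return (start, score) for each window whose summed weight >= threshold."""
    length = len(pwm["A"])
    hits = []
    for start in range(len(sequence) - length + 1):
        window = sequence[start:start + length]
        if any(base not in pwm for base in window):
            continue  # skip windows containing ambiguous bases such as N
        score = sum(pwm[base][i] for i, base in enumerate(window))
        if score >= threshold:
            hits.append((start, score))
    return hits

# Toy 4-bp PFM with made-up counts that strongly prefers "TGCA".
toy_pfm = {
    "A": [1, 0, 1, 17],
    "C": [1, 1, 17, 1],
    "G": [1, 18, 1, 1],
    "T": [17, 1, 1, 1],
}
toy_pwm = pfm_to_pwm(toy_pfm)
hits = scan("AATGCATT", toy_pwm, threshold=4.0)
```

In practice the threshold is usually derived from the score distribution (for example a relative score or a P-value cutoff) rather than fixed by hand, and both strands are scanned; the sketch omits both for brevity.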
The JASPAR CORE collection provides non-redundant binding preferences for TFs (one versioned profile per TF per taxon, except when a TF has multiple DNA-binding preferences) across 6 taxa: urochordata, vertebrates, plants, insects, nematodes, and fungi. Inclusion of new profiles requires orthogonal evidence for the binding preferences of the TFs, which is rigorously evaluated by our expert curators. To complement the CORE collection, we previously introduced the Unvalidated collection to store high-quality TF-binding profiles that are lacking orthogonal supporting evidence in the literature (19). Beyond the high-quality TF binding profiles and metadata stored in JASPAR, the popularity of the database originates from its simplicity, the tools embedded in its web-interface, and the multitude of popular resources and tools directly integrating JASPAR profiles. Some of these tools include: (i) the MEME suite, allowing various motif enrichment and discovery analysis (20), (ii) TFBSshape allowing investigation of DNA shape features for TFBSs to provide insight on the mechanism of protein–DNA interaction (21,22), (iii) CiiiDER (23) for TFBS prediction and analysis such as enrichment assessment in DNA sequences, (iv) RSAT, allowing motif discovery, TFBS motif analyses (24) and (v) i-cisTarget, which allows the prediction of cis-regulatory modules and regulatory features (25,26). In this paper, we present the 9th release of the JASPAR database, which provides a substantial update and expansion of TF binding profiles in the six taxonomic groups. The update includes not only binding profiles (as PFMs) but also revisited metadata. Additionally, we added word clouds to display enriched terms associated with TFs in the scientific literature. Furthermore, a rigorous structural classification of plant TF DBDs is provided to adequately consider the numerous plant-specific TFs. 
Finally, the update comes with a range of new or updated functionalities and resources such as a TFBS enrichment tool, the pyJASPAR package, new familial binding profiles, and native UCSC human and mouse genome tracks with TFBSs predicted from JASPAR TF binding profiles.

RESULTS

Expansion and update of the JASPAR database

TF binding profiles

In the 9th release of JASPAR, we discarded unused collections introduced in early releases of the database (27–29) that either did not correspond to TF-specific binding profiles or were data-type specific; we maintained the CORE and Unvalidated collections. We computed and compiled TF binding profiles obtained from CAP-SELEX (14), NCAP-SELEX (30), SELEX-seq (31), PBMs (32), ChIP-seq (33–36) and DAP-seq experiments from ReMap 2022 (36) and GEO (37), and ChIP-exo (38) data (see Supplementary Data 1 - Text for the detailed list of datasets and method details). After manual curation of these profiles to confirm orthogonal support in the literature, we augmented the CORE collection with 341 new binding profiles for TFs in four taxa (Table 1; Figure 1): 148 profiles in plants (a 24% expansion for this taxon), 101 profiles in vertebrates (a 13% expansion), 85 profiles in urochordates (only one motif had been present since the second release of JASPAR in 2006 (27)), and seven profiles in insects (a 5% expansion). Of these added profiles, 52 were upgraded from the Unvalidated to the CORE collection (27 and 25 for plants and vertebrates, respectively). Moreover, 31 of the newly introduced PFMs are associated with TF dimers. The literature that provides orthogonal evidence for the newly introduced TF binding profiles is provided in the metadata. Additionally, we updated 160 TF binding profiles across the six taxa with new PFMs (Table 1).

Table 1.

Growth overview of the CORE collection of JASPAR 2022 compared to the previous release

Taxonomic Group	Non-redundant PFMs in JASPAR 2020	New non-redundant PFMs in JASPAR 2022	Removed profiles	Upgraded profiles (from Unvalidated to CORE)	Updated PFMs in JASPAR 2022	Total PFMs (non-redundant) in JASPAR 2022
Plants	530	121	22	27	44	656
Vertebrates	746	76	6	25	102	841
Urochordata	1	85	-	-	-	86
Insects	143	7	-	-	-	150
Nematodes	43	-	-	-	-	43
Fungi	183	-	4	-	14	179
CORE total	1646	289	32	52	160	1955
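
The columns of Table 1 can be cross-checked with a few lines of arithmetic. The sketch below assumes that each 2022 total equals the 2020 count plus new and upgraded profiles minus removed profiles, and that a dash in the table stands for zero; both are readings of the table rather than statements from the authors. Updated PFMs replace existing profiles and therefore do not enter the total.

```python
# Rows of Table 1: (JASPAR 2020, new, removed, upgraded, total in JASPAR 2022).
# Dashes in the table are read as 0 (an assumption).
table1 = {
    "Plants":      (530, 121, 22, 27, 656),
    "Vertebrates": (746, 76, 6, 25, 841),
    "Urochordata": (1, 85, 0, 0, 86),
    "Insects":     (143, 7, 0, 0, 150),
    "Nematodes":   (43, 0, 0, 0, 43),
    "Fungi":       (183, 0, 4, 0, 179),
}

def expected_total(previous, new, removed, upgraded):
    """Assumed relationship between the columns of Table 1."""
    return previous + new + upgraded - removed

# Per-taxon consistency check.
checks = {taxon: expected_total(*row[:4]) == row[4] for taxon, row in table1.items()}

core_2020 = sum(row[0] for row in table1.values())
core_2022 = sum(row[4] for row in table1.values())
added = sum(row[1] + row[3] for row in table1.values())  # new + upgraded
growth_pct = round(100 * (core_2022 - core_2020) / core_2020)
```

Under these assumptions every row is consistent, `added` matches the 341 profiles reported in the text, and the net growth rounds to the reported 19%.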

Figure 1.

JASPAR CORE collection growth. The number of non-redundant profiles in each taxon (see legend) and overall through all JASPAR releases.

High-quality PFMs lacking orthogonal support were included in the Unvalidated collection (298 new profiles; Supplementary Data 1 - Supplementary Figure S1, Supplementary Data 2 - Supplementary Table S1). Specifically, 115 TF binding profiles are associated with zinc-finger TFs and 95 are associated with TFs binding DNA as dimers. We provide the Unvalidated collection of TF binding profiles to the community to use with due caution since they are not yet supported by orthogonal evidence. We extend our invitation to the user community to be involved in the motif curation process by providing either new unvalidated profiles to consider or support for existing profiles in the collection. We exhaustively revised the metadata to update information about the TF names, the structural class and family of the TF DBDs (following TFClass (39)), and links to external databases such as UniProt (40), ReMap (36), UniBind (15,16) and DNA Readout Viewer (41), whenever possible. Finally, we removed 32 profiles from the CORE collection (22 plant, 6 vertebrate and 4 fungal profiles) as they corresponded to synonyms of already present TF profiles, had low information content, or were derived from consensus strings (Table 1). In addition, we removed 85 profiles from the Unvalidated collection (44 vertebrate, 40 plant and 1 fungal profile) because: (i) the corresponding profile or a new profile for the same TF was added to the CORE collection; (ii) the profile was of insufficient quality; or (iii) the profile was misannotated (Supplementary Data 2 - Supplementary Table S1; detailed list of all removed profiles at https://jaspar.genereg.net/changelog/).
The JASPAR 2022 CORE collection now stores 1955 non-redundant PFMs (841 for vertebrates, 656 for plants, 179 for fungi, 150 for insects, 43 for nematodes, and 86 for urochordates) (Table 1; Figure 1). Additionally, we maintained the associated collection of transcription factor flexible models (TFFMs; hidden Markov-based models capturing dinucleotide dependencies in TF–DNA interactions (42)) that were initialized using JASPAR CORE PFMs and trained on ChIP-seq data (Supplementary Data 1—Text). This process resulted in 303 new TFFMs (207 for vertebrates and 96 for plants).

Improved structural classification of plant TF DNA-binding domains

In JASPAR, TFs are classified based on TFClass (39), which provides a hierarchical structural classification (including superclass, class, and family) originally designed for human TFs and later extended to mammals. Since plant genomes contain many classes of TFs absent from TFClass, we expanded the TF structural classification using TFClass guidelines (39) and published structural evidence (Supplementary Data 2 - Supplementary Table S2). In some rare cases (e.g. GARP and NF-Y TFs), we slightly diverged from TFClass so that the TF common name expected by users is provided in the structural class or family name. We arbitrarily decided to classify plant-specific RAV TFs, which contain two types of DBD (B3 and AP2), in the B3 class. WRKY TFs, which have a zinc finger and a DBD derived from a GCM fold, have been classified under the GCM domain factors class and WRKY family, and not in the Zinc-coordinating DNA-binding domains superclass. This homogenised classification introduced 27 novel entries in the TF DBD structural classification (Supplementary Data 2 - Supplementary Table S2) and led to numerous corrections in the class and family fields compared to previous JASPAR releases.

Word clouds of terms associated with TFs in the scientific literature

Biological information about TFs, or genes in general, is scattered across many different resources, with PubMed possibly being the most extensive one. In an attempt to provide rich annotations for the TFs in JASPAR, we mined the corpus of article abstracts available in the PubMed database (43). We compiled sets of abstracts associated with each TF and weighted each word present by its relative importance when compared to all abstracts associated with other TFs in the same taxon (Supplementary Data 1—Text for method details). For each TF, the 200 highest weighted words were used to create a word cloud summarizing the annotations associated with that TF. As an example, Figure 2 illustrates the word cloud of terms associated with the PAX6 TF in the scientific literature. Among the most significant terms, we find ‘lens’, ‘iris’, and ‘foveal’ that are representative of the importance of PAX6 in the development of the eye, while the term ‘aniridia’ reflects the link between some PAX6 mutations and the genetic disorder aniridia (44,45).
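
The idea of weighting each word by its relative importance for one TF compared with the abstracts of other TFs can be sketched with a TF-IDF-like score. This is a stand-in technique chosen for illustration; the actual weighting scheme is described in the Supplementary Data and may differ. The function names, the tokenizer, and the two toy "abstract" sets are made up.

```python
import math
import re
from collections import Counter

def tokenize(text):
    """Lowercase a text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())

def top_words(abstracts_by_tf, tf_name, n=200):
    """Rank the words of one TF by a TF-IDF-like weight (assumed scheme)."""
    counts = {
        tf: Counter(word for abstract in abstracts for word in tokenize(abstract))
        for tf, abstracts in abstracts_by_tf.items()
    }
    n_tfs = len(counts)
    # Number of TFs whose abstracts mention each word at least once.
    doc_freq = Counter(word for tf_counts in counts.values() for word in tf_counts)
    target = counts[tf_name]
    total = sum(target.values())
    weights = {
        word: (count / total) * math.log(n_tfs / doc_freq[word])
        for word, count in target.items()
    }
    # Highest weight first; ties are broken alphabetically for determinism.
    return sorted(weights.items(), key=lambda kv: (-kv[1], kv[0]))[:n]

# Made-up toy sentences standing in for PubMed abstracts.
toy_abstracts = {
    "PAX6": [
        "PAX6 mutations cause aniridia",
        "PAX6 controls lens and iris development",
    ],
    "NR3C1": [
        "NR3C1 mediates the glucocorticoid response",
        "glucocorticoid receptor controls development",
    ],
}
pax6_words = top_words(toy_abstracts, "PAX6")
```

Words shared by every TF in the taxon (here "controls" and "development") receive a weight of zero, which is the behaviour one would want for generic vocabulary; the top-ranked words would then be passed to a word cloud renderer.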

Figure 2.

JASPAR TF word clouds. Webpage providing information about the binding profile associated with PAX6. The word cloud of terms obtained for PAX6 is highlighted in red, which supports the role of this TF in eye development and its implication in causing the genetic disorder aniridia.

TF binding profile clusters, familial binding profiles, and genomic tracks

We updated the hierarchical clustering of the JASPAR TF binding profiles for each taxon with the RSAT matrix-clustering tool (46). Users can explore the CORE and Unvalidated collections through radial trees, which highlight the TF DBD structural classes, and directly access the underlying profiles by clicking on the TF name (https://jaspar.genereg.net/matrix-clusters). The hierarchical clustering of JASPAR PFMs was used to generate a collection of familial binding profiles (5,47), following previously published methodologies (16,48). Such familial motifs are useful in applications where motif redundancy (many TFs have similar binding preferences) is not desired. In brief, we defined clusters based on the DBD structural classes along the hierarchical clustering of PFMs. Next, we computed a familial binding profile for each cluster, summarizing the profiles within the clusters following (48) (Supplementary Data 1—Text for method details; Supplementary Data 1—Supplementary Figure S2). The familial binding profiles, also referred to as archetypes in (48), can be explored and downloaded at https://jaspar.genereg.net/matrix-clusters and https://jaspar.genereg.net/downloads/, respectively. One of the primary uses of PFMs is to predict binding sites. To facilitate this, we created ready-made prediction tracks for genome visualization and interpretation. Specifically, we scanned the genomes of eight organisms (Arabidopsis thaliana, Caenorhabditis elegans, Ciona intestinalis, Danio rerio, Drosophila melanogaster, Homo sapiens, Mus musculus, and Saccharomyces cerevisiae) with the JASPAR CORE PFMs associated with the same taxon to predict TFBSs and update the JASPAR TFBS genomic tracks. Moreover, we created a collection of familial TFBSs by merging overlapping TFBSs that were predicted from PFMs associated with the same familial binding profile (Supplementary Data 1—Text for method details). 
The TFBS predictions associated with all PFMs are available at http://expdata.cmmt.ubc.ca/JASPAR/downloads/UCSC_tracks/2022/. The familial binding TFBSs are available at https://jaspar.genereg.net/downloads/. Finally, we provide JASPAR TFBS predictions as genomic tracks, which can be visualized in genome browsers. Notably, the UCSC Genome Browser (49) now presents predicted human (for the hg19 and hg38 genome assemblies) and mouse (for the mm10 and mm39 genome assemblies) JASPAR TFBS data as native tracks with information such as TF names, TFBS prediction scores, and PFM logos (Supplementary Data 1 - Supplementary Figure S3).
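
Merging overlapping TFBSs that belong to the same familial binding profile amounts to interval merging within each (chromosome, family) group. The following is a minimal sketch under stated assumptions: BED-style 0-based half-open coordinates, strand ignored, and hypothetical family labels; the authors' actual procedure is given in the Supplementary Data.

```python
from collections import defaultdict

def merge_familial_tfbs(sites):
    """Merge overlapping sites that share a chromosome and a family label.

    `sites` is an iterable of (chrom, start, end, family) tuples with
    0-based, half-open coordinates. Strand is ignored (a simplification).
    """
    grouped = defaultdict(list)
    for chrom, start, end, family in sites:
        grouped[(chrom, family)].append((start, end))

    merged = []
    for (chrom, family), intervals in grouped.items():
        intervals.sort()
        cur_start, cur_end = intervals[0]
        for start, end in intervals[1:]:
            if start < cur_end:
                # Overlap of at least one base: extend the current site.
                cur_end = max(cur_end, end)
            else:
                merged.append((chrom, cur_start, cur_end, family))
                cur_start, cur_end = start, end
        merged.append((chrom, cur_start, cur_end, family))
    return sorted(merged)

# Toy predictions; the family labels are placeholders.
toy_sites = [
    ("chr1", 100, 112, "KLF"),
    ("chr1", 105, 116, "KLF"),
    ("chr1", 110, 125, "NR3"),
    ("chr1", 300, 310, "KLF"),
]
familial_sites = merge_familial_tfbs(toy_sites)
```

Sites from different families are kept separate even when they overlap, so the NR3 site in the toy input is not absorbed into the merged KLF site.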

A command-line tool to evaluate JASPAR TFBS enrichment in genomic regions

A common challenge in the field of transcriptional regulation is to predict the TF(s) most likely to control a set of cis-regulatory regions. This challenge is classically addressed by evaluating the enrichment for potential TFBSs associated with candidate TFs in the genomic regions of interest compared to background regions (16,26,50–53). We previously introduced an enrichment tool that evaluates the enrichment for sets of direct TF–DNA interactions from UniBind in user-provided DNA regions compared to background regions (16). Following the same strategy, we introduce a TFBS enrichment tool to predict TFs with an enrichment of JASPAR TFBSs using the Locus Overlap Analysis (LOLA) tool (54). The enrichment tool is available as a command-line tool (https://jaspar.genereg.net/enrichment/, https://bitbucket.org/CBGR/jaspar_enrichment/). As a use case, we studied the differential enrichment of predicted TFBSs at DNase-seq peaks observed in A549 cells before and after 2 h treatment with 100 nM dexamethasone. DNase-seq is an assay capturing open chromatin regions (55). Dexamethasone is a known agonist of the glucocorticoid receptor (NR3C1), a nuclear receptor that binds the DNA upon ligand-based activation. Figure 3 provides a visual representation of the differential TFBS enrichment analysis results when considering DNase-seq peaks in treated versus untreated cells. As expected, NR3C1 (a member of the Steroid hormone receptors (NR3) family) was the top enriched TF (–log10(P) = 58.77). Among other TFs showing a high enrichment of TFBSs, we observed many members of the Three-zinc finger Kruppel-related family (e.g. KLF factors, SP3, and SP9) (Supplementary Data 2—Supplementary Table S3). 
In another example, we observed the enrichment of TFBSs for the TFs FOXA1 and GATA3 in regions surrounding CpGs that are hypomethylated in estrogen receptor positive (ER+) breast cancers (56) (Supplementary Data 1 - Supplementary Figure S4, Supplementary Data 2 - Supplementary Table S4). These TFs are well-established drivers of ER+ breast cancers that bind to hypomethylated enhancers in these cancers (56).
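
The enrichment statistic reported for each TF is a Fisher exact test P-value. The sketch below shows a one-sided version of that test on a 2x2 table of region counts, implemented with the hypergeometric distribution from the standard library. It is not the LOLA implementation; the function name, the way regions are counted, and the toy numbers are assumptions.

```python
from math import comb, log10

def fisher_enrichment_p(fg_hit, fg_total, bg_hit, bg_total):
    """One-sided Fisher exact P-value for over-representation.

    fg_hit / bg_hit: foreground / background regions overlapping a TFBS of
    the TF of interest; fg_total / bg_total: sizes of the two region sets.
    Returns P(X >= fg_hit) under the hypergeometric null.
    """
    n_total = fg_total + bg_total
    k_total = fg_hit + bg_hit
    upper = min(k_total, fg_total)
    numerator = sum(
        comb(k_total, x) * comb(n_total - k_total, fg_total - x)
        for x in range(fg_hit, upper + 1)
    )
    return numerator / comb(n_total, fg_total)

# Toy counts: 8 of 10 foreground regions versus 2 of 10 background regions.
p_value = fisher_enrichment_p(fg_hit=8, fg_total=10, bg_hit=2, bg_total=10)
neg_log10_p = -log10(p_value)
```

Repeating the test for every profile and plotting the resulting -log10(P) values gives the kind of per-TF ranking shown in the beeswarm plot; with thousands of regions an optimized statistics library would be preferable to exact integer arithmetic.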

Figure 3.

TFBS differential enrichment analysis on DNase-seq data for A549 cells before and after 2 h of dexamethasone treatment. Enrichment significance for each JASPAR profile from the vertebrate CORE collection is shown on the y-axis as -log10(P) in this beeswarm plot. Each point depicts the Fisher exact test P-value (P) corresponding to a TF. The points are colored based on the TF DBD structural family annotation, with a distinct color for each of the top 10 enriched families (see legend). Light yellow represents TF families outside of the top 10 enriched and with -log10(P) > 3 (Other) and brown represents TF families for which -log10(P) ≤ 3 (not significant, N.S.).

pyJASPAR—serverless pythonic interface to JASPAR data

All data is accessible through the JASPAR website (https://jaspar.genereg.net/), its associated RESTful API (https://jaspar.genereg.net/api/) (57), and the JASPAR2022 R/Bioconductor data package (source code at https://github.com/da-bar/JASPAR2022). The JASPAR database can also be accessed using Biopython (58) but it requires a local MySQL server to query the underlying database, which limits its access and use. To make access to JASPAR data easier, we introduce a new Python package, pyJASPAR (59), which allows users to query and access all JASPAR data without setting up the underlying MySQL database. pyJASPAR is implemented in Python 3 using the Biopython motifs module and SQLite3 to provide a serverless Pythonic interface to the JASPAR database. The package allows users to query and access TF binding profiles across various releases of JASPAR. The releases currently available are: JASPAR2014, JASPAR2016, JASPAR2018, JASPAR2020, and JASPAR2022. The pyJASPAR package will be updated when future JASPAR releases become available. TF binding profiles can be retrieved using JASPAR matrix IDs, TF names, or other metadata information (Supplementary Data 1—Text for more details). pyJASPAR is open source and the code is available at https://github.com/asntech/pyjaspar/ under the GPL-3.0 License. The module can easily be installed with Conda from the bioconda channel (https://anaconda.org/bioconda/pyjaspar) (60) or from the Python Package Index with the pip command. Detailed documentation with usage examples is available at https://pyjaspar.rtfd.io/.
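
A minimal usage sketch of the serverless access described above follows. The `jaspardb` class and the `fetch_motif_by_id` and `fetch_motifs_by_name` methods follow the pyJASPAR documentation as we understand it; the wrapper functions, the helper `split_matrix_id`, and the example ID are illustrative, and the package must be installed separately (for example with pip or Conda) before the fetch functions can be called.

```python
def split_matrix_id(matrix_id):
    """Split a JASPAR matrix ID such as 'MA0095.2' into (base_id, version).

    Illustrative helper, not part of pyJASPAR.
    """
    base_id, _, version = matrix_id.partition(".")
    return base_id, int(version) if version else None

def fetch_profile(matrix_id, release="JASPAR2022"):
    """Fetch one TF binding profile by matrix ID using pyJASPAR."""
    # Imported lazily because pyjaspar is a third-party dependency.
    from pyjaspar import jaspardb
    jdb = jaspardb(release=release)
    return jdb.fetch_motif_by_id(matrix_id)

def fetch_profiles_by_name(tf_name, release="JASPAR2022"):
    """Fetch all profiles stored under a TF name using pyJASPAR."""
    from pyjaspar import jaspardb
    jdb = jaspardb(release=release)
    return jdb.fetch_motifs_by_name(tf_name)

if __name__ == "__main__":
    # Requires pyjaspar to be installed; no MySQL server is needed.
    for motif in fetch_profiles_by_name("PAX6"):
        print(motif.matrix_id, motif.name)
```

The returned objects are Biopython JASPAR motif objects, so downstream steps such as computing a PSSM can reuse the Biopython motifs module.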

CONCLUSIONS AND PERSPECTIVES

For the 9th release of the JASPAR database, we substantially expanded the JASPAR CORE collection by 19% (341 added motifs). The newly introduced TF binding profiles were obtained after manual curation of PFMs predicted de novo from >3500 ChIP-seq/-exo datasets (from ReMap 2022 (36) and GEO (61)) or retrieved from publicly available repositories. While we continued our commitment to provide non-redundant, high-quality TF binding profiles for TFs across six taxa, this release comes with an important increase in the number of profiles for urochordata, with 86 PFMs now available, whereas JASPAR had contained a single one since 2006 (27). We now also provide TFBS predictions in Ciona intestinalis using the 86 JASPAR binding profiles. This increase exemplifies how the investigation of transcriptional regulation is expanding across more model organisms. An important question is what fraction of TFs have a binding profile in JASPAR. For humans, the JASPAR vertebrates CORE collection contains a binding profile for 43% of the 1639 human TFs (1), 56% when including the Unvalidated collection. If we consider the 1717 reported TFs for A. thaliana (62), 21% of these TFs have a profile in the JASPAR plants CORE collection, 22% when including the Unvalidated collection. From the previous version of the Unvalidated collection (19), we found literature support for 81 profiles. Unfortunately, our team of curators did not succeed in identifying orthogonal validation in the literature for several high-quality motifs found enriched at ChIP-seq/-exo peak summits. As a result, 298 such profiles were added to the previously introduced Unvalidated collection (19). The lack of experimental support for these profiles indicates an opportunity for the research field to explore these understudied TFs (63). Notably, 61% of the profiles in the vertebrates Unvalidated collection are associated with C2H2 zinc finger factors.
One factor that may hinder obtaining orthogonal evidence is that many zinc-finger TFs, which represent the largest class of TFs, have been reported to regulate a limited number of genes or even a single gene (e.g. Zfp568 (64), ZNF558 (65), ZNF410 (66) and ZFP64 (67)). This JASPAR update comes with a new tool to compute TFBS enrichment given user-provided input and background sequences, mimicking a similar tool available with the UniBind database (16). The tool relies on the genome-wide TFBSs predicted using PFMs from the JASPAR CORE collection. Even though JASPAR predicted TFBSs will contain a high number of false positives, the enrichment tool could be useful to suggest roles for TFs for which no direct TF-DNA interactions are available in UniBind (16). Consistent with Weidemüller et al. (63), we noticed that limited scientific literature (i.e. at most a single manuscript in PubMed) exists for many TFs, which clearly impacts the utility of the JASPAR word clouds. This constraint varies between taxa. For example, while the average number of PubMed manuscripts per vertebrate TF was ∼500, urochordata TFs were associated with an average of only four manuscripts. Furthermore, individual PubMed manuscripts were often associated with a large number of TFs. On average, a PubMed ID was associated with ∼19 vertebrate TFs, and some were associated with hundreds of TFs. Examples include PubMed ID 21873635, which describes methods development of the Gene Ontology database (822 TFs), PubMed ID 12477932, which describes the Mammalian Gene Collection (MGC) Program (805 TFs), and PubMed ID 15618518, which analyzes the expression of TFs in the mouse brain (722 TFs). These manuscripts include general information about TFs. Therefore, we see opportunities to further improve the literature annotation engine by decreasing the influence of outlier manuscripts and incorporating emerging natural language processing methods.
PFMs are still the most widely used models to represent TF binding preferences to DNA, despite their well-established caveats such as their fixed length and their failure to account for nucleotide interdependencies. A new generation of computational models based on machine learning approaches such as deep learning is emerging (68,69). Nevertheless, how best to share these models in a unified manner is still unclear despite some recent efforts (70) and will require discussion in the community. As the field moves towards a unified framework to share such models, we expect their inclusion in future JASPAR releases.

AUTHORS’ NOTE

The authors wish it to be known that, in their opinion, the first three authors should be regarded as joint first authors. The order of co-first authors provided here was decided through a mushroom picking competition around the Sognsvann lake, Oslo, Norway. Co-first authors can prioritise their names when adding this paper's reference to their résumés.

67 in total

1. Compact, universal DNA microarrays to comprehensively determine transcription-factor binding site specificities.

Authors: Michael F Berger; Anthony A Philippakis; Aaron M Qureshi; Fangxue S He; Preston W Estep; Martha L Bulyk
Journal: Nat Biotechnol Date: 2006-09-24 Impact factor: 54.908

2. Database resources of the National Center for Biotechnology Information.

Authors: Eric W Sayers; Jeffrey Beck; Evan E Bolton; Devon Bourexis; James R Brister; Kathi Canese; Donald C Comeau; Kathryn Funk; Sunghwan Kim; William Klimke; Aron Marchler-Bauer; Melissa Landrum; Stacy Lathrop; Zhiyong Lu; Thomas L Madden; Nuala O'Leary; Lon Phan; Sanjida H Rangwala; Valerie A Schneider; Yuri Skripchenko; Jiyao Wang; Jian Ye; Barton W Trawick; Kim D Pruitt; Stephen T Sherry
Journal: Nucleic Acids Res Date: 2021-01-08 Impact factor: 16.971

3. TFBSshape: an expanded motif database for DNA shape features of transcription factor binding sites.

Authors: Tsu-Pei Chiu; Beibei Xin; Nicholas Markarian; Yingfei Wang; Remo Rohs
Journal: Nucleic Acids Res Date: 2020-01-08 Impact factor: 16.971

4. Genome-wide mapping of in vivo protein-DNA interactions.

Authors: David S Johnson; Ali Mortazavi; Richard M Myers; Barbara Wold
Journal: Science Date: 2007-05-31 Impact factor: 47.728

Review 5. Transcription factors: Bridge between cell signaling and gene regulation.

Authors: Paula Weidemüller; Maksim Kholmatov; Evangelia Petsalaki; Judith B Zaugg
Journal: Proteomics Date: 2021-08-09 Impact factor: 3.984

6. ChIP-nexus enables improved detection of in vivo transcription factor binding footprints.

Authors: Qiye He; Jeff Johnston; Julia Zeitlinger
Journal: Nat Biotechnol Date: 2015-03-09 Impact factor: 54.908

7. JASPAR 2020: update of the open-access database of transcription factor binding profiles.

Authors: Oriol Fornes; Jaime A Castro-Mondragon; Aziz Khan; Robin van der Lee; Xi Zhang; Phillip A Richmond; Bhavi P Modi; Solenne Correard; Marius Gheorghe; Damir Baranašić; Walter Santana-Garcia; Ge Tan; Jeanne Chèneby; Benoit Ballester; François Parcy; Albin Sandelin; Boris Lenhard; Wyeth W Wasserman; Anthony Mathelier
Journal: Nucleic Acids Res Date: 2020-01-08 Impact factor: 16.971

8. UniBind: maps of high-confidence direct TF-DNA interactions across nine species.

Authors: Rafael Riudavets Puig; Paul Boddie; Aziz Khan; Jaime Abraham Castro-Mondragon; Anthony Mathelier
Journal: BMC Genomics Date: 2021-06-26 Impact factor: 3.969

9. RSAT 2018: regulatory sequence analysis tools 20th anniversary.

Authors: Nga Thi Thuy Nguyen; Bruno Contreras-Moreira; Jaime A Castro-Mondragon; Walter Santana-Garcia; Raul Ossio; Carla Daniela Robles-Espinoza; Mathieu Bahin; Samuel Collombet; Pierre Vincens; Denis Thieffry; Jacques van Helden; Alejandra Medina-Rivera; Morgane Thomas-Chollier
Journal: Nucleic Acids Res Date: 2018-07-02 Impact factor: 16.971

10. DNA Readout Viewer (DRV): visualization of specificity determining patterns of protein-binding DNA segments.

Authors: Krisztian Adam; Zoltan Gyorgypal; Zoltan Hegedus
Journal: Bioinformatics Date: 2020-04-01 Impact factor: 6.937
