Louis T Dang, Markus Tondl, Man Ho H Chiu, Jerico Revote, Benedict Paten, Vincent Tano, Alex Tokolyi, Florence Besse, Greg Quaife-Ryan, Helen Cumming, Mark J Drvodelic, Michael P Eichenlaub, Jeannette C Hallab, Julian S Stolper, Fernando J Rossello, Marie A Bogoyevitch, David A Jans, Hieu T Nim, Enzo R Porrello, James E Hudson, Mirana Ramialison.
Abstract
BACKGROUND: A strong focus of the post-genomic era is mining the non-coding regulatory genome to unravel the function of the regulatory elements that coordinate gene expression (Nat 489:57-74, 2012; Nat 507:462-70, 2014; Nat 507:455-61, 2014; Nat 518:317-30, 2015). Whole-genome approaches based on next-generation sequencing (NGS) have provided insight into the genomic location of regulatory elements across different cell types, organs and organisms. These technologies are now widespread and commonly used in laboratories from various fields of research. This highlights the need for fast and user-friendly software tools dedicated to extracting the cis-regulatory information contained in these regulatory regions, for instance their transcription factor binding site (TFBS) composition. Ideally, such tools should not require prior programming knowledge, so that they are accessible to all users.
Keywords: Chromatin immunoprecipitation; Motif conservation; Motif discovery; Next generation sequencing; Transcription factor binding site
Year: 2018 PMID: 29621972 PMCID: PMC5887194 DOI: 10.1186/s12864-018-4630-0
Source DB: PubMed Journal: BMC Genomics ISSN: 1471-2164 Impact factor: 3.969
Operating systems and browsers on which 11 users have successfully tested TrawlerWeb
| User | Operating System | Used browser |
|---|---|---|
| 001 | MAC OS X 10.11 | Mozilla Firefox |
| 002 | Windows 10 | Google Chrome |
| 003 | MAC OS X 10.11.6 | Mozilla Firefox |
| 004 | Windows 8.1 | Google Chrome |
| 005 | MAC OS X 10.10 | Mozilla Firefox |
| 006 | Linux Ubuntu 16.04 | Mozilla Firefox |
| 007 | Windows 7 Enterprise | Mozilla Firefox |
| 008 | MAC OS X 10.11 | Google Chrome |
| 009 | MAC OS X 10.9.5 | Google Chrome |
| 010 | Windows 7 Enterprise | Internet Explorer |
| 011 | MAC OS X 10.9.5 | Google Chrome |
Species and genome assemblies supported by TrawlerWeb
| Group | Species | Genome assemblies |
|---|---|---|
| Fish | Medaka | oryLat2 |
| Fish | Zebrafish | danRer7 |
| Fish | Stickleback | gasAcu1 |
| Tetrapods | Human | hg19, hg38 |
| Tetrapods | Mouse | mm9, mm10 |
| Tetrapods | Rat | rn5 |
| Tetrapods | Marmoset | calJac3 |
| Tetrapods | Chicken | galGal3 |
| Tetrapods | Clawed frog | xenTro3 |
| Other eukaryotes | Fruit fly | dm3, dm6 |
| Other eukaryotes | Worm | ce10 |
| Other eukaryotes | Yeast | sacCer3 |
| Other eukaryotes | Thale cress | TAIR9 |
ChIP-seq on transcription factors and genome assemblies used to compare TrawlerWeb, RSAT peak-motifs and MEME-ChIP
| Transcription factor | ChIP-seq GEO accession number | Reference for ChIP-seq | ChIP-seq dataset size (kbp) | Reference for known binding site | Species | Genome |
|---|---|---|---|---|---|---|
| Zic3.2 | GSM1017643 | Winata et al., 2013 | 282 | JASPAR PB0207.1 | Danio rerio | danRer7 |
| TOC1 | GSM878068 | Huang et al., 2012 | 343 | Huang et al., 2012 | Arabidopsis thaliana | TAIR9 |
| MEF2A | GSM1377538 | Houles et al., 2015 | 338 | JASPAR MA0052.3 | Mus musculus | mm9 |
| Su(H) | GSE66225 | Skalska et al., 2015 | 475 | JASPAR MA0085.1 | Drosophila melanogaster | dm3 |
| Sox15.1 | GSM1536045 | Sulahian et al., 2015 | 1783 | JASPAR PB0065.1 | Homo sapiens | hg19 |
Fig. 1 Comparing the performance of TrawlerWeb with other web-based motif discovery tools. a Duration, in minutes, of 11 independent runs for TrawlerWeb (blue), RSAT peak-motifs (green) and MEME-ChIP (red). The mean is represented by the horizontal line for each dataset. The error bars indicate the standard deviation from the mean. The data are ordered by increasing size of the FASTA input file from left to right. Note that MEME-ChIP did not find any motifs for Dr, hence the motif discovered by DREME was used (see also Fig. 2a). b Overall performance benchmark of TrawlerWeb against 7 other algorithms, using 65 ChIP pull-down experiments from the yeast dataset in [38]. MEME-c: MEME algorithm run on conserved regions only. c Comparison of the percentage occurrence of over-represented motifs across the test datasets. Motif discovery was conducted using 4 algorithms (DREME, MEME, RSAT peak-motifs, and TrawlerWeb) on the test datasets, and the number of sequences containing the highest-scoring motif was expressed as a percentage of the total number of analysed input sequences. The MEME-ChIP pipeline uses both the MEME and DREME motif discovery tools, for finding relatively long and short motifs respectively. The MEME algorithm uses a random subsample of 600 sequences. Dr = Danio rerio, At = Arabidopsis thaliana, Mm = Mus musculus, Dm = Drosophila melanogaster, Hs = Homo sapiens
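The percentage occurrence described for panel c can be illustrated with a minimal Python sketch. This is not the code used for the benchmark: it assumes the motif is given as an IUPAC consensus string rather than a position weight matrix, it counts a sequence as a hit if the motif matches on either strand, and the names `percentage_occurrence`, `motif_regex` and `reverse_complement` are hypothetical helpers introduced only for illustration.

```python
import re

# IUPAC nucleotide codes mapped to regular-expression character classes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "S": "[CG]", "W": "[AT]",
    "K": "[GT]", "M": "[AC]", "B": "[CGT]", "D": "[AGT]",
    "H": "[ACT]", "V": "[ACG]", "N": "[ACGT]",
}
# Complement of each IUPAC code (ambiguity codes map to their complements).
COMPLEMENT = str.maketrans("ACGTRYSWKMBDHVN", "TGCAYRSWMKVHDBN")


def reverse_complement(consensus):
    """Return the reverse complement of an IUPAC consensus string."""
    return consensus.upper().translate(COMPLEMENT)[::-1]


def motif_regex(consensus):
    """Compile an IUPAC consensus string into a regular expression."""
    return re.compile("".join(IUPAC[code] for code in consensus.upper()))


def percentage_occurrence(sequences, consensus):
    """Percentage of input sequences containing the motif on either strand.

    Assumption: a simple consensus match stands in for PWM scanning.
    """
    if not sequences:
        return 0.0
    forward = motif_regex(consensus)
    reverse = motif_regex(reverse_complement(consensus))
    hits = sum(
        1
        for seq in sequences
        if forward.search(seq.upper()) or reverse.search(seq.upper())
    )
    return 100.0 * hits / len(sequences)


if __name__ == "__main__":
    # Toy sequences, not taken from any of the benchmark datasets.
    peaks = ["ACGTGGGAA", "TTTTTTTT", "TTCCCACG"]
    print(percentage_occurrence(peaks, "TGGGAA"))
```

Each sequence is counted at most once regardless of how many motif instances it contains, which matches the "number of sequences containing the motif" wording of the caption.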
Fig. 2 Finding the expected motif with TrawlerWeb, RSAT peak-motifs, and MEME-ChIP. Alignment of the closest primary (no.1) and secondary (no.2) motif to the expected binding site identified by each motif discovery tool for the five species a Danio rerio, b Arabidopsis thaliana, c Mus musculus, d Drosophila melanogaster, and e Homo sapiens. f For each tool, Similarity Distance of the closest primary (no.1) and secondary (no.2) motif to the expected binding site. Motifs of 6 nucleotides (nt) in length are shown for Su(H) and Sox15.1, and of 7 nt for MEF2A, TOC1, and Zic3.2. MEME did not find any motif for Zic3.2, so the motif found by DREME was used
Fig. 3 TrawlerWeb output display with conservation scores and UCSC links. a TrawlerWeb displays the Position Weight Matrices (PWMs, pink box), hits against known transcription factor binding site (TFBS) databases (red box), Z-scores of the discovered motifs, and the Conservation Score (green box). b Clicking on the PWM (pink box in (a)) directs the user to the list of putative matches (red box) and provides a direct link to the corresponding TFBS database entry. c Chromosomal positions of the instances of the discovered motif (pink box) in the input peaks are also provided. The average and maximum conservation scores (green box) are available for each instance of the PWM. Clicking on the genomic region of interest (blue box) opens it in the UCSC Genome Browser (d)