Ngoc Tam L Tran, Chun-Hsi Huang.
Abstract
ChIP-Seq (chromatin immunoprecipitation sequencing) offers an advantage for motif finding because ChIP-Seq experiments narrow the search down to binding site locations. Recent motif finding tools facilitate motif detection by providing user-friendly Web interfaces. In this work, we reviewed nine motif finding Web tools that are capable of detecting binding site motifs in ChIP-Seq data. We showed that each motif finding Web tool has its own advantages for detecting motifs that other tools may not discover. We recommend that users apply multiple motif finding Web tools that implement different algorithms in order to obtain significant motifs, overlapping similar motifs, and non-overlapping motifs. Finally, we provided suggestions for the future development of motif finding Web tools that better assist researchers in finding motifs in ChIP-Seq data.
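The recommendation to combine several tools can be illustrated with a minimal sketch that sorts reported motifs into those supported by more than one tool and those reported by a single tool only. This is not the procedure used in the review. The similarity rule (an ungapped alignment of IUPAC consensus strings on either strand), the `min_overlap` threshold, the function names, and the example tool names and motifs are all assumptions made for illustration.

```python
# Hypothetical sketch: compare consensus motifs reported by different tools.
# IUPAC nucleotide codes mapped to the bases they stand for.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}
COMPLEMENT = str.maketrans("ACGTRYSWKMBDHVN", "TGCAYRSWMKVHDBN")


def reverse_complement(consensus):
    """Reverse complement of an IUPAC consensus string."""
    return consensus.translate(COMPLEMENT)[::-1]


def compatible(a, b):
    """Two IUPAC symbols are compatible if they share at least one base."""
    return bool(set(IUPAC[a]) & set(IUPAC[b]))


def best_overlap(m1, m2):
    """Length of the longest ungapped alignment in which all columns are compatible."""
    best = 0
    for offset in range(-(len(m2) - 1), len(m1)):
        pairs = [(m1[i], m2[i - offset])
                 for i in range(len(m1)) if 0 <= i - offset < len(m2)]
        if pairs and all(compatible(a, b) for a, b in pairs):
            best = max(best, len(pairs))
    return best


def similar(m1, m2, min_overlap=6):
    """Assumed rule: motifs are similar if they align on either strand over min_overlap columns."""
    forward = best_overlap(m1, m2)
    reverse = best_overlap(m1, reverse_complement(m2))
    return max(forward, reverse) >= min_overlap


def compare_tools(results, min_overlap=6):
    """Split (tool, motif) pairs into those supported by another tool and the rest."""
    entries = [(tool, motif) for tool, motifs in results.items() for motif in motifs]
    shared, unique = set(), set()
    for tool, motif in entries:
        supported = any(other != tool and similar(motif, m2, min_overlap)
                        for other, m2 in entries)
        (shared if supported else unique).add((tool, motif))
    return shared, unique


# Made-up tool names and motifs, used only to exercise the functions.
example_results = {
    "tool_a": ["CCCTCTAGTGG", "TATAAA"],
    "tool_b": ["CTCTAGTGGC"],
    "tool_c": ["GGGAAATTCC"],
}
shared, unique = compare_tools(example_results)
```

A consensus-string comparison is deliberately crude: degenerate symbols such as N are compatible with every base, so highly degenerate motifs will match almost anything. Comparing position weight matrices with a dedicated motif comparison tool would be a more robust choice in practice.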
Year: 2014 PMID: 24555784 PMCID: PMC4022013 DOI: 10.1186/1745-6150-9-4
Source DB: PubMed Journal: Biol Direct ISSN: 1745-6150 Impact factor: 4.540
A summary of motif finding web tools
| MEME | No | Fasta | = 60000 characters | < 1000 bp | No | Yes | Yes | JASPAR, BLOCKS, UniProbe, …., user database | MEME | Implemented Multiple EM | No | No | No | Yes | No | 2006 | 4.9.1 | [ |
| GLAM2 | No | Fasta | = 60000 characters | = 10000 bp | No | No | No | JASPAR, UniProbe, …., user database | GLAM2 | Implemented novel Gapped Local Alignment of Motifs algorithm | No | No | No | Yes | No | 2008 | 4.9.1 | [ |
| CisFinder | No | Fasta, plain text delimited | Unspecified | = 50 Mb | FDR option | No | Yes | JASPAR, CisView, …., user database | CisFinder | Implemented novel CisFinder algorithm | Yes | No | Optional | Optional | Optional | 2009 | Unspecified | [ |
| W-ChIPMotifs | Yes | Fasta | Unspecified | Unspecified | No | No | No | JASPAR, TRANSFAC, …., user database | W-ChIPMotifs | Used existing ChIPMotifs program and incorporated other existing tools: MEME, MaMF, and Weeder | No | Human and Mouse only | No | Yes | No | 2009 | Unspecified | [ |
| CompleteMOTIFs | Yes | Bed, fasta, gff | = 500000 bp for MEME, Weeder, = 5000000 for ChIPMunk | Unspecified | Yes | Yes for MEME | No | JASPAR, TRANSFAC | CompleteMOTIFs | Integrated existing tools: MEME, Weeder, and ChIPMunk | Yes | Yes | Optional | Optional | Optional | 2011 | Unspecified | [ |
| DREME | No | Fasta | Unspecified | Unspecified | E-value option | No | No | JASPAR, UniProbe, …., user database | DREME | Implemented novel Discriminative Regular Expression Motif Elicitation algorithm (DREME) | No | No | No | Yes | No | 2011 | 4.9.1 | [ |
| MEME-ChIP | Yes | Fasta | Unlimited | Unlimited | E-value option | Yes | Yes | JASPAR, UniProbe, …., user database | MEME-ChIP | Integrated existing tools: MEME and DREME | No | No | No | Yes | No | 2011 | 4.9.1 | [ |
| RSAT peak-motifs | Yes | Raw, multi, tab, fasta, wconsensus, IG | Unlimited | Unlimited | No | Yes | Yes | JASPAR, UniProbe, DMMPMM, RegulonDB, …, user database | RSAT peak-motifs | Implemented RSAT oligo-analysis, RSAT dyad-analysis, RSAT local-word analysis, MEME, ChIPMunk | Yes | No | No | Optional | No | 2012 | Unspecified | [ |
| PScanChIP | No | Bed | Unlimited | 100-150 bp | No | No | No | JASPAR, TRANSFAC | PScanChIP | Used existing Pscan algorithm | Yes | Yes | No | No | No | 2013 | 1.0 | [ |
A summary of peak calling tools
| BayesPeak | BayesPeak algorithm | Used Hidden Markov model (HMM) for finding peaks | 2011 | R and C | Linux, Windows, and Mac OS X | Support multicore | 1.12.0 | N/A | Yes | [ | ||
| BroadPeak | Maximal-segment algorithm, Gibbs sampling algorithm, Ruzzo–Tompa algorithm | Probabilistic model | 2013 | R | N/A | N/A | One version | 2013 | Yes | [ | ||
| CisGenome | Two-pass algorithm | Implemented a modular design, use sliding window for peak detection | 2008 | C, C++ | Windows, Mac, and Linux | Stand-alone system, command mode and GUI | v2.0 | 2011 | Yes | [ | ||
| DROMPA (DRaw and observe Multiple enrichment profiles and annotation) | Sliding window | Two-step procedure, DROMPA peak-calling program | 2013 | ANSI-C | Linux | N/A | 1.4.0 | 2013 | Yes | [ | ||
| F-Seq | F-Seq density estimation algorithm | Kernel density estimation | 2008 | Java | Unix, Linux | N/A | 1.84 | 2011 | Yes | [ | ||
| FindPeaks | Used directional reads module for identifying peaks | Implemented a modular architecture | 2008 | Java | Linux, Windows, and Mac OS X | Command line | 4.0 | N/A | Yes | [ | ||
| GEM (Genome wide Event finding and Motif discovery) | Genome wide event finding and motif discovery (GEM) | Probabilistic model | 2012 | Java | N/A | Stand-alone software | 2.3 | 2013 | Yes | [ | ||
| GLITR (GLobal Identifier of Target Regions) | GLITR algorithm | Used ChIP-Seq Peak Finder framework | 2009 | Perl and Python | N/A | N/A | N/A | N/A | N/A | N/A | [ | |
| GLMNB (Negative binomial generalized linear model) | Sliding window | Generalized Linear Model with Negative binomial distribution | 2012 | N/A | N/A | N/A | 1.0 | 2012 | N/A | [ | ||
| Hpeak (Hidden Markov model (HMM)-based Peak-finding algorithm) | HMM-based algorithm | Hidden Markov Model (HMM) | 2010 | Perl and C++ | Linux, Windows, and Mac OS | N/A | V2.1 | 2009 | N/A | [ | ||
| MACS (Model-based analysis of ChIP-Seq) | MACS algorithm (use shift and sliding window algorithm) | Model-based Analysis of ChIP-Seq | 2008 | Python | Linux | stand-alone, no GUI, open source | 1.4.2 | 2012 | Yes | [ | ||
| NEXT-peak (the normal-exponential two-peak) | NEXT-peak algorithm | Normal-exponential two-peak (NEXT-peak) model | 2013 | C++ | Linux | N/A | 1.1 | 2013 | Yes | [ | ||
| PeakRanger | Same algorithm as PeakSeq for identifying broad regions. Summit-valley-alternator algorithm | Build the read coverage profile | 2011 | C++ | Linux, Mac OS, and Windows | Support parallel cloud computing | 1.16 | 2012 | Yes | [ | ||
| PeakSeq | PeakSeq - two-pass strategy | Two-pass strategy | 2009 | C and Perl | N/A | N/A | 1.1 | 2011 | N/A | [ | ||
| QuEST (Quantitative Enrichment of Sequence Tags) | Construct profiles and use shifting method | Statistical framework-Kernel Density Estimation approach | 2008 | C++ | Linux, Mac OS | Open source, non-profit use | 2.4 | 2009 | No | [ | ||
| SeqSite | Two-step strategy: detect tag-enriched regions and then pinpoint binding sites in the detected regions | Poisson model | 2011 | C/C++ | Windows, Mac OS X, and Linux | Academic use only | 1.1.2 | 2010 | Yes | [ | ||
| SICER | Scoring scheme | Spatial clustering approach | 2009 | Python | Linux, Unix | N/A | v1.1 | 2011 | Yes | [ | ||
| SIPeS (Site Identification from Paired-end Sequencing) | SIPeS algorithm | Used dynamic fragment pileup value for peak calling | 2010 | C | Linux | Non-profit use | 2.0 | 2010 | N/A | [ | ||
| SISSRs (Site Identification from Short Sequence Reads) | Site Identification from Short Sequence Reads (SISSRs) algorithm | Sliding window | 2008 | Perl | Linux, UNIX | N/A | v1.4 | 2008 | N/A | [ | ||
| Sole-Search | Sole-Search program | Implemented several different analysis steps for peak calling | 2010 | Java | N/A | Web-based software | N/A | N/A | N/A | No | [ | |
| T-PIC (Tree shape Peak Identification for ChIP-Seq) | Tree shape Peak Identification for ChIP-Seq (T-PIC) algorithm | Tree-based statistics | 2011 | R and Perl | N/A | N/A | One version | 2011 | N/A | [ | ||
| USeq | Collection of algorithms and software for peak calling | Implemented several different methods for peak calling | 2008 | Java | Linux, Mac OS X, and Windows | GUI | 8.6.6 | 2013 | Yes | [ | ||
| W-ChIPeaks | PELT algorithm and BELT algorithm | Statistical methods control false discovery rate | 2011 | PHP, Perl, Java and C++ | N/A | Web tool | 1.0.1 | 2012 | Yes for BELT only | [ | ||
| ZINBA (Zero-Inflated Negative Binomial Algorithm) | Zero-Inflated Negative Binomial Algorithm (ZINBA) | Statistical framework | 2011 | C and R | Mac OS X and Linux/Unix | Support multi-core clusters | 2.02.03 | 2012 | Yes | [ |
Dataset’s properties
| DM230 | PolII (RNA polymerase II) | GSM722763 | 105 | 157 | 1728 | 47242 | 49 KB | 5 KB | [ |
| DM05 | p300 (co-activator protein) | GSM722762 | 142 | 130 | 1214 | 50318 | 53 KB | 7 KB | [ |
| DM254 | CTCF (insulator binding protein) | GSM722759 | 4009 | 94 | 2374 | 1518265 | 1604 KB | 181 KB | [ |
| DM01 | H3K4me1 (histone H3 lysine 4 monomethylation) | GSM722760 | 2001 | 175 | 8520 | 1856431 | 1871 KB | 88 KB | [ |
| DM721 | H3K27ac (H3 lysine 27 acetylation) | GSM851275 | 4005 | 255 | 16542 | 5429909 | 5423 KB | 180 KB | [ |