THiCweed: fast, sensitive detection of sequence features by clustering big datasets.

Ankit Agrawal¹, Snehal V Sambare¹, Leelavati Narlikar², Rahul Siddharthan¹.

Abstract

We present THiCweed, a new approach to analyzing transcription factor binding data from high-throughput chromatin immunoprecipitation-sequencing (ChIP-seq) experiments. THiCweed clusters bound regions using a divisive hierarchical clustering approach based on sequence similarity within sliding windows, while exploring both strands. THiCweed is specially geared toward data containing mixtures of motifs, which present a challenge to traditional motif-finders. Our implementation is significantly faster than standard motif-finding programs, able to process 30 000 peaks in 1-2 h on a single CPU core of a desktop computer. On synthetic data containing mixtures of motifs it is as accurate as, or more accurate than, all other tested programs. THiCweed performs best with large 'window' sizes (≥50 bp), much longer than typical binding sites (7-15 bp). On real data it successfully recovers literature motifs, but also uncovers complex sequence characteristics in flanking DNA, variant motifs and secondary motifs even when they occur in <5% of the input, all of which appear biologically relevant. We also find recurring sequence patterns across diverse ChIP-seq datasets, possibly related to chromatin architecture and looping. THiCweed thus goes beyond traditional motif finding to give new insights into genomic transcription factor-binding complexity.
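To make the idea of comparing sequences "within sliding windows, while exploring both strands" concrete, here is a minimal Python sketch. It is not THiCweed's implementation: the scoring rule (log-likelihood under per-position base frequencies with a pseudocount) and all names (`column_counts`, `log_likelihood`, `best_placement`) are assumptions chosen for illustration. It only shows how one sequence could be placed against an existing cluster by trying every window offset on both strands.

```python
import math

COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return "".join(COMPLEMENT[b] for b in reversed(seq))


def column_counts(windows, pseudocount=0.5):
    """Per-position base counts for equal-length windows (hypothetical helper)."""
    width = len(windows[0])
    counts = [{b: pseudocount for b in "ACGT"} for _ in range(width)]
    for window in windows:
        for i, base in enumerate(window):
            counts[i][base] += 1
    return counts


def log_likelihood(window, counts):
    """Log-probability of a window under the cluster's column frequencies."""
    total = 0.0
    for i, base in enumerate(window):
        column = counts[i]
        total += math.log(column[base] / sum(column.values()))
    return total


def best_placement(seq, counts):
    """Try every window offset on both strands; return (score, offset, strand)."""
    width = len(counts)
    best = None
    for strand, s in (("+", seq), ("-", reverse_complement(seq))):
        for offset in range(len(s) - width + 1):
            score = log_likelihood(s[offset:offset + width], counts)
            if best is None or score > best[0]:
                best = (score, offset, strand)
    return best


# Illustrative cluster whose members all share the window GATAAG.
cluster = column_counts(["GATAAG"] * 4)
# The query carries the pattern only on its reverse strand (as CTTATC).
placement = best_placement("CCCCTTATCCC", cluster)
```

In this toy example the best placement falls on the reverse strand, which is why a strand-aware search matters for ChIP-seq peaks whose orientation is unknown. A divisive clustering scheme would repeat such placements while splitting clusters, but the splitting criterion is not described in the abstract and is deliberately left out here.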

Year: 2018 PMID: 29267972 PMCID: PMC5861420 DOI: 10.1093/nar/gkx1251

Source DB: PubMed Journal: Nucleic Acids Res ISSN: 0305-1048 Impact factor: 16.971
