Hyunmin Kim, Jihye Kim, Heather Selby, Dexiang Gao, Tiejun Tong, Tzu Lip Phang, Aik Choon Tan.
Abstract
Chromatin immunoprecipitation followed by massively parallel next-generation sequencing (ChIP-seq) is a valuable experimental strategy for assaying protein-DNA interactions across the whole genome. Many computational tools have been designed to find the signal peaks that correspond to protein binding sites. This paper reviews three computational methods used in ChIP-seq data analysis: the ChIP-seq processing pipeline (spp), PeakSeq and CisGenome. It also compares how they agree and disagree in finding peaks, using the publicly available Signal Transducers and Activators of Transcription protein 1 (STAT1) and RNA polymerase II (PolII) datasets with corresponding negative controls.
Year: 2011 PMID: 21296745 PMCID: PMC3525234 DOI: 10.1186/1479-7364-5-2-117
Source DB: PubMed Journal: Hum Genomics ISSN: 1473-9542 Impact factor: 4.639
The number of common peaks identified by the three ChIP-seq analytical tools for Stat1 experimental data
| Prediction | Spp (within 200 bp) | PeakSeq (within 200 bp) | CisGenome (within 200 bp) | Known Stat1 binding sites (stimulated, within 200 bp) |
|---|---|---|---|---|
| Spp | - | 2633 (97%) | 1640 (60%) | 17/28 (61%) |
| PeakSeq | 2579 (46%) | - | 1671 (30%) | 19/28 (68%) |
| CisGenome | 1611 (96%) | 1677 (100%) | - | 12/28 (43%) |
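
The pairwise counts in a table like this depend on which tool's peaks are treated as the query, which is why the percentages are not symmetric. The following is a minimal sketch of one way such a count could be computed; it is not the authors' procedure. It assumes each peak is reduced to a single position, given as a hypothetical `(chromosome, position)` tuple, and that a peak counts as supported when another tool reports a peak within 200 bp on the same chromosome. The function names and the toy coordinates are invented for illustration.

```python
from bisect import bisect_left
from collections import defaultdict


def index_by_chrom(peaks):
    """Group peak positions by chromosome and sort them for binary search."""
    index = defaultdict(list)
    for chrom, pos in peaks:
        index[chrom].append(pos)
    for positions in index.values():
        positions.sort()
    return index


def count_supported(query, reference, max_dist=200):
    """Count query peaks with at least one reference peak within max_dist bp."""
    ref_index = index_by_chrom(reference)
    supported = 0
    for chrom, pos in query:
        positions = ref_index.get(chrom, [])
        # First reference position that is >= pos - max_dist.
        i = bisect_left(positions, pos - max_dist)
        if i < len(positions) and positions[i] <= pos + max_dist:
            supported += 1
    return supported


# Toy coordinates, made up for illustration only (not data from the paper).
tool_a = [("chr1", 1000), ("chr1", 5000), ("chr2", 300)]
tool_b = [("chr1", 1150), ("chr2", 900), ("chr1", 9000), ("chr1", 5200)]

a_in_b = count_supported(tool_a, tool_b)
b_in_a = count_supported(tool_b, tool_a)
print(f"{a_in_b}/{len(tool_a)} peaks of tool A supported by tool B")
print(f"{b_in_a}/{len(tool_b)} peaks of tool B supported by tool A")
```

In this toy case the same number of peaks match in both directions, but the fractions differ because the two tools report different numbers of peaks.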
The number of common peaks identified by the three ChIP-seq analytical tools for PolII experimental data
| Prediction | Support: Spp | Support: PeakSeq | Support: CisGenome | < 1 kbp from TSS | < 1 kbp from TES |
|---|---|---|---|---|---|
| Spp | - | 15,606 (96%) | 8866 (55%) | 12,135 (75%) | 790 (5%) |
| PeakSeq | 10,507 (74%) | - | 7541 (53%) | 9550 (67%) | 1181 (8%) |
| CisGenome | 8203 (89%) | 8961 (97%) | - | 7654 (83%) | 454 (5%) |
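
The last two columns report how many predicted peaks lie less than 1 kbp from a transcription start site (TSS) or transcription end site (TES). As a rough sketch of how such a fraction might be computed, and not the authors' method, the code below assumes peaks are hypothetical `(chromosome, position)` tuples and that annotated sites are supplied as a dictionary of sorted positions per chromosome. The names `nearest_distance` and `fraction_near`, and all coordinates, are invented for illustration.

```python
from bisect import bisect_left


def nearest_distance(sorted_sites, pos):
    """Distance from pos to the closest value in sorted_sites (None if empty)."""
    if not sorted_sites:
        return None
    i = bisect_left(sorted_sites, pos)
    # Only the neighbours on either side of the insertion point can be closest.
    candidates = sorted_sites[max(i - 1, 0):i + 1]
    return min(abs(pos - site) for site in candidates)


def fraction_near(peaks, sites_by_chrom, max_dist=1000):
    """Fraction of peaks lying strictly less than max_dist bp from a site."""
    if not peaks:
        return 0.0
    near = 0
    for chrom, pos in peaks:
        dist = nearest_distance(sites_by_chrom.get(chrom, []), pos)
        if dist is not None and dist < max_dist:
            near += 1
    return near / len(peaks)


# Toy annotation and peaks, made up for illustration only.
tss_sites = {"chr1": [10_000, 50_000]}
toy_peaks = [("chr1", 10_400), ("chr1", 30_000), ("chr1", 50_999), ("chr2", 100)]

print(f"{fraction_near(toy_peaks, tss_sites):.0%} of toy peaks are < 1 kbp from a TSS")
```

The same function could be applied to a TES annotation by passing a different site dictionary.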
Figure 1. Graphical comparison of the three methods on STAT1 (A) and PolII (B) in the University of California Santa Cruz (UCSC) genome browser. The top panel shows the binding sites predicted by the ChIP-seq analysis tools; red, green and blue lanes represent Spp, PeakSeq and CisGenome, respectively. The middle panel shows the aligned tags (with non-unique mapping) of the ChIP-seq data, and inputs were drawn using squish mode. The lower panel shows the known genes at the STAT1 and PolII binding sites.