Song Wu, Jianmin Wang, Wei Zhao, Stanley Pounds, Cheng Cheng.
Abstract
BACKGROUND: ChIP-Seq is a powerful tool for identifying interactions between genomic regulators and the DNA they bind, especially for locating transcription factor binding sites. However, the high cost of the assay and the high false discovery rate among transcription factor binding sites identified from ChIP-Seq data significantly limit its application.
Year: 2010 PMID: 20525272 PMCID: PMC2893127 DOI: 10.1186/1742-4682-7-18
Source DB: PubMed Journal: Theor Biol Med Model ISSN: 1742-4682 Impact factor: 2.432
Figure 1. Tag count distributions from the simulated and real genomic data. A. Simulated forward- and reverse-strand count distribution for a region containing one TF binding site; B. Difference between the forward- and reverse-strand tag counts shown in panel A; C. Forward- and reverse-strand tag count distribution in an example genomic region (from the real data application); D. Difference between the forward- and reverse-strand tag counts shown in panel C.
Figure 2. Sequence of the ChIP-PaM algorithm.
Summary statistics of the ChIP-Seq dataset.
| Chr | Unique reads | | | Copy percentage | | | | |
|---|---|---|---|---|---|---|---|---|
| 1-22, X | 24.3 M | 20.9 M | 0.34 M | 53.3% | 45.9% | 0.75% | 3B | 0.015 |
| Y | 11 K | 11.6 K | 1.7 K | 45.5% | 47.6% | 6.9% | 54.7 M | 0.0004 |
| M | 33 | 3144 | 16.5 K | 0.17% | 15.9% | 83.9% | 16.5 K | 1.191 |
Shown are selected basic characteristics of the data used in the application, comparing chromosomes 1-22 plus X (combined because they are in the 2-copy state), Chr Y, and the mitochondrial chromosome (M). Chr Y is absent from HeLa S3 cells, so reads mapped to it indicate sequencing or mapping errors; mitochondrial chromosomes are located in the cytoplasm and serve as an internal control for the nuclear chromosomes.
Comparison of nuclear and cytoplasmic chromosomes.
| Chr | | Total reads in ChIP | Total reads in input | ChIP/input ratio |
|---|---|---|---|---|
| 1-22, X | Specific reads | 2356286 | 441471 | 5.3373 |
| 1-22, X | Noise reads | 24285371 | 22585024 | 1.0753 |
| 1-22, X | E(background noise) in ChIP* | 4959671 (0.2042) | | |
| M | | 89835 | 409136 | 0.2196 |
*Assuming that reads mapped to the mitochondrial chromosome (M) arise from the experimental procedure rather than from non-specific binding of STAT1, the ChIP/input ratio for this background noise is 21.96%. For chromosomes 1-22 and X, the expected background noise in ChIP is therefore 22585024 × 0.2196 = 4959671 reads, which accounts for 20.42% of the total noise reads.
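The arithmetic in this footnote can be reproduced with a short sketch. The read counts are copied from the table; the function and variable names are illustrative assumptions, not part of ChIP-PaM. The ratio is rounded to four decimals before use, which appears to be how the footnote arrives at 4959671.

```python
# Minimal sketch of the background-noise arithmetic in the footnote.
# Counts come from the table; all names here are hypothetical.

MITO_CHIP = 89835               # reads mapped to M in the ChIP sample
MITO_INPUT = 409136             # reads mapped to M in the input sample
NUCLEAR_NOISE_CHIP = 24285371   # noise reads on chr 1-22, X in ChIP
NUCLEAR_NOISE_INPUT = 22585024  # noise reads on chr 1-22, X in input


def background_ratio(chip_reads: int, input_reads: int) -> float:
    """ChIP/input ratio, rounded to four decimals as in the table."""
    return round(chip_reads / input_reads, 4)


def expected_background(input_noise: int, ratio: float) -> int:
    """Expected number of procedural background reads in ChIP."""
    return round(input_noise * ratio)


ratio = background_ratio(MITO_CHIP, MITO_INPUT)
expected = expected_background(NUCLEAR_NOISE_INPUT, ratio)
fraction = expected / NUCLEAR_NOISE_CHIP

print(f"ChIP/input ratio on M: {ratio:.4f}")
print(f"Expected background reads in ChIP: {expected}")
print(f"Share of ChIP noise reads: {fraction:.4f}")
```

Using the unrounded ratio instead would give a slightly smaller expected count, so the rounding step matters when comparing against the tabulated value.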
Figure 3. Model fitting of the genome-wide tag count histogram. A. Data fitted by the Gamma-Poisson model; B. Data fitted by the Poisson model; C. The detailed right-tail fitting by the Gamma-Poisson model.
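For reference, the standard Gamma-Poisson mixture can be written out as follows. This is the textbook form, given here as an assumption; the record does not state the parameterization the authors used. If the tag count $X$ given a rate $\lambda$ is Poisson and $\lambda$ follows a Gamma distribution with shape $\alpha$ and rate $\beta$, the marginal distribution of $X$ is negative binomial:

$$
P(X = k) = \frac{\Gamma(k + \alpha)}{k!\,\Gamma(\alpha)} \left(\frac{\beta}{1 + \beta}\right)^{\alpha} \left(\frac{1}{1 + \beta}\right)^{k}, \qquad k = 0, 1, 2, \ldots
$$

with

$$
\mathrm{E}[X] = \frac{\alpha}{\beta}, \qquad \mathrm{Var}[X] = \frac{\alpha}{\beta} + \frac{\alpha}{\beta^{2}}.
$$

The variance exceeds the mean, whereas a plain Poisson model forces them to be equal. That extra term is what allows a Gamma-Poisson model to accommodate a heavier right tail in a count histogram.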
Figure 4. Comparison of ChIP-PaM with SISSRs, PeakSeq, and ChIP-PaM using count data only. A. The number of findings on Chr Y is used to compare false positive findings. B. Fourteen known STAT1 GAS target genes are used to compare the true positive findings (i.e., power).
Figure 5. Venn diagram of genes found by the three algorithms.
Figure 6. Examples of two real genomic regions that are identified as significant by SISSRs but not by ChIP-PaM and PeakSeq. A. Chr1: 91625233-91625750; B. Chr1: 121185480-121186959. The red line represents forward-strand counts and the green line represents reverse-strand counts. The positive blue bars represent forward tag reads and the negative blue bars represent reverse tag reads. The broad pink line delineates regions identified as significant by SISSRs.