Daniela M Witten, William Stafford Noble.
Abstract
A growing body of experimental evidence supports the hypothesis that the 3D structure of chromatin in the nucleus is closely linked to important functional processes, including DNA replication and gene regulation. In support of this hypothesis, several research groups have examined sets of functionally associated genomic loci, with the aim of determining whether those loci are statistically significantly colocalized. This work presents a critical assessment of two previously reported analyses, both of which used genome-wide DNA-DNA interaction data from the yeast Saccharomyces cerevisiae, and both of which rely upon a simple notion of the statistical significance of colocalization. We show that these previous analyses rely upon a faulty assumption, and we propose a correct non-parametric resampling approach to the same problem. Applying this approach to the same data set does not support the hypothesis that transcriptionally coregulated genes tend to colocalize, but strongly supports the colocalization of centromeres, and provides some evidence of colocalization of origins of early DNA replication, chromosomal breakpoints and transfer RNAs.
Year: 2012 PMID: 22266657 PMCID: PMC3351188 DOI: 10.1093/nar/gks012
Source DB: PubMed Journal: Nucleic Acids Res ISSN: 0305-1048 Impact factor: 16.971
Figure 1. (a)–(c) show histograms of hypergeometric P-values. In panel (a), the P-values are computed for 1000 random gene sets with respect to the yeast interaction data set of Duan et al. (3). In panel (b), the P-values are computed with respect to a simulated data set for 250 random sets of 100 genes. In (c), the P-values correspond to 174 gene sets regulated by a single transcription factor and studied in (11), computed with respect to the yeast interaction data set of (3). Panels (d)–(f) are analogous to panels (a)–(c), but the P-values are computed using the resampling approach. In each case, the resampling-based P-values provide no evidence of colocalization of gene sets.
The hypergeometric test is based upon a 2 × 2 contingency table of gene pairs.
| | Interaction | No interaction |
|---|---|---|
| In gene set | a | b |
| Not in gene set | c | d |
Each element in the contingency table indicates the number of gene pairs corresponding to the associated row and column. For instance, there are a gene pairs in the gene set for which an interaction was observed, and d gene pairs not in the gene set for which no interaction was observed.
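As a rough illustration of how a P-value could be obtained from such a table, the following minimal sketch computes a one-sided hypergeometric tail probability. It assumes the conventional layout in which a and d are as described above, b counts gene pairs in the gene set with no observed interaction, and c counts gene pairs not in the gene set with an observed interaction. The function name `hypergeom_pvalue` and the toy counts are hypothetical and are not taken from the paper. Note that the abstract argues that analyses based on this test rest on a faulty assumption, so the sketch shows the calculation under assessment rather than a recommended analysis.

```python
from math import comb


def hypergeom_pvalue(a: int, b: int, c: int, d: int) -> float:
    """One-sided hypergeometric P-value for a 2 x 2 table of gene pairs.

    Assumed layout (hypothetical labels for b and c):
        a = in gene set, interaction        b = in gene set, no interaction
        c = not in gene set, interaction    d = not in gene set, no interaction

    Returns P(X >= a), where X is the number of interacting pairs among the
    in-set pairs if those pairs were drawn at random from all gene pairs.
    """
    n_total = a + b + c + d   # all gene pairs
    n_set = a + b             # gene pairs inside the gene set
    n_inter = a + c           # gene pairs with an observed interaction

    denom = comb(n_total, n_set)
    upper = min(n_set, n_inter)  # largest possible value of X

    # Sum the probability mass for X = a, a + 1, ..., upper.
    tail = sum(
        comb(n_inter, k) * comb(n_total - n_inter, n_set - k)
        for k in range(a, upper + 1)
    )
    return tail / denom


if __name__ == "__main__":
    # Toy counts chosen only to exercise the function.
    print(hypergeom_pvalue(2, 0, 0, 2))
```

Exact integer arithmetic via `math.comb` avoids any third-party dependency. For tables with very large counts, a library routine working in log space would likely be faster.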
Figure 2. Based on the yeast interaction data of Duan et al., hypergeometric and resampling-based P-values were computed to assess the extent to which certain functional groups colocalize. The height of each bar indicates enrichment or depletion of observed interchromosomal interactions relative to the percent (black line) of all possible interactions that were observed at a false discovery rate below 0.01. Above each bar, the resampling-based P-value is reported (without correction for multiple testing), and an asterisk indicates that the hypergeometric P-value was below 0.01 after Bonferroni correction. Additional information about the fourteen sets of functional elements can be found in Duan et al. (3).