Jiang Wang, Xianhua Dai, Qian Xiang, Yangyang Deng, Jihua Feng, Zhiming Dai, Caisheng He.
Abstract
Eukaryotic genomes are packaged into chromatin by histone proteins, whose chemical modification can profoundly influence gene expression. Histone modifications often act in combinations, which exert different effects on gene expression. Although a number of experimental techniques and data analysis methods have been developed to study histone modifications, it is still very difficult to identify the relationships among histone modifications on a genome-wide scale.

We proposed a method to identify the combinatorial effects of histone modifications by association rule mining. The method first identified Functional Modification Transactions (FMTs) and then employed an association rule mining algorithm and statistical methods to identify histone modification patterns. We applied the proposed methodology to Pokholok et al's data with eight sets of histone modifications and Kurdistani et al's data with eleven histone acetylation sites. Our method succeeds in revealing two different global views of the histone modification landscape for the two datasets and in identifying a number of modification patterns, some of which are supported by previous studies.

We concentrate on combinatorial effects of histone modifications that significantly affect gene expression. Our method succeeds in identifying known interactions among histone modifications and in uncovering many previously unknown patterns. After in-depth analysis of the possible mechanisms by which histone modification patterns can alter transcriptional states, we infer three possible mechanisms for reading modification patterns ('redundant', 'trivial', 'dominative'). Our results demonstrate several histone modification patterns that show significant correspondence between yeast and human cells.
Keywords: association rule; histone code; yeast
Year: 2010 PMID: 21037963 PMCID: PMC2964047 DOI: 10.4137/ebo.s5602
Source DB: PubMed Journal: Evol Bioinform Online ISSN: 1176-9343 Impact factor: 1.625
Figure 1. The Venn diagram of FMT.
Figure 2. The global view of histone modification of FMTs for Pokholok et al’s data.
Notes: Rows represent transactions, and columns represent sites. To obtain a global view of histone modification across FMTs, we used a sliding window of 10 transactions to calculate the ratio of over-expressed states at each site. The transactions were sorted according to EC scores. The graph shows the over-expressed states of histone modification.
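The sliding-window calculation described in these notes can be sketched roughly as follows. This is a minimal illustration, not the authors' code: the function name `sliding_overexpressed_ratio`, the use of the state codes 1 (under-expressed) and 3 (over-expressed) from the rule tables, the step of one transaction between windows, and the toy matrix are all assumptions made for the example.

```python
from typing import List


def sliding_overexpressed_ratio(states: List[List[int]], window: int = 10) -> List[List[float]]:
    """For each window of consecutive transactions (rows), return the
    per-site fraction of rows whose state is over-expressed (coded 3).

    `states` is assumed to be already sorted by EC score.
    """
    if window <= 0:
        raise ValueError("window must be positive")
    n_rows = len(states)
    if n_rows < window:
        return []
    n_sites = len(states[0])
    ratios = []
    # Assumed step of 1: consecutive windows overlap by window - 1 rows.
    for start in range(n_rows - window + 1):
        block = states[start:start + window]
        ratios.append([
            sum(1 for row in block if row[site] == 3) / window
            for site in range(n_sites)
        ])
    return ratios


# Hypothetical toy data: 12 transactions x 2 sites.
toy = [[3, 1]] * 6 + [[1, 3]] * 6
result = sliding_overexpressed_ratio(toy, window=10)
```

With 12 transactions and a window of 10, the sketch yields three windows, each holding one ratio per site.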
Figure 3. The global view of histone modification of FMTs for Kurdistani et al’s data.
Note: Same as in Figure 2 except that Kurdistani et al’s data is used.
Summary of various datasets.
| Dataset | Description | Strain | Medium |
| --- | --- | --- | --- |
| Pokholok et al (2005) | A subset of Pokholok et al’s dataset with eight sets of histone modifications under the YPD condition | W303 | YPD |
| Kurdistani et al (2004) | A dataset with eleven acetylation sites | YDS2 wt | YEPD (YPD) |
| Harbison et al (2004) | Transcription factor binding activity | W303 | YPD (rich medium) |
| Gasch et al (2000) | Environmental stresses | Diversity | Diversity |
| Spellman et al (1998) | Cell cycle | Diversity | YEP medium |
Notes: ‘YEPD’ is often abbreviated as YPD, which corresponds to rich medium. ‘YEP medium’ is based upon YPD but is without dextrose. ‘Diversity’ indicates diverse conditions.
Rules extracted from Pokholok et al’s data.
| **Rule1** | | | 1.2 | 100 | 313 | 5.79E-07 | <0.001 |
| **Rule2** | | | 1.2 | 100 | 281.7 | 1.77E-05 | <0.001 |
| **Rule3** | | | 1.1 | 100 | 281.7 | 1.44E-05 | 0.001 |
| **Rule4** | | | 0.9 | 100 | 281.7 | 6.17E-05 | 0.001 |
| Rule5 | 13 71 33 63 | 51 | 0.7 | 100 | 313 | 7.20E-04 | <0.001 |
| Rule6 | 13 71 23 63 | 51 | 0.7 | 100 | 313 | 6.85E-04 | <0.001 |
| Rule7 | 21 41 13 | 81 | 0.6 | 100 | 410.8 | 1.99E-05 | <0.001 |
| Rule8 | 41 13 71 33 | 51 | 0.4 | 100 | 313 | 1.57E-02 | 0.001 |
| **Rule9** | | | 1.4 | 94.1 | 265.1 | 4.97E-05 | <0.001 |
| **Rule10** | | | 1.2 | 93.3 | 262.9 | 5.13E-05 | 0.007 |
| **Rule11** | | | 1.2 | 93.3 | 262.9 | 2.96E-05 | <0.001 |
| **Rule12** | | | 0.7 | 88.9 | 278.2 | 1.16E-02 | 0.002 |
| Rule13 | 81 33 23 63 | 51 | 0.7 | 88.9 | 278.2 | 1.25E-03 | <0.001 |
| Rule14 | 41 31 73 | 83 | 0.4 | 83.3 | 216.2 | 3.58E-02 | 0.005 |
| **Rule15** | | | 0.7 | 80 | 308.3 | 2.44E-02 | 0.038 |
| **Rule16** | | | 0.7 | 80 | 249.7 | 2.91E-03 | <0.001 |
| Rule17 | 61 21 41 31 | 53 | 0.7 | 80 | 249.7 | 1.20E-03 | <0.001 |
| Rule18 | 21 73 51 43 | 63 | 0.7 | 80 | 225.3 | 2.10E-02 | 0.003 |
| Rule19 | 61 21 31 51 | 71 | 0.7 | 80 | 269.6 | 1.03E-04 | <0.001 |
| Rule20 | 81 33 43 | 71 | 0.7 | 80 | 269.6 | 1.16E-03 | 0.005 |
| **Rule21** | | | 0.7 | 80 | 207.5 | 1.14E-03 | 0.004 |
Notes: Each number in the antecedent and consequent columns encodes a site and a state. The units digit gives the state: 1, under-expressed; 3, over-expressed. The remaining digit gives the histone modification site, numbered from 1 to 8 for H3K9ac, H3K14ac, H4ac, H3K4me1, H3K4me2, H3K4me3, H3K36me3 and H3K79me3, respectively. Eleven rules (in boldface) can be explained by previous studies.
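The item encoding described in these notes can be decoded mechanically. The following is a small sketch under stated assumptions: the helper names `decode_item` and `decode_rule` are hypothetical, and only the site order and state codes are taken from the notes.

```python
# Site order as listed in the table notes for Pokholok et al's data.
POKHOLOK_SITES = [
    "H3K9ac", "H3K14ac", "H4ac", "H3K4me1",
    "H3K4me2", "H3K4me3", "H3K36me3", "H3K79me3",
]
STATES = {"1": "under-expressed", "3": "over-expressed"}


def decode_item(code, sites=POKHOLOK_SITES):
    """Split an item code such as '13' into (site name, state).

    The last digit is the state; the leading digit(s) are the 1-based site index.
    """
    site_index, state_digit = int(code[:-1]), code[-1]
    if state_digit not in STATES or not 1 <= site_index <= len(sites):
        raise ValueError(f"unrecognised item code: {code!r}")
    return sites[site_index - 1], STATES[state_digit]


def decode_rule(antecedent, consequent, sites=POKHOLOK_SITES):
    """Decode a space-separated antecedent and a single consequent item."""
    return [decode_item(c, sites) for c in antecedent.split()], decode_item(consequent, sites)


# Rule7 from the table: antecedent "21 41 13", consequent "81".
lhs, rhs = decode_rule("21 41 13", "81")
```

Read this way, Rule7 states that under-expressed H3K14ac and H3K4me1 together with over-expressed H3K9ac imply under-expressed H3K79me3.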
Figure 4. Fraction of over-expressed or under-expressed states at sites from the 21 extracted rules for Pokholok et al’s data.
Notes: Over-expressed state corresponds to ‘X3’ and under-expressed state corresponds to ‘X1’ in the extracted rules, where ‘X’ is the column number.
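One way to tabulate such fractions is sketched below. It rests on assumptions: the fraction is taken here as the share of rules in which a site appears in a given state (antecedent or consequent), which may differ from the definition behind the figure, and the function name `state_fractions` is hypothetical. The input is limited to the ten rules whose antecedent and consequent are listed in the rules table, so the output does not reproduce the figure.

```python
from collections import Counter

# (antecedent, consequent) for Rules 5-8, 13, 14 and 17-20 of the Pokholok table.
RULES = [
    ("13 71 33 63", "51"), ("13 71 23 63", "51"), ("21 41 13", "81"),
    ("41 13 71 33", "51"), ("81 33 23 63", "51"), ("41 31 73", "83"),
    ("61 21 41 31", "53"), ("21 73 51 43", "63"), ("61 21 31 51", "71"),
    ("81 33 43", "71"),
]


def state_fractions(rules, n_sites=8):
    """Return {site: (fraction over-expressed, fraction under-expressed)}.

    Assumed definition: number of rules containing 'X3' (or 'X1') for
    site X, divided by the total number of rules.
    """
    over, under = Counter(), Counter()
    for antecedent, consequent in rules:
        # Use a set so an item is counted at most once per rule.
        for code in set(antecedent.split() + [consequent]):
            site, state = int(code[:-1]), code[-1]
            if state == "3":
                over[site] += 1
            elif state == "1":
                under[site] += 1
    total = len(rules)
    return {s: (over[s] / total, under[s] / total) for s in range(1, n_sites + 1)}


fractions = state_fractions(RULES)
```

In this subset, for example, site 1 (H3K9ac) appears only in the over-expressed state, while site 5 (H3K4me2) appears mostly in the under-expressed state.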
Rules extracted from Kurdistani et al’s data.
| **Rule1** | | | 6.1 | 100 | 209.9 | 8.95E-30 | <0.001 |
| **Rule2** | | | 3.8 | 100 | 209.9 | 6.66E-19 | <0.001 |
| **Rule3** | | | 2.5 | 100 | 209.9 | 4.32E-13 | <0.001 |
| **Rule4** | | | 2.4 | 100 | 209.9 | 2.43E-12 | <0.001 |
| Rule5 | 41 21 111 | 71 | 2.4 | 84.2 | 290.1 | 9.00E-20 | 0.002 |
| **Rule6** | | | 2.4 | 80 | 464.9 | 5.48E-23 | <0.001 |
| Rule7 | 41 11 111 | 71 | 2.4 | 80 | 275.6 | 3.16E-17 | 0.018 |
| **Rule8** | | | 2.2 | 100 | 209.9 | 1.09E-11 | <0.001 |
| **Rule9** | | | 2.2 | 100 | 209.9 | 9.27E-12 | <0.001 |
| Rule10 | 41 21 11 | 71 | 2.2 | 100 | 344.5 | 2.20E-23 | <0.001 |
| **Rule11** | | | 2.1 | 100 | 209.9 | 5.16E-11 | <0.001 |
| **Rule12** | | | 2.1 | 100 | 209.9 | 1.86E-11 | <0.001 |
| **Rule13** | | | 2 | 100 | 209.9 | 1.75E-10 | <0.001 |
| **Rule14** | | | 1.9 | 100 | 209.9 | 2.83E-10 | <0.001 |
| **Rule15** | | | 1.8 | 100 | 209.9 | 2.17E-09 | <0.001 |
| **Rule16** | | | 1.7 | 100 | 209.9 | 5.10E-09 | <0.001 |
| **Rule17** | | | 1.6 | 100 | 209.9 | 8.28E-09 | <0.001 |
| **Rule18** | | | 1.6 | 100 | 209.9 | 2.34E-08 | <0.001 |
| **Rule19** | | | 1.6 | 91.3 | 191.7 | 3.48E-06 | <0.001 |
| **Rule20** | | | 1.5 | 100 | 209.9 | 5.36E-08 | <0.001 |
| **Rule21** | | | 1.5 | 100 | 209.9 | 3.95E-08 | <0.001 |
| **Rule22** | | | 1.4 | 100 | 209.9 | 1.65E-07 | <0.001 |
| **Rule23** | | | 1.3 | 90 | 1007.1 | 3.02E-25 | <0.001 |
| **Rule24** | | | 1.3 | 90 | 1007.1 | 1.64E-25 | <0.001 |
| Rule25 | 33 91 | 41 | 1.2 | 100 | 543.8 | 6.32E-13 | 0.002 |
| **Rule26** | | | 1.2 | 100 | 1119 | 3.38E-23 | <0.001 |
| **Rule27** | | | 1.2 | 88.9 | 186.6 | 1.30E-04 | <0.001 |
| **Rule28** | | | 1.2 | 80 | 464.9 | 4.08E-13 | <0.001 |
| **Rule29** | | | 1 | 100 | 209.9 | 2.53E-05 | <0.001 |
| **Rule30** | | | 1 | 92.9 | 194.9 | 1.74E-04 | <0.001 |
| Rule31 | 33 81 | 41 | 1 | 82.4 | 447.8 | 2.89E-08 | 0.032 |
| **Rule32** | | | 1 | 81.3 | 909.2 | 1.40E-18 | <0.001 |
Notes: Each number in the antecedent and consequent columns encodes a site and a state. The units digit gives the state: 1, under-expressed; 3, over-expressed. The remaining digits give the site, numbered from 1 to 11 for H4K8, H4K12, H4K16, H3K9, H3K14, H3K18, H3K23, H3K27, H2AK7, H2BK11 and H2BK16, respectively. Twenty-seven rules (in boldface) are consistent with the role of unacetylation of H4K16.
Figure 5. Fraction of over-expressed or under-expressed states at sites from the 69 extracted rules for Kurdistani et al’s data.
Notes: Same as in Figure 3. Because items of the form ‘6X’ (X = 1 or 3) were eliminated from the extracted rules, there are no values for the H3K18 site (the ‘6X’ site).