Noam Kaplan, Timothy R Hughes, Jason D Lieb, Jonathan Widom, Eran Segal.
Abstract
We propose definitions and procedures for comparing nucleosome maps and discuss current agreement and disagreement on the effect of histone sequence preferences on nucleosome organization in vivo.
Year: 2010 PMID: 21118582 PMCID: PMC3156944 DOI: 10.1186/gb-2010-11-11-140
Source DB: PubMed Journal: Genome Biol ISSN: 1474-7596 Impact factor: 13.583
Figure 1. Illustration of the proposed definitions. (a) Four nucleosome configurations, which, together with their respective probabilities, constitute a nucleosome organization. (b-d) The derived measures: (b) nucleosome occupancy, (c) absolute positioning and (d) conditional positioning. The configuration with a probability of 0.4 is weighted twice as heavily as each of the other configurations in the derived occupancy (b) and positioning (c, d) measures. Note that the rightmost nucleosome, which appears in the same position in two of the four configurations, has a relatively low absolute positioning value (0.4 in (c)) but a high conditional positioning value (1 in (d)), whereas the leftmost nucleosome is relatively well positioned by both the absolute and the conditional positioning measures (0.6 in (c) and 0.75 in (d)). Also note that, because another nucleosome lies close to the leftmost nucleosome, the rightmost nucleosome has a higher conditional positioning value than the leftmost nucleosome (d), even though the rightmost nucleosome has a lower overall probability across the four nucleosome configurations. Red boxes in (d) mark regions in which the conditional positioning is undefined.
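The three measures in Figure 1 can be made concrete with a small numerical sketch. The code below is illustrative only and rests on several assumptions that are not stated in the caption: conditional positioning is taken to be the probability that a nucleosome starts at a base pair divided by the probability that a nucleosome starts anywhere in a window centered on that base pair; the half-window of 73 bp, the 147-bp nucleosome length and all start coordinates are hypothetical, chosen only so that the toy organization reproduces the values quoted in the caption (0.6 and 0.75 for the leftmost nucleosome, 0.4 and 1 for the rightmost).

```python
import numpy as np

NUC_LEN = 147      # assumed length of DNA covered by one nucleosome (bp)
GENOME_LEN = 600   # hypothetical toy region

# A nucleosome organization: configurations (lists of nucleosome start
# coordinates) with their probabilities. Coordinates are hypothetical.
configs = [[50], [50, 400], [70, 400], [250]]
probs = [0.4, 0.2, 0.2, 0.2]


def occupancy(configs, probs, genome_len, nuc_len=NUC_LEN):
    """Probability that each base pair is covered by some nucleosome."""
    occ = np.zeros(genome_len)
    for starts, p in zip(configs, probs):
        for s in starts:
            occ[s:s + nuc_len] += p
    return occ


def absolute_positioning(configs, probs, genome_len):
    """Probability that a nucleosome starts at each base pair."""
    pos = np.zeros(genome_len)
    for starts, p in zip(configs, probs):
        for s in starts:
            pos[s] += p
    return pos


def conditional_positioning(abs_pos, half_window=73):
    """Start probability divided by the total start probability in a
    centered window (assumed definition); NaN where that total is zero,
    i.e. where the measure is undefined."""
    kernel = np.ones(2 * half_window + 1)
    window_sum = np.convolve(abs_pos, kernel, mode="same")
    cond = np.full(abs_pos.shape, np.nan)
    defined = window_sum > 1e-12
    cond[defined] = abs_pos[defined] / window_sum[defined]
    return cond


occ = occupancy(configs, probs, GENOME_LEN)
abs_pos = absolute_positioning(configs, probs, GENOME_LEN)
cond = conditional_positioning(abs_pos)

print("leftmost :", abs_pos[50], cond[50])
print("rightmost:", abs_pos[400], cond[400])
```

In this toy organization the nearby start at coordinate 70 lowers the conditional positioning of the leftmost nucleosome, while the rightmost nucleosome has no competing start in its window, which mirrors the contrast described in the caption.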
Figure 2. The effect of the number of sequence reads on the comparison of nucleosome maps using different measures. (a, b) We randomly sampled in vivo nucleosome data from yeast at different levels of genomic coverage [10]. At each level, five pairs of nucleosome maps were generated and, for each map, we estimated five quantities: nucleosome occupancy, absolute nucleosome positioning (without smoothing), conditional positioning (without smoothing), smoothed absolute positioning and smoothed conditional positioning (both smoothed using a Gaussian filter with a 20-bp standard deviation). For each pair of maps and each estimated measure, the Pearson correlation between the two maps was computed. This simulates the comparison of two replicates with the same level of coverage and thus shows the difference between two random samples from the same experiment with the same number of reads. The black arrow indicates an average read number beyond the scale of the y-axis. (b) An expansion of (a) for low numbers of reads. The standard deviation at all plotted points is smaller than 0.001. The coverage of the full in vivo map was about 2.2 nucleosome read starts per base pair. This simulation addresses only the error introduced by sampling and does not simulate the effect of other sources of error in the experiments, such as variability in the extracted lengths of nucleosome-protected sequences, to which measures such as positioning are especially sensitive. Vertical dashed lines indicate the approximate number of uniquely mapped reads in various studies [5,10-12,18,24], suggesting that the sequencing coverage in several of these studies might lead to underestimation of similarities among maps, depending on the estimated quantity.
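The resampling procedure behind Figure 2 can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' pipeline: the yeast data are replaced by a synthetic landscape of read-start probabilities (`toy_landscape`, a hypothetical helper), replicates are drawn by multinomial sampling, occupancy is obtained by extending each read start by an assumed 147 bp, and only three of the five measures are shown (conditional positioning is omitted for brevity). The 20-bp Gaussian standard deviation is the only parameter taken from the caption.

```python
import numpy as np

NUC_LEN = 147  # assumed nucleosome footprint (bp)


def toy_landscape(genome_len, rng, spacing=165, fuzz_sd=10.0):
    """Synthetic read-start probabilities: one fuzzy peak per nucleosome.
    Spacing, fuzziness and peak heights are arbitrary illustrative values."""
    x = np.arange(genome_len)
    centers = np.arange(spacing // 2, genome_len - NUC_LEN, spacing)
    heights = rng.gamma(shape=2.0, size=centers.size)
    probs = np.zeros(genome_len)
    for c, h in zip(centers, heights):
        probs += h * np.exp(-0.5 * ((x - c) / fuzz_sd) ** 2)
    return probs / probs.sum()


def occupancy_from_starts(starts, nuc_len=NUC_LEN):
    """Number of reads covering each bp when every read start is
    extended downstream by nuc_len bp."""
    return np.convolve(starts, np.ones(nuc_len), mode="full")[:starts.size]


def gaussian_smooth(x, sd=20.0):
    """Smooth with a Gaussian kernel truncated at four standard deviations."""
    radius = int(4 * sd)
    t = np.arange(-radius, radius + 1)
    kernel = np.exp(-0.5 * (t / sd) ** 2)
    kernel /= kernel.sum()
    return np.convolve(x, kernel, mode="same")


MEASURES = {
    "occupancy": occupancy_from_starts,
    "absolute positioning": lambda s: s / s.sum(),
    "smoothed absolute positioning": lambda s: gaussian_smooth(s / s.sum()),
}


def replicate_correlations(start_probs, n_reads, rng):
    """Draw two replicate maps with n_reads each and return the Pearson
    correlation between them for every measure."""
    rep_a = rng.multinomial(n_reads, start_probs).astype(float)
    rep_b = rng.multinomial(n_reads, start_probs).astype(float)
    return {
        name: float(np.corrcoef(measure(rep_a), measure(rep_b))[0, 1])
        for name, measure in MEASURES.items()
    }


rng = np.random.default_rng(0)
landscape = toy_landscape(20_000, rng)
results = {n: replicate_correlations(landscape, n, rng)
           for n in (2_000, 200_000)}
for n_reads, corr in results.items():
    print(n_reads, {name: round(value, 3) for name, value in corr.items()})
```

On such a synthetic landscape, the between-replicate correlation of unsmoothed positioning is expected to fall off much faster than that of occupancy as the read count drops, which is the qualitative effect the figure describes; the exact values depend entirely on the assumed landscape.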