Nelle Varoquaux, Ferhat Ay, William Stafford Noble, Jean-Philippe Vert.
Abstract
MOTIVATION: Recent technological advances allow the measurement, in a single Hi-C experiment, of the frequencies of physical contacts among pairs of genomic loci at a genome-wide scale. The next challenge is to infer, from the resulting DNA-DNA contact maps, accurate 3D models of how chromosomes fold and fit into the nucleus. Many existing inference methods rely on multidimensional scaling (MDS), in which the pairwise distances of the inferred model are optimized to resemble pairwise distances derived directly from the contact counts. These approaches, however, often optimize a heuristic objective function and require strong assumptions about the biophysics of DNA to transform interaction frequencies to spatial distance, and thereby may lead to incorrect structure reconstruction.
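To make the MDS idea concrete, the following is a minimal sketch, not the method of any of the tools evaluated here. It assumes a power-law mapping from counts to distances, d = c^(1/α) with α = -3 (an assumption of this sketch), and it uses classical (Torgerson) MDS via eigendecomposition in place of the iterative stress minimization that MDS-based tools typically use. The function names are hypothetical.

```python
import numpy as np


def counts_to_distances(counts, alpha=-3.0):
    """Map contact counts to target distances via d = c ** (1 / alpha).

    The power-law form and alpha = -3 are assumptions of this sketch.
    Zero counts carry no distance information and are returned as NaN.
    """
    counts = np.asarray(counts, dtype=float)
    dist = np.full(counts.shape, np.nan)
    mask = counts > 0
    dist[mask] = counts[mask] ** (1.0 / alpha)
    np.fill_diagonal(dist, 0.0)
    return dist


def classical_mds(dist, n_dims=3):
    """Embed points from a complete distance matrix by classical MDS."""
    n = dist.shape[0]
    centering = np.eye(n) - np.ones((n, n)) / n
    # Double-centering turns squared distances into a Gram matrix.
    gram = -0.5 * centering @ (dist ** 2) @ centering
    eigvals, eigvecs = np.linalg.eigh(gram)
    top = np.argsort(eigvals)[::-1][:n_dims]
    scale = np.sqrt(np.clip(eigvals[top], 0.0, None))
    return eigvecs[:, top] * scale


def pairwise_distances(coords):
    """Euclidean distance between every pair of rows in coords."""
    diff = coords[:, None, :] - coords[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=-1))


# Toy example: noise-free counts generated from a known structure.
rng = np.random.default_rng(0)
true_coords = rng.normal(size=(20, 3))
true_dist = pairwise_distances(true_coords)
off_diagonal = ~np.eye(20, dtype=bool)
counts = np.zeros_like(true_dist)
counts[off_diagonal] = true_dist[off_diagonal] ** -3.0

inferred = classical_mds(counts_to_distances(counts))
# The structure is recovered only up to rotation, translation and reflection,
# so the comparison is made on pairwise distances.
print(np.allclose(pairwise_distances(inferred), true_dist, atol=1e-6))
```

With real Hi-C data the counts are noisy and many are zero, so the distance matrix is incomplete and the chosen α directly shapes the result. This dependence on the count-to-distance mapping is the weakness the abstract points to.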
Year: 2014 PMID: 24931992 PMCID: PMC4229903 DOI: 10.1093/bioinformatics/btu268
Source DB: PubMed Journal: Bioinformatics ISSN: 1367-4803 Impact factor: 6.937
Fig. 1. Performance evaluation on simulated data, varying the parameter β. (A) RMSD of each experiment for varying values of the parameter β. ChromSDE failed to yield consistent results for 14 experiments (it reported the wrong number of beads in the results file), and the PM2 algorithm failed to converge at the desired precision for one experiment (it exceeded the maximum number of iterations). (B) Distance error of each experiment for varying values of β. (C) Average SNR for each β. Higher SNR corresponds to better quality data.
Fig. 2. Performance evaluation for simulated data, varying the parameter α. The figure plots the average RMSD of the inferred structures for a range of α values. As α increases, the SNR of the dataset also increases.
Stability across enzyme replicates
| Resolution | Corr (data) | MDS1 RMSD | MDS1 Corr | MDS2 RMSD | MDS2 Corr | NMDS RMSD | NMDS Corr | PM1 RMSD | PM1 Corr | PM2 RMSD | PM2 Corr |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 Mb | 0.981 | 13.13 | 0.945 | 5.54 | 0.964 | 5.80 | 0.965 | 7.28 | 0.931 | | |
| 500 kb | 0.959 | 10.00 | 0.942 | 5.68 | 0.959 | 5.67 | 0.959 | 7.14 | 0.913 | | |
| 200 kb | 0.845 | 5.64 | 0.940 | 3.74 | 0.945 | 3.73 | 0.946 | 4.01 | 0.891 | | |
| 100 kb | 0.605 | 5.07 | 0.736 | 2.53 | 0.676 | 2.52 | 0.666 | | 0.664 | 2.76 | |
Note: For each resolution, the table lists the Spearman correlation between the two enzyme replicate datasets and, for each inference method, the average RMSD and Spearman correlation between pairs of structures inferred from the two datasets. Boldface values correspond to the best RMSD or correlation values among all five methods. In general, higher resolution leads to a lower correlation between pairs of inferred structures.
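The two stability measures in this table can be sketched as follows. This is an illustration under stated assumptions, not the evaluation code behind the table: RMSD is computed after optimal rigid superposition with the Kabsch algorithm (any rescaling or mirroring step is not stated here and is left out), and the Spearman correlation is taken between the pairwise distances of the two structures, with ties ignored. The function names are hypothetical.

```python
import numpy as np


def pairwise_distances(coords):
    """Euclidean distance between every pair of rows in coords."""
    diff = coords[:, None, :] - coords[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=-1))


def kabsch_rmsd(x, y):
    """RMSD between two (n, 3) structures after optimal rigid superposition."""
    x_c = x - x.mean(axis=0)
    y_c = y - y.mean(axis=0)
    u, _, vt = np.linalg.svd(x_c.T @ y_c)
    # Flip the last axis if needed so the result is a proper rotation.
    d = np.sign(np.linalg.det(u @ vt))
    rot = u @ np.diag([1.0, 1.0, d]) @ vt
    return np.sqrt(np.mean(np.sum((x_c @ rot - y_c) ** 2, axis=1)))


def spearman_of_distances(x, y):
    """Spearman correlation between the pairwise distances of x and y.

    Ranks are computed without tie handling, which assumes that all
    distances within a structure are distinct.
    """
    upper = np.triu_indices(len(x), k=1)
    rank_x = np.argsort(np.argsort(pairwise_distances(x)[upper]))
    rank_y = np.argsort(np.argsort(pairwise_distances(y)[upper]))
    return np.corrcoef(rank_x, rank_y)[0, 1]


# Toy structures: b is a rotated and shifted copy of a, c is a noisy copy.
rng = np.random.default_rng(0)
structure_a = rng.normal(size=(30, 3))
theta = 0.7
rotation = np.array([
    [np.cos(theta), -np.sin(theta), 0.0],
    [np.sin(theta), np.cos(theta), 0.0],
    [0.0, 0.0, 1.0],
])
structure_b = structure_a @ rotation.T + np.array([1.0, -2.0, 0.5])
structure_c = structure_a + 0.1 * rng.normal(size=(30, 3))

print(kabsch_rmsd(structure_a, structure_b))            # close to zero
print(spearman_of_distances(structure_a, structure_b))  # close to one
print(kabsch_rmsd(structure_a, structure_c))            # small but non-zero
```

RMSD depends on the scale of the coordinates, whereas the rank correlation does not, which is one reason to report both.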
Fig. 3. Predicted structures for chromosome 1 at different resolutions. Contact count matrices and predicted structures for the MDS2, NMDS, PM1 and PM2 methods at 1 Mb (A), 500 kb (B), 200 kb (C) and 100 kb (D).
Stability across resolution
| Measure | MDS1 | MDS2 | NMDS | PM1 | PM2 |
|---|---|---|---|---|---|
| RMSD | 14.86 | 12.92 | 12.98 | 13.03 | |
| Corr | 0.781 | 0.754 | 0.738 | 0.737 | |
Note: The table lists the average RMSD and Spearman correlation between pairs of structures of different resolutions. In bold are the lowest average RMSD and highest average Spearman correlation. These values were computed on mouse ESC HindIII libraries [Dixon ].
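Structures inferred at different resolutions have different numbers of beads, so they must be brought onto a common set of loci before they can be compared. The note above does not say how this was done. One simple possibility, an assumption of this sketch and not necessarily the authors' procedure, is to average consecutive beads of the finer structure so that each averaged bead covers one bin of the coarser structure. The function name `coarsen_structure` is hypothetical.

```python
import numpy as np


def coarsen_structure(coords, factor):
    """Average each run of `factor` consecutive beads into one bead.

    Trailing beads that do not fill a complete group are averaged on
    their own, so no bead is discarded.
    """
    coords = np.asarray(coords, dtype=float)
    starts = np.arange(0, coords.shape[0], factor)
    return np.array([coords[s:s + factor].mean(axis=0) for s in starts])


# Toy example: 12 beads at 100 kb become 3 beads at 500 kb (factor 5).
rng = np.random.default_rng(0)
fine = rng.normal(size=(12, 3))
coarse = coarsen_structure(fine, factor=5)
print(coarse.shape)
```

After coarsening, the two structures have matching bead counts and can be compared with RMSD or with a rank correlation of pairwise distances.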