Mikhail D Magnitov, Veronika S Kuznetsova, Sergey V Ulianov, Sergey V Razin, Alexander V Tyakht.
Abstract
MOTIVATION: The application of genome-wide chromosome conformation capture (3C) methods to prokaryotes provided insights into the spatial organization of their genomes and identified patterns conserved across the tree of life, such as chromatin compartments and contact domains. Prokaryotic genomes vary in GC content and the density of restriction sites along the chromosome, suggesting that these properties should be considered when planning experiments and choosing appropriate software for data processing. Diverse algorithms are available for the analysis of eukaryotic chromatin contact maps, but their potential application to prokaryotic data has not yet been evaluated.
Year: 2020 PMID: 32492116 PMCID: PMC7653553 DOI: 10.1093/bioinformatics/btaa555
Source DB: PubMed Journal: Bioinformatics ISSN: 1367-4803 Impact factor: 6.937
Fig. 1. Simulated contact maps reflect the properties of experimental data. (A) Scatterplot and LOWESS approximation between genomic GC content and median restriction fragment sizes generated by four restriction enzymes (with varying GC content and recognition-site length) and their combinations. (B) Normalized experimental (top row) and simulated (bottom row) Hi-C contact maps for B. subtilis generated with HpaII and HindIII restriction enzymes at 10 and 5 kb resolutions. (C) The fraction of non-zero cells in experimental and simulated contact maps at different resolutions. (D) Venn diagrams showing the overlap of zero-count column and row positions in experimental and simulated contact maps at 5 kb resolution.
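The two quantities compared in panels (C) and (D) of Fig. 1 can be illustrated with a minimal sketch. It assumes a contact map stored as a square list of lists of non-negative counts; the function names `nonzero_fraction` and `zero_bins` and the toy matrix are hypothetical and are not taken from the paper's code.

```python
def nonzero_fraction(contact_map):
    """Fraction of cells in the contact map that hold a non-zero count."""
    cells = [value for row in contact_map for value in row]
    return sum(1 for value in cells if value != 0) / len(cells)


def zero_bins(contact_map):
    """Indices of bins whose entire row and entire column are zero."""
    n = len(contact_map)
    return {
        i
        for i in range(n)
        if all(contact_map[i][j] == 0 for j in range(n))
        and all(contact_map[j][i] == 0 for j in range(n))
    }


# Toy 3x3 symmetric map (illustrative values only): bin 1 has no contacts.
toy_map = [
    [2, 0, 1],
    [0, 0, 0],
    [1, 0, 3],
]

print(nonzero_fraction(toy_map))
print(zero_bins(toy_map))
```

Computing `zero_bins` for an experimental and a simulated map and intersecting the two sets gives the kind of overlap that a Venn diagram such as panel (D) summarizes.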
Fig. 2. Comparative analysis of annotated domains in prokaryotes. Swarm plots of the JI (A) and MoC (B) metrics for pairwise comparison of domain segmentations between biological replicates. Tools are ordered by median JI value, from highest to lowest. Each dot represents the average score for a single dataset. Black lines represent the median value for each metric across all datasets. (C) Bubble chart for the number of CIDs (represented by bubble size) and their median sizes (represented by colour scale) annotated by each domain caller for each dataset.
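The JI and MoC scores in Fig. 2 compare two domain segmentations. The sketch below shows one way such scores can be computed. It rests on assumptions this record does not confirm: JI is taken as the Jaccard index over sets of boundary positions with exact matching, and MoC follows the measure-of-concordance formula commonly used in domain-caller benchmarks, with domains given as half-open `(start, end)` intervals. The function names and the example segmentations are hypothetical.

```python
from math import sqrt


def jaccard_index(boundaries_a, boundaries_b):
    """Jaccard index of two sets of domain boundary positions."""
    a, b = set(boundaries_a), set(boundaries_b)
    if not a and not b:
        return 1.0  # assumption: two empty boundary sets count as identical
    return len(a & b) / len(a | b)


def measure_of_concordance(domains_p, domains_q):
    """MoC between two segmentations given as lists of (start, end) domains."""
    n_p, n_q = len(domains_p), len(domains_q)
    if n_p == 0 or n_q == 0:
        raise ValueError("both segmentations must contain at least one domain")
    if n_p == 1 and n_q == 1:
        return 1.0
    total = 0.0
    for p_start, p_end in domains_p:
        for q_start, q_end in domains_q:
            # Length of the overlap between the two domains (0 if disjoint).
            overlap = max(0, min(p_end, q_end) - max(p_start, q_start))
            total += overlap ** 2 / ((p_end - p_start) * (q_end - q_start))
    return (total - 1) / (sqrt(n_p * n_q) - 1)


# Two illustrative segmentations of the same 40-bin region.
replicate_1 = [(0, 10), (10, 30), (30, 40)]
replicate_2 = [(0, 20), (20, 40)]

print(jaccard_index({0, 10, 30, 40}, {0, 20, 40}))
print(measure_of_concordance(replicate_1, replicate_1))
print(measure_of_concordance(replicate_1, replicate_2))
```

Under these definitions both scores equal 1 for identical segmentations and decrease as boundaries or domain overlaps diverge. A benchmark may instead match boundaries within a tolerance window, which this sketch does not do.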
Fig. 3. Properties of the domains annotated at different coverage depths and resolutions. Clustered heatmaps of the median size of CIDs annotated in the E. coli pseudo-replicates across different (A) coverages and (B) resolutions. (C) Scatter plot of the t-SNE performed on the matrix of MoC values across coverages and resolutions and JI values obtained for the segmentations between biological replicates. Clusters were annotated using the k-means clustering algorithm.