Sang Hoon Lee, Yeonghoon Kim, Sungmin Lee, Xavier Durang, Per Stenberg, Jae-Hyung Jeon, Ludvig Lizana.
Abstract
Several experiments show that the three-dimensional (3D) organization of chromosomes affects genetic processes such as transcription and gene regulation. To better understand this connection, researchers developed the Hi-C method, which detects the pairwise physical contacts of all chromosomal loci. Hi-C data show that chromosomes are composed of 3D compartments that span a variety of scales. However, it is challenging to detect these cross-scale structures systematically. Most studies have therefore designed methods for specific scales, focusing mainly on topologically associated domains (TADs) and A/B compartments. To go beyond this limitation, we tailor a network community detection method that finds communities in compact fractal globule polymer systems. Our method allows us to scan continuously through all scales with a single resolution parameter. We found the following: (i) Polymer segments belonging to the same 3D community do not have to be in consecutive order along the polymer chain. In other words, several TADs may belong to the same 3D community. (ii) CTCF, a loop-stabilizing protein that is ascribed a major role in TAD formation, is well correlated with community borders only at one level of organization. (iii) TADs and A/B compartments are traditionally treated as two weakly related 3D structures and detected with different algorithms. With our method, we detect both by simply adjusting the resolution parameter. We therefore argue that they represent two specific levels of a continuous spectrum of 3D communities rather than different structural entities.
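The role of a single resolution parameter can be illustrated with a minimal sketch of generalized modularity. This is not the null model tailored to fractal globules that the abstract refers to: the sketch assumes the standard Newman-Girvan (configuration-model) null term, and the function name `modularity` and the toy contact matrix are hypothetical.

```python
import numpy as np

def modularity(A, labels, gamma=1.0):
    """Generalized modularity of a partition with resolution parameter gamma.

    A      : symmetric (weighted) contact matrix
    labels : community label of each node (locus)
    gamma  : resolution parameter; larger values favor smaller communities
    """
    A = np.asarray(A, dtype=float)
    labels = np.asarray(labels)
    k = A.sum(axis=1)                          # node strengths
    two_m = k.sum()                            # total weight, both directions
    same = labels[:, None] == labels[None, :]  # True where i and j share a community
    # Assumption: configuration-model null term, not the paper's polymer null model
    null = gamma * np.outer(k, k) / two_m
    return float(((A - null) * same).sum() / two_m)

# Toy contact matrix: two triangles joined by a single edge
A = np.zeros((6, 6))
for i, j in [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]:
    A[i, j] = A[j, i] = 1.0

split = [0, 0, 0, 1, 1, 1]    # two communities
merged = [0, 0, 0, 0, 0, 0]   # one community

for gamma in (0.1, 1.0):
    print(gamma, modularity(A, split, gamma), modularity(A, merged, gamma))
```

In this toy case the merged partition scores higher at γ = 0.1 and the split partition scores higher at γ = 1.0, which is the sense in which changing γ moves the preferred community size from large to small.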
Year: 2019 PMID: 31048738 PMCID: PMC6497878 DOI: 10.1038/s41598-019-42212-y
Source DB: PubMed Journal: Sci Rep ISSN: 2045-2322 Impact factor: 4.379
Figure 1. 3D communities in simulated fractal globules. (a, left) The 3D representation of a fractal globule from the CDP algorithm. The colors highlight communities that we detect using Eq. (2) with γ = 0.6. (a, right) The communities are not contiguous: the small globule is a subsection of the lower left part of the polymer, and the stretched version shows the alternating communities (ABCAB). (b–d) The contact maps of simulated fractal globules after KR-normalization, with various resolution parameters: (b) γ = 0.4, (c) γ = 0.6, and (d) γ = 0.8. To show non-contiguous communities, we superimpose them as squares; the same color indicates that they belong to the same 3D community. (e) The end-to-end distance for the fractal (FG, blue) and equilibrium (EG, red) globules, averaged over 200 simulated polymer realizations of each model. The triangles denote the end-to-end distance for community boundaries, and the dashed lines represent the chain as a whole. To find the communities, we use γ = 0.4. The error bars show the standard error of the mean, and the two guide lines (slopes 1/2 and 1/3) show the known scaling of equilibrium and fractal globules, respectively, at intermediate length scales. (f) The same data as in panel (e), with the vertical axis scaled by the radius of gyration, defined in terms of N, the total number of polymer segments, and r_i, the coordinate of the ith segment.
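The two quantities plotted in panels (e) and (f) can be sketched as follows. This is a hedged illustration, not the authors' code: it assumes the standard center-of-mass definition of the radius of gyration and averages the end-to-end distance over all chain windows of a given length. The function names are hypothetical.

```python
import numpy as np

def end_to_end_distance(coords, s):
    """Mean spatial distance between segments s steps apart along the chain.

    coords : (N, 3) array of segment coordinates r_i
    s      : chain separation, an integer with 1 <= s < N
    """
    coords = np.asarray(coords, dtype=float)
    diffs = coords[s:] - coords[:-s]           # r_{i+s} - r_i for every window
    return float(np.linalg.norm(diffs, axis=1).mean())

def radius_of_gyration(coords):
    """Root-mean-square distance of the N segments from their center of mass."""
    coords = np.asarray(coords, dtype=float)
    centered = coords - coords.mean(axis=0)
    return float(np.sqrt((centered ** 2).sum(axis=1).mean()))

# Toy chain: five segments on a straight line with unit spacing
line = [[i, 0.0, 0.0] for i in range(5)]
print(end_to_end_distance(line, 2))   # distance between segments two steps apart
print(radius_of_gyration(line))
```

Dividing the first quantity by the second gives a dimensionless distance in the spirit of the rescaled vertical axis in panel (f).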
Figure 2. 3D communities in real Hi-C data (chromosome 1). (a–c) Normalized Hi-C data with squares showing the structure of 3D communities. The black regions are unmappable. The resolution parameter is γ = 0.6 in (a), γ = 0.7 in (b), and γ = 0.8 in (c). As in Fig. 1(b–d), we assign the same color to squares that belong to the same 3D community. The communities are clearly not contiguous along the sequence. (d) The fraction of community boundary points predicted by our method that coincide with those in Rao et al.[2] (the squares) and with binding positions for CTCF (the circles), for different values of γ. (e) The average number of TADs per community and the number of communities as functions of γ. (f) The community division along chromosome 1 for three different values of γ. The purple squares represent the largest community in panels (g)–(i), while the other colors indicate smaller communities. (g)–(i) Communities' gene activity sorted by their relative size for different values of γ: (g) γ = 0.6, (h) γ = 0.65, and (i) γ = 0.7. The circles show the median RNA expression levels, and the vertical lines show the quartiles. We omit communities that are smaller than 50 nodes. We find the communities using Eqs (1) and (2).
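The boundary comparison in panel (d) can be sketched as below. This is an illustrative assumption rather than the authors' procedure: a boundary is taken to be any position where the community label changes along the chain, and a predicted boundary counts as coinciding if it lies within a hypothetical tolerance `tol` (in bins) of a reference position.

```python
def community_boundaries(labels):
    """Indices i where the community label differs between node i-1 and node i."""
    return [i for i in range(1, len(labels)) if labels[i] != labels[i - 1]]

def boundary_overlap(predicted, reference, tol=0):
    """Fraction of predicted boundaries within `tol` bins of any reference position."""
    if not predicted:
        return 0.0
    hits = sum(any(abs(p - r) <= tol for r in reference) for p in predicted)
    return hits / len(predicted)

# Toy chain with the alternating, non-contiguous pattern ABCAB
labels = ["A", "A", "B", "B", "C", "A", "A", "B", "B"]
predicted = community_boundaries(labels)

# Hypothetical reference positions (for example TAD borders or CTCF sites)
reference = [2, 5, 8]
print(predicted)
print(boundary_overlap(predicted, reference, tol=0))
print(boundary_overlap(predicted, reference, tol=1))
```

Because communities need not be contiguous, one community can contribute several boundaries along the chain, as communities A and B do in the toy labels.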