Ning Liu, Wai Yee Low, Hamid Alinejad-Rokny, Stephen Pederson, Timothy Sadlon, Simon Barry, James Breen.
Abstract
Eukaryotic genomes are highly organised within the nucleus of a cell, allowing widely dispersed regulatory elements such as enhancers to interact with gene promoters through physical contacts in three-dimensional space. Recent chromosome conformation capture methodologies such as Hi-C have enabled the analysis of interacting regions of the genome, providing valuable insight into the three-dimensional organisation of chromatin in the nucleus, including chromosome compartmentalisation and gene expression. Complicating the analysis of Hi-C data, however, is the massive number of identified interactions, many of which do not directly drive gene function, thus hindering the identification of potentially biologically functional 3D interactions. In this review, we collate and examine the downstream analysis of Hi-C data with particular focus on methods that prioritise potentially functional interactions. We classify the approaches into three groups: structural-based discovery methods (e.g. A/B compartments and topologically associated domains), detection of statistically significant chromatin interactions, and the use of epigenomic data integration to narrow down useful interaction information. Careful use of these three approaches is crucial to successfully identifying potentially functional interactions within the genome.
Keywords: Chromosome conformation capture; Data integration; Hi-C; Statistically significant interactions identification
Year: 2021 PMID: 34454581 PMCID: PMC8399707 DOI: 10.1186/s13072-021-00417-4
Source DB: PubMed Journal: Epigenetics Chromatin ISSN: 1756-8935 Impact factor: 4.954
Fig. 1 Illustration of genome architecture and the corresponding Hi-C interaction maps. Top panel: interaction heatmaps A, B, C and D are shown at different scales (kb or Mb per pixel) to correspond to the diagrams of 3D structures in the bottom panel. Yellow boxes in A and B are identified TADs, and small blue boxes in A indicate chromatin loops. The purple box in A is a frequently interacting region, with its classical “V”-shaped pattern outlined in purple dotted lines. Heatmaps were generated using Juicebox [29] with published Hi-C data of GM12878 [3]. Bottom panel: diagrams of 3D structures in the genome
Different Hi-C-derived methods. Optimisations indicate modifications to the protocol compared with traditional Hi-C
| Hi-C flavours | Optimisations | Advantages compared to traditional Hi-C | Reference |
|---|---|---|---|
| Traditional Hi-C | – | – | [ |
| In situ Hi-C | Nuclear ligation; 4-base cutter | Allow higher-resolution data generation | [ |
| DNase Hi-C | DNase I to digest cross-linked DNA | Improve capture efficiency and reduce digestion bias, but have A compartment bias | [ |
| Micro-C | Crosslinking with DSG and micrococcal nuclease to digest cross-linked DNA | Improve capture efficiency and reduce digestion bias, but have A compartment bias | [ |
| BL-Hi-C | HaeIII to digest cross-linked DNA, followed by a two-step ligation | Improve capture efficiency in regulatory regions, reducing random ligation events | [ |
| DLO Hi-C | No labelling and pull-down step | Reduce experimental cost | [ |
| tag Hi-C | Tn5-transposase tagmentation | Focus on accessible chromatin, require only hundreds of cells as input, reduce experimental cost | [ |
| Capture HiC | RNA baits to subset specific chromatin contacts | Reduce sequencing cost, focus on a subset of interactions | [ |
| Capture-C/NG Capture-C/Tiled-C | Enrich the 3C library with biotinylated capture oligonucleotides | Focus on the subset of interactions while retaining maximal library complexity | [ |
| HiChIP/PLAC-seq | Chromatin Immunoprecipitation (ChIP) to subset bound chromatin contacts | Reduce sequencing cost, focus on a subset of interactions | [ |
| OCEAN-C | Phenol–chloroform extraction step | Focus on accessible chromatin | [ |
| HiCoP | Column purified chromatin step | Focus on accessible chromatin | [ |
| Methyl-HiC | Bisulfite conversion | Allow joint profiling of DNA methylation and 3D genome structure | [ |
| Hi-C 2.0 | Efficient removal of unligated ends | Greatly reduce dangling-end DNA products | [ |
| Hi-C 3.0 | Double cross-linking with FA and DSG and double digestion with | Improve the ability to identify A/B compartments and improve the enrichment of regulatory elements in loop detection | [ |
Fig. 2 Approaches to prioritise interactions from Hi-C datasets. In this review, we categorise the approaches to identify potentially functional interactions into three groups: identification of significant interactions, summarisation of structures, and data integration. Referenced tools and sub-categorical analyses are marked on the figure with boxes and stars, respectively
Methods for identification of statistically significant interactions for Hi-C data
| Method name | Type | Base model | Specific features | Reference |
|---|---|---|---|---|
| Duan et al. 2010 | Global background | Binomial | Specifically designed for yeast genome | [ |
| Fit-Hi-C/FitHiC2 | Global background | Binomial | Spline fitting procedure, compatible with different formats | [ |
| HOMER | Global background | Binomial | Highly compatible with the HOMER Hi-C analysis pipeline | [ |
| GOTHiC | Global background | Binomial | Use relative coverage to estimate biases | [ |
| FitHiChIP | Global background | Binomial | Specifically designed for HiChIP data | [ |
| HIPPIE | Global background | Negative binomial | Account for fragment length and distance biases | [ |
| HiC-DC | Global background | Negative binomial | Use zero-inflated model | [ |
| HMRFBayesHiC | Global background | Negative binomial | Use hidden Markov random field model | [ |
| FastHiC | Global background | Negative binomial | An updated version of HMRFBayesHiC, with improved computing speed | [ |
| MaxHiC | Global background | Negative binomial | Use ADAM algorithm, identify interactions with enrichment for regulatory elements | [ |
| CHiCAGO | Global background | Negative binomial | Specifically designed for CHi-C data | [ |
| ChiCMaxima | Global background | Local maxima | Specifically designed for CHi-C data, more stringent and robust when comparing biological replicates | [ |
| HiCCUPS | Local background | Local enrichment | Robust for finding chromatin loops | [ |
| cLoops | Local background | DBSCAN | Loop detection with fewer computational resources | [ |
| Automated identification of stripes | Local background | Local enrichment | Specifically designed to identify architectural stripes | [ |
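
The "global background, binomial" entries in the table above share one core idea: estimate how likely a contact is at a given genomic distance, then ask whether the observed count for a bin pair exceeds that expectation. The following is a minimal sketch of that idea only, not the implementation of any listed tool. It assumes a single chromosome with equally sized bins, uses a raw per-distance average as the background (Fit-Hi-C, for example, fits a spline instead), and applies no bias normalisation or multiple-testing correction. The names `binom_sf` and `score_contacts` and the toy contact counts are hypothetical.

```python
import math
from collections import defaultdict


def binom_sf(k, n, p):
    """Return P(X >= k) for X ~ Binomial(n, p), summing terms in log space."""
    if k <= 0:
        return 1.0
    if p <= 0.0:
        return 0.0
    if p >= 1.0:
        return 1.0
    log_p, log_q = math.log(p), math.log1p(-p)
    total = 0.0
    for i in range(k, n + 1):
        log_pmf = (
            math.lgamma(n + 1) - math.lgamma(i + 1) - math.lgamma(n - i + 1)
            + i * log_p + (n - i) * log_q
        )
        total += math.exp(log_pmf)
    return min(total, 1.0)


def score_contacts(contacts, n_bins):
    """Assign a binomial p-value to each bin pair.

    contacts: dict mapping (i, j) with i < j to an observed read-pair count.
    n_bins:   number of bins on the (single, hypothetical) chromosome.
    """
    total = sum(contacts.values())

    # Global background: total counts observed at each bin separation.
    counts_by_distance = defaultdict(int)
    for (i, j), count in contacts.items():
        counts_by_distance[j - i] += count

    pvalues = {}
    for (i, j), count in contacts.items():
        d = j - i
        n_pairs = n_bins - d  # number of possible bin pairs separated by d
        # Simplifying assumption: every pair at distance d is equally likely.
        p_expected = counts_by_distance[d] / total / n_pairs
        pvalues[(i, j)] = binom_sf(count, total, p_expected)
    return pvalues


# Toy data (hypothetical): pair (1, 4) is enriched relative to other
# pairs at the same distance of three bins.
toy_contacts = {
    (0, 1): 20, (1, 2): 22, (2, 3): 18, (3, 4): 21, (4, 5): 19,
    (0, 3): 2, (1, 4): 30, (2, 5): 1,
}
toy_pvalues = score_contacts(toy_contacts, n_bins=6)
for pair in sorted(toy_pvalues, key=toy_pvalues.get):
    print(pair, f"{toy_pvalues[pair]:.3g}")
```

In this toy example the pair `(1, 4)` receives the smallest p-value because its count is far above the average for its distance, while `(0, 3)` and `(2, 5)` are not flagged. Real datasets would additionally need bias correction and adjustment for the very large number of tests, which the listed tools handle in their own ways.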