Siyuan Kong, Yuhui Lu, Shuhao Tan, Rongrong Li, Yan Gao, Kui Li, Yubo Zhang.
Abstract
Genetic information is loaded on chromatin, which involves DNA sequence arrangement and the epigenetic landscape. Epigenetic information, including DNA methylation, nucleosome positioning, histone modification, and 3D chromatin conformation, has a crucial impact on transcriptional regulation. Among these, nucleosomes, as the basic structural units of chromatin, play a central role in the epigenetic code. Since the discovery of nucleosomes, various nucleosome-level technologies have been developed and applied, driving rapid progress in epigenetics. As the underlying methodology, next-generation sequencing (NGS) has allowed scientists to study the epigenetic landscape at a genome-wide level. Combined with NGS, nucleosome-omics (or nucleosomics) provides a fresh perspective on the epigenetic code and the 3D genome landscape. Here, we summarize and discuss research progress in the technology development and application of nucleosome-omics, and we outline future directions for epigenetics at the nucleosome level.
Keywords: 3D genome; MNase-seq; Micro-C; epigenetics; nucleosomes
Year: 2022 PMID: 35885897 PMCID: PMC9323251 DOI: 10.3390/genes13071114
Source DB: PubMed Journal: Genes (Basel) ISSN: 2073-4425 Impact factor: 4.141
Figure 1. Diagram of the hierarchical structures from chromatin to nucleosome in eukaryotes. (a) Zooming in from chromosome or chromatin to DNA. The DNA double helix wraps around histone octamers to form nucleosomes, the basic structural units of chromatin. The classic histone octamer is made up of four types of histone (H2A/H2B/H3/H4) and has eight histone tails. Covalent modifications on histone tails play an important role in regulating chromatin structure and function. DNA methylation marks are also part of the epigenetic code. (b) Different states of chromatin. Euchromatin regions exist in an extended, loosely packed state that is conducive to transcription. Conversely, densely packed heterochromatin is silent. (c) Regulation of histone modifications and chromatin remodeling. The addition, recognition, and removal of histone modifications depend on writers, readers, and erasers, respectively. The enzyme complexes that alter histone modifications often have more than one of these activities. When the reading modules of these complexes bind to the corresponding sites, their writing or erasing modules are activated and act on nearby sites. ATP-driven chromatin remodelers, which are complexes of multiple subunits, catalyze chromatin remodeling. The nucleosome structure usually changes in four ways: replacement, dissociation, removal, and sliding. In replacement, chromatin remodelers catalyze the exchange of canonical histones for histone variants in nucleosomes. In dissociation, the DNA double helix wrapped around the nucleosome loosens. In removal, the nucleosome is disassembled. In sliding, chromatin remodelers move nucleosomes along DNA without unwinding the DNA double strand [32].
Courtesy: National Human Genome Research Institute (https://www.genome.gov/genetics-glossary/Chromatin, accessed on 19 May 2022; https://www.genome.gov/genetics-glossary/Nucleosome, accessed on 19 May 2022).
Figure 2. Overview of the methodology and analysis results of MNase-seq and its derivatives. Schematic view of the MNase-seq, scMNase-seq, MNase-Exo-seq, and MNase-ChIP-seq procedures and principles. All of these technologies use MNase to digest internucleosomal DNA and release nucleosomal DNA. The MNase-seq workflow comprises digestion, DNA purification, library preparation, and high-throughput sequencing. It determines the location of mononucleosomes at high resolution, as well as the location of the NDRs. MNase-seq panel was adapted with permission from Ref. [50]. 2021, Springer Nature. scMNase-seq adds a cell-sorting step to MNase-seq. As shown in the figure, two methods of sorting and digestion are provided. scMNase-seq panel was adapted with permission from Refs. [40,51]. 2019, 2018, Springer Nature. MNase-Exo-seq adds an exonuclease III (ExoIII) digestion step, which produces a sharp peak at the core particle size and a more accurate depiction of nucleosome location. MNase-Exo-seq panel was adapted from Ref. [12] with Open Access and no copyright issue. MNase-ChIP-seq adds a chromatin immunoprecipitation step after digestion, allowing specific mononucleosomal DNA to be amplified [52]. MNase-ChIP-seq panel was adapted with permission from Ref. [13]. 2011, Elsevier.
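The mononucleosome mapping step summarized in the Figure 2 caption can be illustrated with a minimal Python sketch. It assumes paired-end fragments have already been aligned and reduced to `(start, end)` coordinates on one chromosome; the 120–180 bp size window, the 73 bp half-width, and the function names `dyad_counts` and `occupancy` are illustrative assumptions, not values or tools taken from the review.

```python
from collections import Counter

# Assumed size window for mononucleosomal fragments (hypothetical values,
# chosen around the ~147 bp core particle; not taken from the review).
MONO_MIN, MONO_MAX = 120, 180


def dyad_counts(fragments, min_len=MONO_MIN, max_len=MONO_MAX):
    """Count inferred dyad (midpoint) positions of mononucleosome-sized fragments.

    fragments: iterable of (start, end) half-open coordinates on one chromosome.
    """
    counts = Counter()
    for start, end in fragments:
        length = end - start
        # Keep only fragments whose length is consistent with one nucleosome.
        if min_len <= length <= max_len:
            counts[(start + end) // 2] += 1
    return counts


def occupancy(counts, position, half_width=73):
    """Sum the dyad counts that fall within half_width bp of a position."""
    return sum(n for pos, n in counts.items() if abs(pos - position) <= half_width)


# Toy input: three ~147 bp fragments, one subnucleosomal and one oversized fragment.
fragments = [(100, 247), (102, 249), (98, 245), (300, 340), (500, 900)]
counts = dyad_counts(fragments)
```

In this toy input only the three 147 bp fragments contribute dyads; the 40 bp and 400 bp fragments are discarded by the size filter. Real pipelines would also need to handle MNase sequence bias and digestion level, which this sketch ignores.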
Summary of different experimental techniques for nucleosomics.
| Technology | Sequence Type | Start Material | Enzyme Digestion | Advantage | Feature | Reference |
|---|---|---|---|---|---|---|
| MNase-seq | Single-end or paired-end sequencing | 10–20 million cells | MNase | The whole genome can be measured; high resolution; low technical difficulty. | MNase has a sequence bias: it cuts DNA upstream of A or T faster than upstream of G or C. The traditional method requires a large number of cells and is prone to technical error. | [ |
| scMNase-seq | Single-end or paired-end sequencing | >0.1 million cells | MNase | Provides a method for determining nucleosome localization and chromatin accessibility in single-cell or low-input materials, few cells are required, high resolution, low technical difficulty. | Low capture rate, information may be missing, lower throughput. | [ |
| ULI-MNase-seq | Paired-end sequencing | 10–15 pronuclei per replicate | MNase | Low cell initiation. | It requires extremely high proficiency and skill. It easily loses cell and library DNA. | [ |
| MNase-Exo-seq | Paired-end sequencing | 10–20 million cells | MNase, exonuclease III | The perfectly clipped core particle gives a major sharp peak; weaker signals that cannot be detected in MNase-seq data become evident. | Detailed analysis of nucleosome location is complicated; the occupancy of specific nucleosome positions may be misestimated. | [ |
| ChIP-seq | Single-end or paired-end sequencing | >1 million cells | Priority with MNase | High resolution and low noise, high genome coverage and wider dynamic range, the method is more mature and widely applicable. | Data quality depends on antibody quality, and antibody screening is time-consuming and costly. | [ |
| ChIP-exo | Single-end or paired-end sequencing | >1 million cells | Exonuclease | High resolution; defines genomic binding locations and more precisely determines the location of protein-gene interactions in the genome; few false positives or negatives; binding-site complexity. | Multiple binding of a single protein cannot be detected. | [ |
| ChIP-MNase | Single-end or paired-end sequencing | 10–20 million cells | MNase | High resolution, nucleosome localization analysis in specific position of the genome and differential analysis of alleles undergoing different molecular processes. | Need for precise selection of antibodies, antibody repertoire may be incomplete, Other features similar to MNase-seq. | [ |
| ATAC-seq | Paired-end sequencing | 500–50,000 cells | Tn5 transposase | Simple method, short experiment period, few cells are required, high resolution and good repeatability. | Conventional data analysis methods have limitations, Tn5 transposase is expensive. | [ |
| DNase-seq | Single-end or paired-end sequencing | >1 million cells | DNase I | Simple method, high resolution, the most active regulatory regions can be identified from many cell types. | The traditional method requires a large number of cells and precise control of enzyme quantity, and is time-consuming; it is not easy to determine the precise activity and function associated with each regulatory region. | [ |
| NOMe-seq | Paired-end sequencing | Need to test | GpC methyltransferase | Does not depend on knowing the exact modification of surrounding nucleosomes. It can provide localization information of multiple nucleosomes on both sides of each open regulatory element, nucleosome localization and DNA methylation degree can be analyzed simultaneously. | Requires a large number of cells and data analysis is difficult. | [ |
| Micro-C | Paired-end sequencing | 0.001–5 million cells | MNase | The signal-to-noise is improved, high resolution, reveals the chromosome folding of nucleosome resolution. | The observed chromosome structure will be biased, difficulty recovering known higher-order interactions. | [ |
| Micro-C XL | Paired-end sequencing | 1 million cells | MNase | The signal-to-noise is improved, improved structure visualization, chromosome folding can be detected from nucleosomes to whole genomes, adds some subtle details to the Micro-C map. | Requires one more step of cross-linking, it may take many attempts to find the best conditions. | [ |
| MACC-seq | Single-end or paired-end sequencing | 1 million cells per reaction | MNase | Profiles both open and closed genomic loci simultaneously, combined with ChIP specificity to enrich histone modification-associated DNA fragments. | Traditional method requires a large number of cells. | [ |
| MH-seq | Paired-end sequencing | 10–20 million cells | MNase | Simple procedures, enables detection of distinct types of open chromatin. | Traditional method requires more cells, it is not easy in plants to establish Single-cell-based MH-seq, application in plants has limitations, high requirements for nuclear quality. | [ |
| Array-seq | Paired-end sequencing | 10–20 million cells | MNase | Reveals linker length and array regularity in unmappable areas. | Traditional method requires more cells, titration test required. | [ |
| MRE-seq | Paired-end sequencing | Need to test | Methylation-sensitive restriction enzymes | The methylation status of most repeats can be revealed; the methylation state of a local region or a single CpG can be addressed; MREs are inexpensive. | The recognition range of methylation events is limited, and only those within MRE recognition sites can be detected. | [ |
Figure 3. The methodology of Micro-C and Micro-C XL. (a) Outline of key steps in the Micro-C protocol. Adapted with permission from Ref. [17]. 2015, Elsevier. (b) Single-nucleosome resolution contact matrix. Reprinted with permission from Ref. [15]. 2015, Elsevier. (c) Heatmap of Micro-C-specific dots and stripes for loop interactions in HFFs on chromosome 8. Corresponding Micro-C and Hi-C heatmaps are shown above and below the diagonal, respectively. Reprinted with permission from Ref. [69]. 2020, Elsevier. (d) Difference plot of the interaction decay rates of Micro-C and Hi-C. The X axis represents the distance between contact loci, from 100 bp to 10 Mb, and the Y axis represents contact density normalized by sequencing depth. Reprinted with permission from Ref. [70]. 2020, Elsevier. (e) An overview of the Micro-C XL protocol. Adapted with permission from Ref. [55]. 2016, Springer Nature. (f) Interaction heatmaps of Micro-C and Micro-C XL data for yeast chromosomes VII to X at 10 kb bin resolution. Adapted with permission from Ref. [55]. 2016, Springer Nature. (g) Decay curve of contact probability with genomic distance for Micro-C XL data, shown at 1 kb bin resolution. Adapted from Ref. [71] with Open Access and no copyright issue.
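The distance-decay plots described in the Figure 3 caption (contact density normalized by sequencing depth, as a function of the distance between contact loci) can be sketched roughly as follows. This is a minimal illustration, assuming cis contacts are given as pairs of positions on the same chromosome; the function name `contact_decay`, the decade-wide bins, and the use of the total pair count as the "sequencing depth" are simplifying assumptions rather than the procedure used in the cited studies.

```python
import math
from collections import defaultdict


def contact_decay(contacts, min_dist=100):
    """Return {bin_lower_edge_bp: fraction of all read pairs} per distance decade.

    contacts: list of (pos1, pos2) cis contacts on the same chromosome.
    Counts are normalized by the total number of pairs (a stand-in for
    sequencing depth; an assumption of this sketch).
    """
    total = len(contacts)
    hist = defaultdict(int)
    for pos1, pos2 in contacts:
        dist = abs(pos2 - pos1)
        if dist < min_dist:
            continue  # skip contacts below the shortest distance of interest
        decade = math.floor(math.log10(dist))  # 2 -> 100 bp to <1 kb, 3 -> 1 kb to <10 kb
        hist[decade] += 1
    return {10 ** decade: n / total for decade, n in sorted(hist.items())}


# Toy input: five pairs, one of which is closer than min_dist.
contacts = [(0, 150), (0, 500), (1000, 3000), (0, 20000), (0, 50)]
decay = contact_decay(contacts)
```

Comparing two such dictionaries bin by bin, for example one built from Micro-C pairs and one from Hi-C pairs, gives the kind of difference plot the caption refers to; finer logarithmic bins would be used in practice.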
Figure 4. MCC, an upgraded version of Micro-C. (a) Library construction process of MCC. Like other 3C technologies, MCC captures interactions through chemical cross-linking. MCC uses micrococcal nuclease (MNase), which cuts largely independently of DNA sequence, as the molecular scissors to achieve random fragmentation; this is the key step for improving resolution. The oligonucleotide probe tends to bind to the center of the hypersensitive site. Hybridization with probes is conducted to enrich the specific interactions. (b) MCC has a higher resolution and can capture interactions that other 3C technologies do not. Adapted with permission from Ref. [76]. 2021, Springer Nature.
Nucleosome prediction databases.
| Database | Description | Data Type | Species | Source | Reference |
|---|---|---|---|---|---|
| GTRD | The largest integrated resource of data on transcription regulation in eukaryotes, which contains uniformly annotated and processed NGS data, the results of the meta-analysis, and the sets of non-redundant and reproducible TFBSs for each TF. | ChIP-seq, ChIP-exo, DNase-seq, MNase-seq, ATAC-seq, RNA-seq | [ | ||
| NPRD (Nucleosome Positioning Region Database) | It is compiling the available experimental data on locations and characteristics of nucleosome formation sites (NFSs), and is the first curated NFS-oriented database. | The type used in original paper | [ | ||
| ChIP-Atlas | An integrative, comprehensive database to explore public Epigenetic dataset, covers almost all public data archived in Sequence Read Archive of NCBI, EBI, and DDBJ with over 224,000 experiments. | ChIP-Seq, DNase-Seq, ATAC-Seq, Bisulfite-Seq | [ | ||
| CistromeDB | A resource of human and mouse cis-regulatory information, which map the genome-wide locations of transcription factor binding sites, histone post-translational modifications, and regions of chromatin accessible to endonuclease activity. | ChIP-seq, DNase-seq, ATAC-seq | [ | ||
| ENCODE | Integrative-level annotations integrate multiple types of experimental data and ground level annotations. Ground-level annotations are derived directly from the experimental data, typically produced by uniform processing pipelines. | ChIP-seq, DNase-seq, ATAC-seq, TFChIP-seq, RNA-seq, eCLIP-seq, ChIA-PET, Hi-C, RRBS, WGBS, RAMPAGE | [ | ||
| ChIPBase | Decoding the encyclopedia of transcriptional regulations of ncRNAs and PCGs. | ChIP-seq, ChIP-exo, ChIP-nexus, MNChIP-seq | [ | ||
| ReMap 2020 3rd release | Information of regulatory regions from an integrative analysis of Human and Arabidopsis DNA-binding sequencing experiments. | ChIP-seq, ChIP-exo, ChIP-nexus, DAP-seq | [ | ||
| Factorbook | A transcription factor (TF)-centric repository of all ENCODE ChIP-seq datasets on TF-binding regions, as well as the rich analysis results of these data. | ChIP-seq | [ | ||
| NucMap | Genome-wide nucleosome positioning map across different species. | MNase-Seq | [ | ||
| NucPosDB | Database reporting experimental nucleosome maps in vivo across different cell types and conditions, cell-free DNA (cfDNA) datasets in people and model organisms, processed stable-nucleosome regions, as well as software for computational analysis and modeling of nucleosome positioning and “nucleosomics” analysis for medical diagnostics. | MNase-seq, ChIP-seq, MH-seq, MPE-seq, MiSeq, NOMe-seq, RED-seq, Nanopore-seq, Fiber-seq, nucleosome-scale mapping of 3D genome contact, Micro-C | [ |
Recently developed computational tools for the analysis of experimental nucleosome data.
| Algorithm | Web Server/GUI | Feature | Input Dataset | Languages | Source | Reference |
|---|---|---|---|---|---|---|
| CAESAR | +/− | Connecting epigenomics and chromatin organization at the nucleosome resolution. | Epigenomic features and Hi-C contact maps | Python | [ | |
| Factor-agnostic chromatin occupancy profiles from MNase | +/− | Links changes in chromatin at nucleotide resolution with transcriptional regulation. | MNase-seq and RNA-seq data | Python, Shell | [ | |
| NucHMM | +/− | Identifies functional nucleosome states associated with cell type-specific combinatorial histone marks and nucleosome organization features. | MNase-seq and ChIP-seq data | Python, C++, Makefile | [ | |
| ProbC | +/− | Decomposes Hi-C and Micro-C interactions by known chromatin marks at genome and chromosome levels. | Hi-C and Micro-C data | Python | [ |
Footer: +:Yes; −:No.