Seungbyn Baek, Insuk Lee.
Abstract
Most genetic variations associated with human complex traits are located in non-coding genomic regions. Therefore, understanding the genotype-to-phenotype axis requires a comprehensive catalog of functional non-coding genomic elements, most of which are involved in epigenetic regulation of gene expression. Genome-wide maps of open chromatin regions can facilitate functional analysis of cis- and trans-regulatory elements via their connections with trait-associated sequence variants. Currently, Assay for Transposase Accessible Chromatin with high-throughput sequencing (ATAC-seq) is considered the most accessible and cost-effective strategy for genome-wide profiling of chromatin accessibility. Single-cell ATAC-seq (scATAC-seq) technology has also been developed to study cell type-specific chromatin accessibility in tissue samples containing a heterogeneous cellular population. However, because scATAC-seq data are intrinsically noisy and sparse, accurately extracting biological signals and devising effective biological hypotheses are difficult. To overcome such limitations in scATAC-seq data analysis, new methods and software tools have been developed over the past few years. Nevertheless, there is not yet a consensus on best practices for scATAC-seq data analysis. In this review, we discuss scATAC-seq technology and data analysis methods, ranging from preprocessing to downstream analysis, along with an up-to-date list of published studies that involved the application of this method. We expect this review will provide a guideline for successful data generation and analysis using appropriate software tools and databases for the study of chromatin accessibility at single-cell resolution.
Keywords: ATAC sequencing; Chromatin accessibility; Single-cell ATAC sequencing; Single-cell RNA sequencing; Single-cell biology
Year: 2020 PMID: 32637041 PMCID: PMC7327298 DOI: 10.1016/j.csbj.2020.06.012
Source DB: PubMed Journal: Comput Struct Biotechnol J ISSN: 2001-0370 Impact factor: 7.271
Fig. 1. Schematic overview of a typical single-cell ATAC sequencing analysis workflow.
Fig. 2. Schematic summary of two major strategies for single-cell adaptation of ATAC sequencing library generation, (a) split-pool cellular indexing and (b) microfluidics-based approaches, and (c) their modified methods.
Summary of scATAC-seq analysis software packages.
| Tool | Platform | Feature Matrix | Preprocessing | Clustering | DAR | Motif/k-mer | Gene activity | Co-accessibility | Trajectory | Pathway | Enrichment analysis | scRNA integration | Reference |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| ChromVAR | R | TF motifs, k-mer | O | O | X | O | X | X | X | X | X | X | |
| SCRAT | R/Web | Selectable feature | O | O | O | X | X | X | X | X | X | X | |
| scABC | R | Peak | O | O | X | O (ChromVAR) | X | X | X | X | X | X | |
| Cicero | R | TSS | O | O | O | X | O | O | O | X | X | X | |
| Scasat | Python/R | Peak | O | O | O | X | X | X | X | O (GREAT) | X | X | |
| cisTopic | R | Peak | O | O | X | X | O | X | X | O | O | X | |
| snapATAC | Python/R | Bin, peak | O | O | O | O (ChromVAR, Homer) | O | X | X | O (GREAT) | X | O (Seurat) | |
| epiScanpy | Python | Peak | O | O | X | X | X | X | X | X | X | X | |
| Destin | R | Peak | O | O | O | X | X | X | X | X | O | X | |
| SCALE | Python | Peak | O | O | O | O (ChromVAR) | X | X | X | X | X | X | |
| scATAC-pro | Python/R | Peak | O | O | O | O (ChromVAR) | O | O (Cicero) | X | O (GREAT) | X | X | |
| Signac | R | Peak | O | O | O | O (ChromVAR) | O | X | X | X | X | O (Seurat) | |
| ArchR | R | Bin, peak | O | O | O | O (ChromVAR), TF footprinting | O | O | O | X | O | O (Seurat) | |
Tools used in conjunction are indicated in parentheses.
Fig. 3. Integration of single-cell ATAC sequencing data with single-cell RNA sequencing data via experimental and computational approaches. Integrative analysis of gene expression and chromatin accessibility for the same cell types can be used to confirm cell identity annotation and to help generate new hypotheses about regulatory elements. For example, identification of peak-to-gene interactions can be used to infer enhancer-promoter interactions; comparison between the expression of a gene and the accessibility of its TF-enriched regions across pseudotime can reveal the kinetic relationship between transcription and regulatory regions; and comparison between the expression of a gene and the accessibility of its TF-enriched regions across cell types or sample groups can reveal expression and accessibility signatures associated with a cell type or subpopulation.
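
The peak-to-gene idea mentioned in the Fig. 3 caption can be illustrated with a minimal sketch. It assumes accessibility and expression have already been aggregated over the same cell groups (for example clusters or pseudotime bins) and links a peak to a gene when the two profiles are positively correlated and the peak lies near the transcription start site. The function names, the 250 kb window, the 0.5 correlation cutoff, and the toy values are all illustrative assumptions, not the procedure of any tool listed in the table, which typically add background correction and significance testing.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two equal-length sequences with at least 2 values")
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return 0.0  # a constant profile carries no co-variation signal
    return cov / sqrt(vx * vy)


def link_peaks_to_gene(peaks, tss, expression, max_distance=250_000, min_r=0.5):
    """Return (peak_id, distance_bp, r) for candidate peak-to-gene links.

    peaks: dict of peak_id -> (peak_center_bp, accessibility_per_group)
    tss: transcription start site of the gene (bp, same chromosome assumed)
    expression: gene expression per group, same group order as accessibility
    max_distance, min_r: hypothetical cutoffs chosen for illustration only
    """
    links = []
    for peak_id, (center, accessibility) in peaks.items():
        distance = abs(center - tss)
        if distance > max_distance:
            continue  # too far from the gene to be considered here
        r = pearson(accessibility, expression)
        if r >= min_r:
            links.append((peak_id, distance, r))
    # Strongest correlation first
    return sorted(links, key=lambda link: -link[2])


# Toy, made-up data for four cell groups.
expression = [1.0, 2.0, 4.0, 8.0]
peaks = {
    "peak_A": (1_010_000, [0.1, 0.2, 0.4, 0.8]),  # tracks expression
    "peak_B": (1_050_000, [0.9, 0.7, 0.3, 0.1]),  # anti-correlated
    "peak_C": (5_000_000, [0.1, 0.2, 0.4, 0.8]),  # correlated but distal
}

links = link_peaks_to_gene(peaks, tss=1_000_000, expression=expression)
for peak_id, distance, r in links:
    print(f"{peak_id}\t{distance} bp\tr={r:.2f}")
```

In this toy input only `peak_A` passes both filters; `peak_B` is excluded by the correlation cutoff and `peak_C` by the distance window. A correlation of this kind suggests a candidate enhancer-promoter pair but does not by itself establish a regulatory interaction.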