Liheng Luo1, Michael Gribskov2, Sufang Wang1.
Abstract
With recent advances in high-throughput next-generation sequencing, it is possible to describe the regulation and expression of genes at multiple levels. An assay for transposase-accessible chromatin using sequencing (ATAC-seq), which uses Tn5 transposase to sequence protein-free binding regions of the genome, can be combined with chromatin immunoprecipitation coupled with deep sequencing (ChIP-seq) and ribonucleic acid sequencing (RNA-seq) to provide a detailed description of gene expression. Here, we reviewed the literature on ATAC-seq and described the characteristics of ATAC-seq publications. We then briefly introduced the principles of RNA-seq, ChIP-seq and ATAC-seq, focusing on the main features of the techniques. We built a phylogenetic tree from species that had been previously studied by using ATAC-seq. Studies of Mus musculus and Homo sapiens account for approximately 90% of the total ATAC-seq data, while other species are still in the process of accumulating data. We summarized the findings from human diseases and other species, illustrating the cutting-edge discoveries and the role of multi-omics data analysis in current research. Moreover, we collected and compared ATAC-seq analysis pipelines, which allowed biological researchers who lack programming skills to better analyze and explore ATAC-seq data. Through this review, it is clear that multi-omics analysis and single-cell sequencing technology will become the mainstream approach in future research.
Keywords: ATAC-seq; ChIP-seq; RNA-seq; gene expression; multi-omics analysis
Year: 2022 PMID: 35255493 PMCID: PMC9116206 DOI: 10.1093/bib/bbac061
Source DB: PubMed Journal: Brief Bioinform ISSN: 1467-5463 Impact factor: 13.994
Figure 1. Bibliometrics and dataset statistics of ATAC-seq. (A) The number of articles using combinations of ATAC-seq, ChIP-seq and RNA-seq published from 2013 to 2021 in the PubMed database. The keywords ATAC-seq, ChIP-seq and RNA-seq were each searched under the Title/Abstract field in the query box. (B) The top 50 journals that published ATAC-seq articles. (C) Stacked bar plot showing the number of studies of cancer versus other diseases using ATAC-seq and/or other sequencing techniques from 2015 to 2021.
Figure 2. Keyword correlation analysis and changes of keyword clusters in ATAC-seq articles. (A) Keyword correlation analysis. The colour of each word represents the average time of occurrence of that keyword in the literature corpus. The colour bar represents the Z-score-normalized average time. The size of the circle behind each word represents its frequency of occurrence. The three ellipses indicate three clusters of highly correlated keywords (TF: transcription factor). (B) The change of clusters over time. Keywords are divided into three clusters according to their relevance. The horizontal axis represents the average time normalized by Z-score. The vertical axis shows the proportion relative to all the keywords in each cluster.
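
The Z-score normalization of average keyword time mentioned in the Figure 2 caption can be illustrated with a minimal sketch. The keywords and years below are made-up illustrative values, not data from the article, and the use of the population standard deviation is an assumption; the caption does not state which estimator the authors used.

```python
from statistics import mean, pstdev


def zscore_average_years(keyword_years):
    """Return the Z-score of each keyword's average publication year.

    keyword_years maps a keyword to the publication years of the
    articles in which it occurs (hypothetical input format).
    """
    # Average time of occurrence for each keyword.
    averages = {kw: mean(years) for kw, years in keyword_years.items()}
    values = list(averages.values())
    mu = mean(values)
    # Assumption: population standard deviation across keywords.
    sigma = pstdev(values)
    if sigma == 0:
        raise ValueError("all keywords share the same average year")
    return {kw: (avg - mu) / sigma for kw, avg in averages.items()}


# Illustrative values only (not taken from the article).
example = {
    "chromatin accessibility": [2015, 2016, 2017],
    "transcription factor": [2017, 2018, 2019],
    "single-cell": [2019, 2020, 2021],
}
scores = zscore_average_years(example)
```

Under this sketch, keywords that appear earlier than the corpus average receive negative scores and more recent keywords receive positive scores, which is how a colour bar or horizontal axis on a Z-score scale would be read.
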
Figure 3. Principles and workflow of ATAC-seq. (A) Principles of ATAC-seq, ChIP-seq and RNA-seq. (B) The generic ATAC-seq data analysis workflow.
Figure 4. Phylogenetic tree of 65 species to which ATAC-seq has been applied. The four colours correspond to four kingdoms: protozoa, animals, plants and fungi. The outer circle indicates the number of GEO experiments for each species.
Pipelines for bulk ATAC-seq data analysis
| Name | Data type | Input format | Function | Advantages | Download website | Language | Platform |
|---|---|---|---|---|---|---|---|
| ALTRE | ATAC-seq | CSV | (i) Peak merging and annotation, (ii) differential analysis, (iii) pathway enrichment analysis | (i) Easy-to-use | | R | Windows, Linux, MacOS |
| ATAC-pipe | ATAC-seq | FASTQ | (i) QC, (ii) alignment, (iii) peak calling, (iv) differential analysis, (v) search for motifs, (vi) TF footprinting, (vii) regulatory network reconstruction | (i) Integrated pipeline with multiple toolkits | | Python | Linux, MacOS |
| atacR | ATAC-seq | BAM | (i) Normalization, (ii) differential analysis | (i) Allows for normalization | | R | Platform independent |
| Ataqv | ATAC-seq | BAM | (i) QC with visualization | (i) Diverse QC metrics | | C++ | Linux, MacOS |
| CIPHER | ATAC-seq | FASTQ | (i) QC, (ii) alignment, (iii) peak calling, (iv) peak annotation and visualization, (v) differential analysis, (vi) motif identification | (i) Stand-alone workflow platform | | Nextflow, R | Linux, MacOS |
| COCOA | ATAC-seq | Counts matrix | (i) Quantify inter-sample variation | (i) Supports supervised and unsupervised analysis | | R | Windows, MacOS |
| DEBrowser | ATAC-seq | Counts matrix | (i) QC, (ii) differential analysis, (iii) pathway analysis | (i) A shiny website platform | | R | Platform independent |
| diffTF | ATAC-seq | BAM | (i) Calculate differential TF activity, (ii) classify TFs with RNA-seq data | (i) Classify TFs into activators or repressors | | Snakemake | Cluster system |
| esATAC | ATAC-seq | FASTQ | (i) QC, (ii) alignment, (iii) peak calling, (iv) peak annotation, (v) motif analysis | (i) Perform ‘one command line for results’ analysis | | R and C++ | Windows, Linux, MacOS |
| GUAVA | ATAC-seq | FASTQ | (i) QC, (ii) alignment, (iii) peak calling, (iv) peak annotation, (v) differential analysis, (vi) functional annotation | (i) Standalone software | | Java | Linux, MacOS |
| I-ATAC | ATAC-seq | FASTQ | (i) QC, (ii) alignment, (iii) peak calling | (i) Standalone software | | Java | UNIX, Linux, Windows, MacOS |
| MMARGE | ATAC-seq | FASTQ | (i) Alignment, (ii) TF and motif analysis | (i) Identify combinations of TFs | | Perl and R | UNIX |
| Octopus-toolkit | ATAC-seq | SRA | (i) QC, (ii) alignment, (iii) peak calling, (iv) peak annotation, (v) motif analysis | (i) Standalone software | | Java | Linux, MacOS |
| Recoup | ATAC-seq | BAM | (i) Signal normalization, (ii) coverage profile analysis | (i) High-quality figures | | R | Linux, Windows, MacOS |
| snakePipes | ATAC-seq | FASTQ | (i) QC, (ii) alignment, (iii) peak calling, (iv) differential analysis | (i) Integrates multiple NGS data | | Snakemake | Cluster system |
| TOBIAS | ATAC-seq | BAM | (i) Bias correction, (ii) footprint analysis, (iii) differential analysis | (i) Focus on footprint analysis | | Python | Cluster system |
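
As a rough illustration of how the table above might be used to shortlist a pipeline, the sketch below encodes the Name, Input format and Platform columns and filters them by input format and operating system. The `Pipeline` type and the `find_pipelines` helper are hypothetical names introduced here for illustration, and treating "Platform independent" as matching any platform is an assumption.

```python
from typing import NamedTuple


class Pipeline(NamedTuple):
    name: str
    input_format: str
    platforms: frozenset


def _p(name, input_format, *platforms):
    return Pipeline(name, input_format, frozenset(platforms))


# Name, input format and platform values copied from the table above.
PIPELINES = [
    _p("ALTRE", "CSV", "Windows", "Linux", "MacOS"),
    _p("ATAC-pipe", "FASTQ", "Linux", "MacOS"),
    _p("atacR", "BAM", "Platform independent"),
    _p("Ataqv", "BAM", "Linux", "MacOS"),
    _p("CIPHER", "FASTQ", "Linux", "MacOS"),
    _p("COCOA", "Counts matrix", "Windows", "MacOS"),
    _p("DEBrowser", "Counts matrix", "Platform independent"),
    _p("diffTF", "BAM", "Cluster system"),
    _p("esATAC", "FASTQ", "Windows", "Linux", "MacOS"),
    _p("GUAVA", "FASTQ", "Linux", "MacOS"),
    _p("I-ATAC", "FASTQ", "UNIX", "Linux", "Windows", "MacOS"),
    _p("MMARGE", "FASTQ", "UNIX"),
    _p("Octopus-toolkit", "SRA", "Linux", "MacOS"),
    _p("Recoup", "BAM", "Linux", "Windows", "MacOS"),
    _p("snakePipes", "FASTQ", "Cluster system"),
    _p("TOBIAS", "BAM", "Cluster system"),
]


def find_pipelines(input_format, platform):
    """Return names of pipelines that accept input_format on platform.

    Assumption: a "Platform independent" entry matches every platform.
    """
    return [
        p.name
        for p in PIPELINES
        if p.input_format == input_format
        and (platform in p.platforms or "Platform independent" in p.platforms)
    ]


fastq_on_windows = find_pipelines("FASTQ", "Windows")
bam_on_windows = find_pipelines("BAM", "Windows")
```

With the values as listed in the table, a user starting from raw FASTQ files on Windows would be pointed to esATAC and I-ATAC, while a user with aligned BAM files on Windows would be pointed to atacR and Recoup.
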