Aslıhan Karabacak Calviello, Antje Hirsekorn, Ricardo Wurmus, Dilmurat Yusuf, Uwe Ohler.
Abstract
BACKGROUND: DNase-seq and ATAC-seq are broadly used methods to assay open chromatin regions genome-wide. The single nucleotide resolution of DNase-seq has been further exploited to infer transcription factor binding sites (TFBSs) in regulatory regions through footprinting. Recent studies have demonstrated the sequence bias of DNase I and its adverse effects on footprinting efficiency. However, footprinting and the impact of sequence bias have not been extensively studied for ATAC-seq.
Keywords: ATAC-seq; Bias correction; DNase-seq; Footprinting; Reproducibility
Year: 2019 PMID: 30791920 PMCID: PMC6385462 DOI: 10.1186/s13059-019-1654-y
Source DB: PubMed Journal: Genome Biol ISSN: 1474-7596 Impact factor: 13.583
Fig. 1 Generating ATAC-seq libraries without the usage of lysis buffer increases agreement with DNase-seq. a Percentage of all reads that align to the mitochondrial genome in K562 ATAC-seq libraries generated with the published protocol (10 min lysis), shorter lysis (5 min lysis), or without using lysis buffer (no lysis buffer). b Agreement of these libraries with all K562 DNase-seq libraries as measured by Pearson correlations of read counts in 100 bp bins genome-wide. c Overlap of peaks found in K562 DNase-seq data with peaks in ATAC-seq data generated using the published protocol (left) and peaks in ATAC-seq data generated without using lysis buffer (right)
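The agreement measure in Fig. 1b (Pearson correlation of read counts in 100 bp bins) can be illustrated with a minimal sketch. It is not the authors' pipeline: the function names `bin_counts` and `pearson`, the use of a single read position per read, and the single-chromosome input are assumptions made for illustration only.

```python
from math import sqrt


def bin_counts(positions, chrom_length, bin_size=100):
    """Count read positions per fixed-width bin (hypothetical helper).

    positions: 0-based coordinates, one per read (assumed input format).
    """
    n_bins = (chrom_length + bin_size - 1) // bin_size
    counts = [0] * n_bins
    for pos in positions:
        if 0 <= pos < chrom_length:  # ignore out-of-range coordinates
            counts[pos // bin_size] += 1
    return counts


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length count vectors."""
    if len(x) != len(y) or not x:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant vector")
    return cov / sqrt(var_x * var_y)


# Toy example: two libraries on a 300 bp "chromosome".
atac = bin_counts([0, 50, 99, 100, 250], chrom_length=300)
dnase = bin_counts([10, 20, 150, 260, 270], chrom_length=300)
r = pearson(atac, dnase)
```

For real data the bins would span all chromosomes, and a library such as NumPy would normally replace the pure-Python loops; the logic stays the same.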
Fig. 2 The task of finding open chromatin regions saturates at medium depth. a Number of reads after processing in the 11 HEK293 ATAC-seq libraries with different library depths. The 2 biological replicates are shown in blue and red, with the shades representing the technical replicates. b Numbers of reproducible peaks found with the JAMM-IDR strategy at different depths. c The overlaps between 1 set of peaks in b shown for high vs. medium (left), high vs. low (middle), and medium vs. low sets (right)
Fig. 3 The sequence bias of the Tn5 transposase is distinct from that of DNase I. a Comparison of Tn5 transposition propensities of all 6-mers (log10 scale) in two libraries generated using deproteinized genomic DNA from human (YH1) and D. melanogaster. b 6-mer transposition propensities in the human library compared to cleavage propensities of DNase inferred previously from a single-hit DNase-seq experiment using deproteinized genomic DNA from K562 cells
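The 6-mer propensities compared in Fig. 3 can be pictured with the following sketch. The caption does not give the exact definition, so this assumes one common formulation: the frequency of each 6-mer centered on observed cut or insertion sites divided by its background frequency in the same sequence. The function names and this ratio are illustrative assumptions, not the authors' stated method.

```python
from collections import Counter

VALID_BASES = set("ACGT")


def background_kmers(seq, k=6):
    """Count every k-mer in the sequence, skipping k-mers with non-ACGT bases."""
    seq = seq.upper()
    counts = Counter()
    for i in range(len(seq) - k + 1):
        kmer = seq[i:i + k]
        if set(kmer) <= VALID_BASES:
            counts[kmer] += 1
    return counts


def cut_site_kmers(seq, cut_positions, k=6):
    """Count k-mers centered on each cut position (0-based; assumed convention)."""
    seq = seq.upper()
    half = k // 2
    counts = Counter()
    for pos in cut_positions:
        start = pos - half
        if start < 0 or start + k > len(seq):
            continue  # window would run off the sequence
        kmer = seq[start:start + k]
        if set(kmer) <= VALID_BASES:
            counts[kmer] += 1
    return counts


def propensities(seq, cut_positions, k=6):
    """Observed k-mer frequency at cut sites relative to background frequency."""
    background = background_kmers(seq, k)
    observed = cut_site_kmers(seq, cut_positions, k)
    total_obs = sum(observed.values())
    total_bg = sum(background.values())
    if total_obs == 0 or total_bg == 0:
        raise ValueError("no usable cut sites or background k-mers")
    return {
        kmer: (observed[kmer] / total_obs) / (bg / total_bg)
        for kmer, bg in background.items()
    }


# Toy example: one cut site in a 10 bp sequence.
toy = propensities("ACGTACGTAC", cut_positions=[3])
```

Values above 1 mark 6-mers that are cut more often than their background abundance would predict; comparing two such tables on a log scale corresponds to the kind of scatter plot the caption describes.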
Fig. 4 The number of reproducible footprints scales with library depth. a CTCF footprints inferred from HEK293 ATAC-seq data (left) and DNase-seq data (right). Vertical lines depict the edges of the motif match. b Overlap between reproducible CTCF footprints in the HEK293 DNase-seq and combined ATAC-seq replicates, found using the FLR-IDR strategy. c Numbers of reproducible CTCF footprints in HEK293 ATAC-seq datasets at different depths. d The overlaps between one set of footprints in c shown for high vs. medium (left), high vs. low (middle), and medium vs. low sets (right). e The ratio of reproducible CTCF footprints (IDR footprints) or all CTCF motif regions with positive footprint scores (all footprints) that overlap CTCF ChIP-seq peaks, in all six individual sets at different depths (Additional file 1: Table S3). Red dashed line indicates this ratio for all considered CTCF motif sites
Fig. 5 TF footprinting accuracy is linked to clear discrimination of footprint from the background. a, b AUCs vs. footprint-background model similarities in (a) ATAC-seq data and (b) DNase-seq data. c Difference in AUCs (ATAC-DNase) vs. difference in footprint-background model similarities (ATAC-DNase). d Average DNase I cleavage propensities over candidate TFBSs for all 20 assayed factors