Robert J Schmitz, Alexandre P Marand, Xuan Zhang, Rebecca A Mosher, Franziska Turck, Xuemei Chen, Michael J Axtell, Xuehua Zhong, Siobhan M Brady, Molly Megraw, Blake C Meyers.
Abstract
Epigenomics is the study of molecular signatures associated with discrete regions within genomes, many of which are important for a wide range of nuclear processes. The ability to profile the epigenomic landscape associated with genes, repetitive regions, transposons, transcription, differential expression, cis-regulatory elements, and 3D chromatin interactions has vastly improved our understanding of plant genomes. However, many epigenomic and single-cell genomic assays are challenging to perform in plants, leading to a wide range of data quality issues; thus, the data require rigorous evaluation prior to downstream analyses and interpretation. In this commentary, we provide considerations for the evaluation of plant epigenomics and single-cell genomics data quality with the aim of improving the quality and utility of studies using those data across diverse plant species.
Year: 2022 PMID: 34648025 PMCID: PMC8773985 DOI: 10.1093/plcell/koab255
Source DB: PubMed Journal: Plant Cell ISSN: 1040-4651 Impact factor: 11.277
Recommendations for important checks and controls for plant epigenome assays
| Important Checks and Controls | RNA-seq | smRNA-seq | WGBS | ChIP-seq | Chromatin Accessibility | Single-cell RNA-seq | Single-cell ATAC-seq |
|---|---|---|---|---|---|---|---|
| Low level of duplicate reads | x | x | x | x | x | x | x |
| Appropriate normalization | x | x | x | x | x | x | x |
| Report number of sequenced and aligned reads | x | x | x | x | x | x | x |
| Show representative genome browser screen shots (including replicates) | x | x | x | x | x | x | x |
| Evaluate RNA/DNA quality | x | x | x | ||||
| High and/or consistent alignment rates | x | x | x | x | x | x | x |
| Show Venn diagram or Upset plots of identified clusters/regions/peaks between replicates | x | x | x | x | |||
| Aligned smRNA sequences are enriched for 21–24 nt sizes | | x | | | | | |
| Report bisulfite conversion rates | | | x | | | | |
| Report read coverage of the genome | | | x | | | | |
| Report SPOT or FRiP scores | | | | x | x | | x |
| Implement IDR with replicates | | | | x | | | |
| Include input or IP background control | | | | x | | | |
| Plot read coverage around genomic features (genes/TEs/TSSs, etc.) | x | x | x | ||||
| Consider a spike-in control | x | x | |||||
| Evaluate enzymatic bias using genomic DNA control | | | | | x | | |
| Report number of cells targeted | | | | | | x | x |
| Report number of unique transcripts/Tn5 integrations per cell | | | | | | x | x |
| Evaluate marker genes | | | | | | x | x |
| Filter cells with high proportion of organellar reads | x |
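
Several of the checks in the table reduce to simple ratios of read counts. The sketch below illustrates how such values might be computed once the counts are available. The function names and input counts are hypothetical and are not taken from the commentary. The bisulfite conversion estimate assumes the counts come from cytosines in an unmethylated control sequence, such as the chloroplast genome or a spiked-in unmethylated DNA.

```python
def _ratio(numerator: int, denominator: int) -> float:
    """Return numerator / denominator after basic sanity checks."""
    if denominator <= 0:
        raise ValueError("denominator must be positive")
    if not 0 <= numerator <= denominator:
        raise ValueError("numerator must lie between 0 and the denominator")
    return numerator / denominator


def alignment_rate(aligned_reads: int, sequenced_reads: int) -> float:
    """Fraction of sequenced reads that aligned to the reference."""
    return _ratio(aligned_reads, sequenced_reads)


def duplicate_rate(total_reads: int, unique_reads: int) -> float:
    """Fraction of reads flagged as duplicates of another read."""
    return _ratio(total_reads - unique_reads, total_reads)


def frip(reads_in_peaks: int, mapped_reads: int) -> float:
    """Fraction of Reads in Peaks (FRiP): mapped reads that overlap called peaks."""
    return _ratio(reads_in_peaks, mapped_reads)


def bisulfite_conversion_rate(converted_c: int, unconverted_c: int) -> float:
    """Share of control cytosines read as T (converted).

    Assumption: counts come from an unmethylated control sequence, so every
    unconverted C is treated as a conversion failure.
    """
    return _ratio(converted_c, converted_c + unconverted_c)


def organellar_fraction(organellar_reads: int, total_reads: int) -> float:
    """Per-cell fraction of reads assigned to organellar genomes."""
    return _ratio(organellar_reads, total_reads)


if __name__ == "__main__":
    # Hypothetical counts, for illustration only.
    print("alignment rate:", alignment_rate(800, 1000))
    print("duplicate rate:", duplicate_rate(1000, 900))
    print("FRiP:", frip(250, 1000))
    print("conversion rate:", bisulfite_conversion_rate(995, 5))
    print("organellar fraction:", organellar_fraction(30, 100))
```

What counts as an acceptable value depends on the assay, tissue, and species, so no thresholds are hard-coded here.
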
Figure 1. Accounting for mappability and genome assembly artifacts. A schematic diagram of typical aligned data obtained by ATAC-seq. Regions of chromatin accessibility are indicated by peaks. A region that appears enriched for sequencing coverage (labeled “false peak”) is actually due to a collapsed repeat in the genome assembly. Tn5-treated genomic DNA helps to identify these problematic regions. A k-mer-based approach is used to reveal regions of the genome that are uniquely mappable for a given sequence fragment length.
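
The k-mer idea in the caption above can be illustrated with a minimal sketch. It is not the authors' pipeline. It assumes exact matching only and treats a k-mer and its reverse complement as the same sequence, since reads can originate from either strand. The function names and the toy sequence are hypothetical. Dedicated mappability tools also allow mismatches and scale to whole genomes.

```python
from collections import Counter

# Translation table used to build reverse complements (A<->T, C<->G).
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def canonical(kmer: str) -> str:
    """Return the lexicographically smaller of a k-mer and its reverse complement."""
    revcomp = kmer.translate(_COMPLEMENT)[::-1]
    return min(kmer, revcomp)


def kmer_mappability(genome: str, k: int) -> list[bool]:
    """For each start position, report whether the k-mer there is unique.

    A position is True when its k-mer (on either strand) occurs exactly once
    in the sequence, i.e. a read of length k from that position could be
    placed unambiguously under exact matching.
    """
    if k <= 0:
        raise ValueError("k must be positive")
    genome = genome.upper()
    n_windows = len(genome) - k + 1
    if n_windows <= 0:
        return []
    keys = [canonical(genome[i:i + k]) for i in range(n_windows)]
    counts = Counter(keys)
    return [counts[key] == 1 for key in keys]


if __name__ == "__main__":
    toy = "ACGGTACGG"  # hypothetical sequence containing the repeat "ACGG"
    print(kmer_mappability(toy, 3))
    print(kmer_mappability(toy, 5))
```

In this toy sequence only one 3-mer window is unique, whereas every 5-mer window is unique, which mirrors the caption's point that mappability depends on fragment length.
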
Figure 2. Visualization of ChIP-seq enrichment of histone modifications. Low-quality/failed versus high-quality ChIP-seq data are shown for H3K4me3 and H3K27me3 from soybean (Glycine max) leaves. Tracks 1 and 3 show low-quality and/or failed ChIP-seq data, whereas tracks 2 and 4 show high-quality data. Box 1 shows a region of H3K27me3 enrichment in track 4, whereas the same region shows almost no enrichment in track 3. As is typical for H3K27me3, enrichment is present throughout the gene body into the upstream region. Boxes 2 and 3 show enrichment for H3K4me3 at TSSs in track 2, whereas only weak enrichment is detected in track 1.