Sheng Zhu1,2, Qiwei Lian2, Wenbin Ye2, Wei Qin2, Zhe Wu2, Guoli Ji2, Xiaohui Wu1.
Abstract
Alternative polyadenylation (APA) is a widespread regulatory mechanism of transcript diversification in eukaryotes and is increasingly recognized as an important layer of eukaryotic gene expression. Recent studies based on single-cell RNA-seq (scRNA-seq) have revealed cell-to-cell heterogeneity in APA usage and APA dynamics across different cell types in various tissues, biological processes and diseases. However, currently available APA databases were all built from bulk 3'-seq and/or RNA-seq data, and no existing database provides APA information at single-cell resolution. Here, we present a user-friendly database called scAPAdb (http://www.bmibig.cn/scAPAdb), which provides a comprehensive and manually curated atlas of poly(A) sites, APA events and poly(A) signals at the single-cell level. Currently, scAPAdb collects APA information from > 360 scRNA-seq experiments, covering six species including human, mouse and several plant species. scAPAdb also provides batch download of data, and users can query the database through a variety of keywords such as gene identifier, gene function and accession number. scAPAdb should be a valuable and extendable resource for the study of cell-to-cell heterogeneity in APA isoform usage and APA-mediated gene regulation at the single-cell level across diverse cell types, tissues and species.
Year: 2022 PMID: 34508354 PMCID: PMC8728153 DOI: 10.1093/nar/gkab795
Source DB: PubMed Journal: Nucleic Acids Res ISSN: 0305-1048 Impact factor: 16.971
Figure 1. Schematic diagram of scAPAdb showing the pipeline of data pre-processing and poly(A) site identification, the back-end data of APA information stored in scAPAdb and functional modules in scAPAdb.
Data summary in scAPAdb
| Organism | Study# | Experiment# | Tissue# | Cell type# | Cell# |
|---|---|---|---|---|---|
|  | 37 | 108 | 19 | 84 | 252 836 |
|  | 36 | 192 | 49 | 129 | 511 366 |
|  | 9 | 50 | 2 | 27 | 220 736 |
|  | 1 | 9 | 1 | 4 | 67 821 |
|  | 1 | 1 | 1 | 8 | 12 326 |
|  | 1 | 1 | 1 | 8 | 10 965 |
|  | 1 | 2 | 1 | 2 | 13 507 |
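As a quick consistency check, the per-organism counts in the data summary can be tallied. The sketch below is illustrative only: the rows are copied in table order, and the variable names are chosen here rather than taken from scAPAdb. The experiment total it computes agrees with the "> 360 scRNA-seq experiments" stated in the abstract.

```python
# Rows copied from the data summary table, in table order.
# Each tuple is (experiments, cells); organism names are not used here.
rows = [
    (108, 252_836),
    (192, 511_366),
    (50, 220_736),
    (9, 67_821),
    (1, 12_326),
    (1, 10_965),
    (2, 13_507),
]

# Sum each column across all rows of the table.
total_experiments = sum(experiments for experiments, _ in rows)
total_cells = sum(cells for _, cells in rows)

print(total_experiments, total_cells)  # prints: 363 1089557
```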
Figure 2. Exploration of the APA landscape with scAPAdb. The ‘Experiment’ page presents each experiment as a dataset card, and users can filter experiments by species, year of publication, sequencing protocol and tissue (A). Upon selection of a study (here, GSE104556; accessible via http://www.bmibig.cn/scAPAdb/groups/Study/study_info.php?study=GSE104556), detailed APA information on the study is provided, including a summary and statistics (B), the global structure of the cell type distribution shown by a UMAP plot (C) and 2D embeddings of all experiments in the study (D). Similarly, upon selection of an experiment (accessible via http://www.bmibig.cn/scAPAdb/groups/Dataset/dataset_info.php?GSM=GSM2803334), the APA usage of individual cells can be visualized by a UMAP plot (E) or a bar plot (F). Single nucleotide compositions around poly(A) sites and poly(A) signal motifs of selected cell types are provided in the ‘PolyA Signal’ module (G). Users can search the entire database for a gene, an experiment or a study by a variety of keywords through the search interface (H). Upon a query for the keyword ‘Odf4’, an intermediate page lists the genes associated with the keyword, and users can click the gene of interest (here, ENSMUSG00000032921) to show all experiments in which at least one poly(A) site is expressed in this gene (I). Clicking the dataset card of the experiment GSM2803334 in the search results shows detailed APA information for the gene, including a summary of the gene and its APA sites (J), the read coverage of poly(A) sites along the gene across cell types (K), APA dynamics across cell types (L) and a scatter plot showing the correlation between APA usage, measured by PPUI, and gene expression (M). SC, spermatocytes; ES, elongating spermatids; RS, round spermatids; PPUI, percentage of the proximal poly(A) site usage index.
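The two page addresses quoted in the Figure 2 caption follow a simple pattern, with the accession passed as a query parameter (`study` for a study page, `GSM` for an experiment page). A minimal sketch for building such links is given below, assuming the pattern holds for other accessions as well; the helper names `study_url` and `experiment_url` are hypothetical and not part of scAPAdb.

```python
from urllib.parse import urlencode

# Base address of the database as given in the abstract.
BASE = "http://www.bmibig.cn/scAPAdb"


def study_url(study_id: str) -> str:
    """Build a study page link (hypothetical helper; pattern taken from the caption)."""
    return f"{BASE}/groups/Study/study_info.php?{urlencode({'study': study_id})}"


def experiment_url(gsm_id: str) -> str:
    """Build an experiment page link (hypothetical helper; pattern taken from the caption)."""
    return f"{BASE}/groups/Dataset/dataset_info.php?{urlencode({'GSM': gsm_id})}"


print(study_url("GSE104556"))
print(experiment_url("GSM2803334"))
```

For the two accessions used in the caption, these helpers reproduce the quoted addresses exactly; whether every accession in the database resolves through the same pattern is not stated in the source.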