Leiming You1, Jiexin Wu2, Yuchao Feng2, Yonggui Fu2, Yanan Guo2, Liyuan Long2, Hui Zhang2, Yijie Luan2, Peng Tian2, Liangfu Chen2, Guangrui Huang2, Shengfeng Huang2, Yuxin Li2, Jie Li2, Chengyong Chen2, Yaqing Zhang2, Shangwu Chen2, Anlong Xu3.
Abstract
An increasing number of genes have been shown to use alternative polyadenylation (APA) 3'-processing sites depending on the cell and tissue type and/or the physiological and pathological conditions at the time of processing. A genome-wide APA database is therefore urgently needed to better understand poly(A) site selection and APA-directed regulation of gene expression in a given biological context. Here we present a web-accessible database, APASdb (http://mosas.sysu.edu.cn/utr), which visualizes the precise map and usage quantification of the different APA isoforms for all genes. The datasets were deeply profiled with the sequencing alternative polyadenylation sites (SAPAS) method, which performs high-throughput sequencing of the 3'-ends of polyadenylated transcripts. APASdb thus details all the heterogeneous cleavage sites downstream of poly(A) signals and maintains near-complete coverage of APA sites, considerably better than previous databases built with conventional methods. Furthermore, APASdb quantifies a given APA variant among transcripts with different APA sites by computing their corresponding normalized reads. In addition, APASdb supports URL-based retrieval, browsing and display of exon-intron structure, poly(A) signals, poly(A) site locations and usage reads, and 3'-untranslated regions (3'-UTRs). Currently, APASdb covers APA in various biological processes and diseases in human, mouse and zebrafish.
Year: 2014 PMID: 25378337 PMCID: PMC4383914 DOI: 10.1093/nar/gku1076
Source DB: PubMed Journal: Nucleic Acids Res ISSN: 0305-1048 Impact factor: 16.971
Figure 1. Overview of the APASdb website. (A) Experimental outline of SAPAS library preparation. (B) Outline of the APASdb building pipeline. The data flow is indicated by arrowed lines. The data generated by this optimized pipeline contain the positions and reads of heterogeneous cleavage sites, poly(A) signals and 3′-UTR sequences, as well as the locations and usage reads of poly(A) sites. (C) Schematic representation of a poly(A) site and the polyadenylation configuration. A poly(A) span is a cluster of heterogeneous cleavage sites (arrowed lines), and the most frequently used cleavage site is defined as the reference point for a poly(A) site. The binding sites for the cleavage and polyadenylation specificity factor (CPSF) and the cleavage stimulatory factor (CstF) are also depicted. (D) Architecture of the APASdb website. Arrows denote the direction of information flow, and several output pages are shown, including the popular genome browser (Gbrowse) and, in particular, the custom-developed views termed ‘poly(A)-site map’, ‘poly(A)-site usage’ and ‘heterogeneous cleavage-site selection’.
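The poly(A) span described in panel C can be illustrated with a minimal Python sketch. It assumes cleavage positions from a single strand of a single chromosome and merges neighbouring positions using a hypothetical `max_gap` distance threshold. The function name `cluster_cleavage_sites`, the threshold and the output fields are assumptions made for illustration, not the parameters of the SAPAS pipeline.

```python
from typing import Dict, List


def cluster_cleavage_sites(reads_by_pos: Dict[int, int], max_gap: int) -> List[dict]:
    """Group cleavage positions into poly(A) spans (illustrative only).

    reads_by_pos maps a cleavage position to its supporting read count.
    max_gap is a hypothetical threshold: consecutive positions further
    apart than this start a new span.
    """
    spans: List[List[int]] = []
    current: List[int] = []
    for pos in sorted(reads_by_pos):
        # Close the current span when the gap to the previous position is too large.
        if current and pos - current[-1] > max_gap:
            spans.append(current)
            current = []
        current.append(pos)
    if current:
        spans.append(current)

    sites = []
    for span in spans:
        # The most frequently used cleavage site is the reference point.
        reference = max(span, key=lambda p: reads_by_pos[p])
        sites.append({
            "start": span[0],
            "end": span[-1],
            "reference": reference,
            "reads": sum(reads_by_pos[p] for p in span),
        })
    return sites


example = cluster_cleavage_sites({100: 2, 103: 10, 110: 1, 200: 5}, max_gap=24)
```

Here the first three positions fall into one span whose reference point is position 103, and position 200 forms a span of its own.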
Searching datasets available on the APASdb website, listed by species
| Species | Searching dataset | Subsets (a) | Poly(A) sites | Simple description (b) |
|---|---|---|---|---|
| Zebrafish | Zv9_embryonic_development | 8 | 108 290 | Dynamic APA sites and 3′-UTRs, and selection of heterogeneous cleavage sites, during zebrafish embryonic development. |
| Mouse | mm9_thymic_development | 8 | 226 858 | Dynamic APA sites and 3′-UTRs, and selection of heterogeneous cleavage sites, in mouse thymopoiesis. |
| Human | hg19_breastCancer_MCF10A-MCF7-MB231 | 3 | 46 531 | Genome-wide APA sites and 3′-UTRs, and selection of heterogeneous cleavage sites, in the human breast cancer cell lines MCF7 and MB231 and in one cultured normal epithelial cell line, MCF10A. |
| Human | hg19_rectalCancer_12N-VS-12T | 2 | 74 116 | Genome-wide APA sites and 3′-UTRs, and selection of heterogeneous cleavage sites, in normal and tumorous human rectal tissues. |
| Human | hg19_rhinosinusitis_11N11P25N25P26N26P | 6 | 83 641 | Genome-wide APA sites and 3′-UTRs, and selection of heterogeneous cleavage sites, in nasal polyps and nasal uncinate process mucosa of patients with eosinophilic chronic rhinosinusitis with nasal polyps. |
| Human | hg19_human-all22-tissues | 22 | 179 532 | Genome-wide APA sites and 3′-UTRs, and selection of heterogeneous cleavage sites, in 20 human tissues. |
(a) Total number of subsets integrated into a searching dataset.
(b) Detailed descriptions of the experimental samples are available in the Supplementary Notes or at http://mosas.sysu.edu.cn/utr/search_APASdb.php?show=1.
Figure 2. Screenshot of the search page and the media page returned for the fuzzy query keyword ‘chemokine’. (A) User retrieval interface for querying datasets. (B) Descriptive list of datasets in the retrieval interface. The list summarizes the released datasets and guides the user's query. The ‘view’ button gives quick access to an example query of a dataset, and the ‘chr’ button links to browsing of the dataset in a genome browser (Gbrowse). (C) List of genes containing APA sites that match the fuzzy keyword ‘chemokine’. Each icon in the ‘APAS’ column of the result table links to a detail page with more information on the corresponding APA sites, and the highlighted number in the ‘APAS’ column gives the number of APA sites located in the transcript locus. Hyperlinked text in the other columns redirects to other external resources; in particular, the text in the ‘locus’ column leads to URLs for browsing the APA sites associated with a gene in a genome browser. To view this example directly, see http://mosas.sysu.edu.cn/utr/search_APASdb.php?seqkeywords=chemokine.
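The URL-based retrieval used in this example can be sketched in a few lines of Python. The base address and the `seqkeywords` parameter are taken from the example URLs in this record; the helper name `apasdb_query_url` is hypothetical, and the sketch only builds the URL without contacting the server, whose current availability is not assumed.

```python
from urllib.parse import urlencode

# Base address of the search page, as given in the figure captions.
BASE_URL = "http://mosas.sysu.edu.cn/utr/search_APASdb.php"


def apasdb_query_url(keyword: str) -> str:
    """Build a keyword query URL (hypothetical helper, illustration only)."""
    # urlencode percent-encodes the keyword so that spaces and
    # special characters are safe to place in a query string.
    return f"{BASE_URL}?{urlencode({'seqkeywords': keyword})}"


chemokine_url = apasdb_query_url("chemokine")
gene_url = apasdb_query_url("ENSDARG00000037116")
```

The same helper covers both fuzzy keywords such as `chemokine` and gene identifiers such as `ENSDARG00000037116`, since both are passed through the same query parameter in the source examples.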
Figure 3. Screenshot of the detail page with the ‘polyA-site used’ tab unfolded, showing the dynamic usage of APA sites and the expression pattern of cxcl12a during zebrafish embryogenesis. The bar chart shows the location and usage quantification of the APA sites of cxcl12a from 0 hpf to 5 dpf. The curve shows the expression pattern of cxcl12a during zebrafish embryogenesis, obtained by summing the normalized supporting reads of the APA sites that appear at each stage. Y-axis, number of normalized reads. Reads, read number normalized per million mapped reads; hpf, hours post fertilization; dpf, days post fertilization; pA, poly(A) site. To browse this example, see http://mosas.sysu.edu.cn/utr/search_APASdb.php?seqkeywords=ENSDARG00000037116.
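The quantities plotted in this figure can be illustrated with a short Python sketch: raw reads supporting each poly(A) site are normalized per million mapped reads, and the normalized reads of all sites in a stage are summed to give the gene's expression at that stage. The function names, the input layout and the example numbers are assumptions for illustration; the relative usage fraction is an extra convenience that is not described in the source.

```python
from typing import Dict


def reads_per_million(raw_reads: int, total_mapped_reads: int) -> float:
    """Normalize a raw read count per million mapped reads."""
    return raw_reads * 1_000_000 / total_mapped_reads


def summarize_stage(site_reads: Dict[str, int], total_mapped_reads: int) -> dict:
    """Summarize the poly(A) sites of one gene in one stage (illustrative)."""
    normalized = {
        site: reads_per_million(reads, total_mapped_reads)
        for site, reads in site_reads.items()
    }
    # Gene expression at this stage: sum of normalized reads over its APA sites.
    expression = sum(normalized.values())
    # Relative usage of each site (an assumption added here, not from the source).
    usage = {
        site: (value / expression if expression else 0.0)
        for site, value in normalized.items()
    }
    return {"normalized": normalized, "expression": expression, "usage": usage}


# Made-up counts for two hypothetical sites in one stage.
stage = summarize_stage({"pA1": 50, "pA2": 150}, total_mapped_reads=2_000_000)
```

Repeating `summarize_stage` for each developmental stage would give the per-site bars and the summed expression curve in the same units.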