Weihao Zhao, Shang Zhang, Yumin Zhu, Xiaochen Xi, Pengfei Bao, Ziyuan Ma, Thomas H Kapral, Shuyuan Chen, Bojan Zagrovic, Yucheng T Yang, Zhi John Lu.
Abstract
RNA-binding proteins (RBPs) play key roles in post-transcriptional regulation. Accurate identification of RBP binding sites in multiple cell lines and tissue types from diverse species is fundamental to understanding the regulatory mechanisms of RBPs under both physiological and pathological conditions. Our POSTAR annotation processes use publicly available large-scale CLIP-seq datasets and external functional genomic annotations to generate a comprehensive map of RBP binding sites and their association with other regulatory events as well as functional variants. Here, we present POSTAR3, an updated database with improvements in data collection, annotation infrastructure, and analysis that support the annotation of post-transcriptional regulation in multiple species. The improvements are as follows: we made a comprehensive update of the CLIP-seq and Ribo-seq datasets, which now cover more biological conditions, technologies, and species; we added RNA secondary structure profiling for RBP binding sites; we provided miRNA-mediated degradation events validated by degradome-seq; we included RBP binding sites at circRNA junction regions; and we expanded the annotation of RBP binding sites, particularly using updated genomic variants and mutations associated with diseases. POSTAR3 is freely available at http://postar.ncrnalab.org.
Year: 2022 PMID: 34403477 PMCID: PMC8728292 DOI: 10.1093/nar/gkab702
Source DB: PubMed Journal: Nucleic Acids Res ISSN: 0305-1048 Impact factor: 16.971
Figure 1. Overview of POSTAR3 database content. The database centers on the RBP-RNA interaction network and presents RBP binding information derived from CLIP-seq. Other types of post-transcriptional regulation events (RNA modification and editing, genomic variants, disease-associated mutations, secondary structure profiles, miRNA-mediated decay, etc.) and translational dynamics from Ribo-seq are associated with RBP binding to give users novel insights into the relationships between these events.
Figure 2. Statistics of data curated in the POSTAR3 database. (A) Number of CLIP-seq and Ribo-seq datasets in seven species, compared with the previous version, POSTAR2. (B) Number of newly curated CLIP-seq datasets by technology. (C) Number of curated RBPs in seven species. (D) Human RBP-RNA interactome network in POSTAR3. Arcs at the top represent human chromosomes, and arcs at the bottom represent RBPs. (E) Number of structure-seq and degradome-seq datasets curated in POSTAR3. (F) Annotation status of RBP binding sites in different modules. Each dot indicates a specific set of data. (G) Distribution of MFE ratios across all degradome duplexes in four species.
Figure 3. Example applications of POSTAR3: studying Ireb2 in mouse and AT2G33830 in Arabidopsis. (A) Search for the mouse Ireb2 gene in the ‘Structurome’ module. In this module, users can view the secondary structure model predicted by algorithms that incorporate secondary structure profiling data. (B) Users can also click the ‘RBP location & Structure’ or ‘Reactivity & structure’ button to visualize the secondary structure using forna, along with other layers of information. (C) Search for the mouse Ireb2 gene in the ‘Disease Mutations’ module. This module provides information on disease-associated mutations associated with RBP binding in human. Note that the score for this binding site is relatively high. (D) Search for the Arabidopsis AT2G33830 gene in the ‘Degradome’ module. The search returns a table of miRNA–mRNA binding and degradation peaks, with statistical scores indicating the reliability of each degradation pair. (E) Search for the Arabidopsis AT2G33830 gene in the ‘Genomic Variants’ module. This module provides information on genomic variants residing within RBP binding sites.
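The idea of associating genomic variants with RBP binding sites can be illustrated with a small interval lookup. The following is a minimal sketch, not the POSTAR3 pipeline: the names `Site`, `Variant`, and `variants_in_sites` are hypothetical, the coordinates are toy values rather than database records, and half-open 0-based intervals are assumed.

```python
from bisect import bisect_left
from typing import NamedTuple


class Site(NamedTuple):
    """Hypothetical RBP binding site; interval is 0-based, half-open [start, end)."""
    chrom: str
    start: int
    end: int
    rbp: str


class Variant(NamedTuple):
    """Hypothetical single-position variant (0-based position)."""
    chrom: str
    pos: int
    var_id: str


def variants_in_sites(sites, variants):
    """Map each site to the IDs of variants whose position lies inside it."""
    # Group variants by chromosome and sort them by position.
    grouped = {}
    for v in variants:
        grouped.setdefault(v.chrom, []).append((v.pos, v.var_id))
    positions, ids = {}, {}
    for chrom, entries in grouped.items():
        entries.sort()
        positions[chrom] = [p for p, _ in entries]
        ids[chrom] = [i for _, i in entries]

    # For each site, binary-search the sorted positions for [start, end).
    hits = {}
    for s in sites:
        pos = positions.get(s.chrom, [])
        lo = bisect_left(pos, s.start)
        hi = bisect_left(pos, s.end)  # end is exclusive
        hits[s] = ids.get(s.chrom, [])[lo:hi]
    return hits


# Toy data (assumed values, for illustration only).
sites = [
    Site("chr1", 100, 150, "RBP_A"),
    Site("chr1", 300, 340, "RBP_B"),
    Site("chr2", 10, 50, "RBP_A"),
]
variants = [
    Variant("chr1", 120, "var1"),
    Variant("chr1", 150, "var2"),  # at the exclusive end, so outside the site
    Variant("chr1", 339, "var3"),
    Variant("chr2", 5, "var4"),
]

result = variants_in_sites(sites, variants)
for site, found in result.items():
    print(site.rbp, site.chrom, site.start, site.end, found)
```

Sorting the variants once per chromosome and using binary search keeps each site lookup logarithmic in the number of variants; for genome-scale data, a dedicated interval tool would be the more usual choice.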