Hsin-Sung Yeh1, Wei Zhang2, Jeongsik Yong1.
Abstract
Alterations in the usage of polyadenylation sites during transcription termination yield multiple transcript isoforms from a single gene. Recent findings of transcriptome-wide alternative polyadenylation (APA) as a molecular response to biological changes position APA not only as a molecular event of early transcriptional termination but also as a cellular regulatory step affecting various biological pathways. With the development of high-throughput profiling technologies at single-nucleotide resolution and their application to the 3'-end of mRNAs, the dynamics of the mRNA 3'-end landscape are measurable at a global scale. In this review, methods and technologies that have been adopted to study APA events are discussed. In addition, various bioinformatics algorithms for APA isoform analysis using publicly available RNA-seq datasets are introduced. [BMB Reports 2017; 50(4): 201-207].
Year: 2017 PMID: 28148393 PMCID: PMC5437964 DOI: 10.5483/bmbrep.2017.50.4.019
Source DB: PubMed Journal: BMB Rep ISSN: 1976-6696 Impact factor: 4.778
Fig. 1 Schematic showing a gene structure and alternative polyadenylation. A gene is composed of exons and introns. The exons of a gene are divided into coding DNA sequence (CDS) and untranslated regions (UTRs). Alternative polyadenylation can occur within the last exon of a gene (UTR-APA) and/or in upstream exons/introns (CR-APA).
Currently available algorithms to analyze the variations of 3′-UTR length using RNA-seq data
| Algorithm | Reference | Description |
|---|---|---|
| DaPars | ( | It first models the RNA-seq read densities of both tumor and normal samples as a linear combination of both proximal and distal polyA sites. It then uses a linear regression model to identify the location of the |
| ChangePoint | ( | It is based on a generalized likelihood ratio statistic for identifying 3′UTR length change in the analysis of RNA-seq data. A directional multiple test procedure is then used to identify APA events between two samples. |
| Roar | ( | It is based on the Fisher test to detect imbalances in the number of RNA-seq reads mapped to the 3′UTRs. Read counts and fragment lengths are then used to calculate the prevalence of the short isoform over the long one in two biological conditions to identify APA events. |
| 3USS | ( | A web-server developed with the aim of giving experimentalists the possibility to identify alternative 3′UTRs between two samples by RNA-seq data analysis. |
| IsoSCM | ( | A method for transcript assembly that incorporates change point analysis by a Bayesian framework to improve the 3′UTR annotation process with RNA-seq data. |
| KLEAT | ( | An analysis tool that uses |
| GETUTR | ( | It first builds a density function of RNA-seq reads aligned to the 3′UTRs using kernel density estimation. A smoothing step is then applied to preserve the biological changes of the 3′UTR. The goal of the method is to estimate the 3′UTR landscape based on this smoothed RNA-seq signal. |
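
The count-based idea described for Roar in the table can be illustrated with a minimal sketch. It is not Roar's actual implementation: it assumes a simplified 2×2 table of read counts (reads in the region shared by both isoforms versus reads in the region unique to the long isoform, in two conditions), and the function name `fisher_exact_two_sided` and the example counts are hypothetical. The test is written with the standard library only.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher exact test p-value for the 2x2 table [[a, b], [c, d]].

    Assumed layout (hypothetical, for illustration only):
      rows    = condition 1, condition 2
      columns = reads in the shared 3'UTR region, reads in the distal-only region
    """
    row1, row2, col1 = a + b, c + d, a + c
    total_tables = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of observing x in the top-left cell
        return comb(row1, x) * comb(row2, col1 - x) / total_tables

    p_observed = prob(a)
    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    # Sum the probabilities of all tables no more likely than the observed one
    return sum(
        p
        for p in (prob(x) for x in range(lowest, highest + 1))
        if p <= p_observed * (1 + 1e-9)
    )


# Hypothetical counts: condition 1 has relatively fewer distal-only reads,
# which would be consistent with a shift toward the short isoform.
p_value = fisher_exact_two_sided(150, 20, 120, 80)
print(f"p = {p_value:.3g}")
```

A small p-value here only indicates that the split of reads between the two regions differs between conditions; published tools additionally account for region and fragment lengths and for multiple testing across genes.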
Fig. 2 A workflow for 3′end-seq and PacBio Iso-seq. (A) Multiple versions of global profiling methods for the 3′-end sequences of mRNAs are available. An example of a 3′end-seq method is shown. A 3′end-seq cDNA library is produced by a series of molecular biology steps integrating first-strand cDNA synthesis and PCR. Short sequence reads of polyadenylation sites are cataloged by conducting RNA-seq using the cDNA library and trimming/aligning the sequencing data. (B) The SMRTbell cDNA library for PacBio-seq can be produced from a cDNA amplicon made by reverse transcription. Concatemerized long reads of the insert in the SMRTbell cDNA library are produced as raw data. Processed long reads (after eliminating SMRTbell sequences) are aligned to generate a consensus sequence of long reads.
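
The trimming step mentioned in panel (A) can be sketched as follows. This is an illustrative assumption rather than the protocol's actual pipeline: it supposes that reads are in the sense orientation with the poly(A) tail at their 3′ end, and the function name `trim_poly_a` and the `min_tail` threshold are hypothetical.

```python
import re


def trim_poly_a(read, min_tail=8):
    """Strip a trailing run of at least `min_tail` A bases from a read.

    Returns (trimmed_read, had_tail). Reads without a long enough tail are
    returned unchanged, so they can be filtered out before alignment.
    """
    match = re.search(r"A{%d,}$" % min_tail, read)
    if match is None:
        return read, False
    return read[: match.start()], True


# Hypothetical read: 7 bases of 3'UTR sequence followed by a 10-base poly(A) tail
trimmed, had_tail = trim_poly_a("ACGTTGC" + "A" * 10)
print(trimmed, had_tail)
```

After trimming, the last aligned base of a tailed read marks a candidate polyadenylation site; real pipelines also need to handle sequencing errors within the tail and internal priming at genomic A-rich stretches.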