Andrew Routh.
Abstract
Poly(A)-tail targeted RNAseq approaches, such as 3'READS, PAS-Seq and Poly(A)-ClickSeq, are becoming popular alternatives to random-primed RNAseq because they focus sequencing reads on the 3' ends of polyadenylated RNAs, allowing poly(A)-sites to be identified and changes in their usage to be characterized. Additionally, we and others have demonstrated that these approaches perform similarly to other RNAseq strategies for differential gene expression analysis, while reducing the volume of sequencing data required and providing a simpler library synthesis strategy. Here, we present DPAC (Differential Poly(A)-Clustering), a streamlined pipeline for the preprocessing of poly(A)-tail targeted RNAseq data, mapping of poly(A)-sites, poly(A)-site clustering and annotation, and determination of differential poly(A)-cluster usage using DESeq2. Changes in poly(A)-cluster usage are simultaneously used to report differential gene expression, differential terminal exon usage and alternative polyadenylation (APA).
Keywords: Alternative polyadenylation; ClickSeq; Differential gene expression; Poly(A)-sites
Year: 2019 PMID: 31023725 PMCID: PMC6553543 DOI: 10.1534/g3.119.400273
Source DB: PubMed Journal: G3 (Bethesda) ISSN: 2160-1836 Impact factor: 3.154
Figure 1. A flow-chart summarizing each of the stages, required input files and returned output files for the DPAC pipeline. Command-line options used to invoke each stage are illustrated: -P for raw data preprocessing, -M for mapping, -C for poly(A) cluster generation, -A for poly(A) cluster database renaming, -B for bedgraphs, and -D for the final differential PAC usage analysis. Examples of the output of the DPAC pipeline are shown for three genes: SCL35E2A, MEGF11, and CD9.
Figure 2. Read coverage and detected poly(A)-sites (PASs) over the CD9 gene are depicted for two Poly(A)-ClickSeq samples: mock-treated HeLa cells (blue) and CFIm25 siRNA-treated HeLa cells (orange). Poly(A)-Clusters (PACs) are illustrated as a track (red) in the UCSC genome browser. The most frequently detected poly(A)-site within each poly(A)-cluster is highlighted as the thicker portion of the cluster in the track.
Example of the count table used for DESeq2 for the CD9 gene, CD9 exon, and CD9 poly(A)-clusters
| Gene: | CD9 | 1993 | 1820 | 1900 | 6639 | 4021 | 6806 |
| Exon: | CD9_exon_chr12:6346929 | 1993 | 1820 | 1900 | 6639 | 4021 | 6806 |
| PACs: | CD9_exon_chr12:6346929_PAS-1 | 5 | 267 | 388 | 4061 | 2537 | 4262 |
| | CD9_exon_chr12:6346929_PAS-2 | 1988 | 1553 | 1512 | 2578 | 1484 | 2544 |
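The relationship between the rows of the count table above can be checked with a short calculation. The following is a minimal sketch, not part of DPAC itself: the counts are copied from the table, while the helper names `pac_totals` and `pac_fractions` are hypothetical. The table does not label its six sample columns, so the sketch makes no assumption about which samples are control or knockdown.

```python
# Counts copied from the CD9 rows of the count table (six samples, in column order).
exon_counts = [1993, 1820, 1900, 6639, 4021, 6806]
pac_counts = {
    "CD9_exon_chr12:6346929_PAS-1": [5, 267, 388, 4061, 2537, 4262],
    "CD9_exon_chr12:6346929_PAS-2": [1988, 1553, 1512, 2578, 1484, 2544],
}


def pac_totals(pacs):
    """Sum the PAC counts sample by sample (column-wise)."""
    return [sum(column) for column in zip(*pacs.values())]


def pac_fractions(pacs):
    """Fraction of each sample's reads that falls in each PAC."""
    totals = pac_totals(pacs)
    return {
        name: [count / total for count, total in zip(counts, totals)]
        for name, counts in pacs.items()
    }


totals = pac_totals(pac_counts)
fractions = pac_fractions(pac_counts)

# In this example the two PACs account for every read assigned to the exon.
print("PAC counts sum to exon counts:", totals == exon_counts)
for name, fracs in fractions.items():
    print(name, [round(f, 3) for f in fracs])
```

In this example the per-sample PAC counts add up exactly to the exon-level counts, and the share of reads in PAS-1 differs markedly between the first three and last three samples, which is the kind of shift in poly(A)-cluster usage the pipeline is designed to report.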
Table 2. Summaries of findings of DPAC analysis of CFIm25 KD HeLa cells using three different sets of parameters
| | Exons only | All PACs (inc. introns) | PolyA_DB PACs |
|---|---|---|---|
| | 12499 | 12886 | 11523 |
| - Increase | 267 | 335 | 261 |
| - Decrease | 117 | 121 | 103 |
| | 14025 | 26217 | 14367 |
| - Increase | 342 | 412 | 392 |
| - Decrease | 130 | 146 | 127 |
| Terminal Exon Change | 154 | 235 | 194 |
| | 29411 | 41573 | 20949 |
| - Increase | 1167 | 925 | 1052 |
| - Decrease | 307 | 308 | 271 |
| Genes with multiple PACs | 5067 | 5880 | 3881 |
| | 861 | 647 | 638 |
| - Shortening | 620 | 457 | 485 |
| - Lengthening | 82 | 78 | 89 |
| - Both | 153 | 109 | 60 |
Figure 3. Volcano plots of the differential expression of Genes (left) and Poly(A)-Clusters (right) in HeLa cells upon siRNA KD of CFIm25 using default settings of DPAC (data from Table 2, column 1). Red dots indicate genes or PACs with a fold change greater than 1.5 and an adjusted p-value less than 0.1.
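The highlighting rule in the Figure 3 caption can be written as a small filter. This is a minimal sketch rather than DPAC's own code: the two cutoffs come from the caption, but the function name `classify` is hypothetical, and treating the fold-change cutoff symmetrically on the log2 scale (so that decreases of the same magnitude also qualify) is an assumption. The inputs correspond to the `log2FoldChange` and `padj` values that DESeq2 reports.

```python
import math

FOLD_CHANGE_CUTOFF = 1.5  # from the Figure 3 caption
PADJ_CUTOFF = 0.1         # from the Figure 3 caption


def classify(log2_fold_change, padj):
    """Label one gene or PAC as 'increase', 'decrease' or 'not significant'.

    Assumption: the fold-change cutoff is applied symmetrically on the
    log2 scale. A missing adjusted p-value (None or NaN) is treated as
    not significant.
    """
    if padj is None or math.isnan(padj) or padj >= PADJ_CUTOFF:
        return "not significant"
    threshold = math.log2(FOLD_CHANGE_CUTOFF)
    if log2_fold_change > threshold:
        return "increase"
    if log2_fold_change < -threshold:
        return "decrease"
    return "not significant"


# Illustrative values only; these are not results from the paper.
for lfc, padj in [(1.0, 0.01), (-1.0, 0.01), (0.3, 0.01), (2.0, 0.5)]:
    print(lfc, padj, classify(lfc, padj))
```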
Summaries of Poly(A)-Clusters annotated using 3′READS+ and PAS-Seq datasets
| ( | hg19 | 29,820,497 (4 datasets) | 2,519,867 | 21,532 Total, 15,599 Exonic, 2,081 Intronic, 3,852 Intergenic | ‘-c’ ( |
| ( | mm10 | 83,224,679 (4 datasets) | 37,663,516 | 34,156 Total, 19,922 Exonic, 4,834 Intronic, 9,402 Intergenic | ‘-a 10’ ( |