Nitika Kandhari, Calvin A Kraupner-Taylor, Paul F Harrison, David R Powell, Traude H Beilharz.
Abstract
Alternative transcript cleavage and polyadenylation is linked to cancer cell transformation, proliferation and outcome. This has led researchers to develop methods to detect and bioinformatically analyse alternative polyadenylation as potential cancer biomarkers. If incorporated into standard prognostic measures such as gene expression and clinical parameters, these could advance cancer prognostic testing and possibly guide therapy. In this review, we focus on the existing methodologies, both experimental and computational, that have been applied to support the use of alternative polyadenylation as cancer biomarkers.
Keywords: 3′ focused RNA-seq; alternative polyadenylation; bioinformatics; cancer biomarkers; scRNA-seq
Year: 2021 PMID: 34070203 PMCID: PMC8158509 DOI: 10.3390/ijms22105322
Source DB: PubMed Journal: Int J Mol Sci ISSN: 1422-0067 Impact factor: 5.923
Figure 1. Alternative polyadenylation: (A) The schematic shows the 5′ end, coding sequences (grey boxes), 3′ UnTranslated Regions (3′ UTRs) and polyadenylation sites (blue arrows) in DNA. (B) Polyadenylation is the enzymatic addition of ∼200 adenosine residues to the nascent mRNA; in this case the distal polyadenylation site was used. (B,C) In 3′ UTR-APA, choice of the proximal cleavage and polyadenylation site results in an mRNA with the same protein-coding potential but a different 3′ UTR length. (D) When a poly(A) signal is recognised in an intronic region, protein isoforms with distinct carboxy-termini are generated in a process termed CR-APA.
Figure 2. The triad of APA attributes: This review focuses on three attributes of genome-wide APA, i.e., characterisation, detection and curation of APA databases. Currently, conventional RNA-seq, 3′ focused seq and single-cell RNA-seq are the main methods for APA characterisation. APA databases hold information relating to APAs and 3′ UTRs collated from a wide array of inputs. Detection requires bioinformatic methods for statistical ranking. These methods are classified by whether they rely on prior knowledge from the databases or determine poly(A) sites de novo. The bioinformatic methods for single-cell data analysis are shown in red.
3′ focused RNA-sequencing approaches suitable for APA detection and characterization.
| Name | Key Points | Typical Input | Sequence Target |
|---|---|---|---|
| PAIso-seq | PacBio-based method to capture poly(A) site, length, splicing and expression. PacBio is costly for the read coverage obtained. Low coverage | 100 ng total RNA | Full length mRNA, |
| Oxford Nanopore Direct RNA sequencing | The Nanopore instrument is capable of full-length direct RNA-seq; tail lengths can also be extracted. Low coverage | 500 ng poly(A)+ selected RNA | Full length mRNA, |
| TAIL-seq | rRNA depletion and 3 | ∼100 µg total RNA | Poly(A) tail length, |
| mTAIL-seq | 3 | 1–5 µg total RNA | Poly(A) tail length, |
| PAT-seq | Single end read approach, 3 | 1 µg total RNA | Poly(A) tail length, |
| PAL-seq | Requires non-standard use of an Illumina instrument for tail length measurement by biotinylated dTTP incorporation. 3 | 1–50 µg total RNA | Poly(A) tail length, |
| Poly(A) seq | Poly(A)+ RNA is captured with oligo(dT) conjugated magnetic beads, then 3 | 5.1 µg total RNA | Poly(A) tail length, |
| TED-Seq | 3 | 100 ng poly(A)+ RNA | Poly(A) tail length, |
| 3P-seq | Poly(A) tail removed by RNase H. Sequenced from the 3 | 30 µg total RNA | Poly(A) site |
| 2P-seq | Poly(A) site detection by anchored oligo(dT) priming, sequencing from start of poly(A) tail in reverse | 15 µg total RNA | Poly(A) site |
| 3 | Poly(A) site detection by anchored oligo(dT) priming. Unique approach to fragmentation by rate limited nick translation of double stranded cDNA | 2 µg DNase treated RNA | Poly(A) site |
| 3 | Poly(A) tail is trimmed by RNase H, 3 | 0.1–15 µg total RNA | Poly(A) site |
| 3PC | Anchored oligo(dT) primer to detect poly(A) site, 5 | 100 µg total RNA | Poly(A) site |
| 3 | Anchored oligo(dT) primer to detect poly(A) site, sequenced from 3 | 0.5–10 µg total RNA | Poly(A) site |
| SAPAS | Anchored oligo(dT) primer to detect poly(A) site, 5 | 10 µg total RNA | Poly(A) site |
| PAS-seq | Anchored oligo(dT) primer to detect poly(A) site, template switching 5 | 0.5–1 µg poly(A)+ selected RNA | Poly(A) site |
| IVT-SAPAS | | 200 ng total RNA | Poly(A) site |
| PAPERCLIP | RNA crosslinked, partially digested, and 3 | NA, starting material is tissue/cells | Poly(A) site |
| MACE | GenXPro commercial kit, barcodes transcripts with UMIs to deal with PCR duplication | 0.05 ng total RNA | Poly(A) site |
| Quant-Seq | Lexogen commercial kit, oligo(dT) annealing to detect 3 | 0.5–500 ng total RNA | Poly(A) site |
| MAPS | 3 | 1 µg total RNA | Poly(A) site |
| TM3 | Fragmentation and 5 | 200 ng total RNA | Poly(A) site |
| PAC-seq | Click-chemistry approach to fragmentation and 5 | 0.125–4 µg total RNA | Poly(A) site |
| EnD-Seq | Targeted sequencing approach to 3 | 1.5 µg total RNA | Poly(A) site, |
Single cell RNA-sequencing approaches suitable for APA detection and characterization.
| Name | Overview | Scale |
|---|---|---|
| CEL-seq | 3 | Manually isolated single cells |
| CEL-seq2 | Application of CEL-seq to high throughput sequencing, UMIs added to reverse transcription oligo | Automated microfluidic sorting via Fluidigm C1 into wells |
| MARS-seq 2.0 | 3 | 384-well plate, FACS sorting |
| InDrop | Application of CEL-seq to droplet-based sequencing for higher throughput | Droplet sequencing, inDrop system, 1CellBio |
| Drop-seq | 3 | Droplet sequencing, custom instrument |
| 10X Chromium | 3 | Droplet sequencing, 10X Genomics instrument |
| SCRB-seq | 3 | 384-well plate, FACS sorting |
| MAPS-seq | 3 | 96-well plate, FACS sorting |
| BATSeq | Method specifically developed to detect APA. 3 | FACS sorting |
Figure 3. Detection of poly(A) sites: (A) Two polyadenylation sites, proximal and distal, result in expression of two isoforms. (B–D) Methods to determine the location of poly(A) sites: (B) de novo method to identify change-points in read-coverage of RNA-seq data. (C) de novo method to identify poly(A) peaks in 3′ focused RNA-seq data. (D) Combining read-coverage data with poly(A) site coordinates from APA databases.
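The change-point idea in panel (B) can be illustrated with a minimal sketch. It assumes per-base read coverage along a single 3′ UTR is already available as a list of numbers, and it looks for the one position where splitting the coverage into two flat segments leaves the least squared error. The function name `find_change_point`, the `min_segment` parameter and the toy coverage values are all hypothetical; this is not the algorithm of any specific published tool, which would typically add normalisation, replicates and statistical testing.

```python
from typing import Sequence, Tuple


def find_change_point(coverage: Sequence[float], min_segment: int = 1) -> Tuple[int, float]:
    """Return (split_index, cost) for the best two-segment piecewise-constant fit.

    split_index is the first position of the second (downstream) segment.
    cost is the summed squared error of both segments around their own means.
    """
    n = len(coverage)
    if min_segment < 1 or n < 2 * min_segment:
        raise ValueError("coverage is too short for the requested segment size")

    # Prefix sums let us compute each segment's squared error in O(1).
    prefix = [0.0]
    prefix_sq = [0.0]
    for value in coverage:
        prefix.append(prefix[-1] + value)
        prefix_sq.append(prefix_sq[-1] + value * value)

    def squared_error(lo: int, hi: int) -> float:
        # Squared error of coverage[lo:hi] around its mean.
        count = hi - lo
        total = prefix[hi] - prefix[lo]
        total_sq = prefix_sq[hi] - prefix_sq[lo]
        return total_sq - total * total / count

    best_index, best_cost = -1, float("inf")
    for split in range(min_segment, n - min_segment + 1):
        cost = squared_error(0, split) + squared_error(split, n)
        if cost < best_cost:
            best_index, best_cost = split, cost
    return best_index, best_cost


# Toy coverage (hypothetical values): high upstream of a proximal site, lower downstream.
toy_coverage = [30, 31, 29, 30, 32, 10, 9, 11, 10, 10]
split_index, cost = find_change_point(toy_coverage)
print(split_index)  # position where coverage drops, a candidate proximal poly(A) site
```

A drop in coverage at the reported index is only a candidate site; in the scheme of panel (D) it would be cross-checked against annotated poly(A) coordinates from a database.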
Bioinformatic databases for 3′ UTR and APA storage and retrieval.
| Database | Primary Data Collection | Organism | Last Updated | URL |
|---|---|---|---|---|
| UTRdb | 5 | human, rodent, vertebrate, plant and fungi | 2010 | |
| PACdb | cDNA/ESTs | human, mouse, rat, dog, chicken, zebrafish, | inaccessible | |
| PolyA_DB | aligned cDNA/ESTs | human, mouse, rat, chicken and zebrafish | 2018 | |
| GENCODE Poly(A) site track | cDNA/ESTs | human | 2021 | |
| APADB | MACE-Seq | human, mouse and chicken | 2014 | |
| APASdb | SAPAS | human (22 normal and cancer tissues), mouse, | inaccessible | |
| TC3A | RNA-seq in TCGA | 32 human cancer types | inaccessible | |
| APAatlas | RNA-seq in GTEx project | >50 human normal tissues | 2020 | |
| PolyASite | 3 | human, mouse and worm | 2020 | |
APA genes as potential cancer biomarkers.
| Cancer | Gene Markers | Signature APA | Physiological Effects | Molecular Role |
|---|---|---|---|---|
| Breast | PRELID1 | Shortening of 3 | increased protein expression | mitochondrial ROS signalling |
| Breast | SNX3, YME1L1D, USP9X | Shortening of 3 | increased protein levels in short isoform | EGF signalling |
| adult T-cell lymphoma, | PD-L1 gene (CD274) | Shortening of 3 | PD-1/PD-L1-mediated immune escape in cancer development; | T-cell modulator; |
| Colorectal cancer | IGF2BP1/IMP-1 | Shortening of 3 | increased protein levels; | Modulates pathogenesis |
| TNBC, | N4BP2L2, WDHD1, ZER1, | Shortening of 3 | unfavourable prognosis | All are related to cancer development: |
| TNBC | PPIC, ZCCHC14, RTN1, | Lengthening of 3 | poor prognosis; | TGF-β pathway; |
| TNBC (MB-231) | Caspase 6, DFFA (ICAD), | Lengthening of 3 | escape of apoptosis | Caspase pathway |
| TNBC (MB-231) | cyclin D1, D2 | Shortening of 3 | promote cell cycling | Mitotic cell cycle; |
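The "Signature APA" column above records whether a gene shifts towards the proximal (shortening) or distal (lengthening) poly(A) site. As a rough sketch of how such a call could be derived from read counts, the code below compares the fraction of reads at the proximal site between two conditions. The function names, the example counts and the 0.1 cut-off are illustrative assumptions, not values taken from the review; the tools it surveys rank such shifts statistically, usually across replicates, rather than applying a fixed threshold.

```python
from typing import Tuple


def proximal_usage(proximal_reads: int, distal_reads: int) -> float:
    """Fraction of reads assigned to the proximal poly(A) site."""
    total = proximal_reads + distal_reads
    if total == 0:
        raise ValueError("no reads at either poly(A) site")
    return proximal_reads / total


def classify_shift(
    normal: Tuple[int, int],
    tumour: Tuple[int, int],
    min_delta: float = 0.1,  # hypothetical cut-off, chosen only for illustration
) -> str:
    """Label the change in proximal site usage from normal to tumour.

    Each argument is a (proximal_reads, distal_reads) pair for one gene.
    """
    delta = proximal_usage(*tumour) - proximal_usage(*normal)
    if delta >= min_delta:
        return "shortening"   # more proximal site usage in tumour
    if delta <= -min_delta:
        return "lengthening"  # more distal site usage in tumour
    return "no change"


# Hypothetical counts for one gene: (proximal, distal) reads per condition.
print(classify_shift(normal=(20, 80), tumour=(60, 40)))
```

With these made-up counts, proximal usage rises from 0.2 to 0.6, so the gene would be labelled as shortening.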