Federico Ansaloni1,2, Nicolò Gualandi1, Mauro Esposito1, Stefano Gustincich2, Remo Sanges1,2.
Abstract
SUMMARY: Transposable Elements (TEs) play key roles in crucial biological pathways. Therefore, several tools enabling the quantification of their expression were recently developed. However, many of the existing tools cannot distinguish between the transcription of autonomously expressed TEs and that of TE fragments embedded in canonical coding/non-coding non-TE transcripts. Consequently, an apparent change in the expression of a given TE may simply reflect variation in the expression of the transcripts containing TE-derived sequences. To overcome this issue, we have developed TEspeX, a pipeline for the quantification of TE expression at the consensus level. TEspeX uses Illumina RNA-seq short reads to quantify TE expression while avoiding counting reads that derive from inactive TE fragments embedded in canonical transcripts.
Year: 2022 PMID: 35876845 PMCID: PMC9477521 DOI: 10.1093/bioinformatics/btac526
Source DB: PubMed Journal: Bioinformatics ISSN: 1367-4803 Impact factor: 6.931
Fig. 1. (A) Pipeline workflow. The reference transcriptome is generated by concatenating TE consensus sequences (1), coding transcripts (2) and non-coding transcripts (3). RNA-seq reads (4) are mapped to the reference transcriptome using STAR. Only best-scoring alignments are selected, and all reads mapping to any annotated non-TE transcript are discarded. The selected reads are finally counted. The yellow- and orange-squared RNA-seq read pairs are two examples of how TEspeX works. Both pairs align to a locus shared between non-TE and TE transcripts. For the orange-squared pair, the best alignment is to TE sequences, so the pair is considered TE specific; the yellow-squared pair maps with the best alignment score to both non-TE and TE transcripts and is consequently discarded from the counting. (B) Quantification of TE expression with SalmonTE, SQuiRE, TEtranscripts, L1EM and TEspeX on synthetic RNA-seq reads generated from coding and non-coding transcripts. The y-axis reports the mean expression of all the analysed TEs. (C) Correlation between TE expression values calculated by Krug and colleagues (y-axis) and by TEspeX (x-axis). (D) Correlation between TE expression values calculated by Jönsson and colleagues (y-axis) and by TEspeX (x-axis). In both C and D, expression levels are reported as the mean expression across all the samples of each dataset.
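
The read-selection rule described in panel A can be illustrated with a minimal Python sketch. It is not the TEspeX implementation: the `Alignment` record, the `count_te_specific_reads` function and the reference names are hypothetical, and alignments are assumed to have already been parsed into (read, reference, score, TE flag) tuples. How a read whose best alignments hit several TE consensus sequences should be counted is not stated in the text above; the sketch simply counts it once per TE, which is an assumption.

```python
from collections import Counter, defaultdict
from typing import Iterable, NamedTuple


class Alignment(NamedTuple):
    """Hypothetical, already-parsed alignment record (not a TEspeX type)."""
    read_id: str    # identifier of the read (or read pair)
    reference: str  # name of the reference sequence that was hit
    score: int      # alignment score, higher is better
    is_te: bool     # True if the reference is a TE consensus sequence


def count_te_specific_reads(alignments: Iterable[Alignment]) -> Counter:
    """Count reads whose best-scoring alignments hit TE sequences only."""
    by_read = defaultdict(list)
    for aln in alignments:
        by_read[aln.read_id].append(aln)

    counts = Counter()
    for read_alignments in by_read.values():
        # Keep only the best-scoring alignments of this read.
        best_score = max(a.score for a in read_alignments)
        best = [a for a in read_alignments if a.score == best_score]

        # Discard the read if any best alignment hits a non-TE transcript.
        if any(not a.is_te for a in best):
            continue

        # Assumption: count the read once for each distinct TE it hits.
        for te_name in {a.reference for a in best}:
            counts[te_name] += 1
    return counts


# Toy data mirroring the two read pairs in panel A (names are made up).
example = [
    Alignment("orange", "L1_consensus", 98, True),
    Alignment("orange", "geneA_mRNA", 90, False),
    Alignment("yellow", "L1_consensus", 98, True),
    Alignment("yellow", "geneB_mRNA", 98, False),
]
print(count_te_specific_reads(example))  # Counter({'L1_consensus': 1})
```

In this toy input the orange pair is kept because its best alignment is to the TE consensus alone, while the yellow pair is dropped because a non-TE transcript ties for the best score.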