MOTIVATION: Most RNA-seq data analysis software packages are not designed to handle the complexities involved in properly apportioning short sequencing reads to highly repetitive regions of the genome. These regions are often occupied by transposable elements (TEs), which make up between 20 and 80% of eukaryotic genomes. They can contribute a substantial portion of transcriptomic and genomic sequence reads, but are typically ignored in most analyses.
RESULTS: Here, we present a method and software package for including both gene- and TE-associated ambiguously mapped reads in differential expression analysis. Our method shows improved recovery of TE transcripts over other published expression analysis methods, in both synthetic data and qPCR/NanoString-validated published datasets.
AVAILABILITY AND IMPLEMENTATION: The source code, associated GTF files for TE annotation, and testing data are freely available at http://hammelllab.labsites.cshl.edu/software.
CONTACT: mhammell@cshl.edu.
SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.