SLIDR and SLOPPR: flexible identification of spliced leader trans-splicing and prediction of eukaryotic operons from RNA-Seq data.

Marius A Wenzel, Berndt Müller, Jonathan Pettitt.

Abstract

BACKGROUND: Spliced leader (SL) trans-splicing replaces the 5' end of pre-mRNAs with the spliced leader, an exon derived from a specialised non-coding RNA originating from elsewhere in the genome. This process is essential for resolving polycistronic pre-mRNAs produced by eukaryotic operons into monocistronic transcripts. SL trans-splicing and operons may have independently evolved multiple times throughout Eukarya, yet our understanding of these phenomena is limited to only a few well-characterised organisms, most notably C. elegans and trypanosomes. The primary barrier to systematic discovery and characterisation of SL trans-splicing and operons is the lack of computational tools for exploiting the surge of transcriptomic and genomic resources for a wide range of eukaryotes.
RESULTS: Here we present two novel pipelines that automate the discovery of SLs and the prediction of operons in eukaryotic genomes from RNA-Seq data. SLIDR assembles putative SLs from 5' read tails present after read alignment to a reference genome or transcriptome, which are then verified by interrogating corresponding SL RNA genes for sequence motifs expected in bona fide SL RNA molecules. SLOPPR identifies RNA-Seq reads that contain a given 5' SL sequence, quantifies genome-wide SL trans-splicing events and predicts operons via distinct patterns of SL trans-splicing events across adjacent genes. We tested both pipelines with organisms known to carry out SL trans-splicing and organise their genes into operons, and demonstrate that (1) SLIDR correctly detects expected SLs and often discovers novel SL variants; (2) SLOPPR correctly identifies functionally specialised SLs, correctly predicts known operons and detects plausible novel operons.
CONCLUSIONS: SLIDR and SLOPPR are flexible tools that will accelerate research into the evolutionary dynamics of SL trans-splicing and operons throughout Eukarya and improve gene discovery and annotation for a wide range of eukaryotic genomes. Both pipelines are implemented in Bash and R and are built upon readily available software commonly installed on most bioinformatics servers. Biological insight can be gleaned even from sparse, low-coverage datasets, implying that an untapped wealth of information can be retrieved from existing RNA-Seq datasets as well as from novel full-isoform sequencing protocols as they become more widely available.
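The two ideas described in RESULTS for SLOPPR (finding reads that carry a 5' SL sequence, then predicting operons from SL patterns across adjacent genes) can be illustrated with a minimal Python sketch. This is not the published implementation, which is written in Bash and R. The function names, the exact-match rule, the thresholds and the example SL sequence are all assumptions made for illustration. The operon rule assumes an SL1/SL2-style system in which a specialised SL (here "SL2") marks downstream operonic genes, as in C. elegans.

```python
from typing import List, Tuple


def sl_tail_length(read: str, sl: str, min_len: int = 8) -> int:
    """Length of the longest 3' portion of the SL found at the read's 5' end.

    Reads often carry only the 3' end of the SL, so suffixes of `sl` are
    compared against the start of the read. Exact matching only (assumption);
    returns 0 if no suffix of at least `min_len` bases matches.
    """
    for k in range(min(len(sl), len(read)), min_len - 1, -1):
        if read.startswith(sl[-k:]):
            return k
    return 0


def predict_operons(
    genes: List[Tuple[str, int, int]], min_sl2_fraction: float = 0.5
) -> List[List[str]]:
    """Group adjacent genes into candidate operons.

    `genes` holds (gene_id, sl1_reads, sl2_reads) tuples, assumed to be
    already sorted in transcription order on a single strand of one contig.
    A gene counts as "downstream" if more than `min_sl2_fraction` of its SL
    reads are SL2. An operon is an upstream gene followed by one or more
    consecutive downstream genes (hypothetical rule for this sketch).
    """
    operons: List[List[str]] = []
    current: List[str] = []
    for gene_id, sl1, sl2 in genes:
        total = sl1 + sl2
        is_downstream = total > 0 and sl2 / total > min_sl2_fraction
        if is_downstream and current:
            current.append(gene_id)
        else:
            if len(current) >= 2:
                operons.append(current)
            current = [gene_id]
    if len(current) >= 2:
        operons.append(current)
    return operons


# Example SL sequence and reads, used only for illustration.
SL = "GGTTTAATTACCCAAGTTTGAG"
spliced_read = "CCAAGTTTGAG" + "ATGGCTAAAC"   # 11-base SL tail, then transcript
unspliced_read = "ATGGCTAAACGGT"

tail = sl_tail_length(spliced_read, SL)
trimmed = spliced_read[tail:]                  # read with the SL tail removed

# Hypothetical per-gene SL read counts: (gene_id, SL1 reads, SL2 reads).
counts = [("a", 10, 0), ("b", 1, 9), ("c", 0, 5), ("d", 7, 0), ("e", 0, 0)]
operons = predict_operons(counts)
```

A real analysis would need to tolerate sequencing errors in the SL tail, handle both strands and intergenic distances, and work from aligned reads and a genome annotation rather than from raw strings and hand-written counts.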

Keywords:  5′ UTR; Chimeric reads; Eukaryotic operons; Genome annotation; Polycistronic RNA processing; RNA-seq; Spliced-leader trans-splicing

Year: 2021; PMID: 33752599; PMCID: PMC7986045; DOI: 10.1186/s12859-021-04009-7

Source DB: PubMed; Journal: BMC Bioinformatics; ISSN: 1471-2105; Impact factor: 3.169

