
Ambiguous splice sites distinguish circRNA and linear splicing in the human genome.

Roozbeh Dehghannasiri, Linda Szabo, Julia Salzman.

Abstract

MOTIVATION: Identification of splice sites is critical to gene annotation and to determining which sequences control circRNA biogenesis. Full-length RNA transcripts could in principle complete annotations of introns and exons in genomes without external ontologies, i.e., ab initio. However, whether it is possible to reconstruct the genomic positions where splicing occurs from full-length transcripts, even if sampled in the absence of noise, depends on the genome sequence composition. If it is not possible, there exist provable limits on the use of RNA-Seq to define splice locations (linear or circular) in the genome.
RESULTS: We provide a formal definition of splice site ambiguity due to the genomic sequence by introducing the equivalent junction: the set of local genomic positions that result in the same RNA sequence when joined through RNA splicing. We show that equivalent junctions are prevalent in diverse eukaryotic genomes and occur in 88.64% and 78.64% of annotated human splice sites in linear and circRNA junctions, respectively. The observed fractions of equivalent junctions and the frequencies of many individual motifs are statistically significant when compared against null distributions computed via simulation or in closed form. The frequency of equivalent junctions establishes a fundamental limit on the possibility of ab initio reconstruction of RNA transcripts without appealing to the ontology of "GT-AG" boundaries defining introns. Said differently, completely ab initio reconstruction is impossible at the vast majority of splice sites in annotated circRNAs and linear transcripts.
AVAILABILITY AND IMPLEMENTATION: Two Python scripts generating an equivalent junction sequence per junction are available at: https://github.com/salzmanlab/Equivalent-Junctions.
SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
© The Author(s) 2018. Published by Oxford University Press. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com.
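
The definition of an equivalent junction given in RESULTS can be illustrated with a minimal Python sketch. It is not the authors' implementation from the repository above. The function names, the 0-based coordinate convention (donor_end is the index just past the last upstream base kept, acceptor_start is the index of the first downstream base kept), and the toy sequence are all assumptions made for illustration. The idea is that shifting both breakpoints by the same offset leaves the spliced RNA unchanged exactly when the bases crossed at the two breakpoints are identical.

```python
def equivalent_junction_shifts(genome, donor_end, acceptor_start):
    """Return (left, right): how far both breakpoints can be shifted together
    upstream (left) or downstream (right) without changing the joined sequence.

    Hypothetical convention: the junction joins genome[:donor_end] to
    genome[acceptor_start:]. The comparison only looks at bases next to the
    two breakpoints, so it also applies to a back-splice (donor_end greater
    than acceptor_start), where the local junction sequence is what matters.
    """
    if donor_end == acceptor_start:
        raise ValueError("breakpoints coincide; nothing is spliced out")
    n = len(genome)

    # Shift upstream while the bases just before both breakpoints agree.
    left = 0
    while (donor_end - 1 - left >= 0 and acceptor_start - 1 - left >= 0
           and genome[donor_end - 1 - left] == genome[acceptor_start - 1 - left]):
        left += 1

    # Shift downstream while the bases just after both breakpoints agree.
    right = 0
    while (donor_end + right < n and acceptor_start + right < n
           and genome[donor_end + right] == genome[acceptor_start + right]):
        right += 1

    return left, right


def spliced(genome, donor_end, acceptor_start):
    """Linear splice product for a junction under the convention above."""
    return genome[:donor_end] + genome[acceptor_start:]


# Toy sequence (made up): upstream exon, a GT...AG intron, downstream exon.
genome = "TTCCAG" + "GTAAGTTTTCAG" + "GTCCAA"
left, right = equivalent_junction_shifts(genome, 6, 18)

# Every shift in [-left, right] is a member of the same equivalent junction.
products = {spliced(genome, 6 + k, 18 + k) for k in range(-left, right + 1)}
print(left, right, products)
```

In this toy case several breakpoint pairs yield the same spliced sequence, so the RNA alone cannot say which pair was used; choosing the pair flanked by GT and AG is the kind of external annotation rule the abstract refers to.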

Year:  2019        PMID: 30192918      PMCID: PMC6477988          DOI: 10.1093/bioinformatics/bty785

Source DB:  PubMed          Journal:  Bioinformatics        ISSN: 1367-4803            Impact factor:   6.937


