Håkon Tjeldnes (1), Kornel Labun (1), Yamila Torres Cleuren (1,2), Katarzyna Chyżyńska (1), Michał Świrski (3), Eivind Valen (4,5)
1. Computational Biology Unit, Department of Informatics, University of Bergen, Bergen, Norway.
2. Sars International Centre for Marine Molecular Biology, University of Bergen, Bergen, Norway.
3. Institute of Genetics and Biotechnology, Faculty of Biology, University of Warsaw, Warsaw, Poland.
4. Computational Biology Unit, Department of Informatics, University of Bergen, Bergen, Norway. eivind.valen@gmail.com
5. Sars International Centre for Marine Molecular Biology, University of Bergen, Bergen, Norway. eivind.valen@gmail.com
Abstract
BACKGROUND: With the rapid growth in the use of high-throughput methods for characterizing translation and the continued expansion of multi-omics, there is a need for back-end functions and streamlined tools for processing, analyzing, and characterizing the data produced by these assays.

RESULTS: Here, we introduce ORFik, a user-friendly R/Bioconductor API and toolbox for studying translation and its regulation. It extends GenomicRanges from the genome to the transcriptome and implements a framework that integrates data from several sources. ORFik streamlines the processing, analysis, and visualization of the different steps of translation, with a particular focus on initiation and elongation. It accepts high-throughput sequencing data from ribosome profiling to quantify ribosome elongation, or from RCP-seq/TCP-seq to also quantify ribosome scanning. In addition, ORFik can use CAGE data to accurately determine 5' UTRs and RNA-seq data to determine translation relative to RNA abundance. ORFik calculates over 30 different translation-related features and metrics from the literature and can annotate translated regions such as proteins or upstream open reading frames (uORFs). As a use case, we demonstrate using ORFik to rapidly annotate the dynamics of 5' UTRs across different tissues, detect their uORFs, and characterize their scanning and translation in the downstream protein-coding regions.

CONCLUSION: In summary, ORFik introduces hundreds of tested, documented, and optimized methods. ORFik is designed to be easily customizable, enabling users to create complete workflows from raw data to publication-ready figures for several types of sequencing data. Finally, by improving the speed and scope of many core Bioconductor functions, ORFik offers enhancements that benefit the entire Bioconductor environment.

AVAILABILITY: http://bioconductor.org/packages/ORFik
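To make the uORF annotation step mentioned in the abstract concrete, the following is a minimal sketch in Python of how an upstream ORF could be located in a 5' UTR sequence. It is not ORFik's API or algorithm: the function name `find_uorfs`, its parameters, and the restriction to ORFs that start at ATG and terminate inside the leader are all assumptions made for illustration. A real annotation would also have to handle uORFs that run into the coding sequence, alternative start codons, and transcript coordinates.

```python
# Hypothetical illustration only; this is not how ORFik is implemented.
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_uorfs(leader, start_codon="ATG", min_codons=0):
    """Return 0-based, half-open (start, end) intervals of ORFs that
    begin with `start_codon` and reach an in-frame stop codon inside
    the leader (5' UTR) sequence. The stop codon is included in the
    interval. `min_codons` is the minimum number of codons between
    the start and the stop codon.
    """
    leader = leader.upper()
    orfs = []
    for start in range(len(leader) - 2):
        if leader[start:start + 3] != start_codon:
            continue
        # Walk codon by codon, in frame with the start codon.
        for pos in range(start + 3, len(leader) - 2, 3):
            if leader[pos:pos + 3] in STOP_CODONS:
                n_codons = (pos - start) // 3 - 1
                if n_codons >= min_codons:
                    orfs.append((start, pos + 3))
                break  # the first in-frame stop ends this ORF
    return orfs


# Example: one start codon at position 2, in-frame stop at position 8.
example_uorfs = find_uorfs("GGATGAAATAGCC")
```

In this toy leader the only candidate starts at position 2 and ends after the TAG codon, so a single interval is reported; a leader with a start codon but no in-frame stop yields an empty list under the assumption stated above.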