Dominika Labudová (1), Jiří Hon (1), Matej Lexa (2, 3)
1. IT4Innovations Centre of Excellence, Faculty of Information Technology, Brno University of Technology, Brno 61266, Czech Republic.
2. Department of Machine Learning and Data Processing, Faculty of Informatics, Masaryk University, Brno 60200, Czech Republic.
3. Department of Plant Developmental Genetics, Institute of Biophysics of the Czech Academy of Sciences, Brno 61200, Czech Republic.
Abstract
MOTIVATION: A G-quadruplex is a DNA or RNA structure in which four guanine-rich regions are held together by base pairing between guanine nucleotides in coordination with potassium ions. G-quadruplexes are increasingly seen as a biologically important component of genomes. Their detection in vivo is problematic; however, sequencing and spectrometric techniques exist for their in vitro detection. We previously devised the pqsfinder algorithm for identifying potential quadruplex-forming sequences (PQS), implemented it in C++ and published it as an R/Bioconductor package. We looked for ways to make pqsfinder faster and more user-friendly for sequence analysis.
RESULTS: We identified two weak points where pqsfinder could be optimized. We modified the internals of the recursive algorithm to avoid matching and scoring many sub-optimal PQS conformations that are later discarded. To accommodate the needs of a broader range of users, we created a website for submitting sequence analysis jobs, so that using pqsfinder does not require knowledge of R.
AVAILABILITY AND IMPLEMENTATION: https://pqsfinder.fi.muni.cz, https://bioconductor.org/packages/pqsfinder.
SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
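To make the notion of "four guanine-rich regions" concrete, the following minimal Python sketch scans a sequence with the widely used textbook folding rule: four runs of at least three guanines separated by loops of 1 to 7 nucleotides. This is an illustrative assumption and not the pqsfinder algorithm, which scores candidate conformations and tolerates imperfections; the names `PQS_PATTERN` and `find_pqs` are hypothetical.

```python
import re

# Textbook heuristic, NOT pqsfinder's scoring algorithm:
# four G-runs (>= 3 guanines each) separated by three loops of 1-7 nucleotides.
PQS_PATTERN = re.compile(r"G{3,}(?:[ACGT]{1,7}?G{3,}){3}")


def find_pqs(seq):
    """Return (start, end, matched_sequence) for each non-overlapping hit.

    Coordinates are 0-based and end-exclusive, as in Python slicing.
    Only the given strand is scanned; the reverse complement is ignored.
    """
    seq = seq.upper()
    return [(m.start(), m.end(), m.group()) for m in PQS_PATTERN.finditer(seq)]


# Human telomeric repeat as a small test case.
hits = find_pqs("TTAGGGTTAGGGTTAGGGTTAGGGTT")
print(hits)  # [(3, 24, 'GGGTTAGGGTTAGGGTTAGGG')]
```

A regular expression like this reports only one conformation per region and cannot rank overlapping candidates, which is the kind of limitation that scoring-based tools are designed to address.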