Osman Doluca. Department of Biomedical Engineering, Izmir University of Economics, Izmir, Turkey. Electronic address: osman.doluca@ieu.edu.tr.
Abstract
MOTIVATION: In vivo discovery of G-quadruplex-forming sequences would provide the most relevant G-quadruplexes along a genomic DNA or an RNA molecule; however, it is difficult to perform due to the small size of G-quadruplexes, the existence of different topologies, and the additional influence of environmental factors and ligands present during experimentation. In vitro discovery, on the other hand, not only fails to simulate in vivo conditions but is also impractical for large sequences due to limited resources. The immediate solution continues to be computational prediction, although its results do not always agree with experimental findings. This disagreement is often due to features that are not conventionally accepted for G-quadruplexes, such as disrupted G-tracts or extremely long loops. RESULTS: Here, we propose a novel tool for the discovery of putative G-quadruplexes with better accuracy through consideration of the features of previously missed G-quadruplex-forming sequences. In a comparison against a set of experimentally confirmed sequences, a sensitivity as high as 99% and a Youden's J statistic as high as 0.91 are achieved, an improvement over other computational approaches. More importantly, we show that allowing a single atypical G-tract, which includes a mismatched or bulging non-guanine nucleotide, and a single loop of extreme size benefits the overall prediction. AVAILABILITY AND IMPLEMENTATION: The Python code may be found at http://github.com/odoluca/G4Catchall and the web application at http://homes.ieu.edu.tr/odoluca/G4Catchall.
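As a rough illustration of the two ideas in the abstract, the sketch below shows a conventional regular-expression search for putative G-quadruplexes and the Youden's J statistic used to score predictions. It is not taken from G4Catchall: the pattern (four G-tracts of at least three guanines, loops of 1 to 7 nucleotides), the function names `has_putative_g4` and `youden_j`, and the example sequences and counts are all assumptions made for illustration. The second sequence has one G-tract interrupted by a non-guanine nucleotide, the kind of feature the abstract says conventional patterns miss.

```python
import re

# Conventional pattern (assumed for illustration, NOT G4Catchall's expression):
# four G-tracts of 3+ guanines separated by three loops of 1-7 nucleotides.
CANONICAL_G4 = re.compile(r"G{3,}(?:[ACGT]{1,7}?G{3,}){3}")


def has_putative_g4(sequence: str) -> bool:
    """Return True if the conventional pattern matches anywhere in the sequence."""
    return CANONICAL_G4.search(sequence.upper()) is not None


def youden_j(tp: int, fn: int, tn: int, fp: int) -> float:
    """Youden's J = sensitivity + specificity - 1."""
    sensitivity = tp / (tp + fn)  # fraction of true G4 sequences detected
    specificity = tn / (tn + fp)  # fraction of non-G4 sequences rejected
    return sensitivity + specificity - 1.0


# Hypothetical inputs: four intact G-tracts vs. one tract disrupted (GAG).
intact = "GGGTTAGGGTTAGGGTTAGGG"
disrupted = "GGGTTAGGGTTAGAGTTAGGG"

print(has_putative_g4(intact))     # the conventional pattern matches
print(has_putative_g4(disrupted))  # the conventional pattern does not match

# Made-up confusion-matrix counts, unrelated to the paper's data set.
print(youden_j(tp=90, fn=10, tn=80, fp=20))
```

A strict pattern like this one rejects the disrupted sequence outright. The abstract's proposal is to tolerate exactly one such atypical G-tract and one extreme-length loop, which raises sensitivity; Youden's J then indicates whether that gain outweighs any loss in specificity.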