G4Hunter web application: a web server for G-quadruplex prediction.

Václav Brázda1, Jan Kolomazník2, Jiří Lýsek2, Martin Bartas1,3, Miroslav Fojta1, Jiří Šťastný2,4, Jean-Louis Mergny1,5.   

Abstract

MOTIVATION: Expanding research highlights the importance of guanine quadruplex structures. Therefore, easily accessible tools for quadruplex analyses in DNA and RNA molecules are important for the scientific community.
RESULTS: We developed a web version of the G4Hunter application. This new web-based server is a platform-independent and user-friendly application for quadruplex analyses. It allows retrieval of gene/nucleotide sequence entries from NCBI databases and provides complete characterization of localization and quadruplex propensity of quadruplex-forming sequences. The G4Hunter web application includes an interactive graphical data representation with many useful options including visualization, sorting, data storage and export.
AVAILABILITY AND IMPLEMENTATION: G4Hunter web application can be accessed at http://bioinformatics.ibp.cz.
SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
© The Author(s) 2019. Published by Oxford University Press.

Year:  2019        PMID: 30721922      PMCID: PMC6748775          DOI: 10.1093/bioinformatics/btz087

Source DB:  PubMed          Journal:  Bioinformatics        ISSN: 1367-4803            Impact factor:   6.937


1 Introduction

Important roles are emerging for local DNA structures in the regulation of basic biological processes (Brázda ; Todolli ). Sequences that can form G-quadruplex (G4) structures are widespread in a variety of genomes (Huppert and Balasubramanian, 2005; Murat and Balasubramanian, 2014; Todd ). G4 structures often exhibit high structural stability under physiological conditions and are important in telomere biology (De Cian ; Zimmermann ), protein recognition (Brázda and Coufal, 2017; Brázda , 2014; Vasilyev ), transcription (Siddiqui-Jain ), translation and RNA maturation (Wieland and Hartig, 2007), replication and genomic stability (Cheung ; Paeschke ) and replication origin definition (Comoglio ; Valton ). G4 sequences can be predicted in silico and several tools have been developed (Garant ; Hon ; Huppert and Balasubramanian, 2005; Kikin ; Wong ). A newly developed and validated algorithm, G4Hunter, overcomes several limitations of previous algorithms (Bedrat ). G4Hunter is a powerful and widely used tool for G4 prediction which takes into account G-richness and G-skewness of a DNA or RNA sequence and provides a quadruplex propensity score. However, three major limitations of G4Hunter have become evident: (i) the current implementation requires advanced computational expertise and software packages including Python v. 2.7 and the libraries NumPy, Matplotlib and Biopython; (ii) the software runs in the command line only; and (iii) results are exported in two files in text format without any additional tools for direct postprocessing or visualization. Here, we present the ‘G4Hunter web application’, a new and optimized algorithm that expands on the original G4Hunter Python code and is embedded in a platform-independent and easy-to-use web-based graphical interface, freely available at http://bioinformatics.ibp.cz.

2 Features

Our implementation is based on the algorithm published on GitHub at https://github.com/AnimaTardeb/G4-Hunter (Bedrat ). We adapted it to our server-based platform, originally developed for Palindrome analysis (Brázda ). For this application, we completely rewrote the back-end and the front-end of our web server to broaden and optimize computational and visualization performance.

2.1 Workflow and implementation

The original G4Hunter requires the user to install Python and the Python libraries Biopython, Matplotlib and NumPy, runs only in the console and stores results in a text file. The basic input parameters and results of our new application are the same: the user specifies the input sequence(s), window size and G4Hunter score threshold, and the application outputs the sequences and scores identified. Compared to the original Python program, our web-based application has a user-friendly interface and a rich graphical presentation of results. It also shows a heatmap of quadruplex-forming sequences and provides statistical information. Another advantage is the ability to analyze multiple sequences at once. Results can also be exported as a text file for further processing.

The G4Hunter web application runs on a central web server, and clients/visitors execute analyses using server resources; sequences and results are stored in a relational database. A compatible web browser with JavaScript support is required. Although user requests are queued and processed one at a time, the application benefits from multi-core processor architectures because the analysis itself is performed by a Java execution framework and the algorithm is separated into four independent steps (determining the score for each nucleotide, precalculating sums, calculating scores and aggregation) that are computed in parallel. Our application is implemented as a Single-Page-Application with a REST API back-end. This configuration allows us to use the back-end as a standalone computational server employed by custom scripts for specific analyses. The front-end client application is built with the Vue.js framework (https://vuejs.org/). The back-end is a Java application based on the Spring framework (https://spring.io/). The server currently runs on 4 threads.
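The four steps named above can be illustrated with a short, single-threaded Python sketch. It is not the server's Java code: the function names are hypothetical, the per-base scoring follows the commonly described G4Hunter rule (each G in a run of n Gs scores min(n, 4), each C the negative equivalent, other bases 0), and the aggregation shown here simply merges overlapping or touching windows without any further refinement of region boundaries.

```python
from itertools import groupby


def base_scores(seq):
    """Step 1: score each nucleotide by the length of its G or C run (capped at 4)."""
    scores = []
    for base, run in groupby(seq.upper()):
        n = len(list(run))
        if base == "G":
            value = min(n, 4)
        elif base == "C":
            value = -min(n, 4)
        else:
            value = 0
        scores.extend([value] * n)
    return scores


def prefix_sums(scores):
    """Step 2: prefix[i] is the sum of scores[:i], so any window sum is one subtraction."""
    prefix = [0]
    for s in scores:
        prefix.append(prefix[-1] + s)
    return prefix


def window_scores(prefix, window):
    """Step 3: mean score of every window of the given size."""
    return [(prefix[i + window] - prefix[i]) / window
            for i in range(len(prefix) - window)]


def aggregate(prefix, wscores, window, threshold):
    """Step 4 (simplified assumption): merge overlapping or touching windows whose
    absolute score reaches the threshold; return (start, length, mean score)."""
    regions = []
    for i, s in enumerate(wscores):
        if abs(s) < threshold:
            continue
        end = i + window
        if regions and i <= regions[-1][1]:
            regions[-1][1] = end
        else:
            regions.append([i, end])
    return [(start, end - start, (prefix[end] - prefix[start]) / (end - start))
            for start, end in regions]


def g4hunter_sketch(seq, window=25, threshold=1.4):
    """Run the four steps in order with the default parameters quoted in the text."""
    prefix = prefix_sums(base_scores(seq))
    return aggregate(prefix, window_scores(prefix, window), window, threshold)
```

Because the prefix sums make each window score independent of its neighbours, steps 1 and 3 can be split across worker threads by sequence range, which is one plausible reading of how the parallel execution described above is organized.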

2.1.1 G4Hunter web application input and analysis

Sequences can be imported by NCBI ID, and multiple sequences can be downloaded at once using the NCBI multiple import option. Alternatively, the user can upload a local FASTA or txt file and/or paste sequences directly from the clipboard. The user can sort sequences according to all parameters and can add tags to organize sequence sets. Standard IUPAC nucleotide codes are supported. Sample sequences are available to test the server. The user selects one or more sequences and sets the parameters for analysis. The recommended values for window size (25) and score threshold (1.4) are set as defaults; window sizes of 10-100 nucleotides and thresholds of 0.8-4.0 are available, although a threshold of 4.0 will only retrieve pure G-runs.
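As an illustration of the input handling described above, the following Python sketch parses pasted FASTA text, flags symbols outside the IUPAC nucleotide alphabet and checks the two analysis parameters against the ranges quoted in the text. All function names are hypothetical, and the exact symbol set accepted by the server (for example, whether gap characters are allowed) is an assumption.

```python
# Assumed alphabet: the standard IUPAC nucleotide codes, without gap symbols.
IUPAC_NUCLEOTIDES = set("ACGTURYSWKMBDHVN")


def parse_fasta(text):
    """Return {record name: sequence} from FASTA-formatted text."""
    records = {}
    name = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            name = line[1:].strip()
            records[name] = []
        elif name is None:
            raise ValueError("sequence data found before the first FASTA header")
        else:
            records[name].append(line.upper())
    return {key: "".join(parts) for key, parts in records.items()}


def invalid_symbols(seq):
    """Return the set of characters that are not IUPAC nucleotide codes."""
    return set(seq.upper()) - IUPAC_NUCLEOTIDES


def check_parameters(window=25, threshold=1.4):
    """Validate window size (10-100) and threshold (0.8-4.0) as stated in the text."""
    if not 10 <= window <= 100:
        raise ValueError("window size must be between 10 and 100 nucleotides")
    if not 0.8 <= threshold <= 4.0:
        raise ValueError("threshold must be between 0.8 and 4.0")
    return window, threshold
```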

2.1.2 G4Hunter web application output

Results are displayed using AJAX technology. If more than one sequence is analyzed, results are shown as individual tabs. The user interface is designed for intuitive visualization and browsing of results. Below the sequence name is a heatmap that divides the sequence into segments and displays the number of G4-forming sequences in each segment. The number of results is indicated by the intensity of the red color. The heatmap can be used to filter results in selected segments. Below this is basic information including settings, number and frequency of results, export options and sequence information, including length and GC content. The sequence browser component displays the nucleotide sequences of the results and shows a cut-out of bases that fits the screen/browser window. The sequences corresponding to the analysis parameters are marked by colors (G in red, with brighter intensity for longer G-tracks; C in blue). Position in the sequence, length, sequence, score chart and G4Hunter score are shown. The aggregated results are shown first; the separate (unaggregated) results are displayed using the magnifier icon. A typical analysis output is shown in Supplementary Material S1.
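The heatmap logic described above amounts to binning result positions into equal-width segments and scaling the counts to a color intensity. A minimal Python sketch follows; the function names, the number of segments and the linear intensity scale are assumptions for illustration, not details taken from the web application.

```python
def heatmap_counts(seq_length, hit_starts, n_segments=10):
    """Count results per equal-width segment of the sequence (0-based start positions)."""
    if seq_length <= 0 or n_segments <= 0:
        raise ValueError("sequence length and segment count must be positive")
    counts = [0] * n_segments
    for start in hit_starts:
        # Integer arithmetic maps position -> segment index; clamp the last base.
        index = min(start * n_segments // seq_length, n_segments - 1)
        counts[index] += 1
    return counts


def red_intensities(counts):
    """Scale counts linearly to 0.0-1.0 so the densest segment is the brightest red."""
    peak = max(counts, default=0)
    return [count / peak if peak else 0.0 for count in counts]
```

Filtering by a selected segment then reduces to keeping the results whose start position maps to the chosen index.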

2.2 Output formats

The G4Hunter web application provides results in three formats: first, the graphical representation described above; second, concatenated (aggregated) sequences; and third, unaggregated sequences, the latter two as CSV files. The structure of the CSV files is shown in Figure 1: each record contains the POSITION in the sequence, the LENGTH of the longest continuous sequence with G4Hunter scores above the threshold, its SCORE and the corresponding part of the sequence. SUB_SCORE gives the scores for each window position inside the concatenated sequence. These CSV files can be downloaded from the main results window and/or from the stored results tabs for follow-up analyses.
Fig. 1. CSV export
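For follow-up analyses, an exported file can be post-processed with a few lines of Python. The sketch below is hedged: it assumes a comma-delimited file with a header row containing POSITION, LENGTH, SCORE and a column named SEQUENCE (the text does not name the sequence column or the delimiter), and the sample rows are made-up values for illustration, not real G4Hunter output.

```python
import csv
import io


def load_hits(handle, min_abs_score=0.0, delimiter=","):
    """Read exported results, keep those with |SCORE| >= min_abs_score,
    and return them sorted by descending absolute score."""
    hits = []
    for row in csv.DictReader(handle, delimiter=delimiter):
        score = float(row["SCORE"])
        if abs(score) < min_abs_score:
            continue
        hits.append({
            "position": int(row["POSITION"]),
            "length": int(row["LENGTH"]),
            "score": score,
            "sequence": row["SEQUENCE"],  # assumed column name
        })
    hits.sort(key=lambda hit: abs(hit["score"]), reverse=True)
    return hits


# Made-up sample in the assumed layout; negative scores stand for C-rich hits.
SAMPLE = (
    "POSITION,LENGTH,SCORE,SEQUENCE\n"
    "120,27,1.52,GGGTTAGGGTTAGGGTTAGGGTTAGGG\n"
    "480,25,-1.44,CCCTAACCCTAACCCTAACCCTAAC\n"
    "900,25,1.40,GGGAGGGTGGGAGGGTGGGAAGGTG\n"
)

strong_hits = load_hits(io.StringIO(SAMPLE), min_abs_score=1.45)
```

If the real export uses a different delimiter or column names, only the `delimiter` argument and the dictionary keys need to change.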

3 Validation

To compare the performance and accuracy of the G4Hunter web application with the original Python implementation, we analyzed several identical sequences with each version. Both implementations return the same results. However, thanks to the new architecture of our application and parallel processing, the web version is more than 10 times faster and allows analyses of multiple sequences in a system-independent modern graphical environment.

4 Conclusions

We developed a web version of the G4Hunter application with a user-friendly GUI and improved output options including graphic representation. Our web server allows detailed analyses of nucleic acid sequences and adds basic information and broad visualization options with sorting tools that allow quick and effective searching for target information from G4Hunter analyses. This web version of the G4Hunter algorithm allows rapid and effective analyses of various nucleic acid sequences and will be useful for researchers in the field.

References (10 of 23 shown)

1. Adam Siddiqui-Jain; Cory L Grand; David J Bearss; Laurence H Hurley. Direct evidence for a G-quadruplex in a promoter region and its targeting with a small molecule to repress c-MYC transcription. Proc Natl Acad Sci U S A, 2002-08-23.

2. Anne De Cian; Laurent Lacroix; Céline Douarre; Nassima Temime-Smaali; Chantal Trentesaux; Jean-François Riou; Jean-Louis Mergny. Targeting telomeres and telomerase (review). Biochimie, 2007-07-24.

3. Markus Wieland; Jörg S Hartig. RNA quadruplex-based modulation of gene expression. Chem Biol, 2007-07.

4. Katrin Paeschke; John A Capra; Virginia A Zakian. DNA replication through G-quadruplex motifs is promoted by the Saccharomyces cerevisiae Pif1 DNA helicase. Cell, 2011-05-27.

5. Han Min Wong; Oliver Stegle; Simon Rodgers; Julian Leon Huppert. A toolbox for predicting g-quadruplex formation and stability. J Nucleic Acids, 2010-06-08.

6. Iris Cheung; Michael Schertzer; Ann Rose; Peter M Lansdorp. Disruption of dog-1 in Caenorhabditis elegans triggers deletions upstream of guanine-rich DNA. Nat Genet, 2002-07-08.

7. Václav Brázda; Rob C Laister; Eva B Jagelská; Cheryl Arrowsmith. Cruciform structures are a common DNA feature important for regulating biological processes (review). BMC Mol Biol, 2011-08-05.

8. Alan K Todd; Matthew Johnston; Stephen Neidle. Highly prevalent putative quadruplex sequence motifs in human DNA. Nucleic Acids Res, 2005-05-24.

9. Julian L Huppert; Shankar Balasubramanian. Prevalence of quadruplexes in the human genome. Nucleic Acids Res, 2005-05-24.

10. Oleg Kikin; Lawrence D'Antonio; Paramjeet S Bagga. QGRS Mapper: a web-based server for predicting G-quadruplexes in nucleotide sequences. Nucleic Acids Res, 2006-07-01.
