Costas Bouyioukos, François Bucchini, Mohamed Elati, François Képès.
Abstract
GREAT (Genome REgulatory Architecture Tools) is a novel web portal for tools designed to generate user-friendly and biologically useful analyses of genome architecture and regulation. The online tools of GREAT are freely accessible and compatible with essentially any operating system that runs a modern browser. GREAT is based on the analysis of genome layout (defined as the respective positioning of co-functional genes) and its relation to chromosome architecture and gene expression. GREAT tools allow users to systematically detect regular patterns along co-functional genomic features through an automated procedure of three individual steps, each with its own interactive visualization. In addition to the complete analysis of regularities, GREAT tools enable the use of periodicity and position information to improve the prediction of transcription factor binding sites using a multi-view machine learning approach. The outcome of this integrative approach features a multivariate analysis of the interplay between the location of a gene and its regulatory sequence. GREAT results are plotted in interactive web graphs and are available for download as individual plots, self-contained interactive pages or machine-readable tables for downstream analysis. The GREAT portal can be reached at https://absynth.issb.genopole.fr/GREAT, and each individual GREAT tool is available for download.
Year: 2016 PMID: 27151196 PMCID: PMC4987929 DOI: 10.1093/nar/gkw384
Source DB: PubMed Journal: Nucleic Acids Res ISSN: 0305-1048 Impact factor: 16.971
Figure 1. Overview of tools. Input and intermediate data are represented by green boxes. Analysis steps and computations performed by the tools are represented by blue boxes. Visual outputs produced by the tools are represented by orange boxes, and the corresponding thumbnails depict caricatures of the original visualisations. Solid lines depict successive computation steps, dashed lines depict links between the output of one tool and the input to another, and the thin double-arrowed dotted line represents interchangeability between input files. The first and second step inside each classifier correspond to the learning and the training stage.
Figure 2. Graphical outputs of . (A) Periodogram: The period length (in log scale) is plotted on the horizontal axis; the vertical axis represents the antilogarithm of the P-values. (B) Clustergram: Period length (or ‘phase’) is plotted on the horizontal axis, names and chromosomal coordinates on the left vertical axis and the positional score (details in the text) on the right. The dashed line indicates that the highlighted period in (A) was used to calculate the ‘clustergram’ (B) (details on the step I and II description of ). (C) Chromosome map: The horizontal axis spans the genome length and each coloured bar, stacked from bottom to top, represents a region with a detected periodic pattern. All three visualizations provide interactive mouse-over information, as shown by the tooltips. Graphical output of (D): interactive correlation circle plot of the prediction scores. The axes represent the CCA variates (i.e. the two components which capture the highest correlation between variables). Iterations where the position classifier performed better are represented as red dots, and iterations where the sequence classifier performed better as blue dots. The bigger the dot, the smaller the classification error. Clicking on a point loads visualizations of the respective classifier at that particular iteration.
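To make the idea behind a periodogram such as panel (A) more concrete, the following is a minimal sketch of how a set of genomic positions could be scored against candidate periods. It is not the method implemented in GREAT: it swaps in a generic circular-statistics measure (the mean resultant length, as used in the Rayleigh test), and the function names, coordinates and candidate periods are all hypothetical.

```python
import cmath
import math


def periodicity_score(positions, period):
    """Fold positions onto a circle of circumference `period` and return
    the mean resultant length (0 = no phase preference, 1 = all positions
    share the same phase). Illustrative only, not the GREAT scoring scheme."""
    total = sum(
        cmath.exp(2j * math.pi * (x % period) / period) for x in positions
    )
    return abs(total) / len(positions)


def scan_periods(positions, periods):
    """Score every candidate period; returns {period: score}."""
    return {p: periodicity_score(positions, p) for p in periods}


# Hypothetical gene start coordinates, spaced exactly 50 kb apart.
positions = [1000 + 50000 * k for k in range(8)]

# Hypothetical candidate periods (in base pairs).
scores = scan_periods(positions, [30000, 40000, 50000, 60000])
best = max(scores, key=scores.get)

for period, score in sorted(scores.items()):
    print(f"period={period} score={score:.3f}")
print("best period:", best)
```

In this toy input the positions are perfectly regular, so the true spacing receives the highest score. Real gene positions are noisy, and a significance estimate (such as the P-values plotted in the periodogram) would be needed before interpreting any peak.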