SVGenes: a library for rendering genomic features in scalable vector graphic format.

Graham J Etherington, Daniel MacLean.

Abstract

MOTIVATION: Drawing genomic features in attractive and informative ways is a key task in the visualization of genomics data. The Scalable Vector Graphics (SVG) format is a modern and flexible open standard that provides advanced features including modular graphic design, advanced web interactivity and animation within a suitable client. SVGs do not lose image quality on re-scaling, and individual elements of a graphic can be edited as whole objects, independently of the rest of the image. These features make SVG a potentially useful format for preparing publication-quality figures that include genomic objects such as genes or sequencing coverage, and for web applications that require rich user interaction with the graphical elements.
RESULTS: SVGenes is a Ruby-language library that uses SVG primitives to render typical genomic glyphs through a simple and flexible Ruby interface. The library implements a simple Page object that spaces and contains horizontal Track objects, which in turn style, colour and position the features within them. Tracks are the level at which visual information is supplied, providing the full styling capability of the SVG standard. Genomic entities like genes, transcripts and histograms are modelled in Glyph objects that are attached to a track and take advantage of SVG primitives to render the genomic features in a track as any of a selection of defined glyphs. The feature model within SVGenes is simple but flexible and does not depend on particular existing gene feature formats, meaning graphics for any existing dataset can easily be created without the need for conversion.
AVAILABILITY: The library is provided as a Ruby Gem from https://rubygems.org/gems/bio-svgenes under the MIT license, and open source code is available at https://github.com/danmaclean/bioruby-svgenes also under the MIT License.
CONTACT: dan.maclean@tsl.ac.uk.

Year:  2013        PMID: 23749959      PMCID: PMC3712214          DOI: 10.1093/bioinformatics/btt294

Source DB:  PubMed          Journal:  Bioinformatics        ISSN: 1367-4803            Impact factor:   6.937


1 INTRODUCTION

Visualization, analysis and communication of genome data are important tasks in genomics. Numerous desktop programs, including Artemis (Carver ), exist for rendering images of genomic data, usually as part of analytic pipelines. Genome browsers such as Gbrowse (Stein ), JBrowse (Skinner ), Savant (Fiume ) and IGV (Thorvaldsdóttir ) provide interactive visualization of the data for whole genomes and draft assemblies. Output from these is typically limited to an exported bitmap or screen grab in the program’s particular fixed style. Graphics libraries such as GD and ImageMagick have been used in projects like BioPerl (Stajich ) and BioRuby (Goto ) to create uniquely styled bitmap images, such as PNG and JPEG, programmatically. BioRuby’s bio-graphics package has similar functionality to bio-svgenes but relies on external libraries such as Cairo, Pango and ImageMagick. The Bio.Graphics module in Biopython (Cock ) also supports SVG output through third-party software [ReportLab (http://www.reportlab.com/)]. Bitmap images are limited in that they are not easy to re-annotate or re-scale, and they often cannot be reproduced for publication or presentation with high fidelity because of limitations of the original bitmaps. Bitmaps can also be difficult to manipulate and are not easily amenable to the addition of interactive features. Interactive graphics can be provided in web browsers through JavaScript libraries such as D3.js, but there are no such libraries available specifically for easy rendering of genomic data. Scalable Vector Graphics (SVG) is an XML-based graphic standard under development by the World Wide Web Consortium that provides many advantages for those seeking to produce rich, attractive images. As a vector format, SVG does not suffer image quality degradation on rescaling; it offers advanced image features such as alpha masks and filter effects, supports web interactivity and can be styled with Cascading Style Sheets.
Furthermore, as a text-based format, SVG is well suited for searching and indexing in databases and is amenable to lossless compression. SVG can be rendered by modern web browsers and by graphics software including Adobe’s Illustrator and the open source Inkscape. SVG output is available from some applications. CGView (Grant ) and Circos (Krzywinski ) are good tools for viewing circular genomes in particular. GenomeDiagram (Pritchard ) is designed to display large amounts of comparative genomics data. MGV (Kerkhoven ) is a database-driven web application designed specifically for microbial data, and AnnotationSketch (Steinbiss ) is a drawing library that is dependent on third-party software. SVGenes is a pure Ruby library that allows a user to set styles for tracks of genomic features and will automatically lay out and generate SVG images composed of several pre-defined genomics glyphs, including genes, transcripts, data and single-nucleotide features.

2 APPROACH

SVGenes uses a simple Page-Track-Feature model to organize the genomic data and to apply the style information provided to it.

2.1 The page and track object

The page object represents the area into which feature tracks are drawn; it has straightforward width, height and background attributes. Height is not fixed and is recalculated if more space is required to render all the constituent features at the specified sizes. The background attribute of the page can be styled, and an automatically generated scale bar is added to the top of the rendered page. Instantiating new track objects is the main way that styling information is specified: the track attributes set the final visual style of the genomic features, and the track is responsible for placing them within itself on the page.
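The page and track behaviour described above can be illustrated with a minimal, self-contained sketch. It is not the bio-svgenes API: the class names `SketchPage` and `SketchTrack`, the `feature_height` and `fill` style keys, the padding and the scale-bar height are all assumptions made for illustration. It shows a page that grows in height when added tracks need more room, and tracks that carry the style applied to their features.

```ruby
# Illustrative sketch only. SketchPage and SketchTrack are hypothetical
# names and are NOT classes from the bio-svgenes gem.
class SketchTrack
  attr_reader :name, :style, :features

  def initialize(name:, **style)
    @name = name
    # Assumed default style; a track's style applies to all of its features.
    @style = { fill: 'steelblue', feature_height: 12 }.merge(style)
    @features = []
  end

  # Store a feature as a [start, stop] pair of genomic positions.
  def add(feature_start, feature_stop)
    @features << [feature_start, feature_stop]
    self
  end

  # Vertical space the track needs: glyph height plus assumed padding.
  def height
    @style[:feature_height] + 8
  end
end

class SketchPage
  SCALE_BAR_HEIGHT = 20 # assumed space reserved for the scale bar

  attr_reader :width, :height, :tracks

  def initialize(width:, height:, background: 'white')
    @width = width
    @height = height
    @background = background
    @tracks = []
  end

  # Adding a track grows the page if the requested height is too small.
  def add_track(name:, **style)
    track = SketchTrack.new(name: name, **style)
    @tracks << track
    needed = SCALE_BAR_HEIGHT + @tracks.sum(&:height)
    @height = needed if needed > @height
    track
  end

  # Map genomic positions onto the page width and emit one rect per feature.
  def to_svg
    min, max = @tracks.flat_map { |t| t.features.flatten }.minmax
    min ||= 0
    max ||= min + 1
    scale = @width.to_f / [max - min, 1].max
    y = SCALE_BAR_HEIGHT
    body = @tracks.flat_map do |track|
      rects = track.features.map do |start, stop|
        format('<rect x="%.1f" y="%d" width="%.1f" height="%d" fill="%s"/>',
               (start - min) * scale, y, (stop - start) * scale,
               track.style[:feature_height], track.style[:fill])
      end
      y += track.height
      rects
    end
    <<~SVG
      <svg xmlns="http://www.w3.org/2000/svg" width="#{@width}" height="#{@height}">
        <rect width="100%" height="100%" fill="#{@background}"/>
        <line x1="0" y1="10" x2="#{@width}" y2="10" stroke="black"/>
        #{body.join("\n  ")}
      </svg>
    SVG
  end
end
```

In this sketch, a page created with `SketchPage.new(width: 400, height: 30)` is enlarged as soon as a track is added with `add_track(name: 'genes', fill: 'green')`, because the scale bar and the track together need more than 30 units.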

2.2 Glyphs and feature objects

SVGenes can render genomic features using various glyphs including gene, transcript and point features. Data tracks representing, e.g. sequence read coverage can be rendered as histograms, and the flexible styling options allow considerable variety in appearance (Fig. 1 shows some examples). Each glyph takes style information from the track, and full SVG styling syntax can be used for arbitrary styling information including opacity settings. HTML colours and some pre-defined gradient fills are available through keyword declaration, greatly simplifying basic styling. The feature object represents genomic features simply and flexibly. As a minimum, start and stop positions are required for the basic glyphs. Grouped features such as transcripts are represented by start and stop positions for the parent object and start and stop positions for the block elements within it, such as exons and untranslated regions. Data glyphs are bars with start, width and height elements.
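The feature model described above can be sketched in plain Ruby as follows. The names `SketchFeature`, `SketchBar` and `sketch_bars` are hypothetical and are not taken from the bio-svgenes gem, and the sketch assumes inclusive, 1-based coordinates. It shows a basic feature that needs only start and stop, a grouped feature whose blocks must lie inside the parent, and data bars defined by start, width and height.

```ruby
# Illustrative sketch only. These are hypothetical structures, not the
# feature classes shipped with bio-svgenes.
class SketchFeature
  attr_reader :start, :stop, :strand, :id, :blocks

  # start and stop are the only required values; blocks is an optional
  # list of [start, stop] pairs (for example exons or UTRs of a transcript).
  def initialize(start:, stop:, strand: nil, id: nil, blocks: [])
    raise ArgumentError, 'start must not exceed stop' if start > stop
    blocks.each do |b_start, b_stop|
      inside = b_start >= start && b_stop <= stop && b_start <= b_stop
      raise ArgumentError, "block #{b_start}-#{b_stop} lies outside #{start}-#{stop}" unless inside
    end
    @start = start
    @stop = stop
    @strand = strand
    @id = id
    @blocks = blocks.sort
  end

  # A grouped feature is one that carries block elements.
  def grouped?
    !@blocks.empty?
  end

  # Regions between consecutive blocks (introns, if the blocks are exons),
  # assuming inclusive coordinates and non-overlapping blocks.
  def gaps
    @blocks.each_cons(2).map do |(_, left_stop), (right_start, _)|
      [left_stop + 1, right_start - 1]
    end
  end
end

# A data glyph: a bar with a start, a width and a height.
SketchBar = Struct.new(:start, :width, :height)

# Build one bar per bin; the block supplies the height for each bin start.
def sketch_bars(start, stop, bin_width)
  (start...stop).step(bin_width).map do |bin_start|
    SketchBar.new(bin_start, bin_width, yield(bin_start))
  end
end
```

A simulated data track similar in spirit to the sine-based one in Fig. 1 could then be produced with `sketch_bars(0, 1000, 50) { |pos| Math.sin(pos / 100.0).abs }`.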
Fig. 1.

Rendering of features from the TAIR 10 annotation of the Arabidopsis genome (Lamesch ). The region shown is Chromosome 3: 19 597 235–19 637 249. Tracks contain (from top to bottom) (i) genes with the ‘directed’ glyph, (ii) mRNAs with the ‘transcript’ glyph, (iii) cDNA matches with the ‘directed’ glyph, (iv) microarray probes with the ‘generic’ glyph, (v) insertions with the ‘down triangle’ glyph, (vi) deletions with the ‘up triangle’ glyph, (vii) TE insertions with the ‘circle’ glyph, (viii) a data track showing simulated NGS data (height calculated from a sine function of the genome position) and (ix) a data track showing random bar heights.

2.3 Workflow

SVGenes provides programmatic and configuration-based rendering workflows. Within a Ruby script, a user may manually instantiate a page object and attach tracks as required, then create and add the feature objects to the appropriate tracks. This workflow does not rely on any particular feature file format. For input from the popular GFF format, a configuration-based workflow is provided. In this workflow, the user creates a JSON configuration file that describes each track and links to the file containing the features or data values to be rendered in that track.
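A rough sketch of how a configuration-based workflow of this kind could be put together is given below. The JSON layout, its key names (`tracks`, `name`, `glyph`, `file`, `feature_type`) and the helpers `parse_gff_line` and `load_tracks` are assumptions made for illustration; they are not the configuration schema or functions of bio-svgenes. Only the nine tab-separated GFF columns are standard.

```ruby
require 'json'

# Hypothetical configuration layout. The key names are assumptions and do
# not reproduce the schema expected by bio-svgenes.
CONFIG_JSON = <<~JSON
  {
    "tracks": [
      { "name": "genes", "glyph": "directed", "file": "genes.gff", "feature_type": "gene" }
    ]
  }
JSON

# Parse one GFF line (nine tab-separated columns) into a simple hash.
# Returns nil for comments, blank lines and lines with too few columns.
def parse_gff_line(line)
  return nil if line.start_with?('#') || line.strip.empty?
  fields = line.chomp.split("\t")
  return nil if fields.size < 8
  seqid, _source, type, start, stop, _score, strand, _phase, attributes = fields
  attrs = attributes.to_s.split(';')
                    .map { |pair| pair.split('=', 2) }
                    .select { |pair| pair.size == 2 }
                    .to_h
  { seqid: seqid, type: type, start: Integer(start), stop: Integer(stop),
    strand: strand, id: attrs['ID'] }
end

# Read the configuration and collect, for each track, the features of the
# requested type from the file that the track entry points to.
# read_file is injectable so the sketch can be exercised without real files.
def load_tracks(config_json, read_file: ->(path) { File.readlines(path) })
  JSON.parse(config_json).fetch('tracks').map do |track|
    features = read_file.call(track.fetch('file'))
                        .map { |line| parse_gff_line(line) }
                        .compact
                        .select { |feature| feature[:type] == track['feature_type'] }
    { name: track['name'], glyph: track['glyph'], features: features }
  end
end
```

Calling `load_tracks(CONFIG_JSON)` would read `genes.gff` from the working directory. Keeping per-track style and file references in one configuration entry mirrors the idea that the track is the level at which visual information is supplied.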

3 CONCLUSION

SVGenes is a useful and flexible library for quickly creating easily manipulated, high-quality, web-friendly images of genomic data in SVG format without embedding a bitmap. The library can be used for visualization at many levels, from high-throughput pipelines to web applications. Individual users preparing figures for publication will also find the library extremely useful, as the individual elements of the images can be independently manipulated, annotated and composited.
  14 in total

1.  The generic genome browser: a building block for a model organism system database.

Authors:  Lincoln D Stein; Christopher Mungall; ShengQiang Shu; Michael Caudy; Marco Mangone; Allen Day; Elizabeth Nickerson; Jason E Stajich; Todd W Harris; Adrian Arva; Suzanna Lewis
Journal:  Genome Res       Date:  2002-10       Impact factor: 9.043

2.  The Bioperl toolkit: Perl modules for the life sciences.

Authors:  Jason E Stajich; David Block; Kris Boulez; Steven E Brenner; Stephen A Chervitz; Chris Dagdigian; Georg Fuellen; James G R Gilbert; Ian Korf; Hilmar Lapp; Heikki Lehväslaiho; Chad Matsalla; Chris J Mungall; Brian I Osborne; Matthew R Pocock; Peter Schattner; Martin Senger; Lincoln D Stein; Elia Stupka; Mark D Wilkinson; Ewan Birney
Journal:  Genome Res       Date:  2002-10       Impact factor: 9.043

3.  Visualization for genomics: the Microbial Genome Viewer.

Authors:  Robert Kerkhoven; Frank H J van Enckevort; Jos Boekhorst; Douwe Molenaar; Roland J Siezen
Journal:  Bioinformatics       Date:  2004-02-26       Impact factor: 6.937

4.  Savant: genome browser for high-throughput sequencing data.

Authors:  Marc Fiume; Vanessa Williams; Andrew Brook; Michael Brudno
Journal:  Bioinformatics       Date:  2010-06-20       Impact factor: 6.937

5.  GenomeDiagram: a python package for the visualization of large-scale genomic data.

Authors:  Leighton Pritchard; Jennifer A White; Paul R J Birch; Ian K Toth
Journal:  Bioinformatics       Date:  2005-12-23       Impact factor: 6.937

6.  AnnotationSketch: a genome annotation drawing library.

Authors:  Sascha Steinbiss; Gordon Gremme; Christin Schärfer; Malte Mader; Stefan Kurtz
Journal:  Bioinformatics       Date:  2008-12-23       Impact factor: 6.937

7.  Circos: an information aesthetic for comparative genomics.

Authors:  Martin Krzywinski; Jacqueline Schein; Inanç Birol; Joseph Connors; Randy Gascoyne; Doug Horsman; Steven J Jones; Marco A Marra
Journal:  Genome Res       Date:  2009-06-18       Impact factor: 9.043

8.  JBrowse: a next-generation genome browser.

Authors:  Mitchell E Skinner; Andrew V Uzilov; Lincoln D Stein; Christopher J Mungall; Ian H Holmes
Journal:  Genome Res       Date:  2009-07-01       Impact factor: 9.043

9.  Integrative Genomics Viewer (IGV): high-performance genomics data visualization and exploration.

Authors:  Helga Thorvaldsdóttir; James T Robinson; Jill P Mesirov
Journal:  Brief Bioinform       Date:  2012-04-19       Impact factor: 11.622

10.  Artemis and ACT: viewing, annotating and comparing sequences stored in a relational database.

Authors:  Tim Carver; Matthew Berriman; Adrian Tivey; Chinmay Patel; Ulrike Böhme; Barclay G Barrell; Julian Parkhill; Marie-Adèle Rajandream
Journal:  Bioinformatics       Date:  2008-10-09       Impact factor: 6.937
