MGAviewer: a desktop visualization tool for analysis of metagenomics alignment data.

Zhengwei Zhu, Beifang Niu, Jing Chen, Sitao Wu, Shulei Sun, Weizhong Li.

Abstract

SUMMARY: Numerous metagenomics projects have produced tremendous amounts of sequencing data. Aligning these sequences to reference genomes is an essential analysis in metagenomics studies. Large-scale alignment data call for intuitive and efficient visualization tools. However, current tools such as the various genome browsers are highly specialized for intraspecies mapping results. They are not suitable for alignment data in metagenomics, which are often interspecies alignments. We have developed a web browser-based desktop application for interactively visualizing alignment data of metagenomic sequences. This viewer is easy to use on all computer systems with modern web browsers and requires no software installation.

AVAILABILITY: http://weizhongli-lab.org/mgaviewer

Year:  2012        PMID: 23044549      PMCID: PMC3530914          DOI: 10.1093/bioinformatics/bts567

Source DB:  PubMed          Journal:  Bioinformatics        ISSN: 1367-4803            Impact factor:   6.937


1 INTRODUCTION

Advances in next-generation sequencing technologies (Mardis, 2011) have driven waves of metagenomic projects studying microbiomes in different environments, such as the ocean (Rusch et al.) and the human body (Qin et al.). An essential step in metagenomic data analysis is to align the sequencing reads against the available microbial genomes. Visualization is an intuitive way to analyze large-scale alignment data in genomic studies, and many visualization tools are available. Some are web browser-based, such as the UCSC Genome Browser (Dreszer et al.), LookSeq (Manske and Kwiatkowski, 2009) and JBrowse (Skinner et al.). Others are standalone programs, such as Tablet (Milne et al.), GenomeView (Abeel et al.), MapView (Bao et al.), IGB (Nicol et al.), IGV (Robinson et al.) and SamScope (Popendorf and Sakakibara, 2012). However, these sophisticated tools are specialized for intraspecies alignment results (i.e. query and reference are from the same species). They are not suitable for interspecies alignments from metagenomic datasets, where query and reference can be from different species. The two cases differ fundamentally. Intraspecies alignment involves only one reference genome and represents features such as single nucleotide polymorphisms and alternative splicing, whereas interspecies alignment involves multiple (often 10^3) reference microbial genomes. To visualize interspecies alignments, a tool needs to show a wide range of alignment similarities (from 100% to as low as 50% for DNA and 30% for proteins) and to handle thousands of reference genomes. The Global Ocean Sampling study (Rusch et al.) first introduced fragment recruitment plots to illustrate metagenomic alignment data. However, its underlying software is not available to the public.

Here, we present MetaGenomic Alignment Viewer (MGAviewer), a platform-independent, web browser-based tool for visualizing alignment data. It does not rely on a web server or a relational database for image generation and data retrieval. It can be used as a standalone desktop program to analyze local data, or it can be included in a web server like other web-based genome browsers.

2 METHODS

The key component of this tool is a graphical interface with a 2D map that displays large numbers of alignments between metagenomic sequences from one or more samples and a reference genome (Fig. 1). Users can explore alignment data by interactively operating the 2D map, much as in Google Maps.
Fig. 1. Screenshots of the MGAviewer visualization interface

MGAviewer is an HTML5 web application. It works in all major modern browsers, including Chrome, Firefox, Safari and Internet Explorer 9, without installing any extra software or plugin. It uses jQuery (http://jquery.com/) as the base JavaScript library and, on top of it, a customized version of the jQuery plotting plugin Flot (http://code.google.com/p/flot/). We extended Flot to support drawing of fragments and annotation features. On top of these, a site-specific JavaScript file (‘site.js’) is responsible for setting up plot parameters, placing and responding to additional controls and fetching data. MGAviewer fetches alignment data from a user’s local computer or from a web server on demand via AJAX and then draws the plot in an HTML5 Canvas element. Every time a user interaction event is triggered (e.g. zooming in or out, panning or resizing the plot), the plot image is redrawn using data already loaded, unless additional data are required. This is in contrast to many other web-based genome browsers, where plot images are generated on the server side and then retrieved by the browser on demand; in MGAviewer, the plot is drawn locally in the browser. As a result, most user operations generate no network traffic, which dramatically improves the responsiveness of user interactions, especially on slow networks.

Alignment data are stored in JSON-formatted files (JSON is a lightweight data-interchange format used by JavaScript), which contain alignment details including coordinate, sequence identity, name, e-value, etc. We provide scripts to generate JSON files from raw alignment results produced by BLAST (Altschul et al.) and FR-HIT (Niu et al.) and also from alignments in SAM format. These scripts require the BioPython package. Converters for other programs such as BLAT (Kent, 2002) can be easily implemented. MGAviewer can be used as standalone software by simply opening the directory that contains these JSON files, the MGAviewer scripts and a master HTML file (see the user’s guide for details). It can also be hosted on a web server. The plot itself can be embedded in any webpage.
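To make the conversion step concrete, the following is a minimal sketch of how BLAST tabular output (`-outfmt 6`) might be turned into JSON records holding the coordinate, identity, name and e-value of each alignment. It is not the converter distributed with MGAviewer: the function name `blast_tab_to_records` and the record keys (`name`, `start`, `end`, `strand`, `identity`, `evalue`) are assumptions made for illustration, the actual JSON schema is defined in the user's guide, and only the Python standard library is used here rather than BioPython.

```python
import json
from collections import defaultdict

# Standard column order of BLAST tabular output (-outfmt 6).
BLAST_COLUMNS = ("qseqid", "sseqid", "pident", "length", "mismatch", "gapopen",
                 "qstart", "qend", "sstart", "send", "evalue", "bitscore")


def blast_tab_to_records(lines):
    """Group BLAST tabular hits by reference sequence.

    Returns {reference_id: [record, ...]}. The record keys are hypothetical
    and do not reproduce MGAviewer's real JSON schema.
    """
    by_reference = defaultdict(list)
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment lines
        fields = dict(zip(BLAST_COLUMNS, line.split("\t")))
        sstart, send = int(fields["sstart"]), int(fields["send"])
        by_reference[fields["sseqid"]].append({
            "name": fields["qseqid"],
            # BLAST reports send < sstart for minus-strand hits; normalize.
            "start": min(sstart, send),
            "end": max(sstart, send),
            "strand": "+" if sstart <= send else "-",
            "identity": float(fields["pident"]),
            "evalue": float(fields["evalue"]),
        })
    for records in by_reference.values():
        records.sort(key=lambda r: r["start"])  # order by genome coordinate
    return dict(by_reference)


# Two made-up hits against a made-up reference, for illustration only.
example = [
    "read1\tgenomeA\t98.50\t100\t1\t0\t1\t100\t5001\t5100\t1e-50\t180",
    "read2\tgenomeA\t72.00\t100\t28\t0\t1\t100\t1200\t1101\t3e-10\t60.2",
]
records = blast_tab_to_records(example)
print(json.dumps(records, indent=2))
```

Grouping by reference and sorting by start coordinate is a design choice of this sketch: it would let a viewer load one file per reference genome and select the alignments inside the visible window without rescanning everything.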

3 RESULTS

MGAviewer has an interface for users to select one or more metagenomic samples and a reference from a list of reference genomes to generate the plot. Screenshots of MGAviewer are shown in Figure 1. The plot shows alignments from eight metagenomic samples to a reference genome. The x-axis is the genome coordinate, and the y-axis is the alignment identity (%). Alignments are coloured by sample and are represented as points or lines depending on the zoom level. The bottom of the plot shows the genes of the reference genome, and the top shows the genome coverage for each sample. Icons at the bottom left and bottom right corners are for zoom, resize and reset. Users can also zoom or pan the map with the mouse. The inset circular images are zoomed views of the plot.

We tested MGAviewer on 1.5 million alignment datasets between >600 metagenomic samples from CAMERA (Sun et al.) and >2500 genomes from NCBI. MGAviewer provides real-time visualization for almost all of these datasets; the exceptions are a few hundred very large datasets, which need several extra seconds for data loading and plotting. MGAviewer has already been adopted by the CAMERA project in its alignment resources, which will be described in a separate publication. MGAviewer can be used to analyze alignment data not only for prokaryotic species but also for viruses and small eukaryotic organisms.

Funding: This study was supported by Award R01HG005978 from the National Human Genome Research Institute and the Gordon and Betty Moore Foundation.

Conflict of Interest: none declared.
REFERENCES (16 in total)

1.  BLAT--the BLAST-like alignment tool.

Authors:  W James Kent
Journal:  Genome Res       Date:  2002-04       Impact factor: 9.043

2.  SAMSCOPE: an OpenGL-based real-time interactive scale-free SAM viewer.

Authors:  Kris Popendorf; Yasubumi Sakakibara
Journal:  Bioinformatics       Date:  2012-03-13       Impact factor: 6.937

3.  A decade's perspective on DNA sequencing technology.

Authors:  Elaine R Mardis
Journal:  Nature       Date:  2011-02-10       Impact factor: 49.962

4.  A human gut microbial gene catalogue established by metagenomic sequencing.

Authors:  Junjie Qin; Ruiqiang Li; Jeroen Raes; Manimozhiyan Arumugam; Kristoffer Solvsten Burgdorf; Chaysavanh Manichanh; Trine Nielsen; Nicolas Pons; Florence Levenez; Takuji Yamada; Daniel R Mende; Junhua Li; Junming Xu; Shaochuan Li; Dongfang Li; Jianjun Cao; Bo Wang; Huiqing Liang; Huisong Zheng; Yinlong Xie; Julien Tap; Patricia Lepage; Marcelo Bertalan; Jean-Michel Batto; Torben Hansen; Denis Le Paslier; Allan Linneberg; H Bjørn Nielsen; Eric Pelletier; Pierre Renault; Thomas Sicheritz-Ponten; Keith Turner; Hongmei Zhu; Chang Yu; Shengting Li; Min Jian; Yan Zhou; Yingrui Li; Xiuqing Zhang; Songgang Li; Nan Qin; Huanming Yang; Jian Wang; Søren Brunak; Joel Doré; Francisco Guarner; Karsten Kristiansen; Oluf Pedersen; Julian Parkhill; Jean Weissenbach; Peer Bork; S Dusko Ehrlich; Jun Wang
Journal:  Nature       Date:  2010-03-04       Impact factor: 49.962

5.  Gapped BLAST and PSI-BLAST: a new generation of protein database search programs.

Authors:  S F Altschul; T L Madden; A A Schäffer; J Zhang; Z Zhang; W Miller; D J Lipman
Journal:  Nucleic Acids Res       Date:  1997-09-01       Impact factor: 16.971

6.  Community cyberinfrastructure for Advanced Microbial Ecology Research and Analysis: the CAMERA resource.

Authors:  Shulei Sun; Jing Chen; Weizhong Li; Ilkay Altintas; Abel Lin; Steve Peltier; Karen Stocks; Eric E Allen; Mark Ellisman; Jeffrey Grethe; John Wooley
Journal:  Nucleic Acids Res       Date:  2010-11-02       Impact factor: 16.971

7.  FR-HIT, a very fast program to recruit metagenomic reads to homologous reference genomes.

Authors:  Beifang Niu; Zhengwei Zhu; Limin Fu; Sitao Wu; Weizhong Li
Journal:  Bioinformatics       Date:  2011-04-19       Impact factor: 6.937

8.  The UCSC Genome Browser database: extensions and updates 2011.

Authors:  Timothy R Dreszer; Donna Karolchik; Ann S Zweig; Angie S Hinrichs; Brian J Raney; Robert M Kuhn; Laurence R Meyer; Mathew Wong; Cricket A Sloan; Kate R Rosenbloom; Greg Roe; Brooke Rhead; Andy Pohl; Venkat S Malladi; Chin H Li; Katrina Learned; Vanessa Kirkup; Fan Hsu; Rachel A Harte; Luvina Guruvadoo; Mary Goldman; Belinda M Giardine; Pauline A Fujita; Mark Diekhans; Melissa S Cline; Hiram Clawson; Galt P Barber; David Haussler; W James Kent
Journal:  Nucleic Acids Res       Date:  2011-11-15       Impact factor: 16.971

9.  GenomeView: a next-generation genome browser.

Authors:  Thomas Abeel; Thomas Van Parys; Yvan Saeys; James Galagan; Yves Van de Peer
Journal:  Nucleic Acids Res       Date:  2011-11-18       Impact factor: 16.971

10.  The Sorcerer II Global Ocean Sampling expedition: northwest Atlantic through eastern tropical Pacific.

Authors:  Douglas B Rusch; Aaron L Halpern; Granger Sutton; Karla B Heidelberg; Shannon Williamson; Shibu Yooseph; Dongying Wu; Jonathan A Eisen; Jeff M Hoffman; Karin Remington; Karen Beeson; Bao Tran; Hamilton Smith; Holly Baden-Tillson; Clare Stewart; Joyce Thorpe; Jason Freeman; Cynthia Andrews-Pfannkoch; Joseph E Venter; Kelvin Li; Saul Kravitz; John F Heidelberg; Terry Utterback; Yu-Hui Rogers; Luisa I Falcón; Valeria Souza; Germán Bonilla-Rosso; Luis E Eguiarte; David M Karl; Shubha Sathyendranath; Trevor Platt; Eldredge Bermingham; Victor Gallardo; Giselle Tamayo-Castillo; Michael R Ferrari; Robert L Strausberg; Kenneth Nealson; Robert Friedman; Marvin Frazier; J Craig Venter
Journal:  PLoS Biol       Date:  2007-03       Impact factor: 8.029
