Alview: Portable Software for Viewing Sequence Reads in BAM Formatted Files.

Richard P Finney, Qing-Rong Chen, Cu V Nguyen, Chih Hao Hsu, Chunhua Yan, Ying Hu, Massih Abawi, Xiaopeng Bian, Daoud M Meerzaman.

Abstract

The name Alview is a contraction of the term Alignment Viewer. Alview is a software tool, compiled to native architecture, for visualizing the alignment of sequencing data. Inputs are files of short-read sequences aligned to a reference genome in the SAM/BAM format and files containing reference genome data. Outputs are visualizations of these aligned short reads. Alview is written in portable C with optional graphical user interface (GUI) code written in C, C++, and Objective-C. The application can run in three different ways: as a web server, as a command line tool, or as a native GUI program. Alview is compatible with Microsoft Windows, Linux, and Apple OS X. It is available as a web demo at https://cgwb.nci.nih.gov/cgi-bin/alview. The source code and Windows/Mac/Linux executables are available via https://github.com/NCIP/alview.

Keywords:  BAM; alignment; genomics; open source; short read; visualization

Year:  2015        PMID: 26417198      PMCID: PMC4573065          DOI: 10.4137/CIN.S26470

Source DB:  PubMed          Journal:  Cancer Inform        ISSN: 1176-9351


Introduction

New, large genomic data sets are providing more in-depth insights into the diagnosis and treatment of disease. In the past decade, new and innovative methods have continued to add value to the underlying data and uncover the secrets of the genome. Visual data inspection by experienced researchers is an important quality control element in the analytical process. Additionally, data visualization helps to prioritize downstream analysis and verification steps. Unfortunately, this part of the process is tedious and time consuming, and the increasing volumes of high-throughput sequencing data of various types and platforms are proving to be a major analytical challenge. Here, we report a visualization tool that allows researchers to explore their data rapidly and that significantly reduces the burden of reviewing tens or hundreds of thousands of variant calls. Areas with systematic read errors can be quickly identified, and inefficient attempts to verify results in noisy regions can be avoided.

Features and Methods

Alview is a fast and portable visualization tool. The core code interfaces with the SAMtools library of Li et al. [1] for parsing BAM files. The program is written in platform-independent C. Peculiarities specific to an operating system are isolated with #ifdef (if defined) preprocessor directives; for instance, where Microsoft Visual C provides only alternate support for a Portable Operating System Interface (POSIX) standard function, a handcrafted native workaround is supplied. For graphical user interface (GUI) frameworks, Alview uses the Win32 interface for Windows, the GTK2 interface for Linux and BSD Unix-based systems, and Cocoa for Apple Mac OS X. SAMtools [1] is written to POSIX standards, but different Microsoft Visual compilers provide varying levels of support for these UNIX-style standards. As a result, the source code for third-party libraries that were modified for Windows is provided to facilitate compiling and linking Alview on Windows.
The main code for Alview, in the file alviewcore.cpp, is written to be portable between operating systems and emphasizes speed of execution. The code can be compiled as a stand-alone executable and must be linked with the zlib [2] and SAMtools [1] libraries. Sequence reads are processed via custom SAMtools callback functions, arranged in in-memory data structures, and rendered as an aesthetic, annotated image. The image is then output to the screen as a native graphics object or to disk as a standard image format file.
Alview can also be compiled as a webserver daemon that uses the Common Gateway Interface (CGI) [3] standard. The CGI version produces interactive HTML output and uses dynamic HTML5 [4] features, including zoom-in by selection via a jQuery [5] library. The CGI webserver version of Alview loads the list of BAM files that users are permitted to access from a user-maintained text file, so custom lists of BAM files of interest are easy to generate and use.
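The permitted-file list used by the CGI version can be illustrated with a short sketch. This is not Alview's actual code: the file format (one BAM path per line, blank lines ignored), the size limits, and the names bam_allowlist, load_allowlist, and is_permitted are all assumptions made for illustration.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define MAX_BAMS 64        /* illustrative cap on the number of entries */
#define MAX_PATH_LEN 512   /* illustrative cap on the length of one path */

/* Hypothetical in-memory form of the user-maintained text file. */
typedef struct {
    char paths[MAX_BAMS][MAX_PATH_LEN];
    size_t count;
} bam_allowlist;

/*
 * Read one path per line from fp, skipping blank lines.
 * Assumed format, not Alview's documented one. Lines longer than
 * MAX_PATH_LEN - 1 characters are not handled specially in this sketch.
 * Returns the number of entries loaded.
 */
size_t load_allowlist(FILE *fp, bam_allowlist *list)
{
    char line[MAX_PATH_LEN];

    assert(fp != NULL && list != NULL);
    list->count = 0;
    while (list->count < MAX_BAMS && fgets(line, sizeof line, fp) != NULL) {
        line[strcspn(line, "\r\n")] = '\0';  /* strip LF or CRLF */
        if (line[0] == '\0')
            continue;                        /* ignore blank lines */
        strcpy(list->paths[list->count++], line);
    }
    return list->count;
}

/* Exact-match check: only names that appear verbatim in the list pass. */
int is_permitted(const bam_allowlist *list, const char *requested)
{
    size_t i;

    assert(list != NULL && requested != NULL);
    for (i = 0; i < list->count; i++)
        if (strcmp(list->paths[i], requested) == 0)
            return 1;
    return 0;
}
```

A request handler built this way would call is_permitted on the requested file name before opening anything, so that names not listed in the text file, including path-traversal attempts, are refused.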
The source code is free and open to modification so that users and local system operators can implement their own security. The Alview CGI webserver version provides modifiable URL access so that, for instance, cells in a spreadsheet can link to viewable results for any sample or location. A user-generated custom HTML file can link to specific samples and regions.
Stand-alone Alview accepts parameters that specify the BAM file name and genomic coordinates. Invoking Alview in a script can create a slideshow of interesting regions. For example, fields in a single nucleotide polymorphism (SNP) detection output file can be used to specify a series of calls to Alview that generate an image for each purported polymorphism or mutation. Researchers can then review the results quickly and easily. Users can generate text to annotate the slideshow images, and a template is provided for command line creation of slideshows. The burden of reviewing tens or hundreds of thousands of mutation calls can therefore be significantly reduced.
The source code is available at GitHub [6]. The README file there points to links for selected executables and complete download packages that include the associated reference genome data. A live webserver version of Alview for examining public human cancer short-read datasets is available at https://cgwb.nci.nih.gov/cgi-bin/alview. The core source code for Alview is in the public domain and uses some libraries under permissive free software licenses. Alview source code and executables for several operating systems are available at the National Cancer Institute (NCI)/National Cancer Informatics Program (NCIP) GitHub site: https://github.com/NCIP/alview. Developers may modify Alview as they wish. NCI retains the copyrights to “National Cancer Institute” and associated images, which may not be used in forked projects.
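The scripted slideshow workflow described in this section, with one Alview call per variant, could be driven by a helper along the following lines. It is a sketch under stated assumptions: the variant file is assumed to be tab-delimited with chromosome and position in the first two fields, the 50-base flank is arbitrary, and the argument order in the generated command is hypothetical. The actual command line syntax should be taken from the README and the slideshow template shipped with Alview.

```c
#include <assert.h>
#include <stdio.h>

#define CHROM_MAX 32   /* illustrative limit on chromosome name length */
#define FLANK 50L      /* arbitrary number of bases shown on each side */

/*
 * Parse the first two whitespace-separated fields of a variant line
 * (assumed layout: chromosome, 1-based position, then anything else).
 * Returns 0 on success, -1 if the line does not match.
 */
int parse_variant_line(const char *line, char chrom[CHROM_MAX], long *pos)
{
    assert(line != NULL && chrom != NULL && pos != NULL);
    return sscanf(line, "%31s %ld", chrom, pos) == 2 ? 0 : -1;
}

/*
 * Format one command for a variant at chrom:pos.
 * HYPOTHETICAL argument order: alview <bam> <output image> <chrom:start-end>.
 * Returns 0 on success, -1 on invalid position or a too-small buffer.
 */
int format_alview_command(char *buf, size_t buflen, const char *bam,
                          const char *chrom, long pos)
{
    long start, end;
    int n;

    assert(buf != NULL && bam != NULL && chrom != NULL);
    if (pos < 1)
        return -1;
    start = pos > FLANK ? pos - FLANK : 1;  /* clamp at first base */
    end = pos + FLANK;
    n = snprintf(buf, buflen, "alview %s %s_%ld.png %s:%ld-%ld",
                 bam, chrom, pos, chrom, start, end);
    return (n < 0 || (size_t)n >= buflen) ? -1 : 0;
}
```

A driver would read the variant file line by line, call parse_variant_line and format_alview_command for each record, and write the resulting commands to a shell script or batch file, producing one image per call for the slideshow.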

Results

Alview provides a solid substructure that allows for various types of access to short-read data across different operating systems. Figure 1 shows the navigation and information buttons available in the web version of Alview and how selection via the mouse provides zoom-in capabilities. Alview is a trim, fast, precise tool and complements existing programs such as the Integrative Genomics Viewer (IGV) [7], BamView [8], and GBrowse 2.0 [9]. The benefits of Alview are extreme speed and a sharp focus on exploring short reads.
Figure 1. Information and navigation in Alview. The upper left panel is the original view; the lower right panel is the zoomed-in view obtained by dragging the mouse to examine an SNP. Various navigation buttons and information blocks assist in browsing BAM files.

Alview should not be compared with other programs solely on benchmarks. Confounding factors include operating system cache effects and internet congestion. Different implementation philosophies can influence memory usage and performance but provide useful alternative paths to solving similar problems. IGV provides much more functionality than Alview by supporting many input file types besides BAM sequence read files. IGV’s Java implementation provides write once, run anywhere portability via implementations of the Java virtual machine. Alview’s implementation relies on low-level operating system and native GUI toolkit API calls. This approach provides extreme speed but makes the code difficult to develop and maintain. IGV requires registration to download the version that runs from local disk, whereas Alview does not. Desktop IGV may require an internet connection for full, simple operation, whereas Alview does not require a network connection (though it may open external webpages when the user requests them). Alview does not log any user activity. On a Windows 7 machine with an Intel Core i5-2400 CPU at 3.10 GHz and 8 GB RAM, restarts of IGV v2.3 took from 12 to 18 seconds. Restarts of Alview took a small fraction of one second. For a small view of a genomic region, the Java Platform SE Binary for IGV took up 292 MB of memory, while Alview took up 11 MB.

References

1.  The Sequence Alignment/Map format and SAMtools.

Authors:  Heng Li; Bob Handsaker; Alec Wysoker; Tim Fennell; Jue Ruan; Nils Homer; Gabor Marth; Goncalo Abecasis; Richard Durbin
Journal:  Bioinformatics       Date:  2009-06-08       Impact factor: 6.937

2.  Integrative genomics viewer.

Authors:  James T Robinson; Helga Thorvaldsdóttir; Wendy Winckler; Mitchell Guttman; Eric S Lander; Gad Getz; Jill P Mesirov
Journal:  Nat Biotechnol       Date:  2011-01       Impact factor: 54.908

3.  BamView: viewing mapped read alignment data in the context of the reference sequence.

Authors:  Tim Carver; Ulrike Böhme; Thomas D Otto; Julian Parkhill; Matthew Berriman
Journal:  Bioinformatics       Date:  2010-01-12       Impact factor: 6.937

4.  Using GBrowse 2.0 to visualize and share next-generation sequence data.

Authors:  Lincoln D Stein
Journal:  Brief Bioinform       Date:  2013-02-01       Impact factor: 11.622
