
MSAViewer: interactive JavaScript visualization of multiple sequence alignments.

Guy Yachdav, Sebastian Wilzbach, Benedikt Rauscher, Robert Sheridan, Ian Sillitoe, James Procter, Suzanna E Lewis, Burkhard Rost, Tatyana Goldberg.

Abstract

The MSAViewer is a quick and easy visualization and analysis JavaScript component for Multiple Sequence Alignment data of any size. Core features include interactive navigation through the alignment, application of popular color schemes, sorting, selecting and filtering. The MSAViewer is 'web ready': written entirely in JavaScript, compatible with modern web browsers and does not require any specialized software. The MSAViewer is part of the BioJS collection of components.
AVAILABILITY AND IMPLEMENTATION: The MSAViewer is released as open source software under the Boost Software License 1.0. Documentation, source code and the viewer are available at http://msa.biojs.net/. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online. CONTACT: msa@bio.sh.
© The Author 2016. Published by Oxford University Press.

Year:  2016        PMID: 27412096      PMCID: PMC5181560          DOI: 10.1093/bioinformatics/btw474

Source DB:  PubMed          Journal:  Bioinformatics        ISSN: 1367-4803            Impact factor:   6.937


1 Introduction

Multiple Sequence Alignment (MSA) is a fundamental procedure to capture similarities between sequences of nucleotides (DNA/RNA) or of amino acids (protein). Biologically meaningful MSAs highlight and capture sites with significant evolutionary conservation. MSAs are essential to predict aspects of protein structure (e.g. secondary structure (Rost, 2001) or protein disorder (Schlessinger )) and function [e.g. binding sites (Ofran ) or localization (Goldberg )]. MSAs can be used to understand genomic rearrangements (Darling ), to derive sequence homology (Altschul ) and to identify evolutionary rates (Pupko ). MSAs are widely used to display complex annotations relating to structure and function, and to transfer those annotations to sequences that lack annotations (Waterhouse ). Many tools are available to view and analyze MSAs, including standalone applications (Larsson, 2014; Waterhouse ) and web applets (Waterhouse ). With the recent widespread adoption of JavaScript as the leading programming language for interactive web applications, new MSA viewing tools compatible with modern web browsers have been developed and made available (Martin, 2014). BioJS is one particular collection of JavaScript components with growing applications in biology (Corpas ); it is interoperable with many other data visualization tools. Here, we describe the BioJS MSAViewer. It is readily loaded into web pages to visualize and analyze MSA datasets of arbitrary sizes. MSAViewer implements most features commonly available in other popular MSA viewing software, including scrolling, selecting, highlighting, cross-referencing with protein feature annotations and phylogenetic trees (a detailed comparison of features is in Supplementary Table S1).

2 Visualization

The MSAViewer loads MSA data in FASTA (Pearson, 2000) or CLUSTAL (Larkin ) formats from a user’s local computer or a web server. It then draws two main Canvas panels—the main panel and the overview MSA panel (Fig. 1). The choice to use Canvas over other rendering technologies is discussed in Supplementary Material Section S1.
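To make the input format concrete, the following is a minimal sketch of reading an aligned FASTA file into rows. It is not the MSAViewer's own parser; the function names `parseFasta` and `isAligned` and the `{ name, seq }` record shape are assumptions chosen for illustration, and gaps are assumed to be written as `-`.

```javascript
// Minimal FASTA parser sketch (illustrative; not the MSAViewer's own parser).
// Returns one { name, seq } record per '>' header line.
function parseFasta(text) {
  const records = [];
  let current = null;
  for (const rawLine of text.split(/\r?\n/)) {
    const line = rawLine.trim();
    if (line === "") continue;              // skip blank lines
    if (line.startsWith(">")) {
      // A header line starts a new record.
      current = { name: line.slice(1).trim(), seq: "" };
      records.push(current);
    } else if (current !== null) {
      current.seq += line;                  // a sequence may wrap over several lines
    }
  }
  return records;
}

// In an aligned FASTA file every row has the same length (gaps as '-').
function isAligned(records) {
  return records.every((r) => r.seq.length === records[0].seq.length);
}

// Hypothetical two-sequence alignment, each sequence wrapped over two lines.
const example = ">seqA\nMK-RK\nLV\n>seqB\nMKKRK\nLV\n";
const rows = parseFasta(example);
console.log(rows.map((r) => r.name), isAligned(rows));
```

A real viewer would also need to handle CLUSTAL input and malformed files; this sketch only covers the well-formed FASTA case.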
Fig. 1. A simplified view of the MSAViewer for the sequence alignment of protein VP24 within seven viruses of the Filoviridae family. (A) Sequence logo (Schneider and Stephens, 1990) representation with conservation patterns at each position in the MSA. (B) Bar chart showing amino acid conservation per position. (C) Main MSA panel with residues in the alignment colored according to the Percentage Identity coloring scheme (Waterhouse ). Percentage sequence identities relative to consensus (consensus sequence not shown) are listed for each sequence. Filled red rectangles indicate sequence annotations provided by the user [here: secondary structure predictions of PredictProtein (Yachdav )]. Red frames (indicated by two black arrows) are clusters of residues of ebolavirus VP24 responsible for binding to karyopherin alpha nuclear transporters to suppress the antiviral defense mechanism in humans (Xu ). (D) A compact overview MSA showing a bird’s eye view of the full alignment. Yellow rectangles are the two highlighted clusters in the main MSA panel. Despite the overall sequence homology, the highlighted regions in ebolavirus (EBOV) proteins binding karyoprotein differ from those in Llovu cuevavirus (LLOV) and Lake Victoria marburgvirus (MARV). MARV is known not to suppress the host’s antiviral defense mechanism (Xu ). Though no experimental information on binding of LLOV VP24 to karyoproteins is available to date, the comparison of its sequence and structural features suggests a mechanism that is more similar to EBOV than to MARV.

Navigation through the alignment is enabled through various controls. First, users can scroll or use the ‘jump to a column’ menu item to navigate to a certain column number. Second, users can pan within the main panel to scroll through the alignment; this has proven to be a useful feature for large alignments. Finally, a second panel, the overview panel (drawn under the main panel), provides a ‘bird’s eye view’ perspective over the entire alignment and can also be used for navigation.

The alignment is sortable by unique identifiers, sequence labels and sequences. The percentage of gaps and the sequence identity to the consensus sequence, which is calculated from the most frequent bases at each position in the alignment, can also be used for sorting. Users can select sequences, columns, or arbitrary regions for analysis, and hide them from the main panel based on their selection, conservation to the consensus sequence or the percentage of gaps. An alignment can also be searched for motifs using a regular expression (e.g. K(K|R)RK for a nuclear localization signal); matches are highlighted with a red frame. Sequence position annotations, such as binding sites, are provided by a user and displayed as filled rectangles below the corresponding sequence in the alignment. Users can switch between 15 predefined color schemes. The MSAViewer exports alignments and annotations as ASCII files, and the visual representation as a publication-quality figure.
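The sorting and search criteria described above can be sketched in plain JavaScript. This is an illustration under stated assumptions, not the MSAViewer's implementation: the function names are hypothetical, gaps are assumed to be `-`, the consensus ignores gaps and breaks ties by first occurrence, and the motif search runs on the ungapped sequence and reports hits as alignment columns (0-based, inclusive).

```javascript
// Illustrative helpers; names and tie-breaking rules are assumptions,
// not the MSAViewer's actual implementation.
const GAP = "-";

// Consensus: most frequent non-gap residue per column (ties: first seen wins).
function consensus(seqs) {
  const width = seqs[0].length;
  let result = "";
  for (let col = 0; col < width; col++) {
    const counts = new Map();
    for (const s of seqs) {
      const c = s[col];
      if (c !== GAP) counts.set(c, (counts.get(c) || 0) + 1);
    }
    let best = GAP;           // stays a gap if the column holds only gaps
    let bestCount = 0;
    for (const [residue, n] of counts) {
      if (n > bestCount) { best = residue; bestCount = n; }
    }
    result += best;
  }
  return result;
}

// Fraction of alignment columns in which a row equals the consensus.
function identityToConsensus(seq, cons) {
  let same = 0;
  for (let i = 0; i < seq.length; i++) if (seq[i] === cons[i]) same++;
  return same / seq.length;
}

// Fraction of positions in a row that are gaps.
function gapFraction(seq) {
  return [...seq].filter((c) => c === GAP).length / seq.length;
}

// Motif search on the ungapped sequence; hits are mapped back to alignment
// columns. Assumes the pattern cannot match the empty string.
function findMotif(seq, pattern) {
  const cols = [];            // cols[k] = alignment column of the k-th residue
  let ungapped = "";
  for (let i = 0; i < seq.length; i++) {
    if (seq[i] !== GAP) { cols.push(i); ungapped += seq[i]; }
  }
  const hits = [];
  for (const m of ungapped.matchAll(new RegExp(pattern, "g"))) {
    hits.push({ start: cols[m.index], end: cols[m.index + m[0].length - 1] });
  }
  return hits;
}

// Hypothetical three-row alignment.
const seqs = ["MK-KRKLV", "MKAKRKLV", "MK-RRK-V"];
const cons = consensus(seqs);

// Sort a copy of the rows by decreasing identity to the consensus.
const byIdentity = [...seqs].sort(
  (a, b) => identityToConsensus(b, cons) - identityToConsensus(a, cons)
);
console.log(cons, byIdentity, findMotif(seqs[0], "K(K|R)RK"));
```

Searching the ungapped sequence lets a motif be found even when a gap column interrupts it in the alignment; searching the aligned row directly would be simpler but would miss such hits.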

3 Summary

MSAViewer is a lightweight viewer for MSAs of arbitrary size. Being JavaScript-based, it can be used in any modern web browser without installing any specialized software or add-ons. As part of BioJS, the MSAViewer can interoperate with a growing set of biological data viewers. The phylogenetic tree and the sequence logo viewers are already integrated in the current release. Integration with new MSA visualization techniques, e.g. sequence bundles (Kultys ), is planned. The MSAViewer has already proven useful and has become part of Galaxy (Giardine ) https://cpt.tamu.edu/clustalw-msa-and-visualisations and Jalview (Waterhouse ) http://www.jalview.org/help/html/features/biojsmsa.html.
References (10 of 18 shown)

1.  Flexible sequence similarity searching with the FASTA3 program package.

Authors:  W R Pearson
Journal:  Methods Mol Biol       Date:  2000

2.  Review: protein secondary structure prediction continues to rise.

Authors:  B Rost
Journal:  J Struct Biol       Date:  2001 May-Jun       Impact factor: 2.867

3.  Protein disorder--a breakthrough invention of evolution?

Authors:  Avner Schlessinger; Christian Schaefer; Esmeralda Vicedo; Markus Schmidberger; Marco Punta; Burkhard Rost
Journal:  Curr Opin Struct Biol       Date:  2011-04-20       Impact factor: 6.809

4.  Rate4Site: an algorithmic tool for the identification of functional regions in proteins by surface mapping of evolutionary determinants within their homologues.

Authors:  Tal Pupko; Rachel E Bell; Itay Mayrose; Fabian Glaser; Nir Ben-Tal
Journal:  Bioinformatics       Date:  2002       Impact factor: 6.937

5.  Basic local alignment search tool.

Authors:  S F Altschul; W Gish; W Miller; E W Myers; D J Lipman
Journal:  J Mol Biol       Date:  1990-10-05       Impact factor: 5.469

6.  Jalview Version 2--a multiple sequence alignment editor and analysis workbench.

Authors:  Andrew M Waterhouse; James B Procter; David M A Martin; Michèle Clamp; Geoffrey J Barton
Journal:  Bioinformatics       Date:  2009-01-16       Impact factor: 6.937

7.  PredictProtein--an open resource for online prediction of protein structural and functional features.

Authors:  Guy Yachdav; Edda Kloppmann; Laszlo Kajan; Maximilian Hecht; Tatyana Goldberg; Tobias Hamp; Peter Hönigschmid; Andrea Schafferhans; Manfred Roos; Michael Bernhofer; Lothar Richter; Haim Ashkenazy; Marco Punta; Avner Schlessinger; Yana Bromberg; Reinhard Schneider; Gerrit Vriend; Chris Sander; Nir Ben-Tal; Burkhard Rost
Journal:  Nucleic Acids Res       Date:  2014-05-05       Impact factor: 16.971

8.  AliView: a fast and lightweight alignment viewer and editor for large datasets.

Authors:  Anders Larsson
Journal:  Bioinformatics       Date:  2014-08-05       Impact factor: 6.937

9.  Sequence Bundles: a novel method for visualising, discovering and exploring sequence motifs.

Authors:  Marek Kultys; Lydia Nicholas; Roland Schwarz; Nick Goldman; James King
Journal:  BMC Proc       Date:  2014-08-28

10.  BioJS: an open source standard for biological visualisation - its status in 2014.

Authors:  Manuel Corpas; Rafael Jimenez; Seth J Carbon; Alex García; Leyla Garcia; Tatyana Goldberg; John Gomez; Alexis Kalderimis; Suzanna E Lewis; Ian Mulvany; Aleksandra Pawlik; Francis Rowland; Gustavo Salazar; Fabian Schreiber; Ian Sillitoe; William H Spooner; Anil S Thanki; José M Villaveces; Guy Yachdav; Henning Hermjakob
Journal:  F1000Res       Date:  2014-02-13
