Neil C Jones, Degui Zhi, Benjamin J Raphael.
Abstract
Multiple sequence alignment programs are invaluable tools in computational biology. A-Bruijn Alignment (ABA) is a method for multiple sequence alignment that represents an alignment as a directed graph and has proved useful in aligning nucleotide and amino acid sequences that are composed of repeated and shuffled subsequences. AliWABA is a web server that provides tools to generate alignments with ABA, visualize the resulting ABA graphs and extract subsequences from ABA graphs. AliWABA greatly simplifies the problem of analyzing multiple sequences for local similarities that may be reordered, as is common with the domain architectures of proteins. To facilitate the analysis of protein domains, AliWABA provides direct querying of the Conserved Domain Database. AVAILABILITY: http://aba.nbcr.net/
Year: 2006 PMID: 16845083 PMCID: PMC1538870 DOI: 10.1093/nar/gkl288
Source DB: PubMed Journal: Nucleic Acids Res ISSN: 0305-1048 Impact factor: 16.971
Figure 1. Existing alignment tools force linearity on the output alignments. (a and b) The ‘optimal’ alignments produced by traditional global alignment programs are incomplete because they do not reveal the similarity of blocks C, D and E in a single alignment. (c) The ABA graph representation of the same pair of sequences. In the ABA graph, each sequence is a path from a start node to an end node (but not all such paths are actually sequences) and edges correspond to substrings of a sequence of some (possibly 0) length. Note that ABA graphs are usually more complicated than this example and contain additional labelling information along the edges (Figure 2).
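The graph view described in the Figure 1 caption can be illustrated with a small toy model. The sketch below is not the ABA algorithm and not AliWABA's data model; the class `ToyABAGraph`, its methods, the node names and the single-letter blocks are all hypothetical, chosen only to show how two sequences with shuffled blocks can be paths through the same graph, with zero-length edges connecting the shared blocks.

```python
from dataclasses import dataclass, field


@dataclass
class ToyABAGraph:
    """Hypothetical toy graph: each edge carries a (possibly empty) substring."""

    # Maps (source node, target node) -> substring carried by that edge.
    edges: dict = field(default_factory=dict)

    def add_edge(self, source, target, block=""):
        self.edges[(source, target)] = block

    def spell(self, path):
        """Concatenate the edge substrings along a path given as a node list."""
        return "".join(self.edges[(u, v)] for u, v in zip(path, path[1:]))


graph = ToyABAGraph()

# One edge per shared block; the letters stand in for longer aligned blocks.
graph.add_edge("c0", "c1", "C")
graph.add_edge("d0", "d1", "D")
graph.add_edge("e0", "e1", "E")

# Zero-length connector edges used by the first sequence (order C, D, E).
graph.add_edge("start", "c0")
graph.add_edge("c1", "d0")
graph.add_edge("d1", "e0")
graph.add_edge("e1", "end")

# Extra zero-length connectors used by the second sequence (order D, E, C).
graph.add_edge("start", "d0")
graph.add_edge("e1", "c0")
graph.add_edge("c1", "end")

seq1_path = ["start", "c0", "c1", "d0", "d1", "e0", "e1", "end"]
seq2_path = ["start", "d0", "d1", "e0", "e1", "c0", "c1", "end"]

seq1 = graph.spell(seq1_path)  # blocks in the order C, D, E
seq2 = graph.spell(seq2_path)  # the same blocks, shuffled to D, E, C

# A valid start-to-end path that is neither input sequence.
other = graph.spell(["start", "c0", "c1", "end"])
```

Both input sequences traverse the same three block edges, so their shared similarity is visible in one structure regardless of block order. The last path shows why the caption notes that not every start-to-end path corresponds to an input sequence.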
Figure 2. (a) The ABA graph generated by AliWABA on a set of ten POU-domain transcription factors of varying biological function. Edges are labeled l(m), where l is the length of the alignment on the edge and m is its multiplicity (the number of sequences in the alignment). The POU-specific and POU-homeodomains are captured in the edge (3,4), which has length 135 and multiplicity 10. (b) When Oct-1, Oct-2, and Pit-1 are aligned separately, a previously unknown shared domain of length 30 becomes apparent. This domain is shuffled in Pit-1 (path traced in blue) versus Oct-1 and Oct-2. (c) The alignment between Oct-1 and Pit-1 that corresponds to the edge (9, s17). This example can be seen at .
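As a reading aid for the l(m) edge labels described in the Figure 2 caption, the following sketch parses a label such as `135(10)` into its length and multiplicity. It assumes the label is written exactly as two integers in the form `l(m)`; the names `EdgeLabel` and `parse_edge_label` are hypothetical and are not part of AliWABA.

```python
import re
from typing import NamedTuple


class EdgeLabel(NamedTuple):
    """Hypothetical container for an l(m) edge label."""

    length: int        # l: length of the alignment on the edge
    multiplicity: int  # m: number of sequences in the alignment


# Assumed textual form: digits, then digits in parentheses, e.g. "135(10)".
_LABEL_PATTERN = re.compile(r"^\s*(\d+)\s*\(\s*(\d+)\s*\)\s*$")


def parse_edge_label(text: str) -> EdgeLabel:
    """Parse an 'l(m)' string; raise ValueError if it does not match."""
    match = _LABEL_PATTERN.match(text)
    if match is None:
        raise ValueError(f"not an l(m) edge label: {text!r}")
    return EdgeLabel(int(match.group(1)), int(match.group(2)))


# The edge (3,4) from the caption: length 135, shared by all 10 sequences.
label = parse_edge_label("135(10)")
shared_by_all = label.multiplicity == 10
```

An edge whose multiplicity equals the number of input sequences, like the one above, marks a block present in every sequence of the alignment.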