Mikita Suyama, David Torrents, Peer Bork.
Abstract
PAL2NAL is a web server that constructs a multiple codon alignment from the corresponding aligned protein sequences. Such codon alignments can be used to evaluate the type and rate of nucleotide substitutions in coding DNA for a wide range of evolutionary analyses, such as the identification of levels of selective constraint acting on genes, or to perform DNA-based phylogenetic studies. The server takes a protein sequence alignment and the corresponding DNA sequences as input. In contrast to other existing applications, this server is able to construct codon alignments even if the input DNA sequence has mismatches with the input protein sequence, or contains untranslated regions and polyA tails. The server can also deal with frame shifts and inframe stop codons in the input models, and is thus suitable for the analysis of pseudogenes. Another distinct feature is that the user can specify a subregion of the input alignment in order to specifically analyze functional domains or exons of interest. The PAL2NAL server is available at http://www.bork.embl.de/pal2nal.
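The core idea of converting a protein alignment into a codon alignment can be illustrated with a minimal Python sketch. This is not PAL2NAL's implementation: it assumes each DNA sequence is an exact, in-frame coding sequence for its protein, with no UTRs, polyA tails, frame shifts, or mismatches, all of which the server itself is described as handling. The function names `thread_codons` and `codon_alignment` are hypothetical.

```python
def thread_codons(aligned_protein: str, cds: str, gap: str = "-") -> str:
    """Place the codons of `cds` under the residues of a gapped protein sequence.

    Assumption: `cds` starts at the first codon of the protein and is in frame.
    Each protein gap becomes a three-character gap in the codon alignment.
    """
    residues = sum(1 for aa in aligned_protein if aa != gap)
    if len(cds) < 3 * residues:
        raise ValueError("coding sequence is too short for the protein")

    codons = []
    pos = 0  # current offset into the ungapped coding sequence
    for aa in aligned_protein:
        if aa == gap:
            codons.append(gap * 3)
        else:
            codons.append(cds[pos:pos + 3])
            pos += 3
    return "".join(codons)


def codon_alignment(protein_alignment: dict, cds_by_name: dict) -> dict:
    """Apply `thread_codons` to every sequence in a protein alignment."""
    return {
        name: thread_codons(seq, cds_by_name[name])
        for name, seq in protein_alignment.items()
    }


if __name__ == "__main__":
    # Toy data, invented for illustration only.
    proteins = {"seqA": "MK-L", "seqB": "MKAL"}
    dna = {"seqA": "ATGAAACTG", "seqB": "ATGAAGGCTCTG"}
    for name, row in codon_alignment(proteins, dna).items():
        print(name, row)
```

Because every protein column maps to exactly three nucleotide columns, all rows of the resulting codon alignment have the same length whenever the input protein rows do.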
Year: 2006 PMID: 16845082 PMCID: PMC1538804 DOI: 10.1093/nar/gkl315
Source DB: PubMed Journal: Nucleic Acids Res ISSN: 0305-1048 Impact factor: 16.971
Figure 1. An example of PAL2NAL input and output files. (A) The first input file: A multiple sequence alignment of human dihydrofolate reductase (GenBank accession no. BC070280) and its pseudogene in the CLUSTAL format with the notation used in GeneWise for frame shifts. Frame shifts and inframe stop codons in the pseudogene are shown in orange. Under the alignment, arbitrarily selected blocks are specified with ‘#’. (B) The second input file: The corresponding DNA (or mRNA) sequences in the FASTA format. UTRs and polyA tails are shown in cyan to indicate how these regions are excluded from the resulting output. (C) Output with the default option setting. The position of the codon that does not correspond with the input protein sequence is shown in red. The alignment-block regions that correspond to those specified in the input protein alignment are indicated by ‘#’. (D) Output with the following option setting: Remove mismatches, yes; Use only selected positions, yes; Output format, PAML. With this setting, the codon alignment corresponding to the specified regions is generated in the PAML format.
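The "Use only selected positions" option described in the caption can be pictured as filtering codon columns with the ‘#’ mask line. The following Python sketch is an illustration under assumptions, not the server's code: it takes a mask string with one character per protein alignment column and keeps only the codons under ‘#’. The name `select_marked_codons` is hypothetical, and writing the result in the PAML format is not shown.

```python
def select_marked_codons(codon_seq: str, mask: str, marker: str = "#") -> str:
    """Keep only the codons whose protein column is marked in `mask`.

    Assumption: `mask` has one character per protein alignment column,
    so the codon row must be exactly three times as long as the mask.
    """
    if len(codon_seq) != 3 * len(mask):
        raise ValueError("codon row length must be 3 x mask length")
    return "".join(
        codon_seq[3 * i:3 * i + 3]
        for i, flag in enumerate(mask)
        if flag == marker
    )


if __name__ == "__main__":
    # Toy row and mask, invented for illustration only.
    row = "ATGAAA---CTG"
    mask = "#  #"  # select the first and last protein columns
    print(select_marked_codons(row, mask))
```

Applying the same mask to every row keeps the selected sub-alignment rectangular, which downstream formats generally require.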