Yiming Bao, Vyacheslav Chetvernin, Tatiana Tatusova.
Abstract
PAirwise Sequence Comparison (PASC) is a tool that uses genome sequence similarity to help with virus classification. The PASC tool at NCBI uses two methods: local alignment based on BLAST and global alignment based on the Needleman-Wunsch algorithm. It supports complete genomes from several virus families and groups; for the family Filoviridae, it currently includes the 52 complete genomes available in GenBank. The BLAST-based alignment approach has been shown to work better for filoviruses and is therefore recommended for establishing taxon demarcation criteria. When more genome sequences with high divergence become available, these demarcations will most likely become more precise. The tool can compare new filovirus genome sequences with those already in the database and propose their taxonomic classification.
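As an illustration of the global-alignment method named in the abstract, the following is a minimal sketch of a Needleman-Wunsch alignment that reports percent identity for two sequences. The scoring values (match, mismatch, linear gap penalty), the function name `global_identity`, and the choice of alignment length as the denominator are assumptions made for this example; the abstract does not state PASC's actual parameters or identity definition, and a quadratic-memory table like this one suits short sequences only, not whole genomes.

```python
def global_identity(a: str, b: str, match: int = 1,
                    mismatch: int = -1, gap: int = -2) -> float:
    """Percent identity from a Needleman-Wunsch global alignment.

    Assumed scoring: linear gap cost, identity = identical columns
    divided by total alignment columns (illustrative choice only).
    """
    n, m = len(a), len(b)
    # score[i][j] holds the best score for aligning a[:i] with b[:j].
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap

    def sub(i: int, j: int) -> int:
        # Substitution score for the residues ending prefixes i and j.
        return match if a[i - 1] == b[j - 1] else mismatch

    for i in range(1, n + 1):
        for j in range(1, m + 1):
            score[i][j] = max(score[i - 1][j - 1] + sub(i, j),
                              score[i - 1][j] + gap,
                              score[i][j - 1] + gap)

    # Trace back from the bottom-right cell, counting alignment columns
    # and the columns in which both residues are identical.
    i, j, identical, columns = n, m, 0, 0
    while i > 0 or j > 0:
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + sub(i, j):
            identical += a[i - 1] == b[j - 1]
            i, j = i - 1, j - 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            i -= 1  # gap in b
        else:
            j -= 1  # gap in a
        columns += 1
    return 100.0 * identical / columns if columns else 0.0
```

Calling `global_identity(seq_a, seq_b)` on two short nucleotide strings returns a value between 0 and 100. Because a global alignment forces every position of both sequences into the alignment, highly divergent regions pull this number down, whereas a local, BLAST-style comparison only scores the regions that align well.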
Keywords: Filoviridae; ICTV; International Committee on Taxonomy of Viruses; NCBI; National Center for Biotechnology Information; PASC; PAirwise Sequence Comparison; filovirus; virus classification; virus taxonomy
Year: 2012 PMID: 23012628 PMCID: PMC3446765 DOI: 10.3390/v4081318
Source DB: PubMed Journal: Viruses ISSN: 1999-4915 Impact factor: 5.048
Figure 1. Frequency distribution of pairwise identities for the complete genome sequences of filoviruses, and its application in classifying newly sequenced viruses. The two plots show results obtained with BLAST-based alignments and global alignments, respectively. The green, yellow, and peach bars represent pairs of genomes that are assigned, in the current National Center for Biotechnology Information (NCBI) Taxonomy Database, to the same species, to different species in the same genus, and to different genera, respectively. The top matches for each input genome (JQ352763 and DQ447657) to the existing genomes in GenBank and to the other input genome are listed, and their pairwise identities are shown. The small red bar on the X-axis of the top plot indicates the percentage identity of the selected pair (#22, which is highlighted). Not all virus species names listed reflect the most recent International Committee on Taxonomy of Viruses (ICTV) species names.
Figure 2. Dot matrix and text views of pairwise alignment between genome sequences of Ebola virus (NC_002549) and Taï Forest virus (NC_014372), using the BLAST-based and global alignment methods. Not all virus species names listed reflect the most recent ICTV species names.
Figure 3. Frequency distribution of pairwise identities from the complete genome sequences of marburgviruses. The three peaks from right to left represent the pairs of viruses within each of the five major lineages of marburgviruses, between the four lineages of Marburg virus (MARV), and between Ravn virus (RAVV) and MARV.
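The frequency distributions described in the figure captions can be sketched roughly as follows: every genome pair is placed in an identity bin and labelled by how the two genomes relate in a taxonomy table (same species, same genus but different species, or different genera). The function names, the `(genus, species)` tuple layout, the dictionary keyed by accession pairs, and the 1% bin width are all assumptions made for this illustration; they do not describe PASC's internal data structures.

```python
from collections import Counter
from itertools import combinations


def pair_category(tax_a: tuple, tax_b: tuple) -> str:
    """Classify a genome pair from two (genus, species) tuples."""
    if tax_a == tax_b:
        return "same species"
    if tax_a[0] == tax_b[0]:
        return "same genus, different species"
    return "different genera"


def identity_histogram(identities: dict, taxonomy: dict,
                       bin_width: int = 1) -> dict:
    """Bin pairwise identities per taxonomic category.

    identities: maps frozenset({acc_a, acc_b}) -> percent identity
    taxonomy:   maps accession -> (genus, species)
    Returns {category: Counter({bin_start: number_of_pairs})}.
    """
    hist = {}
    for acc_a, acc_b in combinations(sorted(taxonomy), 2):
        pct = identities[frozenset((acc_a, acc_b))]
        # Lower edge of the bin this identity value falls into.
        bin_start = int(pct // bin_width) * bin_width
        category = pair_category(taxonomy[acc_a], taxonomy[acc_b])
        hist.setdefault(category, Counter())[bin_start] += 1
    return hist
```

With such a histogram, a gap between the bins occupied by "same species" pairs and those occupied by "same genus, different species" pairs suggests where a species demarcation threshold could be drawn, which is the kind of reading the plots are meant to support.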