
Evaluation measures of multiple sequence alignments.

G H Gonnet, C Korostensky, S Benner.

Abstract

Multiple sequence alignments (MSAs) are frequently used in the study of families of protein sequences or DNA/RNA sequences. They are a fundamental tool for the understanding of the structure, functionality and, ultimately, the evolution of proteins. A new algorithm, the Circular Sum (CS) method, is presented for formally evaluating the quality of an MSA. It is based on the use of a solution to the Traveling Salesman Problem, which identifies a circular tour through an evolutionary tree connecting the sequences in a protein family. With this approach, the calculation of an evolutionary tree and the errors that it would introduce can be avoided altogether. The algorithm gives an upper bound, the best score that can possibly be achieved by any MSA for a given set of protein sequences. Alternatively, if presented with a specific MSA, the algorithm provides a formal score for the MSA, which serves as an absolute measure of the quality of the MSA. The CS measure yields a direct connection between an MSA and the associated evolutionary tree. The measure can be used as a tool for evaluating different methods for producing MSAs. A brief example of the last application is provided. Because it weights all evolutionary events on a tree identically, but does not require the reconstruction of a tree, the CS algorithm has advantages over the frequently used sum-of-pairs measures for scoring MSAs, which weight some evolutionary events more strongly than others. Compared to other weighted sum-of-pairs measures, it has the advantage that no evolutionary tree must be constructed, because we can find a circular tour without knowing the tree.
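The circular-tour idea in the abstract can be illustrated with a small sketch. It is not the authors' implementation: the function name `circular_tour`, the four-leaf tree, and its edge lengths are hypothetical, and the brute-force search stands in for whatever Traveling Salesman solver the paper uses. It relies only on a general property of tree metrics: the shortest closed tour through the leaves crosses every edge exactly twice, so half the tour length equals the total tree length, and the tour can be found from pairwise distances alone.

```python
from itertools import permutations

# Hypothetical tree: leaves A, B hang off internal node X; C, D hang off Y.
# Edge lengths (assumed for illustration): A-X=1, B-X=2, X-Y=3, C-Y=4, D-Y=5.
TREE_LENGTH = 1 + 2 + 3 + 4 + 5  # 15

# Pairwise leaf-to-leaf distances implied by that tree (order: A, B, C, D).
DIST = [
    [0, 3, 8, 9],
    [3, 0, 9, 10],
    [8, 9, 0, 9],
    [9, 10, 9, 0],
]


def circular_tour(dist):
    """Brute-force the shortest closed tour over all sequences.

    Only pairwise distances are used; the tree itself is never built.
    Feasible only for a handful of sequences (factorial running time).
    """
    n = len(dist)
    best_order, best_len = None, None
    # Fix sequence 0 as the start to skip rotations of the same tour.
    for perm in permutations(range(1, n)):
        order = (0,) + perm
        length = sum(dist[order[i]][order[(i + 1) % n]] for i in range(n))
        if best_len is None or length < best_len:
            best_order, best_len = order, length
    return best_order, best_len


order, length = circular_tour(DIST)

# Sum-of-pairs adds up every pair, so the internal edge X-Y is counted
# four times while each leaf edge is counted only three times.
sum_of_pairs = sum(DIST[i][j] for i in range(4) for j in range(i + 1, 4))

print(order, length, length / 2, sum_of_pairs)
```

In this toy case the tour length is twice the tree length, so every edge contributes with the same weight, whereas the sum-of-pairs total weights the internal edge more heavily than the leaf edges.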

Year:  2000        PMID: 10890401     DOI: 10.1089/10665270050081513

Source DB:  PubMed          Journal:  J Comput Biol        ISSN: 1066-5277            Impact factor:   1.479


  6 in total

1.  Evaluation of Trace Alignment Quality and its Application in Medical Process Mining.

Authors:  Moliang Zhou; Sen Yang; Xinyu Li; Shuyu Lv; Shuhong Chen; Ivan Marsic; Richard Farneth; Randall Burd
Journal:  IEEE Int Conf Healthc Inform       Date:  2017-09-14

2.  Assessing Activity Pattern Similarity with Multidimensional Sequence Alignment based on a Multiobjective Optimization Evolutionary Algorithm.

Authors:  Mei-Po Kwan; Ningchuan Xiao; Guoxiang Ding
Journal:  Geogr Anal       Date:  2015-07

3.  Estimates of positive Darwinian selection are inflated by errors in sequencing, annotation, and alignment.

Authors:  Adrian Schneider; Alexander Souvorov; Niv Sabath; Giddy Landan; Gaston H Gonnet; Dan Graur
Journal:  Genome Biol Evol       Date:  2009-06-05       Impact factor: 3.416

4.  Integrating protein structures and precomputed genealogies in the Magnum database: examples with cellular retinoid binding proteins.

Authors:  Michael E Bradley; Steven A Benner
Journal:  BMC Bioinformatics       Date:  2006-02-23       Impact factor: 3.169

5.  Multiple Alignment of Promoter Sequences from the Arabidopsis thaliana L. Genome.

Authors:  Eugene V Korotkov; Yulia M Suvorova; Dmitrii O Kostenko; Maria A Korotkova
Journal:  Genes (Basel)       Date:  2021-01-21       Impact factor: 4.096

6.  Gene fusions and gene duplications: relevance to genomic annotation and functional analysis.

Authors:  Margrethe H Serres; Monica Riley
Journal:  BMC Genomics       Date:  2005-03-09       Impact factor: 3.969

