An information theoretic view of gapped and other alignments.

J P Schmidt.

Abstract

We use an information-theoretic framework to estimate the probability of the score of gapped alignments. With appropriate scaling, the score of a global alignment of two sequences (and, with some adjustments, the score of a local alignment) can be viewed as the difference in the number of bits needed to transmit the two sequences T1 and T2 under two different encoding schemes, C1 and C2. C1 is an idealized scheme, assumed to achieve an optimal encoding with respect to a distribution p under the assumption that T1 and T2 are independent. C2 is an alternate scheme that transmits T1 and T2 while taking advantage of the optimal alignment between the two. That is, under C1, the strings T1 and T2 (with respective probabilities p(T1) and p(T2)) are assumed to be encoded using C1(T1, T2) = log [formula: see text] bits. By slightly modifying a known theorem, we show that the probability (under p) that two independent sequences T1 and T2 can be transmitted with an alternate encoding scheme (C2) using no more than C1(T1, T2) - r bits is bounded by 2^(-r). We then show how to use this bound to derive upper bounds for the probability of gapped alignment scores between two sequences.
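The bound stated in the abstract can be checked numerically on a toy case. The sketch below is not the paper's construction, and everything in it is an assumption made for illustration: a two-letter alphabet, strings of length 3, a uniform independent model p standing in for C1, and a second distribution q that favors pairs with few mismatches standing in for an alignment-aware scheme C2. Code lengths are idealized as log2(1/probability), so q plays the role of any scheme whose lengths satisfy the Kraft inequality. The function `tail_probability` is a hypothetical helper name.

```python
import itertools
import math


def tail_probability(p, q, r):
    """Probability under p that the q-code is at least r bits shorter than the p-code.

    Code lengths are idealized as log2(1/probability); p and q are dicts
    mapping the same outcomes to probabilities.
    """
    return sum(
        px
        for x, px in p.items()
        if q[x] > 0 and -math.log2(q[x]) <= -math.log2(px) - r
    )


# Toy setup (all assumptions): binary alphabet, strings of length 3.
alphabet = "ab"
n = 3
strings = ["".join(s) for s in itertools.product(alphabet, repeat=n)]

# Stand-in for C1: T1 and T2 independent and uniform, p(T1, T2) = p(T1) * p(T2).
p = {(s, t): (1 / len(strings)) ** 2 for s in strings for t in strings}


# Stand-in for C2: a joint model that rewards pairs with few mismatches
# (an ungapped caricature of "taking advantage of the alignment").
def weight(s, t):
    mismatches = sum(a != b for a, b in zip(s, t))
    return 0.9 ** (n - mismatches) * 0.1 ** mismatches


raw = {(s, t): weight(s, t) for s in strings for t in strings}
total = sum(raw.values())
q = {pair: w / total for pair, w in raw.items()}

# Compare the observed tail probability with the bound 2^(-r).
for r in range(5):
    print(r, tail_probability(p, q, r), 2.0 ** -r)
```

In this toy case only identical pairs are compressed by the second model, so the tail probability stays well below 2^(-r) for every r tried. The inequality itself follows because any outcome counted in the tail has q at least 2^r times p, and q sums to at most 1.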

Year:  1998        PMID: 9697212

Source DB:  PubMed          Journal:  Pac Symp Biocomput        ISSN: 2335-6928


  2 in total

1.  Aligning sequences by minimum description length.

Authors:  John S Conery
Journal:  EURASIP J Bioinform Syst Biol       Date:  2007

2.  Information theoretic perspective on genome clustering.

Authors:  Alaguraj Veluchamy; Preeti Mehta; K V Srividhya; Hirendra Vikram; M K Govind; Ramneek Gupta; Abdul Aziz Bin Dukhyil; Raed Abdullah Alharbi; Saleh Abdullah Aloyuni; Mohamed M Hassan; S Krishnaswamy
Journal:  Saudi J Biol Sci       Date:  2020-12-31       Impact factor: 4.219
