Laurent Noé, Gregory Kucherov.
Abstract
BACKGROUND: The hit criterion is a key component of heuristic local alignment algorithms. It specifies a class of patterns assumed to witness a potential similarity, and this choice is decisive for the selectivity and sensitivity of the whole method.
Year: 2004 PMID: 15485572 PMCID: PMC526756 DOI: 10.1186/1471-2105-5-149
Source DB: PubMed Journal: BMC Bioinformatics ISSN: 1471-2105 Impact factor: 3.169
Figure 1. Hit Probability. Hit probability as a function of length of fixed-score alignments
Bernoulli Model. Hit probability of seeds on Bernoulli sequences of length 64 with match probability 0.7 and transition/transversion probabilities 0.15
| weight | spaced seed | hit proba | transition-constrained seed | hit proba |
| 9 | | 0.7291 | | 0.7366 |
| 10 | B10 = ##_##___##_#_### | 0.5957 | | 0.6056 |
| 11 | | 0.4671 | | 0.4784 |
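The hit probability of a plain spaced seed in the Bernoulli setting of the table above can be computed exactly by dynamic programming over the most recent alignment columns. The following is a minimal sketch, not the authors' implementation. It assumes that an alignment is a sequence of independent match/mismatch columns, that `#` requires a match and `_` accepts anything, and that a single seed occurrence anywhere in the alignment counts as a hit. Transition-constrained seeds are not covered. The names `seed_mask` and `hit_probability` are illustrative.

```python
from collections import defaultdict


def seed_mask(pattern: str) -> int:
    """Encode a spaced seed as a bit mask: '#' -> 1 (must match), '_' -> 0."""
    return int(pattern.replace("#", "1").replace("_", "0"), 2)


def hit_probability(pattern: str, length: int, p_match: float) -> float:
    """Probability that the seed occurs at least once in a Bernoulli
    alignment of the given length (each column matches with p_match)."""
    span = len(pattern)
    mask = seed_mask(pattern)
    keep = (1 << (span - 1)) - 1  # bits of the last span-1 columns

    # states[suffix] = probability of reaching this suffix with no hit so far
    states = {0: 1.0}
    for i in range(length):
        nxt = defaultdict(float)
        for suffix, prob in states.items():
            for bit, p_bit in ((1, p_match), (0, 1.0 - p_match)):
                window = (suffix << 1) | bit  # last `span` columns
                if i + 1 >= span and (window & mask) == mask:
                    continue  # seed hit: this mass leaves the no-hit set
                nxt[window & keep] += prob * p_bit
        states = nxt

    return 1.0 - sum(states.values())
```

Under these assumptions, `hit_probability("##_##___##_#_###", 64, 0.7)` evaluates the weight-10 seed listed in the table. The state space grows as 2 to the power of the seed span minus one, so this direct approach is only practical for short seeds.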
Figure 2. Seed Probability. Hit probability of seed models on Bernoulli sequences as a function of the ti/tv ratio
Markov Model. Hit probability of seeds on a Markov model of order 5 trained on a large mixed sample of cross-species alignments
| weight | spaced seed | hit proba | transition-constrained seed | hit proba |
| 9 | | 0.822 | | 0.845 |
| 10 | | 0.716 | | 0.746 |
| 11 | | 0.603 | | 0.632 |
Seed experiments. Number of high-scoring similarities found with different seed patterns
| sequences | | | | | | | | |
| IX/V | 323 | 336 | 275 | 279 | 312 | 325 | 274 | 293 |
| IX/XVI | 342 | 354 | 271 | 280 | 349 | 357 | 280 | 295 |
| XVI/IV | 1314 | 1361 | 1124 | 1172 | 1309 | 1348 | 1180 | 1235 |
| MC58/Z2491 | 361896 | 380028 | 341113 | 364792 | 385444 | 392164 | 359348 | 366759 |
Comparative Tests. Comparative tests of YASS vs bl2seq (NCBI BLAST 2.2.6). Reported execution times were obtained on a Pentium IV 2.4 GHz computer. Y. = YASS, B. = bl2seq.
| sequence 1 name | sequence 1 size | sequence 2 name | sequence 2 size | time (sec) Y. | time (sec) B. | # align. Y. | # align. B. | # ex. align. Y. | # ex. align. B. | ex. align. length Y. | ex. align. length B. |
| | 3.6 | | 4.4 | 122 | 148 | 494 | 310 | 130 | 27 | 29145 | 7970 |
| | 3.6 | | 3.3 | 161 | 163 | 578 | 369 | 168 | 63 | 37310 | 30138 |
| | 3.6 | | 4.6 | 156 | 253 | 901 | 617 | 186 | 54 | 39354 | 19994 |
| | 3.6 | | 3.3 | 164 | 167 | 940 | 465 | 349 | 60 | 65788 | 28883 |
| | 4.4 | | 3.3 | 211 | 542 | 1851 | 1265 | 397 | 160 | 102103 | 80012 |
| | 4.4 | | 4.6 | 168 | 255 | 738 | 515 | 197 | 86 | 44348 | 23361 |
| | 4.4 | | 3.3 | 72 | 69 | 498 | 295 | 171 | 30 | 36474 | 12021 |
| | 3.3 | | 4.6 | 130 | 161 | 962 | 640 | 186 | 45 | 34538 | 11277 |
| | 3.3 | | 3.3 | 95 | 93 | 1109 | 687 | 197 | 72 | 42009 | 21575 |
| | 4.6 | | 3.3 | 149 | 217 | 2900 | 1953 | 622 | 264 | 186585 | 110352 |
C.g.: Corynebacterium glutamicum ATCC 13032,
M.t.: Mycobacterium tuberculosis (CDC1551),
S.sp.: Synechocystis sp. PCC 6803,
V.p.: Vibrio parahaemolyticus RIMD 2210633 chr I,
Y.p.: Yersinia pestis KIM