Distribution of words with a predefined range of mismatches to a DNA probe in bacterial genomes.

O Michael Melko, Arcady R Mushegian.

Abstract

MOTIVATION: Hybridization of oligonucleotides with longer nucleotide sequences is an essential step in nucleic acid biosynthesis in vitro and in vivo, in oligonucleotide-based diagnostics, and in therapeutic applications of oligonucleotides. A major factor determining sensitivity and selectivity of hybridization is the number of base pair mismatches that occur in an ungapped alignment of the oligonucleotide (probe) and a longer sequence (target).
RESULTS: The k-distance match count between the probe and the target is defined as the number of ungapped alignments between the two sequences that have exactly k mismatches, and the k-neighbor match count is defined as the sum of the j-distance match counts for j between 0 and k. We derive a novel formula for the probability of a k-distance match. This formula is based on the assumption that the target is strand-symmetric Bernoulli text (i.e. nucleotides are independently, identically distributed in the target and satisfy Chargaff's second parity rule). Our model predicts that the GC-content in both the probe and the target significantly affects the match count expectation. The ratio of k-neighbor match counts in two distinct genomes for a given probe is a measure of its specificity. We calculated such ratios for pairs of bacterial genomes with different combinations of length, GC-content and phylogenetic distance. Examination of the extreme values of these ratios indicates that probes with a high discriminative power exist for each tested pair.
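The two match counts defined above can be illustrated with a short Python sketch. It is not the authors' implementation, and the function names are hypothetical. It assumes uppercase A/C/G/T strings and scans only the given strand of the target. The last function is a hedged illustration of how GC-content enters the probability of a k-distance match under the Bernoulli assumption (P(G) = P(C) = gc/2, P(A) = P(T) = (1 - gc)/2); it is a direct enumeration over GC and AT positions of the probe, not necessarily the formula derived in the paper.

```python
from math import comb


def k_distance_match_counts(probe: str, target: str) -> list[int]:
    """Return counts where counts[k] is the number of ungapped alignments
    of probe against target that have exactly k mismatches."""
    n = len(probe)
    counts = [0] * (n + 1)
    # Slide the probe along the target, one ungapped alignment per offset.
    for start in range(len(target) - n + 1):
        window = target[start:start + n]
        mismatches = sum(a != b for a, b in zip(probe, window))
        counts[mismatches] += 1
    return counts


def k_neighbor_match_count(counts: list[int], k: int) -> int:
    """Sum of the j-distance match counts for j between 0 and k."""
    return sum(counts[:k + 1])


def k_distance_probability(probe: str, k: int, gc: float) -> float:
    """Probability that one alignment has exactly k mismatches, assuming
    the target is i.i.d. text with GC-content gc (illustrative model)."""
    p_gc = gc / 2          # match probability at a G or C probe position
    p_at = (1 - gc) / 2    # match probability at an A or T probe position
    s = sum(base in "GC" for base in probe)   # number of G/C positions
    w = len(probe) - s                        # number of A/T positions
    total = 0.0
    # Split the k mismatches: i at G/C positions, k - i at A/T positions.
    for i in range(max(0, k - w), min(s, k) + 1):
        j = k - i
        total += (comb(s, i) * (1 - p_gc) ** i * p_gc ** (s - i)
                  * comb(w, j) * (1 - p_at) ** j * p_at ** (w - j))
    return total


if __name__ == "__main__":
    counts = k_distance_match_counts("ACG", "ACGTACC")
    print(counts, k_neighbor_match_count(counts, 1))
    print(k_distance_probability("ACG", 0, 0.5))
```

Under this model, multiplying `k_distance_probability` by the number of alignment positions gives an expected k-distance match count, which shows why the GC-content of both probe and target matters. Comparing `k_neighbor_match_count` values for the same probe in two genomes gives the kind of specificity ratio described in the abstract.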

Year: 2004    PMID: 14693810    DOI: 10.1093/bioinformatics/btg374

Source DB: PubMed    Journal: Bioinformatics    ISSN: 1367-4803    Impact factor: 6.937


  2 in total

1.  Global assessment of cross-hybridization for oligonucleotide arrays.

Authors:  Cavan Reilly; Arvind Raghavan; Paul Bohjanen
Journal:  J Biomol Tech       Date:  2006-04

2.  Asymptotic behaviour and optimal word size for exact and approximate word matches between random sequences.

Authors:  Sylvain Forêt; Miriam R Kantorovitz; Conrad J Burden
Journal:  BMC Bioinformatics       Date:  2006-12-18       Impact factor: 3.169
