Ravi Vijaya Satya, Nela Zavaljevski, Kamal Kumar, Elizabeth Bode, Susana Padilla, Leonard Wasieloski, Jeanne Geyer, Jaques Reifman.
Abstract
BACKGROUND: With multiple strains of various pathogens being sequenced, it is necessary to develop high-throughput methods that can simultaneously process multiple bacterial or viral genomes to find common fingerprints as well as fingerprints that are unique to each individual genome. We present algorithmic enhancements to an existing single-genome pipeline that allow for efficient design of microarray probes common to groups of target genomes. The enhanced pipeline takes advantage of the similarities in the input genomes to narrow the search to short, nonredundant regions of the target genomes and thereby significantly reduces the computation time. The pipeline also computes a three-state hybridization matrix, which gives the expected hybridization of each probe with each target.
Year: 2008 PMID: 18940003 PMCID: PMC2596143 DOI: 10.1186/1471-2164-9-496
Source DB: PubMed Journal: BMC Genomics ISSN: 1471-2164 Impact factor: 3.969
Figure 1. Overview of the TOFI pipeline. Stage 1 and Stage 3 of the TOFI pipeline have been improved to handle multiple genomes. In Stage 1, the target genomes are compared with each other to eliminate redundant sequences. In Stage 3, an in silico hybridization matrix is computed, which indicates which probes hybridize to which targets.
NCBI accession numbers and sizes of the Burkholderia genomes used for probe design
| Strain | Accession no./version | Size (bp) |
| 1 | NC_009076.1, NC_009078.1 | 7089249 |
| 2 | NC_007434.1, NC_007435.1 | 7308054 |
| 3 | NC_009074.1, NC_009075.1 | 7040403 |
| 4 | NC_006350.1, NC_006351.1 | 7247547 |
| 5 | NC_006348.1, NC_006349.1 | 5835527 |
| 6 | NC_008835.1, NC_008836.1 | 5742303 |
| 7 | NC_009079.1, NC_009080.1 | 5848380 |
| 8 | NC_008784.1, NC_008785.1 | 5232401 |
| 9 | NC_007651.1, NC_007650.1 | 6723972 |
The specificity thresholds used for probe design
Expected behavior of the 5015 designed probes
| Target | Total | Unique | Group | Common |
| 1 | 2710 | 259 | 504 | 981 |
| 2 | 3346 | 739 | 504 | 981 |
| 3 | 2597 | 601 | 504 | 981 |
| 4 | 3084 | 613 | 504 | 981 |
| 5 | 1373 | 0 | 31 | 981 |
| 6 | 1339 | 0 | 31 | 981 |
| 7 | 1567 | 0 | 31 | 981 |
| 8 | 1164 | 0 | 31 | 981 |
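As a rough illustration of how per-target counts like those in the table can be derived from an expected-hybridization matrix, the sketch below tallies, for each target, the total number of probes expected to hybridize, those expected to hybridize with that target only, and those expected to hybridize with every target. It is not the TOFI implementation: the function name `summarize_expected`, the probe and target labels, and the simplification of the three-state matrix to a hit/no-hit set per probe are all assumptions made for this example. The Group column is omitted because the group definitions are not given here.

```python
from typing import Dict, Iterable, Set


def summarize_expected(
    expected: Dict[str, Set[str]], targets: Iterable[str]
) -> Dict[str, Dict[str, int]]:
    """Count total, unique, and common probes per target.

    `expected` maps a probe ID to the set of targets it is expected to
    hybridize with (a hypothetical, binary simplification of the
    three-state hybridization matrix).
    """
    target_list = list(targets)
    n_targets = len(target_list)
    summary = {t: {"total": 0, "unique": 0, "common": 0} for t in target_list}

    for hits in expected.values():
        for target in hits:
            row = summary[target]
            row["total"] += 1
            if len(hits) == 1:          # probe hits this target only
                row["unique"] += 1
            if len(hits) == n_targets:  # probe hits every target
                row["common"] += 1
    return summary


# Hypothetical toy input: three probes, three targets.
toy = {"p1": {"t1"}, "p2": {"t1", "t2"}, "p3": {"t1", "t2", "t3"}}
counts = summarize_expected(toy, ["t1", "t2", "t3"])
```

Under this counting, the Common value is the same for every target, as it is in the table, while Total can exceed Unique plus Common because probes shared by only some targets are counted in Total alone.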
Background hybridization intensities averaged over three chips for three Burkholderia genomes
| Median background intensity ( | 3710 | 3076 | 3890 |
| Standard deviation of background intensity (σ) | 548 | 463 | 569 |
| | 0.53 | 0.54 | 0.53 |
| | 0.92 | 0.92 | 0.90 |
Evaluation of in silico (design) probes against hybridization results with R = 1.0 and R = 0.5 for five categories of probes
| Category (targets) | | Experimental: Class A | Experimental: Class B | Experimental: Class C |
| I ( | 0 | 0 (0%) | 0 (0%) | 0 (0%) |
| II ( | 2 | 0 (0%) | 2 (100%) | 0 (0%) |
| III ( | 523 | 420 (80%) | 53 (10%) | 50 (10%) |
| IV ( | 21 | 17 (81%) | 2 (10%) | 2 (10%) |
| V ( | 431 | 184 (43%) | 12 (3%) | 236 (55%) |
Probes that behave as expected are categorized as Class A; i.e., these probes have normalized hybridization intensity greater than R with intended targets and less than R with non-targets. Class B probes have normalized hybridization intensity greater than R with non-targets, and Class C probes have normalized hybridization intensity less than R with the intended targets.
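The class definitions above can be read as a simple thresholding rule, sketched below. This is an illustration only, not the authors' evaluation code: `classify_probe`, `r_target`, and `r_nontarget` are hypothetical names, the use of one threshold for intended targets and another for non-targets is an assumption, and so is the mapping of the two R values in the table caption (1.0 and 0.5) onto those parameters. The sketch also assumes that a probe can fall into both Class B and Class C, and that an intensity exactly equal to a threshold does not count as a failure.

```python
from typing import Dict, Set


def classify_probe(
    intensity: Dict[str, float],
    targets: Set[str],
    r_target: float,
    r_nontarget: float,
) -> Set[str]:
    """Return the classes ('A', 'B', 'C') that apply to one probe.

    `intensity` maps each genome to the probe's normalized hybridization
    intensity; `targets` is the set of intended target genomes.
    """
    classes: Set[str] = set()
    for genome, value in intensity.items():
        if genome in targets:
            if value < r_target:
                classes.add("C")  # too weak on an intended target
        elif value > r_nontarget:
            classes.add("B")      # too strong on a non-target
    # Assumption: a probe with no B or C failure behaves as expected.
    return classes or {"A"}


# Hypothetical intensities for one probe designed against g1 and g2.
as_expected = classify_probe(
    {"g1": 1.4, "g2": 1.2, "g3": 0.1}, {"g1", "g2"}, r_target=1.0, r_nontarget=0.5
)
misbehaving = classify_probe(
    {"g1": 1.4, "g2": 0.3, "g3": 0.9}, {"g1", "g2"}, r_target=1.0, r_nontarget=0.5
)
```

Returning a set rather than a single label keeps the B and C failure modes separate when both occur for the same probe.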
Experimental hybridization results of group-specific in silico probes tested against B. pseudomallei 238
| Category (targets) | Probes hybridizing with | |
| VI ( | 302 | 236 (78%) |
| VII ( | 92 | 60 (65%) |
Figure 2. Histograms of normalized hybridization intensities for the 382 probes that have 100% identity with the three target genomes. The X-axis shows the normalized hybridization intensities and the Y-axis shows the number of probes that have a given normalized hybridization intensity. Many of the 382 probes fail to hybridize with B. pseudomallei K96243 even though all these probes have 100% identity with this genome, whereas hybridization intensities for the other three genomes are as expected.