Jennifer G Mulle, Viren C Patel, Stephen T Warren, Madhuri R Hegde, David J Cutler, Michael E Zwick.
Abstract
DNA-based microarrays are increasingly central to biomedical research. Selecting oligonucleotide sequences that will behave consistently across experiments is essential to the design, production and performance of DNA microarrays. Here our aim was to improve on probe design parameters by empirically and systematically evaluating probe performance in a multivariate context. We used experimental data from 19 array CGH hybridizations to assess the probe performance of 385,474 probes tiled in the Duchenne muscular dystrophy (DMD) region of the X chromosome. Our results demonstrate that probe melting temperature, single nucleotide polymorphisms (SNPs), and homocytosine motifs all have a strong effect on probe behavior. These findings, when incorporated into future microarray probe selection algorithms, may improve microarray performance for a wide variety of applications.
Year: 2010 PMID: 20360966 PMCID: PMC2847945 DOI: 10.1371/journal.pone.0009921
Source DB: PubMed Journal: PLoS One ISSN: 1932-6203 Impact factor: 3.240
Figure 1. Observed variance in probe performance across multiple array CGH experiments.
384,475 probes are ranked by variance in normalized log(2) ratios across 19 array CGH experiments. Rank order of probe is plotted on the x-axis, variance on the y-axis. A dotted horizontal line is drawn at variance = 0.1, where probes are dichotomized according to “low” or “high” variance. Inset: number of probes that fall into each category.
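The ranking and dichotomization described in this caption can be sketched in a few lines of Python. This is a minimal illustration, not the authors' code: the function name `classify_probes`, the toy probe identifiers and values, the use of sample variance (n − 1 denominator), and the choice to treat a variance strictly greater than 0.1 as "high" are all assumptions. Only the 0.1 cutoff comes from the caption.

```python
from statistics import variance

VARIANCE_CUTOFF = 0.1  # threshold stated in the Figure 1 caption


def classify_probes(log2_ratios, cutoff=VARIANCE_CUTOFF):
    """Rank probes by variance and label each one "low" or "high".

    log2_ratios maps a probe id to its normalized log2 ratios, one value per
    experiment. Sample variance and the strict ">" comparison are assumptions.
    """
    ranked = sorted(
        ((probe, variance(values)) for probe, values in log2_ratios.items()),
        key=lambda item: item[1],
    )
    return [
        (probe, var, "high" if var > cutoff else "low") for probe, var in ranked
    ]


# Toy data (hypothetical values, five experiments per probe).
example = {
    "probe_a": [0.02, -0.05, 0.01, 0.03, -0.02],
    "probe_b": [0.60, -0.70, 0.10, 0.90, -0.40],
}
result = classify_probes(example)
for probe, var, label in result:
    print(f"{probe}: variance={var:.3f} ({label})")
```

With real data, each list would hold 19 values, one per array CGH experiment.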
Figure 2. Examples of data from low- and high-variance probes.
Normalized log(2) ratio data for 5 probes of low variance and 5 probes of high variance across multiple experiments. High-variance probes are likely to have outlier values for one or more experiments, but also have large variance even when outliers are excluded.
Results of univariate analysis, quantitative variables.
| Low Variance Probes | High Variance Probes | p-value |
| 378,058 | 7,417 | |
| 0.039 | 0.379 | |
| 65.13 | 63.87 | <2.2e-16 |
| 56.55 | 56.97 | <2.2e-16 |
| 35.93 | 33.28 | <2.2e-16 |
| 50.47 | 50.56 | 0.4538 |
| 0.141 | 0.158 | 0.000011 |
| 0.144 | 0.157 | 0.032 |
Percent of low and high variance probes with homopolymer runs (nucleotide, variance).
| Size of Homopolymer | A, Low | A, High | T, Low | T, High | C, Low | C, High | G, Low | G, High |
| | 21.3 | 18.0 | 23.8 | 20.6 | 73.7 | 80.1 | 74.0 | 77.5 |
| | 37.3 | 37.8 | 38.0 | 37.9 | 23.1 | 18.0 | 22.8 | 20.2 |
| | 27.8 | 30.3 | 26.2 | 28.4 | 3.2 | 1.9 | 3.2 | 2.3 |
| | 11.5 | 12.2 | 10.3 | 11.8 | na | na | na | na |
| | 1.8 | 1.5 | 1.5 | 1.0 | na | na | na | na |
| | 0.3 | 0.2 | 0.3 | 0.3 | na | na | na | na |
Results of univariate analysis, categorical variables.
| A | C | G | T |
| 62.79 | 69.70 | 160.97 | 51.36 |
| 3.213e-12 | 1.182e-13 | <2.2e-16 | 7.021e-12 |
Pearson's correlations between predictor variables (all p <2.2e-16).
| Tm | Length | |
| 0.966 | −0.466 |
| −0.397 |
Comparison of single-term models with Tm, GC content, and probe length.
| Single-Term Models | AIC |
| | 72273* |
| | 72333 |
| | 73231 |
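To make the gaps between these AIC values concrete, the sketch below computes each model's difference from the best (lowest) AIC and the corresponding relative likelihood, exp(−ΔAIC/2), a standard way to compare AIC values. The three numbers are copied from the table in listed order; because the row labels did not survive extraction, no model names are attached to them. The function names are illustrative.

```python
import math


def delta_aic(aic_values):
    """Difference between each AIC and the smallest AIC in the list."""
    best = min(aic_values)
    return [aic - best for aic in aic_values]


def relative_likelihood(deltas):
    """Relative likelihood of each model versus the best one: exp(-delta / 2)."""
    return [math.exp(-d / 2) for d in deltas]


# AIC values from the single-term model table, in the order listed.
single_term_aic = [72273, 72333, 73231]

deltas = delta_aic(single_term_aic)
weights = relative_likelihood(deltas)
for aic, d, w in zip(single_term_aic, deltas, weights):
    print(f"AIC={aic}  delta={d}  relative likelihood={w:.3g}")
```

Differences of 60 and 958 AIC units correspond to vanishingly small relative likelihoods, so the lowest-AIC single-term model is strongly preferred under this criterion.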
Testing for residual effects of GC content or probe length, after adjusting for Tm.
| Model Includes | Estimates | beta | se | p-value | Model AIC |
| Tm | Tm | −0.111 | 0.003 | <2e-16 | 72273 |
| Tm + GC content | Tm | −0.108 | 0.014 | 8.89e-15 | 72275 |
| | GC content | −0.002 | 0.006 | 0.794 | |
| Tm + length | Tm | −0.113 | 0.003 | <2e-16 | 72273 |
| | Length | −0.005 | 0.003 | 0.126 | |
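One way to read the AIC column is through the definition AIC = 2k − 2 ln L, where k is the number of fitted parameters and L the maximized likelihood. Adding one term raises AIC by 2 unless the log-likelihood improves. The sketch below backs out the implied log-likelihood gain for the two extended models. It assumes each extension adds exactly one parameter, and because the reported AIC values are rounded to integers the results are approximate. The function name is hypothetical.

```python
def implied_loglik_gain(aic_base, aic_extended, extra_params=1):
    """Log-likelihood improvement implied by two AIC values.

    From AIC = 2k - 2 ln L:
        ln L_ext - ln L_base = (2 * extra_params - (aic_extended - aic_base)) / 2
    """
    return (2 * extra_params - (aic_extended - aic_base)) / 2


# AIC values from the table: Tm alone, Tm + GC content, Tm + length.
gain_gc = implied_loglik_gain(72273, 72275)
gain_length = implied_loglik_gain(72273, 72273)

print(f"Tm + GC content: implied log-likelihood gain ~ {gain_gc}")
print(f"Tm + length:     implied log-likelihood gain ~ {gain_length}")
```

Under these assumptions, adding GC content buys essentially no improvement in fit and adding length buys about one log-likelihood unit, which agrees with the non-significant p-values for both extra terms.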
Comparison of models with remaining predictor variables.
| Model With Tm and: | AIC |
| | 72255 |
| | 72268 |
| | 72253 |
| | 72276 |
| | 72270 |
| | 72236 |
Final model including Tm, SNP, and polyC.
| Variable | beta | se | p-value | Final Model AIC |
| Tm | −0.107 | 0.004 | <2e-16 | 72236 |
| SNP | 0.143 | 0.032 | 8.32e-06 | |
| Poly C (3) | −0.134 | 0.031 | 1.81e-05 | |
| Poly C (4) | −0.204 | 0.087 | 0.0185 | |
| Poly C (5) | −9.15 | 54.5 | 0.8667 | |
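The coefficients above are easier to interpret on the odds scale. The sketch below assumes the final model is a logistic regression of high- versus low-variance status, which is consistent with the dichotomized outcome and the use of AIC but is not stated in this record. Under that assumption, exp(beta) is the odds ratio per unit change in the predictor, and exp(beta ± 1.96·se) gives an approximate 95% Wald interval. The Poly C (5) row is omitted because its standard error (54.5) is too large for the estimate to be meaningful.

```python
import math

# (beta, standard error) pairs copied from the final-model table.
FINAL_MODEL = {
    "Tm": (-0.107, 0.004),
    "SNP": (0.143, 0.032),
    "Poly C (3)": (-0.134, 0.031),
    "Poly C (4)": (-0.204, 0.087),
}


def odds_ratio_with_ci(beta, se, z=1.96):
    """Return (odds ratio, lower bound, upper bound) for a Wald interval.

    Assumes beta is a log-odds coefficient from a logistic regression.
    """
    return math.exp(beta), math.exp(beta - z * se), math.exp(beta + z * se)


odds = {name: odds_ratio_with_ci(b, se) for name, (b, se) in FINAL_MODEL.items()}
for name, (ratio, low, high) in odds.items():
    print(f"{name}: OR={ratio:.3f} (95% CI {low:.3f} to {high:.3f})")
```

Read this way, higher Tm and the presence of a poly C run are associated with lower odds of a probe being high-variance, while a SNP in the probe is associated with higher odds.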
Figure 3. Proposed algorithm to refine probe selection.