Cali E Willet, Laura Bunbury-Cruickshank, Diane van Rooy, Georgina Child, Mohammad R Shariflou, Peter C Thomson, Claire M Wade.
Abstract
BACKGROUND: In addition to probe sequence characteristics, noise in hybridization array data is thought to be influenced by competitive hybridization between probes tiled at high densities. Empirical evaluation of competitive hybridization, and an estimate of which other non-sequence-related features might contribute to noisy data, are currently lacking.
Year: 2013 PMID: 23870167 PMCID: PMC3733988 DOI: 10.1186/1471-2105-14-231
Source DB: PubMed Journal: BMC Bioinformatics ISSN: 1471-2105 Impact factor: 3.169
Summary of probe and design variables and hybridization intensities for eight arrays
| Variable | Minimum | Maximum | Median |
|---|---|---|---|
| Raw intensity | 44 | 65530 | 395 |
| Background intensity | 26 | 653.5 | 36 |
| Positive control intensity | 70 | 15103 | 9346.75 |
| Negative control intensity | 39 | 135 | 57.5 |
| Probe GC (%) | 11.67 | 86.67 | 36.67 |
| Free energy (kcal/mol) | -9.48 | 3.8 | 0.48 |
| Off-target matches | 0 | 4496 | 0 |
| PolyA length (bp) | 0 | 19 | 3 |
| PolyC length (bp) | 0 | 17 | 0 |
| PolyG length (bp) | 0 | 19 | 0 |
| PolyT length (bp) | 0 | 19 | 3 |
| Segment length (bp) | 61 | 5760 | 1156 |
| Distance (bp) | 6 | 7791 | 190.5 |
| Offset (bp) | 6 | 26 | na |
| Probe position | 1 | 389 | na |
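The sequence-derived variables in this table (probe GC content and homopolymer run lengths) can be computed directly from a probe sequence. The following is a minimal sketch, not the authors' code: it assumes that "PolyA length" and its counterparts mean the longest uninterrupted run of that base, and the function names and the example sequence are hypothetical.

```python
from itertools import groupby


def gc_percent(seq: str) -> float:
    """Percentage of G and C bases in a probe sequence."""
    seq = seq.upper()
    if not seq:
        raise ValueError("empty sequence")
    return 100.0 * sum(base in "GC" for base in seq) / len(seq)


def longest_run(seq: str, base: str) -> int:
    """Length (bp) of the longest uninterrupted run of `base` (assumed definition)."""
    base = base.upper()
    runs = (len(list(group)) for b, group in groupby(seq.upper()) if b == base)
    return max(runs, default=0)


# Hypothetical 20-mer used only for illustration.
probe = "AAAAGCGCTTTACGGGGGAT"
print(gc_percent(probe))                            # 50.0
print({b: longest_run(probe, b) for b in "ACGT"})   # {'A': 4, 'C': 1, 'G': 5, 'T': 3}
```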
REML output for the probe position dataset
| Term | Test statistic | P-value | Predicted effect | Standard error |
|---|---|---|---|---|
| Probe GC | 17334.54 | <0.001 | 0.069 | 0.0005 |
| log_e Off-targets | 152.34 | <0.001 | 0.460 | 0.037 |
| Free energy | 1626.64 | <0.001 | 0.153 | 0.004 |
| PolyA | 33.81 | <0.001 | 0.034 | 0.006 |
| PolyC | 7.30 | 0.008 | 0.081 | 0.030 |
| PolyG | 37.21 | <0.001 | 0.132 | 0.022 |
| PolyT | 101.30 | <0.001 | 0.050 | 0.005 |
| Offset | 650.03 | <0.001 | | 0.007^a |
| 6 bp | | | 0 | |
| 26 bp | | | 0.190 | |
| log_e Background | 364.97 | <0.001 | 0.142 | 0.007 |
| log_e Segment length | 17.84 | <0.001 | -0.019 | 0.004 |
| Probe position | 4.04 | 0.044 | 0.0002 | 0.00009 |
a Standard error of the difference between the effects of the 6 bp and 26 bp offsets.
This probe set included all probes, after filtering. The predicted effect of each variable and factor level on hybridization intensity (log_e scale) is shown with its associated standard error.
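Because the effects are reported on the log_e scale, exponentiating an effect gives the corresponding multiplicative change in intensity. The sketch below works through two rows of the table. It assumes the Probe GC effect is expressed per percentage point of GC and that the effect is linear over the range considered; the helper name is hypothetical.

```python
import math

# Effects taken from the probe position table (log_e intensity scale).
OFFSET_26BP_EFFECT = 0.190      # 26 bp offset relative to the 6 bp baseline
GC_EFFECT_PER_PERCENT = 0.069   # assumed to be per percentage point of probe GC


def fold_change(effect_log_e: float, units: float = 1.0) -> float:
    """Multiplicative change in intensity implied by a log_e-scale effect."""
    return math.exp(effect_log_e * units)


print(round(fold_change(OFFSET_26BP_EFFECT), 3))         # 1.209
print(round(fold_change(GC_EFFECT_PER_PERCENT, 10), 3))  # 1.994
```

Under these assumptions, a 26 bp offset corresponds to roughly 21% higher predicted intensity than a 6 bp offset, and a 10-point increase in GC content to roughly a doubling.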
REML output for the distance dataset
| Term | Test statistic | P-value | Predicted effect | Standard error |
|---|---|---|---|---|
| Probe GC | 976.27 | <0.001 | 0.068 | 0.002 |
| log_e Off-targets | 14.49 | <0.001 | 0.421 | 0.111 |
| Free energy | 100.23 | <0.001 | 0.165 | 0.016 |
| PolyA | 5.05 | 0.030 | 0.036 | 0.016 |
| PolyC | 9.20 | 0.005 | 0.167 | 0.056 |
| PolyG | 13.16 | 0.012 | 0.100 | 0.027 |
| PolyT | 13.94 | <0.001 | 0.058 | 0.016 |
| Offset | 31.25 | <0.001 | | 0.037^a |
| 6 bp | | | 0 | |
| 26 bp | | | 0.210 | |
| log_e Background | 15.28 | <0.001 | 0.124 | 0.032 |
| log_e Segment length | 2.11 | 0.146 | -0.019 | 0.013 |
| log_e Distance | 3.96 | 0.047 | 0.024 | 0.012 |
a Standard error of the difference between the effects of the 6 bp and 26 bp offsets.
This probe set included the first and last tiled probe within each tiled segment, after filtering. The predicted effect of each variable and factor level on hybridization intensity (log_e scale) is shown with its associated standard error.
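When both the response and a covariate are log_e-transformed, the effect acts as an exponent on the ratio of covariate values. The following sketch applies this to the log_e Distance row, assuming the effect of 0.024 is per unit of log_e distance; the function name is hypothetical, and the distances 6 bp and 7791 bp are the minimum and maximum from the summary table.

```python
DISTANCE_EFFECT = 0.024  # effect per unit log_e distance (distance dataset)


def intensity_ratio(d_from: float, d_to: float, effect: float = DISTANCE_EFFECT) -> float:
    """Predicted intensity ratio when distance changes from d_from to d_to (bp).

    On the log-log scale, effect * (ln d_to - ln d_from) is the change in
    log_e intensity, so the ratio of intensities is (d_to / d_from) ** effect.
    """
    if d_from <= 0 or d_to <= 0:
        raise ValueError("distances must be positive")
    return (d_to / d_from) ** effect


print(round(intensity_ratio(100, 200), 3))  # doubling the distance: 1.017
print(round(intensity_ratio(6, 7791), 3))   # full observed range:   1.188
```

Read this way, the distance effect is small: doubling the distance between tiled segments changes predicted intensity by under 2%, which fits the marginal P-value of 0.047 reported for this term.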
Figure 1. Log_e median signal by tiling path offset (bp). Two medians are significantly different at the 5% level if their notches do not overlap. Blood sample array data shown.
Figure 2. Predicted mean log_e median signal by probe position. Mean log_e median signal at each level of probe position when all REML covariates are held constant at the mean and averaged over all factor levels (solid line). Dashed lines indicate the mean log_e median signal +/- standard errors of the predictions.
Figure 3. Predicted mean log_e median signal by log_e distance between tiled segments (bp). Mean log_e median signal at each level of log_e distance when all REML covariates are held constant at the mean and averaged over all factor levels (solid line). Dashed lines indicate the mean log_e median signal +/- standard errors of the predictions.
Figure 4. Predicted mean log_e median signal by log_e length of tiled segment (bp). Mean log_e median signal at each level of log_e length of tiled segment when all REML covariates are held constant at the mean and averaged over all factor levels (solid line). Dashed lines indicate the mean log_e median signal +/- standard errors of the predictions.
REML output for the repeat element dataset
| Term | Test statistic | P-value | Predicted effect | Standard error |
|---|---|---|---|---|
| Probe GC | 121.40 | <0.001 | 0.051 | 0.005 |
| Free energy | 8.56 | 0.004 | 0.109 | 0.038 |
| PolyA | 0.75 | 0.389 | 0.022 | 0.025 |
| PolyC | 19.64 | <0.001 | 0.235 | 0.053 |
| PolyG | 5.74 | 0.017 | 0.045 | 0.019 |
| PolyT | 2.89 | 0.121 | -0.021 | 0.028 |
| Offset | 4.51 | 0.034 | | 0.059^a |
| 6 bp | | | 0 | |
| 26 bp | | | 0.124 | |
| log_e Background | 0.58 | 0.446 | -0.022 | 0.028 |
| log_e Repeat length | 0.11 | 0.736 | 0.013 | 0.038 |
a Standard error of the difference between the effects of the 6 bp and 26 bp offsets.
Figure 5. Log_e median signal by number of deleted central nucleotides. Twenty sets of Agilent DCP probes were printed onto each array. Each set comprised 10 probes: two replicate parent probes (no deletions, 0) and two replicates of each of four deletion probes, carrying either one, three, five and seven deletions or two, four, six and eight deletions, depending on the sequence set. Two medians are significantly different at the 5% level if their notches do not overlap. Blood sample array data shown.
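The notch comparison used in the box plot captions can be sketched as follows. The source does not state how the notches were computed, so this example assumes the common convention of median +/- 1.58 x IQR / sqrt(n) (the default in R's `boxplot.stats`); the function names are hypothetical, and the quartile method may differ slightly from the hinges a plotting package would use.

```python
import math
import statistics


def notch_interval(values, factor=1.58):
    """Approximate notch limits around the median: median +/- factor * IQR / sqrt(n)."""
    values = list(values)
    if len(values) < 2:
        raise ValueError("need at least two observations")
    q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    half_width = factor * (q3 - q1) / math.sqrt(len(values))
    return median - half_width, median + half_width


def notches_overlap(a, b):
    """True if the notch intervals of two samples overlap."""
    lo_a, hi_a = notch_interval(a)
    lo_b, hi_b = notch_interval(b)
    return lo_a <= hi_b and lo_b <= hi_a


# Hypothetical log_e median signals for two groups of probes.
group_a = [1, 2, 3, 4, 5, 6, 7, 8, 9]
group_b = [x + 10 for x in group_a]
print(notches_overlap(group_a, group_b))  # False
```

Non-overlapping notches are read as rough evidence that the two medians differ at about the 5% level; this is a visual heuristic rather than a formal test.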