Mike L Smith, Mark J Dunning, Simon Tavaré, Andy G Lynch.
Abstract
BACKGROUND: A key stage for all microarray analyses is the extraction of feature-intensities from an image. If this step goes wrong, then subsequent preprocessing and processing stages will stand little chance of rectifying the matter. Illumina employ random construction of their BeadArrays, making feature-intensity extraction even more important for the Illumina platform than for other technologies. In this paper we show that using raw Illumina data it is possible to identify, control, and perhaps correct for a range of spatial-related phenomena that affect feature-intensity extraction.
Year: 2010 PMID: 20423502 PMCID: PMC2880029 DOI: 10.1186/1471-2105-11-208
Source DB: PubMed Journal: BMC Bioinformatics ISSN: 1471-2105 Impact factor: 3.169
Figure 1. Illumina BeadArray structure. Depicting, approximately to scale, the structures of the two types of BeadArray used in this manuscript, and illustrating the terminology we employ for describing BeadArrays. Indicated, for the Human-WG6 V2 expression array and the CNV370-duo DNA copy number array, are the layouts of samples and sections. Additionally, for each technology, one section is expanded to illustrate the segments that comprise it. The images returned as part of the raw Illumina data are of sections, while registration takes place by the segment.
Figure 2. The relationship between the fractional part of bead position and observed intensity. (A) To investigate the association between bead-intensity and the fractional part of a bead's location we use the bead coordinates and intensities for array section 4343238066_A_1. A pixel is divided into 100 × 100 bins and the average log-intensity of beads with fractional coordinates falling into each bin is plotted. (B) As for part A of the figure, but the fractional part of the coordinate was not used in the intensity calculation, with the four pixels instead being given equal weight. (C) As for part A of the figure, but within-bead-type residual log-intensities are plotted. (D) As for part B of the figure, but within-bead-type residual log-intensities are plotted.
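The binning described for Figure 2 can be sketched as follows. This is a minimal illustration rather than the authors' code: the function name `fractional_bin_means`, the `(x, y, intensity)` input layout, and the use of base-2 logarithms are all assumptions made for the example.

```python
import math
from collections import defaultdict


def fractional_bin_means(beads, n_bins=100):
    """Mean log2-intensity per bin of the fractional bead coordinate.

    beads: iterable of (x, y, intensity) tuples (hypothetical layout).
    Returns a dict mapping (bin_x, bin_y) to the mean log2-intensity
    of the beads whose fractional coordinates fall in that bin.
    """
    sums = defaultdict(float)
    counts = defaultdict(int)
    for x, y, intensity in beads:
        if intensity <= 0:
            continue  # the logarithm is undefined here, so skip the bead
        # Fractional part of each coordinate, in [0, 1).
        fx = x - math.floor(x)
        fy = y - math.floor(y)
        # Map the fractional part onto an n_bins x n_bins grid.
        bx = min(int(fx * n_bins), n_bins - 1)
        by = min(int(fy * n_bins), n_bins - 1)
        sums[(bx, by)] += math.log2(intensity)
        counts[(bx, by)] += 1
    return {key: sums[key] / counts[key] for key in sums}
```

Plotting the returned means as a 100 × 100 image would give a picture of the kind described in panel A; substituting residuals for raw log-intensities would correspond to panels C and D.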
Figure 3. Illustrating the failure to map bead-centres correctly in the red channel. Shown is a small section of each of the green (A) and red (B) images (with a slight shift to ensure alignment of the chip) from array section 4127130020_A_4. Nineteen beads are identified in the figure, and can be seen to follow a regular hexagonal lattice in the green image. The bead-centres for those nineteen beads (as provided by Illumina's software) are also plotted on the red image, where it is clear that there has been a failure in centre identification and mapping for some beads. Part C of the figure shows, for the nineteen beads, how far from the median value for the other replicates of these bead-types the reported log-intensity lies. Beads that are apparently not mapping to the correct bead in the red channel (B) are coloured red.
Figure 4. Demonstrating the effect of remapping the probe annotation. For the array illustrated in Additional File 9, bead intensity was averaged by bead-type across segments 2 and 3 (the mis-registered segments) and compared with the average intensity for the same bead-type from segments 1 and 4 (the successfully registered segments). The bead-type identities for segments 2 and 3 were then remapped by manually shifting the grid of bead-types to achieve alignment with the figure, but retaining the individual bead intensities that had been returned.
Figure 5. Beads neighbouring clusters of non-decoded beads. Illustrated are 8 segments from various sections of the expression data, each showing a cluster of non-decoded beads (orange) surrounded by a distinct region of high-intensity beads (blue). Within-bead-type residual intensities (log-intensity minus median log-intensity for that bead-type) have been averaged over beads in 20 × 20 pixel squares, and the colour scale for each segment is calculated separately (in each case going from yellow for the lowest value to blue for the highest). Clusters of non-decoded beads are a feature of array manufacture rather than processing, with the implication that the regions of high intensity are similarly so. Images are indexed in the form A_B_C S D, where A is the chip name, B the sample name, C the section number, and D the segment number.
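The residual calculation used for Figure 5 (log-intensity minus the median log-intensity for that bead-type, averaged over 20 × 20 pixel squares) can be sketched as below. The function name `residual_tiles` and the `(bead_type, x, y, log_intensity)` input layout are assumptions for illustration only.

```python
from collections import defaultdict
from statistics import median


def residual_tiles(beads, tile=20):
    """Average within-bead-type residuals over tile x tile pixel squares.

    beads: iterable of (bead_type, x, y, log_intensity) tuples
           (hypothetical layout).
    Returns a dict mapping (tile_col, tile_row) to the mean residual.
    """
    beads = list(beads)

    # Median log-intensity for each bead-type.
    by_type = defaultdict(list)
    for bead_type, _, _, log_int in beads:
        by_type[bead_type].append(log_int)
    medians = {t: median(values) for t, values in by_type.items()}

    # Residual of each bead, grouped by the square containing its centre.
    tiles = defaultdict(list)
    for bead_type, x, y, log_int in beads:
        key = (int(x // tile), int(y // tile))
        tiles[key].append(log_int - medians[bead_type])

    return {key: sum(values) / len(values) for key, values in tiles.items()}
```

Squares with a markedly positive mean residual next to a square containing no decoded beads would be candidates for the pattern shown in the figure.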
Figure 6. Signal overspill from a bright bead onto its neighbours. (A) A false-colour image of section 4343238080_B_2 showing the region surrounding a bright bead centred at pixel (377, 734). The highest intensity pixels are indicated in dark blue, whilst the dimmest pixels are white. Bead-centre locations are marked with a cross. Each of the black squares shows the 16 pixels used to calculate the foreground intensity for the neighbours, with the size of the circles in each square representing the weight attributed to each pixel during the foreground calculation. Neighbours 1, 3 and 5 appear to fall largely within the signal emitted by the bright bead. Bead 7 is included as an example where the bead-centre was identified between pixels, resulting in a more even weights matrix. (B) A bar chart showing the difference between the calculated log intensity for each of the neighbouring beads and the median log intensity for beads of that type. Those beads that fell within the signal emitted by the bright bead are all seen to have a dramatically higher intensity score than is expected for their respective bead-types.
Figure 7. Pixels of unusually low intensity and their influence. (A) Illustrating the occurrence of an unusually low intensity pixel. A region, centred on pixel (964, 6081), from image 4343238066_A_2_Grn.tif, is illustrated. Whilst the majority of the image shows the typical background log-intensity, a single pixel with an extremely low value is observed. Note that this occurs in the vicinity of an almost-saturated high-intensity bead. Seven beads which include this pixel in their background region have been successfully decoded, and their bead-centres are indicated. (B) Illustrating how the intensities of the affected beads show a large deviation from the summarized value for their respective bead-types. If the median of the five lowest pixels is used instead of the mean during the background calculation, the impact of the low intensity pixel is reduced in most cases.
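The difference between the two background rules mentioned in panel B can be seen in a small sketch. The function name `background_estimate` and the flat list of pixel values standing in for a bead's background region are assumptions for illustration; only the "five lowest pixels, mean versus median" comparison comes from the caption.

```python
from statistics import mean, median


def background_estimate(pixels, k=5, robust=False):
    """Background from the k lowest pixel values in a bead's local region.

    pixels: pixel values from the background region (hypothetical input).
    robust=False averages the k lowest pixels; robust=True takes their
    median, which is less sensitive to a single abnormally low pixel.
    """
    lowest = sorted(pixels)[:k]
    return median(lowest) if robust else mean(lowest)


# One abnormally low pixel (50) among otherwise typical background values.
region = [700, 702, 698, 705, 701, 699, 50, 710]
mean_based = background_estimate(region)                  # pulled down by the low pixel
median_based = background_estimate(region, robust=True)   # stays near typical background
```

Because the background is subtracted from the foreground, an underestimated background inflates the final bead intensity, which is the deviation panel B illustrates.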
Figure 8. Effect on biological interpretation. (A) Showing the log intensity of bead-type 5900598 from six array sections, each of which had the same sample hybridized to it. The summarized intensities have been calculated twice, once using the standard analysis and once with beads affected by the identified phenomena removed. (B) & (C) show the log intensities of the individual beads of type 5900598 on section 4343238080_D_2 calculated using the standard and two-step summarization methods respectively. Histograms of the log intensities of the negative control beads calculated in the same fashion are shown down the sides. In panel (C) beads excluded due to their proximity to the phenomena identified in this study are indicated by a red cross. The dotted lines indicate the range of values outside of which beads are classed as outliers and are excluded from the summarization step. The removal of the marked bead results in three additional beads being classed as outliers. The result is a lower summarized intensity (the solid black line), which, when compared to the negative control beads, changes from being classed as expressed to not expressed.
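A two-step summarization of the kind compared in Figure 8 might look like the sketch below: beads flagged by the spatial diagnostics are dropped first, and only then are the outlier limits computed. The caption does not state the outlier rule, so the limits used here (median ± 3 scaled MADs, with the 1.4826 scaling constant) are an assumption, as are the function name `summarize_bead_type` and its arguments.

```python
from statistics import mean, median


def summarize_bead_type(log_intensities, flagged=(), n_mads=3.0):
    """Summarize one bead-type after removing flagged beads and outliers.

    log_intensities: log-intensities of the replicate beads.
    flagged: indices of beads to drop before outlier detection
             (for example, beads next to a bright bead).
    The median +/- n_mads * MAD limits are an assumed outlier rule.
    Returns (summary, (lower_limit, upper_limit)).
    """
    drop = set(flagged)
    kept = [v for i, v in enumerate(log_intensities) if i not in drop]
    centre = median(kept)
    # Scaled median absolute deviation about the median.
    mad = 1.4826 * median([abs(v - centre) for v in kept])
    lower, upper = centre - n_mads * mad, centre + n_mads * mad
    inliers = [v for v in kept if lower <= v <= upper]
    return mean(inliers), (lower, upper)


values = [7.0, 7.1, 6.9, 7.2, 6.8, 12.0]
standard, standard_limits = summarize_bead_type(values)
two_step, two_step_limits = summarize_bead_type(values, flagged=(5,))
```

Removing a flagged bead before computing the limits can narrow the accepted range, which is how, in the caption's example, excluding one bead leads to further beads being classed as outliers.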
Number of affected beads for expression arrays.
| Section ID | Neighbouring | Large deviation | Near abnormally | Neighbouring |
|---|---|---|---|---|
| 66_A_1 | 54 | 501 | 32 | 13059 |
| 66_A_2 | 30 | 589 | 32 | 4312 |
| 66_B_1 | 210 | 373 | 33 | 1062 |
| 66_B_2 | 140 | 340 | 30 | 5511 |
| 66_C_1 | 12 | 443 | 14 | 1705 |
| 66_C_2 | 12 | 514 | 8 | 709 |
| 66_D_1 | 72 | 392 | 8 | 272 |
| 66_D_2 | 60 | 382 | 4 | 2616 |
| 66_E_1 | 6 | 495 | 7 | 1918 |
| 66_E_2 | 24 | 508 | 0 | 9399 |
| 66_F_1 | 108 | 625 | 25 | 4427 |
| 66_F_2 | 54 | 475 | 12 | 4800 |
| 80_A_1 | 66 | 451 | 38 | 1579 |
| 80_A_2 | 42 | 343 | 48 | 2530 |
| 80_B_1 | 420 | 348 | 44 | 212 |
| 80_B_2 | 342 | 360 | 76 | 341 |
| 80_C_1 | 54 | 281 | 40 | 2035 |
| 80_C_2 | 30 | 457 | 50 | 0 |
| 80_D_1 | 15246 | 5431 | 148 | 0 |
| 80_D_2 | 12864 | 5653 | 71 | 495 |
| 80_E_1 | 162 | 719 | 43 | 2058 |
| 80_E_2 | 6 | 506 | 13 | 568 |
| 80_F_1 | 30 | 341 | 52 | 381 |
| 80_F_2 | 78 | 355 | 28 | 880 |
| Median | 57 | 454 | 32 | 1642 |
Summary of results.
| Problem | Diagnostic | Solution | Implementation (where implemented) |
|---|---|---|---|
| There is local discordance between the locations in the two channels from two-colour arrays | Between-channel differences in location can be compared to local median differences | If one channel is clearly wrong (from relative grid positions in that channel) then the bead-centre can be remapped from the other channel, else the bead should be dropped | |
| Image cropping out part of the array section, so that values cannot be calculated | Bead-centre coordinates lie outside the dimensions of the image; beads are apparent on the edge of the image | Exclude beads with such coordinates | Any text editor can assess the coordinates, while |
| Beads are mostly well-registered, but grid of bead-centres does not align with the image resulting in scrambling of bead-type data | Visual inspection of bead-centres over image. Without access to images, check that the segments have similar extreme x coordinates and that they are equally spaced along the y-axis | In extreme circumstances, bead IDs can be remapped, but usually segments/sections should be excluded | |
| Neighbouring beads of the same bead-type are a potential concern (albeit unsubstantiated) | Such pairs of beads can be identified and down-weighted or excluded | | |
| Beads neighbouring clusters of non-decoded beads are more likely to take extreme values | The presence of such clusters can be determined from the . | Exclude/Down-weight beads in a zone about such clusters | |
| Bright beads encroach on neighbours, raising their associated values | Visual inspection of the brightest beads | Bright beads can be identified (by intensity or size using EBImage) and their neighbours down-weighted or excluded | |
| Abnormally low pixels in the image distort background values and so final intensities | Pixels present with values noticeably lower than the mode | Exclude such pixels, or use a less-sensitive background calculation rule | |
| Multiple bead-centres map to the same location | Text files (or . | In two-colour arrays, it may be clear which is the correct bead-centre, else exclude both | Scripts provided to detect departure from predicted bead-centre. |
Summarizing the phenomena we have described, methods for their diagnosis, and possible solutions.
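Two of the diagnostics in the summary table need only the bead-centre coordinates: beads whose centres lie outside the image, and multiple bead-centres mapping to the same location. A minimal sketch follows; the function name `flag_bead_centres`, the list-of-tuples input, and the 0-based coordinate convention with `0 <= x < width` are assumptions for illustration and do not describe the scripts the authors provide.

```python
from collections import Counter


def flag_bead_centres(centres, width, height):
    """Flag bead-centres outside the image and centres shared by several beads.

    centres: list of (x, y) bead-centre coordinates (hypothetical layout).
    width, height: image dimensions in pixels, assuming 0-based coordinates.
    Returns (outside, duplicated) as lists of indices into `centres`.
    """
    outside = [
        i for i, (x, y) in enumerate(centres)
        if not (0 <= x < width and 0 <= y < height)
    ]
    counts = Counter(centres)
    duplicated = [i for i, centre in enumerate(centres) if counts[centre] > 1]
    return outside, duplicated


centres = [(10.5, 20.0), (-1.0, 5.0), (10.5, 20.0), (99.0, 300.0)]
outside, duplicated = flag_bead_centres(centres, width=100, height=200)
```

Following the table, beads in `outside` would be excluded, and beads in `duplicated` would be excluded unless, in two-colour data, the other channel makes clear which centre is correct.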