Raj Chari, William W Lockwood, Wan L Lam.
Abstract
Array comparative genomic hybridization (array CGH) is a technique for assaying the copy number status of cancer genomes. The widespread use of this technology has led to a rapid accumulation of high-throughput data, which in turn has prompted the development of computational strategies for the analysis of array CGH data. Here we explain the principles behind array image processing, data visualization and genomic profile analysis, review currently available software packages, and raise considerations for future software development.
Keywords: alteration detection; array CGH; bioinformatics; cancer genome; microarray; software
Year: 2007 PMID: 17992253 PMCID: PMC2067254
Source DB: PubMed Journal: Cancer Inform ISSN: 1176-9351
Figure 1. Generation of array comparative genomic hybridization profiles. Tumor and normal reference DNA are differentially labeled with cyanine-5 and cyanine-3 respectively and competitively hybridized to a genomic microarray. The array consists of DNA targets selected to span chromosome regions or the entire genome. These targets are typically spotted in replicate. The ratio of the two fluorescence signal intensities reflects the relative copy number at that target. The ratio for each spot is plotted against its corresponding position in the human genome to generate a copy number profile.
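The step from spot intensities to a copy number profile can be sketched as follows. This is a minimal illustration, not the pipeline of any package reviewed here: the function name `copy_number_profile`, the tuple layout, and the choice to average replicate spots on the log2 scale are all assumptions made for the example.

```python
import math
from collections import defaultdict


def copy_number_profile(spots):
    """Build a copy number profile from raw spot intensities.

    spots: iterable of (clone_id, chrom, position_bp, cy5_tumor, cy3_reference).
    The tuple layout is hypothetical; real image-analysis output differs by tool.
    Returns (chrom, position_bp, clone_id, mean_log2_ratio) sorted by position.
    """
    grouped = defaultdict(list)
    for clone_id, chrom, pos, cy5, cy3 in spots:
        if cy5 <= 0 or cy3 <= 0:
            continue  # skip spots with no usable signal
        # log2(tumor / reference): 0 means equal copy number
        grouped[(chrom, pos, clone_id)].append(math.log2(cy5 / cy3))

    # Average replicate spots of the same clone (assumed replicate handling)
    profile = [
        (chrom, pos, clone_id, sum(vals) / len(vals))
        for (chrom, pos, clone_id), vals in grouped.items()
    ]
    profile.sort(key=lambda row: (row[0], row[1]))
    return profile


example_spots = [
    ("cloneA", 1, 100, 2000.0, 1000.0),  # replicate 1
    ("cloneA", 1, 100, 2000.0, 1000.0),  # replicate 2
    ("cloneB", 1, 50, 500.0, 1000.0),
]
example_profile = copy_number_profile(example_spots)
```

Working on the log2 scale makes gains and losses symmetric around zero, which is why the profiles in the later figures are plotted as log2 ratios.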
Figure 2. Principles of array CGH analysis. The process is grouped into three general functions: data preprocessing, visualization, and detection of segmental alterations, in no particular order. Methodologies for each function are indicated in a horizontal manner.
Figure 3. Normalization of array CGH data. A: A plot illustrating spatial bias across the microarray. B: The copy number profile of a chromosome before and after normalization. The removal of systematic biases improves the conformity of the profile.
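One simple way to remove a spatial bias of the kind shown in panel A is to median-center the log2 ratios within each region of the array. The sketch below assumes this block-wise median centering; it is a commonly used stand-in and not necessarily the normalization used to produce the figure. The name `median_center_by_block` and the `(block_id, log2_ratio)` input layout are hypothetical.

```python
from collections import defaultdict
from statistics import median


def median_center_by_block(spots):
    """Subtract each array block's median log2 ratio from its spots.

    spots: list of (block_id, log2_ratio), where block_id identifies a
    spatial region of the array (for example a print-tip sub-grid).
    Returns the centered log2 ratios in the original spot order.
    """
    by_block = defaultdict(list)
    for block_id, ratio in spots:
        by_block[block_id].append(ratio)

    # Assumes most clones in a block are copy-number neutral, so the
    # block median estimates the systematic offset of that region.
    centers = {block_id: median(vals) for block_id, vals in by_block.items()}
    return [ratio - centers[block_id] for block_id, ratio in spots]


biased = [(0, 0.5), (0, 0.7), (0, 0.6), (1, -0.2), (1, 0.0), (1, -0.1)]
centered = median_center_by_block(biased)
```

After centering, both blocks in the example share a median of zero, so an offset that depends only on array position no longer shifts the profile.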
Figure 4. Visualization of array CGH data. A: A graphical representation of array CGH data. The chromosomes are alternately labeled in green and black. In this graph, the log2 signal ratio for each clone is plotted against its chromosomal position ordered in series. PTEN deletion is highlighted in blue. B: Interactive display of the same data emphasizing the options to magnify selected chromosomes or chromosome segments, to display aligned gene annotation (gene track) and to link to external biological databases. The corresponding PTEN region in A is indicated.
Figure 5. Analysis of array CGH data. Three of the methods described in the text for the detection of segmental alterations are illustrated. A: Direct thresholding. Gains and losses are called by comparing the individual ratio of each clone on the array against a theoretical ratio threshold, here indicated by the purple line. B: Moving average based thresholding. The average ratio is calculated across a sliding window of clones (red line) before a threshold (purple line) is applied. C: Genetic local search is an algorithm that partitions the data into segments and then “smooths” the data by calculating the average of all the data points within each segment. Smoothed segments are indicated by black lines.
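The three panels can be illustrated with a small sketch. The threshold of 0.3, the window of three clones, and all function names are assumptions chosen for the example. The last function shows only the per-segment averaging step of panel C and takes the breakpoints as given; the genetic local search that would find them is not implemented here.

```python
def call_by_threshold(log2_ratios, threshold=0.3):
    """Direct thresholding: call each clone from its own ratio."""
    calls = []
    for ratio in log2_ratios:
        if ratio >= threshold:
            calls.append("gain")
        elif ratio <= -threshold:
            calls.append("loss")
        else:
            calls.append("neutral")
    return calls


def moving_average(values, window=3):
    """Centered sliding-window mean; the window shrinks at the ends."""
    half = window // 2
    smoothed = []
    for i in range(len(values)):
        lo = max(0, i - half)
        hi = min(len(values), i + half + 1)
        smoothed.append(sum(values[lo:hi]) / (hi - lo))
    return smoothed


def smooth_segments(values, breakpoints):
    """Replace every value by the mean of its segment.

    breakpoints: sorted start indices of the second and later segments,
    assumed to come from a separate partitioning algorithm.
    """
    edges = [0] + list(breakpoints) + [len(values)]
    smoothed = []
    for start, end in zip(edges, edges[1:]):
        mean = sum(values[start:end]) / (end - start)
        smoothed.extend([mean] * (end - start))
    return smoothed


ratios = [0.0, 0.0, 0.8, 0.0, 0.0, 0.6, 0.6, 0.6, 0.0, 0.0]
direct_calls = call_by_threshold(ratios)
averaged_calls = call_by_threshold(moving_average(ratios))
segment_means = smooth_segments(ratios, breakpoints=[5, 8])
```

In this toy profile, direct thresholding calls the isolated clone at index 2 as a gain, whereas thresholding the moving average keeps only the three-clone gain at indices 5 to 7. That is the trade-off between the two approaches: averaging suppresses single-clone noise but can also hide genuine single-clone alterations.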
Software for analysis and visualization of array CGH data.
| Software | Array Platform | Free/Cost | Computer Platform | Alteration Detection | Display type | Profile Display | Website | Reference |
|---|---|---|---|---|---|---|---|---|
| | cDNA | Free | W | No | G | Single | - | |
| | LIC, cDNA, oligo | Free | W | Heuristic algorithm, regularized maximum likelihood, Threshold | G | Single | | |
| | LIC, cDNA, oligo | Free | W, M, L, U | No | I | Multiple | | |
| | oligo | Free | W | Copy-number calculated based on SNP intensity of sample relative to distribution derived from 100 sample normal reference, Copy number response curve | I | Single | | |
| | LIC, cDNA, oligo | Free | Web-based | Thresholding | G | Single | | |
| | LIC | Free | W, M, L, U | Unsupervised Hidden Markov Partition | G | Single consensus plot | | |
| | viewer | Free | W, M, L, U, Web-based | Moving average, compute log ratio to any base, need GCOS and GTYPE | I | Single | | |
| | cDNA, oligo | Cost, Free trial | W, M, L | Z-scoring, moving average calculation | I | Multiple | - | |
| | LIC, oligo | Free | W, M, U | Standard ratio threshold, p-value based on reference | I | Multiple | | |
| | LIC, cDNA, oligo | Free | W, M, L | Thresholding, bootstrap-based method, Analysis of Copy Errors (ACE) | I | Multiple | | |
| | LIC, cDNA | Free | W, U | CLAC (clustering along chromosomes) with FDR (false discovery rate) | G | Single consensus plot | | |
| | cDNA | Free | W, M, L | K-means clustering, dynamic programming | G | Multiple | | |
| | LIC, oligo | Free | W, L | Unsupervised Hidden Markov Partition, Circular Binary Segmentation | I | Multiple | | |
| | cDNA | Free | W, M, L | Expectation Maximization (EM), one-sided sign test and/or mean permutation test | G | Multiple | | |
| | oligo | Free | W | Hidden Markov model | I | Single | | |
| | cDNA, oligo | Free | W | Hidden Markov model, Median Smoothing, PM/MM Difference Model | I | Multiple | | |
| | cDNA | Free | W | Clustering of cDNA expression data based on chromosome location | I | Multiple | | |
| | LIC, oligo | Free | W, M, L, U | Circular Binary Segmentation | G | Single | | |
| | LIC | Free | W, M, L, U | Adaptive Weights Smoothing | G | Single | Request author: | |
| | LIC, cDNA, oligo | Free | W, M, L, U | Maximum likelihood and K-nearest neighbor or wavelet approach | I | Single | | |
| | oligo | Cost | W | Windowed Threshold, Second Derivative Peak | I | Single | - | |
| | LIC, cDNA, oligo | Free | W | Region detection by user-defined thresholds or sliding window algorithm | I | Multiple | | |
| | LIC | Free | W | No | I | Single | | |
| | LIC, cDNA, oligo | Collab | W | Moving average | I | Multiple | - | |
| | LIC | Cost | Web-based | Confidence interval, based on iterative algorithm | I | Single | - | |
W, Windows; M, Macintosh; L, Linux; U, Unix.
G, Graphical Representation; I, Interactive Display.
Collab, Free on a collaborative basis.
Figure 6. Examples of multiple experiment visualization methods in SeeGH. A: Multiple alignment of individual chromosome profiles. B: Frequency plot summarizing multiple experiments. Here, red histograms represent the frequency of gains and green histograms the frequency of losses. C: Heatmap display of copy number status. Each vertical column represents an individual profile. Red indicates gain and green indicates loss. The amplitude of the ratio is reflected in the color intensity.
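The values behind a frequency plot such as panel B can be computed by counting, for each clone, the fraction of profiles in which it is called gained or lost. The sketch below is an illustration under stated assumptions and is not SeeGH code: the function name `alteration_frequencies` and the "gain"/"loss"/"neutral" labels are hypothetical, and every profile is assumed to list the same clones in the same order.

```python
def alteration_frequencies(call_matrix):
    """Per-clone frequency of gains and losses across experiments.

    call_matrix: list of profiles, each a list of per-clone calls
    ("gain", "loss" or "neutral") in identical clone order.
    Returns (gain_freq, loss_freq), two lists with one value per clone.
    """
    n_samples = len(call_matrix)
    n_clones = len(call_matrix[0])
    gain_freq = [
        sum(profile[i] == "gain" for profile in call_matrix) / n_samples
        for i in range(n_clones)
    ]
    loss_freq = [
        sum(profile[i] == "loss" for profile in call_matrix) / n_samples
        for i in range(n_clones)
    ]
    return gain_freq, loss_freq


calls = [
    ["gain", "neutral", "loss"],
    ["gain", "loss", "loss"],
    ["neutral", "neutral", "loss"],
    ["gain", "neutral", "neutral"],
]
gain_freq, loss_freq = alteration_frequencies(calls)
```

Plotting `gain_freq` upward in red and `loss_freq` downward in green against clone position would give a histogram in the style described for panel B.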