Michael Psarros, Steffen Heber, Manuela Sick, Gnanasekaran Thoppae, Keith Harshman, Beate Sick.
Abstract
The Remote Analysis Computation for gene Expression data (RACE) suite is a collection of bioinformatics web tools designed for the analysis of DNA microarray data. RACE performs probe-level data preprocessing, extensive quality checks, data visualization and data normalization for Affymetrix GeneChips. In addition, it offers differential expression analysis on normalized expression levels from any array platform. RACE estimates the false discovery rates of lists of potentially regulated genes and provides a Gene Ontology-term analysis tool for GeneChip data to support the biological interpretation and annotation of results. The analysis is fully automated but can be customized by flexible parameter settings. To offer a convenient starting point for subsequent analyses, and to provide maximum transparency, the R scripts used to generate the results can be downloaded along with the output files. RACE is freely available for use at http://race.unil.ch.
Year: 2005 PMID: 15980552 PMCID: PMC1160250 DOI: 10.1093/nar/gki490
Source DB: PubMed Journal: Nucleic Acids Res ISSN: 0305-1048 Impact factor: 16.971
Figure 1. ‘PLM pseudo image’ tool output. The spatial distribution of residuals obtained from probe-level fitting over multiple arrays is shown. (a) High-quality data showing almost no defects; (b) low-quality data showing large artifacts.
Graphical outputs and their functions for the RACE Quality Checks & Normalization tool
| Output | Function |
|---|---|
| CEL intensity images | Detect spatial intensity artifacts |
| Boxplots of raw PM probe intensities | Display PM probe intensity distribution for selected array; compare overall brightness of selected array |
| Density distribution of the raw PM probe intensities | Check the intensity density for selected arrays; compare the densities of selected arrays |
| Boxplots of the normalized PM intensities | Assess the success of the normalization |
| 5′ to 3′ feature intensity plot (‘RNA digestion plot’) | Detect a bias in probe intensities; identify outlier arrays with deviating biases |
| PLM pseudo images | Assess spatial distribution of weights derived during robust linear model probe-level fit; detect obscure/dark array regions with low weights; assess the spatial distribution of residuals; detect obscure/dark array regions with high positive or negative residuals |
| NUSE boxplots | Identify arrays where the standard errors for gene expression estimates from PLM fit are overall larger relative to other arrays |
| RLE boxplots | Identify arrays whose relative log expression values, computed against a median array, are larger than those of other arrays |
| Pair-wise scatter plots | Assess similarities and differences in expression values measured on two arrays; identify outlier arrays |
| Correlation matrix plot | Detect homogeneous groups of arrays; identify outlier arrays |
| Sample cluster | Find subgroups of similar samples |
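The RLE row in the table above can be made concrete with a small calculation. The following is a minimal sketch, not the R code that RACE runs: it assumes log2 expression values are already available per array, and the function name `rle_values` and the dictionary layout are hypothetical choices made for illustration.

```python
from statistics import median

def rle_values(log_expr):
    """Relative log expression: each gene's log2 value minus that gene's
    median across all arrays.

    log_expr maps an array name to a list of log2 expression values, one per
    gene, in the same gene order for every array (assumed layout).
    """
    arrays = list(log_expr)
    n_genes = len(log_expr[arrays[0]])
    # The per-gene median over all arrays plays the role of the "median array".
    gene_medians = [median(log_expr[a][g] for a in arrays) for g in range(n_genes)]
    return {a: [log_expr[a][g] - gene_medians[g] for g in range(n_genes)]
            for a in arrays}

# Toy data (invented for illustration): chip_c is uniformly brighter.
toy = {
    "chip_a": [8.0, 9.0, 10.0],
    "chip_b": [8.1, 9.1, 10.1],
    "chip_c": [9.5, 10.5, 11.5],
}
rle = rle_values(toy)
```

In a boxplot of these values, an array such as `chip_c` whose box sits well away from zero would stand out as a candidate outlier.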
Figure 2. ‘Bias 5′ to 3′ end plot’ tool output. Each line represents the overall 5′ to 3′ intensity bias of a different chip.
Graphical outputs and their functions for the RACE Statistical Tests tool
| Output | Function |
|---|---|
| Correlation matrix plot | Check whether intra-group correlations are higher than correlations between groups; identify outlier samples |
| Sample cluster | Check whether uploaded groups yield separated clusters; identify sample subgroups |
| StdDev plots | Compare distributions of expression standard deviations in the different groups; assess variability in different groups |
| p-Value histogram | Obtain a visual impression of the number of differentially expressed genes from the height of a potential peak at small p-values |
| Volcano plots | Check for genes with high significance and large expression changes across groups |
| FDR versus p-value | Find the appropriate p-value threshold |
| MvA plot | Visualize mean expression and log changes of all genes; label genes selected according to the user-defined p-value and fold-change thresholds |
Figure 3. ‘p-Value histogram’ output. The number of genes (‘Frequency’) which fall into each p-value bin is presented. In the insert, the False Discovery Rate versus the p-value threshold is plotted.
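As an illustration of how a p-value histogram and an FDR-versus-threshold curve relate, here is a minimal sketch. The record does not state which FDR estimator RACE uses; the code assumes a simple Benjamini-Hochberg-style estimate (expected false positives under a uniform null divided by the number of selected genes). The function names and the example p-values are hypothetical.

```python
def p_value_histogram(p_values, n_bins=20):
    """Count how many p-values fall into each of n_bins equal-width bins on [0, 1]."""
    counts = [0] * n_bins
    for p in p_values:
        # p == 1.0 would index one past the end, so clamp to the last bin.
        idx = min(int(p * n_bins), n_bins - 1)
        counts[idx] += 1
    return counts

def estimated_fdr(p_values, threshold):
    """Estimate the FDR of the gene list selected by p <= threshold.

    Under a uniform null, about m * threshold genes pass by chance; dividing
    by the number actually selected gives the estimate (capped at 1).
    """
    m = len(p_values)
    selected = sum(1 for p in p_values if p <= threshold)
    if selected == 0:
        return 0.0
    return min(1.0, m * threshold / selected)

# Invented example: three small p-values on top of a flat background.
example_p = [0.001, 0.002, 0.01, 0.2, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9]
hist = p_value_histogram(example_p, n_bins=10)
fdr_at_005 = estimated_fdr(example_p, 0.05)
```

Evaluating `estimated_fdr` over a grid of thresholds gives a curve of the kind shown in the insert of the figure.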
Figure 4. ‘MvA plots’ output. The expression ratio (log base 2) of genes is plotted against their average expression intensity. Circles identify genes that pass user-defined p-value and fold-change value thresholds.
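The quantities behind an MvA plot can be sketched as follows. This assumes the conventional definitions, M as the log2 ratio of the two group means and A as the average of their log2 intensities; the record itself only says "expression ratio (log base 2)" and "average expression intensity". The function names, default thresholds, and input values are hypothetical.

```python
import math

def mva_points(group1_mean, group2_mean):
    """Return one (M, A) pair per gene from two lists of mean intensities.

    M = log2(x) - log2(y) is the log ratio; A = (log2(x) + log2(y)) / 2 is the
    average log intensity (assumed definition of "average expression intensity").
    """
    points = []
    for x, y in zip(group1_mean, group2_mean):
        m = math.log2(x) - math.log2(y)
        a = 0.5 * (math.log2(x) + math.log2(y))
        points.append((m, a))
    return points

def flag_genes(points, p_values, p_max=0.05, min_fold=2.0):
    """Mark genes that pass both the p-value and the fold-change threshold."""
    m_cut = math.log2(min_fold)
    return [p <= p_max and abs(m) >= m_cut
            for (m, _), p in zip(points, p_values)]

# Invented intensities: the first gene changes four-fold, the second not at all.
points = mva_points([8.0, 4.0], [2.0, 4.0])
flags = flag_genes(points, [0.01, 0.01])
```

Genes with a `True` flag correspond to the circled points in the figure.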
Figure 5. ‘GO-term chart’ output. Biological function GO terms calculated to be statistically overrepresented in a user-specified gene list are reported along with the number of genes from the list associated with each term.
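One common way to test whether a GO term is overrepresented in a gene list is a one-sided hypergeometric test. The record does not say which test RACE applies, so the sketch below is an assumption chosen for illustration; the function name and the counts are hypothetical.

```python
from math import comb

def hypergeom_upper_tail(k, n, K, N):
    """P(X >= k) for a hypergeometric draw.

    N: genes on the array (background), K: background genes annotated with the
    GO term, n: genes in the user's list, k: listed genes annotated with the term.
    """
    total = comb(N, n)
    upper = min(n, K)
    # comb(N - K, n - i) is 0 when n - i exceeds N - K, so impossible draws drop out.
    hits = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return hits / total

# Invented counts: 10 genes in total, 5 annotated, a list of 5 that are all annotated.
p_over = hypergeom_upper_tail(5, 5, 5, 10)
```

A small value of `p_over` indicates that the term appears in the list more often than expected by chance; with many terms tested, a multiple-testing correction would normally follow.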