Abstract
The use of microarray technology to measure gene expression on a genome-wide scale has been well established for more than a decade. Methods to process and analyse the vast quantity of expression data generated by a typical microarray experiment are similarly well established. The Affymetrix Exon 1.0 ST array is a relatively new type of array, which has the capability to assess expression at the individual exon level. This allows a more comprehensive analysis of the transcriptome, and in particular enables the study of alternative splicing, a gene regulation mechanism important in both normal conditions and in diseases. Some aspects of exon array data analysis are shared with those for standard gene expression data but others present new challenges that have required development of novel tools. Here, I will introduce the exon array and present a detailed example tutorial for analysis of data generated using this platform.
Year: 2011 PMID: 21498550 PMCID: PMC3220870 DOI: 10.1093/bib/bbq086
Source DB: PubMed Journal: Brief Bioinform ISSN: 1467-5463 Impact factor: 11.622
Using Affymetrix power tools to process raw data from the exon array and generate exon or gene-level signal estimates
| Example argument to apt-probeset-summarize command | Notes |
|---|---|
| -a rma-sketch | Specifies the analysis to be performed. ‘rma-sketch’ (RMA using a subset of probes for memory efficiency) is one of the standard options. Other options, such as plier-sketch and plier-gcbg-sketch (incorporating a GC-based background correction), can be specified instead. It is possible to perform multiple analyses simultaneously by including more than one -a argument. |
| -p HuEx-1_0-st-v2.r2.pgf -c HuEx-1_0-st-v2.r2.clf | Specify the library files, which give information on probeset groups and the array layout. The -p *.pgf and -c *.clf arguments can be replaced with -d *.cdf if the user wishes to use a custom CDF with alternative probeset definitions [ |
| -s HuEx-1_0-st-v2.r2.core.ps | If a .ps file is specified with the -s argument, an exon-level analysis will be performed. To run a gene-level analysis, a .mps file is specified with the -m argument instead. In either case, analysis can be restricted to probes annotated as core, extended or full by specifying the relevant file. |
| --qc-probesets HuEx-1_0-st-v2.r2.qcc | Specify the .qcc file to process the control probesets on the array and check quality of data. |
| -o output_exon | Specify a directory to write the output files with the -o argument. This will be created in the current working directory unless a path to another location is given. Output files are named according to the analysis method (e.g. rma-sketch.summary.txt) and are given the same name for both exon and gene-level analyses; they will be overwritten if another analysis is run with the same output folder specified. It is therefore useful to write files to a new folder for each analysis. It is also helpful to rename each output file immediately with an informative name that includes the dataset, the analysis method and whether it is an exon or gene-level analysis, so that the data in the file can be identified easily. |
| *.CEL | Specify the .CEL files to be processed. If they are contained in the current working directory, *.CEL will suffice, but a path to the files can be given if required. There will be some differences between Windows and Linux systems regarding syntax for path names and use of the wildcard (*) character. Windows users will need to specify each .CEL file to be analysed individually. |
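
Putting the arguments in the table together, a complete call might be assembled as in the sketch below. This is an illustration rather than a definitive recipe: `summarize_cmd` is a hypothetical helper name (not part of Affymetrix Power Tools), the .mps file for a gene-level run must be supplied by the user, and the script only prints the command so that it can be checked before anything is run.

```shell
#!/bin/sh
# Sketch only: prints an apt-probeset-summarize command built from the
# arguments described in the table. Remove the leading "echo" inside the
# function to execute the command instead of printing it.
# summarize_cmd is a hypothetical helper name, not part of APT.
summarize_cmd() {
  level="$1"   # "exon" (uses -s with a .ps file) or "gene" (uses -m with a .mps file)
  lib="$2"     # library file prefix, e.g. HuEx-1_0-st-v2.r2
  ps="$3"      # probeset list: .ps file for exon level, .mps file for gene level
  out="$4"     # output directory; use a new one for each analysis
  shift 4      # remaining arguments are the .CEL files
  case "$level" in
    exon) flag="-s" ;;
    gene) flag="-m" ;;
    *) echo "level must be 'exon' or 'gene'" >&2; return 1 ;;
  esac
  echo apt-probeset-summarize -a rma-sketch \
    -p "$lib.pgf" -c "$lib.clf" \
    "$flag" "$ps" \
    --qc-probesets "$lib.qcc" \
    -o "$out" "$@"
}

# Exon-level analysis of every .CEL file in the current working directory.
summarize_cmd exon HuEx-1_0-st-v2.r2 HuEx-1_0-st-v2.r2.core.ps output_exon *.CEL
```

Because the summary file has the same name for exon and gene-level runs, a follow-up step such as `mv output_exon/rma-sketch.summary.txt output_exon/GSE18300_rma-sketch_exon.summary.txt` (the new file name is only an example) keeps the results distinguishable.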
Figure 1: Example quality control plots for the 10 arrays in the example data set (GSE18300) showing (a) average raw signal intensity; (b) mean residual deviation; (c) hierarchical clustering and (d) distribution of normalized intensities.