Andreu Alibés, Edward R Morrissey, Andrés Cañada, Oscar M Rueda, David Casado, Patricio Yankilevich, Ramón Díaz-Uriarte.
Abstract
The analysis of expression and CGH arrays plays a central role in the study of complex diseases, especially cancer, including finding markers for early diagnosis and prognosis, choosing an optimal therapy, or increasing our understanding of cancer development and metastasis. Asterias (http://www.asterias.info) is an integrated collection of freely accessible web tools for the analysis of gene expression and aCGH data. Most of the tools use parallel computing (via MPI) and run on a server with 60 CPUs for computation; compared to a desktop application, or to a server-based application that is not parallelized, parallelization provides speedups of up to 50-fold. Most of our applications allow the user to obtain additional information for user-selected genes (chromosomal location, PubMed ids, Gene Ontology terms, etc.) by using clickable links in tables and/or figures. Our tools include: normalization of expression and aCGH data (DNMAD); converting between different types of gene/clone and protein identifiers (IDconverter/IDClight); filtering and imputation (preP); finding differentially expressed genes related to patient class and survival data (Pomelo II); searching for models of class prediction (Tnasas); using random forests to search for minimal models for class prediction or for large subsets of genes with predictive capacity (GeneSrF); searching for molecular signatures and predictive genes with survival data (SignS); detecting regions of genomic DNA gain or loss (ADaCGH). The capability to send results between different applications, access to additional functional information, and parallelized computation make our suite unique and exploit features only available to web-based applications.
Keywords: aCGH; classification; microarray; parallel computing; prediction; web-based application
Year: 2007 PMID: 19455230 PMCID: PMC2675829
Source DB: PubMed Journal: Cancer Inform ISSN: 1176-9351
Figure 1. Asterias suite. Relationships between the applications currently available in Asterias. An arrow indicates the possibility of automatically transferring the output from one application (origin of the arrow) as input for another application (end of the arrow). All applications can also be accessed independently. Photo credit: the starfish is a modified image taken from the Wikipedia entry for Asterias (http://en.wikipedia.org/wiki/Image:Asterias_rubens.jpg), and belongs to Hans Hillewaert.
Summary of input and output for each application from the Asterias suite.
| Application | Input | Output (text and tables) | Output (figures) |
|---|---|---|---|
| DNMAD | GPR or custom format | Normalized log-ratios; A-values | Diagnostic plots. |
| preP | DNMAD output or EDF | Post-processed EDF, summary statistics | |
| Pomelo II | preP output or EDF, class indicator, survival time and status | Differential expression statistics and p-values | Heatmap with gene dendrogram |
| Tnasas | preP output or EDF, class indicator | Error rates, selected genes, stability assessments | Cross-validated error rates vs. number of genes |
| GeneSrF | preP output or EDF, class indicator | OOB predictions, error rates, selected genes, stability assessments | OOB error vs. number of genes, OOB predictions, importance spectrum, selection probability plots. |
| SignS | preP output or EDF, survival time and status; optional validation files. | Single-gene statistics and p-values, CV predictions, model results and parameters, stability assessments | Survival plots, dendrograms, partial-likelihood plots. |
| ADaCGH | preP output or EDF and chromosomal location (e.g. from IDconverter) | Genes and segmented regions, summary statistics | Diagnostic plots, chromosome and genome segmented plots |
| IDconverter | identifiers | Mapped identifiers (gene, clone, protein), chromosomal location, PubMed abstracts, GO terms, pathways. | |
| IDClight | URL | Same as IDconverter | |
EDF: expression (or genomic) data file. See text for details.
For example, http://idclight.bioinfo.cnio.es/idclight.prog?idtype=ug&id=Hs.100890&org=Hs
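The example URL above suggests that an IDClight query is an ordinary HTTP query string with three parameters: `idtype`, `id`, and `org`. The following Python sketch builds and parses such a URL. It is illustrative only: the function names are hypothetical, the endpoint and parameter names are copied from the example URL, and reading `ug` as a UniGene cluster identifier and `Hs` as Homo sapiens is an assumption based on the form of the example. No request is sent, and nothing here is taken from IDClight's own documentation.

```python
from urllib.parse import parse_qs, urlencode, urlsplit

# Endpoint copied from the example URL in the table footnote.
IDCLIGHT_ENDPOINT = "http://idclight.bioinfo.cnio.es/idclight.prog"


def build_idclight_url(idtype, identifier, organism):
    """Build a query URL (hypothetical helper, not part of Asterias).

    The parameter names idtype, id and org come from the example URL;
    the set of accepted values is not documented here and is assumed.
    """
    query = urlencode({"idtype": idtype, "id": identifier, "org": organism})
    return f"{IDCLIGHT_ENDPOINT}?{query}"


def parse_idclight_url(url):
    """Return the query parameters of an IDClight-style URL as a dict."""
    query = urlsplit(url).query
    # parse_qs returns a list per key; keep the first value of each.
    return {key: values[0] for key, values in parse_qs(query).items()}


if __name__ == "__main__":
    url = build_idclight_url("ug", "Hs.100890", "Hs")
    print(url)
    print(parse_idclight_url(url))
```

Building the query with `urlencode` rather than string concatenation keeps identifiers that contain reserved characters correctly escaped, which matters when the URL is generated programmatically from a gene list.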
Figure 2. Output from ADaCGH. Partial output from ADaCGH showing: (Top) the bottom of the main output screen with the thumbnails for the segmented plots; (Bottom) Genome View for one of the arrays, obtained by clicking on the uppermost thumbnail in (Top); (Center) Chromosome View for the first chromosome (obtained by clicking on the region for the first chromosome in (Bottom)), with some data-points showing their ID; (Right) the results from IDClight obtained by clicking on the ID for one of the highlighted points in (Center).
Figure 3. Output from SignS. Partial output from SignS showing: (Top left) some model quality plots (survival plots from cross-validated prediction scores); (Bottom left) part of the output of the model fitted to the complete data set; note the clickable gene names; (Top center) the half-size dendrogram for the genes with negative coefficients, showing only clusters that fulfill the minimum requirements of correlation and size; (Right) same as (Top center), but using the double-size plot; (Bottom right) the results from clicking, on either (Top center) or (Right), on the cluster leaf for Mm.257765.