
GEDAS - Gene Expression Data Analysis Suite.

Tangirala Venkateswara Prasad, Ravindra Pentela Babu, Syed Ismail Ahson.

Abstract

Currently available microarray gene expression data analysis tools lack standardization at various levels. We developed GEDAS (Gene Expression Data Analysis Suite) to bring various tools and techniques together in one system. It also provides a number of other features, such as a large collection of distance measures and pre-processing techniques. The software is an extension of Cluster 3.0 (itself based on the Eisen Lab's Cluster and TreeView software). GEDAS allows different datasets to be analyzed with algorithms such as k-means, hierarchical clustering (HC), SVD/PCA and SVM, in addition to Kohonen's SOM and LVQ. AVAILABILITY: http://gedas.bizhat.com/gedas.htm.


Year:  2006        PMID: 17597861      PMCID: PMC1891661          DOI: 10.6026/97320630001083

Source DB:  PubMed          Journal:  Bioinformation        ISSN: 0973-2063


Background

This work attempts to integrate different tools and techniques for gene expression analysis, with the aim of standardizing them for efficient use. In this context, a number of tools such as Cluster/TreeView [1], SNOMAD [2], Cluster 3.0 [3], GEDA suite [4], GEPAS [5], J-Express [6], Cleaver 1.0 [7] and Expression Profiler [8] have been extensively studied and significantly improved in recent years. Here, we describe GEDAS (Gene Expression Data Analysis Suite), software developed by integrating techniques such as SOM, LVQ, k-means, hierarchical clustering, SVM [9] and PCA. The software supports a number of visualization techniques and gene expression data pre-processing algorithms [1–4], and it contains over 10 visualizations and 19 distance measures.
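To make the notion of a distance measure concrete, the following minimal Python sketch shows two measures that are commonly applied to expression profiles: Euclidean distance and Pearson correlation distance. The text does not enumerate the 19 measures in GEDAS, so the choice of these two, the function names and the example profiles are assumptions made purely for illustration.

```python
import math


def euclidean_distance(x, y):
    """Straight-line distance between two equal-length expression profiles."""
    if len(x) != len(y):
        raise ValueError("profiles must have the same length")
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))


def pearson_distance(x, y):
    """1 - Pearson correlation: near 0 for correlated, near 2 for anti-correlated profiles."""
    if len(x) != len(y):
        raise ValueError("profiles must have the same length")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant profile")
    return 1.0 - cov / math.sqrt(var_x * var_y)


# Hypothetical profiles: gene_b has the same shape as gene_a at twice the scale.
gene_a = [1.0, 2.0, 3.0, 4.0]
gene_b = [2.0, 4.0, 6.0, 8.0]
print(euclidean_distance(gene_a, gene_b))
print(pearson_distance(gene_a, gene_b))
```

The two measures disagree on this pair by design: the Euclidean distance is large because the magnitudes differ, while the correlation distance is essentially zero because the shapes match. This is one reason a suite would offer a choice of measures.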

Methodology

The GEDAS software has been developed as stand-alone software for analysis of microarray gene expression data using Visual Basic and Visual C++ programming languages. Microarray datasets can be loaded in plain text file, MS Excel or MS Access formats. The software uses Crystal Reports for generating outputs. A snapshot of GEDAS is shown in Figure 1.
Figure 1

A snapshot of GEDAS.
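As an illustration of the plain text input mentioned above, here is a small Python sketch that reads a tab-delimited expression table. The layout (a header row of sample names, gene identifiers in the first column, empty cells for missing values) is an assumption borrowed from common microarray text formats; the text does not specify the exact layout GEDAS expects, and the function name is hypothetical.

```python
import csv
import io


def load_expression_table(handle):
    """Parse a tab-delimited table into (samples, genes, values).

    Assumed layout: the first row holds sample names, the first column
    holds gene identifiers, and empty cells mark missing values.
    """
    reader = csv.reader(handle, delimiter="\t")
    header = next(reader)
    samples = header[1:]
    genes = []
    values = []
    for row in reader:
        if not row:
            continue  # skip blank lines
        genes.append(row[0])
        values.append([float(cell) if cell.strip() else None for cell in row[1:]])
    return samples, genes, values


# Hypothetical two-gene, three-sample input with one missing value.
example = io.StringIO(
    "gene\ts1\ts2\ts3\n"
    "YAL001C\t0.5\t-1.2\t\n"
    "YAL002W\t1.1\t0.0\t2.3\n"
)
samples, genes, values = load_expression_table(example)
print(samples, genes, values)
```

Missing values are kept as `None` rather than replaced with zero, so that any later pre-processing step can decide how to handle them.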

Utility

The software facilitates various levels of data manipulation during pre-processing. GEDAS generates at least 6 different outputs for any analysis, unlike many other tools, which produce just one. A whole genome visualization tool is introduced in this development [10]. In addition to traditional plots and graphs such as scatter plots and histograms, the temporal (or wave) graph, tree view, tree map and whole genome view were standardized, developed and integrated into the software. We evaluated the tools using breast cancer, mouse (Mus musculus), Arabidopsis thaliana, Homo sapiens and sugarcane datasets. Another important inclusion is the representation of hierarchical clustering output in the form of a temporal (or wave) graph. In GEDAS, results are presented in a number of ways described elsewhere [4, 11–16]. The techniques implemented in GEDAS are given in Table 1. The software facilitates sorting of data by rows, columns or both. The output can be exported in PDF, BMP, GIF and JIF formats.
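One pre-processing step often applied to expression matrices before clustering is mean-centering of each gene (row). The sketch below shows that step in Python. Which pre-processing operations GEDAS offers is not listed in the text, so this particular step, the function name and the handling of missing values as `None` are assumptions for illustration.

```python
def center_rows(matrix):
    """Subtract each row's mean from its entries, leaving missing (None) values untouched."""
    centered = []
    for row in matrix:
        present = [v for v in row if v is not None]
        if not present:
            # A row with no measurements cannot be centered; copy it unchanged.
            centered.append(list(row))
            continue
        mean = sum(present) / len(present)
        centered.append([None if v is None else v - mean for v in row])
    return centered


# Hypothetical matrix: two genes, three samples, one missing value.
raw = [[1.0, 2.0, 3.0], [4.0, None, 6.0]]
print(center_rows(raw))
```

After centering, each gene's values describe its variation around its own average, which makes profiles with different baseline expression levels comparable.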
Table 1

The visualization techniques included in GEDAS and their applicability to raw data, pre-processed data and each algorithm.

Visualization / Algorithm | Raw data | Pre-processed data | SOM | K-Means | LVQ | HC | PCA (gene) | SVM
Histogram
Checks view
Microarray
Whole sample
Proximity map
Temporal (incl. zoomed cluster view)
Textual
PC view
Eigen graph
Tree view
Scatter plot & M vs. A plot
Box-Whisker plot
Gene Ontology
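
For readers unfamiliar with the M vs. A plot listed in Table 1, the following Python sketch computes its coordinates using the conventional definition for two-channel arrays: M is the log2 ratio of the two channel intensities and A is their average log2 intensity. How GEDAS computes these values is not stated in the text, so the formula is the standard convention rather than a description of the software, and the function name and intensities are hypothetical.

```python
import math


def ma_values(red, green):
    """Return (M, A) pairs for paired channel intensities.

    M = log2(R) - log2(G)          (log ratio)
    A = 0.5 * (log2(R) + log2(G))  (average log intensity)
    """
    if len(red) != len(green):
        raise ValueError("channels must have the same number of spots")
    points = []
    for r, g in zip(red, green):
        if r <= 0 or g <= 0:
            raise ValueError("intensities must be positive to take logarithms")
        log_r = math.log2(r)
        log_g = math.log2(g)
        points.append((log_r - log_g, 0.5 * (log_r + log_g)))
    return points


# Hypothetical intensities: the first spot is four-fold higher in the red channel.
points = ma_values([1024.0, 256.0], [256.0, 256.0])
print(points)
```

Plotting M against A makes intensity-dependent bias visible, since spots with no differential expression should scatter around M = 0 across the whole range of A.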

Future work

In future development, we plan to incorporate other visualization tools [4–17], including 2D and 3D score plots, profile plots, scatter plots (3D scatter plots, PCA visualization, ISOMAP visualization and multi-dimensional scaling), Venn diagrams for visualizing similar elements in microarrays, and SOM visualization of clustering results. We also plan to provide the software through a web interface. Our other plans include the addition of robust distance measures and data mining tools (fuzzy c-means and agglomerative clustering).
