Yoav D Shaul, Bingbing Yuan, Prathapan Thiru, Andy Nutter-Upham, Scott McCallum, Carolyn Lanzkron, George W Bell, David M Sabatini.
Abstract
The oncogenic transformation of normal cells into malignant, rapidly proliferating cells requires major alterations in cell physiology. For example, transformed cells remodel their metabolic processes to meet the additional demand for cellular building blocks. We recently identified metabolic processes essential for tumor progression by developing a methodological analysis of gene expression. Here, we present the Metabolic gEne RApid Visualizer (MERAV, http://merav.wi.mit.edu), a web-based tool that can query a database comprising ∼4300 microarrays, representing human gene expression in normal tissues, cancer cell lines and primary tumors. MERAV has been designed as a powerful tool for whole genome analysis that offers multiple advantages: one can search many genes in parallel; compare gene expression among different tissue types as well as between normal and cancer cells; download raw data; generate heatmaps; and use its internal statistical tool. Most importantly, MERAV has been designed as a unique tool for analyzing metabolic processes, as it includes matrices specifically focused on metabolic genes and is linked to the Kyoto Encyclopedia of Genes and Genomes pathway search.
Year: 2015 PMID: 26626150 PMCID: PMC4702927 DOI: 10.1093/nar/gkv1337
Source DB: PubMed Journal: Nucleic Acids Res ISSN: 0305-1048 Impact factor: 16.971
Figure 1. Generation of the MERAV database. (A) Schematic presentation of the procedures used to generate the MERAV database. (I) Human gene expression data were collected from the following resources: Cancer Cell Line Encyclopedia (CCLE), GlaxoSmithKline (GSK), Gene Expression Omnibus database (GEO), Human Body Index (HBI) and Expression Project for Oncology (ExpO). (II) The data were assembled and normalized together, followed by quality control and removal of low-quality arrays. (III) The database was renormalized and non-specific probes were removed. (IV) The arrays were annotated to obtain a more complete and unified annotation style. (B) Relative proportion of each component array type in the database. The number in parentheses indicates the number of arrays of each type. (C) Arrays from identical cell lines show a higher Pearson correlation, despite having been generated in different experiments. Using all arrays from cancer cell lines (2,016 samples), the Pearson correlation between each pair of arrays was calculated. The boxplot represents the distribution of the correlations for the non-identical and the identical cell lines. The p values for the indicated comparisons were determined using Student's t-test.
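The pairwise comparison described in panel (C) can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the function names, the `(cell_line_name, expression_vector)` input layout and the toy values are all assumptions made for the example, and no MERAV data are used.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length vectors."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def split_pairwise_correlations(arrays):
    """Correlate every pair of arrays and group by whether the cell line matches.

    `arrays` is a list of (cell_line_name, expression_vector) tuples
    (a hypothetical layout chosen for this sketch).
    """
    identical, non_identical = [], []
    for (name_a, expr_a), (name_b, expr_b) in combinations(arrays, 2):
        r = pearson(expr_a, expr_b)
        (identical if name_a == name_b else non_identical).append(r)
    return identical, non_identical


# Toy, made-up values purely for illustration (not MERAV data).
arrays = [
    ("LINE_A", [1.0, 2.0, 3.0, 4.0]),
    ("LINE_A", [1.1, 2.1, 2.9, 4.2]),
    ("LINE_B", [4.0, 3.0, 2.0, 1.0]),
]
identical, non_identical = split_pairwise_correlations(arrays)
```

The two resulting lists correspond to the two boxplot groups in the caption; comparing them with a two-sample test would then give the kind of p value reported there.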
Number of arrays from each source
| Source | Number of arrays |
|---|---|
| EXPO | 1,312 |
| GSK | 870 |
| CCLE | 729 |
| GEO-C | 506 |
| HBI | 426 |
| GEO-N | 317 |
| GEO-P | 292 |
The MERAV database was generated from the indicated sources, with the number of constituent arrays shown.
Number of cell line replicates
| Number of representative arrays | Number of cell lines |
|---|---|
| 1 | 469 |
| 2 | 83 |
| 3 | 128 |
| 4 | 143 |
| 5 | 33 |
| 6 | 23 |
| 7 and up | 3 |
Some of the cell lines in the MERAV database are represented by multiple arrays, summarized in this table.
For example, 469 cell lines have data from a single array, 83 have data from two arrays, etc.
Figure 2. MERAV can detect known gene expression profiles. (A) Expression of Aldolase isoenzymes in different normal tissues. The three Aldolase isoenzymes were searched in MERAV for their expression in selected normal tissues. The results are shown as a bar graph (generated by MERAV). The bar colors were adjusted (a MERAV feature) to indicate the tissue of origin. The color legend is shown in the upper right-hand corner. CNS, Central Nervous System. (B) Expression of Aldolase isoenzymes in different normal tissues. The same search parameters as in (A), with the results presented as a boxplot. This figure was generated using MERAV without any additional tools. CNS, Central Nervous System. (C) RRM1, RRM2 and TYMS expression is elevated in cancer cell lines. These three genes were searched in MERAV. For each tissue, a boxplot was generated showing the expression in normal tissues (orange) and cancer cell lines (green). This figure was generated using MERAV without any additional tools. CNS, Central Nervous System. (D) RRM1, RRM2 and TYMS expression is elevated in cancer cell lines. The table shows the p values for each tissue and gene indicated in (C). The expression data were downloaded, and the distributions in normal tissues and in cancer cell lines were compared for each tissue. The p values for the indicated comparisons were determined using Student's t-test and calculated in R.
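The per-tissue comparison in panel (D) rests on a two-sample Student's t-test. The authors report computing it in R; the sketch below is only an illustrative Python equivalent of the test statistic, assuming the classic pooled-variance (equal variance) form. The function name `student_t` and the expression values are hypothetical and not taken from MERAV.

```python
from math import sqrt
from statistics import mean, variance


def student_t(sample_a, sample_b):
    """Two-sample Student's t statistic (pooled variance) and degrees of freedom."""
    n_a, n_b = len(sample_a), len(sample_b)
    df = n_a + n_b - 2
    # Pooled variance weights each sample variance by its degrees of freedom.
    pooled_var = ((n_a - 1) * variance(sample_a) + (n_b - 1) * variance(sample_b)) / df
    t = (mean(sample_a) - mean(sample_b)) / sqrt(pooled_var * (1 / n_a + 1 / n_b))
    return t, df


# Made-up expression values for one hypothetical gene in one tissue.
normal = [5.0, 5.2, 4.8, 5.1]
cancer = [7.9, 8.3, 8.1, 7.7]
t_stat, df = student_t(cancer, normal)
```

A positive `t_stat` here means higher mean expression in the cancer group. Turning the statistic into a p value requires the t distribution with `df` degrees of freedom, which the Python standard library does not provide; a statistics package (for example `scipy.stats.ttest_ind`) or R's `t.test` would normally be used for that step.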