Tanya Barrett, Tugba O Suzek, Dennis B Troup, Stephen E Wilhite, Wing-Chi Ngau, Pierre Ledoux, Dmitry Rudnev, Alex E Lash, Wataru Fujibuchi, Ron Edgar.
Abstract
The Gene Expression Omnibus (GEO) at the National Center for Biotechnology Information (NCBI) is the largest fully public repository for high-throughput molecular abundance data, primarily gene expression data. The database has a flexible and open design that allows the submission, storage and retrieval of many data types. These data include microarray-based experiments measuring the abundance of mRNA, genomic DNA and protein molecules, as well as non-array-based technologies such as serial analysis of gene expression (SAGE) and mass spectrometry proteomic technology. GEO currently holds over 30,000 submissions representing approximately half a billion individual molecular abundance measurements, for over 100 organisms. Here, we describe recent database developments that facilitate effective mining and visualization of these data. Features are provided to examine data from both experiment- and gene-centric perspectives using user-friendly Web-based interfaces accessible to those without computational or microarray-related analytical expertise. The GEO database is publicly accessible through the World Wide Web at http://www.ncbi.nlm.nih.gov/geo.
Year: 2005 PMID: 15608262 PMCID: PMC539976 DOI: 10.1093/nar/gki022
Source DB: PubMed Journal: Nucleic Acids Res ISSN: 0305-1048 Impact factor: 16.971
Figure 1. Schematic diagram of the relationships between GEO Platform, Sample, DataSet and Profiles. For each gene on a Platform (e.g. Gene A), multiple Sample measurement values are generated (Sample1–Sample3). Related Samples make up a DataSet, from which multiple, individual gene profile entities are generated.
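The relationship in Figure 1 can be illustrated with a small sketch. This is not GEO's actual storage schema or API: the dictionary layout, the gene and Sample names, the values, and the `build_profiles` helper are all hypothetical, chosen only to show how per-Sample measurements in one DataSet pivot into one profile per gene.

```python
from collections import defaultdict

# Hypothetical toy DataSet: each Sample maps gene identifiers from a
# shared Platform to one measured abundance value (values are invented).
dataset_samples = {
    "Sample1": {"GeneA": 2.0, "GeneB": 5.5},
    "Sample2": {"GeneA": 3.1, "GeneB": 5.0},
    "Sample3": {"GeneA": 8.4, "GeneB": 4.7},
}


def build_profiles(samples):
    """Pivot per-Sample measurements into one profile per gene.

    Returns a dict mapping each gene to a dict of {sample_name: value}.
    """
    profiles = defaultdict(dict)
    for sample_name, measurements in samples.items():
        for gene, value in measurements.items():
            profiles[gene][sample_name] = value
    return dict(profiles)


profiles = build_profiles(dataset_samples)
```

Here `profiles["GeneA"]` holds the values for Gene A across Sample1 to Sample3, which corresponds to a single gene profile entity in the figure.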
Figure 2. Selection of GEO web screenshots and how they link with each other. (A) GEO Profiles retrieval results; each entity includes sequence identifier and DataSet information, and a thumbnail profile image. Links to other Entrez databases or related profiles are provided above the thumbnail image. (B) Expanded profile chart depicts values (red bars) and rank (blue bars) information for one gene across each Sample in a GEO DataSet. Experimental subset groupings are reflected in labels at foot of chart. (C) DataSet record includes experiment summary information, DataSet subset classifications, and access to data mining features such as hierarchical cluster heat map and ‘Query subset A versus B’ tool. (D) DataSet hierarchical cluster heat map calculated by un-centered correlation coefficient/average linkage option. Regions of interest are selected using the red image cropper box, then either expanded to view Sample and gene annotation, downloaded, charted as line plots, or linked directly to corresponding Entrez GEO Profiles records.
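Panel (D) refers to an un-centered correlation coefficient. As a rough illustration only, the sketch below assumes the common definition in which the means are not subtracted, so the coefficient equals the cosine of the angle between two gene profiles. The function name and example vectors are hypothetical, and this is not presented as GEO's own implementation; the average-linkage clustering step is not shown.

```python
import math


def uncentered_correlation(x, y):
    """Un-centered correlation of two equal-length profiles.

    Assumed definition: sum(x*y) / sqrt(sum(x^2) * sum(y^2)),
    i.e. Pearson correlation without subtracting the means.
    """
    if len(x) != len(y):
        raise ValueError("profiles must have the same length")
    dot = sum(a * b for a, b in zip(x, y))
    norm_x = math.sqrt(sum(a * a for a in x))
    norm_y = math.sqrt(sum(b * b for b in y))
    if norm_x == 0.0 or norm_y == 0.0:
        raise ValueError("correlation is undefined for an all-zero profile")
    return dot / (norm_x * norm_y)


# Hypothetical profiles: the second is a scaled copy of the first.
r_scaled = uncentered_correlation([1.0, 2.0, 3.0], [2.0, 4.0, 6.0])
r_orthogonal = uncentered_correlation([1.0, 0.0], [0.0, 1.0])
```

A clustering routine would typically use `1 - r` as the distance between two profiles, so proportional profiles sit at distance close to zero.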