Xuewei Wang, Ming Wu, Zheng Li, Christina Chan.
Abstract
The detection and analysis of steady-state gene expression has become routine. Time-series microarrays are of growing interest to systems biologists for deciphering the dynamic nature and complex regulation of biosystems. Most temporal microarray data sets contain only a limited number of time points, giving rise to short time-series data, which pose challenges for traditional methods of extracting meaningful information. Obtaining useful information from the wealth of short time-series data requires addressing the problems that arise from limited sampling. Current efforts have shown promise in improving the analysis of short time-series microarray data, although challenges remain. This commentary addresses recent advances in methods for short time-series analysis, including simplification-based approaches and the integration of multi-source information. Nevertheless, further studies and development of computational methods are needed to provide practical solutions that fully exploit the potential of this data.
Year: 2008 PMID: 18605994 PMCID: PMC2474593 DOI: 10.1186/1752-0509-2-58
Source DB: PubMed Journal: BMC Syst Biol ISSN: 1752-0509
Figure 1. The general process of [...] The data then undergoes pre-processing procedures, such as normalization and quality evaluation. Next, data mining techniques are used to discover patterns or characteristics, identify related pathways, or reconstruct system networks for biological processes from short time-series data. To address the limited sampling in short time-series data, two strategies are introduced into the general process of microarray analysis. Simplification strategies reduce the data to discrete representations based on trends or states with respect to time, to achieve more interpretable and biologically meaningful clusters. Such conceptual discretization is part of the pre-processing step, prior to data mining. Incorporating multi-source information takes a different approach: multi-source data, including various omics databases and prior biological information, are collected and integrated to obtain a comprehensive dataset and enhance the information content. To minimize the heterogeneity of omics data from different experiments, standardization can be, and has been, imposed on omics databases. Current standards for high-throughput databases include MIAME, MIAPE, MSI, and MIMIx. MIAME has been implemented in the GEO and ArrayExpress microarray databases. The integration of various omics databases or prior biological information can enhance the effectiveness and efficiency of mining and interpreting short time-series data to achieve biological discoveries. For example, multi-source prior biological information, i.e., the prior noise distribution, has been proposed to enhance the performance of data mining and network inference [43,44]. In addition, pathway and functional knowledge and metabolic data from different databases have enhanced clustering results and pathway identification [39-42]. These studies are discussed and referenced in the text.
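The simplification strategy described in the caption can be illustrated with a minimal sketch. It is not the method of any study cited here; it only assumes that each gene's profile is reduced to a string of trend symbols between consecutive time points, and that genes sharing a string are grouped together. The function names, the threshold of 0.5, and the expression values are all hypothetical.

```python
from collections import defaultdict


def discretize_trends(profile, threshold=0.5):
    """Reduce a short time series to trend symbols.

    Each consecutive pair of time points becomes 'U' (up), 'D' (down),
    or 'N' (no change), depending on whether the difference exceeds
    the (hypothetical) threshold.
    """
    symbols = []
    for prev, curr in zip(profile, profile[1:]):
        delta = curr - prev
        if delta > threshold:
            symbols.append("U")
        elif delta < -threshold:
            symbols.append("D")
        else:
            symbols.append("N")
    return "".join(symbols)


def group_by_trend(profiles, threshold=0.5):
    """Group gene names by their discretized trend string."""
    groups = defaultdict(list)
    for gene, profile in profiles.items():
        groups[discretize_trends(profile, threshold)].append(gene)
    return dict(groups)


# Hypothetical log2 expression values at four time points.
profiles = {
    "geneA": [0.0, 1.2, 2.5, 2.6],
    "geneB": [0.1, 1.0, 2.0, 2.2],
    "geneC": [1.5, 0.2, 0.1, 1.4],
}

clusters = group_by_trend(profiles)
for pattern, genes in clusters.items():
    print(pattern, genes)
```

With only four time points, each profile collapses to a three-symbol string, so genes with similar temporal behavior fall into the same group even when their absolute expression levels differ. The threshold choice governs how much noise is absorbed into the "no change" symbol and would need to be tuned to the data.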