Polina Reshetova, Age K Smilde, Antoine H C van Kampen, Johan A Westerhuis.
Abstract
BACKGROUND: High-throughput omics technologies have enabled the simultaneous measurement of many genes or metabolites. The resulting high-dimensional experimental data pose significant challenges to transcriptomics and metabolomics data analysis methods, which may produce spurious rather than biologically relevant results. One strategy to improve the results is to incorporate prior biological knowledge into the analysis. This strategy is used to reduce the solution space and/or to focus the analysis on biologically meaningful regions. In this article, we review a selection of these methods used in transcriptomics and metabolomics. We divide the reviewed methods into three groups based on the underlying mathematical model: exploratory methods, supervised methods and estimation of the covariance matrix. We discuss which prior knowledge has been used, how it is incorporated and how it modifies the mathematical properties of the underlying methods.
Year: 2014 PMID: 25033193 PMCID: PMC4101693 DOI: 10.1186/1752-0509-8-S2-S2
Source DB: PubMed Journal: BMC Syst Biol ISSN: 1752-0509
Figure 1. A general scheme of data analysis methods. f(data) is a function that does not include prior knowledge; f(data, prior knowledge) is a function that does. The "answer with prior knowledge" gives a better predictive model, is easier to interpret and/or is more reproducible than the "answer" without prior knowledge. The slider controls the strength of the influence of prior knowledge on the result.
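The role of the slider in Figure 1 can be illustrated with a minimal sketch. It is not one of the reviewed methods: it assumes, purely for illustration, that f(data, prior knowledge) is a linear blend of the sample covariance matrix and a prior target matrix. The function names, the `strength` parameter and the toy numbers are all hypothetical.

```python
from statistics import mean


def sample_covariance(rows):
    """Unbiased sample covariance of rows (samples) by columns (variables)."""
    n = len(rows)
    p = len(rows[0])
    means = [mean(r[j] for r in rows) for j in range(p)]
    return [
        [
            sum((r[i] - means[i]) * (r[j] - means[j]) for r in rows) / (n - 1)
            for j in range(p)
        ]
        for i in range(p)
    ]


def blend_with_prior(rows, prior_target, strength):
    """Hypothetical f(data, prior knowledge): a linear blend (an assumption).

    strength = 0 ignores the prior, so the result equals f(data);
    strength = 1 returns the prior target unchanged.
    """
    if not 0.0 <= strength <= 1.0:
        raise ValueError("strength must lie in [0, 1]")
    cov = sample_covariance(rows)
    p = len(cov)
    return [
        [(1.0 - strength) * cov[i][j] + strength * prior_target[i][j]
         for j in range(p)]
        for i in range(p)
    ]


# Toy data: 4 samples, 2 variables (made-up numbers).
data = [[1.0, 2.0], [2.0, 4.1], [3.0, 5.9], [4.0, 8.2]]
# Hypothetical prior: the two variables are assumed to be unrelated.
prior = [[1.0, 0.0], [0.0, 1.0]]

without_prior = blend_with_prior(data, prior, 0.0)
halfway = blend_with_prior(data, prior, 0.5)
prior_only = blend_with_prior(data, prior, 1.0)
```

Moving `strength` from 0 to 1 plays the part of the slider: the estimate moves from the purely data-driven answer toward the prior target.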