Mingjie Chen, R Shyama Prasad Rao, Yiming Zhang, Cathy Xiaoyan Zhong, Jay J Thelen.
Abstract
The goal of metabolomics data pre-processing is to eliminate systematic variation so that biologically related metabolite signatures can be detected by statistical pattern recognition. Several methods have been developed to tackle batch-to-batch variation, and each has its advantages and disadvantages. In this study, we used a reference sample as a normalization standard for test samples within the same batch and expressed each metabolite value as a ratio relative to its counterpart in the reference sample. We then applied this approach to a large multi-batch data set to facilitate intra- and inter-batch data integration. Our results demonstrate that normalization to a single reference standard has the potential to minimize batch-to-batch data variation across a large, multi-batch data set.
Keywords: Batch-to-batch variation; Maize; Metabolomics; Normalization; Reference sample
Year: 2014 PMID: 25184108 PMCID: PMC4149678 DOI: 10.1186/2193-1801-3-439
Source DB: PubMed Journal: Springerplus ISSN: 2193-1801
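The ratio-to-reference normalization described in the abstract can be sketched as follows. This is a minimal illustration rather than the authors' pipeline: the function name `normalize_to_reference`, the dictionary layout, the metabolite names, and all numeric values are assumptions made up for the example.

```python
def normalize_to_reference(reference, tests):
    """Express each test-sample metabolite value as a ratio to the
    value of the same metabolite in the same-batch reference sample."""
    normalized = []
    for sample in tests:
        ratios = {}
        for metabolite, value in sample.items():
            ref_value = reference.get(metabolite)
            # A ratio is undefined if the metabolite is missing or zero
            # in the reference, so such metabolites are skipped here
            # (an assumption; the source does not say how they are handled).
            if ref_value is None or ref_value == 0:
                continue
            ratios[metabolite] = value / ref_value
        normalized.append(ratios)
    return normalized


# Hypothetical data: batch B2 reads every metabolite twice as high as B1,
# mimicking a purely multiplicative batch effect.
batches = {
    "B1": {"reference": {"glucose": 2.0, "alanine": 0.5},
           "tests": [{"glucose": 3.0, "alanine": 0.25}]},
    "B2": {"reference": {"glucose": 4.0, "alanine": 1.0},
           "tests": [{"glucose": 6.0, "alanine": 0.5}]},
}

normalized = {
    name: normalize_to_reference(batch["reference"], batch["tests"])
    for name, batch in batches.items()
}
print(normalized)
```

In this toy case the two batches yield identical ratios after normalization, because the batch effect scales the reference and the test samples by the same factor. Real batch effects need not be purely multiplicative, so the correction is generally partial.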
Figure 1. The reference samples show batch variation. The same color coding denotes replicates within the same batch. (A) The PCA shows grouping of different batches (B1 to B25) based on the sequence of analysis (green-blue-black-brown-red symbols). The samples analyzed after a column change form a distinct cluster (red symbols). As expected, replicates within batches mostly cluster together. (B) The HCA also shows clustering based on the sequence of analysis (progression: first to last day of analysis). The last ~20% of samples (dark red in progression strips) were analyzed after a column change. The bottom scale bar shows the standardized metabolite levels. (C) The RSD values vary considerably among identified forage metabolites, ranging from 7.8% (for GOX – glucose oxime hexakis (trimethylsilyl)) to as high as 174.9% (for AMEP – 2-Amino-4,6-bis(1,1-dimethylethyl)-phenol). In general, RSD is higher for metabolites at low concentration. Over 86% of the metabolites have RSD values below 60%. The X-axis is the mean metabolite concentration relative to the internal standard. See Additional file 2: Table S2 for the list of metabolites.
Figure 2. Data variability of test samples was reduced after normalization against reference samples. (A) Normalization against the reference sample decreases the RSD of over 60% of metabolites in test samples (entries 1-5). (B) A scatter plot of RSD for 98 metabolites before and after normalization is shown for entry 3. The green dots (~69% of metabolites, below the diagonal line) show a decrease and the red dots (~31%) show an increase in RSD after normalization. (C) Normalization against the reference sample decreases the percent variance of PC1 and PC2 (in PCA of entries 1-5), indicating that more components contribute after normalization.
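The before/after comparison in Figure 2 rests on the relative standard deviation (RSD), the standard deviation expressed as a percentage of the mean. The sketch below shows one way to compute it and to count the share of metabolites whose RSD drops after normalization. It assumes the sample (n-1) standard deviation, which the source does not specify; the function names and all values are hypothetical.

```python
from statistics import mean, stdev


def rsd_percent(values):
    """Relative standard deviation in percent: 100 * SD / mean.
    Uses the sample (n-1) standard deviation, an assumption here."""
    return 100.0 * stdev(values) / mean(values)


def fraction_with_lower_rsd(before, after):
    """Share of metabolites (present in both inputs) whose RSD is
    lower after normalization than before."""
    shared = [m for m in before if m in after]
    lowered = sum(
        1 for m in shared
        if rsd_percent(after[m]) < rsd_percent(before[m])
    )
    return lowered / len(shared)


# Hypothetical replicate values per metabolite, before and after
# normalization against the reference sample.
before = {"glucose": [3.0, 6.0, 4.5], "alanine": [1.0, 1.1, 0.9]}
after = {"glucose": [1.5, 1.5, 1.4], "alanine": [0.5, 1.0, 1.5]}

fraction = fraction_with_lower_rsd(before, after)
print(f"{fraction:.0%} of metabolites show lower RSD after normalization")
```

In this made-up data, normalization lowers the RSD of one metabolite and raises it for the other, which mirrors the mixed outcome in panel (B), where roughly two thirds of metabolites improve and the rest do not.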