Ana K Rosen Vollmar, Nicholas J W Rattray, Yuping Cai, Álvaro J Santos-Neto, Nicole C Deziel, Anne Marie Z Jukic, Caroline H Johnson.
Abstract
Metabolomics studies of the early-life exposome often use maternal urine specimens to investigate critical developmental windows, including the periconceptional period and early pregnancy. During these windows, changes in kidney function can affect urine concentration, which makes it challenging to account for differential urinary dilution across samples. Because there is no consensus on the ideal normalization approach for urinary metabolomics data, this study's objective was to determine the optimal post-analytical normalization approach for untargeted metabolomics analysis in a periconceptional cohort of 45 women. Urine samples consisted of 90 paired pre- and post-implantation samples. After untargeted mass spectrometry-based metabolomics analysis, we systematically compared the performance of three common approaches to adjust for urinary dilution: creatinine adjustment, specific gravity adjustment, and probabilistic quotient normalization (PQN). Performance was assessed using unsupervised principal components analysis, relative standard deviation (RSD) of pooled quality control samples, and orthogonal partial least-squares discriminant analysis (OPLS-DA). Results showed that creatinine adjustment is not a reliable approach to normalize urinary periconceptional metabolomics data. Specific gravity adjustment and PQN are both more reliable methods to adjust for urinary concentration, with tighter quality control sample clustering, lower RSD, and better OPLS-DA performance compared to creatinine adjustment. These findings have implications for metabolomics analyses of urine samples taken around the time of conception and in other contexts where kidney function may be altered.
Keywords: creatinine; normalization; pregnancy; probabilistic quotient normalization; specific gravity; urinary dilution
Year: 2019 PMID: 31546636 PMCID: PMC6835889 DOI: 10.3390/metabo9100198
Source DB: PubMed Journal: Metabolites ISSN: 2218-1989
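The abstract names PQN as one of the three dilution adjustments compared. As a rough illustration of the general idea only, the sketch below estimates one dilution factor per sample as the median ratio of its peak areas to a reference profile, then divides the sample by that factor. The function name `pqn_normalize`, the choice of the per-feature median across all samples as the reference, and the omission of any preliminary total-area scaling are assumptions made here, not details taken from the study.

```python
from statistics import median


def pqn_normalize(samples):
    """Probabilistic quotient normalization, minimal sketch.

    samples: list of rows (one per urine sample), each a list of
    positive peak areas in the same feature order.
    """
    n_features = len(samples[0])
    # Assumed reference profile: per-feature median across all samples.
    # Pooled QC samples are another common choice of reference.
    reference = [median(row[j] for row in samples) for j in range(n_features)]

    normalized = []
    for row in samples:
        # Quotient of each feature against the reference profile.
        quotients = [x / r for x, r in zip(row, reference) if r > 0]
        # The median quotient is taken as the sample's dilution factor.
        dilution = median(quotients)
        normalized.append([x / dilution for x in row])
    return normalized


# Hypothetical data: the same profile at three dilutions.
base = [10.0, 20.0, 30.0, 40.0]
samples = [base, [2 * x for x in base], [0.5 * x for x in base]]
result = pqn_normalize(samples)
```

In this toy case the three rows differ only by dilution, so all three normalized rows coincide with the reference profile.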
Figure 1. Unsupervised PCA plots comparing normalization approaches for RPLC data showing (A) raw data, (B) SVR-normalized data, (C) SVR and creatinine-normalized data, (D) SVR and specific gravity-normalized data, and (E) SVR and PQN-normalized data. Abbreviations: RPLC, reversed-phase liquid chromatography; SVR, support vector regression; PQN, probabilistic quotient normalization; PC1, principal component 1; PC2, principal component 2; ellipse is a 95% confidence ellipse.
Figure 2. Unsupervised PCA plots comparing normalization approaches for HILIC data showing (A) raw data, (B) SVR-normalized data, (C) SVR and creatinine-normalized data, (D) SVR and specific gravity-normalized data, and (E) SVR and PQN-normalized data. Abbreviations: HILIC, hydrophilic interaction chromatography; SVR, support vector regression; PQN, probabilistic quotient normalization; PC1, principal component 1; PC2, principal component 2; ellipse is a 95% confidence ellipse.
Comparison of relative standard deviation of peak area for quality control samples across normalization approaches. 1
| Normalization Approach | RPLC: Median RSD (IQR) | RPLC: Peaks with RSD < 0.3 (%) | HILIC: Median RSD (IQR) | HILIC: Peaks with RSD < 0.3 (%) |
|---|---|---|---|---|
| Raw Data | 0.23 (0.16–0.29) | 9827/12,811 (76.7%) | 0.33 (0.23–0.48) | 8271/18,977 (43.6%) |
| SVR | 0.15 (0.10–0.20) | 12,023/12,794 (94.0%) | 0.19 (0.13–0.27) | 15,158/18,882 (80.3%) |
| SVR and Creatinine | 0.18 (0.15–0.23) | 11,744/12,794 (91.8%) | 0.20 (0.13–0.27) | 15,104/18,882 (80.0%) |
| SVR and Specific Gravity | 0.15 (0.10–0.20) | 12,023/12,794 (94.0%) | 0.19 (0.13–0.27) | 15,158/18,882 (80.3%) |
| SVR and PQN | 0.08 (0.05–0.11) | 12,667/12,794 (99.0%) | 0.11 (0.07–0.16) | 18,106/18,882 (95.9%) |
1 SVR addresses analytical variability, while creatinine, specific gravity, and PQN address concentration variability. RPLC data represent hydrophobic metabolites, and HILIC data represent hydrophilic metabolites. Raw data has been processed using XCMS without any application of a normalization technique. Abbreviations: RPLC, reversed-phase liquid chromatography; HILIC, hydrophilic interaction chromatography; RSD, relative standard deviation; IQR, interquartile range; SVR, support vector regression; PQN, probabilistic quotient normalization.
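The RSD summaries in this table can be illustrated with a small sketch: for each feature, the RSD is the standard deviation of its peak area across QC injections divided by the mean, and the table reports the median RSD and the count of features below a cutoff. The function names and the toy data are hypothetical; the 0.3 cutoff follows the RSD < 0.3 filter mentioned in the OPLS-DA table footnote, and the use of the sample (n − 1) standard deviation is an assumption.

```python
from statistics import mean, median, stdev


def qc_rsd(qc_peak_areas):
    """Per-feature RSD across QC injections.

    qc_peak_areas: list of QC injections, each a list of peak areas
    in the same feature order.
    """
    n_features = len(qc_peak_areas[0])
    rsds = []
    for j in range(n_features):
        values = [row[j] for row in qc_peak_areas]
        # Sample standard deviation (n - 1) divided by the mean.
        rsds.append(stdev(values) / mean(values))
    return rsds


def summarize_rsd(rsds, threshold=0.3):
    """Return (median RSD, number of features below threshold, total features)."""
    passing = sum(1 for r in rsds if r < threshold)
    return median(rsds), passing, len(rsds)


# Hypothetical QC data: three injections, two features.
qc = [[100.0, 10.0], [110.0, 30.0], [90.0, 20.0]]
rsds = qc_rsd(qc)
med, passing, total = summarize_rsd(rsds)
```

Here the first feature has an RSD of 0.1 and passes the cutoff, while the second has an RSD of 0.5 and does not.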
Comparison of the mean difference in relative standard deviation across quality control samples between normalization approaches using paired t-tests. 1
| Compared Normalization Approaches | RPLC: RSD Mean Difference 3 | RPLC: p-Value | HILIC: RSD Mean Difference 3 | HILIC: p-Value |
|---|---|---|---|---|
| Raw and SVR 2 | 0.100 | | 0.160 | |
| SVR and Creatinine 2,4 | −0.040 | | −0.002 | 0.21 |
| SVR and Specific Gravity 2,4 | 0 | -- | 0 | -- |
| SVR and PQN 2 | 0.075 | | 0.100 | |
| Creatinine and Specific Gravity 2,4 | 0.040 | | 0.002 | 0.21 |
| Creatinine and PQN 2 | 0.116 | | 0.103 | |
| Specific gravity and PQN 2 | 0.075 | | 0.101 | |
1 Abbreviations: RPLC, reversed-phase liquid chromatography; HILIC, hydrophilic interaction chromatography; RSD, relative standard deviation; SVR, support vector regression; PQN, probabilistic quotient normalization; SGref, reference specific gravity; SG, specific gravity; QC, quality control sample.
2 Raw data were processed using XCMS. SVR normalization was then applied to all datasets. Creatinine, specific gravity, and PQN adjustments were carried out after SVR normalization.
3 RSD mean differences are computed in the order listed, for example, RSD(raw) − RSD(SVR), and expressed as percentages.
4 Because of the specific gravity normalization method in which SGref = SG for QC samples (see Section 4.6, Equation (1)), QC peak areas are the same as those normalized by SVR alone, and therefore there is no difference in the RSD when comparing these methods. Similarly, because QC peak areas are the same for specific gravity and SVR, comparisons between creatinine and SVR and creatinine and specific gravity result in the same mean differences in terms of absolute value, and the same p-values.
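Footnote 4 states that specific gravity normalization leaves QC peak areas unchanged because SGref = SG for QC samples. Equation (1) is not reproduced in this excerpt, so the sketch below assumes the widely used ratio form, in which each peak area is scaled by (SGref − 1)/(SG − 1). The function name `sg_adjust` and the numeric values are hypothetical. Under that assumed form, the factor is exactly 1 whenever the sample's specific gravity equals the reference, which matches the zero RSD difference reported for SVR versus specific gravity.

```python
def sg_adjust(peak_areas, sg_sample, sg_ref):
    """Scale peak areas by an assumed specific gravity ratio.

    Assumed form: area * (sg_ref - 1) / (sg_sample - 1).
    This is a common convention, not necessarily the study's Equation (1).
    """
    factor = (sg_ref - 1.0) / (sg_sample - 1.0)
    return [a * factor for a in peak_areas]


# QC-like case: sample SG equals the reference, so nothing changes.
qc_adjusted = sg_adjust([100.0, 50.0], sg_sample=1.020, sg_ref=1.020)

# A more dilute hypothetical sample is scaled up toward the reference.
dilute_adjusted = sg_adjust([100.0, 50.0], sg_sample=1.010, sg_ref=1.020)
```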
Results of OPLS-DA analysis of variation associated with pre- and post-implantation status, and Wilcoxon paired signed-rank test of changes in peak area from pre- to post-implantation. 1
| Normalization Approach | RPLC: Features 3 | RPLC: R2X | RPLC: R2Y | RPLC: Q2 | RPLC: Discriminant features (VIP > 1 and q < 0.05) 4 | HILIC: Features 3 | HILIC: R2X | HILIC: R2Y | HILIC: Q2 | HILIC: Discriminant features (VIP > 1 and q < 0.05) 4 |
|---|---|---|---|---|---|---|---|---|---|---|
| Raw 2 | 9816 | 0.18 | 0.93 | 0.66 | 1303 | 8271 | 0.25 | 0.87 | 0.51 | 1385 |
| SVR 2 | 12,023 | 0.13 | 0.95 | 0.62 | 1425 | 15,158 | 0.22 | 0.90 | 0.49 | 2161 |
| Creatinine 2 | 11,744 | 0.36 | 0.82 | 0.58 | 1589 | 15,104 | 0.37 | 0.75 | 0.27 | 1491 |
| Specific Gravity 2 | 12,023 | 0.37 | 0.86 | 0.62 | 1591 | 15,158 | 0.20 | 0.87 | 0.53 | 2358 |
| PQN 2 | 12,667 | 0.18 | 0.94 | 0.69 | 1551 | 18,106 | 0.25 | 0.91 | 0.52 | 2578 |
1 Abbreviations: OPLS-DA, orthogonal partial least-squares discriminant analysis; RPLC, reversed-phase liquid chromatography; HILIC, hydrophilic interaction chromatography; VIP, variable importance in projection; SVR, support vector regression; PQN, probabilistic quotient normalization.
2 Raw data were processed using XCMS. SVR normalization was then applied to all datasets. Creatinine, specific gravity, and PQN normalizations were then carried out after SVR normalization.
3 Only those features with RSD < 0.3 were included in this analysis.
4 Number of discriminant features with VIP > 1 based on OPLS-DA analysis, and q < 0.05 based on the Wilcoxon paired signed-rank test.
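The selection rule in footnote 4 (VIP > 1 and q < 0.05) can be sketched as a simple filter. The excerpt does not say how q-values were derived from the Wilcoxon p-values, so the Benjamini-Hochberg adjustment used here is an assumption, as are the function names and the toy inputs. The VIP scores and p-values are taken as given, computed elsewhere by an OPLS-DA model and a paired signed-rank test.

```python
def bh_adjust(p_values):
    """Benjamini-Hochberg adjusted p-values (assumed definition of q)."""
    m = len(p_values)
    # Feature indices ordered from smallest to largest p-value.
    order = sorted(range(m), key=lambda i: p_values[i])
    q = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        q[i] = running_min
    return q


def select_discriminant(vip, p_values, vip_cut=1.0, q_cut=0.05):
    """Indices of features with VIP above vip_cut and q below q_cut."""
    q = bh_adjust(p_values)
    return [i for i in range(len(vip)) if vip[i] > vip_cut and q[i] < q_cut]


# Hypothetical inputs for four features.
vip = [1.5, 2.0, 0.8, 1.2]
p_values = [0.001, 0.04, 0.03, 0.5]
q_values = bh_adjust(p_values)
selected = select_discriminant(vip, p_values)
```

In this toy example only the first feature passes both cutoffs: the second has a high VIP but an adjusted q just above 0.05, and the third fails the VIP cutoff.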
Figure 3. Pooled sampling strategy for pre- and post-implantation urine specimens. The pre-implantation sample depicted here is from a theoretical conception cycle during which ovulation occurs on day 14 (the mode in this population). In most conceptions, implantation occurs approximately 8–10 days after ovulation. Gestational weeks are measured from the start of the last menstrual period (LMP), so post-implantation early pregnancy pooled samples span 3–6 weeks gestation.