Gonçalo D S Correia1,2, Panteleimon G Takis1,2, Caroline J Sands1,2, Anna M Kowalka3,4, Tricia Tan3,4, Lance Turtle5, Antonia Ho6, Malcolm G Semple7,8, Peter J M Openshaw9, J Kenneth Baillie10, Zoltán Takáts1,2, Matthew R Lewis1,2.
Abstract
Normalization to account for variation in urinary dilution is crucial for interpretation of urine metabolic profiles. Probabilistic quotient normalization (PQN) is used routinely in metabolomics but is sensitive to systematic variation shared across a large proportion of the spectral profile (>50%). Where 1H nuclear magnetic resonance (NMR) spectroscopy is employed, the presence of urinary protein can elevate the spectral baseline and substantially impact the resulting profile. Using 1H NMR profile measurements of spot urine samples collected from hospitalized COVID-19 patients in the ISARIC 4C study, we determined that PQN coefficients are significantly correlated with observed protein levels (r² = 0.423, p < 2.2 × 10⁻¹⁶). This correlation was significantly reduced (r² = 0.163, p < 2.2 × 10⁻¹⁶) when using a computational method for suppression of macromolecular signals known as small molecule enhancement spectroscopy (SMolESY) for proteinic baseline removal prior to PQN. These results highlight proteinuria as a common yet overlooked source of bias in 1H NMR metabolic profiling studies which can be effectively mitigated using SMolESY or other macromolecular signal suppression methods before estimation of normalization coefficients.
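As an illustration of why PQN is sensitive to a baseline shared across much of the spectrum, the following is a minimal sketch of the quotient-median step, not the authors' implementation. It assumes the reference profile is the variable-wise median across samples and omits any preliminary scaling step; the function names and the toy intensities are hypothetical.

```python
from statistics import median


def pqn_coefficients(spectra, reference=None):
    """Return one dilution coefficient per spectrum.

    spectra: list of equal-length lists of intensities (one list per sample).
    reference: optional reference profile; assumed here to default to the
    variable-wise median across all samples.
    """
    if reference is None:
        reference = [median(column) for column in zip(*spectra)]
    coefficients = []
    for row in spectra:
        # Quotient of each variable against the reference; skip variables
        # where the reference is not positive to avoid division by zero.
        quotients = [x / r for x, r in zip(row, reference) if r > 0]
        coefficients.append(median(quotients))
    return coefficients


def pqn_normalize(spectra, reference=None):
    """Divide each spectrum by its PQN coefficient."""
    coefficients = pqn_coefficients(spectra, reference)
    normalized = [[x / c for x in row] for row, c in zip(spectra, coefficients)]
    return normalized, coefficients


# Toy data (hypothetical): one profile measured at three dilutions.
base = [1.0, 2.0, 3.0, 4.0, 5.0]
dilutions = [0.5, 1.0, 2.0]
spectra = [[d * x for x in base] for d in dilutions]
normalized, coefs = pqn_normalize(spectra)

# A constant offset added to every variable, standing in for a raised
# baseline, shifts the coefficient even though the dilution is unchanged.
shifted = [[x + 1.0 for x in base]]
biased = pqn_coefficients(shifted, reference=base)
```

In the toy data the recovered coefficients track the dilution factors, while the offset spectrum receives a coefficient above 1 despite having the same dilution as the reference. This is the kind of bias that removing the macromolecular baseline before estimating the coefficients is meant to reduce.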
Year: 2022 PMID: 35503092 PMCID: PMC9118196 DOI: 10.1021/acs.analchem.2c00466
Source DB: PubMed Journal: Anal Chem ISSN: 0003-2700 Impact factor: 8.008
Figure 1. (a) The 1H NMR spectrum of albumin (black line) compared to a real urine sample containing the same amount of albumin (∼7 mg/mL). (b) Eighty-five urine 1H NMR profiles from COVID-19 patients, focusing on the backbone −NH (left panel) and parts of the methyl proteinic protons (right panel), showing the effect of proteinuria on the spectral baseline. (c) SMolESY applied to the 1H NMR spectrum of albumin (∼7 mg/mL) and to a real urine sample with the same amount of albumin. In both cases SMolESY suppresses the broad signals and homogenizes the baseline, enhancing the sharp signals from small molecules (i.e., metabolites).
Figure 2. (a) Fifty 1H NMR spectra from the COVID-19 cohort focusing on the proteinic methyl group-containing spectral region between 0.2 and 0.5 ppm (upper panel). After SMolESY filtering, the resulting spectra (bottom panel) are free from the sharp signals of small metabolites (or the remaining signals contain equal negative and positive parts with an almost zero integral), and their integration provides an estimate of the total protein. (b) The absolute quantification of total urinary protein via NMR correlates highly with the protein concentration measured by clinical methods (for both NMR and clinical methods see Supporting Information).
Figure 3. Agreement between PQN coefficients estimated from the standard 1H NMR spectra and the corresponding SMolESY processed data. The linear regression trendline (dashed red line) was estimated with the orthogonal least-squares Passing–Bablok method. The regression coefficients, Pearson correlation coefficient (ρ), and the p-value from the two-sided correlation significance test are shown in the top left corner. Data points are colored by the natural logarithm of the estimated protein concentration (mg/mL).
Figure 4. PQN coefficient variance explained (r², estimated from the linear regression models plotted in red) by protein concentration, in the (a) standard and (b) SMolESY processed NMR spectra. Protein values equal to or below the limit of detection (LOD) = 0.11 mg/mL were excluded from this analysis (final n = 810). PQN coefficients were square root transformed, and urine protein measurements were log-transformed.
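The variance-explained calculation described in this caption can be sketched as follows. This is a hedged illustration rather than the authors' code: it assumes a natural logarithm for the protein transform and a simple one-predictor least-squares fit, for which r² equals the squared Pearson correlation. The function name and the example values are hypothetical; only the LOD of 0.11 mg/mL comes from the caption.

```python
import math

LOD = 0.11  # mg/mL, limit of detection stated in the caption


def variance_explained(pqn_coefficients, protein_mg_ml, lod=LOD):
    """Return (r_squared, n_used) for sqrt(PQN) regressed on log(protein).

    Samples with protein at or below the LOD are excluded. The natural
    logarithm is an assumption; the caption only says "log-transformed".
    """
    pairs = [
        (math.log(p), math.sqrt(c))
        for c, p in zip(pqn_coefficients, protein_mg_ml)
        if p > lod
    ]
    n = len(pairs)
    mean_x = sum(x for x, _ in pairs) / n
    mean_y = sum(y for _, y in pairs) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in pairs)
    syy = sum((y - mean_y) ** 2 for _, y in pairs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in pairs)
    # For a single predictor, r^2 is the squared Pearson correlation.
    return (sxy * sxy) / (sxx * syy), n


# Hypothetical values: the first sample is below the LOD and is dropped;
# the others lie exactly on the line sqrt(PQN) = 0.5 * ln(protein) + 3.
protein = [0.05, 0.2, 1.0, 5.0]
pqn = [1.0] + [(0.5 * math.log(p) + 3.0) ** 2 for p in protein[1:]]
r_squared, n_used = variance_explained(pqn, protein)
```

With these constructed values the three retained points are perfectly collinear after transformation, so the function returns an r² of essentially 1 from three samples. Real data would of course scatter around the fitted line.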
Figure 5. Correlation between creatinine and PQN coefficients estimated from (a) the standard 1H NMR or (b) SMolESY processed spectra. The Pearson correlation coefficient and p-value from the two-sided correlation significance test are shown in each panel. Data points are colored by the natural logarithm of the estimated protein concentration. PQN coefficients and creatinine measurements were square root transformed. (c,d) The residuals from the OLS regression trendlines (dashed red lines in panels a and b) and their residual association with total urine protein (log-transformed). Protein values equal to or below the LOD = 0.11 mg/mL were excluded in panels c and d (final n = 810).
Figure 6. PCA scores plots for the SMolESY processed data set normalized with the PQN coefficients estimated from the (a) standard 1D and (b) SMolESY processed NMR spectra. NMR data were unit-variance scaled. Protein values were log-transformed, and values equal to or below the LOD were imputed by replacement with the LOD value (0.11 mg/mL).