

Choosing proper normalization is essential for discovery of sparse glycan biomarkers.

Hae-Won Uh¹, Lucija Klarić², Ivo Ugrina³, Gordan Lauc⁴, Age K Smilde⁵, Jeanine J Houwing-Duistermaat⁶.

Abstract

Rapid progress in high-throughput glycomics analysis enables researchers to conduct large sample studies. Typically, the between-subject differences in total abundance of raw glycomics data are very large, and these differences must be reduced to make measurements comparable across samples. Essentially there are two ways to approach this issue: row-wise and column-wise normalization. In glycomics, the differences per subject are usually forced to be exactly zero by scaling each sample so that the sum of all glycan intensities equals 100%. This total area (row-wise) normalization (TA) results in so-called compositional data, rendering many standard multivariate statistical methods inappropriate or inapplicable. Moreover, ignoring the compositional nature of the data may lead to spurious results. Alternatively, the raw data can be log-transformed prior to column-wise normalization and the application of standard statistical tools. So far, there is no clear consensus on the appropriate normalization method for glycomics data, nor is a systematic investigation of the impact of TA on downstream analysis available to justify its choice. Our motivation lies in efficient variable selection to identify glycan biomarkers, with regard to both accurate prediction and interpretability of the chosen model. Via extensive simulations we investigate how different normalization methods affect variable selection and compare their performance. We also address the effect of various types of measurement error in glycans: additive, multiplicative and two-component error. We show that when sample-wise differences are not large, row-wise normalization (like TA) can have deleterious effects on variable selection and prediction.
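The two normalization routes and the three error types named in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' code: the function names, the simulated data, and the error parametrization (observed = true * exp(eta) + eps, with Gaussian eta and eps) are assumptions made for illustration, since the abstract does not specify them.

```python
import numpy as np


def total_area_normalize(raw):
    """Row-wise TA normalization: each sample's intensities sum to 100."""
    raw = np.asarray(raw, dtype=float)
    return 100.0 * raw / raw.sum(axis=1, keepdims=True)


def log_column_normalize(raw):
    """Log-transform, then center and scale each glycan (column)."""
    logged = np.log(np.asarray(raw, dtype=float))
    centered = logged - logged.mean(axis=0)
    return centered / logged.std(axis=0, ddof=1)


def add_measurement_error(true, rng, sd_add=0.0, sd_mult=0.0):
    """Assumed two-component error: true * exp(eta) + eps.

    sd_mult = 0 gives purely additive error,
    sd_add = 0 gives purely multiplicative error.
    """
    eta = rng.normal(0.0, sd_mult, size=true.shape)  # multiplicative part
    eps = rng.normal(0.0, sd_add, size=true.shape)   # additive part
    return true * np.exp(eta) + eps


# Hypothetical data: 6 samples (rows) by 4 glycans (columns).
rng = np.random.default_rng(0)
true = rng.lognormal(mean=3.0, sigma=0.5, size=(6, 4))
observed = add_measurement_error(true, rng, sd_add=0.1, sd_mult=0.05)

ta = total_area_normalize(observed)   # compositional: rows sum to 100
lc = log_column_normalize(observed)   # columns have mean 0, unit variance
```

After TA normalization every row carries the same total, so the glycan values within a sample are no longer free to vary independently; the log-transformed, column-normalized data carry no such constraint.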

Entities: Chemical

Substances: Biomarkers

Year: 2020 PMID: 32211690 DOI: 10.1039/c9mo00174c

Source DB: PubMed Journal: Mol Omics ISSN: 2515-4184

Cited

4 in total

1. Serum N-glycomic profiling may provide potential signatures for surveillance of COVID-19.

Authors: Yongjing Xie; Michael Butler
Journal: Glycobiology Date: 2022-09-19 Impact factor: 5.954

2. Systematic Evaluation of Normalization Methods for Glycomics Data Based on Performance of Network Inference.

Authors: Elisa Benedetti; Nathalie Gerstner; Maja Pučić-Baković; Toma Keser; Karli R Reiding; L Renee Ruhaak; Tamara Štambuk; Maurice H J Selman; Igor Rudan; Ozren Polašek; Caroline Hayward; Marian Beekman; Eline Slagboom; Manfred Wuhrer; Malcolm G Dunlop; Gordan Lauc; Jan Krumsiek
Journal: Metabolites Date: 2020-07-02

3. Statistical integration of two omics datasets using GO2PLS.

Authors: Zhujie Gu; Said El Bouhaddani; Jiayi Pei; Jeanine Houwing-Duistermaat; Hae-Won Uh
Journal: BMC Bioinformatics Date: 2021-03-18 Impact factor: 3.169

4. The local-balanced model for improved machine learning outcomes on mass spectrometry data sets and other instrumental data.

Authors: Heather Desaire; Milani Wijeweera Patabandige; David Hua
Journal: Anal Bioanal Chem Date: 2021-02-13 Impact factor: 4.142
