MOTIVATION: Data normalization, which aims to reduce sample-level variation and facilitate comparisons between samples, is an important step in processing proteomics data generated in mass spectrometry experiments. Previously published normalization methods primarily depend on the assumption that the distribution of protein expression is similar across all samples. However, this assumption fails when the protein expression data are generated from heterogeneous samples, such as samples from various tissue types. This led us to develop a novel data-driven normalization method that corrects the systematic bias while preserving the underlying biological heterogeneity. RESULTS: To robustly correct the systematic bias, we used the density-power-weight method to down-weight outliers and extended the one-dimensional robust fitting method described in previous work to our structured data. We then constructed a robustness criterion and developed a new normalization algorithm, called RobNorm. In simulation studies and in an analysis of real data from the Genotype-Tissue Expression (GTEx) project, we compared and evaluated the performance of RobNorm against other normalization methods. We found that the RobNorm approach exhibits the greatest reduction in systematic bias while maintaining across-tissue variation, especially for datasets from highly heterogeneous samples. AVAILABILITY AND IMPLEMENTATION: https://github.com/mwgrassgreen/RobNorm. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
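To illustrate the idea of down-weighting outliers with density-power weights, the following is a minimal one-dimensional sketch, not the RobNorm algorithm itself and not its extension to structured data. It assumes a Gaussian model and uses the standard fixed-point equations of the minimum density power divergence estimator. The function name `robust_gaussian_fit`, the tuning parameter `gamma`, and the median/MAD starting values are hypothetical choices made for this example and are not taken from the RobNorm package.

```python
import math
import statistics


def robust_gaussian_fit(x, gamma=0.5, n_iter=200, tol=1e-10):
    """Fit a Gaussian mean and variance with density-power weights.

    Illustrative sketch only: each point gets weight
    w_i = exp(-gamma * (x_i - mu)^2 / (2 * var)), so points far from
    the bulk of the data contribute almost nothing to the fit.
    """
    n = len(x)
    # Robust starting values (an assumption of this sketch).
    mu = statistics.median(x)
    mad = statistics.median([abs(v - mu) for v in x])
    var = (1.4826 * mad) ** 2
    if var <= 0.0:
        var = 1.0

    for _ in range(n_iter):
        w = [math.exp(-gamma * (v - mu) ** 2 / (2.0 * var)) for v in x]
        sw = sum(w)
        new_mu = sum(wi * v for wi, v in zip(w, x)) / sw
        # Correction term that makes the variance estimate consistent
        # under the Gaussian model.
        denom = sw - n * gamma / (1.0 + gamma) ** 1.5
        if denom <= 0.0:
            raise ValueError("gamma is too large for this sample")
        new_var = sum(wi * (v - new_mu) ** 2 for wi, v in zip(w, x)) / denom
        converged = abs(new_mu - mu) < tol and abs(new_var - var) < tol
        mu, var = new_mu, new_var
        if converged:
            break
    return mu, var


# Usage: 200 evenly spaced standard-normal quantiles plus 20 outliers at 10.
nd = statistics.NormalDist()
data = [nd.inv_cdf((i + 0.5) / 200) for i in range(200)] + [10.0] * 20
mu_hat, var_hat = robust_gaussian_fit(data)
print("plain mean:", statistics.mean(data))
print("robust mean, variance:", mu_hat, var_hat)
```

In this toy example the plain mean is pulled toward the outliers, whereas the weighted estimate stays close to the center of the bulk of the data. Setting `gamma` to 0 recovers the ordinary mean and variance, so `gamma` trades efficiency for robustness.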