Zhuxuan Jin (1), Jian Kang (2), Tianwei Yu (1)
(1) Department of Biostatistics and Bioinformatics, Emory University, Atlanta, GA 30322, USA.
(2) Department of Biostatistics, University of Michigan, Ann Arbor, MI 48109, USA.
Abstract
Motivation: Metabolomics data generated from liquid chromatography-mass spectrometry platforms often contain missing values. Existing imputation methods do not consider the underlying relations between features or metabolic network information. As a result, their imputation results may not be optimal.

Results: We propose an imputation algorithm that builds a feature-level network from three sources: the existing metabolic network, adduct ion relations (which are available even for unknown compounds), and linear and nonlinear associations between feature intensities. For each feature with missing values, the algorithm fits a support vector regression on the features in its neighborhood on the network and uses the fitted model to impute the missing values. We compared the proposed method with widely used imputation methods. As judged by the normalized root mean squared error in simulations based on real data, the proposed method achieves better accuracy.

Availability and implementation: The R package is available at http://web1.sph.emory.edu/users/tyu8/MINMA.

Contact: jiankang@umich.edu or tianwei.yu@emory.edu.

Supplementary information: Supplementary data are available at Bioinformatics online.
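The abstract names the normalized root mean squared error (NRMSE) as the accuracy criterion but does not give its formula. The sketch below assumes one definition that is common in the imputation literature: the square root of the mean squared difference between imputed and true values, divided by the variance of the true values, computed over the entries that were masked for the simulation. The function name `nrmse` and the toy values are illustrative assumptions and are not taken from the MINMA package.

```python
import math
from statistics import pvariance


def nrmse(true_values, imputed_values):
    """NRMSE over masked entries, under the ASSUMED definition
    sqrt(mean((imputed - true)^2) / var(true)).
    The paper's exact normalization may differ."""
    if len(true_values) != len(imputed_values):
        raise ValueError("inputs must have the same length")
    if len(true_values) < 2:
        raise ValueError("need at least two masked values")
    # Mean squared error between the held-out truth and the imputed values.
    mse = sum((t - i) ** 2 for t, i in zip(true_values, imputed_values)) / len(true_values)
    # Population variance of the held-out true values (the normalizer).
    var = pvariance(true_values)
    if var == 0:
        raise ValueError("true values have zero variance; NRMSE is undefined")
    return math.sqrt(mse / var)


# Toy held-out intensities (hypothetical numbers, for illustration only).
true_vals = [1.0, 2.0, 3.0, 4.0]

perfect = nrmse(true_vals, [1.0, 2.0, 3.0, 4.0])    # exact recovery
mean_fill = nrmse(true_vals, [2.5, 2.5, 2.5, 2.5])  # fill with the true mean

print(perfect, mean_fill)
```

Under this definition, exact recovery gives 0 and filling every entry with the mean of the true values gives 1, so a value below 1 indicates that a method does better than mean imputation on the masked entries.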