A Workflow for Missing Values Imputation of Untargeted Metabolomics Data.

Tariq Faquih, Maarten van Smeden, Jiao Luo, Saskia le Cessie, Gabi Kastenmüller, Jan Krumsiek, Raymond Noordam, Diana van Heemst, Frits R Rosendaal, Astrid van Hylckama Vlieg, Ko Willems van Dijk, Dennis O Mook-Kanamori.

Abstract

Metabolomics studies have grown steadily owing to the development and implementation of affordable, high-quality metabolomics platforms. In large metabolite panels, measurement values are frequently missing and, if neglected or sub-optimally imputed, can bias study results. We provided a publicly available, user-friendly R script to streamline the imputation of missing endogenous, unannotated, and xenobiotic metabolites. We evaluated the two imputation methods implemented in our script, multivariate imputation by chained equations (MICE) and k-nearest neighbors (kNN), by simulation using measured metabolite data from the Netherlands Epidemiology of Obesity (NEO) study (n = 599). We simulated missing values in four unique metabolites from different pathways with different correlation structures, at three sample sizes (599, 150, 50), with three missing percentages (15%, 30%, 60%), and under two missingness mechanisms (missing completely at random and missing not at random). Based on the simulations, we found that for MICE, a larger sample size was the primary factor decreasing bias and error. For kNN, the primary factor reducing bias and error was the correlation of the metabolite with its predictor metabolites. MICE performed consistently better, particularly for larger datasets (n > 50). In conclusion, we presented an imputation workflow in a publicly available R script to impute untargeted metabolomics data. Our simulations provided insight into the effects of sample size, percentage missing, and correlation structure on the accuracy of the two imputation methods.
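The simulation design described in the abstract can be illustrated with a minimal sketch. This is not the authors' R script and does not use NEO data. It assumes a single synthetic target metabolite with one correlated predictor, models "not at random" missingness as removal of the lowest values (as would happen below a detection limit), and measures bias and error directly on the imputed values. All function names (`simulate_data`, `make_missing`, `knn_impute`, `bias_and_rmse`) are hypothetical, and MICE is not implemented here.

```python
import math
import random


def simulate_data(n, rho, seed=0):
    """Return n (target, predictor) pairs whose correlation is about rho.

    Synthetic stand-in for a metabolite and one predictor metabolite.
    """
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        x = rng.gauss(0, 1)
        y = rho * x + math.sqrt(1 - rho ** 2) * rng.gauss(0, 1)
        rows.append((y, x))
    return rows


def make_missing(rows, fraction, mechanism, seed=0):
    """Return the set of row indices whose target value is treated as missing.

    MCAR: rows chosen uniformly at random.
    MNAR (assumption): the rows with the lowest target values.
    """
    k = round(fraction * len(rows))
    idx = list(range(len(rows)))
    if mechanism == "MCAR":
        return set(random.Random(seed).sample(idx, k))
    if mechanism == "MNAR":
        return set(sorted(idx, key=lambda i: rows[i][0])[:k])
    raise ValueError("mechanism must be 'MCAR' or 'MNAR'")


def knn_impute(rows, missing, k=5):
    """Impute each missing target as the mean target of the k observed rows
    that are closest in the predictor value."""
    observed = [i for i in range(len(rows)) if i not in missing]
    imputed = {}
    for i in missing:
        nearest = sorted(observed, key=lambda j: abs(rows[j][1] - rows[i][1]))[:k]
        imputed[i] = sum(rows[j][0] for j in nearest) / len(nearest)
    return imputed


def bias_and_rmse(rows, imputed):
    """Mean error (bias) and root mean squared error of the imputed values
    against the true values that were hidden."""
    errors = [imputed[i] - rows[i][0] for i in imputed]
    bias = sum(errors) / len(errors)
    rmse = math.sqrt(sum(e * e for e in errors) / len(errors))
    return bias, rmse


if __name__ == "__main__":
    # Sample sizes and missing percentages follow the abstract;
    # the correlation of 0.8 is an arbitrary illustrative choice.
    for n in (599, 150, 50):
        rows = simulate_data(n, rho=0.8, seed=n)
        for fraction in (0.15, 0.30, 0.60):
            for mechanism in ("MCAR", "MNAR"):
                missing = make_missing(rows, fraction, mechanism, seed=n)
                bias, rmse = bias_and_rmse(rows, knn_impute(rows, missing))
                print(n, fraction, mechanism, round(bias, 3), round(rmse, 3))
```

Under the lowest-values assumption for the not-at-random mechanism, every hidden value lies below all observed values, so this kNN sketch necessarily imputes too high; that makes the difference between the two mechanisms visible even in a toy setting.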

Keywords:  imputation; k-nearest neighbors; metabolon; multiple imputation using chained equations; simulation; untargeted metabolomics; workflow

Year:  2020        PMID: 33256233      PMCID: PMC7761057          DOI: 10.3390/metabo10120486

Source DB:  PubMed          Journal:  Metabolites        ISSN: 2218-1989


  6 in total

1.  Optimization of Imputation Strategies for High-Resolution Gas Chromatography-Mass Spectrometry (HR GC-MS) Metabolomics Data.

Authors:  Isaac Ampong; Kip D Zimmerman; Peter W Nathanielsz; Laura A Cox; Michael Olivier
Journal:  Metabolites       Date:  2022-05-11

2.  A Review on Differential Abundance Analysis Methods for Mass Spectrometry-Based Metabolomic Data. (Review)

Authors:  Zhengyan Huang; Chi Wang
Journal:  Metabolites       Date:  2022-03-30

3.  MIRTH: Metabolite Imputation via Rank-Transformation and Harmonization.

Authors:  Benjamin A Freeman; Sophie Jaro; Tricia Park; Sam Keene; Wesley Tansey; Ed Reznik
Journal:  Genome Biol       Date:  2022-09-01       Impact factor: 17.906

4.  Changes in serum metabolomics in idiopathic pulmonary fibrosis and effect of approved antifibrotic medication.

Authors:  Benjamin Seeliger; Alfonso Carleo; Pedro David Wendel-Garcia; Jan Fuge; Ana Montes-Warboys; Sven Schuchardt; Maria Molina-Molina; Antje Prasse
Journal:  Front Pharmacol       Date:  2022-08-17       Impact factor: 5.988

5.  Agreement between nicotine metabolites in blood and self-reported smoking status: The Netherlands Epidemiology of Obesity study.

Authors:  Sofia Folpmers; Dennis O Mook-Kanamori; Renée de Mutsert; Frits R Rosendaal; Ko Willems van Dijk; Diana van Heemst; Raymond Noordam; Saskia le Cessie
Journal:  Addict Behav Rep       Date:  2022-09-23

6.  Kernel weighted least square approach for imputing missing values of metabolomics data.

Authors:  Nishith Kumar; Md Aminul Hoque; Masahiro Sugimoto
Journal:  Sci Rep       Date:  2021-05-27       Impact factor: 4.379
