E C Considine (1), G Thomas (2), A L Boulesteix (3), A S Khashan (4, 5), L C Kenny (4)
1. The Irish Centre for Fetal and Neonatal Translational Research (INFANT), Department of Obstetrics and Gynaecology, University College Cork, Cork, Ireland. lizconsidine@gmail.com
2. SQU4RE, Sint-Alfonsusstraat 17, 8800, Roeselare, Belgium.
3. Department of Medical Informatics, Biometry and Epidemiology, LMU Munich, Marchioninistr. 15, 81377, Munich, Germany.
4. The Irish Centre for Fetal and Neonatal Translational Research (INFANT), Department of Obstetrics and Gynaecology, University College Cork, Cork, Ireland.
5. Department of Epidemiology and Public Health, University College Cork, Cork, Ireland.
Abstract
INTRODUCTION: We present the first study to critically appraise the quality of reporting of the data analysis step in metabolomics studies since the publication of minimum reporting guidelines in 2007.
OBJECTIVES: The aim of this study was to assess the standard of reporting of the data analysis step in metabolomics biomarker discovery studies and to investigate whether the level of detail supplied allows basic understanding of the steps employed and/or reuse of the protocol. For the purposes of this review, we define the data analysis step to include the data pretreatment step and the actual data analysis step, which covers algorithm selection, univariate analysis and multivariate analysis.
METHOD: We reviewed the literature to identify metabolomic studies of biomarker discovery that were published between January 2008 and December 2014. Studies were examined for completeness in reporting the various steps of the data pretreatment phase and data analysis phase, and also for clarity of the workflow of these sections.
RESULTS: We analysed 27 papers in the area of biomarker discovery in serum metabolomics, published between the start of 2008 and the end of 2014. The results of this review showed that the data analysis step in metabolomics biomarker discovery studies is plagued by unclear and incomplete reporting. Major omissions and a lack of logical flow render the data analysis workflows in these studies impossible to follow and therefore to replicate or even imitate.
CONCLUSIONS: While we wait for the holy grail of computational reproducibility in data analysis to become standard, we propose that, at a minimum, the data analysis section of metabolomics studies should be readable and interpretable without omissions, such that a data analysis workflow diagram could be extrapolated from the study and the data analysis protocol could therefore be reused by the reader. That inconsistent and patchy reporting obstructs reproducibility is a given. However, even basic understanding and reuse of protocols are hampered by the low level of detail supplied in the data analysis sections of the studies that we reviewed.
Keywords:
Biomarker discovery; Data analysis; Guidelines; Metabolomics; Minimum standards; Reporting