The nPYc-Toolbox, a Python module for the pre-processing, quality-control and analysis of metabolic profiling datasets.

Caroline J Sands¹, Arnaud M Wolfer¹, Gonçalo D S Correia¹, Noureddin Sadawi², Arfan Ahmed¹, Beatriz Jiménez¹,², Matthew R Lewis¹,², Robert C Glen², Jeremy K Nicholson¹,², Jake T M Pearce¹,².

Abstract

SUMMARY: As large-scale metabolic phenotyping studies become increasingly common, systematic methods for pre-processing and quality control (QC) of analytical data prior to statistical analysis have become increasingly important, both within a study and to allow meaningful inter-study comparisons. The nPYc-Toolbox provides software for the import, pre-processing, QC and visualization of metabolic phenotyping datasets, either interactively or in automated pipelines.
AVAILABILITY AND IMPLEMENTATION: The nPYc-Toolbox is implemented in Python and is freely available from the Python Package Index at https://pypi.org/project/nPYc/. The source is available at https://github.com/phenomecentre/nPYc-Toolbox. Full documentation can be found at http://npyc-toolbox.readthedocs.io/, and exemplar datasets and tutorials at https://github.com/phenomecentre/nPYc-toolbox-tutorials.

Year: 2019 PMID: 31350543 PMCID: PMC6954639 DOI: 10.1093/bioinformatics/btz566

Source DB: PubMed Journal: Bioinformatics ISSN: 1367-4803 Impact factor: 6.937

1 Introduction

Metabolic phenotyping offers a powerful window into gene-environment interactions (Nicholson et al.). Inter-study comparison in the field is complicated by the diversity of analytical platforms used to generate data and by the lack of standard quality criteria. Standards are emerging around the most common platforms, nuclear magnetic resonance (NMR) spectroscopy and hyphenated mass spectrometry (MS), and procedures for the acquisition of profiles from human biofluid samples in particular are well established (Dona et al.; Lewis et al.). However, QC in profiling studies has typically been conducted on an ad hoc basis in individual studies, although there is an increasing push towards the systematization and automation of pre-processing procedures (Giacomoni et al.; van Rijswijk et al.). The toolbox presented here provides software for pre-processing, QC and visualization of metabolic profiling datasets, embodying the practices of the MRC-NIHR National Phenome Centre (NPC) and focusing on the interpretability of the output to both data generators and analysts (Fig. 1).

Fig. 1. Conceptual diagram outlining the workflow embodied by the toolbox, from import of raw or feature-extracted datasets, through pre-processing and filtering, QC and visualization, to export.

2 The nPYc-Toolbox

2.1 Implementation

The toolbox is designed to allow reproducible processing of datasets with minimal reliance on human judgement. It may be used interactively (e.g. in a Jupyter notebook, for which tutorials are provided) or as an API in automated workflows, and is written in Python 3.6. To account for the differing processing workflows expected of the common analytical datasets outlined above, the toolbox subclasses its Dataset object: NMRDataset encapsulates methods for handling spectral NMR data; MSDataset handles discretely measured (peak-picked) hyphenated-MS profiling datasets; and TargetedDataset handles targeted, quantified datasets derived from MS, NMR or any other analytical platform.
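The subclassing design can be illustrated with a minimal sketch. The class names, attributes and methods below (SketchDataset, qc_checks, record, summarise) are hypothetical and chosen for illustration only; they are not the toolbox's actual API. The QC checks listed for each subclass are taken from the descriptions in Section 2.2.

```python
class SketchDataset:
    """Hypothetical base class: state and behaviour shared by all dataset types."""

    platform = "generic"

    def __init__(self, intensities, sample_names):
        if len(intensities) != len(sample_names):
            raise ValueError("one intensity row is required per sample")
        self.intensities = intensities    # rows: samples, columns: features
        self.sample_names = sample_names
        self.audit_log = []               # record of manipulations applied

    def record(self, message):
        """Append a description of a manipulation to the audit trail."""
        self.audit_log.append(f"{self.platform}: {message}")

    def qc_checks(self):
        """Platform-specific QC checks are supplied by each subclass."""
        raise NotImplementedError


class SketchNMRDataset(SketchDataset):
    platform = "NMR"

    def qc_checks(self):
        return ["line width", "water suppression", "baseline stability"]


class SketchMSDataset(SketchDataset):
    platform = "MS"

    def qc_checks(self):
        return ["precision in pooled QC samples", "linearity in dilution series"]


class SketchTargetedDataset(SketchDataset):
    platform = "targeted"

    def qc_checks(self):
        return ["limits of quantification", "batch consistency"]


def summarise(dataset):
    """Works on any subclass: the caller does not need to know the platform."""
    dataset.record("QC summary requested")
    checks = ", ".join(dataset.qc_checks())
    return f"{len(dataset.sample_names)} samples; checks: {checks}"


nmr = SketchNMRDataset([[0.1, 0.2], [0.3, 0.4]], ["s1", "s2"])
summary = summarise(nmr)
```

In a design of this kind, code that imports, reports on or exports a dataset can be written once against the base class, while each platform supplies only its own QC logic.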

2.2 Features

Dataset objects are initialized from raw (Bruker NMR) or feature-extracted data [outputs of software such as XCMS (Tautenhahn et al.), Progenesis QI™, TargetLynx™, etc.], and associated with study design parameters or metadata read directly from the raw data or from csv files. The csv template is structured so that each row corresponds to a sample, and the columns contain a set of mandatory fields plus any other metadata the user requires. The role that each sample plays in the assay and its pre-processing is delineated using a standardized nomenclature.

Routines are implemented for pre-processing 1D NMR spectra, with automated calculation of QC metrics assessing line width, water suppression and baseline stability (as described by Dona et al.). Current best practices in QC of profiling LC-MS (Broadhurst et al.; Dunn et al.; Lewis et al.; Want et al.) include repeated injections of pooled quality control samples, and a serial dilution of the reference sample, to calculate per-feature analytical precision and linearity of response. Correction of run-order effects follows an adapted version of the LOWESS approach proposed by Dunn et al. The targeted pre-processing module contains a set of reports and data consistency checks to assist analysts in assessing the presence of batch effects, standardizing the linearity range over multiple batches, and visualizing the distribution ranges of samples assayed and their relationship to the limits of quantification.

Exploratory data analysis with PCA is used to assess the impact of the QC choices on the final dataset, and to screen for associations between acquisition parameters and study factors. Parameter sets can be specified as JSON dictionaries, allowing simple automation and generation of standardized workflows with basic auditing of all manipulations in a dataset. The toolbox can therefore be used to ensure reproducible pre-processing and quality control. Processed datasets can be exported as csv files in a number of different formats.
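As an illustration of the per-feature precision and linearity criteria described above, the following sketch filters features using thresholds read from a JSON parameter set. It is not the toolbox's implementation: the function names, the parameter keys (rsd_threshold, correlation_threshold), the threshold values and the toy data are all assumptions made for this example. Precision is taken here as the relative standard deviation (RSD) across repeated pooled QC injections, and linearity as the Pearson correlation between feature intensity and dilution level.

```python
import json
from statistics import mean, stdev

# Hypothetical parameter set; key names and values are illustrative only.
PARAMS = json.loads('{"rsd_threshold": 30.0, "correlation_threshold": 0.7}')


def rsd(values):
    """Relative standard deviation (%) of repeated measurements."""
    return 100.0 * stdev(values) / mean(values)


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def passing_features(pooled_qc, dilution_levels, dilution_series, params):
    """Return the names of features meeting both precision and linearity criteria.

    pooled_qc:       {feature: [intensity in each repeated pooled QC injection]}
    dilution_levels: [relative concentration of each dilution sample]
    dilution_series: {feature: [intensity in each dilution sample]}
    """
    kept = []
    for name, qc_values in pooled_qc.items():
        precise = rsd(qc_values) <= params["rsd_threshold"]
        r = pearson(dilution_levels, dilution_series[name])
        linear = r >= params["correlation_threshold"]
        if precise and linear:
            kept.append(name)
    return kept


# Toy data, invented for this example.
pooled_qc = {
    "feature_A": [100.0, 102.0, 98.0, 101.0],  # stable across injections
    "feature_B": [100.0, 160.0, 40.0, 120.0],  # imprecise across injections
    "feature_C": [50.0, 51.0, 49.0, 50.0],     # stable, but flat on dilution
}
dilution_levels = [1.0, 20.0, 40.0, 60.0, 80.0, 100.0]
dilution_series = {
    "feature_A": [1.1, 19.0, 41.0, 62.0, 79.0, 101.0],
    "feature_B": [1.0, 21.0, 39.0, 58.0, 82.0, 99.0],
    "feature_C": [50.0, 49.0, 51.0, 50.0, 49.0, 51.0],
}

kept = passing_features(pooled_qc, dilution_levels, dilution_series, PARAMS)
```

Here only feature_A is retained: feature_B fails the precision criterion and feature_C fails the linearity criterion. Keeping the thresholds in a JSON dictionary rather than in the code means the same script can be rerun with a different, recorded parameter set.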

3 Conclusion

The nPYc-Toolbox supports both profiling and targeted metabolic phenotyping datasets, and provides tools for pre-processing, quality control and visualization.

Funding

This work was supported by the Medical Research Council and National Institute for Health Research [grant number MC_PC_12025] through funding for the MRC-NIHR National Phenome Centre. Infrastructure support was provided by the NIHR Imperial Biomedical Research Centre and by PhenoMeNal, European Commission Horizon2020 programme, grant agreement number 654241.

Conflict of Interest: none declared.

References

1. Global metabolic profiling procedures for urine using UPLC-MS.

Authors: Elizabeth J Want; Ian D Wilson; Helen Gika; Georgios Theodoridis; Robert S Plumb; John Shockcor; Elaine Holmes; Jeremy K Nicholson
Journal: Nat Protoc Date: 2010-06 Impact factor: 13.491

2. Procedures for large-scale metabolic profiling of serum and plasma using gas chromatography and liquid chromatography coupled to mass spectrometry.

Authors: Warwick B Dunn; David Broadhurst; Paul Begley; Eva Zelena; Sue Francis-McIntyre; Nadine Anderson; Marie Brown; Joshua D Knowles; Antony Halsall; John N Haselden; Andrew W Nicholls; Ian D Wilson; Douglas B Kell; Royston Goodacre
Journal: Nat Protoc Date: 2011-06-30 Impact factor: 13.491

3. Metabolic phenotyping in clinical and surgical environments.

Authors: Jeremy K Nicholson; Elaine Holmes; James M Kinross; Ara W Darzi; Zoltan Takats; John C Lindon
Journal: Nature Date: 2012-11-15 Impact factor: 49.962

4. Development and Application of Ultra-Performance Liquid Chromatography-TOF MS for Precision Large Scale Urinary Metabolic Phenotyping.

Authors: Matthew R Lewis; Jake T M Pearce; Konstantina Spagou; Martin Green; Anthony C Dona; Ada H Y Yuen; Mark David; David J Berry; Katie Chappell; Verena Horneffer-van der Sluis; Rachel Shaw; Simon Lovestone; Paul Elliott; John Shockcor; John C Lindon; Olivier Cloarec; Zoltan Takats; Elaine Holmes; Jeremy K Nicholson
Journal: Anal Chem Date: 2016-08-26 Impact factor: 6.986

5. Precision high-throughput proton NMR spectroscopy of human urine, serum, and plasma for large-scale metabolic phenotyping.

Authors: Anthony C Dona; Beatriz Jiménez; Hartmut Schäfer; Eberhard Humpfer; Manfred Spraul; Matthew R Lewis; Jake T M Pearce; Elaine Holmes; John C Lindon; Jeremy K Nicholson
Journal: Anal Chem Date: 2014-09-16 Impact factor: 6.986

6. Workflow4Metabolomics: a collaborative research infrastructure for computational metabolomics.

Authors: Franck Giacomoni; Gildas Le Corguillé; Misharl Monsoor; Marion Landi; Pierre Pericard; Mélanie Pétéra; Christophe Duperier; Marie Tremblay-Franco; Jean-François Martin; Daniel Jacob; Sophie Goulitquer; Etienne A Thévenot; Christophe Caron
Journal: Bioinformatics Date: 2014-12-19 Impact factor: 6.937

7. Guidelines and considerations for the use of system suitability and quality control samples in mass spectrometry assays applied in untargeted clinical metabolomic studies.

Authors: David Broadhurst; Royston Goodacre; Stacey N Reinke; Julia Kuligowski; Ian D Wilson; Matthew R Lewis; Warwick B Dunn
Journal: Metabolomics Date: 2018-05-18 Impact factor: 4.290

8. Highly sensitive feature detection for high resolution LC/MS.

Authors: Ralf Tautenhahn; Christoph Böttcher; Steffen Neumann
Journal: BMC Bioinformatics Date: 2008-11-28 Impact factor: 3.169
