Tusharkanti Ghosh, Weiming Zhang, Debashis Ghosh, Katerina Kechris.
Abstract
In recent years, mass spectrometry (MS)-based metabolomics has been applied extensively to characterize biochemical mechanisms and to study physiological processes and phenotypic changes associated with disease. Metabolomics has also been important for identifying biomarkers suitable for clinical diagnosis. In this chapter, we review supervised learning algorithms for predictive modeling, such as random forest (RF), support vector machine (SVM), and partial least squares-discriminant analysis (PLS-DA). We also review feature selection methods for identifying the combination of metabolites that yields an accurate predictive model. We conclude with best practices for reproducibility: including internal and external replication, reporting metrics to assess performance, and following guidelines to avoid overfitting and to deal with imbalanced classes. An analysis of an example dataset illustrates the use of different machine learning methods and performance metrics.
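The abstract's point about reporting performance metrics under imbalanced classes can be illustrated with a minimal sketch. It is not taken from the chapter: the function name `binary_metrics`, the 0/1 label coding, and the 90-control/10-case data are all hypothetical and chosen only for illustration.

```python
from math import sqrt


def binary_metrics(y_true, y_pred):
    """Confusion-matrix metrics for labels coded 1 (case) and 0 (control).

    Hypothetical helper for illustration; not from the chapter.
    """
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("y_true and y_pred must be non-empty and equal in length")

    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)

    # Guard against empty classes so the ratios stay defined.
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    specificity = tn / (tn + fp) if (tn + fp) else 0.0

    # Matthews correlation coefficient; set to 0 when the denominator vanishes.
    denom = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0

    return {
        "accuracy": (tp + tn) / len(pairs),
        "sensitivity": sensitivity,
        "specificity": specificity,
        "balanced_accuracy": (sensitivity + specificity) / 2,
        "mcc": mcc,
    }


# Made-up imbalanced data: 90 controls, 10 cases, and a trivial
# classifier that predicts "control" for every sample.
y_true = [0] * 90 + [1] * 10
y_pred = [0] * 100
metrics = binary_metrics(y_true, y_pred)
for name, value in metrics.items():
    print(f"{name}: {value:.2f}")
```

In this made-up case the trivial classifier reaches 0.90 accuracy while its balanced accuracy is 0.50 and its Matthews correlation coefficient is 0, which is why accuracy alone can mislead when classes are imbalanced.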
Keywords: Mass spectrometry; Metabolomics; Performance Metrics; Predictive Modeling; Supervised learning
Year: 2020 PMID: 31953824 PMCID: PMC7423323 DOI: 10.1007/978-1-0716-0239-3_16
Source DB: PubMed Journal: Methods Mol Biol ISSN: 1064-3745