Bety Rostandy1,2, Xiaoli Gao3. 1. Department of Mathematics and Statistics, University of North Carolina, Greensboro, NC, USA. brostandy@rockefeller.edu. 2. Proteomics Resource Center, The Rockefeller University, New York, NY, USA. brostandy@rockefeller.edu. 3. Department of Mathematics and Statistics, University of North Carolina, Greensboro, NC, USA. x_gao2@uncg.edu.
Abstract
INTRODUCTION: Mass spectrometric analysis of complex biological mixtures is challenging because of the vast datasets it generates. Data treatment pipelines for distinguishing chemical signals from noise are lacking, and this task has so far been left to the discretion of the analyst. OBJECTIVES: This work demonstrates an analytical workflow, combining serial dilution of a complex botanical mixture with high-dimensional data analysis, that enhances confidence in metabolomics data before biological questions are addressed. It also provides an alternative to the univariate t-test p-value cutoff used in the blank subtraction procedure between negative controls and biological samples. METHODS: A serial dilution of a complex mixture, analyzed under electrospray ionization, was used to study the chemical complexity of metabolomics data firsthand. High-dimensional penalized regression models were employed to study both the relationship between concentration and ion intensity and the ion-ion relationships within each one-second retention-time sub-dataset. The multivariate analysis was carried out with an in-house tool, metabolite ions extraction and visualization, implemented in the R environment. RESULTS: A test case using the medicinal plant goldenseal (Hydrastis canadensis L.) showed greater metabolome coverage among features deemed "important" by the multivariate analysis than among features deemed "significant" by a univariate t-test. As an illustration, the data analysis workflow suggested an unexpected putative compound, 20-hydroxyecdysone. This suggestion was confirmed by MS/MS acquisition and a literature search. CONCLUSION: The multivariate analytical workflow selects "true" metabolite ion signals and provides an alternative to a univariate t-test p-value cutoff, thus enhancing the data analysis process in metabolomics.
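To make the penalized-regression idea in METHODS concrete, the following is a minimal sketch in Python rather than the authors' R tool. It assumes an L1 (lasso) penalty, a model that regresses dilution concentration on standardized ion intensities, and fully simulated data; the function names, the penalty value `lam=0.3`, and the ion counts are all hypothetical choices made for illustration, not details taken from the study.

```python
import random

def standardize(values):
    """Center to mean 0 and scale to unit (population) variance."""
    n = len(values)
    mean = sum(values) / n
    centered = [v - mean for v in values]
    scale = (sum(v * v for v in centered) / n) ** 0.5
    return [v / scale for v in centered] if scale > 0 else [0.0] * n

def soft_threshold(z, lam):
    """Lasso shrinkage operator: move z toward zero by lam, clipping at zero."""
    if z > lam:
        return z - lam
    if z < -lam:
        return z + lam
    return 0.0

def lasso_coordinate_descent(columns, y, lam, n_sweeps=200):
    """Minimize (1/2n)*||y - X b||^2 + lam*||b||_1 for standardized columns."""
    n = len(y)
    beta = [0.0] * len(columns)
    residual = list(y)
    for _ in range(n_sweeps):
        for j, col in enumerate(columns):
            # Correlation of feature j with the residual that excludes its own fit.
            rho = sum(c * (r + c * beta[j]) for c, r in zip(col, residual)) / n
            updated = soft_threshold(rho, lam)
            delta = updated - beta[j]
            if delta != 0.0:
                residual = [r - c * delta for c, r in zip(col, residual)]
                beta[j] = updated
    return beta

# Simulated serial dilution (assumption): 5 levels x 3 replicate injections.
random.seed(0)
dilutions = [1.0, 0.5, 0.25, 0.125, 0.0625]
concentration = [d for d in dilutions for _ in range(3)]

# Ions 0-1 track concentration ("true" metabolite ions); ions 2-9 are noise.
signal_ions = [[scale * c + random.gauss(0, 10) for c in concentration]
               for scale in (1000.0, 400.0)]
noise_ions = [[random.gauss(200, 50) for _ in concentration] for _ in range(8)]

columns = [standardize(col) for col in signal_ions + noise_ions]
y = standardize(concentration)

beta = lasso_coordinate_descent(columns, y, lam=0.3)
selected = [j for j, b in enumerate(beta) if b != 0.0]
print("ions with nonzero coefficients:", selected)
```

Ions that receive a nonzero coefficient play the role of features deemed "important": they are retained because their intensity follows the dilution series jointly with the other ions, not because each one passes a separate p-value cutoff. In practice the penalty strength would be tuned, for example by cross-validation, instead of being fixed by hand.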