Application of fuzzy c-means clustering in data analysis of metabolomics.

Xiang Li, Xin Lu, Jing Tian, Peng Gao, Hongwei Kong, Guowang Xu.

Abstract

Fuzzy c-means (FCM) clustering is an unsupervised method derived from fuzzy logic that is suitable for solving multiclass and ambiguous clustering problems. In this study, FCM clustering is applied to metabolomics data. FCM is performed directly on the data matrix to generate a membership matrix, which represents the degree of association each sample has with each cluster. The method is parametrized by the number of clusters (C) and the fuzziness coefficient (m), which sets the degree of fuzziness in the algorithm. Both were optimized by combining FCM with partial least-squares (PLS), using the membership matrix as the Y matrix in the PLS model. The quality parameters R²Y and Q² of the PLS model were used to monitor and optimize C and m. Metabolic profile data from three gene types of Escherichia coli were used to demonstrate the method. Different multivariate analysis methods were compared. Principal component analysis failed to model the metabolite data, while partial least-squares discriminant analysis yielded overfitted results. On the basis of the optimized parameters, FCM was able to reveal the main phenotype changes and the individual characteristics of the three gene types of E. coli. Coupled with PLS, FCM provides a powerful research tool for metabolomics with improved visualization, accurate classification, and outlier estimation.
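The membership matrix described in the abstract can be illustrated with a minimal, dependency-free sketch of the standard FCM update rules (cluster centers as membership-weighted means, memberships from relative distances to the centers). This is not the authors' implementation: the function name `fuzzy_c_means`, its parameters, the random initialization, and the stopping tolerance are assumptions made for illustration, and the PLS step that the study uses to tune C and m is not shown.

```python
import math
import random


def fuzzy_c_means(data, n_clusters, m=2.0, max_iter=200, tol=1e-6, seed=0):
    """Illustrative FCM: return (centers, membership) for a list of numeric rows.

    membership[k][i] is the degree to which sample k belongs to cluster i;
    every row of the membership matrix sums to 1.
    """
    if m <= 1.0:
        raise ValueError("fuzziness coefficient m must be greater than 1")
    rng = random.Random(seed)
    n_samples, n_features = len(data), len(data[0])

    # Assumption: start from a random membership matrix with rows summing to 1.
    membership = []
    for _ in range(n_samples):
        row = [rng.random() + 1e-9 for _ in range(n_clusters)]
        total = sum(row)
        membership.append([u / total for u in row])

    exponent = 2.0 / (m - 1.0)
    centers = []
    for _ in range(max_iter):
        # Center update: mean of the samples weighted by membership ** m.
        centers = []
        for i in range(n_clusters):
            weights = [row[i] ** m for row in membership]
            total = sum(weights)
            centers.append([
                sum(w * x[j] for w, x in zip(weights, data)) / total
                for j in range(n_features)
            ])

        # Membership update: inverse distances raised to 2 / (m - 1), normalized.
        updated = []
        for x in data:
            dists = [math.dist(x, c) for c in centers]
            if min(dists) == 0.0:
                # Sample sits exactly on a center: share membership among those centers.
                hits = [1.0 if d == 0.0 else 0.0 for d in dists]
                updated.append([h / sum(hits) for h in hits])
            else:
                inv = [d ** -exponent for d in dists]
                total = sum(inv)
                updated.append([v / total for v in inv])

        # Stop once no membership value moved by more than tol.
        change = max(
            abs(new - old)
            for new_row, old_row in zip(updated, membership)
            for new, old in zip(new_row, old_row)
        )
        membership = updated
        if change < tol:
            break
    return centers, membership


# Toy data (made up for illustration): two well-separated groups in 2-D.
samples = [[0.0, 0.0], [0.0, 1.0], [1.0, 0.0],
           [10.0, 10.0], [10.0, 11.0], [11.0, 10.0]]
centers, membership = fuzzy_c_means(samples, n_clusters=2, m=2.0)
for row in membership:
    print([round(u, 3) for u in row])
```

As m approaches 1 the memberships tend toward hard (0 or 1) assignments, and larger m spreads each sample more evenly across clusters. In the workflow the abstract describes, a membership matrix like the one returned here would serve as the Y matrix of a PLS model, with R²Y and Q² compared across candidate values of C and m.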

Year:  2009        PMID: 19408956     DOI: 10.1021/ac900353t

Source DB:  PubMed          Journal:  Anal Chem        ISSN: 0003-2700            Impact factor:   6.986


  7 in total

Review 1.  Multi-dimensional mass spectrometry-based shotgun lipidomics and novel strategies for lipidomic analyses.

Authors:  Xianlin Han; Kui Yang; Richard W Gross
Journal:  Mass Spectrom Rev       Date:  2011-07-13       Impact factor: 10.946

2.  Estimation of breast percent density in raw and processed full field digital mammography images via adaptive fuzzy c-means clustering and support vector machine segmentation.

Authors:  Brad M Keller; Diane L Nathan; Yan Wang; Yuanjie Zheng; James C Gee; Emily F Conant; Despina Kontos
Journal:  Med Phys       Date:  2012-08       Impact factor: 4.071

3.  MCAM: multiple clustering analysis methodology for deriving hypotheses and insights from high-throughput proteomic datasets.

Authors:  Kristen M Naegle; Roy E Welsch; Michael B Yaffe; Forest M White; Douglas A Lauffenburger
Journal:  PLoS Comput Biol       Date:  2011-07-21       Impact factor: 4.475

Review 4.  Statistical methods for the analysis of high-throughput metabolomics data.

Authors:  Jörg Bartel; Jan Krumsiek; Fabian J Theis
Journal:  Comput Struct Biotechnol J       Date:  2013-03-22       Impact factor: 7.271

5.  Integrating transcriptomic techniques and k-means clustering in metabolomics to identify markers of abiotic and biotic stress in Medicago truncatula.

Authors:  Elizabeth Dickinson; Martin J Rusilowicz; Michael Dickinson; Adrian J Charlton; Ulrike Bechtold; Philip M Mullineaux; Julie Wilson
Journal:  Metabolomics       Date:  2018-09-17       Impact factor: 4.290

6.  hcapca: Automated Hierarchical Clustering and Principal Component Analysis of Large Metabolomic Datasets in R.

Authors:  Shaurya Chanana; Chris S Thomas; Fan Zhang; Scott R Rajski; Tim S Bugni
Journal:  Metabolites       Date:  2020-07-21

7.  Multivariate strategy for the sample selection and integration of multi-batch data in metabolomics.

Authors:  Izabella Surowiec; Erik Johansson; Frida Torell; Helena Idborg; Iva Gunnarsson; Elisabet Svenungsson; Per-Johan Jakobsson; Johan Trygg
Journal:  Metabolomics       Date:  2017-08-24       Impact factor: 4.290
