Agus Yodi Gunawan, Made Tri Ari Penia Kresnowati.
Abstract
In metabolomics studies, metabolite concentration measurements are often repeated as independent analyses (replicates) to guard against measurement error. This replication, however, increases the size of the dataset. For clustering, the independent analyses must first be condensed into chemically representative information. The objective of this study is to develop a data reduction method that yields a dataset representing this chemical information; a proper data reduction method would in turn simplify the clustering of metabolite data. We propose the modified Weiszfeld algorithm (MWA) to reduce independent analyses. For a comprehensive assessment, we compare MWA with other well-known reduction methods: principal component analysis (PCA), classical multidimensional scaling (CMDS), Laplacian eigenmaps (LE), and locally linear embedding (LLE). The reduced datasets are then clustered using the fuzzy c-means (FCM) algorithm, with the Tang Sun Sun (TSS) index and the silhouette index as cluster validity indices. The results show that MWA and PCA both yield an optimal number of clusters of four, which matches the optimal number of clusters before dimensionality reduction. These results indicate that MWA is a robust way to reduce independent analyses while maintaining chemical information in the reduced dataset. We therefore recommend MWA as a reliable chemometric technique for metabolomics studies.
Keywords: Chemometric; Dimensionality reduction; Indonesian clove buds; Metabolite data; Metabolomics
Year: 2022 PMID: 35721675 PMCID: PMC9201019 DOI: 10.1016/j.heliyon.2022.e09715
Source DB: PubMed Journal: Heliyon ISSN: 2405-8440
Figure 1. The structure of the clove bud metabolite dataset used in this research.
Figure 2. The structure of the clove bud metabolite dataset after dimensionality reduction.
Figure 3. The Tang Sun Sun index values without dimensionality reduction.
Figure 4. The silhouette index values without dimensionality reduction.
Clustering result without dimensionality reduction.
| Cluster | Member of Cluster |
|---|---|
| I | M11, M12, M13, M14, M15, M16, M17, M18, M21, M22, M23, M24, M25, M26, M27, M28, M31, M32, M33, M34, M35, M36, M37, M38 |
| II | B11, B12, B13, B14, B15, B16, B17, B18, B21, B22, B23, B24, B25, B26, B27, B28, B31, B32, B33, B34, B35, B36, B37, B38 |
| III | J11, J12, J13, J14, J15, J16, J21, J22, J23, J24, J25, J26, J27, J22, J31, J32, J33, J34, J35, J36, J37, J38 |
| IV | T11, T12, T13, T14, T15, T16, T17, T18, T21, T23, T24, T25, T26, T27, T28, T31, T32, T34, T35, T36, T37, T38 |
The Tang Sun Sun index values after dimensionality reduction.
| Number of clusters | PCA | CMDS | LE | LLE | MWA |
|---|---|---|---|---|---|
| 2 | 2.69 | 1.90 | 2.76 | | |
| 3 | 2.59 | 3.80 | 3.44 | 2.45 | |
| 4 | 3.17 | 2.39 | 5.11 | | |
| 5 | 4.65 | 4.08 | 2.02 | 2.70 | 3.78 |
| 6 | 5.21 | 4.01 | 2.13 | 2.73 | 2.98 |
| 7 | 4.82 | 12.07 | 2.09 | 4.63 | 4.90 |
| 8 | 6.17 | 12.23 | 2.16 | 4.98 | 5.54 |
| 9 | 8.38 | 11.19 | 2.33 | 4.85 | 9.14 |
| 10 | 8.37 | 18.57 | 2.31 | 4.64 | 8.62 |
| 11 | 7.21 | 21.42 | 2.30 | 4.63 | 8.15 |
The silhouette index values after dimensionality reduction.
| Number of clusters | PCA | CMDS | LE | LLE | MWA |
|---|---|---|---|---|---|
| 2 | 0.66 | 0.82 | 0.53 | 0.58 | 0.66 |
| 3 | 0.73 | 0.73 | 0.45 | 0.49 | 0.75 |
| 4 | 0.78 | 0.79 | 0.61 | 0.56 | 0.78 |
| 5 | 0.77 | 0.75 | 0.65 | 0.69 | 0.80 |
| 6 | 0.79 | 0.83 | 0.72 | 0.64 | 0.85 |
| 7 | 0.74 | 0.87 | 0.76 | 0.70 | 0.80 |
| 8 | 0.76 | 0.89 | 0.78 | 0.81 | 0.72 |
| 9 | 0.84 | 0.85 | 0.74 | 0.85 | 0.89 |
| 10 | 0.92 | 0.94 | 0.84 | 0.94 | 0.94 |
| 11 | | | | | |
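
For readers unfamiliar with the silhouette index tabulated above, the following is a minimal sketch of how it can be computed. It assumes Euclidean distances and hard cluster labels (for FCM output, for instance, the cluster with the highest membership for each point); neither choice is confirmed by this record, and the function name `silhouette_index` and the toy data are illustrative.

```python
import math


def silhouette_index(points, labels):
    """Mean silhouette width over all points, using Euclidean distance."""
    clusters = {}
    for i, lab in enumerate(labels):
        clusters.setdefault(lab, []).append(i)
    if len(clusters) < 2:
        raise ValueError("the silhouette index needs at least two clusters")
    total = 0.0
    for i, lab in enumerate(labels):
        own = clusters[lab]
        if len(own) == 1:
            # Common convention: a singleton cluster contributes a width of 0.
            continue
        # a: mean distance to the other members of the point's own cluster.
        a = sum(math.dist(points[i], points[j]) for j in own if j != i) / (len(own) - 1)
        # b: smallest mean distance to the members of any other cluster.
        b = min(
            sum(math.dist(points[i], points[j]) for j in other) / len(other)
            for other_lab, other in clusters.items()
            if other_lab != lab
        )
        if max(a, b) > 0.0:
            total += (b - a) / max(a, b)
    return total / len(points)


# Hypothetical toy data: two well-separated pairs of points.
toy_points = [[0.0, 0.0], [0.0, 1.0], [10.0, 0.0], [10.0, 1.0]]
good = silhouette_index(toy_points, [0, 0, 1, 1])
bad = silhouette_index(toy_points, [0, 1, 0, 1])
```

Values closer to 1 indicate compact, well-separated clusters, so under this index the preferred number of clusters is the one with the largest value.
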
Clustering result using PCA as the dimensionality reduction technique.
| Cluster | Member of Cluster |
|---|---|
| I | M1, M2, M3 |
| II | T1, T2, T3 |
| III | B1, B2, B3 |
| IV | J1, J2, J3 |
Clustering result using CMDS as the dimensionality reduction technique.
| Cluster | Member of Cluster |
|---|---|
| I | J2, J3, T2, B1, B2, B3, M1, M2, M3 |
| II | J1, T1, T3 |
Clustering result using LE as the dimensionality reduction technique.
| Cluster | Member of Cluster |
|---|---|
| I | B1, B3, M1 |
| II | J1, J2, J3, M2, T1 |
| III | B2, M3, T2, T3 |
Clustering result using LLE as the dimensionality reduction technique.
| Cluster | Member of Cluster |
|---|---|
| I | B2, B3, T1, M1, M2, M3 |
| II | J1, J2, J3, B1, T2, T3 |
Clustering result using the proposed MWA as the dimensionality reduction technique.
| Cluster | Member of Cluster |
|---|---|
| I | M1, M2, M3 |
| II | B1, B2, B3 |
| III | J1, J2, J3 |
| IV | T1, T2, T3 |
Clustering result using the silhouette index as the cluster validity index.
| Cluster | Member of Cluster |
|---|---|
| I | B1 |
| II | B3 |
| III | J2, J3 |
| IV | T2 |
| V | B2 |
| VI | T1 |
| VII | M3 |
| VIII | M1 |
| IX | J1 |
| X | T3 |
| XI | M2 |
Figure 5. The convergence of the FCM objective function with dimension reduction using MWA.
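
To illustrate what a convergence curve of the FCM objective function tracks, here is a minimal fuzzy c-means sketch that records the objective at every iteration. It is a generic textbook formulation, not the authors' implementation; the fuzzifier `m=2.0`, the random initialisation, the stopping rule, and the toy data are all assumptions.

```python
import random


def fuzzy_c_means(points, c, m=2.0, max_iter=100, tol=1e-6, seed=0):
    """Minimal fuzzy c-means; returns (centers, memberships, objective history)."""
    rng = random.Random(seed)
    n, dim = len(points), len(points[0])
    # Random initial memberships, each row normalised to sum to 1.
    u = []
    for _ in range(n):
        row = [rng.random() + 1e-9 for _ in range(c)]
        s = sum(row)
        u.append([v / s for v in row])
    history = []
    exponent = 1.0 / (m - 1.0)
    centers = []
    for _ in range(max_iter):
        # Update each center as the membership-weighted mean of the points.
        centers = []
        for j in range(c):
            w = [u[i][j] ** m for i in range(n)]
            sw = sum(w)
            centers.append(
                [sum(w[i] * points[i][k] for i in range(n)) / sw for k in range(dim)]
            )
        # Squared distances, floored to avoid division by zero.
        d2 = [
            [
                max(sum((points[i][k] - centers[j][k]) ** 2 for k in range(dim)), 1e-12)
                for j in range(c)
            ]
            for i in range(n)
        ]
        # Objective J = sum_i sum_j u_ij^m * d_ij^2 for the current state.
        history.append(sum(u[i][j] ** m * d2[i][j] for i in range(n) for j in range(c)))
        # Update memberships from the new distances.
        for i in range(n):
            inv = [(1.0 / d2[i][j]) ** exponent for j in range(c)]
            s = sum(inv)
            u[i] = [v / s for v in inv]
        if len(history) > 1 and abs(history[-2] - history[-1]) < tol:
            break
    return centers, u, history


# Hypothetical toy data: two well-separated groups of three points.
toy = [[0.0, 0.0], [0.0, 1.0], [1.0, 0.0], [10.0, 10.0], [10.0, 11.0], [11.0, 10.0]]
centers, memberships, objective = fuzzy_c_means(toy, c=2)
hard_labels = [row.index(max(row)) for row in memberships]
```

Plotting `objective` against the iteration number gives a curve of the kind a convergence figure would show: each alternating update can only lower the objective, so the recorded values should not increase.
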