Xuejiao Cui1,2, Qingxia Yang1,2, Bo Li2, Jing Tang1,2, Xiaoyu Zhang1,2, Shuang Li1,2, Fengcheng Li1, Jie Hu3, Yan Lou4, Yunqing Qiu4, Weiwei Xue2, Feng Zhu1,2.
Abstract
Because clinical data collection extends over long periods and the number of analyzed samples is large, long-term and large-scale pharmacometabonomic profiling is frequently encountered in drug/target discovery and in guiding personalized medicine. Integrating the results (ReIn) of multiple experiments within a large-scale metabolomic profiling has become a widely used strategy for enhancing the reliability and robustness of analytical results. Direct data merging (DiMe) among experiments has also been proposed as a way to increase statistical power, reduce experimental bias, enhance reproducibility and improve overall biological understanding. Compared with ReIn, however, DiMe has not yet been widely adopted in metabolomics studies, because unwanted variations are difficult to remove and there is no prior knowledge of how the available merging methods perform. It is therefore urgent to clarify whether DiMe can enhance the performance of metabolic profiling. Herein, the performance of DiMe on 4 pairs of benchmark datasets was comprehensively assessed using multiple criteria (classification capacity, robustness and false discovery rate). The integration/merging-based strategies (ReIn and DiMe) were found to perform better under all criteria than strategies based on a single experiment. Moreover, DiMe outperformed ReIn in classification capacity and robustness, while ReIn showed superior capacity for controlling the false discovery rate. In conclusion, these findings provide valuable guidance for selecting a suitable analytical strategy in current metabolomics.
Keywords: classification capacity; direct data merging; false discovery rate; long-term and large-scale metabolomics; robustness
Year: 2019 PMID: 30842738 PMCID: PMC6391323 DOI: 10.3389/fphar.2019.00127
Source DB: PubMed Journal: Front Pharmacol ISSN: 1663-9812 Impact factor: 5.810
FIGURE 1. Distribution of the sample sizes of all (gray) and human (green) metabolomics studies publicly available in the Metabolights database.
FIGURE 2. Schematic representations of the workflows of the analytical strategies applied in this study. (a) The pipeline of direct merge; (b) the pipeline of results integration.
Classification capacities of different analytical strategies assessed by accuracy (ACC), sensitivity (SEN), specificity (SPE), Matthews correlation coefficient (MCC) and area under the curve (AUC) based on four pairs of benchmark datasets collected from the Metabolights database.
| Experiment ID | ACC | SEN | SPE | MCC | AUC |
|---|---|---|---|---|---|
| SiE1 | 0.74 | 0.67 | 0.75 | 0.32 | 0.79 |
| SiE2 | 0.69 | 0.33 | 0.80 | 0.13 | 0.60 |
| | 0.73 | 0.60 | 0.76 | 0.29 | 0.70 |
| | 0.78 | 0.82 | 0.77 | 0.53 | 0.85 |
| SiE1 | 0.59 | 0.58 | 0.59 | 0.13 | 0.57 |
| SiE2 | 0.69 | 0.33 | 0.80 | 0.13 | 0.76 |
| | 0.60 | 0.53 | 0.62 | 0.12 | 0.66 |
| | 0.80 | 0.53 | 0.92 | 0.50 | 0.83 |
| SiE1 | 0.67 | 0.50 | 0.80 | 0.32 | 0.80 |
| SiE2 | 0.56 | 0.50 | 0.60 | 0.10 | 0.80 |
| | 0.61 | 0.50 | 0.70 | 0.20 | 0.80 |
| | 0.78 | 0.50 | 1.00 | 0.60 | 0.93 |
| SiE1 | 0.56 | 0.25 | 0.80 | 0.06 | 0.70 |
| SiE2 | 0.67 | 0.50 | 0.80 | 0.32 | 0.75 |
| | 0.61 | 0.38 | 0.80 | 0.19 | 0.73 |
| | 0.72 | 0.50 | 0.90 | 0.44 | 0.88 |
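The threshold-based metrics in this table follow standard confusion-matrix definitions. The sketch below shows how ACC, SEN, SPE and MCC could be computed from counts of true/false positives and negatives. The function name and the example counts are hypothetical and are not taken from the study; AUC is left out because it requires the classifier's continuous scores, not just the counts.

```python
import math


def classification_metrics(tp, fp, tn, fn):
    """Return ACC, SEN, SPE and MCC from confusion-matrix counts."""
    total = tp + fp + tn + fn
    acc = (tp + tn) / total
    # Sensitivity: fraction of true cases that were called cases.
    sen = tp / (tp + fn) if (tp + fn) else 0.0
    # Specificity: fraction of true controls that were called controls.
    spe = tn / (tn + fp) if (tn + fp) else 0.0
    # MCC is conventionally set to 0 when any marginal total is zero.
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {"ACC": acc, "SEN": sen, "SPE": spe, "MCC": mcc}


# Hypothetical counts for illustration only (not from the benchmark datasets).
example = classification_metrics(tp=8, fp=5, tn=15, fn=4)
```

Because MCC uses all four cells of the confusion matrix, it can stay low on imbalanced case/control sets even when ACC looks acceptable, which may explain why the two columns diverge in several rows.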
FIGURE 3. Classification capacities of different analytical strategies assessed by receiver operating characteristic (ROC) and area under the curve (AUC) based on four pairs of benchmark datasets collected from the Metabolights database.
Robustness of different analytical strategies assessed by the number of markers selected by each sampling set and overlap values based on four pairs of benchmark datasets collected from the Metabolights database.
| Experiment ID | No. of Cases/Controls | No. of MS Peaks Detected | No. of markers, sampling 1 | No. of markers, sampling 2 | No. of markers, sampling 3 | No. of markers, sampling 4 | No. of markers, sampling 5 | No. of markers, sampling 6 | No. of markers, sampling 7 | No. of markers, sampling 8 | No. of markers, sampling 9 | No. of markers, sampling 10 | Overlap Median across 10 Samplings |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| SiE1 | 59/129 | 941 | 216 | 219 | 74 | 87 | 135 | 276 | 63 | 70 | 42 | 96 | |
| SiE2 | 13/50 | 1,209 | 37 | 107 | 50 | 135 | 47 | 170 | 60 | 64 | 129 | 64 | |
| | 72/179 | 941/1,209 | 127 | 163 | 62 | 111 | 91 | 223 | 62 | 67 | 86 | 80 | |
| | 72/179 | 734 | 145 | 81 | 53 | 115 | 57 | 95 | 57 | 66 | 54 | 125 | |
| SiE1 | 60/129 | 1,586 | 161 | 141 | 43 | 84 | 113 | 43 | 114 | 66 | 114 | 195 | |
| SiE2 | 13/50 | 3,230 | 128 | 161 | 597 | 179 | 173 | 140 | 291 | 167 | 278 | 233 | |
| | 73/179 | 1,586/3,230 | 145 | 151 | 320 | 132 | 143 | 92 | 203 | 117 | 196 | 214 | |
| | 73/179 | 1,144 | 173 | 68 | 334 | 107 | 82 | 112 | 90 | 106 | 109 | 106 | |
| SiE1 | 20/25 | 883 | 34 | 51 | 53 | 56 | 39 | 23 | 179 | 73 | 118 | 123 | |
| SiE2 | 20/24 | 825 | 27 | 114 | 139 | 216 | 42 | 60 | 22 | 112 | 12 | 32 | |
| | 40/50 | 883/825 | 31 | 83 | 96 | 136 | 41 | 41.5 | 101 | 93 | 65 | 78 | |
| | 40/50 | 665 | 66 | 11 | 57 | 187 | 109 | 47 | 27 | 60 | 76 | 37 | |
| SiE1 | 20/25 | 1,526 | 57 | 104 | 63 | 91 | 74 | 164 | 86 | 76 | 37 | 52 | |
| SiE2 | 20/24 | 1,542 | 229 | 77 | 34 | 187 | 170 | 150 | 80 | 248 | 175 | 57 | |
| | 40/50 | 1,526/1,542 | 143 | 91 | 49 | 139 | 122 | 157 | 83 | 162 | 106 | 55 | |
| | 40/50 | 872 | 132 | 29 | 110 | 80 | 102 | 148 | 206 | 110 | 163 | 146 | |
FIGURE 4. Robustness of different analytical strategies assessed by the overlap values based on four pairs of benchmark datasets collected from the Metabolights database.
Robustness of different analytical strategies assessed by the percent and number of markers discovered simultaneously by multiple sampling datasets based on four pairs of benchmark datasets collected from the Metabolights database.
| MTBLS17-NEG | MTBLS17-POS | MTBLS19-NEG | MTBLS19-POS | |||||||||
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| SiE1 | SiE2 | SiE1 | SiE2 | SiE1 | SiE2 | SiE1 | SiE2 | |||||
| 10 | 0.00% (0) | 0.00% (0) | 0.12% (1) | 0.09% (1) | 0.00% (0) | 0.31% (4) | 0.00% (0) | 0.00% (0) | 0.00% (0) | 0.25% (2) | 0.21% (3) | 0.41% (5) |
| ≥9 | 0.31% (4) | 0.00% (0) | 1.30% (11) | 0.19% (2) | 0.09% (2) | 1.01% (13) | 0.80% (6) | 0.00% (0) | 1.18% (8) | 0.37% (3) | 0.36% (5) | 1.14% (14) |
| ≥8 | 0.70% (9) | 0.35% (3) | 2.48% (21) | 0.56% (6) | 0.34% (8) | 1.55% (20) | 1.20% (9) | 0.13% (1) | 1.77% (12) | 0.37% (3) | 0.50% (7) | 1.55% (19) |
| ≥7 | 1.17% (15) | 0.35% (3) | 4.01% (34) | 0.93% (10) | 0.64% (15) | 2.87% (37) | 1.34% (10) | 0.52% (4) | 2.07% (14) | 0.62% (5) | 1.07% (15) | 2.45% (30) |
| ≥6 | 2.74% (35) | 0.93% (8) | 5.07% (43) | 2.05% (22) | 1.15% (27) | 3.81% (49) | 2.00% (15) | 1.29% (10) | 3.25% (22) | 0.87% (7) | 2.06% (29) | 4.24% (52) |
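The rows of this table count markers that recur across the sampling datasets (in all 10, in at least 9, and so on). The sketch below shows one way such counts could be tallied. The function names, the peak identifiers and the choice of denominator for the percentage (the union of all selected markers) are assumptions made for illustration; the record does not state how the percentages were normalized.

```python
from collections import Counter


def markers_shared_by_at_least(marker_sets, k):
    """Return the markers selected in at least k of the sampling sets."""
    counts = Counter(m for s in marker_sets for m in set(s))
    return {m for m, c in counts.items() if c >= k}


def shared_summary(marker_sets, thresholds):
    """Map each threshold k to (percent, number) of markers shared by >= k samplings.

    ASSUMPTION: the percentage is taken over the union of all selected markers.
    """
    union = set().union(*marker_sets) if marker_sets else set()
    summary = {}
    for k in thresholds:
        shared = markers_shared_by_at_least(marker_sets, k)
        pct = 100.0 * len(shared) / len(union) if union else 0.0
        summary[k] = (pct, len(shared))
    return summary


# Hypothetical peak identifiers from three samplings (illustration only).
samplings = [{"p1", "p2", "p3"}, {"p2", "p3", "p4"}, {"p3", "p5"}]
result = shared_summary(samplings, thresholds=[3, 2])
```

With ten sampling sets, calling `shared_summary(sets, thresholds=[10, 9, 8, 7, 6])` would produce one entry per row of the table.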
Metabolite biomarkers, identified during the past ten years, that differentiate patients with hepatocellular carcinoma (HCC) from those with cirrhosis (CIR).
| No. | True metabolite markers differentiating HCC and CIR | HMDB ID | Bio-fluid used for marker identification | Experimental strategy applied for marker identification | Reference |
|---|---|---|---|---|---|
| 1 | 16:0 lysophosphatidic acid | 10382 | Serum | Profiled and then identified by UPLC-ESI-TQMS based on the internal metabolite standard | |
| 2 | 18:0 lysophosphatidic acid | 10384 | Serum | Combining the TOF MS/MS with UPLC-SRM-MS/MS using internal standard-based isotope dilution | |
| 3 | Acetyl carnitine | 00201 | Serum/Urine | Verified by acquiring MS/MS spectra and further confirmed based on the structure of commercial standard | |
| 4 | Carnitine | 00562 | Serum/Urine | Discovered by serum-based isotope dilution using LC-MS/MS and analyzing the urine-based 1H MRS data | |
| 5 | Creatinine | 00062 | Urine | Identified experimentally by statistically analyzing the urine-based 1H MRS data | |
| 6 | Glycochenodeoxycholic acid | 00637 | Serum | Verified by acquiring MS/MS spectra and then quantified using internal standard-based isotope dilution by UPLC-MS/MS | |
| 7 | Glycocholic acid | 00138 | Serum | Verified by acquiring MS/MS spectra and then quantified using internal standard-based isotope dilution by UPLC-MS/MS | |
| 8 | Glycodeoxycholic acid | 00631 | Serum | Discovered by the serum-based isotope dilution integrating the internal standard with UPLC-SRM-MS/MS | |
| 9 | Oleamide | 02117 | Serum | Experimentally validated and identified by UPLC-MS profiling of serum-based data | |
| 10 | Phenylalanine | 00159 | Serum | Detected from the serum samples based on the targeted analysis using LC-MRM-MS/MS | |
| 11 | Phenylalanyl-tryptophan | 29006 | Serum | Identified by the targeted profiling using serum-based UPLC-MS and determined by isotope-labeled quantification | |
| 12 | Taurochenodeoxycholic acid | 00951 | Serum | Discovered by the serum-based isotope dilution integrating the internal standard with UPLC-SRM-MS/MS | |
| 13 | Taurocholic acid | 00036 | Serum | Verified by acquiring MS/MS spectra and then quantified using internal standard-based isotope dilution by UPLC-MS/MS | |
False discovery rate of different analytical strategies assessed by the number of true markers identified and the enrichment factor (EF) based on four pairs of benchmark datasets collected from the Metabolights database.
| Experiment ID | No. of cases / controls | No. of MS peaks detected | No. of metabolites annotated based on detected peaks | No. of true markers covered by detected metabolites | No. of differential peaks identified | No. of metabolites annotated based on identified peaks | No. of true markers covered by identified metabolites | Enrichment factor |
|---|---|---|---|---|---|---|---|---|
| SiE1 | 59/129 | 941 | 42,269 | 9 | 172 | 9,709 | 5 | |
| SiE2 | 13/50 | 1,209 | 43,614 | 8 | 174 | 3,296 | 2 | |
| | 72/179 | 941/1,209 | 32,592 | 5 | - | 930 | 1 | |
| | 72/179 | 734 | 34,840 | 7 | 141 | 2,523 | 4 | |
| SiE1 | 60/129 | 1,586 | 19,724 | 11 | 205 | 6,760 | 7 | |
| SiE2 | 13/50 | 3,230 | 24,157 | 11 | 215 | 5,815 | 5 | |
| | 73/179 | 1,586/3,230 | 19,724 | 11 | - | 1,862 | 5 | |
| | 73/179 | 1,144 | 19,272 | 11 | 182 | 5,503 | 7 | |
| SiE1 | 20/25 | 883 | 28,088 | 7 | 122 | 5,163 | 3 | |
| SiE2 | 20/25 | 825 | 25,950 | 7 | 135 | 7,049 | 4 | |
| | 40/50 | 883/825 | 22,992 | 7 | - | 1,307 | 2 | |
| | 40/50 | 665 | 23,040 | 7 | 107 | 1,931 | 2 | |
| SiE1 | 20/25 | 1,526 | 17,166 | 9 | 202 | 4,205 | 5 | |
| SiE2 | 20/25 | 1,542 | 17,966 | 8 | 202 | 6,028 | 4 | |
| | 40/50 | 1,526/1,542 | 15,215 | 8 | - | 1,789 | 4 | |
| | 40/50 | 872 | 14,935 | 8 | 82 | 2,469 | 3 | |
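The enrichment factor values did not survive in this record, and the formula is not given here. As a rough guide to how the count columns might combine, the sketch below uses one common definition of an enrichment factor: the fraction of true markers among the metabolites annotated from identified peaks, divided by the fraction among the metabolites annotated from all detected peaks. This definition and the function name are assumptions and may differ from the authors' calculation.

```python
def enrichment_factor(true_identified, n_identified, true_detected, n_detected):
    """Ratio of the true-marker rate among identified metabolites to the background rate.

    ASSUMED definition:
        EF = (true_identified / n_identified) / (true_detected / n_detected)
    """
    if n_identified == 0 or n_detected == 0 or true_detected == 0:
        raise ValueError("counts must be non-zero to form the ratio")
    hit_rate_identified = true_identified / n_identified
    hit_rate_background = true_detected / n_detected
    return hit_rate_identified / hit_rate_background


# Counts copied from the first SiE1 row of the table above.
ef_sie1 = enrichment_factor(
    true_identified=5, n_identified=9709,
    true_detected=9, n_detected=42269,
)
```

Under this assumed definition, a value above 1 would indicate that marker selection concentrates true markers relative to the full set of annotated metabolites.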