Lun Zhang1, Jiamin Zheng1, Rashid Ahmed2, Guoyu Huang2, Jennifer Reid1, Rupasri Mandal1, Andrew Maksymuik3,4, Daniel S Sitar4,5, Paramjit S Tappia6, Bram Ramjiawan6, Philippe Joubert7, Alessandro Russo8,9, Christian D Rolfo9, David S Wishart1.
Abstract
The objective of this research is to use metabolomic techniques to discover and validate plasma metabolite biomarkers for the diagnosis of early-stage non-small cell lung cancer (NSCLC). The study included plasma samples from 156 patients with biopsy-confirmed NSCLC along with age- and gender-matched plasma samples from 60 healthy controls. A fully quantitative targeted mass spectrometry (MS) analysis (targeting 138 metabolites) was performed on all samples. The sample set was split into a discovery set and a validation set. Metabolite concentration data, clinical data, and smoking history were used to determine optimal sets of biomarkers and optimal regression models for identifying different stages of NSCLC using the discovery set. The same biomarkers and regression models were then assessed on the validation set. Univariate and multivariate statistical analysis identified β-hydroxybutyric acid, LysoPC 20:3, PC ae C40:6, citric acid, and fumaric acid as being significantly different between healthy controls and stage I/II NSCLC. Robust predictive models with areas under the curve (AUC) > 0.9 were developed and validated using these metabolites and other, easily measured clinical data for detecting different stages of NSCLC. This study successfully identified and validated a simple, high-performing, metabolite-based plasma test for detecting early-stage (I/II) NSCLC. While promising, further validation on larger and more diverse cohorts is still required.
Keywords: LC-MS; cancer staging; early detection; lung cancer; metabolomics
Year: 2020 PMID: 32156060 PMCID: PMC7139410 DOI: 10.3390/cancers12030622
Source DB: PubMed Journal: Cancers (Basel) ISSN: 2072-6694 Impact factor: 6.639
Summary of grouping of samples.
| Stage I NSCLC | 47 | 49–79 | 66 | 32 | 15 | 18 | 29 | 10 | 26 | 11 | 36 |
| Stage II NSCLC | 40 | 49–79 | 61.5 | 29 | 11 | 11 | 29 | 3 | 34 | 3 | 34 |
| Stage IIIB/IV NSCLC | 26 | 42–79 | 63 | 20 | 6 | 14 | 12 | 0 | 18 | 8 | 43 |
| Healthy control | 40 | 49–77 | 62.5 | NA | NA | 18 | 22 | 25 | 15 | 0 | 11 |
| Total | 153 | 42–79 | 64 | 81 | 32 | 61 | 92 | 57 | 131 | 28 | 33 |

| Stage I NSCLC | 23 | 49–78 | 65 | 18 | 5 | 8 | 15 | 4 | 14 | 5 | 35 |
| Stage II NSCLC | 20 | 51–78 | 64 | 11 | 9 | 9 | 11 | 2 | 16 | 2 | 38 |
| Healthy control | 20 | 49–77 | 62.5 | NA | NA | 8 | 12 | 13 | 7 | 0 | 5 |
| Total | 63 | 49–78 | 65 | 29 | 14 | 25 | 38 | 57 | 131 | 28 | 27 |
* 1 pack = 20 cigarettes.
Figure 1. Partial least squares discriminant analysis (PLS-DA) results showing the comparison between plasma metabolite data acquired for healthy controls vs. stage I non-small cell lung cancer (NSCLC) patients. (a) 2-D PLS-DA scores plots; (b) variable importance in projection plot. The most discriminating metabolites are shown in descending order of their coefficient scores. The color boxes indicate whether metabolite concentration is increased (red) or decreased (green) in controls vs. cases.
Figure 2. Receiver-operating characteristic (ROC) curve generated by the logistic regression models for diagnosing stage I NSCLC patients. (a) ROC curve of the metabolite-only model; (b) ROC curve of the metabolites + smoking model. ROC curves and their 95% CI on the discovery set are shown in blue. ROC curves obtained from the validation set are colored in red.
Logistic regression based optimal model for stage I NSCLC detection: metabolites only.
logit(P) = log(P/(1 − P)) = 0.258 − 1.341 × PC ae C40:6 + 1.747 × LysoPC 20:3 + 0.913 × β-hydroxybutyric acid + 0.939 × Fumaric acid.

| Term | Estimate | Std. error | z value | p value | Odds ratio |
|---|---|---|---|---|---|
| (Intercept) | 0.258 | 0.352 | 0.733 | 0.463 | - |
| LysoPC 20:3 | 1.747 | 0.518 | 3.37 | 0.001 | 5.73 |
| β-hydroxybutyric acid | 0.913 | 0.404 | 2.263 | 0.024 | 2.49 |
| Fumaric acid | 0.939 | 0.446 | 2.106 | 0.035 | 2.56 |
| PC ae C40:6 | −1.341 | 0.465 | −2.884 | 0.001 | 0.26 |

| 0.939 (0.924–0.955) | 0.827 (0.791–0.863) | 0.957 (0.936–0.977) |
| 0.923 (0.866–0.980) | 0.830 (0.830–0.937) | 0.927 (0.847–1.000) |
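As a minimal sketch of how the regression equation above turns into a predicted probability, the snippet below inverts the logit for the stage I, metabolites-only model. The function and variable names are hypothetical. The inputs are assumed to be metabolite values already transformed and scaled the same way as in the study; that preprocessing is not described in this excerpt, so the example inputs are placeholders rather than real measurements.

```python
import math

# Coefficients copied from the stage I, metabolites-only equation above.
STAGE_I_INTERCEPT = 0.258
STAGE_I_COEFS = {
    "PC ae C40:6": -1.341,
    "LysoPC 20:3": 1.747,
    "beta-hydroxybutyric acid": 0.913,
    "Fumaric acid": 0.939,
}


def predict_probability(values, intercept=STAGE_I_INTERCEPT, coefs=STAGE_I_COEFS):
    """Return P from logit(P) = intercept + sum(coef * value).

    `values` maps predictor name to its (assumed pre-scaled) value.
    """
    missing = set(coefs) - set(values)
    if missing:
        raise KeyError(f"missing predictors: {sorted(missing)}")
    logit = intercept + sum(coefs[name] * values[name] for name in coefs)
    # Inverse of the logit link: P = 1 / (1 + exp(-logit))
    return 1.0 / (1.0 + math.exp(-logit))


# Placeholder input: every predictor at 0 on the assumed scaled axis,
# so the logit reduces to the intercept alone.
baseline = predict_probability({name: 0.0 for name in STAGE_I_COEFS})

# Placeholder input: LysoPC 20:3 raised by one unit, everything else at 0.
shifted_values = {name: 0.0 for name in STAGE_I_COEFS}
shifted_values["LysoPC 20:3"] = 1.0
shifted = predict_probability(shifted_values)
```

Because the LysoPC 20:3 coefficient is positive, `shifted` comes out larger than `baseline`; a negative coefficient such as the one for PC ae C40:6 would move the probability the other way. The same function can be reused for the other models by passing their intercept and coefficients.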
Logistic regression based optimal model for stage I NSCLC detection: metabolites plus smoking history.
logit(P) = log(P/(1 − P)) = 0.311 + 0.641 × Amount of smoking − 1.372 × PC ae C40:6 + 1.623 × LysoPC 20:3 + 0.882 × β-hydroxybutyric acid + 0.65 × Fumaric acid.

| Term | Estimate | Std. error | z value | p value | Odds ratio |
|---|---|---|---|---|---|
| (Intercept) | 0.311 | 0.369 | 0.843 | 0.399 | - |
| Amount of smoking | 0.641 | 0.382 | 1.676 | 0.094 | 1.9 |
| PC ae C40:6 | −1.372 | 0.475 | −2.886 | 0.004 | 0.25 |
| LysoPC 20:3 | 1.623 | 0.495 | 3.281 | 0.001 | 5.07 |
| β-hydroxybutyric acid | 0.882 | 0.419 | 2.105 | 0.035 | 2.42 |
| Fumaric acid | 0.65 | 0.474 | 1.373 | 0.17 | 1.92 |

| 0.942 (0.926–0.957) | 0.844 (0.809–0.879) | 0.951 (0.929–0.973) |
| 0.922 (0.864–0.979) | 0.851 (0.851–0.953) | 0.951 (0.882–1.000) |
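The five numeric columns in the coefficient rows above are consistent with a standard Wald summary for logistic regression: estimate, standard error, z statistic, two-sided p-value, and odds ratio. The sketch below shows how the last three follow from the first two, using the "Amount of smoking" row as input. The function name is hypothetical, and the normal approximation for the p-value is an assumption about how the table was produced; small differences from the printed values are expected because the inputs are already rounded.

```python
import math


def wald_summary(estimate, std_error):
    """Derive z, two-sided p-value, and odds ratio from a coefficient and its SE."""
    z = estimate / std_error
    # Two-sided p-value under a standard normal: 2 * (1 - Phi(|z|)) = erfc(|z| / sqrt(2))
    p_value = math.erfc(abs(z) / math.sqrt(2.0))
    # Odds ratio for a one-unit increase in the predictor
    odds_ratio = math.exp(estimate)
    return z, p_value, odds_ratio


# "Amount of smoking" row of the stage I, metabolites + smoking model
z, p_value, odds_ratio = wald_summary(0.641, 0.382)
print(f"z = {z:.3f}, p = {p_value:.3f}, OR = {odds_ratio:.2f}")
```

An odds ratio above 1 corresponds to a positive coefficient (higher values raise the odds of being classified as NSCLC), and an odds ratio below 1 corresponds to a negative one.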
Figure 3. PLS-DA results showing the comparison between plasma metabolite data acquired for healthy controls vs. stage II NSCLC patients. (a) 2-D PLS-DA scores plots; (b) variable importance in projection plot. The most discriminating metabolites are shown in descending order of their coefficient scores. The color boxes indicate whether metabolite concentration is increased (red) or decreased (green) in controls vs. cases.
Figure 4. ROC curve generated by the logistic regression models for stage II NSCLC patients. (a) ROC curve of the metabolites-only model; (b) ROC curve of the metabolites + smoking history model. ROC curves and their 95% CI on the discovery set are shown in blue. ROC curves obtained from the validation set are colored in red.
Logistic regression based optimal model for stage II NSCLC detection: metabolites only.
logit(P) = log(P/(1 − P)) = 0.346 + 2.565 × β-hydroxybutyric acid − 2.219 × Citric acid + 2.904 × Carnitine − 1.599 × PC ae C40:6.

| Term | Estimate | Std. error | z value | p value | Odds ratio |
|---|---|---|---|---|---|
| (Intercept) | 0.346 | 0.516 | 0.671 | 0.502 | - |
| β-hydroxybutyric acid | 2.565 | 0.861 | 2.981 | 0.003 | 13.93 |
| Citric acid | −2.219 | 0.804 | −2.758 | 0.006 | 0.11 |
| Carnitine | 2.904 | 0.976 | 2.975 | 0.003 | 18.24 |
| PC ae C40:6 | −1.599 | 0.765 | −2.091 | 0.037 | 0.2 |

| 0.980 (0.973–0.987) | 0.958 (0.938–0.979) | 0.881 (0.854–0.909) |
| 0.952 (0.909–0.995) | 0.875 (0.875–0.977) | 0.875 (0.773–0.977) |
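The abstract summarizes model performance as an area under the ROC curve (AUC) above 0.9. As a rough illustration of what that number measures, the sketch below computes the AUC as the fraction of case/control pairs in which the case receives the higher predicted probability, with ties counted as half. The function name and the scores are made up for illustration; they are not data from the study, and this pairwise method is not necessarily how the authors computed their AUCs.

```python
def auc_from_scores(case_scores, control_scores):
    """AUC as the probability that a random case outscores a random control."""
    if not case_scores or not control_scores:
        raise ValueError("both groups need at least one score")
    wins = 0.0
    for case in case_scores:
        for control in control_scores:
            if case > control:
                wins += 1.0
            elif case == control:
                wins += 0.5  # ties count as half a win
    return wins / (len(case_scores) * len(control_scores))


# Toy predicted probabilities (hypothetical, not from the study)
toy_cases = [0.9, 0.8, 0.35]
toy_controls = [0.4, 0.2, 0.1]
toy_auc = auc_from_scores(toy_cases, toy_controls)
```

An AUC of 1.0 means every case scores above every control, while 0.5 means the model ranks them no better than chance.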
Logistic regression based optimal model for stage II NSCLC detection: metabolites plus smoking history.
logit(P) = log(P/(1 − P)) = 0.098 + 1.489 × Amount of smoking + 2.911 × β-hydroxybutyric acid − 1.627 × Citric acid + 2.605 × Carnitine − 0.702 × PC ae C40:6.

| Term | Estimate | Std. error | z value | p value | Odds ratio |
|---|---|---|---|---|---|
| (Intercept) | −0.098 | 0.612 | 0.159 | 0.873 | - |
| Amount of smoking | 1.489 | 0.915 | 1.627 | 0.104 | 4.43 |
| β-hydroxybutyric acid | 2.911 | 1.132 | 2.572 | 0.01 | 18.37 |
| Citric acid | −1.627 | 0.864 | −1.883 | 0.06 | 0.2 |
| Carnitine | 2.605 | 0.936 | 2.784 | 0.005 | 13.53 |
| PC ae C40:6 | −0.702 | 0.862 | −0.814 | 0.416 | 0.5 |

| 0.985 (0.979–0.991) | 0.972 (0.955–0.989) | 0.875 (0.841–0.909) |
| 0.948 (0.900–0.996) | 0.925 (0.925–1.000) | 0.850 (0.739–0.961) |
Figure 5. ROC curve generated by the logistic regression models for NSCLC patients at early stages (stage I + II). (a) ROC curve of the metabolites-only model; (b) ROC curve of the metabolites + smoking history model. ROC curves and their 95% CI on the discovery set are shown in blue. ROC curves obtained from the validation set are colored in red.
Logistic regression based optimal model for stages I + II NSCLC detection: metabolites only.
logit(P) = log(P/(1 − P)) = 2.346 − 1.528 × PC ae C40:6 + 1.429 × β-hydroxybutyric acid − 2.481 × Citric acid + 1.03 × LysoPC 20:3 + 1.773 × Fumaric acid.

| Term | Estimate | Std. error | z value | p value | Odds ratio |
|---|---|---|---|---|---|
| (Intercept) | 2.346 | 0.588 | 3.991 | <0.001 | - |
| PC ae C40:6 | −1.528 | 0.61 | −2.507 | 0.012 | 0.22 |
| β-hydroxybutyric acid | 1.429 | 0.505 | 2.832 | 0.005 | 4.18 |
| Citric acid | −2.481 | 0.642 | −3.863 | <0.001 | 0.08 |
| LysoPC 20:3 | 1.03 | 0.508 | 2.028 | 0.043 | 2.8 |
| Fumaric acid | 1.773 | 0.569 | 3.117 | 0.002 | 5.89 |

| 0.974 (0.965–0.982) | 0.937 (0.920–0.954) | 0.922 (0.895–0.950) |
| 0.959 (0.923–0.995) | 0.919 (0.919–0.976) | 0.900 (0.807–0.993) |
Logistic regression based optimal model for stages I + II NSCLC detection: metabolites plus smoking history.
logit(P) = log(P/(1 − P)) = 2.427 + 1.425 × Amount of smoking − 1.048 × PC ae C40:6 + 1.414 × β-hydroxybutyric acid − 2.193 × Citric acid + 1.738 × LysoPC 20:3 + 1.44 × Fumaric acid.

| Term | Estimate | Std. error | z value | p value | Odds ratio |
|---|---|---|---|---|---|
| (Intercept) | 2.427 | 0.638 | 3.803 | <0.001 | - |
| Amount of smoking | 1.425 | 0.507 | 2.813 | 0.005 | 4.16 |
| PC ae C40:6 | −1.048 | 0.64 | −1.637 | 0.102 | 0.35 |
| β-hydroxybutyric acid | 1.414 | 0.594 | 2.379 | 0.017 | 4.11 |
| Citric acid | −2.193 | 0.719 | −3.051 | 0.002 | 0.11 |
| LysoPC 20:3 | 1.738 | 0.739 | 2.351 | 0.019 | 5.68 |
| Fumaric acid | 1.44 | 0.612 | 2.352 | 0.019 | 4.22 |

| 0.982 (0.975–0.990) | 0.960 (0.946–0.974) | 0.944 (0.921–0.968) |
| 0.965 (0.930–1.000) | 0.930 (0.930–0.984) | 0.925 (0.843–1.000) |