Guangxi Wang, Hantao Yao, Yan Gong, Zipeng Lu, Ruifang Pang, Yang Li, Yuyao Yuan, Huajie Song, Jia Liu, Yan Jin, Yongsu Ma, Yinmo Yang, Honggang Nie, Guangze Zhang, Zhu Meng, Zhe Zhou, Xuyang Zhao, Mantang Qiu, Zhicheng Zhao, Kuirong Jiang, Qiang Zeng, Limei Guo, Yuxin Yin.
Abstract
Pancreatic ductal adenocarcinoma (PDAC) is one of the most lethal cancers, characterized by rapid progression, metastasis, and difficulty in diagnosis. However, no effective liquid-based testing methods are available for PDAC detection. Here we introduce a minimally invasive approach that uses machine learning (ML) and lipidomics to detect PDAC. Using a greedy algorithm and mass spectrum feature selection, we selected 17 characteristic metabolites as detection features and developed a liquid chromatography-mass spectrometry-based targeted assay. In this study, 1033 patients with PDAC at various stages were examined. The approach achieved 86.74% accuracy with an area under the curve (AUC) of 0.9351 in the large external validation cohort and 85.00% accuracy with an AUC of 0.9389 in the prospective clinical cohort. Single-cell sequencing, proteomics, and mass spectrometry imaging were then applied and revealed notable alterations of the selected lipids in PDAC tissues. We propose that the ML-aided lipidomics approach be used for early detection of PDAC.
Year: 2021 PMID: 34936449 PMCID: PMC8694594 DOI: 10.1126/sciadv.abh2724
Source DB: PubMed Journal: Sci Adv ISSN: 2375-2548 Impact factor: 14.136
Fig. 1. Study strategy and classification performance of the ML-aided metabolic PDAC detection approach.
(A) Study strategy and schematic illustration of how the ML-aided metabolic PDAC detection approach was established. An exploratory study (n = 595) was used to build an SVM model. Features were selected by a greedy algorithm and LC-MS. The setup of the targeted lipid MRM-mode quantification assay and the validation study are outlined [n = 1898: 495 in the training set, 100 in the internal test set, 1003 in the external validation cohort (independent test set), and 300 in the prospective clinical cohort (clinical test set)]. (B and C) Classification performance of the approach on the training, cross-validation, and test sets in positive-ion mode (B) and negative-ion mode (C). Values are means ± SD over n = 5000 iterations; each dot represents the indicated metric for one iteration of SVM evaluation.
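The "means ± SD over n = 5000 iterations" summaries in (B) and (C) can be illustrated with a small sketch. The accuracies below are synthetic placeholders, not study data, and `summarize` is a hypothetical helper. The normal-approximation interval for the mean is an assumption; this record does not state how the published intervals were computed.

```python
import math
import random
import statistics

def summarize(values, z=1.959964):
    """Return the mean, sample SD, and a normal-approximation 95% CI of the mean."""
    n = len(values)
    mean = statistics.fmean(values)
    sd = statistics.stdev(values)
    half_width = z * sd / math.sqrt(n)  # shrinks as 1/sqrt(n)
    return mean, sd, (mean - half_width, mean + half_width)

# Synthetic per-iteration accuracies (placeholders, NOT data from the study).
rng = random.Random(0)
accuracies = [min(1.0, max(0.0, rng.gauss(0.95, 0.005))) for _ in range(5000)]

mean, sd, (ci_low, ci_high) = summarize(accuracies)
print(f"mean={mean:.4f} sd={sd:.4f} 95% CI=({ci_low:.4f}, {ci_high:.4f})")
```

Because the half-width scales with 1/sqrt(n), an interval for a mean taken over thousands of iterations is far narrower than the spread of the individual dots, which may be why intervals around mean accuracy can look very tight.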
Fig. 2. Greedy-based feature selection of the ML-aided metabolic PDAC detection approach.
(A) Greedy-based feature selection in positive-ion mode. The SVM model with the Top-27 features shows the best mean classification accuracy on the cross-validation dataset in positive-ion mode (n = 500 iterations). (B) Greedy-based feature selection in negative-ion mode. The SVM model with the Top-19 features shows the best mean classification accuracy on the cross-validation dataset in negative-ion mode (n = 500 iterations). (C and D) Classification performance of the modified approach after greedy-based feature selection on the training, cross-validation, and test sets in positive-ion mode (C) and negative-ion mode (D). Values are means ± SD over n = 5000 iterations; each dot represents the indicated metric for one iteration of SVM evaluation. The greedy algorithm is an efficient means of feature selection for the ML-aided metabolic PDAC detection approach.
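The idea of greedy feature selection scored by cross-validation accuracy can be sketched as follows. This is a minimal, dependency-free illustration and not the authors' pipeline: a nearest-centroid classifier stands in for the SVM, the data are synthetic, and all function names (`centroid_accuracy`, `cv_accuracy`, `greedy_select`) are hypothetical. The exact selection procedure used in the study is not described in this record.

```python
import random

def centroid_accuracy(train, test, features):
    """Accuracy of a nearest-centroid classifier restricted to `features`."""
    centroids = {}
    for label in (0, 1):
        rows = [x for x, y in train if y == label]
        centroids[label] = [sum(r[f] for r in rows) / len(rows) for f in features]
    correct = 0
    for x, y in test:
        dist = {
            label: sum((x[f] - c) ** 2 for f, c in zip(features, cen))
            for label, cen in centroids.items()
        }
        correct += min(dist, key=dist.get) == y
    return correct / len(test)

def cv_accuracy(data, features, k=5):
    """Mean accuracy over k interleaved cross-validation folds."""
    scores = []
    for i in range(k):
        test = data[i::k]
        train = [row for j, row in enumerate(data) if j % k != i]
        scores.append(centroid_accuracy(train, test, features))
    return sum(scores) / k

def greedy_select(data, n_features, max_features):
    """Forward selection: add the feature that gives the best CV accuracy each round,
    then keep the prefix of the ranking with the highest CV accuracy."""
    selected, history = [], []
    remaining = set(range(n_features))
    while remaining and len(selected) < max_features:
        best = max(sorted(remaining), key=lambda f: cv_accuracy(data, selected + [f]))
        selected.append(best)
        remaining.remove(best)
        history.append(cv_accuracy(data, selected))
    best_size = history.index(max(history)) + 1
    return selected[:best_size], history

# Synthetic data: features 0 and 1 carry signal, features 2-5 are pure noise.
rng = random.Random(0)
data = []
for i in range(600):
    y = i % 2
    x = [rng.gauss(0, 1) for _ in range(6)]
    x[0] += 2.0 * y
    x[1] -= 2.0 * y
    data.append((x, y))

selected, history = greedy_select(data, n_features=6, max_features=4)
print("selected features:", selected)
print("CV accuracy by subset size:", [round(h, 3) for h in history])
```

Choosing the subset size at the peak of the cross-validation curve mirrors how a "Top-k" feature set would be read off plots like (A) and (B), under the assumptions stated above.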
Classification performance of the ML-aided metabolic PDAC detection approach with feature selection in the exploratory and in the validation study.
| Exploratory study | Training set | Training set | Cross-validation | Cross-validation | Test set | Test set |
| Data collection | Positive-ion mode | Negative-ion mode | Positive-ion mode | Negative-ion mode | Positive-ion mode | Negative-ion mode |
| Mean accuracy | 0.9596 | 0.9535 | 0.9531 | 0.9472 | 0.9361 | 0.904 |
| 95% CI of accuracy | 0.9595–0.9597 | 0.9534–0.9537 | 0.9526–0.9536 | 0.9467–0.9477 | 0.9357–0.9365 | 0.9034–0.9047 |
| Mean specificity | 0.9863 | 0.9736 | 0.9858 | 0.9667 | 0.8992 | 0.8315 |
| 95% CI of specificity | 0.9861–0.9864 | 0.9734–0.9738 | 0.9854–0.9862 | 0.9661–0.9674 | 0.8983–0.9001 | 0.8300–0.8331 |
| Mean sensitivity | 0.9397 | 0.9385 | 0.9284 | 0.9324 | 0.973 | 0.9766 |
| 95% CI of sensitivity | 0.9395–0.9399 | 0.9383–0.9388 | 0.9275–0.9292 | 0.9316–0.9333 | 0.9727–0.9732 | 0.9761–0.9770 |

| Validation study | Training set (n = 495) | Internal test set (n = 100) | External validation cohort (n = 1003) | Prospective clinical cohort (n = 300) |
| Accuracy | 0.8949 | 0.8600 | 0.8674 | 0.8500 |
| Mean squared error | 0.1051 | 0.1400 | 0.1326 | 0.1500 |
| Specificity | 0.8915 | 0.8000 | 0.8610 | 0.8100 |
| 95% CI of specificity | 0.8398–0.9285 | 0.6586–0.8950 | 0.8225–0.8925 | 0.7473–0.8605 |
| Sensitivity | 0.8975 | 0.9200 | 0.8717 | 0.9300 |
| 95% CI of sensitivity | 0.8547–0.9292 | 0.7989–0.9741 | 0.8416–0.8968 | 0.8562–0.9690 |
| AUC | 0.9591 | 0.9444 | 0.9351 | 0.9389 |
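The relationship between the tabulated accuracy, mean squared error, sensitivity, specificity, and their intervals can be checked with a short sketch. The confusion counts below are an assumption: the record does not give them, and they are inferred by supposing that the 100-sample internal test set contains 50 PDAC cases and 50 controls. The continuity-corrected Wilson score interval is likewise an assumption about the method, which the source does not name. `metrics` and `wilson_cc` are hypothetical helper names.

```python
import math

def metrics(tp, fn, tn, fp):
    """Basic classification metrics from confusion counts."""
    total = tp + fn + tn + fp
    accuracy = (tp + tn) / total
    return {
        "accuracy": accuracy,
        # With 0/1 labels and 0/1 predictions, MSE equals the error rate.
        "mse": 1 - accuracy,
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
    }

def wilson_cc(successes, n, z=1.959964):
    """Wilson score interval with continuity correction for a binomial proportion."""
    p = successes / n
    denom = 2 * (n + z * z)
    lower_root = z * z - 2 - 1 / n + 4 * p * (n * (1 - p) + 1)
    upper_root = z * z + 2 - 1 / n + 4 * p * (n * (1 - p) - 1)
    lower = (2 * n * p + z * z - 1 - z * math.sqrt(lower_root)) / denom
    upper = (2 * n * p + z * z + 1 + z * math.sqrt(upper_root)) / denom
    if successes == 0:
        lower = 0.0
    if successes == n:
        upper = 1.0
    return max(0.0, lower), min(1.0, upper)

# ASSUMED counts for the internal test set (50 cases, 50 controls); not from the source.
m = metrics(tp=46, fn=4, tn=40, fp=10)
spec_ci = wilson_cc(40, 50)
sens_ci = wilson_cc(46, 50)
print(m)
print("specificity CI:", spec_ci)
print("sensitivity CI:", sens_ci)
```

Under these assumed counts the sketch gives accuracy 0.86, MSE 0.14, specificity 0.80, and sensitivity 0.92, and the two intervals agree with the internal test set column to within rounding. This suggests, but does not confirm, that a continuity-corrected Wilson interval was used.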
Fig. 3. Method establishment and classification performance of the ML-aided metabolic PDAC detection approach in the validation study.
(A) Extracted ion chromatogram of the 17 selected lipids quantified by the MRM-mode assay. The 17 selected lipid markers (DG 18:1-18:1; LPC 14:0, 16:0, 18:1, and 20:4; PC 16:0-16:0, 16:0-18:1, 18:0-18:2, 18:0-20:3, 16:0-22:5, 18:0-22:5, and O-16:0-18:2; LPE 22:4; PE 16:0-18:2; and SM d18:1/18:0, d18:2/24:1, and d18:2/24:2) are shown with standards in a single 19-min LC-MS run. Each lipid is represented by a different color. (B) ROC curve of the approach on the training set of the validation study. (C) ROC curve of the approach on the internal validation dataset of the validation study. (D) ROC curve of the approach on the external validation dataset of the validation study. In (B) to (D), the asterisk denotes the cutoff (score = 0) for the ML-aided metabolic PDAC detection approach. The approach shows good performance in detecting PDAC in the independent external validation dataset. (E) ROC curves of the approach and CA19-9, with CT for comparison, on the prospective clinical cohort of the validation study (the ML-aided metabolic PDAC detection approach in black, CT in green, and CA19-9 in blue). The red asterisk denotes the cutoff (score = 0) for the approach, the red plus sign denotes CT diagnosis, and the red multiplication sign denotes the 37 U/ml cutoff for CA19-9. On the prospective clinical cohort, the approach performs accurately and robustly, and better than CA19-9 and CT scanning.
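How an AUC and an operating point at a fixed cutoff relate to classifier scores can be sketched in a few lines. The scores and labels below are toy values, not study data. The function names are hypothetical, and treating a score above 0 as a PDAC call is an assumption about the sign convention behind the "score = 0" cutoff.

```python
def roc_auc(scores, labels):
    """AUC as the probability that a random positive outscores a random negative
    (ties count as one half)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def at_cutoff(scores, labels, cutoff=0.0):
    """Sensitivity and specificity when scores above `cutoff` are called positive."""
    tp = sum(1 for s, y in zip(scores, labels) if s > cutoff and y == 1)
    fn = sum(1 for s, y in zip(scores, labels) if s <= cutoff and y == 1)
    tn = sum(1 for s, y in zip(scores, labels) if s <= cutoff and y == 0)
    fp = sum(1 for s, y in zip(scores, labels) if s > cutoff and y == 0)
    return tp / (tp + fn), tn / (tn + fp)

# Toy decision scores (placeholders): label 1 = PDAC, label 0 = control.
scores = [-1.2, -0.4, 0.3, -0.1, 0.8, 1.5]
labels = [0, 0, 0, 1, 1, 1]

auc = roc_auc(scores, labels)
sensitivity, specificity = at_cutoff(scores, labels, cutoff=0.0)
print(f"AUC={auc:.4f} sensitivity={sensitivity:.4f} specificity={specificity:.4f}")
```

The same `at_cutoff` helper could be applied to a concentration-type marker by passing its threshold as `cutoff` (for example 37 for CA19-9 in U/ml), which yields the single marked point on that marker's ROC curve.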