Yun-Fan Liu1, Xin Shu1, Xiao-Feng Qiao1, Guang-Yong Ai1, Li Liu2, Jun Liao2, Shuang Qian2, Xiao-Jing He1.
Abstract
Objective: To develop and validate a noninvasive radiomics-based machine learning (ML) model that identifies P504s/P63 status and thereby supports the diagnosis of prostate cancer (PCa).
Keywords: MRI; P504s/P63; immunohistochemistry; machine learning; prostate cancer
Year: 2022; PMID: 35795067; PMCID: PMC9252170; DOI: 10.3389/fonc.2022.911426
Source DB: PubMed; Journal: Front Oncol; ISSN: 2234-943X; Impact factor: 5.738
Figure 1. Flow chart of the patient selection process.
Figure 2. The flow chart of this study.
Figure 3. Sorted element mutual information values.
Figure 4. ROC curves of five ML prediction models. (A) ROC curve of RF. (B) ROC curve of AdaBoost. (C) ROC curve of GBDT. (D) ROC curve of LR. (E) ROC curve of KNN. The solid line represents the prediction efficiency of each group in the corresponding model. Dotted lines represent the overall prediction performance of the models, including the macro-average ROC curves and micro-average ROC curves.
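The micro- and macro-average curves named in the Figure 4 caption summarise a three-class problem in two different ways. The following is a minimal sketch, not the authors' code, assuming one common definition: the macro-average AUC is the unweighted mean of the one-vs-rest AUCs, while the micro-average AUC pools every (sample, class) pair into a single binary problem. The function names and the toy probabilities are hypothetical.

```python
from statistics import mean


def binary_auc(labels, scores):
    """AUC as the probability that a positive outranks a negative (ties count 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def micro_macro_auc(y_true, proba, n_classes):
    """Return (micro-average AUC, macro-average AUC, per-class one-vs-rest AUCs)."""
    # One-vs-rest AUC for each class, then an unweighted mean (macro).
    per_class = [
        binary_auc([int(y == k) for y in y_true], [row[k] for row in proba])
        for k in range(n_classes)
    ]
    macro = mean(per_class)
    # Flatten the one-hot labels and the probability matrix in the same order (micro).
    flat_labels = [int(y == k) for y in y_true for k in range(n_classes)]
    flat_scores = [row[k] for row in proba for k in range(n_classes)]
    micro = binary_auc(flat_labels, flat_scores)
    return micro, macro, per_class


if __name__ == "__main__":
    # Hypothetical labels and predicted probabilities, not data from the study.
    y_true = [0, 0, 1, 1, 1, 2, 2, 2]
    proba = [
        [0.4, 0.5, 0.1], [0.6, 0.3, 0.1], [0.2, 0.7, 0.1], [0.1, 0.6, 0.3],
        [0.3, 0.5, 0.2], [0.1, 0.2, 0.7], [0.0, 0.1, 0.9], [0.1, 0.3, 0.6],
    ]
    print(micro_macro_auc(y_true, proba, 3))
```

Because the macro average weights every class equally, a small class that is ranked poorly lowers it more than it lowers the pooled micro average, which is one reason the two values can differ.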
Results obtained by different ML algorithms.
| Models | AUC (micro-average) | AUC (macro-average) | Brier score | Precision | Recall | F1-score | Accuracy |
|---|---|---|---|---|---|---|---|
| RF | 0.920 | 0.870 | 0.407 | 0.820 | 0.740 | 0.750 | 0.850 |
| GBDT | 0.910 | 0.870 | 0.441 | 0.870 | 0.690 | 0.690 | 0.800 |
| LR | 0.890 | 0.840 | 0.746 | 0.640 | 0.640 | 0.640 | 0.760 |
| AdaBoost | 0.890 | 0.870 | 0.610 | 0.750 | 0.700 | 0.700 | 0.800 |
| KNN | 0.890 | 0.860 | 0.610 | 0.720 | 0.720 | 0.720 | 0.690 |
AUC, area under the curve; RF, random forest; GBDT, gradient boosting decision tree; LR, logistic regression; KNN, k-nearest neighbours.
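The Brier score in the table above measures how far predicted class probabilities lie from the observed outcome, with lower values being better. The source does not state which multiclass form was used; the sketch below assumes the original multiclass definition, the mean over samples of the summed squared differences between the predicted probability vector and the one-hot outcome, which ranges from 0 to 2. The function name and inputs are hypothetical.

```python
def multiclass_brier(y_true, proba, n_classes):
    """Mean over samples of sum_k (p_k - o_k)^2, where o is the one-hot true label.

    Assumed definition (range 0 to 2); the study may have used a different form.
    """
    total = 0.0
    for y, row in zip(y_true, proba):
        total += sum((row[k] - (1.0 if k == y else 0.0)) ** 2 for k in range(n_classes))
    return total / len(y_true)


if __name__ == "__main__":
    # Hypothetical predictions for three cases, not data from the study.
    print(multiclass_brier([0, 1, 2], [[0.5, 0.4, 0.1], [0.2, 0.7, 0.1], [0.1, 0.1, 0.8]], 3))
```

Under this definition a perfectly confident correct prediction scores 0, a uniform guess over three classes scores 2/3, and a perfectly confident wrong prediction scores 2.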
Interlabel predictive performance of five ML models.
| Models | Label | Sensitivity | Specificity | PPV | NPV | Youden index | Accuracy |
|---|---|---|---|---|---|---|---|
| RF | 0 | 0.333 | 0.980 | 0.750 | 0.891 | 0.313 | 0.831 |
| RF | 1 | 0.917 | 0.857 | 0.815 | 0.938 | 0.774 | 0.831 |
| RF | 2 | 0.962 | 0.909 | 0.893 | 0.968 | 0.871 | 0.932 |
| GBDT | 0 | 0.222 | 1.000 | 1.000 | 0.877 | 0.222 | 0.881 |
| GBDT | 1 | 0.875 | 0.829 | 0.778 | 0.906 | 0.704 | 0.847 |
| GBDT | 2 | 0.962 | 0.848 | 0.833 | 0.966 | 0.810 | 0.898 |
| LR | 0 | 0.222 | 0.900 | 0.286 | 0.865 | 0.122 | 0.797 |
| LR | 1 | 0.750 | 0.829 | 0.750 | 0.829 | 0.579 | 0.797 |
| LR | 2 | 0.961 | 0.909 | 0.893 | 0.968 | 0.871 | 0.932 |
| AdaBoost | 0 | 0.333 | 0.920 | 0.429 | 0.885 | 0.253 | 0.831 |
| AdaBoost | 1 | 0.792 | 0.857 | 0.792 | 0.857 | 0.649 | 0.831 |
| AdaBoost | 2 | 0.962 | 0.909 | 0.893 | 0.968 | 0.871 | 0.932 |
| KNN | 0 | 0.444 | 0.900 | 0.444 | 0.900 | 0.334 | 0.831 |
| KNN | 1 | 0.750 | 0.886 | 0.818 | 0.838 | 0.636 | 0.831 |
| KNN | 2 | 0.962 | 0.909 | 0.893 | 0.968 | 0.871 | 0.932 |
AUC, area under the curve; PPV, positive predictive value; NPV, negative predictive value; RF, random forest; GBDT, gradient boosting decision tree; LR, logistic regression; KNN, k-nearest neighbours.
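The per-label columns above are standard one-vs-rest quantities, and the Youden index is sensitivity + specificity − 1 (for RF, label 0: 0.333 + 0.980 − 1 = 0.313). As a minimal sketch of how such a row can be derived from a multiclass confusion matrix, the following treats each label in turn as "positive" and all other labels as "negative". The function name and the confusion matrix are hypothetical, and the accuracy shown is the one-vs-rest accuracy, which is an assumption about how that column was computed.

```python
def one_vs_rest_metrics(cm):
    """Per-label metrics from a square confusion matrix.

    cm[i][j] is the number of cases with actual label i predicted as label j.
    Assumes every label has at least one actual and one predicted case,
    otherwise a division by zero occurs.
    """
    n = len(cm)
    total = sum(sum(row) for row in cm)
    out = {}
    for k in range(n):
        tp = cm[k][k]
        fn = sum(cm[k]) - tp                        # actual k, predicted otherwise
        fp = sum(cm[i][k] for i in range(n)) - tp   # predicted k, actually otherwise
        tn = total - tp - fn - fp
        sens = tp / (tp + fn)
        spec = tn / (tn + fp)
        out[k] = {
            "sensitivity": sens,
            "specificity": spec,
            "ppv": tp / (tp + fp),
            "npv": tn / (tn + fn),
            "youden": sens + spec - 1,
            "accuracy": (tp + tn) / total,  # one-vs-rest accuracy (assumed definition)
        }
    return out


if __name__ == "__main__":
    # Hypothetical 3-class confusion matrix, not taken from the study.
    cm = [[8, 2, 0], [1, 15, 4], [1, 3, 16]]
    for label, metrics in one_vs_rest_metrics(cm).items():
        print(label, {name: round(value, 3) for name, value in metrics.items()})
```

A label with few cases can show high specificity and NPV alongside low sensitivity, as the label 0 rows in the table do, because the "negative" group is dominated by the other two labels.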
Figure 5. The recognition effects of five prediction models. The information above the horizontal axis represents the model prediction grouping, and that below this axis represents the actual grouping. Purple represents the correct prediction group, and blue represents a case of misrecognition by the corresponding model.
Figure 6. ROC curves of the RF models established by T2WI (red), DWI (blue), ADC (green) and merged sequences (black). The corresponding areas under the micro-average ROC curves were 0.94, 0.92, 0.92, and 0.93, respectively. The corresponding areas under the macro-average ROC curves were 0.78, 0.84, 0.84 and 0.90, respectively.
Results obtained by the RF models constructed with different datasets.
| Datasets | AUC (micro-average) | AUC (macro-average) | Precision | Recall | F1-score | Accuracy |
|---|---|---|---|---|---|---|
| T2WI | 0.940 | 0.840 | 0.790 | 0.800 | 0.790 | 0.800 |
| DWI | 0.920 | 0.840 | 0.800 | 0.830 | 0.800 | 0.830 |
| ADC | 0.920 | 0.780 | 0.800 | 0.810 | 0.800 | 0.810 |
| Merge | 0.930 | 0.900 | 0.840 | 0.850 | 0.880 | 0.850 |