Jiaru Li1, Ziyi Yang2,3, Bowen Xin1, Yichao Hao1, Lisheng Wang4, Shaoli Song2,3, Junyan Xu2,3, Xiuying Wang1.
Abstract
OBJECTIVES: Microsatellite instability (MSI) status is an important hallmark for prognosis prediction and treatment recommendation in colorectal cancer (CRC). Because clinical preoperative evaluation of microsatellite status is invasive, we investigated the value of preoperative 18F-FDG PET/CT radiomics combined with machine learning for predicting the microsatellite status of colorectal cancer patients.
Keywords: 18F-FDG PET/CT; colorectal cancer; machine learning; microsatellite status; radiomics
Year: 2021 PMID: 34367985 PMCID: PMC8339969 DOI: 10.3389/fonc.2021.702055
Source DB: PubMed Journal: Front Oncol ISSN: 2234-943X Impact factor: 6.244
Figure 1. The overall flowchart for predicting the microsatellite instability status of colorectal cancer patients.
Figure 2. The flowchart of feature selection for selecting representative, non-redundant, and relevant features, as well as handling imbalanced data.
Demographic and clinical characteristics of study subjects.
| Characteristics | Total population (n = 173) | MSI-H (n = 13) | MSS (n = 160) | P-value |
|---|---|---|---|---|
| Age, median (range) | 61 (24–85) | 61 (32–70) | 61 (24–85) | 0.454 |
| Gender, n (%) | 173 | 13 | 160 | 0.157 |
| Male | 99 (57.2) | 5 (38.5) | 94 (58.8) | |
| Female | 74 (42.8) | 8 (61.5) | 66 (41.2) | |
| Stage, n (%) | | | | |
| I | 11 (6.4) | 3 (23.1) | 8 (5.0) | |
| II | 46 (26.6) | 7 (53.8) | 39 (24.4) | |
| III | 65 (37.6) | 3 (23.1) | 62 (38.8) | |
| IV | 51 (29.4) | 0 (0.0) | 51 (31.8) | |
| CEA (ng/ml), n (%) | | | | 0.286 |
| ≥5.2 | 83 (48.0) | 11 (84.6) | 72 (45.0) | |
| <5.2 | 90 (52.0) | 2 (15.4) | 88 (55.0) | |
| Location, n (%) | | | | |
| Ascending | 18 (10.4) | 2 (15.4) | 16 (10.0) | |
| Descending | 21 (12.1) | 3 (23.1) | 18 (11.2) | |
| Ileocecum | 12 (6.9) | 3 (23.1) | 9 (5.6) | |
| Rectum | 63 (36.4) | 4 (30.7) | 59 (36.9) | |
| Sigmoid | 48 (27.8) | 1 (7.7) | 47 (29.4) | |
| Transverse | 11 (6.4) | 0 (0.0) | 11 (6.9) | |
| Extramural vascular invasion, n (%) | | | | 0.189 |
| + | 55 (31.8) | 2 (15.4) | 53 (33.1) | |
| − | 118 (68.2) | 11 (84.6) | 107 (66.9) | |
| Perineural invasion, n (%) | | | | |
| + | 75 (43.4) | 2 (15.4) | 73 (45.6) | |
| − | 98 (56.6) | 11 (84.6) | 87 (54.4) | |
| 40% MTV, mean (std) | 20.64 (16.85) | 25.75 (15.66) | 20.22 (16.88) | 0.258 |
| SUVmax, mean (std) | 14.65 (5.84) | 15.25 (6.21) | 14.60 (5.81) | 0.699 |
| SUVmean, mean (std) | 8.80 (3.40) | 8.99 (3.35) | 8.79 (3.40) | 0.839 |
| TLG, mean (std) | 188.74 (188.21) | 254.79 (201.13) | 183.37 (186.09) | 0.190 |
| Lymph nodes, mean (std) | 2.60 (3.49) | 0.38 (0.84) | 2.78 (3.56) | |
Bold values indicate P < 0.05, meaning that the corresponding feature is statistically correlated with MSI status.
Figure 3. The results of feature selection at different stages: (A) feature extraction, (B) multivariant feature selection, (C) relevancy-based feature selection, and (D) non-redundancy-based feature selection.
Evaluation results for different output feature sets on the independent validation dataset.
| Feature | Accuracy | AUC | Sensitivity | Specificity |
|---|---|---|---|---|
| CEA (0–5.2 ng/ml) | 0.610 | 0.692 | 0.5 | 0.618 |
| CT feature only | | | 0.5 | |
| PET feature only | 0.439 | 0.684 | 0.667 | 0.421 |
| | 0.768 | 0.828 | | 0.763 |
| | 0.744 | 0.803 | 0.667 | 0.750 |
The feature name in bold denotes the combined radiomic features that achieved high prediction accuracy for both target classes; bold numerical values denote the highest value in each column.
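As a reading aid for the metric columns above, the following minimal Python sketch shows how accuracy, sensitivity, and specificity follow from confusion-matrix counts, taking MSI-H as the positive class. The function name `classification_metrics` and the counts are hypothetical: the counts were chosen only because they happen to reproduce the CEA row, and they are not reported in the source.

```python
def classification_metrics(tp, fn, tn, fp):
    """Compute accuracy, sensitivity and specificity from confusion-matrix counts.

    tp/fn: positive-class (here MSI-H) cases predicted correctly/incorrectly.
    tn/fp: negative-class (here MSS) cases predicted correctly/incorrectly.
    """
    total = tp + fn + tn + fp
    return {
        "accuracy": (tp + tn) / total,       # share of all cases classified correctly
        "sensitivity": tp / (tp + fn),       # share of MSI-H cases detected
        "specificity": tn / (tn + fp),       # share of MSS cases detected
    }


# Hypothetical counts (an assumption, not taken from the source).
metrics = classification_metrics(tp=3, fn=3, tn=47, fp=29)
print({name: round(value, 3) for name, value in metrics.items()})
```

With these illustrative counts the three metrics round to 0.610, 0.5, and 0.618. Because the MSS class is much larger than the MSI-H class, accuracy mostly tracks specificity, which is why sensitivity is worth reading alongside it.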
The definitions for the features selected for the predictive model construction.
| Feature name | Feature definition and meaning |
|---|---|
Figure 4. Feature importance for predicting the MSI status of patients with CRC in different situations. (A) Feature importance for predicting all patients in the validation set. (B) Feature importance for predicting only the patients with MSS in the validation set. (C) Feature importance for predicting only the patients with MSI-H in the validation set.
Figure 5. Case studies of two patients: (A) a patient with MSI-H status and (B) a patient with MSI-L or MSS status. Both patients were predicted correctly by the machine learning model. The top right section of each panel shows the feature values of the corresponding case and the approximate feature weights interpreted by LIME. The bottom right section shows the 3D model constructed from the input CT and PET images from different viewpoints; the red section represents the patient's tumor.
Figure 6. Data distribution between the selected radiomic features and their statistically correlated clinical features. Point marks and cross marks indicate patients from the 2010–2018 and 2018–2020 periods, respectively. (A) shows the PET feature, and (B) shows the CT feature. For gender and location, we applied the point-biserial correlation method; for the other clinical features, we applied the Pearson correlation method.
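To illustrate the two correlation methods named in the Figure 6 caption, the sketch below computes a point-biserial coefficient from group means and shows that it matches the Pearson coefficient when the binary variable is coded as 0/1. The data values and the names `pearson`, `point_biserial`, `gender`, and `feature` are hypothetical, chosen for illustration only; this is not the authors' analysis code.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    norm_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    norm_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (norm_x * norm_y)


def point_biserial(binary, values):
    """Point-biserial correlation between a 0/1 variable and a continuous one."""
    n = len(values)
    group1 = [v for b, v in zip(binary, values) if b == 1]
    group0 = [v for b, v in zip(binary, values) if b == 0]
    mean1, mean0 = sum(group1) / len(group1), sum(group0) / len(group0)
    overall_mean = sum(values) / n
    # Population standard deviation of the continuous variable.
    std = sqrt(sum((v - overall_mean) ** 2 for v in values) / n)
    return (mean1 - mean0) / std * sqrt(len(group1) * len(group0) / n ** 2)


# Hypothetical data: a binary clinical variable and one radiomic feature value.
gender = [0, 0, 1, 1, 1, 0, 1, 0]
feature = [1.2, 0.8, 2.1, 1.9, 2.5, 1.1, 1.7, 0.9]

r_pb = point_biserial(gender, feature)
r_pearson = pearson(gender, feature)
print(r_pb, r_pearson)
```

The two printed values agree up to floating-point error, since the point-biserial coefficient is a special case of Pearson correlation for a dichotomous variable.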