Xiaoyong Shen, Fan Yang, Pengfei Yang, Modan Yang, Lei Xu, Jianyong Zhuo, Jianguo Wang, Di Lu, Zhikun Liu, Shu-Sen Zheng, Tianye Niu, Xiao Xu.
Abstract
Background: Serous cystadenoma (SCA), mucinous cystadenoma (MCN), and intraductal papillary mucinous neoplasm (IPMN) are three subtypes of pancreatic cystic neoplasm (PCN). Because MCN and IPMN carry a potential for malignant transformation, patients with these subtypes require radical surgery, whereas patients with SCA need only periodic surveillance. However, accurate preoperative differentiation among SCA, MCN, and IPMN remains challenging in clinical practice.
Keywords: contrast-enhanced computed tomography; differentiation diagnosis; machine learning; pancreatic cystic neoplasm; radiomics
Year: 2020 PMID: 32185129 PMCID: PMC7058789 DOI: 10.3389/fonc.2020.00248
Source DB: PubMed Journal: Front Oncol ISSN: 2234-943X Impact factor: 6.244
Figure 1. The region of interest was first delineated and segmented; features belonging to different categories (histogram, GLCM, GLRLM, GLSZM, NGTDM, and wavelet) were then extracted and analyzed. A feature selection algorithm identified the most important features for model construction, and the performance of the constructed model was then evaluated in the validation dataset.
Figure 2. Typical CT images (arterial phase) of SCA, MCN, and IPMN are shown in (A,C,E). The regions of interest on the CT images after delineation are shown in (B,D,F).
Patient clinical factors in the training and validation cohorts.
| Characteristic | Training cohort (n = 115) | Validation cohort (n = 49) | P-value |
|---|---|---|---|
| Tumor type | | | 0.9914 |
| SCA | 53 | 23 | |
| MCN | 28 | 12 | |
| IPMN | 34 | 14 | |
| Age | 57 [20–79] | 57 [26–79] | 0.3844 |
| Maximum Diameter | 3.5 [0.6–14.8] | 3.3 [0.5–11] | 0.4936 |
| Serum platelet | 199 [46–443] | 202 [87–397] | 0.8871 |
| Serum ALB | 44.4 [22.9–54.9] | 45 [32.9–52.8] | 0.7261 |
| Serum ALT | 16 [5–452] | 14 [6–134] | 0.2255 |
| Serum AST | 19 [10–280] | 19 [11–68] | 0.4175 |
| Serum FBG | 5.12 [2.65–15.03] | 4.85 [3.98–7.07] | 0.1699 |
| Serum AFP | 2.3 [0.2–2374.9] | 2.3 [0.7–5.2] | 0.2905 |
| Serum CEA | 1.9 [0.6–682.8] | 1.8 [0.6–19.1] | 0.3389 |
| Serum CA 19–9 | 9.6 [1–8170.2] | 9.8 [1–128.8] | 0.9799 |
| Serum SF | 132.4 [4.7–23290.9] | 124 [3.8–1547.2] | 0.7807 |
| Sex | | | 0.5605 |
| Male | 35 | 12 | |
| Female | 80 | 37 | |
| Location | | | 0.1927 |
| Head and neck | 45 | 21 | |
| Body and tail | 62 | 28 | |
| Other | 8 | 0 | |
| Number of tumors | | | 0.3821 |
| Single | 104 | 47 | |
| Multiple | 11 | 2 | |
| Calcification | | | 1 |
| Without | 109 | 46 | |
| With | 6 | 3 | |
| Chronic Pancreatitis History | | | 1 |
| Without | 114 | 49 | |
| With | 1 | 0 | |
| Abdominal symptom | | | 0.4125 |
| Without | 66 | 24 | |
| With | 49 | 25 | |
| Pancreatic neoplasm family history | | | 1 |
| Without | 115 | 49 | |
| History of smoking | | | 0.6427 |
| Without | 101 | 41 | |
| With | 14 | 8 | |
| History of alcoholic consumption | | | 0.4710 |
| Without | 94 | 43 | |
| With | 21 | 6 | |
| Blood type | | | 0.6167 |
| A | 34 | 14 | |
| B | 22 | 6 | |
| AB | 9 | 6 | |
| O | 50 | 23 | |
| Obesity | | | 0.5724 |
| Without | 93 | 37 | |
| With | 22 | 12 | |
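The record does not say which statistical test produced the P-values in this table. As a rough plausibility check, the sketch below runs a Pearson chi-square test of independence on the tumor-type counts (SCA, MCN, IPMN in the training and validation cohorts) using only the Python standard library. The function name `chi_square_independence` is illustrative, and the closed-form tail probability `exp(-x / 2)` is valid only for 2 degrees of freedom, which is what a 3 x 2 table gives. Under these assumptions the result agrees with the reported 0.9914, which suggests, but does not prove, that a chi-square test was used for this row.

```python
import math


def chi_square_independence(table):
    """Return the Pearson chi-square statistic and degrees of freedom
    for an r x c table of counts (list of rows)."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count under independence of row and column.
            expected = row_totals[i] * col_totals[j] / n
            stat += (observed - expected) ** 2 / expected
    dof = (len(table) - 1) * (len(col_totals) - 1)
    return stat, dof


# Tumor-type counts from the table: rows SCA, MCN, IPMN;
# columns training cohort, validation cohort.
tumor_counts = [[53, 23], [28, 12], [34, 14]]
stat, dof = chi_square_independence(tumor_counts)

# Assumption: with exactly 2 degrees of freedom, the chi-square
# upper-tail probability has the closed form exp(-x / 2).
assert dof == 2
p_value = math.exp(-stat / 2)
print(f"chi2 = {stat:.4f}, dof = {dof}, p = {p_value:.4f}")
```

For tables with other degrees of freedom, a statistics library would be needed to evaluate the tail probability.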
Figure 3. Feature importance in the Boruta feature selection process. Green boxes show the features confirmed as important, yellow boxes show the tentative attributes, and red boxes show the unimportant features. The five important radiomics features are Histogram_Entropy, Histogram_Skewness, LLL_GLSZM_GLV, Histogram_Uniformity, and HHL_Histogram_Kurtosis. The four important clinical parameters are serum CA 19-9, sex, age, and serum CEA.
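Four of the five selected radiomics features are first-order histogram statistics. The sketch below shows one common way to compute entropy, uniformity, skewness, and kurtosis from the intensities inside a region of interest. It is an illustration under stated assumptions, not the study's extraction code: the function name `histogram_features`, the fixed-bin-count discretization, the default of 16 bins, and the sample intensities are all hypothetical, and the wavelet-filtered variants (such as the HHL-prefixed feature) and the GLSZM feature are not covered.

```python
import math


def histogram_features(intensities, n_bins=16):
    """First-order histogram features of a list of ROI intensities.

    Entropy and uniformity use a fixed-bin-count histogram (an assumption);
    skewness and kurtosis use the raw intensities.
    """
    lo, hi = min(intensities), max(intensities)
    n = len(intensities)

    # Discretize intensities into n_bins equal-width bins over [lo, hi].
    counts = [0] * n_bins
    for v in intensities:
        if hi == lo:
            idx = 0
        else:
            idx = min(int((v - lo) / (hi - lo) * n_bins), n_bins - 1)
        counts[idx] += 1
    probs = [c / n for c in counts if c > 0]

    entropy = -sum(p * math.log2(p) for p in probs)
    uniformity = sum(p * p for p in probs)

    # Central moments of the raw intensities.
    mean = sum(intensities) / n
    m2 = sum((v - mean) ** 2 for v in intensities) / n
    m3 = sum((v - mean) ** 3 for v in intensities) / n
    m4 = sum((v - mean) ** 4 for v in intensities) / n
    skewness = m3 / m2 ** 1.5 if m2 > 0 else 0.0
    kurtosis = m4 / m2 ** 2 if m2 > 0 else 0.0

    return {
        "entropy": entropy,
        "uniformity": uniformity,
        "skewness": skewness,
        "kurtosis": kurtosis,
    }


# Made-up intensities for illustration only; not data from the study.
roi = [1, 1, 2, 2, 3, 3, 4, 4]
feats = histogram_features(roi, n_bins=4)
print(feats)
```

A perfectly flat histogram like this toy example maximizes entropy and minimizes uniformity for the chosen bin count; real lesions would fall somewhere in between.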
Diagnostic performance of the constructed SVM model in the training and validation datasets.
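| P / T | IPMN (training) | MCN (training) | SCA (training) | Pre (training) | Rec (training) | F1 (training) | IPMN (validation) | MCN (validation) | SCA (validation) | Pre (validation) | Rec (validation) | F1 (validation) |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| IPMN | 27 | 3 | 5 | 0.7714 | 0.7941 | 0.7826 | 11 | 0 | 4 | 0.7333 | 0.7857 | 0.7586 |
| MCN | 2 | 16 | 7 | 0.6400 | 0.5714 | 0.6038 | 0 | 7 | 2 | 0.7778 | 0.5833 | 0.6667 |
| SCA | 5 | 9 | 41 | 0.7455 | 0.7736 | 0.7593 | 3 | 5 | 17 | 0.6800 | 0.7391 | 0.7083 |
| Total | 34 | 28 | 53 | OA | 0.7304 | | 14 | 12 | 23 | OA | 0.7143 | |
T, True type (columns); P, Predicted type (rows); Pre, Precision; Rec, Recall; F1, F1 score; OA, Overall accuracy.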
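Reading each row of the table as a predicted type and each count column as a true type, the reported precision, recall, and overall accuracy can be reproduced directly from the counts, and the third metric value in each row matches the F1 score (the harmonic mean of precision and recall). The sketch below does this for the SVM training counts; the function name `per_class_metrics` is illustrative, and the row and column interpretation is an inference from the totals row rather than something the record states.

```python
def per_class_metrics(confusion, labels):
    """Per-class precision, recall, and F1, plus overall accuracy.

    confusion[i][j] is the number of cases predicted as labels[i]
    whose true type is labels[j]. Assumes every class is predicted
    at least once and occurs at least once.
    """
    total = sum(sum(row) for row in confusion)
    correct = sum(confusion[i][i] for i in range(len(labels)))
    metrics = {}
    for i, label in enumerate(labels):
        true_positive = confusion[i][i]
        predicted = sum(confusion[i])             # row sum
        actual = sum(row[i] for row in confusion)  # column sum
        precision = true_positive / predicted
        recall = true_positive / actual
        f1 = 2 * precision * recall / (precision + recall)
        metrics[label] = (precision, recall, f1)
    return metrics, correct / total


labels = ["IPMN", "MCN", "SCA"]
# SVM training counts from the table (rows: predicted, columns: true).
svm_training = [[27, 3, 5], [2, 16, 7], [5, 9, 41]]
metrics, overall_accuracy = per_class_metrics(svm_training, labels)

for label in labels:
    pre, rec, f1 = metrics[label]
    print(f"{label}: Pre={pre:.4f} Rec={rec:.4f} F1={f1:.4f}")
print(f"OA={overall_accuracy:.4f}")
```

The same function applies unchanged to the validation counts and to the RF and ANN tables.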
Figure 4. Error plot for different numbers of trees in the construction of the RF model. The red line shows the error of the “SCA” class, the green line shows the error of the “MCN” class, the blue line shows the error of the “IPMN” class, and the black line shows the out-of-bag (OOB) error. The errors stabilize once the number of trees exceeds 3,000, so 3,000 was chosen as the optimal number of trees.
Diagnostic performance of the constructed RF model in the training and validation datasets.
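| P / T | IPMN (training) | MCN (training) | SCA (training) | Pre (training) | Rec (training) | F1 (training) | IPMN (validation) | MCN (validation) | SCA (validation) | Pre (validation) | Rec (validation) | F1 (validation) |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| IPMN | 30 | 4 | 1 | 0.8571 | 0.8824 | 0.8696 | 9 | 0 | 1 | 0.9000 | 0.6429 | 0.7500 |
| MCN | 1 | 18 | 3 | 0.8182 | 0.6429 | 0.7200 | 0 | 9 | 1 | 0.9000 | 0.7500 | 0.8182 |
| SCA | 3 | 6 | 49 | 0.8448 | 0.9245 | 0.8829 | 5 | 3 | 21 | 0.7241 | 0.9130 | 0.8077 |
| Total | 34 | 28 | 53 | OA | 0.8435 | | 14 | 12 | 23 | OA | 0.7959 | |
T, True type (columns); P, Predicted type (rows); Pre, Precision; Rec, Recall; F1, F1 score; OA, Overall accuracy.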
Figure 5. (A) The ANN parameter optimization process: accuracy was highest with 14 hidden units and a weight decay of 1, so the ANN model was constructed with 14 hidden units and a weight decay value of 1. (B) The final ANN model constructed in this study.
Diagnostic performance of the constructed ANN model in the training and validation datasets.
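| P / T | IPMN (training) | MCN (training) | SCA (training) | Pre (training) | Rec (training) | F1 (training) | IPMN (validation) | MCN (validation) | SCA (validation) | Pre (validation) | Rec (validation) | F1 (validation) |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| IPMN | 31 | 4 | 7 | 0.7381 | 0.9118 | 0.8158 | 13 | 1 | 5 | 0.6842 | 0.9286 | 0.7879 |
| MCN | 1 | 15 | 3 | 0.7895 | 0.5357 | 0.6383 | 0 | 8 | 4 | 0.6667 | 0.6667 | 0.6667 |
| SCA | 2 | 9 | 43 | 0.7963 | 0.8113 | 0.8037 | 1 | 3 | 14 | 0.7778 | 0.6087 | 0.6829 |
| Total | 34 | 28 | 53 | OA | 0.7739 | | 14 | 12 | 23 | OA | 0.7143 | |
T, True type (columns); P, Predicted type (rows); Pre, Precision; Rec, Recall; F1, F1 score; OA, Overall accuracy.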