Jae Seung Kang, Chanhee Lee, Wookyeong Song, Wonho Choo, Seungyeoun Lee, Sungyoung Lee, Youngmin Han, Claudio Bassi, Roberto Salvia, Giovanni Marchegiani, Cristopher L Wolfgang, Jin He, Alex B Blair, Michael D Kluger, Gloria H Su, Song Cheol Kim, Ki-Byung Song, Masakazu Yamamoto, Ryota Higuchi, Takashi Hatori, Ching-Yao Yang, Hiroki Yamaue, Seiko Hirono, Sohei Satoi, Tsutomu Fujii, Satoshi Hirano, Wenhui Lou, Yasushi Hashimoto, Yasuhiro Shimizu, Marco Del Chiaro, Roberto Valente, Matthias Lohr, Dong Wook Choi, Seong Ho Choi, Jin Seok Heo, Fuyuhiko Motoi, Ippei Matsumoto, Woo Jung Lee, Chang Moo Kang, Yi-Ming Shyr, Shin-E Wang, Ho-Seong Han, Yoo-Seok Yoon, Marc G Besselink, Nadine C M van Huijgevoort, Masayuki Sho, Hiroaki Nagano, Sang Geol Kim, Goro Honda, Yinmo Yang, Hee Chul Yu, Jae Do Yang, Jun Chul Chung, Yuichi Nagakawa, Hyung Il Seo, Yoo Jin Choi, Yoonhyeong Byun, Hongbeom Kim, Wooil Kwon, Taesung Park, Jin-Young Jang.
Abstract
Most models for predicting malignant pancreatic intraductal papillary mucinous neoplasms (IPMNs) have been developed using logistic regression (LR) analysis. Our study aimed to develop risk prediction models using machine learning (ML) and LR techniques and to compare their performance. This was a multinational, multi-institutional, retrospective study. The clinical variables considered for model development (MD) were age, sex, main duct diameter, cyst size, mural nodule, and tumour location. After the data were divided into an MD set and a test set (2:1), the best ML and LR models were developed by training on the MD set with tenfold cross-validation. The test areas under the receiver operating characteristic curve (AUCs) of the two models were calculated using the independent test set. A total of 3,708 patients were included. Across 200 repetitions, the stacked ensemble algorithm was the most frequently chosen ML model, and the variable combination containing all variables was the most frequently chosen LR model. The mean AUCs of the ML and LR models over the 200 repetitions were comparable (0.725 vs. 0.725). The performances of the ML and LR models were comparable; the LR model was more practical than its ML counterpart because of its convenience in clinical use and simple interpretability.
Year: 2020 PMID: 33208887 PMCID: PMC7676251 DOI: 10.1038/s41598-020-76974-7
Source DB: PubMed Journal: Sci Rep ISSN: 2045-2322 Impact factor: 4.379
Predictive factors for malignant intraductal papillary mucinous neoplasm in the univariate and multivariate logistic regression analyses.
| Variable | Total (N = 3,463) | Benign IPMN (N = 2,094) | Malignant IPMN (N = 1,369) | Univariate P value | Multivariate odds ratio | Multivariate 95% CI | Multivariate P value |
|---|---|---|---|---|---|---|---|
| Age (mean ± SD, year) | 65.4 ± 9.9 | 64.5 ± 9.8 | 66.7 ± 10.0 | < 0.001 | 1.02 | 1.01 – 1.03 | < 0.001 |
| Sex | | | | 0.195 | | | |
| Female | 1,266 (36.6%) | 784 (37.4%) | 482 (35.2%) | | Ref | Ref | |
| Male | 2,197 (63.4%) | 1,310 (62.6%) | 887 (64.8%) | | 1.22 | 1.05 – 1.42 | 0.010 |
| Tumour location | | | | < 0.001 | | | |
| Head | 2,059 (59.5%) | 1,175 (56.1%) | 884 (64.6%) | | Ref | Ref | |
| Body or tail | 1,180 (34.1%) | 818 (39.1%) | 362 (26.4%) | | 0.74 | 0.62 – 0.87 | < 0.001 |
| Diffuse | 224 (6.4%) | 101 (4.8%) | 123 (9.0%) | | 1.54 | 1.14 – 2.08 | 0.005 |
| Cyst size (mean ± SD, mm) | 30.3 ± 16.3 | 28.6 ± 14.5 | 33.6 ± 18.2 | < 0.001 | 1.02 | 1.01 – 1.02 | < 0.001 |
| MPD diameter (mean ± SD, mm) | 4.8 ± 2.5 | 4.2 ± 2.3 | 5.6 ± 2.5 | < 0.001 | 1.24 | 1.20 – 1.28 | < 0.001 |
| Mural nodule (No.) | 1,285 (37.1%) | 576 (27.5%) | 709 (51.8%) | < 0.001 | 2.38 | 2.05 – 2.78 | < 0.001 |
IPMN, intraductal papillary mucinous neoplasm; MPD, main pancreatic duct.
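As a reading aid for the multivariate odds ratios in the table, the sketch below shows how they combine on the odds scale. It assumes, as is conventional but not stated in the table, that the odds ratios for continuous variables are per one-unit increase (per year, per mm) and that the model contains no interaction terms. The dictionary keys and the `relative_odds` helper are hypothetical names introduced here for illustration. Because the model intercept is not reported, only relative odds between two patient profiles can be computed, not an absolute probability of malignancy.

```python
# Multivariate odds ratios copied from the table above.
# Assumption: continuous variables are per one-unit increase (year or mm).
ODDS_RATIOS = {
    "age_year": 1.02,
    "male": 1.22,             # reference: female
    "body_or_tail": 0.74,     # reference: head
    "diffuse": 1.54,          # reference: head
    "cyst_size_mm": 1.02,
    "mpd_diameter_mm": 1.24,
    "mural_nodule": 2.38,     # reference: no mural nodule
}


def relative_odds(differences):
    """Odds multiplier for patient A versus patient B.

    `differences` maps a variable name to (value for A - value for B).
    In a logistic model without interactions the log-odds are additive,
    so the odds ratios multiply.
    """
    result = 1.0
    for name, delta in differences.items():
        result *= ODDS_RATIOS[name] ** delta
    return result


# Hypothetical comparison: A has a main pancreatic duct 5 mm wider than B
# and a mural nodule that B lacks; all other variables are equal.
example = relative_odds({"mpd_diameter_mm": 5, "mural_nodule": 1})
print(round(example, 2))
```

Under these assumptions the comparison gives 1.24^5 × 2.38, or about 7 times the odds of malignancy for patient A relative to patient B.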
Figure 1. Number of times each machine learning algorithm ranked first in tenfold cross-validation across 200 repetitions.
Figure 2. Mean of the highest tenfold cross-validation area under the receiver operating characteristic curve (AUC) for each algorithm across 200 repetitions.
Figure 3. Overall performance of machine learning (ML) and logistic regression (LR). The performance of the optimal ML model (Auto ML) was comparable with that of the LR model (mean AUC, 0.725 vs. 0.725). AUC, area under the receiver operating characteristic curve.
Figure 4. Overall flowchart of the whole process. The LR and ML workflows were run separately on the same model development (MD) set. The whole process was repeated 200 times to reduce the selection bias introduced by the random split into test and MD sets. MD, model development; LR, logistic regression; Auto ML, automated machine learning; AUC, area under the receiver operating characteristic curve.
Figure 5. Calculation of the test areas under the receiver operating characteristic curve (AUCs) across 200 repetitions. After tenfold cross-validation and selection of the first-ranked automated machine learning (Auto ML) model structure, that model structure was fitted to the model development set at each seed to produce the best ML model. The AUC was then calculated on the test set. This process was repeated 200 times, and the mean AUCs were calculated and compared.
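The repeated evaluation scheme described in the captions (a random 2:1 split, tenfold cross-validation on the MD set to pick a model, a refit on the full MD set, and a test-set AUC, averaged over seeds) can be sketched in plain Python. This is an illustration under stated assumptions, not the study's code: the data are synthetic, a simple gradient-descent logistic regression stands in for the study's LR and Auto ML software, the candidate "models" are just two feature subsets, and only 5 repetitions are run instead of 200 to keep it fast. All function names are hypothetical.

```python
import math
import random
from statistics import mean


def auc(scores, labels):
    """Chance that a random positive outscores a random negative (ties count 1/2)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def predict(w, x):
    """Predicted probability from weights w (bias first) and feature list x."""
    z = w[0] + sum(wj * v for wj, v in zip(w[1:], x))
    return 1.0 / (1.0 + math.exp(-z))


def fit_logistic(X, y, epochs=50, lr=0.5):
    """Stand-in logistic regression fitted by batch gradient descent."""
    w = [0.0] * (len(X[0]) + 1)
    for _ in range(epochs):
        grad = [0.0] * len(w)
        for xi, yi in zip(X, y):
            err = predict(w, xi) - yi
            grad[0] += err
            for j, v in enumerate(xi):
                grad[j + 1] += err * v
        w = [wj - lr * g / len(X) for wj, g in zip(w, grad)]
    return w


def cv_auc(X, y, cols, k=10):
    """Mean AUC over k folds using only the feature columns in `cols`."""
    fold_aucs = []
    for f in range(k):
        held_out = set(range(f, len(X), k))
        train = [i for i in range(len(X)) if i not in held_out]
        w = fit_logistic([[X[i][c] for c in cols] for i in train],
                         [y[i] for i in train])
        test = sorted(held_out)
        scores = [predict(w, [X[i][c] for c in cols]) for i in test]
        fold_aucs.append(auc(scores, [y[i] for i in test]))
    return mean(fold_aucs)


def one_repeat(X, y, seed, candidates):
    """One seed: 2:1 split, CV-based model choice on the MD set, test AUC."""
    idx = list(range(len(X)))
    random.Random(seed).shuffle(idx)
    cut = (2 * len(idx)) // 3
    md, test = idx[:cut], idx[cut:]
    X_md, y_md = [X[i] for i in md], [y[i] for i in md]
    best = max(candidates, key=lambda cols: cv_auc(X_md, y_md, cols))
    w = fit_logistic([[row[c] for c in best] for row in X_md], y_md)
    scores = [predict(w, [X[i][c] for c in best]) for i in test]
    return auc(scores, [y[i] for i in test])


# Synthetic data (assumption): two standardised features with a true signal.
rng = random.Random(0)
X = [[rng.gauss(0, 1), rng.gauss(0, 1)] for _ in range(300)]
y = [1 if rng.random() < 1 / (1 + math.exp(-(1.2 * a + 0.6 * b))) else 0
     for a, b in X]

# The study used 200 repetitions; 5 are used here only to keep the demo short.
test_aucs = [one_repeat(X, y, seed, [(0,), (0, 1)]) for seed in range(5)]
mean_test_auc = mean(test_aucs)
print(round(mean_test_auc, 3))
```

Averaging the test AUC over many seeds, rather than reporting a single split, is what dampens the dependence of the result on one particular random partition.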