Wenjie Li, Zhe Chen, Han Chen, Xu Han, Guoxin Zhang, Xiaoying Zhou.
Abstract
Purpose: To establish and validate a model to determine the occurrence risk of colorectal adenomatous polyps.
Keywords: Colorectal adenomatous polyps; Colorectal cancer; Occurrence risk; Prediction model
Year: 2022 PMID: 36046645 PMCID: PMC9414019 DOI: 10.7150/jca.74772
Source DB: PubMed Journal: J Cancer ISSN: 1837-9664 Impact factor: 4.478
Figure 1. Flow diagram of the model's discovery and validation cohorts.
Demographic and clinical data of study participants
| Variable | Discovery: Polyps (n=1753) | Discovery: Non-Polyp (n=750) | Validation: Polyps (n=767) | Validation: Non-Polyp (n=306) |
|---|---|---|---|---|
| | 58.5±11.4 | 49.8±13.6* | 58.4±11.3 | 50.4±14.2* |
| | | | | |
| Male, n (%) | 1061 (60.5) | 393 (52.4)* | 480 (62.6) | 151 (49.3)* |
| Female, n (%) | 692 (39.5) | 357 (47.6)* | 287 (37.4) | 155 (50.7)* |
| | 23.9±3.1 | 23.5±3.5* | 23.9±3.0 | 23.3±3.3* |
| | | | | |
| Never | 1243 (70.9) | 605 (80.7)* | 240 (68.7) | 61 (80.1)* |
| Former | 276 (15.7) | 91 (12.1)* | 240 (14.2) | 61 (11.4)* |
| Current | 234 (13.3) | 54 (7.2)* | 240 (17.1) | 61 (8.5)* |
| | | | | |
| Never | 1355 (77.3) | 596 (79.5) | 588 (76.7) | 234 (76.5) |
| Former or Current | 398 (22.7) | 154 (20.5) | 179 (23.3) | 72 (23.5) |
| | | | | |
| No | 795 (45.4) | 493 (65.7)* | 358 (46.7) | 193 (63.1)* |
| Yes | 958 (54.6) | 257 (34.3)* | 409 (53.3) | 113 (36.9)* |
| | | | | |
| No | 1198 (68.3) | 545 (72.7)* | 517 (67.4) | 226 (73.9)* |
| Yes | 555 (31.7) | 205 (27.3)* | 250 (32.6) | 80 (26.1)* |
| | | | | |
| No | 999 (57.0) | 540 (72.0)* | 425 (55.4) | 215 (70.3)* |
| Yes | 754 (43.0) | 210 (28.0)* | 342 (44.6) | 91 (29.7)* |
| | | | | |
| No | 858 (48.9) | 557 (74.3)* | 390 (50.8) | 228 (74.5)* |
| Yes | 895 (51.1) | 193 (25.7)* | 377 (49.2) | 78 (25.5)* |
| | | | | |
| No | 468 (26.7) | 172 (22.9)* | 193 (25.2) | 67 (21.9) |
| Yes | 1285 (73.3) | 578 (77.1)* | 574 (74.8) | 239 (78.1) |
| | | | | |
| No | 894 (51.0) | 398 (53.1) | 387 (50.5) | 149 (48.7) |
| Yes | 859 (49.0) | 352 (46.9) | 380 (49.5) | 157 (51.3) |
| | | | | |
| No | 962 (54.9) | 495 (66.0)* | 408 (53.2) | 202 (66.0)* |
| Yes | 791 (45.1) | 255 (34.0)* | 359 (46.8) | 104 (34.0)* |
| | | | | |
| No | 1494 (85.2) | 659 (87.9) | 645 (84.1) | 267 (87.3) |
| Yes | 259 (14.8) | 91 (12.1) | 122 (15.9) | 39 (12.7) |
| | | | | |
| No | 1169 (66.7) | 612 (81.6)* | 513 (66.9) | 263 (85.9)* |
| Yes | 584 (33.3) | 138 (18.4)* | 254 (33.1) | 43 (14.1)* |
| | | | | |
| No | 1536 (87.6) | 677 (90.3) | 680 (88.7) | 284 (92.8) |
| Yes | 217 (12.4) | 73 (9.7) | 87 (11.3) | 22 (7.2) |
| | | | | |
| No | 1613 (92.0) | 712 (94.9)* | 705 (91.9) | 296 (96.7)* |
| Yes | 140 (8.0) | 38 (5.1)* | 62 (8.1) | 10 (3.3)* |
| | | | | |
| No | 1550 (88.4) | 666 (88.8) | 679 (88.5) | 274 (89.5) |
| Yes | 203 (11.6) | 84 (11.2) | 88 (11.5) | 32 (10.5) |
| | | | | |
| No | 1501 (85.6) | 702 (93.6)* | 668 (87.1) | 292 (95.4)* |
| Yes | 252 (14.4) | 48 (6.4)* | 99 (12.9) | 14 (4.6)* |
| | | | | |
| No | 1196 (68.2) | 595 (79.3)* | 482 (62.8) | 239 (78.1)* |
| Yes | 557 (31.8) | 155 (20.7)* | 285 (37.2) | 67 (21.9)* |
| | | | | |
| No | 1576 (89.9) | 697 (92.9)* | 698 (91.0) | 279 (91.2) |
| Yes | 177 (10.1) | 53 (7.1)* | 69 (9.0) | 27 (8.8) |
| | | | | |
| No | 1168 (66.6) | 602 (80.3)* | 503 (65.6) | 247 (80.7)* |
| Yes | 585 (33.4) | 148 (19.7)* | 264 (34.4) | 59 (19.3)* |
| | 5.7±1.6 | 5.7±1.7 | 5.8±1.9 | 5.7±1.8 |
| | 3.3±1.2 | 3.4±1.5* | 3.4±1.6 | 3.4±1.6 |
| | 1.8±0.6 | 1.7±0.6* | 1.8±0.6 | 1.7±0.6 |
| | 0.4±0.1 | 0.4±0.2 | 0.4±0.1 | 0.4±0.2 |
| | 0.1±0.1 | 0.1±0.1 | 0.1±0.1 | 0.1±0.1 |
| | 0.3±0.2 | 0.3±0.2 | 0.3±0.2 | 0.3±0.2 |
| | 197.4±55.8 | 203.4±57.6* | 199.4±59.9 | 204.8±66.8 |
| | 5.0±1.1 | 4.9±1.2* | 4.9±1.1 | 5.1±1.6* |
Continuous variables are expressed as mean ± SD and categorical variables are expressed as number (%).
Abbreviations: BMI: body mass index; HCRM: high consumption of red meat; HCP: high consumption of pungent food; HCG: high consumption of greasy food; HCS: high consumption of salt; HCDF: high consumption of dietary fiber; FHCT: family history of colorectal tumors; NAFLD: non-alcoholic fatty liver disease; H.pylori: Helicobacter pylori; WBC: white blood cell.
*Significant difference (two-tailed P<0.05) between patients with and without adenomatous polyps.
Figure 2. Predictor selection using LASSO regression analysis with 10-fold cross-validation. (A) Selection of the tuning parameter (λ) by deviance in the LASSO regression, based on the minimum criterion (left dotted line) and the 1-SE criterion (right dotted line). (B) Coefficient profile plot against the log(λ) sequence. In the present study, predictors were selected according to the 1-SE criterion (right dotted line), which retained 10 non-zero coefficients. LASSO: least absolute shrinkage and selection operator; SE: standard error.
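The 1-SE criterion named in the caption can be sketched in a few lines. This is a minimal illustration of the usual convention (take the largest λ whose mean cross-validated deviance lies within one standard error of the minimum), not the authors' code; the function name `select_lambda_1se` and all numeric values below are hypothetical and are not taken from the study.

```python
def select_lambda_1se(lambdas, cv_mean, cv_se):
    """Return (lambda_min, lambda_1se) from cross-validated deviance.

    lambdas: candidate penalty values
    cv_mean: mean CV deviance for each lambda
    cv_se:   standard error of the CV deviance for each lambda
    """
    # Index of the lambda with the smallest mean deviance (minimum criterion)
    i_min = min(range(len(lambdas)), key=lambda i: cv_mean[i])
    # Anything at or below this threshold is "within one SE" of the best fit
    threshold = cv_mean[i_min] + cv_se[i_min]
    # 1-SE criterion: strongest penalty that still meets the threshold
    candidates = [lam for lam, m in zip(lambdas, cv_mean) if m <= threshold]
    return lambdas[i_min], max(candidates)


# Hypothetical CV results, for illustration only
lambdas = [0.001, 0.005, 0.01, 0.02, 0.05, 0.1]
cv_mean = [1.060, 1.055, 1.058, 1.062, 1.090, 1.150]
cv_se = [0.010, 0.010, 0.010, 0.011, 0.012, 0.015]

lam_min, lam_1se = select_lambda_1se(lambdas, cv_mean, cv_se)
print(lam_min, lam_1se)
```

Choosing the 1-SE value trades a small loss in fit for a sparser model, which is consistent with the caption's report of only 10 retained predictors.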
Univariate and multivariate logistic regression analysis in the discovery cohort
| Variables | Univariate OR (95% CI) | Univariate P | Multivariate OR (95% CI) | Multivariate P |
|---|---|---|---|---|
| | | <0.001 | | <0.001 |
| 18-45 | Ref. | | Ref. | |
| 46-69 | 3.22 (2.61-3.97) | <0.001 | 3.97 (3.13-5.03) | <0.001 |
| >69 | 9.93 (6.70-14.72) | <0.001 | 14.20 (9.33-21.59) | <0.001 |
| Male, % | 1.53 (1.29-1.82) | <0.001 | 1.27 (1.02-1.58) | 0.031 |
| | | <0.001 | | 0.032 |
| Never | Ref. | | Ref. | |
| Former | 1.48 (1.14-1.91) | 0.003 | 1.18 (0.87-1.61) | 0.288 |
| Current | 2.11 (1.55-2.88) | <0.001 | 1.60 (1.12-2.30) | 0.010 |
| Hyperlipidemia, % | 1.60 (1.34-1.91) | <0.001 | 1.24 (1.02-1.52) | 0.035 |
| HCRM, % | 2.31 (1.94-2.76) | <0.001 | 1.83 (1.48-2.25) | <0.001 |
| HCS, % | 3.01 (2.49-3.64) | <0.001 | 2.55 (2.05-3.16) | <0.001 |
| HCDF, % | 0.82 (0.67-0.99) | 0.048 | 0.69 (0.55-0.86) | 0.001 |
| H.pylori, % | 2.04 (1.66-2.50) | <0.001 | 1.98 (1.58-2.48) | <0.001 |
| NAFLD, % | 2.22 (1.80-2.73) | <0.001 | 1.55 (1.23-1.96) | <0.001 |
| Chronic diarrhea, % | 2.46 (1.78-3.39) | 0.001 | 1.87 (1.32-2.66) | <0.001 |
Abbreviations: OR: odds ratio; CI: confidence interval; HCRM: high consumption of red meat; HCS: high consumption of salt; HCDF: high consumption of dietary fiber; H.pylori: Helicobacter pylori; NAFLD: non-alcoholic fatty liver disease.
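As a worked check on how a univariate odds ratio relates to raw counts, the sketch below computes an OR with a Wald 95% interval from a 2×2 table. It uses the discovery-cohort counts from the Never, Former, and Current rows of the demographic table; the helper name `odds_ratio_wald` is hypothetical, and the Wald interval is a standard textbook choice assumed here, not a method the source states.

```python
import math


def odds_ratio_wald(exp_cases, exp_controls, ref_cases, ref_controls, z=1.96):
    """Odds ratio and Wald confidence interval from 2x2 counts.

    exp_*: counts in the exposed category (polyps, non-polyp)
    ref_*: counts in the reference category (polyps, non-polyp)
    """
    odds_ratio = (exp_cases / ref_cases) / (exp_controls / ref_controls)
    # Standard error of log(OR) from the four cell counts
    se = math.sqrt(1 / exp_cases + 1 / exp_controls + 1 / ref_cases + 1 / ref_controls)
    lower = math.exp(math.log(odds_ratio) - z * se)
    upper = math.exp(math.log(odds_ratio) + z * se)
    return odds_ratio, lower, upper


# Discovery cohort: Current 234 vs 54, Former 276 vs 91, Never 1243 vs 605
current_vs_never = odds_ratio_wald(234, 54, 1243, 605)
former_vs_never = odds_ratio_wald(276, 91, 1243, 605)

print("Current vs Never: %.2f (%.2f-%.2f)" % current_vs_never)
print("Former vs Never:  %.2f (%.2f-%.2f)" % former_vs_never)
```

The point estimates come out at about 2.11 and 1.48, in line with the univariate column for the Current and Former rows. Interval bounds computed this way can differ in the last digit from the fitted model's output.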
Figure 3. Nomogram for predicting colorectal adenomatous polyp risk and its algorithm.
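A nomogram of this kind is typically built by converting each regression coefficient (the natural log of the odds ratio) into points, with the strongest effect scaled to 100. The sketch below applies that generic scaling to the multivariate odds ratios reported in the regression table. It is an illustration under that assumption only: the point values printed in the authors' figure are not reproduced in the text and may differ, and the model intercept is not given, so no absolute risk is computed.

```python
import math

# Multivariate odds ratios as reported in the regression table
multivariate_or = {
    "Age 46-69": 3.97,
    "Age >69": 14.20,
    "Male": 1.27,
    "Former": 1.18,
    "Current": 1.60,
    "Hyperlipidemia": 1.24,
    "HCRM": 1.83,
    "HCS": 2.55,
    "HCDF": 0.69,  # protective: points would attach to the absence of HCDF
    "H.pylori": 1.98,
    "NAFLD": 1.55,
    "Chronic diarrhea": 1.87,
}

# Logistic regression coefficient for each predictor level
betas = {name: math.log(value) for name, value in multivariate_or.items()}

# Assumed scaling: the largest absolute coefficient maps to 100 points
max_abs_beta = max(abs(b) for b in betas.values())
points = {name: round(100 * abs(b) / max_abs_beta, 1) for name, b in betas.items()}

for name, value in sorted(points.items(), key=lambda item: -item[1]):
    print(f"{name:18s} {value:5.1f}")
```

Under this scaling, age above 69 defines the 100-point end of the scale and every other predictor contributes proportionally fewer points.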
Figure 4. The AUCs of the discovery and validation cohorts were 0.775 and 0.776, respectively. The blue line represents the ROC curve of the discovery cohort and the red line represents the ROC curve of the validation cohort. ROC: receiver operating characteristic; AUC: area under the curve.
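The AUC can be read as the probability that a randomly chosen patient with polyps receives a higher predicted risk than a randomly chosen patient without. A minimal sketch of that pairwise definition follows; the function name `auc_pairwise` and the toy scores are hypothetical and unrelated to the study data.

```python
def auc_pairwise(scores_pos, scores_neg):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties count as half a correct ranking.
    """
    wins = 0.0
    for p in scores_pos:
        for n in scores_neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(scores_pos) * len(scores_neg))


# Toy predicted risks, for illustration only
risk_with_polyps = [0.9, 0.8, 0.6, 0.55]
risk_without_polyps = [0.7, 0.5, 0.4, 0.3]

auc = auc_pairwise(risk_with_polyps, risk_without_polyps)
print(auc)
```

On this reading, the reported values of about 0.78 mean that roughly three out of four such pairs are ordered correctly by the model.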
Figure 5. Calibration curve of the predictive model showing consistency between the predicted and observed probabilities (Hosmer-Lemeshow test, P=0.370, indicating good fit). The gray solid line represents perfect prediction by an ideal model, and the black solid line shows the performance of the model.
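The Hosmer-Lemeshow test compares observed and expected event counts across groups of patients sorted by predicted risk. The sketch below is a minimal stdlib version under stated assumptions: equal-sized groups, g - 2 degrees of freedom, and a closed-form chi-square tail that holds only when the degrees of freedom are even. The number of groups the authors used is not stated in the text, and the toy data are hypothetical.

```python
import math


def hosmer_lemeshow(probs, outcomes, groups=10):
    """Hosmer-Lemeshow statistic and p-value (even degrees of freedom only)."""
    pairs = sorted(zip(probs, outcomes), key=lambda t: t[0])
    n = len(pairs)
    stat = 0.0
    for k in range(groups):
        chunk = pairs[k * n // groups:(k + 1) * n // groups]
        size = len(chunk)
        expected = sum(p for p, _ in chunk)   # expected events in the group
        observed = sum(y for _, y in chunk)   # observed events in the group
        stat += (observed - expected) ** 2 / (expected * (1 - expected / size))
    df = groups - 2
    if df <= 0 or df % 2 != 0:
        raise ValueError("this sketch supports even, positive df only")
    # Chi-square survival function for even df: exp(-x/2) * sum((x/2)^i / i!)
    half = stat / 2
    p_value = math.exp(-half) * sum(half ** i / math.factorial(i) for i in range(df // 2))
    return stat, p_value


# Toy data: four risk strata of five patients each (hypothetical)
probs = [0.2] * 5 + [0.4] * 5 + [0.6] * 5 + [0.8] * 5
well_calibrated = [1, 0, 0, 0, 0, 1, 1, 0, 0, 0, 1, 1, 1, 0, 0, 1, 1, 1, 1, 0]
poorly_calibrated = [1, 1, 1, 0, 0, 1, 1, 0, 0, 0, 1, 1, 1, 0, 0, 1, 1, 0, 0, 0]

stat_good, p_good = hosmer_lemeshow(probs, well_calibrated, groups=4)
stat_poor, p_poor = hosmer_lemeshow(probs, poorly_calibrated, groups=4)
print(stat_good, p_good)
print(stat_poor, p_poor)
```

A large p-value, such as the 0.370 in the caption, means the test finds no evidence that observed and predicted event rates diverge.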
Figure 6. DCA of the nomogram. The red solid line represents the predictive model. The blue solid line represents the screen-all scheme. The black solid line represents the screen-none scheme. DCA: decision curve analysis.
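Each curve in a decision curve analysis plots net benefit against the threshold probability. The sketch below uses the standard net-benefit formula, in which true positives are credited and false positives are penalized by the odds of the threshold. The function names and toy data are hypothetical. Screen-none has a net benefit of zero at every threshold.

```python
def net_benefit(probs, outcomes, threshold):
    """Net benefit of screening everyone whose predicted risk >= threshold."""
    n = len(outcomes)
    tp = sum(1 for p, y in zip(probs, outcomes) if p >= threshold and y == 1)
    fp = sum(1 for p, y in zip(probs, outcomes) if p >= threshold and y == 0)
    # False positives are weighted by the odds of the threshold probability
    weight = threshold / (1 - threshold)
    return tp / n - fp / n * weight


def net_benefit_screen_all(outcomes, threshold):
    """Net benefit of screening every patient regardless of predicted risk."""
    prevalence = sum(outcomes) / len(outcomes)
    return prevalence - (1 - prevalence) * threshold / (1 - threshold)


# Toy predicted risks and outcomes, for illustration only
probs = [0.9, 0.8, 0.6, 0.55, 0.7, 0.5, 0.4, 0.3]
outcomes = [1, 1, 1, 1, 0, 0, 0, 0]

nb_model = net_benefit(probs, outcomes, threshold=0.5)
nb_all = net_benefit_screen_all(outcomes, threshold=0.5)
print(nb_model, nb_all)
```

A model is useful at a given threshold when its net benefit exceeds both the screen-all value and zero.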