Su Yon Jung.
Abstract
INTRODUCTION: Insulin resistance (IR)/glucose intolerance is a critical biologic mechanism for the development of colorectal cancer (CRC) in postmenopausal women. Whereas IR and excessive adiposity are more prevalent in African American (AA) women than in White women, AA women are underrepresented in genome-wide studies for systemic regulation of IR and the association with CRC risk.
Year: 2021 PMID: 34608882 PMCID: PMC8500576 DOI: 10.14309/ctg.0000000000000412
Source DB: PubMed Journal: Clin Transl Gastroenterol ISSN: 2155-384X Impact factor: 4.488
Figure 1. Scatter plot for the effects of 38 individual FG- and FI-genetic instrumental variables on colorectal cancer risk. Each black dot reflects a genome-wide FG/FI–raising genetic variant. The green and pink lines indicate IVW and WM estimates, respectively. The blue line indicates PWM estimates and 95% CIs (PWM HR = 9.24, 95% CI: 0.03–2.95E+03; MR-Egger intercept P value = 0.701). CI, confidence interval; FG, fasting glucose; FI, fasting insulin; HR, hazard ratio; IVW, inverse variance–weighted; MR, Mendelian randomization; PC, principal components; PWM, penalized weighted median; WM, weighted median. Note: All MR estimates were based on the phenotype association and cancer association with genetic instruments, each of which was adjusted for age and 10 genetic PCs.
Figure 2. The second stage of RSF analysis using 10 single-nucleotide polymorphisms and 9 behavioral factors selected from the first stage of RSF analysis. (a) Comparing rankings between minimal depth and VIMP. PFA, polyunsaturated fatty acid; RSF, random survival forest; SFA, saturated fatty acid; VIMP, variable of importance. Note: The 4 behavioral variables within the gold ellipse and the 4 genetic markers within the blue ellipse were identified as the topmost influential predictors. (b) Out-of-bag concordance index (c-index). Improvement in c-index was observed when the top 8 variables [●] were added to the model, whereas other variables [○] did not further improve the accuracy of prediction.
RSF second-stage analysis: predictive values of variables for colorectal cancer risk
| Variable | Minimal depth | VIMP | C-index | Incremental error | Drop error |
| Years as a regular smoker | 2.9426 | 0.0202 | 0.5026 | 0.4974 | 0.0026 |
| Age at enrollment | 3.4192 | 0.0072 | 0.5635 | 0.4366 | 0.0609 |
| Age at menopause | 3.5056 | 0.0065 | 0.6325 | 0.3675 | 0.0691 |
| Percentage calories from PFA/day | 3.5902 | 0.0047 | 0.5993 | 0.4007 | −0.0332 |
| Duration of oral contraceptive use | 3.6041 | 0.0014 | 0.6217 | 0.3783 | 0.0224 |
| Dietary total sugars/day (g) | 3.8086 | 0.0011 | 0.6434 | 0.3567 | 0.0217 |
| | 4.0185 | 0.0034 | 0.6513 | 0.3487 | 0.0080 |
| Percentage calories from protein/day | 4.0939 | 0.0025 | 0.6595 | 0.3405 | 0.0082 |
| | 4.1934 | 0.0038 | 0.6622 | 0.3378 | 0.0027 |
| | 4.2558 | 0.0040 | 0.6446 | 0.3554 | −0.0176 |
| | 4.2797 | 0.0072 | 0.6671 | 0.3329 | 0.0225 |
| Waist-to-hip ratio | 4.2984 | 0.0019 | 0.6626 | 0.3374 | −0.0045 |
| | 4.4524 | 0.0046 | 0.6666 | 0.3334 | 0.0040 |
| Percentage calories from SFA/day | 4.4530 | 0.0002 | 0.6643 | 0.3357 | −0.0023 |
| | 4.5190 | 0.0009 | 0.6666 | 0.3334 | 0.0023 |
| | 4.6099 | 0.0045 | 0.6747 | 0.3253 | 0.0082 |
| | 4.7623 | −0.0003 | 0.6576 | 0.3424 | −0.0172 |
| | 4.9678 | 0.0028 | 0.6578 | 0.3422 | 0.0002 |
| | 5.5591 | −0.0011 | 0.6603 | 0.3397 | 0.0025 |
C-index, concordance index; PFA, polyunsaturated fatty acid; RSF, random survival forest; SFA, saturated fatty acid; VIMP, variable of importance.
Variables ordered by minimal depth.
Minimal depth is the predictive value of a variable estimated from the nested RSF models; a lower value indicates a likely greater impact on prediction.
The incremental error rate was calculated in the nested sequence of models starting with the top variable, followed by the model with the top 2 variables, then the model with the top 3 variables, and so on. For example, the third error rate was computed from the third nested model, including the first, second, and third variables.
The drop error rate of a variable was calculated as the difference between the error rate of the previous nested model and the error rate of the nested model that adds that variable. For example, the drop error rate of the second variable was estimated by the difference between the error rates from the first and second nested models. The error rate for the null model is set at 0.5; thus, the drop error rate for the first variable was obtained by subtracting the error rate (0.4974) from 0.5.
Variables were selected as the most predictive behavioral markers on the basis of multimodal predictive values.
Variables were selected as the most predictive genetic markers on the basis of multimodal predictive values.
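The incremental and drop error rates defined in the notes above can be illustrated with a short sketch. It assumes that the error rate of each nested model equals 1 minus its c-index, which matches the tabulated values but is not stated explicitly in the notes. The function names are hypothetical, and the input is the c-index column for the first four variables of the table.

```python
# Minimal sketch of the error-rate bookkeeping described in the table notes.
# Assumption: error rate of a nested model = 1 - c-index (consistent with the
# tabulated values, but not spelled out in the notes).
NULL_ERROR = 0.5  # error rate of the null model, as stated in the notes


def incremental_errors(c_indices):
    """Error rate of each nested model (top 1 variable, top 2, top 3, ...)."""
    return [1.0 - c for c in c_indices]


def drop_errors(errors, null_error=NULL_ERROR):
    """Error of the previous nested model minus error of the current one."""
    drops = []
    previous = null_error
    for error in errors:
        drops.append(previous - error)
        previous = error
    return drops


# C-index values for the first four variables, copied from the table.
c_index = [0.5026, 0.5635, 0.6325, 0.5993]
errors = incremental_errors(c_index)
drops = drop_errors(errors)

for c, e, d in zip(c_index, errors, drops):
    print(f"c-index={c:.4f}  error={e:.4f}  drop={d:+.4f}")
```

The computed values agree with the table to within rounding in the fourth decimal place. A negative drop, as for the fourth variable, means that adding the variable increased the error rate of the nested model.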
Figure 3. Cumulative incidence rates of colorectal cancer for the 8 topmost predictive variables: 4 single-nucleotide polymorphisms and 4 behavioral factors selected on the basis of a random survival forest analysis. AA, African American; PFA, polyunsaturated fatty acid. Note: Dashed red lines indicate 95% confidence intervals.
Results of combined and joint tests for smoking with risk genotypes predicting colorectal cancer risk
| SNP | Total | | Never smokers | | | Regular smoker for <20 yr | | |
| No. of risk | HR (95% CI) | P | n | HR (95% CI) | P | n | HR (95% CI) | P |
| 0 | Reference | | 810 | Reference | | 733 | 2.87 (0.731–11.287) | 0.1307 |
| 1 | | | 233 | 1.14 (0.119–10.992) | 0.9086 | 221 | | |
| 2 | | | 90 | 3.32 (0.342–32.159) | 0.3010 | 76 | | |
|
| 0.0500 |
| 0.1000 | |||||
Numbers in bold face are statistically significant.
CI, confidence interval; FDR, false discovery rate; HR, hazard ratio; SNP, single-nucleotide polymorphism.
The number of risk genotypes (GCK rs730497 AG+GG; MTNR1B rs10466351 TT; PCSK1 rs144489757 GC+CC; and PCSK1 rs9285019 TC+CC) was defined as follows: 0 (none or 1 risk allele) vs 1 (2 risk alleles) vs 2 (3 or more risk alleles).
Multivariate regression for risk genotypes was adjusted for waist-to-hip ratio, duration of oral contraceptive use, dietary total sugars/day (g), percentage calories from protein/day, and percentage calories from saturated fatty acid/day.
P value with FDR <0.05 was presented after multiple comparison corrections via the Benjamini-Hochberg method.
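The Benjamini-Hochberg correction named in the note above can be sketched as follows. This is a generic illustration of the procedure, not the authors' code. The function name is hypothetical, and the input P values are made-up illustrative numbers rather than results from the study.

```python
# Minimal sketch of Benjamini-Hochberg FDR adjustment (generic illustration).
def benjamini_hochberg(p_values):
    """Return BH-adjusted P values in the same order as the input."""
    m = len(p_values)
    # Indices of the P values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P value down so that the adjusted values
    # stay monotone (each is the minimum of itself and all larger ranks).
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Illustrative P values only (hypothetical, not taken from the tables).
raw_p = [0.01, 0.04, 0.03, 0.005]
fdr_p = benjamini_hochberg(raw_p)
significant = [q < 0.05 for q in fdr_p]
```

Each adjusted value is the raw P value scaled by the number of tests divided by its rank, capped so that adjusted values never decrease as raw P values increase. A test is then called significant when its adjusted value falls below the chosen FDR threshold (0.05 in the note above).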
Results of combined and joint tests for smoking with risk behaviors predicting colorectal cancer risk
| Behavior | Total | | Never smokers | | | Regular smoker for <20 yr | | |
| No. of risk | HR (95% CI) | P | n | HR (95% CI) | P | n | HR (95% CI) | P |
| 0 | Reference | | 563 | Reference | | 531 | 2.84 (0.721–11.149) | 0.1356 |
| 1 | | | 427 | 0.48 (0.050–4.620) | 0.5239 | 365 | 3.14 (0.726–13.541) | 0.1256 |
| 2 | | | 143 | 1.76 (0.178–17.450) | 0.6281 | 134 | 1.90 (0.192–18.888) | 0.5828 |
|
| 0.0010 |
| 0.2000 | |||||
| Never smokers | | | Regular smoker for ≥20 yr | | |
| n | HR (95% CI) | P | n | HR (95% CI) | P |
| 563 | Reference | | 1,072 | 2.50 (0.701–8.913) | 0.1580 |
| 427 | 0.46 (0.048–4.410) | 0.4995 | 1,059 | | |
| 143 | 1.30 (0.134–12.489) | 0.8225 | 398 | | |
| 2.00E-04 | ||||||||
Numbers in bold face are statistically significant.
CI, confidence interval; FDR, false discovery rate; HR, hazard ratio.
The number of behavioral factors (years as a regular smoker <20 yr vs ≥20 yr [in overall analysis only]; age at enrollment ≤56 vs >56 yr; age at menopause ≤49 vs >49 yr; and percent calories from polyunsaturated fatty acid/day <7.5% vs ≥7.5%) was defined as follows: 0 (none, 1, or 2 risk behaviors) vs 1 (3 risk behaviors) vs 2 (4 risk behaviors).
Multivariate regression for risk genotypes was adjusted for waist-to-hip ratio, duration of oral contraceptive use, dietary total sugars/day (g), percentage calories from protein/day, and percentage calories from saturated fatty acid/day.
P value with FDR <0.05 was presented after multiple comparison corrections via the Benjamini-Hochberg method.
Results of combined and joint tests for smoking with risk genotypes and behaviors predicting colorectal cancer risk
| SNPs combined with behaviors | Total | | Never smokers | | | Regular smoker for <20 yr | | |
| No. of risk | HR (95% CI) | P | n | HR (95% CI) | P | n | HR (95% CI) | P |
| 0 | Reference | | 1,043 | Reference | | 954 | | |
| 1 | | | 90 | 3.21 (0.355–28.935) | 0.2992 | 76 | | |
| 2 |
|
| 0.0900 | |||||
|
| 0.0800 | |||||||
Numbers in bold face are statistically significant.
CI, confidence interval; FDR, false discovery rate; HR, hazard ratio.
The risk genotypes are GCK rs730497 AG+GG; MTNR1B rs10466351 TT; PCSK1 rs144489757 GC+CC; and PCSK1 rs9285019 TC+CC. The behavioral factors are years as a regular smoker <20 vs ≥20 yr [in overall analysis only]; age at enrollment ≤56 vs >56 yr; age at menopause ≤49 vs >49 yr; and percentage calories from polyunsaturated fatty acid/day <7.5% vs ≥7.5%.
The combined number of risk genotypes and risk behaviors was based on risk genotypes defined as 0 (none, 1, or 2 risk alleles) and 1 (3 or more risk alleles) and based on risk behaviors defined as 0 (none, 1, 2, or 3 risk behaviors) and 1 (4 risk behaviors). The ultimate number of risk genotypes combined with risk behaviors was defined as 0 (none of risk genotypes and risk behaviors), 1 (either risk genotypes or risk behaviors), and 2 (both risk genotypes and risk behaviors) in total analysis; in smoker-specific analyses, 0 (none) and 1 (either risk genotypes or risk behaviors or both).
Multivariate regression for risk genotypes was adjusted for waist-to-hip ratio, duration of oral contraceptive use, dietary total sugars/day (g), percentage calories from protein/day, and percentage calories from saturated fatty acid/day.
P value with FDR <0.05 was presented after multiple comparison corrections via the Benjamini-Hochberg method.
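The combined grouping of risk genotypes and risk behaviors described in the notes above can be written out as a small sketch. The thresholds (3 or more risk alleles, 4 risk behaviors) come from the note. The function names are hypothetical, and counting the risk alleles and risk behaviors for each participant is assumed to happen upstream.

```python
# Minimal sketch of the combined risk grouping described in the table notes.
# Assumption: allele and behavior counts per participant are computed elsewhere.
def genotype_flag(n_risk_alleles, threshold=3):
    """0 for none, 1, or 2 risk alleles; 1 for 3 or more."""
    return 1 if n_risk_alleles >= threshold else 0


def behavior_flag(n_risk_behaviors, threshold=4):
    """0 for none, 1, 2, or 3 risk behaviors; 1 for 4."""
    return 1 if n_risk_behaviors >= threshold else 0


def combined_group(n_risk_alleles, n_risk_behaviors, smoker_specific=False):
    """Combined group: 0, 1, or 2 in the total analysis; 0 or 1 otherwise."""
    total = genotype_flag(n_risk_alleles) + behavior_flag(n_risk_behaviors)
    if smoker_specific:
        # Smoker-specific analyses collapse "either" and "both" into group 1.
        return min(total, 1)
    return total


# Hypothetical participants given as (risk alleles, risk behaviors).
examples = [(2, 3), (3, 1), (0, 4), (4, 4)]
total_groups = [combined_group(a, b) for a, b in examples]
smoker_groups = [combined_group(a, b, smoker_specific=True) for a, b in examples]
```

Keeping the two binary flags separate makes it easy to change either threshold without touching the combination rule.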