Moniek van Zutphen, Fränzel J B van Duijnhoven, Evertine Wesselink, Ruud W M Schrauwen, Ewout A Kouwenhoven, Henk K van Halteren, Johannes H W de Wilt, Renate M Winkels, Dieuwertje E Kok, Hendriek C Boshuizen.
Abstract
Current lifestyle recommendations for cancer survivors are the same as those for the general public to decrease their risk of cancer. However, it is unclear which lifestyle behaviors are most important for prognosis. We aimed to identify which lifestyle behaviors were most important regarding colorectal cancer (CRC) recurrence and all-cause mortality with a data-driven method. The study consisted of 1180 newly diagnosed stage I-III CRC patients from a prospective cohort study. Lifestyle behaviors included in the current recommendations, as well as additional lifestyle behaviors related to diet, physical activity, adiposity, alcohol use, and smoking, were assessed six months after diagnosis. These behaviors were simultaneously analyzed as potential predictors of recurrence or all-cause mortality with Random Survival Forests (RSFs). We observed 148 recurrences during 2.6-year median follow-up and 152 deaths during 4.8-year median follow-up. Higher intakes of sugary drinks were associated with increased recurrence risk. For all-cause mortality, fruit and vegetable, liquid fat and oil, and animal protein intake were identified as the most important lifestyle behaviors. These behaviors showed non-linear associations with all-cause mortality. Our exploratory RSF findings give new ideas on potential associations between certain lifestyle behaviors and CRC prognosis that still need to be confirmed in other cohorts of CRC survivors.
Keywords: colorectal cancer; lifestyle; random survival forests; recurrence; survival
Year: 2021 PMID: 34069979 PMCID: PMC8157840 DOI: 10.3390/cancers13102442
Source DB: PubMed Journal: Cancers (Basel) ISSN: 2072-6694 Impact factor: 6.639
Characteristics of the study population at colorectal cancer diagnosis and lifestyle characteristics six months after diagnosis.
| Background Variables, | |
|---|---|
| Age at diagnosis, y | 66 (61–71) |
| Men | 747 (63%) |
| Education (missing | |
| Low | 482 (41%) |
| Medium | 314 (27%) |
| High | 375 (32%) |
| Living with partner (missing | 988 (84%) |
| Tumor stage | |
| I | 307 (26%) |
| II | 356 (30%) |
| III | 517 (44%) |
| Tumor site | |
| Colon | 796 (67%) |
| Rectum | 384 (33%) |
| Neo-adjuvant treatment | 272 (23%) |
| Adjuvant chemotherapy | 284 (24%) |
| ASA physical performance classification (missing | |
| I | 354 (30%) |
| II | 653 (55%) |
| III | 122 (10%) |
| Daily NSAID use | 102 (9%) |
| Smoking at diagnosis (missing | |
| Yes | 119 (10%) |
| Former | 694 (59%) |
| Never | 359 (31%) |
| Lifestyle characteristics six months after diagnosis | |
| Body Mass Index, kg/m2 (missing | 25.9 (23.9–28.5) |
| Physical activity 1, min/wk. | 480 (240–840) |
| Diet | |
| Fruits and vegetables, g/day | 248 (147–350) |
| Red and processed meat, g/day | 63 (38–85) |
| Sugary drinks, g/day | 70 (13–176) |
| Dietary fiber, g/day | 19 (15–24) |
| Energy intake, kcal/day | 1765 (1472–2112) |
| Alcohol intake | |
| Non-drinker 2 | 293 (25%) |
| Amount (g/d) among drinkers | 9 (3–21) |
| Amount (g/d) among all | 5 (0–16) |
| Current smoker (missing | 80 (7%) |
1 Moderate-to-vigorous physical activity included all activities with a metabolic equivalent value ≥ 3; 2 No alcohol intake in past month. Bold: sub-heading.
Figure 1. Graphical presentation of the Random Survival Forest (RSF) algorithm. Adapted from Datema et al., 2012 [26]. OOB, out-of-bag data.
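The out-of-bag (OOB) idea named in the caption can be illustrated with a minimal sketch. It assumes the usual forest construction, in which each tree is grown on a bootstrap sample of the patients and the patients not drawn serve as OOB data for that tree. The function name `bootstrap_split` and the fixed seed are illustrative choices, not taken from the study.

```python
import random


def bootstrap_split(n_patients, rng):
    """Draw one bootstrap sample and return (in_bag, out_of_bag) index lists."""
    # Sample n indices with replacement: the data one tree would be grown on.
    in_bag = [rng.randrange(n_patients) for _ in range(n_patients)]
    # Patients never drawn are out-of-bag (OOB) for this tree.
    drawn = set(in_bag)
    out_of_bag = [i for i in range(n_patients) if i not in drawn]
    return in_bag, out_of_bag


rng = random.Random(0)  # fixed seed, only to make the sketch reproducible
in_bag, oob = bootstrap_split(1180, rng)  # 1180 = cohort size in the abstract
oob_fraction = len(oob) / 1180  # expected to be near (1 - 1/n)^n, about 0.37
```

Because roughly a third of patients are left out of each tree, every patient can be predicted by the trees that never saw them, which is what allows an error estimate without a separate test set.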
Figure 2. Variable importance (VIMP) from random survival forest analysis for (A) colorectal cancer recurrence, and (B) all-cause mortality. The dashed horizontal line is the threshold for filtering variables: all variables above the line are identified as predictive variables. VIMP values are shown for 1 out of 10 model repetitions; VIMP values varied somewhat across the 10 repetitions of the RSF models.
Variables predictive of recurrence or all-cause mortality based on variable importance.
| Variables Predictive of Recurrence | Number of Times Selected as Predictive Variable in 10 Repetitions of RSF Model |
|---|---|
|  | 10 |
|  | 10 |
|  | 10 |
|  | 10 |
| Sugary drinks | 10 |
| Saturated fat | 8 |
| Fruit | 6 |
| Total fat | 4 |
| Trans-fats | 3 |
| Eggs | 3 |
| Polyunsaturated fat | 3 |
| Carbohydrates | 3 |
| Fiber | 2 |
| Liquor | 2 |
| Energy intake | 2 |
| Variables Predictive of All-Cause Mortality | |
|  | 10 |
|  | 10 |
| Liquid fat & oils | 10 |
| Fruit & vegetables | 10 |
| Animal protein | 10 |
| Fruit | 9 |
| Polyunsaturated fat | 9 |
| Potato | 8 |
| Processed meat | 8 |
|  | 7 |
| Herbal tea | 6 |
| Sugary drinks | 6 |
| Soup | 6 |
|  | 6 |
| Alcohol | 5 |
| BMI | 4 |
| Beer | 4 |
|  | 4 |
| Plant protein | 4 |
|  | 2 |
| Dietary fiber | 2 |
Variables printed in italics are background variables, all other variables are lifestyle variables. Variables were selected as predictive based on their VIMP values. Only variables selected in ≥2 model repetitions are included in this table. Bold: sub-heading.
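The selection rule described in this note (a variable counts as predictive in a repetition when its VIMP exceeds the threshold, and it is tabulated only if selected in at least 2 repetitions) can be sketched as follows. The VIMP values, the threshold of 0.003, and the function names are hypothetical and serve only to show the counting logic; they are not data or code from the study.

```python
from collections import Counter


def count_selections(vimp_per_repetition, threshold):
    """Count in how many repetitions each variable's VIMP exceeds the threshold."""
    counts = Counter()
    for vimp in vimp_per_repetition:
        for name, value in vimp.items():
            if value > threshold:
                counts[name] += 1
    return counts


def stable_variables(counts, min_repetitions=2):
    """Keep only variables selected in at least `min_repetitions` repetitions."""
    return {name: n for name, n in counts.items() if n >= min_repetitions}


# Hypothetical VIMP values for three repetitions (not data from the study).
repetitions = [
    {"sugary_drinks": 0.012, "fiber": 0.004, "noise": -0.002},
    {"sugary_drinks": 0.010, "fiber": 0.001, "noise": 0.0035},
    {"sugary_drinks": 0.011, "fiber": 0.005, "noise": -0.001},
]
counts = count_selections(repetitions, threshold=0.003)
selected = stable_variables(counts)
```

In this toy input the `noise` variable crosses the threshold once by chance and is dropped, which is the kind of instability that repeating the model is meant to expose.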
RSF-derived error rates for the prediction of recurrence and all-cause mortality in different RSF models based on 10 model repetitions.
| RSF Model | Recurrence: Prediction Error Rate 1 | All-Cause Mortality: Prediction Error Rate 1 |
|---|---|---|
| Final model (background and identified lifestyle variables) | 0.3376 ± 0.0005 | 0.3452 ± 0.0006 |
| Only background variables | 0.3570 ± 0.0005 | 0.3483 ± 0.0004 |
| Full model (background and lifestyle variables) | 0.3777 ± 0.0006 | 0.3964 ± 0.0009 |
| Only lifestyle variables | 0.4858 ± 0.0014 | 0.4309 ± 0.0007 |
| Only noise (benchmark model) | 0.5706 ± 0.0014 | 0.4886 ± 0.0011 |
Background variables included age, sex, education, living with partner, stage of disease, neo-adjuvant treatment, adjuvant chemotherapy, tumor location, smoking status at diagnosis, use of non-steroidal anti-inflammatory drugs at diagnosis, and ASA classification. 1 Standard error (SE) represents randomness based on 10 repetitions of the RSF model within the same dataset.
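To help read the error rates above, the following sketch assumes that prediction error is defined as 1 minus Harrell's concordance index, as is common in RSF software; the record itself does not state the definition. Under that assumption an error near 0.5 corresponds to random ranking, consistent with the noise-only benchmark. The function name and the toy data are illustrative, and tied follow-up times are ignored for simplicity.

```python
def concordance_error(times, events, risk_scores):
    """Return 1 - Harrell's C for right-censored data.

    times: follow-up time per patient
    events: 1 if the event was observed, 0 if censored
    risk_scores: higher value means higher predicted risk
    """
    concordant = 0.0
    usable = 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            continue  # a pair is usable only if the shorter time is an event
        for j in range(n):
            if times[i] < times[j]:
                usable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0  # earlier event had the higher risk
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5  # tied predictions count as half
    if usable == 0:
        raise ValueError("no usable pairs")
    return 1.0 - concordant / usable


# Toy data (hypothetical): four patients, the third is censored.
times = [2, 4, 6, 8]
events = [1, 1, 0, 1]
perfect = concordance_error(times, events, [0.9, 0.7, 0.4, 0.1])
reversed_ = concordance_error(times, events, [0.1, 0.4, 0.7, 0.9])
uninformative = concordance_error(times, events, [0.5, 0.5, 0.5, 0.5])
```

With this reading, a perfectly ranked model has error 0, a constant prediction has error 0.5, and a model that ranks patients in reverse has error 1.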
Figure 3. Partial plots of identified lifestyle variables for recurrence. Values on the vertical axis represent predicted three-year and five-year recurrence-free survival for a given variable after adjusting for all other variables (background and shown lifestyle variables). Dietary intakes in grams per day are on the horizontal axis. A lower predicted recurrence-free survival means a higher risk of developing a local or distant recurrence within three or five years of follow-up. The rug plots on the x-axis show the distribution of intake data observed in the cohort; about 90% of observations occur between the second and second-last rug.
Figure 4. Partial plots of identified lifestyle variables for all-cause mortality. Values on the vertical axis represent predicted three-year and five-year survival for a given variable after adjusting for all other variables (background and shown lifestyle variables). Dietary intakes in grams per day are on the horizontal axis. The rug plots on the x-axis show the distribution of intake data observed in the cohort; about 90% of observations occur between the second and second-last rug.
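The partial plots in Figures 3 and 4 can be understood through a minimal sketch of the general partial-dependence idea: fix one variable at a grid value for every patient, predict survival, and average the predictions. The `toy_predict` function below is a made-up stand-in for a fitted forest, with invented coefficients, and the three patient rows are hypothetical; only the averaging procedure is the point.

```python
def partial_dependence(predict, rows, variable, grid):
    """Average prediction over all rows with `variable` forced to each grid value."""
    curve = []
    for value in grid:
        # Overwrite the variable for every patient, keep all other values as observed.
        preds = [predict({**row, variable: value}) for row in rows]
        curve.append(sum(preds) / len(preds))
    return curve


def toy_predict(row):
    """Hypothetical stand-in for a fitted model; coefficients are invented."""
    survival = 0.90 - 0.0002 * row["sugary_drinks"] - 0.002 * (row["age"] - 66)
    return max(0.0, min(1.0, survival))


# Hypothetical patients (not study data); intake in g/day, age in years.
rows = [
    {"age": 61, "sugary_drinks": 10},
    {"age": 66, "sugary_drinks": 80},
    {"age": 71, "sugary_drinks": 150},
]
curve = partial_dependence(toy_predict, rows, "sugary_drinks", [0, 100, 200])
```

Averaging over the observed values of the other variables is what "after adjusting for all other variables" amounts to in this kind of plot; a real forest would replace `toy_predict` and could produce the non-linear shapes the abstract mentions.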