
Development and Validation of a New Multiparametric Random Survival Forest Predictive Model for Breast Cancer Recurrence with a Potential Benefit to Individual Outcomes.

Huan Li1, Ren-Bin Liu1, Chen-Meng Long2, Yuan Teng3, Lin Cheng1, Yu Liu1.   

Abstract

Purpose: Breast cancer (BC) is a multi-factorial disease. Its individual prognosis varies; thus, individualized patient profiling is instrumental to improving BC management and individual outcomes. An economical, multiparametric, and practical model to predict BC recurrence is needed.
Patients and Methods: We retrospectively investigated the clinical data of BC patients treated at the Third Affiliated Hospital of Sun Yat-sen University and Liuzhou Women and Children's Medical Center from January 2013 to December 2020. Random forest-recursive feature elimination (implemented with the R caret package) was used to determine the best variable set, and the random survival forest method was used to develop a predictive model for BC recurrence.
Results: The training and validation sets included 623 and 151 patients, respectively. Using random forest-recursive feature elimination, we selected 14 variables: pathological (TNM) stage, gamma-glutamyl transpeptidase, total cholesterol, Ki-67, lymphocyte count, low-density lipoprotein, age, apolipoprotein B, high-density lipoprotein, globulin, neutrophil count to lymphocyte count ratio, alanine aminotransferase, triglyceride, and albumin to globulin ratio. We developed a recurrence prediction model using random survival forest (RSF). Model performance was assessed using the area under the receiver operating characteristic curve and Kaplan-Meier survival analyses. C-indexes were 0.997 and 0.936 for the training and validation sets, respectively.
Conclusion: The model could accurately predict BC recurrence. It aids clinicians in identifying high-risk patients and making treatment decisions for breast cancer patients in China. This new multiparametric RSF model is instrumental for breast cancer recurrence prediction and potentially improves individual outcomes.
© 2022 Li et al.

Keywords:  breast cancer; individualized patient profiles; multi-level diagnostics and disease modeling; random survival forest; recurrence

Year:  2022        PMID: 35256862      PMCID: PMC8898179          DOI: 10.2147/CMAR.S346871

Source DB:  PubMed          Journal:  Cancer Manag Res        ISSN: 1179-1322            Impact factor:   3.989


Introduction

Breast cancer (BC) is a malignant tumor originating from the epithelial tissue of the breast. The incidence of BC has been increasing annually, and BC has become a significant threat to women’s health. According to the data of GLOBOCAN 2020, the number of new cases of female BC is estimated to be 2.3 million (11.7%), surpassing lung cancer as the most common cancer type.1 In China, morbidity and mortality rates of BC have increased in recent years because of lifestyle changes, dietary regimens, and the natural environment.2,3 Conventional prognostic factors for BC include tumor-node-metastasis (TNM) stage (tumor size, number of metastatic lymph nodes, distant metastatic state), tumor grade, and expression of molecular biomarkers such as estrogen receptor (ER), progesterone receptor (PR), Ki-67, and human epidermal growth factor receptor 2 (HER-2).4 However, patient characteristics, including age, body fat, and nutritional and inflammatory status, also affect tumor prognosis.5–8 With the recent development of gene technology, several studies have focused on the genetic diagnosis of BC, and researchers have proposed the integration of multiple genes and molecular markers to construct new prognostic models for predicting BC prognosis. One of them is the 21-gene recurrence score assay, which uses the expression of 21 genes to predict the risk of BC recurrence.9 However, genetic testing requires specialized equipment and is costly; thus, it is still not widely used.9 Overall, prognostic factors of BC include not only tumor characteristics but also patient characteristics. 
Using tumor or patient characteristics separately to predict BC recurrence may be inaccurate.10,11 The genetic prediction of BC recurrence has been shown to be accurate, but obtaining genetic information is laborious, time-consuming, and relatively expensive, which has greatly limited its use in clinical practice.9 Because BC is a multi-factorial disease, individualized patient profiling is instrumental to improving BC management and individual outcomes, and a paradigm shift from a reactive to a predictive, preventive, and personalized medicine approach is essential.12–14 Thus, a comprehensive, accurate, and low-cost predictive model using multi-level diagnostics and disease modeling that can be used in clinical practice is needed. The rapid development of machine learning technologies has facilitated the construction of predictive models that can efficiently evaluate numerous parameters. However, conventional models often have a low level of predictive accuracy due to overfitting.15 Random survival forest (RSF) is a machine learning method that extends random forest to survival analysis.16 It has the following advantages: it has no special requirements for the data set and can be used to analyze data in which the number of variables is much larger than the sample size. Moreover, RSF is reported to be robust to overfitting and collinearity.17–19 Further, RSF places no restrictions on the type of data or on the form of the association between predictive variables and outcomes, nor is it constrained by the proportional hazards or log-linear assumptions. 
As a result, higher levels of accuracy are achieved.16,20 To capitalize on the advantages of the machine learning method, we developed a new predictive model for BC recurrence using RSF, which was based on baseline, cross-sectional, common clinical variables, including general patient information, blood tests, pathological examinations, and adverse events due to treatment. Here, we hypothesized that multi-level diagnostics and disease modeling may lead to the identification of risk of recurrence for BC patients. Furthermore, a multi-omic predictive model using machine learning was considered a potent tool for stratifying patients with high versus low risk for BC recurrence.

Methods

A retrospective review of BC patients’ medical records was performed at the Third Affiliated Hospital of Sun Yat-sen University and Liuzhou Women and Children’s Medical Center between January 2013 and December 2020. All patients enrolled in this cohort study were diagnosed with stage 0 to III primary BC and had received primary BC therapy. Patients lost to follow-up and those with stage IV disease, a history of cancer, other synchronous malignancies, or substantially incomplete records (more than 50% of variables missing) were excluded from the study. Patients diagnosed and treated at the Third Affiliated Hospital of Sun Yat-sen University formed the training set for model development, and patients diagnosed and treated at Liuzhou Women and Children’s Medical Center formed the validation set. The flowchart of the study design and patient selection is shown in Figure 1.
Figure 1

Flowchart of the study design and patient selection.

This study was approved by the ethics committee of the Third Affiliated Hospital of Sun Yat-sen University and Liuzhou Women and Children’s Medical Center. The study was in compliance with the Declaration of Helsinki and its later amendments. All study participants provided informed consent to review their medical records. Identifiable data involving the individuals in this study were encrypted.

Potential Predictors

The patients’ data were obtained from their medical records. The results of the routine peripheral blood parameters and biochemical parameters before initiating any treatment were reviewed. The complete blood cell counts and biochemical parameters were measured by standard clinical laboratory methods. White blood cell (WBC), neutrophil (NEUT), lymphocyte (LYMPH), and platelet (PLT) counts were collected, and the neutrophil count to lymphocyte count ratio (NLR) and platelet count to lymphocyte count ratio (PLR) were calculated. We retrospectively investigated patients’ characteristics, including age, type of chemotherapy, chemotherapy toxicities, and prognoses, based on a review of patients’ medical records or telephone follow-up. ER, PR, and Ki-67 status were assessed by immunohistochemistry (IHC). HER-2 status was evaluated by IHC and/or fluorescence in situ hybridization (FISH). Tumors in which at least 10% of tumor cells stained positive for ER or PR at any staining intensity were considered positive. The HER-2 staining intensity score was evaluated from 0 to 3+. HER-2 membranous staining was scored as 0 if no cells showed staining; as 1+ if incomplete, faint staining was present in >10% of cells; as 2+ if complete, moderate staining was present in >10% of cells; and as 3+ if complete, strong staining was present in >10% of cells. An HER-2 score of 0–1+ was considered negative, and specimens with an HER-2 score of 2+ or 3+ were examined further by FISH. Specimens scored as 3+ or confirmed to display amplification based on FISH were considered positive. The patients were categorized into four subtypes based on the IHC results for ER, PR, HER-2, and Ki-67 in the following manner: Luminal A (ER+ and/or PR+, HER-2−, Ki-67 <14%), Luminal B (ER+ and/or PR+, HER-2+) or (ER+ and/or PR+, HER-2−, Ki-67 ≥14%), HER-2+ (ER−, PR−, HER-2+), and TNBC (ER−, PR−, HER-2−). 
TNM staging was performed according to histopathology results following the recommendations of the American Joint Committee on Cancer.
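
The receptor and subtype rules described above can be written as a small decision function. The following Python sketch is illustrative only and is not the authors' code; the function names and the input encoding (percent of stained cells, IHC score strings, a boolean FISH result) are assumptions made for this example.

```python
def her2_positive(ihc_score, fish_amplified=None):
    """HER-2 status from the IHC score ('0', '1+', '2+', '3+') and an optional FISH result."""
    if ihc_score in ("0", "1+"):
        return False              # 0 to 1+ is negative
    if ihc_score == "3+":
        return True               # 3+ is positive
    if ihc_score == "2+":
        if fish_amplified is None:
            raise ValueError("IHC 2+ needs a FISH result to be classified")
        return bool(fish_amplified)  # 2+ is positive only if FISH shows amplification
    raise ValueError(f"unknown IHC score: {ihc_score!r}")


def receptor_positive(percent_stained):
    """ER or PR is positive when at least 10% of tumor cells stain."""
    return percent_stained >= 10.0


def molecular_subtype(er_pct, pr_pct, her2_pos, ki67_pct):
    """Assign one of the four subtypes used in the text."""
    hormone_receptor_pos = receptor_positive(er_pct) or receptor_positive(pr_pct)
    if hormone_receptor_pos:
        # Luminal B: HER-2 positive, or HER-2 negative with Ki-67 >= 14%
        if her2_pos or ki67_pct >= 14.0:
            return "Luminal B"
        return "Luminal A"
    return "HER-2+" if her2_pos else "TNBC"
```

For example, `molecular_subtype(90, 80, False, 10)` gives "Luminal A", whereas the same tumor with Ki-67 of 14% or more is assigned to "Luminal B".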

Assessment of Adverse Events

Data on adverse events were collected and assessed by the Common Terminology Criteria for Adverse Events version 5.0. The severity of adverse events was measured using grades 1 to 5 as follows: Grade 1: Mild, asymptomatic or mild symptoms, clinical or diagnostic observations only, and intervention not indicated. Grade 2: Moderate; minimal, local, or noninvasive intervention indicated; limiting age-appropriate instrumental activities of daily living (ADLs). Grade 3: Severe or medically significant but not immediately life-threatening; hospitalization or prolongation of hospitalization indicated; disabling; and limiting self-care ADLs. Grade 4: Life-threatening consequences and urgent intervention indicated. Grade 5: Death related to adverse events. Severe adverse events included grades 3 to 5.
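
As a minimal sketch of how the grading rule above maps onto the two adverse-event variables (any event, severe event), assuming each patient's record is simply a list of integer CTCAE grades (a hypothetical encoding, not taken from the study):

```python
def is_severe(grade):
    """Grades 3 to 5 count as severe; grades outside 1..5 are rejected."""
    if grade not in (1, 2, 3, 4, 5):
        raise ValueError(f"CTCAE grade must be 1..5, got {grade!r}")
    return grade >= 3


def summarize_adverse_events(grades):
    """Return (any_event, any_severe_event) for one patient's list of grades."""
    severe_flags = [is_severe(g) for g in grades]  # also validates every grade
    return (len(grades) > 0, any(severe_flags))
```

A patient with grades `[1, 2]` has an adverse event but no severe adverse event; a patient with `[2, 4]` has both.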

Statistical Analyses

The patients’ baseline demographic and clinical characteristics are presented as percentages or as means with standard deviations. Student’s t-test or the Mann–Whitney U-test was used to compare continuous variables, and the chi-squared test was used to compare categorical variables. All statistical analyses were performed in R. All tests were two-tailed, and differences were considered statistically significant when P < 0.05. We used predictive mean matching (PMM) in the R mice package to impute the missing data in the training and validation sets. Starting from tumor characteristics, demographic characteristics, routine blood tests, treatment adverse events, and interruption of treatment, random forest-recursive feature elimination (RF-RFE) (implemented with the R caret package) was used to determine the best variable set, and the RSF method was used to develop a predictive model for the recurrence risk of BC patients. Candidate pairs of ntree and mtry were evaluated by a grid search with 10-fold cross-validation, and the pair with the best concordance index (C-index) was taken as the optimized parameters. The C-index was also used to evaluate the discrimination of the predictive model, and the receiver operating characteristic (ROC) curve and Kaplan-Meier (KM) survival analysis were used to evaluate how well the model predicted BC recurrence.
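
To make the C-index concrete, the sketch below computes Harrell's concordance index in plain Python. It is an illustration under simplifying assumptions (pairs with tied event times are skipped, and a higher score means a higher predicted risk), not the computation performed by the R packages used in the study.

```python
def concordance_index(times, events, risk_scores):
    """Harrell's C: fraction of comparable pairs ranked correctly by risk.

    times       -- follow-up time per patient
    events      -- 1 if recurrence was observed, 0 if censored
    risk_scores -- model output, higher meaning higher predicted risk
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        for j in range(n):
            # A pair is comparable when patient i had an observed event
            # strictly before patient j's follow-up time.
            if events[i] and times[i] < times[j]:
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0   # earlier event has the higher risk score
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5   # tied scores count as half
    if comparable == 0:
        raise ValueError("no comparable pairs (no observed events?)")
    return concordant / comparable
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect ranking of recurrence times by predicted risk.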

Results

In total, 774 patients were included in the study. Patients diagnosed with BC and treated at the Third Affiliated Hospital of Sun Yat-sen University were included in the training set (n=623) for model development, and patients diagnosed with BC and treated at Liuzhou Women and Children’s Medical Center were included in the validation set (n=151) for model validation. We used the R mice package (PMM) to impute the missing data in the training and validation sets. Basic information regarding the training and validation sets is shown in Table 1.
Table 1

Basic Information of the Training and Validation Sets

Variable | Total Set | Training Set | Validation Set | P value
Total (n) | 774 | 623 | 151 |
Female (n) | 774 | 623 | 151 |
Age (year) | 49.76±10.83 | 50.50±10.84 | 46.74±10.27 | 0.001
  ≥60 (n) | 155 (20.03%) | 138 (22.15%) | 17 (11.26%) | 0.011
  <60 (n) | 619 (79.97%) | 485 (77.85%) | 134 (88.74%) |
  ≥35 (n) | 713 (92.12%) | 577 (92.62%) | 136 (90.07%) | 0.569
  <35 (n) | 61 (7.88%) | 46 (7.38%) | 15 (9.93%) |
BMI (kg/m²) | 23.19±3.18 | 23.69±3.36 | 21.49±1.54 | <0.001
ALT (U/L) | 20.24±12.94 | 19.52±13.84 | 23.23±7.59 | <0.001
AST (U/L) | 21.0±10.02 | 21.34±11.07 | 19.62±2.59 | 0.167
TBIL (μmol/L) | 11.12±4.82 | 10.92±4.34 | 11.96±6.34 | 0.312
DBIL (μmol/L) | 3.31±1.45 | 3.18±1.46 | 3.81±1.26 | <0.001
GGT (U/L) | 25.40±17.80 | 26.81±8.62 | 25.03±19.46 | <0.001
ALP (U/L) | 64.84±20.94 | 65.14±22.63 | 63.66±12.14 | 0.738
ALB (g/L) | 42.63±4.37 | 42.59±3.58 | 42.80±6.72 | 0.875
GLB (g/L) | 27.30±4.21 | 27.15±3.99 | 27.88±5.02 | 0.169
A/G | 1.60±0.28 | 1.60±0.25 | 1.58±0.37 | 0.821
Cr (μmol/L) | 65.13±17.03 | 59.88±12.19 | 86.71±17.15 | <0.001
GLU | 5.40±1.73 | 5.47±1.41 | 5.13±2.67 | <0.001
UA | 313.78±80.62 | 315.80±89.10 | 305.96±29.50 | 0.409
CHOL | 4.98±0.94 | 4.94±1.04 | 5.12±0.36 | 0.112
TRIG | 1.24±0.88 | 1.35±0.95 | 0.81±0.34 | <0.001
HDL | 1.36±0.45 | 1.30±0.32 | 1.59±0.73 | <0.001
LDL | 3.06±0.87 | 3.11±0.90 | 2.88±0.76 | 0.017
ApoA | 1.47±0.22 | 1.43±0.22 | 1.62±0.09 | <0.001
ApoB | 1.01±0.27 | 1.03±0.30 | 0.94±0.07 | 0.005
Lpa | 198.07±217.58 | 210.55±241.54 | 149.30±38.15 | 0.009
WBC (×10⁹/L) | 5.88±1.81 | 6.28±1.65 | 4.25±1.46 | <0.001
NEUT (×10⁹/L) | 3.69±1.40 | 3.91±1.42 | 2.75±0.81 | <0.001
LYMPH (×10⁹/L) | 1.77±0.58 | 1.82±0.6 | 1.53±0.41 | <0.001
RBC (×10¹²/L) | 4.42±0.52 | 4.43±0.55 | 4.37±0.35 | 0.449
HCT | 0.37±0.04 | 0.38±0.04 | 0.35±0.02 | <0.001
Hb (g/L) | 124.94±12.18 | 125.60±13.25 | 122.22±5.15 | 0.009
PLT (×10⁹/L) | 239.73±65.38 | 253.51±63.0 | 182.87±39.08 | <0.001
AST/PLT | 0.09±0.11 | 0.09±0.12 | 0.11±0.02 | 0.268
NLR | 2.29±1.27 | 2.39±1.37 | 1.85±0.59 | <0.001
PLR | 146.78±57.36 | 152.87±61.37 | 121.68±23.51 | <0.001
PT (s) | 12.86±0.71 | 12.95±0.74 | 12.51±0.36 | <0.001
INR | 0.99±0.36 | 1.00±0.39 | 0.95±0.03 | 0.799
Follow-up (months) | 55.59±25.43 | 60.24±25.68 | 36.42±11.77 | <0.001
Tumor pathology | | | |
Tumor stage | | | | 0.569
  0 | 29 (3.75%) | 27 (4.33%) | 2 (1.32%) |
  I | 191 (24.68%) | 117 (18.78%) | 74 (49.01%) |
  II | 382 (49.35%) | 330 (52.97%) | 52 (34.44%) |
  III | 172 (22.22%) | 149 (23.92%) | 23 (15.23%) |
Histology | | | | 0.569
  Invasive ductal carcinoma | 650 (83.98%) | 540 (86.68%) | 110 (72.85%) |
  Invasive lobular carcinoma | 35 (4.52%) | 26 (4.17%) | 9 (5.96%) |
  Carcinoma in situ | 51 (6.59%) | 32 (5.14%) | 19 (12.58%) |
  Special types (inflammatory breast cancer, Paget’s disease, mucinous carcinoma, malignant phyllodes tumor) | 38 (4.91%) | 25 (4.01%) | 13 (8.61%) |
Immunohistochemistry | | | |
ER status | | | | 0.005
  Negative | 150 (19.38%) | 135 (21.67%) | 15 (9.93%) |
  Positive | 624 (80.62%) | 488 (78.33%) | 136 (90.07%) |
PR status | | | | 0.027
  Negative | 188 (24.29%) | 164 (26.32%) | 24 (15.89%) |
  Positive | 586 (75.71%) | 459 (73.68%) | 127 (84.11%) |
HER2 status | | | | 0.001
  Negative | 543 (70.16%) | 418 (67.09%) | 125 (82.78%) |
  Positive | 231 (29.84%) | 205 (32.91%) | 26 (17.22%) |
Ki-67 | | | | 0.028
  <14% | 257 (33.20%) | 193 (30.98%) | 64 (42.38%) |
  ≥14% | 517 (66.80%) | 430 (69.02%) | 87 (57.62%) |
Axillary lymph node metastasis | | | | 0.023
  No | 446 (57.62%) | 344 (55.22%) | 102 (67.55%) |
  Yes | 328 (42.38%) | 279 (44.78%) | 49 (32.45%) |
Molecular type | | | | 0.023
  Luminal A | 203 (26.23%) | 162 (26.0%) | 41 (27.15%) |
  Luminal B | 414 (53.49%) | 343 (55.06%) | 71 (47.02%) |
  HER2 enriched | 64 (8.27%) | 54 (8.67%) | 10 (6.62%) |
  TNBC | 80 (10.34%) | 51 (8.19%) | 29 (19.21%) |
Adverse event | | | | <0.001
  No | 137 (17.70%) | 136 (21.83%) | 1 (0.66%) |
  Yes | 637 (82.30%) | 487 (78.17%) | 150 (99.34%) |
Serious adverse events (CTCAE grade ≥3) | | | | <0.001
  No | 569 (73.51%) | 426 (68.38%) | 143 (94.70%) |
  Yes | 205 (26.49%) | 197 (31.62%) | 8 (5.30%) |
Disruptions of therapy | | | | 0.028
  No | 746 (96.38%) | 595 (95.51%) | 151 (100%) |
  Yes | 28 (3.62%) | 28 (4.49%) | 0 |
Recurrence | | | | 0.938
  No | 717 (92.64%) | 576 (92.46%) | 141 (93.38%) |
  Yes | 57 (7.36%) | 47 (7.54%) | 10 (6.62%) |
Recurrence time (months) | 53.47±25.11 | 58.04±25.45 | 34.74±11.08 | <0.001

The RF-RFE program of the R caret package was used to filter the most highly predictive variables, and we selected the optimal number of variables according to the root mean square error (RMSE). The RMSE measures the difference between observed and predicted values; the lower the RMSE, the higher the predictive accuracy of the model. Figure 2 shows that the RMSE was lowest when the model included 14 variables.
Figure 2

Evaluating the number of variables contained in the optimal set using the root mean square error.
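
The selection criterion can be illustrated with a short Python sketch: compute the RMSE for each candidate subset size and keep the size with the lowest value. This is not the caret implementation, and the RMSE values in the usage note below are made up for illustration.

```python
import math


def rmse(observed, predicted):
    """Root mean square error between observed and predicted values."""
    if len(observed) != len(predicted) or len(observed) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    squared_errors = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


def best_subset_size(rmse_by_size):
    """Pick the number of variables whose model had the lowest RMSE."""
    return min(rmse_by_size, key=rmse_by_size.get)
```

With hypothetical values such as `{10: 0.30, 14: 0.25, 20: 0.27}`, `best_subset_size` returns 14, mirroring the choice of a 14-variable set read off Figure 2.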

The best variable set (14 variables) filtered by RF-RFE included the pathological (TNM) stage, gamma-glutamyl transpeptidase (GGT), total cholesterol (CHOL), Ki-67, lymphocyte count, low-density lipoprotein (LDL), age, apolipoprotein B (ApoB), high-density lipoprotein (HDL), serum globulin (GLB), neutrophil count to lymphocyte count ratio (NLR), alanine aminotransferase (ALT), triglyceride (TRIG), and the serum albumin to serum globulin (A/G) ratio. Variable importance (VIMP) indicated by RF-RFE is shown in Figure 3. A positive VIMP value indicates that a variable improves predictive accuracy, while a negative value indicates that it worsens prediction.21 VIMP reflects the contribution of each variable to model prediction separately, but it does not consider the contribution of combinations of variables.22,23 The VIMP of A/G was negative, but when A/G was included in the variable set, the RMSE was reduced to the lowest level measured, indicating that the variable set that included A/G had the best predictive performance. Moreover, previous studies reported that an increased A/G ratio often predicts a good prognosis,6,24–26 that is, it is inversely associated with recurrence. Therefore, the A/G ratio was included in the variable set.
Figure 3

Variable importance values derived from the random forest-recursive feature elimination analysis.

The RSF model was constructed using the R randomForestSRC package. As shown in Figure 4, the error rate of the model gradually stabilized as the number of trees increased. Between 4000 and 6000 trees, the out-of-bag error rate decreased steadily and approached approximately 0.3. The error rate was stable when the number of trees was 10,000. Therefore, 10,000 trees (ntree = 10,000) was considered appropriate, and the best performing parameters (ntree = 10,000; mtry = 4) were selected to develop the RSF prognostic model. Subsequently, RSF-based scores for individual samples were calculated. The C-index was 0.997 (95% confidence interval [CI], 0.995–0.998) (strong discriminatory power).27 ROC curve analysis was used to evaluate the performance of the developed RSF prognostic model in the training set. Based on RSF scores, the area under the ROC curve (AUROC) was 0.994 (95% CI, 0.9848–1.0), with a sensitivity of 97.9%, specificity of 98.4%, and an optimal cut-off value of 2.81 in the training set (Figure 5).
Figure 4

Change in the prediction error rate of the recurrence risk model of breast cancer patients with tree number.

Figure 5

Receiver operating characteristic curve of the developed random survival forest model.
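
The text does not state how the optimal cut-off on the ROC curve was chosen. One common choice is the point that maximizes Youden's J (sensitivity + specificity − 1); the Python sketch below illustrates that approach as an assumption, with scores at or above the cut-off treated as predicted recurrence. It requires at least one patient with and one without recurrence.

```python
def sensitivity_specificity(scores, labels, cutoff):
    """Sensitivity and specificity when score >= cutoff predicts recurrence (label 1)."""
    pairs = list(zip(scores, labels))
    tp = sum(1 for s, y in pairs if y == 1 and s >= cutoff)
    fn = sum(1 for s, y in pairs if y == 1 and s < cutoff)
    tn = sum(1 for s, y in pairs if y == 0 and s < cutoff)
    fp = sum(1 for s, y in pairs if y == 0 and s >= cutoff)
    return tp / (tp + fn), tn / (tn + fp)


def youden_cutoff(scores, labels):
    """Return the observed score that maximizes sensitivity + specificity - 1."""
    best_cutoff, best_j = None, -1.0
    for candidate in sorted(set(scores)):
        sens, spec = sensitivity_specificity(scores, labels, candidate)
        j = sens + spec - 1.0
        if j > best_j:  # keep the first (lowest) cutoff on ties
            best_cutoff, best_j = candidate, j
    return best_cutoff
```

Applied to a training set, the returned value plays the role of the 2.81 threshold reported above; the same threshold is then reused unchanged on the validation set.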

We divided patients into high- and low-risk groups according to whether their RSF-based score was above or below 2.81, respectively. KM analyses revealed highly significant differences in recurrence-free survival between the high-risk and low-risk groups (P < 0.0001) (Figure 6). These results indicate that the developed RSF prognostic model predicted BC recurrence accurately in the training set.
Figure 6

Kaplan-Meier survival curves of recurrence-free survival for the training set.

The developed RSF prognostic model was applied in the assessment of an independent validation set, and the predictive performance of the model was evaluated using data of the Liuzhou Women and Children’s Medical Center cohort. The C-index was determined to be 0.936 (95% CI, 0.891–0.981) (strong discriminatory power).27 ROC curve analysis was used to evaluate the performance of the developed RSF prognostic model in the validation set. Based on the RSF-based scores, the AUROC was 0.961 (95% CI, 0.926–0.996), with a sensitivity of 100% and specificity of 87.9% (Figure 7).
Figure 7

Receiver operating characteristic curve of the developed random survival forest model in the validation set.

The optimal cut-off value of 2.81 for the RSF-based score, derived from the training set, was used to stratify patients in the validation set into high-risk and low-risk groups. KM analyses revealed highly significant differences in recurrence-free survival between high-risk and low-risk groups (P < 0.0001) (Figure 8).
Figure 8

Kaplan-Meier survival curves of recurrence-free survival for the validation set.
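
For readers who want to see what the stratified KM analysis involves, the following Python sketch splits patients at the 2.81 cut-off and computes a Kaplan-Meier curve for one group. It is a simplified illustration, not the authors' R code: the function names are hypothetical, scores strictly above the cut-off are treated as high risk, and the log-rank test behind the reported P values is not implemented.

```python
def stratify(scores, cutoff=2.81):
    """Label each RSF-based score as 'high' (above the cutoff) or 'low' risk."""
    return ["high" if s > cutoff else "low" for s in scores]


def kaplan_meier(times, events):
    """Kaplan-Meier estimate: list of (time, survival) at each observed event time.

    times  -- follow-up time per patient
    events -- 1 if recurrence was observed at that time, 0 if censored
    """
    data = sorted(zip(times, events))
    n_at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        n_events = 0
        n_leaving = 0
        # Group all patients who share this follow-up time.
        while i < len(data) and data[i][0] == t:
            n_events += data[i][1]
            n_leaving += 1
            i += 1
        if n_events > 0:
            survival *= 1.0 - n_events / n_at_risk
            curve.append((t, survival))
        n_at_risk -= n_leaving  # events and censored patients both leave the risk set
    return curve
```

Calling `kaplan_meier` separately on the "high" and "low" groups yields the two curves compared in Figures 6 and 8.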


Discussion

This study employed an RF-RFE algorithm28 to automatically select the most relevant features among 40 candidate variables for further RSF model development. Variable selection is the process of choosing a subset of relevant features for further analysis in order to minimize generalization error. The 14 variables selected for the model were pathological (TNM) stage, GGT, CHOL, Ki-67, lymphocyte count, LDL, age, ApoB, HDL, GLB, NLR, ALT, TRIG, and A/G. These variables were reported to be closely associated with BC recurrence risk in previous studies, and they were used in this study to develop the model.

Characteristics of the Selected Variables

The most important variable selected was pathological stage (TNM stage). Pathological stage has been widely used in clinical practice to predict prognosis and survival as well as guide clinical treatment.29 Ki-67 is a nuclear antigen expressed in proliferating cells and is closely related to cell proliferation. Current studies have shown that Ki-67 is highly correlated with the differentiation, invasion, and metastasis of BC,30 so it can be used to predict the prognosis of BC patients. Some studies have confirmed that Ki-67-positive BC patients have a poor treatment response and prognosis.29,31

Lymphocyte count, NLR, and platelet count to lymphocyte count ratio (PLR) are variables assessed in routine blood tests. In previous studies, it was found that a change in the white blood cell (WBC) count in peripheral blood is related to the systemic inflammatory response.5 Moreover, some studies have found that tumor-related systemic inflammatory response is an independent predictor of tumor prognosis in patients.32,33 NLR and PLR can reliably reflect the body’s inflammatory status. Studies have confirmed that the differential WBC count in peripheral blood, NLR, and PLR can be used to predict the prognosis of BC.5,34–36 Increases in NLR and PLR often indicate poor response to treatment and poor prognosis, whereas low NLR and PLR are often indicative of good treatment response and good prognosis.37–39

Age is an independent risk factor for BC. The incidence and mortality rates of BC increase with age. Many factors related to age play an important role in the occurrence and development of BC, such as changes in hormone levels before and after menopause, cumulative DNA damage with aging, the occurrence of various types of chronic infections, and changes in the immune system.40–42 At the same time, some studies have shown that the prognosis of young BC patients is often poor, which may be due to the higher pathological grade of young patients’ tumors; these tumors are also more often hormone receptor-negative and HER-2-positive and may have other adverse features.8,43

Liver function tests routinely assess ALT, AST, total bilirubin, ALP, GGT, ALB, GLB, and A/G. Some researchers have used ALT and GGT to predict all-cause mortality in the general population and found that predictions using these factors were moderately accurate.44 Previous studies have shown that high levels of GLB are associated with a poor prognosis in patients with breast and rectal cancers.6,45,46 On the other hand, in various tumors including BC, non-small cell lung cancer, renal cell cancer, and laryngeal cancer, increased ALB levels and ALB/GLB ratios (A/G) are often predictive of a good prognosis.6,24–26 Therefore, these liver function parameters can be used as predictors of BC prognosis.

BC patients with baseline hypertriglyceridemia may have poor prognosis.47 Moreover, CHOL is associated with BC recurrence,48 whereas a high HDL/CHOL ratio is associated with a good prognosis.47

Novelty of the Proposed Model

Breast cancer is a multi-factorial disease, and individualized patient profiling is instrumental to improving BC management and individual outcomes.12–14 It is widely accepted that the prognosis of cancer patients depends on not only tumor characteristics but also patient characteristics.49 Thus, we chose to develop a predictive model using both tumor characteristics and patient characteristics. In terms of validity and reliability, we found that our predictive model performed well even upon external validation using an independent data set. C-indexes were 0.997 and 0.936 for the training and external validation sets, respectively, and discriminatory power was good. In the KM analyses, the high-risk and low-risk groups defined by the model differed significantly in recurrence-free survival (P < 0.0001 for both training and validation sets). Further, AUROCs were 0.994 and 0.961 for training and validation sets, respectively, indicating that our model was able to reliably predict BC recurrence. Moreover, the AUROCs of previously reported predictive models ranged from 0.69 to 0.92, suggesting that our model was more accurate than the models previously described.50–53 Compared with published predictive models based on new molecular biomarkers derived from gene or protein expression analysis, our model relies on simple, easy-to-obtain demographic data and routine clinical examination indicators. This means that a high-accuracy prediction may be made without increasing cost to patients. From the perspective of economic cost, using conventional laboratory indicators costs less than using new molecular biomarkers, particularly since measurements of new molecular biomarkers are not routinely conducted in clinical practice. Further, this additional expense is not covered by insurance. 
The model constructed in this study combines selected tumor-related and patient-related features, can be applied without additional cost to patients, and is easy to operate. The model has the potential to help clinicians identify and provide interventions for high-risk patients in the early stage of disease and allow physicians to develop more accurate, targeted, efficient, and individualized treatment plans to improve the prognosis of BC patients.

Our study has some limitations. First, all study participants were of Han descent. Thus, the model has limited applicability to other populations until external validation using data of patients from other regions and ethnicities is performed. Second, this is a cross-sectional study, which created a prediction model based on baseline levels, so timing and causality could not be determined. Thus, studies conducted at multiple centers that include larger cohorts are required.

Conclusions

We developed and validated a model to predict BC recurrence in patients in China. Predictive variables were selected based on data commonly obtained in clinical practice, which would not incur an additional cost to patients. The RSF model exhibited high discriminatory accuracy and good calibration, which may facilitate recurrence prediction. Moreover, by using the model, clinicians can deliver precise and efficient individualized treatment for BC patients to improve their prognosis. This new multiparametric RSF model is instrumental for breast cancer recurrence prediction and potentially improves individual outcomes.