Matthew Hueman, Huan Wang, Donald Henson, Dechang Chen.
Abstract
OBJECTIVE: The American Joint Committee on Cancer (AJCC) system for staging cancers of the colon and rectum includes depth of tumour penetration, number of positive lymph nodes and presence or absence of metastasis. Using machine learning, we demonstrate that these factors can be integrated with age, carcinoembryonic antigen (CEA) interpretation and tumour location to form prognostic systems that expand the tumour, lymph node, metastasis (TNM) staging system.
Keywords: C-index; colorectal cancer; dendrogram; machine learning; staging
Year: 2019 PMID: 31275615 PMCID: PMC6579577 DOI: 10.1136/esmoopen-2019-000518
Source DB: PubMed Journal: ESMO Open ISSN: 2059-7029
Figure 1. Ensemble Algorithm for Clustering Cancer Data prognostic groups and American Joint Committee on Cancer (AJCC) stages on the basis of dataset 1 involving T, N and M. (A) The tree-structured dendrogram (in black). A 3-year cancer-specific survival rate is given beneath each combination. Cutting the dendrogram according to n*=10 in (B) creates 10 prognostic groups, shown in red boxes. The group numbers are listed at the bottom. (B) C-index curve based on the dendrogram in (A). The number 0.7802 is the C-index corresponding to n*=10 prognostic groups. (C) Cancer-specific survival of the 10 prognostic groups in (A). The 3-year cancer-specific survival rates for the 10 groups are listed on the right side. (D) Cancer-specific survival of the 10 AJCC stages. The 3-year cancer-specific survival rates for the 10 stages are listed on the right side.
Figure 2. Dendrogram for T, N, M, A (age), C (CEA interpretation) and L (tumour location) on dataset 2, cut to produce 10 groups. Running the Ensemble Algorithm for Clustering Cancer Data results in the tree-structured dendrogram (in black). A 5-year cancer-specific survival rate, as a percentage, is given beneath each combination. Red boxes show the dendrogram cut into 10 prognostic groups according to n*=10 for the red curve {T, N, M, A, C, L} in figure 3 (A). The group numbers are listed at the bottom.
Figure 3. C-index curves and survival curves on the basis of dataset 2. (A) C-index curves of {T, N, M}, {T, N, M, L}, {T, N, M, C}, {T, N, M, A}, {T, N, M, C, L}, {T, N, M, A, L}, {T, N, M, A, C} and {T, N, M, A, C, L}. The optimal number of groups n*, determined using the red curve of {T, N, M, A, C, L}, equals 10, with a corresponding C-index of 0.7914. (B) Cancer-specific survival of the 10 prognostic groups in figure 2. The 5-year cancer-specific survival rates for the 10 groups are listed on the right side.
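The C-index values quoted in the captions measure how well an ordered grouping ranks patients by survival. As a rough illustration only, the sketch below computes Harrell's concordance index for right-censored data in plain Python. It assumes that the group number serves as the risk score (a higher number meaning a worse prognosis) and that pairs with tied follow-up times are skipped. The record does not state which C-index variant the authors used, and the function name `harrell_c_index` and the toy data are hypothetical.

```python
from itertools import combinations


def harrell_c_index(times, events, risk_scores):
    """Harrell's concordance index for right-censored survival data.

    times       -- follow-up time per patient
    events      -- 1 if the event (cancer death) was observed, 0 if censored
    risk_scores -- higher value = worse predicted prognosis
                   (assumption: the prognostic group number is used here)
    """
    concordant = 0.0
    comparable = 0
    for i, j in combinations(range(len(times)), 2):
        if times[i] == times[j]:
            continue  # simplifying assumption: pairs with tied times are skipped
        # Let `a` be the patient with the shorter follow-up time.
        a, b = (i, j) if times[i] < times[j] else (j, i)
        if not events[a]:
            continue  # shorter time is censored, so the pair is not comparable
        comparable += 1
        if risk_scores[a] > risk_scores[b]:
            concordant += 1.0  # higher risk went with shorter survival
        elif risk_scores[a] == risk_scores[b]:
            concordant += 0.5  # tied risk scores count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Hypothetical toy data: five patients assigned to three prognostic groups.
times = [5, 10, 15, 20, 25]
events = [1, 1, 0, 1, 0]
groups = [3, 3, 2, 1, 1]
print(harrell_c_index(times, events, groups))  # 0.875
```

A value of 0.5 corresponds to a ranking no better than chance and 1.0 to a perfectly concordant ranking, which gives a scale for reading the reported 0.7802 and 0.7914.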
Figure 4. Profiles of factor levels across the prognostic groups shown in figure 2. Each panel covers one factor; for each level of that factor, the distribution of the associated patients across the 10 groups (10 proportions) is presented. The number shown for each distribution is its maximum proportion, placed at the group where that maximum is attained.
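The profile described in this caption amounts to simple arithmetic: for one level of a factor, count its patients in each prognostic group, divide by the total for that level, and report the largest proportion. The sketch below illustrates this under the assumption that each patient record carries a factor level and a group number from 1 to n. The function names `level_profiles` and `peak` and the toy data are hypothetical and are not taken from the paper.

```python
from collections import Counter


def level_profiles(levels, groups, n_groups):
    """Map each factor level to its proportions across groups 1..n_groups."""
    counts = {}
    for level, group in zip(levels, groups):
        counts.setdefault(level, Counter())[group] += 1
    profiles = {}
    for level, counter in counts.items():
        total = sum(counter.values())
        # Proportion of this level's patients that fall in each group.
        profiles[level] = [counter[g] / total for g in range(1, n_groups + 1)]
    return profiles


def peak(profile):
    """Return (group number, maximum proportion) for one level's profile."""
    best = max(range(len(profile)), key=lambda k: profile[k])
    return best + 1, profile[best]


# Hypothetical toy data: six patients, one factor (T), three groups.
levels = ["T1", "T1", "T1", "T1", "T4", "T4"]
groups = [1, 1, 1, 2, 3, 3]
profiles = level_profiles(levels, groups, n_groups=3)
print(profiles["T1"])        # [0.75, 0.25, 0.0]
print(peak(profiles["T1"]))  # (1, 0.75)
```

Each profile sums to 1, so a large peak value indicates that patients with that level are concentrated in a single prognostic group.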