Literature DB >> 33046018

ECCDIA: an interactive web tool for the comprehensive analysis of clinical and survival data of esophageal cancer patients.

Jingcheng Yang1, Jun Shang1, Qian Song1, Zuyi Yang2, Jianing Chen3, Ying Yu4,5,6, Leming Shi7,8,9.   

Abstract

BACKGROUND: Esophageal cancer (EC) is considered as one of the deadliest malignancies with respect to incidence and mortality rate, and numerous risk factors may affect the prognosis of EC patients. For better understanding of the risk factors associated with the onset and prognosis of this malignancy, we develop an interactive web-based tool for the convenient analysis of clinical and survival characteristics of EC patients.
METHODS: The clinical data were obtained from The Surveillance, Epidemiology, and End Results (SEER) database. Seven analysis and visualization modules were built with Shiny.
RESULTS: The Esophageal Cancer Clinical Data Interactive Analysis (ECCDIA, http://webapps.3steps.cn/ECCDIA/ ) was developed to provide basic data analysis, visualization, survival analysis, and nomogram of the overall group and subgroups of 77,273 EC patients recorded in SEER. The basic data analysis modules contained distribution analysis of clinical factor ratios, Sankey plot analysis for relationships between clinical factors, and a map for visualizing the distribution of clinical factors. The survival analysis included Kaplan-Meier (K-M) analysis and Cox analysis for different subgroups of EC patients. The nomogram module enabled clinicians to precisely predict the survival probability of different subgroups of EC patients.
CONCLUSION: ECCDIA provides clinicians with an interactive prediction and visualization tool for visualizing invaluable clinical and prognostic information of individual EC patients, further providing useful information for better understanding of esophageal cancer.

Entities:  

Keywords:  Clinical data mining; Esophageal cancer; Nomogram; SEER; Survival analysis

Mesh:

Year:  2020        PMID: 33046018      PMCID: PMC7552344          DOI: 10.1186/s12885-020-07479-9

Source DB:  PubMed          Journal:  BMC Cancer        ISSN: 1471-2407            Impact factor:   4.430


Background

Esophageal cancer (EC) is considered as one of the most deadly malignancies with respect to incidence and mortality rate [1, 2]. Globally, EC was ranked the seventh for the incidence rate and the sixth for the mortality rate in 2018 [2]. Approximately 17,650 new cases of EC are expected to occur and 16,080 patients are predicted to die from esophageal cancer in the United States in 2019 [1]. Previous studies have revealed numerous risk factors that may affect the prognosis of EC patients [3-6]. Nevertheless, these studies have been outdated and unable to provide an interactive and continuously updated result for researchers and physicians. Population-based studies have been widely utilized to predict patients’ survival outcomes and have played a significant role for clinical decision makers and for the recommendations of guidelines [7]. With the rise of interactive data analysis, there have been many tools to help us understand the molecular characteristics of EC, but there is still a lack of effective interactive web tools based on population statistics data of EC to help us fully understand the risk factors associated with the onset and prognosis of this malignancy. The Surveillance, Epidemiology, and End Results (SEER) database is an authoritative source for cancer statistics with comprehensive clinical and pathological information of cancer cases reported in the United States [8]. Based on SEER data, many studies have been conducted to explore the epidemic, clinicopathologic and prognostic characteristics of EC, and to examine numerous risk factors that might be affected [3-6], but no study provides an interactive visual analysis of all the characteristics of EC data based on the SEER database [9]. Moreover, because SEER data are updated annually, the value of statistical results published from these studies using “outdated data” is somehow limited, resulting in limited usage of these precious data. Clinicians who would like to obtain valuable and updated information on EC prognosis may find it hard to navigate the rich data in SEER in whichever way they want. Herein, we developed a powerful user-friendly web-based platform called Esophageal Cancer Clinical Data Interactive Analysis (ECCDIA) using data on 77,273 EC patients in the SEER Program from 1975 to 2018. ECCDIA is able to provide on-line statistical analysis tools, including clinical factor ratio distribution by year, the Sankey plots presenting relationships between different clinical factors, the survival rate analysis, Kaplan-Meier (K-M) analysis, Cox analysis, and nomograms illustrating prediction of survival probability across subgroups. ECCDIA is an efficient and user-friendly tool to assist researchers and clinicians to understand esophageal cancer using interactive analysis tools which can help users quickly explore data using different visualization approaches. ECCDIA is freely accessible and available at http://webapps.3steps.cn/ECCDIA/.

Methods

Patient data

Patient data were retrieved from the SEER*Stat Version 8.3.5 database named Incidence-SEER 18 Regs Custom Data (with additional treatment fields), Nov 2018 Sub (1973–2016 varying) by using the case-listing session. The International Classification of Diseases for Oncology (ICD-O-3) was utilized to identify patients with esophageal squamous cell carcinoma (ESCC) (ICD-O-3 histologic type: 8050–8089) and esophageal adenocarcinoma (EAD) (ICD-O-3 histologic type: 8140–8389) [10]. The ICD-O-3 site codes for EC were C15.0, 15.1, 15.2, 15.3, 15.4, 15.5, 15.8, and 15.9. As the SEER database is a public one, there is no personal identification information for patients. Patients with diagnosed confirmation of positive histology and those of active follow up were included for analysis. Patients with unknown survival data were excluded. There were 77,273 patients for overall survival (OS) and 52,206 patients for cancer specific death (CSS).

Construction of analysis modules

ECCDIA is a web-based tool constructed with the Shiny framework. It contained seven interactive analysis modules written with R language (Fig. 1). Basic charts, such as bar plot, Sankey plot, line plot and map, were constructed with Plotly [11]. Cox and survival analysis were performed with R packages survival (v2.42–6) [12] and survminer (v0.4.3) [13]. Nomogram was constructed with R package rms (v5.1–2) [14].
Fig. 1

Framework of ECCDIA. Schema describing data processing and data visualization for the ECCDIA. Raw data contains information for 77,273 esophageal cancer (EC) patients. Computation layer contains survival analysis, Cox analysis and nomogram modules. Interactive output contains basic charts, such as bar plot, Sankey plot, line plot and map, etc.

Framework of ECCDIA. Schema describing data processing and data visualization for the ECCDIA. Raw data contains information for 77,273 esophageal cancer (EC) patients. Computation layer contains survival analysis, Cox analysis and nomogram modules. Interactive output contains basic charts, such as bar plot, Sankey plot, line plot and map, etc.

Results

Data summary

Patients with unknown survival data were excluded. There were 77,273 patients including 58,668 males and 18,605 females for overall survival (OS) and 52,206 patients for cancer specific death (CSS). The histological type of 40,683 cases is esophageal adenocarcinoma (EAD), and the other 36,590 cases is esophageal squamous cell carcinoma (ESCC). Based on the 7th edition of AJCC, 3625, 3304, 5018 and 6287 patients were in stages I, II, III and IV, respectively.

Modules of ECCDIA

ECCDIA is a modular interactive tool which mainly contains seven capabilities, including “Clinical Ratio” that analyzes clinical factor ratio distribution by year, “Sankey Plot” that demonstrates the relationship of frequency distribution between different clinical factors, “Survival Rate” that exhibits the changes of survival rate for clinical factors by year, “K-M Analysis” that displays survival curves of OS and CSS for clinical factors, “Cox Analysis” that exhibits univariate and multivariate analysis of OS and CSS for different subgroups of EC patients, “Nomogram” that predicts survival outcome for different subgroups of EC patients, and “Map” that exhibits the distribution of clinical factors in the form of a map of the United States (Fig. 2).
Fig. 2

Overview of ECCDIA analysis modules. Clinical factor ratio aims to find out the trend of different clinical factor ratio distributions by year. Flows of patients provides users with a convenient and intuitive interface for the correlation of different clinical factors. Survival rate exhibits the changes of survival rate for clinical factors by year. The survival analysis module is used to compare the influence of clinical factors on OS and CSS in different subgroups of EC patients. Cox analysis module exhibits univariate and multivariate analysis of OS and CSS for different subgroups of EC patients. The Nomogram module predicts patients’ survival outcome for different subgroups of EC patients

Overview of ECCDIA analysis modules. Clinical factor ratio aims to find out the trend of different clinical factor ratio distributions by year. Flows of patients provides users with a convenient and intuitive interface for the correlation of different clinical factors. Survival rate exhibits the changes of survival rate for clinical factors by year. The survival analysis module is used to compare the influence of clinical factors on OS and CSS in different subgroups of EC patients. Cox analysis module exhibits univariate and multivariate analysis of OS and CSS for different subgroups of EC patients. The Nomogram module predicts patients’ survival outcome for different subgroups of EC patients

Basic data analysis

The first module of ECCDIA aims to find out the trend of different clinical factor ratio distribution by year. As exhibited in Figure S1A, the incidence of EAD increased, while the incidence of ESCC decreased by year. To further investigate whether this trend existed in the subgroup of EC patients, the different subgroups of data could be chosen. Interestingly, we found that male and white patients had a similar trend as the whole group, but the female, black and API (Asian or Pacific Islander) patients did not demonstrate a significant trend change (Figure S1B-F). Additional fascinating results can be found by users using this module of ECCDIA. The flows of patients’ module are to provide users with a convenient and intuitive interface for the correlation of different clinical factors. As demonstrated in Figure S2, most of the white patients were ESCC in 1975. In 1993, the proportion of EAD and ESCC was almost equal in white patients, whereas the majority of white patients were EAD in 2016. Users can perform interactive analysis in this module to find what they may be interested in. The last module of ECCDIA provides users with a map of the United States to show the distribution of clinical factors by state.

Survival analysis

The survival analysis mainly contains three modules, including survival rate, K-M analysis, and Cox analysis. The survival rate module has the ability to show EC patients survival rate fluctuation by year. We showed in Figures S3A and S3B how to perform survival analysis quickly and easily. ECCDIA allows us to quickly find that there were clear differences in survival among different histopathological types and ethnicities. In Figure S4A, before 1989, there exhibited an obvious fluctuation of survival rate between EAD and ESCC. Nevertheless, EAD patients tended to have a consistently higher survival rate than ESCC patients after 1989. The K-M analysis module provides intuitive figures for users who would like to compare the impact of clinical factors on OS and CSS in different subgroups of EC patients. For instance, Figure S4B-D exhibited a comparison of the impact of histologic types on patients’ OS. Regardless of male or female subgroup, EAD patients had a much better OS than ESCC patients. The Cox analysis module demonstrates tables for the results of univariate and multivariate analysis of OS and CSS for different subgroups of EC patients. This module is also user-friendly and provides users with interactive tables.

Nomogram

The nomogram module can precisely predict 1-year, 3-year, and 5-year survival probabilities of OS and CSS for all patients, ESCC patients, EAD patients, stages I-II patients, stages III-IV patients, and patients undergoing surgery plus chemotherapy and radiation therapy. For instance, for an EC patient who was 20 years old and had three positive regional nodes with grade I and stage I, the 1-year, 3-year, and 5-year survival probabilities were 96.82, 90.19, and 86.25%, respectively (Figure S5). At the same time, we show that there is a good agreement between the nomogram-based survival rates and the actual survival rate by using calibration plots (Figure S5C). Furthermore, the agreement between predicted and observed 1, 3, 5-year survival rates shown with calibration curves was verified with clinical data associated with the TCGA esophageal cancer data set (Figure S6).

Map

“Map”, the last module of ECCDIA, provides users with a map of the United States to show the distribution of clinical factors by state. The distribution of the different survival rates is displayed in Data exploration and Interactive map sections. These functions can easily visualize the survival rate in different states for the EC patients (Table 1).
Table 1

The survival rates of esophageal cancer patients in different states

StateNum_PatientsAverage_ageAverage_tumor_size(mm)1-year survival rate2-year survival rate3-year survival rate4-year survival rate5-year survival rate
Alaska12268.3350.190.420.260.190.160.13
California25,22564.4149.440.400.230.170.140.13
Connecticut683265.7448.400.410.260.190.160.14
Georgia783267.8845.130.480.310.240.200.17
Hawaii166965.8048.690.400.230.160.130.12
Iowa498367.8651.870.420.260.200.160.15
Kentucky348468.0350.490.430.270.210.180.16
Louisiana339465.9047.010.420.270.220.180.17
Michigan762465.0748.310.440.270.210.170.15
New Jersey676267.4953.910.390.220.170.130.11
New Mexico189267.5847.700.420.240.170.140.12
Utah150367.2755.900.340.180.130.100.08
Washington595164.1855.430.440.280.230.220.22
The survival rates of esophageal cancer patients in different states

Discussion

This study provides an interactive web tool that analyzes rich clinical and prognostic data of EC patients from the SEER database. Our tool is able to provide clinicians and clinical decision makers with useful information to make suitable treatment plan for EC patients with no need to refer to a large number of research papers. The SEER database has such a large collection of cancer patients’ clinicopathologic and prognostic data so that it holds a great potential to conduct robust statistical mining to gain the most powerful and reliable survival prediction for cancer patients. However, previously published researches using the SEER database present analyses in only one aspect or the other and do not make full use of the comprehensive information in SEER. ECCDIA makes the most use of EC data of the SEER database and presents these data in a user-friendly interactive interface with no need to grasp computational programming skills. It can easily exhibit the clinicopathologic and prognostic analysis for a variety of subgroups of EC patients. To the best of our knowledge, the online tool ECCDIA is the first such system that demonstrates the most comprehensive integrative analysis of clinical data with the full utilization of EC data in the SEER database. More importantly, to facilitate clinical use of this online tool, nomograms predicting the prognosis of different subgroups of EC patients are provided are provided by ECCDIA. Using ECCDIA, clinicians can immediately obtain the survival probability of patients by simply inputting the values of clinical factors, which helps them make the right decision for EC patients. There are already many good online tools for esophageal cancer. For example, both OSescc and OSeac are great tools that use gene expression data of esophageal cancer patients in public databases to quickly query the correlation between the expression level of a gene and patient prognosis [15, 16]. Briefly, compared with OSescc and or OSeac, ECCDIA is committed to creating dynamic interactive visualization tools to explore the epidemiological characteristics of esophageal cancer in the SEER database. In addition to survival analysis, ECCDIA can also dynamically and interactively display the epidemiological characteristics of esophageal cancer patients spanning 20 years. Both gene expression data and epidemiological characteristics can provide complementary information for better understanding esophageal cancer. Some limitations of ECCDIA need to be mentioned. ECCDIA does not integrate the molecular or genetic data of EC patients with their clinical data, since the SEER database only provides the clinical data of cancer patients. In addition, some treatment biases are present. Therefore, additional future work is needed by combining the data in the SEER database with other publicly available databases. Nevertheless, ECCDIA is the first interactive web tool to assess the largest clinical and prognostic data of EC patients, which will become an invaluable resource for clinical guideline of EC. Besides, ECCDIA will be updated to embrace the newest data released by SEER.

Conclusion

The Esophageal Cancer Clinical Data Interactive Analysis (ECCDIA, http://webapps.3steps.cn/ECCDIA/) is the first interactive prediction and visualization web tool to assess the largest clinical and prognostic data of EC patients from the SEER database, further increasing the assessment of clinical guidelines for EC. Furthermore, ECCDIA will be regularly updated to embrace the newest data to be released by SEER. Additional file 1: Figure S1. Histologic type ratio distribution by year. Figure S2. The patient flows between histologic type and race. Figure S3. Survival analysis example. Figure S4. The relationship between histologic type and survival. Figure S5. Survival rate prediction. Figure S6. Survival prediction external verification.
  11 in total

1.  A comparison of observational studies and randomized, controlled trials.

Authors:  K Benson; A J Hartz
Journal:  N Engl J Med       Date:  2000-06-22       Impact factor: 91.245

2.  Cancer statistics, 2019.

Authors:  Rebecca L Siegel; Kimberly D Miller; Ahmedin Jemal
Journal:  CA Cancer J Clin       Date:  2019-01-08       Impact factor: 508.702

3.  Clinical and Prognostic Features of Patients With Esophageal Cancer and Multiple Primary Cancers: A Retrospective Single-institution Study.

Authors:  Yoshifumi Baba; Naoya Yoshida; Koichi Kinoshita; Masaaki Iwatsuki; Yo-Ichi Yamashita; Akira Chikamoto; Masayuki Watanabe; Hideo Baba
Journal:  Ann Surg       Date:  2018-03       Impact factor: 12.969

4.  Treatment modalities for T1N0 esophageal cancers: a comparative analysis of local therapy versus surgical resection.

Authors:  Mark F Berry; Josiane Zeyer-Brunner; Anthony W Castleberry; Jeremiah T Martin; Beat Gloor; Ricardo Pietrobon; Thomas A D'Amico; Mathias Worni
Journal:  J Thorac Oncol       Date:  2013-06       Impact factor: 15.609

5.  Trends in esophageal cancer survival in United States adults from 1973 to 2009: A SEER database analysis.

Authors:  Basile Njei; Thomas R McCarty; John W Birk
Journal:  J Gastroenterol Hepatol       Date:  2016-06       Impact factor: 4.029

6.  Influence of sex on the survival of patients with esophageal cancer.

Authors:  Pierre Bohanes; Dongyun Yang; Ruchika S Chhibar; Melissa J Labonte; Thomas Winder; Yan Ning; Armin Gerger; Léonor Benhaim; David Paez; Takeru Wakatsuki; Fotios Loupakis; Rita El-Khoueiry; Wu Zhang; Heinz-Josef Lenz
Journal:  J Clin Oncol       Date:  2012-05-14       Impact factor: 44.544

Review 7.  Esophageal cancer: A Review of epidemiology, pathogenesis, staging workup and treatment modalities.

Authors:  Kyle J Napier; Mary Scheerer; Subhasis Misra
Journal:  World J Gastrointest Oncol       Date:  2014-05-15

8.  Global cancer statistics 2018: GLOBOCAN estimates of incidence and mortality worldwide for 36 cancers in 185 countries.

Authors:  Freddie Bray; Jacques Ferlay; Isabelle Soerjomataram; Rebecca L Siegel; Lindsey A Torre; Ahmedin Jemal
Journal:  CA Cancer J Clin       Date:  2018-09-12       Impact factor: 508.702

9.  Interactive online consensus survival tool for esophageal squamous cell carcinoma prognosis analysis.

Authors:  Qiang Wang; Fengling Wang; Jiajia Lv; Junfang Xin; Longxiang Xie; Wan Zhu; Yitai Tang; Yongqiang Li; Xiaofang Zhao; Yunlong Wang; Xianzhe Li; Xiangqian Guo
Journal:  Oncol Lett       Date:  2019-06-05       Impact factor: 2.967

10.  OSeac: An Online Survival Analysis Tool for Esophageal Adenocarcinoma.

Authors:  Qiang Wang; Zhongyi Yan; Linna Ge; Ning Li; Mengsi Yang; Xiaoxiao Sun; Longxiang Xie; Guosen Zhang; Wan Zhu; Yunlong Wang; Yongqiang Li; Xianzhe Li; Xiangqian Guo
Journal:  Front Oncol       Date:  2020-03-06       Impact factor: 6.244

View more

北京卡尤迪生物科技股份有限公司 © 2022-2023.