A narrative review of prognosis prediction models for non-small cell lung cancer: what kind of predictors should be selected and how to improve models?

Yuhang Wang¹, Xuefeng Lin², Daqiang Sun¹,³.

Abstract

OBJECTIVE: To discover potential predictors and explore how to build better models by summarizing the existing prognostic prediction models of non-small cell lung cancer (NSCLC).
BACKGROUND: Research on clinical prediction models for NSCLC has grown explosively in recent years. As more prognostic predictors are discovered, the choice of predictors for model building becomes particularly important, and with the wider application of next-generation sequencing, gene-related predictors are now widely used. Because samples and follow-up data are more convenient to obtain, researchers tend to prefer prognostic models.
METHODS: PubMed and the Cochrane Library were searched using the terms "NSCLC", "prognostic model", "prognosis prediction", and "survival prediction" from 1 January 1980 to 5 May 2021. Reference lists of the retrieved articles were reviewed to identify further relevant articles.
CONCLUSIONS: The performance of gene-related models has not obviously improved. Establishing a highly stable model that is convenient for clinical application is more important than the innovation and diversity of predictors. Most of the prevalent models are highly biased, and referring to PROBAST at the beginning of a study may substantially control that bias. Existing models should be validated in a large external dataset to allow a meaningful comparison.

2021 Annals of Translational Medicine. All rights reserved.

Keywords: Non-small cell lung cancer (NSCLC); PROBAST; prediction model; prognosis

Year: 2021 PMID: 34790803 PMCID: PMC8576716 DOI: 10.21037/atm-21-4733

Source DB: PubMed Journal: Ann Transl Med ISSN: 2305-5839

Introduction

Although the ranking of the most common cancers has changed, lung cancer remains the leading cause of cancer-related death in the world, with a mortality of 22% in men and 13.8% in women in 2018 (1-3). The 5-year relative survival rate for lung cancer is also poor and is estimated at 21% in 2021 for all stages combined (1). According to the WHO classification, lung cancer is histologically divided into two subtypes: non-small cell lung cancer (NSCLC) and small cell lung cancer (SCLC), with the former accounting for 85% and the latter for 15% (4). In recent years, therapy for NSCLC has progressed with the rise of targeted therapy, immunotherapy, and personalized treatment. This therapeutic approach, emphasized by the American Joint Committee on Cancer, has increased the importance of prognostic prediction in helping clinicians make the most effective treatment plan for patients (5,6). At present, prognosis prediction for NSCLC is based on the tumor, node, and metastasis (TNM) staging system (7), which stratifies patients into four stages according to clinical and pathological characteristics. However, the staging system is too broad to make a prognosis prediction that can guide treatment. A clinical prediction model, also known as a “clinical prediction rule” or “risk score”, is a tool that incorporates multiple predictors to predict the risk of some event (8). Such models have developed explosively in recent years and have been applied to the detection, diagnosis, and prognosis of NSCLC. Clinical prediction models fall into two subtypes: the diagnostic model and the prognostic model. The diagnostic model focuses on estimating the risk of developing a disease based on the epidemiological and clinical features of patients. Some guidelines, such as those of the National Comprehensive Cancer Network (NCCN) and the United States Preventive Services Task Force (USPSTF) (9,10), have recommended or applied risk prediction models for the detection of NSCLC.
The prognostic model focuses on the risk of disease recurrence, death, disability, and complications in the current state of the disease. An increasing number of prognostic predictors have been found to have potential for clinical application, and numerous models have been developed and validated. Compared to the TNM staging system, a prognosis prediction model can improve the accuracy of prediction and guide personalized therapy by combining multiple prognostic factors. Since the 1980s, many large cohort studies (11-13) have been performed to develop and validate risk prediction models for the screening of NSCLC because of their high effectiveness compared to the NLST criteria (14). However, large cohort studies of prognosis prediction models for NSCLC are lacking, and the small sample sizes of existing models may reduce their persuasiveness. Although there are many studies on NSCLC, few reviews of clinical prediction models for NSCLC focus on the methodology of model establishment. This paper focuses on the selection of predictors and the control of model bias to provide more ideas and suggestions for modelers. In this article, we review existing lung cancer prediction models with an emphasis on those with utility for prognosis, analyze the urgent problems to be solved in the field, and suggest novel approaches to the construction of models. The aim is to inform readers of the kinds of predictors that should be selected to build particular models at a time when greater attention is being paid to this method. We present the following article in accordance with the Narrative Review reporting checklist (available at https://dx.doi.org/10.21037/atm-21-4733).

Methods

PubMed and the Cochrane Library were searched using the terms “NSCLC”, “prognostic model”, “prognosis prediction”, “survival prediction”, and “PROBAST” from 1 January 1980 to 5 May 2021. Reference lists of the retrieved articles were reviewed to identify further relevant articles. Non-English language articles and abstracts were excluded.

Screening of predictors

Screening of the predictors is fundamental to the development of prediction models. The predictive value of several classic predictors, such as the TNM staging system, WHO-PS/ECOG-PS, and pathological classification has been widely validated (4,5,15,16). These traditional predictors have been included as the classification standard of prognosis stratification and treatment decision making by guidelines and expert consensus (17,18). Hence, we consider these factors should be routinely incorporated in the prognostic prediction models of NSCLC. The screening, diagnosis, and treatment of cancers has been promoted to the molecular genetic stage, benefiting from the development of high-throughput techniques such as next-generation sequencing and microarray (19-21). The classification of lung cancer has been further sub-classified from pathological classification to molecular classification based on driver genes (4). The treatment of lung cancer has developed to a comprehensive treatment based on pathological classification, stage, and molecular classification. The extraordinary development and application of genomics and epigenomics is defining reproducible and scalable prediction models (22), and increasingly, modelers have incorporated gene-related biomarkers obtained from the aspect of genomics and epigenomics such as the expression of DNA/RNA, somatic mutations, and DNA methylation. Accordingly, we have conducted a special review and summary in this aspect. On the other hand, the progress of prediction models depends on the discovery of innovative predictors, and we have summarized some factors that have not been incorporated or have been rarely used in models which we believe may hold some prediction value.

Predictors that should be incorporated routinely

TNM staging system

The TNM staging system combines three important factors, and most prognostic models take it into consideration when screening for predictors. Although some models include only one component, such as T stage or the number of positive lymph node stations, almost all research involving univariate and multivariate analysis in the field of NSCLC incorporates the TNM system (23-26). Oberije et al. developed a survival prediction model with a dataset of 548 stage III NSCLC patients (23). After predictor screening, they constructed a multivariate model (MV model) consisting of age, gender, number of positive lymph node stations, gross tumor volume, and three other factors. Compared with survival prediction based on TNM staging alone, the MV model had better discrimination (C-statistic: 0.62 vs. 0.57). She et al. developed a deep learning model with training data of 12,912 samples that incorporated 127 predictors, including the TNM staging system (26). They also developed a model including only the TNM staging system, and the deep learning model outperformed the TNM model in C-statistic (0.739 vs. 0.706).
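Because the comparisons above rest on the C-statistic, a minimal sketch of how Harrell's concordance index can be computed for right-censored survival data may help. The function name, the toy follow-up data, and the choice to skip pairs with tied follow-up times are illustrative assumptions and are not taken from the cited studies.

```python
from itertools import combinations


def harrell_c(times, events, risk_scores):
    """Harrell's concordance index for right-censored data (illustrative sketch).

    times: follow-up time per patient
    events: 1 if the event (e.g. death) was observed, 0 if censored
    risk_scores: higher score means higher predicted risk
    Assumption: pairs with tied follow-up times are skipped for simplicity.
    """
    concordant = 0.0
    comparable = 0
    for i, j in combinations(range(len(times)), 2):
        # Order the pair so that patient `a` has the shorter follow-up.
        a, b = (i, j) if times[i] < times[j] else (j, i)
        # A pair is comparable only if the shorter follow-up ended in an event.
        if times[a] == times[b] or not events[a]:
            continue
        comparable += 1
        if risk_scores[a] > risk_scores[b]:
            concordant += 1.0   # model ranked the earlier death as higher risk
        elif risk_scores[a] == risk_scores[b]:
            concordant += 0.5   # tied predictions count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Hypothetical toy cohort: 5 patients, 2 of them censored.
toy_times = [5, 10, 12, 20, 25]
toy_events = [1, 1, 0, 1, 0]
toy_risk = [0.9, 0.4, 0.6, 0.3, 0.1]
toy_c = harrell_c(toy_times, toy_events, toy_risk)
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect ranking, which is why differences such as 0.62 vs. 0.57 are read as modest gains in discrimination.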

WHO-PS

WHO-PS, also known as the Eastern Cooperative Oncology Group Performance Status Scale (ECOG-PS), is mainly used to assess the functional status of cancer patients from the clinical manifestation of tumors, patient activity, and the proportion of time spent in bed. Although WHO-PS is a subjective factor that tends to produce heterogeneous groups, it is increasingly applied to prediction models of cancer (23,27). Many researchers have suggested that WHO-PS 2 is a precise indicator of a bad prognosis (16,28). For patients with WHO-PS ≥3, guidelines recommend best supportive care, so most researchers treat WHO-PS ≥3 as an exclusion criterion (28,29). Prelaj et al. (27) conducted a retrospective study of 154 patients with advanced NSCLC receiving immunotherapy to find prognostic predictors and develop an effective prognostic scoring system. The study highlighted the negative prognostic role of WHO-PS for OS and PFS. The researchers divided patients into a WHO-PS 0–1 group and a WHO-PS 2 group. The results of the multivariate analysis suggested that WHO-PS had the worst prognostic significance for OS (HR =4.85, CI: 2.87–8.20) and PFS (HR =2.20, CI: 1.46–3.131) among the five variables finally incorporated, indicating that patients with a poorer performance status (higher WHO-PS) might be less likely to benefit from immunotherapy, possibly due to their poorer immune function. While WHO-PS may be a very promising predictor, both before and after treatment, there are some defects in its application. Many studies have highlighted the subjectivity of WHO-PS and its variability between evaluators (30-32), and the question of how to standardize its evaluation needs to be solved.

Pathological classification

The pathological classification of lung cancer has become increasingly detailed with the development and popularization of immunotherapy and targeted therapy. The 2015 WHO classification of lung tumors provides more information on improving outcomes than the 2004 version, and at present, almost all clinical prediction models of lung cancer incorporate pathological classification or use it as a grouping factor for subgroup analysis. However, the influence of the proportion of various subtypes on prognosis, such as whether the proportion of micropapillary and solid pattern is correlated with prognosis, is of great concern (33). It has been widely shown that the proportion of micropapillary pattern is an independent prognostic predictor for OS and is associated with poor outcome (34-36), and some studies demonstrated that the mere presence of a micropapillary pattern is a poor prognostic factor: even when the micropapillary proportion is only 1%, the prognosis is worse than that of patients without a micropapillary component (37,38). This suggests that, whether as a dichotomous or a continuous variable, micropapillary pattern can be incorporated into models as a very good predictor. Additionally, ground-glass opacity (GGO) of the lung and spread of tumor through alveolar spaces (STAS) are both potential predictors to be considered (39,40). The histological classification of squamous cell carcinoma is less specific than that of adenocarcinoma (41). According to the new WHO pathological classification, squamous cell carcinoma is only classified into keratinizing and non-keratinizing types, a distinction that is not significant for prognostic prediction. Therefore, more study of the histological classification of squamous cell carcinoma is required to determine its prognostic and treatment value.

Biomarker roles of gene-related information in lung cancer prognosis and their practical applications in models

Genomic biomarkers for NSCLC prognosis

Genomics has played a significant part in the prognosis and treatment management of patients with NSCLC by clarifying the role of driver genes and providing information on mutations and gene expression (42-44). In NSCLC, any known somatic mutation is associated with a higher risk of disease recurrence than NSCLC with no mutations, regardless of the type and stage of lung cancer, although the mechanism by which this occurs remains unclear (45). Researchers have explored the prognostic value of somatic mutations, and many studies have revealed the prognostic role of some individual mutations. For example, mutations of EGFR (46,47), TP53 (48), and KRAS (49,50) have all been linked to a poor prognosis in NSCLC patients. Additionally, EGFR, ALK, ROS1, and BRAF V600E have been defined as driver genes by the NCCN guidelines because of their significant roles in the prognosis of NSCLC (9). In recent years, researchers have paid more attention to the prognostic value of co-occurring somatic mutations. For example, Jao and colleagues showed that multiple mutations were significantly associated with worse DFS but not OS, and that multiple mutations including TP53 presented an additional risk to prognosis (51). In a different study, the same authors revealed the prognostic value of TP53 and EGFR co-mutations with data collected from 1,441 NSCLC patients (45). The results suggested that TP53 mutation is more frequent in EGFR-mutated NSCLC patients than in EGFR wild-type patients, and that in patients with mutated EGFR, TP53 mutation is a negative prognostic factor, which highlighted the prognostic value of this co-mutation. However, some contradictory results have emerged, including those of the study by Labbé et al., which showed that TP53 co-mutation in resected EGFR-mutated NSCLC patients had no significant association with either OS or PFS (52), although the small sample size may have skewed the results.
Somatic mutation data are usually entered into prediction models as “0” or “1”, with “0” representing no mutation and “1” representing mutation. Most studies use gene-level mutation information to build models, such as EGFR [0/1], but more and more researchers have begun to pay attention to the different effects of different mutation sites within the same gene, such as EGFR c.2573T>G [0/1] (53,54). Another significant prognostic biomarker resulting from genomic analysis is tumor mutational burden (TMB), which is defined as the number of nonsynonymous somatic mutations per megabase (Mb) of sequenced region and is especially relevant in patients receiving immunotherapy (55). TMB is a comprehensive manifestation of copy number alterations and somatic mutations. A study by Devarakonda et al. reported that high TMB (>8) had a strong positive effect on the prognosis of patients with lung cancer after resection, and that patients with low TMB (<4) were more likely to benefit from adjuvant chemotherapy (56). Additionally, the value of TMB as a powerful biomarker in immune checkpoint inhibitor therapy has been proposed (57-59). Nevertheless, the gold standards for TMB calculation are whole-genome sequencing and whole-exome sequencing, which are time-consuming and costly and place high demands on the quality and quantity of tissue samples. Recently, Tian and colleagues constructed a TMB estimation model with only 23 genes that correlated well with whole-exome sequencing TMB (60), and the TMB estimated by the 23-gene panel was shown to correlate significantly with DFS and OS in patients with early-stage NSCLC. The details of TMB calculation are complicated and are not described here (61,62); TMB enters a model as its numeric value. Advances in liquid biopsy provide a significant opportunity for acquiring genomic information from circulating tumor DNA (ctDNA). Circulating cell-free DNA (cfDNA) is a substance found in blood and body fluids.
The amount of cfDNA in cancer patients is much higher than in healthy people, most of which is produced in tumor tissue where it is referred to as ctDNA. ctDNA enters the periphery with necrotic or apoptotic tumor cells and has been proposed as having a significant prognosis value in NSCLC (63,64). On the other hand, it contains the same information as tumor tissue, such as tumor-specific somatic alterations and gene expression (65), and can be accessed with minimally invasive methods and in real time (66). Wang et al. conducted a study to explore whether TMB estimated by ctDNA in blood is associated with clinical outcomes in NSCLC patients with immunotherapy (63) by constructing a cancer gene panel including 150 genes and named this NCC-GP150, to estimate blood TMB and measured tissue TMB by whole-exome sequencing. The results suggested that blood TMB estimated by a small set of genes in ctDNA had a stable correlation with tissue TMB measured by whole-exome sequencing and was associated with superior PFS. This indicated that the clinical limitations of genomics have been largely addressed by liquid biopsy.
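The two model inputs described in this section, 0/1 mutation indicators and the numeric TMB value, can be sketched as follows. This is a simplified illustration under stated assumptions: the function names, the three-gene panel, and the patient values are hypothetical, and the TMB function applies only the per-megabase definition given above, not the variant filtering that real pipelines require.

```python
def encode_mutations(mutated_genes, panel):
    """Return a 0/1 indicator per panel gene (1 = mutated, 0 = no mutation)."""
    mutated = set(mutated_genes)
    return {gene: int(gene in mutated) for gene in panel}


def tumor_mutational_burden(n_nonsynonymous, covered_bases):
    """Nonsynonymous somatic mutations per megabase of sequenced region.

    Assumption: the caller has already filtered variants; this sketch only
    applies the per-Mb normalization.
    """
    if covered_bases <= 0:
        raise ValueError("covered_bases must be positive")
    return n_nonsynonymous / (covered_bases / 1_000_000)


# Hypothetical patient: EGFR and TP53 mutated, KRAS not mutated.
panel = ["EGFR", "TP53", "KRAS"]
features = encode_mutations(["EGFR", "TP53"], panel)

# Hypothetical counts: 240 nonsynonymous mutations over 30 Mb of sequence.
features["TMB"] = tumor_mutational_burden(240, 30_000_000)
```

Site-level indicators such as a specific EGFR variant can be added in the same way by treating the variant name as one more panel entry.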

Epigenetic biomarkers for NSCLC patients

The wide diversity of tumor mutations is a great challenge for researchers developing effective biomarkers for diagnosis or prognosis because large proportions of the genome need to be examined to provide adequate sensitivity (67). Epigenetic alterations are a good substitute because of their better stability and homogeneity in cancer (68). The most studied epigenetic biomarkers are DNA methylation, histone modifications, chromatin remodeling complexes, and long non-coding RNA (lncRNA) (22,69). Such a biomarker enters a model as its expression level. DNA methylation, the most common epigenetic alteration, is also the most widely studied. It has been shown that aberrant methylation in the promoter of a tumor suppressor gene (TSG) could eliminate its function and promote carcinogenesis (70,71). Although DNA methylation is thought to occur at the early stages of carcinogenesis, researchers have found that some specific genes are also methylated at different tumor stages (72). Many methylated TSG promoters have been shown to be associated with a worse prognosis of NSCLC, such as APC, STXBP6, and RASSF1 (73-76), and many detection panels based on DNA methylation biomarkers have also been shown to be suitable for prognosis prediction (77-79). In addition, researchers have acquired DNA methylation information from liquid biopsy. There is considerable evidence that gene methylation detection in serum, plasma, and sputum is an effective prognostic prediction tool (22,68), and Ooki et al. have demonstrated that the methylation of gene sequences in malignant pleural effusion and malignant ascites is also significant for the prognosis of tumors (77). Thus far, only a few studies with small samples have reported the prognostic and monitoring effect of ctDNA methylation in NSCLC, and most were performed in patients with advanced disease.
However, many studies have found that the level of gene methylation seems to be closely related to the prognosis of patients with NSCLC treated with neoadjuvant chemotherapy and surgery combined with radiotherapy (68). lncRNAs are non-protein-coding RNA molecules composed of more than 200 nucleotides and are often expressed in a spatial, temporal, and tissue-specific pattern (80). In recent years, meta-analyses and other studies have shown that the aberrant expression of lncRNA is a significant prognostic biomarker in a variety of cancers including NSCLC (81). For example, linc00673 could influence the prognosis of NSCLC by regulating its proliferation, migration, invasion, and epithelial-mesenchymal transition (82), and by promoting aerobic glycolysis (83). At the same time, some prognostic models based on lncRNAs have also been developed, including a seven-lncRNA signature to predict OS for patients with early-stage NSCLC (84), and a four-lncRNA signature to predict the prognosis of patients with NSCLC recently developed by a multicenter study in China (85). Around 3,000 lncRNAs have been discovered to have numerous biological functions in cell growth, differentiation, and disease progression (86), and more are expected to be identified as tumor prognostic biomarkers in the near future. Similarly, histone modifications and chromatin remodeling could also provide valuable prognostic biomarkers for NSCLC. Some studies have shown that epigenetic changes involving multiple histones, especially H2A and H3, have great prognostic value for early NSCLC (86), while others have demonstrated that the low expression of BRM and BRG1, which are contained in two functionally complementary chromatin remodeling complexes, was associated with a poor prognosis in NSCLC (87,88). However, to date, few prediction models incorporate these two epigenetic modifications, perhaps because such modifications are hard to quantify.

The practical application of gene-related biomarkers in prognosis prediction models

Increasingly, evidence has shown that molecular biomarkers can greatly benefit prognostic prediction. Molecular changes in tumors occur when tumors are small and hard to detect, so models that combine genetic and non-genetic factors will have better biological accuracy (79). Moreover, the continuing development of sequencing technology makes gene information more convenient to apply. Most studies to date have preferred constructing a prediction model with data from a single omics. Genomic and epigenomic characteristics have become increasingly complex with the discovery of more driver genes and signaling pathways, as have those of other omics (22,51). Compared with a single prognostic marker, integrating multiple prognostic markers into a simple prediction model might effectively improve the predictive value for prognosis. For example, when making a prognosis prediction based on somatic mutations, the risk can be stratified more accurately by integrating multiple mutations than by predicting with only a single gene mutation. In addition to gene-level mutation status, specific mutation sites could also be incorporated into models (51). Similarly, integrating different kinds of biomarkers to build predictive models should also make predictions more accurate, but may also involve more cost to patients. Many researchers have commented on the issue of cost-effectiveness. Gray et al. (89) performed a meta-analysis on lung cancer risk prediction models and found that incorporating genetic information improved discrimination only slightly over epidemiological models, while remaining costly and time consuming. As there are no systematic reviews on lung cancer prognostic prediction models, we conducted a literature screening to compare the performance of those incorporating gene-related biomarkers with those that do not (Table 1). The details of the literature screening process are summarized in Figure 1. Although only some models were compared, it appears that incorporating gene-related biomarkers does not significantly improve the performance of prediction models.

Table 1

Discrimination performance of the prediction models screened by the systematic review

Title	Reference	Gene-related	Training sample size	C-statistics in training set	Test sample size	C-statistics in validation set
The development and external validation of an overall survival nomogram in medically inoperable centrally located early-stage non-small cell lung carcinoma	Duijm et al. (90)	No	220	0.640	92	0.620
A nomogram based on CT deep learning signature: a potential tool for the prediction of overall survival in resected non-small cell lung cancer patients	Lin et al. (91)	No	231	0.800	77	0.723
Development and validation of a nomogram for preoperative prediction of lymph node metastasis in lung adenocarcinoma based on radiomics signature and deep learning signature	Ran et al. (92)	No	200	0.820	60	0.861
A seven-gene signature with close immune correlation was identified for survival prediction of lung adenocarcinoma*	Zou et al. (93)	Yes	499	0.781	180	0.659
Identification and validation of a tumor microenvironment-related gene signature for prognostic prediction in advanced-stage non-small-cell lung cancer*#^	Zhang et al. (94)	Yes	192	0.681	91	0.637
Development of an immune-related gene pairs signature for predicting clinical outcome in lung adenocarcinoma*#	Wu et al. (95)	Yes	465	0.87	431	0.803
Identification of a 5-gene metabolic signature for predicting prognosis based on an integrated analysis of tumor microenvironment in lung adenocarcinoma	Yu et al. (96)	Yes	535	0.767	442	0.685
A model of twenty-three metabolic-related genes predicting overall survival for lung adenocarcinoma*#	Zhao et al. (97)	Yes	445	0.734	393	0.742
A prognostic nomogram combining immune-related gene signature and clinical factors predicts survival in patients with lung adenocarcinoma#	Song et al. (98)	Yes	500	0.652	442	0.632
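As a rough way to read Table 1, the C-statistics can be averaged within the gene-related and non-gene-related groups. The sketch below transcribes the table values; the unweighted averaging and the function name are our own illustrative choices, and with only nine heterogeneous models the output is descriptive rather than a formal comparison.

```python
from statistics import mean

# (reference, gene_related, C-statistic in training, C-statistic in validation)
# Values transcribed from Table 1.
TABLE_1 = [
    ("Duijm et al. (90)", False, 0.640, 0.620),
    ("Lin et al. (91)", False, 0.800, 0.723),
    ("Ran et al. (92)", False, 0.820, 0.861),
    ("Zou et al. (93)", True, 0.781, 0.659),
    ("Zhang et al. (94)", True, 0.681, 0.637),
    ("Wu et al. (95)", True, 0.870, 0.803),
    ("Yu et al. (96)", True, 0.767, 0.685),
    ("Zhao et al. (97)", True, 0.734, 0.742),
    ("Song et al. (98)", True, 0.652, 0.632),
]


def summarize(rows):
    """Unweighted mean training/validation C-statistic per group (sketch)."""
    summary = {}
    for gene_related in (False, True):
        group = [r for r in rows if r[1] == gene_related]
        train = mean(r[2] for r in group)
        valid = mean(r[3] for r in group)
        summary[gene_related] = {
            "n": len(group),
            "mean_training": train,
            "mean_validation": valid,
            "mean_drop": train - valid,  # optimism from training to validation
        }
    return summary


summary = summarize(TABLE_1)
for gene_related, stats in summary.items():
    label = "gene-related" if gene_related else "not gene-related"
    print(label, {k: round(v, 3) for k, v in stats.items()})
```

In this descriptive summary the gene-related models do not show a higher mean validation C-statistic than the others, which is consistent with the observation that gene-related biomarkers have not clearly improved discrimination.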

Figure 1

The literature screening flow chart. Studies published in PubMed in the past year were searched on April 26, 2021. The key words were: ((prognosis) AND (survival) AND (non-small-cell lung cancer) AND (prediction model) OR (signature) AND (AUC) OR (C-index)). Articles were excluded according to the following criteria: (I) not a study of the prognosis of NSCLC, (II) a model or signature was not developed, (III) the full article could not be acquired, (IV) the prediction model was not validated in external datasets, (V) the C-index and sample size of the prediction model were not assessed or reported in both training and validation datasets. NSCLC, non-small cell lung cancer.

*, more than one external validation set was used, and the one with the largest sample size was compared; #, time-dependent ROC curves were made, and the ROC curve with the longest predicted survival time was compared; ^, several models were made according to the different end points of the study, and the model with OS as the end point was compared.

This may indicate that the benefits of incorporating genetic information into prediction models should be evaluated from more aspects. Although prognosis prediction models differ from diagnostic prediction models in function, cost-effectiveness is an issue that requires attention in both, and whether the added cost and time needed to obtain gene-related biomarkers is worthwhile requires further investigation (89). Some authors believe that with progress in science and technology, the detection technology of bionomics will become more convenient, allowing more patients to benefit from it (79). Overall, the most fundamental role of the clinical prognosis prediction model is to allow clinicians and patients to understand the prognosis simply, quickly, and effectively, and to guide treatment in a timely manner. Therefore, reducing the number of genes that need to be detected as much as possible while still improving accuracy is more important than the continuous development of new biomarkers (99).

One solution to this problem is that more and more researchers have begun to select genes based on their function in order to reduce the number of genes. A growing number of mechanisms have been found to be related to tumor genesis, development, and prognosis, such as hypoxia (100,101), autophagy (102,103), ferroptosis (104,105), and the immune microenvironment (106), and more and more genes have been found to be associated with these mechanisms. Some researchers have collected genes with similar functions into a functional gene set, such as ferroptosis-related genes (107). Models based on specific functional gene sets have practical applications, especially in drug use, and could provide clinicians with new ideas for treatment (103,108). For example, autophagy has been identified as a new avenue for treating cancer (109-111). It plays a role in the occurrence and development of tumors through various mechanisms, while various autophagy mechanisms can also inhibit cancer. On this basis, Zhang et al. (103) downloaded 210 autophagy-related genes from the Human Autophagy Database (HADb, https://autophagy.lu/clustering/index.html). Then, 1,496 lncRNAs were identified by a coexpression analysis. Finally, nine autophagy-related lncRNAs were screened to construct a prognostic model for NSCLC. According to this model, patients were divided into high-risk and low-risk groups. The significance of this model is that patients in the high-risk group may be more likely to receive autophagy-related therapy. Because the genes are selected in a targeted way, this approach greatly reduces the amount of calculation needed to establish the model, increases the practicality of the model, and offers more intuitive suggestions for treatment. Of course, the application of the model depends on the development of related drugs or treatments. Similarly, genes involved in an important pathway can be selected as the candidate gene set for modeling, which has the same effect as selecting a specific functional gene set (111).
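The division of patients into high-risk and low-risk groups by a gene signature can be sketched as below. The review does not state how the cited model computes its score or chooses its cutoff, so the linear combination of expression values, the median split, the lncRNA names, and all numbers here are assumptions made purely for illustration.

```python
from statistics import median


def risk_score(expression, coefficients):
    """Weighted sum of expression values over the signature genes.

    Assumption: a linear score with one coefficient per gene.
    """
    return sum(coefficients[g] * expression[g] for g in coefficients)


def stratify(scores):
    """Label each patient 'high' or 'low' risk using a median cutoff.

    Assumption: the median of the cohort's scores is used as the threshold.
    """
    cutoff = median(scores.values())
    return {pid: ("high" if s > cutoff else "low") for pid, s in scores.items()}


# Hypothetical two-lncRNA signature and three hypothetical patients.
coefficients = {"lncA": 0.5, "lncB": -0.2}
cohort = {
    "p1": {"lncA": 2.0, "lncB": 1.0},
    "p2": {"lncA": 1.0, "lncB": 3.0},
    "p3": {"lncA": 4.0, "lncB": 0.0},
}

scores = {pid: risk_score(expr, coefficients) for pid, expr in cohort.items()}
groups = stratify(scores)
```

Restricting the candidate genes to one functional set keeps the coefficient dictionary short, which is the practical advantage of the function-based selection described above.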

Potential predictors that were not included or were not examined in detail in the model

Marital status

Recently, Chen et al. recognized the prognostic role of marital status in lung cancer (112). In their model for predicting the prognosis of NSCLC with marital status, the HR of married versus unmarried patients was 0.914 (CI: 0.896–0.933, P<0.001) by univariate analysis and 0.869 (CI: 0.851–0.887, P<0.001) by multivariate analysis. A prognostic model of squamous cell carcinoma in the elderly developed by Chen et al. (113) included marital status as one of the predictors, and the HR of unmarried versus married persons was 1.146 (CI: 1.103–1.190, P<0.001) by univariate analysis and 1.042 (CI: 1.000–1.085, P=0.049) by multivariate analysis. While studies showing the effect of marital status on the prognosis of lung cancer patients date back to 2007 (114), those examining the relationship between marital status and prognosis in patients with NSCLC have yielded different results using data from different regions. Studies conducted in the United States have generally concluded that unmarried patients have a worse prognosis than those who are divorced or married, while studies in Japan have shown that divorced patients have a worse prognosis (115,116). Overall, unmarried patients have worse overall survival than those who have been married (married or divorced) (112-118). The differences in these results may indicate that marriage does not directly influence tumor prognosis; the association could instead result from several factors, such as sociodemographic differences in culture and economy between regions. Chen et al. also noted that current studies only recorded marital status at the time of diagnosis and did not record changes in marital status after diagnosis, which could lead to biased results (113). Moreover, the data used in most of the relevant studies we reviewed are from 10 years ago and come from a relatively limited range of sources.
Due to changes in social demographics, published conclusions may be different from the current situation, and it is difficult to translate the existing research conclusions into practical applications. Therefore, more new data are needed to carry out relevant studies.
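The two models above report hazard ratios with opposite reference groups (married versus unmarried in (112), unmarried versus married in (113)). A minimal sketch of how one might put them on the same scale is to take the reciprocal of the estimate and swap the reciprocals of the interval bounds. The helper name `invert_hr` is ours, and the calculation says nothing about whether the two study populations are comparable.

```python
def invert_hr(hr, ci_low, ci_high):
    """Flip the reference group of a hazard ratio and its confidence interval."""
    # The point estimate becomes its reciprocal; the interval bounds swap places.
    return 1.0 / hr, 1.0 / ci_high, 1.0 / ci_low


# Multivariate HR for married versus unmarried patients, as reported in (112).
hr, low, high = invert_hr(0.869, 0.851, 0.887)
print(f"unmarried vs married: HR {hr:.3f} (CI: {low:.3f}-{high:.3f})")
```

Expressed this way, the estimate from (112) can be read next to the unmarried-versus-married HR of 1.042 reported in (113).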

Inflammation scores

The difficulty of obtaining sufficient lung cancer tissue in many cases underscores the value of blood biomarkers (119). The role of inflammation in cancer progression is broadly accepted, and inflammatory biomarkers are established prognostic predictors (120). However, using a single biomarker of inflammation to predict prognosis may be premature. Inflammation scores that combine multiple inflammatory biomarkers, such as NLR, PLR, ALI, and SII, may therefore provide greater accuracy (121-127). The indexes included in inflammation scores can be obtained from routine diagnostic tests, including routine blood counts and blood biochemistry. These tests are convenient and cheap and have great potential for clinical application. Mandaliya et al. (127) collected data from 279 patients with advanced NSCLC to conduct the first unified evaluation of PLR, NLR, ALI, and LMR (110). The prognostic effect of these factors was assessed by establishing their association with OS before and after treatment, and high baseline PLR and NLR were found to be associated with poor prognosis. However, while the study illustrated the relationship between these inflammatory scores and prognosis, it did not establish a quantitative relationship within a model. Sandfeld-Paulsen et al. (128) evaluated five inflammatory scoring systems (NLR, PLR, GPS, and two optimized combinations of these three, CNG and ACBS) using a dataset of 275 patients, with the systems further optimized by including comorbidity, age, PS score, TNM stage, and smoking. After the models were established and tested, the C statistics ranked from smallest to largest as follows: original inflammatory score model, TNM staging model, original inflammatory score +TNM+PS model. Among the scores, ACBS, which was developed by the authors themselves and differs from CNG in that the rate is replaced with a parameter and globulin is included, had the best prognostic effect. Most patients with lung cancer have comorbidities that affect inflammatory factors (129), and their influence must be considered when using inflammatory cytokines as prognostic predictors; Sandfeld-Paulsen et al. addressed this by building a multi-factor model. However, a limitation of that study is that inflammatory factors were measured at only one time point, and repeated measurements at different time points may be needed to evaluate the treatment effect and patient prognosis more accurately and dynamically. Although many articles have revealed the relationship between various inflammatory scores and the prognosis of NSCLC, these scores have rarely been included as predictors in prognosis prediction models of NSCLC. It is well known that inflammatory markers are easily affected by the state of the body, such as complications, so using an inflammatory score alone for prognosis prediction may be unreliable. However, their role as a predictor in an appropriate model should be considered.
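The ratio-based indexes discussed above are simple arithmetic on a routine blood count. The following sketch uses their commonly cited definitions (NLR, PLR, LMR, SII, and ALI); the class and function names and the example values are illustrative assumptions, not data from any of the cited studies, and cut-off values are deliberately left out because they vary between studies.

```python
from dataclasses import dataclass


@dataclass
class BloodCounts:
    """Absolute counts from a routine blood test, all in 10^9 cells/L."""
    neutrophils: float
    lymphocytes: float
    platelets: float
    monocytes: float


def inflammation_indices(c: BloodCounts) -> dict:
    """Compute ratio-based inflammation indexes from one blood count."""
    if c.lymphocytes <= 0 or c.monocytes <= 0:
        raise ValueError("lymphocyte and monocyte counts must be positive")
    nlr = c.neutrophils / c.lymphocytes
    return {
        "NLR": nlr,                            # neutrophil-to-lymphocyte ratio
        "PLR": c.platelets / c.lymphocytes,    # platelet-to-lymphocyte ratio
        "LMR": c.lymphocytes / c.monocytes,    # lymphocyte-to-monocyte ratio
        "SII": c.platelets * nlr,              # systemic immune-inflammation index
    }


def advanced_lung_cancer_inflammation_index(bmi, albumin_g_dl, nlr):
    """ALI = BMI (kg/m^2) x serum albumin (g/dL) / NLR."""
    return bmi * albumin_g_dl / nlr


# Hypothetical patient, for illustration only.
counts = BloodCounts(neutrophils=6.0, lymphocytes=1.5, platelets=300.0, monocytes=0.5)
indices = inflammation_indices(counts)
ali = advanced_lung_cancer_inflammation_index(bmi=22.0, albumin_g_dl=4.0, nlr=indices["NLR"])
print(indices, ali)
```

Because every index is derived from the same few counts, repeating the calculation at several time points, as suggested above, adds little cost beyond the blood tests themselves.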

Radiomics features

Radiomics involves the study of quantitative features extracted from medical images such as CT, MRI, and PET, and has been used to assess the prognosis of cancer patients (130-133). The use of quantitative information on cancer phenotypic characteristics obtained from imaging to develop clinical predictive models is an important goal of radiomics (134). Existing radiomics studies have provided multiple predictors for the prognostic model of NSCLC. For example, it has been shown that the imaging features of cone-beam CT (CBCT) can be used to evaluate the therapeutic effect in lung cancer patients (135). CBCT has been increasingly used in clinical practice in recent years, and its image quality is lower than that of conventional CT. In that study, through a two-step calibration process, the authors showed that the imaging features of CBCT and conventional CT were interchangeable, that is, the imaging features of CBCT could be used in place of those of conventional CT to predict the prognosis of patients with NSCLC. While this provided a new subset of factors for the prognostic model of NSCLC, the authors also raised problems that need to be addressed. Compared with conventional CT, CBCT is more sensitive to artifacts, and more studies are needed to explore the influence of artifacts on CBCT radiomics. In addition, we believe that the excessive selection of predictors is also a problem. In that study, 149 features were selected from 1,119 candidate features to establish the model, which is a great obstacle to its clinical application. Future studies should focus on screening out more representative features for model building. Another key advantage of radiomics features is that they overcome the problem of continuity of predictors, in both time and space. It is widely known that tumors are continuously evolving biological systems, so a continuous predictor could greatly improve models.
Benefiting from the development of machine learning technology, the radiomics features used for modeling have developed from quantifiable features extracted manually from traditional images to whole images used without hand-crafted feature definitions (136-138). In 2019, Xu et al. (139) constructed a deep learning model from time-series CT images to predict the response of lung cancer to chemoradiation. The researchers included 179 patients with stage III NSCLC treated with chemoradiation and established a deep learning model using their pre-treatment CT images and post-treatment CT images at 1, 3, and 6 months. The AUC of the model for predicting 2-year overall survival was 0.74 (P<0.05). The model stratified patients into low and high mortality risk groups, which were significantly associated with overall survival (HR =6.16, 95% CI: 2.17–17.44, P<0.001). The discrimination of the model is not the main point here, since AUC can be improved by increasing the sample size or changing the statistical analysis method. The key point of this study is that it solved the problem that the predictors in prognostic prediction models are not capable of dynamic evaluation. Moreover, the change of input predictors from traditional numbers to two-dimensional images is a qualitative leap in the development of models. The development of radiomics has promoted the development of personalized medicine. By identifying quantitative markers of treatment response early, treatment can be adjusted in time (140). Currently, most predictive models based on radiomics features lack reproducibility and external validation and cannot be applied in clinical practice (141). In addition, the standardization of radiomics features, including the standardization of images and workflow, should be a focus of development (134,140,142).

Development, validation, and assessment of models

Development

First, data are the basis on which models are built. More and more databases are being created and published, such as The Cancer Genome Atlas (TCGA) and Gene Expression Omnibus (GEO), and more and more researchers use data from these databases to build their own models. This has promoted the development of clinical prediction models. Data should be cleaned before modeling, including quality control, standardization, and batch-effect removal. Some databases and datasets have already completed this processing, so researchers can use the data directly. Some researchers combine multiple datasets to build models. Because different datasets have different data processing methods, different sources, and even different sequencing platforms, integrating them directly into a new dataset leads to very large heterogeneity. Therefore, it is necessary to standardize the data and remove batch effects before integration. The R packages “lumi”, “limma”, and “beadarray” can be used for quality control and normalization, and “sva” can be used for batch-effect removal. The establishment of new models relies on the proposal of innovative predictors. After potential predictors are identified, they are further screened with univariate and multivariate analysis as candidate variables. In practice, after determining the predictors to be screened, researchers generally do not include all of them, but select variables that differ significantly between the experimental and control groups.
For example, in prognostic prediction models of NSCLC based on gene expression, researchers first compare gene expression in cancer tissue with that in normal tissue, most simply with Student’s t test between the two groups of samples, and then select the genes with significantly different expression for the next round of screening, such as those with an adj.P value less than 0.05 and an absolute logFC (log2 fold change) greater than 1. This method of screening for variables that differ significantly between the experimental and control groups is generally applicable when there are many predictors to screen, not only in gene-related models. For example, Nair et al. (143) carried out a study using machine learning techniques to predict EGFR mutations in NSCLC by radiogenomics. Before feature screening and model building, 326 radiomics features were analyzed by linear discriminant analysis (LDA) to identify the more important features and rank them by importance. Although the authors did not specify how many features were selected by this step, they clearly indicated that it could greatly reduce the possibility of overfitting the model. This approach is similar to difference analysis. In this study, the 326 imaging features of patients with EGFR-mutated and EGFR wild-type tumors were analyzed, those with adj. P<0.05 were selected, and the features with greater changes were ranked according to the absolute value of logFC to enter the next stage of screening. This approach has the same effect as the LDA method adopted by the authors. After that, univariate analysis is used for the primary screening of predictors that influence the outcome, and multivariate analysis, by stepwise regression or the least absolute shrinkage and selection operator (LASSO), can eliminate the bias caused by other confounding factors.
According to the type of predictors, clinical prediction models can also be divided into epidemiological models, clinical models, and biomarker models (89). There are many ways to develop models, such as logistic regression for dichotomous outcomes and Cox regression for time-to-event outcomes (144). Prognosis prediction models are used to evaluate survival probabilities in future years, so Cox regression is widely used. In recent years, machine learning has been applied to the development of clinical prediction models (145,146), and with the rapid development of high-throughput sequencing, more models are incorporating gene-related information for analysis. Machine learning can effectively analyze large samples of biomedical data containing genomic and genetic information and elucidate the complex biological mechanisms involved (146). Researchers should choose the most appropriate method for developing a model according to the type of outcome variable.
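The screening rule described above (adj.P below 0.05 and absolute logFC above 1, then ranking by the size of the change) can be sketched as follows. The text does not say how P values are adjusted, so the Benjamini-Hochberg procedure used here is an assumption, and the feature names, P values, and logFC values are made up for illustration.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted P values in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P value down so that adjusted values stay monotone.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


def screen_features(names, p_values, log_fc, alpha=0.05, min_abs_logfc=1.0):
    """Keep features with adj.P < alpha and |logFC| > min_abs_logfc, largest change first."""
    adj = benjamini_hochberg(p_values)
    kept = [
        (name, a, fc)
        for name, a, fc in zip(names, adj, log_fc)
        if a < alpha and abs(fc) > min_abs_logfc
    ]
    return sorted(kept, key=lambda item: abs(item[2]), reverse=True)


# Hypothetical differential-expression results (raw P values and log2 fold changes).
names = ["gene_A", "gene_B", "gene_C", "gene_D", "gene_E"]
p_values = [0.001, 0.008, 0.039, 0.041, 0.6]
log_fc = [2.1, -1.6, 0.4, 1.2, 3.0]

selected = screen_features(names, p_values, log_fc)
for name, adj_p, fc in selected:
    print(f"{name}: adj.P={adj_p:.4f}, logFC={fc:+.1f}")
```

In this made-up example, gene_D has a raw P value below 0.05 and a logFC above 1 but fails the adjusted threshold, which illustrates why the adjusted rather than the raw P value is used when many features are tested.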
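To make the link between a fitted Cox model and "survival probabilities in future years" concrete, the sketch below applies the proportional hazards relation S(t | x) = S0(t)^exp(lp), where lp is the linear predictor. The coefficients, the baseline 5-year survival, and the predictor names are hypothetical values chosen for illustration and are not taken from any model reviewed here.

```python
import math


def linear_predictor(coefs, patient):
    """Sum of coefficient x predictor value over all predictors in the model."""
    return sum(coefs[name] * patient[name] for name in coefs)


def survival_probability(baseline_survival, lp):
    """Cox proportional hazards: S(t | x) = S0(t) ** exp(lp)."""
    return baseline_survival ** math.exp(lp)


# Hypothetical log hazard ratios and baseline survival (assumptions for this sketch).
coefs = {"age_per_10y_above_60": 0.20, "stage_III_IV": 0.90}
baseline_5y = 0.70  # S0(5 years) for a patient with all predictors at zero

patient = {"age_per_10y_above_60": 1.0, "stage_III_IV": 1}
lp = linear_predictor(coefs, patient)
p5 = survival_probability(baseline_5y, lp)
print(f"linear predictor = {lp:.2f}, predicted 5-year survival = {p5:.2f}")
```

Reporting both the coefficients and the baseline survival at the time points of interest is what allows other researchers to reproduce such predictions.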

Validation

When a regression model is developed and fitted on a particular dataset, it also fits the random variation within that dataset. When the model is applied to other data, its performance may therefore be unsatisfactory, because that random variation is unique to the original dataset (147). Models must therefore undergo both internal and external validation. Internal validation is mainly used to correct for overfitting and optimism by cross-validation and optimism-corrected bootstrapping (11). However, external validation, which tests the model with data different from the original, is more convincing. It is also necessary to assess the accuracy of models. Tammemägi et al. (147) found that internal validation tends to underestimate the loss of predictive performance observed when the model is applied to new data. However, among the existing prognostic models we reviewed, every model has its own external validation dataset, and there is no unified dataset for external validation that would create a rigorous evaluation environment, which makes comparison difficult. Additionally, the independent external validation of biomarker models and clinical models is inadequate, which could be due to the high cost and the lack of interest among volunteers.
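Optimism-corrected bootstrapping, mentioned above as an internal validation method, can be sketched as follows. To stay self-contained, the "model" here is a toy stand-in that simply picks the candidate predictor with the best apparent AUC, and the data are simulated; both are assumptions made for illustration, and a real analysis would refit the full modeling pipeline in every bootstrap sample.

```python
import random


def auc(scores, labels):
    """Probability that a random case scores higher than a random control."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def fit(rows, labels):
    """Toy model: index of the predictor with the highest apparent AUC."""
    return max(range(len(rows[0])), key=lambda j: auc([r[j] for r in rows], labels))


def score(model, rows, labels):
    return auc([r[model] for r in rows], labels)


def optimism_corrected_auc(rows, labels, n_boot=100, seed=0):
    rng = random.Random(seed)
    n = len(rows)
    apparent = score(fit(rows, labels), rows, labels)
    optimism = []
    while len(optimism) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        b_rows = [rows[i] for i in idx]
        b_labels = [labels[i] for i in idx]
        if len(set(b_labels)) < 2:
            continue  # AUC is undefined unless both outcomes are present
        model = fit(b_rows, b_labels)
        # Optimism: performance in the bootstrap sample minus performance in the original data.
        optimism.append(score(model, b_rows, b_labels) - score(model, rows, labels))
    return apparent, apparent - sum(optimism) / n_boot


# Simulated data: 60 patients, 6 predictors, only the first weakly related to outcome.
data_rng = random.Random(1)
labels = [i % 2 for i in range(60)]
rows = [[data_rng.gauss(0.5 * y if j == 0 else 0.0, 1.0) for j in range(6)] for y in labels]

apparent, corrected = optimism_corrected_auc(rows, labels)
print(f"apparent AUC = {apparent:.3f}, optimism-corrected AUC = {corrected:.3f}")
```

The corrected value estimates performance in new patients from the same population only; it does not replace external validation in a different dataset.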

Assessment

After the development and validation of models, some critical indexes should be used to assess them, and when this does not take place, serious bias may occur (148). The main components of model assessment are discrimination and calibration. Discrimination refers to the ability to distinguish cases from controls, which is most widely represented by the area under the receiver operating characteristic curve (AUC) and the concordance or c-statistic, and the two are numerically equal. However, discrimination can only evaluate the ability to classify correctly and does not reflect the ability of a model to predict individual probabilities accurately. To address this, calibration is introduced. Calibration has been measured by goodness-of-fit tests (149), and the Hosmer-Lemeshow statistic has been used for logistic regression. However, this method has been criticized for the instability of its P value (150). The most widely applied method for assessing calibration is plotting observed probabilities against model-predicted probabilities. Some modelers also use the Brier score, which is a proper scoring rule that is affected by both discrimination and calibration (151). However, discrimination and calibration alone can only reflect the accuracy of model prediction and cannot directly reflect whether patients can benefit from it in clinical practice. Decision curve analysis (DCA) integrates the preferences of patients and decision makers into the curve, directly showing the extent to which patients can benefit from a decision, which meets the needs of clinical application and is an effective supplement to the ROC curve (146,152). The performance of models developed by different methods with the same dataset will not be the same, and no method is consistently better than others. This leads to a situation in which modelers may selectively report the best indexes, which makes comparison more difficult and calls for a unified assessment rule.
While development, validation, and assessment are indispensable in clinical prediction model research, most currently published models are unsatisfactory in their validation and evaluation. As the focus of this study is not on the methodology of the model, we have provided only a brief introduction; more details on the development, validation, and assessment of clinical prediction models are discussed in Harrell et al. (153).
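The assessment indexes described above can be illustrated for a binary outcome with a few lines of code: AUC for discrimination, grouped observed versus predicted risk for calibration, the Brier score, and the net benefit at one threshold probability, which is the quantity plotted in a decision curve. The predicted risks and outcomes are made-up values, and the function names are ours.

```python
def auc(probs, outcomes):
    """Share of case-control pairs in which the case has the higher predicted risk."""
    pos = [p for p, y in zip(probs, outcomes) if y == 1]
    neg = [p for p, y in zip(probs, outcomes) if y == 0]
    wins = sum((a > b) + 0.5 * (a == b) for a in pos for b in neg)
    return wins / (len(pos) * len(neg))


def brier_score(probs, outcomes):
    """Mean squared difference between predicted risk and observed outcome."""
    return sum((p - y) ** 2 for p, y in zip(probs, outcomes)) / len(probs)


def calibration_table(probs, outcomes, n_groups):
    """Mean predicted risk and observed event rate within risk-ordered groups."""
    ranked = sorted(zip(probs, outcomes))
    size = len(ranked) // n_groups
    table = []
    for g in range(n_groups):
        # The last group absorbs any remainder.
        chunk = ranked[g * size:] if g == n_groups - 1 else ranked[g * size:(g + 1) * size]
        mean_pred = sum(p for p, _ in chunk) / len(chunk)
        observed = sum(y for _, y in chunk) / len(chunk)
        table.append((mean_pred, observed))
    return table


def net_benefit(probs, outcomes, threshold):
    """Net benefit of treating everyone whose predicted risk is >= threshold."""
    n = len(probs)
    tp = sum(1 for p, y in zip(probs, outcomes) if p >= threshold and y == 1)
    fp = sum(1 for p, y in zip(probs, outcomes) if p >= threshold and y == 0)
    return tp / n - fp / n * threshold / (1 - threshold)


# Hypothetical predicted risks and observed outcomes (1 = event).
probs = [0.1, 0.2, 0.3, 0.4, 0.6, 0.7, 0.8, 0.9]
outcomes = [0, 0, 1, 0, 1, 0, 1, 1]

auc_value = auc(probs, outcomes)
brier = brier_score(probs, outcomes)
table = calibration_table(probs, outcomes, n_groups=2)
nb_model = net_benefit(probs, outcomes, threshold=0.5)
nb_treat_all = net_benefit([1.0] * len(outcomes), outcomes, threshold=0.5)
print(auc_value, brier, table, nb_model, nb_treat_all)
```

Repeating the net benefit calculation over a range of thresholds, and comparing it with the "treat all" and "treat none" (net benefit of zero) strategies, gives the decision curve.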

Visualization of models

Generally, the regression formula should be reported for a logistic or Cox regression model to allow validation by other researchers. Some modelers have also transformed the formula into a risk score to make it easier to understand and apply. For example, we have developed a survival prediction model for NSCLC and reported a risk score formula in which age represents age at diagnosis; stage I/II =1 and stage III/IV =2; and EGFR, PIK3CA, and TP53 are coded as mutation =1, no mutation =0 (53). Visual depictions such as nomograms, graphical score charts, and website/mobile apps convey research results in an accessible manner. In this study, we used a nomogram (Figure 2) to assign a score to each patient on the axis of the corresponding variable, and the sum of these scores determines the location on the total points axis. This carries the advantage that multiple time points, along with continuous and categorical predictors, can be presented on one interface. However, nomograms can appear complex at first sight and require an explanation as to how they should be used (148). In addition, the result can be inaccurate because the predictor axes carry few scale marks, which introduces observer error when values are read. A dynamic nomogram, which is a web calculator, can be built with the R package “DynNom” (154); it offers a high-precision and convenient method that may gradually replace the basic nomogram. In addition, this evaluation method, which relies on the network and mobile devices, can be easily embedded into the patient's diagnosis and treatment system to monitor their condition anytime and anywhere.
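The way a nomogram turns regression coefficients into points can be sketched as follows: the predictor with the largest possible effect spans the full 0 to 100 points axis, and every other predictor is scaled relative to it. The variable coding follows the description above (stage I/II =1, III/IV =2; mutation =1, no mutation =0), but the coefficients and value ranges are hypothetical and are not those of the published model (53).

```python
def make_points_function(coefs, ranges):
    """Return a function mapping (predictor, value) to nomogram points (0-100)."""
    # The largest achievable effect (|coefficient| x range) defines 100 points.
    max_effect = max(abs(coefs[k]) * (ranges[k][1] - ranges[k][0]) for k in coefs)

    def points(name, value):
        low, high = ranges[name]
        # With a negative coefficient, zero points sit at the upper end of the range.
        reference = low if coefs[name] >= 0 else high
        return 100.0 * abs(coefs[name] * (value - reference)) / max_effect

    return points


# Hypothetical coefficients (log hazard ratios) and predictor ranges.
coefs = {"age": 0.03, "stage": 0.90, "TP53": 0.40}
ranges = {"age": (40, 80), "stage": (1, 2), "TP53": (0, 1)}

points = make_points_function(coefs, ranges)
patient = {"age": 60, "stage": 2, "TP53": 1}
total = sum(points(name, value) for name, value in patient.items())
print({name: round(points(name, value), 1) for name, value in patient.items()}, round(total, 1))
```

The total is then read against the total points axis, which maps it to a predicted survival probability at each time point shown on the nomogram.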

Figure 2

Nomogram for predicting the survival of patients with lung cancer at 3, 5, and 10 years based on data from TCGA. TCGA, The Cancer Genome Atlas.

The format depends on user and environment conditions. Guides to presenting clinical prediction models for use in clinical settings provide key information for modelers when selecting a visualization form (155). However, the guideline is only appropriate for traditional models using logistic or Cox regression. As machine learning is increasingly used, so are new forms of visualization, and a standard similar to the guide to presenting clinical prediction models should be formulated and applied.

Bias assessment based on PROBAST

PROBAST (Prediction model Risk Of Bias ASsessment Tool) is a tool for assessing the risk of bias (ROB) and the applicability of prediction model studies to a review question (156). It consists of four domains: participants, predictors, outcome, and analysis, and includes twenty specific items facilitating quality control (157). The assessment tool was designed for the systematic review of prediction model studies and has been applied in many areas including breast cancer, kidney cancer, lymphocytic leukaemia, and oropharyngeal cancer (158-161). We assessed the bias of the nine studies mentioned above using PROBAST, and the results are shown in Table 2. Applicability was not assessed because a systematic review question was not set. The criteria for the overall judgement are strict: a prediction model was rated “low ROB” only when ROB was low in all domains, and “high ROB” if at least one domain had high ROB. The overall judgement of concerns regarding applicability follows the same rule. If a prediction model was not validated in an external dataset, the study was judged as high ROB even if all domains had low ROB. Therefore, we excluded the studies without external validation.
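The overall judgement rule described above can be written as a small function. The "high" and "low" branches follow the text; the "unclear" branch (at least one unclear domain and no high domain) is our assumption based on the PROBAST guidance, since the rule for unclear ratings is not spelled out here. The function and variable names are illustrative.

```python
DOMAINS = ("participants", "predictors", "outcome", "analysis")


def overall_rob(ratings, externally_validated):
    """Combine domain ratings ('low', 'high', 'unclear') into an overall ROB judgement."""
    values = [ratings[domain] for domain in DOMAINS]
    # Any high-ROB domain, or no external validation, makes the study high ROB.
    if "high" in values or not externally_validated:
        return "high"
    if all(value == "low" for value in values):
        return "low"
    # Assumed rule: remaining cases contain at least one unclear domain.
    return "unclear"


# Hypothetical rating, for illustration only.
example = {"participants": "unclear", "predictors": "low", "outcome": "low", "analysis": "high"}
print(overall_rob(example, externally_validated=True))
```

Applying the same rule to every study keeps the overall judgement reproducible across assessors.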

Table 2

Results from the ROB assessment of nine studies using PROBAST

Study	ROB: Participants	ROB: Predictors	ROB: Outcome	ROB: Analysis	Applicability: Participants	Applicability: Predictors	Applicability: Outcome	Overall: ROB	Overall: Applicability
The development and external validation of an overall survival nomogram in medically inoperable centrally located early-stage non-small cell lung carcinoma	–	–	–	+				+
A nomogram based on CT deep learning signature: a potential tool for the prediction of overall survival in resected non-small cell lung cancer patients	–	–	–	+				+
Development and validation of a nomogram for preoperative prediction of lymph node metastasis in lung adenocarcinoma based on radiomics signature and deep learning signature	–	–	–	+				+
A seven-gene signature with close immune correlation was identified for survival prediction of lung adenocarcinoma	?	–	–	+				+
Identification and validation of a tumor microenvironment-related gene signature for prognostic prediction in advanced-stage non-small-cell lung cancer	–	–	–	+				+
Development of an immune-related gene pairs signature for predicting clinical outcome in lung adenocarcinoma	?	–	–	+				+
Identification of a 5-gene metabolic signature for predicting prognosis based on an integrated analysis of tumor microenvironment in lung adenocarcinoma	?	–	–	+				+
A model of twenty-three metabolic-related genes predicting overall survival for lung adenocarcinoma	–	–	–	+				+
A prognostic nomogram combining immune-related gene signature and clinical factors predicts survival in patients with lung adenocarcinoma	?	–	–	+				+

+, low ROB/low concern regarding applicability; –, high ROB/high concern regarding applicability; ?, unclear ROB/unclear concern regarding applicability. ROB, risk of bias.

From the results, all included studies had a “high ROB” in the overall judgement. The main source of ROB was the “analysis” domain, and the “participants” domain was often unclear because many researchers did not report the inclusion and exclusion criteria for subjects. The designers of PROBAST have broadened its scope of application so that it can be used as a tool or yardstick for critically evaluating original studies of prediction models. The PROBAST development team recommends that researchers refer to it at the commencement of a study to avoid ROB arising from study design and datasets, and assess the applicability of statistical methods according to PROBAST during the study. After the study is completed, scholars or experts independent of the model development team should be invited to evaluate the methodological quality of the model to control the bias of the study (157).

Conclusions

The wide application of next-generation sequencing technology provides a variety of predictive factors for clinical prediction models. However, it seems that gene-related biomarkers cannot obviously improve the performance of models. A solution for simplifying the required biomarkers while ensuring stability is urgently needed. The TNM staging system, WHO-PS, and pathological classification should be incorporated into all models, and the existing models should be validated in a large external dataset to make a meaningful comparison. Moreover, the current prognosis prediction models of NSCLC are at high ROB, and promoting the application of PROBAST may improve this situation. In addition, a systematic review of the prognosis prediction model research of NSCLC is required.


1. Development of a RNA-Seq Based Prognostic Signature in Lung Adenocarcinoma.

Authors: Sudhanshu Shukla; Joseph R Evans; Rohit Malik; Felix Y Feng; Saravana M Dhanasekaran; Xuhong Cao; Guoan Chen; David G Beer; Hui Jiang; Arul M Chinnaiyan
Journal: J Natl Cancer Inst Date: 2016-10-05 Impact factor: 13.506

2. Promoter methylation of APC and RAR-β genes as prognostic markers in non-small cell lung cancer (NSCLC).

Authors: Hongxiang Feng; Zhenrong Zhang; Xin Qing; Xiaowei Wang; Chaoyang Liang; Deruo Liu
Journal: Exp Mol Pathol Date: 2015-12-08 Impact factor: 3.362

3. Value of KRAS as prognostic or predictive marker in NSCLC: results from the TAILOR trial.

Authors: E Rulli; M Marabese; V Torri; G Farina; S Veronese; A Bettini; F Longo; L Moscetti; M Ganzinelli; C Lauricella; E Copreni; R Labianca; O Martelli; S Marsoni; M Broggini; M C Garassino
Journal: Ann Oncol Date: 2015-07-24 Impact factor: 32.976

4. Does marital status impact survival and quality of life in patients with non-small cell lung cancer? Observations from the mayo clinic lung cancer cohort.

Authors: Aminah Jatoi; Paul Novotny; Stephen Cassivi; Matthew M Clark; David Midthun; Christi A Patten; Jeff Sloan; Ping Yang
Journal: Oncologist Date: 2007-12

5. Prognostic role of neutrophil-to-lymphocyte ratio in solid tumors: a systematic review and meta-analysis.

Authors: Arnoud J Templeton; Mairéad G McNamara; Boštjan Šeruga; Francisco E Vera-Badillo; Priya Aneja; Alberto Ocaña; Raya Leibowitz-Amit; Guru Sonpavde; Jennifer J Knox; Ben Tran; Ian F Tannock; Eitan Amir
Journal: J Natl Cancer Inst Date: 2014-05-29 Impact factor: 13.506

6. Lung Cancer Risk Prediction Model Incorporating Lung Function: Development and Validation in the UK Biobank Prospective Cohort Study.

Authors: David C Muller; Mattias Johansson; Paul Brennan
Journal: J Clin Oncol Date: 2017-01-17 Impact factor: 44.544

7. mTOR-mediated cancer drug resistance suppresses autophagy and generates a druggable metabolic vulnerability.

Authors: Niklas Gremke; Pierfrancesco Polo; Aaron Dort; Jean Schneikert; Sabrina Elmshäuser; Corinna Brehm; Ursula Klingmüller; Anna Schmitt; Hans Christian Reinhardt; Oleg Timofeev; Michael Wanzel; Thorsten Stiewe
Journal: Nat Commun Date: 2020-09-17 Impact factor: 14.919

8. Independent Validation of Early-Stage Non-Small Cell Lung Cancer Prognostic Scores Incorporating Epigenetic and Transcriptional Biomarkers With Gene-Gene Interactions and Main Effects.

Authors: Ruyang Zhang; Chao Chen; Xuesi Dong; Sipeng Shen; Linjing Lai; Jieyu He; Dongfang You; Lijuan Lin; Ying Zhu; Hui Huang; Jiajin Chen; Liangmin Wei; Xin Chen; Yi Li; Yichen Guo; Weiwei Duan; Liya Liu; Li Su; Andrea Shafer; Thomas Fleischer; Maria Moksnes Bjaanæs; Anna Karlsson; Maria Planck; Rui Wang; Johan Staaf; Åslaug Helland; Manel Esteller; Yongyue Wei; Feng Chen; David C Christiani
Journal: Chest Date: 2020-02-28 Impact factor: 9.410

9. Ube2v1-mediated ubiquitination and degradation of Sirt1 promotes metastasis of colorectal cancer by epigenetically suppressing autophagy.

Authors: Tong Shen; Ling-Dong Cai; Yu-Hong Liu; Shi Li; Wen-Juan Gan; Xiu-Ming Li; Jing-Ru Wang; Peng-Da Guo; Qun Zhou; Xing-Xing Lu; Li-Na Sun; Jian-Ming Li
Journal: J Hematol Oncol Date: 2018-07-17 Impact factor: 17.388

10. PLCγ1 suppression promotes the adaptation of KRAS-mutant lung adenocarcinomas to hypoxia.

Authors: Matteo Rossi Sebastiano; Chiara Pozzato; Maria Saliakoura; Florian H Heidel; Tina M Schnöder; Spasenija Savic Prince; Lukas Bubendorf; Paolo Pinton; Ralph A Schmid; Johanna Baumgartner; Stefan Freigang; Sabina A Berezowska; Alessandro Rimessi; Georgia Konstantinidou
Journal: Nat Cell Biol Date: 2020-10-19 Impact factor: 28.213
