Spatial interplay patterns of cancer nuclei and tumor-infiltrating lymphocytes (TILs) predict clinical benefit for immune checkpoint inhibitors.

Xiangxue Wang¹, Cristian Barrera¹, Kaustav Bera¹, Vidya Sankar Viswanathan¹, Sepideh Azarianpour-Esfahani¹, Can Koyuncu¹, Priya Velu², Michael D Feldman³, Michael Yang⁴, Pingfu Fu⁵, Kurt A Schalper⁶, Haider Mahdi⁷, Cheng Lu¹, Vamsidhar Velcheti⁸, Anant Madabhushi¹,⁹.

Abstract

Immune checkpoint inhibitors (ICIs) show prominent clinical activity across multiple advanced tumors. However, fewer than half of patients respond even after molecule-based selection. Thus, improved biomarkers are required. In this study, we use image analysis to capture morphologic attributes relating to the spatial interaction and architecture of tumor cells and tumor-infiltrating lymphocytes (TILs) from digitized H&E images. We evaluate the association of image features with progression-free survival (PFS) and overall survival (OS) in non-small cell lung cancer (NSCLC) (N = 187) and gynecological cancer (N = 39) patients treated with ICIs. We demonstrate that the classifier trained with NSCLC alone was associated with PFS in independent NSCLC cohorts and also in gynecological cancer. The classifier was also associated with clinical outcome independent of clinical factors. Moreover, the classifier was associated with PFS even in patients with low PD-L1 expression. These findings suggest that image analysis can be used to predict clinical end points in patients receiving ICI.

Year: 2022 PMID: 35648850 PMCID: PMC9159577 DOI: 10.1126/sciadv.abn3966

Source DB: PubMed Journal: Sci Adv ISSN: 2375-2548 Impact factor: 14.957

INTRODUCTION

Programmed cell death protein 1 (PD-1)/programmed death-ligand 1 (PD-L1) axis inhibitors have revolutionized the treatment paradigm across multiple cancers, showing remarkable improvement in mortality in melanoma, lung, head and neck, and gynecologic cancers, among others. In metastatic non–small cell lung cancer (NSCLC), immune checkpoint inhibitors (ICIs) with or without chemotherapy are now a first-line treatment regimen. ICIs offer a drastic improvement in overall and progression-free survival (OS and PFS) with minimal treatment-related toxicities. In metastatic NSCLC patients with high PD-L1 expression (>50%), where pembrolizumab (a PD-1 inhibitor) is first-line therapy, significant improvements in OS as compared to chemotherapy alone [hazard ratio (HR), 0.63; 95% confidence interval (CI), 0.47 to 0.86; P = 0.002] have been reported. Despite robust clinical benefit, one caveat is that even when patient selection is driven by PD-L1 expression, treatment response rates ranged from 27 to 45% in the first-line setting (27% in PD-L1 > 1% and 45% in PD-L1 > 50% subgroup of KEYNOTE-024) and 19% in the second-line refractory setting. In gynecological cancers (including ovarian, cervical, and uterine cancers), preliminary studies with PD-L1 expression–based patient selection have also shown that ICI improved survival and time to progression in patients with metastatic disease, but with similarly limited response rates as in NSCLC. These findings suggest that tissue-derived PD-L1 expression is not an optimal biomarker to identify patients likely to respond to ICI therapy in any of these cancers. This could be due to the lack of an accepted standard in determining PD-L1 expression levels. Other molecular biomarkers, like tumor mutational burden (TMB) or microsatellite instability, have been explored in recent clinical trials to determine the association with clinical outcomes in different tumors. However, recent studies suggested that high TMB does not always predict response, and some studies have emphasized the importance of additional biomarkers that capture the complexity of the tumor immune microenvironment. Thus, there is an urgent need to identify novel biomarkers associated with likelihood of response to ICI, helping to identify those patients who are less likely to respond to ICI and hence might benefit from alternative therapeutic strategies.

ICI agents rely on augmenting the body’s immune response by one or a combination of the following proposed mechanisms: (i) inhibiting co-repressive signals on T lymphocytes during the early immune response (anti–CTLA-4) in lymph nodes, (ii) antagonizing PD-1 receptor signaling in T lymphocytes during the effector phase (anti–PD-1) in peripheral tissues, or (iii) neutralizing PD-L1 (anti–PD-L1) expressed on the surface of tumor cells in peripheral tissues. One of the hallmarks of cellular immune response for solid tumors is the presence of tumor-infiltrating lymphocytes (TILs) in the tumor microenvironment (TME). Studies have shown that TIL density is strongly prognostic of response in a variety of solid tumors, irrespective of the treatment type. TIL density has been shown to be a predictor of ICI response as well, with a recent study showing that density, as measured by immunohistochemistry (IHC) of lymphocytes present on the invasive margin of melanomas instead of more central lymphocytic infiltration, can predict anti–PD-1 response. To date, TIL analysis has focused largely on pathologist-defined visual TIL density estimation based on a few high-power fields viewed under the microscope. This manual TIL estimate suffers from subjectivity and low consistency between evaluators. Computer-assisted TIL estimation based on hematoxylin and eosin (H&E) has been explored to decrease the subjectivity. Moreover, recent studies have shown that density and arrangement patterns of TILs on H&E images or using quantitative multiplex immunofluorescence are associated with risk of recurrence and outcomes in NSCLC and breast cancer. While recent studies have reported that neural network–based approaches could predict the treatment response in cancer patients treated with immunotherapy, end-to-end deep learning models tend to lack interpretability, which, in turn, could make it harder to achieve generalizable results on new, unseen, external datasets. A number of findings from our group as well as from others have emphasized the importance of biologically inspired features related to spatial heterogeneity and complex interactions between the TME and the immune microenvironment for predicting clinically relevant outcomes in ICI-treated cancer patients.

In this work, we present an automated image classifier that not only computationally enumerates TIL density but also characterizes the spatial architecture and arrangement of TILs to predict clinical outcomes in ICI-treated cancer patients (Fig. 2). Specifically, we aimed to capture quantitative image biomarkers relating to the spatial interaction between immune and tumor cells in the TME (i.e., epithelium and stroma) and subsequently show the association of these image biomarkers with clinical outcomes for ICI (monotherapy and combination of ICI agents)–treated patients with different cancer types [NSCLC and gynecological cancers (Fig. 1)]. The association of these image biomarkers with clinical outcome in ICI-treated patients was also compared against manual TIL estimation and evaluated in the context of cancer patients with low PD-L1 expression.

Fig. 1.

Consort flow diagram of patient selection.

Patient selection from four different NSCLC cohorts (D1–4) and a gynecological cancer cohort (D5).

Fig. 2.

Flowchart of high-level workflow.

Entire workflow comprises (A) tissue preparation and digitization, (B) tile processing from annotation, (C) region and cell segmentation, (D) image analysis and feature extraction, (E) predictive model construction and validation, and (F) cohort overview.

RESULTS

HistoTIL predicts response to ICI

Feature discovery on the D1 NSCLC cohort revealed a total of seven HistoTIL features (intersected area between non-TIL nuclei and TIL graphs, nuclei shape descriptors in epithelium area, and distance statistics between non-TIL nuclei and TIL graph) that were significantly associated with OS (P < 0.001) and PFS (P = 0.003). These included four features from the non-TIL nuclei and three from the interplay between non-TIL nuclei and TILs (Fig. 3; detailed feature description and distribution in table S1 and fig. S1). In predicting response by Response Evaluation Criteria in Solid Tumours (RECIST) as a binary clinical outcome, the linear model achieved an area under the curve (AUC) of 0.71 on the NSCLC training set D1; 0.73, 0.62, and 0.73 on the independent NSCLC validation cohorts D2, D3, and D4, respectively; and 0.68 on the gynecological cohort D5.
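
To make the AUC figures above concrete, the following is a minimal sketch of how an AUC can be computed from binarized RECIST labels and model scores. It assumes the grouping described in the Statistical analysis section (complete or partial response versus stable or progressive disease). The function names and the example values are hypothetical and are not taken from the study.

```python
# Assumed RECIST grouping: CR/PR count as response, SD/PD as nonresponse.
RESPONDER = {"CR", "PR"}
NONRESPONDER = {"SD", "PD"}


def recist_to_binary(labels):
    """Map RECIST categories to 1 (response) or 0 (nonresponse)."""
    binary = []
    for label in labels:
        if label in RESPONDER:
            binary.append(1)
        elif label in NONRESPONDER:
            binary.append(0)
        else:
            raise ValueError(f"unknown RECIST category: {label!r}")
    return binary


def roc_auc(y_true, scores):
    """AUC as the fraction of (responder, nonresponder) pairs in which the
    responder has the higher score; tied pairs count as one half."""
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical patients: RECIST category and predicted response probability.
example_recist = ["CR", "PD", "PR", "SD", "PD"]
example_scores = [0.9, 0.2, 0.4, 0.6, 0.1]
example_auc = roc_auc(recist_to_binary(example_recist), example_scores)
```

The pairwise formulation is equivalent to the area under the ROC curve and avoids choosing explicit thresholds, at the cost of quadratic time, which is acceptable for cohorts of this size.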

Fig. 3.

Visualization of features relating to TIL–non-TIL interactions.

A high-risk example is found on the left panel. On the right panel is a low-risk sample. (A and B) The whole-slide image (WSI) is depicted with a green-marked line surrounding the tissue that was indicated as sufficient for processing. (C and D) Two small panels indicate the grouping behavior of the lymphocyte and nonlymphocyte cells by means of a two-dimensional plot with both cell groups intertwined, where the numbers in red indicate the main panel it is coming from. (E and F) A zoomed-in region of interest. The groups of cells form distinctive clusters. The clusters are connected through colored lines, which indicate the distance between lymphocytes (green line), nonlymphocytes (orange line), cluster of lymphocytes (cyan line), and cluster of nonlymphocytes (yellow line). Descriptors are calculated that capture general architectural structure and immune-cell interplay. (G and H) Texture-based feature related to the heatmap of the nuclei and surrounding tissue for the low- and high-risk cases. Zernike polynomials and Haralick descriptors are calculated from the shape and texture of the cell.

Independent prediction of survival

On D1, HistoTIL yielded HR = 5.47 (95% CI, 1.77 to 16.9; P < 0.001) for OS and HR = 4.63 (95% CI, 1.38 to 11.8; P = 0.003) for PFS (Fig. 4). On the NSCLC validation sets (D2–4), the HRs for OS and PFS were, respectively, 3.29 (95% CI, 1.32 to 8.16; P = 0.042) and 2.45 (95% CI, 1.00 to 6.00; P = 0.031) on D2, 2.03 (95% CI, 1.08 to 3.82; P = 0.027) and 2.43 (95% CI, 1.04 to 5.65; P = 0.018) on D3, and 2.58 (95% CI, 1.02 to 6.49; P = 0.009) and 3.05 (95% CI, 1.07 to 8.69; P = 0.001) on D4. HistoTIL was also associated with OS and PFS in gynecological cancers (D5), with HR = 2.29 (95% CI, 1.15 to 4.58; P = 0.0037) and HR = 1.93 (95% CI, 1.10 to 3.38; P < 0.043), respectively. The median OS of the HistoTIL-defined high-risk and low-risk groups was, respectively, 14.2 and 34.6 months for D2, 15.7 and 23.3 months for D3, and 12.7 and 22.6 months for D4 (all in lung cancer) and 6.5 and 14.2 months for D5 in gynecological cancers. Similarly, the median PFS of the two HistoTIL risk groups was 3.1 and 10.6 months for D2, 2.7 and 8.5 months for D3, and 3.4 and 11.2 months for D4 in NSCLC and 2.5 and 9.1 months for D5 in gynecological cancers.

Fig. 4.

Forest plot for individual cohort.

Survival analysis with OS (A) and PFS (B) between HistoTIL-defined low- and high-risk groups.

Moreover, on multivariable analysis and after adjusting for the effects of clinicopathological variables, HistoTIL was found to be independently prognostic on the NSCLC datasets (D1: HR = 1.38; 95% CI, 1.04 to 1.68; P = 0.012; D2: HR = 1.69; 95% CI, 1.01 to 3.03; P = 0.049; D3: HR = 1.21; 95% CI, 1.01 to 1.44; P = 0.038; D4: HR = 5.25; 95% CI, 1.65 to 16.78; P = 0.005) and for gynecological cancers (D5 cervical, n = 10; ovarian, n = 14; and endometrial, n = 25; HR = 5.22; 95% CI, 1.24 to 21.94; P = 0.024; Table 1). Smoking status was also found to be significant in multivariable analysis (never smoker versus previous/current smoker: HR = 0.271; 95% CI, 0.10 to 0.78; P = 0.015) but only in D3. None of the other clinicopathological covariates were found to be significantly associated with OS.

Table 1.

Multivariable analysis of OS with HistoTIL and clinical variables.

ADC, adenocarcinoma; SCC, squamous cell carcinoma; values in bold are statistically significant by two-tailed test (P < 0.05).

Cohort	Characteristic (levels)	Hazard ratio	95% CI	P
D₁ (lung)	Gender (male, female)	0.528	0.096–2.894	0.462
	Smoking status (former/current, never)	0.463	0.049–4.396	0.503
	Histology subtypes (adenocarcinoma, SCC)	0.347	0.023–5.187	0.443
	Clinical stage (III, IV)	3.646	0.292–45.484	0.315
	Treatment (mono, combo)	0.441	0.020–9.566	0.602
	TIL counting (low, high)	1.385	0.231–8.307	0.721
	Risk scores	1.397	1.071–1.822	0.014
D₂ (lung)	Gender (male, female)	0.342	0.056–2.083	0.245
	Smoking status (former/current, never)	0.126	0.010–1.531	0.104
	Histology subtypes (adenocarcinoma, SCC)	0.203	0.017–2.488	0.212
	Clinical stage (III, IV)	3.437	0.179–66.055	0.413
	Risk scores	1.685	1.011–3.029	0.049
D₃ (lung)	Gender (male, female)	2.334	0.746–7.306	0.145
	Smoking status (former/current, never)	0.287	0.098–0.840	0.023
	Histology subtypes (adenocarcinoma, SCC)	0.849	0.19–3.64	0.826
	Clinical stage (III, IV)	13.023	0–Inf.	0.994
	Treatment (mono, combo)	0.802	0.246–2.614	0.715
	PD-L1 (low, high)	0.593	0.200–1.755	0.345
	Risk scores	1.211	1.013–1.447	0.035
D₄ (lung)	Gender (male, female)	1.558	0.744–3.264	0.240
	Smoking status (former/current, never)	1.025	0.324–3.241	0.967
	Histology subtypes (adenocarcinoma, SCC)	0.985	0.409–2.373	0.972
	Clinical stage (III, IV)	2.129	0.553–8.194	0.272
	Treatment (mono, combo)	1.252	0.311–5.046	0.752
	Risk scores	5.253	1.645–16.775	0.005
D₅ (gynecology)	Clinical stage (III, IV)	1.377	0.312–6.078	0.673
	Histology subtypes: Endometrioid	Reference	Reference	Reference
	Histology subtypes: Serous	0.382	0.063–2.315	0.295
	Histology subtypes: SCC	1.980	0–Inf.	1
	Histology subtypes: Clear cell	3.544	0.817–15.369	0.091
	Primary site: Ovarian	Reference	Reference	Reference
	Primary site: Endometrial	1.031	0.238–4.472	0.967
	Primary site: Cervix	1.741	0–Inf.	1
	IO type (mono, combo)	0.081	0–Inf.	0.983
	Risk score	5.217	1.240–21.943	0.024

HistoTIL associated with OS and PFS independent of type of ICI agent

For NSCLC, HistoTIL-defined low-risk groups had significantly longer survival than high-risk groups among patients who received nivolumab (OS: HR = 2.08; 95% CI, 1.27 to 3.41; P < 0.001; PFS: HR = 2.04; 95% CI, 1.34 to 3.10; P < 0.001) and pembrolizumab (OS: HR = 3.19; 95% CI, 1.06 to 9.61; P = 0.041; PFS: HR = 5.34; 95% CI, 1.80 to 15.90; P < 0.001). In the gynecological cohort, patients who received pembrolizumab showed significant survival differences between HistoTIL-predicted risk groups (OS: HR = 4.65; 95% CI, 1.08 to 19.9; P = 0.023; PFS: HR = 3.55; 95% CI, 1.00 to 12.60; P = 0.008). Although patients treated with other ICI agents, which included fewer patients, did not show a significant survival difference between low- and high-risk groups (nivolumab, OS: HR = 2.12; 95% CI, 0.67 to 6.75; P = 0.180; PFS: HR = 2.55; 95% CI, 0.65 to 10; P = 0.088), the effect size (HR) shows a trend toward longer survival for low-risk groups (fig. S2).

Comparison with PD-L1 expression and TIL estimation

While there was no significant survival difference between low and high HistoTIL-defined risk groups in the PD-L1–high group (OS: HR = 2.03; 95% CI, 0.59 to 6.90; P = 0.166; PFS: HR = 0.91; 95% CI, 0.25 to 3.40; P = 0.896; Fig. 5, A and C), there was a significant survival advantage for the HistoTIL-defined low-risk group among patients with low PD-L1 expression for both OS and PFS on D3 (OS: HR = 2.51; 95% CI, 1.15 to 5.47; P = 0.017; PFS: HR = 2.59; 95% CI, 1.35 to 4.99; P = 0.016; Fig. 5, B and D). In contrast, pathologists’ TIL grading failed to stratify patients based on survival in D1; there was also no correlation between manual TIL grading and HistoTIL-predicted risk groups (Fig. 6).

Fig. 5.

Survival analysis of PD-L1 expression.

The Kaplan-Meier survival analysis of OS in high–PD-L1 (A) and low–PD-L1 (B) expression group and PFS in high–PD-L1 (C) and low–PD-L1 (D) expression group on D3.

Fig. 6.

Survival analysis of manual TIL grading and comparing with HistoTIL.

Kaplan-Meier survival analysis of (A) manual TIL grading with OS, (B) manual TIL grading with PFS, and (C) manual TIL grading with OS, three stratified groups. (D) Correlation between manual grading and HistoTIL prediction.

DISCUSSION

Recent advances in computational power and the advent of whole-slide digitization have led to several studies investigating quantitative approaches to interrogate the TME on high-resolution histology images. Research has focused on leveraging artificial intelligence and digital pathology approaches toward answering clinically relevant questions related to detection, diagnosis, prognosis, and treatment response. Broadly speaking, one category of computational pathology approaches has considered handcrafted features based on predefined image patterns such as collagen orientation or the spatial architecture of cancer cells. Another category involves deep learning. These are neural network–based unsupervised feature generation strategies that aim to directly capture patterns from digitized histopathology images without typically requiring specification of predefined image motifs. Deep learning approaches, like handcrafted approaches, have been shown to be able to predict underlying molecular signatures of the tumor and have also been shown to be associated with disease outcome and treatment response. While deep learning has recently achieved strong performance in many medical imaging tasks, its low interpretability could prevent its wide application in the clinical setting. A recent example of the failure of deep learning approaches in real-world clinical settings can be seen in applications for coronavirus disease 2019 (COVID-19). Domain-inspired handcrafted approaches that are human explainable may be an important prerequisite for oncologists in guiding management of ICI-treated patients. AbdulJabbar et al. combined deep learning with RNA sequencing data to reveal the spatial immune landscape of NSCLC, identifying an image signature that was prognostic of recurrence. Beck et al. previously reported on stromal morphologic features and identified their prognostic association in breast cancer. These previous studies have suggested the importance of capturing quantitative and interpretable domain-inspired attributes of the disease; however, these studies were primarily in the context of disease prognosis and outcome prediction. It seems logical, therefore, that these interpretable approaches will play an even more significant role in helping clinicians make treatment decisions and in assessing therapeutic benefit.

In this work, we presented HistoTIL, a computational assay that spatially characterizes TILs in tumor specimens to predict response to ICI in patients with NSCLC and gynecological cancer. HistoTIL is composed of a set of quantitative image descriptors that characterize local TIL–non-TIL interactions within the separate stromal and epithelial compartments of the TME. This is different from measures of TIL density or manual TIL grading, which result in a single number that expresses the average abundance of lymphocytes over an entire slide. In high-risk patients who showed poor response to ICI agents (as determined by HistoTIL), the cancer nuclei clusters dominated the TME and appeared more chaotic, with only limited immune cell infiltration. However, in the ICI responders identified by HistoTIL, there was a preponderance of TIL clusters that appeared to be encasing the cancer nuclei (Fig. 3), possibly hinting at how the ICI agents require the recruitment of the body’s immune defenses to fight against cancer. This is further strengthened by having four completely blinded validation sets from two different cancer types, as HistoTIL was only exposed to a single-institution dataset for training. This is also advantageous because while TILs have recently been shown to be prognostic in various cancer types, their prognostic significance is plagued by the wide variability in manual estimation of TILs between different pathologists.

Moreover, in comparison with current RECIST-based response evaluation, HistoTIL could also predict short-term clinical outcomes across different datasets (Fig. 7). HistoTIL was shown to be associated with survival for NSCLC patients treated with nivolumab and pembrolizumab and for gynecological cancer patients treated with pembrolizumab (Fig. 8 and fig. S2). HistoTIL was also associated with OS in the individual cancer types under gynecological cancer (cervical, n = 10; ovarian, n = 14; and endometrial, n = 25). HistoTIL was associated with clinical outcome independent of the specific ICI agent, likely because features relating to immune cell and cancer cell interplay on H&E images are a reflection of tumor biology and disease aggressiveness. Independent of the type of ICI agent used or disease indication, HistoTIL-predicted low-risk patients had a relatively lower hazard of disease progression or death compared to high-risk patients.

Fig. 7.

Prediction of clinical response by HistoTIL.

Receiver operating characteristic (ROC) analysis for HistoTIL in predicting clinical response by RECIST among five independent cohorts.

Fig. 8.

Survival analysis of NSCLC among ICI agents.

Kaplan-Meier OS assessment among ICI agents (nivolumab, pembrolizumab, and combination therapy) for patients with NSCLC with OS (A to C) and PFS (D to F).

HistoTIL could also significantly stratify two risk groups within the cohort with low PD-L1 expression for both OS and PFS (HR = 2.2; 95% CI, 0.97 to 4.95; P = 0.01; HR = 2.59; 95% CI, 1.35 to 4.99; P = 0.01; Fig. 5, B and D). This finding could potentially suggest that HistoTIL-identified low-risk patients with low PD-L1 expression, who are currently typically recommended for ICI-chemotherapy combination therapy, might be eligible for ICI monotherapy directly. Manual TIL grading of the slides by pathologists (N = 26) did not result in significant stratification of patients into different survival groups in D1; the high-TIL groups even presented with a longer survival trend after ICI (Fig. 6). This might be due to the lack of standard agreement on manual grading and intra-observer variation, which has been previously reported.

Recently, Hu et al. showed a deep learning model to predict anti–PD-1 response in melanoma (N = 54) and lung cancer (N = 55). While Hu et al. also found that the model trained on one tumor type could potentially apply to another tumor type for predicting RECIST-defined response, our results were validated with a larger set (N = 226) with more consistent performance (i.e., prognostic of survival and predictive of RECIST) among different cancers treated with different ICI agents. Moreover, post hoc analysis of deep learning approaches usually requires dividing whole-slide images (WSIs) into smaller image tiles (256 × 256), which can dilute and cause loss of information related to spatial heterogeneity within the TME. In addition, the deep learning model was trained on ImageNet, and hidden features were extracted with further principal components analysis (PCA) projection for response prediction, which added another layer of opacity for explaining the result. In contrast, HistoTIL achieved higher and more robust accuracy among different tumor types using human explainable features.

Recent applications of deep learning on histopathology images have ranged from object (i.e., tumor and cell) detection and classification to prognosticating disease-related survival and response to different treatment options. However, the nature of deep learning–based classifiers makes it difficult to fully interpret the basis of the predictions. It has also been suggested that the opacity associated with deep learning–based predictions could be an impediment to deploying them in a clinical setting for treatment management–based decision-making. Moreover, studies have indicated that biologically inspired image features could provide more transparent and interpretable options in the clinical setting. Hence, our HistoTIL approach leverages deep learning for object detection and segmentation tasks but subsequently uses interpretable and engineered features to capture and describe the tumor-immune interaction.

However, we do acknowledge some limitations of our work. This work was limited by the retrospective nature of data collection. Future work will involve validating HistoTIL on completed clinical trial datasets of cancer patients treated with ICI and also evaluating HistoTIL in a prospective setting. Such a prospective clinical trial validation would also allow for head-to-head comparisons with currently used patient selection biomarkers including PD-L1 and TMB. We also acknowledge that the approach invoking five large image patches for analysis could bias the approach when only smaller samples are available for interrogation.

In conclusion, we presented a computational method to prognosticate survival in advanced NSCLC and gynecological cancer patients treated with multiple ICIs. Our approach, HistoTIL, was trained on a dataset from a single institution and independently validated on datasets from two different institutions for predicting both OS and PFS based on H&E WSIs alone. In addition to being robust to external multisite validation, HistoTIL also uncovered the prognostic significance of the spatial relationships and structural interplay between TILs and tumor cell nuclei in determining response to ICI agents that modulate the body’s immune system. Last, HistoTIL is tissue nondestructive and based on H&E alone, and thus much less time consuming and less expensive, as it eliminates the need for tissue to be processed and/or analyzed with expensive molecular methods.

MATERIALS AND METHODS

Dataset and image acquisition

H&E-stained pretreatment tumor specimens of patients with advanced NSCLC and gynecological cancers who received ICIs were collected from four independent institutions for this study. The ICI treatment response was evaluated independently using the response criteria for solid tumors (RECIST 1.1). All retrospective analyses were conducted under a protocol approved by the institutional review board (IRB), and the requirement of patient informed consent was waived by the IRB. From the University of Pennsylvania Hospital, 42 patients diagnosed with NSCLC between 2008 and 2016 with follow-up information were collected. Tissue quality evaluation was performed at the pathology laboratory to exclude tissue with major artifacts (i.e., tissue folding, blur, staining issues, and anthracosis), and only samples with sufficient tissue (accommodating at least five image patches of size 2000 × 2000 pixels) and complete treatment and follow-up information were included in this study. This process yielded 26 patients to construct the learning set (D1). In addition, we collected 47 (chart review between 2005 and 2016), 81 (chart review between 2009 and 2017), and 85 (chart review between 2011 and 2017) NSCLC patients from the Cleveland Clinic, Yale University, and University Hospital Cleveland Medical Center, respectively. Forty-nine patients with gynecological cancer (cervical, n = 10; ovarian, n = 14; and endometrial, n = 25) were collected from the Cleveland Clinic. After applying the same inclusion and exclusion criteria, we obtained independent validation datasets D2 (N = 30), D3 (N = 63), D4 (N = 68), and D5 (N = 39), respectively (Fig. 1). Detailed demographics including the histology subtypes and ICI agents are available in Table 2.

Table 2.

Demographics and characteristics of the patients.

Lung cohorts (D₁–D₄)
Characteristic	D₁	D₂	D₃	D₄	P
Gender					0.128
	Male	9	19	36	32
	Female	17	11	27	36
Histology					<0.001
	ADC	19	20	47	52
	SCC	7	7	12	13
	Others	0	3	4	3
Smoking status					0.552
	Former/current	21	25	51	61
	Never smoker	5	5	12	7
Clinical stage					0.062
	III	7	5	2	11
	IV	19	25	61	57
Treatment					<0.001
	Nivo	13	30	43	39
	Pembro	7	0	4	22
	Atezo	4	0	6	0
	Combo	2	0	10	7
Age (years), mean ± SD	65.2 ± 10.1	N.A.	68.8 ± 10.5	63.3 ± 12.2	0.015
Systemic treatment before ICI					0.99
	Yes	22	N.A.	53	58
	No	4	N.A.	10	10

Gynecology cohort (D₅)
Characteristic	D₅
Primary site
	Ovarian	12
	Endometrial	20
	Cervix	7
Histology subtypes
	Endometrioid	12
	Serous	14
	SCC	7
	Clear cell	4
	Unknown	2
Clinical stage
	III	27
	IV	12
Treatment
	Nivo	16
	Pembro	20
	Atezo	2
	Combo	1
Age (years), mean ± SD	67.6 ± 12.1
Systemic treatment before ICI
	Yes	35
	No	4

H&E slides from D1–2 were scanned with the Philips IntelliSite Pathology Solution (software version: Philips DP v1.0) at ×40 magnification. An Aperio CS2 tissue scanner (software version: Aperio Image Library v10.2.41) was used to scan the slides of D3–5 at ×20 magnification. To compensate for variations in the tissue imaging process among different scanners and institutions, the open-source software HistoQC was applied to identify digitized tissue slide images eligible for downstream analysis. Expert pathologists manually annotated the tumor region on digitized H&E slides. All subsequent image analyses were conducted at ×20 magnification (images were down-sampled from ×40 to ×20) within the annotated regions (see Fig. 2 for a flowchart illustrating the methodology).

Image analysis: Nuclei detection and region separation

For segmentation of the nuclei, each WSI was divided into a set of adjacent 2000 × 2000 pixel tiles. A U-Net style deep convolutional neural network (CNN) at ×20 magnification was used for automated nuclei segmentation. Following nuclear segmentation, a previously reported lymphocyte detection approach was used to identify TILs and cancer cell nuclei. Another U-Net style CNN model was used to differentiate and identify epithelial and stromal regions at ×5 magnification (see the Supplementary Materials for more details).
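
As a rough illustration of the tiling step, the sketch below enumerates adjacent, non-overlapping 2000 × 2000 pixel tiles over an image of a given size. The names `tile_grid` and `full_tiles` and the truncation of tiles at the image border are assumptions made for illustration; the text does not state how partial border tiles were handled.

```python
def tile_grid(width, height, tile=2000):
    """Return (x, y, w, h) boxes of adjacent, non-overlapping tiles that
    cover a width x height image. Border tiles are truncated (assumption)."""
    if width <= 0 or height <= 0 or tile <= 0:
        raise ValueError("width, height and tile must be positive")
    boxes = []
    for y in range(0, height, tile):
        for x in range(0, width, tile):
            boxes.append((x, y, min(tile, width - x), min(tile, height - y)))
    return boxes


def full_tiles(boxes, tile=2000):
    """Keep only tiles that have the full tile x tile extent."""
    return [b for b in boxes if b[2] == tile and b[3] == tile]


# Hypothetical 5000 x 4000 pixel annotated region.
example_boxes = tile_grid(5000, 4000)
example_full = full_tiles(example_boxes)
```

In this hypothetical region, six tiles are produced, of which four have the full 2000 × 2000 extent.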

Quantitative histomorphometric features

A total of 753 quantitative histomorphometric (QH) features of two main categories were extracted: nuclei-based features and TIL-related features, from regions of stroma, epithelium, and the entire annotated tumor (table S2). Nuclear features were extracted from within the pixels of nuclei regions. The four types of features relating to nuclei were as follows: shape descriptors, orientation entropy, image texture, and local contextual. Shape features included statistical measurements of the nuclei including nuclear area, perimeter, and eccentricity to mathematical operator like Zernike shape descriptors (). Orientation entropy measures the randomness of cell directionality. Nuclear texture quantitatively characterizes the local pixel distribution pattern in the nuclei and is prognostic among different cancer types. Last, local contextual features capture the local neighboring and topological information of individual nuclei. TIL-related features were extracted from the different tissue regions (i.e., epithelium and stroma) and included enumeration of TILs and determination of the architecture and spatial interaction of TILs and cancer cells. Previous work () has shown that a densely packed cluster of these two cell types has a poorer prognosis than a sparse arrangement in early-stage NSCLC. The extent of TIL infiltration was determined through the counts and density of TILs by automated detection (). Aspects of spatial arrangement were measured by first assigning TILs and non-TIL cells into neighboring clusters (see the “Graph construction” section in the Supplementary Materials) and then characterizing relative abundance and spatial closeness of TILs and non-TIL cells within the cluster as quantitative image features. The variations in cell composition (i.e., ratio of number of TILs to non-TIL nuclei) among each neighboring cluster were also captured to reflect the heterogeneity within TME. 
These patterns quantitatively describe the histomorphometry (i.e., shape, staining, and local density) among neighboring cells and, in turn, reflect the complex interaction within the tumor immune microenvironment. Details of each feature description are included in table S2.
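As a rough illustration of the cell-composition features, the sketch below computes the ratio of TILs to non-TIL nuclei in each cluster and one possible measure of its variation across clusters. The cluster assignments are taken as given (the graph construction is described only in the Supplementary Materials). The function names, the choice of population standard deviation as the spread measure, and the skipping of clusters with no non-TIL nuclei are all assumptions made for this example.

```python
from collections import defaultdict
from statistics import pstdev
from typing import Dict, Iterable, Tuple


def til_ratio_per_cluster(cells: Iterable[Tuple[int, bool]]) -> Dict[int, float]:
    """Return the TIL / non-TIL count ratio for each cluster id.

    cells: (cluster_id, is_til) pairs, one per detected nucleus.
    Assumption: clusters with no non-TIL nuclei are skipped (ratio undefined).
    """
    counts = defaultdict(lambda: [0, 0])  # cluster_id -> [n_til, n_non_til]
    for cluster_id, is_til in cells:
        counts[cluster_id][0 if is_til else 1] += 1
    return {cid: n_til / n_non for cid, (n_til, n_non) in counts.items() if n_non > 0}


def ratio_heterogeneity(ratios: Dict[int, float]) -> float:
    """Spread of per-cluster ratios (population standard deviation, an assumed choice)."""
    values = list(ratios.values())
    return pstdev(values) if len(values) > 1 else 0.0


# Toy input: cluster 0 has 1 TIL and 2 non-TIL nuclei, cluster 1 has 2 TILs and 1 non-TIL.
cells = [(0, True), (0, False), (0, False), (1, True), (1, True), (1, False)]
ratios = til_ratio_per_cluster(cells)
spread = ratio_heterogeneity(ratios)
```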

HistoTIL: An image-based digital risk score

The top-performing QH features were selected from the 753 features by applying an elastic net-regularized Cox proportional hazards model on D1. Tenfold cross-validation was used to select the regularization parameter (lambda) by grid search, and the mixing hyperparameter (alpha), which controls the ratio between the L1 and L2 penalties, was chosen by varying its value from 0 to 1 in steps of 0.1. The selected features were then used to fit a second Cox proportional hazards model on D1 to learn a coefficient for each selected feature. Survival indices (risk scores) were calculated as a linear combination of the selected features and the learned coefficients. The median index from the training set D1 was chosen as the threshold to divide patients into low- and high-risk groups for OS and PFS. D1 to D4 were composed entirely of metastatic NSCLC patients. To evaluate the robustness of HistoTIL, another independently collected set (D5) of 39 gynecological cancer patients was used for validation with the same cutoff value derived from D1. We evaluated the association of HistoTIL with PFS and OS on independent sets D2, D3, D4, and D5.
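The scoring and thresholding steps described above can be sketched as below. This covers only the linear risk score, the training-set median cutoff, and the alpha grid; the Cox model fitting itself is not shown. All names are hypothetical, the coefficients in the example are made-up numbers, and treating scores strictly above the cutoff as high risk is an assumption, since the text does not specify how ties are handled.

```python
from statistics import median
from typing import List, Sequence

# Grid for the elastic net mixing parameter: 0.0, 0.1, ..., 1.0.
ALPHA_GRID: List[float] = [i / 10 for i in range(11)]


def risk_score(features: Sequence[float], coefficients: Sequence[float]) -> float:
    """Linear combination of selected feature values and learned coefficients."""
    return sum(x * b for x, b in zip(features, coefficients))


def training_cutoff(train_features: Sequence[Sequence[float]],
                    coefficients: Sequence[float]) -> float:
    """Median risk score over the training set, used as the fixed threshold."""
    return median(risk_score(row, coefficients) for row in train_features)


def risk_group(features: Sequence[float], coefficients: Sequence[float],
               cutoff: float) -> str:
    """Assign a patient to a risk group.

    Assumption: scores strictly above the cutoff are labeled high risk.
    """
    return "high" if risk_score(features, coefficients) > cutoff else "low"


# Illustrative (made-up) coefficients and training feature rows.
coefs = [0.5, -1.0]
train = [[1, 0], [0, 1], [2, 1], [4, 1]]
cutoff = training_cutoff(train, coefs)
```

The same `cutoff` would then be reused unchanged when scoring patients from the validation sets, mirroring the use of the D1-derived threshold on D2 to D5.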

Statistical analysis

OS was measured from the date of initiation of ICI to the date of death and censored at the date of last follow-up for survivors. PFS was calculated from the date of initiation of ICI to the date of recurrence or death, whichever occurred earlier, and censored at the date of last follow-up for those still alive without recurrence. The HistoTIL classifier assigned every patient to either a low- or high-risk survival group based on the median risk score learned from training set D1. Kaplan-Meier survival curves were obtained to visualize the differences between risk groups in sets D2, D3, D4, and D5. The log-rank test was then used to evaluate the prognostic performance of HistoTIL in predicting PFS and OS on the validation sets. HRs between survival groups, with 95% CIs and P values, were calculated. Multivariable Cox analysis was conducted to assess the relationships between the various covariates and survival while adjusting for baseline factors (i.e., tumor subtypes and pathological stage). The Fisher exact test was used to test for differences in demographics among NSCLC patients across the training and validation datasets. All statistical tests were two sided, with the significance level set at 0.05. In addition, we built a logistic regression model using the same HistoTIL features identified from D1 to predict patients’ response versus nonresponse to ICI. The response category included both complete response (CR) and partial response (PR), whereas the nonresponse category included progressive disease (PD) and stable disease (SD), as defined by RECIST (). Response prediction was evaluated by the area under the receiver operating characteristic (ROC) curve (AUC).
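For readers unfamiliar with how censoring enters the survival curves, the following is a minimal sketch of the standard Kaplan-Meier product-limit estimator in plain Python. It is a textbook illustration rather than the software used in the study, and the function name and toy data are made up.

```python
from typing import List, Sequence, Tuple


def kaplan_meier(times: Sequence[float], events: Sequence[int]) -> List[Tuple[float, float]]:
    """Return [(t, S(t))] at each distinct time with at least one observed event.

    events[i] is 1 if the event (e.g., death) was observed at times[i],
    and 0 if the patient was censored at times[i].
    """
    at_risk = len(times)
    survival = 1.0
    curve = []
    for t in sorted(set(times)):
        observed = sum(1 for ti, ei in zip(times, events) if ti == t and ei == 1)
        leaving = sum(1 for ti in times if ti == t)  # events plus censored at t
        if observed:
            survival *= 1.0 - observed / at_risk
            curve.append((t, survival))
        at_risk -= leaving
    return curve


# Toy data: five patients, follow-up in months; 1 = event observed, 0 = censored.
curve = kaplan_meier([1, 2, 2, 3, 4], [1, 1, 0, 1, 0])
```

Patients censored at a given time are counted as still at risk at that time and are removed from the risk set afterward, which is the usual convention.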

ICI agent separated analysis

To evaluate whether HistoTIL could stratify ICI-treated patients into low- and high-risk categories independent of the specific agent (i.e., nivolumab, N = 141; pembrolizumab, N = 53; atezolizumab, N = 12; and combination of agents, N = 20), we further conducted Kaplan-Meier survival analysis for each subset of patients treated with a given ICI agent. The same HistoTIL cutoff value learned from D1 was used to assign patients to low- and high-risk survival groups.

Comparison with PD-L1 expression and manual TIL estimation

HistoTIL was also compared with the current PD-L1 expression–based patient selection strategy, especially to evaluate the performance of HistoTIL in the low–PD-L1 expression group. D3, for which IHC PD-L1 expression data were available, was used to validate the HistoTIL prediction of clinical outcome after ICI in the PD-L1 low and high groups separately (the PD-L1 cutoff was empirically set at 1% expression). Expert thoracic pathologists with experience in TIL grading determined the level of lymphocyte infiltration for each WSI in D1. Four categories were provided to the pathologists for TIL grading: no infiltration (visual absence of TILs from the entire slide), low (TILs found in 1 to 33% of the tumor area), moderate (TILs found in 34 to 66% of the tumor region), and high (more than 66% of the tumor bed infiltrated with lymphocytes). The same Kaplan-Meier survival analysis was conducted to compare survival between the manually defined low- and high-TIL groups.
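The four grading categories can be written down as a simple lookup, shown below purely to make the category boundaries explicit. In the study the grades were assigned visually by pathologists, not computed. The function name is hypothetical, and the handling of fractional values (for example, between 33 and 34%) is an assumption because the text gives integer ranges only.

```python
def til_grade(percent_infiltrated: float) -> str:
    """Map the percentage of tumor area infiltrated by TILs to a grading category.

    Assumption: fractional values fall into the next category up once they
    exceed the stated upper bound (e.g., 33.5 is graded "moderate").
    """
    if not 0 <= percent_infiltrated <= 100:
        raise ValueError("percentage must be between 0 and 100")
    if percent_infiltrated == 0:
        return "none"      # visual absence of TILs
    if percent_infiltrated <= 33:
        return "low"       # 1 to 33%
    if percent_infiltrated <= 66:
        return "moderate"  # 34 to 66%
    return "high"          # more than 66%
```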
