Development of a Deep Learning Model to Identify Lymph Node Metastasis on Magnetic Resonance Imaging in Patients With Cervical Cancer.

Qingxia Wu^5,6,7, Shuo Wang^2,3, Shuixing Zhang⁴, Meiyun Wang^5,6,7, Yingying Ding⁸, Jin Fang⁴, Qingxia Wu^5,6,7, Wei Qian⁹, Zhenyu Liu^2,10, Kai Sun¹¹, Yan Jin⁸, He Ma¹, Jie Tian^1,2,3,10.

Abstract

Importance: Accurate identification of lymph node metastasis preoperatively and noninvasively in patients with cervical cancer can avoid unnecessary surgical intervention and benefit treatment planning.

Objective: To develop a deep learning model using preoperative magnetic resonance imaging for prediction of lymph node metastasis in cervical cancer.

Design, Setting, and Participants: This diagnostic study developed an end-to-end deep learning model to identify lymph node metastasis in cervical cancer using magnetic resonance imaging (MRI). A total of 894 patients with stage IB to IIB cervical cancer who underwent radical hysterectomy and pelvic lymphadenectomy were reviewed. Eligible patients had received pelvic MRI within 2 weeks before the operation, had no concurrent cancers, and had received no preoperative treatment. To achieve the optimal model, the diagnostic value of 3 MRI sequences was compared, and the outcomes in the intratumoral and peritumoral regions were explored. To mine tumor information from both image and clinicopathologic levels, a hybrid model was built and its prognostic value was assessed by Kaplan-Meier analysis. The deep learning model and hybrid model were developed on a primary cohort consisting of 338 patients (218 patients from Sun Yat-sen University Cancer Center, Guangzhou, China, between January 2011 and December 2017 and 120 patients from Henan Provincial People's Hospital, Zhengzhou, China, between December 2016 and June 2018). The models were then evaluated on an independent validation cohort consisting of 141 patients from Yunnan Cancer Hospital, Kunming, China, between January 2011 and December 2017.

Main Outcomes and Measures: The primary diagnostic outcome was lymph node metastasis status, with the pathologic characteristics diagnosed by lymphadenectomy. The primary clinical outcome was survival. The primary diagnostic outcome was assessed by receiver operating characteristic (area under the curve [AUC]) analysis; the primary clinical outcome was assessed by Kaplan-Meier survival analysis.

Results: A total of 479 patients (mean [SD] age, 49.1 [9.7] years) fulfilled the eligibility criteria and were enrolled in the primary (n = 338) and validation (n = 141) cohorts. A total of 71 patients (21.0%) in the primary cohort and 32 patients (22.7%) in the validation cohort had lymph node metastasis confirmed by lymphadenectomy. Among the 3 image sequences, the deep learning model that used both intratumoral and peritumoral regions on contrast-enhanced T1-weighted imaging showed the best performance (AUC, 0.844; 95% CI, 0.780-0.907). These results were further improved in a hybrid model that combined tumor image information mined by the deep learning model and MRI-reported lymph node status (AUC, 0.933; 95% CI, 0.887-0.979). Moreover, the hybrid model was significantly associated with disease-free survival from cervical cancer (hazard ratio, 4.59; 95% CI, 2.04-10.31; P < .001).

Conclusions and Relevance: The findings of this study suggest that deep learning can be used as a preoperative noninvasive tool to diagnose lymph node metastasis in cervical cancer.

Year: 2020 PMID: 32706384 PMCID: PMC7382006 DOI: 10.1001/jamanetworkopen.2020.11625

Source DB: PubMed Journal: JAMA Netw Open ISSN: 2574-3805

Introduction

Cervical cancer is one of the most common cancers among women.[1] The treatment and management of cervical cancer are often guided by the International Federation of Gynaecology and Obstetrics (FIGO) staging system, which is based on clinical assessment and imaging rather than invasive investigations, such as surgery.[2] In the 2018 FIGO staging system, once lymph node (LN) metastasis (LNM) is identified either by imaging or pathologic testing, the cancer is classified as stage IIIC irrespective of other findings.[3] Moreover, LNM has been reported to be associated with prognosis and treatment planning in cervical cancer.[4,5] Specifically, patients who show evidence of LNM may undergo chemoradiotherapy rather than surgery as their first choice,[6] avoiding surgery followed by adjuvant chemoradiotherapy and the serious complications that may follow.[7,8] Therefore, accurate identification of LN status preoperatively in patients with cervical cancer might avoid unnecessary surgical intervention and benefit treatment planning.

Magnetic resonance imaging (MRI), a commonly used imaging modality in cervical cancer,[9] provides a preoperative method for assessing LN status in cervical cancer. However, the traditional methods, which rely mainly on assessing the size of LNs on MRI, have limited sensitivity in diagnosing LNM in cervical cancer and might lead to inappropriate treatment decisions.[10,11,12] Many attempts have been made to improve the performance of MRI in diagnosing LNM before surgery, for example, using radiomic analysis, which extracts quantitative human-defined image features, such as shape, intensity, and texture features.[13,14,15,16] In previous research, the sensitivity of MR images to discriminate metastatic from nonmetastatic LN has shown improvement by using radiomic features.[13] However, radiomic features need time-consuming tumor delineation, and they might not be adaptive to specific clinical issues.
Deep learning (DL) as an artificial intelligence method has recently shown promising performance in many medical image analysis tasks,[17,18,19] such as diagnosing Alzheimer disease,[20] screening for breast cancer,[21] and detecting thoracic diseases.[22] Moreover, DL also exhibited predictive performance in cervical cancer, such as screening and predicting toxic rectal reactions to radiotherapy.[23,24] Compared with traditional methods, DL has an advantage in automatically learning and hierarchically organizing task-adaptive image features.[25] Even though these features cannot be identified visually, they tend to reflect the high-dimensional association between images and clinical issues.[26] Furthermore, DL does not require precise tumor delineation, making it an easy-to-use method in clinical practice. In many tumor analysis tasks, DL outperforms traditional radiomic features.[27,28,29] In this research, we aimed to develop a DL model to provide a preoperative noninvasive tool for diagnosing LNM in cervical cancer.

Methods

Two outcomes were studied. The primary diagnostic outcome was LNM status, with the pathologic characteristics diagnosed by lymphadenectomy. We first developed a DL model that used MR images to diagnose LNM. Then we proposed a hybrid model that integrated tumor image information and MRI-reported LN (MRI-LN) status. Herein, MRI-LN status was defined as positive if the short-axis diameter of the largest LN shown on MRI was equal to or larger than 1 cm.[10] We assessed the models' performance by receiver operating characteristic analysis. The primary clinical outcome was disease-free survival (DFS). We assessed the prognostic ability of the hybrid model with regard to DFS by the Kaplan-Meier method.

The institutional review boards of Sun Yat-sen University Cancer Center, Henan Provincial People's Hospital, and Yunnan Cancer Hospital approved this retrospective study with deidentified data, and the need for informed consent from patients was waived. This study followed the Standards for Reporting of Diagnostic Accuracy (STARD) reporting guideline for diagnostic studies.

A total of 479 patients with cervical cancer who underwent radical hysterectomy and pelvic lymphadenectomy were enrolled in this research. A total of 338 patients from Sun Yat-sen University Cancer Center (n = 218, from January 2011 to December 2017) and Henan Provincial People's Hospital (n = 120, from December 2016 to June 2018) composed the primary cohort, and 141 patients from Yunnan Cancer Hospital between January 2011 and December 2017 composed the independent validation cohort. All of these patients met the following inclusion criteria: (1) pathologically confirmed cervical cancer; (2) pelvic MRI performed within 2 weeks before the operation; (3) complete clinicopathologic data available, such as age, FIGO stage, histologic characteristics, differentiation, lymphovascular space invasion, LNM, and MRI-LN status; (4) no concurrent cancers; and (5) no preoperative treatment. We excluded patients if the tumor lesions were not visible on MRI or if the image quality was poor as assessed by 2 radiologists (Q.W. and J.F.) with more than 9 years' experience and blinded to all clinical information. The recruitment pathway is shown in eFigure 1 in the Supplement.

After surgery, patients from Sun Yat-sen University Cancer Center and Yunnan Cancer Hospital were followed up with MRI or positron emission tomographic and computed tomographic imaging every 3 to 4 months for the first 2 years, every 6 months from the third to fifth years, and then annually. The end point of this study was DFS, which was defined as the period from the date of the operation to the date of the first local-regional recurrence, distant metastasis, all-cause mortality, or the latest follow-up used for censoring. Local-regional recurrences and distant metastasis were confirmed by gynecologic examination; imaging modalities, such as computed tomographic imaging, MRI, and positron emission tomographic and computed tomographic imaging; or biopsy findings.

Image Acquisition and Preprocessing

All patients underwent pelvic MRI scans, including sagittal contrast-enhanced T1-weighted imaging (CET1WI), axial T2-weighted imaging (T2WI), and axial diffusion-weighted imaging (DWI). Magnetic resonance imaging scanning parameters are described in eMethods 1 in the Supplement. We generated apparent diffusion coefficient (ADC) maps to analyze DWI sequence (b values, 0 and 800 s/mm2). To extract tumor information for analysis, the same 2 radiologists (Q.W. and J.F.) used rectangular bounding boxes for the region of interest (ROI) to tightly encapsulate tumors on MRI. This tight ROI was defined as ROI tumor. Because peritumoral regions were reported to have diagnostic value in predicting LN status,[13] we also expanded ROI tumor by 5 pixels to add peritumoral information, defined as ROI tumor + peritumoral. Examples of ROI tumor and ROI tumor + peritumoral are shown in Figure 1.
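The ROI expansion described above can be sketched in a few lines of Python. This is only an illustration, not the authors' code: the `Box` type, the `expand_roi` helper, the example coordinates, and the 256 × 256 image size are hypothetical, and it assumes that "expanded by 5 pixels" means 5 pixels on every side, clipped at the image border.

```python
from typing import NamedTuple


class Box(NamedTuple):
    """Axis-aligned bounding box in pixel coordinates (min inclusive, max exclusive)."""
    x_min: int
    y_min: int
    x_max: int
    y_max: int


def expand_roi(box: Box, margin: int, width: int, height: int) -> Box:
    """Grow the box by `margin` pixels on every side, clipped to the image bounds."""
    return Box(
        max(box.x_min - margin, 0),
        max(box.y_min - margin, 0),
        min(box.x_max + margin, width),
        min(box.y_max + margin, height),
    )


# Hypothetical tight tumor box on a hypothetical 256 x 256 section.
roi_tumor = Box(100, 120, 160, 170)
# Assumption: a 5-pixel margin on each side gives the tumor + peritumoral ROI.
roi_tumor_peritumoral = expand_roi(roi_tumor, margin=5, width=256, height=256)
```

Clipping keeps the expanded box valid when a tumor lies near the edge of the section; the source does not say how such cases were handled.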

Figure 1.

Illustration of the DL Model and the Hybrid Model

The blue box on sagittal contrast-enhanced T1-weighted imaging (CET1WI) is a region of interest (ROI) tumor (tightly encapsulated tumor); the green box on sagittal CET1WI is an ROI tumor + peritumoral (5 pixels larger than the ROI tumor). Every 3 adjacent magnetic resonance imaging (MRI) sections were combined and scaled to 64 × 64 voxel size for deep learning (DL) analysis. The DL model consists of subnetworks 1 and 2, which are the stack of multiple convolutions, batch normalization, zero padding, and pooling layers. Feeding a tumor image, the DL model predicts the lymph node metastasis (LNM) probability (defined as DL score). The hybrid model consists of subnetworks 1 and 3, which integrate with the clinical variable (MRI-LN status). Feeding tumor images and the MRI-LN status of a patient, the hybrid model predicts the LNM probability at the end of subnetwork 3 (defined as H score).

Model Development and Visualization

We developed an end-to-end DL model for LNM prediction (subnetworks 1 and 2 in Figure 1). The network was the stack of multiple convolutions, zero padding, and batch normalization layers. Layers were basic computational units in DL models,[30] and the links of layers were similar to connections between neurons in brains; details of the layers are presented in eMethods 2 in the Supplement. Subnetwork 1 was similar to ResNet18, a widely used deep learning model,[31,32,33] and the detailed network architecture is described in eMethods 3 and eFigure 2 in the Supplement. To enhance model training, subnetwork 1 was pretrained by 14 million natural images from the ImageNet data set[34,35] and was fine-tuned using images from the primary cohort that comprised 5280 CET1WI, 1633 T2WI, and 1474 ADC map image sections. When an MR image of the tumor was fed into the DL model, subnetwork 2 predicted the LNM probability for the tumor. We defined the DL model–predicted LNM probability as the DL score. Owing to the inconsistency of previous research about the performance of MRI sequences,[13,14,15,16] we compared the DL model among 3 MRI sequences to find the optimal model for LNM prediction. As some preoperative clinical characteristics of cervical cancer have been reported to be associated with LNM,[36] we evaluated 3 preoperative clinical factors (age, FIGO stage, and MRI-LN status) and selected the significant factors (P < .05) in the primary cohort to build clinical models. Because the DL model can mine high-dimensional information from MRI and clinical features can reflect tumor information from clinicopathologic aspects, we developed a hybrid model to combine information from these sources to explore whether they can be complementary (subnetworks 1 and 3 in Figure 1). We defined the hybrid model–predicted LNM probability as the H score. Detailed training processes of the DL and hybrid models are described in eMethods 4 in the Supplement. 
To gain further intuition and explore the underlying basis of the end-to-end DL model, we applied visualization algorithms to display how the network learned the LNM-related information (eMethods 5 in the Supplement).[37] We evaluated the DL model using the following methods: (1) visually assessing the area in the tumor that drew the attention of the DL model (defined as attention map), (2) visualizing convolutional features learned by the network (defined as DL feature), and (3) exploring the association between the DL feature and LN status. A discriminative DL feature should have different responses between patients with node-negative and node-positive findings.

Statistical Analysis

All statistical analyses were performed with R, version 3.5.1 software (R Project for Statistical Computing). The statistical difference of clinical variables was assessed with an unpaired, 2-tailed χ2 test for categorical variables or t test for continuous variables. The Mann-Whitney test was applied to assess the difference of the DL score between patients with node-negative and node-positive findings. The DeLong test was applied to assess the difference of the receiver operating characteristic curves between different models.[38] The Kaplan-Meier method and 2-sided log-rank tests were applied to estimate DFS. P < .05 indicated a statistically significant difference.
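The AUC used throughout the Results has a direct link to the Mann-Whitney statistic: it equals the probability that a randomly chosen node-positive patient has a higher score than a randomly chosen node-negative patient, with ties counted as one half. The analyses in this study were run in R; the Python sketch below only illustrates that relationship. The function name `auc_mann_whitney` and the scores are hypothetical and are not study data.

```python
from typing import Sequence


def auc_mann_whitney(pos_scores: Sequence[float], neg_scores: Sequence[float]) -> float:
    """AUC as the fraction of (positive, negative) pairs ranked correctly; ties count 0.5."""
    if not pos_scores or not neg_scores:
        raise ValueError("both groups must be non-empty")
    wins = 0.0
    for p in pos_scores:
        for n in neg_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos_scores) * len(neg_scores))


# Hypothetical DL scores for illustration only.
node_positive = [0.58, 0.67, 0.46, 0.35]
node_negative = [0.34, 0.27, 0.43, 0.35, 0.50]
auc = auc_mann_whitney(node_positive, node_negative)
```

The pairwise loop is quadratic in the number of patients, which is acceptable for cohorts of a few hundred; rank-based formulas give the same value faster.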

Results

We reviewed 894 patients with stage IB to IIB cervical cancer who underwent radical hysterectomy and pelvic lymphadenectomy; 479 patients fulfilled the eligibility criteria and were enrolled in the primary (n = 338) and validation (n = 141) cohorts. The mean (SD) age of the patients was 49.1 (9.7) years. A total of 71 patients (21.0%) in the primary cohort and 32 patients (22.7%) in the validation cohort had LNM confirmed by lymphadenectomy (Table 1). As of December 2017, 188 patients from Sun Yat-sen University Cancer Center (30 lost to follow-up) and 128 patients from Yunnan Cancer Hospital (13 lost to follow-up) had completed the DFS follow-up.

Table 1.

Characteristics of Patients in the Primary and Validation Cohorts

Characteristic	Primary cohort (n = 338): No LNM	Primary cohort (n = 338): LNM	P value^a	Validation cohort (n = 141): No LNM	Validation cohort (n = 141): LNM	P value^a	P value^b
Patients, No. (%)	267 (79.0)	71 (21.0)		109 (77.3)	32 (22.7)		.77
Age, mean (SD), y	49.9 (9.5)	48.8 (10.0)	.40	48.0 (10.2)	47.6 (9.1)	.84	.07
FIGO stage, No. (%)^c
IB	145 (54.3)	28 (39.4)	<.001	81 (74.3)	22 (68.8)	.68	<.001
IIA	108 (40.4)	29 (40.8)		23 (21.1)	9 (28.1)
IIB	14 (5.2)	14 (19.7)		5 (4.6)	1 (3.1)
Differentiation grade, No. (%)
Low	139 (52.1)	43 (60.6)	.44	51 (46.8)	19 (59.4)	.37	.65
Middle	124 (46.4)	27 (38.0)		56 (51.4)	12 (37.5)
High	4 (1.5)	1 (1.4)		2 (1.8)	1 (3.1)
MRI-LN status, No. (%)
Negative	252 (94.4)	45 (63.4)	<.001	103 (94.5)	25 (78.1)	.01	.45
Positive	15 (5.6)	26 (36.6)	<.001	6 (5.5)	7 (21.9)	.01	.45
Histologic characteristic, No. (%)
Squamous cell carcinoma	225 (84.3)	61 (85.9)	.96	94 (86.2)	28 (87.5)	.91	.73
Adenocarcinoma	31 (11.6)	8 (11.3)		12 (11.0)	3 (9.4)
Adenosquamous carcinoma	6 (2.2)	1 (1.4)		1 (0.9)	0
Small cell carcinoma	5 (1.9)	1 (1.4)		2 (1.8)	1 (3.1)
LVSI, No. (%)
Negative	185 (69.3)	28 (39.4)	<.001	96 (88.1)	22 (68.8)	.02	<.001
Positive	82 (30.7)	43 (60.6)	<.001	13 (11.9)	10 (31.2)	.02	<.001

Abbreviations: FIGO, International Federation of Gynaecology and Obstetrics; LNM, lymph node metastasis; LVSI, lymphovascular space invasion; MRI-LN, magnetic resonance imaging–reported lymph node.

^a P values were derived from the univariable association analyses of each clinicopathologic variable between patients with and without LNM in the primary and validation cohorts.

^b P values represent the difference of each clinicopathologic variable between the primary and validation cohorts.

^c 2009 FIGO staging.[39]

Diagnostic Performance of the Models

The MRI-LN status exhibited specificity of 94.38% in the primary cohort and 94.50% in the validation cohort, and sensitivity of 36.62% in the primary cohort and 21.88% in the validation cohort. The clinical model, which incorporated FIGO stage and MRI-LN status, yielded area under the curve (AUC) values of 0.704 (95% CI, 0.633-0.776) in the primary cohort and 0.622 (95% CI, 0.519-0.725) in the validation cohort (Table 2).
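The sensitivity and specificity quoted for MRI-LN status follow directly from the MRI-LN rows of Table 1, treating pathologically confirmed LNM as the reference standard. The sketch below reproduces that arithmetic; the helper name `diagnostic_metrics` is hypothetical.

```python
from typing import Dict


def diagnostic_metrics(tp: int, fn: int, tn: int, fp: int) -> Dict[str, float]:
    """Sensitivity, specificity, and accuracy (in percent) from confusion-matrix counts."""
    total = tp + fn + tn + fp
    return {
        "sensitivity": 100.0 * tp / (tp + fn),
        "specificity": 100.0 * tn / (tn + fp),
        "accuracy": 100.0 * (tp + tn) / total,
    }


# Counts from Table 1: MRI-LN positive or negative, split by LNM status.
# Primary cohort: 26 of 71 LNM patients were MRI-LN positive; 252 of 267 without LNM were negative.
primary = diagnostic_metrics(tp=26, fn=45, tn=252, fp=15)
# Validation cohort: 7 of 32 LNM patients were MRI-LN positive; 103 of 109 without LNM were negative.
validation = diagnostic_metrics(tp=7, fn=25, tn=103, fp=6)

for name, metrics in (("primary", primary), ("validation", validation)):
    print(name, {k: round(v, 2) for k, v in metrics.items()})
```

The rounded values agree with the MRI-LN status row of Table 2, which shows that the size criterion misses most metastatic cases even though it rarely flags a node-negative patient.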

Table 2.

Diagnostic Performance of Various Models

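Model	Primary cohort: AUC (95% CI)	Primary cohort: Accuracy, % (95% CI)	Primary cohort: Sensitivity, % (95% CI)	Primary cohort: Specificity, % (95% CI)	Validation cohort: AUC (95% CI)	Validation cohort: Accuracy, % (95% CI)	Validation cohort: Sensitivity, % (95% CI)	Validation cohort: Specificity, % (95% CI)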
Clinical
MRI-LN status	0.655 (0.597-0.713)	82.25 (77.75-86.17)	36.62 (25.75-48.95)	94.38 (90.71-96.71)	0.582 (0.506-0.658)	78.01 (70.27-84.55)	21.88 (9.94-40.44)	94.50 (87.92-97.74)^a
FIGO stage	0.604 (0.532-0.674)	55.62 (50.15-61.00)	60.56 (48.23-71.74)	54.31 (48.13-60.36)	0.525 (0.434-0.616)	64.54 (56.05-72.41)	31.25 (16.75-50.14)	74.31 (64.89-81.99)
MRI-LN status + FIGO stage	0.704 (0.633-0.776)	80.47 (75.84-84.56)	45.07 (33.40-57.28)	89.89 (85.47-93.11)	0.622 (0.519-0.725)	66.67 (58.24-74.37)	50.00 (32.24-67.76)	71.56 (61.99-79.59)
Deep learning
CET1WI tumor + peritumoral	0.894 (0.857-0.931)	75.15 (70.18-79.66)	88.73 (78.47-94.66)	71.54 (65.65-76.79)	0.844 (0.780-0.907)	74.47 (66.45-81.43)	87.50 (70.07-95.92)	70.64 (61.03-78.78)
CET1WI tumor	0.845 (0.794-0.896)	76.92 (72.06-81.31)	78.87 (67.25-87.32)	76.40 (70.76-81.27)	0.742 (0.651-0.833)	60.99 (52.43-69.09)	81.25 (62.96-92.14)	55.05 (45.24-64.49)
T2WI tumor + peritumoral	0.671 (0.601-0.742)	56.51 (51.04-61.86)	78.87 (67.25-87.32)	50.56 (44.41-56.69)	0.651 (0.540-0.762)	78.72 (71.04-85.16)	37.50 (21.66-56.25)	90.83 (83.38-95.27)
ADC tumor + peritumoral	0.702 (0.634-0.770)	71.01 (65.85-75.79)	59.15 (46.84-70.47)	74.16 (68.39-79.21)	0.667 (0.563-0.770)	58.87 (50.27-67.08)	78.12 (59.56-90.06)	53.21 (43.45-62.75)
Hybrid
CET1WI tumor + peritumoral + MRI-LN status	0.963 (0.930-0.996)^a	96.45 (93.88-98.15)^a	92.96 (83.65-97.38)^a	97.38 (94.44-98.85)^a	0.933 (0.887-0.979)^a	87.94 (81.40-92.82)^a	90.62 (73.83-97.55)^a	87.16 (79.06-92.55)

Abbreviations: ADC, apparent diffusion coefficient; AUC, area under the receiver operating characteristic curve; CET1WI, contrast-enhanced T1-weighted imaging; FIGO, International Federation of Gynaecology and Obstetrics; MRI-LN, magnetic resonance imaging–reported lymph node; T2WI, T2-weighted imaging.

^a Best performance.

Among all the DL models (Figure 2A,B), the CET1WI tumor + peritumoral model showed the best performance in detecting metastatic LN in both the primary cohort (AUC, 0.894; 95% CI, 0.857-0.931) and validation cohort (AUC, 0.844; 95% CI, 0.780-0.907). The DL score determined from CET1WI tumor + peritumoral revealed a significant difference between patients with node-positive and node-negative findings in both the primary (0.58; interquartile range [IQR], 0.46-0.67 vs 0.34; IQR, 0.27-0.43; P < .001) and validation (0.47; IQR, 0.43-0.56 vs 0.35; IQR, 0.27-0.43; P < .001) cohorts (eFigure 3A in the Supplement). We found that the DL model using both intratumoral and peritumoral regions (CET1WI tumor + peritumoral) outperformed the model using only intratumoral regions (CET1WI tumor) in the primary (AUC, 0.894; 95% CI, 0.857-0.931 vs AUC, 0.845; 95% CI, 0.794-0.896; P = .006) and validation (AUC, 0.844; 95% CI, 0.780-0.907 vs AUC, 0.742; 95% CI, 0.651-0.833; P = .006) cohorts (Table 2).

Figure 2.

Performance of Various Models

Receiver operating characteristic (ROC) curves in the primary (A) and validation (B) cohorts of the contrast-enhanced T1-weighted imaging (CET1WI) tumor + peritumoral + clinical, CET1WI tumor + peritumoral, CET1WI tumor, apparent diffusion coefficient (ADC) tumor + peritumoral, T2-weighted imaging (T2WI) tumor + peritumoral, and clinical model. Survival curves according to the H score from the hybrid model with Kaplan-Meier (K-M) analysis in the primary (C) and validation (D) cohorts. DFS indicates disease-free survival.

To further assess the added value of the DL model to the MRI-LN status, we conducted stratified analysis within MRI-LN subgroups. Within the negative MRI-LN subgroup, the DL score achieved an AUC of 0.877 (95% CI, 0.828-0.926) in the primary cohort and 0.841 (95% CI, 0.772-0.911) in the validation cohort. Within the positive MRI-LN subgroup, the AUC was 0.956 (95% CI, 0.893-1.000) in the primary cohort and 0.905 (95% CI, 0.707-1.000) in the validation cohort. Moreover, the DL score exhibited a significant difference between patients with node-positive and node-negative findings in the primary cohort (DL score among MRI-LN-positive patients: node-positive vs node-negative, 0.60; IQR, 0.52-0.67 vs 0.29; IQR, 0.26-0.36; P < .001; DL score among MRI-LN-negative patients: node-positive vs node-negative, 0.56; IQR, 0.45-0.67 vs 0.35; IQR, 0.27-0.43; P < .001) and validation cohort (DL score among MRI-LN-positive patients: node-positive vs node-negative, 0.45; IQR, 0.43-0.56 vs 0.35; IQR, 0.29-0.38; P < .001; DL score among MRI-LN-negative patients: node-positive vs node-negative, 0.47; IQR, 0.44-0.56 vs 0.35; IQR, 0.27-0.43; P < .001) (eFigure 3B in the Supplement).

To further illustrate the predictive performance of the DL model, we depicted 4 representative prediction results in Figure 3. The 4 patients had similar clinicopathologic characteristics, making it difficult to identify LN status by clinical characteristics and visual observation on MRI. However, the DL model was able to generate discriminative predictive value.

Figure 3.

Representative Prediction Results From the Validation Cohort

The blue boxes on sagittal contrast-enhanced T1-weighted imaging (CET1WI) are region of interest (ROI) tumor, the green boxes on sagittal CET1WI and axial T2-weighted imaging (T2WI) are ROI tumor + peritumoral, and the yellow boxes on axial diffusion-weighted imaging (DWI) are lymph nodes. Positive magnetic resonance imaging (MRI)-reported lymph node (MRI-LN) status was assessed by the short-axis diameter of the largest lymph node larger than 10 mm. DL indicates deep learning; SCC, squamous cell carcinoma.

Because the DL model of CET1WI tumor + peritumoral exhibited the highest sensitivity and MRI-LN status exhibited the highest specificity, a combined hybrid model was established. The hybrid model showed significant improvement over the DL model in both the primary cohort (AUC, 0.963; 95% CI, 0.930-0.996 vs AUC, 0.894; 95% CI, 0.857-0.931; P < .001) and the validation cohort (AUC, 0.933; 95% CI, 0.887-0.979 vs AUC, 0.844; 95% CI, 0.780-0.907; P = .008). The hybrid model achieved AUC, 0.963; sensitivity, 92.96%; and specificity, 97.38% in the primary cohort and AUC, 0.933; sensitivity, 90.62%; and specificity, 87.16% in the validation cohort.

Assisted by the DL visualization algorithms, we discovered a high-response area for each tumor (eFigure 4 in the Supplement). These high-response areas were more important than other parts of tumors because they drew more attention from the DL model and consequently contained more LNM-related information. These high-response areas included both intratumoral and peritumoral areas, indicating that both intratumoral and peritumoral regions were necessary for the DL model to make decisions.

To have a better understanding of the DL feature learned by the network, we visualized representative DL features from convolution layers (eFigure 5A in the Supplement). In the shallow convolution layers, the DL model extracted simple tumor edge features (the second and sixth layers), while in deeper convolution layers, it extracted complex tumor texture information (the tenth layer). In the last convolution layer, the DL model extracted high-level abstract features (the fourteenth layer). Although these high-level features were so intricate that they were hard to interpret by general gross observation, they were associated with LN status. As shown in eFigure 5B in the Supplement, the patients with node-negative findings had weaker DL-feature responses and vice versa, indicating that the network learned discriminative DL features for LNM prediction.

In eFigure 6A in the Supplement, we visualized 2 DL features of the last convolution layer to explore the association between DL features and LNM. The positive DL feature had strong responses to patients with node-positive findings and weak responses to those with node-negative findings. Similarly, the negative DL feature had strong responses to patients free of LNM and was nearly shut down in patients with LNM. The response value of negative and positive DL features also showed a statistically significant difference between patients with node-positive and node-negative findings in the primary (DL feature response among positive DL feature status: node-positive vs node-negative, −0.014; IQR, −0.104 to 0.077 vs −0.037; IQR, −0.126 to 0.048; P < .001; DL feature response among negative DL feature status: node-positive vs node-negative, −0.195; IQR, −0.291 to −0.114 vs −0.176; IQR, −0.259 to −0.095; P < .001) and validation (DL feature response among positive DL feature status: node-positive vs node-negative, 0.030; IQR, −0.059 to 0.111 vs −0.118; IQR, −0.096 to 0.076; P < .001; DL feature response among negative DL feature status: node-positive vs node-negative, −0.182; IQR, −0.257 to −0.103 vs −0.146; IQR, −0.216 to −0.078; P < .001) cohorts (eFigure 6B in the Supplement). These results suggest that the DL feature is discriminative in diagnosing LNM.

Prognostic Value of the Hybrid Model

Because the LN status of cervical cancer has been reported to be a crucial prognostic factor,[40,41] we performed survival analyses to assess the prognostic ability of the hybrid model with regard to DFS. We used the median H score to stratify patients into low- and high-risk groups. The median survival time for DFS was 31 (IQR, 16-56) months in the primary cohort and 23 (IQR, 14-33) months in the validation cohort. Figure 2C and D show a significant difference in DFS between the low- and high-risk groups defined by the hybrid model in the primary cohort (hazard ratio, 3.24; 95% CI, 1.64-6.44; P < .001) and validation cohort (hazard ratio, 4.59; 95% CI, 2.04-10.31; P < .001). Patients with higher H scores had shorter DFS.
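The stratification and survival estimate described above can be sketched as follows. This is an illustration under stated assumptions, not the authors' R analysis: the helper names `split_by_median` and `kaplan_meier` are hypothetical, the patient values are invented, and assigning scores strictly above the median to the high-risk group is an assumption because the source does not say how scores equal to the median were handled.

```python
from statistics import median
from typing import List, Sequence, Tuple


def split_by_median(scores: Sequence[float]) -> List[str]:
    """Label each score 'high' if it exceeds the median score, else 'low' (assumed rule)."""
    cutoff = median(scores)
    return ["high" if s > cutoff else "low" for s in scores]


def kaplan_meier(times: Sequence[float], events: Sequence[int]) -> List[Tuple[float, float]]:
    """Kaplan-Meier estimate: (time, survival probability) at each distinct event time.

    events[i] is 1 for recurrence, metastasis, or death and 0 for censoring.
    """
    survival = 1.0
    curve = []
    for t in sorted({t for t, e in zip(times, events) if e == 1}):
        at_risk = sum(1 for u in times if u >= t)
        failed = sum(1 for u, e in zip(times, events) if u == t and e == 1)
        survival *= 1.0 - failed / at_risk
        curve.append((t, survival))
    return curve


# Hypothetical patients: H score, months of follow-up, and event indicator.
h_scores = [0.12, 0.81, 0.35, 0.67, 0.08, 0.93]
months = [40, 10, 36, 22, 48, 6]
events = [0, 1, 0, 1, 0, 1]

groups = split_by_median(h_scores)
high = [(m, e) for m, e, g in zip(months, events, groups) if g == "high"]
high_curve = kaplan_meier([m for m, _ in high], [e for _, e in high])
```

Comparing the two groups' curves with a log-rank test, as the study did, would require an additional step that is not shown here.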

Discussion

In this multicenter study, we developed an end-to-end DL model to diagnose LNM for patients with cervical cancer preoperatively. We compared the DL model among different MRI sequences (CET1WI, T2WI, and DWI) and explored the diagnostic value of intratumoral and peritumoral regions. Among all DL models, the CET1WI tumor + peritumoral model achieved the best performance, indicating that the CET1WI sequence probably contained more LNM-related information than the other 2 sequences (T2WI and DWI). To mine diagnostic information from both MR images and clinical characteristics, a hybrid model combining the CET1WI tumor + peritumoral model with MRI-LN status was established. This hybrid model appears to be able to identify more than 90% of metastatic LN cases with a specificity of more than 87%. Moreover, we found that the H score was significantly associated with DFS of cervical cancer, indicating that the hybrid model was a good prognostic indicator.

In previous studies, peritumoral regions in cervical cancer have been shown to be valuable in diagnosing LNM and estimating neoadjuvant chemotherapy response.[13,42] Therefore, we compared the 2 DL models using ROI tumor + peritumoral and ROI tumor. Compared with the CET1WI tumor + peritumoral model, the CET1WI tumor model had a lower AUC in the validation cohort (0.742 vs 0.844), suggesting that peritumoral regions played a role in predicting LNM in cervical cancer. Adding peritumoral regions led to increased AUC, which can probably be explained by the fact that higher lymphatic vessel density in peritumoral regions might lead to higher regional LNM.[43] As reported in previous studies, an increase in lymphatic vessel density can change the tumor microenvironment and metastatic propensity,[44] which is reflected in many cancers, including cervical, prostate, and breast cancer.[43,45,46] Findings shown in eFigure 4 in the Supplement suggest that the DL model also used both intratumoral and peritumoral regions to make its final decision.
Owing to the high sensitivity of the CET1WI tumor + peritumoral model and the high specificity of MRI-LN status, we developed a hybrid model to integrate image-level and clinicopathologic-level information, resulting in an increase in the AUC from 0.844 to 0.933, sensitivity from 87.5% to 90.62%, and specificity from 70.64% to 87.16%. These improvements suggest that the DL model mined information complementary to the MRI-LN status. Therefore, with its apparently high sensitivity and specificity, the hybrid model might be used preoperatively to help gynecologists make decisions. In clinical practice, the following 2 scenarios may result in an inappropriate treatment plan: lymphadenopathy not detected on MRI but positive results shown in surgery (patient 2 in Figure 3) and lymphadenopathy detected on MRI but proved to be negative (patient 4 in Figure 3). Therefore, we applied stratified analysis to explore the added value of the DL model within MRI-LN subgroups. As shown in eFigure 3B in the Supplement, the DL score from the CET1WI tumor + peritumoral model differed significantly between patients with node-positive and node-negative findings within MRI-LN subgroups in the primary and validation cohorts (all P < .001). Therefore, the DL model may benefit patients with false-negative and false-positive LN status on routine MRI.
In contrast with previous studies, our study developed an end-to-end DL model to detect LNM on routine MRI. Previous attempts to assess LN status include sentinel node biopsy, clinical factors, and radiomic analysis. Although sentinel LN dissection as an invasive method shows good sensitivity and specificity,[47] its application is limited by available facilities and experts.[48,49,50] The sensitivity of clinical characteristics (eg, FIGO stage and MRI-LN status) is not sufficient to inform clinical decision-making. Radiomic analysis requires time-consuming tumor delineation, which affects the reproducibility of radiomic features.[51] Although radiomic features can reflect some generalized image features, those features might not be tailored to LNM prediction. Consequently, we developed a DL model to try to overcome these problems by automatically learning LNM-related features, providing a helpful adjunct to assess LNM.
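To make the sensitivity and specificity figures quoted above concrete, the sketch below recomputes them from confusion-matrix counts. The counts are an assumption: they are back-calculated to match the reported percentages under the further assumption that those figures refer to the 141-patient validation cohort (32 node-positive and 109 node-negative patients). They are not reported in the text above and should be checked against the original tables.

```python
def sensitivity(tp, fn):
    """Fraction of truly node-positive patients that the model flags as positive."""
    return tp / (tp + fn)


def specificity(tn, fp):
    """Fraction of truly node-negative patients that the model flags as negative."""
    return tn / (tn + fp)


# ASSUMED counts, inferred from the reported percentages (not taken from the paper).
dl_only = {"tp": 28, "fn": 4, "tn": 77, "fp": 32}  # CET1WI tumor + peritumoral model
hybrid = {"tp": 29, "fn": 3, "tn": 95, "fp": 14}   # hybrid model (DL score + MRI-LN status)

for name, c in (("DL only", dl_only), ("hybrid", hybrid)):
    sens = 100 * sensitivity(c["tp"], c["fn"])
    spec = 100 * specificity(c["tn"], c["fp"])
    print(f"{name}: sensitivity {sens:.2f}%, specificity {spec:.2f}%")
```

Under these assumed counts, the gain in sensitivity corresponds to a single additional node-positive patient being identified, whereas the gain in specificity corresponds to 18 fewer false-positive results, which is consistent with the statement that MRI-LN status mainly contributes specificity.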

Limitations

Despite the favorable diagnostic performance of the DL model, our research has limitations. First, a larger, prospective data set is needed to confirm the generalizability of the DL model. Second, although CET1WI showed better performance than T2WI and ADC maps, the value of combining these sequences remains unclear.

Conclusions

The findings of this study suggest that DL may serve as a preoperative noninvasive tool to diagnose LNM in women with cervical cancer. The H score from the hybrid model was significantly associated with the prognosis of cervical cancer.

47 in total

1. Deep learning provides a new computed tomography-based prognostic biomarker for recurrence prediction in high-grade serous ovarian cancer.

Authors: Shuo Wang; Zhenyu Liu; Yu Rong; Bin Zhou; Yan Bai; Wei Wei; Wei Wei; Meiyun Wang; Yingkun Guo; Jie Tian
Journal: Radiother Oncol Date: 2018-11-01 Impact factor: 6.280

Review 2. Deep learning.

Authors: Yann LeCun; Yoshua Bengio; Geoffrey Hinton
Journal: Nature Date: 2015-05-28 Impact factor: 49.962

3. Sensitivity and negative predictive value for sentinel lymph node biopsy in women with early-stage cervical cancer.

Authors: Gloria Salvo; Pedro T Ramirez; Charles F Levenback; Mark F Munsell; Elizabeth D Euscher; Pamela T Soliman; Michael Frumovitz
Journal: Gynecol Oncol Date: 2017-02-08 Impact factor: 5.482

4. Bilateral negative sentinel nodes accurately predict absence of lymph node metastasis in early cervical cancer: results of the SENTICOL study.

Authors: Fabrice Lécuru; Patrice Mathevet; Denis Querleu; Eric Leblanc; Philipe Morice; Emile Daraï; Henri Marret; Laurent Magaud; Florence Gillaizeau; Gilles Chatellier; Daniel Dargent
Journal: J Clin Oncol Date: 2011-03-28 Impact factor: 44.544

5. Prognostic significance of peritumoral lymphatic vessel density and vascular endothelial growth factor receptor 3 in invasive squamous cell cervical cancer.

Authors: Shaleen K Botting; Hala Fouad; Kyler Elwell; Bill A Rampy; Salama A Salama; Daniel H Freeman; Concepcion R Diaz-Arrastia
Journal: Transl Oncol Date: 2010-06-01 Impact factor: 4.243

6. Randomised study of radical surgery versus radiotherapy for stage Ib-IIa cervical cancer.

Authors: F Landoni; A Maneo; A Colombo; F Placa; R Milani; P Perego; G Favini; L Ferri; C Mangioni
Journal: Lancet Date: 1997-08-23 Impact factor: 79.321

Review 7. MRI of cervical cancer with a surgical perspective: staging, prognostic implications and pitfalls.

Authors: Patricia Balcacer; Arvind Shergill; Babak Litkouhi
Journal: Abdom Radiol (NY) Date: 2019-07

Review 8. Lymphatic metastases from pelvic tumors: anatomic classification, characterization, and staging.

Authors: Colm J McMahon; Neil M Rofsky; Ivan Pedrosa
Journal: Radiology Date: 2010-01 Impact factor: 11.105

9. Preoperative nomogram for the identification of lymph node metastasis in early cervical cancer.

Authors: D-Y Kim; S-H Shim; S-O Kim; S-W Lee; J-Y Park; D-S Suh; J-H Kim; Y-M Kim; Y-T Kim; J-H Nam
Journal: Br J Cancer Date: 2013-11-14 Impact factor: 7.640

10. SEOM guidelines for cervical cancer.

Authors: A Oaknin; M J Rubio; A Redondo; A De Juan; J F Cueva Bañuelos; M Gil-Martin; E Ortega; A Garcia-Arias; A Gonzalez-Martin; I Bover
Journal: Clin Transl Oncol Date: 2015-12-09 Impact factor: 3.405
