Automated clear cell renal carcinoma grade classification with prognostic significance.

Katherine Tian1,2, Christopher A Rubadue1, Douglas I Lin1, Mitko Veta3, Michael E Pyle1, Humayun Irshad1, Yujing J Heng1,4.   

Abstract

We developed an automated 2-tiered Fuhrman's grading system for clear cell renal cell carcinoma (ccRCC). Whole slide images (WSI) and clinical data were retrieved for 395 The Cancer Genome Atlas (TCGA) ccRCC cases. Pathologist 1 reviewed the WSIs and selected regions of interest (ROIs). Nuclear segmentation was performed. Quantitative morphological, intensity, and texture features (n = 72) were extracted. Features associated with grade were identified by constructing a Lasso model using data from cases with concordant 2-tiered Fuhrman's grades between TCGA and Pathologist 1 (training set n = 235; held-out test set n = 42). Discordant cases (n = 118) were additionally reviewed by Pathologist 2. A Cox proportional hazards model evaluated the prognostic efficacy of the predicted grades in an extended test set, which was created by combining the test set and the discordant cases (n = 160). The Lasso model consisted of 26 features and predicted grade with 84.6% sensitivity and 81.3% specificity in the test set. In the extended test set, predicted grade was significantly associated with overall survival after adjusting for age and gender (Hazard Ratio 2.05; 95% CI 1.21-3.47); manual grades were not prognostic. Future work can adapt our computational system to predict WHO/ISUP grades and validate it on other ccRCC cohorts.

Year:  2019        PMID: 31581201      PMCID: PMC6776313          DOI: 10.1371/journal.pone.0222641

Source DB:  PubMed          Journal:  PLoS One        ISSN: 1932-6203            Impact factor:   3.240


Introduction

Clear cell renal cell carcinoma (ccRCC) is the most common malignant tumor of epithelial origin in the kidney [1]. For over 30 years, ccRCC was graded using the 4-tiered Fuhrman nuclear grading system which incorporates nuclear size, nucleolar prominence, and nuclear membrane irregularities. Diagnostic challenges can occur with the presence of other morphological features such as sarcomatoid or spindle cell pattern, when higher grade ccRCC show more eosinophilic staining in the cytoplasm, or other renal cancer histologic types (e.g. papillary RCC type1 and chromophobe RCC) exhibit clear cytoplasm [2,3]. The correct classification of ccRCC grade and stage is important for guiding clinical management, molecular-based therapies, and prognosis [4,5]. Fuhrman grade is widely accepted as a prognostic factor despite mediocre inter-observer agreement [6,7]. To improve inter-observer agreement, simplified 2- or 3-tiered grading systems have been proposed. These simplified systems appear to retain prognostic ability similar to that of 4-tiered systems [8,9]. Recently, a new nuclear/nucleolar grading system, known as the World Health Organization (WHO)/International Society of Urological Pathology (ISUP) Grading Classification for RCC, was introduced [10]. Technological advances have enabled computational pathology to discover novel histomics features from whole slide images (WSIs) that may add diagnostic and/or prognostic information [11-13]. Computational pathology techniques can analyze cancer WSIs [14-16], including the detection of malignant RCC cells [17]. In this study, we developed an automated grading system to predict 2-tiered Fuhrman grade using ccRCC WSIs from The Cancer Genome Atlas (TCGA). Our specific aims were to establish a computational pipeline to extract nuclei histomics features, develop a model to predict 2-tiered ccRCC grade, and evaluate the prognostic efficacy of computer predicted grades.

Materials and methods

Cases and grade assignment

TCGA ccRCC clinical data, including Fuhrman’s grade (accessed June 2017), and hematoxylin and eosin (H&E) WSIs were retrieved for 395 cases [18,19]. TCGA ccRCC cases were contributed by seven participating medical centers. The TCGA Fuhrman’s grade for each case is the consensus of at least two pathologists from the case’s medical center. In order to identify tumor areas on each diagnostic WSI (i.e., regions of interest (ROIs)) for this computational pathology study, Pathologist 1 reviewed each WSI, identified an average of five ROIs for each case (Fig 1), and assigned a Fuhrman grade of 1 to 4 for each ROI. The highest grade among all the ROIs was the designated grade. Thus, each patient had two assigned grades: “TCGA grade” and “Grade by Pathologist 1”. TCGA and Pathologist 1 grades were re-stratified into the 2-tiered grading system: low (grades 1 and 2) and high (grades 3 and 4).
Fig 1

Schematic diagram showing how regions of interest (ROIs) were identified by Pathologist 1.

Pathologist 1 identified ROIs and assigned a Fuhrman grade for each ROI. The highest grade among all ROIs was the “Grade by Pathologist 1”. Each case also had a “TCGA grade” retrieved from the TCGA database.

Image processing and nuclei segmentation

ROIs (n = 1855) from 395 WSIs were extracted and split into patches of 2000 × 2000 pixels (Fig 2). Nuclei segmentation was performed in Fiji (ImageJ, National Institutes of Health) [20] following our previously published workflow [14]. H&E patches were converted from the Red, Green, and Blue (RGB) color space to the Hue, Saturation, and Value (HSV) color space (i.e., binary patches; Fig 3). A nonlinear mapping approach was applied as a preprocessing step to compensate for inconsistent H&E staining [21]. The nuclei segmentation method consists of two steps: adaptive thresholding in each HSV color channel to separate nuclei regions from the background, followed by marker-controlled watershed segmentation to separate touching and overlapping nuclei. We further applied morphological operations to fine-tune the segmentation of nuclei. Extracted nuclei with an area of less than 200 pixels or greater than 2000 pixels were excluded to improve the specificity of nuclear detection [14].
Fig 2

From whole slide image to patches for image processing and nuclei segmentation.

Fig 3

Examples of nuclei detection and segmentation in low and high grade clear cell renal cell carcinoma.

The rightmost column shows computer-generated segmentation mask where cell nuclei are labelled white against a black background. The middle column shows the overlay of segmented nuclei (green spots) over each hematoxylin and eosin (H&E) patch.
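
Two of the simpler steps described in the segmentation section, the RGB-to-HSV conversion and the 200 to 2000 pixel area filter, can be sketched in a few lines. This is only an illustration in Python, not the Fiji workflow used in the study; the function names are hypothetical, and the adaptive thresholding and watershed steps are not reproduced here.

```python
import colorsys

# Area limits taken from the text; everything else here is illustrative.
MIN_AREA_PX = 200
MAX_AREA_PX = 2000


def rgb_to_hsv_pixel(r, g, b):
    """Convert one 8-bit RGB pixel to HSV, with each channel scaled to [0, 1]."""
    return colorsys.rgb_to_hsv(r / 255.0, g / 255.0, b / 255.0)


def keep_nucleus(area_px):
    """Exclude objects smaller than 200 px or larger than 2000 px."""
    return MIN_AREA_PX <= area_px <= MAX_AREA_PX


def filter_nuclei(areas_px):
    """Return only the object areas that pass the size filter."""
    return [a for a in areas_px if keep_nucleus(a)]


# Hypothetical object areas (in pixels) from one segmented patch.
candidate_areas = [150, 200, 950, 2000, 2400]
kept_areas = filter_nuclei(candidate_areas)
```

Objects at exactly 200 or 2000 pixels are kept in this sketch, since the text excludes only areas strictly below or above those limits.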

2D histomics feature extraction

For each patch, 72 nuclei histomics features were extracted: nine morphological features, 15 intensity-based features, and 48 texture-based features. Morphological features describe the shape and size variation of nuclei. Intensity features (first order statistical features) describe the distribution of color variation in the nucleus. Three color channels were analyzed: lightness from HSV color space, lightness from Lab color space, and Hematoxylin channel from H&E color deconvolution [22]. Five first order statistical features were computed—mean, median, standard deviation, skewness, and kurtosis—for each of the three color channels, for a total of 15 intensity features. Texture features (second order statistical features) quantitatively describe patterns and texture of pixel values. Two types of second order statistical features were computed: co-occurrence based features (n = 8) and run length based features (n = 8). Co-occurrence based features include correlation, cluster shade, cluster prominence, energy, entropy, Haralick correlation, inertia, and inverse difference moment [23]. Run length based features include gray-level non-uniformity, run-length non-uniformity, low and high gray-level run emphasis, short run low and high gray-level emphasis, and long run low and high gray-level emphasis [24]. Likewise, texture features were extracted from the three selected color channels, resulting in a total of 48 texture features. Feature formulas have been previously described [14,25].
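
As a rough illustration of the five first order statistics named above, the sketch below computes them for a list of pixel intensities and checks the feature count of 72. It is not the study's implementation: the function name is hypothetical, and the use of the population standard deviation and of non-excess kurtosis are assumptions, because the text does not state which conventions were used.

```python
import math
import statistics


def first_order_features(values):
    """Mean, median, SD, skewness, and kurtosis of one color channel.

    Assumptions: population SD and non-excess kurtosis (a normal
    distribution would give 3); the source does not specify either.
    """
    n = len(values)
    mean = statistics.fmean(values)
    sd = statistics.pstdev(values)
    if sd == 0:
        # Skewness and kurtosis are undefined for a constant signal.
        skewness = kurtosis = math.nan
    else:
        skewness = sum((v - mean) ** 3 for v in values) / n / sd ** 3
        kurtosis = sum((v - mean) ** 4 for v in values) / n / sd ** 4
    return {
        "mean": mean,
        "median": statistics.median(values),
        "sd": sd,
        "skewness": skewness,
        "kurtosis": kurtosis,
    }


# Feature bookkeeping from the text: 9 morphological features,
# 5 intensity statistics x 3 channels, (8 + 8) texture features x 3 channels.
N_FEATURES = 9 + 5 * 3 + (8 + 8) * 3

# Hypothetical pixel intensities inside one nucleus.
features = first_order_features([1, 2, 3, 4, 5])
```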

Data summarization and selection of representative ROI

Data extracted at the patch level were summarized to the ROI level by calculating the median and median absolute deviation (MAD) of each feature, yielding 144 summarized features (72 features × 2 summary functions). Some cases had multiple ROIs annotated with the highest grade, so one of these ROIs was selected to represent the case. To do so, the median of all ROIs with the highest grade was calculated, and the ROI with the smallest Euclidean distance to the calculated median was chosen (Fig 4).
Fig 4

Data summarization and the selection of the representative region of interest (ROI).
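
The summarization and ROI selection steps can be sketched as follows. This is a minimal Python illustration with hypothetical function names and toy numbers; it assumes the median is taken feature by feature and that distances are computed on the summarized feature vectors, which the text implies but does not spell out.

```python
import math
import statistics


def mad(values):
    """Median absolute deviation around the median."""
    med = statistics.median(values)
    return statistics.median([abs(v - med) for v in values])


def summarize_roi(patch_rows):
    """Collapse per-patch feature rows into [medians..., MADs...] for one ROI."""
    columns = list(zip(*patch_rows))
    return [statistics.median(c) for c in columns] + [mad(c) for c in columns]


def representative_roi(roi_vectors):
    """Index of the ROI closest (Euclidean) to the feature-wise median ROI."""
    center = [statistics.median(c) for c in zip(*roi_vectors)]
    distances = [math.dist(v, center) for v in roi_vectors]
    return distances.index(min(distances))


# Toy data: three patches with two features each, then three candidate ROIs.
roi_summary = summarize_roi([[1, 10], [2, 20], [4, 60]])
chosen = representative_roi([[0.0, 0.0], [1.0, 1.0], [5.0, 5.0]])
```

With 72 features per patch, `summarize_roi` would return the 144 values described in the text.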

Developing the machine learning model to predict grade

Cases with concordant 2-tiered grade by TCGA and Pathologist 1 (n = 277) were used to develop the automated 2-tiered grading system. Concordant cases were split into a training set (n = 235; 85%) and a held-out test set (n = 42; 15%; Fig 5). The sampling package in R was used to select the 42 patients in the held-out test set based on grade, age, gender, and stage, ensuring that they were representative of the concordant cases. Histomics features were z-scored. Seven machine learning classification methods were explored to classify ccRCC cases into either low or high grade using nuclei histomics features [26,27] (Fig 5). All methods achieved similar area under the receiver-operator characteristic curves (AUC ROC; S1 Table). Lasso regression was the top performing method with a built-in feature selection capability. Lasso regression is a type of linear regression with an L1 regularization penalty, which shrinks the regression weights of the least predictive features to 0, thereby creating simpler models that are less prone to overfitting [28]. In the Lasso model, a hyperparameter λ determines the amount of L1 regularization penalty applied. We chose Lasso to build our final classification model because it is computationally efficient and more interpretable than other machine learning methods such as deep learning. Lasso regression with its optimal hyperparameter selected the final list of histomics features most associated with grade. We evaluated its performance on the held-out test set.
Fig 5

A summary of the workflow used to develop the 2-tiered clear cell renal cell carcinoma (ccRCC) grade classification.

Seven machine learning classification methods were evaluated to determine the optimal method to develop a robust classification model for ccRCC using cases from the Training Set (A). Lasso regression produced an average area under the receiver operator characteristic curve of 0.84 and identified nuclei histomics features associated with ccRCC grade. The Test Set was used to evaluate the performance of the final model; and grades were predicted in the Extended Test Set (B).
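
To make the shrinkage behaviour of the L1 penalty concrete, the sketch below shows z-scoring and the soft-thresholding operator, which is the standard building block of coordinate-descent Lasso solvers. It is a generic illustration, not the glmnet fit used in the study; the function names are hypothetical, the sample standard deviation is an assumption, and the weights passed in are made-up numbers.

```python
import statistics


def z_score(values):
    """Standardize one feature to mean 0 and SD 1 (sample SD assumed)."""
    mean = statistics.fmean(values)
    sd = statistics.stdev(values)
    return [(v - mean) / sd for v in values]


def soft_threshold(weight, lam):
    """Shrink a weight toward 0 by lam; weights with |w| <= lam become 0."""
    if weight > lam:
        return weight - lam
    if weight < -lam:
        return weight + lam
    return 0.0


# Illustrative weights only; lam mirrors the reported optimal value of 0.0101.
lam = 0.0101
shrunk = [soft_threshold(w, lam) for w in (0.5, 0.005, -0.3)]
standardized = z_score([1.0, 2.0, 3.0])
```

Weights whose magnitude does not exceed λ are set exactly to 0, which is how the penalty removes weakly predictive features from the model.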

Survival analyses

The Lasso model was applied to predict the grade of the previously held-out test set (n = 42) and of the cases with discordant grades (n = 118). These 160 cases were combined to create an extended test set to evaluate the prognostic capability (i.e., overall survival [OS]) of our predicted grade using crude and adjusted Cox proportional hazards models. The adjusted Cox models included patient age, gender, and cancer stage. TCGA treatment information was missing for 69% of the cases and thus was not included in the adjusted Cox models. Kaplan-Meier curves were plotted to visualize survival differences between grade groups (survival package, R) [29].
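
For readers unfamiliar with Kaplan-Meier curves, the product-limit estimate can be sketched as below. This is a plain Python illustration with hypothetical follow-up data, not the R survival package used in the study, and it omits confidence bands.

```python
def kaplan_meier(times, events):
    """Product-limit survival estimate.

    times:  follow-up time for each case
    events: 1 if death was observed at that time, 0 if censored
    Returns a list of (time, survival probability) at each event time.
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        leaving = 0
        # Gather every case (death or censored) sharing this time point.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            leaving += 1
            i += 1
        if deaths:
            survival *= 1 - deaths / at_risk
            curve.append((t, survival))
        at_risk -= leaving
    return curve


# Hypothetical follow-up (arbitrary time units) for five cases.
curve = kaplan_meier(times=[1, 2, 2, 3, 4], events=[1, 1, 0, 0, 1])
```

Running the function separately on the predicted low and high grade groups would give the two curves compared in a plot of this kind.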

Additional pathological review for discordant cases

The grades provided by TCGA may have been assessed from ROIs other than the representative ROIs selected in our study. To obtain a fairer comparison between manual and predicted grades among the discordant cases, the representative ROIs were additionally reviewed by Pathologist 2.

Statistical analyses

Confusion matrices determined the concordance of the 2-tiered and 4-tiered grades between two raters [27]. Inter-rater reliability among three raters was evaluated using Fleiss’ kappa. Boxplots were created using ggplot2 version 2.2.1. Comparisons between the nine morphological features with 2-tiered and 4-tiered grading were done using Mann-Whitney U or Kruskal Wallis test, respectively. All tests of statistical significance were two-sided. Statistical significance was achieved when p-value was <0.05 or when the false discovery rate (FDR) was <0.05. All analyses were conducted using R version 3.4.0.
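
As a worked example of the two-rater agreement statistics, the sketch below computes the frequency of agreement and Cohen's kappa from a contingency table. The 2 × 2 counts are the 2-tiered TCGA versus Pathologist 1 case counts given in Table 1; the function name is hypothetical and the code is an illustration in Python rather than the R code used in the study.

```python
def cohens_kappa(table):
    """Return (frequency of agreement, Cohen's kappa) for a square table.

    table[i][j] = number of cases rater A placed in category i
    and rater B placed in category j.
    """
    total = sum(sum(row) for row in table)
    observed = sum(table[i][i] for i in range(len(table))) / total
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    # Agreement expected by chance from the marginal totals.
    expected = sum(r * c for r, c in zip(row_totals, col_totals)) / total ** 2
    return observed, (observed - expected) / (1 - expected)


# Rows: TCGA low, TCGA high. Columns: Pathologist 1 low, Pathologist 1 high.
two_tier_counts = [
    [162, 28],
    [90, 115],
]
agreement, kappa = cohens_kappa(two_tier_counts)
```

Rounded to two decimals, these values match the 2-tiered agreement of 0.70 and kappa of 0.41 reported in the Results.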

Results

The majority of TCGA ccRCC cases were white males. Most participants were between 50 and 69 years of age and had stage I disease (Table 1). The agreement of 4-tiered grading between TCGA and Pathologist 1 was poor (frequency of agreement = 0.47, Cohen’s kappa = 0.20; S1A Fig). When the grading was stratified into 2 tiers, 277 out of 395 cases were concordant (frequency of agreement = 0.70, Cohen’s kappa = 0.41; S1B Fig). Most of the discordant cases were assigned high grade by TCGA and low grade by Pathologist 1.
Table 1

Demographic table of the 395 The Cancer Genome Atlas (TCGA) clear cell renal cell carcinoma cases with 2-tiered histological grade (low and high).

Note that the TCGA grade for each patient in the discordant set is the opposite grade assigned by Pathologist 1.

| Characteristic | Total, n (%) | Concordant cases: Low, n (%) | Concordant cases: High, n (%) | Discordant cases (grades by TCGA): Low, n (%) | Discordant cases (grades by TCGA): High, n (%) | Discordant cases (grades by Pathologist 1): Low, n (%) | Discordant cases (grades by Pathologist 1): High, n (%) |
|---|---|---|---|---|---|---|---|
| Cases, n | 395 (100) | 162 (58.5) | 115 (41.5) | 28 (23.7) | 90 (76.3) | 90 (76.3) | 28 (23.7) |
| Age group, n | | | | | | | |
| <50 | 80 (20.3) | 36 (22.2) | 22 (19.1) | 4 (14.3) | 18 (20.0) | 18 (20.0) | 4 (14.3) |
| 50–59 | 106 (26.8) | 50 (30.9) | 29 (25.2) | 6 (21.4) | 21 (23.3) | 21 (23.3) | 6 (21.4) |
| 60–69 | 109 (27.6) | 37 (22.8) | 33 (28.7) | 10 (35.7) | 29 (32.2) | 29 (32.2) | 10 (35.7) |
| 70–79 | 82 (20.8) | 31 (19.1) | 26 (22.6) | 8 (28.6) | 17 (18.9) | 17 (18.9) | 8 (28.6) |
| >80 | 18 (4.6) | 8 (4.9) | 5 (4.3) | 0 (0.0) | 5 (5.6) | 5 (5.6) | 0 (0.0) |
| Gender, n | | | | | | | |
| Female | 130 (32.9) | 67 (41.4) | 28 (24.3) | 10 (35.7) | 25 (27.8) | 25 (27.8) | 10 (35.7) |
| Male | 265 (67.1) | 95 (58.6) | 87 (75.7) | 18 (64.3) | 65 (72.2) | 65 (72.2) | 18 (64.3) |
| Race, n | | | | | | | |
| Asian | 7 (1.8) | 3 (1.9) | 2 (1.7) | 0 (0.0) | 2 (2.2) | 2 (2.2) | 0 (0.0) |
| Black | 33 (8.4) | 13 (8.0) | 10 (8.7) | 2 (7.1) | 8 (8.9) | 8 (8.9) | 2 (7.1) |
| White | 349 (88.4) | 142 (87.7) | 102 (88.7) | 25 (89.3) | 80 (88.9) | 80 (88.9) | 25 (89.3) |
| Not reported | 6 (1.5) | 4 (2.5) | 1 (0.9) | 1 (3.6) | 0 (0.0) | 0 (0.0) | 1 (3.6) |
| Stage, n | | | | | | | |
| Stage I | 207 (52.4) | 121 (74.7) | 35 (30.4) | 13 (46.4) | 38 (42.2) | 38 (42.2) | 13 (46.4) |
| Stage II | 44 (11.1) | 17 (10.5) | 15 (13.0) | 5 (17.9) | 7 (7.8) | 7 (7.8) | 5 (17.9) |
| Stage III | 92 (23.3) | 18 (11.1) | 36 (31.3) | 8 (28.6) | 30 (33.3) | 30 (33.3) | 8 (28.6) |
| Stage IV | 52 (13.2) | 6 (3.7) | 29 (25.2) | 2 (7.1) | 15 (16.7) | 15 (16.7) | 2 (7.1) |
| Type of treatment, n | | | | | | | |
| Chemotherapy | 7 (1.8) | 3 (1.9) | 4 (3.5) | 0 (0.0) | 0 (0.0) | 0 (0.0) | 0 (0.0) |
| Immunotherapy | 6 (1.5) | 2 (1.2) | 2 (1.7) | 0 (0.0) | 2 (2.2) | 2 (2.2) | 0 (0.0) |
| Molecular therapy | 79 (20.0) | 31 (19.1) | 24 (20.9) | 5 (17.9) | 19 (21.1) | 19 (21.1) | 5 (17.9) |
| Radiation | 10 (2.5) | 2 (1.2) | 4 (3.5) | 2 (7.1) | 2 (2.2) | 2 (2.2) | 2 (7.1) |
| Mixed therapy | 21 (5.3) | 4 (2.5) | 10 (8.7) | 2 (7.1) | 5 (5.6) | 5 (5.6) | 2 (7.1) |
| Unknown | 272 (68.9) | 120 (74.1) | 71 (61.7) | 19 (67.9) | 62 (68.9) | 62 (68.9) | 19 (67.9) |

Computer extracted morphological features reflected the variation in ccRCC nuclei observed by pathologists. In higher grades, nuclei were significantly larger (area, perimeter, and spherical perimeter and radius) and less spherical (roundness, elongation, flatness, and major axis of the ellipse fit) (FDR<0.05; S2 Fig and S2 Table).

Lasso classification model

The final Lasso model, with the optimal λ at 0.0101, had an average ROC AUC of 0.84. In the test set, the model predicted 2-tiered ccRCC grade with 83.3% accuracy (95% confidence interval (CI) 69–93%), 84.6% sensitivity, 81.3% specificity, an 18.8% false positive rate, and a 15.4% false negative rate. The agreement between predicted and manual grades was good (frequency of agreement = 0.83, Cohen’s kappa = 0.65). The 18 unique histomics features associated with 2-tiered ccRCC grade are listed in Table 2.
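
The relationship between the test-set metrics above can be illustrated with a small confusion-matrix calculation. The four counts used here are not reported in the source: they are a reconstruction, chosen because they are consistent with the reported percentages and with the test set size of 42, and they should be read as an assumption. The function name is hypothetical.

```python
def binary_metrics(tp, fn, tn, fp):
    """Standard metrics for a binary classifier, with high grade as positive."""
    return {
        "accuracy": (tp + tn) / (tp + fn + tn + fp),
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "false_positive_rate": fp / (fp + tn),
        "false_negative_rate": fn / (fn + tp),
    }


# ASSUMED counts (not given in the paper), reconstructed so that the
# percentages agree with the reported values within rounding:
# 26 high grade cases (22 detected) and 16 low grade cases (13 detected).
metrics = binary_metrics(tp=22, fn=4, tn=13, fp=3)
```

Under these assumed counts, the false positive rate is one minus the specificity and the false negative rate is one minus the sensitivity, which is why the reported pairs sum to 100% within rounding.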
Table 2

Nuclear histomics features associated with 2-tiered ccRCC grade selected in the final Lasso classification model (18 unique features; 26 total features).

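| Feature | Type | Biological relevance | Color space | Summary function | Coefficient |
|---|---|---|---|---|---|
| Elongation | Morphology | Nuclear pleomorphism, nuclear shape (irregular) | - | MAD | -1.51E-01 |
| Minor axis of the ellipse fit | Morphology | Nuclear pleomorphism, nuclear shape (irregular) | - | Median | -1.25E+00 |
| Flatness | Morphology | Nuclear shape (irregular) | - | MAD | -4.20E-16 |
| Kurtosis | Intensity | Uneven distribution of nucleus staining | HSV | Median | 5.38E-03 |
| Skewness | Intensity | Uneven distribution of nucleus staining | H&E | MAD | -2.22E-01 |
| Skewness | Intensity | Uneven distribution of nucleus staining | HSV | Median | -2.87E-01 |
| Skewness | Intensity | Uneven distribution of nucleus staining | Lab | MAD | -1.10E-01 |
| Correlation | Texture | Granularity of chromatin (a) | HSV | MAD | -2.48E-02 |
| Haralick correlation | Texture | Granularity of chromatin (a) | H&E | Median | -2.24E-01 |
| Energy | Texture | Granularity of chromatin (b) | Lab | Median | -9.36E-01 |
| Energy | Texture | Granularity of chromatin (b) | Lab | MAD | -3.74E-01 |
| Inverse difference moment | Texture | Granularity of chromatin (b) | H&E | Median | -4.81E-01 |
| Inverse difference moment | Texture | Granularity of chromatin (b) | H&E | MAD | -4.19E-02 |
| Inverse difference moment | Texture | Granularity of chromatin (b) | HSV | Median | 1.67E-01 |
| Inverse difference moment | Texture | Granularity of chromatin (b) | HSV | MAD | -1.19E-01 |
| Inertia | Texture | Granularity of chromatin (b) | H&E | Median | -1.77E-01 |
| Inertia | Texture | Granularity of chromatin (b) | Lab | Median | -9.72E-02 |
| Entropy | Texture | Granularity of chromatin (c) | HSV | Median | 8.97E-03 |
| Low gray-level run emphasis | Texture | Granularity of chromatin (c) | H&E | MAD | -7.52E-02 |
| Long run high gray-level emphasis | Texture | Granularity of chromatin (c) | HSV | MAD | 1.50E-01 |
| Long run low gray-level emphasis | Texture | Granularity of chromatin (c) | H&E | MAD | -7.71E-16 |
| Short run high gray-level emphasis | Texture | Granularity of chromatin (c) | HSV | MAD | 4.02E-02 |
| Short run low gray-level emphasis | Texture | Granularity of chromatin (c) | H&E | MAD | -1.28E-15 |
| Gray level non-uniformity | Texture | Granularity of chromatin (d) | HSV | Median | -4.26E-01 |
| Gray level non-uniformity | Texture | Granularity of chromatin (d) | Lab | MAD | 2.45E+00 |
| High gray-level run emphasis | Texture | Granularity of chromatin (d) | HSV | MAD | 6.21E-03 |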

a) Correlation is a co-occurrence based texture feature, describing roughness and repeated direction inside the nuclei.

b) Co-occurrence based texture feature, describing roughness inside the nuclei.

c) Run-length matrix based texture feature, describing randomness of gray-level distribution.

d) Run-length matrix based texture feature, describing coarseness inside nuclei.

MAD: median absolute deviation; Lab: Lab color space; HSV: hue-saturation-value color space; H&E: Hematoxylin and Eosin color space. Median and MAD were used to summarize the data extracted at the patch level to the region of interest (ROI) level.

Prognostic efficacy of predicted grades

There were 65 death events out of 160 cases in the extended test set. Cases predicted as high grade had significantly poorer OS compared to low grade (Fig 6). The association between predicted grade and OS was significant in the crude analysis (hazard ratio (HR) 2.07; 95% CI 1.25–3.43) and after adjusting for age and gender (HR 2.05; 95% CI 1.21–3.47). The association was attenuated when stage was included in the model (HR 1.66; 95% CI 0.97–2.83).
Fig 6

Prognostic efficacy of predicted grades.

Cases predicted as high grade have significantly poorer overall survival rates compared to cases predicted as low grade in the extended test set (hazard ratio 2.07, 95% confidence interval of 1.25–3.43, p<0.01; 65 death events among 160 cases). The shaded areas reflect the 95% confidence interval for high or low grade.
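
The reported hazard ratio, confidence interval, and p-value can be cross-checked with a short calculation, because a Wald-type 95% CI for a hazard ratio is symmetric on the log scale. The sketch below recovers the standard error of the log hazard ratio from the interval and derives an approximate two-sided p-value. It assumes the published interval is a Wald interval and a normal approximation; the function name is hypothetical.

```python
import math


def wald_from_ci(hr, lower, upper, z_crit=1.959964):
    """Recover SE(log HR), the Wald z statistic, and a two-sided p-value.

    Assumes the 95% CI was built as exp(log HR +/- z_crit * SE).
    """
    se = (math.log(upper) - math.log(lower)) / (2 * z_crit)
    z = math.log(hr) / se
    p_value = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal tail
    return se, z, p_value


# Crude model in the extended test set: HR 2.07, 95% CI 1.25 to 3.43.
se, z, p_value = wald_from_ci(2.07, 1.25, 3.43)

# On the log scale the point estimate sits midway between the limits,
# so the geometric mean of the limits should be close to the HR.
midpoint = math.sqrt(1.25 * 3.43)
```

Under these assumptions the p-value falls below 0.01, in line with the figure legend, and the geometric mean of the limits agrees with the reported hazard ratio to two decimals.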

Comparing predicted grade with TCGA and Pathologist 1

Among the concordant cases, 2-tiered manual grades were significantly associated with OS (Fig 7A; Table 3). Predicted grades for concordant cases were not evaluated because the majority of the concordant cases were part of the training set used to build the Lasso model. Within the discordant cases, neither the grade provided by TCGA nor the grade provided by Pathologist 1 was associated with OS (Fig 7B and 7C). In contrast, predicted grade was significantly associated with OS in the crude model (HR 2.01; 95% CI 1.14–3.54) and when adjusted for age and gender (HR 2.31; 95% CI 1.26–4.24). The association of predicted grade with OS among the discordant cases was attenuated when stage was included in the model (HR 1.83; 95% CI 0.98–3.41; Fig 7D; Table 3).
Fig 7

Kaplan-Meier curves comparing manual and predicted grades with overall survival in concordant and discordant cases.

Grades assigned by TCGA/Pathologist 1 were significantly associated with overall survival within the concordant cases (A). In the discordant set, neither grades assigned by TCGA (B) nor Pathologist 1 (C) were associated with overall survival while predicted grade remained significantly prognostic (D). Please refer to Table 3 for hazard ratios and 95% confidence intervals for each analysis. The shaded areas reflect the 95% confidence interval for high or low grade.

Table 3

The association of manual or computer predicted 2-tiered grade with overall survival in the concordant and discordant cases.

| Model | Manual grade: hazard ratio | Manual grade: 95% CI | Manual grade: p-value | Computer predicted grade: hazard ratio | Computer predicted grade: 95% CI | Computer predicted grade: p-value |
|---|---|---|---|---|---|---|
| A. Concordant cases between TCGA and Pathologist 1 (85 events out of 277 cases) | | | | | | |
| Model A: Crude | 3.12 | (2.00, 4.86) | <0.01 | NA | NA | NA |
| Model B: Adjusted for Age and Gender | 3.00 | (1.91, 4.71) | <0.01 | NA | NA | NA |
| Model C: Adjusted for Age, Gender, and Stage | 1.59 | (0.99, 2.57) | 0.06 | NA | NA | NA |
| B. Discordant cases between TCGA and Pathologist 1 (52 events out of 118 cases) | | | | | | |
| Grade assigned by TCGA | | | | | | |
| Model A: Crude | 1.16 | (0.59, 2.25) | 0.67 | 2.01 | (1.14, 3.54) | 0.02 |
| Model B: Adjusted for Age and Gender | 1.09 | (0.56, 2.13) | 0.80 | 2.31 | (1.26, 4.24) | <0.01 |
| Model C: Adjusted for Age, Gender, and Stage | 1.08 | (0.55, 2.11) | 0.83 | 1.83 | (0.98, 3.41) | 0.06 |
| Grade assigned by Pathologist 1 | | | | | | |
| Model A: Crude | 0.86 | (0.44, 1.68) | 0.67 | NA | NA | NA |
| Model B: Adjusted for Age and Gender | 0.92 | (0.47, 1.79) | 0.80 | NA | NA | NA |
| Model C: Adjusted for Age, Gender, and Stage | 0.93 | (0.47, 1.82) | 0.83 | NA | NA | NA |
| Grade assigned by Pathologist 2 | | | | | | |
| Model A: Crude | 1.09 | (0.63, 1.89) | 0.75 | NA | NA | NA |
| Model B: Adjusted for Age and Gender | 1.21 | (0.70, 2.10) | 0.50 | NA | NA | NA |
| Model C: Adjusted for Age, Gender, and Stage | 1.15 | (0.66, 2.00) | 0.62 | NA | NA | NA |

CI: confidence interval.

There was no effective agreement among TCGA, Pathologist 1, and Pathologist 2 for the discordant cases (4-tiered grading: Fleiss’ kappa = -0.23; 2-tiered grading: Fleiss’ kappa = -0.33). There was also no effective agreement between TCGA and Pathologist 2 (4-tiered grading: frequency of agreement = 0.33, Cohen’s kappa = -0.14; 2-tiered grading: frequency of agreement = 0.39, Cohen’s kappa = -0.19). Despite assessing the same representative ROIs, the agreement between Pathologist 1 and Pathologist 2 was poor for 4-tiered grading (frequency of agreement = 0.48, Cohen’s kappa = 0.11) and only slightly better for 2-tiered grading (frequency of agreement = 0.61, Cohen’s kappa = 0.20). Cases on which Pathologist 1 and Pathologist 2 disagreed were more likely to be assigned high grade by Pathologist 2. Contingency tables between TCGA, Pathologist 1, and Pathologist 2 are in S3 Table. Grades assigned by Pathologist 2 were not associated with OS (Table 3).

Further analyses explored whether incorporating the manual grade by Pathologist 2 could improve prognostic efficacy. The grades for discordant cases were re-assigned as low or high by using the most frequent grade among TCGA, Pathologist 1, and Pathologist 2, and among Pathologist 1, Pathologist 2, and the predicted grade (i.e., integrating manual and computer). Re-assigned grades were not associated with OS (p>0.05; S4 Table). Next, the discordant cases were divided into cases on which Pathologist 1 and Pathologist 2 agreed and cases on which they disagreed. Manual grades were not associated with OS in either subset (p>0.05; Table 4). Predicted grade was associated with OS only in cases on which Pathologist 1 and Pathologist 2 agreed (Table 4). S1 File contains the manual and predicted grades of these ccRCC cases.
Table 4

The association of manual or computer predicted 2-tiered grade with overall survival in 118 discordant cases.

| Model | Manual grade: hazard ratio | Manual grade: 95% CI | Manual grade: p-value | Computer predicted grade: hazard ratio | Computer predicted grade: 95% CI | Computer predicted grade: p-value |
|---|---|---|---|---|---|---|
| A. Cases with identical grades between Pathologist 1 and Pathologist 2 (31 events out of 76 cases) | | | | | | |
| Model A: Crude | 0.99 | (0.44, 2.23) | 0.99 | 2.05 | (1.00, 4.21) | 0.05 |
| Model B: Adjusted for Age and Gender | 1.04 | (0.46, 2.34) | 0.93 | 2.42 | (1.13, 5.20) | 0.02 |
| Model C: Adjusted for Age, Gender, and Stage | 1.18 | (0.52, 2.68) | 0.69 | 1.89 | (0.87, 4.12) | 0.11 |
| B. Cases with different grades between Pathologist 1 and Pathologist 2 (21 events out of 46 cases) | | | | | | |
| Grade assigned by Pathologist 1 | | | | | | |
| Model A: Crude | 0.64 | (0.19, 2.20) | 0.48 | 2.49 | (0.83, 7.45) | 0.10 |
| Model B: Adjusted for Age and Gender | 0.63 | (0.18, 2.17) | 0.46 | 2.49 | (0.72, 7.28) | 0.16 |
| Model C: Adjusted for Age, Gender, and Stage | 0.59 | (0.17, 2.03) | 0.40 | 2.03 | (0.62, 6.66) | 0.24 |
| Grade assigned by Pathologist 2 | | | | | | |
| Model A: Crude | 1.56 | (0.46, 5.31) | 0.48 | NA | NA | NA |
| Model B: Adjusted for Age and Gender | 1.58 | (0.46, 5.44) | 0.46 | NA | NA | NA |
| Model C: Adjusted for Age, Gender, and Stage | 1.70 | (0.49, 5.89) | 0.40 | NA | NA | NA |

CI: confidence interval.

Discussion

This study utilized the large and diverse TCGA ccRCC dataset to extract quantitative histomics features from ROIs and applied a Lasso regression model to develop an automated 2-tiered grading system using 18 unique features (26 total features), which achieved an ROC AUC of 0.84. Using discordant cases as an independent validation set, our data-driven system stratified ccRCC cases into low and high grades that were significantly associated with OS. The prognostic efficacy of predicted grades in the discordant cases outperformed the manual grades assessed by TCGA, Pathologist 1, and Pathologist 2. This proof-of-concept study demonstrated the potential of computational pathology to predict ccRCC grades via a more objective and quantitative pipeline, and to address the issue of grade disagreement commonly encountered between pathologists.

The grading of ccRCC is highly challenging and subjective, but the accurate assignment of ccRCC grade is important for clinical care and follow-up. Research groups, specifically Yeh and colleagues [30], Kruk and colleagues [31], and Holdbrook and colleagues [32], have been actively developing computational pathology systems to provide objectivity and/or automate ccRCC grading. Each computational system is highly unique, with differences in image processing, feature extraction, classification method, and whether 2- or 4-tiered grades are predicted. We utilized an unbiased data-driven approach in which we extracted a set of high dimensional nuclear features (n = 144) and used Lasso, a machine learning-based method, to build our final predictive model. This is different from Yeh et al. [30], who only evaluated one feature (i.e., maximum nuclei size) to predict 2-tiered grade; Kruk et al. [31], who pre-selected features (out of 31 features) prior to building the final model to predict 4-tiered grade; and Holdbrook et al. [32], who used up to 4 concatenated feature vectors to calculate fraction value scores prior to classification into low or high grade. In addition, our Lasso regression allowed us to identify the 18 unique histomics biomarkers in our final predictive model, while the features in the models by Kruk et al. [31] and Holdbrook et al. [32] are unknown. Our 18 features provided information about the nucleus, the uneven distribution of nucleus staining, and the granularity of chromatin and nucleoli, highlighting that adding computer extracted texture and intensity features to traditional pathology morphological features can improve the ability to predict ccRCC grade. Both we and Holdbrook et al. [32] demonstrated that predicted grade had prognostic significance, whilst the studies by Kruk et al. [31] and Yeh et al. [30] did not report whether their grade was associated with prognosis. Lastly, our system was trained using a much larger and more diverse dataset of 277 cases from seven TCGA participating institutions, and we validated our system using 160 cases. This is in contrast to those three studies, which used small numbers for training (n = 38 to 70) and validation (n = 6 to 62) and obtained their cases from a single institution. Collectively, our work and others are substantial efforts to improve ccRCC grading. Each computational method will require further refinement and validation before its clinical utility can be determined.

Each TCGA grade is the consensus of at least two pathologists. One reason for grade disagreement between TCGA and Pathologist 1 may be that TCGA pathologists assessed ROIs other than the representative ROIs selected in our study. However, even when reviewing the same ROIs for discordant cases, there was very poor agreement between Pathologist 1 and Pathologist 2, reiterating the challenges of ccRCC grading. These discordant cases could be more diagnostically challenging or ambiguous. Since manual grades for concordant cases were significantly associated with OS, it could be argued that concordant cases were diagnostically less challenging, with tumors that were overwhelmingly of a low or high grade, and that our model was trained using more homogeneous ROIs. Predicted grades for discordant cases were significantly associated with survival, in contrast to manual assessments or the most frequent manual grade. Therefore, our automated system has the ability to diagnose a range of ccRCC cases with consistency and objectivity. In practical application, such a computational system could be useful to pathologists as a tool to provide a second opinion in diagnostically ambiguous cases.

Our study has some limitations. We did not use the WHO/ISUP grading system because the TCGA participating medical centers used the Fuhrman’s system. However, since our computer system was constructed from computer extracted nuclear features, it can be adapted in the future to predict WHO/ISUP grades, which also utilize nuclei/nucleoli features. There are inherent limitations of reviewing cases using WSIs. Accurate grading may be hindered by the quality of WSIs and the lack of the Z-axis [33]. Our study reviewed diagnostic WSIs and analyzed manually selected ROIs that may not be representative of the entire tumor. For future work, automating ROI detection and grade prediction will allow the review of multiple tumor sections more efficiently. Lastly, our nuclei segmentation relied on conventional image analysis techniques. While qualitative evaluation of the segmentation results revealed that our image processing pipeline produced reasonably good results, the nuclei segmentation may not be optimal in more challenging cases. A solution is to employ deep learning based techniques to improve nuclei segmentation in future studies [30,34,35].

Conclusions

We developed an automated 2-tiered Fuhrman’s grading system with prognostic significance. Our system demonstrated the potential of computational pathology to improve the reproducibility of ccRCC diagnosis and grading, and to aid the clinical management of ccRCC patients. Future work may include adapting our computational system to predict WHO/ISUP grades; validating our system on other ccRCC cohorts; using deep learning methods to detect ROIs, segment nuclei and predict grade; and exploring whether histomics features can predict prognosis independently of grade. This work is one step toward developing an artificial intelligence system for diagnostic pathology.

The average area under the receiver operating characteristic curve (AUC ROC) for each machine learning method using the training set after 100 iterations of random 10% hold-out.

These methods were implemented using the glmnet and caret packages in R. (DOCX)
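The repeated hold-out evaluation described in this caption can be illustrated with a minimal Python sketch. It is not the authors' code (the study used glmnet and caret in R): the one-feature `fit_midpoint_model` classifier, the function names, and the synthetic data are all hypothetical stand-ins, chosen only to show how an AUC might be averaged over 100 random 10% hold-outs.

```python
import random


def auc(scores, labels):
    """AUC ROC as the probability that a random positive case outscores a
    random negative case (ties count as one half)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(1.0 if p > q else 0.5 if p == q else 0.0 for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


def fit_midpoint_model(x_train, y_train):
    """Hypothetical stand-in classifier on one feature: the score is the signed
    distance from the midpoint between the two class means."""
    high = [x for x, y in zip(x_train, y_train) if y == 1]
    low = [x for x, y in zip(x_train, y_train) if y == 0]
    mean_high, mean_low = sum(high) / len(high), sum(low) / len(low)
    midpoint = (mean_high + mean_low) / 2
    sign = 1.0 if mean_high >= mean_low else -1.0
    return lambda x: sign * (x - midpoint)


def repeated_holdout_auc(x, y, fit, n_iter=100, holdout=0.10, seed=0):
    """Average AUC over n_iter random splits, each holding out a fraction of cases."""
    rng = random.Random(seed)
    indices = list(range(len(y)))
    n_test = max(1, round(holdout * len(y)))
    aucs = []
    for _ in range(n_iter):
        rng.shuffle(indices)
        test, train = indices[:n_test], indices[n_test:]
        y_test = [y[i] for i in test]
        if len(set(y_test)) < 2:
            continue  # AUC is undefined when the hold-out contains one class only
        score = fit([x[i] for i in train], [y[i] for i in train])
        aucs.append(auc([score(x[i]) for i in test], y_test))
    return sum(aucs) / len(aucs)


# Synthetic one-feature data (NOT study data): 50 low-grade and 50 high-grade cases.
data_rng = random.Random(1)
x = [data_rng.gauss(0.0, 1.0) for _ in range(50)] + [data_rng.gauss(1.5, 1.0) for _ in range(50)]
y = [0] * 50 + [1] * 50
mean_auc = repeated_holdout_auc(x, y, fit_midpoint_model)
print(f"mean hold-out AUC: {mean_auc:.3f}")
```

Any model that returns a scoring function can be passed as `fit`, so the same loop could compare several methods on identical splits by reusing the same seed.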

The association of the nine nuclear morphological features with Fuhrman’s grade in the 277 concordant cases (using the 2-tiered grading system).

(DOCX)

Contingency tables of Fuhrman’s grade between TCGA and Pathologist 1 with Pathologist 2 among the 118 discordant cases.

(DOCX)

The association of re-assigned grades for discordant cases.

(DOCX)

Agreement between TCGA and Pathologist 1.

(A) The agreement between the TCGA and Pathologist 1 using the 4-tiered grading was poor (frequency of agreement = 0.47, Cohen’s kappa = 0.20). (B) The agreement improved to moderate when using the 2-tiered grading system (frequency of agreement = 0.70, Cohen’s kappa = 0.41). (PDF)
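For readers unfamiliar with the two agreement measures quoted in this caption, the following Python sketch shows how the frequency of agreement and Cohen's kappa can be computed from a contingency table. The function name and the 2x2 counts are hypothetical and are not the study's data; they are chosen only so that the arithmetic is easy to follow.

```python
from typing import Sequence, Tuple


def cohens_kappa(table: Sequence[Sequence[int]]) -> Tuple[float, float]:
    """Return (observed agreement, Cohen's kappa) for a square contingency table.

    table[i][j] is the number of cases that rater A assigned to category i
    and rater B assigned to category j.
    """
    k = len(table)
    n = sum(sum(row) for row in table)
    # Observed agreement: fraction of cases on the diagonal.
    p_observed = sum(table[i][i] for i in range(k)) / n
    # Agreement expected by chance, from the raters' marginal totals.
    row_totals = [sum(row) for row in table]
    col_totals = [sum(table[i][j] for i in range(k)) for j in range(k)]
    p_expected = sum(row_totals[i] * col_totals[i] for i in range(k)) / (n * n)
    if p_expected == 1.0:
        return p_observed, float("nan")  # kappa is undefined in this degenerate case
    return p_observed, (p_observed - p_expected) / (1.0 - p_expected)


# Hypothetical 2-tiered counts (rows: rater A low/high; columns: rater B low/high).
example_table = [[40, 10],
                 [20, 30]]
agreement, kappa = cohens_kappa(example_table)
print(f"frequency of agreement = {agreement:.2f}, Cohen's kappa = {kappa:.2f}")
# prints: frequency of agreement = 0.70, Cohen's kappa = 0.40
```

In this example the raters agree on 70 of 100 cases, but half of that agreement is expected by chance, which is why kappa is considerably lower than the raw frequency of agreement.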

These plots contain the 277 cases that were concordant between TCGA and Pathologist 1 using the 2-tiered grading system.

Nine morphological features (1: Area, 2: Roundness, 3: Elongation, 4: Flatness, 5: Perimeter, 6: Equivalent Spherical Perimeter, 7: Equivalent Spherical Radius, 8: Minor Axis of the Ellipse Fit, and 9: Nuclei Major Axis of the Ellipse Fit) were plotted with their median and median absolute deviation values. Each feature and measure was stratified either by the 4-tiered (A and B) or 2-tiered (C and D) grades assigned by Pathologist 1, respectively. (PDF)
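The two summary measures named in this caption can be sketched in a few lines of Python. This is an illustration only: the function name and the nuclear area values are hypothetical, and the sketch assumes the unscaled median absolute deviation, since the caption does not state whether a scaling constant was applied.

```python
from statistics import median


def median_and_mad(values):
    """Return the median and the (unscaled) median absolute deviation."""
    values = list(values)
    center = median(values)
    # MAD: the median of the absolute distances from the median.
    mad = median([abs(v - center) for v in values])
    return center, mad


# Hypothetical nuclear areas for one case; the last value mimics an outlier nucleus.
nuclear_areas = [30.0, 32.0, 35.0, 41.0, 90.0]
area_median, area_mad = median_and_mad(nuclear_areas)
print(f"median = {area_median}, MAD = {area_mad}")
# prints: median = 35.0, MAD = 5.0
```

Both measures are robust to outliers: the single very large nucleus in the example shifts neither the median nor the MAD, whereas it would inflate a mean and a standard deviation.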

Patient data with manual and predicted grades.

(CSV)
  28 in total

1.  Cell segmentation in histopathological images with deep learning algorithms by utilizing spatial relationships.

Authors:  Nuh Hatipoglu; Gokhan Bilgin
Journal:  Med Biol Eng Comput       Date:  2017-02-28       Impact factor: 2.602

2.  Automated Renal Cancer Grading Using Nuclear Pleomorphic Patterns.

Authors:  Daniel Aitor Holdbrook; Malay Singh; Yukti Choudhury; Emarene Mationg Kalaw; Valerie Koh; Hui Shan Tan; Ravindran Kanesvaran; Puay Hoon Tan; John Yuen Shyi Peng; Min-Han Tan; Hwee Kuan Lee
Journal:  JCO Clin Cancer Inform       Date:  2018-12

3.  Diagnostic Assessment of Deep Learning Algorithms for Detection of Lymph Node Metastases in Women With Breast Cancer.

Authors:  Babak Ehteshami Bejnordi; Mitko Veta; Paul Johannes van Diest; Bram van Ginneken; Nico Karssemeijer; Geert Litjens; Jeroen A W M van der Laak; Meyke Hermsen; Quirine F Manson; Maschenka Balkenhol; Oscar Geessink; Nikolaos Stathonikos; Marcory Crf van Dijk; Peter Bult; Francisco Beca; Andrew H Beck; Dayong Wang; Aditya Khosla; Rishab Gargeya; Humayun Irshad; Aoxiao Zhong; Qi Dou; Quanzheng Li; Hao Chen; Huang-Jing Lin; Pheng-Ann Heng; Christian Haß; Elia Bruni; Quincy Wong; Ugur Halici; Mustafa Ümit Öner; Rengul Cetin-Atalay; Matt Berseth; Vitali Khvatkov; Alexei Vylegzhanin; Oren Kraus; Muhammad Shaban; Nasir Rajpoot; Ruqayya Awan; Korsuk Sirinukunwattana; Talha Qaiser; Yee-Wah Tsang; David Tellez; Jonas Annuscheit; Peter Hufnagl; Mira Valkonen; Kimmo Kartasalo; Leena Latonen; Pekka Ruusuvuori; Kaisa Liimatainen; Shadi Albarqouni; Bharti Mungal; Ami George; Stefanie Demirci; Nassir Navab; Seiryo Watanabe; Shigeto Seno; Yoichi Takenaka; Hideo Matsuda; Hady Ahmady Phoulady; Vassili Kovalev; Alexander Kalinovsky; Vitali Liauchuk; Gloria Bueno; M Milagro Fernandez-Carrobles; Ismael Serrano; Oscar Deniz; Daniel Racoceanu; Rui Venâncio
Journal:  JAMA       Date:  2017-12-12       Impact factor: 56.272

4.  Application of simplified Fuhrman grading system in clear-cell renal cell carcinoma.

Authors:  Sung K Hong; Chang W Jeong; Ji H Park; Hyung S Kim; Cheol Kwak; Gheeyoung Choe; Hyeon H Kim; Sang E Lee
Journal:  BJU Int       Date:  2010-08-26       Impact factor: 5.588

5.  Regularization Paths for Generalized Linear Models via Coordinate Descent.

Authors:  Jerome Friedman; Trevor Hastie; Rob Tibshirani
Journal:  J Stat Softw       Date:  2010       Impact factor: 6.440

Review 6.  Methods for nuclei detection, segmentation, and classification in digital histopathology: a review-current status and future potential.

Authors:  Humayun Irshad; Antoine Veillard; Ludovic Roux; Daniel Racoceanu
Journal:  IEEE Rev Biomed Eng       Date:  2014

Review 7.  Renal cell carcinoma.

Authors:  Brian I Rini; Steven C Campbell; Bernard Escudier
Journal:  Lancet       Date:  2009-03-05       Impact factor: 79.321

8.  Multifeature prostate cancer diagnosis and Gleason grading of histological images.

Authors:  Ali Tabesh; Mikhail Teverovskiy; Ho-Yuen Pang; Vinay P Kumar; David Verbel; Angeliki Kotsianti; Olivier Saidi
Journal:  IEEE Trans Med Imaging       Date:  2007-10       Impact factor: 10.048

9.  Automated grading of renal cell carcinoma using whole slide imaging.

Authors:  Fang-Cheng Yeh; Anil V Parwani; Liron Pantanowitz; Chien Ho
Journal:  J Pathol Inform       Date:  2014-07-30

10.  Predicting non-small cell lung cancer prognosis by fully automated microscopic pathology image features.

Authors:  Kun-Hsing Yu; Ce Zhang; Gerald J Berry; Russ B Altman; Christopher Ré; Daniel L Rubin; Michael Snyder
Journal:  Nat Commun       Date:  2016-08-16       Impact factor: 14.919
