Jeffrey B Hodgin, Arvind Rao, Joonsang Lee, Elisa Warner, Salma Shaikhouni, Markus Bitzer, Matthias Kretzler, Debbie Gipson, Subramaniam Pennathur, Keith Bellovich, Zeenat Bhat, Crystal Gadegbeku, Susan Massengill, Kalyani Perumal, Jharna Saha, Yingbao Yang, Jinghui Luo, Xin Zhang, Laura Mariani.
Abstract
Pathologists use visual classification to assess patient kidney biopsy samples when diagnosing the underlying cause of kidney disease. However, the assessment is qualitative, or semi-quantitative at best, and reproducibility is challenging. To discover previously unknown features that predict patient outcomes and to overcome substantial interobserver variability, we developed an unsupervised bag-of-words model. We applied it to the C-PROBE cohort of patients with chronic kidney disease (CKD). From 161 biopsy cores we obtained 107,471 histopathology images and identified important morphological features in biopsy tissue that are highly predictive of the presence of CKD both at the time of biopsy and at one year. To evaluate the performance of our model, we estimated the AUC and its 95% confidence interval. We show that this method is reliable and reproducible and can achieve an AUC of 0.93 in predicting glomerular filtration rate at the time of biopsy as well as in predicting a loss of function at one year. Additionally, with this method, we ranked the identified morphological features according to their importance as diagnostic markers for chronic kidney disease. In this study, we have demonstrated the feasibility of using an unsupervised machine learning method, without human input, to predict the level of kidney function in CKD. The results indicate that the visual dictionary, or visual image pattern, obtained from unsupervised machine learning can predict outcomes using machine-derived values that correspond to both known and unknown clinically relevant features.
Year: 2022 PMID: 35318420 PMCID: PMC8941143 DOI: 10.1038/s41598-022-08974-8
Source DB: PubMed Journal: Sci Rep ISSN: 2045-2322 Impact factor: 4.379
Figure 1. Workflow for unsupervised learning using a bag-of-words paradigm. In step (1), the cortex part of the biopsy sample was used; (2) the Reinhard stain color normalization method was applied; (3) each biopsy sample image was tiled into 256 × 256 pixel patches; (4) features were extracted from each patch using transfer learning from a deep learning model; (5) an unsupervised machine learning algorithm, K-means clustering, was applied; and finally (6) a histogram representation was created for each biopsy sample to describe the distribution of each type of cluster at the patient level.
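Steps (3) and (6) of this workflow can be illustrated with a minimal sketch. It assumes non-overlapping tiles, drops edge regions that do not fill a whole patch, and uses made-up cluster assignments in place of real K-means output; the function names `tile_image` and `visual_word_histogram` are hypothetical and not taken from the study.

```python
import numpy as np

PATCH = 256  # patch edge length in pixels, as in step (3)

def tile_image(image, patch=PATCH):
    """Split an H x W x C image into non-overlapping patch x patch tiles.

    Edge regions that do not fill a whole tile are dropped (an assumption).
    """
    h, w = image.shape[:2]
    tiles = []
    for top in range(0, h - patch + 1, patch):
        for left in range(0, w - patch + 1, patch):
            tiles.append(image[top:top + patch, left:left + patch])
    return tiles

def visual_word_histogram(cluster_ids, n_words):
    """Return the fraction of a sample's patches assigned to each visual word."""
    counts = np.bincount(np.asarray(cluster_ids), minlength=n_words)
    return counts / counts.sum()

# Toy usage: a blank 600 x 900 RGB image yields 2 x 3 = 6 full tiles.
image = np.zeros((600, 900, 3), dtype=np.uint8)
tiles = tile_image(image)

# Hypothetical cluster assignments for the six tiles, with K = 9 visual words.
hist = visual_word_histogram([0, 0, 1, 4, 4, 4], n_words=9)
```

The resulting `hist` vector is the kind of per-sample frequency feature that a downstream classifier could consume, alongside any other features.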
Figure 2. Example images of biopsy samples. Multiple cortexes are combined in each case. To reduce the color and intensity variations present in the stained images, we computed the global mean and standard deviation of each channel in the Lab color space over all data and used them as reference values for Reinhard color normalization. The figure shows images before (left) and after (right) Reinhard color normalization.
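The normalization described in this caption amounts to matching per-channel statistics. The sketch below assumes the image has already been converted to Lab space (the conversion is omitted) and uses made-up reference values; `reinhard_normalize` is a hypothetical name, not the authors' code.

```python
import numpy as np

def reinhard_normalize(lab_image, ref_mean, ref_std, eps=1e-8):
    """Match each Lab channel's mean and std to the reference values.

    lab_image: float array of shape (H, W, 3), already in Lab space.
    ref_mean, ref_std: length-3 reference statistics, e.g. global values
    computed over the whole data set.
    """
    lab = np.asarray(lab_image, dtype=np.float64)
    src_mean = lab.mean(axis=(0, 1))
    src_std = lab.std(axis=(0, 1))
    # Standardize each channel, then rescale to the reference statistics.
    return (lab - src_mean) / (src_std + eps) * ref_std + ref_mean

# Toy usage with synthetic Lab-like values (not real stain data).
rng = np.random.default_rng(0)
lab = rng.normal(loc=[60.0, 10.0, -5.0], scale=[12.0, 4.0, 6.0], size=(64, 64, 3))
ref_mean = np.array([70.0, 5.0, 0.0])  # made-up reference mean
ref_std = np.array([10.0, 3.0, 3.0])   # made-up reference std
out = reinhard_normalize(lab, ref_mean, ref_std)
```

After the call, each channel of `out` has the reference mean and standard deviation, which is what removes slide-to-slide color differences before feature extraction.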
Figure 3. Workflow for the deep learning segmentation. Step 1: New images were fed into our model (detector) for automatic segmentation. Step 2: Our experts corrected errors manually. Step 3: The corrected segmented images were examined by a pathologist (JBH) for quality control. Step 4: Post-processing was performed to remove stray dots or pixels. Step 5: The final gold-standard labeled data were used to train our model to improve segmentation accuracy.
Figure 6. (A) A visual dictionary that consists of 9 representative visual words, (B) a representative cortex example, and (C) its cluster map with colored patches. Each colored patch corresponds to its assigned visual word.
Baseline characteristics of the participants.
| # | Diagnosis | # | Race | # | Gender |
|---|---|---|---|---|---|
| 0 | Lupus (n = 22) | 0 | White/Caucasian (n = 38) | 0 | Male (n = 16) |
| 1 | Minimal change/FSGS (n = 8) | 1 | Black/African American (n = 11) | 1 | Female (n = 42) |
| 2 | Membranous nephropathy (n = 3) | 2 | Asian/Asian American (n = 4) | | |
| 3 | IgA/HSP (n = 9) | 3 | Multiracial (n = 1) | | |
| 4 | Other GN (n = 3) | 4 | American Indian/Alaskan Native (n = 1) | | |
| 5 | Diabetic nephropathy/Hypertensive nephropathy (n = 12) | 5 | Others (n = 2) | | |
Figure 4. An example of a trichrome-stained image (left) and an automatically segmented image from our trained deep learning model (right).
Deep learning segmentation results.
| Structure | Glomeruli | Arterioles | GS Glomeruli | Interstitium | Tubules |
|---|---|---|---|---|---|
| Accuracy | 0.95 | 0.87 | 0.88 | 0.91 | 0.98 |
| IoU | 0.92 | 0.78 | 0.75 | 0.84 | 0.77 |
GS, globally sclerosed; IoU, intersection over union.
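For readers unfamiliar with the two metrics in this table, the sketch below computes them for a single binary mask. It assumes accuracy means the fraction of pixels labeled correctly, which the source does not state explicitly, and the masks are toy data.

```python
import numpy as np

def pixel_accuracy(pred, truth):
    """Fraction of pixels where prediction and ground truth agree."""
    pred = np.asarray(pred, dtype=bool)
    truth = np.asarray(truth, dtype=bool)
    return float(np.mean(pred == truth))

def iou(pred, truth):
    """Intersection over union of two binary masks."""
    pred = np.asarray(pred, dtype=bool)
    truth = np.asarray(truth, dtype=bool)
    inter = np.logical_and(pred, truth).sum()
    union = np.logical_or(pred, truth).sum()
    # Two empty masks agree perfectly; define IoU as 1.0 in that case.
    return float(inter / union) if union else 1.0

# Toy 4 x 4 masks: the truth is a 2 x 2 square, the prediction is 2 x 3.
truth = np.zeros((4, 4), dtype=bool)
truth[:2, :2] = True
pred = np.zeros((4, 4), dtype=bool)
pred[:2, :3] = True

acc = pixel_accuracy(pred, truth)  # 14 of 16 pixels agree -> 0.875
score = iou(pred, truth)           # 4 shared pixels / 6 in union -> 0.667
```

Under this assumed definition, accuracy counts correctly labeled background pixels while IoU does not, so accuracy can be noticeably higher than IoU for the same structure.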
Figure 5. Optimal K using the silhouette method. (A) First, we ran the algorithm at every fifth value of K between 5 and 100, and then (B) ran it for K between 4 and 12, which gave the optimal K = 9 for K-means clustering.
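The coarse-then-fine search in this caption can be sketched on synthetic data. Everything below is illustrative: the K-means routine is a bare-bones Lloyd's algorithm with deterministic farthest-point seeding (not necessarily what the authors used), the grids are scaled down to suit a 90-point toy set, and the helper names are hypothetical.

```python
import numpy as np

def kmeans(X, k, n_iter=50):
    """Minimal Lloyd's algorithm with deterministic farthest-point seeding."""
    centers = [X[0]]
    for _ in range(1, k):
        d = np.min([np.linalg.norm(X - c, axis=1) for c in centers], axis=0)
        centers.append(X[np.argmax(d)])
    centers = np.array(centers, dtype=float)
    for _ in range(n_iter):
        dist = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        labels = dist.argmin(axis=1)
        for j in range(k):
            if np.any(labels == j):
                centers[j] = X[labels == j].mean(axis=0)
    return labels

def silhouette(X, labels):
    """Mean silhouette coefficient, (b - a) / max(a, b) averaged over points."""
    dist = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=2)
    clusters = np.unique(labels)
    scores = []
    for i in range(len(X)):
        same = labels == labels[i]
        n_same = same.sum()
        if n_same == 1:
            scores.append(0.0)  # usual convention for singleton clusters
            continue
        a = dist[i, same].sum() / (n_same - 1)  # self-distance is 0
        b = min(dist[i, labels == c].mean() for c in clusters if c != labels[i])
        scores.append((b - a) / max(a, b))
    return float(np.mean(scores))

def best_k(X, candidates):
    """Return the candidate K with the highest mean silhouette, plus all scores."""
    scores = {k: silhouette(X, kmeans(X, k)) for k in candidates}
    return max(scores, key=scores.get), scores

def coarse_to_fine(X, coarse, half_width):
    """Scan a coarse grid of K, then rescan every K near the coarse optimum."""
    k0, _ = best_k(X, coarse)
    fine = range(max(2, k0 - half_width), k0 + half_width + 1)
    return best_k(X, fine)

# Synthetic data: three tight, well-separated 2-D blobs of 30 points each.
rng = np.random.default_rng(0)
blob_centers = np.array([[0.0, 0.0], [10.0, 0.0], [0.0, 10.0]])
X = np.repeat(blob_centers, 30, axis=0) + rng.normal(scale=0.5, size=(90, 2))

k_best, fine_scores = coarse_to_fine(X, coarse=range(3, 13, 3), half_width=2)
```

In the caption's setting, the coarse grid would be `range(5, 101, 5)` and the fine grid `range(4, 13)`; the two-stage scan avoids evaluating every K up to 100.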
Figure 7. An example of cortex trichrome-stained images with color-coded patches and zoomed images.
Figure 8. ROC curves for the prediction of the level of kidney function (A, B) at the biopsy and (C, D) in the future. F1, F2, and F3 represent frequency, polynomial fitting coefficients, and clinical features, respectively. Top7 represents the top 7 features selected based on the importance rank. The x-axis is the true negative rate (TNR) or specificity and the y-axis is the true positive rate (TPR) or sensitivity.
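The area under such a ROC curve, with a 95% confidence interval, can be estimated as in the sketch below. The source does not say how the interval was obtained; a percentile bootstrap is assumed here purely for illustration, and the labels and scores are toy values.

```python
import random

def auc(labels, scores):
    """AUC as the probability that a random positive outscores a random
    negative (ties count one half); equivalent to the Mann-Whitney U statistic."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def bootstrap_ci(labels, scores, n_boot=2000, alpha=0.05, seed=0):
    """Percentile bootstrap confidence interval for the AUC (an assumed method)."""
    rng = random.Random(seed)
    n = len(labels)
    stats = []
    while len(stats) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        ys = [labels[i] for i in idx]
        if len(set(ys)) < 2:
            continue  # resample lacks one class, so the AUC is undefined
        stats.append(auc(ys, [scores[i] for i in idx]))
    stats.sort()
    lo = stats[int((alpha / 2) * n_boot)]
    hi = stats[int((1 - alpha / 2) * n_boot) - 1]
    return lo, hi

# Toy data: 4 negatives and 4 positives with one misordered pair.
labels = [0, 0, 0, 0, 1, 1, 1, 1]
scores = [0.1, 0.3, 0.35, 0.6, 0.4, 0.7, 0.8, 0.9]

point = auc(labels, scores)  # 15 of 16 positive/negative pairs are ordered correctly
lo, hi = bootstrap_ci(labels, scores)
```

The pairwise formulation gives the same area regardless of whether the curve is drawn against specificity or against the false positive rate.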
Ranking of the important features for the dichotomized level of kidney function at the biopsy.
| Features | Description | Gini index | Rank | Overall rank |
|---|---|---|---|---|
| Frequency (visual dictionary) | f1 (red) | 0.64 | 9 | 16 |
| | f2 (blue) | 3.68 | 1 | 2 |
| | f3 (green) | 0.85 | 6 | 11 |
| | f4 (black) | 0.87 | 5 | 10 |
| | f5 (cyan) | 2.10 | 3 | 4 |
| | f6 (orange) | 0.69 | 8 | 15 |
| | f7 (yellow) | 1.26 | 4 | 7 |
| | f8 (dark blue) | 2.12 | 2 | 3 |
| | f9 (white) | 0.73 | 7 | 13 |
| Polynomial coefficient | | 1.69 | 1 | 6 |
| | | 1.17 | 2 | 8 |
| | | 1.09 | 3 | 9 |
| | | 0.70 | 5 | 14 |
| | | 0.85 | 4 | 12 |
| Clinical | Age | 4.68 | 1 | 1 |
| | Gender | 0.60 | 3 | 17 |
| | Race | 0.37 | 4 | 18 |
| | Diagnosis | 1.98 | 2 | 5 |
c: polynomial coefficients in Eq. (1).
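The overall ranks in this table follow directly from sorting the Gini values in descending order, as the short check below shows. The labels `poly_1` to `poly_5` are placeholders, since the individual coefficient names are not given in the table; that the tie at 0.85 is broken by table order is an observation about the listed ranks, not a documented rule.

```python
# Gini importances copied from the table above, in table order.
# poly_1..poly_5 are placeholder labels for the unnamed polynomial coefficients.
gini = [
    ("f1", 0.64), ("f2", 3.68), ("f3", 0.85), ("f4", 0.87), ("f5", 2.10),
    ("f6", 0.69), ("f7", 1.26), ("f8", 2.12), ("f9", 0.73),
    ("poly_1", 1.69), ("poly_2", 1.17), ("poly_3", 1.09),
    ("poly_4", 0.70), ("poly_5", 0.85),
    ("Age", 4.68), ("Gender", 0.60), ("Race", 0.37), ("Diagnosis", 1.98),
]

# sorted() is stable, so the tie at 0.85 keeps table order (f3 before poly_5).
ordered = sorted(gini, key=lambda item: item[1], reverse=True)
overall_rank = {name: i + 1 for i, (name, _) in enumerate(ordered)}

# The seven most important features under this ranking.
top7 = [name for name, _ in ordered[:7]]
```

Here `top7` contains Age, f2, f8, f5, Diagnosis, the first polynomial coefficient, and f7, a mix of clinical, visual-word, and polynomial features.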
Ranking of the important features for the prediction of eGFR slope.
| Features | Description | Gini index (importance) | Rank | Overall rank |
|---|---|---|---|---|
| Frequency (visual dictionary) | f1 (red) | 1.23 | 6 | 13 |
| | f2 (blue) | 1.21 | 7 | 14 |
| | f3 (green) | 1.55 | 5 | 11 |
| | f4 (black) | 1.96 | 4 | 4 |
| | f5 (cyan) | 0.72 | 9 | 17 |
| | f6 (orange) | 2.14 | 3 | 3 |
| | f7 (yellow) | 2.43 | 1 | 1 |
| | f8 (dark blue) | 2.27 | 2 | 2 |
| | f9 (white) | 0.75 | 8 | 16 |
| Polynomial coefficient | | 1.77 | 1 | 6 |
| | | 1.74 | 2 | 7 |
| | | 1.01 | 5 | 15 |
| | | 1.66 | 3 | 9 |
| | | 1.37 | 4 | 12 |
| Clinical | Age | 1.73 | 2 | 8 |
| | Gender | 0.13 | 6 | 20 |
| | Race | 0.24 | 5 | 19 |
| | Diagnosis | 0.67 | 4 | 18 |
| | eGFR | 1.82 | 1 | 5 |
| | UPC | 1.58 | 3 | 10 |
c, polynomial coefficients in Eq. (1); UPC, urine protein creatinine ratio.
Description of 9 representative visual words.
| Visual words | Corresponding kidney structures | Rank (at the biopsy) | Rank (slope) |
|---|---|---|---|
| #1 (red) | Normal TI | 9 | 6 |
| #2 (blue) | Open glomerulus including normal and inflamed but not GS | 1 | 7 |
| #3 (green) | Normal TI-more white space or cells | 6 | 5 |
| #4 (black) | Normal TI, some interstitial expansion | 5 | 4 |
| #5 (cyan) | GS, IF, Arterioles including white space | 3 | 9 |
| #6 (orange) | Normal TI | 8 | 3 |
| #7 (yellow) | Normal and nearly normal TI with more interstitial area | 4 | 1 |
| #8 (dark blue) | Mostly interstitial expansion and tubular atrophy and some cellularity | 2 | 2 |
| #9 (white) | Interstitial expansion | 7 | 8 |
TI, tubulointerstitial; GS, glomerulosclerosis; IF, interstitial fibrosis.