Literature DB >> 36042326

The utility of texture analysis of kidney MRI for evaluating renal dysfunction with multiclass classification model.

Yuki Hara1, Keita Nagawa2,3, Yuya Yamamoto1, Kaiji Inoue1, Kazuto Funakoshi1, Tsutomu Inoue4, Hirokazu Okada4, Masahiro Ishikawa5, Naoki Kobayashi5, Eito Kozawa1.   

Abstract

We evaluated a multiclass classification model to predict estimated glomerular filtration rate (eGFR) groups in chronic kidney disease (CKD) patients using magnetic resonance imaging (MRI) texture analysis (TA). We identified 166 CKD patients who underwent MRI comprising Dixon-based T1-weighted in-phase (IP)/opposed-phase (OP)/water-only (WO) images, apparent diffusion coefficient (ADC) maps, and T2* maps. The patients were divided into severe, moderate, and control groups based on eGFR borderlines of 30 and 60 mL/min/1.73 m2. After extracting 93 texture features (TFs), dimension reduction was performed using inter-observer reproducibility analysis and sequential feature selection (SFS) algorithm. Models were created using linear discriminant analysis (LDA); support vector machine (SVM) with linear, rbf, and sigmoid kernels; decision tree (DT); and random forest (RF) classifiers, with synthetic minority oversampling technique (SMOTE). Models underwent 100-time repeat nested cross-validation. Overall performances of our classification models were modest, and TA based on T1-weighted IP/OP/WO images provided better performance than those based on ADC and T2* maps. The most favorable result was observed in the T1-weighted WO image using RF classifier and the combination model was derived from all T1-weighted images using SVM classifier with rbf kernel. Among the selected TFs, total energy and energy had weak correlations with eGFR.
© 2022. The Author(s).

Entities:  

Mesh:

Year:  2022        PMID: 36042326      PMCID: PMC9427930          DOI: 10.1038/s41598-022-19009-7

Source DB:  PubMed          Journal:  Sci Rep        ISSN: 2045-2322            Impact factor:   4.996


Introduction

Chronic kidney disease (CKD) affects 8–16% of the population worldwide and remains a major threat to global public health due to its increasing incidence and mortality. Common causes of CKD include diabetes and hypertension, especially in developed countries. However, less than 5% of patients with early CKD report being aware of their disease[1]. Therefore, appropriate screening, early diagnosis, and management are significant in preventing CKD-associated adverse clinical outcomes, such as end-stage kidney disease, cardiovascular disease, and increased mortality. The Kidney Disease Improving Global Outcomes guidelines[2] suggested a risk-based approach to the evaluation and management of CKD and proposed six disease categories related to the estimated glomerular filtration rate (eGFR): G1–G5, with G3 subdivided into 3a and 3b. The most essential cutoff points of eGFR are 60 and 30 mL/min/1.73 m2 (as the borderlines of G2/G3 and G3/G4, respectively); therefore, the risk of death was reported to increase as the eGFR decreased below 60 mL/min/1.73 m2 in recent CKD cohort studies[3]. In addition, an eGFR of less than 30 mL/min/1.73 m2 is important from a radiological point of view as it relates to the availability of the contrast media[4]. The risk stratification of CKD based on the eGFR has undisputed advantages and has helped achieve greater awareness of CKD and its impact on global health. Ischemia and hypoxia are associated with the progression of CKD; however, clinical tools to quantify these factors in patients are lacking. Renal biopsy is the gold standard method to histologically evaluate renal pathology; nevertheless, it carries certain risks due to complications, such as bleeding. Conversely, magnetic resonance imaging (MRI) of the kidney has been used to non-invasively assess CKD progression. Several MRI methods have been successfully used to evaluate renal function, including diffusion-weighted imaging (DWI) and blood oxygen level-dependent imaging (BOLD). DWI and apparent diffusion coefficient (ADC) values are the most studied methods and have demonstrated a good correlation with renal function decline and renal fibrosis in CKD[5-8]. BOLD based on the T2* map reflects the regional renal oxygenation status and can assess hypoxia occurring during renal dysfunction[9,10]. Although Dixon-based gradient-echo MRI is another imaging method that is routinely performed in abdominal imaging and can measure renal lipid accumulation in type II diabetes mellitus[11], its utility in the evaluation of CKD has not been thoroughly studied. Texture analysis (TA) is an emerging technique that permits the quantification of image characteristics based on the distribution of pixels and their surface intensity or patterns[12,13]. TA has been applied to several medical image analyses, including oncologic imaging[14], neuroimaging[15], and abdominal imaging[16,17]. Recent reports have demonstrated the utility of TA based on DWI, BOLD, Susceptibility-weighted imaging (SWI), and T1 and T2 mapping[18,19]. However, TA of other essential MRI methods, including Dixon-based T1-weighted imaging (T1WI), has not been fully studied. Previous studies have described that as the renal function declines, a decreased difference between the values in the cortex and those in the medulla is observed in T1WI[20,21]. Therefore, we hypothesized that TA based on Dixon-based T1WI is especially important because of its capacity to capture the clearest images and reflect the morphological characteristics of the kidney. Thus, this study aimed to assess and compare the performance of TA based on Dixon-based T1WI, ADC maps, and T2* maps (BOLD) for evaluating renal dysfunction.

Results

Clinical characteristics

The study included 166 participants. The major etiologies of CKD were hypertensive nephrosclerosis (n = 80), diabetic mellitus nephropathy (n = 25), immunoglobulin A (IgA) nephropathy (n = 22), and nephrotic syndrome (n = 5). No abnormalities were observed in the remaining 34 patients. According to the eGFR, 36 patients had severe renal dysfunction (se-RD) (men, n [%] = 26 [72], mean age 60.9 ± 16.4 years, mean eGFR 19.8 ± 7.7 mL/min/1.73 m2), 85 patients had moderate renal dysfunction (mo-RD) (men, n [%] = 57 [67], mean age 62.3 ± 13.3 years, mean eGFR 46.3 ± 8.1 mL/min/1.73 m2), and 45 patients were in the control group (CG) (men, n [%] = 19 [42], mean age 43.7 ± 18.1 years, mean eGFR 78.1 ± 16.7 mL/min/1.73 m2). Table 1 details the distribution of the study population in each eGFR group. The age, percentage of men, and incidence rates of hypertension and diabetes increased significantly with progressive renal dysfunction. There was no significant difference in the incidence rate of IgA nephropathy and nephrotic syndrome among the three groups.
Table 1

The demographic and clinical characteristics of the study population.

Variablese-RDmo-RDCGP
N368545
Age, years, mean ± SD60.9 ± 16.462.3 ± 13.343.7 ± 18.1< 0.001
Sex, men, n (%)26 (72)57 (67)19 (42)0.006
Hypertension, n (%)29 (81)41 (48)10 (22)< 0.001
Diabetes, n (%)12 (33)11 (13)2 (4)< 0.001
IgA nephropathy, n (%)5 (14)10 (12)7 (16)0.54
Nephrotic syndrome, n (%)1 (2.8)2 (2)2 (4.4)0.51
eGFR, mL/min/1.73 m2, mean ± SD19.8 ± 7.746.3 ± 8.178.1 ± 16.7< 0.001

Unless otherwise indicated, data are represented as the number (%) of patients. se-RD severe renal dysfunction (estimated glomerular filtration rate [eGFR] < 30 mL/min/1.73 m2, i.e., CKD stage G4–5), mo-RD moderate renal dysfunction (30 ≤ eGFR < 60 mL/min/1.73 m2, i.e., CKD stage G3a/3b), CG control group (eGFR ≥ 60 mL/min/1.73 m2, i.e., CKD stage G1–2), IgA immunoglobulin A, SD standard deviation.

The demographic and clinical characteristics of the study population. Unless otherwise indicated, data are represented as the number (%) of patients. se-RD severe renal dysfunction (estimated glomerular filtration rate [eGFR] < 30 mL/min/1.73 m2, i.e., CKD stage G4–5), mo-RD moderate renal dysfunction (30 ≤ eGFR < 60 mL/min/1.73 m2, i.e., CKD stage G3a/3b), CG control group (eGFR ≥ 60 mL/min/1.73 m2, i.e., CKD stage G1–2), IgA immunoglobulin A, SD standard deviation.

Dimension reduction of texture features

The T1-weighted in-phase (IP)/opposed-phase (OP)/water-only (WO) images showed good reproducibility in the inter-observer reproducibility analysis, with mean interclass correlation coefficient (ICC) values of 0.767, 0.774, and 0.781, respectively. Conversely, the ADC and T2* maps showed slightly lower reproducibility, with mean ICC values of 0.732 and 0.718, respectively. Good inter-observer reproducibility was observed for 59, 60, 61, 54, and 50 features (ICC ≥ 0.75 and lower 95% confidence interval [CI] ≥ 0.6) in T1-weighted IP/OP/WO images, ADC map, and T2* map, respectively. By excluding features with poor reproducibility (ICC < 0.75 or lower 95% CI < 0.6) from any one of the imaging methods, the number of features for each imaging method was reduced to 40. Table 2 lists the ICC values of these 40 features for each imaging method.
Table 2

Representative texture features and their respective intraclass correlation coefficient.

CodeFeature classFeature name codeImaging method
T1WIIPT1WIOPT1WIWOADC mapT2* map
TF1First-order10th percentile0.9980.9830.9650.9820.757
TF2First-order90th percentile0.9930.9890.9970.9790.992
TF3First-orderEnergy0.9490.9420.8700.8820.759
TF4First-orderEntropy0.9050.9120.9120.8870.919
TF5First-orderInterquartile range0.9840.9730.9870.9570.966
TF6First-orderMean absolute deviation0.9400.9460.9520.8930.950
TF7First-orderMean0.9980.9920.9930.9850.973
TF8First-orderMedian0.9990.9950.9960.9870.979
TF9First-orderRobust mean absolute deviation0.9820.9750.9860.9510.971
TF10First-orderRoot mean squared0.9970.9930.9950.9850.944
TF11First-orderTotal energy0.9870.9430.8720.9480.800
TF12First-orderUniformity0.9510.9490.9610.9300.849
TF13GLCMDifference average0.8570.8830.8050.8720.917
TF14GLCMDifference entropy0.8490.8740.8080.8460.948
TF15GLCMId0.9480.9530.9360.9370.956
TF16GLCMIdm0.9530.9590.9490.9410.956
TF17GLCMInverse variance0.9270.9570.9450.9130.944
TF18GLCMJoint energy0.9400.9460.9550.9370.856
TF19GLCMJoint entropy0.8870.9100.8900.9050.935
TF20GLCMMaximum probability0.9550.9570.9720.9490.855
TF21GLCMSum entropy0.9380.9310.9310.9030.908
TF22GLDMDependence non uniformity0.9060.9050.8030.8500.841
TF23GLDMDependence non uniformity normalized0.9840.9740.9680.9810.981
TF24GLDMDependence variance0.9950.9760.9920.9810.925
TF25GLDMGray level non uniformity0.9740.9650.9870.9850.846
TF26GLDMLarge dependence emphasis0.9860.9720.9780.9680.930
TF27GLDMSmall dependence emphasis0.9560.9640.9510.9400.957
TF28GLRLMGray level non uniformity0.9630.9670.9830.9830.772
TF29GLRLMGray level non uniformity normalized0.9400.9450.9550.9090.854
TF30GLRLMLong run emphasis0.9860.9700.9760.9630.893
TF31GLRLMRun entropy0.9260.8860.8960.7710.764
TF32GLRLMRun length non uniformity0.8800.9160.8330.8010.786
TF33GLRLMRun length non uniformity normalized0.9700.9690.9640.9510.948
TF34GLRLMRun percentage0.9800.9710.9710.9630.941
TF35GLRLMRun variance0.9910.9700.9820.9700.875
TF36GLRLMShort run emphasis0.9710.9690.9650.9470.936
TF37GLSZMGray level non uniformity normalized0.8830.9250.9290.7990.845
TF38GLSZMSize zone non uniformity normalized0.9120.9510.9170.8560.890
TF39GLSZMSmall area emphasis0.9100.9490.9160.8380.842
TF40GLSZMZone percentage0.9700.9680.9640.9530.952

ADC apparent diffusion coefficient, GLCM gray-level co-occurrence matrix, GLDM gray-level dependence matrix, GLRLM gray-level run length matrix, GLSZM gray-level size zone matrix, IP in-phase, OP opposed-phase, TF texture feature, WO water-only.

Representative texture features and their respective intraclass correlation coefficient. ADC apparent diffusion coefficient, GLCM gray-level co-occurrence matrix, GLDM gray-level dependence matrix, GLRLM gray-level run length matrix, GLSZM gray-level size zone matrix, IP in-phase, OP opposed-phase, TF texture feature, WO water-only. Subsequently, the sequential feature selection (SFS) algorithm was used for feature selection. For each imaging method and machine learning (ML) classifier, a subset of five features that provided good classification accuracies was identified. The selected features for each classification attempt are listed in Tables 3, 4, 5, 6 and 7.
Table 3

Performance of each classification attempt in discriminating between the three groups in T1-weighted in-phase imaging (T1WI IP).

Accuracy (%)Sensitivity (%)Specificity (%)AUC
LDA (selected TFs = TF1, TF3, TF4, TF8, and TF20)
Macro-average71.1 ± 1.056.7 ± 1.878.4 ± 1.00.764 ± 0.004
se-RD70.0 ± 1.154.8 ± 1.577.6 ± 1.10.763 ± 0.004
mo-RD64.4 ± 1.049.3 ± 2.372.0 ± 0.90.676 ± 0.007
CG79.0 ± 1.166.0 ± 1.585.5 ± 0.90.840 ± 0.005
SVM with linear kernel (selected TFs = TF3, TF11, TF13, TF20, and TF24)
Macro-average75.0 ± 1.262.5 ± 1.881.2 ± 1.10.804 ± 0.005
se-RD78.2 ± 1.265.0 ± 1.884.7 ± 1.00.836 ± 0.008
mo-RD67.2 ± 1.055.0 ± 2.173.3 ± 1.20.702 ± 0.005
CG79.7 ± 1.267.5 ± 1.485.7 ± 1.10.861 ± 0.007
SVM with rbf kernel (selected TFs = TF3, TF11, TF20, TF28, and TF32)
Macro-average78.8 ± 1.468.2 ± 2.184.1 ± 1.30.826 ± 0.006
se-RD79.5 ± 1.368.2 ± 2.285.2 ± 1.60.865 ± 0.008
mo-RD72.1 ± 1.364.3 ± 2.676.0 ± 1.40.729 ± 0.009
CG84.7 ± 1.672.0 ± 1.691.0 ± 0.80.871 ± 0.005
SVM with sigmoid kernel (selected TFs = TF1, TF5, TF22, TF25, and TF31)
Macro-average74.2 ± 1.262.1 ± 1.980.7 ± 1.10.766 ± 0.005
se-RD75.3 ± 1.268.7 ± 1.478.7 ± 1.30.780 ± 0.006
mo-RD68.6 ± 1.145.3 ± 2.280.2 ± 1.20.659 ± 0.005
CG78.9 ± 1.270.2 ± 2.083.2 ± 0.70.844 ± 0.005
DT (selected TFs = TF3, TF11, TF18, TF24, and TF28)
Macro-average78.2 ± 2.367.3 ± 4.083.6 ± 2.20.805 ± 0.012
se-RD80.0 ± 2.572.6 ± 3.683.7 ± 2.00.835 ± 0.025
mo-RD71.6 ± 2.157.4 ± 4.578.6 ± 2.80.716 ± 0.022
CG83.0 ± 2.571.8 ± 3.988.6 ± 1.90.863 ± 0.019
RF (selected TFs = TF3, TF5, TF9, TF25, and TF28)
Macro-average81.2 ± 1.671.7 ± 2.685.9 ± 1.50.871 ± 0.005
se-RD82.8 ± 1.674.2 ± 2.987.2 ± 1.20.901 ± 0.005
mo-RD76.9 ± 1.562.5 ± 2.884.1 ± 1.80.802 ± 0.009
CG83.7 ± 1.778.4 ± 2.186.3 ± 1.40.898 ± 0.005

TF texture feature, DT decision tree, LDA linear discriminant analysis, SVM support vector machine, RF random forest classifier, AUC area under the curve, se-RD severe renal dysfunction (estimated glomerular filtration rate [eGFR] < 30 mL/min/1.73 m2, i.e., CKD stage G4–5), mo-RD moderate renal dysfunction (30 ≤ eGFR < 60 mL/min/1.73 m2, i.e., CKD stage G3a/3b), CG control group (eGFR ≥ 60 mL/min/1.73 m2, i.e., CKD stage G1–2). Feature name codes are as follows: TF1 = 10th percentile, TF3 = energy, TF4 = entropy, TF5 = interquartile range, TF8 = median, TF9 = robust mean absolute deviation, TF11 = total energy, TF13 = difference average, TF18 = joint energy, TF20 = maximum probability, TF22 = dependence non uniformity, TF24 = dependence variance, TF25 = gray level non uniformity (gray-level dependence matrix), TF28 = gray level non uniformity (gray-level run length matrix), TF31 = run entropy, TF32 = run length non uniformity. The data are expressed as means ± standard deviations.

Table 4

Performance of each classification attempt in discriminating between the three groups in T1-weighted opposed-phase imaging (T1WI OP).

Accuracy (%)Sensitivity (%)Specificity (%)AUC
LDA (selected TFs = TF2, TF3, TF13, TF22, and TF24)
Macro-average76.4 ± 0.764.5 ± 1.282.3 ± 0.80.813 ± 0.003
se-RD75.9 ± 0.863.7 ± 0.982.0 ± 0.70.816 ± 0.005
mo-RD70.1 ± 0.749.6 ± 1.880.4 ± 0.70.732 ± 0.006
CG83.1 ± 0.780.2 ± 1.084.5 ± 0.90.879 ± 0.004
SVM with linear kernel (selected TFs = TF3, TF5, TF15, TF22, and TF25)
Macro-average76.2 ± 1.164.3 ± 2.182.1 ± 1.10.782 ± 0.005
se-RD74.3 ± 1.163.2 ± 1.979.8 ± 1.30.777 ± 0.007
mo-RD70.9 ± 0.949.0 ± 2.681.9 ± 1.30.691 ± 0.005
CG83.4 ± 1.280.6 ± 1.784.7 ± 0.80.864 ± 0.008
SVM with rbf kernel (selected TFs = TF2, TF3, TF11, TF16, and TF17)
Macro-average77.2 ± 1.065.8 ± 1.782.9 ± 1.00.766 ± 0.005
se-RD77.6 ± 1.064.6 ± 2.284.1 ± 0.90.775 ± 0.008
mo-RD71.4 ± 0.960.8 ± 1.876.7 ± 1.40.670 ± 0.009
CG82.6 ± 1.172.0 ± 1.187.9 ± 0,60.840 ± 0.007
SVM with sigmoid kernel (selected TFs = TF2, TF3, TF24, TF25, and TF31)
Macro-average76.2 ± 1.267.7 ± 2.083.9 ± 1.10.812 ± 0.006
se-RD74.3 ± 1.171.2 ± 1.682.6 ± 1.10.813 ± 0.008
mo-RD70.9 ± 0.954.6 ± 2.580.7 ± 1.40.720 ± 0.006
CG83.4 ± 1.277.4 ± 1.888.3 ± 0.80.890 ± 0.008
DT (selected TFs = TF3, TF13, TF24, TF26, and TF34)
Macro-average76.6 ± 2.064.8 ± 4.182.4 ± 2.50.792 ± 0.012
se-RD78.3 ± 2.159.9 ± 3.387.4 ± 2.70.804 ± 0.022
mo-RD69.8 ± 1.863.4 ± 5.173.0 ± 2.90.723 ± 0.020
CG81.6 ± 2.171.2 ± 3.886.8 ± 2.00.848 ± 0.018
RF (selected TFs = TF3, TF7, TF10, TF17, and TF38)
Macro-average81.0 ± 1.571.6 ± 2.285.8 ± 1.30.869 ± 0.005
se-RD83.3 ± 1.574.7 ± 2.487.7 ± 1.40.894 ± 0.006
mo-RD75.4 ± 1.367.5 ± 2.679.4 ± 1.40.805 ± 0.009
CG84.4 ± 1.772.5 ± 1.790.3 ± 1.00.895 ± 0.004

TF texture feature, DT decision tree, LDA linear discriminant analysis, SVM support vector machine, RF random forest classifier, AUC area under the curve, se-RD severe renal dysfunction (estimated glomerular filtration rate [eGFR] < 30 mL/min/1.73 m2, i.e., CKD stage G4–5), mo-RD moderate renal dysfunction (30 ≤ eGFR < 60 mL/min/1.73 m2, i.e., CKD stage G3a/3b), CG control group (eGFR ≥ 60 mL/min/1.73 m2, i.e., CKD stage G1–2). Feature name codes are as follows: TF2 = 90th percentile, TF3 = energy, TF5 = interquartile range, TF7 = mean, TF10 = root mean squared, TF11 = total energy, TF13 = difference average, TF15 = id, TF16 = idm, TF17 = inverse variance, TF22 = dependence non uniformity, TF24 = dependence variance, TF25 = gray level non uniformity (gray-level dependence matrix), TF26 = large dependence emphasis, TF31 = run entropy, TF34 = run percentage, TF38 = size zone non uniformity normalized. The data are expressed as means ± standard deviations.

Table 5

Performance of each classification attempt in discriminating between the three groups in T1-weighted water-only imaging (T1WI WO).

Accuracy (%)Sensitivity (%)Specificity (%)AUC
LDA (selected TFs = TF3, TF6, TF12, TF19, and TF24)
Macro-average76.8 ± 0.865.3 ± 1.382.6 ± 0.80.824 ± 0.003
se-RD81.4 ± 0.973.4 ± 1.485.4 ± 0.70.862 ± 0.004
mo-RD71.1 ± 0.855.3 ± 1.479.1 ± 0.90.752 ± 0.006
CG78.0 ± 0.967.1 ± 1.283.4 ± 0.70.844 ± 0.004
SVM with linear kernel (selected TFs = TF5, TF11, TF18, TF24, and TF32)
Macro-average76.7 ± 1.265.0 ± 2.182.5 ± 1.20.834 ± 0.005
se-RD81.9 ± 1.276.5 ± 1.784.5 ± 0.90.887 ± 0.005
mo-RD69.0 ± 1.151.9 ± 2.477.5 ± 1.40.741 ± 0.006
CG79.1 ± 1.266.5 ± 2.285.4 ± 1.10.860 ± 0.006
SVM with rbf kernel (selected TFs = TF3, TF13, TF18, TF32, and TF39)
Macro-average78.9 ± 1.668.4 ± 2.884.2 ± 1.60.832 ± 0.005
se-RD83.3 ± 1.771.7 ± 2.889.1 ± 1.60.881 ± 0.007
mo-RD70.0 ± 1.457.4 ± 3.176.4 ± 1.80.712 ± 0.005
CG83.3 ± 1.776.0 ± 2.687.0 ± 1.30.890 ± 0.005
SVM with sigmoid kernel (selected TFs = TF1, TF13, TF22, TF31, and TF32)
Macro-average76.5 ± 1.664.7 ± 2.682.3 ± 1.50.812 ± 0.006
se-RD79.6 ± 1.770.1 ± 3.384.3 ± 1.10.844 ± 0.007
mo-RD70.1 ± 1.554.0 ± 2.678.1 ± 1.90.724 ± 0.006
CG79.7 ± 1.770.0 ± 1.884.6 ± 1.40.853 ± 0.007
DT (selected TFs = TF3, TF12, TF24, TF29, and TF30)
Macro-average81.3 ± 1.771.9 ± 3.086.0 ± 1.30.818 ± 0.012
se-RD83.9 ± 1.767.3 ± 3.092.1 ± 2.00.853 ± 0.017
mo-RD75.1 ± 1.574.0 ± 3.475.6 ± 1.80.743 ± 0.021
CG85.0 ± 1.974.5 ± 2.690.2 ± 1.20.855 ± 0.020
RF (selected TFs = TF11, TF14, TF16, TF29, and TF32)
Macro-average82.0 ± 1.673.0 ± 2.686.5 ± 1.40.884 ± 0.005
se-RD86.7 ± 1.778.4 ± 2.390.9 ± 1.30.924 ± 0.006
mo-RD75.6 ± 1.463.0 ± 3.381.9 ± 1.60.809 ± 0.009
CG83.6 ± 1.777.5 ± 2.586.7 ± 1.40.907 ± 0.005

TF texture feature, DT decision tree, LDA linear discriminant analysis, SVM support vector machine, RF random forest classifier, AUC area under the curve, se-RD severe renal dysfunction (estimated glomerular filtration rate [eGFR] < 30 mL/min/1.73 m2, i.e., CKD stage G4–5), mo-RD moderate renal dysfunction (30 ≤ eGFR < 60 mL/min/1.73 m2, i.e., CKD stage G3a/3b), CG control group (eGFR ≥ 60 mL/min/1.73 m2, i.e., CKD stage G1–2). Feature name codes are as follows: TF1 = 10th percentile, TF3 = energy, TF5 = interquartile range, TF6 = mean absolute deviation, TF11 = total energy, TF12 = uniformity, TF13 = difference average, TF14 = difference entropy, TF16 = idm, TF18 = joint energy, TF19 = joint entropy, TF22 = dependence non uniformity, TF24 = dependence variance, TF29 = gray level non uniformity normalized, TF30 = long run emphasis, TF31 = run entropy, TF32 = run length non uniformity, TF39 = small area emphasis. The data are expressed as means ± standard deviations.

Table 6

Performance of each classification attempt in discriminating between the three groups in ADC map imaging.

Accuracy (%)Sensitivity (%)Specificity (%)AUC
LDA (selected TFs = TF3, TF13, TF15, TF27, and TF36)
Macro-average70.6 ± 1.055.9 ± 1.777.9 ± 1.10.748 ± 0.004
se-RD72.7 ± 1.063.7 ± 1.677.2 ± 1.10.785 ± 0.006
mo-RD62.6 ± 0.934.9 ± 2.376.5 ± 1.20.619 ± 0.008
CG76.4 ± 1.169.0 ± 1.380.1 ± 0.90.828 ± 0.004
SVM with linear kernel (selected TFs = TF3, TF6, TF16, TF29, and TF37)
Macro-average69.9 ± 1.354.8 ± 2.177.4 ± 1.30.736 ± 0.006
se-RD69.0 ± 1.260.7 ± 2.473.2 ± 1.50.781 ± 0.007
mo-RD61.2 ± 1.132.6 ± 2.475.5 ± 1.60.573 ± 0.006
CG79.3 ± 1.471.1 ± 1.683.5 ± 0.70.842 ± 0.008
SVM with rbf kernel (selected TFs = TF3, TF15, TF30, TF38, and TF39)
Macro-average72.1 ± 1.658.1 ± 2.979.3 ± 1.90.757 ± 0.007
se-RD74.6 ± 1.664.0 ± 2.980.0 ± 1.70.803 ± 0.009
mo-RD65.0 ± 1.545.0 ± 3.475.0 ± 2.20.633 ± 0.008
CG76.6 ± 1.765.3 ± 2.382.2 ± 1.70.823 ± 0.010
SVM with sigmoid kernel (selected TFs = TF3, TF11, TF25, TF28, and TF39)
Macro-average69.2 ± 1.653.8 ± 3.576.9 ± 2.30.696 ± 0.006
se-RD67.8 ± 1.571.8 ± 4.865.8 ± 2.10.739 ± 0.006
mo-RD63.5 ± 1.414.4 ± 3.888.0 ± 3.20.529 ± 0.006
CG76.4 ± 1.875.3 ± 1.977.0 ± 1.50.808 ± 0.010
DT (selected TFs = TF13, TF14, TF25, TF33, and TF36)
Macro-average70.0 ± 2.555.0 ± 4.577.5 ± 2.70.713 ± 0.014
se-RD74.2 ± 2.772.7 ± 3.975.0 ± 2.60.791 ± 0.020
mo-RD62.2 ± 2.333.8 ± 4.676.4 ± 3.10.574 ± 0.026
CG73.5 ± 2.658.4 ± 5.081.1 ± 2.30.773 ± 0.021
RF (selected TFs = TF4, TF11, TF15, TF16, and TF25)
Macro-average75.0 ± 1.562.4 ± 2.781.2 ± 1.50.808 ± 0.005
se-RD75.4 ± 1.573.0 ± 2.776.6 ± 1.60.843 ± 0.008
mo-RD68.8 ± 1.339.3 ± 3.483.5 ± 1.60.699 ± 0.010
CG80.7 ± 1.675.0 ± 2.083.5 ± 1.30.870 ± 0.006

TF texture feature, DT decision tree, LDA linear discriminant analysis, SVM support vector machine, RF random forest classifier, AUC area under the curve, se-RD severe renal dysfunction (estimated glomerular filtration rate; eGFR < 30 mL/min/1.73 m2, i.e., CKD stage G4–5), mo-RD moderate renal dysfunction (30 ≤ eGFR < 60 mL/min/1.73 m2, i.e., CKD stage G3a/3b), CG control group (eGFR ≥ 60 mL/min/1.73 m2, i.e., CKD stage G1–2). Feature name codes are as follows: TF3 = energy, TF4 = entropy, TF6 = mean absolute deviation, TF11 = total energy, TF13 = difference average, TF14 = difference entropy, TF15 = id, TF16 = idm, TF25 = gray level non uniformity (gray-level dependence matrix), TF27 = small dependence emphasis, TF28 = gray level non uniformity (gray-level run length matrix), TF29 = gray level non uniformity normalized, TF30 = long run emphasis, TF33 = run length non uniformity normalized, TF36 = short run emphasis, TF37 = gray level non uniformity normalized, TF38 = size zone non uniformity normalized, TF39 = small area emphasis. The data are expressed as means ± standard deviations.

Table 7

Performance of each classification attempt in discriminating between the three groups in T2* map imaging.

Accuracy (%)Sensitivity (%)Specificity (%)AUC
LDA (selected TFs = TF2, TF11, TF22, TF24, and TF38)
Macro-average73.2 ± 0.759.9 ± 1.380.0 ± 1.10.737 ± 0.004
se-RD71.4 ± 0.761.0 ± 1.676.7 ± 1.20.740 ± 0.006
mo-RD71.0 ± 0.750.7 ± 1.581.2 ± 1.20.667 ± 0.007
CG77.3 ± 0.867.9 ± 0.982.0 ± 0.80.792 ± 0.004
SVM with linear kernel (selected TFs = TF2, TF3, TF10, TF11, and TF35)
Macro-average67.1 ± 1.350.7 ± 2.475.4 ± 1.70.694 ± 0.006
se-RD62.5 ± 1.263.5 ± 3.062.0 ± 2.00.689 ± 0.006
mo-RD63.0 ± 1.223.4 ± 3.382.9 ± 2.00.578 ± 0.007
CG75.9 ± 1.565.1 ± 1.081.2 ± 1.00.802 ± 0.010
SVM with rbf kernel (selected TFs = TF3, TF6, TF11, TF13, and TF22)
Macro-average71.9 ± 1.757.8 ± 3.278.9 ± 2.10.739 ± 0.007
se-RD72.5 ± 1.656.0 ± 3.680.7 ± 2.40.751 ± 0.010
mo-RD63.9 ± 1.654.8 ± 4.268.4 ± 2.00.642 ± 0.012
CG79.2 ± 1.962.5 ± 1.787.6 ± 1.80.811 ± 0.007
SVM with sigmoid kernel (selected TFs = TF3, TF5, TF7, TF11, and TF26)
Macro-average69.8 ± 1.754.7 ± 3.077.4 ± 1.90.729 ± 0.007
se-RD68.1 ± 1.757.5 ± 3.173.4 ± 2.40.747 ± 0.008
mo-RD62.4 ± 1.539.3 ± 3.974.0 ± 1.90.590 ± 0.006
CG78.9 ± 1.967.3 ± 2.084.7 ± 1.40.837 ± 0.009
DT (selected TFs = TF3, TF13, TF25, TF32, and TF40)
Macro-average69.8 ± 2.454.7 ± 4.977.3 ± 3.20.721 ± 0.014
se-RD69.9 ± 2.453.0 ± 5.078.4 ± 3.50.743 ± 0.023
mo-RD61.8 ± 2.246.0 ± 6.269.8 ± 3.40.620 ± 0.024
CG77.6 ± 2.765.0 ± 3.483.8 ± 2.80.798 ± 0.018
RF (selected TFs = TF6, TF10, TF11, TF13, and TF22)
Macro-average74.9 ± 2.162.3 ± 3.581.1 ± 1.90.821 ± 0.006
se-RD76.0 ± 2.063.6 ± 3.682.2 ± 1.90.832 ± 0.009
mo-RD67.5 ± 1.851.2 ± 4.175.6 ± 2.30.725 ± 0.014
CG81.1 ± 2.372.1 ± 2.885.6 ± 1.40.895 ± 0.006

TF texture feature, DT decision tree, LDA linear discriminant analysis, SVM support vector machine, RF random forest classifier, AUC area under the curve, se-RD severe renal dysfunction (estimated glomerular filtration rate [eGFR] < 30 mL/min/1.73 m2, i.e., CKD stage G4–5), mo-RD moderate renal dysfunction (30 ≤ eGFR < 60 mL/min/1.73 m2, i.e., CKD stage G3a/3b), CG control group (eGFR ≥ 60 mL/min/1.73 m2, i.e., CKD stage G1–2). Feature name codes are as follows: TF2 = 90th percentile, TF3 = energy, TF5 = interquartile range, TF6 = mean absolute deviation, TF7 = mean, TF10 = root mean squared, TF11 = total energy, TF13 = difference average, TF22 = dependence non uniformity, TF24 = dependence variance, TF25 = gray level non uniformity (gray-level dependence matrix), TF26 = large dependence emphasis, TF32 = run length non uniformity, TF35 = run variance, TF38 = size zone non uniformity normalized, TF40 = zone percentage. The data are expressed as means ± standard deviations.

Performance of each classification attempt in discriminating between the three groups in T1-weighted in-phase imaging (T1WI IP). TF texture feature, DT decision tree, LDA linear discriminant analysis, SVM support vector machine, RF random forest classifier, AUC area under the curve, se-RD severe renal dysfunction (estimated glomerular filtration rate [eGFR] < 30 mL/min/1.73 m2, i.e., CKD stage G4–5), mo-RD moderate renal dysfunction (30 ≤ eGFR < 60 mL/min/1.73 m2, i.e., CKD stage G3a/3b), CG control group (eGFR ≥ 60 mL/min/1.73 m2, i.e., CKD stage G1–2). Feature name codes are as follows: TF1 = 10th percentile, TF3 = energy, TF4 = entropy, TF5 = interquartile range, TF8 = median, TF9 = robust mean absolute deviation, TF11 = total energy, TF13 = difference average, TF18 = joint energy, TF20 = maximum probability, TF22 = dependence non uniformity, TF24 = dependence variance, TF25 = gray level non uniformity (gray-level dependence matrix), TF28 = gray level non uniformity (gray-level run length matrix), TF31 = run entropy, TF32 = run length non uniformity. The data are expressed as means ± standard deviations. Performance of each classification attempt in discriminating between the three groups in T1-weighted opposed-phase imaging (T1WI OP). TF texture feature, DT decision tree, LDA linear discriminant analysis, SVM support vector machine, RF random forest classifier, AUC area under the curve, se-RD severe renal dysfunction (estimated glomerular filtration rate [eGFR] < 30 mL/min/1.73 m2, i.e., CKD stage G4–5), mo-RD moderate renal dysfunction (30 ≤ eGFR < 60 mL/min/1.73 m2, i.e., CKD stage G3a/3b), CG control group (eGFR ≥ 60 mL/min/1.73 m2, i.e., CKD stage G1–2). Feature name codes are as follows: TF2 = 90th percentile, TF3 = energy, TF5 = interquartile range, TF7 = mean, TF10 = root mean squared, TF11 = total energy, TF13 = difference average, TF15 = id, TF16 = idm, TF17 = inverse variance, TF22 = dependence non uniformity, TF24 = dependence variance, TF25 = gray level non uniformity (gray-level dependence matrix), TF26 = large dependence emphasis, TF31 = run entropy, TF34 = run percentage, TF38 = size zone non uniformity normalized. The data are expressed as means ± standard deviations. Performance of each classification attempt in discriminating between the three groups in T1-weighted water-only imaging (T1WI WO). TF texture feature, DT decision tree, LDA linear discriminant analysis, SVM support vector machine, RF random forest classifier, AUC area under the curve, se-RD severe renal dysfunction (estimated glomerular filtration rate [eGFR] < 30 mL/min/1.73 m2, i.e., CKD stage G4–5), mo-RD moderate renal dysfunction (30 ≤ eGFR < 60 mL/min/1.73 m2, i.e., CKD stage G3a/3b), CG control group (eGFR ≥ 60 mL/min/1.73 m2, i.e., CKD stage G1–2). Feature name codes are as follows: TF1 = 10th percentile, TF3 = energy, TF5 = interquartile range, TF6 = mean absolute deviation, TF11 = total energy, TF12 = uniformity, TF13 = difference average, TF14 = difference entropy, TF16 = idm, TF18 = joint energy, TF19 = joint entropy, TF22 = dependence non uniformity, TF24 = dependence variance, TF29 = gray level non uniformity normalized, TF30 = long run emphasis, TF31 = run entropy, TF32 = run length non uniformity, TF39 = small area emphasis. The data are expressed as means ± standard deviations. Performance of each classification attempt in discriminating between the three groups in ADC map imaging. TF texture feature, DT decision tree, LDA linear discriminant analysis, SVM support vector machine, RF random forest classifier, AUC area under the curve, se-RD severe renal dysfunction (estimated glomerular filtration rate; eGFR < 30 mL/min/1.73 m2, i.e., CKD stage G4–5), mo-RD moderate renal dysfunction (30 ≤ eGFR < 60 mL/min/1.73 m2, i.e., CKD stage G3a/3b), CG control group (eGFR ≥ 60 mL/min/1.73 m2, i.e., CKD stage G1–2). Feature name codes are as follows: TF3 = energy, TF4 = entropy, TF6 = mean absolute deviation, TF11 = total energy, TF13 = difference average, TF14 = difference entropy, TF15 = id, TF16 = idm, TF25 = gray level non uniformity (gray-level dependence matrix), TF27 = small dependence emphasis, TF28 = gray level non uniformity (gray-level run length matrix), TF29 = gray level non uniformity normalized, TF30 = long run emphasis, TF33 = run length non uniformity normalized, TF36 = short run emphasis, TF37 = gray level non uniformity normalized, TF38 = size zone non uniformity normalized, TF39 = small area emphasis. The data are expressed as means ± standard deviations. Performance of each classification attempt in discriminating between the three groups in T2* map imaging. TF texture feature, DT decision tree, LDA linear discriminant analysis, SVM support vector machine, RF random forest classifier, AUC area under the curve, se-RD severe renal dysfunction (estimated glomerular filtration rate [eGFR] < 30 mL/min/1.73 m2, i.e., CKD stage G4–5), mo-RD moderate renal dysfunction (30 ≤ eGFR < 60 mL/min/1.73 m2, i.e., CKD stage G3a/3b), CG control group (eGFR ≥ 60 mL/min/1.73 m2, i.e., CKD stage G1–2). Feature name codes are as follows: TF2 = 90th percentile, TF3 = energy, TF5 = interquartile range, TF6 = mean absolute deviation, TF7 = mean, TF10 = root mean squared, TF11 = total energy, TF13 = difference average, TF22 = dependence non uniformity, TF24 = dependence variance, TF25 = gray level non uniformity (gray-level dependence matrix), TF26 = large dependence emphasis, TF32 = run length non uniformity, TF35 = run variance, TF38 = size zone non uniformity normalized, TF40 = zone percentage. The data are expressed as means ± standard deviations. Concurrently, cross-correlation analyses were conducted between the eGFR and the 40 texture features. The highest correlation coefficients were observed for two texture features (total energy [0.55, p < 0.001] and energy [0.55, p < 0.001]) in T1-weighted WO images. Table 8 shows the relationship between the eGFR and the 40 selected texture features in each imaging modality.
Table 8

The cross-correlation analyses between the eGFR and the 40 texture features derived from each imaging method.

CodeFeature name codeImaging method
T1 IPT1 OPT1 WOADC mapT2* map
PCCp-valuePCCp-valuePCCp-valuePCCp-valuePCCp-value
TF110th percentile0.325< 0.0010.255< 0.001− 0.0180.8170.1350.083− 0.0030.970
TF290th percentile0.355< 0.0010.2050.0080.0850.2720.0130.873− 0.0070.926
TF3Energy0.504< 0.0010.474< 0.0010.560< 0.0010.429< 0.0010.308< 0.001
TF4Entropy− 0.0040.955− 0.0780.3140.1720.027− 0.2070.0080.0090.909
TF5Interquartile range0.0130.868− 0.0450.5630.0860.268− 0.2150.0050.0150.841
TF6Mean absolute deviation− 0.0110.888− 0.0830.2830.1030.186− 0.2000.010− 0.0560.469
TF7Mean0.349< 0.0010.2360.0020.0420.5840.0760.326− 0.0060.937
TF8Median0.346< 0.0010.2080.0070.0260.7380.0860.2680.0290.706
TF9Robust mean absolute deviation0.0050.948− 0.0500.5200.0850.271− 0.2070.0070.0080.915
TF10Root mean squared0.350< 0.0010.2320.0030.0450.5580.0740.343− 0.0290.704
TF11Total energy0.503< 0.0010.471< 0.0010.558< 0.0010.0720.3530.413< 0.001
TF12Uniformity− 0.0420.5910.0210.786− 0.2070.0070.2510.001− 0.0650.400
TF13Difference average− 0.256< 0.001− 0.343< 0.001− 0.1660.032− 0.2100.007− 0.1010.194
TF14Difference entropy− 0.271< 0.001− 0.348< 0.001− 0.1590.040− 0.2120.006− 0.0910.243
TF15Id0.2360.0020.313< 0.0010.1450.0630.260< 0.0010.0870.264
TF16Idm0.2340.0020.305< 0.0010.1410.0700.263< 0.0010.0890.255
TF17Inverse variance0.255< 0.0010.308< 0.0010.1430.0670.2130.0060.0560.469
TF18Joint energy0.0340.6670.0930.230− 0.1290.0980.281< 0.001− 0.0380.626
TF19Joint entropy− 0.0730.351− 0.1510.0530.0920.237− 0.2170.005− 0.0090.910
TF20Maximum probability− 0.0390.612− 0.0050.945− 0.1490.0550.284< 0.001− 0.0580.455
TF21Sum entropy0.0470.542− 0.0230.7670.2050.008− 0.1970.0110.0290.709
TF22Dependence non uniformity0.341< 0.0010.259< 0.0010.429< 0.0010.2250.0040.2000.010
TF23Dependence non uniformity normalized− 0.2160.005− 0.328< 0.001− 0.1890.015− 0.263< 0.001− 0.0530.496
TF24Dependence variance0.1720.0270.270< 0.0010.1780.0220.267< 0.0010.0650.403
TF25Gray level non uniformity0.311< 0.0010.300< 0.0010.1480.0560.402< 0.0010.2240.004
TF26Large dependence emphasis0.2340.0020.298< 0.0010.1630.0360.293< 0.0010.1120.152
TF27Small dependence emphasis− 0.283< 0.001− 0.343< 0.001− 0.1790.021− 0.273< 0.001− 0.1220.117
TF28Gray level non uniformity0.314< 0.0010.302< 0.0010.1640.0350.396< 0.0010.324< 0.001
TF29Gray level non uniformity normalized− 0.0270.7270.0280.714− 0.2000.0100.2270.003− 0.0310.691
TF30Long run emphasis0.2350.0020.305< 0.0010.1610.0390.288< 0.0010.0880.260
TF31Run entropy0.0950.2230.0210.7870.263< 0.001− 0.0560.4720.1300.094
TF32Run length non uniformity0.347< 0.0010.341< 0.0010.459< 0.0010.2310.0030.2040.008
TF33Run length non uniformity normalized− 0.259< 0.001− 0.316< 0.001− 0.1640.035− 0.298< 0.001− 0.1330.088
TF34Run percentage− 0.2500.001− 0.315< 0.001− 0.1680.030− 0.293< 0.001− 0.1210.120
TF35Run variance0.2210.0040.302< 0.0010.1650.0330.285< 0.0010.0700.369
TF36Short run emphasis− 0.257< 0.001− 0.312< 0.001− 0.1580.041− 0.299< 0.001− 0.1360.081
TF37Gray level non uniformity normalized0.0180.8120.0540.489− 0.1730.0260.1400.0720.0270.730
TF38Size zone non uniformity normalized− 0.326< 0.001− 0.361< 0.001− 0.1840.017− 0.2460.001− 0.1460.061
TF39Small area emphasis− 0.332< 0.001− 0.359< 0.001− 0.1790.021− 0.2460.001− 0.1560.044
TF40Zone percentage− 0.273< 0.001− 0.334< 0.001− 0.1720.027− 0.278< 0.001− 0.1330.087

ADC apparent diffusion coefficient, eGFR estimated glomerular filtration rate, GLCM gray-level co-occurrence matrix, GLDM gray-level dependence matrix, GLRLM gray-level run length matrix, GLSZM gray-level size zone matrix, IP in-phase, OP opposed-phase, PCC Pearson's Correlation Coefficient, TF texture feature, WO water-only.

The cross-correlation analyses between the eGFR and the 40 texture features derived from each imaging method. ADC apparent diffusion coefficient, eGFR estimated glomerular filtration rate, GLCM gray-level co-occurrence matrix, GLDM gray-level dependence matrix, GLRLM gray-level run length matrix, GLSZM gray-level size zone matrix, IP in-phase, OP opposed-phase, PCC Pearson's Correlation Coefficient, TF texture feature, WO water-only.

Classification and validation

Receiver-operating characteristic (ROC) curve analyses were performed to compare the capacity of TA quantified from each imaging method to differentiate the three groups of CKD. Overall, the TA based on T1-weighted IP/OP/WO images provided better classification performance than that based on the ADC and T2* maps. Among the five imaging methods, the T1-weighted WO images obtained the highest classification scores: an accuracy of 76.5–82.0% and a macro-average area under the curve (AUC) of 0.812–0.884. Among the six classifiers we studied, the most favorable performance was observed in the random forest (RF) classifier. As for the support vector machine (SVM) classifier, the favorable results were obtained in the SVM with rbf kernel, whereas the results were poor in the SVM with sigmoid kernel. The results of all classification attempts are summarized in Tables 3, 4, 5, 6 and 7. Figures 1a and 2a present the ROC curve and the confusion matrix of the representative model (T1-weighted WO image with RF classifier). The confusion matrices and ROC curves of all classification attempts are summarized online in Supplementary Figs. S1–S5 and S8–S12, respectively.
Figure 1

The receiver operating characteristic (ROC) curves and area under the curve (AUC) values of representative classification models using T1-weighted water-only images with a random forest classifier (A) and all T1-weighted images using a support vector machine with rbf kernel classifier (B) in classifying the three groups of chronic kidney disease. Severe renal dysfunction group (se-RD, estimated glomerular filtration rate [eGFR] < 30 mL/min/1.73 m2), moderate renal dysfunction group (mo-RD, 30 ≤ eGFR < 60 mL/min/1.73 m2), and control group (CG, eGFR ≥ 60 mL/min/1.73 m2). The AUC values are expressed as means.

Figure 2

Confusion matrices show the status of representative classification models using T1-weighted water-only images with a random forest classifier (A) and all T1-weighted images using a support vector machine with rbf kernel classifier (B) in classifying the three groups of chronic kidney disease. Severe renal dysfunction group (se-RD, estimated glomerular filtration rate [eGFR] < 30 mL/min/1.73 m2), moderate renal dysfunction group (mo-RD, 30 ≤ eGFR < 60 mL/min/1.73 m2), and control group (CG, eGFR ≥ 60 mL/min/1.73 m2). The data are expressed as means ± standard deviations.

The receiver operating characteristic (ROC) curves and area under the curve (AUC) values of representative classification models using T1-weighted water-only images with a random forest classifier (A) and all T1-weighted images using a support vector machine with rbf kernel classifier (B) in classifying the three groups of chronic kidney disease. Severe renal dysfunction group (se-RD, estimated glomerular filtration rate [eGFR] < 30 mL/min/1.73 m2), moderate renal dysfunction group (mo-RD, 30 ≤ eGFR < 60 mL/min/1.73 m2), and control group (CG, eGFR ≥ 60 mL/min/1.73 m2). The AUC values are expressed as means. Confusion matrices show the status of representative classification models using T1-weighted water-only images with a random forest classifier (A) and all T1-weighted images using a support vector machine with rbf kernel classifier (B) in classifying the three groups of chronic kidney disease. Severe renal dysfunction group (se-RD, estimated glomerular filtration rate [eGFR] < 30 mL/min/1.73 m2), moderate renal dysfunction group (mo-RD, 30 ≤ eGFR < 60 mL/min/1.73 m2), and control group (CG, eGFR ≥ 60 mL/min/1.73 m2). The data are expressed as means ± standard deviations.

Combination models

The combination models derived from T1-weighted IP/OP/WO images (ALL T1WIs) and those derived from all imaging methods (ALL IMs) were evaluated. The selected texture features are listed in Tables 9 and 10. The best classification performance was observed in ALL T1WIs using the SVM with rbf kernel classifier: an accuracy of 82.8% and a macro-average AUC of 0.887. The results of all classification attempts are summarized in Tables 9 and 10. Figures 1b and 2b present the ROC curve and the confusion matrix of the representative model (ALL T1WIs using the SVM with rbf kernel classifier). The confusion matrices and ROC curves of all classification attempts are summarized online in Supplementary Figs. S6,S7 and S13,S14, respectively.
Table 9

Performance of each classification attempt in discriminating between the three groups in all T1-weighted imaging methods (ALL T1WIs).

Accuracy (%)Sensitivity (%)Specificity (%)AUC
LDA (selected TFs = TF3 and TF19 derived from T1WI IP, TF3 from T1WI OP, and TF12 and TF31 from T1WI WO)
Macro-average77.6 ± 0.966.5 ± 1.483.2 ± 0.80.844 ± 0.003
se-RD81.0 ± 1.073.1 ± 1.684.9 ± 0.60.882 ± 0.003
mo-RD68.7 ± 0.854.8 ± 1.575.6 ± 1.00.739 ± 0.006
CG83.1 ± 1.071.1 ± 1.289.1 ± 0.80.897 ± 0.003
SVM with linear kernel (selected TFs = TF3 and TF37 derived from T1WI IP, TF10 from T1WI OP, and TF3 and TF20 from T1WI WO)
Macro-average81.5 ± 0.872.2 ± 1.486.1 ± 0.80.860 ± 0.004
se-RD84.2 ± 0.873.4 ± 0.989.6 ± 0.90.878 ± 0.004
mo-RD76.0 ± 0.767.3 ± 2.080.3 ± 0.80.794 ± 0.005
CG84.3 ± 0.975.9 ± 1.388.5 ± 0.70.894 ± 0.004
SVM with rbf kernel (selected TFs = TF3 and TF18 derived from T1WI IP, and TF18, TF20, and TF28 from T1WI WO)
Macro-average82.8 ± 1.574.2 ± 2.487.1 ± 1.30.887 ± 0.006
se-RD85.0 ± 1.674.5 ± 2.090.3 ± 1.30.925 ± 0.006
mo-RD75.2 ± 1.367.3 ± 2.979.2 ± 1.40.782 ± 0.005
CG88.1 ± 1.680.7 ± 2.491.8 ± 1.10.940 ± 0.006
SVM with sigmoid kernel (selected TFs = TF25 derived from T1WI IP, TF3 and TF18 from T1WI OP, and TF4 and TF8 from T1WI WO)
Macro-average77.3 ± 1.366.0 ± 2.483.0 ± 1.20.794 ± 0.006
se-RD78.3 ± 1.358.7 ± 2.788.0 ± 1.20.797 ± 0.006
mo-RD69.5 ± 1.155.4 ± 2.876.5 ± 1.50.696 ± 0.008
CG84.3 ± 1.683.9 ± 1.684.5 ± 1.00.876 ± 0.008
DT (selected TFs = TF3 and TF24 derived from T1WI IP, and TF28, TF31, and TF37 from T1WI OP)
Macro-average75.3 ± 2.063.0 ± 4.081.5 ± 2.40.783 ± 0.013
se-RD75.7 ± 1.965.3 ± 4.281.0 ± 2.70.816 ± 0.021
mo-RD68.1 ± 1.854.3 ± 4.974.9 ± 2.80.688 ± 0.022
CG82.2 ± 2.269.4 ± 3.088.6 ± 1.70.842 ± 0.017
RF (selected TFs = TF3 derived from T1WI IP, TF3, TF13 and TF14 from T1WI OP, and TF2 from T1WI WO)
Macro-average80.5 ± 1.670.8 ± 2.785.4 ± 1.40.874 ± 0.004
se-RD82.0 ± 1.770.0 ± 2.688.0 ± 1.40.898 ± 0.004
mo-RD74.3 ± 1.464.1 ± 3.379.4 ± 1.50.797 ± 0.008
CG85.3 ± 1.878.3 ± 2.188.7 ± 1.20.916 ± 0.003

AUC area under the curve, IP in-phase, OP opposed-phase, T1WI T1-weighted imaging, TF texture feature, WO water-only, DT decision tree, LDA linear discriminant analysis, SVM support vector machine, RF random forest classifier, se-RD severe renal dysfunction (estimated glomerular filtration rate [eGFR] < 30 mL/min/1.73 m2, i.e., CKD stage G4–5), mo-RD moderate renal dysfunction (30 ≤ eGFR < 60 mL/min/1.73 m2, i.e., CKD stage G3a/3b), CG control group (eGFR ≥ 60 mL/min/1.73 m2, i.e., CKD stage G1–2). Feature name codes are as follows: TF2 = 90th percentile, TF3 = energy, TF4 = entropy, TF8 = median, TF10 = root mean squared, TF12 = uniformity, TF13 = difference average, TF14 = difference entropy, TF18 = joint energy, TF19 = joint entropy, TF20 = maximum probability, TF24 = dependence variance, TF25 = gray level non uniformity (gray-level dependence matrix), TF28 = gray level non uniformity (gray-level run length matrix), TF31 = run entropy, TF37 = gray level non uniformity normalized. The data are expressed as means ± standard deviations.

Table 10

Performance of each classification attempt in discriminating between the three groups in all imaging methods (ALL IMs).

Accuracy (%)Sensitivity (%)Specificity (%)AUC
LDA (selected TFs = TF3 and TF19 derived from T1WI IP, TF3 from T1WI OP, and TF12 and TF31 from T1WI WO)
Macro-average77.5 ± 0.866.2 ± 1.383.1 ± 0.70.832 ± 0.003
se-RD78.6 ± 0.867.4 ± 1.384.2 ± 0.70.850 ± 0.003
mo-RD70.1 ± 0.754.1 ± 1.378.2 ± 0.90.744 ± 0.007
CG83.7 ± 0.877.2 ± 1.387.0 ± 0.60.890 ± 0.002
SVM with linear kernel (selected TFs = TF3 and TF37 derived from T1WI IP, TF10 from T1WI OP, and TF3 and TF20 from T1WI WO)
Macro-average81.9 ± 0.972.9 ± 1.786.5 ± 0.90.863 ± 0.003
se-RD84.0 ± 1.079.9 ± 1.986.1 ± 0.90.885 ± 0.004
mo-RD75.6 ± 0.858.9 ± 2.084.0 ± 0.90.781 ± 0.007
CG86.2 ± 1.179.9 ± 1.189.3 ± 0.90.910 ± 0.004
SVM with rbf kernel (selected TFs = TF3 and TF18 derived from T1WI IP, TF18 and TF28 from T1WI WO, and TF20 from T2* map)
Macro-average81.6 ± 1.572.3 ± 2.986.2 ± 1.50.890 ± 0.005
se-RD81.2 ± 1.571.7 ± 3.185.9 ± 1.70.911 ± 0.006
mo-RD74.1 ± 1.360.0 ± 3.481.1 ± 1.80.798 ± 0.009
CG89.4 ± 1.785.2 ± 2.291.5 ± 1.10.949 ± 0.006
SVM with sigmoid kernel (selected TFs = TF25 derived from T1WI IP, TF7 and TF11 from T1WI OP, TF21 from T1WI WO, and TF35 from T2* map)
Macro-average76.5 ± 1.464.7 ± 2.782.3 ± 1.50.822 ± 0.005
se-RD77.4 ± 1.460.2 ± 2.686.0 ± 1.80.834 ± 0.005
mo-RD67.9 ± 1.255.7 ± 3.673.9 ± 1.50.724 ± 0.008
CG84.2 ± 1.678.2 ± 1.887.1 ± 1.20.897 ± 0.005
DT (selected TFs = TF3 and TF13 derived from T1WI IP, TF40 from T1WI OP, TF25 from T1WI WO, TF1 from T2* map)
Macro-average78.1 ± 1.967.1 ± 3.983.5 ± 2.10.806 ± 0.014
se-RD81.7 ± 2.074.2 ± 3.785.5 ± 2.30.856 ± 0.020
mo-RD69.9 ± 1.654.6 ± 4.877.5 ± 2.20.696 ± 0.024
CG82.6 ± 2.172.5 ± 3.387.6 ± 1.90.864 ± 0.016
RF (selected TFs = TF3 derived from T1WI IP, TF3 from T1WI OP, TF18 from T2* map, and TF19 and TF40 from ADC map)
Macro-average81.3 ± 1.272.0 ± 2.286.0 ± 1.10.865 ± 0.004
se-RD83.8 ± 1.270.9 ± 2.490.3 ± 1.10.866 ± 0.005
mo-RD75.8 ± 1.163.6 ± 2.382.0 ± 1.30.815 ± 0.009
CG84.4 ± 1.381.6 ± 2.085.8 ± 0.90.903 ± 0.004

ADC apparent diffusion coefficient, AUC area under the curve, IP in-phase, OP opposed-phase, T1WI T1-weighted imaging, TF texture feature, WO water-only, DT decision tree, LDA linear discriminant analysis, SVM support vector machine, RF random forest classifier, se-RD severe renal dysfunction (estimated glomerular filtration rate [eGFR] < 30 mL/min/1.73 m2, i.e., CKD stage G4–5), mo-RD moderate renal dysfunction (30 ≤ eGFR < 60 mL/min/1.73 m2, i.e., CKD stage G3a/3b), CG control group (eGFR ≥ 60 mL/min/1.73 m2, i.e., CKD stage G1–2). Feature name codes are as follows: TF1 = 10th percentile, TF3 = energy, TF7 = mean, TF10 = root mean squared, TF11 = total energy, TF12 = uniformity, TF13 = difference average, TF18 = joint energy, TF19 = joint entropy, TF20 = maximum probability, TF21 = sum entropy, TF25 = gray level non uniformity (gray-level dependence matrix), TF28 = gray level non uniformity (gray-level run length matrix), TF31 = run entropy, TF35 = run variance, TF37 = gray level non uniformity normalized, TF40 = zone percentage. The data are expressed as means ± standard deviations.

Performance of each classification attempt in discriminating between the three groups in all T1-weighted imaging methods (ALL T1WIs). AUC area under the curve, IP in-phase, OP opposed-phase, T1WI T1-weighted imaging, TF texture feature, WO water-only, DT decision tree, LDA linear discriminant analysis, SVM support vector machine, RF random forest classifier, se-RD severe renal dysfunction (estimated glomerular filtration rate [eGFR] < 30 mL/min/1.73 m2, i.e., CKD stage G4–5), mo-RD moderate renal dysfunction (30 ≤ eGFR < 60 mL/min/1.73 m2, i.e., CKD stage G3a/3b), CG control group (eGFR ≥ 60 mL/min/1.73 m2, i.e., CKD stage G1–2). Feature name codes are as follows: TF2 = 90th percentile, TF3 = energy, TF4 = entropy, TF8 = median, TF10 = root mean squared, TF12 = uniformity, TF13 = difference average, TF14 = difference entropy, TF18 = joint energy, TF19 = joint entropy, TF20 = maximum probability, TF24 = dependence variance, TF25 = gray level non uniformity (gray-level dependence matrix), TF28 = gray level non uniformity (gray-level run length matrix), TF31 = run entropy, TF37 = gray level non uniformity normalized. The data are expressed as means ± standard deviations. Performance of each classification attempt in discriminating between the three groups in all imaging methods (ALL IMs). ADC apparent diffusion coefficient, AUC area under the curve, IP in-phase, OP opposed-phase, T1WI T1-weighted imaging, TF texture feature, WO water-only, DT decision tree, LDA linear discriminant analysis, SVM support vector machine, RF random forest classifier, se-RD severe renal dysfunction (estimated glomerular filtration rate [eGFR] < 30 mL/min/1.73 m2, i.e., CKD stage G4–5), mo-RD moderate renal dysfunction (30 ≤ eGFR < 60 mL/min/1.73 m2, i.e., CKD stage G3a/3b), CG control group (eGFR ≥ 60 mL/min/1.73 m2, i.e., CKD stage G1–2). Feature name codes are as follows: TF1 = 10th percentile, TF3 = energy, TF7 = mean, TF10 = root mean squared, TF11 = total energy, TF12 = uniformity, TF13 = difference average, TF18 = joint energy, TF19 = joint entropy, TF20 = maximum probability, TF21 = sum entropy, TF25 = gray level non uniformity (gray-level dependence matrix), TF28 = gray level non uniformity (gray-level run length matrix), TF31 = run entropy, TF35 = run variance, TF37 = gray level non uniformity normalized, TF40 = zone percentage. The data are expressed as means ± standard deviations.

Discussion

In the present study, we sought to investigate whether multiclass classification models based on TA of kidney MRI could predict the three eGFR groups of renal dysfunction. The results of our study suggest that TA of kidney MRI would be a modest predictor of these eGFR groups, but might not be a valuable differentiator in a clinical setting. TA quantified from T1-weighted IP/OP/WO images provided better classification performance compared with that of those based on ADC maps and T2* maps. Furthermore, we examined the combination models and showed that texture models derived from ALL T1WIs using the SVM with rbf kernel classifier afforded the moderate diagnostic performance as well. To our knowledge, this is the first study to evaluate the possibility of using multiclass models in the classification of eGFR groups. Several attempts have been made to differentiate between eGFR groups using TA of kidney MRI. Previous reports have shown the possibility of using TA based on DWI, BOLD, SWI, and T1 and T2 mapping[18,19]. In all these attempts, only binary classifiers were examined to differentiate between each eGFR group separately. However, in clinical practice, it is not uncommon for the classification of diseases to extend to three or more groups; therefore, it is reasonable to build multiclass problems for clinical use. According to the recent guidelines[2], renal impairment and prognosis have been stratified into six disease groups based on the eGFR: G1 to G5, with G3 split into 3a and 3b. The most important cutoff points are eGFR 60 and 30 mL/min/1.73 m2, and previous studies on TA of kidney MRI were designed to classify the eGFR according to these cutoffs, except for one study that considered eGFR 90 mL/min/1.73 m2 as an additional cutoff point[19,22]. Therefore, our research also aimed to classify three groups, with cutoffs at eGFR 60 and 30 mL/min/1.73 m2. Ideally, the classification system of all six eGFR groups would be beneficial, although a large deviation in the distribution of each group prohibited these attempts. Our study used Dixon-based T1-weighted images, which have not been fully discussed in the assessment of renal dysfunction. Dixon-based imaging, also called chemical shift imaging, uses the IP/OP cycling of fat and water molecules and allows the acquired images to be combined mathematically into four sequences: IP/OP/WO/fat-only (FO) images[23]. This technique has been used as a homogeneous fat suppression or fat quantification method in various medical imaging fields. The Dixon method has the potential for measuring the renal lipid accumulation in type II diabetes mellitus[11]. Moreover, the Dixon technique is useful in detecting iron deposition related to T2* effects and susceptibility artifacts owing to the double-echo sequence[24]. Additionally, Dixon-based images provide better signal-to-noise efficiency than other conventional fat-suppressed methods[25]. In our results, TA based on T1-weighted IP/OP/WO images demonstrated moderate classification scores, and the favorable classification scores were obtained by T1-weighted WO images. Although we could not analyze the FO image and fat fraction ratio map, as these were not available in all patients, a T1-weighted WO image can be regarded as a total fat-suppressed image and contains T1 information on components other than fat. In a recent study on T1 mapping, increased cortical T1 values and decreased T1 cortico-medullary differentiation correlated with the severity of renal impairment[20,21]. The changes in the T1 values could be attributed to renal physiological states, such as inflammation, hypoxia, and fibrosis[20,26-29]. In our opinion, T1-weighted WO images may represent these changes and could be useful in the non-invasive assessment of CKD etiologies. In our study, TA based on the ADC and T2* maps showed relatively low diagnostic performance compared with that of those based on Dixon-based T1WI. In the ADC map, the texture features in the renal cortex had a good correlation with fibrosis and chronic lesions, and the texture features in the renal medulla were more related to renal function than those quantified from the renal cortex[19]. Additionally, the difference between the cortical and medullary ADC, the so-called delta-ADC, has been correlated with fibrosis in CKD[5,30]. However, since BOLD provides a marker of blood oxygenation levels, relative hypoxia associated with renal injury may be reflected by the T2* map. T2* measurement demonstrated a good correlation with the eGFR in patients with CKD, and TA of BOLD was linearly correlated with the eGFR in several studies[18,31-33]. In our results, TA of the ADC and T2* maps showed little correlation with the eGFR, and unsatisfactory results were obtained in multiclass problems. A possible explanation for the lower performances in the ADC and T2* maps is the difference in the quantification methods of texture features. We quantified the texture features from the renal parenchyma as a whole, whereas most other studies quantified the texture features from the renal cortex or medulla[5-10,19]. As described above, it would be favorable to consider the renal cortex and medulla separately for the assessment of the ADC and T2* maps. Moreover, our classification system was a multiclass model, and TA using non-linear classifiers based on clear images, such as Dixon-based T1WI, may be suitable for such systems. Regarding inter-reader reproducibility, the ICC values for the ADC and T2* maps were relatively low. These unsatisfactory results may be attributed to their relatively low resolution. Lower discriminative performance and reproducibility in TA of ADC maps have been reported due to the low resolution of the images[34,35]. In our research, the texture features were extracted from the whole area of both kidneys, although in most studies, these were measured in the renal cortex and medulla separately, and in the ipsilateral kidney, mostly on the right side due to artifacts caused by factors such as intestinal gas, poor breath holds, and susceptibility effects[18]. Considering the different functionalities of the renal cortex and medulla, it might be more appropriate to assess them separately. However, in clinical practice, we contemplated that evaluating them as a whole would be simpler and easier to understand. For the ADC and T2* maps, in particular, the delineation between the renal cortex and medulla is difficult because of their relatively low resolution, and poor reproducibility is expected when considering the renal cortex and medulla separately. Moreover, in patients with advanced renal dysfunction, it is often difficult to distinguish between them because of cortical thinning[36,37]. Another point where our research differs from others is whether one or both sides of the kidney are considered. Our study evaluated both kidneys based on the idea that they might contain more integrated information concerning renal function. However, considering the time-consuming process of region of interest (ROI) delineation, it would be favorable to segment only one side of the kidney. If TA derived from one side of the kidney is sufficient, it would be beneficial to consider only one side of the kidney because of the severe artifacts on the other side. Moreover, in our study, we performed manual segmentation instead of using automatic methods. A method that automatically divides the renal parenchyma into 12 layers using a computer (twelve-layer concentric objects method) has been validated so far[32,38,39]. Its use may improve the discrimination capacity and reproducibility of our models, especially for the ADC and T2* maps, which need to be examined in the future. In recent years, TA has become a promising technique for quantitative imaging analysis, providing biomarkers for pathological changes or the response to treatment[12-17]. In our analysis, TA of kidney MRI was not a good discriminator of eGFR groups, especially in the clinical setting. Texture features such as energy and total energy were frequently included in the selected features in most classification attempts, although their correlation with the eGFR was not good. It might be interesting to know the existence of universal texture parameters, as such features may represent the underlying pathophysiology of kidney disease. The energy and total energy show the magnitude of pixel values and accentuate the high signal intensities in the images[40]. In our opinion, cortico-medullary differentiation may have played a role in the renal dysfunction; as the renal function declined, decreased cortico-medullary differentiation was noted, as described above[20,21]. Both parameters showed decreased values in our results, implying decreased signal intensities in the whole area of the kidneys, and this may be affected by a decrease in the cortico-medullary difference. However, in most studies, entropy correlated well with the eGFR and showed the capacity to differentiate between the eGFR groups of patients with CKD[18,19,41]. Other reports state that skewness, kurtosis, and correlation may be useful in discriminating between these groups[18,19]. None of the studies commented on the energy and total energy. One reason for this discrepancy could be the difference in the classification system: a multiclass classification model was used in our study. Although we did not examine the binary classification for each border, it is suspected that the energy and total energy could be weak indicators for the overall classification of the eGFR groups. Another reason for the discrepancy is that we considered Dixon-based T1WI, whereas other studies mainly discussed DWI, BOLD, and SWI: the classifiers used in this study were mainly non-linear ML methods. In addition, TA was conducted on the whole region of the kidneys, whereas in other studies the renal cortex and medulla were analyzed separately. In this study, we also demonstrated the possible candidates of classifiers, such as SVM and RF. SVM has high generalizability since it can be used to select linear or non-linear kernels, and the 'rbf' (non-linear) kernel could be the most suited to our models. Generally, non-linear classifiers would show good performance in multiclass classification[42], and our results showed this tendency as well. Our study had many limitations. First, we retrospectively enrolled 166 patients from a single institution; this was a small sample with some imbalance between each group. A greater number of patients with more balanced grouping is needed to validate the results in the future. Second, since we excluded patients with renal lesions, some important renal diseases, such as polycystic kidney disease, were ignored in this analysis, which would have caused a selection bias. Third, the data were not divided into training and validation sets because of the limited number of patients; hence, further investigation using an external validation cohort should be performed in the future. Fourth, since we focused on classifications predicting the eGFR group, other important laboratory data or underlying pathologies were missed in this study. As stated above, renal lipid accumulation in diabetes mellitus can be assessed using the Dixon method[11], which is worth investigating in the future. Fifth, we extracted texture features from one layer of the image as two-dimensional data; the use of only one layer may result in important texture features being missed. This problem can be solved by extracting three-dimensional features, although this approach may be time-consuming. Sixth, in this study, the individual texture feature sets were selected for each classifier and imaging method. Future studies should strictly compare the performances of classifiers or imaging methods by selecting one common feature set. Lastly, we should have examined other T1-weighted Dixon-based images, such as FO image and fat fraction ratio map, as well as other diffusion-based images such as intra-voxel incoherent motion. In conclusion, multiclass classification models based on TA of kidney MRI showed modest classification performance for predicting the eGFR in patients with CKD. TA based on Dixon-based T1WI, particularly WO images, showed moderate performance. Energy and total energy were weakly correlated with the eGFR. Our results were limited in terms of the clinical value of TA of kidney MRI, and thus further studies should verify its reproducibility and feasibility.

Methods

Subjects

This study was approved by the Research Ethics Committee of the Saitama Medical University Hospital. The requirement for informed consent was waived by the committee (approval number BYOU2022-037). All experiments were performed in accordance with the relevant guidelines and regulations. Figure 3 presents the inclusion and exclusion criteria for this study. We identified and reviewed 209 patients referred from the Department of Nephrology in our hospital who underwent kidney MRI between January 2017 and September 2021. The inclusion criteria included: (1) age of 15 years or older; and (2) MRI scanning with Dixon-based T1WI, DWI, and ADC maps and T2* maps in our hospital. The exclusion criteria included: (1) lack of Dixon-based T1WI, DWI, and ADC maps or T2* maps (n = 5); (2) insufficient clinical or laboratory data (n = 1); (3) high-grade kidney atrophy (difficulty in segmentation) (n = 2); (4) severe artifacts on MRI (n = 17); and (5) presence of renal lesions with maximal diameter > 1 cm or number of renal masses > 5 in each kidney, including polycystic kidney disease (n = 18). In total, 166 patients were enrolled.
Figure 3

Flow chart of the inclusion and exclusion criteria for the study. ADC apparent diffusion coefficient, DWI diffusion-weighted imaging, MRI magnetic resonance imaging, T1WI T1-weighted imaging.

Flow chart of the inclusion and exclusion criteria for the study. ADC apparent diffusion coefficient, DWI diffusion-weighted imaging, MRI magnetic resonance imaging, T1WI T1-weighted imaging. The eGFR was calculated using Eq. (1): where age is in years and serum creatinine (sCr) is in mg/dL. The eGFR was defined as 120 mL/min/1.73 m2 if it was greater than 120 mL/min/1.73 m2 as calculated using Eq. (1). The patients were divided into three groups according to the eGFR: se-RD group (eGFR < 30 mL/min/1.73 m2, i.e., CKD stage G4–5), mo-RD group (30 ≤ eGFR < 60 mL/min/1.73 m2, i.e., CKD stage G3a/b), and CG (eGFR ≥ 60 mL/min/1.73 m2, i.e., CKD stage G1–2).

MRI acquisition

MRI images were acquired using a 3.0 Tesla superconducting unit (Skyra, Siemens Healthcare, Erlangen, Germany) with a spine coil and an 18-channel phased-array body coil. The standard dedicated MRI protocol consisted of the following sequences: Dixon-based T1WI with the 3D gradient-echo method, DWI with multiple b-factors, and T2*WI with multiple gradient-echoes obtained in the coronal plane. For the Dixon-based T1WI, only IP/OP/WO images were used in the analysis, as other images (such as fat-only images and fat fraction ratio maps) were not generated in all patients. The ADC map was automatically generated based on a monoexponential fitting model using DWI at the four b-factors. In BOLD, 12 T2* WIs corresponding to 12 different gradient echoes were acquired. T2* maps were generated on a pixel-by-pixel basis by fitting a linear regression method through the logarithms of the signal intensities versus their 12 echo times. Table 11 presents the representative MRI scanning sequences and parameters. Representative MRI scanning sequences and parameters. ADC apparent diffusion coefficient, BOLD blood oxygenation level-dependent imaging, DWI diffusion-weighted imaging, FA flip angle, FOV field of view, IP in-phase, OP opposed-phase, T1WI T1-weighted imaging, TE echo time, TR repetition time, WO water-only.

Data analysis procedures

Figure 4 presents the data analysis workflow. After segmentation, image processing, texture feature extraction, and reproducibility analysis were performed for each imaging method (T1-weighted IP/OP/WO images, ADC map, and T2* map), followed by texture feature selection and ML-based model construction in separate classification attempts. The combinations of the texture features were also examined: those derived from all T1-weighted images (ALL T1WIs) and those derived from all imaging methods (ALL IMs).
Figure 4

Flow chart showing the technical study pipeline. After segmentation, image processing, texture feature extraction, and reproducibility analysis were conducted for each imaging method (T1-weighted in-phase/opposed-phase/water-only images, ADC maps, and T2* maps), followed by texture feature selection and ML-based model construction in separate classification attempts. The combinations of texture features were also examined: those derived from all T1-weighted images and those derived from all imaging methods.

Flow chart showing the technical study pipeline. After segmentation, image processing, texture feature extraction, and reproducibility analysis were conducted for each imaging method (T1-weighted in-phase/opposed-phase/water-only images, ADC maps, and T2* maps), followed by texture feature selection and ML-based model construction in separate classification attempts. The combinations of texture features were also examined: those derived from all T1-weighted images and those derived from all imaging methods.

Texture feature extraction

Segmentation was performed using an open-source software (ITK-SNAP version 3.8.0). One slice of T1-weighted IP/OP/WO images, ADC maps, and T2* map images in the coronal plane were selected for each patient. An irregular two-dimensional ROI was drawn manually to contain the outline borders of the entire region of both kidneys on each selected image, and the cystic region was avoided to the maximum extent (Fig. 5). Two radiologists with 7 and 6 years of experience performed ROI delineation independently to assess the inter-observer reproducibility in the segmentation process. Both radiologists were blinded to the clinical information.
Figure 5

A method to set the region of interest (ROI) for each group and each image. ROIs were manually drawn on the contour lines of both kidneys, as shown by the red curves (avoiding the cystic area). ADC apparent diffusion coefficient, IP in-phase, OP opposed-phase, WO water-only. Severe renal dysfunction group (se-RD, estimated glomerular filtration rate [eGFR] < 30 mL/min/1.73 m2), moderate renal dysfunction group (mo-RD, 30 ≤ eGFR < 60 mL/min/1.73 m2), and control group (CG, eGFR ≥ 60 mL/min/1.73 m2).

A method to set the region of interest (ROI) for each group and each image. ROIs were manually drawn on the contour lines of both kidneys, as shown by the red curves (avoiding the cystic area). ADC apparent diffusion coefficient, IP in-phase, OP opposed-phase, WO water-only. Severe renal dysfunction group (se-RD, estimated glomerular filtration rate [eGFR] < 30 mL/min/1.73 m2), moderate renal dysfunction group (mo-RD, 30 ≤ eGFR < 60 mL/min/1.73 m2), and control group (CG, eGFR ≥ 60 mL/min/1.73 m2). To avoid data heterogeneity bias, all MRI data were subjected to image normalization (the intensity of the image was scaled to 0–100) and resampled to the same resolution (3 × 3 × 3 mm) before feature extraction. The texture features were calculated using an open-source software package capable of extracting a large panel of engineered features from medical images (PyRadiomics version 2.1.0). Texture features were calculated based on six feature classes (first-order statistics, gray-level co-occurrence matrix, gray-level dependence matrix, gray-level run-length matrix, gray-level size zone matrix, and neighboring gray-tone difference matrix). Ninety-three texture features were extracted and analyzed to select the most valuable features for discerning the three CKD groups with each imaging method. We performed dimension reduction of texture features to avoid overfitting and generalization errors in the classification models. After normalizing the numeric values as z-scores, the ICC was measured to evaluate the inter-observer reproducibility. Features with poor reproducibility (ICC < 0.75 or lower 95% CI < 0.6) in any of the imaging methods were excluded. Furthermore, the SFS algorithm, a wrapper-based greedy search algorithm, was used for feature selection. This algorithm identifies feature subsets that maximize the performance of predictive models by adding or eliminating features stepwise based on a user-defined classifier algorithm. We considered four representative ML classifiers in this study: linear discriminant analysis (LDA), SVM, decision tree (DT), and RF classifiers. As for the SVM algorithm, various kernel functions provide different decision-making algorithms and generate versatility. We adopted three representative kernels separately and compared their results in this study: linear, rbf, and sigmoid kernels. Thus, in total, we tested six different ML classifiers: LDA; SVM with linear, rbf, and sigmoid kernels; DT; and RF. By using the SFS algorithm, a subset of features that provided the best classification accuracies in each ML classifier was selected. The number of texture features was reduced to five in this step to prevent overfitting due to the small sample size. Concurrently, the relationship between the eGFR and the selected texture features for each imaging modality was examined using Pearson’s correlation coefficient. Multiclass classification models were created using the six ML classifiers described above and validated using the cross-validation method. We adopted the following methods to obtain the generalizability of our classification models and to test their applicability: (1) synthetic minority oversampling technique (SMOTE), (2) nested cross-validation with grid-search parameter tuning, and (3) 100-time repeat cross-validation method. Since our data had an imbalance between classes, we applied a SMOTE method before the final classification and validation step. This method creates synthetic instances that are not exact replications and increases the datasets of the minority group without damaging the structure of the actual data[43-45]. We applied this technique to augment the minority group datasets (i.e., 45 control cases and 36 severe RD cases), while preserving the majority group datasets (i.e., 85 moderate RD cases), resulting in 85 labeled cases for each class (255 cases in total). Several intrinsic hyperparameters are known for the SVM, DT, and RF classifiers, and the classification performance could be changed by attenuating these values. Thus, a nested cross-validation method with tenfold inner loops and tenfold outer loops was adopted to tune the parameters of these classifiers to avoid the double-dipping phenomenon, a potential bias[46,47], which indicates that training and test data were used for feature selection and model development, along with validation. A grid-search system was used for parameter tuning, in which a set of parameters with a discrete number of values was tested repeatedly to obtain the best parameter combination. The following hyperparameters were tested: C-value = 1, 10, and 100 and gamma = 0.001, 0.01, and 0.1 for SVM; and max-depth = 2, 4, and 6 and min-samples-leaf = 0.1, 0.5, 1, 5, and 10 for DT and RF. The cross-validation method was repeated 100 times to ensure the stability and reproducibility of our results. We repeated a SMOTE process along with a nested cross-validation as data augmented by the SMOTE may have some arbitrariness. The performance of the classifiers was evaluated using ROC curve analysis and the AUC. The accuracy, sensitivity, and specificity for each group and macro-average of all groups were calculated based on the confusion matrix of the classification results. We also evaluated the classification performance of the combination models derived from ALL T1WIs and those derived from ALL IMs. The SFS algorithm was used again for feature selection, and the number of texture features was reduced to 5. The multiclass classification models were created using the six ML classifiers mentioned above, and the performance of the classifiers was evaluated in the same manner as described above. Statistical analyses were performed using an open-source software package (Python scikit-learn 0.22.1). Statistical significance was set at P < 0.05. Supplementary Information 1. Supplementary Information 2. Supplementary Information 3. Supplementary Information 4. Supplementary Information 5. Supplementary Information 6.
Table 11

Representative MRI scanning sequences and parameters.

ParametersT1WI IP/OP/WODWI/ADC mapBOLD (T2* map)
TR (ms)5.351100175
TE (ms)2.46, 3.69704.92, 7.38, 9.84, 12.30, 14.76, 17.22, 19.68, 22.14, 24.60, 27.06, 29.52, and 31.98
FA (°)10N/A50
FOV (mm)360 × 360 × 144360 × 360 × 45360 × 360 × 27
Voxel size (mm)1.1 × 1.1 × 3.01.4 × 1.4 × 3.01.4 × 1.4 × 5.0
Recon matrix320128256
Slice thickness (mm)335
b value (mm2/s)0, 200, 400, 600
Respiratory compensationBreath holdFree breathingBreath hold

ADC apparent diffusion coefficient, BOLD blood oxygenation level-dependent imaging, DWI diffusion-weighted imaging, FA flip angle, FOV field of view, IP in-phase, OP opposed-phase, T1WI T1-weighted imaging, TE echo time, TR repetition time, WO water-only.

  41 in total

1.  Evaluation of renal dysfunction using texture analysis based on DWI, BOLD, and susceptibility-weighted imaging.

Authors:  Jiule Ding; Zhaoyu Xing; Zhenxing Jiang; Hua Zhou; Jia Di; Jie Chen; Jianguo Qiu; Shengnan Yu; Liqiu Zou; Wei Xing
Journal:  Eur Radiol       Date:  2018-12-17       Impact factor: 5.315

Review 2.  Diffusion-weighted imaging and texture analysis: current role for diffuse liver disease.

Authors:  Sofia Gourtsoyianni; Joao Santinha; Celso Matos; Nikolaos Papanikolaou
Journal:  Abdom Radiol (NY)       Date:  2020-10-16

3.  MRI of pancreatic ductal adenocarcinoma: texture analysis of T2-weighted images for predicting long-term outcome.

Authors:  Moon Hyung Choi; Young Joon Lee; Seung Bae Yoon; Joon-Il Choi; Seung Eun Jung; Sung Eun Rha
Journal:  Abdom Radiol (NY)       Date:  2019-01

4.  Renal cortical thickness measured at ultrasound: is it better than renal length as an indicator of renal function in chronic kidney disease?

Authors:  Michael D Beland; Nicholas L Walle; Jason T Machan; John J Cronan
Journal:  AJR Am J Roentgenol       Date:  2010-08       Impact factor: 3.959

5.  Clear Cell Renal Cell Carcinoma: Machine Learning-Based Quantitative Computed Tomography Texture Analysis for Prediction of Fuhrman Nuclear Grade.

Authors:  Ceyda Turan Bektas; Burak Kocak; Aytul Hande Yardimci; Mehmet Hamza Turkcanoglu; Ugur Yucetas; Sevim Baykal Koca; Cagri Erdim; Ozgur Kilickesmez
Journal:  Eur Radiol       Date:  2018-08-30       Impact factor: 5.315

6.  Texture analysis based on quantitative magnetic resonance imaging to assess kidney function: a preliminary study.

Authors:  Gumuyang Zhang; Yan Liu; Hao Sun; Lili Xu; Jianqing Sun; Jing An; Hailong Zhou; Yanhan Liu; Limeng Chen; Zhengyu Jin
Journal:  Quant Imaging Med Surg       Date:  2021-04

7.  Radiomics-based Prognosis Analysis for Non-Small Cell Lung Cancer.

Authors:  Yucheng Zhang; Anastasia Oikonomou; Alexander Wong; Masoom A Haider; Farzad Khalvati
Journal:  Sci Rep       Date:  2017-04-18       Impact factor: 4.379

8.  Magnetic resonance imaging T1- and T2-mapping to assess renal structure and function: a systematic review and statement paper.

Authors:  Marcos Wolf; Anneloes de Boer; Kanishka Sharma; Peter Boor; Tim Leiner; Gere Sunder-Plassmann; Ewald Moser; Anna Caroli; Neil Peter Jerome
Journal:  Nephrol Dial Transplant       Date:  2018-09-01       Impact factor: 5.992

9.  Renal BOLD MRI in patients with chronic kidney disease: comparison of the semi-automated twelve layer concentric objects (TLCO) and manual ROI methods.

Authors:  Lu-Ping Li; Bastien Milani; Menno Pruijm; Orly Kohn; Stuart Sprague; Bradley Hack; Pottumarthi Prasad
Journal:  MAGMA       Date:  2019-12-10       Impact factor: 2.533

10.  Radiomics: Images Are More than Pictures, They Are Data.

Authors:  Robert J Gillies; Paul E Kinahan; Hedvig Hricak
Journal:  Radiology       Date:  2015-11-18       Impact factor: 11.105

View more

北京卡尤迪生物科技股份有限公司 © 2022-2023.