Alejandro G Morales, Jinhee J Lee, Francesco Caliva, Claudia Iriondo, Felix Liu, Sharmila Majumdar, Valentina Pedoia.
Abstract
Knee pain is the most common and debilitating symptom of knee osteoarthritis (OA). While there is a perceived association between OA imaging biomarkers and pain, findings for this relationship are weak or conflicting. This study uses Deep Learning (DL) models to elucidate associations between bone shape, cartilage thickness and T2 relaxation times extracted from Magnetic Resonance Images (MRI) and chronic knee pain. Class Activation Maps (Grad-CAM) applied to the trained chronic pain DL models are used to evaluate the locations of features associated with presence and absence of pain. For the cartilage thickness biomarker, features sensitive for pain presence were generally located on the medial side, while features specific for pain absence were generally located on the anterior lateral side. This suggests that the association of cartilage thickness and pain varies, requiring a more personalized averaging strategy. We propose a novel DL-guided definition for cartilage thickness spatial averaging based on Grad-CAM weights. We showed a significant improvement in modeling chronic knee pain with the inclusion of the novel biomarker definition: likelihood ratio test p-values of 7.01 × 10⁻³³ and 1.93 × 10⁻¹⁴ for DL-guided cartilage thickness averaging for the femur and tibia, respectively, compared to the cartilage thickness compartment averaging.
Year: 2021 PMID: 34753963 PMCID: PMC8578418 DOI: 10.1038/s41598-021-01111-x
Source DB: PubMed Journal: Sci Rep ISSN: 2045-2322 Impact factor: 4.379
Bootstrapped (n = 100) test set chronic pain ROC performance for all six biomarker models per bone, as well as an average ensemble across all bones.
| Biomarker type | Biomarker model | Patella | Tibia | Femur | PTF |
|---|---|---|---|---|---|
| Single | Cartilage T2 | 58.9 (58.3, 59.5) | 49.1 (48.6, 49.7) | 65.7 (65.3, 66.2) | 62.7 (62.2, 63.2) |
| 75.4 (75.0, 75.8) | 77.5 (77.2, 77.9) | 65.1 (64.7, 65.6) | 71.7 (71.4, 72.1) | ||
| 71.0 (70.5, 71.4) | 68.3 (68.0, 68.7) | 70.9 (70.5, 71.3) | 73.7 (73.4, 74.0) | ||
| Cartilage thickness | 55.7 (55.1, 56.2) | 48.1 (47.5, 48.7) | 55.5 (55.0, 55.9) | 54.5 (54.0, 55.0) | |
| 72.2 (71.8, 72.6) | 81.3 (81.0, 81.6) | 77.2 (76.8, 77.5) | 79.2 (78.8, 79.5) | ||
| 68.4 (68.1, 68.8) | 70.7 (70.4, 71.1) | 72.1 (71.7, 72.4) | 72.8 (72.5, 73.2) | ||
| Bone shape | 53.6 (53.1, 54.1) | 50.8 (50.3, 51.3) | 56.6 (56.1, 57.1) | ||
| 79.0 (78.6, 79.4) | 78.8 (78.4, 79.1) | 80.0 (79.7, 80.4) | |||
| 70.6 (70.3, 71.0) | 70.3 (69.9, 70.6) | 74.2 (73.9, 74.5) | |||
| Fusion | Morphological bone and cartilage fusion | 50.8 (50.3, 51.4) | |||
| 79.5 (79.1, 79.8) | |||||
| 71.7 (71.3, 72.0) | |||||
| Morphological and compositional cartilage fusion | 58.8 (58.3, 59.3) | 43.1 (42.6, 43.6) | 50.7 (50.1, 51.2) | 54.4 (53.9, 54.8) | |
| 67.4 (67.0, 67.9) | 82.1 (81.8, 82.4) | 81.3 (81.0, 81.7) | 79.2 (78.9, 79.6) | ||
| 68.1 (67.7, 68.4) | 68.9 (68.6, 69.3) | 73.3 (73.0, 73.7) | 73.1 (72.7, 73.4) | ||
| All biomarkers fusion | 48.8 (48.3, 49.4) | 47.1 (46.6, 47.6) | 52.9 (52.3, 53.5) | 50.3 (49.8, 50.7) | |
| 77.8 (77.5, 78.2) | 82.1 (81.7, 82.4) | 79.2 (78.8, 79.5) | 82.3 (82.0, 82.6) | ||
| 69.9 (69.6, 70.3) | 73.0 (72.6, 73.3) | 71.6 (71.3, 72.0) | 73.8 (73.4, 74.1) | ||
Sensitivity, specificity, and AUC values are shown respectively, along with their corresponding 95% confidence intervals.
The best performances per bone and ensemble are bolded. PTF = Patella + Tibia + Femur ensemble. Result metrics are for the first timepoint for each patient.
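The bootstrapped estimates above can be illustrated with a short sketch. It assumes a percentile bootstrap over binary pain labels and model scores, which is one common approach and not necessarily the procedure the authors used; the function names `auc` and `bootstrap_auc` are hypothetical.

```python
import random

def auc(labels, scores):
    """Rank-based AUC: chance that a positive case outscores a negative one (ties count 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def bootstrap_auc(labels, scores, n_boot=100, seed=0):
    """Resample cases with replacement n_boot times; return mean AUC and a 95% percentile interval."""
    rng = random.Random(seed)
    n = len(labels)
    stats = []
    while len(stats) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        ys = [labels[i] for i in idx]
        if len(set(ys)) < 2:  # a resample needs both classes for AUC to be defined
            continue
        stats.append(auc(ys, [scores[i] for i in idx]))
    stats.sort()
    lo = stats[int(0.025 * (n_boot - 1))]
    hi = stats[int(0.975 * (n_boot - 1))]
    return sum(stats) / n_boot, lo, hi
```

Sensitivity and specificity can be bootstrapped the same way by swapping `auc` for the corresponding metric at a fixed decision threshold.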
Logistic regression model results for the association between chronic knee pain and radiological features including KL grades, OARSI JSN grades for lateral and medial compartments, and the minimum medial JSW measurement.
| Variable | Estimates (95% CI) |
|---|---|
| Chronic pain (%) | 33.1 |
| OARSI JSN grades medial | −0.12 (−0.33, 0.10) |
| OARSI JSN grades lateral | |
| Quantitative JSW | −0.05 (−0.15, 0.05) |
| KL grades | |
| Age | − |
| Female sex | 0.13 (−0.04, 0.31) |
| BMI | |
In logistic regression analysis adjusted for age, sex, and BMI, KL grades (OR 2.20; 95% CI 1.91, 2.52) and OARSI JSN grades for the lateral compartment (OR 1.24; 95% CI 1.02, 1.51) were statistically significantly associated with higher odds of chronic pain. The test sensitivity, specificity, and AUC were 0.80 (95% CI 0.77, 0.83), 0.55 (95% CI 0.50, 0.59), and 0.69 (95% CI 0.66, 0.71), respectively.
Bold p-values are significant (p-value < 0.05).
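The table lists regression estimates while the text reports odds ratios. Assuming the tabulated estimates are log-odds coefficients (an assumption; the record does not say so explicitly), the two scales are related by exponentiation. The sketch below applies this to the "Female sex" row; `odds_ratio` is a hypothetical helper.

```python
import math

def odds_ratio(beta, ci_low, ci_high):
    """Convert a log-odds coefficient and its CI bounds to the odds-ratio scale."""
    return tuple(round(math.exp(v), 2) for v in (beta, ci_low, ci_high))

# "Female sex" row: estimate 0.13 (95% CI -0.04, 0.31)
female_or = odds_ratio(0.13, -0.04, 0.31)
print(female_or)
```

Under the same assumption, the reported KL-grade odds ratio of 2.20 would correspond to a coefficient of about ln(2.20) ≈ 0.79.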
Figure 1. The vertices of a reference bone surface, selected to match the average demographic distribution of the test set, were mapped on all the bone surfaces in the test set using a fully automatic landmark-matching algorithm. The maximum and minimum local curvatures were used for coupling homologous points on two surfaces. Both these features were locally defined on the surfaces and used to identify the landmark matching. After the landmark matching procedure, with the heat maps in the same reference space, localized group analysis was performed to compare the true positive (TP) and true negative (TN) model predictions for each single biomarker. Local Statistical Parametric Mapping (SPM) was performed on these two groups to assess differences in location of important features significant for presence of pain (TP) or specific for absence of pain (TN). Point-by-point SPM was performed using ANOVA group comparison considering age, sex and BMI as confounding factors.
Logistic regression results for the cartilage thickness biomarker for all bones.
| Biomarker | Bone | Method | Variable | Estimate | Standard error | p-value | ROC (sensitivity/specificity/AUC) | Likelihood ratio p-value |
|---|---|---|---|---|---|---|---|---|
| Cartilage thickness (n = 2151) | Femur | Classical: clinical compartment average | Intercept | −3.59 | 0.621 | 7.56 × 10⁻⁹ | 13.5 / 95.2 / 63.3 | 7.01 × 10⁻³³ |
| | | | Age | −0.011 | 5.1 × 10⁻³ | 3.06 × 10⁻² | | |
| | | | BMI | 0.077 | 1.01 × 10⁻² | 1.84 × 10⁻¹⁴ | | |
| | | | Sex | −0.193 | 0.114 | 9.05 × 10⁻² | | |
| | | | LF thickness | −0.582 | 0.264 | 2.77 × 10⁻² | | |
| | | | MF thickness | 1.289 | 0.246 | 1.77 × 10⁻⁷ | | |
| | | Proposed: DL-guided weighted average | Intercept | −2.52 | 0.645 | 9.16 × 10⁻⁵ | 33.5 / 89.8 / 69.2 | |
| | | | Age | −1.97 × 10⁻² | 5.36 × 10⁻³ | 2.33 × 10⁻⁴ | | |
| | | | BMI | 5.14 × 10⁻² | 1.06 × 10⁻² | 1.13 × 10⁻⁶ | | |
| | | | Sex | −8.78 × 10⁻² | 0.118 | 0.455 | | |
| | | | LF thickness | 2.25 | 0.364 | 6.55 × 10⁻¹⁰ | | |
| | | | MF thickness | 2.29 | 0.268 | 1.57 × 10⁻¹⁷ | | |
| | | | DL-thickness | −3.66 | 0.32 | 2.02 × 10⁻³⁰ | | |
| | Tibia | Classical: clinical compartment average | Intercept | 0.496 | 0.638 | 0.437 | 15.5 / 94.2 / 63.6 | 1.93 × 10⁻¹⁴ |
| | | | Age | −1.81 × 10⁻² | 5.17 × 10⁻³ | 4.5 × 10⁻⁴ | | |
| | | | BMI | 0.078 | 0.01 | 7.27 × 10⁻¹⁵ | | |
| | | | Sex | −0.356 | 0.106 | 8.0 × 10⁻⁴ | | |
| | | | LT thickness | −0.445 | 0.182 | 1.47 × 10⁻² | | |
| | | | MT thickness | −0.537 | 0.15 | 3.42 × 10⁻⁴ | | |
| | | Proposed: DL-guided weighted average | Intercept | 0.81 | 0.65 | 0.213 | 24.8 / 92.9 / 66.9 | |
| | | | Age | −2.11 × 10⁻² | 5.27 × 10⁻³ | 6.06 × 10⁻⁵ | | |
| | | | BMI | 7.03 × 10⁻² | 1.02 × 10⁻² | 4.86 × 10⁻¹² | | |
| | | | Sex | −0.387 | 0.107 | 3.12 × 10⁻⁴ | | |
| | | | LT thickness | 0.289 | 0.208 | 0.165 | | |
| | | | MT thickness | 0.108 | 0.173 | 0.533 | | |
| | | | DL-thickness | −1.37 | 0.184 | 9.6 × 10⁻¹⁴ | | |
| | Patella | Classical: clinical compartment average | Intercept | 1.21 | 0.644 | 6.05 × 10⁻² | 15.8 / 94.8 / 65.0 | 0.851 |
| | | | Age | −2.38 × 10⁻² | 5.33 × 10⁻³ | 8.09 × 10⁻⁶ | | |
| | | | BMI | 6.64 × 10⁻² | 1.03 × 10⁻² | 1.06 × 10⁻¹⁰ | | |
| | | | Sex | −0.389 | 0.105 | 2.28 × 10⁻⁴ | | |
| | | | L thickness | −0.398 | 0.118 | 7.12 × 10⁻⁴ | | |
| | | | M thickness | −0.424 | 0.12 | 3.97 × 10⁻⁴ | | |
| | | Proposed: DL-guided weighted average | Intercept | 1.215 | 0.645 | 5.95 × 10⁻² | 16.0 / 94.9 / 65.0 | |
| | | | Age | −2.38 × 10⁻² | 5.33 × 10⁻³ | 7.96 × 10⁻⁶ | | |
| | | | BMI | 6.63 × 10⁻² | 1.03 × 10⁻² | 1.20 × 10⁻¹⁰ | | |
| | | | Sex | −0.39 | 0.106 | 2.24 × 10⁻⁴ | | |
| | | | L thickness | −0.376 | 0.166 | 2.39 × 10⁻² | | |
| | | | M thickness | −0.401 | 0.173 | 2.06 × 10⁻² | | |
| | | | DL-thickness | −4.57 × 10⁻² | 0.243 | 0.851 | | |
Demographic factors (age, BMI, and sex) are included in the logistic regression models alongside the different cartilage thickness averaging methods. Results are shown for the two definitions of the OA imaging biomarker, clinical compartment average and DL-guided weighted average, for the femur, tibia, and patella.
LF Lateral Femur, MF Medial Femur, MT Medial Tibia, LT Lateral Tibia, M Medial, L Lateral.
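The comparison in this table rests on two ideas: a spatial average of cartilage thickness weighted by Grad-CAM importance, and a likelihood ratio test between the nested "classical" and "proposed" logistic models. The sketch below illustrates both under simple assumptions: the weighted mean is the plain normalized form (the exact weighting used in the study is not given here), the proposed model adds exactly one parameter (DL-thickness), and the names `weighted_average` and `lr_test_1dof` are hypothetical.

```python
import math

def weighted_average(thickness, weights):
    """Average thickness values using per-location importance weights (e.g. Grad-CAM)."""
    total = sum(weights)
    if total == 0:
        raise ValueError("weights sum to zero")
    return sum(t * w for t, w in zip(thickness, weights)) / total

def lr_test_1dof(loglik_reduced, loglik_full):
    """Likelihood ratio test for nested models differing by one parameter.

    The statistic 2 * (llf_full - llf_reduced) is compared with a chi-squared
    distribution with 1 degree of freedom, whose survival function is
    erfc(sqrt(x / 2)).
    """
    stat = 2.0 * (loglik_full - loglik_reduced)
    p_value = math.erfc(math.sqrt(max(stat, 0.0) / 2.0))
    return stat, p_value
```

A small p-value indicates that adding the DL-guided average improves the fit beyond what the compartment averages and demographics already explain.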
Figure 2. The inclusion criteria for a knee image volume from a specific patient timepoint in this cross-sectional study. The three main criteria were the existence of a KL grade, a chronic pain label, and matching 3D-DESS and 2D-MSME image volumes, which resulted in 7437 cross-sectional timepoints from 3067 unique patients.
Figure 3. Biomarker 2D spherical maps. The three biomarkers projected to the articular bone surface were converted to 2D spherical maps. (a) The transformation from Cartesian coordinates into spherical coordinates was performed by uniformly sampling 224 × 224 points in the point cloud and describing them based on the angle along the x–y plane from the positive x-axis (θ), the elevation angle from the x–y plane (φ) and the distance from the center of the point cloud to the sampled point in the surface (ρ). The angle θ was sampled from −π to +π for all bones while the angle φ was sampled from −π/2 to +π/8 for the femur and tibia and from −π/2 to +π/8 for the patella. The sampling was designed to be centered around the articular surface to ensure the cartilage would be centered for each bone. (b) Bone shape 2D spherical map. (c) Cartilage thickness 2D spherical map. (d) Cartilage average T2 value 2D spherical map.
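The coordinate change described in panel (a) can be sketched as follows. This covers only the Cartesian-to-spherical conversion, its inverse, and a uniform 224 × 224 angle grid; how surface points are matched to grid angles is not described in the caption and is left out. The function names are hypothetical, and the φ range defaults to the values stated for the femur and tibia.

```python
import math

def to_spherical(point, center):
    """Return (theta, phi, rho) of a point relative to the point-cloud center."""
    x, y, z = (p - c for p, c in zip(point, center))
    rho = math.sqrt(x * x + y * y + z * z)   # distance from the center
    theta = math.atan2(y, x)                 # angle in the x-y plane from +x
    phi = math.atan2(z, math.hypot(x, y))    # elevation from the x-y plane
    return theta, phi, rho

def to_cartesian(theta, phi, rho, center):
    """Inverse mapping, usable to project a 2D map back onto the surface."""
    cx, cy, cz = center
    return (cx + rho * math.cos(phi) * math.cos(theta),
            cy + rho * math.cos(phi) * math.sin(theta),
            cz + rho * math.sin(phi))

def angle_grid(n=224, phi_min=-math.pi / 2, phi_max=math.pi / 8):
    """Uniform sampling angles: theta in [-pi, pi], phi in [phi_min, phi_max]."""
    thetas = [-math.pi + 2 * math.pi * i / (n - 1) for i in range(n)]
    phis = [phi_min + (phi_max - phi_min) * j / (n - 1) for j in range(n)]
    return thetas, phis
```

Because the mapping is invertible, values computed on the 2D map (such as importance heat maps) can be carried back to 3D surface locations.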
Figure 4. Overview of the biomarker model strategies, shown for the femur. The normalized spherical images for each patient were merged into a three-channel 8-bit image. (a–c) The first three strategies consisted of the single biomarkers: cartilage thickness, bone shape, and cartilage T2. (a) The cartilage thickness strategy consisted of the cartilage thickness spherical maps replicated three times into a spherical image. (b) The bone shape strategy consisted of the bone shape spherical maps replicated three times into a spherical image. (c) The cartilage T2 strategy consisted of the deep, superficial, and average T2 spherical maps as the first, second, and third channels respectively. (d–f) The last three fusion strategies consisted of the biomarker fusions: morphological cartilage and bone fusion, morphological and compositional cartilage fusion and all biomarkers fusion. (d) The morphological cartilage and bone fusion consisted of the cartilage thickness and bone shape spherical maps as the first and second channels respectively, with the last channel empty. (e) The morphological and compositional cartilage fusion consisted of the deep and superficial T2 spherical maps as the first and second channels respectively with the third channel consisting of the cartilage thickness spherical map. (f) The all biomarkers fusion consisted of the cartilage thickness, bone shape, and average T2 spherical map as the first, second and third channels respectively.
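The channel layouts in this caption can be sketched with plain nested lists. The min-max scaling to 0–255 is an assumption (the caption says the maps were normalized but not how), an "empty" channel is taken to mean all zeros, and `to_uint8` and `stack_channels` are hypothetical names.

```python
def to_uint8(channel):
    """Min-max scale a 2D map to integers in 0..255 (assumed normalization)."""
    flat = [v for row in channel for v in row]
    lo, hi = min(flat), max(flat)
    scale = 255.0 / (hi - lo) if hi > lo else 0.0
    return [[int(round((v - lo) * scale)) for v in row] for row in channel]

def stack_channels(c1, c2, c3):
    """Merge three 2D maps into an H x W image of 3-tuples; None means an empty channel."""
    h, w = len(c1), len(c1[0])
    empty = [[0] * w for _ in range(h)]
    chans = [to_uint8(c) if c is not None else empty for c in (c1, c2, c3)]
    return [[tuple(ch[i][j] for ch in chans) for j in range(w)] for i in range(h)]

# Toy 2 x 2 maps standing in for 224 x 224 spherical maps (made-up values).
thickness = [[1.0, 2.0], [3.0, 5.0]]
bone_shape = [[0.0, 1.0], [1.0, 0.0]]

single = stack_channels(thickness, thickness, thickness)   # strategy (a)
fusion = stack_channels(thickness, bone_shape, None)       # strategy (d)
```

Replicating a single map across all three channels keeps the input shape compatible with networks that expect RGB images.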
Training, validation, and test splits information for the segmentation and classification models.
| Task | Model | Training (cases) | Validation (cases) | Test (cases) | Cases ratio | Average time points per patient | Timepoint distribution (timepoints:patients) | Average age (mean ± SD) | Average KL (mean ± SD) | Sex distribution (male/female) | Total WOMAC pain scores (mean ± SD) | χ² test correlation (sex) (p-values) | MANOVA one-way correlation (age, BMI) (p-values) |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Segmentation | Bone | 57 (29) | 15 (8) | 30 (16) | 0.520 | 1.01 | 1:100; 2:1 | 58.4 ± 8.19 | 0.6 ± 1.06 | 49/53 | 2.4 ± 2.90 | 0.745 | 0.413 |
| | Cartilage | 118 (114) | 28 (28) | 28 (28) | 0.977 | 2.0 | 2:87 | 61.6 ± 9.93 | 2.3 ± 0.94 | 90/84 | 4.3 ± 3.80 | 0.156 | |
| Classification | OA | 12,634 (5402) | 2558 (1111) | 5926 (2530) | 0.428 | 4.78 | 1:179; 2:396; 3:419; 4:601; 5:1367; 6:527; 7:927 | 63.2 ± 9.17 | 1.3 ± 1.21 | 9005/12,113 | 2.1 ± 2.95 | 0.121 | 0.190 |
| | Chronic pain | 4029 (1324) | 1257 (411) | 2151 (713) | 0.329 | 2.42 | 1:1103; 2:771; 3:509; 4:345; 5:192; 6:104; 7:43 | 63.9 ± 9.38 | 1.2 ± 1.22 | 3510/3927 | 1.5 ± 2.77 | 0.179 | 0.0848 |
Demographic factors were controlled by testing for statistical independence across the splits using a Pearson’s chi-squared test (χ2) for the categorical sex variable and a one-way Multivariate Analysis of Variance (MANOVA) for the joint effect of age and BMI.
Bold p-values are significant (p-value < 0.05).
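The independence check for the categorical sex variable can be sketched as a Pearson chi-squared test on a splits-by-sex contingency table. The counts below are made up for illustration, since per-split sex counts are not listed here, and the function names are hypothetical. With three splits and two sexes the test has 2 degrees of freedom, for which the p-value has the closed form exp(−χ²/2).

```python
import math

def chi2_statistic(table):
    """Pearson chi-squared statistic and degrees of freedom for an r x c count table."""
    row_tot = [sum(r) for r in table]
    col_tot = [sum(c) for c in zip(*table)]
    n = sum(row_tot)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_tot[i] * col_tot[j] / n
            stat += (observed - expected) ** 2 / expected
    dof = (len(table) - 1) * (len(table[0]) - 1)
    return stat, dof

def p_value_dof2(stat):
    """Chi-squared survival function, valid only for 2 degrees of freedom."""
    return math.exp(-stat / 2.0)

# Rows: training, validation, test; columns: male, female (hypothetical counts).
counts = [[10, 20], [20, 10], [15, 15]]
stat, dof = chi2_statistic(counts)
p = p_value_dof2(stat)
```

A p-value above 0.05 would be read as no detectable difference in sex distribution across the splits.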
Figure 5. We used the Grad-CAM model interpretation technique to obtain a class-discriminative localization map for each prediction. We first computed the gradient of the score for the class of interest (before the softmax function) with respect to the feature maps of the last convolutional layer in the ResNet. These gradients flowing back are global average-pooled to obtain the neuron importance weights for the target class. A heat map of location importance is then upsampled to match the image size and overlaid on the input image. We then leveraged the invertible property of our spherical transformation method to generate articular surface importance heat maps for model interpretation for each bone and for each single biomarker. This process was performed on the first timepoint of every unique patient in the hold-out test set (n = 875) and is illustrated for the femur.
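The Grad-CAM steps in this caption can be sketched on plain nested lists, assuming the feature maps and their gradients have already been extracted from the network. The ReLU on the weighted sum follows the standard Grad-CAM formulation; nearest-neighbor upsampling is a simplification chosen for brevity, and the function names are hypothetical.

```python
def grad_cam(feature_maps, gradients):
    """Compute a Grad-CAM heat map.

    feature_maps, gradients: lists of K maps, each H x W (last conv layer
    activations and the class-score gradients with respect to them).
    """
    # Global average pooling of gradients gives one importance weight per map.
    weights = [sum(sum(row) for row in g) / (len(g) * len(g[0])) for g in gradients]
    h, w = len(feature_maps[0]), len(feature_maps[0][0])
    # Weighted sum of feature maps, followed by ReLU to keep positive evidence.
    return [[max(0.0, sum(a * fm[i][j] for a, fm in zip(weights, feature_maps)))
             for j in range(w)] for i in range(h)]

def upsample_nearest(cam, factor):
    """Enlarge the heat map by an integer factor to match the input image size."""
    return [[cam[i // factor][j // factor] for j in range(len(cam[0]) * factor)]
            for i in range(len(cam) * factor)]

# Toy example with K = 2 maps of size 2 x 2 (made-up values).
maps = [[[1.0, 0.0], [0.0, 1.0]], [[0.0, 2.0], [2.0, 0.0]]]
grads = [[[1.0, 1.0], [1.0, 1.0]], [[-1.0, -1.0], [-1.0, -1.0]]]
heat = upsample_nearest(grad_cam(maps, grads), 2)
```

In practice a smoother interpolation (for example bilinear) is typically used for the upsampling step before overlaying the map on the input.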