Literature DB >> 35260192

Prediction of knee osteoarthritis progression using radiological descriptors obtained from bone texture analysis and Siamese neural networks: data from OAI and MOST cohorts.

Ahmad Almhdie-Imjabbar1,2, Khac-Lan Nguyen1,2, Hechmi Toumi1,2,3, Rachid Jennane1,2, Eric Lespessailles4,5,6.   

Abstract

BACKGROUND: Trabecular bone texture (TBT) analysis has been identified as an imaging biomarker that provides information on trabecular bone changes due to knee osteoarthritis (KOA). In parallel with the improvement in medical imaging technologies, machine learning methods have received growing interest in the scientific osteoarthritis community to potentially provide clinicians with prognostic data from conventional knee X-ray datasets, in particular from the Osteoarthritis Initiative (OAI) and the Multicenter Osteoarthritis Study (MOST) cohorts. PATIENTS AND METHODS: This study included 1888 patients from OAI and 683 patients from MOST cohorts. Radiographs were automatically segmented to determine 16 regions of interest. Patients with an early stage of OA risk, with Kellgren and Lawrence (KL) grade of 1 < KL < 4, were selected. The definition of OA progression was an increase in the OARSI medial joint space narrowing (mJSN) grades over 48 months in OAI and 60 months in MOST. The performance of the TBT-CNN model was evaluated and compared to well-known prediction models using logistic regression.
RESULTS: The TBT-CNN model was predictive of the JSN progression with an area under the curve (AUC) up to 0.75 in OAI and 0.81 in MOST. The predictive ability of the TBT-CNN model was invariant with respect to the acquisition modality or image quality. The prediction models performed significantly better with estimated KL (KLprob) grades than those provided by radiologists. TBT-based models significantly outperformed KLprob-based models in MOST and provided similar performances in OAI. In addition, the combined model, when trained in one cohort, was able to predict OA progression in the other cohort.
CONCLUSION: The proposed combined model provides a good performance in the prediction of mJSN over 4 to 6 years in patients with relevant KOA. Furthermore, the current study presents an important contribution in showing that TBT-based OA prediction models can work with different databases.
© 2022. The Author(s).

Entities:  

Keywords:  Fractal analysis; Knee osteoarthritis; Radiography; Siamese CNNs; Subchondral bone; Trabecular bone texture

Mesh:

Year:  2022        PMID: 35260192      PMCID: PMC8903620          DOI: 10.1186/s13075-022-02743-8

Source DB:  PubMed          Journal:  Arthritis Res Ther        ISSN: 1478-6354            Impact factor:   5.156


Background

Knee osteoarthritis (KOA) is a musculoskeletal condition frequently encountered not only in primary care but also in orthopedic and rheumatology clinics [3]. Due to the heterogeneity of osteoarthritis, i.e., its numerous phenotypes [27] and the wide variability in the trajectory of disease progression [12], it is of the utmost importance to identify KOA patients who have a greater potential of progressing more rapidly. Therefore, it is relevant to develop imaging biomarkers that can help the emergence of new therapeutic treatments and particularly new disease-modifying drugs. Due to the role of the subchondral bone and its remodeling status in KOA progression, texture analysis and tibial subchondral bone mineral density assessments are recognized and established methods to characterize structural alterations associated with KOA [18]. Recently, using the OAI database, the predictive ability of baseline trabecular bone texture to distinguish patients with or without radiographic progression was slightly improved compared to that of conventional clinical risk factors such as age, gender, body mass index (BMI), and joint space width (JSW) [10, 15]. Previously published studies have shown only moderate performance for predicting KOA progression when using pain, race, and previous knee injury [8, 17] as predictor factors. However, since data for pain, race, and previous knee injury were available in both OAI and MOST cohorts, we evaluated the performance of our proposed models with these three additional clinical predictors. In parallel with the improvement in medical imaging technologies, several machine learning techniques have been proposed for the diagnosis and prediction of KOA [14, 26]. Automatic KOA diagnosis is becoming increasingly popular [4, 23, 26] as it has a high potential to complement the OA diagnostic chain and make radiographic KOA grading more objective. The aims of this study were twofold: (i) to evaluate the predictive ability of a combined approach using both trabecular bone texture (TBT) descriptors, calculated by a variogram-based method [9, 10], and radiological gravity scores, calculated by deep learning-based Siamese CNN tools [26], to predict KOA progression; (ii) to study the use of the same KOA progression prediction model validated on independent OA cohort datasets (OAI and MOST), by training the model on one dataset and testing it on the other, and vice versa. The TRIPOD checklist (Transparent reporting of a multivariable prediction model for individual prognosis or diagnosis) was used as a framework of quality assurance of the present manuscript [22].

Methods

Patients

In this study, the data used in the preparation were obtained from the OAI and the MOST databases. Details about the acquisition and grading protocols in the OAI and the MOST studies are available online at https://nda.nih.gov/oai and http://most.ucsf.edu, respectively. The primary selected dataset included only the knee images of patients with available KL grades [13] and the Osteoarthritis Research Society International (OARSI) grades as well as the clinical covariates: age, gender, BMI, Western Ontario and McMaster universities osteoarthritis index (WOMAC) pain, race, and history of knee injury. From the selected dataset, the knees with preexisting OA with 2 ≤ KL < 4 [5, 10] at baseline were considered in the present study, in accordance with the European Medicines Agency [3] which recommended to include patients with KL radiographic entry criteria of grades 2 or 3 for studies of structure-modifying drugs. The selected dataset was divided into two sub-datasets according to the type of acquisition modality: the computed radiographs (CR), i.e., digital images acquired by a device using X-ray-sensitive plates which are then read by a processor, and the digitized X-ray films (RG). In order to evaluate the effect of the quality of images on the performance of the predictive models, each of the two datasets (CR and RG) was further divided into two groups according to the quality of the corresponding radiographs. In the first non-quality-controlled (nonQC) group, all radiographs were included except those showing materials (such as metallic materials, prostheses, and screws) in the subchondral zone, whereas in the second quality-controlled (QC) group, exclusion criteria also included radiographs with exposure problems (Fig. 1) in addition to those imposed in the nonQC group. The aim of this exclusion was to avoid the disturbances of these artifacts in the calculation of TBT parameters. This grouping strategy led to four sub-datasets, namely QC-CR, nonQC-CR, QC-RG, and nonQC-RG. As a result of the inclusion/exclusion criteria previously described, 2740 knees (425 cases, OAI) and 845 knees (297 cases, MOST) were judged as eligible for this study. Figure 2 shows the number of subjects and knees for each sub-dataset. The characteristics of selected OA cases and controls are summarized in Tables 1 and 2 for OAI and MOST cohorts, respectively.
Fig. 1

Radiographs from the OAI cohort with overexposure (A) and with materials (B), and radiographs from the MOST cohort with overexposure (C) and with materials (D)

Fig. 2

Flowchart illustrating the selection of study subjects from the OAI and MOST datasets (n is the number of patients and k is the corresponding number of knee radiographs at baseline). From the OAI and MOST initial database, selected sub-datasets were considered, including OA patients with 2 ⩽ KL<4 at baseline, with and without quality control conditions

Table 1

Characteristics of the patients and knees in the OAI subset with different image modalities in the study

Baseline48 months
ControlsCasesTotalControlsCasesTotal
nonQC-CR(k=1723)(k=286)(k=2009)(k=1723)(k=286)(k=2009)
 Age (years)62.3 (±9.0)62.4 (±8.1)62.3 (±8.8)66.3 (±9.0)66.4 (±8.1)66.3 (±8.1)
 BMI (kg/m2)29.4 (±4.6)30.6 (±4.7)29.6 (±4.7)29.3 (±4.8)31.0 (±5.0)29.6 (±4.9)
Gender
 F59%51%58%
 M41%49%42%
Medial JSN grade
 0741417827410741
 161711673361721638
 2365129494365119484
 30000146146
Lateral JSN grade
 01421270169113942641658
 11711218313310143
 2131413513611147
 300060161
nonQC-RG(k=592)(k=139)(k=731)(k=592)(k=139)(k=731)
 Age (years)62.9 (±9.1)62.9 (±8.1)62.4 (±8.8)66.3 (±9.0)66.9 (±8.1)66.4 (±8.1)
 BMI (kg/m2)29.9 ( ± 5.2)31.0 (±5.1)30.1 (±5.2)29.9 (±5.4)31.4 (±5.6)30.2 (±5.5)
Gender
 F67%63%36%
 M33%37%34%
Medial JSN grade
 0245232682450245
 1208442522089217
 21397221113953192
 300007777
Lateral JSN grade
 0498134632489134623
 14825035136
 24634939342
 3050029130
QC-CR(k=1419)(k=228)(k=1647)(k=1419)(k=228)(k=1647)
 Age (years)62.7 (±9.0)62.7 (±8.2)62.7 (±8.9)66.7 (±9.0)66.7 (±8.2)66.7 (±8.2)
 BMI (kg/m2)29.3 (±4.6)30.4 (±4.6)29.5 (±4.6)29.3 (±4.8)30.7 (±4.9)29.5 (±4.8)
Gender
 F59%47%57%
 M41%53%43%
Medial JSN grade
 0620316516200620
 14969058649615511
 230310741030391394
 30000122122
Lateral JSN grade
 01164219138311422151357
 114671531147121
 210921111126118
 300051051
QC-RG(k=407)(k=94)(k=501)(k=407)(k=94)(k=501)
 Age (years)62.0 (±8.9)62.4 (±8.6)62.1 (±8.9)66.01 (±8.9)66.4 (±8.6)66.1 (±8.6)
 BMI (kg/m2)29.8 (±5.3)30.5 (±5.4)29.9 (±5.3)29.9 (±5.6)30.9 (±5.8)30.1 (±5.6)
Gender
 F71%72%71%
 M29%28%29%
Medial JSN grade
 0184162001840184
 1134291631347141
 289491388934123
 300005353
Lateral JSN grade
 03418943033589424
 13523723124
 23133427330
 300022123

k the number of knees

“−” means no changes compared to baseline. Values for age and BMI are represented as mean (±standard deviation)

Table 2

Characteristics of the patients and knees in the MOST subset with different image modalities in the study

Baseline48 months
ControlsCasesTotalControlsCasesTotal
nonQC-CR(k=306)(k=125)(k=431)(k=306)(k=125)(k=431)
Age (years)63.2 (±7.7)62.8 (±7.7)63.1 (±7.7)68.2 (±7.7)67.8 (±7.7)68.1 (±7.7)
BMI (kg/m2)31.7 (±6.0)33.3 (±6.7)32.2 (±6.3)31.7 (±6.0)33.8 (±6.6)32.3 (±6.2)
Gender
 F67%62%65%
 M33%38%35%
Medial JSN grade
 0126141401260126
 11244516912411135
 25666122563894
 300007976
Lateral JSN grade
 0216116333202114316
 15365936541
 23734034337
 300034337
nonQC-RG(k=244)(k=172)(k=416)(k=244)(k=172)(k=416)
 Age (years)64.4 (±7.6)63.12 (±8.2)63.8 (±7.9)69.4 (±7.6)68.1 (±8.2)68.8 (±8.2)
 BMI (kg/m2)30.9 (±5.3)32.2 (±6.7)31.4 (±5.9)30.8 (±5.6)32.6 (±7.3)31.5 (±6.4)
Gender
 F68%55%63%
 M32%45%37%
Medial JSN grade
 0101131141010101
 198801789810108
 245791244572117
 300009090
Lateral JSN grade
 0208168376187161348
 12232517522
 21411524630
 300016016
QC-CR(k=269)(k=107)(k=376)(k=269)(k=107)(k=376)
 Age (years)63.4 (±7.7)62.9 (±7.7)63.2 (±7.7)68.4 (±7.7)67.9 (±7.7)68.2 (±7.7)
 BMI (kg/m2)31.2 (±5.7)32.7 (±5.4)31.6 (±5.6)31.1 (±5.6)33.3 (±5.7)31.7 (±5.7)
Gender
 F65%57%63%
 M35%43%37%
Medial JSN grade
 0116111271160116
 1109401491098117
 24456100443478
 300006565
Lateral JSN grade
 0191100291179100279
 14945332234
 22933231334
 300027229
QC-RG(k=205)(k=124)(k=329)(k=205)(k=124)(k=329)
 Age (years)64.7 (±7.7)63.0 (±7.7)64.0 (±7.7)69.7 (±7.7)68.0 (±7.7)69.0 (±7.7)
 BMI (kg/m2)30.9 (+5.4)32.9 (±7.2)31.7 (±6.2)30.9 (±5.8)33.4 (±7.8)31.8 (±6.7)
Gender
 F76%67%72%
 M24%33%28%
Medial JSN grade
 0911110291091
 1785813678987
 2365591365086
 300006565
Lateral JSN grade
 0173122295153115268
 11912017522
 21311420424
 300015015

k the number of knees

“−” means no changes compared to baseline. Values for age and BMI are represented as mean (±standard deviation)

Radiographs from the OAI cohort with overexposure (A) and with materials (B), and radiographs from the MOST cohort with overexposure (C) and with materials (D) Flowchart illustrating the selection of study subjects from the OAI and MOST datasets (n is the number of patients and k is the corresponding number of knee radiographs at baseline). From the OAI and MOST initial database, selected sub-datasets were considered, including OA patients with 2 ⩽ KL<4 at baseline, with and without quality control conditions Characteristics of the patients and knees in the OAI subset with different image modalities in the study k the number of knees “−” means no changes compared to baseline. Values for age and BMI are represented as mean (±standard deviation) Characteristics of the patients and knees in the MOST subset with different image modalities in the study k the number of knees “−” means no changes compared to baseline. Values for age and BMI are represented as mean (±standard deviation)

Definition of OA progression

Patients with or without OA progression were selected using the following definitions: OA progressors (cases) included patients with non-severe KOA (KL grade 2 ≤ KL ≤ 3) at baseline and with an increased mJSN grade (ΔmJSN >0) over the predefined control period (48 months and 60 months for OAI and MOST cohorts, respectively. ΔmJSN denotes the difference between OARSI mJSN grades at baseline and check points. OA non-progressors (controls) included patients with non-severe KOA at baseline and a constant mJSN grade (ΔmJSN = 0) over the predefined control period.

Regions of interest (ROI)

A patchwork construction technique using a semi-automatic method to extract the ROIs has been previously described [10]. In the current study, in order to extract the trabecular bone ROIs, a fully automatic approach, thanks to the BoneFinder [19] software, was used to delimit the femoral and tibial bone edges. The patchwork consists of 16 ROIs mapping the whole tibial trabecular area (Fig. 3). Our algorithm firstly uses BoneFinder to identify the rough position of the bone in the image and then outline 148 points of the tibial and femoral contours. For the left knee, points 48 and 64 mark the lateral and medial extremities of the tibia, respectively. For the right knee, the medial extremities of the tibia are identified by the points 122 and 138, respectively. Secondly, the algorithm approximates the tibial subchondral baseline as the line going through these anatomical points. Thirdly, this line is used to determine the orientation and size of the 16-ROI patchwork under the cortical plates. The square ROI dimensions were proportional to the knee width defined as the distance between the outer tibial margins. In our sub-dataset, radiographs presented different pixel spacing ranging from 0.1 to 0.2 mm, and the average ROI side length was 73 ± 18 pixels (10.1 ± 0.9 mm), ranging from 7 to 13 mm.
Fig. 3

Knee trabecular bone mapping using Bone Finder software for ROI selection. Dots are the anatomical markers automatically defined by Bone Finder. Each patchwork is defined by 16 squared ROIs

Knee trabecular bone mapping using Bone Finder software for ROI selection. Dots are the anatomical markers automatically defined by Bone Finder. Each patchwork is defined by 16 squared ROIs

Texture analysis

Fractal analysis consists in assigning a fractal dimension (FD) and other fractal characteristics to a dataset [11]. Several methods have been developed to measure the FD of a signal including the well-known technique of fractal signature analysis (FSA) [20], the Whittle estimator (WhE) [7], and the quadratic variation method (VAR) [9, 10]. These three different fractal analysis methods provided consistent results in their capacity to predict OA progression [10]. In the current study, the VAR method, used by Janvier et al. [9], was retained for our experiments. As reported earlier [9], the cut-off scale was observed around 500 mm on the empirical variograms and two fractal parameters were extracted: μFD and mFD corresponding to the texture complexity computed for the two micro (μ-scale) and the milli (m-scale) scales of observation under 400 mm and above 600 mm, respectively. Four TBT parameters (microscopic scale: horizontal μFD, vertical μFD, and macroscopic scale: horizontal mFD, vertical mFD) were computed in the 16 ROIs, resulting in 64 descriptors.

KL grading using Siamese neural networks

A Siamese neural network (SNN)-based method proposed by Tiulpin et al. [26] was used to estimate the probability distribution of the KL grades of baseline radiographs included in our study, in the objective to propose a fully automatic KOA progression prediction model. An SNN is a class of neural network architectures that contain two or more subnetworks sharing the same configuration. SNNs are known to be robust to class imbalance, which is usually the case in medical applications [2, 21]. A full description of the used SNN-based method can be found in [25, 26].

Statistical analysis

Logistic regression was used to predict KOA progression. Several statistical models were developed involving not only clinical covariates and radiological scores but also TBT-based parameters: Model_1: cov Model_2: cov+TBT Model_3: cov+mJSN+lJSN Model_4: cov+KL Model_5: cov+KLprob Model_6: cov+lJSN+mJSN+TBT Model_7: cov+KLprob+TBT Model_8: cov+KLprob+lJSN+TBT Model_9: covPlus+KLprob+lJSN+TBT where lJSN denotes lateral joint space narrowing, cov denotes the traditional clinical covariates (age, gender, and BMI), and covPlus denotes the cov parameters accompanied with additional clinical data (race, WOMAC pain, and history of injury). The TBT-CNN model (Model_8) includes baseline TBT, KLprob, lJSN, and cov. The KLprob was computed as the linear combination of the five probabilities of the KL grades predicted by the CNN-based model. In Model_8, the mJSN was not included, due to the high correlation between the baseline mJSN and KLprob grades. To avoid overfitting problems, all the models were evaluated using a 10-fold cross-validation repeated 300 times. Each model was evaluated using the AUC of the receiver operating characteristic (ROC) as a global performance criterion. The model classification accuracy (ACC), the probability that a random example is correctly classified, was also computed to investigate the relevance of different models. An ACC is defined as the ratio of the number of correct predictions relative to the total number of predictions. All statistical analyses were performed using the R Statistical tool (version 3.6.3) including the packages MASS (for stepwise AIC optimization), Caret (for the cross-validation training), and the pROC (for pROC curves and comparisons). Comparisons between the models were based on the ROC curves using the Delong method [6]. In order to reduce the number of parameters before training the prediction models, a backward selection of the TBT parameters (64 variables) was automatically performed using the Akaike Information Criterion (AIC) [1] as an iterative criterion. At each iteration, the AIC removes one parameter and preserves the most efficient parameter(s) to limit overfitting effects.

Results

Performance comparison

The cov and the JSN scores at baseline are presented in Table 1 for the OAI dataset and in Table 2 for the MOST dataset. The ROC curves of the 8 models were calculated using data from nonQC-OAI and MOST sub-datasets (Fig. 4). The models’ AUC values using all considered sub-datasets are summarized in Table 3.
Fig. 4

ROC curves obtained for the OA progression prediction. Data from the OAI-nonQC-CR (A), RG (B), and CR&RG (C) sub-cohorts and from the MOST-nonQC-CR (D), RG (E), and CR&RG (F) sub-cohorts. QC and nonQC denote quality control and non-quality control, respectively. CR and RG denote computed radiographs and digitized X-ray films, respectively

Table 3

Summary of AUC values of the 8 models: data from OAI and MOST datasets

Model_1Model_2Model_3Model_4Model_5Model_6Model_7Model_8
OAI-QC
 CR0.580.690.710.630.690.720.720.75
 RG0.500.730.680.600.670.670.750.75
 RGCR0.560.690.710.630.690.690.720.74
OAI-nonQC
 CR0.580.690.700.630.680.720.720.73
 RG0.550.720.680.630.670.740.740.75
 RGCR0.580.680.700.630.680.720.720.73
MOST-QC
 CR0.560.750.740.640.680.780.780.79
 RG0.580.800.740.650.680.800.800.81
 RGCR0.570.750.750.650.680.780.780.79
MOST-nonQC
 CR0.560.740.720.620.670.770.770.78
 RG0.580.770.740.650.710.790.790.80
 RGCR0.570.750.740.640.690.780.780.79
ROC curves obtained for the OA progression prediction. Data from the OAI-nonQC-CR (A), RG (B), and CR&RG (C) sub-cohorts and from the MOST-nonQC-CR (D), RG (E), and CR&RG (F) sub-cohorts. QC and nonQC denote quality control and non-quality control, respectively. CR and RG denote computed radiographs and digitized X-ray films, respectively Summary of AUC values of the 8 models: data from OAI and MOST datasets In OAI and MOST datasets, Model_1 was not predictive of OA progression (AUC < 0.6). The combination of cov with TBT or KLprob (Model_2 or Model_5 respectively) improved the prediction to a level comparable with that obtained by the combination of cov with JSN (Model_3). In the MOST dataset, Model_2 was predictive of JSN progression (AUC≥0.74) and significantly better than Model_4 which combines cov and baseline KL (AUC≤0.65); Model_2 outperformed Model_5 (p = 0.021); Model_2 significantly improved the prediction compared to Model_3, especially in the RG subset (p = 0.017); and Model_3 significantly outperformed Model_5 (p < 0.03) in all scenarios regardless of the acquisition modality and image quality. Model_5 showed a significantly better AUC than Model_4, in all cases (p < 0.02 in OAI and p < 0.03 in MOST datasets). Model_7 achieved a similar performance with AUCs up to 0.75 (p > 0.2) in the OAI dataset and up to 0.80 (p > 0.05) in the MOST dataset. The AUCs of Model_7 were significantly better than those of Model_3, especially in the OAI RG subset (p < 0.004) and in the MOST dataset (p < 0.02). Model_6, which combines cov, JSN, and TBT, previously proposed by Janvier et al. [10], achieved a similar performance. In all different scenarios, the proposed TBT-CNN model (Model_8) significantly improved the AUC compared to the Model_3 (p < 0.003) in the OAI dataset and (p < 0.02) in the MOST dataset. Model_8 increased the AUC up to 0.75 in the OAI dataset and 0.81 in the MOST dataset. Model_8 significantly outperformed Model_6 and Model_7 in the OAI CR and CR&RG subsets (p < 0.003) and in the MOST CR&RG subsets, regardless of the image quality. The same observation held when considering the MOST-nonQC-RG subset (p < 0.05). Furthermore, Model_8 had a good accuracy (ACC > 0.8) in the OAI dataset and (ACC > 0.7) in the MOST dataset. With the additional clinical covariates (race, WOMAC pain, and history of injury) used in Model_9, the results showed no improvement on the prediction performance compared to the proposed model (Model_8) in both OAI and MOST datasets.

Performance comparison with respect to acquisition modality

In terms of the acquisition modality, no significant differences in AUCs of the 8 models were found with regard to the three different scenarios (CR, RG, and CR&RG) (p > 0.1), in both OAI and MOST datasets.

Performance comparison with respect to image quality

Results showed that the image quality (QC and nonQC) had no statistically significant effect on the performance of the 8 models (p > 0.2) in the OAI dataset and (p > 0.4) in the MOST dataset. Thus, quality control is not a discriminating determinant of KOA progression prediction.

The prediction performance of models trained on one dataset and tested in another dataset

Model_8 was tested in two scenarios. In the first scenario, the model was trained on the OAI dataset. The trained model was then used for the prediction of OA progression in the MOST dataset. In the second scenario, the model was trained on the MOST datasets. The trained model was then used for the prediction of OA progression in the OAI dataset. Results showed the ability of this model trained on one cohort to predict progression in the other cohort with AUC > 0.7 in the CR and CR&RG cases, whatever the quality of the radiographs (Table 4). However, the model trained in the RG subset did not achieve the same performance (AUC < 0.7).
Table 4

Results obtained from training on one cohort (OAI/MOST) and testing on another cohort (MOST/OAI)

ModalityMetricTrain. on OAI and validation on MOSTTrain. on MOST and validation on OAI
NonQCQCNonQCQC
CRAUC0.74 (0.68–0.79)0.73 (0.67–0.79)0.7 (0.67–0.74)0.71 (0.68–0.75)
DOR6.825.953.424.04
ACC0.740.740.770.73
RGAUC0.64 (0.59–0.70)0.63 (0.56–0.69)0.69 (0.64–0.74)0.59 (0.53–0.65)
DOR3.014.343.362.51
ACC0.620.650.740.78
CR & RGAUC0.73 (0.70–0.77)0.73 (0.69–0.77)0.71 (0.68–0.74)0.71 (0.68–0.74)
DOR5.826.914.034.04
ACC0.70.720.830.82
Results obtained from training on one cohort (OAI/MOST) and testing on another cohort (MOST/OAI)

Discussion

An important contribution of this study consists in showing that OA prediction models can work with different databases. To the best of our knowledge, the present study is the first to evaluate the capability of combined models, including TBT and CNN-based parameters, to predict KOA progression, in both OAI and MOST datasets. The TBT-CNN model consistently provided the best performance in comparison with the other models [15, 16, 26] not only when training and testing on the same cohort (with AUC up to 0.81) but also when training on one cohort (OAI or MOST) and testing on the other one (MOST or OAI). When testing on another cohort, the TBT-CNN model was always predictive particularly in the CR and CR&RG subsets (AUC ≥ 0.7), which was not the case for the other models. Our study also included an evaluation of the effect of different acquisition modalities and image qualities on the performance of our combined prediction models. The TBT-CNN model significantly outperformed the other models, regardless of the quality of the images considered, especially with complete selected OAI and MOST datasets (Fig. 4). The same results were obtained when using the QC- and nonQC-CR sub-datasets of the OAI cohort and the nonQC-RG sub-dataset of the MOST cohort. The AUC of the TBT-CNN model varied from 0.73 to 0.75 in OAI and from 0.78 to 0.81 in MOST, whereas the AUC of the cov-JSN model achieved a maximum AUC of 0.71 in OAI and 0.75 in MOST (Table 3). In both cohorts, the results showed that the performance of the TBT-CNN model was invariant with respect to acquisition modality and image quality. Moreover, results showed that the model prediction performance was better when using CNN-based estimations of KL than those measured manually by radiologists in the OAI and MOST datasets. In addition, the performance of the proposed prediction model remained unchanged when adding more clinical data including race, WOMAC pain, and history of injury. Whatever the cohort, the modality of the radiographs, and the quality of the radiographs, the CNN-based estimation of KL grades provided better results than those obtained from a discrete ordinal grading method. An automatic estimation of JSN grades using a CNN-based method [25] might also be of interest to improve the prediction of OA progression. However, the performance of the prediction model using CNN-based estimations was statistically less significant than when using TBT parameters in the MOST dataset. The performance of the two approaches was similar in the OAI dataset. Previous studies have demonstrated that the texture analysis of subchondral bone from conventional knee radiographs could be a good indicator of the prediction of knee OA progression [10, 15, 16, 28, 29]. In a recent study by Kraus et al. [15], the use of TBT calculated by the FSA method in combination with other clinical covariates and radiological parameters was investigated to propose a predictive model of OA progression using a large sample of 579 RG&CR radiographs selected from the OAI cohort. They investigated not only the radiographic but also the knee pain progression status over 12 and 24 months. However, the performance of the proposed model was modest (AUC = 0.633 − 0.649). Involving a much larger dataset of 1124 CR radiographs, Janvier et al. [10] proposed a prediction model that included JSN grades in addition to TBT and cov parameters. In their study, the TBT analysis covered the medial and lateral subchondral bone. This model showed the ability to predict OA progression over 48 months, providing an AUC score of 0.77 using the WhE estimator for the TBT parameters.

Strengths and limitations

Due to a lack of information in the MOST cohort regarding the JSW, our study took into consideration only the discrete ordinal JSN grades. It would be interesting to consider the use of the continuous JSW values or joint space area (JSA), for which an additional step is required to calculate these values from the selected radiographs. In the current study, age, sex, and BMI were chosen as clinical predictors. Other predictors of KOA progression such as self-reported previous knee injury and knee pain may also be included in future studies. However, the main focus of this study was to show the ability of image-processing-based models to predict KOA progression, rather than investigating other clinical covariates for KOA progression prediction. It should be noticed that the duration of the two tested cohorts is not the same (48 months for OAI and 60 months for MOST). Unfortunately, the OAI cohort did not include imaging data at 60-month follow-up, and the MOST cohort did not include imaging data at 48 months. Consequently, the use of time-to-event data analyses was not relevant since the occurrence of KOA progression is more or less a continuous phenomenon. It has been shown, however, that our proposed models provide a good performance in the prediction of KOA progression when trained on one cohort and tested on the other. The present study has several important strengths. It involves the use of two large datasets. In addition, the proposed model takes advantage of an extensive set of TBT parameters [9, 10] and CNN-based KL grades for the prediction of OA progression. We also evaluated the effect of different image quality and modality scenarios on the performance of the prediction of OA progression. A major contribution of our study is the evaluation using a model trained on one cohort and validated on the other. In this case, the progression prediction models were not only trained on the OAI dataset and tested on the MOST dataset, as proposed by Tiulpin et al. [24], but also trained on the MOST dataset and tested on the OAI dataset, which has never been explored to date. Furthermore, the combination of TBT and CNN-based estimation of KL grades significantly improves the prediction of OA progression. This combination provides mutual information between the evolution of shape surrounding the knee joint space [24-26] and texture variations in the proximal tibial subchondral bone.

Conclusions

In conclusion, our study has demonstrated the feasibility of using the TBT-CNN model to predict mJSN progression in both OAI and MOST cohorts. This model exhibited a good diagnostic performance regardless of both the acquisition modality and the image quality when the model was trained and tested on the same cohort. Moreover, when trained on one cohort, the TBT-CNN model was able to predict mJSN progression on another cohort in the CR and CR&RG subsets, irrespective of the image quality. However, further experiments are needed to develop more comprehensive risk assessment models for KOA progression prediction. In particular, other TBT methods such as the Variance Orientation Transform (VOT) [30], FSA, and WhE methods, as well as the automatic calculation of certain radiographic parameters such as JSN, JSW, or JSA scores, could be investigated.
  26 in total

1.  Radiological assessment of osteo-arthrosis.

Authors:  J H KELLGREN; J S LAWRENCE
Journal:  Ann Rheum Dis       Date:  1957-12       Impact factor: 19.103

Review 2.  OARSI Clinical Trials Recommendations: Knee imaging in clinical trials in osteoarthritis.

Authors:  D J Hunter; R D Altman; F Cicuttini; M D Crema; J Duryea; F Eckstein; A Guermazi; R Kijowski; T M Link; J Martel-Pelletier; C G Miller; T J Mosher; R E Ochoa-Albíztegui; J-P Pelletier; C Peterfy; J-P Raynauld; F W Roemer; S M Totterman; G E Gold
Journal:  Osteoarthritis Cartilage       Date:  2015-05       Impact factor: 6.576

3.  Directional fractal signature analysis of trabecular bone: evaluation of different methods to detect early osteoarthritis in knee radiographs.

Authors:  M Wolski; P Podsiadlo; G W Stachowiak
Journal:  Proc Inst Mech Eng H       Date:  2009-02       Impact factor: 1.617

4.  A dissimilarity-based multiple classifier system for trabecular bone texture in detection and prediction of progression of knee osteoarthritis.

Authors:  Tomasz Woloszynski; Pawel Podsiadlo; Gwidon Stachowiak; Marek Kurzynski
Journal:  Proc Inst Mech Eng H       Date:  2012-11       Impact factor: 1.617

Review 5.  Assessment of bone mineral density and radiographic texture analysis at the tibial subchondral bone.

Authors:  E Lespessailles; R Jennane
Journal:  Osteoporos Int       Date:  2012-11-22       Impact factor: 4.507

6.  Subchondral tibial bone texture predicts the incidence of radiographic knee osteoarthritis: data from the Osteoarthritis Initiative.

Authors:  T Janvier; R Jennane; H Toumi; E Lespessailles
Journal:  Osteoarthritis Cartilage       Date:  2017-09-19       Impact factor: 6.576

7.  Predictive Validity of Radiographic Trabecular Bone Texture in Knee Osteoarthritis: The Osteoarthritis Research Society International/Foundation for the National Institutes of Health Osteoarthritis Biomarkers Consortium.

Authors:  Virginia Byers Kraus; Jamie E Collins; H Cecil Charles; Carl F Pieper; Lawrence Whitley; Elena Losina; Michael Nevitt; Steve Hoffmann; Frank Roemer; Ali Guermazi; David J Hunter
Journal:  Arthritis Rheumatol       Date:  2017-12-15       Impact factor: 10.995

8.  Fully automatic knee osteoarthritis severity grading using deep neural networks with a novel ordinal loss.

Authors:  Pingjun Chen; Linlin Gao; Xiaoshuang Shi; Kyle Allen; Lin Yang
Journal:  Comput Med Imaging Graph       Date:  2019-06-13       Impact factor: 7.422

9.  Automatic Grading of Individual Knee Osteoarthritis Features in Plain Radiographs Using Deep Convolutional Neural Networks.

Authors:  Aleksei Tiulpin; Simo Saarakkala
Journal:  Diagnostics (Basel)       Date:  2020-11-10

10.  Development of a clinical prediction algorithm for knee osteoarthritis structural progression in a cohort study: value of adding measurement of subchondral bone density.

Authors:  Michael P LaValley; Grace H Lo; Lori Lyn Price; Jeffrey B Driban; Charles B Eaton; Timothy E McAlindon
Journal:  Arthritis Res Ther       Date:  2017-05-16       Impact factor: 5.156

View more

北京卡尤迪生物科技股份有限公司 © 2022-2023.