Christos Kokkotis, Charis Ntakolia, Serafeim Moustakidis, Giannis Giakas, Dimitrios Tsaopoulos.
Abstract
Knee osteoarthritis (KOA) is a degenerative joint disease of the knee that results from the progressive loss of cartilage. Due to KOA's multifactorial nature and the poor understanding of its pathophysiology, there is a need for reliable tools that reduce diagnostic errors made by clinicians. The existence of public databases has facilitated the advent of advanced analytics in KOA research; however, the heterogeneity of the available data, along with the observed high feature dimensionality, makes this diagnosis task difficult. The objective of the present study is to provide a robust Feature Selection (FS) methodology that could: (i) handle the multidimensional nature of the available datasets and (ii) mitigate the shortcomings of existing feature selection techniques in identifying important risk factors that contribute to KOA diagnosis. To this end, we used multidimensional data obtained from the Osteoarthritis Initiative database for individuals with or without KOA. The proposed fuzzy ensemble feature selection methodology aggregates the results of several FS algorithms (filter, wrapper and embedded ones) based on fuzzy logic. The effectiveness of the proposed methodology was evaluated using an extensive experimental setup that involved multiple competing FS algorithms and several well-known ML models. A 73.55% classification accuracy was achieved by the best performing model (Random Forest classifier) on a group of twenty-one selected risk factors. Finally, explainability analysis was performed to quantify the impact of the selected features on the model's output, thus enhancing our understanding of the rationale behind the decision-making mechanism of the best model.
Keywords: Clinical data; Explainability; Feature selection; KOA diagnosis; Machine learning
Year: 2022 PMID: 35099771 PMCID: PMC8802106 DOI: 10.1007/s13246-022-01106-6
Source DB: PubMed Journal: Phys Eng Sci Med ISSN: 2662-4729
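The abstract describes aggregating the outputs of filter, wrapper and embedded FS algorithms through fuzzy logic, but this record does not give the membership functions or rule base. The following is only a minimal sketch of the general idea, assuming each FS family yields a normalized importance in [0, 1] per feature, triangular "low/medium/high" fuzzy sets, and a zero-order Sugeno-style rule base; the names `tri`, `fuzzy_score`, `rank_features` and the toy feature scores are hypothetical and are not taken from the study.

```python
from itertools import product

# Assumed fuzzy sets over a normalized importance in [0, 1]:
# (left foot, peak, right foot) of a triangular membership function.
LEVELS = {
    "low": (-0.5, 0.0, 0.5),
    "medium": (0.0, 0.5, 1.0),
    "high": (0.5, 1.0, 1.5),
}
# Assumed crisp consequent attached to each linguistic level.
LEVEL_VALUE = {"low": 0.0, "medium": 0.5, "high": 1.0}


def tri(x, a, b, c):
    """Triangular membership with feet a, c and peak b."""
    if x <= a or x >= c:
        return 0.0
    return (x - a) / (b - a) if x <= b else (c - x) / (c - b)


def fuzzy_score(scores):
    """Fuse one feature's importances (one per FS family) into a single score.

    Every combination of levels is treated as a rule: its firing strength is
    the minimum membership (fuzzy AND) and its consequent is the mean of the
    level values. The result is the strength-weighted average of consequents.
    """
    num = den = 0.0
    for combo in product(LEVELS, repeat=len(scores)):
        strength = min(tri(s, *LEVELS[lv]) for s, lv in zip(scores, combo))
        consequent = sum(LEVEL_VALUE[lv] for lv in combo) / len(combo)
        num += strength * consequent
        den += strength
    return num / den if den else 0.0


def rank_features(importances):
    """Return feature names sorted by fused score, most important first."""
    fused = {name: fuzzy_score(s) for name, s in importances.items()}
    return sorted(fused, key=fused.get, reverse=True)


# Hypothetical (filter, wrapper, embedded) importances for three toy features.
toy = {
    "feat_a": (0.9, 0.8, 0.95),
    "feat_b": (0.2, 0.9, 0.1),
    "feat_c": (0.1, 0.05, 0.2),
}
ranking = rank_features(toy)
print(ranking)
```

In this sketch a feature rated highly by only one family (`feat_b`) ends up between a feature all families agree on (`feat_a`) and one that all families rate low (`feat_c`), which is the kind of consensus behaviour an ensemble FS scheme aims for.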
Main categories of the clinical evaluation data considered in this study
| Category | Description |
|---|---|
| Medical history | Medications and health histories based on questionnaire results (not including medical imaging outcomes) |
| Symptoms | Arthritis symptoms or health-related disability and function based on questionnaire data |
| Subject characteristics | Includes variables which describe anthropometric parameters and personal information |
| Nutrition | Questionnaire based on block food frequency |
| Physical exam | Includes performance measures and knee and hand exams |
| Physical activity | Questionnaire results regarding living and leisure activities |
| Behavioral | Consists of variables which quantify the social behavior and the quality level of daily routine |
Fig. 1 The proposed AI methodology for KOA diagnosis
Fig. 2 Flowchart of the feature selection method based on fuzzy logic
Fig. 3 Fuzzy set of input variables for FIS 1 and 2
Fig. 4 Fuzzy set of output variable for FIS 1 and 2
Fig. 5 Curves of testing accuracy scores with respect to the number of selected features for different ML models
Summary of best metrics per model and number of selected features
| Models | Accuracy | Precision | Recall | F1-Score | Num. of features |
|---|---|---|---|---|---|
| RF | 73.55 | 73.82 | 73.64 | 73.59 | 21 |
| MLP | 73.20 | 73.48 | 73.20 | 73.13 | 17 |
| LR | 73.27 | 73.38 | 73.27 | 73.24 | 17 |
| SVMs | 73.36 | 73.68 | 73.36 | 73.27 | 18 |
| KNN | 71.55 | 71.74 | 71.55 | 71.49 | 12 |
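For readers unfamiliar with the metrics in the table above, the sketch below shows how accuracy, precision, recall and F1-score can be computed from true and predicted labels. It assumes macro-averaging over the classes; the record does not state which averaging scheme the study used, and the function name `macro_metrics` and the toy labels are hypothetical.

```python
def macro_metrics(y_true, y_pred):
    """Accuracy plus macro-averaged precision, recall and F1 (assumed scheme)."""
    labels = sorted(set(y_true) | set(y_pred))
    pairs = list(zip(y_true, y_pred))
    accuracy = sum(t == p for t, p in pairs) / len(pairs)

    precisions, recalls, f1s = [], [], []
    for label in labels:
        tp = sum(t == label and p == label for t, p in pairs)
        fp = sum(t != label and p == label for t, p in pairs)
        fn = sum(t == label and p != label for t, p in pairs)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        f1 = (2 * precision * recall / (precision + recall)
              if precision + recall else 0.0)
        precisions.append(precision)
        recalls.append(recall)
        f1s.append(f1)

    n = len(labels)
    return {
        "accuracy": accuracy,
        "precision": sum(precisions) / n,
        "recall": sum(recalls) / n,
        "f1": sum(f1s) / n,
    }


# Toy labels: 1 = KOA, 0 = no KOA (illustrative only, not study data).
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
y_pred = [1, 1, 0, 0, 0, 1, 1, 0]
metrics = macro_metrics(y_true, y_pred)
print(metrics)
```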
Fig. 6 The number of selected risk factors per category for the first 21 most informative features (a full description is given in Appendix A)
Comparative analysis of FS methods
| Metric | FSFL | Vote FS | RF Emb FS | LGBM Emb FS | SVM RFE FS | LR RFE FS | Filter MI FS | Filter f-ANOVA FS |
|---|---|---|---|---|---|---|---|---|
| Maximum accuracy (%) | 73.55 | 72.99 | 73.36 | 73.51 | 70.53 | 73.50 | 72.75 | 73.44 |
| Number of selected features | 21 | 76 | 43 | 87 | 96 | 60 | 91 | 53 |
| DR (%) | – | + 72% | + 51% | + 76% | + 78% | + 65% | + 77% | + 60% |
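The DR row is not defined in this record. Its values appear consistent with the relative reduction in the number of selected features that FSFL (21 features) achieves with respect to each competing method, i.e. 100 × (n_method − 21) / n_method. The short check below reproduces the row under that assumption; the helper name `reduction_pct` is hypothetical.

```python
FSFL_FEATURES = 21

# Number of selected features per competing method, copied from the table.
competitors = {
    "Vote FS": 76,
    "RF Emb FS": 43,
    "LGBM Emb FS": 87,
    "SVM RFE FS": 96,
    "LR RFE FS": 60,
    "Filter MI FS": 91,
    "Filter f-ANOVA FS": 53,
}


def reduction_pct(baseline, selected):
    """Percentage of features saved when using `selected` instead of `baseline`."""
    return 100.0 * (baseline - selected) / baseline


# Round to whole percentages, as in the table.
dr = {name: round(reduction_pct(n, FSFL_FEATURES))
      for name, n in competitors.items()}
print(dr)
```

Under this reading, FSFL reaches its maximum accuracy with roughly half to four fifths fewer features than each alternative.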
Fig. 7 (a) Features' impact on the random forest (21F) model output for the testing set of the OAI dataset. (b) Features' average impact magnitude for testing instances
Fig. 8 Risk factors' contributions to the ML model output for a subject with KOA status
The 21 most informative selected risk factors as described in OAI database
| Selected features | Description | Category |
|---|---|---|
| P02ELGRISK | Knee symptoms, risk factors, or both, status at IEI/SV | Symptoms |
| P01BMI | Body mass index | Subject characteristics |
| V00AGE | Age | Subject characteristics |
| P01WEIGHT | Average current scale weight (kg) | Subject characteristics |
| V00LKFHDEG | Left knee exam: flexion contracture/hyperextension, degrees (contracture positive) | Physical exam |
| V00KOOSKPR | Right knee: KOOS Pain Score | Symptoms |
| P01MOMHRCV | Mother had hip replacement surgery | Medical history |
| P02PA1 | Climb up total of 10 or more flights of stairs on most days | Physical activity |
| P01KSX | Frequent knee pain status by person | Symptoms |
| V00RKFHDEG | Right knee exam: flexion contracture/hyperextension, degrees (contracture positive) | Physical exam |
| V00WTMAXKG | Maximum adult weight, self-reported (kg) | Subject characteristics |
| P02KSURG | Either knee, history of knee surgery | Medical history |
| V00lfTHRL | Left Flexion MAX force high relaxation limit | Physical exam |
| V00BAPFAT | Block Brief 2000: daily % of calories from fat, alcoholic beverages excluded from denominator (kcal) | Nutrition |
| V00RPAVG | Radial pulse: average beats per minute | Subject characteristics |
| V00PASE | Physical Activity Scale for the Elderly (PASE) score | Physical activity |
| V00KOOSQOL | KOOS quality of life score | Symptoms |
| V00LFXCOMP | Isometric strength: left knee flexion, able to complete (3) measurements | Physical exam |
| V00BPDIAS | Blood pressure: diastolic (mm Hg) | Subject characteristics |
| V00PA430CV | How often lift or move objects weighing 25 pounds or more by hand during a typical week, past 30 days | Behavioral |
| V00KPLKN1 | Left knee pain: twisting/pivoting on knee, last 7 days | Symptoms |