Carlo M Bertoncelli, Paola Altamura, Sikha Bagui, Subhash Bagui, Edgar Ramos Vieira, Stefania Costantini, Marco Monticone, Federico Solla, Domenico Bertoncelli.
Abstract
Background: Osteoarthritis (OA) has traditionally been considered a disease of older adults (⩾65 years old), but it may appear in younger adults. However, the risk factors for OA in younger adults need to be further evaluated.
Keywords: arthritis; machine learning; osteoarthritis; statistical data mining
Year: 2022 PMID: 35859927 PMCID: PMC9290106 DOI: 10.1177/1759720X221104935
Source DB: PubMed Journal: Ther Adv Musculoskelet Dis ISSN: 1759-720X Impact factor: 3.625
Figure 1. The selection process for the study population (NHANES: National Health and Nutrition Examination Survey).
List of dependent and independent variables selected from the National Health and Nutrition Examination Survey.
| Presence of osteoarthritis | Sex | Age | Poverty income ratio |
|---|---|---|---|
| Race/ethnicity | Diabetes | Body mass index | Hypertension if >140 mmHg |
| Smoking status | History of stroke | Physical and mental limitation | Walked or bicycled over the past 30 days |
| Require special healthcare equipment | Difficulty in stooping, kneeling | Difficulty walking up 10 steps | Difficulty sitting for long periods |
| Difficulty walking for a quarter mile | Difficulty reaching up over head | Difficulty in performing leisure activity at home | Difficulty in performing house chore |
| Difficulty in lifting or carrying | Difficulty in standing for long periods | Difficulty in walking between rooms on same floor | Difficulty in dressing themselves |
| Difficulty in standing up from armless chair | Difficulty in using fork, knife, drinking from cup | Difficulty in grasping/holding small objects | Difficulty getting in and out of bed |
| Difficulty in preparing meals | Difficulty in attending social event | Difficulty in going out to movies, events | Difficulty in managing money |
Figure 2. Prediction algorithm flow diagram.
DNN, deep neural network; PCA, principal component analysis.
Figure 3. Two-dimensional plot of the first and second principal components of the unscaled principal component analysis.
Figure 4. Two-dimensional plot of the first and second principal components of the quantile-scaled principal component analysis.
The first and second principal components in Figures 3 and 4 are the two main dimensions of variation of unscaled (Figure 3) and scaled (Figure 4) data, showing separation of two classes of people (0/1: with or without osteoarthritis).
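The comparison between unscaled and quantile-scaled PCA can be illustrated with a minimal sketch. It is not the authors' pipeline: the data are synthetic, and the helper names `quantile_scale` and `first_two_components` are hypothetical. The quantile scaling is approximated here by a simple rank transform to the interval (0, 1), which is an assumption about what "quantile scaled" means in practice.

```python
import numpy as np

def quantile_scale(X):
    """Map each column to (0, 1) using its empirical ranks (assumed scaling)."""
    n = X.shape[0]
    ranks = X.argsort(axis=0).argsort(axis=0)  # rank 0..n-1 within each column
    return (ranks + 0.5) / n

def first_two_components(X):
    """Project the rows of X onto the first two principal components via SVD."""
    Xc = X - X.mean(axis=0)                    # center each feature
    _, _, vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ vt[:2].T

rng = np.random.default_rng(0)
# Synthetic stand-in data: three features on very different scales.
X = rng.normal(size=(200, 3)) * np.array([1.0, 50.0, 1000.0])

unscaled_pcs = first_two_components(X)               # dominated by the largest-scale feature
scaled_pcs = first_two_components(quantile_scale(X)) # all features contribute comparably
```

Without scaling, the feature with the largest variance dominates the first component, which is one common reason to compare scaled and unscaled projections side by side.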
List of the logistic regression coefficients (independent variables) associated with osteoarthritis.
| Independent variable | Coefficient | Odds ratio | Standard error | z value | Pr(>\|z\|) |
|---|---|---|---|---|---|
| Intercept | –7.65 | 0.01 | 0.23 | –33.25 | <2.2e–16 |
| Female gender | 0.43 | 1.54 | 0.06 | 7.16 | 8.049e–13 |
| Age | 0.08 | 1.09 | 0.01 | 21.31 | <2.2e–16 |
| Race/ethnicity | 0.06 | 1.06 | 0.02 | 2.56 | 0.01043 |
| Poverty income ratio | –0.01 | 1.00 | 0.01 | –0.65 | 0.51325 |
| Body mass index | 0.04 | 1.05 | 0.01 | 10.16 | <2.2e–16 |
| Cigarette smoking | 0.19 | 1.21 | 0.03 | 5.51 | 3.476e–08 |
| High blood pressure | 0.24 | 1.28 | 0.05 | 4.71 | 2.461e–06 |
| Physical and mental limitation | –0.03 | 0.97 | 0.01 | –3.97 | 6.914e–05 |
Logistic regression: Female gender, smoking, race/ethnicity (Mexican American < Other Hispanic < Non-Hispanic White < Non-Hispanic Black < Other race), increasing age, body mass index, and high blood pressure (positive coefficients), and decreasing poverty income ratio and physical and mental limitation (negative coefficients) are predictive factors of osteoarthritis (see the 'Coefficient', or estimate, column). As an example, being female increases the log odds, ln(p/(1−p)), by 0.43, which multiplies the odds by 1.54 (where p = probability of developing osteoarthritis), while each unit increase in poverty income ratio decreases the log odds by 0.01, leaving the odds essentially unchanged (odds ratio 1.00). The 'Pr(>|z|)' column at the far right of the table gives the p-value for each parameter as a predictor of osteoarthritis. Female gender, age, race/ethnicity, body mass index, cigarette smoking, high blood pressure, and physical and mental limitation are statistically significant predictors of osteoarthritis (p-value <0.05).
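The relationship between the coefficient, odds ratio, and z value columns can be checked with a short sketch. The coefficients and standard error below are copied from the table; the function names `odds_ratio` and `z_value` are illustrative. Because the table reports values rounded to two decimals, the recomputed figures may differ from the published ones in the last digit.

```python
import math

# Coefficients on the log-odds scale, copied from the logistic regression table.
coefficients = {
    "Female gender": 0.43,
    "Age": 0.08,
    "Body mass index": 0.04,
    "Cigarette smoking": 0.19,
    "High blood pressure": 0.24,
}

def odds_ratio(coef):
    """Odds ratio for a one-unit increase: exp(coefficient)."""
    return math.exp(coef)

def z_value(coef, std_err):
    """Wald z statistic: coefficient divided by its standard error."""
    return coef / std_err

odds_ratios = {name: round(odds_ratio(c), 2) for name, c in coefficients.items()}
female_z = z_value(0.43, 0.06)  # close to the tabulated 7.16, up to rounding
```

A coefficient of 0.43 therefore adds 0.43 to the log odds and multiplies the odds by about 1.54; it does not multiply the log odds.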
Figure 5. Receiver operating characteristic curve and confusion matrix.
Confusion matrix: [1879 776]-[68 147]; true negative = 1879; true positive = 147; false negative = 68; false positive = 776.
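As a worked check, accuracy, sensitivity, and specificity can be derived directly from the four counts in the confusion matrix above. The variable names are illustrative; the counts are those given in the caption.

```python
# Counts from the confusion matrix caption.
tn, fp = 1879, 776   # actual negatives: true negatives, false positives
fn, tp = 68, 147     # actual positives: false negatives, true positives

total = tn + fp + fn + tp

accuracy = (tp + tn) / total      # share of all cases classified correctly
sensitivity = tp / (tp + fn)      # share of osteoarthritis cases detected
specificity = tn / (tn + fp)      # share of non-cases correctly excluded

print(f"accuracy={accuracy:.1%} sensitivity={sensitivity:.1%} specificity={specificity:.1%}")
```

Rounded to whole percentages, these counts give 71% accuracy, 68% sensitivity, and 71% specificity, which coincide with the DNN row in the metrics comparison tables.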
Figure 6. Training versus validation loss curves.
Metrics comparison of LRC, DNN, and SVM algorithms (different input features).
| Algorithm | Accuracy (%) | Sensitivity (%) | Specificity (%) |
|---|---|---|---|
| LRC | 78 | 50 | 80 |
| SVM | 56 | 80 | 55 |
| DNN | 71 | 68 | 71 |
DNN, deep neural network; LRC, logistic regression classifier; SVM, support vector machine.
Metrics comparison of LRC, DNN, and SVM algorithms (same PCA input features).
| Algorithm | Accuracy (%) | Sensitivity (%) | Specificity (%) |
|---|---|---|---|
| LRC | 67.6 | 69.3 | 67.4 |
| SVM | 62.3 | 68.9 | 61.8 |
| DNN | 71 | 68 | 71 |
DNN, deep neural network; LRC, logistic regression classifier; SVM, support vector machine.