Jason S Rockel, Weidong Zhang, Konstantin Shestopaloff, Sergei Likhodii, Guang Sun, Andrew Furey, Edward Randell, Kala Sundararajan, Rajiv Gandhi, Guangju Zhai, Mohit Kapoor.
Abstract
Multiple factors, including age, sex, and BMI, and possibly metabolite levels, can help distinguish knee osteoarthritis (OA) patients from healthy individuals. Using plasma from individuals with primary OA undergoing total knee replacement and from healthy volunteers, we measured lysophosphatidylcholine (lysoPC) and phosphatidylcholine (PC) analogues by metabolomics. Populations were stratified on demographic factors, and lysoPC and PC analogue signatures were determined by univariate analysis of the area under the receiver-operator curve (AUC). Using these signatures, multivariate classification modeling was performed with several algorithms to select the most consistent method, as measured by AUC differences between resampled training and test sets. Lists of metabolites indicative of OA [AUC > 0.5] were identified for each stratum. The signature from males > 50 years old encompassed the majority of identified metabolites, suggesting that lysoPCs and PCs are dominant indicators of OA in older males. Principal component regression with logistic regression was the most consistent multivariate classification algorithm tested. Using this algorithm, classification of older males had fair power to distinguish OA patients from healthy individuals. Thus, individual levels of lysoPC and PC analogues may be indicative of OA in older populations, particularly males. Our metabolite signature modeling method is likely to increase classification power in validation cohorts.
Year: 2018 PMID: 29958292 PMCID: PMC6025859 DOI: 10.1371/journal.pone.0199618
Source DB: PubMed Journal: PLoS One ISSN: 1932-6203 Impact factor: 3.240
Table 1. Demographics of the Newfoundland cohort stratified groups, consisting of healthy volunteers (HV) and patients undergoing total knee replacement for osteoarthritis (OA).
| Stratum | Total (n) | OA (n) | HV (n) | Females (n) | Females OA (n) | Females HV (n) | P-value Males:Females OA vs. HV | Age ± SD | Age OA ± SD | Age HV ± SD | P-value Age OA vs. HV | BMI ± SD | BMI OA ± SD | BMI HV ± SD | P-value BMI OA vs. HV |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| All | 346 | 152 | 194 | 187 | 77 | 110 | 0.312 | 56.1 ± 12.8 | 63.8 ± 7.5 | 50.0 ± 12.8 | < 0.001 | 30.0 ± 5.4 | 31.8 ± 5.6 | 28.6 ± 4.9 | < 0.001 |
| Males | 159 | 75 | 84 | | | | | 57.3 ± 12.1 | 63.6 ± 7.9 | 51.7 ± 12.4 | < 0.001 | 29.7 ± 4.7 | 31.1 ± 4.9 | 28.5 ± 4.2 | < 0.001 |
| Females | 187 | 77 | 110 | | | | | 55.0 ± 13.3 | 64.1 ± 7.1 | 48.7 ± 12.9 | < 0.001 | 30.3 ± 6.0 | 32.5 ± 6.1 | 28.7 ± 5.4 | < 0.001 |
| BMI ≥ 30 | 172 | 89 | 83 | 98 | 47 | 51 | 0.323 | 56.0 ± 12.1 | 62.9 ± 5.9 | 48.5 ± 12.7 | < 0.001 | 34.0 ± 4.2 | 35.0 ± 5.0 | 33.0 ± 2.7 | 0.002 |
| BMI < 30 | 174 | 63 | 111 | 89 | 30 | 59 | 0.586 | 56.2 ± 13.4 | 65.2 ± 9.1 | 51.1 ± 12.8 | < 0.001 | 26.0 ± 3.2 | 27.4 ± 2.4 | 25.3 ± 3.3 | < 0.001 |
| Age > 50 | 250 | 148 | 102 | 128 | 76 | 52 | 1.000 | 62.5 ± 6.9 | 64.4 ± 6.7 | 59.9 ± 6.2 | < 0.001 | 30.7 ± 5.2 | 32.0 ± 5.5 | 28.8 ± 3.8 | < 0.001 |
| Males Age > 50 | 122 | 72 | 50 | | | | | 62.6 ± 7.0 | 64.5 ± 6.5 | 59.8 ± 6.8 | < 0.001 | 30.3 ± 4.4 | 31.2 ± 4.9 | 29.1 ± 3.0 | 0.005 |
| Females Age > 50 | 128 | 76 | 52 | | | | | 62.5 ± 6.7 | 64.3 ± 6.9 | 59.9 ± 5.6 | < 0.001 | 31.0 ± 5.8 | 32.7 ± 6.0 | 28.5 ± 4.5 | < 0.001 |
| Age > 50, BMI ≥ 30 | 131 | 89 | 42 | 70 | 47 | 23 | 0.983 | 61.5 ± 6.5 | 62.9 ± 5.9 | 58.6 ± 6.7 | 0.001 | 34.1 ± 4.4 | 35.0 ± 5.0 | 32.2 ± 2.1 | < 0.001 |
| Age > 50, BMI < 30 | 119 | 59 | 60 | 58 | 29 | 29 | 1.000 | 63.7 ± 7.1 | 66.6 ± 7.2 | 60.8 ± 5.7 | < 0.001 | 26.9 ± 2.6 | 27.4 ± 2.3 | 26.4 ± 2.8 | 0.031 |
Differences in the ratio of males to females, age, or BMI between the OA and HV groups within each stratified group were assessed by chi-square tests. P-values < 0.05 are considered significant. Number of individuals (n); age (in years); body mass index (BMI; in kg/m²); SD, standard deviation.
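As a rough check on the sex-ratio column, the sketch below runs a 2×2 chi-square test on the male and female counts for OA vs. HV in the "All" and "Age > 50" strata. It assumes a continuity (Yates) correction capped at |O - E|, which is a guess about the software defaults rather than something the table states. The function name `chi2_2x2` is illustrative, and male counts are derived as total minus females.

```python
import math

def chi2_2x2(a, b, c, d):
    """P-value of a 2x2 chi-square test with a capped Yates correction (assumed)."""
    n = a + b + c + d
    rows = (a + b, c + d)
    cols = (a + c, b + d)
    observed = ((a, b), (c, d))
    stat = 0.0
    for i in range(2):
        for j in range(2):
            expected = rows[i] * cols[j] / n
            gap = abs(observed[i][j] - expected)
            gap -= min(0.5, gap)  # continuity correction, never pushed below zero
            stat += gap * gap / expected
    # Survival function of the chi-square distribution with 1 degree of freedom.
    return math.erfc(math.sqrt(stat / 2.0))

# Rows: OA, HV. Columns: males, females (males = total - females, from the table).
p_all = chi2_2x2(152 - 77, 77, 194 - 110, 110)
p_over50 = chi2_2x2(148 - 76, 76, 102 - 52, 52)
print(f"All: p = {p_all:.3f}")
print(f"Age > 50: p = {p_over50:.3f}")
```

Under these assumptions the two values come out close to the 0.312 and 1.000 listed for those strata, which suggests (but does not prove) that a continuity-corrected test was used.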
Fig 1. A stepwise approach to metabolite signature identification and predictive model optimization using stratified populations from a single cohort.
The following stepwise approach includes data from the age > 50 years stratified population and is representative of results generated for each subpopulation. AUC, area under the curve; lysophosphatidylcholine (lysoPC); diacyl-phosphatidylcholine (PCaa); acyl-alkylphosphatidylcholine (PCae); partial least squares with logistic regression (PLS); principal component analysis with logistic regression (PCR).
Fig 2. A discrete lysoPC and PC signature of metabolites from males over the age of 50 was dominant in individuals over the age of 50 years and was indicative of males with OA versus HV.
(A) Heat-map of the stratified cohort of individuals over the age of 50 years, separated by sex and by total knee replacement due to osteoarthritis (OA) vs. healthy adult volunteers (HV). (B & C) Venn diagrams generated from metabolite signatures (Table 2) from males, females and all individuals over the age of 50 years (B) or from males, individuals with body mass index (BMI) ≥ 30 kg/m² or BMI < 30 kg/m² (C). (D & E) AUC curves generated by principal component with logistic regression (PCR) modeling using the metabolite signature (D) or the aggregate sum of lysophosphatidylcholine (lysoPC), diacyl-phosphatidylcholine (PCaa) and acyl-alkylphosphatidylcholine (PCae) analogues (E) from the male age > 50 years stratified population. Blue lines represent training set area under the curve (AUC). Red lines represent test set AUC. Dotted lines are 95% confidence intervals.
Table 2. Metabolites with a 2.5% quantile area under the receiver-operator curve ≥ 0.5, determined by bootstrapped logistic regression of the stratified study population described in Table 1.
| Metabolite | All | Age > 50 | Males | Age > 50 Males | Females | Age > 50 Females | BMI ≥ 30 | Age > 50 BMI ≥ 30 | BMI < 30 | Age > 50 BMI < 30 |
|---|---|---|---|---|---|---|---|---|---|---|
| lysoPCaC16:0 | 0.66 | |||||||||
| lysoPCaC28:1 | 0.6 | 0.64 | 0.64 | 0.67 | 0.66 | |||||
| PCaaC28:1 | 0.62 | 0.65 | ||||||||
| PCaaC32:3 | 0.61 | 0.67 | 0.69 | 0.70 | 0.66 | 0.62 | 0.69 | 0.64 | ||
| PCaaC34:3 | 0.62 | 0.65 | 0.68 | |||||||
| PCaaC36:0 | 0.6 | 0.62 | 0.63 | 0.66 | ||||||
| PCaaC36:2 | 0.65 | |||||||||
| PCaaC36:5 | 0.61 | 0.69 | ||||||||
| PCaaC36:6 | 0.59 | 0.64 | 0.64 | 0.70 | ||||||
| PCaaC38:0 | 0.59 | 0.64 | 0.67 | 0.72 | ||||||
| PCaaC38:5 | 0.61 | 0.64 | 0.68 | |||||||
| PCaaC38:6 | 0.61 | 0.65 | ||||||||
| PCaaC40:1 | 0.61 | 0.63 | 0.67 | |||||||
| PCaaC40:2 | 0.65 | |||||||||
| PCaaC40:6 | 0.65 | |||||||||
| PCaaC42:0 | 0.62 | |||||||||
| PCaaC42:2 | 0.59 | 0.63 | 0.66 | 0.65 | ||||||
| PCaaC42:5 | 0.66 | |||||||||
| PCaeC30:1 | 0.64 | 0.68 | ||||||||
| PCaeC30:2 | 0.59 | 0.63 | 0.65 | 0.68 | ||||||
| PCaeC32:2 | 0.6 | 0.65 | 0.65 | 0.69 | 0.68 | |||||
| PCaeC34:0 | 0.6 | |||||||||
| PCaeC34:1 | 0.61 | |||||||||
| PCaeC34:2 | 0.62 | 0.65 | ||||||||
| PCaeC34:3 | 0.62 | 0.64 | ||||||||
| PCaeC36:2 | 0.59 | 0.64 | 0.62 | 0.67 | 0.66 | |||||
| PCaeC36:3 | 0.59 | 0.64 | 0.63 | 0.66 | 0.67 | |||||
| PCaeC38:0 | 0.61 | 0.66 | 0.68 | 0.74 | 0.67 | |||||
| PCaeC38:1 | 0.61 | 0.64 | ||||||||
| PCaeC38:2 | 0.6 | 0.66 | 0.65 | 0.69 | 0.67 | |||||
| PCaeC38:3 | 0.61 | 0.65 | ||||||||
| PCaeC38:5 | 0.6 | |||||||||
| PCaeC38:6 | 0.6 | 0.65 | 0.66 | 0.72 | 0.64 | |||||
| PCaeC40:1 | 0.63 | 0.65 | 0.67 | |||||||
| PCaeC40:2 | 0.61 | 0.67 | ||||||||
| PCaeC40:5 | 0.61 | 0.63 | 0.63 | 0.67 | 0.67 | |||||
| PCaeC40:6 | 0.62 | 0.67 | 0.64 | 0.7 | 0.67 | 0.63 | ||||
| PCaeC42:2 | 0.61 | |||||||||
| PCaeC42:3 | 0.61 | 0.64 |
Age (in years), body mass index (BMI; in kg/m²), lysophosphatidylcholine (lysoPC), diacyl PC (PCaa), acyl-alkyl PC (PCae).
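The selection rule in the caption (keep a metabolite when the 2.5% bootstrap quantile of its AUC is at least 0.5) can be sketched as follows. This is a minimal illustration on synthetic numbers, not the authors' pipeline. It uses the rank-based (Mann-Whitney) AUC of the raw concentration in place of a fitted univariate logistic model, which gives the same in-sample AUC whenever the fitted slope is positive, and it resamples the OA and HV groups separately. The function names, the number of resamples, and the data are all assumptions.

```python
import random

def auc(cases, controls):
    """Rank-based AUC: P(case > control), counting ties as one half."""
    wins = 0.0
    for x in cases:
        for y in controls:
            if x > y:
                wins += 1.0
            elif x == y:
                wins += 0.5
    return wins / (len(cases) * len(controls))

def bootstrap_auc(cases, controls, n_boot=500, seed=1):
    """Return the (2.5%, 50%, 97.5%) quantiles of the bootstrapped AUC."""
    rng = random.Random(seed)
    values = sorted(
        auc(rng.choices(cases, k=len(cases)),
            rng.choices(controls, k=len(controls)))
        for _ in range(n_boot)
    )
    def pick(q):
        # Simple order-statistic quantile on the sorted bootstrap values.
        return values[min(n_boot - 1, int(q * n_boot))]
    return pick(0.025), pick(0.5), pick(0.975)

# Synthetic concentrations (hypothetical, for illustration only).
hv = [float(v) for v in range(0, 40)]    # healthy volunteers
oa = [float(v) for v in range(20, 60)]   # OA patients, shifted upward

point = auc(oa, hv)
low, median, high = bootstrap_auc(oa, hv)
keep = low >= 0.5  # the caption's criterion for inclusion in a signature
print(f"AUC = {point:.3f}, 2.5% quantile = {low:.3f}, keep = {keep}")
```

Repeating this per metabolite and per stratum would yield a table of the same shape as the one above, with a cell filled only where `keep` is true.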
Table 3. Area under the receiver-operator curve (AUC) values at the 2.5%, 50% and 97.5% quantiles, generated from bootstrapped multivariate analysis of the metabolites found to be predictive in univariate analysis of the stratified groups of study participants described in Table 1.
| Model | All Train | All Test | All Difference | Age > 50 Train | Age > 50 Test | Age > 50 Difference | Males Train | Males Test | Males Difference | Males Age > 50 Train | Males Age > 50 Test | Males Age > 50 Difference | Age > 50, BMI ≥ 30 Train | Age > 50, BMI ≥ 30 Test | Age > 50, BMI ≥ 30 Difference | Age > 50, BMI < 30 Train | Age > 50, BMI < 30 Test | Age > 50, BMI < 30 Difference | Mean Absolute Difference (quantile) | Mean Absolute Difference (all) |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| pls 2.5% | 0.595 | 0.563 | 0.032 | 0.630 | 0.589 | 0.041 | 0.647 | 0.588 | 0.059 | 0.684 | 0.625 | 0.059 | 0.645 | 0.573 | 0.072 | 0.617 | 0.562 | 0.055 | 0.053 | |
| pls 50% | 0.649 | 0.641 | 0.008 | 0.693 | 0.678 | 0.015 | 0.724 | 0.703 | 0.021 | 0.768 | 0.751 | 0.017 | 0.739 | 0.710 | 0.029 | 0.705 | 0.689 | 0.016 | 0.018 | 0.028 |
| pls 97.5% | 0.709 | 0.715 | -0.006 | 0.753 | 0.762 | -0.009 | 0.791 | 0.812 | -0.021 | 0.844 | 0.859 | -0.015 | 0.827 | 0.836 | -0.009 | 0.786 | 0.810 | -0.024 | 0.014 | |
| pcr 2.5% | 0.586 | 0.566 | 0.020 | 0.613 | 0.592 | 0.020 | 0.630 | 0.589 | 0.041 | 0.659 | 0.623 | 0.036 | 0.560 | 0.545 | 0.015 | 0.597 | 0.564 | 0.032 | 0.027 | |
| pcr 50% | 0.643 | 0.645 | -0.002 | 0.680 | 0.679 | 0.001 | 0.711 | 0.709 | 0.002 | 0.751 | 0.752 | -0.001 | 0.707 | 0.701 | 0.007 | 0.696 | 0.691 | 0.005 | 0.003 | 0.018 |
| pcr 97.5% | 0.706 | 0.721 | -0.015 | 0.745 | 0.763 | -0.018 | 0.784 | 0.816 | -0.032 | 0.835 | 0.865 | -0.029 | 0.810 | 0.831 | -0.020 | 0.780 | 0.817 | -0.037 | 0.025 | |
| log 2.5% | 0.665 | 0.512 | 0.153 | 0.799 | 0.528 | 0.271 | 0.758 | 0.472 | 0.286 | 0.830 | 0.412 | 0.418 | 0.700 | 0.481 | 0.219 | 0.666 | 0.476 | 0.190 | 0.256 | |
| log 50% | 0.717 | 0.601 | 0.116 | 0.852 | 0.627 | 0.225 | 0.830 | 0.599 | 0.231 | 0.972 | 0.587 | 0.386 | 0.794 | 0.639 | 0.155 | 0.751 | 0.622 | 0.129 | 0.207 | 0.203 |
| log 97.5% | 0.771 | 0.676 | 0.096 | 0.897 | 0.720 | 0.176 | 0.892 | 0.728 | 0.164 | 1.000 | 0.741 | 0.259 | 0.872 | 0.779 | 0.094 | 0.833 | 0.754 | 0.079 | 0.145 | |
| sum pls 2.5% | 0.538 | 0.501 | 0.038 | 0.572 | 0.539 | 0.033 | 0.580 | 0.527 | 0.052 | 0.623 | 0.556 | 0.068 | 0.583 | 0.468 | 0.115 | 0.539 | 0.474 | 0.065 | 0.062 | |
| sum pls 50% | 0.588 | 0.581 | 0.007 | 0.638 | 0.628 | 0.011 | 0.657 | 0.641 | 0.015 | 0.704 | 0.692 | 0.012 | 0.661 | 0.627 | 0.033 | 0.621 | 0.611 | 0.010 | 0.015 | 0.033 |
| sum pls 97.5% | 0.644 | 0.662 | -0.018 | 0.700 | 0.725 | -0.025 | 0.733 | 0.755 | -0.021 | 0.780 | 0.809 | -0.029 | 0.739 | 0.769 | -0.030 | 0.721 | 0.735 | -0.013 | 0.023 | |
| sum pcr 2.5% | 0.514 | 0.498 | 0.016 | 0.553 | 0.536 | 0.017 | 0.557 | 0.527 | 0.030 | 0.598 | 0.553 | 0.044 | 0.514 | 0.432 | 0.082 | 0.521 | 0.480 | 0.042 | 0.039 | |
| sum pcr 50% | 0.577 | 0.578 | -0.001 | 0.630 | 0.625 | 0.005 | 0.648 | 0.640 | 0.008 | 0.691 | 0.689 | 0.003 | 0.611 | 0.604 | 0.008 | 0.616 | 0.615 | 0.000 | 0.004 | 0.023 |
| sum pcr 97.5% | 0.639 | 0.662 | -0.023 | 0.695 | 0.723 | -0.028 | 0.727 | 0.756 | -0.029 | 0.777 | 0.802 | -0.024 | 0.709 | 0.738 | -0.030 | 0.719 | 0.745 | -0.026 | 0.027 | |
| sum log 2.5% | 0.559 | 0.504 | 0.055 | 0.600 | 0.538 | 0.063 | 0.599 | 0.515 | 0.084 | 0.647 | 0.549 | 0.098 | 0.613 | 0.517 | 0.096 | 0.562 | 0.432 | 0.130 | 0.088 | |
| sum log 50% | 0.615 | 0.586 | 0.029 | 0.667 | 0.632 | 0.035 | 0.679 | 0.631 | 0.047 | 0.735 | 0.686 | 0.050 | 0.708 | 0.654 | 0.054 | 0.649 | 0.569 | 0.080 | 0.049 | 0.051 |
| sum log 97.5% | 0.672 | 0.660 | 0.012 | 0.730 | 0.722 | 0.008 | 0.758 | 0.737 | 0.021 | 0.817 | 0.803 | 0.014 | 0.797 | 0.790 | 0.007 | 0.737 | 0.702 | 0.035 | 0.016 | |
AUC values were generated using partial least squares (pls), principal component analysis and logistic regression (pcr), and multivariate logistic regression alone (log). Aggregated lysophosphatidylcholine (lysoPC), diacyl-phosphatidylcholine (PCaa) and acyl-alkylphosphatidylcholine (PCae) concentrations were also modelled (sum) in the same manner for each stratified group. Differences between training and test set AUCs were calculated, and the mean absolute difference across all stratified groups was calculated for each quantile and across all quantiles to identify the model with the least overfitting between bootstrapped training and test sets. Age (in years), body mass index (BMI; in kg/m²).
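The overfitting criterion described above can be reproduced from the Difference columns of the table. The sketch below copies those values for the pls, pcr and log rows (the sum models are omitted for brevity) and computes the mean absolute difference per quantile and across all quantiles. The variable and function names are illustrative, and because the inputs are already rounded to three decimals, the recomputed means may differ from the table in the last digit.

```python
# Train-minus-test AUC differences copied from the table, in column order:
# All, Age > 50, Males, Males Age > 50, Age > 50 BMI >= 30, Age > 50 BMI < 30.
differences = {
    "pls": {
        "2.5%": [0.032, 0.041, 0.059, 0.059, 0.072, 0.055],
        "50%": [0.008, 0.015, 0.021, 0.017, 0.029, 0.016],
        "97.5%": [-0.006, -0.009, -0.021, -0.015, -0.009, -0.024],
    },
    "pcr": {
        "2.5%": [0.020, 0.020, 0.041, 0.036, 0.015, 0.032],
        "50%": [-0.002, 0.001, 0.002, -0.001, 0.007, 0.005],
        "97.5%": [-0.015, -0.018, -0.032, -0.029, -0.020, -0.037],
    },
    "log": {
        "2.5%": [0.153, 0.271, 0.286, 0.418, 0.219, 0.190],
        "50%": [0.116, 0.225, 0.231, 0.386, 0.155, 0.129],
        "97.5%": [0.096, 0.176, 0.164, 0.259, 0.094, 0.079],
    },
}

def mean_abs(values):
    """Mean of the absolute values in a list."""
    return sum(abs(v) for v in values) / len(values)

# Mean absolute difference for each model at each quantile.
per_quantile = {
    model: {q: mean_abs(diffs) for q, diffs in rows.items()}
    for model, rows in differences.items()
}
# Mean absolute difference for each model pooled over all quantiles.
overall = {
    model: mean_abs([v for diffs in rows.values() for v in diffs])
    for model, rows in differences.items()
}
best = min(overall, key=overall.get)  # smallest train/test gap

for model in differences:
    print(model, {q: round(v, 3) for q, v in per_quantile[model].items()},
          round(overall[model], 3))
print("least overfitting:", best)
```

On these numbers the smallest pooled gap belongs to pcr, in line with the table's Mean Absolute Difference (all) column, while logistic regression alone shows a gap roughly ten times larger.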