Abstract
In regression modelling, non-linear relationships between explanatory variables and the outcome are often effectively modelled using restricted cubic splines (RCS). We focus on situations where the values of the outcome change periodically over time, and we define an extension of RCS that accounts for periodicity by introducing numerical constraints. Practical examples include the estimation of seasonal variations, a common aim in virological research, or the study of hormonal fluctuations within the menstrual cycle. Using real and simulated data with binary outcomes, we show that periodic RCS can perform better than other methods proposed for periodic data. They greatly reduce the variability of the estimates obtained at the extremes of the period compared to cubic spline methods and require the estimation of fewer parameters; cosinor models perform similarly to the best cubic spline model and their estimates are generally less variable, but only if an appropriate number of harmonics is used. Periodic RCS provide a useful extension of RCS for periodic data when the assumption of equality of the outcome at the beginning and end of the period is scientifically sensible. The implementation of periodic RCS is freely available in the peRiodiCS R package, and the paper presents examples of their usage for modelling the seasonal occurrence of viruses.
Year: 2020 PMID: 33112926 PMCID: PMC7592770 DOI: 10.1371/journal.pone.0241364
Source DB: PubMed Journal: PLoS One ISSN: 1932-6203 Impact factor: 3.240
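The abstract notes that cosinor models work well only with an appropriate number of harmonics. As a minimal sketch of what that means, the snippet below builds the sine and cosine terms of a cosinor basis. The function name `cosinor_basis` and the 52-week period are illustrative assumptions, not taken from the paper or from the peRiodiCS package.

```python
import math


def cosinor_basis(t, period, harmonics=1):
    """Return [sin(2*pi*k*t/period), cos(2*pi*k*t/period)] for k = 1..harmonics."""
    row = []
    for k in range(1, harmonics + 1):
        angle = 2.0 * math.pi * k * t / period
        row.extend([math.sin(angle), math.cos(angle)])
    return row


# Assumed setting: week of the year as the periodic predictor.
PERIOD = 52.0
one_h = cosinor_basis(10.0, PERIOD, harmonics=1)  # 2 regression terms
two_h = cosinor_basis(10.0, PERIOD, harmonics=2)  # 4 regression terms

# The basis takes the same values at the start and at the end of the period,
# so any linear predictor built on it is periodic by construction.
start = cosinor_basis(0.0, PERIOD, harmonics=2)
end = cosinor_basis(PERIOD, PERIOD, harmonics=2)
max_gap = max(abs(a - b) for a, b in zip(start, end))
```

Each harmonic contributes two terms, which matches the 2 and 4 parameters listed for the cosinor and cosinor(2h) models in the tables.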
Fig 2. Simulation results for the alternative cases.
Each row shows the results from one of the three alternative simulation settings; each column refers to one of the models. The black curves show the probabilities from the model generating the data; the average estimates obtained using the five models are shown with gray lines; dashed lines are the average limits of the 95% confidence intervals. Spline models were fitted using 5 parameters. See methods for details on the simulation settings.
Fig 1. Estimated probability of virus positivity for 7 different viruses using RCS, periodic RCS, periodic CS, and cosinor models with one and two harmonics, using data from Horton et al.
The step functions are the observed weekly proportions of virus-positive samples; the lines are the probabilities estimated using the 5 different models.
Analysis of the complete dataset of Horton et al.
Estimates are obtained with repeated 10-fold cross-validation.
| Virus | Method | Parameters | Knots | Brier score | c index | Cal. intercept | Cal. slope |
|---|---|---|---|---|---|---|---|
| RSV (n = 24503, 16.1% Pos) | RCS | 9 | 10 | 0.127 | 0.680 | -0.005 | 0.997 |
| | RCS Per | 7 | 10 | 0.127 | 0.680 | -0.002 | 0.999 |
| | CS Per | 10 | 10 | 0.127 | 0.679 | -0.007 | 0.996 |
| | cosinor | 2 | | 0.127 | 0.678 | 0.004 | 1.004 |
| | cosinor(2h) | 4 | | 0.127 | 0.677 | 0.004 | 1.002 |
| AdV (n = 9402, 9.8% Pos) | RCS | 4 | 5 | 0.088 | 0.574 | -0.068 | 0.970 |
| | RCS Per | 2 | 5 | 0.088 | 0.563 | -0.007 | 0.998 |
| | CS Per | 8 | 8 | 0.088 | 0.574 | -0.219 | 0.901 |
| | cosinor | 2 | | 0.089 | 0.574 | -0.010 | 0.997 |
| | cosinor(2h) | 4 | | 0.089 | 0.572 | -0.098 | 0.957 |
| hMPV (n = 9384, 6.6% Pos) | RCS | 8 | 9 | 0.059 | 0.739 | -0.009 | 0.999 |
| | RCS Per | 6 | 9 | 0.059 | 0.738 | 0.013 | 1.009 |
| | CS Per | 7 | 7 | 0.059 | 0.737 | 0.004 | 1.005 |
| | cosinor | 2 | | 0.059 | 0.726 | 0.056 | 1.028 |
| | cosinor(2h) | 4 | | 0.059 | 0.725 | 0.022 | 1.012 |
| hPIV1 (n = 9402, 1.7% Pos) | RCS | 4 | 5 | 0.017 | 0.707 | 0.008 | 1.012 |
| | RCS Per | 7 | 10 | 0.017 | 0.696 | -0.194 | 0.957 |
| | CS Per | 8 | 8 | 0.017 | 0.695 | -0.257 | 0.940 |
| | cosinor | 2 | | 0.017 | 0.683 | 0.125 | 1.043 |
| | cosinor(2h) | 4 | | 0.017 | 0.668 | -0.104 | 0.981 |
| hPIV2 (n = 9402, 0.9% Pos) | RCS | 4 | 5 | 0.009 | 0.699 | -0.105 | 0.986 |
| | RCS Per | 2 | 5 | 0.009 | 0.709 | 0.089 | 1.030 |
| | CS Per | 5 | 5 | 0.009 | 0.704 | -0.207 | 0.963 |
| | cosinor | 2 | | 0.009 | 0.714 | 0.314 | 1.087 |
| | cosinor(2h) | 4 | | 0.009 | 0.712 | 0.069 | 1.031 |
| hPIV3 (n = 9402, 3.9% Pos) | RCS | 7 | 8 | 0.037 | 0.675 | -0.091 | 0.974 |
| | RCS Per | 5 | 8 | 0.037 | 0.672 | -0.058 | 0.985 |
| | CS Per | 10 | 10 | 0.037 | 0.670 | -0.175 | 0.946 |
| | cosinor | 2 | | 0.037 | 0.667 | 0.008 | 1.006 |
| | cosinor(2h) | 4 | | 0.037 | 0.670 | -0.048 | 0.988 |
| INF (n = 28438, 11.8% Pos) | RCS | 9 | 10 | 0.098 | 0.674 | -0.014 | 0.993 |
| | RCS Per | 7 | 10 | 0.099 | 0.672 | -0.010 | 0.995 |
| | CS Per | 10 | 10 | 0.098 | 0.676 | -0.015 | 0.992 |
| | cosinor | 2 | | 0.099 | 0.663 | 0.007 | 1.004 |
| | cosinor(2h) | 4 | | 0.098 | 0.665 | -0.002 | 1.000 |
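Two of the performance measures used in these tables, the Brier score and the c index, can be computed directly from held-out outcomes and predicted probabilities. The sketch below uses their standard definitions on a small made-up example. The data and function names are illustrative assumptions and do not reproduce the authors' code.

```python
def brier_score(y, p_hat):
    """Mean squared difference between predicted probability and 0/1 outcome."""
    return sum((p - yi) ** 2 for yi, p in zip(y, p_hat)) / len(y)


def c_index(y, p_hat):
    """Share of (event, non-event) pairs ranked correctly; ties count one half."""
    events = [p for yi, p in zip(y, p_hat) if yi == 1]
    non_events = [p for yi, p in zip(y, p_hat) if yi == 0]
    concordant = 0.0
    for pe in events:
        for pn in non_events:
            if pe > pn:
                concordant += 1.0
            elif pe == pn:
                concordant += 0.5
    return concordant / (len(events) * len(non_events))


# Made-up held-out outcomes and predicted probabilities (illustration only).
y = [0, 0, 1, 0, 1, 1]
p_hat = [0.1, 0.3, 0.35, 0.6, 0.7, 0.9]

bs = brier_score(y, p_hat)  # lower is better
auc = c_index(y, p_hat)     # 0.5 means no discrimination, 1 is perfect
```

For a binary outcome the c index equals the area under the ROC curve, which is why later tables label the same quantity AUC.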
Repeated analysis of subsets of 500 units; the number of knots is based on AIC; estimates (Brier score, c index, calibration intercept and slope) are obtained on the data not included in the model estimation, power is evaluated on training data.
| Virus | Method | Parameters | Knots | Brier score | c index | Calibration intercept | Calibration slope | Power |
|---|---|---|---|---|---|---|---|---|
| RSV | RCS | 4 | 5 | 0.1291 | 0.667 | -0.005 | 0.786 | 1.00 |
| | RCS Per | 4 | 7 | 0.1284 | 0.667 | -0.004 | 0.816 | 1.00 |
| | CS Per | 3 | 3 | 0.1285 | 0.670 | -0.007 | 0.857 | 1.00 |
| | cosinor | 2 | | 0.1277 | 0.674 | 0.007 | 0.993 | 1.00 |
| | cosinor(2h) | 4 | | 0.1282 | 0.668 | -0.008 | 0.885 | 1.00 |
| AdV | RCS | 2 | 3 | 0.0889 | 0.560 | -0.001 | 0.477 | 0.38 |
| | RCS Per | 2 | 5 | 0.0891 | 0.546 | 0.004 | 0.435 | 0.43 |
| | CS Per | 3 | 3 | 0.0892 | 0.543 | 0.004 | 0.399 | 0.32 |
| | cosinor | 2 | | 0.0890 | 0.556 | 0.003 | 0.851 | 0.37 |
| | cosinor(2h) | 4 | | 0.0895 | 0.542 | 0.003 | 0.443 | 0.27 |
| hMPV | RCS | 5 | 6 | 0.0600 | 0.715 | -0.020 | 0.664 | 1.00 |
| | RCS Per | 4 | 7 | 0.0597 | 0.721 | 0.009 | 0.734 | 1.00 |
| | CS Per | 6 | 6 | 0.0600 | 0.714 | -0.085 | 0.546 | 0.99 |
| | cosinor | 2 | | 0.0597 | 0.723 | 0.020 | 0.927 | 0.99 |
| | cosinor(2h) | 2 | | 0.0599 | 0.720 | 0.032 | 0.774 | 0.98 |
| hPIV3 | RCS | 2 | 4 | 0.0374 | 0.652 | -0.020 | 0.550 | 0.65 |
| | RCS Per | 2 | 5 | 0.0374 | 0.661 | -0.000 | 0.666 | 0.76 |
| | CS Per | 4 | 4 | 0.0375 | 0.648 | -0.055 | 0.465 | 0.64 |
| | cosinor | 2 | | 0.0370 | 0.649 | 0.043 | 0.880 | 0.63 |
| | cosinor(2h) | 4 | | 0.0372 | 0.638 | 0.043 | 0.692 | 0.51 |
| INF | RCS | 3 | 4 | 0.0998 | 0.656 | -0.007 | 0.771 | 0.97 |
| | RCS Per | 5 | 8 | 0.1001 | 0.655 | -0.005 | 0.699 | 0.94 |
| | CS Per | 4 | 4 | 0.0995 | 0.654 | -0.006 | 0.767 | 0.97 |
| | cosinor | 2 | | 0.1000 | 0.655 | 0.001 | 0.999 | 0.91 |
| | cosinor(2h) | 4 | | 0.0995 | 0.657 | -0.001 | 0.859 | 0.96 |
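Calibration slopes well below 1, as seen for several spline models fitted on 500 units, indicate predictions that are too extreme on new data. One common way to obtain a calibration intercept and slope is to regress the held-out outcome on the logit of the predicted probability with a logistic model. The sketch below assumes that definition; this excerpt does not say exactly how the authors estimated these quantities. The function names and data are hypothetical.

```python
import math


def logit(p):
    """Log-odds of a probability strictly between 0 and 1."""
    return math.log(p / (1.0 - p))


def fit_calibration(y, p_hat, iterations=25):
    """Fit logit(P(y = 1)) = a + b * logit(p_hat) by Newton-Raphson; return (a, b)."""
    x = [logit(p) for p in p_hat]
    a, b = 0.0, 1.0  # start from perfect calibration
    for _ in range(iterations):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for xi, yi in zip(x, y):
            mu = 1.0 / (1.0 + math.exp(-(a + b * xi)))
            w = mu * (1.0 - mu)
            g0 += yi - mu           # score for the intercept
            g1 += (yi - mu) * xi    # score for the slope
            h00 += w                # information matrix entries
            h01 += w * xi
            h11 += w * xi * xi
        det = h00 * h11 - h01 * h01
        a += (h11 * g0 - h01 * g1) / det
        b += (h00 * g1 - h01 * g0) / det
    return a, b


# Made-up data: three groups of 10 units with observed event rates
# 0.2, 0.5 and 0.8, but predictions 0.1, 0.5 and 0.9 (too extreme).
y = [1] * 2 + [0] * 8 + [1] * 5 + [0] * 5 + [1] * 8 + [0] * 2
p_hat = [0.1] * 10 + [0.5] * 10 + [0.9] * 10

intercept, slope = fit_calibration(y, p_hat)
```

In this toy example the fitted slope is below 1 because the predicted probabilities are more extreme than the observed event rates, the same pattern that an overfitted model shows on new data.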
Simulation results from the alternative and null case; the probabilities in the alternative case are simulated from a sine function, shown in Fig 2.
| | RCS | RCS Periodic | CS Periodic | RCS | RCS Periodic | CS Periodic | cosinor | cosinor(2h) |
|---|---|---|---|---|---|---|---|---|
| Knots | 5 | 5 | 5 | 4 | 6 | 3 | | |
| Parameters | 4 | 2 | 5 | 3 | 3 | 3 | 2 | 4 |
| Alternative case | | | | | | | | |
| Power (LRT) | 0.847 | 0.912 | 0.814 | 0.873 | 0.878 | 0.874 | 0.913 | 0.880 |
| Power (Score) | 0.836 | 0.909 | 0.791 | 0.867 | 0.872 | 0.865 | 0.910 | 0.880 |
| Brier Train | 0.208 | 0.213 | 0.206 | 0.210 | 0.210 | 0.210 | 0.212 | 0.204 |
| Brier Test | 0.231 | 0.226 | 0.233 | 0.228 | 0.227 | 0.228 | 0.226 | 0.235 |
| AUC Train | 0.721 | 0.709 | 0.728 | 0.716 | 0.717 | 0.713 | 0.710 | 0.734 |
| AUC Test | 0.678 | 0.693 | 0.673 | 0.687 | 0.690 | 0.684 | 0.694 | 0.671 |
| Calibration Intercept | 0.001 | -0.001 | 0.002 | -0.006 | -0.004 | 0.001 | -0.001 | 0.027 |
| Calibration Slope | 0.822 | 0.998 | 0.754 | 0.909 | 0.920 | 0.901 | 0.998 | 0.734 |
| Coverage of 95% CI | 0.957 | 0.951 | 0.959 | 0.951 | 0.955 | 0.952 | 0.955 | 0.956 |
| Length of 95% CI | 0.956 | 0.749 | 1.082 | 0.839 | 0.872 | 0.864 | 0.749 | 0.996 |
| Null case: size of the test | | | | | | | | |
| LRT | 0.059 | 0.053 | 0.062 | 0.056 | 0.057 | 0.061 | 0.055 | 0.056 |
| Score Test | 0.049 | 0.051 | 0.048 | 0.049 | 0.051 | 0.049 | 0.052 | 0.049 |
Simulation results from the second alternative case, where the probabilities are simulated from a periodic function with two peaks, shown in Fig 2.
| | RCS | RCS Periodic | CS Periodic | RCS | RCS Periodic | CS Periodic | cosinor | cosinor(2h) |
|---|---|---|---|---|---|---|---|---|
| Knots | 4 | 6 | 3 | 6 | 8 | 5 | | |
| Parameters | 3 | 3 | 3 | 5 | 5 | 5 | 2 | 4 |
| Power (LRT) | 0.976 | 0.997 | 0.311 | 1.000 | 1.000 | 1.000 | 0.337 | 1.000 |
| Power (Score) | 0.972 | 0.997 | 0.301 | 1.000 | 1.000 | 0.999 | 0.329 | 1.000 |
| Brier Train | 0.187 | 0.167 | 0.231 | 0.154 | 0.152 | 0.157 | 0.234 | 0.152 |
| Brier Test | 0.203 | 0.183 | 0.250 | 0.179 | 0.175 | 0.181 | 0.248 | 0.175 |
| AUC Train | 0.783 | 0.826 | 0.642 | 0.849 | 0.854 | 0.843 | 0.620 | 0.855 |
| AUC test | 0.764 | 0.806 | 0.583 | 0.815 | 0.826 | 0.808 | 0.583 | 0.829 |
| Calibration Intercept | 0.007 | 0.007 | 0.048 | 0.015 | 0.017 | 0.012 | 0.024 | 0.005 |
| Calibration Slope | 0.943 | 0.958 | 0.563 | 0.855 | 0.862 | 0.857 | 0.814 | 0.879 |
| Coverage of 95% CI | 0.490 | 0.741 | 0.806 | 0.872 | 0.924 | 0.806 | 0.260 | 0.903 |
| Length of 95% CI | 0.954 | 1.128 | 1.319 | 1.409 | 1.523 | 1.319 | 0.709 | 1.241 |
Simulation results from the third alternative, where the probabilities are simulated from a periodic function with complex pattern, shown in Fig 2.
| | RCS | RCS Periodic | CS Periodic | RCS | RCS Periodic | CS Periodic | cosinor | cosinor(2h) |
|---|---|---|---|---|---|---|---|---|
| Knots | 4 | 6 | 3 | 6 | 8 | 5 | | |
| Parameters | 3 | 3 | 3 | 5 | 5 | 5 | 2 | 4 |
| Power (LRT) | 0.787 | 0.677 | 0.588 | 0.749 | 0.752 | 0.805 | 0.595 | 0.773 |
| Power (Score) | 0.747 | 0.661 | 0.567 | 0.691 | 0.721 | 0.759 | 0.586 | 0.748 |
| Brier Train | 0.214 | 0.218 | 0.222 | 0.208 | 0.208 | 0.206 | 0.225 | 0.211 |
| Brier Test | 0.232 | 0.237 | 0.240 | 0.236 | 0.236 | 0.234 | 0.239 | 0.234 |
| AUC Train | 0.687 | 0.682 | 0.666 | 0.708 | 0.715 | 0.714 | 0.656 | 0.703 |
| AUC test | 0.643 | 0.640 | 0.624 | 0.640 | 0.649 | 0.649 | 0.630 | 0.651 |
| Calibration Intercept | 0.024 | 0.040 | 0.045 | 0.053 | 0.052 | 0.045 | 0.013 | 0.040 |
| Calibration Slope | 0.860 | 0.833 | 0.806 | 0.676 | 0.719 | 0.720 | 0.940 | 0.778 |
| Coverage of 95% CI | 0.895 | 0.801 | 0.765 | 0.924 | 0.924 | 0.951 | 0.711 | 0.913 |
| Length of 95% CI | 0.909 | 0.851 | 0.847 | 1.242 | 1.091 | 1.145 | 0.726 | 0.994 |