Germaine Cornelissen. Halberg Chronobiology Center, University of Minnesota, 420 Delaware Street SE, 55455 Minneapolis, MN, USA. corne001@umn.edu.
Abstract
A brief overview is provided of cosinor-based techniques for the analysis of time series in chronobiology. Conceived as a regression problem, the method is applicable to non-equidistant data, a major advantage. Another dividend is the feasibility of deriving confidence intervals for parameters of rhythmic components of known periods, readily drawn from the least squares procedure, stressing the importance of prior (external) information. Originally developed for the analysis of short and sparse data series, the extended cosinor has been further developed for the analysis of long time series, focusing both on rhythm detection and parameter estimation. Attention is given to the assumptions underlying the use of the cosinor and ways to determine whether they are satisfied. In particular, ways of dealing with non-stationary data are presented. Examples illustrate the use of the different cosinor-based methods, extending their application from the study of circadian rhythms to the mapping of broad time structures (chronomes).
Non-random variations are found as a function of time at the cellular level, in tissue culture, as well as in multi-cellular organisms at different levels of physiologic organization [1]. Multi-frequency rhythms usually account for a sizeable portion of the variability [2]. While there is presently much interest in studying circadian rhythms, the biological time structure covers many different ranges of periods beyond the 24-hour day, from fractions of seconds in single neurons to seconds in the cardiac and respiratory cycles, and a few hours in certain endocrine functions. Cycles with periods of about a week, about a month, and about a year are also ubiquitous, as are some other newly discovered cycles with periods of about 5 and 16 months, and much longer periods [3].
The partly built-in nature of circadian rhythms [4,5] is now widely accepted, as is the fact that they are amenable to synchronization by cycles in the environment (e.g., lighting and feeding schedules) [6]. More generally, environmental geophysical cycles such as the day-light cycle, the tides, the phases of the moon, the seasons, as well as a host of other cycles shared between living organisms and the environment in which they evolved, all serve as synchronizers for partly endogenous rhythms [7,8].
The application of chronobiology and its concepts to biology and medicine depends upon the quantitative evaluation of data collected as a function of time. The inclusion of time as a primordial factor in chronobiological investigations broadens the scope of methods for data analysis. The methods presented herein serve the purposes of rhythm detection and parameter estimation, with applications in the early diagnosis of altered rhythm characteristics indicative of a heightened risk, the optimization of treatment by timing, and a wider understanding of how our physiology is influenced by our environment.
Data collection and study design
Before turning to the methodology itself, it is important to consider aspects of data collection and study design [9] bearing on the choice of analytical tools used for data analysis. Biological data (Yi, i = 1, 2, …, N) are typically obtained by having a clock trigger the system (instrument, sensor) to measure a biological variable, yielding a set of data at discrete sampling times (ti, i = 1, 2, …, N). Whether the variable examined is discrete (e.g., mitotic counts) or continuous (e.g., oral temperature), the numerical values attributed to Yi are limited in accuracy and precision by the instrumentation used. Any finite variation of a biological system takes place during a non-zero time interval rather than instantaneously. In terms of data analysis, this means that there is a cut-off frequency fS beyond which the spectrum of the biological variations is practically zero [10]. Whether the transducer used to measure a given biological variable is analog or digital, it takes a certain time for it to respond and deliver a reading, so that rhythms with periods shorter than this response time cannot be assessed, and rhythms with a period close to it will be distorted [10]. In other words, a cut-off frequency fT can be defined as the minimal frequency such that for f > fT the output signal remains practically constant (no variation in the data can be assessed). This means that too dense measurements are redundant and do not bring additional information. In the case of equidistant data obtained at Δt intervals, it has been recommended to choose Δt ≤ 1/(4fT) to assure a good approximation of Y(t) [11].
For chronobiological applications, this sampling requirement (to be able to reconstruct changes as a function of time in the context of the theory of signal processing) is often difficult to meet. Instead, sampling is used in its statistical meaning, where it refers to the selection of a few items from a population to draw inferences for that population.
In terms of a data series, the selection of a sampling interval Δt > 1/fT allows only some features of the biological variations to be assessed. In the absence of external information, for data collected over an observation span T, only oscillations with periods ranging from T down to 2Δt can be assessed. The resolution with which a signal’s period can be determined also depends on T: in frequency terms, the smallest difference in frequency between two distinct signals is 1/T. The highest (no longer assessable) frequency, 1/(2Δt), is called the Nyquist frequency, fN. Within the field of information theory, this is known as the Nyquist-Shannon theorem, which states that if a function Y(t) contains no frequencies higher than fN, it is completely determined by sampling Y(t) at intervals of 1/(2fN).
Ostle [12] defines the design of an experiment as the complete sequence of steps taken ahead of time to ensure that the appropriate data are obtained in a way which allows for an objective analysis leading to valid inferences with respect to the stated problem. These steps include the statement of objectives, the formulation of hypotheses, and the choice of design and experimental procedure and of the statistical methods to be used. Principles underlying experimental designs rely on replication, randomization and control. Replication relates to repeated measurements to obtain an estimate of uncertainty (experimental error or noise) used to derive statistical significance (P-values) and confidence intervals. Noise originates from variations in the biological system considered not to be part of the deterministic portion of the signal, from errors external to the system (errors of experimentation, of observation, and/or of measurement), and from the transducer and sampler (instrumentation error). Reducing the experimental error increases the precision of experiments.
Randomization is an important aspect of study design that allows researchers to proceed as though the assumption of independence of the observation errors is true, which is critical in applying a test of significance. Although randomization cannot guarantee independence, it reduces the correlation that tends to characterize errors associated with experimental material (experimental unit or data) adjacent in space or time, while also improving accuracy. Control relates to the amount of balancing, blocking and grouping of the experimental units [12].
The number of replications needed for a given probability of detecting a given difference with statistical significance depends on the standard error per experimental unit [13]. This means that small sample sizes can easily detect large differences, whereas small differences require larger sample sizes. When dealing with rhythmic variables, a sizeable portion of the variance stems from the rhythmic variation. Assessing the rhythmic behavior is thus important to reduce the error term. One important feature of chronobiological study designs is that rhythm stage is often the primary factor, as when assessing the relative efficacy or toxicity of a given treatment administered at different stages of the circadian rhythm. The power of testing for a time effect is usually only slightly affected by the number of timepoints considered when results are analyzed by cosinor, but not when performing an analysis of variance. This difference in approach accounts in part for the controversy between classical designs advocating fewer test groups [14] and chronobiological designs recommending at least 6 timepoints per cycle [15-17].
In the framework of chronobiological study designs, three kinds of data can be distinguished, which determine the choice of method for their analysis and how the results can be interpreted. Longitudinal sampling corresponds to obtaining data on the same individual (experimental unit) as a function of time.
One example is the around-the-clock monitoring of blood pressure at about 30-minute intervals for 7 days. Results apply to this particular individual. Transverse (cross-sectional) sampling consists of obtaining only one value per individual (experimental unit), different individuals providing data at the same or different sampling times. Time series of survival times are one example of transverse data. When individuals represent a random sample of a given population, results can be generalized for that population. Hybrid (linked cross-sectional) sampling consists of taking a few serial measurements from several individuals (experimental units). For instance, circulating prolactin is determined at 20-minute intervals for 24 hours in women at low or high familial risk of developing breast cancer later in life. The circadian rhythm can be determined for each woman and summarized across all women in each group for assessing any difference as a function of breast cancer risk [18]. When individuals represent a random sample of their respective populations, results can be generalized to these populations.
Quite generally, but particularly when sampling is performed on more than a single individual, it is important that the individuals are synchronized. Synchronizers (environmental periodicities determining the temporal placement of biological rhythms) serve this purpose. The rest-activity schedule or the light–dark regimen are effective synchronizers and can be used to determine a reference time (such as time of awakening or light onset in preference to clock hours such as local midnight). Staggered lighting regimens have been used to facilitate data collection in the experimental laboratory [19], making it possible to collect data over several days [20]. Marker rhythms [21] are a useful check of whether synchronization has been achieved, further providing an internal reference.
Activity, temperature, heart rate and blood pressure are some useful marker variables that can easily be monitored longitudinally. For instance, blood pressure has been used to guide the timing of administration of anti-hypertensive medication while also providing information regarding the patient’s response to treatment [22].
Summary statistics
Before proceeding with any data analysis, it is recommended to first plot the data as a function of time. Such a chronogram can be informative in several ways. The presence of obvious rhythmicity may be recognized and its relative prominence as compared to the noise may be qualitatively (macroscopically) assessed. When sampling covers several cycles, some measure of the cycle-to-cycle variability can be gained. The presence of any increasing or decreasing trends can be observed, as is the existence of any outliers. After curve fitting, a chronogram of residuals can also provide valuable information regarding the adequacy of the model, and the need for data transformation.
A histogram should also be prepared to obtain an estimate of the mean value with its standard deviation, and to check on the assumption of normality. For instance, a long-tailed distribution is indicative of the need for data transformation. Alternatively, the use of robust methods (such as those based on ranks; [23]) may be indicated.
When prior information suggests the presence of a rhythm with known period, stacking the data over a single cycle reduces the noise and reveals the rhythm’s waveform. Historically, this approach was used by Franz Halberg to resolve confusing variability in blood eosinophil counts [24-26]. It was also instrumental in showing that the circadian rhythm in core temperature of Fischer rats persisted after the bilateral lesioning of the suprachiasmatic nuclei, albeit with a reduced amplitude and a phase advance [27,28]. It should be noted, however, that stacking the data over an assumed period may yield spurious results if the signal’s period differs from its assumed value. For this reason, it is highly recommended to analyze the original data first before proceeding with any stacking. For instance, it is not uncommon to present data by calendar month, even when the data have been collected over several years.
This procedure limits the ability to resolve any periodic signal with a period different from precisely 1 year or its harmonic terms (6, 4, 3 months, …). Stacking the data over a period that has been validated can be complemented by an analysis of variance testing for a time effect when data are binned into a given number of classes of equal duration covering the full cycle (e.g., six 4-hourly classes covering 24 hours). An F-test then serves for testing the equality of class means. While this procedure remains applicable for non-equidistant data, the result depends on the choice of the number of classes used for binning and on the choice of the reference time [29].
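The stacking and binning procedure described above can be illustrated with a minimal sketch, assuming NumPy is available. The function names (`stack_and_bin`, `anova_f`) and the simulated blood-pressure-like series are hypothetical and serve only as an illustration; the F statistic would still need to be compared with a tabulated critical value for the stated degrees of freedom.

```python
import numpy as np

def stack_and_bin(t, y, period, n_classes):
    """Fold sampling times onto one cycle and group the data into equal-width classes."""
    phase = np.mod(t, period)                      # time elapsed within the cycle
    width = period / n_classes
    idx = np.minimum((phase // width).astype(int), n_classes - 1)
    return [y[idx == c] for c in range(n_classes)]

def anova_f(groups):
    """One-way ANOVA F statistic testing the equality of class means."""
    groups = [g for g in groups if g.size > 0]
    n, m = sum(g.size for g in groups), len(groups)
    grand_mean = np.concatenate(groups).mean()
    ss_between = sum(g.size * (g.mean() - grand_mean) ** 2 for g in groups)
    ss_within = sum(np.sum((g - g.mean()) ** 2) for g in groups)
    return (ss_between / (m - 1)) / (ss_within / (n - m)), (m - 1, n - m)

# Hypothetical series: 7 days at 30-minute intervals, 24-hour rhythm plus noise
rng = np.random.default_rng(0)
t = np.arange(0, 7 * 24, 0.5)
y = 120 + 10 * np.cos(2 * np.pi * t / 24 - 1.0) + rng.normal(0, 5, t.size)

classes = stack_and_bin(t, y, period=24.0, n_classes=6)   # six 4-hourly classes
f_stat, df = anova_f(classes)
print("class means:", np.round([g.mean() for g in classes], 1))
print("F =", round(f_stat, 1), "with df", df)
```

Changing `n_classes` or shifting the reference time (adding an offset to `t`) alters the class means and the F statistic, which is the dependence on binning choices noted in the text.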
Single cosinor
Historically, the single cosinor was developed to analyze short and sparse data series [2,30-32]. Periodograms and classical spectra originally used in chronobiology [33,34] required the data to be equidistant and to cover more than a single cycle. Whereas some spectral analysis techniques are now available to analyze non-equidistant data [35-37], algorithms available in most software packages remain limited to the case of equidistant data.
Least squares procedures do not have this limitation. They are thus useful in curve-fitting problems, where it is desirable to obtain a functional form that best fits a given set of measurements. Although periodic regression presents its own limitations, being sensitive to outliers and not having any constraint to conserve the variance in the data, it possesses two important features: first, when data are equidistant, results at Fourier frequencies are identical to those of the discrete Fourier transform [38]; and second, it advantageously uses prior information. Thus, after the existence of ubiquitous circadian rhythms was demonstrated, it was possible to apply the single cosinor method in many experiments aimed at determining the times of highest efficacy and lowest toxicity in response to a variety of drugs and other stimuli by fitting a 24-hour cosine curve to 6 values, 4 hours apart, each value representing the number of experimental animals that survived a given intervention applied at one of the 6 timepoints when overall about 50% of the animals had died. These results led to the fields of chronopharmacy and chronotherapy [39-42].
Single-component cosinor
Notably in studies of circadian rhythms, it is indeed possible to assume that the period is known, being synchronized to the external 24-hour cycle. The regression model for a single component can be written as
$$Y(t) = M + A\cos\left(\frac{2\pi t}{\tau} + \phi\right) + e(t) \qquad (1)$$
where M is the MESOR (Midline Statistic Of Rhythm, a rhythm-adjusted mean), A is the amplitude (a measure of half the extent of predictable variation within a cycle), ϕ is the acrophase (a measure of the time of overall high values recurring in each cycle), τ is the period (duration of one cycle), and e(t) is the error term (Figure 1).
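As an illustration, the single-component model can be fitted by ordinary least squares once it is rewritten in the usual linear form Y = M + βx + γz + e, with x = cos(2πt/τ), z = sin(2πt/τ), β = A cos ϕ and γ = −A sin ϕ. The sketch below assumes NumPy; the function name `single_cosinor`, the returned dictionary and the simulated data are hypothetical. The zero-amplitude F-test uses the model and residual sums of squares with 2 and N − 3 degrees of freedom, for which the upper-tail probability has a closed form.

```python
import numpy as np

def single_cosinor(t, y, period):
    """Least squares fit of Y(t) = M + A*cos(2*pi*t/period + phi) + e(t)."""
    x = np.cos(2 * np.pi * t / period)
    z = np.sin(2 * np.pi * t / period)
    X = np.column_stack([np.ones_like(t, dtype=float), x, z])
    (M, beta, gamma), *_ = np.linalg.lstsq(X, y, rcond=None)
    fitted = X @ np.array([M, beta, gamma])
    rss = np.sum((y - fitted) ** 2)               # residual sum of squares
    mss = np.sum((fitted - y.mean()) ** 2)        # model sum of squares
    n = t.size
    F = (mss / 2) / (rss / (n - 3))               # zero-amplitude test statistic
    p = (1 + 2 * F / (n - 3)) ** (-(n - 3) / 2)   # exact upper tail of F(2, n-3)
    return {"mesor": M,
            "amplitude": np.hypot(beta, gamma),
            "acrophase": np.arctan2(-gamma, beta),  # radians, from beta and gamma
            "F": F, "p_value": p}

# Hypothetical data: 12 samples, 2 hours apart, 24-hour rhythm plus noise
rng = np.random.default_rng(0)
t = np.arange(0, 24, 2.0)
y = 10 + 3 * np.cos(2 * np.pi * t / 24 - np.pi / 3) + rng.normal(0, 1, t.size)
fit = single_cosinor(t, y, period=24.0)
print({k: round(float(v), 3) for k, v in fit.items()})
```

Because the fit is a regression, `t` need not be equidistant; the same function can be called with irregular sampling times.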
It should be noted that the P-value obtained from the F-test and the corresponding confidence limits derived for M, A and ϕ are valid only if assumptions underlying the use of the least squares procedure are satisfied. These assumptions are (1) the model fits the data well, (2) the residuals are normally distributed, (3) the variance is homogeneous, (4) the residuals are independent, and (5) the parameters do not change over time.
Model adequacy: Goodness of fit can be examined when replicates are available, either from multiple data collection at different timepoints or from data covering multiple cycles. The residual sum of squares (RSS) can then be further partitioned into the “pure error” and the “lack of fit”. An F-test comparing the pure error and lack of fit sums of squares provides a test of the model adequacy [44]. The pure error sum of squares (SSPE) is defined as the sum of squared differences (across all timepoints) between the data collected at a given timepoint and their respective arithmetic mean, whereas the sum of squares ascribed to lack of fit (SSLOF) is obtained by subtracting SSPE from RSS
$$SSLOF = RSS - SSPE$$
with
$$SSPE = \sum_{i=1}^{m}\sum_{l=1}^{n_i}\left(Y_{il} - \bar{Y}_i\right)^2, \qquad \bar{Y}_i = \frac{1}{n_i}\sum_{l=1}^{n_i} Y_{il}$$
where ni is the number of data collected at time ti.
The appropriateness of the model is rejected if
$$F = \frac{SSLOF/(m - 1 - 2p)}{SSPE/(N - m)} > F_{1-\alpha}(m - 1 - 2p,\; N - m)$$
where N is the total number of data, m is the number of timepoints and p is the number of (cosine) components in the model (p = 1 for the single-component cosinor).
In the presence of lack of fit, adding components in the model may be considered (Figure 3).
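The partition of the residual sum of squares into pure error and lack of fit can be sketched as follows, assuming NumPy and replicate measurements taken at identical sampling times. The function name `lack_of_fit` and the simulated data (a 24-hour plus a 12-hour component) are hypothetical; the resulting F statistic is meant to be compared with a tabulated critical value.

```python
import numpy as np

def lack_of_fit(t, y, periods):
    """F statistic comparing lack-of-fit and pure-error mean squares for a cosinor model."""
    cols = [np.ones_like(t, dtype=float)]
    for tau in periods:
        cols += [np.cos(2 * np.pi * t / tau), np.sin(2 * np.pi * t / tau)]
    X = np.column_stack(cols)
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    rss = np.sum((y - X @ coef) ** 2)
    times = np.unique(t)
    # Pure error: squared deviations of replicates from their timepoint mean
    sspe = sum(np.sum((y[t == ti] - y[t == ti].mean()) ** 2) for ti in times)
    sslof = rss - sspe
    m, n, p = times.size, t.size, len(periods)
    df_lof, df_pe = m - 1 - 2 * p, n - m
    return (sslof / df_lof) / (sspe / df_pe), (df_lof, df_pe)

# Hypothetical design: 12 timepoints 2 hours apart, 4 replicates each
rng = np.random.default_rng(2)
t = np.repeat(np.arange(0, 24, 2.0), 4)
y = (10 + 3 * np.cos(2 * np.pi * t / 24 - 1.0)
     + 2 * np.cos(2 * np.pi * t / 12 + 0.5)
     + rng.normal(0, 0.5, t.size))

F1, df1 = lack_of_fit(t, y, [24.0])          # single component: lack of fit expected
F2, df2 = lack_of_fit(t, y, [24.0, 12.0])    # adding the 12-hour harmonic
print("1 component: F =", round(F1, 2), "df", df1)
print("2 components: F =", round(F2, 2), "df", df2)
```

In this simulated case the single-component model leaves the 12-hour harmonic in the lack-of-fit term, so its F statistic is much larger than that of the two-component model.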
The single-component cosinor is easily extended to a multiple-component model (Figure 3)
$$Y(t) = M + \sum_{j=1}^{p} A_j\cos\left(\frac{2\pi t}{\tau_j} + \phi_j\right) + e(t) \qquad (12)$$
Instead of solving a system of 3 equations in 3 unknowns, there are 2p + 1 normal equations to estimate M and p pairs of (βj, γj) or (Aj, ϕj), with βj = Aj cos ϕj and γj = −Aj sin ϕj, when τj are assumed known. Generally, in the normal equations d = Su,
$$d = X^{T}Y, \qquad S = X^{T}X$$
where the columns of X consist of a constant term (equal to 1, for the MESOR) and of the {xij}, which are the cos(2πti/τj) and sin(2πti/τj).
Estimates of u (M, β1, γ1, β2, γ2, …, βp, γp) are obtained as
$$\hat{u} = S^{-1}d$$
A confidence ellipsoid can be determined [43] from which approximate confidence intervals can be derived for each component’s amplitude and acrophase, as outlined above. Computations are greatly simplified in the case of equidistant data covering an integer number of cycles [45].
A multiple-component model is useful to obtain a better approximation of the signal’s waveform when it deviates from sinusoidality. For instance, a 2-component model consisting of cosine curves with periods of 24 and 12 hours has been extensively used to analyze ambulatory blood pressure data (Figure 3) [62-64]. On the average, these two components account for the larger part of the overall variance in this case [65]. This model is usually well-suited to approximate the nightly drop in blood pressure that reaches a nadir around mid-sleep [66], the slight increase thereafter and a sharper increase after awakening, the post-prandial dip seen more prominently in the elderly [67], and a slow decline in the evening. Whereas better-fitting models can be obtained for each individual patient, the choice of a given model used as a reference standard makes it possible to derive reference values (such as 90% prediction limits) for the model’s parameters for specified populations, usually clinically healthy men or women in several age groups [62]. Deviations from these norms can then be viewed as indicative of rhythm alteration.
In addition to the well-known cardiovascular disease risk associated with an elevated blood pressure MESOR, outcome studies [68-72] have determined that certain other rhythm alterations affecting the circadian amplitude and acrophase are also associated with an increase in cardiovascular disease risk [64].
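A minimal sketch of the multiple-component fit through the normal equations d = Su is given below, assuming NumPy. The function name `multi_cosinor` and the noise-free, non-equidistant test series with 24- and 12-hour components are hypothetical; amplitudes and acrophases are recovered from the estimated (βj, γj) pairs using β = A cos ϕ and γ = −A sin ϕ.

```python
import numpy as np

def multi_cosinor(t, y, periods):
    """Solve the 2p+1 normal equations S u = d for a p-component cosinor model."""
    cols = [np.ones_like(t, dtype=float)]           # constant column for the MESOR
    for tau in periods:
        cols += [np.cos(2 * np.pi * t / tau), np.sin(2 * np.pi * t / tau)]
    X = np.column_stack(cols)
    S = X.T @ X
    d = X.T @ y
    u = np.linalg.solve(S, d)                       # (M, beta1, gamma1, ..., betap, gammap)
    beta, gamma = u[1::2], u[2::2]
    amplitudes = np.hypot(beta, gamma)
    acrophases = np.arctan2(-gamma, beta)           # radians
    return u[0], amplitudes, acrophases

# Hypothetical non-equidistant sampling over 7 days (noise-free for clarity)
rng = np.random.default_rng(3)
t = np.sort(rng.uniform(0, 7 * 24, 200))
y = 120 + 10 * np.cos(2 * np.pi * t / 24 - 1.0) + 5 * np.cos(2 * np.pi * t / 12 - 2.0)

M, A, phi = multi_cosinor(t, y, [24.0, 12.0])
print("MESOR:", round(M, 3), "amplitudes:", np.round(A, 3), "acrophases:", np.round(phi, 3))
```

Solving the normal equations directly mirrors the notation of the text; for ill-conditioned designs (for example, trial periods that are very close relative to the observation span), a solver such as `numpy.linalg.lstsq` applied to X is numerically safer.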
Population-mean cosinor
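When data are collected as a function of time on 3 or more individuals, the population-mean cosinor procedure renders it possible to make inferences concerning a population rhythm, provided the k individuals represent a random sample from that population. Each individual series is analyzed by the single- or multiple-component single cosinor to yield estimates of ui = (Mi, β1i, γ1i, …, βpi, γpi), i = 1, 2, …, k. The goal is to make inferences concerning the population averages of the parameters, u*. The “*” indicates that the expectations are population averages and not averages over the k individuals sampled. Individual vectors ui are assumed to represent a random sample from a (2p + 1)-variate, normal population with mean u*. The within-individual variances are also assumed to be equal, so that the pooled estimate of variance can be obtained as
$$\hat{\sigma}^2 = \frac{\sum_{i=1}^{k} RSS_i}{\sum_{i=1}^{k}\left(N_i - 2p - 1\right)}$$
where RSSi and Ni are the residual sum of squares and the number of data of the ith individual.
When the sample sizes for all individuals are the same or almost the same, as is often the case in hybrid designs, the population estimates are unweighted averages of the individual parameters
$$\hat{u}^* = \frac{1}{k}\sum_{i=1}^{k}\hat{u}_i$$
and the population amplitudes and acrophases can be estimated using the relations
$$\hat{A}_j^* = \left(\hat{\beta}_j^{*2} + \hat{\gamma}_j^{*2}\right)^{1/2}, \qquad \hat{\phi}_j^* = \arctan\left(-\hat{\gamma}_j^*/\hat{\beta}_j^*\right)$$
with the quadrant of the acrophase determined by the signs of the two coefficients.
In the above conditions and assuming normality of errors and individual parameters, sample variances can be computed as
$$\hat{\sigma}_M^2 = \frac{1}{k-1}\sum_{i=1}^{k}\left(\hat{M}_i - \hat{M}^*\right)^2, \quad \hat{\sigma}_\beta^2 = \frac{1}{k-1}\sum_{i=1}^{k}\left(\hat{\beta}_i - \hat{\beta}^*\right)^2, \quad \hat{\sigma}_\gamma^2 = \frac{1}{k-1}\sum_{i=1}^{k}\left(\hat{\gamma}_i - \hat{\gamma}^*\right)^2$$
where the covariance between the two coefficients is computed in the same way
$$\hat{\sigma}_{\beta\gamma} = \frac{1}{k-1}\sum_{i=1}^{k}\left(\hat{\beta}_i - \hat{\beta}^*\right)\left(\hat{\gamma}_i - \hat{\gamma}^*\right)$$
In the case when the population-mean cosinor can be applied separately for each trial period (p = 1), a confidence interval for M* is given by
$$\hat{M}^* \pm t_{1-\alpha/2}(k-1)\,\frac{\hat{\sigma}_M}{\sqrt{k}}$$
and a joint 1-α confidence ellipse for (β*, γ*) consists of all points (βz*, γz*) satisfying
$$\frac{k(k-2)}{2(k-1)}\,\frac{1}{1-r^2}\left[\frac{\left(\hat{\beta}^* - \beta_z^*\right)^2}{\hat{\sigma}_\beta^2} - 2r\,\frac{\left(\hat{\beta}^* - \beta_z^*\right)\left(\hat{\gamma}^* - \gamma_z^*\right)}{\hat{\sigma}_\beta\hat{\sigma}_\gamma} + \frac{\left(\hat{\gamma}^* - \gamma_z^*\right)^2}{\hat{\sigma}_\gamma^2}\right] \le F_{1-\alpha}(2,\, k-2)$$
where
$$r = \frac{\hat{\sigma}_{\beta\gamma}}{\hat{\sigma}_\beta\hat{\sigma}_\gamma}$$
The null hypothesis of A* = 0 is rejected if
$$\frac{k(k-2)}{2(k-1)}\,\frac{1}{1-r^2}\left[\frac{\hat{\beta}^{*2}}{\hat{\sigma}_\beta^2} - 2r\,\frac{\hat{\beta}^*\hat{\gamma}^*}{\hat{\sigma}_\beta\hat{\sigma}_\gamma} + \frac{\hat{\gamma}^{*2}}{\hat{\sigma}_\gamma^2}\right] > F_{1-\alpha}(2,\, k-2)$$
and approximate confidence intervals for A* and ϕ* can be obtained by computing the minimal and maximal distances from the pole (zero) to the error ellipse and by drawing tangents from the pole to the error ellipse, respectively (Figure 4). As for the single cosinor, closer approximate limits can also be computed [43].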
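The population-mean cosinor at a single trial period can be sketched as follows, assuming NumPy and assuming that individual estimates of M, β and γ (with β = A cos ϕ and γ = −A sin ϕ) are already available from single cosinor fits. The function name `population_mean_cosinor` and the seven sets of individual estimates are hypothetical. The zero-amplitude test is written here as a bivariate Hotelling-type F statistic with 2 and k − 2 degrees of freedom, whose upper-tail probability has a closed form.

```python
import numpy as np

def population_mean_cosinor(M, beta, gamma):
    """Population estimates and zero-amplitude test from k individual cosinor fits."""
    M, beta, gamma = (np.asarray(v, dtype=float) for v in (M, beta, gamma))
    k = M.size                                      # number of individuals (k >= 3)
    M_star, b_star, g_star = M.mean(), beta.mean(), gamma.mean()
    s_b, s_g = beta.std(ddof=1), gamma.std(ddof=1)  # between-individual SDs
    r = np.corrcoef(beta, gamma)[0, 1]
    quad = (b_star**2 / s_b**2
            - 2 * r * b_star * g_star / (s_b * s_g)
            + g_star**2 / s_g**2) / (1 - r**2)
    F = k * (k - 2) / (2 * (k - 1)) * quad
    p = (1 + 2 * F / (k - 2)) ** (-(k - 2) / 2)     # exact upper tail of F(2, k-2)
    return {"mesor": M_star,
            "mesor_se": M.std(ddof=1) / np.sqrt(k),
            "amplitude": np.hypot(b_star, g_star),
            "acrophase": np.arctan2(-g_star, b_star),
            "F": F, "p_value": p}

# Hypothetical individual estimates from k = 7 subjects
M_i = [118, 122, 125, 119, 121, 124, 120]
beta_i = [5.1, 4.2, 6.0, 5.5, 3.9, 4.8, 5.2]
gamma_i = [-8.0, -7.1, -9.2, -8.5, -6.8, -7.9, -8.8]
pop = population_mean_cosinor(M_i, beta_i, gamma_i)
print({name: round(float(v), 4) for name, v in pop.items()})
```

Note that the population amplitude is computed from the averaged β and γ, not as the average of individual amplitudes, so that individuals with scattered acrophases yield a small population amplitude.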
Test statistics have been developed to test the equality of MESORs, amplitudes and acrophases considered jointly or separately for the case of the single cosinor and the population-mean cosinor [43]. These tests can allow for a clearer interpretation of the results, for instance in a circadian experiment involving 6 timepoints 4 hours apart: Student t-tests are sometimes applied at each separate timepoint without adjustment of the P-values for multiple testing; when differences in opposite directions are found, parameter tests may reveal a difference in the circadian amplitude in the absence of a difference in MESOR or in the circadian acrophase [73].
Least squares spectra and population-mean cosinor spectra
The circadian rhythm is often prominent. It is also ubiquitous. These features enabled the single cosinor procedure to be applied to many short data series covering no more than a single cycle in order to yield valuable information regarding the organization of the circadian system in different species. Computationally, estimates of the MESOR, amplitude and acrophase can be obtained for any trial period. This procedure, however, is valid only if there is sufficient evidence for considering this particular trial period. In the absence of such evidence, results can no longer be taken at their face value.
It has become much easier for chronobiologists to collect data over much longer spans and/or at much shorter intervals, but it has been more difficult to obtain series of equidistant data. Even for variables that are obtained with automated instrumentation (such as telemetry or ambulatory blood pressure monitors), it is not uncommon to have missing data or to have additional data collected manually at times different from the scheduled times. Investigations have also extended outside the circadian realm. For these reasons, a least squares approach to time series analysis remains attractive, as long as caution is properly taken in interpreting the results.
Just as a chronogram provides useful information prior to quantitative data analysis, a view of the time structure of the data in the frequency domain can also be informative. For this purpose, using the cosinor at Fourier frequencies in the range of 1/T (where T is the length of the data series) up to 1/(2Δt) (where Δt is the sampling interval) can be viewed as no more than another macroscopic view of the data. A plot of amplitudes as a function of frequency (least squares spectrum) is equivalent to a discrete Fourier transform when data are equidistant [38].
– Large spectral peaks indicate the presence of signals and provide an approximate estimate of their periods. This information can be used to validate anticipated components while also revealing the presence of other cycles. For rhythms that are anticipated, rhythm detection and parameter estimation can proceed as outlined above as long as P-values are adjusted for multiple testing [74]. Caution needs to be taken regarding non-anticipated cycles. The information thus gained can be used to design the next study or to examine other similar data series that could serve as replications. Additional analyses can be performed to determine the extent of stability of the unanticipated component, for instance by means of applying a chronobiologic serial section [21] or a gliding spectral window [75].
– Plotting log-amplitudes versus log-frequency provides useful information regarding the noise structure [65]. White noise corresponds to about the same background amplitudes across the entire frequency range. Larger background amplitudes at lower than at higher frequencies represent colored (or correlated) noise, indicating that underlying assumptions are not met, resulting in under-estimated P-values and too-liberal confidence intervals of rhythm parameters. The noise structure can in itself be valuable. It is used for instance in assessing the 1/f behavior of heart rate variability [76].
– Single spectral peaks are found only if the data cover an integer number of cycles. If this is not the case, the signal spreads over several spectral lines [10]. When this happens and the underlying signal was anticipated, it is possible to determine the period (frequency) corresponding to the maximal amplitude by applying the single cosinor procedure not only at the Fourier frequencies but at additional intermediary frequencies as well. Whereas this may provide a clearer picture of the signal, it should be realized that the resolution in frequency (1/T) remains the same, being determined by the series length, T. Tapers such as a Hanning window [77] can be used to reduce sidelobes associated with the finite observation span, but this procedure also affects the estimation of the rhythm parameters. While a Hanning taper does not affect the location of spectral peaks in a spectrum, the width of the peak is wider and the amplitude is reduced (Figure 5). It remains useful, however, for a macroscopic view of the time structure of the data.
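A least squares spectrum of the kind described above can be sketched by fitting the single cosinor at each Fourier frequency from 1/T up to just below the Nyquist frequency 1/(2Δt). The sketch assumes NumPy; the function name `ls_spectrum` and the simulated hourly series are hypothetical. The Nyquist frequency itself is excluded because the sine term vanishes there for equidistant data, making the design singular.

```python
import numpy as np

def ls_spectrum(t, y, freqs):
    """Cosinor amplitude at each trial frequency (least squares spectrum)."""
    amplitudes = []
    for f in freqs:
        X = np.column_stack([np.ones_like(t, dtype=float),
                             np.cos(2 * np.pi * f * t),
                             np.sin(2 * np.pi * f * t)])
        (m, beta, gamma), *_ = np.linalg.lstsq(X, y, rcond=None)
        amplitudes.append(np.hypot(beta, gamma))
    return np.array(amplitudes)

# Hypothetical series: 10 days of hourly data with a 24-hour rhythm plus noise
rng = np.random.default_rng(4)
dt, T = 1.0, 240.0
t = np.arange(0, T, dt)
y = 50 + 3 * np.cos(2 * np.pi * t / 24 + 0.5) + rng.normal(0, 1, t.size)

# Fourier frequencies 1/T, 2/T, ... up to just below the Nyquist frequency 1/(2*dt)
freqs = np.arange(1, int(T / (2 * dt))) / T
amps = ls_spectrum(t, y, freqs)
peak = freqs[np.argmax(amps)]
print("peak period (h):", round(1 / peak, 2), "amplitude:", round(amps.max(), 2))
```

Passing a denser `freqs` array (intermediary frequencies) refines the location of the peak without improving the frequency resolution, which remains 1/T; plotting `np.log(amps)` against `np.log(freqs)` gives the macroscopic view of the noise structure mentioned in the list.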
When the period is unknown, the single cosinor model (Equations 1 and 12) can no longer be linearized in its parameters as the period is in the argument of the cosine function. Starting from an initial (guess) estimate for the period, all parameters can be estimated using iterations aimed at minimizing the residual sum of squares. Marquardt [79] developed an algorithm which performs an optimum interpolation between the Taylor series and gradient methods. He also derived a way to approximate confidence intervals for all parameters, including the period [80]. For the particular case of single-component models, Bingham offers an easily understood approach [81].
For low-frequency signals, simulated annealing [82] is another suitable method that has the advantage of not requiring the specification of initial values for the periods. This approach does not perform well, however, for very sharp signals in the higher frequency range of the spectrum. Both simulated annealing and Marquardt’s nonlinear approach performed best in distinguishing two signals with close periods sampled over less than a beat cycle, when compared to other approaches [83].
For signals with a symmetrical waveform, the nonlinear procedure can yield an acceptable estimate of the fundamental period on the basis of very short records not even covering a full cycle [84]. This is not the case, however, when the waveform is asymmetrical. Simulations indicate that about 5 cycles are needed to obtain a reasonable estimate of the period in this case, when the model fitted includes only the fundamental component. Including additional harmonic terms in the model allows the nonlinear procedure to correctly estimate the fundamental period with data covering no more than 2 cycles [84].
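The idea of minimizing the residual sum of squares over the period can be illustrated with a simple stand-in that is neither Marquardt's algorithm nor simulated annealing: since the model is linear in M, β and γ for any fixed trial period, the residual sum of squares can be profiled over a grid of periods and its minimum located. The sketch assumes NumPy; the function names, the search interval and the simulated 24.8-hour signal are hypothetical, and no confidence interval for the period is derived.

```python
import numpy as np

def rss_at_period(t, y, tau):
    """Residual sum of squares of the single cosinor fitted at trial period tau."""
    X = np.column_stack([np.ones_like(t, dtype=float),
                         np.cos(2 * np.pi * t / tau),
                         np.sin(2 * np.pi * t / tau)])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return np.sum((y - X @ coef) ** 2)

def estimate_period(t, y, tau_min, tau_max, n_grid=2001):
    """Grid search for the period minimizing the residual sum of squares."""
    grid = np.linspace(tau_min, tau_max, n_grid)
    rss = np.array([rss_at_period(t, y, tau) for tau in grid])
    best = np.argmin(rss)
    return grid[best], rss[best]

# Hypothetical non-equidistant series over 10 days with a 24.8-hour component
rng = np.random.default_rng(1)
t = np.sort(rng.uniform(0, 240, 300))
y = 50 + 3 * np.cos(2 * np.pi * t / 24.8 + 0.5) + rng.normal(0, 0.5, t.size)

tau_hat, rss_min = estimate_period(t, y, tau_min=20.0, tau_max=30.0)
print("estimated period (h):", round(tau_hat, 3))
```

The grid estimate could serve as the initial guess for an iterative nonlinear least squares routine, which would also be the natural route to approximate confidence limits for the period.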
Analysis of non-stationary data
When data are equidistant or rendered equidistant by averaging and filling data gaps by interpolation, wavelet analysis can be performed [85]. This approach has been useful to uncover components not detected earlier [86]. Short-time Fourier transforms can be used to visualize changes in the spectral structure of the data as a function of time [87]. Alternatively, gliding spectral windows [75] can be computed. The method consists of defining an interval (I) that is progressively displaced by a given increment (δt) throughout the data series. A least squares spectrum is computed over each interval over a specified frequency range. Both a fractional harmonic increment (δh < 1) and overlapping intervals (δt < I) are chosen to help visualize the time course of changes in frequency and/or amplitude occurring as a function of time in a 3D graph and/or a surface chart. In such a display, time and frequency are the two horizontal axes and amplitude is shown on the vertical axis in a 3D plot or as different shadings in a surface chart. One example relates to competing about 24.0- and 24.8-hour components coexisting in the physiology of an apparently seleno-sensitive woman with adynamic episodes recurring twice a year and lasting 2–3 months, as illustrated for systolic blood pressure in Figure 6. Another example illustrates the changing prominence of the about-weekly and about-daily rhythms in blood pressure and heart rate during the first 40 days of life of a clinically healthy boy [88]. Whereas the procedure can be performed on non-equidistant data, the interpretation of results is greatly helped when data are equidistant, as changes in sampling rate are also associated with changes in spectral structure appearing on the graph. A judicious choice of I and of the frequency range examined is important in order to minimize sidelobes. The use of a Hanning taper [77] is also helpful in this kind of exploratory analysis.
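A gliding spectral window can be sketched as a least squares spectrum recomputed over an interval I displaced by an increment δt, assuming NumPy. The function names, the choice of a 7-day interval with a 1-day increment, the fractional harmonic increment of 0.5 and the simulated series (a 24-hour rhythm appearing only after day 20) are hypothetical; no taper is applied in this sketch.

```python
import numpy as np

def ls_amplitude(t, y, f):
    """Cosinor amplitude at trial frequency f."""
    X = np.column_stack([np.ones_like(t, dtype=float),
                         np.cos(2 * np.pi * f * t),
                         np.sin(2 * np.pi * f * t)])
    (m, beta, gamma), *_ = np.linalg.lstsq(X, y, rcond=None)
    return np.hypot(beta, gamma)

def gliding_spectrum(t, y, interval, step, freqs):
    """Least squares spectra over an interval displaced by `step` along the series."""
    starts = np.arange(t.min(), t.max() - interval + 1e-9, step)
    out = np.full((starts.size, len(freqs)), np.nan)
    for i, s in enumerate(starts):
        sel = (t >= s) & (t < s + interval)
        if sel.sum() > 3:                           # need more data than parameters
            out[i] = [ls_amplitude(t[sel], y[sel], f) for f in freqs]
    return starts, out

# Hypothetical series: 40 days of hourly data; a 24-hour rhythm appears after day 20
rng = np.random.default_rng(5)
t = np.arange(0, 40 * 24, 1.0)
amp = np.where(t >= 20 * 24, 5.0, 0.0)
y = 100 + amp * np.cos(2 * np.pi * t / 24 + 0.3) + rng.normal(0, 1, t.size)

interval, step = 7 * 24.0, 24.0                     # I = 7 days, displaced by 1 day
freqs = np.arange(3, 12, 0.5) / interval            # fractional harmonic increment 0.5
starts, amps = gliding_spectrum(t, y, interval, step, freqs)
j = np.argmin(np.abs(freqs - 1 / 24))
print("24-h amplitude, first and last window:", round(amps[0, j], 2), round(amps[-1, j], 2))
```

The array `amps` (rows indexed by window start, columns by frequency) is the kind of table that can be displayed as a surface chart, with time and frequency on the two horizontal axes and amplitude as shading.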
There are, of course, other important tools for the analysis of time series [95-103]. Most of them, however, require that the data be equidistant. This overview focused specifically on the use of the cosinor and its different extensions. The method is fairly simple and its results lead to meaningful interpretations. Despite its several shortcomings related primarily to the difficulty of satisfying all assumptions underlying the use of regression techniques, its wide-ranging applications have played an important role in the development of chronobiology as a quantitative scientific discipline. With the cautious use of results based on a combination of exploratory analyses with the different cosinor routines and other conventional statistical tests, progress has also been made in the field of chronomics, which aims at mapping broad time structures from the high-frequency brain waves to the multi-decadal cycles characterizing space-terrestrial weather influencing human physiology and pathology [3,104].
Despite its simplicity, some reluctance remains for some investigators to use the cosinor for estimating rhythm parameters or for considering more than a single test time in designing experiments. Too many studies still rely on testing only at a fixed time of day (to control for the circadian variation) or at most at two times 12 hours apart, ignoring the possibility that the two selected timepoints may be at the midline crossings rather than at the peak and trough where differences are maximal. As discussed elsewhere, such practice can be misleading in missing an existing difference or even in finding a difference in mean when none exists [15-17].
Computing day-night differences in lieu of an amplitude and acrophase is also widely done to interpret ambulatory blood pressure monitoring records in terms of “dipping” [105], despite the documentation in several outcome studies of the superiority of a chronobiological approach [106,107].To some extent, this status quo may be accounted for by the lack of dissemination of computer software offering chronobiologists tools for time series analysis applicable to non-equidistant data. This situation is slowly changing, however. Personal computers have become more powerful and statistical packages have become more readily available for relatively easy use by investigators not necessarily versed in all statistical details underlying the programs included in the software packages. While professional statistical software packages remain somewhat expensive for individual users, several open-source packages (such as Octave and R) offer an attractive alternative, notably since some are platform-independent, running on PCs, Macs or Linux systems [108]. Programmers have taken advantage of the tools available in these packages to write code to perform analytical tasks of interest to chronobiologists. Perhaps the most comprehensive package is that developed by Oehlert and Bingham [109], offering a large array of procedures that can be applied by writing minimal coding instructions to call the different macros. Selected programs used in chronobiology have long been offered on the website of Refinetti [110], with clear instructions on how to run the programs. While not open-source, the Expert Soft Technologie website [111] also offers an array of cosinor-based and other procedures, including techniques for the study of non-stationary signals. These programs have been used in the study of shift-workers [112].In summary, selected methods for the study of biologic time series have been reviewed and their relative merits have been discussed in the light of underlying assumptions. 
Some illustrative applications have also been mentioned. When the choice of a model is justified, and it is functional and explicative, quantitative methods of data analysis are extremely valuable to specify the model and obtain estimates of its parameters. Even when underlying assumptions are not fully met, point estimates of the parameters can be very useful. More caution is needed, however, in deciding whether P-values and confidence intervals are trustworthy, since violation of underlying assumptions tends to yield results that are too liberal. Once this limitation is taken into consideration, data analysis methods as described herein constitute extremely valuable tools for research in chronobiology and chronomics.
Competing interests
The author declares that she has no competing interests.