
What repeated measures analysis of variances really tells us.

Younsuk Lee

Abstract

This article examines repeated measures analysis of variance (RMANOVA). Within-subject repeated measurements are unavoidable during clinical and experimental investigation, and between- and within-subject variability should be treated separately. Only through proper use and meticulous interpretation can ethical and scientific integrity be guaranteed. The philosophical background of, and knowledge pertaining to, RMANOVA are described in the first half of this text. The sphericity assumption and associated issues are discussed in the latter half. The final section describes summary measure analysis, which is often neglected by readers who rely on P values alone.

Keywords:  Data interpretation; Repeated measurements; Sphericity condition

Year:  2015        PMID: 26257845      PMCID: PMC4524931          DOI: 10.4097/kjae.2015.68.4.340

Source DB:  PubMed          Journal:  Korean J Anesthesiol        ISSN: 2005-6419


Introduction

Readers frequently encounter repeated measures analysis of variance (RMANOVA) when browsing the medical literature. In the field of anesthesiology, we measure blood pressure, cardiac output, and pain scores repeatedly at different time intervals. We can also measure blood pressure at different sites: radial and femoral, and right and left. Although RMANOVA represents a major analytical method for repeated measures (RM) data, it is frequently misused or misinterpreted due to its complexity. Several articles have been published in medical journals focusing on the analysis of RM data [1,2]. However, despite the quality of these reports, readers of the Korean Journal of Anesthesiology (KJA), as well as potential authors, remain uncertain about how to understand and practically apply RMANOVA.

This article focuses on three learning objectives: (1) the pitfalls of erroneously applying simple analysis of variance (ANOVA) to RM data instead of RMANOVA; (2) the obligatory sphericity assumption of RMANOVA, including adjustments and workarounds; and (3) summary measures analysis.

New readers may find it difficult to understand the statistical jargon employed in this article; therefore, several of the key technical terms and abbreviations are defined below:

· RM data are data in which two or more observations are made within an experimental unit. In the KJA, an experimental unit typically refers to a single human or animal subject. Repeated observations can occur temporally or spatially. Longitudinal data represent a special form of RM data, in which repeated observations are made over a long period of time.
· RMANOVA is a distinct type of ANOVA associated with within-subject variability. Some statisticians use RMANOVA instead of univariate ANOVA to assess subject effects. In that context, RMANOVA can be considered a univariate rather than multivariate approach.
· The sum of squares (SS), which measures the variability (uncertainty, error) of data, is calculated as the sum of the squares of the distances between each observation and the mean.
  - SSsomething denotes the variability explained by something known: e.g., SStime, SSgroup, and SSsubject. The total sum of squares, SStotal, is the sum of all the SS components of a dataset. If we have groups A, B, and C, then the notations can be simplified as SSA, SSB, and SSC.
  - Readers should be aware that certain statistical reports use another convention, i.e., "something-SS" or "SS-something", so that the total sum of squares may be denoted as SST or TSS instead of SStotal.
· Mean squares (MS) indicates the average of the SS. MS is estimated by dividing SS by the degrees of freedom (d.f.). The ratio of each MS to MSerror is called the F value.
· Y~X denotes that "Y is modeled as X" (according to the convention of Wilkinson and Rogers, 1973) [3], which is equivalent to "Y is explained by X". When the right-hand side of the formula contains no explanatory variable, Y~1 equates to "Y is modeled as an intercept," or "Y is explained by nothing," which accords with the null hypothesis.
· A : B denotes the interaction between conditions A and B.

By reading this article, readers will learn typical conventions useful for interpreting full-length statistical reports; the information contained herein should act as a bridge toward understanding complex theory. All statistics were estimated using R: A Language and Environment for Statistical Computing (ver. 3.2.0; R Foundation for Statistical Computing, Vienna, Austria). An additional library, "car" (An R Companion to Applied Regression, 2nd Edition; J. Fox and S. Weisberg), was used for Mauchly's test. The complete computational procedures are attached in the appendices in R script format. The datasets introduced herein are real but have been modified slightly to aid understanding.

Major Differences between ANOVA and RMANOVA

A total of 16 boys and 11 girls were enrolled in a study conducted at a university dental hospital in North Carolina. Radiographic distances (mm) between the pituitary and pterygomaxillary fissure were measured repeatedly for each subject, at 8, 10, 12, and 14 years of age [4]. For simplicity, this article focuses on the girls' data, referred to as the "girls dataset" (Table 1).
Table 1

Dental Measurements (mm) in the "Girls Dataset" (n = 11)

Subject   Age 8   Age 10   Age 12   Age 14
F01       21.0    20.0     21.5     23.0
F02       21.0    21.5     24.0     25.5
F03       20.5    24.0     24.5     26.0
F04       23.5    24.5     25.0     26.5
F05       21.5    23.0     22.5     23.5
F06       20.0    21.0     21.0     22.5
F07       21.5    22.5     23.0     25.0
F08       23.0    23.0     23.5     24.0
F09       20.0    21.0     22.0     21.5
F10       16.5    19.0     19.0     19.5
F11       24.5    25.0     28.0     28.0

The data for the 11 girls were retrieved from the full dataset (16 boys and 11 girls), originally introduced by Potthoff RF, Roy S. A generalized multivariate analysis of variance model useful especially for growth curve problems. Biometrika. 1964;51:313-26. With permission from Oxford University Press (3660581189264).

Total Uncertainty Explained by No Factors

The initial estimation begins with a null hypothesis, e.g., "the dental measurements (in girls) were totally unexplainable," or "the dental measurements (in girls) were explained by no factors." Under this model, the total SS (SST) equals the SS of the error (SSE) and is computed by:

$$SS_T = \sum_{i}\sum_{j}\left(X_{i,j} - \bar{X}\right)^2$$

where $X_{i,j}$ denotes the distance on the jth occasion in the ith subject and $\bar{X}$ denotes the mean distance. SST is also computed using a simple ANOVA table that includes "nothing" as an explanatory variable (Table 2a).
Table 2

Sum of Squares for Two ANOVA Models of the "Girls Dataset" (n = 11)

Source   d.f.   SS       MS       F       P value
(a) Null model. "The distances are explained by no factors."
Error    43     247.3    5.751    -       -
(b) ANOVA model of the effect of age. "The distances are explained by the effect of age" (misleading)
Age      3      50.65    16.884   3.435   0.0258
Error    40     196.64   4.916    -       -

ANOVA: analysis of variance, d.f.: degrees of freedom; SS: sum of squares, MS: mean squares.

ANOVA Model of the Effect of Age

It is intuitive to hypothesize that dental distances increase with age. The effect of age can be estimated and is denoted by SSage; however, this estimate is misleading unless age is treated as a within-subjects effect. In this model, SST is partitioned into SSage = 50.65 and SSE = 196.64, which sum to the SST = 247.29 of the null model (Table 2b).
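The partition described above can be checked by hand from Table 1. The following is a minimal Python sketch (the article's own computations were done in R; the function name `simple_anova_ss` and the list `GIRLS` are illustrative assumptions, not taken from the appendices). It deliberately ignores the subject, as the simple ANOVA does, and should come out close to the SST, SSage, and F values reported in Table 2.

```python
# Girls dataset from Table 1: one row per subject (F01..F11),
# columns are the distances (mm) at ages 8, 10, 12, and 14.
GIRLS = [
    (21.0, 20.0, 21.5, 23.0),
    (21.0, 21.5, 24.0, 25.5),
    (20.5, 24.0, 24.5, 26.0),
    (23.5, 24.5, 25.0, 26.5),
    (21.5, 23.0, 22.5, 23.5),
    (20.0, 21.0, 21.0, 22.5),
    (21.5, 22.5, 23.0, 25.0),
    (23.0, 23.0, 23.5, 24.0),
    (20.0, 21.0, 22.0, 21.5),
    (16.5, 19.0, 19.0, 19.5),
    (24.5, 25.0, 28.0, 28.0),
]


def simple_anova_ss(rows):
    """Partition SST into SSage and SSE, ignoring the subject (simple ANOVA)."""
    n, k = len(rows), len(rows[0])
    grand_mean = sum(map(sum, rows)) / (n * k)
    # Null model: every observation is compared with the grand mean.
    ss_total = sum((x - grand_mean) ** 2 for row in rows for x in row)
    # Age effect: each age mean is compared with the grand mean.
    age_means = [sum(col) / n for col in zip(*rows)]
    ss_age = n * sum((m - grand_mean) ** 2 for m in age_means)
    ss_error = ss_total - ss_age
    # F = MSage / MSerror, with d.f. of k - 1 and n*k - k.
    f_value = (ss_age / (k - 1)) / (ss_error / (n * k - k))
    return ss_total, ss_age, ss_error, f_value


ss_total, ss_age, ss_error, f_value = simple_anova_ss(GIRLS)
print(f"SST = {ss_total:.2f}, SSage = {ss_age:.2f}, "
      f"SSE = {ss_error:.2f}, F = {f_value:.3f}")
```

Because the 44 observations are treated as if they were independent, the error d.f. here is 40, which is exactly the simplification the RMANOVA sections go on to criticize.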

ANOVA Model of the Effects of Age, Gender, and Their Interactive Effect

As with the girls dataset, SST in the full dental measurements dataset is given by the sum of SSage, SSgender, SSage : gender, and SSE (Table 3a). The models estimated thus far all exclude the effect of subject. Because the measurements for each subject were repeated four times, the SS values should have comprised SSwithin-subject and SSbetween-subject. The statistics are therefore not correct here for effects that are repeated within subjects, such as age and the age : gender interaction. Summing all of the SS components gives SST = 917.7.
Table 3

ANOVA Tables for the Full Dental Measurements Dataset (n = 27)

Source           d.f.   SS       MS       F        P value
(a) ANOVA model of the effects of age, gender, and their interactive effect (misleading)
Age              3      237.19   79.06    15.030   3.79e-08
Gender           1      140.5    140.5    26.702   1.22e-06
Age : gender     3      14.0     4.66     0.887    0.451
Error            100    526.0    5.26     -        -
(b) RMANOVA model of the effects of age, gender, and their interactive effect (within-subject variability is estimated)
Within-subjects
 Age             3      237.19   79.06    40.032   1.49e-15
 Age : gender    3      13.99    4.66     2.362    0.0781
 Error           75     148.13   1.98     -        -
Between-subjects
 Gender          1      140.5    140.5    9.292    0.005375
 Error           25     377.9    15.12    -        -

ANOVA: analysis of variance, RMANOVA: repeated measures analysis of variance; d.f.: degrees of freedom, SS: sum of squares, MS: mean squares.

RMANOVA Model of the Effects of Age, Gender, and Their Interactive Effect

We will now discuss RMANOVA (Table 3b). The ANOVA table divides sources of variability into two categories: within- and between-subjects. The effects of age (SSage = 237.19), the age : gender interaction (SSage : gender = 13.99), and its error term (SSw = 148.13) comprise the within-subject variability (SSwithin-subject). The effects of gender (SSgender = 140.5) and its error term (SSbetween = 377.9) comprise the between-subject variability (SSbetween-subject).

$$SS_{\text{within-subject}} = SS_{\text{age}} + SS_{\text{age : gender}} + SS_{w}$$
$$SS_{\text{between-subject}} = SS_{\text{gender}} + SS_{\text{between}}$$
$$SS_{T} = SS_{\text{within-subject}} + SS_{\text{between-subject}}$$

Perceptive readers may note that the absolute SS values of the effects equate to those of the simple ANOVA models described in the previous section; the resulting value of SST is always constant within a dataset. What changes is the error term against which each effect is tested: the single pooled error of the simple ANOVA is split into a within-subject and a between-subject error, which changes the F values and, in turn, the P values. In the final RMANOVA model, the P values are therefore either lower or higher than those listed in Table 3a. This indicates that, if RMANOVA is not used, a simple ANOVA will inflate Type I error (false-positive decisions) in between-subject effects and Type II error (false-negative decisions) in within-subject effects. A graphical approach may aid the reader in understanding the concept that total variability is composed of several different sources of variability, denoted by the areas of the rectangles (SS; Fig. 1).
Fig. 1

Graphical representation of the concept of analysis of variance (ANOVA). The designated variabilities reduce total variability, and the areas of the rectangles denote the amount of variability. (A) ANOVA model of the effects of age, gender, and their interactive effect. (B) Repeated measures ANOVA model of the effects of age, gender, and their interactive effect. The effects of age, and the age : gender interaction, are estimated within-subjects.
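The effect of the choice of error term can be seen by recomputing the F values of Table 3 from its SS and d.f. columns. The sketch below is illustrative only (the helper `f_ratio` and the dictionary names are assumptions); small differences from the published F values are expected because the table entries are rounded.

```python
def f_ratio(ss_effect, df_effect, ss_error, df_error):
    """F = MS of the effect divided by MS of the error it is tested against."""
    return (ss_effect / df_effect) / (ss_error / df_error)


# Simple ANOVA (Table 3a): every effect is tested against one pooled error.
POOLED_ERROR = (526.0, 100)
anova = {
    "age": f_ratio(237.19, 3, *POOLED_ERROR),
    "gender": f_ratio(140.5, 1, *POOLED_ERROR),
    "age:gender": f_ratio(14.0, 3, *POOLED_ERROR),
}

# RMANOVA (Table 3b): within-subject effects use the within-subject error,
# and the between-subject effect uses the between-subject error.
WITHIN_ERROR = (148.13, 75)
BETWEEN_ERROR = (377.9, 25)
rmanova = {
    "age": f_ratio(237.19, 3, *WITHIN_ERROR),
    "gender": f_ratio(140.5, 1, *BETWEEN_ERROR),
    "age:gender": f_ratio(13.99, 3, *WITHIN_ERROR),
}

for effect in anova:
    print(f"{effect:<11} ANOVA F = {anova[effect]:7.3f}   "
          f"RMANOVA F = {rmanova[effect]:7.3f}")
```

The two RMANOVA error terms add up to the pooled error of the simple ANOVA (148.13 + 377.9 ≈ 526.0), so nothing is gained or lost in total; the variability is only assigned to the level at which each effect is actually measured.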

Sphericity Assumption

In simple terms, the variances of the differences between all combinations of measurements should be equal when using univariate RMANOVA. This is referred to as the sphericity (or circularity) assumption. The within-subject statistics of RMANOVA rely strongly on the sphericity of the covariance matrix of the RM data; in cases that violate the sphericity assumption, within-subject RMANOVA statistics are meaningless. Given its name, i.e., "sphericity," readers may expect to encounter a relatively complicated algebraic concept; plain English is therefore used to aid understanding in the discussion below. Violations of sphericity may be evaluated using the sphericity test developed by Mauchly, which can be performed easily, or even automatically, in the majority of statistical software packages. Mauchly's test with a P > 0.05 (or 0.10 depending on your a priori assessment of the data) allows us to interpret the results of RMANOVA. Returning to the girls dataset, six pairwise differences were calculated (Table 4): 10-8, 12-8, ... , 14-12. The variances of the pairwise differences ranged from 0.60 to 1.74, which appears relatively wide; however, the Mauchly statistic (W) = 0.69, and the estimated P = 0.67, indicating that the girls dataset satisfies the sphericity assumption. A favorable result was expected for this dataset because there was no reasonable basis on which to assume the presence of another factor aside from age over the 2-year periods.
Table 4

Pairwise Differences in the "Girls Dataset" (n = 11)

Subject    10-8   12-8   14-8   12-10   14-10   14-12
F01        -1.0   0.5    2.0    1.5     3.0     1.5
F02        0.5    3.0    4.5    2.5     4.0     1.5
F03        3.5    4.0    5.5    0.5     2.0     1.5
F04        1.0    1.5    3.0    0.5     2.0     1.5
F05        1.5    1.0    2.0    -0.5    0.5     1.0
F06        1.0    1.0    2.5    0.0     1.5     1.5
F07        1.0    1.5    3.5    0.5     2.5     2.0
F08        0.0    0.5    1.0    0.5     1.0     0.5
F09        1.0    2.0    1.5    1.0     0.5     -0.5
F10        2.5    2.5    3.0    0.0     0.5     0.5
F11        0.5    3.5    3.5    3.0     3.0     0.0
Variance   1.4    1.4    1.7    1.2     1.4     0.6

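The variance row of Table 4 can be reproduced directly from Table 1. This is a minimal Python sketch under the assumption that the sample variance (denominator n - 1) was used; the function name `pairwise_difference_variances` is illustrative.

```python
from itertools import combinations
from statistics import variance

AGES = (8, 10, 12, 14)
# Girls dataset from Table 1: rows are subjects F01..F11, columns follow AGES.
GIRLS = [
    (21.0, 20.0, 21.5, 23.0),
    (21.0, 21.5, 24.0, 25.5),
    (20.5, 24.0, 24.5, 26.0),
    (23.5, 24.5, 25.0, 26.5),
    (21.5, 23.0, 22.5, 23.5),
    (20.0, 21.0, 21.0, 22.5),
    (21.5, 22.5, 23.0, 25.0),
    (23.0, 23.0, 23.5, 24.0),
    (20.0, 21.0, 22.0, 21.5),
    (16.5, 19.0, 19.0, 19.5),
    (24.5, 25.0, 28.0, 28.0),
]


def pairwise_difference_variances(rows, labels):
    """Sample variance of (later - earlier) for every pair of occasions."""
    columns = list(zip(*rows))
    result = {}
    for (i, early), (j, late) in combinations(enumerate(labels), 2):
        differences = [b - a for a, b in zip(columns[i], columns[j])]
        result[f"{late}-{early}"] = variance(differences)
    return result


variances = pairwise_difference_variances(GIRLS, AGES)
for pair, value in variances.items():
    print(f"{pair:>5}: {value:.2f}")
```

Under perfect sphericity these six variances would be equal in the population; the sample values only need to be similar enough for Mauchly's test not to reject.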
To enhance the reader's understanding of the concept of sphericity, the girls dataset was modified arbitrarily by multiplying the values obtained at 12 years of age by 2. With this modification, the Mauchly statistic W = 0.15 and P = 0.006, which indicates that the modified dataset violates the assumption. This arbitrary modification illustrates the relative rigidity of the sphericity assumption (Table 5). Because conditions between the repeated measurements should be uniform, we cannot anticipate that the assumption will be satisfied, especially when two or more conditions with brief intervals are added to a single RM dataset (e.g., administration of drugs and attempted endotracheal intubation). Such designs represent a substantial proportion of typical anesthesiology study designs.
Table 5

Mauchly's Test of Sphericity for the "Girls Dataset" (n = 11)

Dataset          Mauchly's test statistic W   P value
Original data    0.69474                      0.6746
Modified data*   0.14898                      0.0056

*Produced by multiplying the distances at 12 years of age by 2 (arbitrarily).
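For readers who wish to see what lies behind W, the statistic can be sketched in plain Python. The code below assumes the usual definition, W = det(A) / (tr(A)/(k-1))^(k-1), where A is the covariance matrix of the k repeated measures after transformation by k-1 orthonormal contrasts (Helmert contrasts are used here as one possible choice). All function names are illustrative assumptions; the article itself used the "car" package in R, and the P value (a chi-square approximation) is not computed here.

```python
import math

# Girls dataset from Table 1: rows are subjects, columns are ages 8, 10, 12, 14.
GIRLS = [
    (21.0, 20.0, 21.5, 23.0),
    (21.0, 21.5, 24.0, 25.5),
    (20.5, 24.0, 24.5, 26.0),
    (23.5, 24.5, 25.0, 26.5),
    (21.5, 23.0, 22.5, 23.5),
    (20.0, 21.0, 21.0, 22.5),
    (21.5, 22.5, 23.0, 25.0),
    (23.0, 23.0, 23.5, 24.0),
    (20.0, 21.0, 22.0, 21.5),
    (16.5, 19.0, 19.0, 19.5),
    (24.5, 25.0, 28.0, 28.0),
]


def covariance_matrix(rows):
    """Sample covariance matrix (denominator n - 1) of the repeated measures."""
    n, k = len(rows), len(rows[0])
    means = [sum(col) / n for col in zip(*rows)]
    return [[sum((r[a] - means[a]) * (r[b] - means[b]) for r in rows) / (n - 1)
             for b in range(k)] for a in range(k)]


def helmert_contrasts(k):
    """k - 1 orthonormal contrasts: each row sums to zero and has unit length."""
    contrasts = []
    for i in range(1, k):
        scale = math.sqrt(i * (i + 1))
        contrasts.append([1 / scale] * i + [-i / scale] + [0.0] * (k - i - 1))
    return contrasts


def matmul(a, b):
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def determinant(matrix):
    """Determinant by Gaussian elimination with partial pivoting."""
    m = [list(row) for row in matrix]
    det = 1.0
    for c in range(len(m)):
        pivot = max(range(c, len(m)), key=lambda r: abs(m[r][c]))
        if m[pivot][c] == 0:
            return 0.0
        if pivot != c:
            m[c], m[pivot] = m[pivot], m[c]
            det = -det
        det *= m[c][c]
        for r in range(c + 1, len(m)):
            factor = m[r][c] / m[c][c]
            m[r] = [x - factor * y for x, y in zip(m[r], m[c])]
    return det


def mauchly_w(rows):
    k = len(rows[0])
    contrasts = helmert_contrasts(k)
    transposed = [list(col) for col in zip(*contrasts)]
    a = matmul(matmul(contrasts, covariance_matrix(rows)), transposed)
    trace = sum(a[i][i] for i in range(k - 1))
    return determinant(a) / (trace / (k - 1)) ** (k - 1)


# Modified data: distances at 12 years of age multiplied by 2 (as in Table 5).
modified = [(a8, a10, 2 * a12, a14) for a8, a10, a12, a14 in GIRLS]
w_original = mauchly_w(GIRLS)
w_modified = mauchly_w(modified)
print(f"W (original) = {w_original:.3f}, W (modified) = {w_modified:.3f}")
```

W lies between 0 and 1, and values close to 1 are compatible with sphericity; the two values printed should be close to those listed in Table 5.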

Several "quick-and-dirty" adjustment procedures are available for RM data that violate the sphericity assumption, known as sphericity adjustments. Software packages usually provide factors (ε, epsilon) that adjust the degrees of freedom (d.f.) of the within-subject RMANOVA statistics. These include the Greenhouse-Geisser (ε̂) and Huynh-Feldt (ε̃) adjustment factors. By definition, the true ε = 1 when the sphericity assumption is fully satisfied, and smaller values indicate a greater departure from sphericity. In the modified girls dataset described above, the Greenhouse-Geisser ε̂ was estimated at 0.47, and the Huynh-Feldt ε̃ at 0.53. The effect of age has d.f. values of 3 (numerator) and 75 (denominator), such that the Huynh-Feldt adjusted d.f. values were as follows:

d.f. (numerator) = 3 × 0.53 = 1.59
d.f. (denominator) = 75 × 0.53 = 39.75
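The adjustment itself is simple arithmetic: both d.f. of the F test are multiplied by ε, and the P value is then read from the F distribution with the reduced d.f. A minimal sketch follows (the function name `adjust_df` is an assumption; the ε values and d.f. are those quoted in the text).

```python
def adjust_df(df_numerator, df_denominator, epsilon):
    """Scale both degrees of freedom of an F test by a sphericity factor."""
    if not 0 < epsilon <= 1:
        raise ValueError("epsilon is expected to lie in (0, 1]")
    return df_numerator * epsilon, df_denominator * epsilon


EPSILON_GG = 0.47  # Greenhouse-Geisser estimate quoted in the text
EPSILON_HF = 0.53  # Huynh-Feldt estimate quoted in the text

hf_numerator, hf_denominator = adjust_df(3, 75, EPSILON_HF)
gg_numerator, gg_denominator = adjust_df(3, 75, EPSILON_GG)
print(f"Huynh-Feldt:        d.f. = {hf_numerator:.2f}, {hf_denominator:.2f}")
print(f"Greenhouse-Geisser: d.f. = {gg_numerator:.2f}, {gg_denominator:.2f}")
```

The F value itself is unchanged by the adjustment; only the reference distribution becomes more conservative, so the adjusted P value is larger than the unadjusted one.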

Workarounds for RMANOVA

If the repetition has a single factor (e.g., only the time-based repetition), the calculation and interpretation of Mauchly's statistic would be easier. However, if there are more than two repetition factors, or they are nested, such calculations are rendered more difficult. Statisticians use two distinct methods to work around any violation of sphericity: multivariate analysis of variance (MANOVA) and mixed-effect modeling (MEM). Although MANOVA and MEM require more statistical knowledge, MANOVA is highly resistant to the violation of any assumption during the analysis of RM data, and MEM is a highly flexible method that uses user-defined variance structures; therefore, researchers should be familiar with both methods. We must also be aware that the editors of one international anesthesiology journal recommend MEM as the method of choice for the analysis of RM data [5]. The use of MEM should be confined to studies in which an effect of subject represents the primary concern [6].

Summary Measure Analysis

Because the statistics-heavy results and numerous P values generated by RMANOVA often confuse researchers, they sometimes fail to notice straightforward values within their data. Everitt and Rabe-Hesketh (2001) [7], and Frison and Pocock (1992) [8], suggested that researchers should extract more direct values from RM data, such as the overall mean, maximum (minimum) value, time to maximum (minimum) response, regression slope, and time to reach a particular value. Although there is no consensus regarding a gold standard summary measure, once a suitable measure has been identified for the data at hand, the analysis becomes more straightforward, such that a t-test or simple ANOVA can be applied. In our full dental dataset, the individual mean distances and maximum distances can be calculated readily and compared between genders using a t-test (Table 6).
Table 6

Results of a t-Test Applied to Summary Measures of Dental Distance in 27 Children

Summary measure         Boys         Girls        P value
Mean distance (mm)      25.0 (1.8)   22.6 (2.1)   0.01
Maximum distance (mm)   27.8 (2.2)   24.1 (2.4)   <0.01

Data are presented as means (SD).
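The reduction from four repeated values to a single summary value per subject can be sketched in a few lines of Python. Only the girls' column of Table 6 is reproduced here, because the boys' individual data are not listed in this article; the variable names are illustrative assumptions.

```python
from statistics import mean, stdev

# Girls dataset from Table 1: rows are subjects, columns are ages 8, 10, 12, 14.
GIRLS = [
    (21.0, 20.0, 21.5, 23.0),
    (21.0, 21.5, 24.0, 25.5),
    (20.5, 24.0, 24.5, 26.0),
    (23.5, 24.5, 25.0, 26.5),
    (21.5, 23.0, 22.5, 23.5),
    (20.0, 21.0, 21.0, 22.5),
    (21.5, 22.5, 23.0, 25.0),
    (23.0, 23.0, 23.5, 24.0),
    (20.0, 21.0, 22.0, 21.5),
    (16.5, 19.0, 19.0, 19.5),
    (24.5, 25.0, 28.0, 28.0),
]

# One summary value per subject: the repeated measures collapse to a single
# number, so the subjects can be compared with an ordinary two-sample test.
subject_means = [mean(row) for row in GIRLS]
subject_maxima = [max(row) for row in GIRLS]

print(f"Mean distance (mm):    {mean(subject_means):.1f} ({stdev(subject_means):.1f})")
print(f"Maximum distance (mm): {mean(subject_maxima):.1f} ({stdev(subject_maxima):.1f})")
```

With the same two lists computed for the boys, the between-gender comparison reduces to a two-sample t-test on 16 versus 11 independent values, and no sphericity assumption is required.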

References

· Frison L, Pocock SJ. Repeated measures in clinical trials: analysis using mean summary statistics and its implications for design. Stat Med. 1992.
· Park E, Cho M, Ki CS. Correct use of repeated measures analysis of variance. Korean J Lab Med. 2009.
· Ma Y, Mazumdar M, Memtsoudis SG. Beyond repeated-measures analysis of variance: advanced statistical methods for the analysis of longitudinal data in anesthesia research. Reg Anesth Pain Med. 2012.
· Kim HY. Statistical notes for clinical researchers: A one-way repeated measures ANOVA for data with repeated observations. Restor Dent Endod. 2015.