
cocor: a comprehensive solution for the statistical comparison of correlations.

Birk Diedenhofen, Jochen Musch

Abstract

A valid comparison of the magnitude of two correlations requires researchers to directly contrast the correlations using an appropriate statistical test. In many popular statistics packages, however, tests for the significance of the difference between correlations are missing. To close this gap, we introduce cocor, a free software package for the R programming language. The cocor package covers a broad range of tests, including comparisons of independent and dependent correlations with either overlapping or nonoverlapping variables. The package also includes an implementation of Zou's confidence interval for all of these comparisons. The platform-independent cocor package enhances the R statistical computing environment and is available for scripting. Two graphical user interfaces, a plugin for RKWard and a web interface, make cocor a convenient and user-friendly tool.

Year:  2015        PMID: 25835001      PMCID: PMC4383486          DOI: 10.1371/journal.pone.0121945

Source DB:  PubMed          Journal:  PLoS One        ISSN: 1932-6203            Impact factor:   3.240


Introduction

Determining the relationship between two variables is at the heart of many research endeavours. In the social sciences, the most popular statistical method to quantify the magnitude of an association between two numeric variables is the Pearson product-moment correlation. It indicates the strength of a linear relationship between two variables, which may be either positive, negative, or zero. In many research contexts, it is necessary to compare the magnitude of two such correlations, for example, if a researcher wants to know whether an association changed after a treatment, or whether it differs between two groups of interest. When comparing correlations, a test of significance is necessary to control for the possibility of an observed difference occurring simply by chance. However, many introductory statistics textbooks [1-5] do not even mention significance tests for correlations. Also in research practice, the necessity of conducting a proper statistical test when comparing the magnitude of correlations is often ignored. For example, in neuroscientific investigations, correlations between behavioral measures and brain areas are often determined to identify the brain area that is most strongly involved in a given task. Rousselet and Pernet [6] criticized that such studies rarely provide quantitative tests of the difference between correlations. Instead, many authors fall prey to a statistical fallacy, and wrongly consider the existence of a significant and a nonsignificant correlation as providing sufficient evidence for a significant difference between these two correlations. Nieuwenhuis, Forstmann, and Wagenmakers [7] also found that, when making a comparison between correlations, researchers frequently interpreted a significant correlation in one condition and a nonsignificant correlation in another condition as providing evidence for different correlations in the two conditions. Such an interpretation, however, is fallacious. 
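The fallacy described above can be made concrete with a small base-R sketch. The correlations and group sizes below are hypothetical values chosen only for illustration, and the helper names p_single and p_difference are not part of any package. With these values, one correlation is significant at the .05 level and the other is not, yet a direct test of their difference (Fisher's z for independent groups) is far from significant.

```r
# Hypothetical correlations and group sizes, chosen only for illustration.
r1 <- 0.30; n1 <- 50
r2 <- 0.15; n2 <- 50

# Two-sided p-value for H0: rho = 0, using the t distribution with n - 2 df.
p_single <- function(r, n) {
  t <- r * sqrt((n - 2) / (1 - r^2))
  2 * pt(-abs(t), df = n - 2)
}

# Two-sided p-value for H0: rho1 = rho2 in independent groups (Fisher's z).
p_difference <- function(r1, n1, r2, n2) {
  z <- (atanh(r1) - atanh(r2)) / sqrt(1 / (n1 - 3) + 1 / (n2 - 3))
  2 * pnorm(-abs(z))
}

p1 <- p_single(r1, n1)
p2 <- p_single(r2, n2)
p_diff <- p_difference(r1, n1, r2, n2)

# p1 falls below .05 and p2 does not, but p_diff is well above .05.
print(round(c(p1 = p1, p2 = p2, p_diff = p_diff), 3))
```

The pattern "significant versus nonsignificant" therefore says little about whether the two correlations differ; only the last p-value addresses that question.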
As pointed out by Rosnow and Rosenthal [8], “God loves the .06 nearly as much as the .05”. To make a valid, meaningful, and interpretable comparison between two correlations, it is necessary to directly contrast the two correlations under investigation using an appropriate statistical test [7]. Even when recognizing the importance of a formal statistical test of the difference between correlations, the researcher has many different significance tests to choose from, and the choice of the correct method is vital. Before picking a test, researchers have to distinguish between the following three cases: (1) The correlations were measured in two independent groups A and B. This case applies, for example, if a researcher wants to compare the correlations between anxiety and extraversion in two different groups A and B ($\rho_A = \rho_B$). If the two groups are dependent, the relationship between them needs further differentiation: (2) The two correlations can be overlapping ($\rho_{A12} = \rho_{A23}$), i.e., the correlations have one variable in common. $\rho_{A12}$ and $\rho_{A23}$ refer to the population correlations in group A between variables 1 and 2 and variables 2 and 3, respectively. For instance, a researcher may be interested in determining whether the correlation between anxiety and extraversion is smaller than between anxiety and diligence within the same group A. (3) In the case of two dependent correlations, the two correlations can also be nonoverlapping ($\rho_{A12} = \rho_{A34}$), i.e., they have no variable in common. This case applies, for example, if a researcher wants to determine whether the correlation between anxiety and extraversion is higher than the correlation between intelligence and creativity within the same group. A researcher also faces nonoverlapping dependent correlations when investigating whether the correlation between two variables is higher before rather than after a treatment provided to the same group. For each of these three cases, various tests have been proposed.
An overview of the tests for comparing independent correlations is provided in Table 1, and for comparing dependent correlations—overlapping and nonoverlapping—in Tables 2 and 3, respectively. May and Hittner [9] compared the statistical power and Type I error rate of several tests for dependent overlapping correlations, and found no test to be uniformly preferable. Instead, they concluded that the best choice is influenced by sample size, predictor intercorrelation, effect size, and predictor-criterion correlation. Because no clear recommendation for any of these tests can be formulated that applies under all circumstances, and because different methods may be optimal for a research question at hand, it is important that researchers are provided with a tool that allows them to choose freely between all available options. Detailed discussions of the competing tests for comparing dependent overlapping correlations are given in Dunn and Clark [10], Hittner, May, and Silver [11], May and Hittner [9], Neill and Dunn [12], and Steiger [13]. For the case of dependent nonoverlapping correlations, the pros and cons of various tests are discussed in Raghunathan, Rosenthal, and Rubin [14], Silver, Hittner, and May [15], and Steiger [13]. In contrast to most other approaches, Zou [16] has advocated a test that is based on the computation of confidence intervals, which are often regarded as superior to significance testing because they separately indicate the magnitude and the precision of an estimated effect [17, 18]. Confidence intervals can be used to test whether a correlation significantly differs from zero or from some constant, and whether the difference between two correlations exceeds a predefined threshold. Zou’s confidence interval [16] is available for comparisons of independent and dependent correlations with either overlapping or nonoverlapping variables. The tests proposed by Zou [16] have been compared to other confidence interval procedures by Wilcox [19].
Table 1

Software implementing tests for comparing two correlations based on independent groups.

Test psych multilevel Weaver & Wuensch cocor
Fisher’s [20] z
Zou’s [16] confidence interval
Table 2

Software implementing tests for comparing two correlations based on dependent groups with overlapping variables.

Test psych multilevel DEPCORR DEPCOR Weaver & Wuensch cocor
Pearson and Filon’s [21] z
Hotelling’s [22] t
Williams’ [23] t
Olkin’s [24] z
Dunn and Clark’s [25] z
Hendrickson et al.’s [26] modification of Williams’ [23] t
Steiger’s [13] modification of Dunn and Clark’s [25] z
Meng, Rosenthal, and Rubin’s [27] z
Hittner et al.’s [11] modification of Dunn and Clark’s [25] z
Zou’s [16] confidence interval
Table 3

Software implementing tests for comparing two correlations based on dependent groups with nonoverlapping variables.

Test psych DEPCOR Weaver & Wuensch cocor
Pearson and Filon’s [21] z
Dunn and Clark’s [25] z
Steiger’s [13] modification of Dunn and Clark’s [25] z
Raghunathan, Rosenthal, and Rubin’s [14] modification of Pearson and Filon’s [21] z
Silver, Hittner, and May’s [15] modification of Dunn and Clark’s [25] z
Zou’s [16] confidence interval

Existing Software

Many popular statistics programs provide none, or only a subset, of the significance tests described above. Moreover, existing programs that allow for statistical comparisons between correlations are isolated stand-alone applications and do not come with a graphical user interface (GUI). For example, DEPCOR [28] is a program that is limited to comparisons of two dependent correlations, either overlapping or nonoverlapping. The program is written in Fortran and runs in a DOS command prompt console under the Windows platform. Another available package, DEPCORR [29], is an SAS macro [30] for comparing two dependent overlapping correlations. The latest release of SAS/STAT software (version 9.4) runs on Windows and Linux systems. However, DEPCORR has no GUI and covers only one of the three cases described above. The two packages psych [31] and multilevel [32] for the R programming language [33] also offer functions to compare two dependent or independent correlations. However, each of these functions covers only one or two of the many available tests, and neither package offers a GUI for accessing its functions. Weaver and Wuensch [34] recently published thoroughly documented scripts for comparing dependent or independent correlations in SPSS and SAS.

cocor

With cocor (version 1.1-0), we provide a comprehensive solution to compare two correlations based on either dependent or independent groups. The cocor package enhances the R programming environment [33], which is freely available for Windows, Mac, and Linux systems. The package can be downloaded from CRAN (http://cran.r-project.org/package=cocor). To install it, type install.packages("cocor") in the R console; typing library("cocor") then makes the functionality of the package available. The function cocor() calculates and compares correlations from raw data. The underlying variables are specified via a formula interface (see Fig. 1). If raw data are not available, cocor offers three functions to compare correlation coefficients that have already been determined. The function cocor.indep.groups() compares two independent correlations, whereas the functions cocor.dep.groups.overlap() and cocor.dep.groups.nonoverlap() compare two dependent overlapping or nonoverlapping correlations, respectively. Internally, cocor() passes the calculated correlation coefficients to one of these three functions. All functions accept the argument null.value, which allows the user to test whether the difference between the correlations exceeds a given threshold using the confidence intervals by Zou [16]. By default, the results are returned as an S4 object of class cocor, whose input and result parameters can be obtained using the get.cocor.input() and get.cocor.results() functions, respectively. Optionally, results may instead be returned as a list of class htest. By default, all available tests are calculated. Specific tests can be selected by passing a test label to the function using the test argument. The flowchart in Fig. 1 shows how to access the available tests and lists them with their individual test label (e.g., zou2007). The formulae of all implemented tests are detailed in S1 Appendix.
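The interface described above can be sketched as follows. This is a minimal illustration, not a reference for the package: the formula syntax (two variable pairs separated by "|") is assumed from the package documentation, the threshold of 0.1 passed to null.value is an arbitrary illustrative value, and the calls are only executed if cocor is installed. The coefficients and group sizes are those of the independent-groups example in the Code Examples section.

```r
# Sketch only: the cocor calls run only if the package is installed.
if (requireNamespace("cocor", quietly = TRUE)) {
  library("cocor")

  # Raw-data route: formula interface applied to the bundled example data.
  # Assumption: each side of "|" names the two variables of one correlation.
  data("aptitude")
  print(cocor(~logic + intelligence.a | logic + intelligence.a, aptitude))

  # Coefficient route: correlations and group sizes supplied directly.
  res <- cocor.indep.groups(r1.jk = 0.3213, r2.hm = 0.2024,
                            n1 = 291, n2 = 334,
                            test = "zou2007",  # compute only Zou's interval
                            null.value = 0.1)  # illustrative threshold
  print(res)

  # Accessor functions named in the text for the returned S4 object.
  str(get.cocor.input(res))
  str(get.cocor.results(res))
} else {
  message("Package 'cocor' is not installed; install.packages(\"cocor\") adds it.")
}
```

Restricting the call to test = "zou2007" matters here because a threshold other than zero is evaluated with the confidence interval rather than with the z and t statistics.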
Fig 1

A flowchart of how to use the four main functions of cocor, displaying all available tests.

For each case, an example of the formula passed as an argument to the cocor() function and the required correlation coefficients for the functions cocor.indep.groups(), cocor.dep.groups.overlap(), and cocor.dep.groups.nonoverlap() are given. The test label before the colon may be passed as a function argument to calculate specific tests only.

A comparison of cocor with competing software can be found in Tables 1–3. These tables show that cocor offers a larger variety of tests and a more comprehensive approach than all previous solutions. In particular, cocor is the first R package to implement the tests by Zou [16]. Further unique features of the cocor package are the formula interface, which extracts the correlations to be compared from raw data, and the unified function for statistical tests, which can compare both independent and dependent correlations with either overlapping or nonoverlapping variables. Some limitations of cocor should be acknowledged, however. First, cocor is limited to the comparison of two correlations. The simultaneous comparison of more than two correlations needs tests that go beyond the scope of the present contribution [35-37]. Second, cocor does not allow one to employ structural equation models, which are needed for more advanced but also more complex approaches to the statistical comparison of correlations [38, 39].

GUIs for cocor

There are two convenient ways to use cocor via a GUI. First, the package includes a plugin for the platform independent R front-end RKWard [40] (Fig. 2). Second, for those unfamiliar with R, a web interface is also available at http://comparingcorrelations.org (Fig. 3).
Fig 2

Screenshot of the cocor GUI plugin for RKWard.

Fig 3

Screenshot of the cocor web interface on http://comparingcorrelations.org.

Thus, cocor offers the best of both worlds: On the one hand, it has the power of a scripting language with the possibility of automation. On the other hand, the two available GUIs allow even inexperienced users to use cocor in a convenient way. As cocor is embedded in the R environment for statistical computing, it integrates seamlessly into R analyses. R code can be generated via the GUIs and used for subsequent batch analyses. Since cocor is published under the GNU General Public License (GPL; version 3 or higher), all users are invited to inspect, use, copy, modify, and redistribute the code under the same license.

Code Examples

In the following, using fictional data, examples are given for all three cases that may occur when comparing correlations.

Comparison of Two Correlations Based on Independent Groups

The first example presents code for the comparison of the correlations between a score achieved on a logic task (logic) and an intelligence measure A (intelligence.a) in two different groups. Note that the underlying data set (aptitude) is a list that contains two separate data sets.

R> require("cocor")
R> data("aptitude")
R> cocor (

  Results of a comparison of two correlations based on independent groups

Comparison between r1.jk (logic, intelligence.a) = 0.3213 and r2.hm (logic, intelligence.a) = 0.2024
Difference: r1.jk - r2.hm = 0.1189
Data: sample1: j = logic, k = intelligence.a; sample2: h = logic, m = intelligence.a
Group sizes: n1 = 291, n2 = 334
Null hypothesis: r1.jk is equal to r2.hm
Alternative hypothesis: r1.jk is not equal to r2.hm (two-sided)
Alpha: 0.05

fisher1925: Fisher's z (1925)
  z = 1.5869, p-value = 0.1125
  Null hypothesis retained

zou2007: Zou's (2007) confidence interval
  95% confidence interval for r1.jk - r2.hm: -0.0281 0.2637
  Null hypothesis retained (Interval includes 0)

In this example, the test result indicates that the difference between the two correlations r1.jk and r2.hm is not significant, and the null hypothesis cannot be rejected. Alternatively, the same comparison can also be conducted based on the correlation coefficients and the group sizes using the function cocor.indep.groups().

R> cocor.indep.groups(r1.jk = 0.3213, r2.hm = 0.2024, n1 = 291,
+    n2 = 334)
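To show what the two reported tests compute, the following base-R sketch recalculates them from the printed coefficients. It is a hand-written illustration under the usual textbook formulas (Fisher's z test, and Zou's interval built from the Fisher-z confidence limits of each correlation), not cocor's own code, and the helper ci_single is a hypothetical name. Because the inputs are rounded to four decimals, the results should agree with the output above only up to rounding.

```r
# Inputs copied from the example output above.
r1 <- 0.3213; n1 <- 291
r2 <- 0.2024; n2 <- 334

# Fisher's z test for two independent correlations.
fisher_z <- (atanh(r1) - atanh(r2)) / sqrt(1 / (n1 - 3) + 1 / (n2 - 3))
fisher_p <- 2 * pnorm(-abs(fisher_z))

# Fisher-z based confidence limits for a single correlation.
ci_single <- function(r, n, conf.level = 0.95) {
  half <- qnorm(1 - (1 - conf.level) / 2) / sqrt(n - 3)
  tanh(atanh(r) + c(-1, 1) * half)
}

# Zou's interval for r1 - r2 combines the two single-correlation intervals.
ci1 <- ci_single(r1, n1)
ci2 <- ci_single(r2, n2)
zou_lower <- (r1 - r2) - sqrt((r1 - ci1[1])^2 + (ci2[2] - r2)^2)
zou_upper <- (r1 - r2) + sqrt((ci1[2] - r1)^2 + (r2 - ci2[1])^2)

print(round(c(z = fisher_z, p = fisher_p,
              lower = zou_lower, upper = zou_upper), 4))
```

The interval includes zero, which matches the decision of the z test to retain the null hypothesis.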

Comparison of Two Overlapping Correlations Based on Dependent Groups

The second example code determines whether the correlation between a score achieved on general knowledge questions (knowledge) and an intelligence measure A (intelligence.a) differs from the correlation between a score achieved on a logic task (logic) and the same intelligence measure A (intelligence.a) within a group of n = 291 persons.

R> cocor (

  Results of a comparison of two overlapping correlations based on dependent groups

Comparison between r.jk (intelligence.a, knowledge) = 0.1038 and r.jh (intelligence.a, logic) = 0.3213
Difference: r.jk - r.jh = -0.2175
Related correlation: r.kh = 0.0257
Data: aptitude[["sample1"]]: j = intelligence.a, k = knowledge, h = logic
Group size: n = 291
Null hypothesis: r.jk is equal to r.jh
Alternative hypothesis: r.jk is not equal to r.jh (two-sided)
Alpha: 0.05

pearson1898: Pearson and Filon's z (1898)
  z = -2.7914, p-value = 0.0052
  Null hypothesis rejected

hotelling1940: Hotelling's t (1940)
  t = -2.8066, df = 288, p-value = 0.0053
  Null hypothesis rejected

williams1959: Williams' t (1959)
  t = -2.7743, df = 288, p-value = 0.0059
  Null hypothesis rejected

olkin1967: Olkin's z (1967)
  z = -2.7914, p-value = 0.0052
  Null hypothesis rejected

dunn1969: Dunn and Clark's z (1969)
  z = -2.7595, p-value = 0.0058
  Null hypothesis rejected

hendrickson1970: Hendrickson, Stanley, and Hills' (1970) modification of Williams' t (1959)
  t = -2.8065, df = 288, p-value = 0.0053
  Null hypothesis rejected

steiger1980: Steiger's (1980) modification of Dunn and Clark's z (1969) using average correlations
  z = -2.7513, p-value = 0.0059
  Null hypothesis rejected

meng1992: Meng, Rosenthal, and Rubin's z (1992)
  z = -2.7432, p-value = 0.0061
  Null hypothesis rejected
  95% confidence interval for r.jk - r.jh: -0.3925 -0.0654
  Null hypothesis rejected (Interval does not include 0)

hittner2003: Hittner, May, and Silver's (2003) modification of Dunn and Clark's z (1969) using a backtransformed average Fisher's (1921) Z procedure
  z = -2.7505, p-value = 0.0059
  Null hypothesis rejected

zou2007: Zou's (2007) confidence interval
  95% confidence interval for r.jk - r.jh: -0.3689 -0.0630
  Null hypothesis rejected (Interval does not include 0)

The results of all tests lead to the convergent conclusion that the difference between the two correlations r.jk and r.jh is significant, and the null hypothesis should be rejected. Alternatively, the same comparison can also be conducted based on the correlation coefficients and the group size using the function cocor.dep.groups.overlap().

R> cocor.dep.groups.overlap(r.jk = 0.1038, r.jh = 0.3213,
+    r.kh = 0.0257, n = 291)
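Two of the statistics in this output can be recomputed by hand to show how the related correlation r.kh enters the test. The base-R sketch below uses the standard published forms of Hotelling's t and Williams' t, both with n - 3 degrees of freedom; it is an illustration written for this purpose rather than cocor's implementation, and the helper p_value is a hypothetical name. Inputs are the rounded coefficients printed above, so agreement with the output is approximate.

```r
# Inputs copied from the example output above.
r.jk <- 0.1038; r.jh <- 0.3213; r.kh <- 0.0257; n <- 291

# Determinant of the 3 x 3 correlation matrix of j, k, and h.
det_R <- 1 - r.jk^2 - r.jh^2 - r.kh^2 + 2 * r.jk * r.jh * r.kh

# Hotelling's (1940) t.
t_hotelling <- (r.jk - r.jh) * sqrt((n - 3) * (1 + r.kh) / (2 * det_R))

# Williams' (1959) t, which adds a correction term to the denominator.
r_bar <- (r.jk + r.jh) / 2
t_williams <- (r.jk - r.jh) * sqrt(
  (n - 1) * (1 + r.kh) /
    (2 * (n - 1) / (n - 3) * det_R + r_bar^2 * (1 - r.kh)^3)
)

# Two-sided p-value from the t distribution.
p_value <- function(t, df) 2 * pt(-abs(t), df)

print(round(c(hotelling = t_hotelling, williams = t_williams), 4))
print(round(c(hotelling = p_value(t_hotelling, n - 3),
              williams = p_value(t_williams, n - 3)), 4))
```

The extra term in Williams' denominator makes its statistic slightly smaller in absolute value than Hotelling's, which is the pattern seen in the output.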

Comparison of Two Nonoverlapping Correlations Based on Dependent Groups

The third example code tests whether the correlation between a score achieved on general knowledge questions (knowledge) and an intelligence measure A (intelligence.a) differs from the correlation between a score achieved on a logic task (logic) and an intelligence measure B (intelligence.b) within the same group of n = 291 persons.

R> cocor (

  Results of a comparison of two nonoverlapping correlations based on dependent groups

Comparison between r.jk (knowledge, intelligence.a) = 0.1038 and r.hm (logic, intelligence.b) = 0.2679
Difference: r.jk - r.hm = -0.164
Related correlations: r.jh = 0.0257, r.jm = 0.1713, r.kh = 0.3213, r.km = 0.4731
Data: aptitude[["sample1"]]: j = knowledge, k = intelligence.a, h = logic, m = intelligence.b
Group size: n = 291
Null hypothesis: r.jk is equal to r.hm
Alternative hypothesis: r.jk is not equal to r.hm (two-sided)
Alpha: 0.05

pearson1898: Pearson and Filon's z (1898)
  z = -2.0998, p-value = 0.0357
  Null hypothesis rejected

dunn1969: Dunn and Clark's z (1969)
  z = -2.0811, p-value = 0.0374
  Null hypothesis rejected

steiger1980: Steiger's (1980) modification of Dunn and Clark's z (1969) using average correlations
  z = -2.0755, p-value = 0.0379
  Null hypothesis rejected

raghunathan1996: Raghunathan, Rosenthal, and Rubin's (1996) modification of Pearson and Filon's z (1898)
  z = -2.0811, p-value = 0.0374
  Null hypothesis rejected

silver2004: Silver, Hittner, and May's (2004) modification of Dunn and Clark's z (1969) using a backtransformed average Fisher's (1921) Z procedure
  z = -2.0753, p-value = 0.0380
  Null hypothesis rejected

zou2007: Zou's (2007) confidence interval
  95% confidence interval for r.jk - r.hm: -0.3162 -0.0095
  Null hypothesis rejected (Interval does not include 0)

Also in this example, the test results converge in showing that the difference between the two correlations r.jk and r.hm is significant, and the null hypothesis should be rejected. Alternatively, the same comparison can also be conducted based on the correlation coefficients and the group size using the function cocor.dep.groups.nonoverlap().

R> cocor.dep.groups.nonoverlap(r.jk = 0.1038, r.hm = 0.2679,
+    r.jh = 0.0257, r.jm = 0.1713, r.kh = 0.3213,
+    r.km = 0.4731, n = 291)
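The role of the four related correlations can be illustrated by recomputing the first statistic in this output. The base-R sketch below uses the commonly cited form of Pearson and Filon's z for nonoverlapping correlations, in which the related correlations enter through a covariance term k; this form is an assumption of the sketch, not a copy of cocor's code. Inputs are the rounded coefficients printed above, so the result matches the output only approximately.

```r
# Inputs copied from the example output above.
r.jk <- 0.1038; r.hm <- 0.2679
r.jh <- 0.0257; r.jm <- 0.1713; r.kh <- 0.3213; r.km <- 0.4731
n <- 291

# Covariance term built from the four related correlations.
k <- (r.jh - r.jk * r.kh) * (r.km - r.kh * r.hm) +
     (r.jm - r.jh * r.hm) * (r.kh - r.jk * r.jh) +
     (r.jh - r.jm * r.hm) * (r.km - r.jk * r.jm) +
     (r.jm - r.jk * r.km) * (r.kh - r.km * r.hm)

# Pearson and Filon's (1898) z and its two-sided p-value.
z_pf <- sqrt(n) * (r.jk - r.hm) /
  sqrt((1 - r.jk^2)^2 + (1 - r.hm^2)^2 - k)
p_pf <- 2 * pnorm(-abs(z_pf))

print(round(c(k = k, z = z_pf, p = p_pf), 4))
```

Setting k to zero would treat the two correlations as if they came from independent samples; the positive k obtained here shrinks the standard error and makes the test more sensitive.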

Discussion and Summary

In this article, we introduced cocor, a free software package for the R programming language [33]. The cocor package provides a wide range of tests for comparisons of independent and dependent correlations with either overlapping or nonoverlapping variables. Unlike existing solutions, cocor is available for scripting within the R environment, while offering two convenient GUIs: a plugin for RKWard [40] and a web interface. Thus, cocor enables users of all knowledge levels to access a large variety of tests for comparing correlations in a convenient and user-friendly way.

Documentation of All Tests Implemented in cocor.

(PDF)

1.  A Monte Carlo evaluation of tests for comparing dependent correlations.

Authors:  James B Hittner; Kim May; N Clayton Silver
Journal:  J Gen Psychol       Date:  2003-04

2.  Toward using confidence intervals to compare correlations.

Authors:  Guang Yong Zou
Journal:  Psychol Methods       Date:  2007-12

3.  Erroneous analyses of interactions in neuroscience: a problem of significance.

Authors:  Sander Nieuwenhuis; Birte U Forstmann; Eric-Jan Wagenmakers
Journal:  Nat Neurosci       Date:  2011-08-26       Impact factor: 24.884

4.  incor: a computer program for testing differences among independent correlations.

Authors:  N C Silver; H Zaikina; J B Hittner; K May
Journal:  Mol Ecol Resour       Date:  2008-07       Impact factor: 7.090

5.  SPSS and SAS programs for comparing Pearson correlations and OLS regression coefficients.

Authors:  Bruce Weaver; Karl L Wuensch
Journal:  Behav Res Methods       Date:  2013-09

6.  Improving standards in brain-behavior correlation analyses.

Authors:  Guillaume A Rousselet; Cyril R Pernet
Journal:  Front Hum Neurosci       Date:  2012-05-03       Impact factor: 3.169
