Better ways to improve standards in brain-behavior correlation analysis.

Dietrich Samuel Schwarzkopf, Benjamin De Haas, Geraint Rees.

Year:  2012        PMID: 22811662      PMCID: PMC3397314          DOI: 10.3389/fnhum.2012.00200

Source DB:  PubMed          Journal:  Front Hum Neurosci        ISSN: 1662-5161            Impact factor:   3.169


Rousselet and Pernet (2012) demonstrate that outliers can skew Pearson correlation. Having selected and re-analyzed a cohort of published studies, they claim that this leads to widespread statistical errors. However, they report neither the study identities nor the inclusion criteria for this survey, so their claim cannot be independently replicated. Moreover, because their selection criteria are based on the authors’ belief that a study used misleading statistics, their study represents an example of “double dipping” (Kriegeskorte et al., 2009). The strong claims they make about the literature are therefore circular and unjustified by their data. Their purely statistical approach also does not consider the biological context when deciding which observations constitute outliers. In their discussion, they propose that the skipped correlation (Wilcox, 2005) is an appropriate alternative to the Pearson correlation that is robust to outliers. However, this test lacks statistical power to detect true relationships (Figure 1A) and is highly prone to false positives (Figure 1B). These factors conspire to drastically reduce the sensitivity of this test in comparison to other procedures (Appendix 1). Further, it is sensitive to the parameters chosen for the minimum covariance estimator used to identify outliers, but these parameters are not reported.
Figure 1

Statistical power (A) and false positive rates (B) for four statistical tests and four sample sizes based on 10,000 simulations (see . Outliers can drastically inflate false positives for Pearson correlation (note the difference in scale for this test). Skipped correlation (Wilcox, 2005) is generally very susceptible to false positives under all conditions. Only Shepherd’s pi provides adequate statistical power and protection against false positives. The black line in (B) denotes the nominal false positive rate of 0.05. (C) Replot of data shown in Rousselet and Pernet’s Figure 2. The contour lines indicate the bootstrapped Mahalanobis distance Ds from the bivariate mean in steps of six squared units (purple colors denote greater distances). Filled circles denote data included in the correlation, open circles denote outliers (see Appendix 2 for details). The solid line is a linear regression over the data after outlier removal. The correlation statistics shown are Spearman’s rho, skipped correlation r′ (critical t in parentheses), and Shepherd’s pi. Asterisks indicate significant results. All p-statistics rounded to third decimal. The freely available LIBRA toolbox (Verboven and Hubert, 2005) was used to calculate the skipped correlation. While the exact estimates of the t-statistic differ between R and MATLAB the conclusions about significance for these tests are very similar.

Their argument fails to consider a broad literature on robust statistics, although an extensive review is outside the scope of this commentary. We limit ourselves instead to presenting a practical alternative to their approach: Shepherd’s pi correlation (http://www.fil.ion.ucl.ac.uk/~sschwarz/Shepherd.zip). We identify outliers by bootstrapping the Mahalanobis distance, Ds, of each observation from the bivariate mean and excluding all points whose average Ds is 6 or greater. Shepherd’s pi is Spearman’s rho but the p-statistic is doubled to account for outlier removal (Appendix 2).
This compares very well in power (Figure 1A) to other tests and is more robust to the presence of influential outliers (Figure 1B). We replot the data Rousselet and Pernet presented in their Figure 2. The conclusions drawn from Shepherd’s pi are comparable to skipped correlation but less strict in situations where a relationship is likely (Figure 1C, Figures A1 and A2 in Appendix).
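The procedure described above can be sketched in a few lines of Python. This is only an illustrative sketch, not the authors' MATLAB implementation from the linked archive: the function names, the number of bootstrap resamples, the use of the squared Mahalanobis distance, and the choice to re-estimate mean and covariance from each resample are all assumptions made here for illustration. It relies on NumPy and on `scipy.stats.spearmanr`.

```python
import numpy as np
from scipy import stats


def bootstrap_mahalanobis(x, y, n_boot=1000, seed=0):
    """Average squared Mahalanobis distance of each observation.

    Assumption: mean and covariance are re-estimated from every bootstrap
    resample, and each original point is measured against them.
    """
    rng = np.random.default_rng(seed)
    data = np.column_stack([x, y]).astype(float)
    n = data.shape[0]
    total = np.zeros(n)
    for _ in range(n_boot):
        sample = data[rng.integers(0, n, size=n)]  # resample pairs with replacement
        centre = sample.mean(axis=0)
        cov = np.cov(sample, rowvar=False)
        diff = data - centre
        # pinv guards against a (very unlikely) singular covariance matrix
        total += np.einsum("ij,jk,ik->i", diff, np.linalg.pinv(cov), diff)
    return total / n_boot


def shepherd_pi(x, y, threshold=6.0, n_boot=1000, seed=0):
    """Spearman's rho after outlier removal, with the p-value doubled."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    ds = bootstrap_mahalanobis(x, y, n_boot=n_boot, seed=seed)
    keep = ds < threshold                 # exclude points with average Ds >= 6
    rho, p = stats.spearmanr(x[keep], y[keep])
    p = min(1.0, 2.0 * p)                 # doubled to account for outlier removal
    return rho, p, ~keep


# Hypothetical demo data: a moderate positive relationship plus one gross outlier.
rng = np.random.default_rng(1)
x = rng.normal(size=30)
y = 0.6 * x + 0.8 * rng.normal(size=30)
x[0], y[0] = 8.0, -8.0
pi, p, outliers = shepherd_pi(x, y)
print(f"pi = {pi:.3f}, p = {p:.3f}, outliers removed = {int(outliers.sum())}")
```

In this demo the point at (8, −8) lies far from the bulk of the data, so its average distance is well above the cutoff and it is excluded before the rank correlation is computed.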
Figure A1

Data shown in Figure . All conventions are as in Figure 1C.

Figure A2

Data shown in Figure . All conventions are as in Figure 1C.

Consider for instance the data in Figure 1C-1. Pearson and Spearman correlation applied to these data are comparable. This implies that the assumptions of Pearson’s r were probably met in this case. The skipped correlation (r′) does not reach significance but nevertheless shows a similar relationship, consistent with our demonstration above that it is too conservative a measure. Under Shepherd’s pi, however, the relationship between these variables is significant. Indeed, reflecting our intimate knowledge of these data (Schwarzkopf et al., 2011), we already know that the relationship studied here replicates for separate behavioral measures (see Schwarzkopf et al., 2011 SOM). A similar pattern was observed for other data, e.g., Figure 1C-2. In some cases skipped correlation even removes the majority of data as outliers (e.g., their Figure 2E), which borders on the absurd.
Rousselet and Pernet also claim that none of the studies that they surveyed considered the correlation coefficient and its confidence intervals. Cohen (1988) defined correlations with 0.3 < r < 0.5 as being of medium strength. Even “strong” correlations need only have r > 0.5, that is, explain at least 25% of the variance. A correlation accounting for ~15% of variance is thus not particularly “modest” as they state. Naturally, this taxonomy is somewhat arbitrary, but when relating complex cognitive functions to brain measures we are unlikely to find very high r, except for trivial relationships (Yarkoni, 2009).
Their failure to find reported confidence intervals in the literature is also puzzling because it does not accurately reflect the published work they considered. For example, our study, reproduced in their Figure 2A, reported bootstrapped 95% confidence intervals in the figure (Schwarzkopf et al., 2011). They also do not consider important aspects of what confidence intervals reflect.
Naturally, a confidence interval is an indicator of the certainty with which the effect size can be estimated. However, it depends on three factors: the strength of the correlation, the sample size, and the data distribution. Because Pearson correlation assumes a Gaussian distribution we can predict the confidence interval for any given r. If the bootstrapped confidence interval differs from this prediction, the data probably do not meet the assumptions. Rousselet and Pernet’s example for bivariate outliers (their Figure 1D) illustrates this: the predicted confidence interval for r = 0.49 with n = 17 should be (0.01, 0.79). However, the bootstrapped confidence interval for this example is (−0.19, 0.87), much wider and also overlapping zero. This indicates that outliers skew the correlation and that it should not be considered significant. Compare this to Figure 1C-1 (their Figure 2A): the nominal confidence interval should be (−0.65, −0.02); the actual bootstrapped interval is very similar: (−0.67, −0.03). Therefore, the use of Pearson/Spearman correlation was justified here.
We propose simple guidelines to follow when testing correlations. First, use Spearman’s rho because it also captures monotonic non-linear relationships. Second, bootstrap confidence intervals. Third, if the bootstrapped interval differs from the nominal interval, apply Shepherd’s pi as a more robust test. Fourth, estimate the reliability of individual observations, especially in cases where outliers strongly affect results. Outliers are frequently the result of artifacts or measurement error.
Our last point highlights an important general concern we have with Rousselet and Pernet’s argument. Statistical tests are important tools to be used by researchers for interpreting their data. However, the goal of neuroscience is to answer biologically relevant questions, not to produce statistically significant results. No statistical procedure can determine whether a biological question is valid or if a theory is sound. Rather, one has to inspect each finding and each data point in its own right, evaluating the data quality and the potential confounds on a case-by-case basis. The identification of outliers should not rest solely on statistical tests but must take biological interpretation into account (Bertolino, 2011; Schott and Düzel, 2011). And finally, there is only one way any finding can be considered truly significant: when upon repeated replication it passes the test of time.
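The comparison between a nominal and a bootstrapped confidence interval discussed above can be illustrated with a short sketch. It assumes that the "predicted" interval is the standard Fisher z-transformation interval for Pearson's r, which reproduces the (0.01, 0.79) quoted for r = 0.49 and n = 17, and that the bootstrapped interval is a simple percentile bootstrap over resampled pairs. The function names and the demo data are hypothetical and only the Python standard library is used.

```python
import math
import random
from statistics import NormalDist


def fisher_ci(r, n, level=0.95):
    """Nominal confidence interval for Pearson's r via the Fisher z-transform."""
    z = math.atanh(r)
    half_width = NormalDist().inv_cdf(0.5 + level / 2) / math.sqrt(n - 3)
    return math.tanh(z - half_width), math.tanh(z + half_width)


def pearson_r(xs, ys):
    """Plain Pearson correlation coefficient."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(xs, ys))
    sxx = sum((a - mx) ** 2 for a in xs)
    syy = sum((b - my) ** 2 for b in ys)
    return sxy / math.sqrt(sxx * syy)


def bootstrap_ci(xs, ys, level=0.95, n_boot=2000, seed=0):
    """Percentile bootstrap interval: resample (x, y) pairs with replacement."""
    rng = random.Random(seed)
    pairs = list(zip(xs, ys))
    rs = []
    while len(rs) < n_boot:
        sx, sy = zip(*rng.choices(pairs, k=len(pairs)))
        if len(set(sx)) > 1 and len(set(sy)) > 1:  # skip degenerate resamples
            rs.append(pearson_r(sx, sy))
    rs.sort()
    lo = rs[int((1 - level) / 2 * n_boot)]
    hi = rs[int((1 + level) / 2 * n_boot) - 1]
    return lo, hi


# Nominal interval for the bivariate-outlier example (r = 0.49, n = 17).
lo, hi = fisher_ci(0.49, 17)
print(round(lo, 2), round(hi, 2))  # 0.01 0.79

# Hypothetical data: compare the two intervals for a small sample.
xs = [1.0, 2.1, 2.9, 4.2, 5.1, 5.8, 7.2, 8.1, 9.0, 10.3]
ys = [2.3, 1.9, 3.8, 3.1, 5.6, 4.9, 6.0, 7.7, 6.9, 9.1]
print(fisher_ci(pearson_r(xs, ys), len(xs)), bootstrap_ci(xs, ys))
```

A bootstrapped interval that is markedly wider than the nominal one, or that overlaps zero when the nominal one does not, is the signal described in the third guideline for switching to a more robust test.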
  5 in total

1.  Circular analysis in systems neuroscience: the dangers of double dipping.

Authors:  Nikolaus Kriegeskorte; W Kyle Simmons; Patrick S F Bellgowan; Chris I Baker
Journal:  Nat Neurosci       Date:  2009-05       Impact factor: 24.884

2.  Robust statistics show no evidence for a relationship between fiber density and memory performance.

Authors:  Guillaume A Rousselet; Cyril R Pernet
Journal:  Proc Natl Acad Sci U S A       Date:  2011-08-05       Impact factor: 11.205

3.  Big Correlations in Little Studies: Inflated fMRI Correlations Reflect Low Statistical Power-Commentary on Vul et al. (2009).

Authors:  Tal Yarkoni
Journal:  Perspect Psychol Sci       Date:  2009-05

4.  The surface area of human V1 predicts the subjective experience of object size.

Authors:  D Samuel Schwarzkopf; Chen Song; Geraint Rees
Journal:  Nat Neurosci       Date:  2010-12-05       Impact factor: 24.884

5.  Improving standards in brain-behavior correlation analyses.

Authors:  Guillaume A Rousselet; Cyril R Pernet
Journal:  Front Hum Neurosci       Date:  2012-05-03       Impact factor: 3.169
