
Evidence of reporting biases in voxel-based morphometry (VBM) studies of psychiatric and neurological disorders.

Paolo Fusar-Poli, Joaquim Radua, Marianna Frascarelli, Andrea Mechelli, Stefan Borgwardt, Fabio Di Fabio, Massimo Biondi, John P A Ioannidis, Sean P David.

Abstract

OBJECTIVES: To evaluate whether biases may influence the findings of whole-brain structural imaging literature.
METHODS: Forty-seven whole-brain voxel-based meta-analyses including voxel-based morphometry (VBM) studies in neuropsychiatric conditions were included, for a total of 324 individual VBM studies. The total sample size, the overall number of foci, and different moderators were extracted both at the level of the individual studies and at the level of the meta-analyses.
RESULTS: Sample size ranged from 12 to 545 (median n = 47) per VBM study. The median number of reported foci per study was six. VBM studies with larger sample sizes reported only slightly more abnormalities than smaller studies (2% increase in the number of foci per 10-patients increase in sample size). A similar pattern was seen in several analyses according to different moderator variables with some possible modulating evidence for the statistical threshold employed, publication year and number of coauthors. Whole-brain meta-analyses (median sample size n = 534) found fewer foci (median = 3) than single studies and overall they showed no significant increase in the number of foci with increasing sample size. Meta-analyses with ≥10 VBM studies reported a median of three foci and showed a significant increase with increasing sample size, while there was no relationship between sample size and number of foci (median = 5) in meta-analyses with <10 VBM studies.
CONCLUSIONS: The number of foci reported in small VBM studies and even in meta-analyses with few studies may often be inflated. This picture is consistent with reporting biases affecting small studies.
Copyright © 2013 The Authors. Human Brain Mapping published by Wiley Periodicals, Inc.

Keywords:  VBM; bias; dementia; neuroimaging; psychosis; structural

Year:  2013        PMID: 24123491      PMCID: PMC6869352          DOI: 10.1002/hbm.22384

Source DB:  PubMed          Journal:  Hum Brain Mapp        ISSN: 1065-9471            Impact factor:   5.038


INTRODUCTION

Structural magnetic resonance imaging (sMRI) studies have been carried out in many neuropsychiatric conditions, including psychosis, depression, dementia, attention deficit hyperactivity disorder (ADHD) and autistic disorders. Often, the morphometric measurements used in these studies have been obtained from a priori regions of interest (ROIs) that can be clearly defined (such as the hippocampi or the ventricles) [Ashburner and Friston, 2000]. However, a number of morphometric features are more difficult to quantify by inspection, meaning that many structural differences may be overlooked. The caveat of ROI‐based structural analyses is that, because of these difficulties, researchers can introduce a large source of heterogeneity that undermines the consistency of their results. These problems may ultimately prevent clinical application of sMRI to psychiatry [Borgwardt and Fusar‐Poli, 2012]. To address this limitation, an advanced structural imaging technique has been introduced and widely applied. Voxel‐based morphometry (VBM) involves a voxel‐wise comparison of the local concentration of gray matter between two or more groups of subjects. The procedure usually involves spatially normalizing high‐resolution images from all the subjects in the study into the same stereotactic space, segmenting the gray and white matter and smoothing the resulting gray‐matter segments. Some protocols also include a ‘modulation’ step, but its effects are disputed [Radua et al., 2013]. Voxel‐wise parametric or non‐parametric statistical tests comparing the experimental groups are then performed, correcting for multiple comparisons. The value of this automated analytical approach is that it gives an “even‐handed and comprehensive assessment of anatomical differences throughout the brain” without necessarily biasing attention a priori to a specific ROI [Laird et al., 2005].
Because of this, VBM is considered an objective method to analyze whole‐brain structural abnormalities in neuroscience and psychiatric research and to bridge structural neuroimaging toward clinical applications. However, it is unclear whether the current VBM literature is still affected by biases, in particular publication and other selective reporting biases, where investigators selectively report statistically significant results and under‐report non‐significant findings, as noted for sMRI studies [Ioannidis, 2011]. Detecting these biases in single studies is difficult: unpublished studies are very hard to unearth, and unless the original protocol is available it is not possible to check whether the presented results are more favorable (e.g., claim more discovered foci with abnormalities) than an analysis based on the original protocol. However, one may obtain hints about the presence of such biases when many studies have been performed. In the absence of bias, one would expect power to detect abnormalities to improve when sample size increases, other things being equal. Conversely, when such biases are present, small studies with unimpressive null results may remain unpublished, or they may be analyzed in a way that yields more foci. Evidence from many different scientific fields suggests that bias may affect large studies to a lesser extent, since these are likely to be published regardless of their results and analytical manipulation may be less prominent [Rothstein et al., 2005]. The first aim of this study was to evaluate the relationship between sample size and reported discovered foci in published VBM studies across different neuropsychiatric conditions so as to probe into the possibility of reporting biases preferentially affecting smaller studies.
This was achieved by focusing on available voxel‐based meta‐analyses of VBM studies, such as those performed with Activation Likelihood Estimation [Laird et al., 2005] or Signed Differential Mapping [Radua and Mataix‐Cols, 2009, 2012; Radua et al., 2010a, 2012a]. These meta‐analyses attempt to reconcile contrasting and inconclusive individual VBM findings by obtaining larger sample sizes with an associated greater statistical power. Thus, we assessed whether larger sample sizes are associated with a larger number of identified foci. If, conversely, small studies claim to identify the same number of foci as larger ones, or even more, this would offer evidence for bias. The second aim of this study was to explore the impact of a number of variables on the relationship between sample size and number of reported foci in the VBM literature, including the type of neuropsychiatric disorder, the publication year, the sample size of the study, the slice thickness of the images, the degree of smoothing, the software used to preprocess or perform the statistical analysis of the data, the statistical threshold employed, and the use of small volume correction (SVC) in the analysis. Our third aim was to evaluate whether sample size was related to the number of reported foci in meta‐analyses of VBM studies across different psychiatric conditions and whether these meta‐analyses report more or fewer foci than the much smaller studies that they include.

METHODS

Search Strategy

We conducted a four‐step literature search. First, we searched PubMed using the Boolean terms “voxel‐based morphometry meta‐analysis.” All publications listed in PubMed prior to August 1, 2012 were included. Second, we searched the bibliographies of the BrainMap (http://brainmap.org/pubs/) and SDM (http://sdmproject.com) databases (last search performed on July 31, 2012). All eligible publications were included. Third, we hand searched the references of the included publications to minimize the possibility of biases in the literature search. Full texts were pulled for all potentially eligible publications. The retrieved publications then underwent an initial culling of ineligible and duplicate analyses. These publications were then hand searched for inclusion criteria and selected by two analysts independently (MF & PFP), with any discrepancies adjudicated until 100% rater agreement was achieved. To achieve a high standard of reporting we adopted the ‘Preferred Reporting Items for Systematic Reviews and Meta‐Analyses’ (PRISMA) guidelines [Liberati et al., 2009]. In the final step, we searched for and collected all the individual studies listed in each included article.

Controlling for Independent Data

Many of the included articles conducted different meta‐analyses for more than one condition (sub‐meta‐analyses). These sub‐meta‐analyses were each considered separately for inclusion in our study. To avoid inclusion of overlapping data, when two meta‐analyses included overlapping sets of studies in a similar contrast, we retained only the more recent meta‐analysis with the larger number of studies and sample size. Where the same paper included both an overall meta‐analysis and separate sub‐analyses for different conditions, the sub‐analyses were preferentially included (with the overall meta‐analysis being considered a duplicate and excluded). As a consequence, all the included meta‐analyses addressed different between‐groups contrasts. We then carefully searched the included articles at the level of each individual study, and studies were compiled and compared a second time to eliminate overlapping samples and ensure that no individual study was double counted.

Inclusion Criteria

Articles were included in our analysis if they were independent (see above) whole‐brain voxel‐based meta‐analyses of magnetic resonance imaging (MRI) studies of the human brain. Meta‐analyses were eligible regardless of the neurological or psychiatric condition investigated. Exclusion criteria were (i) meta‐analyses of ROI (not whole‐brain) studies, (ii) structural modalities other than VBM (e.g., diffusion tensor imaging, cortical pattern matching), (iii) functional brain imaging meta‐analyses, (iv) non‐human studies, and (v) overlapping meta‐analyses. Meta‐analyses of studies investigating white matter differences [Peters et al., in press; Radua et al., 2010b] were also excluded. Only a minority of the retrieved meta‐analyses had fully listed the number of subjects and number of foci identified in each individual study, which was necessary to perform the statistical analysis. To circumvent this problem, include moderators and avoid missing data, we collected all the individual studies for each meta‐analysis and extracted these details (see below).

Data Extraction

At the level of each individual study, we extracted the total sample size, the overall number of foci, the condition, the contrast within the condition, the imaging parameters (magnet intensity, slice thickness, degree of smoothing, and software packages used), the statistical threshold [false discovery rate (FDR), family‐wise error (FWE) correction, uncorrected P‐value] and the use of SVC in the analysis. Similarly, at the level of the meta‐analyses, we extracted total sample size, the overall number of foci, the condition, the contrast within the condition, the statistical threshold and the software packages used.

Theoretical Framework and Statistical Analysis

Given that studies with large samples have more power to detect abnormalities, the number of reported foci should show a positive relationship with the sample size. As exemplified in Figure 1, small studies should detect only a small proportion of the true abnormalities, whilst larger studies should detect a larger part of the true abnormalities. A weak or null relationship could indicate potential reporting biases affecting the smaller studies more than the larger studies [David et al., 2013].
Figure 1

Expected relationship between sample size and number of abnormalities detected by whole‐brain structural neuroimaging techniques.

Footnote: The number of abnormalities correctly detected as abnormal (P < 0.05) in a simulated study with that sample size is shown with solid lines. The number of falsely detected abnormalities is shown with dotted lines. In all simulations, the brain was composed of 5,000 independent parts, of which 95% were normal and 5% were abnormal.
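The expectation behind Figure 1 can be sketched numerically. The following is a minimal illustration, not the authors' simulation: it keeps the 5,000 independent parts (5% abnormal) and the P < 0.05 threshold from the footnote, but the effect size of 0.5 standard deviations, the two‐sided two‐sample z‐test approximation, and the function name `expected_detections` are assumptions introduced here.

```python
from statistics import NormalDist


def expected_detections(n_per_group, effect_size=0.5, alpha=0.05,
                        n_parts=5000, frac_abnormal=0.05):
    """Return (expected true detections, expected false detections).

    effect_size is a hypothetical standardized group difference in the
    abnormal parts; it is NOT taken from the paper.
    """
    norm = NormalDist()
    z_crit = norm.inv_cdf(1 - alpha / 2)
    # Expected z statistic of a two-sample comparison with equal group sizes.
    shift = effect_size * (n_per_group / 2) ** 0.5
    # Two-sided power: probability that |z| exceeds the critical value.
    power = (1 - norm.cdf(z_crit - shift)) + norm.cdf(-z_crit - shift)
    n_abnormal = n_parts * frac_abnormal
    n_normal = n_parts - n_abnormal
    # True detections grow with power; false detections stay at alpha.
    return n_abnormal * power, n_normal * alpha


for n in (10, 20, 50, 100):
    true_hits, false_hits = expected_detections(n)
    print(f"n per group = {n:3d}: true = {true_hits:6.1f}, false = {false_hits:6.1f}")
```

Under these assumptions the number of correctly detected parts rises with sample size toward the 250 truly abnormal parts, while the number of false detections does not depend on sample size.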

Specifically, the relationship between the number of reported foci in each study and the sample size of the study was assessed with a zero‐inflated Poisson regression [Zeileis et al., 2008]. This model was used, instead of simpler ones, because the number of foci in VBM studies was observed to follow a mixture of a point mass at zero with a count distribution. However, for the sake of completeness, we also conducted a meta‐analytical combination of Poisson coefficients estimated separately for each published meta‐analysis, simple Pearson correlations, non‐parametric Spearman correlations, Pearson correlations after discarding the most influential study according to the dfbetas statistic of the regression of foci by sample size [Belsley et al., 1980], and a meta‐analytical combination of (Fisher‐transformed) Pearson correlations estimated separately for each published meta‐analysis. It should be noted that variability in the way authors report VBM results could conceal the expected positive relationship between the sample size and the number of reported foci. Statistically significant voxels are usually grouped in clusters of spatially contiguous voxels, and only the local maxima (i.e., foci) are reported. Importantly, an increase of the sample size helps non‐significant voxels between two close clusters to achieve statistical significance, thus sometimes converting the two close clusters into a single larger one.
The number of foci should not be affected by this conversion, but some authors choose to report only three foci per cluster. In other words, these authors could report up to six foci when describing the two close clusters, whilst no more than three when describing the single larger cluster obtained after an increase of the sample size. In such a case, the relationship between the sample size and the number of foci could be downwards biased. A simulation framework was used to assess whether such potential bias could significantly affect the expected relationship. First, 84,000 gray‐matter datasets were simulated by adding normally distributed noise to a normal gray‐matter template (n = 42,000 controls), or to a gray‐matter template with abnormal volume in regions reported to have decreased gray matter in first psychotic episodes (n = 42,000 patients) [Radua et al., 2012b]. Second, these data were smoothed with a large Gaussian kernel [σ = 6 mm, full‐width at half maximum (FWHM) = 14 mm], thus simulating both the spatial covariance observable in raw data and the smoothing usually applied in VBM pre‐processing. Finally, individuals were grouped in 400 simulated studies with different numbers of participants (from n = 10 to 200 per group), and standard group‐level voxel‐based statistics were performed (uncorrected P = 0.001, 20 voxels extent). As shown in Figure 2, the number of clusters followed a clear positive relationship with the sample size. The relationship would be the same if each cluster was substituted by three reported foci.
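The cluster‐merging argument can be illustrated with a one‐dimensional toy example. This is a hedged sketch, not the simulation described above: the effect profile (two Gaussian bumps), the noise‐free expected z values, the threshold of 3.09 (roughly a one‐sided P = 0.001), and all function names are assumptions made for illustration. The only quantity checked against the text is the conversion from σ = 6 mm to an FWHM of about 14 mm.

```python
import math


def fwhm_from_sigma(sigma_mm):
    """Full width at half maximum of a Gaussian kernel with the given sigma."""
    return sigma_mm * math.sqrt(8 * math.log(2))


def effect_profile(n_voxels=100, centers=(40, 60), sd=6.0, peak=0.5):
    """Hypothetical standardized group difference along a line of voxels."""
    return [peak * sum(math.exp(-((x - c) ** 2) / (2 * sd ** 2)) for c in centers)
            for x in range(n_voxels)]


def count_clusters(z_map, threshold):
    """Number of runs of contiguous supra-threshold voxels."""
    count, inside = 0, False
    for z in z_map:
        if z > threshold and not inside:
            count += 1
        inside = z > threshold
    return count


def count_peaks(z_map, threshold):
    """Number of supra-threshold local maxima (the 'foci')."""
    return sum(1 for i in range(1, len(z_map) - 1)
               if z_map[i] > threshold
               and z_map[i] > z_map[i - 1] and z_map[i] > z_map[i + 1])


def summarize(n_per_group, threshold=3.09):
    """Clusters and peaks in the noise-free expected z map for one sample size."""
    scale = math.sqrt(n_per_group / 2)  # expected z grows with sqrt(n)
    z_map = [d * scale for d in effect_profile()]
    return count_clusters(z_map, threshold), count_peaks(z_map, threshold)


print(f"sigma = 6 mm corresponds to FWHM = {fwhm_from_sigma(6.0):.1f} mm")
for n in (40, 100, 400):
    clusters, peaks = summarize(n)
    print(f"n per group = {n}: clusters = {clusters}, peaks = {peaks}")
```

In this toy setting the two bumps first appear as two separate clusters and then merge into one as the sample grows, while the number of local maxima stays the same. The sketch does not model a cap of three reported foci per cluster, which is the reporting habit that could bias the count downward.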
Figure 2

Relationship between sample size and number of clusters in simulated VBM data.

To explore experimental variables influencing the relationship between sample size and number of reported foci, subgroup regressions were conducted on the following subsets of studies: studies published up to and after 2008, studies with up to or more than six authors, studies with fewer than or at least 32 patients, studies with sample sizes up to 80 patients, studies conducted in MRI devices with magnets up to or stronger than 1.5 Tesla (T), studies with MRI acquisition slice thickness of at least or thinner than 1.5 mm, studies employing statistical parametric mapping (SPM) or other software packages to pre‐process and compare the images, studies applying a smoothing of up to or superior to 8 mm of FWHM, studies thresholding at P < 0.001 uncorrected for multiple comparisons, studies thresholding at P < 0.05 FDR‐ or FWE‐corrected for multiple comparisons, studies employing SVC, and studies investigating different neuropsychiatric conditions. Cutoffs for magnet intensity, slice thickness, and smoothing kernel were chosen because they allowed dividing the total sample of studies into two sub‐groups of fairly similar size. The year 2008 was chosen as cutoff to specifically test the impact of advanced VBM algorithms such as DARTEL, which were introduced shortly before [Ashburner, 2007b]. The sample size of 32 patients was chosen on the basis of evidence indicating that the minimum sample size for a neuroimaging study is 16 patients per group [Friston, 2012]. The cutoff of six authors was chosen on the basis of the previous findings by Sayo et al. [2011]. Regression slopes of complementary subgroups with different findings (e.g., one slope is significantly higher than zero and the other slope is not significantly higher than zero) were formally compared with zero‐inflated Poisson models.
All calculations were performed with the “pscl” package for R [Jackman, 2012; R Development Core Team, 2011]. A regression line was applied to the reported plots for both significant and non‐significant relationships. Finally, we also assessed the results of the meta‐analyses that had combined these VBM studies. First, we evaluated the relationship between the number of reported foci in each meta‐analysis and the combined sample size of the studies included in the meta‐analysis with a Poisson regression. Again, this model was used, instead of simpler ones, because the number of foci in VBM studies was observed to follow a count distribution. For the sake of completeness, we also conducted simple Pearson correlation, non‐parametric Spearman correlation, and Pearson correlation after discarding the most influential study according to the dfbetas statistic of the regression of foci by sample size. Second, we evaluated whether the number of foci reported in the meta‐analyses was larger or smaller than the number of foci reported in each of the VBM studies that they had combined. We tested the hypothesis that the much larger sample size of the meta‐analyses would allow detecting more or at least as many foci as in the individual studies. In the presence of bias in single studies, the meta‐analyses may report even fewer validated foci than the single studies, because the biases may be diluted in the meta‐analysis. The analysis used the Wilcoxon paired test, in which the number of foci in each meta‐analysis was paired with the number of foci in each study that it included.
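For readers unfamiliar with the zero‐inflated Poisson model used for the study‐level analysis, its likelihood can be sketched as follows. This is an illustrative Python re‐expression, not the authors' R code: it assumes a log link for the count part and a logit link for the zero part (the defaults of `zeroinfl` in “pscl”), and the function names, parameter names, and toy values are hypothetical.

```python
import math


def zip_log_pmf(k, lam, pi_zero):
    """Log-probability of observing k foci under a zero-inflated Poisson.

    lam is the Poisson mean of the count part; pi_zero is the probability
    of an 'excess' zero from the point mass at zero.
    """
    if k == 0:
        # A zero can come from the point mass or from the Poisson itself.
        return math.log(pi_zero + (1 - pi_zero) * math.exp(-lam))
    log_poisson = -lam + k * math.log(lam) - math.lgamma(k + 1)
    return math.log(1 - pi_zero) + log_poisson


def zip_log_likelihood(foci, sample_sizes, b0, b1, g0, g1):
    """Log-likelihood with sample size as the only regressor in both parts."""
    total = 0.0
    for k, n in zip(foci, sample_sizes):
        lam = math.exp(b0 + b1 * n)                      # count part, log link
        pi_zero = 1 / (1 + math.exp(-(g0 + g1 * n)))     # zero part, logit link
        total += zip_log_pmf(k, lam, pi_zero)
    return total


# Toy values chosen only to exercise the functions; they are not study data.
toy_foci = [0, 4, 7, 12]
toy_sizes = [20, 35, 60, 120]
print(zip_log_likelihood(toy_foci, toy_sizes, b0=1.5, b1=0.002, g0=-2.0, g1=0.0))
```

Maximizing this log‐likelihood over `b0`, `b1`, `g0` and `g1` would give the fitted model; `b1` is the slope of interest, and `g1` corresponds to the binomial part that relates sample size to the probability of reporting no foci at all.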

RESULTS

Database

Our literature search identified 54 full‐text articles, which were assessed for inclusion criteria. The final database comprised 42 articles with 79 meta‐analyses (including sub‐analyses). After checking for duplicate or overlapping meta‐analyses, a final set of 47 meta‐analyses was included, and the final dataset used in this study comprised a total of 324 individual VBM studies. The literature search and the characteristics of the included and excluded meta‐analyses are detailed in Figure 3, Table 1 and Supporting Information Table IS. As shown in Table 2, the number of participants ranged from 12 to 545 in the studies (median = 47, interquartile range = 39), and from 149 to 4087 in the meta‐analyses (median = 534, interquartile range = 721). The median number of reported foci per study was six, while the median number of foci reported per meta‐analysis was three. Seventy‐four percent of the studies reported 10 foci or fewer, and 79% of the meta‐analyses reported five foci or fewer. Other descriptive details of the included studies and meta‐analyses are given in Table 2.
Figure 3

PRISMA Flow chart of literature search.

Table 1

List of articles included in the present meta‐meta‐analysis. Each meta‐analysis (contrast) used for each included study is detailed in the specific column

Author | Year (reference) | Statistical threshold | Condition | Contrast | Sample | Number of foci
Yang | 2012 (Yang et al., 2012) | P < 0.05 (FDR) | Semantic dementia | SD vs. HC | 235 | 10
Bora | 2010 (Bora et al., 2010) | P < 0.001 | Bipolar disorder | BD vs. HC | 1430 | 3
 |  |  |  | FE BD vs. HC | 197 | 1
Bora | 2011 (Bora et al., 2011) | P < 0.001 | Schizophrenia | SCZ vs. HC | 4087 | 5
 |  |  |  | FEP vs. HC | 874 | 2
 |  |  |  | FEP vs. CHR SCZ | 4087 | 3
Bora | 2012 (Bora et al., 2012) | P < 0.001 | Depressive disorder | FE DEP vs. HC | 292 | 1
 |  |  |  | ME DEP vs. HC | 1631 | 2
 |  |  |  | FE DEP vs. ME DEP | 1923 | 1
 |  |  |  | CO ANX vs. HC | 1117 | 1
 |  |  |  | NO CO ANX vs. HC | 558 | 2
 |  |  |  | C DEP vs. HC | 1414 | 1
Chen | 2010 (Chen and Ma 2010) | P < 0.001 | Amyotrophic lateral sclerosis | ALS vs. HC | 165 | 1
Fusar‐Poli | 2011 (Fusar‐Poli et al., 2011a) | P < 0.05 (FDR) | Psychosis risk | Clinical HR vs. HC | 516 | 5
 |  |  |  | HR NoTransition vs. HR Transition | 165 | 2
 |  |  |  | Genetic HR vs. HC | 952 | 3
 |  |  |  | Genetic HR vs. Clinical HR | 1468 | 5
 |  |  |  | HR vs. SCZ | 754 | 5
Fusar‐Poli | 2011 (Fusar‐Poli et al., 2011b) | P < 0.001 | Psychosis onset | FEP vs. HC | 413 | 3
 |  |  |  | FEP vs. HR | 865 | 4
Lai | 2011 (Lai 2011) | P < 0.001 | Panic disorder | PD vs. HC | 226 | 2
Li | 2012 (Palaniyappan et al., 2012) | P < 0.05 (FDR) | Epilepsy | RTLE vs. HC | 300 | 7
 |  |  |  | LTLE vs. HC | 433 | 11
Modinos | 2012 (Modinos et al., 2012) | P < 0.001 | Auditory verbal hallucinations | AVHs Severe vs. AVHs Non severe | 259 | 6
 |  |  |  | SCZ AVHs vs. SCZ No AVHs | 266 | 0
Nakao | 2011 (Nakao et al., 2011) | P < 0.001 | ADHD | ADHD vs. HC | 722 | 2
Nickl‐Jockschat | 2012 (Nickl‐Jockschat et al., 2012b) | P < 0.05 (n.a.) | Mild cognitive impairment | MCI vs. HC | 1726 | 3
 |  |  |  | Amnestic MCI vs. HC | 1659 | 5
Pan | 2012 (Pan et al., 2012a) | n.a. | Parkinson's disease | PKD vs. HC | 873 | 1
Pan | 2012 (Pan, 2012b) | P < 0.001 | BV‐Fronto temporal dementia | BV‐FTD vs. HC | 534 | 1
Radua | 2010 (Radua et al. 2010a) | P < 0.001 | Anxiety disorders | OCD vs. HC | 836 | 4
 |  |  |  | OAD vs. HC | 483 | 3
 |  |  |  | OCD vs. OAD | 1376 | 3
Richlan | 2012 (Richlan et al., 2012) | P < 0.005 | Dyslexia | DYL vs. HC | 266 | 5
Schroeter | 2009 (Schroeter et al., 2009) | P < 0.0001 | Alzheimer's disease | AD vs. HC | 421 | 6
Via | 2011 (Via et al., 2011) | P < 0.001 | Autism Spectrum Disorder | ASD vs. HC | 967 | 5
 |  |  |  | AUT vs. ASP | 905 | 0
 |  |  |  | CHI ASD vs. ADU ASD | 967 | 1
Zheng | 2012 (Zheng et al., 2012) | P < 0.05 (FDR) | Focal dystonia | DYT vs. HC | 446 | 5
Yu | 2010 (Yu et al., 2010) | P < 0.05 (FDR) | Bipolar disorder and schizophrenia | BD vs. SCZ | 2617 | 23
Ferreira | 2011 (Ferreira et al., 2011) | P < 0.01 (FDR) | Mild cognitive impairment | MCI conv vs. MCI st | 479 | 1
Yu | 2011 (Yu et al., 2011) | P < 0.05 (FDR) | Autism spectrum disorder | AUT vs. HC | 343 | 24
 |  |  |  | ASP vs. HC | 374 | 13
Frodl | 2012 (Frodl and Skokauskas 2012) | P < 0.001 | ADHD | CHI ADHD vs. HC | 348 | 2
 |  |  |  | ADU ADHD vs. HC | 260 | 2
Rotge | 2010 (Rotge et al., 2010) | P < 0.05 (FDR) | OCD | CHI OCD vs. HC | 149 | 16
 |  |  |  | ADU OCD vs. HC | 512 | 13

AD = Alzheimer's disease; ADHD = attention deficit‐hyperactivity disorder; ADU = adults; ALS = amyotrophic lateral sclerosis; ASD = autism spectrum disorders; ASP = Asperger disorder; AUT = autism; AVHs = auditory verbal hallucinations; BD = bipolar disorder; BV‐FTD = behavioral variant fronto‐temporal dementia; C = currently; CHI = children; CHR = chronic; CO ANX = comorbid anxiety; conv = converters; DEP = depressive disorder; DYL = dyslexia; DYT = dystonia; FDR = false discovery rate; FE = first episode; FEP = first episode psychosis; HC = healthy controls; HR = high risk; LTLE = left temporal lobe epilepsy; MCI = mild cognitive impairment; n.a. = not available; OAD = other anxiety disorders; OCD = obsessive‐compulsive disorder; PD = panic disorder; PKD = Parkinson's disease; RTLE = right temporal lobe epilepsy; SCZ = schizophrenia; SD = semantic dementia; st = stable.

Table 2

Number of participants and reported foci in the VBM studies (a) and meta‐analyses (b) included in the present study

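Subgroup | Participants: Min | Q1 | Median | Q3 | Max | Foci: Min | Q1 | Median | Q3 | Max
(a) All studies | 12 | 33 | 47 | 72 | 545 | 0 | 2 | 6 | 11 | 105
Studies of psychiatric disorders | 12 | 34 | 52 | 79 | 425 | 0 | 2 | 5 | 10 | 58
Studies of neurological disorders | 13 | 30 | 41 | 66 | 545 | 0 | 3 | 7 | 13 | 105
Studies of non‐affective psychosis | 12 | 37 | 62 | 87 | 425 | 0 | 2 | 6 | 12 | 38
Studies of dementia | 13 | 29 | 40 | 63 | 545 | 0 | 3 | 7 | 12 | 105
Studies of neurological diseases other than dementia | 16 | 31 | 41 | 66 | 134 | 0 | 2 | 7 | 14 | 48
Studies of anxiety disorders | 20 | 34 | 39 | 51 | 144 | 0 | 1 | 3 | 7 | 19
Studies of autism | 20 | 30 | 33 | 65 | 188 | 0 | 1 | 5 | 9 | 58
Studies of bipolar disorder | 26 | 48 | 68 | 84 | 132 | 0 | 2 | 6 | 17 | 33
Studies of depressive disorders | 27 | 36 | 60 | 90 | 328 | 0 | 2 | 6 | 11 | 24
Studies of ADHD | 24 | 31 | 40 | 57 | 128 | 0 | 0 | 4 | 8 | 17
Studies published up to 2008 | 12 | 30 | 42 | 72 | 425 | 0 | 3 | 7 | 11 | 105
Studies published after 2008 | 13 | 37 | 52 | 72 | 545 | 0 | 2 | 5 | 11 | 48
Studies with up to six authors | 12 | 30 | 43 | 69 | 545 | 0 | 3 | 7 | 12 | 33
Studies with more than six authors | 13 | 33 | 51 | 74 | 425 | 0 | 2 | 5 | 10 | 105
Studies with less than 32 patients | 12 | 22 | 26 | 29 | 31 | 0 | 2 | 4 | 9 | 58
Studies with at least 32 patients | 32 | 41 | 60 | 84 | 545 | 0 | 2 | 6 | 13 | 105
Studies with up to 80 patients | 12 | 30 | 40 | 59 | 80 | 0 | 2 | 5 | 10 | 58
Studies with magnets up to 1.5T | 12 | 34 | 52 | 78 | 545 | 0 | 2 | 6 | 10 | 105
Studies with magnets stronger than 1.5T | 16 | 29 | 38 | 59 | 221 | 0 | 2 | 6 | 13 | 58
Studies with slices of 1.5 mm or more | 13 | 35 | 57 | 77 | 425 | 0 | 2 | 6 | 10 | 105
Studies with slices inferior to 1.5 mm | 12 | 30 | 40 | 70 | 545 | 0 | 2 | 6 | 13 | 58
Studies employing SPM in pre‐processing | 12 | 32 | 45 | 72 | 545 | 0 | 2 | 6 | 12 | 105
Studies not employing SPM in pre‐processing | 24 | 41 | 63 | 87 | 317 | 0 | 2 | 7 | 10 | 27
Studies with 8 mm smoothing or less | 15 | 33 | 48 | 72 | 425 | 0 | 2 | 5 | 12 | 58
Studies with more than 8 mm smoothing | 12 | 30 | 45 | 75 | 545 | 0 | 2 | 6 | 11 | 105
Studies employing SPM in statistics | 12 | 32 | 45 | 72 | 545 | 0 | 2 | 6 | 12 | 105
Studies not employing SPM in statistics | 24 | 37 | 61 | 83 | 317 | 0 | 2 | 7 | 10 | 27
Studies thresholding at P < 0.001 uncorrected | 15 | 30 | 39 | 56 | 425 | 0 | 4 | 7 | 14 | 48
Studies thresholding at P < 0.05 corrected | 12 | 33 | 49 | 72 | 382 | 0 | 2 | 5 | 10 | 105
Studies thresholding at P < 0.05 FDR‐corrected | 16 | 36 | 50 | 64 | 188 | 0 | 2 | 6 | 12 | 58
Studies thresholding at P < 0.05 FWE‐corrected | 13 | 36 | 59 | 90 | 382 | 0 | 2 | 4 | 8 | 105
Studies not employing SVC | 12 | 32 | 47 | 71 | 545 |  | 2 | 6 | 12 | 105
Studies employing SVC | 20 | 40 | 71 | 123 | 317 |  | 2 | 3 | 7 | 27
(b) All meta‐analyses | 149 | 322 | 534 | 1042 | 4087 | 0 | 2 | 3 | 5 | 24
Meta‐analyses with less than 10 studies | 149 | 259 | 343 | 446 | 952 | 0 | 2 | 5 | 7 | 24
Meta‐analyses with at least 10 studies | 483 | 873 | 1247 | 1652 | 4087 | 0 | 1 | 3 | 4 | 23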

Association Between Sample Size and Number of Foci in Individual VBM Studies

As shown in Table 3 and Figure 4, studies with larger sample sizes were found to report more abnormalities, but the slope was very small (2% increase in the number of foci per each 10‐patient increase in sample‐size, P < 0.001), thus indicating potential reporting biases affecting the smaller studies more than the larger studies. Results were similar when using Pearson and Spearman correlations, when discarding statistically influential studies, or when combining Pearson correlations separately estimated per each published meta‐analysis (simple Pearson r = 0.148, P = 0.004; Spearman rho = 0.139, P = 0.006; Pearson r without the most influential study = 0.176, P < 0.001; meta‐analytically combined Pearson r = 0.110, P = 0.010). The binomial part of the zero‐inflated Poisson regression was not found to be influenced by the sample size.
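To make the size of this slope concrete, the percentage per 10 patients can be converted to and from a log‐link regression coefficient. The sketch below assumes that the reported percentage equals exp(10·β) − 1 for a per‐patient coefficient β, which is the usual reading of a Poisson slope but is not stated explicitly in the text; the function names are hypothetical.

```python
import math


def percent_change(beta_per_patient, step=10):
    """Percent change in expected foci for `step` additional patients."""
    return (math.exp(beta_per_patient * step) - 1) * 100


def beta_from_percent(percent, step=10):
    """Per-patient log-link coefficient implied by a percent change per `step`."""
    return math.log(1 + percent / 100) / step


# 1.935% per 10 patients is the zero-inflated Poisson estimate for all studies.
beta_all = beta_from_percent(1.935)

# Implied ratio of expected foci between a 100-patient and a 50-patient study.
ratio = math.exp(beta_all * (100 - 50))
print(f"beta per patient = {beta_all:.5f}")
print(f"expected foci ratio, 100 vs. 50 patients = {ratio:.3f}")
```

Under this reading, doubling a study from 50 to 100 patients would be associated with only about 10% more reported foci, which is the sense in which the slope is described as very small.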
Table 3

Relationship between sample size and number of reported foci in subgroups defined by different moderator factors: (a) analyses at the level of single studies and (b) analyses at the level of VBM meta‐analyses

Subgroup | Number of studies | Zero‐inflated Poisson regression: Estimate^b | P value^c | Meta‐analytic Poisson regression^a: Estimate^b | P value^c | Pearson correlation: R | P value^c | Spearman correlation: Rho | P value^c | Meta‐analytic correlation^a: R | P value^c
(a) All studies | 324 | 1.935% | <0.001 | 2.121% | <0.001 | 0.148 | 0.004 | 0.139 | 0.006 | 0.110 | 0.005
Studies of psychiatric disorders | 215 | 2.420% | <0.001 | 2.165% | <0.001 | 0.205 | 0.001 | 0.115 | 0.047 | 0.107 | 0.012
Studies of neurological disorders | 109 | 1.458% | <0.001 | 1.888% | <0.001 | 0.111 | 0.126 | 0.246 | 0.005 | 0.126 | 0.113
Studies of non‐affective psychosis | 98 | 3.154% | <0.001 | 3.319% | <0.001 | 0.405 | <0.001 | 0.284 | 0.002 | 0.339 | <0.001
Studies of dementia | 58 | 1.076% | 0.004 | 1.749% | <0.001 | 0.089 | 0.253 | 0.305 | 0.010 | 0.122 | 0.174
Studies of neurological diseases other than dementia | 51 | 8.341% | <0.001 | 4.191% | 0.025 | 0.227 | 0.054 | 0.194 | 0.087 | 0.133 | 0.223
Studies of anxiety disorders | 29 | 7.376% | <0.001 | 7.220% | <0.001 | 0.244 | 0.101 | 0.227 | 0.118 | 0.330 | 0.003
Studies of autism | 27 | −9.535% | 0.994 | −11.034% | >0.999 | −0.159 | 0.785 | −0.061 | 0.620 | −0.187 | 0.928
Studies of bipolar disorder | 24 | −8.088% | 0.999 | −10.036% | >0.999 | −0.276 | 0.904 | −0.274 | 0.902 | −0.305 | 0.987
Studies of depressive disorders | 23 | −2.563% | 0.987 | −1.634% | >0.999 | −0.109 | 0.690 | −0.127 | 0.718 | −0.151 | 0.918
Studies of ADHD | 14 | 1.975% | 0.413 | −17.923% | >0.999 | −0.382 | 0.911 | −0.408 | 0.926 | −0.408 | 0.949
Studies published up to 2008 | 202 | 2.939% | <0.001 |  |  | 0.217 | 0.001 | 0.179 | 0.005 |  | 
Studies published after 2008 | 122 | 0.225% | 0.313 |  |  | 0.041 | 0.326 | 0.144 | 0.056 |  | 
Studies up to six authors | 108 | 0.615% | 0.089 |  |  | 0.066 | 0.249 | 0.085 | 0.191 |  | 
Studies with more than six authors | 214 | 2.592% | <0.001 |  |  | 0.175 | 0.005 | 0.161 | 0.009 |  | 
Studies with less than 32 patients | 75 | 52.494% | <0.001 |  |  | 0.222 | 0.028 | 0.282 | 0.007 |  | 
Studies with at least 32 patients | 249 | 1.749% | <0.001 |  |  | 0.146 | 0.011 | 0.113 | 0.037 |  | 
Studies with up to 80 patients | 257 | 5.103% | <0.001 |  |  | 0.078 | 0.106 | 0.110 | 0.039 |  | 
Studies with magnets up to 1.5T | 257 | 2.038% | <0.001 |  |  | 0.158 | 0.005 | 0.114 | 0.034 |  | 
Studies with magnets stronger than 1.5T | 60 | 1.266% | 0.103 |  |  | 0.112 | 0.197 | 0.233 | 0.037 |  | 
Studies with slices of 1.5 mm or more | 161 | 2.935% | <0.001 |  |  | 0.207 | 0.004 | 0.175 | 0.013 |  | 
Studies with slices inferior to 1.5 mm | 151 | 0.850% | 0.010 |  |  | 0.074 | 0.183 | 0.083 | 0.156 |  | 
Studies employing SPM in pre‐processing | 282 | 1.639% | <0.001 |  |  | 0.131 | 0.014 | 0.151 | 0.006 |  | 
Studies not employing SPM in pre‐processing | 29 | 4.943% | <0.001 |  |  | 0.510 | 0.002 | 0.255 | 0.091 |  | 
Studies with 8 mm smoothing or less | 147 | 1.746% | <0.001 |  |  | 0.102 | 0.110 | 0.055 | 0.256 |  | 
Studies with more than 8 mm smoothing | 152 | 1.865% | <0.001 |  |  | 0.161 | 0.024 | 0.189 | 0.010 |  | 
Studies employing SPM in statistics | 281 | 1.622% | <0.001 |  |  | 0.130 | 0.015 | 0.150 | 0.006 |  | 
Studies not employing SPM in statistics | 30 | 5.142% | <0.001 |  |  | 0.529 | 0.001 | 0.283 | 0.065 |  | 
Studies thresholding at P < 0.001 uncorrected | 67 | 1.270% | 0.017 |  |  | 0.129 | 0.149 | 0.242 | 0.024 |  | 
Studies thresholding at P < 0.05 corrected | 182 | 2.464% | <0.001 |  |  | 0.150 | 0.022 | 0.078 | 0.147 |  | 
Studies thresholding at P < 0.05 FDR‐corrected | 60 | −2.760% | 0.923 |  |  | −0.135 | 0.847 | −0.115 | 0.808 |  | 
Studies thresholding at P < 0.05 FWE‐corrected | 49 | 2.253% | <0.001 |  |  | 0.204 | 0.080 | 0.382 | 0.003 |  | 
Studies not employing SVC | 296 | 1.877% | <0.001 |  |  | 0.145 | 0.072 | 0.165 | 0.002 |  | 
Studies employing SVC | 26 | 5.693% | <0.001 |  |  | 0.479 | 0.007 | −0.130 | 0.263 |  | 

a Meta‐analytical combination of the Poisson regressions or the (Fisher‐transformed) correlations separately estimated per each published meta‐analysis.

b Increase in the number of reported foci per each increase of 10 patients.

c P values were obtained from one‐tailed tests.

Figure 4

Relationship between sample size and identified number of foci with abnormalities.

Effect of Moderators

No major differences according to the field (psychiatry or neurology, Supporting Information Fig. 1S) or clinical condition (Supporting Information Fig. 2S) were observed. With respect to methodological moderators, the only subgroup in which the regression between sample size and number of reported foci was relatively stronger was the set of studies with fewer than 32 patients (52% increase in the number of foci per each 10‐patient increase in sample‐size, P < 0.001). The regression slope was nominally significant but small in studies with at least 32 patients (2% increase in the number of foci per each 10‐patient increase in sample‐size, P < 0.001), and the difference in regression slope between these two subgroups was statistically significant (P < 0.001). The regression slope was small but still nominally significant in studies published up to 2008, in studies with more than six authors (Fig. 5), and in studies thresholding at P < 0.05 FWE‐corrected for multiple comparisons (Fig. 6) (2–3% increase in the number of foci per each 10‐patient increase in sample‐size, P < 0.001). Conversely, it was null in studies published after 2008, in studies with up to six authors, and in studies thresholding at P < 0.05 FDR‐corrected for multiple comparisons (<1% increase in the number of foci per each 10‐patient increase in sample‐size, P > 0.05), and the differences in regression slope between these pairs of subgroups were statistically significant (P ≤ 0.005 in all cases).
Figure 5

Relationship between number of authors and number of identified foci.

Figure 6

Relationship between sample size and number of identified foci in studies using FDR correction for multiple comparisons and those using FWE correction for multiple comparisons. Difference in regression slope P = 0.005.

There were no significant differences in the other methodological subgroups (up to 80 patients, studies with magnets up to 1.5T, studies with magnets stronger than 1.5T, MRI slices of 1.5 mm or more, MRI slices inferior to 1.5 mm, SPM used for pre‐processing, other software used for pre‐processing, FWHM of 8 mm or less, FWHM superior to 8 mm, SPM used for statistics, other software used for statistics, no correction for multiple comparisons, use of SVC, and no use of SVC). Similarly, exclusion of foci detected with SVC from the main analysis did not change the main results (2% increase in the number of foci per each 10‐patient increase in sample‐size, P < 0.001; Pearson r = 0.138, P = 0.007; Spearman rho = 0.113, P = 0.023; Pearson r without the most influential study = 0.166, P = 0.002).
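
To make the reported effect sizes concrete, the sketch below converts between a log‐linear (Poisson) slope and the "percent increase in foci per 10 patients" used in this section, and compounds that percentage over a larger change in sample size. It assumes the usual log link, under which the expected count is multiplied by exp(β · Δn); the function names and the sample sizes chosen for the two examples are illustrative and not taken from the paper.

```python
import math


def percent_per_10_patients(beta_per_patient):
    """Percent change in expected foci for 10 extra patients (log link)."""
    return (math.exp(10.0 * beta_per_patient) - 1.0) * 100.0


def beta_from_percent(percent_per_10):
    """Inverse conversion: per-patient log-linear slope."""
    return math.log(1.0 + percent_per_10 / 100.0) / 10.0


def expected_ratio(percent_per_10, extra_patients):
    """Multiplicative change in expected foci for `extra_patients` more patients."""
    return math.exp(beta_from_percent(percent_per_10) * extra_patients)


# A 2% increase per 10 patients, compounded from n = 20 to n = 100 (illustrative sizes).
ratio_small_slope = expected_ratio(2.0, 80)
# A 52% increase per 10 patients, compounded from n = 12 to n = 32 (illustrative sizes).
ratio_large_slope = expected_ratio(52.0, 20)
print(f"{ratio_small_slope:.2f}x vs {ratio_large_slope:.2f}x")
```

Under these assumptions, a slope of 2% per 10 patients implies that a study with 100 patients would report only about 17% more foci than a study with 20 patients.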

Association Between Sample Size and Number of Foci in VBM Meta‐Analyses

The regression was not nominally significant when meta‐analyses, rather than individual studies, were analyzed as a whole group. However, this was only true for meta‐analyses that included fewer than 10 studies (−0.17% change in the number of foci per each 10‐patient increase in sample‐size, P = 0.641), while the regression achieved statistical significance for meta‐analyses including 10 studies or more (0.35% increase in the number of foci per each 10‐patient increase in sample‐size, P < 0.001, Fig. 7). As shown in Figure 7, there were many meta‐analyses with fewer than 10 studies and total sample size <1,000 that reported a substantial number of loci; for example, six of them reported at least 10 loci, while this occurred in only one meta‐analysis with 10 or more studies. The median number of loci was three in meta‐analyses with at least 10 studies and five in meta‐analyses with fewer than 10 studies.
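
The kind of regression summarized above can be sketched as follows. This is a minimal illustration rather than the authors' code: it fits log E[foci] = a + b · n by Newton‐Raphson on made‐up data and expresses the slope as a percent change per 10 patients. The data values and the function name are hypothetical.

```python
import math


def fit_poisson_slope(sizes, foci, iterations=50):
    """Fit log E[foci] = a + b * size by Newton-Raphson on the Poisson log-likelihood."""
    if len(set(sizes)) < 2:
        raise ValueError("need at least two distinct sample sizes")
    mean_size = sum(sizes) / len(sizes)
    x = [s - mean_size for s in sizes]               # centre for numerical stability
    a = math.log(max(sum(foci) / len(foci), 1e-9))   # start at the log of the mean count
    b = 0.0
    for _ in range(iterations):
        mu = [math.exp(a + b * xi) for xi in x]
        # Gradient of the log-likelihood.
        g_a = sum(y - m for y, m in zip(foci, mu))
        g_b = sum((y - m) * xi for y, m, xi in zip(foci, mu, x))
        # Fisher information (negative Hessian) for the two parameters.
        h_aa = sum(mu)
        h_ab = sum(m * xi for m, xi in zip(mu, x))
        h_bb = sum(m * xi * xi for m, xi in zip(mu, x))
        det = h_aa * h_bb - h_ab * h_ab
        step_a = (h_bb * g_a - h_ab * g_b) / det
        step_b = (h_aa * g_b - h_ab * g_a) / det
        a, b = a + step_a, b + step_b
        if abs(step_a) + abs(step_b) < 1e-12:
            break
    # Undo the centring so the intercept refers to size = 0.
    return a - b * mean_size, b


sizes = [20, 30, 40, 50, 60, 80, 100, 150]   # hypothetical sample sizes
foci = [4, 6, 5, 7, 6, 8, 7, 10]             # hypothetical foci counts
intercept, slope = fit_poisson_slope(sizes, foci)
percent_per_10 = (math.exp(10 * slope) - 1) * 100
print(f"{percent_per_10:.1f}% change in foci per 10 additional patients")
```

In practice a statistics package would also provide standard errors and P values; the hand‐written fit is shown only to make the link between the slope and the reported percentages explicit.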
Figure 7

Relationship between sample size and number of identified foci in all meta‐analyses (above), meta‐analyses with at least 10 studies (bottom left) and meta‐analyses with less than 10 studies (bottom right).

Number of Reported Foci in Meta‐analyses Versus Single Studies Contained in Each Meta‐Analysis

The number of foci reported in the meta‐analyses was significantly smaller than the number of foci reported in the respective studies contained in each meta‐analysis (P < 0.001 with paired test). This difference was also significant when only meta‐analyses with 10 or more studies were considered (P < 0.001 with paired test), whereas it was no longer significant for meta‐analyses with fewer than 10 studies (P = 0.111 with paired test).
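
The source does not name the paired test that was used. As one hedged possibility, the sketch below runs an exact one‐sided sign test on paired counts (foci in a meta‐analysis versus a summary of the foci counts of its constituent studies). Both the numbers and the choice of the sign test are assumptions made purely for illustration.

```python
from math import comb


def sign_test_one_sided(pairs):
    """Exact sign test for H1: the first value tends to be smaller than the second.

    Tied pairs are dropped, as is conventional for the sign test.
    """
    diffs = [first - second for first, second in pairs if first != second]
    n = len(diffs)
    if n == 0:
        raise ValueError("all pairs are tied")
    negatives = sum(1 for d in diffs if d < 0)
    # P(X >= negatives) for X ~ Binomial(n, 0.5).
    return sum(comb(n, k) for k in range(negatives, n + 1)) / 2 ** n


# Hypothetical (meta-analysis foci, typical single-study foci) pairs.
pairs = [(3, 6), (2, 7), (5, 5), (1, 4), (4, 9), (6, 5), (3, 8), (2, 6)]
p_value = sign_test_one_sided(pairs)
print(f"one-sided sign test P = {p_value:.4f}")
```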

DISCUSSION

This study explored the potential confounding role of biases in VBM studies by assessing whether the number of reported brain abnormalities was positively related to the sample size of the studies, as would be statistically expected given that studies with larger sample sizes have more power to detect abnormalities. Overall, we found a weak correlation between sample size and number of reported loci, corresponding to an increase of only 2% per 10 additional patients. This is far less than what would be expected based on power considerations. Thus, it suggests that reporting biases may be inflating the number of reported loci in small studies. Evaluation of a large number of moderator variables suggested similar findings across a wide array of study, disease, technical and other characteristics, although there were hints that the statistical threshold employed and publication year had a modulating effect. Finally, we found that for whole‐brain voxel‐based meta‐analyses including fewer than 10 studies, there was no association between sample size and number of loci, which is again suggestive of potential reporting biases. This pattern was not seen in meta‐analyses with 10 or more studies, which generally reported few loci (median = 3). The number of loci reported in meta‐analyses, especially large ones, was significantly smaller than the number of loci reported in single studies, also corroborating that the literature of single studies may often present inflated numbers of discoveries.

Overall, the strength of the evidence that we found for reporting biases in VBM studies may be weaker than previous findings in non‐VBM (i.e., ROI) structural neuroimaging studies, where an excess significance bias was more clearly detected [Ioannidis, 2011]. Several factors may explain this difference. First, the ROI assessment evaluated the number of observed versus expected significant results in each study and in multiple studies, while this was not possible to do for VBM studies given the nature of the data. 
Second, automated methods such as VBM tend to be less biased by the researcher's influence; in contrast, a researcher could perform several exploratory ROI analyses and report results for only those ROIs that yielded significant results [Ioannidis, 2011; Radua and Mataix‐Cols, 2012]. Third, the manual tracing of ROIs, as compared with VBM methods, can introduce significant heterogeneity in the anatomical definition of the brain areas investigated across studies and thus affect the significance of the results reported, hampering publication biases. Fourth, we found that the median sample size of the individual VBM studies retrieved was 47, which is larger than the typical sample size of previously analyzed ROI studies [Ioannidis, 2011]. Some authors have even proposed optimal sample sizes for individual VBM studies of 16–32 subjects per group [Friston, 2012], suggesting that between‐subjects comparison studies of n < 32 are too small even by liberal estimates. Still, the fact that the number of loci reported in single VBM studies is larger than what eventually gets validated in large meta‐analyses suggests that reporting or other biases may sometimes be substantial in some VBM studies.

We explored several potential factors that may influence and modulate reporting biases in the VBM literature, such as publication year, type of condition investigated, statistical threshold employed and other methodological characteristics of the analysis method. A similar pattern of weak or null correlations was seen in several analyses according to different moderator variables, although we found some hints that the statistical threshold employed and publication year had some modulating effect. There was no difference in the relationship between identified foci and sample size according to type of clinical condition. 
Similarly, no differences in the relationship between number of foci and sample size were detected when the VBM psychiatric literature was compared with the VBM neurological literature. Factors other than publication biases may account for the lack of clinical applications of psychiatric neuroimaging (e.g., heterogeneity of psychiatric diagnoses or differences in the psychopathological characteristics across samples) [Borgwardt et al., 2012]. Sample size, magnet field, slice thickness, type of analysis package, type of smoothing kernel and use of SVC did not affect the results. Conversely, the foci‐sample relationship was positive and statistically significant in studies employing an FWE correction, but negative and non‐significant in studies applying an FDR correction, even though the increase of foci related to sample size should be higher in studies applying such correction (Fig. 1). Furthermore, there was a statistically significant difference between subgroups. The reasons for the existence of potential bias in studies using an FDR correction are again speculative. It should be noted that the sample size range appeared to be smaller for VBM studies which employed FDR correction compared with those which employed FWE correction (see Fig. 6); it is therefore possible that the absence of a significant relationship for studies which used FDR correction, but not for those which used FWE correction, could simply be explained by differences in power. Alternatively, the absence of a relationship between sample size and foci in studies applying an FDR correction could be related to some misuse of FDR in neuroimaging [Chumbley and Friston, 2009].

In addition, we found an effect of the number of authors, with a significant correlation between sample size and number of foci for VBM studies with more than six authors and no correlation for studies with up to six authors (statistically significant between‐subgroups differences, Fig. 5). 
Strikingly, we replicated a relationship similar to that previously reported by Sayo et al. (2011), who found that studies with fewer coauthors reported larger ventricular‐brain ratio abnormalities in patients with schizophrenia. They suggested that larger research groups may be more conservative or exacting in their research methodology.

Similarly, we also found a differential effect for publication year, with a significant correlation between sample size and number of foci for VBM studies published up to 2008 and no correlation for studies published after 2008 (statistically significant between‐subgroups differences). The reasons for this observation are highly speculative and it could be a chance finding, given the number of moderator variables assessed. It could be, for instance, that because the structural abnormalities in many disorders have been more or less established in previous studies, only studies finding such abnormalities are published. Alternatively, newer studies use advanced VBM algorithms (i.e., DARTEL [Ashburner, 2007b], introduced shortly before 2008) [Ashburner, 2007a] which could enable them to detect most of the abnormalities with even relatively small sample sizes. However, this seems unlikely to be the case given that the number of foci reported in these studies is in fact lower than in older studies (9–74% decrease depending on the sample size). The causes of this decrease are also highly speculative. On the one hand, in recent years investigators may have conservatively thresholded the analyses when the results appeared in brain regions that were unexpected based on the results of previous studies. On the other hand, some of the new VBM algorithms that have been introduced on theoretical grounds lack formal empirical validation and may have had a detrimental impact on the sensitivity of the analyses. 
A third possibility is that these new VBM algorithms have resulted in fewer false‐positives compared to standard VBM, for instance by improving the spatial registration of the images or minimizing the impact of non‐normality [Salmond et al., 2002; Viviani et al., 2007]. Because the causes for the lower number of significant foci after 2008 are speculative, the implications of this observation for the minimal appropriate N per group are also unclear.

Finally, we tested the sample size/number of foci correlation hypothesis at the meta‐analytical level (Fig. 7). We found that the relationship was absent when meta‐analyses with fewer than 10 studies were included in a Poisson regression. Small meta‐analyses report more foci than larger ones. This finding may be useful to guide editors, reviewers and authors to improve the reliability of voxel‐based meta‐analyses by either setting the bar for the number of required studies included in meta‐analyses at k ≥ 10 or ensuring that high‐quality null findings in meta‐analyses of conditions with k < 10 available published singleton studies are available after systematic literature review, and ideally, also access to registries of data, since it is notoriously difficult to unearth unpublished unregistered data. The fact that meta‐analyses with many studies validate few foci (median = 3) also suggests that the larger numbers of foci reported in small studies and small meta‐analyses may be inflated by several false‐positives.

Some caveats should be discussed about our study. First, we cannot rule out the possibility that in some cases large studies and even large meta‐analyses may suffer from reporting and/or other biases that inflate the number of discovered loci. Conversely, some small studies may be more meticulous and thus optimize the yield of discoveries, despite their limited sample size. However, it is unlikely that there would be a systematic error in favor of small studies being better than larger studies in this regard. 
Our analysis focuses on the big picture, including many hundreds of studies. Second, the total number of genuine loci to be discovered in each disease and condition is unknown and power calculations require making assumptions about how many of such abnormalities would be detected. Most likely, the number and magnitude of abnormalities differ substantially across different diseases and conditions. Thus, again, our approach offers an aggregate view of the big picture and it may not be possible to extrapolate inferences to each of the topics that we analyzed. Even if reporting biases are present in the field‐at‐large, this does not mean that all sub‐fields and each topic are equally affected. Third, there is preliminary evidence that VBM studies with a smaller sample size may be more susceptible to false positives than those with a larger sample size; this is due to the impact of non‐normality of the data [Salmond et al., 2002; Viviani et al., 2007], which is critically dependent on sample size [Scarpazza et al., in press]. Thus we cannot exclude the possibility that our results reflect differences in false positive rates as a function of sample size rather than reporting biases.
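
The power argument that underlies this discussion can be illustrated with a simple calculation. The sketch below approximates the power of a two‐sided, two‐group comparison at a single voxel using a normal approximation. The effect size (Cohen's d = 0.5), the uncorrected threshold (alpha = 0.001) and the group sizes are illustrative assumptions, not values taken from the studies analyzed.

```python
from math import sqrt
from statistics import NormalDist


def approx_power(d, n_per_group, alpha=0.001):
    """Normal-approximation power of a two-sided two-sample test for effect size d."""
    std_normal = NormalDist()
    z_crit = std_normal.inv_cdf(1.0 - alpha / 2.0)
    # Expected value of the z statistic under the alternative hypothesis.
    shift = d * sqrt(n_per_group / 2.0)
    upper_tail = 1.0 - std_normal.cdf(z_crit - shift)
    lower_tail = std_normal.cdf(-z_crit - shift)
    return upper_tail + lower_tail


# Assumed effect size and group sizes, for illustration only.
powers = {n: approx_power(0.5, n) for n in (16, 32, 64, 128)}
for n, power in powers.items():
    print(n, round(power, 3))
```

Under these assumptions the probability of detecting a given true abnormality rises steeply with sample size, which is the expectation against which the observed 2% increase in foci per 10 patients is judged to be small.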

CONCLUSIONS

Acknowledging these caveats, our analysis offers some evidence about the relationship between sample size and number of discovered loci that can be used in designing the future research agenda for VBM studies and in interpreting the results of single studies and meta‐analyses thereof in this discipline.
  22 in total

1.  Distributional assumptions in voxel-based morphometry.

Authors:  C H Salmond; J Ashburner; F Vargha-Khadem; A Connelly; D G Gadian; K J Friston
Journal:  Neuroimage       Date:  2002-10       Impact factor: 6.556

2.  Third-generation neuroimaging in early schizophrenia: translating research evidence into clinical utility.

Authors:  Stefan Borgwardt; Paolo Fusar-Poli
Journal:  Br J Psychiatry       Date:  2012-04       Impact factor: 9.319

3.  Validity of modulation and optimal settings for advanced voxel-based morphometry.

Authors:  Joaquim Radua; Erick Jorge Canales-Rodríguez; Edith Pomarol-Clotet; Raymond Salvador
Journal:  Neuroimage       Date:  2013-08-08       Impact factor: 6.556

4.  A new meta-analytic method for neuroimaging studies that combines reported peak coordinates and statistical parametric maps.

Authors:  J Radua; D Mataix-Cols; M L Phillips; W El-Hage; D M Kronhaus; N Cardoner; S Surguladze
Journal:  Eur Psychiatry       Date:  2011-06-11       Impact factor: 5.361

Review 5.  Voxel-wise meta-analysis of grey matter changes in obsessive-compulsive disorder.

Authors:  Joaquim Radua; David Mataix-Cols
Journal:  Br J Psychiatry       Date:  2009-11       Impact factor: 9.319

6.  When the single matters more than the group: very high false positive rates in single case Voxel Based Morphometry.

Authors:  C Scarpazza; G Sartori; M S De Simone; A Mechelli
Journal:  Neuroimage       Date:  2013-01-02       Impact factor: 6.556

7.  Excess significance bias in the literature on brain volume abnormalities.

Authors:  John P A Ioannidis
Journal:  Arch Gen Psychiatry       Date:  2011-04-04

8.  Study factors influencing ventricular enlargement in schizophrenia: a 20 year follow-up meta-analysis.

Authors:  Angelo Sayo; Robin G Jennings; John Darrell Van Horn
Journal:  Neuroimage       Date:  2011-07-20       Impact factor: 6.556

9.  Multimodal meta-analysis of structural and functional brain changes in first episode psychosis and the effects of antipsychotic medication.

Authors:  J Radua; S Borgwardt; A Crescini; D Mataix-Cols; A Meyer-Lindenberg; P K McGuire; P Fusar-Poli
Journal:  Neurosci Biobehav Rev       Date:  2012-08-10       Impact factor: 8.989

10.  Why are psychiatric imaging methods clinically unreliable? Conclusions and practical guidelines for authors, editors and reviewers.

Authors:  Stefan Borgwardt; Joaquim Radua; Andrea Mechelli; Paolo Fusar-Poli
Journal:  Behav Brain Funct       Date:  2012-09-01       Impact factor: 3.759
