Alexander Berger, Markus Kiefer.
Abstract
In response time (RT) research, RT outliers are typically excluded from statistical analysis to improve the signal-to-noise ratio. However, several methods for outlier exclusion exist, which raises the question of how these methods differ in recovering the uncontaminated RT distribution. In the present simulation study, two RT distributions with a given population difference were simulated in each iteration. RTs were then replaced by outliers following two different approaches: the first generated outliers at the tails of the distribution, and the second inserted outliers overlapping with the genuine RT distribution. We applied ten different outlier exclusion methods and tested how many pairs of distributions differed significantly. Outlier exclusion methods were compared in terms of bias, defined as the deviation of the proportion of significant differences after outlier exclusion from the proportion of significant differences in the uncontaminated samples (before introducing outliers). Our results showed large differences in bias between the exclusion methods. Some methods showed a high rate of Type-I errors and should therefore clearly not be used. Overall, applying an exclusion method based on z-scores / standard deviations introduced only small biases, while the absence of outlier exclusion showed the largest absolute bias.
Keywords: mental chronometry; outlier exclusion; reaction time; response time; simulation study
Year: 2021 PMID: 34194371 PMCID: PMC8238084 DOI: 10.3389/fpsyg.2021.675558
Source DB: PubMed Journal: Front Psychol ISSN: 1664-1078
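The bias measure defined in the abstract can be written out as a short calculation. The following is a minimal sketch, not the authors' code: the function names, the significance level of 0.05, and the example p-values are all assumptions made for illustration.

```python
def proportion_significant(p_values, alpha=0.05):
    """Share of simulated comparisons whose p-value falls below alpha (alpha is assumed)."""
    if not p_values:
        raise ValueError("p_values must not be empty")
    return sum(p < alpha for p in p_values) / len(p_values)


def bias(p_after_exclusion, p_uncontaminated, alpha=0.05):
    """Proportion significant after exclusion minus the proportion in the uncontaminated samples."""
    return (proportion_significant(p_after_exclusion, alpha)
            - proportion_significant(p_uncontaminated, alpha))


# Hypothetical p-values from four simulated iterations (illustrative only).
uncontaminated = [0.01, 0.20, 0.03, 0.40]   # 2 of 4 significant
after_exclusion = [0.01, 0.04, 0.03, 0.40]  # 3 of 4 significant
example_bias = bias(after_exclusion, uncontaminated)
```

Under this definition a positive value means that more comparisons reached significance after exclusion than in the uncontaminated data, and a negative value means fewer did.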
FIGURE 1. Flowchart of the present simulation study. In a first step (list 1), valid response time distributions were simulated in two conditions. The distributions in both conditions were simulated with equal parameters N, σ and τ. In the second condition, a constant (diff) was added to μ of the first condition. In the second step (list 2), randomly chosen valid RTs were replaced by outliers. In the third step (list 3), outliers were excluded according to the different outlier exclusion methods. The figure shows three methods as examples. Excluded response times are highlighted in black. The resulting distributions were compared with t-tests.
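The first two steps of the flowchart can be sketched as follows. The parameters μ, σ and τ suggest an ex-Gaussian distribution (a normal component plus an exponential component); that reading is an assumption here. The parameter values, the number of replaced RTs, and the uniform outlier generator are placeholders and are not taken from the study.

```python
import random


def simulate_valid_rts(n, mu, sigma, tau, rng):
    """Draw n ex-Gaussian RTs: a normal(mu, sigma) part plus an exponential part with mean tau."""
    return [rng.gauss(mu, sigma) + rng.expovariate(1.0 / tau) for _ in range(n)]


def contaminate(rts, n_outliers, make_outlier, rng):
    """Replace n_outliers randomly chosen RTs with values produced by make_outlier."""
    contaminated = list(rts)
    for index in rng.sample(range(len(contaminated)), n_outliers):
        contaminated[index] = make_outlier(rng)
    return contaminated


def slow_outlier(generator):
    """Placeholder outlier generator: a slow response far above typical RTs."""
    return generator.uniform(1500.0, 3000.0)


rng = random.Random(1)
n, mu, sigma, tau, diff = 60, 500.0, 50.0, 100.0, 20.0  # all values are placeholders

# Step 1: both conditions share n, sigma and tau; diff is added to mu in condition 2.
condition_1 = simulate_valid_rts(n, mu, sigma, tau, rng)
condition_2 = simulate_valid_rts(n, mu + diff, sigma, tau, rng)

# Step 2: randomly chosen valid RTs are replaced by outliers.
contaminated_1 = contaminate(condition_1, 3, slow_outlier, rng)
contaminated_2 = contaminate(condition_2, 3, slow_outlier, rng)
```

The third step would apply an exclusion rule to each contaminated sample before the two conditions are compared with a t-test.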
Overview of the outlier exclusion methods.
| Method | Lower threshold | Upper threshold | Additional information |
| no | - | - | no correction for outliers |
| cutoff | 200 ms | mean + 1000 ms | |
| 2sd | mean – 2 × SD | mean + 2 × SD | |
| 3sd | mean – 3 × SD | mean + 3 × SD | |
| tukey1.5 | q0.25 – 1.5 × IQR | q0.75 + 1.5 × IQR | |
| q10 | q0.05 | q0.95 | |
| q05 | q0.025 | q0.975 | |
| MAD | median – 2.5 × MAD | median + 2.5 × MAD | |
| MAD_adjusted | median – 2.5 × MAD | median + 2.5 × MAD | |
| transform | –2 | 2 | z-transformation of square root of uniformized RTs |
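Several of the thresholds in the table can be computed with the Python standard library. This is a hedged sketch rather than the study's implementation: the function names and the sample data are invented, the MAD is left unscaled (whether a consistency constant was applied is not stated here), and `statistics.quantiles` uses its default interpolation, which may differ from the quantile definition used in the study. The MAD_adjusted and transform methods are omitted because the table does not specify them fully.

```python
import statistics


def cutoff_bounds(rts):
    """'cutoff' row: fixed lower bound of 200 ms, upper bound of mean + 1000 ms."""
    return 200.0, statistics.fmean(rts) + 1000.0


def sd_bounds(rts, k):
    """'2sd' / '3sd' rows: mean plus or minus k sample standard deviations."""
    mean, sd = statistics.fmean(rts), statistics.stdev(rts)
    return mean - k * sd, mean + k * sd


def tukey_bounds(rts, k=1.5):
    """'tukey1.5' row: quartiles extended by k times the interquartile range."""
    q1, _, q3 = statistics.quantiles(rts, n=4)
    return q1 - k * (q3 - q1), q3 + k * (q3 - q1)


def q10_bounds(rts):
    """'q10' row: the 5% and 95% quantiles."""
    cuts = statistics.quantiles(rts, n=20)
    return cuts[0], cuts[-1]


def mad_bounds(rts, k=2.5):
    """'MAD' row: median plus or minus k times the (unscaled) median absolute deviation."""
    median = statistics.median(rts)
    mad = statistics.median([abs(x - median) for x in rts])
    return median - k * mad, median + k * mad


def exclude(rts, bounds):
    """Keep only RTs inside the closed interval given by bounds."""
    lower, upper = bounds
    return [x for x in rts if lower <= x <= upper]


# Invented sample in ms with one obvious slow outlier.
sample = [310, 420, 455, 480, 500, 515, 530, 560, 610, 2400]
kept_2sd = exclude(sample, sd_bounds(sample, 2))
kept_3sd = exclude(sample, sd_bounds(sample, 3))
kept_mad = exclude(sample, mad_bounds(sample))
```

In this invented sample the 3 SD rule keeps the 2400 ms value because the outlier itself inflates the standard deviation, whereas the 2 SD and MAD rules remove it. The MAD rule additionally drops the 310 ms value.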
Statistics of the valid and contaminated RTs (in milliseconds) for the two approaches to simulating outliers.
| Sample | Statistic | Mean | SD | Min | Max |
| valid RTs | | 600.07 | 90.71 | 360.38 | 873.95 |
| | | 175.57 | 27.24 | 64.30 | 409.49 |
| | | 60.02 | 23.08 | 20 | 100 |
| contaminated RTs | | 667.81 | 101.28 | 360.38 | 1060.02 |
| | | 374.86 | 117.35 | 78.76 | 939.19 |
| valid RTs | | 599.96 | 90.81 | 362.97 | 902.26 |
| | | 175.64 | 27.26 | 71.81 | 427.38 |
| | | 60.01 | 23.11 | 20 | 100 |
| contaminated RTs | | 636.28 | 93.67 | 362.97 | 965.34 |
| | | 247.43 | 45.50 | 71.81 | 513.03 |
Results of the model predicting t-values.
| Coefficient | β | t | SE |
| Intercept | –0.988 | –468.0 | 0.002 |
| given difference | 0.033 | 2342.8 | <0.001 |
| Sample SD | –0.002 | –462.6 | <0.001 |
| Sample N | 0.014 | 800.1 | <0.001 |
| cutoff | 0.340 | 186.9 | 0.002 |
| 3sd | 0.406 | 223.3 | 0.002 |
| q05 | 0.415 | 228.2 | 0.002 |
| transform | 0.679 | 373.6 | 0.002 |
| q10 | 0.688 | 378.9 | 0.002 |
| 2sd | 0.691 | 380.4 | 0.002 |
| tukey1.5 | 0.955 | 525.7 | 0.002 |
| MAD | 1.171 | 644.5 | 0.002 |
| MAD_adjusted | 1.438 | 791.4 | 0.002 |
FIGURE 2. Bias for the outlier simulation approach . The x-axis shows the population difference between the two conditions. A positive bias value indicates a larger proportion of significant t-tests after outlier exclusion compared to valid RTs, a negative value a smaller proportion. The black line serves as reference (= no bias). Method names are abbreviations (see Table 1).
FIGURE 3. Bias for the outlier simulation approach . The x-axis shows the population difference between the two conditions. A positive bias value indicates a larger proportion of significant t-tests after outlier exclusion compared to valid RTs, a negative value a smaller proportion. The black line serves as reference (= no bias). Method names are abbreviations (see Table 1).
Average bias of outlier exclusion methods according to the different approaches to simulate outliers.
| Bias | |||
| Method | |||
| no | – | – | –0.160 |
| MAD_adjusted | 0.202 | 0.202 | |
| MAD | 0.156 | 0.155 | 0.157 |
| tukey1.5 | 0.110 | 0.108 | 0.113 |
| cutoff | –0.087 | –0.050 | |
| q05 | |||
| 3sd | |||
| transform | 0.033 | 0.012 | 0.053 |
| 2sd | 0.031 | 0.062 | |
| q10 | 0.023 | 0.036 | |