Anja F Ernst, Casper J Albers.
Abstract
Misconceptions about the assumptions behind the standard linear regression model are widespread and dangerous. They lead to linear regression being used when it is inappropriate, and to alternative procedures with less statistical power being employed when they are unnecessary. Our systematic literature review investigated the use and reporting of assumption checks in twelve clinical psychology journals. Findings indicate that normality of the variables themselves, rather than of the errors, was wrongly treated as a necessary assumption in 4% of papers that use regression. Furthermore, 92% of all papers using linear regression were unclear about their assumption checks, violating APA recommendations. This paper appeals for heightened awareness of, and increased transparency in, the reporting of statistical assumption checking.
Keywords: Linear regression; Literature review; Misconceptions about normality; Statistical assumptions
Year: 2017 PMID: 28533971 PMCID: PMC5436580 DOI: 10.7717/peerj.3323
Source DB: PubMed Journal: PeerJ ISSN: 2167-8359 Impact factor: 2.984
Figure 1. Simulated example of a t-test based on n = 40 observations per group and no violations of the assumptions.
The main panel shows a scatterplot of (X, Y)-scores. The red curve corresponds to the best-fitting normal distribution for Y, whereas the blue curves correspond to the best-fitting normal distributions for the two subpopulations of Y. The histograms in the top and side panels clearly indicate non-normality for X and Y. However, within both subpopulations the distribution of Y is normal.
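The situation in this figure can be reproduced with a few lines of standard-library Python. The sketch below is not the authors' simulation code: the group shift of 5, the seed, and the use of sample excess kurtosis as a rough stand-in for a normality check are all assumptions made for illustration, as are the function names. It generates a binary predictor X with 40 observations per group, adds standard normal errors, and compares the pooled Y-scores with the residuals left after subtracting each group's mean (which is what regressing Y on a binary X amounts to).

```python
import random
import statistics


def excess_kurtosis(values):
    """Sample excess kurtosis; close to 0 for normally distributed data."""
    values = list(values)
    mean = statistics.fmean(values)
    m2 = statistics.fmean([(v - mean) ** 2 for v in values])
    m4 = statistics.fmean([(v - mean) ** 4 for v in values])
    return m4 / m2 ** 2 - 3.0


def simulate_two_groups(n_per_group=40, group_shift=5.0, seed=1):
    """Y = group_shift * X + e with e ~ N(0, 1); X is a 0/1 group label.

    group_shift and seed are illustrative assumptions, not values from the paper.
    """
    rng = random.Random(seed)
    x = [0] * n_per_group + [1] * n_per_group
    y = [group_shift * xi + rng.gauss(0.0, 1.0) for xi in x]
    return x, y


def residuals_from_group_means(x, y):
    """Residuals of regressing Y on a binary X: deviations from the group means."""
    means = {
        g: statistics.fmean([yi for xi, yi in zip(x, y) if xi == g])
        for g in set(x)
    }
    return [yi - means[xi] for xi, yi in zip(x, y)]


if __name__ == "__main__":
    x, y = simulate_two_groups()
    res = residuals_from_group_means(x, y)
    # Pooled Y is bimodal (clearly negative excess kurtosis),
    # while the residuals are draws from a normal distribution.
    print("excess kurtosis of Y:        ", round(excess_kurtosis(y), 2))
    print("excess kurtosis of residuals:", round(excess_kurtosis(res), 2))
```

With well-separated groups the pooled Y-scores look strongly non-normal even though the model's only distributional assumption, normality of the errors, holds exactly. A normality check applied to Y itself would therefore reject a model that is correctly specified.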
Table 1. Selection of Clinical Psychology Journals.
The first column gives the ranking of the journal, the first number denoting the quartile in which the journal falls, the second number the rank of the journal within that quartile.
| Label | Journal |
|---|---|
| Q1.1 | Annual Review of Clinical Psychology |
| Q1.2 | Clinical Psychology Review |
| Q1.3 | Journal of Consulting and Clinical Psychology |
| Q2.1 | International Psychogeriatrics |
| Q2.2 | Journal of Attention Disorders |
| Q2.3 | American Journal of Drug and Alcohol Abuse |
| Q3.1 | Zeitschrift für Klinische Psychologie und Psychotherapie |
| Q3.2 | Journal of Obsessive-Compulsive and Related Disorders |
| Q3.3 | International Journal of Psychology and Psychological Therapy |
| Q4.1 | Internet Journal of Mental Health |
| Q4.2 | Indian Journal of Psychological Medicine |
| Q4.3 | Behaviour Change |
Figure 2. PRISMA flow diagram of included records.
Table 2. Classification of the reviewed regression papers.
Rubrics 3 and 5–12 represent papers with imperfect handling of regression assumptions: in rubrics 5–7, it is unclear whether the assumptions were dealt with correctly; in rubrics 8–12, the assumptions were dealt with incorrectly.
| Class | Reason |
|---|---|
| 1 | No Model of Interest |
| 2 | Rejection of linear regression on basis of correct assumptions |
| 3 | Rejection of linear regression on basis of not meeting incorrect assumptions |
| 4 | Correct linear regression |
| 5 | Mentioned all correct assumptions but not if the ‘normality assumption’ was tested on the residuals or on |
| 6 | Did not test all but some correct assumptions, included neither normality of variables nor errors |
| 7 | Use of linear regression but no indication if any or which assumptions were tested |
| 8 | Assumed/tested normally distributed |
| 9 | Assumed/tested normally distributed |
| 10 | Assumed/tested normally distributed |
| 11 | Assumed/tested normally distributed variables but did not indicate if |
| 12 | Other misconceptions about assumptions |
Table 3. Proportion of various types of papers in our selected journals.
Categorisations are mutually exclusive and exhaustive. Journals are referred to by the labels assigned in Table 1. “Rub.” refers to the rubrics in Table 2. The online Supplemental Information 1 indicates which papers belong to each of the numbers in this table.
| Journal | Number of papers | Papers with regression | Dealing with assumptions: correctly | Dealing with assumptions: unclear | Dealing with assumptions: incorrectly | No regression: correct (violation of true assumption) | No regression: incorrect (violation of false assumption) |
|---|---|---|---|---|---|---|---|
| Q1.1 | 33 | 0 | 0 | 0 | 0 | 0 | 0 |
| Q1.2 | 86 | 6 (7%) | 0 | 6 (100%) | 0 | 0 | 0 |
| Q1.3 | 98 | 26 (28%) | 0 | 25 (100%) | 0 | 3 (100%) | 0 |
| Q2.1 | 227 | 44 (19%) | 3 (7%) | 39 (89%) | 2 (5%) | 1 (100%) | 0 |
| Q2.2 | 199 | 52 (26%) | 0 | 49 (94%) | 3 (6%) | 0 | 0 |
| Q2.3 | 54 | 14 (26%) | 0 | 14 (100%) | 0 | 0 | 0 |
| Q3.1 | 23 | 5 (22%) | 0 | 5 (100%) | 0 | 1 (50%) | 1 (50%) |
| Q3.2 | 59 | 21 (55%) | 0 | 16 (71%) | 5 (29%) | 1 (100%) | 0 |
| Q3.3 | 10 | 2 (20%) | 0 | 2 (100%) | 0 | 0 | 0 |
| Q4.1 | 2 | 1 (50%) | 0 | 1 (100%) | 0 | 0 | 0 |
| Q4.2 | 82 | 0 | 0 | 0 | 0 | 0 | 0 |
| Q4.3 | 20 | 2 (10%) | 0 | 2 (100%) | 0 | 0 | 0 |
| Total | 893 | 172 (19%) | 3 (2%) | 159 (92%) | 10 (6%) | 6 (86%) | 1 (14%) |
Notes.
Papers in Spanish excluded.
Table 4. Breakdown of the different types of ‘Unclear’ classifications.
Only journals with unclear models are listed. Categorizations are mutually exclusive and exhaustive. Journals are referred to by the labels assigned in Table 1.
| Journal | Papers in which handling of regression assumptions was unclear | Unclear if the ‘normality assumption’ was tested on the residuals or on | Did not test all but some correct assumptions | No indication if any or which assumptions were tested |
|---|---|---|---|---|
| Q1.2 | 6 | 0 | 2 (33%) | 4 (67%) |
| Q1.3 | 26 | 0 | 0 | 25 (100%) |
| Q2.1 | 39 | 4 (10%) | 5 (13%) | 30 (77%) |
| Q2.2 | 49 | 1 (2%) | 2 (4%) | 46 (94%) |
| Q2.3 | 14 | 0 | 1 (7%) | 13 (93%) |
| Q3.1 | 5 | 0 | 0 | 5 (100%) |
| Q3.2 | 16 | 0 | 0 | 16 (100%) |
| Q3.3 | 2 | 0 | 0 | 2 (100%) |
| Q4.1 | 1 | 0 | 0 | 1 (100%) |
| Q4.3 | 2 | 0 | 0 | 2 (100%) |
| Total | 159 | 5 (3%) | 10 (6%) | 144 (91%) |
Table 5. Breakdown of the types of mistakes that were observed.
Only journals with flawed models are listed. Categorizations are mutually exclusive and exhaustive. Journals are referred to by the labels assigned in Table 1.
| Journal | Articles with flawed linear regression model | Rub. 8 | Rub. 9 | Rub. 10 | Rub. 11 | Other misconceptions (rub. 12) |
|---|---|---|---|---|---|---|
| Q2.1 | 2 | 0 | 0 | 0 | 0 | 2 (100%) |
| Q2.2 | 3 | 2 (67%) | 0 | 0 | 0 | 1 (33%) |
| Q3.2 | 5 | 4 (80%) | 1 (20%) | 0 | 0 | 0 |
| Total | 10 | 6 (60%) | 1 (10%) | 0 | 0 | 3 (30%) |