Abstract
Social research is plagued by many biases. Most of them are due to the situation specificity of social behavior and can be explained using a theory of situation specificity. The historical background of situation specificity in personality and social psychology research is briefly sketched; then a theory of situation specificity is presented in detail. Its centerpiece is the relationship between a behavior and its outcome, which can be described as either "the more, the better" or "not too much and not too little." This theory is applied to the reliability and validity of assessments in social research. The distinction between "maximum performance" and "typical performance" is shown to correspond to the two behavior-outcome relations. For maximum performance, issues of reliability and validity are much easier to solve, whereas typical performance is sensitive to biases, as predicted by the theory. Finally, it is suggested that biases in social research are not just systematic error but represent relevant features to be explained just like other behavior, and that the respective theories should be integrated into a theory system.
Keywords: assessment; bias; maximum performance; reliability; situation specificity; systematic error; typical performance; validity
Year: 2011 PMID: 21713072 PMCID: PMC3113195 DOI: 10.3389/fpsyg.2011.00018
Source DB: PubMed Journal: Front Psychol ISSN: 1664-1078
Types of consistency (with additions from Patry, .
| | Relative consistency or situation specificity | Absolute consistency or situation specificity | Coherence |
|---|---|---|---|
| Definition | Rank order of subjects similar | Absolute value of (groups of) people equal | Reliable behavior patterns |
| Key figure | Correlation or the like | ANOVA, Interaction P × S | |
| Indicator of situation specificity | Small correlation (absolute value; low effect size) | Significant differences (high effect size) | High variance accounted for P × S |
| Subjects | Same | Same or similar (through random assignment) | Same |
| Number of subjects | Several | One or more | Several |
| Assessment tool | Similar or different | Similar | Same |
| Required scale level | At least ordinal | Nominal possible | Usually interval |
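
The difference between relative and absolute consistency in the table above can be illustrated with a minimal sketch. The scores are hypothetical and the function names (`relative_consistency`, `absolute_difference`) are assumptions chosen for illustration; a rank correlation stands in for "correlation or the like," and a mean within-subject difference stands in for the comparison an ANOVA would test.

```python
from statistics import mean


def ranks(values):
    """Return 1-based ranks; tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = avg
        i = j + 1
    return result


def pearson(x, y):
    """Pearson correlation of two equally long score lists."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)


def relative_consistency(sit_a, sit_b):
    """Rank correlation: is the rank order of subjects similar?"""
    return pearson(ranks(sit_a), ranks(sit_b))


def absolute_difference(sit_a, sit_b):
    """Mean within-subject difference: are the absolute values equal?"""
    return mean(b - a for a, b in zip(sit_a, sit_b))


# Hypothetical scores of the same five subjects in two situations.
situation_a = [2.0, 3.0, 4.0, 5.0, 6.0]
situation_b = [5.0, 6.0, 7.0, 8.0, 9.0]  # every subject shifts by 3 points

rho = relative_consistency(situation_a, situation_b)
shift = absolute_difference(situation_a, situation_b)
print(f"rank correlation: {rho:.2f}, mean shift: {shift:.1f}")
```

In this constructed case the rank order is fully preserved (relative consistency) while the level of behavior changes between situations (absolute situation specificity), so the two notions can diverge for the same data.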
Figure 1. Relationships between the behavior and its outcome value; a: the more, the better; b: the less, the better; c: not too much and not too little.
Figure 2. “Not too much and not too little”: different optima in different situations; d, e, f: different optima.
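
The two behavior-outcome relations named in these captions can be sketched as simple functions. This is an illustrative assumption, not the article's formal model: the linear and quadratic shapes, the `width` parameter, and the optima assigned to situations d, e, and f are all hypothetical.

```python
def more_is_better(behavior):
    """Relation a: outcome value rises monotonically with the behavior."""
    return behavior


def optimum_relation(behavior, optimum, width=1.0):
    """Relation c: inverted U; outcome value peaks at a situation-specific optimum."""
    return -(((behavior - optimum) / width) ** 2)


def best_behavior(value_fn, candidates):
    """Return the candidate behavior level with the highest outcome value."""
    return max(candidates, key=value_fn)


levels = list(range(11))            # behavior intensity from 0 to 10
optima = {"d": 2, "e": 5, "f": 8}   # hypothetical optima for situations d, e, f

best_a = best_behavior(more_is_better, levels)
best_c = {
    situation: best_behavior(lambda b, o=opt: optimum_relation(b, o), levels)
    for situation, opt in optima.items()
}
print(best_a, best_c)
```

Under "the more, the better," the best behavior is the same in every situation (the maximum available level), whereas under "not too much and not too little" the best behavior changes with the situation, which is where situation specificity enters.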
Figure 3. Directivity of one teacher in five classes; teaching is once a week.
Differences between maximum and typical performance assessments following Fiske and Butler (.
| Feature | Maximum performance | Typical performance |
|---|---|---|
| Assessed variable | Ability to respond | Disposition to respond |
| Generalization aimed at | What someone can do, but not what he or she will do | What someone is likely to do; the ability to do so is assumed to be given |
| Method | Usually assessed directly: The subject does what the researcher is interested in | Assessed indirectly: The subject describes what he or she does or feels in certain situations or reacts to ambiguous material; sometimes observation is used |
| Instruction: What is assessed? | “This is an ability test.” The subject is informed about the ability at stake (intelligence, knowledge, creativity, etc.) | Usually the subjects are not told that it is a (personality) test and about the disposition at stake to avoid reactive effects |
| Instruction: Right answer | “Give the right answer to each question!” “There is only one right answer” | “There is no right or wrong answer” |
| Instruction: How to answer | “Try to give your best!” | “Be as honest as possible!” |
| Instruction: Number of answers | “Do not expect to be able to answer all questions!” | “Please answer all questions, do not leave out any!” |
| Dealing with missing values | Missing answers are counted as errors | Missing answers lead to elimination of the subject from the sample (or to a guess at what the answer would have been) |
| Instruction: Clearness | Instruction is not always clear, but clearness is aimed at | Instruction is not always clear, but ambiguity is often intended, particularly in projective tests |
| Implicit understanding | The subject assumes that the researcher wants him or her to do his or her best | The subject has no information about what the researcher is aiming at; he or she may guess (rightly or wrongly) |
| Relationship researcher-subject | Researcher controls the situation; for the test to be possible, the subject must accept his or her role; researcher and subject agree in their goals: harmonic relationship | Researcher controls the situation; for the test to be possible, the subject must accept his or her role; researcher and subject have different goals: relationship is not harmonic |
| “Difficulty” (probability of answers of a certain type) | True difficulty: There are items that the subject cannot answer (within the time restrictions); difficulty is important | All items can be answered in all ways by all subjects; “difficulty” plays no role |
| Robustness | Slight differences in the formulation of the items and in context factors have no influence on difficulty | Slight differences in the formulation of the items or in context factors have an important influence on the results |
| “Upper limit” | There is an upper limit in performance: ability | There is no “upper limit” |
| Response strategy | Usually the strategy used by the subject is the one assumed by the researcher | Usually the researcher has no information about the answer strategy used by the subject |
| Consequences for the subject | Usually the subject knows quite well what consequences his or her answers will have | Usually the subject does not know how his or her answers will be interpreted and what consequences a specific answer will have (but he or she can guess) |
| Comparability | Assessments of different subjects are comparable: The test assesses the same construct for all subjects | Assessments of different subjects may assess different constructs (particularly in projective tests) |
| Reliability | Stable, high internal consistency | Lower stability, lower internal consistency; reduced applicability of test theory |
| Judgment criterion | The more, the better | There is an optimum: not too much and not too little; the optimum may differ from situation to situation |
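
The "Reliability" row refers to internal consistency. As a hedged illustration, the sketch below computes Cronbach's alpha, one commonly used internal-consistency coefficient; the source does not say which coefficient is meant, and the item scores are hypothetical.

```python
from statistics import variance


def cronbach_alpha(items):
    """Cronbach's alpha for a list of items.

    `items` holds one list of scores per item, with subjects in the same
    order in every list. Sample variances (n - 1) are used throughout.
    """
    k = len(items)
    totals = [sum(scores) for scores in zip(*items)]  # total score per subject
    item_variance_sum = sum(variance(scores) for scores in items)
    return k / (k - 1) * (1 - item_variance_sum / variance(totals))


# Hypothetical data: three items answered by five subjects.
consistent_items = [
    [1, 2, 3, 4, 5],
    [2, 3, 4, 5, 6],
    [1, 2, 3, 4, 5],
]
noisy_items = [
    [1, 2, 3, 4, 5],
    [2, 1, 4, 3, 5],
    [1, 3, 2, 5, 4],
]

alpha_consistent = cronbach_alpha(consistent_items)
alpha_noisy = cronbach_alpha(noisy_items)
print(f"{alpha_consistent:.2f} {alpha_noisy:.2f}")
```

When the items order subjects identically, alpha reaches its maximum; once the items disagree about the rank order, alpha drops, which is the pattern the table attributes to typical performance measures.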
Real-world correlations (excerpt from Follman, .
| Variables | Correlation |
|---|---|
| IQ test reliability | 0.90s |
| Standardized school achievement test reliabilities | 0.90s |
| IQ and school achievement–grade 1 | 0.85-0.90 |
| IQ and school achievement–college from high school | 0.50-0.55 |
| GRE and graduate school grade point average | 0.00-0.40 |
| IQ and memory (higher with age into adulthood) | 0.50-0.70 |
| School achievement (cognitive) and affective | 0.35 |
| School achievement and self-concept | 0.35 |
| School achievement and motivation | 0.35 |
| School achievement and student ratings of teacher effectiveness | 0.44 |
| IQ and self-concept | 0.35 |
| IQ and creativity | 0.35 |