Andrew L Cohen, Jeffrey J Starns, Caren M Rotello, Andrea M Cataldo.
Abstract
BACKGROUND: The majority of eyewitness lineup studies are laboratory-based. How well the conclusions of these studies, including the relationship between confidence and accuracy, generalize to real-world police lineups is an open question. Signal detection theory (SDT) has emerged as a powerful framework for analyzing lineups that allows comparison of witnesses' memory accuracy under different types of identification procedures. Because the guilt or innocence of a real-world suspect is generally not known, however, it is further unknown precisely how the identification of a suspect should change our belief in their guilt. The probability of guilt after the suspect has been identified, the posterior probability of guilt (PPG), can only be meaningfully estimated if we know the proportion of lineups that include a guilty suspect, P(guilty). Recent work used SDT to estimate P(guilty) on a single empirical data set that shared an important property with real-world data; that is, no information about the guilt or innocence of the suspects was provided. Here we test the ability of the SDT model to recover P(guilty) on a wide range of pre-existing empirical data from more than 10,000 identification decisions. We then use simulations of the SDT model to determine the conditions under which the model succeeds and, where applicable, why it fails.
Keywords: Base rate; Computational modeling; Confidence; Eyewitness lineup; Posterior probability of guilt; Signal detection theory
Year: 2020 PMID: 32405927 PMCID: PMC7221056 DOI: 10.1186/s41235-020-00219-4
Source DB: PubMed Journal: Cogn Res Princ Implic ISSN: 2365-7464
Fig. 1. Posterior probability of guilt (PPG) as a function of base rate of guilt. Solid curve based on d′ = 1.8; dashed curve, d′ = 1.0 (c = 1 in both cases). Red lines at PPG = 0.95 (reasonable doubt) and 0.51 (probable cause)
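The dependence of PPG on the base rate can be illustrated with Bayes' rule. The following is a minimal sketch, assuming a simplified single-suspect, equal-variance SDT model in which a guilty suspect is identified with probability 1 − Φ(c − d′) and an innocent suspect with probability 1 − Φ(c). That simplification, and the function names `phi` and `ppg`, are assumptions made for illustration; they are not necessarily the computation behind the figure.

```python
import math


def phi(z: float) -> float:
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def ppg(base_rate: float, d_prime: float, c: float) -> float:
    """Posterior probability of guilt after a suspect identification.

    Assumes (for illustration only) a single-suspect, equal-variance SDT
    model: guilty strength ~ N(d_prime, 1), innocent strength ~ N(0, 1),
    and an identification is made when strength exceeds criterion c.
    """
    hit = 1.0 - phi(c - d_prime)   # P(ID | guilty)
    false_id = 1.0 - phi(c)        # P(ID | innocent)
    numerator = base_rate * hit
    denominator = numerator + (1.0 - base_rate) * false_id
    return numerator / denominator if denominator > 0 else float("nan")


if __name__ == "__main__":
    # Same parameter values as quoted in the caption: d' = 1.8 or 1.0, c = 1.
    for p in (0.1, 0.3, 0.5, 0.7, 0.9):
        print(f"P(guilty)={p:.1f}  "
              f"PPG(d'=1.8)={ppg(p, 1.8, 1.0):.3f}  "
              f"PPG(d'=1.0)={ppg(p, 1.0, 1.0):.3f}")
```

Under these assumptions, the same identification supports very different conclusions depending on P(guilty), which is why the base rate has to be known or estimated before PPG can be interpreted.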
Fig. 2. The signal detection model for lineups. Left: unequal-variance model. Right: equal-variance model. μg and σg are the mean and standard deviation of the guilty-suspect distribution. μf and σf are the mean and standard deviation of the filler distribution. c1, c2, c3 are the response criteria for the low-, medium- (med), and high-confidence response regions. ID and No ID are response regions in which an identification is or is not made, respectively
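A model of this kind can be explored by simulation. The sketch below is one possible reading, not the authors' implementation. It assumes that the witness picks the strongest lineup member, identifies that member only if its strength exceeds c1, and reports confidence according to c2 and c3. It also assumes that an innocent suspect is drawn from the filler distribution (a fair lineup). All parameter values and function names are hypothetical.

```python
import random
from collections import Counter


def simulate_lineup(guilty, mu_g, sigma_g, mu_f, sigma_f, criteria, size, rng):
    """Simulate one lineup decision and return a response label.

    Assumed decision rule (illustrative): choose the strongest face and
    identify it only if its memory strength exceeds the lowest criterion.
    """
    # An innocent suspect is assumed to come from the filler distribution.
    suspect = rng.gauss(mu_g, sigma_g) if guilty else rng.gauss(mu_f, sigma_f)
    fillers = [rng.gauss(mu_f, sigma_f) for _ in range(size - 1)]
    best_filler = max(fillers) if fillers else float("-inf")
    strength = max(suspect, best_filler)

    c1, c2, c3 = criteria
    if strength < c1:
        return "reject"
    level = "high" if strength >= c3 else "med" if strength >= c2 else "low"
    who = "suspect" if suspect >= best_filler else "filler"
    return f"{who}-{level}"


def response_proportions(guilty, n=10000, seed=0, mu_g=1.5, sigma_g=1.0,
                         mu_f=0.0, sigma_f=1.0, criteria=(1.0, 1.5, 2.0),
                         size=6):
    """Estimate response-category proportions over n simulated lineups."""
    rng = random.Random(seed)
    counts = Counter(
        simulate_lineup(guilty, mu_g, sigma_g, mu_f, sigma_f, criteria, size, rng)
        for _ in range(n)
    )
    return {label: count / n for label, count in counts.items()}


if __name__ == "__main__":
    for guilty in (True, False):
        props = response_proportions(guilty)
        print("guilty suspect" if guilty else "innocent suspect")
        for label in sorted(props):
            print(f"  {label:14s} {props[label]:.3f}")
```

Setting `sigma_g` different from `sigma_f` gives the unequal-variance variant; setting them equal gives the equal-variance variant.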
Fig. 3. Results of the signal detection theory (SDT) model analysis when fit to the full data set. Left: actual and estimated base rates for the full data set combined across all 13 experiments. The green circle shows the actual experimental base rate and the SDT-model estimate. Each gray circle shows the estimated base rate for one of the sampled (smp) data sets, with the sampled base-rate value jittered for visibility. The red lines and red x show the 10th and 90th quantiles and median of these estimated base rates. The number in the lower-right corner is the overall sample size. Middle: data results and model predictions for low- (L), medium- (M), and high- (H) confidence suspect identifications, low- (l), medium- (m), and high- (h) confidence filler identifications, and lineup rejections (R). Right: the green line shows the actual base rate in the experimental data and the red line shows the estimated base rate from the SDT model. In the upper section, the black curve shows how the model fit value changes as the model base rate is varied. The number in the lower-left provides the likelihood ratio (lr) of the model when the estimated and actual experimental base rates are used. In the lower section, the gray curve shows the distribution of estimated base rates for data simulated from the SDT model with the model base rate fixed at the actual experimental base rate. The y-axis is frequency
Fig. 4. Calibration curves using the full data set for both experimental data and model predictions. Each column is for a different sampled base rate. The top row plots the proportion of correct responses given either a rejection (rej) or a low (sL)-, medium (sM)-, or high (sH)-confidence suspect response. The bottom row plots the probability of a suspect identification given a low- (L), medium- (M), or high- (H) confidence response (suspect or filler)
Fig. 5. The model-predicted posterior probability of guilt (PPG) plotted against the actual sampled PPG for different base rates sampled from the full data set, for low-, medium-, and high-confidence responses
Fig. 6. Actual and estimated base rates for each of the 13 individual experiments (see Table 1). The green circle shows the actual experimental base rate plotted against its estimate. Each gray circle shows the estimated base rate for one of the sampled data sets (smp; the sampled base rates are jittered for visibility). The red lines and red x show the 10th and 90th quantiles and median of these estimated base rates. The number in the lower-right corner is the experiment sample size
Table 1. The experimental data used in the analyses
| Reference | Data subset | Label | N | Lineup size | Confidence levels | # confidence levels | Factors |
|---|---|---|---|---|---|---|---|
| Brewer and Wells ( | Single exp. | BW,2006 | 1200 | 8 | 0 to 100 by 10 | 11 | Instruction fairness (biased, unbiased), foil similarity (low, high) |
| Carlson et al. ( | Single exp. | C+,2016 | 1415 | 6 | 0 to 1 by .1 | 11 | Exposure duration (3 s, 10 s), weapon (present, absent) |
| Carlson, Dias, Weatherford, and Carlson ( | Single exp. | C+,2017 | 1234 | 6 | 0 to 100 by 10 | 11 | Weapon (present, absent, concealed) |
| Gronlund, Carlson, Dailey, and Goodsell ( | Single exp. (simultaneous only) | G+,2009 | 1279 | 6 | 1 to 7 by 1 | 7 | Video quality (good, poor), guilty strength (strong, weak), innocent strength (weak, strong), lineup fairness (unbiased, intermediate, biased), suspect position (2, 5) |
| Kneller and Harvey ( | Single exp. | KH,2016 | 120 | 6 | 1 to 7 by 1 | 7 | Alcohol (control, placebo, mild intoxication) |
| Mickes ( | Exp. 1 | M,2015,E1 | 302 | 6 | 0 to 100 by 10 | 11 | – |
| Mickes ( | Exp. 2 (simultaneous only) | M,2015,E2 | 238 | 6 | 0 to 100 by 10 | 11 | – |
| Mickes et al. ( | Single exp. (confidence ratings only) | M+,2017,E1 | 978 | 6 | 0–30 and 40 to 100 by 10 | 8 | – |
| Palmer et al. ( | Exp. 1 | P+,2013 | 908 | 8 | 0–20 and 30,100 by 20 | 5 | Exposure duration (90 s, 5 s), retention interval (imm., 6–103 days) |
| Rotello, Guggenmos, and Isbell ( | Exp. 1 (simultaneous only) | R,E1 | 437 | 6 | 0 to 100 by 10 | 11 | Suspect position (2, 5) |
| Rotello et al. ( | Exp. 2 (simultaneous only) | R,E2 | 521 | 6 | 0 to 100 by 10 | 11 | Suspect position (2, 5) |
| Rotello et al. ( | Exp. 3 (simultaneous only) | R,E3 | 376 | 6 | 0 to 100 by 10 | 11 | Suspect position (2, 5) |
| Wetmore et al. ( | Single exp. (simultaneous only) | W+,2015 | 1129 | 6 | 1 to 7 by 1 | 7 | Retention interval (imm., 48 h), lineup fairness (biased, unbiased, other), innocence strength (weak, strong) |
Notes: exp. = experiment; imm. = immediate
Table 2. The data analyzed by condition
| Reference | Condition | Label | N |
|---|---|---|---|
| Brewer and Wells ( | Low fairness, neutral instructions | BW,2006,LN | 300 |
| | High fairness, neutral instructions | BW,2006,HN | 300 |
| | Low fairness, liberal instructions | BW,2006,LL | 300 |
| | High fairness, liberal instructions | BW,2006,HL | 300 |
| Carlson et al. ( | Shown weapon, short exposure | C+,2016,SS | 337 |
| | No weapon, long exposure | C+,2016,NL | 368 |
| | Shown weapon, long exposureᵃ | C+,2016,SL | 710 |
| Carlson et al. ( | No weapon | C+,2017,N | 409 |
| | Concealed weapon | C+,2017,C | 418 |
| | Shown weapon | C+,2017,S | 407 |
| Gronlund et al. ( | Fair lineup, innocent weak | G+,2009,FIW | 317 |
| | Intermediate bias, innocent weak | G+,2009,IIW | 308 |
| | Biased lineup, innocent weak | G+,2009,BIW | 325 |
| | Fair lineup, innocent strong | G+,2009,FIS | 312 |
| | Intermediate bias, innocent strong | G+,2009,IIS | 320 |
| | Biased lineup, innocent strong | G+,2009,BIS | 336 |
| | Fair lineup, guilty weak | G+,2009,FGW | 320 |
| | Intermediate bias, guilty weak | G+,2009,IGW | 316 |
| | Biased lineup, guilty weak | G+,2009,BGW | 323 |
| | Fair lineup, guilty strong | G+,2009,FGS | 314 |
| | Intermediate bias, guilty strong | G+,2009,IGS | 310 |
| | Biased lineup, guilty strong | G+,2009,BGS | 336 |
| Kneller and Harvey ( | No alcohol | KH,2016,N | 40 |
| | Alcohol | KH,2016,A | 40 |
| | Placebo | KH,2016,P | 40 |
| Palmer et al. ( | Short exposure, short retention | P+,2013,SS | 253 |
| | Short exposure, long retention | P+,2013,SL | 218 |
| | Long exposure, short retention | P+,2013,LS | 219 |
| | Long exposure, long retention | P+,2013,LL | 218 |
| Wetmore et al. ( | Fair, innocent weak, short retention | W+,2015,FIWS | 178 |
| | Fair, innocent strong, short retention | W+,2015,FISS | 101 |
| | Biased, innocent weak, short retention | W+,2015,BIWS | 184 |
| | Biased, innocent strong, short retention | W+,2015,BISS | 107 |
| | Otherᵈ, innocent weak, short retention | W+,2015,OIWS | 111 |
| | Other, innocent strong, short retention | W+,2015,OISS | 107 |
| | Fair, innocent weak, long retention | W+,2015,FIWL | 176 |
| | Fair, innocent strong, long retention | W+,2015,FISL | 115 |
| | Biased, innocent weak, long retention | W+,2015,BIWL | 153 |
| | Biased, innocent strong, long retention | W+,2015,BISL | 110 |
| | Other, innocent weak, long retention | W+,2015,OIWL | 108 |
| | Other, innocent strong, long retention | W+,2015,OISL | 113 |
Note: Only a subset of these conditions is discussed in the text. The full set of results is provided on OSF
ᵃ Includes both the 3-s and 10-s weapon view. The culprit’s face was seen for 10 s in both conditions
ᵇ To perform the analysis, both guilty-suspect and innocent-suspect trials are needed. Both strength levels of the innocent or guilty suspects that were not fixed were included. All levels of video quality and suspect position were used
ᶜ To perform the analysis, both guilty-suspect and innocent-suspect trials are needed. The innocent weak/strong suspects were paired with the guilty strong suspects
ᵈ Details are not provided in Wetmore et al. (2015) or in the data regarding these instructions, but they were designed to be similar to police instructions, so were included
Fig. 7. Actual and estimated base rates for conditions from Gronlund et al. (2009). The green circle shows the actual experimental base rate and its estimate. Each gray circle shows the estimated base rate for one of the sampled data sets (smp). The red lines and red x show the 10th and 90th quantiles and median of these estimated base rates. The number in the lower-right corner is the condition sample size. See Table 2 for condition labels
Fig. 8. Actual and estimated base rates for conditions from Carlson et al. (2017). The green circle shows the actual experimental base rate and its estimate. Each gray circle shows the estimated base rate for one of the sampled data sets (smp). The red lines and red x show the 10th and 90th quantiles and median of these estimated base rates. The number in the lower-right corner is the condition sample size. See Table 2 for condition labels
Fig. 9. The response category proportions for all 47 experimental conditions with knowledge of the guilt or innocence of the suspect (top) or collapsed data (bottom). The lines show data from conditions in which the model-estimated PPG was higher (red, closed points) or lower (blue, open points) than the actual PPG. The larger symbols provide the means within each response category
Fig. 10. Results of the signal detection theory (SDT) model analysis when fit to the best-practices data set. Left: actual and estimated base rates when combined across all best-practices data. The green circle shows the actual experimental base rate and the SDT-model estimate. Each gray circle shows the estimated base rate for one of the sampled data sets (smp), with the sampled base-rate value jittered for visibility. The red lines and red x show the 10th and 90th quantiles and median of these estimated base rates. The number in the lower-right corner is the overall sample size. Middle: data results and model predictions for low- (L), medium- (M), and high- (H) confidence suspect identifications, low- (l), medium- (m), and high- (h) confidence filler identifications, and lineup rejections (R). Right: the green line shows the actual base rate in the experimental data and the red line shows the estimated base rate from the SDT model. In the upper section, the black curve shows how the model fit value changes as the model base rate is varied. The number in the lower-left provides the likelihood ratio (lr) of the model when the estimated and actual experimental base rates are used. In the lower section, the gray curve shows the distribution of estimated base rates for data simulated from the SDT model with the model base rate fixed at the actual experimental base rate. The y-axis is frequency
Fig. 11. Calibration curves using the best-practices data set for both experimental data and model predictions. Each column is for a different sampled base rate. The top row plots the proportion of correct responses given either a rejection (rej) or a low (sL)-, medium (sM)-, or high (sH)-confidence suspect response. The bottom row plots the probability of a suspect identification given a low- (L), medium- (M), or high- (H) confidence response (suspect or filler)
Fig. 12. The model-predicted posterior probability of guilt (PPG) plotted against the actual sampled PPG for different base rates sampled from the best-practices data set, for low-, medium-, and high-confidence responses
Fig. 13. Parameter recovery of μg, c, and p from data simulated with the signal detection theory (SDT) model
Fig. 14. The change in the signal detection theory (SDT) response distribution collapsed across innocent-suspect and guilty-suspect lineups when the proportion of suspect picks is increased by changing either memory strength or the base rate of guilty suspects in the SDT model
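The collapsed response distribution is a mixture of the guilty-suspect and innocent-suspect response distributions, weighted by the base rate. The following minimal sketch shows only that mixing step. The function name `collapse` and the two example distributions are hypothetical illustrations; the numbers are not taken from the paper's data or simulations.

```python
def collapse(p_guilty, resp_guilty, resp_innocent):
    """Mix two conditional response distributions by the base rate.

    Returns P(response) = p * P(response | guilty)
                        + (1 - p) * P(response | innocent).
    """
    if not 0.0 <= p_guilty <= 1.0:
        raise ValueError("p_guilty must lie in [0, 1]")
    categories = sorted(set(resp_guilty) | set(resp_innocent))
    return {
        k: p_guilty * resp_guilty.get(k, 0.0)
        + (1.0 - p_guilty) * resp_innocent.get(k, 0.0)
        for k in categories
    }


# Hypothetical proportions for illustration only (NOT from the paper).
RESP_GUILTY = {"suspect": 0.60, "filler": 0.15, "reject": 0.25}
RESP_INNOCENT = {"suspect": 0.10, "filler": 0.45, "reject": 0.45}

if __name__ == "__main__":
    for p in (0.2, 0.5, 0.8):
        mixed = collapse(p, RESP_GUILTY, RESP_INNOCENT)
        print(p, {k: round(v, 3) for k, v in mixed.items()})
```

Raising `p_guilty` increases the collapsed proportion of suspect picks, and so would making the guilty-suspect distribution itself more suspect-heavy (stronger memory). When only the collapsed data are observed, the model has to tell these two sources apart from the shape of the whole response distribution.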
Fig. 15. Actual and estimated posterior probability of guilt (PPG) for data simulated by the signal detection theory (SDT) model