John E Byrd, Carrie B LeGarde.
Abstract
Evaluation of method performance involves the consideration of numerous factors that can contribute to error. A variety of measures of performance can be borrowed from the signal detection literature and others are drawn from statistical science. This article demonstrates the principles of performance evaluation by applying multiple measures to osteometric sorting models for paired elements run against data from known individuals. Results indicate that false positive rates are close, on average, to expected values. As assemblage size grows, the false positive rate becomes unimportant and the false negative rate becomes significant. Size disparity among the commingled individuals plays a significant role in method performance, showing that case-specific circumstances (e.g. assemblage size and size disparity) will determine method power.
Keywords: Forensic sciences; error; forensic anthropology; method performance; osteometric sorting; signal detection; statistics; test method
Year: 2019 PMID: 30788451 PMCID: PMC6374941 DOI: 10.1080/20961790.2018.1535762
Source DB: PubMed Journal: Forensic Sci Res ISSN: 2471-1411
Projections for future performance given performance metrics reported above for humerus, radius, and ulna (including TL).
| Items | Sample | Prev | PPV (Humerus) | PPV (Radius) | PPV (Ulna) | NPV (Humerus) | NPV (Radius) | NPV (Ulna) |
|---|---|---|---|---|---|---|---|---|
| 2 | 4 | 2 | 0.94 | 0.87 | 0.81 | 0.86 | 0.85 | 0.83 |
| 3 | 9 | 6 | 0.97 | 0.93 | 0.89 | 0.75 | 0.74 | 0.71 |
| 4 | 16 | 12 | 0.98 | 0.95 | 0.93 | 0.66 | 0.66 | 0.63 |
| 5 | 25 | 20 | 0.99 | 0.96 | 0.94 | 0.60 | 0.59 | 0.56 |
| 6 | 36 | 30 | 0.99 | 0.97 | 0.95 | 0.54 | 0.54 | 0.50 |
| 7 | 49 | 42 | 0.99 | 0.98 | 0.96 | 0.50 | 0.49 | 0.45 |
| 8 | 64 | 56 | 0.99 | 0.98 | 0.97 | 0.46 | 0.45 | 0.42 |
| 9 | 81 | 72 | 0.99 | 0.98 | 0.97 | 0.43 | 0.42 | 0.38 |
| 10 | 100 | 90 | 0.99 | 0.98 | 0.97 | 0.40 | 0.39 | 0.36 |
| 11 | 121 | 110 | 0.99 | 0.98 | 0.98 | 0.37 | 0.37 | 0.33 |
| 12 | 144 | 132 | 0.99 | 0.99 | 0.98 | 0.35 | 0.35 | 0.31 |
| 13 | 169 | 156 | 1.00 | 0.99 | 0.98 | 0.33 | 0.33 | 0.29 |
| 14 | 196 | 182 | 1.00 | 0.99 | 0.98 | 0.31 | 0.31 | 0.28 |
| 15 | 225 | 210 | 1.00 | 0.99 | 0.98 | 0.30 | 0.29 | 0.26 |
| 16 | 256 | 240 | 1.00 | 0.99 | 0.98 | 0.28 | 0.28 | 0.25 |
| 17 | 289 | 272 | 1.00 | 0.99 | 0.99 | 0.27 | 0.27 | 0.24 |
| 18 | 324 | 306 | 1.00 | 0.99 | 0.99 | 0.26 | 0.25 | 0.23 |
| 19 | 361 | 342 | 1.00 | 0.99 | 0.99 | 0.25 | 0.24 | 0.22 |
| 20 | 400 | 380 | 1.00 | 0.99 | 0.99 | 0.24 | 0.23 | 0.21 |
| 30 | 900 | 870 | 1.00 | 0.99 | 0.99 | 0.17 | 0.17 | 0.15 |
| 50 | 2 500 | 2 450 | 1.00 | 1.00 | 1.00 | 0.11 | 0.11 | 0.09 |
| 70 | 4 900 | 4 830 | 1.00 | 1.00 | 1.00 | 0.08 | 0.08 | 0.07 |
| 90 | 8 100 | 8 010 | 1.00 | 1.00 | 1.00 | 0.06 | 0.06 | 0.05 |
| 100 | 10 000 | 9 900 | 1.00 | 1.00 | 1.00 | 0.06 | 0.06 | 0.05 |
| 110 | 12 100 | 11 990 | 1.00 | 1.00 | 1.00 | 0.05 | 0.05 | 0.04 |
| 120 | 14 400 | 14 280 | 1.00 | 1.00 | 1.00 | 0.05 | 0.05 | 0.04 |
| 160 | 25 600 | 25 440 | 1.00 | 1.00 | 1.00 | 0.04 | 0.04 | 0.03 |
| 200 | 40 000 | 39 800 | 1.00 | 1.00 | 1.00 | 0.03 | 0.03 | 0.02 |
| 300 | 90 000 | 89 700 | 1.00 | 1.00 | 1.00 | 0.02 | 0.02 | 0.02 |
| 400 | 160 000 | 159 600 | 1.00 | 1.00 | 1.00 | 0.01 | 0.01 | 0.01 |
Items: the number of individuals, each with a pair of elements.
Sample: the number of comparisons, calculated as N(indiv)².
Prev: prevalence, defined as the number of pairwise comparisons NOT from the same individual.
TL: total length of bone; PPV: positive predictive value; NPV: negative predictive value
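The projections in this table can be approximated with a short Python sketch. It assumes the standard Bayes-rule forms of PPV and NPV, and it assumes that a positive result means the pair comes from different individuals, which is consistent with the prevalence definition above. The function name `predictive_values` and its parameters are illustrative and do not come from the article; the sensitivity (0.84) and specificity (0.95) passed in are the values reported for the humerus model with TL.

```python
def predictive_values(sensitivity, specificity, n_individuals):
    """Project (PPV, NPV) for an assemblage of n_individuals.

    Assumption: each individual contributes one pair of elements, every
    left element is compared with every right element, and a "positive"
    is a pair from different individuals.
    """
    comparisons = n_individuals ** 2           # "Sample" column
    non_matches = comparisons - n_individuals  # "Prev" column
    prevalence = non_matches / comparisons     # proportion of positives

    # Expected proportions of each outcome among all comparisons.
    true_pos = sensitivity * prevalence
    false_neg = (1 - sensitivity) * prevalence
    true_neg = specificity * (1 - prevalence)
    false_pos = (1 - specificity) * (1 - prevalence)

    ppv = true_pos / (true_pos + false_pos)
    npv = true_neg / (true_neg + false_neg)
    return ppv, npv


# Humerus model (with TL), using the reported SE and SP.
for n in (2, 3, 10, 100):
    ppv, npv = predictive_values(0.84, 0.95, n)
    print(f"N={n:>3}  PPV={ppv:.2f}  NPV={npv:.2f}")
```

For the assemblage sizes looped over here, the printed values agree with the humerus columns of the table to two decimals. Because the inputs are themselves rounded, small last-digit differences are possible for other models or sizes.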
Showing the effects of difference in body size (as represented by stature) on test performance.
| Model | Sample | FNR (Δ = 4 inches) | FNR (Δ = 6 inches) | FNR (Δ = 8 inches) |
|---|---|---|---|---|
| Humerus | 437 | 0.06 | 0.03 | 0.03 |
| Radius | 407 | 0.05 | 0.02 | 0.01 |
| Ulna | 367 | 0.05 | 0.01 | 0.01 |
FNR: false negative rate
Performance metrics for the approach recommended by Vickers et al. [6], applied to Byrd data.
| Model | Sample | Standard [D] (mm) | FNR | SE | SP | PPV | NPV | EFF |
|---|---|---|---|---|---|---|---|---|
| Humerus | 188 + 997 | 0–31 | 0.55 | 0.45 | 1 | 1 | 0.45 | 0.54 |
| Radius | 117 + 931 | 0–23 | 0.56 | 0.44 | 1 | 1 | 0.44 | 0.50 |
| Ulna | 107 + 911 | 0–25 | 0.61 | 0.39 | 1 | 1 | 0.39 | 0.46 |
| Humerus (no TL) | 272 + 992 | 0–2.3 | 0.56 | 0.44 | 1 | 1 | 0.44 | 0.56 |
| Radius (no TL) | 241 + 994 | 0–3.6 | 0.41 | 0.44 | 1 | 1 | 0.59 | 0.67 |
| Ulna (no TL) | 63 + 885 | 0–5.1 | 0.57 | 0.43 | 1 | 1 | 0.43 | 0.47 |
FNR: false negative rate; SE: sensitivity; SP: specificity; PPV: positive predictive value; NPV: negative predictive value; EFF: efficiency; TL: total length
Figure 1. Receiver operating characteristic (ROC) curve for the humerus paired element model (with total length) with various P value cutoffs identified. The optimal value is P = 0.125.
False positive results from the study of Vickers et al. [6] and application of performance metrics of Byrd models.
| Source | Model | Sample | FPR | P | SP |
|---|---|---|---|---|---|
| [6] | Humerus | 1 063 | 0.09 | 0.91 | 0.91 |
| | Radius | 981 | 0.12 | 0.05 | 0.88 |
| | Ulna | 934 | 0.17 | 1.6 × 10⁻¹¹ | 0.83 |
| | Femur | 1 001 | 0.08 | 0.95 | 0.92 |
| | Tibia | 933 | 0.08 | 0.97 | 0.92 |
| | Fibula | 855 | 0.07 | 0.99 | 0.93 |
| Mean FPR | | | 0.10 | | |
| 0.47 | | | | | |
FPR: false positive rate; SP: specificity
False positive results from the study of Byrd and LeGarde [4] and application of performance metrics of Byrd models.
| Source | Model | Sample | FPR | P | Qactual | Qmax | SE | SP | κ(0,0) | κ(1,0) | PPV | NPV | EFF |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| [4] | Humerus | 188 + 997 = 1 185 | 0.05 | 0.99 | 0.01 | 0.04 | 0.84 | 0.95 | 0.93 | 0.44 | 0.99 | 0.52 | 0.86 |
| | Radius | 117 + 931 = 1 048 | 0.13 | 0.12 | 0.02 | 0.03 | 0.84 | 0.88 | 0.84 | 0.34 | 0.98 | 0.42 | 0.85 |
| | Ulna | 107 + 911 = 1 018 | 0.20 | 0.00083 | 0.03 | 0.03 | 0.84 | 0.80 | 0.75 | 0.30 | 0.97 | 0.37 | 0.84 |
| | Humerus (no TL) | 272 + 992 = 1 264 | 0.07 | 0.95 | 0.02 | 0.09 | 0.68 | 0.94 | 0.89 | 0.29 | 0.98 | 0.45 | 0.74 |
| | Radius (no TL) | 241 + 994 = 1 235 | 0.08 | 0.84 | 0.03 | 0.07 | 0.74 | 0.92 | 0.86 | 0.33 | 0.97 | 0.46 | 0.78 |
| | Ulna (no TL) | 63 + 886 = 949 | 0.08 | 0.61 | 0.01 | 0.06 | 0.67 | 0.92 | 0.88 | 0.11 | 0.99 | 0.17 | 0.69 |
| Mean FPR | | | 0.10 | | | | | | | | | | |
| 0.47 | | | | | | | | | | | | | |
FPR: false positive rate; Qactual: false discovery rate; Qmax: maximum false discovery rate; SE: sensitivity; SP: specificity; κ: sensitivity quality index; PPV: positive predictive value; NPV: negative predictive value; EFF: efficiency; TL: total length of bone; UT: University of Tennessee
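The relationships among the metrics abbreviated above can be sketched from a 2 × 2 confusion matrix using their usual definitions. The `Confusion` class and `metrics` function are hypothetical names introduced for illustration. The counts are not reported in the table: they are back-calculated from the humerus row (997 positives and 188 negatives, with SE = 0.84 and SP = 0.95) under the assumption that a positive is a pair from different individuals, so they are approximate.

```python
from dataclasses import dataclass


@dataclass
class Confusion:
    tp: int  # different-individual pairs correctly rejected
    fn: int  # different-individual pairs not rejected
    tn: int  # same-individual pairs correctly retained
    fp: int  # same-individual pairs wrongly rejected


def metrics(c: Confusion) -> dict:
    """Return the usual signal-detection metrics for one model."""
    positives = c.tp + c.fn
    negatives = c.tn + c.fp
    return {
        "SE": c.tp / positives,                      # sensitivity
        "SP": c.tn / negatives,                      # specificity
        "FPR": c.fp / negatives,                     # 1 - SP
        "FNR": c.fn / positives,                     # 1 - SE
        "Q": c.fp / (c.tp + c.fp),                   # false discovery rate
        "PPV": c.tp / (c.tp + c.fp),                 # 1 - Q
        "NPV": c.tn / (c.tn + c.fn),
        "EFF": (c.tp + c.tn) / (positives + negatives),
    }


# Approximate counts reconstructed from rounded SE and SP (assumption).
humerus = Confusion(tp=837, fn=160, tn=179, fp=9)
for name, value in metrics(humerus).items():
    print(f"{name}: {value:.2f}")
```

Since the counts are rebuilt from rounded rates, the output should match the humerus row only approximately; a difference in the last reported digit (for example in NPV) is expected.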
False positive results from the study of LeGarde [5] and application of performance metrics of Byrd models.
| Source | Element | Model | Sample | FPR | P | SP |
|---|---|---|---|---|---|---|
| [5] | Humerus | HML | 151 | 0.07 | 0.84 | 0.93 |
| | | HEB | 148 | 0.13 | 0.10 | 0.87 |
| | | MDDT | 151 | 0.13 | 0.12 | 0.87 |
| | | HML + MDDT | 151 | 0.09 | 0.66 | 0.91 |
| | | HML + MDDT + HEB | 148 | 0.09 | 0.63 | 0.91 |
| | | HML + MDDT + HEB + HMiD | 148 | 0.10 | 0.52 | 0.91 |
| | | MDDT + HMiD | 151 | 0.13 | 0.12 | 0.87 |
| | Radius | RML | 142 | 0.10 | 0.45 | 0.90 |
| | | MDRT | 145 | 0.08 | 0.70 | 0.92 |
| | | RML + MDRT | 142 | 0.13 | 0.12 | 0.87 |
| | | MDRT + RMiD + RMaD | 144 | 0.12 | 0.19 | 0.88 |
| | Femur | FML | 119 | 0.07 | 0.85 | 0.93 |
| Mean FPR | | | | 0.10 | | |
| 0.71 | | | | | | |
FPR: false positive rate; SP: specificity; UT: University of Tennessee
Using standard deviation calculated from the sample (for the UT data test, the sample and reference data together produced the standard deviation), rather than Byrd standard deviation.
False positive results from the study of LeGarde [5] and application of performance metrics of Byrd models.
| Source | Model | Sample | FPR | P | SP |
|---|---|---|---|---|---|
| [5] | Humerus | 46 | 0.13, 0.04 | 0.17, 0.85 | 0.87 (0.96) |
| | Radius | 47 | 0.15, 0.09 | 0.09, 0.51 | 0.85 (0.91) |
| | Ulna | 23 | 0.13, 0.04 | 0.19, 0.68 | 0.87 (0.96) |
| | Femur | 46 | 0.07, 0.02 | 0.69, 0.95 | 0.93 (0.98) |
| | Tibia | 46 | 0.04, 0.02 | 0.85, 0.95 | 0.96 (0.98) |
| | Fibula | 43 | 0.05, 0.05 | 0.82, 0.82 | 0.95 (0.95) |
| Mean FPR | | | 0.09, 0.04 | | |
| 0.47 | | | | | |
FPR: false positive rate; SP: specificity.