Olivia Strunge Meyer1, Nina Mjølsnes Salvo2, Anne Kjærbye1, Marianne Kjersem2, Mikkel Meyer Andersen3, Erik Sørensen4, Henrik Ullum5, Kirstin Janssen2, Niels Morling1, Claus Børsting1, Gunn-Hege Olsen2, Jeppe Dyrberg Andersen1.
Abstract
Description of a perpetrator's eye colour can be an important investigative lead in a forensic case with no apparent suspects. Herein, we present 11 SNPs (Eye Colour 11-EC11) that are important for eye colour prediction and eye colour prediction models for a two-category reporting system (blue and brown) and a three-category system (blue, intermediate, and brown). The EC11 SNPs were carefully selected from 44 pigmentary variants in seven genes previously found to be associated with eye colours in 757 Europeans (Danes, Swedes, and Italians). Mathematical models using three different reporting systems: a quantitative system (PIE-score), a two-category system (blue and brown), and a three-category system (blue, intermediate, brown) were used to rank the variants. SNPs with a sufficient mean variable importance (above 0.3%) were selected for EC11. Eye colour prediction models using the EC11 SNPs were developed using leave-one-out cross-validation (LOOCV) in an independent data set of 523 Norwegian individuals. Performance of the EC11 models for the two- and three-category system was compared with models based on the IrisPlex SNPs and the most important eye colour locus, rs12913832. We also compared model performances with the IrisPlex online tool (IrisPlex Web). The EC11 eye colour prediction models performed slightly better than the IrisPlex and rs12913832 models in all reporting systems and better than the IrisPlex Web in the three-category system. Three important points to consider prior to the implementation of eye colour prediction in a forensic genetic setting are discussed: (1) the reference population, (2) the SNP set, and (3) the reporting strategy.
Keywords: DNA phenotyping; eye colour; forensic genetics; pigmentation; rs12913832
Year: 2021 PMID: 34071952 PMCID: PMC8227851 DOI: 10.3390/genes12060821
Source DB: PubMed Journal: Genes (Basel) ISSN: 2073-4425 Impact factor: 4.096
Eye colour variation in the variant discovery and model data sets.
| Data Set | Mean PIE-Score (Quantitative System 1) | Blue (Two-Category System 2) | Brown (Two-Category System 2) | Blue (Three-Category System 3) | Intermediate (Three-Category System 3) | Brown (Three-Category System 3) |
|---|---|---|---|---|---|---|
| Discovery data set (n = 757) | 0.24 | 447 (59%) | 310 (41%) | 368 (49%) | 148 (20%) | 241 (31%) |
| Model data set (n = 523) | 0.44 | 376 (72%) | 147 (28%) | 293 (56%) | 123 (24%) | 107 (20%) |
1. Statistically significant difference in mean PIE-scores between the two data sets (p < 0.05). 2. PIE-score > 0.2: blue, and PIE-score ≤ 0.2: brown. 3. PIE-score > 0.8: blue, −0.5 ≤ PIE-score ≤ 0.8: intermediate, and PIE-score < −0.5: brown.
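The cut-offs in footnotes 2 and 3 can be written as two small functions. This is only an illustrative sketch: the function names `two_category` and `three_category` are hypothetical, and the boundaries are taken directly from the footnotes (0.2 for the two-category system; 0.8 and −0.5 for the three-category system).

```python
def two_category(pie_score: float) -> str:
    """Two-category call: PIE-score > 0.2 is blue, otherwise brown."""
    return "blue" if pie_score > 0.2 else "brown"


def three_category(pie_score: float) -> str:
    """Three-category call using the 0.8 and -0.5 cut-offs."""
    if pie_score > 0.8:
        return "blue"
    if pie_score < -0.5:
        return "brown"
    # Everything from -0.5 up to and including 0.8 is intermediate.
    return "intermediate"


if __name__ == "__main__":
    for score in (0.95, 0.5, 0.2, -0.5, -0.9):
        print(score, two_category(score), three_category(score))
```

Note that the boundary values themselves (0.2, 0.8, and −0.5) fall into the brown, intermediate, and intermediate categories, respectively.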
Allele frequencies of 44 variants typed in the discovery data set and 13 variants typed in the model data set.
| Gene | Variant 1 | Reference Allele | Variant Allele | Variant Allele Frequency, Discovery Data Set (n = 757) | Variant Allele Frequency, Model Data Set (n = 523) |
|---|---|---|---|---|---|
| | rs1050976 | C | T | 0.44 | |
| | rs10530949 | TCT | - | 0.43 | |
| | rs12211228 | G | C | 0.14 | |
| | rs9378807 | C | T | 0.49 | |
| | rs11018509 | T | A | 0.29 | |
| | rs1393350 * | G | A | 0.23 | 0.23 |
| | rs2047512 | T | C | 0.35 | |
| | rs34749698 | T | C | 0.23 | |
| | rs9919559 | T | C | 0.33 | |
| | rs10491745 | T | C | 0.82 | |
| | rs12590749 | C | A | 0.37 | |
| | rs12880508 | C | T | 0.74 | |
| | rs12894551 | T | C | 0.65 | |
| | rs17128288 | A | G | 0.30 | |
| | rs17128324 | C | T | 0.17 | |
| | rs201447946 | T | TA | 0.06 | |
| | rs34755843 | CGACTCT | - | 0.16 | |
| | rs35617057 | G | T | 0.41 | |
| | rs4904887 | C | G | 0.36 | |
| | rs4904891 | G | C | 0.35 | |
| | rs4904897 | C | T | 0.22 | |
| | rs59977926 | T | C | 0.18 | |
| | rs62538950 | A | T | 0.10 | |
| | rs62538956 | T | C | 0.11 | |
| | rs7144273 | C | T | 0.49 | |
| | rs7152962 | G | A | 0.23 | |
| | rs7401792 | G | A | 0.62 | |
| | rs74606098 | C | T | 0.06 | |
| | rs79586719 | G | A | 0.06 | |
| | rs12203592 * | C | T | 0.08 | 0.09 |
| | rs1800414 | T | C | <0.01 | |
| | rs62008729 | C | T | 0.09 | |
1 Variants in bold were part of the EC11 SNPs and typed in the model data set. rs7120151 was not included in the final prediction modelling. 2 The combined frequency of rs121918166 and rs74653330 was 0.01 in the discovery data set. * Part of the IrisPlex prediction model [3]. NA: not analysed.
Twelve variants selected for the EC11 SNP set.
| Rank | Gene | Variant | Mean Variable Importance |
|---|---|---|---|
| 1 | | rs12913832 * | 74.63% |
| 2 | | rs121918166 + rs74653330 | 8.54% |
| 3 | | rs16891982 * | 6.23% |
| 4 | | rs1800407 * | 5.26% |
| 5 | | rs1408799 | 1.54% |
| 6 | | rs4904927 | 1.19% |
| 7 | | rs12896399 * | 0.68% |
| 8 | | rs1126809 | 0.47% |
| 9 | | rs7120151 1 | 0.46% |
| 10 | | rs10131374 | 0.32% |
| 11 | | rs1800401 | 0.32% |
1 Variant not included in the final prediction modelling. * Part of the IrisPlex prediction model [3].
Prediction errors for the nine eye colour prediction models (three reporting systems modelled with three SNP sets) and the IrisPlex online tool (IrisPlex Web).
| Eye Colour Prediction Model | Quantitative System 1 | Two-Category System 2 | Three-Category System 3 |
|---|---|---|---|
| EC11 | 5.07 | 0.26 | 0.59 |
| IrisPlex Norway | 5.90 | 0.30 | 0.66 |
| rs12913832 | 6.96 | 0.32 | 0.69 |
| IrisPlex Web * | NA | NA | 0.80 |
1 Prediction error is the mean squared error. 2 Prediction error for a predicted probability, p, is log(p) if the true eye colour was blue, and log(1-p) if the true eye colour was brown. 3 Prediction error is the Kullback–Leibler divergence. * The IrisPlex Web predicts eye colour according to a three-category system. NA: not analysed.
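The three error measures named in the footnotes can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names are hypothetical, the two-category error is written with a negative sign (assumed here because the reported errors are positive, whereas log(p) is negative for p < 1), and the Kullback–Leibler divergence is assumed to be taken from the observed category distribution to the predicted one. The scaling that yields the reported values is not stated in this section and is not reproduced.

```python
import math


def mean_squared_error(observed, predicted):
    """Mean squared difference between observed and predicted PIE-scores."""
    return sum((o - p) ** 2 for o, p in zip(observed, predicted)) / len(observed)


def two_category_error(p_blue, true_colour):
    """Negative log of the probability assigned to the true category.

    p_blue is the predicted probability of blue; brown gets 1 - p_blue.
    The negative sign is an assumption (see the lead-in).
    """
    p = p_blue if true_colour == "blue" else 1.0 - p_blue
    return -math.log(p)


def kl_divergence(true_dist, predicted_dist):
    """KL(true || predicted); categories with zero true probability add 0."""
    return sum(
        t * math.log(t / q)
        for t, q in zip(true_dist, predicted_dist)
        if t > 0
    )


if __name__ == "__main__":
    # Made-up probabilities, ordered (blue, intermediate, brown).
    print(mean_squared_error([1.0, 0.0], [0.5, 0.5]))
    print(two_category_error(0.9, "blue"))
    print(kl_divergence([1.0, 0.0, 0.0], [0.5, 0.3, 0.2]))
```

When the observed distribution puts all its mass on one category, the divergence reduces to the negative log of the probability predicted for that category, so the two- and three-category errors are the same kind of quantity.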
Sensitivity and specificity of eye colour prediction models in the two-category reporting system modelled with three SNP sets. No probability threshold was applied (pmax).
| Two-Category System | Sensitivity 1 | Specificity 1 |
|---|---|---|
| rs12913832 | 0.92 | 0.84 |
| IrisPlex Norway | 0.92 | 0.84 |
| EC11 | 0.96 | 0.82 |
1 Reference is blue eye colour.
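With blue as the reference category, sensitivity is the fraction of truly blue-eyed individuals predicted as blue, and specificity is the fraction of non-blue individuals predicted as non-blue. A minimal sketch of that calculation, using a hypothetical function name and made-up labels rather than the study data:

```python
def sensitivity_specificity(true_labels, predicted_labels, reference="blue"):
    """Return (sensitivity, specificity) with `reference` as the positive class."""
    tp = fn = tn = fp = 0
    for truth, pred in zip(true_labels, predicted_labels):
        if truth == reference:
            if pred == reference:
                tp += 1  # reference individual predicted as reference
            else:
                fn += 1  # reference individual missed
        else:
            if pred == reference:
                fp += 1  # non-reference individual predicted as reference
            else:
                tn += 1  # non-reference individual predicted as non-reference
    return tp / (tp + fn), tn / (tn + fp)


if __name__ == "__main__":
    truth = ["blue", "blue", "blue", "blue", "brown", "brown"]
    calls = ["blue", "blue", "blue", "brown", "brown", "blue"]
    print(sensitivity_specificity(truth, calls))
```

The same function can be applied to any category of a multi-category system by passing a different `reference` and treating all other categories as negatives.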
Sensitivity and specificity of eye colour prediction models in the three-category reporting system modelled with three SNP sets and the IrisPlex Web model. No probability threshold was applied (pmax).
| Three-Category System | Eye Colour | Sensitivity 1 | Specificity 1 |
|---|---|---|---|
| rs12913832 | Blue | 0.95 | 0.61 |
| rs12913832 | Intermediate | 0.00 | 1.00 |
| rs12913832 | Brown | 0.95 | 0.87 |
| IrisPlex Norway | Blue | 0.94 | 0.61 |
| IrisPlex Norway | Intermediate | 0.10 | 0.97 |
| IrisPlex Norway | Brown | 0.86 | 0.90 |
| EC11 | Blue | 0.95 | 0.59 |
| EC11 | Intermediate | 0.15 | 0.95 |
| EC11 | Brown | 0.88 | 0.96 |
| IrisPlex Web | Blue | 0.95 | 0.60 |
| IrisPlex Web | Intermediate | 0.00 | 1.00 |
| IrisPlex Web | Brown | 0.95 | 0.88 |
1 Reference is blue eye colour.
Figure 1. Performance of eye colour prediction models in the two-category reporting system modelled with three SNP sets: rs12913832, IrisPlex Norway, and EC11. Bars represent the percentage of correct, incorrect, and inconclusive predictions with no probability threshold (pmax) and a probability threshold of 0.7 (p > 0.7).
Figure 2. Performance of eye colour prediction models in the three-category reporting system modelled with three SNP sets: rs12913832, IrisPlex Norway, and EC11, as well as performance of the IrisPlex Web prediction model. Bars represent the percentage of correct, incorrect, and inconclusive predictions with no probability threshold (pmax), a probability threshold of 0.5 (p > 0.5), and a probability threshold of 0.7 (p > 0.7).
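The reporting strategies in the figure captions can be illustrated with a short sketch. It assumes that with no threshold (pmax) the most probable category is reported, and that with a threshold a prediction is reported only when its highest probability exceeds the threshold and is otherwise counted as inconclusive. The function names and example probabilities are hypothetical.

```python
def call_with_threshold(probabilities, threshold=None):
    """Return the most probable category, or 'inconclusive' if its
    probability does not exceed the threshold (None means pmax)."""
    colour, p_max = max(probabilities.items(), key=lambda item: item[1])
    if threshold is not None and p_max <= threshold:
        return "inconclusive"
    return colour


def outcome(probabilities, true_colour, threshold=None):
    """Classify a prediction as correct, incorrect, or inconclusive."""
    call = call_with_threshold(probabilities, threshold)
    if call == "inconclusive":
        return "inconclusive"
    return "correct" if call == true_colour else "incorrect"


if __name__ == "__main__":
    probs = {"blue": 0.6, "intermediate": 0.3, "brown": 0.1}
    for threshold in (None, 0.5, 0.7):
        print(threshold, outcome(probs, "blue", threshold))
```

Raising the threshold moves predictions from the correct and incorrect groups into the inconclusive group, which is the trade-off the bars in the two figures display.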