Katharina Hohlbaum, Giuliano Mario Corte, Melanie Humpenöder, Roswitha Merle, Christa Thöne-Reineke.
Abstract
To maintain and foster the welfare of laboratory mice, tools that reliably measure the current state of the animals are applied in clinical assessment. One of these is the Mouse Grimace Scale (MGS), a coding system for facial expression analysis. Since there are concerns about the objectivity of the MGS, we further investigated its reliability. Four observers (two experienced and two inexperienced in the use of the MGS) scored 188 images of 33 female and 31 male C57BL/6JRj mice. Images were generated before ketamine/xylazine anesthesia as well as 150 min and two days after it. The intraclass correlation coefficient (ICC = 0.851) indicated good agreement on total MGS scores between all observers when all three time points were included in the analysis. However, interrater reliability was higher in the early post-anesthetic period (ICC = 0.799) than at baseline (ICC = 0.556) and on day 2 after anesthesia (ICC = 0.329). The best agreement was achieved for orbital tightening and the poorest for nose and cheek bulge, depending on the observers' experience levels. In general, experienced observers produced more consistent scores than inexperienced observers. Against this background, we critically discuss factors that potentially influence the reliability of MGS scoring.
Keywords: Cohen’s kappa; Fleiss’ kappa; Mouse Grimace Scale; animal welfare indicators; intraclass correlation coefficient; laboratory mice
Year: 2020 PMID: 32937881 PMCID: PMC7552260 DOI: 10.3390/ani10091648
Source DB: PubMed Journal: Animals (Basel) ISSN: 2076-2615 Impact factor: 2.752
Figure 1. MGS scores (mean, 95% confidence interval). (a) Mean MGS scores, (b) Mean MGS difference scores. Please note that data were derived from animals that received either no anesthesia, a single anesthesia, or six anesthesias. Since the present study focuses on MGS reliability, treatment groups are not differentiated. The effects of the anesthesia regimes on the well-being of mice in the different treatment groups were investigated in a previous study by Hohlbaum et al. [17]. Abbreviation: Mouse Grimace Scale (MGS).
Interrater reliability of overall MGS scores.
| | Total MGS Scores ¹ | Mean MGS Scores ¹ | Mean MGS Difference Scores ² |
|---|---|---|---|
| ICC over all time points ³ | 0.851 | 0.845 | 0.855 |
| ICC at baseline | 0.556 | 0.522 | - |
| ICC at 150 min after anesthesia | 0.799 | 0.802 | 0.813 |
| ICC on day 2 after anesthesia | 0.329 | 0.317 | 0.433 |
| | | | |
| ICC over all time points ³ | 0.749 | 0.757 | 0.736 |
| | | | |
| ICC over all time points ³ | 0.824 | 0.821 | 0.811 |
| | | | |
| ICC over all time points ³, excluding scorer #1 (inexperienced) | 0.812 | 0.797 | 0.791 |
| ICC over all time points ³, excluding scorer #2 (inexperienced) | 0.826 | 0.822 | 0.855 |
| ICC over all time points ³, excluding scorer #3 (experienced) | 0.814 | 0.808 | 0.816 |
| ICC over all time points ³, excluding scorer #4 (experienced) | 0.791 | 0.786 | 0.799 |
¹ Scores obtained at baseline, 150 min, and two days after anesthesia. ² Scores obtained at 150 min and two days after anesthesia. Baseline scores were subtracted from the scores generated after anesthesia. ³ Three scores of each mouse were included in the analysis (i.e., one score each was obtained from images generated at baseline, 150 min, and two days after anesthesia). Abbreviations: Intraclass correlation (ICC), Mouse Grimace Scale (MGS).
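The ICC values above summarize how closely the observers' scores agree for the same images. The source does not state which ICC form was used, so the following is only a minimal sketch under an assumption: a two-way random-effects, absolute-agreement, single-rater ICC, often written ICC(2,1). The function name `icc_2_1` and the toy scores are hypothetical and are not taken from the study.

```python
def icc_2_1(ratings):
    """Two-way random-effects, absolute-agreement, single-rater ICC.

    ratings: list of rows, one row per image, one column per observer.
    Assumes a complete rectangular table (no missing scores) with at
    least two images and two observers, and some variation in the scores.
    """
    n = len(ratings)            # number of images
    k = len(ratings[0])         # number of observers
    grand = sum(sum(row) for row in ratings) / (n * k)

    row_means = [sum(row) / k for row in ratings]
    col_means = [sum(row[j] for row in ratings) / n for j in range(k)]

    # Sums of squares of a two-way ANOVA without replication
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for row in ratings for x in row)
    ss_err = ss_total - ss_rows - ss_cols

    msr = ss_rows / (n - 1)                 # between-image mean square
    msc = ss_cols / (k - 1)                 # between-observer mean square
    mse = ss_err / ((n - 1) * (k - 1))      # residual mean square

    return (msr - mse) / (msr + (k - 1) * mse + k * (msc - mse) / n)


# Hypothetical mean MGS scores: 5 images (rows) x 4 observers (columns)
toy_scores = [
    [0.2, 0.4, 0.2, 0.0],
    [1.4, 1.2, 1.6, 1.4],
    [0.6, 0.8, 0.4, 0.6],
    [1.0, 0.6, 1.0, 1.2],
    [0.0, 0.2, 0.0, 0.2],
]
print(round(icc_2_1(toy_scores), 3))
```

Because the ICC compares between-image variance with disagreement between observers, a narrow spread of true scores (as might be expected at baseline or on day 2) can lower the coefficient even when absolute disagreements are small.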
Figure 2. Reliability analysis of the mean MGS scores. The mean MGS scores of each of the four scorers were compared to the mean MGS scores of all scorers. Three scores for each mouse were included in the analysis (i.e., one score each was obtained from images generated at baseline, 150 min, and two days after anesthesia). Abbreviation: Mouse Grimace Scale (MGS).
Interrater reliability of each facial action unit.
| | Orbital Tightening | Nose Bulge | Cheek Bulge | Ear Position | Whisker Change |
|---|---|---|---|---|---|
| Fleiss’ kappa over all time points ¹ | 0.664 | 0.093 | 0.125 | 0.285 | 0.279 |
| Intraclass correlation over all time points ¹,² | 0.876 | 0.156 | 0.146 | 0.542 | 0.519 |
| Fleiss’ kappa at baseline | 0.245 | 0.104 | 0.032 | 0.164 | 0.066 |
| Intraclass correlation at baseline ² | 0.416 | 0.117 | 0.041 | 0.173 | 0.174 |
| Fleiss’ kappa at 150 min after anesthesia | 0.531 | 0.052 | 0.083 | 0.197 | 0.220 |
| Intraclass correlation at 150 min after anesthesia ² | 0.900 | 0.243 | 0.115 | 0.674 | 0.721 |
| Fleiss’ kappa on day 2 after anesthesia | 0.357 | −0.089 | −0.031 | 0.074 | 0.096 |
| Intraclass correlation on day 2 after anesthesia ² | 0.695 | 0.098 | 0.102 | 0.282 | 0.450 |
| | | | | | |
| Cohen’s kappa over all time points ¹ | 0.708 | 0.197 | 0.013 | 0.280 | 0.244 |
| | | | | | |
| Cohen’s kappa over all time points ¹ | 0.629 | 0.355 | 0.372 | 0.484 | 0.428 |
The score “not assessable” was included in the analysis. ¹ Three scores of each mouse were included in the analysis (i.e., one score each was obtained from images generated at baseline, 150 min, and two days after anesthesia). ² To be able to compare our data with other studies, we calculated the ICC in addition to Fleiss’ kappa. Abbreviation: Intraclass correlation (ICC).
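Fleiss' kappa (more than two observers) and Cohen's kappa (exactly two observers) treat each facial action unit score as a category and correct the observed agreement for agreement expected by chance. The sketch below shows the standard unweighted formulas in plain Python. The function names, the `"NA"` label standing in for "not assessable", and the toy data are assumptions for illustration; the source does not say which software or kappa variant (for example, weighted or unweighted) was used.

```python
from collections import Counter


def fleiss_kappa(items):
    """Unweighted Fleiss' kappa.

    items: one list per image holding the category each observer assigned.
    Every image must be scored by the same number of observers (>= 2).
    Undefined (division by zero) if all scores fall into a single category.
    """
    n_items = len(items)
    n_raters = len(items[0])
    counts = [Counter(labels) for labels in items]

    # Observed agreement: share of agreeing observer pairs per image, averaged
    p_items = [
        (sum(c * c for c in cnt.values()) - n_raters) / (n_raters * (n_raters - 1))
        for cnt in counts
    ]
    p_bar = sum(p_items) / n_items

    # Chance agreement from the overall category proportions
    totals = Counter()
    for cnt in counts:
        totals.update(cnt)
    p_e = sum((t / (n_items * n_raters)) ** 2 for t in totals.values())

    return (p_bar - p_e) / (1 - p_e)


def cohen_kappa(rater_a, rater_b):
    """Unweighted Cohen's kappa for two observers scoring the same images."""
    n = len(rater_a)
    p_o = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    freq_a, freq_b = Counter(rater_a), Counter(rater_b)
    p_e = sum(freq_a[c] * freq_b[c] for c in freq_a) / (n * n)
    return (p_o - p_e) / (1 - p_e)


# Hypothetical orbital tightening scores: 5 images x 4 observers
toy_items = [
    [0, 0, 0, 1],
    [2, 2, 2, 2],
    [1, 1, 0, 1],
    [0, 0, "NA", 0],
    [1, 2, 1, 1],
]
print(round(fleiss_kappa(toy_items), 3))

# Cohen's kappa for the first two (hypothetical) observers
first = [row[0] for row in toy_items]
second = [row[1] for row in toy_items]
print(round(cohen_kappa(first, second), 3))
```

Because both coefficients ignore the ordering of the categories, a disagreement between scores 0 and 1 counts the same as one between 0 and 2. This is one reason the kappa and ICC values in the table can differ for the same facial action unit.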