Sven Sebastian Uhlmann, Noëlle Yochum, Bart Ampe.
Abstract
Measures of reflex impairment and injury that quantify an aquatic organism's vitality have gained popularity as survival predictors of discarded non-target fisheries catch. To evaluate the robustness of this method with respect to 'rater' subjectivity, we tested inter- and intra-rater repeatability and the role of 'expectation bias'. From video clips, multiple raters determined impairment levels of four reflexes of beam-trawled common sole (Solea solea) intended for discard. Raters had a range of technical experience and included veterinary students, practicing veterinarians, and fisheries scientists. Expectation bias was evaluated by first assessing a rater's assumption about the effect of air exposure on vitality, then comparing their reflex ratings of the same fish, once when the true air exposure duration was indicated and once when the time was exaggerated (by either 15 or 30 min). Inter-rater repeatability was assessed by having multiple raters evaluate those clips with true air exposure information; intra- and inter-rater repeatability was determined by having individual raters evaluate a series of duplicated clips, all with true air exposure. Results indicate that inter- and intra-rater repeatability were high (intra-class correlation coefficients of 74% for both), and were not significantly affected by either background type or expectation bias related to the assumed impact of prolonged air exposure. This suggests that reflex impairment as a metric for predicting fish survival is robust to involving multiple raters with diverse backgrounds. Bias is potentially more likely to be introduced through subjective reflexes than through raters, given that consistency in scoring differed for some reflexes based on rater experience type. This study highlights the need to provide ample training for raters, and shows that no prior experience is needed to become a reliable rater.
Moreover, before implementing reflexes in a vitality study, it is important to evaluate whether the determination of presence/absence is subjective.
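To make the repeatability metric concrete, the following is a minimal sketch of a one-way random-effects intra-class correlation coefficient, ICC(1,1), computed from a clips-by-raters score matrix. It is an illustration only: the record does not state which ICC formulation or software the authors used, and the function name `icc_oneway` and the example scores are hypothetical.

```python
from statistics import mean


def icc_oneway(scores):
    """One-way random-effects ICC(1,1) (illustrative, not the authors' method).

    `scores` is a list of rows, one row per rated clip; each row holds the
    scores given to that clip by the same number k of raters.
    """
    n = len(scores)
    k = len(scores[0]) if n else 0
    if n < 2 or k < 2 or any(len(row) != k for row in scores):
        raise ValueError("need >= 2 clips, each scored by the same k >= 2 raters")

    grand_mean = mean(x for row in scores for x in row)
    row_means = [mean(row) for row in scores]

    # Between-clip and within-clip mean squares from a one-way ANOVA.
    ms_between = k * sum((m - grand_mean) ** 2 for m in row_means) / (n - 1)
    ms_within = sum(
        (x - m) ** 2 for row, m in zip(scores, row_means) for x in row
    ) / (n * (k - 1))

    denominator = ms_between + (k - 1) * ms_within
    if denominator == 0:
        raise ValueError("ICC is undefined when all scores are identical")
    return (ms_between - ms_within) / denominator


# Hypothetical scores: three clips, each scored by two raters.
example_icc = icc_oneway([[1, 2], [3, 4], [5, 6]])
```

Values near 1 indicate that most of the score variance lies between clips rather than between raters scoring the same clip.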
Year: 2020 PMID: 32101577 PMCID: PMC7043772 DOI: 10.1371/journal.pone.0229456
Source DB: PubMed Journal: PLoS One ISSN: 1932-6203 Impact factor: 3.240
Fig 1. Lecture theatre view to illustrate video projection.
Third-year veterinary medicine students from the University of Ghent independently scored a tail grab reflex response of a common sole (Solea solea) from a video clip projected onto a lecture theatre screen during a workshop session in April 2016. Note: The picture was blurred in parts so that no person in the audience can be identified, in compliance with the PLOS open-access (CC-BY) license.
List of scoring criteria for categorical reflex responses (i.e., absent, weak, moderate, and strong) of common sole (Solea solea) in the order tested within 5 s of observation after stimulus (based on [4, 9]).
| Reflex | Stimulus | Absent | Weak | Moderate | Strong |
|---|---|---|---|---|---|
| Body flex | The fish is held outside the water on the palms of two hands (touching each other) with its belly | No active movement, the body rests limp on the hand. | Tail is moving slightly, but not beyond the plane of the hand. | Tail is flexing beyond the plane of the hand. Body may move–spastic | The fish is actively trying to move head and tail towards each other; or |
| Righting | The fish is held underwater at the surface on the palms of two hands (touching each other) with its belly | Fish drifts and sinks passively to the bottom of the container. | Fish appears stunned, but rights itself very slowly. | Fish appears stunned, but starts to turn after a delay. The rotation can be swift. | Fish actively and quickly turns underwater. |
| Head | The fish’s head is held between thumb and index finger, with either belly or dorsal side facing up. | No movement. The body dangles motionless. | The fish may move its tail slightly. | The fish may exhibit a cramp-like flexion, but no | Fish immediately and repeatedly curls around fingers. |
| Tail grab | The fish’s tail is held between thumb and index finger. | Fish does not struggle free; it | Fish does not struggle free; no swimming | Fish does not struggle free, but moves its body as if it attempts to swim away. | The fish actively struggles free and swims away. |
Intensity of a response increases from absent to strong. The speed of a response for weak and moderate categories may be delayed; for strong it should be immediate.
Fig 2. Example of a pictogram handout (A) and scoresheet (B) to train and score reflexes from video clips. The scoresheet details how to score the body flex reflex (also termed ‘bellybend’) response of common sole (Solea solea) on a continuous tagged visual analogue scale (tVAS) from a short video clip.
Schematic representation of the treatment (inter-rater vs intra-rater repeatability, with or without expectation bias; see shading) assigned to 36 video clips (each <30 s in length) of a given fish’s reflex response per scoring session.
| Fish | Body flex | Righting | Head | Tail grab |
|---|---|---|---|---|
| F | F | T | F | |
| F | T | F | T | |
| T | T | T | F | |
| T, F | T, F | F, F | T, T | |
| F, F | T, F | F, F | T, T | |
| T, T | F, F | T, F | T, T | |
| F | F | T | F | |
| F | T | T | F | |
| T | T | T | F | |
| T, F | T, F | T, F | T, F | |
| T, F | T, F | T, F | T, F | |
| T, F | T, F | T, F | T, F | |
| T, F | T, F | T, F | T, F | |
| T, F | T, F | T, F | ||
| T, T | T, T | T, T | ||
| T | T | T | T | |
| T, T | ||||
| T, T | T, T | T, T | T, T | |
| T, F | ||||
Shading categories: inter-rater repeatability; inter- and intra-rater repeatability; inter- and intra-rater repeatability and expectation bias.
The notations ‘T, F’ and ‘T, T’ indicate whether the air exposure information on each clip of a duplicated video pair was true (T) or falsified (F); for clips that were not duplicated, air exposure is marked with a single ‘F’ or ‘T’. When falsified, 15 min were added to the true value for sessions 1–4, and 30 min were added for sessions 5–7.
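The falsification rule in this note can be written as a small function. This is a minimal sketch under the rule as stated (15 min added in sessions 1–4, 30 min in sessions 5–7); the function name `displayed_air_exposure` and its parameters are hypothetical and not taken from the study.

```python
def displayed_air_exposure(true_minutes, session, falsified):
    """Return the air exposure (min) imprinted on a clip.

    Hypothetical helper: true clips show the true value; falsified clips
    show the true value plus 15 min (sessions 1-4) or 30 min (sessions 5-7).
    """
    if session not in range(1, 8):
        raise ValueError("session must be an integer from 1 to 7")
    if not falsified:
        return true_minutes
    offset = 15 if session <= 4 else 30
    return true_minutes + offset


# A 'T, F' doublet in session 6 for a fish with 10 min of true air exposure.
doublet = (
    displayed_air_exposure(10, session=6, falsified=False),
    displayed_air_exposure(10, session=6, falsified=True),
)
```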
Fig 3. Frequency distribution of average ‘silver standard’ reflex scores.
Scores from three expert raters, who developed the reflex scoring methodology and were experienced in applying it, were considered the ‘silver standard’. These three raters scored all original clips (N = 12) that went into the making of the scoring video for sessions 2–4.
Number, gender, experience, and expectation of workshop participants per scoring session (1–7), stratified by previous experience in scoring reflex responsiveness of live animals (‘none’: no animals scored; ‘some’: <100 animals scored; and ‘experienced’: ≥100 animals scored).
| Session | No. participants | Male | Female | NA | Experience | Expectation | | |
|---|---|---|---|---|---|---|---|---|
| 1 | 157 | 35 | 120 | | None | 54 | 62 | 39 |
| | | 2 | 0 | | Experienced | 2 | 0 | 0 |
| 2 | 18 | 8 | 7 | | None | 3 | 12 | 0 |
| | | 3 | 0 | | Experienced | 1 | 2 | 0 |
| 3 | 13 | 2 | 1 | | None | 2 | 1 | 0 |
| | | 3 | 1 | | Some | 3 | 1 | 0 |
| | | 3 | 3 | | Experienced | 1 | 5 | 0 |
| 4 | 19 | 4 | 0 | | None | 4 | 0 | 0 |
| | | 5 | 3 | | Some | 7 | 1 | 0 |
| | | 4 | 3 | | Experienced | 5 | 2 | 0 |
| 5 | 182 | 39 | 140 | 2 | None | 78 | 41 | 62 |
| | | 1 | 0 | | Experienced | 1 | 0 | 0 |
| 6 | 33 | 20 | 12 | | None | 13 | 16 | 3 |
| | | 1 | 0 | | Experienced | 1 | 0 | 0 |
| 7 | 14 | 4 | 6 | 1 | None | 7 | 3 | 1 |
| | | 2 | 0 | | Some | 1 | 1 | 0 |
| | | 1 | 0 | | Experienced | 1 | 0 | 0 |
| Total | 436 | 134 | 296 | 3 | | 184 | 147 | 105 |
‘Expectation’ was classified based on the rater’s response to a question on the scoresheet asking whether s/he expected air exposure to impact reflex impairment, either positively or negatively (i.e., the rater believes that prolonged air exposure would exacerbate or reduce reflex impairment, respectively). NA, not all participants revealed their gender or gave a score for the expectation question.
Fig 4. Frequency distribution of hypothetical tVAS scores provided by unique raters from scoring sessions 2–7.
The number (N) of unique raters is indicated above each bar. Each score was marked by a rater on their scoresheet in response to a question asking whether a reflex response would weaken or strengthen when the animal was knowingly exposed to air for a prolonged period (15–30 min).
Tukey comparisons of the least-squares mean (lsmean) ± SE reflex score for each reflex type, as scored by raters with a given level of experience and a positive (1) or negative (0) expectation.
| Experience | Reflex | Expectation | Air exposure | lsmean | SE | Lower CL | Upper CL | Group |
|---|---|---|---|---|---|---|---|---|
| None | Body flex | 1 | TRUE | 50.8 | 15.4 | 20.6 | 81 | a |
| | | 1 | FALSE | 58.7 | 15.4 | 28.5 | 88.9 | b |
| | | 0 | TRUE | 55.2 | 15.4 | 25.0 | 85.5 | ab |
| | | 0 | FALSE | 60.4 | 15.4 | 30.2 | 90.7 | b |
| | Head | 1 | TRUE | 39.8 | 13.4 | 13.6 | 66 | a |
| | | 1 | FALSE | 37.0 | 13.4 | 10.8 | 63.2 | a |
| | | 0 | TRUE | 40.8 | 13.4 | 14.6 | 67.1 | a |
| | | 0 | FALSE | 38.2 | 13.4 | 12.0 | 64.4 | a |
| | Righting | 1 | TRUE | 46.3 | 13.4 | 20.1 | 72.5 | a |
| | | 1 | FALSE | 49.4 | 13.4 | 23.2 | 75.7 | a |
| | | 0 | TRUE | 48.3 | 13.4 | 22.1 | 74.5 | a |
| | | 0 | FALSE | 51.3 | 13.4 | 25.0 | 77.5 | a |
| | Tail grab | 1 | TRUE | 46.0 | 13.4 | 19.8 | 72.2 | a |
| | | 1 | FALSE | 45.1 | 13.4 | 19.0 | 71.3 | a |
| | | 0 | TRUE | 47.6 | 13.4 | 21.4 | 73.8 | a |
| | | 0 | FALSE | 49.0 | 13.4 | 22.8 | 75.2 | a |
| Some | Body flex | 1 | TRUE | 60.8 | 16.2 | 29.0 | 92.5 | ab |
| | | 1 | FALSE | 69.2 | 16.2 | 37.4 | 100.9 | b |
| | | 0 | TRUE | 42.9 | 17.7 | 8.2 | 77.6 | a |
| | | 0 | FALSE | 53.9 | 17.7 | 19.2 | 88.6 | ab |
| | Head | 1 | TRUE | 42.3 | 13.7 | 15.4 | 69.1 | a |
| | | 1 | FALSE | 34.1 | 13.7 | 7.3 | 61 | a |
| | | 0 | TRUE | 39.4 | 14.8 | 10.3 | 68.4 | a |
| | | 0 | FALSE | 31.9 | 14.8 | 2.8 | 60.9 | a |
| | Righting | 1 | TRUE | 62.5 | 13.7 | 35.7 | 89.4 | a |
| | | 1 | FALSE | 51.8 | 13.9 | 24.6 | 79.1 | a |
| | | 0 | TRUE | 61.9 | 14.8 | 32.9 | 91 | a |
| | | 0 | FALSE | 50.2 | 15.2 | 20.3 | 80 | a |
| | Tail grab | 1 | TRUE | 53.0 | 13.7 | 26.2 | 79.9 | a |
| | | 1 | FALSE | 54.5 | 13.7 | 27.6 | 81.3 | a |
| | | 0 | TRUE | 51.0 | 14.8 | 21.9 | 80 | a |
| | | 0 | FALSE | 52.1 | 14.8 | 23.1 | 81.2 | a |
| Experienced | Body flex | 1 | TRUE | 66.1 | 16.2 | 34.4 | 97.9 | a |
| | | 1 | FALSE | 62.6 | 16.2 | 30.8 | 94.4 | a |
| | | 0 | TRUE | 74.0 | 16.5 | 41.7 | 106.3 | a |
| | | 0 | FALSE | 75.8 | 16.5 | 43.5 | 108.2 | a |
| | Head | 1 | TRUE | 41.1 | 13.8 | 13.9 | 68.2 | a |
| | | 1 | FALSE | 39.4 | 13.8 | 12.3 | 66.5 | a |
| | | 0 | TRUE | 40.7 | 13.8 | 13.7 | 67.7 | a |
| | | 0 | FALSE | 38.7 | 13.8 | 11.7 | 65.7 | a |
| | Righting | 1 | TRUE | 53.1 | 13.9 | 26.0 | 80.3 | a |
| | | 1 | FALSE | 51.2 | 14.0 | 23.8 | 78.7 | a |
| | | 0 | TRUE | 57.0 | 13.8 | 30.0 | 84 | a |
| | | 0 | FALSE | 50.9 | 14.0 | 23.5 | 78.3 | a |
| | Tail grab | 1 | TRUE | 57.5 | 13.8 | 30.3 | 84.6 | a |
| | | 1 | FALSE | 51.6 | 13.8 | 24.5 | 78.8 | a |
| | | 0 | TRUE | 56.4 | 13.8 | 29.4 | 83.4 | a |
| | | 0 | FALSE | 48.5 | 13.8 | 21.5 | 75.5 | a |
Clips were duplicated within a scoring video, and each screened clip was imprinted with either false (15 or 30 min added to the true value) or true air exposure information. A rater’s expectation (scored on a scale of 0 to 100) of the effect of prolonged, onboard air exposure on a fish’s reflex responsiveness was categorized as either a weaker reflex response (<30; positive expectation; 1) or no effect (≥30; no or negative/wrong expectation; 0). Our hypothesis was that clips imprinted with false air exposure information would receive a lower score than their duplicate shown with the true value, as raters would assume that the fish had been weakened by the additional air exposure (positive expectation). Groups with the same letter were not significantly different at p = 0.05.
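The expectation coding described in this note can be sketched as a simple threshold rule. The cut-off of 30 on the 0 to 100 scale comes from the note itself; the function name `classify_expectation` and the example scores are hypothetical.

```python
def classify_expectation(tvas_score):
    """Code a rater's 0-100 expectation score as 1 or 0.

    1 = positive expectation (score < 30: prolonged air exposure weakens
        the reflex response).
    0 = no or negative/wrong expectation (score >= 30).
    """
    if not 0 <= tvas_score <= 100:
        raise ValueError("expectation score must lie between 0 and 100")
    return 1 if tvas_score < 30 else 0


# Hypothetical expectation scores from four raters.
coded = [classify_expectation(s) for s in (5, 29.9, 30, 80)]
```

Note that the boundary value 30 falls in the ‘0’ category, because the positive category is defined by a strict inequality.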
Fig 5. Plot of mean score per video clip doublet (with either ‘true’ or ‘false’ air exposure information) of a given reflex across all workshop participants (a), and then stratified by experience in scoring reflexes of fish: none (b); some (c); and experienced (d). Doublet ID includes a scoring video clip ID (2 or 3) and a running ID number for each doublet, with each clip of a doublet abbreviated by ‘a’ or ‘b’. Treatments include: ‘Intra-rater reliability with expectation bias’ (IOR-exp.), which refers to duplicated clips of the same fish and reflex with either true or false air exposure information; or ‘Intra-rater reliability’ (IOR), which refers to duplicated clips of the same fish and reflex and always true air exposure information. Where available, dots indicate the ‘silver standard’ scores which were averaged across three experienced, expert raters who scored 12 unmodified, original clips.