| Literature DB >> 27415814 |
Shanis Barnard, Simone Calderara, Simone Pistocchi, Rita Cucchiara, Michele Podaliri-Vulpiani, Stefano Messori, Nicola Ferri.
Abstract
Mankind directly controls the environment and lifestyles of several domestic species for purposes ranging from production and research to conservation and companionship. These environments and lifestyles may not offer these animals the best quality of life. Behaviour is a direct reflection of how the animal is coping with its environment. Behavioural indicators are thus among the preferred parameters to assess welfare. However, behavioural recording (usually from video) can be very time consuming, and the accuracy and reliability of the output depend on the experience and background of the observers. The advent of new video technology and computer image processing provides the basis for promising solutions. In this pilot study, we present a new prototype software able to automatically infer the behaviour of dogs housed in kennels from 3D visual data and through structured machine learning frameworks. Depth information acquired through 3D features, body part detection and training are the key elements that allow the machine to recognise postures, trajectories inside the kennel and patterns of movement that can later be labelled at convenience. The main innovation of the software is its ability to automatically cluster frequently observed temporal patterns of movement without any pre-set ethogram. Conversely, once common patterns are defined through training, a deviation from normal behaviour over time or between individuals can be assessed. The software's accuracy in correctly detecting the dogs' behaviour was checked through a validation process. An automatic behaviour recognition system, independent of human subjectivity, could add scientific knowledge on animals' quality of life in confinement as well as saving time and resources. This 3D framework was designed to be invariant to the dog's shape and size and could be extended to farm, laboratory and zoo quadrupeds in artificial housing. The computer vision technique applied to this software is innovative in non-human animal behaviour science. Further improvements and validation are needed, and future applications and limitations are discussed.
Year: 2016 PMID: 27415814 PMCID: PMC4944961 DOI: 10.1371/journal.pone.0158748
Source DB: PubMed Journal: PLoS One ISSN: 1932-6203 Impact factor: 3.240
Fig 1. Example of a trajectory (top view) performed by a dog inside its pen.
Faded lines were traced earlier; intense lines are more recent. Red dots represent the barycentre of the animal at a given instant. This bird's-eye view is achieved by re-projecting the acquired points through a homographic projection onto the pen's ground plane. The projection matrix is computed automatically from the videos, without manual calibration of the Kinect 3D sensor.
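As a minimal sketch of this re-projection step (not the paper's automatic calibration, which estimates the matrix from the videos themselves), the homography can be built with OpenCV from four hypothetical image-to-floor correspondences and then applied to the tracked barycentre points:

```python
import cv2
import numpy as np

# Hypothetical correspondences: four corners of the pen floor as seen by the
# sensor (pixels) and their positions in a metric ground-plane frame (metres).
image_pts = np.float32([[120, 410], [520, 400], [600, 470], [60, 480]])
floor_pts = np.float32([[0.0, 0.0], [3.0, 0.0], [3.0, 2.0], [0.0, 2.0]])

# Homography mapping sensor pixels onto the pen's ground plane.
H, _ = cv2.findHomography(image_pts, floor_pts)

# Re-project a tracked barycentre trajectory (OpenCV expects N x 1 x 2).
trajectory_px = np.float32([[[300, 430]], [[320, 435]], [[350, 445]]])
trajectory_floor = cv2.perspectiveTransform(trajectory_px, H)
print(trajectory_floor.reshape(-1, 2))  # top-view coordinates in metres
```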
Fig 2. A grid divides the floor of the pen.
The time (in seconds) spent by the dog on each square is calculated. Shades of grey also quantify the amount of time spent in each square (i.e. black = never entered the square; lighter shades = more time spent in that square).
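The occupancy computation reduces to binning the top-view trajectory into grid cells and accumulating the per-frame dwell time. A sketch, where the pen size, grid resolution and frame rate are hypothetical values rather than the paper's setup:

```python
import numpy as np

def occupancy_seconds(traj_xy, pen_size=(3.0, 2.0), grid=(6, 4), fps=30):
    """Seconds spent in each floor cell, given a top-view trajectory.

    traj_xy: (N, 2) ground-plane coordinates in metres, one row per frame.
    pen_size, grid and fps are assumptions, not the paper's parameters.
    """
    counts = np.zeros(grid)
    cols = np.clip((traj_xy[:, 0] / pen_size[0] * grid[0]).astype(int), 0, grid[0] - 1)
    rows = np.clip((traj_xy[:, 1] / pen_size[1] * grid[1]).astype(int), 0, grid[1] - 1)
    np.add.at(counts, (cols, rows), 1.0 / fps)  # each frame adds 1/fps seconds
    return counts
```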
Fig 3. (a) Visual representation of the alignment of two sequences using Dynamic Time Warping (DTW). DTW stretches the sequences in time by matching the same point with several points of the compared time series. (b) The Needleman-Wunsch (NW) algorithm substitutes the temporal stretch with gap elements (red circles in the table), inserting blank spaces instead of forcefully matching points. The alignment is achieved by arranging the two sequences in this table, the first sequence row-wise (T) and the second column-wise (S). The figure shows a score table for two hypothetical sub-sequences (i, j) and the alignment scores (numbers in cells) for each pair of elements forming the sequence (letters in the head row and head column). Arrows show the warping path between the two series and, consequently, the final alignment. The optimal alignment score is in the bottom-right cell of the table.
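For reference, the DTW behaviour illustrated in panel (a) can be sketched in a few lines; the absolute-difference cost for 1-D series is an assumption:

```python
import numpy as np

def dtw_distance(s, t):
    """Dynamic-time-warping cost between two 1-D sequences (panel a)."""
    n, m = len(s), len(t)
    D = np.full((n + 1, m + 1), np.inf)
    D[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = abs(s[i - 1] - t[j - 1])
            # One point may be matched with several points of the other
            # series: the temporal stretch that NW replaces with gaps.
            D[i, j] = cost + min(D[i - 1, j], D[i, j - 1], D[i - 1, j - 1])
    return D[n, m]
```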
Global alignment and score table algorithms.
| Eq | Description | Formula |
|---|---|---|
| 1 | Base condition for symbol-to-symbol matching in the score table: the computed alignment score of each pair of elements, where "–" represents a blank gap in the sequence (e.g. a temporal stretch). | $F(i,j) = \max\big(F(i-1,j-1) + \Omega(S_{a,i}, S_{b,j}),\ F(i-1,j) + \Omega(S_{a,i}, -),\ F(i,j-1) + \Omega(-, S_{b,j})\big)$ |
| 2 | Final alignment score, given by the maximum score among all possible alignments; $\Omega(S_{a,i}, S_{b,j})$ is the symbol-to-symbol score defined by Eq 3. | $\mathrm{score}(T_i, T_j) = F(n, m)$ |
| 3 | Symbol-to-symbol score, where $S_{a,i}$ is a symbol of sequence $T_i$ and $S_{b,j}$ a symbol of sequence $T_j$. | $\Omega(S_{a,i}, S_{b,j})$ |
| 4 | Symbol-to-symbol similarity for trajectories: the negative exponential of the Euclidean distance between pairs of coordinates in time. | $\Omega(S_{a,i}, S_{b,j}) = e^{-\lVert S_{a,i} - S_{b,j} \rVert_2}$ |
| 5 | Symbol-to-symbol similarity for actions: the graph similarity between the two skeleton structures, where $M$ is the adjacency matrix of the graph (see the Gramian construction below). | |
| 6 | Laplace operator, where $I$ is the identity matrix, $M$ the adjacency matrix and $D$ the diagonal degree matrix. Given two Laplace operators, the graph similarity is computed by measuring the angles between the eigenvectors of the graph Laplacians, discarding low-frequency components and retaining the main structure. | $\mathcal{L} = I - D^{-1/2} M D^{-1/2}$ |
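A compact sketch of the global alignment recurrence of Eqs 1-2; the constant gap score and the callable Ω are placeholders for the paper's symbol-to-symbol definitions (e.g. Eq 4 for trajectories):

```python
import numpy as np

def needleman_wunsch(a, b, omega, gap=-1.0):
    """Global alignment score of sequences a and b (Eqs 1-2).

    omega(x, y) is the symbol-to-symbol score; the constant `gap` stands in
    for the score of aligning a symbol against the blank element '-'.
    """
    n, m = len(a), len(b)
    F = np.zeros((n + 1, m + 1))
    F[:, 0] = gap * np.arange(n + 1)  # leading gaps in b
    F[0, :] = gap * np.arange(m + 1)  # leading gaps in a
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            F[i, j] = max(F[i - 1, j - 1] + omega(a[i - 1], b[j - 1]),
                          F[i - 1, j] + gap,   # gap inserted in b
                          F[i, j - 1] + gap)   # gap inserted in a
    return F[n, m]  # optimal score: bottom-right cell of the table

# Trajectory similarity (Eq 4): negative exponential of Euclidean distance.
traj_omega = lambda p, q: np.exp(-np.linalg.norm(np.subtract(p, q)))
```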
A Gramian determinant is used to calculate the volume of a parallelotope (i.e. the generalisation of a parallelepiped to higher dimensions). In this case, the parallelotope is created by the divergence between two graph structures. The Gramian matrix is defined by $G = M^{T}M$, where $M$ is a real matrix whose column vectors are elements of a Euclidean space.
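A short numerical check of this construction (the matrix values are hypothetical): the volume of the parallelotope spanned by the columns of M is the square root of the Gramian determinant.

```python
import numpy as np

# Hypothetical 3-D vectors as the columns of M; G = M^T M is the Gramian.
M = np.array([[1.0, 0.5],
              [0.0, 1.0],
              [2.0, 0.0]])
G = M.T @ M
volume = np.sqrt(np.linalg.det(G))  # volume of the spanned parallelotope
```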
In computer science, a matrix can be used to represent a finite graph: the elements of the matrix indicate whether pairs of vertices are adjacent in the graph or not.
A matrix is a rectangular array of numbers, symbols or expressions arranged in rows and columns. A diagonal matrix is one in which all entries outside the main diagonal are zero. The identity matrix is a diagonal matrix in which all the elements on the main diagonal are equal to 1 and all other elements are equal to zero.
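Putting these pieces together, a sketch of the graph-similarity idea behind Eq 6: build the normalised Laplacian of each adjacency matrix, then compare the angles between their high-frequency eigenvectors. The number of components kept, and the assumption that both graphs share the same node count (as skeleton graphs do), are ours:

```python
import numpy as np

def normalized_laplacian(M):
    """L = I - D^{-1/2} M D^{-1/2} for an adjacency matrix M (Eq 6)."""
    d = M.sum(axis=1)
    d_inv = np.diag(1.0 / np.sqrt(np.where(d > 0, d, 1.0)))
    return np.eye(len(M)) - d_inv @ M @ d_inv

def eigenvector_angles(M1, M2, keep=3):
    """|cos| of the angles between the Laplacian eigenvectors of two graphs,
    keeping only the `keep` highest-frequency components."""
    _, V1 = np.linalg.eigh(normalized_laplacian(M1))
    _, V2 = np.linalg.eigh(normalized_laplacian(M2))
    # eigh sorts eigenvalues ascending: the last columns are the
    # high-frequency eigenvectors; earlier (low-frequency) ones are dropped.
    return np.abs(V1[:, -keep:].T @ V2[:, -keep:])
```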
Fig 4. Comparison between different frames of the dog skeleton.
For each frame a descriptor score for the dog skeleton is computed (top). Then, all the descriptors (from the same or different dogs) are compared and matched. The NW algorithm creates the similarity scores and aligns the segments.
Fig 5. Clustering analysis.
Each dot represents a video segment. The distance between dots is computed by the alignment algorithm, either on coordinates (i.e. trajectories) or on body parts (i.e. actions). In this example, the system created three clusters (green, blue and red clouds), where the centroid (black cross) is the most representative sequence of the cluster. The closer the sequences lie together, the higher the similarity score produced by the alignment. Sequences distant from the centroid are less similar to the computed action prototype.
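One way to realise this step (the specific clustering algorithm below is an assumption, not the paper's stated method) is to cluster the pairwise alignment distances hierarchically and pick each cluster's medoid as its prototype:

```python
import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage
from scipy.spatial.distance import squareform

def cluster_segments(dist, n_clusters=3):
    """Group video segments from a pairwise alignment-distance matrix.

    dist: symmetric (N, N) matrix, e.g. derived from the NW scores.
    Returns cluster labels and, per cluster, the medoid (the sequence with
    the smallest total distance to its cluster: the 'prototype').
    """
    Z = linkage(squareform(dist, checks=False), method="average")
    labels = fcluster(Z, t=n_clusters, criterion="maxclust")
    medoids = {}
    for c in np.unique(labels):
        idx = np.flatnonzero(labels == c)
        medoids[c] = idx[np.argmin(dist[np.ix_(idx, idx)].sum(axis=1))]
    return labels, medoids
```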
Variation in the Percentage of Correctly estimated body Parts (PCP) with increasing training size and different light conditions.
| Training size | 67 | 128 | 228 | 385 |
|---|---|---|---|---|
| Trial 1 | 65% | 72% | 77% | 80% |
| Trial 2 | 55% | 63% | 69% | 74% |
| Trial 3 | 59% | 74% | 79% | 83% |
Trial 1: kennel environment with constant, natural light conditions; Trial 2: kennel environment with unstable, natural light conditions; Trial 3: laboratory environment with controlled, artificial lighting conditions.
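The PCP metric itself is simple to compute. A sketch under the common convention that a part is correct when both predicted endpoints fall within a fraction of the ground-truth part length of the true endpoints (α = 0.5 is the usual threshold; the paper may use a different one):

```python
import numpy as np

def pcp(pred, gt, alpha=0.5):
    """Percentage of Correctly estimated body Parts.

    pred, gt: (P, 2, 2) arrays of P parts, each a pair of 2-D endpoints.
    A part counts as correct when both predicted endpoints fall within
    alpha * (ground-truth part length) of the corresponding true endpoints.
    """
    lengths = np.linalg.norm(gt[:, 0] - gt[:, 1], axis=1)   # (P,)
    errors = np.linalg.norm(pred - gt, axis=2)              # (P, 2)
    correct = (errors <= alpha * lengths[:, None]).all(axis=1)
    return 100.0 * correct.mean()
```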
Fig 6. Qualitative examples of body part detection.
Each image shows an example of the extracted dog body parts in different conditions. Different line colours correspond to different body parts.
Comparative results of our PCP against state-of-the-art methods.
Bold values are the best.
| | Our | Chen et al. 2011 | Pistocchi et al. 2014 |
|---|---|---|---|
| Trial 1 | **80%** | 50% | 70% |
| Trial 2 | **74%** | 48% | 67% |
| Trial 3 | **83%** | 53% | 78% |
Fig 7. Confusion matrix of the B.A.R.K. posture classification (predicted class) compared to the manual annotation (ground-truth class).
Numbers and percentages of samples corresponding to the system outputs are reported in the cells. Ideally, values on the green diagonal should approach 100% accuracy.
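Such a matrix is straightforward to tabulate from paired label sequences; the posture label set below is a hypothetical ordering:

```python
import numpy as np

POSTURES = ["sit", "stand", "locomotion", "lie"]  # hypothetical label order

def confusion_matrix(truth, predicted, labels=POSTURES):
    """Rows = ground-truth class, columns = predicted class."""
    index = {lab: k for k, lab in enumerate(labels)}
    C = np.zeros((len(labels), len(labels)), dtype=int)
    for t, p in zip(truth, predicted):
        C[index[t], index[p]] += 1
    return C

# Row-normalised percentages; the diagonal should approach 100%:
# perc = 100 * C / C.sum(axis=1, keepdims=True)
```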
Fig 8. Bar plot of the percentage of time dogs spent in each posture.
The y-axis is the percentage of the video, while the bars graphically depict how the video was scored in terms of behaviours. On the x-axis, the B.A.R.K. results and the ground truth are shown for every video.
Correlation between the manual annotation and automated scoring of behaviour using B.A.R.K. Spearman’s rho and p-values are presented.
| | Sit | Stand | Locomotion | Lie |
|---|---|---|---|---|
| Spearman's rho | 0.59 | 0.92 | 0.95 | 0.81 |
| p-value | 0.04 | <0.0001 | <0.0001 | 0.001 |
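Such per-posture correlations come directly from SciPy; the per-video scores below are hypothetical values for illustration only:

```python
from scipy.stats import spearmanr

# Hypothetical per-video scores: fraction of time spent standing according
# to the manual annotation vs. the automated B.A.R.K. output.
manual = [0.30, 0.55, 0.10, 0.42, 0.25]
automated = [0.28, 0.60, 0.12, 0.40, 0.22]

rho, p = spearmanr(manual, automated)
print(f"Spearman's rho = {rho:.2f}, p = {p:.4f}")
```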
Fig 9. Example of automated cluster analysis. (a) Dialog window showing the cluster analysis results. In the right-hand menu, the 5-s sequences are grouped into 4 clusters that were manually renamed after visualising them all (i.e. double-click on a sequence to play it). The dog's trajectories performed in the sequences of cluster 1 (b), cluster 2 (c), cluster 3 (d) and cluster 4 (e) give an idea of how B.A.R.K. automatically groups the different patterns of behaviour expressed by the dog in the clip.