| Literature DB >> 36236311 |
Xiaopin Wang, Wei Wang, Jisheng Lu, Haiyan Wang.
Abstract
The body size of pigs is a vital evaluation indicator for growth monitoring and selective breeding. The detection of joint points is critical for accurately estimating pig body size. However, most joint point detection methods focus on improving detection accuracy while neglecting detection speed and model parameters. In this study, we propose an HRNet with Swin Transformer block (HRST) based on HRNet for detecting the joint points of pigs. It improves model accuracy while significantly reducing model parameters by replacing the parameter-redundant fourth stage of HRNet with a Swin Transformer block. Moreover, we implemented joint point detection for multiple pigs in two steps: first, CenterNet was used to detect pig posture (lying or standing); then, HRST was used for joint point detection on standing pigs. The results indicated that CenterNet achieved an average precision (AP) of 86.5%, and HRST achieved an AP of 77.4% and a real-time detection speed of 40 images per second. Compared with HRNet, the AP of HRST improved by 6.8%, while the number of model parameters and the computational cost were reduced by 72.8% and 41.7%, respectively. The study provides technical support for the accurate and rapid detection of pig joint points, which can be used for contact-free body size estimation of pigs.
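The two-step scheme described in the abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: `detect_postures` and `detect_joint_points` are hypothetical stand-ins for the trained CenterNet and HRST models, returning fixed dummy values.

```python
# Minimal sketch of the paper's two-step scheme. The two detector
# functions are hypothetical stand-ins (not the real CenterNet/HRST
# models); they return fixed dummy values for illustration.

def detect_postures(image):
    # Stand-in for CenterNet posture detection:
    # returns (bounding box, posture) pairs.
    return [((10, 10, 120, 90), "standing"),
            ((150, 40, 260, 110), "lying")]

def detect_joint_points(image, bbox):
    # Stand-in for HRST keypoint detection on one standing pig:
    # a real model would return the 10 annotated (x, y) joint points.
    return [(0.0, 0.0)] * 10

def two_step_detection(image):
    # Step 1: detect every pig and its posture.
    # Step 2: run joint point detection only on standing pigs.
    results = []
    for bbox, posture in detect_postures(image):
        keypoints = detect_joint_points(image, bbox) if posture == "standing" else None
        results.append({"bbox": bbox, "posture": posture, "keypoints": keypoints})
    return results
```

Lying pigs keep `keypoints = None`, mirroring Figure 8, where joint point detection is only performed on standing pigs.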
Keywords: CNN; deep learning; keypoint detection; object detection; transformer
Year: 2022 PMID: 36236311 PMCID: PMC9571911 DOI: 10.3390/s22197215
Source DB: PubMed Journal: Sensors (Basel) ISSN: 1424-8220 Impact factor: 3.847
Figure 1Annotated example for posture detection (the yellow bounding box represents the standing pig, and the green bounding box represents the lying pig).
Details of the posture annotation dataset.
| Posture Class | Train Dataset | Validation Dataset | Test Dataset | Total |
|---|---|---|---|---|
| Standing | 5,499 | 735 | 644 | 6,234 |
| Lying | 9,058 | 1,084 | 1,171 | 10,412 |
| Total | 14,557 | 1,819 | 1,815 | 18,191 |
Figure 2Pig joint point annotation example. (Black marks indicate left and right neck, blue marks indicate left and right shoulders, green marks indicate left and right abdomen, red marks indicate left and right hips, and orange marks indicate left and right tails).
Standard deviation of each joint point.
| Joint Point | Standard Deviation | Joint Point | Standard Deviation |
|---|---|---|---|
| left neck | 0.005322321731521584 | right abdomen | 0.005311422910349784 |
| right neck | 0.00546914966592658 | left hip | 0.010024322728349425 |
| left shoulder | 0.009892777323066001 | right hip | 0.008588638693752731 |
| right shoulder | 0.00871434068851134 | left tail | 0.004319627728346724 |
| left abdomen | 0.004523805890671292 | right tail | 0.00422832022133345 |
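For context, per-joint standard deviations such as these play the role of the per-keypoint falloff constants in COCO-style Object Keypoint Similarity (OKS), the matching criterion used for the keypoint results below. The sketch below assumes the standard COCO formulation with all keypoints visible; whether the paper uses exactly these tabulated values as its OKS constants is an assumption.

```python
import math

def oks(pred, gt, sigmas, object_area):
    # COCO-style Object Keypoint Similarity for one object, assuming
    # all keypoints are labeled and visible. `sigmas` are per-joint
    # constants (the role the standard deviations above would play);
    # `object_area` is the ground-truth object area (scale squared).
    total = 0.0
    for (px, py), (gx, gy), sigma in zip(pred, gt, sigmas):
        d2 = (px - gx) ** 2 + (py - gy) ** 2
        total += math.exp(-d2 / (2.0 * object_area * (2.0 * sigma) ** 2))
    return total / len(gt)
```

A perfect prediction yields OKS = 1.0; at evaluation time, AP50 and AP75 threshold this similarity at 0.5 and 0.75, just as IOU is thresholded for bounding boxes.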
Figure 3HRST model structure.
Figure 4The training loss curve (a) and AP curve (b) of the posture detection model.
Results of different object detection models on the pig posture test set. GFLOPs represents the computational cost of the model; Params represents the number of parameters; AP50 and AP75 are the average precision when the intersection over union (IOU) threshold is set to 0.5 and 0.75; AP and AR are the average precision and average recall averaged over 10 IOU thresholds (0.50:0.05:0.95); FPS represents the inference speed. The best results are in bold.
| Class | Feature Extractor | GFLOPs | Params | AP | AP50 | AP75 | AR | FPS |
|---|---|---|---|---|---|---|---|---|
| Faster-RCNN | ResNet50-FPN | 177.11 | 41.4M | 84.7 | 99.0 | 98.6 | 88.9 | 18 |
| Faster-RCNN | MobileNetV3-Large-FPN | — | — | 82.3 | — | 97.8 | 86.6 | 29 |
| Faster-RCNN | EfficientNetV2-S-FPN | 59.73 | 24.3M | 84.8 | 99.0 | 98.9 | 88.9 | 21 |
| Faster-RCNN | ConvNeXt-T-FPN | 97.98 | 34.3M | 86.1 | — | — | 90.0 | 18 |
| YOLOv4 | CSPDarknet53 | 119.50 | 63.9M | 84.1 | 98.5 | 97.8 | 88.1 | — |
| FCOS | ResNet50-FPN | 177.47 | 31.84M | 85.7 | 99.0 | 98.1 | — | 21 |
| CenterNet (for posture detection) | DLA-34 | 96.29 | 20.2M | 86.5 | 99.0 | 98.9 | 89.5 | 26 |
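For reference, the AP50 and AP75 columns above count a predicted box as a true positive when its intersection over union (IOU) with a ground-truth box reaches 0.5 or 0.75. A minimal IOU computation for axis-aligned boxes in (x1, y1, x2, y2) form:

```python
def iou(box_a, box_b):
    # Intersection over union of two axis-aligned boxes (x1, y1, x2, y2).
    x1 = max(box_a[0], box_b[0])
    y1 = max(box_a[1], box_b[1])
    x2 = min(box_a[2], box_b[2])
    y2 = min(box_a[3], box_b[3])
    inter = max(0.0, x2 - x1) * max(0.0, y2 - y1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0

# A predicted box overlapping 75% of the ground truth:
print(iou((0, 0, 100, 100), (25, 0, 100, 100)))  # 0.75
```

At IOU = 0.75 this prediction counts as correct under both the AP50 and AP75 criteria; at 0.6 it would count only under AP50.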
Figure 5Examples of standing (yellow rectangles) and lying (green rectangles) detected by CenterNet with DLA-34 as the feature extraction network.
Figure 6The training loss curve (a) and AP curve (b) of the joint point detection model.
Results of different joint point detection models on the joint point test set of pigs. AP50 and AP75 are the average precision when the Object Keypoint Similarity (OKS) threshold is set to 0.5 and 0.75; AP and AR are the average precision and average recall averaged over 10 OKS thresholds (0.50:0.05:0.95). We calculated the percent difference in Params, GFLOPs, and AP between models marked with the same symbol: "†" marks the models we compared, and "↓" and "↑" indicate the decrease and increase of the comparison results, respectively.
| Class | GFLOPs | Params | AP | AP50 | AP75 | AR | FPS |
|---|---|---|---|---|---|---|---|
| CenterNet (for joint point detection) | — | 20.6M | 67.2 | 94.4 | 83.2 | 73.8 | — |
| HRNet-w48 | 35.43 † | 63.6M † | 70.6 † | 94.9 | 86.4 | 78.0 | 26 |
| HRNetv2-w48 | 39.53 | 65.9M | 72.1 | — | — | 78.6 | 24 |
| Simple Baseline-152 | 28.67 | 68.6M | 69.6 | 96.0 | 86.6 | 76.0 | 48 |
| Tokenpose-L-D24 | 23.98 | 29.9M | 70.2 | 95.6 | 85.1 | 77.6 | 26 |
| HRST | 20.65 † (↓41.7%) | 17.3M † (↓72.8%) | 77.4 † (↑6.8%) | 95.9 | 90.4 | — | 40 |
Results of different joint point detection models on the ATRW test set. The indicators in this table are the same as those in Table 4.
| Class | GFLOPs | Params | AP | AP50 | AP75 | AR | FPS |
|---|---|---|---|---|---|---|---|
| CenterNet (for joint point detection) | — | 20.6M | 75.2 | 95.2 | 77.0 | 86.5 | — |
| HRNet-w48 | 35.43 † | 63.6M † | 88.8 † | 97.2 | 90.6 | 91.9 | 26 |
| HRNetv2-w48 | 39.53 | 65.9M | 89.0 | 97.3 | 90.4 | 92.2 | 24 |
| Simple Baseline-152 | 28.67 | 68.6M | 86.4 | 96.4 | 90.3 | 90.1 | 48 |
| Tokenpose-L-D24 | 23.98 | 29.9M | 87.1 | 97.2 | 89.3 | 90.6 | 26 |
| HRST | 20.65 † (↓41.7%) | 17.3M † (↓72.8%) | — | — | — | — | 40 |
Figure 7Detection results of HRST on pig and ATRW test sets.
Figure 8Example of multi-pig joint point detection. The green bounding box represents lying, and the yellow bounding box shows standing. Joint point detection is only performed on standing pigs.