| Literature DB >> 36080994 |
Shunli Wang, Honghua Jiang, Yongliang Qiao, Shuzhen Jiang, Huaiqin Lin, Qian Sun.
Abstract
Pork accounts for an important proportion of livestock products. For pig farming, a lot of manpower, material resources and time are required to monitor pig health and welfare. As the number of pigs in farming increases, the continued use of traditional monitoring methods may cause stress and harm to pigs and farmers and affect pig health and welfare as well as farming economic output. In addition, the application of artificial intelligence has become a core part of smart pig farming. The precision pig farming system uses sensors such as cameras and radio frequency identification to monitor biometric information such as pig sound and pig behavior in real-time and convert them into key indicators of pig health and welfare. By analyzing the key indicators, problems in pig health and welfare can be detected early, and timely intervention and treatment can be provided, which helps to improve the production and economic efficiency of pig farming. This paper studies more than 150 papers on precision pig farming and summarizes and evaluates the application of artificial intelligence technologies to pig detection, tracking, behavior recognition and sound recognition. Finally, we summarize and discuss the opportunities and challenges of precision pig farming.Entities:
Keywords: artificial intelligence; behavior recognition; livestock farming; pig detection and tracking; precision pig farming; sound recognition
Year: 2022 PMID: 36080994 PMCID: PMC9460267 DOI: 10.3390/s22176541
Source DB: PubMed Journal: Sensors (Basel) ISSN: 1424-8220 Impact factor: 3.847
Figure 1. Precision pig farming framework.
Figure 2. Implementation of precision pig farming technology.
Figure 3. Camera top-view image and its corresponding 3D point cloud. (a) RGB image; (b) 3D point cloud.
Figure 4. Examples of pig posture detection results.
Main research work on pig image detection and tracking.
| Authors, Year | Dataset Size | Method | Breed | Result |
|---|---|---|---|---|
| Zhao et al., 2022 | 18,000 | Mask R-CNN and GAN | - | Average precision = 90.6% |
| Lei et al., 2022 | 416,873 | U-Net and UNet-Attention | Yorkshire pig | Average precision = 94.80% |
| Ocepek et al., 2022 | 583 | Mask R-CNN and YOLOv4 | Crossbred Norsvin Landrace × | Precision = 96.00% |
| Ding et al., 2022 | 5000 | YOLOv5 and FD-CNN | Pregnant Large White sow | Precision = 93.60% |
| Wutke et al., 2021 | 12,285 | CNN and KF | - | MOTA = 94.40% |
| Sun and Li, 2021 | - | Multi-object tracking algorithm | - | Correct tracking rate = 99.00% |
| Van Der Zande et al., 2021 | 4000 | YOLOv3 and SORT | Crossbred pig | mAP = 99.70% |
| Sha et al., 2021 | 5988 | YOLOv3 | - | - |
| Liu et al., 2021 | 5000 | ResNet-50 and DLC-KPCA | Weaned Yorkshire piglets | Accuracy = 96.88% |
| Jung et al., 2021 | 2182 | Faster R-CNN and OCTA | - | Accuracy = 77.00% |
| He et al., 2021 | 1400 | Mask R-CNN and Track R-CNN | - | MOTSA = 94.90% |
| Gan et al., 2021 | 100 video clips | Faster R-CNN and OPTN | Meihua sow | MOTA = 97.04% |
| Zhang et al., 2020 | 425 GB | CamTracor-PG | - | Average overlap rate = 91.00% |
| Liu et al., 2020 | 320 | SSD + ResNet-50 and MTU | (Landrace × Large White) × | Precision = 96.38% |
| Chen et al., 2020 | 51 video clips | Bottom-up keypoint detection | - | mAP = 84.30% |
| Chen et al., 2020 | 15,000 | YOLACT | Landrace × Yorkshire crossbred pig | Accuracy = 90.00% |
| Zhang et al., 2019 | 18,000 | SSD and correlation filter | Large White × Landrace | Precision = 94.72% |
| Cowton et al., 2019 | 3292 | Faster R-CNN, SORT and Deep SORT | - | mAP = 90.10% |
Notes: GAN means generative adversarial network; FD-CNN means frame differences in combination with convolutional neural network; KF means Kalman filter algorithm; SORT means simple online real-time tracking; CNN means convolutional neural network; ResNet means residual network; DLC-KPCA means DeepLabCut-kernel principal component analysis; OCTA means object center-point tracking algorithm; R-CNN means region-convolutional neural network; Mask R-CNN means mask region-convolutional neural network; OPTN means online piglet tracking network; CamTracor-PG means CamShift tracking approach based on correlation probability graph; SSD means single shot multibox detector; MTU means minimum tracking unit; STRF means spatial-aware temporal response filtering; mAP means mean average precision; MOTSA means multi-object tracking and segmentation accuracy; MOTA means multi-object tracking accuracy; YOLACT means you only look at coefficients; Deep SORT means deep simple online real-time tracking; - means that the authors did not state specific data or did not mention this property in the text.
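Several of the tracking results above are reported as MOTA (multi-object tracking accuracy), which aggregates false negatives, false positives, and identity switches over all frames relative to the number of ground-truth objects. As a rough, self-contained illustration (not code from any of the cited papers, and with invented toy counts), MOTA can be computed as:

```python
def mota(frames):
    """Multi-object tracking accuracy (CLEAR MOT definition).

    `frames` is a list of per-frame error counts:
    (false_negatives, false_positives, id_switches, ground_truth_objects).
    MOTA = 1 - sum(FN + FP + IDSW) / sum(GT); it can be negative when the
    tracker makes more errors than there are ground-truth objects.
    """
    errors = sum(fn + fp + idsw for fn, fp, idsw, _ in frames)
    gt = sum(g for *_, g in frames)
    return 1.0 - errors / gt

# Toy example: 3 frames with 10 ground-truth pigs each, one missed
# detection, one false alarm, and one identity switch overall.
frames = [(1, 0, 0, 10), (0, 1, 0, 10), (0, 0, 1, 10)]
print(f"MOTA = {mota(frames):.2%}")  # → MOTA = 90.00%
```

Because identity switches are penalized alongside detection errors, a tracker can have near-perfect detection yet still lose MOTA points whenever visually similar pigs swap identities.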
Figure 5. Examples of four different pig behaviors (i.e., drinking, mounting, aggressive, and lying). (a) Pig drinking behavior; (b) Pig mounting behavior; (c) Pig aggressive behavior; (d) Pig lying behavior.
Main research work on pig behavior recognition.
| Authors, Year | Data Type | Behavior | Method | Breed | Accuracy |
|---|---|---|---|---|---|
| Riekert et al., 2021 | 2D | Lying | Faster R-CNN, NASNet | Pig (German Hybrid × German Piétrain) | 84.00% |
| Gan et al., 2021 | 2D | Snout-snout and | ResNet-101 | Meihua sow | 93.09% |
| D’Eath et al., 2021 | 3D | Scratched tails | Linear mixed models | Grower/finisher pig | - |
| Gan et al., 2021 | 3D | Nursing | ResNet-50, FlowNet2.0 | Meihua sow | 97.63% |
| Ji et al., 2020 | 2D | Eating and drinking | YOLOv2 | Yorkshire sow | 94.59% |
| Chen et al., 2020 | 2D | Drinking | ResNet-50 + LSTM | Mixed nursery pig | 92.50% |
| Wang et al., 2020 | 2D | Estrus | MFO-LSTM | Landrace pig | 98.02% |
| Zhuang et al., 2020 | 3D | Estrus | AlexNet | Large White sow | 93.33% |
| Chen et al., 2020 | 2D | Aggressive | VGG16 + LSTM | Mixed nursery pig | 98.40% |
| Zheng et al., 2020 | 3D | Walking, | Fast R-CNN and HMM | Small-ears spotted pig | 92.70% |
| Riekert et al., 2020 | 2D | Lying | Faster R-CNN + NAS | Fattening pig | 80.20% |
| Li et al., 2020 | 3D | Feeding, lying, | PMB-SCN | Fragrance pig | 97.63% |
| Zhang et al., 2020 | 3D | Feeding, lying, | TSCNM | Fragrance pig | 98.99% |
| Alameer et al., 2020 | 2D | Nursing | SVM | Sow | 96.40% |
| Chen et al., 2020 | 2D | Feeding | Xception + LSTM | Mixed nursery pig | 98.40% |
| Li et al., 2019 | 2D | Mounting | Mask R-CNN and KELM | Minipig | 91.47% |
| Li et al., 2019 | 2D | Mounting | Mask R-CNN and ResNet-FPN | - | 94.50% |
| Gao et al., 2019 | 3D | Aggressive | 3D ConvNet | - | 96.78% |
| Tan et al., 2018 | 2D | Drinking | Douglas-Peucker | - | 93.75% |
| Yang et al., 2018 | 2D | Drinking | GoogLeNet | - | 92.11% |
| Xue et al., 2018 | 3D | Standing, sitting, | Faster R-CNN, ZF-D2R | Sow | 96.73% |
Notes: NAS means neural architecture search; ResNet means residual network; LSTM means long short-term memory; MFO means moth-flame optimization; HMM means hidden Markov model; PMB-SCN means a SlowFast network-based spatiotemporal convolutional network for the pig’s multi-behavior recognition; Xception means extreme version of Inception; KELM means kernel extreme learning machine; TSCNM means two-stream convolutional network models; SVM means support vector machine; Mask R-CNN means mask region-convolutional neural network; ResNet-FPN means residual network feature pyramid networks; ZF-D2R means ZF model with deeper layers and two residual learning frameworks; keep standing, keep sitting, and keep ventral recumbency mean maintaining the respective position continuously for a certain period of time without making any other movements; - means that the authors did not state specific data or did not mention this property in the text.
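Several behavior-recognition entries above pair a CNN feature extractor (e.g., ResNet-50, VGG16, Xception) with an LSTM that classifies the behavior of a whole video clip. The NumPy sketch below illustrates only the temporal half of such a pipeline: random vectors stand in for per-frame CNN features, and all dimensions and weights are illustrative assumptions, not values from any cited paper.

```python
import numpy as np

rng = np.random.default_rng(0)

def lstm_step(x, h, c, W, U, b):
    """One LSTM step; gates stacked as [input, forget, cell, output]."""
    z = W @ x + U @ h + b                      # shape (4 * hidden,)
    H = h.size
    i, f, g, o = z[:H], z[H:2 * H], z[2 * H:3 * H], z[3 * H:]
    sigmoid = lambda a: 1.0 / (1.0 + np.exp(-a))
    c = sigmoid(f) * c + sigmoid(i) * np.tanh(g)
    h = sigmoid(o) * np.tanh(c)
    return h, c

# Toy setup: a 16-frame clip of 64-dim "CNN features", 32 hidden units,
# 4 behavior classes (e.g., drinking / mounting / aggressive / lying).
T, D, H, K = 16, 64, 32, 4
feats = rng.standard_normal((T, D))            # stand-in for ResNet-50 features
W = rng.standard_normal((4 * H, D)) * 0.1      # input-to-gate weights
U = rng.standard_normal((4 * H, H)) * 0.1      # hidden-to-gate weights
b = np.zeros(4 * H)
Wout = rng.standard_normal((K, H)) * 0.1       # final classification layer

h, c = np.zeros(H), np.zeros(H)
for x in feats:                                # run the LSTM over the clip
    h, c = lstm_step(x, h, c, W, U, b)

logits = Wout @ h                              # classify from the last state
probs = np.exp(logits) / np.exp(logits).sum()  # softmax over behavior classes
print("predicted behavior class:", int(probs.argmax()))
```

The design choice the table reflects is that the CNN handles per-frame appearance while the recurrent state accumulates motion cues across frames, which is why temporally defined behaviors such as mounting or aggression benefit from the LSTM stage.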
Figure 6. Spectrogram and mel frequency cepstral coefficients of a pig cough sound. Hertz (Hz) is the unit of frequency; the second (s) is the unit of time. (a) Spectrogram of a pig cough sound; (b) Mel frequency cepstral coefficient diagram of a pig cough sound.
Main research work on pig sound recognition.
| Authors, Year | Sound Category | Method | Breed | Result |
|---|---|---|---|---|
| Yin et al., 2021 | Cough | AlexNet | - | Accuracy = 95.40% |
| Chen et al., 2021 | Estrus sound | VGG16, DTL-CNN | Sow | Accuracy = 96.62% |
| Zhao et al., 2020 | Cough | DNN-HMM | Landrace pig | Average WER = 8.03% |
| Shen et al., 2020 | Cough | MFCC-CNN | - | Accuracy = 97.72% |
| Hong et al., 2020 | Cough, grunt, scream | MnasNet | Pig (Yorkshire, Landrace, and Duroc) | Accuracy = 94.70% |
| Li et al., 2020 | Cough | SVDD | - | Accuracy = 93.70% |
| Cang et al., 2020 | Cough, sneeze, | MobileNetV2 | Three-way sow | Accuracy = 97.30% |
| Zhang et al., 2019 | Cough, sneeze, | SVDD, BPNN | - | Accuracy = 95.40% |
| Wang et al., 2019 | Cough | PCA, SVM | Landrace weaners | Accuracy = 95.00% |
| Li et al., 2019 | Cough | BLSTM-CTC | Landrace | Accuracy = 93.77% |
| Cordeiro et al., 2018 | Pig vocalization | Decision tree | Sow | Accuracy = 81.92% |
| Li et al., 2018 | Cough | PCA, DBN | Landrace | Accuracy = 94.29% |
| Dong et al., 2017 | Cough, wind noise | DCT | - | - |
| Hui et al., 2016 | Cough | - | - | Accuracy = 96.00% |
| Yan et al., 2016 | Nursing grunt, | The sub-band | - | Accuracy = 95.17% |
Notes: DTL-CNN means deep transfer learning and convolutional neural network; DNN-HMM means deep neural network hidden Markov model; MFCC means mel frequency cepstral coefficient; MnasNet means neural architecture search for mobile; SVDD means support vector data description; MobileNetV2 means mobile networks, version 2; BPNN means back-propagation neural network; PCA means principal component analysis; SVM means support vector machine; BLSTM-CTC means bidirectional long short-term memory-connectionist temporal classification; DBN means deep belief network; DCT means discrete cosine transform; WER means word error rate; - means that the authors did not state specific data or did not mention this property in the text.
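MFCCs (Figure 6) are the input features for several of the cough-recognition systems above (e.g., MFCC-CNN). The NumPy sketch below walks through the standard MFCC pipeline (framing → power spectrum → mel filterbank → log → DCT-II); the sample rate, frame length, and filter counts are common illustrative defaults rather than values from any cited paper, and random noise stands in for a real cough recording.

```python
import numpy as np

def mfcc(signal, sr=16000, n_fft=512, hop=256, n_mels=26, n_ceps=13):
    """Minimal MFCC: frame -> power spectrum -> mel filterbank -> log -> DCT-II."""
    # 1. Split the signal into overlapping Hamming-windowed frames.
    n_frames = 1 + (len(signal) - n_fft) // hop
    idx = np.arange(n_fft) + hop * np.arange(n_frames)[:, None]
    frames = signal[idx] * np.hamming(n_fft)

    # 2. Power spectrum of each frame.
    power = np.abs(np.fft.rfft(frames, n_fft)) ** 2 / n_fft

    # 3. Triangular filters spaced evenly on the mel scale, linearly in Hz.
    mel = lambda f: 2595.0 * np.log10(1.0 + f / 700.0)
    mel_inv = lambda m: 700.0 * (10.0 ** (m / 2595.0) - 1.0)
    mel_pts = mel_inv(np.linspace(mel(0.0), mel(sr / 2.0), n_mels + 2))
    bins = np.floor((n_fft + 1) * mel_pts / sr).astype(int)
    fbank = np.zeros((n_mels, n_fft // 2 + 1))
    for i in range(n_mels):
        l, c, r = bins[i], bins[i + 1], bins[i + 2]
        fbank[i, l:c] = (np.arange(l, c) - l) / max(c - l, 1)   # rising edge
        fbank[i, c:r] = (r - np.arange(c, r)) / max(r - c, 1)   # falling edge
    log_mel = np.log(power @ fbank.T + 1e-10)

    # 4. DCT-II decorrelates the log filterbank energies; keep n_ceps of them.
    n = np.arange(n_mels)
    dct = np.cos(np.pi * np.outer(np.arange(n_ceps), (2 * n + 1) / (2 * n_mels)))
    return log_mel @ dct.T                     # shape (n_frames, n_ceps)

# Toy input: one second of noise standing in for a recorded cough.
rng = np.random.default_rng(1)
feats = mfcc(rng.standard_normal(16000))
print(feats.shape)                             # → (61, 13)
```

The resulting frames-by-coefficients matrix is what Figure 6b visualizes; a CNN classifier then treats it much like a small image.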