Yunxuan Wu
Abstract
Virtual reality (VR) generates realistic images through computer graphics and gives users an immersive experience through a range of interactive means. As digitalization advances, VR has become a routine tool for digital media art creation. Digital media art creation today is closely tied to the technological innovation behind it, so the influence of modern technology is unavoidable, and VR and artificial intelligence (AI) have gradually become its main technical means. This work proposes AODNET, an AI-based art object detection method for VR digital media art creation. To address the particular demands of object detection in this setting, the detection strategy is built on a residual network and clustering. First, it uses ResNet50 as the backbone, which deepens the network and improves feature extraction. Second, it applies the K-means++ algorithm to cluster the sizes of the ground-truth annotation boxes in the dataset, which yields suitable hyperparameters for the preset candidate boxes and makes the algorithm more tolerant of variation in target size. Third, it replaces ROI pooling with ROI align to eliminate the error that quantization introduces into the features of candidate regions. Fourth, to reduce the missed-detection rate for overlapping targets, it post-processes the candidate boxes with soft-NMS instead of NMS. Finally, extensive experiments are conducted to verify the superiority of AODNET for object detection in VR digital media art creation.
Year: 2022 PMID: 35990155 PMCID: PMC9385317 DOI: 10.1155/2022/3781750
Source DB: PubMed Journal: Comput Intell Neurosci
Figure 1. Residual unit.
ResNet50 structure.
| Layer | Parameter |
|---|---|
| Conv 1 | 7 × 7 convolution, 3 × 3 max pooling |
| Conv 2 | [1 × 1, 3 × 3, 1 × 1] × 3 |
| Conv 3 | [1 × 1, 3 × 3, 1 × 1] × 4 |
| Conv 4 | [1 × 1, 3 × 3, 1 × 1] × 6 |
| Conv 5 | [1 × 1, 3 × 3, 1 × 1] × 3 |
| Average pooling | 14 × 14 |
| FC | 1000 |
| Softmax | 1000 |
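As a quick check on the table, the "50" in ResNet50 can be recovered by counting weighted layers: the 7 × 7 convolution in Conv 1, three convolutions in each bottleneck block of Conv 2 to Conv 5, and the final fully connected layer. The sketch below assumes the usual counting convention, in which pooling, softmax, and shortcut projections are not counted; all names are illustrative and do not come from the paper.

```python
# Bottleneck blocks per stage, as listed in the table (Conv 2 to Conv 5).
blocks_per_stage = [3, 4, 6, 3]
convs_per_block = 3   # each block is 1x1, 3x3, 1x1

stem_convs = 1        # the 7x7 convolution in Conv 1
fc_layers = 1         # the final 1000-way fully connected layer


def count_weighted_layers(blocks, per_block, stem, fc):
    """Count convolutional and fully connected layers (assumed convention)."""
    return stem + per_block * sum(blocks) + fc


total = count_weighted_layers(blocks_per_stage, convs_per_block,
                              stem_convs, fc_layers)
print(total)  # 50
```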
Figure 2. Combination of ResNet and Faster R-CNN.
Experimental environment details.
| Name | Parameter |
|---|---|
| GPU | GTX 1080Ti 16 GB |
| CPU | Intel Core i5-9300H |
| System | Ubuntu 18.04 |
| Language | Python |
| Framework | PyTorch |
Figure 3. AODNET training loss.
Comparison with other detection methods.
| Method | mAP | Accuracy |
|---|---|---|
| Faster R-CNN | 83.8 | 86.4 |
| YOLO | 86.1 | 89.5 |
| SSD | 87.9 | 91.4 |
| AODNET | 90.3 | 93.6 |
Figure 4. Analysis on combination of Faster R-CNN and ResNet.
Figure 5. Analysis on candidate box optimization.
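The candidate box optimization described in the abstract clusters the sizes of the annotated boxes with K-means++ to choose preset box sizes. A minimal pure-Python sketch of that idea follows. It assumes Euclidean distance on (width, height) pairs, a fixed random seed, and toy box sizes. The abstract does not state the distance measure or the number of clusters, so these are illustrative choices, and the function names are hypothetical.

```python
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two (width, height) pairs."""
    return (a[0] - b[0]) ** 2 + (a[1] - b[1]) ** 2


def kmeanspp_init(points, k, rng):
    """K-means++ seeding: sample new centers with probability ~ D(x)^2."""
    centers = [rng.choice(points)]
    while len(centers) < k:
        d2 = [min(sq_dist(p, c) for c in centers) for p in points]
        total = sum(d2)
        if total == 0:  # every point coincides with an existing center
            centers.append(rng.choice(points))
            continue
        r = rng.random() * total
        acc = 0.0
        for p, d in zip(points, d2):
            acc += d
            if acc > r:
                centers.append(p)
                break
    return centers


def cluster_box_sizes(points, k, iters=50, seed=0):
    """Lloyd iterations after K-means++ seeding; returns centers by area."""
    rng = random.Random(seed)
    centers = kmeanspp_init(points, k, rng)
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for p in points:
            idx = min(range(k), key=lambda i: sq_dist(p, centers[i]))
            clusters[idx].append(p)
        new_centers = [
            (sum(p[0] for p in c) / len(c), sum(p[1] for p in c) / len(c))
            if c else centers[i]
            for i, c in enumerate(clusters)
        ]
        if new_centers == centers:  # assignments have stabilized
            break
        centers = new_centers
    return sorted(centers, key=lambda c: c[0] * c[1])


# Toy (width, height) pairs standing in for ground-truth box sizes.
box_sizes = [(10, 10), (12, 12), (11, 11),
             (100, 200), (102, 198), (101, 199),
             (300, 100), (298, 102), (299, 101)]
anchors = cluster_box_sizes(box_sizes, k=3)
print(anchors)
```

The resulting centers would then serve as the widths and heights of the preset candidate boxes in place of hand-picked scales and aspect ratios.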
Figure 6. Analysis on ROI align.
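The quantization error that ROI align removes can be illustrated on a tiny feature map. ROI pooling snaps fractional coordinates to integer grid cells, whereas ROI align samples the feature map at the fractional location by bilinear interpolation. The sketch below shows only that sampling step, on a hypothetical 2 × 2 feature map; it is not the paper's implementation, and the function names are illustrative.

```python
def bilinear_sample(feature, y, x):
    """Sample a 2D feature map at a fractional (y, x) location."""
    h, w = len(feature), len(feature[0])
    # Clamp the sampling point to the valid range of the map.
    y = min(max(y, 0.0), h - 1.0)
    x = min(max(x, 0.0), w - 1.0)
    y0, x0 = int(y), int(x)
    y1, x1 = min(y0 + 1, h - 1), min(x0 + 1, w - 1)
    dy, dx = y - y0, x - x0
    top = feature[y0][x0] * (1 - dx) + feature[y0][x1] * dx
    bottom = feature[y1][x0] * (1 - dx) + feature[y1][x1] * dx
    return top * (1 - dy) + bottom * dy


def quantized_sample(feature, y, x):
    """ROI-pooling-style lookup: truncate coordinates to the grid."""
    return feature[int(y)][int(x)]


feature_map = [[0.0, 1.0],
               [2.0, 3.0]]

aligned = bilinear_sample(feature_map, 0.5, 0.5)
snapped = quantized_sample(feature_map, 0.5, 0.5)
print(aligned, snapped)  # 1.5 0.0
```

In a full ROI align layer, several such samples per output bin are averaged (or max-pooled), so the candidate region never has to be rounded to the feature grid.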
Figure 7. Analysis on soft-NMS.
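Soft-NMS lowers the scores of boxes that overlap a higher-scoring box instead of discarding them outright, which is why it can retain overlapping targets that hard NMS would miss. A minimal sketch follows, assuming the Gaussian decay variant with illustrative values for `sigma` and `score_thresh`. The abstract does not say which variant or settings AODNET uses, so these are assumptions, as are the function names and the toy boxes.

```python
import math


def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    x1, y1 = max(a[0], b[0]), max(a[1], b[1])
    x2, y2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, x2 - x1) * max(0.0, y2 - y1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def soft_nms(boxes, scores, sigma=0.5, score_thresh=0.001):
    """Gaussian soft-NMS (assumed variant); returns (box, score) pairs."""
    remaining = list(zip(boxes, scores))
    kept = []
    while remaining:
        # Pick the highest-scoring remaining box.
        best = max(range(len(remaining)), key=lambda i: remaining[i][1])
        box, score = remaining.pop(best)
        kept.append((box, score))
        # Decay the scores of the others according to their overlap with it.
        rescored = []
        for other, s in remaining:
            s *= math.exp(-(iou(box, other) ** 2) / sigma)
            if s >= score_thresh:
                rescored.append((other, s))
        remaining = rescored
    return kept


boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (50, 50, 60, 60)]
scores = [0.9, 0.8, 0.7]
for box, score in soft_nms(boxes, scores):
    print(box, round(score, 3))
```

In this toy case the second box overlaps the first with an IoU of about 0.68, so hard NMS at a 0.5 threshold would drop it; here it is kept with a reduced score and ranked below the non-overlapping third box.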
Analysis on training batch.
| Batch size | 8 | 16 | 32 | 64 | 128 |
|---|---|---|---|---|---|
| mAP | 85.3 | 88.9 | 90.3 | 89.5 | 87.5 |
| Accuracy | 90.9 | 92.6 | 93.6 | 93.1 | 91.2 |