Qi Wang, Qing-Ming Wang.
Abstract
Athletes often cannot obtain reliable data on their own movements because of limitations in field equipment, human error, and other factors, and therefore cannot receive professional movement guidance and posture correction from coaches. To address this problem, this paper draws on recent deep learning research and on human posture recognition techniques, and uses a convolutional neural network together with video processing to build an auxiliary evaluation system for sports movements. The system captures accurate movement data and supports interaction, helping athletes better understand their body posture and movement data. The results show the following. (1) With the open-source OpenPose library used for pose recognition, the key points of the human body in a video can be identified, and joint angle data can be computed from the joint coordinates for analysis. (2) The movements of the human body in the video are evaluated to judge whether the action amplitude of the detected target conforms to the standard action data. (3) A motion auxiliary evaluation system is built on the standard motion database created in this paper; the smaller the Euclidean distance between an action and the standard action, the closer the action is to the standard. The action with a Euclidean distance of 4.79583 is the best action of the tested person. (4) Traditional methods are very inefficient. The method based on a BP neural network reaches a correct recognition rate of up to 96.4%, while the posture recognition method proposed in this paper reaches up to 98.7%, which is 2.3 percentage points higher. The proposed method therefore has a clear advantage. Overall, the sports action auxiliary evaluation system performs well and addresses a practical difficulty faced by athletes; system testing and operation still require further optimization and research.
Year: 2022 PMID: 35310175 PMCID: PMC8926528 DOI: 10.1155/2022/8388325
Source DB: PubMed Journal: J Healthc Eng ISSN: 2040-2295 Impact factor: 2.682
Description of neurons.
| A natural neuron | |
|---|---|
| Composition | Neuronal nucleus, dendrites, axon |
| Explanation | Axons branch out and connect with the dendrites of other neurons to form synapses. Artificial neurons have a similar structure: a nucleus (the processing unit), multiple dendrites (the inputs), and an axon (the output). |
Figure 1: Artificial neuron.
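The neuron analogy can be made concrete with a minimal sketch of an artificial neuron: the inputs play the role of dendrites, the weighted sum and activation play the role of the nucleus, and the returned value plays the role of the axon. The sigmoid activation and the function names are illustrative assumptions, not details taken from the paper.

```python
import math


def sigmoid(z: float) -> float:
    """Logistic activation (an assumed choice; the source names no activation)."""
    return 1.0 / (1.0 + math.exp(-z))


def neuron_output(inputs, weights, bias=0.0):
    """Hypothetical single artificial neuron.

    inputs  -> the "dendrites" (one value per incoming connection)
    weights -> the strength of each connection
    return  -> the "axon" signal passed on to other neurons
    """
    if len(inputs) != len(weights):
        raise ValueError("inputs and weights must have the same length")
    # The "nucleus": weighted sum of the inputs plus a bias, then activation.
    z = sum(x * w for x, w in zip(inputs, weights)) + bias
    return sigmoid(z)


if __name__ == "__main__":
    # Two inputs whose weighted sum cancels out, giving the sigmoid midpoint.
    print(neuron_output([1.0, 2.0], [0.5, -0.25]))
```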
Figure 2: Single-layer convolutional neural network.
Feature extractors based on convolutional neural network designs.
| Name | Description |
|---|---|
| LeNet | Proposed by LeCun in 1998, it is the first CNN. It is a seven-layer convolutional network dedicated to classifying digits, and it can do so without being affected by minor distortion, rotation, or changes in position and scale. |
| AlexNet | Proposed by Krizhevsky et al. By deepening the CNN and applying many parameter optimization strategies to strengthen its learning ability, it is regarded as the first deep CNN architecture and produced pioneering results in image classification and recognition. |
| ZefNet | Recognized as the winner of the 2013 ILSVRC (a CNN competition). It uses deconvolution to visualize the intermediate feature maps of a CNN, analyzes feature behavior to find ways of improving the model, and fine-tunes AlexNet to improve its performance. It achieves a Top-5 error rate of only 14.8% by adjusting AlexNet's hyperparameters while keeping the same structure. To further improve its effectiveness and accuracy, more deep learning elements were added. |
| GoogleNet | Winner of the 2014 ILSVRC. It introduced inception blocks, which integrate multiscale convolutional transformations through the split, transform, and merge idea. Each block encapsulates filters of different sizes (1 × 1, 3 × 3, and 5 × 5) to capture spatial information at different scales (fine-grained and coarse-grained). Besides improving learning ability, GoogleNet focuses on the parameter efficiency of the CNN. |
| VGG | Following the successful application of CNNs to image recognition, Simonyan et al. proposed a simple and effective design principle for CNN architectures. Their architecture, VGG, follows a modular layered pattern. VGG has a depth of 19 layers, which allows the relationship between depth and the representational capacity of the network to be studied. VGG replaces 11 × 11 and 5 × 5 filters with stacks of 3 × 3 convolution layers. Experiments show that stacked 3 × 3 filters can achieve the effect of larger filters. |
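The claim in the VGG row can be checked with a little arithmetic. The sketch below uses hypothetical helper names to compute the receptive field of a stack of stride-1 convolutions and to compare weight counts, assuming equal input and output channel counts and ignoring biases.

```python
def stacked_receptive_field(num_layers: int, kernel_size: int = 3) -> int:
    """Receptive field (in pixels per side) of stacked stride-1 convolutions."""
    # Each extra layer widens the window by (kernel_size - 1) pixels.
    return 1 + num_layers * (kernel_size - 1)


def conv_weight_count(kernel_size: int, channels: int, num_layers: int = 1) -> int:
    """Weights in num_layers conv layers with `channels` inputs and outputs each.

    Biases are ignored; equal channel counts are an illustrative assumption.
    """
    return num_layers * kernel_size * kernel_size * channels * channels


if __name__ == "__main__":
    channels = 64  # illustrative value, not from the paper
    # Two 3x3 layers versus one 5x5 layer.
    print(stacked_receptive_field(2), conv_weight_count(3, channels, 2),
          conv_weight_count(5, channels))
    # Five 3x3 layers versus one 11x11 layer.
    print(stacked_receptive_field(5), conv_weight_count(3, channels, 5),
          conv_weight_count(11, channels))
```

Under these assumptions, two stacked 3 × 3 layers cover a 5 × 5 window and five cover an 11 × 11 window, in both cases with fewer weights than the single large filter.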
Examples of 2D human posture data sets.
| Data set | Type | Number of joint points | Number of samples (×10³) | Usage |
|---|---|---|---|---|
| LSP | Single person | 14 | 2 | Largely abandoned |
| FLIC | Single person | 9 | 20 | Largely abandoned |
| MPII | Single person, multiple persons | 16 | 25 | Mainstream |
| MS COCO | Multiple persons | 17 | >300 | Mainstream |
| AI Challenger | Multiple persons | 14 | 380 | Competition only |
| PoseTrack | Multiple persons | 15 | >20 (frames) | Mostly used for posture tracking |
Motion evaluation methods.
| Method | Description |
|---|---|
| Template-based method | In matching mode, the action sequence to be detected is compared with a pre-established template action library in a specific time order, and an action similarity measure is used to evaluate the action. For complex actions, the time order can be ignored: a dynamic matching method compares the actions at any time in the sequence to be detected with the template action library, and the best match between the two is then found to recognize and classify the motion. |
| Method based on state space | The hidden Markov model (HMM) is one of the most convincing methods in this type of action evaluation. Researchers have also proposed Bayesian networks based on probabilistic inference, which address uncertainty and incompleteness. Compared with the chain structure of the HMM, a Bayesian network is a directed graph describing a random process, which expresses the temporal and sequential transitions of the state well. |
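The source does not name the dynamic matching method used for template-based evaluation. One widely used technique of this kind is dynamic time warping (DTW), sketched below as an assumption rather than as the paper's method. Each sequence is a list of per-frame feature vectors (for example, joint angles), and the function name is hypothetical.

```python
import math


def dtw_distance(seq_a, seq_b):
    """Dynamic time warping cost between two sequences of feature vectors.

    Frames are compared with the Euclidean distance, and the alignment may
    stretch or compress time, so sequences of different lengths can match.
    """
    n, m = len(seq_a), len(seq_b)
    # cost[i][j] = best cost of aligning the first i frames of seq_a
    # with the first j frames of seq_b.
    cost = [[math.inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = math.dist(seq_a[i - 1], seq_b[j - 1])
            cost[i][j] = d + min(cost[i - 1][j],      # repeat a frame of seq_b
                                 cost[i][j - 1],      # repeat a frame of seq_a
                                 cost[i - 1][j - 1])  # advance both sequences
    return cost[n][m]


if __name__ == "__main__":
    template = [[0.0], [1.0], [2.0]]            # illustrative template action
    slower = [[0.0], [1.0], [1.0], [2.0]]       # same action, one frame held
    print(dtw_distance(template, slower))
```

A lower cost means the detected sequence is closer to the template, so classifying an action amounts to picking the template with the smallest cost.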
Figure 3: RGB color space representation.
Figure 4: HSV color space representation.
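The relationship between the two color spaces can be illustrated with a short conversion. This sketch uses Python's standard `colorsys` module rather than the OpenCV routines of the described system, and the wrapper name and the choice to report hue in degrees are assumptions made for readability.

```python
import colorsys


def rgb8_to_hsv(r: int, g: int, b: int):
    """Convert 8-bit RGB to (hue in degrees, saturation 0..1, value 0..1)."""
    # colorsys expects channels in [0, 1] and returns hue as a fraction of a turn.
    h, s, v = colorsys.rgb_to_hsv(r / 255.0, g / 255.0, b / 255.0)
    return h * 360.0, s, v


if __name__ == "__main__":
    for name, rgb in [("red", (255, 0, 0)), ("green", (0, 255, 0)),
                      ("gray", (128, 128, 128))]:
        print(name, rgb8_to_hsv(*rgb))
```

Separating hue from brightness in this way is one reason HSV is often preferred over RGB when segmenting a moving target under changing illumination.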
Figure 5: Flow chart of moving target detection.
Figure 6: Workflow of human posture recognition module.
Figure 7: Action description rule workflow.
Selection of joint angles.
| Joint angle number | Joint points |
|---|---|
| 1 | Head, neck, left shoulder |
| 2 | Left shoulder, left elbow, left wrist |
| 3 | Right shoulder, right elbow, right wrist |
| 4 | Left elbow, left shoulder, left hip |
| 5 | Right elbow, right shoulder, right hip |
| 6 | Left knee, neck, right knee |
| 7 | Left hip, left knee, left foot |
| 8 | Right hip, right knee, right foot |
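A joint angle can be computed from the 2D coordinates of the three joint points in each row. The sketch below assumes the angle is measured at the middle joint of the triple (for example, at the left elbow for angle 2); the source does not state this explicitly, and the function name and sample coordinates are hypothetical.

```python
import math


def joint_angle(a, b, c):
    """Angle in degrees at point b, formed by the segments b->a and b->c.

    a, b, c are (x, y) keypoint coordinates, e.g. shoulder, elbow, wrist.
    """
    v1 = (a[0] - b[0], a[1] - b[1])
    v2 = (c[0] - b[0], c[1] - b[1])
    n1 = math.hypot(*v1)
    n2 = math.hypot(*v2)
    if n1 == 0.0 or n2 == 0.0:
        raise ValueError("keypoints coincide; the angle is undefined")
    cos_theta = (v1[0] * v2[0] + v1[1] * v2[1]) / (n1 * n2)
    # Clamp to guard against rounding slightly outside [-1, 1].
    cos_theta = max(-1.0, min(1.0, cos_theta))
    return math.degrees(math.acos(cos_theta))


if __name__ == "__main__":
    # Illustrative pixel coordinates, not data from the paper.
    left_shoulder, left_elbow, left_wrist = (100, 100), (100, 160), (160, 160)
    print(joint_angle(left_shoulder, left_elbow, left_wrist))
```

Because the angle depends only on the directions of the two segments, it is unaffected by where the person stands in the frame or by their apparent size.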
Main configuration.
| Hardware platform | Software platform |
|---|---|
| CPU: Intel(R) Core(TM) i5-8300H, 2.30 GHz; GPU: NVIDIA GeForce GTX1050Ti, 4 GB; disk: Samsung MZVLB256HAHQ, 237 GB; memory: Samsung DDR4 2666 MHz, 8 GB | The software platform is built mainly in the Visual Studio 2015 environment. It uses the open-source computer vision library OpenCV, the Microsoft MFC interface library, the CUDA architecture, and the GPU acceleration library cuDNN. The system is developed in C++ and uses the C++ file stream classes ofstream and ifstream to process the files, data types, text streams, and other objects involved in the program. |
Figure 8: Flow chart of automatic motion attitude recognition.
Figure 9: Overall function of the system.
OpenPose configuration environment.
| Type | Parameter | Value |
|---|---|---|
| Hardware environment | Operating system | Windows 10 Home Edition |
| | CPU | Intel® Core i5-8300H |
| | Graphics card | NVIDIA GTX1050Ti |
| | Memory | Samsung 8 GB |
| Software environment | Integrated development environment | Visual Studio 2015, CMake |
| | Acceleration library | CUDA 8.0, cuDNN 5.5 |
Figure 10: Human skeleton model diagram.
Figure 11: Joint point coordinates.
Figure 12: Joint point coordinates.
Figure 13: The same action model diagram of three participants.
Figure 14: Comparison of joint angle data of three participants.
Figure 15: System interface.
Figure 16: (Standard) joint angle data.
Figure 17: Joint angle data similar to standard actions 1 and 2.
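The comparison against a standard action described in the abstract can be sketched as a Euclidean distance between joint angle vectors, with the smallest distance marking the attempt closest to the standard. The function names and all angle values below are illustrative assumptions, not data or code from the paper.

```python
import math


def angle_distance(candidate, standard):
    """Euclidean distance between two joint angle vectors (in degrees)."""
    if len(candidate) != len(standard):
        raise ValueError("both vectors must list the same joint angles")
    return math.sqrt(sum((c - s) ** 2 for c, s in zip(candidate, standard)))


def best_attempt(attempts, standard):
    """Return (index, distance) of the attempt closest to the standard action."""
    distances = [angle_distance(a, standard) for a in attempts]
    index = min(range(len(distances)), key=distances.__getitem__)
    return index, distances[index]


if __name__ == "__main__":
    # Eight joint angles per action, matching the eight angles selected above.
    standard = [90.0, 170.0, 170.0, 45.0, 45.0, 30.0, 175.0, 175.0]
    attempts = [
        [95.0, 160.0, 172.0, 50.0, 40.0, 35.0, 170.0, 178.0],
        [91.0, 169.0, 171.0, 46.0, 44.0, 31.0, 174.0, 176.0],
    ]
    print(best_attempt(attempts, standard))
```

A plain Euclidean distance weights every joint equally; whether the described system applies per-joint weights or normalization is not stated in the source.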
Experimental test video settings.
| Video name | Resolution (pixels) | Number of frames | Storage space (MB) |
|---|---|---|---|
| Walk | 1,280 × 720 | 300 | 644 |
| Swimming | 1,280 × 720 | 900 | 592 |
| Running | 1,920 × 1,080 | 600 | 411 |
| Sit-ups | 1,920 × 1,080 | 504 | 508 |
| Pull-up | 1,280 × 720 | 600 | 555 |
| Basketball | 1,920 × 1,080 | 600 | 623 |
| Skipping rope | 1,920 × 1,080 | 500 | 517 |
Figure 18: Experimental comparison results of correct identification quantity.