Human Activity Recognition Using Gaussian Mixture Hidden Conditional Random Fields.

Muhammad Hameed Siddiqi, Madallah Alruwaili, Amjad Ali, Saad Alanazi, Furkh Zeshan.

Abstract

In healthcare, analyzing patients' activities is an important source of information for delivering better services and managing illnesses. Most human activity recognition (HAR) systems rely heavily on the recognition module (stage). This work is motivated by the lack of improvement in the learning method used at that stage. In this study, we propose using hidden conditional random fields (HCRFs) for the human activity recognition problem. We contend that the existing HCRF model is limited by its independence assumptions, which may reduce classification accuracy. We therefore use a new algorithm that relaxes the assumption, allowing our model to use full-covariance distributions. We also show that our method has much lower computational complexity than the existing methods. For the experiments, we used four publicly available standard datasets. We used a 10-fold cross-validation scheme to train, assess, and compare the proposed model with the conditional learning method, the hidden Markov model (HMM), and the existing HCRF model, which can only use diagonal-covariance Gaussian distributions. The experiments show that the proposed model substantially improves classification accuracy, with p value ≤0.2.
Copyright © 2019 Muhammad Hameed Siddiqi et al.

Year:  2019        PMID: 31915429      PMCID: PMC6935449          DOI: 10.1155/2019/8590560

Source DB:  PubMed          Journal:  Comput Intell Neurosci


1. Introduction

In real-life environments, the analysis of human activities plays a significant role in many applications. These include vision-based human/object detection and recognition, such as tracking and detection [1, 2], as well as applications in computer engineering [3], physical sciences [4], health-related issues, natural sciences, and industrial and academic areas [5]. Most authors [6–11] recognized human activities in indoor environments using different methodologies. However, their systems assumed a stable environment, such as a fixed camera setting and preset lighting, and most of the activities were performed following instructions given by an instructor. Similarly, the authors of [10, 12–14] proposed different methods to recognize daily human activities in outdoor environments. However, most of the datasets they used have a static background, which is a common drawback of their systems. The authors of [15–17] used different sensors to classify indoor and outdoor human activities. In telemedicine and healthcare, human activity recognition (HAR) can be illustrated by the scenario of helping physically disabled persons. A patient with half of the body paralyzed by a stroke is unable to walk, and one way to recover is through daily exercises. Doctors normally recommend daily exercises (activities) to stroke patients to improve their health. A HAR system can be trained to correctly identify the activities performed by stroke patients, so that doctors can easily monitor the improvement in the patients' health. A typical HAR system has four modules: preprocessing (segmentation), feature extraction, feature selection, and recognition, as shown in Figure 1.
Most of the existing works [18–23] focused on feature extraction and selection; very little work has addressed the recognition module. Some studies exploited conventional techniques [24–28]. Among them, the HMM is one of the best candidates for activity recognition; however, the HMM is generative in nature and less precise than its counterpart, the HCRF model [29].
Figure 1

A typical human activity recognition (HAR) system.

This work is motivated by the lack of improvement in the learning method used at the recognition stage. We make the following contributions. The existing HCRF model is limited by independence assumptions, which may reduce classification accuracy. Therefore, the first objective of this study is to propose a recognition model with a new algorithm that relaxes the assumption, allowing the model to use full-covariance distributions. The second objective is to show that our method has much lower computational complexity than the existing methods. During training, the goal is to find the parameters that maximize the conditional probability of the training data. We use the limited-memory Broyden–Fletcher–Goldfarb–Shanno (L-BFGS) method to search for the optimal point. However, instead of repeating the forward and backward algorithms to compute the gradients as others did [30], we run the forward and backward algorithms only when calculating the conditional probability and then reuse the result to compute the gradients. As a result, the computation time is significantly reduced. A comprehensive set of experiments yielded a weighted average classification rate of 97%, an improvement over the state-of-the-art methods. The rest of the paper is organized as follows. Section 2 presents related works and their limitations. Section 3 presents the proposed recognition model and its advantages. Section 4 describes the experimental setup for the proposed model on four datasets. Section 5 presents a series of experiments based on this setup. Finally, Section 6 concludes the paper and outlines some future directions.

2. Related Works

In a typical HAR system, the preprocessing module uses segmentation methods to extract the human body from the activity frame, which helps to improve the performance of the activity recognition system. The authors of [31–36] used recent methods to segment the human body from video frames. Similarly, recent feature extraction methods have been employed to help the classifiers classify human activities accurately (see the workflow in Figure 1) [37–42]. They performed well on different datasets, and most of them achieved an average accuracy between 70% and 90%. For recognition, researchers have proposed diverse systems that use classifiers such as the Gaussian mixture model (GMM) [43, 44], artificial neural network (ANN) [45, 46], and support vector machine (SVM) [47–50]. These classifiers were principally employed for frame-based classification. In contrast, in many HAR systems [37, 51, 52], the well-known hidden Markov model (HMM) has been used extensively for sequence-based classification. For frame-level features, HMMs have an advantage over vector-based classifiers such as SVM, GMM, and ANN because they handle sequential data effectively. However, the Markov property of the traditional HMM assumes that the current state is a function of the previous state only. As a result, the labels of two adjacent states in the observation sequence are assumed to occur in succession. In practice, this assumption is often not satisfied. In addition, the generative nature of the HMM and the independence assumptions between observations and states limit its performance [29]. To overcome these limitations, the maximum entropy Markov model (MEMM) was proposed, which performs comparatively better than the HMM [53].
However, the MEMM suffers from the well-known “label bias problem”. Two generalizations of the MEMM, conditional random fields (CRFs) [29] and the HCRF [54], were developed to fix the label bias problem [29]. The HCRF extends the CRF with hidden states in order to learn the hidden structure of sequential data. In both models, per-state normalization is replaced with global normalization, which permits weighted scores and in turn results in larger parameter spaces than those of the HMM and MEMM. For example, in a HAR system based on CRFs, the observed frames of a video are represented by a feature vector U, the resultant label V, and the unknown state label K. Suppose the labeling problem has original labels K, image features U, and model parameter Λ; the posterior probability post(K | U; Λ) maximized by the CRF is then given as [equation missing in source], where the normalization factor is [equation missing in source]. Some issues in the HCRF implementation are reviewed and analyzed below. In an HCRF model, the CRF posterior probability in (1) is replaced by a sum of exponentials of the potential functions over all possible hidden labels L: [equation missing in source]. These equations guarantee that the conditional probability sums to one. Here V′ is a possible label for the sequence of frames; the hidden state sequence consists of states l_i, i = 1, 2, …, T, each taking a value from 1 to Q (the number of states); Λ is the parameter vector; and the feature vector determines which parameters are learned by the model. The choice of feature vector therefore determines what the HCRF model can express. For example, the following selections create a Markov chain-restricted HCRF with a Gaussian distribution at every state: [equations missing in source], where each v ∈ V is the expected label and every u ∈ U is an observation vector.
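For orientation, the general form of the HCRF posterior described above can be sketched as follows. This is the standard form from the HCRF literature written with this section's symbols (U, V, V′, L, Λ), not a recovery of the paper's numbered equations; the feature-vector name f and the normalizer name z are assumptions introduced here.

```latex
\mathrm{post}(V \mid U; \Lambda)
  = \frac{\sum_{L} \exp\big(\Lambda \cdot f(V, L, U)\big)}{z(U, \Lambda)},
\qquad
z(U, \Lambda) = \sum_{V'} \sum_{L} \exp\big(\Lambda \cdot f(V', L, U)\big)
```

Because the normalizer sums over every candidate label V′ and every hidden sequence L, the posterior sums to one over the labels, which is the global normalization mentioned in the text.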
The per-component square of the observation vector v at state t (i.e., v_t) is given as [equation missing in source]. It can be seen that, with a certain set of parameters Λ, the HCRF is similar to the hidden Markov model. For instance, with the abovementioned feature vector, if we choose [equations missing in source], where b in (6) is the prior distribution of a Gaussian HMM and C in (7) is the transition matrix, then the numerator of the conditional probability can be written as [equation missing in source]. In the above equation, N represents the Gaussian distribution. Equation (11) is the conditional probability of U given V, computed with a Gaussian HMM that has prior distribution b and transition matrix C. Moreover, the authors of [30] proposed a more general form of the HCRF model that handles complex distributions using a linear combination of Gaussian density functions: [equation missing in source]. In equation (12), M is the number of components in the Gaussian mixture. Many works based on the abovementioned HCRF have shown good performance [55, 56]; however, most of them did not consider the limitations of the model. It is clear from the aforementioned equations that the existing model employs diagonal-covariance Gaussian distributions, which means that the variables (the columns of u_i, i = 1, 2, …, N) are assumed to be pairwise independent. Furthermore, equations (8)–(10) suggest that, with a specific set of values, each state observation density converges to a Gaussian form. Unfortunately, no training method has yet been designed to guarantee this convergence, and these assumptions may decrease accuracy. Therefore, we propose an improved version of the HCRF that can explicitly employ full-covariance Gaussian mixtures in the feature function. The proposed model retains the benefits of the hidden conditional random field model while addressing the drawbacks of the previous method.
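The effect of the diagonal-covariance restriction can be illustrated with a small numeric sketch. It is not the paper's model: the function name `gaussian_logpdf_2d`, the 2-D setting, and all numbers are assumptions chosen for illustration. It compares the log-density of two points under a full covariance and under a diagonal covariance with the same variances.

```python
import math

def gaussian_logpdf_2d(x, mean, cov):
    """Log-density of a 2-D Gaussian with covariance ((a, b), (b, d))."""
    (a, b), (_, d) = cov
    det = a * d - b * b
    if det <= 0:
        raise ValueError("covariance must be positive definite")
    dx, dy = x[0] - mean[0], x[1] - mean[1]
    # Quadratic form (x - mean)^T cov^{-1} (x - mean), using the closed-form 2x2 inverse.
    quad = (d * dx * dx - 2.0 * b * dx * dy + a * dy * dy) / det
    return -0.5 * (2.0 * math.log(2.0 * math.pi) + math.log(det) + quad)

mean = (0.0, 0.0)
full_cov = ((1.0, 0.8), (0.8, 1.0))   # illustrative: strongly correlated features
diag_cov = ((1.0, 0.0), (0.0, 1.0))   # same variances, correlation dropped

on_axis = (1.0, 1.0)     # agrees with the positive correlation
off_axis = (1.0, -1.0)   # contradicts the positive correlation

full_on = gaussian_logpdf_2d(on_axis, mean, full_cov)
full_off = gaussian_logpdf_2d(off_axis, mean, full_cov)
diag_on = gaussian_logpdf_2d(on_axis, mean, diag_cov)
diag_off = gaussian_logpdf_2d(off_axis, mean, diag_cov)
```

In this toy setting the diagonal model scores both points identically, while the full-covariance model prefers the point that matches the correlation. That is the kind of information the independence assumption discards.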

3. Proposed Methodology

3.1. Feature Extraction

In our previous work, we used the symlet wavelet [37] to extract features from the activity frames. The symlet wavelet produces relatively better classification results for several reasons. These include its ability to extract prominent frequency information from the activity frames and its support for properties relevant to grayscale images, such as orthogonality, biorthogonality, and reverse biorthogonality. For a given support size, the symlet has the highest number of vanishing moments and the least asymmetry.

3.2. Proposed Hidden Conditional Random Fields (HCRFs) Model

As described earlier, the current Gaussian mixture HCRF model cannot use full-covariance distributions and does not guarantee the convergence of its parameters to values for which the conditional probability is modeled as a mixture of normal density functions. To address these limitations, we explicitly include a mixture of Gaussian distributions in the feature functions: [equations missing in source], where N represents the number of density functions, Γ captures the relevant information of the entire observations, D is the dimension of the observation, and Γ Obs is the mixing weight of the m-th component with mean μ and covariance matrix Σ. As indicated in equation (14), by changing parameters such as Γ, μ, and Σ, we can build a mixture of normal densities. The resulting conditional probability can then be written as [equations missing in source]. The forward and backward algorithms are used to calculate the conditional probability based on equations (19) and (20), which can be written as [equations missing in source]. To maximize the conditional probability of the training data, we first focus on calculating the parameters (Λ, Γ, μ, and Σ). In the proposed approach, the limited-memory Broyden–Fletcher–Goldfarb–Shanno (L-BFGS) method is used to search for the optimum point. Unlike other models [30], the forward and backward algorithms are run only to compute the conditional probability, and their results are reused to find the gradients. This significantly reduces the computation time. At the observation level, we incorporate the full-covariance matrix in the feature function, as shown in (16).
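As a reference point for the mixture described above, the textbook density of a full-covariance Gaussian mixture is sketched below. It is the standard definition, not the paper's equation (14); the symbols w_m (mixing weights) and M (number of components) are assumptions used here for readability, while D, μ, and Σ follow the text.

```latex
p(u) = \sum_{m=1}^{M} w_m \, \mathcal{N}(u; \mu_m, \Sigma_m),
\qquad \sum_{m=1}^{M} w_m = 1, \quad w_m \ge 0,

\mathcal{N}(u; \mu, \Sigma)
  = \frac{1}{(2\pi)^{D/2} \, |\Sigma|^{1/2}}
    \exp\!\Big( -\tfrac{1}{2} (u - \mu)^{\top} \Sigma^{-1} (u - \mu) \Big)
```

When Σ is diagonal, the exponent splits into a sum over the D components and the density factorizes into D univariate Gaussians. The off-diagonal entries of a full Σ are what allow correlated features to be modeled.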
Equation (17) may be used to obtain the normal distribution, which is further elaborated by five dScore gradient functions (the equations themselves are missing in the source): a gradient function for a variable of the prior probability vector; a gradient function for a variable of the transition probability vector; a gradient function for a Gaussian mixture weight variable, defined with the help of a function V(t); a gradient function for the mean of the Gaussian distributions; and a gradient function for the covariance of the Gaussian distributions. Equations (24)–(27) describe an analytical method for calculating the gradients of a feature function with respect to the prior probability vector, the transition probability vector, and the observation probability vector (including the means and covariances of the Gaussian distributions) obtained from the existing HCRF. In our model, the recognition of a variety of real-time activities is divided into two steps: a training step and an inference step. In the training step, data with known labels are input to train the hidden conditional random field model. In the inference step, the inputs to be estimated are classified using the parameters determined in the training phase. When an activity frame is input in the training step, the preprocessing step first reduces the lighting effects in order to detect and extract faces from the activity frames. Motion features are then extracted from the various facial parts to create the feature vector. The feature vector then serves as the input to the full-covariance Gaussian mixture hidden conditional random field model of the proposed recognition model. As mentioned earlier, the feature gradients are determined with the L-BFGS approach in the training phase of the HCRF model.
In the existing gradient calculation technique, however, the forward and backward algorithm is called repeatedly, which requires very high computational time and thus reduces the computational speed. We formulated an alternative analytical approach that reduces the number of calls to the forward and backward algorithm by using the five gradient functions given by equations (24)–(28). With this approach, real-time computation can be carried out at a higher speed, resulting in a large decrease in computational time compared with the known approach. The overall workflow of the proposed model is shown in Figure 2.
Figure 2

Workflow diagram of the proposed recognition model.
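The idea of running the forward and backward passes once and reusing their results for the gradients can be sketched on a plain linear-chain model. This is a simplified illustration, not the paper's equations (24)–(28): it covers only the prior and transition scores, and the names `forward_backward`, `gradients_from_cache`, and the toy log-scores are assumptions.

```python
import math

def logsumexp(values):
    """Numerically stable log(sum(exp(v) for v in values))."""
    top = max(values)
    return top + math.log(sum(math.exp(v - top) for v in values))

def forward_backward(prior, trans, obs):
    """Run the forward and backward passes once over log-scores.

    prior[q], trans[i][j], and obs[t][q] are hypothetical log-potentials.
    Returns (log_z, alpha, beta), where log_z is the log-partition function.
    """
    T, Q = len(obs), len(prior)
    alpha = [[0.0] * Q for _ in range(T)]
    beta = [[0.0] * Q for _ in range(T)]
    for q in range(Q):
        alpha[0][q] = prior[q] + obs[0][q]
    for t in range(1, T):
        for j in range(Q):
            alpha[t][j] = obs[t][j] + logsumexp(
                [alpha[t - 1][i] + trans[i][j] for i in range(Q)])
    for t in range(T - 2, -1, -1):
        for i in range(Q):
            beta[t][i] = logsumexp(
                [trans[i][j] + obs[t + 1][j] + beta[t + 1][j] for j in range(Q)])
    return logsumexp(alpha[-1]), alpha, beta

def gradients_from_cache(trans, obs, log_z, alpha, beta):
    """Gradients of log_z w.r.t. prior and transition scores, reusing alpha and beta."""
    T, Q = len(obs), len(trans)
    # d log_z / d prior[q] is the posterior probability of starting in state q.
    grad_prior = [math.exp(alpha[0][q] + beta[0][q] - log_z) for q in range(Q)]
    # d log_z / d trans[i][j] is the expected number of i -> j transitions.
    grad_trans = [[sum(math.exp(alpha[t - 1][i] + trans[i][j] + obs[t][j]
                                + beta[t][j] - log_z) for t in range(1, T))
                   for j in range(Q)] for i in range(Q)]
    return grad_prior, grad_trans

# Toy inputs: Q = 2 states, T = 3 frames (values are arbitrary).
prior = [0.1, -0.3]
trans = [[0.2, -0.1], [0.0, 0.4]]
obs = [[0.5, -0.2], [0.1, 0.3], [-0.4, 0.6]]

log_z, alpha, beta = forward_backward(prior, trans, obs)
grad_prior, grad_trans = gradients_from_cache(trans, obs, log_z, alpha, beta)
```

Here `gradients_from_cache` never calls the forward or backward recursion again. Every gradient entry is assembled from the cached `alpha`, `beta`, and `log_z`, which is the reuse pattern the section describes.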

4. Model Validation

4.1. Datasets Used

In this work, we used four open-source standard action datasets, namely the Weizmann action dataset [57], KTH action dataset [58], UCF sports dataset [59], and IXMAS action dataset [60], to validate the performance of the proposed HCRF model. The datasets are described below.

4.1.1. Weizmann Action Dataset

This dataset consists of 10 actions, such as bending, running, walking, skipping, place jumping, side movement, jumping forward, two-hand waving, and one-hand waving, performed by a total of 9 subjects. It comprises 90 video clips with an average of 15 frames per clip and a frame size of 144 × 180.

4.1.2. KTH Action Dataset

The KTH dataset comprises 25 subjects who performed 6 activities (running, walking, boxing, jogging, handclapping, and hand waving) in four distinct scenarios. A total of 2391 sequences were recorded with a static camera against a homogeneous background, with a frame size of 160 × 120.

4.1.3. UCF Sports Dataset

This dataset contains 182 videos, which were evaluated using an n-fold cross-validation rule. The videos were taken from different sports activities on broadcast television channels, and some of them have high intraclass similarities. This dataset was also collected using a static camera. It covers 9 activities: running, diving, lifting, golf swinging, skating, kicking, walking, horseback riding, and baseball swinging. Each frame has a size of 720 × 480.

4.1.4. IXMAS Action Dataset

The IXMAS (INRIA Xmas motion acquisition sequences) dataset comprises 13 activity classes, each performed 3 times by 11 actors. Every actor chose a free orientation and position. The dataset provides annotated silhouettes for each person. For our experiments, we selected only 8 action classes: walk, cross arms, punch, turn around, sit down, wave, get up, and kick. IXMAS is a multiview dataset for view-invariant human activity recognition, and each frame has a size of 390 × 291. The dataset contains major occlusion that may cause misclassification; therefore, we used global histogram equalization [61] to resolve the occlusion issue.

4.2. Setup

For a comprehensive validation, we carried out the following set of experiments in Matlab. The first experiment was conducted on each dataset separately to show the performance of the proposed model. In this experiment, we used a 10-fold cross-validation rule, which means that data from 9 subjects were used for training, while the data from one subject were used for testing. The procedure was repeated 10 times so that each subject's data were used for both training and testing. The second experiment was conducted on all four datasets without the proposed recognition model, to show the importance of the developed model. For this purpose, we used existing well-known classifiers, namely SVM, ANN, HMM, and the existing HCRF [30], as the recognition model instead of the proposed HCRF model. The third experiment compared the performance of the proposed approach with the state-of-the-art methods. In the last experiment, the computational complexity of the proposed HCRF model was compared with that of the forward/backward algorithms.
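The subject-wise splitting described above can be sketched as follows. This is a minimal illustration rather than the authors' Matlab code; the function name `subject_folds` and the toy data (10 subjects with 3 clips each) are assumptions.

```python
def subject_folds(subject_ids, n_folds):
    """Yield (train_idx, test_idx) pairs; each subject's samples stay in one fold."""
    subjects = sorted(set(subject_ids))
    for k in range(n_folds):
        held_out = set(subjects[k::n_folds])   # subjects assigned to fold k
        test = [i for i, s in enumerate(subject_ids) if s in held_out]
        train = [i for i, s in enumerate(subject_ids) if s not in held_out]
        yield train, test

# Hypothetical toy data: subject label of each clip, 10 subjects x 3 clips.
subject_ids = [s for s in range(10) for _ in range(3)]
folds = list(subject_folds(subject_ids, n_folds=10))
```

With as many folds as subjects, each fold holds out exactly one subject, so no subject contributes clips to both the training and the test side of the same fold. If `n_folds` exceeds the number of subjects, some folds would have an empty test set, so the caller should keep `n_folds` at or below the subject count.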

5. Results and Discussion

5.1. First Experiment

As described before, this experiment validates the performance of the proposed recognition model on each dataset individually. The overall results are shown in Table 1 (Weizmann dataset), Table 2 (KTH dataset), Table 3 (UCF sports dataset), and Table 4 (IXMAS dataset).
Table 1

Confusion matrix of the proposed recognition model using Weizmann action dataset (unit: %).

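Activities | Bend | Jack | Pjump | Run | Side | Skip | Walk | Wave 1 | Wave 2
Bend | 98 | 0 | 1 | 0 | 0 | 1 | 0 | 0 | 0
Jack | 1 | 96 | 0 | 0 | 2 | 0 | 1 | 0 | 0
Pjump | 0 | 1 | 97 | 0 | 1 | 0 | 0 | 1 | 0
Run | 0 | 0 | 0 | 99 | 0 | 0 | 0 | 1 | 0
Side | 1 | 2 | 0 | 0 | 95 | 0 | 0 | 2 | 0
Skip | 0 | 0 | 0 | 0 | 0 | 100 | 0 | 0 | 0
Walk | 1 | 0 | 0 | 1 | 0 | 1 | 96 | 0 | 1
Wave 1 | 0 | 1 | 1 | 0 | 0 | 0 | 0 | 98 | 0
Wave 2 | 0 | 1 | 0 | 0 | 1 | 1 | 1 | 1 | 95
Average | 97.11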
Table 2

Confusion matrix of the proposed recognition model using KTH action dataset (unit: %).

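Activities | Walking | Jogging | Running | Boxing | Hand-wave | Handclap
Walking | 100 | 0 | 0 | 0 | 0 | 0
Jogging | 0 | 98 | 1 | 1 | 0 | 0
Running | 2 | 1 | 95 | 1 | 1 | 0
Boxing | 0 | 2 | 1 | 97 | 0 | 0
Hand-wave | 1 | 0 | 1 | 0 | 98 | 0
Handclap | 0 | 0 | 1 | 0 | 0 | 99
Average | 97.83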
Table 3

Confusion matrix of the proposed recognition model using UCF sports dataset (unit: %).

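Activities | Diving | GS | Kicking | Lifting | HBR | Run | Skating | BS | Walk
Diving | 95 | 1 | 2 | 1 | 0 | 1 | 0 | 0 | 0
GS | 1 | 94 | 0 | 0 | 2 | 1 | 1 | 1 | 0
Kicking | 0 | 2 | 98 | 0 | 0 | 0 | 0 | 0 | 0
Lifting | 1 | 1 | 1 | 94 | 1 | 0 | 0 | 1 | 1
HBR | 0 | 0 | 2 | 0 | 96 | 1 | 0 | 1 | 0
Running | 0 | 3 | 0 | 0 | 0 | 97 | 0 | 0 | 0
Skating | 1 | 0 | 1 | 1 | 0 | 1 | 95 | 0 | 1
BS | 0 | 0 | 0 | 1 | 0 | 1 | 0 | 97 | 1
Walking | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 100
Average | 96.22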

GS: golf swinging, HBR: horseback riding, and BS: baseball swinging.

Table 4

Confusion matrix of the proposed recognition model using IXMAS action dataset (unit: %).

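Activities | CA | SD | GU | TA | Walk | Wave | Punch | Kick
CA | 97 | 0 | 0 | 1 | 2 | 0 | 0 | 0
SD | 0 | 99 | 1 | 0 | 0 | 0 | 0 | 0
GU | 1 | 2 | 94 | 3 | 0 | 0 | 0 | 0
TA | 0 | 0 | 1 | 95 | 2 | 1 | 1 | 0
Walk | 0 | 1 | 1 | 0 | 98 | 0 | 0 | 0
Wave | 0 | 0 | 2 | 0 | 1 | 97 | 0 | 0
Punch | 0 | 1 | 0 | 1 | 0 | 2 | 96 | 0
Kick | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 99
Average | 96.88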

CA: cross arm, SD: sit down, GU: get up, and TA: turn around.

As can be seen from Tables 1–4, the proposed recognition model consistently obtained high recognition rates on each dataset. This result shows the robustness of the proposed model: it performed well not only on one dataset but across multiple spontaneous datasets.
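The "Average" row of each confusion matrix is the mean of its diagonal entries. The short sketch below checks this for the KTH matrix, using the values reported in Table 2; the function name `average_recognition_rate` is an assumption.

```python
# Rows: true activity; columns: predicted activity (percent), as reported in Table 2.
# Order: Walking, Jogging, Running, Boxing, Hand-wave, Handclap.
kth_confusion = [
    [100, 0, 0, 0, 0, 0],
    [0, 98, 1, 1, 0, 0],
    [2, 1, 95, 1, 1, 0],
    [0, 2, 1, 97, 0, 0],
    [1, 0, 1, 0, 98, 0],
    [0, 0, 1, 0, 0, 99],
]

def average_recognition_rate(matrix):
    """Mean of the diagonal of a row-normalized (percent) confusion matrix."""
    diagonal = [matrix[i][i] for i in range(len(matrix))]
    return sum(diagonal) / len(diagonal)

rate = average_recognition_rate(kth_confusion)
print(round(rate, 2))  # 97.83
```

This is an unweighted mean over classes, so every activity counts equally regardless of how many clips it has.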

5.2. Second Experiment

As described before, the second experiment was conducted on all four datasets without the proposed recognition model, to show the importance of the proposed model. For this purpose, we used existing well-known classifiers, namely SVM, ANN, HMM, and the existing HCRF [30], as the recognition model instead of the proposed HCRF model. Tables 5–8 show that when the proposed HCRF model was replaced with ANN, SVM, HMM, or the existing HCRF [30], the system failed to achieve comparably high recognition rates. The better performance of the proposed HCRF model is visible in Tables 1–4, which show that the proposed HCRF model effectively fixes the drawbacks of the HMM and the existing HCRF, which have been used extensively for sequential HAR.
Table 5

Classification results of the proposed system on Weizmann action dataset (A) using ANN, (B) using SVM, (C) using HMM, and (D) using existing HCRF [30], while removing the proposed HCRF model (unit: %).

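Activities | Bend | Jack | Pjump | Run | Side | Skip | Walk | Wave 1 | Wave 2
(A)
Bend | 70 | 4 | 5 | 3 | 3 | 2 | 5 | 5 | 3
Jack | 4 | 68 | 3 | 6 | 7 | 2 | 3 | 3 | 4
Pjump | 2 | 4 | 75 | 6 | 3 | 2 | 2 | 2 | 4
Run | 4 | 2 | 3 | 72 | 6 | 2 | 5 | 3 | 3
Side | 5 | 3 | 5 | 4 | 65 | 6 | 4 | 6 | 2
Skip | 4 | 6 | 4 | 3 | 5 | 67 | 3 | 2 | 6
Walk | 4 | 2 | 4 | 7 | 3 | 4 | 70 | 3 | 3
Wave 1 | 2 | 1 | 3 | 3 | 4 | 5 | 7 | 71 | 4
Wave 2 | 2 | 5 | 3 | 6 | 4 | 4 | 3 | 5 | 68
Average | 69.55
(B)
Bend | 69 | 3 | 4 | 4 | 6 | 4 | 2 | 3 | 5
Jack | 2 | 72 | 2 | 3 | 4 | 3 | 5 | 4 | 5
Pjump | 1 | 4 | 75 | 2 | 4 | 5 | 4 | 2 | 3
Run | 2 | 4 | 3 | 78 | 2 | 2 | 4 | 2 | 3
Side | 2 | 4 | 5 | 3 | 70 | 4 | 3 | 5 | 4
Skip | 2 | 1 | 3 | 2 | 4 | 80 | 3 | 3 | 2
Walk | 2 | 0 | 3 | 4 | 3 | 2 | 82 | 1 | 3
Wave 1 | 2 | 2 | 3 | 4 | 3 | 2 | 3 | 77 | 4
Wave 2 | 1 | 2 | 1 | 2 | 3 | 1 | 3 | 4 | 83
Average | 76.22
(C)
Bend | 82 | 3 | 0 | 2 | 2 | 3 | 1 | 5 | 2
Jack | 3 | 80 | 1 | 2 | 3 | 2 | 3 | 4 | 2
Pjump | 3 | 4 | 85 | 3 | 0 | 0 | 1 | 2 | 2
Run | 5 | 4 | 2 | 79 | 0 | 2 | 1 | 3 | 4
Side | 0 | 1 | 5 | 4 | 81 | 3 | 1 | 2 | 3
Skip | 3 | 1 | 2 | 2 | 3 | 88 | 0 | 0 | 1
Walk | 0 | 2 | 3 | 2 | 1 | 2 | 83 | 3 | 4
Wave 1 | 1 | 3 | 2 | 2 | 4 | 2 | 3 | 78 | 5
Wave 2 | 1 | 2 | 2 | 2 | 2 | 3 | 1 | 0 | 87
Average | 82.56
(D)
Bend | 80 | 2 | 3 | 1 | 4 | 0 | 5 | 2 | 3
Jack | 1 | 88 | 0 | 2 | 0 | 3 | 2 | 3 | 1
Pjump | 0 | 2 | 90 | 1 | 0 | 3 | 0 | 2 | 2
Run | 2 | 1 | 2 | 85 | 2 | 3 | 0 | 0 | 5
Side | 4 | 1 | 2 | 3 | 80 | 4 | 1 | 2 | 3
Skip | 1 | 4 | 0 | 5 | 1 | 84 | 0 | 3 | 2
Walk | 2 | 1 | 0 | 0 | 1 | 2 | 89 | 2 | 3
Wave 1 | 3 | 0 | 1 | 2 | 0 | 2 | 0 | 91 | 1
Wave 2 | 4 | 1 | 3 | 0 | 2 | 3 | 0 | 2 | 85
Average | 85.78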
Table 6

Classification results of the proposed system on KTH action dataset (A) using ANN, (B) using SVM, (C) using HMM, and (D) using existing HCRF [30], while removing the proposed HCRF model (unit: %).

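Activities | Walking | Jogging | Running | Boxing | Hand-wave | Handclap
(A)
Walking | 79 | 5 | 6 | 4 | 3 | 3
Jogging | 3 | 81 | 5 | 3 | 4 | 4
Running | 6 | 4 | 77 | 5 | 5 | 3
Boxing | 6 | 7 | 6 | 69 | 5 | 7
Hand-wave | 4 | 7 | 5 | 5 | 73 | 6
Handclap | 4 | 6 | 5 | 4 | 6 | 75
Average | 75.66
(B)
Walking | 82 | 2 | 3 | 5 | 4 | 4
Jogging | 3 | 86 | 2 | 3 | 2 | 4
Running | 5 | 3 | 80 | 4 | 5 | 3
Boxing | 5 | 3 | 3 | 79 | 4 | 6
Hand-wave | 1 | 4 | 3 | 3 | 89 | 0
Handclap | 3 | 5 | 2 | 4 | 3 | 83
Average | 83.17
(C)
Walking | 86 | 3 | 2 | 4 | 2 | 3
Jogging | 0 | 88 | 3 | 2 | 4 | 3
Running | 0 | 3 | 90 | 0 | 4 | 3
Boxing | 3 | 0 | 4 | 92 | 1 | 0
Hand-wave | 1 | 3 | 2 | 2 | 91 | 1
Handclap | 1 | 3 | 4 | 1 | 2 | 89
Average | 89.33
(D)
Walking | 90 | 3 | 0 | 3 | 4 | 0
Jogging | 2 | 88 | 2 | 3 | 3 | 2
Running | 4 | 2 | 92 | 0 | 0 | 2
Boxing | 1 | 3 | 2 | 91 | 3 | 0
Hand-wave | 0 | 1 | 3 | 2 | 93 | 1
Handclap | 1 | 3 | 2 | 4 | 3 | 87
Average | 90.17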
Table 7

Classification results of the proposed system on UCF sports dataset (A) using ANN, (B) using SVM, (C) using HMM, and (D) using existing HCRF [30], while removing the proposed HCRF model (unit: %).

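Activities | Diving | GS | Kicking | Lifting | HBR | Run | Skating | BS | Walk
(A)
Diving | 68 | 4 | 2 | 5 | 6 | 6 | 4 | 3 | 2
GS | 2 | 71 | 2 | 4 | 5 | 4 | 6 | 3 | 3
Kicking | 3 | 4 | 70 | 3 | 5 | 4 | 2 | 3 | 6
Lifting | 5 | 4 | 3 | 65 | 5 | 6 | 4 | 6 | 2
HBR | 3 | 4 | 6 | 3 | 66 | 4 | 5 | 5 | 4
Running | 3 | 3 | 5 | 4 | 6 | 64 | 6 | 4 | 5
Skating | 2 | 5 | 4 | 5 | 3 | 4 | 69 | 3 | 5
BS | 4 | 2 | 5 | 3 | 4 | 6 | 5 | 67 | 4
Walking | 5 | 4 | 2 | 3 | 4 | 3 | 6 | 3 | 70
Average | 67.78
(B)
Diving | 71 | 4 | 2 | 3 | 5 | 6 | 3 | 2 | 4
GS | 3 | 77 | 2 | 4 | 3 | 2 | 5 | 2 | 2
Kicking | 4 | 2 | 74 | 4 | 5 | 3 | 2 | 3 | 3
Lifting | 5 | 6 | 3 | 69 | 4 | 3 | 5 | 3 | 2
HBR | 2 | 3 | 3 | 2 | 80 | 2 | 4 | 2 | 2
Running | 2 | 3 | 2 | 2 | 5 | 75 | 6 | 2 | 3
Skating | 2 | 1 | 2 | 3 | 4 | 4 | 78 | 2 | 4
BS | 3 | 4 | 6 | 3 | 4 | 2 | 3 | 70 | 5
Walking | 4 | 1 | 2 | 4 | 2 | 3 | 0 | 3 | 81
Average | 75.00
(C)
Diving | 79 | 3 | 2 | 2 | 3 | 4 | 3 | 2 | 2
GS | 0 | 83 | 2 | 4 | 3 | 2 | 1 | 3 | 2
Kicking | 1 | 2 | 85 | 1 | 3 | 3 | 2 | 3 | 0
Lifting | 3 | 0 | 2 | 82 | 3 | 2 | 4 | 2 | 2
HBR | 0 | 2 | 2 | 4 | 80 | 0 | 5 | 3 | 4
Running | 1 | 2 | 1 | 3 | 4 | 84 | 2 | 1 | 2
Skating | 2 | 0 | 3 | 4 | 0 | 1 | 86 | 3 | 1
BS | 1 | 1 | 1 | 2 | 0 | 3 | 0 | 88 | 4
Walking | 1 | 2 | 4 | 2 | 5 | 2 | 4 | 3 | 77
Average | 82.67
(D)
Diving | 90 | 3 | 0 | 1 | 0 | 2 | 2 | 1 | 1
GS | 3 | 84 | 2 | 1 | 3 | 1 | 3 | 2 | 1
Kicking | 3 | 4 | 85 | 0 | 0 | 2 | 3 | 1 | 2
Lifting | 1 | 2 | 1 | 89 | 1 | 1 | 1 | 2 | 2
HBR | 0 | 2 | 1 | 0 | 91 | 2 | 3 | 0 | 1
Running | 2 | 3 | 1 | 2 | 3 | 80 | 4 | 2 | 3
Skating | 2 | 4 | 1 | 2 | 3 | 0 | 84 | 4 | 0
BS | 2 | 1 | 1 | 2 | 1 | 0 | 3 | 88 | 2
Walking | 0 | 2 | 1 | 1 | 0 | 1 | 4 | 0 | 91
Average | 86.89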

GS: golf swinging, HBR: horseback riding, BS: baseball swinging.

Table 8

Classification results of the proposed system on IXMAS action dataset (A) using ANN, (B) using SVM, (C) using HMM, and (D) using existing HCRF [30], while removing the proposed HCRF model (unit: %).

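Activities | CA | SD | GU | TA | Walk | Wave | Punch | Kick
(A)
CA | 65 | 5 | 7 | 6 | 5 | 4 | 3 | 5
SD | 5 | 72 | 4 | 3 | 5 | 4 | 3 | 4
GU | 4 | 3 | 75 | 5 | 3 | 2 | 4 | 4
TA | 6 | 7 | 4 | 68 | 3 | 5 | 4 | 3
Walk | 3 | 4 | 5 | 4 | 70 | 6 | 4 | 4
Wave | 4 | 6 | 5 | 3 | 4 | 71 | 3 | 4
Punch | 3 | 5 | 5 | 6 | 7 | 3 | 67 | 4
Kick | 4 | 5 | 4 | 6 | 5 | 4 | 3 | 69
Average | 69.62
(B)
CA | 77 | 3 | 4 | 2 | 2 | 3 | 5 | 4
SD | 3 | 79 | 3 | 2 | 4 | 5 | 2 | 2
GU | 5 | 6 | 69 | 3 | 4 | 5 | 4 | 4
TA | 2 | 3 | 2 | 80 | 4 | 4 | 3 | 2
Walk | 3 | 5 | 4 | 2 | 71 | 5 | 6 | 4
Wave | 2 | 6 | 3 | 5 | 4 | 73 | 4 | 3
Punch | 1 | 5 | 3 | 4 | 1 | 3 | 81 | 2
Kick | 3 | 6 | 7 | 4 | 3 | 5 | 4 | 68
Average | 74.75
(C)
CA | 79 | 3 | 4 | 1 | 2 | 4 | 3 | 4
SD | 1 | 84 | 3 | 2 | 3 | 1 | 4 | 2
GU | 0 | 1 | 88 | 1 | 2 | 3 | 2 | 3
TA | 5 | 2 | 3 | 79 | 2 | 3 | 2 | 4
Walk | 1 | 0 | 3 | 1 | 90 | 2 | 3 | 0
Wave | 2 | 3 | 1 | 0 | 3 | 86 | 2 | 3
Punch | 1 | 0 | 2 | 3 | 0 | 4 | 89 | 1
Kick | 3 | 2 | 4 | 1 | 0 | 2 | 4 | 84
Average | 84.77
(D)
CA | 90 | 1 | 2 | 0 | 3 | 4 | 0 | 0
SD | 3 | 85 | 2 | 1 | 3 | 3 | 2 | 1
GU | 0 | 1 | 91 | 1 | 0 | 2 | 3 | 2
TA | 1 | 3 | 2 | 87 | 1 | 2 | 2 | 2
Walk | 1 | 0 | 3 | 1 | 89 | 2 | 1 | 3
Wave | 0 | 2 | 1 | 0 | 2 | 90 | 3 | 1
Punch | 1 | 4 | 2 | 1 | 2 | 3 | 84 | 3
Kick | 1 | 2 | 4 | 1 | 3 | 2 | 4 | 83
Average | 87.38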

CA: cross arm, SD: sit down, GU: get up, TA: turn around.

5.3. Third Experiment

In this experiment, we compared the proposed model with the state-of-the-art methods. All of these approaches were implemented following the instructions provided in their respective articles. A 10-fold cross-validation rule was applied to each dataset, as explained in Section 4. The average classification results of the existing methods and the proposed method across the datasets are summarized in Table 9.
Table 9

Weighted average recognition rates of the proposed method with the existing state-of-the-art methods (unit: %).

State-of-the-art works | Average classification rates | Standard deviation
GMM | 63.3 | ±2.7
SVM | 67.5 | ±4.4
HMM | 82.8 | ±3.8
Embedded HMM | 85.9 | ±1.9
[62] | 92.1 | ±3.2
[63] | 84.3 | ±4.9
[18] | 93.6 | ±2.7
[19] | 93.0 | ±1.6
[22] | 92.7 | ±2.5
[64] | 80.1 | ±3.2
Proposed method | 97.2 | ±2.8
It is clear from Table 9 that the proposed method achieved a higher average classification rate than the existing state-of-the-art methods. The proposed method therefore recognizes human activities accurately and robustly using different video data.

5.4. Fourth Experiment

In this experiment, we present the computational complexity analysis, which is also one of the contributions of this paper. Implementations of the previous HCRF available in the literature calculate the gradients by repeating the forward and backward algorithms, whereas the proposed HCRF model executes them only once and caches the results for later use. From (21) and (22), it is clear that the forward or backward algorithm has a complexity of O(TQ²M), where T is the input sequence length, Q is the number of states, and M is the number of mixtures. The proposed HCRF model, however, requires a complexity of only O(TM) to calculate the gradients, as can be seen from (22)–(29). Figure 3 compares the execution time when the gradients are computed by the forward (or backward) algorithm and by the proposed method. The computational time was measured in Matlab R2013a on an Intel® Pentium® Core™ i7-6700 (3.4 GHz) with 16 GB of RAM.
Figure 3

An illustration of gradient computational time (equation (30)) of the previous forward and backward algorithms and the proposed HCRF model. (a) Q = 1–5, M = 5, T = 90 and (b) Q = 5, M = 1–5, T = 90.
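To give a feel for the two complexity terms quoted in this section, the sketch below evaluates the leading-order counts T·Q²·M and T·M for the settings of Figure 3(a). It ignores constant factors and lower-order terms, so it indicates scaling only and does not predict measured run time; the function names are assumptions.

```python
def forward_backward_ops(T, Q, M):
    """Leading-order count for one forward (or backward) pass: T * Q^2 * M."""
    return T * Q * Q * M

def cached_gradient_ops(T, M):
    """Leading-order count quoted for the gradient step with cached results: T * M."""
    return T * M

# Settings of Figure 3(a): Q = 1..5, M = 5, T = 90.
T, M = 90, 5
ratios = {Q: forward_backward_ops(T, Q, M) / cached_gradient_ops(T, M)
          for Q in range(1, 6)}
```

Under these assumptions the ratio between the two counts is Q², so the gap widens as the number of states grows and does not depend on T or M.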

6. Conclusion

In healthcare and telemedicine, human activity recognition (HAR) can be illustrated by the scenario of helping physically disabled persons. A patient with half of the body affected by paralysis is completely unable to perform daily exercises. Doctors recommend specific activities to improve such patients' health. For this purpose, doctors need a HAR system through which they can monitor the patients' daily routines (activities) on a regular basis. The accuracy of most HAR systems depends on the recognition module. For the feature extraction and selection modules, we used existing well-known methods, while for the recognition module, we proposed an HCRF model that can approximate a complex distribution using a mixture of Gaussian density functions. The proposed model was assessed on four publicly available standard action datasets. Our experiments show that the proposed model with full-covariance Gaussian density functions significantly improves accuracy over the existing state-of-the-art methods. Furthermore, we showed that this improvement is significant from a statistical point of view, with a p value ≤0.2 for the comparison. Similarly, the complexity analysis shows that the proposed computational method strongly decreases the execution time of the hidden conditional random field model. The ultimate goal of this study is to deploy the proposed model on smartphones. Currently, the proposed model uses a full-covariance matrix, which might be time consuming, especially on smartphones. Using a lightweight classifier such as K-nearest neighbor (K-NN) could be one possible solution, but K-NN is very sensitive to environmental factors such as noise.
Therefore, in future work, we will investigate how to reduce the computation time while sustaining the same recognition rate when the model is deployed on smartphones in real environments.