
Compressed feature vector-based effective object recognition model in detection of COVID-19.

Chao Chen1, Jinhong Mao2, Xinzhi Liu2, Yi Tan1, Ghada M Abaido3, Hamdy Alsayed4.   

Abstract

To better understand the structure of COVID-19 and to improve recognition speed, an effective recognition model based on compressed feature vectors is proposed. Object recognition plays an important role in the computer vision area. To improve recognition accuracy, most recent approaches adopt sets of complicated hand-crafted feature vectors and build complex classifiers. Although such approaches achieve favourable recognition accuracy, they are inefficient. To raise recognition speed without accuracy loss, this paper proposes an efficient recognition model trained with a kind of compressed feature vector. Firstly, we propose a compressed feature vector based on the theory of compressive sensing. A sparse matrix is adopted to compress feature vectors from very high dimensions to very low dimensions, which reduces computational complexity while preserving enough information for model training and prediction. Moreover, to improve inference efficiency during the classification stage, an efficient recognition model is built by a novel optimization approach that reduces the support vectors of the kernel support vector machine (kernel SVM). The SVM model is established with whether the subject is infected with COVID-19 as the dependent variable, and age, gender, nationality, and other factors as independent variables. The proposed approach iteratively builds a compact set of support vectors from the original kernel SVM, and the newly generated model achieves recognition accuracy approximating that of the original kernel SVM. Additionally, with the reduction of support vectors, the recognition time of the new model is greatly improved. Finally, COVID-19 patients have specific epidemiological characteristics, and the SVM recognition model has strong fitting ability.
Extensive experimental results on two datasets show that the proposed object recognition model achieves favourable performance not only in recognition accuracy but also in recognition speed.
© 2022 Elsevier B.V. All rights reserved.


Year:  2021        PMID: 34975183      PMCID: PMC8710134          DOI: 10.1016/j.patrec.2021.12.016

Source DB:  PubMed          Journal:  Pattern Recognit Lett        ISSN: 0167-8655            Impact factor:   3.756


Introduction

Since the outbreak of COVID-19, the number of infections worldwide has been increasing every day. The main symptoms are respiratory: cough, fever, shortness of breath, difficulty breathing, etc. Severe cases develop pneumonia, acute respiratory syndrome, kidney failure, and even death. There is no specific treatment for it. The manifestations of COVID-19 patients are still not fully understood, and the virus is mutating, making research on COVID-19 urgent. Object recognition plays an important role in the field of computer vision and various multimedia applications. The task of object recognition is to determine whether a certain type of object exists in a given image. Due to problems of blur, deformation, partial occlusion, illumination change, cluttered background, etc., object recognition is a challenging task. To address such problems, high-dimensional hand-crafted features and classification models with high computational complexity are adopted to ensure recognition accuracy. However, high-dimensional hand-crafted features and complicated classification models make prediction inefficient. In practical use, we have demands not only on recognition accuracy but also on recognition speed, because fast and accurate object recognition can provide effective support for other computer vision tasks, such as object detection and tracking, and provides a powerful basis for a system's decision-making. Moreover, with the growth of image data and the rising demand for system intelligence, developing high-precision, real-time and well-adapted approaches has become a trend. Traditional virus classification tests take into account morphology, serology, host range, proteins, and physical and chemical characteristics, such as sensitivity to organic solvents, cell culture, structure, and molecular weight.
With the development of computer algorithms, intelligent systems are widely used in the identification of chemical compositions and complex biological macromolecules because of their sensitivity and accuracy. Some researchers have used relevant algorithms to conduct systematic cluster analysis to identify viruses; the clustering of similar strains has important implications for identifying virus strains. This paper deals with the practical problems of object recognition through feature compression and classification-model optimization, and the model assists in the identification and classification of COVID-19. Firstly, from the aspect of feature compression, traditional approaches such as Principal Component Analysis (PCA) [1] and Singular Value Decomposition (SVD) [2] always involve a large number of matrix decomposition operations and are therefore inefficient. To improve compression efficiency, a feature compression algorithm based on compressive sensing is proposed. Through this algorithm, a sparse random matrix is adopted to map high-dimensional features into the low-dimensional feature space, and the mapping involves only matrix multiplication. Secondly, although the commonly used classification model, the kernel SVM [3], has good generalization capability, its prediction cost increases as the number of support vectors increases. Therefore, a support-vector reduction algorithm is proposed to optimize the classification model, which reconstructs a simplified subset of the original support vectors through cyclic iterations. The optimized classification model achieves generalization capability similar to the original model but spends less time on prediction. Fig. 1 shows the workflow of the proposed method, which is divided into the training stage and the testing stage.
In the training stage, high-dimensional feature vectors are first extracted from the COVID-19 image, and then a very sparse matrix is constructed to map them into the low-dimensional domain. Lastly, a kernel SVM classifier is trained on the compressed features, and the kernel SVM is optimized by the support-vector reduction method. In the testing stage, we adopt the same method to compress the features and feed the low-dimensional features into the optimized kernel SVM for prediction.
Fig. 1

Workflow of the proposed method.

The rest of this paper is organized as follows: In Section 2, we review related works. In Section 3, the basic theories and analysis regarding feature compression and model optimization are introduced. In Section 4, experimental results and analysis are presented. In Section 5, we conclude this paper and propose future work.

Related works

After a virus infects the body, the body recognizes the virus, initiates an innate immune response, produces an antiviral response, secretes a large number of inflammatory cytokines, and mediates the occurrence of inflammation. The body resists viral infections by clearing the infecting virus; for example, type I interferon can induce the expression of the "Mx" gene, which hinders the initial transcription process of the PB2 polymerase. Joint feature and compressed dictionary learning has had few applications in the detection of viruses. Some scholars have applied this learning algorithm to classify cars and found that it can improve the accuracy of car-model recognition. In this section, we briefly review recent studies [4], [5], [6], [7], [8], [9], [10], [11], [12], [13] in the literature on object recognition. Recent object recognition approaches focus on extracting discriminative features and can be divided into two kinds of frameworks: deep learning [4] and bag-of-words [5]. Although deep learning approaches have achieved excellent performance in object recognition, they always involve higher computational resources and larger training sets. Classical deep models such as AlexNet, GoogleNet and ResNet were designed for large-scale image classification tasks, adopting ImageNet, a very large image dataset containing about 14 million images, for training and testing [6]. However, deep models are not suitable for datasets with few samples. Moreover, deep models are usually trained on multiple GPUs, which is time-consuming on CPU-only platforms [7]. Therefore, for situations with few samples and limited computing resources, the bag-of-words framework is well suited. In the bag-of-words framework, image features are always extracted with a dense sampling scheme, which adopts a fixed size and scale to extract a large number of local descriptors from the image.
The extracted local feature descriptors have high redundancy and can be encoded into a feature vector, which is then fed into a classifier for classification. For example, Xu et al. [8] designed the HOG (Histogram of Oriented Gradients) feature and trained a linear SVM, which achieved great success in vehicle recognition. Ma et al. [9] adopted GMM (Gaussian Mixture Model) and LCC (Local Coordinate Coding) to reduce the redundant information of densely sampled SIFT (Scale-Invariant Feature Transform) descriptors and trained a linear classifier for object recognition. Zhuang et al. [10] fused multiple kinds of local features with vector-quantization coding and adopted spatial pyramid matching to generate high-dimensional feature vectors, which were then trained by a kernel SVM for recognition. Owing to the rich and hierarchical information of the images, this approach performed well on object recognition. Clearly, the richer the information in the feature vector, the better the classifier performs. Therefore, in order to improve recognition performance, multiple kinds of features were fused, but this increases computational cost. Dr. Lowery from the University of Ars in Northern Ireland developed a "DNA fingerprinting" system that can detect a variety of viruses, including smallpox, within 15 years. Jorge et al. [11] proposed an approach that adopts the Fisher vector to encode the feature vector and then compresses the high-dimensional feature vector by a product quantization algorithm; with the compressed feature vectors, the trained classifier performed well on object recognition. Feature compression aims to save as much information of the original feature vectors as possible, which reduces model training time and storage space. PCA and SVD are the two main feature compression methods in the field of object recognition.
PCA maps feature vectors from the high-dimensional space to a low-dimensional space by orthogonal linear projection [12], while SVD adopts the singular value decomposition of a matrix to extract the key information of the original feature vectors. However, matrix decomposition usually involves high computational complexity. Additionally, the classification model is another important component of the object recognition framework. The SVM is the classical classification model and has been adopted in many works, such as the studies [8], [9], [10]. It shows good generalization capability in situations with few samples and with non-linear, high-dimensional problems. However, the SVM still has some limitations; for example, prediction speed decreases as the number of support vectors increases. Koibayashi et al. [13] argue that the support vector set can be reduced by finding a simplified support vector set in feature space. Geebelen et al. [14] proposed a method that selects a typical sample set from the original training set and adopts the reduced sample set to train the prediction model. Therefore, this work deals with the object recognition problem from two aspects: 1) designing an efficient compressed feature to reduce computational complexity and speed up prediction, and 2) developing a model optimization method to improve recognition speed without accuracy loss.

Materials and methods

Characteristics of coronaviruses

Coronaviruses have no cell structure and can proliferate only inside living cells; they rely on living cells to synthesize proteins. The main genetic material of coronaviruses is single-stranded positive-sense RNA. Its basic units are ribonucleotides, which can be stained with pyronin dye. The viruses bind to ACE2 on the cell membrane to invade the cell, indicating that the cell membrane is responsible for information exchange. The offspring of the virus are discharged from the cell through vesicles, indicating that the biological membrane has a certain fluidity. Fig. 2 shows the structure of the coronavirus: S is the spike protein, E is the envelope protein, M is the membrane protein, and N is the ribonucleoprotein. Fig. 3 shows COVID-19.
Fig. 2

Structure of coronaviruses.

Fig. 3

The COVID-19.


Feature extraction

Fifty cases infected with COVID-19 admitted to XX hospital were selected as the research subjects, and their epidemiology was investigated with their consent. The SVM model was established with COVID-19 infection as the dependent variable, and age, gender, education level, and close-contact history as independent variables. The SIFT descriptor and the dense sampling scheme are adopted in our work. Firstly, the COVID-19 image is segmented into local patches of 16 × 16 pixels; each local patch is divided into 16 grids of 4 × 4 pixels. Secondly, the gradient magnitude g(x, y) and orientation θ(x, y) at each position (x, y) in the grid are calculated by Eq. (1), g(x, y) = sqrt((I(x+1, y) − I(x−1, y))² + (I(x, y+1) − I(x, y−1))²) and θ(x, y) = arctan((I(x, y+1) − I(x, y−1)) / (I(x+1, y) − I(x−1, y))), and the orientation of each position (x, y) is assigned to a specific bin, forming a histogram (the range 0 to 360° is divided into 8 bins, each covering 45°). Thirdly, all orientation histograms of each local patch (with 4 × 4 grids) are aggregated into a 128D (D is short for dimension) SIFT descriptor; the details are shown in Fig. 4. The SIFT descriptor of the next local patch is calculated with a spacing of 8 pixels until the whole image is traversed.
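The patch/grid/histogram procedure above can be sketched as follows (a minimal NumPy illustration; the function name and the use of central-difference gradients are our assumptions, not the authors' code):

```python
import numpy as np

def dense_descriptors(image, patch=16, grid=4, bins=8, stride=8):
    """Dense sampling sketch: for every 16x16 patch (stride 8), build a
    4x4 grid of 8-bin orientation histograms, weighted by gradient
    magnitude, and concatenate them into one 128-D descriptor."""
    # Gradient magnitude and orientation at every pixel (Eq. (1)-style).
    gy, gx = np.gradient(image.astype(float))
    mag = np.sqrt(gx ** 2 + gy ** 2)
    ori = np.degrees(np.arctan2(gy, gx)) % 360.0   # range [0, 360)

    descriptors = []
    h, w = image.shape
    for y in range(0, h - patch + 1, stride):
        for x in range(0, w - patch + 1, stride):
            hist = np.zeros((grid, grid, bins))
            for gi in range(grid):
                for gj in range(grid):
                    ys = slice(y + gi * 4, y + gi * 4 + 4)
                    xs = slice(x + gj * 4, x + gj * 4 + 4)
                    b = (ori[ys, xs] // 45).astype(int)  # 8 bins of 45 deg
                    np.add.at(hist[gi, gj], b.ravel(), mag[ys, xs].ravel())
            descriptors.append(hist.ravel())             # 4*4*8 = 128-D
    return np.array(descriptors)
```

With a 500 × 800 image and a stride of 8, this yields one 128-D descriptor per patch position, matching the traversal described above.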

Feature coding

The extracted 128D local features contain a lot of redundancy and are difficult to use directly for modeling semantic features. Therefore, through the bag-of-words framework, the quantized local feature descriptors can be seen as "visual words", and the image content is expressed by the distribution of the "visual words" in the image. In our approach, we employed 200 images and extracted their local feature descriptors according to the method in Section 3.1, then the K-Means [15] algorithm was adopted to calculate the cluster centers. We set the hyperparameter K to 1024; therefore, 1024 "visual words" were selected to build a "visual dictionary". Each image can then be encoded against the "visual dictionary" as a 1024D vector. This procedure is described in Fig. 5.
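The dictionary building and encoding steps can be sketched as follows (a toy NumPy version of Lloyd's K-Means plus histogram encoding; the function names and the small parameter values in the usage are illustrative, the paper uses K = 1024):

```python
import numpy as np

def build_dictionary(descriptors, k, iters=10, seed=0):
    """Plain Lloyd's K-Means over local descriptors; the k cluster
    centers act as the 'visual words' of the dictionary."""
    rng = np.random.default_rng(seed)
    centers = descriptors[rng.choice(len(descriptors), k, replace=False)]
    for _ in range(iters):
        # Assign each descriptor to its nearest center...
        d2 = ((descriptors[:, None, :] - centers[None, :, :]) ** 2).sum(-1)
        labels = d2.argmin(1)
        # ...then move each center to the mean of its members.
        for j in range(k):
            members = descriptors[labels == j]
            if len(members):
                centers[j] = members.mean(0)
    return centers

def encode(descriptors, centers):
    """Encode one image as the normalized histogram of its descriptors'
    nearest visual words (a k-D bag-of-words vector)."""
    d2 = ((descriptors[:, None, :] - centers[None, :, :]) ** 2).sum(-1)
    hist = np.bincount(d2.argmin(1), minlength=len(centers)).astype(float)
    return hist / max(hist.sum(), 1.0)
```

Each image is thus represented by one fixed-length vector regardless of how many local descriptors it produced.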

Feature compression

From compressive sensing theory [16,17], it is known that if the dimensionality of the feature space is extremely high, the features can be randomly projected to a low-dimensional feature space that still preserves enough information. Given a vector v in the low-dimensional space R^n and a vector u in the high-dimensional feature space R^m, the mapping by a random matrix P ∈ R^(n×m) is defined as Eq. (2): v = Pu, where n ≪ m (Figs. 4 and 5).
Fig. 4

The procedure of the feature extraction.

Fig. 5

The framework of bag-of-word.

The projection v is similar to a compressive measurement in the compressive sensing encoding stage. By the JL (Johnson-Lindenstrauss) lemma, if a signal is a linear combination of only K basis vectors [18], that signal can be reconstructed from a small number of random measurements.

Lemma 1 (Johnson-Lindenstrauss lemma): Let D be a finite collection of d points in R^m. Given 0 < ε < 1 and β > 0, let n be a positive integer such that Eq. (3) holds:

n ≥ (4 + 2β) · (ε²/2 − ε³/3)^(−1) · ln d    (3)

Then, for any set D of d points in R^m, there exists a mapping f: R^m → R^n such that, for every pair u1, u2 in D,

(1 − ε) ‖u1 − u2‖² ≤ ‖f(u1) − f(u2)‖² ≤ (1 + ε) ‖u1 − u2‖²    (4)

Let f(u) = (1/√n) Pu, where P is a random matrix whose entries p_ij are drawn independently according to Eq. (5) or Eq. (6):

p_ij ~ N(0, 1)    (5)

or

p_ij = +1 with probability 1/2, −1 with probability 1/2    (6)

Then, for any two vectors u1, u2 in D, Eq. (4) will be satisfied with probability exceeding 1 − d^(−β). Liu et al. [19] adopted a random Gaussian matrix and showed that a sparse random measurement matrix obtains favorable results in texture classification. Actually, a very sparse measurement matrix can be defined as Eq. (7):

p_ij = √x × { +1 with probability 1/(2x); 0 with probability 1 − 1/x; −1 with probability 1/(2x) }    (7)

Chen et al. [20] proved that such a very sparse matrix can obtain results similar to a random Gaussian matrix. In our work, a fixed value of x is used to build the very sparse measurement matrix.
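As a concrete illustration of Eq. (7) and the projection v = Pu, the following NumPy sketch builds a very sparse measurement matrix and compresses a 1024-D vector to 100-D (the function name and the choice x = 3 are illustrative assumptions, not values from the paper):

```python
import numpy as np

def very_sparse_matrix(n, m, x, seed=0):
    """Very sparse measurement matrix in the spirit of Eq. (7):
    entries are sqrt(x) * {+1, 0, -1} drawn with probabilities
    {1/(2x), 1 - 1/x, 1/(2x)}."""
    rng = np.random.default_rng(seed)
    signs = rng.choice([1.0, 0.0, -1.0], size=(n, m),
                       p=[1 / (2 * x), 1 - 1 / x, 1 / (2 * x)])
    return np.sqrt(x) * signs

# Compress a 1024-D bag-of-words vector to 100-D: v = P u.
P = very_sparse_matrix(100, 1024, x=3)
u = np.random.default_rng(1).random(1024)
v = P @ u   # only a matrix multiplication, no matrix decomposition
```

Because roughly 1 − 1/x of the entries are zero, the projection touches only a fraction of the coefficients, which is where the efficiency advantage over PCA/SVD-style decompositions comes from.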

Model optimization

When the optimal classification hyperplane of the kernel SVM is determined, the normal vector Ψ in its feature space can be represented as a linear combination of all support vectors (Eq. (8)):

Ψ = Σ_{i=1..N} α_i φ(s_i)    (8)

The model optimization approach aims to find a simplified support vector set {c_j}, j = 1, …, m, to replace the original support vector set {s_i}, i = 1, …, N, of the kernel SVM model, where m < N. The normal vector of the simplified support set is defined as Eq. (9):

Ψ' = Σ_{j=1..m} β_j φ(c_j)    (9)

Then, we define E (Eq. (10)) to be the squared error between Ψ and Ψ':

E = ‖Ψ − Ψ'‖²    (10)

Taking the derivative of E with respect to β in Eq. (10) and setting it to zero gives Eqs. (11) and (12):

∂E/∂β = 2 K_cc β − 2 K_cs α = 0    (11)

β = (K_cc)^(−1) K_cs α    (12)

where (K_cc)_{jk} = k(c_j, c_k) and (K_cs)_{ji} = k(c_j, s_i). Moreover, in order to solve for a simplified support vector c, we follow Eq. (13):

min_c ‖Σ_{i=1..N} α_i φ(s_i) − β φ(c)‖²    (13)

Taking the derivative with respect to c in Eq. (13) and setting it to zero gives Eq. (14):

Σ_{i=1..N} α_i k'(s_i, c)(s_i − c) = 0    (14)

In our approach, we define k(s, c) = exp(−‖s − c‖²/(2σ²)) (the Gaussian kernel), so Eq. (14) can be written as Eq. (15):

c = Σ_i α_i exp(−‖s_i − c‖²/(2σ²)) s_i / Σ_i α_i exp(−‖s_i − c‖²/(2σ²))    (15)

According to Eqs. (12) and (15), we can iteratively solve for β and c via the fixed-point update of Eq. (16):

c^(t+1) = Σ_i α_i exp(−‖s_i − c^(t)‖²/(2σ²)) s_i / Σ_i α_i exp(−‖s_i − c^(t)‖²/(2σ²))    (16)

The iteration terminates when the change in E is less than a threshold or m reaches the specified number of support vectors.
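The iteration of Eqs. (12)-(16) can be sketched as follows, for the simplified single-vector case (one c approximating the whole expansion); the function names, the initialization, and the fixed iteration count are our assumptions, not the authors' implementation:

```python
import numpy as np

def gaussian_kernel(A, B, sigma=1.0):
    """Gaussian kernel matrix k(a, b) = exp(-||a - b||^2 / (2 sigma^2))."""
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-d2 / (2 * sigma ** 2))

def reduce_one_vector(S, alpha, sigma=1.0, iters=50):
    """Fixed-point iteration in the spirit of Eq. (16): one simplified
    support vector c approximating sum_i alpha_i phi(s_i),
        c <- sum_i alpha_i k(s_i, c) s_i / sum_i alpha_i k(s_i, c)."""
    c = S[np.argmax(np.abs(alpha))].copy()   # start from a dominant SV
    for _ in range(iters):
        w = alpha * gaussian_kernel(S, c[None, :], sigma)[:, 0]
        denom = w.sum()
        if abs(denom) < 1e-12:
            break
        c = (w[:, None] * S).sum(0) / denom
    return c

def solve_beta(C, S, alpha, sigma=1.0):
    """Optimal coefficients from Eq. (12): beta = (K_cc)^-1 K_cs alpha."""
    K_cc = gaussian_kernel(C, C, sigma)
    K_cs = gaussian_kernel(C, S, sigma)
    return np.linalg.solve(K_cc, K_cs @ alpha)
```

Repeating this (find the next c, re-solve β, subtract the approximated part) grows the simplified set one vector at a time, which matches the "cyclic iterations" described above.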

The experimental results

In order to validate the performance of our approach, we tested it on the Caltech 101 dataset. In the experiment, all approaches were programmed in Matlab 2014a and run on a PC with a CPU (Intel Core i5, 2.5 GHz) and 12 GB of memory. The Caltech 101 dataset is a popular dataset in the field of computer vision. Each category includes from 31 to 1100 images. The images are of medium resolution, about 500 × 800 pixels. In our experiments, about 775 images from 8 sub-categories were selected for training and testing. Fig. 6 and Table 1 respectively show some examples and the distribution of the sub-categories.
Fig. 6

Some examples of each categories.

Table 1

Distribution of 8 sub-categories.

Categories   Image size (pixels)   Number
Butterfly    About 300 × 200       91
Bonsai       About 300 × 200       128
Brain        About 300 × 250       98
Car          About 300 × 200       123
Elephant     About 300 × 250       64
Piano        About 300 × 280       99
Starfish     About 300 × 250       86
Sunflower    About 200 × 300       86
(1) Recognition accuracy

In the training stage, firstly, we randomly selected 10 images from each sub-category and extracted SIFT descriptors with the method of Section 3.1; the K-Means algorithm (K set to 1024) was then adopted to build a "visual dictionary" of size 1024 (Section 3.2). The remaining images were randomly divided 1:1 for model training and testing. Secondly, the feature vectors were compressed to 100D by a very sparse matrix of size 100 × 1024 (built with the compressive sensing method of Section 3.3), and the compressed feature vectors were fed into the kernel SVM for model training. Finally, the trained model was optimized by the method of Section 3.4; about 50% of the support vectors were reduced in the experiments. In the testing stage, we first extracted the feature vector of each testing image; secondly, the feature vectors were compressed with the same sparse matrix as in the training stage; thirdly, the compressed features were fed into the optimized model for prediction. We compared the proposed method with two baselines, which employed the uncompressed feature vectors and unoptimized classification models:

- extracting SIFT descriptors and building 1024D feature vectors with the same "visual dictionary" as our method, then adopting a linear SVM for prediction;
- extracting SIFT descriptors and building 1024D feature vectors with the same "visual dictionary" as our method, then adopting a kernel SVM for prediction.

The tests were conducted 5 times and the comparison results are shown in Table 2. From the results in Table 2, the kernel SVM achieved the best performance but consumed the most time. Our proposed method obtained recognition performance approximating the kernel SVM, but its recognition speed is much faster. The linear SVM achieved the worst recognition performance and slower recognition speed than our method. Therefore, the proposed method achieves a balance of accuracy and speed.
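The evaluation protocol above (random 1:1 splits repeated 5 times, reporting mean accuracy) can be sketched as follows; `train_and_test` is a hypothetical placeholder for any classifier routine, not a function from the paper:

```python
import numpy as np

def repeated_split_eval(X, y, train_and_test, runs=5, seed=0):
    """Repeat a random 1:1 train/test split `runs` times and report
    mean and standard deviation of test accuracy, as in Table 2.
    `train_and_test(Xtr, ytr, Xte, yte)` must return test accuracy."""
    rng = np.random.default_rng(seed)
    accs = []
    for _ in range(runs):
        idx = rng.permutation(len(y))
        half = len(y) // 2
        tr, te = idx[:half], idx[half:]
        accs.append(train_and_test(X[tr], y[tr], X[te], y[te]))
    return float(np.mean(accs)), float(np.std(accs))
```

Reporting mean ± std over several random splits, rather than one split, is what makes the per-category numbers in Table 2 comparable across methods.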
Table 2

Comparison results of three method.

Categories     SIFT + linear SVM (%)   SIFT + kernel SVM (%)   Our method (%)
Butterfly      71.50 ± 2.13            72.30 ± 1.96            71.80 ± 1.93
Bonsai         75.87 ± 3.12            76.18 ± 3.40            76.16 ± 3.35
Brain          85.33 ± 3.75            87.27 ± 3.75            87.33 ± 3.75
Car            72.09 ± 5.71            72.55 ± 5.19            72.55 ± 5.19
Elephant       81.88 ± 4.79            86.47 ± 4.71            86.47 ± 4.71
Piano          95.35 ± 1.12            98.73 ± 1.10            98.73 ± 1.09
Starfish       85.61 ± 4.04            87.09 ± 4.36            87.25 ± 4.33
Sunflower      92.38 ± 5.82            98.25 ± 6.43            97.85 ± 6.31
Mean acc. (%)  82.50 ± 3.81            84.86 ± 3.79            84.75 ± 3.83
Speed          0.12 s                  0.25 s                  0.10 s
(2) Feature compression

We compared the performance of the feature compression with the classical PCA and SVD. Apart from the feature compression method itself, we adopted the same settings for the training and testing stages. The original 1024D feature vectors were compressed to different dimensions (D = 20, 40, 60, 80, 100, 120, 140, 160, 180, 200). The results are shown in Fig. 7.
Fig. 7

The compression performance of different methods.

(3) Model optimization

Additionally, we validated the performance under different reduction rates (the proportion of reduced support vectors to the total number of support vectors). Fig. 8 illustrates the results. In terms of accuracy, the optimized model keeps performance approximately unchanged for reduction rates from 10% to 50%; when the reduction rate rises to 60% or higher, performance begins to decrease. In terms of recognition speed, the speed increases linearly as the reduction rate increases.
Fig. 8

Relationship between reduction rate and accuracy, speed.


Conclusions

In this paper, we focus on dealing with practical problems in the task of object recognition. In order to improve the efficiency of the object recognition workflow, we proposed methods to improve the feature compression and classification stages. Traditionally, high-dimensional features are compressed and their main information extracted by PCA or SVD. To improve compression efficiency, we adopted a very sparse matrix, based on the theory of compressive sensing, to map high-dimensional features into the low-dimensional space. Moreover, to further improve recognition speed, we proposed a model optimization method which accelerates classification by reducing the number of support vectors of the kernel SVM. The experimental results show that our model achieves favourable recognition accuracy and speed. Our future work will focus on developing more powerful features; although the local features yield favourable results in this work, we think there is room for improvement. Moreover, applying this method to other computer vision tasks (such as object detection and object tracking) is also very interesting.

Declaration of Competing Interest

The authors declare that they have no conflicts of interest in this work.
  6 in total

1.  Texture classification from random features.

Authors:  Li Liu; Paul W Fieguth
Journal:  IEEE Trans Pattern Anal Mach Intell       Date:  2012-03       Impact factor: 6.226

2.  Reducing the number of support vectors of SVM classifiers using the smoothed separable case approximation.

Authors:  Dries Geebelen; Johan A K Suykens; Joos Vandewalle
Journal:  IEEE Trans Neural Netw Learn Syst       Date:  2012-04       Impact factor: 10.451

3.  Compressive Sensing Image Restoration Using Adaptive Curvelet Thresholding and Nonlocal Sparse Regularization.

Authors:  Nasser Eslahi; Ali Aghagolzadeh
Journal:  IEEE Trans Image Process       Date:  2016-05-03       Impact factor: 10.856

4.  Generalized Pooling for Robust Object Tracking.

Authors: 
Journal:  IEEE Trans Image Process       Date:  2016-07-07       Impact factor: 10.856

5.  Structured Sparse Principal Components Analysis With the TV-Elastic Net Penalty.

Authors:  Amicie de Pierrefeu; Tommy Lofstedt; Fouad Hadj-Selem; Mathieu Dubois; Renaud Jardri; Thomas Fovet; Philippe Ciuciu; Vincent Frouin; Edouard Duchesnay
Journal:  IEEE Trans Med Imaging       Date:  2017-09-04       Impact factor: 10.048

6.  A Hybrid Vehicle Detection Method Based on Viola-Jones and HOG + SVM from UAV Images.

Authors:  Yongzheng Xu; Guizhen Yu; Yunpeng Wang; Xinkai Wu; Yalong Ma
Journal:  Sensors (Basel)       Date:  2016-08-19       Impact factor: 3.576

