HRNET: AI-on-Edge for Mask Detection and Social Distancing Calculation.

Kinshuk Sengupta1, Praveen Ranjan Srivastava2.   

Abstract

The purpose of the paper is to provide an innovative emerging-technology framework that helps communities combat epidemic situations. The paper proposes a unique outbreak response system framework based on artificial intelligence and edge computing for citizen-centric services, to help track and trace people who elude safety policies such as mask wearing and social distancing in public or workplace settings. The framework further provides implementation guidelines for industrial settings, as well as for governance and contact tracing tasks. Its adoption should thus support smart city planning and development focused on citizen health systems, contributing to improved quality of life. The conceptual framework is validated through quantitative analysis of secondary data collected from researchers' public websites, GitHub repositories, and renowned journals, and further benchmarking was conducted for experimental results in a Microsoft Azure cloud environment. The study includes selected AI models for benchmark analysis, which were assessed on performance and accuracy in an edge computing environment for large-scale societal settings. Overall, the YOLO model outperforms the others in the object detection task and is fast enough for mask detection, while HRNetV2 performs best on the semantic segmentation problem applied to the social distancing task in an AI-Edge inferencing setup. The paper proposes a new Edge-AI algorithm for building technology-oriented solutions that detect masks and social distance in human movement. The paper enriches the technological advancement of artificial intelligence and edge computing applied to problems in society and healthcare systems. The framework further equips government agencies and system providers to design and construct technology-oriented models in community settings, increasing quality of life by bringing emerging technologies into smart urban environments.
© The Author(s), under exclusive licence to Springer Nature Singapore Pte Ltd 2022.

Keywords:  Deep learning; Edge-computing; Social and industrial safety

Year:  2022        PMID: 35194579      PMCID: PMC8830974          DOI: 10.1007/s42979-022-01023-1

Source DB:  PubMed          Journal:  SN Comput Sci        ISSN: 2661-8907


Introduction

The recent outbreak of coronavirus SARS-CoV-2 infection was first detected in December 2019 in Wuhan, China [1]. The infection has affected more than 3.2 million people and caused 239 K deaths, according to the European Centre for Disease Prevention and Control. In the latest report, 64.7% of total deaths are attributed to fever and 52.9% to cough [1]. Flattening the curve via quarantine and preventing community spread using a respiratory surgical mask or N95 mask have been found significant in controlling the spread in previously published literature [1]. The literature shows evidence that the use of a surgical mask reduces transmissibility per individual by preventing droplet transmission in both laboratory and clinical contexts [2]. From a compliance standpoint, industrial workplaces, airports, and places of community gathering pose the highest risk of spread without prevention. Public health authorities have tried to contain the virus spread via isolation, personal protection, and hygiene compliance [3], social distancing, contact tracing, and surveillance applications [4]. Recent surveys and literature have studied how to handle community gatherings to prevent the global spread of COVID-19. Several research gaps in the response to COVID-19 still need to be addressed, such as developing an ethical framework to control and contain the spread. This also involves devising appropriate ways to prevent and control the infection by identifying optimal personal protective equipment (PPE) and, after that, understanding behavior among various vulnerable groups [4]. Many recent research works propose that policymakers make masks an official guideline to stop spread in the community [2].
Hence, sustainably designed AI-powered technology solutions, such as robotics and self-explainable digital solutions [5], need urgent attention to tackle the post-pandemic situation in society and industrial settings and to counter the ripple effect of COVID-19 on the economy. Policymakers and industries would need an efficient solution within the industrial and societal structure to trace and track people and prevent droplet spread [6]. An older review of randomized controlled trials using masks found that such a low-cost intervention would be useful to break the transmission of respiratory viruses [7]. The existing studies are summarized in Table 1 below.
Table 1

Compare-and-contrast review of implications, discovering significant scope for GAP analysis

S. No. | Research article | Research objective | Methodology/technique | Limitations | Implications
1. | Defending against the Novel Coronavirus (COVID-19) outbreak: how can the internet of things (IoT) help to save the world? https://doi.org/10.1016/j.hlpt.2020.04.005 | Studied how IoT-based smart disease surveillance systems can act as a potential solution to control the current pandemic | Literature review | Limited to the theoretical model | Directing toward conducting more research on automated and effective alert systems for detection and control of the virus
2. | COVID-19: toward controlling of a pandemic https://doi.org/10.1016/S0140-6736(20)30673-5 | Article and WHO review for ways and recommendation on controlling COVID-19 | Literature review | Need of conceptual framework | A key outcome is to develop a framework for outbreak response
3. | The role of masks and respiratory protection against SARS-CoV-2 https://doi.org/10.1017/ice.2020.83 | Identifying the role of mask and personal protection against COVID-19 spread | Literature review | Need empirical evidence | N95 and surgical masks are classified as a key aspect of the fight against SARS-CoV-2
4. | Wearing face masks in the community during the COVID-19 pandemic https://doi.org/10.1016/S0140-6736(20)30918-1 | Studies whether wearing a mask alone controls the spread of the virus or whether social distancing and personal hygiene are also important | Literature review | Need empirical evidence | No empirical evidence of masks in infection control
5. | Face masks against COVID-19: an evidence review https://doi.org/10.20944/preprints202004.0203.v1 | Study the factor of lowering community transmission using face masks | Literature review | Need empirical evidence | The review focuses on providing evidence from literature to use masks for controlling spread and helps frame policy around use of non-medical masks in public
6. | Rational use of face masks in the COVID-19 pandemic https://doi.org/10.1016/S2213-2600(20)30134-X | Study the need of face mask in community settings | Literature review | Limited to the theoretical recommendations from WHO and health agencies | Face masks are recommended by WHO, ICMR and other health agencies to prevent potential asymptomatic or presymptomatic transmission
7. | COVID-19: protecting worker health https://doi.org/10.1093/annweh/wxaa033 | Discuss use of PPE, effect of wearing mask and social distancing | Literature review | Need empirical evidence | Debates on urgent need for research on control measures to protect workers and prevent spreading
8. | Scientific and ethical basis for social distancing interventions against COVID-19 https://doi.org/10.1016/S1473-3099(20)30190-0 | Discuss impact of social distancing on spread of virus | Literature review | Need empirical evidence | Focuses on creating evidence-based intervention for public communication
Past studies hold a few limitations from a conceptual framework point of view. The evidence found in the literature shows the need for further research on devising a mature technology model that infuses artificial intelligence to respond to the outbreak in the current situation, addressing the gap (Fig. 1). The paper focuses on building a technology framework that can be extended for implementation in public and industrial settings to detect people with or without a face mask and, therefore, ease tracing and further control the spread. Current research studies how IoT-based smart disease surveillance systems can help in controlling the spread of the pandemic [8]. Policymakers need to think about enforcing and implementing smart urban and industrial planning solutions across geographies in the post-lockdown situation to resume economic activities.
In the past, studies by Bibi and Krogstie [9] have pointed toward a new generation of urban planning tools for improving mobility and accessibility; these can further be applied in combating pandemic situations across the globe.
Fig. 1

Current studies and GAP in outbreak response from a technology point of view for early economic revival

The Need of Technology Framework

The world has seen unprecedented times before as well, due to epidemic health crises caused by Ebola, H1N1, SARS-CoV-1, the Zika virus, and many others that have had a drastic impact on the economy and social wellbeing. However, the exponential global spread of COVID-19 has created a need for responses from governments because of heavy losses in GDP ranging from 3 to 6%, or even more depending on the country [10] and economy, as predicted by the model developed by Wang and Yu [1]. This section discusses the impact of lockdown on the economy and subsequently the need for a responsive framework from the industrial, citizen, and government viewpoints, aiding early formulation of lockdown policy and strategy by government and industries.

Industrial Point of View

Globally, governments have taken measures to cut the spread, such as prohibiting crowd gatherings, offering remote health advice, creating mobile surveillance apps for tracking, building healthcare systems that detect COVID using quantum machine learning methods [11], and tracing people with the virus, as in the studies conducted by Keeling et al. [12]. Though the measures taken to date can help settle the pandemic, stronger strategies are envisioned for reducing transmission within communities and households, with better support for home diagnosis facilities and for dealing with the economic consequences of individuals' absence from jobs and work. Drawing on experience from past recessions, researchers have suggested that the impact on the economic backbone can deepen or persist [13]. Further, the discussion and studies by Wittkowski [14] and Kissler et al. [15] state that the effectiveness of the lockdown strategy in limiting the spread of the virus is not known. Industries such as hospitality and airlines have taken a significant hit, followed by agriculture, with a global drop of more than 20% in demand, and manufacturing, which has shown a large drop in overall demand [16]. In the present lockdown situation, corporations and industries have started adopting digital platforms (Gaines-Ross [17]). The affected industries cannot operate under work-from-home policies, and due to COVID-19 businesses have seen disruption from staffing shortages caused by self-isolation and lockdown across the globe. Lifting lockdown in a phased manner through early planning can help revive economic activities to a great extent, which would later demand effective contact tracing mechanisms within industrial settings as well as community and religious places [12].
Hence, digital solutions would play a vital role in supporting the post-lockdown phases, where systems for monitoring, surveillance, and detection would need to be developed with IoT, big data systems, and AI as the core technologies [18].

Citizen’s Point of View

It is important to understand the responsibility individuals have in controlling the overall spread by following the guidelines from local government and the WHO recommendations of practicing social distancing, maintaining personal hygiene, and wearing a mask in public places [19]. Despite communication from health bodies, individuals' awareness of the COVID-19 threat was limited, and people did not adhere to social distancing in practice despite the mandatory steps outlined as controlling mechanisms by WHO. Social distancing could be an effective way to reduce mortality and spread rate in any setting because the virus spreads through droplets that can persist in the environment [20]. According to recommendations from health agencies and published government guidelines, people need to strictly follow social distancing and respiratory hygiene [21] when in public. Since vaccine development is a time-consuming process, the spread can only be curbed through social reformation and individual effort within both societal and industrial settings.

Government’s Point of View

Some Asian countries have shown success in controlling the pandemic through testing, contact tracing, and quarantine strategies along with moderate or strong social distancing measures. China leverages intrusive surveillance technology to control the virus's spread, tracking and monitoring citizens to establish safety protocols. Similarly, many other countries are now adopting technology in many forms to communicate with and trace people during this pandemic. For example, India's Ministry of Electronics and Information Technology (MeitY) has mandated use of the mobile tracking application 'Health Bridge' for contact tracking, according to the official MeitY website (mygov.in). EU Member States, backed by the Commission, have rolled out the mobile app 'eHealth' for automated contact tracking, which is more efficient than time-consuming and expensive manual effort. The government needs to be responsive to the pandemic situation with its citizens, create awareness campaigns, run social drills through border forces, and strengthen disaster and humanitarian coordination. Policymakers need to create effective forecasting models that help in making the right decisions in a timely way, even with such uncertainty around COVID-19 containment. The post-lockdown strategy is a more crucial stage, in which government must enforce tough decisions on how best to prevent transmission through governed actions by industries and by individuals returning to work [19]. Additionally, from the economic and social standpoint, measures based on isolation are not sustainable in the long run. An extended economic shutdown would create negative health consequences [22].
Several studies depict the role technology has played during earlier epidemics. To battle the current situation, policymakers and health agencies need to lay out a stronger, documented lockdown exit strategy that keeps post-lockdown implications in view, and therefore leverage a strong technology framework for contact tracing and surveillance [17].

Proposed Model and Framework

In this section, the paper proposes building a surround outbreak response system (Fig. 2) that aids in tracking and tracing safety-related concerns in industrial and societal settings. Contact tracing is important for controlling infectious diseases and their spread [23]. The framework would encompass video feeds from surveillance cameras and IoT edge devices placed inside industrial settings or public places to track people's movement. The architecture proposed here is a hybrid design that accepts feeds from existing cameras as well as IoT devices, with edge computing environments on the cloud. It accommodates light edge devices, such as Intel Movidius and Nvidia Jetson, heavy edge devices, such as Nvidia Tesla or Intel FPGA, and a cloud environment for training and testing large-scale object detection models.
Fig. 2

The high-level design of outbreak response system for tracking and tracing

The framework is focused on leveraging edge computing for detecting face masks in certain environmental settings (workshops, hospitals, industrial premises). Edge computing can deliver swift localized events, near real-time insights, and a reduction in overall cost due to efficient local data management and operations [24]. The positive side of deploying deep learning models on the edge is that it tackles bandwidth-related challenges and provides extensive security for data that is PII in nature (Satyanarayanan et al. [25]). Edge devices can hold relatively lighter deep learning models and process raw information in smaller image frames [24]. The research work extends the concept of multimodal face detection and tracking of people to workshops and to public and community places like temples, mosques, government offices, public transport, and offices. The paper builds on work by the researcher [26] that uses a facial-recognition technique to examine the spatio-temporal behavior of individuals. Further, the research work performs advanced experiments in a controlled environment to deduce optimal algorithms for object detection and to estimate social distancing criteria from image frames. The sub-sections describe the data collection strategy for the experimental design, the feature engineering stages applied, and the benchmarking of algorithm implementations.

System Design: Edge-Cloud Computing

The section describes the foundation of edge computing architecture for large-scale image analytics and inferencing problems. The system architecture for edge computing environment is a three-tier architecture [27] described in Figs. 3 and 14, respectively. The key advantages of leveraging edge computing are:
Fig. 3

Three-tier edge computing architecture

Fig. 14

FPS capability of various object detectors

(a) Low-latency access due to localization of the compute environment, storage, and networking. (b) Lessened bandwidth utilization due to intelligent aggregation and filtering of the data to be transmitted for training complex models. (c) Localization of models, which need only intermittent access to the internet. (d) Localization of machine learning inferencing for models trained on the public cloud.

The architecture explains the design of the edge computing environment. The device layer connects edge devices that support local compute to run deep learning models as compact models, provides connected-device configuration to link a mesh of multiple edge cameras, and provides security configuration to extend data privacy, all mapped to the IoT edge runtime of the surveillance cameras. Machine learning model training happens in the public cloud environment, and a tiny trained model snapshot is deployed via Docker containers using IoT Hub and agents within the device environment [28]. While edge computing delivers resilience to the overall system design from a computation standpoint, training large deep learning models and deploying them for inferencing remains a non-trivial task. The next section analyzes the relevant deep learning models for fast training and deployment, with benchmarking of performance and accuracy conducted on different datasets.

AI Models

A combinatory system using IoT, edge computing, and artificial intelligence would provide enterprises with a wide range of new services and business opportunities and help companies create new value [29]. The working of the model is represented as a flow diagram in Fig. 4. The model needs edge cameras, deployed on manufacturing shop floors or inside office areas, that continuously run inference using the built-in machine learning models. The device capturing the motion feed of the area converts video to image frames and keeps them as small compact patches.
Fig. 4

Flow diagram of contact tracing model (mask detection and social distancing)

The role of edge cameras is to collect and process feeds locally using the deployed DL models. Inferencing happens in sequence at the edge: the people and mask detection ML module runs first, then the image segmentation ML module, and then the social distancing classifier on the segmented image. The processed feeds are stored in a native cloud environment for further alerting and tracking through MIS dashboards. The subsequent sections discuss the overall system design from an IT implementation perspective and the algorithms evaluated for completeness of the framework. Deep convolutional neural networks have attained state-of-the-art results in computer vision problems like object detection, semantic segmentation, and human pose estimation (Fig. 5). Mask detection falls under object detection, and social distancing under semantic segmentation. Object detection is the problem of determining where objects are located within an image, called object localization, and which class each object belongs to, called object classification (Zhao et al. [30]). Semantic segmentation deals with the problem of assigning a class label to each pixel within an image [28]. The models selected for benchmarking are described conceptually in the subsections below.
Fig. 5

Object detection and segmentation

Region-Based Convolutional Neural Network (R-CNN)

The model uses the selective search technique (Uijlings et al. [31]), instead of an exhaustive search over the image, to detect region proposals. Initialization happens over small regions in the image, which are then merged through hierarchical grouping. The final group is the box containing the entire image. The discovered regions are combined based on color spaces and similarity metrics. The output is a small number of region proposals, formed by melding small regions, each of which could contain an object. How the algorithm functions is described in Fig. 6 below.
Fig. 6

R-CNN model
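
The hierarchical grouping described above can be illustrated with a minimal sketch. It is not the selective search implementation of Uijlings et al.: the similarity measure (difference of a single mean grey level), the unweighted color averaging, and the function names `merge_boxes` and `hierarchical_proposals` are all assumptions made for illustration, and adjacency between regions is ignored.

```python
from itertools import combinations

def merge_boxes(a, b):
    """Smallest box (x1, y1, x2, y2) that encloses both input boxes."""
    return (min(a[0], b[0]), min(a[1], b[1]), max(a[2], b[2]), max(a[3], b[3]))

def hierarchical_proposals(regions):
    """Greedy grouping: repeatedly merge the two most similar regions.

    `regions` is a list of (box, mean_color) pairs, where mean_color is a
    single grey level standing in for the color and similarity metrics.
    Returns every box seen along the way as a region proposal.
    """
    active = list(regions)
    proposals = [box for box, _ in active]
    while len(active) > 1:
        # Pick the pair whose mean colors are closest (assumed similarity).
        i, j = min(combinations(range(len(active)), 2),
                   key=lambda p: abs(active[p[0]][1] - active[p[1]][1]))
        (box_i, col_i), (box_j, col_j) = active[i], active[j]
        merged = (merge_boxes(box_i, box_j), (col_i + col_j) / 2)
        active = [r for k, r in enumerate(active) if k not in (i, j)]
        active.append(merged)
        proposals.append(merged[0])
    return proposals

# Three small starting regions: two similar ones on top, a different one below.
initial = [((0, 0, 10, 10), 50), ((10, 0, 20, 10), 55), ((0, 10, 20, 20), 200)]
proposals = hierarchical_proposals(initial)
```

In this toy input the two similar top regions merge first, and the last proposal is the box covering the whole 20 × 20 image, matching the description that the final group contains the entire image.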

Fast R-CNN

Like R-CNN, the paper also benchmarks Fast R-CNN. The advantage of Fast R-CNN (Fig. 7) is that it reduces the overall computational expense incurred in R-CNN. Here, instead of a separate ConvNet being applied to each region, a single ConvNet takes the entire image patch. Regions are detected with a search method and fed into a fully connected layer that creates a feature vector, which is passed through a SoftMax classifier to predict the object.
Fig. 7

Fast R-CNN model

Faster R-CNN

Fast R-CNN has some limitations: the search algorithm used in region detection is slow. Faster R-CNN replaces it with a fast neural network built on the novel concept of region proposal networks. The model visits each location in the preceding feature map and considers r different boxes there. For each box, the network outputs whether it contains an object or not, and the selected boxes are fed into Fast R-CNN.
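
To make the idea of "r different boxes" per feature-map location concrete, the following sketch enumerates candidate boxes only. The scales, aspect ratios, stride, and the function names `anchors_at` and `all_anchors` are assumptions for illustration; the learned part, which scores each box as object or not, is not shown.

```python
def anchors_at(cx, cy, scales=(64, 128), ratios=(0.5, 1.0, 2.0)):
    """Return r = len(scales) * len(ratios) boxes centred on (cx, cy).

    Each box is (x1, y1, x2, y2); ratio is width / height, and the box area
    is kept at scale ** 2.
    """
    boxes = []
    for scale in scales:
        for ratio in ratios:
            w = scale * ratio ** 0.5
            h = scale / ratio ** 0.5
            boxes.append((cx - w / 2, cy - h / 2, cx + w / 2, cy + h / 2))
    return boxes

def all_anchors(feat_w, feat_h, stride=16):
    """Candidate boxes for every location of a feat_w x feat_h feature map."""
    out = []
    for row in range(feat_h):
        for col in range(feat_w):
            # Map the feature-map cell back to image coordinates.
            cx, cy = (col + 0.5) * stride, (row + 0.5) * stride
            out.extend(anchors_at(cx, cy))
    return out

boxes = all_anchors(feat_w=4, feat_h=3)
```

With the assumed two scales and three ratios, r is 6, so a 4 × 3 feature map yields 72 candidate boxes.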

YOLO (You Only Look Once)

YOLO was developed by researchers from the University of Washington, the Allen Institute for AI, and Facebook AI Research. The YOLO architecture is much faster and can process images in real time at 45 fps. The model treats object detection as a regression problem to spatially separated bounding boxes and associated class probabilities. Unlike region proposal or sliding window methods, YOLO sees the entire image during training and therefore implicitly encodes contextual information about classes as well as their appearance. The architecture of YOLO is described in Fig. 8 below. The method pre-trains the convolutional layers on the ImageNet classification task at 224 × 224 input resolution and thereafter doubles the resolution for detection.
Fig. 8

Yolo network has 24 convolutional layers followed by two fully connected layers
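
As an illustration of detection framed as regression, the sketch below decodes the output of a single grid cell into a labelled box. The layout of the prediction vector, the 7 × 7 grid, the 448-pixel input (twice 224), and the class names `mask` and `no_mask` are assumptions chosen for this example rather than details taken from the experiments.

```python
def decode_cell(pred, row, col, grid=7, img_size=448, classes=("mask", "no_mask")):
    """Turn one grid cell's regression output into a labelled box.

    `pred` is [x, y, w, h, confidence, p_class0, p_class1, ...] where x, y are
    offsets inside the cell and w, h are fractions of the whole image.
    """
    x, y, w, h, conf = pred[:5]
    class_probs = pred[5:]
    cell = img_size / grid
    cx, cy = (col + x) * cell, (row + y) * cell   # box centre in pixels
    bw, bh = w * img_size, h * img_size           # box size in pixels
    best = max(range(len(class_probs)), key=lambda k: class_probs[k])
    score = conf * class_probs[best]              # class-specific confidence
    box = (cx - bw / 2, cy - bh / 2, cx + bw / 2, cy + bh / 2)
    return classes[best], score, box

label, score, box = decode_cell([0.5, 0.5, 0.25, 0.5, 0.8, 0.9, 0.1], row=3, col=3)
```

A full detector would repeat this for every cell and box, then discard low-scoring and overlapping boxes; that post-processing is left out here.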

Semantic Segmentation Using HRNetV2

HRNetV2 (Fig. 9) is a recent advance in semantic segmentation, helping segment objects in an image. The method builds its high-resolution representation by combining the upsampled representations from all parallel convolutions, rather than taking only the representation from the high-resolution convolution as in the original deep high-resolution representation learning model, which makes it powerful for segmentation problems. The model can further be extended to calculate the distance between two segmented objects using distance measure functions, such as Euclidean distance, Hamming distance, and partition distance [33].
Fig. 9

High-resolution network architecture Sun et al. [32]
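
Two of the distance measures named above can be sketched on toy binary masks, assuming each segmented object is available as a 0/1 grid of the same size. The helper names `centroid`, `euclidean`, and `hamming` are hypothetical, and the partition distance is not covered.

```python
import math

def centroid(mask):
    """Mean (row, col) of the pixels set to 1 in a binary mask."""
    pts = [(r, c) for r, row in enumerate(mask) for c, v in enumerate(row) if v]
    n = len(pts)
    return (sum(p[0] for p in pts) / n, sum(p[1] for p in pts) / n)

def euclidean(p, q):
    """Straight-line distance between two points."""
    return math.dist(p, q)

def hamming(mask_a, mask_b):
    """Number of pixel positions where two same-sized masks disagree."""
    return sum(a != b for ra, rb in zip(mask_a, mask_b) for a, b in zip(ra, rb))

# Two toy person masks in the same 2 x 6 frame.
person_a = [[1, 1, 0, 0, 0, 0],
            [1, 1, 0, 0, 0, 0]]
person_b = [[0, 0, 0, 0, 1, 1],
            [0, 0, 0, 0, 1, 1]]
pixel_gap = euclidean(centroid(person_a), centroid(person_b))
mask_difference = hamming(person_a, person_b)
```

The Euclidean distance between centroids gives a gap in pixels, which still has to be converted to a physical distance through some calibration that this sketch does not attempt.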

Methods

The section discusses the approach pursued for model development. Specifics on the overall data collection process followed by key algorithmic implementation for the study are outlined in upcoming sections.

Data, Pre-processing and De-biasing

This section describes the data assimilation and pre-processing steps used in the design of the experiments. Data were collected from open-source dataset repositories such as msropendata.com, data.world, datasetsearch.research.google.com, and pjreddie.com, together with the town center dataset, and an image segmentation model on the PASCAL VOC dataset was used for overall experimentation and modeling. Two sets of data were collected for the problem: (a) images of individual persons with and without masks, in groups and unaccompanied; (b) images of groups of people in a cluster, some observing distance and others not. De-noising was performed to improve image quality, and images were resized to fit the models for the end experiments and to create a fairer model. A known problem with image datasets is visual representation bias of women over men. This is widespread in the history of journalism and advertising [34-36]. Hence, as a key step during pre-processing, care was taken to produce fair samples with respect to gender. An automated labeling tool was used to label the images as 'with-mask' or 'without-mask' and, correspondingly, as 'following-distance' or 'not-following-distance'. The images selected from the large collected sample corpus were first labeled using the outline tool. After labeling, a bootstrapping resampling technique was applied to the entire image dataset to free it from any representational bias [37]. For gender-specific biases, a de-biasing method was employed: a separate model identified male and female subjects in each image, and the male-female representation was then estimated statistically as frequencies over the entire dataset. This step was important because researchers have recently demonstrated how many algorithms exhibit biases [38].
Crawford and Paglen [39] argue that the ImageNet dataset, which is extensively used for training image-labeling algorithms, exhibits biases across its more than 14 million images collected through web scraping.
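
The frequency estimation and bootstrap resampling steps can be sketched as follows. This is an illustrative assumption about how such a step might look, not the procedure used in the study: the record layout, the file names, and the functions `group_frequencies` and `balanced_bootstrap` are hypothetical.

```python
import random
from collections import Counter

def group_frequencies(labels):
    """Share of each group label in the dataset."""
    counts = Counter(labels)
    total = len(labels)
    return {g: n / total for g, n in counts.items()}

def balanced_bootstrap(samples, group_of, per_group, seed=0):
    """Draw `per_group` items with replacement from every group."""
    rng = random.Random(seed)
    by_group = {}
    for s in samples:
        by_group.setdefault(group_of(s), []).append(s)
    resampled = []
    for members in by_group.values():
        resampled.extend(rng.choices(members, k=per_group))
    return resampled

# Hypothetical image records: (file name, predicted gender, mask label).
images = [("a.jpg", "female", "with-mask"), ("b.jpg", "female", "without-mask"),
          ("c.jpg", "female", "with-mask"), ("d.jpg", "male", "with-mask")]
before = group_frequencies([img[1] for img in images])
balanced = balanced_bootstrap(images, group_of=lambda img: img[1], per_group=3)
after = group_frequencies([img[1] for img in balanced])
```

Sampling with replacement equalizes group shares but repeats images from the smaller group, so it reduces representational imbalance without adding new information.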

Experiments

This section details the actual experimental setup, the model implementation, the algorithmic implementation of the social distance calculation process, and the evaluation criteria. The experiment design phase involved selecting a base HRNET model for initial trials and then rapidly performing A/B testing across other models using the same collected datasets.

Mask Detection

The experiment benchmarked YOLO, Microsoft Computer Vision, Fast R-CNN, and R-CNN across samples of data chosen for identifying objects in an image. The machine learning models were trained on the collected dataset, the Pascal VOC dataset, and the COCO dataset [40]. In previous work, authors have proposed a transfer learning approach [41] fine-tuned on the MobileNetV2 model (Yadav [42]); however, that architecture has not been tested on large-scale data problems on edge systems. The paper further evaluates models for runtime as well as fastest inferencing on commercial edge devices. As part of the experiments, the Microsoft Vision AI DevKit (Fig. 10) was used to simulate the environment. The simulation environment architecture is depicted in Fig. 11 below.
Fig. 10

Azure IoT Vision AI Development KIT (Vision AI Development Kit-Qualcomm Developer Network)

Fig. 11

Edge simulation architecture

The simulation tasks used the Vision AI DevKit to deploy a custom YOLO model, trained with Azure ML services on the collected datasets of images of individuals with and without masks. The model was further tested on real subjects in a lab environment for evaluation.
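
Runtime evaluation of the kind described in this section usually reduces to timing repeated inference calls. The sketch below shows one way to estimate frames per second; `measure_fps` and `dummy_detector` are hypothetical stand-ins, and no real model or DevKit API is involved.

```python
import time

def measure_fps(infer, frames, warmup=2):
    """Average frames per second of `infer` over `frames`."""
    for frame in frames[:warmup]:
        infer(frame)                     # warm-up runs are not timed
    start = time.perf_counter()
    for frame in frames:
        infer(frame)
    elapsed = time.perf_counter() - start
    return len(frames) / elapsed

def dummy_detector(frame):
    """Stand-in for a deployed model; sums pixel values to burn some time."""
    return sum(sum(row) for row in frame)

# Twenty synthetic 64 x 64 frames filled with ones.
frames = [[[1] * 64 for _ in range(64)] for _ in range(20)]
fps = measure_fps(dummy_detector, frames)
```

Replacing `dummy_detector` with a call into the deployed model would give a comparable figure across detectors, provided the same frames and hardware are used for each.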

Social Distance Calculation

The paper introduces a distance calculation algorithm to calculate the social distance score of a segmented image using HR-Net segmentation model. The core segmentation model is developed using HRNET and object contextual representation transformer architecture [43]. The transformer pipeline is illustrated in Fig. 12. The calculation algorithm design is shown below in Table 2.
Fig. 12

HRNET + object contextual representation transformer pipeline

Table 2

Social distance calculation algorithm

Step 1: The model is first initialized with weights pre-trained on the ImageNet dataset
Step 2: The semantic segmentation of an image frame is obtained from the model set up in the above step
Step 3: The segmented image (Fig. 13) is taken as input for edge detection using Canny, followed by dilation and erosion to remove any gaps between object edges
Step 4: Contours for the shapes of the objects in the edge map are detected using the findContours method in OpenCV
Step 5: Looping over the contours individually, the rotated bounding box of each contour is calculated using the minAreaRect and boxPoints methods in OpenCV
Step 6: The box points of each contour are re-ordered into top-left, top-right, bottom-right, and bottom-left order to draw the rotated bounding box, and the center of the bounding box is then calculated
Step 7: To calculate the distance between objects, the algorithm considers each contour in turn, starting with the left-most as the initial reference, and calculates the mid-point between the top-left and top-right points, followed by the mid-point between the top-right and bottom-right points
Step 8: In the final stage, the Euclidean distance is calculated between mid-points for final handling of reference object reconstruction
Fig. 13

Segmented image (left) + distance calculation between each object (right)

The designed calculation algorithm was further deployed as a custom model, following the mask detection method, in the AI DevKit environment for simulations. The scope of the simulation was restricted to a limited number of objects and human identifications in the particular environmental setup chosen for this study. The models were evaluated through cross-reference benchmarking activities discussed in the upcoming results sections.
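
The geometric part of Table 2 (Steps 6 to 8) can be sketched in plain Python, assuming the four corner points of each rotated bounding box have already been obtained upstream (for example from OpenCV's minAreaRect and boxPoints, as in Step 5). The sketch is a simplification: it measures Euclidean distance between box centers rather than edge mid-points, and the function names and the `min_pixels` threshold are hypothetical.

```python
import math

def order_points(pts):
    """Order four box corners as top-left, top-right, bottom-right, bottom-left."""
    by_x = sorted(pts, key=lambda p: p[0])
    left = sorted(by_x[:2], key=lambda p: p[1])    # two left-most, top first
    right = sorted(by_x[2:], key=lambda p: p[1])   # two right-most, top first
    return [left[0], right[0], right[1], left[1]]

def midpoint(p, q):
    """Point halfway between p and q."""
    return ((p[0] + q[0]) / 2, (p[1] + q[1]) / 2)

def box_center(pts):
    """Center of a box, taken as the mid-point of its top-left/bottom-right diagonal."""
    tl, tr, br, bl = order_points(pts)
    return midpoint(tl, br)

def pairwise_distances(boxes):
    """Euclidean distance between the centers of consecutive boxes, left to right."""
    centers = sorted((box_center(b) for b in boxes), key=lambda c: c[0])
    return [math.dist(a, b) for a, b in zip(centers, centers[1:])]

def following_distance(boxes, min_pixels):
    """True when every neighbouring pair is at least `min_pixels` apart."""
    return all(d >= min_pixels for d in pairwise_distances(boxes))

# Three upright boxes given as corner lists in image coordinates (y grows downward).
boxes = [[(0, 0), (10, 0), (10, 20), (0, 20)],
         [(40, 0), (50, 0), (50, 20), (40, 20)],
         [(55, 0), (65, 0), (65, 20), (55, 20)]]
gaps = pairwise_distances(boxes)
```

Converting the pixel gaps to physical distance requires a scale, which Step 8 ties to a reference object; that calibration is outside this sketch. The simple corner ordering can also misorder corners of strongly rotated boxes.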

Results and Discussion

The paper discusses the use of deep learning algorithms to address two problems: first, detecting whether people are wearing a mask, and second, predicting whether social distancing is observed, framed as an image segmentation problem. The experimental results validate the performance of HRNetV2 in generating predictions for the social distancing score. The paper compares the computational performance of various algorithms suitable for edge inferencing and the overall accuracy of those algorithms for the problem domain. The benchmarking methodology is based on secondary data collection and experimental runs using existing benchmarking tools. Significant research has been done on object detection and semantic segmentation within deep learning, and models such as HRNET, Yolo, R-CNN, F-CNN and Mask-RCNN (refer to Appendix 1 for nomenclature) have yielded breakthrough results in image and video analytics [44]. Edge devices perform local inferencing, that is, real-time inference on the camera itself. The technology has no transmission delay, and errors can be debugged faster than with the previous method. It therefore becomes essential to evaluate DL models that are computationally inexpensive to train and deploy and that also provide better efficacy for the problem domain. The paper discusses the benchmarks for the MMDetection toolbox presented in Table 3. The MMDetection toolbox is developed by the Multimedia Laboratory at The Chinese University of Hong Kong. The benchmark compares Mask RCNN and RetinaNet to help identify the best deep learning method for faster computing in a GPU-based cloud environment and in edge-based local inferencing.
Table 3

MM detection analysis [45]

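Model | Train (iter/s) | Inf (fps) | Mem (GB) | APbox | APmask
--- | --- | --- | --- | --- | ---
Mask RCNN | 0.43 | 10.8 | 3.8 | 37.4 | 34.3
Mask RCNN | 0.436 | 12.1 | 3.3 | 37.8 | 34.2
Mask RCNN | 0.744 | 8.1 | 8.8 | 37.8 | 34.1
Mask RCNN | 0.646 | 8.8 | 6.7 | 37.1 | 33.7
RetinaNet | 0.285 | 13.1 | 3.4 | 35.8 | -
RetinaNet | 0.275 | 11.1 | 2.7 | 36 | -
RetinaNet | 0.552 | 8.3 | 6.9 | 35.4 | -
RetinaNet | 0.565 | 11.6 | 5.1 | 35.6 | -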
The paper further contrasts the frames-per-second throughput (highest and lowest) of various object detection models (Fig. 14) on the Town Centre dataset and of image segmentation models on the PASCAL VOC dataset. The Visual Object Classes (VOC) dataset is a standardized image dataset used for object recognition problems. The benchmarks of real-time systems on the PASCAL VOC 2007 and 2012 datasets are presented below in Table 4. The analysis provides a closer comparison of the algorithms to be selected and studied further.
Table 4

Performance and speed comparison of models on PASCAL VOC dataset on edge compute

Model | Speed (real-time) | mAP (%) | FPS
--- | --- | --- | ---
YOLO | Yes | 63.4 | 45
Fast YOLO | Yes | 52.7 | 155
YOLO-VGG-16 | No | 66.4 | 21
Fast R-CNN | No | 70.1 | 0.5
Faster R-CNN VGG-16 | No | 73.2 | 7
Faster R-CNN ZF | No | 62.1 | 18
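The speed and accuracy trade-off in Table 4 can be made concrete with a short Python sketch. The mAP and FPS values are copied from the table. The 30 FPS real-time cut-off is an assumption introduced here for illustration (the paper does not state one), although it reproduces the Yes/No column. The names `TABLE_4`, `real_time_models` and `best_real_time` are hypothetical.

```python
# (model, mAP in %, FPS) as listed in Table 4
TABLE_4 = [
    ("YOLO", 63.4, 45.0),
    ("Fast YOLO", 52.7, 155.0),
    ("YOLO-VGG-16", 66.4, 21.0),
    ("Fast R-CNN", 70.1, 0.5),
    ("Faster R-CNN VGG-16", 73.2, 7.0),
    ("Faster R-CNN ZF", 62.1, 18.0),
]


def real_time_models(rows, min_fps=30.0):
    """Keep only the models whose throughput meets the assumed real-time cut-off."""
    return [row for row in rows if row[2] >= min_fps]


def best_real_time(rows, min_fps=30.0):
    """Return the most accurate model among the real-time candidates, or None."""
    candidates = real_time_models(rows, min_fps)
    if not candidates:
        return None
    return max(candidates, key=lambda row: row[1])


if __name__ == "__main__":
    print([name for name, _map, _fps in real_time_models(TABLE_4)])
    print(best_real_time(TABLE_4))
```

Filtering on throughput first and ranking on accuracy second reflects the edge-inference priority described in the text; a deployment with different latency constraints would pass a different `min_fps`.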
Fig. 14

FPS capability of various object detectors

Further to the analysis, below are training comparisons conducted on the Cityscapes dataset. Model training and testing were conducted with input image sizes of 512 × 1024 and 1024 × 2048. The models were initialized with weights pre-trained on ImageNet, for both the small models (Table 5) and the large model (Table 6). The outcome of the analysis is reported as mean Intersection over Union (mIoU). Intersection over Union is an evaluation metric that measures accuracy as the area of overlap between the prediction and the ground truth divided by the area of their union.
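The mIoU metric described above can be sketched as follows. This is a minimal illustration under stated assumptions rather than the evaluation code used for the benchmarks: predictions and ground truth are assumed to be flattened lists of integer class labels, one per pixel, and classes that appear in neither list are skipped when averaging. The function names `per_class_iou` and `mean_iou` are hypothetical.

```python
def per_class_iou(pred, target, num_classes):
    """Intersection over Union for each class, or None if the class never occurs."""
    intersection = [0] * num_classes
    union = [0] * num_classes
    for p, t in zip(pred, target):
        if p == t:
            # Pixel lies in both the predicted and the true region of class p.
            intersection[p] += 1
            union[p] += 1
        else:
            # Pixel lies in the predicted region of p and the true region of t.
            union[p] += 1
            union[t] += 1
    return [
        intersection[c] / union[c] if union[c] else None
        for c in range(num_classes)
    ]


def mean_iou(pred, target, num_classes):
    """Mean IoU over the classes that occur in the prediction or the ground truth."""
    ious = [v for v in per_class_iou(pred, target, num_classes) if v is not None]
    return sum(ious) / len(ious) if ious else 0.0


if __name__ == "__main__":
    # Four pixels, two classes present (0 = background, 1 = person); labels are made up.
    predicted = [0, 0, 1, 1]
    truth = [0, 1, 1, 1]
    print(per_class_iou(predicted, truth, num_classes=3))
    print(mean_iou(predicted, truth, num_classes=3))
```

In the toy example, class 0 overlaps on one pixel out of a union of two and class 1 on two pixels out of three, so the mean is the average of 1/2 and 2/3.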
Table 6

Large model

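Selected model(s) | Number of parameters | Multi-scale | Flip | mIoU
--- | --- | --- | --- | ---
HRNetV2-W48 | 65.8 M | No | No | 80.9
HRNetV2-W48 | 65.8 M | No | No | 81.2
HRNetV2-W48 | 65.8 M | Yes | Yes | 80.5
HRNetV2-W48 | 65.8 M | Yes | Yes | 81.1
HRNetV2-W48 | 65.8 M | Yes | Yes | 81.5
HRNetV2-W48 | 65.8 M | Yes | Yes | 81.9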
The model comparison encompasses small and large models. Table 5 reports mIoU for the small models taken for benchmarking, which have smaller parameter counts (for example, 1.5 M and 3.9 M for the HRNetV2-W18-Small variants), whereas Table 6 reports the comparison for the large model with 65.8 million parameters. The analysis is, however, limited to AI models on a limited dataset, and future empirical research is needed to validate large-scale deployment and adoption across various societal and industrial setups.
Table 5

Small model

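Selected model(s) | Number of parameters | Multi-scale | Flip | Distillation | mIoU
--- | --- | --- | --- | --- | ---
ICNet | - | No | No | No | 70.6
ResNet18 (1.0) | 15.2 | No | No | No | 69.1
ResNet18 (1.0) | 15.2 | No | No | Yes | 72.7
MD (enhanced) | 14.4 | No | No | No | 67.3
MD (enhanced) | 14.4 | No | No | Yes | 71.9
SQ | - | No | No | No | 59.8
CRF-RNN | - | No | No | No | 62.5
Dilation10 | 140.8 | No | No | No | 67.1
MobileNetV2Plus | 8.3 | No | No | No | 70.1
MobileNetV2Plus | 8.3 | No | No | Yes | 74.5
HRNetV2-W18-Small-v1 | 1.5 M | No | No | No | 70.3
HRNetV2-W18-Small-v2 | 3.9 M | No | No | No | 76.2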

Societal Implication

The COVID-19 pandemic has forced government agencies, industrial ecosystems, and educational institutions across the globe to comply with and reinforce standard safety protocols for individuals in order to break the spread of the virus. There is a pressing need to build technology solutions for different setups to reduce the impact of the spread [46]. The study provides relevant technical solutions for policymakers and technology innovators to build systems around the problem and reduce adverse impact in the post-pandemic period. It also provides future directions for adopting new technology systems and helps design frameworks that keep safety as a key aspect when planning return-to-work or return-to-school stages, both of which are currently prohibited due to the surge in cases worldwide.

Conclusion

To summarize, the paper discusses an applicable framework that uses artificial intelligence on edge computing to build an outbreak response system for contact tracing, mask detection, and detection of social distancing from video feeds in surveillance systems. The benchmarks conducted on selected models to gauge performance and accuracy yield a deeper investigational analysis that can help in the overall implementation of such a solution in industrial practice. Overall, YOLO performs best on the object detection task and is fast enough for complex edge inferencing, while HRNetV2 performs best on the semantic segmentation problem applied to social distancing prediction. The study provides directional support to policymakers for new technology adoption, based on gaps identified in the current situation of COVID-19 spread. It discusses various aspects of new technology development from the government's point of view that could be leveraged for contact tracking and tracing in community setups once the lockdown is lifted to resume the economy. Beyond government, the framework would also assist and encourage industry to adopt and build stronger surveillance systems within the workplace for closer adherence to social distancing and physical hygiene, and for thermal screening using thermal cameras, to help combat the spread to a large extent.

Appendix 1

CNN: convolutional neural network
FPGA: field programmable gate array
R-CNN: regions with CNN features
YOLO: you only look once; an object detection system trained on COCO dataset
HRNet: high-resolution networks
mAP: mean average precision
COCO: common objects in context
FPS: frames per second
GPU: graphics processing unit
TP/TN: true positive/true negative
SGD: stochastic gradient descent
DL: deep learning
IoT: internet of things
PASCAL: pattern analysis, statistical modeling and computational learning
VOC: visual object classes