Vinothini Selvaraju1,2, Nicolai Spicher1, Ju Wang1, Nagarajan Ganapathy1, Joana M Warnecke1, Steffen Leonhardt3, Ramakrishnan Swaminathan2, Thomas M Deserno1.
Abstract
In recent years, noncontact measurements of vital signs using cameras have received a great amount of interest. However, some questions remain unanswered: (i) Which vital sign is monitored using what type of camera? (ii) What is the performance and which factors affect it? (iii) Which health issues are addressed by camera-based techniques? Following the preferred reporting items for systematic reviews and meta-analyses (PRISMA) statement, we conducted a systematic review of continuous camera-based vital sign monitoring using the Scopus, PubMed, and Association for Computing Machinery (ACM) databases. We considered articles that were published between January 2018 and April 2021 in the English language. We included five vital signs: heart rate (HR), respiratory rate (RR), blood pressure (BP), body skin temperature (BST), and oxygen saturation (SpO2). In total, we retrieved 905 articles and screened them by title, abstract, and full text. One hundred and four articles remained: 60, 20, 6, 2, and 1 of the articles focus on HR, RR, BP, BST, and SpO2, respectively, and 15 on multiple vital signs. HR and RR can be measured using red, green, and blue (RGB), near-infrared (NIR), and far-infrared (FIR) cameras. So far, BP and SpO2 are monitored with RGB cameras only, whereas BST is derived from FIR cameras only. Under ideal conditions, the root mean squared error is around 2.60 bpm, 2.22 cpm, 6.91 mm Hg, 4.88 mm Hg, and 0.86 °C for HR, RR, systolic BP, diastolic BP, and BST, respectively. The estimated error for SpO2 is less than 1%, but it increases with movements of the subject and with the camera-subject distance. Camera-based remote monitoring mainly explores intensive care, post-anaesthesia care, and sleep monitoring, but also specific diseases such as heart failure. The monitored targets are newborn and pediatric patients, geriatric patients, athletes (e.g., exercising, cycling), and vehicle drivers.
Camera-based techniques monitor HR, RR, and BST in static conditions within acceptable ranges for certain applications. Research gaps remain in large and heterogeneous populations, real-time scenarios, moving subjects, and the accuracy of BP and SpO2 monitoring.
Keywords: blood pressure; body temperature; camera; contactless; continuous monitoring; heart rate; noncontact; oxygen saturation; remote health care; respiratory rate; vital sign
Year: 2022 PMID: 35684717 PMCID: PMC9185528 DOI: 10.3390/s22114097
Source DB: PubMed Journal: Sensors (Basel) ISSN: 1424-8220 Impact factor: 3.847
Figure 1. PRISMA diagram of literature screening.
Number n of articles from the considered years: January 2018 to April 2021.
| Article Year | HR | RR | BP | BST | SpO2 | Multiple | Total |
|---|---|---|---|---|---|---|---|
| 2018 | 25 | 2 | 1 | 1 | 0 | 7 | 36 |
| 2019 | 14 | 9 | 3 | 0 | 0 | 3 | 29 |
| 2020 | 16 | 5 | 2 | 1 | 0 | 3 | 27 |
| 2021 | 5 | 4 | 0 | 0 | 1 | 2 | 12 |
| Total | 60 | 20 | 6 | 2 | 1 | 15 | 104 |
Figure 2. General flow diagram of vital sign measurements from video/image data.
Figure 3. Data acquisition according to study design, hardware, and ground truth.
Figure 4. Distribution of subject information, including obtained data, number of subjects, participant type, and whether skin tone information is available (A) or not available (NA).
Figure 5. Distribution of hardware parameters, namely camera type, frame rate, resolution (r), and camera-subject distance.
Figure 6. Chord diagram of conceptual semantics between camera and vital sign.
Figure 7. Distribution of ground truth.
Existing databases.
| Database | No. of Subjects | Camera Type | Camera Detail | Frame Rate (fps) | Resolution (px × px) | Ground Truth |
|---|---|---|---|---|---|---|
| MAHNOB-HCI | 27 | RGB | Allied Vision Stingray | 60 | 780 × 580 | ECG |
| DEAP | 22 | RGB | Sony DCR-HC27E | 50 | 720 × 576 | PPG, Respiration, EEG, EOG, EMG, GSR, BT |
| FAVIP | 15 | RGB | Samsung galaxy S3 and | 30 | 1280 × 720 | Pulse oximeter |
| UBFC-RPPG | 42 | RGB | Logitech C920 HD pro | 30 | 640 × 480 | Pulse oximeter |
| PURE | 10 | RGB | evo274CVGE | 30 | 640 × 480 | Finger pulse oximeter |
| Pulse from face | 13 | RGB | Nikon D5300 camera | 50 | 1280 × 720 | Two Mio Alpha II wrist heart rate monitors |
| VIPL_HR | 107 | RGB | Logitech C310 | 25 | 960 × 720 | CONTEC CMS60C blood volume pulse recorder |
| | | NIR | Realsense F200 | 30 | 640 × 480 | |
| | | RGB | Realsense F200 | 30 | 1920 × 1080 | |
| | | RGB | HUAWEI P9 smart phone | 30 | 1920 × 1080 | |
| COHFACE | 40 | RGB | Logitech HD C525 | 20 | 640 × 480 | Blood volume pulse sensor, respiratory belt |
| MMSE-HR | 40 | RGB, IR | Di3D dynamic imaging system, FLIR A655sc | 25; 50 | 1040 × 1392; 640 × 480 | Biopac MP150 system (BP, HR) |
| TokyoTech Remote PPG | 9 | RGB, NIR | Prototype RGB-NIR camera | 300 | 640 × 480 | Contact PPG sensor |
| MR-NIRP | 18 | RGB, NIR | FLIR Grasshopper3, Point Grey Grasshopper | 30 | 640 × 640 | Finger pulse oximeter |
Figure 8. Comprehensive workflow from various camera images to the HR vital sign (Note: FF: full frame; VJ: Viola–Jones algorithm; HoG: histogram of oriented gradients; CNN: convolutional neural network; ManualD: manual ROI detection; NoD: no ROI detection; OtherD: other ROI detection; ManualT: manual ROI tracking; KLT: Kanade–Lucas–Tomasi; KCF: kernel correlation filter; featT: feature tracking; OtherT: other ROI tracking; NoT: no ROI tracking; BSS: blind source separation algorithms; otherSP: other signal processing; ML: machine learning; FD: frequency domain; TD: time domain).
Automatic detection of ROI by utilizing various classical and DL methods.
| Algorithms | Description | Advantage | Disadvantage |
|---|---|---|---|
| Viola–Jones | It uses Haar-like features and the AdaBoost algorithm to construct a cascade classifier. | It works well on full, frontal, and well-lit faces. | It struggles with faces in a crowd, rotated, inclined, or angled faces, expression variations, and low image resolution. |
| Histogram of oriented gradients | It constructs features by computing histograms of gradient directions over local areas of the image. | Fast running speed; identifies 68 facial landmark points. | It may be influenced by light intensity and direction, and it may locate feature points inaccurately on profile faces. |
| Multitask cascaded convolutional neural network | It is a convolutional-neural-network (CNN)-based framework, which consists of three stages for joint face detection and alignment. | Accurate face detection, less affected by light intensity and direction. | Its models and computations are complex, which may result in a slow running speed; only five feature points can be tracked. |
| Single shot multibox detector | It is a fast convolutional neural network that detects faces using a single network. | Fast processing speed; a multiscale feature map is adopted. | The robustness of the network for small object detection may not be very high. |
| You only look once (YOLO) | It is a one-stage object detector. | Fastest object detection algorithm. It uses the full image as context information, which makes real-time operation possible. | It requires a machine with a graphics processing unit. It may be relatively sensitive to the scale of the object. |
| Template matching | It matches the image against a base template. | Relatively easy to implement and use. | Not suitable for complex templates, frames without a face, or occluded faces. |
Preprocessing using different filtering methods.
| Filtering | Description |
|---|---|
| Detrending filter | It removes the trend from the signal. |
| Moving average filter | It smooths a signal and suppresses high-frequency noise. |
| Band-pass filter | It eliminates the frequency components outside the bandwidth range. |
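
The three preprocessing steps listed above can be sketched in a few lines of plain Python. This is a minimal illustration, not the implementation used in any of the reviewed studies: the function names, the linear (least-squares) detrending model, the ideal DFT-based band-pass, and the 0.7–4.0 Hz pass band (roughly 42–240 bpm) are all assumptions made for the example.

```python
import cmath
import math


def detrend_linear(signal):
    """Subtract the least-squares straight line from the signal (assumed trend model)."""
    n = len(signal)
    t_mean = (n - 1) / 2
    y_mean = sum(signal) / n
    num = sum((t - t_mean) * (y - y_mean) for t, y in enumerate(signal))
    den = sum((t - t_mean) ** 2 for t in range(n))
    slope = num / den
    return [y - (y_mean + slope * (t - t_mean)) for t, y in enumerate(signal)]


def moving_average(signal, window):
    """Centered moving average; the window shrinks near the edges."""
    half = window // 2
    out = []
    for i in range(len(signal)):
        lo, hi = max(0, i - half), min(len(signal), i + half + 1)
        out.append(sum(signal[lo:hi]) / (hi - lo))
    return out


def bandpass_dft(signal, fs, f_lo, f_hi):
    """Ideal band-pass: zero every DFT bin outside [f_lo, f_hi] Hz.

    Uses a direct O(n^2) DFT, which is fine for short illustrative traces.
    """
    n = len(signal)
    spectrum = [
        sum(x * cmath.exp(-2j * math.pi * k * i / n) for i, x in enumerate(signal))
        for k in range(n)
    ]
    for k in range(n):
        freq = min(k, n - k) * fs / n  # bins k and n-k share the same frequency
        if not (f_lo <= freq <= f_hi):
            spectrum[k] = 0j
    return [
        sum(c * cmath.exp(2j * math.pi * k * i / n) for k, c in enumerate(spectrum)).real / n
        for i in range(n)
    ]


# Synthetic 5 s trace at 30 fps: a 1.2 Hz "pulse" plus 6 Hz noise (invented values).
fs = 30.0
t = [i / fs for i in range(150)]
clean = [math.sin(2 * math.pi * 1.2 * x) for x in t]
raw = [c + 0.5 * math.sin(2 * math.pi * 6.0 * x) for c, x in zip(clean, t)]
filtered = bandpass_dft(raw, fs, 0.7, 4.0)  # the 6 Hz component is removed
```

In practice a Butterworth or FIR band-pass from a signal-processing library would usually replace the ideal DFT mask, which is used here only to keep the sketch dependency-free.
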
Various algorithms for HR extraction.
| Signal Processing Techniques | Characterization |
|---|---|
| ICA | It decomposes the signal and extracts independent components of pulse information from temporal RGB traces. |
| PCA | It uses a statistical technique to obtain uncorrelated components from RGB traces. |
| GREEN | In blood, hemoglobin and oxyhemoglobin absorb light of 520–580 nm, which is in the range of the camera’s green filter. |
| CHROM | It uses a linear combination of chrominance signals under an assumed skin color. It requires a priori knowledge and eliminates motion artifacts, but it may fail if the pulse and specular signals are the same. |
| POS | It projects the RGB-derived signals onto a plane orthogonal to the temporally normalized skin tone component. |
| Spatial subspace rotation | It uses the subspace of skin pixels and rotation measurements to extract cardiac pulse information, but it may require a complete, continuous sequence of camera frames to recover the pulse wave. |
| Kernel density ICA | It does not require prior assumptions about the probability distributions of the hidden sources; it is a so-called semi-BSS method. |
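
To make the POS and GREEN rows more concrete, the sketch below applies both to a synthetic RGB trace and reads the HR from the strongest spectral peak. It is a hedged illustration only: the sliding-window projection follows the commonly published form of POS (projection matrix rows `[0, 1, -1]` and `[-2, 1, 1]`, then `S1 + (σ(S1)/σ(S2))·S2` with overlap-adding), while the function names, the 1.6 s window, the 0.7–4.0 Hz search band, and every number in the synthetic trace are assumptions invented for the example.

```python
import math
import statistics


def pos_pulse(rgb, fs, window_s=1.6):
    """Plane-orthogonal-to-skin (POS) pulse signal from mean RGB values per frame.

    rgb: list of (r, g, b) tuples, one per frame, averaged over the skin ROI.
    """
    n = len(rgb)
    length = max(2, int(round(window_s * fs)))
    pulse = [0.0] * n
    for start in range(n - length + 1):
        win = rgb[start:start + length]
        # Temporal normalization: divide each channel by its mean over the window.
        means = [sum(px[c] for px in win) / length for c in range(3)]
        cn = [[px[c] / means[c] for c in range(3)] for px in win]
        # Projection onto the plane orthogonal to the normalized skin tone (1, 1, 1).
        s1 = [g - b for _, g, b in cn]
        s2 = [-2 * r + g + b for r, g, b in cn]
        sd2 = statistics.pstdev(s2)
        alpha = statistics.pstdev(s1) / sd2 if sd2 > 0 else 0.0
        h = [a + alpha * b for a, b in zip(s1, s2)]
        h_mean = sum(h) / length
        for i in range(length):  # overlap-add the zero-mean window result
            pulse[start + i] += h[i] - h_mean
    return pulse


def dominant_bpm(signal, fs, f_lo=0.7, f_hi=4.0):
    """Frequency-domain HR estimate: strongest DFT bin inside [f_lo, f_hi] Hz."""
    n = len(signal)
    best_k, best_power = None, -1.0
    for k in range(1, n // 2 + 1):
        if f_lo <= k * fs / n <= f_hi:
            re = sum(x * math.cos(2 * math.pi * k * i / n) for i, x in enumerate(signal))
            im = sum(x * math.sin(2 * math.pi * k * i / n) for i, x in enumerate(signal))
            if re * re + im * im > best_power:
                best_k, best_power = k, re * re + im * im
    if best_k is None:
        raise ValueError("no DFT bin falls inside the requested band")
    return 60.0 * best_k * fs / n


# Synthetic 10 s trace at 30 fps (all values invented): a 1.2 Hz pulse (72 bpm) with
# channel-specific strength, plus a 2.0 Hz intensity change common to all channels.
fs, n = 30.0, 300
base, pulse_amp = (150.0, 100.0, 80.0), (0.003, 0.008, 0.005)
rgb = []
for i in range(n):
    t = i / fs
    p = math.sin(2 * math.pi * 1.2 * t)
    m = 0.05 * math.sin(2 * math.pi * 2.0 * t)
    rgb.append(tuple(b * (1 + m) * (1 + a * p) for b, a in zip(base, pulse_amp)))

pos_bpm = dominant_bpm(pos_pulse(rgb, fs), fs)            # POS estimate
green_bpm = dominant_bpm([px[1] for px in rgb], fs)       # GREEN estimate
```

On this deliberately adversarial trace the green channel alone locks onto the 2.0 Hz intensity disturbance, whereas the POS projection cancels changes that are common to all channels; real recordings are far less clean, so the contrast should be read as an illustration of the principle only.
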
Studies measuring RR using camera-based approaches.
| Ref. | Camera Type | Camera Details | Frame Rate (fps) | Resolution (px × px) | ROI | Ground Truth | Results |
|---|---|---|---|---|---|---|---|
| [ | FIR | Infratec VarioCAM HD head | 30 | 1024 × 768 | Nose | Philips IntelliVue MP30 monitor | Correlation coefficient (CC): 0.607 upon arrival, 0.849 upon discharge |
| [ | FIR | Optris PI 450 | 80 | 382 × 288 | Nose | Manual counting | CC: near distance: 0.960; far distance: 0.508 |
| [ | FIR | InfraTec VarioCAM HD head | 30 | 1024 × 768 | Full frame and split into sub-ROI | Adults: Respiratory belt, Infants: Dräger M540 patient monitor | Root mean square error (RMSE): healthy adults: (sit still: 0.31 ± 0.09 cpm, stimulated breathing: 3.27 ± 0.72 cpm), infants: 4.15 ± 1.44 cpm |
| [ | FIR | FLIR SC3000 | 30 | 320 × 240 | Nostril area and mouth, nose and cheeks enclosed | Subject finger flexion (upward–inhalation, downward–exhalation) | RMSE: 3.40 cpm |
| [ | FIR | Optris PI-450 | 27 | 382 × 288 | Nose | Manual | RMSE: stay still: 3.81 cpm, moving: 6.20 cpm |
| [ | FIR | FLIR Lepton 2.5, | 8.7 | 60 × 80; | Full frame | Philips MX700 patient monitor | Mean absolute error (MAE): 2.07 cpm |
| [ | FIR | Seek Thermal Compact PRO for iPhone | 17 | 640 × 480 | Highest temperature point and around it | Respiration belt | RMSE: 1.82 ± 0.75 cpm |
| [ | FIR | FLIR T450sc | 30 | - | Nose | GE healthcare patient monitor, visual inspection | CC: 0.95 |
| [ | FIR | Infratec ImageIR 9300 | 50 | 1024 × 768 | Nose | piezo plethysmography, | RMSE: 0.71 ± 0.30 cpm |
| [ | FIR | FLIR T-420 | 10 | 320 × 240 | Nostril | Respiratory volume monitor | CC: 0.86 before sedation |
| [ | RGB; | Dual camera DFK23U618; | 15 | 640 × 480; | Nose and mouth | Respiration effort belt | CC: 0.87; RMSE: 1.73 cpm |
| [ | RGB; | Dual camera DFK23U618; | 15 | 640 × 480; | Nasal area | Respiratory effort belt | RMSE: standing: 1.44 cpm, seated position with body movement: 2.52 cpm |
| [ | RGB, FIR | MAG62 thermal imager | 10 | 640 × 480 | Nostril region | Sleep respiratory monitor | Coefficient of determination: 0.905 |
| [ | NIR; | NIR: see3cam_CU40, | 15; 8.7 | 336 × 190, 160 × 120 | Chest, Nostril | Respiratory belt | RMSE: 4.44 cpm |
| [ | RGB; | RGB: IDS UI-2220SE; | 20; 8.7 | 576 × 768; | Full frame | Philips patient monitor | MAE: 5.36 cpm |
| [ | RGB | Point Grey Flea 3 GigE | - | 648 × 488 | Chest | Polysomnography | Mean error: non magnified: 0.874 cpm; magnified: 0.67 cpm |
| [ | RGB | IP camera | 10 | 320 × 180 | Full frame | ECG impedance pneumography | CC: 0.948; RMSE: 6.36 cpm |
| [ | RGB | IDS uEye-2220 | 20 | - | Torso | Capnography | All clothing styles and respiratory patterns (CC: 0.90–1.00) except winter coat-slow-deep scenario (CC: 0.84) |
| [ | RGB | Digital camera | 24/30 | 1920 × 1080 / 1280 × 720 | Abdominal area | Dräger NICU monitor | CC: 0.86 |
| [ | RGB | IDS UI-3160CP | 120 | 1920 × 1080 | Face | Upper chest signal | Error: −0.25 to 0.5 cpm |
| [ | NIR | Point Grey Firefly MV USB 2.0 | 30 | 640 × 480 | Full frame | PSG, ECG, Inductance plethysmography | CC: 0.80; RMSE: 2.10 ± 1.64 cpm; MAE: 0.82 ± 0.89 cpm |
| [ | RGB | Nikon D610, D5300 | 30 | 1920 × 1080 | Abdominal area | Philips IntelliVue monitor | Limits of agreement: −22 to 23.6 cpm |
| [ | RGB | CCD camera | 30 | 1280 × 720 | Jugular notch | Differential digital pressure sensor | MAE: 0.39 cpm; Limits of agreement: (slim fit: ±0.98 cpm, loose fit: ±1.07 cpm) |
| [ | RGB | Smartphone LG G2 | 30 | - | Forehead | Visual inspection, Pulse oximeter, Heart rate monitor | RMSE: hue: 3.88 cpm; Green: 5.68 cpm |
| [ | RGB | Logitech C922/ | 60 | 1280 × 720/ 659 × 494 | Forehead, nose, cheeks | Respiratory belt | Relative error < 2% and inter quartile range < 5% |
| [ | NIR | Monochromatic infrared camera | 62 | 640 × 240 | Neck area with chin and upper chest | Chest belt | CC: 0.99; RMSE: 0.70 cpm |
| [ | RGB | JAI 3-CCD AT-200CL | 20 | 1620 × 1236 | Skin | Philips patient monitor | MAE: 3.5 cpm |
| [ | NIR | Thermal imager MAG62, Avigilon H4 HD Dome | - | 640 × 480 | Chest | Manual | Coefficient of determination: dataset 1: 0.92, dataset 2: 0.87 |
| [ | RGB | Smartphone Galaxy S9+ | 240 | 1920 × 1080 | Abdomen or waist area | Manual counting | Accuracy 99.09% |
| [ | RGB | Canon camera | - | - | Cheeks enclosed | PPG sensor, Respiratory belt | RMSE: 2.16 cpm |
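
The Results column above reports agreement with the ground truth as RMSE, MAE, correlation coefficient (CC), or limits of agreement. As a reminder of how these relate, the sketch below computes all four for a pair of rate series. The function names and the four example values are hypothetical; the limits of agreement follow the usual Bland-Altman convention (bias ± 1.96 × sample standard deviation of the differences), which is an assumption about how the cited studies computed them.

```python
import math


def rmse(est, ref):
    """Root mean square error between estimated and reference rates."""
    return math.sqrt(sum((e - r) ** 2 for e, r in zip(est, ref)) / len(est))


def mae(est, ref):
    """Mean absolute error between estimated and reference rates."""
    return sum(abs(e - r) for e, r in zip(est, ref)) / len(est)


def pearson_cc(est, ref):
    """Pearson correlation coefficient (CC)."""
    n = len(est)
    me, mr = sum(est) / n, sum(ref) / n
    cov = sum((e - me) * (r - mr) for e, r in zip(est, ref))
    var_e = sum((e - me) ** 2 for e in est)
    var_r = sum((r - mr) ** 2 for r in ref)
    return cov / math.sqrt(var_e * var_r)


def limits_of_agreement(est, ref):
    """Bland-Altman limits: bias -/+ 1.96 * sample SD of the paired differences."""
    diffs = [e - r for e, r in zip(est, ref)]
    n = len(diffs)
    bias = sum(diffs) / n
    sd = math.sqrt(sum((d - bias) ** 2 for d in diffs) / (n - 1))
    return bias - 1.96 * sd, bias + 1.96 * sd


# Hypothetical respiratory rates in cpm: camera estimate vs. reference belt.
camera = [14.0, 16.0, 18.0, 20.0]
belt = [15.0, 15.0, 19.0, 19.0]
print(rmse(camera, belt), mae(camera, belt))
print(pearson_cc(camera, belt), limits_of_agreement(camera, belt))
```

Because the metrics answer different questions (average error size, linear association, and the spread of individual errors), results reported with different metrics in the table are not directly comparable.
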
Figure 9. RMSE of HR (a) in static and dynamic conditions and (b) with varying distance between the subject and the camera.
Figure 10. RMSE of RR estimation with varying distance.
Figure 11. Factors affecting vital sign performance.
Figure 12. Framework of the main research gaps and future directions. Red boxes represent research gaps, and green boxes represent future research directions.