Yudai Takahashi1,2, Yi Gu1, Takaaki Nakada3, Ryuzo Abe3, Toshiya Nakaguchi4.
Abstract
Respiration is a key vital sign used to monitor human health status. Monitoring respiratory rate (RR) under non-contact is particularly important for providing appropriate pre-hospital care in emergencies. We propose an RR estimation system using thermal imaging cameras, which are increasingly being used in the medical field, such as recently during the COVID-19 pandemic. By measuring temperature changes during exhalation and inhalation, we aim to track the respiration of the subject in a supine or seated position in real-time without any physical contact. The proposed method automatically selects the respiration-related regions from the detected facial regions and estimates the respiration rate. Most existing methods rely on signals from nostrils and require close-up or high-resolution images, while our method only requires the facial region to be captured. Facial region is detected using YOLO v3, an object detection model based on deep learning. The detected facial region is divided into subregions. By calculating the respiratory likelihood of each segmented region using the newly proposed index, called the Respiratory Quality Index, the respiratory region is automatically selected and the RR is estimated. An evaluation of the proposed RR estimation method was conducted on seven subjects in their early twenties, with four 15 s measurements being taken. The results showed a mean absolute error of 0.66 bpm. The proposed method can be useful as an RR estimation method.Entities:
Keywords: deep learning; likelihood index; object detection; signal processing; thermal imaging; vital sign measurement
Year: 2021 PMID: 34199084 PMCID: PMC8271612 DOI: 10.3390/s21134406
Source DB: PubMed Journal: Sensors (Basel) ISSN: 1424-8220 Impact factor: 3.576
Figure 1 Overview of the proposed method: First, a rectangular face region is detected by YOLO v3 [11]. The detected face region is divided into 4 × 6 subregions. A signal intensity is extracted from each subregion and signal processing is applied. By calculating the respiratory likelihood of each subregion using the newly proposed Respiratory Quality Index (RQI), the respiration-related region is automatically selected and the RR is estimated.
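The subregion step in this overview can be illustrated with a minimal sketch (not the authors' code): it splits a detected face box into 4 × 6 patches and extracts one mean-intensity signal per patch over all frames. The array layout and the (x, y, w, h) box format are assumptions.

```python
import numpy as np

def extract_subregion_signals(frames, face_box, rows=4, cols=6):
    """Split the detected face box into rows x cols subregions and
    return the mean pixel intensity of each subregion per frame.

    frames   : (T, H, W) array of thermal frames
    face_box : (x, y, w, h) rectangle from the face detector
    returns  : (rows * cols, T) array of intensity signals
    """
    x, y, w, h = face_box
    sh, sw = h // rows, w // cols          # subregion height/width in pixels
    signals = np.empty((rows * cols, len(frames)))
    for t, frame in enumerate(frames):
        face = frame[y:y + h, x:x + w]
        for r in range(rows):
            for c in range(cols):
                patch = face[r * sh:(r + 1) * sh, c * sw:(c + 1) * sw]
                signals[r * cols + c, t] = patch.mean()
    return signals
```

Each of the 24 resulting signals would then be processed and scored with the RQI to pick the respiration-related region.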
Figure 2 YOLOv3 architecture.
Figure 3 Respiration-related signal: the signal extracted from a subregion near the nose and its spectrum.
Figure 4 Noisy signal: the signal extracted from a subregion far from the nose and its spectrum.
Figure 5 Verification of the likelihood index: (a) shows signals from regions where the RR estimation error is lower than 1 bpm; (b) shows signals from regions where it is higher than 1 bpm. The x-axis represents the difference [bpm] between the respiratory rates estimated in the time and frequency domains; the y-axis represents the Spectrum Index.
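The x-axis of Figure 5 compares time-domain and frequency-domain RR estimates. A minimal sketch of two such estimators (an illustration, not the paper's exact processing) could count rising zero crossings in the time domain and pick the dominant FFT peak in a plausible respiratory band in the frequency domain; the band limits and sampling rate here are assumptions.

```python
import numpy as np

def rr_time_domain(signal, fs):
    """Estimate RR [bpm] by counting rising zero crossings of the
    mean-removed signal over the measurement window."""
    s = signal - signal.mean()
    crossings = np.sum((s[:-1] < 0) & (s[1:] >= 0))
    return crossings / (len(signal) / fs) * 60.0

def rr_freq_domain(signal, fs, band=(0.1, 0.7)):
    """Estimate RR [bpm] from the dominant FFT peak inside an assumed
    respiratory frequency band (here 0.1-0.7 Hz, i.e. 6-42 bpm)."""
    s = signal - signal.mean()
    freqs = np.fft.rfftfreq(len(s), d=1.0 / fs)
    mag = np.abs(np.fft.rfft(s))
    mask = (freqs >= band[0]) & (freqs <= band[1])
    peak = freqs[mask][np.argmax(mag[mask])]
    return peak * 60.0
```

When both estimates agree (a small difference on Figure 5's x-axis), the region is more likely to carry a genuine respiratory signal.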
Figure 6 Training data: images obtained from the thermal imaging camera for YOLOv3 training, taken in seated (a) and supine (b) positions.
Detection results: IoU0.70, IoU0.75, and IoU0.80 indicate the fraction of detections whose IoU with the ground truth exceeds the corresponding threshold; for example, IoU0.70 is the fraction of detections with IoU greater than 0.70. Average IoU is the mean IoU over all detections.
| Subjects | IoU0.70 | IoU0.75 | IoU0.80 | Average IoU |
|---|---|---|---|---|
| 1 | 1.00 | 0.89 | 0.56 | 0.80 |
| 2 | 1.00 | 0.90 | 0.30 | 0.79 |
| 3 | 1.00 | 1.00 | 1.00 | 0.89 |
| 4 | 1.00 | 1.00 | 1.00 | 0.89 |
| 5 | 1.00 | 1.00 | 1.00 | 0.88 |
| 6 | 1.00 | 1.00 | 1.00 | 0.91 |
| 7 | 1.00 | 0.97 | 0.81 | 0.85 |
| Mean | 1.00 | 0.97 | 0.83 | 0.86 |
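The IoU values in the table above follow the standard intersection-over-union definition for bounding boxes. A minimal sketch, assuming (x, y, w, h) boxes:

```python
def iou(box_a, box_b):
    """Intersection-over-union of two (x, y, w, h) axis-aligned boxes."""
    ax1, ay1, aw, ah = box_a
    bx1, by1, bw, bh = box_b
    ax2, ay2 = ax1 + aw, ay1 + ah
    bx2, by2 = bx1 + bw, by1 + bh
    # Overlap extents, clamped to zero when the boxes do not intersect
    ix = max(0, min(ax2, bx2) - max(ax1, bx1))
    iy = max(0, min(ay2, by2) - max(ay1, by1))
    inter = ix * iy
    union = aw * ah + bw * bh - inter
    return inter / union if union else 0.0
```

For instance, two identical boxes give an IoU of 1.0, and a box shifted by half its width against itself gives 1/3.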
RR estimation result: This shows the MAE of the RR between the prediction and the ground truth.
| Subject | 1 | 2 | 3 | 4 | 5 | 6 | 7 | Mean |
|---|---|---|---|---|---|---|---|---|
| MAE [bpm] | 0.33 | 0.45 | 0.29 | 1.60 | 0.50 | 1.10 | 0.42 | 0.66 |
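The per-subject error metric in the table is the standard mean absolute error between predicted and ground-truth RR. A minimal sketch:

```python
import numpy as np

def mae(pred, truth):
    """Mean absolute error between predicted and ground-truth RR [bpm]."""
    pred, truth = np.asarray(pred, float), np.asarray(truth, float)
    return float(np.mean(np.abs(pred - truth)))
```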
Figure 7 Bland–Altman plot of RR estimation: the x-axis shows the mean of the predicted and ground-truth RR, and the y-axis shows their difference (predicted minus ground truth). The bias is 0.19 bpm and the 95% limits of agreement range from −1.9 to 2.3 bpm.
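The bias and 95% limits of agreement reported for Figure 7 follow the usual Bland–Altman construction: the mean difference, plus or minus 1.96 standard deviations of the differences. A minimal sketch (not the authors' analysis code):

```python
import numpy as np

def bland_altman(pred, truth):
    """Return (bias, lower LoA, upper LoA) for paired measurements.
    Limits of agreement are bias +/- 1.96 * SD of the differences."""
    pred, truth = np.asarray(pred, float), np.asarray(truth, float)
    diff = pred - truth
    bias = float(diff.mean())
    sd = float(diff.std(ddof=1))   # sample SD of the differences
    return bias, bias - 1.96 * sd, bias + 1.96 * sd
```

Applied to the predicted and ground-truth RR pairs, this yields the reported bias of 0.19 bpm and limits of −1.9 to 2.3 bpm.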