
Evaluation of Pulmonary Edema Using Ultrasound Imaging in Patients With COVID-19 Pneumonia Based on a Non-local Channel Attention ResNet.

Qinghua Huang1, Ye Lei1, Wenyu Xing2, Chao He3, Gaofeng Wei4, Zhaoji Miao1, Yifan Hao1, Guannan Li5, Yan Wang5, Qingli Li5, Xuelong Li1, Wenfang Li3, Jiangang Chen6.   

Abstract

Recent research has revealed that COVID-19 pneumonia is often accompanied by pulmonary edema. Pulmonary edema is a manifestation of acute lung injury (ALI), and may progress to hypoxemia and potentially acute respiratory distress syndrome (ARDS), which have higher mortality. Precise classification of the degree of pulmonary edema in patients is of great significance in choosing a treatment plan and improving the chance of survival. Here we propose a deep learning neural network named Non-local Channel Attention ResNet to analyze the lung ultrasound images and automatically score the degree of pulmonary edema of patients with COVID-19 pneumonia. The proposed method was designed by combining the ResNet with the non-local module and the channel attention mechanism. The non-local module was used to extract the information on characteristics of A-lines and B-lines, on the basis of which the degree of pulmonary edema could be defined. The channel attention mechanism was used to assign weights to decisive channels. The data set contains 2220 lung ultrasound images provided by Huoshenshan Hospital, Wuhan, China, of which 2062 effective images with accurate scores assigned by two experienced clinicians were used in the experiment. The experimental results indicated that our method achieved high accuracy in classifying the degree of pulmonary edema in patients with COVID-19 pneumonia by comparison with previous deep learning methods, indicating its potential to monitor patients with COVID-19 pneumonia.
Copyright © 2022. Published by Elsevier Inc.

Keywords:  COVID-19; Deep learning; Lung ultrasound; Pulmonary edema

Year:  2022        PMID: 35277285      PMCID: PMC8818339          DOI: 10.1016/j.ultrasmedbio.2022.01.023

Source DB:  PubMed          Journal:  Ultrasound Med Biol        ISSN: 0301-5629            Impact factor:   2.998


Introduction

The worldwide outbreak of COVID-19 pneumonia (PN) has rapidly become a major concern. As an infectious disease, COVID-19 PN is highly contagious, has a rapid onset and presents with symptoms such as fever, dry cough and shortness of breath. As of September 24, 2021, the total number of patients with COVID-19 PN had risen sharply to 231,410,702, with 4,742,994 (2.0%) deaths worldwide (https://sa.sogou.com/new-weball/page/sgs/epidemic?type_page=VR). At the same time, some researchers (Roy et al. 2020; Salehi et al. 2020) have started to investigate solutions for the assisted diagnosis of lung diseases and have achieved valuable results. Relevant data reveal that coronavirus-associated pneumonia is often accompanied by excessive lung water and pulmonary edema (Jin et al. 2020). Lung edema is a manifestation of acute lung injury (ALI) and may progress to hypoxemia and potentially acute respiratory distress syndrome (ARDS), which have higher mortality (Li et al. 2020). Accurate classification of the degree of pulmonary edema in patients provides meaningful guidance in formulating a treatment plan. For example, in the early stages of pulmonary edema, treatment with systemic and/or local glucocorticoids might be helpful in alleviating pulmonary inflammation and edema, which may reduce the development and/or consequences of ARDS (Li et al. 2020). Because of the sudden increase in COVID-19 cases, medical resources have been rapidly depleted in many regions. Therefore, promptly and accurately making an appropriate decision on subsequent treatment is crucial to saving more lives. Lung ultrasound (LUS) is useful for the management of patients with COVID-19 PN (Peng et al. 2020; Poggiali et al. 2020; Soldati et al. 2020), from diagnosis to monitoring and follow-up. Compared with computed tomography (CT) and magnetic resonance imaging (MRI), LUS is cheaper and more convenient to use.
More importantly, it can provide much faster imaging, which makes it more suitable in intensive care and emergency situations (Saraogi 2015; Francisco et al. 2016; Alzahrani et al. 2017; Mayo et al. 2019) and, hence, as a bedside tool for clinical monitoring at any time. A lung ultrasound score (LUSS) has been proposed for the semi-quantitative assessment of pulmonary edema (Noble et al. 2009; Corradi et al. 2013; Picano and Pellikka 2016; Gattupalli et al. 2019). However, because the score relies on visual assessment of the distributions of A-lines and B-lines (e.g., the maximum number of B-lines or the visual percentage of lung area occupied by B-lines), different clinicians may assign different LUSSs to the same LUS image. In addition, accurate assignment of the LUSS depends greatly on the experience of the radiologist, and experienced clinicians are in extremely short supply during outbreaks of COVID-19 PN. In recent years, a few computer-aided methods have been developed for quantitative analysis of the distributions of A-lines and B-lines to determine the LUSS (Brattain et al. 2013; Corradi et al. 2013, 2015, 2016, 2020; Brusasco et al. 2019). In one pioneering attempt, Brattain et al. (2013) proposed five features and a threshold method to detect whether B-lines are present in a frame. An image frame was scored by summing the number of detections in that frame and then applying thresholds to the sum to map it to a B-line score. Brusasco et al. (2019) developed a K-means-based image segmentation method for automated detection of B-line areas and then calculated the percentage of B-line areas in LUS images. Corradi et al. (2013, 2016) proposed a quantitative method to analyze LUS images based on the frequency distribution of gray scales in the region of interest (ROI). Compared with the physician's observation, the aforementioned computer-aided diagnosis methods are often more objective and faster and can reduce observer bias.
However, these methods depend on manually extracted features and need to maximize the resolution of images and select the ROI, introducing additional workload in clinical use. Recently, deep-learning-based methods that detect and locate B-lines to evaluate lung ultrasound have been emerging (van Sloun and Demi 2019), but these methods focus only on detecting the number or position of B-lines instead of grading the degree of pulmonary edema. Unlike the studies mentioned above, we skipped the detection of B-lines and the calculation of the number of B-lines in this study. Instead, as deep learning models often have excellent capability for feature extraction and representation, we proposed a non-local channel attention ResNet architecture (NCA-ResNet)–based automated LUS scoring system following the LUSS criteria (Li et al. 2018). Non-local channel attention (NCA) comprises a non-local module, which extracts the potential information on characteristics of A-lines and B-lines, and a channel attention mechanism, which emphasizes the channels of higher importance. Compared with the traditional methods mentioned above, our model was developed based on an end-to-end architecture and can directly output the degree of pulmonary edema defined by the LUSS criteria (Li et al. 2018). Therefore, our model can be used to monitor the degree of pulmonary edema in patients with COVID-19 PN, thus facilitating clinicians’ adoption of appropriate treatments for the patients.

Methods

In this study, 31 patients with COVID-19 confirmed by a positive reverse transcription polymerase chain reaction (RT-PCR) test (age: 55 ± 21 y, men: 19, women: 12) who were admitted to the intensive care unit (ICU) of Huoshenshan Hospital from February 23, 2020, to April 2, 2020, were recruited. According to the diagnosis and treatment standard for COVID-19 PN (National Health Commission & State Administration of Traditional Chinese Medicine 2020), these patients were classified into four conditions: critical (10 cases, 32.3%), severe (9 cases, 29.0%), common (7 cases, 22.6%) and mild (5 cases, 16.1%). Chest CT (uCT760, United Imaging, Los Angeles, CA, USA) revealed bilateral ground-glass opacities with peripheral, posterior and basal predominance, which was in line with the international agreement. All patients underwent LUS examinations for 12 standard fields on both hemithoraces, including the upper and lower halves of the anterior, lateral and posterior fields (Soummer et al. 2012). The ultrasound equipment (LOGIQ e, GE Healthcare, Wauwatosa, WI, USA) was used with a curved array low-frequency transducer (1–5 MHz) with an image depth of 15 cm, focal depth of 7.5 cm, mechanical index (MI) of 1.2 and thermal index (TI) of 0.7, and the operation mode was penetration. These parameters were all safe for the patients. Because images collected from the 12 different regions of a patient can differ considerably, lung edema scores were given per region (one image standing for one region) in this experiment, and only images with scores were used to train the model. In clinical use, following clinical diagnostic principles, the maximum score across the 12 regions is taken as the final diagnostic result; that is, the highest score is taken as the severity of the patient.
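The patient-level rule described above (one score per region, highest regional score taken as the severity) can be sketched in a few lines of Python. This is only an illustration: the function name `patient_severity` and the example scores are hypothetical and do not come from the study, and the 0 to 3 score range follows the LUSS grading used in this article.

```python
def patient_severity(region_scores):
    """Return the patient-level severity as the maximum regional score."""
    if len(region_scores) != 12:
        raise ValueError("expected one score for each of the 12 regions")
    if any(score not in (0, 1, 2, 3) for score in region_scores):
        raise ValueError("scores must be integers from 0 to 3")
    return max(region_scores)


# Hypothetical scores for the 12 standard fields of one patient.
example_scores = [0, 1, 1, 0, 2, 1, 0, 0, 3, 2, 1, 1]
severity = patient_severity(example_scores)
```

Taking the maximum means that a single severely affected region determines the patient-level result, regardless of how many regions appear normal.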
Repeated examinations were performed for patients whose conditions changed during treatment, and finally a total of 2220 LUS images were selected, about 60–80 images per patient. In total, 1860 LUS images collected from 25 patients (critical: 8, severe: 7, common: 6, mild: 4) were used for training the scoring model, and 360 LUS images collected from another 6 patients (critical: 2, severe: 2, common: 1, mild: 1) were used to test the model. To avoid high similarity between adjacent LUS images of a cine loop, images were selected every 30 frames; their similarity, calculated using cross-correlation, was less than 0.9 (Chen et al. 2021). Two clinicians, each with more than 6 y of experience in using LUS and blinded to the clinical background, scored these 2220 LUS images according to the LUSS criteria (Li et al. 2018). If the two clinicians assigned the same score to an LUS image, that score was assigned to the image. Otherwise, the LUS image was removed from the data set. According to their evaluation, the degree of pulmonary edema in each LUS image was graded from 0 to 3. In the clinic, we scored the LUS images collected from the 12 different regions separately and took the highest score as the severity. The four scores corresponded to normal lung, septal syndrome, interstitial-alveolar syndrome and white lung, respectively. Figure 1 shows four typical LUS images with scores of 0, 1, 2 and 3, respectively. After exclusion of the images that were assigned different scores by the two clinicians, 2062 scored images (training set: 1735, testing set: 327) were used for this study. The proportions of the four degrees of pulmonary edema (scores 0, 1, 2, 3) in the training set were 18.6% (323 images), 25.2% (437 images), 27.7% (480 images) and 28.5% (495 images), respectively. The proportions for the counterparts in the testing set were 17.1% (56 images), 18.1% (59 images), 31.2% (102 images) and 33.6% (110 images), respectively.
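The labeling rule and the class proportions reported above can be illustrated with a minimal Python sketch. The helper names `consensus_labels` and `class_proportions`, as well as the image identifiers and rater scores, are hypothetical; only the training-set counts (323, 437, 480 and 495) are taken from the text.

```python
def consensus_labels(scores_a, scores_b):
    """Keep only images on which both raters agree; return {image_id: score}."""
    return {
        image_id: score
        for image_id, score in scores_a.items()
        if scores_b.get(image_id) == score
    }


def class_proportions(counts):
    """Convert per-class image counts into percentages rounded to 0.1."""
    total = sum(counts.values())
    return {label: round(100 * n / total, 1) for label, n in counts.items()}


# Hypothetical ratings for four images (identifiers are illustrative).
rater_a = {"img1": 0, "img2": 2, "img3": 3, "img4": 1}
rater_b = {"img1": 0, "img2": 1, "img3": 3, "img4": 1}
kept = consensus_labels(rater_a, rater_b)  # img2 is dropped (scores differ)

# Training-set counts reported in the text for scores 0 to 3.
train_counts = {0: 323, 1: 437, 2: 480, 3: 495}
train_props = class_proportions(train_counts)
```

Under this rule, disagreement removes an image instead of being resolved by a third reader, which is how the 2220 selected images were reduced to 2062.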
The data and codes will be provided on the website in the future (https://bio-hsi.ecnu.edu.cn/). This study was approved by the ethics committee of Huoshenshan Hospital, Wuhan, China (Approval No. HSSLL030). All patients or their family members provided written informed consent.
Fig. 1

Typical ultrasound images corresponding to scores 0 (a), 1 (b), 2 (c) and 3 (d), respectively. (a) Normal lung. B-lines are absent and A-lines are present. (b) Septal syndrome. B-lines are about 7 mm apart, corresponding to subpleural septa. (c) Interstitial-alveolar syndrome. B-lines are confluent. (d) White lung. B-lines have coalesced, resulting in an echographic lung field that is almost completely white.

In the work described here, we used a deep-learning–based classification model to evaluate the LUSS described above. Our model is based on residual structures (He et al. 2016). Residual structures can effectively alleviate the vanishing gradient problem and improve classification performance; thus, they are widely used in computer vision tasks. Although stacking convolution operations (Fukushima and Miyake 1982; LeCun et al. 1989) can expand the receptive field, the receptive field of each operation remains fixed, so it is difficult for traditional residual structures to extract the dependencies between distant pixels, which are essential for classification of the degree of pulmonary edema, as illustrated in Figure 2. Moreover, traditional residual structures assign the same weights to all channels; therefore, they cannot emphasize specific key channels. To solve these two problems, our model is further augmented by adding a non-local module and a channel attention mechanism, respectively. Figure 3 illustrates the main architecture with the non-local and channel attention modules proposed in this article.
Fig. 2

A-lines (a) and B-lines (b) in lung ultrasound (LUS) images. The curved red lines in (a) represent A-lines, and the straight red lines in (b) represent B-lines. The yellow rays indicate the potential dependencies between the pixels represented by blue points in image. Long-range dependencies contain the information of A-lines and B-lines; that is, the points corresponding to these two red curves have potential dependencies that can help the network to identify the appearance of A-lines. The dependencies between points 1 and 5 can help the network to identify the appearance of B-lines. The dependencies between point 1 and points 2–4 correspond to the density of B-lines, indicating the degree of pulmonary edema.

Fig. 3

Framework of our method. Conv = convolution operation; BN = batch normalization; NCB = non-local module.


Non-local module

We used the non-local module proposed by Wang et al. (2018) to capture long-distance dependencies, which are essential for classifying the degree of pulmonary edema, thus improving the classification accuracy. Wang et al. designed a generic non-local operation that can be defined as

$$y_i = \frac{1}{\mathcal{C}(x)} \sum_{\forall j} f(x_i, x_j)\, g(x_j) \tag{1}$$

where $x$ and $y$ are the input and output, respectively; $i$ and $j$ index spatial positions of the input; $x_i$ is a vector with the same dimension as the channel number of the input $x$; $f$ is a function for calculating the similarity relationship between any two points; $g$ is a mapping function that maps a point to a number, which can be regarded as calculating the characteristics of one point; and $\mathcal{C}(x)$ is a normalization factor. The non-local module can be easily implemented, as illustrated in Figure 4. The input feature map is denoted as $X \in \mathbb{R}^{H \times W \times C}$, where $H$, $W$ and $C$ are the height, width and number of channels of the feature map. For ease of explanation, we let $C = 1$. To implement the function $f$, we apply two parallel 1 × 1 convolution operations; as a result, we obtain two feature maps, $\theta(X)$ and $\phi(X)$. We then reshape them into vectors of dimension $N = H \times W$, named $\theta$ and $\phi$, respectively. The dependency matrix $M \in \mathbb{R}^{N \times N}$ is then obtained by applying a row-wise softmax to $\theta \phi^{\top}$. For the element $M_{ij}$ of $M$,

$$M_{ij} = \frac{\exp(\theta_i \phi_j)}{\sum_{j'=1}^{N} \exp(\theta_i \phi_{j'})} \tag{2}$$

Fig. 4

The non-local module. The softmax operation is performed on each row. The gray boxes denote the feature maps produced by 1 × 1 convolution. ⨂ = matrix multiplication; ⨁ = element-wise sum.

Equation (1) is equivalent to

$$y_i = \sum_{j=1}^{N} M_{ij}\, q_j \tag{3}$$

We can easily implement the function $g$ by feeding $X$ to a 1 × 1 convolution layer and then reshaping the result into a vector of dimension $N$ named $q$; thus, $q_i$ is the $i$th element of $q$. For $X$, by adding the weighted $y_i$ to the corresponding feature of the $i$th pixel, the feature map will consequently have long-range dependencies.
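To make the single-channel case concrete, the following Python sketch computes a row-wise softmax dependency matrix and the residual output for a flattened feature map. It is a simplified illustration under stated assumptions, not the authors' implementation: with one channel each 1 × 1 convolution reduces to a scalar multiplication, so the scalars `w_theta`, `w_phi`, `w_g` and `w_out` are hypothetical stand-ins for learned convolution weights, and the function names are illustrative.

```python
import math


def softmax(row):
    """Numerically stable softmax over a list of floats."""
    peak = max(row)
    exps = [math.exp(v - peak) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def non_local_1ch(x, w_theta, w_phi, w_g, w_out):
    """Non-local operation on a flattened single-channel map x (length N)."""
    theta = [w_theta * v for v in x]
    phi = [w_phi * v for v in x]
    q = [w_g * v for v in x]
    # Dependency matrix: row i holds the softmax-normalized similarity
    # between position i and every position j.
    m = [softmax([t * p for p in phi]) for t in theta]
    # Aggregate the mapped values with the dependency weights ...
    y = [sum(m_ij * q_j for m_ij, q_j in zip(row, q)) for row in m]
    # ... and add the weighted result back to the input (residual sum).
    out = [x_i + w_out * y_i for x_i, y_i in zip(x, y)]
    return m, out


x = [0.0, 1.0, 2.0, 1.0]  # a 2 x 2 feature map flattened to N = 4
m, out = non_local_1ch(x, 1.0, 1.0, 1.0, 0.5)
```

Because every row of the dependency matrix sums to 1, each output position receives a weighted average over all positions, which is what lets distant pixels influence one another within a single layer.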

Channel attention module

The characteristics of A-lines and B-lines are important in the classification of pulmonary edema. In LUS images, the regions that most influence the classification result usually occupy only part of the image (e.g., region A in Fig. 5). However, not all semantic feature channels pay more attention to region A; some channels pay more attention to insignificant regions (e.g., region B in Fig. 5). Because most deep-learning networks (Simonyan and Zisserman 2014) weight all channels equally, the influence of the feature channels that contain discriminative information is reduced. The channel attention module (Hu et al. 2018) can change the proportion of different channels to selectively emphasize informative features and suppress less useful ones. Hu et al. (2018) proposed the Squeeze-and-Excitation (SE) block, which adaptively recalibrates channel-wise feature responses by explicitly modeling the interdependencies between different channels. Specifically, it automatically learns the importance of each feature channel and then uses this importance to enhance key features and suppress features that are not useful for the current task. The typical SE block is depicted in Figure 6.
Fig. 5

Different regions in ultrasound images with different contributions to the classification. Region A is relatively more significant, whereas region B is relatively insignificant.

Fig. 6

Architecture with the channel attention module in a parallel manner. H, W and C represent the height, width and channel of the feature map, respectively.

Given an input feature map $U \in \mathbb{R}^{H \times W \times C}$, $U = [u_1, u_2, \ldots, u_C]$, where $u_m \in \mathbb{R}^{H \times W}$ represents a channel. The variables $H$, $W$ and $C$ are the height, width and number of channels, respectively. Taking the $m$th channel as an example, global average pooling is performed on each channel map according to the equation

$$z_m = \frac{1}{H \times W} \sum_{i=1}^{H} \sum_{j=1}^{W} u_m(i, j) \tag{4}$$

Equation (4) is the spatial squeeze operation, which collects global spatial information and embeds it in the vector $z$. A fully connected layer is used to calculate the impact of each channel on other channels, and another fully connected layer then generates the weight of each channel, which is denoted as

$$s = \sigma\left(W_2\, \delta(W_1 z)\right) \tag{5}$$

where $z \in \mathbb{R}^{C}$ is the squeezed vector, $W_1$ and $W_2$ are the weights of the two fully connected layers, $\delta$ is the ReLU activation function and $\sigma$ is the sigmoid function. Finally, the features of each channel are scaled up or down by the normalized weights. We use the channel attention module after each residual unit in ResNet, which helps increase the proportion of the feature channels containing more information about the A-lines and B-lines.
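The squeeze, excitation and scaling steps described above can be sketched in plain Python as follows. This is a toy illustration, not the authors' code: the weight matrices `w1` and `w2` and the 2 × 2 two-channel feature map are hypothetical values chosen for readability, whereas in a trained network they would be learned.

```python
import math


def relu(v):
    return [max(0.0, a) for a in v]


def sigmoid(v):
    return [1.0 / (1.0 + math.exp(-a)) for a in v]


def matvec(w, v):
    """Multiply matrix w (a list of rows) by vector v."""
    return [sum(w_ij * v_j for w_ij, v_j in zip(row, v)) for row in w]


def channel_attention(feature_map, w1, w2):
    """feature_map: list of C channels, each an H x W list of lists."""
    # Squeeze: global average pooling over each channel.
    z = [sum(map(sum, ch)) / (len(ch) * len(ch[0])) for ch in feature_map]
    # Excitation: two fully connected layers, ReLU followed by sigmoid.
    s = sigmoid(matvec(w2, relu(matvec(w1, z))))
    # Scale: multiply every pixel of a channel by that channel's weight.
    scaled = [[[s_m * px for px in row] for row in ch]
              for s_m, ch in zip(s, feature_map)]
    return z, s, scaled


fmap = [[[1.0, 3.0], [5.0, 7.0]],   # channel 0, mean 4.0
        [[0.0, 0.0], [0.0, 2.0]]]   # channel 1, mean 0.5
w1 = [[1.0, -1.0]]                  # hypothetical: 2 channels -> 1 hidden unit
w2 = [[1.0], [-1.0]]                # hypothetical: 1 hidden unit -> 2 channels
z, s, scaled = channel_attention(fmap, w1, w2)
```

With these illustrative weights, the channel with the larger mean response receives a weight close to 1 and the other a weight close to 0, showing how the module can emphasize one channel and suppress another.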

Experiments and Results

For model training, we used stochastic gradient descent (SGD) as the optimizer. The model was developed in PyTorch 1.8.1 (Facebook Open Source) with Python 3.8 and trained on a personal computer (CPU: AMD 4800U; RAM: 16 GB, 3200 MHz). All training and test images were resized to 300 × 300, and the learning rate was set at 0.01. To evaluate the performance of our method, we carried out the following experiments. We used accuracy, F1 score, precision and recall as the indices to measure the performance (Xi et al. 2020a, 2020b) of the proposed method. Furthermore, we compared the proposed method with other classification methods, including VGGNet (Simonyan and Zisserman 2014), ResNet (He et al. 2016), Densenet 201 (Soret et al. 2015), Inception-V3 (Chollet 2016) and InceptionResnet-V2 (Szegedy et al. 2017). These six methods were trained and tested using 1735 and 327 LUS images, respectively, and the results were evaluated with the four indicators. It can be seen in Table 1 that our method outperforms the other methods, with an accuracy, F1 score, precision and recall of 92.34%, 92.05%, 91.96% and 90.43%, respectively.
Table 1

Performance comparison for different methods under different metrics

Method               Accuracy (%)   F1 (%)   Precision (%)   Recall (%)
Ours                 92.34          92.05    91.96           90.43
ResNet               90.79          90.56    89.15           89.19
VGGNet               26.66          10.52     6.19           25.00
Densenet 201         18.63           9.83    40.62           19.21
Inception-V3         57.84          39.46    81.86           59.04
InceptionResnet-V2   55.88          33.47    82.88           55.64
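For readers who want to reproduce the four indicators, the sketch below computes accuracy together with precision, recall and F1 for a four-class problem. The article does not state how the per-class values were averaged, so macro averaging is an assumption here. One hint, though not a confirmation, is that a model predicting a single class in a four-class problem obtains a macro recall of exactly 25%, the value reported for VGGNet in Table 1. The function name `macro_metrics` and the label lists are hypothetical.

```python
def macro_metrics(y_true, y_pred, num_classes=4):
    """Return accuracy and macro-averaged precision, recall and F1."""
    pairs = list(zip(y_true, y_pred))
    accuracy = sum(t == p for t, p in pairs) / len(pairs)
    precisions, recalls, f1s = [], [], []
    for c in range(num_classes):
        tp = sum(t == c and p == c for t, p in pairs)
        fp = sum(t != c and p == c for t, p in pairs)
        fn = sum(t == c and p != c for t, p in pairs)
        prec = tp / (tp + fp) if tp + fp else 0.0
        rec = tp / (tp + fn) if tp + fn else 0.0
        f1 = 2 * prec * rec / (prec + rec) if prec + rec else 0.0
        precisions.append(prec)
        recalls.append(rec)
        f1s.append(f1)
    return {
        "accuracy": accuracy,
        "precision": sum(precisions) / num_classes,
        "recall": sum(recalls) / num_classes,
        "f1": sum(f1s) / num_classes,
    }


# Hypothetical ground-truth scores and predictions for eight images.
y_true = [0, 0, 1, 1, 2, 2, 3, 3]
y_pred = [0, 1, 1, 1, 2, 2, 3, 0]
scores = macro_metrics(y_true, y_pred)

# Degenerate case: every image is assigned score 0.
single = macro_metrics([0, 1, 2, 3], [0, 0, 0, 0])
```

Macro averaging gives each score class the same influence on the result, which matters here because the four classes are not equally represented in the testing set.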
Interpretability of the model can be enhanced by viewing the class activation map (CAM) (Zhou et al. 2016). Here, we selected four typical images in categories 0, 1, 2 and 3, respectively, from the data set. Figure 7 contains the four images as well as their corresponding CAMs.
Fig. 7

Ultrasound images of classified pulmonary edema and their corresponding class activation maps (CAMs). (a), (b), (c) and (d) correspond to ultrasound image categories 0, 1, 2 and 3, respectively. Row 1: CAMs produced by our model; row 2: CAMs produced by our model without using the non-local module; row 3: original images.

As illustrated in row 1 of Figure 7(a), the highlighted region of the image plays an important role in classification of the LUSS. In row 3 of Figure 7(a), we can see that A-lines exist in the region highlighted in row 1 of Figure 7(a), indicating that the appearance of A-lines is evidence that the patients with COVID-19 PN did not have pulmonary edema with respect to the LUSS criteria. Likewise, we can see B-lines in the highlighted regions in Figure 7(b–d). To determine the effectiveness of the non-local module for extracting the potential characteristics of A-lines and B-lines, we removed the non-local module to produce the CAMs and compared the results with those produced by the model with the non-local module. As illustrated in Figure 7, the highlighted parts of the CAMs in row 1 (produced by our model with the non-local module) are more focused on the regions where the A-lines or B-lines appear than those of the CAMs in row 2 (produced by our model without the non-local module). This indicates that our model using the non-local module can extract more focused information from the characteristics of A-lines or B-lines for prediction of the degree of pulmonary edema.
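As background for the maps discussed above, the CAM of Zhou et al. (2016) for a given class is the weighted sum of the final convolutional feature maps, with the weights taken from the classification layer for that class. The sketch below shows that computation in plain Python, followed by min-max normalization for display. The article does not describe how its CAMs were generated beyond citing Zhou et al., so the normalization step, the function name and the toy inputs are assumptions.

```python
def class_activation_map(feature_maps, class_weights):
    """Weighted sum of K feature maps (each H x W), rescaled to [0, 1]."""
    height, width = len(feature_maps[0]), len(feature_maps[0][0])
    cam = [[sum(w * fm[i][j] for w, fm in zip(class_weights, feature_maps))
            for j in range(width)]
           for i in range(height)]
    lo = min(min(row) for row in cam)
    hi = max(max(row) for row in cam)
    span = (hi - lo) or 1.0  # avoid division by zero for a flat map
    return [[(v - lo) / span for v in row] for row in cam]


# Hypothetical example: two 2 x 2 feature maps and the weights of one class.
feature_maps = [[[1.0, 0.0], [0.0, 0.0]],
                [[0.0, 0.0], [0.0, 2.0]]]
class_weights = [1.0, 0.5]
cam = class_activation_map(feature_maps, class_weights)
```

In practice the resulting low-resolution map is upsampled to the image size and overlaid on the ultrasound image, so that bright areas indicate the regions that contributed most to the predicted score.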

Discussion

Relevant data revealed that in the patients, COVID-19 PN was often accompanied by excessive pulmonary edema, which is a manifestation of ALI, and the accurate classification of the degree of pulmonary edema in patients provides very important guidance for choosing the treatment plan and thus improving the survival rate. LUS can be used in different regions, including low- to middle-income areas; it can quickly facilitate diagnosis and therefore is suitable for monitoring the degree of pulmonary edema at any time. The LUSS has been used in clinics for the semi-quantitative assessment of pulmonary edema. However, accurate assignment of the LUSS depends greatly on the experience of clinicians, who are scarce during outbreaks of COVID-19 PN. Some computer-aided methods have been proposed for quantitative analysis of LUS images. Nevertheless, these methods depend on manually extracted features or need to select the ROI, therefore increasing the workload in the clinic. Recently, methods based on deep learning that evaluate lung ultrasound by detecting local B-lines have emerged; however, these methods focus only on detecting the number or position of B-lines instead of grading the degree of pulmonary edema. In this study, we did not directly detect B-lines or calculate the number of B-lines. Instead, following the LUSS criteria, we proposed an automated LUS scoring system to directly output the degree of pulmonary edema in patients with COVID-19 PN. Through use of the deep-learning method, our model is fully automatic and does not manually extract features or select ROIs. The framework of our network is based on the ResNet and augmented by adding the non-local and channel attention mechanisms. The non-local mechanism is used to extract the long-range dependencies which can preserve the characteristics of A-lines/B-lines and the density of B-lines, improving classification accuracy. 
The channel attention mechanism is used to emphasize the important channels that pay more attention to the ROI. With the two aforementioned mechanisms, our model performs well and has no need to select an ROI or manually extract features. This design can thus directly output the degree of pulmonary edema defined by the LUSS criteria. Furthermore, when used clinically, our model can highlight the region that plays a more important role in the decision. According to the experiments, the highlighted regions often contain the A-lines/B-lines, which improves the interpretability of the proposed model and may therefore help convince clinicians. According to the quantitative evaluation results, the proposed method outperforms four other well-known deep learning models, that is, VGGNet (Simonyan and Zisserman 2014), ResNet (He et al. 2016), Densenet 201 (Soret et al. 2015) and Inception-V3. It also outperforms the model described in our previous article, which had an accuracy of 87% (Chen et al. 2021), indicating its usefulness in classification of LUS scores for patients with COVID-19. However, there are two main limitations of this study. First, the labels of the data (i.e., the LUSS for each image) were assigned by two clinicians who were blinded to the clinical background. In practice, however, the degree of pulmonary edema should be judged by also considering the clinical symptoms, which are difficult to obtain. Second, our data set is far from sufficient to ensure the generalization ability of our model, which requires the collection of more images. In our future work, we will therefore focus on collecting more data and training our model with more accurate labels with respect to the LUSS criteria. With the recovery of medical resources, we will collect LUS images from multiple hospitals and try to redefine the LUSS criteria by considering the clinical symptoms.

Conclusions

We have proposed an automated method, the NCA-ResNet model, to evaluate the degree of pulmonary edema in patients with COVID-19 PN. This method combines the ResNet model with the non-local module and the channel attention mechanism, which extract the features of A-lines and B-lines and assign more weight to decisive channels, respectively. The results indicate that this method performs well in the automatic scoring of LUS images and could be used for the analysis and diagnosis of the severity of pulmonary edema. They also suggest that this method is potentially applicable in clinics.