
Unsupervised domain adaptation based COVID-19 CT infection segmentation network.

Han Chen (1), Yifan Jiang (1), Murray Loew (2), Hanseok Ko (1).

Abstract

Automatic segmentation of infection areas in computed tomography (CT) images has proven to be an effective diagnostic approach for COVID-19. However, due to the limited number of pixel-level annotated medical images, accurate segmentation remains a major challenge. In this paper, we propose an unsupervised domain adaptation based segmentation network to improve the segmentation performance of the infection areas in COVID-19 CT images. In particular, we propose to utilize synthetic data and limited unlabeled real COVID-19 CT images to jointly train the segmentation network. Furthermore, we develop a novel domain adaptation module, which is used to align the two domains and effectively improve the segmentation network's generalization capability to the real domain. In addition, we propose an unsupervised adversarial training scheme that encourages the segmentation network to learn domain-invariant features, so that robust features can be used for segmentation. Experimental results demonstrate that our method can achieve state-of-the-art segmentation performance on COVID-19 CT images.
© The Author(s), under exclusive licence to Springer Science+Business Media, LLC, part of Springer Nature 2021.


Keywords:  Adversarial training; Automatic segmentation; COVID-19; Computed tomography; Domain adaptation

Year:  2021        PMID: 34764618      PMCID: PMC8421243          DOI: 10.1007/s10489-021-02691-x

Source DB:  PubMed          Journal:  Appl Intell (Dordr)        ISSN: 0924-669X            Impact factor:   5.019


Introduction

The novel coronavirus disease (COVID-19) has become one of the most serious global pandemics. COVID-19 is caused by infection with severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2), which can be transmitted via breathing, coughing, sneezing, or other means. A recent report [1] showed that, as of March 2021, more than 120 million people around the world had been infected with COVID-19, with a fatality rate of over 2%. To diagnose COVID-19, the real-time reverse transcription polymerase chain reaction (RT-PCR) test is routinely used. However, RT-PCR is time-consuming, and a series of tests may be required to exclude the possibility of false negatives, so there is an urgent need for alternative methods for the fast and accurate diagnosis of COVID-19. Chest computed tomography (CT) has been strongly recommended for the early recognition and evaluation of suspected SARS-CoV-2 infection [2]. Chest CT scans are very useful for the auxiliary diagnosis of the typical radiographic features of COVID-19, including ground-glass opacity and consolidation [3]. Therefore, the qualitative assessment of infection in CT scans could provide important information in the fight against the spread of COVID-19.

Image segmentation has proven to be effective in COVID-19 CT image analysis [4-6], but it remains challenging because (1) the diversity in the size and distribution of infection leads to a large number of false negative segmentation results, and (2) ground-glass opacity and consolidation are similar in appearance, and this small inter-class difference makes segmentation more difficult [7]. Deep learning based automatic segmentation is a powerful technique for medical imaging analysis [8]. Its excellent performance can be attributed to the availability of large volumes of labeled training data. However, it is time-consuming and laborious to collect a sufficient number of COVID-19 CT images with annotations, due to concerns over patient privacy [9, 10] and a lack of experts [11]. To tackle this issue, some methods have employed parameterized transformations to augment the limited annotated COVID-19 CT images for supervised learning [12-19]. Although parameterized transformations can alleviate the data shortage to some extent, networks trained on limited data still generalize poorly to unseen datasets because of insufficient data diversity. In addition, several works have explored constructing new networks suited to small-scale labeled COVID-19 data [20-22], whose high performance relies on carefully designed network structures, at the cost of scalability and flexibility. More recently, some efforts have been devoted to generating synthetic COVID-19 CT data to promote computer-aided diagnosis of COVID-19 [23-25], making it possible to train deep models on synthetic images with computer-generated annotations. Nevertheless, as shown in [26], a model trained directly on synthetic data may fail to produce precise results for real COVID-19 CT images due to the domain shift.

In view of the fact that (1) existing supervised and semi-supervised methods are limited by the small scale of labeled COVID-19 CT data, and (2) synthetic COVID-19 CT data cannot be used directly for training due to the domain shift problem, a natural and practical question arises: how can the potential of synthetic data be properly exploited to improve segmentation performance on COVID-19 CT images?
To address the above issues, we propose a novel unsupervised domain adaptation based segmentation network for the COVID-19 CT infection segmentation task. The contributions of this paper can be summarized as follows: (1) we propose to make full use of synthetic data and limited unlabeled real COVID-19 CT images to jointly train the segmentation network, so as to introduce richer diversity; (2) we design a domain adaptation module to align the two domains and overcome the domain shift, which effectively improves the generalization capability of the segmentation network; (3) we propose an unsupervised adversarial training scheme, in which the cross-domain adversarial loss guides the segmentation network to learn domain-invariant features, thus improving segmentation performance. Meanwhile, our training scheme is flexible, as it can be combined with any segmentation network that has an encoder-decoder structure. The remainder of this paper is organized as follows. Section 2 reviews related work. Section 3 describes the main components and training scheme of our proposed method, while Section 4 presents our experiments on real COVID-19 CT images. Finally, Section 5 concludes the paper.

Related works

COVID-19 infection segmentation

Medical imaging such as CT has played an important role in the fight against COVID-19. As an essential step in the processing and assessment of CT images, segmentation can identify regions of interest such as ground-glass opacity, consolidation, and the lung [8]. Recently, deep learning based segmentation methods have been utilized in COVID-19 CT diagnosis [12-22]. For instance, Ouyang et al. [18] developed a 3D CNN for COVID-19 infection segmentation and proposed a dual-sampling attention mechanism to alleviate the data imbalance problem. Oulefki et al. [19] presented a multilevel thresholding procedure based on Kapur entropy to improve COVID-19 segmentation performance. Fan et al. [20] presented a semi-supervised segmentation method based on a random selection propagation strategy, which requires only a few labeled images and primarily utilizes unlabeled data. Qiu et al. [21] proposed a lightweight network to address the overfitting caused by limited training data for COVID-19 segmentation. Laradji et al. [22] proposed a COVID-19 segmentation model that uses point-level rather than full image-level annotations, which alleviates the labeling burden to some extent. Most previous works are trained in a supervised or semi-supervised manner, so their performance is limited by the scale of the labeled data. Furthermore, since the infected areas of COVID-19 can be small, with large variations in shape and texture, segmenting them remains a challenging task.

Unsupervised medical segmentation

To deal with the lack of annotated data, unsupervised segmentation techniques have attracted growing interest. Most existing proposals employ clustering, which divides a medical image into groups according to the similarity of image intensity. For example, Jose et al. [27] proposed a method based on K-means clustering to segment abnormal brain regions. Cheng et al. [28] utilized a clustering algorithm to generate superpixel-based pseudo-labels to supervise the segmentation network. However, these clustering based methods depend heavily on pixel intensity, so different areas with similar intensity are likely to be mistakenly segmented into the same class. This is undesirable in COVID-19 infection segmentation, because ground-glass opacity and consolidation may not be distinguished given their similar appearance. Some studies regard unsupervised medical segmentation as an unsupervised deformable registration process [29-31]. Despite the success of these methods, they are insufficient for COVID-19 segmentation, since infection on CT images varies widely, with irregular shapes and ambiguous boundaries [32].

Domain adaptation

Domain adaptation aims to reduce the shift between two distributions [33, 34] and has been widely employed in conjunction with synthetic data for real-world tasks [35-39]. Several strategies have been proposed to achieve better domain adaptation. Some studies utilize maximum mean discrepancy (MMD) [40] to minimize the difference between feature distributions [41-43], but its effectiveness is limited by whether the distributions follow a Gaussian distribution. Another strategy is self-training, which uses predictions from an ensemble model as pseudo-labels for unlabeled data to train the current model [44-46]. There is also increasing interest in adversarial training for domain adaptation [47-50]. This approach reduces the domain shift by forcing features from different domains to fool the discriminator, leading features from different domains to exhibit a similar distribution. Domain adaptation has also demonstrated positive effects in medical image segmentation [51-55]. For instance, Degel et al. [52] minimized the segmentation loss together with a domain discriminator to encourage domain-invariant features across ultrasound datasets for left atrium segmentation. Christian et al. [51] addressed the domain shift by extending the self-ensembling method to MRI image segmentation. Kamnitsas et al. [55] employed adversarial learning with synthetic data and sufficient labeled data for brain lesion segmentation. Domain adaptation has achieved impressive success, especially in medical imaging. We therefore exploit this technique to address the COVID-19 CT infection segmentation task in this paper.

Methodology

Overview of the proposed method

As shown in Fig. 1, our method consists of two parts: the segmentation network, composed of a feature extractor f(⋅) and a pixel-wise classifier c(⋅), and the domain adaptation module, which includes a generator g(⋅) and a discriminator d(⋅). The source dataset (the synthetic data) and the target dataset (the COVID-19 CT images without annotations) are denoted as D_S and D_T, respectively. We first forward the two inputs X_S and X_T through f(⋅), generating feature maps F_S and F_T. Then c(⋅) takes F_S as input and produces an image-sized segmentation map Ŷ_S, which is used together with the annotation Y_S to optimize the segmentation network. To overcome the domain shift, we align the distributions of the source and target data using the domain adaptation module in image space. We utilize g(⋅) to reconstruct the inputs conditioned on the feature maps F_S and F_T. We then feed the outputs of g(⋅) together with X_S and X_T to the discriminator d(⋅), which classifies them as real or fake within or across domains. The gradients of the cross-domain adversarial loss are propagated from d(⋅) back to f(⋅), leading f(⋅) to learn transferable feature representations applicable to both the source and target domains.
Fig. 1

Overview of the proposed network. Our network consists of two parts: the segmentation network including a feature extractor, a pixel-wise classifier, as well as the domain adaptation module (DA) including a generator and a discriminator. The black solid lines with one-way arrow indicate the data flow and the dashed lines denote reconstruction and adversarial loss. The feature extractor and pixel-wise classifier together perform the segmentation task. The DA module is introduced to overcome the domain shift through adversarial training in image space


Network structure

Feature extractor

We build a feature extractor that follows the typical architecture of a convolutional neural network. It is composed of four 3 × 3 convolutional layers, each followed by a 2 × 2 max pooling operation. Given a source image X_S and a target image X_T, the feature extractor shares its weights across domains and produces feature maps F_S and F_T, as shown in (1):

F_δ = f(X_δ), δ ∈ {S, T},  (1)

where δ denotes whether the term stems from the source domain or the target domain. The learned features are then sent to the classifier and the generator. The former generates pixel-level segmentation results, while the latter projects the features into image space for further domain adaptation.
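
As a concrete illustration, a minimal PyTorch sketch of such an extractor is given below. The paper specifies only the four 3 × 3 convolutions, each followed by 2 × 2 max pooling; the channel widths, ReLU activations, and size-preserving padding here are our assumptions.

import torch
import torch.nn as nn

class FeatureExtractor(nn.Module):
    def __init__(self, in_ch=1, base_ch=64):
        super().__init__()
        self.stages = nn.ModuleList()
        ch = in_ch
        for i in range(4):
            out_ch = base_ch * (2 ** i)
            # one 3x3 conv followed by 2x2 max pooling, as in the paper
            self.stages.append(nn.Sequential(
                nn.Conv2d(ch, out_ch, kernel_size=3, padding=1),
                nn.ReLU(inplace=True),
                nn.MaxPool2d(2)))
            ch = out_ch

    def forward(self, x):
        # keep all intermediate maps so the classifier can use skip connections
        feats = []
        for stage in self.stages:
            x = stage(x)
            feats.append(x)
        return feats  # feats[-1] is F_delta; earlier maps feed skip connections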

Pixel-wise classifier

With the learned feature maps, the pixel-wise classifier converts low-resolution, semantically strong features into pixel-wise classification results, i.e., a class label is assigned to each pixel. We build a classifier that contains three upsampling layers, each followed by a concatenation with the correspondingly cropped feature map from the feature extractor. It takes F_S as input and produces a segmentation map Ŷ_S with the same size as X_S, i.e.,

Ŷ_S = c(F_S) = c(f(X_S)).  (2)

As discussed later, to give our network pixel-level discriminative ability, we use this predicted segmentation map to calculate the segmentation loss in a supervised manner. Because we can only access the annotations of the source data, we feed only the feature maps of the source domain to the classifier to obtain the segmentation map.
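
A matching sketch of the pixel-wise classifier, under the same assumptions as the extractor sketch above: three transposed-convolution upsampling stages, each concatenated with the corresponding encoder feature map. The fusion convolutions, the final bilinear upsample to input size, and the 1 × 1 projection to four classes are assumptions.

import torch
import torch.nn as nn

class PixelwiseClassifier(nn.Module):
    def __init__(self, base_ch=64, n_classes=4):
        super().__init__()
        chs = [base_ch * (2 ** i) for i in range(4)]   # [64, 128, 256, 512]
        self.ups, self.convs = nn.ModuleList(), nn.ModuleList()
        for i in (3, 2, 1):
            self.ups.append(nn.ConvTranspose2d(chs[i], chs[i - 1], 2, stride=2))
            # after concatenation with the skip feature map, fuse channels
            self.convs.append(nn.Conv2d(chs[i - 1] * 2, chs[i - 1], 3, padding=1))
        self.head = nn.Sequential(
            nn.Upsample(scale_factor=2, mode='bilinear', align_corners=False),
            nn.Conv2d(chs[0], n_classes, kernel_size=1))

    def forward(self, feats):
        x = feats[-1]
        for up, conv, skip in zip(self.ups, self.convs, feats[-2::-1]):
            x = torch.relu(conv(torch.cat([up(x), skip], dim=1)))
        return self.head(x)  # logits with the same spatial size as the input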

Domain adaptation module

Unlike recent adversarial domain adaptation approaches for segmentation that calculate the adversarial loss directly in feature space, we utilize the generator to project the intermediate feature maps into image space for robust adversarial training. Given the feature maps F_S and F_T, the generator shares its weights across domains and produces reconstructions of the source image X_S and the target image X_T. The reconstructions G_S and G_T, together with X_S and X_T, are then sent to the discriminator and classified as real or fake. The reconstruction process is formulated as (3):

G_δ = g(F_δ) = g(f(X_δ)), δ ∈ {S, T}.  (3)

Image-space reconstruction is more robust than feature maps when used to calculate the adversarial loss. This is particularly so for our infection segmentation task, where the differences in intensity and texture between the source and target images are not that significant. Section 4.4 verifies the effectiveness of image-space training in detail. The design of our domain adaptation module is inspired by PatchGAN [56]. Our generator consists of four upsampling layers, each composed of a 3 × 3 transposed convolutional layer and two residual blocks [57]. Our discriminator includes two 4 × 4 convolutional layers, with nine residual blocks after the first layer. For each input, the output of the domain adaptation module is a probability map in which the value of each pixel indicates the probability that the corresponding patch of the input is real or fake. Compared with a normal discriminator, whose output is a single real number, our discriminator is more helpful for retaining detailed information.
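
The module could be sketched as follows. For brevity, this sketch conditions the generator only on the deepest 32 × 32 feature map (the feature-map selection ablation actually feeds three maps), and the channel widths and strides are assumptions chosen so that a 512 × 512 input yields the 64 × 64 patch map described later.

import torch
import torch.nn as nn

class ResBlock(nn.Module):
    def __init__(self, ch):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(ch, ch, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(ch, ch, 3, padding=1))
    def forward(self, x):
        return torch.relu(x + self.body(x))

class Generator(nn.Module):
    # four upsampling stages: 3x3 transposed conv + two residual blocks each
    def __init__(self, in_ch=512, out_ch=1):  # out_ch=1 assumes grayscale CT
        super().__init__()
        layers, ch = [], in_ch
        for _ in range(4):
            layers += [nn.ConvTranspose2d(ch, ch // 2, 3, stride=2,
                                          padding=1, output_padding=1),
                       ResBlock(ch // 2), ResBlock(ch // 2)]
            ch //= 2
        layers += [nn.Conv2d(ch, out_ch, 1)]  # projection to image space (assumed)
        self.body = nn.Sequential(*layers)
    def forward(self, f):
        return self.body(f)

class PatchDiscriminator(nn.Module):
    # two 4x4 convs with nine residual blocks after the first; the strides are
    # assumptions chosen so a 512x512 input yields a 64x64 patch map
    def __init__(self, in_ch=1, ch=64, n_classes=4):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(in_ch, ch, 4, stride=4, padding=0),
            *[ResBlock(ch) for _ in range(9)],
            nn.Conv2d(ch, n_classes, 4, stride=2, padding=1))
    def forward(self, x):
        return self.body(x)  # per-patch logits over the four real/fake classes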

Training and testing process

Our goal is to train a segmentation network that produces a competitive performance on real COVID-19 CT images even if no annotations are provided. We use the annotated synthetic images as the source and unlabeled real COVID-19 CT images as the target to jointly train the network, and update the parameters using segmentation loss, adversarial loss, and reconstruction loss. The segmentation loss is defined over adequately annotated source domain images, allowing the network to develop pixel-level discriminative ability. The adversarial loss can be divided into within-domain loss and cross-domain loss. The latter is used to guide the update of the feature extractor, thus allowing the feature extractor to identify the necessary features that should be extracted from the target domain. The reconstruction loss is utilized to ensure the fidelity of the reconstructions.

Generator update

The generator takes the learned features F_S and F_T as input and reconstructs X_S and X_T as G_S and G_T conditioned on these feature maps. Intuitively, if the reconstruction is sufficiently accurate, the L1 loss between the reconstruction G_δ and the input X_δ should be low. We also optimize the generator using an adversarial loss, which pushes the discriminator to classify G_S and G_T as real, thus fooling the discriminator. The objective function of the generator can be represented by (4):

L_g = Σ_{δ ∈ {S,T}} [ Σ_j |G_δ^(j) − X_δ^(j)| + Σ_i ℓ_ce( d(G_δ)^(i), Y^(δ,real)(i) ) ],  (4)

where ℓ_ce denotes the cross-entropy loss, index i indicates the pixel location in the output probability map of the discriminator and the label map, and index j indicates the pixel location in the input and the reconstruction.

Discriminator update

Given X_S, X_T, G_S, or G_T, the patch discriminator produces a four-channel probability map for each input. We calculate the adversarial loss as the cross-entropy between the output probability map and the label map Y^(δ,γ), δ ∈ {S, T}, γ ∈ {real, fake}. The optimization of the discriminator is therefore as follows:

L_d = Σ_{δ ∈ {S,T}} Σ_i [ ℓ_ce( d(X_δ)^(i), Y^(δ,real)(i) ) + ℓ_ce( d(G_δ)^(i), Y^(δ,fake)(i) ) ],  (5)

where Y^(δ,γ) is the 64 × 64 label map in which each value corresponds to the label of a patch, indicating whether that patch of the input image belongs to the category source-real, source-fake, target-real, or target-fake.

Feature extractor and classifier update

Updating the feature extractor and classifier is the crucial step for domain adaptation in our network. We optimize these two components with the following combination of loss terms:

L_{f,c} = −Σ_i Σ_c Y_S^(i,c) log Ŷ_S^(i,c) + α [ ℓ_ce( d(g(F_T)), Y^(S,real) ) + ℓ_ce( d(g(F_S)), Y^(T,real) ) ],  (6)

where the first term is the segmentation loss, the pixel-wise cross-entropy between the segmentation map and the annotation of the source domain; index c runs over the categories in the segmentation results (four in our work: background, consolidation, ground-glass opacity, and the lung). Directly minimizing the segmentation loss in (6) leads to good segmentation performance on the source images, but when tested on the target images, performance drops significantly due to the domain shift. To overcome this problem, we introduce a cross-domain adversarial loss. Note that, unlike the updates of the generator and discriminator in (4) and (5), where the adversarial loss is calculated within the source or target domain, here the adversarial loss is cross-domain, and its gradients lead to a reversed domain classification. We use these gradients to update the feature extractor. More specifically, the cross-domain adversarial losses ensure that when target features are passed to the generator, source-like images are reconstructed, and when source features are passed to the generator, target-like images are reconstructed. Through this constrained domain alignment, the learned features from the two domains come to exhibit a similar distribution, enabling the feature extractor to learn representations common to both domains.

In Fig. 2, we illustrate the training process of each module in the network with the direction of the data flow and the gradient flow. For each iteration, randomly sampled X_δ, δ ∈ {S, T}, are fed to the network, and the generator, discriminator, feature extractor, and classifier are updated iteratively in turn. Note again that, unlike the updates of the generator and discriminator, the adversarial loss used to update the feature extractor and classifier is cross-domain. Except for the segmentation loss, all losses are calculated in both the source and target domains. During testing, we use only the trained feature extractor and classifier: the network takes real CT images of COVID-19 cases as input and generates the predicted segmentation map.
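
A hedged PyTorch-style sketch of one training iteration, tying together the pieces above. The helper patch_labels, the integer patch-label encoding, and the exact loss weighting are our assumptions rather than the paper's code.

import torch
import torch.nn.functional as F

def patch_labels(cls_idx, like):
    # 64x64 patch-label map filled with one of the four classes; the encoding
    # (0 source-real, 1 source-fake, 2 target-real, 3 target-fake) is assumed
    return torch.full((like.size(0), 64, 64), cls_idx, dtype=torch.long,
                      device=like.device)

def train_step(f, c, g, d, opt_g, opt_d, opt_fc, x_s, y_s, x_t, alpha=0.1):
    feat_s, feat_t = f(x_s), f(x_t)       # lists of encoder feature maps

    # 1) generator update: L1 reconstruction + within-domain adversarial loss
    g_s, g_t = g(feat_s[-1].detach()), g(feat_t[-1].detach())
    loss_g = (F.l1_loss(g_s, x_s) + F.l1_loss(g_t, x_t)
              + F.cross_entropy(d(g_s), patch_labels(0, x_s))    # "source-real"
              + F.cross_entropy(d(g_t), patch_labels(2, x_t)))   # "target-real"
    opt_g.zero_grad(); loss_g.backward(); opt_g.step()

    # 2) discriminator update: classify real/fake within each domain
    loss_d = (F.cross_entropy(d(x_s), patch_labels(0, x_s))
              + F.cross_entropy(d(g_s.detach()), patch_labels(1, x_s))
              + F.cross_entropy(d(x_t), patch_labels(2, x_t))
              + F.cross_entropy(d(g_t.detach()), patch_labels(3, x_t)))
    opt_d.zero_grad(); loss_d.backward(); opt_d.step()

    # 3) feature extractor + classifier update: source segmentation loss plus
    #    the cross-domain adversarial loss with reversed domain labels
    loss_fc = (F.cross_entropy(c(feat_s), y_s)   # y_s: (B, H, W) class indices
               + alpha * (F.cross_entropy(d(g(feat_t[-1])), patch_labels(0, x_t))
                          + F.cross_entropy(d(g(feat_s[-1])), patch_labels(2, x_s))))
    opt_fc.zero_grad(); loss_fc.backward(); opt_fc.step()
    return loss_g.item(), loss_d.item(), loss_fc.item()
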
Fig. 2

Training process for the proposed network. The solid lines indicate the data flow, and the dashed lines indicate the gradient flow


Experiments

Experimental settings

Dataset

The source data come from our previous work [24], which was designed to generate high-quality, realistic COVID-19 lung CT images for deep learning based medical imaging tasks. The dataset contains 10,200 synthetic 2D CT images with corresponding pixel-wise annotations. There are four categories in the annotation map: ground-glass opacity, consolidation, the lung, and background; the first two are the most common characteristics used for COVID-19 diagnosis in lung CT imaging. The target data are taken from the COVID-19 CT segmentation dataset (http://medicalsegmentation.com/covid19/) collected by the Italian Society of Medical and Interventional Radiology. It contains 9 CT volumes from confirmed COVID-19 patients, each with 200 slices. We reformatted all 3D volumes into 2D slices of size 512 × 512. Small rotations, shearings, gamma transforms, and intensity normalizations are used for data augmentation, giving a total of 12,000 slices after pre-processing. We employ 70% of the slices as unlabeled target data for training, and the remaining 30% for testing segmentation performance, following a patient-level split when separating the target data into training and test sets.
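
For illustration, a minimal sketch of the named augmentations on a single 2D slice using numpy/scipy; the parameter ranges here are assumptions, not the values used in the paper.

import numpy as np
from scipy import ndimage

def augment(slice_2d, rng=np.random):
    # small rotation (assumed +/- 10 degrees)
    img = ndimage.rotate(slice_2d, rng.uniform(-10, 10), reshape=False, order=1)
    # small shear (assumed +/- 0.05)
    shear = rng.uniform(-0.05, 0.05)
    img = ndimage.affine_transform(img, [[1, shear], [0, 1]], order=1)
    # gamma transform (assumed gamma in [0.8, 1.2])
    img = np.clip(img, 0, None) ** rng.uniform(0.8, 1.2)
    # intensity normalization to zero mean, unit variance
    return (img - img.mean()) / (img.std() + 1e-8)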

Implementation details

We use PyTorch [58] for the implementation. Our network is trained for 100K iterations using the Adam optimizer [59]. The hyperparameters are set to α = 0.1 and learning rate l = 1.0 × 10−5. The batch size is 1, and l is reduced by 20% every 10K iterations. Training is accelerated with an NVIDIA RTX 2080Ti GPU and an Intel Core i7-9700K CPU.
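
The stated setup corresponds roughly to the following sketch; the multiplicative scheduler and the placeholder model are assumptions, as the paper does not specify how the decay was implemented.

import torch
import torch.nn as nn

model = nn.Conv2d(1, 4, 3)  # placeholder for the feature extractor + classifier
optimizer = torch.optim.Adam(model.parameters(), lr=1.0e-5)
# reduce the learning rate by 20% every 10K iterations
scheduler = torch.optim.lr_scheduler.StepLR(optimizer, step_size=10_000, gamma=0.8)

for iteration in range(100_000):
    # ... one train_step(...) as sketched in Section 3.3 ...
    optimizer.step()
    scheduler.step()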

Evaluation metrics

For quantitative evaluation, we adopt the three most commonly used metrics in medical imaging analysis: the dice similarity coefficient (Dice), sensitivity (Sen), and specificity (Spec) [60, 61]. The dice similarity coefficient is an overlap index that indicates the similarity between the prediction and the ground truth. Sensitivity and specificity are two statistical measures of performance for binary medical image segmentation: the former is the percentage of actual positive pixels correctly predicted as positive, while the latter is the proportion of actual negative pixels correctly predicted as negative. These metrics are defined as follows:

Dice = 2TP / (2TP + FP + FN),  Sen = TP / (TP + FN),  Spec = TN / (TN + FP),

where TP, FP, TN, and FN are the numbers of true positive, false positive, true negative, and false negative pixels in the prediction, respectively.
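
These definitions translate directly into a few lines of numpy; a small sketch, assuming binary 0/1 masks with at least one positive and one negative pixel:

import numpy as np

def dice_sen_spec(pred, gt):
    tp = np.sum((pred == 1) & (gt == 1))
    fp = np.sum((pred == 1) & (gt == 0))
    tn = np.sum((pred == 0) & (gt == 0))
    fn = np.sum((pred == 0) & (gt == 1))
    dice = 2 * tp / (2 * tp + fp + fn)
    sen = tp / (tp + fn)
    spec = tn / (tn + fp)
    return dice, sen, spec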

Quantitative results

Evaluation on two-class segmentation task

In this section, we compare the segmentation performance of our proposal with two state-of-the-art unsupervised medical image segmentation methods: Self-ensembling [51] and SSL [28]. Because these methods are designed for two-class segmentation, we train our approach as a two-class segmentation network, e.g., taking ground-glass opacity as the object to segment and all other classes as background. Table 1 presents the experimental results when taking each category as the segmentation object. The results are reported as mean ± error interval (calculated from the 95% confidence interval). Our proposed method outperforms the compared methods across most metrics; compared with the second-best method, Self-ensembling [51], it produces a 6.11% improvement in the dice similarity score for infection. Unlike the compared methods, which utilize a consistency loss to minimize the discrepancy between predictions in the source and target domains or employ superpixel-based pseudo-labels for supervision, our approach learns more discriminative feature representations for this challenging medical segmentation task.
Table 1

Quantitative results for the two-class segmentation of COVID-19 CT images

Methods              | Ground-glass opacity                       | Consolidation
                     | Dice (%) ↑    Sen (%) ↑     Spe (%) ↑      | Dice (%) ↑    Sen (%) ↑     Spe (%) ↑
Source-only          | 80.60 ± 0.48  78.86 ± 0.52  99.60 ± 0.01   | 61.75 ± 0.50  66.73 ± 0.51  99.83 ± 0.01
Self-ensembling [51] | 82.43 ± 0.36  80.18 ± 0.47  99.53 ± 0.01   | 65.16 ± 0.78  66.58 ± 1.26  99.26 ± 0.01
SSL [28]             | 78.34 ± 0.87  71.37 ± 0.53  99.47 ± 0.01   | 73.83 ± 0.91  81.30 ± 0.82  99.44 ± 0.01
Ours                 | 85.34 ± 0.36  82.13 ± 0.41  99.87 ± 0.01   | 74.67 ± 0.57  68.69 ± 0.39  99.97 ± 0.01
Target-only          | 88.73 ± 0.98  87.55 ± 1.34  99.84 ± 0.02   | 84.58 ± 0.80  84.71 ± 0.94  99.94 ± 0.01

Methods              | Infection                                  | Lung
                     | Dice (%) ↑    Sen (%) ↑     Spe (%) ↑      | Dice (%) ↑    Sen (%) ↑     Spe (%) ↑
Source-only          | 78.82 ± 0.61  70.99 ± 0.86  99.80 ± 0.01   | 89.60 ± 0.62  92.38 ± 0.23  97.89 ± 0.15
Self-ensembling [51] | 80.43 ± 0.47  80.74 ± 0.51  99.63 ± 0.01   | 93.53 ± 0.29  90.47 ± 0.10  99.61 ± 0.01
SSL [28]             | 79.15 ± 0.51  78.77 ± 0.50  99.81 ± 0.01   | 94.59 ± 0.19  93.47 ± 0.13  97.60 ± 0.01
Ours                 | 86.54 ± 0.39  85.54 ± 0.43  99.80 ± 0.01   | 95.75 ± 0.25  93.11 ± 0.26  99.74 ± 0.01
Target-only          | 91.50 ± 0.43  92.56 ± 0.52  99.81 ± 0.01   | 97.62 ± 0.15  97.38 ± 0.16  99.69 ± 0.02

Infection considers both ground-glass opacity and consolidation. (↑ indicates that a higher number is better)


Evaluation on multi-class segmentation task

As an auxiliary diagnostic tool, our model is expected to provide more detailed information about the infected areas and the lung. We therefore extend our method to a multi-class segmentation task and compare it with the state-of-the-art domain adaptation based segmentation methods MinEnt [47], AdvEnt [47], and IntraDA [49]. Note that the metrics for each category are calculated by taking the other categories as background; that is, even though the network is trained for multi-class segmentation, we use two classes (the object and the background) when computing the metrics. Table 2 shows the quantitative results on real CT images from COVID-19 cases. Our approach outperforms the compared methods across most metrics. Compared with the second-best method, AdvEnt [47], our proposal produces a 5.03% improvement in the dice similarity score for infection. When the domain adaptation module is excluded and only the base feature extractor and classifier trained on the source data are used (source-only), we observe a significant drop in the dice score for infection, clearly illustrating the effectiveness of our domain adaptation strategy, which employs adversarial training to learn the true features of the infection from real COVID-19 CT images. It can also be observed that, even without access to the ground truth of the real CT images, our method achieves results comparable to the target-only model trained on the target data in a supervised manner. Moreover, our proposal achieves the highest performance in lung segmentation, which shows that the proposed method is also suitable for segmenting large-area tissues or organs.
Table 2

Quantitative results for multi-class segmentation of COVID-19 CT images

Methods       | Ground-glass opacity                       | Consolidation
              | Dice (%) ↑    Sen (%) ↑     Spe (%) ↑      | Dice (%) ↑    Sen (%) ↑     Spe (%) ↑
Source-only   | 79.16 ± 0.56  73.65 ± 0.41  99.81 ± 0.01   | 61.42 ± 0.45  57.54 ± 0.67  99.82 ± 0.01
MinEnt [47]   | 79.72 ± 0.42  71.83 ± 0.48  99.87 ± 0.01   | 75.33 ± 0.41  67.23 ± 0.68  99.97 ± 0.01
AdvEnt [47]   | 81.99 ± 0.38  76.68 ± 0.45  99.83 ± 0.01   | 64.07 ± 0.74  54.18 ± 0.98  99.95 ± 0.01
IntraDA [49]  | 79.30 ± 0.34  69.17 ± 0.35  99.88 ± 0.01   | 62.33 ± 0.88  57.80 ± 1.00  99.97 ± 0.01
Ours          | 86.31 ± 0.27  85.37 ± 0.26  99.81 ± 0.01   | 74.55 ± 0.30  67.44 ± 0.32  99.95 ± 0.01
Target-only   | 87.54 ± 0.27  86.83 ± 0.34  99.82 ± 0.01   | 84.88 ± 0.42  82.79 ± 0.62  99.96 ± 0.01

Methods       | Infection                                  | Lung
              | Dice (%) ↑    Sen (%) ↑     Spe (%) ↑      | Dice (%) ↑    Sen (%) ↑     Spe (%) ↑
Source-only   | 76.98 ± 0.30  70.92 ± 0.47  99.66 ± 0.01   | 88.54 ± 0.32  93.47 ± 0.16  97.41 ± 0.08
MinEnt [47]   | 80.91 ± 0.27  72.61 ± 0.30  99.86 ± 0.01   | 95.55 ± 0.01  95.62 ± 0.01  99.33 ± 0.01
AdvEnt [47]   | 81.12 ± 0.28  74.55 ± 0.35  99.82 ± 0.01   | 95.69 ± 0.06  95.41 ± 0.05  99.41 ± 0.01
IntraDA [49]  | 77.34 ± 0.32  67.76 ± 0.43  99.89 ± 0.01   | 95.27 ± 0.07  95.01 ± 0.06  99.35 ± 0.01
Ours          | 86.15 ± 0.29  84.29 ± 0.31  99.81 ± 0.01   | 96.13 ± 0.07  94.61 ± 0.09  99.67 ± 0.01
Target-only   | 89.55 ± 0.35  88.57 ± 0.29  99.82 ± 0.01   | 97.12 ± 0.13  97.04 ± 0.18  99.59 ± 0.01

Infection considers both ground-glass opacity and consolidation. (↑ means a higher number is better)


Qualitative results

Figure 3 shows the qualitative results for two-class segmentation of real COVID-19 CT images. We train our method as a two-class segmenter for ground-glass opacity, consolidation, infection, and the lung, respectively. The proposed domain adaptation based segmentation network clearly learns discriminative features through adversarial training and thus segments the object areas accurately. Self-ensembling [51] can handle large-object segmentation such as (a) and (d), but performs poorly on the relatively small consolidation shown in (b). SSL [28] relies on superpixel-based pseudo-labels for supervision during training, so it fails to capture the details of the ground-glass opacity in (a).
Fig. 3

Qualitative results for two-class segmentation task. Columns 1 and 2 present the input real COVID-19 CT images and corresponding ground truth, while Column 3 to 6 are segmentation results of Source-only, Self-ensembling [51], SSL [28], and our proposed method. The first to last rows are the results when taking ground-glass opacity (a), consolidation (b), infection (c) and the lung (d) as the segmentation object, respectively

Figure 4 displays the qualitative results for multi-class segmentation of real COVID-19 CT cases. There are clearly many mis-segmented areas in the visualization results of the source-only (baseline) model, mainly due to the differences in texture and intensity between the synthetic data and the real COVID-19 CT images. We observe a significant improvement in performance when cross-domain adversarial learning is introduced in our approach, confirming the importance of adversarial training based domain adaptation. MinEnt [47] attempts to overcome the domain gap by directly minimizing the entropy of the model output; however, compared with our method, it fails to capture the fine-grained details of the infection in (a) and (b). AdvEnt [47] conducts adversarial training on the entropy map and is quite sensitive to irrelevant areas; for example, there is obvious noise in results (d) and (e). IntraDA [49] relies on pseudo-labels for training, so it fails to separate the ground-glass opacity and consolidation in (a).
Fig. 4

Qualitative results for multi-class segmentation task. Columns 1 and 2 show the input real COVID-19 CT images and corresponding ground truth, in which the ground-glass opacity is marked in blue, consolidation is marked in green, and the lung is marked in red. Columns 3 to 7 are the segmentation results for the Source-only, MinEnt [47], AdvEnt [47], IntraDA [49], and our proposed method, respectively

Our domain adaptation based segmentation network outperforms the baseline and the other state-of-the-art methods. It produces results close to the ground truth, with fewer mis-segmented infection areas, especially for consolidation, which is relatively small and challenging to segment. The success of the proposed method is attributed to our adversarial training scheme, through which the network learns the true features of the target data under the constraint of the cross-domain adversarial loss. This scheme allows our network to distinguish the real features of ground-glass opacity and consolidation more clearly, even without access to ground-truth annotations of the target data. In addition, our method also performs best on lung segmentation, which shows that it generalizes: it can be used not only for COVID-19 infection segmentation but also for other organs.

Ablation study

In order to assess the important settings and components of our method, we conduct ablation experiments following the multi-class experimental settings in Section 4.2. The evaluation criterion is the dice similarity coefficient.

Comparison of different feature map selection strategies

As described in Section 3.2, the generator takes three different feature maps from the feature extractor as input and maps them to image space. The output of the generator is then used to calculate the adversarial loss, which is crucial for domain-invariant feature learning, so the selection strategy for the feature maps affects segmentation performance. Because there are four down-sampling operations in our feature extractor, there are five sizes in total, including the original image (1: 512 × 512, 2: 256 × 256, 3: 128 × 128, 4: 64 × 64, 5: 32 × 32). We conduct a series of experiments using different combinations. From Table 3, it can be observed that our network achieves the highest performance when the generator takes the high-level feature maps as input. This indicates that high-level semantic information is more helpful for domain adaptation than the rich details in the low-level feature maps. We adopt this setting for our network.
Table 3

Ablation study of different feature map selection strategies

1  2  3  4  5  | Dice (%) ↑
               | GGO            Consolidation   Infection      Lung
✓  ✓  ✓  ×  ×  | 83.43 ± 0.40   67.35 ± 0.36    83.30 ± 0.23   95.69 ± 0.07
×  ✓  ✓  ✓  ×  | 85.36 ± 0.28   74.83 ± 0.40    84.24 ± 0.16   96.11 ± 0.08
×  ×  ✓  ✓  ✓  | 86.31 ± 0.27   74.55 ± 0.30    86.15 ± 0.29   96.13 ± 0.07

(↑ means a higher number is better. GGO: ground-glass opacity)


Effect of different components in our network

Table 4 shows how the different components of our network influence segmentation performance. The last row corresponds to our proposed method, while the other configurations differ from it in the following respects. (1) w/o adversarial training: the source-only baseline, corresponding to α = 0; the domain adaptation module is excluded and only the feature extractor and classifier are used. (2) w/o skip connections: the skip connections between the feature extractor and classifier, which are essential for preserving fine-grained details in the segmentation, are removed. (3) Feature space training: the domain adaptation module is removed and a pixel-level adversarial loss is calculated on the feature maps, which is then used to update the feature extractor and classifier. The experimental results show that the domain adaptation module is critical to the strong performance of our network. In addition, compared with feature-space training, calculating the adversarial loss in image space is more effective.
Table 4

Ablation study of different components in the proposed network

Network configuration     | Dice (%) ↑
                          | GGO            Consolidation   Infection      Lung
w/o adversarial training  | 79.16 ± 0.56   61.42 ± 0.45    79.98 ± 0.30   88.54 ± 0.32
w/o skip connections      | 78.56 ± 0.48   61.71 ± 0.63    78.81 ± 0.24   94.11 ± 0.01
Feature space training    | 81.87 ± 0.33   52.61 ± 0.43    79.52 ± 0.41   92.33 ± 0.22
Ours                      | 86.31 ± 0.27   74.55 ± 0.30    86.15 ± 0.29   96.13 ± 0.07

(↑ means a higher number is better. GGO: ground-glass opacity)


Conclusion

In this paper, we proposed a novel unsupervised domain adaptation based method for COVID-19 infection segmentation in CT images. We considered a challenging situation in which abundant annotated synthetic medical images are available, but no annotations exist for real COVID-19 lung CT images. We introduced unsupervised adversarial training to correlate the features of real COVID-19 CT images and synthetic images. The cross-domain adversarial loss pulls the features learned by the feature extractor from the two domains closer together, so the network can learn representations common to both domains while retaining the diagnostic information (i.e., the features of COVID-19 infection). Experimental results on CT images of COVID-19 cases demonstrated that our proposal outperforms the baseline and state-of-the-art approaches. We also demonstrated the effectiveness of our network for lung segmentation. Our proposed method has great potential for use in diagnosing COVID-19 by quantifying the infected areas of the lung.
References (20 in total)

1.  Sharing clinical data electronically: a critical challenge for fixing the health care system.

Authors:  Julia Adler-Milstein; Ashish K Jha
Journal:  JAMA       Date:  2012-04-25       Impact factor: 56.272

2.  COVID-19 CT Image Synthesis with a Conditional Generative Adversarial Network.

Authors:  Yifan Jiang; Han Chen; M H Loew; Hanseok Ko
Journal:  IEEE J Biomed Health Inform       Date:  2020-12-04       Impact factor: 5.772

3.  Dual-Sampling Attention Network for Diagnosis of COVID-19 From Community Acquired Pneumonia.

Authors:  Xi Ouyang; Jiayu Huo; Liming Xia; Fei Shan; Jun Liu; Zhanhao Mo; Fuhua Yan; Zhongxiang Ding; Qi Yang; Bin Song; Feng Shi; Huan Yuan; Ying Wei; Xiaohuan Cao; Yaozong Gao; Dijia Wu; Qian Wang; Dinggang Shen
Journal:  IEEE Trans Med Imaging       Date:  2020-08       Impact factor: 10.048

4.  Inf-Net: Automatic COVID-19 Lung Infection Segmentation From CT Images.

Authors:  Deng-Ping Fan; Tao Zhou; Ge-Peng Ji; Yi Zhou; Geng Chen; Huazhu Fu; Jianbing Shen; Ling Shao
Journal:  IEEE Trans Med Imaging       Date:  2020-08       Impact factor: 10.048

5.  Serial Quantitative Chest CT Assessment of COVID-19: A Deep Learning Approach.

Authors:  Lu Huang; Rui Han; Tao Ai; Pengxin Yu; Han Kang; Qian Tao; Liming Xia
Journal:  Radiol Cardiothorac Imaging       Date:  2020-03-30

6.  CT Imaging Features of 2019 Novel Coronavirus (2019-nCoV).

Authors:  Michael Chung; Adam Bernheim; Xueyan Mei; Ning Zhang; Mingqian Huang; Xianjun Zeng; Jiufa Cui; Wenjian Xu; Yang Yang; Zahi A Fayad; Adam Jacobi; Kunwei Li; Shaolin Li; Hong Shan
Journal:  Radiology       Date:  2020-02-04       Impact factor: 11.105

7.  Radiological findings from 81 patients with COVID-19 pneumonia in Wuhan, China: a descriptive study.

Authors:  Heshui Shi; Xiaoyu Han; Nanchuan Jiang; Yukun Cao; Osamah Alwalid; Jin Gu; Yanqing Fan; Chuansheng Zheng
Journal:  Lancet Infect Dis       Date:  2020-02-24       Impact factor: 25.071

8.  Automatic COVID-19 CT segmentation using U-Net integrated spatial and channel attention mechanism.

Authors:  Tongxue Zhou; Stéphane Canu; Su Ruan
Journal:  Int J Imaging Syst Technol       Date:  2020-11-24       Impact factor: 2.177

9.  Chest CT for detecting COVID-19: a systematic review and meta-analysis of diagnostic accuracy.

Authors:  Buyun Xu; Yangbo Xing; Jiahao Peng; Zhaohai Zheng; Weiliang Tang; Yong Sun; Chao Xu; Fang Peng
Journal:  Eur Radiol       Date:  2020-05-15       Impact factor: 5.315

Cited by (5 in total)

1.  A fast lightweight network for the discrimination of COVID-19 and pulmonary diseases.

Authors:  Oussama Aiadi; Belal Khaldi
Journal:  Biomed Signal Process Control       Date:  2022-06-21       Impact factor: 5.076

2.  (Review) Supervised and weakly supervised deep learning models for COVID-19 CT diagnosis: A systematic review.

Authors:  Haseeb Hassan; Zhaoyu Ren; Chengmin Zhou; Muazzam A Khan; Yi Pan; Jian Zhao; Bingding Huang
Journal:  Comput Methods Programs Biomed       Date:  2022-03-05       Impact factor: 7.027

3.  A radiological image analysis framework for early screening of the COVID-19 infection: A computer vision-based approach.

Authors:  Shouvik Chakraborty; Kalyani Mali
Journal:  Appl Soft Comput       Date:  2022-02-03       Impact factor: 6.725

4.  Deep feature fusion classification network (DFFCNet): Towards accurate diagnosis of COVID-19 using chest X-rays images.

Authors:  Jingyao Liu; Wanchun Sun; Xuehua Zhao; Jiashi Zhao; Zhengang Jiang
Journal:  Biomed Signal Process Control       Date:  2022-04-13       Impact factor: 5.076

5.  A teacher-student framework with Fourier Transform augmentation for COVID-19 infection segmentation in CT images.

Authors:  Han Chen; Yifan Jiang; Hanseok Ko; Murray Loew
Journal:  Biomed Signal Process Control       Date:  2022-09-26       Impact factor: 5.076

