Nan Mu, Hongyu Wang, Yu Zhang, Jingfeng Jiang, Jinshan Tang.
Abstract
In this paper, a progressive global perception and local polishing (PCPLP) network is proposed to automatically segment COVID-19-caused pneumonia infections in computed tomography (CT) images. The proposed PCPLP follows an encoder-decoder architecture. In particular, the encoder is implemented as a computationally efficient fully convolutional network (FCN). In this study, a multi-scale multi-level feature recursive aggregation (mmFRA) network is used to integrate multi-scale features (viz. global guidance features and local refinement features) with multi-level features (viz. high-level semantic features, middle-level comprehensive features, and low-level detailed features). Because of this innovative aggregation of features, an edge-preserving segmentation map can be produced in a boundary-aware multiple supervision (BMS) way. Furthermore, both global perception and local perception are devised. On the one hand, a global perception module (GPM), which provides a holistic estimation of potential lung infection regions, captures complementary coarse-structure information from different pyramid levels by enlarging the receptive fields without substantially increasing the computational burden. On the other hand, a local polishing module (LPM), which provides a fine prediction of the segmentation regions, explicitly heightens the fine-detail information and reduces the dilution of boundary knowledge. Comprehensive experimental evaluations demonstrate the effectiveness of the proposed PCPLP in accurately identifying infected lung regions with clear contours. Our model remarkably outperforms state-of-the-art segmentation models, both quantitatively and qualitatively, on a real COVID-19 CT dataset.
Keywords: Coronavirus disease 2019 (COVID-19); Feature recursive aggregation; Global perception; Local polishing; Multiple supervision
Year: 2021 PMID: 34305181 PMCID: PMC8272691 DOI: 10.1016/j.patcog.2021.108168
Source DB: PubMed Journal: Pattern Recognit ISSN: 0031-3203 Impact factor: 7.740
Fig. 1 An illustrative flowchart showing major components of the proposed PCPLP network.
Fig. 2 An overview of the proposed PCPLP network architecture.
A summary of four classic neural network models commonly used as backbones for image segmentation.
| | AlexNet | VGG | GoogLeNet | ResNet |
|---|---|---|---|---|
| Input Size | 227 × 227 | 224 × 224 | 224 × 224 | 224 × 224 |
| Number of Layers | 8 | 19 | 22 | 152 |
| Number of Conv. Layers | 5 | 16 | 21 | 151 |
| Filter Sizes | 3, 5, 11 | 3 | 1, 3, 5, 7 | 1, 3, 5, 7 |
| Strides | 1, 4 | 1 | 1, 2 | 1, 2 |
| Fully Connected Layers | 3 | 3 | 1 | 1 |
| TOP-5 Test Accuracy | 84.6% | 92.7% | 93.3% | 96.4% |
| Contributions | ReLU, Dropout | Small filter kernel | 1 × 1 Conv. | Residual learning |
| Advantages | Increase training speed and prevent overfitting | Stacked small filters add nonlinearity; suitable for parallel acceleration | Reduce the amount of computation | Overcome gradient vanishing |
| Disadvantages | Low accuracy | Small receptive field | Overfitting, vanishing gradient | Many parameters, long training time |
Details of Block1~Block5, which extract image features in a multi-level pyramid scheme.
| Block | Layer | Filter Size/Channels | Stride | Padding |
|---|---|---|---|---|
| Block1 | Conv1-1 | 3 × 3/64 | 1 | Yes |
| Block1 | Conv1-2 | 3 × 3/64 | 1 | Yes |
| Block1 | Maxpool | 2 × 2/64 | 2 | No |
| Block2 | Conv2-1 | 3 × 3/128 | 1 | Yes |
| Block2 | Conv2-2 | 3 × 3/128 | 1 | Yes |
| Block2 | Maxpool | 2 × 2/128 | 2 | No |
| Block3 | Conv3-1 | 3 × 3/256 | 1 | Yes |
| Block3 | Conv3-2 | 3 × 3/256 | 1 | Yes |
| Block3 | Conv3-3 | 3 × 3/256 | 1 | Yes |
| Block3 | Maxpool | 2 × 2/256 | 2 | No |
| Block4 | Conv4-1 | 3 × 3/512 | 1 | Yes |
| Block4 | Conv4-2 | 3 × 3/512 | 1 | Yes |
| Block4 | Conv4-3 | 3 × 3/512 | 1 | Yes |
| Block4 | Maxpool | 2 × 2/512 | 2 | No |
| Block5 | Conv5-1 | 3 × 3/512 | 1 | Yes |
| Block5 | Conv5-2 | 3 × 3/512 | 1 | Yes |
| Block5 | Conv5-3 | 3 × 3/512 | 1 | Yes |
| Block5 | Maxpool | 2 × 2/512 | 2 | No |
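The five blocks above follow a VGG-16-style feature-extraction layout: 3 × 3 convolutions with "same" padding keep the spatial size, while each 2 × 2 max pooling with stride 2 halves it. The following sketch traces the feature-map shape through the table as a sanity check; the 224 × 224 input size and the function itself are illustrative assumptions, not taken from the paper.

```python
# Trace (height, width, channels) through Block1-Block5 of the table.
# Stride-1 "same"-padded 3x3 convs leave the spatial size unchanged;
# each 2x2 maxpool with stride 2 halves it.

BLOCKS = [  # (number of 3x3 conv layers, output channels) per block
    (2, 64), (2, 128), (3, 256), (3, 512), (3, 512),
]

def trace_backbone(h, w):
    """Return the feature-map shape after each block (convs + maxpool)."""
    shapes = []
    for n_convs, channels in BLOCKS:
        h, w = h // 2, w // 2  # effect of the block's 2x2/stride-2 maxpool
        shapes.append((h, w, channels))
    return shapes

print(trace_backbone(224, 224))
# -> [(112, 112, 64), (56, 56, 128), (28, 28, 256), (14, 14, 512), (7, 7, 512)]
```

This confirms the pyramid structure: five levels whose resolution drops by a factor of two per level while the channel count grows toward 512.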
Details of the proposed encoder-decoder convolutional network.
| Module | Block | Layer | Filter Size/Channels | Stride | Padding |
|---|---|---|---|---|---|
| GPM | GPMleft | Conv | 7 × 1/128 | 1 | Yes |
| | | Conv | 1 × 7/256 | 1 | Yes |
| | GPMright | Conv | 1 × 7/128 | 1 | Yes |
| | | Conv | 7 × 1/256 | 1 | Yes |
| | LPM | Conv | 3 × 3/256 | 1 | Yes |
| | | ReLU | | | |
| | | Conv | 3 × 3/256 | 1 | Yes |
| | Conv | Conv | 3 × 3/128 | 1 | Yes |
| | | ReLU | | | |
| | LPM | Conv | 3 × 3/128 | 1 | Yes |
| | | ReLU | | | |
| | | Conv | 3 × 3/128 | 1 | Yes |
| | Contrast Layer | avg_pool | 3 × 3/128 | 1 | No |
| | DeConv | DeConv | 3 × 3/384 | 2 | Yes |
| | | ReLU | | | |
| | LPM | Conv | 3 × 3/384 | 1 | Yes |
| | | ReLU | | | |
| | | SConv | 3 × 3/384 | 1 | Yes |
| | | ReLU | | | |
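The GPM branches in the table use factorized 7 × 1 and 1 × 7 convolutions, which is how the abstract's claim of "enlarging the receptive fields without substantially increasing the computational burden" is realized: a 7 × 1 kernel followed by a 1 × 7 kernel covers the same 7 × 7 receptive field as a full 7 × 7 convolution with far fewer weights. A rough parameter count, under an assumed 512-channel input (the in-channel count feeding the GPM is not given in the table):

```python
def conv_params(kh, kw, c_in, c_out, bias=True):
    """Weight count of a single 2-D convolution layer."""
    return kh * kw * c_in * c_out + (c_out if bias else 0)

c_in = 512  # assumed number of channels entering the GPM

# GPMleft per the table: 7x1 conv to 128 channels, then 1x7 conv to 256
factorized = conv_params(7, 1, c_in, 128) + conv_params(1, 7, 128, 256)

# A plain 7x7 convolution with the same receptive field and output channels
full = conv_params(7, 7, c_in, 256)

print(factorized, full)  # the factorized branch is roughly 9x smaller
```

How the two branches (and the full 7 × 7 baseline, which the paper does not use) are fused afterward is not specified in this table; the comparison above only illustrates the parameter savings of kernel factorization.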
Fig. 3 A schematic diagram showing the structure of the GPM module.
Fig. 4 A schematic diagram showing the structure of the LPM.
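The decoder table also lists a "Contrast Layer" built from 3 × 3 average pooling with no padding. One common reading of such a layer, assumed here rather than taken from the paper, is a local-contrast feature: the response at each position minus its local average, which emphasizes boundaries. A minimal NumPy sketch of that reading:

```python
import numpy as np

def avg_pool_3x3(x):
    """3x3 average pooling, stride 1, no padding (per the table's contrast layer)."""
    h, w = x.shape
    out = np.empty((h - 2, w - 2))
    for i in range(h - 2):
        for j in range(w - 2):
            out[i, j] = x[i:i + 3, j:j + 3].mean()
    return out

def contrast_feature(x):
    """Local contrast: each (valid) position minus its 3x3 neighborhood mean.
    This is an assumed formulation; the paper's exact definition may differ."""
    return x[1:-1, 1:-1] - avg_pool_3x3(x)

x = np.arange(25, dtype=float).reshape(5, 5)
print(contrast_feature(x))  # a linear ramp has zero local contrast everywhere
```

A flat or linearly varying region yields zero contrast, while edges and small structures produce large responses, which is consistent with the LPM's goal of heightening fine boundary detail.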
The 12 metrics for evaluating the performance of various segmentation models.
| Metric | Formula | Description |
|---|---|---|
| TPR / FPR | TPR = TP/(TP + FN); FPR = FP/(FP + TN) | TPR measures the proportion of actual positives correctly identified, while FPR measures the proportion of actual negatives incorrectly identified as positive |
| PR curve | precision vs. recall over varying thresholds | The PR curve mainly evaluates the comprehensiveness of the detected lung infection pixels |
| F-measure | (1 + β²) · Precision · Recall / (β² · Precision + Recall) | The F-measure is the weighted harmonic mean of precision and recall |
| DICE score | 2TP/(2TP + FP + FN) | The DICE score measures the similarity between the predicted map and the ground truth |
| Sensitivity score | TP/(TP + FN) | Sensitivity measures the proportion of true infection pixels that are detected; a low value indicates missed detections |
| Specificity score | TN/(TN + FP) | Specificity measures the proportion of non-infection pixels correctly rejected; a low value indicates false detections |
| MAE score | mean absolute difference between the segmentation map and the ground truth | The MAE score indicates the pixel-wise discrepancy between the segmentation map and the ground truth (lower is better) |
| AUC score | area under the ROC curve | The AUC score gives an intuitive indication of how well the segmentation map predicts the true lung infection regions |
| WP and WR | | WP and WR measure the exactness and completeness, respectively |
| OR score | | The OR score measures the completeness of lung infection pixels and the correctness of non-lung-infection pixels |
| S-measure | | The S-measure quantifies the structural similarity between the segmentation map and the ground truth |
| Execution time | average execution time per image (in seconds) | All experiments were performed with the same equipment and settings |
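The pixel-level metrics in the table follow directly from the confusion matrix of the binarized prediction against the ground truth. A minimal NumPy sketch (function and variable names are ours, not the paper's):

```python
import numpy as np

def segmentation_metrics(pred, gt):
    """DICE, sensitivity, specificity and MAE for binary masks.
    `pred` and `gt` are boolean arrays of the same shape."""
    tp = np.logical_and(pred, gt).sum()
    tn = np.logical_and(~pred, ~gt).sum()
    fp = np.logical_and(pred, ~gt).sum()
    fn = np.logical_and(~pred, gt).sum()
    return {
        "dice": 2 * tp / (2 * tp + fp + fn),
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "mae": np.abs(pred.astype(float) - gt.astype(float)).mean(),
    }

# Toy example: a 4x4 image with 4 true infection pixels, 6 predicted
gt = np.zeros((4, 4), dtype=bool); gt[:2, :2] = True
pred = np.zeros((4, 4), dtype=bool); pred[:2, :3] = True
m = segmentation_metrics(pred, gt)
print(m)  # dice 0.8, sensitivity 1.0, specificity 10/12, mae 0.125
```

Thresholding a soft segmentation map at multiple levels before calling this function yields the points of the PR and ROC curves (and hence the F-measure and AUC) in the same way.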
Fig. 5 Performance comparisons of the proposed PCPLP framework with other models using the COVID-19 CT dataset [36].
Quantitative performance comparison of nine models in terms of different metrics. The best two results are highlighted in red and blue, respectively. The up-arrow ↑ indicates that a higher value means better segmentation quality, whereas the down-arrow ↓ implies the opposite.
Fig. 6 Visual comparisons of lung infection segmentation using different algorithms. The green and the yellow areas represent the undetected and detected true infection regions, respectively. The red areas indicate the false infection regions that are incorrectly detected. (a) The original CT images from the test set. (b) The corresponding ground truth for each image. (c-j) The corresponding segmentation results from the eight state-of-the-art models. (k) The segmentation maps of the proposed PCPLP.
Fig. 7 Performance comparisons using different variants of the proposed PCPLP model.
Fig. 8 Qualitative performance comparisons using different variants of the proposed PCPLP model.
Quantitative performance comparisons using different variants of the proposed PCPLP model. The best two results are highlighted in red and blue, respectively.