Mohsin Raza, Khuram Naveed, Awais Akram, Nema Salem, Amir Afaq, Hussain Ahmad Madni, Mohammad A U Khan, Mui-zzud-Din.
Abstract
In this era, deep learning-based medical image analysis has become a reliable tool for assisting medical practitioners in diagnosing retinal diseases such as hypertension, diabetic retinopathy (DR), arteriosclerosis, glaucoma, and macular edema. Among these retinal diseases, DR can lead to vision loss in diabetic patients: it causes swelling of the retinal blood vessels and can even create new vessels. This vessel swelling and the creation of new vessels can be used as biomarkers for the screening and analysis of DR. Deep learning-based semantic segmentation of these vessels is therefore an effective tool for detecting changes in the retinal vasculature for diagnostic purposes. The segmentation task is challenging because of low-quality retinal images, varying image acquisition conditions, and intensity variations. Existing retinal blood vessel segmentation methods require a large number of trainable parameters to train their networks. This paper introduces a novel Dense Aggregation Vessel Segmentation Network (DAVS-Net), which achieves high segmentation performance with only a few trainable parameters. For faster convergence, the network uses an encoder-decoder framework in which edge information is transferred from the first layers of the encoder to the last layer of the decoder. The performance of the proposed network is evaluated on the publicly available retinal blood vessel datasets DRIVE, CHASE_DB1, and STARE. The proposed method achieves state-of-the-art segmentation accuracy with a small number of trainable parameters.
Year: 2021 PMID: 34972109 PMCID: PMC8719769 DOI: 10.1371/journal.pone.0261698
Source DB: PubMed Journal: PLoS One ISSN: 1932-6203 Impact factor: 3.240
Differences between DenseNet and the proposed DAVS-Net; a minimal code sketch of the two-convolution dense block described here follows the table.
| DenseNet | DAVS-Net |
|---|---|
| DenseNet is a classification network with fully connected layers | DAVS-Net is a semantic segmentation network; it does not use fully connected layers and therefore operates in a fully convolutional manner |
| DenseNet does not use any upsampling (decoder) | DAVS-Net is an encoder-decoder network |
| DenseNet uses many dense blocks (e.g., five dense blocks for the ImageNet dataset) | DAVS-Net uses only three dense blocks in the encoder and three dense blocks in the decoder |
| DenseNet uses four convolutional layers in each dense block | DAVS-Net uses just two convolutions in each dense block |
| DenseNet does not use unpooling layers, so it does not transfer pooling indices | DAVS-Net uses unpooling layers in combination with pooling layers, so pooling indices are transferred to the decoder |
| DenseNet uses global average pooling at the end of the network | DAVS-Net uses a max-pooling layer after each dense block |
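To make the dense-block design in the table above concrete, here is a minimal PyTorch sketch of one encoder dense block as this record describes it: two 3 × 3 convolutions (each with BN and ReLU), concatenation of their outputs, a 1 × 1 bottleneck to control channel depth, and max pooling that returns indices for later unpooling. The class name, padding, and pooling stride are assumptions made for illustration, not the authors' released code.

```python
import torch
import torch.nn as nn

class EncoderDenseBlock(nn.Module):
    """Sketch of a DAVS-Net-style encoder dense block (assumed layer details).

    Two 3x3 convolutions (BN + ReLU), concatenation of both outputs,
    a 1x1 bottleneck, and 2x2 max pooling that keeps indices for unpooling.
    """
    def __init__(self, in_ch, out_ch, bneck_ch):
        super().__init__()
        self.conv1 = nn.Sequential(
            nn.Conv2d(in_ch, out_ch, kernel_size=3, padding=1),
            nn.BatchNorm2d(out_ch), nn.ReLU(inplace=True))
        self.conv2 = nn.Sequential(
            nn.Conv2d(out_ch, out_ch, kernel_size=3, padding=1),
            nn.BatchNorm2d(out_ch), nn.ReLU(inplace=True))
        # 1x1 bottleneck over the concatenated features controls channel depth.
        self.bottleneck = nn.Sequential(
            nn.Conv2d(2 * out_ch, bneck_ch, kernel_size=1),
            nn.BatchNorm2d(bneck_ch), nn.ReLU(inplace=True))
        # return_indices=True so the decoder can unpool with the same indices.
        self.pool = nn.MaxPool2d(kernel_size=2, stride=2, return_indices=True)

    def forward(self, x):
        c1 = self.conv1(x)
        c2 = self.conv2(c1)
        cat = torch.cat([c1, c2], dim=1)   # dense aggregation within the block
        b = self.bottleneck(cat)
        pooled, indices = self.pool(b)
        # c1 is also reused by the matching decoder block (skip connection).
        return pooled, indices, c1

# Example: first encoder block on a 640 x 640 RGB fundus image.
block = EncoderDenseBlock(in_ch=3, out_ch=64, bneck_ch=64)
y, idx, skip = block(torch.randn(1, 3, 640, 640))
print(y.shape)  # torch.Size([1, 64, 320, 320])
```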
Fig 1. Flow diagram of the proposed method.
Fig 2. Connectivity principle of DAVS-Net.
Fig 3. Architecture of DAVS-Net used for vessel segmentation in our work.
DAVS-Net encoder-decoder I/O feature map sizes.
Here EDB, EDB-C, EDB-Cat, DDB, DDB-C, and DDB-Cat denote encoder dense block, encoder dense block convolution, encoder dense block concatenation, decoder dense block, decoder dense block convolution, and decoder dense block concatenation, respectively. Where a layer is followed by a rectified linear unit (ReLU) and batch normalization (BN), the BN parameters appear as the second term in the Parameters column. A short script that reproduces these parameter counts follows the table.
| Dense Block | Layer/Size | Filters | Layer O/P | Parameters |
|---|---|---|---|---|
| EDB1 | EDB1-C1 | 64 | 640 × 640 × 64 | 1,792 + 128 |
| | EDB1-C2 / 3 × 3 × 64 to EDB1-Cat | 64 | 640 × 640 × 64 | 36,928 |
| | EDB1-Cat (EDB1-C1 * EDB1-C2) | - | 640 × 640 × 128 | - |
| | E-Bneck-1 | 64 | 640 × 640 × 64 | 8,256 + 128 |
| | Pool-1 | - | 320 × 320 × 64 | - |
| EDB2 | EDB2-C1 | 128 | 320 × 320 × 128 | 73,856 + 256 |
| | EDB2-C2 / 3 × 3 × 128 to EDB2-Cat | 128 | 320 × 320 × 128 | 147,584 |
| | EDB2-Cat (EDB2-C1 * EDB2-C2) | - | 320 × 320 × 256 | - |
| | E-Bneck-2 | 128 | 320 × 320 × 128 | 32,896 + 256 |
| | Pool-2 | - | 160 × 160 × 128 | - |
| EDB3 | EDB3-C1 | 256 | 160 × 160 × 256 | 295,168 + 512 |
| | EDB3-C2 / 3 × 3 × 256 to EDB3-Cat | 256 | 160 × 160 × 256 | 590,080 |
| | EDB3-Cat (EDB3-C1 * EDB3-C2) | - | 160 × 160 × 512 | - |
| | E-Bneck-3 | 256 | 160 × 160 × 256 | 131,328 + 512 |
| | Pool-3 | - | 80 × 80 × 256 | - |
| DDB3 | Unpool-3 | - | 160 × 160 × 256 | - |
| | DDB3-C1 | 256 | 160 × 160 × 256 | 590,080 + 512 |
| | DDB3-C2 / 3 × 3 × 256 to DDB3-Cat | 128 | 160 × 160 × 128 | 295,040 |
| | DDB3-Cat (DDB3-C1 * DDB3-C2 * EDB3-C1) | - | 160 × 160 × 640 | - |
| | D-Bneck-1 | 128 | 160 × 160 × 128 | 82,048 + 256 |
| DDB2 | Unpool-2 | - | 320 × 320 × 128 | - |
| | DDB2-C1 | 128 | 320 × 320 × 128 | 147,584 + 256 |
| | DDB2-C2 / 3 × 3 × 128 to DDB2-Cat | 64 | 320 × 320 × 64 | 73,792 |
| | DDB2-Cat (DDB2-C1 * DDB2-C2 * EDB2-C1) | - | 320 × 320 × 320 | - |
| | D-Bneck-2 | 64 | 320 × 320 × 64 | 20,544 + 128 |
| DDB1 | Unpool-1 | - | 640 × 640 × 64 | - |
| | DDB1-C1 | 64 | 640 × 640 × 64 | 36,928 + 128 |
| | DDB1-C2 / 3 × 3 × 64 to DDB1-Cat | 2 | 640 × 640 × 2 | 1,154 |
| | DDB1-Cat (DDB1-C1 * DDB1-C2 * EDB1-C1) | - | 640 × 640 × 130 | - |
| | D-Bneck-3 | 2 | 640 × 640 × 2 | 262 + 4 |
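The parameter counts in the table can be checked with the usual formulas: a k × k convolution contributes k·k·C_in·C_out + C_out parameters (weights plus biases), and batch normalization contributes 2·C (scale and shift). The short script below reproduces a few of the entries; it is only a verification aid and is not taken from the paper.

```python
def conv_params(k, c_in, c_out):
    """Weights + biases of a k x k convolution."""
    return k * k * c_in * c_out + c_out

def bn_params(c):
    """Learnable scale and shift of batch normalization."""
    return 2 * c

# A few rows from the DAVS-Net table above:
print(conv_params(3, 3, 64), bn_params(64))      # EDB1-C1: 1,792 + 128
print(conv_params(3, 64, 64))                    # EDB1-C2: 36,928
print(conv_params(1, 128, 64), bn_params(64))    # E-Bneck-1: 8,256 + 128
print(conv_params(3, 64, 128), bn_params(128))   # EDB2-C1: 73,856 + 256
print(conv_params(3, 256, 256), bn_params(256))  # EDB3-C2: 590,080
print(conv_params(1, 130, 2), bn_params(2))      # D-Bneck-3: 262 + 4
```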
Comparison of architectural differences with similar state-of-the-art networks; a short sketch of the pooling-index transfer mentioned in several rows follows the table.
| Model | Other architecture | DAVS-Net |
|---|---|---|
| | Overall 26 convolutional layers | Overall 12 convolutional layers |
| | No residual or dense connectivity | Dense connectivity is used |
| | First two blocks have two convolutional layers, while the others have three convolutional layers | Only two convolutions in each block |
| | The block with channel depth 512 is used twice | Does not use channel depth 512 |
| | Overall 23 convolutional layers | Overall 12 convolutional layers |
| | Up-convolutions in the decoder | Unpooling layers in the decoder |
| | No dense connectivity within the encoder/decoder (only dense connectivity from encoder to decoder) | Inner and outer dense connectivity for both encoder and decoder |
| | Channel depth 1024 is used in the bridge, which involves many trainable parameters | Maximum channel depth 256 |
| | Uses cropping | Does not use cropping |
| | Overall 16 convolutional layers | Overall 12 convolutional layers |
| | Based on residual connectivity | Based on dense connectivity |
| | The first convolutional block lacks feature-empowerment connectivity | Each convolutional layer is connected with dense empowerment |
| | No bottleneck layers are employed | Bottleneck layers are used to control the number of channels |
| | 10 residual paths | 12 dense paths |
| | Overall 89 convolutional layers | Overall 12 convolutional layers |
| | Overall 10 dense blocks are used across encoder and decoder | Overall 6 dense blocks are used across encoder and decoder |
| | Unpooling layers are not utilized | Pooling and unpooling layers are used in combination |
| | Eight convolutions in each dense block | Two convolutions in each dense block |
| | Maximum channel depth 512 | Maximum channel depth 256 |
| | 4 bottleneck layers are used only in the encoder | 6 bottleneck layers are used in both encoder and decoder |
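Several rows above contrast up-convolutions with the pooling/unpooling scheme used by DAVS-Net, in which max-pooling indices recorded in the encoder drive unpooling in the decoder. The fragment below is a minimal, generic PyTorch illustration of that mechanism (not the paper's code): pooling returns its argmax indices, and unpooling writes each value back to the position it came from, which helps preserve thin vessel boundaries without learned upsampling weights.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

pool = nn.MaxPool2d(kernel_size=2, stride=2, return_indices=True)

x = torch.randn(1, 64, 640, 640)   # encoder feature map
pooled, indices = pool(x)          # 1 x 64 x 320 x 320, plus argmax indices

# ... decoder processing of `pooled` would happen here ...

# Unpooling restores the original resolution, writing each value back to the
# position its maximum came from; all other positions stay zero.
unpooled = F.max_unpool2d(pooled, indices, kernel_size=2, stride=2)
print(unpooled.shape)              # torch.Size([1, 64, 640, 640])
```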
Summary of datasets used in the experiments.
| Dataset Name | Training Set | Test Set | Dataset Size | Dimension (pixels) |
|---|---|---|---|---|
| STARE | 10 | 10 | 20 | 700 × 605 |
| CHASE_DB1 | 20 | 8 | 28 | 999 × 960 |
| DRIVE | 20 | 20 | 40 | 565 × 584 |
Fig 4. Visual results on the CHASE_DB1 dataset.
From left-to-right: input images, ground truth, result obtained by our proposed method.
Fig 6. Visual results on the STARE dataset.
From left-to-right: input images, ground truth, result obtained by our proposed method.
Performance comparison of our proposed model on the CHASE_DB1 dataset with other existing models (Se: sensitivity, Sp: specificity, Acc: accuracy, AUC: area under the ROC curve); a sketch of how these metrics are computed from a vessel-probability map follows the table.
| Method | Year | Se | Sp | Acc | AUC |
|---|---|---|---|---|---|
| Khawaja | 2019 | 0.7974 | 0.9697 | 0.9528 | NA |
| Zhang | 2016 | 0.7626 | 0.9661 | 0.9452 | 0.9606 |
| Arsalan | 2019 | 0.8206 | 0.9800 | 0.9726 | 0.9800 |
| Jin | 2019 | 0.7595 | 0.9878 | 0.9641 | 0.9832 |
| Yin | 2020 | 0.7993 | 0.9868 | 0.9783 | 0.9869 |
| Wang | 2020 | 0.8186 | 0.9844 | 0.9673 | 0.9881 |
| SegNet-Basic | 2020 | 0.8190 | 0.9735 | 0.9638 | 0.9780 |
| Proposed DAVS-Net | 2021 | | | | |
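For reference, the columns in these comparison tables follow the standard pixel-wise definitions: Se = TP/(TP + FN), Sp = TN/(TN + FP), Acc = (TP + TN)/(TP + TN + FP + FN), and AUC is the area under the ROC curve of the soft vessel-probability map. The sketch below shows one generic way to compute them (using scikit-learn for the AUC); the function name and thresholding are assumptions for illustration, not the authors' evaluation code.

```python
import numpy as np
from sklearn.metrics import roc_auc_score

def vessel_metrics(prob_map, ground_truth, threshold=0.5):
    """Pixel-wise Se, Sp, Acc and AUC for a predicted vessel-probability map."""
    pred = (prob_map.ravel() >= threshold).astype(np.uint8)
    gt = ground_truth.ravel().astype(np.uint8)

    tp = np.sum((pred == 1) & (gt == 1))
    tn = np.sum((pred == 0) & (gt == 0))
    fp = np.sum((pred == 1) & (gt == 0))
    fn = np.sum((pred == 0) & (gt == 1))

    se = tp / (tp + fn)                        # sensitivity (recall on vessels)
    sp = tn / (tn + fp)                        # specificity (recall on background)
    acc = (tp + tn) / (tp + tn + fp + fn)      # overall pixel accuracy
    auc = roc_auc_score(gt, prob_map.ravel())  # threshold-free ranking quality
    return se, sp, acc, auc

# Example with synthetic data (shapes only illustrative):
gt = (np.random.rand(640, 640) > 0.9).astype(np.uint8)
prob = np.clip(gt * 0.8 + np.random.rand(640, 640) * 0.3, 0, 1)
print(vessel_metrics(prob, gt))
```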
Performance comparison of our proposed model on the DRIVE dataset with other existing models.
| Method | Year | Se | Sp | Acc | AUC |
|---|---|---|---|---|---|
| Ma | 2019 | 0.7916 | 0.9811 | 0.9570 | 0.9810 |
| Guo | 2019 | 0.7891 | 0.9804 | 0.9561 | 0.9806 |
| Wu | 2019 | 0.8038 | 0.9802 | 0.9578 | 0.9821 |
| Wang | 2019 | 0.7940 | 0.9816 | 0.9567 | 0.9772 |
| Arsalan | 2019 | 0.8022 | 0.9810 | 0.9655 | 0.9820 |
| Gu | 2019 | 0.8309 | - | 0.9545 | 0.9779 |
| Yin | 2020 | 0.8038 | 0.9837 | 0.9578 | 0.9846 |
| Wang | 2020 | 0.7991 | 0.9813 | 0.9581 | 0.9823 |
| SegNet-Basic | 2020 | 0.7949 | 0.9738 | 0.9579 | 0.9720 |
| Proposed DAVS-Net | 2021 | | | | |
Performance comparison of our proposed model on the STARE database with other existing models.
| Method | Year | Se | Sp | Acc | AUC |
|---|---|---|---|---|---|
| Jin | 2019 | 0.8155 | 0.9752 | 0.9610 | 0.9804 |
| Chen | 2018 | 0.8320 | 0.9760 | 0.9650 | 0.9735 |
| Wang | 2019 | 0.8074 | 0.9821 | 0.9661 | 0.9812 |
| Guo | 2019 | 0.7888 | 0.9801 | 0.9627 | 0.9840 |
| Arsalan | 2019 | 0.8526 | 0.9791 | 0.9697 | 0.9883 |
| Wu | 2019 | 0.8132 | 0.9814 | 0.9661 | 0.9860 |
| SegNet-Basic | 2020 | 0.8118 | 0.9738 | 0.9543 | 0.9728 |
| Wang | 2020 | 0.8239 | 0.9813 | 0.9670 | 0.9871 |
| Proposed DAVS-Net | 2021 | | | | |