Abbas Jafar, Muhammad Talha Hameed, Nadeem Akram, Umer Waqas, Hyung Seok Kim, Rizwan Ali Naqvi.
Abstract
Semantic segmentation for diagnosing chest-related diseases such as cardiomegaly, emphysema, pleural effusion, and pneumothorax is a critical yet understudied tool for identifying chest anatomy. Among these, cardiomegaly is particularly dangerous because it carries a high risk of sudden death. An expert medical practitioner can diagnose cardiomegaly early using a chest radiograph (CXR). Cardiomegaly is an enlargement of the heart that can be assessed by calculating the transverse cardiac diameter (TCD) and the cardiothoracic ratio (CTR). However, manual estimation of the CTR and of other chest-related diseases demands considerable time from medical experts. Artificial intelligence can estimate cardiomegaly and related diseases by segmenting CXRs according to their anatomical semantics. Unfortunately, automatic segmentation of the lungs and heart in CXRs is challenging because of poor image quality and intensity variations. Deep learning-based methods have been applied to chest anatomy segmentation, but most consider only lung segmentation and require a great deal of training. This work presents CardioNet, a multiclass concatenation-based automatic semantic segmentation network explicitly designed to perform fine segmentation using fewer parameters than conventional deep learning schemes. Furthermore, CardioNet is used for the semantic segmentation of other chest-related structures relevant to diagnosis. CardioNet is evaluated on the JSRT (Japanese Society of Radiological Technology) dataset, which is publicly available and contains multiclass segmentation masks for the heart, lungs, and clavicle bones. In addition, our study examined lung segmentation on another publicly available dataset, Montgomery County (MC). The proposed CardioNet model achieved acceptable accuracy and competitive results across all datasets.
Keywords: CardioNet; cardiothoracic ratio; chest anatomy; semantic segmentation; transverse cardiac diameter
Year: 2022 PMID: 35743771 PMCID: PMC9225197 DOI: 10.3390/jpm12060988
Source DB: PubMed Journal: J Pers Med ISSN: 2075-4426
Figure 1Chest anatomy segmentation to calculate cardiomegaly: (a) original CXR PA image, (b) segmented image by CardioNet, (c) maximum width of heart and thorax to calculate the CTR.
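As Figure 1c indicates, the CTR is derived from the segmentation by comparing the maximum width of the heart with the maximum width of the thorax. A minimal sketch of that calculation is below; the label values (0 = background, 1 = lungs, 2 = heart), the toy mask, and the approximation of thoracic width by the widest extent of the lung-plus-heart region are illustrative assumptions, not the paper's exact procedure.

```python
# Hedged sketch: cardiothoracic ratio (CTR) from a multiclass mask.
# Labels are assumed: 0 = background, 1 = lungs, 2 = heart.

def max_width(mask, labels):
    """Widest horizontal span (in pixels) covered by any of `labels`."""
    best = 0
    for row in mask:
        cols = [i for i, v in enumerate(row) if v in labels]
        if cols:
            best = max(best, cols[-1] - cols[0] + 1)
    return best

def cardiothoracic_ratio(mask):
    tcd = max_width(mask, {2})        # transverse cardiac diameter (heart)
    thorax = max_width(mask, {1, 2})  # thoracic width ~ lungs + heart extent
    return tcd / thorax

# Toy 5 x 10 mask: lung fields (1) flanking the heart (2).
toy = [
    [0, 1, 1, 0, 0, 0, 1, 1, 1, 0],
    [1, 1, 1, 0, 2, 2, 1, 1, 1, 1],
    [1, 1, 1, 2, 2, 2, 2, 1, 1, 1],
    [1, 1, 2, 2, 2, 2, 2, 2, 1, 1],
    [0, 1, 1, 2, 2, 2, 1, 1, 1, 0],
]
print(cardiothoracic_ratio(toy))  # TCD = 6, thorax = 10 -> 0.6
```

A CTR above roughly 0.5 is the conventional radiographic threshold for suspecting cardiomegaly, which is why an accurate heart/lung segmentation directly supports this diagnosis.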
Figure 2Flowchart of the proposed CardioNet methodology.
Figure 3The proposed CardioNet architecture for CXR semantic segmentation.
CardioNet with feature concatenation, where the Downsample and Upsample blocks include convolution (Conv), bottleneck convolution (Bottleneck-C), depth-wise separable convolution (DW-Sep-Conv), concatenation, and pooling layers. Batch normalization and ReLU layers accompany the convolutions and are denoted by “**”.
| Block | Layer Name | Layer Size | Filters/Groups | Output |
|---|---|---|---|---|
| Downsample | Conv-1-1 ** | 3 × 3 × 64 (S = 1) | 64 | 350 × 350 × 64 |
| | Conv-1-2 | 3 × 3 × 64 (S = 1) | 64 | 350 × 350 × 64 |
| | Concatenation-1 | | | 350 × 350 × 128 |
| | Bottleneck-C-1 ** | 1 × 1 (S = 1) | 64 | 350 × 350 × 64 |
| | Pool-1 | 2 × 2 (S = 2) | | 175 × 175 × 64 |
| | Conv-2-1 ** | 3 × 3 × 64 (S = 1) | 128 | 175 × 175 × 128 |
| | Conv-2-2 | 3 × 3 × 128 (S = 1) | 128 | 175 × 175 × 128 |
| | Concatenation-2 | | | 175 × 175 × 256 |
| | Bottleneck-C-2 ** | 1 × 1 (S = 1) | 128 | 175 × 175 × 128 |
| | Pool-2 | 2 × 2 (S = 2) | | 87 × 87 × 128 |
| | Conv-3-1 ** | 3 × 3 × 128 (S = 1) | 256 | 87 × 87 × 256 |
| | DW-Sep-Conv-3-2 | 3 × 3 × 256 (S = 1) | 256 | 87 × 87 × 256 |
| | Concatenation-3 | | | 87 × 87 × 512 |
| | Bottleneck-C-3 ** | 1 × 1 (S = 1) | 128 | 87 × 87 × 256 |
| | Pool-3 | 2 × 2 (S = 2) | | 43 × 43 × 256 |
| | DW-Sep-Conv-4-1 ** | 3 × 3 × 256 (S = 1) | 256 | 43 × 43 × 256 |
| | DW-Sep-Conv-4-2 | 3 × 3 × 256 (S = 1) | 256 | 43 × 43 × 256 |
| | Concatenation-4 | | | 43 × 43 × 512 |
| | Bottleneck-C-4 ** | 1 × 1 (S = 1) | 256 | 43 × 43 × 256 |
| | Pool-4 | 2 × 2 (S = 2) | | 21 × 21 × 256 |
| Upsample | UnPool-4 | 2 × 2 (S = 2) | | 43 × 43 × 256 |
| | DW-Sep-Conv-4-2 ** | 3 × 3 × 256 (S = 1) | 256 | 43 × 43 × 256 |
| | DW-Sep-Conv-4-1 | 3 × 3 × 256 (S = 1) | 256 | 43 × 43 × 256 |
| | Concatenation-5 | | | 43 × 43 × 512 |
| | Bottleneck-C-5 ** | 1 × 1 (S = 1) | 256 | 43 × 43 × 256 |
| | UnPool-3 | 2 × 2 (S = 2) | | 87 × 87 × 256 |
| | DW-Sep-Conv-3-2 ** | 3 × 3 × 256 (S = 1) | 256 | 87 × 87 × 256 |
| | Conv-3-1 | 3 × 3 × 256 (S = 1) | 128 | 87 × 87 × 128 |
| | Concatenation-6 | | | 87 × 87 × 640 |
| | Bottleneck-C-6 ** | 1 × 1 (S = 1) | 128 | 87 × 87 × 128 |
| | UnPool-2 | 2 × 2 (S = 2) | | 175 × 175 × 128 |
| | Conv-2-2 ** | 3 × 3 × 128 (S = 1) | 128 | 175 × 175 × 128 |
| | Conv-2-1 | 3 × 3 × 128 (S = 1) | 64 | 175 × 175 × 64 |
| | Concatenation-7 | | | 175 × 175 × 320 |
| | Bottleneck-C-7 ** | 1 × 1 (S = 1) | 64 | 175 × 175 × 64 |
| | UnPool-1 | 2 × 2 (S = 2) | | 350 × 350 × 64 |
| | Conv-1-2 ** | 3 × 3 × 64 (S = 1) | 64 | 350 × 350 × 64 |
| | Conv-1-1 | 3 × 3 × 64 (S = 1) | 64 | 350 × 350 × 64 |
| | Concatenation-8 | | | 350 × 350 × 160 |
| | Bottleneck-C-8 ** | 1 × 1 (S = 1) | 2 | 350 × 350 × 2 |
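The spatial sizes in the table follow a simple pattern that can be checked mechanically. A minimal sketch, assuming 3 × 3 convolutions with stride 1 and "same" padding (spatial size unchanged) and 2 × 2 pooling with stride 2 (spatial size halved with floor division), reproduces the resolution sequence of the four Downsample blocks:

```python
# Sketch: verify the spatial sizes along CardioNet's downsample path.
# Padding behavior is an assumption inferred from the table, not stated
# explicitly in this excerpt.

def same_conv(size):   # 3 x 3, S = 1, "same" padding keeps the size
    return size

def pool(size):        # 2 x 2, S = 2 halves the size (floor)
    return size // 2

size = 350             # CardioNet input resolution
sizes = [size]
for _ in range(4):     # four Downsample blocks (Pool-1 .. Pool-4)
    size = pool(same_conv(same_conv(size)))
    sizes.append(size)
print(sizes)  # [350, 175, 87, 43, 21] -- matches the Output column
```

The Upsample column runs the same sequence in reverse via unpooling, ending at the 350 × 350 × 2 output of Bottleneck-C-8.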
An architectural comparison of CardioNet with well-known segmentation methods.
| Method | Other Architectures | CardioNet |
|---|---|---|
| SegNet [ | 26 convolutional layers (3 × 3) | 16 convolutional layers (3 × 3) |
| | No depth-wise separable convolutions | 6 depth-wise separable convolutions reduce the number of trainable parameters |
| | No skip connections are used. | Dense skip paths are used. |
| | Each block has a different number of convolutional layers. | Each block has the same number of convolutions (2 convolutions). |
| | No booster block | Booster block |
| | 5 pooling layers | 4 pooling layers |
| | The number of trainable parameters is 29.46 M. | The number of trainable parameters is 1.72 M. |
| OR-Skip-Net [ | There is no internal connectivity between the convolutional layers in the encoder and decoder. | Both internal and external connectivity are used. |
| | Residual connectivity is used. | Dense connectivity is used. |
| | 16 convolutional layers (3 × 3) | 16 convolutional layers (3 × 3), including 6 layers of the booster block (max. depth 32) |
| | No depth-wise separable convolutions | 6 depth-wise separable convolutions reduce the number of trainable parameters |
| | Bottleneck layers are not used. | Bottleneck layers are used to reduce the number of channels. |
| | The number of trainable parameters is 9.70 M. | The number of trainable parameters is 1.72 M. |
| U-Net [ | 23 convolutional layers are used. | 16 convolutional layers (3 × 3) |
| | No depth-wise separable convolutions | 6 depth-wise separable convolutions reduce the number of trainable parameters |
| | Up-convolutions are used in the expansive part for upsampling. | Unpooling layers are used for upsampling. |
| | External dense connectivity is used from encoder to decoder. | Both internal and external dense connectivity in the downsampling and upsampling blocks |
| | Cropping is required owing to border pixel loss during convolution. | Cropping is not required. |
| | The number of trainable parameters is 31.03 M. | The number of trainable parameters is 1.72 M. |
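The parameter savings from depth-wise separable convolutions, repeatedly claimed in the comparison above, follow from a standard counting argument. A minimal sketch (bias terms ignored; the 256-channel, 3 × 3 layer size is taken from the DW-Sep-Conv rows of the architecture table):

```python
# Sketch: parameter counts for a standard vs. a depth-wise separable
# convolution. This is the generic counting argument, not CardioNet's
# exact implementation.

def standard_conv_params(k, c_in, c_out):
    # One k x k x c_in filter per output channel.
    return k * k * c_in * c_out

def dw_separable_params(k, c_in, c_out):
    depthwise = k * k * c_in   # one k x k filter per input channel
    pointwise = c_in * c_out   # 1 x 1 convolution mixes channels
    return depthwise + pointwise

std = standard_conv_params(3, 256, 256)
sep = dw_separable_params(3, 256, 256)
print(std, sep, round(std / sep, 1))  # 589824 67840 8.7
```

At this layer size the separable version uses roughly 8.7× fewer weights, which is how CardioNet reaches 1.72 M trainable parameters against 29–31 M for SegNet and U-Net.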
Figure 4The CardioNet dense connectivity feature concatenation method.
Feature map size details for features boost block. Batch normalization and ReLU layers are used with bottleneck convolution layers as a unit and denoted as “**”. Stride is 1 throughout the features boost block.
| Block | Layer Name | Layer Size | Filters/Groups | Output |
|---|---|---|---|---|
| Features boost block (FBB) | Bottleneck-C ** | 1 × 1 × 8 | 8 | 350 × 350 × 8 |
| | Boost-Conv-1-1 ** | 3 × 3 × 8 | 8 | 350 × 350 × 8 |
| | Boost-Conv-1-2 ** | 3 × 3 × 8 | 8 | 350 × 350 × 8 |
| | Boost-Conv-2-1 ** | 3 × 3 × 16 | 16 | 350 × 350 × 16 |
| | Boost-Conv-2-2 ** | 3 × 3 × 16 | 16 | 350 × 350 × 16 |
| | Boost-Conv-3-1 ** | 3 × 3 × 32 | 32 | 350 × 350 × 32 |
| | Boost-Conv-3-2 ** | 3 × 3 × 32 | 32 | 350 × 350 × 32 |
Figure 5CardioNet features boost block concatenation.
Figure 6Sample CXR images from the JSRT dataset with corresponding ground truth.
Figure 7The proposed data augmentation approach increases the data size; the horizontal flip is represented as H-Flip.
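For segmentation, augmentations such as the H-Flip in Figure 7 must be applied identically to the image and its ground-truth mask so the pixel labels stay aligned. A minimal sketch with illustrative toy arrays (not the paper's augmentation pipeline):

```python
# Sketch: horizontal-flip (H-Flip) augmentation for an image/mask pair.

def h_flip(image):
    """Mirror each row left-to-right."""
    return [row[::-1] for row in image]

image = [[1, 2, 3],
         [4, 5, 6]]
mask  = [[0, 0, 1],
         [0, 1, 1]]

# Flip image and mask together so labels remain aligned with pixels.
aug_image, aug_mask = h_flip(image), h_flip(mask)
print(aug_image)  # [[3, 2, 1], [6, 5, 4]]
print(aug_mask)   # [[1, 0, 0], [1, 1, 0]]
```

Flipping is an involution (applying it twice restores the original), which makes it a cheap, label-preserving way to double the training data.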
Figure 8The training loss curves are on the left, and the training accuracy curves are on the right of each figure. The two-fold training is based on the number of epochs: (a) first fold, (b) second fold.
Figure 9Examples of chest-related organ segmentation by CardioNet: (a) original chest PA X-ray image; (b) image with the ground-truth mask; (c) mask predicted by CardioNet.
The ablation-study comparison of CardioNet-B and CardioNet-X on the JSRT dataset (Acc = accuracy, J = Jaccard index, D = Dice coefficient; unit: %).
| Methods | Segmentation Regions | Number of Trainable Parameters | Number of 3 × 3 Convolution Layers | Acc | J | D |
|---|---|---|---|---|---|---|
| CardioNet-X | Lungs | 1.57 M | 10 | 98.08 | 93.04 | 96.38 |
| | Heart | | | 98.91 | 88.70 | 93.84 |
| | Clavicle bone | | | 97.81 | 85.99 | 91.53 |
| CardioNet-B | Lungs | 1.72 M | 16 | 99.24 | 97.28 | 98.61 |
| | Heart | | | 99.08 | 90.42 | 94.76 |
| | Clavicle bone | | | 99.76 | 86.74 | 92.74 |
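The three metrics reported throughout these tables have standard definitions over binary masks. A minimal sketch, with illustrative toy masks (flattened lists of 0/1 pixels):

```python
# Sketch: pixel accuracy (Acc), Jaccard index (J, intersection over
# union), and Dice coefficient (D) for binary segmentation masks.

def scores(pred, truth):
    tp = sum(p and t for p, t in zip(pred, truth))          # true positives
    fp = sum(p and not t for p, t in zip(pred, truth))      # false positives
    fn = sum(t and not p for p, t in zip(pred, truth))      # false negatives
    tn = len(pred) - tp - fp - fn                           # true negatives
    acc = (tp + tn) / len(pred)
    jaccard = tp / (tp + fp + fn)
    dice = 2 * tp / (2 * tp + fp + fn)
    return acc, jaccard, dice

pred  = [1, 1, 1, 0, 0, 0, 1, 0]
truth = [1, 1, 0, 0, 0, 1, 1, 0]
acc, j, d = scores(pred, truth)
print(acc, j, d)  # 0.75 0.6 0.75
```

Note that Dice and Jaccard are monotonically related (D = 2J / (1 + J)), so Dice is always at least as large as Jaccard, a useful sanity check when reading the comparison tables.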
Accuracies of CardioNet and existing methods on the JSRT dataset (Acc = accuracy, J = Jaccard index, D = Dice coefficient; unit: %).
| Type | Method | Lungs Acc | Lungs J | Lungs D | Heart Acc | Heart J | Heart D | Clavicle Acc | Clavicle J | Clavicle D |
|---|---|---|---|---|---|---|---|---|---|---|
| Local feature-based methods | Coppini et al. [ | - | 92.7 | 95.5 | - | - | - | - | - | - |
| | Jangam et al. [ | - | 95.6 | 97.6 | - | - | - | - | - | - |
| | ASM default [ | - | 90.3 | - | - | 79.3 | - | - | 69.0 | - |
| | Chondro et al. [ | - | 96.3 | - | - | - | - | - | - | - |
| | Candemir et al. [ | - | 95.4 | 96.7 | - | - | - | - | - | - |
| | Dawoud [ | - | 94.0 | - | - | - | - | - | - | - |
| | Peng et al. [ | 97.0 | 93.6 | 96.7 | - | - | - | - | - | - |
| | Wan Ahmed et al. [ | 95.77 | - | - | - | - | - | - | - | - |
| Deep feature-based methods | Dai et al. FCN [ | - | 94.7 | 97.3 | - | 86.6 | 92.7 | - | - | - |
| | Oliveira et al. FCN [ | - | 95.05 | 97.45 | - | 89.25 | 94.24 | - | 75.52 | 85.90 |
| | OR-Skip-Net [ | 98.92 | 96.14 | 98.02 | 98.94 | 88.8 | 94.01 | 99.70 | 83.79 | 91.07 |
| | ResNet101 [ | - | 95.3 | 97.6 | - | 90.4 | 94.9 | - | 85.2 | 92.0 |
| | ContextNet-2 [ | - | 96.5 | - | - | - | - | - | - | - |
| | BFPN [ | - | 87.0 | 93.0 | - | 82.0 | 90.0 | - | - | - |
| | InvertedNet [ | - | 94.9 | 97.4 | - | 88.8 | 94.1 | - | 83.3 | 91.0 |
| | HybridGNet [ | - | - | 97.43 | - | - | 93.34 | - | - | - |
| | RU-Net [ | - | 85.57 | - | - | - | - | - | - | - |
| | MPDC DDLA U-Net [ | - | 95.61 | 97.90 | - | - | - | - | - | - |
Figure 10Examples of MC chest X-ray images with ground truths.
Figure 11Examples of CardioNet lung semantic segmentation on MC data: (a) the best and most accurate segmentation (right image) and (b) the worst lung segmentation (right image) by the proposed CardioNet.
A comparison of the proposed CardioNet with other state-of-the-art methods on the MC dataset (unit: %).
| Type | Method | Accuracy | Jaccard Index | Dice Coefficient |
|---|---|---|---|---|
| Handcrafted feature-based methods | Vajda et al. [ | 69.0 | - | - |
| | Candemir et al. [ | - | 94.1 | 96.0 |
| | Peng et al. [ | 97.0 | - | - |
| Deep feature-based methods | Feature selection and vote [ | 83.0 | - | - |
| | Feature selection with BN [ | 77.0 | - | - |
| | Bayesian feature pyramid network [ | - | 87.0 | 93.0 |
| | Souza et al. [ | 96.97 | 88.07 | 96.97 |
| | HybridGNet [ | - | - | 95.4 |
| | MPDC DDLA U-Net [ | - | 94.83 | 96.53 |
Figure 12Example image from the JSRT dataset for calculating CTR using the proposed CardioNet: (a) original image; (b) ground truth image annotated with the cardiothoracic ratio (G-CTR); (c) predicted mask annotated with the cardiothoracic ratio (P-CTR).