Quantification of pulmonary involvement in COVID-19 pneumonia by means of a cascade of two U-nets: training and assessment on multiple datasets using different annotation criteria.

Francesca Lizzi^1,2, Abramo Agosti^3, Francesca Brero^4,5, Raffaella Fiamma Cabini^4,3, Maria Evelina Fantacci^6,7, Silvia Figini^4,8, Alessandro Lascialfari^4,5, Francesco Laruina^9,6, Piernicola Oliva^10,11, Stefano Piffer^12,13, Ian Postuma^4, Lisa Rinaldi^4,5, Cinzia Talamonti^12,13, Alessandra Retico^6.

Abstract

PURPOSE: This study aims at exploiting artificial intelligence (AI) for the identification, segmentation and quantification of COVID-19 pulmonary lesions. The limited data availability and the annotation quality are relevant factors in training AI-methods. We investigated the effects of using multiple datasets, heterogeneously populated and annotated according to different criteria.
METHODS: We developed an automated analysis pipeline, the LungQuant system, based on a cascade of two U-nets. The first one (U-net$_1$) is devoted to the identification of the lung parenchyma; the second one (U-net$_2$) acts on a bounding box enclosing the segmented lungs to identify the areas affected by COVID-19 lesions. Different public datasets were used to train the U-nets and to evaluate their segmentation performance, which was quantified in terms of the Dice Similarity Coefficient (DSC). The accuracy of the LungQuant system in predicting the CT-Severity Score (CT-SS) was also evaluated.
RESULTS: Both the volumetric DSC (vDSC) and the accuracy showed a dependency on the annotation quality of the released data samples. On an independent dataset (COVID-19-CT-Seg), both the vDSC and the surface DSC (sDSC) were measured between the masks predicted by the LungQuant system and the reference ones. vDSC (sDSC) values of 0.95±0.01 and 0.66±0.13 (0.95±0.02 and 0.76±0.18, with 5 mm tolerance) were obtained for the segmentation of lungs and COVID-19 lesions, respectively. The system achieved an accuracy of 90% in CT-SS identification on this benchmark dataset.
CONCLUSION: We analysed the impact of using data samples with different annotation criteria in training an AI-based quantification system for pulmonary involvement in COVID-19 pneumonia. In terms of vDSC measures, the U-net segmentation strongly depends on the quality of the lesion annotations. Nevertheless, the CT-SS can be accurately predicted on independent test sets, demonstrating the satisfactory generalization ability of the LungQuant system.

Keywords: COVID-19; Chest Computed Tomography; Ground-glass opacities; Machine Learning; Segmentation; U-net

Year: 2021 PMID: 34698988 PMCID: PMC8547130 DOI: 10.1007/s11548-021-02501-2

Source DB: PubMed Journal: Int J Comput Assist Radiol Surg ISSN: 1861-6410 Impact factor: 2.924

Introduction

Segmenting the abnormalities of the lung parenchyma related to COVID-19 infection is a typical segmentation problem that can be addressed with methods based on Deep Learning (DL). CT findings of patients with COVID-19 infection may include bilateral distribution of ground-glass opacifications (GGO), consolidations, crazy-paving patterns, reversed halo sign and vascular enlargement [2]. Because COVID-19 lesions are extremely heterogeneous in density, textural pattern, global shape and location in the lung, an analytical approach is hard to code. DL-based segmentation approaches are particularly suited to this case, provided that a sufficient number of annotated examples is available for training the models. A few fully automated software tools devoted to this task have recently been proposed [4, 10, 11]. Lessmann et al. [10] developed a U-net model for lesion segmentation trained on semi-automatically annotated COVID-19 cases. The output of this system was then combined with the lung lobe segmentation algorithm reported in Xie et al. [14]. The approach proposed in Fang et al. [4] implements the automated lung segmentation method provided in the work of Hofmanninger et al. [7], together with a lesion segmentation strategy based on multiscale feature extraction [5]. The specific problem of developing fully automated DL-based segmentation strategies with limited annotated data samples has been explicitly tackled by Ma et al. [11]. The authors studied how to train and evaluate a DL-based system for lung and COVID-19 lesion segmentation on poorly populated samples of CT scans. They also made the data publicly available, allowing for a fair comparison with their system. In this work, we present a DL-based fully automated system to segment both lungs and lesions associated with COVID-19 pneumonia, the LungQuant system, which provides the fraction of lung volume compromised by the infection.
We extended the study proposed by Ma et al. [11], focusing on the impact of using different datasets and different labelling styles. Data gathered from different sources can be highly variable in terms of acquisition protocols and scanners, which makes the segmentation performance depend on the characteristics of the training sample. Although advanced data harmonization strategies could mitigate this problem [6], they are not applicable in the absence of data acquisition information, as is the case for the CT data available to this study. Nevertheless, DL methods, when trained with sufficiently large samples of heterogeneous data, can acquire the desired generalization ability by themselves. In our analysis, we implemented an inter-sample cross-validation method to train, test and evaluate the generalization ability of the LungQuant DL-based segmentation pipeline across the different available datasets. Finally, we also quantified the effect of using larger datasets to train, validate and test this kind of algorithm.

Material and Methods

Datasets

We used only publicly available datasets in order to make our results easily verifiable and reproducible. Five different datasets have been used to train and evaluate our segmentation pipeline. Most of them include image annotations, but each annotation has been associated with patients using different criteria. In Table 1, a summary of available labels for each dataset is reported.

Table 1

Dataset name	Lung mask	GGO mask	CT-SS	N. of cases
Plethora [8]	Yes	No	No	402
Lung CT Segmentation Challenge [15]	Yes	No	No	60
COVID-19 Challenge [1]	No	Yes	No	199
MosMed [12]	No	No	No	1110
MosMed (annotated subsample)	No	Yes	Inferable	50
MosMed (in-house annotated subsample)	Yes	No	No	91
COVID-19-CT-Seg [11]	Yes	Yes	Inferable	10

A summary of the datasets used in this study. The CT Severity Score (CT-SS) information is not available for all datasets, but it can be computed for data which have both lung masks and ground-glass opacification (GGO) masks.

The lung segmentation problem has been tackled using a wide representation of the population and three different datasets: Plethora, the Lung CT Segmentation Challenge and a subset of the MosMed dataset. On the other hand, the number of samples that are publicly available for COVID-19 infection segmentation may not be sufficient to obtain good performance on this task. The currently available data provided along with infection annotations have been labelled following different guidelines and released in NIfTI format. They do not contain complete acquisition and population information, and they have been stored according to different criteria (see the Supplementary Materials for further details). Some of the choices made during the DICOM to NIfTI conversion may strongly affect the quality of data. For example, the MosMed dataset, as described by Morozov et al. [12], preserves only one slice out of ten during this conversion. This operation results in a significant loss of resolution with respect to the COVID-19 Challenge dataset. Assessing how much such a conversion influences the quantitative analysis is important, both to improve performance and to allow a fair comparison of DL algorithms.

LungQuant: a DL based quantification analysis pipeline

The analysis pipeline, hereafter referred to as the LungQuant system, outputs the lung and COVID-19 infection segmentation masks, the percentage P of lung volume affected by COVID-19 lesions and the corresponding CT-SS (CT-SS 1 for P < 5%, CT-SS 2 for 5% ≤ P < 25%, CT-SS 3 for 25% ≤ P < 50%, CT-SS 4 for 50% ≤ P < 75%, CT-SS 5 for P ≥ 75%). A summary of our image analysis pipeline is reported in Fig. 1. The central analysis module is a U-net for image segmentation [13] (see Sec. U-net), which is implemented in a cascade of two different U-nets: the first network, U-net$_1$, is trained to segment the lung and the second one, U-net$_2$, is trained to segment the COVID lesions in the CT scans.
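The mapping from the percentage P to the CT-SS class can be written as a small lookup. The following is a minimal sketch, not the authors' code: the function name `ct_severity_score` is hypothetical, and it assumes that each lower bound is inclusive and each upper bound exclusive.

```python
def ct_severity_score(p_percent: float) -> int:
    """Map the percentage P of affected lung volume to a CT-SS class (1 to 5)."""
    if not 0.0 <= p_percent <= 100.0:
        raise ValueError("P must be a percentage between 0 and 100")
    # Exclusive upper bounds of classes 1..4 (assumed convention);
    # any value at or above 75% falls into class 5.
    upper_bounds = (5.0, 25.0, 50.0, 75.0)
    for score, bound in enumerate(upper_bounds, start=1):
        if p_percent < bound:
            return score
    return 5
```

With this convention a scan with P = 4.9% is assigned CT-SS 1, while P = 5% already falls into CT-SS 2.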

Fig. 1

A summary of the whole analysis pipeline: the input CT scans are used to train U-net$_1$, which is devoted to lung segmentation; its output is refined by a morphology-based method. A bounding box containing the segmented lungs is made and applied to all CT scans for training U-net$_2$, which is devoted to COVID-19 lesion segmentation. Finally, the output of U-net$_2$ is the definitive COVID-19 lesion mask, whereas the definitive lung mask is obtained as the union between the outputs of U-net$_1$ and U-net$_2$. The ratio between the COVID-19 lesion mask and the lung mask provides the CT-SS for each patient.

U-net

For both lung and COVID-19 lesion segmentation, we implemented a U-net using Keras [3], a Python DL API that uses TensorFlow as backend. In Fig. 2, a simplified scheme of our U-net is reported.

Fig. 2

U-net scheme: the neural network is made of 6 levels of depth. In the compression path (left), the input is processed through convolutions, activation layers (ReLU) and instance normalization layers, while in the decompression one (right), in addition to those already mentioned, 3D Transpose Convolution (de-convolution) layers are also introduced.

Each block of layers in the compression path (left) is made of 3 convolutional layers, ReLU activation functions and instance normalization layers. The input of each block is added to the block output in order to implement a residual connection. In the decompression path (right), one convolutional layer has been replaced by a de-convolutional layer to upsample the images to the input size. In the last layer of the U-nets, a softmax is applied to the final feature map, and then the loss is computed.

The U-net cascade for lesion quantification and severity score assignment

The input CT scans, whose number of slices is highly variable, were resampled to fixed-size voxel matrices and then used to train U-net$_1$, which is devoted to lung segmentation, using the three datasets containing original CT scans and lung masks (see Table 1). The output of U-net$_1$ was refined using a connected component labelling strategy to remove small regions of the segmented mask not connected with the main objects identified as the lungs. We identified the connected components in the lung masks generated by U-net$_1$, and we excluded those components whose number of voxels was below an empirically fixed threshold (see Supplementary Materials for further details). We then built for each CT a bounding box enclosing the refined segmented lungs, adding a conservative padding of 2.5 cm. The bounding boxes were used to crop the training images for U-net$_2$, which has the same architecture as U-net$_1$. Training U-net$_2$ to recognize the COVID-19 lesions on a conservative bounding box has two main advantages: it restricts the action volume of the U-net to the region where the lung parenchyma is supposed to be, thus avoiding false-positive findings outside the chest; and it facilitates the U-net training phase, as the dimensions of the lungs of different patients are standardized to focus the U-net learning process on the textural patterns characterizing the COVID-19 lesions. The cropped images were then resized to a fixed-size voxel matrix. We applied a windowing on the grey-level values of the CT scans to optimize the image contrast for the two segmentation problems: the [−1000, 1000] HU window range for U-net$_1$ and the [−1000, 300] HU range for U-net$_2$. The first window highlights the contrast between the lung parenchyma and the surrounding tissues, whereas the second one enhances the heterogeneous structure of the lung abnormalities related to the COVID-19 infection.
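The bounding-box and windowing steps can be illustrated in plain Python. This is only a sketch under stated assumptions: the lung mask is represented as a collection of (z, y, x) voxel indices, the voxel spacing in millimetres is known, and the windowed values are rescaled to [0, 1], a normalization the text does not specify. The names `padded_bounding_box` and `apply_window` are hypothetical.

```python
import math

def padded_bounding_box(voxels, shape, spacing_mm, padding_mm=25.0):
    """Return per-axis (start, stop) index ranges enclosing the mask plus padding.

    voxels:     iterable of (z, y, x) indices belonging to the lung mask
    shape:      (nz, ny, nx) size of the CT volume in voxels
    spacing_mm: (sz, sy, sx) voxel size in millimetres (assumed to be known)
    padding_mm: physical padding; 25 mm corresponds to the 2.5 cm in the text
    """
    voxels = list(voxels)
    if not voxels:
        raise ValueError("empty lung mask")
    box = []
    for axis in range(3):
        coords = [v[axis] for v in voxels]
        pad = math.ceil(padding_mm / spacing_mm[axis])  # padding in voxels
        start = max(min(coords) - pad, 0)                # clamp to the volume
        stop = min(max(coords) + 1 + pad, shape[axis])   # exclusive end index
        box.append((start, stop))
    return box

def apply_window(hu_value, low, high):
    """Clip a HU value to [low, high] and rescale it to [0, 1] (assumed scaling)."""
    clipped = min(max(hu_value, low), high)
    return (clipped - low) / (high - low)
```

For instance, `apply_window(hu, -1000, 1000)` would correspond to the lung window and `apply_window(hu, -1000, 300)` to the lesion window. A real pipeline would apply the same operations to whole arrays rather than single values.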
We implemented a data augmentation strategy, relying on the most commonly used data augmentation techniques for DL (see Supplementary Materials for further details), to overcome the problem of having a limited amount of labelled data. We transformed the images with rotations, zooming, elastic transformations and added Gaussian noise. The LungQuant system returns the infection mask as the output of U-net$_2$ and the lung mask as the union between the outputs of U-net$_1$ and U-net$_2$. This choice has been made a priori by design, as U-net$_1$ has been trained to segment the lungs relying on the available annotated data, which come almost entirely from patients not affected by COVID-19 pneumonia. Thus, U-net$_1$ is expected to be unable to accurately segment the areas affected by GGO or consolidations; since these areas are also part of the lungs, they should be included in the mask. Lastly, once lung and lesion masks have been identified, the LungQuant system computes the percentage of lung volume affected by COVID-19 lesions as the ratio between the volume of the infection mask and the volume of the lung mask and converts it into the corresponding CT severity score.
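The final quantification step reduces to a set union and a ratio. The sketch below assumes that both masks are given as sets of voxel indices on the same grid with uniform voxel size, so that the ratio of voxel counts equals the ratio of volumes; the function name is hypothetical.

```python
def affected_lung_percentage(lung_voxels, lesion_voxels):
    """Percentage P of lung volume occupied by lesions.

    lung_voxels:   voxel indices predicted as lung by the first network
    lesion_voxels: voxel indices predicted as lesion by the second network
    """
    lesion = set(lesion_voxels)
    # Definitive lung mask: union of the two outputs, so that lesion voxels
    # missed by the lung network still count as lung.
    lung = set(lung_voxels) | lesion
    if not lung:
        raise ValueError("empty lung mask")
    return 100.0 * len(lesion) / len(lung)
```

Because the lesion mask is always contained in the union, P computed this way cannot exceed 100%.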

Training details and evaluation strategy for the U-nets

Both U-net$_1$ and U-net$_2$ have been evaluated using the volumetric Dice Similarity Coefficient (vDSC). U-net$_1$ has been trained with the vDSC as loss function, while U-net$_2$ has been trained using the sum of the vDSC and a weighted cross-entropy as error function in order to balance the number of voxels representing lesions and the background (see Supplementary Materials for further details). The performance of the whole system has also been evaluated with the surface Dice Similarity Coefficient (sDSC) for different values of tolerance [9].
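For reference, the volumetric Dice coefficient between a predicted mask A and a reference mask B is 2|A ∩ B| / (|A| + |B|). A minimal sketch on binary masks represented as sets of voxel indices follows; the convention of returning 1 for two empty masks is an assumption, not taken from the paper.

```python
def volumetric_dice(predicted, reference):
    """Volumetric Dice Similarity Coefficient between two binary masks."""
    predicted, reference = set(predicted), set(reference)
    if not predicted and not reference:
        return 1.0  # assumed convention: two empty masks agree perfectly
    overlap = len(predicted & reference)
    return 2.0 * overlap / (len(predicted) + len(reference))
```

The surface variant (sDSC) applies the same idea to mask boundaries within a distance tolerance and is not covered by this sketch.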

Cross-validation strategy

To train, validate and test the performance of the two U-nets, we partitioned the datasets into training, validation and test sets. We then evaluated the network performance separately and globally. U-net$_2$ has been trained twice, i.e. on 60% and on 90% of the CT scans of the COVID-19 Challenge and MosMed datasets, to investigate the effect of maximizing the training set size on the lesion segmentation. The number of CT scans used for the train, validation and test sets for each U-net is reported in Table 2. To evaluate the ability of the trained networks to predict the percentage of the affected lung parenchyma and thus the CT-SS classification, we used a completely independent set consisting of 10 CT scans from the COVID-19-CT-Seg dataset, which is the only publicly available dataset containing both lung and infection mask annotations.

Table 2

Number of CT scans assigned to the train, validation (val) and test sets used during the training and performance assessment of the U-net$_1$ and the U-net$_2$ networks

U-net$_1$	Train	Val	Test
Plethora	319	40	40
MosMed (91 CT-0)	55	18	18
LCTSC	36	12	12
COVID-19-CT-Seg	–	–	10

Results

In this section, we report first the performance achieved by U-net$_1$ and U-net$_2$, and then the quantification performance of the integrated LungQuant system, evaluated on a completely independent test set. We trained both U-nets for 300 epochs on an NVIDIA V100 GPU using Adam as optimizer, and for each network we kept the model from the epoch with the best evaluation metric on the validation set.

U-net$_1$: Lung segmentation performance

U-net$_1$ for lung segmentation was trained and validated using three different datasets, as specified in Table 2. Then, we tested U-net$_1$ on each of the three independent test sets and we reported in Table 3 the performance achieved in terms of vDSC, computed between the segmented masks and the reference ones, both separately for each dataset and globally.

Table 3

Performances achieved by U-net$_1$ in lung segmentation on different test sets, evaluated in terms of the vDSC at three successive stages of the segmentation procedure

Test set	Masks at U-net size (vDSC)	Masks before refinement (vDSC)	Masks after refinement (vDSC)
Plethora	0.96 ± 0.02	0.95 ± 0.02	0.95 ± 0.04
MosMed	0.97 ± 0.02	0.97 ± 0.02	0.97 ± 0.02
LCTSC	0.96 ± 0.03	0.95 ± 0.03	0.96 ± 0.01
COVID-19-CT-Seg	0.96 ± 0.01	0.95 ± 0.01	0.95 ± 0.01

The lung segmentation performance was evaluated in three cases: (1) on CT scans and masks resized to the U-net input size; (2) on CT scans and masks in the original size, before the morphological refinement; (3) on CT scans and masks in the original size, after the morphological refinement. As shown in Table 3, the segmentation refinement has only a small effect on the vDSC, since this is a volume-based metric. It is nevertheless a fundamental step, because it allows the definition of precise bounding boxes enclosing the lungs and thus facilitates the U-net$_2$ learning process.
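The refinement by connected components can be sketched with a breadth-first search over the mask voxels. The example below is illustrative only: it assumes 6-connectivity (face-adjacent voxels) and takes the size threshold as a parameter, since the empirically fixed value is given only in the Supplementary Materials. Function names are hypothetical.

```python
from collections import deque

# Offsets of the six face-adjacent neighbours (6-connectivity is an assumption).
NEIGHBOURS = ((1, 0, 0), (-1, 0, 0), (0, 1, 0), (0, -1, 0), (0, 0, 1), (0, 0, -1))

def connected_components(voxels):
    """Split a set of (z, y, x) voxel indices into connected components."""
    remaining = set(voxels)
    components = []
    while remaining:
        seed = remaining.pop()
        component = {seed}
        queue = deque([seed])
        while queue:
            z, y, x = queue.popleft()
            for dz, dy, dx in NEIGHBOURS:
                neighbour = (z + dz, y + dy, x + dx)
                if neighbour in remaining:
                    remaining.remove(neighbour)
                    component.add(neighbour)
                    queue.append(neighbour)
        components.append(component)
    return components

def remove_small_components(voxels, min_voxels):
    """Keep only the components with at least `min_voxels` voxels."""
    kept = set()
    for component in connected_components(voxels):
        if len(component) >= min_voxels:
            kept |= component
    return kept
```

Removing isolated specks matters more for the bounding box than for the vDSC: a single stray voxel far from the lungs barely changes the overlap but can enlarge the box considerably.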

U-net$_2$: COVID-19 lesion segmentation performance

U-net$_2$ for COVID-19 lesion segmentation has been trained and evaluated separately on the COVID-19 Challenge dataset and on the annotated subset of the MosMed dataset, following the train/validation/test partitioning reported in Table 2. The segmentation performances achieved on the test sets are reported in terms of the vDSC in Table 4, according to the cross-sample validation scheme.

Table 4

Performances achieved by U-net$_2$ in COVID-19 lesion segmentation, evaluated in terms of the vDSC

U-net	Trained on	Test set	U-net size (vDSC)	Original CT size (vDSC)
U-net$_2^{60\%}$	COVID-19 Challenge	COVID-19 Challenge	0.51 ± 0.24	0.51 ± 0.25
U-net$_2^{60\%}$	COVID-19 Challenge	MosMed	0.39 ± 0.19	0.40 ± 0.19
U-net$_2^{60\%}$	MosMed	MosMed	0.54 ± 0.22	0.55 ± 0.22
U-net$_2^{60\%}$	MosMed	COVID-19 Challenge	0.25 ± 0.23	0.25 ± 0.23
U-net$_2^{60\%}$	COVID-19 Challenge + MosMed	COVID-19 Challenge + MosMed	0.49 ± 0.21	0.50 ± 0.21
U-net$_2^{90\%}$	COVID-19 Challenge + MosMed	COVID-19 Challenge + MosMed	0.64 ± 0.23	0.65 ± 0.23

The composition of the train and test sets is reported in Table 2

As expected, the U-net$_2$ performance is higher when the training and test sets belong to the same data cohort. By contrast, when a U-net$_2$ is trained on COVID-19 Challenge data and tested on MosMed (or the other way around), performance decreases markedly. This effect is related to the different criteria used to collect and annotate the data. Of the two cross-dataset configurations, the U-net$_2$ trained on the COVID-19 Challenge dataset and tested on the MosMed test set gave the better result (vDSC of 0.40 ± 0.19 versus 0.25 ± 0.23), since the network was trained on a larger data sample and hence has a higher generalization capability. The best segmentation performance was obtained by the U-net$_2$ trained using 90% of the available data, U-net$_2^{90\%}$, which reaches a vDSC of 0.65 ± 0.23 on the test set. This result suggests the need to train U-net models on the largest possible data samples in order to achieve higher segmentation performance.

Evaluation of the quantification performance of the LungQuant system on a completely independent set

Evaluation of lung and COVID-19 lesion segmentations

Once the two U-nets had been trained and the whole analysis pipeline had been integrated into the LungQuant system, we tested it on a completely independent set of CT scans (the COVID-19-CT-Seg dataset). The performance of the whole process was quantified both in terms of vDSC and of sDSC with tolerance values of 1, 5 and 10 mm (Table 5). A very good overlap between the predicted and reference lung masks is observable in terms of vDSC, whereas the sDSC values are highly dependent on the tolerance, ranging from moderate to very good agreement. Regarding the lesion masks, a moderate overlap is measured between the predicted and reference masks in terms of vDSC, whereas the sDSC depends strongly on the tolerance, going from limited to moderately good and ultimately satisfactory performance for tolerance values of 1 mm, 5 mm and 10 mm, respectively. Figure 3 allows for a visual comparison between the lung and lesion masks provided by the LungQuant system integrating U-net$_2$ and the reference ones.

Table 5

Performances of the LungQuant system on the independent COVID-19-CT-Seg test dataset. The vDSC and sDSC computed between the lung and lesion reference masks and those predicted by the LungQuant system are reported

Metrics	vDSC	sDSC (1 mm)	sDSC (5 mm)	sDSC (10 mm)
Lung segmentation
LungQuant (U-net$_2^{60\%}$)	0.96 ± 0.01	0.66 ± 0.09	0.95 ± 0.02	0.98 ± 0.01
LungQuant (U-net$_2^{90\%}$)	0.95 ± 0.01	0.65 ± 0.09	0.95 ± 0.02	0.98 ± 0.01
Infection segmentation
LungQuant (U-net$_2^{60\%}$)	0.62 ± 0.09	0.29 ± 0.06	0.75 ± 0.11	0.90 ± 0.09
LungQuant (U-net$_2^{90\%}$)	0.66 ± 0.13	0.36 ± 0.13	0.76 ± 0.18	0.87 ± 0.13

Fig. 3

On the rows: three axial slices of the first CT scan of the COVID-19-CT-Seg test dataset (coronacases001.nii) are shown. On the columns: original images (left); overlays between the predicted and the reference lung (centre) and COVID-19 lesion (right) masks. The reference masks are in green, while the predicted ones, obtained by the LungQuant system integrating U-net$_2$, are in blue

Percentage of affected lung volume and CT-SS estimation

The lung and lesion masks provided by the LungQuant system can be further processed to derive the physical volume of each mask and the ratio between the lesion and lung volumes. We show in Fig. 4 the relationship between the percentage of lung involvement predicted by the LungQuant system and the corresponding values for the reference masks of the fully independent test set COVID-19-CT-Seg, for the LungQuant system with both U-net$_2^{60\%}$ and U-net$_2^{90\%}$. Despite the limited number of samples available for this test, an agreement between the LungQuant system output and the reference values is observed for both U-net$_2^{60\%}$ and U-net$_2^{90\%}$. In terms of the mean absolute error (MAE) between the estimated and the reference percentages of affected lung volume (P), we obtained MAE = 4.6% for the LungQuant system with U-net$_2^{60\%}$ and MAE = 4.2% for the system with U-net$_2^{90\%}$.
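The mean absolute error used here is the average of the absolute differences between estimated and reference percentages. A minimal sketch follows; the function name is hypothetical and the numbers in the usage line are made up for illustration, not taken from the study.

```python
def mean_absolute_error(estimated, reference):
    """Mean absolute error between paired lists of percentages P."""
    if len(estimated) != len(reference) or not estimated:
        raise ValueError("inputs must be non-empty and of equal length")
    total = sum(abs(e - r) for e, r in zip(estimated, reference))
    return total / len(estimated)

# Illustrative values only (percentage points of affected lung volume).
example_mae = mean_absolute_error([10.0, 20.0, 30.0], [12.0, 17.0, 30.0])
```

Since P is itself a percentage, the MAE is expressed in percentage points of lung volume.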

Fig. 4

Estimated percentages P of affected lung volume versus the ground truth percentages, as obtained by the LungQuant system integrating U-net$_2^{60\%}$ (left) and U-net$_2^{90\%}$ (right). The grey areas in the plot backgrounds guide the eye to recognize the CT-SS values assigned to each value of P (from left to right: CT-SS 1, CT-SS 2, CT-SS 3)

The accuracy in assigning the correct CT-SS class is reported in Table 6, together with the number of misclassified cases, for the 10 cases of the COVID-19-CT-Seg dataset. The best accuracy achieved by LungQuant is 90%, obtained with U-net$_2^{90\%}$. In all cases, the system misclassifies the examples by 1 class at most.
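The figures reported for the CT-SS assignment (accuracy, cases off by one class, cases off by two classes) can be computed from paired class labels as in the following sketch. The function name and the example labels are hypothetical and do not reproduce the study data.

```python
from collections import Counter

def ctss_agreement(predicted, reference):
    """Accuracy and number of cases misclassified by 1 or 2 classes."""
    if len(predicted) != len(reference) or not reference:
        raise ValueError("inputs must be non-empty and of equal length")
    # Histogram of absolute class distances between prediction and reference.
    distances = Counter(abs(p - r) for p, r in zip(predicted, reference))
    return {
        "accuracy": distances[0] / len(reference),
        "off_by_1": distances[1],
        "off_by_2": distances[2],
    }

# Illustrative labels only: one of five cases is off by one class.
example = ctss_agreement([1, 2, 2, 3, 1], [1, 2, 3, 3, 1])
```

Reporting the class distance alongside the accuracy distinguishes near misses at a class boundary from gross errors.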

Table 6

Classification performances of the whole system in predicting CT Severity Score on MosMed and COVID-19-CT-Seg datasets. The number of misclassified cases is reported

U-net	Dataset	Accuracy	Misclassified by 1 class	Misclassified by 2 classes
U-net$_2^{60\%}$	COVID-19-CT-Seg	6/10	4/10	0
U-net$_2^{90\%}$	COVID-19-CT-Seg	9/10	1/10	0

Discussion and Conclusion

We developed a fully automated quantification pipeline, the LungQuant system, for the identification and segmentation of lungs and pulmonary lesions related to COVID-19 pneumonia in CT scans. The system returns the COVID-19 related lesions, the lung mask and the ratio between their volumes, which is converted into a CT Severity Score. The performance obtained against a voxel-wise segmentation ground truth was evaluated in terms of the vDSC, which provides a measure of the overlap between the predicted and the reference masks. The LungQuant system achieved a vDSC of 0.95 ± 0.01 in the lung segmentation task and of 0.66 ± 0.13 in segmenting the COVID-19 related lesions on the fully annotated publicly available benchmark COVID-19-CT-Seg dataset of 10 CT scans. The LungQuant system has also been evaluated in terms of sDSC for different values of tolerance. The results obtained at a tolerance of 5 mm (0.95 ± 0.02 for the lungs and 0.76 ± 0.18 for the lesions, Table 5) are satisfactory for our purpose, given the heterogeneity of the labelling process. Regarding the correct assignment of the CT-SS, the LungQuant system showed an accuracy of 90% on the completely independent test set COVID-19-CT-Seg. Although this result is encouraging, it was obtained on a rather small independent test set; thus, a broader validation on a larger data sample with a more heterogeneous composition in terms of disease severity is required. Training DL algorithms requires a huge amount of labelled data. The lung segmentation task has been made feasible in this work thanks to the use of lung CT datasets collected for purposes other than the study of COVID-19 pneumonia. As a consequence of training on these samples, when the trained network processes the CT scan of a patient with COVID-19 lesions, the lung segmentation is no longer accurate, especially when ground-glass opacities and consolidations are very severe.
To overcome this problem, the proposed LungQuant system returns a lung mask which is the logical union between the output mask of U-net$_1$ and the infection mask generated by U-net$_2$. The LungQuant system could be improved if lung mask annotations were available for subjects with COVID-19 lesions. A similar problem occurs for the segmentation of ground-glass opacities and consolidations. The available data are, in fact, very unbalanced with respect to the severity of COVID-19 disease, and hence the accuracy in segmenting the most severe cases is worse. The current lack of a large dataset, collected so as to adequately represent all categories of disease severity, limits the possibility of training AI-based models accurately. Finally, we found that the difference in the annotation and collection guidelines among datasets is an issue. Processing aggregated data from different sources can be difficult if labelling has been performed using different guidelines. CT scans should include the acquisition parameters, usually stored in the DICOM header, when they are published. The lack of this information is a drawback of our study. With those data, we could study in more detail the dependence of the LungQuant performance on specific acquisition protocols or scanners. However, even with this information, it would not be possible to standardize the different annotation styles. The results of LungQuant (last 2 rows of Table 4) demonstrate its robustness across different datasets even without a dedicated preprocessing step to account for this information.

References

1. Toward data-efficient learning: A benchmark for COVID-19 CT lung and infection segmentation.

Authors: Jun Ma; Yixin Wang; Xingle An; Cheng Ge; Ziqi Yu; Jianan Chen; Qiongjie Zhu; Guoqiang Dong; Jian He; Zhiqiang He; Tianjia Cao; Yuntao Zhu; Ziwei Nie; Xiaoping Yang
Journal: Med Phys Date: 2020-12-23 Impact factor: 4.071

2. Multi-Organ Segmentation Over Partially Labeled Datasets With Multi-Scale Feature Abstraction.

Authors: Xi Fang; Pingkun Yan
Journal: IEEE Trans Med Imaging Date: 2020-10-28 Impact factor: 10.048

3. Relational Modeling for Robust and Efficient Pulmonary Lobe Segmentation in CT Scans.

Authors: Weiyi Xie; Colin Jacobs; Jean-Paul Charbonnier; Bram van Ginneken
Journal: IEEE Trans Med Imaging Date: 2020-08 Impact factor: 10.048

4. Harmonization of cortical thickness measurements across scanners and sites.

Authors: Jean-Philippe Fortin; Nicholas Cullen; Yvette I Sheline; Warren D Taylor; Irem Aselcioglu; Philip A Cook; Phil Adams; Crystal Cooper; Maurizio Fava; Patrick J McGrath; Melvin McInnis; Mary L Phillips; Madhukar H Trivedi; Myrna M Weissman; Russell T Shinohara
Journal: Neuroimage Date: 2017-11-17 Impact factor: 6.556

5. Automatic lung segmentation in routine imaging is primarily a data diversity problem, not a methodology problem.

Authors: Johannes Hofmanninger; Forian Prayer; Jeanny Pan; Sebastian Röhrich; Helmut Prosch; Georg Langs
Journal: Eur Radiol Exp Date: 2020-08-20

6. Automated Assessment of CO-RADS and Chest CT Severity Scores in Patients with Suspected COVID-19 Using Artificial Intelligence.

Authors: Nikolas Lessmann; Clara I Sánchez; Ludo Beenen; Luuk H Boulogne; Monique Brink; Erdi Calli; Jean-Paul Charbonnier; Ton Dofferhoff; Wouter M van Everdingen; Paul K Gerke; Bram Geurts; Hester A Gietema; Miriam Groeneveld; Louis van Harten; Nils Hendrix; Ward Hendrix; Henkjan J Huisman; Ivana Išgum; Colin Jacobs; Ruben Kluge; Michel Kok; Jasenko Krdzalic; Bianca Lassen-Schmidt; Kicky van Leeuwen; James Meakin; Mike Overkamp; Tjalco van Rees Vellinga; Eva M van Rikxoort; Riccardo Samperna; Cornelia Schaefer-Prokop; Steven Schalekamp; Ernst Th Scholten; Cheryl Sital; Lauran Stöger; Jonas Teuwen; Kiran Vaidhya Venkadesh; Coen de Vente; Marieke Vermaat; Weiyi Xie; Bram de Wilde; Mathias Prokop; Bram van Ginneken
Journal: Radiology Date: 2020-07-30 Impact factor: 11.105

7. Chest CT features of coronavirus disease 2019 (COVID-19) pneumonia: key points for radiologists.

Authors: Marina Carotti; Fausto Salaffi; Piercarlo Sarzi-Puttini; Andrea Agostini; Alessandra Borgheresi; Davide Minorati; Massimo Galli; Daniela Marotto; Andrea Giovagnoni
Journal: Radiol Med Date: 2020-06-04 Impact factor: 3.469

8. Association of AI quantified COVID-19 chest CT and patient outcome.

Authors: Xi Fang; Uwe Kruger; Fatemeh Homayounieh; Hanqing Chao; Jiajin Zhang; Subba R Digumarthy; Chiara D Arru; Mannudeep K Kalra; Pingkun Yan
Journal: Int J Comput Assist Radiol Surg Date: 2021-01-23 Impact factor: 3.421

9. Novel Autosegmentation Spatial Similarity Metrics Capture the Time Required to Correct Segmentations Better Than Traditional Metrics in a Thoracic Cavity Segmentation Workflow.

Authors: Kendall J Kiser; Arko Barman; Sonja Stieb; Clifton D Fuller; Luca Giancardo
Journal: J Digit Imaging Date: 2021-05-23 Impact factor: 4.056

10. PleThora: Pleural effusion and thoracic cavity segmentations in diseased lungs for benchmarking chest CT processing pipelines.

Authors: Kendall J Kiser; Sara Ahmed; Sonja Stieb; Abdallah S R Mohamed; Hesham Elhalawani; Peter Y S Park; Nathan S Doyle; Brandon J Wang; Arko Barman; Zhao Li; W Jim Zheng; Clifton D Fuller; Luca Giancardo
Journal: Med Phys Date: 2020-08-28 Impact factor: 4.071
