Mohit Agarwal1, Sushant Agarwal2, Luca Saba3, Gian Luca Chabert3, Suneet Gupta1, Alessandro Carriero3, Alessio Pasche4, Pietro Danna4, Armin Mehmedovic5, Gavino Faa6, Saurabh Shrivastava7, Kanishka Jain7, Harsh Jain7, Tanay Jujaray8, Inder M Singh9, Monika Turk10, Paramjit S Chadha9, Amer M Johri11, Narendra N Khanna12, Sophie Mavrogeni13, John R Laird14, David W Sobel15, Martin Miner16, Antonella Balestrieri3, Petros P Sfikakis17, George Tsoulfas18, Durga Prasanna Misra19, Vikas Agarwal19, George D Kitas20, Jagjit S Teji21, Mustafa Al-Maini22, Surinder K Dhanjil9, Andrew Nicolaides23, Aditya Sharma24, Vijay Rathore9, Mostafa Fatemi25, Azra Alizad26, Pudukode R Krishnan27, Rajanikant R Yadav28, Frence Nagy29, Zsigmond Tamás Kincses29, Zoltan Ruzsa30, Subbaram Naidu31, Klaudija Viskovic5, Manudeep K Kalra32, Jasjit S Suri33. 1. Department of Computer Science Engineering, Bennett University, India. 2. Department of Computer Science Engineering, PSIT, Kanpur, India; Advanced Knowledge Engineering Centre, Global Biomedical Technologies, Inc., Roseville, CA 95661, USA. 3. Department of Radiology, Azienda Ospedaliero Universitaria (A.O.U.), Cagliari, Italy. 4. Department of Radiology, "Maggiore della Carità" Hospital, University of Piemonte Orientale, Via Solaroli 17, 28100, Novara, Italy. 5. University Hospital for Infectious Diseases, Zagreb, Croatia. 6. Department of Pathology - AOU of Cagliari, Italy. 7. College of Computing Sciences and IT, Teerthanker Mahaveer University, Moradabad, 244001, India. 8. Dept of Molecular, Cell and Developmental Biology, University of California, Santa Cruz, CA, USA. 9. AtheroPoint LLC, CA, USA. 10. The Hanse-Wissenschaftskolleg Institute for Advanced Study, Delmenhorst, Germany. 11. Division of Cardiology, Queen's University, Kingston, Ontario, Canada. 12. Department of Cardiology, Indraprastha APOLLO Hospitals, New Delhi, India. 13. Cardiology Clinic, Onassis Cardiac Surgery Center, Athens, Greece. 14. Heart and Vascular Institute, Adventist Health St. Helena, St Helena, CA, USA. 15. Minimally Invasive Urology Institute, Brown University, Providence, RI, USA. 16. Men's Health Center, Miriam Hospital Providence, Rhode Island, USA. 17. Rheumatology Unit, National Kapodistrian University of Athens, Greece. 18. Aristoteleion University of Thessaloniki, Thessaloniki, Greece. 19. Dept. of Immunology, SGPIMS, Lucknow, UP, India. 20. Academic Affairs, Dudley Group NHS Foundation Trust, Dudley, UK; Arthritis Research UK Epidemiology Unit, Manchester University, Manchester, UK. 21. Ann and Robert H. Lurie Children's Hospital of Chicago, Chicago, USA. 22. Allergy, Clinical Immunology and Rheumatology Institute, Toronto, Canada. 23. Vascular Screening and Diagnostic Centre and Univ. of Nicosia Medical School, Cyprus. 24. Division of Cardiovascular Medicine, University of Virginia, Charlottesville, VA, USA. 25. Dept. of Physiology & Biomedical Engg., Mayo Clinic College of Medicine and Science, MN, USA. 26. Dept. of Radiology, Mayo Clinic College of Medicine and Science, MN, USA. 27. Neurology Department, Fortis Hospital, Bangalore, India. 28. Radiodiagnosis, SGPIMS, Lucknow, Uttar Pradesh, India. 29. Department of Radiology, University of Szeged, 6725, Hungary. 30. Invasive Cardiology Division, University of Szeged, Budapest, Hungary. 31. Electrical Engineering Department, University of Minnesota, Duluth, MN, USA. 32. Department of Radiology, Massachusetts General Hospital, Boston, MA, USA. 33. College of Computing Sciences and IT, Teerthanker Mahaveer University, Moradabad, 244001, India; Stroke Diagnostic and Monitoring Division, AtheroPoint™, Roseville, CA, USA. Electronic address: jasjit.suri@atheropoint.com.
Abstract
BACKGROUND: COVLIAS 1.0, an automated lung segmentation system, was designed for COVID-19 diagnosis, but it has issues related to storage space and speed. This study shows that COVLIAS 2.0 uses pruned AI (PAI) networks to improve both storage and speed while maintaining high performance on lung segmentation and lesion localization. METHODOLOGY: The proposed study uses a multicenter set of ∼9,000 CT slices from two different nations, namely, CroMed from Croatia (80 patients, experimental data) and NovMed from Italy (72 patients, validation data). We hypothesize that by using pruning and evolutionary optimization algorithms, the size of the AI models can be reduced significantly while ensuring optimal performance. Four pruning techniques, (i) differential evolution (DE), (ii) genetic algorithm (GA), (iii) particle swarm optimization (PSO), and (iv) whale optimization (WO), were designed in two deep learning frameworks, (i) fully convolutional network (FCN) and (ii) SegNet, yielding eight pruned AI models. COVLIAS 2.0 was validated using "Unseen NovMed" and benchmarked against MedSeg. Statistical tests for stability and reliability were also conducted. RESULTS: The pruned models (i) FCN-DE, (ii) FCN-WO, (iii) FCN-GA, and (iv) FCN-PSO showed improvement in storage by 92.4%, 95.3%, 98.7%, and 99.8%, respectively, when compared against solo FCN, and (v) SegNet-DE, (vi) SegNet-GA, (vii) SegNet-WO, and (viii) SegNet-PSO showed improvement by 97.1%, 97.9%, 98.8%, and 99.2%, respectively, when compared against solo SegNet. The AUC was >0.94 (p < 0.0001) on CroMed and >0.86 (p < 0.0001) on the NovMed data set for all eight EA models. PAI processing time was <0.25 s per image. DenseNet-121-based Grad-CAM heatmaps showed validation on ground-glass opacity lesions. CONCLUSIONS: The eight successfully validated PAI networks are five times faster and storage efficient and could be used in clinical settings.
Abbreviations: ACC: Accuracy; AI: Artificial Intelligence; ARDS: Acute Respiratory Distress Syndrome; BA: Bland-Altman; CC: Correlation Coefficient; CE: Cross-Entropy; COVID: Coronavirus disease; COVLIAS: COVID Lung Image Analysis System; CroMed: Croatian data set; CT: Computed Tomography; DE: Differential Evolution; DL: Deep Learning; DS: Dice Similarity; EA: Evolutionary Algorithm; FoM: Figure of Merit; GA: Genetic Algorithm; GGO: Ground-Glass Opacities; GT: Ground Truth; HDL: Hybrid Deep Learning; HU: Hounsfield Units; JI: Jaccard Index; NovMed: Italian data set; NIH: National Institute of Health; PAI: Pruned AI; POAI: Pruned and Optimized AI; PSO: Particle Swarm Optimization; RT-PCR: Reverse Transcription-Polymerase Chain Reaction; SDL: Solo Deep Learning; SegNet: Segmentation Network; VGG: Visual Geometric Group; WHO: World Health Organization; WO: Whale Optimization; μ: Mean; σ: Standard Deviation; TP: True Positive; TN: True Negative; FP: False Positive; FN: False Negative.
The remaining mathematical symbols (cross-entropy loss, classifier probability, gold standard labels, optimized fitness function, original and compressed hidden neurons, mean intersection over union accuracy, fitness weight factors, the EA index m, per-model DS/JI/CC terms and their differences against MedSeg, mean areas, image counts, and Figure-of-Merit terms) are defined where Eqs. (1)-(10) are introduced in the text.
Introduction
COVID-19, the disease caused by the novel coronavirus, was proclaimed a global pandemic by the World Health Organization (WHO) on March 11, 2020 [1] and has developed rapidly against limited hospital resources worldwide. As of February 15, 2022, it had infected more than 410 million people and killed almost 5.8 million [2] worldwide. COVID-19 molecular pathways [3,4] are worse in people with comorbidities such as coronary artery disease [3,5,6], diabetes [7], atherosclerosis [8], and fetal programming [9]. Further, it also damages the vasa vasorum of the aorta, causing microthrombosis and atherosclerotic plaque vulnerability [10,11]. It has also caused architectural deformation because of interactions between alveolar and vascular modifications [12] and affects daily activities such as nutrition [13]. Pathology revealed that vaccine-induced immune thrombotic thrombocytopenia (VITT) can be generated even after vaccine inoculation (ChAdOx1 nCoV-19) [14]. Adults who were born small, a condition known as intrauterine growth restriction (IUGR), are also more likely to be infected with COVID-19 [9]. In the absence of adequate immunization or therapy, early and accurate detection of COVID-19 is critical and in need of immediate attention. Due to the lower specificity of reverse transcription-polymerase chain reaction (RT-PCR) tests [[15], [16], [17]], research has become more inclined toward image-based analysis for detecting and diagnosing COVID-19.

With the advancement of artificial intelligence (AI) technology, machine learning (ML) and deep learning (DL) techniques have been widely adopted for pneumonia detection and classification [[18], [19], [20], [21]]. However, these technologies pose a challenge when it comes to real-time analysis [22]. An advanced stage of DL is hybrid DL (HDL) [[23], [24], [25], [26], [27], [28], [29], [30]]. Even though DL/HDL models offer higher accuracy and performance, the cost associated with these models is high, since they take time to train and predict results. One way to solve the processing problem is to use GPUs or supercomputers [29,[31], [32], [33]]. However, they are expensive and difficult to maintain in the long run.

LeCun et al. [34], in their paper "Optimal Brain Damage," published in 1989, were the first to bring the concept of pruning to the area of deep learning. Pruning refers to trimming or cutting away excess from a model or search space to remove redundant or insignificant sections [35]. This pruning concept was extended to improve storage [36,37] and to speed up model training by choosing the correct/appropriate hyperparameters [38,39]. Pruning methods have mostly been applied to X-ray non-COVID and COVID-19 imaging [40]. As a result, the proposed study hypothesizes that such pruning methods can significantly enhance storage and processing time for computed tomography (CT) lung segmentation and lesion localization while requiring no additional hardware or maintenance.

The proposed study is the first to introduce eight DL-based innovative pruning models based on evolutionary algorithms (EA). This is our prime contribution and innovation in COVID-19-based CT lung segmentation. The four evolutionary pruning techniques for the novel pruned AI (PAI) models are (i) differential evolution (DE), (ii) genetic algorithm (GA), (iii) particle swarm optimization (PSO), and (iv) whale optimization (WO), with FCN and SegNet as the base DL infrastructures.
As a result, the following eight pruning models emerge: (i) FCN-DE, (ii) FCN-GA, (iii) FCN-PSO, and (iv) FCN-WO when using solo FCN, and (v) SegNet-DE, (vi) SegNet-GA, (vii) SegNet-PSO, and (viii) SegNet-WO when using solo SegNet. To scientifically validate the aforesaid innovation and contributions, we used a multicenter paradigm in an unseen framework. This requires training our DL-based EA models on the high-GGO cohort from Croatia (CroMed) and predicting the segmented lungs using the data from Novara, Italy (NovMed). Note that these "Unseen NovMed" data were acquired using different CT machines and patients from different geographies and ethnicities, resulting in the Unseen AI paradigm. We consider this the second innovation of this study. To assert the power of our innovation, we designed a unique pipeline to accurately localize the COVID-19 lesions in the segmented lungs, which are taken from PAI and converted into a classification framework by fusing DenseNet-121 and Grad-CAM to generate powerful visual color heatmaps. This brings us to our third and most distinctive innovation: evolutionary-based lung segmentation with lesion localization. To authenticate our contributions, we benchmarked our evolutionary-based COVLIAS 2.0 against MedSeg [41], a web-based lung segmentation tool, as the first attempt in a low-storage, high-speed infrastructure. This is the fourth major contribution we have made. All our data analysis is comprehensive and adapted to 360-degree assessments, consisting of (i) Dice Similarity (DS), (ii) Jaccard Index (JI) [42], (iii) Bland-Altman (BA) plots [43,44], (iv) Correlation Coefficient (CC) plots [45,46], (v) lung area error (AE), and (vi) receiver operating characteristic (ROC) curves. We consider this the fifth contribution of this study. Finally, we put COVLIAS 2.0 through clinical and statistical testing to prove the reliability and stability of our hypothesis, which is the sixth and final contribution.

The COVLIAS 2.0 system pipeline is presented in Fig. 1, which demonstrates a universally applicable AI system for COVID-19-based lung segmentation that uses pruned and optimized AI (POAI) models for faster processing and less storage. It consists of volume acquisition, online lung segmentation, and benchmarking against MedSeg, as well as performance evaluation using the "Unseen NovMed" data set.
Fig. 1
COVLIAS 2.0 using the pruned AI system for segmentation of CT lungs. The benchmarking is conducted using MedSeg. The performance is evaluated using manually delineated lung borders traced by a radiologist. The scientific validation was conducted using the Unseen NovMed data set.
The rest of this study is organized into seven sections. The background literature is presented in Section 2, and the methodology, demographics, image acquisition, and a brief explanation of the AI models employed in this work are presented in Section 3. The findings of the models, as well as their performance evaluation, are presented in Section 4. The validation and statistical analysis on a different data set and against MedSeg are shown in Section 5. Section 6 deals with discussion and benchmarking, and Section 7 concludes the study.
Literature survey
The art of segmentation in medical imaging has existed for several years [[47], [48], [49]], but it has recently been steered into an AI framework. Further, segmentation has caused problems in tissue characterization and classification in the early diagnosis of disease; hence AI has begun to dominate that framework [19,20]. It started with ML, which progressed to point-based models for problems related to diabetes [50,51], neonatal and infant mortality [52], and gene classification and analysis [53]. Moving from point-based to image-based machine learning for classification, several innovations have emerged that apply to lung segmentation [54], carotid plaque classification [[55], [56], [57], [58], [59]], thyroid [60], liver [61], stroke [62], coronary [63], ovarian [64], prostate [65], skin cancer [[66], [67], [68]], Wilson disease [32], and ophthalmology [69].

CT and X-ray play the most critical role in medical imaging for diagnosing COVID-19. CT has demonstrated high sensitivity and repeatability in the general diagnosis of COVID-19 and in body imaging [62,[70], [71], [72]], along with the ability to detect various types of opacities, such as ground-glass opacity (GGO), consolidation, crazy paving, and other opacities [[73], [74], [75]] that are mainly seen in COVID-19 patients [[76], [77], [78], [79], [80], [81]].

Jiang et al. presented COVID-19-based CT lung classification where the authors synthetically generated 1,186 CT images using CycleGAN [82]. The raw input lung cancer CT images were processed by the network taken from LUNA16 [83]. Each image patch had a size of 512 × 512 pixels. A total of five AI models, namely, VGG16, ResNet-50, Inception ResNet_v2, Inception_v3, and DenseNet-169, were used to train on the synthetic data. The authors showed that DenseNet-169 was the best-performing model with an accuracy of 98.92%, while VGG16, ResNet-50, Inception ResNet_v2, and Inception_v3 had accuracies of 94.19%, 94.83%, 96.55%, and 95.91%, respectively. There was no application of PAI in their research.

By analyzing CT scan images, Kogilavani et al. [84] demonstrated the classification of CT images between COVID-19 and non-COVID-19. The authors used six deep learning architectures, namely, VGG16 [85], DenseNet [86], MobileNet [87], Xception [88], NASNet [89], and EfficientNet [90]. All the models were trained on 3,873 CT images with an image resolution of 224 × 224. While the authors used metrics like precision, recall, and F1-score for performance evaluation, the best-performing model was VGG16 with an accuracy of 97.68%. There was no application of PAI in their system design.

Paluru et al. [91] proposed a combination of UNet and ENet, the so-called AnamNet. It was designed to segment COVID-19-based lesions in segmented CT images of the lungs. The model was trained on a data set of 69 patients [92], in which the segmented lung was the input image to the AnamNet. The model was compared to ENet [93], UNet++ [94], SegNet, and LEDNet [95]. The pre-processing did not use any augmentation, and the dice similarity for lesion detection was 0.77. While the authors did not use PAI, they deployed AnamNet on an edge device using an Android application. The authors did not compare lung area errors or create JI or BA plots.

Cai et al. [96] used the UNet model for lung and lesion segmentation, which resulted in a dice similarity of 0.77 using a ten-fold CV protocol with a database of 250 images taken from 99 patients.
They also proposed a method for forecasting the length of an intensive care unit (ICU) stay based on the results of the lesion segmentation. The authors did not compare lung area errors or create JI or BA plots.

Saood et al. [97] presented a COVID-19-based CT lung image segmentation system that used 100 COVID-19 lung CT images downscaled to 256 × 256. The authors evaluated the findings of two models, UNet and SegNet, and found they had similar DS scores of 0.73 and 0.74, respectively. The authors did not compare lung area errors or create JI or BA plots, and further, they did not apply PAI.

Our proposed study offers (i) eight pruned AI models demonstrating high speed and low storage, (ii) training of our pruning models using the CroMed data set and unseen analysis using the Unseen NovMed data set, (iii) a classification framework to assert the COVID-19 lesions, (iv) MedSeg benchmarking of COVLIAS 2.0 to authenticate the previous contributions, and finally, (v) performance evaluation and statistical analysis to further provide clinical evidence for our hypothesis.
Methodology
Patient demographics
CroMed data set
The proposed study makes use of two distinct cohorts from separate nations. The first data set, also known as the experimental data set, consists of 80 CroMed COVID-19 positive individuals, of which 57 were male and the rest female. An RT-PCR test was conducted to confirm COVID-19 positivity in the selected cohort. The average value of ground-glass opacity (GGO), consolidation, and other opacities was ∼4. Of the 80 CroMed patients who participated in this study, 83% had a cough, 60% had dyspnoea, 50% had hypertension, 8% were smokers, 12% had a sore throat, 15% were diabetic, and 3.7% had COPD. A total of 17 patients were admitted to the Intensive Care Unit (ICU), and 3 died due to COVID-19 infection.
NovMed data set
The second data set, also known as the validation data set, consisted of 72 NovMed COVID-19 positive individuals, of which 47 were male and the rest female. An RT-PCR test was conducted to confirm COVID-19 positivity in the selected cohort. The average value of ground-glass opacity (GGO), consolidation, and other opacities was ∼2.4. Of the 72 NovMed patients who participated in this study, 61% had a cough, 9% had a sore throat, 54% had dyspnoea, 42% had hypertension, 12% were diabetic, 11% had COPD, and 11% were smokers. A total of 10 patients died due to COVID-19 infection.
Image acquisition and data preparation
CroMed and NovMed data sets
The CroMed data set of 80 COVID-19 positive individuals (Fig. 2) was employed in this investigation. The retrospective cohort study was conducted at the University Hospital for Infectious Diseases in Zagreb, Croatia, from March 1 to December 31, 2020. All patients were over the age of 18, agreed to take part in the study, had a positive RT-PCR test for the SARS-CoV-2 virus, and underwent thoracic MDCT during their hospital stay. The patients met at least one of the following criteria: hypoxia (oxygen saturation ≤92%), tachypnea (respiratory rate ≥22 per minute), tachycardia (pulse rate >100), or hypotension (systolic blood pressure ≤100 mmHg). The UHID Ethics Committee approved the proposal. The scanner used was a 64-detector FCT Speedia HD (Fujifilm Corporation, Tokyo, Japan, 2017), and the acquisition protocol for collecting the CT scan was a single full inspiratory breath-hold in the craniocaudal direction.
Fig. 2
Sample CT scans taken from raw CroMed data set.
Sample CT scans taken from the raw CroMed data set.

The NovMed data set taken from Italy consisted of 72 patients (Fig. 3). Patients were positioned supine, and chest CT scans were acquired in a full inspiratory breath-hold using a 128-slice multidetector-row CT scanner (Philips Ingenuity Core, Philips Healthcare). There was no intravenous or oral administration of contrast agent. A soft tissue kernel with a lung kernel and a 768 × 768 matrix (lung window) was used to reconstruct one mm thick slices. The CT scans were performed using 120 kV, 226 mAs/slice (using Philips' automatic tube current modulation, Z-DOM), a 1.08 spiral pitch factor, a 0.5-s gantry rotation time, and a 64 × 0.625 detector configuration.
Fig. 3
Sample CT scans taken from raw unseen NovMed data set.
Artificial intelligence models
Overall system for pruned AI and unpruned AI
Fig. 4 shows the local pruned AI system for CT lung segmentation. It consists of two parts, the training and testing systems. The training system itself has two parts. Part A uses a conventional system with two types of base models, FCN or SegNet, for training the AI model without EA, the so-called unpruned AI (UnPAI). It uses raw grayscale pre-processed images and gold standard annotations. In Part B, the training system uses the PAI system, which utilizes evolutionary algorithms, namely DE, GA, PSO, and WO, to generate the PAI models with EA. Next, we discuss the base DL models, which are well known in the DL industry.
Fig. 4
Local COVLIAS 2.0 system for pruned and unpruned AI models.
Base model 1: Fully Convolutional Networks
Fully Convolutional Networks, or FCNs (Fig. 5), are a type of architecture mainly utilized for semantic segmentation. They use only locally connected layers for convolution, pooling, and upsampling. Since they avoid dense layers, they have fewer parameters, which makes the networks faster to train. It also implies that an FCN can work with varying image sizes, since all connections are local. The model learns the essential features in an image using a CNN feature extractor, referred to as the model's encoder. The image gets downsampled as it goes through the convolutional layers. The output is then transmitted to the model's decoder portion, consisting of additional convolutional layers.
Fig. 5
Architecture for base model one - FCN.
The decoder layers step-by-step upsample the image to its original dimensions, resulting in pixelwise labeling, also known as a pixel mask or segmentation mask of the original image [98]. Some standard versions of FCNs are FCN-32, FCN-16, and FCN-8.
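As a rough sketch of this encoder-decoder idea (and not the exact COVLIAS 2.0 architecture), a minimal FCN-style network in PyTorch might look as follows; the channel widths and input size are illustrative assumptions:

```python
import torch
import torch.nn as nn

class MiniFCN(nn.Module):
    """Minimal FCN-style encoder-decoder sketch: all layers are local
    (convolution, pooling, transposed convolution), so any input size
    divisible by 4 works."""
    def __init__(self, n_classes: int = 2):
        super().__init__()
        self.encoder = nn.Sequential(                      # downsampling path
            nn.Conv2d(1, 64, 3, padding=1), nn.ReLU(inplace=True),
            nn.MaxPool2d(2),                               # 1/2 resolution
            nn.Conv2d(64, 128, 3, padding=1), nn.ReLU(inplace=True),
            nn.MaxPool2d(2),                               # 1/4 resolution
        )
        self.decoder = nn.Sequential(                      # upsampling path
            nn.ConvTranspose2d(128, 64, 2, stride=2), nn.ReLU(inplace=True),
            nn.ConvTranspose2d(64, n_classes, 2, stride=2),
        )

    def forward(self, x):
        return self.decoder(self.encoder(x))               # (N, n_classes, H, W) logits

logits = MiniFCN()(torch.randn(1, 1, 512, 512))            # one grayscale CT slice
```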
Base model 2: Segmentation Network (SegNet)
SegNet is a semantic segmentation model (Fig. 6) that has been implemented before [99]. However, fusing evolutionary strategies with this base model is a truly innovative component of this study, and it has been done here for the first time in CT lung segmentation. SegNet is a trainable segmentation architecture that consists of (i) an encoder network (the left half of Fig. 6) topologically identical to the 13 convolutional layers of VGG16, (ii) a corresponding decoder network (the right half of Fig. 6), followed by (iii) a pixel-wise classification layer (the last block of Fig. 6). The encoder network down-samples its output using a technique that requires storing the max-pooling (filter size of 2 × 2) indices. On the encoder side, a max-pooling layer is added at the end of each block, doubling the depth of the following layer (64, 128, 256, 512). Similarly, up-sampling occurs in the second half of the design, where the layer depth is halved at each step (512, 256, 128, 64). The max-pooling indices from the corresponding encoder layer are recalled during the up-sampling operation. Finally, a K-class SoftMax classifier is used to predict the class for each pixel at the output. This provides reasonable performance while also saving space. Typical SegNet examples for segmentation in other applications can be seen in [23,25,28].
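The index-recalling up-sampling can be sketched in a few lines of PyTorch; the tensor sizes here are illustrative, not the exact SegNet dimensions:

```python
import torch
import torch.nn.functional as F

x = torch.randn(1, 64, 512, 512)    # an encoder-block feature map (illustrative size)

# Encoder side: 2 x 2 max-pooling that also returns the argmax indices
pooled, indices = F.max_pool2d(x, kernel_size=2, stride=2, return_indices=True)

# Decoder side: the stored indices put each value back at its original
# location, so this up-sampling step needs no learned weights
unpooled = F.max_unpool2d(pooled, indices, kernel_size=2, stride=2)

assert unpooled.shape == x.shape    # restored to the pre-pooling resolution
```

Storing only the pooling indices, rather than full feature maps, is what gives SegNet its memory advantage during inference.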
Fig. 6
Architecture for base model two – SegNet.
Design of eight PAI models
As discussed earlier, the overall structure of the interface between the base FCN/SegNet models and the adaptation of the four EA methods can be seen in Fig. 7. Note that the input to the entire system is the raw grayscale lung CT scans along with the binary gold standard. These two base DL networks (FCN/SegNet) get optimized using the four EA methods; four methods times two networks constitute the eight pruning techniques. The four AI model pruning techniques are (i) DE, (ii) GA, (iii) PSO, and (iv) WO, with FCN and SegNet as base DL models. This yields the following eight pruning models: (i) FCN-DE, (ii) FCN-GA, (iii) FCN-PSO, and (iv) FCN-WO when using solo FCN, and (v) SegNet-DE, (vi) SegNet-GA, (vii) SegNet-PSO, and (viii) SegNet-WO when using solo SegNet.
Fig. 7
Four pruning techniques (DE, GA, PSO, and WO) leading to eight systems: FCN-DE, FCN-GA, FCN-PSO, FCN-WO, SegNet-DE, SegNet-GA, SegNet-PSO, and SegNet-WO.
Note that the EA methods fundamentally compress the base networks (FCN/SegNet). The steps are broken down as follows, with a sketch of the search loop given after this list: (i) The compression process starts from the 1st hidden convolutional layer and forms a vector of random 0s and 1s equal in length to the number of hidden neurons in that layer. The trained FCN/SegNet hidden neurons corresponding to the 0 positions of the vector are temporarily removed, and the fitness criterion value is calculated. The fitness value is a dual-objective function with accuracy and compressed-nodes ratio as the two objectives; note that we need to maximize a weighted sum of both objectives. After several EA iterations (say 20-30), we find the best vector, i.e., the best set of neurons that can be dropped to get the best compression ratio and accuracy after compression. (ii) The above process is repeated for all the hidden layers until we remove redundant filters from all the hidden convolutional layers. (iii) A new compressed model is created with each hidden layer having a number of neurons equal to the sum of 1s in the EA vector for that hidden layer. (iv) Weights are then copied from the original trained FCN/SegNet to the newly created compressed model. This is done such that each original-model neuron's weight at a vector position with value 1 is copied to the corresponding compressed-model neuron. (v) The compressed model is further trained for around 10-50 epochs to regain the accuracy lost during the node-discarding process. After this, the accuracy and compressed model size are recorded. (vi) If the accuracy of the compressed model is within an acceptable value, i.e., not less than 1% below the original accuracy, then this compressed model is treated as the original uncompressed model and is further compressed by going back to step (i). By doing this, we achieve large compression after several compression iterations (say ∼10).

The four evolutionary algorithms work similarly by making vectors of random 0s and 1s for an initial population (around 50 individuals); only the intermediate steps for forming new child vectors differ. Finally, the best vector, with the best fitness function value, is retained. This best vector helps to remove redundant neurons from the trained FCN/SegNet. DE is a reproduction process that uses distance and orientation information via unit vectors, improving solutions through evolutionary processes [100,101]. The process uses mutation and recombination to form new vectors [102,103]. These algorithms make minimal assumptions about the underlying optimization problem and explore enormous design spaces rapidly. The second EA method, GA, which originated from Darwin's theory of evolution [104], maintains a population of individuals who differ from one another; those better adapted to their environment have a higher chance of surviving, reproducing, and passing on their traits to future generations (survival of the fittest). GA uses the processes of selection, crossover, and mutation to produce optimized solutions [105,106]. The third EA method, PSO, originally proposed by Kennedy and Eberhart [107] in 1995, imitates a flock of birds or a school of fish that learn from each other to find the best position having food [108,109]. The position vector is formed in this scenario by having random 0s and 1s. The vector with the highest fitness is assumed to be the position of food, and a set of equations finds new position vectors in the next iterations. Finally, the last EA method, WO, is a meta-heuristic optimization algorithm [110] that originated from the humpback whale behavior of encircling its prey in a spiral fashion [111,112]. The best vector with the highest fitness is assumed to be the position of the prey, so the algorithm converges to that position; it also has a different set of equations to find new position vectors in the next iterations. These algorithms are shown in diagrammatic form in Fig. 7(a-d), where (a), (b), (c), and (d) represent DE, GA, PSO, and WO.
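As referenced in step (i) above, a minimal sketch of the GA variant of this search is given below; `fitness(mask)` is an assumed callback that temporarily removes the 0-position neurons from the trained FCN/SegNet layer and returns the Eq. (2) value, and the population size, iteration count, and mutation rate are illustrative:

```python
import numpy as np

def evolve_keep_mask(n_neurons, fitness, pop=50, iters=25, seed=0):
    """GA-style search (sketch) for the best keep/drop vector of one hidden
    layer: selection of the fittest half, one-point crossover, bit-flip
    mutation. Returns a 0/1 vector (1 = keep neuron, 0 = drop it)."""
    rng = np.random.default_rng(seed)
    population = rng.integers(0, 2, size=(pop, n_neurons))
    for _ in range(iters):
        scores = np.array([fitness(m) for m in population])
        parents = population[np.argsort(scores)[-pop // 2:]]       # survival of the fittest
        cuts = rng.integers(1, n_neurons, size=pop // 2)
        children = np.array([np.concatenate([a[:c], b[c:]])        # one-point crossover
                             for a, b, c in zip(parents, parents[::-1], cuts)])
        children[rng.random(children.shape) < 0.02] ^= 1           # bit-flip mutation
        population = np.vstack([parents, children])
    scores = np.array([fitness(m) for m in population])
    return population[np.argmax(scores)]

# Toy fitness: a fixed stand-in accuracy of 0.9 plus a normalized compression
# term (a [0, 1] stand-in for Cr, illustration only)
best = evolve_keep_mask(16, lambda m: 0.5 * 0.9 + 0.5 * (1 - m.mean()),
                        pop=20, iters=10)
```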
Loss function for AI base models
The new models adopted the cross-entropy (CE) loss function during model generation. If $\mathcal{L}_{CE}$ represents the CE loss function, $p_i$ represents the classifier's probability used in the AI model, $y_i$ represents the input gold standard label 1, and $(1-y_i)$ represents the gold standard label 0, then the loss function can be expressed mathematically as shown in Eq. (1), where the products of the two terms are summed over all pixels $i$:

$$\mathcal{L}_{CE} = -\sum_{i}\left[\, y_i \log(p_i) + (1-y_i)\log(1-p_i) \,\right] \quad (1)$$
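A minimal NumPy sketch of Eq. (1) on a toy 2 × 2 mask (averaged over pixels here, rather than summed):

```python
import numpy as np

def cross_entropy_loss(p, y, eps=1e-12):
    """Pixel-wise binary cross-entropy, Eq. (1): p is the classifier's
    foreground probability, y the gold standard mask (1 = lung)."""
    p = np.clip(p, eps, 1.0 - eps)              # guard against log(0)
    return -np.mean(y * np.log(p) + (1 - y) * np.log(1 - p))

y = np.array([[0.0, 1.0], [1.0, 1.0]])          # toy gold standard mask
p = np.array([[0.1, 0.8], [0.7, 0.9]])          # classifier probabilities
print(cross_entropy_loss(p, y))                 # ≈ 0.1976
```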
Fitness function for AI EA models
The main foundation of an EA is the fitness function, which helps achieve the best possible solution set by maximizing or minimizing the objective fitness criterion. In each iteration, the value of the fitness function is compared with the previous best, and the solution space is updated depending on its value. In all the evolutionary algorithms in this study, a dual-objective fitness function has been used, which depends on the mean intersection over union (mIoU) accuracy ($A_{mIoU}$) on the test set and the nodes compression ratio ($C_r$) [113,114]. If $N_o$ is the number of original hidden neurons and $N_c$ the number of compressed hidden neurons, so that $C_r = N_o / N_c$, then the optimized fitness function $F$ can be given mathematically by Eq. (2) below:

$$F = \mu_1 \, A_{mIoU} + \mu_2 \, C_r \quad (2)$$

where $\mu_1$ and $\mu_2$ are weight factors that help give more emphasis to one of the objectives. In our implementation, we chose both weights as 0.5. If we choose combinations such as 0.2 and 0.8, more compression can be achieved, but accuracy may be compromised, and vice-versa.
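A direct sketch of Eq. (2) as reconstructed above; note that $C_r \geq 1$ while mIoU $\leq 1$, so in practice the two terms may be rescaled before mixing:

```python
def fitness_eq2(miou, n_original, n_compressed, mu1=0.5, mu2=0.5):
    """Eq. (2) sketch: F = mu1 * A_mIoU + mu2 * Cr, with Cr = No / Nc."""
    cr = n_original / n_compressed       # nodes compression ratio
    return mu1 * miou + mu2 * cr

# Keeping 64 of 512 neurons at mIoU 0.93: F = 0.5 * 0.93 + 0.5 * 8 = 4.465
print(fitness_eq2(0.93, 512, 64))
```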
Experimental protocol
A standardized cross-validation (CV) technique was used to train the AI models. Our team has created several CV-based protocols of various types using the AI framework. The K5 protocol was used because the data set was of moderate size. The data consisted of 80% training data (4,000 CT scans) and 20% testing data (1,000 CT images). Five folds were designed in such a way that each fold has its own test set. Internal validation was included in the K5 protocol, with 10% of the training data being used for validation.

The accuracy of an AI system is measured by comparing the predicted output to ground truth pixel values. These readings were interpreted as binary (0 or 1) numbers because the output lung mask was only black or white. If TP, TN, FN, and FP stand for true positive, true negative, false negative, and false positive, respectively, then Eq. (3) gives the AI system's accuracy, i.e., the number of correctly labeled pixels divided by the total number of pixels in the image:

$$\text{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN} \quad (3)$$
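Eq. (3) reduces to a few lines over binary masks; a NumPy sketch with a toy 2 × 2 example:

```python
import numpy as np

def pixel_accuracy(pred, gt):
    """Eq. (3): (TP + TN) / (TP + TN + FP + FN) over a binary lung mask."""
    pred, gt = pred.astype(bool), gt.astype(bool)
    tp = np.sum(pred & gt)          # lung pixels predicted as lung
    tn = np.sum(~pred & ~gt)        # background predicted as background
    fp = np.sum(pred & ~gt)
    fn = np.sum(~pred & gt)
    return (tp + tn) / (tp + tn + fp + fn)

gt = np.array([[0, 1], [1, 1]])     # toy gold standard
pred = np.array([[0, 1], [0, 1]])   # one lung pixel missed
print(pixel_accuracy(pred, gt))     # 3 of 4 pixels correct -> 0.75
```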
Results and performance evaluation
The results have been compartmentalized into eight sets as proposed by COVLIAS 2.0, covering the eight pruning and optimization techniques. The first set uses (i) the first base model FCN and the four optimization techniques with FCN, namely (ii) FCN-DE, (iii) FCN-GA, (iv) FCN-PSO, and (v) FCN-WO. The second set uses (i) the second base model SegNet and the four optimization techniques with SegNet, namely (ii) SegNet-DE, (iii) SegNet-GA, (iv) SegNet-PSO, and (v) SegNet-WO. This enhances the speed and reduces the storage of the final AI models.
Results
Visual Results using Pruning Models on the CroMed Experimental Data set

The training of the AI models was done using the CroMed data set with 5,000 COVID-19 CT lung images, with one set of ground truth annotations from a senior radiologist. Fig. 8, Fig. 9 show the overlay results of the segmented lungs with grayscale in the background. Fig. 8 is for the combinations FCN-DE, FCN-GA, SegNet-DE, and SegNet-GA, and Fig. 9 for FCN-PSO, FCN-WO, SegNet-PSO, and SegNet-WO.
Fig. 8
Overlays for optimized pruning networks over raw grayscale CT scans using the CroMed data set: FCN-DE and FCN-GA pruning combinations, and SegNet-DE and SegNet-GA pruning combinations.
Fig. 9
Overlays for optimized pruning networks over raw grayscale CT scans using the CroMed data set: FCN-PSO and FCN-WO pruning combinations, and SegNet-PSO and SegNet-WO pruning combinations.
Percentage Storage Reduction using Pruning Models on the CroMed Data set

Table 1 shows the storage reduction for the eight pruning techniques using the percentage storage reduction (PSR) formulas shown in Eqs. (4) and (5):

$$PSR_{FCN\text{-}EA} = \frac{S_{FCN} - S_{FCN\text{-}EA}}{S_{FCN}} \times 100 \quad (4)$$

where $S_{FCN}$ represents the storage of the FCN model and $S_{FCN\text{-}EA}$ represents the storage of the corresponding FCN-based EA model, and

$$PSR_{SegNet\text{-}EA} = \frac{S_{SegNet} - S_{SegNet\text{-}EA}}{S_{SegNet}} \times 100 \quad (5)$$

where $S_{SegNet}$ represents the storage of the SegNet model and $S_{SegNet\text{-}EA}$ represents the storage of the corresponding SegNet-based EA model.
Table 1
Model sizes (in MB) and percentage storage reduction (PSR) for the pruned AI models.

| Models | Size (MB) | PSR | Models | Size (MB) | PSR |
|---|---|---|---|---|---|
| FCN | 512 | – | SegNet | 20.9 | – |
| FCN-DE | 38.72 | 92.4% | SegNet-DE | 0.6 | 97.1% |
| FCN-WO | 23.85 | 95.3% | SegNet-GA | 0.44 | 97.9% |
| FCN-GA | 6.55 | 98.7% | SegNet-WO | 0.25 | 98.8% |
| FCN-PSO | 1.2 | 99.8% | SegNet-PSO | 0.16 | 99.2% |
For (i) FCN-DE, (ii) FCN-WO, (iii) FCN-GA, and (iv) FCN-PSO, the PSR using Eq. (4) was 92.4%, 95.3%, 98.7%, and 99.8%, respectively, when compared against FCN; for (v) SegNet-DE, (vi) SegNet-GA, (vii) SegNet-WO, and (viii) SegNet-PSO, the PSR using Eq. (5) was 97.1%, 97.9%, 98.8%, and 99.2%, respectively, when compared against SegNet. This proves the hypothesis that pruning considerably reduces the size of the AI models, making the system fast and efficient while, most importantly, keeping the performance of the AI models within clinical standards. As per Table 1, the PSR values increase down the rows. We observed that PSR was highest for the FCN-PSO and SegNet-PSO pruning models and lowest for the FCN-DE and SegNet-DE models. The intermediate PSR models were FCN-WO and FCN-GA when FCN was used as the base, and SegNet-GA and SegNet-WO when SegNet was used as the base. Note that all eight fused systems were five times faster compared to the base models FCN and SegNet. PSR: Percentage Storage Reduction; DE: differential evolution; GA: genetic algorithm; PSO: particle swarm optimization; WO: whale optimization.

Note that in Eq. (2), Cr can have a minimum value of 1 and a maximum value equal to the total number of hidden neurons. To avoid unacceptable conditions, one must ensure that the number of compressed hidden neurons is never zero. The maximum Cr obtained for FCN-PSO and SegNet-PSO after 10 iterations was 962.47 and 133.12, respectively. The maximum Cr obtained for FCN-GA and SegNet-GA after 4 iterations was 80.03 and 48.71, respectively. The Cr for FCN-DE after 4 iterations was 13.54, while for SegNet-DE after 8 iterations it was 34.79. The Cr for FCN-WO after 12 iterations was 31.99, while for SegNet-WO after 10 iterations it was 85.73. Further efforts to compress led to degradation in performance and were therefore not considered.
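Eqs. (4) and (5) can be checked directly against the Table 1 entries; the small sketch below reproduces two of them:

```python
def psr(base_mb, pruned_mb):
    """Eqs. (4)/(5): percentage storage reduction of a pruned model."""
    return 100.0 * (base_mb - pruned_mb) / base_mb

print(round(psr(512.0, 38.72), 1))    # FCN-DE vs. FCN        -> 92.4
print(round(psr(20.9, 0.16), 1))      # SegNet-PSO vs. SegNet -> 99.2
```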
Performance evaluation
Performance Evaluation for the eight EA models

The proposed study makes use of six kinds of performance evaluation metrics, (i) AE, (ii) BA, (iii) CC, (iv) DS, (v) JI, and (vi) ROC, to compare the performance of the pruned AI models. The cumulative frequency distribution (CFD) plot for AE at a threshold cutoff of 80% for CroMed data is presented in Fig. 10. Fig. 11 shows the BA plot with mean and standard deviation (SD) lines for the estimated lung area of the AI models against the ground truth tracings for CroMed data. Similarly, CC plots with a cutoff of 80% are displayed in Fig. 12. CFD plots for DS and JI at a threshold cutoff of 80% for CroMed are presented in Fig. 13, Fig. 14. The COVLIAS 2.0 diagnostic performance can be calculated utilizing the variable-threshold technique and ROC analysis. Fig. 15 shows the ROC curves using the CroMed data set for the FCN-based and SegNet-based four EA models, having AUC values greater than ∼0.94 (p < 0.0001) and ∼0.96 (p < 0.0001), respectively.
Fig. 10
Cumulative frequency plot for lung Area Error for CroMed data: FCN with four pruning techniques (DE, GA, PSO, WO) and SegNet with four pruning techniques (DE, GA, PSO, WO).
Fig. 11
Bland-Altman plots for Area Error on CroMed data: FCN with four pruning techniques (DE, GA, PSO, WO) and SegNet with four pruning techniques (DE, GA, PSO, WO).
Fig. 12
CC plots for Area Error on CroMed data: FCN with four pruning techniques (DE, GA, PSO, WO) and SegNet with four pruning techniques (DE, GA, PSO, WO).
Fig. 13
Dice Similarity on CroMed data: FCN with four pruning techniques (DE, GA, PSO, WO) and SegNet with four pruning techniques (DE, GA, PSO, WO).
Fig. 14
Jaccard Index plots on CroMed data: FCN with four pruning techniques (DE, GA, PSO, WO) and SegNet with four pruning techniques (DE, GA, PSO, WO).
Fig. 15
ROC using the CroMed data set: FCN-based four EA (DE, GA, PSO, WO) and SegNet-based four EA (DE, GA, PSO, WO).
DS, JI, and CC for the eight EA models for the CroMed and NovMed Data sets

Let $DS^{FCN}_{EA(m)}$ and $DS^{SegNet}_{EA(m)}$ be the DS for the base models FCN and SegNet, corresponding to EA(m), where m can take the values 1, 2, 3, and 4, denoting DE, GA, PSO, and WO, respectively. Similarly, let $JI^{FCN}_{EA(m)}$ and $JI^{SegNet}_{EA(m)}$ be the JI, and $CC^{FCN}_{EA(m)}$ and $CC^{SegNet}_{EA(m)}$ the CC, for the base models FCN and SegNet, corresponding to EA(m). With $DS_{MedSeg}$, $JI_{MedSeg}$, and $CC_{MedSeg}$ denoting the corresponding MedSeg values, the percentage differences against MedSeg can be defined as in Eqs. (6)-(8), where B denotes the base model (FCN or SegNet):

$$\Delta DS^{B}_{EA(m)} = \frac{\left|DS_{MedSeg} - DS^{B}_{EA(m)}\right|}{DS_{MedSeg}} \times 100 \quad (6)$$

$$\Delta JI^{B}_{EA(m)} = \frac{\left|JI_{MedSeg} - JI^{B}_{EA(m)}\right|}{JI_{MedSeg}} \times 100 \quad (7)$$

$$\Delta CC^{B}_{EA(m)} = \frac{\left|CC_{MedSeg} - CC^{B}_{EA(m)}\right|}{CC_{MedSeg}} \times 100 \quad (8)$$

Using Eqs. (6)-(8), one can compute the mean over all four EA models for the FCN and SegNet base models as defined in Eq. (9):

$$\mu^{B} = \frac{1}{4}\sum_{m=1}^{4} \Delta^{B}_{EA(m)} \quad (9)$$

Table 2 summarizes the DS, JI, and CC values for the eight pruned AI models over the CroMed (experimental) and Unseen NovMed (validation) data sets, along with the results achieved by MedSeg. Table 3 presents the percentage difference in DS over the eight pruned AI models, with FCN-DE, FCN-GA, FCN-PSO, and FCN-WO on the left side and SegNet-DE, SegNet-GA, SegNet-PSO, and SegNet-WO on the right side, using Eqs. (6) and (9). The mean dice difference for the four pruned FCN models is 3%, 8%, and 2%, while for the four pruned SegNet models it is 0%, 4%, and 1%, respectively, for the CroMed (experimental) and Unseen NovMed (validation) data sets, using Eq. (9). Further, the mean percentage difference using Eq. (7) for the JI values is presented in Table 4, which is 5%, 15%, and 3% for the pruned FCN models and 1%, 8%, and 1% for the pruned SegNet models, respectively, for the CroMed (experimental) and Unseen NovMed (validation) data sets. Similarly, the mean percentage difference for the CC values using Eq. (8) is presented in Table 5, which is 3%, 1%, and 1% for the pruned FCN models and 1%, 1%, and 0% for the pruned SegNet models, respectively, for the CroMed (experimental) and Unseen NovMed (validation) data sets.
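As reconstructed in Eqs. (6)-(8), each entry in Table 3, Table 4, Table 5 is a relative difference against MedSeg; the sketch below reproduces the Table 3 entry for FCN-DE on CroMed from the Table 2 values:

```python
def pct_diff_vs_medseg(metric_ea, metric_medseg):
    """Eqs. (6)-(8) as reconstructed above: percentage difference of an
    EA model's DS/JI/CC against MedSeg (both measured against the GT)."""
    return 100.0 * abs(metric_medseg - metric_ea) / metric_medseg

ds_fcn_de, ds_medseg = 0.93, 0.96     # CroMed DS values from Table 2
print(round(pct_diff_vs_medseg(ds_fcn_de, ds_medseg)))   # -> 3, as in Table 3
```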
Table 2
DS, JI, and CC values for experimental CroMed and Unseen NovMed.

| Models | DS (CroMed) | JI (CroMed) | CC (CroMed) | DS (NovMed) | JI (NovMed) | CC (NovMed) |
|---|---|---|---|---|---|---|
| FCN-DE | 0.93 | 0.88 | 0.97 | 0.93 | 0.87 | 0.99 |
| FCN-GA | 0.93 | 0.87 | 0.94 | 0.94 | 0.88 | 0.98 |
| FCN-PSO | 0.92 | 0.86 | 0.97 | 0.91 | 0.84 | 0.98 |
| FCN-WO | 0.94 | 0.89 | 0.97 | 0.94 | 0.89 | 0.99 |
| SegNet-DE | 0.96 | 0.92 | 0.97 | 0.94 | 0.89 | 0.99 |
| SegNet-GA | 0.96 | 0.93 | 0.98 | 0.94 | 0.89 | 0.98 |
| SegNet-PSO | 0.96 | 0.94 | 0.99 | 0.95 | 0.91 | 0.99 |
| SegNet-WO | 0.96 | 0.94 | 0.98 | 0.95 | 0.91 | 0.99 |
| MedSeg | 0.96 | 0.92 | 0.99 | 0.95 | 0.90 | 0.99 |
Table 3
Percentage difference in Dice Similarity (against GT) for the eight EA models when compared against MedSeg using experimental CroMed and Unseen NovMed.

| Models | CroMed | Unseen NovMed | Models | CroMed | Unseen NovMed |
|---|---|---|---|---|---|
| FCN-DE | 3% | 2% | SegNet-DE | 0% | 1% |
| FCN-GA | 3% | 1% | SegNet-GA | 0% | 1% |
| FCN-PSO | 4% | 4% | SegNet-PSO | 0% | 0% |
| FCN-WO | 2% | 1% | SegNet-WO | 0% | 0% |
| μ | 3% | 2% | μ | 0% | 1% |
| σ | 1% | 1% | σ | 0% | 1% |
Table 4
Percentage difference in Jaccard Index (against GT) for the eight EA models when compared against MedSeg using experimental CroMed and Unseen NovMed.

| Models | CroMed | Unseen NovMed | Models | CroMed | Unseen NovMed |
|---|---|---|---|---|---|
| FCN-DE | 4% | 3% | SegNet-DE | 0% | 1% |
| FCN-GA | 5% | 2% | SegNet-GA | 1% | 1% |
| FCN-PSO | 7% | 7% | SegNet-PSO | 2% | 1% |
| FCN-WO | 3% | 1% | SegNet-WO | 2% | 1% |
| μ | 5% | 3% | μ | 1% | 1% |
| σ | 1% | 2% | σ | 1% | 0% |
Table 5
Percentage difference in Correlation Coefficient (against GT) for the eight EA models when compared against MedSeg using experimental CroMed and Unseen NovMed.

| Models | CroMed | Unseen NovMed | Models | CroMed | Unseen NovMed |
|---|---|---|---|---|---|
| FCN-DE | 2% | 0% | SegNet-DE | 2% | 0% |
| FCN-GA | 5% | 1% | SegNet-GA | 1% | 1% |
| FCN-PSO | 2% | 1% | SegNet-PSO | 0% | 0% |
| FCN-WO | 2% | 0% | SegNet-WO | 1% | 0% |
| μ | 3% | 1% | μ | 1% | 0% |
| σ | 2% | 1% | σ | 1% | 1% |
Comparison between the eight EA models using Figure-of-Merit

We compared the FCN- and SegNet-based eight pruned AI models using the standardized Figure-of-Merit (FoM) algorithm [25,[115], [116], [117]]. Mathematically, this can be represented as follows. Let $\bar{A}_{EA}(d)$ represent the mean area for the EA models on a datatype d, and $\bar{A}_{GT}(d)$ the mean ground truth (GT) area for the datatype d, both taken over the N images of the cohort datatype d, where d represents CroMed or NovMed. Using these notations, the FoM in % for the EA using datatype d can be represented as shown in Eq. (10):

$$FoM_{EA}(d) = \left(1 - \frac{\left|\bar{A}_{EA}(d) - \bar{A}_{GT}(d)\right|}{\bar{A}_{GT}(d)}\right) \times 100 \quad (10)$$

where $\bar{A}_{EA}(d) = \frac{1}{N}\sum_{n=1}^{N} A_{EA}(n)$ and $\bar{A}_{GT}(d) = \frac{1}{N}\sum_{n=1}^{N} A_{GT}(n)$, $A_{EA}(n)$ represents the area of the lungs in CT image n using the evolutionary algorithm EA (DE, GA, PSO, or WO), $A_{GT}(n)$ represents the GT area corresponding to image n, and Σ represents the summation of the areas of all the images in the cohort. Table 6 and Table 7 show the FoM for the CroMed and NovMed data sets. Table 6 shows the FoM for CroMed corresponding to the FCN- and SegNet-based EA models in columns 2 and 3; the difference of the two columns is shown in the column 'Superior'. For every EA except DE, the SegNet-based EA is superior, in the range of 1-10%. The overall means of the four EA using the two base models are comparable, with a difference of 2.39%. Table 7 shows the FoM for unseen NovMed corresponding to the FCN- and SegNet-based EA models in columns 2 and 3; the difference of the two columns is shown in the column 'Superior'. For every EA except GA and WO, the SegNet-based EA is superior, in the range of 1-10%. The overall means of the four EA using the two base models are comparable, with a difference of 5.45%.
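A sketch of Eq. (10) as reconstructed above, using toy lung areas (in pixels) in place of a real cohort:

```python
import numpy as np

def fom(areas_ea, areas_gt):
    """Eq. (10) sketch: FoM (%) compares the cohort-mean EA lung area
    with the cohort-mean GT lung area for a data set d."""
    mean_ea, mean_gt = np.mean(areas_ea), np.mean(areas_gt)
    return 100.0 * (1.0 - abs(mean_ea - mean_gt) / mean_gt)

# EA areas about 5% below the GT areas give a FoM near 95%
print(fom([21000, 19500, 20200], [22000, 20600, 21300]))   # ≈ 95.0
```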
Table 6
FoM comparison for CroMed (seen data) using the eight EA models.

| Models | FCN | SegNet | Superior |
|---|---|---|---|
| DE | 94.33 | 93.00 | 1.33 |
| GA | 97.95 | 98.94 | 0.99 |
| PSO | 90.12 | 97.08 | 6.96 |
| WO | 98.08 | 98.35 | 0.28 |
| μ | 95.12 | 96.84 | 2.39 |
| σ | 3.76 | 2.68 | 3.08 |
Table 7
FoM comparison for Unseen NovMed using the eight EA models.

| Models | FCN | SegNet | Superior |
|---|---|---|---|
| DE | 89.81 | 98.50 | 8.70 |
| GA | 93.21 | 89.25 | 3.96 |
| PSO | 85.62 | 94.32 | 8.70 |
| WO | 92.90 | 92.47 | 0.43 |
| μ | 90.38 | 93.64 | 5.45 |
| σ | 3.53 | 3.86 | 4.02 |
Finally, we can conclude from Table 6 and Table 7 that the mean FoM for the FCN-based EA models over the two data sets (CroMed and NovMed) is ∼94%, while for SegNet it is ∼96%. This clearly demonstrates that the four EA models based on SegNet are better than the four EA models based on FCN. This is mainly because the number of layers in the SegNet model (Fig. 6) is larger than in the FCN model (Fig. 5), and SegNet recalls the encoder's max-pooling indices during decoding, which FCN lacks. The results in which SegNet is superior to FCN are similar to previous applications [23,25,26,28].
Validation and statistical tests
The proposed study presents a lung segmentation validation using (i) the Unseen NovMed data set on the trained pruned AI models and (ii) a comparison of the results against MedSeg, a web-based CT lung segmentation tool. Fig. 16, Fig. 17 show the overlay results of the segmented lungs (red) with grayscale in the background using the Unseen NovMed data. Fig. 16 covers the FCN-DE, FCN-GA, SegNet-DE, and SegNet-GA combinations, and Fig. 17 covers the FCN-PSO, FCN-WO, SegNet-PSO, and SegNet-WO combinations. Note again that the "Unseen NovMed" data set was used to validate the results. Fig. 18, Fig. 19, Fig. 20, Fig. 21, Fig. 22 show the CFD plot of AE, the BA plot, the CC plot, and the CFD plots of DS and JI, respectively, for the segmented lungs using COVLIAS 2.0 on the "Unseen NovMed" data set. Fig. 23 shows the ROC curves using Unseen NovMed for the FCN-based and SegNet-based four EA models, each having AUC values of more than ∼0.86 (p < 0.0001).
Fig. 16
Overlays (red) for optimized pruning networks over raw grayscale CT scans for the NovMed data set: FCN-DE and FCN-GA pruning combinations, and SegNet-DE and SegNet-GA pruning combinations.
Fig. 17
Overlays (red) for optimized pruning networks over raw grayscale CT scans for the NovMed data set: FCN-PSO and FCN-WO pruning combinations, and SegNet-PSO and SegNet-WO pruning combinations.
Fig. 18
Cumulative frequency plot for Area Error using NovMed data.
Fig. 19
BA plot using NovMed data.
Fig. 20
CC plot using NovMed data.
Fig. 21
Cumulative frequency plot for Dice using NovMed data.
Fig. 22
Cumulative frequency plot for Jaccard using NovMed data.
Fig. 23
ROC using the NovMed data set: FCN-based four EA (DE, GA, PSO, WO) and SegNet-based four EA (DE, GA, PSO, WO).
The MedSeg tool's results were compared to the gold standard tracings of the two data sets utilized in the study. The CFD plots of DS and JI and the CC plot for the segmented lungs using the MedSeg tool for the CroMed and "Unseen NovMed" data sets are shown in Fig. 24, Fig. 25, Fig. 26, respectively. Similarly, Fig. 27 shows the BA plot of the results from MedSeg compared to the ground truth tracings of the two data sets (CroMed on the left and NovMed on the right). The percentage difference between the DS, JI, and CC scores of the COVLIAS 2.0 AI models and those of MedSeg is <5%, thus proving the applicability of the proposed AI models in the clinical domain. Fig. 28 shows the ROC curves and AUC values for MedSeg, with CroMed and Unseen NovMed having AUC values of 0.99 (p < 0.0001) and 0.89 (p < 0.0001), respectively. The sizes utilized by the models in decreasing order, FCN, FCN-DE, FCN-WO, FCN-GA, and FCN-PSO, are shown in Fig. 29, and SegNet, SegNet-DE, SegNet-GA, SegNet-WO, and SegNet-PSO in Fig. 30. The quantitative analysis using AE, DS, JI, BA, CC, and ROC on Unseen NovMed shows similar behavior to the seen CroMed data.
Fig. 24
Cumulative frequency plot of DS for MedSeg using CroMed (left) and Unseen NovMed (right) data sets.
Fig. 25
Cumulative frequency plot of JI for MedSeg using CroMed (left) and Unseen NovMed (right) data sets.
Fig. 26
CC plot for MedSeg vs. GT using CroMed (left) and Unseen NovMed (right) data sets.
Fig. 27
BA plot for MedSeg vs. GT using CroMed (left) and Unseen NovMed (right) data sets.
Fig. 28
ROC plot for MedSeg vs. GT using CroMed and Unseen NovMed data sets.
Fig. 29
Size (in Megabyte) of the FCN-based models in descending order.
Fig. 30
Size (in Megabyte) of the SegNet-based models in descending order.
We present a summary and percentage improvement for all eight pruned AI models for the DS, JI, and CC values using the experimental CroMed and validation Unseen NovMed data sets in Table 2, Table 3, Table 4, Table 5, respectively. When comparing the four pruning techniques against the base models (FCN and SegNet) for the experimental and validation data, the pruned models perform far better than the base models for all three performance evaluation metrics. Finally, to prove the reliability of the AI-based segmentation system COVLIAS 2.0, statistical tests, namely the Mann-Whitney, paired t-test, and Wilcoxon tests, are presented for the experimental CroMed (Table 8) and validation Unseen NovMed (Table 9) analyses. All the above analyses were conducted using MedCalc software (v18.2.1, Ostend, Belgium).
Table 8
Statistical tests for the CroMed data set.

| Models | Paired t-Test | Mann-Whitney | Wilcoxon |
|---|---|---|---|
| FCN-DE | P < 0.0001 | P < 0.0001 | P < 0.0001 |
| FCN-GA | P < 0.0001 | P < 0.0001 | P < 0.0001 |
| FCN-PSO | P < 0.0001 | P < 0.0001 | P < 0.0001 |
| FCN-WO | P < 0.0001 | P < 0.0001 | P < 0.0001 |
| SegNet-DE | P < 0.0001 | P < 0.0001 | P < 0.0001 |
| SegNet-GA | P < 0.0001 | P < 0.0001 | P < 0.0001 |
| SegNet-PSO | P < 0.0001 | P < 0.0001 | P < 0.0001 |
| SegNet-WO | P < 0.0001 | P < 0.0001 | P < 0.0001 |
| MedSeg | P < 0.0001 | P < 0.0001 | P < 0.0001 |
Table 9
Statistical tests for the NovMed data set.

| Models | Paired t-Test | Mann-Whitney | Wilcoxon |
|---|---|---|---|
| FCN-DE | P < 0.0001 | P < 0.0001 | P < 0.0001 |
| FCN-GA | P < 0.0001 | P < 0.0001 | P < 0.0001 |
| FCN-PSO | P < 0.0001 | P < 0.0001 | P < 0.0001 |
| FCN-WO | P < 0.0001 | P < 0.0001 | P < 0.0001 |
| SegNet-DE | P < 0.0001 | P < 0.0001 | P < 0.0001 |
| SegNet-GA | P < 0.0001 | P < 0.0001 | P < 0.0001 |
| SegNet-PSO | P < 0.0001 | P < 0.0001 | P < 0.0001 |
| SegNet-WO | P < 0.0001 | P < 0.0001 | P < 0.0001 |
| MedSeg | P < 0.0001 | P < 0.0001 | P < 0.0001 |
Lesion localization validation
Lesions have different characteristics, such as texture, contrast, intensity variation, and density changes [118]. Fig. 31 presents the pipeline for lesion validation using heatmaps, where the input to the eight pruned segmentation models is the CT image, which produces the segmented lungs. The segmented lungs go to DenseNet-121 for classification into two classes, i.e., COVID-19 and Controls.
Fig. 31
COVLIAS 2.0 lesion heatmap pipeline using the DenseNet model with the infrastructure of pruned AI models during lung segmentation.

The Gradient-weighted Class Activation Mapping (Grad-CAM) algorithm (Fig. 32) is applied to produce the lesion heatmap. Grad-CAM employs the gradients of any target concept (e.g., "COVID-19" in this classification network) flowing into the final convolutional layer to build a coarse localization map highlighting the critical regions in the image for predicting that concept. From a high level, we take an image as input to the model for which a Grad-CAM heatmap is desired. This image is passed through the network following the normal prediction cycle (including the FC layer), generating the class probability scores, followed by the loss calculation. Next, we calculate the gradient of the desired model layer's output with respect to the model loss. Then we trim, resize, and rescale the areas of the gradient that contribute to the prediction so that the heatmap can be overlaid on the original image. In this scenario, ReLU is the best option because it highlights features that positively impact the class of interest.
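A minimal Grad-CAM sketch in PyTorch for a two-class DenseNet-121 is given below. The class index for COVID-19, the input size, and the (untrained) weights are illustrative assumptions, not the trained COVLIAS 2.0 classifier:

```python
import torch
import torch.nn.functional as F
from torchvision.models import densenet121

# Two-class (Control vs. COVID-19) DenseNet-121; weights would come from
# training on the pruned-AI segmented lungs (left untrained in this sketch)
model = densenet121(weights=None)
model.classifier = torch.nn.Linear(model.classifier.in_features, 2)
model.eval()

x = torch.randn(1, 3, 224, 224)        # segmented-lung slice, replicated to 3 channels

feats = model.features(x)              # last conv feature maps, (1, 1024, 7, 7)
feats.retain_grad()                    # keep gradients for this non-leaf tensor
pooled = F.adaptive_avg_pool2d(F.relu(feats), (1, 1)).flatten(1)
logits = model.classifier(pooled)
logits[0, 1].backward()                # backprop the COVID-19 score (index 1 assumed)

weights = feats.grad.mean(dim=(2, 3), keepdim=True)        # GAP of the gradients
cam = F.relu((weights * feats.detach()).sum(dim=1, keepdim=True))
cam = F.interpolate(cam, size=x.shape[2:], mode="bilinear", align_corners=False)
cam = (cam - cam.min()) / (cam.max() - cam.min() + 1e-8)   # [0, 1] heatmap to overlay
```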
Fig. 32
Grad-CAM process using the DenseNet-121 model utilized for lesion localization.
DenseNet-121 (Fig. 33) is made up of 120 convolutional layers and 4 average-pooling layers. All layers, including those within the same dense block and the transition layers, share their feature maps with the layers that follow, allowing deeper layers to leverage the characteristics extracted earlier in the network. Because they require fewer parameters and encourage feature reuse, DenseNets yield more compact models and attain state-of-the-art performance across competing data sets relative to their typical CNN or ResNet equivalents [[119], [120], [121]]. The critical regions that differentiate the COVID-19 CT scans from the Control CT scans are presented in Fig. 34, Fig. 35, Fig. 36.
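As an illustration of this classifier, a two-class DenseNet-121 head can be assembled in a few lines of Keras; the input size, ImageNet weights, and optimizer below are assumptions, not the study's training configuration:

```python
# Illustrative two-class DenseNet-121 classifier (COVID-19 vs. Controls).
import tensorflow as tf
from tensorflow.keras import layers, models

base = tf.keras.applications.DenseNet121(
    include_top=False, weights="imagenet", input_shape=(224, 224, 3))
x = layers.GlobalAveragePooling2D()(base.output)
out = layers.Dense(2, activation="softmax")(x)  # two-class output
model = models.Model(base.input, out)
model.compile(optimizer="adam", loss="categorical_crossentropy",
              metrics=["accuracy"])
model.summary()
```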
Fig. 33
DenseNet-121 model utilized for lesion localization.
Fig. 34
Raw grayscale CT slice; segmented lungs with colored heatmap, where the red arrow indicates damage in the right lung only.
Fig. 35
Raw grayscale CT slice; segmented lungs with heatmap, where the red arrow indicates damage in the left lung.
Fig. 36
Raw grayscale CT slice; segmented lungs with heatmap, where red arrows indicate damage in both the left and right lungs.
Discussion
Major contributions and study findings
The proposed study presented eight pruned AI techniques, namely (i) FCN-DE, (ii) FCN-GA, (iii) FCN-PSO, (iv) FCN-WO, (v) SegNet-DE, (vi) SegNet-GA, (vii) SegNet-PSO, and (viii) SegNet-WO, for CT lung segmentation; this was the premier contribution of this study. The pruned AI models were trained on the CroMed data set, containing 5,000 COVID-19 CT images collected from 80 patients. Pre-processing of the CroMed data set consisted of Hounsfield unit (HU) windowing adjusted to highlight the lung region (1600, −400), allowing the models to train efficiently.

The second major contribution was the Unseen data analysis, which consisted of validating the eight pruned AI models on the Unseen NovMed data set of 4,000 CT images from 72 patients. Both the CroMed and Unseen NovMed data sets were annotated by senior radiologists.

The third major innovation of our study is the design of the lesion localization, which superposes heatmaps on the pruned-AI-segmented lungs, displaying the lesions in red. This was implemented by first developing the classification model using DenseNet-121 and then adapting the trained model to apply the Grad-CAM heatmap system to the grayscale lungs (Fig. 33, Fig. 34, Fig. 35).

Our fourth major contribution is the benchmarking of our evolutionary-based COVLIAS 2.0 against MedSeg ([41], a web-based lung segmentation tool). This was the first such attempt in a low-storage and high-speed infrastructure. The percentage difference in DS, JI, and CC scores between the COVLIAS 2.0 AI models and MedSeg is <5%, indicating that the proposed AI models are clinically applicable. For the DS, JI, and CC values, we presented a summary and the percentage improvement for all eight pruned AI models in Table 3, Table 4, Table 5.

The study revealed that the pruned AI models (i) FCN-DE, (ii) FCN-GA, (iii) FCN-PSO, and (iv) FCN-WO had storage reductions of 92%, 95%, 99%, and 100%, respectively, when compared against the base FCN. When compared against the base SegNet, (v) SegNet-DE, (vi) SegNet-GA, (vii) SegNet-PSO, and (viii) SegNet-WO showed percentage storage reductions of 97%, 98%, 99%, and 99%, respectively. Thus, we validated our hypothesis and demonstrated superior clinical performance. To support this, we put COVLIAS 2.0 through clinical and statistical tests to ensure its reliability and stability. All our data analysis was comprehensively adapted to a 360-degree evaluation consisting of (i) DS, (ii) JI, (iii) BA, (iv) CC, (v) AE, and (vi) ROC plots. We consider this the fifth contribution of our study.
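As a sketch of the pre-processing step, the following assumes the (1600, −400) pair denotes HU window width and level, which is our interpretation; the function name and shapes are illustrative, not the paper's code:

```python
# Sketch of HU lung windowing, assuming (1600, -400) is (width, level).
import numpy as np

def hu_window(ct_hu: np.ndarray, width: float = 1600, level: float = -400) -> np.ndarray:
    """Clip a CT slice (in Hounsfield units) to the lung window and
    rescale it to 8-bit for the segmentation models."""
    lo, hi = level - width / 2, level + width / 2  # [-1200, 400] HU
    windowed = np.clip(ct_hu, lo, hi)
    return ((windowed - lo) / (hi - lo) * 255.0).astype(np.uint8)

# Example: a synthetic slice spanning dense air to bone-level HU values.
slice_hu = np.linspace(-1500, 1000, 512 * 512).reshape(512, 512)
print(hu_window(slice_hu).min(), hu_window(slice_hu).max())  # 0 255
```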
Benchmarking
We could not identify any articles that used pruning techniques to segment CT lungs. Several articles have applied pruning approaches to X-ray images to classify COVID-19 pneumonia. As a result, we chose to include papers that performed CT lung segmentation in our benchmarking strategy (Table 10). These included Jiang et al. [82] and Kogilavani et al. [84], who used VGG16 [85], DenseNet [86], MobileNet [87], Xception [88], NASNet [89], and EfficientNet [90] for COVID-19 lung classification. Paluru et al. [91] developed a CT-based COVID-19 detection model, named AnamNet, based on the data set in Ref. [92]. Other authors include Cai et al. [96] and Saood et al. [97]. None of these authors compared lung area errors or produced JI or BA plots, and none used any kind of model pruning to reduce the size of the trained model, which helps improve running time.
The study shows that the eight pruning techniques, (i) FCN-DE, (ii) FCN-GA, (iii) FCN-PSO, (iv) FCN-WO, (v) SegNet-DE, (vi) SegNet-GA, (vii) SegNet-PSO, and (viii) SegNet-WO, reduced storage by ∼97%. This supports the hypothesis that pruning considerably reduces the size of the AI models, making the system fast and efficient while, most importantly, keeping the performance of the AI models within clinical standards (a toy sketch of the threshold-selection idea behind such evolutionary pruning is given at the end of this subsection). Our pruning techniques did not decrease the overall performance of the AI models. First, the pruned system was five times faster than the existing base models. Second, the system was validated using the "Unseen NovMed" data set. Third, the lesion localization showed powerful results and was graded by our expert radiologists. Fourth, the system showed strong clinical statistical results.

The current pilot study is encouraging and yields promising results. However, we still need to explore hybrid deep learning networks, since they have proved efficient in classification and segmentation. We did not adopt mixing and matching of the data sets for testing the system's abilities, as done by our group earlier [122]. Further, we wish to explore statistical and non-statistical pruning and optimization techniques. The recent evolution of UNet has potential in our application [23,26,33,123]; the current study can therefore be extended by comparing SegNet-EA and FCN-EA against UNet-EA. A big data framework can be a powerful paradigm for including data from multiple sources [124].

Note that in our case we applied the HU adjustments to enhance the lung region in the CT images; one could also use noise removal or a combination of window-level adjustment with noise smoothing [125]. One can also use conventional methods such as level sets [126] or stochastic segmentation [47] to obtain the lesion region and then modify the loss function. AI tends to give high accuracy with a lack of clinical validation, thus introducing bias into AI systems. One can therefore compute the AI bias using AP(ai)Bias [[127], [128], [129]].
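The paper's evolutionary pruning operates on the trained FCN/SegNet weights; the toy sketch below only illustrates the underlying idea, using SciPy's differential evolution to pick a magnitude threshold that trades a Dice surrogate against the fraction of weights kept. The weight array, the surrogate, and the 0.5 trade-off factor are synthetic stand-ins, not the study's method:

```python
# Toy sketch of evolutionary threshold pruning via differential evolution.
import numpy as np
from scipy.optimize import differential_evolution

rng = np.random.default_rng(0)
flat_weights = rng.normal(0.0, 0.05, size=10_000)  # stand-in model weights

def dice_surrogate(pruned: np.ndarray) -> float:
    # Placeholder for re-evaluating the pruned model's Dice score (DS) on
    # validation CT slices; here: 1 - fraction of weight magnitude lost.
    lost = np.abs(flat_weights - pruned).sum()
    return 1.0 - lost / (np.abs(flat_weights).sum() + 1e-8)

def fitness(x: np.ndarray) -> float:
    threshold = x[0]
    mask = np.abs(flat_weights) >= threshold   # keep large-magnitude weights
    dice = dice_surrogate(flat_weights * mask)
    kept = mask.mean()                         # proxy for model storage
    return (1.0 - dice) + 0.5 * kept           # lower is better

result = differential_evolution(fitness, bounds=[(0.0, 0.2)],
                                maxiter=30, seed=0)
print("selected magnitude threshold:", result.x[0])
```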
Conclusions
The proposed study is the first pilot study that integrates eight kinds of evolution with two different DL paradigms, thereby designing eight different low-storage and high-speed strategies for COVID-19 CT lung segmentation. COVLIAS 2.0 demonstrated that performance was retained while yielding a percentage storage reduction of ∼97%. Overall, on average, the eight pruned models were almost five times faster than the base models. We compared our eight pruned AI models against MedSeg, an open-source, web-based lung segmentation tool. We also validated our hypothesis that the overall system error was under 6%, proving clinical applicability. The online COVLIAS 2.0 takes ∼0.25 s per CT slice, and it is reliable, accurate, and stable in clinical settings. The eight pruned models pair four pruning techniques, (i) differential evolution (DE), (ii) genetic algorithm (GA), (iii) particle swarm optimization (PSO), and (iv) whale optimization (WO), with two deep learning frameworks, namely (i) the fully convolutional network (FCN) and (ii) SegNet.
Declaration of competing interest
We respectfully wish to declare that there is a conflict of interest between our senior authors (having worked with them for over 10 years) and Dr. U. Rajendra Acharya and Dr. Filippo Molinari. We therefore humbly request that they not be the reviewers or Associate Editors (AE) of this manuscript. We deeply appreciate your kind understanding.