Sakthivel R1, I Sumaiya Thaseen2, Vanitha M2, Deepa M2, Angulakshmi M2, Mangayarkarasi R2, Anand Mahendran3, Waleed Alnumay4, Puspita Chatterjee5.
Abstract
Deep learning models demonstrate superior performance in image classification problems, and COVID-19 image classification has so far been developed using single deep learning models. In this paper, an efficient hardware architecture based on an ensemble deep learning model is built to identify COVID-19 from chest X-ray (CXR) records. Five deep learning models, namely ResNet, MobileNet, IRCNN (Inception Recurrent Convolutional Neural Network), EfficientNet, and FitNet, are ensembled to fine-tune and enhance COVID-19 identification; these models are chosen because they individually perform well in other applications. Experimental analysis shows that the accuracy, precision, recall, and F1 score for COVID-19 detection are 0.99, 0.98, 0.98, and 0.98, respectively. An application-specific hardware architecture incorporates pipelining, parallel processing, and reuse of computational resources by carefully exploiting the data flow and resource availability. The processing element (PE) and the CNN architecture are modeled in Verilog, simulated, and synthesized using Cadence with the Taiwan Semiconductor Manufacturing Co. Ltd. (TSMC) 90 nm technology file. The simulation results show a 40% reduction in latency and in the number of clock cycles. Computation and power consumption are minimized by designing the PE as a data-aware unit. Thus, the proposed architecture is well suited for COVID-19 prediction and diagnosis.
Keywords: Accuracy; COVID-19; Data-aware computational unit; Deep learning; Ensemble; Latency of CNN; Performance; Pre-training
Year: 2022 PMID: 35136715 PMCID: PMC8812126 DOI: 10.1016/j.scs.2022.103713
Source DB: PubMed Journal: Sustain Cities Soc ISSN: 2210-6707 Impact factor: 10.696
Fig. 1. General architecture of the proposed model.
Fig. 2. First-stage pre-training of CNN models in the proposed approach.
Fig. 3. Second-stage pre-training of CNN models in the proposed approach.
Fig. 4. (a) An FC layer. (b) A semi-parallel implementation for an FC layer with each neuron.
Optimized computation by resource utilization for convolutional computations.
| The first row of the output | The second row of the output | The third row of the output | |||||||
|---|---|---|---|---|---|---|---|---|---|
| Clock cycles | Neuron #1 | Neuron #2 | Neuron #3 | Neuron#4 | Neuron#5 | Neuron#6 | Neuron#7 | Neuron#8 | Neuron#9 |
| 1 | X1 × W1 | X2 × W1 | X3 × W1 | X16 × W7 | X17 × W7 | X18 × W7 | X21 × W7 | X22 × W7 | X23 × W7 |
| 2 | X2 × W2 | X3 × W2 | X4 × W2 | X17 × W8 | X18 × W8 | X19 × W8 | X22 × W8 | X23 × W8 | X24 × W8 |
| 3 | X3 × W3 | X4 × W3 | X5 × W3 | X18 × W9 | X19 × W9 | X20 × W9 | X23 × W9 | X24 × W9 | X25 × W9 |
| 4 | X6 × W4 | X7 × W4 | X8 × W4 | X6 × W1 | X7 × W1 | X8 × W1 | X16 × W4 | X17 × W4 | X18 × W4 |
| 5 | X7 × W5 | X8 × W5 | X9 × W5 | X7 × W2 | X8 × W2 | X9 × W2 | X17 × W5 | X18 × W5 | X19 × W5 |
| 6 | X8 × W6 | X9 × W6 | X10 × W6 | X8 × W3 | X9 × W3 | X10 × W3 | X18 × W6 | X19 × W6 | X20 × W6 |
| 7 | X11 × W7 | X12 × W7 | X13 × W7 | X11 × W4 | X12 × W4 | X13 × W4 | X11 × W1 | X12 × W1 | X13 × W1 |
| 8 | X12 × W8 | X13 × W8 | X14 × W8 | X12 × W5 | X13 × W5 | X14 × W5 | X12 × W2 | X13 × W2 | X14 × W2 |
| 9 | X13 × W9 | X14 × W9 | X15 × W9 | X13 × W6 | X14 × W6 | X15 × W6 | X13 × W3 | X14 × W3 | X15 × W3 |
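The schedule above can be read as a 3×3 kernel (W1–W9) sliding over a 5×5 input map (X1–X25, row-major): nine neurons (PEs) each perform one multiply-accumulate per clock cycle, so a 3×3 output tile completes in nine cycles. A minimal sketch with illustrative values; the table additionally skews which kernel tap each PE consumes in a given cycle to maximize input reuse, which does not change the accumulated totals:

```python
# Illustrative 5x5 input X1..X25 (row-major) and 3x3 kernel W1..W9.
X = [[5 * r + c + 1 for c in range(5)] for r in range(5)]  # X1..X25
W = [[3 * r + c + 1 for c in range(3)] for r in range(3)]  # W1..W9

def conv3x3(X, W):
    # Reference 3x3 "valid" convolution, stride 1 (no kernel flip).
    return [[sum(X[r + i][c + j] * W[i][j]
                 for i in range(3) for j in range(3))
             for c in range(3)] for r in range(3)]

def scheduled_conv(X, W):
    # Emulate the dataflow: one MAC per PE per clock cycle,
    # 9 cycles in total for a 3x3 output tile.
    acc = [0] * 9                      # one accumulator per neuron/PE
    for cycle in range(9):             # 9 clock cycles
        kr, kc = divmod(cycle, 3)      # kernel tap consumed this cycle
        for n in range(9):             # 9 PEs, one per output pixel
            orow, ocol = divmod(n, 3)
            acc[n] += X[orow + kr][ocol + kc] * W[kr][kc]
    return [acc[i * 3:(i + 1) * 3] for i in range(3)]

assert scheduled_conv(X, W) == conv3x3(X, W)
```

The same nine partial products are formed either way; the table's ordering only changes which operand pair each PE sees per cycle so that input pixels can be broadcast across PEs.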
Performance metrics of fine-tuned second-stage pre-trained models for COVID-19 detection.
| Models | Technique | Acc | S | SP | P | F1 | MCC | K | DOR | AUC |
|---|---|---|---|---|---|---|---|---|---|---|
| IRCNN | Baseline | 0.847 | 0.833 | 0.867 | 0.842 | 0.847 | 0.694 | 0.694 | 30.79 | 0.928 (0.886, 0.970) |
| | Fine-tuned | 0.854 | 0.902 | 0.888 | 0.884 | 0.865 | 0.736 | 0.736 | 44.4 | 0.917 (0.872, 0.962) |
| MobileNet | Baseline | 0.854 | 0.902 | 0.875 | 0.839 | 0.855 | 0.698 | 0.666 | 35.31 | 0.932 (0.891, 0.95 |
| | Fine-tuned | 0.875 | 0.902 | 0.819 | 0.833 | 0.866 | 0.724 | 0.7222 | 42.17 | 0.904 (0.856, 0.952) |
| FitNet | Baseline | 0.868 | 0.847 | 0.888 | 0.847 | 0.844 | 0.736 | 0.736 | 44.4 | 0.921 (0.877, 0.965) |
| | Fine-tuned | 0.875 | 0.902 | 0.902 | 0.895 | 0.863 | 0.737 | 0.736 | 46.47 | 0.930 (0.888, 0.971) |
| ResNet-18 | Baseline | 0.833 | 0.916 | 0.847 | 0.884 | 0.865 | 0.714 | 0.708 | 41.83 | 0.930 (0.888, 0.971) |
| | Fine-tuned | 0.895 | 0.861 | 0.902 | 0.897 | 0.878 | 0.751 | 0.752 | 51.54 | 0.981 (0.864, 0.957) |
| EfficientNet | Baseline | 0.847 | 0.847 | 0.791 | 0.814 | 0.862 | 0.673 | 0.694 | 30.06 | 0.915 (0.868, 0.960) |
| | Fine-tuned | 0.868 | 0.847 | 0.930 | 0.930 | 0.892 | 0.793 | 0.791 | 83.2 | 0.947 (0.913, 0.985) |
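For reference, every metric in the table derives from a 2×2 confusion matrix. A minimal sketch with illustrative counts (not the paper's data):

```python
import math

def binary_metrics(tp, fp, tn, fn):
    # Metrics used in the table, from confusion-matrix counts.
    n    = tp + fp + tn + fn
    acc  = (tp + tn) / n
    sens = tp / (tp + fn)                       # S (sensitivity / recall)
    spec = tn / (tn + fp)                       # SP (specificity)
    prec = tp / (tp + fp)                       # P (precision)
    f1   = 2 * prec * sens / (prec + sens)
    mcc  = (tp * tn - fp * fn) / math.sqrt(
        (tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # Cohen's kappa (K): observed vs. chance agreement.
    pe = ((tp + fp) * (tp + fn) + (tn + fn) * (tn + fp)) / n ** 2
    kappa = (acc - pe) / (1 - pe)
    dor = (tp / fn) / (fp / tn)                 # diagnostic odds ratio
    return dict(acc=acc, sens=sens, spec=spec, prec=prec,
                f1=f1, mcc=mcc, kappa=kappa, dor=dor)

# Illustrative counts only, not taken from the paper's test set.
m = binary_metrics(tp=60, fp=8, tn=64, fn=12)
```

AUC is the exception: it is computed from the ranking of predicted scores rather than from a single thresholded confusion matrix, which is why it is reported with a confidence interval.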
Initial stage of proposed dataflow for convolutional computations.
| The first row of the output | The second row of the output | The third row of the output | |||||||
|---|---|---|---|---|---|---|---|---|---|
| Clock cycle | Neuron #1 | Neuron #2 | Neuron #3 | Neuron#4 | Neuron#5 | Neuron#6 | Neuron#7 | Neuron#8 | Neuron#9 |
| 1 | X1 × W1 | X1 × 0 | X1 × 0 | X16 × W7 | X16 × 0 | X16 × 0 | X21 × W7 | X21 × 0 | X21 × 0 |
| 2 | X2 × W2 | X2 × W1 | X2 × 0 | X17 × W8 | X17 × W7 | X17 × 0 | X22 × W8 | X22 × W7 | X22 × 0 |
| 3 | X3 × W3 | X3 × W2 | X3 × W1 | X18 × W9 | X18 × W8 | X18 × W7 | X23 × W9 | X23 × W8 | X23 × W7 |
| 4 | X4 × 0 | X4 × W3 | X4 × W2 | X19 × 0 | X19 × W9 | X19 × W8 | X24 × 0 | X24 × W9 | X24 × W8 |
| 5 | X5 × 0 | X5 × 0 | X5 × W3 | X20 × 0 | X20 × 0 | X20 × W9 | X25 × 0 | X25 × 0 | X25 × W9 |
| 6 | X6 × W4 | X6 × 0 | X6 × 0 | X6 × W1 | X6 × 0 | X6 × 0 | X16 × W4 | X16 × 0 | X16 × 0 |
| 7 | X7 × W5 | X7 × W4 | X7 × 0 | X7 × W2 | X7 × W1 | X7 × 0 | X17 × W5 | X17 × W4 | X17 × 0 |
| 8 | X8 × W6 | X8 × W5 | X8 × W4 | X8 × W3 | X8 × W2 | X8 × W1 | X18 × W6 | X18 × W5 | X18 × W4 |
| 9 | X9 × 0 | X9 × W6 | X9 × W5 | X9 × 0 | X9 × W3 | X9 × W2 | X19 × 0 | X19 × W6 | X19 × W5 |
| 10 | X10 × 0 | X10 × 0 | X10 × W6 | X10 × 0 | X10 × 0 | X10 × W3 | X20 × 0 | X20 × 0 | X20 × W6 |
| 11 | X11 × W7 | X11 × 0 | X11 × 0 | X11 × W4 | X11 × 0 | X11 × 0 | X11 × W1 | X11 × 0 | X11 × 0 |
| 12 | X12 × W8 | X12 × W7 | X12 × 0 | X12 × W5 | X12 × W4 | X12 × 0 | X12 × W2 | X12 × W1 | X12 × 0 |
| 13 | X13 × W9 | X13 × W8 | X13 × W7 | X13 × W6 | X13 × W5 | X13 × W4 | X13 × W3 | X13 × W2 | X13 × W1 |
| 14 | X14 × 0 | X14 × W9 | X14 × W8 | X14 × 0 | X14 × W6 | X14 × W5 | X14 × 0 | X14 × W3 | X14 × W2 |
| 15 | X15 × 0 | X15 × 0 | X15 × W9 | X15 × 0 | X15 × 0 | X15 × W6 | X15 × 0 | X15 × 0 | X15 × W3 |
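Comparing the two dataflow tables yields the latency figure quoted in the abstract: the initial schedule spends 15 cycles per output row (zero products included), while the optimized schedule issues only the 9 non-zero taps:

```python
# Cycle counts read off the two dataflow tables: the initial schedule
# takes 15 cycles per output row (including multiplications by zero),
# the optimized resource-reuse schedule takes 9.
naive_cycles = 15
optimized_cycles = 9
reduction = 1 - optimized_cycles / naive_cycles
assert abs(reduction - 0.40) < 1e-9   # the 40% reduction in clock cycles
```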
Fig. 5. Overall architecture for the proposed CNN computations.
Fig. 6. Proposed processing element architecture.
Optimization of hyper-parameters.
| Hyperparameter | Individual CNN models (ResNet, FitNet, IRCNN, EfficientNet, MobileNet) | Ensemble models (Majority Voting, Simple Averaging, Weighted Averaging) | Data augmentation |
|---|---|---|---|
| Optimizer | ADAM | ADAM | Random reflection along both axes |
| Batch Size | 10 | 10 | |
| Max Epoch | 200 | 100 | |
| Global Learning Rate | 4 | 4 | |
| Dropout rate | 0.5 | 0.8 | |
| Validation Frequency | 68 | 68 | |
| Learn Rate Factor | 10 | 10 | |
| Classification layer weight vector | [0.75 0.15 1.18] | [0.75 0.15 1.18] | |
Top-1, top-2, and top-4 fine-tuned ensemble model performance for COVID-19 identification.
| Ensemble method | Top-N models | Accuracy | Sensitivity | Specificity | Precision | F1 | MCC | Kappa | DOR | AUC |
|---|---|---|---|---|---|---|---|---|---|---|
| Majority voting | 1 | 0.932 | 0.961 | 0.926 | 0.945 | 0.958 | 0.938 | 0.955 | 102.22 | 0.949 (0.962, 0.956) |
| | 2 | 0.941 | 0.9612 | 0.945 | 0.9586 | 0.979 | 0.764 | 0.763 | 57.63 | 0.961 (0.829, 0.934) |
| | 4 | 0.948 | 0.955 | 0.932 | 0.94 | 0.967 | 0.928 | 0.977 | 65.02 | 0.958 (0.837, 0.940) |
| Simple averaging | 1 | 0.955 | 0.968 | 0.972 | 0.951 | 0.945 | 0.937 | 0.971 | 74.32 | 0.938 (0.972, 0.984) |
| | 2 | 0.931 | 0.961 | 0.952 | 0.948 | 0.979 | 0.964 | 0.963 | 57.63 | 0.946 (0.967, 0.9831) |
| | 4 | 0.961 | 0.975 | 0.968 | 0.977 | 0.981 | 0.964 | 0.963 | 56.01 | 0.955 (0.908, 0.982) |
| Weighted averaging | 1 | 0.999 | 0.992 | 0.984 | 0.989 | 0.989 | 0.989 | 0.989 | 105.6 | 0.987 (0.981, 0.984) |
| | 2 | 0.972 | 0.975 | 0.980 | 0.976 | 0.9 | 0.906 | 0.985 | 93.87 | 0.949 (0.953, 0.985) |
| | 4 | 0.988 | 0.988 | 0.988 | 0.988 | 0.988 | 0.977 | 0.977 | 64.02 | 0.945 (0.958, 0.982) |
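The three ensembling rules compared above can be sketched over per-model class probabilities for (Normal, Pneumonia, COVID-19). The probabilities and weights below are illustrative, not the paper's:

```python
from collections import Counter

# Illustrative softmax outputs of three member models over the classes
# (Normal, Pneumonia, COVID-19); not the paper's actual predictions.
probs = [
    [0.10, 0.20, 0.70],   # model 1
    [0.30, 0.40, 0.30],   # model 2
    [0.05, 0.15, 0.80],   # model 3
]
weights = [0.2, 0.3, 0.5]  # e.g. proportional to validation accuracy

def majority_vote(probs):
    # Each model casts one vote for its argmax class.
    votes = [max(range(3), key=p.__getitem__) for p in probs]
    return Counter(votes).most_common(1)[0][0]

def simple_average(probs):
    # Unweighted mean of the probability vectors, then argmax.
    avg = [sum(p[c] for p in probs) / len(probs) for c in range(3)]
    return max(range(3), key=avg.__getitem__)

def weighted_average(probs, w):
    # Per-model weights scale each probability vector before summing.
    avg = [sum(wi * p[c] for wi, p in zip(w, probs)) for c in range(3)]
    return max(range(3), key=avg.__getitem__)
```

Weighted averaging generalizes simple averaging (uniform weights) and, unlike majority voting, uses the full probability vectors rather than only the argmax of each model, which is consistent with it being the strongest rule in the table.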
Ensemble model classification results on chest x-rays.
| Dataset | Metric | Normal | Pneumonia | COVID-19 |
|---|---|---|---|---|
| Balanced dataset | Precision | 0.982 | 0.976 | 0.984 |
| | Recall | 0.977 | 0.988 | 0.975 |
| | F1 | 0.965 | 0.972 | 0.985 |
| Imbalanced dataset | Precision | 0.906 | 0.864 | 0.877 |
| | Recall | 0.897 | 0.853 | 0.881 |
| | F1 | 0.902 | 0.858 | 0.879 |
Comparison of proposed model with state-of-the-art deep learning models.
| Models | Accuracy (%) | Precision (%) | Recall (%) | F1 (%) |
|---|---|---|---|---|
| AlexNet | 76 | 75 | 60 | 67 |
| ResNet-50 | 88 | 71 | 86 | 77 |
| VGG-2 | 89 | 88 | 87 | 87 |
| LeNet-5 | 88 | 88 | 87 | 86 |
| VGG-1 | 84 | 83 | 83 | 83 |
| VGG-3 | 81 | 80 | 81 | 80 |
| Inception V3 | 94 | 91 | 91 | 95 |
| IRCNN | 85 | 87 | 84 | 87 |
| MobileNet | 87 | 89 | 88 | 87 |
| FitNet | 87 | 88 | 90 | 86 |
| ResNet-18 | 89 | 89 | 93 | 84 |
| Deep-LSTM | 94 | 97.59 | 95 | 96.78 |
| CSEN | 95 | 90 | 93 | 89 |
| RVM-L | 96 | 95 | 96 | 96 |
| SSDMNV2 [31] | 92.64 | 93 | 93 | 93 |
| Proposed Model | 99 | 98 | 98 | 98 |
Fine-tuned ensemble model on mean squared error (MSE) cross-validation replicates.
| Model | R² | RSS | d.f. | F-value | p-value |
|---|---|---|---|---|---|
| M_MSE1 | 0.3535 | 7.9759 | 2, 147 | 41.73 | 4.442×10⁻¹⁵ |
| M_MSE2 | 0.1904 | 9.8519 | 4, 145 | 9.759 | 5.107×10⁻⁷ |
| M_MSE3 | 0.5564 | 5.3238 | 6, 143 | 32.14 | 2.2×10⁻⁶ |
| M_MSE4 | 0.9931 | 0.0778 | 14, 135 | 1540 | 2.2×10⁻⁶ |
*R² = percentage of variation in the response variable explained; *RSS = residual sum of squares; *d.f. = degrees of freedom. The table shows the per-model ANOVA on mean squared error (MSE) cross-validation replicates. The first model, M_MSE1 (majority-voting model), shows highly significant evidence that the variance in MSE is influenced by the choice of learning approach. The second model, M_MSE2 (simple-averaging model), shows a significant contribution to the variance in MSE from the choice of attribute-mapping approach. The third model, M_MSE3 (weighted-averaging model), which contains both learning techniques and mapping approaches but no interactions between them, fits significantly better than either M_MSE1 or M_MSE2, which contain only learning approaches or only mapping methods. Finally, M_MSE4, which adds interaction terms between techniques and methods, fits marginally but significantly better than M_MSE3.
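The model comparisons described in the footnote follow the standard nested-model F-test on residual sums of squares. A sketch using the RSS and residual degrees of freedom from the M_MSE1 and M_MSE3 rows for illustration (the table's own F-values test individual effects, so they differ from this statistic):

```python
def nested_f(rss_r, df_r, rss_f, df_f):
    # F-test comparing a restricted model (rss_r, df_r residual d.f.)
    # against a fuller nested model (rss_f, df_f residual d.f.):
    #   F = ((RSS_r - RSS_f) / (df_r - df_f)) / (RSS_f / df_f)
    return ((rss_r - rss_f) / (df_r - df_f)) / (rss_f / df_f)

# Illustrative comparison: M_MSE1 (RSS = 7.9759, 147 residual d.f.)
# vs. the fuller M_MSE3 (RSS = 5.3238, 143 residual d.f.).
F = nested_f(rss_r=7.9759, df_r=147, rss_f=5.3238, df_f=143)
```

A large F here means the extra terms in the fuller model explain a substantial share of the residual variance, which is the sense in which M_MSE3 "fits significantly better" than the single-factor models.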
Fig. 7. Latency dependence on N and p.
Dependence on N for latency and memory accesses.
| Value of N | Processing Latency (ms) | MA_filters (MB) | MA_input pixels (MB) |
|---|---|---|---|
| 7 | 16,334.6 | 4384.7 | 13,154 |
| 14 | 14,615.8 | 2192.2 | 11,692.6 |
| 28 | 13,784.7 | 1096.1 | 11,027.8 |
| 56 | 13,507.1 | 548.2 | 10,805.8 |
| 112 | 13,436.9 | 273.9 | 10,749.5 |
| 224 | 13,422.6 | 138.8 | 10,738 |
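A quick consistency check on the table: filter memory accesses fall roughly as 1/N (doubling N halves MA_filters), while processing latency saturates toward about 13.4 s. Values below are copied from the table:

```python
# Values of N, filter memory accesses (MB), and latency (ms) from the table.
N = [7, 14, 28, 56, 112, 224]
ma_filt = [4384.7, 2192.2, 1096.1, 548.2, 273.9, 138.8]
latency = [16334.6, 14615.8, 13784.7, 13507.1, 13436.9, 13422.6]

# Each doubling of N should roughly halve MA_filters (MA ~ 1/N).
for (n0, m0), (n1, m1) in zip(zip(N, ma_filt), zip(N[1:], ma_filt[1:])):
    assert 0.9 < (m0 / m1) / (n1 / n0) < 1.1   # ratio near 1

# Latency improves with N but with diminishing returns.
assert all(a > b for a, b in zip(latency, latency[1:]))
```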
Computation time of the ensemble models.
| Ensemble Method | Top-N models | Computation Time (in Seconds) |
|---|---|---|
| Majority Voting | 1 | 5.64 |
| 2 | 8.4 | |
| 4 | 9.7 | |
| Simple Averaging | 1 | 4.54 |
| 2 | 4.94 | |
| 4 | 6.75 | |
| Weighted Averaging | 1 | 3.72 |
| 2 | 5.62 | |
| 4 | 8.78 |
Hardware resources required for CNN architecture implementation.
| Sl. No | Parameters | Existing Architecture (AlexNet) | Existing Architecture (VGG-16) | Proposed Architecture (VGG-16) |
|---|---|---|---|---|
| 1 | Technology | 45 nm | 45 nm | 45 nm |
| 2 | Gate Count (NAND-2) | 1852k | 565k | 485k |
| 3 | #MAC | 168 | 192 | 178 |
| 4 | Supply Voltage (V) | 1 | 1 | 1 |
| 5 | Power (mW) | 278 | 236 | 196 |
| 6 | Total Latency (ms) | 115.3 | 4309.5 | 2678.3 |
| 7 | Throughput (fps) | 34.7 | 26.8 | 43.2 |
| 8 | No. of clock cycles required | 25 | 15 | 9 |
| 9 | Performance (GOPS) | 46.1 | 21.4 | 70.3 |
| 10 | Performance Efficiency | 55% | 26% | 93% |
| Abbreviations | Descriptions |
|---|---|
| AI | Artificial Intelligence |
| AUC | Area Under Curve |
| BCNN | Bayesian Convolutional Neural Networks |
| CNN | Convolutional Neural Network |
| CSEN | Convolution Support Estimation Network |
| CRM | Class-specific Relevance Mapping |
| CT | Computed Tomography |
| CXRs | Chest X-rays |
| CADx | Computer-Aided Diagnostic devices |
| DL | Deep Learning |
| DM | Diabetes Mellitus |
| DOR | Diagnostic Odds Ratio |
| FITNET | Function Fitting Neural Network |
| FC | Fully Connected Network |
| HDs | Heart Disorders |
| HTN | Hypertension |
| HCLS | Hypercholesterolemia |
| IRCNN | Inception Recurrent Convolutional Neural Network |
| IoT | Internet of Things |
| K-NN | K-Nearest Neighbor |
| LSTM | Long Short-Term Memory |
| LRP | Layer-wise Relevance Propagation |
| LFSR | Linear Feedback Shift Register |
| MCR | Misclassification Rates |
| MCC | Matthews Correlation Coefficient |
| PE | Processing Element |
| ROI | Region-Of-Interest |
| RT-PCR | Reverse Transcription-Polymerase Chain Reaction |
| RCL | Recurrent Convolution Layer |
| ReLU | Rectified Linear Unit |
| SVM | Support Vector Machine |
| SLSQP | Sequential Least-Squares Programming Method |
| TESM | Truth Estimate from Self Distances Method |
| TESD | Truth Estimate from Self Distances |
| UF | Utilization Factor |