Weijun Hu, Yan Zhang, Lijie Li.
Abstract
The rapid progress in research and development of multifunctional, distributed sensor networks has created challenges in processing data from large numbers of sensors. Using deep learning methods such as convolutional neural networks (CNN), it is possible to build smarter systems that forecast future situations and precisely classify large amounts of sensor data. Multi-sensor data from atmospheric pollutant measurements involving five criteria, with the underlying analytic model unknown, need to be categorized, as does a dataset of diabetic retinopathy (DR) fundus images. In this work, we created automatic classifiers based on deep convolutional neural networks with two models, a simpler feedforward model with dual modules and an Inception Resnet v2 model, together with various structural tweaks, for classifying the data from the two tasks. For segregating the multi-sensor data, we trained a deep CNN-based classifier on an image dataset extracted from the data by a novel image-generating method. We created two deepened and one reductive feedforward network for DR phase classification. The validation accuracies and visualization results show that increasing the depth of a deep CNN or the number of kernels in its convolutional layers will not indefinitely improve classification quality, and that a more sophisticated model does not necessarily achieve higher performance when training datasets are quantitatively limited, while increasing training-image resolution can yield higher classification accuracy for trained CNNs. The methodology aims to support the design of classification networks powering intelligent sensors.
Keywords: convolutional neural network; diabetic retinopathy; image processing; multi-sensor
Year: 2019 PMID: 31426516 PMCID: PMC6718995 DOI: 10.3390/s19163584
Source DB: PubMed Journal: Sensors (Basel) ISSN: 1424-8220 Impact factor: 3.576
Comparison of Deep Learning Methods.
| Methods | Characteristics |
|---|---|
| Deep feedforward convolutional neural network | Learns a hierarchy of features, from simple curves and edges to global motifs, from raw images. Sensitive to crucial minute details yet insensitive to large irrelevant variations in the image. |
| Group method of data handling | Self-organizing deep learning method for time-series forecasting problems that does not require big data. |
| Ada-CGFace framework | Uses an AdaBoost classifier in place of softmax. Contains dropout layers to avoid overfitting and is trained with adaptive moment estimation instead of stochastic gradient descent. |
| Deep CNN with dual modules (our method) | Has a certain level of scale invariance to the target object. The 1 × 1 convolutional kernels keep computational cost low (see the cost sketch after this table). |
| Inception Resnet v2 | Residual connections greatly improve training speed. Inception is computationally efficient and can abstract features at different scales simultaneously. |
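To make the cost remark in the table concrete, the sketch below counts multiply-accumulate operations (MACs) for the two paths of one dual module. The feature-map and kernel sizes are taken from "Normal dual-path modules 3" in the architecture table that follows; the helper function is ours, not from the paper.

```python
# Rough multiply-accumulate (MAC) count for a convolutional layer, to show
# why a 1 x 1 path is cheap next to a 3 x 3 path in a dual module.
def conv_macs(h_out, w_out, k, c_in, n_kernels):
    """MACs = output positions x kernel area x input channels x kernels."""
    return h_out * w_out * (k * k * c_in) * n_kernels

# Sizes from "Normal dual-path modules 3": 37 x 37 map, 160 input channels.
macs_1x1 = conv_macs(37, 37, 1, 160, 112)  # 1 x 1 x 160 Conv (112 kernels)
macs_3x3 = conv_macs(37, 37, 3, 160, 48)   # 3 x 3 x 160 Conv (48 kernels)
print(f"1x1 path: {macs_1x1:,} MACs")      # ~24.5 million
print(f"3x3 path: {macs_3x3:,} MACs")      # ~94.6 million
```

Even with more than twice as many kernels, the 1 × 1 path here costs roughly a quarter of the 3 × 3 path.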
The CNN architecture modified for training on higher-resolution images.
| Module Name | Kernel Size (Width × Height × Channel), Number and Stride | Output Size |
|---|---|---|
| Input Raw image | N/A | (224 × 224 × 3) |
| Simple Convolution | 9 × 9 × 3 Conv (96 stride 3) | (74 × 74 × 96) |
| Normal dual-path modules 1 | 1 × 1 × 96 Conv (32 stride 1), 3 × 3 × 96 Conv (32 stride 1) | (74 × 74 × 64) |
| Normal dual-path modules 2 | 1 × 1 × 64 Conv (32 stride 1), 3 × 3 × 64 Conv (48 stride 1) | (74 × 74 × 80) |
| Dual-path reduction module 1 | 3 × 3 × 80 Conv (80 stride 2), 3 × 3 Max pooling (1 stride 2) | (37 × 37 × 160) |
| Normal dual-path modules 3 | 1 × 1 × 160 Conv (112 stride 1), 3 × 3 × 160 Conv (48 stride 1) | (37 × 37 × 160) |
| Normal dual-path modules 4 | 1 × 1 × 160 Conv (96 stride 1), 3 × 3 × 160 Conv (64 stride 1) | (37 × 37 × 160) |
| Normal dual-path modules 5 | 1 × 1 × 160 Conv (80 stride 1), 3 × 3 × 160 Conv (80 stride 1) | (37 × 37 × 160) |
| Normal dual-path modules 6 | 1 × 1 × 160 Conv (48 stride 1), 3 × 3 × 160 Conv (96 stride 1) | (37 × 37 × 144) |
| Dual-path reduction module 2 | 3 × 3 × 144 Conv (96 stride 2), 3 × 3 Max pooling (1 stride 2) | (19 × 19 × 240) |
| Normal dual-path modules 7 | 1 × 1 × 240 Conv (176 stride 1), 3 × 3 × 240 Conv (160 stride 1) | (19 × 19 × 336) |
| Normal dual-path modules 8 | 1 × 1 × 336 Conv (176 stride 1), 3 × 3 × 336 Conv (160 stride 1) | (19 × 19 × 336) |
| Dual-path reduction module 3 | 3 × 3 × 336 Conv (96 stride 2), 3 × 3 Max pooling (1 stride 2) | (10 × 10 × 432) |
| Normal dual-path modules 9 | 1 × 1 × 432 Conv (176 stride 1), 3 × 3 × 432 Conv (160 stride 1) | (10 × 10 × 336) |
| Normal dual-path modules 10 | 1 × 1 × 336 Conv (176 stride 1), 3 × 3 × 336 Conv (160 stride 1) | (10 × 10 × 336) |
| Pooling layer | 10 × 10 Average pooling (1 stride 1) | (1 × 1 × 336) |
| Flatten | N/A | (336 × 1) |
| Fully connected layer | Hidden nodes (5) | (5 × 1) |
| Softmax layer | N/A | (5 × 1) |
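The table implies two building blocks: a normal dual-path module whose output channels are the sum of the two conv paths' kernel counts (e.g., 32 + 32 = 64 for module 1), and a reduction module that concatenates a stride-2 convolution with a stride-2 max pooling of its input (e.g., 80 conv kernels + 80 input channels = 160 for reduction module 1). Below is a minimal PyTorch sketch under that reading; the class names and padding choices are our assumptions, since padding is not listed in the table.

```python
import torch
import torch.nn as nn

class NormalDualModule(nn.Module):
    """Two parallel conv paths concatenated along the channel axis.
    Output channels = n1x1 + n3x3 (e.g., 32 + 32 = 64 for module 1)."""
    def __init__(self, c_in, n1x1, n3x3):
        super().__init__()
        self.path1 = nn.Conv2d(c_in, n1x1, kernel_size=1, stride=1)
        self.path2 = nn.Conv2d(c_in, n3x3, kernel_size=3, stride=1, padding=1)
        self.relu = nn.ReLU(inplace=True)
    def forward(self, x):
        return self.relu(torch.cat([self.path1(x), self.path2(x)], dim=1))

class ReductionDualModule(nn.Module):
    """Stride-2 conv path concatenated with a stride-2 max pool of the input.
    Output channels = n3x3 + c_in (e.g., 80 + 80 = 160 for reduction 1)."""
    def __init__(self, c_in, n3x3):
        super().__init__()
        self.conv = nn.Conv2d(c_in, n3x3, kernel_size=3, stride=2, padding=1)
        self.pool = nn.MaxPool2d(kernel_size=3, stride=2, padding=1)
        self.relu = nn.ReLU(inplace=True)
    def forward(self, x):
        return self.relu(torch.cat([self.conv(x), self.pool(x)], dim=1))

# Shape check against the first rows of the table (74 x 74 x 96 input):
x = torch.randn(1, 96, 74, 74)
x = NormalDualModule(96, 32, 32)(x)   # -> (1, 64, 74, 74)
x = NormalDualModule(64, 32, 48)(x)   # -> (1, 80, 74, 74)
x = ReductionDualModule(80, 80)(x)    # -> (1, 160, 37, 37)
print(x.shape)
```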
Figure 1 Illustration of the classification methodology and the deep CNN training process: the input image is analyzed by convolutional filters (blue squares), each of which applies to a small receptive field but is evaluated at every image position, always extending through the whole depth of the input. The resulting output feature maps (green squares) are analyzed by further convolutional layers or pooling filters at each of their surface positions until they are processed by a global average pooling layer followed by one fully connected layer. Each green square represents a feature map corresponding to the output for one learned feature. The convolutional neurons are activated using the ReLU function, while the final layer connects to a softmax function that converts the output into probabilities of the input image belonging to the different categories. For training, the true label is combined with the predicted classes to calculate the loss using the objective function, cross-entropy in our case. The losses are then backpropagated through the network to update the weights of the convolutional filters and the fully connected layer, with stochastic gradient descent used for the updates.
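Figure 1 describes a standard supervised pipeline: ReLU-activated convolutions, global average pooling, one fully connected layer, softmax with a cross-entropy objective, and stochastic gradient descent updates. A minimal PyTorch sketch of one training epoch under those choices; the learning rate and data loader are illustrative assumptions, not the paper's values.

```python
import torch
import torch.nn as nn

def train_epoch(model, loader, optimizer, criterion=nn.CrossEntropyLoss()):
    """One pass over the training set: forward, cross-entropy loss,
    backpropagation, weight update (the loop described in Figure 1)."""
    model.train()
    for images, labels in loader:         # images: (B, 3, 224, 224); labels: (B,)
        optimizer.zero_grad()
        logits = model(images)            # conv layers -> global avg pool -> FC
        loss = criterion(logits, labels)  # fused softmax + cross-entropy
        loss.backward()                   # backpropagate to all filter weights
        optimizer.step()                  # SGD update, as in the caption

# Usage (hyperparameters are assumptions):
# optimizer = torch.optim.SGD(model.parameters(), lr=0.01)
# for epoch in range(400):
#     train_epoch(model, train_loader, optimizer)
```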
Figure 2 A sample image from the training dataset used to train the CNN for distinguishing air pollutants. The grey scales of the numbered zones are determined by a sample's five parameters.
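The caption indicates that each multi-sensor sample is rendered as a grayscale image whose numbered zones encode the five measured parameters as grey levels. The exact zone geometry is not reproduced here, so the sketch below assumes five horizontal bands and min-max normalization; treat it as an illustration of the idea rather than the paper's actual generator.

```python
import numpy as np

def sample_to_image(params, p_min, p_max, size=224):
    """Render one five-parameter sensor sample as a grayscale image.
    Assumed layout: five horizontal bands, one per parameter; the
    paper's actual zone geometry may differ."""
    scaled = (np.asarray(params) - p_min) / (p_max - p_min)  # -> [0, 1]
    img = np.zeros((size, size), dtype=np.float32)
    band = size // 5
    for i, g in enumerate(scaled):
        img[i * band:(i + 1) * band, :] = g  # grey level encodes parameter i
    return img

# Example: one sample with five criteria, normalized against dataset ranges.
img = sample_to_image([0.3, 1.2, 0.7, 2.5, 0.1],
                      p_min=np.zeros(5), p_max=np.full(5, 3.0))
```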
Figure 3 Three-dimensional t-SNE visualization of the air-component experiment's validation set in activation-space representation. Classes 0 to 4 represent acetone, ethanol, chloroform, toluene, and methanol, respectively; class 5 represents a new, undefined ingredient apart from the five defined ingredient types.
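A 3-D t-SNE projection like those in Figures 3 and 5 can be produced from the network's activations with scikit-learn. A minimal sketch, assuming `features` holds one activation vector per validation image (e.g., the output of the global average pooling layer); the file names are hypothetical.

```python
import numpy as np
from sklearn.manifold import TSNE
import matplotlib.pyplot as plt

# features: (n_samples, d) activations from the layer before the classifier;
# labels: integer class per sample (0-4, plus 5 for the undefined ingredient).
features = np.load("val_activations.npy")  # hypothetical file names
labels = np.load("val_labels.npy")

emb = TSNE(n_components=3, perplexity=30, init="pca",
           random_state=0).fit_transform(features)

ax = plt.figure().add_subplot(projection="3d")
sc = ax.scatter(emb[:, 0], emb[:, 1], emb[:, 2], c=labels, cmap="tab10", s=8)
plt.colorbar(sc, ax=ax, label="class")
plt.show()
```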
DR fundus image distribution.
| Class | DR Classification | No. of Images | Percentage (%) | Imbalance Ratio |
|---|---|---|---|---|
| 0 | Normal | 25,810 | 73.48 | 1.01 |
| 1 | Mild NPDR | 2443 | 6.96 | 1.84 |
| 2 | Moderate NPDR | 5292 | 15.07 | 1.26 |
| 3 | Severe NPDR | 873 | 2.48 | 2.76 |
| 4 | Proliferative DR | 708 | 2.01 | 2.89 |
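The distribution is heavily skewed toward normal images. One generic way to account for this during training is inverse-frequency class weighting in the cross-entropy loss, sketched below from the table's counts; note that the table's "Imbalance Ratio" column uses the paper's own measure, which is not defined in this excerpt, so this is a standard illustration rather than the paper's method.

```python
import torch

# Image counts per DR class, taken from the table above.
counts = torch.tensor([25810., 2443., 5292., 873., 708.])

# Inverse-frequency weights: rarer classes (e.g., proliferative DR)
# contribute more to the loss per example.
weights = counts.sum() / (len(counts) * counts)
criterion = torch.nn.CrossEntropyLoss(weight=weights)
print(weights)  # class 0 ~0.27, class 4 ~9.9
```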
Figure 4 Two randomly picked fundus images from the training dataset that feature oppositely oriented retinas.
Figure 5 (a) Validation accuracies recorded every 20 epochs over 400 epochs. (b) t-SNE visualization of the validation set in activation-space representation by the trained feedforward CNN with dual modules. The arrangement of the class 0 to 4 circles in 3-dimensional space reflects the correct sequence of disease progression.
Figure 6 Reconstruction of disease progression in diabetic retinopathy by the CNN with a reductive first convolutional layer, trained on high-resolution images. The red arrow indicates the inferred disease-progression sequence among the dots representing classes 0 to 4.
The deepened CNN architecture with six more modules inserted.
| Module Name | Kernel size (Width × Height × Channel), Number and Stride | Output Size |
|---|---|---|
| Input Raw Image | N/A | (224 × 224 × 3) |
| Simple Convolution | 3 × 3 × 3 Conv (96 stride 1) | (224 × 224 × 96) |
| Normal dual-path modules 1 | 1 × 1 × 96 Conv (32 stride 1), 3 × 3 × 96 Conv (32 stride 1) | (224 × 224 × 64) |
| Normal dual-path modules 2 | 1 × 1 × 64 Conv (32 stride 1), 3 × 3 × 64 Conv (48 stride 1) | (224 × 224 × 80) |
| Dual-path reduction module 1 | 3 × 3 × 80 Conv (80 stride 2), 3 × 3 Max pooling (1 stride 2) | (112 × 112 × 160) |
| Normal dual-path modules 3 | 1 × 1 × 160 Conv (112 stride 1), 3 × 3 × 160 Conv (48 stride 1) | (112 × 112 × 160) |
| Normal dual-path modules 4 | 1 × 1 × 160 Conv (96 stride 1), 3 × 3 × 160 Conv (64 stride 1) | (112 × 112 × 160) |
| Normal dual-path modules 5 | 1 × 1 × 160 Conv (80 stride 1), 3 × 3 × 160 Conv (80 stride 1) | (112 × 112 × 160) |
| Normal dual-path modules 6 | 1 × 1 × 160 Conv (48 stride 1), 3 × 3 × 160 Conv (96 stride 1) | (112 × 112 × 144) |
| Dual-path reduction module 2 | 3 × 3 × 144 Conv (96 stride 2), 3 × 3 Max pooling (1 stride 2) | (56 × 56 × 240) |
| Normal dual-path modules 7 | 1 × 1 × 240 Conv (176 stride 1), 3 × 3 × 240 Conv (160 stride 1) | (56 × 56 × 336) |
| Normal dual-path modules 8 | 1 × 1 × 336 Conv (176 stride 1), 3 × 3 × 336 Conv (160 stride 1) | (56 × 56 × 336) |
| Dual-path reduction module 3 | 3 × 3 × 336 Conv (96 stride 2), 3 × 3 Max pooling (1 stride 2) | (28 × 28 × 432) |
| Normal dual-path modules 9 | 1 × 1 × 432 Conv (176 stride 1), 3 × 3 × 432 Conv (160 stride 1) | (28 × 28 × 336) |
| Normal dual-path modules 10 | 1 × 1 × 336 Conv (176 stride 1), 3 × 3 × 336 Conv (160 stride 1) | (28 × 28 × 336) |
| Dual-path reduction module 4 | 3 × 3 × 336 Conv (112 stride 2), 3 × 3 Max pooling (1 stride 2) | (14 × 14 × 448) |
| Normal dual-path modules 11 | 1 × 1 × 448 Conv (224 stride 1), 3 × 3 × 448 Conv (192 stride 1) | (14 × 14 × 416) |
| Normal dual-path modules 12 | 1 × 1 × 416 Conv (224 stride 1), 3 × 3 × 416 Conv (192 stride 1) | (14 × 14 × 416) |
| Dual-path reduction module 5 | 3 × 3 × 416 Conv (112 stride 2), 3 × 3 Max pooling (1 stride 2) | (7 × 7 × 528) |
| Normal dual-path modules 13 | 1 × 1 × 528 Conv (224 stride 1), 3 × 3 × 528 Conv (192 stride 1) | (7 × 7 × 416) |
| Normal dual-path modules 14 | 1 × 1 × 416 Conv (224 stride 1), 3 × 3 × 416 Conv (192 stride 1) | (7 × 7 × 416) |
| Pooling layer | 7 × 7 Average pooling (1 stride 1) | (1 × 1 × 416) |
| Flatten | N/A | (416 × 1) |
| Fully connected layer | Hidden nodes (5) | (5 × 1) |
| Softmax layer | N/A | (5 × 1) |
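The spatial sizes in these tables follow the standard convolution arithmetic out = floor((in + 2p - k) / s) + 1. Padding is not listed in the tables, so it is inferred here (padding 1 for the 3 × 3 layers); the helper below is a quick way to verify individual rows.

```python
import math

def conv_out(size_in, kernel, stride, padding):
    """Standard conv/pool output-size formula."""
    return math.floor((size_in + 2 * padding - kernel) / stride) + 1

# Reduction module on a 112 x 112 map (3 x 3, stride 2, assumed padding 1):
print(conv_out(112, 3, 2, 1))  # 56, matching "Dual-path reduction module 2"
# A stride-1 3 x 3 conv with assumed 'same' padding preserves 112 x 112:
print(conv_out(112, 3, 1, 1))  # 112
```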
Figure 7 Reconstruction of disease progression in diabetic retinopathy by the deepened CNN. (a) t-SNE visualization of the validation image dataset in activation-space representation, colored according to the corresponding disease phase. (b) Randomly picked representative fundus images from each phase.
New version of the six inserted dual modules following the last normal dual module of the original CNN architecture.
| Module Type | Kernel Size (Width × Height × Channel), Number and Stride | Output Size |
|---|---|---|
| Reduction Dual-path Module | 3 × 3 × 336 Conv (96 stride 2), 3 × 3 Max pooling (1 stride 2) | (14 × 14 × 432) |
| Normal Dual-path Module | 1 × 1 × 432 Conv (176 stride 1), 3 × 3 × 432 Conv (160 stride 1) | (14 × 14 × 336) |
| Normal Dual-path Module | 1 × 1 × 336 Conv (176 stride 1), 3 × 3 × 336 Conv (160 stride 1) | (14 × 14 × 336) |
| Reduction Dual-path Module | 3 × 3 × 336 Conv (96 stride 2), 3 × 3 Max pooling (1 stride 2) | (7 × 7 × 432) |
| Normal Dual-path Module | 1 × 1 × 432 Conv (176 stride 1), 3 × 3 × 432 Conv (160 stride 1) | (7 × 7 × 336) |
| Normal Dual-path Module | 1 × 1 × 336 Conv (176 stride 1), 3 × 3 × 336 Conv (160 stride 1) | (7 × 7 × 336) |
| Global Pooling Layer | 7 × 7 Average pooling | (1 × 1 × 336) |
Figure 8 Validation accuracies of five different CNNs: 'Org' denotes the original feedforward CNN; 'Ext2' the revised deepened CNN; 'Ext1' the deepened CNN; 'Rdu' the reductive CNN; and 'Inc' Inception Resnet v2.
Comparison of the highest validation accuracies of the different CNNs.
| CNN Architecture | Validation Accuracy (%) | Epoch of Highest Accuracy | Training Time (h per 100 epochs) |
|---|---|---|---|
| Original Architecture | 76.3068 | 95 | N/A |
| Reduction Architecture | 80.7823 | 100 | 6 |
| Deepened Architecture | 81.3068 | 93 | 23 |
| Revised Deepened Architecture | 81.9294 | 100 | 15 |
| Inception Resnet v2 | 78.708 | 17 | 42 |