Sunil Kumar Prabhakar, Harikumar Rajaguru, Kwangsub So, Dong-Ok Won.
Abstract
To classify texts accurately, many machine learning techniques have been utilized in the field of Natural Language Processing (NLP). For many pattern classification applications, greater success has been obtained with deep learning models than with ordinary machine learning techniques. The success of such deep learning techniques depends on understanding the complex models and their respective relationships within the data. However, identifying suitable deep learning methods, techniques, and architectures for text classification remains a huge challenge for researchers. In this work, a Contiguous Convolutional Neural Network (CCNN) based on Differential Evolution (DE) is first proposed and named the Evolutionary Contiguous Convolutional Neural Network (ECCNN), where the data instances of the input point are considered along with the contiguous data points in the dataset so that a deeper understanding is provided for the classification of the respective input, thereby boosting the performance of the deep learning model. Second, a swarm-based Deep Neural Network (DNN) utilizing Particle Swarm Optimization (PSO) is proposed for the classification of text and named Swarm DNN. Both models are validated on two datasets. The Swarm DNN model produced a high classification accuracy of 97.32% when tested on the BBC newsgroup text dataset and 87.99% when tested on the 20 newsgroup text dataset. Similarly, the ECCNN model produced a high classification accuracy of 97.11% when tested on the BBC newsgroup text dataset and 88.76% when tested on the 20 newsgroup text dataset.
Keywords: convolutional neural network; differential evolution; particle swarm optimization; deep neural network; natural language processing
Year: 2022 PMID: 35847966 PMCID: PMC9276958 DOI: 10.3389/fncom.2022.900885
Source DB: PubMed Journal: Front Comput Neurosci ISSN: 1662-5188 Impact factor: 3.387
Figure 1. Overview framework of the CCNN model.
Figure 2. Framework of the proposed ECCNN.
Parameter values of the proposed ECCNN architecture.

| Parameter | Value/range |
|---|---|
| Convolution layer: | |
| Size of the filters | 2 to 10 |
| Number of filters per convolution filter size (NFCS) | 50, 100, 150, 200, 250, 300, 350, 400, 450, 500 |
| Fully connected layer: | |
| Dropout rate | 0.2 to 0.8 |
| Initialization mode | Uniform, normal, LeCun uniform, He uniform |
| Neuron number | 50, 100, 200, 250, 300, 350, 400 |
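As a rough illustration, the search space in the table above can be encoded as a dictionary and sampled per candidate. This is a minimal sketch; the key names, the uniform sampling, and the assumption that NFCS runs in steps of 50 are illustrative, not the authors' DE encoding:

```python
import random

# Hypothetical encoding of the ECCNN search space from the table above.
SEARCH_SPACE = {
    "filter_size": list(range(2, 11)),                  # 2 to 10
    "filters_per_size": list(range(50, 501, 50)),       # NFCS: 50, 100, ..., 500
    "dropout_rate": [round(0.2 + 0.1 * i, 1) for i in range(7)],  # 0.2 to 0.8
    "init_mode": ["uniform", "normal", "lecun_uniform", "he_uniform"],
    "neurons": [50, 100, 200, 250, 300, 350, 400],
}

def sample_config(rng=random):
    """Draw one architecture configuration uniformly from the search space."""
    return {name: rng.choice(values) for name, values in SEARCH_SPACE.items()}
```

An evolutionary search would initialize its population with such samples and evolve them toward higher validation accuracy.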
Architecture specification for the two datasets using the ECCNN architecture.

| Dataset | Filter sizes | NFCS | Neuron number | Initialization mode | Dropout rate |
|---|---|---|---|---|---|
| 20 newsgroup | [2, 3, 5] | 300 | 100 | Normal | 0.5 |
| BBC newsgroup | [2, 3, 5] | 300 | 100 | Normal | 0.4 |
Figure 3. A standard structure of a DNN.
PSO parameter values/ranges.

| Parameter | Value/range |
|---|---|
| – | 0 to 4 |
| – | 0 to 4 |
| – | [5, 75] |
| – | 0 |
| – | 1 |
| W | [0.5, 0.8] |
| – | [20, 120] |
| ε | [0.1, 0.00001] |
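The ranges above fit the canonical PSO update rule. The sketch below assumes the 0-to-4 rows are the acceleration coefficients c1 and c2 and that W is the inertia weight, which is the standard formulation; it is not necessarily the paper's exact variant, and the particles here are one-dimensional for brevity:

```python
import random

def pso_step(positions, velocities, pbest, gbest, w=0.65, c1=2.0, c2=2.0):
    """One canonical PSO update:
    v <- w*v + c1*r1*(pbest - x) + c2*r2*(gbest - x),  x <- x + v.

    w is drawn from the inertia range [0.5, 0.8]; c1 and c2 are assumed to be
    the acceleration coefficients from the 0-to-4 rows of the table above.
    """
    new_pos, new_vel = [], []
    for x, v, pb in zip(positions, velocities, pbest):
        r1, r2 = random.random(), random.random()
        v_next = w * v + c1 * r1 * (pb - x) + c2 * r2 * (gbest - x)
        new_vel.append(v_next)
        new_pos.append(x + v_next)
    return new_pos, new_vel
```

In a full run, this step repeats until the swarm's best fitness improves by less than the tolerance ε or the iteration budget is exhausted.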
DNN hyperparameters and their domains.

| Hyperparameter | Domain | Type |
|---|---|---|
| Momentum | [0.1, 0.9] | Continuous |
| Learning rate | [0.1, 0.9] | Continuous |
| Drop rate | [0.1, 0.9] | Continuous |
| Decay | [0.0001, 0.01] | Continuous |
| Number of hidden layers | [1, 10] | Discrete with step = 1 |
| Number of neurons in hidden layer | [1, 300] | Discrete with step = 1 |
| Number of epochs | [5, 25] | Discrete with step = 5 |
| Batch size | [100, 1000] | Discrete with step = 100 |
| Layer type | [1, 2] | Discrete with step = 1 |
| Optimizer | [1, 6] | Discrete with step = 1 |
| Initialization function | [1, 8] | Discrete with step = 1 |
| Activation function | [1, 8] | Discrete with step = 1 |
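Since PSO moves particles through a continuous space, the mixed domain above has to be decoded into valid hyperparameters. A minimal sketch of one common approach, clipping each dimension to its domain and snapping discrete dimensions to their step; the names and the decoding scheme are assumptions, not the paper's implementation:

```python
# Hypothetical decoding of a 12-dimensional PSO particle into the DNN
# hyperparameters tabulated above: (name, low, high, step); step=None means
# the dimension is continuous.
DOMAINS = [
    ("momentum",       0.1,    0.9,   None),
    ("learning_rate",  0.1,    0.9,   None),
    ("drop_rate",      0.1,    0.9,   None),
    ("decay",          0.0001, 0.01,  None),
    ("hidden_layers",  1,      10,    1),
    ("neurons",        1,      300,   1),
    ("epochs",         5,      25,    5),
    ("batch_size",     100,    1000,  100),
    ("layer_type",     1,      2,     1),
    ("optimizer",      1,      6,     1),
    ("init_function",  1,      8,     1),
    ("activation",     1,      8,     1),
]

def decode_particle(vector):
    """Map a raw particle position to a valid hyperparameter dict."""
    params = {}
    for value, (name, lo, hi, step) in zip(vector, DOMAINS):
        value = min(max(value, lo), hi)          # clip to the domain
        if step is not None:                     # snap discrete dims to grid
            value = lo + round((value - lo) / step) * step
        params[name] = value
    return params
```

Categorical dimensions (layer type, optimizer, initialization, activation) are treated here as integer indices into fixed lookup lists, matching the [1, k] domains in the table.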
Figure 4. Flow chart of the proposed Swarm DNN.
Results (classification accuracy, %) of the ECCNN architecture with the DE/best/1 and DE/best/2 strategies, F = 0.2, CR = 0.3.

| Dataset utilized | DE/best/1, pop. 60 | DE/best/1, pop. 90 | DE/best/2, pop. 60 | DE/best/2, pop. 90 |
|---|---|---|---|---|
| 20 newsgroup | 72.34 | 78.91 | 73.24 | 79.22 |
| BBC newsgroup | 87.63 | 82.35 | 81.56 | 89.12 |
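The strategy labels in these tables refer to the standard DE mutation schemes, with F the mutation scale factor and CR the crossover rate. A minimal sketch over real-valued vectors, assuming the textbook formulation rather than the paper's exact implementation:

```python
import random

def de_mutation(population, best, strategy="best/1", F=0.4):
    """Build a mutant vector for the two strategies swept in the tables:

    DE/best/1: v = best + F * (x_r1 - x_r2)
    DE/best/2: v = best + F * (x_r1 - x_r2) + F * (x_r3 - x_r4)
    """
    n = 2 if strategy == "best/1" else 4
    r = random.sample(population, n)             # distinct random individuals
    v = [b + F * (a - c) for b, a, c in zip(best, r[0], r[1])]
    if strategy != "best/1":
        v = [vi + F * (a - c) for vi, a, c in zip(v, r[2], r[3])]
    return v

def binomial_crossover(target, mutant, CR=0.5):
    """Standard binomial crossover with crossover rate CR."""
    j_rand = random.randrange(len(target))       # guarantee one mutant gene
    return [m if (random.random() < CR or j == j_rand) else t
            for j, (t, m) in enumerate(zip(target, mutant))]
```

The trial vector from the crossover replaces the target only if it scores a better fitness, which here would be validation accuracy of the decoded CNN architecture.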
Results (classification accuracy, %) of the ECCNN architecture with the DE/best/1 and DE/best/2 strategies, F = 0.4, CR = 0.5.

| Dataset utilized | DE/best/1, pop. 60 | DE/best/1, pop. 90 | DE/best/2, pop. 60 | DE/best/2, pop. 90 |
|---|---|---|---|---|
| 20 newsgroup | 81.62 | 84.45 | 85.56 | 88.76 |
| BBC newsgroup | 92.35 | 92.49 | 96.12 | 97.11 |
Results (classification accuracy, %) of the ECCNN architecture with the DE/best/1 and DE/best/2 strategies, F = 0.6, CR = 0.8.

| Dataset utilized | DE/best/1, pop. 60 | DE/best/1, pop. 90 | DE/best/2, pop. 60 | DE/best/2, pop. 90 |
|---|---|---|---|---|
| 20 newsgroup | 79.12 | 77.45 | 80.34 | 78.35 |
| BBC newsgroup | 89.23 | 88.35 | 89.24 | 91.37 |
Results (classification accuracy, %) of the ECCNN architecture with the DE/best/1 and DE/best/2 strategies, F = 0.8, CR = 0.2.

| Dataset utilized | DE/best/1, pop. 60 | DE/best/1, pop. 90 | DE/best/2, pop. 60 | DE/best/2, pop. 90 |
|---|---|---|---|---|
| 20 newsgroup | 78.35 | 79.35 | 84.76 | 82.34 |
| BBC newsgroup | 88.45 | 89.46 | 85.56 | 85.13 |
Results (classification accuracy, %) of the ECCNN architecture with the DE/best/1 and DE/best/2 strategies, F = 0.6, CR = 0.4.

| Dataset utilized | DE/best/1, pop. 60 | DE/best/1, pop. 90 | DE/best/2, pop. 60 | DE/best/2, pop. 90 |
|---|---|---|---|---|
| 20 newsgroup | 80.23 | 82.36 | 79.23 | 78.23 |
| BBC newsgroup | 91.04 | 94.25 | 89.94 | 92.46 |
Results (classification accuracy, %) of the ECCNN architecture with the DE/best/1 and DE/best/2 strategies, F = 0.5, CR = 0.3.

| Dataset utilized | DE/best/1, pop. 60 | DE/best/1, pop. 90 | DE/best/2, pop. 60 | DE/best/2, pop. 90 |
|---|---|---|---|---|
| 20 newsgroup | 79.45 | 80.21 | 78.25 | 81.24 |
| BBC newsgroup | 91.48 | 89.24 | 84.59 | 87.31 |
Results (classification accuracy, %) of the Swarm DNN architecture for the four parameter cases.

| Dataset utilized | Case 1 | Case 2 | Case 3 | Case 4 |
|---|---|---|---|---|
| 20 newsgroup | 84.45 | 82.21 | 87.99 | 85.24 |
| BBC newsgroup | 93.48 | 93.24 | 95.59 | 97.32 |
Consolidated result analysis of the proposed techniques in text classification.

| Dataset utilized | Technique | Accuracy (%) | – | – | – | – |
|---|---|---|---|---|---|---|
| 20 newsgroup | ECCNN | 88.76 | 82.45 | 81.35 | 81.23 | 79.45 |
| 20 newsgroup | Swarm DNN | 87.99 | 83.13 | 82.45 | 82.06 | 80.13 |
| BBC newsgroup | ECCNN | 97.11 | 91.36 | 90.43 | 89.13 | 88.31 |
| BBC newsgroup | Swarm DNN | 97.32 | 90.18 | 89.48 | 88.12 | 87.06 |