Abstract
College students typically learn vocabulary under the supervision of both teachers and school administrators. Multi-modal discourse analysis theory suggests that analyzing English words through the synergy of different modalities improves students' motivation and the effectiveness of word learning. Several problems remain, however: a lack of visual-modality memory aids such as pictures, incomplete word meanings, little interaction between users, and no resource-expansion function. To this end, this paper proposes a stepped image semantic segmentation network structure based on multi-scale feature fusion and boundary optimization. To improve model accuracy, the network optimizes the atrous spatial pyramid pooling (ASPP) module of the Deeplab V3+ network and replaces the original non-linear activation function with Funnel ReLU (FReLU), an activation function designed for vision tasks, to obtain accuracy compensation. It further improves overall segmentation accuracy by predicting the boundaries of each class accurately, which reduces the intra-class error in the prediction results. Experimental results on the Englishhnd dataset show that the improved network achieves 96.35% accuracy on English characters with the same network parameters, training data, and test data.
Keywords: Deeplab V3+ network; feature fusion; image semantic; learning; multi-modal discourse analysis
Year: 2022 PMID: 35720773 PMCID: PMC9200988 DOI: 10.3389/fncom.2022.895680
Source DB: PubMed Journal: Front Comput Neurosci ISSN: 1662-5188 Impact factor: 3.387
Figure 1. DeeplabV3+ network structure.
Figure 2. Improved ladder-type DeeplabV3+ network structure.
Figure 3. Improved ASPP module.
Figure 4. Two-dimensional FReLU activation function with funnel condition.
Figure 5. Englishhnd dataset.
Figure 6. Representation of the letters A and B.
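The funnel condition named in the Figure 4 caption can be illustrated with a short sketch. In the published FReLU formulation the activation is y = max(x, T(x)), where T(x) is a per-channel (depthwise) spatial convolution over a small window. The version below is a minimal NumPy sketch under stated assumptions: a 3x3 window, zero padding, and no batch normalization after the convolution. The names `frelu`, `weight`, and `bias` are hypothetical, and this is not the authors' implementation.

```python
import numpy as np


def frelu(x, weight, bias=None):
    """Funnel activation sketch: y = max(x, T(x)).

    x:      feature map of shape (C, H, W)
    weight: per-channel 3x3 kernels of shape (C, 3, 3)  (assumed window size)
    bias:   optional per-channel offsets of shape (C,)
    """
    c, h, w = x.shape
    # Zero padding of one pixel keeps the spatial size at H x W.
    padded = np.pad(x, ((0, 0), (1, 1), (1, 1)))
    funnel = np.zeros_like(x, dtype=float)
    # Accumulate the depthwise 3x3 cross-correlation one kernel tap at a time.
    for dy in range(3):
        for dx in range(3):
            tap = weight[:, dy, dx, None, None]          # shape (C, 1, 1)
            funnel += tap * padded[:, dy:dy + h, dx:dx + w]
    if bias is not None:
        funnel += bias[:, None, None]
    # Element-wise maximum between the input and its spatial condition.
    return np.maximum(x, funnel)
```

With an all-zero kernel and no bias, T(x) is zero everywhere and the sketch reduces to the ordinary ReLU, which is one way to see FReLU as a spatially conditioned extension of ReLU.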
Network parameters.
| Parameter | Value |
|---|---|
| Number of hidden layers | 1 |
| Number of hidden units | 500 |
| Number of input nodes | 35 |
| Number of output nodes | 52 |
| Target error | 0.0001 |
| Maximum number of training iterations | 40 |
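To make the listed sizes concrete, the following is a minimal sketch of a feed-forward network with one hidden layer that uses the values from the table. Only the layer sizes, the target error, and the training limit come from the source; the sigmoid activation, the weight initialization, and all names (`BPConfig`, `init_params`, `forward`, `should_stop`) are assumptions made for illustration.

```python
from dataclasses import dataclass

import numpy as np


@dataclass
class BPConfig:
    n_input: int = 35            # number of input nodes (from the table)
    n_hidden: int = 500          # units in the single hidden layer (from the table)
    n_output: int = 52           # number of output nodes (from the table)
    target_error: float = 1e-4   # stop once the error falls to this value
    max_iterations: int = 40     # upper limit on training iterations


def init_params(cfg, seed=0):
    # Small random weights and zero biases: an assumed initialization.
    rng = np.random.default_rng(seed)
    return {
        "W1": rng.normal(0.0, 0.1, (cfg.n_input, cfg.n_hidden)),
        "b1": np.zeros(cfg.n_hidden),
        "W2": rng.normal(0.0, 0.1, (cfg.n_hidden, cfg.n_output)),
        "b2": np.zeros(cfg.n_output),
    }


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def forward(params, x):
    # x has shape (batch, n_input); the result has shape (batch, n_output).
    hidden = sigmoid(x @ params["W1"] + params["b1"])
    return sigmoid(hidden @ params["W2"] + params["b2"])


def should_stop(cfg, iteration, error):
    # Training ends when either limit from the table is reached.
    return error <= cfg.target_error or iteration >= cfg.max_iterations
```

A batch of four 35-dimensional inputs, for example `forward(init_params(BPConfig()), np.zeros((4, 35)))`, yields an array of shape `(4, 52)`, one score per output node.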
Figure 7. Relationship between word vector dimensionality and both F1 and training time. (A) BP network. (B) Our method.
Dataset parameters.
| Metric | Comparison method | Improved network |
|---|---|---|
| Recognition accuracy | 88.56% | 96.35% |
| AUC | 0.72 | 0.89 |
Figure 8. Histogram of word vectors.
Figure 9. The semantic segmentation process of English units in this model. (A) Original picture. (B) Grayscale. (C) Binarization diagram. (D) Peak noise. (E) Splitting effect.
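The stages listed in the Figure 9 caption resemble a classical preprocessing chain. The source does not give the implementation, so the sketch below is illustrative only and rests on assumptions: BT.601 luma weights for the grayscale step, a fixed threshold of 128 for binarization, and a vertical-projection split standing in for the splitting step. The noise-handling stage (D) is omitted, and the function names are hypothetical.

```python
import numpy as np


def to_grayscale(rgb):
    # Weighted sum of the R, G, B channels (ITU-R BT.601 luma weights).
    return rgb[..., :3] @ np.array([0.299, 0.587, 0.114])


def binarize(gray, threshold=128):
    # True marks foreground (dark ink); the fixed threshold is an assumption.
    return gray < threshold


def split_columns(mask):
    """Return (start, end) column ranges that contain foreground, end exclusive."""
    has_ink = mask.any(axis=0)   # vertical projection: any ink in each column
    segments, start = [], None
    for col, ink in enumerate(has_ink):
        if ink and start is None:
            start = col                      # a new unit begins
        elif not ink and start is not None:
            segments.append((start, col))    # the unit ended at the previous column
            start = None
    if start is not None:
        segments.append((start, len(has_ink)))
    return segments
```

Applied to an image with two dark strokes separated by a blank column, `split_columns` returns one column range per stroke, which could then be cropped and passed to a recognizer.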