Jia Liu, Jianling Guo.
Abstract
Accurate video content and clear goals enable students to learn and strengthen their abilities continually. Strengthening students' learning initiative in our country can promote more effective study. As a new teaching method, video lets students obtain the basic knowledge, learning priorities, and difficult points they need, and also helps them understand the content of the text. It can even cultivate their interest through many related forms of expression, such as writing, text, sound, image, color, and video, which present and clarify material intuitively. This creates a free and relaxed learning environment and an engaging teaching process, encourages students to experience emotion, including physical experience, and helps them remain open. Establishing a complete and comprehensive channel of ideas to further improve students' acceptance of information helps their analysis, training, understanding, and evaluation. Therefore, this paper first identifies video and explores the intrinsic value of video applications. This can provide technical and methodological support for the design of video teaching systems.
Year: 2022 PMID: 35800693 PMCID: PMC9256379 DOI: 10.1155/2022/7501765
Source DB: PubMed Journal: Comput Intell Neurosci
Correlation between three modes.
| Modality | Correlation coefficient |
|---|---|
| Visual and audio | 0.5036 |
| Visual and text | 0.5069 |
| Audio and text | 0.1217 |
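The record does not say how the coefficients above were computed. As a minimal sketch, assuming they are Pearson correlations between two per-sample score sequences (one per modality), the calculation could look like this. The function name `pearson_correlation` and the sample inputs are hypothetical and not taken from the paper.

```python
import math
from typing import Sequence


def pearson_correlation(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation of two equal-length sequences (assumed metric)."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    # Unnormalized covariance and variances; the 1/n factors cancel in the ratio.
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical per-sample scores for two modalities (illustrative only).
visual_scores = [0.1, 0.4, 0.35, 0.8]
audio_scores = [0.2, 0.3, 0.5, 0.6]
print(round(pearson_correlation(visual_scores, audio_scores), 4))
```

A value near 1 would indicate that the two modalities vary together across samples, and a value near 0 that they carry largely independent information.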
Correlation between individual modes and scene categories.
| Modality | Correlation coefficient |
|---|---|
| Visual | 0.6135 |
| Audio | 0.1755 |
| Text | 0.2827 |
Correlation between representation of three modes in common subspaces and categories.
| Modality | Corrcoef_CCA | Corrcoef_MVDA |
|---|---|---|
| Visual | 0.2502 | 0.2235 |
| Audio | 0.2074 | 0.0234 |
| Text | 0.2849 | 0.0964 |
Comparison of network performance (mAP) at different values of K.
| Value of K | mAP (@50) | mAP (@100) |
|---|---|---|
|  | 0.4255 | 0.4240 |
|  | 0.4246 | 0.4200 |
|  | 0.4504 | 0.4468 |
|  | 0.4293 | 0.4298 |
|  | 0.4253 | 0.4164 |
Comparison of mAP performance of the method in this section with traditional multimodal fusion methods.
| Method | mAP (@50) | mAP (@100) |
|---|---|---|
| Concatenating | 0.398 | 0.358 |
| LDA | 0.411 | 0.393 |
| CCA | 0.258 | 0.234 |
| MvDA | 0.282 | 0.250 |
| Multilayer neural network | 0.450 | 0.445 |
| Proposed method | 0.469 | 0.477 |
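The tables report mAP at cutoffs of 50 and 100 retrieved items. As a hedged illustration of how such a score is commonly computed, the sketch below takes, for each query, a ranked list of relevance flags and averages the precision at each relevant hit within the top k. The normalization (dividing by the number of relevant items found in the top k) is one common convention and is an assumption here; the paper's exact definition is not given in this record. All function names are hypothetical.

```python
from typing import Sequence


def average_precision_at_k(relevant: Sequence[bool], k: int) -> float:
    """AP@k for one query, given relevance flags of its ranked results."""
    hits = 0
    precision_sum = 0.0
    for rank, is_relevant in enumerate(relevant[:k], start=1):
        if is_relevant:
            hits += 1
            precision_sum += hits / rank  # precision at this rank
    # Assumed convention: normalize by the relevant items found in the top k.
    return precision_sum / hits if hits else 0.0


def mean_average_precision_at_k(
    all_relevant: Sequence[Sequence[bool]], k: int
) -> float:
    """mAP@k: the mean of AP@k over all queries."""
    if not all_relevant:
        raise ValueError("need at least one query")
    return sum(average_precision_at_k(r, k) for r in all_relevant) / len(all_relevant)


# Two toy queries with hand-made relevance flags (illustrative only).
queries = [[True, False, True], [False, True]]
print(round(mean_average_precision_at_k(queries, k=3), 4))
```

Under this convention a query whose top k results contain no relevant item contributes 0 to the mean.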
Comparison of mAP@50 performance of the method in this section with individual hash learning methods.
| Method | 8 bits | 16 bits | 32 bits | 64 bits |
|---|---|---|---|---|
| LFH | 0.388 | 0.365 | 0.406 | 0.359 |
| KSH | 0.338 | 0.393 | 0.452 | 0.439 |
| SDH | 0.239 | 0.223 | 0.223 | 0.292 |
| COSDISH | 0.330 | 0.369 | 0.400 | 0.375 |
| Proposed method | 0.469 | 0.452 | 0.455 | 0.454 |
Comparison of mAP@100 performance of the method in this section with individual hash learning methods.
| Method | 8 bits | 16 bits | 32 bits | 64 bits |
|---|---|---|---|---|
| LFH | 0.395 | 0.366 | 0.406 | 0.358 |
| KSH | 0.341 | 0.377 | 0.431 | 0.411 |
| SDH | 0.247 | 0.219 | 0.220 | 0.266 |
| COSDISH | 0.330 | 0.369 | 0.403 | 0.378 |
| Proposed method | 0.477 | 0.453 | 0.455 | 0.453 |
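The bit lengths in these tables (8 to 64 bits) refer to the size of the binary hash code assigned to each item. A minimal sketch of how retrieval with such codes typically works, assuming the standard approach of ranking database items by Hamming distance to the query code, is shown below. The codes are stored as Python integers, and the names `hamming_distance` and `rank_by_hamming` are hypothetical; the record does not describe how the paper's codes are learned.

```python
from typing import List, Sequence


def hamming_distance(a: int, b: int) -> int:
    """Number of bit positions in which two hash codes differ."""
    return bin(a ^ b).count("1")


def rank_by_hamming(query_code: int, database_codes: Sequence[int]) -> List[int]:
    """Return database indices sorted by Hamming distance to the query.

    sorted() is stable, so ties keep their original database order.
    """
    return sorted(
        range(len(database_codes)),
        key=lambda i: hamming_distance(query_code, database_codes[i]),
    )


# Toy 8-bit codes (illustrative only).
query = 0b10110010
database = [0b10110010, 0b01001101, 0b10110011, 0b10100010]
print(rank_by_hamming(query, database))
```

The resulting ranking is what a metric such as mAP@50 or mAP@100 would then be evaluated on.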
Comparison of the performance of the method in this section with existing methods on the Maryland dataset.
| Class | HOF + GIST | SFA | C3D | ACSL |
|---|---|---|---|---|
| Avalanche | 0.200 | 0.600 | 1.000 | 1.000 |
| Boiling water | 0.500 | 0.700 | 0.900 | 1.000 |
| Chaotic traffic | 0.300 | 0.800 | 0.900 | 1.000 |
| Forest fire | 0.500 | 0.100 | 0.800 | 1.000 |
| Fountain | 0.200 | 0.500 | 0.900 | 1.000 |
| Iceberg collapse | 0.200 | 0.600 | 1.000 | 0.800 |
| Landslide | 0.200 | 0.600 | 0.800 | 0.800 |
| Smooth traffic | 0.300 | 0.500 | 0.800 | 0.800 |
| Tornado | 0.400 | 0.700 | 0.800 | 0.800 |
| Volcanic eruption | 0.200 | 0.800 | 0.900 | 0.800 |
| Waterfall | 0.200 | 0.500 | 0.700 | 0.400 |
| Waves | 0.800 | 0.600 | 1.000 | 0.600 |
| Whirlpool | 0.300 | 0.800 | 0.900 | 1.000 |
| Average | 0.330 | 0.600 | 0.860 | 0.850 |
Performance comparison of the method in this section with existing methods on the Yupenn dataset.
| Class | HOF + GIST | SFA | C3D | ACSL |
|---|---|---|---|---|
| Beach | 0.870 | 0.930 | 0.970 | 1.000 |
| Elevator | 0.870 | 0.970 | 1.000 | 1.000 |
| Fire | 0.630 | 0.700 | 1.000 | 1.000 |
| Fountain | 0.430 | 0.570 | 0.830 | 1.000 |
| Highway | 0.470 | 0.930 | 0.970 | 0.890 |
| Lightning | 0.630 | 0.870 | 0.930 | 1.000 |
| Ocean | 0.970 | 1.000 | 1.000 | 1.000 |
| Railway | 0.830 | 0.930 | 0.970 | 1.000 |
| River | 0.770 | 0.870 | 1.000 | 0.890 |
| Sky | 0.870 | 0.930 | 0.970 | 1.000 |
| Snowing | 0.470 | 0.700 | 0.930 | 0.560 |
| Street | 0.770 | 0.970 | 1.000 | 0.890 |
| Waterfall | 0.470 | 0.730 | 0.970 | 0.890 |
| Windmill | 0.530 | 0.870 | 1.000 | 0.890 |
| Average | 0.680 | 0.850 | 0.970 | 0.930 |
Performance comparison of the method in this section with existing methods on the videoSceneData_10 dataset.
| Class | HOF + GIST | SFA | C3D | ACSL |
|---|---|---|---|---|
| Museum | 0.250 | 0.080 | 0.100 | 0.797 |
| Pier | 0.130 | 0.070 | 0.050 | 0.594 |
| Garden | 0.500 | 0.040 | 0.100 | 0.815 |
| Office | 0.030 | 0.020 | 0.050 | 0.594 |
| Bridge | 0.190 | 0.120 | 0.120 | 0.768 |
| Racetrack | 0.230 | 0.070 | 0.120 | 0.774 |
| Landmark | 0.210 | 0.060 | 0.050 | 0.788 |
| Aquarium | 0.300 | 0.470 | 0.050 | 0.818 |
| Lake | 0.060 | 0.070 | 0.120 | 0.683 |
| Bowling alley | 0.290 | 0.030 | 0.090 | 0.895 |
| Average | 0.220 | 0.100 | 0.100 | 0.753 |
Comparison of performance of dual-branch and single-branch networks.
| Class | Single-branch | Dual-branch |
|---|---|---|
| Museum | 0.774 | 0.797 |
| Pier | 0.622 | 0.594 |
| Garden | 0.877 | 0.815 |
| Office | 0.438 | 0.594 |
| Bridge | 0.752 | 0.768 |
| Racetrack | 0.744 | 0.774 |
| Landmark | 0.741 | 0.788 |
| Aquarium | 0.796 | 0.818 |
| Lake | 0.590 | 0.683 |
| Bowling alley | 0.860 | 0.895 |
| Average | 0.719 | 0.753 |
Validation of LSTM layers.
| Class | W/O-LSTM | W/LSTM |
|---|---|---|
| Museum | 0.589 | 0.797 |
| Pier | 0.139 | 0.594 |
| Garden | 0.834 | 0.815 |
| Office | 0.775 | 0.594 |
| Bridge | 0.721 | 0.768 |
| Racetrack | 0.719 | 0.774 |
| Landmark | 0.668 | 0.788 |
| Aquarium | 0.776 | 0.818 |
| Lake | 0.500 | 0.683 |
| Bowling alley | 0.819 | 0.895 |
| Average | 0.684 | 0.753 |
Modal semantic enhancement experiment.
| Modality | Acc_before_enhancement | Acc_after_enhancement |
|---|---|---|
| Audio | 0.3286 | 0.3427 |
| Text | 0.4153 | 0.4210 |
| Visual | 0.9816 | 0.9697 |
Ablation experiment.
| Modality | Accuracy |
|---|---|
| Audio | 0.3427 |
| Text | 0.4210 |
| Visual | 0.9697 |
| Visual + audio + text | 0.9826 |
Figure 1. Cognitive model of multimedia learning.
Figure 2. Classification of video elements for VARK learning style learners.
Figure 3. Information processing process model with redundant information.
Figure 4. Production process for learning style microvideo.
Figure 5. Microvideo resource application process diagram.