Feng Pan1,2, Bingyao Huang1, Chunhong Zhang1, Xinning Zhu1, Zhenyu Wu1, Moyu Zhang1, Yang Ji1, Zhanfei Ma2, Zhengchen Li3.
Abstract
Student Dropout Prediction (SDP) is pivotal in mitigating withdrawals in Massive Open Online Courses (MOOCs). Previous studies generally modeled SDP as a binary classification task that provides a single prediction outcome. To obtain continuous and consistent predictions over time, some attempts introduce survival analysis methods. However, the volatility and sparsity of the data weaken the models' performance. Prevailing solutions rely heavily on data pre-processing that is independent of the predictive model, which is labor-intensive and may contaminate authentic data. This paper proposes a Survival Analysis based Volatility and Sparsity Modeling Network (SAVSNet) to address these issues in an end-to-end deep learning framework. Specifically, SAVSNet smooths the volatile time series with a convolutional network while preserving the original data information using a Long Short-Term Memory (LSTM) network. Furthermore, we propose a Time-Missing-Aware LSTM unit to mitigate the impact of data sparsity by integrating informative missingness patterns into the model. A survival analysis loss function is adopted for parameter estimation, and the model outputs monotonically decreasing survival probabilities. In the experiments, we compare the proposed method with state-of-the-art methods on two real-world MOOC datasets, and the results show the effectiveness of the proposed model.
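The abstract says the model outputs monotonically decreasing survival probabilities but does not give the formula. One common way to guarantee monotonicity, assumed here purely for illustration, is a discrete-time formulation: predict a per-step hazard in (0, 1) and take the running product of (1 − hazard). The function names and the likelihood below are hypothetical and are not taken from SAVSNet.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def survival_curve(hazard_logits):
    """Map per-step hazard logits to survival probabilities S(t).

    Assumed formulation: h_t = sigmoid(logit_t), S(t) = prod_{k<=t} (1 - h_k).
    Because every factor lies in (0, 1), S(t) can only decrease.
    """
    curve, s = [], 1.0
    for z in hazard_logits:
        s *= 1.0 - sigmoid(z)
        curve.append(s)
    return curve


def survival_nll(hazard_logits, last_step, dropped_out):
    """Negative log-likelihood of one learner under the same assumption.

    last_step: index of the last observed time step (0-based).
    dropped_out: True if dropout happened at last_step, False if censored.
    """
    nll = 0.0
    for t, z in enumerate(hazard_logits[: last_step + 1]):
        h = sigmoid(z)
        if dropped_out and t == last_step:
            nll -= math.log(h)        # the event occurs at this step
        else:
            nll -= math.log(1.0 - h)  # the learner survives this step
    return nll


curve = survival_curve([-2.0, 0.5, -1.0, 1.5])
```

Censored learners contribute only "survived" terms, which is what lets a survival loss use enrollments whose outcome is not yet known.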
Year: 2022 PMID: 35512010 PMCID: PMC9071151 DOI: 10.1371/journal.pone.0267138
Source DB: PubMed Journal: PLoS One ISSN: 1932-6203 Impact factor: 3.752
Fig 1. The architecture of SAVSNet.
The model consists of three major components, namely the volatility modeling network, the sparsity modeling network, and the survival analysis network.
Fig 2. An example of the convolution operation of 1D-CNN.
1D-CNN slides the convolution kernel along the temporal dimension of the input time series and generates the filtered signal by convolving over a local window of the input.
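The sliding-kernel operation in this caption can be sketched in a few lines. The example uses a fixed averaging kernel so the smoothing effect is easy to see; in a trained 1D-CNN the kernel weights are learned, and the series, kernel, and function name below are made up for illustration.

```python
def conv1d_valid(series, kernel):
    """Slide `kernel` along `series` with stride 1 and no padding.

    Each output value is the weighted sum of one local window of the input
    (cross-correlation, which is what CNN layers compute in practice).
    """
    k = len(kernel)
    return [
        sum(series[i + j] * kernel[j] for j in range(k))
        for i in range(len(series) - k + 1)
    ]


# Hypothetical daily activity counts with sharp day-to-day swings.
daily_clicks = [0, 12, 0, 0, 9, 1, 0, 15, 0, 2]
mean_kernel = [1 / 3] * 3  # stand-in for a learned kernel of size 3
smoothed = conv1d_valid(daily_clicks, mean_kernel)
```

With a kernel of size 3 and no padding, the output has two fewer steps than the input, and its peaks are much lower than the raw spikes.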
Fig 3. The structure of Time-Missing-Aware LSTM (TM-LSTM) cell.
TM-LSTM introduces the elapsed time Δ and the missing indicator u to adjust the previous long-term and short-term memory, respectively.
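To make the two auxiliary inputs concrete, the sketch below derives a missing indicator u and an elapsed time Δ from a sequence with gaps, then applies them to the previous memories. The caption does not give the cell equations, so everything here is an assumption: the Δ counting convention, the decay 1/log(e + Δ) (a heuristic used in other time-aware LSTM variants), and the `miss_discount` factor are all hypothetical.

```python
import math


def missingness_inputs(observations):
    """Return (u, delta) for a sequence in which None marks a missing step.

    u[t] is 1 if step t is missing, else 0.
    delta[t] counts steps since the last observed step (assumed convention:
    0 at t = 0, reset to 1 right after an observed step).
    """
    u, delta = [], []
    for t, x in enumerate(observations):
        u.append(1 if x is None else 0)
        if t == 0:
            delta.append(0)
        elif observations[t - 1] is None:
            delta.append(delta[-1] + 1)
        else:
            delta.append(1)
    return u, delta


def time_decay(delta_t):
    """Monotone decay in elapsed time; equals 1 when delta_t is 0."""
    return 1.0 / math.log(math.e + delta_t)


def adjust_memory(c_prev, h_prev, delta_t, u_t, miss_discount=0.5):
    """Hypothetical adjustment: delta scales long-term memory c,
    the missing indicator scales short-term memory h."""
    c_adj = c_prev * time_decay(delta_t)
    h_adj = h_prev * (miss_discount if u_t else 1.0)
    return c_adj, h_adj


u, delta = missingness_inputs([3, None, None, 5, None, 2])
```

For the sample sequence this gives u = [0, 1, 1, 0, 1, 0] and Δ = [0, 1, 2, 3, 1, 2], so the longest gap produces the strongest decay of the long-term memory.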
Dataset characteristics of KDDCup 2015 and XuetangX.
| Type | KDDCup 2015 | XuetangX |
|---|---|---|
| | 39 | 19 |
| | 72,395 | 23,839 |
| | 30 | 35 |
| | 7 | 22 |
| | 90.8% | 90.02% |
| | 20.7% : 79.3% | 38.7% : 61.3% |
The results on KDDCup 2015 dataset.
| Category | | C-index | Accuracy | F1 | AUC |
|---|---|---|---|---|---|
| Semi-parametric models | | 0.7827 | 0.7161 | 0.7993 | 0.7233 |
| | | 0.7826 | 0.7274 | 0.8120 | 0.7079 |
| | | 0.8118 | 0.7668 | 0.8407 | 0.7565 |
| Parametric models | | 0.8237 | 0.7855 | 0.8576 | 0.7466 |
| | | | | | |
| | | | | | |
↑ indicates that the higher the value, the better the performance. Of all the results, the highest are shown in bold and the second highest are underlined.
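The C-index column in the results tables measures how well predicted risks rank learners by their dropout times. A minimal sketch of Harrell's concordance index follows; the function name and toy data are illustrative, and the paper's exact evaluation code is not reproduced here.

```python
def concordance_index(times, events, risks):
    """Harrell's C-index.

    A pair (i, j) is comparable when learner i has an observed event and
    times[i] < times[j]. The pair is concordant when the earlier dropout
    also has the higher predicted risk; tied risks count as 0.5.
    """
    concordant, comparable = 0.0, 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            continue  # a censored learner cannot anchor a comparable pair
        for j in range(n):
            if times[i] < times[j]:
                comparable += 1
                if risks[i] > risks[j]:
                    concordant += 1.0
                elif risks[i] == risks[j]:
                    concordant += 0.5
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


times = [2, 5, 7, 9]          # day of dropout or censoring
events = [1, 1, 0, 1]         # 1 = dropout observed, 0 = censored
risks = [0.9, 0.6, 0.4, 0.1]  # higher means earlier predicted dropout
c_perfect = concordance_index(times, events, risks)
c_reversed = concordance_index(times, events, risks[::-1])
```

A value of 1.0 means every comparable pair is ranked correctly, 0.5 corresponds to random ranking, and 0.0 to a fully reversed ranking.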
The results on XuetangX dataset.
| Category | | C-index | Accuracy | F1 | AUC |
|---|---|---|---|---|---|
| Semi-parametric models | | 0.7464 | 0.7286 | 0.8202 | 0.6464 |
| | | 0.5927 | 0.6461 | 0.7589 | 0.5604 |
| | | 0.7467 | 0.7313 | 0.8172 | 0.6831 |
| Parametric models | | 0.7039 | | | 0.6477 |
| | | | 0.7523 | 0.8336 | |
| | | | | | |
↑ indicates that the higher the value, the better the performance. Of all the results, the highest are shown in bold and the second highest are underlined.
Ablation study of key components of SAVSNet.
| | C-index ↑ | Accuracy ↑ | F1 ↑ | AUC ↑ |
|---|---|---|---|---|
| | | | | |
| | 0.8636 | 0.8386 | 0.8948 | 0.8022 |
| | 0.8052 | 0.8240 | 0.8830 | 0.8069 |
| | 0.7849 | 0.8091 | 0.8711 | 0.8053 |
-VM and -SM denote removing the volatility modeling network or the sparsity modeling network from SAVSNet, respectively. -TM indicates replacing the TM-LSTM unit with a standard LSTM unit.
Ablation study of components of volatility modeling network.
| | C-index ↑ | Accuracy ↑ | F1 ↑ | AUC ↑ |
|---|---|---|---|---|
| | | | | |
| | 0.8648 | 0.8088 | 0.8708 | 0.8061 |
| | 0.8959 | 0.8255 | 0.8836 | 0.8135 |
| | 0.7519 | 0.8235 | 0.8826 | 0.8074 |
-CN and -LM denote removing the 1D-CNN or the LSTM part from the volatility modeling network, respectively. -UG indicates replacing the update gate with a ratio of 0.5:0.5.
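The -UG variant replaces a learned update gate with a fixed 0.5:0.5 ratio, which suggests the gate blends two representations element-wise. The sketch below assumes the two inputs are the 1D-CNN (smoothed) and LSTM (original) outputs; that pairing, the function names, and the gate logits (produced by a learned layer in a real model) are all assumptions.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def fuse(smoothed, original, gate_logits=None):
    """Blend two equal-length feature vectors element-wise.

    gate_logits=None reproduces the fixed 0.5:0.5 mix of the -UG ablation;
    otherwise z = sigmoid(logit) weights the smoothed branch and (1 - z)
    weights the original branch.
    """
    if gate_logits is None:
        gates = [0.5] * len(smoothed)
    else:
        gates = [sigmoid(g) for g in gate_logits]
    return [z * s + (1.0 - z) * o for z, s, o in zip(gates, smoothed, original)]


fixed_mix = fuse([2.0, 4.0], [0.0, 0.0])
gated_mix = fuse([2.0, 4.0], [0.0, 0.0], gate_logits=[20.0, -20.0])
```

With a learned gate, each feature can lean almost entirely on one branch, as in `gated_mix`, whereas the fixed ratio always averages the two.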
The effect of different convolution kernels on KDDCup 2015.
| | C-index ↑ | Accuracy ↑ | F1 ↑ | AUC ↑ |
|---|---|---|---|---|
| | 0.9072 | 0.8158 | 0.8762 | 0.8104 |
| | | | | |
| | 0.8949 | 0.8319 | 0.8887 | 0.8140 |
| | 0.9066 | 0.8138 | 0.8745 | 0.8102 |
| | 0.8875 | 0.8297 | 0.8870 | 0.8138 |
| | 0.8955 | 0.8358 | 0.8919 | 0.8116 |
Fig 4. The results of C-index and F1 on KDDCup 2015 with different convolution kernel sizes.
Fig 5. A case study of SAVSNet.
A: Silhouette scores for 2 to 7 clusters. B: Unsupervised clustering into 5 clusters. C: Comparison of the survival probability curves of the representative instances of the 5 clusters. D: A set of heatmaps showing longitudinal and multivariate learning activities for the 5 representative instances.