Yanyan Cui, Lixin Liu.
Abstract
In recent years, online lending has created many risks while providing lending convenience to Chinese individuals and small and medium-sized enterprises. The timely assessment and prediction of the status of industry indicators is an important prerequisite for effectively preventing the spread of risks in China's new financial formats. The role of investor sentiment should not be underestimated. We first use the BERT model to divide investor sentiment in the review information of China's online lending third-party information website into three categories and analyze the relationship between investor sentiment and quantitative indicators of online lending product transactions. The results show that the percentage of positive comments has a positive relationship to the borrowing interest rate of P2P platforms that investors are willing to participate in for bidding projects. The percentage of negative comments has an inverse relationship to the borrowing period. Second, after introducing investor sentiment into the long short-term memory (LSTM) model, the average RMSE of the three forecast periods for borrowing interest rates is 0.373, and that of the borrowing period is 0.262, which are better than the values of other control models. Corresponding suggestions for the risk prevention of China's new financial formats are made.
Year: 2022 PMID: 35085306 PMCID: PMC8794130 DOI: 10.1371/journal.pone.0262539
Source DB: PubMed Journal: PLoS One ISSN: 1932-6203 Impact factor: 3.240
Fig 1. Research framework.
Fig 2. BERT process structure.
Fig 3. Schematic diagram of the BERT model input structure.
Fig 4. Comments text length.
BERT model training specific parameters.
| Parameter | Description | Value |
|---|---|---|
| max_seq_length | The maximum length of comments | 275 |
| train_batch_size | batch size | 16 |
| learning_rate | Learning rate | 2e-5 |
| num_train_epochs | training epochs | 4 |
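The parameters above can be collected into a small configuration object. The following is a minimal sketch, not the authors' code: the class name `BertFineTuneConfig`, the helper functions, the 300-token comment, and the 10,000-comment corpus size are all hypothetical. It assumes the usual BERT convention that `[CLS]` and `[SEP]` each occupy one position within `max_seq_length`.

```python
from dataclasses import dataclass
import math


@dataclass(frozen=True)
class BertFineTuneConfig:
    # Values copied from the parameter table above.
    max_seq_length: int = 275
    train_batch_size: int = 16
    learning_rate: float = 2e-5
    num_train_epochs: int = 4


def truncate_tokens(tokens, config):
    """Clip a tokenized comment so [CLS] + tokens + [SEP] fits max_seq_length."""
    budget = config.max_seq_length - 2  # reserve positions for [CLS] and [SEP]
    return ["[CLS]"] + list(tokens[:budget]) + ["[SEP]"]


def total_training_steps(num_examples, config):
    """Optimizer steps: batches per epoch times number of epochs."""
    steps_per_epoch = math.ceil(num_examples / config.train_batch_size)
    return steps_per_epoch * config.num_train_epochs


config = BertFineTuneConfig()
example = truncate_tokens(["tok"] * 300, config)  # hypothetical 300-token comment
print(len(example))
print(total_training_steps(10_000, config))  # hypothetical corpus of 10,000 comments
```

With these settings, any comment longer than 273 tokens is clipped, so every input sequence is at most 275 positions long.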
Unit root test results.
| Time series | T value | P-value |
|---|---|---|
| Percentage of positive-sentiment | -9.3375 | 0.01 |
| Average borrowing interest rate | -19.223 | 0.01 |
| Percentage of negative-sentiment | -14.837 | 0.01 |
| Average borrowing period | -8.268 | 0.01 |
Granger causality test results.
| Lagged value | 1 | 2 | 3 | 4 | 5 |
|---|---|---|---|---|---|
| p-value of hypothesis 1.1 | 0.0024 | 0.0109 | 0.0288 | 0.0324 | 0.0224 |
| p-value of hypothesis 1.2 | 4.123e-07 | 0.0032 | 0.0054 | 0.0123 | 0.0054 |
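The two test tables above can be read with a simple decision rule. The sketch below assumes a 5% significance level, which the tables themselves do not state, and uses the p-values exactly as reported. For the unit root test, rejecting the null hypothesis suggests the series is stationary; for the Granger test, rejecting the null suggests that lagged values of one series help predict the other.

```python
ALPHA = 0.05  # assumed significance level; not stated in the tables

# p-values copied from the unit root test table.
unit_root_p = {
    "Percentage of positive-sentiment": 0.01,
    "Average borrowing interest rate": 0.01,
    "Percentage of negative-sentiment": 0.01,
    "Average borrowing period": 0.01,
}

# p-values copied from the Granger causality table, ordered by lag 1 to 5.
granger_p = {
    "hypothesis 1.1": [0.0024, 0.0109, 0.0288, 0.0324, 0.0224],
    "hypothesis 1.2": [4.123e-07, 0.0032, 0.0054, 0.0123, 0.0054],
}


def rejects_null(p_value, alpha=ALPHA):
    """Return True when the null hypothesis is rejected at level alpha."""
    return p_value < alpha


# Series for which the unit root null is rejected.
stationary = {name: rejects_null(p) for name, p in unit_root_p.items()}

# Lags at which the no-causality null is rejected, per hypothesis.
significant_lags = {
    name: [lag for lag, p in enumerate(p_values, start=1) if rejects_null(p)]
    for name, p_values in granger_p.items()
}

print(stationary)
print(significant_lags)
```

Under this assumed threshold, all four series reject the unit root null, and both hypotheses are significant at every lag from 1 to 5.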
Fig 5. LSTM model training process.
LSTM model training specific parameters.
| Parameter | Description | Value |
|---|---|---|
| cell size | Number of hidden layer neurons | 50 |
| loss function | Choice of the loss function | Mean Absolute Error |
| optimizer | Determine the optimizer | Adam |
| batch size | The size of each batch of samples | 20 |
| learning rate | Learning rate value | 0.001 |
| N epoch | training epochs | 1000 |
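As an illustration of the settings above, the sketch below stores them in a configuration object and implements the mean absolute error used as the training loss. The class name `LstmTrainConfig`, its field names, and the sample values are hypothetical; this is not the authors' training code, and it does not build the network itself.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LstmTrainConfig:
    # Values copied from the parameter table above.
    cell_size: int = 50
    loss: str = "mean_absolute_error"
    optimizer: str = "adam"
    batch_size: int = 20
    learning_rate: float = 0.001
    n_epochs: int = 1000


def mean_absolute_error(y_true, y_pred):
    """Average absolute difference between targets and predictions."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


config = LstmTrainConfig()

# Hypothetical targets and predictions, for illustration only.
loss_value = mean_absolute_error([10.0, 12.0, 9.0], [11.0, 12.0, 7.0])
print(config.loss, loss_value)
```

Unlike a squared-error loss, MAE weights every unit of error equally, so a few large misses influence training less.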
Fig 6. Average borrowing interest rate prediction results.
Fig 7. Average borrowing period prediction results.
Interpretation of comparison models.
| Model | Description |
|---|---|
| AttLSTM | An attention layer is added to the LSTM network to make better use of the input data. Comparison with the ordinary LSTM model tests whether weighting the inputs improves performance. |
| SVR | Support vector regression is the regression version of the support vector machine. It is a nonparametric regression technique based on the principle of structural risk minimization and can generalize well. Comparison with the LSTM model tests whether support vector regression, with its "inclusive" (tolerant) loss calculation, performs well. |
| MLP | The multilayer perceptron is a feed-forward neural network with multiple layers. The model fits the relationships in the training data by continuously adjusting the connection weights between neurons. Comparison with the LSTM model reveals how differently the two models capture relationships in the training data. |
| Random Forest | Random forest regression is a supervised ensemble algorithm. The model is composed of many regression trees, and its prediction aggregates their outputs. Comparison with the LSTM model reveals the predictive performance of an ensemble model. |
Average borrowing interest rate prediction results.
| Model | Long-term (94 periods) | | | Medium-term (54 periods) | | | Short-term (24 periods) | | |
|---|---|---|---|---|---|---|---|---|---|
| | RMSE | MAPE | SMAPE | RMSE | MAPE | SMAPE | RMSE | MAPE | SMAPE |
| LSTM | | | | | | | 0.336 | 18.064% | 16.141% |
| AttLSTM | 0.392 | 16.673% | 14.905% | 0.438 | 20.588% | 17.882% | | | |
| SVR | 1.578 | 40.271% | 28.033% | 1.734 | 50.903% | 33.132% | 1.585 | 41.365% | 28.204% |
| MLP | 1.753 | 44.369% | 27.385% | 2.039 | 55.144% | 33.624% | 1.852 | 44.716% | 27.937% |
| Random Forest | 1.934 | 66.483% | 41.230% | 2.439 | 64.689% | 45.410% | 2.262 | 63.400% | 35.706% |
Note: Values in bold are the best results for each evaluation indicator.
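The three evaluation indicators in the table can be computed as follows. This is a minimal sketch using common textbook definitions; the exact formulas used in the study are not shown here, and SMAPE in particular has several variants, so the version below (absolute error divided by the mean of the absolute actual and predicted values) is an assumption. The sample series are hypothetical.

```python
import math


def rmse(actual, predicted):
    """Root mean squared error."""
    n = len(actual)
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / n)


def mape(actual, predicted):
    """Mean absolute percentage error, in percent. Actual values must be nonzero."""
    n = len(actual)
    return 100.0 * sum(abs((a - p) / a) for a, p in zip(actual, predicted)) / n


def smape(actual, predicted):
    """Symmetric MAPE, in percent (assumed variant, see the note above)."""
    n = len(actual)
    return 100.0 * sum(
        abs(p - a) / ((abs(a) + abs(p)) / 2) for a, p in zip(actual, predicted)
    ) / n


# Hypothetical actual and predicted values, for illustration only.
actual = [10.0, 8.0]
predicted = [9.0, 10.0]

print(round(rmse(actual, predicted), 3))
print(round(mape(actual, predicted), 3))
print(round(smape(actual, predicted), 3))
```

RMSE is expressed in the units of the forecast variable, while MAPE and SMAPE are scale-free percentages, which is why the table reports them with a percent sign.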
Average borrowing period prediction results.
| Model | Long-term (94 periods) | | | Medium-term (54 periods) | | | Short-term (24 periods) | | |
|---|---|---|---|---|---|---|---|---|---|
| | RMSE | MAPE | SMAPE | RMSE | MAPE | SMAPE | RMSE | MAPE | SMAPE |
| LSTM | 0.280 | 15.072% | 13.129% | | 17.077% | | | | 11.535% |
| AttLSTM | | | | 0.316 | | 14.757% | 0.215 | 12.057% | |
| SVR | 0.796 | 37.433% | 27.664% | 0.781 | 29.934% | 26.867% | 0.759 | 34.754% | 29.962% |
| MLP | 0.864 | 39.625% | 31.651% | 0.997 | 50.198% | 36.946% | 1.049 | 35.339% | 37.890% |
| Random Forest | 0.877 | 33.005% | 34.944% | 0.959 | 31.544% | 35.153% | 0.779 | 34.484% | 36.245% |
Note: Values in bold are the best results for each evaluation indicator.