Weilin Xiang1, Yongbin Ma2, Dewen Liu3, Sikang Zhang1.
Abstract
In online communities such as Twitter, Facebook, and Reddit, users generate millions of pieces of content every day, and the wide variety of topics in these user-generated contents (UGCs) makes the communities vivid and attractive. However, why UGCs show such variety, and how a firm can influence it, has remained unknown, which hinders the understanding and management of UGC variety. This study addresses these two gaps by drawing on variety-seeking theory and topic modeling, a machine learning technique. We quantitatively extract the topics of UGCs using topic modeling and divide UGCs into two types: single-topic and multiple-topic. A user's tendency to choose one type of UGC over the other is used to measure variety-seeking behavior. We find that users have an intrinsic preference for variety when producing UGCs: the more single-topic UGCs a user has produced in the past, the higher the probability that the next UGC is multiple-topic and the lower the probability that it is single-topic, and vice versa. Furthermore, we examine the effect of language/linguistic style matching (LSM) between firm feedback and UGCs on users' variety-seeking tendencies in UGC production. This study makes three contributions: (1) it extends variety-seeking theory to a new behavior, content production, and demonstrates that people show variety-seeking behavior when producing UGCs; (2) it offers a feasible new method for measuring the variety of UGCs, which uses topic modeling to extract the topics of UGCs and then measures variety-seeking behavior by analyzing the choice between single-topic and multiple-topic UGCs; and (3) it offers guidance for firms on adjusting the LSM of their feedback to influence the variety of UGCs.
Keywords: language/linguistic style matching; machine learning; topic modeling; user-generated contents; variety-seeking
Year: 2022 PMID: 35360616 PMCID: PMC8960714 DOI: 10.3389/fpsyg.2022.808785
Source DB: PubMed Journal: Front Psychol ISSN: 1664-1078
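The abstract relies on LSM between firm feedback and UGCs, but this record does not state how LSM is computed. A common formulation in the LSM literature scores each function-word category as 1 − |p₁ − p₂| / (p₁ + p₂ + 0.0001), where p₁ and p₂ are the usage rates of that category in the two texts, and then averages over categories. The sketch below assumes that formulation; the function names, category names, and rates are hypothetical and are not taken from the study.

```python
from typing import Mapping


def lsm_category(rate_post: float, rate_feedback: float) -> float:
    """LSM for one function-word category (assumed formulation).

    Rates are the percentage of words in each text that fall in the category.
    The 0.0001 term avoids division by zero when both rates are 0.
    """
    return 1.0 - abs(rate_post - rate_feedback) / (rate_post + rate_feedback + 0.0001)


def lsm_score(post_rates: Mapping[str, float],
              feedback_rates: Mapping[str, float]) -> float:
    """Average the per-category LSM over the categories present in both texts."""
    categories = post_rates.keys() & feedback_rates.keys()
    if not categories:
        raise ValueError("no shared function-word categories")
    total = sum(lsm_category(post_rates[c], feedback_rates[c]) for c in categories)
    return total / len(categories)


# Hypothetical usage rates (percent of words) for a user post and a firm reply.
post = {"pronoun": 12.0, "article": 6.0, "preposition": 10.0}
reply = {"pronoun": 9.0, "article": 6.5, "preposition": 11.0}
print(round(lsm_score(post, reply), 3))
```

Scores close to 1 indicate closely matched styles and scores close to 0 indicate strongly mismatched styles.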
FIGURE 1 The number of topics and the corresponding coherence score.
The top 20 words in the first three topics.
| Topic | The top 20 words | Explanation on topic |
| Topic 1 | | MIUI SMS function related |
| Topic 2 | | MIUI topic related |
| Topic 3 | | Mobile plans, etc. |
Examples of posts corresponding to the first three topics.
| Topic | Explanation on topic | One original post | Probability of the topic |
| Topic 1 | MIUI SMS function related | .“ | 0.982 |
| Topic 2 | MIUI topic related | .“ | 0.989 |
| Topic 3 | Mobile plans, etc. | .“ | 0.968 |
"|" in an original post marks a line break and does not affect any conclusion in this manuscript.
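The posts above each carry a topic probability close to 1, and the later tables report results under thresholds between 0.4 and 0.7. One plausible reading, which is an assumption here and not stated in this record, is that a post counts as single-topic when its dominant topic probability reaches the threshold and as multiple-topic otherwise. A minimal sketch of that rule, with a hypothetical function name:

```python
from typing import Sequence


def classify_post(topic_probs: Sequence[float], threshold: float = 0.5) -> str:
    """Label a post by its topic distribution (assumed rule, not the authors' code).

    topic_probs: per-topic probabilities from a topic model for one post.
    threshold:   minimum dominant-topic probability for a "single" label.
    """
    if not topic_probs:
        raise ValueError("topic_probs must not be empty")
    return "single" if max(topic_probs) >= threshold else "multiple"


# A post dominated by one topic, like the Topic 1 example (probability 0.982).
print(classify_post([0.982, 0.010, 0.008]))
# A hypothetical post whose probability mass is spread over several topics.
print(classify_post([0.40, 0.35, 0.25]))
```

Under this rule, lowering the threshold labels more posts as single-topic, which is why a robustness check across thresholds is informative.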
The definition of variables.
| Variable | Definition |
| | Dummy variable, 1 means that user |
| | Dummy variable, 1 means that user |
| | To the end of the month |
| | To the end of the month |
| | To the end of the month |
| | To the end of the month |
| | To the end of the month |
| | To the end of the month |
| | To the end of the month |
| | To the end of the month |
| | To the end of the month |
| | To the end of the month |
| | To the end of the month |
| | To the end of the month |
| | To the end of the month |
Descriptive statistics of variables.
| Name of variables | Mean | SD | Min | Max |
| | 0.119 | 0.324 | 0 | 1 |
| | 0.081 | 0.273 | 0 | 1 |
| | 0.568 | 0.472 | 0 | 4.635 |
| | 0.369 | 0.45 | 0 | 4.564 |
| | 0.535 | 0.177 | 0.001 | 1 |
| | 0.247 | 0.526 | −1 | 1 |
| | 4.689 | 1.105 | 0.693 | 10.298 |
| | 0.478 | 0.894 | 0 | 7.182 |
| | 0.424 | 0.517 | 0 | 4.489 |
| | 4.12 | 1.029 | 1.099 | 8.592 |
| | 0.219 | 0.559 | −1 | 1 |
| | 1.673 | 0.992 | 0 | 6.94 |
| | 4.847 | 1.388 | 0.693 | 10.524 |
| | 0.526 | 0.169 | 0.001 | 1 |
| | 0.232 | 0.408 | −1 | 1 |
Correlation coefficient analysis of core variables.
| Variable | (1) | (2) | (3) | (4) | (5) |
| | 1.000 | | | | |
| | 0.107 | 1.000 | | | |
| | 0.275 | −0.077 | 1.000 | | |
| | −0.072 | 0.350 | −0.177 | 1.000 | |
| | −0.014 | −0.036 | −0.054 | −0.096 | 1.000 |
**p < 0.05 and ***p < 0.01.
Panel logistic model regression results.
| | Model 1 | Model 2 |
| DV | | |
| | −3.880 | 0.925 |
| | (0.342) | (0.289) |
| | 1.395 | −4.601 |
| | (0.253) | (0.358) |
| | 1.661 | 3.257 |
| | (0.592) | (0.607) |
| | 0.122 | 1.081 |
| | (0.330) | (0.362) |
| | −0.141 | −0.040 |
| | (0.169) | (0.176) |
| | 0.539 | 0.773 |
| | (0.125) | (0.144) |
| | −0.114 | 1.140 |
| | (0.410) | (0.433) |
| | −0.183 | −0.562 |
| | (0.192) | (0.209) |
| | −0.057 | 0.680 |
| | (0.256) | (0.302) |
| | −0.677 | −0.410 |
| | (0.302) | (0.332) |
| | 0.246 | 0.221 |
| | (0.185) | (0.193) |
| | −0.494 | −0.156 |
| | (0.577) | (0.619) |
| | 0.803 | 0.036 |
| | (0.372) | (0.410) |
| Individual FE | Yes | Yes |
| Time FE | Yes | Yes |
| N_sample | 5,058 | 4,095 |
| N_individuals | 693 | 547 |
**p < 0.05, ***p < 0.01, SE in parentheses; FE means fixed effect (the same applies below).
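Coefficients in a logistic model are on the log-odds scale, so exponentiating one gives an odds ratio, and dividing it by its standard error gives a Wald z-statistic. The sketch below works through the first coefficient row of Model 1 (−3.880 with SE 0.342). It assumes a normal approximation for the test statistic, which this record does not confirm as the authors' procedure, and the function names are hypothetical.

```python
import math


def odds_ratio(coef: float) -> float:
    """Convert a logit coefficient (log-odds) to an odds ratio."""
    return math.exp(coef)


def wald_z(coef: float, se: float) -> float:
    """Wald z-statistic: the coefficient divided by its standard error."""
    if se <= 0:
        raise ValueError("standard error must be positive")
    return coef / se


def two_sided_p(z: float) -> float:
    """Two-sided p-value under a standard normal approximation."""
    return math.erfc(abs(z) / math.sqrt(2.0))


coef, se = -3.880, 0.342  # first coefficient row, Model 1
z = wald_z(coef, se)
print(f"odds ratio = {odds_ratio(coef):.4f}, z = {z:.2f}, p = {two_sided_p(z):.3g}")
```

An odds ratio well below 1 means that a one-unit increase in the regressor is associated with much lower odds of the outcome, holding the other regressors fixed.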
The impact of LSM on the number of monthly posts by users (Threshold = 0.5).
| | Model 3 | Model 4 | Model 5 |
| | −0.158 | −4.490 | 1.187 |
| | (0.132) | (0.376) | (0.227) |
| | −0.262 | 0.921 | −2.676 |
| | (0.079) | (0.382) | (0.329) |
| | 1.715 | 1.912 | 2.317 |
| | (0.656) | (0.711) | (0.517) |
| | 0.624 | −0.198 | 0.779 |
| | (0.429) | (0.413) | (0.311) |
| | 0.190 | −0.270 | −0.154 |
| | (0.128) | (0.225) | (0.150) |
| | 0.229 | 0.501 | 0.606 |
| | (0.094) | (0.171) | (0.120) |
| | 0.165 | 0.123 | 0.230 |
| | (0.671) | (0.473) | (0.384) |
| | −0.094 | 0.129 | −0.499 |
| | (0.199) | (0.225) | (0.183) |
| | 0.346 | 0.026 | 0.149 |
| | (0.304) | (0.345) | (0.241) |
| | −0.481 | −0.614 | −0.997 |
| | (0.146) | (0.401) | (0.274) |
| | 0.149 | 0.280 | 0.387 |
| | (0.075) | (0.259) | (0.158) |
| | 0.072 | 0.389 | −0.165 |
| | (0.293) | (0.728) | (0.505) |
| | −0.032 | 1.353 | 0.101 |
| | (0.134) | (0.481) | (0.322) |
| Individual FE | Yes | Yes | Yes |
| Time FE | Yes | Yes | Yes |
| N_sample | 25,618 | 6,506 | 6,506 |
| N_individuals | 4,268 | 898 | 898 |
**p < 0.05, ***p < 0.01, SE in parentheses; FE means fixed effect.
Robustness test – regression results under different thresholds.
| | Model 6 | Model 7 | Model 8 | Model 9 |
| threshold | 0.6 | 0.6 | 0.7 | 0.7 |
| DV | | | | |
| | −4.201 | 1.139 | −4.490 | 1.187 |
| | (0.354) | (0.243) | (0.376) | (0.227) |
| | 1.464 | −3.520 | 0.921 | −2.676 |
| | (0.325) | (0.336) | (0.382) | (0.329) |
| | 1.612 | 2.671 | 1.912 | 2.317 |
| | (0.606) | (0.552) | (0.711) | (0.517) |
| | −0.105 | 0.996 | −0.198 | 0.779 |
| | (0.344) | (0.336) | (0.413) | (0.311) |
| | −0.364 | 0.086 | −0.270 | −0.154 |
| | (0.192) | (0.157) | (0.225) | (0.150) |
| | 0.563 | 0.638 | 0.501 | 0.606 |
| | (0.142) | (0.127) | (0.171) | (0.120) |
| | 0.201 | 0.273 | 0.123 | 0.230 |
| | (0.430) | (0.405) | (0.473) | (0.384) |
| | 0.090 | −0.500 | 0.129 | −0.499 |
| | (0.197) | (0.197) | (0.225) | (0.183) |
| | 0.174 | 0.194 | 0.026 | 0.149 |
| | (0.285) | (0.257) | (0.345) | (0.241) |
| | −0.847 | −0.792 | −0.614 | −0.997 |
| | (0.351) | (0.287) | (0.401) | (0.274) |
| | 0.308 | 0.266 | 0.280 | 0.387 |
| | (0.215) | (0.168) | (0.259) | (0.158) |
| | −0.389 | 0.271 | 0.389 | −0.165 |
| | (0.631) | (0.534) | (0.728) | (0.505) |
| | 0.914 | 0.019 | 1.353 | 0.101 |
| | (0.407) | (0.357) | (0.481) | (0.322) |
| Individual FE | Yes | Yes | Yes | Yes |
| Time FE | Yes | Yes | Yes | Yes |
| N_sample | 4,056 | 5,128 | 3,025 | 5,703 |
| N_individuals | 542 | 698 | 403 | 769 |
**p < 0.05, ***p < 0.01, SE in parentheses; FE means fixed effect.
Robustness test (Threshold = 0.4).
| | Model 10 | Model 11 |
| DV | | |
| | −0.415 | 0.216 |
| | (0.060) | (0.045) |
| | 0.659 | −1.883 |
| | (0.093) | (0.199) |
| | 2.315 | 3.096 |
| | (0.547) | (0.929) |
| | 0.477 | 1.237 |
| | (0.311) | (0.576) |
| | −0.160 | −0.023 |
| | (0.145) | (0.255) |
| | 0.442 | 0.585 |
| | (0.121) | (0.205) |
| | −0.052 | 1.080 |
| | (0.379) | (0.589) |
| | −0.169 | −1.073 |
| | (0.181) | (0.309) |
| | −0.087 | 0.655 |
| | (0.258) | (0.458) |
| | −1.077 | −1.263 |
| | (0.278) | (0.482) |
| | 0.266 | 0.805 |
| | (0.168) | (0.307) |
| | −0.030 | 0.628 |
| | (0.519) | (0.925) |
| | 0.518 | −1.179 |
| | (0.346) | (0.673) |
| Individual FE | Yes | Yes |
| Time FE | Yes | Yes |
| N_sample | 5,515 | 3,347 |
| N_individuals | 751 | 451 |
**p < 0.05, ***p < 0.01, SE in parentheses; FE means fixed effect.