Abstract
Across social media platforms, users (sub)consciously represent themselves in a way that is appropriate for their intended audience. This has unknown impacts on studies with unobtrusive designs based on digital (social) platforms, and on studies of contemporary social phenomena in online settings. A lack of appropriate methods to identify, control for, and mitigate the effects of self-representation (the propensity to express socially responding characteristics or self-censorship in digital settings) hinders the ability of researchers to confidently interpret and generalize their findings. This article proposes applying boosted regression modelling to fill this research gap. A case study of paid Amazon Mechanical Turk workers (n = 509) is presented in which workers completed psychometric surveys and provided anonymized access to their Facebook timelines. Our research finds indicators of self-representation on Facebook, facilitating suggestions for its mitigation. We validate the use of LIWC for Facebook personality studies and find discrepancies with extant literature on the use of LIWC-only approaches in unobtrusive designs. Using survey data and LIWC sentiment categories as predictors, the boosted regression model classified the Five Factor personality model with an average accuracy of 74.6%. The contribution of this work is an accurate prediction of psychometric information based on short, informal text.
Year: 2017 PMID: 28926569 PMCID: PMC5604947 DOI: 10.1371/journal.pone.0184417
Source DB: PubMed Journal: PLoS One ISSN: 1932-6203 Impact factor: 3.240
Fig 1. Workflow illustrating the steps to acquire, analyse, and interpret text data.
Fig 2. Model representation of regression analysis.
Average and Standard Deviation per profile (n = 283).
| Per Profile | Mean | Standard Deviation | Median |
|---|---|---|---|
| Words Used | 9379 | 1578 | 1726 |
| +6 Letter Words | 17 | .37 | 16 |
| Words/Sentence | 109 | 50 | 17 |
Fig 3. Positive and negative sentiment usage across the sample population (logarithmic scale).
Fig 4. Gendered usage of confidence-expressing statements on Facebook profiles.
LIWC dictionary attributes significantly predicting the trait openness.
| Model Term (Openness) | Coefficient | Lower | Upper | Std. error | t | Sig. |
|---|---|---|---|---|---|---|
| | 3.058 | 2.781 | 3.334 | 0.14 | 21.766 | 0.000 |
| | 0.108 | 0.006 | 0.209 | 0.052 | 2.086 | 0.038 |
| | 0.077 | 0.017 | 0.138 | 0.031 | 2.511 | 0.013 |
| | 0.035 | 0.005 | 0.066 | 0.015 | 2.316 | 0.021 |
| | 0.007 | 0.001 | 0.013 | 0.003 | 2.408 | 0.017 |
| | 0.001 | 0 | 0.003 | 0.001 | 2.809 | 0.005 |
| | -0.033 | -0.063 | -0.002 | 0.015 | -2.107 | 0.036 |
| | -0.042 | -0.083 | 0 | 0.021 | -1.97 | 0.05 |
| | -0.078 | -0.141 | -0.016 | 0.032 | -2.462 | 0.014 |
| | -0.107 | -0.19 | -0.024 | 0.042 | -2.524 | 0.012 |
| | -0.117 | -0.191 | -0.044 | 0.037 | -3.131 | 0.002 |
| | -0.458 | -0.83 | -0.086 | 0.189 | -2.423 | 0.016 |
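The numeric columns in the openness table above are arithmetically linked, which helps when reading the rows. The following is a minimal sketch, assuming the Lower and Upper columns are 95% confidence bounds under a normal approximation (the 1.96 multiplier is an assumption, not stated in the source) and that the value reported after the standard error is the coefficient divided by its standard error. The three rows are copied from the table; `recompute` is a hypothetical helper name used only for illustration.

```python
# Rows copied from the openness table:
# (coefficient, lower, upper, std_error, reported_statistic)
rows = [
    (3.058, 2.781, 3.334, 0.140, 21.766),
    (0.108, 0.006, 0.209, 0.052, 2.086),
    (-0.117, -0.191, -0.044, 0.037, -3.131),
]

Z_95 = 1.96  # assumption: 95% bounds with a normal approximation


def recompute(coef, std_error):
    """Recompute (statistic, lower, upper) from a coefficient and its standard error."""
    statistic = coef / std_error
    lower = coef - Z_95 * std_error
    upper = coef + Z_95 * std_error
    return statistic, lower, upper


for coef, lower, upper, se, stat in rows:
    r_stat, r_lower, r_upper = recompute(coef, se)
    print(
        f"coef={coef:+.3f}  statistic: reported {stat:+.3f}, recomputed {r_stat:+.3f}  "
        f"bounds: reported [{lower:+.3f}, {upper:+.3f}], "
        f"recomputed [{r_lower:+.3f}, {r_upper:+.3f}]"
    )
```

The recomputed values differ slightly from the reported ones because the table rounds the standard errors to two or three decimals.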
LIWC dictionary attributes significantly predicting the trait neuroticism.
| Model Term (Neuroticism) | Coefficient | Lower | Upper | Std. error | t | Sig. |
|---|---|---|---|---|---|---|
| | 3.056 | 2.746 | 3.366 | 0.157 | 19.419 | 0.000 |
| | 0.322 | 0.118 | 0.525 | 0.103 | 3.116 | 0.002 |
| | 0.303 | 0.001 | 0.606 | 0.153 | 1.977 | 0.049 |
| | 0.217 | 0.063 | 0.37 | 0.078 | 2.784 | 0.006 |
| | 0.085 | 0.011 | 0.159 | 0.038 | 2.272 | 0.024 |
| | 0.052 | 0.016 | 0.087 | 0.018 | 2.845 | 0.005 |
| | -0.038 | -0.071 | -0.005 | 0.017 | -2.274 | 0.024 |
| | -0.046 | -0.08 | -0.012 | 0.017 | -2.696 | 0.007 |
| | -0.074 | -0.142 | -0.006 | 0.034 | -2.142 | 0.033 |
| | -0.13 | -0.224 | -0.036 | 0.048 | -2.719 | 0.007 |
Prediction accuracy, explained variance, and nested cross-validation values of the five factor personality traits compared to the accuracy and explained variance of [48].
| Trait Name | Prediction Accuracy (ALM) | R² (ALM) | Nested CV Mean Linear Correlation | Schwartz et al. R² (LIWC only) | Schwartz et al. R² (LIWC combined with topics and words) |
|---|---|---|---|---|---|
| | 65.0 | 0.47 | 0.62 | 0.29 | 0.42 |
| | 66.7 | 0.43 | 0.68 | 0.29 | 0.35 |
| | 77.9 | 0.56 | 0.70 | 0.27 | 0.38 |
| | 63.5 | 0.46 | 0.68 | 0.25 | 0.31 |
| | 70.8 | 0.50 | 0.61 | 0.21 | 0.31 |
| Average | 68.8 | 0.49 | 0.66 | 0.26 | 0.35 |
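The final row of the table above appears to be the column-wise average of the five trait rows. A quick check, sketched below with the values copied from the table, recomputes each column mean and compares it with the reported last row. The variable names are illustrative, and the trait labels are left out because they are not available in this record.

```python
from statistics import mean

# Five trait rows copied from the table above.
# Columns: prediction accuracy (ALM), R2 (ALM), nested CV mean linear correlation,
#          Schwartz et al. (LIWC only), Schwartz et al. (LIWC with topics and words).
trait_rows = [
    (65.0, 0.47, 0.62, 0.29, 0.42),
    (66.7, 0.43, 0.68, 0.29, 0.35),
    (77.9, 0.56, 0.70, 0.27, 0.38),
    (63.5, 0.46, 0.68, 0.25, 0.31),
    (70.8, 0.50, 0.61, 0.21, 0.31),
]
reported_last_row = (68.8, 0.49, 0.66, 0.26, 0.35)

# zip(*rows) transposes the rows into columns so each column can be averaged.
column_means = [mean(column) for column in zip(*trait_rows)]

for computed, reported in zip(column_means, reported_last_row):
    print(f"computed {computed:.3f}  reported {reported}")
```

Small differences between the computed and reported means are consistent with the table entries having been rounded before publication.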
Performance comparison of standard ALM results and 10-fold cross-validated (CV) ALM results.
| Min CV Linear Correlation | Mean CV Linear Correlation | Max CV Linear Correlation | ALM Results | |
|---|---|---|---|---|
| 0.422 | 0.6247 | 0.888 | 0.650 | |
| 0.549 | 0.6802 | 0.793 | 0.667 | |
| 0.311 | 0.701 | 0.938 | 0.779 | |
| 0.456 | 0.6754 | 0.803 | 0.635 | |
| 0.354 | 0.6149 | 0.758 | 0.708 |
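The minimum, mean, and maximum columns above summarize per-fold linear correlations from 10-fold cross-validation. The sketch below illustrates that summary on synthetic data only. It is not the authors' pipeline: it swaps in a one-predictor least-squares fit for the boosted model, uses plain (not nested) k-fold splitting, and all names (`kfold_correlations`, `fit_line`, `pearson`) and the generated data are assumptions made for illustration.

```python
import random
from statistics import mean


def pearson(xs, ys):
    """Pearson linear correlation between two equal-length sequences."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def fit_line(xs, ys):
    """Ordinary least squares for one predictor; returns (slope, intercept)."""
    mx, my = mean(xs), mean(ys)
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sum(
        (x - mx) ** 2 for x in xs
    )
    return slope, my - slope * mx


def kfold_correlations(xs, ys, k=10, seed=0):
    """Fit on k-1 folds, predict the held-out fold, return one correlation per fold."""
    indices = list(range(len(xs)))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::k] for i in range(k)]  # k roughly equal, disjoint folds
    scores = []
    for held_out in folds:
        held = set(held_out)
        train = [i for i in indices if i not in held]
        slope, intercept = fit_line([xs[i] for i in train], [ys[i] for i in train])
        predictions = [slope * xs[i] + intercept for i in held_out]
        scores.append(pearson(predictions, [ys[i] for i in held_out]))
    return scores


# Synthetic stand-in data (NOT study data): one feature and a noisy trait score.
rng = random.Random(42)
feature = [rng.uniform(0, 10) for _ in range(200)]
trait = [3.0 + 0.2 * v + rng.gauss(0, 0.5) for v in feature]

scores = kfold_correlations(feature, trait, k=10)
print(f"min={min(scores):.3f}  mean={mean(scores):.3f}  max={max(scores):.3f}")
```

Reporting the spread across folds, rather than a single score, shows how much the estimate depends on which profiles land in the held-out fold.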
LIWC dictionary attributes significantly predicting the trait conscientiousness.
| Model Term (Conscientiousness) | Coefficient | Lower | Upper | Std. error | t | Sig. |
|---|---|---|---|---|---|---|
| | 3.232 | 3.002 | 3.462 | 0.117 | 27.717 | 0.000 |
| | 0.449 | 0.156 | 0.741 | 0.149 | 3.021 | 0.003 |
| | 0.258 | 0.119 | 0.397 | 0.071 | 3.657 | 0.000 |
| | 0.168 | 0.023 | 0.313 | 0.074 | 2.286 | 0.023 |
| | 0.068 | 0.025 | 0.112 | 0.022 | 3.1 | 0.002 |
| | 0.066 | 0.003 | 0.129 | 0.032 | 2.052 | 0.041 |
| | 0.056 | 0.021 | 0.09 | 0.018 | 3.16 | 0.002 |
| | 0.044 | 0.001 | 0.086 | 0.022 | 2.001 | 0.046 |
| | 0.013 | 0.004 | 0.021 | 0.004 | 2.865 | 0.005 |
| | 0.01 | 0.002 | 0.017 | 0.004 | 2.497 | 0.013 |
| | -0.009 | -0.016 | -0.002 | 0.003 | -2.549 | 0.011 |
| | -0.027 | -0.052 | -0.003 | 0.012 | -2.22 | 0.027 |
| | -0.028 | -0.05 | -0.006 | 0.011 | -2.51 | 0.013 |
| | -0.076 | -0.128 | -0.024 | 0.027 | -2.863 | 0.005 |
| | -0.079 | -0.156 | -0.003 | 0.039 | -2.043 | 0.042 |
| | -0.097 | -0.163 | -0.031 | 0.033 | -2.9 | 0.004 |
| | -0.097 | -0.18 | -0.014 | 0.042 | -2.289 | 0.023 |
| | -0.1 | -0.195 | -0.004 | 0.048 | -2.061 | 0.04 |
| | -0.138 | -0.266 | -0.011 | 0.065 | -2.135 | 0.034 |
| | -0.139 | -0.274 | -0.004 | 0.069 | -2.021 | 0.044 |
| | -0.176 | -0.349 | -0.002 | 0.088 | -1.992 | 0.047 |
| | -0.335 | -0.548 | -0.123 | 0.108 | -3.108 | 0.002 |
LIWC dictionary attributes significantly predicting the trait extraversion.
| Model Term (Extraversion) | Coefficient | Lower | Upper | Std. error | t | Sig. |
|---|---|---|---|---|---|---|
| | 2.97 | 2.638 | 3.301 | 0.168 | 17.63 | 0.000 |
| | 0.466 | 0.023 | 0.908 | 0.225 | 2.073 | 0.039 |
| | 0.206 | 0.058 | 0.354 | 0.075 | 2.739 | 0.007 |
| | 0.204 | 0.022 | 0.386 | 0.092 | 2.21 | 0.028 |
| | 0.172 | 0.105 | 0.239 | 0.034 | 5.052 | 0.000 |
| | 0.172 | 0.063 | 0.281 | 0.055 | 3.12 | 0.002 |
| | 0.125 | 0.048 | 0.202 | 0.039 | 3.199 | 0.002 |
| | 0.101 | 0.031 | 0.17 | 0.035 | 2.865 | 0.005 |
| | 0.067 | 0.019 | 0.114 | 0.024 | 2.779 | 0.006 |
| | 0.028 | 0.006 | 0.051 | 0.011 | 2.541 | 0.012 |
| | 0.003 | 0.001 | 0.004 | 0.001 | 4.049 | 0.000 |
| | -0.017 | -0.033 | -0.001 | 0.008 | -2.094 | 0.037 |
| | -0.054 | -0.075 | -0.032 | 0.011 | -4.933 | 0.000 |
| | -0.101 | -0.18 | -0.022 | 0.04 | -2.518 | 0.012 |
| | -0.105 | -0.204 | -0.006 | 0.05 | -2.094 | 0.037 |
| | -0.111 | -0.207 | -0.015 | 0.049 | -2.288 | 0.023 |
| | -0.568 | -1.055 | -0.082 | 0.247 | -2.3 | 0.022 |
| | -0.592 | -1.158 | -0.025 | 0.288 | -2.057 | 0.041 |
LIWC dictionary attributes significantly predicting the trait agreeableness.
| Model Term (Agreeableness) | Coefficient | Lower | Upper | Std. error | t | Sig. |
|---|---|---|---|---|---|---|
| | 2.71 | 2.426 | 2.994 | 0.144 | 18.798 | 0.000 |
| | 0.151 | 0.09 | 0.213 | 0.031 | 4.843 | 0.000 |
| | 0.13 | 0.046 | 0.213 | 0.042 | 3.059 | 0.002 |
| | 0.118 | 0.025 | 0.21 | 0.047 | 2.512 | 0.013 |
| | 0.083 | 0.023 | 0.143 | 0.03 | 2.739 | 0.007 |
| | 0.063 | 0.001 | 0.126 | 0.032 | 2.006 | 0.046 |
| | 0.004 | 0.001 | 0.006 | 0.001 | 3.101 | 0.002 |
| | 0.002 | 0.001 | 0.003 | 0.001 | 4.672 | 0.000 |
| | -0.008 | -0.015 | -0.001 | 0.004 | -2.26 | 0.025 |
| | -0.025 | -0.042 | -0.009 | 0.008 | -2.999 | 0.003 |
| | -0.043 | -0.077 | -0.009 | 0.017 | -2.481 | 0.014 |
| | -0.092 | -0.177 | -0.008 | 0.043 | -2.15 | 0.033 |
| | -0.14 | -0.272 | -0.007 | 0.067 | -2.08 | 0.039 |