Matthew L Williams, Pete Burnap, Luke Sloan.
Abstract
New and emerging forms of data, including posts harvested from social media sites such as Twitter, have become part of the sociologist's data diet. In particular, some researchers see an advantage in the perceived 'public' nature of Twitter posts, representing them in publications without seeking informed consent. While such practice may not be at odds with Twitter's terms of service, we argue there is a need to interpret these through the lens of social science research methods that imply a more reflexive ethical approach than provided in 'legal' accounts of the permissible use of these data in research publications. To challenge some existing practice in Twitter-based research, this article brings to the fore: (1) views of Twitter users through analysis of online survey data; (2) the effect of context collapse and online disinhibition on the behaviours of users; and (3) the publication of identifiable sensitive classifications derived from algorithms.
Keywords: Twitter; algorithms; computational social science; context collapse; ethics; social data science; social media
Year: 2017 PMID: 29276313 PMCID: PMC5718335 DOI: 10.1177/0038038517708140
Source DB: PubMed Journal: Sociology ISSN: 0038-0385
Sample descriptives (N = 564).
| Variable | Coding | % or mean | N or SD |
|---|---|---|---|
| Concern – university research | Not at all concerned | 37.2 | 136 |
| | Slightly concerned | 46.4 | 170 |
| | Quite concerned | 11.2 | 41 |
| | Very concerned | 5.2 | 19 |
| Concern – government research | Not at all concerned | 23.3 | 85 |
| | Slightly concerned | 27.7 | 101 |
| | Quite concerned | 25.5 | 93 |
| | Very concerned | 23.6 | 86 |
| Concern – commercial research | Not at all concerned | 16.8 | 61 |
| | Slightly concerned | 32.1 | 117 |
| | Quite concerned | 29.4 | 107 |
| | Very concerned | 21.7 | 79 |
| Expect to be asked for consent | Disagree | 7.2 | 26 |
| | Tend to disagree | 13.1 | 47 |
| | Tend to agree | 24.7 | 89 |
| | Agree | 55.0 | 198 |
| Expect to be anonymised | Disagree | 5.1 | 18 |
| | Tend to disagree | 4.8 | 17 |
| | Tend to agree | 13.7 | 48 |
| | Agree | 76.4 | 268 |
| Frequency of posts daily | Scale (range: 1 ‘Less than once’ to 7 ‘over 10’) | 1.75 | 1.23 |
| Post personal activity | Yes = 1 | 37.7 | 161 |
| Post personal photos | Yes = 1 | 19.0 | 81 |
| Knowledge of terms of service (ToS) consent | Yes = 1 | 75.5 | 317 |
| Net use (years) | Scale (range: 1 ‘Less than year’ to 9 ‘15+ years’) | 6.59 | 1.76 |
| Net use (hours per day) | Scale (range: 1 ‘Less than hour’ to 10 ‘10+ hours’) | 6.03 | 2.52 |
| Net skill | Scale (range: 1 ‘Novice’ to 10 ‘Expert’) | 7.69 | 1.60 |
| Sex | Male = 1 | 48.93 | 276 |
| Age | Scale (range: 18 to 83) | 25.38 | 10.17 |
| Sexual orientation | Heterosexual = 1 | 83.6 | 357 |
| Ethnicity | White = 1 | 91.1 | 389 |
| Relationship status | Partnered = 1 | 45.4 | 194 |
| Income | Scale (range: 1 ‘below 10K’ to 11 ‘100K+’) | 3.72 | 3.07 |
| Has child under 16 | Yes = 1 | 7.3 | 31 |
Notes: Mean and standard deviation are given for scale variables. Sample size is reduced for some variables due to missing data.
Ordered regression predicting concern about using Twitter data in three research settings.
| Predictor | University B | University Exp(B) | Government B | Government Exp(B) | Commercial B | Commercial Exp(B) |
|---|---|---|---|---|---|---|
| Frequency of posts | 0.048 | 1.05 | −0.067 | 0.94 | −0.227 | 0.80 |
| Post personal activity | 0.216 | 1.24 | 0.101 | 1.11 | 0.470 | 1.60 |
| Post personal photos | 0.132 | 1.14 | 0.119 | 1.13 | 0.133 | 1.14 |
| Knowledge of ToS consent | −0.465 | 0.63 | −0.246 | 0.78 | −0.289 | 0.75 |
| Net use (years) | −0.041 | 0.96 | −0.059 | 0.94 | 0.059 | 1.06 |
| Net use (hours per day) | 0.092 | 1.10 | 0.089 | 1.09 | 0.062 | 1.06 |
| Net skill | 0.078 | 1.08 | 0.149 | 1.16 | 0.135 | 1.14 |
| Sex | −0.659 | 0.52 | −0.291 | 0.75 | −0.353 | 0.70 |
| Age | 0.003 | 1.00 | 0.072 | 1.07 | 0.066 | 1.07 |
| Sexual orientation | −0.27 | 0.76 | −0.752 | 0.47 | −0.653 | 0.52 |
| Ethnicity | 0.055 | 1.06 | −0.311 | 0.73 | −0.422 | 0.66 |
| Relationship status | −0.414 | 0.66 | −0.206 | 0.81 | −0.363 | 0.70 |
| Income | −0.045 | 0.96 | −0.045 | 0.96 | −0.053 | 0.95 |
| Has child under 16 | 0.846 | 2.33 | 0.27 | 1.31 | 0.498 | 1.65 |
| Model fit | | | | | | |
| −2 log likelihood | 790.730 | | 944.497 | | 920.106 | |
| Model chi-square | 31.182 | | 65.712 | | 66.789 | |
| d.f. | 15 | | 15 | | 15 | |
| sig. | 0.00 | | 0.00 | | 0.00 | |
| Cox and Snell pseudo R² | 0.08 | | 0.17 | | 0.17 | |
| Nagelkerke pseudo R² | 0.09 | | 0.18 | | 0.18 | |
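
As a reading aid, the Exp(B) columns can be reproduced from the B columns by exponentiation. The sketch below assumes the ordered regression uses a logit link, which the table does not state, so that Exp(B) is the odds ratio for reporting a higher level of concern. The three coefficients are copied from the University column; the dictionary and function names are illustrative only.

```python
import math

# Coefficients (B) copied from the University column of the table above.
coefficients = {
    "Knowledge of ToS consent": -0.465,
    "Sex": -0.659,
    "Has child under 16": 0.846,
}


def odds_ratio(b: float) -> float:
    """Return Exp(B), the odds ratio implied by a coefficient B.

    Assumes a logit link, which the source table does not confirm.
    """
    return math.exp(b)


# Round to two decimals to match the precision reported in the table.
odds_ratios = {name: round(odds_ratio(b), 2) for name, b in coefficients.items()}

for name, value in odds_ratios.items():
    print(f"{name}: Exp(B) = {value:.2f}")
```

Under that assumption, an Exp(B) above 1 (for example 2.33 for having a child under 16) indicates higher odds of greater concern, and a value below 1 (for example 0.52 for sex, coded Male = 1) indicates lower odds.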
Ordered regression predicting expectation of request for informed consent and anonymity in Twitter research in university settings.
| Predictor | Informed consent B | Informed consent Exp(B) | Anonymity B | Anonymity Exp(B) |
|---|---|---|---|---|
| Frequency of posts | −0.05 | 0.95 | −0.097 | 0.91 |
| Post personal activity | 0.034 | 1.03 | 0.311 | 1.36 |
| Post personal photos | −0.272 | 0.76 | 0.471 | 1.61 |
| Knowledge of ToS consent | −0.478 | 0.62 | 0.115 | 1.12 |
| Net use (years) | −0.155 | 0.86 | −0.105 | 0.90 |
| Net use (hours per day) | 0.055 | 1.06 | 0.049 | 1.05 |
| Net skill | −0.063 | 0.94 | −0.109 | 0.90 |
| Sex | −0.241 | 0.79 | −0.385 | 0.68 |
| Age | −0.020 | 0.98 | 0.017 | 1.02 |
| Sexual orientation | −0.167 | 0.85 | 0.004 | 1.00 |
| Ethnicity | 0.160 | 1.17 | −1.369 | 3.90 |
| Relationship status | −0.019 | 0.98 | −0.129 | 0.88 |
| Income | −0.021 | 0.98 | −0.004 | 1.00 |
| Has child under 16 | 0.243 | 1.27 | 0.052 | 1.05 |
| Model fit | | | | |
| −2 log likelihood | 788.767 | | 526.805 | |
| Model chi-square | 24.762 | | 18.68 | |
| d.f. | 15 | | 15 | |
| sig. | 0.00 | | 0.00 | |
| Cox and Snell pseudo R² | 0.09 | | 0.09 | |
| Nagelkerke pseudo R² | 0.09 | | 0.09 | |
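
For readers unfamiliar with the model fit rows, the standard textbook definitions of the two pseudo R² statistics are sketched below. The symbols are not taken from the original and are assumptions for illustration: $L_0$ is the likelihood of the intercept-only model, $L_M$ the likelihood of the fitted model, and $n$ the number of cases in the model. The tables do not report $n$ or $L_0$, so the published values cannot be recomputed from this page alone.

```latex
\begin{align}
\chi^2_{\text{model}} &= -2\ln L_0 - \left(-2\ln L_M\right) \\
R^2_{\text{CS}} &= 1 - \left(\frac{L_0}{L_M}\right)^{2/n}
                 = 1 - \exp\!\left(-\frac{\chi^2_{\text{model}}}{n}\right) \\
R^2_{\text{N}} &= \frac{R^2_{\text{CS}}}{1 - L_0^{2/n}}
\end{align}
```

The Nagelkerke statistic rescales the Cox and Snell value by its maximum attainable value, which is why it is never smaller than the Cox and Snell figure in the rows above.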
Figure 1. Decision flow chart for publication of Twitter communications.