Abstract
In this paper we take advantage of recent developments in identifying the demographic characteristics of Twitter users to explore the demographic differences between those who do and do not enable location services and those who do and do not geotag their tweets. We discuss the collation and processing of two datasets, one focusing on enabling geoservices and the other on tweet geotagging. We then investigate how opting in to either of these behaviours is associated with gender, age, class, the language in which tweets are written and the language in which users interact with the Twitter user interface. We find statistically significant differences for both behaviours for all demographic characteristics, although the magnitude of association differs substantially by factor. We conclude that there are significant demographic variations between those who opt in to geoservices and those who geotag their tweets. Notwithstanding the limitations of the data, we suggest that Twitter users who publish geographical information are not representative of the wider Twitter population.
Year: 2015 PMID: 26544601 PMCID: PMC4636345 DOI: 10.1371/journal.pone.0142209
Source DB: PubMed Journal: PLoS One ISSN: 1932-6203 Impact factor: 3.240
Crosstabulating whether location services are enabled and gender (‘Dataset1’).
| Gender: | Not Enabled: | Enabled: | Total: |
|---|---|---|---|
| Male | 55.0% (n = 2,864,158) | 45.0% (n = 2,346,886) | 5,211,044 |
| Female | 54.2% (n = 2,776,220) | 45.8% (n = 2,345,110) | 5,121,330 |
| Unisex | 54.1% (n = 651,809) | 45.9% (n = 552,957) | 1,204,766 |
| Unknown | 60.9% (n = 11,247,704) | 39.1% (n = 7,235,602) | 18,483,306 |
| Total | 58.4% (n = 17,539,891) | 41.6% (n = 12,480,555) | 30,020,446 |
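
The abstract reports statistically significant differences whose magnitude varies by factor. As a rough illustration of how both ideas can be checked against the gender crosstab above, the sketch below computes a Pearson chi-square statistic and Cramér's V from the published counts. The choice of test and effect-size measure is an assumption made for illustration (this record does not state which the authors used), and the function names are hypothetical.

```python
from math import sqrt

# Counts copied from the Dataset1 gender crosstab: (not enabled, enabled).
observed = {
    "Male": (2_864_158, 2_346_886),
    "Female": (2_776_220, 2_345_110),
    "Unisex": (651_809, 552_957),
    "Unknown": (11_247_704, 7_235_602),
}


def chi_square(table):
    """Return (Pearson chi-square statistic, grand total) for a dict of rows."""
    rows = list(table.values())
    row_totals = [sum(row) for row in rows]
    col_totals = [sum(col) for col in zip(*rows)]
    n = sum(row_totals)
    stat = 0.0
    for row, row_total in zip(rows, row_totals):
        for obs, col_total in zip(row, col_totals):
            expected = row_total * col_total / n  # count expected under independence
            stat += (obs - expected) ** 2 / expected
    return stat, n


def cramers_v(table):
    """Cramér's V: chi-square rescaled to [0, 1] by sample size and table shape."""
    stat, n = chi_square(table)
    n_rows = len(table)
    n_cols = len(next(iter(table.values())))
    k = min(n_rows, n_cols) - 1
    return sqrt(stat / (n * k))


stat, n = chi_square(observed)
v = cramers_v(observed)
print(f"chi-square = {stat:.1f}, n = {n}, Cramér's V = {v:.3f}")
```

With roughly 30 million users, even small percentage differences yield a very large chi-square statistic, which is why an effect-size measure such as Cramér's V is useful alongside the significance test.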
Crosstabulating geotagging users and gender (‘Dataset2’).
| Gender: | Not Used Geotagging: | Used Geotagging: | Total: |
|---|---|---|---|
| Male | 95.8% (n = 4,031,850) | 4.2% (n = 175,265) | 4,207,115 |
| Female | 95.9% (n = 4,088,213) | 4.1% (n = 174,922) | 4,263,135 |
| Unisex | 96.5% (n = 1,018,140) | 3.5% (n = 36,496) | 1,054,636 |
| Unknown | 97.6% (n = 13,919,963) | 2.4% (n = 344,415) | 14,264,378 |
| Total | 96.9% (n = 23,058,166) | 3.1% (n = 731,098) | 23,789,264 |
Fig 1. Comparing the age distribution of users who do and do not enable location services.
Fig 2. Comparing the age distribution of users who do and do not geotag tweets (y-axis logged).
Crosstabulating NS-SEC with whether location services are enabled.
| NS-SEC | Label | Not Enabled | Enabled | Total |
|---|---|---|---|---|
| 1 | Higher managerial, administrative & professional occupations | 46.6% (n = 4299) | 53.4% (n = 4931) | 9230 |
| 2 | Lower managerial, administrative & professional occupations | 47.2% (n = 12685) | 52.8% (n = 14171) | 26856 |
| 3 | Intermediate occupations | 47.5% (n = 4153) | 52.5% (n = 4593) | 8746 |
| 4 | Small employers & own account workers | 45.1% (n = 2477) | 54.9% (n = 3021) | 5498 |
| 5 | Lower supervisory & technical occupations | 44.9% (n = 534) | 55.1% (n = 654) | 1188 |
| 6 | Semi-routine occupations | 43.6% (n = 3987) | 56.4% (n = 5159) | 9146 |
| 7 | Routine occupations | 48.2% (n = 3451) | 51.8% (n = 3705) | 7156 |
| Total | | 46.6% (n = 31586) | 53.4% (n = 36234) | 67820 |
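
The percentages in this table are row percentages: each count divided by its row total. A minimal sketch of that arithmetic, using the counts from the table above, is given here; the helper name `row_percentages` is hypothetical and rounding to one decimal place is assumed to match the table.

```python
# (not enabled, enabled) counts per NS-SEC class, copied from the table above.
counts = {
    1: (4299, 4931),
    2: (12685, 14171),
    3: (4153, 4593),
    4: (2477, 3021),
    5: (534, 654),
    6: (3987, 5159),
    7: (3451, 3705),
}


def row_percentages(not_enabled, enabled):
    """Return (% not enabled, % enabled, row total), rounded to one decimal."""
    total = not_enabled + enabled
    return (
        round(100 * not_enabled / total, 1),
        round(100 * enabled / total, 1),
        total,
    )


# Column sums give the counts for the Total row.
total_not_enabled = sum(ne for ne, _ in counts.values())
total_enabled = sum(e for _, e in counts.values())

per_class = {k: row_percentages(*v) for k, v in counts.items()}
overall = row_percentages(total_not_enabled, total_enabled)

for ns_sec, (pct_not, pct_yes, total) in per_class.items():
    print(f"NS-SEC {ns_sec}: {pct_not}% not enabled, {pct_yes}% enabled (n = {total})")
print(f"Total: {overall[0]}% not enabled, {overall[1]}% enabled (n = {overall[2]})")
```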
Crosstabulating NS-SEC with geotagging users.
| NS-SEC | Label | Not Geotagged | Geotagged | Total |
|---|---|---|---|---|
| 1 | Higher managerial, administrative & professional occupations | 96.1% (n = 7,618) | 3.9% (n = 308) | 7,926 |
| 2 | Lower managerial, administrative & professional occupations | 96.5% (n = 22,935) | 3.5% (n = 835) | 23,770 |
| 3 | Intermediate occupations | 96.4% (n = 7,359) | 3.6% (n = 274) | 7,633 |
| 4 | Small employers & own account workers | 95.8% (n = 4,873) | 4.2% (n = 215) | 5,088 |
| 5 | Lower supervisory & technical occupations | 95.9% (n = 984) | 4.1% (n = 42) | 1,026 |
| 6 | Semi-routine occupations | 95.8% (n = 7,658) | 4.2% (n = 335) | 7,993 |
| 7 | Routine occupations | 96.7% (n = 6,152) | 3.3% (n = 209) | 6,361 |
| Total | | 96.3% (n = 57,579) | 3.7% (n = 2,218) | 59,797 |
Crosstabulating Twitter user interface language (top 20) with whether location services are enabled.
| Language: | Code: | Not Enabled: | Enabled: | Total: |
|---|---|---|---|---|
| English | en | 56.6% (n = 8143700) | 43.4% (n = 6234416) | 14378116 |
| Japanese | ja | 69.4% (n = 3603473) | 30.6% (n = 1589842) | 5193315 |
| Spanish | es | 48.8% (n = 1764145) | 51.2% (n = 1848193) | 3612338 |
| Arabic | ar | 70.1% (n = 919952) | 29.9% (n = 391723) | 1311675 |
| Portuguese | pt | 43.0% (n = 516463) | 57.0% (n = 685814) | 1202277 |
| Turkish | tr | 52.1% (n = 468241) | 47.9% (n = 431067) | 899308 |
| Russian | ru | 77.2% (n = 573039) | 22.8% (n = 169713) | 742752 |
| French | fr | 57.1% (n = 405485) | 42.9% (n = 304086) | 709571 |
| Indonesian | id | 44.4% (n = 273780) | 55.6% (n = 342348) | 616128 |
| Korean | ko | 73.2% (n = 202514) | 26.8% (n = 74000) | 276514 |
| Italian | it | 56.1% (n = 135032) | 43.9% (n = 105877) | 240909 |
| Thai | th | 55.5% (n = 124466) | 44.5% (n = 99627) | 224093 |
| German | de | 72.5% (n = 139954) | 27.5% (n = 53126) | 193080 |
| Dutch | nl | 61.3% (n = 74685) | 38.7% (n = 47184) | 121869 |
| Swedish | sv | 60.2% (n = 29872) | 39.8% (n = 19711) | 49583 |
| Chinese | zh | 75.2% (n = 35025) | 24.8% (n = 11558) | 46583 |
| Polish | pl | 62.6% (n = 27172) | 37.4% (n = 16208) | 43380 |
| Finnish | fi | 63.8% (n = 15840) | 36.2% (n = 8989) | 24829 |
| Catalan | ca | 53.4% (n = 11723) | 46.6% (n = 10250) | 21973 |
| Greek | el | 67.4% (n = 12320) | 32.6% (n = 5958) | 18278 |
| Total | | 58.4% (n = 17476881) | 41.6% (n = 12449690) | 29926571 |
Crosstabulating interface language (top 20) with whether users geotag or not.
| Language: | Code: | Not Geotagged | Geotagged | Total: |
|---|---|---|---|---|
| English | en | 96.6% (n = 10719718) | 3.4% (n = 373678) | 11093396 |
| Japanese | ja | 99.2% (n = 4673294) | 0.8% (n = 37175) | 4710469 |
| Spanish | es | 95.7% (n = 2583704) | 4.3% (n = 115959) | 2699663 |
| Portuguese | pt | 94.3% (n = 958167) | 5.7% (n = 57611) | 1015778 |
| Arabic | ar | 99.1% (n = 990961) | 0.9% (n = 8762) | 999723 |
| Russian | ru | 97.5% (n = 637263) | 2.5% (n = 16632) | 653895 |
| Turkish | tr | 91.2% (n = 574730) | 8.8% (n = 55530) | 630260 |
| French | fr | 97.4% (n = 516445) | 2.6% (n = 13781) | 530226 |
| Indonesian | id | 93.7% (n = 433745) | 6.3% (n = 29310) | 463055 |
| Korean | ko | 99.7% (n = 226643) | 0.3% (n = 644) | 227287 |
| Italian | it | 96.9% (n = 181098) | 3.1% (n = 5848) | 186946 |
| German | de | 98.7% (n = 147140) | 1.3% (n = 1978) | 149118 |
| Thai | th | 94.8% (n = 108667) | 5.2% (n = 5910) | 114577 |
| Dutch | nl | 97.1% (n = 90554) | 2.9% (n = 2690) | 93244 |
| Chinese | zh | 97.9% (n = 35956) | 2.1% (n = 786) | 36742 |
| Polish | pl | 98.6% (n = 35372) | 1.4% (n = 505) | 35877 |
| Swedish | sv | 96.7% (n = 32924) | 3.3% (n = 1137) | 34061 |
| Catalan | ca | 96.4% (n = 13177) | 3.6% (n = 498) | 13675 |
| Greek | el | 97.9% (n = 11622) | 2.1% (n = 245) | 11867 |
| Finnish | fi | 98.0% (n = 11035) | 2.0% (n = 222) | 11257 |
| Total | | 96.9% (n = 22982215) | 3.1% (n = 728901) | 23711116 |
Crosstabulating tweet language (top 20) with whether location services are enabled.
| Language: | Code: | Not Enabled | Enabled | Total: |
|---|---|---|---|---|
| English | en | 55.9% (n = 6232464) | 44.1% (n = 4921668) | 11154132 |
| Japanese | ja | 69.8% (n = 3699020) | 30.2% (n = 1600458) | 5299478 |
| Spanish | es | 49.1% (n = 1621107) | 50.9% (n = 1679340) | 3300447 |
| Arabic | ar | 70.5% (n = 1365579) | 29.5% (n = 572310) | 1937889 |
| Portuguese | pt | 41.5% (n = 481955) | 58.5% (n = 678749) | 1160704 |
| Indonesian | in | 44.2% (n = 499068) | 55.8% (n = 629732) | 1128800 |
| Turkish | tr | 54.0% (n = 489846) | 46.0% (n = 418055) | 907901 |
| Russian | ru | 81.8% (n = 676610) | 18.2% (n = 150450) | 827060 |
| French | fr | 56.2% (n = 327940) | 43.8% (n = 256048) | 583988 |
| Korean | ko | 71.1% (n = 243286) | 28.9% (n = 99087) | 342373 |
| Thai | th | 48.2% (n = 161163) | 51.8% (n = 173226) | 334389 |
| Tagalog | tl | 45.8% (n = 143638) | 54.2% (n = 169922) | 313560 |
| Italian | it | 54.4% (n = 110504) | 45.6% (n = 92461) | 202965 |
| German | de | 69.0% (n = 95398) | 31.0% (n = 42916) | 138314 |
| Dutch | nl | 58.1% (n = 72855) | 41.9% (n = 52541) | 125396 |
| Swedish | sv | 52.5% (n = 26326) | 47.5% (n = 23851) | 50177 |
| Polish | pl | 56.8% (n = 26953) | 43.2% (n = 20487) | 47440 |
| Haitian | ht | 53.6% (n = 24879) | 46.4% (n = 21502) | 46381 |
| Estonian | et | 55.8% (n = 24478) | 44.2% (n = 19370) | 43848 |
| Ukrainian | uk | 77.6% (n = 23306) | 22.4% (n = 6716) | 30022 |
| Total | | 58.4% (n = 16346375) | 41.6% (n = 11628889) | 27975264 |
Crosstabulating tweet language (top 20) with whether users geotag or not.
| Language: | Code: | Not Geotagged: | Geotagged: | Total: |
|---|---|---|---|---|
| English | en | 96.7% (n = 8083523) | 3.3% (n = 275321) | 8358844 |
| Japanese | ja | 99.2% (n = 4762822) | 0.8% (n = 38231) | 4801053 |
| Spanish | es | 95.6% (n = 2343493) | 4.4% (n = 107758) | 2451251 |
| Arabic | ar | 99.1% (n = 1352119) | 0.9% (n = 12707) | 1364826 |
| Portuguese | pt | 94.1% (n = 931212) | 5.9% (n = 58225) | 989437 |
| Indonesian | in | 93.0% (n = 812496) | 7.0% (n = 61579) | 874075 |
| Russian | ru | 98.0% (n = 719338) | 2.0% (n = 14816) | 734154 |
| Turkish | tr | 91.7% (n = 567254) | 8.3% (n = 51249) | 618503 |
| French | fr | 97.1% (n = 427340) | 2.9% (n = 12901) | 440241 |
| Tagalog | tl | 95.7% (n = 298373) | 4.3% (n = 13415) | 311788 |
| Korean | ko | 99.6% (n = 256037) | 0.4% (n = 1117) | 257154 |
| Thai | th | 94.4% (n = 184890) | 5.6% (n = 10899) | 195789 |
| Italian | it | 96.3% (n = 159880) | 3.7% (n = 6074) | 165954 |
| German | de | 98.0% (n = 114658) | 2.0% (n = 2331) | 116989 |
| Dutch | nl | 96.6% (n = 98897) | 3.4% (n = 3484) | 102381 |
| Polish | pl | 97.8% (n = 42613) | 2.2% (n = 978) | 43591 |
| Swedish | sv | 95.5% (n = 40047) | 4.5% (n = 1887) | 41934 |
| Haitian | ht | 96.1% (n = 39975) | 3.9% (n = 1621) | 41596 |
| Estonian | et | 94.5% (n = 36023) | 5.5% (n = 2094) | 38117 |
| Slovenian | sl | 96.3% (n = 26653) | 3.7% (n = 1031) | 27684 |
| Total | | 96.9% (n = 21297643) | 3.1% (n = 677718) | 21975361 |