Songhua Xu, Christopher Markson, Kaitlin L Costello, Cathleen Y Xing, Kitaw Demissie, Adana AM Llanos.
Abstract
BACKGROUND: As social media become increasingly popular online venues for engaging in communication about public health issues, it is important to understand how users promote knowledge and awareness about specific topics.
Keywords: Twitter; awareness; breast cancer; colorectal cancer; disparities; lung cancer; prostate cancer; social media
Year: 2016 PMID: 27227152 PMCID: PMC4869239 DOI: 10.2196/publichealth.5205
Source DB: PubMed Journal: JMIR Public Health Surveill ISSN: 2369-2960
Figure 1. Histogram showing the distribution of Tweet character lengths.
Figure 2. Histogram of the log of the character length of user timelines. The graph is presented in log form because timeline character lengths are more widely distributed.
Figure 3. Balanced accuracy and overall accuracy equations.
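The equations shown in Figure 3 are not reproduced in this text. As a minimal sketch, assuming the commonly used definitions (overall accuracy as the share of correctly classified items, and per-class balanced accuracy as the mean of one-vs-rest sensitivity and specificity), the two measures could be computed as follows. The function names and the two-class toy matrix are hypothetical illustrations, not taken from the study.

```python
def overall_accuracy(matrix):
    """Share of all items that lie on the diagonal (correctly classified)."""
    total = sum(sum(row) for row in matrix)
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    return correct / total


def balanced_accuracy(matrix, k):
    """Mean of one-vs-rest sensitivity and specificity for class k.

    Assumes rows are predicted classes and columns are reference classes.
    """
    n = len(matrix)
    total = sum(sum(row) for row in matrix)
    tp = matrix[k][k]
    fp = sum(matrix[k]) - tp                       # predicted k, reference not k
    fn = sum(matrix[i][k] for i in range(n)) - tp  # reference k, predicted not k
    tn = total - tp - fp - fn
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return (sensitivity + specificity) / 2


# Hypothetical two-class confusion matrix (not data from the study).
toy = [[8, 2],
       [1, 9]]

print(overall_accuracy(toy))
print(balanced_accuracy(toy, 0))
```

Balanced accuracy weights each class's sensitivity and specificity equally, so it is less dominated by the largest class than overall accuracy is.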
Text classification with synonym expansion model classification and accuracy results.
| Measure | Race and ethnicity | % |
| Balanced accuracy | Caucasian | 88.87 |
| Balanced accuracy | African American | 81.26 |
| Balanced accuracy | Asian | 72.32 |
| Balanced accuracy | Hispanic | 69.07 |
| Overall accuracy | All groups | 76.07 |
| Overall accuracy | Caucasian and African Americans | 88.30 |
Confusion matrix.
| Classification | Reference: Caucasian, n | Reference: African American, n | Reference: Asian, n | Reference: Hispanic, n |
| Caucasian | 1067 | 117 | 49 | 71 |
| African American | 890 | 1286 | 337 | 380 |
| Asian | 26 | 10 | 39 | 35 |
| Hispanic | 7 | 7 | 25 | 54 |
Distribution of unique active Twitter users during each month of the study period by race and ethnicity.
| Month | African American, n (%) | Caucasian, n (%) | Asian, n (%) | Hispanic, n (%) | Total, n |
| April | 49,104 (9.72) | 452,924 (89.64) | 1289 (0.25) | 1935 (0.38) | 505,252 |
| May [a] | 40,956 (12.76) | 277,169 (86.36) | 1177 (0.37) | 1646 (0.51) | 320,948 |
| July [a] | 43,349 (9.58) | 405,185 (89.57) | 1661 (0.37) | 2191 (0.48) | 452,386 |
| August | 54,740 (7.91) | 632,687 (91.47) | 1820 (0.26) | 2466 (0.36) | 691,713 |
| September | 52,224 (10.16) | 457,300 (89.02) | 1789 (0.35) | 2417 (0.47) | 513,730 |
| October | 50,120 (11.07) | 398,440 (88.02) | 1763 (0.39) | 2371 (0.52) | 452,694 |
| November | 50,060 (10.80) | 409,125 (88.30) | 1762 (0.38) | 2370 (0.51) | 463,317 |
| December | 48,247 (11.20) | 378,412 (87.86) | 1727 (0.40) | 2292 (0.53) | 430,678 |
| January | 30,707 (15.62) | 162,682 (82.75) | 1435 (0.73) | 1780 (0.91) | 196,604 |
[a] Tweets from May 13, 2014 through July 24, 2014 were not retained due to a system outage.
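The percentages in this table are each group's share of that month's total of unique active users. A minimal sketch of the arithmetic, using the April row as printed above, might look like this; the function name `group_shares` is a hypothetical helper, not something from the study.

```python
def group_shares(counts):
    """Return the monthly total and each group's percentage share of it."""
    total = sum(counts.values())
    shares = {group: 100 * n / total for group, n in counts.items()}
    return total, shares


# April user counts, copied from the table above.
april = {
    "African American": 49104,
    "Caucasian": 452924,
    "Asian": 1289,
    "Hispanic": 1935,
}

total, shares = group_shares(april)
print(total)
for group, pct in shares.items():
    print(f"{group}: {pct:.2f}%")
```

The four group counts sum to the row's reported total, which is a quick consistency check that can be repeated for the other months.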
Figure 4. Monthly frequency of cancer terms by race/ethnicity (African American, left axis; Caucasian, right axis) and for all Twitter users (right axis). Cancer terms are "Cancer" (top left), "Breast Cancer" (top right), "Prostate Cancer" (bottom left), and "Lung Cancer" (bottom right). Note the sharp decreases following cancer awareness months (Prostate Cancer Awareness Month [PCAM, September], Breast Cancer Awareness Month [BCAM, October], and Lung Cancer Awareness Month [LCAM, November]), particularly among African Americans. Both groups return to lower frequencies following awareness months; however, this pattern is more pronounced among African Americans, specifically following BCAM.
Statistical significance of pairwise differences in cancer term usage between African Americans and Caucasians during each month of the study period [a].
| Month | "Cancer", P value | "Breast cancer", P value | "Prostate cancer", P value | "Colorectal cancer", P value | "Lung cancer", P value |
| April | 0.00003 | 0.053025 | 0.014894 | 0.025347 | 0.080356 |
| May | 0.008194 | 0.584394 | 0.122251 | 0.095581 | 0.510364 |
| July | 0.013599 | <0.0001 | 0.006656 | 0.157299 | 0.890133 |
| August | <0.0001 | 0.001168 | 0.157209 | 0.312076 | 0.165111 |
| September | <0.0001 | 0.00007 | 0.017132 | 0.157299 | 0.013196 |
| October | <0.0001 | <0.0001 | 0.242175 | 0.974206 | 0.000162 |
| November | <0.0001 | <0.0001 | 0.027708 | 0.014306 | 0.000631 |
| December | 0.000266 | 0.000001 | 0.027575 | 0.317311 | 0.000067 |
| January | 0.241671 | 0.00945 | 0.1573 | 0.083265 | 0.91944 |
[a] Each user's total term usage was calculated by summing the frequency with which cancer terms appeared in their timeline.