Meeta Pradhan<sup>1</sup>, Anand Kulanthaivel<sup>2</sup>, Josette Jones<sup>2</sup>, Masoud Hosseini<sup>2</sup>, Mahmood Hosseini<sup>3</sup>.
Abstract
BACKGROUND: The increasing use of social media and mHealth apps has generated new opportunities for health care consumers to share information about their health and well-being. Information shared through social media contains not only medical information but also valuable information about how survivors manage disease and recovery in the context of daily life.
Keywords: data interpretation; infodemiology; natural language processing; patient-generated information; social media; statistical analysis
Year: 2018; PMID: 30497991; PMCID: PMC6293240; DOI: 10.2196/medinform.9162
Source DB: PubMed; Journal: JMIR Med Inform
Figure 1. Overview of the methods used to analyze the study content.
Breast cancer websites explored.
| Website name | Site URL | Country | Forums | Threads | Posts | Members |
| Breastcancer.org Community | community.breastcancer.org | US<sup>a</sup> | 80 | 121,688 | 3,608,324 | 153,620 |
| Breast Cancer Care | forum.breastcancercare.co.uk | UK<sup>b</sup> | 54 | 36,949 | 782,486 | N/A<sup>c</sup> |
| Susan G Komen Foundation: Message Board | apps.komen.org/Forums | US | 24 | 44,175 | 354,592 | 26,883 |
| Triple Negative Breast Cancer: Forums | forum.tnbcfoundation.org | US | 17 | 9641 | 100,706 | 123,427 |
| No Surrender Breast Cancer Foundation | nosurrenderbreastcancersupportforum.com | US | 36 | 2443 | 55,498 | 5549 |
<sup>a</sup>US: United States.
<sup>b</sup>UK: United Kingdom.
<sup>c</sup>N/A: not applicable.
The partial table of topics generated by Machine Learning Language Toolkit in the 30-topic model, with interpretations (the list goes on up to the 30th topic; only 3 are shown for brevity).
| Machine Learning Language Toolkit topic identifier | Topic label | Topic keywords |
| 1 | Diagnostic testing and waiting for results | breast biopsy cancer lump results ultrasound benign surgeon mammogram doctor mri waiting back mammo good radiologist feel pain left i'm |
| 2 | Side effects of inflammation and its treatment | breast ibc skin symptoms pain rash red cancer nipple biopsy infection diagnosed antibiotics swollen treatment left specialist redness swelling lymph |
| 3 | Positive results after recurrence | chemo stage years cancer treatment nodes onc tumor triple negative taxol positive rads year diagnosed node recurrence congratulations lymph radiation |
A portion of the file-feature set generated by Machine Learning Language Toolkit software (the list goes on up to the 80th file and 30th topic; values were truncated for brevity of display).
| File identifier<sup>a</sup> | Topic ID<sup>b</sup> | Strength<sup>c</sup> | Topic ID | Strength | Topic ID | Strength | Topic ID | Strength | Topic ID | Strength |
| F100 | 12 | 0.275 | 18 | 0.269 | 2 | 0.251 | 5 | 0.06 | 7 | 0.053 |
| F102 | 2 | 0.542 | 18 | 0.136 | 7 | 0.087 | 12 | 0.056 | 1 | 0.04 |
| F104 | 2 | 0.315 | 14 | 0.118 | 1 | 0.104 | 7 | 0.09 | 20 | 0.043 |
| F105 | 2 | 0.295 | 11 | 0.25 | 6 | 0.213 | 7 | 0.067 | 14 | 0.042 |
<sup>a</sup>Scraped forum file.
<sup>b</sup>Topic identifier: Machine Learning Language Toolkit-generated topics.
<sup>c</sup>Weight of topic in the file.
Figure 2. Distribution of qualitative content analysis-generated categories according to the number of forum threads that each manual category possesses. DX: Diagnosed; TLD: Top-Level Domain; NDC: Not Diagnosed but Concerned.
Topics #8 and #29 with their latent Dirichlet allocation strengths and the authors' topic label interpretations.
| Topic identifier | Latent Dirichlet allocation strength | Topic words | Authors’ topic label |
| 8 | 1.38724 | cancer chemo years feel life family mom time support things breast people treatment don’t husband care friends diagnosed talk mother | Hope, love, family, and friends |
| 29 | 0.19954 | hair book pink survivor happy deb health country president shirley obama congratulations cats article eye mammo fumi beth beautiful vote | Daily living and breast cancer |
Figure 3. File-file similarity matrix.
Top scored file-file similarity measures.
| File identifier | Associated files (similarity score) |
| F102 | F133 (0.85), F144 (0.91), F152 (0.97), F116 (0.95) |
| F104 | F109 (0.89), F142 (0.81), F150 (0.82), F27 (0.89) |
| F108 | F132 (0.94), F137 (0.86), F145 (0.97), F5 (0.90), F71 (0.93), F88 (0.86), F96 (0.97) |
| F109 | F104 (0.89), F142 (0.89), F127 (0.85) |
| F110 | F26 (0.80) |
| F111 | F132 (0.80), F68 (0.87) |
| F112 | F47 (0.92), F93 (0.94) |
| F113 | F139 (0.89), F55 (0.87) |
| F133 | F102 (0.85), F135 (0.86), F145 (0.90), F5 (0.87), F71 (0.94), F96 (0.94), F88 (0.87) |
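The similarity measure behind these scores is not stated in this record. As a minimal sketch, assuming cosine similarity over each file's topic-strength vector, the four rows shown in the file-feature table can be compared as follows. The names `file_topics`, `cosine_similarity`, and `similarity_matrix` are illustrative, and because only the top five strengths per file are available here (all other topics are treated as 0), the resulting values will not reproduce the scores in the table.

```python
import math

# Top-5 topic strengths per file, copied from the truncated file-feature table.
# Assumption: topics not listed for a file have strength 0.
file_topics = {
    "F100": {12: 0.275, 18: 0.269, 2: 0.251, 5: 0.06, 7: 0.053},
    "F102": {2: 0.542, 18: 0.136, 7: 0.087, 12: 0.056, 1: 0.04},
    "F104": {2: 0.315, 14: 0.118, 1: 0.104, 7: 0.09, 20: 0.043},
    "F105": {2: 0.295, 11: 0.25, 6: 0.213, 7: 0.067, 14: 0.042},
}


def cosine_similarity(a, b):
    """Cosine similarity between two sparse {topic_id: strength} vectors."""
    dot = sum(weight * b.get(topic, 0.0) for topic, weight in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # an empty vector is treated as similar to nothing
    return dot / (norm_a * norm_b)


def similarity_matrix(vectors):
    """Return {(file_i, file_j): similarity} for every ordered pair of files."""
    ids = sorted(vectors)
    return {
        (i, j): cosine_similarity(vectors[i], vectors[j])
        for i in ids
        for j in ids
    }


if __name__ == "__main__":
    matrix = similarity_matrix(file_topics)
    for (i, j), score in sorted(matrix.items()):
        if i < j:  # print each unordered pair once
            print(f"{i} vs {j}: {score:.2f}")
```

Storing each file as a sparse dictionary keeps the sketch independent of the total number of topics; with the full 30-topic vectors the same functions would apply unchanged.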
Figure 4. Topic-topic similarity matrix.
Most significant topics identified via multiple linear regression analysis.
| Topic identifier | Topic label | Akaike information criterion values |
| 21 | Lingering side effects while in remission | −642.75 |
| 18 | Chemotherapy side effects and change of treatment | −641.98 |
| 10 | Radiation and side effects | −633.17 |
| 7 | Genetic risk and testing | −620.41 |
| 25 | Support from caregiver and medical team for recovery long term | −571.78 |
| 11 | Looking for support from people in similar circumstances | −412.32 |
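How the Akaike information criterion values above were computed is not described in this record. As a hedged illustration only, one common form for least-squares fits is AIC = n ln(RSS/n) + 2k (additive constants dropped), where n is the number of observations, RSS the residual sum of squares, and k the number of estimated parameters. The sketch below applies that form to a hypothetical one-predictor toy data set; the data, the helper names `fit_line` and `aic_least_squares`, and the choice of k are assumptions, not the authors' procedure.

```python
import math


def aic_least_squares(residuals, n_params):
    """AIC = n * ln(RSS / n) + 2k for a least-squares fit (constants dropped)."""
    n = len(residuals)
    rss = sum(r * r for r in residuals)
    if n == 0 or rss <= 0.0:
        raise ValueError("need at least one residual and a nonzero RSS")
    return n * math.log(rss / n) + 2 * n_params


def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept (one predictor)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Hypothetical toy data, not taken from the study.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [0.1, 0.9, 2.1, 2.9, 4.1]

slope, intercept = fit_line(xs, ys)
residuals = [y - (slope * x + intercept) for x, y in zip(xs, ys)]

# k = 2 counts the slope and the intercept.
aic = aic_least_squares(residuals, n_params=2)

if __name__ == "__main__":
    print(f"slope={slope:.3f} intercept={intercept:.3f} AIC={aic:.2f}")
```

Under this form a smaller RSS per observation pushes the AIC lower, and lower values indicate a better trade-off between fit and model size, so values can be negative, as they are in the table.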