Shaodian Zhang, Lin Qiu, Frank Chen, Weinan Zhang, Yong Yu, Noémie Elhadad.
Abstract
Patients discuss complementary and alternative medicine (CAM) in online health communities. Sometimes, patients' conflicting opinions toward CAM-related issues trigger debates in the community. The objectives of this paper are to identify such debates, to identify controversial CAM therapies in a popular online breast cancer community, and to characterize patients' stances toward them. To scale our analysis, we trained a set of classifiers. We first constructed a supervised classifier based on a long short-term memory neural network (LSTM) stacked over a convolutional neural network (CNN) to automatically detect CAM-related debates in a popular breast cancer forum. Members' stances in these debates were then identified by a CNN-based classifier. Finally, posts automatically flagged as debates by the classifier were analyzed to explore which specific CAM therapies trigger debates more often than others. Our methods detect CAM debates with an F score of 77% and identify stances with an F score of 70%. The debate classifier identified about 1/6 of all CAM-related posts as debate posts. About 60% of CAM-related debate posts take a supportive stance toward CAM usage. Qualitative analysis shows that some specific therapies, such as Gerson therapy and laetrile, frequently trigger debates among members of the breast cancer community. This study demonstrates that neural networks can effectively locate debates on the usage and effectiveness of controversial CAM therapies and can help make sense of patients' opinions on such disputed issues. Perceptions of the effectiveness of CAM for breast cancer vary among patients. Many of the specific therapies frequently trigger debates and merit further exploration in future work.
Keywords: Complementary and Alternative Medicine (CAM); Debate Identification; Online Health Community
Year: 2017 PMID: 28967000 PMCID: PMC5617343 DOI: 10.1145/3041021.3055134
Source DB: PubMed Journal: Proc Int World Wide Web Conf
Figure 1. An example debate in a thread, used as input to our model. Green and blue posts were published by two users engaged in the debate with opposing opinions. Grey posts are not part of the debate but provide context. User names are removed from the text and replaced by X, Y, and Z; the example shows that debate detection is highly context-dependent.
Figure 2. The architecture of our model for debate detection, motivated by [21].
Features used for the logistic regression model.
| Thread-level features | Description |
|---|---|
| NumPost | Number of posts in the thread |
| NumUser | Number of authors participating in the thread discussion |
| AvgLen | Average post length (in words) in the thread |
| | |
| NumName | Number of mentions of other authors’ names |
| NumNeg | Number of negative sentiment words |
| NumPos | Number of positive sentiment words |
| NumCAM | Number of CAM related keywords |
| NumOverlap | Number of words that also occur in previous post |
| Num? | Number of question marks |
| Num! | Number of exclamation marks |
| TimeDif | Time difference between current and previous post in thread |
| Sig | If the author has a signature profile |
| NAgree | Number of “agree”s |
| NDisagree | Number of “disagree”s |
| | |
| LDA | Topic modeling |
| LDA-sim | Cosine similarity between the LDA representations of the current and previous post |
| W2V | Word embedding |
| W2V-sim | Cosine similarity between the W2V representations of the current and previous post |
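Several of the features listed above are simple counts or vector comparisons. The following is a minimal sketch of how a few of them might be computed; it is not the authors' implementation. The tokenizer, the function names `tokenize`, `surface_features`, and `cosine_similarity`, the choice to count overlapping tokens rather than distinct words, and the example posts are all assumptions made for illustration.

```python
import math
import re


def tokenize(text):
    """Lowercase word tokens; a deliberately simple stand-in tokenizer (assumption)."""
    return re.findall(r"[a-z']+", text.lower())


def surface_features(current, previous):
    """Compute a few count-based features for a post given the previous post."""
    cur_tokens = tokenize(current)
    prev_vocab = set(tokenize(previous))
    return {
        # Tokens of the current post that also occur in the previous post
        # (counting tokens, not distinct words, is an assumption).
        "NumOverlap": sum(1 for tok in cur_tokens if tok in prev_vocab),
        "Num?": current.count("?"),
        "Num!": current.count("!"),
        "NAgree": cur_tokens.count("agree"),
        "NDisagree": cur_tokens.count("disagree"),
    }


def cosine_similarity(u, v):
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


# Hypothetical posts, invented for this example only.
previous_post = "Has anyone tried laetrile? I think it works!"
current_post = "I disagree. Laetrile is snake oil! Why would it work?"
feats = surface_features(current_post, previous_post)
```

The same `cosine_similarity` helper would apply to either representation (LDA-sim or W2V-sim), given a vector for the current post and one for the previous post; how those vectors are produced is outside this sketch.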
Example posts annotated as three types of debates (presented here out of their thread context). User names are removed from the text and replaced by “X” and “Y”.
| Type of debate | Example post |
|---|---|
| CAM | “Laetrile is snake oil and potentially dangerous. it is illegal to sell it as a cancer treatment because there is zero evidence to so much as suggest that it has any efficacy.” |
| Breast cancer related | “X, Y is correct. Please read all parts of your link. It clearly states that dcis can be any size. ” |
| Other | “X, no offense taken and I usually agree with you on the harmless/lonely bit. However, there were some truly over the top comments made that needed to be addressed, IMHO.” |
System performance for binary debate classification with different methods. The baseline system simply classifies everything as debate.
| Method | Precision | Recall | F |
|---|---|---|---|
| Baseline | 16.3 | 100.0 | 28.0 |
| Logistic regression | 64.6 | 89.6 | 75.1 |
| LSTM+CNN | 68.1 | 88.9 | 77.1 |
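The baseline row can be checked by hand. Assuming that "F" denotes the balanced F score (the harmonic mean of precision and recall), which the source does not state explicitly, the reported baseline precision and recall reproduce the reported F value. The function name `f_score` is chosen for this sketch.

```python
def f_score(precision, recall):
    """Harmonic mean of precision and recall (balanced F score); 0.0 if both are zero."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Baseline that labels every post as debate: precision 16.3, recall 100.0 (in percent).
baseline_f = f_score(16.3, 100.0)
print(round(baseline_f, 1))  # 28.0
```

Because a classify-everything baseline always has 100% recall, its precision equals the share of positive examples, so its F score is determined by class balance alone.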
System performance for 4-class debate classification by the proposed LSTM-CNN based system.
| Class | Precision | Recall | F |
|---|---|---|---|
| Non-debate | 71.4 | 79.1 | 75.1 |
| CAM | 58.0 | 73.9 | 65.0 |
| Breast cancer related | 43.4 | 41.3 | 41.9 |
| Other | 55.1 | 59.4 | 57.2 |
System performance for binary stance classification with different methods. Precision, recall, and F are calculated for the con-CAM class. The baseline system classifies everything as con-CAM.
| Method | Precision | Recall | F |
|---|---|---|---|
| Baseline | 30.9 | 100.0 | 47.2 |
| Logistic regression | 69.6 | 70.6 | 70.1 |
| CNN | 69.1 | 70.9 | 70.1 |
CAM therapies identified through the manual coding, and number of posts identified for each therapy group in the sampled posts.
| Code | Examples | # |
|---|---|---|
| CAM General | CAM vs. conventional discussions | 135 |
| Gerson therapy | Effectiveness and scientific validity of Gerson therapy | 44 |
| Diet | Effectiveness and/or practice of diets for cure, prevention, and management of breast cancer therapy (gluten free, low carb, hormone free meal, vegan, Ayurvedic, etc.) | 42 |
| Supplements | Any supplement whose purpose is not to control estrogen | 33 |
| Laetrile | Laetrile or food/supplement that contains laetrile | 27 |
| Estrogen control | Therapies/supplements to control estrogen, including DIM, soy, natural replacements for tamoxifen, bioidentical hormones, etc. | 24 |
| TCM | Use and effectiveness of Traditional Chinese Medicine for cancer management | 12 |
| Med marijuana | Use of medical marijuana for cancer management | 5 |
| Issels | Issels treatment | 2 |
| Colonics | Colonics treatments | 1 |
Figure 3. Stances of posts on CAM usage, clustered by topic. The x-axis shows the number of posts with pro-CAM and con-CAM stances, respectively.