Patrick Pilipiec, Marcus Liwicki, András Bota.
Abstract
Pharmacovigilance is the science of monitoring adverse drug reactions (ADRs) to existing medicines on an ongoing basis. Traditional approaches in this field can be expensive and time-consuming. Applying natural language processing (NLP) to user-generated content is hypothesized to provide an effective supplemental source of evidence. In this systematic review, a broad and multidisciplinary literature search was conducted across four databases. A total of 5318 publications were initially found. Studies were considered relevant if they reported on the application of NLP to understand user-generated text for pharmacovigilance. A total of 16 relevant publications were included in this systematic review. All studies were evaluated as having medium reliability and validity. Across all types of drugs, 14 publications reported positive findings with respect to the identification of ADRs, providing consistent evidence that NLP can be used effectively and accurately on user-generated text published on the Internet to identify ADRs for the purpose of pharmacovigilance. The evidence presented in this review suggests that the analysis of textual data has the potential to complement the traditional system of pharmacovigilance.
Keywords: ADRs; adverse drug reactions; computational linguistics; machine learning; pharmacovigilance; public health; user-generated content
Year: 2022 PMID: 35213998 PMCID: PMC8924891 DOI: 10.3390/pharmaceutics14020266
Source DB: PubMed Journal: Pharmaceutics ISSN: 1999-4923 Impact factor: 6.321
Figure 1. Flow diagram for literature search and study selection.
Summary of characteristics of publications included in the analysis.
| Authors | Data Source | Sample Size | Horizon of Data Collection | Software Used | Techniques and Classifiers Used | Outcome | Result | Description of Result |
|---|---|---|---|---|---|---|---|---|
| [ | Social media | Twitter.com: 1642 tweets | 3 years | Toolkit for Multivariate Analysis | Artificial Neural Networks (ANN), Boosted Decision Trees with AdaBoost (BDT), Boosted Decision Trees with Bagging (BDTG), Sentiment Analysis, Support Vector Machines (SVM) | Reported ADRs for HIV treatment | Positive | Reported adverse effects are consistent with well-recognized toxicities. |
| [ | Forums | DepressionForums.org: 7726 posts | 10 years | General Architecture for Text Engineering (GATE), NLTK Toolkit within MATLAB, RapidMiner | Hyperlink-Induced Topic Search (HITS), k-Means Clustering, Network Analysis, Term-Frequency-Inverse Document Frequency (TF-IDF) | User sentiment on depression drugs | Positive | Natural language processing is suitable to extract information on ADRs concerning depression. |
| [ | Social media | Twitter.com: 2,102,176,189 tweets | 1 year | Apache Lucene | MetaMap, Support Vector Machines (SVM) | Reported ADRs for cancer | Neutral | Classification models had limited performance. Adverse events related to cancer drugs can potentially be extracted from tweets. |
| [ | Social media | Twitter.com: 6528 tweets | Unknown | GENIA tagger, Hunspell, Snowball stemmer, Stanford Topic Modelling Toolbox, Twokenizer | Backward/Forward Sequential Feature Selection (BSFS/FSFS) Algorithm, k-Means Clustering, Sentiment Analysis, Support Vector Machines (SVM) | Reported ADRs | Positive | ADRs were identified reasonably well. |
| [ | Social media | Twitter.com: | Unknown | Hunspell, Twitter tokenizer | Term Frequency-Inverse Document Frequency (TF-IDF) | Reported ADRs | Neutral | ADRs were not identified very well. |
| [ | Social media | Twitter.com: | Unknown | Unknown | Naive Bayes (NB), Natural Language Processing (NLP), Support Vector Machines (SVM) | Reported ADRs | Positive | ADRs were identified well. |
| [ | Drug reviews | Drugs.com, | Unknown | BeautifulSoup | Logistic Regression, | Patient satisfaction with drugs, Reported ADRs, Reported effectiveness of drugs | Positive | Classification results were very good. |
| [ | Social media | Twitter.com: | 1 year | Twitter4J | Decision Trees, Medical Profile Graph, Natural Language Processing (NLP) | Reported ADRs | Positive | Building a medical profile of users enables the accurate detection of adverse drug events. |
| [ | Social media | Twitter.com: | Unknown | CRF++ Toolkit, GENIA tagger, Hunspell, Twitter REST API, Twokenizer | Natural Language Processing (NLP) | Reported ADRs | Positive | ADRs were identified reasonably well. |
| [ | Drug reviews | WebMD.com: Unknown | Unknown | SentiWordNet, WordNet | Sentiment Analysis, Support Vector Machines (SVM), Term Document Matrix (TDM) | User sentiment on cancer drugs | Positive | Sentiment on ADRs was identified reasonably well. |
| [ | Drug reviews, Social media | DailyStrength.org: | Unknown | Unknown | ARDMine, Lexicon-based, MetaMap, Support Vector Machines (SVM) | Reported ADRs | Positive | ADRs were identified very well. |
| [ | Drug reviews, Social media | PatientsLikeMe.com: | Not applicable | Deeply Moving | Unknown | Patient-reported medication outcomes | Positive | Social media serves as a new data source to extract patient-reported medication outcomes. |
| [ | Forums | Medications.com: | Not applicable | Java Hidden Markov Model library, jsoup | Hidden Markov Model (HMM), Natural Language Processing (NLP) | Reported ADRs | Positive | Reported adverse effects are consistent with well-recognized side-effects. |
| [ | Electronic Health Record (EHR) | 25,074 discharge summaries | Not applicable | MedLEE | Unknown | Reported ADRs | Positive | Reported adverse effects are consistent with well-recognized toxicities (recall: 75%; precision: 31%). |
| [ | Social media | Twitter.com: | Not applicable | AFINN, Bing Liu sentiment words, Multi-Perspective Question Answering (MPQA), SentiWordNet, TextBlob, Tweepy, WEKA | MetaMap, Naive Bayes (NB), Natural Language Processing (NLP), Sentiment Analysis, Support Vector Machines (SVM) | Reported ADRs | Positive | Several well-known ADRs were identified. |
| [ | Forums | MedHelp.org: 6244 discussion threads | Unknown | Unknown | Association Mining | Reported ADRs | Positive | ADRs were identified. |
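
The EHR row in the table above reports a recall of 75% and a precision of 31%. As a reminder of how these two measures are defined and how they combine, the following minimal sketch computes precision, recall, and their harmonic mean (F1) from confusion counts. The function name and the counts in the usage lines are hypothetical: the counts were chosen only so that they reproduce the reported 75% and 31%, and the F1 value is derived here rather than reported in the source.

```python
def precision_recall_f1(tp: int, fp: int, fn: int) -> tuple[float, float, float]:
    """Return (precision, recall, F1) from true-positive, false-positive
    and false-negative counts. Empty denominators yield 0.0."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# Hypothetical counts (not from the source), picked so that
# precision = 93 / 300 = 0.31 and recall = 93 / 124 = 0.75.
p, r, f1 = precision_recall_f1(tp=93, fp=207, fn=31)
print(f"precision={p:.2f} recall={r:.2f} F1={f1:.2f}")
```

High recall with low precision, as in this row, means most true ADR mentions are found but many flagged mentions are false alarms, which pulls the harmonic mean well below the recall.
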
General description of publications included in the analysis.
| Category | Sub-Categories | n (%) | References |
|---|---|---|---|
| Year of publication | 2009 | 1 (6) | [ |
| | 2010 | 0 (0) | - |
| | 2011 | 0 (0) | - |
| | 2012 | 2 (13) | [ |
| | 2013 | 0 (0) | - |
| | 2014 | 2 (13) | [ |
| | 2015 | 7 (43) | [ |
| | 2016 | 2 (13) | [ |
| | 2017 | 0 (0) | - |
| | 2018 | 1 (6) | [ |
| | 2019 | 1 (6) | [ |
| Type of drugs | Asthma | 1 (5) | [ |
| | Cancer | 2 (11) | [ |
| | Cystic fibrosis | 1 (5) | [ |
| | Depression | 1 (5) | [ |
| | HIV | 1 (5) | [ |
| | Rheumatoid arthritis | 1 (5) | [ |
| | Type 2 diabetes | 1 (5) | [ |
| | Unknown | 11 (58) | [ |
| Data source | Drug reviews | 4 (22) | [ |
| | Electronic Health Records (EHR) | 1 (6) | [ |
| | Forums | 3 (17) | [ |
| | Social media | 10 (56) | [ |
| Sample size | Less than 5000 | 3 (19) | [ |
| | 5000 to 9999 | 4 (25) | [ |
| | 10,000 to 14,999 | 1 (6) | [ |
| | 15,000 to 19,999 | 1 (6) | [ |
| | 20,000 or more | 6 (38) | [ |
| | Unknown | 1 (6) | [ |
| Users | HIV-infected persons undergoing drug treatment | 1 (6) | [ |
| | Unknown | 15 (94) | [ |
| Unique users | Less than 5000 | 2 (13) | [ |
| | 5000 to 9999 | 0 (0) | - |
| | 10,000 to 14,999 | 0 (0) | - |
| | 15,000 to 19,999 | 0 (0) | - |
| | 20,000 or more | 0 (0) | - |
| | Unknown | 14 (88) | [ |
| Origin of users | Canada | 1 (5) | [ |
| | South Africa | 1 (5) | [ |
| | United Kingdom | 1 (5) | [ |
| | United States | 1 (5) | [ |
| | Unknown | 15 (79) | [ |
| Average number of followers | Less than 5000 | 1 (6) | [ |
| | 5000 to 9999 | 0 (0) | - |
| | 10,000 to 14,999 | 0 (0) | - |
| | 15,000 to 19,999 | 0 (0) | - |
| | 20,000 or more | 0 (0) | - |
| | Unknown | 15 (94) | [ |
| Years of data collection | 2004 | 2 (6) | [ |
| | 2005 | 1 (3) | [ |
| | 2006 | 1 (3) | [ |
| | 2007 | 1 (3) | [ |
| | 2008 | 1 (3) | [ |
| | 2009 | 2 (6) | [ |
| | 2010 | 3 (10) | [ |
| | 2011 | 2 (6) | [ |
| | 2012 | 3 (10) | [ |
| | 2013 | 2 (6) | [ |
| | 2014 | 3 (10) | [ |
| | 2015 | 2 (6) | [ |
| | Unknown | 8 (26) | [ |
| Horizon of data collection | 1 year | 2 (13) | [ |
| | 2 to 5 years | 1 (6) | [ |
| | 6 to 10 years | 1 (6) | [ |
| | Not applicable | 4 (25) | [ |
| | Unknown | 8 (50) | [ |
| Software used | AFINN | 1 (3) | [ |
| | Apache Lucene | 1 (3) | [ |
| | BeautifulSoup | 1 (3) | [ |
| | Bing Liu sentiment words | 1 (3) | [ |
| | CRF++ toolkit | 1 (3) | [ |
| | Deeply Moving | 1 (3) | [ |
| | General Architecture for Text Engineering (GATE) | 1 (3) | [ |
| | GENIA tagger | 2 (6) | [ |
| | Hunspell | 3 (9) | [ |
| | Java Hidden Markov Model library | 1 (3) | [ |
| | jsoup | 1 (3) | [ |
| | MedLEE | 1 (3) | [ |
| | Multi-Perspective Question Answering (MPQA) | 1 (3) | [ |
| | NLTK toolkit within MATLAB | 1 (3) | [ |
| | RapidMiner | 1 (3) | [ |
| | SentiWordNet | 2 (6) | [ |
| | Snowball stemmer | 1 (3) | [ |
| | Stanford Topic Modelling Toolbox | 1 (3) | [ |
| | TextBlob | 1 (3) | [ |
| | Toolkit for Multivariate Analysis | 1 (3) | [ |
| | Tweepy | 1 (3) | [ |
| | Twitter REST API | 1 (3) | [ |
| | Twitter tokenizer | 1 (3) | [ |
| | Twitter4J | 1 (3) | [ |
| | Twokenizer | 2 (6) | [ |
| | Unknown | 3 (9) | [ |
| | WEKA | 1 (3) | [ |
| | WordNet | 1 (3) | [ |
| Techniques and classifiers used | ARDMine | 1 (2) | [ |
| | Artificial Neural Networks (ANN) | 1 (2) | [ |
| | Association Mining | 1 (2) | [ |
| | Backward/forward sequential feature selection (BSFS/FSFS) algorithm | 1 (2) | [ |
| | Boosted Decision Trees with AdaBoost (BDT) | 1 (2) | [ |
| | Boosted Decision Trees with Bagging (BDTG) | 1 (2) | [ |
| | Decision Trees | 1 (2) | [ |
| | Hidden Markov Model (HMM) | 1 (2) | [ |
| | Hyperlink-Induced Topic Search (HITS) | 1 (2) | [ |
| | k-Means Clustering | 2 (5) | [ |
| | Lexicon-based | 1 (2) | [ |
| | Logistic Regression | 1 (2) | [ |
| | Medical Profile Graph | 1 (2) | [ |
| | MetaMap | 3 (7) | [ |
| | Naive Bayes (NB) | 2 (5) | [ |
| | Natural Language Processing (NLP) | 5 (12) | [ |
| | Network Analysis | 1 (2) | [ |
| | Sentiment Analysis | 5 (12) | [ |
| | Support Vector Machines (SVM) | 7 (17) | [ |
| | Term Document Matrix (TDM) | 1 (2) | [ |
| | Term-Frequency-Inverse Document Frequency (TF-IDF) | 2 (5) | [ |
| | Unknown | 2 (5) | [ |
| Outcome | Patient satisfaction with drugs | 1 (6) | [ |
| | Patient-reported medication outcomes | 1 (6) | [ |
| | Reported ADRs | 11 (61) | [ |
| | Reported ADRs for cancer | 1 (6) | [ |
| | Reported ADRs for HIV treatment | 1 (6) | [ |
| | Reported effectiveness of drugs | 1 (6) | [ |
| | User sentiment on depression drugs | 1 (6) | [ |
| | User sentiment on cancer drugs | 1 (6) | [ |
| Drugs studied | Less than 5 | 1 (6) | [ |
| | 5 to 9 | 4 (25) | [ |
| | 10 to 14 | 2 (13) | [ |
| | 15 to 19 | 0 (0) | - |
| | 20 or more | 5 (31) | [ |
| | Unknown | 4 (25) | [ |
| Result | Positive | 14 (88) | [ |
| | Neutral | 2 (13) | [ |
| | Negative | 0 (0) | - |
| Reliability | Low | 0 (0) | - |
| | Medium | 16 (100) | [ |
| | High | 0 (0) | - |
| Validity | Low | 0 (0) | - |
| | Medium | 16 (100) | [ |
| | High | 0 (0) | - |
Publications by classification category and result.
| Category | Sub-Categories | Positive, n (%) | Neutral, n (%) | Negative, n (%) | References |
|---|---|---|---|---|---|
| Type of drugs | Asthma | 1 (5) | 0 (0) | 0 (0) | [ |
| | Cancer | 1 (5) | 1 (5) | 0 (0) | [ |
| | Cystic fibrosis | 1 (5) | 0 (0) | 0 (0) | [ |
| | Depression | 1 (5) | 0 (0) | 0 (0) | [ |
| | HIV | 1 (5) | 0 (0) | 0 (0) | [ |
| | Rheumatoid arthritis | 1 (5) | 0 (0) | 0 (0) | [ |
| | Type 2 diabetes | 1 (5) | 0 (0) | 0 (0) | [ |
| | Unknown | 10 (53) | 1 (5) | 0 (0) | [ |
| Data source | Drug reviews | 4 (22) | 0 (0) | 0 (0) | [ |
| | Electronic Health Records (EHR) | 1 (6) | 0 (0) | 0 (0) | [ |
| | Forums | 3 (17) | 0 (0) | 0 (0) | [ |
| | Social media | 8 (44) | 2 (11) | 0 (0) | [ |
| Origin of users | Canada | 1 (5) | 0 (0) | 0 (0) | [ |
| | South Africa | 1 (5) | 0 (0) | 0 (0) | [ |
| | United Kingdom | 1 (5) | 0 (0) | 0 (0) | [ |
| | United States | 1 (5) | 0 (0) | 0 (0) | [ |
| | Unknown | 13 (68) | 2 (11) | 0 (0) | [ |
| Horizon of data collection | 1 year | 1 (6) | 1 (6) | 0 (0) | [ |
| | 2 to 5 years | 1 (6) | 0 (0) | 0 (0) | [ |
| | 6 to 10 years | 1 (6) | 0 (0) | 0 (0) | [ |
| | Not applicable | 4 (25) | 0 (0) | 0 (0) | [ |
| | Unknown | 7 (44) | 1 (6) | 0 (0) | [ |
| Outcome | Patient satisfaction with drugs | 1 (6) | 0 (0) | 0 (0) | [ |
| | Patient-reported medication outcomes | 1 (6) | 0 (0) | 0 (0) | [ |
| | Reported ADRs | 10 (56) | 1 (6) | 0 (0) | [ |
| | Reported ADRs for cancer | 0 (0) | 1 (6) | 0 (0) | [ |
| | Reported ADRs for HIV treatment | 1 (6) | 0 (0) | 0 (0) | [ |
| | Reported effectiveness of drugs | 1 (6) | 0 (0) | 0 (0) | [ |
| | User sentiment on depression drugs | 1 (6) | 0 (0) | 0 (0) | [ |
| | User sentiment on cancer drugs | 1 (6) | 0 (0) | 0 (0) | [ |
| Drugs studied | Less than 5 | 1 (6) | 0 (0) | 0 (0) | [ |
| | 5 to 9 | 3 (19) | 1 (6) | 0 (0) | [ |
| | 10 to 14 | 2 (13) | 0 (0) | 0 (0) | [ |
| | 15 to 19 | 0 (0) | 0 (0) | 0 (0) | - |
| | 20 or more | 5 (31) | 0 (0) | 0 (0) | [ |
| | Unknown | 3 (19) | 1 (6) | 0 (0) | [ |
| Reliability | Low | 0 (0) | 0 (0) | 0 (0) | - |
| | Medium | 14 (88) | 2 (13) | 0 (0) | [ |
| | High | 0 (0) | 0 (0) | 0 (0) | - |
| Validity | Low | 0 (0) | 0 (0) | 0 (0) | - |
| | Medium | 14 (88) | 2 (13) | 0 (0) | [ |
| | High | 0 (0) | 0 (0) | 0 (0) | - |
PRISMA checklist.
| Section/Topic | # | Checklist Item | Reported on Page # |
|---|---|---|---|
| TITLE | | | |
| Title | 1 | Identify the report as a systematic review, meta-analysis, or both. | 1 |
| ABSTRACT | | | |
| Abstract | 2 | See the PRISMA 2020 for Abstracts checklist. | 1 |
| INTRODUCTION | | | |
| Rationale | 3 | Describe the rationale for the review in the context of what is already known. | 1 |
| Objectives | 4 | Provide an explicit statement of the objective(s) or question(s) the review addresses. | 2 |
| METHOD | | | |
| Eligibility criteria | 5 | Specify the inclusion and exclusion criteria for the review and how studies were grouped for the syntheses. | 5 |
| Information sources | 6 | Specify all databases, registers, websites, organisations, reference lists and other sources searched or consulted to identify studies. Specify the date when each source was last searched or consulted. | 3 |
| Search strategy | 7 | Present the full search strategies for all databases, registers and websites, including any filters and limits used. | 3, 19 |
| Selection process | 8 | Specify the methods used to decide whether a study met the inclusion criteria of the review, including how many reviewers screened each record and each report retrieved, whether they worked independently, and if applicable, details of automation tools used in the process. | 4 |
| Data collection process | 9 | Specify the methods used to collect data from reports, including how many reviewers collected data from each report, whether they worked independently, any processes for obtaining or confirming data from study investigators, and if applicable, details of automation tools used in the process. | 4 |
| Data items | 10a | List and define all outcomes for which data were sought. Specify whether all results that were compatible with each outcome domain in each study were sought (e.g. for all measures, time points, analyses), and if not, the methods used to decide which results to collect. | 4 |
| | 10b | List and define all other variables for which data were sought (e.g. participant and intervention characteristics, funding sources). Describe any assumptions made about any missing or unclear information. | 4, 5 |
| Study risk of bias assessment | 11 | Specify the methods used to assess risk of bias in the included studies, including details of the tool(s) used, how many reviewers assessed each study and whether they worked independently, and if applicable, details of automation tools used in the process. | 5 |
| Effect measures | 12 | Specify for each outcome the effect measure(s) (e.g. risk ratio, mean difference) used in the synthesis or presentation of results. | n. a. |
| Synthesis methods | 13a | Describe the processes used to decide which studies were eligible for each synthesis (e.g. tabulating the study intervention characteristics and comparing against the planned groups for each synthesis (item #5)). | 5 |
| | 13b | Describe any methods required to prepare the data for presentation or synthesis, such as handling of missing summary statistics, or data conversions. | 6 |
| | 13c | Describe any methods used to tabulate or visually display results of individual studies and syntheses. | 6 |
| | 13d | Describe any methods used to synthesize results and provide a rationale for the choice(s). If meta-analysis was performed, describe the model(s), method(s) to identify the presence and extent of statistical heterogeneity, and software package(s) used. | 6 |
| | 13e | Describe any methods used to explore possible causes of heterogeneity among study results (e.g. subgroup analysis, meta-regression). | n. a. |
| | 13f | Describe any sensitivity analyses conducted to assess robustness of the synthesized results. | n. a. |
| Reporting bias assessment | 14 | Describe any methods used to assess risk of bias due to missing results in a synthesis (arising from reporting biases). | n. a. |
| Certainty assessment | 15 | Describe any methods used to assess certainty (or confidence) in the body of evidence for an outcome. | 5 |
| RESULTS | | | |
| Study selection | 16a | Describe the results of the search and selection process, from the number of records identified in the search to the number of studies included in the review, ideally using a flow diagram. | 4 |
| | 16b | Cite studies that might appear to meet the inclusion criteria, but which were excluded, and explain why they were excluded. | n. a. |
| Study characteristics | 17 | Cite each included study and present its characteristics. | 7 |
| Risk of bias within studies | 18 | Present assessments of risk of bias for each included study. | 20 |
| Results of individual studies | 19 | For all outcomes, present, for each study: (a) summary statistics for each group (where appropriate) and (b) an effect estimate and its precision (e.g. confidence/credible interval), ideally using structured tables or plots. | 9 |
| Results of syntheses | 20a | For each synthesis, briefly summarise the characteristics and risk of bias among contributing studies. | n. a. |
| | 20b | Present results of all statistical syntheses conducted. If meta-analysis was done, present for each the summary estimate and its precision (e.g. confidence/credible interval) and measures of statistical heterogeneity. If comparing groups, describe the direction of the effect. | n. a. |
| | 20c | Present results of all investigations of possible causes of heterogeneity among study results. | n. a. |
| | 20d | Present results of all sensitivity analyses conducted to assess the robustness of the synthesized results. | n. a. |
| Reporting biases | 21 | Present assessments of risk of bias due to missing results (arising from reporting biases) for each synthesis assessed. | 15 |
| Certainty of evidence | 22 | Present assessments of certainty (or confidence) in the body of evidence for each outcome assessed. | n. a. |
| DISCUSSION | | | |
| Discussion | 23a | Provide a general interpretation of the results in the context of other evidence. | 14 |
| | 23b | Discuss any limitations of the evidence included in the review. | 15 |
| | 23c | Discuss any limitations of the review processes used. | 15 |
| | 23d | Discuss implications of the results for practice, policy, and future research. | 16 |
| OTHER INFORMATION | | | |
| Registration and protocol | 24a | Provide registration information for the review, including register name and registration number, or state that the review was not registered. | 15 |
| | 24b | Indicate where the review protocol can be accessed, or state that a protocol was not prepared. | 15 |
| | 24c | Describe and explain any amendments to information provided at registration or in the protocol. | 15 |
| Support | 25 | Describe sources of financial or non-financial support for the review, and the role of the funders or sponsors in the review. | 16 |
| Competing interests | 26 | Declare any competing interests of review authors. | 16 |
| Availability of data, code and other materials | 27 | Report which of the following are publicly available and where they can be found: template data collection forms; data extracted from included studies; data used for all analyses; analytic code; any other materials used in the review. | n. a. |
Characteristics of publications included in the analysis.
| Authors | Type of Drugs | Drugs Studied | Data Source | Sample Size | Users | Unique Users | Origin of Users | Average Number of Followers | Years of Data Collection | Horizon of Data Collection | Software Used | Techniques and Classifiers Used | Outcome | Result | Description of Result | Reliability | Validity |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| [ | HIV | Atripla, | Social media | Twitter.com: | HIV-infected persons undergoing drug treatment | 512 | Canada, | 2300 | 2010, 2011, 2012, 2013 | 3 years | Toolkit for Multivariate Analysis | Artificial Neural Networks (ANN), | Reported ADRs for HIV treatment | Positive | Reported adverse effects are consistent with well-recognized toxicities. | Medium | Medium |
| [ | Depression | Citalopram, | Forums | Depression | Unknown | Unknown | Unknown | Unknown | 2004–2014 | 10 years | General Architecture for Text Engineering (GATE), | Hyperlink-Induced Topic Search (HITS), | User sentiment on depression drugs | Positive | Natural language processing is suitable to extract information on ADRs concerning depression. | Medium | Medium |
| [ | Cancer | Avastin, | Social media | Twitter.com: | Unknown | Unknown | Unknown | Unknown | 2009, | 1 year | Apache Lucene | MetaMap, | Reported ADRs for cancer | Neutral | Classification models had limited performance. Adverse events related to cancer drugs can potentially be extracted from tweets. | Medium | Medium |
| [ | Unknown | Unknown | Social media | Twitter.com: | Unknown | Unknown | Unknown | Unknown | Unknown | Unknown | GENIA tagger, | Backward/forward sequential feature selection (BSFS/FSFS) algorithm, | Reported ADRs | Positive | ADRs were identified reasonably well. | Medium | Medium |
| [ | Unknown | Unknown | Social media | Twitter.com: | Unknown | Unknown | Unknown | Unknown | Unknown | Unknown | Hunspell, | Term Frequency-Inverse Document Frequency (TF-IDF) | Reported ADRs | Neutral | ADRs were not identified very well. | Medium | Medium |
| [ | Unknown | 65 drugs | Social media | Twitter.com: | Unknown | Unknown | Unknown | Unknown | Unknown | Unknown | Unknown | Naive Bayes (NB), | Reported ADRs | Positive | ADRs were identified well. | Medium | Medium |
| [ | Unknown | Unknown | Drug reviews | Drugs.com, | Unknown | Unknown | Unknown | Unknown | Unknown | Unknown | BeautifulSoup | Logistic Regression, | Patient satisfaction with drugs, | Positive | Classification results were very good. | Medium | Medium |
| [ | Unknown | 103 drugs | Social media | Twitter.com: | Unknown | 864 | Unknown | Unknown | 2014, | 1 year | Twitter4J | Decision Trees, | Reported ADRs | Positive | Building a medical profile of users enables the accurate detection of adverse drug events. | Medium | Medium |
| [ | Unknown | Unknown | Social media | Twitter.com | Unknown | Unknown | Unknown | Unknown | Unknown | Unknown | CRF++ Toolkit, | Natural Language Processing (NLP) | Reported ADRs | Positive | ADRs were identified reasonably well. | Medium | Medium |
| [ | Cancer | 146 drugs | Drug reviews | WebMD.com: | Unknown | Unknown | Unknown | Unknown | Unknown | Unknown | SentiWordNet, | Sentiment Analysis, | User sentiment on cancer drugs | Positive | Sentiment on ADRs was identified reasonably well. | Medium | Medium |
| [ | Unknown | 81 drugs | Drug reviews, | DailyStrength.org: 6279 reviews, | Unknown | Unknown | Unknown | Unknown | Unknown | Unknown | Unknown | ARDMine, | Reported ADRs | Positive | ADRs were identified very well. | Medium | Medium |
| [ | Asthma, | Albuterol, | Drug reviews, | PatientsLikeMe.com: 796 reviews, | Unknown | Unknown | Unknown | Unknown | 2014 | Not applicable | Deeply Moving | Unknown | Patient-reported medication outcomes | Positive | Social media serves as a new data source to extract patient-reported medication outcomes. | Medium | Medium |
| [ | Unknown | Medications.com: 168 drugs, | Forums | Medications.com: 8065 posts, | Unknown | Unknown | Unknown | Unknown | 2012 | Not applicable | Java Hidden Markov Model library, | Hidden Markov Model (HMM), | Reported ADRs | Positive | Reported adverse effects are consistent with well-recognized side-effects. | Medium | Medium |
| [ | Unknown | ACE inhibitors, | Electronic Health Record (EHR) | 25,074 discharge summaries | Unknown | Unknown | Unknown | Unknown | 2004 | Not applicable | MedLEE | Unknown | Reported ADRs | Positive | Reported adverse effects are consistent with well-recognized toxicities (recall: 75%; precision: 31%). | Medium | Medium |
| [ | Unknown | Baclofen, | Social media | Twitter.com: | Unknown | Unknown | Unknown | Unknown | 2015 | Not applicable | AFINN, | MetaMap, | Reported ADRs | Positive | Several well-known ADRs were identified. | Medium | Medium |
| [ | Unknown | Adenosine, | Forums | MedHelp.org: | Unknown | Unknown | Unknown | Unknown | Unknown | Unknown | Unknown | Association Mining | Reported ADRs | Positive | ADRs were identified. | Medium | Medium |