Jérémy Lardon, Redhouane Abdellaoui, Florelle Bellet, Hadyl Asfari, Julien Souvignet, Nathalie Texier, Marie-Christine Jaulent, Marie-Noëlle Beyens, Anita Burgun, Cédric Bousquet.
Abstract
BACKGROUND: The underreporting of adverse drug reactions (ADRs) through traditional reporting channels is a limitation in the efficiency of the current pharmacovigilance system. Patients' experiences with drugs that they report on social media represent a new source of data that may have some value in postmarketing safety surveillance.
Keywords: Internet; Web 2.0; adverse drug reaction; adverse event; pharmacovigilance; scoping review; social media; text mining
Year: 2015 PMID: 26163365 PMCID: PMC4526988 DOI: 10.2196/jmir.4304
Source DB: PubMed Journal: J Med Internet Res ISSN: 1438-8871 Impact factor: 5.428
Figure 1. Structure of the search queries.
Full search strategy for each database.
| Database | Query | Query text |
| | Query #1 (keywords) | (pharmacovigilance[MeSH Terms] OR pharmacovigilance[All Fields] OR ADR[All Fields] OR ADE[All Fields] OR (("adverse reaction"[All Fields] OR "adverse event"[All Fields] OR "side effect"[All Fields]) AND (drug[All Fields] OR medication[All Fields] OR pharmaceutical product*[All Fields]))) AND ("social media"*[All Fields] OR "Web 2.0"[TIAB] OR "Web 2.0"[TIAB] OR "social media"[TIAB] OR "social network*" OR Twitter OR Facebook OR blog OR forum* OR fora OR message board* OR comment* OR (user feedback*)) |
| | Query #2 (MeSH terms) | (((("pharmacovigilance"[MeSH]) OR surveillance[Title])) AND (((((Twitter[Title/Abstract]) OR Facebook[Title/Abstract]) OR Doctissimo[Title/Abstract])) OR (((((((((social media[Title/Abstract]) OR social networks[Title/Abstract]) OR "online health community"[Title/Abstract]) OR "online discussion"[Title/Abstract]) OR medical data mining[Title/Abstract]) OR online[Title/Abstract]) OR patient forum[Title/Abstract]) OR natural language processing[MeSH Terms]) OR "natural language processing"[Title/Abstract]))) OR ((((("Adverse Drug Reaction Reporting Systems"[MeSH]) AND (((((Twitter[Title/Abstract]) OR Facebook[Title/Abstract]) OR Doctissimo[Title/Abstract])) OR (((((((((social media[Title/Abstract]) OR social networks[Title/Abstract]) OR "online health community"[Title/Abstract]) OR "online discussion"[Title/Abstract]) OR medical data mining[Title/Abstract]) OR online[Title/Abstract]) OR patient forum[Title/Abstract]) OR natural language processing[MeSH Terms]) OR "natural language processing"[Title/Abstract])))) OR (((((((((((social media[Title/Abstract]) OR social networks[Title/Abstract]) OR "online health community"[Title/Abstract]) OR "online discussion"[Title/Abstract]) OR medical data mining[Title/Abstract]) OR online[Title/Abstract]) OR patient forum[Title/Abstract]) OR natural language processing[MeSH Terms]) OR "natural language processing"[Title/Abstract])) AND Adverse Drug Reaction Reporting Systems[MeSH Terms])) OR (("Adverse Drug Reaction Reporting Systems"[MeSH]) AND "Internet"[Mesh])) |
| Embase | Query #3 | "pharmacovigilance"/de OR ADR OR ADE OR ("adverse reaction"/de OR "adverse event" OR "side effect"/de AND ("drug"/de OR "medication"/de OR "pharmaceutical product")) AND ("social media"/de OR "Web 2.0":ab,ti OR "Web 2.0":ab,ti OR "social media":ab,ti OR "social network"/de OR Twitter OR Facebook OR blog OR forum OR fora OR "message board" OR comment OR "user feedback") |
MeSH: Medical Subject Headings
ADR: adverse drug reaction
ADE: adverse drug event
TIAB: title and abstract
ab, ti: abstract, title
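The keyword queries in the table share one shape: a group of pharmacovigilance terms joined by OR, combined by AND with a group of social media terms joined by OR. The following is a minimal sketch of that composition, not the authors' tooling. The helper names `or_group` and `build_query` are hypothetical, and the term lists are shortened excerpts of Query #1.

```python
def or_group(terms):
    """Join terms with OR and wrap the group in parentheses."""
    return "(" + " OR ".join(terms) + ")"


def build_query(concept_a, concept_b):
    """Combine two OR groups with AND, mirroring the shape of Query #1."""
    return or_group(concept_a) + " AND " + or_group(concept_b)


# Shortened excerpts of the term lists in Query #1 (illustrative only).
pharmacovigilance_terms = [
    "pharmacovigilance[MeSH Terms]",
    "pharmacovigilance[All Fields]",
    '"adverse reaction"[All Fields]',
]
social_media_terms = [
    '"social media"[TIAB]',
    '"Web 2.0"[TIAB]',
    "Twitter",
    "Facebook",
]

query = build_query(pharmacovigilance_terms, social_media_terms)
print(query)
```

Keeping each concept as a separate list makes it easier to adapt the same strategy to another database, where only the field tags change (for example `:ab,ti` in Embase instead of `[TIAB]` in PubMed).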
Article characteristics overview.
| Characteristics | Theme 1 | Theme 2 |
| Year of publication | ✓ | ✓ |
| Language used in the studied texts | ✓ | ✓ |
| Type of data source, for example, forums or Twitter | ✓ | ✓ |
| Presence of an anonymization step | ✓ | ✓ |
| Volume of data analyzed | ✓ | ✓ |
| List of studied drugs | ✓ | ✓ |
| Coding ADRs (medical lexicon) | ✓ | ✓ |
| Keywords the authors used to identify sources or posts of interest | ✓ | |
| Use of semiautomated processes (mixed methods) | ✓ | |
| Main results | ✓ | |
| Whether reported ADRs were highly informative or not | ✓ | |
| Seriousness of reported ADRs | ✓ | |
| Reference source was used for comparison with reported ADRs | ✓ | |
| Identification of potential unexpected ADRs or unexpected frequency of known ADRs | ✓ | |
| Analysis of the influence of other media, for example, television, radio, or the press, as a potential cause of increased ADR reporting in social media | ✓ | |
| If the authors mentioned the use of a crawler | | ✓ |
| Implemented methods of preprocessing | | ✓ |
| Lay language lexicon or tools used | | ✓ |
| Authors attempted to identify the relationship between the drug and the event | | ✓ |
| Authors used a machine-learning approach | | ✓ |
| Evaluation of the extraction methods with metrics | | ✓ |
| Comparison with external pharmacovigilance databases | | ✓ |
| Whether the system enabled evaluating the unexpectedness of any extracted ADRs | | ✓ |
ADR: adverse drug reaction
Figure 2. Flowchart of our mapping process and study selection.
Main steps for identifying adverse drug reactions from social media.
| Step | Description |
| Step 0: Selection of data sources | This step consists of identifying and selecting the most relevant websites to answer the research question. They can be identified using a combination of keywords (eg, generic or brand-name drug, disease, ADR/AE) in Web search engines. |
| Step 1: Data collection | Potentially relevant patient narratives or posts are identified by entering keywords into the search engine hosted by the selected websites (manual identification only) or using a semiautomated process. Data may be imported into software (after anonymization) with the aim of additional analyses. |
| Step 2: Identification of drug-ADR/AE pairs | The manual identification of drug-ADR/AE pairs is performed by reading the patients' narratives or posts that were initially collected. |
| Step 3: Results evaluation | This step consists of manually evaluating the frequency and the seriousness of the ADRs or AEs that were identified in patients' narratives or posts. The results can be compared, after coding, with those of other sources (Summary of Product Characteristics [SPC], clinical trials, pharmacovigilance databases, or literature) to identify potential new ADRs or an unexpected frequency of a known ADR. |
ADR: adverse drug reaction
AE: adverse event
Figure 3. Main steps for extraction of adverse drug reactions (ADRs) from social media.
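Step 3 in the table above describes a manual evaluation. Its bookkeeping can be illustrated with a small sketch that counts how often each coded drug-ADR pair appears and flags the pairs missing from a reference source. All drug names, reactions, and the `evaluate` helper are hypothetical. Seriousness assessment is not modeled here.

```python
from collections import Counter

# Hypothetical drug-ADR pairs, as if coded manually from posts in Step 2.
identified_pairs = [
    ("drug_a", "nausea"),
    ("drug_a", "nausea"),
    ("drug_a", "hair loss"),
    ("drug_b", "headache"),
]

# Hypothetical reference source, e.g. the ADRs listed in an SPC.
reference = {
    "drug_a": {"nausea", "dizziness"},
    "drug_b": {"headache"},
}


def evaluate(pairs, known):
    """Return pair frequencies and the pairs absent from the reference."""
    counts = Counter(pairs)
    unexpected = sorted(
        pair for pair in counts if pair[1] not in known.get(pair[0], set())
    )
    return counts, unexpected


counts, unexpected = evaluate(identified_pairs, reference)
for (drug, reaction), n in counts.items():
    flag = "not in reference" if (drug, reaction) in unexpected else "known"
    print(f"{drug}: {reaction} reported {n} time(s), {flag}")
```

A pair flagged here is only a candidate for review: it may reflect a new ADR, a coding difference between lay and medical terms, or an event unrelated to the drug.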
Transformations performed on the extracted data.
| Transformation | Rationale and methods |
| Anonymization | Anonymization is required to remove patients’ personal data to comply with medical confidentiality. Benton’s team trained a classifier to determine if a token had to be anonymized or not [ |
| Spelling correction | To maximize the detection of information in the corpus, spelling mistakes and typing errors that are common in texts extracted from social networks have to be corrected. The analyzed texts were extracted from social networks or public forums and included many abbreviations and typing errors. Li [ |
| Cleaning Web pages | Web pages consist of hundreds of tags that are invisible to users. When the crawler extracted a complete Web page code, a cleaning step was necessary to refine the content, as with Benton et al [ |
| Stemming | Reducing inflected words to their root helps to detect different forms of a word. This process reduces words to their word stem, base, or root forms, and these roots were then used for analysis. Different algorithms can be used by the «stemmer» [ |
| Sentencization/ | Breaking the text up into segments of words, sentences, and paragraphs allows for analyzing the sentences and locutions in the corpus. Liu and Chen [ |
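Several of the transformations listed in this table (cleaning Web pages, splitting text into sentences and words, spelling correction, and stemming) can be chained into a small pipeline. The sketch below uses only the Python standard library and is not the method of any cited team. The suffix-stripping `stem` function is a deliberately crude stand-in for a real stemming algorithm, the closest-match `correct` function is an assumed substitute for the spelling correctors used in the reviewed studies, and the lexicon is hypothetical.

```python
import re
from difflib import get_close_matches
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collect visible text, skipping <script> and <style> content."""

    def __init__(self):
        super().__init__()
        self.parts = []
        self._skip = 0

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip:
            self._skip -= 1

    def handle_data(self, data):
        if not self._skip:
            self.parts.append(data)


def clean_html(page):
    """Strip tags and collapse whitespace."""
    parser = TextExtractor()
    parser.feed(page)
    parser.close()
    return " ".join(" ".join(parser.parts).split())


def split_sentences(text):
    """Split after sentence-ending punctuation followed by whitespace."""
    return [s for s in re.split(r"(?<=[.!?])\s+", text) if s]


def tokenize(sentence):
    """Lowercase and keep alphabetic runs only."""
    return re.findall(r"[a-z]+", sentence.lower())


def correct(token, lexicon):
    """Replace a token with its closest lexicon entry, if one is close enough."""
    if token in lexicon:
        return token
    match = get_close_matches(token, lexicon, n=1, cutoff=0.8)
    return match[0] if match else token


def stem(token):
    """Toy stemmer: strip one common suffix (not the Porter algorithm)."""
    for suffix in ("ing", "ed", "s"):
        if token.endswith(suffix) and len(token) - len(suffix) >= 3:
            return token[: -len(suffix)]
    return token


page = "<html><style>p{}</style><p>I took the drug. Now I have headakes!</p></html>"
lexicon = ["headaches", "nausea"]  # hypothetical lexicon of expected terms
for sentence in split_sentences(clean_html(page)):
    print([stem(correct(tok, lexicon)) for tok in tokenize(sentence)])
```

The order matters in this sketch: correction runs before stemming so that misspelled tokens are matched against whole lexicon entries rather than against truncated stems.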