
Social Media Analytics for Pharmacovigilance of Antiepileptic Drugs.

Anwar Ali Yahya, Yousef Asiri, Ibrahim Alyami.

Abstract

Epilepsy is a common neurological disorder worldwide and antiepileptic drug (AED) therapy is the cornerstone of its treatment. It has a laudable aim of achieving seizure freedom with minimal, if any, adverse drug reactions (ADRs). Too often, AED treatment is a long-lasting journey, in which ADRs have a crucial role in its administration. Therefore, from a pharmacovigilance perspective, detecting the ADRs of AEDs is a task of utmost importance. Typically, this task is accomplished by analyzing relevant data from spontaneous reporting systems. Despite their wide adoption for pharmacovigilance activities, the passiveness and high underreporting ratio associated with spontaneous reporting systems have encouraged the consideration of other data sources such as electronic health databases and pharmaceutical databases. Social media is the most recent alternative data source with many promising potentials to overcome the shortcomings of traditional data sources. Although in the literature some attempts have investigated the validity and utility of social media for ADR detection of different groups of drugs, none of them was dedicated to the ADRs of AEDs. Hence, this paper presents a novel investigation of the validity and utility of social media as an alternative data source for the detection of AED ADRs. To this end, a dataset of consumer reviews from two online health communities has been collected. The dataset is preprocessed; the unigram, bigram, and trigram are generated; and the ADRs of each AED are extracted with the aid of consumer health vocabulary and ADR lexicon. Three widely used measures, namely, proportional reporting ratio, reporting odds ratio, and information component, are used to measure the association between each ADR and AED. The resulting list of signaled ADRs for each AED is validated against a widely used ADR database, called Side Effect Resource, in terms of the precision of ADR detection. 
The validation results indicate the validity of online health community data for the detection of AED ADRs. Furthermore, the lists of signaled AED ADRs are analyzed to answer questions related to the common ADRs of AEDs and the similarities between AEDs in terms of their signaled ADRs. The consistency of the drawn answers with the existing pharmaceutical knowledge suggests the utility of the data from online health communities for AED-related knowledge discovery tasks.
Copyright © 2022 Anwar Ali Yahya et al.

Year:  2022        PMID: 35027943      PMCID: PMC8752219          DOI: 10.1155/2022/8965280

Source DB:  PubMed          Journal:  Comput Math Methods Med        ISSN: 1748-670X            Impact factor:   2.238


1. Introduction

With an estimated 65 million people having epilepsy worldwide [1] and an annual rate ranging from 30 to 50 per 100,000 individuals [2], epilepsy is considered the most common serious neurological disorder after stroke. It is a multifactorial disorder that involves many seizure types and syndromes with different prognoses and sensitivities to treatment. With a laudable aim of achieving seizure freedom with minimal, if any, side effects, AEDs are the mainstay of epilepsy treatment [3]. Currently, there are ample AEDs available, offering more options for the treatment of many types of seizures. Despite different mechanisms of actions of AEDs [4], none of them treat the etiology of the disorder. They instead act to symptomatically suppress seizures once they occur. Therefore, the current AEDs still fail to control seizures in 20–30% of all epilepsy patients [5, 6]. Besides their use for epilepsy treatment, AEDs are extensively used to treat other conditions, including migraine, neuropathic pain, bipolar disorder, anxiety, and many other disorders [7]. With this wide prevalence and a reported yearly growth of AED usage, particularly of new ones [7-9], their safety in use has become a major concern. Usually, the treatment of epilepsy using AEDs is a long-lasting journey, and hence, their safety for long-term administration is of paramount importance. According to the World Health Organization (WHO), drug safety or pharmacovigilance involves activities relating to the detection, assessment, understanding, and prevention of adverse effects or any other possible drug-related problems. 
Moreover, the WHO terms the adverse effects or problems of a drug as a signal and defines it as “reported information on a possible causal relationship between an adverse event and a drug, the relationship being unknown or incompletely documented previously.” Among different drug signals, the ADR is the primary type, which is defined as “an appreciably harmful or unpleasant reaction, resulting from an intervention related to the use of a medicinal product, which predicts a hazard from future administration and warrants prevention or specific treatment, or alteration of the dosage regimen, or withdrawal of the product” [10]. Although the ADRs of all in-use drugs are of crucial importance, they gain even more significance for AEDs because of the following distinctive peculiarities. First, the treatment of epilepsy is usually maintained for many years and can be lifelong. Besides the ADRs that occur early in this long-term treatment, several ADRs develop insidiously over several years after the introduction of the AED. Second, while the initial choice of an AED is primarily guided by its efficacy (ability to control seizures), its retention (long-term use) depends on its ADR profile (tolerability) [7]. In this respect, it has been reported that the ADRs of AEDs represent a leading cause of treatment failure in nearly 25% of patients. Furthermore, they are a major source of disability and mortality in patients with epilepsy and substantially contribute to the use and costs of healthcare systems [1]. Third, patients differ in their response to AEDs and in their willingness to accept their ADRs. For example, a patient may refuse valproate, though it is the AED most likely to control primary generalized seizures, because of weight gain or, for a female patient of child-bearing age, teratogenic risk. Fourth, for a significant portion of epileptic patients, approximately 30-50%, the seizures are poorly controlled or refractory.
These patients are usually on polytherapy, where multiple AEDs are used in combination, leading to potential pharmacokinetic or pharmacodynamic interactions and causing more ADRs than might occur when the AED is taken as monotherapy [11]. Fifth, despite the wide variety of existing AEDs, new ones are continuously developed. More precisely, over the past 25 years, more than 15 new AEDs with modified mechanisms of action or side effect profiles have become available for epilepsy treatment. These new AEDs create a major challenge for health professionals and postmarketing surveillance in regard to their tolerability and drug interactions [12]. Sixth, although AEDs are essentially used for epilepsy treatment, in recent years, there has been an increase in their clinical use for treating other neurological and psychiatric disorders such as migraine, neuropathic pain, bipolar disorder, mania, schizophrenia, anxiety, and essential tremor. This adds new patients who are exposed to the AEDs, and thus, a new dimension of their ADRs is introduced [13]. Given the peculiarities of ADRs in AEDs, their detection has become of paramount importance to the concerned parties (patients, health professionals, pharmaceutical companies, and regulatory authorities) [1]. In general, there are two main approaches to ADR detection: premarketing review and postmarketing surveillance. The premarketing review process is required before any new pharmaceutical drug is approved for marketing by regulatory authorities such as the Food and Drug Administration (FDA). This process focuses on identifying the risks associated with drugs, which must be established and clearly communicated to prescribers and consumers. Nonetheless, the premarketing review process is not sufficient to uncover all ADRs, because it is usually limited in size and duration and is often incapable of detecting rare ADRs [14]. Therefore, systems for postmarketing surveillance, or pharmacovigilance, become necessary.
Typically, postmarketing surveillance is conducted by the regulatory authority and relies heavily on applying data analytics methods to analyze spontaneous reporting system (SRS) data [15]. Despite their wide adoption, SRSs have many limitations, and the most frequently mentioned one is underreporting. The reasons for this limitation are manifold and include lack of time, the effort required, fear of being prosecuted, and unawareness of the importance of reporting. Additionally, while monitoring of all undesirable reactions is necessary, it is often thought that SRSs are designed solely for detecting rare and serious ADRs [12]. Given the SRS limitations, several data sources have been utilized for pharmacovigilance. In the case of AEDs, sources such as routine clinical data [12], prescription data [16], and electronic health records [17] have been considered. Despite their merits, they suffer limitations related to their accessibility and privacy [14]. In recent years, social media has emerged as a valuable data source for health informatics [18]. Online platforms, such as Google, YouTube, Facebook, and Twitter, permit people to generate a massive amount of health-related textual content which can be utilized to tackle various medical tasks such as psychopathic class detection [19, 20], depression classification [21], disease detection [22], and adverse drug reaction detection [23]. It is the development of Web 2.0 and Health 2.0 that makes a great deal of informative health-related content available. As for pharmacovigilance in particular, social media offers large amounts of useful data that are internet-based, patient-generated, unsolicited, and up to date. Thus, the FDA in the United States and the European Medicines Agency have recognized social media as a new data source to strengthen their pharmacovigilance activities [24].
Despite all this, the use of social media data for pharmacovigilance activities is not without difficulties. Issues with the credibility, recency, uniqueness, frequency, and salience of social media data always arise. In addition, difficulties and challenges in using Natural Language Processing (NLP) techniques to process and extract relevant information from social media are frequently encountered [25]. This is due to the tendency of social media users to use nonmedical and descriptive terms to discuss health issues [26]. Nonetheless, the utilization of social media data for pharmacovigilance continues to gain increasing attention, particularly for ADR detection. In this respect, a survey of the relevant literature reveals a number of works that leverage social media data for the detection of ADRs of certain drugs such as methylphenidate [24], statin drugs [27], breast cancer drugs [28], cancer drugs [29], diabetes drugs [30], psychiatric drugs [31], malaria drugs [32], heart disease drugs [33], and opioid drugs [34]. It also reveals the lack of work dedicated to investigating the potential of social media for the detection of AED ADRs. Given the peculiarities of ADRs in AEDs, the inherent limitations of traditional data sources, the growing interest in leveraging social media for ADR detection, and finally the lack of research efforts dedicated to investigating the potential of social media for AED pharmacovigilance [35], this research is proposed to investigate the validity and utility of social media data, particularly from online health communities (OHCs), for detecting the ADRs of AEDs. It does so by applying data analytics methods to data collected from two OHCs.
As the collected data is in textual form, NLP techniques are employed to prepare it for ADR extraction with the aid of two medical resources, consumer health vocabulary (CHV) and ADR lexicon, to bridge the language and terminology gap between health professionals and consumers. Then, disproportionality analysis measures are applied to identify the set of ADRs for each AED. The results are then analyzed to answer the following two main research questions:
(1) Given the growing interest in leveraging social media data for pharmacovigilance, to what extent is OHC data valid for the task of detecting ADRs of AEDs?
(2) Given the growing interest in leveraging social media data for pharmacovigilance, can OHC data be utilized in knowledge discovery tasks related to AEDs? More specifically, this question can be answered through the following specific knowledge discovery tasks:
(a) Given the common characteristics of the AEDs, what does the OHC data disclose about the common ADRs of AEDs?
(b) Given the common characteristics, mechanisms of action, and chemical structures of AEDs, what does OHC data disclose about their similarities in terms of ADRs?
The remainder of this paper is organized as follows. In Section 2, a review of the related literature on ADRs of AEDs is presented. Section 3 describes the detailed methodology of detecting ADRs from OHC data. In Section 4, the results of the conducted experiments are demonstrated and analyzed to answer the research questions. Section 5 concludes the paper and discusses the future research directions.

2. Literature Review

Over the last three decades, a remarkable increase in the AEDs available to treat patients with epilepsy has been reported [36]. Their aim is to achieve the highest efficacy with minimal ADRs. Like other types of drugs, AEDs are associated with various types of ADRs. However, since the common mechanism of AEDs is to suppress the pathological neuronal hyperexcitability that constitutes the final substrate in many seizure disorders, the ADRs that affect the Central Nervous System (CNS) are the most common type of ADRs [37]. In the literature, the ADRs of AEDs have been a matter of concern in many studies from different perspectives. In [11], three categories of AED ADRs (CNS, behavioral, and general medical issues) have been identified. The long-term ADRs of AEDs, particularly new ones, are studied in [7]. A comprehensive summary of AED ADRs affecting the CNS is given in [37]. A classification and identification of psychiatric ADRs of individual AEDs and general guidelines for their prevention and management are studied in [38]. Furthermore, an assessment of the psychiatric and behavioral ADRs of AEDs is conducted in [39]. A comparison of the new AEDs against the conventional AEDs in terms of their ADRs is conducted in [40], which shows that newer AEDs are associated with a similar trend of ADRs. Owing to the importance of ADRs for AEDs, the safety of AEDs, particularly ADR detection, has become a major concern [13]. For this purpose, data analytics has played a vital role in analyzing AED usage data collected from different sources. In this regard, four types of data sources [14] can be identified: SRSs, electronic health records, pharmaceutical databases, and biomedical literature. Despite their merits, they suffer several limitations. The passiveness of spontaneous reporting systems leads to an extremely high underreporting ratio and makes it difficult to detect new and emerging signals.
Privacy issues often make it difficult to access electronic health records. The accessibility of pharmaceutical databases is also a problem, because not all of them are free and open to the public. In addition, the data of pharmaceutical databases focuses on chemical aspects such as drug structure rather than on textual aspects [14, 41]. Recently, in response to these limitations, social media as an alternative data source for pharmacovigilance has been receiving increasing attention. The research efforts in this area have been reviewed in several surveys [23, 25, 26, 42, 43]. According to these surveys, the following aspects characterize the current state of the art of utilizing social media for pharmacovigilance:
Social media has potential that is understudied, and its value has not yet been realized in practice [23].
Social media may add value for specific niche areas such as drug abuse and pregnancy-related outcomes [43].
With the enhancement of algorithms and techniques, the scope and utility of social media may broaden over time [43].
Additional research is required to explore the value of social media for pharmacovigilance [23, 43].
In general, these surveys share a concordant view on the infancy of utilizing social media data for pharmacovigilance and the dire need for more research efforts in this regard. Concerning the utilization of social media for the detection of ADRs, the research efforts have been reviewed and summarized, as shown in Table 1, across five dimensions: data source, target drug set, number of drugs, ADR extraction approach, and ADR signaling method. A closer look at Table 1 reveals several interesting aspects of these research efforts that inspired the design choices of this research. First, dedicated OHCs such as Askapatient and WebMD have been used as a source of data more often than public social networks such as Twitter and Facebook.
Second, none of the previous research in Table 1 was dedicated to detecting the ADRs of AEDs, though most of the studies, 14 out of 19, examined the ADRs of a specific set of drugs. Third, the lexicon-based method is widely used for extracting drugs and ADRs from social media data. Fourth, disproportionality analysis, a widely used method for detecting ADRs from SRS data, is also used for the detection of ADRs from social media data.
Table 1

Summary of previous research on utilizing social media for ADRs detection.

| Reference | Data source | Target drug set | No. of drugs | ADR extraction approach | ADR signaling method |
| --- | --- | --- | --- | --- | --- |
| [44] | DailyStrength | NA | 6 | Lexicon-based | Association rule mining |
| [45] | Various OHC forums | Breast cancer | 4 | Lexicon-based | Association rule mining |
| [46] | Various parenting forums | Pediatric drugs (fever, pain, influenza, viruses) | 9 | Lexicon-based | Disproportionality analysis |
| [29] | Twitter | Cancer | 5 | ML-based (SVM) | NA |
| [47] | American Diabetes Association | Diabetes | NA | Lexicon-based | Shortest dependency path-based ML algorithm |
| [28] | Askapatient, Drugs.com, DrugRatingZ | Breast cancer | 5 | Pattern-based | NA |
| [48] | Twitter | NA | 23 | Lexicon-based | Aggregated frequencies |
| [33] | MedHelp | Heart disease | NA | Lexicon-based | Rule-based approach for relation classification |
| [49] | DailyStrength | NA | 38 | ML-based (SVM) | Probabilities of all comments associated with each drug combined to predict if drug should be categorized as normal or blackbox |
| [50] | Twitter, DailyStrength | NA | 81 | Supervised learning via conditional random fields (CRF) | NA |
| [24] | Five popular and open French forums | Methylphenidate | 4 | Lexicon-based | Disproportionality analysis |
| [27] | Askapatient.com, Medications.com, WebMD.com | Cholesterol-lowering drugs | | Lexicon-based | Log-likelihood ratio |
| [51] | Twitter | Attention deficit hyperactivity disorder drugs | 44 | ML-based (RNN) | NA |
| [14] | PatientsLikeMe, MedHelp | NA | 20 | Lexicon-based | Association rule mining, disproportionality analysis |
| [52] | Five forums in France | Oral antineoplastic drugs | 8 (ATC subgroup) | Lexicon-based manual | Frequency |
| [30] | Askapatient.com | Diabetes drug Glucophage/metformin | 1 | ML-based | NA |
| [31] | Askapatient.com | Psychiatric drugs | 4 | ML-based | NA |
| [32] | Twitter | Malaria drugs | 19 | ML-based and rule-based (cTake) | Disproportionality analysis |
| [34] | Twitter and PubMed | Opioid drugs | 3 | ML-based (convolutional recurrent neural network (CRNN)) | NA |
On the other hand, a review of the previous research in Table 1, from a methodological point of view, reveals several interesting aspects of the general methodology of detecting ADRs from social media. As characterized in [25] and demonstrated in Figure 1, the general methodology involves five main steps: raw data collection, preprocessing, information extraction (drugs and ADRs), measuring drug-ADR correlations, and evaluation. The raw data can be collected from large public social network sites such as Facebook, Twitter, Flickr, and Tumblr or from specialized healthcare social networks and forums. The latter can be further classified into (i) generic health-centered social network sites where users discuss their health-related experiences, including use of prescription drugs, side effects, and treatments, such as PatientsLikeMe (http://www.patientslikeme.com), DailyStrength (http://www.dailystrength.org), MedHelp (http://www.medhelp.org), WebMD (https://exchanges.webmd.com), and CureTogether (http://curetogether.com); (ii) medicine-focused sharing platforms, which allow patients to share and compare medication experiences, like Askapatient (http://www.askapatient.com) and Medications.com (http://www.medications.com); and (iii) disease-specific online health forums focused on specific diseases, e.g., the TalkStroke forum (https://www.stroke.org.uk/forum) [23]. Depending on the nature of the source, different methods can be utilized to collect the raw data. For large public social network sites, specific application programming interfaces are utilized to extract data; for specialized healthcare social networks and forums, an adapted web crawler to collect web pages and a web scraper to extract the messages from web pages can be used [25].
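The scraping half of this step can be illustrated with a minimal sketch. It assumes, purely for illustration, that each review comment sits in a `<p class="comment">` element; the markup, the class name, and the `ReviewTextParser` and `extract_comments` names are hypothetical and do not describe the page structure of any real OHC. Only the Python standard library is used.

```python
from html.parser import HTMLParser


class ReviewTextParser(HTMLParser):
    """Collect the text of every <p class="comment"> element (hypothetical markup)."""

    def __init__(self):
        super().__init__()
        self.comments = []
        self._inside = False
        self._buffer = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs
        if tag == "p" and ("class", "comment") in attrs:
            self._inside = True
            self._buffer = []

    def handle_data(self, data):
        if self._inside:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "p" and self._inside:
            # Collapse runs of whitespace left over from the page layout
            self.comments.append(" ".join("".join(self._buffer).split()))
            self._inside = False


def extract_comments(html):
    """Return the comment texts found in one downloaded page."""
    parser = ReviewTextParser()
    parser.feed(html)
    parser.close()
    return parser.comments


# Stand-in for a page fetched by the crawler (invented content)
SAMPLE_PAGE = """
<html><body>
  <div class="review"><p class="comment">Felt   dizzy after the first dose.</p></div>
  <div class="review"><p class="rating">3</p><p class="comment">No problems so far.</p></div>
</body></html>
"""

print(extract_comments(SAMPLE_PAGE))
```

A real crawler would additionally need to follow pagination links and respect each site's terms of use; those parts are omitted here.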
Figure 1

General methodology of detecting ADRs from social media.

Since the content and language of medical social media differ from those of general social media and of clinical documents, preprocessing of the raw data is a crucial step. For this purpose, specific text mining methods or techniques based on NLP are employed to identify medical concepts (drugs, ADRs, symptoms, etc.) and relationships among them. In this respect, it is worth mentioning that the performance of the text mining methods plays a vital role [49]. Typically, in the preprocessing step, the following transformations can be performed:
Anonymization: to remove patients' personal data to comply with medical confidentiality.
Spelling correction: to maximize the detection of information in the corpus, spelling mistakes and typing errors must be corrected, because texts extracted from social networks include many abbreviations and typing errors.
Cleaning web pages: to remove tags that are invisible to users.
Stemming: to reduce inflected words to their stem, base, or root forms.
Tokenization: breaking the text up into segments of words, sentences, and paragraphs to ease analyzing the sentences and locutions in the corpus.
N-gram generation: to optimize the extraction of medical concepts, the unigrams, bigrams, and trigrams are generated.
After preprocessing the collected data, the information extraction step extracts medical concepts, particularly the drug names and ADRs, from the cleaned data. For this purpose, the employed approaches can be generally classified as machine learning- (ML-) based approaches and lexicon-based approaches. The use of ML-based approaches is motivated by the fact that most drug-related posts on social media are not associated with ADRs, and therefore, irrelevant posts must be filtered out to identify ADRs. ML-based approaches require a large amount of manually annotated data to make reliable evaluations.
Supervised text classification techniques such as support vector machine and naïve Bayes are the most common ML-based approaches employed to classify user posts to determine if ADRs are mentioned in the posts [26]. Besides supervised ML approaches, unsupervised ML approaches such as topic modeling and named entity recognition can be utilized [24]. Lexicon-based ADR extraction, on the other hand, is a widely adopted approach, as over 50% of the previous studies adopted it [26]. The wide use of lexicon-based ADR extraction is attributed to the wide availability of medical lexicons and knowledge bases in the healthcare domain. The Unified Medical Language System (UMLS), the FDA's Adverse Event Reporting System (FAERS), and the adverse drug event reporting system in Canada (MedEffect) are the most commonly used medical lexicons in the previous studies. Meanwhile, the CHV, a lexicon linking UMLS standard medical terms to patients' colloquial language, has been adopted in many studies to interpret medical terms in online patient discussions [45]. As for measuring the correlation between the drugs and the extracted ADRs, different approaches can be employed. These approaches can be grouped into three categories: disproportionality analysis approaches, association rule mining approaches, and machine learning-based approaches. The disproportionality analysis approaches [53] are based on the calculation of a two-by-two contingency table that relates the observed count for an ADR and a drug of interest with all other ADRs and drugs in the dataset that together constitute a background from which an expected count is derived. The principal difference among these approaches is the method by which the expected value is calculated [53]. There are primarily four different measures of disproportionality used in spontaneous reports: proportional reporting ratio (PRR) [54], reporting odds ratio (ROR) [55], information component (IC) [55], and Empirical Bayes Geometrical Mean (EBGM) [56].
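To make the two-by-two table concrete, the sketch below computes PRR, ROR, and IC from its four cells using the textbook point estimates. It is an illustration under stated assumptions, not necessarily the exact variant used in this paper: the cell names `a`, `b`, `c`, `d` are a common convention rather than notation from the text, the IC shown is the plain log2(observed/expected) form without the Bayesian shrinkage that some implementations add, and no confidence intervals or signaling thresholds are included.

```python
import math


def disproportionality(a, b, c, d):
    """Point estimates of PRR, ROR, and IC for one drug-ADR pair.

    a: reports with the drug of interest and the ADR of interest
    b: reports with the drug of interest and any other ADR
    c: reports with any other drug and the ADR of interest
    d: reports with any other drug and any other ADR
    Assumes b, c, and (c + d) are nonzero; real data needs a guard for empty cells.
    """
    n = a + b + c + d
    prr = (a / (a + b)) / (c / (c + d))   # ADR share for the drug vs. for the background
    ror = (a * d) / (b * c)               # odds of the ADR with the drug vs. without it
    expected = (a + b) * (a + c) / n      # count expected if drug and ADR were independent
    ic = math.log2(a / expected)          # simplified IC: log2(observed / expected)
    return prr, ror, ic


# Invented counts, for illustration only
prr, ror, ic = disproportionality(a=20, b=80, c=10, d=890)
print(f"PRR={prr:.2f}  ROR={ror:.2f}  IC={ic:.2f}")  # PRR=18.00  ROR=22.25  IC=2.74
```

In this invented example the ADR makes up 20% of the drug's reports but only about 1.1% of the background reports, which is why all three measures point the same way.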
Association rule mining approaches aim to mine association rules of the form drug⇒ADR. Common measures used in association rule mining are support, confidence, and lift [14]. They are intuitive, easy to implement, and computationally less intensive. However, this simplicity comes at the cost of statistical soundness in many cases, because the measures do not adjust for the popularity of individual drugs or for correlation [57]. Finally, machine learning-based approaches have the merit of addressing a common problem in the previous approaches, that is, the lack of automatic evaluation of interactions between drugs unless clearly stated in the model. Two examples of ML-based approaches that have been employed are random forests and Monte Carlo logic regression [57]. In the evaluation step, the performance of the ADR detection approach is evaluated. The common evaluation method is to use existing metrics such as recall, precision, F-score, and accuracy. Applying these metrics requires manually annotated data; however, in the absence of annotated data, these metrics can be computed using gold standards. The gold standard can be known ADRs from product labels or databases such as VigiBase, summary of product characteristics, FDA labels, and the Side Effect Resource (SIDER) database [26].
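As a small illustration of gold-standard evaluation, precision for one drug can be computed as the fraction of signaled ADRs that also appear in the reference list. The sketch below assumes both lists have already been normalized to the same vocabulary; the `precision` helper and the example terms are invented for illustration and are not taken from SIDER.

```python
def precision(signaled, gold_standard):
    """Fraction of signaled ADRs that are confirmed by the gold standard."""
    signaled = {term.lower() for term in signaled}
    gold = {term.lower() for term in gold_standard}
    if not signaled:
        return 0.0  # convention chosen here for an empty signal list
    return len(signaled & gold) / len(signaled)


# Invented example: two of the four signaled ADRs are in the reference list
signaled_adrs = ["dizziness", "weight gain", "rash", "hair loss"]
reference_adrs = ["Dizziness", "Rash", "Nausea"]
print(precision(signaled_adrs, reference_adrs))  # 0.5
```

Recall could be computed analogously by dividing by the size of the gold-standard set instead.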

3. Detecting ADRs of AEDs from OHC Data

As mentioned above, the objective of this research is to detect the ADRs of AEDs from drug consumers' reviews in OHCs. Accordingly, the methodology of achieving this objective is a customized variant of the general methodology of detecting ADRs from social media. It involves steps of collecting drug consumers' reviews from OHCs, applying NLP techniques to prepare the data, extracting ADRs for each drug, measuring the correlation between each drug and the extracted ADRs, and finally evaluating the validity and utility of the detected ADRs. Figure 2 depicts the steps of the proposed methodology, and the following subsections describe them in more detail.
Figure 2

Methodology of AEDs' ADR detection.
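The preparation and extraction steps of this methodology can be sketched in a few lines. The example below tokenizes a review, generates unigrams, bigrams, and trigrams, and looks each n-gram up in a tiny consumer-to-preferred-term mapping. The mapping entries, the function names, and the simple regular-expression tokenizer are assumptions made for illustration; the actual CHV and ADR lexicon are far larger, and this sketch omits spelling correction, stemming, and negation handling.

```python
import re

# Hypothetical CHV-style entries: consumer phrase -> preferred ADR term
CONSUMER_TO_PREFERRED = {
    "dizzy": "dizziness",
    "hair loss": "alopecia",
    "weight gain": "weight increased",
    "trouble falling asleep": "insomnia",
}


def tokenize(text):
    """Lowercase the text and keep alphabetic tokens only."""
    return re.findall(r"[a-z]+", text.lower())


def ngrams(tokens, max_n=3):
    """Return all n-grams of length 1..max_n as space-joined strings."""
    return [
        " ".join(tokens[i:i + n])
        for n in range(1, max_n + 1)
        for i in range(len(tokens) - n + 1)
    ]


def extract_adrs(review):
    """Map every n-gram found in the lexicon to its preferred term."""
    grams = ngrams(tokenize(review))
    return sorted({CONSUMER_TO_PREFERRED[g] for g in grams if g in CONSUMER_TO_PREFERRED})


review = "I felt dizzy and had trouble falling asleep; also some hair loss."
print(extract_adrs(review))  # ['alopecia', 'dizziness', 'insomnia']
```

Generating trigrams is what lets a multiword consumer phrase such as "trouble falling asleep" match a single lexicon entry.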

3.1. AED Raw Data Collection

The raw data on AED reviews are captured from the Askapatient and WebMD websites using a web crawler. The data collected from Askapatient include ratings, reasons, side effects, comments from patients, gender, age, duration/dosage, and posting dates, whereas the data collected from WebMD include age, sex, duration of treatment, and comments from patients. At the time of data collection, the number of patients' reviews on AEDs in Askapatient varied from 1860 for lamotrigine to only one review for several AEDs such as Aptiom, whereas in WebMD, the number of patients' reviews ranged from 1818 for Gabapentin to 51 for Dilantin. For this research, AEDs with fewer than 170 reviews are excluded from the data collection. Table 2 shows the AEDs that are considered in this research.
Table 2

List of considered AEDs.

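| No. | Generic name | Brand name | No. of reviews from Askapatient | No. of reviews from WebMD | Total no. of reviews |
| --- | --- | --- | --- | --- | --- |
| 1 | Gabapentin | Neurontin | 914 | 1818 | 2732 |
| 2 | Lamotrigine | Lamictal | 1845 | 365 | 2210 |
| 3 | Topiramate | Topamax | 1764 | 237 | 2001 |
| 4 | Pregabalin | Lyrica | 1392 | 106 | 1498 |
| 5 | Clonazepam | Clonazepam | 324 | 1112 | 1436 |
| 6 | Divalproex sodium | Depakote | 566 | 133 | 699 |
| 7 | Diazepam | Valium | 393 | 261 | 654 |
| 8 | Oxcarbazepine | Trileptal | 357 | 119 | 476 |
| 9 | Carbamazepine | Tegretol | 283 | 150 | 433 |
| 10 | Levetiracetam | Keppra | 190 | 117 | 307 |
| 11 | Phenytoin | Dilantin | 183 | 51 | 234 |
| 12 | Acetazolamide | Diamox | 155 | 95 | 250 |
| Total | | | 8366 | 4564 | 12930 |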
Additionally, to make the data a more representative sample of the drug population, data on non-AEDs must be collected to represent the background of the AED dataset. The background data plays an essential role in the validity and reliability of ADR detection [58, 59]. For this purpose, a set of reviews on non-AEDs has been collected from Askapatient. Table 3 shows the details of the 31 non-AEDs that have been considered in background data collection. They fall into five groups with a total of 43085 reviews.
Table 3

List of considered non-AEDs.

| No. | Drug group | Brand name | Generic name | No. of reviews from Askapatient |
| --- | --- | --- | --- | --- |
| 1 | Depression drugs (19415 reviews) | Cymbalta | Duloxetine | 1472 |
| 2 | | Effexor | Venlafaxine Hydrochloride | 3907 |
| 3 | | Lexapro | Escitalopram Oxalate | 3713 |
| 4 | | Zoloft | Sertraline Hydrochloride | 2821 |
| 5 | | Wellbutrin Xl | Bupropion | 2626 |
| 6 | | Wellbutrin | Bupropion Hydrochloride | 2023 |
| 7 | | Celexa | Citalopram Hydrobromide | 1081 |
| 8 | | Paxil | Paroxetine Hydrochloride | 1772 |
| 9 | Diabetes drugs (2336 reviews) | Actos | Pioglitazone Hydrochloride | 613 |
| 10 | | Byetta | Exenatide Synthetic | 333 |
| 11 | | Glucophage | Metformin Hydrochloride | 1012 |
| 12 | | Victoza | Liraglutide Recombinant | 378 |
| 13 | High blood pressure (1891 reviews) | Lisinopril | Lisinopril | 653 |
| 14 | | Coreg | Carvedilol | 538 |
| 15 | | Inderal | Propranolol Hydrochloride | 376 |
| 16 | | Micardis | Telmisartan | 324 |
| 17 | Allergy drugs (8845 reviews) | Zyrtec | Cetirizine Hydrochloride | 3085 |
| 18 | | Claritin | Loratadine | 1590 |
| 19 | | Allegra | Fexofenadine Hydrochloride | 1346 |
| 20 | | Benadryl | Diphenhydramine Hydrochloride | 948 |
| 21 | | Claritin-D 24 hour | Loratadine; Pseudoephedrine Sulfate | 643 |
| 22 | | Astelin | Azelastine Hydrochloride | 530 |
| 23 | | Vistaril | Hydroxyzine Hydrochloride | 374 |
| 24 | | Xyzal | Levocetirizine Dihydrochloride | 329 |
| 25 | Digestive disorder (10598 reviews) | Prilosec | Omeprazole | 2846 |
| 26 | | Nexium | Esomeprazole Magnesium | 1871 |
| 27 | | Prevacid | Lansoprazole | 434 |
| 28 | | Protonix | Pantoprazole Sodium | 908 |
| 29 | | Aciphex | Rabeprazole Sodium | 815 |
| 30 | | Zantac 150 | Ranitidine Hydrochloride | 376 |
| 31 | | Pepcid | Famotidine | 364 |
| Total | | | | 43085 |
Moreover, Tables 4 and 5 are snapshots of the raw data collected from the two OHCs, Askapatient and WebMD, for Lamictal (lamotrigine). The variation in the structure of the raw data between the two OHCs is notable; however, only the relevant raw data from the two OHCs are selected and compiled into a unified dataset.
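One way to picture the unification step is a small record type that both sources are mapped onto. The sketch below is an assumption about what such a schema could look like, not the schema actually used in this research: the `Review` fields, the dictionary keys, and the choice to keep only the free-text fields (side effects and comments from Askapatient, comments from WebMD) are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Review:
    """Hypothetical unified record shared by both OHCs."""
    source: str
    drug: str
    text: str


def from_askapatient(row, drug):
    # Askapatient separates side effects from comments; join whichever is present
    parts = [row.get("side_effects", ""), row.get("comments", "")]
    text = " ".join(p.strip() for p in parts if p and p.strip())
    return Review(source="askapatient", drug=drug, text=text)


def from_webmd(row, drug):
    # WebMD exposes a single free-text comment per review
    return Review(source="webmd", drug=drug, text=row.get("comment", "").strip())


unified = [
    from_askapatient(
        {"side_effects": "Headaches, initial euphoria, lethargy", "comments": ""},
        drug="lamotrigine",
    ),
    from_webmd(
        {"comment": "I am on it for years and I feel like it makes me tired could that be"},
        drug="lamotrigine",
    ),
]
for review in unified:
    print(review.source, "|", review.text)
```

Keeping the source name on each record makes it possible to report per-community counts later without re-reading the raw pages.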
Table 4

Snapshot of Askapatient raw data for Lamictal (lamotrigine).

| Rating | Reason | Side effects for Lamictal | Comments | Sex | Age | Duration/dosage | Date added |
| --- | --- | --- | --- | --- | --- | --- | --- |
| 4 | Anxiety, OCD, and BPD | Vivid, disturbing dreams and nightmares, increase in acne, weight loss | I really liked Lamictal. It helped with my severe anxiety and panic attacks and obsessive thoughts, and agitation. I'm not sure if it was the placebo effect because it was only a week in and it was such a small dose, but I definitely noticed an improvement. I had to stop it only 10 days in because I noticed it was making my acne worse. It wasn't anywhere as bad as Lithium, but it was still enough to make me feel more insecure. Another side effect was the vivid dreams. I was having extremely vivid, detailed, disturbing nightmares every night. I also experienced weight loss and lack of appetite but this wasn't a dealbreaker for me. The acne and nightmares were however. I really wish I could've kept taking it. | F | 20 | 10 days/25 mg 1X D | 3/30/2020 |
| 2 | Personality disorder. Bipolar, co | It's hard to say that I can describe anything as being the fault of Lamictal. I also take Seroquel, propranolol, and 8 other meds for myriad of problems. Sex drive is there but no desire to pursue it. Fatigued, spaced out, thoughts scattered and unfocused. Hard to stay awake and have “blackouts”. My memory was good but not now. It is non existent. | | F | 60 | 4 years/100 mg × 3 | 3/22/2020 |
| 3 | Depression | Headaches, initial euphoria, lethargy | | F | 31 | 5 years/200 mg 1X D | 1/30/2020 |
| 5 | Bipolar 2 | None | Before I started taking Lamictal I was being treated with Effexor alone. It helped some, but never made me feel “well”. When I started taking Lamictal, everything changed. I wanted to live again, I am no longer thinking the worst about my life situations. For the first time in as long as I can remember, I am truly content. This drug saved my life. | F | 52 | 1 month/75 mg | 1/28/2020 |
| 1 | Central pain syndrome | Save yourself! Terrible drug! It caused severe damage to my intestines. I have severe intestional spasms!!! There are not words to describe the amount of pain! Excruciating doesn't come close!! My suffering is caused by Lamotrigene!!!! I ubered to the is hospital because of this drug! I had no intestinal issues ever until this drug Now i have inflamation in my intestines and silent reflux, from Lamotrigine i can not eat anything acidic or spicy. I am praying this damage is NOT permanent. Or IT HAS DESTROYED MY LIFE! | Other side effects I experienced were a very dry parched mouth no matter how much you drank your mouth is dry that's awful then I started losing my balance. | F | 58 | 3 months/25 mg | 1/12/2020 |
Table 5

Snapshot of WebMD raw data for Lamictal (lamotrigine).

ConditionReview dateReviewer info.Comment
Condition: bipolar depression6/7/2020 3:32:50 PMReviewer: Heatcap111, 35-44 on treatment for 5 to less than 10 years (patient)I am on it for years and I feel like it makes me tired could that be
Condition: epileptic seizure6/3/2020 1:21:23 AMReviewer: Girl sick of pill pushing doctors, 25-34 on treatment for 2 to less than 5 years (patient)I've been taking this medication for a few years now and the side effects have become so unbearable that I'm getting off this medication. This is a mood suppressor for people with bipolar disorder so since given to me for seizures I feel numb, no sex drive, no motivation, and no energy. I'm lethargic and fatigued at some point every day and have trouble falling asleep at night. This med also causes constipation. The longer you take this med the more you'll have to increase the dose (more side effects) because this med is known for your body building a tolerance fast.
Condition: bipolar depression2/8/2020 3:37:30 PMReviewer: K33vin, 55-64 on treatment for less than 1 month (patient)I took 25 mg daily for a week. I think I was allergic. No sleep at all for five days. I had a headache that would not go away. I had body aches. Like the flu without fever. Nausea, vomiting, diarrhea. Chills. Stomach pain. I developed 52 cherry angiomas in one week. I lost 11 pounds. Stopped med at one week
Condition: bipolar depression2/5/2020 8:37:33 PMReviewer: 19-24 on treatment for 2 to less than 5 years (patient)
Condition: bipolar depression1/14/2020 11:37:47 AMReviewer: j, 45-54 female on treatment for 1 to 6 months (patient)I sleep just fine, I started at 25 mg and slowly went up to 100 mg currently. I'm tired and sleep great. No rash, no unbearable side effects.
Condition: bipolar depression12/11/2019 3:03:57 PMReviewer: PekoeGirl1985, 25-34 female on treatment for 1 to less than 2 years (patient)I did not get much sleep while on this medicine. The insomnia side effect is horrendous. Even with adding Ambien to the mix, I still would watch the sun rise. Also, the depersonalization side effect is pretty bad. I just didn't care about anything, and my passion for art was completely gone. I won't ever take this medicine again.

3.2. Data Preprocessing

The first preprocessing step is the selection of the relevant data for each drug from the collected raw data. This includes the side effects and comments fields from Askapatient and the comment field from WebMD. The selected data are then compiled into a unified dataset for each drug. Since these reviews are composed of free text, some NLP techniques are required to preprocess them. This involves the following:
Text cleaning: all punctuation and digits are removed.
Text normalization: the text is converted into lowercase.
Stop word removal: stop words are removed, as they do not contribute to the detection of ADRs.
N-gram generation: the unigrams, bigrams, and trigrams are generated from all the terms in each review. The maximum n-gram length is set to three because the longest ADR term in the ADR lexicon consists of three words.
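The four preprocessing operations listed above can be sketched in a few lines of Python. This is only an illustrative sketch, not the authors' implementation: the paper reports using NLTK, whereas the code below relies on the standard library alone, and the stop-word set `STOP_WORDS` and the function names `preprocess` and `ngrams` are hypothetical.

```python
import re

# Hypothetical, deliberately tiny stop-word set; the paper does not state
# which stop-word list was used.
STOP_WORDS = {"a", "an", "the", "i", "it", "was", "and", "of", "to", "my", "is", "in"}


def preprocess(review: str) -> list[str]:
    """Clean, lowercase, and remove stop words from one free-text review."""
    # Text cleaning: replace punctuation and digits with spaces.
    text = re.sub(r"[^A-Za-z\s]", " ", review)
    # Text normalization: lowercase, then split on whitespace.
    tokens = text.lower().split()
    # Stop word removal.
    return [t for t in tokens if t not in STOP_WORDS]


def ngrams(tokens: list[str], max_n: int = 3) -> list[str]:
    """Generate all unigrams, bigrams, and trigrams (up to max_n words)."""
    grams = []
    for n in range(1, max_n + 1):
        for i in range(len(tokens) - n + 1):
            grams.append(" ".join(tokens[i:i + n]))
    return grams


tokens = preprocess("I had a headache, 3 days of Weight Loss!")
grams = ngrams(tokens)
```

Here `grams` contains candidate terms such as "headache" and "weight loss", which are the units matched against the vocabulary resources in the next step.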

3.3. ADR Extraction

In this step, the ADRs of each drug in the dataset are extracted and their frequency of occurrence is computed. The main idea is to match every unigram, bigram, and trigram generated in the previous step against an ADR lexicon. However, in the casual and open environment of the internet, patients tend to use vocabulary that differs considerably from that of professionals when expressing health concepts [60]. Therefore, direct matching against the standard medical lexicon used by professionals is not adequate. To deal with this problem, the CHV Wiki is employed to convert each term into the equivalent medical term. CHV (consumer health vocabulary) is a collection of forms used in health-oriented communication for a particular task or need [60]. It reflects the difference between patients and professionals in expressing health concepts and helps to bridge this vocabulary gap. After every unigram, bigram, and trigram has been mapped to its equivalent CHV term, the resulting terms are matched against the ADR lexicon to identify the ADRs. For this purpose, the ADR lexicon, an exhaustive list of ADRs and their corresponding UMLS IDs compiled by the DIEGO lab, is used [50]. It includes concepts from the Coding Symbols for a Thesaurus of Adverse Reaction Terms (COSTART), SIDER, and a subset of CHV that represents ADRs not listed in COSTART or SIDER. The final DIEGO lab lexicon contains 13799 phrases with 7432 unique UMLS IDs. It has been made publicly available at http://diego.asu.edu/downloads/publications/ADRMine/ADR_lexicon.tsv. The result of the ADR extraction step is a list of ADRs for each AED along with their frequencies in the corpus. Table 6 shows a snapshot of the ADRs extracted for the AED lamotrigine, given by UMLS ID, CHV term, lexicon ADR, and corresponding count.
Table 6

Snapshot of extracted ADRs for Lamictal (lamotrigine).

CUI | CHV term | Lexicon ADR | Count
C0015230 | Exanthema | Rash | 358
C0003467 | Anxiety | Anxiety | 384
C0030193 | Pain | Pain | 335
C0043094 | Weight gain | Weight problem | 315
C0002622 | Amnesia | Amnesia | 301
C0344315 | Depressed mood | Sadness | 266
C0917801 | Insomnia | Insomnia | 258
C0002170 | Alopecia | Alopecia | 256
C0001144 | Acne vulgaris | Acne vulgaris | 235
C0085633 | Mood swings | Mood altered | 229
C0012833 | Dizziness | Dizziness | 198
C0015672 | Fatigue | Lack of energy | 195
C0226896 | Oral cavity | Oral cavity | 183
C0027497 | Nausea | Nausea | 180
C0033774 | Pruritus | Pruritic disorder | 171
C0026914 | Mycobacterium avium complex | Mycobacterium avium intracellulare | 160
C0338831 | Manic | Mania | 159
C0002957 | Anger | Anger | 146
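The two-stage matching described in this section (consumer term to medical term, then medical term to ADR lexicon) can be illustrated as follows. This is a minimal sketch under stated assumptions: `CHV_TO_MEDICAL` and `ADR_LEXICON` are hypothetical miniature stand-ins for the CHV Wiki and the DIEGO lab lexicon, the consumer phrases are invented for illustration, and only the three CUIs are taken from Table 6.

```python
from collections import Counter

# Hypothetical miniature CHV mapping: consumer phrase -> medical term.
CHV_TO_MEDICAL = {
    "hair loss": "alopecia",
    "cant sleep": "insomnia",
}

# Hypothetical miniature ADR lexicon: medical term -> UMLS concept ID (CUI).
ADR_LEXICON = {
    "alopecia": "C0002170",
    "insomnia": "C0917801",
    "nausea": "C0027497",
}


def extract_adrs(reviews_ngrams: list[list[str]]) -> Counter:
    """Count ADR mentions over the n-grams of all reviews of one drug."""
    counts = Counter()
    for grams in reviews_ngrams:
        for gram in grams:
            # Step 1: map the consumer expression to its medical term, if any.
            term = CHV_TO_MEDICAL.get(gram, gram)
            # Step 2: keep the term only if the ADR lexicon lists it.
            cui = ADR_LEXICON.get(term)
            if cui is not None:
                counts[(cui, term)] += 1
    return counts


reviews = [
    ["hair", "loss", "nausea", "hair loss", "loss nausea"],
    ["insomnia", "nausea", "insomnia nausea"],
]
adr_counts = extract_adrs(reviews)
```

The resulting counter plays the role of the per-drug frequency list illustrated in Table 6.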

3.4. Measuring AED-ADR Association

In this step, the extracted ADRs of all AEDs are compiled into a matrix with AEDs as columns and ADRs as rows. Each cell in the matrix holds the frequency of an ADR in the reviews of a particular AED. To measure the association between each AED and ADR in the AED-ADR matrix, disproportionality analysis methods are used because they are the primary class of signal detection methods in pharmacovigilance research. In addition, they are currently applied in various national spontaneous reporting centers as well as in the Uppsala Monitoring Centre [61]. The disproportionality measures are calculated from the two-by-two contingency table shown in Table 7.
Table 7

Two-by-two contingency table.

 | ADR of interest | Other ADRs | Total
AED of interest | a | b | a + b
Other AEDs | c | d | c + d
Total | a + c | b + d | n = a + b + c + d
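In Table 7, a, b, c, and d are defined as follows:
a: the number of occurrences of the ADR of interest for the AED of interest
b: the number of occurrences of other ADRs for the AED of interest
c: the number of occurrences of the ADR of interest for other AEDs
d: the number of occurrences of other ADRs for other AEDs
Table 8 contains the details of the disproportionality measures applied to measure the association between AEDs and ADRs. It is worth noting that each measure has its own conditions that must be met to indicate a positive signal.
Table 8

Details of the disproportionality analysis methods.

# | Metric | Computations | Threshold | 95% confidence interval
1 | Proportional reporting ratio (PRR) [54] | $\mathrm{PRR} = \frac{a/(a+b)}{c/(c+d)}$; $\mathrm{SE} = \sqrt{\frac{1}{a} - \frac{1}{a+b} + \frac{1}{c} - \frac{1}{c+d}}$; $\chi^2 = \frac{(a+b+c+d)(ad-bc)^2}{(a+c)(a+b)(b+d)(c+d)}$ | PRR ≥ 2, χ² > 4, ≥ 3 cases reported | $\mathrm{CI} = e^{\ln \mathrm{PRR} \pm 1.96\,\mathrm{SE}}$
2 | Reporting odds ratio (ROR) [55] | $\mathrm{ROR} = \frac{a/c}{b/d} = \frac{ad}{bc}$; $\mathrm{SE} = \sqrt{\frac{1}{a} + \frac{1}{b} + \frac{1}{c} + \frac{1}{d}}$ | ROR − 1.96 SE > 1, ≥ 2 cases reported | $\mathrm{CI} = e^{\ln \mathrm{ROR} \pm 1.96\,\mathrm{SE}}$
3 | Information component (IC) [62] | $\mathrm{IC} = \log_2 \frac{a(a+b+c+d)}{(a+c)(a+b)}$; $\mathrm{SD} = \sqrt{\frac{b}{a(a+b)} + \frac{b+d}{(a+c)(a+b+c+d)}}$ | IC − 2 SD > 0 | $\mathrm{CI} = e^{\ln \mathrm{IC} \pm 1.96\,\mathrm{SD}}$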
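To make the computations concrete, the sketch below derives a, b, c, and d from a small drug-by-ADR frequency table and evaluates the three measures with their signal conditions. It is an illustration under stated assumptions rather than the authors' implementation (the paper reports using Excel with XLSTAT): the PRR is taken in its standard row-wise form, [a/(a+b)]/[c/(c+d)], the thresholds are applied as listed in Table 8, all four cells are assumed to be non-zero, and the function names and toy counts are hypothetical.

```python
import math


def contingency(freq, aed, adr):
    """Derive (a, b, c, d) from a nested dict freq[drug][adr] of ADR counts."""
    a = freq[aed].get(adr, 0)
    b = sum(freq[aed].values()) - a
    c = sum(row.get(adr, 0) for name, row in freq.items() if name != aed)
    total = sum(sum(row.values()) for row in freq.values())
    d = total - a - b - c
    return a, b, c, d


def prr(a, b, c, d):
    """Return (PRR, SE of ln PRR, signal flag)."""
    value = (a / (a + b)) / (c / (c + d))
    se = math.sqrt(1 / a - 1 / (a + b) + 1 / c - 1 / (c + d))
    n = a + b + c + d
    chi2 = n * (a * d - b * c) ** 2 / ((a + c) * (a + b) * (b + d) * (c + d))
    signal = value >= 2 and chi2 > 4 and a >= 3
    return value, se, signal


def ror(a, b, c, d):
    """Return (ROR, SE of ln ROR, signal flag)."""
    value = (a * d) / (b * c)
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    # Threshold applied literally as printed in Table 8.
    signal = value - 1.96 * se > 1 and a >= 2
    return value, se, signal


def ic(a, b, c, d):
    """Return (IC, SD, signal flag)."""
    n = a + b + c + d
    value = math.log2(a * n / ((a + c) * (a + b)))
    sd = math.sqrt(b / (a * (a + b)) + (b + d) / ((a + c) * n))
    signal = value - 2 * sd > 0
    return value, sd, signal


# Toy frequency table (hypothetical counts, not data from the paper).
freq = {
    "drug_x": {"rash": 20, "nausea": 80},
    "drug_y": {"rash": 10, "nausea": 890},
}
a, b, c, d = contingency(freq, "drug_x", "rash")
prr_value, prr_se, prr_signal = prr(a, b, c, d)
ror_value, ror_se, ror_signal = ror(a, b, c, d)
ic_value, ic_sd, ic_signal = ic(a, b, c, d)
```

In this toy case, rash accounts for 20% of the ADR mentions for drug_x but only about 1.1% for the comparison drug, so all three measures flag the pair as a signal.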

3.5. Evaluation

The evaluation of ADR detection is performed by comparing the output of the proposed method with a chosen gold standard, namely SIDER [63, 64]. SIDER is a publicly available database containing ADRs text mined from several public sources, including the structured product labels. It has been used in numerous studies as a reference set to evaluate signal detection methods [65-67]. SIDER 4.1, released in October 2015, contains 5868 ADRs for 1430 drugs. Since the objective of this research is to investigate the validity of OHCs as a data source for ADR detection, the precision measure is used for evaluation because it is more indicative than recall. This is due to the differences in how the ADR lists are constructed from the OHCs and from SIDER. In the case of the OHCs, the ADRs are extracted first and the disproportionality analysis measures are then applied with strict threshold values to determine the signaled ADRs, whereas in the case of SIDER, the ADRs are extracted from different sources, including FDA drug labels, in different frequency ranges (frequent, infrequent, rare, etc.). As a result, the list of signaled ADRs from OHCs for a particular drug is very short compared with the corresponding list of ADRs from SIDER. Consequently, when the two lists are compared, the number of false negatives (FN), that is, ADRs that occur in SIDER but not in the list signaled from OHCs, is extremely high, which makes the recall measure not indicative of the validity of the OHCs. Formally, the precision measure is expressed as follows:
$$\mathrm{Precision} = \frac{TP}{TP + FP} \quad (1)$$
where TP (true positive) is the number of ADRs that occur both in the signaled list of ADRs and in SIDER, and FP (false positive) is the number of ADRs that occur in the signaled list of ADRs but not in SIDER.
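As a small worked example of the precision measure, the sketch below compares a signaled ADR list with a reference list using set operations. The function name `precision` and the toy ADR sets are hypothetical and serve only to illustrate the definition of TP and FP given above.

```python
def precision(signaled: set[str], reference: set[str]) -> float:
    """Precision of a signaled ADR list against a reference ADR list."""
    tp = len(signaled & reference)   # signaled and present in the reference
    fp = len(signaled - reference)   # signaled but absent from the reference
    return tp / (tp + fp) if signaled else 0.0


# Toy lists (hypothetical, not taken from SIDER).
signaled_adrs = {"rash", "dizziness", "nausea", "wrinkling"}
reference_adrs = {"rash", "dizziness", "nausea", "headache", "insomnia"}
p = precision(signaled_adrs, reference_adrs)  # 3 TP, 1 FP
```

Note that the two reference ADRs missing from the signaled list are false negatives; they would lower recall but do not affect precision.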

4. Results and Discussions

In this section, the results of applying the methodology described above to detect the ADRs of AEDs are presented, validated, and analyzed to answer the research questions on the validity and utility of the OHC data source. Before that, some details of the implementation settings are worth mentioning. The methodology for detecting the ADRs of AEDs from OHCs is implemented using the Python programming language and a Microsoft Excel spreadsheet. More specifically, Python together with the natural language toolkit NLTK is used to develop a data crawler that captures patients' reviews from Askapatient and WebMD, to preprocess the collected data, and to extract ADRs from the processed data. Moreover, MS Excel with the data analysis package XLSTAT, which allows users to analyze data within the spreadsheet, is used to perform the disproportionality analysis computations. The collected dataset comprises 56015 reviews, of which 23.08% pertain to AEDs and 76.92% to non-AEDs. In the implementation of the disproportionality analysis methods, the thresholds are set as given in Table 8, and ADRs with a frequency of less than 3 are excluded from the computation.

4.1. Signaled AED ADRs

Applying the three disproportionality measures yields, for each AED, three lists of signaled ADRs, one per measure. For a given AED, the generated ADR lists may differ in size. Table 9 shows the size of the ADR lists signaled by the PRR, ROR, and IC for each AED. The difference in size is most notable between PRR and ROR on the one hand and IC on the other, which reflects the differences in computation and thresholds among the three measures. Moreover, the size of the raw data (number of reviews) for each AED helps to explain the differences in the number of signaled ADRs. For instance, Gabapentin has the highest number of signaled ADRs under PRR and ROR and also the highest number of reviews. Phenytoin, on the other hand, has the lowest number of signaled ADRs and the lowest number of reviews as well.
Table 9

Number of signaled ADRs for each AED using PRR, ROR, and IC.

No. | Generic name | No. of signaled ADRs (PRR) | No. of signaled ADRs (ROR) | No. of signaled ADRs (IC)
1 | Acetazolamide | 99 | 99 | 99
2 | Carbamazepine | 107 | 107 | 104
3 | Clonazepam | 151 | 149 | 139
4 | Diazepam | 123 | 123 | 112
5 | Divalproex sodium | 105 | 104 | 97
6 | Gabapentin | 222 | 214 | 189
7 | Lamotrigine | 213 | 200 | 197
8 | Levetiracetam | 115 | 115 | 114
9 | Oxcarbazepine | 139 | 138 | 132
10 | Pregabalin | 195 | 184 | 199
11 | Phenytoin | 83 | 83 | 85
12 | Topiramate | 184 | 174 | 174
The ADRs in the generated lists for each AED are of different types: immunologic, hypersensitivity, nervous system, psychiatric, ocular, gastrointestinal, respiratory, and dermatologic. Moreover, some of them, such as lymph node enlargement and renal calculi, require immediate medical attention, while others, such as loss of weight and weakness, do not, as they may disappear during treatment as the body adjusts to the drug. In each list, each ADR is associated with a value that represents the strength of its association with a particular AED. Tables 10, 11, and 12 show the top 10 signaled ADRs for each AED.
Table 10

Top 10 PRR ranked ADRs of AEDs.

Acetazolamide (Diamox) | PRR | Carbamazepine (Tegretol) | PRR | Clonazepam (clonazepam) | PRR | Diazepam (Valium) | PRR | Divalproex sodium (Depakote) | PRR | Gabapentin (Neurontin) | PRR
Cloudy urine | 1295.13 | Trigeminal neuralgia | 79.19 | Adiposis dolorosa | 28.29 | Heart septal defects, atrial | 210.28 | Neurosis | 153.39 | Radiculopathy | 103.20
Diuretic effect | 555.05 | Lump in neck | 65.99 | Aortic valve incompetence | 28.29 | Diseases of the inner ear | 210.28 | Ovarian disorder | 153.39 | Chiari malformation | 61.92
Cerebrospinal fluid | 333.03 | Strabismus | 52.80 | Hemorrhage intracerebral | 28.29 | Ear diseases | 105.14 | Polycystic ovary disease | 85.21 | Intractable pain | 61.92
Metabolic acidosis | 277.53 | Hyponatremia | 47.14 | Delusional disorder | 28.29 | Fibroid tumor | 105.14 | Vasospasm | 76.69 | Mastocytosis | 41.28
Intracranial hypertension | 185.02 | Neuralgia | 45.31 | Dissociative reaction | 28.29 | Meniere's disease | 90.12 | Abnormal menstrual cycle | 76.69 | Multiple organ failure | 41.28
Spinal headache | 185.02 | Speech disorder | 44.00 | Parasomnia | 28.29 | Muscle spasticity | 63.08 | Psychotic state | 76.69 | Rhabdomyolysis | 41.28
Glaucoma | 66.61 | Deja vu | 33.00 | Thrombosis | 28.29 | Premenstrual tension | 63.08 | Lung problem | 51.13 | Mobility decreased | 41.28
Acidosis | 61.67 | Meningitis | 33.00 | Thrombosis venous | 28.29 | Claustrophobia | 52.57 | Paranoid delusions | 38.35 | Compression injury of nerve | 39.69
Sarcoidosis | 61.67 | Dyslexia | 28.69 | Alanine aminotransferase increased | 28.29 | Cholecystectomy | 35.05 | Hyperthyroidism | 30.68 | Postherpetic neuralgia | 30.96
Gum blister | 61.67 | Necrosis | 26.40 | Hyperpyrexia | 28.29 | Torticollis | 35.05 | Paralysis agitans | 27.89 | Feet burning | 25.80

Lamotrigine (Lamictal) | PRR | Levetiracetam (Keppra) | PRR | Oxcarbazepine (Trileptal) | PRR | Pregabalin (Lyrica) | PRR | Phenytoin (Dilantin) | PRR | Topiramate (Topamax) | PRR
Wrinkling | 50.01 | Meningiomas | 315.71 | Coagulation disorder | 112.54 | Astigmatism | 45.66 | Eclampsia | 313.32 | Sulfa allergy | 116.39
Tanning | 24.25 | Cerebral lesions | 315.71 | Drug eruption | 112.54 | Joint disorders | 45.66 | Facial paralysis | 313.32 | Narrow angle glaucoma | 49.88
Stevens-Johnson syndrome | 21.82 | Necrotic debris | 157.85 | Gonorrhea | 112.54 | Decreased platelet | 34.24 | Heat stroke | 156.66 | Renal calculi | 42.75
Lymph node enlargement | 19.84 | Vascular disease | 157.85 | Colitis microscopic | 112.54 | Pernicious anemia | 22.83 | Gum disorder | 104.44 | Mental blocking | 33.25
Bronchopulmonary dysplasia | 18.18 | Attitude changed | 157.85 | Negative pregnancy test | 112.54 | Contracture | 22.83 | Para | 104.44 | Myopia | 27.71
Hair disorder | 18.18 | Hematoma | 105.24 | Increased sodium | 112.54 | Facial nerve palsies | 22.83 | Mental status changes | 104.44 | Migraine | 17.49
Hypermetropia | 18.18 | Pharyngitis streptococcal | 78.93 | Persistent dry cough | 112.54 | Optic neuritis | 22.83 | Swollen gums | 104.44 | Aphasia | 16.63
Melancholia | 18.18 | Liver failure | 63.14 | Sodium decreased | 98.48 | Peripheral nervous system | 22.83 | Hodgkin lymphoma | 78.33 | Coagulation disorder | 16.63
Necrotic debris | 18.18 | Pruritus ani | 52.62 | Vaginal infection | 75.03 | Skin fissures | 22.83 | Amyotrophy | 78.33 | Facial nerve palsies | 16.63
Pregnancy complication | 18.18 | Pain exacerbated | 52.62 | Hyponatremia | 65.65 | Attitude changed | 22.83 | Verruca | 78.33 | Facial paralysis | 16.63
Table 11

Top 10 ROR ranked ADRs of AEDs.

Acetazolamide (Diamox) | ROR | Carbamazepine (Tegretol) | ROR | Clonazepam (clonazepam) | ROR | Diazepam (Valium) | ROR | Divalproex sodium (Depakote) | ROR | Gabapentin (Neurontin) | ROR
Cloudy urine | 1302.31 | Trigeminal neuralgia | 79.99 | Adiposis dolorosa | 28.30 | Heart septal defects, atrial | 210.47 | Neurosis | 153.49 | Radiculopathy | 103.25
Diuretic effect | 556.37 | Lump in neck | 66.03 | Aortic valve incompetence | 28.30 | Diseases of the inner ear | 210.47 | Ovarian disorder | 153.49 | Chiari malformation | 61.94
Cerebrospinal fluid | 337.81 | Strabismus | 52.85 | Hemorrhage intracerebral | 28.30 | Ear diseases | 105.19 | Polycystic ovary disease | 85.49 | Intractable pain | 61.94
Metabolic acidosis | 278.18 | Hyponatremia | 47.27 | Delusional disorder | 28.30 | Fibroid tumor | 105.19 | Vasospasm | 76.79 | Mastocytosis | 41.29
Spinal headache | 185.31 | Neuralgia | 45.89 | Dissociative reaction | 28.30 | Meniere's disease | 90.36 | Abnormal menstrual cycle | 76.74 | Multiple organ failure | 41.29
Intracranial hypertension | 185.16 | Speech disorder | 44.02 | Parasomnia | 28.30 | Muscle spasticity | 63.17 | Psychotic state | 76.74 | Rhabdomyolysis | 41.29
Glaucoma | 67.08 | Deja vu | 33.02 | Thrombosis | 28.30 | Premenstrual tension | 63.17 | Lung problem | 51.16 | Mobility decreased | 41.29
Acidosis | 61.86 | Meningitis | 33.02 | Thrombosis venous | 28.30 | Claustrophobia | 52.62 | Paranoid delusions | 38.37 | Compression injury of nerve | 39.78
Sarcoidosis | 61.72 | Dyslexia | 28.77 | Alanine aminotransferase increased | 28.30 | Cholecystectomy | 35.06 | Hyperthyroidism | 30.70 | Postherpetic neuralgia | 30.99
Gum blister | 61.72 | Necrosis | 26.41 | Adiposis dolorosa | 28.30 | Torticollis | 35.06 | Paralysis agitans | 27.92 | Feet burning | 25.84

Lamotrigine (Lamictal) | ROR | Levetiracetam (Keppra) | ROR | Oxcarbazepine (Trileptal) | ROR | Pregabalin (Lyrica) | ROR | Phenytoin (Dilantin) | ROR | Topiramate (Topamax) | ROR
Wrinkling | 50.05 | Meningiomas | 316.13 | Drug eruption | 112.65 | Astigmatism | 45.67 | Eclampsia | 313.74 | Sulfa allergy | 116.45
Tanning | 24.25 | Cerebral lesions | 316.13 | Coagulation disorder | 112.60 | Joint disorders | 45.67 | Facial paralysis | 313.74 | Narrow angle glaucoma | 49.89
Stevens-Johnson syndrome | 21.83 | Necrotic debris | 157.96 | Gonorrhea | 112.60 | Decreased platelet | 34.26 | Heat stroke | 157.08 | Renal calculi | 42.98
Lymph node enlargement | 19.87 | Vascular disease | 157.96 | Colitis microscopic | 112.60 | Peripheral nervous system | 22.85 | Swollen gums | 105.13 | Mental blocking | 33.26
Bronchopulmonary dysplasia | 18.20 | Attitude changed | 157.96 | Negative pregnancy test | 112.60 | Skin fissures | 22.83 | Gum disorder | 104.58 | Myopia | 27.72
Weight fluctuation | 18.19 | Hematoma | 105.38 | Increased sodium | 112.60 | Intoxication | 22.83 | Para | 104.58 | Migraine | 19.53
Unrest | 18.19 | Pharyngitis streptococcal | 78.98 | Persistent dry cough | 112.60 | Pernicious anemia | 22.83 | Mental status changes | 104.58 | Aphasia | 16.64
Hair disorder | 18.19 | Liver failure | 63.23 | Sodium decreased | 98.80 | Contracture | 22.83 | Hodgkin lymphoma | 78.43 | Altered taste | 16.64
Hypermetropia | 18.19 | Pruritus ani | 52.65 | Vaginal infection | 75.10 | Facial nerve palsies | 22.83 | Amyotrophy | 78.43 | Aversion | 16.63
Melancholia | 18.19 | Pain exacerbated | 52.65 | Hyponatremia | 65.87 | Optic neuritis | 22.83 | Verruca | 78.43 | Sudden infant death | 16.63
Table 12

Top 10 IC ranked ADRs of AEDs.

Acetazolamide (Diamox) | IC | Carbamazepine (Tegretol) | IC | Clonazepam (clonazepam) | IC | Diazepam (Valium) | IC | Divalproex sodium (Depakote) | IC | Gabapentin (Neurontin) | IC
Papilledema | 7.54 | Aplastic anemia | 7.06 | Coronary bypass | 5.85 | Hyperplasia | 6.73 | Glioma | 7.27 | Herpes zoster | 4.44
Rectal pain | 7.54 | Blepharospasm | 7.06 | Anterograde amnesia | 5.85 | Neck injuries | 6.73 | Erythropoietic protoporphyria | 7.27 | Myeloma | 4.44
Dehiscence | 7.54 | Syndrome of inappropriate antidiuretic hormone | 7.06 | Adiposis dolorosa | 4.26 | Fracture of pelvis nos (disorder) | 6.73 | St segment depression | 7.27 | Otitis media | 4.44
Cloudy urine | 7.35 | Enlarged breasts | 7.06 | Aortic valve incompetence | 4.26 | Temporomandibular joint dislocation | 6.73 | Neurosis | 6.27 | Spondylitis | 4.44
Diuretic effect | 7.12 | Chemical meningitis | 7.06 | Hemorrhage intracerebral | 4.26 | Weal (disorder) | 6.73 | Ovarian disorder | 6.27 | Ankylosing spondylitis | 4.44
Metabolic acidosis | 6.80 | Trigeminal neuralgia | 5.64 | Delusional disorder | 4.26 | Splinter | 6.73 | Polycystic ovary disease | 5.78 | Staphylococcal infection | 4.44
Intracranial hypertension | 6.54 | Lump in neck | 5.47 | Dissociative reaction | 4.26 | Heart septal defects, atrial | 6.14 | Vasospasm | 5.69 | Tachyphylaxis | 4.44
Spinal headache | 6.54 | Strabismus | 5.25 | Parasomnia | 4.26 | Diseases of the inner ear | 6.14 | Abnormal menstrual cycle | 5.69 | Total knee replacement | 4.44
Glaucoma | 5.62 | Hyponatremia | 5.13 | Thrombosis | 4.26 | Ear diseases | 5.73 | Psychotic state | 5.69 | Neurotoxicity syndromes | 4.44
Acidosis | 5.54 | Neuralgia | 5.09 | Thrombosis venous | 4.26 | Fibroid tumor | 5.73 | Lung problem | 5.27 | Stenosis of cervix | 4.44

Lamotrigine (Lamictal) | IC | Levetiracetam (Keppra) | IC | Oxcarbazepine (Trileptal) | IC | Pregabalin (Lyrica) | IC | Phenytoin (Dilantin) | IC | Topiramate (Topamax) | IC
Gouty arthritis | 4.26 | Dermatitis exfoliative | 7.31 | Reduced hearing | 6.83 | Anomia | 4.57 | Cleft lip | 1.41 | Anisocoria | 4.14
Cognitive disorder nos | 4.26 | Hematoma subdural | 7.31 | Myasthenia gravis | 6.83 | Bunion | 4.57 | Febrile seizures | 1.41 | Eyelid ptosis | 4.14
Diabetes insipidus | 4.26 | Subarachnoid hemorrhage | 7.31 | Genital rash | 6.83 | Cushing's syndrome | 4.57 | Lymphomas | 1.41 | Cysto | 4.14
Dilatation and curettage | 4.26 | Increased bun | 7.31 | Myasthenia | 6.83 | Scleroderma | 4.57 | Methemoglobinemia | 1.41 | Emaciation | 4.14
Encephalitis | 4.26 | Hair ingrown | 7.31 | Coagulation disorder | 5.83 | Dry eye syndrome | 4.57 | Loss of affect | 1.41 | Ocular infections | 4.14
Toxic epidermal necrolysis | 4.26 | Normal platelet count | 7.31 | Drug eruption | 5.83 | Dwarfism | 4.57 | Eclampsia | 1.22 | Depilation | 4.14
Folate deficiency | 4.26 | Family stress | 7.31 | Gonorrhea | 5.83 | Epicondylitis | 4.57 | Facial paralysis | 1.22 | Hepatomegaly | 4.14
Herpes simplex | 4.26 | Lip blister | 7.31 | Colitis microscopic | 5.83 | Episcleritis | 4.57 | Gum disorder | 1.12 | Meningitis viral | 4.14
External ear infection | 4.26 | Meningiomas | 6.73 | Negative pregnancy test | 5.83 | Cardiospasm | 4.57 | Para | 1.12 | Menstrual disorder | 4.14
Head lice | 4.26 | Cerebral lesions | 6.73 | Increased sodium | 5.83 | Failure to thrive | 4.57 | Mental status changes | 1.12 | Mental retardation | 4.14
A comparative look at the top 10 ADR lists within and across the three tables reveals a variation in the ADRs among AEDs within each table and a notable agreement between the top-10 ADR lists across the three tables. These observations suggest the need for further analysis to answer the research questions.

4.2. Validity of the Signaled AED ADRs

Since the validity of social media as a data source for pharmacovigilance is still under investigation [23] and the objective of this research is to investigate the validity of the OHC data for the detection of AEDs' ADRs, the signaled AEDs' ADR lists are compared with the counterpart lists in SIDER [63] in terms of precision as given in Equation (1). The results of precision for the signaled ADRs by the three measures (PRR, ROR, and IC) are shown in Table 13. In addition, the precision of the unified list of signaled ADRs (PRR ∪ ROR ∪ IC) as well as the common list of ADRs (PRR ∩ ROR ∩ IC) signaled by the three measures is presented.
Table 13

Precision of the generated lists of signaled ADRs.

AED | PRR ∪ ROR ∪ IC | PRR | ROR | IC | PRR ∩ ROR ∩ IC
Acetazolamide | 0.63 | 0.64 | 0.64 | 0.63 | 0.64
Carbamazepine | 0.80 | 0.82 | 0.82 | 0.81 | 0.83
Clonazepam | 0.70 | 0.70 | 0.69 | 0.69 | 0.68
Diazepam | 0.73 | 0.73 | 0.73 | 0.74 | 0.74
Divalproex sodium | 0.73 | 0.74 | 0.74 | 0.73 | 0.74
Gabapentin | 0.68 | 0.72 | 0.69 | 0.68 | 0.70
Lamotrigine | 0.75 | 0.77 | 0.77 | 0.75 | 0.72
Levetiracetam | 0.61 | 0.63 | 0.63 | 0.62 | 0.64
Oxcarbazepine | 0.76 | 0.78 | 0.78 | 0.78 | 0.77
Pregabalin | 0.76 | 0.77 | 0.81 | 0.77 | 0.80
Phenytoin | 0.69 | 0.75 | 0.74 | 0.71 | 0.68
Topiramate | 0.75 | 0.79 | 0.80 | 0.77 | 0.84
Average | 0.71 | 0.74 | 0.74 | 0.72 | 0.73
From the above table, it is clear that the validation results against SIDER vary notably among AEDs. Precision is lowest in the case of Levetiracetam and highest in the case of Carbamazepine. Both sides of the validation process, AED ADR detection from the OHC reviews and the SIDER ADR collection from drug labels, depend on the quality and quantity of the data sources available for each AED, which vary among AEDs; the variation of the validation results among AEDs is therefore to be expected. On the other hand, the limited variation among PRR, ROR, IC, and their unified and common lists of signaled ADRs is also notable. More precisely, the comparison between the validation results of the three measures indicates that the results of PRR and ROR are comparable and identical in 4 AED cases. As for the IC, its validation results are lower than those of PRR and ROR. This indicates that both PRR and ROR perform slightly better than IC, which contradicts the previously reported conclusion on the better performance of IC as compared to PRR and ROR. The specific characteristics of the two data sources, SRS and OHCs, and their associated techniques could explain this contradiction. Despite the reported limitations of existing evaluation methods [26], the validation results shown in Table 13 indicate the validity of the OHCs as a source of data for ADR detection. With regard to comparing the obtained results with previously reported ones, the difficulty of conducting such an assessment has been pointed out in [26], since each study uses a different dataset. Moreover, the absence of an annotated benchmark dataset makes the use of a gold standard such as the FDA label or SIDER, despite its reported shortcomings, the sole possible option.
Nonetheless, comparing the obtained precision values with those reported in previous research, regardless of the contextual differences, can position this research methodology among the previously proposed ones. As reported in [26], the precision values reported in eleven previous studies range between 0.54 and 0.87, whereas the precision values obtained in this research range between 0.62 and 0.84. The precision values of this research methodology are thus consistent with those of the previous research.

4.3. Common ADRs of AED Analysis

The common AEDs' ADRs are those ADRs that are shared by most, if not all, AEDs. To answer the research question on the common AEDs' ADRs that are detected from OHC data, three lists of the common ADRs signaled by PRR, ROR, and IC along with their probabilities of occurrence are generated as shown in Table 14. The high degree of agreement between the lists of common AEDs' ADR generated by the three measures is notable, though the IC generates a shorter list. Nonetheless, most of the ADRs in the three lists are common. A closer look at these lists reveals that they are dominated by the CNS ADRs, which is consistent with what is reported in the literature of AEDs' ADRs. Since AEDs act to suppress the pathological neuronal hyperexcitability that constitutes the final substrate in many seizure disorders, it is not surprising that they are prone to causing adverse reactions that affect the CNS [37]. Moreover, according to [68], the CNS ADRs are the most frequently reported type of AEDs' ADRs and this typically includes fatigue, drowsiness, concentration difficulties, memory problems, and irritability.
Table 14

Common ADRs among AEDs.

ADR (PRR) | Pr(ADR) | ADR (ROR) | Pr(ADR) | ADR (IC) | Pr(ADR)
Amnesia | 0.75 | Amnesia | 0.75 | Amnesia | 0.75
Slurred | 0.75 | Slurred | 0.75 | Slurred | 0.75
Forgetfulness | 0.67 | Forgetfulness | 0.75 | Forgetfulness | 0.67
Epileptic seizure | 0.67 | Epileptic seizure | 0.67 | Epileptic seizure | 0.67
Mental confusion | 0.58 | Mental confusion | 0.67 | Convulsion | 0.67
Somnolence | 0.58 | Somnolence | 0.58 | Mental confusion | 0.58
Convulsion | 0.58 | Convulsion | 0.58 | Aura | 0.58
Aura | 0.58 | Aura | 0.58 | Convulsions local | 0.58
Convulsions local | 0.58 | Convulsions local | 0.58 | Somnolence | 0.58
Cerebrovascular stroke | 0.58 | Cerebrovascular stroke | 0.58 | Cerebrovascular stroke | 0.58
Deafness | 0.50 | Deafness | 0.58 | Vision double | 0.50
Vision double | 0.50 | Vision double | 0.50 | Blurring of visual image | 0.50
Blurring of visual image | 0.50 | Convulsion grand mal | 0.50 | Convulsion grand mal | 0.50
Convulsion grand mal | 0.50 | Convulsion petit mal | 0.50 | Seizure grand mal | 0.50
Convulsion petit mal | 0.50 | Seizure grand mal | 0.50 | Traumatic injury | 0.50
Gain weight | 0.50 | Gain weight | 0.50 | Gain weight | 0.50
Seizure grand mal | 0.50
Clumsiness | 0.50
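One way to obtain a table of this kind is sketched below. It rests on an assumption that the paper does not state explicitly: Pr(ADR) is taken to be the fraction of AEDs whose signaled list contains the ADR, which is consistent with the tabulated values being multiples of 1/12 for the 12 AEDs. The 0.5 cutoff, the function name `common_adrs`, and the toy lists are likewise hypothetical.

```python
from collections import Counter


def common_adrs(signaled: dict[str, set[str]], min_share: float = 0.5) -> dict[str, float]:
    """Share of drugs whose signaled list contains each ADR, above a cutoff."""
    n_drugs = len(signaled)
    # Each ADR is counted once per drug because the lists are sets.
    counts = Counter(adr for adrs in signaled.values() for adr in adrs)
    return {
        adr: count / n_drugs
        for adr, count in counts.items()
        if count / n_drugs >= min_share
    }


# Toy signaled lists for four hypothetical drugs.
toy_lists = {
    "drug_a": {"amnesia", "rash"},
    "drug_b": {"amnesia", "somnolence"},
    "drug_c": {"amnesia", "somnolence"},
    "drug_d": {"nausea"},
}
shares = common_adrs(toy_lists)
```

Running the same function once per measure (PRR, ROR, IC) would give three lists comparable to the three column pairs of Table 14.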

4.4. AED ADR Similarity Analysis

The similarity between drugs in terms of their ADRs reflects their structural composition and mechanism of action [68]. To answer the research question on the potential similarities between AEDs in terms of their signaled ADRs, a similarity measure is developed and applied to quantify the similarity between each pair of AEDs as computed from the lists of signaled ADRs generated by PRR, ROR, and IC. In this measure, the similarity between a pair of AEDs, e.g., $\mathrm{AED}_i$ and $\mathrm{AED}_j$, is computed as follows:
$$\mathrm{Similarity}(\mathrm{AED}_i, \mathrm{AED}_j) = \frac{\lvert \mathrm{ADR}_i \cap \mathrm{ADR}_j \rvert}{\lvert \mathrm{ADR}_i \rvert}$$
where $\mathrm{ADR}_i$ and $\mathrm{ADR}_j$ denote the lists of signaled ADRs of $\mathrm{AED}_i$ and $\mathrm{AED}_j$, respectively. Since the ADR lists of $\mathrm{AED}_i$ and $\mathrm{AED}_j$ differ in size, Similarity($\mathrm{AED}_i$, $\mathrm{AED}_j$) and Similarity($\mathrm{AED}_j$, $\mathrm{AED}_i$) are expected to differ as well. Tables 15, 16, and 17 show the similarity between each pair of AEDs in terms of the signaled ADR lists generated by PRR, ROR, and IC, respectively.
Table 15

Similarity between AED pairs—PRR.

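 | Acetazolamide | Carbamazepine | Clonazepam | Diazepam | Divalproex sodium | Gabapentin | Lamotrigine | Levetiracetam | Oxcarbazepine | Pregabalin | Phenytoin | Topiramate
Acetazolamide | 1 | 0.17 | 0.12 | 0.05 | 0.09 | 0.13 | 0.21 | 0.15 | 0.20 | 0.18 | 0.06 | 0.36
Carbamazepine | 0.16 | 1 | 0.15 | 0.07 | 0.22 | 0.24 | 0.35 | 0.24 | 0.32 | 0.33 | 0.21 | 0.27
Clonazepam | 0.08 | 0.11 | 1 | 0.28 | 0.09 | 0.19 | 0.13 | 0.11 | 0.09 | 0.16 | 0.09 | 0.10
Diazepam | 0.04 | 0.07 | 0.35 | 1 | 0.10 | 0.20 | 0.13 | 0.10 | 0.11 | 0.17 | 0.11 | 0.08
Divalproex sodium | 0.07 | 0.20 | 0.11 | 0.10 | 1 | 0.10 | 0.28 | 0.16 | 0.17 | 0.11 | 0.15 | 0.18
Gabapentin | 0.06 | 0.12 | 0.13 | 0.11 | 0.05 | 1 | 0.13 | 0.10 | 0.14 | 0.34 | 0.09 | 0.12
Lamotrigine | 0.10 | 0.18 | 0.09 | 0.08 | 0.16 | 0.14 | 1 | 0.22 | 0.23 | 0.15 | 0.16 | 0.22
Levetiracetam | 0.13 | 0.23 | 0.15 | 0.10 | 0.17 | 0.20 | 0.39 | 1 | 0.26 | 0.18 | 0.21 | 0.24
Oxcarbazepine | 0.14 | 0.24 | 0.10 | 0.10 | 0.15 | 0.22 | 0.34 | 0.22 | 1 | 0.20 | 0.15 | 0.20
Pregabalin | 0.09 | 0.18 | 0.12 | 0.11 | 0.07 | 0.38 | 0.15 | 0.11 | 0.13 | 1 | 0.07 | 0.17
Phenytoin | 0.04 | 0.12 | 0.08 | 0.07 | 0.10 | 0.11 | 0.17 | 0.13 | 0.11 | 0.08 | 1 | 0.12
Topiramate | 0.20 | 0.16 | 0.08 | 0.05 | 0.12 | 0.14 | 0.24 | 0.15 | 0.14 | 0.18 | 0.12 | 1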
Table 16

Similarity between AED pairs—ROR.

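 | Acetazolamide | Carbamazepine | Clonazepam | Diazepam | Divalproex sodium | Gabapentin | Lamotrigine | Levetiracetam | Oxcarbazepine | Pregabalin | Phenytoin | Topiramate
Acetazolamide | 1 | 0.17 | 0.11 | 0.05 | 0.09 | 0.12 | 0.20 | 0.15 | 0.20 | 0.17 | 0.06 | 0.33
Carbamazepine | 0.16 | 1 | 0.15 | 0.07 | 0.21 | 0.22 | 0.35 | 0.24 | 0.32 | 0.32 | 0.21 | 0.25
Clonazepam | 0.07 | 0.11 | 1 | 0.28 | 0.09 | 0.19 | 0.13 | 0.11 | 0.09 | 0.15 | 0.09 | 0.08
Diazepam | 0.04 | 0.07 | 0.34 | 1 | 0.10 | 0.20 | 0.12 | 0.10 | 0.11 | 0.17 | 0.11 | 0.07
Divalproex sodium | 0.09 | 0.22 | 0.13 | 0.12 | 1 | 0.12 | 0.33 | 0.19 | 0.20 | 0.13 | 0.17 | 0.18
Gabapentin | 0.06 | 0.11 | 0.13 | 0.11 | 0.06 | 1 | 0.12 | 0.11 | 0.14 | 0.33 | 0.09 | 0.10
Lamotrigine | 0.10 | 0.19 | 0.10 | 0.08 | 0.17 | 0.13 | 1 | 0.22 | 0.21 | 0.14 | 0.17 | 0.21
Levetiracetam | 0.13 | 0.23 | 0.15 | 0.10 | 0.17 | 0.20 | 0.38 | 1 | 0.26 | 0.17 | 0.21 | 0.24
Oxcarbazepine | 0.14 | 0.25 | 0.10 | 0.10 | 0.15 | 0.22 | 0.30 | 0.22 | 1 | 0.19 | 0.15 | 0.17
Pregabalin | 0.10 | 0.19 | 0.13 | 0.12 | 0.08 | 0.39 | 0.15 | 0.11 | 0.15 | 1 | 0.08 | 0.16
Phenytoin | 0.07 | 0.28 | 0.17 | 0.16 | 0.22 | 0.23 | 0.40 | 0.29 | 0.25 | 0.16 | 1 | 0.27
Topiramate | 0.19 | 0.16 | 0.07 | 0.05 | 0.11 | 0.12 | 0.24 | 0.16 | 0.13 | 0.17 | 0.13 | 1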
Table 17

Similarity between AED pairs—IC.

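 | Acetazolamide | Carbamazepine | Clonazepam | Diazepam | Divalproex sodium | Gabapentin | Lamotrigine | Levetiracetam | Oxcarbazepine | Pregabalin | Phenytoin | Topiramate
Acetazolamide | 1.00 | 0.14 | 0.09 | 0.05 | 0.06 | 0.12 | 0.20 | 0.13 | 0.20 | 0.15 | 0.05 | 0.34
Carbamazepine | 0.13 | 1.00 | 0.12 | 0.07 | 0.16 | 0.19 | 0.33 | 0.22 | 0.30 | 0.28 | 0.21 | 0.21
Clonazepam | 0.06 | 0.09 | 1.00 | 0.25 | 0.09 | 0.15 | 0.13 | 0.12 | 0.10 | 0.12 | 0.10 | 0.06
Diazepam | 0.05 | 0.07 | 0.32 | 1.00 | 0.09 | 0.18 | 0.08 | 0.06 | 0.09 | 0.14 | 0.12 | 0.07
Divalproex sodium | 0.06 | 0.18 | 0.13 | 0.09 | 1.00 | 0.10 | 0.33 | 0.18 | 0.20 | 0.11 | 0.15 | 0.19
Gabapentin | 0.06 | 0.11 | 0.11 | 0.10 | 0.05 | 1.00 | 0.11 | 0.10 | 0.15 | 0.34 | 0.10 | 0.09
Lamotrigine | 0.10 | 0.17 | 0.09 | 0.04 | 0.16 | 0.11 | 1.00 | 0.21 | 0.20 | 0.10 | 0.16 | 0.15
Levetiracetam | 0.11 | 0.20 | 0.14 | 0.05 | 0.15 | 0.17 | 0.36 | 1.00 | 0.25 | 0.16 | 0.18 | 0.20
Oxcarbazepine | 0.15 | 0.23 | 0.11 | 0.07 | 0.14 | 0.21 | 0.30 | 0.21 | 1.00 | 0.20 | 0.16 | 0.17
Pregabalin | 0.08 | 0.15 | 0.09 | 0.08 | 0.06 | 0.32 | 0.10 | 0.09 | 0.13 | 1.00 | 0.07 | 0.11
Phenytoin | 0.06 | 0.26 | 0.16 | 0.15 | 0.18 | 0.22 | 0.36 | 0.24 | 0.25 | 0.15 | 1.00 | 0.21
Topiramate | 0.20 | 0.13 | 0.05 | 0.04 | 0.10 | 0.10 | 0.17 | 0.13 | 0.13 | 0.13 | 0.10 | 1.00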
The ADR similarity of AED pairs is notably consistent across the three tables. To obtain an overall summary, the overall average similarity for each AED pair, $\mathrm{AED}_i$ and $\mathrm{AED}_j$, is computed as follows: within each table, the two directional similarities of the pair are averaged, and the mean of the three resulting averages is then taken. Table 18 shows the overall average similarity for each AED pair.
Table 18

Overall average ADR similarity between AED pairs using ADRs signaled by PRR, ROR, and IC.

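| | Acetazolamide | Carbamazepine | Clonazepam | Diazepam | Divalproex sodium | Gabapentin | Lamotrigine | Levetiracetam | Oxcarbazepine | Pregabalin | Phenytoin | Topiramate |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Acetazolamide | | | | | | | | | | | | |
| Carbamazepine | 0.16 | | | | | | | | | | | |
| Clonazepam | 0.09 | 0.12 | | | | | | | | | | |
| Diazepam | 0.05 | 0.07 | 0.30 | | | | | | | | | |
| Divalproex sodium | 0.08 | 0.20 | 0.11 | 0.10 | | | | | | | | |
| Gabapentin | 0.09 | 0.17 | 0.15 | 0.15 | 0.08 | | | | | | | |
| Lamotrigine | 0.15 | 0.26 | 0.11 | 0.09 | 0.24 | 0.12 | | | | | | |
| Levetiracetam | 0.13 | 0.23 | 0.13 | 0.09 | 0.17 | 0.15 | 0.30 | | | | | |
| Oxcarbazepine | 0.17 | 0.28 | 0.10 | 0.10 | 0.17 | 0.18 | 0.26 | 0.24 | | | | |
| Pregabalin | 0.13 | 0.24 | 0.13 | 0.13 | 0.09 | 0.35 | 0.13 | 0.14 | 0.17 | | | |
| Phenytoin | 0.06 | 0.22 | 0.12 | 0.12 | 0.16 | 0.14 | 0.24 | 0.21 | 0.18 | 0.10 | | |
| Topiramate | 0.27 | 0.20 | 0.07 | 0.06 | 0.15 | 0.11 | 0.20 | 0.19 | 0.16 | 0.15 | 0.16 | |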
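The averaging behind Table 18 can be sketched as follows. This is a minimal illustration, not the authors' code. It assumes that the "similarity average" of a pair within one table is the mean of its two directional entries (row i, column j and row j, column i), since the per-measure tables are not symmetric. The function name `overall_pair_similarity` and the dictionary layout are hypothetical. Only the two per-measure tables reproduced in this section are used as input, so the output approximates Table 18 rather than reproducing it.

```python
from statistics import mean

def overall_pair_similarity(tables, pairs):
    """Average the two directional similarities of each pair within every
    table, then average those per-table values across tables.
    (Assumed reading of the averaging step; not taken from the paper's code.)"""
    result = {}
    for a, b in pairs:
        per_table = [(t[a][b] + t[b][a]) / 2 for t in tables]
        result[(a, b)] = mean(per_table)
    return result

# Directional values copied from the table preceding Table 17 (row -> column).
preceding_table = {
    "Gabapentin": {"Pregabalin": 0.33},
    "Pregabalin": {"Gabapentin": 0.39},
    "Acetazolamide": {"Topiramate": 0.33},
    "Topiramate": {"Acetazolamide": 0.19},
}
# Directional values copied from Table 17 (IC).
ic_table = {
    "Gabapentin": {"Pregabalin": 0.34},
    "Pregabalin": {"Gabapentin": 0.32},
    "Acetazolamide": {"Topiramate": 0.34},
    "Topiramate": {"Acetazolamide": 0.20},
}

pairs = [("Pregabalin", "Gabapentin"), ("Topiramate", "Acetazolamide")]
overall = overall_pair_similarity([preceding_table, ic_table], pairs)
for pair, value in overall.items():
    print(pair, round(value, 3))
```

With these two tables alone, the pairs come out at roughly 0.345 and 0.265, close to the 0.35 and 0.27 reported in Table 18, which also includes the third measure.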
From Table 18, it is clear that the overall average similarity is relatively high for several AED pairs: (Pregabalin, Gabapentin) at 0.35, (Diazepam, Clonazepam) and (Lamotrigine, Levetiracetam) at 0.30, (Oxcarbazepine, Carbamazepine) at 0.28, (Topiramate, Acetazolamide) at 0.27, and (Lamotrigine, Carbamazepine) at 0.26. This can be explained by the similarity of the mechanisms of action of these AED pairs [1]. For example, Pregabalin and Gabapentin share a common mechanism, blockade of the α2δ subunit of voltage-gated Ca2+ channels; Oxcarbazepine and Carbamazepine are both Na+ channel blockers, as are Lamotrigine and Carbamazepine. Diazepam and Clonazepam belong to the same drug group, the benzodiazepines, which inhibit epileptic electrical activity efficiently. They are structurally similar, each composed of a benzene ring fused to a seven-membered diazepine ring [69]. As for Topiramate and Acetazolamide, since they share carbonic anhydrase inhibition and not serotonin activity, it seems plausible that they share common ADRs [70]. Finally, although Lamotrigine and Levetiracetam have different mechanisms of action (Lamotrigine blocks voltage-gated sodium channels and stabilizes their inactive state, while Levetiracetam inhibits the release of excitatory neurotransmitters by binding to synaptic vesicle protein SV2A), evidence of their common effects has recently been reported [71].

5. Conclusion

In this paper, the validity and utility of social media as a data source for detecting the ADRs of AEDs have been investigated. To this end, patients' reviews from two OHCs were collected, and a lexicon-based method with disproportionality analysis measures was applied to generate lists of ADRs for each AED. The generated lists of signaled ADRs were analyzed to answer three research questions concerning the validity of the signaled ADRs of AEDs, the ADRs common to AEDs, and the similarity between AEDs in terms of ADRs. To answer the first question, the lists of signaled ADRs were compared with the corresponding sets of ADRs in the SIDER database. Despite variation in the validation results across AEDs, the average results indicate that ADR detection from OHC data is valid. Moreover, the validation results indicate comparable performance of PRR and ROR and slightly lower performance of IC. As for the second question, the analysis of the generated ADR lists indicates that most AED ADRs are of the CNS type, which is concordant with the extant pharmaceutical literature on AEDs. Finally, the analysis of the similarity between AEDs in terms of their ADRs shows a notable similarity between several pairs of AEDs. Overall, the answer to the first question is evidence of the validity of using OHCs for the detection of AED ADRs, and the answers to the second and third questions are evidence of the utility of OHC data for knowledge discovery tasks related to AEDs. A final remark concerns the heavy role of NLP techniques both in detecting ADRs from social media and in extracting ADRs from drug labels to construct ADR databases such as SIDER. Continued improvement of NLP techniques should therefore improve both the detection and the validation of ADRs from social media.
On the other hand, an alternative computational paradigm that could be investigated for the detection of AED ADRs is the ML-based approach. In this context, a comparison between lexicon-based and ML-based approaches would be interesting.
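For readers unfamiliar with the three disproportionality measures named above, the sketch below computes them from a 2x2 contingency table using their standard textbook forms. It is an illustration under stated assumptions, not the implementation used in this study. The function name `disproportionality` and the counts are hypothetical, and IC is given in its simple observed-to-expected log2 form, without any Bayesian shrinkage that a particular implementation may apply.

```python
import math

def disproportionality(a, b, c, d):
    """Compute PRR, ROR, and IC from a 2x2 table of report counts.

    a: target drug, target ADR      b: target drug, other ADRs
    c: other drugs, target ADR      d: other drugs, other ADRs
    """
    n = a + b + c + d
    prr = (a / (a + b)) / (c / (c + d))          # proportional reporting ratio
    ror = (a * d) / (b * c)                      # reporting odds ratio
    ic = math.log2(a * n / ((a + b) * (a + c)))  # log2(observed / expected)
    return prr, ror, ic

# Hypothetical counts, chosen only to exercise the formulas.
prr, ror, ic = disproportionality(a=20, b=80, c=100, d=800)
print(f"PRR={prr:.2f}  ROR={ror:.2f}  IC={ic:.2f}")
```

Values of PRR and ROR above 1, or IC above 0, indicate that the ADR is reported with the drug more often than expected from the background; the thresholds used for signaling are a separate choice.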
  49 in total

1.  Use of proportional reporting ratios (PRRs) for signal generation from spontaneous adverse drug reaction reports.

Authors:  S J Evans; P C Waller; S Davis
Journal:  Pharmacoepidemiol Drug Saf       Date:  2001 Oct-Nov       Impact factor: 2.890

2.  The effect of backgrounds in safety analysis: the impact of comparison cases on what you see.

Authors:  Victor V Gogolak
Journal:  Pharmacoepidemiol Drug Saf       Date:  2003 Apr-May       Impact factor: 2.890

Review 3.  Perspectives on the use of data mining in pharmaco-vigilance.

Authors:  June Almenoff; Joseph M Tonning; A Lawrence Gould; Ana Szarfman; Manfred Hauben; Rita Ouellet-Hellstrom; Robert Ball; Ken Hornbuckle; Louisa Walsh; Chuen Yee; Susan T Sacks; Nancy Yuen; Vaishali Patadia; Michael Blum; Mike Johnston; Charles Gerrits; Harry Seifert; Karol Lacroix
Journal:  Drug Saf       Date:  2005       Impact factor: 5.606

4.  The Role of Benzodiazepines in the Treatment of Epilepsy.

Authors:  Juan G Ochoa; William A Kilgo
Journal:  Curr Treat Options Neurol       Date:  2016-04       Impact factor: 3.598

Review 5.  New avenues for anti-epileptic drug discovery and development.

Authors:  Wolfgang Löscher; Henrik Klitgaard; Roy E Twyman; Dieter Schmidt
Journal:  Nat Rev Drug Discov       Date:  2013-09-20       Impact factor: 84.694

Review 6.  Mechanisms of action of currently used antiseizure drugs.

Authors:  Graeme J Sills; Michael A Rogawski
Journal:  Neuropharmacology       Date:  2020-01-14       Impact factor: 5.250

Review 7.  The long-term safety of antiepileptic drugs.

Authors:  Athanasios Gaitatzis; Josemir W Sander
Journal:  CNS Drugs       Date:  2013-06       Impact factor: 5.749

8.  Large-scale combining signals from both biomedical literature and the FDA Adverse Event Reporting System (FAERS) to improve post-marketing drug safety signal detection.

Authors:  Rong Xu; QuanQiu Wang
Journal:  BMC Bioinformatics       Date:  2014-01-15       Impact factor: 3.169

Review 9.  Adverse Drug Reaction Identification and Extraction in Social Media: A Scoping Review.

Authors:  Jérémy Lardon; Redhouane Abdellaoui; Florelle Bellet; Hadyl Asfari; Julien Souvignet; Nathalie Texier; Marie-Christine Jaulent; Marie-Noëlle Beyens; Anita Burgun; Cédric Bousquet
Journal:  J Med Internet Res       Date:  2015-07-10       Impact factor: 5.428

10.  A side effect resource to capture phenotypic effects of drugs.

Authors:  Michael Kuhn; Monica Campillos; Ivica Letunic; Lars Juhl Jensen; Peer Bork
Journal:  Mol Syst Biol       Date:  2010-01-19       Impact factor: 11.429
