Maryam Zolnoori, Kin Wah Fung, Timothy B Patrick, Paul Fontelo, Hadi Kharrazi, Anthony Faiola, Nilay D Shah, Yi Shuan Shirley Wu, Christina E Eldredge, Jake Luo, Mike Conway, Jiaxi Zhu, Soo Kyung Park, Kelly Xu, Hamideh Moayyed.
Abstract
The "Psychiatric Treatment Adverse Reactions" (PsyTAR) dataset contains patients' expressions of effectiveness and adverse drug events associated with psychiatric medications. The PsyTAR dataset was generated in four phases. In the first phase, a sample of 891 drug reviews posted by patients on an online healthcare forum, "askapatient.com", was collected for four psychiatric drugs: Zoloft, Lexapro, Cymbalta, and Effexor XR. For each drug review, patient demographic information, duration of treatment, and satisfaction with the drug were reported. In the second phase, sentence classification, the drug reviews were split into 6009 sentences, and each sentence was labeled for the presence of Adverse Drug Reactions (ADRs), Withdrawal Symptoms (WDs), Sign/Symptoms/Illness (SSIs), Drug Indications (DIs), Drug Effectiveness (EF), Drug Ineffectiveness (INF), and Others (not applicable). In the third phase, entities including ADRs (4813 mentions), WDs (590 mentions), SSIs (1219 mentions), and DIs (792 mentions) were identified and extracted from the sentences. In the fourth phase, all the identified entities were mapped to the corresponding UMLS Metathesaurus concepts (916) and SNOMED CT concepts (755). In this phase, qualifiers representing the severity and persistency of ADRs, WDs, SSIs, and DIs (e.g., mild, short term) were also identified. All sentences and identified entities were linked to the original post using IDs (e.g., Zoloft.1, Effexor.29, Cymbalta.31). The PsyTAR dataset can be accessed via Online Supplement #1 under the CC BY 4.0 Data license. Updated versions of the dataset will also be accessible at https://sites.google.com/view/pharmacovigilanceinpsychiatry/home.
Year: 2019 PMID: 31065579 PMCID: PMC6495095 DOI: 10.1016/j.dib.2019.103838
Source DB: PubMed Journal: Data Brief ISSN: 2352-3409
Fig. 1. Sample sizes for the four drugs of the dataset.
Fig. 2. Gender distribution in the sample.
Fig. 3. Frequency of sentences labeled for each item in the dataset, and for the SSRI and SNRI classes separately.
Fig. 4. Frequency of cognitive, physiological, psychological, and functional problem entity types by ADRs, WDs, DIs, and SSIs for the entire dataset.
Fig. 5. Percentage of cognitive, physiological, psychological, and functional problem entity types by ADRs, WDs, DIs, and SSIs in the entire dataset.
Fig. 6. Frequency of UMLS concepts for ADRs, WDs, DIs, and SSIs after normalization.
Fig. 7. Reduction of identified entities by mapping to the UMLS Metathesaurus concepts.
Fig. 8. Frequency of qualifiers indicating severity and persistency of the identified entities (ADR, WD, DI, SSI).
Specifications table
| Subject area | Psychiatric medications, Consumer Health Informatics, Medical Standard Vocabularies |
| More specific subject area | Consumer health posts, Machine Learning Systems, Text mining, Adverse drug events, SNOMED CT, UMLS |
| Type of data | Categorical, string, numeric variables, analyzed |
| How data was acquired | Using an Application Program Interface (API) |
| Data format | Comma Separated Values (CSV) |
| Experimental factors | Sample consists of 891 drug review posts collected randomly from the healthcare forum “askapatient.com” for four psychiatric medications: Zoloft, Lexapro, Cymbalta, and Effexor XR. |
| Experimental features | Factors measure pharmacological aspects of psychiatric medications. |
| Data source location | Data collected from an online healthcare forum called “askapatient.com”, United States |
| Data accessibility | Provided as online supplement |
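Because the data are distributed as CSV and every sentence and entity is linked back to its post through an ID such as Zoloft.1, a reader will usually want to regroup rows by post. The following is a minimal sketch of that step, not the authors' code. The column names (`post_id`, `sentence_index`, `sentence`, `ADR`, `WD`) and the sample rows are invented for illustration and may not match the released files.

```python
import csv
import io
from collections import defaultdict

# Hypothetical excerpt: the column names and rows below are invented for
# illustration and may not match the released PsyTAR files.
SAMPLE_CSV = """post_id,sentence_index,sentence,ADR,WD
Zoloft.1,1,Example sentence one.,1,0
Zoloft.1,2,Example sentence two.,0,0
Effexor.29,1,Example sentence three.,0,1
"""


def split_post_id(post_id):
    """Split an ID such as 'Zoloft.1' into ('Zoloft', 1)."""
    # rpartition splits on the last dot, so a drug name containing
    # spaces or dots would still keep its trailing number separate.
    drug, _, number = post_id.rpartition(".")
    if not drug or not number.isdigit():
        raise ValueError(f"unexpected post ID: {post_id!r}")
    return drug, int(number)


def group_sentences_by_post(csv_text):
    """Return {post_id: [row, ...]} with rows kept in file order."""
    posts = defaultdict(list)
    for row in csv.DictReader(io.StringIO(csv_text)):
        posts[row["post_id"]].append(row)
    return dict(posts)


posts = group_sentences_by_post(SAMPLE_CSV)
for post_id, rows in posts.items():
    drug, number = split_post_id(post_id)
    adr_sentences = sum(int(r["ADR"]) for r in rows)
    print(drug, number, len(rows), adr_sentences)
```

With a real file, `io.StringIO(csv_text)` would be replaced by an opened file handle, and the column names would need to be adjusted to the headers actually present in the supplement.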
The PsyTAR dataset can be used as a benchmark to train and evaluate the performance of lexicon-based systems and machine learning algorithms that identify adverse drug events (ADEs) and measure drug effectiveness from online healthcare forums, particularly for psychiatric medications. The PsyTAR dataset can be used to train machine learning systems (e.g., neural networks) for normalizing medical concepts in online healthcare communities by extracting the semantic links between layperson expressions of medical terms and standard medical vocabularies. The PsyTAR dataset can be used to evaluate the association between different types of ADEs and patient satisfaction (attitude) toward psychiatric medications. The PsyTAR dataset may also be used to facilitate the seamless exchange of information between patients' expressions of ADEs in personal health records (PHRs) and electronic health records (EHRs).
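To make the benchmark use concrete, the sketch below scores a naive lexicon-based ADR matcher against gold annotations using micro-averaged precision, recall, and F1. It is an illustration under stated assumptions, not a method from the paper: the lexicon, sentences, and gold labels are invented toy data, and plain substring matching stands in for whatever lexicon system is actually being evaluated.

```python
def lexicon_match(sentence, lexicon):
    """Return the set of lexicon terms that occur in the sentence."""
    text = sentence.lower()
    return {term for term in lexicon if term in text}


def precision_recall_f1(predicted, gold):
    """Micro-averaged scores over parallel lists of term sets."""
    tp = sum(len(p & g) for p, g in zip(predicted, gold))
    fp = sum(len(p - g) for p, g in zip(predicted, gold))
    fn = sum(len(g - p) for p, g in zip(predicted, gold))
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return precision, recall, f1


# Invented toy data, not taken from PsyTAR.
LEXICON = {"nausea", "insomnia", "dizziness"}
SENTENCES = [
    "I had nausea and insomnia for the first week.",
    "Felt dizzy every morning.",
    "It worked well for my anxiety.",
]
# Gold ADR concepts per sentence (already normalized to lexicon terms).
GOLD = [{"nausea", "insomnia"}, {"dizziness"}, set()]

predicted = [lexicon_match(s, LEXICON) for s in SENTENCES]
precision, recall, f1 = precision_recall_f1(predicted, GOLD)
print(f"precision={precision:.2f} recall={recall:.2f} f1={f1:.2f}")
```

In this toy run the matcher misses "dizzy" because only "dizziness" is in the lexicon, which lowers recall. That gap between layperson wording and standard terms is the kind of problem the concept-normalization use of the dataset is meant to address.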