Abstract
BACKGROUND: Integrating multiple sources of pharmacovigilance evidence has the potential to advance the science of safety signal detection and evaluation. More research is needed on how to integrate multiple disparate evidence sources while making the evidence computable from a knowledge representation perspective (i.e., semantic enrichment). Existing frameworks report promising outcomes for such integration but draw on a rather limited number of sources. In particular, none has been specifically designed to support both regulatory and clinical use cases, nor has any been designed to accommodate new resources and use cases through an open architecture. This paper discusses the architecture and functionality of a system called Large-scale Adverse Effects Related to Treatment Evidence Standardization (LAERTES) that aims to address these shortcomings.
Keywords: Clinical terminologies; Linked-data; Pharmacovigilance; Post-market drug safety
Year: 2017 PMID: 28270198 PMCID: PMC5341176 DOI: 10.1186/s13326-017-0115-3
Source DB: PubMed Journal: J Biomed Semantics
Fig. 1 The information sources of existing knowledge-based systems for pharmacovigilance. Citations to the sources mentioned can be found in the “Background” section. EHR: electronic health record, AE: adverse event, EU: European Union, FAERS: Food and Drug Administration Adverse Event Reporting System, CTD: Comparative Toxicogenomics Database
Decisions that are made during the process of integrating sources that can influence downstream pharmacovigilance analyses
| Data Type | Feature | Option for variability | Performance questions |
|---|---|---|---|
| Product labels | Product label outcome mention | Named entity performance (PPV and sensitivity) | Do improvements in entity recognition performance improve system recall and precision? |
| | | Section location (e.g., anywhere vs specific sections) | Does identifying which sections are more informative than others reduce noise? |
| | Frequency information | Threshold variation | Does incorporation of ADE frequency improve performance? What cut-off should be used? |
| Pharmacovigilance DBs (e.g., FAERS, MedEffect, VigiBase) | Minimum detectable relative risk | Threshold variation | What is the appropriate cut-off for MDRR? Is it HOI specific? |
| | | Database(s) chosen | Does the database influence the value of MDRR for this task? |
| | Risk identification method | Disproportionality metric | What metric (e.g., PRR, EBGM, IC) leads to the best performance? Is it HOI specific? |
| | Number of cases in FAERS | Threshold variation | What is the appropriate cut-off for number of case reports? |
| Drug Indication DB | Indication listings in FDB | Yes/no and when mentioned | Does using on-label and off-label indication knowledge improve performance? |
| Indexed literature | Number of relevant publications from the indexed literature | Threshold variation | Is there an appropriate cut-off for number of publications? What is its variability relative to specific HOIs and drugs? |
| | Source of relevant publications from the indexed literature | Varying the combination of sources | Should we be selective about the sources used or choose all of them? |
| | Drug and outcome mention in relevant indexed literature | Named entity performance | Do improvements in entity recognition performance improve system recall and precision? |
| | | Main MeSH terms vs supplemental | What is the value of MeSH supplemental terms relative to the primary index terms? |
| | | Scientific discourse tag of the location of mention (e.g., intro, methods, results, conclusions) | Does limiting identification of drug-HOI co-mention to specifically tagged text excerpts improve performance? |
| | | Publication type label (randomized trial, case report, etc.) | Should the publication type of the drug-HOI co-mention be tracked and possibly weighted to improve performance? |
| | | Source of publication type label (Embase, MeSH) | Is one publication type indexing system better than the other for the question answering task, or should they be combined? |
| | | Topic of the source publication based on latent semantic indexing | Does the use of tags assigned to text sources by latent semantic indexing improve system performance if used as a feature? |
| Observational health data (claims + EHR) | Minimum detectable relative risk | Threshold variation | What is the appropriate cut-off for MDRR? Is it HOI specific? |
| | | Database(s) chosen | Does the database influence the value of MDRR for this task? |
| | Risk identification method | Analytic method | What method (e.g., disproportionality analysis, self-controlled case series, IC temporal pattern discovery, high-dimensional propensity score) leads to the best performance? Is it HOI specific? |
| | Cohort selection | Patient ethnicity, age, sex, co-morbidities, concurrent medications | Does cohort selection using these features affect model performance? What is the appropriate size and diversity of the cohort to reduce noise and bias? |
| | Drug exposure conditions | Length of exposure, dosage | Does selecting minimum exposure duration criteria and/or drug dosage information improve performance? |
| | Study replicability | Number of locations for confirming results | How many replicates of the study should be performed at different institutions? |
| | Observation period | Observation duration threshold | Does setting minimum observation period durations improve performance? |
PPV: positive predictive value, OMOP: Observational Medical Outcomes Partnership, ADE: adverse drug event, MDRR: minimal detectable reporting ratio, HOI: health outcome of interest, DB: database, FAERS: Food and Drug Administration Adverse Event Reporting System, EBGM: empirical Bayes geometric mean, IC: information component, FDB: First Data Bank (commercial drug knowledge base), EHR: electronic health record
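The disproportionality metric and case-count rows of the table can be made concrete with a small sketch. The Python below computes the proportional reporting ratio (PRR) from a 2x2 table of spontaneous reports and then applies two cut-offs. The function names and the default thresholds (`min_cases=3`, `min_prr=2.0`) are illustrative assumptions, not values taken from LAERTES; which cut-offs are appropriate is precisely the open question the table poses.

```python
def proportional_reporting_ratio(a: int, b: int, c: int, d: int) -> float:
    """PRR from a 2x2 contingency table of spontaneous reports.

    a: reports mentioning the drug and the HOI
    b: reports mentioning the drug but not the HOI
    c: reports mentioning the HOI but not the drug
    d: reports mentioning neither
    """
    if a + b == 0 or c + d == 0 or c == 0:
        raise ValueError("PRR is undefined for this table")
    # Share of the drug's reports that mention the HOI, divided by the
    # same share among reports for all other drugs.
    return (a / (a + b)) / (c / (c + d))


def is_candidate_signal(a: int, b: int, c: int, d: int,
                        min_cases: int = 3, min_prr: float = 2.0) -> bool:
    """Apply hypothetical cut-offs on case count and PRR (assumed defaults)."""
    if a < min_cases:
        return False
    return proportional_reporting_ratio(a, b, c, d) >= min_prr
```

Varying `min_cases` and `min_prr` over a reference set of known drug-HOI pairs would be one way to explore the "threshold variation" questions listed above.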
Fig. 2 The overall architecture of LAERTES within the OHDSI clinical research software environment. REST: representational state transfer, OHDSI: Observational Health Data Sciences and Informatics, API: application programming interface, DBMS: database management system, CDM: common data model, OA: Open Annotation Data, RDF: Resource Description Framework
Fig. 3 An entity relationship diagram showing how data from US product labeling is represented as a semantically enriched Open Annotation Data graph
Fig. 4 The data architecture of LAERTES. The system leverages the OMOP Vocabularies to describe drugs and health outcomes of interest via standardized vocabulary concepts (concept table). LAERTES stores aggregated evidence in a summary table (drug_hoi_evidence) that provides a “linkout” (evidence_linkout) to an Open Annotation Data representation of the source data. In the relational database, the linkout functions as a foreign key to the adr_annotation table through a table (not shown) that maps the linkout to annotation identifiers (adr_annotation_uid). Client programs can also use the linkout as a URL to retrieve JSON data from an RDF store that has a linked data version of the source open annotation data
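The relational path described in this caption (summary row, then linkout, then annotation) can be sketched with an in-memory SQLite database. The table names `concept`, `drug_hoi_evidence` and `adr_annotation`, and the fields `evidence_linkout` and `adr_annotation_uid`, come from the caption. Everything else is an assumption made for illustration: the remaining column names, the mapping table name `linkout_map` (the caption only says this table is "not shown"), and all of the data values.

```python
import sqlite3

# Hypothetical schema: only the names cited in the caption are from the source.
SCHEMA = """
CREATE TABLE concept (concept_id INTEGER PRIMARY KEY, concept_name TEXT);
CREATE TABLE drug_hoi_evidence (
    drug_concept_id  INTEGER REFERENCES concept (concept_id),
    hoi_concept_id   INTEGER REFERENCES concept (concept_id),
    evidence_type    TEXT,
    evidence_count   INTEGER,
    evidence_linkout TEXT
);
CREATE TABLE linkout_map (evidence_linkout TEXT, adr_annotation_uid INTEGER);
CREATE TABLE adr_annotation (adr_annotation_uid INTEGER PRIMARY KEY, excerpt TEXT);
"""


def build_demo_db() -> sqlite3.Connection:
    """Create an in-memory database holding one made-up evidence row."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    conn.executemany("INSERT INTO concept VALUES (?, ?)",
                     [(1, "drug A"), (2, "outcome X")])
    conn.execute("INSERT INTO drug_hoi_evidence VALUES (?, ?, ?, ?, ?)",
                 (1, 2, "product_label", 2, "linkout-0001"))
    conn.executemany("INSERT INTO linkout_map VALUES (?, ?)",
                     [("linkout-0001", 10), ("linkout-0001", 11)])
    conn.executemany("INSERT INTO adr_annotation VALUES (?, ?)",
                     [(10, "label excerpt 1"), (11, "label excerpt 2")])
    return conn


def annotations_for(conn: sqlite3.Connection, drug_name: str, hoi_name: str):
    """Follow summary row -> linkout -> annotations for one drug-HOI pair."""
    query = """
        SELECT a.adr_annotation_uid, a.excerpt
        FROM drug_hoi_evidence AS e
        JOIN concept AS d ON d.concept_id = e.drug_concept_id
        JOIN concept AS h ON h.concept_id = e.hoi_concept_id
        JOIN linkout_map AS m ON m.evidence_linkout = e.evidence_linkout
        JOIN adr_annotation AS a ON a.adr_annotation_uid = m.adr_annotation_uid
        WHERE d.concept_name = ? AND h.concept_name = ?
        ORDER BY a.adr_annotation_uid
    """
    return conn.execute(query, (drug_name, hoi_name)).fetchall()
```

Calling `annotations_for(build_demo_db(), "drug A", "outcome X")` returns the annotation rows behind the single summary row, which is the drill-down from aggregated evidence to source data that the caption describes.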
Fig. 5 The drug “roll-up” table and example reports by order identifier
Distinct drug-Health Outcome of Interest pairs by source
| Source description | Drug and HOI mapping method | Distinct drug-HOI pairs in source | Distinct drug-HOI pairs in LAERTES (%) |
|---|---|---|---|
| Adverse drug reactions mined from US drug product labels using a validated natural language processing tool called SPLICER [ | Drugs were coded using RxNorm and HOIs using MedDRA. The OMOP Standard Vocabulary was used to map MedDRA to SNOMED-CT. | 272 436ᵃ | 254 738 (93%) |
| Adverse drug events extracted from EU Summary of Product Characteristics by the PROTECT project | Drugs were mentioned by name and HOIs using MedDRA codes. Drug names were mapped to RxNorm using a combination of simple string matching and Bioportal ontology searches. Many combination products and some individual drugs were not mappable. All mappings were manually reviewed for accuracy. | 26 989 | 24 537 (91%) |
| FDA Adverse Event Reporting System counts and Proportional Reporting Ratio from [ | The OHDSI Usagi tool [ | 3 766 382 | 2 753 078 (73%) |
| Abstracts from titles and abstracts indexed in MEDLINE that describe drug-HOI evidence according to MeSH indexing [ | Drug and HOI concepts were both coded using MeSH. The OMOP Standard Vocabulary was used to map from MeSH drug concepts to RxNorm and MeSH HOI concepts to SNOMED-CT. | 79 119ᵇ | 77 395 (97.8%) |
| Sentence spans from titles and abstracts indexed in MEDLINE that describe drug-HOI evidence according to queries against the Semantic Medline database | Drug and HOI concepts were both coded using UMLS concept identifiers. The UMLS Metathesaurus MRCONSO table was used to map concepts to RxNorm, MeSH, MedDRA, and SNOMED-CT. The OMOP Standard Vocabulary was then used to map drug concepts only available as MeSH to RxNorm and HOI concepts only available as MedDRA or MeSH concepts to SNOMED-CT. | 5 023ᵇ | 2 813 (56%) |
| Chemical disease associations from the Comparative Toxicogenomics Database | Drug and HOI concepts were both coded using MeSH. The OMOP Standard Vocabulary was used to map from MeSH drug concepts to RxNorm and MeSH HOI concepts to SNOMED-CT. | 503 835 | 432 850 (86%) |
ᵃ SPLICER drug-HOI pairs are at the clinical drug level. All other sources are at the ingredient level. ᵇ Does not include drug-HOI evidence where the source refers to the drug by its MeSH pharmacologic group name. EU: European Union, FDA: Food and Drug Administration, HOI: health outcome of interest, OMOP: Observational Medical Outcomes Partnership, US: United States, MedDRA: Medical Dictionary for Regulatory Activities, MeSH: Medical Subject Headings
Overlap of distinct drug-HOI pairs at the drug ingredient level after mapping drugs to RxNorm and HOIs to SNOMED-CT
| Literature (MEDLINE and CTD) vs spontaneous reporting ( | Product labeling (US and EU) vs spontaneous reporting ( | Literature (MEDLINE and CTD) vs product labeling (US and EU) ( | All three ( |
|---|---|---|---|
| 119 293 (3.9%) | 87 279 (3.2%) | 14 838 (2.6%) | 14 295 (0.5%) |
Each count is the intersection of the distinct drug-HOI pairs from the sources compared, and each percentage expresses that intersection relative to the size of their union (shown in the heading). CTD: Comparative Toxicogenomics Database, EU: European Union, US: United States
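As a rough illustration of how such an overlap figure could be derived, the sketch below treats each source as a set of (drug ingredient, HOI) concept pairs and reports the intersection both as a count and as a percentage of the union. This is one reading of the note above, not the authors' code; the function name `pair_overlap` and the toy identifiers in the usage note are hypothetical.

```python
from typing import Set, Tuple

Pair = Tuple[int, int]  # (drug ingredient concept id, HOI concept id)


def pair_overlap(source_a: Set[Pair], source_b: Set[Pair]) -> Tuple[int, float]:
    """Return (shared pair count, shared pairs as a percentage of the union)."""
    union = source_a | source_b
    if not union:
        return 0, 0.0
    shared = source_a & source_b
    return len(shared), 100.0 * len(shared) / len(union)
```

For example, with toy sets `{(1, 100), (2, 100), (3, 200)}` and `{(2, 100), (3, 200), (4, 300), (5, 300)}`, two pairs are shared out of a union of five, so the function returns `(2, 40.0)`. The overlap across all three evidence types could be obtained the same way by intersecting and uniting three sets.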
Fig. 6 Experimental user interface to the LAERTES evidence base