Jyotishman Pathak, Richard C Kiefer, Christopher G Chute.
Abstract
The ability to conduct genome-wide association studies (GWAS) has enabled new exploration of how genetic variations contribute to health and disease etiology. One of the key requirements to perform GWAS is the identification of subject cohorts with accurate classification of disease phenotypes. In this work, we study how emerging Semantic Web technologies can be applied in conjunction with clinical data stored in electronic health records (EHRs) to accurately identify subjects with specific diseases for inclusion in cohort studies. In particular, we demonstrate the role of using Resource Description Framework (RDF) for representing EHR data and enabling federated querying and inferencing via standardized Web protocols for identifying subjects with Diabetes Mellitus. Our study highlights the potential of using Web-scale data federation approaches to execute complex queries.
Year: 2012 PMID: 22779040 PMCID: PMC3392057
Source DB: PubMed Journal: AMIA Jt Summits Transl Sci Proc
Figure 1. Snapshot of RDF graph representing the United States of America (DBPedia.org)
Figure 2. SPARQL query components
Figure 3. System architecture for representing patient records using RDF
List of Linked Open Drug Data (LODD) datasets
| Dataset | Domain | Description |
| --- | --- | --- |
| DrugBank | Drugs | Drug data (e.g., pharmacological) combined with drug target data (e.g., pathway) |
| LinkedCT | Clinical Trials | Clinical trials registry |
| DailyMed | Drugs | FDA approved drugs |
| DBPedia | Drugs, diseases, proteins | RDF data extracted from Wikipedia |
| Diseasome | Diseases, genes | Links diseases and genes by known associations |
| RDF-TCM | Drugs, genes, diseases | Traditional Chinese medicine gene and disease associations dataset |
| RxNorm | Drugs | NLM’s RxNorm vocabulary |
| SIDER | Drug side effects | Marketed drugs and their adverse effects |
| STITCH | Chemicals, proteins | Chemicals, proteins, and their interactions |
| ChEMBL | Chemicals, Assays, Literature | Trial drugs, literature, drug targets |
| WHO Global Health Observatory | Infectious diseases, environmental factors, socioeconomic conditions | Data and statistics for infectious diseases at country, regional and global levels |
| Medicare | Drug formulary | Medicare Part D approved drugs |
Sample federated queries for Diabetes Mellitus (DM) (*Bio2RDF is not part of LODD)
| Query | Datasets |
| --- | --- |
| Q1. Find all patients who experienced a side effect of Prandin after it was administered. | RxNorm, SIDER |
| Q2. Find all FDA approved drugs for DM, and identify the patients who were administered those drugs, grouped by drug class. | RxNorm, DailyMed |
| Q3. Which genes or biomarkers are associated with DM in the published literature, and what is their interaction pathway information? | Diseasome, Bio2RDF* |
| Q4. A variant of HNF4α is known to have a strong correlation with DM predisposition. Find all patients administered drugs that target HNF4α. | RxNorm, DrugBank, Diseasome, ChEMBL |
| Q5. Find all patients who are on Sulfonylureas, Metformin, Meglitinides, or Thiazolidinediones, or combinations of them. | RxNorm, DrugBank, DailyMed |
| Q6. Find all patients taking Alpha-glucosidase inhibitors. | RxNorm, DrugBank |
Figure 4. SPARQL federated query for drug side-effects of Prandin
Figure 5. SPARQL federated query for genes and pathways associated with diabetes mellitus
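The figures themselves are not reproduced in this record. As a rough illustration of the general shape a federated query such as Q1 might take, the Python sketch below assembles a SPARQL 1.1 query string that joins local patient data with a remote side-effect dataset through a SERVICE clause. It is not the authors' query from Figure 4: the endpoint URL, the `ex:` vocabulary (`ex:administered`, `ex:hasProblem`, `ex:sideEffect`, and so on), and the helper names are all hypothetical, chosen only for illustration, and joining on plain label strings is a simplification.

```python
# Hypothetical endpoint and vocabulary IRIs, for illustration only.
SIDER_ENDPOINT = "http://example.org/sider/sparql"
EX = "http://example.org/ehr#"


def escape_literal(text: str) -> str:
    """Escape backslashes and double quotes for a SPARQL string literal."""
    return text.replace("\\", "\\\\").replace('"', '\\"')


def build_side_effect_query(drug_name: str, service: str = SIDER_ENDPOINT) -> str:
    """Build a SPARQL 1.1 federated query string in the spirit of Q1."""
    drug = escape_literal(drug_name)
    lines = [
        f"PREFIX ex: <{EX}>",
        "PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>",
        "SELECT DISTINCT ?patient ?effectName WHERE {",
        "  # Local EHR graph: drug administrations and recorded problems.",
        "  ?patient ex:administered ?admin ; ex:hasProblem ?problem .",
        f'  ?admin ex:drugName "{drug}" ; ex:date ?adminDate .',
        "  ?problem ex:name ?effectName ; ex:date ?problemDate .",
        "  # Remote side-effect data, fetched through a SERVICE clause.",
        f"  SERVICE <{service}> {{",
        f'    ?remoteDrug rdfs:label "{drug}" ; ex:sideEffect ?effect .',
        "    ?effect rdfs:label ?effectName .",
        "  }",
        "  # Keep only problems recorded after the drug was administered.",
        "  FILTER (?problemDate > ?adminDate)",
        "}",
    ]
    return "\n".join(lines)


if __name__ == "__main__":
    print(build_side_effect_query("Prandin"))
```

The resulting string could be submitted to any SPARQL 1.1 endpoint that supports federation. A real deployment would more likely join on shared identifiers, such as RxNorm codes, than on label strings.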