Anup P Challa1,2,3, Xinnan Niu4, Etoi A Garrison5, Sara L Van Driest6,7, Lisa M Bastarache4, Ethan S Lippmann2, Robert R Lavieri1, Jeffery A Goldstein8, David M Aronoff5,7,9,10. 1. Vanderbilt Institute for Clinical and Translational Research, Vanderbilt University Medical Center, Nashville, TN 37203 USA. 2. Department of Chemical and Biomolecular Engineering, Vanderbilt University, Nashville, TN 37212 USA. 3. Department of Biomedical Informatics, Harvard Medical School, Boston, MA 02115 USA. 4. Department of Biomedical Informatics, Vanderbilt University, Nashville, TN 37203 USA. 5. Department of Obstetrics and Gynecology, Vanderbilt University Medical Center, Nashville, TN 37203 USA. 6. Department of Pediatrics, Vanderbilt University Medical Center, Nashville, TN 37232 USA. 7. Department of Medicine, Vanderbilt University Medical Center, Nashville, TN 37203 USA. 8. Department of Pathology, Northwestern University, Chicago, IL 60611 USA. 9. Department of Pathology, Microbiology and Immunology, Vanderbilt University Medical Center, Nashville, TN 37203 USA. 10. Present Address: Department of Medicine, Indiana University School of Medicine, Indianapolis, IN 46202 USA.
Abstract
Background: Systematic exclusion of pregnant people from interventional clinical trials has created a public health emergency for millions of patients through a dearth of robust safety data for common drugs.
Methods: We harnessed an enterprise collection of 2.8 M electronic health records (EHRs) from routine care, leveraging data linkages between mothers and their babies to detect drug safety signals in this population at full scale. Our mixed-methods signal detection approach stimulates new hypotheses for post-marketing surveillance agnostically of both drugs and diseases. It does so by identifying 1,054 drugs historically prescribed to pregnant patients; developing a quantitative, medication history-wide association study; and integrating a qualitative evidence synthesis platform using expert clinician review for integration of biomedical specificity, in order to test the effects of maternal exposure to diverse drugs on the incidence of neurodevelopmental defects in their children.
Results: We replicated known teratogenic risks and existing knowledge on drug structure-related teratogenicity; we also highlight 5 common drug classes for which we believe this work warrants updated assessment of their safety.
Conclusion: Here, we present roots of an agile framework to guide enhanced medication regulations, as well as the ontological and analytical limitations that currently restrict the integration of real-world data into drug safety management during pregnancy. This research is not a replacement for inclusion of pregnant people in prospective clinical studies, but it presents a tractable team science approach to evaluating the utility of EHRs for new regulatory review programs, towards improving the delicate equipoise of accuracy and ethics in assessing drug safety in pregnancy.
At the point of care, pregnant patients are a complex population: physicians must exercise caution in prescribing many common drugs to these patients, given the risks of toxicity for their developing fetuses[1]. However, consideration of fetal toxicity in drug development is largely irregular. While teratogenicity scores established by regulatory agencies like United States Food and Drug Administration (FDA) are discrete, these criteria provide little concrete distinction among score classes, making it difficult for drug developers to accurately gauge the fetal toxicity risks of a molecule[2]. FDA’s updated teratology assessment guidelines in the 2014 Pregnancy and Lactation Labeling Rule aimed to increase the contextual relevance of developmental toxicity evaluation, but this guidance has been slow to translate to evaluative change at the point of care, which remains largely aligned with the previous five-pronged letter scale[3,4]. The result is a vicious cycle that promotes the approval of drugs without adequate data on their safety and efficacy in pregnant populations, as expectant patients are routinely excluded from clinical trials, out of concern for fetal harm upon exposure to drugs with uncertain, pre-clinical teratogenicity data. In fact, of 213 new drugs approved by FDA between 2003 and 2012, only 5% contained human data in the pregnancy section of their labels[5]. These factors have created a substantial gap in knowledge on pharmacotherapy for diseases during pregnancy, restricting the number of treatments available to this population through insufficient data on the pharmacodynamics and pharmacokinetics (PK) of many maternal medication exposures. 
At the bedside, the result is undertreatment of chronic and acute illnesses in pregnant people, driven by obstetricians’ fears of causing harm to their patients, alongside the increased risk of harm to fetuses from necessary prescriptions[6].
While the ethics of excluding pregnant people from randomized, controlled drug trials (RCTs) remain in debate[7-9], the ongoing unavailability of relevant drug safety and efficacy information underscores an urgent need for new methods to rapidly assess this information, to improve the quality of care for these underserved patients, and to ensure health equity for this complex population through contemporaneous drug labeling and marketing efforts. Such an opportunity for the discovery of drug safety insights for pregnant patients may be available through strategic analysis of large numbers of existing healthcare documents like electronic health records (EHRs) that were collected during routine patient care. Collectively, EHRs can uniquely replicate the natural history of pregnancy by linking medical information of pregnant patients and their neonates, such as mothers’ prescriptions (while expectant) and the perinatal diseases of their children[10-12]. This information allows for the creation of a unique framework of relational knowledge generation. Namely, EHR data may be stratified into distinct cohorts by patients’ documented exposure (or lack thereof) to a drug of interest, facilitating the development of an inferential model to relate incidences of maternal drug exposure and neonatal disease[11].
While these experiments are not a replacement for prospective safety data generation through the inclusion of pregnant people in clinical trials, the above platform of safety signal detection presents an ethical way of studying the effects of drug exposure in pregnant people with human data, on a significant scale and across all drug classes.
Existing literature that describes the safety of most drugs potentially prescriptible in pregnancy remains overwhelmed by conflicting studies, the majority of which only present results from pre-clinical animal models of drug testing and the minority of which are empirical case reports or case series among relatively few patients[13]. Deciding to prescribe a drug to a pregnant patient involves balanced evaluation of the patient’s need for treatment (drug efficacy) and the risk of injury to the patient’s fetus (drug safety). However, providers cannot make these informed decisions without robust and definitive safety data.
Previous work that has attempted to clarify knowledge on drug safety in pregnant patients has relied on observational and retrospective analyses of databases like public insurance claims, measuring the significance of the coincidence of a neonatal disease of interest and prescription of a drug of interest to the neonates’ mothers[10,14]. While these studies have added new (and often valuable) narratives of drug safety to the literature, our research is innovative because it uses EHR data, attempts relational inference, and probes such drug-disease relationships at scale. Collectively, these factors allow us to advance the ontological reliability and epistemological robustness of data-driven studies of adverse pregnancy outcomes[11].
Our research makes use of a database of 2.8 M EHRs at Vanderbilt University Medical Center (VUMC) to curate our experimental cohorts.
The data innovation in studying EHR data over evaluating public insurance claims is that this choice mitigates significant demographic biases (e.g., poverty) that are present within public payor records. Overcoming the effects of such potentially confounding variables requires the integration of advanced methods of propensity scoring (PS) to properly evaluate the coincidence of maternal drug exposure and pediatric disease, which defies the key algorithmic design principle of parsimony and results in poor model performance[15]. In contrast, VUMC is an urban medical center that features a demographically diverse patient population, as previous studies using these EHR data affirm[16]. Indeed, self-reporting patient registries, another popular choice for observational data to study health outcomes in pregnancy, are also inherently limited in their integrity, as patients are often unreliable historians of their own care[17]. In contrast, our study promotes data integrity by studying provider-maintained healthcare information.
Technical innovation in this project also rests within the rigor of the analytical methods we employ[11]. We apply a mode of systematic, relational inference to maternal drug exposure and perinatal disease that we believe is more directly and appropriately aligned with the etiology of drug-associated birth defects, compared to the highly coincidental frameworks that dominate the literature. We achieved inference suggestive of causality through harmonizing the validated phenome-wide association study (PheWAS), which was originally developed at VUMC to discover genetic links to clinical phenotypes, with a rigorous, standardized consensus prioritization approach that considered clinical practice and RCT data to move from data-based associations towards etiology discovery[18].
By developing a medication history-wide association study (MedWAS) to suggest pharmacological determinants of neonatal diseases, we optimized on algorithms that underlie PheWAS to explore nascent patterns across the drug-disease hypotheses that our model revealed. In this way, we used MedWAS as a method of safety signal detection and management, approaching the design of a target trial[11]. Target trials are an epidemiological method of retrospective data analysis that make use of existing clinical information and high-powered statistical algorithms to create artificial subject profiles from all relevant and available patient data within a cohort of interest. This curation then allows for relational analysis of subjects’ drug histories against a morbidity of interest, facilitating potential simulation of a clinical trial when prospective experiments are not feasible[19-21]. The approach in this manuscript alludes to a target trial by following similar approaches to data curation and stratification, statistical inference, and outcomes prioritization. However, unlike the archetypal target trial developed by Hernán and Robins for claims data and consortial data banks[19], our distributed workflow relies on a single health system’s mother-baby EHRs, meaning that some aspects of our procedure rely on manual evidence synthesis, rather than harnessing end-to-end automation. Furthermore, our approach operates on known patterns of prescriptive behavior in pregnancy to determine treatment-exposed and non-exposed (i.e., “control”) cohorts in our data. This provides only a very limited basis to claim RCT-like randomization, a natural result of how poorly structured EHR data recapitulate the many reasons why clinicians decide on specific treatments for their pregnant patients. Like the target trial, our research does not seek to replace the RCT.
Nonetheless, to our knowledge, there have been very few (and relatively small) attempts at EHR-derived safety signal detection evaluating pregnant patients[22], allowing us to innovate in exploring the power of this approach at scale[11,23].
Using MedWAS, we present systematic safety signal detection across all drugs prescribed to pregnant people and all diseases within neonatal EHRs at VUMC: herein lies the conceptual innovation of our approach. Historically, researchers studying the safety of pharmacotherapy in pregnancy with statistical methods have communicated through a “one drug, one disease, one publication” model. While this practice provides bandwidth for deep interrogation of a single drug-disease hypothesis, it further diversifies the pool of existing data that remains conflicting and inconsistent, since the methods in such papers can become overfitted for studying the safety of other drugs that are prescriptible in pregnancy. In contrast, our approach is sufficiently reproducible to analyze maternal prescriptions and neonatal diseases across a large healthcare enterprise. We are unaware of such a drug-agnostic and phenotype-agnostic model in the available literature on drug safety in pregnancy.
We have a record of work in using statistical methods like PheWAS to generate strong hypotheses of efficacy for new drug development[12,24,25]. Here, we apply that expertise to construct MedWAS as an innovatively scalable approach for the surveillance of drug safety in pregnancy. We also present potential avenues for complementarity between MedWAS and our previous attempts to develop a machine learning (ML) approach capable of identifying chemical structures that predispose drugs towards increased teratogenic risk when prescribed during pregnancy[26].
In this study, we identify 1,054 drugs historically prescribed to pregnant patients and develop a quantitative, medication history-wide association study.
We integrate a qualitative evidence synthesis platform using expert clinician review for inclusion of biomedical specificity—to test the effects of maternal exposure to diverse drugs on the incidence of neurodevelopmental defects in their children. Not only do the results replicate known teratogenic risks and existing knowledge on drug structure-related teratogenicity; they also highlight 5 common drug classes for which we believe this work warrants updated assessment of their safety. This research is not a replacement for the inclusion of pregnant people in prospective clinical studies, but presents a tractable team science approach to evaluating the utility of EHRs for new regulatory review programs—towards improving the delicate equipoise of accuracy and ethics in assessing drug safety in pregnancy.
Methods
The approach that we describe below is an explanatory summary of the data preprocessing (for cohort selection) and informatics procedures (for drug-disease testing) that we provide in cookbook format in the “Supplementary Information” accompanying this manuscript, supporting Supplementary Tables 1–5 in the component “Supplementary Methods” section. A diversity and inclusion report for the maternal and neonatal EHRs we analyzed is also included as Supplementary Table 6 in the “Supplementary Discussion” section.
We tested the hypothesis that MedWAS can effectively establish relational inference between mothers’ exposures to drugs with uncertain safety and perinatal diseases in their neonates. In establishing the feasibility of our tool to accomplish post-market drug surveillance, we restricted ourselves to the analysis of only neurological morbidities as a base case, given that the ontologies that codify these diseases have strong bases of relational logic[27]. We expect the general framework of the analytical and signal evaluation procedures we present here will be analogously applicable to the interrogation of neonatal diseases in other organ systems.
Ethical review
The Institutional Review Board (IRB) of Vanderbilt University approved the research (IRB #191553) and determined that it was exempt from ethical approval and informed consent because it did not involve human subjects, given its retrospective, observational nature and use of data collected during routine patient care.
Cohort selection
To mimic the enrollment of pregnant patients in a drug safety experiment, we used ML to curate and block appropriate treatment and control (drug-exposed vs. not drug-exposed) cohorts across all 1,054 agents that are documented as prescriptions to pregnant patients in eStar, VUMC’s EHR system. A listing of these agents is available as described in Supplementary Methods, supporting Supplementary Data 1. To select our cohorts, we probed VUMC’s Research Derivative (RD), a database of fully identified clinical and administrative information from 2.8 M patients that contains data like International Classification of Diseases-9/10 (ICD-9/10) billing codes (which codify nearly all existing human morbidities), patient demographics, lab results, medications, and clinical narratives from five different relational health information systems that source directly from patient care[28,29]. To effectively create experimental cohorts across the agents we probed from these data, we first established the following phenotyping rule as inclusion criteria for patient “enrollment” in treatment and control groups:
Population: RD; Include: Mom/baby link (1 or more), where specified medication (1 or more where date during mother EHR pregnancy=yes) and clinic note in baby EHR suggests record of care (1 or more postpartum).
Herein, our criteria for allocating pregnant patients to a drug treatment group required baseline, confirmed pregnancy among all candidate mothers, with a record of at least one prescription of the specified drug in the mother’s EHR during their entire gestational period and live-born delivery of a neonate who received their own EHR at VUMC (so their health outcomes were available for our analysis). Defining pregnancy and gestational period in a systematic way from the EHR remains a non-standardized analytical practice and therefore required us to develop an inferential approach reliant on a data dictionary of relevant ICD-10 codes for gestational period.
For interested readers, we describe this approach in Supplementary Methods, across Supplementary Tables 1–5. We designed our inclusion criteria to maximize the data available to our model, so we could achieve the highest power for demonstrating preliminary proof of concept for our approach. Herein, we harnessed downstream evidence synthesis to vet our outcome associations, rather than establishing very tight inclusion (and exclusion) criteria a priori to mitigate confounders.
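The phenotyping rule above can be read as a simple filter over linked mother-baby records. The following is a minimal sketch of that reading, not the authors' implementation: the `MotherBabyPair` class, its field names, and the `allocate` helper are all hypothetical, and the gestational window is assumed to be already inferred (the paper derives it from ICD-10 codes as described in its Supplementary Methods).

```python
from dataclasses import dataclass, field
from datetime import date


@dataclass
class MotherBabyPair:
    """One linked mother-baby record (all field names are hypothetical)."""
    gestation_start: date            # inferred start of the gestational period
    delivery: date                   # date of live-born delivery
    maternal_rx: list = field(default_factory=list)      # (drug, date) tuples
    baby_note_dates: list = field(default_factory=list)  # dates of baby clinic notes


def exposed_in_pregnancy(pair, drug):
    """True if the mother has >= 1 prescription of `drug` during gestation."""
    return any(
        name == drug and pair.gestation_start <= when <= pair.delivery
        for name, when in pair.maternal_rx
    )


def has_postpartum_record(pair):
    """True if the baby's EHR holds >= 1 clinic note on or after delivery."""
    return any(when >= pair.delivery for when in pair.baby_note_dates)


def allocate(pairs, drug):
    """Split eligible pairs into (treatment, control) cohorts for one drug."""
    treatment, control = [], []
    for pair in pairs:
        if not has_postpartum_record(pair):
            continue  # no neonatal outcomes available, so not "enrolled"
        (treatment if exposed_in_pregnancy(pair, drug) else control).append(pair)
    return treatment, control
```

Calling `allocate(pairs, "phenytoin")` on a list of such records returns the exposed and unexposed cohorts for that one drug; repeating the call per drug mirrors the drug-agnostic design described in the text.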
Data curation
Next, we leveraged a suite of natural language processing (NLP) tools to extract phenotypic attributes and maternal drug exposures from narrative EHR data among all patients within the 94,872 EHRs (48,434 mother-baby EHR pairs) who met our inclusion criteria for at least one study drug. These tools included a general-purpose NLP tool (the 2015-indexed version of KnowledgeMap concept identifier (KMCI)[30,31], available through https://www.vumc.org/cpm/cpm-blog/kmci-knowledgemap-concept-indexer), ML-based clinical-note section tagger (the 2010-indexed version of SecTag[32,33], available to download at https://www.vumc.org/cpm/cpm-blog/sectag-tagging-clinical-note-section-headers), and version 1.3 of MedEx, an NLP algorithm for identifying medication exposures within free clinical text[32,34] (available to download at https://sbmi.uth.edu/ccb/resources/medex.htm). KMCI identifies Unified Medical Language System concepts[35] using a shallow parser, word sense disambiguation, and semantic regularization, and includes a module to identify negation[30]. MedEx uses context-free grammar and a rule-based approach to extract detailed medication information (including dose, frequency, and route) from free text. MedEx encodes an ingredient barcode for all drugs, such that drug mentions extracted from EHRs are continuously linked to existing drug ontologies from which additional pharmacological data may be mined (e.g., RxNorm concept unique identifier[36])[32,34]. These standardized systems have been used to process more than 60 million documents at Vanderbilt and elsewhere. Here, we used them to capture all drug mentions and available ICD-9/10 codes and to facilitate requisite matching of free-text disease terms to concept unique identifiers for candidate mothers and their linked neonates, as well as to extract all available demographic information for “enrolled” mothers and babies. 
Enacted across all combinations of diseases and maternal drug histories in our population, our workflow enabled the curation and stratification of patient data to empower >1.7 M combinatorial drug-disease association experiments, as we describe below.
Implementation of MedWAS
PheWAS is a common, systematic ML approach to unearth associations between disease and genetic variants and to discover pleiotropy using EHR data linked to DNA. It is a method that scans phenomic data for genetic associations using Phecodes mapped to ICD-9/10 codes from the EHR. Multiple publications demonstrate that PheWAS is a feasible method to rapidly generate hypotheses on the underpinnings of disease[18,37-40]. We repurposed the PheWAS framework to develop an innovative MedWAS, in identifying the extent to which the perinatal phenotypes in our cohorts are plausibly related to exposure to the drugs in each simulated safety experiment’s treatment group. Herein, our proof-of-concept MedWAS model took an input of babies’ neurological diseases from all mother-baby cohorts we constructed and outputted the maternal medication exposures putatively related to babies’ phenotypes. While it is easiest to envision our platform through the canonical stratification of mother-baby cohorts by maternal drug exposure, our adoption of neonatal disease-contingent inference across treatment-defined maternal cohorts allowed us to develop capacity for discovery of multiple drug exposures as etiologies for our phenotypes of interest.
MedWAS operated in direct analogy to PheWAS by using its component logistic classification methods (logit) to identify neonatal disease as a function of maternal exposure to a drug of interest and by reporting a p-value for each of these drug-disease tests that reflected the strength of logit alignment after correction for multiple testing of a drug across all neonatal diseases in our cohorts. In doing this across 1,054 native maternal drug exposures and the neurological subset of 1,678 EHR-embedded phenotypes (first, on a pilot scale, with 5.7 K EHR pairs, and subsequently on our full data set of 49 K mother-baby dyads), each experiment was controlled by cases of neonatal disease linked to pregnant patients without a record of exposure to the test drug.
Herein, we also computed an odds ratio (OR) as a proxy for the effect size of hypothetical drug-disease enrichment across each of our tested case and control populations. Because there are known associations among the representations of input and output data and PheWAS model performance[38-40], we iteratively assessed MedWAS performance with several standard representations of the drug and disease data (i.e., different levels of Anatomical Therapeutic Chemical (ATC) codes for drug entities[41] and Phecodes and ICD-9/10 codes for diseases[42]) from our cohorts to prevent confounding of our results by data type. The list of 1,678 Phecodes we employed is publicly accessible through the open-source code for version 0.12.3 of the PheWAS package (see https://github.com/PheWAS/PheWAS).
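To make the per-pair test concrete, here is a minimal sketch of one unadjusted drug-disease test. It assumes a logistic model with maternal exposure as the only (binary) predictor; in that special case the fitted coefficient equals the log odds ratio of the 2x2 exposure-by-disease table and its Wald standard error has a closed form. The function name and the four-count interface are hypothetical, and the paper's actual model (built on the PheWAS package) may include covariates that this sketch omits.

```python
import math


def drug_disease_test(a, b, c, d):
    """Unadjusted test for one drug-disease pair from a 2x2 table.

    a: exposed, diseased      b: exposed, not diseased
    c: unexposed, diseased    d: unexposed, not diseased
    Returns (odds_ratio, two_sided_wald_p).
    """
    if min(a, b, c, d) <= 0:
        raise ValueError("all four cells must be positive in this sketch")
    log_or = math.log((a * d) / (b * c))           # logit coefficient for exposure
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)  # its Wald standard error
    z = log_or / se
    p = math.erfc(abs(z) / math.sqrt(2))           # two-sided normal tail area
    return math.exp(log_or), p
```

The returned p-value is uncorrected; any correction for testing one drug across many phenotypes would be applied afterwards. Sparse tables with empty cells are rejected here rather than smoothed, which is a simplification for illustration.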
Hypothesis Prioritization
While the explicit goal of our work was to establish a platform for generating hypotheses of drug safety that may be pursued in more targeted studies in the future, we affirm that a non-deterministic challenge in pursuing our experiments was accurate prioritization of MedWAS’s predicted drug-disease relationships by their clinical, biological, and statistical plausibility, given the number of association tests we executed rapidly within our analytical framework. We attempted to meet this challenge by ranking our results with the following heuristics: concordance with known fetal safety risks from published drug labels, a soft constraint of Bonferroni significance (with correction from baseline p ≤ 0.05) and OR > 1, compelling clinical reviews from obstetrician and pediatrician consults on the plausibility of substantially implicated drug prescriptions and teratogenic outcomes, reproducibility between MedWAS outputs and the results from our previous work that identified drug structures linked to adverse birth outcomes[26], and evidence against “confounding by indication” from harmonizing systematic chart review of mothers’ baseline disease states with knowledge of known vertical disease transmission risks within our treatment cohorts. Our application of the p-value as a soft prioritization constraint that complemented systematic review from our clinical stakeholders aligns with guidance to this effect from the American Statistical Association[43].
To parse MedWAS results we believed were not clinically plausible or were potentially confounded, we began by excluding all signals associated with nutraceutical products, as we recognized that patient history-informed capture of food and nutritional supplement use data in the EHR is highly unreliable. These agents are available over-the-counter (OTC) and often incompletely reported by patients, such that mention of the agent does not always imply true exposure during gestation[44].
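The statistical part of this ranking can be sketched as follows. This is an illustrative reading, not the authors' code: it assumes the Bonferroni family is the 1,678 phenotypes tested per drug (the text describes correction "of a drug across all neonatal diseases"), treats significance and OR > 1 as soft constraints that only affect ordering, and drops nutraceutical signals outright. The dictionary keys and the `prioritize` helper are hypothetical, and the clinical-review heuristics are deliberately left out.

```python
ALPHA = 0.05          # baseline significance level stated in the text
N_PHENOTYPES = 1678   # phenotypes tested per drug (assumed Bonferroni family size)


def bonferroni_threshold(alpha=ALPHA, n_tests=N_PHENOTYPES):
    """Per-test significance threshold under Bonferroni correction."""
    return alpha / n_tests


def prioritize(signals, nutraceuticals):
    """Drop nutraceutical signals, then rank the rest.

    Signals meeting both soft constraints (p at or below the Bonferroni
    threshold and OR > 1) sort first; within each group, smaller p-values
    come first. Nothing is discarded for failing a soft constraint.
    """
    threshold = bonferroni_threshold()
    kept = [s for s in signals if s["drug"] not in nutraceuticals]

    def rank_key(s):
        meets_soft = s["p"] <= threshold and s["or"] > 1
        return (not meets_soft, s["p"])

    return sorted(kept, key=rank_key)
```

Under these assumptions the per-test threshold is 0.05 / 1,678, roughly 3 × 10^-5, so the phenytoin example in Table 1 (p = 1 × 10^-6, OR = 1.03) would fall in the first group.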
Pediatrics evidence synthesis
Then, we consulted a pediatrician with expertise in clinical pharmacology on our study team to identify neurological Phecodes with unlikely manifestation in the perinatal period; these diseases were mainly neurocognitive (e.g., dyslexia) and therefore excluded from consideration as true model results. Our pediatrics consult further stratified higher-level versions of the phenotype embeddings in our model outcomes as incident in infants, toddlers, school-age children, or adolescents, based on disease pattern presentations from clinical practice. Consequently, we excluded all outcomes not plausibly detectable in infants.
Obstetrics evidence synthesis
Following our pediatrician’s review, we consulted a practicing obstetrician on our study team, who has training in clinical pharmacology and maternal-fetal medicine, to identify the plausibility of prescription of the drugs implicated in our model during pregnancy. In completing this review, our obstetrics consultant synthesized knowledge from her own prescriptive practice, prescriptive guidelines from American College of Obstetricians and Gynecologists, Society for Maternal-Fetal Medicine, departmental practice guidelines at Vanderbilt, and clinical decision software (CDS) like UpToDate[4] and Reprotox[45] to stratify our signals as “high-yield” and “low-yield” outcomes. We defined high-yield outcomes as those which demonstrated statistical significance, at least 1% coincidence rate between drug prescription and pediatric disease (such that, with our sample sizes of mothers prescribed each drug and neonates born with each disease, we prioritized only non-unary outcomes), and unclear prescriptive recommendations and/or practice guidelines for implicated drugs (e.g., FDA score C and conflicting case reports described in CDS). These drugs also had plausible prescription during the first trimester of pregnancy, when most neurological organ development occurs. Low-yield outcomes included signals rooted in drugs available OTC, such that EHR data on drug use were not reliable for our first-pass analysis, and signals with drugs sparsely prescribed to pregnant patients in the United States of America due to lack of regional drug supply and/or existing guidance against prescription of these drugs during pregnancy. 
Our consideration of the latter revealed to us that our low-yield signals may be artifactual noise from our inferential approach to defining gestational period, if these drugs appeared in pregnant patients’ EHRs before discontinuation, when providers first learned of their patients’ pregnancies.
Our designation of the yields of our signals was powered by a spreadsheet model we developed, which codified the considerations above by fields including the following: (1) “drug’s original indication” (to help identify potential cases of confounding by maternal morbidity, by which a neonate could inherit the mother’s disease or the drug’s associated adverse outcomes could be sequelae of pre-term birth precipitated by the disease for which the mother is treated); (2) “FDA drug class”; (3) “trimester of prescription”; (4) “intrapartum or immediate postpartum prescription?” (a response of “yes” to this question resulted in a signal’s relative de-prioritization, given our interest in antepartum exposures and the difficulty of perfectly ascertaining gestational period within the EHR); (5) “duration of prescription”.
In an ad hoc fashion, both consultants, as well as a pharmacologist, removed drugs from consideration which presented with implausible PK for their associated toxicities (e.g., non-systemic absorption).
Figure 1 provides a summary of our process for developing and vetting MedWAS data.
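The high-yield and low-yield designations described in this section could be encoded roughly as below. This is a hedged sketch only: the `SignalReview` fields paraphrase the criteria and spreadsheet fields named in the text, the "unclassified" label is an addition for signals that match neither definition, and treating a peripartum-only prescription as disqualifying for high yield is a simplifying assumption (the text describes relative de-prioritization, not exclusion).

```python
from dataclasses import dataclass


@dataclass
class SignalReview:
    """One reviewed signal; field names paraphrase the spreadsheet fields."""
    significant: bool                # met the statistical soft constraint
    coincidence_rate: float          # fraction of diseased neonates with exposure
    guidance_unclear: bool           # e.g., FDA score C or conflicting case reports
    first_trimester_plausible: bool  # plausible first-trimester prescription
    over_the_counter: bool           # drug is available OTC
    sparsely_prescribed: bool        # limited supply or guidance against use
    peripartum_only: bool            # intrapartum / immediate postpartum prescription


def classify_yield(r):
    """Return "high-yield", "low-yield", or "unclassified" for a signal."""
    if r.over_the_counter or r.sparsely_prescribed:
        return "low-yield"
    if (r.significant and r.coincidence_rate >= 0.01 and r.guidance_unclear
            and r.first_trimester_plausible and not r.peripartum_only):
        return "high-yield"
    return "unclassified"
```

In practice the designation rested on expert clinical judgment, so a rule like this would at most pre-sort signals for the obstetrics and pediatrics reviewers.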
Fig. 1
Summary of the process to develop and assure the quality of MedWAS signals—the goal of our approach is to unearth and prioritize drug safety risks, towards generation of the highest quality hypotheses to inform new regulatory review programs.
Engagement of obstetric, pharmacological, and regulatory stakeholders is inherent to this process.
Table 1
Example MedWAS Outcome^a: An example MedWAS outcome for a known teratogenic relationship (fetal phenytoin intoxication and chorea) shows statistical significance across several mother-baby pairs, as we expected.

| Drug | Disease | p | OR | # Disease + | % Disease + with Drug Exposure |
| --- | --- | --- | --- | --- | --- |
| Phenytoin | Abnormal involuntary movements | 1 × 10^−6 | 1.03 | 195 | 1 |

^a Per the “Data Availability Statement,” row-level positive control data may be available upon request, and the accompanying Supplementary Data 1 contains exposure counts for all drugs we studied.
A collection of 21 similar results across 10 known teratogens supports our claim of proof of concept for our approach*.