OpenFDA: an innovative platform providing access to a wealth of FDA's publicly available data.

Taha A Kass-Hout1, Zhiheng Xu2, Matthew Mohebbi3, Hans Nelsen3, Adam Baker3, Jonathan Levine1, Elaine Johanson1, Roselie A Bright1.   

Abstract

OBJECTIVE: The objective of openFDA is to facilitate access to and use of large, important Food and Drug Administration (FDA) public datasets by developers, researchers, and the public, through harmonization of data across disparate FDA datasets provided via application programming interfaces (APIs).
MATERIALS AND METHODS: Using cutting-edge technologies deployed on FDA's new public cloud computing infrastructure, openFDA provides open data for easier, faster (over 300 requests per second per process), and better access to FDA datasets; open source code and documentation shared on GitHub for open community contributions of examples, apps and ideas; and infrastructure that can be adopted for other public health big data challenges.
RESULTS: Since its launch on June 2, 2014, openFDA has developed four APIs for drug and device adverse events, recall information for all FDA-regulated products, and drug labeling. There have been more than 20 million API calls (more than half from outside the United States), 6000 registered users, 20,000 connected Internet Protocol addresses, and dozens of new software (mobile or web) apps developed. A case study demonstrates a use of openFDA data to understand an apparent association of a drug with an adverse event.
CONCLUSION: With easier and faster access to these datasets, consumers worldwide can learn more about FDA-regulated products.
© The Author 2015. Published by Oxford University Press on behalf of the American Medical Informatics Association. All rights reserved.

Keywords:  API; adverse event; application programming interface; drug safety; open data; open source; openFDA

Year:  2015        PMID: 26644398      PMCID: PMC4901374          DOI: 10.1093/jamia/ocv153

Source DB:  PubMed          Journal:  J Am Med Inform Assoc        ISSN: 1067-5027            Impact factor:   4.497


BACKGROUND AND SIGNIFICANCE

In the United States, Food and Drug Administration (FDA)-regulated products account for about 25 cents of every dollar spent by American consumers each year; these products touch the lives of every American every day. Americans rely on the FDA to keep their food, medical products, and other FDA-regulated products safe, and where applicable, effective. FDA’s major activities regarding marketed products are to:
1) Require firms that manufacture or distribute FDA-regulated products to register with FDA, list the products, and provide the labeling on those products.
2) Review reports of adverse events, which may include patient or user harm, use error, and/or a quality problem with the product, to monitor marketed products. FDA accepts all voluntary reports and has issued reporting requirements that vary with the particular type of product.
3) Inspect firms and/or products, either routinely or as part of an investigation prompted by an adverse event report or other concern.
4) Monitor product recalls, which most often are manufacturer initiated.

FDA has been making non-confidential portions of these data available on www.fda.gov in 3 modes:
1) Web-based search tools that return structured and/or unstructured data. These are excellent for occasional use with simple queries.
2) The entire database downloadable in comma-separated value (CSV) or standard generalized markup language (SGML) format. This mode is only practical for users with broadband internet access, large storage space, statistical software, and the technical skill to properly process the relational database files and any unstructured fields.
3) Individual files based on popular Freedom of Information Act (FOIA) requests. These files may not be relevant to the user’s questions. Members of the public may also file custom FOIA requests for specific records.

To address these difficulties, FDA launched the openFDA project in March 2013.
The first priority was given to databases with a combination of high general consumer interest, low accessibility on www.fda.gov, and high interest on the part of the FDA organization that stewards the database. By making the data available to the public in a new harmonized big data format, FDA encourages scientists, clinicians, informaticists, software developers, and other technically focused individuals in both the private and public sectors to explore the data, develop applications that automatically access the data, and offer their own enhancements to the data or the software. The more recent Presidential Executive Order on Open Data and the Department of Health and Human Services Health Data Initiative require FDA to make its publicly available data more easily accessible in a structured, computer-readable format. On June 2, 2014, openFDA was launched in Beta mode at https://open.fda.gov. This project uses cutting-edge technologies, and is a pilot for how FDA can develop and deploy novel applications in the public cloud securely and efficiently in the future. In this article, we describe the system and demonstrate how to obtain openFDA data through application programming interface (API) calls. A case study illustrates the use of openFDA in investigating an apparent association between a drug and an adverse event.

METHODS

Data Sources

Four main data sources are currently available in openFDA:
1) FAERS (FDA Adverse Event Reporting System), for drugs and selected biological products.
2) SPL (Structured Product Labeling), for drugs and selected biological products.
3) RES (Recall Enterprise System), primarily recall notices, and also market withdrawals and safety alerts, for drugs, selected biological products, devices, and foods.
4) MAUDE (Manufacturer and User Device Experience), adverse event reports for medical devices.

Details for each data source can be found at https://open.fda.gov/updates/. To address issues related to differences in the structure of the three drug databases (adverse event reports, recalls, and labeling), openFDA features harmonization on drug identifiers (generic name, brand name, etc.), to make it easier to both search for and understand the drug products returned by API queries (details can be found at https://open.fda.gov/updates/). When users query a drug database API, they can search either fields original to the database (never deleted) or the harmonized openFDA fields.
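As a rough illustration of the choice between original and harmonized fields, the sketch below builds query URLs for the drug adverse event API. The base URL, the endpoint path, the field names (`patient.drug.medicinalproduct`, `patient.drug.openfda.generic_name`, `patient.reaction.reactionmeddrapt`), and the `search`, `count`, and `limit` parameters are assumptions drawn from openFDA's public API documentation, not from this article, and `build_query` is a hypothetical helper. The code only constructs URLs; it sends no request.

```python
from urllib.parse import urlencode

# Assumed from openFDA's public API documentation, not from this article.
BASE_URL = "https://api.fda.gov"
DRUG_EVENT_ENDPOINT = "/drug/event.json"  # drug adverse event reports (FAERS)


def build_query(field, term, count=None, limit=None):
    """Hypothetical helper: build a URL that searches `field` for the phrase `term`."""
    params = {"search": f'{field}:"{term}"'}
    if count is not None:
        params["count"] = count  # tally values of this field instead of listing reports
    if limit is not None:
        params["limit"] = limit  # maximum number of results to return
    return f"{BASE_URL}{DRUG_EVENT_ENDPOINT}?{urlencode(params)}"


# Search a field original to the database (the product name as reported).
original_url = build_query("patient.drug.medicinalproduct", "aspirin", limit=1)

# Search the harmonized openFDA field (generic name).
harmonized_url = build_query("patient.drug.openfda.generic_name", "aspirin", limit=1)

# Count the reaction terms in reports that match the harmonized search.
count_url = build_query(
    "patient.drug.openfda.generic_name",
    "aspirin",
    count="patient.reaction.reactionmeddrapt.exact",
)

print(original_url)
print(harmonized_url)
print(count_url)
```

Keeping URL construction separate from the network call makes it easy to inspect a query before sending it, for example by pasting it into a browser.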

Logical Architecture

The architecture and technology were chosen to make openFDA scalable; quickly responsive; transferable to new technologies as they mature; easily accessible by application developers, researchers, and the general public; and transparent. The data are on the cloud that has been approved for federal use (Amazon Web Services East). Details can be found at https://open.fda.gov/updates/.

Design of open.fda.gov

The design of the open.fda.gov website draws on best practices in agile development, intuitive user experience, and data visualization, aiming to provide one unified, simple presentation to users. The site is organized around broad types of data, rather than FDA’s internal structure. The site is characterized by a combination of interactive programmer-friendly queries, visualizations and examples that help explain the nature of the data and how to use JSON URL query command syntax. Plain language is used throughout.

Design for Engaged, Open Community

Users are encouraged to use GitHub to see all the open source code and post their own additions or modifications. StackExchange is encouraged for discussion, questions, and answers. In addition, FDA publishes update announcements on open.fda.gov and an openFDA Twitter account, and maintains an openFDA email account, to widen the options for user participation.

RESULTS

Use of openFDA in Applications

Since the launch of openFDA, a growing community on StackExchange has been engaged in discovering novel ways to use, integrate, and analyze the openFDA data. By mid-July 2015, there had been more than 20 million API calls (more than half of which were generated outside the United States), more than 6000 registered API users, and more than 1800 Twitter followers. Twitter has been used to broadcast openFDA news and followers’ feedback and announcements. For example, some followers have shared their experiences of using openFDA to study gender differences in reported drug side effects and to cross-reference openFDA data with other databases. The US General Services Administration’s Request for Quotes of June 17th required responders to build working prototype apps that use openFDA and post them in GitHub, GitLab, or BitBucket; as a result, dozens of prototypes have been posted on GitHub. Apps have been developed that allow a consumer who experiences an adverse event to determine whether there is a report of anyone else having a similar experience after taking the same drug. An interactive dashboard display of drug reports was published as a “hobby.” More advanced statistics are available on a family of public sites used in the following case study.

Case Study: The Causal Relationship between Aspirin and Flushing

This case illustrates some of the professional pharmacovigilance processes, summarized in a public FDA guidance document, for assessing an apparent association between a drug and a particular adverse event in a collection of drug adverse event reports. In the past, triggers of increased adverse event reporting to FDA have included news reports, public FDA alerts, and the introduction of new products. Among the many other issues with drawing causality conclusions from the reports are limited knowledge of the relative extent of use of the drugs involved, and various other alternative explanations for the apparent association. Aspirin is one of the most common drugs listed in openFDA drug adverse event reports (4.2% of all reports). A natural question is: what types of adverse events have been reported for aspirin? This could be rephrased as: what types of adverse events were more often reported in the same reports as aspirin, compared to reports that do not mention aspirin? A proportional reporting ratio (PRR) is a commonly used statistic for this question; a PRR of 2 indicates that the proportion of reports for the drug-event combination is twice the proportion of the event in the overall database. Using an interactive program designed to look at openFDA drug reports data, we can quickly look at the most common events for aspirin (Table 1). Further details of the steps used in this case can be found at https://open.fda.gov/updates/.
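The PRR described above can be written out as a small calculation. The sketch below uses one common 2x2 formulation, which compares the proportion of the event among reports mentioning the drug with the proportion among reports that do not mention it, and adds a 95% confidence interval computed on the log scale. This formulation, the function name `prr_with_ci`, and the counts in the example are assumptions for illustration; the article does not give its exact formula or the total number of reports in the database.

```python
import math


def prr_with_ci(drug_and_event, drug_total, event_total, all_reports, z=1.96):
    """Return (PRR, lower, upper) from report counts.

    drug_and_event: reports mentioning both the drug and the event
    drug_total:     reports mentioning the drug
    event_total:    reports mentioning the event
    all_reports:    all reports in the database
    """
    a = drug_and_event
    c = event_total - drug_and_event        # event reported without the drug
    other_total = all_reports - drug_total  # reports not mentioning the drug
    ratio = (a / drug_total) / (c / other_total)
    # Standard error of ln(PRR) for a 2x2 table.
    se = math.sqrt(1 / a - 1 / drug_total + 1 / c - 1 / other_total)
    return ratio, ratio * math.exp(-z * se), ratio * math.exp(z * se)


# Made-up counts: 20 of 100 drug reports mention the event,
# versus 100 of the 1,000 reports that do not mention the drug.
ratio, lower, upper = prr_with_ci(20, 100, 120, 1100)
print(f"PRR = {ratio:.2f} (95% CI {lower:.2f} to {upper:.2f})")
```

With the counts reported for aspirin and flushing, the call would take `drug_and_event=10071`, `drug_total=169838`, and `event_total=42843`; the value for `all_reports` is not stated in the article and would have to be obtained separately.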
Table 1. Events reported for generic drug name “aspirin” with PRR > 2.0, ranked by PRR.

| Event rank | Event | No. of reports for both aspirin and event | No. of reports for event | PRR |
|---|---|---|---|---|
| | (any) | 169,838 | | |
| 1 | Flushing | 10,071 | 42,843 | 7.6 |

Notes: The query URL for all aspirin reports was http://go.usa.gov/cvRTJ. Finished data and PRR output were from https://open.fda.gov/static/docs/openFDA-analysis-example.pdf. PRR is the Proportional Reporting Ratio.

“Flushing” is the most common event, with 6% (10,071/169,838) of all reports mentioning aspirin also mentioning flushing. Furthermore, the PRR indicates that a report containing aspirin is more than seven times as likely to include flushing as a report that does not contain aspirin. Labeling for aspirin does not include flushing in the list of adverse events. Before concluding that aspirin causes flushing, one must rule out noncausal explanations of the association, including, but not limited to: 1) the association was a chance occurrence, 2) an extraneous event resulted in the apparent association, 3) the event was related to the underlying condition that prompted the medication use, and 4) other medications are responsible for the relationship. Explanations 1) and 2) can be investigated using a dynamic PRR graph (Figure 1).
Figure 1.

Dynamic proportional reporting ratios (PRR) for reports with aspirin and flushing. At each month, the accumulated reports were used to calculate the PRR and its 95% confidence interval.

In Figure 1 we see that before 2009 there was little or no statistical association between aspirin and flushing, with the PRR values only slightly above 1, and the 95% confidence intervals often including 1. After 2008 we see the PRR rapidly increase to 4, and then increase further to between 7 and 9. The confidence intervals for post-2008 data all exclude 1, so these are unlikely to be a chance association. The dramatic increase after 2008 suggests an explanatory event in 2008. In addition, other medication(s) could explain the relationship. Each report allows mentions of multiple drugs and multiple events. Table 2 shows the most common drugs mentioned in reports that also mention aspirin and flushing.
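The dynamic PRR behind Figure 1 (accumulate the reports month by month, then recompute the PRR and its 95% confidence interval) can be sketched as follows. The 2x2 formulation, the function names, and the monthly counts are assumptions made for illustration; they are not the article's code or data.

```python
import math


def prr_with_ci(a, drug_total, event_total, all_reports, z=1.96):
    """PRR and 95% CI from cumulative counts; NaN when a rate is undefined."""
    c = event_total - a                     # event reported without the drug
    other_total = all_reports - drug_total  # reports not mentioning the drug
    if a == 0 or c == 0:
        return (math.nan, math.nan, math.nan)
    ratio = (a / drug_total) / (c / other_total)
    se = math.sqrt(1 / a - 1 / drug_total + 1 / c - 1 / other_total)
    return (ratio, ratio * math.exp(-z * se), ratio * math.exp(z * se))


def dynamic_prr(monthly_counts):
    """Accumulate monthly counts and compute the PRR at each month.

    Each item is (month, drug_and_event, drug_total, event_total, all_reports),
    counting only the reports received in that month.
    """
    totals = [0, 0, 0, 0]
    series = []
    for month, *counts in monthly_counts:
        totals = [t + n for t, n in zip(totals, counts)]
        series.append((month, *prr_with_ci(*totals)))
    return series


# Made-up monthly counts, for illustration only.
monthly_counts = [
    ("month 1", 2, 100, 20, 1000),
    ("month 2", 18, 100, 28, 1000),
]
series = dynamic_prr(monthly_counts)
for month, ratio, lower, upper in series:
    print(f"{month}: PRR {ratio:.2f}, 95% CI {lower:.2f} to {upper:.2f}")
```

In this made-up series the first interval includes 1 and the second excludes it, which is the kind of shift the text describes around 2008.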
Table 2. Drugs most frequently mentioned in reports with “flushing” events, restricted to those with PRR > 2.0.

| Drug rank | Drug | No. of reports for both flushing and drug | No. of reports for drug | PRR | Drug labeling lists flushing in the field “information_for_patients” |
|---|---|---|---|---|---|
| | (any drug) | 42,841 | | | |
| 1 | Niacin | 15,303 | 36,434 | 66 | Yes |
| 2 | Niacin and simvastatin | 4975 | 10,446 | 55 | Yes |
| 3 | Dimethyl fumarate | 6234 | 24,771 | 30 | Yes |
| 4 | Aspirin | 10,071 | 169,838 | 7.6 | No |
| 5 | Lisinopril | 2822 | 90,470 | 3.3 | Yes |
Using openFDA drug labeling, we found that the drugs in Table 2 that list “flushing” are “niacin,” “niacin and simvastatin,” and “lisinopril.” The combination of niacin with simvastatin was first approved as Simcor on February 19, 2008, just before the rise in reports noted in Figure 1. Niacin was reported to reduce the risk of myocardial infarction and stroke in 1975, and to reduce atherosclerosis beginning in 1987. Consensus guidelines for niacin therapy were published in 2012 and 2013. Lisinopril was approved in 1988. We then tested whether these three drugs explain all of the apparent association between aspirin and flushing. Looking at aspirin reports that do not mention niacin or lisinopril results in the complete absence of flushing from the list of associated adverse events. In summary, we have demonstrated that this drug-event association is unlikely to be causal. Research beyond the reporting data is usually essential to fully understand the relationship between drug-event pairs. For example, Cefali et al. found that aspirin reduces flushing after niacin administration. Our case may therefore reflect one drug (niacin) causing the event (flushing), and the event (flushing) leading to use of another drug (aspirin).
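The exclusion step described above (recompute the aspirin and flushing association after dropping reports that mention niacin or lisinopril) can be sketched on a toy dataset. The report structure, the helper names, and every count below are made up for illustration; they are not the article's data or code.

```python
def prr(reports, drug, event):
    """PRR of `event` for reports mentioning `drug` versus reports that do not."""
    with_drug = [r for r in reports if drug in r["drugs"]]
    without_drug = [r for r in reports if drug not in r["drugs"]]
    rate_with = sum(event in r["events"] for r in with_drug) / len(with_drug)
    rate_without = sum(event in r["events"] for r in without_drug) / len(without_drug)
    return rate_with / rate_without


def exclude_drugs(reports, excluded):
    """Drop every report that mentions any drug in `excluded`."""
    return [r for r in reports if not (r["drugs"] & excluded)]


def make_reports(n, drugs, events):
    """Build n identical toy reports, each with its own sets of drugs and events."""
    return [{"drugs": set(drugs), "events": set(events)} for _ in range(n)]


# Toy data: most aspirin reports with flushing also mention niacin.
reports = (
    make_reports(8, {"aspirin", "niacin"}, {"flushing"})
    + make_reports(2, {"aspirin"}, {"flushing"})
    + make_reports(90, {"aspirin"}, {"other event"})
    + make_reports(4, {"niacin"}, {"flushing"})
    + make_reports(10, {"other drug"}, {"flushing"})
    + make_reports(490, {"other drug"}, {"other event"})
)

prr_all = prr(reports, "aspirin", "flushing")
filtered = exclude_drugs(reports, {"niacin", "lisinopril"})
prr_filtered = prr(filtered, "aspirin", "flushing")

print(f"PRR, all reports: {prr_all:.2f}")
print(f"PRR, excluding niacin and lisinopril reports: {prr_filtered:.2f}")
```

In the toy data the PRR falls below the 2.0 threshold once the co-reported drugs are excluded, mirroring the pattern the case study reports for the real data.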

DISCUSSION

The openFDA initiative makes it possible for technology specialists to effectively, automatically, and quickly search, query, or pull massive amounts of public information directly from FDA datasets via URL queries to APIs. As the case study illustrated, openFDA allows users to quickly conduct a variety of analyses to explore the nature of posted data. As we focus on making existing public datasets accessible in new ways, it is important to note that only data that have already been cleared for public use are considered for openFDA. This is the practical reason that narrative descriptions of drug adverse event reports are not in the public datasets. The absence of the narratives from the drug reports is enough to prevent drawing any valid conclusions solely from the openFDA drug report data. For the first time, the recalls data are more readily searchable and the drug labeling data are searchable on any of the standard labeling fields.

CONCLUSION

OpenFDA brings a new model of big data search and analytics across disparate and complex sources by simplifying dataset structures. With easier and better access, users can learn more about FDA-regulated products, as shown in the case study of aspirin and flushing. A new open community shares code, documentation, examples, apps, and ideas related to openFDA. As the president’s executive order stated, “openness in government…promotes the delivery of efficient and effective services to the public.” We invite use of openFDA for entrepreneurship, innovation, and scientific discovery.

FUNDING

This research received no specific grant from any funding agency in the public, commercial, or not-for-profit sectors.

COMPETING INTERESTS

The authors have no competing interests to declare.

REFERENCES

1.  Estimating the extent of reporting to FDA: a case study of statin-associated rhabdomyolysis.

Authors:  Mara McAdams; Judy Staffa; Gerald Dal Pan
Journal:  Pharmacoepidemiol Drug Saf       Date:  2008-03       Impact factor: 2.890

Review 2.  Quantitative signal detection using spontaneous ADR reporting.

Authors:  A Bate; S J W Evans
Journal:  Pharmacoepidemiol Drug Saf       Date:  2009-06       Impact factor: 2.890

3.  2013 ACC/AHA guideline on the treatment of blood cholesterol to reduce atherosclerotic cardiovascular risk in adults: a report of the American College of Cardiology/American Heart Association Task Force on Practice Guidelines.

Authors:  Neil J Stone; Jennifer G Robinson; Alice H Lichtenstein; C Noel Bairey Merz; Conrad B Blum; Robert H Eckel; Anne C Goldberg; David Gordon; Daniel Levy; Donald M Lloyd-Jones; Patrick McBride; J Sanford Schwartz; Susan T Shero; Sidney C Smith; Karol Watson; Peter W F Wilson; Karen M Eddleman; Nicole M Jarrett; Ken LaBresh; Lev Nevo; Janusz Wnek; Jeffrey L Anderson; Jonathan L Halperin; Nancy M Albert; Biykem Bozkurt; Ralph G Brindis; Lesley H Curtis; David DeMets; Judith S Hochman; Richard J Kovacs; E Magnus Ohman; Susan J Pressler; Frank W Sellke; Win-Kuang Shen; Sidney C Smith; Gordon F Tomaselli
Journal:  Circulation       Date:  2013-11-12       Impact factor: 29.690

4.  Comparing reporting rates of adverse events between drugs with adjustment for year of marketing and secular trends in total reporting.

Authors:  Y Tsong
Journal:  J Biopharm Stat       Date:  1995-03       Impact factor: 1.051

5.  Aspirin reduces cutaneous flushing after administration of an optimized extended-release niacin formulation.

Authors:  E A Cefali; P D Simmons; E J Stanek; M E McGovern; C J Kissling
Journal:  Int J Clin Pharmacol Ther       Date:  2007-02       Impact factor: 1.366

6.  Pharmacovigilance in the 21st century: new systematic tools for an old problem.

Authors:  Ana Szarfman; Joseph M Tonning; P Murali Doraiswamy
Journal:  Pharmacotherapy       Date:  2004-09       Impact factor: 4.705

7.  Breaking the news or fueling the epidemic? Temporal association between news media report volume and opioid-related mortality.

Authors:  Nabarun Dasgupta; Kenneth D Mandl; John S Brownstein
Journal:  PLoS One       Date:  2009-11-18       Impact factor: 3.240

Review 8.  Current guidelines for high-density lipoprotein cholesterol in therapy and future directions.

Authors:  Bishnu H Subedi; Parag H Joshi; Steven R Jones; Seth S Martin; Michael J Blaha; Erin D Michos
Journal:  Vasc Health Risk Manag       Date:  2014-04-08