
Validating drug repurposing signals using electronic health records: a case study of metformin associated with reduced cancer mortality.

Hua Xu1, Melinda C Aldrich2, Qingxia Chen3, Hongfang Liu4, Neeraja B Peterson5, Qi Dai6, Mia Levy7, Anushi Shah8, Xue Han9, Xiaoyang Ruan4, Min Jiang1, Ying Li10, Jamii St Julien11, Jeremy Warner7, Carol Friedman10, Dan M Roden12, Joshua C Denny7.   

Abstract

OBJECTIVES: Drug repurposing, which finds new indications for existing drugs, has received great attention recently. The goal of our work is to assess the feasibility of using electronic health records (EHRs) and automated informatics methods to efficiently validate a recent drug repurposing association of metformin with reduced cancer mortality.
METHODS: By linking two large EHRs from Vanderbilt University Medical Center and Mayo Clinic to their tumor registries, we constructed a cohort including 32,415 adults with a cancer diagnosis at Vanderbilt and 79,258 cancer patients at Mayo from 1995 to 2010. Using automated informatics methods, we further identified type 2 diabetes patients within the cancer cohort and determined their drug exposure information, as well as other covariates such as smoking status. We then estimated HRs for all-cause mortality and their associated 95% CIs using stratified Cox proportional hazard models. HRs were estimated according to metformin exposure, adjusted for age at diagnosis, sex, race, body mass index, tobacco use, insulin use, cancer type, and non-cancer Charlson comorbidity index.
RESULTS: Among all Vanderbilt cancer patients, metformin was associated with a 22% decrease in overall mortality compared to other oral hypoglycemic medications (HR 0.78; 95% CI 0.69 to 0.88) and with a 39% decrease compared to type 2 diabetes patients on insulin only (HR 0.61; 95% CI 0.50 to 0.73). Diabetic patients on metformin also had a 23% improved survival compared with non-diabetic patients (HR 0.77; 95% CI 0.71 to 0.85). These associations were replicated using the Mayo Clinic EHR data. Many site-specific cancers including breast, colorectal, lung, and prostate demonstrated reduced mortality with metformin use in at least one EHR.
CONCLUSIONS: EHR data suggested that the use of metformin was associated with decreased mortality after a cancer diagnosis compared with diabetic and non-diabetic cancer patients not on metformin, indicating its potential as a chemotherapeutic regimen. This study serves as a model for robust and inexpensive validation studies for drug repurposing signals using EHR data.
© The Author 2014. Published by Oxford University Press on behalf of the American Medical Informatics Association.

Keywords:  drug repurposing; electronic health records; metformin; natural language processing

Year:  2014        PMID: 25053577      PMCID: PMC4433365          DOI: 10.1136/amiajnl-2014-002649

Source DB:  PubMed          Journal:  J Am Med Inform Assoc        ISSN: 1067-5027            Impact factor:   4.497


INTRODUCTION

The pharmaceutical industry faces a productivity problem in delivering new drugs to market. Current de novo drug discovery and development is costly, time-consuming, and risky.1 Developing a new drug is estimated to cost over US$800 million and to take anywhere from 10 to 17 years,2 with a success rate of less than 10%.3 Therefore, pharmaceutical companies and public-sector researchers are both seeking more creative methods for drug discovery. Recently, drug repurposing (also called repositioning or re-profiling), which finds new indications for existing drugs, has received great attention.1,4–7 Drug candidates for repurposing have often been through the pre-clinical and clinical stages and therefore have known safety profiles, which can substantially reduce the risk, cost, and time of drug development. Repurposing thus offers a possible solution to the productivity dilemma. Success stories of drug repurposing have been reported1 and the need for drug repurposing is well recognized by leaders in industry, academia, and government.8 For example, The Learning Collaborative9 aims to advance therapies for blood cancers through developing a drug repurposing framework across different organizations.
Recently, there has been a growing effort to develop computational approaches to predict drug repurposing associations.10,11 With the availability of comprehensive compound databases containing structure, bioassay, and genomic information, such as NIH's Molecular Libraries Initiative,12,13 new computational methods that utilize high-throughput data to predict drug repurposing signals have been developed, including structure-based virtual screening,14 and analysis of side effect profiles,15,16 genomic and gene expression data,5,17,18 and the biomedical literature.19 More and more potential drug repurposing signals are being predicted; however, how to further validate these potential signals and determine the appropriate next steps (eg, to conduct a clinical trial or not) remains challenging. Here we propose the use of large electronic health record (EHR) databases to validate potential drugs for repurposing. Over the past decade, rapid growth in the clinical implementation of large EHRs has led to an unprecedented expansion in the availability of dense longitudinal clinical datasets of large populations, which are ideal for quantifying drug outcome. Large EHRs have emerged as a valuable resource, enabling clinical and translational research,20,21 including drug outcome studies, for instance for pharmacovigilance.22–25 Moreover, informatics approaches that can efficiently and accurately extract and analyze clinical information from heterogeneous data sources within EHR databases have also been developed and applied to facilitate cost-effective clinical studies using EHRs.26–30 As a first step to assess the use of EHRs for drug repurposing, we conducted a study to validate a recently reported association of metformin, a first-line therapy for type 2 diabetes mellitus (DM2), with reduced cancer mortality. 
A growing body of evidence suggests metformin improves cancer survival31,32 and decreases cancer risk33–36 when compared to other glucose-lowering therapies, suggesting metformin may have clinical promise as an antineoplastic agent.33,37 A recent study of incident cancer patients from primary care clinics in the UK showed that metformin was associated with reduced mortality compared with cancer patients not exposed to metformin.32 Specific cancers, such as pancreatic or colorectal cancer, may have improved survival with metformin use especially for early stage disease.38,39 As a result, metformin is being evaluated for use as a cancer therapeutic agent40,41 and requires confirmation in an independent clinical setting. In this study, we used two state-of-the-art EHR databases at Vanderbilt University Medical Center (VUMC) and Mayo Clinic to conduct a retrospective cohort study to evaluate the association between metformin and overall mortality among incident cancer cases. The purpose of our study was twofold: (1) to validate the association between metformin use and cancer mortality using comprehensive EHRs; and (2) to demonstrate the use of informatics tools in automated data extraction tasks for EHR-based drug repurposing studies. To the best of our knowledge, this is the first study that aims to apply EHR data to drug repurposing research.

METHODS

Data sources

We conducted a retrospective cohort study from January 1, 1995 to December 31, 2010 using the EHRs at VUMC and Mayo Clinic. At VUMC, the Synthetic Derivative (SD), a comprehensive and de-identified image of the EHR at VUMC,42 was used for this study. The SD is updated regularly as new clinical information, including inpatient and outpatient billing codes, laboratory values, laboratory reports, medication orders, and clinical notes, is accrued in the EHR. As of May 2013, the SD contained information on about 2.2 million individuals with dense electronic medical record data dating back to the early 1990s, while the Mayo Clinic EHR contained information on about 7.4 million patients. Patients were eligible for the study if they: (1) had an incident cancer diagnosis (excluding non-melanoma skin cancers because they have a much better prognosis than other types of cancers) between January 1, 1995 and December 31, 2010, identified using the Vanderbilt tumor registry, which is linked to the Vanderbilt EHR; and (2) were aged 18 years or older at the time of tumor diagnosis. Cancer patients were identified using ICD-O (International Classification of Diseases for Oncology) codes and their corresponding date of diagnosis in the Vanderbilt tumor registry, which was initiated in the early 1980s and is regularly maintained by trained nurse abstractors for all cancer patients diagnosed or with their first course of treatment at Vanderbilt. We included only the first incident cancer in individuals having multiple primary tumors. Because heart failure and kidney disease are considered contraindications for metformin use, we excluded patients with congestive heart failure (CHF) or chronic kidney disease (CKD) prior to tumor diagnosis, resulting in a total of 42 165 cancer patients in this study.
CHF was excluded by removing patients with an ICD-9 code of 428.* at any point before the date of tumor diagnosis and CKD was excluded by removing patients with a creatinine level >1.5 mg/dL before the tumor diagnosis date. (Since CHF and CKD can both occur as complications from cancer treatment, we did not remove patients who developed these conditions after cancer diagnosis.) From the date of their cancer diagnosis, patients were followed for overall mortality. Mortality status was assessed by linkage with the local tumor registry. For example, the Vanderbilt tumor registry follows the NAACCR (North American Association of Central Cancer Registries) Death Clearance Manual43 when ascertaining death information for cancer patients.
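The CHF and CKD exclusion rules above can be sketched in a few lines of Python. This is only an illustration of the stated criteria, not the study's actual code: the `Patient` record, its field names, and the helper functions are hypothetical, and "before" is read as strictly earlier than the tumor diagnosis date.

```python
from dataclasses import dataclass, field
from datetime import date


@dataclass
class Patient:
    # Hypothetical minimal record; the real EHR schema is not described in the text.
    tumor_dx: date
    icd9: list = field(default_factory=list)        # (date, code) pairs
    creatinine: list = field(default_factory=list)  # (date, mg/dL) pairs


def has_prior_chf(p: Patient) -> bool:
    # ICD-9 428.* recorded at any point before the tumor diagnosis date.
    return any(
        d < p.tumor_dx and (code == "428" or code.startswith("428."))
        for d, code in p.icd9
    )


def has_prior_ckd(p: Patient) -> bool:
    # Any creatinine value > 1.5 mg/dL before the tumor diagnosis date.
    return any(d < p.tumor_dx and value > 1.5 for d, value in p.creatinine)


def eligible(p: Patient) -> bool:
    # Conditions that develop after diagnosis do not trigger exclusion.
    return not (has_prior_chf(p) or has_prior_ckd(p))
```

Keeping the two rules as separate predicates makes it easy to report how many patients each criterion removes.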

Study design and data extraction

Figure 1 shows the overall design and workflow of this study. Four exposure groups were identified among the Vanderbilt cancer patients based on DM2 disease status and medication status following their cancer diagnosis. The four exposure groups were as follows: (1) DM2 patients using metformin (including patients exposed to both metformin and other DM2 drugs); (2) DM2 patients using other oral hypoglycemic medications (and never metformin); (3) DM2 patients using insulin only; and (4) non-diabetic patients with no use of diabetes drugs. Construction of the study cohort, identification of exposed/unexposed individuals, and ascertainment of covariates were done automatically using existing or newly developed EHR selection algorithms44 incorporating techniques such as natural language processing (NLP).
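The four-way grouping can be expressed as a small decision function. The sketch below is an assumption-laden illustration: the status strings, drug-class labels, and function name are hypothetical, and it assumes that patients in the "other oral" group may also have received insulin (Table 1 reports insulin use in that group).

```python
def exposure_group(dm2_status, drugs_after_dx):
    """Assign one of the four exposure groups, or None if the patient fits none.

    dm2_status: "dm2" or "non_diabetic" (hypothetical labels).
    drugs_after_dx: iterable of drug classes seen after cancer diagnosis,
        drawn from {"metformin", "other_oral", "insulin"} (hypothetical labels).
    """
    drugs = set(drugs_after_dx)
    if dm2_status == "dm2":
        if "metformin" in drugs:
            return "metformin"      # metformin alone or with other DM2 drugs
        if "other_oral" in drugs:
            return "other_oral"     # other oral agents, never metformin
        if drugs == {"insulin"}:
            return "insulin_only"
        return None                 # DM2 with no qualifying medication
    if dm2_status == "non_diabetic" and not drugs:
        return "non_diabetic"
    return None
```

Checking metformin first mirrors the definition of group 1, which includes patients exposed to both metformin and other DM2 drugs.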
Figure 1:

The study design and data extraction workflow for patients in the Vanderbilt electronic health record (EHR) system from January 1995 to December 2010.

To identify DM2 patients, we used an existing algorithm45,46 previously developed by the eMERGE (electronic Medical Records and Genomics) Network.47 The algorithm identifies patients with and without diabetes using three types of clinical information: (1) ICD-9 codes for DM2; (2) medications for DM2; and (3) clinical laboratory results (glucose >200 mg/dL or hemoglobin A1c >6.5%). DM2 individuals met at least two of the three requirements for diagnosis, while non-diabetic patients had none of the three criteria in their records. Prior research has demonstrated that this algorithm has a positive predictive value (PPV) of 98% for DM2 and a PPV of 100% for non-diabetes.45 Patients not meeting either DM2 or non-diabetic definitions were excluded (eg, a single ICD-9 code for diabetes without other supporting evidence; N = 7452). Diabetic individuals were divided into three exposure groups based on medication use after their cancer diagnosis. To identify metformin and other DM2 drug exposure, we used both structured (eg, electronic physician orders) and unstructured (eg, clinic visit notes) medication information in the EHR. MedEx,48,49 an existing high performance NLP system, was used to extract medication names and signature information from unstructured clinical text. We required that subjects have two or more mentions of the diabetes medications in the EHR and at least one mention within 5 years after their cancer diagnosis date to classify subjects by medication use; subjects lacking this information were excluded (N = 2298). Metformin medications included monotherapy and combination therapy, such as metformin with thiazolidinediones (eg, Actoplus Met or PrandiMet). Cancer patients without DM2 were included as an additional unexposed comparison group.
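The two-of-three rule and the medication-mention requirement described above can be sketched as follows. This encodes only the summary given in the text; the published eMERGE algorithm may contain further detail that is not reproduced here. Function names and return labels are hypothetical, and the 5-year window is approximated as 1826 days starting on the diagnosis date.

```python
from datetime import date, timedelta

FIVE_YEARS = timedelta(days=1826)  # about 5 x 365.25 days; an approximation


def classify_diabetes(has_dm2_icd9, has_dm2_medication,
                      glucose_values=(), hba1c_values=()):
    """Return "dm2", "non_diabetic", or "indeterminate" (excluded)."""
    # Laboratory criterion: glucose > 200 mg/dL or hemoglobin A1c > 6.5%.
    has_lab = (any(g > 200 for g in glucose_values)
               or any(a > 6.5 for a in hba1c_values))
    criteria_met = sum([bool(has_dm2_icd9), bool(has_dm2_medication), has_lab])
    if criteria_met >= 2:
        return "dm2"
    if criteria_met == 0:
        return "non_diabetic"
    return "indeterminate"  # eg, a single ICD-9 code with no supporting evidence


def has_enough_mentions(mention_dates, tumor_dx):
    """Two or more mentions overall, at least one within 5 years after diagnosis."""
    in_window = any(tumor_dx <= d <= tumor_dx + FIVE_YEARS for d in mention_dates)
    return len(mention_dates) >= 2 and in_window
```

Patients who land in the "indeterminate" class correspond to the excluded group in the text, which is why the function returns three labels rather than a boolean.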
Clinical covariates were selected a priori and included patient age at cancer diagnosis, sex, race, body mass index (BMI), insulin use, tobacco use, tumor type, and tumor stage. Some covariates were found in structured fields in the EHR or the Vanderbilt tumor registry (eg, age at diagnosis, tumor type, and tumor stage). For covariates that were not available in structured formats, we used NLP algorithms to extract the information from clinical narratives. To determine tobacco use, we utilized a recently developed smoking status extraction algorithm, which achieved a PPV of 93% for determining smoking status in Vanderbilt medical records.50 Height and weight were extracted from patient records within 2 months before and 1 month after cancer diagnosis date and used to estimate BMI. Although structured fields for height and weight exist in the EHRs, these data were often missing (42% of individuals were missing height information and 36% were missing weight). To reduce the number of missing values for height and weight, we developed a regular expression-based program to extract height and weight information from clinical notes. Our manual review of 100 random NLP-extracted weights and heights revealed the PPV was 100%. Use of the NLP method reduced the percentage of missing data for height and weight to 33% and 16%, respectively. In addition, the Deyo adaptation of the Charlson comorbidity index was calculated using ICD-9 codes.51 Since mortality after cancer is the primary outcome, and cancer type was either adjusted for in the regression model or defined the subgroup in the subgroup analyses, ICD-9 codes for cancer diagnoses (140–239) were excluded. All non-cancer ICD-9 codes before or within 30 days after the date of cancer diagnosis were used for the Charlson comorbidity index calculation.
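To make the regular expression approach concrete, here is a minimal sketch of how height and weight might be pulled from note text and converted to BMI. The patterns, accepted units, and function names are illustrative assumptions; the study's actual program is not described in enough detail to reproduce, and real clinical notes would need many more patterns.

```python
import re

# Hypothetical patterns covering forms such as "Wt: 82.5 kg" or "Height 70 in".
WEIGHT_RE = re.compile(
    r"\b(?:weight|wt)\s*[:=]?\s*(\d+(?:\.\d+)?)\s*(kg|lbs?)\b", re.IGNORECASE)
HEIGHT_RE = re.compile(
    r"\b(?:height|ht)\s*[:=]?\s*(\d+(?:\.\d+)?)\s*(cm|in)\b", re.IGNORECASE)


def extract_weight_kg(text):
    """Return the first weight mention in kilograms, or None if absent."""
    m = WEIGHT_RE.search(text)
    if m is None:
        return None
    value, unit = float(m.group(1)), m.group(2).lower()
    return value if unit == "kg" else value * 0.45359237  # pounds to kg


def extract_height_m(text):
    """Return the first height mention in metres, or None if absent."""
    m = HEIGHT_RE.search(text)
    if m is None:
        return None
    value, unit = float(m.group(1)), m.group(2).lower()
    return value / 100 if unit == "cm" else value * 0.0254  # inches to m


def bmi(weight_kg, height_m):
    # Body mass index in kg/m2.
    return weight_kg / height_m ** 2
```

For example, a note reading "Vitals: Ht 175 cm, Wt: 82.5 kg." yields a height of 1.75 m and a weight of 82.5 kg, for a BMI of about 26.9 kg/m2.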
To verify the accuracy of our automated data extraction algorithms for drug exposure, a stratified random sample was selected from each exposure group (N = 50 for each group) and two thoracic oncology nurses independently reviewed the medical charts to determine drug exposure. Discrepancies between the two nurse reviewers were reconciled by a third physician reviewer (JCD), thus forming a ‘gold-standard’ to compare with the automated algorithms. Metformin, other DM2 drug, and insulin groups achieved PPVs of 0.98, 0.95, and 0.92, respectively. The same study design was applied to the Mayo Clinic EHR. The tumor registry at Mayo Clinic is also linked to the EHR to identify cancer patients and obtain tumor-specific information. The same algorithm was used to identify patients with and without diabetes in the Mayo EHR. The MedEx tool was used to process Mayo clinical data to identify different DM2 drug exposure groups. A locally developed program,52 similar to the Vanderbilt algorithm, was used to identify the smoking status of patients in this study.

Statistical analysis

Characteristics of the study population were summarized using median, IQR, and percent. Kaplan–Meier plots were used to visualize the unadjusted cancer survival probabilities of the four exposure groups. To formally assess the influence of metformin on cancer mortality, we used stratified Cox regression models, stratifying on tumor stage, and adjusting for age, sex, race, BMI, tobacco use, insulin, cancer type, and non-cancer Charlson index. Similar stratified Cox regression models were created to evaluate the effect of metformin on cancer survival in the patient population with breast, colorectal, lung, or prostate cancer, although tumor stage 0 and 1 were combined for lung and prostate cancers due to the limited sample sizes in these two stages. In all the aforementioned models, age, BMI, and Charlson index were modeled as restricted cubic spline functions with four knots. The covariate sex was removed from the analytical model when breast cancer and prostate cancer were being examined since these models were restricted to females and males, respectively. For the overall and individual cancer survival analysis, multiple imputation with 20 imputations was implemented for missing BMI measurements following the guidance described by White et al.53 Two-sided p values less than 0.05 were considered statistically significant. All analyses were conducted using R 2.13.1 with the survival, Hmisc, and rms packages (http://www.r-project.org).
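For readers unfamiliar with the Kaplan–Meier method mentioned above, the product-limit estimate can be written in a few lines. This is a generic textbook sketch in Python, not the authors' code (the study used R with the survival, Hmisc, and rms packages), and it covers only the unadjusted survival curve, not the stratified Cox models. The function name is hypothetical.

```python
def kaplan_meier(times, events):
    """Product-limit survival estimate.

    times:  follow-up time for each patient.
    events: 1 if the patient died at that time, 0 if censored.
    Returns a list of (time, survival probability) at each distinct death time.
    """
    data = sorted(zip(times, events))
    n_at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        leaving = 0
        # Gather everyone whose follow-up ends at time t.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            leaving += 1
            i += 1
        if deaths:
            survival *= 1 - deaths / n_at_risk
            curve.append((t, survival))
        n_at_risk -= leaving  # deaths and censored patients both leave the risk set
    return curve
```

With made-up follow-up times `[1, 2, 2, 3, 4]` and event flags `[1, 1, 0, 1, 0]`, the estimate steps down to 0.8, 0.6, and 0.3 at times 1, 2, and 3.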

RESULTS

We identified 42 165 individuals with an incident cancer diagnosis (aged ≥18 years, excluding non-melanoma skin cancer and patients with prior CHF or CKD) between January 1, 1995 and December 31, 2010 (figure 1) at Vanderbilt. Among these cancer patients, 28 917 did not have diabetes, and 3498 had DM2 matching one of the three target medication exposure groups. Of these, 63% used metformin, 26% used other oral DM2 medications, and 11% were on insulin monotherapy. Vanderbilt cancer patients had a median age of 59 years, more than half (57%) were male, and 93% were white (table 1). Median BMI was 27 kg/m2 and 53% of cancer patients were ever smokers. DM2 patients had a median hemoglobin A1c of 7.6%. At Mayo Clinic, we identified 79 258 patients in four exposure groups (figure 2 and table 2). Figure 3 presents the Kaplan–Meier survival curves and associated 95% CIs for the four exposure groups at Vanderbilt University and Mayo Clinic. Cancer patients on metformin had significantly improved 5-year survival compared to patients on other oral hypoglycemic agents (p<0.001), insulin only (p<0.001), or without diabetes (p<0.001). Adjusting for age, sex, race, BMI, tobacco use, insulin use, cancer type, and Charlson index, metformin was associated with significantly reduced overall mortality compared to diabetic patients on other oral hypoglycemic medications (HR 0.78; 95% CI 0.69 to 0.88) and diabetic patients on insulin only (HR 0.61; 95% CI 0.50 to 0.73). Reduced mortality was also observed for metformin compared to cancer patients without diabetes (HR 0.77; 95% CI 0.71 to 0.85) (figure 4). We replicated our findings for overall mortality after a cancer diagnosis in the Mayo Clinic cohort with HRs and 95% CIs as follows: HR 0.70 (95% CI 0.63 to 0.77), HR 0.65 (95% CI 0.58 to 0.73), and HR 0.59 (95% CI 0.54 to 0.65) (figure 4), comparing the metformin group versus other drugs, insulin only, and non-diabetic groups, respectively.
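The reported Vanderbilt counts and hazard ratios can be cross-checked with simple arithmetic. The sketch below uses only numbers stated in the text and Table 1. It assumes, as our reading of the Methods, that both exclusion counts (N = 7452 and N = 2298) are subtracted from the 42 165 eligible patients; the variable and function names are hypothetical.

```python
# Cohort accounting for the Vanderbilt cohort.
eligible_patients = 42165
excluded_indeterminate_dm2 = 7452   # met neither DM2 nor non-diabetic definition
excluded_insufficient_meds = 2298   # lacked the required medication mentions
analytic_cohort = (eligible_patients
                   - excluded_indeterminate_dm2
                   - excluded_insufficient_meds)   # 32415

dm2_groups = {"metformin": 2218, "other_oral": 903, "insulin_only": 377}
non_diabetic = 28917
dm2_total = sum(dm2_groups.values())               # 3498

# Share of DM2 patients in each exposure group, rounded to whole percent.
shares = {name: round(100 * n / dm2_total) for name, n in dm2_groups.items()}
# shares == {"metformin": 63, "other_oral": 26, "insulin_only": 11}


def percent_reduction(hazard_ratio):
    # A hazard ratio below 1 corresponds to a (1 - HR) x 100% lower hazard.
    return round((1 - hazard_ratio) * 100)


reductions = [percent_reduction(hr) for hr in (0.78, 0.61, 0.77)]
# reductions == [22, 39, 23]
```

Under that assumption the figures reconcile: the analytic cohort equals the combined N of 32 415 in Table 1, and the hazard ratios map to the 22%, 39%, and 23% reductions quoted in the abstract.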
Table 1:

Descriptive characteristics of the Vanderbilt cohort (all cancers, 1995–2010)

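| Characteristic | N | DM2, metformin (N = 2218) | DM2, other drugs (N = 903) | DM2, insulin (N = 377) | Non-diabetic patients (N = 28 917) | Combined (N = 32 415) |
|---|---|---|---|---|---|---|
| Age, years | 32 415 | 54, 62, 69* | 56, 64, 70 | 48, 55, 65 | 48, 58, 67 | 49, 59, 67 |
| Male | 32 413 | 58% | 61% | 54% | 57% | 57% |
| White | 29 371 | 88% | 90% | 86% | 93% | 93% |
| Body mass index (kg/m2) | 27 352 | 27, 31, 36 | 26, 31, 35 | 25, 30, 35 | 23, 27, 31 | 24, 27, 32 |
| Hemoglobin A1c | 1 069 | 7.1, 7.6, 8.5 | 7.1, 7.6, 8.4 | 7.1, 7.7, 8.6 | NA | 7.1, 7.6, 8.5 |
| Tobacco use (ever/never) | 22 885 | 58% | 60% | 61% | 53% | 53% |
| Insulin use | 32 415 | 27% | 27% | 100% | 0% | 4% |
| Tumor type | 32 415 | | | | | |
| Colorectal | | 8% | 7% | 3% | 6% | 6% |
| Breast | | 9% | 4% | 3% | 10% | 9% |
| Lung | | 7% | 8% | 5% | 8% | 8% |
| Prostate | | 14% | 9% | 2% | 18% | 18% |
| Other | | 63% | 71% | 86% | 58% | 59% |
| Tumor stage | 27 017 | | | | | |
| Stage 0 | | 5% | 4% | 2% | 6% | 6% |
| Stage 1 | | 28% | 25% | 22% | 26% | 26% |
| Stage 2 or 3 | | 46% | 44% | 32% | 47% | 47% |
| Stage 4 | | 21% | 27% | 43% | 21% | 22% |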

*Lower, median, and upper quartile for continuous variables.

N is the number of non-missing values.

Figure 2:

The study design and data extraction workflow for patients in the Mayo Clinic electronic health record (EHR) system from January 1995 to December 2010.

Table 2:

Descriptive characteristics of the Mayo Clinic cohort (all cancers, 1995–2010)

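| Characteristic | N | DM2, metformin (N = 3029) | DM2, other drugs (N = 1629) | DM2, insulin (N = 1462) | Non-diabetic patients (N = 73 138) | Combined (N = 79 258) |
|---|---|---|---|---|---|---|
| Age, years | 79 258 | 58, 65, 72* | 62, 69, 75 | 57, 65, 72 | 53, 62, 71 | 54, 62, 71 |
| Male | 79 258 | 60% | 68% | 61% | 57% | 58% |
| White | 70 411 | 99% | 98% | 99% | 99% | 99% |
| Body mass index (kg/m2) | 57 513 | 28, 32, 36 | 27, 30, 34 | 26, 29, 33 | 24, 27, 30 | 24, 27, 30 |
| Tobacco use (ever/never) | 67 680 | 52% | 50% | 46% | 37% | 38% |
| Insulin use | 79 258 | 45% | 36% | 100% | 0% | 5% |
| Tumor type | 79 258 | | | | | |
| Colorectal | | 7% | 7% | 7% | 6% | 6% |
| Breast | | 12% | 7% | 7% | 11% | 11% |
| Lung | | 7% | 11% | 6% | 10% | 10% |
| Prostate | | 19% | 17% | 7% | 22% | 21% |
| Other | | 55% | 59% | 74% | 50% | 47% |
| Tumor stage | 73 224 | | | | | |
| Stage 0 | | 7% | 5% | 3% | 5% | 5% |
| Stage 1 | | 30% | 26% | 31% | 26% | 27% |
| Stage 2 or 3 | | 46% | 48% | 42% | 49% | 48% |
| Stage 4 | | 17% | 21% | 24% | 20% | 20% |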

*Lower, median, and upper quartile for continuous variables.

N is the number of non-missing values.

Figure 3:

Kaplan–Meier (K–M) plot of overall cancer survival for the Vanderbilt and Mayo Clinic cohorts. DM2, type 2 diabetes mellitus.

Figure 4:

Adjusted HRs by cancer type for the Vanderbilt and Mayo cohorts. Other, DM2 cancer patients on other drugs; Insulin, DM2 cancer patients on insulin only; Metf, DM2 cancer patients on metformin; None, cancer patients without DM2.

The impact of metformin on mortality varied by cancer type and also by exposure group (figure 4). In the Vanderbilt cohort, reduced mortality with metformin use was observed across all four of the most frequent cancers, specifically breast, colorectal, lung, and prostate. Among diabetic patients with breast cancer, the greatest benefit was observed with metformin use compared to use of other diabetes drugs or insulin only. A reduced but not statistically significant HR was observed when diabetic breast cancer patients on metformin were compared to non-diabetic breast cancer patients. For colorectal cancer, metformin was beneficial compared to patients with diabetes on other drugs and cancer patients without diabetes. Lung cancer and prostate cancer mortality was not significantly improved with metformin, although the HRs did show an overall trend toward reduced mortality.
Associations showed a similar protective benefit of metformin in the Mayo cancer population and most associations by cancer type (breast, colorectal, lung, and prostate) were statistically significant, likely due to larger sample sizes in the Mayo cohort. Metformin was also associated with improved survival in other cancer types in at least one EHR, including bone marrow, gynecologic, genitourinary, and gastrointestinal (see online supplementary appendix figure 1). We present the adjusted overall cancer survival curves for each tumor stage in figure 5. Metformin was associated with reduced mortality irrespective of tumor stage. Predicted survival curves for colorectal, lung, breast, and prostate cancer show patients on metformin had improved survival for each specific cancer (figure 6).
Figure 5:

Adjusted Cox proportional hazards model stratified by tumor stage for the Vanderbilt cohort. All models are based on cancer survival in a smoking white male, age 58 years, body mass index 27 kg/m2, with a cancer other than the four most common tumor types, and not using insulin. DM2, type 2 diabetes mellitus.

Figure 6:

Adjusted Cox proportional hazards model stratified by tumor type for the Vanderbilt cohort. All models are based on cancer survival in a white smoker, age 58 years, body mass index 27 kg/m2, and not using insulin. DM2, type 2 diabetes mellitus.

DISCUSSION

Using two independent study populations, we validated the recently reported drug repurposing association of metformin with cancer survival. Our data demonstrate that metformin improves overall cancer survival compared to other hypoglycemic therapies in patients with DM2 and compared to patients without diabetes. These findings included a total of 111 673 patients and demonstrated a metformin survival benefit for individuals with breast and colorectal cancer in both the Vanderbilt and Mayo cohorts. Evidence for lung and prostate cancer showed a reduced mortality in both the Vanderbilt and Mayo populations, which was statistically significant only in the Mayo cohort, likely due to its larger sample size. Mortality improvements were also seen for a number of other cancers and for all cancer stages. Thus, our data support a broad role for metformin in many cancer types and, potentially, for patients with and without diabetes. We leveraged study site-maintained tumor registries combined with advanced informatics techniques examining the full text of the EHR. Prior studies have shown such methods lead to more accurate results than use of administrative data alone, as has been used in previous studies.44,46 These informatics methods, applied at both study sites, interrogated patient records to provide information on detailed medication exposures and important cancer risk factors such as smoking histories and BMI—detail not commonly afforded in retrospective claims data. With the future ubiquity of available EHR data, such data mining may provide an important tool for drug repurposing, pharmacovigilance, and comparative effectiveness research. Our findings add to a growing body of knowledge supporting a role for metformin in reducing cancer mortality.31,32,54 A strength of this study is that the same study population was used to evaluate multiple cancers. 
Metformin was also statistically associated with improved survival for less common cancers (see online supplementary appendix figure 1), suggesting future studies with greater statistical power should evaluate these less frequently observed cancers. Moreover, most prior epidemiologic studies have used DM2 registries31 or patient surveys38 to assess the association between metformin and cancer risk and survival. We were able to utilize two densely populated EHR-based cohorts in the USA with longitudinal follow-up and linkage with tumor registries. We were also able to incorporate smoking status into our analyses, an important consideration for many cancers but not assessed in some other retrospective studies.38,39 Using NLP for data extraction is an efficient design for hospital-based epidemiologic studies, significantly reducing the time and cost to conduct and replicate the study since no follow-up of participants is needed. In addition, our study was replicated in another independent large EHR (Mayo Clinic), demonstrating the generalizability of both our findings and the informatics tools used in this study. The mechanism by which metformin improves cancer survival either directly (insulin-independent) or indirectly (insulin-dependent) remains unknown37,40,55,56 but may be related to mTOR inhibition.57,58 The broad-based effect on multiple cancers seen in this study suggests a generalized anticancer effect. Future studies are needed to unravel the exact mechanism by which metformin acts and whether metformin should be targeted to particular patients. 
Currently, large efforts are underway to link EHRs across institutions and to standardize the definition of phenotypes for large-scale clinical and genomics studies of disease and treatment.59–61 Informatics approaches, such as NLP technologies that are able to extract standardized clinical information from unstructured clinical text, offer an approach to automate the data extraction process from EHRs.62 Successful progress has been made in applying informatics approaches to clinical and translational research, ranging from identifying patient safety occurrences29 and biosurveillance28 to facilitating genomics research such as genetic epidemiology and pharmacogenomic studies.63,64 In this study, we further demonstrated the value of NLP tools and electronic phenotyping algorithms in epidemiologic studies based on large-scale observational clinical practice data. To improve the efficiency of EHR-based epidemiologic research, more informatics tools to record and/or accurately extract broad types of epidemiologic information such as environmental variables (eg, exercise, diet, and other lifestyle data) are highly desirable. Several limitations warrant caution in interpreting our findings. Our medication exposures were derived from EHRs instead of pharmacy fill records. However, we have previously shown that these methods have both high sensitivity and high PPV and the ability to replicate known pharmacogenetic signals that require accurate knowledge of the timelines of medication exposures.65,66 Moreover, in comparison to claims data, they are not subject to biases from low-cost generic prescriptions, for which insurance claims are often not filed.67 The potential imperfect sensitivity of our algorithms leads to an inability to classify every patient as either diabetic or not, or to fully determine their medication exposures, primarily due to lack of data captured in the EHR.
For example, we cannot exclude remote exposures to particular antidiabetic medications occurring prior to cancer diagnosis (eg, a patient with diabetes may have been treated with metformin prior to the cancer diagnosis at an outside hospital). However, these exposures should have limited effect on cancer prognosis. Our study may be subject to immortal time bias due to misclassification of exposure time, since we are unable to discern whether erroneous exposure time was assigned between cohort entry and mention of medication in the clinical record.68,69 Excluding CHF and CKD patients could be a potential limitation of this study as well, as some physicians use metformin for these patients despite FDA warnings in these populations. We were also unable to stratify by histologic subtype within each cancer type due to small sample sizes within each cancer. We did not adjust for chemotherapy treatment regimens due to the lack of treatment information beyond first-line therapy in the tumor registry. This is a common limitation of epidemiologic studies using tumor registry or SEER data for cancer treatment information. However, it is likely that diabetic patients using metformin receive the same cancer treatment as those not using metformin, thus biasing our results towards the null. There is no published evidence, to our knowledge, of disparities in the treatment of diabetic patients with cancer, although the dosages of steroid pre-medications are often reduced in an effort to reduce incident hyperglycemia. Future classes of antineoplastics, for example, phosphoinositide 3-kinase inhibitors, may be specifically contraindicated for diabetic patients, but these medications are not yet approved for general clinical use.
Diabetic patients with cancer may have more co-morbidities than non-diabetic patients with cancer, and we would expect diabetic patients to have worse survival after a cancer diagnosis than non-diabetic patients.70 However, we found in most comparisons for the four major cancers that diabetic patients on metformin had better survival than non-diabetic patients, although non-diabetic patients had better survival than diabetic patients using other drugs or insulin only. This observation is consistent with that from a recent study conducted in the UK.32 One possible interpretation of this finding is that metformin use significantly improved survival among diabetic patients despite a higher prevalence of co-morbidities. Thus, it is possible that metformin use could also improve survival among non-diabetic cancer patients. Further studies are needed to address this important issue.
We successfully detected the signal of metformin improving cancer survival using EHR data and informatics approaches. However, conducting large-scale drug repurposing studies using EHRs remains challenging. One of the problems is sample size. We had enough power in this study because both DM2 and cancer are high-prevalence diseases and metformin is a first-line therapy for DM2. Lack of power would, however, be an issue for drugs and indications with low prevalence. In our stratified analysis for individual cancers (see online supplementary appendix figure 1), we noticed wider CIs for less frequent cancers such as thyroid, most likely due to small sample size. This problem may be ameliorated by combining EHRs and/or complementing EHR data with data provided by drug manufacturers, drug monitoring agencies (eg, the FDA), and other ancillary data sources.
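The link between sample size and CI width noted above can be illustrated with a rough calculation. The sketch below is not the stratified Cox model used in this study. It assumes the common two-group approximation in which the variance of the log HR is about the sum of the reciprocals of the numbers of deaths in the two groups. The function name `approx_hr_ci`, the HR of 0.8, and the event counts are all hypothetical.

```python
import math
from typing import Tuple


def approx_hr_ci(hr: float, events_a: int, events_b: int, z: float = 1.96) -> Tuple[float, float]:
    """Approximate 95% CI for a hazard ratio from the deaths observed in two groups.

    Assumes var(log HR) ~ 1/events_a + 1/events_b, a rough approximation
    that ignores covariate adjustment and stratification.
    """
    se = math.sqrt(1.0 / events_a + 1.0 / events_b)
    log_hr = math.log(hr)
    return math.exp(log_hr - z * se), math.exp(log_hr + z * se)


# Same hypothetical point estimate, very different numbers of deaths.
common_cancer = approx_hr_ci(0.8, events_a=400, events_b=400)
rare_cancer = approx_hr_ci(0.8, events_a=20, events_b=20)

for label, (lower, upper) in [("common", common_cancer), ("rare", rare_cancer)]:
    print(f"{label}: 95% CI {lower:.2f} to {upper:.2f}")
```

Because the standard error depends on the number of deaths rather than the number of patients, a cancer with few events yields a wide interval even when the point estimate is the same, which is in line with the pattern described for less frequent cancers.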

CONCLUSION

In this study, we have demonstrated that large EHRs are valuable sources for drug repurposing studies. Our findings validate the beneficial effects of metformin for cancer survival. Ongoing and future clinical trials of metformin for specific subtypes of cancer may lead to new opportunities for chemotherapy. This study serves as a model for using EHRs and informatics approaches to robustly and inexpensively validate drugs for repurposing.