Fei Zhang1, Bo Sun1, Xiaolin Diao1, Wei Zhao2, Ting Shu3. 1. Department of Information Center, Fuwai Hospital, National Center for Cardiovascular Diseases, Chinese Academy of Medical Sciences and Peking Union Medical College, No. 167 North Lishi Road, Xicheng District, Beijing, 100037, China. 2. Department of Information Center, Fuwai Hospital, National Center for Cardiovascular Diseases, Chinese Academy of Medical Sciences and Peking Union Medical College, No. 167 North Lishi Road, Xicheng District, Beijing, 100037, China. zw@fuwai.com. 3. National Institute of Hospital Administration, National Health Commission, Building 3, Yard 6, Shouti South Road, Haidian, Beijing, 100044, China. nctingting@126.com.
Abstract
BACKGROUND: Adverse drug reactions (ADRs) are an important concern in the medication process and can impose a substantial economic burden on patients and hospitals. Because of the limitations of clinical trials, it is difficult to identify all possible ADRs of a drug before it is marketed. We developed a new model based on data mining technology to predict potential ADRs from available drug data. METHOD: Based on the Word2Vec model from natural language processing, we propose a new knowledge graph embedding method that embeds drugs and ADRs into their respective vectors, and we build a logistic regression classification model to predict whether a given drug is causally related to a given ADR. RESULT: Comparison with similar studies showed that our model not only had high prediction accuracy but also had a simpler structure. In our experiments, the AUC of the classification model reached a maximum of 0.87, and the mean AUC was 0.863. CONCLUSION: In this paper, we introduce a new method that embeds a knowledge graph to vectorize drugs and ADRs, and then use a logistic regression classification model to predict whether there is a causal relationship between them. The experiments showed that knowledge graph embedding can effectively encode drugs and ADRs, and that the proposed ADR prediction system is also effective.
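The classification step described in the abstract can be illustrated with a minimal sketch. It does not reproduce the authors' embedding method or data: the two-dimensional vectors, the drug and ADR names, the labels, the choice to concatenate the two vectors into one feature vector, and the training settings are all assumptions made for illustration. The sketch trains a plain logistic regression by gradient descent on drug-ADR pairs and computes the AUC on the same toy pairs, which is not a substitute for a held-out evaluation.

```python
import math

# Hypothetical 2-dimensional embeddings standing in for the output of a
# knowledge graph embedding step (all values are made up for illustration).
DRUG_VECS = {
    "drug_a": [0.9, 0.1],
    "drug_b": [0.8, -0.2],
    "drug_c": [-0.7, 0.3],
    "drug_d": [-0.9, -0.1],
}
ADR_VECS = {
    "nausea": [0.6, 0.2],
    "rash": [-0.5, 0.4],
}

# Hypothetical labelled pairs: 1 = known drug-ADR association, 0 = none.
PAIRS = [
    ("drug_a", "nausea", 1), ("drug_a", "rash", 1),
    ("drug_b", "nausea", 1), ("drug_b", "rash", 1),
    ("drug_c", "nausea", 0), ("drug_c", "rash", 0),
    ("drug_d", "nausea", 0), ("drug_d", "rash", 0),
]


def pair_features(drug, adr):
    """Concatenate the drug and ADR vectors (one possible feature choice)."""
    return DRUG_VECS[drug] + ADR_VECS[adr]


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def predict(weights, bias, x):
    """Predicted probability that the pair is a true drug-ADR association."""
    return sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias)


def log_loss(weights, bias, X, y):
    """Mean binary cross-entropy, with probabilities clipped away from 0 and 1."""
    eps = 1e-12
    total = 0.0
    for xi, yi in zip(X, y):
        p = min(max(predict(weights, bias, xi), eps), 1.0 - eps)
        total -= yi * math.log(p) + (1 - yi) * math.log(1.0 - p)
    return total / len(X)


def train(X, y, lr=0.5, epochs=500):
    """Full-batch gradient descent on the logistic loss, starting from zeros."""
    weights = [0.0] * len(X[0])
    bias = 0.0
    n = len(X)
    for _ in range(epochs):
        grad_w = [0.0] * len(weights)
        grad_b = 0.0
        for xi, yi in zip(X, y):
            err = predict(weights, bias, xi) - yi
            for k, xk in enumerate(xi):
                grad_w[k] += err * xk
            grad_b += err
        weights = [w - lr * g / n for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / n
    return weights, bias


def auc(scores, labels):
    """AUC as the fraction of (positive, negative) pairs ranked correctly."""
    pos = [s for s, l in zip(scores, labels) if l == 1]
    neg = [s for s, l in zip(scores, labels) if l == 0]
    wins = sum(1.0 if p > q else 0.5 if p == q else 0.0
               for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


X = [pair_features(d, a) for d, a, _ in PAIRS]
y = [label for _, _, label in PAIRS]

initial_loss = log_loss([0.0] * len(X[0]), 0.0, X, y)
weights, bias = train(X, y)
final_loss = log_loss(weights, bias, X, y)
train_auc = auc([predict(weights, bias, xi) for xi in X], y)

print("loss before training:", round(initial_loss, 4))
print("loss after training:", round(final_loss, 4))
print("AUC on the toy pairs:", train_auc)
```

In practice the embeddings would come from the trained embedding model, and the AUC would be computed on pairs that were not used for training.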
Keywords:
Adverse Drug Reactions; DrugBank; Knowledge Graph Embedding; Word2Vec