LGscore: A method to identify disease-related genes using biological literature and Google data.

Jeongwoo Kim, Hyunjin Kim, Youngmi Yoon, Sanghyun Park.

Abstract

Since the genome project of the 1990s, many gene-focused studies have been conducted, and researchers have confirmed that genes are involved in disease. Identifying the relationships between diseases and genes is therefore important in biology. We propose a method called LGscore, which identifies disease-related genes using Google data and literature data. To implement this method, we first construct a disease-related gene network using text-mining results. We then extract gene-gene interactions based on co-occurrences in abstract data obtained from PubMed, and calculate the weights of the edges in the gene network by means of Z-scoring. Each weight contains two values: the frequency, extracted from the literature data, and the number of Google search results, obtained using Google. We assign a score to each gene through a network analysis, on the assumption that genes with many links, numerous Google search results, and high frequency values are more likely to be involved in disease. For validation, we investigated the top 20 inferred genes for five different diseases using answer sets, which comprised six databases that contain information on disease-gene relationships. We identified a significant number of disease-related genes as well as candidate genes for Alzheimer's disease, diabetes, colon cancer, lung cancer, and prostate cancer. Our method was up to 40% more accurate than existing methods.
Copyright © 2015 Elsevier Inc. All rights reserved.
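
The pipeline described in the abstract (co-occurrence counting, Z-scored edge weights, per-gene network score) can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not say how the two weight values are combined or which network analysis is used, so adding the two Z-scores and summing the weights of each gene's edges are assumptions made here. The function names and the `search_hits` input (a mapping from gene pair to search-result count) are hypothetical.

```python
from itertools import combinations
from statistics import mean, pstdev


def cooccurrence_counts(abstract_genes):
    """Count how many abstracts mention each unordered gene pair."""
    counts = {}
    for genes in abstract_genes:
        # sorted(set(...)) gives each pair one canonical (a, b) ordering.
        for pair in combinations(sorted(set(genes)), 2):
            counts[pair] = counts.get(pair, 0) + 1
    return counts


def z_scores(values):
    """Standardize a dict of raw values; all zeros if there is no variance."""
    data = list(values.values())
    mu, sigma = mean(data), pstdev(data)
    if sigma == 0:
        return {key: 0.0 for key in values}
    return {key: (v - mu) / sigma for key, v in values.items()}


def score_genes(abstract_genes, search_hits):
    """Sum, per gene, the combined Z-scored weights of its edges.

    Assumption: edge weight = z(frequency) + z(search hits).
    Assumption: gene score = sum of the weights of its incident edges.
    """
    freq = cooccurrence_counts(abstract_genes)
    # Pairs missing from search_hits are treated as having zero hits.
    hits = {pair: search_hits.get(pair, 0) for pair in freq}
    z_freq, z_hits = z_scores(freq), z_scores(hits)
    scores = {}
    for pair in freq:
        weight = z_freq[pair] + z_hits[pair]
        for gene in pair:
            scores[gene] = scores.get(gene, 0.0) + weight
    return scores


# Toy input with placeholder gene names and made-up hit counts.
abstracts = [["A", "B", "C"], ["A", "B"], ["A", "C"], ["B", "D"]]
hit_counts = {("A", "B"): 300, ("A", "C"): 300, ("B", "C"): 100, ("B", "D"): 100}
ranking = sorted(score_genes(abstracts, hit_counts).items(),
                 key=lambda item: item[1], reverse=True)
```

Sorting the scores in descending order gives a ranked list from which the top candidates (the top 20 in the abstract's validation) would be taken.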

Keywords:  Data mining; Disease; Gene; Google; Text-mining

Year:  2015        PMID: 25617670     DOI: 10.1016/j.jbi.2015.01.003

Source DB:  PubMed          Journal:  J Biomed Inform        ISSN: 1532-0464            Impact factor:   6.317


  4 in total

1.  OCDB: a database collecting genes, miRNAs and drugs for obsessive-compulsive disorder.

Authors:  Anna P Privitera; Rosario Distefano; Hugo A Wefer; Alfredo Ferro; Alfredo Pulvirenti; Rosalba Giugno
Journal:  Database (Oxford)       Date:  2015-07-30       Impact factor: 3.451

2.  DTMiner: identification of potential disease targets through biomedical literature mining.

Authors:  Dong Xu; Meizhuo Zhang; Yanping Xie; Fan Wang; Ming Chen; Kenny Q Zhu; Jia Wei
Journal:  Bioinformatics       Date:  2016-08-09       Impact factor: 6.937

3.  The research on gene-disease association based on text-mining of PubMed.

Authors:  Jie Zhou; Bo-Quan Fu
Journal:  BMC Bioinformatics       Date:  2018-02-07       Impact factor: 3.169

4.  Using Internet Search Trends and Historical Trading Data for Predicting Stock Markets by the Least Squares Support Vector Regression Model.

Authors:  Ping-Feng Pai; Ling-Chuang Hong; Kuo-Ping Lin
Journal:  Comput Intell Neurosci       Date:  2018-07-24
