Indra Neil Sarkar. Center for Clinical and Translational Science, University of Vermont, Burlington, Vermont 05405, USA. neil.sarkar@uvm.edu
Abstract
OBJECTIVE: The relationship between diseases and their causative genes can be complex, especially in the case of polygenic diseases. Their study is further complicated because many genes may be causally related to multiple diseases. This study explored the relationship between diseases by adapting an approach pioneered in information retrieval: vector space models.
MATERIALS AND METHODS: A vector space model approach was developed that bridges gene-disease knowledge inferred across three knowledge bases: Online Mendelian Inheritance in Man, GenBank, and Medline. The approach was then used to identify potentially related diseases for two target diseases: Alzheimer disease and Prader-Willi syndrome.
RESULTS: For both Alzheimer disease and Prader-Willi syndrome, a set of plausibly related diseases was identified that may warrant further exploration.
DISCUSSION: This study furthers seminal work by Swanson et al. that demonstrated the potential of mining literature for putative correlations. Using a vector space modeling approach, information from both biomedical literature and genomic resources (like GenBank) can be combined to identify putative correlations of interest. To this end, the relevance of the diseases predicted by the vector space modeling approach in this study was validated against supporting literature.
CONCLUSION: The results of this study suggest that a vector space model approach may be a useful means to identify potential relationships between complex diseases, and thereby enable the coordination of gene-based findings across multiple complex diseases.
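The general idea of a vector space model for disease relatedness can be illustrated with a minimal sketch. It is not the study's implementation: the abstract does not specify how vectors are weighted or compared, so the use of raw gene-association counts and cosine similarity here is an assumption, and the disease names, gene names, and weights are hypothetical placeholders rather than data from OMIM, GenBank, or Medline.

```python
import math

def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as {gene: weight} dicts."""
    shared = set(u) & set(v)
    dot = sum(u[g] * v[g] for g in shared)
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0  # an empty vector is treated as unrelated to everything
    return dot / (norm_u * norm_v)

def rank_related(target, vectors):
    """Rank every other disease by similarity to the target, highest first."""
    scores = [
        (name, cosine(vectors[target], vec))
        for name, vec in vectors.items()
        if name != target
    ]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)

# Hypothetical toy data: each disease is a vector over genes, with weights
# standing in for the strength of evidence linking the gene to the disease.
disease_vectors = {
    "disease_A": {"gene_1": 3, "gene_2": 1},
    "disease_B": {"gene_1": 2, "gene_3": 1},
    "disease_C": {"gene_4": 5},
}

ranking = rank_related("disease_A", disease_vectors)
for name, score in ranking:
    print(f"{name}\t{score:.3f}")
```

In this toy setting, diseases that share weighted genes with the target rank higher, and diseases with no genes in common score zero. A real pipeline would need to derive the weights from the underlying knowledge bases, which this sketch does not attempt.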