
GeneTopics - interpretation of gene sets via literature-driven topic models.

Vicky Wang, Li Xi, Ahmed Enayetallah, Eric Fauman, Daniel Ziemek.   

Abstract

BACKGROUND: Annotation of a set of genes is often accomplished through comparison to a library of labelled gene sets such as biological processes or canonical pathways. However, this approach might fail if the employed libraries are not up to date with the latest research, do not capture relevant biological themes, or are curated at a different level of granularity than is required to appropriately analyze the input gene set. At the same time, the vast biomedical literature offers an unstructured repository of the latest research findings that can be tapped to provide thematic sub-groupings for any input gene set.
METHODS: Our proposed method relies on a gene-specific text corpus and extracts commonalities between documents in an unsupervised manner using a topic model approach. We automatically determine the number of topics summarizing the corpus and calculate a gene relevancy score for each topic, which allows us to eliminate non-specific topics. As a result, we obtain a set of literature topics in which each topic is associated with a subset of the input genes, providing directly interpretable keywords and corresponding documents for literature research.
RESULTS: We validate our method on labelled gene sets from the KEGG metabolic pathway collection and the Genetic Association Database (GAD) and show that the approach is able to detect topics consistent with the labelled annotation. Furthermore, we discuss the results on three different types of experimentally derived gene sets: (1) differentially expressed genes from a cardiac hypertrophy experiment in mice, (2) altered transcript abundance in human pancreatic beta cells, and (3) genes implicated by genome-wide association (GWA) studies to be associated with metabolite levels in a healthy population. In all three cases, we are able to replicate findings from the original papers in a quick and semi-automated manner.
CONCLUSIONS: Our approach provides a novel way of automatically generating meaningful annotations for gene sets that are directly tied to relevant articles in the literature. Extending a general topic model method, the approach introduced here establishes a workflow for the interpretation of gene sets generated from diverse experimental scenarios that can complement the classical approach of comparison to reference gene sets.
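
The filtering idea in METHODS (score each topic by how it relates to the input genes, then drop non-specific topics) can be illustrated with a small sketch. The abstract does not define the gene relevancy score, so everything below is an assumption made for illustration: the entropy-based `topic_specificity` measure, the function names, the toy document-topic weights and the 0.1 cutoff are all hypothetical and are not the authors' method. The document-topic weights would normally come from a fitted topic model; here they are hard-coded.

```python
import math
from collections import defaultdict


def gene_topic_weights(doc_topics, doc_genes):
    """Average the document-topic weights of each gene's documents."""
    sums = defaultdict(lambda: defaultdict(float))
    counts = defaultdict(int)
    for doc_id, topics in doc_topics.items():
        for gene in doc_genes[doc_id]:
            counts[gene] += 1
            for topic, weight in topics.items():
                sums[gene][topic] += weight
    return {
        gene: {topic: w / counts[gene] for topic, w in topics.items()}
        for gene, topics in sums.items()
    }


def topic_specificity(gene_weights, topic):
    """Hypothetical score: 1 minus the normalized entropy of a topic's
    weight across genes. 0 means the topic is spread evenly over all
    genes (non-specific); 1 means it is concentrated on a single gene."""
    weights = [gw.get(topic, 0.0) for gw in gene_weights.values()]
    total = sum(weights)
    if total == 0 or len(weights) < 2:
        return 0.0
    probs = [w / total for w in weights if w > 0]
    entropy = -sum(p * math.log(p) for p in probs)
    return 1.0 - entropy / math.log(len(weights))


# Toy input (assumed): per-document topic weights and the gene each
# document was retrieved for.
doc_topics = {
    "d1": {"lipid": 0.8, "generic": 0.2},
    "d2": {"insulin": 0.8, "generic": 0.2},
    "d3": {"insulin": 0.8, "generic": 0.2},
    "d4": {"lipid": 0.8, "generic": 0.2},
}
doc_genes = {"d1": ["A"], "d2": ["B"], "d3": ["C"], "d4": ["A"]}

weights = gene_topic_weights(doc_topics, doc_genes)
scores = {t: topic_specificity(weights, t) for t in ("lipid", "insulin", "generic")}

# Keep only topics above an arbitrary, illustrative cutoff.
CUTOFF = 0.1
kept_topics = sorted(t for t, s in scores.items() if s > CUTOFF)
```

In this toy case the "generic" topic carries the same weight for every gene and is filtered out, while "lipid" and "insulin" each remain tied to a subset of the genes.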

Year:  2013        PMID: 24564875      PMCID: PMC4029197          DOI: 10.1186/1752-0509-7-S5-S10

Source DB:  PubMed          Journal:  BMC Syst Biol        ISSN: 1752-0509


