Discovering functional modules by topic modeling RNA-Seq based toxicogenomic data.

Ke Yu, Binsheng Gong, Mikyung Lee, Zhichao Liu, Joshua Xu, Roger Perkins, Weida Tong.

Abstract

Toxicogenomics (TGx) endeavors to elucidate the underlying molecular mechanisms of toxicity by exploring gene expression profiles in response to toxic substances. RNA-Seq is increasingly regarded as a more powerful alternative to microarrays in TGx studies. However, realizing RNA-Seq's full potential requires novel approaches to extracting information from complex TGx data. If read counts are treated as the number of times a word occurs in a document, gene expression profiles from RNA-Seq are analogous to the word-by-document matrix used in text mining. Topic modeling, which aims to discover the latent structures in text corpora, could therefore help explore RNA-Seq based TGx data. In this study, topic modeling was applied to a typical RNA-Seq based TGx data set to discover hidden functional modules. The RNA-Seq based gene expression profiles were transformed into "documents", on which latent Dirichlet allocation (LDA) was used to build a topic model. We found that samples treated with compounds sharing the same modes of action (MoAs) could be clustered based on topic similarities. The topic most relevant to each cluster was identified as a "marker" topic, which was interpreted by gene enrichment analysis with MoAs and then confirmed by compound and pathway associations mined from the literature. To further validate the "marker" topics, we tested topic transferability from RNA-Seq to microarrays. The RNA-Seq based gene expression profile of a topic specifically associated with the peroxisome proliferator-activated receptor (PPAR) signaling pathway was used to query samples with similar expression profiles in two different microarray data sets, yielding an accuracy of about 85%. This proof-of-concept study demonstrates the applicability of topic modeling to discovering functional modules in RNA-Seq data and suggests a valuable computational tool for leveraging information within TGx data in the RNA-Seq era.
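The read-count-as-word-count analogy in the abstract can be illustrated with a minimal sketch. It is not the authors' pipeline: the abstract does not describe how counts were normalized or filtered before LDA, so the helper names (`counts_to_document`, `build_count_matrix`), the sample names, and the toy counts below are all assumptions made for illustration. The sketch only shows how a sample's gene counts can be read either as a bag-of-words "document" or as one row of a sample-by-gene count matrix, which is the kind of input an off-the-shelf LDA implementation would typically take.

```python
from collections import Counter


def counts_to_document(gene_counts):
    """Turn one sample's read counts into a bag-of-words 'document'.

    Each gene ID plays the role of a word and is repeated as many
    times as its (integer) read count. Hypothetical helper.
    """
    tokens = []
    for gene, count in gene_counts.items():
        tokens.extend([gene] * int(count))
    return tokens


def build_count_matrix(samples):
    """Build a sample-by-gene count matrix with a shared, sorted vocabulary.

    Genes missing from a sample get a count of 0. Hypothetical helper.
    """
    vocab = sorted({gene for counts in samples.values() for gene in counts})
    matrix = {
        name: [int(counts.get(gene, 0)) for gene in vocab]
        for name, counts in samples.items()
    }
    return vocab, matrix


# Illustrative toy counts only; real RNA-Seq counts are far larger and
# would usually be scaled or filtered first (an assumption, not from the abstract).
samples = {
    "sample_A": {"Cyp4a1": 3, "Acox1": 2, "Gapdh": 1},
    "sample_B": {"Cyp1a1": 4, "Gapdh": 2},
}

docs = {name: counts_to_document(counts) for name, counts in samples.items()}
vocab, matrix = build_count_matrix(samples)

# Counting the tokens of a document recovers the original read counts.
recovered = Counter(docs["sample_A"])
```

Either representation carries the same information: `docs` is the "document" view, while `matrix` is the count-matrix view in which each row is a sample and each column a gene from `vocab`.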

Year:  2014        PMID: 25083553     DOI: 10.1021/tx500148n

Source DB:  PubMed          Journal:  Chem Res Toxicol        ISSN: 0893-228X            Impact factor:   3.739


  4 in total

1.  Asymmetric author-topic model for knowledge discovering of big data in toxicogenomics.

Authors:  Ming-Hua Chung; Yuping Wang; Hailin Tang; Wen Zou; John Basinger; Xiaowei Xu; Weida Tong
Journal:  Front Pharmacol       Date:  2015-04-20       Impact factor: 5.810

2.  Open TG-GATEs: a large-scale toxicogenomics database.

Authors:  Yoshinobu Igarashi; Noriyuki Nakatsu; Tomoya Yamashita; Atsushi Ono; Yasuo Ohno; Tetsuro Urushidani; Hiroshi Yamada
Journal:  Nucleic Acids Res       Date:  2014-10-13       Impact factor: 16.971

3.  Application of dynamic topic models to toxicogenomics data.

Authors:  Mikyung Lee; Zhichao Liu; Ruili Huang; Weida Tong
Journal:  BMC Bioinformatics       Date:  2016-10-06       Impact factor: 3.169

4.  Transcriptional Responses Reveal Similarities Between Preclinical Rat Liver Testing Systems.

Authors:  Zhichao Liu; Brian Delavan; Ruth Roberts; Weida Tong
Journal:  Front Genet       Date:  2018-03-20       Impact factor: 4.599
