Exploiting the functional and taxonomic structure of genomic data by probabilistic topic modeling.

Xin Chen, Xiaohua Hu, Tze Y Lim, Xiajiong Shen, E K Park, Gail L Rosen.

Abstract

In this paper, we present a method that enables both the homology-based approach and the composition-based approach to further study the functional core (i.e., the microbial core and the gene core, respectively). In the proposed method, major functionality groups are identified by generative topic modeling, which can extract useful information from unlabeled data. We first show that a generative topic model can be used to model the taxon abundance information obtained by the homology-based approach and to study the microbial core. The model considers each sample as a “document” that has a mixture of functional groups, while each functional group (also known as a “latent topic”) is a weighted mixture of species. Therefore, estimating the generative topic model for taxon abundance data uncovers the distribution over latent functions (latent topics) in each sample. Second, we show that a generative topic model can also be used to study the genome-level composition of “N-mer” features (DNA subreads obtained by composition-based approaches). The model considers each genome as a mixture of latent genetic patterns (latent topics), while each such pattern is a weighted mixture of the “N-mer” features; thus, the existence of core genomes can be indicated by a set of common N-mer features. After studying the mutual information between latent topics and gene regions, we provide an explanation of the functional roles of the uncovered latent genetic patterns. The experimental results demonstrate the effectiveness of the proposed method.
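
To make the "genome as document, N-mer as word" framing concrete, the following minimal sketch builds a genome-by-N-mer count matrix, the kind of input a generative topic model would be fitted to. It is an illustration under stated assumptions, not the authors' pipeline: the function names `nmer_counts` and `count_matrix`, the toy sequences, the choice of N = 3, and the rule of skipping windows with non-ACGT symbols are all hypothetical.

```python
from collections import Counter


def nmer_counts(sequence, n):
    """Count overlapping N-mers in one DNA sequence.

    Assumption: windows containing symbols other than A, C, G, T are skipped.
    """
    sequence = sequence.upper()
    counts = Counter()
    for i in range(len(sequence) - n + 1):
        window = sequence[i:i + n]
        if set(window) <= set("ACGT"):
            counts[window] += 1
    return counts


def count_matrix(genomes, n):
    """Build a genome-by-N-mer count matrix (the "document-term" matrix).

    Returns the sorted N-mer vocabulary and one count row per genome.
    """
    per_genome = {name: nmer_counts(seq, n) for name, seq in genomes.items()}
    vocabulary = sorted(set().union(*per_genome.values()))
    rows = {
        name: [counts[nmer] for nmer in vocabulary]
        for name, counts in per_genome.items()
    }
    return vocabulary, rows


# Toy sequences (hypothetical, far shorter than real genomes).
genomes = {"genome_a": "ACGTACGT", "genome_b": "ACGTTTTT"}
vocabulary, rows = count_matrix(genomes, 3)
print(vocabulary)
print(rows)
```

Each row plays the role of a document's word counts. A topic model fitted to such rows would then express every genome as a mixture of latent topics, with each topic a weighted mixture of N-mers.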

Year:  2012        PMID: 21844637     DOI: 10.1109/TCBB.2011.113

Source DB:  PubMed          Journal:  IEEE/ACM Trans Comput Biol Bioinform        ISSN: 1545-5963            Impact factor:   3.710


  4 in total

1.  Topic modeling revisited: New evidence on algorithm performance and quality metrics.

Authors:  Matthias Rüdiger; David Antons; Amol M Joshi; Torsten-Oliver Salge
Journal:  PLoS One       Date:  2022-04-28       Impact factor: 3.752

2.  Exploiting topic modeling to boost metagenomic reads binning.

Authors:  Ruichang Zhang; Zhanzhan Cheng; Jihong Guan; Shuigeng Zhou
Journal:  BMC Bioinformatics       Date:  2015-03-18       Impact factor: 3.169

3.  Whole-Genome k-mer Topic Modeling Associates Bacterial Families.

Authors:  Ernesto Borrayo-Carbajal; Isaias May-Canche; Omar Paredes; J Alejandro Morales; Rebeca Romo-Vázquez; Hugo Vélez-Pérez
Journal:  Genes (Basel)       Date:  2020-02-14       Impact factor: 4.096

Review 4.  An overview of topic modeling and its current applications in bioinformatics.

Authors:  Lin Liu; Lin Tang; Wen Dong; Shaowen Yao; Wei Zhou
Journal:  Springerplus       Date:  2016-09-20
