
Nonparametric Bayes Modeling of Multivariate Categorical Data.

David B Dunson, Chuanhua Xing.

Abstract

Modeling of multivariate unordered categorical (nominal) data is a challenging problem, particularly in high dimensions and cases in which one wishes to avoid strong assumptions about the dependence structure. Commonly used approaches rely on the incorporation of latent Gaussian random variables or parametric latent class models. The goal of this article is to develop a nonparametric Bayes approach, which defines a prior with full support on the space of distributions for multiple unordered categorical variables. This support condition ensures that we are not restricting the dependence structure a priori. We show this can be accomplished through a Dirichlet process mixture of product multinomial distributions, which is also a convenient form for posterior computation. Methods for nonparametric testing of violations of independence are proposed, and the methods are applied to model positional dependence within transcription factor binding motifs.
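The model form named in the abstract, a Dirichlet process mixture of product multinomial distributions, can be illustrated with a short simulation. The following is a minimal sketch and not the authors' implementation: it assumes a truncated stick-breaking representation of the Dirichlet process and symmetric Dirichlet priors on the category probabilities, and every name in it (`sample_dp_product_multinomial`, `joint_probability`, the truncation level `H`, the concentration `alpha`, the Dirichlet parameter `a`) is hypothetical and chosen for illustration only.

```python
import random


def stick_breaking_weights(alpha, H, rng):
    """Truncated stick-breaking weights: H values that sum to 1."""
    weights, remaining = [], 1.0
    for h in range(H):
        # The last stick takes all remaining mass so the weights sum to 1.
        v = 1.0 if h == H - 1 else rng.betavariate(1.0, alpha)
        weights.append(remaining * v)
        remaining *= 1.0 - v
    return weights


def dirichlet(conc, rng):
    """Draw from a Dirichlet distribution by normalizing gamma variates."""
    draws = [rng.gammavariate(c, 1.0) for c in conc]
    total = sum(draws)
    return [d / total for d in draws]


def sample_dp_product_multinomial(n, levels, alpha=1.0, H=20, a=1.0, seed=0):
    """Simulate n observations of len(levels) unordered categorical variables.

    levels[j] is the number of categories of variable j. H, alpha, and a are
    illustrative assumptions, not values taken from the article.
    """
    rng = random.Random(seed)
    nu = stick_breaking_weights(alpha, H, rng)  # mixture weights
    # psi[h][j] is the probability vector of variable j in component h.
    psi = [[dirichlet([a] * d, rng) for d in levels] for _ in range(H)]
    data, labels = [], []
    for _ in range(n):
        z = rng.choices(range(H), weights=nu)[0]  # latent class
        # Within a class, the variables are drawn independently.
        x = [rng.choices(range(d), weights=psi[z][j])[0]
             for j, d in enumerate(levels)]
        data.append(x)
        labels.append(z)
    return data, labels, nu, psi


def joint_probability(x, nu, psi):
    """Joint cell probability: sum over h of nu[h] * prod over j of psi[h][j][x[j]]."""
    total = 0.0
    for weight, component in zip(nu, psi):
        p = weight
        for j, category in enumerate(x):
            p *= component[j][category]
        total += p
    return total
```

For example, `sample_dp_product_multinomial(100, [4] * 5)` would simulate 100 sequences of five positions with four categories each, loosely analogous to nucleotides in a binding motif. Variables are independent within a mixture component, so any dependence between them in the joint distribution arises only from mixing over components.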

Keywords:  Bayes factor; Dirichlet process; Goodness-of-fit test; Latent class; Mixture model; Motif data; Product multinomial; Unordered categorical

Year:  2012        PMID: 23606777      PMCID: PMC3630378          DOI: 10.1198/jasa.2009.tm08439

Source DB:  PubMed          Journal:  J Am Stat Assoc        ISSN: 0162-1459            Impact factor:   5.033


