A Hierarchical Framework for State-Space Matrix Inference and Clustering.

Chandler Zuo, Kailei Chen, Kyle J Hewitt, Emery H Bresnick, Sündüz Keleş.

Abstract

In recent years, a large number of genomic and epigenomic studies have focused on the integrative analysis of multiple experimental datasets measured over a large number of observational units. The objectives of such studies include not only inferring a hidden state of activity for each unit in individual experiments, but also detecting highly associated clusters of units based on their inferred states. Although a number of methods are tailored to specific datasets, there is currently no state-of-the-art modeling framework for this general class of problems. In this paper, we develop the MBASIC (Matrix Based Analysis for State-space Inference and Clustering) framework. MBASIC consists of two parts: state-space mapping and state-space clustering. In state-space mapping, it maps observations onto a finite state-space that represents the activation states of units across conditions. In state-space clustering, MBASIC incorporates a finite mixture model to cluster the units based on their inferred state-space profiles across all conditions. The state-space mapping and the clustering can be estimated simultaneously through an Expectation-Maximization algorithm. MBASIC flexibly adapts to a large number of parametric distributions for the observed data, as well as to heterogeneity in replicate experiments. It allows structural assumptions to be imposed on each cluster, and enables model selection using information criteria. In our data-driven simulation studies, MBASIC showed significant accuracy in recovering both the underlying state-space variables and the clustering structures. We applied MBASIC to two genome research problems using large numbers of datasets from the ENCODE project. The first application grouped genes based on the transcription factor occupancy profiles of their promoter regions in two different cell types. The second application used transcription factor occupancy data to identify groups of loci that are similar to a GATA2 binding site that is functional at its endogenous locus, and illustrated the applicability of MBASIC to a wide variety of problems. In both studies, MBASIC showed higher levels of raw data fidelity than a two-step approach that analyzes these data using ENCODE results on transcription factor occupancy.
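
As a rough illustration of the clustering idea described in the abstract, the sketch below fits a finite mixture of Bernoulli distributions to a binary state matrix (units by conditions) with an Expectation-Maximization loop. It is not the MBASIC implementation: it assumes the states have already been inferred and binarized, which corresponds to the clustering step alone rather than the joint estimation the abstract describes. The function name `bernoulli_mixture_em` and all parameter names are hypothetical.

```python
import numpy as np

def bernoulli_mixture_em(states, n_clusters, n_iter=50, seed=0):
    """Cluster binary state profiles (units x conditions) with a
    Bernoulli mixture fitted by EM. Illustrative sketch only."""
    rng = np.random.default_rng(seed)
    n_units, n_conditions = states.shape
    # Equal initial mixing weights; random activation probabilities
    # kept away from 0 and 1 so the logarithms stay finite.
    weights = np.full(n_clusters, 1.0 / n_clusters)
    probs = rng.uniform(0.25, 0.75, size=(n_clusters, n_conditions))
    for _ in range(n_iter):
        # E-step: posterior responsibility of each cluster for each unit,
        # computed in log space and normalized per unit.
        log_post = (states @ np.log(probs).T
                    + (1.0 - states) @ np.log(1.0 - probs).T
                    + np.log(weights))
        log_post -= log_post.max(axis=1, keepdims=True)
        resp = np.exp(log_post)
        resp /= resp.sum(axis=1, keepdims=True)
        # M-step: update mixing weights and per-cluster activation
        # probabilities; the small constant guards against empty clusters.
        cluster_size = resp.sum(axis=0) + 1e-9
        weights = cluster_size / cluster_size.sum()
        probs = (resp.T @ states) / cluster_size[:, None]
        probs = np.clip(probs, 1e-6, 1.0 - 1e-6)
    # Hard cluster labels, plus the fitted parameters.
    return resp.argmax(axis=1), probs, weights
```

Calling `bernoulli_mixture_em(states, n_clusters=2)` on a 0/1 matrix returns one cluster label per unit together with the fitted activation probabilities and mixing weights. A joint model of the kind the abstract describes would instead treat the states as latent variables and update them inside the same EM loop.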

Year:  2016        PMID: 29910842      PMCID: PMC6003413          DOI: 10.1214/16-AOAS938

Source DB:  PubMed          Journal:  Ann Appl Stat        ISSN: 1932-6157            Impact factor:   2.083


