Incorporating prior information via shrinkage: a combined analysis of genome-wide location data and gene expression data.

Yang Xie, Wei Pan, Kyeong S Jeong, Arkady Khodursky.

Abstract

Transcriptional control is a critical step in the regulation of gene expression. Understanding such control at the genomic level involves deciphering the mechanisms and structures of regulatory programmes and networks. A difficulty arises from the weak signal and high noise in the various sources of data, while most current approaches are limited to the analysis of a single data source. A natural alternative is to improve statistical efficiency and power by a combined analysis of multiple sources of data. Here we propose a shrinkage method to combine genome-wide location data and gene expression data to detect the binding sites or target genes of a transcription factor. Specifically, a prior 'non-target' gene list is generated by analysing the expression data, and this information is then incorporated into the subsequent binding data analysis via a shrinkage method. There is a Bayesian justification for this shrinkage method. Both simulated and real data were used to evaluate the proposed method and to compare it with analysing binding data alone. In simulation studies, the proposed method gives higher sensitivity and a lower false discovery rate (FDR) in detecting the target genes. In a real data example, the proposed method reduces the estimated FDR and increases the power to detect the previously known target genes of a broad transcription regulator, leucine-responsive regulatory protein (Lrp), in Escherichia coli. This method can also be used to incorporate other information, such as gene ontology (GO), into microarray data analysis to detect differentially expressed genes. Copyright 2006 John Wiley & Sons, Ltd.
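
The two-step idea described above (build a prior 'non-target' list from expression data, then shrink the binding statistics of the listed genes before calling targets) can be illustrated with a minimal sketch. The abstract does not state the exact estimator, so everything here is an assumption made for illustration: soft-thresholding stands in for the shrinkage step, and the gene names, the statistics, the shrinkage amount `lam`, and the calling cutoff `cutoff` are all hypothetical.

```python
import math


def soft_threshold(z, lam):
    """Move z toward zero by lam; anything within [-lam, lam] becomes 0."""
    return math.copysign(max(abs(z) - lam, 0.0), z)


def shrink_binding_stats(binding_z, prior_non_target, lam=1.0):
    """Shrink only the statistics of genes flagged as prior non-targets.

    binding_z        -- per-gene binding test statistics (hypothetical values)
    prior_non_target -- per-gene flags derived from expression data (assumed given)
    lam              -- shrinkage amount (assumed tuning parameter)
    """
    return [
        soft_threshold(z, lam) if flagged else z
        for z, flagged in zip(binding_z, prior_non_target)
    ]


def call_targets(stats, cutoff=2.0):
    """Declare a gene a target when its statistic reaches the cutoff in magnitude."""
    return [abs(s) >= cutoff for s in stats]


# Toy data: four hypothetical genes, one of which is on the prior non-target list.
genes = ["geneA", "geneB", "geneC", "geneD"]
binding_z = [3.5, 2.4, 2.4, 0.3]
prior_non_target = [False, False, True, False]

shrunk = shrink_binding_stats(binding_z, prior_non_target)
calls_binding_only = call_targets(binding_z)
calls_with_prior = call_targets(shrunk)

for gene, before, after in zip(genes, calls_binding_only, calls_with_prior):
    print(gene, "binding only:", before, "| with prior:", after)
```

In this toy setting, geneB and geneC have the same binding statistic, but only geneC is flagged by the expression-derived prior, so only geneC drops below the cutoff after shrinkage. Genes not on the prior list are left unchanged.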

Year: 2007    PMID: 16958153    DOI: 10.1002/sim.2703

Source DB: PubMed    Journal: Stat Med    ISSN: 0277-6715    Impact factor: 2.373


  4 in total

Review 1.  Statistical methods for integrating multiple types of high-throughput data.

Authors:  Yang Xie; Chul Ahn
Journal:  Methods Mol Biol       Date:  2010

Review 2.  Use of pathway information in molecular epidemiology.

Authors:  Duncan C Thomas; David V Conti; James Baurley; Frederik Nijhout; Michael Reed; Cornelia M Ulrich
Journal:  Hum Genomics       Date:  2009-10       Impact factor: 4.639

3.  A Bayesian approach to joint modeling of protein-DNA binding, gene expression and sequence data.

Authors:  Yang Xie; Wei Pan; Kyeong S Jeong; Guanghua Xiao; Arkady B Khodursky
Journal:  Stat Med       Date:  2010-02-20       Impact factor: 2.373

4.  Comparison of linear discriminant analysis methods for the classification of cancer based on gene expression data.

Authors:  Desheng Huang; Yu Quan; Miao He; Baosen Zhou
Journal:  J Exp Clin Cancer Res       Date:  2009-12-10