
Improving Hierarchical Models Using Historical Data with Applications in High-Throughput Genomics Data Analysis.

Ben Li, Yunxiao Li, Zhaohui S Qin.

Abstract

Modern high-throughput biotechnologies such as microarrays and next-generation sequencing produce a massive amount of information for each sample assayed. However, in a typical high-throughput experiment, only a limited amount of data is observed for each individual feature, hence the classical 'large p, small n' problem. The Bayesian hierarchical model, capable of borrowing strength across features within the same dataset, has been recognized as an effective tool for analyzing such data. However, the shrinkage effect, the most prominent feature of hierarchical models, can lead to undesirable over-correction for some features. In this work, we discuss possible causes of the over-correction problem and propose several alternative solutions. Our strategy is rooted in the fact that, in the Big Data era, large amounts of historical data are available and should be taken advantage of. It provides a new framework for enhancing the Bayesian hierarchical model. Through simulation and real data analysis, we demonstrate the superior performance of the proposed strategy. The strategy also enables borrowing information across different platforms, which could be extremely useful given the emergence of new technologies and the accumulation of data from different platforms in the Big Data era. Our method has been implemented in the R package "adaptiveHM", which is freely available from https://github.com/benliemory/adaptiveHM.
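The over-correction described in the abstract can be illustrated with a small numerical sketch. It is not the adaptiveHM algorithm; it assumes the quantity being shrunk is a per-feature variance and uses a generic weighted-average (moderated variance) estimator. All numbers, the prior degrees of freedom, and the names `shrink_variance`, `sample_vars` and `historical_vars` are hypothetical and chosen only for illustration.

```python
import statistics


def shrink_variance(sample_var, sample_df, prior_var, prior_df):
    """Weighted average of a prior variance and a sample variance,
    weighted by their degrees of freedom (generic shrinkage form)."""
    return (prior_df * prior_var + sample_df * sample_var) / (prior_df + sample_df)


# Hypothetical sample variances for five features from one small experiment
# (3 replicates, so 2 degrees of freedom per feature).
sample_vars = [0.8, 1.1, 0.9, 1.2, 20.0]
sample_df = 2
prior_df = 4  # assumed prior strength, not estimated from data here

# Conventional hierarchical shrinkage: one prior shared by all features,
# taken here as the median of the observed variances.
global_prior = statistics.median(sample_vars)
global_shrunk = [
    shrink_variance(v, sample_df, global_prior, prior_df) for v in sample_vars
]

# Historical-data variant: each feature gets its own prior variance,
# assumed to come from earlier experiments on the same features.
historical_vars = [1.0, 1.0, 1.0, 1.0, 18.0]
historical_shrunk = [
    shrink_variance(v, sample_df, h, prior_df)
    for v, h in zip(sample_vars, historical_vars)
]

for i, (v, g, h) in enumerate(zip(sample_vars, global_shrunk, historical_shrunk), 1):
    print(f"feature {i}: sample={v:.2f}  global={g:.2f}  historical={h:.2f}")
```

In this toy setup the fifth feature, whose variance is large in both the current and the hypothetical historical data, is pulled far toward the shared prior under the global scheme but stays close to its sample value when a feature-specific historical prior is used. The four low-variance features are barely affected by either choice.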


Keywords:  450K methylation array; Bayesian hierarchical model; bisulphite sequencing; historical data; informative prior

Year:  2016        PMID: 28919931      PMCID: PMC5599104          DOI: 10.1007/s12561-016-9156-x

Source DB:  PubMed          Journal:  Stat Biosci        ISSN: 1867-1764


