Improving Hierarchical Models Using Historical Data with Applications in High-Throughput Genomics Data Analysis.

Ben Li¹, Yunxiao Li¹, Zhaohui S Qin¹,².

Abstract

Modern high-throughput biotechnologies such as microarrays and next-generation sequencing produce a massive amount of information for each sample assayed. However, in a typical high-throughput experiment, only a limited amount of data is observed for each individual feature, which gives rise to the classical 'large p, small n' problem. The Bayesian hierarchical model, capable of borrowing strength across features within the same dataset, has been recognized as an effective tool for analyzing such data. However, the shrinkage effect, the most prominent feature of hierarchical models, can lead to undesirable over-correction for some features. In this work, we discuss possible causes of the over-correction problem and propose several alternative solutions. Our strategy is rooted in the fact that, in the Big Data era, a large amount of historical data is available and should be taken advantage of. It provides a new framework to enhance the Bayesian hierarchical model. Through simulation and real data analysis, we demonstrate the superior performance of the proposed strategy. The strategy also enables borrowing information across different platforms, which could be extremely useful given the emergence of new technologies and the accumulation of data from different platforms in the Big Data era. Our method has been implemented in the R package "adaptiveHM", which is freely available from https://github.com/benliemory/adaptiveHM.
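The shrinkage and over-correction described in the abstract can be illustrated with a small sketch. It is not the adaptiveHM method, whose details are not given here. It uses a generic moderated-variance estimator, a weighted average of a prior variance and the per-feature sample variance, and all names (`shrink_variance`, `prior_df`) and numbers are hypothetical, chosen only for illustration.

```python
def shrink_variance(sample_var, n, prior_var, prior_df):
    """Moderated variance: weighted average of prior and sample variance.

    sample_var : per-feature sample variance from n replicates
    prior_var  : prior variance the estimate is pulled toward
    prior_df   : prior degrees of freedom (strength of the prior)

    Generic shrinkage formula for illustration only; it is an assumption,
    not the estimator implemented in adaptiveHM.
    """
    df = n - 1  # residual degrees of freedom of the sample variance
    return (prior_df * prior_var + df * sample_var) / (prior_df + df)


# Hypothetical feature whose variance is genuinely high (made-up numbers).
observed_var = 4.0
n_replicates = 3

# Prior estimated from all features in the current dataset: the typical
# variance is low, so the high-variance feature is pulled strongly down.
global_estimate = shrink_variance(observed_var, n_replicates,
                                  prior_var=1.0, prior_df=4.0)

# Feature-specific prior taken from (hypothetical) historical data in which
# this same feature was also highly variable: much less over-correction.
historical_estimate = shrink_variance(observed_var, n_replicates,
                                      prior_var=3.5, prior_df=4.0)

print(f"global prior:     {global_estimate:.2f}")
print(f"historical prior: {historical_estimate:.2f}")
```

With these made-up numbers the dataset-wide prior pulls the estimate from 4.0 down to 2.0, whereas the feature-specific historical prior leaves it near 3.67, which is the kind of over-correction that an informative prior is meant to reduce.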

Keywords: 450K methylation array; Bayesian hierarchical model; bisulphite sequencing; historical data; informative prior

Year: 2016 PMID: 28919931 PMCID: PMC5599104 DOI: 10.1007/s12561-016-9156-x

Source DB: PubMed Journal: Stat Biosci ISSN: 1867-1764
