A multivariate Bernoulli model to predict DNaseI hypersensitivity status from haplotype data.

Huwenbo Shi, Bogdan Pasaniuc, Kenneth L Lange.

Abstract

MOTIVATION: Haplotype models enjoy a wide range of applications in population inference and disease gene discovery. The hidden Markov models traditionally used for haplotypes are hindered by the dubious assumption that dependencies occur only between consecutive pairs of variants. In this article, we apply the multivariate Bernoulli (MVB) distribution to model haplotype data. The MVB distribution relies on interactions among all sets of variants, thus allowing for the detection and exploitation of long-range and higher-order interactions. We discuss penalized estimation and present an efficient algorithm for fitting sparse versions of the MVB distribution to haplotype data. Finally, we showcase the benefits of the MVB model in predicting DNaseI hypersensitivity (DH) status, an epigenetic mark describing chromatin accessibility, from population-scale haplotype data.
RESULTS: We fit the MVB model to real data from 59 individuals on whom both haplotypes and DH status in lymphoblastoid cell lines are publicly available. The model allows prediction of DH status from genetic data (prediction R² = 0.12 in cross-validations). Comparisons of prediction under the MVB model with prediction under linear regression (best linear unbiased prediction) and logistic regression demonstrate that the MVB model achieves about 10% higher prediction R² than the two competing methods in empirical data.
AVAILABILITY AND IMPLEMENTATION: Software implementing the method described can be downloaded at http://bogdan.bioinformatics.ucla.edu/software/.
CONTACT: shihuwenbo@ucla.edu or pasaniuc@ucla.edu.
© The Author 2015. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com.
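
Illustrative sketch (not from the article): the abstract describes the MVB distribution as a model built on interactions among sets of binary variables. The Python below is a minimal sketch of that idea, assuming the model is truncated to main effects and pairwise interactions and that the number of variables is small enough to normalize by brute-force enumeration. The function names, the parameter values, and the choice of placing DH status in the last position are all hypothetical. The authors' penalized estimation algorithm and exact parameterization are not given in the abstract and are not reproduced here.

```python
import itertools
import math


def mvb_log_potential(y, main, pair):
    """Unnormalized log-probability of a binary vector y.

    main[j] is the main effect of variable j; pair[(j, k)] (with j < k) is the
    interaction that contributes only when both y[j] and y[k] equal 1.
    Assumption: interactions of order three and higher are set to zero.
    """
    total = sum(main[j] for j in range(len(y)) if y[j])
    total += sum(w for (j, k), w in pair.items() if y[j] and y[k])
    return total


def mvb_probabilities(main, pair):
    """Normalize over all 2^m binary vectors (feasible only for small m)."""
    configs = list(itertools.product((0, 1), repeat=len(main)))
    logs = [mvb_log_potential(y, main, pair) for y in configs]
    top = max(logs)  # subtract the maximum for numerical stability
    z = sum(math.exp(v - top) for v in logs)
    return {y: math.exp(v - top) / z for y, v in zip(configs, logs)}


def conditional_prob_one(target, observed, main, pair):
    """P(y[target] = 1 | all other entries) under the pairwise model.

    The log-odds equal main[target] plus every interaction that links the
    target to an observed variable whose value is 1.
    """
    logit = main[target]
    for (j, k), w in pair.items():
        if j == target and observed.get(k, 0):
            logit += w
        elif k == target and observed.get(j, 0):
            logit += w
    return 1.0 / (1.0 + math.exp(-logit))


# Hypothetical toy model: indices 0 and 1 are variant alleles, index 2 is DH status.
main = [0.2, -0.5, -1.0]
pair = {(0, 2): 1.5, (1, 2): 0.8}  # variant-to-DH interactions, made-up values

probs = mvb_probabilities(main, pair)
p_dh = conditional_prob_one(2, {0: 1, 1: 0}, main, pair)
print(f"P(DH = 1 | alleles = (1, 0)) = {p_dh:.4f}")
```

Because the interaction terms are not restricted to neighbouring positions, a nonzero entry such as `pair[(0, 2)]` can link any two variables regardless of their distance, which is the property the abstract contrasts with hidden Markov models.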

Year:  2015        PMID: 26139633      PMCID: PMC4836401          DOI: 10.1093/bioinformatics/btv397

Source DB:  PubMed          Journal:  Bioinformatics        ISSN: 1367-4803            Impact factor:   6.937


