Estimating copy numbers of alleles from population-scale high-throughput sequencing data.

Takahiro Mimori, Naoki Nariai, Kaname Kojima, Yukuto Sato, Yosuke Kawai, Yumi Yamaguchi-Kabata, Masao Nagasaki.   

Abstract

BACKGROUND: With the recent development of microarray and high-throughput sequencing (HTS) technologies, a number of studies have revealed catalogs of copy number variants (CNVs) and their association with phenotypes and complex traits. In parallel, a number of approaches to predict CNV regions and genotypes have been proposed for both microarray and HTS data. However, only a few approaches focus on haplotyping of CNV loci.
RESULTS: We propose a novel approach to infer copy unit alleles and their numbers in each sample simultaneously from population-scale HTS data by variational Bayesian inference on a generative probabilistic model inspired by latent Dirichlet allocation, which is a well-studied model for document classification problems. In simulation studies, we evaluated concordance between inferred and true copy unit alleles for lower-, middle-, and higher-copy-number datasets, in which precision and recall were ≥ 0.9 for data with mean coverage ≥ 10× per copy unit. We also applied the approach to HTS data of 1123 samples at the highly variable salivary amylase gene locus and a pseudogene locus, and confirmed consistency of the estimated alleles within samples belonging to a trio of CEPH/Utah pedigree 1463 with 11 offspring.
CONCLUSIONS: Our proposed approach enables detailed analysis of copy number variations, such as association studies between copy unit alleles and phenotypes or biological features including human diseases.
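
To make the latent Dirichlet allocation analogy in the abstract concrete, the following is a minimal illustrative sketch of a mixture-style generative process, not the authors' model. It assumes that each sample plays the role of a document, each read the role of a word, and each copy unit allele the role of a topic, with per-sample allele proportions set by the copy numbers. The function name `simulate_sample`, the single-base read representation, and the uniform error model are all hypothetical simplifications introduced here.

```python
import random

def simulate_sample(alleles, copy_numbers, n_reads, error_rate, rng):
    """Draw toy reads for one sample (hypothetical simplification).

    alleles      : list of equal-length strings, one per copy unit allele
    copy_numbers : number of copies of each allele carried by the sample
    Each read is reduced to a single observed (position, base) pair.
    """
    total = sum(copy_numbers)
    if total <= 0:
        raise ValueError("sample must carry at least one copy")
    # An allele with c copies contributes c / total of the reads,
    # analogous to per-document topic proportions in LDA.
    proportions = [c / total for c in copy_numbers]
    reads = []
    for _ in range(n_reads):
        # Pick the allele (the "topic") this read originates from.
        k = rng.choices(range(len(alleles)), weights=proportions)[0]
        # Pick a position uniformly and emit that allele's base (the "word").
        pos = rng.randrange(len(alleles[k]))
        base = alleles[k][pos]
        # Assumed error model: substitute a different base uniformly at random.
        if rng.random() < error_rate:
            base = rng.choice([b for b in "ACGT" if b != base])
        reads.append((pos, base))
    return reads

# Two alleles that differ only at position 2; the sample carries 2 + 1 copies.
rng = random.Random(0)
toy_reads = simulate_sample(["ACGT", "ACTT"], [2, 1], 3000, 0.01, rng)
```

Under these assumptions, roughly two thirds of the reads covering position 2 should carry `G` and one third `T`. Inference runs in the opposite direction: from read counts of this kind, across many samples, it recovers both the allele sequences and the per-sample copy numbers.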

Year: 2015   PMID: 25707811   PMCID: PMC4331703   DOI: 10.1186/1471-2105-16-S1-S4

Source DB: PubMed   Journal: BMC Bioinformatics   ISSN: 1471-2105   Impact factor: 3.169

