Inference of missing SNPs and information quantity measurements for haplotype blocks.

Shih-Chieh Su, C-C Jay Kuo, Ting Chen.

Abstract

MOTIVATION: Missing data in genotyping single nucleotide polymorphism (SNP) spots are common. High-throughput genotyping methods usually have a high rate of missing data. For example, the published human chromosome 21 data by Patil et al. contains about 20% missing SNPs. Inferring missing SNPs using the haplotype block structure is promising but difficult because the haplotype block boundaries are not well defined. Here we propose a global algorithm to overcome this difficulty.
RESULTS: First, we propose to use entropy as a measure of haplotype diversity. We show that the entropy measure combined with a dynamic programming algorithm produces better haplotype block partitions than other measures. Second, based on the entropy measure, we propose a two-step iterative partition-inference algorithm for the inference of missing SNPs. At the first step, we apply the dynamic programming algorithm to partition haplotypes into blocks. At the second step, we use an iterative process similar to the expectation-maximization algorithm to infer missing SNPs in each haplotype block so as to minimize the block entropy. The algorithm iterates these two steps until the total block entropy is minimized. We test our algorithm in several experimental data sets. The results show that the global approach significantly improves the accuracy of the inference.
AVAILABILITY: Upon request.
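
The two-step loop described in RESULTS can be illustrated with a small Python sketch. This is not the authors' implementation: the abstract does not give the exact partition objective or the inference update, so several pieces here are assumptions. Haplotypes are assumed to be equal-length strings, `?` is an assumed marker for a missing SNP (counted as its own symbol when computing entropy), the partition step is assumed to seek the fewest blocks whose entropy stays at or below a hypothetical threshold `max_entropy`, and the expectation-maximization-like step is replaced by a simpler greedy fill from the most frequent compatible complete pattern. All function names are hypothetical.

```python
from collections import Counter
from math import log2

MISSING = "?"  # assumed marker for an untyped SNP


def block_entropy(haps, start, end):
    """Shannon entropy (bits) of the haplotype patterns in columns [start, end)."""
    counts = Counter(h[start:end] for h in haps)
    n = len(haps)
    return -sum(c / n * log2(c / n) for c in counts.values())


def partition(haps, max_entropy=1.0):
    """Dynamic program: fewest blocks such that every block longer than
    one SNP has entropy <= max_entropy (an assumed objective)."""
    m = len(haps[0])
    best = [0] + [m + 1] * m  # best[j] = fewest blocks covering SNPs [0, j)
    back = [0] * (m + 1)      # back[j] = start of the last block ending at j
    for j in range(1, m + 1):
        for i in range(j):
            ok = j - i == 1 or block_entropy(haps, i, j) <= max_entropy + 1e-9
            if ok and best[i] + 1 < best[j]:
                best[j], back[j] = best[i] + 1, i
    blocks, j = [], m
    while j > 0:  # walk the back-pointers to recover the block boundaries
        blocks.append((back[j], j))
        j = back[j]
    return blocks[::-1]


def fill_block(haps, start, end):
    """Greedy stand-in for the EM-like step: copy the most frequent complete
    pattern that agrees with the observed alleles of an incomplete haplotype."""
    complete = Counter(h[start:end] for h in haps if MISSING not in h[start:end])
    out = []
    for h in haps:
        seg = h[start:end]
        if MISSING in seg:
            fits = [(n, p) for p, n in complete.items()
                    if all(a in (MISSING, b) for a, b in zip(seg, p))]
            if fits:  # leave the gap if no complete pattern is compatible
                seg = max(fits)[1]
        out.append(h[:start] + seg + h[end:])
    return out


def infer_missing(haps, max_entropy=1.0, max_iter=10):
    """Alternate partitioning and filling until total block entropy stops falling."""
    total = float("inf")
    for _ in range(max_iter):
        blocks = partition(haps, max_entropy)
        for start, end in blocks:
            haps = fill_block(haps, start, end)
        new_total = sum(block_entropy(haps, s, e) for s, e in blocks)
        if new_total >= total - 1e-12:
            break
        total = new_total
    return haps, blocks


# Toy data: two 3-SNP blocks, with one missing SNP in each of the last two rows.
toy = ["000000", "000111", "111000", "111111", "0?0111", "1110?0"]
filled, blocks = infer_missing(toy, max_entropy=1.5)
print(blocks)
print(filled)
```

Merging an incomplete haplotype into an existing pattern can only lower the block entropy, which is why the greedy fill serves as a rough proxy for the entropy-minimizing inference step. A threshold-based partition is used here because minimizing summed entropy alone would always favor a single block.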

Year:  2005        PMID: 15699029     DOI: 10.1093/bioinformatics/bti261

Source DB:  PubMed          Journal:  Bioinformatics        ISSN: 1367-4803            Impact factor:   6.937


  9 in total

1.  A multilocus linkage disequilibrium measure based on mutual information theory and its applications.

Authors:  Lei Zhang; Jianfeng Liu; Hong-Wen Deng
Journal:  Genetica       Date:  2009-08-26       Impact factor: 1.082

2.  Multivariate Analysis of Data Sets with Missing Values: An Information Theory-Based Reliability Function.

Authors:  Lisa Uechi; David J Galas; Nikita A Sakhanenko
Journal:  J Comput Biol       Date:  2018-11-29       Impact factor: 1.479

3.  Modelling and visualizing fine-scale linkage disequilibrium structure.

Authors:  David Edwards
Journal:  BMC Bioinformatics       Date:  2013-06-06       Impact factor: 3.169

4.  Computation of haplotypes on SNPs subsets: advantage of the "global method".

Authors:  Cédric Coulonges; Olivier Delaneau; Manon Girard; Hervé Do; Ronald Adkins; Jean-Louis Spadoni; Jean-François Zagury
Journal:  BMC Genet       Date:  2006-10-26       Impact factor: 2.797

5.  Genetic association studies: an information content perspective.

Authors:  Cen Wu; Shaoyu Li; Yuehua Cui
Journal:  Curr Genomics       Date:  2012-11       Impact factor: 2.236

6.  Fast accurate missing SNP genotype local imputation.

Authors:  Yining Wang; Zhipeng Cai; Paul Stothard; Steve Moore; Randy Goebel; Lusheng Wang; Guohui Lin
Journal:  BMC Res Notes       Date:  2012-08-03

7.  Whole genome SNP genotype piecemeal imputation.

Authors:  Yining Wang; Tim Wylie; Paul Stothard; Guohui Lin
Journal:  BMC Bioinformatics       Date:  2015-10-23       Impact factor: 3.169

8.  Identification of rheumatoid arthritis biomarkers based on single nucleotide polymorphisms and haplotype blocks: A systematic review and meta-analysis. (Review)

Authors:  Mohamed N Saad; Mai S Mabrouk; Ayman M Eldeib; Olfat G Shaker
Journal:  J Adv Res       Date:  2015-02-04       Impact factor: 10.479

9.  Comparative study for haplotype block partitioning methods - Evidence from chromosome 6 of the North American Rheumatoid Arthritis Consortium (NARAC) dataset.

Authors:  Mohamed N Saad; Mai S Mabrouk; Ayman M Eldeib; Olfat G Shaker
Journal:  PLoS One       Date:  2018-12-31       Impact factor: 3.240
