
Iterative two-pass algorithm for missing data imputation in SNP arrays.

Christine Sinoquet

Abstract

Although the quality of high-throughput genotyping techniques keeps improving, missing data remain fairly common. Studies have shown that even a low percentage of missing SNPs is detrimental to the reliability of downstream analyses such as SNP-disease association tests. This paper investigates the potential for improving the accuracy of an SNP inference method based on the algorithm previously designed by Roberts and co-workers (NPUTE, 2007). This initial algorithm performs a single scan of an SNP array, inferring missing SNPs in the context of sliding windows. We first designed a variant, KNNWinOpti, which fully exploits backward and forward dependencies between the overlapping windows and thus restores the genuine dependency of inference on scanning direction. Our major contribution, the algorithm SNPShuttle, therefore iterates bi-directional scanning to predict SNP values with more confidence. We ran simulations on realistic benchmarks built from the high-resolution map of mouse strains published by the Perlegen Project. For each of the 20 mouse chromosomes and for missing data percentages ranging from 5% to 30%, SNPShuttle consistently improved on the already high accuracy of KNNWinOpti.
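
The abstract does not give implementation details, so the following is only a minimal sketch of the general idea (window-based nearest-neighbour imputation scanned in both directions and iterated). It is not NPUTE, KNNWinOpti, or SNPShuttle. Every name and rule in it is an assumption made for illustration: the functions `mismatch`, `impute_pass` and `shuttle_impute`, the `half_window` and `max_rounds` parameters, the 0/1 allele encoding with `None` for missing, and the rule that a value is accepted only when the forward and backward scans agree.

```python
from typing import List, Optional

Genotype = Optional[int]  # 0 or 1; None marks a missing SNP (assumed encoding)


def mismatch(a: List[Genotype], b: List[Genotype], lo: int, hi: int) -> int:
    """Count positions in [lo, hi) where both strains are known and differ."""
    return sum(
        1 for k in range(lo, hi)
        if a[k] is not None and b[k] is not None and a[k] != b[k]
    )


def impute_pass(strains: List[List[Genotype]], half_window: int, order) -> None:
    """One scan over SNP positions in the given order, filling gaps in place.

    Values filled earlier in the scan are visible inside later windows,
    which is why the result can depend on the scanning direction.
    """
    n_snps = len(strains[0])
    for j in order:
        lo, hi = max(0, j - half_window), min(n_snps, j + half_window + 1)
        pending = []
        for i, s in enumerate(strains):
            if s[j] is not None:
                continue
            # Candidate donors: strains with a known value at position j.
            donors = [t for t in strains if t[j] is not None]
            if not donors:
                continue
            best = min(donors, key=lambda t: mismatch(s, t, lo, hi))
            pending.append((i, best[j]))
        # Apply after the loop so fills at j do not act as donors at j.
        for i, value in pending:
            strains[i][j] = value


def shuttle_impute(strains, half_window: int = 2, max_rounds: int = 5):
    """Iterate forward and backward scans; keep values both scans agree on."""
    data = [list(s) for s in strains]  # work on a copy
    n_snps = len(data[0])
    for _ in range(max_rounds):
        fwd = [list(s) for s in data]
        bwd = [list(s) for s in data]
        impute_pass(fwd, half_window, range(n_snps))
        impute_pass(bwd, half_window, range(n_snps - 1, -1, -1))
        changed = False
        for i in range(len(data)):
            for j in range(n_snps):
                if (data[i][j] is None and fwd[i][j] is not None
                        and fwd[i][j] == bwd[i][j]):
                    data[i][j] = fwd[i][j]
                    changed = True
        if not changed:
            break
    # Assumed fallback: resolve any remaining gaps with a forward scan.
    impute_pass(data, half_window, range(n_snps))
    return data


# Toy example: the second strain matches the first everywhere it is known.
demo = [
    [0, 0, 1, 1, 0],
    [0, 0, None, 1, 0],
    [1, 1, 0, 0, 1],
]
print(shuttle_impute(demo)[1])  # [0, 0, 1, 1, 0]
```

In the toy example the missing SNP is copied from the first strain, which has zero mismatches with the incomplete strain inside the window, whereas the third strain has four. Accepting only values on which both scan directions agree is one simple way to express "more confidence"; the paper may well use a different criterion.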


Year:  2009        PMID: 19785048     DOI: 10.1142/s0219720009004357

Source DB:  PubMed          Journal:  J Bioinform Comput Biol        ISSN: 0219-7200            Impact factor:   1.122


  2 in total

1.  Fast accurate missing SNP genotype local imputation.

Authors:  Yining Wang; Zhipeng Cai; Paul Stothard; Steve Moore; Randy Goebel; Lusheng Wang; Guohui Lin
Journal:  BMC Res Notes       Date:  2012-08-03

2.  Whole genome SNP genotype piecemeal imputation.

Authors:  Yining Wang; Tim Wylie; Paul Stothard; Guohui Lin
Journal:  BMC Bioinformatics       Date:  2015-10-23       Impact factor: 3.169

