Eric C Chi, Hua Zhou, Gary K Chen, Diego Ortega Del Vecchyo, Kenneth Lange.
Abstract
Most current genotype imputation methods are model-based and computationally intensive, taking days to impute one chromosome pair on 1000 people. We describe an efficient genotype imputation method based on matrix completion. Our matrix completion method is implemented in MATLAB and tested on real data from HapMap 3, simulated pedigree data, and simulated low-coverage sequencing data derived from the 1000 Genomes Project. Compared with leading imputation programs, the matrix completion algorithm embodied in our program MENDEL-IMPUTE achieves comparable imputation accuracy while reducing run times significantly. Implementation in a lower-level language such as Fortran or C is apt to further improve computational efficiency.
Year: 2012 PMID: 23233546 PMCID: PMC3589539 DOI: 10.1101/gr.145821.112
Source DB: PubMed Journal: Genome Res ISSN: 1088-9051 Impact factor: 9.043
Comparison of model-based imputation and matrix completion
Accuracy and timing results for MaCH (MA), fastPHASE (FP), and MENDEL-IMPUTE (MI) on four different chromosomes from the CHB samples from HapMap
Accuracy and timing results for MaCH (MA), fastPHASE (FP), and MENDEL-IMPUTE (MI) on four different chromosomes from the YRI samples from HapMap
Accuracy and timing results for MENDEL-IMPUTE and MaCH on synthetic pedigree data
Association analysis results based on Illumina 2.5M microarray data using the 1KGP reference haplotypes
Timing results on high-coverage genotyping microarray data
Figure 1. The negative logarithm of P-values for association when the true signal depends on SNP 40,938. (MI) MENDEL-IMPUTE.
Accuracy results on synthetic low-coverage sequencing data
Timing results on synthetic low-coverage sequencing data
Figure 2. Accuracy versus time trade-off for the Nesterov algorithm on chromosome 4 from the Chinese Han group in HapMap 3. The numbers indicate the subwindow size w. (Dashed line) Error rate for MaCH on the same data set.
Summary of SNP counts from HapMap 3 used to compare model-based imputation by MaCH and MENDEL-IMPUTE