Bo Liao, Xiong Li, Lijun Cai, Zhi Cao, Haowen Chen.
Abstract
Various strategies can be used to select representative single nucleotide polymorphisms (SNPs) from a large number of SNPs, such as tag SNPs for haplotype coverage and informative SNPs for haplotype reconstruction. Representative SNPs not only reduce the cost of genotyping but also narrow the combinatorial space in epistasis analysis. The capacity of kernel SNPs to unify informative SNPs and tag SNPs is explored, so that inconsistencies in further studies are minimized. The correlation between multiple SNPs is formalized using multi-information measures. Extending this correlation, a distance formula for measuring the similarity between clusters is first designed to conduct hierarchical clustering. The hierarchical clustering takes both information gain and haplotype diversity into account, so that the proposed approach can achieve unification. The kernel SNPs are then selected from every cluster through a top-rank or backward-elimination scheme. Using these kernel SNPs, extensive experimental comparisons are conducted against informative SNPs on haplotype reconstruction accuracy and against tag SNPs on haplotype coverage. Results indicate that kernel SNPs can practically unify informative SNPs and tag SNPs and are therefore adaptable to various applications.
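The abstract does not give the formula for its multi-information measure. As a rough illustration only, the sketch below assumes the standard definition of multi-information (total correlation): the sum of the marginal entropies of the SNPs minus their joint entropy. The function names, the string encoding of haplotypes, and the toy data are hypothetical and are not taken from the paper.

```python
from collections import Counter
from math import log2


def entropy(items):
    """Shannon entropy (in bits) of the empirical distribution of `items`."""
    counts = Counter(items)
    n = sum(counts.values())
    return -sum((c / n) * log2(c / n) for c in counts.values())


def multi_information(haplotypes, snps):
    """Total correlation among the SNP columns listed in `snps`.

    Assumed definition (not confirmed by the abstract):
    sum of marginal entropies minus the joint entropy.
    """
    marginals = sum(entropy([h[j] for h in haplotypes]) for j in snps)
    joint = entropy([tuple(h[j] for j in snps) for h in haplotypes])
    return marginals - joint


# Hypothetical toy data: four haplotypes over four biallelic SNPs.
haplotypes = ["0011", "0011", "1100", "1101"]

# SNPs 0 and 1 always carry the same allele, so they share 1 bit.
print(multi_information(haplotypes, [0, 1]))
# SNPs 0 and 3 are only partly dependent, so the value is smaller.
print(multi_information(haplotypes, [0, 3]))
```

Under this assumed definition, a high value for a group of SNPs means that they are largely redundant, which is the kind of signal a clustering step could use to group SNPs before picking representatives from each cluster.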
Year: 2015 PMID: 26357082 DOI: 10.1109/TCBB.2014.2351797
Source DB: PubMed Journal: IEEE/ACM Trans Comput Biol Bioinform ISSN: 1545-5963 Impact factor: 3.710