Xiang-Sun Zhang, Rui-Sheng Wang, Ling-Yun Wu, Wei Zhang.
Abstract
The Minimum Error Correction (MEC) model is an important model for haplotype reconstruction from SNP fragments. However, it is effective only when the error rate of the SNP fragments is low. In this paper, we propose a new computational model called Minimum Conflict Individual Haplotyping (MCIH) as an extension of MEC. In contrast to conventional approaches, the new model uses both SNP fragment information and related genotype information, so a highly accurate inference can be expected. We first prove that the MCIH problem is NP-hard. To evaluate the practicality of the new model, we design an exact algorithm (a dynamic programming procedure) that solves MCIH on a special data structure. The numerical experiments indicate that MCIH is fairly effective, at the cost of requiring related genotype information, especially when the SNP fragments have a high error rate. Moreover, we present a feed-forward neural network algorithm to solve MCIH for general data structures and large instances. Numerical results on real biological data and simulated data show that the algorithm works well and that MCIH is a potential alternative for individual haplotyping.
Keywords: individual haplotyping; NP-hard; dynamic programming; feed-forward neural network; minimum conflict individual haplotyping; reconstruction rate
Year: 2007 PMID: 19455219 PMCID: PMC2674671
Source DB: PubMed Journal: Evol Bioinform Online ISSN: 1176-9343 Impact factor: 1.625
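The MEC objective named in the abstract can be illustrated with a minimal sketch. It assumes the commonly used formulation: each SNP fragment is a string over `0`, `1` and a gap symbol, and the cost of a candidate haplotype pair is the number of fragment entries that would have to be flipped so that every fragment agrees with one of the two haplotypes. The abstract gives no formal definitions, so the gap symbol `-`, the function names and the toy fragments are all hypothetical, and the MCIH objective itself (which also uses genotype information) is not reproduced here.

```python
from typing import Sequence

GAP = "-"  # assumed symbol for an SNP site that a fragment does not cover


def distance(fragment: str, haplotype: str) -> int:
    """Count the covered sites where the fragment disagrees with the haplotype."""
    return sum(1 for f, h in zip(fragment, haplotype) if f != GAP and f != h)


def mec_score(fragments: Sequence[str], h1: str, h2: str) -> int:
    """Sum, over fragments, of the mismatches against the closer of the two haplotypes."""
    return sum(min(distance(f, h1), distance(f, h2)) for f in fragments)


# Toy data (hypothetical): five fragments over four SNP sites.
fragments = ["01-0", "0100", "10-1", "1011", "-111"]
print(mec_score(fragments, "0100", "1011"))
```

In this toy instance only the last fragment conflicts with the candidate pair, and a single correction reconciles it with the second haplotype. An MEC solver searches for the haplotype pair that minimizes this score; the sketch only evaluates a given pair.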
Figure 1. A three-layer forward neural network.
Figure 2. The results of MCIH and MEC on ACE. From left to right, hr = 0.25, hr = 0.5, hr = 0.75.
Figure 3. The results of MCIH and MEC on Daly set. From left to right, hr = 0.25, hr = 0.5, hr = 0.75.
Figure 4. The results of MCIH and MEC on Hudson’s data with r = 0. From left to right, hr = 0.25, hr = 0.5, hr = 0.75.
Figure 5. The results of MCIH and MEC on Hudson’s data with r = 100. From left to right, hr = 0.25, hr = 0.5, hr = 0.75.
The results of two models on simulated data.
| Error rate | MEC (s=0.5) | MCIH (s=0.5) | MEC (s=0.0) | MCIH (s=0.0) |
|---|---|---|---|---|
| 0.05 | 0.941 | 1.000 | 0.965 | 0.996 |
| 0.1 | 0.904 | 0.969 | 0.950 | 0.984 |
| 0.15 | 0.863 | 0.969 | 0.890 | 0.946 |
| 0.2 | 0.786 | 0.908 | 0.834 | 0.922 |
| 0.25 | 0.763 | 0.863 | 0.766 | 0.830 |
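
The gap between the two models can be read off the table by subtracting the MEC score from the MCIH score in each row. The sketch below does that arithmetic on the values copied from the table; the extracted table does not state what the scores measure (presumably the reconstruction rate listed in the keywords), and the variable and function names are hypothetical.

```python
# Rows copied from the table: error rate -> ((MEC, MCIH) at s=0.5, (MEC, MCIH) at s=0.0).
results = {
    0.05: ((0.941, 1.000), (0.965, 0.996)),
    0.10: ((0.904, 0.969), (0.950, 0.984)),
    0.15: ((0.863, 0.969), (0.890, 0.946)),
    0.20: ((0.786, 0.908), (0.834, 0.922)),
    0.25: ((0.763, 0.863), (0.766, 0.830)),
}


def gains(table):
    """Return, per error rate, MCIH minus MEC for the s=0.5 and s=0.0 settings."""
    return {
        rate: tuple(round(mcih - mec, 3) for mec, mcih in pairs)
        for rate, pairs in table.items()
    }


for rate, (gain_half, gain_zero) in gains(results).items():
    print(f"error rate {rate:.2f}: s=0.5 gain {gain_half:+.3f}, s=0.0 gain {gain_zero:+.3f}")
```

In every row MCIH scores higher than MEC, and for both settings the largest difference occurs at an error rate of 0.2.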