Daniele Catanzaro, Ramamoorthi Ravi, Russell Schwartz.
Abstract
BACKGROUND: Phylogeny estimation from aligned haplotype sequences has attracted increasing attention in recent years due to its importance in the analysis of many fine-scale genetic data sets. Its applications range from medical research and drug discovery to epidemiology and population dynamics. The literature on molecular phylogenetics proposes a number of criteria for selecting a phylogeny from among plausible alternatives. Usually, such criteria can be expressed by means of objective functions, and the phylogenies that optimize them are referred to as optimal. One of the most important estimation criteria is parsimony, which states that the optimal phylogeny T∗ for a set H of n haplotype sequences over a common set of variable loci is the one that satisfies the following requirements: (i) it has the shortest length and (ii) it is such that, for each pair of distinct haplotypes h_i, h_j ∈ H, the sum of the edge weights belonging to the path from h_i to h_j in T∗ is not smaller than the observed number of changes between h_i and h_j. Finding the most parsimonious phylogeny for H involves solving an optimization problem, called the Most Parsimonious Phylogeny Estimation Problem (MPPEP), which is NP-hard in many of its versions.
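The two parsimony requirements can be illustrated with a small Python sketch. It is not the authors' method (the paper works with integer programming models); it only evaluates a given candidate tree. It assumes that the "observed number of changes" between two haplotypes is their Hamming distance, that tree nodes are labeled by their binary sequences, and that the tree is given as weighted edges. The function names `hamming`, `tree_length`, `path_lengths_from`, and `satisfies_distance_condition` are hypothetical and chosen for illustration.

```python
from itertools import combinations


def hamming(a, b):
    """Number of positions at which two equal-length sequences differ
    (assumed here to be the 'observed number of changes')."""
    return sum(x != y for x, y in zip(a, b))


def tree_length(edges):
    """Requirement (i) compares trees by total edge weight."""
    return sum(w for _, _, w in edges)


def path_lengths_from(edges, source):
    """Weighted path length from `source` to every reachable node of a tree."""
    adj = {}
    for u, v, w in edges:
        adj.setdefault(u, []).append((v, w))
        adj.setdefault(v, []).append((u, w))
    dist = {source: 0}
    stack = [source]
    while stack:
        node = stack.pop()
        for nxt, w in adj.get(node, []):
            if nxt not in dist:
                dist[nxt] = dist[node] + w
                stack.append(nxt)
    return dist


def satisfies_distance_condition(haplotypes, edges):
    """Requirement (ii): every tree path is at least as long as the
    observed number of changes between its two endpoint haplotypes."""
    for hi, hj in combinations(haplotypes, 2):
        dist = path_lengths_from(edges, hi)
        if hj not in dist or dist[hj] < hamming(hi, hj):
            return False
    return True


# Haplotypes 00 and 11 can be linked through either 01 or 10.
haplotypes = ["00", "11"]
via_01 = [("00", "01", 1), ("01", "11", 1)]
via_10 = [("00", "10", 1), ("10", "11", 1)]

print(tree_length(via_01), satisfies_distance_condition(haplotypes, via_01))  # 2 True
print(tree_length(via_10), satisfies_distance_condition(haplotypes, via_10))  # 2 True
# A single edge of weight 1 is shorter but violates requirement (ii).
print(satisfies_distance_condition(haplotypes, [("00", "11", 1)]))  # False
```

Both candidate trees have the same length and both satisfy requirement (ii), which is the kind of symmetry that makes the search space redundant; the sketch only checks feasibility and length and makes no attempt to search for an optimal phylogeny.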
Year: 2013 PMID: 23343437 PMCID: PMC3599976 DOI: 10.1186/1748-7188-8-3
Source DB: PubMed Journal: Algorithms Mol Biol ISSN: 1748-7188 Impact factor: 1.405
Figure 1. An example of two symmetric paths linking haplotypes 00 and 11.
Figure 2. An example of two symmetric solutions to the MPPEP-SNP.
Comparison between the gap of the polynomial-size integer programming model of [1] for the MPPEP-SNP and the gap of the flow-based reduced model and its strengthening valid inequalities
| Human chromosome Y | 150 | 49 | 0.00 | 0.00 | 16 | yes |
| Bacterial mtDNA | 17 | 1510 | 26.04 | 96 | no | |
| Chimpanzee mtDNA | 24 | 1041 | 20.63 | 20.63 | 63 | yes |
| Chimpanzee chromosome Y | 15 | 98 | 0.00 | 0.00 | 99 | yes |
| Human mtDNA | 40 | 52 | 24.66 | 73 | no | |
| Human mtDNA | 395 | 830 | 22.64 | 53 | no | |
| Human mtDNA | 13 | 390 | 12.50 | 48 | no | |
| Human mtDNA | 44 | 405 | 6.98 | 43 | no | |
Performances of the Flow-RM on a set of random instances of the MPPEP-SNP
| 100 | 1 | 57 | 520.05 | 0 | 1807 | 150 | 1 | 82 | 284.51 | 0 | 424 |
| | 2 | 60 | 59.74 | 0 | 174 | | 2 | 83 | 314.27 | 0.76 | 56 |
| | 3 | 63 | 377.75 | 1.45 | 110 | | 3 | 81 | 799.01 | 0 | 67 |
| | 4 | 61 | 2491.62 | 3.81 | 3351 | | 4 | 67 | 1809.26 | 2.66 | 6617 |
| | 5 | 60 | 2918.09 | 4.63 | 2062 | | 5 | 79 | 1001.14 | 2.29 | 187 |
| | 6 | 57 | 349.54 | 1.59 | 264 | | 6 | 74 | 1976.73 | 2.41 | 1071 |
| | 7 | 65 | 258.53 | 1.90 | 85 | | 7 | 73 | 8719.65 | 3.92 | 4814 |
| | 8 | 58 | 293.97 | 0 | 1299 | | 8 | 83 | 3497.73 | 2.17 | 421 |
| | 9 | 62 | 862.48 | 2.85 | 540 | | 9 | 72 | 1154.77 | 2.51 | 410 |
| | 10 | 64 | 87.19 | 0 | 92 | | 10 | 80 | 399.89 | 1.56 | 256 |
| 200 | 1 | 99 | 614.86 | 0 | 72 | 250 | 1 | 117 | 1155.41 | 0 | 197 |
| | 2 | 99 | 1353.16 | 1.28 | 149 | | 2 | 109 | 7757.98 | 1.72 | 1596 |
| | 3 | 96 | 896.68 | 0.67 | 226 | | 3 | 117 | 387.141 | 0.84 | 180 |
| | 4 | 104 | 652.44 | 0.47 | 150 | | 4 | 126 | 1267.77 | 0.51 | 114 |
| | 5 | 96 | 382.83 | 0 | 56 | | 5 | 116 | 188.188 | 0.84 | 162 |
| | 6 | 106 | 2535.09 | 0.60 | 71 | | 6 | 116 | 2311.61 | 1.14 | 685 |
| | 7 | 100 | 233.50 | 0 | 21 | | 7 | 116 | 1256.24 | 0 | 265 |
| | 8 | 99 | 1650.17 | 0.96 | 79 | | 8 | 124 | 67.556 | 0 | 528 |
| | 9 | 87 | 4600.69 | 2.10 | 954 | | 9 | 122 | 2000.77 | 0.53 | 107 |
| | 10 | 102 | 2554.84 | 1.23 | 1965 | | 10 | 111 | 1200.89 | 0.87 | 272 |
| 300 | 1 | 133 | 297.19 | 0 | 15 | | | | | | |
| | 2 | 123 | 2753.53 | 0.39 | 68 | | | | | | |
| | 3 | 142 | 5371.05 | 0 | 941 | | | | | | |
| | 4 | 133 | 420.72 | 0 | 43 | | | | | | |
| | 5 | 126 | 388.99 | 0 | 433 | | | | | | |
| | 6 | 134 | 397.01 | 0 | 61 | | | | | | |
| | 7 | 138 | 1173.65 | 0 | 1788 | | | | | | |
| | 8 | 126 | 666.21 | 0 | 186 | | | | | | |
| | 9 | 127 | 449.30 | 0.77 | 42 | | | | | | |
| | 10 | 145 | 201.87 | 0 | 876 | | | | | | |
Performances of the Flow-RM on a set of real instances of the MPPEP-SNP
| | | | 10 | 1 | | | | | yes |
| f1 | 63 | 16569 | 13 | 56 | 3.11 | 1 | no |
| | | | 15 | 10286.1 | 26.92 | 773521 | no |
| i2 | 40 | 977 | 10 | 781.85 | 20.00 | 37511 | no |
| k3 | 100 | 757 | 10 | 150 | 7.65 | 353 | no |
| | | | 13 | 588.38 | 14.29 | 11265 | no |
| m4 | 26 | 48 | 10 | 5 | 5.88 | 109 | no |
| p5 | 21 | 16548 | 10 | 22283.4 | 50.79 | 6125448 | no |