Abolfazl Hashemi, Banghua Zhu, Haris Vikalo.
Abstract
BACKGROUND: Haplotype assembly is the task of reconstructing haplotypes of an individual from a mixture of sequenced chromosome fragments. Haplotype information enables studies of the effects of genetic variations on an organism's phenotype. Most mathematical formulations of haplotype assembly are known to be NP-hard, and the problem becomes even more challenging as sequencing technology advances and the length of the paired-end reads and inserts increases. Assembly of haplotypes of polyploid organisms is considerably more difficult than in the case of diploids. Hence, scalable and accurate schemes with provable performance are desired for haplotype assembly of both diploid and polyploid organisms.
Keywords: Haplotype assembly; Iterative algorithm; Tensor decomposition
Year: 2018 PMID: 29589554 PMCID: PMC5872563 DOI: 10.1186/s12864-018-4551-y
Source DB: PubMed Journal: BMC Genomics ISSN: 1471-2164 Impact factor: 3.969
Fig. 1 Representing haplotype sequences and sequencing reads using tensors. The haplotype tensor contains haplotype information while the matrix U∈{0,1}^(n×k) assigns each of the n horizontal slices of the read tensor to one of the k haplotype sequences, i.e., the i-th row of U is an indicator of the origin of the i-th read
Fig. 2 Representing haplotype sequences and sequencing reads using unfolded tensors. The unfolded haplotype matrix contains haplotype information while the matrix U∈{0,1}^(n×k) assigns each of the n rows of the unfolded read matrix to one of the k haplotype sequences, i.e., the i-th row of U is an indicator of the origin of the i-th read
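The unfolded representation described in the two captions above can be illustrated with a small sketch. The one-hot encoding of alleles, the `-1` code for uncovered SNPs, the helper `one_hot_unfold`, and the toy data are all assumptions made for illustration; they are not taken from the paper.

```python
import numpy as np

# Hypothetical toy instance: k = 2 haplotypes over m = 4 biallelic SNPs.
haplotypes = np.array([[0, 1, 1, 0],
                       [1, 1, 0, 0]])  # shape (k, m), alleles coded 0/1


def one_hot_unfold(seqs, n_alleles=2):
    """Unfold allele codes into a (rows, m * n_alleles) 0/1 matrix.

    Entries coded -1 (SNP not covered by the read) become all-zero blocks.
    """
    rows, m = seqs.shape
    out = np.zeros((rows, m * n_alleles), dtype=int)
    r, c = np.nonzero(seqs >= 0)
    out[r, c * n_alleles + seqs[r, c]] = 1
    return out


H = one_hot_unfold(haplotypes)  # unfolded haplotype matrix, shape (k, 2m)

# Read origins: read i is sampled from haplotype origin[i].
origin = np.array([0, 1, 1, 0, 1])
n, k = origin.size, haplotypes.shape[0]
U = np.zeros((n, k), dtype=int)
U[np.arange(n), origin] = 1  # i-th row of U indicates the origin of read i

# Noise-free reads that cover only some of the SNPs (-1 = uncovered).
covered = np.array([[1, 1, 0, 0],
                    [0, 1, 1, 1],
                    [1, 1, 1, 0],
                    [0, 0, 1, 1],
                    [1, 0, 0, 1]], dtype=bool)
reads = np.where(covered, haplotypes[origin], -1)
R = one_hot_unfold(reads)  # unfolded read matrix, shape (n, 2m)

# On the observed entries, the read matrix equals the product U @ H.
mask = np.repeat(covered, 2, axis=1)
print(np.array_equal(R, (U @ H) * mask))
```

In this noise-free toy case the read matrix factors exactly as U times the haplotype matrix on the observed entries; with sequencing errors the factorization would hold only approximately.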
Performance comparison of AltHap, H-PoP, BP, HapTree, SCGD, and ILP applied to haplotype reconstruction of the CEU NA12878 data set in the 1000 Genomes Project
| Chromosome | AltHap CPR | AltHap MEC | AltHap t(sec) | H-PoP CPR | H-PoP MEC | H-PoP t(sec) | BP CPR | BP MEC | BP t(sec) |
|---|---|---|---|---|---|---|---|---|---|
| 1 | 97.4 | 2011 | 11.26 | 95.7 | 2264 | 5.22 | 99.1 | 2321 | 8.17 |
| 2 | 95.3 | 2562 | 12.22 | 95.6 | 2971 | 5.65 | 89.5 | 2897 | 9.83 |
| 3 | 93.3 | 2084 | 10.38 | 91.2 | 2312 | 6.99 | 74.3 | 2367 | 8.30 |
| 4 | 96.9 | 2368 | 12.16 | 97.0 | 2648 | 5.24 | 74.8 | 2613 | 6.76 |
| 5 | 97.2 | 1924 | 9.96 | 96.6 | 2103 | 4.67 | 88.2 | 2185 | 4.76 |
| 6 | 94.9 | 3687 | 14.17 | 95.2 | 3343 | 4.93 | 88.7 | 3588 | 6.94 |
| 7 | 97.0 | 1846 | 11.19 | 92.4 | 1986 | 4.24 | 81.1 | 2073 | 7.88 |
| 8 | 96.2 | 1634 | 9.63 | 94.7 | 1848 | 4.14 | 88.5 | 1857 | 8.01 |
| 9 | 97.1 | 1272 | 6.42 | 91.0 | 1462 | 3.36 | 89.8 | 1491 | 6.13 |
| 10 | 96.8 | 1584 | 7.97 | 94.5 | 1683 | 3.67 | 90.8 | 1839 | 7.18 |
| 11 | 93.3 | 1394 | 7.45 | 91.5 | 1553 | 3.71 | 75.6 | 1586 | 6.69 |
| 12 | 92.1 | 1423 | 7.12 | 90.3 | 1570 | 3.46 | 74.4 | 1589 | 6.48 |
| 13 | 97.0 | 1269 | 4.42 | 94.1 | 1440 | 2.89 | 89.1 | 1409 | 5.38 |
| 14 | 90.3 | 857 | 9.53 | 97.1 | 974 | 2.54 | 70.0 | 995 | 4.53 |
| 15 | 97.2 | 941 | 9.42 | 97.4 | 1039 | 2.40 | 74.6 | 1063 | 3.92 |
| 16 | 96.7 | 1198 | 5.40 | 93.5 | 1192 | 2.47 | 79.7 | 1269 | 4.42 |
| 17 | 97.5 | 1146 | 4.58 | 91.1 | 1244 | 1.98 | 92.4 | 1234 | 3.15 |
| 18 | 91.0 | 860 | 4.54 | 97.6 | 893 | 2.51 | 82.0 | 942 | 3.79 |
| 19 | 97.6 | 618 | 3.32 | 97.8 | 695 | 1.82 | 98.0 | 1060 | 2.47 |
| 20 | 97.3 | 703 | 3.53 | 95.0 | 719 | 2.00 | 97.1 | 796 | 2.74 |
| 21 | 97.4 | 470 | 2.51 | 97.0 | 512 | 1.70 | 97.5 | 532 | 1.86 |
| 22 | 97.3 | 367 | 1.98 | 98.3 | 427 | 1.44 | 90.7 | 438 | 1.72 |
| Mean | 95.8 | 1464 | 7.69 | 94.8 | 1585 | 3.50 | 85.0 | 1643 | 5.51 |
| Sd | 2.27 | 780 | 3.54 | 2.54 | 790 | 1.49 | 8.94 | 793 | 2.32 |
| # best | 9 | 0 | 0 | 5 | 0 | 5 | 3 | 0 | 0 |

| Chromosome | HapTree CPR | HapTree MEC | HapTree t(sec) | SCGD CPR | SCGD MEC | SCGD t(sec) | ILP CPR | ILP MEC | ILP t(sec) |
|---|---|---|---|---|---|---|---|---|---|
| 1 | 84.1 | 2305 | 15.43 | 92.5 | 2456 | 3.62 | 95.6 | 1741 | 173.68 |
| 2 | 84.5 | 2875 | 17.59 | 92.6 | 3509 | 4.41 | 95.3 | 2219 | 190.37 |
| 3 | 85.2 | 2363 | 15.06 | 91.9 | 2498 | 3.40 | 95.6 | 1788 | 152.09 |
| 4 | 83.5 | 2604 | 18.67 | 92.7 | 3754 | 5.47 | 97.1 | 2048 | 168.56 |
| 5 | 84.8 | 2171 | 16.95 | 93.9 | 2750 | 3.54 | 95.4 | 1691 | 147.72 |
| 6 | 84.6 | 3583 | 23.86 | 93.0 | 5612 | 8.70 | 95.7 | 2643 | 181.51 |
| 7 | 84.7 | 2070 | 13.06 | 93.5 | 2826 | 3.95 | 95.4 | 1590 | 133.36 |
| 8 | 84.2 | 1838 | 14.81 | 90.7 | 1692 | 2.18 | 95.6 | 1472 | 136.60 |
| 9 | 85.1 | 1479 | 14.90 | 97.1 | 1885 | 2.94 | 95.2 | 1125 | 105.34 |
| 10 | 85.7 | 1823 | 12.13 | 92.6 | 1876 | 2.56 | 95.7 | 1354 | 120.89 |
| 11 | 83.6 | 1577 | 11.33 | 93.2 | 2265 | 2.95 | 95.2 | 1206 | 104.74 |
| 12 | 84.8 | 1589 | 9.97 | 92.3 | 1612 | 2.03 | 95.4 | 1214 | 103.88 |
| 13 | 82.8 | 1405 | 9.55 | 97.0 | 2947 | 3.31 | 95.5 | 1105 | 93.33 |
| 14 | 85.4 | 987 | 7.79 | 91.1 | 904 | 1.36 | 95.3 | 752 | 65.07 |
| 15 | 83.6 | 1061 | 7.43 | 99.1 | 1041 | 1.21 | 94.1 | 809 | 66.52 |
| 16 | 85.1 | 1273 | 8.13 | 93.0 | 1305 | 1.79 | 95.5 | 920 | 77.81 |
| 17 | 84.8 | 1230 | 6.34 | 96.7 | 2123 | 2.61 | 96.1 | 943 | 47.99 |
| 18 | 84.1 | 941 | 7.13 | 90.3 | 933 | 1.16 | 95.2 | 720 | 71.49 |
| 19 | 84.6 | 765 | 5.26 | 97.2 | 1290 | 3.25 | 96.6 | 533 | 44.32 |
| 20 | 86.9 | 795 | 6.08 | 96.8 | 949 | 1.38 | 95.8 | 612 | 54.30 |
| 21 | 86.3 | 528 | 5.05 | 94.3 | 499 | 0.63 | 95.2 | 415 | 31.82 |
| 22 | 86.9 | 436 | 4.65 | 94.1 | 422 | 0.74 | 95.2 | 316 | 31.89 |
| Mean | 84.8 | 1623 | 11.42 | 93.9 | 2052 | 2.87 | 95.5 | 1237 | 104.69 |
| Sd | 1.03 | 802 | 5.23 | 2.3 | 1222 | 1.80 | 0.57 | 612 | 50.37 |
| # best | 0 | 0 | 0 | 1 | 0 | 17 | 5 | 22 | 0 |
The best results in each chromosome and in all chromosomes are in boldface font
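The MEC and CPR columns reported in these tables can be made concrete with a short sketch. It assumes the definitions commonly used in the haplotype assembly literature: MEC (minimum error correction) charges each read its mismatches against the closest haplotype, and CPR (correct phasing rate) is the percentage of alleles recovered under the best one-to-one matching of estimated to true haplotypes. The paper's exact definitions are not given in this excerpt, and the function names and toy data below are hypothetical.

```python
from itertools import permutations

import numpy as np


def mec_score(reads, haplotypes):
    """Sum over reads of the mismatches to the closest haplotype.

    Only covered SNPs count; -1 in `reads` marks an uncovered SNP.
    """
    covered = reads >= 0  # shape (n, m)
    mismatch = (reads[:, None, :] != haplotypes[None, :, :]) & covered[:, None, :]
    return int(mismatch.sum(axis=2).min(axis=1).sum())


def cpr(estimate, truth):
    """Percentage of matching alleles under the best haplotype matching."""
    k, m = truth.shape
    best = max(int((estimate[list(p)] == truth).sum())
               for p in permutations(range(k)))
    return 100.0 * best / (k * m)


# Hypothetical toy data: 2 haplotypes, 4 SNPs, 4 partially covering reads.
truth = np.array([[0, 1, 1, 0],
                  [1, 0, 0, 1]])
estimate = np.array([[1, 0, 0, 1],
                     [0, 1, 0, 0]])  # rows swapped, one allele wrong
reads = np.array([[0, 1, -1, -1],
                  [-1, 0, 0, 1],
                  [1, 1, 0, -1],
                  [-1, -1, 1, 0]])

print(mec_score(reads, truth))  # 1
print(cpr(estimate, truth))     # 87.5
```

The exhaustive search over permutations is only practical for small ploidy; it is used here because it keeps the matching step explicit.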
Performance comparison of AltHap, H-PoP, BP, HapTree, SCGD, and ILP applied to the Fosmid data set. HapTree could not finish assembling the haplotype of chromosome 6 within 48 hours
| Chromosome | AltHap CPR | AltHap MEC | AltHap t(sec) | H-PoP CPR | H-PoP MEC | H-PoP t(sec) | BP CPR | BP MEC | BP t(sec) |
|---|---|---|---|---|---|---|---|---|---|
| 1 | 95.5 | 9731 | 18.38 | 84.8 | 9845 | 2.13 | 87.6 | 9567 | 40.18 |
| 2 | 95.5 | 9589 | 38.89 | 90.4 | 9444 | 2.16 | 84.8 | 9698 | 42.90 |
| 3 | 91.7 | 7311 | 29.40 | 91.7 | 7182 | 1.79 | 84.7 | 7587 | 30.61 |
| 4 | 92.7 | 5508 | 26.69 | 92.6 | 5775 | 1.76 | 86.9 | 6288 | 31.10 |
| 5 | 92.0 | 6711 | 27.39 | 93.9 | 6910 | 1.95 | 86.3 | 6975 | 36.94 |
| 6 | 90.9 | 7213 | 33.68 | 88.5 | 7505 | 2.40 | 85.0 | 7590 | 41.20 |
| 7 | 90.7 | 6151 | 28.60 | 91.9 | 6829 | 1.68 | 85.8 | 6091 | 36.94 |
| 8 | 91.2 | 5927 | 23.82 | 90.2 | 6143 | 1.89 | 87.3 | 6282 | 38.87 |
| 9 | 91.8 | 5347 | 19.40 | 91.8 | 5719 | 1.76 | 85.1 | 5493 | 26.13 |
| 10 | 90.1 | 6044 | 24.07 | 92.4 | 6328 | 1.48 | 86.4 | 6503 | 27.65 |
| 11 | 90.8 | 5424 | 21.73 | 90.3 | 6432 | 1.68 | 85.8 | 5579 | 20.56 |
| 12 | 91.5 | 5456 | 24.25 | 91.4 | 5653 | 1.46 | 85.0 | 5706 | 24.19 |
| 13 | 90.4 | 3646 | 14.23 | 90.1 | 3708 | 1.54 | 82.7 | 3976 | 17.33 |
| 14 | 89.5 | 4156 | 18.64 | 89.1 | 4261 | 1.21 | 87.0 | 4004 | 14.84 |
| 15 | 90.0 | 4079 | 14.67 | 72.9 | 4001 | 1.06 | 82.3 | 4022 | 14.35 |
| 16 | 88.5 | 6197 | 26.28 | 71.5 | 6119 | 1.20 | 84.4 | 5112 | 29.51 |
| 17 | 89.7 | 4507 | 16.35 | 88.3 | 4911 | 1.22 | 87.6 | 4749 | 18.29 |
| 18 | 93.0 | 3080 | 12.68 | 90.8 | 3315 | 1.14 | 85.5 | 3457 | 13.31 |
| 19 | 85.7 | 4212 | 13.40 | 86.3 | 4115 | 0.84 | 83.5 | 3928 | 13.44 |
| 20 | 90.3 | 3512 | 13.64 | 90.0 | 4121 | 0.85 | 84.9 | 3814 | 15.97 |
| 21 | 92.7 | 1871 | 6.20 | 91.9 | 1974 | 0.68 | 87.2 | 1953 | 8.18 |
| 22 | 85.1 | 3654 | 17.24 | 87.8 | 3757 | 0.62 | 86.7 | 3910 | 14.72 |
| Mean | 90.9 | 5424 | 21.35 | 88.6 | 5639 | 1.48 | 85.6 | 5558 | 25.33 |
| Sd | 2.5 | 1950 | 7.79 | 5.7 | 1934 | 0.50 | 1.5 | 1948 | 10.84 |
| # best | 13 | 0 | 0 | 4 | 0 | 9 | 0 | 0 | 0 |

| Chromosome | HapTree CPR | HapTree MEC | HapTree t(sec) | SCGD CPR | SCGD MEC | SCGD t(sec) | ILP CPR | ILP MEC | ILP t(sec) |
|---|---|---|---|---|---|---|---|---|---|
| 1 | 91.5 | 9676 | 6501 | 95.1 | 10127 | 2.59 | 79.0 | 6889 | 80.33 |
| 2 | 92.3 | 9802 | 7196 | 94.5 | 9721 | 2.41 | 76.1 | 6700 | 76.60 |
| 3 | 90.7 | 7705 | 4847 | 88.6 | 7410 | 1.83 | 76.9 | 5122 | 79.50 |
| 4 | 90.8 | 6500 | 8392 | 87.6 | 5494 | 1.48 | 77.0 | 4072 | 51.49 |
| 5 | 90.8 | 7094 | 5670 | 89.6 | 7058 | 1.71 | 76.0 | 4637 | 54.39 |
| 6 | - | - | - | 90.4 | 7843 | 2.14 | 75.7 | 5248 | 63.37 |
| 7 | 91.5 | 6169 | 5589 | 89.4 | 6189 | 1.73 | 77.9 | 4174 | 46.85 |
| 8 | 91.2 | 6379 | 8316 | 87.4 | 5996 | 1.47 | 76.3 | 4301 | 53.57 |
| 9 | 91.7 | 5513 | 4465 | 90.0 | 5592 | 1.20 | 76.8 | 3974 | 42.41 |
| 10 | 88.9 | 6553 | 4838 | 92.8 | 6027 | 1.60 | 76.8 | 4508 | 59.25 |
| 11 | 90.5 | 5625 | 5183 | 90.1 | 5662 | 1.34 | 79.0 | 3903 | 45.45 |
| 12 | 91.3 | 5770 | 5654 | 90.5 | 5731 | 1.55 | 77.5 | 3907 | 48.76 |
| 13 | 89.8 | 4029 | 5367 | 87.6 | 3727 | 0.79 | 77.1 | 2669 | 32.09 |
| 14 | 90.6 | 4038 | 4103 | 92.9 | 4859 | 1.12 | 75.4 | 2814 | 39.61 |
| 15 | 90.7 | 4116 | 3357 | 87.8 | 4442 | 0.88 | 78.7 | 2903 | 33.80 |
| 16 | 94.2 | 5142 | 9683 | 95.5 | 6474 | 1.60 | 79.8 | 3844 | 62.44 |
| 17 | 93.1 | 4806 | 3003 | 97.1 | 4843 | 1.01 | 80.8 | 3448 | 42.00 |
| 18 | 91.9 | 3493 | 2303 | 88.3 | 3478 | 0.71 | 76.9 | 2337 | 32.27 |
| 19 | 92.8 | 3953 | 1984 | 82.5 | 4204 | 0.87 | 78.6 | 2707 | 33.68 |
| 20 | 90.1 | 3886 | 1529 | 94.6 | 3790 | 0.83 | 78.7 | 2783 | 31.78 |
| 21 | 92.1 | 1979 | 1410 | 90.7 | 2042 | 0.36 | 77.2 | 1367 | 16.42 |
| 22 | 92.4 | 3307 | 1351 | 90.6 | 3495 | 1.06 | 77.0 | 2422 | 60.62 |
| Mean | 91.4 | 5502 | 4797.19 | 90.6 | 5645 | 1.38 | 77.6 | 3851 | 49.39 |
| Sd | 1.2 | 1998 | 2392.54 | 3.4 | 1977 | 0.56 | 1.39 | 1360 | 16.82 |
| # best | 4 | 0 | 0 | 4 | 0 | 13 | 0 | 22 | 0 |
The best results in each chromosome and in all chromosomes are in boldface font
Performance comparison of AltHap, H-PoP, BP, HapTree, SCGD, and ILP on a simulated diploid data set from [39] with haplotype block length m=700. ILP could only finish assembly of haplotypes for two settings in 48 hours
| Error rate | Coverage | AltHap CPR | AltHap MEC | AltHap t(s) | H-PoP CPR | H-PoP MEC | H-PoP t(s) | BP CPR | BP MEC | BP t(s) |
|---|---|---|---|---|---|---|---|---|---|---|
| 0.1 | 5 | 99.6 | 477 | 0.043 | 99.3 | 402 | 0.012 | 86.7 | 698 | 1.421 |
| 0.1 | 8 | 99.9 | 759 | 0.128 | 99.8 | 780 | 0.035 | 87.2 | 861 | 4.627 |
| 0.1 | 10 | 99.9 | 954 | 0.404 | 99.9 | 903 | 0.109 | 87.3 | 1130 | 13.58 |
| 0.2 | 5 | 90.9 | 941 | 0.061 | 87.7 | 1021 | 0.027 | 81.2 | 953 | 2.671 |
| 0.2 | 8 | 98.1 | 1458 | 0.141 | 88.9 | 1532 | 0.098 | 86.1 | 1847 | 6.897 |
| 0.2 | 10 | 99.1 | 1836 | 0.394 | 91.5 | 2023 | 0.201 | 86.7 | 2485 | 10.13 |
| 0.3 | 5 | 60.7 | 1228 | 0.069 | 61.8 | 1331 | 0.041 | 53.7 | 1677 | 3.235 |
| 0.3 | 8 | 67.7 | 2022 | 0.145 | 65.7 | 2250 | 0.098 | 57.2 | 2469 | 7.982 |
| 0.3 | 10 | 75.0 | 2558 | 0.375 | 71.2 | 2979 | 0.217 | 59.6 | 3114 | 15.32 |

| Error rate | Coverage | HapTree CPR | HapTree MEC | HapTree t(s) | SCGD CPR | SCGD MEC | SCGD t(s) | ILP CPR | ILP MEC | ILP t(s) |
|---|---|---|---|---|---|---|---|---|---|---|
| 0.1 | 5 | 88.6 | 491 | 2.13 | 96.6 | 523 | 0.66 | 98.8 | 467 | 471 |
| 0.1 | 8 | 88.4 | 767 | 3.82 | 99.8 | 772 | 0.84 | 99.7 | 760 | 2004 |
| 0.1 | 10 | 87.3 | 963 | 4.03 | 99.9 | 965 | 0.97 | - | - | - |
| 0.2 | 5 | 76.2 | 988 | 9.36 | 76.1 | 979 | 0.72 | - | - | - |
| 0.2 | 8 | 80.8 | 1562 | 6.69 | 91.3 | 1531 | 1.18 | - | - | - |
| 0.2 | 10 | 82.7 | 1943 | 4.20 | 95.4 | 1902 | 1.50 | - | - | - |
| 0.3 | 5 | 64.6 | 1170 | 10.21 | 57.8 | 1136 | 0.73 | - | - | - |
| 0.3 | 8 | 65.7 | 2021 | 6.17 | 63.7 | 1998 | 1.14 | - | - | - |
| 0.3 | 10 | 65.1 | 2597 | 5.74 | 67.9 | 2574 | 1.44 | - | - | - |
The best results in each simulation setting are in boldface font
Performance comparison of AltHap, H-PoP, BP, and SCGD on a simulated biallelic triploid data set with haplotype block length m=1000. HapTree could not finish the simulations in 48 hours
| Err | Cov | AltHap CPR | AltHap MEC | AltHap t(s) | H-PoP CPR | H-PoP MEC | H-PoP t(s) | BP CPR | BP MEC | BP t(s) | SCGD CPR | SCGD MEC | SCGD t(s) |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0.002 | 10 | 98.2 | 322 | 30 | 71.5 | 3642 | | 68.9 | 4210 | 132 | 69.7 | 11988 | 159 |
| 0.002 | 20 | 95.1 | 1986 | 59 | 73.1 | 7728 | | 72.9 | 7762 | 416 | 51.8 | 35660 | 283 |
| 0.002 | 30 | 98.4 | 2412 | 109 | 70.8 | 12865 | 265 | 69.7 | 14751 | 1310 | 52.1 | 53248 | 422 |
| 0.01 | 10 | | | 30 | 70.0 | 3786 | 14 | 68.1 | 4092 | 138 | 68.4 | 12108 | 161 |
| 0.01 | 20 | | | 60 | 70.9 | 8375 | | 68.9 | 8601 | 460 | 52.0 | 35606 | 295 |
| 0.01 | 30 | 98.9 | 3143 | 110 | 71.8 | 11769 | 266 | 68.1 | 15124 | 1301 | 52.7 | 53185 | 422 |
| 0.05 | 10 | | | 31 | 70.1 | 3978 | 14 | 66.9 | 4227 | 135 | 67.5 | 13037 | 158 |
| 0.05 | 20 | | | 59 | 70.3 | 9276 | | 70.1 | 9484 | 460 | 51.7 | 35693 | 285 |
| 0.05 | 30 | 82.6 | 17284 | 110 | 71.3 | 13778 | 268 | 67.6 | 16876 | 1315 | 52.1 | 52499 | 431 |
The best results in each simulation setting are in boldface font
Performance of AltHap on simulated biallelic triploid data set with haplotype block length m=1000, data error rate p=0.002, and different read lengths
| Read length | Cov | CPR | MEC | t(s) |
|---|---|---|---|---|
| 2 × 250 | 10 | 98.2 | 322 | 30.74 |
| 2 × 250 | 20 | 95.1 | 1986 | 59.65 |
| 2 × 250 | 30 | 98.4 | 2412 | 109.73 |
| 2 × 300 | 10 | 93.0 | 856 | 34.83 |
| 2 × 300 | 20 | 97.9 | 1410 | 66.50 |
| 2 × 300 | 30 | 97.7 | 3216 | 117.62 |
| 2 × 500 | 10 | 95.5 | 682 | 39.36 |
| 2 × 500 | 20 | 92.4 | 2605 | 66.37 |
| 2 × 500 | 30 | 93.0 | 5869 | 116.69 |
Performance comparison of AltHap, H-PoP, BP, and SCGD on a simulated biallelic tetraploid data set with haplotype block length m=1000. HapTree could not finish the simulations in 48 hours
| Err | Cov | AltHap CPR | AltHap MEC | AltHap t(s) | H-PoP CPR | H-PoP MEC | H-PoP t(s) | BP CPR | BP MEC | BP t(s) | SCGD CPR | SCGD MEC | SCGD t(s) |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0.002 | 10 | 91.1 | | 43 | 70.7 | 3366 | 43 | 69.8 | 4568 | 290 | 67.1 | 14839 | 208 |
| 0.002 | 20 | 95.0 | | 87 | 73.4 | 7359 | 113 | 71.2 | 9434 | 540 | 51.7 | 41241 | 419 |
| 0.002 | 30 | 99.9 | 674 | 163 | 72.6 | 11693 | 598 | 71.5 | 12745 | 1496 | 51.8 | 61885 | 653 |
| 0.01 | 10 | 98.2 | | 44 | 69.3 | 3511 | 46 | 66.4 | 6475 | 296 | 67.1 | 14819 | 213 |
| 0.01 | 20 | 99.3 | | 87 | 70.3 | 7882 | 114 | 66.9 | 10213 | 552 | 51.5 | 41712 | 414 |
| 0.01 | 30 | 95.3 | 6518 | 164 | 71.0 | 12392 | 597 | 68.4 | 13245 | 1485 | 51.5 | 61981 | 652 |
| 0.05 | 10 | 93.7 | | 44 | 67.7 | 4110 | 46 | 64.5 | 6869 | 306 | 65.0 | 15861 | 213 |
| 0.05 | 20 | 95.8 | 9645 | 89 | 69.1 | 9109 | 118 | 68.5 | 11477 | 623 | 51.9 | 41042 | 408 |
| 0.05 | 30 | 81.5 | 18690 | 165 | 70.0 | 14212 | 601 | 67.5 | 17681 | 1504 | 51.7 | 62261 | 643 |
The best results in each simulation setting are in boldface font
Performance of AltHap on simulated polyallelic triploid data set with haplotype block length m=1000. H-PoP, BP, HapTree, and SCGD cannot assemble polyallelic polyploid haplotypes
| Error rate | Cov | CPR | MEC | t(s) |
|---|---|---|---|---|
| 0.002 | 5 | 83.2 | 1377 | 43.05 |
| 0.002 | 10 | 93.2 | 897 | 115.13 |
| 0.002 | 15 | 93.5 | 1799 | 173.55 |
| 0.002 | 20 | 95.2 | 2346 | 232.07 |
| 0.01 | 5 | 74.7 | 2341 | 58.13 |
| 0.01 | 10 | 94.4 | 1269 | 115.41 |
| 0.01 | 15 | 90.9 | 3755 | 173.38 |
| 0.01 | 20 | 85.5 | 7272 | 235.86 |
| 0.05 | 5 | 79.9 | 3076 | 57.77 |
| 0.05 | 10 | 89.4 | 3925 | 116.33 |
| 0.05 | 15 | 93.1 | 6100 | 174.37 |
| 0.05 | 20 | 93.9 | 9120 | 236.73 |
Performance of AltHap on simulated polyallelic tetraploid data set with haplotype block length m=1000. H-PoP, BP, HapTree, and SCGD cannot assemble polyallelic polyploid haplotypes
| Error rate | Cov | CPR | MEC | t(s) |
|---|---|---|---|---|
| 0.002 | 5 | 79.4 | 2380 | 109.00 |
| 0.002 | 10 | 86.5 | 2043 | 220.6 |
| 0.002 | 15 | 93.8 | 2148 | 328.49 |
| 0.002 | 20 | 96.3 | 2388 | 432.28 |
| 0.01 | 5 | 79.7 | 2398 | 113.08 |
| 0.01 | 10 | 84.1 | 2927 | 220.33 |
| 0.01 | 15 | 82.8 | 5787 | 327.10 |
| 0.01 | 20 | 99.2 | 2319 | 432.85 |
| 0.05 | 5 | 74.6 | 4721 | 113.38 |
| 0.05 | 10 | 89.0 | 5146 | 211.43 |
| 0.05 | 15 | 92.3 | 7555 | 327.20 |
| 0.05 | 20 | 92.0 | 13704 | 435.15 |
Fig. 3 Comparison of the theoretical and experimental results. Comparison of the theoretical bound on CPR with the experimental results when Cseq=15 obtained by applying AltHap to the problem of reconstructing biallelic triploid haplotypes (synthetic data)
Fig. 4 Comparison of the theoretical and experimental results. Comparison of the theoretical bound on CPR with the experimental results when p=0.002 obtained by applying AltHap to the problem of reconstructing biallelic triploid haplotypes (synthetic data)