Jingli Wu, Qian Zhang.
Abstract
BACKGROUND: Haplotype assembly, the reconstruction of haplotypes from sequence data, is one of the major computational problems in bioinformatics. Most current methods for haplotype assembly are designed for diploid individuals. In recent years, genomes with more than two sets of homologous chromosomes have attracted many research groups interested in the genomics of disease, phylogenetics, botany and evolution. However, methods for reconstructing polyploid haplotypes are still lacking.
Keywords: Algorithm; Bioinformatics; Genotype; Haplotype; Minimum error correction with genotype information (MEC/GI); Sequence analysis; Single nucleotide polymorphism (SNP); Triploid
Year: 2018 PMID: 29881444 PMCID: PMC5984336 DOI: 10.1186/s13015-018-0129-0
Source DB: PubMed Journal: Algorithms Mol Biol ISSN: 1748-7188 Impact factor: 1.405
Fig. 1 An example of SNP fragment matrix and genotype matrix. (a) SNP fragment matrix M, (b) genotype matrix G
Fig. 2 An example for enumerating and computing
Fig. 3 Algorithm EHTLD
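The MEC values reported in the tables count the fragment entries that would have to be corrected for every fragment to agree with one of the reconstructed haplotypes. The following is a minimal sketch of that score under the standard MEC definition, not the authors' implementation. It ignores the genotype constraint of MEC/GI, and the names `GAP`, `fragment_distance`, `mec_score` and the `"-"` gap encoding are assumptions made for illustration only.

```python
from typing import List

# Assumed encoding: each fragment and haplotype is a string over {"0", "1"},
# and "-" marks a SNP site that the fragment does not cover.
GAP = "-"


def fragment_distance(fragment: str, haplotype: str) -> int:
    """Count covered sites where the fragment disagrees with the haplotype."""
    return sum(1 for f, h in zip(fragment, haplotype) if f != GAP and f != h)


def mec_score(fragments: List[str], haplotypes: List[str]) -> int:
    """Sum, over all fragments, the distance to the closest haplotype."""
    return sum(
        min(fragment_distance(f, h) for h in haplotypes)
        for f in fragments
    )


# Hypothetical triploid example: three haplotypes over four SNP sites.
haplotypes = ["0101", "1100", "0011"]
fragments = ["01--", "-10-", "--11", "1-01"]
print(mec_score(fragments, haplotypes))
```

In this toy example the first three fragments match a haplotype exactly on their covered sites, and the last one needs a single correction, so the printed score is 1.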
Comparison with different error rates (CELSIM instance)

| Error rate | RR (EH) | RR (HC) | RR (HT) | VE (EH) | VE (HC) | VE (HT) | MEC (EH) | MEC (HC) | MEC (HT) | Running time, s (EH) | Running time, s (HC) | Running time, s (HT) |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 0.97 | 0.89 | 0.89 | 3 | 29 | 27 | 0 | 126 | 57 | 0.01 | 0.02 | 0.01 |
| 0.01 | 0.97 | 0.89 | 0.88 | 3 | 31 | 28 | 17 | 147 | 64 | 0.01 | 0.03 | 0.01 |
| 0.02 | 0.97 | 0.89 | 0.88 | 4 | 30 | 27 | 34 | 152 | 79 | 0.01 | 0.03 | 0.01 |
| 0.03 | 0.97 | 0.89 | 0.90 | 4 | 31 | 26 | 51 | 180 | 83 | 0.01 | 0.03 | 0.01 |
| 0.04 | 0.97 | 0.90 | 0.89 | 4 | 29 | 27 | 69 | 179 | 96 | 0.01 | 0.03 | 0.01 |
| 0.05 | 0.97 | 0.90 | 0.89 | 4 | 29 | 26 | 85 | 194 | 117 | 0.01 | 0.03 | 0.01 |
| 0.06 | 0.96 | 0.89 | 0.88 | 5 | 31 | 26 | 102 | 210 | 132 | 0.01 | 0.03 | 0.01 |
| 0.07 | 0.96 | 0.89 | 0.88 | 5 | 30 | 26 | 122 | 225 | 157 | 0.01 | 0.03 | 0.01 |
| 0.08 | 0.96 | 0.90 | 0.88 | 6 | 29 | 24 | 138 | 238 | 173 | 0.01 | 0.03 | 0.01 |
| 0.09 | 0.95 | 0.89 | 0.87 | 6 | 30 | 28 | 154 | 254 | 181 | 0.01 | 0.03 | 0.01 |
| 0.1 | 0.95 | 0.90 | 0.89 | 7 | 29 | 25 | 172 | 263 | 206 | 0.01 | 0.03 | 0.01 |
| 0.2 | 0.92 | 0.89 | 0.88 | 14 | 31 | 26 | 335 | 407 | 364 | 0.01 | 0.03 | 0.01 |
Comparison with different coverages (CELSIM instance)

| Coverage | RR (EH) | RR (HC) | RR (HT) | VE (EH) | VE (HC) | VE (HT) | MEC (EH) | MEC (HC) | MEC (HT) | Running time, s (EH) | Running time, s (HC) | Running time, s (HT) |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 2 | 0.94 | 0.89 | 0.86 | 10 | 30 | 29 | 16 | 40 | 19 | 0.01 | 0.01 | 0.01 |
| 3 | 0.95 | 0.89 | 0.85 | 9 | 31 | 28 | 25 | 60 | 30 | 0.01 | 0.02 | 0.01 |
| 4 | 0.95 | 0.89 | 0.91 | 7 | 31 | 28 | 34 | 79 | 38 | 0.01 | 0.02 | 0.01 |
| 5 | 0.96 | 0.89 | 0.87 | 6 | 30 | 26 | 41 | 96 | 46 | 0.01 | 0.02 | 0.01 |
| 6 | 0.96 | 0.89 | 0.87 | 5 | 30 | 25 | 52 | 120 | 58 | 0.01 | 0.02 | 0.01 |
| 7 | 0.96 | 0.90 | 0.89 | 5 | 29 | 26 | 58 | 133 | 65 | 0.01 | 0.02 | 0.01 |
| 8 | 0.96 | 0.89 | 0.89 | 5 | 30 | 25 | 67 | 157 | 75 | 0.01 | 0.02 | 0.01 |
| 9 | 0.96 | 0.89 | 0.89 | 4 | 30 | 25 | 75 | 175 | 84 | 0.01 | 0.03 | 0.01 |
| 10 | 0.97 | 0.90 | 0.89 | 4 | 29 | 26 | 85 | 194 | 117 | 0.01 | 0.03 | 0.01 |
Comparison with different haplotype lengths (CELSIM instance)

| Haplotype length | RR (EH) | RR (HC) | RR (HT) | VE (EH) | VE (HC) | VE (HT) | MEC (EH) | MEC (HC) | MEC (HT) | Running time, s (EH) | Running time, s (HC) | Running time, s (HT) |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 100 | 0.97 | 0.90 | 0.89 | 4 | 29 | 26 | 85 | 194 | 117 | 0.01 | 0.03 | 0.01 |
| 200 | 0.96 | 0.89 | 0.90 | 12 | 61 | 57 | 136 | 305 | 169 | 0.04 | 0.16 | 0.04 |
| 300 | 0.95 | 0.89 | 0.88 | 29 | 92 | 90 | 181 | 387 | 230 | 0.11 | 0.20 | 0.09 |
| 500 | 0.93 | 0.88 | 0.87 | 57 | 156 | 148 | 271 | 573 | 337 | 0.56 | 0.83 | 0.72 |
| 800 | 0.92 | 0.88 | 0.86 | 100 | 256 | 242 | 398 | 855 | 492 | 2.21 | 2.59 | 2.30 |
| 1000 | 0.92 | 0.88 | 0.86 | 136 | 322 | 314 | 479 | 1029 | 595 | 4.36 | 4.67 | 4.51 |
Comparison with different single fragment length ranges (CELSIM instance)

| Fragment length range | RR (EH) | RR (HC) | RR (HT) | VE (EH) | VE (HC) | VE (HT) | MEC (EH) | MEC (HC) | MEC (HT) | Running time, s (EH) | Running time, s (HC) | Running time, s (HT) |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| [3, 7] | 0.97 | 0.90 | 0.89 | 4 | 29 | 26 | 85 | 194 | 117 | 0.01 | 0.03 | 0.01 |
| [2, 4] | 0.95 | 0.89 | 0.88 | 8 | 30 | 28 | 67 | 129 | 90 | 0.01 | 0.03 | 0.01 |
| [1, 2] | 0.94 | 0.90 | 0.88 | 14 | 30 | 28 | 65 | 86 | 76 | 0.02 | 0.03 | 0.01 |
Comparison with different hamming distances (CELSIM instance)

| Hamming distance | RR (EH) | RR (HC) | RR (HT) | VE (EH) | VE (HC) | VE (HT) | MEC (EH) | MEC (HC) | MEC (HT) | Running time, s (EH) | Running time, s (HC) | Running time, s (HT) |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0.1 | 0.99 | 0.97 | 0.96 | 5 | 9 | 8 | 88 | 109 | 92 | 0.01 | 0.02 | 0.01 |
| 0.2 | 0.97 | 0.93 | 0.94 | 6 | 17 | 16 | 86 | 140 | 106 | 0.01 | 0.03 | 0.01 |
| 0.3 | 0.97 | 0.90 | 0.89 | 4 | 28 | 26 | 85 | 194 | 117 | 0.01 | 0.03 | 0.01 |
| 0.4 | 0.97 | 0.86 | 0.85 | 3 | 38 | 35 | 86 | 260 | 145 | 0.02 | 0.06 | 0.02 |
| 0.5 | 0.97 | 0.82 | 0.84 | 1 | 46 | 42 | 88 | 326 | 201 | 0.04 | 0.08 | 0.03 |
| 0.6 | 0.97 | 0.79 | 0.82 | 1 | 57 | 49 | 89 | 393 | 263 | 0.05 | 0.10 | 0.05 |
| 0.7 | 0.98 | 0.74 | 0.81 | 0 | 71 | 63 | 92 | 478 | 346 | 0.07 | 0.16 | 0.06 |
| 0.8 | 0.98 | 0.71 | 0.80 | 0 | 78 | 72 | 90 | 553 | 421 | 0.10 | 0.20 | 0.08 |
| 0.9 | 0.99 | 0.67 | 0.78 | 0 | 89 | 78 | 90 | 633 | 507 | 0.12 | 0.25 | 0.10 |
| 1.0 | 1.00 | 0.63 | 0.77 | 0 | 94 | 85 | 91 | 722 | 589 | 0.15 | 0.29 | 0.12 |
Comparison with different coverages (MetaSim instance)

| Coverage | RR (EH) | RR (HC) | RR (HT) | VE (EH) | VE (HC) | VE (HT) | MEC (EH) | MEC (HC) | MEC (HT) | Running time, s (EH) | Running time, s (HC) | Running time, s (HT) |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 5 | 0.94 | 0.89 | 0.88 | 10 | 30 | 26 | 80 | 134 | 99 | 0.01 | 0.02 | 0.01 |
| 10 | 0.94 | 0.90 | 0.88 | 8 | 29 | 26 | 144 | 252 | 189 | 0.01 | 0.02 | 0.01 |
| 15 | 0.95 | 0.90 | 0.88 | 8 | 30 | 26 | 249 | 405 | 287 | 0.01 | 0.03 | 0.01 |
| 20 | 0.95 | 0.90 | 0.88 | 7 | 30 | 26 | 299 | 513 | 343 | 0.02 | 0.04 | 0.02 |
| 25 | 0.95 | 0.89 | 0.89 | 7 | 28 | 25 | 383 | 648 | 446 | 0.03 | 0.05 | 0.03 |
| 30 | 0.95 | 0.90 | 0.89 | 7 | 30 | 25 | 458 | 775 | 531 | 0.03 | 0.07 | 0.03 |
| 35 | 0.95 | 0.89 | 0.89 | 7 | 30 | 26 | 532 | 903 | 623 | 0.05 | 0.09 | 0.04 |
| 40 | 0.95 | 0.90 | 0.89 | 7 | 28 | 26 | 600 | 1028 | 711 | 0.07 | 0.12 | 0.06 |
| 45 | 0.95 | 0.90 | 0.90 | 7 | 29 | 24 | 700 | 1193 | 807 | 0.10 | 0.16 | 0.09 |
| 50 | 0.95 | 0.90 | 0.90 | 7 | 28 | 25 | 758 | 1293 | 879 | 0.13 | 0.19 | 0.11 |
Comparison with different haplotype lengths (MetaSim instance)

| Haplotype length | RR (EH) | RR (HC) | RR (HT) | VE (EH) | VE (HC) | VE (HT) | MEC (EH) | MEC (HC) | MEC (HT) | Running time, s (EH) | Running time, s (HC) | Running time, s (HT) |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 100 | 0.95 | 0.90 | 0.88 | 7 | 30 | 26 | 299 | 513 | 343 | 0.02 | 0.04 | 0.02 |
| 200 | 0.93 | 0.89 | 0.90 | 17 | 62 | 57 | 550 | 985 | 620 | 0.24 | 0.31 | 0.21 |
| 300 | 0.93 | 0.89 | 0.89 | 25 | 93 | 79 | 788 | 1441 | 896 | 0.56 | 0.63 | 0.60 |
| 500 | 0.93 | 0.88 | 0.89 | 45 | 156 | 142 | 1284 | 2392 | 1440 | 2.43 | 2.83 | 2.64 |
| 800 | 0.92 | 0.88 | 0.89 | 75 | 254 | 238 | 1998 | 3791 | 2254 | 9.65 | 10.50 | 9.89 |
| 1000 | 0.92 | 0.88 | 0.88 | 91 | 319 | 305 | 2484 | 4731 | 2804 | 17.43 | 17.95 | 17.47 |
Comparison with different single fragment lengths (MetaSim instance)

| Fragment length | RR (EH) | RR (HC) | RR (HT) | VE (EH) | VE (HC) | VE (HT) | MEC (EH) | MEC (HC) | MEC (HT) | Running time, s (EH) | Running time, s (HC) | Running time, s (HT) |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 10 | 0.96 | 0.90 | 0.88 | 5 | 30 | 26 | 248 | 404 | 307 | 0.01 | 0.04 | 0.01 |
| 5 | 0.95 | 0.90 | 0.88 | 7 | 30 | 26 | 299 | 513 | 343 | 0.02 | 0.04 | 0.02 |
| 3 | 0.93 | 0.89 | 0.89 | 14 | 31 | 28 | 129 | 246 | 201 | 0.03 | 0.06 | 0.03 |
Comparison with different hamming distances (MetaSim instance)

| Hamming distance | RR (EH) | RR (HC) | RR (HT) | VE (EH) | VE (HC) | VE (HT) | MEC (EH) | MEC (HC) | MEC (HT) | Running time, s (EH) | Running time, s (HC) | Running time, s (HT) |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0.1 | 0.98 | 0.97 | 0.87 | 4 | 10 | 28 | 347 | 387 | 375 | 0.01 | 0.01 | 0.01 |
| 0.2 | 0.96 | 0.93 | 0.88 | 6 | 18 | 26 | 309 | 427 | 353 | 0.01 | 0.03 | 0.01 |
| 0.3 | 0.95 | 0.90 | 0.88 | 7 | 30 | 26 | 299 | 513 | 343 | 0.02 | 0.04 | 0.02 |
| 0.4 | 0.95 | 0.86 | 0.89 | 6 | 35 | 25 | 294 | 629 | 330 | 0.05 | 0.11 | 0.02 |
| 0.5 | 0.94 | 0.82 | 0.88 | 5 | 48 | 25 | 289 | 752 | 325 | 0.09 | 0.16 | 0.03 |
| 0.6 | 0.94 | 0.78 | 0.88 | 3 | 60 | 26 | 288 | 895 | 320 | 0.12 | 0.26 | 0.03 |
| 0.7 | 0.95 | 0.75 | 0.88 | 3 | 69 | 26 | 296 | 1060 | 336 | 0.18 | 0.37 | 0.04 |
| 0.8 | 0.95 | 0.71 | 0.88 | 5 | 77 | 26 | 321 | 1227 | 366 | 0.22 | 0.42 | 0.06 |
| 0.9 | 0.96 | 0.67 | 0.88 | 3 | 86 | 25 | 290 | 1359 | 333 | 0.27 | 0.50 | 0.09 |
| 1.0 | 0.96 | 0.63 | 0.88 | 2 | 94 | 25 | 277 | 1523 | 321 | 0.35 | 0.61 | 0.11 |