Filippo Utro, Niina Haiminen, Donald Livingstone, Omar E Cornejo, Stefan Royaert, Raymond J Schnell, Juan Carlos Motamayor, David N Kuhn, Laxmi Parida.
Abstract
BACKGROUND: We address the task of extracting accurate haplotypes from genotype data of individuals of large F1 populations for mapping studies. While methods for inferring parental haplotype assignments on large F1 populations exist in theory, these approaches do not work in practice at high levels of accuracy.
Year: 2013 PMID: 23742238 PMCID: PMC3716545 DOI: 10.1186/1471-2156-14-48
Source DB: PubMed Journal: BMC Genet ISSN: 1471-2156 Impact factor: 2.797
Definition and size of the classes
| | | | | | | |
|---|---|---|---|---|---|---|
| | | | | | | |
| fastPHASE | 60.07 (0.00) | 59.77 (0.00) | No | 78 | | |
| | 58.01 (0.00) | 56.55 (0.00) | | | 158 | |
| FMPH | - | - | NA | No | - | Up to 30-100 markers |
| MACH | 52.89 (0.00) | 52.16 (0.00) | Yes | 567 | | |
| | 52.49 (0.00) | 50.91 (0.00) | | | 1144 | |
| | | | | | | |
| BEAGLE | 99.90 (0.00) | 98.61 (0.00) | Yes | 5 | | |
| | 99.90 (0.00) | 98.28 (0.00) | | | 10 | |
| HAPI-UR | 99.69 (0.00) | 94.75 (0.00) | No | 3 | | |
| | 99.67 (0.00) | 94.88 (0.00) | | | 7 | |
| | | | | | | |
| HAPI | 90.75 (9.17) | 0.00 (100.0) | No | 0.1 | < 15 progeny/parent | |
| | 90.63 (9.29) | 0.00 (100.0) | | 0.2 | | |
| Merlin | 70.59 (29.38) | 69.60 (29.47) | Yes | 299 | < 15 progeny/parent | |
| | 64.80 (35.18) | 63.72 (35.09) | | 604 | | |
| SHAPEIT2 | 87.20 (0.00) | 57.61 (0.00) | No | 70 | < 50 progeny/parent | |
| | 90.46 (0.00) | 64.05 (0.00) | | | 148 | |
| iXora | 95.89 (4.05) | 92.11 (7.75) | Yes | 0.3 | | |
| | 95.73 (4.21) | 91.43 (8.40) | | 0.8 | | |
The first row for each method corresponds to 300 markers and the second to 600 markers. The results are averaged over multiple data sets of 200 individuals. Parental haplotype assignment (PHA) was found to be critical in the task of trait association in [15] and is shown here in bold font. Section Methods describes the accuracy and PHA computations in detail. Trait assoc. denotes whether the software allows testing for phenotype association. Time was obtained on a system with a 3.0 GHz quad-core processor and 4 GB of memory. Abbreviations: PA = Parent assignment; Imp. = Imputation; PHA = Parent haplotype assignment; ua = unassigned; “NA” = computation is outside the scope of the software; “-” = unable to compute on our data sets.
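As a rough illustration of how an "accuracy (ua)" pair in the table could be derived, the sketch below counts correct and unassigned calls over all positions. This is an assumption rather than the paper's definition (Section Methods gives the actual computation); it is motivated only by the observation that, in rows with unassigned calls, the two numbers sum to roughly 100. The function name and the `None` encoding of an unassigned call are hypothetical.

```python
from typing import Optional, Sequence, Tuple


def accuracy_and_unassigned(
    truth: Sequence[str], calls: Sequence[Optional[str]]
) -> Tuple[float, float]:
    """Return (percent correct, percent unassigned), both over all positions.

    Assumption: an unassigned call (None) is neither correct nor wrong,
    so correct + unassigned + wrong = 100 percent.
    """
    if len(truth) != len(calls) or not truth:
        raise ValueError("truth and calls must be non-empty and equally long")
    total = len(truth)
    unassigned = sum(1 for c in calls if c is None)
    correct = sum(1 for t, c in zip(truth, calls) if c is not None and c == t)
    return 100.0 * correct / total, 100.0 * unassigned / total


# Toy data: ten positions, one unassigned call and one wrong call.
truth = ["H1", "H1", "H2", "H2", "H1", "H2", "H1", "H1", "H2", "H2"]
calls = ["H1", "H1", "H2", None, "H1", "H2", "H2", "H1", "H2", "H2"]
acc, ua = accuracy_and_unassigned(truth, calls)
print(f"{acc:.2f} ({ua:.2f})")
```

Reporting the unassigned share separately keeps a method that declines to call many positions distinguishable from one that calls them incorrectly.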
Figure 1. Outline of the iXora phasing approach. The eight steps in the iXora haplotype extraction algorithm. Eqn and Obs refer to the Equations and Observations discussed in Methods. The task is to estimate the haplotypes of the two parents, say a and b, as well as those of the four progeny.
Figure 2. Transition diagram for computing the final phasing output. The diagram shows the permissible state transitions for computing the phasing result matrices F. The states S are discussed in detail in Methods.
Figure 3. Expected haplotype distributions visualization. Expected haplotype frequencies are shown for the simulated use case detailed in Additional file 1, for the two phenotypic groups: A) tall progeny, B) short progeny. The variance due to uncertainty in crossover locations is shown as shaded regions. Clear distortion is visible near marker 30 (marked by the dashed rectangle), evident from under-representation of haplotype combinations involving paternal haplotype H2 in the short progeny (green and yellow lines in B).
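The per-group haplotype frequencies plotted in Figure 3 can be approximated with a simple tally. The sketch below is a simplified illustration, not the iXora computation: it assumes each progeny is already phased into one paternal haplotype label per marker, ignores maternal haplotypes, and does not model the crossover-location uncertainty shown as shaded regions. The function name and the toy data are hypothetical.

```python
from collections import Counter
from typing import Dict, List, Sequence


def haplotype_frequencies(progeny: Sequence[Sequence[str]]) -> List[Dict[str, float]]:
    """For each marker, return the fraction of progeny carrying each haplotype label."""
    if not progeny:
        return []
    n_markers = len(progeny[0])
    freqs = []
    for m in range(n_markers):
        counts = Counter(p[m] for p in progeny)
        freqs.append({hap: n / len(progeny) for hap, n in counts.items()})
    return freqs


# Toy data: paternal haplotype label at two markers for four progeny per group.
tall = [["H1", "H2"], ["H2", "H2"], ["H1", "H1"], ["H2", "H2"]]
short = [["H1", "H1"], ["H2", "H1"], ["H1", "H1"], ["H1", "H1"]]

tall_freqs = haplotype_frequencies(tall)
short_freqs = haplotype_frequencies(short)

# At the second marker, H2 is common in tall progeny and absent in short progeny,
# which is the kind of distortion the caption describes.
for m in range(2):
    print(m, tall_freqs[m].get("H2", 0.0), short_freqs[m].get("H2", 0.0))
```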
Figure 4. Results from Fisher’s exact test for phenotype-haplotype association for A) Father and B) Mother, including the p-value significance thresholds from randomizations, for the simulated use case detailed in Additional file 1. In this case only one region of the genome from the father is significantly associated with the phenotype (marked by the dashed rectangle), according to the Fisher’s exact test and the randomization thresholds. [Legend: real data (red), randomized data (blue), smallest value in randomized data (green)].
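The test described in the Figure 4 caption can be sketched as a per-marker two-sided Fisher's exact test on a 2x2 table (haplotype by phenotype group), with a significance threshold taken as the smallest p-value seen after shuffling the phenotype labels. This is a minimal standard-library sketch under stated assumptions: the function names, the two-haplotype and two-group encoding, the number of randomizations, and the use of the smallest randomized p-value as the threshold (suggested by the legend) are illustrative choices, not the iXora implementation.

```python
import random
from math import comb
from typing import List, Sequence


def fisher_exact_two_sided(a: int, b: int, c: int, d: int) -> float:
    """Two-sided Fisher's exact test p-value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(x: int) -> float:
        # Hypergeometric probability of x in the top-left cell, margins fixed.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    lo, hi = max(0, col1 - row2), min(row1, col1)
    p_obs = prob(a)
    # Sum every table that is no more likely than the observed one.
    total = sum(prob(x) for x in range(lo, hi + 1) if prob(x) <= p_obs * (1 + 1e-9))
    return min(1.0, total)


def marker_p_values(
    haps: Sequence[Sequence[str]], phenos: Sequence[str],
    hap: str = "H1", group: str = "tall",
) -> List[float]:
    """Fisher p-value per marker for haplotype `hap` versus phenotype `group`."""
    p_values = []
    for m in range(len(haps[0])):
        a = sum(1 for h, p in zip(haps, phenos) if h[m] == hap and p == group)
        b = sum(1 for h, p in zip(haps, phenos) if h[m] == hap and p != group)
        c = sum(1 for h, p in zip(haps, phenos) if h[m] != hap and p == group)
        d = sum(1 for h, p in zip(haps, phenos) if h[m] != hap and p != group)
        p_values.append(fisher_exact_two_sided(a, b, c, d))
    return p_values


def randomization_threshold(
    haps: Sequence[Sequence[str]], phenos: Sequence[str],
    n_rand: int = 200, seed: int = 0,
) -> float:
    """Smallest p-value over all markers across phenotype-shuffled data sets."""
    rng = random.Random(seed)
    shuffled = list(phenos)
    smallest = 1.0
    for _ in range(n_rand):
        rng.shuffle(shuffled)
        smallest = min(smallest, min(marker_p_values(haps, shuffled)))
    return smallest


# Toy data: 20 progeny, 2 markers. Marker 0 tracks the phenotype; marker 1 does not.
phenos = ["tall"] * 10 + ["short"] * 10
haps = [["H1" if i < 10 else "H2", "H1" if i % 2 == 0 else "H2"] for i in range(20)]

p_values = marker_p_values(haps, phenos)
threshold = randomization_threshold(haps, phenos)
significant = [m for m, p in enumerate(p_values) if p <= threshold]
print(p_values, threshold, significant)
```

Taking the minimum over all markers within each randomization controls for testing many markers at once, since a real association has to beat the best chance association found anywhere in the shuffled data.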