Huange Wang, Fred A van Eeuwijk, Johannes Jansen.
Abstract
KEY MESSAGE: Probabilistic graphical models show great potential for robust and reliable construction of linkage maps. We show how to use probabilistic graphical models to construct high-quality linkage maps in the face of data perturbations caused by genotyping errors and reciprocal translocations. It has been shown that linkage map construction can be hampered by the presence of genotyping errors and chromosomal rearrangements such as inversions and translocations. Here, we report a novel method for linkage map construction using probabilistic graphical models. The method is proven, both theoretically and practically, to be effective in filtering out markers that contain genotyping errors. In particular, it carries out marker filtering and ordering simultaneously, and is therefore superior to the standard post hoc filtering using nearest-neighbour stress. Furthermore, we demonstrate empirically that the proposed method offers a promising solution to linkage map construction in the case of a reciprocal translocation.
Year: 2016 PMID: 27921120 PMCID: PMC5263214 DOI: 10.1007/s00122-016-2824-x
Source DB: PubMed Journal: Theor Appl Genet ISSN: 0040-5752 Impact factor: 5.699
Table 1 Genotypic frequencies for an ordered triplet of markers M₁–M₂–M₃. θᵢⱼ (0 < θᵢⱼ < 0.5) denotes the recombination frequency between markers Mᵢ and Mⱼ; εᵢ (0 < εᵢ < 0.5) denotes the locus-specific genotyping error rate of marker Mᵢ
| Marker type M₁ | Marker type M₂ | Marker type M₃ | Genotypic frequency |
|---|---|---|---|
| −1 | −1 | −1 | 0.5 × [(1 − |
| −1 | −1 | 1 | 0.5 × (1 − |
| −1 | 1 | 1 | 0.5 × |
| −1 | 1 | −1 | 0.5 × [ |
| 1 | 1 | 1 | 0.5 × [(1 − |
| 1 | 1 | −1 | 0.5 × (1 − |
| 1 | −1 | −1 | 0.5 × |
| 1 | −1 | 1 | 0.5 × [ |
Numeric values −1 and 1 in the first three columns represent marker types a and b, respectively
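The structure of this table can be illustrated with a minimal sketch. It assumes a population with two equally frequent marker types (which is what the leading factor 0.5 suggests), no crossover interference, and genotyping errors that flip a call independently at each locus with rate εᵢ. The function name `triplet_frequencies` and the argument layout are hypothetical and not taken from the paper.

```python
from itertools import product

def triplet_frequencies(theta12, theta23, eps=(0.0, 0.0, 0.0)):
    """Observed genotype frequencies for markers M1-M2-M3 coded -1/1.

    Assumptions (not from the source): no interference, two equally
    frequent marker types, independent symmetric errors with rates eps[i].
    """
    # Error-free frequencies: 0.5 for the type at M1, then an independent
    # recombination (or non-recombination) in each of the two intervals.
    true_freq = {}
    for g in product((-1, 1), repeat=3):
        p12 = theta12 if g[0] != g[1] else 1.0 - theta12
        p23 = theta23 if g[1] != g[2] else 1.0 - theta23
        true_freq[g] = 0.5 * p12 * p23

    # Observed frequencies: every true genotype can be read as any observed
    # genotype, with a factor eps[i] per miscalled locus.
    observed = {g: 0.0 for g in true_freq}
    for g, p in true_freq.items():
        for o in product((-1, 1), repeat=3):
            q = p
            for gi, oi, e in zip(g, o, eps):
                q *= e if gi != oi else 1.0 - e
            observed[o] += q
    return observed
```

Under these assumptions, θ₁₂ = 0.1 and θ₂₃ = 0.2 give an error-free frequency of 0.5 × 0.9 × 0.8 = 0.36 for the genotype (−1, −1, −1). An error rate of 0.05 at the middle marker changes it to 0.5 × (0.9 × 0.8 × 0.95 + 0.1 × 0.2 × 0.05) = 0.3425.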
Pairwise correlation coefficients for an ordered triplet of markers M₁–M₂–M₃
|
|
|
| |
|---|---|---|---|
|
| 1 − 2 | 1 − 2 | (1 − 2 |
|
| (1 − 2 | 1 − 2 | (1 − 2 |
|
| 1 − 2 | (1 − 2 | (1 − 2 |
|
| (1 − 2 | (1 − 2 | (1 − 2 |
|
| (1 − 2 | (1 − 2 | (1 − 2 |
|
| (1 − 2 | (1 − 2 | (1 − 2 |
|
| (1 − 2 | (1 − 2 | (1 − 2 |
|
| (1 − 2 | (1 − 2 | (1 − 2 |
The symbols θ₁₂ and θ₂₃ and the genotyping error rates ε are defined as in Table 1
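How genotyping errors attenuate pairwise correlations can be checked by exact enumeration. The sketch below assumes −1/1 coding with equally frequent marker types, no interference, and independent symmetric errors. Under these assumptions the correlation between Mᵢ and Mⱼ works out to (1 − 2εᵢ)(1 − 2εⱼ)(1 − 2θᵢⱼ), which agrees with the leading terms that survive in the table. The function name `observed_correlations` is hypothetical.

```python
from itertools import product

def observed_correlations(theta12, theta23, eps):
    """Exact pairwise correlations of -1/1 coded markers M1, M2, M3.

    Keys are 0-based index pairs: (0, 1) is M1-M2, (1, 2) is M2-M3,
    (0, 2) is M1-M3. Marginal means are 0 and variances are 1 under the
    assumed model, so each correlation equals E[X_i * X_j].
    """
    corr = {(0, 1): 0.0, (1, 2): 0.0, (0, 2): 0.0}
    for true in product((-1, 1), repeat=3):
        # Probability of the error-free genotype (no interference).
        p = 0.5
        p *= theta12 if true[0] != true[1] else 1.0 - theta12
        p *= theta23 if true[1] != true[2] else 1.0 - theta23
        # A flip of -1 at a locus means that call is wrong.
        for flips in product((1, -1), repeat=3):
            q = p
            for f, e in zip(flips, eps):
                q *= e if f == -1 else 1.0 - e
            obs = [t * f for t, f in zip(true, flips)]
            for i, j in corr:
                corr[(i, j)] += q * obs[i] * obs[j]
    return corr
```

For example, with θ₁₂ = 0.1, θ₂₃ = 0.2 and error rates (0.02, 0.05, 0), the M₁–M₂ correlation is 0.96 × 0.90 × 0.80 = 0.6912 rather than the error-free 0.80.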
The observed pairwise recombination frequencies for an ordered triplet of markers M₁–M₂–M₃
The symbols θ₁₂ and θ₂₃ and the genotyping error rates ε are defined as in Table 1
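As a rough illustration of why an erroneous marker inflates the distances to its neighbours, the sketch below computes the observed recombination frequency between two markers when each call may be wrong. It assumes independent symmetric errors and no interference. The helper names `observed_recombination` and `flanking_recombination` are hypothetical, and the formulas are a standard derivation under these assumptions rather than a copy of the table.

```python
def observed_recombination(theta, eps_i, eps_j):
    """Observed recombination frequency between two markers with errors.

    An apparent recombinant arises either from a true recombinant with
    both calls consistent, or from a true non-recombinant where exactly
    one of the two calls is wrong.
    """
    # Probability that exactly one of the two genotype calls is wrong.
    e = eps_i * (1.0 - eps_j) + eps_j * (1.0 - eps_i)
    return theta * (1.0 - e) + (1.0 - theta) * e

def flanking_recombination(theta12, theta23):
    """Recombination frequency between M1 and M3 without interference."""
    return theta12 + theta23 - 2.0 * theta12 * theta23
```

With θ₁₂ = θ₂₃ = 0.01 and an error rate of 0.05 at M₂ only, the observed frequency between M₁ and M₂ becomes 0.01 × 0.95 + 0.99 × 0.05 = 0.059, while the error-free frequency between the flanking markers M₁ and M₃ is only 0.0198. The middle marker therefore appears farther from each neighbour than the neighbours are from each other.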
Fig. 1 (a) A PGM constructed with the PC-stable algorithm for the simulated data. The six markers designed with genotyping errors are pulled aside from the linear string and coloured in red, and another six markers pulled aside from the linear string are coloured in cyan. Enlargements of two detailed parts of the PGM are given above the linear string, though the whole graph itself can be enlarged dramatically to show all details clearly. (b) An MST constructed with Genstat for the simulated data. The diagram was projected on the first two principal axes obtained by a principal coordinate analysis. Only the six markers designed with genotyping errors are marked out and coloured in red (colour figure online)
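One way to see why erroneous markers end up beside the linear string is to look at a single triplet. The sketch below uses partial correlation as a simple stand-in for a conditional independence test. This is an assumption for illustration only: the source does not state which test was used, and the code is not the PC-stable algorithm. It also assumes no interference and an error at the middle marker only. The names `partial_corr` and `flanking_partial_corr` are hypothetical.

```python
import math

def partial_corr(r_xy, r_xz, r_yz):
    """Partial correlation of X and Y given Z from pairwise correlations."""
    return (r_xy - r_xz * r_yz) / math.sqrt((1.0 - r_xz**2) * (1.0 - r_yz**2))

def flanking_partial_corr(theta12, theta23, eps2):
    """Partial correlation of M1 and M3 given M2, with errors at M2 only."""
    a = 1.0 - 2.0 * eps2                      # attenuation caused by M2 errors
    r12 = a * (1.0 - 2.0 * theta12)
    r23 = a * (1.0 - 2.0 * theta23)
    r13 = (1.0 - 2.0 * theta12) * (1.0 - 2.0 * theta23)
    return partial_corr(r13, r12, r23)
```

Without errors the value is zero, so conditioning on M₂ removes the M₁–M₃ edge and the three markers form a chain. With θ₁₂ = θ₂₃ = 0.05 and an error rate of 0.05 at M₂, the value is about 0.45, so M₁ and M₃ stay directly connected and M₂ no longer separates them.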
The top six markers with the highest N.N.Stress obtained by JoinMap 4.1 from the simulated marker data
| | Locus | Position | N.N.Stress (cM) |
|---|---|---|---|
| 1 | Marker63 | 96.145 | 11.01 |
| 2 | Marker184 | 296.883 | 10.939 |
| 3 | Marker155 | 243.352 | 6.318 |
| 4 | Marker51 | 70.273 | 5.629 |
| 5 | Marker128 | 195.223 | 2.046 |
| 6 | Marker34 | 43.34 | 2.041 |
A summary of the total number of markers, the number of unique markers, the average map length across five mapping runs, and the highest values of N.N.Stress for each of the seven linkage groups constructed from the cucumber data (before missing data imputation)
| Linkage group | Number of markers | Number of unique markers | Average map length in 5 runs (cM) | Highest N.N.Stress (cM) |
|---|---|---|---|---|
| Chr.1 | 107 | 103 | 195.1 | 8.0 |
| Chr.2 | 108 | 103 | 269.8 | 11.7 |
| Chr.3 | 163 | 151 | 343.9 | 22.3 |
| Chr.4 | 95 | 67 | 115.4 | 11.8 |
| Chr.5 | 144 | 104 | 155.0 | 9.1 |
| Chr.6 | 177 | 157 | 333.1 | 12.0 |
| Chr.7 | 69 | 66 | 176.7 | 9.5 |
| Total | 863 | 751 | | |
Fig. 2 (a) An MST constructed with Genstat for 20 representative markers of Chr.5. The diagram was projected on the first two principal axes obtained by a principal coordinate analysis. (b) A PGM constructed with the PC-stable algorithm for the same set of 20 markers. The significance level for conditional independence tests was set at 0.05
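To illustrate what a minimum spanning tree (MST) of markers looks like in the ideal case, the sketch below runs Prim's algorithm on a small distance matrix. It is a generic stand-in, not the Genstat procedure used for the figures, and the marker positions in the usage note are made up for illustration. The function name `minimum_spanning_tree` is hypothetical.

```python
def minimum_spanning_tree(dist):
    """Prim's algorithm on a symmetric distance matrix.

    Returns the tree as a list of edges (i, j), where i was already in
    the tree when j was attached.
    """
    n = len(dist)
    in_tree = {0}          # start growing the tree from marker 0
    edges = []
    while len(in_tree) < n:
        # Cheapest edge that connects the tree to a marker outside it.
        i, j = min(
            ((a, b) for a in in_tree for b in range(n) if b not in in_tree),
            key=lambda e: dist[e[0]][e[1]],
        )
        edges.append((i, j))
        in_tree.add(j)
    return edges
```

For markers at hypothetical positions 0, 5, 12, 20 and 26 cM, with distances taken as absolute position differences, the tree joins only consecutive markers and is therefore a simple chain. A marker attached to more than two neighbours would show up as a branch.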
Fig. 3 (a) An MST constructed with Genstat for 64 unique markers of Chr.5. The diagram was projected on the first two principal axes obtained by a principal coordinate analysis. (b) A PGM constructed with the PC-stable algorithm for the same set of 64 markers. The significance level for conditional independence tests was set at 0.05
Fig. 4 An adjusted PGM obtained by further applying frequentist diagonal ordering to the adjacency matrix of the PGM shown in Fig. 3b
Fig. 5 A PGM constructed from the barley data by the PC-stable algorithm in combination with frequentist diagonal ordering. Yellow nodes stand for markers on chromosome 1H and green nodes stand for markers on chromosome 3H. The significance level for conditional independence tests was set at 0.05