Jun Xing, Huijiang Gao, Yang Wu, Yani Wu, Hongwang Li, Runqing Yang.
Abstract
The generalized estimating equation (GEE) algorithm under a heterogeneous residual variance model extends the iteratively reweighted least squares (IRLS) method from continuous traits to discrete traits. Compared with the mixture model-based expectation-maximization (EM) algorithm, the GEE algorithm detects quantitative trait loci (QTLs) well and at high computing speed, especially large-effect QTLs located in large marker intervals. Because it is based on a single-QTL model and ignores other linked QTLs, however, the GEE algorithm has very limited statistical power to detect multiple QTLs. In this study, a fast least absolute shrinkage and selection operator (LASSO) is derived for the generalized linear model (GLM) with all possible link functions. Under a heterogeneous residual variance model, the LASSO for GLM is used to iteratively estimate the non-zero genetic effects of loci across the entire genome. The iteratively reweighted LASSO is thereby extended to QTL mapping for discrete traits, such as ordinal, binary, and Poisson traits. Simulated and real data analyses, using binary and Poisson traits as examples, demonstrate the efficiency of the proposed method in simultaneously identifying multiple QTLs.
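As a rough illustration of the idea summarized in the abstract, the sketch below fits an L1-penalized logistic regression (binary trait, logit link) by repeatedly forming IRLS working weights and residuals and solving the resulting weighted LASSO with cyclic coordinate descent, in the style of glmnet. It is a minimal sketch under stated assumptions, not the authors' implementation: marker genotypes are assumed fully observed and coded −1/+1, the heterogeneous residual variance weighting is omitted, and the names `lasso_logistic`, `simulate`, and the penalty value `lam` are hypothetical.

```python
import math
import random


def soft_threshold(z, gamma):
    """Soft-thresholding operator used in the LASSO coordinate update."""
    if z > gamma:
        return z - gamma
    if z < -gamma:
        return z + gamma
    return 0.0


def lasso_logistic(X, y, lam, n_outer=25, n_inner=100, tol=1e-7):
    """L1-penalized logistic regression: IRLS outer loop, coordinate descent inner loop.

    X is a list of rows of marker codes, y a list of 0/1 phenotypes, and lam a
    hypothetical penalty value. Returns (intercept, list of marker effects).
    """
    n, p = len(X), len(X[0])
    cols = [[X[i][j] for i in range(n)] for j in range(p)]
    b0, beta = 0.0, [0.0] * p
    for _ in range(n_outer):
        previous = [b0] + beta
        # IRLS step: working weights w and working residuals r = z - eta.
        eta = [b0 + sum(X[i][j] * beta[j] for j in range(p)) for i in range(n)]
        mu = [1.0 / (1.0 + math.exp(-e)) for e in eta]
        w = [max(m * (1.0 - m), 1e-5) for m in mu]
        r = [(y[i] - mu[i]) / w[i] for i in range(n)]
        sum_w = sum(w)
        # Weighted LASSO on the working response, by cyclic coordinate descent.
        for _ in range(n_inner):
            delta = sum(w[i] * r[i] for i in range(n)) / sum_w  # unpenalized intercept
            b0 += delta
            r = [ri - delta for ri in r]
            max_change = abs(delta)
            for j, xj in enumerate(cols):
                denom = sum(w[i] * xj[i] ** 2 for i in range(n)) / n
                if denom == 0.0:
                    continue
                # Weighted correlation of marker j with its partial residual.
                rho = sum(w[i] * xj[i] * (r[i] + xj[i] * beta[j]) for i in range(n)) / n
                new = soft_threshold(rho, lam) / denom
                change = new - beta[j]
                if change != 0.0:
                    r = [r[i] - change * xj[i] for i in range(n)]
                    beta[j] = new
                    max_change = max(max_change, abs(change))
            if max_change < tol:
                break
        if max(abs(a - b) for a, b in zip(previous, [b0] + beta)) < tol:
            break
    return b0, beta


def simulate(n=300, p=5, effect=2.0, seed=0):
    """Toy data: only marker 0 affects the binary trait (all values hypothetical)."""
    rng = random.Random(seed)
    X = [[1.0 if rng.random() < 0.5 else -1.0 for _ in range(p)] for _ in range(n)]
    y = [1 if rng.random() < 1.0 / (1.0 + math.exp(-effect * row[0])) else 0 for row in X]
    return X, y
```

Calling `X, y = simulate()` followed by `b0, beta = lasso_logistic(X, y, lam=0.05)` should leave a clearly positive effect for marker 0, while a much larger `lam` shrinks every marker effect to exactly zero. Fitting all markers jointly in this way is what lets a LASSO-type method account for linked loci that a single-QTL scan ignores.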
Year: 2014 PMID: 25210765 PMCID: PMC4161361 DOI: 10.1371/journal.pone.0106985
Source DB: PubMed Journal: PLoS One ISSN: 1932-6203 Impact factor: 3.240
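The commonly used distributions in the GLM for discrete traits, with link functions written in terms of the mean $\mu$ and the linear predictor $\eta$.
| Distribution | Link name | Link function | Mean function | |
|---|---|---|---|---|
| Normal | Identity | $\eta = \mu$ | $\mu = \eta$ | |
| Poisson | Log | $\eta = \ln \mu$ | $\mu = e^{\eta}$ | 1 |
| | Identity | $\eta = \mu$ | $\mu = \eta$ | |
| | Sqrt | $\eta = \sqrt{\mu}$ | $\mu = \eta^{2}$ | |
| Binomial | Logit | $\eta = \ln\frac{\mu}{1-\mu}$ | $\mu = \frac{1}{1+e^{-\eta}}$ | 1 |
| | Cloglog | $\eta = \ln\left(-\ln(1-\mu)\right)$ | $\mu = 1 - e^{-e^{\eta}}$ | |
| | Probit | $\eta = \Phi^{-1}(\mu)$ | $\mu = \Phi(\eta)$ | |
| | Log | $\eta = \ln \mu$ | $\mu = e^{\eta}$ | |
| Multinomial | As above | As above | As above | 1 |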
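The link names listed in the table above correspond to standard GLM link functions and their inverses (the mean functions). The sketch below writes them out in plain Python so that each pair can be checked numerically. The dictionary `LINKS` and the helper `round_trip` are hypothetical names introduced for illustration, and the formulas are the textbook definitions rather than expressions taken from the article.

```python
import math
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # standard normal, used for the probit link

# link name -> (link function g(mu), mean function g^{-1}(eta))
LINKS = {
    "identity": (lambda mu: mu, lambda eta: eta),
    "log": (math.log, math.exp),
    "sqrt": (math.sqrt, lambda eta: eta ** 2),
    "logit": (
        lambda mu: math.log(mu / (1.0 - mu)),
        lambda eta: 1.0 / (1.0 + math.exp(-eta)),
    ),
    "cloglog": (
        lambda mu: math.log(-math.log(1.0 - mu)),
        lambda eta: 1.0 - math.exp(-math.exp(eta)),
    ),
    "probit": (_STD_NORMAL.inv_cdf, _STD_NORMAL.cdf),
}


def round_trip(name, mu):
    """Apply the link and then the mean function; the result should equal mu."""
    link, mean = LINKS[name]
    return mean(link(mu))
```

For any mean strictly between 0 and 1, `round_trip(name, mu)` should return `mu` up to floating-point error for every link in `LINKS`, which is a quick way to confirm that a link and its mean function are paired correctly.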
Mean estimates and standard deviations (in parentheses) of QTL positions detected with three mapping methods for the simulated datasets.
| Trait | Sample size | Method | Q1 | Q2 | Q3 | Q4 | Q5 | Q6 | Q7 | Q8 | Q9 | Q10 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| True position | | | 23 | 56 | 148 | 193 | 267 | 332 | 390 | 478 | 522 | 574 |
| Binary | 200 | IRglmnet | 23.2(2.5) | 57.2(2.3) | 150.3(2.8) | 194.3(2.6) | 271.7(2.7) | 335.0(2.4) | 393.5(2.3) | 480.3(3.7) | 530.0(3.2) | 578.6(2.8) |
| Binary | 200 | UWglmnet | 23.1(2.7) | 56.9(2.3) | 150.0(2.7) | 194.6(2.7) | 271.9(2.8) | 334.6(2.4) | 393.8(2.3) | 480.7(3.5) | 526.5(3.2) | 579.0(3.0) |
| Binary | 200 | IRGEE | 28.9(5.5) | 55.0(4.5) | 154.6(7.4) | 191.4(6.1) | 276.5(16.2) | 335.8(8.1) | 389.0(7.0) | 481.5(7.5) | 545.0(4.4) | 586.0(8.8) |
| Binary | 400 | IRglmnet | 23.7(2.1) | 57.1(1.5) | 149.8(2.8) | 194.7(2.6) | 270.9(2.6) | 335.1(2.5) | 393.8(2.1) | 480.4(2.7) | 526.8(1.7) | 578.9(2.8) |
| Binary | 400 | UWglmnet | 23.4(2.2) | 57.0(1.5) | 149.6(3.0) | 194.7(2.5) | 271.1(2.8) | 335.0(2.5) | 393.8(2.3) | 480.2(2.8) | 526.5(2.9) | 579.0(2.9) |
| Binary | 400 | IRGEE | 29.1(4.8) | 54.5(3.5) | 155.8(7.8) | 191.4(6.5) | 278.7(7.8) | 333.9(6.2) | 398.7(6.7) | 480.8(5.8) | 546.0(7.9) | 581.9(7.0) |
| Poisson | 200 | IRglmnet | 23.8(2.6) | 56.5(2.5) | 149.9(2.8) | 194.5(2.5) | 271.7(2.8) | 335.1(2.6) | 393.8(2.2) | 480.9(2.9) | 526.4(2.5) | 578.8(2.6) |
| Poisson | 200 | UWglmnet | 23.7(2.5) | 56.1(2.4) | 150.1(2.6) | 194.5(2.7) | 270.84(2.8) | 335.1(2.7) | 393.5(2.4) | 480.7(2.8) | 526.6(2.4) | 578.9(2.8) |
| Poisson | 200 | IRGEE | 16.9(8.7) | 52.7(9.4) | 152.1(9.3) | 191.2(8.2) | 266.6(12.3) | 335.8(10.9) | 395.4(11.0) | 481.5(8.2) | 524.56(11.6) | 581.1(10.0) |
| Poisson | 400 | IRglmnet | 24.22(2.3) | 56.2(2.3) | 150.1(2.5) | 194.7(2.5) | 270.8(2.9) | 335.2(2.5) | 393.8(2.3) | 480.8(2.3) | 527.0(2.7) | 579.4(2.6) |
| Poisson | 400 | UWglmnet | 24.2(2.3) | 56.1(2.3) | 150.1(2.5) | 194.6(2.5) | 270.4(3.0) | 335.1(2.5) | 394.1(2.1) | 480.8(2.5) | 526.62(2.4) | 579.6(2.5) |
| Poisson | 400 | IRGEE | 20.2(10.8) | 49.3(7.4) | 154.3(7.3) | 192.1(6.9) | 269.4(12.0) | 333.7(6.7) | 396.1(7.2) | 480.4(7.8) | 524.8(9.4) | 580.0(6.4) |
Mean estimates and standard deviations (in parentheses) of QTL effects obtained with three mapping methods for the simulated datasets.
| Trait | Sample size | Method | Q1 | Q2 | Q3 | Q4 | Q5 | Q6 | Q7 | Q8 | Q9 | Q10 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| True effect | | | 1.5 | 2.0 | 0.72 | 1.1 | −0.22 | 0.70 | −0.65 | 1.25 | 0.35 | −0.80 |
| Binary | 200 | IRglmnet | 1.51(0.26) | 1.88(0.34) | 0.97(0.18) | 1.25(0.25) | −0.40(0.11) | 0.91(0.19) | −0.89(0.19) | 1.18(0.28) | 0.60(0.18) | −0.96(0.21) |
| Binary | 200 | UWglmnet | 1.25(0.35) | 1.59(0.42) | 0.83(0.19) | 1.02(0.28) | −0.31(0.10) | 0.8(0.19) | −0.8(0.19) | 0.98(0.29) | 0.59(0.18) | −0.80(0.24) |
| Binary | 200 | IRGEE | 0.78(0.39) | 0.90(0.38) | 0.38(0.18) | 0.41(0.17) | −0.33(0.11) | 0.34(0.24) | −0.33(0.11) | 0.37(0.26) | −0.3(0.21) | −0.36(0.22) |
| Binary | 400 | IRglmnet | 1.59(0.27) | 1.99(0.34) | 0.83(0.21) | 1.13(0.31) | −0.35(0.08) | 0.77(0.20) | −0.75(0.20) | 1.23(0.32) | 0.56(0.07) | −0.82(0.21) |
| Binary | 400 | UWglmnet | 1.17(0.31) | 1.49(0.23) | 0.60(0.14) | 0.79(0.17) | −0.30(0.09) | 0.57(0.13) | −0.55(0.12) | 0.89(0.18) | 0.46(0.07) | −0.61(0.14) |
| Binary | 400 | IRGEE | 0.76(0.17) | 0.89(0.17) | 0.33(0.16) | 0.36(0.17) | −0.25(0.13) | 0.25(0.14) | −0.15(0.19) | 0.32(0.16) | −0.22(0.13) | −0.26(0.13) |
| Poisson | 200 | IRglmnet | 1.38(0.31) | 1.70(0.35) | 0.80(0.18) | 1.07(0.17) | −0.23(0.20) | 0.76(0.21) | −0.73(0.21) | 1.28(0.25) | 0.35(0.16) | −0.81(0.16) |
| Poisson | 200 | UWglmnet | 1.41(0.34) | 1.64(0.43) | 0.73(0.19) | 1.03(0.17) | −0.23(0.25) | 0.67(0.27) | −0.64(0.27) | 1.19(0.26) | 0.35(0.17) | −0.75(0.17) |
| Poisson | 200 | IRGEE | 1.27(0.23) | 1.21(0.21) | 1.14(0.21) | 1.20(0.29) | −0.25(0.50) | 0.53(0.37) | −0.33(0.57) | 1.13(0.17) | 0.19(0.52) | −0.66(0.39) |
| Poisson | 400 | IRglmnet | 1.34(0.20) | 1.61(0.23) | 0.76(0.11) | 1.07(0.12) | −0.24(0.09) | 0.77(0.14) | −0.74(0.15) | 1.24(0.13) | 0.39(0.11) | −0.85(0.15) |
| Poisson | 400 | UWglmnet | 1.42(0.28) | 1.49(0.34) | 0.72(0.14) | 1.02(0.14) | −0.22(0.11) | 0.69(0.15) | −0.65(0.17) | 1.17(0.13) | 0.35(0.11) | −0.76(0.16) |
| Poisson | 400 | IRGEE | 1.21(0.21) | 1.21(0.21) | 1.08(0.17) | 1.17(0.16) | −0.4(0.22) | 0.51(0.19) | −0.43(0.34) | 1.1(0.15) | 0.17(0.27) | −0.73(0.20) |
Statistical powers (%) of QTL detection obtained with three mapping methods for the simulated datasets. The ten QTLs (Q1 to Q10) are distributed over six chromosomes (C1 to C6).
| Trait | Sample size | Method | Q1 | Q2 | Q3 | Q4 | Q5 | Q6 | Q7 | Q8 | Q9 | Q10 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Binary | 200 | IRglmnet | 83.0 | 85.7 | 27.5 | 49.9 | 5.5 | 24.2 | 17.9 | 71.0 | 5.3 | 26.6 |
| Binary | 200 | UWglmnet | 83.6 | 87.1 | 27.1 | 49.2 | 5.2 | 23.6 | 23.0 | 78.8 | 5.9 | 30.2 |
| Binary | 200 | IRGEE | 68.1 | 88.0 | 27.2 | 32.3 | 2.2 | 6.3 | 2.3 | 47.8 | 3.1 | 11.5 |
| Binary | 400 | IRglmnet | 92.9 | 91.8 | 57.2 | 82.1 | 7.3 | 60.5 | 59.8 | 84.9 | 11.8 | 69.9 |
| Binary | 400 | UWglmnet | 93.1 | 95.1 | 60.0 | 78.5 | 7.5 | 62.8 | 63.4 | 85.8 | 12.4 | 65.9 |
| Binary | 400 | IRGEE | 74.2 | 92.2 | 56.0 | 80.8 | 3.2 | 17.0 | 15.4 | 79.4 | 5.2 | 20.4 |
| Poisson | 200 | IRglmnet | 83.6 | 82.5 | 74.2 | 81.3 | 38.6 | 76.5 | 83.4 | 83.4 | 57.8 | 70.6 |
| Poisson | 200 | UWglmnet | 83.2 | 82.4 | 74.3 | 83.8 | 40.5 | 79.6 | 83.0 | 78.6 | 56.5 | 73.9 |
| Poisson | 200 | IRGEE | 80.1 | 78.2 | 70.4 | 74.8 | 23.0 | 65.5 | 62.3 | 76.2 | 30.2 | 54.3 |
| Poisson | 400 | IRglmnet | 87.7 | 96.2 | 82.9 | 91.9 | 59.0 | 88.8 | 91.4 | 91.2 | 69.8 | 79.8 |
| Poisson | 400 | UWglmnet | 88.8 | 90.4 | 78.4 | 91.6 | 56.2 | 86.8 | 91.6 | 89.0 | 71.0 | 80.8 |
| Poisson | 400 | IRGEE | 78.7 | 86.3 | 80.5 | 84.3 | 30.4 | 82.2 | 82.6 | 88.2 | 48.3 | 68.2 |
Figure 1. The profiles of -log(p) test statistics of additive genetic effects obtained with the IRglmnet method (upper panel) and the IRGEE method (lower panel) for alopecia areata.
In each plot, the genome-wide critical value is marked by a horizontal reference line. Chromosomes are separated by the vertical dotted lines and marker positions are indicated by the ticks on the horizontal axis.
Figure 2. The profiles of -log(p) test statistics of dominance genetic effects obtained with the IRglmnet method (upper panel) and the IRGEE method (lower panel) for alopecia areata.
In each plot, the genome-wide critical value is marked by a horizontal reference line. Chromosomes are separated by the vertical dotted lines and marker positions are indicated by the ticks on the horizontal axis.
Estimated QTL parameters obtained with the three mapping methods for alopecia areata in an F2 mouse population.
| Inheritance mode | QTL no. | IRglmnet: Chr-pos. (cM) | IRglmnet: Marker interval | IRglmnet: Effect | IRglmnet: Heritability | IRglmnet: -log(P) | UWglmnet: Effect | UWglmnet: Heritability | IRGEE: Chr-pos. (cM) | IRGEE: Effect |
|---|---|---|---|---|---|---|---|---|---|---|
| Additive | 1 | 1–3 | D1Mit231 | −1.05 | 0.03 | 3.02 | −0.37 | 0.02 | 8–22 | −0.35 |
| Additive | 2 | 9–13 | D9Mit162 | 1.31 | 0.05 | 4.21 | 0.53 | 0.04 | 9–3 | 0.28 |
| Additive | 3 | 10–43 | D10Mit180 | 1.05 | 0.03 | 3.53 | 0.50 | 0.04 | | |
| Additive | 4 | 13–4 | D13Mit179∼D13Mit159 | 1.21 | 0.04 | 2.90 | 0.57 | 0.05 | | |
| Additive | 5 | 15–8 | D15Mit115∼D15Mit270 | −1.14 | 0.04 | 2.19 | −0.49 | 0.03 | 15–8 | −0.39 |
| Additive | 6 | 17–8.8 | D17Mit80 | −4.00 | 0.49 | 3.91 | −1.44 | 0.30 | 17–12.8 | −0.75 |
| Dominance | 1 | 2–0 | D2Mit237 | 1.59 | 0.04 | 2.81 | 0.71 | 0.04 | 1–3 | −0.41 |
| Dominance | 2 | 2–58.3 | D2Mit456 | −1.27 | 0.02 | 2.23 | −0.62 | 0.03 | | |
| Dominance | 3 | 8–34 | D8Mit75∼D8Mit167 | −2.48 | 0.09 | 2.10 | −1.20 | 0.10 | | |
| Dominance | 4 | 15–14.2 | D15Mit209 | 2.45 | 0.09 | 3.23 | 1.02 | 0.07 | | |
Figure 3. The profiles of -log(p) test statistics obtained with the IRglmnet method (upper panel) and the IRGEE method (lower panel) for tiller numbers.
In each plot, the genome-wide critical value is marked by a horizontal reference line. Chromosomes are separated by the vertical dotted lines and marker positions are indicated by the ticks on the horizontal axis.