Jin Zhang1,2, Min Chen1, Yangjun Wen1, Yin Zhang1, Yunan Lu1, Shengmeng Wang1, Juncong Chen3.
Abstract
The mixed linear model (MLM) has been widely used in genome-wide association studies (GWAS) to dissect quantitative traits in human, animal, and plant genetics. Most methodologies treat all single nucleotide polymorphism (SNP) effects as random effects under the MLM framework and therefore fail to detect the joint minor effects of multiple genetic markers on a trait. As a result, polygenes with minor effects remain largely unexplored in today's big data era. In this study, we developed a new algorithm under the MLM framework, called the fast multi-locus ridge regression (FastRR) algorithm. The FastRR algorithm first whitens the covariance matrix of the polygenic matrix K and environmental noise, then selects potentially related SNPs, those highly correlated with the target trait, from the large-scale marker set, and finally analyzes this subset of variables using a multi-locus deshrinking ridge regression to detect true quantitative trait nucleotides (QTNs). Results from the analyses of both simulated and real data show that the FastRR algorithm is more powerful for detecting both large and small QTNs, more accurate in QTN effect estimation, and more stable under various polygenic backgrounds. Moreover, compared with existing methods, the FastRR algorithm has the advantage of high computing speed. In conclusion, the FastRR algorithm provides an alternative algorithm for multi-locus GWAS in high-dimensional genomic datasets.
Keywords: genome-wide association study; minor effect; mixed linear model; multi-locus algorithm; polygenic background; statistical power
Year: 2021 PMID: 33854527 PMCID: PMC8041068 DOI: 10.3389/fgene.2021.649196
Source DB: PubMed Journal: Front Genet ISSN: 1664-8021 Impact factor: 4.599
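The three stages named in the abstract (whitening, correlation screening, ridge regression on the selected subset) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function names, the variance ratio `lam`, the subset size `top_k`, and the penalty `alpha` are all assumptions made for illustration, `lam` is taken as given rather than estimated, and a plain ridge fit stands in for the paper's deshrinking ridge regression, whose details are not given in this record.

```python
import numpy as np


def whiten(y, Z, K, lam):
    """Transform y and Z so that a covariance proportional to lam*K + I becomes I."""
    d, U = np.linalg.eigh(K)            # K = U diag(d) U^T (K is symmetric)
    d = np.clip(d, 0.0, None)           # guard against tiny negative eigenvalues
    w = 1.0 / np.sqrt(lam * d + 1.0)
    T = (U * w).T                       # T = diag(w) U^T
    return T @ y, T @ Z


def screen(y_w, Z_w, top_k):
    """Indices of the top_k markers by absolute Pearson correlation with y_w."""
    yc = y_w - y_w.mean()
    Zc = Z_w - Z_w.mean(axis=0)
    denom = np.linalg.norm(Zc, axis=0) * np.linalg.norm(yc)
    denom[denom == 0] = np.inf          # constant columns get correlation 0
    r = (Zc.T @ yc) / denom
    return np.argsort(-np.abs(r))[:top_k]


def ridge(y_w, Z_sub, alpha):
    """Plain multi-locus ridge fit (assumed stand-in for the deshrinking variant)."""
    p = Z_sub.shape[1]
    return np.linalg.solve(Z_sub.T @ Z_sub + alpha * np.eye(p), Z_sub.T @ y_w)


def fit(y, Z, K, lam=1.0, top_k=20, alpha=1.0):
    """Whiten, screen, then fit all retained markers jointly."""
    y_w, Z_w = whiten(y, Z, K, lam)
    keep = screen(y_w, Z_w, top_k)
    beta = ridge(y_w, Z_w[:, keep], alpha)
    return keep, beta


# Toy data (hypothetical): 200 individuals, 300 markers, one QTN at column 98.
rng = np.random.default_rng(0)
n, p = 200, 300
G = rng.binomial(2, 0.5, size=(n, p)).astype(float)
Z = G - G.mean(axis=0)                  # centered genotypes
K = Z @ Z.T / p                         # simple marker-based kinship matrix
y = 1.0 * Z[:, 98] + rng.normal(size=n)

keep, beta = fit(y, Z, K)
print(dict(zip(keep.tolist(), np.round(beta, 3).tolist())))
```

Screening before the joint fit keeps the ridge system small (here 20 by 20), which is one plausible reading of where the speed advantage claimed in the abstract comes from.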
Comparison of lasso, adaptive lasso, SCAD, EMMA, DEMMA, and FastRR methods in the first simulation experiment (three scenarios).
| Polygenic background | True value | | | Lasso | | | Adaptive lasso | | | SCAD | | | EMMA | | | DEMMA | | | FastRR | | |
| | Position | Effect | | Power (%) | Effect (SD) | MSE | Power (%) | Effect (SD) | MSE | Power (%) | Effect (SD) | MSE | Power (%) | Effect (SD) | MSE | Power (%) | Effect (SD) | MSE | Power (%) | Effect (SD) | MSE |
| 2K | 98 | 0.7398 | 5% | 100.0 | 0.476 (0.092) | 7.768 | 83.0 | 0.374 (0.155) | 13.079 | 100.0 | 0.474 (0.156) | 9.446 | 100.0 | 0.736 (0.091) | 0.818 | 100.0 | 0.736 (0.091) | 0.818 | 100.0 | 0.734 (0.091) | 0.817 |
| 5K | 98 | 0.7398 | 5% | 100.0 | 0.404 (0.111) | 12.527 | 59.0 | 0.315 (0.224) | 13.585 | 100.0 | 0.390 (0.164) | 14.915 | 98.0 | 0.735 (0.103) | 1.040 | 99.0 | 0.733 (0.105) | 1.089 | 100.0 | 0.729 (0.109) | 1.188 |
| 10K | 98 | 0.7398 | 5% | 91.0 | 0.337 (0.134) | 16.386 | 32.0 | 0.380 (0.247) | 6.048 | 87.0 | 0.324 (0.168) | 17.446 | 70.0 | 0.795 (0.094) | 0.829 | 84.0 | 0.765 (0.110) | 1.052 | 99.0 | 0.729 (0.131) | 1.693 |
| False positive rate of 2K (‰) | | | | 0.453 | | | 0.004 | | | 0.288 | | | 0.030 | | | 0.014 | | | 0.450 | | |
| False positive rate of 5K (‰) | | | | 0.555 | | | 0.001 | | | 0.460 | | | 0.090 | | | 0.018 | | | 0.498 | | |
| False positive rate of 10K (‰) | | | | 0.636 | | | 0.019 | | | 0.550 | | | 0.050 | | | 0.026 | | | 0.436 | | |
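The simulation tables report, for each method, a detection power, a mean estimated effect with its standard deviation, and an MSE column, but this record does not define how they are computed. The sketch below is therefore only one possible reading: `summarize` and `sse_from_summary` are hypothetical helpers, power is assumed to be the percentage of replicates in which the QTN was detected, the effect summaries are assumed to be taken over detected replicates only, and 100 replicates are assumed. Under those assumptions, several MSE entries look roughly consistent with a sum of squared errors over the detected replicates, although the source does not confirm this.

```python
import numpy as np


def summarize(estimates, detected, true_effect):
    """Summarize one QTN across simulation replicates (assumed definitions)."""
    estimates = np.asarray(estimates, dtype=float)
    detected = np.asarray(detected, dtype=bool)
    hits = estimates[detected]                      # estimates from detected replicates
    return {
        "power_pct": 100.0 * detected.mean(),
        "mean": hits.mean() if hits.size else float("nan"),
        "sd": hits.std(ddof=1) if hits.size > 1 else float("nan"),
        "sse": float(((hits - true_effect) ** 2).sum()),
    }


def sse_from_summary(power_pct, mean, sd, true_effect, n_reps=100):
    """Approximate the sum of squared errors from reported summary statistics.

    Uses n_detected * (bias^2 + sd^2); n_reps=100 is an assumption.
    """
    n_detected = n_reps * power_pct / 100.0
    return n_detected * ((mean - true_effect) ** 2 + sd ** 2)


# Adaptive lasso, 10K polygenic background: reported MSE is 6.048.
approx = sse_from_summary(32.0, 0.380, 0.247, 0.7398)
print(round(approx, 2))
```

For the adaptive lasso row under the 10K background, this approximation gives roughly 6.1 against the reported 6.048, so the reading is plausible but should be checked against the full paper before relying on it.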
Comparison of lasso, adaptive lasso, SCAD, EMMA, DEMMA, and FastRR methods in the second simulation experiment (scenario 1: two times polygenic background).
| QTN | True value | | | Lasso | | | Adaptive lasso | | | SCAD | | | EMMA | | | DEMMA | | | FastRR | | |
| | Position | Effect | | Power (%) | Effect (SD) | MSE | Power (%) | Effect (SD) | MSE | Power (%) | Effect (SD) | MSE | Power (%) | Effect (SD) | MSE | Power (%) | Effect (SD) | MSE | Power (%) | Effect (SD) | MSE |
| 1 | 98 | 0.5451 | 2% | 99.0 | 0.298 (0.091) | 6.833 | 96.0 | 0.416 (0.149) | 3.703 | 99.0 | 0.269 (0.122) | 9.011 | 91.0 | 0.600 (0.087) | 0.956 | 94.0 | 0.596 (0.089) | 0.978 | 99.0 | 0.587 (0.094) | 1.035 |
| 2 | 301 | 0.8622 | 5% | 100.0 | 0.578 (0.100) | 9.080 | 100.0 | 0.782 (0.114) | 1.924 | 100.0 | 0.683 (0.174) | 6.221 | 100.0 | 0.822 (0.095) | 1.044 | 100.0 | 0.822 (0.095) | 1.044 | 100.0 | 0.820 (0.094) | 1.054 |
| 3 | 540 | 0.8598 | 5% | 100.0 | 0.605 (0.093) | 7.350 | 100.0 | 0.811 (0.101) | 1.240 | 100.0 | 0.730 (0.150) | 3.906 | 100.0 | 0.852 (0.089) | 0.788 | 100.0 | 0.852 (0.089) | 0.788 | 100.0 | 0.850 (0.089) | 0.788 |
| 4 | 801 | 1.0789 | 8% | 100.0 | 0.807 (0.099) | 8.340 | 100.0 | 1.030 (0.105) | 1.333 | 100.0 | 1.025 (0.139) | 2.211 | 100.0 | 1.061 (0.094) | 0.914 | 100.0 | 1.061 (0.094) | 0.914 | 100.0 | 1.059 (0.094) | 0.911 |
| 5 | 1000 | 1.2093 | 10% | 100.0 | 0.957 (0.095) | 7.276 | 100.0 | 1.118 (0.098) | 1.023 | 100.0 | 1.207 (0.251) | 10.129 | 100.0 | 1.223 (0.094) | 0.886 | 100.0 | 1.223 (0.094) | 0.886 | 100.0 | 1.220 (0.094) | 0.878 |
| False positive rate (‰) | | | | 0.461 | | | 0.024 | | | 0.355 | | | 0.000 | | | 0.007 | | | 0.422 | | |
Comparison of lasso, adaptive lasso, SCAD, EMMA, DEMMA, and FastRR methods in the second simulation experiment (scenario 3: ten times polygenic background).
| QTN | True value | | | Lasso | | | Adaptive lasso | | | SCAD | | | EMMA | | | DEMMA | | | FastRR | | |
| | Position | Effect | | Power (%) | Effect (SD) | MSE | Power (%) | Effect (SD) | MSE | Power (%) | Effect (SD) | MSE | Power (%) | Effect (SD) | MSE | Power (%) | Effect (SD) | MSE | Power (%) | Effect (SD) | MSE |
| 1 | 98 | 0.5451 | 2% | 56.0 | 0.223 (0.092) | 6.283 | 46.0 | 0.393 (0.188) | 4.297 | 51.0 | 0.240 (0.092) | 5.165 | 20.0 | 0.757 (0.047) | 0.943 | 36.0 | 0.706 (0.069) | 1.102 | 76.0 | 0.644 (0.095) | 1.160 |
| 2 | 301 | 0.8622 | 5% | 97.0 | 0.437 (0.126) | 19.080 | 93.0 | 0.718 (0.212) | 6.046 | 98.0 | 0.488 (0.195) | 17.444 | 89.0 | 0.860 (0.102) | 0.923 | 93.0 | 0.851 (0.108) | 1.088 | 100.0 | 0.830 (0.126) | 1.668 |
| 3 | 540 | 0.8598 | 5% | 97.0 | 0.459 (0.141) | 17.520 | 97.0 | 0.726 (0.235) | 1.240 | 98.0 | 0.516 (0.210) | 15.874 | 88.0 | 0.873 (0.119) | 1.242 | 94.0 | 0.858 (0.128) | 1.529 | 99.0 | 0.842 (0.140) | 1.960 |
| 4 | 801 | 1.0789 | 8% | 100.0 | 0.682 (0.147) | 17.912 | 99.0 | 1.020 (0.173) | 3.287 | 100.0 | 0.855 (0.251) | 11.254 | 100.0 | 1.085 (0.141) | 1.962 | 100.0 | 1.085 (0.141) | 1.962 | 100.0 | 1.083 (0.141) | 1.958 |
| 5 | 1000 | 1.2093 | 10% | 100.0 | 0.783 (0.159) | 20.627 | 99.0 | 1.129 (0.174) | 3.592 | 100.0 | 1.012 (0.251) | 10.129 | 100.0 | 1.206 (0.152) | 2.297 | 100.0 | 1.206 (0.153) | 2.297 | 100.0 | 1.204 (0.152) | 2.290 |
| False positive rate (‰) | | | | 0.673 | | | 0.209 | | | 0.788 | | | 0.050 | | | 0.026 | | | 0.490 | | |
FIGURE 1. The statistical powers for the fixed position QTN in the first simulation experiment using six methods (lasso, adaptive lasso, SCAD, EMMA, DEMMA, and the FastRR algorithm).
FIGURE 2. The statistical powers for the minor effect QTNs in the second simulation experiment using six methods (lasso, adaptive lasso, SCAD, EMMA, DEMMA, and the FastRR algorithm).
FIGURE 3. The average statistical powers for all QTNs in the third simulation experiment using six methods (lasso, adaptive lasso, SCAD, EMMA, DEMMA, and the FastRR algorithm).
FIGURE 4. Comparison of computing times to analyze simulation experiment 1 using all six methods (lasso, adaptive lasso, SCAD, EMMA, DEMMA, and the FastRR algorithm).
The computation times (seconds) for analyzing Arabidopsis flowering time traits and rice grain width by using lasso, adaptive lasso, SCAD, EMMA, DEMMA, and FastRR methods.
| Traits | Lasso | Adaptive lasso | SCAD | EMMA | DEMMA | FastRR |
| Grain width | 235.33 | 1067.22 | 455.31 | 60813.82 | 26417.71 | 561.31 |
| LD | 36.11 | 189.36 | 128.79 | 1362.55 | 1117.49 | 105.17 |
| SD | 37.17 | 159.00 | 114.17 | 1350.19 | 4114.88 | 112.75 |
| SDV | 44.47 | 140.96 | 112.34 | 1665.94 | 4123.34 | 107.36 |
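To make the computing-time comparison easier to read, the short script below turns the table above into time ratios relative to FastRR. The numbers are copied from the table; the `times` dictionary and the `speedup` helper are illustrative names, not part of the original work.

```python
# Computation times in seconds, copied from the table above.
METHODS = ["Lasso", "Adaptive lasso", "SCAD", "EMMA", "DEMMA", "FastRR"]
times = {
    "Grain width": [235.33, 1067.22, 455.31, 60813.82, 26417.71, 561.31],
    "LD": [36.11, 189.36, 128.79, 1362.55, 1117.49, 105.17],
    "SD": [37.17, 159.00, 114.17, 1350.19, 4114.88, 112.75],
    "SDV": [44.47, 140.96, 112.34, 1665.94, 4123.34, 107.36],
}


def speedup(trait, baseline, method="FastRR"):
    """Ratio of baseline time to method time; above 1 means `method` is faster."""
    row = dict(zip(METHODS, times[trait]))
    return row[baseline] / row[method]


for trait in times:
    ratios = {m: round(speedup(trait, m), 2) for m in METHODS if m != "FastRR"}
    print(trait, ratios)
```

By these figures FastRR is much faster than EMMA and DEMMA on every trait (over 100 times faster than EMMA for grain width), while lasso is faster than FastRR on all four traits.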
Comparison of lasso, adaptive lasso, SCAD, EMMA, DEMMA, and FastRR methods in the second simulation experiment (scenario 2: five times polygenic background).
| QTN | True value | | | Lasso | | | Adaptive lasso | | | SCAD | | | EMMA | | | DEMMA | | | FastRR | | |
| | Position | Effect | | Power (%) | Effect (SD) | MSE | Power (%) | Effect (SD) | MSE | Power (%) | Effect (SD) | MSE | Power (%) | Effect (SD) | MSE | Power (%) | Effect (SD) | MSE | Power (%) | Effect (SD) | MSE |
| 1 | 98 | 0.5451 | 2% | 89.0 | 0.239 (0.091) | 9.048 | 71.0 | 0.375 (0.179) | 4.297 | 88.0 | 0.216 (0.098) | 10.367 | 52.0 | 0.656 (0.072) | 0.943 | 73.0 | 0.622 (0.082) | 0.910 | 96.0 | 0.587 (0.095) | 1.029 |
| 2 | 301 | 0.8622 | 5% | 100.0 | 0.527 (0.119) | 12.673 | 100.0 | 0.764 (0.166) | 3.703 | 100.0 | 0.606 (0.200) | 10.515 | 99.0 | 0.841 (0.106) | 1.140 | 99.0 | 0.841 (0.106) | 1.140 | 100.0 | 0.820 (0.126) | 1.283 |
| 3 | 540 | 0.8598 | 5% | 100.0 | 0.518 (0.117) | 13.063 | 100.0 | 0.754 (0.153) | 3.439 | 100.0 | 0.591 (0.191) | 10.812 | 99.0 | 0.831 (0.107) | 1.195 | 100.0 | 0.828 (0.110) | 1.297 | 100.0 | 0.826 (0.109) | 1.299 |
| 4 | 801 | 1.0789 | 8% | 100.0 | 0.755 (0.116) | 11.824 | 100.0 | 1.029 (0.126) | 1.811 | 100.0 | 0.957 (0.186) | 4.911 | 100.0 | 1.077 (0.117) | 1.336 | 100.0 | 1.077 (0.116) | 1.336 | 100.0 | 1.075 (0.116) | 1.334 |
| 5 | 1000 | 1.2093 | 10% | 100.0 | 0.897 (0.109) | 10.937 | 100.0 | 1.176 (0.117) | 1.480 | 100.0 | 1.165 (0.150) | 2.428 | 100.0 | 1.234 (0.101) | 1.063 | 100.0 | 1.234 (0.101) | 1.063 | 100.0 | 1.232 (0.100) | 1.049 |
| False positive rate (‰) | | | | 0.510 | | | 0.102 | | | 0.473 | | | 0.040 | | | 0.014 | | | 0.431 | | |