Jiawei Shen1,2,3, Zhiqiang Li1,3, Jianhua Chen1,3, Zhijian Song1,3, Zhaowei Zhou1,4,5, Yongyong Shi1,2,6,7.
Abstract
Currently, algorithms and software for genetic analysis of diploid organisms with bi-allelic markers are well established, while those for polyploids are limited. Here, we present SHEsisPlus, an online algorithm toolset for genetic analysis of both dichotomous and quantitative traits in polyploid species (it is also compatible with haploids and diploids). SHEsisPlus is also optimized for handling multi-allelic datasets. It is free, open source, and designed to perform a range of analyses, including haplotype inference, linkage disequilibrium analysis, epistasis detection, Hardy-Weinberg equilibrium tests and single-locus association tests. We also developed an accurate and efficient haplotype inference algorithm for polyploids and proposed an entropy-based algorithm to detect epistasis in the context of quantitative traits. A study of both simulated and real datasets showed that our haplotype inference algorithm was much faster and more accurate than existing ones. Our epistasis detection algorithm was the first attempt to apply information theory to characterizing gene interactions in quantitative trait datasets. Results showed that its statistical power was significantly higher than that of conventional approaches. SHEsisPlus is freely available on the web at http://shesisplus.bio-x.cn/. Source code is freely available for download at https://github.com/celaoforever/SHEsisPlus.
Year: 2016 PMID: 27048905 PMCID: PMC4822172 DOI: 10.1038/srep24095
Source DB: PubMed Journal: Sci Rep ISSN: 2045-2322 Impact factor: 4.379
Figure 1. Illustration of 3-way interaction information.
The intersection of H(A), H(B) and H(C) is the interaction information.
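As a minimal sketch, the central region of such a three-set entropy diagram can be estimated from discrete samples by inclusion-exclusion over the joint entropies. The sign convention used here (positive for redundancy, negative for synergy) is an assumption, since conventions differ between authors, and the function names are illustrative and not part of SHEsisPlus.

```python
from collections import Counter
from math import log2


def entropy(samples):
    """Shannon entropy (in bits) of the empirical distribution of `samples`."""
    n = len(samples)
    return -sum((c / n) * log2(c / n) for c in Counter(samples).values())


def interaction_information(a, b, c):
    """Central Venn region of H(A), H(B), H(C) by inclusion-exclusion.

    Assumed convention:
    H(A) + H(B) + H(C) - H(A,B) - H(A,C) - H(B,C) + H(A,B,C).
    """
    ab, ac, bc = list(zip(a, b)), list(zip(a, c)), list(zip(b, c))
    abc = list(zip(a, b, c))
    return (entropy(a) + entropy(b) + entropy(c)
            - entropy(ab) - entropy(ac) - entropy(bc)
            + entropy(abc))
```

Under this convention, three identical fair binary variables give +1 bit, while C = A XOR B with independent fair A and B gives -1 bit, which is the purely synergistic case.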
Summary of eight SNPs used in analysis.
| SNP | Position (a) | Gene name and function (b) | Allele | Populations | Frequency allele* | CEU | CHB |
|---|---|---|---|---|---|---|---|
| rs12129861 | 1q21.1 | PDZK1, 5′ intergenic | G/A | European | A | 0.460 | 0.170 |
| rs780094 | 2p23.3 | GCKR, intron 16 | G/A | European | A | 0.394 | 0.566 |
| rs734553 | 4p10.1 | SLC2A9, intron 7 | A/C | European | C | 0.261 | 0.004 |
| rs742132 | 6p22.2 | LRRC16A, intron 34 | T/C | European, Japanese | C | 0.301 | 0.244 |
| rs1183201 | 6p22.2 | SLC17A1, intron 3 | T/A | European | A | – | – |
| rs12356193 | 10q21.2 | SLC16A9, intron 5 | A/G | European | G | 0.186 | 0.141 |
| rs17300741 | 11q13.1 | SLC22A11, intron 4 | A/G | European | G | 0.531 | 0.073 |
| rs505802 | 11q13.1 | SLC22A12, 5′ intergenic | G/A | European | A | 0.726 | 0.256 |
(a) On human genome build 18.
(b) In NCBI.
*Allele frequencies were collected from HapMap Data Phase III/Rel#3. CEU: Utah residents with Northern and Western European ancestry from the CEPH collection; CHB: Han Chinese in Beijing, China.
Accuracy and running time (in parentheses; s: seconds, m: minutes, h: hours) of SHEsisPlus, PolyHap and SATlotyper for haplotype inference at ploidy 2, 3 and 4.
| Algorithm/Ploidy | 2 | 3 | 4 |
|---|---|---|---|
| SHEsisPlus | 99.63% (6.317 s) | 98.74% (15.862 s) | 98.14% (51.109 s) |
| PolyHap | 99.15% (12.25 m) | 98.21% (3.075 h) | 78.91% (43.95 h) |
| SATlotyper | 90.46% (19.80 m) | – | – |
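To make the running-time comparison in this table concrete, the short sketch below converts the reported times to seconds and computes how many times faster SHEsisPlus ran than PolyHap at each ploidy. Only the numbers come from the table; the variable and function names are illustrative.

```python
# Running times copied from the table above; units are s, m (minutes) or h.
UNIT_SECONDS = {"s": 1.0, "m": 60.0, "h": 3600.0}

shesisplus = {2: (6.317, "s"), 3: (15.862, "s"), 4: (51.109, "s")}
polyhap = {2: (12.25, "m"), 3: (3.075, "h"), 4: (43.95, "h")}


def to_seconds(value, unit):
    """Convert a (value, unit) pair from the table to seconds."""
    return value * UNIT_SECONDS[unit]


def speedup(ploidy):
    """Ratio of PolyHap running time to SHEsisPlus running time."""
    return to_seconds(*polyhap[ploidy]) / to_seconds(*shesisplus[ploidy])


for ploidy in (2, 3, 4):
    print(f"ploidy {ploidy}: about {speedup(ploidy):.0f}x faster")
```

On these figures the ratio grows from roughly two orders of magnitude for diploids to more than three for tetraploids.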
Figure 2. Epistasis models used for simulation study.
(Left) Penetrance table for two-locus, bi-allelic epistasis in diploids. (Right) Penetrance table for two-locus, bi-allelic epistasis in triploids.
Power of SHEsisPlus for epistasis detection in diploids.
| Samples | sd* | alpha = 0.05 SHEsisPlus/Plink | alpha = 0.01 SHEsisPlus/Plink |
|---|---|---|---|
| 2000 | 0.25 | 0.494/0.041 | 0.331/0.006 |
| 2000 | 0.5 | 0.779/0.047 | 0.693/0.005 |
| 2000 | 0.75 | 0.870/0.054 | 0.832/0.004 |
| 2000 | 1 | 0.923/0.056 | 0.902/0.004 |
| 2000 | 1.5 | 0.948/0.065 | 0.938/0.008 |
| 2000 | 2 | 0.966/0.066 | 0.950/0.009 |
| 2000 | 2.5 | 0.971/0.064 | 0.968/0.009 |
*Number of standard deviations separating the two groups.
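The power and false positive rate tables report proportions of simulated datasets that were declared significant. A minimal sketch of that calculation follows, assuming one p-value per simulated replicate; the number of replicates used in the study is not stated here, and the function name is hypothetical.

```python
def rejection_rate(p_values, alpha):
    """Fraction of replicates whose p-value falls below `alpha`.

    With replicates simulated under an epistasis model this estimates power;
    with replicates simulated under the null it estimates the false positive
    rate.
    """
    if not p_values:
        raise ValueError("need at least one p-value")
    return sum(p < alpha for p in p_values) / len(p_values)
```

For example, `rejection_rate([0.001, 0.02, 0.2, 0.6], alpha=0.05)` gives 0.5, because two of the four illustrative p-values fall below 0.05.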
Power of SHEsisPlus for epistasis detection in triploids.
| Samples | sd* | alpha = 0.05 | alpha = 0.01 |
|---|---|---|---|
| 2000 | 0.25 | 0.099 | 0.024 |
| 2000 | 0.5 | 0.24 | 0.115 |
| 2000 | 0.75 | 0.418 | 0.287 |
| 2000 | 1 | 0.602 | 0.472 |
| 2000 | 1.5 | 0.824 | 0.762 |
| 2000 | 2 | 0.901 | 0.863 |
| 2000 | 2.5 | 0.903 | 0.877 |
*Number of standard deviations separating the two groups.
False positive rate of SHEsisPlus for epistasis detection in diploids.
| Samples | alpha = 0.05 SHEsisPlus/Plink | alpha = 0.01 SHEsisPlus/Plink |
|---|---|---|
| 500 | 0.055/0.048 | 0.008/0.013 |
| 1000 | 0.047/0.053 | 0.009/0.007 |
| 2000 | 0.051/0.051 | 0.008/0.015 |
| 3000 | 0.033/0.060 | 0.008/0.015 |
| 5000 | 0.052/0.058 | 0.012/0.011 |
False positive rate of SHEsisPlus for epistasis detection in triploids.
| Samples | alpha = 0.05 | alpha = 0.01 |
|---|---|---|
| 500 | 0.042 | 0.006 |
| 1000 | 0.043 | 0.008 |
| 2000 | 0.045 | 0.012 |
| 3000 | 0.044 | 0.007 |
| 5000 | 0.054 | 0.009 |
Figure 3. QQ plot of SHEsisPlus for 2-way epistasis detection in diploids in the context of quantitative traits.
The points lay approximately on the line y = x, indicating that the results were unbiased.
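A QQ plot of this kind compares observed p-values with the quantiles expected under a uniform null distribution. The sketch below builds the plotted points on the -log10 scale; the (i - 0.5)/n plotting position is an assumption (other choices exist), and the function name is illustrative.

```python
from math import log10


def qq_points(p_values):
    """Return sorted (expected, observed) pairs of -log10(p) for a QQ plot.

    All p-values must be strictly positive, since log10(0) is undefined.
    """
    n = len(p_values)
    observed = sorted(-log10(p) for p in p_values)
    # Expected uniform quantiles using the (i - 0.5) / n plotting position.
    expected = sorted(-log10((i - 0.5) / n) for i in range(1, n + 1))
    return list(zip(expected, observed))
```

If the test is well calibrated under the null, these pairs fall close to the line y = x; systematic departure above the line would suggest inflated significance.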
Figure 4. Distribution of the BMI-adjusted uric acid level.
The optimal threshold to divide the samples is marked in red.
SHEsisPlus results on the uric acid level data.
| SNP set | P value | FDR |
|---|---|---|
| rs742132,rs12356193 | 0.005 | 0.176 |
| rs1183201,rs12356193 | 0.001 | |
| rs12129861,rs742132,rs505802 | 6.03e-04 | |
| rs12129861,rs1183201,rs12356193 | 6.13e-04 | |
| rs12129861,rs12356193,rs505802 | 0.01 | 0.288 |
| rs734553,rs742132,rs1183201 | 0.028 | 0.638 |
| rs12129861,rs780094,rs742132,rs12356193 | 6.26e-04 | |
| rs12129861,rs742132,rs1183201,rs12356193 | 2.09e-06 | |
| rs12129861,rs742132,rs12356193,rs505802 | 0.008 | 0.261 |
| rs12129861,rs742132,rs17300741,rs505802 | 0.041 | 0.86 |
| rs12129861,rs780094,rs742132,rs1183201,rs12356193 | 7.72e-05 | |
| rs12129861,rs780094,rs742132,rs1183201,rs12356193,rs17300741 | 0.017 | 0.42 |
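This record does not say which procedure produced the FDR column. As an illustration only, the sketch below implements the Benjamini-Hochberg adjustment, one common choice and an assumption here. The total number of SNP sets tested is not given in this table, so the listed FDR values cannot be reproduced from these rows alone.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted
```

With the made-up input `[0.01, 0.04, 0.03, 0.005]`, the adjusted values are approximately `[0.02, 0.04, 0.04, 0.02]`.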