Felipe Llinares-López, Laetitia Papaxanthos, Dean Bodenham, Damian Roqueiro, Karsten Borgwardt.
Abstract
MOTIVATION: Genetic heterogeneity is the phenomenon that distinct genetic variants may give rise to the same phenotype. The recently introduced algorithm Fast Automatic Interval Search (FAIS) enables the genome-wide search of candidate regions for genetic heterogeneity in the form of any contiguous sequence of variants, and achieves high computational efficiency and statistical power. Although FAIS can test all possible genomic regions for association with a phenotype, a key limitation is its inability to correct for confounders such as gender or population structure, which may lead to numerous false-positive associations.
Year: 2017 PMID: 28200033 PMCID: PMC5870548 DOI: 10.1093/bioinformatics/btx071
Source DB: PubMed Journal: Bioinformatics ISSN: 1367-4803 Impact factor: 6.937
Fig. 1 Schematic illustration of how individually weak signals inside a genomic region can be reinforced in meta-markers. In this example, , l = 20 and k = 2
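The meta-marker idea in Fig. 1 can be illustrated with a minimal sketch. It assumes binary variant encodings and an OR-style combination (a sample carries the meta-marker if any variant in the interval is set); the function name `meta_marker` and the toy genotype matrix are hypothetical and are not taken from the authors' implementation.

```python
def meta_marker(genotypes, start, length):
    """Combine `length` contiguous binary variants, beginning at column
    `start`, into one binary meta-marker per sample (logical OR)."""
    return [int(any(row[start:start + length])) for row in genotypes]


# Toy data: 4 samples (rows) x 5 binary variants (columns).
genotypes = [
    [0, 1, 0, 0, 0],
    [0, 0, 0, 0, 0],
    [0, 0, 1, 0, 0],
    [0, 0, 0, 0, 1],
]

# Each variant in columns 1..2 is carried by only one sample, but the
# interval as a whole marks two samples.
interval_marker = meta_marker(genotypes, start=1, length=2)
```

Under this encoding, several rare variants that are individually too infrequent to reach significance can jointly produce a meta-marker that is common enough to test.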
Fig. 2 (a) A comparison of the power of FastCMH, FAIS-χ and BonfCMH for detecting true significant regions, as varies. The parameters are chosen as: n = 500, , k = 2 and . (b) The proportion of confounded significant regions falsely detected by each of those three algorithms. The parameters have the same values as for (a). (c) A comparison of the runtimes for the three methods, where the dashed section for BonfCMH represents approximated values. Both axes are plotted on a log scale. The set of parameters is as follows: n = 500, k = 4, . (d) The difference in runtime between FastCMH and a naive implementation of a procedure combining Tarone’s trick and the CMH test. The dashed section of the naive method represents approximated values. We chose: n = 500,
Fig. 3 A comparison of the power of FastCMH and several burden tests with (a) non-overlapping windows and (b) sliding windows. The burden tests were performed for various window sizes (w) and used the encoding that counts all minor alleles in the window. Refer to Supplementary Section S1.5 for more details.
Comparison of the results obtained using our proposed method (FastCMH) and the previous state-of-the-art algorithm (FAIS-χ), which cannot correct for covariates
| Dataset and phenotype | Samples | Cases % | | λ | Hits | λ | Hits |
|---|---|---|---|---|---|---|---|
| | 7993 | 45.4 | 20 | 16.70 | 88 403 | 1.05 | 3 |
| | 87 | 63.2 | 3 | 1.66 | 14 | 1.17 | 11 |
| | 84 | 66.7 | 3 | 1.53 | 15 | 1.13 | 13 |
| | 90 | 51.1 | 4 | 1.70 | 6 | 1.22 | 5 |
| | 95 | 22.1 | 3 | 2.05 | 20 | 1.21 | 3 |
| | 95 | 30.5 | 5 | 2.51 | 26 | 1.30 | 1 |
For each method, the columns λ and “Hits” refer to the genomic inflation factor and the resulting number of non-overlapping genomic regions deemed significant, respectively. The value of λ is computed based on the P-values of all testable regions.
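As a rough illustration of how λ can be obtained from a set of P-values, the sketch below uses the common definition of the genomic inflation factor: the median of the χ² (1 degree of freedom) statistics implied by the P-values, divided by the median of the χ² null distribution. Whether the authors used exactly this estimator is an assumption; the function name `genomic_inflation` is hypothetical.

```python
from statistics import NormalDist, median


def genomic_inflation(p_values):
    """Genomic inflation factor from P-values, assuming chi-square
    statistics with 1 degree of freedom under the null."""
    nd = NormalDist()
    for p in p_values:
        if not 0.0 < p <= 1.0:
            raise ValueError("P-values must lie in (0, 1]")
    # For 1 d.o.f., the chi-square quantile at upper-tail probability p
    # equals the squared standard normal quantile at p / 2.
    chi2 = [nd.inv_cdf(p / 2.0) ** 2 for p in p_values]
    # Median of the null distribution (about 0.455).
    null_median = nd.inv_cdf(0.75) ** 2
    return median(chi2) / null_median


well_calibrated = genomic_inflation([0.1, 0.3, 0.5, 0.7, 0.9])
inflated = genomic_inflation([0.001, 0.01, 0.02, 0.05, 0.2])
```

Values of λ close to 1 indicate P-values consistent with the null distribution, while values well above 1 suggest inflation, for example from uncorrected confounding.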
Fig. 4 Comparison of the QQ-plots for the P-values of all testable genomic regions obtained with FastCMH (red) and the previous state-of-the-art FAIS-χ (blue) for three datasets: (a) A. thaliana LES, (b) A. thaliana LY, (c) COPDGene. Horizontal lines show the adjusted significance thresholds
| Variables | Row totals | ||
|---|---|---|---|
| Col. totals |
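The CMH test named in the Fig. 2 caption combines one 2×2 contingency table per covariate category, using only the cell counts and their row and column totals. The following is a minimal sketch of the standard Cochran-Mantel-Haenszel statistic without continuity correction; it is not the FastCMH algorithm itself, and the function name `cmh_test` and the toy counts are hypothetical.

```python
import math


def cmh_test(tables):
    """CMH statistic and P-value for a list of 2x2 tables, one per
    covariate category, each given as ((a, b), (c, d))."""
    deviation = 0.0
    variance = 0.0
    for (a, b), (c, d) in tables:
        n = a + b + c + d
        if n < 2:
            continue  # a stratum this small carries no information
        row1, row2 = a + b, c + d
        col1, col2 = a + c, b + d
        # Observed minus expected count of the top-left cell.
        deviation += a - row1 * col1 / n
        # Hypergeometric variance of that cell given the margins.
        variance += row1 * row2 * col1 * col2 / (n * n * (n - 1))
    if variance == 0.0:
        return 0.0, 1.0
    stat = deviation ** 2 / variance
    # Upper tail of a chi-square distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(stat / 2.0))
    return stat, p_value


# Toy confounding example: rows = marker present/absent,
# columns = cases/controls, one table per covariate category.
strata = [((18, 2), (9, 1)), ((1, 9), (2, 18))]
pooled = [((19, 11), (11, 19))]

stratified_stat, stratified_p = cmh_test(strata)
pooled_stat, pooled_p = cmh_test(pooled)
```

In this toy example the marker and the phenotype are unrelated within each category, so the stratified statistic is zero, whereas pooling the categories into a single table produces an apparent association driven purely by the covariate.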