Toomas Haller, Mart Kals, Tõnu Esko, Reedik Mägi, Krista Fischer.
Abstract
Genome-wide association studies are becoming more computationally demanding as data volumes grow. Combinatorial traits can increase the data dimensions beyond the computational capabilities of current tools. We addressed this issue by creating an application for quick association analysis that is ten to hundreds of times faster than the leading fast methods. Our tool, RegScan, is designed to perform basic linear regression analysis of continuous traits as fast as possible on large data sets. RegScan specifically targets association analysis of combinatorial traits in metabolomics, and it can both generate and analyze such traits efficiently. It can analyze any number of traits together without the need to specify each trait individually. The main goal of the article is to show that RegScan can be the preferred analytical tool when large amounts of data need to be analyzed quickly using the allele frequency test. AVAILABILITY: Precompiled RegScan (all major platforms), source code, user guide and examples are freely available at www.biobank.ee/regscan. REQUIREMENTS: Qt 4.4.3 or newer for dynamic compilations.
Keywords: GWAS; combinatorial traits; continuous traits; genome-wide analysis; linear regression; metabolomics
Year: 2013 PMID: 24008273 PMCID: PMC4293375 DOI: 10.1093/bib/bbt066
Source DB: PubMed Journal: Brief Bioinform ISSN: 1467-5463 Impact factor: 11.622
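The abstract describes two steps: generating combinatorial traits and running a basic linear regression of each continuous trait on each marker. The sketch below illustrates that idea in plain Python. It is not RegScan's implementation: the function names (`ratio_traits`, `simple_regression`, `scan`) are hypothetical, and defining a combinatorial trait as the ratio of two traits is an assumption made only for illustration.

```python
from itertools import combinations
from math import sqrt


def ratio_traits(traits):
    """Build one ratio trait per unordered pair of input traits.

    traits: dict mapping trait name -> list of values (one per individual).
    The ratio definition is an assumption, not taken from the source.
    """
    out = {}
    for a, b in combinations(traits, 2):
        out[f"{a}/{b}"] = [x / y for x, y in zip(traits[a], traits[b])]
    return out


def simple_regression(x, y):
    """Least-squares fit of y = a + beta * x; returns (beta, standard error)."""
    n = len(x)
    mx = sum(x) / n
    my = sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    syy = sum((yi - my) ** 2 for yi in y)
    beta = sxy / sxx
    # Residual sum of squares; clamp tiny negative values from rounding.
    rss = max(syy - beta * sxy, 0.0)
    se = sqrt(rss / (n - 2) / sxx)
    return beta, se


def scan(genotypes, traits):
    """Regress every trait on every marker.

    genotypes: dict mapping marker name -> list of allele dosages.
    Returns a dict keyed by (marker, trait) with (beta, se) values.
    """
    return {
        (m, t): simple_regression(g, y)
        for m, g in genotypes.items()
        for t, y in traits.items()
    }
```

Passing the output of `ratio_traits` to `scan` analyzes all derived traits in one call, which mirrors the abstract's point that traits need not be specified individually.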
Figure 1: General RegScan workflow. Creating combinatorial traits and adjustments/transformations are optional (dashed boxes).
Figure 2: Analysis time (RegScan versus QuickTest) with 1 million markers, one trait and variable number of individuals (750–3315). (A) Relative speed gain of RegScan over QuickTest; slope = 10.14. (B) Computational speed of RegScan and QuickTest as a function of the number of individuals. A colour version of this figure is available at BIB online: http://bib.oxfordjournals.org.
Pearson correlation coefficients between P-, β and SE values computed by SNPTEST (ST), QuickTest (QT) and RegScan (RS) based on 40 765 random markers
| Parameter | RS versus QT | RS versus ST | QT versus ST |
|---|---|---|---|
| P | 0.999998 | 0.999951 | 0.99995 |
| β | 1 | 0.999999 | 0.999999 |
| SE | 1 | 1 | 1 |
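The agreement figures in the table above are Pearson correlation coefficients between the per-marker outputs of two tools. As a rough sketch of how such a coefficient could be computed, assuming two equally long lists of values for the same markers (the function name `pearson` and the sample numbers are hypothetical and not taken from the paper):

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Made-up effect sizes for the same five markers from two tools.
betas_tool_a = [0.10, -0.25, 0.03, 0.40, -0.12]
betas_tool_b = [0.10, -0.25, 0.03, 0.41, -0.12]
r = pearson(betas_tool_a, betas_tool_b)
```

A coefficient close to 1 indicates that the two tools rank and scale the markers almost identically, though it does not by itself rule out small systematic offsets.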
Deviation (%) between P-, β and SE values computed by SNPTEST (ST), QuickTest (QT) and RegScan (RS) based on 40 765 random markers
| Parameter | RS versus QT | RS versus ST | QT versus ST |
|---|---|---|---|
| Mean deviation of P (%) | 0.119 | 0.114 | 0.006 |
| P values with >5% deviation (%) | 0.000 | 0.010 | 0.010 |
| P values with >1% deviation (%) | 2.956 | 2.951 | 0.010 |
| Mean deviation of β (%) | 0.018 | 0.017 | 0.036 |
| β values with >5% deviation (%) | 0.373 | 0.383 | 0.010 |
| β values with >1% deviation (%) | 1.820 | 1.828 | 0.010 |
| Mean deviation of SE (%) | 0.004 | 0.004 | 0.0002 |
| SE values with >5% deviation (%) | 0.000 | 0.007 | 0.007 |
| SE values with >1% deviation (%) | 0.000 | 0.010 | 0.010 |
The deviation (%) is calculated as the mean of the deviations of all markers (each calculated as the larger value divided by the smaller value times 100).
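The deviation summary above can be sketched in a few lines. Read literally, the larger value divided by the smaller times 100 gives 100 for identical values, whereas the tabulated means are close to 0. The sketch therefore assumes that the reported percentage is the excess over 100, that absolute values are compared, and that no value is zero. These are assumptions rather than details confirmed by the source, and the function names are hypothetical.

```python
def percent_deviation(a, b):
    """Deviation (%) between two nonzero values: (larger / smaller - 1) * 100.

    Subtracting 1 (i.e. 100%) and comparing absolute values are assumptions.
    """
    hi = max(abs(a), abs(b))
    lo = min(abs(a), abs(b))
    return (hi / lo - 1.0) * 100.0


def summarize(xs, ys, thresholds=(5.0, 1.0)):
    """Mean deviation (%) and the share of markers (%) above each threshold."""
    devs = [percent_deviation(x, y) for x, y in zip(xs, ys)]
    mean_dev = sum(devs) / len(devs)
    share = {
        t: 100.0 * sum(d > t for d in devs) / len(devs)
        for t in thresholds
    }
    return mean_dev, share
```

Here `summarize` returns the quantities that correspond to the three row types in the table: the mean deviation, and the percentage of values deviating by more than 5% and more than 1%.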
Figure 3: Manhattan plot showing the chromosome regions associated with blood plasma urate concentration (A), and with combinatorial traits involving urate concentration (B) as determined by RegScan.