Alexander Sibley, Zhiguo Li, Yu Jiang, Yi-Ju Li, Cliburn Chan, Andrew Allen, Kouros Owzar.
Abstract
The score statistic continues to be a fundamental tool for statistical inference. In the analysis of data from high-throughput genomic assays, inference based on the score is usually more stable, considerably more computationally efficient, and more readily amenable to resampling methods than inference based on the asymptotically equivalent Wald or likelihood ratio tests. The score function often depends on a set of unknown nuisance parameters that must be replaced by estimators. Inference can be improved by calculating the efficient score, which accounts for the variability induced by estimating these parameters. Manual derivation of the efficient score is tedious and error-prone, so we illustrate the use of computer algebra to facilitate this derivation. We demonstrate this process in the context of a standard example from genetic association analyses, though the techniques shown here could be applied to any derivation and have a place in the toolbox of any modern statistician. We further show how the resulting symbolic expressions can be readily ported to compiled languages to develop fast numerical algorithms for high-throughput genomic analysis. We conclude by considering extensions of this approach. The code featured in this report is available online as part of the supplementary material.
Keywords: computer algebra; genome-wide association study; mathematical statistics; nuisance parameters; python; trio data
Year: 2017 PMID: 30122786 PMCID: PMC6092959 DOI: 10.1080/00031305.2017.1392361
Source DB: PubMed Journal: Am Stat ISSN: 0003-1305 Impact factor: 8.710