Hannah Verena Meyer, Ewan Birney.
Abstract
Motivation: Simulation is a critical part of method development and assessment. With the increasing sophistication of multi-trait and multi-locus genetic analysis techniques, it is important that the community has flexible simulation tools to challenge and explore the properties of these methods.
Year: 2018 PMID: 29617944 PMCID: PMC6129313 DOI: 10.1093/bioinformatics/bty197
Source DB: PubMed Journal: Bioinformatics ISSN: 1367-4803 Impact factor: 6.937
Fig. 1. Phenotype simulation scheme. PhenotypeSimulator takes genotypes from a number of different input formats and uses these as the basis for the simulation of the genetic effects. In addition to the genetic effects, non-genetic covariates, observational noise and non-genetic correlation structure can be simulated. The effect structure of the upper four components can be divided into a shared effect across traits or an independent effect for a number of traits, allowing for complex phenotype structures such as the simulation of pleiotropy. Before combining the phenotype components, they are scaled to a user-defined proportion of the total phenotypic variance. Finally, the simulated phenotype and its components can be saved into a number of different genetic output formats. Arrows, lines and rectangles mark the dimensions of each component.
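The scale-then-combine step described in the Fig. 1 caption can be illustrated with a minimal sketch. It is not the PhenotypeSimulator implementation (the package is written in R). The function names, the two-component setup (one shared genetic effect plus independent noise), the simulated genotypes and the choice to scale by the mean per-trait variance are all assumptions made for illustration.

```python
import random
import statistics


def mean_trait_variance(component):
    """Average, over traits (columns), of the per-trait variance of an N x P matrix."""
    n_traits = len(component[0])
    return sum(
        statistics.pvariance([row[j] for row in component]) for j in range(n_traits)
    ) / n_traits


def rescale(component, proportion):
    """Scale a component so its mean per-trait variance equals `proportion`."""
    factor = (proportion / mean_trait_variance(component)) ** 0.5
    return [[value * factor for value in row] for row in component]


def simulate(n_samples=200, n_traits=3, n_snps=5, genetic_proportion=0.4, seed=1):
    """Hypothetical two-component simulation: shared genetic effect plus noise."""
    rng = random.Random(seed)
    # Assumed genotypes: allele counts 0/1/2 at an allele frequency of 0.3.
    genotypes = [
        [sum(rng.random() < 0.3 for _ in range(2)) for _ in range(n_snps)]
        for _ in range(n_samples)
    ]
    # Shared effect: one effect size per SNP multiplied by one weight per trait.
    snp_effects = [rng.gauss(0, 1) for _ in range(n_snps)]
    trait_weights = [rng.gauss(0, 1) for _ in range(n_traits)]
    genetic = [
        [sum(g * b for g, b in zip(row, snp_effects)) * w for w in trait_weights]
        for row in genotypes
    ]
    # Observational noise: independent standard normal draws per sample and trait.
    noise = [[rng.gauss(0, 1) for _ in range(n_traits)] for _ in range(n_samples)]
    # Scale each component to its share of the total variance, then combine.
    genetic = rescale(genetic, genetic_proportion)
    noise = rescale(noise, 1 - genetic_proportion)
    phenotype = [
        [g + e for g, e in zip(g_row, e_row)] for g_row, e_row in zip(genetic, noise)
    ]
    return phenotype, {"genetic": genetic, "noise": noise}
```

Calling `simulate()` returns the combined phenotype together with its scaled components. In a finite sample the components are not exactly uncorrelated, so the variance of the combined phenotype is only approximately 1.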
Fig. 2. Phenotype simulation and genome-wide association study as a downstream application. (A) Heatmaps of the trait-by-trait Pearson correlation of a simulated phenotype Y and its five phenotype components: genetic variant effects XB, infinitesimal genetic effects U, non-genetic covariates WA, correlated non-genetic effects T and observational noise. The non-genetic covariates consist of four independent components, two following a binomial and two following a normal distribution. The genetic variant effects come from ten causal SNPs with a shared effect across all traits, yielding the strong correlation structure observed above (see Section 2). (B) Quantile-quantile plots of P-values from a multivariate linear mixed model (mvLMM) and univariate linear mixed models (uvLMM) fitted to each of the approximately eight million genome-wide SNPs (grey), including the ten SNPs for which a phenotype effect was modelled (green). The R code and a detailed description of the simulation and analysis are provided in the Supplementary Material (Simulation-and-LinearModel.pdf).
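For readers unfamiliar with the quantile-quantile plots in panel (B), the following sketch shows one common way to compute the plotted points from a list of P-values. The function name `qq_points` and the plotting positions (i - 0.5)/n are assumptions. The authors' supplementary R code may use a different convention.

```python
import math


def qq_points(p_values):
    """Return (expected, observed) -log10 P-value pairs, sorted in ascending order.

    Assumes every P-value lies in (0, 1]. Expected quantiles come from a
    uniform distribution using plotting positions (i - 0.5) / n.
    """
    n = len(p_values)
    observed = sorted(-math.log10(p) for p in p_values)
    expected = sorted(-math.log10((i - 0.5) / n) for i in range(1, n + 1))
    return list(zip(expected, observed))
```

Under the null hypothesis the points lie close to the diagonal. SNPs with a simulated phenotype effect would be expected to appear above it at the upper end of the plot.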