Megan Null, Josée Dupuis, Pezhman Sheinidashtegol, Ryan M Layer, Christopher R Gignoux, Audrey E Hendricks.
Abstract
Identification of rare-variant associations is crucial to full characterization of the genetic architecture of complex traits and diseases. Essential in this process is the evaluation of novel methods in simulated data that mirror the distribution of rare variants and haplotype structure in real data. Additionally, importing real-variant annotation enables in silico comparison of methods, such as rare-variant association tests and polygenic scoring methods, that focus on putative causal variants. Existing simulation methods are either unable to employ real-variant annotation or severely under- or overestimate the number of singletons and doubletons, thereby reducing the ability to generalize simulation results to real studies. We present RAREsim, a flexible and accurate rare-variant simulation algorithm. Using parameters and haplotypes derived from real sequencing data, RAREsim efficiently simulates the expected variant distribution and enables real-variant annotations. We highlight RAREsim's utility across various genetic regions, sample sizes, ancestries, and variant classes.
Keywords: RAREsim; rare variants; simulated data; simulated genetic variants
Year: 2022 PMID: 35298919 PMCID: PMC9069075 DOI: 10.1016/j.ajhg.2022.02.009
Source DB: PubMed Journal: Am J Hum Genet ISSN: 0002-9297 Impact factor: 11.043