Dörte Wittenburg, Sarah Bonk, Michael Doschoris, Henry Reyer.
Abstract
BACKGROUND: Single nucleotide polymorphisms (SNPs) that capture a significant impact on a trait can be identified with genome-wide association studies. High linkage disequilibrium (LD) among SNPs makes it difficult to identify causative variants correctly, so target regions are often reported instead of single SNPs. Sample size not only has a crucial impact on the precision of parameter estimates, it also determines whether a desired level of statistical power can be reached. We study the design of experiments for fine-mapping of signals of a quantitative trait locus in such a target region.
Keywords: Linkage disequilibrium; SNP-BLUP; Single nucleotide polymorphism; Statistical power; Target region
Year: 2020 PMID: 32600319 PMCID: PMC7324978 DOI: 10.1186/s12863-020-00871-1
Source DB: PubMed Journal: BMC Genet ISSN: 1471-2156 Impact factor: 2.797
Median of optimal sample size for detecting different numbers of QTL signals from 100 repetitions of simulations
| QTL signals | N=1 | N=5 | N=10 | Single | N=1 | N=5 | N=10 | Single | N=1 | N=5 | N=10 | Single |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | 128 | 126 | 127 | 195 | 57 | 58 | 57 | 91 | 34 | 33 | 34 | 56 |
| 2 | 275 | 269 | 273 | 382 | 125 | 126 | 122 | 175 | 73 | 70 | 73 | 106 |
| 3 | 421 | 426 | 436 | 569 | 214 | 201 | 205 | 259 | 126 | 120 | 120 | 155 |
| 4 | 613 | 540 | 584 | 756 | 291 | 288 | 281 | 342 | 177 | 170 | 170 | 204 |
| 5 | 763 | 713 | 685 | 943 | 385 | 349 | 344 | 426 | 228 | 208 | 207 | 253 |
Results are based on the multi-SNP approach (N = 1, 5, 10 families) or the single-SNP approach (Single). In each repetition, sample size was determined repeatedly for randomly drawn QTL positions, and the median was calculated.
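The two-stage aggregation described in the table note (a median within each repetition, then a median across repetitions) can be sketched as follows. This is only an illustrative sketch: `median_of_medians`, `toy_sample_size`, and the window of 300 SNPs are hypothetical names and values, and the toy rule is a stand-in for the actual optimal-sample-size calculation, which is not given here. In the study the parent generation was also re-simulated in every repetition, which this sketch omits.

```python
import random
import statistics


def median_of_medians(sample_size_fn, n_repetitions=100, n_draws=100,
                      n_snps=300, n_qtl=2, seed=1):
    """Median across repetitions of the per-repetition median sample size.

    sample_size_fn is a hypothetical callable that maps a list of QTL
    positions to a required sample size.
    """
    rng = random.Random(seed)
    per_repetition = []
    for _ in range(n_repetitions):
        sizes = []
        for _ in range(n_draws):
            # Draw n_qtl distinct SNP positions at random within the window.
            positions = rng.sample(range(n_snps), n_qtl)
            sizes.append(sample_size_fn(positions))
        # First stage: median over random draws within one repetition.
        per_repetition.append(statistics.median(sizes))
    # Second stage: median over all repetitions.
    return statistics.median(per_repetition)


def toy_sample_size(positions):
    # Toy stand-in (assumption): 100 individuals per QTL signal.
    return 100 * len(positions)


print(median_of_medians(toy_sample_size, n_repetitions=5, n_draws=7, n_qtl=3))
```

Passing the sample-size rule in as a callable keeps the aggregation logic separate from whichever model (multi-SNP or single-SNP) supplies the per-draw sample size.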
Fig. 1 Distribution of optimal sample size. Violin plot of n_opt vs. number of half-sib families for different numbers of QTL signals in a multi-SNP model. The parent generation was simulated 100 times and 100 random draws of positions of QTL signals were analyzed in each run, h² = 0.1. The diamond indicates the median of n_opt and the blue line marks the results based on a single-SNP model.
Fig. 2 Dependence between SNPs in a single simulated data set with N = 10 sires. (a) Correlation matrix R; (b) entries selected from R that belong to the 10 % highest sample sizes (n_opt ≥ 864). All possible SNP pairs were evaluated to detect two QTL signals (h² = 0.1).
Fig. 3 Sensitivity and specificity of testing SNP effects. The ROC curve is based on 100 × 100 repeated simulations of genotypes and phenotypes in a progeny generation comprising N = 10 half-sib families (two QTL signals, h² = 0.1). The optimal sample size suggested by the multi-SNP model was used to set up the progeny generation.
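As a reminder of how the two quantities in an ROC curve are obtained, the following minimal sketch computes sensitivity and specificity at each significance threshold from per-SNP p-values and known QTL labels. The function name `roc_points` and the four example values are made up for illustration and are not taken from the study.

```python
def roc_points(p_values, is_qtl):
    """Return (threshold, sensitivity, specificity) for each distinct p-value.

    A SNP is called significant when its p-value is <= threshold.
    is_qtl marks the SNPs that truly carry a QTL signal.
    """
    n_pos = sum(is_qtl)
    n_neg = len(is_qtl) - n_pos
    points = []
    for threshold in sorted(set(p_values)):
        # True positives: QTL SNPs that are called significant.
        tp = sum(1 for p, q in zip(p_values, is_qtl) if q and p <= threshold)
        # True negatives: non-QTL SNPs that are not called significant.
        tn = sum(1 for p, q in zip(p_values, is_qtl) if not q and p > threshold)
        points.append((threshold, tp / n_pos, tn / n_neg))
    return points


# Illustrative input (assumed values): two QTL SNPs, two null SNPs.
example = roc_points([0.001, 0.2, 0.03, 0.8], [True, False, True, False])
for threshold, sensitivity, specificity in example:
    print(threshold, sensitivity, specificity)
```

Plotting sensitivity against 1 - specificity over all thresholds gives the ROC curve.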
Fig. 4 Empirical bovine HD SNP chip data. (a) Correlation matrix for a randomly selected window containing 300 SNPs on BTA7. (b) Violin plot of n_opt vs. number of QTL signals to be detected. The diamond indicates the median of n_opt and the blue dots mark the results based on a single-SNP model, N = 10 and h² = 0.1.
Fig. 5 Relative effect size depending on number of QTL signals. The relative effects (blue dots) were calculated based on the assumption of heritability and number of QTL signals. The simulated relative effects (grey dots) were derived from simulated QTL effects on the observed genotype scale, allele frequencies at positions of QTL signals in the founder population, and the residual variance component. Simulation was based on h² = 0.1.