Sanne van den Berg, Jérémie Vandenplas, Fred A van Eeuwijk, Marcos S Lopes, Roel F Veerkamp.
Abstract
Significance testing for genome-wide association studies (GWAS) with increasing SNP density, up to whole-genome sequence (WGS) data, is not straightforward because of strong LD between SNPs and population stratification. Therefore, the objective of this study was to investigate genomic control and different significance testing procedures using data from a commercial pig breeding scheme. A GWAS was performed in GCTA with data on 4,964 Large White pigs using medium-density, high-density or imputed whole-genome sequence data, fitting a genomic relationship matrix based on a leave-one-chromosome-out approach to account for population structure. Subsequently, genomic inflation factors were assessed at the whole-genome level and at the chromosome level. To establish a significance threshold, permutation testing, Bonferroni corrections using either the total number of SNPs or the number of independent chromosome fragments, and false discovery rates (FDR) using either the Benjamini-Hochberg procedure or the Benjamini and Yekutieli procedure were evaluated. We found that genomic inflation factors did not differ between genotype densities but did differ between chromosomes. Also, neither the leave-one-chromosome-out approach for GWAS nor the use of pedigree relationships accounted appropriately for population stratification, and both gave strong genomic inflation. Regarding the procedures for significance testing, when the aim is to find QTL regions that are associated with a trait of interest, we recommend applying the FDR following the Benjamini and Yekutieli approach to establish a significance threshold that is adjusted for multiple testing. When the aim is to pinpoint a specific mutation, the more conservative Bonferroni correction based on the total number of SNPs is more appropriate, until an appropriate method is established to adjust for the number of independent tests.
Keywords: DNA analysis; genome-wide association studies; pig population; significance testing; whole-genome sequence
Year: 2019 PMID: 31215703 PMCID: PMC6900143 DOI: 10.1111/jbg.12419
Source DB: PubMed Journal: J Anim Breed Genet ISSN: 0931-2668 Impact factor: 2.380
Figure 1. Manhattan plots for the number of teats using either medium‐density, high‐density or iWGS genotypes
Figure 2. Inflation factors per chromosome and the total genome found with medium density, high density and iWGS
Significance thresholds and genomic inflation factors from permutation testing of chromosomes 4, 7 and 10 for medium and high densities and iWGS
| Chromosome | Density | Threshold | Genomic inflation factor |
|---|---|---|---|
| 4 | Medium | 4.178 | 0.997 (0.198) |
| 4 | High | 4.927 | 1.004 (0.204) |
| 4 | iWGS | 5.469 | 0.991 (0.196) |
| 7 | Medium | 4.232 | 1.003 (0.207) |
| 7 | High | 4.922 | 1.003 (0.201) |
| 7 | iWGS | 5.449 | 1.005 (0.210) |
| 10 | Medium | 4.100 | 1.003 (0.194) |
| 10 | High | 4.743 | 0.988 (0.188) |
| 10 | iWGS | 5.426 | 0.988 (0.185) |
p‐value thresholds are expressed as –log10 (p‐values).
Genomic inflation factors are averages over 1,000 permutations, with standard deviations in brackets.
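The genomic inflation factors reported here can be illustrated with a minimal sketch. It assumes the common definition (median observed 1-df chi-square statistic divided by its expected median under the null, about 0.455) and that the test statistics are recovered from two-sided p-values; the source does not state which exact computation was used, and the function names are hypothetical.

```python
from statistics import NormalDist, median

_STD_NORMAL = NormalDist()

# Median of a chi-square distribution with 1 degree of freedom (about 0.4549).
CHI2_1DF_MEDIAN = _STD_NORMAL.inv_cdf(0.75) ** 2


def chi2_from_p(p):
    """Convert a two-sided p-value (0 < p <= 1) to a 1-df chi-square statistic."""
    z = _STD_NORMAL.inv_cdf(p / 2.0)  # lower-tail quantile; squaring drops the sign
    return z * z


def genomic_inflation_factor(p_values):
    """Median observed chi-square over the expected median under the null (assumed definition)."""
    return median(chi2_from_p(p) for p in p_values) / CHI2_1DF_MEDIAN
```

Under this definition, p-values that are uniformly distributed give a factor close to 1, matching the permutation averages in the table, while values well above 1 indicate inflation.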
Details on the top 4 QTL used as a fixed effect in the GWAS model
| Chromosome | Position | −log10(p‐value) | SNP effect |
|---|---|---|---|
| 2 | 125.63 | 9.1 | −0.13 |
| 7 | 103.5 | 26.6 | 0.34 |
| 10 | 525.9 | 11.5 | −0.15 |
Position is given in megabase pairs (Mb).
Figure 3. Manhattan plot of GWAS without (upper) or with 3 QTL as fixed effect (lower)
Figure 4. Genomic inflation factors of GWAS without (grey) or with 3 QTL as fixed effect (blue)
Figure 5. Inflation factors found per chromosome and across the whole genome using a pedigree relationship matrix (A matrix), a genomic relationship matrix based on the leave‐one‐chromosome‐out approach (LOCO G) and a genomic relationship matrix based on all iWGS markers (Full G)
Significance thresholds of a Bonferroni correction using the total number of SNPs or the number of independent chromosome fragments for medium and high densities and iWGS
| Density | # SNP | Bonferroni_total threshold | Me | Bonferroni_Me threshold | FDR_BH threshold | FDR_BY threshold |
|---|---|---|---|---|---|---|
| Medium density | 34,588 | 5.84 | 198.9 | 3.60 | 2.48 | 4.19 |
| High density | 491,169 | 6.99 | 193.4 | 3.59 | 2.44 | 4.38 |
| iWGS | 10.2 M | 8.31 | 223.4 | 3.65 | 2.57 | 4.54 |
Bonferroni_total = 0.05/total number of SNPs.
Bonferroni_Me = 0.05/Me.
FDR_BH = false discovery rate (FDR) computed following Benjamini and Hochberg (1995) (BH).
FDR_BY = false discovery rate (FDR) computed following Benjamini and Yekutieli (2001) (BY).
Significance thresholds are expressed as –log10 (p‐values).
Me is the number of independent chromosome fragments, calculated with the formula proposed by Goddard et al. (2011).
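The Bonferroni entries in the table follow directly from the footnotes and can be reproduced with a short sketch. The FDR function below is a generic step-up implementation of the Benjamini-Hochberg and Benjamini-Yekutieli procedures, written as an illustration under the assumption of a 5% level; it is not the authors' code, and the function names are hypothetical.

```python
import math


def bonferroni_threshold(n_tests, alpha=0.05):
    """Bonferroni threshold alpha / n_tests, expressed as -log10(p)."""
    return -math.log10(alpha / n_tests)


def fdr_threshold(p_values, alpha=0.05, method="BH"):
    """Largest p-value declared significant by a step-up FDR procedure.

    method="BH": Benjamini and Hochberg (1995).
    method="BY": Benjamini and Yekutieli (2001), which divides the BH
    critical values by the harmonic sum 1 + 1/2 + ... + 1/m.
    Returns None when no test is significant.
    """
    m = len(p_values)
    penalty = sum(1.0 / i for i in range(1, m + 1)) if method == "BY" else 1.0
    threshold = None
    for rank, p in enumerate(sorted(p_values), start=1):
        if p <= rank * alpha / (m * penalty):
            threshold = p  # keep the largest p-value that passes its critical value
    return threshold


# Reproduces the Bonferroni columns, e.g. the medium-density row:
print(round(bonferroni_threshold(34_588), 2))  # Bonferroni_total
print(round(bonferroni_threshold(198.9), 2))   # Bonferroni_Me
```

Unlike the Bonferroni thresholds, the FDR thresholds depend on the observed p-value distribution, so they cannot be recomputed from the SNP counts alone.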
Figure 6. Number of SNPs with a significance level above a range of significance thresholds (without correction) for medium density (grey), high density (blue) and iWGS (orange)
Figure 7. Number of QTL regions with a significance level above a range of significance thresholds (without correction) for medium density (grey), high density (blue) and iWGS (orange)
Figure 8. Cumulative proportion of variance explained in the genomic relationship matrix by its eigenvalues
Figure 9. LD decay on chromosomes 2, 7 and 10 between medium‐density SNPs. LD was measured as r² between SNPs in bins of 50 kilobase pairs (kb)
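The LD decay summary in Figure 9 can be sketched as follows. This is a minimal illustration that assumes r² is the squared Pearson correlation between genotype vectors coded as allele counts (0/1/2) and that pairwise values are averaged within 50 kb distance bins; the source does not specify whether genotype-based or haplotype-based r² was used, and all names are hypothetical.

```python
import math
from collections import defaultdict


def r_squared(x, y):
    """Squared Pearson correlation between two genotype vectors (allele counts)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return float("nan")  # monomorphic SNP: r2 is undefined
    return sxy * sxy / (sxx * syy)


def ld_decay(positions_bp, genotypes, bin_size_bp=50_000):
    """Mean r2 per distance bin; each key is the lower edge of the bin in bp."""
    sums = defaultdict(float)
    counts = defaultdict(int)
    for i in range(len(positions_bp)):
        for j in range(i + 1, len(positions_bp)):
            r2 = r_squared(genotypes[i], genotypes[j])
            if math.isnan(r2):
                continue  # skip pairs involving a monomorphic SNP
            distance = abs(positions_bp[j] - positions_bp[i])
            lower_edge = (distance // bin_size_bp) * bin_size_bp
            sums[lower_edge] += r2
            counts[lower_edge] += 1
    return {edge: sums[edge] / counts[edge] for edge in sorted(sums)}
```

The all-pairs loop is quadratic in the number of SNPs, so for dense panels a practical version would restrict comparisons to SNP pairs within a maximum distance.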