Empirical estimation of genome-wide significance thresholds based on the 1000 Genomes Project data set.

Masahiro Kanai^1, Toshihiro Tanaka^1,2, Yukinori Okada^1,3,4.

Abstract

To assess the statistical significance of associations between variants and traits, genome-wide association studies (GWAS) should employ an appropriate threshold that accounts for the massive burden of multiple testing in the study. Although most studies in the current literature commonly set a genome-wide significance threshold at the level of P=5.0 × 10−8, the adequacy of this value for respective populations has not been fully investigated. To empirically estimate thresholds for different ancestral populations, we conducted GWAS simulations using the 1000 Genomes Phase 3 data set for Africans (AFR), Europeans (EUR), Admixed Americans (AMR), East Asians (EAS) and South Asians (SAS). The estimated empirical genome-wide significance thresholds were Psig=3.24 × 10−8 (AFR), 9.26 × 10−8 (EUR), 1.83 × 10−7 (AMR), 1.61 × 10−7 (EAS) and 9.46 × 10−8 (SAS). We additionally conducted trans-ethnic meta-analyses across all populations (ALL) and all populations except for AFR (ΔAFR), which yielded Psig=3.25 × 10−8 (ALL) and 4.20 × 10−8 (ΔAFR). Our results indicate that the current threshold (P=5.0 × 10−8) is overly stringent for all ancestral populations except for Africans; however, we should employ a more stringent threshold when conducting a meta-analysis, regardless of the presence of African samples.

Year: 2016 PMID: 27305981 PMCID: PMC5090169 DOI: 10.1038/jhg.2016.72

Source DB: PubMed Journal: J Hum Genet ISSN: 1434-5161 Impact factor: 3.172

Introduction

Genome-wide association studies (GWAS) have successfully identified thousands of loci associated with human diseases and traits.[1, 2] To assess the statistical significance of associations between tested variants and traits, GWAS should employ an appropriate threshold that accounts for the massive burden of multiple testing undertaken in the study.[3, 4] Although a variety of statistical approaches have been developed to estimate this burden, including the Bonferroni correction,[5, 6] Sidak correction,[7] false discovery rate[8] and permutation test, most GWAS set a genome-wide significance threshold at the level of P=5.0 × 10−8, which is equivalent to the Bonferroni-corrected threshold (α=0.05) for 1 million independent variants (approximately the number of independent single-nucleotide polymorphisms (SNPs) estimated using the HapMap Phase II data set[9]). The number of variants tested in recent GWAS, however, has increased dramatically because of the widespread use of genotype imputation using the 1000 Genomes data set as a reference[10, 11, 12, 13] or whole-genome sequencing,[14, 15, 16] and therefore the assumption of 1 million independent variants underlying this Bonferroni correction has become untenable. Additionally, the variants tested in a study inevitably depend on population-specific factors, such as linkage disequilibrium (LD) pattern and minor allele frequency (MAF), suggesting that the appropriate threshold for genome-wide significance might vary for different populations.[17] For example, the threshold for a population with lower LD, such as the African population, should be more stringent than that for a population with higher LD, as the number of independent markers tends to be greater in the former than in the latter. To address the non-independence of genetic markers in LD, several studies have proposed methods for estimating the effective number of independent tests Me;[17, 18, 19] however, the effectiveness of these methods remains unclear.

On the other hand, the current threshold, P=5.0 × 10−8, has been claimed to be overly stringent.[20, 21] A previous study showed that 73% of ‘borderline’ associations (5.0 × 10−8 < P …) …[20]

We report here empirical estimation of genome-wide significance thresholds for different populations based on GWAS simulations using the 1000 Genomes Phase 3 data set, the most recently released and widely used reference panel for genotype imputation, which contains five major ethnic ancestries. For each ancestral population in this data set, we tested associations of the variants with the simulated phenotypes and calculated empirical genome-wide significance thresholds based on the distributions of the minimum P-value of the associations. Our empirical estimation revealed that different thresholds should be adopted for different ancestral populations or trans-ethnic meta-analyses rather than the current single genome-wide significance threshold of P=5.0 × 10−8.
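
The conventional threshold discussed above follows from simple arithmetic, which can be sketched as below. The function names `bonferroni_threshold` and `sidak_threshold` are illustrative only and do not come from the authors' scripts; the Sidak form is included because it is one of the corrections listed above.

```python
import math

ALPHA = 0.05  # family-wise significance level used throughout the paper


def bonferroni_threshold(n_independent, alpha=ALPHA):
    """Per-test threshold alpha / n for n independent tests."""
    return alpha / n_independent


def sidak_threshold(n_independent, alpha=ALPHA):
    """Sidak threshold 1 - (1 - alpha)**(1/n), computed stably for large n."""
    return -math.expm1(math.log1p(-alpha) / n_independent)


# One million independent variants gives the conventional 5.0e-8 level;
# the Sidak value for the same number of tests is marginally larger.
print(f"Bonferroni: {bonferroni_threshold(1_000_000):.3e}")
print(f"Sidak:      {sidak_threshold(1_000_000):.3e}")
```

Under either correction, a larger number of independent variants gives a smaller (more stringent) per-variant threshold, which is why populations with lower LD are expected to need stricter thresholds.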

Materials and methods

Samples and ancestral populations

We used the 1000 Genomes Project[11, 12] (http://www.1000genomes.org/) Phase 3 data set (version 5), which comprises approximately 51 million variants (autosomes and chromosome X) from 2504 individuals in 26 populations (Table 1). We split the data set into five ancestral populations: African (AFR; n=661), European (EUR; n=503), Admixed American (AMR; n=347), East Asian (EAS; n=504) and South Asian (SAS; n=489). For each ancestral population, we excluded SNPs that were monomorphic, were singletons or had MAF<0.5%, and obtained 21 048 933, 11 980 247, 14 261 439, 10 201 713 and 12 641 702 variants for AFR, EUR, AMR, EAS and SAS, respectively.

Table 1

Overview of the 1000 Genomes Phase 3 (version 5) samples

			No. of samples
Ancestral population	Subpopulation	Code	Male	Female	Total	No. of variants^a (MAF>0.5%)
AFR	African Caribbeans in Barbados	ACB	47	49	96	21 048 933
	Americans of African Ancestry in SW USA	ASW	26	35	61
	Esan in Nigeria	ESN	53	46	99
	Gambian in Western Divisions in the Gambia	GWD	55	58	113
	Luhya in Webuye, Kenya	LWK	44	55	99
	Mende in Sierra Leone	MSL	42	43	85
	Yoruba in Ibadan, Nigeria	YRI	52	56	108
	Subtotal		319	342	661

EUR	Utah Residents (CEPH) with Northern and Western European Ancestry	CEU	49	50	99	11 980 247
	Finnish in Finland	FIN	38	61	99
	British in England and Scotland	GBR	46	45	91
	Iberian Population in Spain	IBS	54	53	107
	Toscani in Italia	TSI	53	54	107
	Subtotal		240	263	503

AMR	Colombians from Medellin, Colombia	CLM	43	51	94	14 261 439
	Mexican Ancestry from Los Angeles, USA	MXL	32	32	64
	Peruvians from Lima, Peru	PEL	41	44	85
	Puerto Ricans from Puerto Rico	PUR	54	50	104
	Subtotal		170	177	347

EAS	Chinese Dai in Xishuangbanna, China	CDX	44	49	93	10 201 713
	Han Chinese in Beijing, China	CHB	46	57	103
	Southern Han Chinese	CHS	52	53	105
	Japanese in Tokyo, Japan	JPT	56	48	104
	Kinh in Ho Chi Minh City, Vietnam	KHV	46	53	99
	Subtotal		244	260	504

SAS	Bengali from Bangladesh	BEB	42	44	86	12 641 702
	Gujarati Indian from Houston, Texas	GIH	56	47	103
	Indian Telugu from the UK	ITU	59	43	102
	Punjabi from Lahore, Pakistan	PJL	48	48	96
	Sri Lankan Tamil from the UK	STU	55	47	102
	Subtotal		260	229	489

Total			1233	1271	2504	28 993 742

Abbreviations: AFR, African; AMR, Admixed American; EAS, East Asian; EUR, European; MAF, minor allele frequency; SAS, South Asian.

^a MAF was calculated within each ancestral population.

GWAS simulations

To empirically estimate appropriate genome-wide significance thresholds for different ancestral populations, we calculated empirical null distributions of the minimum P-values of the variants by randomly simulating case–control phenotypes. We conducted the simulations 100 000 times for each ancestral population using a permutation procedure. For each iteration, we randomly assigned case–control phenotypes at a ratio of 1:1 within each subpopulation of the ancestral population. For autosomal variants, we tested associations with the simulated phenotypes using a logistic regression model in the PLINK 1.9 software (https://www.cog-genomics.org/plink2).[22, 23] To account for potential population stratification, we included the top two principal components as covariates in the model; these were calculated for each ancestral population using the smartpca program in the EIGENSOFT 6.0.1 package (http://www.hsph.harvard.edu/alkes-price/software/).[24] Additionally, we applied post-genomic control (GC) correction[25] in each simulation if the population-specific genomic inflation factor λGC was >1. For chromosome X variants, we first split each population into males and females and conducted separate analyses using the same procedure as described for autosomal variants. We then performed a meta-analysis across male and female subjects and combined the result with the autosomal results for use in the meta-analysis across all ancestral populations.
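
Two steps of this procedure, the stratified 1:1 phenotype assignment and the conditional GC correction, can be illustrated with the following minimal sketch. It is not the authors' code: the function names are hypothetical, the handling of odd-sized subpopulations (one extra control) is an assumption the text does not specify, and the association test itself (PLINK logistic regression with principal components) is not reproduced here.

```python
import random
import statistics

CHI2_1DF_MEDIAN = 0.4549364  # median of the chi-square distribution with 1 d.f.


def assign_phenotypes(subpop_of, rng):
    """Randomly label samples as case (1) or control (0), 1:1 within each subpopulation."""
    by_subpop = {}
    for sample, subpop in subpop_of.items():
        by_subpop.setdefault(subpop, []).append(sample)
    phenotype = {}
    for samples in by_subpop.values():
        shuffled = sorted(samples)
        rng.shuffle(shuffled)
        n_cases = len(shuffled) // 2  # assumption: odd-sized groups get one extra control
        for i, sample in enumerate(shuffled):
            phenotype[sample] = 1 if i < n_cases else 0
    return phenotype


def genomic_control(chi2_stats):
    """Return (lambda_GC, corrected statistics); rescale only when lambda_GC > 1."""
    lam = statistics.median(chi2_stats) / CHI2_1DF_MEDIAN
    if lam > 1:
        return lam, [x / lam for x in chi2_stats]
    return lam, list(chi2_stats)


# Toy usage with made-up sample IDs in two subpopulation codes from Table 1.
rng = random.Random(0)
toy = {f"sample{i}": ("CEU" if i < 10 else "FIN") for i in range(16)}
labels = assign_phenotypes(toy, rng)
print(sum(labels.values()), "cases among", len(labels), "samples")
```

Assigning labels within each subpopulation keeps the case fraction identical across subpopulations, so the simulated phenotype is not confounded with subpopulation membership by construction.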

Meta-analysis

To simulate trans-ethnic meta-analysis, we performed a GWAS meta-analysis for a given iteration across all ancestral populations using the inverse-variance method with the assumption of a fixed-effect model.[26] We included 28 993 742 variants that existed in at least one ancestral population. To prevent potential inflation from the inclusion of AFR samples, we also performed an additional meta-analysis that excluded AFR but included all other ancestries (that is, EUR, AMR, EAS and SAS).
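
For a single variant, the inverse-variance fixed-effect combination can be sketched as follows. The function name `fixed_effect_meta` is hypothetical, and the inputs are assumed to be per-population log odds ratios and their standard errors for whichever populations carry the variant.

```python
import math


def fixed_effect_meta(betas, ses):
    """Inverse-variance-weighted fixed-effect estimate, its SE and two-sided P-value."""
    weights = [1.0 / se ** 2 for se in ses]
    total = sum(weights)
    beta = sum(w * b for w, b in zip(weights, betas)) / total
    se = math.sqrt(1.0 / total)
    z = beta / se
    p = math.erfc(abs(z) / math.sqrt(2.0))  # two-sided normal tail probability
    return beta, se, p


# Toy usage: a variant observed in two populations with equal precision.
beta, se, p = fixed_effect_meta([0.1, 0.3], [0.1, 0.1])
print(f"beta={beta:.3f} se={se:.4f} P={p:.3g}")
```

A variant present in only one population passes through unchanged, which is consistent with including every variant that exists in at least one ancestral population.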

Estimation of an empirical genome-wide significance

We measured the distributions of the minimum P-values of the variants (Pmin) for each ancestral population and meta-analysis result. We defined an empirical genome-wide significance threshold, −log10 Psig, as the 95th percentile (1−α) of −log10 Pmin at a significance level of α=0.05. We calculated −log10 Psig using the Harrell–Davis distribution-free quantile estimator[27] and calculated the 95% confidence interval for −log10 Psig by bootstrapping. We also estimated the effective number of independent variants by dividing the significance level α=0.05 by Psig, treating Psig as a Bonferroni-corrected threshold, and calculated the ratio of the effective number of independent variants to the total number of variants after quality control. All calculations were performed using the authors' scripts (http://mkanai.github.io/). To confirm the robustness of our approach to different MAF thresholds (0.1, 1 and 5%), different numbers of principal components (5, 10 and 20) and the omission of post-GC correction, we additionally estimated empirical genome-wide significance thresholds under these conditions. Because of their intensive computational cost, these additional estimations used only 10 000 permutations each, except for the one without post-GC correction.
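
The estimation step can be sketched as below under two stated simplifications. First, a plain linear-interpolation sample quantile is used in place of the Harrell–Davis estimator, and the interval is a simple percentile bootstrap. Second, the null Pmin values come from a toy model of independent uniform P-values rather than from GWAS simulations. All function names are hypothetical.

```python
import math
import random


def quantile(sorted_values, q):
    """Linear-interpolation sample quantile of an already sorted list."""
    pos = q * (len(sorted_values) - 1)
    lo = math.floor(pos)
    hi = min(lo + 1, len(sorted_values) - 1)
    return sorted_values[lo] + (pos - lo) * (sorted_values[hi] - sorted_values[lo])


def empirical_threshold(p_min, alpha=0.05, n_boot=200, rng=None):
    """Return (P_sig, (ci_low, ci_high)) from null minimum P-values."""
    rng = rng or random.Random()
    neg_log = sorted(-math.log10(p) for p in p_min)
    point = quantile(neg_log, 1 - alpha)
    boots = sorted(
        quantile(sorted(rng.choices(neg_log, k=len(neg_log))), 1 - alpha)
        for _ in range(n_boot)
    )
    # Percentile bootstrap interval on the -log10 scale, mapped back to P-values.
    upper_log, lower_log = quantile(boots, 0.975), quantile(boots, 0.025)
    return 10 ** -point, (10 ** -upper_log, 10 ** -lower_log)


def simulate_null_p_min(n_tests, n_iter, rng):
    """Toy null: minimum of n_tests independent uniform P-values, drawn directly."""
    # P(min > x) = (1 - x)**n_tests, so inverse-transform sampling gives the draw.
    return [-math.expm1(math.log(1.0 - rng.random()) / n_tests) for _ in range(n_iter)]


rng = random.Random(2016)  # arbitrary seed, only to make the toy run reproducible
p_min = simulate_null_p_min(n_tests=1_000_000, n_iter=2_000, rng=rng)
p_sig, (ci_low, ci_high) = empirical_threshold(p_min, n_boot=100, rng=rng)
print(f"P_sig ~ {p_sig:.2e} (95% CI {ci_low:.2e} to {ci_high:.2e})")
```

With one million independent tests the toy estimate should fall near the conventional 5.0 × 10−8 level, which gives a quick sanity check of the percentile definition before applying it to real simulation output.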

LD pruning

Given that a population-specific LD structure significantly affects the number of independent variants in a population, we evaluated how Psig would reflect the effective number of independent variants estimated using the LD-based approach.[17] We applied LD pruning with the PLINK 1.9 software,[22, 23] using a 40-kb sliding window size, a 4-kb window step size and a maximum r2 threshold ranging from 0.1 to 1.0 in increments of 0.1. The number of remaining variants after LD pruning was considered as the effective number of independent variants. We calculated the LD-based genome-wide significance threshold by dividing the significance level α=0.05 by the population-specific effective number of independent variants, given the Bonferroni-corrected threshold. The effective ratio was defined as the ratio of the effective number of independent variants to the total number of variants after quality control.
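
The idea of counting variants that survive LD pruning can be illustrated with a deliberately simplified greedy procedure. This sketch is not PLINK's pruning algorithm and will not reproduce its counts: it uses genotype-dosage correlation, a distance-only window and no step size, and the names `r_squared` and `greedy_ld_prune` are hypothetical.

```python
def r_squared(x, y):
    """Squared Pearson correlation between two genotype dosage vectors (0/1/2)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:  # monomorphic variant: correlation undefined
        return 0.0
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    return sxy * sxy / (sxx * syy)


def greedy_ld_prune(variants, window_bp=40_000, max_r2=0.5):
    """Keep a variant only if r2 <= max_r2 with every kept variant within window_bp.

    `variants` is a list of (position, genotypes) pairs on one chromosome.
    """
    kept = []
    for pos, geno in sorted(variants, key=lambda v: v[0]):
        nearby = [g for p, g in kept if pos - p <= window_bp]
        if all(r_squared(geno, g) <= max_r2 for g in nearby):
            kept.append((pos, geno))
    return kept


# Toy usage: the second variant duplicates the first and is pruned;
# the fourth duplicates it too but lies outside the 40-kb window.
toy = [
    (1_000, [0, 1, 2, 0, 1, 2]),
    (2_000, [0, 1, 2, 0, 1, 2]),
    (3_000, [0, 0, 1, 1, 2, 2]),
    (100_000, [0, 1, 2, 0, 1, 2]),
]
print([pos for pos, _ in greedy_ld_prune(toy)])
```

The number of variants returned plays the role of the effective number of independent variants, and lowering `max_r2` prunes more aggressively, which is why the r2 threshold was varied from 0.1 to 1.0.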

Results

Empirical genome-wide significance

Based on 100 000 GWAS simulations, we measured the −log10 Pmin distribution for each ancestral population and meta-analysis result (Figure 1). The empirical genome-wide significance thresholds for AFR, EUR, AMR, EAS and SAS were Psig=3.24 × 10−8 (95% confidence interval: 3.11–3.36 × 10−8); 9.26 × 10−8 (9.01–9.51 × 10−8); 1.83 × 10−7 (1.79–1.87 × 10−7); 1.61 × 10−7 (1.57–1.64 × 10−7) and 9.46 × 10−8 (9.20–9.69 × 10−8), respectively (Table 2). These results indicate that, with the exception of the African population, each ancestral population requires a different genome-wide significance threshold that is more lenient (by roughly two- to fourfold) than the current threshold of P=5.0 × 10−8.

Figure 1

The −log10 Pmin distributions for five ancestral populations and meta-analysis results. We conducted GWAS simulations using the 1000 Genomes Phase 3 data set and measured the minimum P-value of the variants (Pmin). Each panel represents a population/meta-analysis result. The vertical bar in each panel represents the 95th percentile of −log10 Pmin (that is, the estimated empirical genome-wide significance threshold −log10 Psig). The dotted vertical bar represents the common genome-wide significance threshold of 5.0 × 10−8. AFR, African; AMR, Admixed American; EAS, East Asian; EUR, European; SAS, South Asian; ALL, meta-analysis across all ancestral populations; ΔAFR, meta-analysis including all ancestral populations except for AFR (that is, EUR, AMR, EAS and SAS).

Table 2

Estimated genome-wide significance thresholds for ancestral populations and meta-analyses

Ancestry	P_sig (−log₁₀ P_sig)^a	95% CI^a	No. of variants^b (MAF>0.5%)	No. of effective variants^c	Ratio
AFR	3.24 × 10⁻⁸ (7.49)	3.11 × 10⁻⁸–3.36 × 10⁻⁸ (7.47–7.51)	21 048 933	1 545 429	0.073
EUR	9.26 × 10⁻⁸ (7.03)	9.01 × 10⁻⁸–9.51 × 10⁻⁸ (7.02–7.05)	11 980 247	540 128	0.045
AMR	1.83 × 10⁻⁷ (6.74)	1.79 × 10⁻⁷–1.87 × 10⁻⁷ (6.73–6.75)	14 261 439	273 444	0.019
EAS	1.61 × 10⁻⁷ (6.79)	1.57 × 10⁻⁷–1.64 × 10⁻⁷ (6.78–6.80)	10 201 713	311 275	0.031
SAS	9.46 × 10⁻⁸ (7.02)	9.20 × 10⁻⁸–9.69 × 10⁻⁸ (7.01–7.04)	12 641 702	528 484	0.042

ALL	3.25 × 10⁻⁸ (7.49)	3.16 × 10⁻⁸–3.33 × 10⁻⁸ (7.48–7.50)	28 993 742	1 539 237	0.053
ΔAFR	4.20 × 10⁻⁸ (7.38)	4.08 × 10⁻⁸–4.33 × 10⁻⁸ (7.37–7.39)	19 862 732	1 189 822	0.060

Abbreviations: AFR, African; ALL, meta-analysis across all ancestral populations; AMR, Admixed American; CI, confidence interval; EAS, East Asian; EUR, European; MAF, minor allele frequency; SAS, South Asian; ΔAFR, meta-analysis including all ancestral populations except for AFR (that is, EUR, AMR, EAS and SAS).

^a Psig (the 5th percentile of Pmin) was calculated from the 95th percentile of −log10 Pmin.

^b MAF was calculated within each ancestral population.

^c The effective number of independent variants was calculated by dividing the significance level α=0.05 by Psig.
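
The last two columns of Table 2 can be approximately recomputed from the printed Psig values, as in the following sketch (the function name `effective_variants` is hypothetical). Because the printed Psig values are rounded to three significant figures, the recomputed counts differ slightly from the tabulated ones.

```python
ALPHA = 0.05

# P_sig and post-QC variant counts as printed in Table 2.
TABLE2 = {
    "AFR": (3.24e-8, 21_048_933),
    "EUR": (9.26e-8, 11_980_247),
    "AMR": (1.83e-7, 14_261_439),
    "EAS": (1.61e-7, 10_201_713),
    "SAS": (9.46e-8, 12_641_702),
}


def effective_variants(p_sig, alpha=ALPHA):
    """Bonferroni-equivalent number of independent tests implied by P_sig."""
    return alpha / p_sig


for ancestry, (p_sig, n_variants) in TABLE2.items():
    m_e = effective_variants(p_sig)
    print(f"{ancestry}: effective variants ~ {m_e:,.0f}, ratio ~ {m_e / n_variants:.3f}")
```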

Trans-ethnic meta-analysis

Using the same procedure, we measured the −log10 Pmin distribution for trans-ethnic meta-analysis results (Figure 1). The estimated Psig values for ALL and ΔAFR were 3.25 × 10−8 (3.16–3.33 × 10−8) and 4.20 × 10−8 (4.08–4.33 × 10−8), respectively (Table 2). Compared with the current threshold for single-population GWAS (P=5.0 × 10−8), our estimations for both trans-ethnic meta-analyses (ALL and ΔAFR) are more stringent, regardless of whether the data set contained African samples or not. We note that our empirical estimations remained approximately the same when using different MAF thresholds (0.1, 1 and 5%) or different numbers of principal components (5, 10 and 20) for calculations (Supplementary Tables S1 and S2). With regard to post-GC correction, although the empirical thresholds without the correction were slightly more stringent, as expected, the discrepancy was too small to alter our conclusions (Supplementary Table S3).

Relationship between a population-specific LD structure and Psig

We applied LD pruning to each population using a maximum r2 threshold of 0.5 (Table 3; for a complete list, see Supplementary Tables S4 and S5). Based on the effective number of independent variants, we calculated an LD-based genome-wide significance threshold (PLD) by dividing the significance level α=0.05 by that number, as in a Bonferroni correction (Figure 2). For most ancestries (AFR, EUR, EAS and SAS), −log10 Psig was approximately positively correlated with −log10 PLD, suggesting that our empirical genome-wide significance thresholds corresponded to the population-specific LD structure, as expected. However, we found that AMR was an outlier among the ancestral populations, with a substantial imbalance in the effective number of independent variants among its subpopulations (Table 3). Although the effective numbers of independent variants for each subpopulation were well balanced in the other ancestries, the numbers for CLM (Colombians from Medellin, Colombia) and PUR (Puerto Ricans from Puerto Rico) were higher than those for the other subpopulations in AMR, leading to a potential increase in the overall effective number of independent variants for AMR.

Table 3

Estimated effective number of independent variants in the AMR subpopulations by LD pruning

Code	No. of variants^a (MAF>0.5%)	No. of effective variants^b	Ratio	P_LD (−log₁₀ P_LD)
AMR	14 261 439	2 129 877	0.149	2.35 × 10⁻⁸ (7.63)
CLM	7 512 590	1 343 116	0.179	3.72 × 10⁻⁸ (7.43)
MXL	7 218 484	985 773	0.137	5.07 × 10⁻⁸ (7.29)
PEL	6 570 123	873 604	0.133	5.72 × 10⁻⁸ (7.24)
PUR	7 735 691	1 542 788	0.199	3.24 × 10⁻⁸ (7.49)

Abbreviations: AMR, Admixed American; CLM, Colombians from Medellin, Colombia; LD, linkage disequilibrium; MAF, minor allele frequency; MXL, Mexican Ancestry from Los Angeles, USA; PEL, Peruvians from Lima, Peru; PUR, Puerto Ricans from Puerto Rico.

^a MAF was calculated within each population.

^b The effective number of independent variants was estimated by LD-based pruning (sliding window size: 40 kb; window step size: 4 kb; r2<0.5).
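
The Ratio and P_LD columns of Table 3 follow directly from the two variant counts, as the short check below illustrates (the function name `ld_based_threshold` is hypothetical; the counts are copied from the table).

```python
import math

ALPHA = 0.05

# (variants after QC, variants remaining after LD pruning at r2 < 0.5), from Table 3.
TABLE3 = {
    "AMR": (14_261_439, 2_129_877),
    "CLM": (7_512_590, 1_343_116),
    "MXL": (7_218_484, 985_773),
    "PEL": (6_570_123, 873_604),
    "PUR": (7_735_691, 1_542_788),
}


def ld_based_threshold(n_effective, alpha=ALPHA):
    """Bonferroni threshold for the number of variants left after LD pruning."""
    return alpha / n_effective


for code, (n_total, n_eff) in TABLE3.items():
    p_ld = ld_based_threshold(n_eff)
    print(
        f"{code}: ratio={n_eff / n_total:.3f}  "
        f"P_LD={p_ld:.2e}  -log10 P_LD={-math.log10(p_ld):.2f}"
    )
```

The higher ratios for CLM and PUR relative to MXL and PEL are the imbalance among AMR subpopulations noted in the text.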

Figure 2

The relationship between −log10 PLD and −log10 Psig. We calculated the LD-based genome-wide significance PLD based on the effective number of independent variants, which was estimated by applying LD pruning with a maximum r2 threshold of 0.5. Whereas −log10 Psig showed approximately positive correlation with −log10 PLD for AFR, EUR, EAS and SAS (blue), AMR (red) is an outlier. The error bars represent the 95% CI for −log10 Psig. The dotted lines represent the common genome-wide significance threshold of P=5.0 × 10−8. AFR, African; AMR, Admixed American; EAS, East Asian; EUR, European; SAS, South Asian.

Discussion

In the present study, we estimated the empirical genome-wide significance thresholds for the five ancestral populations based on GWAS simulations conducted using the 1000 Genomes Project Phase 3 data set. The results suggest that, for non-African populations, a threshold less stringent than the current level of P=5.0 × 10−8 could be applied. On the other hand, the meta-analysis results revealed that more stringent thresholds should be adopted in meta-analyses, regardless of the inclusion of African samples. Our empirical estimation based on the 1000 Genomes Project will be applicable to various studies, as most current studies conduct genotype imputation using the same data set.

To date, an increasing number of studies have conducted trans-ethnic meta-analyses to improve the power to identify susceptibility loci by combining extremely large numbers of samples from single-population studies.[28] Although these studies commonly adopted the same genome-wide significance threshold (P=5.0 × 10−8) used in single-population GWAS, few have scrutinized the stringency of this threshold for preventing false positives. Our present study fills this gap and suggests that a more stringent threshold is needed for trans-ethnic meta-analysis even when African samples are absent from the data set.

Li et al.[19] reported genome-wide significance thresholds for AFR, ASN (Asian) and EUR in the 1000 Genomes data set (released in August 2010) of 1.62 × 10−8, 3.47 × 10−8 and 3.06 × 10−8, respectively, based on the calculation of the effective number of independent markers using eigenvalues. As the number of samples and genotypes in the data set differed, we additionally applied their method to each population (AFR, EUR, AMR, EAS and SAS) in our data set, obtaining 4.94 × 10−9, 1.09 × 10−8, 9.05 × 10−9, 1.40 × 10−8 and 9.97 × 10−9, respectively. Our estimated thresholds were more lenient than both the previously reported thresholds and those we additionally calculated for our data set using their method. This discrepancy arguably suggests the importance of empirical estimation, given the complex genetic backgrounds resulting from different LD structures among ancestral populations.

Considering the limited sample size (~2500) of the data set, our empirical estimation might not fully reflect the genetic backgrounds of humans. The 1000 Genomes Project estimated its power to detect SNPs to be >95% for those with a sample frequency of at least 0.5% and >75% for those with a frequency of 0.1% in Europeans.[11] Although it is difficult to assess exactly how well a data set of this sample size reflects current populations, we envisage that future panels will resolve the issue by enabling new empirical estimations, given the recent efforts in the field to create much larger reference panels, such as the Haplotype Reference Consortium (http://www.haplotype-reference-consortium.org/).

Although the least stringent genome-wide significance threshold (Psig=1.83 × 10−7) was estimated for the AMR population, we note that further investigations would be required to fully assess the confounding bias resulting from the complex LD structure of this recently admixed population, such as long-range LD regions.[29] The observation of AMR as an outlier (Figure 2) suggests that the Psig estimated from an empirical distribution of associations does not simply reflect the population-specific LD structure but also other underlying dependencies. A recent study revealed that South American populations have different admixture histories, which have resulted in diverse proportions of African, European, Native American and Asian ancestries.[30] Association studies of such complex admixed populations should be carefully conducted to avoid potential false positives.

Additionally, in a typical GWAS today, genotype imputation is commonly conducted to fine-map causal variants and increase power,[10, 13] so its potential effect on our empirical estimations should be addressed. Although we used all variants in the data set that passed our quality control criteria, some variants would not be well imputed in a typical study, depending on the genotyping platform used. By defining the imputable variants of the data set with reference to the ‘SNP and indel imputability database’[31] (http://www.unc.edu/~yunmli/1000G-imp/) for each combination of genotyping platform and ancestral population, we observed that the more variants an array has, the more stringent Psig is (Supplementary Table S6). We note that, as the database was constructed using the Phase 1 data set (version 3), we cannot simply compare the original results to those with only imputable variants. The relationship between array density and Psig supports the application of a more lenient threshold in current imputation-based single-population studies.

In this paper, we have presented empirically estimated genome-wide significance thresholds based on the 1000 Genomes data set. Despite the computational cost, our study illustrates the value of empirical estimation for genetic data by calculating the empirical genome-wide significance threshold. The results indicate that we should adopt a more stringent threshold compared with the current level of P=5.0 × 10−8 in future studies of African samples or trans-ethnic meta-analyses, whereas the threshold might be relaxed for non-African studies.

References (10 of 28 listed)

1. Genomic control for association studies.

Authors: B Devlin; K Roeder
Journal: Biometrics Date: 1999-12 Impact factor: 2.571

2. Principal components analysis corrects for stratification in genome-wide association studies.

Authors: Alkes L Price; Nick J Patterson; Robert M Plenge; Michael E Weinblatt; Nancy A Shadick; David Reich
Journal: Nat Genet Date: 2006-07-23 Impact factor: 38.330

Review 3. Multiple testing in the genomics era: findings from Genetic Analysis Workshop 15, Group 15.

Authors: Lisa J Martin; Jessica G Woo; Christy L Avery; Huann-Sheng Chen; Kari E North; Kinman Au; Philippe Broët; Cyril Dalmasso; Mickael Guedj; Peter Holmans; Baisong Huang; Po-Hsiu Kuo; Alex C Lam; Hao Li; Alisa Manning; Ivan Nikolov; Ritwik Sinha; Jianxin Shi; Kijoung Song; Meredith Tabangin; Rui Tang; Ryo Yamada
Journal: Genet Epidemiol Date: 2007 Impact factor: 2.135

4. Potential etiologic and functional implications of genome-wide association loci for human diseases and traits.

Authors: Lucia A Hindorff; Praveen Sethupathy; Heather A Junkins; Erin M Ramos; Jayashri P Mehta; Francis S Collins; Teri A Manolio
Journal: Proc Natl Acad Sci U S A Date: 2009-05-27 Impact factor: 11.205

5. Long-range LD can confound genome scans in admixed populations.

Authors: Alkes L Price; Michael E Weale; Nick Patterson; Simon R Myers; Anna C Need; Kevin V Shianna; Dongliang Ge; Jerome I Rotter; Esther Torres; Kent D Taylor; David B Goldstein; David Reich
Journal: Am J Hum Genet Date: 2008-07 Impact factor: 11.025

6. Addressing population-specific multiple testing burdens in genetic association studies.

Authors: Rafal S Sobota; Daniel Shriner; Nuri Kodaman; Robert Goodloe; Wei Zheng; Yu-Tang Gao; Todd L Edwards; Christopher I Amos; Scott M Williams
Journal: Ann Hum Genet Date: 2015-01-22 Impact factor: 1.670

7. Correcting away the hidden heritability.

Authors: Scott M Williams; Jonathan L Haines
Journal: Ann Hum Genet Date: 2011-02-24 Impact factor: 1.670

8. Large-scale whole-genome sequencing of the Icelandic population.

Authors: Daniel F Gudbjartsson; Hannes Helgason; Sigurjon A Gudjonsson; Florian Zink; Asmundur Oddson; Arnaldur Gylfason; Soren Besenbacher; Gisli Magnusson; Bjarni V Halldorsson; Eirikur Hjartarson; Gunnar Th Sigurdsson; Simon N Stacey; Michael L Frigge; Hilma Holm; Jona Saemundsdottir; Hafdis Th Helgadottir; Hrefna Johannsdottir; Gunnlaugur Sigfusson; Gudmundur Thorgeirsson; Jon Th Sverrisson; Solveig Gretarsdottir; G Bragi Walters; Thorunn Rafnar; Bjarni Thjodleifsson; Einar S Bjornsson; Sigurdur Olafsson; Hildur Thorarinsdottir; Thora Steingrimsdottir; Thora S Gudmundsdottir; Asgeir Theodors; Jon G Jonasson; Asgeir Sigurdsson; Gyda Bjornsdottir; Jon J Jonsson; Olafur Thorarensen; Petur Ludvigsson; Hakon Gudbjartsson; Gudmundur I Eyjolfsson; Olof Sigurdardottir; Isleifur Olafsson; David O Arnar; Olafur Th Magnusson; Augustine Kong; Gisli Masson; Unnur Thorsteinsdottir; Agnar Helgason; Patrick Sulem; Kari Stefansson
Journal: Nat Genet Date: 2015-03-25 Impact factor: 38.330

9. A second generation human haplotype map of over 3.1 million SNPs.

Authors: Kelly A Frazer; Dennis G Ballinger; David R Cox; David A Hinds; Laura L Stuve; Richard A Gibbs; John W Belmont; Andrew Boudreau; Paul Hardenbol; Suzanne M Leal; Shiran Pasternak; David A Wheeler; Thomas D Willis; Fuli Yu; Huanming Yang; Changqing Zeng; Yang Gao; Haoran Hu; Weitao Hu; Chaohua Li; Wei Lin; Siqi Liu; Hao Pan; Xiaoli Tang; Jian Wang; Wei Wang; Jun Yu; Bo Zhang; Qingrun Zhang; Hongbin Zhao; Hui Zhao; Jun Zhou; Stacey B Gabriel; Rachel Barry; Brendan Blumenstiel; Amy Camargo; Matthew Defelice; Maura Faggart; Mary Goyette; Supriya Gupta; Jamie Moore; Huy Nguyen; Robert C Onofrio; Melissa Parkin; Jessica Roy; Erich Stahl; Ellen Winchester; Liuda Ziaugra; David Altshuler; Yan Shen; Zhijian Yao; Wei Huang; Xun Chu; Yungang He; Li Jin; Yangfan Liu; Yayun Shen; Weiwei Sun; Haifeng Wang; Yi Wang; Ying Wang; Xiaoyan Xiong; Liang Xu; Mary M Y Waye; Stephen K W Tsui; Hong Xue; J Tze-Fei Wong; Luana M Galver; Jian-Bing Fan; Kevin Gunderson; Sarah S Murray; Arnold R Oliphant; Mark S Chee; Alexandre Montpetit; Fanny Chagnon; Vincent Ferretti; Martin Leboeuf; Jean-François Olivier; Michael S Phillips; Stéphanie Roumy; Clémentine Sallée; Andrei Verner; Thomas J Hudson; Pui-Yan Kwok; Dongmei Cai; Daniel C Koboldt; Raymond D Miller; Ludmila Pawlikowska; Patricia Taillon-Miller; Ming Xiao; Lap-Chee Tsui; William Mak; You Qiang Song; Paul K H Tam; Yusuke Nakamura; Takahisa Kawaguchi; Takuya Kitamoto; Takashi Morizono; Atsushi Nagashima; Yozo Ohnishi; Akihiro Sekine; Toshihiro Tanaka; Tatsuhiko Tsunoda; Panos Deloukas; Christine P Bird; Marcos Delgado; Emmanouil T Dermitzakis; Rhian Gwilliam; Sarah Hunt; Jonathan Morrison; Don Powell; Barbara E Stranger; Pamela Whittaker; David R Bentley; Mark J Daly; Paul I W de Bakker; Jeff Barrett; Yves R Chretien; Julian Maller; Steve McCarroll; Nick Patterson; Itsik Pe'er; Alkes Price; Shaun Purcell; Daniel J Richter; Pardis Sabeti; Richa Saxena; Stephen F Schaffner; Pak C Sham; Patrick Varilly; David Altshuler; Lincoln D 
Stein; Lalitha Krishnan; Albert Vernon Smith; Marcela K Tello-Ruiz; Gudmundur A Thorisson; Aravinda Chakravarti; Peter E Chen; David J Cutler; Carl S Kashuk; Shin Lin; Gonçalo R Abecasis; Weihua Guan; Yun Li; Heather M Munro; Zhaohui Steve Qin; Daryl J Thomas; Gilean McVean; Adam Auton; Leonardo Bottolo; Niall Cardin; Susana Eyheramendy; Colin Freeman; Jonathan Marchini; Simon Myers; Chris Spencer; Matthew Stephens; Peter Donnelly; Lon R Cardon; Geraldine Clarke; David M Evans; Andrew P Morris; Bruce S Weir; Tatsuhiko Tsunoda; James C Mullikin; Stephen T Sherry; Michael Feolo; Andrew Skol; Houcan Zhang; Changqing Zeng; Hui Zhao; Ichiro Matsuda; Yoshimitsu Fukushima; Darryl R Macer; Eiko Suda; Charles N Rotimi; Clement A Adebamowo; Ike Ajayi; Toyin Aniagwu; Patricia A Marshall; Chibuzor Nkwodimmah; Charmaine D M Royal; Mark F Leppert; Missy Dixon; Andy Peiffer; Renzong Qiu; Alastair Kent; Kazuto Kato; Norio Niikawa; Isaac F Adewole; Bartha M Knoppers; Morris W Foster; Ellen Wright Clayton; Jessica Watkin; Richard A Gibbs; John W Belmont; Donna Muzny; Lynne Nazareth; Erica Sodergren; George M Weinstock; David A Wheeler; Imtaz Yakub; Stacey B Gabriel; Robert C Onofrio; Daniel J Richter; Liuda Ziaugra; Bruce W Birren; Mark J Daly; David Altshuler; Richard K Wilson; Lucinda L Fulton; Jane Rogers; John Burton; Nigel P Carter; Christopher M Clee; Mark Griffiths; Matthew C Jones; Kirsten McLay; Robert W Plumb; Mark T Ross; Sarah K Sims; David L Willey; Zhu Chen; Hua Han; Le Kang; Martin Godbout; John C Wallenburg; Paul L'Archevêque; Guy Bellemare; Koji Saeki; Hongguang Wang; Daochang An; Hongbo Fu; Qing Li; Zhen Wang; Renwu Wang; Arthur L Holden; Lisa D Brooks; Jean E McEwen; Mark S Guyer; Vivian Ota Wang; Jane L Peterson; Michael Shi; Jack Spiegel; Lawrence M Sung; Lynn F Zacharia; Francis S Collins; Karen Kennedy; Ruth Jamieson; John Stewart
Journal: Nature Date: 2007-10-18 Impact factor: 49.962

10. A global reference for human genetic variation.

Authors: Adam Auton; Lisa D Brooks; Richard M Durbin; Erik P Garrison; Hyun Min Kang; Jan O Korbel; Jonathan L Marchini; Shane McCarthy; Gil A McVean; Gonçalo R Abecasis
Journal: Nature Date: 2015-10-01 Impact factor: 49.962
