Quantitative Trait Loci Identification by Estimating the Genetic Model based on the Extremal Samples.

Zining Yang¹, Yaning Yang¹, Xu Steven Xu¹, Min Yuan¹.

Abstract

Background: In genetic association studies with quantitative trait loci (QTL), the association between a candidate genetic marker and the trait of interest is commonly examined by the omnibus F test or by the t-test corresponding to a given genetic model or mode of inheritance. It is known that the t-test with a correct model specification is more powerful than the F test. However, since the underlying genetic model is rarely known in practice, the use of a model-specific t-test may incur substantial power loss. Robust-efficient tests, such as the Maximin Efficiency Robust Test (MERT) and MAX3 have been proposed in the literature.
Methods: In this paper, we propose a novel two-step robust-efficient approach, namely, the genetic model selection (GMS) method for quantitative trait analysis. GMS selects a genetic model by testing Hardy-Weinberg disequilibrium (HWD) with extremal samples of the population in the first step and then applies the corresponding genetic model-specific t-test in the second step.
Results: Simulations show that GMS is not only more efficient than MERT and MAX3, but also has comparable power to the optimal t-test when the genetic model is known.
Conclusion: Application to data from the Alzheimer's Disease Neuroimaging Initiative (ADNI) cohort demonstrates that the proposed approach can identify biologically meaningful SNPs on chromosome 19.

Keywords: Genetic association studies; extreme samples; genetic model selection; Hardy-Weinberg disequilibrium; quantitative trait loci

Year: 2021 PMID: 35283669 PMCID: PMC8844942 DOI: 10.2174/1389202922666210625161602

Source DB: PubMed Journal: Curr Genomics ISSN: 1389-2029 Impact factor: 2.689

INTRODUCTION

The clinical importance and health relevance of many quantitative traits, such as diabetes, obesity, cholesterol level and blood pressure, have been recognized. Although quantitative trait data are widely used in genetic association studies [1, 2], there are relatively few studies on robust-efficient methods in this area. For genetic association studies with a quantitative trait locus (QTL), a linear regression model can be used to link the genotypic value and the phenotypic value, and an omnibus F-test is applied to test the null hypothesis that the QTL is unlinked. When the genetic model (or mode of inheritance) is known, the genotypic value can be coded according to the underlying genetic model, which effectively reduces the number of parameters of interest. This results in a t-test for the null hypothesis, which is more powerful than the omnibus F-test because it includes extra information on the mode of inheritance. However, it is also widely known that misspecification of the genetic model may lead to substantial power loss of the t-test. Therefore, powerful tests that are robust to model misspecification, referred to as robust-efficient tests, are desired. Robust-efficient procedures have been well studied for qualitative trait analysis in the case-control design. Common robust-efficient approaches to test the association between a qualitative trait and a candidate marker include Pearson’s chi-square test with 2 degrees of freedom, the Maximin Efficiency Robust Test, MAX3 and genetic model selection [3-6]. The last three methods are based on the Cochran-Armitage trend test (CATT) [6-10]. The CATT is derived for a specified genetic model or mode of inheritance and is proved to be optimal in power when the genetic model is correctly specified. However, the CATT suffers from substantial power loss when the genetic model is misspecified, especially when the dominant model is misspecified as a recessive model or vice versa.
The Maximin Efficiency Robust Test (MERT) linearly combines the extreme pair of CATTs, taking into account the correlation between them, and has the highest minimum efficiency relative to the optimal test [11, 12] among all linear combinations of the normally distributed optimal test statistics. MAX3 takes the maximum over the CATTs under the dominant, additive and recessive models. The third type of robust approach is a two-step procedure known as genetic model selection (GMS). The Hardy-Weinberg disequilibrium coefficient has been examined and used to test association in the case-control design [13, 14]. It has been shown that the Hardy-Weinberg disequilibrium (HWD) contains genetic model information and can be used to estimate the underlying genetic model in the case-control design. The GMS test applies an adaptive procedure to select a genetic model before applying the appropriate CATT. The MERT, MAX3 and GMS are shown to have good power performance across the common genetic models. However, there are relatively few studies on robust-efficient methods in QTL studies. Deng et al. [15] proposed a QTL mapping method by measuring and testing for Hardy-Weinberg and linkage disequilibrium at a series of linked marker loci in extreme samples of populations. Inspired by how the genetic model information is extracted from data based on HWD in the case-control design, we propose a GMS procedure for quantitative traits, which is a two-step procedure similar to the GMS method in case-control studies. In the first step, we estimate the genetic model by using the HWD calculated from extreme individuals, as in the study by Deng et al. [15]. In the second step, we apply the t-test corresponding to the genetic model estimated in the first step. We derive the asymptotic correlation between the t-tests for the recessive and dominant models. We also derive and adjust for the asymptotic correlation between the HWD test and the t-tests for different genetic models so that the GMS has approximately the correct size.
MERT and MAX3 for quantitative traits are two other candidates, which can be readily used for testing association with a quantitative trait once the model-specific t-test statistics are obtained. Extensive simulation studies are conducted to compare the GMS procedure with existing approaches. Real data analysis is also performed to illustrate the proposed approach. The rest of this paper is organized as follows. Section 2 introduces notations, models and the theoretical results. Extensive simulation studies are implemented to examine the performance of the proposed methods in Section 3. In Section 4, we apply the proposed methods to real data from a subsample of the Alzheimer’s Disease Neuroimaging Initiative (ADNI). Conclusions and discussions are given in the last section.

METHODS

Notations and Models

Consider a QTL with two alleles, A1 and A2. Allele A1 is assumed to be the risk allele and A2 the reference allele. Allele A2 may also be regarded as all of the non-A1 alleles with similar genetic effects. Let p be the frequency of allele A1 and q = 1 − p be the frequency of allele A2. Denote the three genotypes A1A1, A1A2 and A2A2 by g = 0, 1, 2, respectively. Let µ0 be the mean (genotypic value) for individuals with genotype A1A1, and µ1 and µ2 be the genotypic values of individuals with A1A2 and A2A2, respectively. The null hypothesis is that there is no association between the locus and the trait, i.e. H0: µ0 = µ1 = µ2. The alternative hypothesis is H1: µ0 ≥ µ1 ≥ µ2 with at least one strict inequality. If the underlying genetic model can be determined, H1 reduces to a lower-dimensional alternative: $H_D: \mu_0 = \mu_1 > \mu_2$ under the dominant model; $H_R: \mu_0 > \mu_1 = \mu_2$ under the recessive model; $H_A: 2\mu_1 = \mu_0 + \mu_2,\ \mu_0 > \mu_2$ under the additive model. The general linear model for a QTL is described as follows [16]. The phenotypic value of the jth individual with genotype i in the population is $y_{ij} = \mu + G_i + \varepsilon_{ij}$, $i = 0, 1, 2$, $j = 1, 2, \dots, n_i$, where µ is the mean baseline of the quantitative trait, $G_i$ is the genotypic value at the QTL for the ith genotype, and $\varepsilon_{ij}$ represents a random variable for the combined effects of all the rest of the polymorphic QTLs and all random environmental effects. Without loss of generality, we assume that µ = 0. Thus the genotypic values are µ0, µ1 and µ2 for genotypes A1A1, A1A2 and A2A2, respectively. We also assume that $\varepsilon_{ij}$ follows a normal distribution.
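The setup above can be sketched as a small data generator. This is a minimal illustration, not the authors' code: the function names, the indexing of each model by a single effect size `mu` (with µ2 fixed at 0), and the default σ = 1 are assumptions made here.

```python
import random


def genotypic_means(model, mu):
    """Return (mu0, mu1, mu2) for genotypes A1A1, A1A2, A2A2.

    Assumption for illustration: each model is indexed by one effect
    size `mu`, and the reference genotype A2A2 has mean 0.
    """
    table = {
        "NULL": (0.0, 0.0, 0.0),     # H0: mu0 = mu1 = mu2
        "REC": (mu, 0.0, 0.0),       # mu0 > mu1 = mu2
        "ADD": (mu, mu / 2.0, 0.0),  # 2 * mu1 = mu0 + mu2
        "DOM": (mu, mu, 0.0),        # mu0 = mu1 > mu2
    }
    return table[model]


def simulate_qtl(n, p, model, mu, sigma=1.0, seed=None):
    """Draw n genotypes under HWE and normally distributed phenotypes."""
    rng = random.Random(seed)
    q = 1.0 - p
    means = genotypic_means(model, mu)
    # Genotype codes follow the text: 0 = A1A1, 1 = A1A2, 2 = A2A2.
    genotypes = rng.choices([0, 1, 2], weights=[p * p, 2 * p * q, q * q], k=n)
    phenotypes = [rng.gauss(means[g], sigma) for g in genotypes]
    return genotypes, phenotypes
```

For example, `simulate_qtl(500, 0.3, "ADD", 0.5, seed=1)` returns 500 genotype codes and 500 trait values under an additive model.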

Common Test Statistics

To test the association between the candidate genetic marker and a quantitative trait, we can use either the F test with the alternative hypothesis H1 or the t-tests t.r, t.a and t.d for a given genetic model, with alternative hypotheses $H_R$, $H_A$ and $H_D$, respectively. The F test is the ratio of the between-group variance to the within-group variance, that is,

$$F = \frac{\sum_{i=0}^{2} n_i (\bar{y}_i - \bar{y})^2 / 2}{\sum_{i=0}^{2} \sum_{j=1}^{n_i} (y_{ij} - \bar{y}_i)^2 / (n - 3)}, \qquad (2.1)$$

where $\bar{y}_i$ is the sample mean of the trait for genotype i, $\bar{y}$ is the overall sample mean and $n = n_0 + n_1 + n_2$. Under the null hypothesis, the F test follows an F distribution with degrees of freedom 2 and n − 3. When the genetic model is specified as one of the dominant, additive and recessive models, the corresponding t-tests are given by (2.2)-(2.4). Under the null hypothesis, t.r, t.a and t.d follow the t distribution with n − 2 degrees of freedom. When the sample size is sufficiently large, we can use a standard normal distribution to approximate the t distribution. Clearly, t.r, t.a and t.d are the optimal tests under the corresponding genetic models. As these tests incorporate the risk trend in the genotypic value, we call them trend tests, as in case-control studies. In practice, the underlying genetic model is rarely known, and a trend test specified for a certain model could suffer from substantial loss of power if the genetic model is misspecified. Therefore, it is essential to consider efficiency-robust methods to protect against genetic model misspecification. The F test, MERT and MAX3 are three common robust tests. MERT linearly combines t.r and t.d by taking into account the asymptotic correlation coefficient between them, denoted by ρ. The MERT is given by

$$Z_{MERT} = \frac{\mathrm{t.r} + \mathrm{t.d}}{\sqrt{2(1 + \rho)}}. \qquad (2.5)$$

The MAX3, taking the maximum of the three optimal tests under the different genetic models, is given by

$$Z_{MAX3} = \max(|\mathrm{t.r}|, |\mathrm{t.a}|, |\mathrm{t.d}|). \qquad (2.6)$$

Thresholds c that control the type I error at level α for $Z_{MERT}$ and $Z_{MAX3}$ are determined through the equations $P_0(|Z_{MERT}| > c) = \alpha$ and $P_0(Z_{MAX3} > c) = \alpha$, respectively.
As $Z_{MERT}$ is asymptotically normally distributed, we can use the theoretical quantile as the threshold when the sample size is large enough. However, it is hard to obtain the asymptotic distribution of $Z_{MAX3}$. In this article, we use the Monte Carlo method to obtain all these thresholds.
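The statistics in this section can be sketched in a few lines of plain Python. The sketch below is an illustration under stated assumptions rather than the authors' implementation: each trend test is computed as the slope t statistic from a simple regression of the trait on a coded genotype (recessive: 1 for A1A1; additive: number of A1 alleles; dominant: 1 for A1A1 or A1A2), and ρ is replaced by the sample correlation between the recessive and dominant codings. The function names are hypothetical, and all three genotypes are assumed to be present in the sample.

```python
import math


def _slope_t(x, y):
    """t statistic (n - 2 df) for the slope in a regression of y on x."""
    n = len(y)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    syy = sum((b - my) ** 2 for b in y)
    rss = syy - sxy * sxy / sxx  # residual sum of squares
    return (sxy / sxx) / math.sqrt(rss / (n - 2) / sxx)


def _corr(u, v):
    """Sample Pearson correlation."""
    n = len(u)
    mu_, mv = sum(u) / n, sum(v) / n
    suv = sum((a - mu_) * (b - mv) for a, b in zip(u, v))
    suu = sum((a - mu_) ** 2 for a in u)
    svv = sum((b - mv) ** 2 for b in v)
    return suv / math.sqrt(suu * svv)


def association_stats(genotypes, y):
    """F, trend tests, MERT and MAX3 for genotypes coded 0=A1A1, 1=A1A2, 2=A2A2."""
    x_r = [1.0 if g == 0 else 0.0 for g in genotypes]  # recessive coding (assumed)
    x_a = [2.0 - g for g in genotypes]                  # additive coding (assumed)
    x_d = [1.0 if g <= 1 else 0.0 for g in genotypes]  # dominant coding (assumed)
    t_r, t_a, t_d = _slope_t(x_r, y), _slope_t(x_a, y), _slope_t(x_d, y)

    # One-way ANOVA F statistic with 2 and n - 3 degrees of freedom.
    n = len(y)
    grand = sum(y) / n
    between = within = 0.0
    for g in (0, 1, 2):
        group = [v for gg, v in zip(genotypes, y) if gg == g]
        m = sum(group) / len(group)
        between += len(group) * (m - grand) ** 2
        within += sum((v - m) ** 2 for v in group)
    f_stat = (between / 2.0) / (within / (n - 3))

    rho = _corr(x_r, x_d)  # plug-in estimate of the correlation (assumption)
    return {
        "t.r": t_r, "t.a": t_a, "t.d": t_d, "F": f_stat, "rho": rho,
        "MERT": (t_r + t_d) / math.sqrt(2.0 * (1.0 + rho)),
        "MAX3": max(abs(t_r), abs(t_a), abs(t_d)),
    }
```

Using the sample correlation of the codings keeps the sketch free of any closed-form expression for ρ; an analysis following the paper would use the asymptotic expression derived there.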

Genetic Model Selection and Related Robust Test

We assume that HWE holds in the population. The quantitative trait in the population then follows a mixture of three normal distributions, each weighted by the respective genotype frequency in the population. The probability density function is

$$f(y) = p^2\,\phi(y; \mu_0, \sigma^2) + 2pq\,\phi(y; \mu_1, \sigma^2) + q^2\,\phi(y; \mu_2, \sigma^2),$$

where $\phi(y; \mu, \sigma^2)$ is the density function of a normal random variable with mean µ and variance σ². For a given upper threshold U, extreme individuals are defined as those with phenotype greater than U, that is, y > U. Let $\alpha_0$, $\alpha_1$ and $\alpha_2$ be the probabilities of obtaining an extreme individual conditional on the genotype; then

$$\alpha_i = P(y > U \mid g = i) = 1 - \Phi(U; \mu_i, \sigma^2), \quad i = 0, 1, 2, \qquad (2.7)$$

where $\Phi(\cdot; \mu, \sigma^2)$ is the cumulative distribution function of a normal random variable with mean µ and variance σ². The proportion of extreme individuals, denoted by α, can be computed as

$$\alpha = p^2 \alpha_0 + 2pq\,\alpha_1 + q^2 \alpha_2.$$

Similarly, for a lower threshold T, the proportion of the population with y < T is obtained by replacing $1 - \Phi(U; \mu_i, \sigma^2)$ with $\Phi(T; \mu_i, \sigma^2)$. It can be easily shown that the probabilities of each genotype and of the risk allele, given that the individual comes from the extreme population with y > U, are respectively

$$P(A_1A_1 \mid y > U) = \frac{p^2 \alpha_0}{\alpha}, \quad P(A_1A_2 \mid y > U) = \frac{2pq\,\alpha_1}{\alpha}, \quad P(A_2A_2 \mid y > U) = \frac{q^2 \alpha_2}{\alpha} \qquad (2.8)$$

and

$$P(A_1 \mid y > U) = \frac{p^2 \alpha_0 + pq\,\alpha_1}{\alpha}. \qquad (2.9)$$

The Hardy-Weinberg disequilibrium coefficient in the extreme population is defined as the difference between the genotype frequency and the squared allele frequency:

$$\Delta = P(A_1A_1 \mid y > U) - P(A_1 \mid y > U)^2. \qquad (2.10)$$

When the null hypothesis is true, i.e., $\mu_0 = \mu_1 = \mu_2$ and hence $\alpha_0 = \alpha_1 = \alpha_2$, we have $\Delta = 0$. When the underlying genetic model is dominant, $\alpha_0 = \alpha_1 > \alpha_2$ and thus $\Delta < 0$. When the underlying genetic model is recessive, $\alpha_0 > \alpha_1 = \alpha_2$ and $\Delta > 0$. When the genetic model is additive, Δ can be either positive or negative and is close to 0 if both $\mu_0 - \mu_1$ and $\mu_1 - \mu_2$ are close to zero (small or moderate effects). The signs of Δ under the dominant and recessive models are opposite, indicating that Δ can be used to select a genetic model. Let $y_{ij}$ denote the trait value of the jth individual with genotype i; then $y_{ij}$ follows a normal distribution with mean $\mu_i$ and variance σ², i = 0, 1, 2. We use the sample mean $\bar{y}_i$ to estimate $\mu_i$. The frequency of allele A1 can be estimated by $\hat{p} = (2n_0 + n_1)/(2n)$.
The estimators $\hat{\alpha}_0$, $\hat{\alpha}_1$, $\hat{\alpha}_2$ and $\hat{\alpha}$ are defined by replacing the unknown parameters $\mu_i$, i = 0, 1, 2, and p in the expressions of $\alpha_0$, $\alpha_1$, $\alpha_2$ and α by $\bar{y}_i$, i = 0, 1, 2, and $\hat{p}$, respectively. The Hardy-Weinberg disequilibrium coefficient can therefore be estimated by the plug-in estimator $\hat{\Delta}$ obtained from (2.8)-(2.10). The Hardy-Weinberg disequilibrium test is constructed by standardizing the estimated disequilibrium coefficient:

$$Z_{HWD} = \frac{\hat{\Delta}}{\sqrt{\widehat{\mathrm{Var}}(\hat{\Delta})}}. \qquad (2.11)$$

Under the null hypothesis, the variance of $\hat{\Delta}$ can be calculated by the delta method. Details are provided in the Appendix (A1). The expectation of $Z_{HWD}$ has the same sign as the Hardy-Weinberg disequilibrium coefficient Δ; therefore, we select the recessive model if $Z_{HWD} > c_1$, the dominant model if $Z_{HWD} < c_2$, and the additive model otherwise. The thresholds $c_1$ and $c_2$ are two constants, which can be either prespecified or determined from the data. Once the underlying model is determined, the corresponding optimal trend test is applied. This is a two-step method, which extracts the model information from the data and incorporates it into the association test. This data-driven procedure is referred to as GMS in this article. The choice of the thresholds $c_1$ and $c_2$ affects the accuracy of model selection. For example, if the thresholds are poorly chosen, the dominant (recessive) model is more likely to be selected as the recessive (dominant) model, which is the situation we should always avoid. Therefore, one needs to choose $c_1$ and $c_2$ carefully when applying the GMS method. In this paper, we propose to determine the thresholds $c_1$ and $c_2$ by a data-driven method. The main idea is to calculate the mean of the HWD statistic under different assumptions and use the midpoints as the thresholds to distinguish different genetic models. Let $E_N$, $E_R$, $E_A$ and $E_D$ denote the quantities defined in (2.12). For a given genotypic value, these are the asymptotic means of $Z_{HWD}$ under the null hypothesis and the recessive, additive and dominant models, respectively. Estimators of these four values are denoted by $\hat{E}_N$, $\hat{E}_R$, $\hat{E}_A$ and $\hat{E}_D$, respectively. It is easy to show that $\hat{E}_D \le \hat{E}_A \le \hat{E}_R$.
Therefore, the thresholds for model selection can be sensibly determined as the midpoint of $\hat{E}_R$ and $\hat{E}_A$ and the midpoint of $\hat{E}_A$ and $\hat{E}_D$, respectively. Specifically, $c_1 = (\hat{E}_R + \hat{E}_A)/2$ and $c_2 = (\hat{E}_A + \hat{E}_D)/2$. To control the type I error at level α, we need to find the critical value of the two-step procedure under H0 such that its overall rejection probability equals α. This critical value can be obtained analytically by the delta method and the asymptotic joint multivariate normal distribution of $Z_{HWD}$ and t.r, t.d or t.a when the sample size is moderately large (details on the correlations between $Z_{HWD}$ and the trend tests t.r, t.a and t.d are provided in Appendix A2). In our simulations, we generate 10,000 independent datasets under the null hypothesis by a resampling method and use the empirical quantile as the critical value.
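To make the sign argument concrete, the population-level HWD coefficient among extreme individuals can be evaluated numerically. The sketch below assumes HWE in the general population and a normal trait within each genotype, as in the text; the function names, the threshold arguments `c_rec` and `c_dom`, and the example parameter values are hypothetical choices for illustration, and the selection rule is only the sign-based rule described above, not the full data-driven procedure.

```python
from statistics import NormalDist


def extreme_hwd(p, means, sigma, upper):
    """HWD coefficient among individuals with phenotype y > upper.

    `means` is (mu0, mu1, mu2) for genotypes A1A1, A1A2, A2A2.
    """
    q = 1.0 - p
    freqs = (p * p, 2.0 * p * q, q * q)  # HWE genotype frequencies
    # P(y > upper | genotype) for each genotype.
    tails = [1.0 - NormalDist(m, sigma).cdf(upper) for m in means]
    total = sum(f * a for f, a in zip(freqs, tails))  # extreme proportion
    geno_ext = [f * a / total for f, a in zip(freqs, tails)]
    allele_ext = geno_ext[0] + 0.5 * geno_ext[1]  # P(A1 | y > upper)
    return geno_ext[0] - allele_ext ** 2


def select_model(z_hwd, c_rec, c_dom):
    """Pick a model from a standardized HWD statistic and two thresholds."""
    if z_hwd > c_rec:
        return "REC"
    if z_hwd < c_dom:
        return "DOM"
    return "ADD"
```

With, say, p = 0.3, σ = 1 and U = 1.5, a recessive configuration such as `(0.5, 0.0, 0.0)` gives a positive coefficient and a dominant one such as `(0.5, 0.5, 0.0)` gives a negative coefficient, which is the contrast the selection step relies on.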

RESULTS

Simulation

We first evaluate the accuracy of the model selection procedure under various scenarios. We also check the performance of the GMS and compare it with the other methods, including the F test in (2.1) (denoted by F in the figures and tables), the trend tests in (2.2)-(2.4), the MERT in (2.5) and the MAX3 test in (2.6). Data are generated under 54 simulation settings. The population allele frequency p is taken to be 0.1, 0.3, or 0.5. Sample size is set at n = 200, 500, or 1000. The genotypic value is µ = 0.1, 0.5, or 1. Three genetic models are considered in the simulation: the dominant (DOM), the additive (ADD) and the recessive (REC). The effective sample proportion (ESP), defined as the proportion of extreme individuals, is 5% or 10%. Each simulation is replicated 10000 times. Simulation results show that the model selection accuracy increases with the allele frequency, the genotypic value, the sample size and the effective sample proportion (Fig. (1)). Table 1 reports the accuracy of the genetic model selection for a moderate allele frequency and sample size, i.e. p = 0.3 and n = 500, for different genotypic values. From Table 1 we can see that the model selection accuracy is high for a moderate genotypic value. For example, when the genotypic value µ = 0.5, the probabilities of selecting the true model are about 92%, 76% and 98% for the recessive, additive and dominant models, respectively. Type I error rates are provided in Table 2. All seven methods control the type I error rate well at the nominal level 0.05. Power comparisons are presented in Fig. (2). The trend tests (i.e., t.d, t.a, and t.r) are sensitive to the model assumption. For example, when the true model is recessive, the dominant trend test t.d is the least powerful among all the tests. Compared to the trend tests, the three robust-efficient procedures are generally more robust against model misspecification, with higher power.
In addition, the proposed method, GMS, is generally more powerful than MAX3 and MERT, indicating superior power and efficiency of GMS under different genetic models. We also compared the performance of the seven methods in the case of rare variants (results are provided in Appendix A3). When the allele frequency is low (i.e., p = 0.01, 0.05, 0.1), the genetic effect is moderate (µ = 0.2) and the effective sample size is small (i.e., 100), the powers of all tests are less than 50%. When the sample size is increased to 4000 (effective sample size 400), the powers of the robust tests GMS, MERT and MAX3 can exceed 80%. For example, when the genetic model is DOM and p = 0.05, the powers of these tests are 83.52% (MERT), 89.50% (MAX3) and 95.34% (GMS), respectively, and the corresponding optimal power is 97.75% (t.d). The simulation results under the rare-variant scenarios show that the GMS, MERT and MAX3 tests can detect association when the sample size is sufficiently large. For both common and rare variants, GMS has the highest power among these three methods and is comparable to the optimal test.
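The kind of Monte Carlo comparison reported here can be sketched as follows. This is a simplified stand-in for the simulation study, not a reproduction of it: it estimates the rejection rate of a single trend test only, uses the two-sided normal critical value 1.96 rather than simulated thresholds, fixes σ = 1, and uses far fewer replicates than the 10000 in the text. The function names and genotype codings are assumptions made for this sketch.

```python
import math
import random


def trend_t(genotypes, y, model):
    """Slope t statistic for the coding of `model` (0=A1A1, 1=A1A2, 2=A2A2)."""
    coding = {
        "REC": lambda g: 1.0 if g == 0 else 0.0,
        "ADD": lambda g: 2.0 - g,
        "DOM": lambda g: 1.0 if g <= 1 else 0.0,
    }[model]
    x = [coding(g) for g in genotypes]
    n = len(y)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    if sxx == 0.0:
        return 0.0  # coded genotype does not vary in this sample
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    syy = sum((b - my) ** 2 for b in y)
    rss = syy - sxy * sxy / sxx
    return (sxy / sxx) / math.sqrt(rss / (n - 2) / sxx)


def rejection_rate(test_model, true_means, p, n, reps=400, crit=1.96, seed=0):
    """Fraction of simulated datasets in which |t| exceeds `crit`."""
    rng = random.Random(seed)
    q = 1.0 - p
    weights = [p * p, 2.0 * p * q, q * q]  # HWE genotype frequencies
    hits = 0
    for _ in range(reps):
        g = rng.choices([0, 1, 2], weights=weights, k=n)
        y = [rng.gauss(true_means[k], 1.0) for k in g]
        if abs(trend_t(g, y, test_model)) > crit:
            hits += 1
    return hits / reps
```

Calling `rejection_rate("ADD", (0.0, 0.0, 0.0), 0.3, 200)` estimates a type I error rate, while passing recessive means such as `(1.0, 0.0, 0.0)` and comparing `"REC"` against `"DOM"` illustrates the power loss from a misspecified trend test.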

Fig. (2)

The power comparison among seven methods. MERT: the MERT in (2.5); MAX3: the MAX3 test in (2.6); GMS: two-step approach estimating the genetic model with the HWD test in (2.11) and the corresponding trend test in (2.2)-(2.4).

Application

We applied our method to data downloaded from the Alzheimer’s Disease Neuroimaging Initiative (ADNI) database (www.loni.usc.edu/ADNI). The ADNI is an ongoing longitudinal multicenter study aimed at detecting and monitoring the early stage of Alzheimer’s disease (AD) by investigating magnetic resonance imaging, positron emission tomography, genetic and biochemical biomarkers, and neuropsychological and clinical assessment. The ADNI data comprise 2074 observations and 113 variables, including clinical outcomes and biomarker variables for AD, for 784 individuals. In this section, we used the baseline Rey's Auditory Verbal Learning Test (RAVLT) score as the response variable to examine the association between SNPs on chromosome 19 and AD. The RAVLT is a powerful neuropsychological tool for testing episodic memory, which is widely used for cognitive assessment in dementia and pre-dementia conditions [17]. Several studies have shown that impairment in RAVLT scores reflects well the underlying pathology caused by AD, thus making RAVLT an effective early marker to detect AD in persons with memory complaints [18]. A higher score indicates more severe impairment of auditory verbal learning ability. We selected individuals with an RAVLT score greater than 4 as the extreme samples (i.e., U = 4). After quality control and removal of missing values, 783 individuals genotyped at 10683 SNPs on chromosome 19 are included in the analysis. We apply the proposed GMS method to the ADNI data and compare it with the other six methods. We permuted the data 10,000 times to determine the thresholds of the MAX3 and GMS. Thresholds for the other five methods are determined by the asymptotic normal distribution. After FDR correction, we find 282 significant SNPs at significance level α = 0.05. The Manhattan plot for the ADNI data is presented in Fig. (3), which shows the p-values of all SNPs for the seven methods on the -log10 scale. In Table 3, we report the top 10 significant SNPs selected by GMS, as well as the p-values of the other six methods. All three robust methods (MERT, MAX3 and GMS) exhibit robustness to genetic model specification. For the ADNI data, it seems that MERT and GMS have comparable power, while MAX3 is slightly less powerful than MERT and GMS, with relatively large p-values. Existing studies have shown that zinc plays an important role in the development of Alzheimer’s disease [19, 20]. Among the 10 significant SNPs, five SNPs (rs143663113, rs113397810, rs7258002, rs10409896, rs117810408) are reported to be related to zinc finger protein 578 (gene ZNF578), and two SNPs (rs143663113, rs111321022) are reported to be related to gene ZNF808.
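The FDR step mentioned above can be illustrated with the Benjamini-Hochberg step-up procedure. The text does not state which FDR procedure was used, so Benjamini-Hochberg is an assumption here, and the function name is hypothetical.

```python
def benjamini_hochberg(p_values, alpha=0.05):
    """Return a list of booleans marking which hypotheses are rejected.

    Benjamini-Hochberg step-up rule: find the largest rank k such that
    the k-th smallest p-value is at most k * alpha / m, then reject the
    k hypotheses with the smallest p-values.
    """
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    cutoff_rank = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank * alpha / m:
            cutoff_rank = rank
    rejected = [False] * m
    for idx in order[:cutoff_rank]:
        rejected[idx] = True
    return rejected
```

Applied to the per-SNP p-values of one method, the number of `True` entries is the count of SNPs declared significant at the chosen FDR level.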

Fig. (3)

Manhattan plot for all p-values (on the -log10 scale) of seven methods for all 10683 SNPs on chromosome 19. t.a, t.d, and t.r: trend tests corresponding to additive, dominant, recessive genetic models. F: F-test; MERT: the MERT in (2.5); MAX3: the MAX3 test in (2.6); GMS: two-step approach estimating the genetic model with the HWD test in (2.11) and the corresponding trend test in (2.2)-(2.4). (A higher resolution / colour version of this figure is available in the electronic copy of the article).

CONCLUSION

In this article, we proposed a two-step procedure to test the genetic association between a SNP and a quantitative trait. In the first step, the underlying genetic model is determined from the difference of the HWD coefficients under different genetic models using extreme individuals; in the second step, the t-test corresponding to the selected genetic model is applied to test the association between the SNP and the trait. Simulation studies show that the proposed method is more efficient than the previously reported approaches (MERT and MAX3). Under the three genetic models, the GMS method has similar power to the t-test corresponding to the true model, with a well-controlled type I error rate at the given significance level. In our study, we used a data-driven threshold to select the genetic model. Simulation results show that this method works well in selecting the underlying genetic model. We would like to mention that in the QTL literature, robustness often refers to nonparametric methods that are robust to the distributional specification of the error term in linear models [21, 22]. Robustness in this work refers to the insensitivity of the tests to genetic model specification. The proposed GMS has high power efficiency for common variants. For rare variants with low allele frequency, even the best test has very low power. However, GMS is the closest to the optimal test among the existing robust methods for both common and rare variants. In practical applications, higher efficiency for rare variants could be achieved by increasing the sample size.

APPENDIX

Appendix A1: Derivation of the estimated variance of the estimated HWD coefficient. The estimated coefficient is a function of the genotype sample means and the estimated allele frequency, so its asymptotic variance is obtained by applying the delta method. The expression simplifies when the null hypothesis is true, since the three genotypic values are then equal.

Appendix A2: Details to derive the expression of the expectation of the Hardy-Weinberg disequilibrium test and the correlations between the HWD test and the trend tests. The expectation of the Hardy-Weinberg disequilibrium test is evaluated under the null hypothesis and under each of the three genetic models. We also calculate the correlations between the HWD test and the trend tests by the delta method.
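As a worked check on the sign argument used for model selection, the extreme-sample HWD coefficient can be written in closed form. The notation below is introduced for this sketch and is not taken from the appendix: $\alpha_i = P(y > U \mid g = i)$ for genotypes A1A1, A1A2, A2A2 (i = 0, 1, 2), and α is the overall proportion of extreme individuals, assuming HWE in the general population.

```latex
\begin{aligned}
\Delta &= \frac{p^2\alpha_0}{\alpha}
        - \left(\frac{p^2\alpha_0 + pq\,\alpha_1}{\alpha}\right)^2,
\qquad \alpha = p^2\alpha_0 + 2pq\,\alpha_1 + q^2\alpha_2 \\
&= \frac{p^2\left[\alpha_0\left(p^2\alpha_0 + 2pq\,\alpha_1 + q^2\alpha_2\right)
        - \left(p\,\alpha_0 + q\,\alpha_1\right)^2\right]}{\alpha^2} \\
&= \frac{p^2 q^2\left(\alpha_0\alpha_2 - \alpha_1^2\right)}{\alpha^2}.
\end{aligned}
```

Under the null hypothesis $\alpha_0 = \alpha_1 = \alpha_2$, so $\Delta = 0$. Under a dominant model $\alpha_0 = \alpha_1 > \alpha_2$, so $\alpha_0\alpha_2 - \alpha_1^2 = \alpha_0(\alpha_2 - \alpha_0) < 0$; under a recessive model $\alpha_0 > \alpha_1 = \alpha_2$, so $\alpha_0\alpha_2 - \alpha_1^2 = \alpha_2(\alpha_0 - \alpha_2) > 0$. This reproduces the opposite signs of the coefficient under the two models.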

Table 1

Proportions of selecting different genetic models (p = 0.3, n = 500, ESP = 0.05).

Genotypic value	True Model	Selected REC	Selected ADD	Selected DOM
µ = 0.1	NULL	0.4933	0.2702	0.2365
	REC	0.6581	0.3068	0.0351
	ADD	0.3265	0.4742	0.1993
	DOM	0.0933	0.2288	0.6779
µ = 0.5	NULL	0.3477	0.3170	0.3353
	REC	0.9158	0.0842	0.0000
	ADD	0.0345	0.7582	0.2073
	DOM	0.0000	0.0179	0.9821
µ = 1	NULL	0.2353	0.6985	0.0662
	REC	0.9495	0.0505	0.0000
	ADD	0.0007	0.9993	0.0000
	DOM	0.0000	0.0009	0.9991
µ = 2	NULL	0.3417	0.6472	0.0111
	REC	1.0000	0.0000	0.0000
	ADD	0.0000	1.0000	0.0000
	DOM	0.0000	0.0002	0.9998

Table 2

Type I error rate of the seven methods with various parameters.

ESP	p	n	t.d	t.a	t.r	F	MERT	MAX3	GMS
0.05	0.1	200	0.0487	0.0503	0.0492	0.0500	0.0493	0.0500	0.0489
		500	0.0514	0.0519	0.0492	0.0500	0.0514	0.0500	0.0437
		1000	0.0505	0.0505	0.0501	0.0500	0.0503	0.0500	0.0517
	0.3	200	0.0495	0.0498	0.0502	0.0500	0.0502	0.0499	0.0480
		500	0.0496	0.0515	0.0461	0.0500	0.0514	0.0500	0.0498
		1000	0.0514	0.0539	0.0530	0.0501	0.0501	0.0500	0.0476
	0.5	200	0.0502	0.0495	0.0511	0.0500	0.0493	0.0500	0.0472
		500	0.0540	0.0520	0.0523	0.0500	0.0517	0.0500	0.0456
		1000	0.0477	0.0492	0.0500	0.0500	0.0477	0.0500	0.0498
0.1	0.1	200	0.0501	0.0506	0.0491	0.0500	0.0506	0.0500	0.0488
		500	0.0542	0.0482	0.0520	0.0500	0.0484	0.0500	0.0461
		1000	0.0520	0.0509	0.0502	0.0500	0.0509	0.0502	0.0485
	0.3	200	0.0491	0.0498	0.0498	0.0500	0.0498	0.0500	0.0491
		500	0.0504	0.0499	0.0493	0.0498	0.0500	0.0500	0.0509
		1000	0.0496	0.0476	0.0505	0.0500	0.0473	0.0505	0.0505
	0.5	200	0.0496	0.0497	0.0498	0.0500	0.0499	0.0500	0.0478
		500	0.0477	0.0489	0.0468	0.0500	0.0490	0.0500	0.0513
		1000	0.0505	0.0481	0.0540	0.0500	0.0479	0.0500	0.0466

Table 3

The p-values of the top 10 significant SNPs selected by GMS.

SNP	t.r*	t.a*	t.d*	F*	MERT*	MAX3*	GMS*
rs143663113	2.06e-04	2.30e-06	2.53e-04	7.09e-04	1.80e-05	4.09e-04	5.81e-06
rs62131791	3.01e-06	5.15e-06	1.45e-02	1.69e-04	8.76e-05	3.14e-04	7.45e-06
rs73026154	1.10e-05	6.07e-06	6.46e-03	1.21e-04	2.61e-05	1.21e-05	8.95e-06
rs71839901	5.26e-06	9.11e-05	1.49e-03	1.69e-04	8.76e-05	5.07e-05	9.14e-06
rs111321022	6.30e-05	1.73e-05	3.42e-04	6.58e-04	1.67e-05	3.14e-04	1.07e-05
rs113397810	2.74e-05	6.02e-06	2.64e-05	3.40e-04	7.97e-06	1.48e-04	1.09e-05
rs7258002	4.34e-04	1.17e-05	2.53e-04	2.68e-04	6.20e-06	2.10e-05	1.20e-05
rs10409896	1.44e-03	6.06e-06	2.81e-05	2.67e-04	6.20e-06	1.62e-04	1.37e-05
rs7257286	1.82e-03	1.02e-05	1.07e-06	5.52e-05	1.15e-06	8.23e-05	1.65e-05
rs117810408	7.40e-04	1.58e-05	2.62e-03	3.14e-04	7.31e-06	1.84e-04	2.15e-05

*t.a, t.d, and t.r: trend tests corresponding to additive, dominant, recessive genetic models. F: F-test; MERT: the MERT in (2.5); MAX3: the MAX3 test in (2.6); GMS: two-step approach estimating the genetic model with the HWD test in (2.11) and the corresponding trend test in (2.2)-(2.4).

Appendix A3: Table

Powers with different allele frequencies (sample size: 4000; µ = 0.2; ESP = 10%).

Model	p	t.d	t.a	t.r	F	MERT	MAX3	GMS
DOM	0.01	0.8189	0.8098	0.0695	0.6825	0.6136	0.7465	0.7688
	0.05	0.9775	0.9689	0.1213	0.8936	0.8352	0.8950	0.9534
	0.1	0.9898	0.9803	0.4966	0.9351	0.9180	0.9465	0.9634
ADD	0.01	0.4183	0.4242	0.0687	0.3306	0.3027	0.3424	0.3725
	0.05	0.7697	0.7709	0.1335	0.7433	0.6864	0.7520	0.7560
	0.1	0.8992	0.9294	0.5889	0.8884	0.8972	0.9089	0.9193
REC	0.01	0.0520	0.0532	0.0683	0.0622	0.0602	0.0618	0.0784
	0.05	0.0512	0.0548	0.1445	0.1159	0.1050	0.1133	0.1451
	0.1	0.0758	0.1664	0.6741	0.5644	0.4585	0.5730	0.6389

14 in total

1. QTL fine mapping by measuring and testing for Hardy-Weinberg and linkage disequilibrium at a series of linked marker loci in extreme samples of populations.

Authors: H W Deng; W M Chen; R R Recker
Journal: Am J Hum Genet Date: 2000-03 Impact factor: 11.025

2. A novel MHC class I-like gene is mutated in patients with hereditary haemochromatosis.

Authors: J N Feder; A Gnirke; W Thomas; Z Tsuchihashi; D A Ruddy; A Basava; F Dormishian; R Domingo; M C Ellis; A Fullan; L M Hinton; N L Jones; B E Kimmel; G S Kronmal; P Lauer; V K Lee; D B Loeb; F A Mapa; E McClelland; N C Meyer; G A Mintier; N Moeller; T Moore; E Morikang; C E Prass; L Quintana; S M Starnes; R C Schatzman; K J Brunke; D T Drayna; N J Risch; B R Bacon; R K Wolff
Journal: Nat Genet Date: 1996-08 Impact factor: 38.330

10. Extreme sampling design in genetic association mapping of quantitative trait loci using balanced and unbalanced case-control samples.

Authors: Yi Li; Orna Levran; JongJoo Kim; Tiejun Zhang; Xingdong Chen; Chen Suo
Journal: Sci Rep Date: 2019-10-29 Impact factor: 4.379

1. QTL fine mapping by measuring and testing for Hardy-Weinberg and linkage disequilibrium at a series of linked marker loci in extreme samples of populations.

2. A novel MHC class I-like gene is mutated in patients with hereditary haemochromatosis.

3. Improving power for testing genetic association in case-control studies by reducing the alternative space.

4. Detecting marker-disease association by testing for Hardy-Weinberg disequilibrium at a marker locus.

5. From genotypes to genes: doubling the sample size.

Review 6. Genetic model selection in genome-wide association studies: robust methods and the use of meta-analysis.

Review 7. Genetic association of molecular traits: A help to identify causative variants in complex diseases.

8. A robust distribution-free test for genetic association studies of quantitative traits.

9. GWAR: robust analysis and meta-analysis of genome-wide association studies.

10. Extreme sampling design in genetic association mapping of quantitative trait loci using balanced and unbalanced case-control samples.