Douglas Londono, Steven Buyske, Stephen J Finch, Swarkar Sharma, Carol A Wise, Derek Gordon.
Abstract
BACKGROUND: Locus heterogeneity is one of the most documented phenomena in genetics. To date, relatively little work has been done on developing methods that address locus heterogeneity in genetic association analysis. Motivated by Zhou and Pan's work, we present a mixture model of linked and unlinked trios and develop a statistical method to estimate two quantities: the probability that a heterozygous parent transmits the disease allele at a di-allelic locus, and the probability that any trio is in the linked group. The goal is a test that extends the classic transmission disequilibrium test (TDT) to account for locus heterogeneity.
Year: 2012 PMID: 22264315 PMCID: PMC3292499 DOI: 10.1186/1471-2105-13-13
Source DB: PubMed Journal: BMC Bioinformatics ISSN: 1471-2105 Impact factor: 3.169
Simulation parameter settings for the single-locus simulations
| Item | Parameter | Setting |
|---|---|---|
| 1 | MOI | Dominant, Recessive, Multiplicative |
| 2 | ϕ | 0.05, 0.15 |
| 3 | R | 1.0 (Null), 2.25 |
| 4 | π | 0.25, 0.50, 0.75, 1.0 |
| 5 | p | 0.10, 0.25, 0.50, 0.75, 0.90 |
| 6 | Number of trios | 1000 |
| 7 | Number of permutations per statistic | 500 |
| 8 | Number of starting points | 200 |
| 9 | Number of EM steps per starting point | 100 |
| 10 | ε | 10^-6 |
| 11 | Penalty | 0.001 |
| 12 | Number of replicates per vector (Items 1-3) | 250 |
MOI = Mode of inheritance
ϕ = Disease prevalence
R = Genotype relative risk for disease allele homozygote
π = Proportion of linked trios
p = Disease allele frequency
ε = Tolerance
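To give a feel for the size of the single-locus simulation design, the sketch below enumerates every combination of MOI, ϕ, R, π and p from the settings above. It assumes the five parameters are fully crossed, which the table itself does not state; the names `settings` and `parameter_vectors` are hypothetical and do not come from the original software.

```python
from itertools import product

# Settings copied from the single-locus table. The dictionary keys are
# illustrative names only (assumption), not identifiers from the paper.
settings = {
    "moi": ["Dominant", "Recessive", "Multiplicative"],
    "prevalence": [0.05, 0.15],               # phi
    "grr_homozygote": [1.0, 2.25],            # R; 1.0 is the null setting
    "prop_linked": [0.25, 0.50, 0.75, 1.0],   # pi
    "daf": [0.10, 0.25, 0.50, 0.75, 0.90],    # p
}


def parameter_vectors(grid):
    """Return one dict per combination, assuming a fully crossed design."""
    keys = list(grid)
    return [dict(zip(keys, values)) for values in product(*grid.values())]


vectors = parameter_vectors(settings)
null_vectors = [v for v in vectors if v["grr_homozygote"] == 1.0]
print(len(vectors), len(null_vectors))
```

Under the full-crossing assumption this yields 3 × 2 × 2 × 4 × 5 = 240 parameter vectors, half of which are null settings (R = 1.0).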
Simulation parameter settings for the multi-locus simulations
| Item | Parameter | Setting |
|---|---|---|
| 1 | Number of loci | 4 |
| 2 | Locus transmission probability: MOI | Multiplicative |
| 3 | Locus transmission probability: p | 0.10, 0.50, 0.9 |
| 4 | R | 1.0 (Null), 2.25, 9.0 |
| 5 | π | 0.25, 0.75 |
| 6 | ρ | 0.8 |
MOI = Mode of inheritance
p = Disease allele frequency
R = Genotype relative risk for disease allele homozygote
π = Proportion of linked trios
ρ = correlation coefficient. ρ = 1 (perfect correlation), ρ = 0 (no correlation)
Figure 1. Empirical type I error rates. Here we present the empirical type I error rates for the TDT-HET and TDT statistics for different settings of the prevalence (0.05 or 0.15), DAF (0.10, 0.25, 0.50, 0.75, 0.90), and π1 (0.25, 0.50, 0.75, 1.00) at two significance levels (5%, 1%). Each plotted shape in the figure represents the empirical type I error rate (-log-transformed) for a fixed setting of the parameters. Solid square = TDT-HET, 5% Empirical Type I Error Rate. Hollow diamond = TDT, 5% Empirical Type I Error Rate. Solid circle = TDT-HET, 1% Empirical Type I Error Rate. Hollow triangle = TDT, 1% Empirical Type I Error Rate.
Figure 2. Contour plot of empirical power. Empirical powers at the 5% significance level for prevalence (ϕ) equal to 0.05; DAF (p) equal to 0.10, 0.25, 0.50, 0.75, 0.90 and π1 equal to 0.25, 0.50, 0.75, 1.00. Each contour in the figure represents a range of empirical power values. There are five contours, corresponding to power ranges (x, x + 0.20), where x = 0.00, 0.20, 0.40, 0.60, 0.80. For example, the black contour represents the power range (0.00, 0.20). The light gray contour contiguous to the black contour represents the power range (0.20, 0.40) and so forth. The lightest contour represents the power range (0.80, 1.00).
Figure 3. Contour plot of empirical power. Empirical powers at the 5% significance level for prevalence (ϕ) equal to 0.05; DAF (p) equal to 0.10, 0.25, 0.50, 0.75, 0.90 and π1 equal to 0.25, 0.50, 0.75, 1.00. Each contour in the figure represents a range of empirical power values. There are five contours, corresponding to power ranges (x, x + 0.20), where x = 0.00, 0.20, 0.40, 0.60, 0.80. For example, the black contour represents the power range (0.00, 0.20). The light gray contour contiguous to the black contour represents the power range (0.20, 0.40) and so forth. The lightest contour represents the power range (0.80, 1.00).
Figure 4. Contour plot of empirical power. Empirical powers at the 5% significance level for prevalence (ϕ) equal to 0.05; DAF (p) equal to 0.10, 0.25, 0.50, 0.75, 0.90 and π1 equal to 0.25, 0.50, 0.75, 1.00. Each contour in the figure represents a range of empirical power values. There are five contours, corresponding to power ranges (x, x + 0.20), where x = 0.00, 0.20, 0.40, 0.60, 0.80. For example, the black contour represents the power range (0.00, 0.20). The light gray contour contiguous to the black contour represents the power range (0.20, 0.40) and so forth. The lightest contour represents the power range (0.80, 1.00).
Figure 5. Box and whisker plots of power differences. Here we present box and whisker plots for the differences (TDT-HET Empirical Power - TDT Empirical Power) for the three MOIs and the two significance levels. The code referring to each plot is: (MOI, Significance Level). For example, Dom05 refers to the differences for the dominant MOI at the 5% significance level. Similarly, Dom01 refers to the differences for the dominant MOI at the 1% significance level. For each gray box, the upper side indicates the 3rd quartile value, the lower side indicates the 1st quartile value, and the black horizontal line in the middle of the box indicates the median value. Mean values are indicated by white diamonds, and outliers are indicated by stars.
Figure 6. Empirical type I error rates. In this figure, we provide TDT-HET and TDT SumStat empirical type I error values (GRR R1 = R2 = 1.0) for parameter settings listed in Tables 1 and 2. Solid symbols represent TDT-HET SumStat empirical type I error rates; hollow diamonds and multiplication signs represent TDT SumStat empirical type I error rates. More specifically: Solid diamonds = TDT-HET SumStat empirical type I error rate at 1% significance level when π1 = 0.25. Solid squares = TDT-HET SumStat empirical type I error rate at 1% significance level when π1 = 0.75. Hollow diamonds = TDT SumStat empirical type I error rate at 1% significance level when π1 = 0.25. Multiplication signs = TDT SumStat empirical type I error rate at 1% significance level when π1 = 0.75.
Figure 7. Empirical powers for R1 = 1.5. Here, we present TDT-HET and TDT SumStat empirical powers for parameter settings listed in Tables 1 and 2. Solid diamonds = TDT-HET SumStat empirical power at 1% significance level when π1 = 0.25. Solid squares = TDT-HET SumStat empirical power at 1% significance level when π1 = 0.75. Hollow diamonds = TDT SumStat empirical power at 1% significance level when π1 = 0.25. Multiplication signs = TDT SumStat empirical power at 1% significance level when π1 = 0.75.
Figure 8. Empirical powers for R1 = 3.0. Here, we present TDT-HET and TDT SumStat empirical powers for parameter settings listed in Tables 1 and 2. Solid diamonds = TDT-HET SumStat empirical power at 1% significance level when π1 = 0.25. Solid squares = TDT-HET SumStat empirical power at 1% significance level when π1 = 0.75. Hollow diamonds = TDT SumStat empirical power at 1% significance level when π1 = 0.25. Multiplication signs = TDT SumStat empirical power at 1% significance level when π1 = 0.75.
Results of TDT-HET analysis on idiopathic scoliosis candidate loci
| Chr | Locus | BP | TDT-HET | P-value (Perm) | $\hat{t}$ | $\hat{\pi}_1$ | TDT-HET SumStat | SumStat P-value (Perm) | PLINK: TDT | PLINK: P-value (Perm01) | PLINK: OR | PLINK: $\hat{t}$ | PLINK: Max(T) P-value (Perm02) |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 3 | RS1400180 | 145968 | 14.78 | 1.6 × 10^-4 | 0.80 | 0.30 | 23.93 | 1.0 × 10^-5 | 14.35 | 2.8 × 10^-4 | 1.44 | 0.59 | 0.001 |
| 3 | RS10510181 | 166047 | 9.15 | 0.003 | 0.60 | 0.77 | | | 9.04 | 0.004 | 1.37 | 0.58 | 0.02 |
| | RS11770843 | 146426312 | 18.32 | 1.0 × 10^-5 | 0.26 | 0.47 | NA | NA | 17.29 | 1.6 × 10^-4 | 1.56 | 0.61 | 3.3 × 10^-4 |
| 21 | RS1040315 | 40746722 | 18.41 | 2.0 × 10^-5 | 0.76 | 0.38 | 40.94 | 0.00 | 19.10 | 3.0 × 10^-5 | 1.54 | 0.61 | 1.2 × 10^-4 |
| 21 | RS2222973 | 40755754 | 22.53 | 0.00 | 0.36 | 0.87 | | | 22.25 | 2.0 × 10^-5 | 0.60 | 0.38 | 7.0 × 10^-5 |
The headings for each of the columns are defined as follows:
Chr = Human chromosome on which locus is located.
Locus = Particular SNP genotyped in idiopathic scoliosis trios.
BP = Base pair position of Locus. This position is based on the human reference sequence (NCBI Build 36.1/HG18).
TDT-HET = Value of the TDT-HET statistic for particular locus genotype data in idiopathic scoliosis trios.
P-value (Perm) = P-value of corresponding TDT-HET statistic, based on 100,000 random permutations. For a description of how the permutation p-value is computed, see Methods, P-values by permutation.
$\hat{t}$ = EM-Algorithm estimate of the probability, t, that a heterozygous parent transmits a "1" allele.
$\hat{\pi}_1$ = EM-Algorithm estimate of the probability, π1, that a trio is linked to the locus in question.
TDT-HET SumStat = ∑TDT-HET (k), where k indexes the set of all loci on a chromosome and TDT-HET (k) is the value of the TDT-HET statistic at the particular locus. For example, in Table 3, k = 1 or 2, corresponding to locus RS1400180 or RS10510181, respectively. The TDT-HET statistic for each locus is 14.78 (k = 1) and 9.15 (k = 2). Therefore, for Chromosome 3, TDT-HET SumStat = 14.78 + 9.15 = 23.93.
SumStat P-value (Perm) = Permutation P-value corresponding to the TDT-HET SumStat value. For a further description, see Methods, Simulations, Multi-locus.
(PLINK Results)
TDT = Value of the TDT statistic as computed by PLINK.
P-value (Perm01) = Permutation p-value computed by PLINK. Purcell et al. [74] label this p-value "Emp1". It is the Point-wise empirical p-value.
OR = Odds Ratio for the disease allele.
$\hat{t} = T/(T + NT)$ = The maximum likelihood estimate of the probability, t, that a heterozygous parent transmits the disease allele. Here, T is the number of times a heterozygous parent transmits the disease allele, and NT is the number of times a heterozygous parent does not transmit the disease allele. It has been shown that, for the likelihood form of the TDT, this value is the maximum likelihood estimate of the transmission probability (see, e.g., [81-83]).
Max(T) P-value (Perm02) = Permutation p-value computed by PLINK that controls the family-wise type I error rate. For more information, see Methods, Idiopathic Scoliosis Candidate Loci.
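The column definitions above can be illustrated with a small sketch. It computes the classic TDT quantities from transmission counts, using the standard McNemar-type form (T - NT)^2 / (T + NT), the odds ratio T/NT and the estimate T/(T + NT), and then forms the per-chromosome SumStat from the TDT-HET values reported in Table 3. The counts T = 72 and NT = 50 are hypothetical, and the function names `tdt_summary` and `sum_stat` are illustrative only; the TDT-HET statistic itself is not computed here.

```python
def tdt_summary(T, NT):
    """Classic TDT quantities from heterozygous-parent transmission counts.

    T  = number of transmissions of the disease allele
    NT = number of non-transmissions of the disease allele
    """
    if T < 0 or NT <= 0:
        raise ValueError("counts must be non-negative, with NT > 0")
    t_hat = T / (T + NT)               # ML estimate of transmission probability
    odds_ratio = T / NT                # odds ratio for the disease allele
    tdt = (T - NT) ** 2 / (T + NT)     # classic TDT chi-square statistic (1 df)
    return t_hat, odds_ratio, tdt


def sum_stat(per_locus_stats):
    """SumStat: sum of the per-locus statistics on one chromosome."""
    return sum(per_locus_stats)


# Hypothetical counts, chosen only for illustration.
t_hat, odds_ratio, tdt = tdt_summary(T=72, NT=50)
print(round(t_hat, 2), round(odds_ratio, 2), round(tdt, 2))

# TDT-HET values taken from Table 3.
chr3 = sum_stat([14.78, 9.15])      # RS1400180, RS10510181
chr21 = sum_stat([18.41, 22.53])    # RS1040315, RS2222973
print(round(chr3, 2), round(chr21, 2))
```

Note that the estimate and the odds ratio carry the same information, since T/(T + NT) = OR/(1 + OR). This is consistent with the PLINK columns of Table 3, for example OR = 1.44 alongside an estimate of 0.59.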
Posterior probability estimates that each coded trio is in linked group for Chromosome 21 Locus RS2222973 in the idiopathic scoliosis data set
| 000 | |
| 100 | |
| 0.83 | |
| 110 | |
| 0.86 | |
| 0.78 | |
| 201 | |
| 211 | |
| 0.83 | |
| 222 |
We indicate in bold the coded trios x such that . The value 0.87 comes from Table 3, for locus RS2222973. See Results, Idiopathic Scoliosis Candidate Loci, for further discussion of the importance of this inequality.
Conditional probabilities of mating type and child genotype
| Mating type = | Child genotype | Notation | Pr( | ||
|---|---|---|---|---|---|
| MM × MM ( | MM | 1 | |||
| MM × MNC( | MM | ||||
| MM × MNC( | MN | (1 - | |||
| MM × NN( | MN | 1 | |||
| MN × MN( | MM | ||||
| MN × MN( | MN | 2 | 2 | ||
| MN × MN( | NN | (1 - | |||
| MN × NN( | MN | ||||
| MN × NN( | NN | (1 - | |||
| NN × NN( | NN | 1 |
In this table, the high risk allele is M. Also, we define D to be the event that the child is affected. Note that 1 ≤ k ≤ 2. The last column is computed using the definition of conditional probability. Schaid and Sommer [63] also demonstrated this calculation. Note that . Finally, t = Pr(heterozygous parent transmits an M allele to an affected child).
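The following is a minimal sketch of how conditional probabilities of this kind can feed a posterior probability that a trio belongs to the linked group. It rests on two assumptions that are not spelled out in the text above: that the child-genotype probabilities given the mating type take the usual forms t, 1 - t, t^2, 2t(1 - t) and (1 - t)^2, and that unlinked trios transmit either allele with probability 1/2. The function names `child_prob` and `posterior_linked` are hypothetical.

```python
def child_prob(mating, child, t):
    """Pr(child genotype | mating type, affected child), where a heterozygous
    parent transmits allele M with probability t (assumed form)."""
    table = {
        ("MM", "MM"): {"MM": 1.0},
        ("MM", "MN"): {"MM": t, "MN": 1 - t},
        ("MM", "NN"): {"MN": 1.0},
        ("MN", "MN"): {"MM": t * t, "MN": 2 * t * (1 - t), "NN": (1 - t) ** 2},
        ("MN", "NN"): {"MN": t, "NN": 1 - t},
        ("NN", "NN"): {"NN": 1.0},
    }
    return table[mating].get(child, 0.0)


def posterior_linked(mating, child, t, pi1):
    """Posterior probability that a trio is in the linked group, assuming
    unlinked trios follow Mendelian transmission (t = 0.5)."""
    linked = pi1 * child_prob(mating, child, t)
    unlinked = (1 - pi1) * child_prob(mating, child, 0.5)
    if linked + unlinked == 0:
        raise ValueError("child genotype is impossible for this mating type")
    return linked / (linked + unlinked)


# Estimates reported in Table 3 for locus RS2222973.
t_hat, pi1_hat = 0.36, 0.87
examples = [
    (("MN", "NN"), "MN"),   # one heterozygous parent transmits M
    (("MN", "MN"), "MN"),   # two heterozygous parents, heterozygous child
    (("MN", "MN"), "MM"),   # two heterozygous parents, both transmit M
]
for mating, child in examples:
    print(mating, child, round(posterior_linked(mating, child, t_hat, pi1_hat), 2))
```

With the Table 3 estimates, the three example trios give posteriors that round to 0.83, 0.86 and 0.78, which agree with values shown in the posterior-probability table for RS2222973. A trio with no heterozygous parent is uninformative, so its posterior simply equals the prior 0.87.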