David A Greenberg, Junying Zhang, Dvora Shmulewitz, Lisa J Strug, Regina Zimmerman, Veena Singh, Sudhir Marathe.
Abstract
The Genetic Analysis Workshop 14 simulated dataset was designed with five aims: 1) to test the ability to find genes related to a complex disease (such as alcoholism); such a disease may be given a variety of definitions by different investigators, may have associated endophenotypes that are common in the general population, and is likely to be not one disease but a heterogeneous collection of clinically similar, but genetically distinct, entities; 2) to observe the effect of a complex set of gene x gene interactions on genetic analysis and gene discovery; 3) to allow comparison of microsatellite vs. large-scale single-nucleotide polymorphism (SNP) data; 4) to allow testing of association to identify the disease gene and the effect of moderate marker x marker linkage disequilibrium; and 5) to observe the effect of different ascertainment/disease definition schemes on the analysis. Data were distributed in two forms. Data distributed to participants contained about 1,000 SNPs and 400 microsatellite markers. Internet-obtainable data consisted of a finer 10,000 SNP map, which also contained data on controls. While disease characteristics and parameters were constant, four "studies" used varying ascertainment schemes based on differing beliefs about disease characteristics. One of the studies contained multiplex two- and three-generation pedigrees with at least four affected members. The simulated disease was a psychiatric condition with many associated behaviors (endophenotypes), almost all of which were genetic in origin. The underlying disease model contained four major genes and two modifier genes. The four major genes interacted with each other to produce three different phenotypes, which were themselves heterogeneous. The population parameters were calibrated so that the major genes could be discovered by linkage analysis in most datasets. The association evidence was more difficult to calibrate but was designed to yield statistically significant association in 50% of datasets.
We also simulated some marker x marker linkage disequilibrium around some of the genes and also in areas without disease genes. We tried two different methods to simulate the linkage disequilibrium.
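The abstract does not say which two methods were used to simulate linkage disequilibrium, so the following is only a minimal sketch of how marker x marker disequilibrium between two biallelic markers is commonly quantified. The function name `ld_stats` and its parameters are hypothetical and do not come from the workshop materials.

```python
def ld_stats(p_ab, p_a, p_b):
    """Return (D, D', r^2) for two biallelic markers.

    p_ab: frequency of the haplotype carrying allele A at marker 1
          and allele B at marker 2
    p_a:  frequency of allele A at marker 1
    p_b:  frequency of allele B at marker 2
    """
    if not (0 < p_a < 1 and 0 < p_b < 1):
        raise ValueError("both markers must be polymorphic")

    # Raw disequilibrium coefficient: observed minus expected haplotype frequency.
    d = p_ab - p_a * p_b

    # Largest |D| attainable given the allele frequencies (Lewontin's normalization).
    if d >= 0:
        d_max = min(p_a * (1 - p_b), (1 - p_a) * p_b)
    else:
        d_max = min(p_a * p_b, (1 - p_a) * (1 - p_b))
    d_prime = d / d_max if d_max > 0 else 0.0

    # Squared correlation between the alleles at the two markers.
    r2 = d * d / (p_a * (1 - p_a) * p_b * (1 - p_b))
    return d, d_prime, r2
```

With equal allele frequencies of 0.5 at both markers, a haplotype frequency of 0.25 corresponds to linkage equilibrium, and any excess over 0.25 yields positive D.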
Year: 2005 PMID: 16451639 PMCID: PMC1866756 DOI: 10.1186/1471-2156-6-S1-S3
Source DB: PubMed Journal: BMC Genet ISSN: 1471-2156 Impact factor: 2.797
Figure 1. Graphical representation of the genetic model used in the simulation. D1-D4 are disease-causing loci. D5 and D6 influence disease expression if the disease genotype is present. P1-P3 are different phenotypes caused by the disease loci to which they are connected by the lines. The "a" and "b" after the phenotype designation indicate identical phenotypes caused by different genotypes. D5 changes phenotype P2a into P1 when allele 1 is present. D6 changes the penetrance of P2b when allele 1 is present.
Genetic model parameters
| Phenotype | Major loci | Disease allele | Inheritance at | Modifying genes/penetrance |
| Phenotype 1 | D1 | 0.015 | Dominant-Dominant | Penetrance of genotype is 0.6 |
| Phenotype 2 | D2 | 0.15 | D2-D3 | If D5 has allele 1, Phenotype 2 will be converted into. |
| | D4 | 0.3 | D3-D4 | If D6 has allele 1, penetrance is 0.3; otherwise, penetrance is 0.6. |
| Phenotype 3 | D1 | 0.15 | D1-D4 | Penetrance of genotype = 1.0 |
| | D2 | 0.15 | D2-D3 | Penetrance of genotype = 0.4 |
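As a rough illustration of how allele frequency, inheritance mode, and penetrance combine, the sketch below computes the expected population frequency of a two-locus phenotype. It assumes Hardy-Weinberg proportions and independence between loci, neither of which is stated in the table, and the helper names `carrier_prob` and `phenotype_prob` are hypothetical. The example values (0.015 for D1, 0.15 for D2, penetrance 0.6) are read from the table rows as printed; note that the table also lists D1 at 0.15 under Phenotype 3, so these inputs are illustrative only.

```python
def carrier_prob(q, mode="dominant"):
    """Probability of a qualifying genotype at one locus under Hardy-Weinberg.

    q is the disease allele frequency. "dominant" needs at least one
    disease allele; "recessive" needs two.
    """
    if mode == "dominant":
        return 1.0 - (1.0 - q) ** 2
    if mode == "recessive":
        return q ** 2
    raise ValueError(f"unknown inheritance mode: {mode}")


def phenotype_prob(loci, penetrance):
    """Expected frequency of a phenotype that requires every listed locus.

    loci is a list of (allele_frequency, mode) pairs, treated as
    independent (an assumption of this sketch).
    """
    prob = penetrance
    for q, mode in loci:
        prob *= carrier_prob(q, mode)
    return prob


# Illustrative inputs only: dominant-dominant D1 + D2 with penetrance 0.6.
p1 = phenotype_prob([(0.015, "dominant"), (0.15, "dominant")], penetrance=0.6)
```

Changing either allele frequency or the penetrance and recomputing `p1` shows how sensitive the expected prevalence is to each parameter.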
Location of disease-related loci
| Disease locus name | Located between (or at) markers (locus) [a] |
| D1 | C01R0052 and B01T0561 |
| D2 | B03T3067 and C04R0282 [b] |
| D3 | B05T4136 and C05R0380 |
| D4 | C09R0765 and B09T8337 |
| D5 | C10R0880 [c] |
| D6 | C02R0097 [c] |
[a] D1-D4 do not appear in the map.
[b] B03T3067 is the last visible SNP on its chromosome, and the disease locus is the last SNP, so C04R0282 should not be linked to B03T3067 or to the disease.
[c] The disease locus is the marker itself.
Genotype-phenotype relationships for subclinical traits
| Trait | Loci involved | Inheritance of trait |
| Unaffected individuals | | |
| a | D1 + D3 | dominant, 30% penetrance |
| b | D1 | dominant, 30% penetrance |
| c | D4 | dominant, 15% penetrance |
| d | D3 | dominant, 18% penetrance |
| e | D2 | dominant, 19% penetrance |
| f | D2 | dominant, 19% penetrance |
| g | D4 | dominant, 15% penetrance |
| h | D2 | dominant, 20% penetrance |
| i | none | random, 30% probability |
| j | none | random, 30% probability |
| k | D2 + D4 | one disease allele at each locus, 30% penetrance |
| l | D4 | recessive, 30% penetrance |
| Affected individuals | | |
| Phenotype 1 | D1 + D2 | have traits b, e, f, h |
| Phenotype 2 | D2 + D3, D3 + D4 | c, d, e, f, g, h |
| Phenotype 3 | D1 + D4, D2 + D3 | b, c, d, e, f, g, h |
| a | D1 + D3 | dominant, 100% penetrance |
| i | none | random, 30% probability |
| j | none | random, 30% probability |
| k | D2 + D4 | one disease allele at each locus, 30% penetrance |
| l | D4 | recessive, 30% penetrance |
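The rules for unaffected individuals in the table above can be sketched as a small lookup plus a sampling step. This is not the workshop's simulation code: the names `TRAIT_RULES`, `eligible_traits`, and `sample_traits` are hypothetical, "D1 + D3" is assumed to mean a disease allele is required at both loci, and trait k's "one disease allele at each locus" is assumed to mean at least one at each.

```python
import random

# trait -> (loci required, rule, penetrance), transcribed from the
# "Unaffected individuals" rows of the table.
TRAIT_RULES = {
    "a": (("D1", "D3"), "dominant", 0.30),
    "b": (("D1",), "dominant", 0.30),
    "c": (("D4",), "dominant", 0.15),
    "d": (("D3",), "dominant", 0.18),
    "e": (("D2",), "dominant", 0.19),
    "f": (("D2",), "dominant", 0.19),
    "g": (("D4",), "dominant", 0.15),
    "h": (("D2",), "dominant", 0.20),
    "i": ((), "random", 0.30),
    "j": ((), "random", 0.30),
    "k": (("D2", "D4"), "dominant", 0.30),  # assumed: at least one allele at each locus
    "l": (("D4",), "recessive", 0.30),
}

# Minimum number of disease alleles needed at each listed locus.
ALLELES_NEEDED = {"dominant": 1, "recessive": 2, "random": 0}


def eligible_traits(genotype):
    """Return the traits whose genotype requirement is met.

    genotype maps a locus name to its number of disease alleles (0, 1, or 2);
    loci missing from the mapping count as 0.
    """
    return sorted(
        trait
        for trait, (loci, rule, _) in TRAIT_RULES.items()
        if all(genotype.get(locus, 0) >= ALLELES_NEEDED[rule] for locus in loci)
    )


def sample_traits(genotype, rng):
    """Draw the traits an unaffected individual expresses, one Bernoulli trial each."""
    return [t for t in eligible_traits(genotype) if rng.random() < TRAIT_RULES[t][2]]
```

For example, `sample_traits({"D2": 1, "D4": 1}, random.Random(0))` draws from the traits that a carrier of one disease allele at D2 and one at D4 is eligible for. Passing a seeded `random.Random` keeps the draw reproducible.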
Frequency of KPD (Kofendrerd Personality Disorder) and associated traits in each population
| | Populations | | |
| | Aipotu | Karangar | Danacaa |
| KPD | 0.024 | 0.019 | 0.008 |
| a. Joining/founding cults | 0.005 | 0.003 | 0.005 |
| b. Fear/discomfort with strangers | 0.021 | 0.016 | 0.015 |
| c. Dislike of jokes told face to face | 0.085 | 0.094 | 0.078 |
| d. Obsession with entertainers | 0.079 | 0.085 | 0.068 |
| e. Humor impairment | 0.070 | 0.066 | 0.058 |
| f. Fascination with automobiles | 0.070 | 0.069 | 0.055 |
| g. Aversion to walking | 0.090 | 0.093 | 0.074 |
| h. Uncommunicative, contentless speech patterns | 0.074 | 0.072 | 0.062 |
| i. Fiscal irresponsibility | 0.152 | 0.156 | 0.145 |
| j. Morbid anger/fear/terror concerning rain/snow | 0.156 | 0.151 | 0.152 |
| k. Reluctance to wear clothing appropriate for subjective temperature | 0.048 | 0.045 | 0.044 |
| l. Body-image concerns/mild body dysmorphic disorder | 0.029 | 0.026 | 0.027 |