Frauke Günther, Nina Wawro, Karin Bammann.
Abstract
BACKGROUND: Our aim is to investigate the ability of neural networks to model different two-locus disease models. We conduct a simulation study to compare neural networks with two standard methods, namely logistic regression models and multifactor dimensionality reduction. One hundred data sets are generated for each of six two-locus disease models, which are considered in a low and in a high risk scenario. Two models represent independence, one is a multiplicative model, and three models are epistatic. For each data set, six neural networks (with up to five hidden neurons) and five logistic regression models (the null model, three main effect models, and the full model) with two different codings for the genotype information are fitted. Additionally, the multifactor dimensionality reduction approach is applied.
Year: 2009 PMID: 20030838 PMCID: PMC2817696 DOI: 10.1186/1471-2156-10-87
Source DB: PubMed Journal: BMC Genet ISSN: 1471-2156 Impact factor: 2.797
Figure 1. Neural network. Neural network with one hidden layer consisting of three hidden neurons.
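To make the architecture in Figure 1 concrete, the following is a minimal sketch of a forward pass through a network with two genotype inputs, three hidden neurons, and one output. The logistic activation, the 0/1/2 genotype coding, and all weight values are assumptions chosen for illustration; they are not taken from the study.

```python
import math


def logistic(z):
    """Logistic activation function (assumed here for illustration)."""
    return 1.0 / (1.0 + math.exp(-z))


def forward(genotypes, hidden_weights, output_weights):
    """Forward pass of a network with one hidden layer and one output neuron.

    genotypes: input values, e.g. [g_A, g_B] coded 0/1/2 (assumed coding)
    hidden_weights: one [bias, w_1, ..., w_k] list per hidden neuron
    output_weights: [bias, v_1, ..., v_h] for the output neuron
    """
    hidden = [
        logistic(w[0] + sum(wi * x for wi, x in zip(w[1:], genotypes)))
        for w in hidden_weights
    ]
    z = output_weights[0] + sum(
        v * h for v, h in zip(output_weights[1:], hidden)
    )
    return logistic(z)


# Hypothetical weights for three hidden neurons and two inputs.
hidden_weights = [[-1.0, 0.5, 0.5], [0.2, -0.3, 0.8], [0.0, 1.0, -1.0]]
output_weights = [-2.0, 1.0, 1.0, 1.0]
risk = forward([2, 1], hidden_weights, output_weights)
```

With a logistic output, `risk` always lies strictly between 0 and 1, so it can be read as an estimated disease probability for the given genotype combination.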
Risk scenarios.
| Two-locus disease model | Low risk scenario | High risk scenario |
|---|---|---|
| ADD, HET, MULT | ||
| EPI RR | ||
| EPI DD, EPI RD | ||
Applied risk scenarios for all two-locus disease models.
Number of parameters.
| Model | Number of parameters | Number of parameters (DV) |
|---|---|---|
| Neural network | | |
| 0 hidden neurons | 3 | |
| 1 hidden neuron | 5 | |
| 2 hidden neurons | 9 | |
| 3 hidden neurons | 13 | |
| 4 hidden neurons | 17 | |
| 5 hidden neurons | 21 | |
| Logistic regression model | | |
| Null model (NM) | 1 | 1 |
| One main effect (SiA/SiB) | 2 | 3 |
| Both main effects (ME) | 3 | 5 |
| Full model (FM) | 4 | 9 |
Number of parameters for neural networks, logistic regression models and logistic regression models with design variables (DV).
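The counts in this table can be reproduced by simple arithmetic. The sketch below assumes two genotype inputs, one output neuron, a bias term for every neuron, and two design variables per locus (one for each non-reference genotype); the function names are hypothetical.

```python
def nn_parameters(hidden, inputs=2):
    """Number of weights and biases in a network with one output neuron."""
    if hidden == 0:
        # Inputs feed the output neuron directly: one weight each plus a bias.
        return inputs + 1
    # Each hidden neuron has `inputs` weights and a bias; the output neuron
    # has one weight per hidden neuron and a bias.
    return hidden * (inputs + 1) + (hidden + 1)


def lrm_parameters(loci, interaction=False, design_variables=False):
    """Intercept, main effects, and optional two-locus interaction terms."""
    per_locus = 2 if design_variables else 1  # two dummies for three genotypes
    count = 1 + loci * per_locus
    if interaction:
        count += per_locus ** 2  # products of the two loci's terms
    return count


nn_counts = [nn_parameters(h) for h in range(6)]
# nn_counts == [3, 5, 9, 13, 17, 21]

full_model = lrm_parameters(2, interaction=True)
full_model_dv = lrm_parameters(2, interaction=True, design_variables=True)
# full_model == 4, full_model_dv == 9
```

For one or more hidden neurons the network count simplifies to 4h + 1 with two inputs, which is why each additional hidden neuron adds four parameters.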
Additive model (ADD).
| | Low risk | High risk |
|---|---|---|
| Mean absolute difference | ||
| Sum | 0.2313 | 0.2059 |
| Mean absolute difference | ||
| Sum | 0.2530 | 0.2544 |
| Mean absolute difference | ||
| Sum | 0.2897 | 0.2804 |
Mean absolute differences between theoretical and estimated penetrance matrices from 100 replications in the low and high risk scenario.
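One plausible reading of the "Sum" rows is the following: for each of the nine two-locus genotype combinations, the absolute difference between theoretical and estimated penetrance is averaged over the replications, and the nine averages are then added up. The sketch below implements that reading; the aggregation order and all penetrance values are assumptions for illustration, not figures from the study.

```python
def mean_absolute_differences(theoretical, estimates):
    """Cell-wise mean of |theoretical - estimated| over all replications.

    theoretical: 3x3 penetrance matrix (rows: locus A, columns: locus B)
    estimates: list of 3x3 estimated penetrance matrices, one per replication
    """
    n = len(estimates)
    return [
        [
            sum(abs(theoretical[i][j] - est[i][j]) for est in estimates) / n
            for j in range(3)
        ]
        for i in range(3)
    ]


def matrix_sum(matrix):
    """Sum over all cells of a matrix given as a list of rows."""
    return sum(sum(row) for row in matrix)


# Hypothetical penetrances and two hypothetical replications.
theoretical = [[0.0, 0.0, 0.0], [0.0, 0.1, 0.1], [0.0, 0.1, 0.2]]
estimates = [
    [[0.02, 0.0, 0.0], [0.0, 0.1, 0.1], [0.0, 0.1, 0.1]],
    [[0.0, 0.0, 0.0], [0.0, 0.2, 0.1], [0.0, 0.1, 0.3]],
]
mad = mean_absolute_differences(theoretical, estimates)
summary = matrix_sum(mad)
```

A smaller sum indicates that a method's estimated penetrance matrix lies closer to the theoretical one on average.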
Multiplicative model (MULT).
| | Low risk | High risk |
|---|---|---|
| Mean absolute difference | ||
| Sum | 0.2428 | 0.2178 |
| Mean absolute difference | ||
| Sum | 0.3965 | 0.4887 |
| Mean absolute difference | ||
| Sum | 0.1637 | 0.1833 |
Mean absolute differences between theoretical and estimated penetrance matrices from 100 replications in the low and high risk scenario.
Epistatic model - recessive (EPI RR).
| | Low risk | High risk |
|---|---|---|
| Mean absolute difference | ||
| Sum | 0.2071 | 0.1410 |
| Mean absolute difference | ||
| Sum | 0.4849 | 0.6150 |
| Mean absolute difference | ||
| Sum | 0.3503 | 0.2755 |
Mean absolute differences between theoretical and estimated penetrance matrices from 100 replications in the low and high risk scenario.
Epistatic model - dominant (EPI DD).
| | Low risk | High risk |
|---|---|---|
| Mean absolute difference | ||
| Sum | 0.3095 | 0.2524 |
| Mean absolute difference | ||
| Sum | 0.3132 | 0.6528 |
| Mean absolute difference | ||
| Sum | 0.3071 | 0.2648 |
Mean absolute differences between theoretical and estimated penetrance matrices from 100 replications in the low and high risk scenario.
Epistatic model - mixed (EPI RD).
| | Low risk | High risk |
|---|---|---|
| Mean absolute difference | ||
| Sum | 0.2239 | 0.1563 |
| Mean absolute difference | ||
| Sum | 0.5105 | 0.8658 |
| Mean absolute difference | ||
| Sum | 0.2799 | 0.2329 |
Mean absolute differences between theoretical and estimated penetrance matrices from 100 replications in the low and high risk scenario.
Selected logistic regression models (LRM).
| LRM with design variables | |||||||
|---|---|---|---|---|---|---|---|
| Statistical model (# parameters) | |||||||
| ∑ | |||||||
| low | 1 | 39 | 100 | ||||
| high | 7 | 100 | |||||
| low | 45 | 100 | |||||
| high | 10 | 100 | |||||
| low | 10 | 100 | |||||
| high | 12 | 100 | |||||
| low | 6 | 100 | |||||
| high | 100 | ||||||
| low | 3 | 100 | |||||
| high | 100 | ||||||
| low | 19 | 20 | 100 | ||||
| high | 14 | 29 | 100 | ||||
| ∑ | |||||||
| low | 1 | 27 | 100 | ||||
| high | 4 | 100 | |||||
| low | 30 | 100 | |||||
| high | 6 | 100 | |||||
| low | 28 | 100 | |||||
| high | 46 | 100 | |||||
| low | 7 | 6 | 9 | 3 | 100 | ||
| high | 100 | ||||||
| low | 2 | 100 | |||||
| high | 100 | ||||||
| low | 19 | 21 | 100 | ||||
| high | 38 | 23 | 100 | ||||
In the upper part of the table, the two-locus disease model (ADD, HET) agrees with the statistical model when a statistical model of independence (NM, SiA, SiB, ME) is selected. In the lower part of the table, the two-locus disease model representing biological interaction (MULT, EPI RR, EPI DD, EPI RD) agrees with the statistical model when the full model (FM) is selected. Bold numbers mark the mode of the selected models in the low and high risk scenario.
MDR analyses: selected variables and identification as redundant or synergistic behavior.
| MDR analyses | ||||||||
|---|---|---|---|---|---|---|---|---|
| Redundant | Synergistic | |||||||
| Two-locus disease model | Risk scenario | Only A | Only B | Both | Only A | Only B | Both | ∑ |
| low | 18 | 100 | ||||||
| high | 7 | 100 | ||||||
| low | 32 | 100 | ||||||
| high | 1 | 6 | 100 | |||||
| low | 7 | 100 | ||||||
| high | 100 | |||||||
| low | 10 | 22 | 2 | 4 | 23 | 100 | ||
| high | 18 | 17 | 1 | 2 | 3 | 100 | ||
| low | 12 | 1 | 100 | |||||
| high | 18 | 100 | ||||||
| low | 34 | 3 | 100 | |||||
| high | 3 | 100 | ||||||
Bold numbers mark the mode of the selected variables in the low and high risk scenario.
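As background to the MDR results, the core step of multifactor dimensionality reduction pools multilocus genotype cells into a high-risk and a low-risk group by comparing each cell's case-to-control ratio with a threshold. The sketch below shows only that labelling step for two loci, using the overall case-to-control ratio as the threshold; it does not cover cross-validation, variable selection, or the redundant/synergistic classification, and the data and function name are hypothetical.

```python
from collections import Counter


def mdr_high_risk_cells(genotypes, status):
    """Label each two-locus genotype cell as high risk (True) or low risk (False).

    genotypes: list of (g_A, g_B) tuples, one per individual
    status: list of 1 (case) or 0 (control); must contain at least one control
    """
    cases = Counter(g for g, s in zip(genotypes, status) if s == 1)
    controls = Counter(g for g, s in zip(genotypes, status) if s == 0)
    # Assumed threshold: the overall case-to-control ratio in the data set.
    threshold = sum(cases.values()) / sum(controls.values())
    labels = {}
    for cell in set(cases) | set(controls):
        if controls[cell] == 0:
            labels[cell] = True  # cases only: treated as high risk
        else:
            labels[cell] = cases[cell] / controls[cell] >= threshold
    return labels


# Hypothetical toy data: six individuals, two observed genotype cells.
genotypes = [(0, 0), (0, 0), (2, 2), (2, 2), (2, 2), (0, 0)]
status = [0, 0, 1, 1, 0, 1]
labels = mdr_high_risk_cells(genotypes, status)
```

In this toy data set the overall ratio is 1, so the cell `(2, 2)` with two cases and one control is labelled high risk, while `(0, 0)` with one case and two controls is labelled low risk.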