Alison A Motsinger, Stephen L Lee, George Mellick, Marylyn D Ritchie.
Abstract
BACKGROUND: The identification and characterization of genes that influence the risk of common, complex multifactorial disease primarily through interactions with other genes and environmental factors remains a statistical and computational challenge in genetic epidemiology. We have previously introduced a genetic programming optimized neural network (GPNN) as a method for optimizing the architecture of a neural network to improve the identification of gene combinations associated with disease risk. The goal of this study was to evaluate the power of GPNN for identifying high-order gene-gene interactions. We were also interested in applying GPNN to a real data analysis in Parkinson's disease.
Year: 2006 PMID: 16436204 PMCID: PMC1388239 DOI: 10.1186/1471-2105-7-39
Source DB: PubMed Journal: BMC Bioinformatics ISSN: 1471-2105 Impact factor: 3.169
GPNN Power (%) Results – Sample Size 400
| | | Heritability | | | | |
| Allele freq | Number loci | 3% | 2% | 1.5% | 1% | 0.5% |
| .2/.8 | 2 | 100 | 94 | 97 | 81 | 24 |
| .4/.6 | 2 | 100 | 99 | 99 | 77 | 16 |
| .2/.8 | 3 | 99 | 94 | 22 | 4 | 3 |
| .4/.6 | 3 | 75 | 35 | 20 | 3 | 1 |
| .2/.8 | 4 | 46 | 23 | 0 | 5 | 0 |
| .4/.6 | 4 | 11 | 2 | 0 | 2 | 0 |
| .2/.8 | 5 | 0 | 1 | 0 | 1 | 0 |
| .4/.6 | 5 | 0 | 0 | 0 | 0 | 0 |
GPNN Power (%) Results – Sample Size 800
| | | Heritability | | | | |
| Allele freq | Number loci | 3% | 2% | 1.5% | 1% | 0.5% |
| .2/.8 | 2 | 100 | 100 | 100 | 99 | 76 |
| .4/.6 | 2 | 100 | 100 | 100 | 99 | 65 |
| .2/.8 | 3 | 98 | 100 | 31 | 10 | 12 |
| .4/.6 | 3 | 97 | 50 | 42 | 15 | 3 |
| .2/.8 | 4 | 68 | 42 | 4 | 14 | 2 |
| .4/.6 | 4 | 34 | 11 | 3 | 3 | 1 |
| .2/.8 | 5 | 2 | 6 | 0 | 6 | 0 |
| .4/.6 | 5 | 1 | 0 | 0 | 1 | 0 |
GPNN Power (%) Results – Sample Size 1600
| | | Heritability | | | | |
| Allele freq | Number loci | 3% | 2% | 1.5% | 1% | 0.5% |
| .2/.8 | 2 | 100 | 97 | 100 | 100 | 97 |
| .4/.6 | 2 | 100 | 100 | 100 | 99 | 86 |
| .2/.8 | 3 | 100 | 99 | 40 | 21 | 20 |
| .4/.6 | 3 | 97 | 65 | 53 | 20 | 6 |
| .2/.8 | 4 | 70 | 45 | 15 | 11 | 3 |
| .4/.6 | 4 | 30 | 15 | 5 | 3 | 0 |
| .2/.8 | 5 | 2 | 1 | 0 | 0 | 0 |
| .4/.6 | 5 | 2 | 0 | 0 | 0 | 0 |
GPNN Results from Parkinson's Disease Data Analysis
| CV | Factors in Model | CE | PE |
| 1 | | 0.4050 | 0.4127 |
| 2 | | 0.3978 | 0.5079 |
| 3 | | 0.3996 | 0.3810 |
| 4 | | 0.3936 | 0.4355 |
| 5 | | 0.4007 | 0.4355 |
| 6 | | 0.3989 | 0.3871 |
| 7 | | 0.3989 | 0.3871 |
| 8 | | 0.3828 | 0.5323 |
| 9 | | 0.3982 | 0.4098 |
| 10 | | 0.3929 | 0.3934 |
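The per-interval errors in the table above can be summarized by averaging across the ten cross-validation intervals. The following is a minimal sketch of that arithmetic, not the authors' code. It assumes that CE and PE denote classification error (training) and prediction error (testing), which the table itself does not spell out; the names `cv_errors` and `summarize` are hypothetical.

```python
from statistics import mean

# (CE, PE) for each cross-validation interval, copied from the table above.
# Assumption: CE = classification error, PE = prediction error.
cv_errors = [
    (0.4050, 0.4127),
    (0.3978, 0.5079),
    (0.3996, 0.3810),
    (0.3936, 0.4355),
    (0.4007, 0.4355),
    (0.3989, 0.3871),
    (0.3989, 0.3871),
    (0.3828, 0.5323),
    (0.3982, 0.4098),
    (0.3929, 0.3934),
]


def summarize(errors):
    """Return (mean CE, mean PE) across cross-validation intervals."""
    mean_ce = mean(ce for ce, _ in errors)
    mean_pe = mean(pe for _, pe in errors)
    return mean_ce, mean_pe


mean_ce, mean_pe = summarize(cv_errors)
print(f"mean CE = {mean_ce:.4f}, mean PE = {mean_pe:.4f}")
```

Comparing the two averages gives a rough sense of how far the testing error drifts from the training error across intervals.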
Figure 1. GPNN model for Parkinson's Disease data. A GPNN model that was evolved by GPNN on the PD data. The real numbers are used to create weights and fill in for the W nodes. The individual values of sex and DLST_234 fill into those nodes. The activation function is a Boolean function AND, thus it will take (61055.5/33038.075)*sex AND (96492.325*11716.425)*DLST_234.
Stepwise Logistic Regression Results from Parkinson's Disease Analysis
| Effect | Point Estimate | p-value | OR | 95% Wald CI (lower) | 95% Wald CI (upper) |
| | 0.2501 | 0.0438 | 1.284 | 1.007 | 1.638 |
| | 0.3908 | 0.0491 | 1.478 | 1.002 | 2.181 |
| | -0.1879 | 0.0913 | 0.829 | 0.666 | 1.031 |
| | -0.7997 | <.0001 | 0.449 | 0.324 | 0.623 |
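In the stepwise table above, the OR column appears to be the exponential of the point estimate, which is the usual relationship between a logistic regression coefficient and its odds ratio. The sketch below checks that relationship against the reported values and backs out the standard error implied by each Wald interval. It is an illustrative consistency check under those assumptions, not part of the original analysis; `odds_ratio` and `implied_se` are hypothetical helper names.

```python
import math

# (point estimate, reported OR, reported CI lower, reported CI upper),
# copied from the stepwise logistic regression table above.
rows = [
    (0.2501, 1.284, 1.007, 1.638),
    (0.3908, 1.478, 1.002, 2.181),
    (-0.1879, 0.829, 0.666, 1.031),
    (-0.7997, 0.449, 0.324, 0.623),
]


def odds_ratio(beta):
    """Odds ratio for a logistic regression coefficient."""
    return math.exp(beta)


def implied_se(lower, upper, z=1.959964):
    """Standard error implied by a 95% Wald interval.

    Assumes the interval is symmetric on the log-odds scale:
    exp(beta - z * se) to exp(beta + z * se).
    """
    return (math.log(upper) - math.log(lower)) / (2.0 * z)


for beta, or_reported, ci_lower, ci_upper in rows:
    print(
        f"beta={beta:+.4f}  exp(beta)={odds_ratio(beta):.3f}  "
        f"reported OR={or_reported:.3f}  "
        f"implied SE={implied_se(ci_lower, ci_upper):.4f}"
    )
```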
Forward Logistic Regression Results from Parkinson's Disease Analysis
| Effect | Point Estimate | p-value | OR | 95% Wald CI (lower) | 95% Wald CI (upper) |
| | 0.2564 | 0.0374 | 1.292 | 1.015 | 1.645 |
| | -0.7730 | <.0001 | 0.462 | 0.334 | 0.638 |
Figure 2. A NN evolved by GPNN. An example of a NN evolved by GPNN. The O is the output node, S indicates the activation function, W indicates a weight, and X1-X4 are the NN inputs.
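The node types named in the Figure 2 caption can be read as an expression tree: input leaves (X1-X4), weights (W) on the edges, and activation nodes (S) feeding the output (O). The sketch below shows one way such a tree could be evaluated. It is only an illustration: the caption does not say which activation function is used, so a logistic sigmoid is assumed here, and the topology, weight values, and input coding are all made up.

```python
import math
from dataclasses import dataclass, field


def sigmoid(z):
    """Logistic activation; an assumed stand-in for the S nodes."""
    return 1.0 / (1.0 + math.exp(-z))


@dataclass
class InputNode:
    """Leaf node: reads one input X_i (0-based index into the input vector)."""
    index: int

    def evaluate(self, x):
        return x[self.index]


@dataclass
class ActivationNode:
    """S node: applies the activation to a weighted sum of its children."""
    children: list = field(default_factory=list)  # (weight, node) pairs

    def evaluate(self, x):
        total = sum(w * child.evaluate(x) for w, child in self.children)
        return sigmoid(total)


# Hypothetical topology and weights, for illustration only.
hidden_a = ActivationNode([(0.8, InputNode(0)), (-1.2, InputNode(1))])
hidden_b = ActivationNode([(0.5, InputNode(2)), (1.5, InputNode(3))])
output = ActivationNode([(2.0, hidden_a), (-1.0, hidden_b)])

# Made-up numeric coding for X1-X4.
example_inputs = [1, 0, -1, 1]
print(output.evaluate(example_inputs))
```

Representing the network as a tree is what would let a genetic programming search swap or mutate whole subtrees; the classes above only cover evaluation, not the evolutionary search itself.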
Figure 3. Overview of GPNN Method.
Demographic characteristics
| | Cases (n = 305) | Controls (n = 321) | p-value |
| Sex | 166 males | 206 males | <.0001* |
| | 139 females | 115 females | |
| Average age | 67 ± 9 years | 65 ± 9 years | 0.0131** |
| Average age of onset | 60 ± 10 years | NA | |
* based on Fisher's exact test
** based on Student's t-test
List of mitochondrial polymorphisms
| Marker | dbSNP ID# | Major Allele Frequency (%) |
| | rs1799900 | A = 51.4 |
| | rs1801316 | G = 98.7 |
| | rs1800823 | T = 92.9 |
| | rs1800824 | T = 92.9 |
| | rs1801311 | C = 66.4 |
| | rs1045629 | C = 83.0 |
| | rs4679 | A = 58.8 |
| | rs6822 | G = 87.7 |
| | ss10349 | G = 83.0 |
| | ss16204 | A = 62.0 |
| | rs12762 | C = 89.3 |
| | rs9543 | C = 53.2 |
| | rs1800662 | C = 77.9 |
| | rs1128560 | C = 96.1 |
| | rs1801317 | G = 54.3 |
| | ss2421568 | A = 64.8 |
| | rs12570 | T = 66.9 |
| | rs567 | A = 50.5 |
| | rs31303 | G = 77.3 |
| | rs1801315 | T = 57.1 |
| | rs1051806 | C = 80.7 |
| | rs906807 | C = 81.3 |