Ke-Shiuan Lynn, Li-Lan Li, Yen-Ju Lin, Chiuen-Huei Wang, Shu-Hui Sheng, Ju-Hwa Lin, Wayne Liao, Wen-Lian Hsu, Wen-Harn Pan.
Abstract
MOTIVATION: Identifying disease-related genes from high-throughput microarray data is more difficult for complex diseases than for monogenic ones. We hypothesized that an endophenotype derived from transcriptional data is associated with a set of genes corresponding to a pathway cluster. We assumed that a complex disease is associated with multiple endophenotypes and can be induced by their up/downregulated gene expression patterns. Thus, a neural network model was adopted to simulate the gene-endophenotype-disease relationship, with endophenotypes represented by hidden nodes.
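The gene-endophenotype-disease structure described above can be illustrated with a minimal forward pass through a network with one hidden layer. This is only a sketch under stated assumptions: the sigmoid activations, the function name `forward`, and all weights and expression values are hypothetical and are not taken from the paper; the use of three hidden nodes mirrors the three endophenotypes mentioned in the Fig. 1 caption.

```python
import math


def sigmoid(x):
    """Logistic function mapping any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def forward(expression, hidden_weights, hidden_biases, output_weights, output_bias):
    """Return (endophenotype activations, disease score) for one subject.

    Each hidden node stands for one endophenotype: a weighted combination
    of gene expression values. The output node combines the endophenotypes
    into a single disease score. Sigmoid activations are an assumption.
    """
    endophenotypes = [
        sigmoid(sum(w * x for w, x in zip(weights, expression)) + bias)
        for weights, bias in zip(hidden_weights, hidden_biases)
    ]
    score = sigmoid(
        sum(w * h for w, h in zip(output_weights, endophenotypes)) + output_bias
    )
    return endophenotypes, score


# Illustrative (made-up) expression values for four genes in one subject.
expression = [0.2, -1.0, 0.5, 1.3]
# One row of gene weights per hidden node (three hypothetical endophenotypes).
hidden_weights = [
    [0.7, 0.0, -0.3, 0.0],
    [0.0, 0.5, 0.0, 0.4],
    [-0.2, 0.0, 0.6, 0.0],
]
hidden_biases = [0.0, 0.0, 0.0]
output_weights = [1.0, -1.0, 0.5]
output_bias = 0.0

endophenotypes, disease_score = forward(
    expression, hidden_weights, hidden_biases, output_weights, output_bias
)
```

In this reading, a zero weight means a gene does not contribute to an endophenotype, so each hidden node effectively selects its own gene set.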
Year: 2009 PMID: 19237446 PMCID: PMC2666815 DOI: 10.1093/bioinformatics/btp106
Source DB: PubMed Journal: Bioinformatics ISSN: 1367-4803 Impact factor: 6.937
Performance of the five neural network models constructed via 5-fold cross-validation
| Model 1 | | | | | |
| Training sets | 0.959 ± 0.012 | 0.612 | 0.461 ± 0.005 | 0.626 ± 0.158 | 0 |
| Validation sets | 0.960 ± 0.047 | 0.612 | 0.708 ± 0.013 | 0.289 ± 0.139 | 0 |
| Test set | 0.865 | 0.612 | 0.654 | 0.775 | 0 |
| Model 2 | | | | | |
| Training sets | 0.959 ± 0.010 | 0.709 | 0.417 ± 0.021 | 0.564 ± 0.084 | 0 |
| Validation sets | 0.960 ± 0.040 | 0.709 | 0.664 ± 0.021 | 0.207 ± 0.250 | 0 |
| Test set | 0.811 | 0.709 | 0.609 | 0.711 | 0 |
| Model 3 | | | | | |
| Training sets | | | | | |
| Validation sets | | | | | |
| Test set | | | | | |
| Model 4 | | | | | |
| Training sets | 0.951 ± 0.014 | 0.738 | 0.451 ± 0.011 | 0.741 ± 0.058 | 0 |
| Validation sets | 0.950 ± 0.054 | 0.738 | 0.709 ± 0.012 | 0.466 ± 0.490 | 0 |
| Test set | 0.811 | 0.738 | 0.668 | 0.725 | 0 |
| Model 5 | | | | | |
| Training sets | 0.976 ± 0.009 | 0.738 | 0.432 ± 0.007 | 0.825 ± 0.109 | 0 |
| Validation sets | 0.976 ± 0.035 | 0.738 | 0.696 ± 0.014 | 0.479 ± 0.292 | 0 |
| Test set | 0.865 | 0.738 | 0.654 | 0.781 | 0 |
Note: Model 3 is adopted as the final model; its values are set in boldface.
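The 5-fold cross-validation named in the table caption can be sketched as follows. This is a generic illustration, not the authors' procedure: the function name `k_fold_indices`, the shuffling seed, and the sample count are hypothetical, and the sketch assumes that any held-out test set has already been set aside before the folds are formed.

```python
import random


def k_fold_indices(n_samples, k=5, seed=0):
    """Split sample indices into k (training, validation) pairs.

    Every sample appears in exactly one validation fold and in the
    training portion of the remaining k - 1 splits.
    """
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)  # fixed seed for reproducibility
    folds = [indices[i::k] for i in range(k)]  # k nearly equal folds
    splits = []
    for i in range(k):
        validation = sorted(folds[i])
        training = sorted(
            idx for j, fold in enumerate(folds) if j != i for idx in fold
        )
        splits.append((training, validation))
    return splits


# Hypothetical example: 23 subjects remaining after a test set is held out.
splits = k_fold_indices(23, k=5)
for fold_number, (training, validation) in enumerate(splits, start=1):
    print(fold_number, len(training), len(validation))
```

A model would then be fitted on each training portion and scored on its validation fold, which is one common way to obtain mean ± spread figures like those reported for the training and validation sets.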
Fig. 1. The three constructed endophenotypes indicated in red, green and blue. From left to right the figure shows: (1) cluster tree of genes in each endophenotype (the dark and light shades are used to distinguish subclusters), (2) correlation coefficient between the gene cluster and its corresponding endophenotype, (3) normalized correlation (0–1) of the gene with others in the endophenotype, (4) P-value of the endophenotype to hypertension, (5) P-value (0–0.01) of a gene to hypertension, (6) absolute weight (0–0.7) of a gene to its corresponding endophenotype, (7) sequence number of the gene in the endophenotype, (8) gene symbol (highly consistent and influential genes are highlighted in purple, while genes with hypertension-related functions are highlighted in yellow), (9) cytogenetic information, (10) Unigene ID from Unigene build #163 and (11) major functions of selected genes.
Fig. 2. Average blood pressure values [diastolic blood pressure (DBP), systolic blood pressure (SBP)] and average network outputs across various endophenotypic patterns using training data: a blue circle or cross indicates a data point for an individual; error bars indicate the means and standard errors (SEs) of subject subgroups. (a and b) Individual DBP, their mean values and SEs. (c and d) Individual SBP, their mean values and SEs. (e and f) The network outputs, their mean values and SEs. (g and h) Binarized endophenotype values (using a threshold of 0.5) showing endophenotypic patterns. (i and j) Original endophenotype values represented in a gradation of red and blue colors (refer to the color bar at the right-hand side for actual magnitude); the vertical blocks between the black dashed lines denote subject subgroups defined by different endophenotype patterns (refer to g and h); horizontal blocks denote endophenotypes.
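The binarization at a threshold of 0.5 and the grouping of subjects by endophenotypic pattern, as described in the Fig. 2 caption, might look like the sketch below. The subject identifiers and endophenotype values are invented for illustration, the function names are hypothetical, and treating a value exactly equal to the threshold as 0 is an assumption the caption does not settle.

```python
from collections import defaultdict


def binarize(values, threshold=0.5):
    """Map each endophenotype value to 1 if above the threshold, else 0."""
    return tuple(1 if v > threshold else 0 for v in values)


def group_by_pattern(subjects, threshold=0.5):
    """Group subject IDs by their binarized endophenotype pattern."""
    groups = defaultdict(list)
    for subject_id, values in subjects.items():
        groups[binarize(values, threshold)].append(subject_id)
    return dict(groups)


# Made-up hidden-node outputs for three endophenotypes per subject.
subjects = {
    "s1": [0.9, 0.1, 0.7],
    "s2": [0.8, 0.2, 0.6],
    "s3": [0.1, 0.4, 0.3],
}
patterns = group_by_pattern(subjects)
print(patterns)  # {(1, 0, 1): ['s1', 's2'], (0, 0, 0): ['s3']}
```

Each distinct tuple corresponds to one subject subgroup, over which mean blood pressure and standard errors could then be computed.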
Fig. 3. Gene expression plot of the three endophenotypes: each vertical strip represents a subject; each horizontal strip represents a gene; the colors indicating expression level are shown in the color bar on the right-hand side. The vertical blocks between black lines denote subject subgroups defined by different endophenotypic patterns (refer to Fig. 2g and h); horizontal blocks denote endophenotypes. ‘S1’, ‘S2’ and ‘S3’ are magnified views of the GCGR expression level in the three major subject groups (indicated by the three yellow boxes).