Jing Chen, Bruce J Aronow, Anil G Jegga.
Abstract
BACKGROUND: Most current methods for identifying and prioritizing disease candidate genes depend on functional annotations, but the coverage of those annotations is a limiting factor. In the current study, we describe a candidate gene prioritization method that is based entirely on protein-protein interaction network (PPIN) analyses.
Year: 2009 PMID: 19245720 PMCID: PMC2657789 DOI: 10.1186/1471-2105-10-73
Source DB: PubMed Journal: BMC Bioinformatics ISSN: 1471-2105 Impact factor: 3.169
Figure 1. ROC curves from cross validations. Representative ROC curves are shown for PageRank with Priors with back probability 0.01, 0.05, 0.1, 0.3 and 0.5, and for HITS with Priors with back probability 0.3 and 0.5. The random curve was derived by prioritizing the random training set using PageRank with Priors with back probability 0.3.
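The back probability in Figure 1 is the chance that the random walk jumps back to the training (seed) genes at each step. The following is a minimal sketch, assuming the standard random-walk-with-restart form of PageRank with Priors on an undirected network. The toy graph, the node names and the function `pagerank_with_priors` are illustrative assumptions, not the study's implementation or data.

```python
def pagerank_with_priors(adj, seeds, back_prob=0.3, iters=100):
    """Iterate score(v) = back_prob * prior(v)
    + (1 - back_prob) * sum over neighbours u of score(u) / degree(u)."""
    nodes = list(adj)
    # Assumption: the prior is uniform over the seed (training) genes.
    prior = {v: (1.0 / len(seeds) if v in seeds else 0.0) for v in nodes}
    score = dict(prior)
    for _ in range(iters):
        nxt = {v: back_prob * prior[v] for v in nodes}
        for u in nodes:
            if not adj[u]:
                continue  # isolated node: nothing to pass on
            share = (1.0 - back_prob) * score[u] / len(adj[u])
            for v in adj[u]:
                nxt[v] += share
        score = nxt
    return score


# Hypothetical four-protein network; "A" plays the role of a training gene.
toy_ppin = {
    "A": ["B", "C"],
    "B": ["A", "C"],
    "C": ["A", "B", "D"],
    "D": ["C"],
}
scores = pagerank_with_priors(toy_ppin, seeds={"A"}, back_prob=0.3)
ranking = sorted(scores, key=scores.get, reverse=True)
print(ranking)
```

A small back probability lets the walk wander far from the seeds, while a large one keeps most of the score on the seeds and their close neighbours.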
Figure 2. ROC curves from cross validations. Representative ROC curves are shown for the K-Step Markov method with K = 1, 2, 4, and 6. The random curve was derived by prioritizing the random training set using PageRank with Priors with back probability 0.3.
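The parameter K in Figure 2 is the length of the random walk. As a rough illustration, assuming the usual definition of K-Step Markov importance (the total probability of visiting a node during a walk of exactly K steps started from the seed genes), the idea can be sketched as follows. The graph and the function name `k_step_markov` are hypothetical.

```python
def k_step_markov(adj, seeds, k=4):
    """Sum the visit probabilities of a random walk over steps 1..k."""
    nodes = list(adj)
    # Assumption: the walk starts uniformly from the seed (training) genes.
    dist = {v: (1.0 / len(seeds) if v in seeds else 0.0) for v in nodes}
    total = {v: 0.0 for v in nodes}
    for _ in range(k):
        nxt = {v: 0.0 for v in nodes}
        for u in nodes:
            if adj[u]:
                share = dist[u] / len(adj[u])
                for v in adj[u]:
                    nxt[v] += share
        dist = nxt
        for v in nodes:
            total[v] += dist[v]
    return total


# Hypothetical four-protein network; "A" plays the role of a training gene.
toy_ppin = {
    "A": ["B", "C"],
    "B": ["A", "C"],
    "C": ["A", "B", "D"],
    "D": ["C"],
}
one_step = k_step_markov(toy_ppin, seeds={"A"}, k=1)
two_step = k_step_markov(toy_ppin, seeds={"A"}, k=2)
print(one_step)
print(two_step)
```

With K = 1 only the direct interactants of the seeds receive a nonzero score, so all more distant genes are tied at zero; longer walks spread score further into the network.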
AUC values from each cross validation run.
| Test Type | Run | AUC | Run | AUC | Run | AUC | Run | AUC | Run | AUC |
| k1 | 1 | 0.66 | 2 | 0.66 | 3 | 0.87 | 4 | 0.66 | 5 | 0.66 |
| k2 | 6 | 0.78 | 7 | 0.78 | 8 | 0.78 | 9 | 0.78 | 10 | 0.78 |
| k4 | 11 | 0.8 | 12 | 0.8 | 13 | 0.8 | 14 | 0.8 | 15 | 0.8 |
| k6 | 16 | 0.8 | 17 | 0.8 | 18 | 0.8 | 19 | 0.8 | 20 | 0.8 |
| h3 | 21 | 0.8 | 22 | 0.8 | 23 | 0.8 | 24 | 0.8 | 25 | 0.8 |
| h5 | 26 | 0.8 | 27 | 0.8 | 28 | 0.8 | 29 | 0.8 | 30 | 0.8 |
| p01 | 36 | 0.73 | 37 | 0.73 | 38 | 0.73 | 39 | 0.73 | 40 | 0.73 |
| p05 | 31 | 0.78 | 32 | 0.78 | 33 | 0.78 | 34 | 0.78 | 35 | 0.78 |
| p1 | 41 | 0.79 | 42 | 0.79 | 43 | 0.79 | 44 | 0.79 | 45 | 0.79 |
| p3 | 46 | 0.8 | 47 | 0.8 | 48 | 0.8 | 49 | 0.8 | 50 | 0.8 |
| p5 | 51 | 0.8 | 52 | 0.8 | 53 | 0.8 | 54 | 0.8 | 55 | 0.8 |
| p7 | 56 | 0.8 | 57 | 0.8 | 58 | 0.8 | 59 | 0.8 | 60 | 0.8 |
| p9 | 61 | 0.79 | 62 | 0.8 | 63 | 0.79 | 64 | 0.8 | 65 | 0.8 |
The "Test Type" column indicates the method and parameter setting of each test: p01 through p9 stand for PageRank with Priors with back probability 0.01 to 0.9; k1, k2, k4 and k6 stand for K-Step Markov with K = 1, 2, 4 and 6; h3 and h5 stand for HITS with Priors with back probability 0.3 and 0.5. There were 13 test conditions, each repeated 5 times.
Mean and standard deviation (SD) of AUC values with 13 different cross validation conditions.
| Method | Parameter | Mean AUC | SD |
| PageRank with Priors | Back probability = 0.01 | 0.73 | 0.0008 |
| PageRank with Priors | Back probability = 0.05 | 0.78 | 0.0015 |
| PageRank with Priors | Back probability = 0.1 | 0.8 | 0.0013 |
| PageRank with Priors | Back probability = 0.3 | | |
| PageRank with Priors | Back probability = 0.5 | 0.8 | 0.0015 |
| PageRank with Priors | Back probability = 0.7 | 0.8 | 0.0009 |
| PageRank with Priors | Back probability = 0.9 | 0.79 | 0.0007 |
| K-Step Markov | K = 1 | 0.7 | 0.096 |
| K-Step Markov | K = 2 | 0.78 | 0.0005 |
| K-Step Markov | K = 4 | 0.8 | 0.0024 |
| K-Step Markov | K = 6 | | |
| HITS with Priors | Back probability = 0.3 | | |
| HITS with Priors | Back probability = 0.5 | 0.8 | 0.0004 |
Highlighted rows correspond to the best parameter value of each method.
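As a worked check of how a mean and SD row relates to the per-run values, the sketch below recomputes the K = 1 summary from the five k1 AUC values listed in the per-run table. It assumes the sample standard deviation was used; the variable names are illustrative.

```python
import statistics

# Per-run AUC values for K-Step Markov with K = 1 (test type k1).
k1_auc = [0.66, 0.66, 0.87, 0.66, 0.66]

k1_mean = statistics.mean(k1_auc)
# Assumption: sample SD (n - 1 in the denominator).
k1_sd = statistics.stdev(k1_auc)

print(round(k1_mean, 2), round(k1_sd, 3))
```

The mean rounds to the reported 0.7. The SD comes out close to, but not exactly, the reported 0.096, presumably because the per-run values are themselves rounded to two decimals. The single run with AUC 0.87 accounts for essentially all of the spread.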
Figure 3. Plots of AUC with different parameter values. The left panel shows the AUC values of PageRank with Priors with back probability varied from 0.01 to 0.9. The right panel shows the AUC values of the K-Step Markov method with random walk length varied from 1 to 6. The vertical bars indicate the standard deviations.
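Each AUC value plotted here summarizes one ROC curve. A minimal sketch of one common way to compute it, assuming the pairwise (Mann-Whitney) definition in which AUC is the fraction of held-out/background gene pairs where the held-out gene scores higher. The scores below are made up and the function name `auc_from_scores` is hypothetical.

```python
def auc_from_scores(positive_scores, negative_scores):
    """Fraction of (positive, negative) pairs ranked correctly; ties count 0.5."""
    wins = 0.0
    for p in positive_scores:
        for n in negative_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positive_scores) * len(negative_scores))


# Made-up prioritization scores: held-out disease genes vs. background genes.
perfect = auc_from_scores([0.9, 0.8], [0.1, 0.2])
partial = auc_from_scores([0.9, 0.3], [0.5, 0.1])
print(perfect, partial)
```

An AUC of 0.5 corresponds to random ordering and 1.0 to a perfect separation of held-out genes from the background.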
Cardiac septal defect candidate gene prioritization.
[Table rows for ranks 1 to 20; the gene names in each method's column were not preserved.]
The cardiac septal defect sub-network was created from known cardiac septal defect genes (from OMIM) and their immediate interactants, and was prioritized using functional annotation-based and PPIN-based methods. Functional annotation-based prioritization was done using the ToppGene server. The PPIN-based rankings were obtained using three methods: K-Step Markov, HITS with Priors, and PageRank with Priors. The highlighted genes are those that occur in every top 20 list produced by the different methods. Note that HITS with Priors and PageRank with Priors gave identical results (see Additional Files 3, 4 and 5 for the list of genes and prioritization results). Genes marked with * are associated with abnormal heart morphology (ToppGene: 15/20; K-Step Markov: 14/20; HITS with Priors and PageRank with Priors: 13/20), while those marked with # have been reported to be associated with cardiac septal defects (6/20 in the ToppGene top 20 and 3/20 in the PPIN-prioritized top 20).
Figure 4. Prioritized candidate genes of cardiac septal defects using both functional annotation-based and PPIN-based methods. Panel A shows the sub-network of heart septal defect related genes, comprising (i) genes associated with OMIM diseases that have the phenotype of cardiac septal defect (the training set) and (ii) their immediate interactants (the test set). The size of each node is proportional to its degree (number of edges). Panel B shows the intersection among the top 20 ranked cardiac septal defect candidate genes from the functional annotation-based and PPIN-based methods. Functional annotation-based prioritization was done using the ToppGene server. For the PPIN-based methods, K-Step Markov, HITS with Priors, and PageRank with Priors were used. Panel C shows the top 20 ranked cardiac septal defect genes (from the PPIN-based and functional annotation-based methods) along with their connectivity to training set genes (based on protein-protein interactions).
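The intersection in Panel B amounts to taking the top-ranked genes from each method and keeping those common to all lists. A small sketch under that reading; the gene labels `G1` to `G6` are placeholders rather than genes from the study, and `shared_top_genes` is a hypothetical helper.

```python
def shared_top_genes(rankings, top_n=20):
    """Return the genes that appear in the top_n of every ranking."""
    top_sets = [set(genes[:top_n]) for genes in rankings.values()]
    return set.intersection(*top_sets)


# Placeholder rankings, best candidate first (not real results).
rankings = {
    "ToppGene": ["G1", "G2", "G3", "G4"],
    "K-Step Markov": ["G2", "G1", "G5", "G3"],
    "PageRank with Priors": ["G2", "G6", "G1", "G3"],
}
shared = sorted(shared_top_genes(rankings, top_n=3))
print(shared)
```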
Summary of functional annotation coverage of human interactome genes.
| Annotation sources | Number of genes |
| GO + MP + Pathways | 2440 |
| GO + Pathways | 1630 |
| GO + MP | 1232 |
| MP + Pathways | 4 |
| GO only | 2448 |
| MP only | 10 |
| Pathways only | 47 |
| None | 223 |
About one third (2505/8034) of the genes in the interactome are sparsely annotated, that is, covered by only one of the three annotation sources. GO: Gene Ontology; MP: Mammalian Phenotype; Pathways: pathway annotations.
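The 2505/8034 figure can be checked directly from the table counts. In the sketch below, the key names are illustrative, and the label for the 223-gene row is an assumption (the row is treated as genes with no annotation from any of the three sources).

```python
# Gene counts per combination of annotation sources, copied from the table.
coverage = {
    "GO + MP + Pathways": 2440,
    "GO + Pathways": 1630,
    "GO + MP": 1232,
    "MP + Pathways": 4,
    "GO only": 2448,
    "MP only": 10,
    "Pathways only": 47,
    "unlabeled row (assumed: no annotation)": 223,
}

total_genes = sum(coverage.values())
# Sparsely annotated: genes covered by exactly one annotation source.
single_source = sum(n for label, n in coverage.items() if label.endswith("only"))
fraction = single_source / total_genes

print(total_genes, single_source, round(fraction, 3))
```

The counts sum to the 8034 interactome genes, and the three single-source rows sum to 2505, which is about 31% of the total.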