Abstract
BACKGROUND: Proteins that interact directly with each other tend to have similar functions and to be involved in the same cellular processes. Mutations in the genes that code for them often lead to the same family of disease phenotypes. Efforts have been made to prioritize positional candidate genes for complex diseases using protein-protein interaction (PPI) information, but such an approach is often considered too general to be practically useful for specific diseases.
Year: 2009 PMID: 20148193 PMCID: PMC2818071 DOI: 10.4172/jcsb.1000025
Source DB: PubMed Journal: J Comput Sci Syst Biol ISSN: 0974-7230
Figure 1. PPI networks of T1D disease genes according to HPRD (left) and HT (right).
Figure 2. The topological features of the T1D disease genes in the PPI network are distinct from those of other genes. (A) The degree distribution of all proteins follows a power law (r ~ 0.98, p < 0.001), p(k) ~ k^(−λ) with λ ~ 1.35, indicating that the PPI network is scale-free. The distribution for the candidate genes clearly deviates from the power law and is skewed significantly toward higher degrees. (B) The clustering coefficient (CC) plotted against degree k. CC declines linearly with increasing k, suggesting that the network is modular. The distribution for the disease genes again deviates from that of random genes, with more interactions among their level-1 neighbours.
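The two quantities in the Figure 2 caption, the degree distribution p(k) and the clustering coefficient CC, can be computed from an edge list. The sketch below is illustrative only. The source does not say how the exponent λ was fitted, so the least-squares fit on log-log axes is an assumption, and all function names and the toy edge list are hypothetical.

```python
import math
from collections import Counter, defaultdict


def build_adjacency(edges):
    """Return {node: set(neighbours)} for an undirected edge list, ignoring self-loops."""
    adj = defaultdict(set)
    for a, b in edges:
        if a != b:
            adj[a].add(b)
            adj[b].add(a)
    return adj


def degree_distribution(adj):
    """Return {k: fraction of nodes that have degree k}."""
    counts = Counter(len(nbrs) for nbrs in adj.values())
    n = sum(counts.values())
    return {k: c / n for k, c in counts.items()}


def fit_power_law(pk):
    """Least-squares fit of log p(k) = c - lam * log k (assumed fitting method).

    Returns (lam, r), where r is the Pearson correlation on log-log axes;
    r is negative for a decaying distribution.
    """
    pts = [(math.log(k), math.log(p)) for k, p in pk.items() if k > 0 and p > 0]
    n = len(pts)
    mx = sum(x for x, _ in pts) / n
    my = sum(y for _, y in pts) / n
    sxx = sum((x - mx) ** 2 for x, _ in pts)
    syy = sum((y - my) ** 2 for _, y in pts)
    sxy = sum((x - mx) * (y - my) for x, y in pts)
    return -sxy / sxx, sxy / math.sqrt(sxx * syy)


def clustering_coefficient(adj, node):
    """Fraction of pairs of a node's neighbours that are themselves linked."""
    nbrs = list(adj[node])
    k = len(nbrs)
    if k < 2:
        return 0.0
    links = sum(
        1
        for i in range(k)
        for j in range(i + 1, k)
        if nbrs[j] in adj[nbrs[i]]
    )
    return 2.0 * links / (k * (k - 1))


# Toy network (hypothetical): a triangle A-B-C with a pendant node D attached to C.
toy = build_adjacency([("A", "B"), ("B", "C"), ("A", "C"), ("C", "D")])
toy_pk = degree_distribution(toy)
```

On a real network, `fit_power_law(degree_distribution(adj))` gives an estimate of λ, and plotting `clustering_coefficient` against degree gives the kind of curve described in panel (B).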
Figure 3. The size effect of the bait set. (A) The number of predicted disease genes increases with the number of baits. (B) The efficiency of the disease gene prediction algorithm, as judged by the odds ratio of known disease genes being recovered, does not depend on the size of the bait set.
List of the 68 predicted disease genes.
| Gene ID | Gene Name | Linkage | # of baits | PubMed | Citation in T1D- (#) | Citation in T1D- (p) | T1D citation, (#) | T1D citation, (p) |
|---|---|---|---|---|---|---|---|---|
| 2099 | ESR1 | IDDM5 | 6 | 783 | 139 | 1.49E-27 | 124 | 8.3E-19 |
| 7430 | VIL2 | IDDM5 | 6 | 117 | 30 | 2.45E-10 | 29 | 4.2E-09 |
| 1616 | DAXX | IDDM1 | 5 | 85 | 7 | 0.22 | 6 | 0.42 |
| 4087 | SMAD2 | IDDM6 | 5 | 172 | 82 | 7.2E-41 | 52 | 4.2E-18 |
| 5970 | RELA | IDDM4 | 5 | 429 | 36 | 0.019 | 35 | 0.060 |
| 801 | CALM1 | IDDM11 | 4 | 189 | 18 | 0.030 | 16 | 0.13 |
| 921 | CD5 | IDDM4 | 4 | 123 | 2 | 0.99 | 1 | 1 |
| 3118 | HLA-DQA2 | IDDM1 | 3 | 13 | 0 | 1 | 0 | 1 |
| 3122 | HLA-DRA | IDDM1 | 3 | 123 | 36 | 1.6E-13 | 31 | 8.9E-10 |
| 4089 | SMAD4 | IDDM6 | 3 | 203 | 82 | 1.0E-36 | 60 | 3.1E-20 |
| 5336 | PLCG2 | UN16 | 3 | 91 | 2 | 0.96 | 2 | 0.97 |
| 10524 | HTATIP | IDDM4 | 3 | 65 | 4 | 0.51 | 3 | 0.75 |
| 156 | ADRBK1 | IDDM4 | 2 | 104 | 34 | 5.5E-14 | 32 | 6.5E-12 |
| 823 | CAPN1 | IDDM4 | 2 | 101 | 3 | 0.92 | 3 | 0.94 |
| 931 | MS4A1 | IDDM4 | 2 | 59 | 0 | 1 | 0 | 1 |
| 3113 | HLA-DPA1 | IDDM1 | 2 | 31 | 1 | 0.83 | 0 | 1 |
| 5499 | PPP1CA | IDDM4 | 2 | 82 | 3 | 0.84 | 1 | 0.99 |
| 5883 | RAD9A | IDDM4 | 2 | 60 | 2 | 0.85 | 2 | 0.88 |
| 5979 | RET | IDDM10 | 2 | 471 | 12 | 1 | 7 | 1 |
| 6925 | TCF4 | IDDM6 | 2 | 71 | 10 | 10 | 3 | 0.80 |
| 7277 | TUBA1 | IDDM13 | 2 | 68 | 13 | 0.00036 | 10 | 0.013 |
| 10589 | DRAP1 | IDDM4 | 2 | 25 | 0 | 1 | 0 | 1 |
| 23193 | GANAB | IDDM4 | 2 | 25 | 0 | 1 | 0 | 1 |
| 353091 | RAET1G | IDDM5 | 2 | 3 | 0 | 1 | 0 | 1 |
| 572 | BAD | IDDM4 | 1 | 138 | 9 | 0.39 | 4 | 0.96 |
| 1012 | CDH13 | UN16 | 1 | 50 | 3 | 0.55 | 1 | 0.95 |
| 1374 | CPT1A | IDDM4 | 1 | 60 | 4 | 0.45 | 4 | 0.50 |
| 2785 | GNG3 | IDDM4 | 1 | 34 | 0 | 1 | 0 | 1 |
| 2950 | GSTP1 | IDDM4 | 1 | 177 | 9 | 0.67 | 6 | 0.95 |
| 3111 | HLA-DOA | IDDM1 | 1 | 31 | 1 | 0.83 | 0 | 1 |
| 3185 | HNRPF | IDDM10 | 1 | 35 | 0 | 1 | 0 | 1 |
| 3482 | IGF2R | IDDM5 | 1 | 183 | 3 | 1 | 2 | 1 |
| 3688 | ITGB1 | IDDM10 | 1 | 494 | 89 | 1.2E-18 | 83 | 2.3E-14 |
| 4054 | LTBP3 | IDDM4 | 1 | 36 | 7 | 0.0074 | 5 | 0.083 |
| 4094 | MAF | UN16 | 1 | 70 | 9 | 0.025 | 7 | 0.15 |
| 4142 | MAS1 | IDDM5 | 1 | 53 | 0 | 1 | 0 | 1 |
| 4221 | MEN1 | IDDM4 | 1 | 107 | 17 | 0.00035 | 16 | 0.0018 |
| 4311 | MME | IDDM9 | 1 | 133 | 9 | 0.35 | 9 | 0.43 |
| 4645 | MYO5B | IDDM6 | 1 | 31 | 0 | 1 | 0 | 1 |
| 5028 | P2RY1 | IDDM9 | 1 | 74 | 16 | 2.2E-05 | 13 | 0.0013 |
| 5366 | PMAIP1 | IDDM6 | 1 | 37 | 0 | 1 | 0 | 1 |
| 5790 | PTPRCAP | IDDM4 | 1 | 32 | 0 | 1 | 0 | 1 |
| 5806 | PTX3 | IDDM9 | 1 | 51 | 1 | 0.94 | 1 | 0.95 |
| 5867 | RAB4A | UN1 | 1 | 55 | 1 | 0.95 | 1 | 0.96 |
| 6199 | RPS6KB2 | IDDM4 | 1 | 34 | 2 | 0.58 | 1 | 0.87 |
| 6520 | SLC3A2 | IDDM4 | 1 | 94 | 10 | 0.052 | 7 | 0.36 |
| 6747 | SSR3 | IDDM9 | 1 | 16 | 8 | 2.3E-05 | 6 | 0.0012 |
| 6840 | SVIL | IDDM10 | 1 | 21 | 3 | 0.14 | 2 | 0.38 |
| 7423 | VEGFB | IDDM4 | 1 | 49 | 0 | 1 | 0 | 1 |
| 7536 | SF1 | IDDM4 | 1 | 41 | 8 | 0.00429 | 7 | 0.019 |
| 8325 | FZD8 | IDDM10 | 1 | 27 | 6 | 0.0075 | 5 | 0.034 |
| 8833 | GMPS | IDDM9 | 1 | 18 | 0 | 1 | 0 | 1 |
| 9013 | TAF1C | UN16 | 1 | 23 | 0 | 1 | 0 | 1 |
| 9063 | PIAS2 | IDDM6 | 1 | 56 | 5 | 0.23 | 3 | 0.66 |
| 9252 | RPS6KA5 | IDDM11 | 1 | 53 | 4 | 0.36 | 2 | 0.83 |
| 9352 | TXNL1 | IDDM6 | 1 | 20 | 0 | 1 | 0 | 1 |
| 9616 | RNF7 | IDDM9 | 1 | 31 | 1 | 0.83 | 1 | 0.84 |
| 10963 | STIP1 | IDDM4 | 1 | 48 | 0 | 1 | 0 | 1 |
| 23549 | DNPEP | IDDM13 | 1 | 17 | 0 | 1 | 0 | 1 |
| 25937 | WWTR1 | IDDM9 | 1 | 18 | 1 | 0.65 | 0 | 0.65 |
| 30827 | CXXC1 | IDDM6 | 1 | 23 | 1 | 0.73 | 0 | 0.73 |
| 55048 | VPS37C | IDDM4 | 1 | 14 | 0 | 1 | 0 | 1 |
| 55867 | SLC22A11 | IDDM4 | 1 | 15 | 0 | 1 | 0 | 1 |
| 56945 | MRPS22 | IDDM9 | 1 | 19 | 0 | 1 | 0 | 1 |
| 84064 | HDHD2 | IDDM6 | 1 | 15 | 0 | 1 | 0 | 1 |
| 135250 | RAET1E | IDDM5 | 1 | 19 | 0 | 1 | 0 | 1 |
| 154043 | CNKSR3 | IDDM5 | 1 | 10 | 2 | 0.13 | 0 | 0.13 |
| 170506 | DHX36 | IDDM9 | 1 | 20 | 0 | 1 | 0 | 1 |
Most loci were named IDDM#, where IDDM stands for Insulin Dependent Diabetes Mellitus, another name for type 1 diabetes.
Figure 4. The PPI network of known (circle) and predicted (diamond) disease genes.
The top 30 GO categories shared by the 266 known T1D genes, and their representation in the 68 predicted disease genes.
| Rank | GO ID | GO term | # of genes | # in | Enrichment ratio | p |
|---|---|---|---|---|---|---|
| 1 | GO:0002376 | immune system process | 9 | 640 | 2.2834 | 0.06 |
| 2 | GO:0006955 | immune response | 5 | 480 | 1.7634 | 0.30 |
| 3 | GO:0006952 | defense response | 4 | 397 | 1.5896 | 0.35 |
| 4 | GO:0048522 | positive regulation of | 16 | 748 | 3.2124 | 0.0005 |
| 5 | GO:0048518 | positive regulation of | 16 | 835 | 2.874 | 0.0015 |
| 6 | GO:0009607 | response to biotic stimulus | 2 | 166 | 1.871 | 0.35 |
| 7 | GO:0005102 | receptor binding | 4 | 566 | 1.0546 | 0.60 |
| 8 | GO:0031325 | positive regulation of | 8 | 313 | 3.9479 | 0.0037 |
| 9 | GO:0009893 | positive regulation of | 8 | 329 | 3.7422 | 0.0049 |
| 10 | GO:0005615 | extracellular space | 0 | 356 | 0 | 1 |
| 11 | GO:0042127 | regulation of cell | 3 | 359 | 1.275 | 0.50 |
| 12 | GO:0009611 | response to wounding | 3 | 345 | 1.3251 | 0.48 |
| 13 | GO:0009605 | response to external | 4 | 470 | 1.2709 | 0.47 |
| 14 | GO:0008283 | cell proliferation | 5 | 596 | 1.236 | 0.46 |
| 15 | GO:0051707 | response to other | 1 | 113 | 1.4804 | 0.57 |
| 16 | GO:0045321 | leukocyte activation | 4 | 177 | 3.9334 | 0.048 |
| 17 | GO:0001775 | cell activation | 4 | 198 | 3.4324 | 0.067 |
| 18 | GO:0044421 | extracellular region part | 0 | 507 | 0 | 1 |
| 19 | GO:0046649 | lymphocyte activation | 4 | 159 | 4.3458 | 0.036 |
| 20 | GO:0005886 | plasma membrane | 18 | 1402 | 1.8357 | 1.8357 |
| 21 | GO:0044459 | plasma membrane part | 17 | 1146 | 2.1285 | 0.011 |
| 22 | GO:0001816 | cytokine production | 2 | 94 | 4.1452 | 0.16 |
| 23 | GO:0005515 | protein binding | 42 | 4552 | 1.2866 | 0.15 |
| 24 | GO:0008219 | cell death | 9 | 626 | 2.0869 | 0.056 |
| 25 | GO:0016265 | death | 9 | 626 | 2.0869 | 0.056 |
| 26 | GO:0006950 | response to stress | 8 | 767 | 1.5053 | 0.23 |
| 27 | GO:0051239 | regulation of multicellular | 4 | 241 | 2.6416 | 0.11 |
| 28 | GO:0005125 | cytokine activity | 1 | 180 | 0.88632 | 0.74 |
| 29 | GO:0009891 | positive regulation of | 1 | 58 | 3.9624 | 0.36 |
| 30 | GO:0005126 | hematopoietin cytokine | 0 | 34 | 0 | 1 |
Protein sequence motifs that are over-represented among known and predicted disease genes. Listed are the top 10 motifs shared in the known disease genes at p<2e-16 (Fisher's exact test), together with their significance in the predicted ones.
| InterPro ID | Short Description | InterPro Description | p, for the predicted |
|---|---|---|---|
| IPR007110 | Ig-like | Immunoglobulin-like | 0.0004 |
| IPR013151 | Immunoglobulin | Immunoglobulin | 8.81E-05 |
| IPR003597 | Ig_c1 | Immunoglobulin C1 type | 7.96E-16 |
| IPR003006 | Ig_MHC | Immunoglobulin/major | 1.22E-10 |
| IPR013568 | SEFIR | SEFIR | |
| IPR000157 | TIR | Toll-Interleukin receptor | |
| IPR004075 | IL1_rcpt_1 | Interleukin-1 receptor, | |
| IPR001039 | MHC_I_alpha_A1A2 | MHC class I, alpha | 2.06E-05 |
| IPR001003 | MHC_II_alpha_N | MHC class II, alpha | 2.20E-16 |
| IPR007775 | LST1 | LST-1 | |
For the motifs with no p value listed, the expected number of genes among the 68 that share the motif is far below 1 (<0.25) and the actual number is 0, so there is not enough power for statistical analysis.
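The motif table relies on Fisher's exact test and on an expected count for each motif. A minimal sketch of both calculations follows. The source reports neither the background size nor the per-motif gene counts, so the numbers used here are hypothetical, and the one-sided (over-representation) form of the test is an assumption.

```python
from math import comb


def expected_overlap(n_pred, n_motif, n_background):
    """Expected number of predicted genes carrying the motif under random sampling."""
    return n_pred * n_motif / n_background


def fisher_enrichment_p(k, n_pred, n_motif, n_background):
    """One-sided Fisher's exact p-value: P(X >= k) for a hypergeometric X.

    k            -- predicted genes that carry the motif
    n_pred       -- number of predicted genes
    n_motif      -- genes in the background that carry the motif
    n_background -- total genes in the background
    """
    total = comb(n_background, n_pred)
    upper = min(n_pred, n_motif)
    tail = sum(
        comb(n_motif, i) * comb(n_background - n_motif, n_pred - i)
        for i in range(k, upper + 1)
    )
    return tail / total


# Hypothetical numbers: 68 predicted genes, a motif found in 30 of 10000 background genes.
example_expected = expected_overlap(68, 30, 10000)
# Observing 0 motif-carrying genes can never be significant in a one-sided test.
example_p_zero = fisher_enrichment_p(0, 68, 30, 10000)
```

With an expected count this far below 1, observing zero genes is uninformative, which is the situation the note above describes.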
Figure 5. The probability density distribution of normalized T1D citation. Both known (A) and predicted (B) disease genes are cited significantly more often in T1D-related publications than random genes (p<1e-33 and p<1e-5, respectively, KS test). In the analysis of predicted genes, co-citations with known disease genes were excluded.
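The KS test named in the Figure 5 caption compares two samples through the largest gap between their empirical cumulative distributions. The sketch below is a plain standard-library version. The asymptotic p-value formula is a common approximation and not necessarily what the authors used, and the function names are hypothetical.

```python
import math
from bisect import bisect_right


def ks_statistic(sample_a, sample_b):
    """Two-sample KS statistic: the largest gap between the two empirical CDFs."""
    a, b = sorted(sample_a), sorted(sample_b)
    d = 0.0
    # The supremum of |F_a - F_b| is attained at one of the observed values.
    for x in a + b:
        cdf_a = bisect_right(a, x) / len(a)
        cdf_b = bisect_right(b, x) / len(b)
        d = max(d, abs(cdf_a - cdf_b))
    return d


def ks_asymptotic_p(d, n_a, n_b, terms=100):
    """Approximate two-sided p-value from the limiting Kolmogorov distribution.

    This is a large-sample approximation; it is rough for small samples.
    """
    if d <= 0.0:
        return 1.0
    n_eff = n_a * n_b / (n_a + n_b)
    lam = math.sqrt(n_eff) * d
    series = sum(
        (-1) ** (j - 1) * math.exp(-2.0 * (j * lam) ** 2)
        for j in range(1, terms + 1)
    )
    return min(1.0, max(0.0, 2.0 * series))
```

For example, `ks_statistic(citations_predicted, citations_random)` followed by `ks_asymptotic_p(d, len(citations_predicted), len(citations_random))` would give a p-value of the kind quoted in the caption, where both list names are placeholders for normalized citation values.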
Figure 6. Candidates predicted by more baits are more likely to be cited in T1D-related publications.
Figure 7. PPI network of the top 5 predictions (ellipse) and their corresponding baits (round rectangle). Bright magenta nodes represent genes with significant citation in T1D-related publications (p<0.01).