Daniel J B Clarke, Maxim V Kuleshov, Brian M Schilder, Denis Torre, Mary E Duffy, Alexandra B Keenan, Alexander Lachmann, Axel S Feldmann, Gregory W Gundersen, Moshe C Silverstein, Zichen Wang, Avi Ma'ayan.
Abstract
While gene expression data at the mRNA level can be globally and accurately measured, profiling the activity of cell signaling pathways is currently much more difficult. eXpression2Kinases (X2K) computationally predicts involvement of upstream cell signaling pathways, given a signature of differentially expressed genes. X2K first computes enrichment for transcription factors likely to regulate the expression of the differentially expressed genes. The next step of X2K connects these enriched transcription factors through known protein-protein interactions (PPIs) to construct a subnetwork. The final step performs kinase enrichment analysis on the members of the subnetwork. X2K Web is a new implementation of the original eXpression2Kinases algorithm with important enhancements. X2K Web includes many new transcription factor and kinase libraries, and PPI networks. For demonstration, thousands of gene expression signatures induced by kinase inhibitors, applied to six breast cancer cell lines, are provided for fetching directly into X2K Web. The results are displayed as interactive downloadable vector graphic network images and bar graphs. Benchmarking various settings via random permutations enabled the identification of an optimal set of parameters to be used as the default settings in X2K Web. X2K Web is freely available from http://X2K.cloud.
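The three steps described in the abstract (transcription factor enrichment, PPI subnetwork construction, kinase enrichment) can be illustrated with a minimal sketch. This is not the X2K Web implementation. The one-sided hypergeometric test, the rule that an intermediate protein must interact with at least two seed transcription factors, the background size of 20000 genes, and every function name and toy identifier below are assumptions made for illustration only.

```python
from math import comb


def hypergeom_pvalue(overlap, set_size, query_size, background):
    """One-sided P(X >= overlap) for a hypergeometric draw (assumed test)."""
    upper = min(set_size, query_size)
    tail = sum(
        comb(set_size, k) * comb(background - set_size, query_size - k)
        for k in range(overlap, upper + 1)
    )
    return tail / comb(background, query_size)


def enrich(query, library, background=20000):
    """Rank library terms (TFs or kinases) by enrichment in the query set."""
    query = set(query)
    results = []
    for term, members in library.items():
        overlap = len(query & members)
        if overlap:  # terms with no overlap are skipped
            p = hypergeom_pvalue(overlap, len(members), len(query), background)
            results.append((term, p))
    return sorted(results, key=lambda item: item[1])


def expand_subnetwork(seeds, ppi_edges):
    """Add intermediates that interact with at least two seeds (assumed rule)."""
    seeds = set(seeds)
    neighbors = {}
    for a, b in ppi_edges:
        neighbors.setdefault(a, set()).add(b)
        neighbors.setdefault(b, set()).add(a)
    intermediates = {
        protein
        for protein, partners in neighbors.items()
        if protein not in seeds and len(partners & seeds) >= 2
    }
    return seeds | intermediates


def x2k_sketch(genes, tf_library, ppi_edges, kinase_library, top_n=2):
    """Step 1: TF enrichment. Step 2: PPI expansion. Step 3: kinase enrichment."""
    top_tfs = [tf for tf, _ in enrich(genes, tf_library)[:top_n]]
    subnetwork = expand_subnetwork(top_tfs, ppi_edges)
    return subnetwork, enrich(subnetwork, kinase_library)


# Toy data; all identifiers are hypothetical.
tf_library = {
    "TF_A": {"g1", "g2", "g3"},
    "TF_B": {"g2", "g3", "g4"},
    "TF_C": {"g9"},
}
ppi_edges = [("TF_A", "P1"), ("TF_B", "P1"), ("TF_A", "P2")]
kinase_library = {"KIN_X": {"TF_A", "TF_B", "P1"}, "KIN_Y": {"P2", "Z"}}

subnetwork, ranked_kinases = x2k_sketch(
    ["g1", "g2", "g3", "g4"], tf_library, ppi_edges, kinase_library
)
print(sorted(subnetwork))
print(ranked_kinases)
```

In this toy run, P1 joins the subnetwork because it interacts with both seed transcription factors, while P2 is left out because it touches only one. Swapping in a different enrichment statistic or expansion rule only requires changing `hypergeom_pvalue` or `expand_subnetwork`.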
Year: 2018 PMID: 29800326 PMCID: PMC6030863 DOI: 10.1093/nar/gky458
Source DB: PubMed Journal: Nucleic Acids Res ISSN: 0305-1048 Impact factor: 19.160
Figure 1. Screenshot from the X2K Web input form. Users can submit their own lists of mammalian differentially expressed genes, or click on the example to obtain the X2K Web results page (shown in Figure 2).
Figure 2. Screenshot from the X2K Web results page. The results page is divided into four panels. The top left panel shows the TFEA results as a bar graph. The top right panel shows the PPI subnetwork that connects the enriched transcription factors (red) through intermediate proteins (gray). The bottom left panel shows the KEA results as a bar graph. Finally, the bottom right panel shows the complete subnetwork generated by X2K Web, with the kinases at the top, the intermediate proteins in the middle, and the transcription factors at the bottom.
Figure 3. Screenshot from the X2K Web interface that lets users fetch L1000 signatures. Users can search by the names of small molecules or cell lines.
Processed resources for transcription-factor/target interactions
| Database | Type | Interactions [M/H] | TFs [M/H] | Targets [M/H] | PMID |
|---|---|---|---|---|---|
| ARCHS4 | Co-expression | 518466/472585 | 1734/1724 | 21857/21918 | 29636450 |
| ChEA_2016 | ChIP-seq | 535545/461570 | 194/178 | 34462/35204 | 20709693 |
| CREEDS | LOF-microarray | 6140050/3583008 | 265/174 | 23170/20592 | 27667448 |
| ENCODE_2015 | ChIP-seq | 259695/1218728 | 44/175 | 18170/22008 | 22955616 |
| Enrichr Queries | Co-occurrence | 516300 | 1721 | 12487 | 23586463 |
| huMAP | Mass-spec | 14017 | 419 | 2109 | 28596423 |
| iREF | Mixed | 7239/57042 | 402/1372 | 3454/11021 | 18823568 |
| JASPAR-TRANSFAC | PWM | 139520/424314 | 104/222 | 20895/22258 | 14681366 |
| TF-Genes2Fans | Predictions | 22525/22525 | 278 | 6001 | 22748121 |
| LOF-GEO | LOF-microarray | 86951/85829 | 82/43 | 23876/23585 | 27141961 |
LOF: Loss of function; PWM: Position weight matrices; M/H: Mouse/human.
Processed resources for protein–protein interactions
| Database | Type | Interactions | Proteins | PMID |
|---|---|---|---|---|
| BIND | Literature PPI | 25622 | 5528 | 12519993 |
| Biocarta | Literature PPI | 756 | 352 | N/A |
| BioGRID | Mixed | 68759 | 7312 | 27980099 |
| BioPLEX | Mass-spec | 56553 | 8610 | 26186194 |
| DIP | Literature PPI | 3822 | 1946 | 11752321 |
| figeys | Mass-spec | 6452 | 2033 | 17353931 |
| HPRD | Literature PPI | 47496 | 7490 | 18988627 |
| huMAP | Mass-spec | 62214 | 6061 | 28596423 |
| InnateDB | Literature PPI | 4576 | 1523 | 23180781 |
| IntAct | Mixed | 15726 | 4186 | 24234451 |
| iREF | Mixed | 28417 | 5403 | 18823568 |
| KEGG | Literature PPI | 13993 | 1198 | 19880382 |
| MINT | Literature PPI | 75065 | 9415 | 22096227 |
| MiPS | Mass-spec | 606 | 373 | 9399795 |
| PDZbase | Literature PPI | 244 | 159 | 15513994 |
| PPID | Literature PPI | 6998 | 1208 | 14755292 |
| Sets2Networks | Predicted | 3000 | 828 | 22824380 |
| SNAVI | Literature PPI | 2007 | 442 | 19154595 |
| Stelzl | Mass-spec | 6207 | 1702 | 16169070 |
| Vidal | Yeast-2-Hybrid | 6726 | 2541 | 16189514 |
Processed resources for kinase–substrate interactions
| Database | Type | Interactions | Kinases | Substrates | PMID |
|---|---|---|---|---|---|
| ARCHS4 | Co-expression | 9936 | 517 | 3824 | 29636450 |
| BIND | Literature PPI | 2533 | 227 | 1323 | 12519993 |
| Harmonizome | ML Predictions | 10000 | 79 | 3635 | 27374120 |
| HPRD | Literature PPI | 5043 | 262 | 2159 | 18988627 |
| huMAP | Mass-spec PPI | 1385 | 156 | 955 | 28596423 |
| iPTMnet | Literature K–S | 947 | 131 | 724 | 29145615 |
| iREF | Literature PPI | 26734 | 329 | 8036 | 18823568 |
| KEGG | Literature PPI | 2238 | 131 | 621 | 19880382 |
| MINT | Literature PPI | 1583 | 225 | 1065 | 22096227 |
| NetworkIN | Predictions | 5829 | 190 | 2006 | 17981841 |
| Phospho.ELM | Literature K–S | 1441 | 231 | 891 | 21062810 |
| Phosphopoint | Literature K–S | 1970 | 281 | 1061 | 18689816 |
| PhosphositePlus | Literature K–S | 6434 | 168 | 2680 | 22135298 |
ML: Machine learning; K–S: Kinase–substrate.
Figure 4. Box plots to illustrate the contribution of each dataset option to the performance of the X2K pipeline in terms of fitness. The fitness is determined by the negative log of the P-value of the ‘correct’ kinase. (A) Transcription factor libraries; (B) Protein interaction networks; (C) Kinase–substrate libraries.
Figure 5. Ranking of protein kinases based on their likelihood to be recovered by the pipeline.
Figure 6. Comparison of the performance between using the learned parameters from the parameter tuning process applied to the signatures extracted from GEO (left), the learned parameters applied to an independent LINCS L1000 kinase perturbation followed by expression dataset (center), or random settings applied to the GEO signatures (right).
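The fitness score mentioned in the Figure 4 caption, the negative log of the P-value assigned to the 'correct' kinase, can be written out as a small helper. This is a sketch under assumptions: the caption does not state the base of the logarithm, so base 10 is used here, and the handling of a kinase that is absent from the results (treated as P = 1, fitness 0), the function name, and the toy values are all hypothetical.

```python
import math


def fitness(ranked_kinases, correct_kinase, floor=1e-300):
    """Return -log10(P) for the known perturbed kinase.

    ranked_kinases: iterable of (kinase, p_value) pairs.
    A kinase missing from the results is scored as P = 1 (assumption).
    `floor` guards against taking the log of zero.
    """
    p_values = dict(ranked_kinases)
    p = p_values.get(correct_kinase, 1.0)
    if p >= 1.0:
        return 0.0
    return -math.log10(max(p, floor))


# Toy enrichment output; kinase names and P-values are hypothetical.
toy_results = [("KIN_X", 1e-6), ("KIN_Y", 0.2)]
score_hit = fitness(toy_results, "KIN_X")
score_miss = fitness(toy_results, "KIN_Z")
print(score_hit, score_miss)
```

A higher score means the pipeline placed more statistical weight on the kinase that was actually perturbed, which is what the box plots compare across library choices.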