Yanhui Hu, Arunachalam Vinayagam, Ankita Nand, Aram Comjean, Verena Chung, Tong Hao, Stephanie E Mohr, Norbert Perrimon.
Abstract
Model organism and human databases are rich with information about genetic and physical interactions. These data can be used to interpret and guide the analysis of results from new studies and develop new hypotheses. Here, we report the development of the Molecular Interaction Search Tool (MIST; http://fgrtools.hms.harvard.edu/MIST/). The MIST database integrates biological interaction data from yeast, nematode, fly, zebrafish, frog, rat and mouse model systems, as well as human. For individual or short gene lists, the MIST user interface can be used to identify interacting partners based on protein-protein and genetic interaction (GI) data from the species of interest as well as inferred interactions, known as interologs, and to view a corresponding network. The data, interologs and search tools at MIST are also useful for analyzing 'omics datasets. In addition to describing the integrated database, we also demonstrate how MIST can be used to identify an appropriate cut-off value that balances false positive and negative discovery, and present use-cases for additional types of analysis. Altogether, the MIST database and search tools support visualization and navigation of existing protein and GI data, as well as comparison of new and existing data.
Year: 2018 PMID: 29155944 PMCID: PMC5753374 DOI: 10.1093/nar/gkx1116
Source DB: PubMed Journal: Nucleic Acids Res ISSN: 0305-1048 Impact factor: 16.971
Data sources and summary of integrated data in MIST
| Source | Interactions included in MIST | Unique interactions not in other databases | Interaction types | Species for MIST | Reference (PMID) |
|---|---|---|---|---|---|
| DIP | 106 660 | 6517 | PPI | 10 | 14681454 |
| DroID (including DPiM) | 247 816 | 129 219 | PPI, GI | 1 | 21036869 |
| BioGrid | 1 827 231 | 1 105 722 | PPI, GI | 9 | 27980099 |
| IntAct (including MINT) | 634 547 | 16 381 | PPI, GI | 10 | 24234451 |
| FlyBase | 64 007 | 21 192 | PPI, GI | 1 | 27799470 |
| HPRD | 76 233 | 23 363 | PPI | 2 | 18988627 |
| PomBase | 5290 | 3624 | PPI | 1 | 25361970 |
| mentha | 1 020 351 | 9454 | PPI | 9 | 23900247 |
| HumanMAPK | 4530 | 2941 | PPI | 1 | 20936779 |
Comparison of MIST to similar resources
Figure 1. MIST online user interface. The MIST user interface allows users to select a species of interest; upload a single gene, a list of genes or gene pairs; and select one or more interaction types. Users also have the option to apply filters based on confidence and/or data type. The output contains a Cytoscape network view with edges color-coded for different interaction types as well as a summary table about the interaction partners that includes references and experimental approaches. The summary table can be downloaded as a file.
Figure 2. Comparison of PPI and GI data with data from orthologous genes. (A) Overlap of PPIs and interologs (derived from PPIs of other species) for each species. (B) Overlap of GIs and interologs (derived from GIs of other species) for each species. (C) Pairwise comparison of PPI data by orthologous mapping. Species are ordered from the largest number of PPIs (human) to the smallest (Schizosaccharomyces pombe). Overlap percentage is corrected for pairwise conservation (DIOPT score ≥ 3 for both partners).
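The interolog idea behind Figure 2 can be illustrated with a minimal Python sketch: interactions from one species are mapped to another through ortholog pairs, keeping only pairs where both partners pass an ortholog score threshold (here 3, matching the DIOPT cutoff in the caption). This is not the MIST pipeline itself; the function names, the input layout and the gene identifiers are hypothetical and chosen only for illustration.

```python
from itertools import product

def infer_interologs(source_ppis, orthologs, min_score=3):
    """Map source-species PPIs to a target species through orthologs.

    source_ppis: iterable of (gene_a, gene_b) pairs in the source species.
    orthologs:   dict of source gene -> list of (target_gene, score) tuples
                 (assumed layout; scores stand in for DIOPT scores).
    Returns a set of unordered target-species pairs (frozensets).
    """
    interologs = set()
    for a, b in source_ppis:
        # Keep only orthologs that meet the score threshold for each partner.
        targets_a = [t for t, s in orthologs.get(a, []) if s >= min_score]
        targets_b = [t for t, s in orthologs.get(b, []) if s >= min_score]
        for ta, tb in product(targets_a, targets_b):
            interologs.add(frozenset((ta, tb)))
    return interologs

def overlap_fraction(known_ppis, interologs):
    """Fraction of known target-species PPIs that are also interologs."""
    known = {frozenset(p) for p in known_ppis}
    if not known:
        return 0.0
    return len(known & interologs) / len(known)

# Toy data with made-up identifiers (not real genes or real scores).
fly_ppis = [("flyA", "flyB"), ("flyC", "flyD")]
fly_to_human = {
    "flyA": [("humA", 5)],
    "flyB": [("humB", 3)],
    "flyC": [("humC", 2)],   # below threshold, so flyC-flyD is not mapped
    "flyD": [("humD", 8)],
}
human_ppis = [("humB", "humA"), ("humC", "humD")]

human_interologs = infer_interologs(fly_ppis, fly_to_human)
human_overlap = overlap_fraction(human_ppis, human_interologs)
```

Pairs are stored as frozensets so that A-B and B-A count as the same interaction, which seems the natural choice for undirected PPI data.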
Figure 3. Using MIST to analyze gene or gene pair lists. (A) Using MIST to analyze results of a genome-wide study of essential genes in cancer cells. Genes scoring as ‘hits’ (positive results) with higher confidence in the CRISPR screen by Hart et al. (20) are more likely to interact with each other, showing that MIST can facilitate the analysis and interpretation of large-scale screen data. (B) Using MIST to analyze paralogs. Paralog pairs identified using DIOPT overlap with both PPIs and GIs in model organisms and human. The overlap correlates with paralog rank. Protein complexes identified using COMPLEAT provide supporting evidence for the idea that some paralogs (blue circles) physically interact. (C) Using MIST to analyze proteomics data. Analysis with MIST of a raw mass spectrometry interactome dataset can help define a SAINT score cutoff that improves sensitivity without an undue loss of specificity. As shown at the top of panel C, the x-axis is the SAINT score, and the y-axis is the percent overlap with PPIs and/or interologs. As shown at the bottom of panel C, MIST can also help ‘rescue’ interactions supported by independent evidence. The blue circle represents the published hits selected by SAINT score cutoff. The areas outside the blue circle but inside the red or green circles represent ‘rescued’ interactions that are included in the raw data and do not meet the cutoff, but are supported by independent evidence.
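The cutoff analysis in panel C can be sketched in Python as follows: for each candidate score cutoff, compute the percentage of retained bait-prey pairs that are supported by known interactions, then list the below-cutoff pairs that independent evidence would 'rescue'. This is a rough illustration under assumed inputs, not the authors' procedure; the function names, the tuple layout and the toy scores are hypothetical, and the known pairs stand in for PPIs or interologs retrieved from a resource such as MIST.

```python
def overlap_by_cutoff(scored_pairs, known_pairs, cutoffs):
    """Percent of pairs at or above each cutoff that are in known_pairs.

    scored_pairs: iterable of (bait, prey, score) tuples (assumed layout).
    known_pairs:  iterable of (gene_a, gene_b) pairs with prior evidence.
    Returns a list of (cutoff, n_kept, percent_overlap) tuples.
    """
    known = {frozenset(p) for p in known_pairs}
    rows = []
    for cutoff in cutoffs:
        kept = [frozenset((a, b)) for a, b, score in scored_pairs if score >= cutoff]
        supported = sum(1 for pair in kept if pair in known)
        percent = 100.0 * supported / len(kept) if kept else 0.0
        rows.append((cutoff, len(kept), percent))
    return rows

def rescued_pairs(scored_pairs, known_pairs, cutoff):
    """Pairs that miss the cutoff but are supported by prior evidence."""
    known = {frozenset(p) for p in known_pairs}
    return {
        frozenset((a, b))
        for a, b, score in scored_pairs
        if score < cutoff and frozenset((a, b)) in known
    }

# Toy data with made-up identifiers and scores.
scored = [
    ("bait", "p1", 0.95),
    ("bait", "p2", 0.85),
    ("bait", "p3", 0.60),
    ("bait", "p4", 0.30),
]
known = [("bait", "p1"), ("p3", "bait")]

rows = overlap_by_cutoff(scored, known, cutoffs=[0.9, 0.8, 0.5])
rescued = rescued_pairs(scored, known, cutoff=0.8)
```

Plotting the percent overlap against the cutoff gives a curve of the kind described for the top of panel C, and the rescued set corresponds to the regions outside the blue circle at the bottom of the panel.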