Damian Szklarczyk, Annika L Gable, David Lyon, Alexander Junge, Stefan Wyder, Jaime Huerta-Cepas, Milan Simonovic, Nadezhda T Doncheva, John H Morris, Peer Bork, Lars J Jensen, Christian von Mering.
Abstract
Proteins and their functional interactions form the backbone of the cellular machinery. Their connectivity network needs to be considered for the full understanding of biological phenomena, but the available information on protein-protein associations is incomplete and exhibits varying levels of annotation granularity and reliability. The STRING database aims to collect, score and integrate all publicly available sources of protein-protein interaction information, and to complement these with computational predictions. Its goal is to achieve a comprehensive and objective global network, including direct (physical) as well as indirect (functional) interactions. The latest version of STRING (11.0) more than doubles the number of organisms it covers, to 5090. The most important new feature is an option to upload entire, genome-wide datasets as input, allowing users to visualize subsets as interaction networks and to perform gene-set enrichment analysis on the entire input. For the enrichment analysis, STRING implements well-known classification systems such as Gene Ontology and KEGG, but also offers additional, new classification systems based on high-throughput text-mining as well as on a hierarchical clustering of the association network itself. The STRING resource is available online at https://string-db.org/.
Year: 2019 PMID: 30476243 PMCID: PMC6323986 DOI: 10.1093/nar/gky1131
Source DB: PubMed Journal: Nucleic Acids Res ISSN: 0305-1048 Impact factor: 16.971
Figure 1. A typical association network in STRING. The yeast prion-like protein URE2 has been selected as input. The network has been expanded by an additional 10 proteins (via the ‘More’ button in the STRING interface), and the confidence cutoff for showing interaction links has been set to ‘highest’ (0.900). The insets at the right show how many items of the various evidence types in STRING contributed to this particular network (counts denote how many records covered at least two of the proteins in the network; not all of these records contributed high-scoring links after score calibration).
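The two settings described in this caption (expanding the query by a fixed number of extra proteins, then hiding links below a confidence cutoff) can be illustrated with a minimal Python sketch. This is not STRING's own code: the function names `expand_nodes` and `filter_network`, the partner names `P1` to `P4`, and all scores are hypothetical, and ranking candidates by their best score to the query is an assumption about how such an expansion could work.

```python
# Toy edge list: (protein_a, protein_b, confidence score in [0, 1]).
# Partner names and scores are made up for illustration only.
EDGES = [
    ("URE2", "P1", 0.95),
    ("URE2", "P2", 0.92),
    ("P1", "P2", 0.40),
    ("URE2", "P3", 0.70),
    ("P2", "P4", 0.99),
]


def expand_nodes(edges, query, n_extra):
    """Add up to n_extra proteins, ranked by their best score to the query set.

    Assumption: candidates are direct neighbours of the query proteins.
    """
    best = {}
    for a, b, score in edges:
        for inside, outside in ((a, b), (b, a)):
            if inside in query and outside not in query:
                best[outside] = max(best.get(outside, 0.0), score)
    # Highest score first; ties broken by name so the result is deterministic.
    ranked = sorted(best, key=lambda p: (-best[p], p))
    return set(query) | set(ranked[:n_extra])


def filter_network(edges, nodes, cutoff=0.900):
    """Keep only links between selected nodes that reach the confidence cutoff."""
    return [(a, b, s) for a, b, s in edges
            if a in nodes and b in nodes and s >= cutoff]


nodes = expand_nodes(EDGES, {"URE2"}, n_extra=2)
network = filter_network(EDGES, nodes, cutoff=0.900)
print(sorted(nodes))
print(network)
```

In this toy input, `P4` is never added because it is linked only to `P2`, which is not part of the query, and the `P1`-`P2` link is hidden because its score falls below the cutoff.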
Figure 2. Functional enrichment analysis of a genome-sized input set. An expression dataset comparing metastatic melanoma cells with normal skin tissue (62) has been submitted to STRING, with an average log fold change value associated with each gene (negative values signify depletion in the melanoma cells). The screenshot shows how STRING presents and groups statistical enrichment observations for a number of pathways and functional subsystems. When the user hovers over an observation with the mouse, the website highlights the corresponding proteins both in the input data on the left side and in the organism-wide network on the right side. The latter can be interactively zoomed until individual proteins and their neighbors become discernible. Here, the highlighted observation shows that the desmosome is downregulated in melanoma cells. It stands out by way of several publications in PubMed whose discussed proteins (desmosome proteins) are strongly enriched at one end of the user input.
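The idea that a gene set is "enriched at one end of the user input" can be made concrete with a small Python sketch. It ranks genes by their value, takes the most depleted `top_n`, and applies a generic hypergeometric over-representation test. This is a common textbook approach and is not claimed to be the statistic STRING uses; the gene names, values, and the function names `hypergeom_upper_tail` and `enrichment_at_low_end` are hypothetical.

```python
from math import comb


def hypergeom_upper_tail(k, K, n, N):
    """P(X >= k) when drawing n of N items without replacement, K of them in the set."""
    total = comb(N, n)
    upper = min(K, n)
    return sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1)) / total


def enrichment_at_low_end(values, gene_set, top_n):
    """Test whether gene_set is over-represented among the top_n lowest values."""
    ranked = sorted(values, key=values.get)  # most negative (most depleted) first
    top = set(ranked[:top_n])
    members = gene_set & set(values)         # ignore set members absent from the input
    k = len(top & members)
    return k, hypergeom_upper_tail(k, len(members), top_n, len(values))


# Made-up log fold changes for ten placeholder genes (negative = depleted).
values = {
    "g1": -3.0, "g2": -2.5, "g3": -2.0, "g4": -0.5, "g5": 0.1,
    "g6": 0.4, "g7": 0.9, "g8": 1.5, "g9": 2.0, "g10": 2.2,
}
gene_set = {"g1", "g2", "g3"}  # hypothetical functional group

hits, p_value = enrichment_at_low_end(values, gene_set, top_n=3)
print(hits, p_value)
```

With these toy numbers all three set members fall among the three most depleted genes, so the tail probability is 1/120. A real analysis would also correct for testing many gene sets at once.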