Christoph Ogris, Dimitri Guala, Erik L L Sonnhammer.
Abstract
This release of the FunCoup database (http://funcoup.sbc.su.se) is the fourth generation of one of the most comprehensive databases for genome-wide functional association networks. These functional associations are inferred by integrating various data types using a naive Bayesian algorithm and orthology-based information transfer across different species. This approach provides high coverage of the included genomes as well as high quality of inferred interactions. In this update of FunCoup we introduce four new eukaryotic species (Schizosaccharomyces pombe, Plasmodium falciparum, Bos taurus and Oryza sativa) and open the database to the prokaryotic domain by including networks for Escherichia coli and Bacillus subtilis. The latter also allows us to introduce a new class of functional association between genes: co-occurrence in the same operon. We also supplemented the existing classes of functional association (metabolic, signaling, complex and physical protein interaction) with up-to-date information. In this release we switched to InParanoid v8 as the source of orthology and the basis for calculating phylogenetic profiles. While populating all other evidence types with new data, we introduce a new evidence type based on quantitative mass spectrometry data. Finally, the new JavaScript-based network viewer gives the user an intuitive and responsive platform for further evaluating the results.
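The naive Bayesian integration mentioned in the abstract can be illustrated with a minimal sketch. It assumes each evidence type supplies a likelihood of the observation under the positive and the negative gold standard, and that evidence types are treated as independent so their log-likelihood ratios (LLRs) simply add. The function names and all numeric values are hypothetical; the actual FunCoup scoring involves further steps that the abstract does not describe.

```python
import math


def log_likelihood_ratio(p_given_pos, p_given_neg):
    """Natural-log ratio of P(evidence | positive) to P(evidence | negative)."""
    return math.log(p_given_pos / p_given_neg)


def naive_bayes_score(evidence):
    """Sum the per-evidence LLRs, treating evidence types as independent."""
    return sum(log_likelihood_ratio(pos, neg) for pos, neg in evidence.values())


# Hypothetical likelihoods for one gene pair (illustrative values, not FunCoup data).
pair_evidence = {
    "MEX": (0.30, 0.10),  # mRNA co-expression
    "PIN": (0.20, 0.02),  # protein interaction
    "SCL": (0.50, 0.40),  # sub-cellular co-localization
}

score = naive_bayes_score(pair_evidence)
print(f"Summed LLR for the pair: {score:.3f}")
```

A positive summed LLR means the combined evidence favours a functional association over the negative background, and a negative sum means the opposite.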
Year: 2018 PMID: 29165593 PMCID: PMC5755233 DOI: 10.1093/nar/gkx1138
Source DB: PubMed Journal: Nucleic Acids Res ISSN: 0305-1048 Impact factor: 16.971
Number of links used for the positive gold standards, summed over all species: shared protein–protein interaction (PPI), KEGG signaling pathway (Signaling), KEGG metabolic pathway (Metabolic), shared protein complex (Complex), and organization in the same operon (Operon).
| Gold standard | FunCoup 4 |
|---|---|
| PPI | 115 799 |
| Signaling | 4 805 854 |
| Metabolic | 2 248 802 |
| Complex | 1 854 271 |
| Operon | 5 895 |
Comparison of the number of links and genome sizes between FunCoup 3 and FunCoup 4.
| Species | Genes (% genome coverage), FunCoup 3 | Genes (% genome coverage), FunCoup 4 | Links, FunCoup 3 | Links, FunCoup 4 |
|---|---|---|---|---|
| | 16375 (60) | 19461 (71) | 5106648 | 5597050 |
| | 12389 (61) | 13942 (69) | 3206664 | 3618485 |
| | 17239 (89) | 17742 (89) | 3537089 | 3853720 |
| | 5642 (40) | 6098 (37) | 1137425 | 1373106 |
| | 11398 (83) | 9768 (73) | 1987503 | 2174621 |
| | 15003 (57) | 16612 (73) | 4168563 | 3938535 |
| | 12317 (74) | 12289 (79) | 2037840 | 1608939 |
| | 18113 (84) | 18355 (82) | 4477041 | 6403719 |
| | 19226 (83) | 17708 (79) | 5314496 | 6157297 |
| | 18562 (81) | 18322 (82) | 5460769 | 5560189 |
| | 5766 (86) | 6234 (90) | 1353169 | 806515 |
| Subtotal | 152030 (72) | 156531 (74) | 3435200 | 3735652 |
| | - | 3856 (92) | - | 60553 |
| | - | 17906 (90) | - | 4551013 |
| | - | 3624 (88) | - | 111500 |
| | - | 12184 (28) | - | 2996703 |
| | - | 2273 (43) | - | 133158 |
| | - | 3726 (73) | - | 277840 |
| Total | 152030 (72) | 200100 (68) | 34603907 | 49122943 |
Comparison of the number of data points used in FunCoup 3 and FunCoup 4 for each evidence type.
| Evidence type | FunCoup 3 | FunCoup 4 |
|---|---|---|
| PIN | 53886 | 70878 |
| MEX | 920690 | 2807555 |
| DOM | 144826 | 223822 |
| GIN | 288287 | 904740 |
| MIR | 62304 | 62304 |
| PEX | 12238 | 14578 |
| PHP | 188068 | 266236 |
| SCL | 151439 | 307578 |
| TFB | 70975 | 77703 |
| QMS | - | 99239 |
| Total | 1892713 | 4834633 |
Protein interaction (PIN), mRNA co-expression (MEX), domain interaction (DOM), genetic interaction profile similarity (GIN), co-miRNA regulation by shared miRNA targeting (MIR), protein co-expression (PEX), phylogenetic profile similarity (PHP), sub-cellular co-localization (SCL), shared transcription factor binding (TFB) and quantitative mass spectrometry (QMS).
Figure 1. Evidence contribution per species. Evidence data types are: MEX: mRNA co-expression; PHP: phylogenetic profile similarity; PIN: protein interaction networks; SCL: sub-cellular co-localization; MIR: co-miRNA regulation by shared miRNA targeting; DOM: domain interactions; PEX: protein co-expression; TFB: shared transcription factor binding; GIN: genetic interaction profile similarity; and QMS: quantitative mass spectrometry data. The total contribution (LLRs) is normalized such that for each species it sums to 1.
Figure 2. Contributions of evidence source species across all evidence types. The total contribution (LLRs) is normalized such that for each species it sums to 1.
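The per-species normalization described in the captions of Figures 1 and 2 can be sketched as follows. This is a minimal illustration that assumes the summed LLR contribution of each evidence type is already available per species; the species key and all numbers are hypothetical placeholders, not values from the paper.

```python
def normalize_contributions(llr_by_evidence):
    """Scale the summed LLR contributions of one species so they add up to 1."""
    total = sum(llr_by_evidence.values())
    if total == 0:
        raise ValueError("cannot normalize: total contribution is zero")
    return {evidence: value / total for evidence, value in llr_by_evidence.items()}


# Hypothetical summed LLRs per evidence type for a single species.
species_llr = {
    "hypothetical_species": {"MEX": 50.0, "PIN": 30.0, "PHP": 20.0},
}

normalized = {
    species: normalize_contributions(contributions)
    for species, contributions in species_llr.items()
}
print(normalized["hypothetical_species"])
```

Normalizing within each species makes the bars comparable across species even though the absolute amount of evidence differs widely between them.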
Figure 3. Distribution of gold standard contributions, showing the fraction of links for which a given gold standard has the highest LLR score.
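The quantity plotted in Figure 3 can be computed with a short sketch like the one below. It assumes each link carries one LLR score per gold standard; the helper name and the example scores are hypothetical and serve only to show the calculation.

```python
from collections import Counter


def top_gold_standard_fractions(links):
    """Return, per gold standard, the fraction of links where it has the highest LLR.

    Ties are resolved in favour of the gold standard listed first in the link's dict.
    """
    if not links:
        raise ValueError("at least one link is required")
    winners = Counter(max(scores, key=scores.get) for scores in links)
    return {gold_standard: count / len(links) for gold_standard, count in winners.items()}


# Hypothetical per-link LLR scores for four gold standards.
links = [
    {"PPI": 2.1, "Signaling": 3.4, "Metabolic": 0.5, "Complex": 1.0},
    {"PPI": 4.0, "Signaling": 1.2, "Metabolic": 0.3, "Complex": 3.9},
    {"PPI": 0.2, "Signaling": 2.5, "Metabolic": 2.4, "Complex": 0.1},
    {"PPI": 0.9, "Signaling": 0.8, "Metabolic": 5.0, "Complex": 1.1},
]

fractions = top_gold_standard_fractions(links)
print(fractions)
```

Gold standards that never achieve the highest score are absent from the result, which corresponds to a fraction of zero.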
Figure 4. The new FunCoup network viewer, showing the comparative interactomics feature. The network of the query in H. sapiens (orange circles) is linked to orthologous networks in M. musculus (blue circles) and B. subtilis (red circles). As the query we used the four human genes LACTB2, ADH5, GOT2 and GPI, which have been identified as an evolutionarily conserved ancient metazoan protein complex. The query genes and their orthologs are highlighted with a bold black border. Orthology relations between genes are shown as green dashed lines, whereas gray solid lines are functional associations within a species.