Gabriella Sferra1, Federica Fratini1, Marta Ponzi1, Elisabetta Pizzi2.
Abstract
BACKGROUND: Developing powerful methods to predict functional and/or physical protein-protein interactions from genome sequences is one of the main tasks of the post-genomic era. Phylogenetic profiling allows protein-protein interactions to be predicted at the whole-genome level in both Prokaryotes and Eukaryotes. For this reason, it is considered one of the most promising methods.
Keywords: Distance correlation; Phylogenetic profiling; Protein-protein interaction
Year: 2017 PMID: 28870256 PMCID: PMC5584357 DOI: 10.1186/s12859-017-1815-5
Source DB: PubMed Journal: BMC Bioinformatics ISSN: 1471-2105 Impact factor: 3.169
Summary of genomes in the reference sets
| | Prokaryotes | Eukaryotes | Ratio (Prokaryotes:Eukaryotes) |
|---|---|---|---|
| Reference set 1 (RS1) | 592 | 120 | 5:1 |
| Reference set 2 (RS2) | 592 | 45 | 13:1 |
| Reference set 3 (RS3) | 230 | 45 | 5:1 |
| Reference set 4 (RS4) | 230 | 18 | 13:1 |
Fig. 1. Pipeline of the dCor calculation. The phylogenetic profile matrix of the Pi proteins was constructed using a reference set of Gj genomes (step a); starting from these data, the Euclidean j × j distance matrices Di (step b) and the centered Euclidean distance matrices DAi (step c) were calculated by applying a “split-apply-combine” algorithm; the DAi matrices were stored in a repository of binary files (step d), from which they were extracted to calculate the distance correlation matrix (step e).
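The steps in Fig. 1 can be illustrated with a minimal sketch, which is not the Phylo-dCor implementation. It assumes each phylogenetic profile is a numeric vector with one value per reference genome and uses the standard sample definition of distance correlation. The function names `centered_distance_matrix` and `dcor_matrix` are hypothetical. The split-apply-combine parallelisation and the binary-file repository (step d) are omitted; the centered matrices are simply kept in memory.

```python
import numpy as np


def centered_distance_matrix(profile):
    """Steps b and c: pairwise distances across genomes, then double-centering."""
    x = np.asarray(profile, dtype=float)
    # j x j matrix of Euclidean distances between the profile values of each genome pair
    d = np.abs(x[:, None] - x[None, :])
    # Subtract row and column means, add back the grand mean
    return d - d.mean(axis=0, keepdims=True) - d.mean(axis=1, keepdims=True) + d.mean()


def dcor_matrix(profiles):
    """Step e: distance correlation between every pair of protein profiles."""
    centered = [centered_distance_matrix(p) for p in profiles]
    n = len(centered)
    dcov2 = np.empty((n, n))
    for a in range(n):
        for b in range(a, n):
            # Squared sample distance covariance: mean of the element-wise product
            dcov2[a, b] = dcov2[b, a] = (centered[a] * centered[b]).mean()
    # The diagonal holds the squared distance variances
    dvar = np.sqrt(np.diag(dcov2))
    denom = np.outer(dvar, dvar)
    # Profiles with zero distance variance (constant profiles) get dCor = 0
    dcor2 = np.divide(dcov2, denom, out=np.zeros_like(dcov2), where=denom > 0)
    return np.sqrt(np.clip(dcor2, 0.0, None))


if __name__ == "__main__":
    # Hypothetical toy profiles: 3 proteins scored across 4 genomes
    toy = [[0, 1, 2, 3], [0, 2, 4, 6], [1, 1, 1, 1]]
    print(dcor_matrix(toy))
```

In this toy input the second profile is a scaled copy of the first, so their distance correlation is 1, while the constant third profile has zero distance variance and is assigned 0 by convention in this sketch.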
Fig. 2. Benchmarking of the Phylo-dCor application. Results of the ten-fold cross-validation procedure used to assess the predictive performance of dCor (grey box plots), PC (light blue box plots) and MI (empty box plots). Results obtained using the GS_fun benchmark are shown in panels a and b, while results obtained using the GS_phy benchmark are shown in panels a’ and b’.