Sailu Yellaboina, Dawood B Dudekula, Minoru Sh Ko.
Abstract
BACKGROUND: Identifying protein-protein interactions is an important first step toward understanding living systems. High-throughput experimental approaches have accumulated a large amount of information on protein-protein interactions in human and other model organisms. Such interaction information has been successfully transferred to other species for which experimental data are limited. However, this annotation transfer method can yield false-positive interologs when applied to phylogenetically distant organisms, because interactions are not always conserved.
Year: 2008 PMID: 18842131 PMCID: PMC2571111 DOI: 10.1186/1471-2164-9-465
Source DB: PubMed Journal: BMC Genomics ISSN: 1471-2164 Impact factor: 3.969
Figure 1. Flowchart of the approach used to predict protein interactions in Mus musculus. Protein interactions were generated using two approaches: 1) physical interactions in different databases and 2) functional interactions in operons. The interactions were transferred to orthologs in Mus musculus, and false positives among them were filtered out using phylogenetic profiles.
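The two steps in the flowchart, transfer through orthologs and filtering by phylogenetic profiles, can be sketched as follows. This is a minimal illustration and not the authors' implementation: the gene identifiers, the representation of a profile as the set of genomes in which a gene has an ortholog, the Jaccard similarity measure, and the 0.5 threshold are all assumptions made for the example.

```python
from itertools import product


def transfer_interologs(interactions, orthologs):
    """Map source-species interactions onto mouse genes via an ortholog table."""
    mouse_pairs = set()
    for a, b in interactions:
        # Every combination of mouse orthologs of a and b is a candidate interolog.
        for ma, mb in product(orthologs.get(a, ()), orthologs.get(b, ())):
            if ma != mb:
                mouse_pairs.add(frozenset((ma, mb)))
    return mouse_pairs


def jaccard(p, q):
    """Jaccard similarity of two presence/absence profiles (sets of genomes)."""
    union = p | q
    return len(p & q) / len(union) if union else 0.0


def filter_by_profile(pairs, profiles, threshold=0.5):
    """Keep pairs whose profiles are similar (assumed measure and threshold)."""
    kept = set()
    for pair in pairs:
        a, b = tuple(pair)
        if jaccard(profiles.get(a, set()), profiles.get(b, set())) >= threshold:
            kept.add(pair)
    return kept


# Hypothetical toy data: yeast-like source genes mapped to mouse genes.
interactions = [("yA", "yB"), ("yA", "yC")]
orthologs = {"yA": ["mA"], "yB": ["mB"], "yC": ["mC"]}
profiles = {"mA": {"g1", "g2", "g3"}, "mB": {"g1", "g2", "g3"}, "mC": {"g9"}}

candidates = transfer_interologs(interactions, orthologs)
kept = filter_by_profile(candidates, profiles)
print(sorted(sorted(p) for p in kept))
```

In this toy case the pair with identical profiles is kept and the pair with disjoint profiles is discarded as a likely false positive.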
Distribution of interologs in mouse before and after filtering with phylogenetic profiles
| Non-redundant interactions identified experimentally in each species | Interologs transferred to mouse based on orthologous relationship | Interologs remaining after filtering by phylogenetic profiles | Fraction of interologs filtered out (%) |
| | 1261 | 294 | 76% |
| | 19605 | 12528 | 36% |
| | 924 | 627 | 32% |
| | 21 | 20 | 5% |
| | 2506 | 1876 | 25% |
| | 790 | 373 | 52% |
| | 2979 | 2224 | 25% |
| | 30806 | 23166 | 25% |
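The percentage column can be checked against the two counts in each row. The sketch below reads the three values in each row as interologs transferred, interologs remaining after filtering, and percent filtered out. That reading is an assumption, since the row labels are missing here, but it reproduces the reported percentages to within about one percentage point.

```python
# (transferred, remaining, reported percent filtered), copied from the table rows.
rows = [
    (1261, 294, 76),
    (19605, 12528, 36),
    (924, 627, 32),
    (21, 20, 5),
    (2506, 1876, 25),
    (790, 373, 52),
    (2979, 2224, 25),
    (30806, 23166, 25),
]


def percent_filtered(transferred, remaining):
    """Percentage of transferred interologs removed by the filter."""
    return 100.0 * (transferred - remaining) / transferred


for transferred, remaining, reported in rows:
    computed = percent_filtered(transferred, remaining)
    print(f"{transferred:>6} {remaining:>6}  computed={computed:5.1f}%  reported={reported}%")
```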
Figure 2. Percent distribution of organelle proteins in the interaction dataset predicted by ortholog co-occurrence in operons. Localization information was obtained from eSLDB (Pierleoni et al., 2007). Categories of subcellular localization are defined according to the Swiss-Prot annotation. Protein sequences with no localization information are labeled 'None'.
Figure 3. Effect of gene conservation score on the accuracy of predictions using phylogenetic profiles. Accuracy is defined as the average of sensitivity and specificity, as described in Methods. Prediction accuracy is poorer at low conservation scores and reaches its maximum at a conservation score of 58. See text for details.
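The accuracy measure named in the caption, the average of sensitivity and specificity, can be written out as a short sketch. The usual confusion-matrix definitions of sensitivity and specificity are assumed here, since the Methods text is not reproduced in this record, and the counts in the example are hypothetical.

```python
def sensitivity(tp, fn):
    """Fraction of true interactions that are predicted as interactions."""
    return tp / (tp + fn) if (tp + fn) else 0.0


def specificity(tn, fp):
    """Fraction of non-interactions that are predicted as non-interactions."""
    return tn / (tn + fp) if (tn + fp) else 0.0


def accuracy(tp, fn, tn, fp):
    """Average of sensitivity and specificity, as in the figure caption."""
    return 0.5 * (sensitivity(tp, fn) + specificity(tn, fp))


# Hypothetical counts at one conservation-score cutoff.
score = accuracy(tp=90, fn=10, tn=70, fp=30)
print(f"accuracy = {score:.2f}")
```

Averaging the two rates keeps the measure from being dominated by whichever class, interacting or non-interacting pairs, is more numerous in the evaluation set.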
Evaluation of predicted interactions by frequency of co-expression and functional similarity of GO terms
| Interaction dataset | Frequency of co-expression (Mean/Stddev) | Similarity of GO terms, BP (Mean/Stddev) | Similarity of GO terms, CC (Mean/Stddev) |
| Interolog | 3.7/5.9 | 0.32/0.21 | 0.40/0.31 |
| Interolog + phylogenetic profile | 4.18/6.45 | 0.34/0.22 | 0.43/0.31 |
| Interologs predicted as false positives by phylogenetic profiles | 2.70/4.58 | 0.29/0.17 | 0.34/0.30 |
| Ortholog co-occurrence in operons | 5.0/8.8 | 0.41/0.25 | 0.28/0.35 |
| Ortholog co-occurrence in operons + phylogenetic profile | 10.11/15.76 | 0.51/0.27 | 0.46/0.37 |
| | 0.48/1.74 | 0.23/0.14 | 0.37/0.24 |
The interaction datasets were evaluated by the co-expression frequency of interacting genes and by the similarity between their Gene Ontology (GO) terms in the BP (Biological Process) and CC (Cellular Component) categories.
See the Methods section for details of the co-expression frequency. Similarity between GO terms was calculated using the "getGeneSim" function in the GOSim package [36].