Sailu Yellaboina, Kshama Goyal, Shekhar C Mande.
Abstract
Cellular functions are determined by interactions among proteins in the cell. Recognizing these interactions is an important step in understanding biology at the systems level. Here, we report an interaction network of Escherichia coli, obtained by training a Support Vector Machine on the high-quality interactions in the EcoCyc database, with the assumption that periplasmic and cytoplasmic proteins may not interact with each other. The data features included the correlation coefficient between bit-score phylogenetic profiles, the frequency of co-occurrence of genes in predicted operons, and a new measure: the distance between the translational start sites of the genes. The combined genome context methods show a high accuracy of prediction on the test data and predict a total of 78,122 binary interactions. The majority of the interactions identified by high-throughput experimental methods correspond to indirect interactions (interactions through neighbors) in the predicted network. Correlation of the predicted network with gene essentiality data shows that the essential genes in E. coli exhibit a high linking number, whereas the nonessential genes exhibit a low linking number. Furthermore, our predicted protein-protein interaction network shows that the proteins involved in replication, DNA repair, transcription, translation, and cell wall synthesis are highly connected. We therefore believe that our predicted network will serve as a useful resource in understanding prokaryotic biology.
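Two of the quantities named in the abstract can be illustrated with a minimal Python sketch: the correlation coefficient between two bit-score phylogenetic profiles, and the linking number of each protein in a network. This is not the authors' code. It assumes that the correlation is a plain Pearson coefficient and that "linking number" means the number of interaction partners of a protein. The function names, gene names, and profile values are hypothetical and chosen only for illustration.

```python
from math import sqrt


def pearson(profile_a, profile_b):
    """Pearson correlation between two equal-length phylogenetic profiles.

    Each profile is a list of bit scores, one per reference genome
    (assumption: the paper's exact normalization is not reproduced here).
    """
    n = len(profile_a)
    if n != len(profile_b) or n < 2:
        raise ValueError("profiles must have the same length (at least 2)")
    mean_a = sum(profile_a) / n
    mean_b = sum(profile_b) / n
    cov = sum((a - mean_a) * (b - mean_b) for a, b in zip(profile_a, profile_b))
    var_a = sum((a - mean_a) ** 2 for a in profile_a)
    var_b = sum((b - mean_b) ** 2 for b in profile_b)
    if var_a == 0 or var_b == 0:
        # A constant profile carries no co-variation signal.
        return 0.0
    return cov / sqrt(var_a * var_b)


def linking_numbers(interactions):
    """Count interaction partners per protein.

    `interactions` is an iterable of (protein, protein) pairs, assumed to
    contain each binary interaction once.
    """
    degree = {}
    for a, b in interactions:
        degree[a] = degree.get(a, 0) + 1
        degree[b] = degree.get(b, 0) + 1
    return degree


# Hypothetical bit-score profiles across three reference genomes.
gene_a = [1.0, 2.0, 3.0]
gene_b = [2.0, 4.0, 6.0]
similarity = pearson(gene_a, gene_b)

# Hypothetical predicted interactions.
network = [("geneA", "geneB"), ("geneA", "geneC")]
degrees = linking_numbers(network)
```

In a setup like the one the abstract describes, a correlation value of this kind would be one entry in the feature vector passed to the classifier, alongside the operon co-occurrence frequency and the start-site distance.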
Year: 2007 PMID: 17339371 PMCID: PMC1832100 DOI: 10.1101/gr.5900607
Source DB: PubMed Journal: Genome Res ISSN: 1088-9051 Impact factor: 9.043