Edgar D Coelho, Joel P Arrais, Sérgio Matos, Carlos Pereira, Nuno Rosa, Maria José Correia, Marlene Barros, José Luís Oliveira.
Abstract
BACKGROUND: The oral cavity is a complex ecosystem where human chemical compounds coexist with a particular microbiota. However, shifts in the normal composition of this microbiota may result in the onset of oral ailments, such as periodontitis and dental caries. In addition, it is known that the microbial colonization of the oral cavity is mediated by protein-protein interactions (PPIs) between the host and microorganisms. Nevertheless, PPIs of this kind remain largely uncharacterized. To elucidate these interactions, we have created a computational prediction method that allows us to obtain a first model of the Human-Microbial oral interactome.
Year: 2014 PMID: 24576332 PMCID: PMC3975954 DOI: 10.1186/1752-0509-8-24
Source DB: PubMed Journal: BMC Syst Biol ISSN: 1752-0509
Figure 1. Workflow applied to the construction of the Human-microbial oral interactome: a) the proteins identified in the oral proteome are obtained from the Oralcard database; b) the gold standard used for training and validation is obtained by combining the five most relevant curated protein interaction databases; c) for each interacting protein pair, five clusters of features are constructed; d) the previously trained classifier is applied to each pair; and e) finally, the interactome network is obtained by combining the individual pairs of proteins.
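Steps d) and e) of the workflow can be illustrated with a minimal sketch. The protein names, probabilities, and the 0.9 threshold below are hypothetical placeholders, not values from the study; the sketch only shows how per-pair classifier scores might be combined into an undirected network.

```python
from collections import defaultdict

# Hypothetical (human protein, microbial protein, classifier probability) triples.
# None of these names or scores come from the paper.
scored_pairs = [
    ("human_A", "microbe_X", 0.97),
    ("human_A", "microbe_Y", 0.42),
    ("human_B", "microbe_X", 0.88),
]

def build_network(pairs, threshold):
    """Keep pairs scored at or above `threshold` and store them as an undirected graph."""
    network = defaultdict(set)
    for protein_a, protein_b, probability in pairs:
        if probability >= threshold:
            network[protein_a].add(protein_b)
            network[protein_b].add(protein_a)
    return dict(network)

# The threshold is an assumption chosen for illustration only.
network = build_network(scored_pairs, threshold=0.9)
print(network)
```

Raising or lowering the threshold trades the number of retained interactions against the confidence of each one.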
Analysis of the prediction performance of individual features
| Feature cluster | AUC | CA | F1 | Precision | Recall |
|---|---|---|---|---|---|
| + Literature | 0.781 | 0.722 | 0.723 | 0.721 | 0.726 |
| + Sequence | 0.877 | 0.784 | 0.790 | 0.768 | 0.813 |
| + GO | 0.817 | 0.742 | 0.748 | 0.735 | 0.760 |
| + COGs | 0.663 | 0.652 | 0.537 | 0.806 | 0.402 |
| + DDIs | 0.620 | 0.617 | 0.424 | 0.861 | 0.281 |
| Final Model | 0.926 | 0.850 | 0.851 | 0.848 | 0.854 |
For each line the metrics are obtained by considering only that cluster of features on the classifier. AUC, area under the receiver operating characteristic (ROC) curve; CA, classification accuracy.
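The last three columns of the table behave like F1, precision, and recall, although the footnote names only AUC and CA. As a hedged consistency check under that assumption, the sketch below recomputes F1 as the harmonic mean of the assumed precision and recall columns and compares it with the reported third column.

```python
# Rows copied from the table above as (precision, recall, reported F1).
# The column identities are an assumption; the source names only AUC and CA.
rows = {
    "Literature": (0.721, 0.726, 0.723),
    "Sequence": (0.768, 0.813, 0.790),
    "GO": (0.735, 0.760, 0.748),
    "COGs": (0.806, 0.402, 0.537),
    "DDIs": (0.861, 0.281, 0.424),
    "Final Model": (0.848, 0.854, 0.851),
}

def f1_score(precision, recall):
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)

# Absolute gap between the recomputed and the reported value for each row.
f1_gaps = {
    name: abs(f1_score(precision, recall) - reported_f1)
    for name, (precision, recall, reported_f1) in rows.items()
}
for name, gap in f1_gaps.items():
    print(f"{name:12s} |recomputed F1 - reported| = {gap:.4f}")
```

Gaps within rounding error for every row support reading the third column as F1 and the last two as precision and recall, but the check cannot confirm the labels.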
Analysis of the contribution of individual clusters of features to the overall performance
| Feature cluster removed | AUC | CA | F1 | Precision | Recall |
|---|---|---|---|---|---|
| - Literature | 0.919 | 0.841 | 0.841 | 0.841 | 0.841 |
| - Sequence | 0.891 | 0.794 | 0.774 | 0.855 | 0.708 |
| - GO | 0.916 | 0.838 | 0.839 | 0.835 | 0.842 |
| - COGs | 0.923 | 0.846 | 0.847 | 0.842 | 0.852 |
| - DDIs | 0.911 | 0.831 | 0.834 | 0.819 | 0.850 |
| Final Model | 0.926 | 0.850 | 0.851 | 0.848 | 0.854 |
For each line the metrics are obtained by removing that cluster of features from the classifier. AUC, area under the receiver operating characteristic (ROC) curve; CA, classification accuracy.
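One way to read this ablation table is to compute how much the AUC falls when each cluster is removed. The sketch below uses only the AUC values listed above; ranking clusters by AUC drop is an illustrative choice, not necessarily the analysis the authors performed.

```python
# AUC of the full model and of the model with one feature cluster removed,
# copied from the table above.
final_auc = 0.926
auc_without = {
    "Literature": 0.919,
    "Sequence": 0.891,
    "GO": 0.916,
    "COGs": 0.923,
    "DDIs": 0.911,
}

# A larger drop means the removed cluster contributed more to the final AUC.
auc_drop = {name: round(final_auc - auc, 3) for name, auc in auc_without.items()}
ranked = sorted(auc_drop, key=auc_drop.get, reverse=True)

for name in ranked:
    print(f"{name:10s} AUC drop = {auc_drop[name]:.3f}")
```

By this measure, removing the Sequence cluster costs the most AUC and removing COGs the least.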
Figure 2. Plot of the number of interactions (y-axis) against classifier probability (x-axis).
Figure 3. Representation of the Human-microbial inter-species protein interactions. Each section represents an organism. The ribbons connecting any two sections symbolize the PPIs between two organisms. The thickness of each ribbon correlates with the number of PPIs between both organisms.
Figure 4. Venn diagram representing the intersections between the five high-quality experimentally determined protein-protein interaction databases.
Relative coverage of protein-protein interactions present in the training and test data by individual feature clusters
| Feature cluster | Training PPIs | Training coverage | Test PPIs | Test coverage |
|---|---|---|---|---|
| Literature | 22,720 | 61.9% | 4,698,390 | 69.9% |
| Sequence | 35,379 | 96.4% | 6,703,945 | 99.8% |
| GO | 23,769 | 64.8% | 5,130,103 | 76.4% |
| COGs | 9,636 | 26.3% | 1,324,230 | 19.7% |
| DDIs | 5,994 | 16.3% | 516,609 | 7.7% |
| Total | 36,698 | 100.0% | 6,716,792 | 100.0% |
GO, gene ontology; COGs, clusters of orthologous groups; DDIs, domain-domain interactions.
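The coverage percentages can be reproduced by dividing each cluster's count by the corresponding total. The sketch below assumes the first count column refers to the training data and the second to the test data, following the order in the table title; the source does not label the columns.

```python
# Counts and reported percentages copied from the table above.
# Assumption: first pair of columns = training data, second pair = test data.
train_total = 36_698
test_total = 6_716_792
rows = {
    "Literature": (22_720, 61.9, 4_698_390, 69.9),
    "Sequence": (35_379, 96.4, 6_703_945, 99.8),
    "GO": (23_769, 64.8, 5_130_103, 76.4),
    "COGs": (9_636, 26.3, 1_324_230, 19.7),
    "DDIs": (5_994, 16.3, 516_609, 7.7),
}

def coverage(count, total):
    """Percentage of PPIs for which a feature cluster is available."""
    return 100 * count / total

# Largest absolute difference between a recomputed and a reported percentage.
max_gap = 0.0
for name, (train_n, train_pct, test_n, test_pct) in rows.items():
    train_cov = coverage(train_n, train_total)
    test_cov = coverage(test_n, test_total)
    max_gap = max(max_gap, abs(train_cov - train_pct), abs(test_cov - test_pct))
    print(f"{name:10s} train {train_cov:5.1f}%  test {test_cov:5.1f}%")
```

The per-cluster counts sum to more than the totals because a single protein pair can be covered by several feature clusters.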