Nicola J Mulder, Richard O Akinola, Gaston K Mazandu, Holifidy Rapanoel.
Abstract
Infectious diseases are the leading cause of death, particularly in developing countries. Although many drugs are available for treating the most common infectious diseases, in many cases the mechanism of action of these drugs, or even their targets in the pathogen, remains unknown. In addition, the key factors or processes in pathogens that facilitate infection and disease progression are often not well understood. Since proteins do not work in isolation, understanding biological systems requires a better understanding of the interconnectivity between proteins in different pathways and processes, which includes both physical and other functional interactions. Such biological networks can be generated within organisms, or between organisms sharing a common environment, using experimental data and computational predictions. Though different data sources provide different levels of accuracy, confidence in interactions can be measured using interaction scores. Proteins and the connections between them can be represented as the nodes and edges of a graph, and thus studied using existing algorithms and tools from graph theory. There are many different applications of biological networks, and here we discuss three such applications, specifically applied to the infectious disease tuberculosis, with its causative agent Mycobacterium tuberculosis and host, Homo sapiens. The applications include the use of the networks for function prediction, comparison of networks for evolutionary studies, and the generation and use of host-pathogen interaction networks.
Keywords: Biological networks; Evolution; Pathogen; Protein–protein interaction; Tuberculosis
Year: 2014 PMID: 25379138 PMCID: PMC4212278 DOI: 10.1016/j.csbj.2014.08.006
Source DB: PubMed Journal: Comput Struct Biotechnol J ISSN: 2001-0370 Impact factor: 7.271
Proteins in MTB strain CDC1551 annotated with the same BP and MF ontology terms from electronic and manual inferences. The level indicates the level of the GO term in the GO DAG, assuming that the root of each ontology is located at level 0. The manual evidence code is provided, together with the source of electronic inferences.
| Protein | GO ID | GO term | Level | Evidence code, source |
|---|---|---|---|---|
| Q7D8E1 | GO:0045454 | Cell redox homeostasis | 5 | TAS/IDA, InterPro |
| Q7D8E1 | GO:0055114 | Oxidation–reduction process | 3 | IDA, GOC |
| P95276 | GO:0008152 | Metabolic process | 1 | IDA, GOC |
| O07218 | GO:0016998 | Cell wall macromolecule catabolic process | 6 | IDA, InterPro |
| P71937 | GO:0006355 | Regulation of transcription, DNA-templated | 8 | IDA, InterPro |
| P71971 | GO:0006979 | Response to oxidative stress | 3 | IMP, InterPro |
| O53294 | GO:0004497 | Monooxygenase activity | 3 | IDA, UniProt, IEA UniProt |
| P0CF99 | GO:0043750 | Phosphatidylinositol alpha-mannosyltransferase activity | 5 | IDA, UniProt |
| P96291 | GO:0016747 | Transferase activity, transferring acyl groups other than amino-acyl groups | 4 | IDA, InterPro |
| Q7D4L9 | GO:0008745 | N-acetylmuramoyl-L-alanine amidase activity | 5 | IDA, InterPro |
| P71855 | GO:0016747 | Transferase activity, transferring acyl groups other than amino-acyl groups | 4 | IDA, InterPro |
| P95001 | GO:0004764 | Shikimate 3-dehydrogenase (NADP+) activity | 5 | TAS/IDA, InterPro/UniProt |
| O33342 | GO:0004356 | Glutamate-ammonia ligase activity | 6 | IDA, InterPro |
| P71828 | GO:0003840 | Gamma-glutamyltransferase activity | 6 | IDA, InterPro/UniProt |
| Q7D8E1 | GO:0015035 | Protein disulfide oxidoreductase activity | 5 | IDA, InterPro |
| O53665 | GO:0004316 | 3-Oxoacyl-[acyl-carrier-protein] reductase (NADPH) activity | 6 | IDA, UniProt |
| P96830 | GO:0016791 | Phosphatase activity | 5 | IDA, InterPro |
Fig. 1. Comparison of annotations inferred manually and electronically in the MTB genome strain CDC1551 in terms of term specificity score computed using the GO-universal metric.
Fig. 2. Protein function prediction system. Protein–protein interaction network and semantic similarity scores between terms annotating known proteins are used to determine optimal cut-off scores.
Comparing network properties in the MTB, MLP and MSM networks.
| Parameters | MTB | MLP | MSM |
|---|---|---|---|
| Number of proteins (nodes) | 4136 | 1412 | 4953 |
| Number of functional interactions (edges) | 59,919 | 20,742 | 66,543 |
| Number of hubs | 201 | 103 | 755 |
| Density | 0.007 | 0.0208 | 0.0054 |
| Average degree | 28 | 29 | 26 |
| Average shortest path length | 3.62739 | 3.16955 | 4.2224 |
| Number of connected components | 23 | 19 | 166 |
| % of nodes in largest component | 98.7% | 97.5% | 91.7% |
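The density and average degree rows can be checked against the node and edge counts in the same table. The sketch below assumes the networks are simple undirected graphs and uses the standard formulas, density = 2E / (N(N − 1)) and average degree = 2E / N; the function names are illustrative and not taken from the source. Under these assumptions the tabulated densities are reproduced after rounding, and the tabulated average degrees match the integer part of the computed values.

```python
# Sanity check of the tabulated density and average degree, assuming
# simple undirected graphs (an assumption; the source does not state it).

def density(n_nodes: int, n_edges: int) -> float:
    """Fraction of possible undirected edges that are present."""
    return 2 * n_edges / (n_nodes * (n_nodes - 1))

def average_degree(n_nodes: int, n_edges: int) -> float:
    """Mean number of edges per node in an undirected graph."""
    return 2 * n_edges / n_nodes

# (nodes, edges) copied from the table above.
networks = {
    "MTB": (4136, 59919),
    "MLP": (1412, 20742),
    "MSM": (4953, 66543),
}

for name, (n, m) in networks.items():
    print(f"{name}: density={density(n, m):.4f}, "
          f"average degree={average_degree(n, m):.2f}")
```
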
Number of ortholog proteins shared, common edges and network identity of the compared networks.
| A | B | # of proteins in A only | # of proteins in B only | # of common proteins | # of common edges | Network identity |
|---|---|---|---|---|---|---|
| MLP | MTB | 135 | 2859 | 1277 | 3693 | 4.5% |
| MLP | MSM | 342 | 3883 | 1070 | 1901 | 2.1% |
| MSM | MTB | 2965 | 2148 | 1988 | 2284 | 1.5% |
Number of common nodes, edges and network identity of the compared sub-networks containing only orthologous proteins.
| A | B | # of edges in A only | # of edges in B only | # of common proteins | # of common edges | Network identity |
|---|---|---|---|---|---|---|
| MLP | MTB | 13,670 | 9941 | 1001 | 2820 | 11.9% |
| MLP | MSM | 13,670 | 5086 | 1001 | 1849 | 9.8% |
| MSM | MTB | 5086 | 9941 | 1001 | 656 | 4.3% |
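This excerpt does not state how network identity is calculated. As an arithmetic observation only, the identity values in the sub-network table are consistent with dividing the number of common edges by the sum of the two edge counts listed for A and B, then truncating to one decimal place. The sketch below checks that ratio; the function names and the truncation step are assumptions made for illustration, not the authors' stated definition.

```python
import math

# Hypothetical ratio that happens to reproduce the sub-network table;
# it is not confirmed by the source as the definition of network identity.

def identity_percent(edges_a: int, edges_b: int, common_edges: int) -> float:
    """Common edges as a percentage of the summed edge counts of A and B."""
    return 100 * common_edges / (edges_a + edges_b)

def truncate(value: float, decimals: int = 1) -> float:
    """Drop digits beyond `decimals` without rounding up."""
    factor = 10 ** decimals
    return math.floor(value * factor) / factor

# (edge count for A, edge count for B, common edges) from the table above.
rows = {
    ("MLP", "MTB"): (13670, 9941, 2820),
    ("MLP", "MSM"): (13670, 5086, 1849),
    ("MSM", "MTB"): (5086, 9941, 656),
}

for (a, b), (ea, eb, common) in rows.items():
    print(f"{a} vs {b}: {truncate(identity_percent(ea, eb, common))}%")
```
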
Fig. 3. Boxplot of the clustering coefficients of the 1001 proteins of the three sub-networks.