David Botero, Camilo Alvarado, Adriana Bernal, Giovanna Danies, Silvia Restrepo.
Abstract
Even in the age of big data in biology, studying the connections between biological processes and the molecular mechanisms behind them remains a challenging task. Systems biology arose as a transversal discipline between biology, chemistry, computer science, mathematics, and physics to facilitate the elucidation of such connections. One scenario in which systems biology is a particularly powerful tool is the study of interactions between hosts and pathogens using network approaches. Interactions between pathogenic bacteria and their hosts, in both agricultural and human health contexts, are of great interest to researchers worldwide. Large amounts of data have been generated in this area of research in the last few years. However, studies have largely been limited to simple interactions, leaving large amounts of data that remain to be utilized. Here, we review the main techniques in network analysis and the complementary experimental assays used to investigate bacterial-plant interactions. Other host-pathogen interactions are presented in those cases where few or no examples of plant pathogens exist. Furthermore, we present key results that have been obtained with these techniques and discuss how they can help in the design of new strategies to control bacterial pathogens. The review covers metabolic simulation, protein-protein interactions, regulatory control of gene expression, host-pathogen modeling, and genome evolution in bacteria. The aim of this review is to offer scientists working on plant-pathogen interactions basic concepts of network biology, as well as an array of techniques that will be useful for a better and more complete interpretation of their data.
Keywords: bacterial pathogens; host-pathogen interactions; networks; pathogenicity; plant pathogens
Year: 2018 PMID: 29441045 PMCID: PMC5797656 DOI: 10.3389/fmicb.2018.00035
Source DB: PubMed Journal: Front Microbiol ISSN: 1664-302X Impact factor: 5.640
Figure 1. Types of networks. (A) Directed networks are composed of nodes representing biological entities such as proteins, metabolites, or genes. These nodes are interconnected by directed edges (arrows) that symbolize a directed relationship between two or more biological species, such as a gene regulated by a transcription factor or a reaction connected downstream to another reaction to form a metabolic pathway. (B) Undirected networks are composed of nodes that represent, for example, proteins. These nodes are interconnected by edges that symbolize an interaction between two or more biological species, for example between signaling proteins.
Basic concepts of biological networks.
| Measurement | Definition | Utility in a biological context | Reference |
| --- | --- | --- | --- |
| Degree distribution | Probability distribution of node degrees in a specific network. | Comparisons, scale-free networks. Clear indicator of the presence of hubs when combined with a centrality measurement. Degree provides clues about modules in a network by determining the number of interactions shared between neighboring nodes. | Képès, |
| Shortest path | The path with the fewest edges between two nodes in a biological network. | Connectivity. | Perumal et al., |
| Average diameter | The minimum number of edges connecting two nodes, averaged over all possible pairs. | Information flow, small-world property. Reflects the capacity and response time of a system; signaling processes are favored in networks with high centrality. | Képès, |
| Node clustering coefficient | The ratio of existing connections among the neighbors of a node to the number of all possible connections among them. | Comparisons, scale-free, hierarchical. | Képès, |
| Betweenness centrality | The fraction of shortest paths between pairs of nodes that pass through a given node. | Identifies hubs (highly connected nodes in a network), which are important in pathogenicity and are potential drug targets. Hubs may disconnect the network if they are removed or blocked. | Goh et al., |
| Assortativity | The tendency of a node to connect with others of the same degree. | Robustness to node deletion. | Newman, |
Summary of structural measurements of the topology of a network and their utility in a biological context.
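As a rough illustration of several measurements in the table, the following minimal sketch computes the degree distribution, shortest path lengths, average path length, and node clustering coefficient for a small undirected network. The toy network, the node names, and all function names are hypothetical and are not taken from the review; only the Python standard library is used.

```python
from collections import Counter, deque

# Hypothetical undirected toy network stored as adjacency sets.
# Node names are illustrative only and do not refer to real proteins.
network = {
    "A": {"B", "C", "D"},
    "B": {"A", "C"},
    "C": {"A", "B"},
    "D": {"A", "E"},
    "E": {"D"},
}

def degree_distribution(graph):
    """Return {degree: fraction of nodes that have this degree}."""
    counts = Counter(len(neighbors) for neighbors in graph.values())
    return {k: n / len(graph) for k, n in sorted(counts.items())}

def shortest_path_lengths(graph, source):
    """Breadth-first search: fewest edges from source to each reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for neighbor in graph[node]:
            if neighbor not in dist:
                dist[neighbor] = dist[node] + 1
                queue.append(neighbor)
    return dist

def average_path_length(graph):
    """Mean shortest-path length over all pairs of distinct, connected nodes."""
    total, pairs = 0, 0
    for source in graph:
        for target, d in shortest_path_lengths(graph, source).items():
            if target != source:
                total += d
                pairs += 1
    return total / pairs

def clustering_coefficient(graph, node):
    """Links present among a node's neighbors divided by all possible such links."""
    neighbors = list(graph[node])
    k = len(neighbors)
    if k < 2:
        return 0.0
    links = sum(
        1
        for i in range(k)
        for j in range(i + 1, k)
        if neighbors[j] in graph[neighbors[i]]
    )
    return links / (k * (k - 1) / 2)

if __name__ == "__main__":
    print(degree_distribution(network))
    print(shortest_path_lengths(network, "E"))
    print(average_path_length(network))
    print(clustering_coefficient(network, "A"))
```

In this toy network, node A has the highest degree and would be the hub candidate. For real interaction data, an established graph library would normally be preferred over hand-written routines like these.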
Figure 2. Metabolic modeling. The process of metabolic modeling starts with a genome annotation, which is used to infer the metabolic reactions present in an organism. Automatic tools can be used to reconstruct the metabolic network from the genome. The initial set of reactions will contain metabolic gaps, that is, missing reactions that are necessary for the complete function of pathways. These gaps can be identified and filled using different algorithms. The final metabolic reconstruction will have associations among genes, proteins, and reactions (GPRs). Further manual curation, based on omics data and the literature, should then be performed. Next, an objective function that represents the target biological function to optimize is defined, typically cell growth or ATP production. Once the objective function is set, computational simulations are carried out to obtain the metabolic phenotypes related to different conditions; Flux Balance Analysis (FBA) is the main technique for these simulations. Finally, new biological hypotheses are generated and validated. Throughout the procedure, data and information from different experimental assays are incorporated into the model.
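For orientation, FBA is commonly written as a linear program. The formulation below is the standard textbook form rather than something stated in the caption; the symbols are assumptions introduced here: $S$ is the stoichiometric matrix of the reconstruction, $v$ the vector of reaction fluxes, $c$ the weights that define the objective function (for example, a single 1 on the biomass reaction), and $v_{\min}$, $v_{\max}$ the flux bounds that encode the simulated condition.

```latex
\begin{aligned}
\max_{v}\quad & c^{\top} v \\
\text{subject to}\quad & S\,v = 0, \\
& v_{\min} \le v \le v_{\max}
\end{aligned}
```

The constraint $S\,v = 0$ expresses the steady-state assumption that internal metabolites do not accumulate. Changing the bounds, for instance closing the uptake reaction of a carbon source, yields the metabolic phenotype predicted for a different condition.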
Examples of objective functions used and the biological utility of the studies.
| Biological utility | Objective function | Reference |
| --- | --- | --- |
| Gene targets for antibiotic development. Growth at different carbon sources (used for classification of strains of | Biomass: at two temperatures. Differences in LPS and fatty acid composition at biomass definition. | Charusanti et al., |
| Metabolic reconstruction, reconciliation of two models. | Biomass | Thiele et al., |
| Reconciliation of simulations and experimental data; gap filling. | Biomass | Fong et al., |
| Search for drug targets and comparison of the metabolic networks of pathogenic and non-pathogenic bacteria. | NA | Perumal et al., |
| Differences and similarities in pathogenesis and virulence. | Biomass: special composition of lipids and fatty acids. | Bartell et al., |
| Establishes a new strategy for the identification of bactericide targets of agricultural importance. | Biomass: | Wang et al., |
| Metabolic model reconstruction and experimental validation of the model. | Biomass | Liao et al., |
| Uncover mechanisms of xanthan biosynthesis for industrial purposes and pathogenicity research. | Biomass/xanthan production | Schatschneider et al., |
| Research on plant-pathogen interactions. | NA | Duan et al., |
| Determination of limits between strain and species at a metabolic level. Characterization of | Biomass | Monk et al., |
Main experimental techniques used for reconstruction or validation of protein-protein interaction networks.
| Y2H - Yeast two hybrid | +++ | B | No antibody required | Elevated rate of false-positives; Nuclear localization of proteins | Weßling et al., | |
| PCA - Protein-fragment complementation Assay | ++ | C | Interaction with membrane proteins | Works better with small monomeric proteins | Ozawa et al., | |
| FRET - Förster resonance energy transfer | + | B | Reversible interaction | Reduced sensitivity; photobleaching | Bhat et al., | |
| BiFC - Bimolecular fluorescence complementation | +++ | B | Used for localization in living cells | Detection of weakly associated proteins | Lacroix et al., | |
| TAP - Tandem affinity purification-mass spectrometry | + | C | Accurate and efficient for multiprotein complexes | High experimental effort and extensive data analysis | Kaneko et al., | |
| Protein array | ++ | C | Highly specific recognition | Needs a set of labeled proteins | Scietti et al., | |
| Pull-down | +++ | C | Medium level of standardization | GST fusion of the protein may cause steric hindrance | Li et al., | |
| Phage display | +++ | C | Great diversity of variant proteins that can be represented in a phage library | Post-translational modifications; selection condition of library | Jonsson et al., |
Computational methods for prediction of protein-protein interaction.
| Phylogenetic | Cluster analysis, maximum likelihood, maximum parsimony, Bayesian inference | Provides information on selective environmental pressure | Difficult to estimate divergence of proteins | Ratmann et al., | |
| Machine learning | Random forest, decision tree, k-nearest neighbors, Bayesian, neural networks, support vector machine | Simple to understand, accurate | Dependent on parameter settings and features, black-box predictor, requires a large data set for training | Nanni et al., | |
| Data mining | Named entity recognition, ID3, computational natural language processing, C4.5 | Fast, processes large volumes of information, good for focused lists | Sensitive to noise, requires manual curation | Bock and Gough, | |
| Topological | Common topological characteristics among species (small-world), comparison with random networks | False positives proportional to the size of the network, configuration of protein modules may vary | Butland et al., | ||
| Structure | Shape complementarity, rigid-body docking, heuristic potential | Accurate, good availability of data for primary and secondary structure | Slow development for high throughput methodologies | Matsuzaki et al., |
Methods for reconstruction of regulatory networks.
| Differential equations | Network dynamics over time, regulation and optimization of function | High computational demand, complex parameter optimization | Linde et al., | |
| Boolean | Switch-like behavior, efficient and easy to interpret | Only two states, works well only in small networks, only synchronous interactions | Franke et al., | |
| Bayesian | Robust to disturbances, integrates prior knowledge to increase support | Non-dynamical, high computational cost, often combined with other methods in a hybrid approach to increase accuracy | Yang et al., | |
| Neural networks | Allows continuous variables over time, very sensitive for regulated systems, noise-resistant | Computationally complex, difficult to train, needs a lot of input data | Yaghoobi et al., | |
| State space model | High computational efficiency, probabilistic framework to simulate the network, determines an optimal threshold value | There are no learning steps | Do et al., |
To counteract the static (non-dynamical) nature of Bayesian networks, the dynamic Bayesian network approach was developed.
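To make the Boolean entry in the table more concrete, the following minimal sketch simulates a small Boolean regulatory network with synchronous updates, in which every gene is either on or off and all genes are updated at once. The four-gene circuit, the gene names, and the update rules are hypothetical and chosen only for illustration; they do not describe any system from the review.

```python
# Hypothetical regulatory circuit: each rule maps the current state to the
# next value of one gene. Names and logic are illustrative assumptions.
rules = {
    "signal": lambda s: s["signal"],  # external input, held constant
    "regulator": lambda s: s["signal"] and not s["repressor"],
    "repressor": lambda s: s["regulator"],  # negative feedback on the regulator
    "virulence": lambda s: s["regulator"],
}

def step(state):
    """Synchronous update: every gene is recomputed from the same current state."""
    return {gene: bool(rule(state)) for gene, rule in rules.items()}

def find_attractor(state, max_steps=100):
    """Iterate until a state repeats and return the repeating cycle of states."""
    seen = []
    while state not in seen and len(seen) < max_steps:
        seen.append(state)
        state = step(state)
    return seen[seen.index(state):] if state in seen else []

if __name__ == "__main__":
    start = {"signal": True, "regulator": False, "repressor": False, "virulence": False}
    for s in find_attractor(start):
        print(s)
```

With the signal on, the negative feedback loop makes this toy circuit oscillate through a cycle of states; with the signal off, it stays in a single resting state. The sketch also shows the limitations listed in the table: each gene has only two states, and all genes switch in lockstep.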
Summary of networks for the study of host-pathogen interaction.
| Network | Experimental data | Mathematical and computational approaches | Research objective |
| --- | --- | --- | --- |
| Regulatory | Genomics; transcriptomics; transcription start site (5′-RACE); binding sites of global regulators (ChIP-chip) | Boolean; network analysis | Dynamics of regulation of genes involved in virulence and pathogenicity |
| Metabolic | Genomics; transcriptomics; metabolomics; phenotype microarrays; C13 labeling | Constraint-based modeling; elementary flux mode analysis; pathway enrichment analysis; network analysis | Metabolic capabilities; genes related to virulence and pathogenicity |
| Protein-protein interaction | Y2H; PCA; BiFC; protein arrays; pull-down; phage display | Phylogenetic methods; dynamical networks; machine learning | Identification of hubs involved in virulence and pathogenicity; determination of interactions between proteins related to signaling and regulatory cascades |
| Signaling and regulatory | Transcriptomics; fusion assays (LacZ reporter); adherence assay; biofilm formation (fluorescence) | Boolean; network analysis | Impact of sensors on regulation of virulence and pathogenesis; cell-to-cell signaling; biofilm synthesis |
| Signaling, regulatory and metabolic | Genomics; metabolomics; transcriptomics | Constraint-based modeling; Boolean model hierarchical layers; network analysis | Model regulatory and metabolic network of QS system |
The networks reviewed in this work, the experimental data (mainly at the level of omics), the mathematical and computational approaches applied for every network, and the research objective for the networks studied are summarized.