Sylvie Schulze1, Sebastian G Henkel2, Dominik Driesch2, Reinhard Guthke1, Jörg Linde1.
Abstract
Inference of inter-species gene regulatory networks based on gene expression data is an important computational method to predict pathogen-host interactions (PHIs). Both the experimental setup and the nature of PHIs exhibit certain characteristics. First, besides an environmental change, the battle between pathogen and host leads to a constantly changing environment and thus to complex gene expression patterns. Second, there might be a delay until one of the organisms reacts. Third, toward later time points only one organism may survive, leading to missing gene expression data for the other organism. Here, we account for PHI characteristics by extending NetGenerator, a network inference tool that predicts gene regulatory networks from gene expression time series data. We tested multiple modeling scenarios regarding the stimuli functions of the interaction network based on a benchmark example. We show that modeling perturbation of a PHI network by multiple stimuli better represents the underlying biological phenomena. Furthermore, we utilized the benchmark example to test the influence of missing data points on the inference performance. Our results suggest that PHI network inference with missing data is possible, but we recommend providing complete time series data. Finally, we extended the NetGenerator tool to incorporate gene- and time-point-specific variances, because complex PHIs may lead to high variance in expression data. Sample variances are considered directly in the objective function of NetGenerator and indirectly by testing the robustness of interactions based on variance-dependent disturbance of gene expression values. We evaluated the method of variance incorporation on dual RNA sequencing (RNA-Seq) data of Mus musculus dendritic cells incubated with Candida albicans and validated our method by predicting previously verified PHIs as robust interactions.
Keywords: NetGenerator; dual RNA-Seq; gene regulatory networks; inter-species interactions; microarrays; network inference; transcriptomics
Year: 2015 PMID: 25705211 PMCID: PMC4319478 DOI: 10.3389/fmicb.2015.00065
Source DB: PubMed Journal: Front Microbiol ISSN: 1664-302X Impact factor: 5.640
Figure 1. From dual RNA-Seq data to inter-species GRNs. (A) Dual RNA extraction results in one sample to be sequenced. (B) Data preprocessing and analysis lead to separation of host and pathogen RNA-Seq data. DEGs are identified and candidate genes selected. (C) Prediction of an inter-species GRN with NetGenerator.
Figure 2. Testing PHI data characteristics. (A) Benchmark example of an inter-species GRN with three pathogen candidate genes (orange nodes), four host candidate genes (green nodes), and two stimuli (gray nodes). Edges represent interactions. (B) Test setup. (C) F-measures calculated from predicted network topologies and the known network topology given different stimuli functions. Two stimuli increase F-measures (Test-2). (D) F-measures calculated from predicted network topologies and the known network topology based on missing data values. Carefully selected time points covering both the host and pathogen response increase F-measures (Test-3).
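The F-measure comparison in panels (C) and (D) can be illustrated with a minimal sketch. It assumes that a network topology is represented as a set of directed (source, target) edges and that the F-measure is the usual harmonic mean of precision and recall; whether the original evaluation also scored edge signs is not stated here, so signs are ignored. The function name `f_measure` and the example gene labels are hypothetical.

```python
def f_measure(predicted_edges, true_edges):
    """F-measure of a predicted topology against a known one.

    Edges are (source, target) tuples; edge signs are ignored
    (an assumption of this sketch).
    """
    predicted = set(predicted_edges)
    true = set(true_edges)
    true_positives = len(predicted & true)
    if true_positives == 0:
        return 0.0
    precision = true_positives / len(predicted)
    recall = true_positives / len(true)
    return 2 * precision * recall / (precision + recall)


# Hypothetical benchmark: stimulus S1, pathogen genes P1/P2, host genes H1/H2.
known = {("S1", "P1"), ("P1", "H1"), ("H1", "H2"), ("H2", "P2")}
predicted = {("S1", "P1"), ("P1", "H1"), ("H1", "P2")}
score = f_measure(predicted, known)
```

In this example two of three predicted edges are correct (precision 2/3) and two of four known edges are recovered (recall 1/2), so the score is 4/7.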
Figure 3. GRN robustness analysis and visualization. (A) Fitting plots for each gene are generated showing measured time points (dots), simulated time courses (solid lines), interpolated time courses (dashed lines), and standard deviations from replicated measurements (shaded areas). (B) Outer robustness analysis. Noise is added to time series data with variances calculated from replicates of genes and time points. This is repeated n times to predict n GRNs. (C) The bubble map visualizes the robustness of a predicted edge from column gene to row gene. Bubble sizes illustrate the robustness score assigned to an edge. Orange and blue pies illustrate the fraction of activating and inhibiting edges, respectively.
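The outer robustness analysis in panel (B) can be sketched as a resampling loop. This is an illustration under stated assumptions, not the NetGenerator implementation: noise is assumed to be Gaussian with a standard deviation taken per gene and time point, the robustness score of an edge is assumed to be the fraction of the n perturbed runs in which that edge is predicted, and `toy_infer` is a hypothetical stand-in for the actual network inference step.

```python
import random
from collections import Counter


def perturb(means, sds, rng):
    """Add Gaussian noise to each value, using gene- and time-point-specific SDs."""
    return {
        gene: [rng.gauss(m, s) for m, s in zip(means[gene], sds[gene])]
        for gene in means
    }


def edge_robustness(means, sds, infer, n=100, seed=0):
    """Fraction of n perturbed runs in which each edge is predicted."""
    rng = random.Random(seed)
    counts = Counter()
    for _ in range(n):
        # Deduplicate so each run counts an edge at most once.
        counts.update(set(infer(perturb(means, sds, rng))))
    return {edge: count / n for edge, count in counts.items()}


def toy_infer(data):
    """Hypothetical placeholder for network inference (NOT NetGenerator).

    Predicts an edge src -> tgt whenever tgt ends higher than src.
    """
    return [
        (src, tgt)
        for src in data
        for tgt in data
        if src != tgt and data[tgt][-1] > data[src][-1]
    ]


# Hypothetical replicate means and standard deviations for two genes.
means = {"A": [0.0, 1.0, 2.0], "B": [0.0, 0.5, 3.0]}
sds = {"A": [0.1, 0.2, 1.0], "B": [0.1, 0.2, 1.0]}
scores = edge_robustness(means, sds, toy_infer, n=200, seed=1)
```

Edges with scores close to 1 are predicted regardless of the added noise; those are the ones a bubble map would draw with the largest bubbles.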