A system for generating transcription regulatory networks with combinatorial control of transcription.

Sushmita Roy, Margaret Werner-Washburne, Terran Lane.

Abstract

We have developed a new software system, REgulatory Network generator with COmbinatorial control (RENCO), for automatic generation of differential equations describing pre-transcriptional combinatorics in artificial regulatory networks. RENCO has the following benefits: (a) it explicitly models protein-protein interactions among transcription factors, (b) it captures combinatorial control of transcription factors on target genes and (c) it produces output in Systems Biology Markup Language (SBML) format, which allows these equations to be directly imported into existing simulators. Explicit modeling of the protein interactions allows RENCO to incorporate greater mechanistic detail of the transcription machinery than existing models and can provide a better assessment of algorithms for regulatory network inference. AVAILABILITY: RENCO is a C++ command-line program, available at http://sourceforge.net/projects/renco/

Year:  2008        PMID: 18400774      PMCID: PMC2373921          DOI: 10.1093/bioinformatics/btn126

Source DB:  PubMed          Journal:  Bioinformatics        ISSN: 1367-4803            Impact factor:   6.937


1 INTRODUCTION

With the increasing availability of genome-scale data, a plethora of algorithms are being developed to infer regulatory networks. Examples of such algorithms include Bayesian networks and ARACNE (Bansal et al., 2007). Because the "ground truth" of regulatory network topology is not available, these algorithms are evaluated on artificial networks generated via network simulators (Kurata et al., 2003; Margolin et al., 2005; Mendes et al., 2003; Schilstra and Bolouri, 2002). Since gene regulation is a dynamic process, existing network simulations employ systems of ordinary differential equations (ODEs) that describe the kinetics of mRNA and protein concentrations as a function of time. Some approaches construct highly detailed models, but require large amounts of user-specified information (Kurata et al., 2003; Schilstra and Bolouri, 2002). Other approaches generate large networks but use simpler models in which the mRNA concentration of a target gene depends on the mRNA concentration, rather than the protein concentration, of its transcription factors (Mendes et al., 2003). In real biological systems, protein expression does not correlate with gene expression, especially at steady state, due to different translation and degradation rates (Belle et al., 2006). These approaches also do not model protein interaction edges and therefore miss the combinatorics that result from these interactions.
We describe a regulatory network generator, RENCO, that models genes and proteins as separate entities, incorporates protein–protein interactions among the transcription factor proteins, and generates ODEs that explicitly capture the combinatorial control of transcription factors. RENCO accepts either a pre-specified network topology or a gene count, in which case it generates a network topology. The network topology is used to generate ODEs that capture combinatorial control among transcription factor proteins.
The output from RENCO is in SBML format, compatible with existing simulators such as Copasi (Hoops et al., 2006) and RANGE (Long and Roth, 2007). Time-series and steady-state expression data simulated from the generated ODEs can be used for comparative analysis of different network inference algorithms.

2 TRANSCRIPTIONAL REGULATORY NETWORK GENERATOR

RENCO works in two steps: (a) generate or read the network topology and (b) generate the ODEs specifying the transcription kinetics (see the RENCO manual for details). For (a), proteins are connected to each other via a scale-free network (Albert and Barabasi, 2000), and to genes via a network with exponential degree distribution (Maslov and Sneppen, 2005).
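
Step (a) can be illustrated with a minimal Python sketch. It is not RENCO's implementation: it assumes plain preferential attachment as a stand-in for the scale-free model, draws gene indegrees from a truncated exponential distribution, and every name and default (`scale_free_edges`, `exponential_indegrees`, `m`, `mean`, `max_indegree`) is hypothetical.

```python
import random

def scale_free_edges(n_proteins, m=1, seed=0):
    """Undirected protein-protein edges grown by preferential attachment.

    Assumes n_proteins >= 2. Each new protein attaches to up to m existing
    proteins chosen with probability proportional to their current degree.
    """
    rng = random.Random(seed)
    edges = {(0, 1)}
    # Each node appears in 'stubs' once per incident edge, so a uniform
    # draw from 'stubs' is a degree-proportional draw over nodes.
    stubs = [0, 1]
    for new in range(2, n_proteins):
        partners = set()
        while len(partners) < min(m, new):
            partners.add(rng.choice(stubs))
        for old in partners:
            edges.add((old, new))
            stubs += [old, new]
    return edges

def exponential_indegrees(n_genes, mean=1.5, max_indegree=4, seed=0):
    """Per-gene indegree drawn from an exponential distribution (truncated)."""
    rng = random.Random(seed)
    return [min(int(rng.expovariate(1.0 / mean)), max_indegree)
            for _ in range(n_genes)]

ppi = scale_free_edges(5, m=1, seed=42)
indegrees = exponential_indegrees(5, seed=42)
print(sorted(ppi), indegrees)
```

The indegree of each gene then fixes how many regulators are attached to it in the protein-to-gene layer.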

2.1 Modeling combinatorial control of gene regulation

We model combinatorial control by first identifying the set of cliques up to a maximum size $t$ in the protein interaction network. Each clique represents a protein complex that must function together to produce the desired target regulation. A target gene $g_i$ is regulated by $k$ randomly selected cliques, where $k$ is the indegree of the gene. These $k$ cliques regulate $g_i$ by binding in different combinations, thus exercising combinatorial gene regulation. We refer to the set of cliques in a combination as a transcription factor complex (TFC). At any time there can be several such TFCs regulating $g_i$. The mRNA concentration of a target gene is, therefore, a function of three types of regulation: within-clique, within-complex and across-complex regulation. Within-clique regulation captures the contribution of one clique on a target gene. Within-complex regulation captures the combined contribution of all cliques in one TFC. Finally, across-complex regulation specifies the combined contribution of different TFCs.
We now introduce the notation for ODEs generated by RENCO. $M_i(t)$ and $P_i(t)$ denote the mRNA and protein concentrations, respectively, of gene $g_i$ at time $t$. $V_i^M$ and $v_i^M$ denote the rate constants of mRNA synthesis and degradation of $g_i$; $V_i^P$ and $v_i^P$ denote the rate constants of protein synthesis and degradation. $C_{ij}$ and $T_{iq}$ denote a protein clique and a TFC, respectively, associated with $g_i$. $Q_i$ denotes the set of TFCs associated with $g_i$. $X_{ij}$, $Y_{iq}$ and $S_i$ specify the within-clique, within-complex and across-complex regulation on $g_i$. Based on existing work (Mendes et al., 2003; Schilstra and Bolouri, 2002), the rate of change of mRNA concentration is the difference of synthesis and degradation of $g_i$:

$$\frac{dM_i(t)}{dt} = V_i^M S_i - v_i^M M_i(t).$$

Similarly for protein concentration,

$$\frac{dP_i(t)}{dt} = V_i^P M_i(t) - v_i^P P_i(t).$$

The across-complex regulation $S_i$ is a weighted sum of contributions from $|Q_i|$ TFCs:

$$S_i = \sum_{q=1}^{|Q_i|} w_q Y_{iq},$$

where $w_q$ denotes the TFC weight. The sum models 'or' behavior of the different TFCs because all TFCs need not be active simultaneously. The within-complex regulation $Y_{iq}$ is a product of within-clique actions in the TFC $T_{iq}$:

$$Y_{iq} = \prod_{C_{ij} \in T_{iq}} X_{ij}.$$

The product models 'and' behavior of a single TFC because all proteins within a TFC must be active at the same time. Finally, the cliques $C_{ij}$ of each gene are randomly assigned activating or repressing roles on $g_i$. If $C_{ij}$ is activating,

$$X_{ij} = \prod_{p \in C_{ij}} \frac{P_p}{Ka_{ip} + P_p},$$

otherwise,

$$X_{ij} = \prod_{p \in C_{ij}} \frac{Ki_{ip}}{Ki_{ip} + P_p}.$$

$Ka_{ip}$ and $Ki_{ip}$ are equilibrium dissociation constants of the $p$th activator or repressor of $g_i$. All degradation, synthesis and dissociation constants are initialized uniformly at random from $[0.01, V_{\max}]$, where $V_{\max}$ is user specified.
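
The three levels of regulation can be sketched in Python as follows. This is an illustration under stated assumptions rather than RENCO's code: the saturating forms P/(K+P) for an activator and K/(K+P) for a repressor are assumed, regulator protein levels are held constant during integration, and all function names, weights and rate constants are hypothetical.

```python
from math import prod

def within_clique(clique, protein, K, activating):
    """X: product over the proteins of one clique (assumed saturating terms)."""
    if activating:
        return prod(protein[p] / (K[p] + protein[p]) for p in clique)
    return prod(K[p] / (K[p] + protein[p]) for p in clique)

def within_complex(tfc, protein, K, roles):
    """Y: 'and' over the cliques of one TFC, modeled as a product."""
    return prod(within_clique(c, protein, K, roles[c]) for c in tfc)

def across_complex(tfcs, weights, protein, K, roles):
    """S: 'or' over TFCs, modeled as a weighted sum."""
    return sum(w * within_complex(t, protein, K, roles)
               for t, w in zip(tfcs, weights))

def simulate_target(S, VM, vM, VP, vP, dt=0.01, steps=5000):
    """Explicit Euler integration of dM/dt = VM*S - vM*M, dP/dt = VP*M - vP*P.

    S is held fixed, i.e. regulator proteins are assumed constant.
    """
    M = P = 0.0
    for _ in range(steps):
        M, P = M + dt * (VM * S - vM * M), P + dt * (VP * M - vP * P)
    return M, P

# Hypothetical target regulated by the cliques (P2,) and (P0, P1).
protein = {"P0": 1.0, "P1": 1.0, "P2": 1.0}
K = {"P0": 1.0, "P1": 1.0, "P2": 1.0}
roles = {("P2",): True, ("P0", "P1"): True}          # True = activating
tfcs = [(("P2",),), (("P0", "P1"),), (("P2",), ("P0", "P1"))]
weights = [0.5, 0.3, 0.2]

S = across_complex(tfcs, weights, protein, K, roles)
M_end, P_end = simulate_target(S, VM=1.0, vM=0.5, VP=2.0, vP=1.0)
print(S, M_end, P_end)
```

In a full simulation the regulator proteins would evolve under their own equations, so S would be recomputed at every step instead of being held fixed.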

3 EXAMPLE NETWORK

We used RENCO to analyze (a) mRNA and protein steady-state measurements and (b) combinatorial gene regulation in a small example network (see Supplementary Material for details).

3.1 Importance of modeling protein expression

The example network has five genes and five proteins (Fig. 1a). The gene G4 is regulated via different combinations of the cliques {P2} and {P0,P1}. We find that the wild-type time course of each individual mRNA is correlated with that of its corresponding protein (Fig. 1b and c). However, because different genes and proteins have different degradation and synthesis rate constants, the mRNA population as a whole correlates only weakly with the protein population (Spearman's correlation = 0.3). Because of this dissimilarity between the steady-state mRNA and protein expression populations, genes that appear to be differentially expressed at the mRNA level may not be differentially expressed at the protein level. This highlights the importance of modeling mRNA and protein expression as separate entities in the network.
Fig. 1.

(a) Example network. Dashed edges indicate regulatory actions. Wild-type gene (b) and protein (c) time courses.
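
The role of gene-specific rate constants can be made explicit with a short steady-state calculation. This is a sketch that assumes linear synthesis and degradation kinetics, where $M_i$ and $P_i$ are the mRNA and protein concentrations of gene $g_i$, $S_i$ is its regulatory input, and $V_i^M$, $v_i^M$, $V_i^P$, $v_i^P$ are the synthesis and degradation rate constants of its mRNA and protein. The index $i$ and the starred steady-state symbols are notation introduced here for illustration.

```latex
\begin{align*}
\frac{dM_i}{dt} = V_i^M S_i - v_i^M M_i = 0
  \quad&\Longrightarrow\quad M_i^* = \frac{V_i^M}{v_i^M}\, S_i^*, \\
\frac{dP_i}{dt} = V_i^P M_i - v_i^P P_i = 0
  \quad&\Longrightarrow\quad P_i^* = \frac{V_i^P}{v_i^P}\, M_i^*.
\end{align*}
```

Under these assumptions each steady-state protein level is its mRNA level scaled by the gene-specific ratio $V_i^P / v_i^P$. When that ratio varies from gene to gene, the rank order of proteins across genes can differ from the rank order of mRNAs, which is consistent with a low population-level rank correlation.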

3.2 Combinatorics of gene regulation

We analyzed combinatorial control in our network by generating the G4 time course under different knockout combinations of the G4 activators P0, P1 and P2 (Fig. 2). Because all the regulators are activating, G4 is downregulated relative to wild type. Each knockout combination yields a different time course. In particular, knocking out either G0 or G1 in combination with G2 is sufficient to drive G4 expression to 0. This happens because P0 and P1 act only together, as the clique {P0,P1}, so losing either one disables the whole clique. This illustrates how combinatorial regulation can produce a range of expression dynamics using a few transcription factors.
Fig. 2.

G4 time course under knock out combinations of G0, G1 and G2.
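
The knockout logic can be reproduced with a small Python sketch. It is not the simulation behind Fig. 2: it evaluates only the regulatory input to G4 at a fixed snapshot of protein levels, treats a knockout as setting the corresponding protein to zero, and assumes three TFCs built from the cliques {P2} and {P0,P1}. The weights, the dissociation constant `KA` and all names are hypothetical.

```python
from itertools import combinations
from math import prod

CLIQUES = {"c_a": ("P2",), "c_b": ("P0", "P1")}     # both assumed activating
TFCS = [("c_a",), ("c_b",), ("c_a", "c_b")]         # assumed combinations
WEIGHTS = [0.4, 0.4, 0.2]                           # hypothetical TFC weights
KA = 1.0                                            # hypothetical constant
WILD_TYPE = {"P0": 1.0, "P1": 1.0, "P2": 1.0}

def regulation_input(protein):
    """Weighted sum over TFCs of the product of within-clique terms."""
    def clique_term(name):
        return prod(protein[p] / (KA + protein[p]) for p in CLIQUES[name])
    return sum(w * prod(clique_term(c) for c in tfc)
               for w, tfc in zip(WEIGHTS, TFCS))

def knock_out(names):
    """Protein levels with the knocked-out regulators set to zero."""
    return {p: (0.0 if p in names else level) for p, level in WILD_TYPE.items()}

# Enumerate every knockout combination, from wild type to the triple knockout.
for size in range(len(WILD_TYPE) + 1):
    for ko in combinations(sorted(WILD_TYPE), size):
        print(ko, round(regulation_input(knock_out(ko)), 4))
```

Removing P0 or P1 zeroes every term that contains the clique {P0,P1}, so once P2 is also removed no TFC remains active and the input to G4 vanishes.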

4 CONCLUSION

We have described RENCO, a generator for artificial regulatory networks and their ODEs. RENCO models the transcriptional machinery more faithfully by explicitly capturing protein interactions and provides a good testbed for network structure inference algorithms.

REFERENCES

1.  Topology of evolving networks: local events and universality

Authors:  Réka Albert; Albert-László Barabási
Journal:  Phys Rev Lett       Date:  2000-12-11       Impact factor: 9.161

2.  CADLIVE for constructing a large-scale biochemical network based on a simulation-directed notation and its application to yeast cell cycle.

Authors:  Hiroyuki Kurata; Nana Matoba; Natsumi Shimizu
Journal:  Nucleic Acids Res       Date:  2003-07-15       Impact factor: 16.971

3.  Computational architecture of the yeast regulatory network.

Authors:  Sergei Maslov; Kim Sneppen
Journal:  Phys Biol       Date:  2005-11-09       Impact factor: 2.583

4.  Quantification of protein half-lives in the budding yeast proteome.

Authors:  Archana Belle; Amos Tanay; Ledion Bitincka; Ron Shamir; Erin K O'Shea
Journal:  Proc Natl Acad Sci U S A       Date:  2006-08-17       Impact factor: 11.205

5.  COPASI--a COmplex PAthway SImulator.

Authors:  Stefan Hoops; Sven Sahle; Ralph Gauges; Christine Lee; Jürgen Pahle; Natalia Simus; Mudita Singhal; Liang Xu; Pedro Mendes; Ursula Kummer
Journal:  Bioinformatics       Date:  2006-10-10       Impact factor: 6.937

6.  Synthetic microarray data generation with RANGE and NEMO.

Authors:  James Long; Mitchell Roth
Journal:  Bioinformatics       Date:  2007-11-03       Impact factor: 6.937

7.  Artificial gene networks for objective comparison of analysis algorithms.

Authors:  Pedro Mendes; Wei Sha; Keying Ye
Journal:  Bioinformatics       Date:  2003-10       Impact factor: 6.937

8.  ARACNE: an algorithm for the reconstruction of gene regulatory networks in a mammalian cellular context.

Authors:  Adam A Margolin; Ilya Nemenman; Katia Basso; Chris Wiggins; Gustavo Stolovitzky; Riccardo Dalla Favera; Andrea Califano
Journal:  BMC Bioinformatics       Date:  2006-03-20       Impact factor: 3.169
