Kranthi Varala1, Amy Marshall-Colón2, Jacopo Cirrone3, Matthew D Brooks4, Angelo V Pasquino4, Sophie Léran4, Shipra Mittal4, Tara M Rock4, Molly B Edwards4, Grace J Kim4, Sandrine Ruffel5, W Richard McCombie6, Dennis Shasha3, Gloria M Coruzzi7. 1. Horticulture and Landscape Architecture/Center for Plant Biology, Purdue University, West Lafayette, IN 47907. 2. Department of Plant Biology, University of Illinois at Urbana-Champaign, Urbana, IL 61801. 3. Courant Institute for Mathematical Sciences, New York University, New York, NY 10012. 4. Center for Genomics and Systems Biology, Department of Biology, New York University, New York, NY 10003. 5. Laboratoire de Biochimie et Physiologie Moléculaire des Plantes, UMR CNRS/INRA/SupAgro/Université Montpellier, Montpellier 34060, France. 6. Cold Spring Harbor Laboratory, Cold Spring Harbor, NY 11724. 7. Center for Genomics and Systems Biology, Department of Biology, New York University, New York, NY 10003; gloria.coruzzi@nyu.edu.
Abstract
This study exploits time, the relatively unexplored fourth dimension of gene regulatory networks (GRNs), to learn the temporal transcriptional logic underlying dynamic nitrogen (N) signaling in plants. Our "just-in-time" analysis of time-series transcriptome data uncovered a temporal cascade of cis elements underlying dynamic N signaling. To infer transcription factor (TF)-target edges in a GRN, we applied a time-based machine learning method to 2,174 dynamic N-responsive genes. We experimentally determined a network precision cutoff, using TF-regulated genome-wide targets of three TF hubs (CRF4, SNZ, and CDF1), and used it to "prune" the network to 155 TFs and 608 targets. This network precision was reconfirmed using genome-wide TF-target regulation data for four additional TFs (TGA1, HHO5/6, and PHL1) not used in network pruning. These higher-confidence edges in the GRN were further filtered by independent TF-target binding data, which were also used to calculate a TF "N-specificity" index. This refined GRN identifies the temporal relationship of known/validated regulators of N signaling (NLP7/8, TGA1/4, NAC4, HRS1, and LBD37/38/39) and 146 additional regulators. Six TFs validated herein (CRF4, SNZ, CDF1, HHO5/6, and PHL1) regulate a significant number of genes in the dynamic N response, targeting 54% of N-uptake/assimilation pathway genes. Phenotypically, inducible overexpression of CRF4 in planta regulates genes resulting in altered biomass, root development, and 15NO3- uptake, specifically under low-N conditions. This dynamic N-signaling GRN now provides the temporal "transcriptional logic" for 155 candidate TFs to improve nitrogen use efficiency with potential agricultural applications. Broadly, these time-based approaches can uncover the temporal transcriptional logic for any biological response system in biology, agriculture, or medicine.
Nitrogen (N), a nutrient and signal, is a core component of fertilizer used in modern agriculture to alleviate worldwide hunger (1). However, this comes at environmental costs, through excess nitrogen run-off due to the low N-use efficiency of crops (2). Thus, improving plant N uptake, assimilation, and utilization is highly desirable. With this goal, studies have attempted to capture and model the N-regulatory networks controlling N uptake/assimilation (3–6). Validation studies have identified several transcription factors (TFs) (7–12) as key regulators of N signaling. However, we lack knowledge of the dynamics and temporal hierarchy of these known (and as yet unknown) TFs in controlling N signaling and N uptake/assimilation. A meta-analysis placed some known regulators within network modules (13). However, such correlation-based networks are unable to predict causality. By contrast, time-based machine learning approaches can predict the regulatory influence of TFs on their targets in the dataset and in out-of-sample data, the ultimate goal of systems biology (5, 14, 15).
In this study, we derived the temporal dynamics of N-regulatory networks by devising and combining several time-based approaches. First, our “just-in-time” (JIT) analysis uncovered a temporal cis-element cascade underlying dynamic N signaling. Second, we used a validated time-driven machine-learning approach, dynamic factor graph (DFG) (5, 14, 15), to infer TF–target interactions among 2,174 N-response genes in shoots. Third, we “pruned” the inferred TF-target edges in this gene regulatory network (GRN) using a precision cutoff threshold derived from experimentally regulated genome-wide targets of six regulators of N uptake/assimilation (CRF4, SNZ, CDF1, HHO5/6, and PHL1) validated herein. This pruned GRN predicts the influence of 155 TFs on 608 N-responsive genes. 
Fourth, to provide further support for the edges in the GRN, we used available TF-target binding data (DNA affinity purification sequencing, DAP-Seq) (16), also used to calculate a TF “N-specificity” index. This time-based GRN now reveals the temporal relationships of TFs previously validated in the N response [e.g., NLP7/8 (7, 17), TGA1/4 (8), NAC4 (9), HRS1 (10), and LBD37, 38, 39 (11)]. It also connects these known TFs with previously undescribed TFs in the N-response cascade, including ones we validated herein—CRF4, SNZ, CDF1, HHO5/6, and PHL1—to regulate a significant number of genes in the dynamic N response, including 54% of nitrate uptake/assimilation pathway genes. Finally, we show that perturbation of CRF4, the earliest N-responsive TF in this GRN, affects genes and processes that result in altered nitrate uptake, root development, and plant biomass, under low-N input conditions. Beyond these proof-of-principle examples, the pruned GRN of dynamic N signaling we derived now provides the temporal “transcriptional logic” for 155 candidate TFs for perturbations aimed at improving nitrogen use efficiency (NUE) with potential applications in agriculture. More broadly, these time-based approaches can be applied to uncover the temporal transcriptional logic for any biological response system in biology, agriculture, or medicine.
Results
A Fine-Scale Time-Course Transcriptome of Dynamic Nitrogen Signaling.
The N nutrient signal elicits dynamic responses in plant metabolism and development (5, 13, 18–21). However, most prior transcriptome studies assayed only one or two time points following N treatment (3, 6, 13) or widely spaced time points not amenable to learning GRN causality (22). A previous study uncovered the very early (3–20 min) transcriptional response to nitrate treatment in Arabidopsis roots (5). Herein, we captured early-to-late transcriptome responses (5, 10, 15, 20, 30, 45, 60, 90, and 120 min) to an N supply (NO3− and NH4+) shown to elicit inorganic- and organic-N responses (4). Genes responding to N as a function of time (NxTime genes) were identified using a cubic-spline model [false discovery rate (FDR) P < 0.01] (23). This analysis identified NxTime response genes in shoots (2,174 genes; Dataset S1, Table S1) and in roots (2,681 genes; Dataset S1, Table S2). These NxTime gene sets are largely organ-specific but share 778 genes, including 54 TFs. They include many known N-responsive genes (3–6, 13) and also 2,737 unique N-responsive genes (24), owing to the increased sensitivity of RNA sequencing and to 511 genes absent from microarrays (13). We also captured transient responses to N supply, including that of the well-known N-regulator TF NLP7 (7, 17). Our dataset captures dynamic effects of N signaling in shoots on metabolism, RNA processing, photosynthesis (25), and circadian rhythm (4) (Dataset S1, Table S1).
Just-in-Time Analysis Uncovers a Temporal Cascade of Cis-Regulatory Elements and Biological Processes in Response to N Supply.
To uncover the regulatory cascade underlying dynamic N signaling, we implemented a JIT analysis. This JIT analysis bins each NxTime gene by the first time point at which its mRNA level is affected by N signaling (fold change ≥ 1.5) (Dataset S1, Table S1). We then identified overrepresented known cis motifs (16, 26, 27) in each JIT bin, using a hypergeometric distribution on a genome-wide promoter background (28). This analysis uncovered a temporal cascade of overrepresented cis-regulatory motifs (e-value < 0.05) in the promoters of genes first responding to N signaling at each JIT point (Fig. 1). The sets of enriched cis elements differ between the JIT sets of shoots (Fig. 1) and roots. The temporal enrichment of unique cis-element motifs in shoots is particularly noticeable at the 10-, 15-, and 20-min JIT points (Fig. 1). Conversely, certain cis-element motifs, such as SORLIP2 and TELO-box, are overrepresented in consecutive JIT sets (Fig. 1). This JIT analysis also uncovered a temporal cascade of Gene Ontology (GO) terms enriched in each JIT gene set in shoots (FDR adjusted P < 0.01) (Fig. 1). The early JIT gene sets (5–15 min) are significantly enriched in genes related to N uptake/assimilation. Intermediate JIT gene sets (20–30 min) are enriched in energy generation. The later JIT gene sets (≥45 min) are enriched in genes for metabolic and developmental processes. Overall, the JIT cis-element and GO analyses implicate a cascade of associated TFs regulating largely nonoverlapping sets of genes at consecutive JIT time points in N signaling (Fig. 1). However, the current cis-motif datasets (16, 26, 27) are generalized for TF families and cannot associate individual TFs with specific target genes. We thus associated specific TFs with targets in the NxTime cascade by using a time-based network inference method described below.
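The JIT binning rule described above can be illustrated with a minimal sketch. It assumes that each gene comes with one treatment/control fold-change ratio per sampled time point and that a 1.5-fold change in either direction counts as a response; the handling of repression, the function names, and the gene names are illustrative assumptions, not the authors' implementation.

```python
from collections import defaultdict

# Sampled time points (minutes after N supply), as listed in the text.
TIME_POINTS = [5, 10, 15, 20, 30, 45, 60, 90, 120]

def jit_bin(fold_changes, threshold=1.5):
    """Return the first time point with a >= threshold-fold change, else None.

    fold_changes holds positive treatment/control ratios, one per time point.
    Counting repression (ratio <= 1/threshold) is an assumption of this sketch.
    """
    for time_point, ratio in zip(TIME_POINTS, fold_changes):
        if ratio >= threshold or ratio <= 1.0 / threshold:
            return time_point
    return None

def jit_bins(gene_to_fold_changes, threshold=1.5):
    """Group genes by the first time point at which they respond."""
    bins = defaultdict(list)
    for gene, ratios in gene_to_fold_changes.items():
        time_point = jit_bin(ratios, threshold)
        if time_point is not None:
            bins[time_point].append(gene)
    return dict(bins)

# Hypothetical genes: geneA is induced at 10 min, geneB is repressed at 20 min.
example = {
    "geneA": [1.0, 1.6, 2.0, 2.1],
    "geneB": [1.1, 1.2, 0.9, 0.6],
    "geneC": [1.0, 1.0, 1.1, 1.0],
}
print(jit_bins(example))
```

Each gene lands in exactly one bin, which is what makes the per-bin motif and GO enrichments comparable across consecutive time points.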
Fig. 1.
JIT gene set analysis identifies a temporal cascade of N-response genes in shoots. Genes responding to NxTime by cubic-spline analysis (23) were binned into the first time point at which mean expression changes by ≥1.5-fold (Dataset S1, Table S1). (A) A cascade of unique cis-element motifs is significantly enriched in each JIT gene set. (B) The JIT gene sets have nonoverlapping sets of GO terms enriched at each time point.
Assigning an N-Specificity Index to TFs in the Dynamic N-Response Cascade.
Our time course captures 172 TFs responding to N supply within 2 h (Dataset S1, Table S13). To identify TFs that play a specific role in N signaling, we computed an N-specificity index (Dataset S1, Table S12), based on available TF-target binding data (16). For each NxTime-regulated TF with genome-wide binding data (40 TFs; Dataset S1, Table S12), we tested whether its genome-wide targets (16) are significantly overrepresented among the NxTime shoot genes, relative to the proportion of all of its bound targets in the genome. This identified 19 TFs with a significant N-specificity score (P < 0.05) in shoots (Dataset S1, Table S12). These N-specific TFs include four validated regulators of the N response [NLP7 (7), TGA1/4 (8), and NAC4 (9)] and 15 additional TFs whose targets are enriched in N-signal-responsive genes in shoots (Dataset S1, Table S12). We note that this N-specificity calculation is limited to the 529 TFs with TF-target binding data currently in the DAP-seq database (16). However, the calculation may be applied to any TF with known genome-wide targets, as we show for SNZ and CDF1, detailed below.
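One way to express such an overrepresentation test is a one-sided hypergeometric test, the same distribution the text names for the cis-motif enrichment. The sketch below assumes this test for the N-specificity index as well, which the text here does not state; the function names and gene identifiers are hypothetical.

```python
from math import comb

def hypergeom_upper_tail(k, M, n, N):
    """P(X >= k) for N draws without replacement from M items, n of them successes."""
    upper = min(n, N)
    favourable = sum(comb(n, i) * comb(M - n, N - i) for i in range(k, upper + 1))
    return favourable / comb(M, N)

def n_specificity_p(tf_targets, nxtime_genes, genome_genes):
    """P-value that a TF's bound targets are overrepresented among NxTime genes.

    All three arguments are sets of gene identifiers; anything outside the
    genome background is ignored.
    """
    targets = tf_targets & genome_genes
    responsive = nxtime_genes & genome_genes
    overlap = len(targets & responsive)
    return hypergeom_upper_tail(overlap, len(genome_genes), len(responsive), len(targets))

# Toy background of 10 genes, 3 of them N-responsive; the TF binds 4 genes.
genome = {f"g{i}" for i in range(10)}
nxtime = {"g0", "g1", "g2"}
bound = {"g0", "g1", "g2", "g3"}
print(n_specificity_p(bound, nxtime, genome))
```

The exact integer arithmetic of `math.comb` keeps the tail probability stable even for genome-sized backgrounds, at the cost of speed compared with a dedicated statistics library.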
Inferring a Time-Derived GRN Driving the Temporal N Response in Shoots.
De novo network inference is a valuable approach to build GRNs (29–32). Because causality moves forward in time, fine-scale time-series transcriptome experiments are an especially valuable resource to infer GRNs that can predict out-of-sample target gene behavior, the ultimate goal of systems biology (5, 14, 33). Previously, we applied a time-based machine-learning method, DFG (15), to learn and predict causal relationships between TFs and their targets (5, 14). Briefly, DFG identifies the likely set of TFs driving target gene expression by learning a function f that explains the target gene expression at each time point, based on the expression of the TFs at previous time points (15). Here, we used the DFG method to predict the influence of every TF on every gene in the shoot NxTime gene set, implementing rigorous hyperparameterization steps. The resultant DFG network provides a measure of the influence (i.e., TF-target edge score) of each of the 172 TFs on the 2,174 N-responsive genes in shoots (i.e., ∼374,000 predicted edges). However, a major challenge in de novo network inference is the high false-positive rate of TF-target predictions (30). We thus estimated confidence in the edges of our time-inferred DFG network by comparing the predicted TF-target edges in the GRN to experimentally validated TF targets (30). This method establishes the precision (i.e., proportion of predicted TF-target edges that are real) and recall (i.e., proportion of real TF-target edges that are predicted) of the GRN, which can then be used to prune the network to enrich for higher-confidence TF-target predictions (30). To implement this network pruning step, we first retained the top 10% of DFG predictions (Dataset S1, Table S19) and experimentally validated the genome-wide targets of three TF hubs in this initial DFG network. This genome-wide TF-target validation step established the precision vs. recall for the larger GRN of 155 TFs, as detailed below.
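The definitions of precision and recall given above, together with the top-10% retention step, can be sketched as follows. Edges are represented as (TF, target) pairs; the function names and the toy scores are illustrative assumptions and do not reproduce the DFG scoring itself.

```python
def top_fraction(edge_scores, fraction=0.10):
    """Return the highest-scoring fraction of edges, best first."""
    ranked = sorted(edge_scores, key=edge_scores.get, reverse=True)
    return ranked[: int(len(ranked) * fraction)]

def precision_recall(predicted_edges, validated_edges):
    """Precision: share of predicted edges that are validated.
    Recall: share of validated edges that are predicted."""
    predicted = set(predicted_edges)
    validated = set(validated_edges)
    true_positives = len(predicted & validated)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(validated) if validated else 0.0
    return precision, recall

# Toy example with a hypothetical TF and ten scored target genes.
scores = {("TF1", f"g{i}"): i / 10 for i in range(10)}
kept = top_fraction(scores, 0.10)
validated = {("TF1", "g9"), ("TF1", "g4")}
print(kept, precision_recall(kept, validated))
```

Only TFs with experimentally determined targets contribute to these estimates, so the predicted set should be restricted to edges leaving those TFs before comparison.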
Validation of TFs CRF4, SNZ, and CDF1 in the Time-Inferred GRN That Regulate N-Response and N-Uptake/Assimilation Pathway Genes.
To implement genome-wide validation of the DFG-predicted GRN, CRF4 was selected for initial TF-target validation because it (i) is early N-responsive in both shoots and roots, (ii) is a TF hub (422 out-edges) in the top 10% unpruned DFG shoot network (∼35,200 edges) (Dataset S1, Table S19), (iii) has a high N-specificity index (Dataset S1, Table S12), and (iv) is a TF in N signaling with potential links to the cytokinin pathway (34). We identified the genome-wide targets of CRF4 in an inducible overexpression transplanta line (CRF4-OX) (35) and also via TF perturbation in shoot cells using the TARGET (Transient Assay Reporting Genome-wide Effects of Transcription factors) assay (10, 36–38). These results confirm the early and central role that CRF4 plays in the dynamic N response. In planta CRF4-regulated targets are significantly overrepresented in NxTime genes in both shoots and roots, spanning early and later JIT NxTime points. The validated genome-wide targets of CRF4 in shoots include 16 downstream TFs responsive to NxTime (Dataset S1, Table S5). We next selected two validated TF targets of CRF4, an “early” (SNZ, 10-min JIT) and a “late” (CDF1, 45-min JIT) N responder, for TF-perturbation studies in shoot cells using the TARGET system (36). These results revealed that the targets regulated by CRF4, SNZ, and CDF1 (Dataset S1, Tables S5–S7) (i) are significantly enriched in NxTime genes, (ii) support a high N-specificity index, and (iii) are enriched in GO terms related to nitrate assimilation/metabolism (for CRF4, SNZ, and CDF1), ribosome biogenesis (for CRF4), and rhythmic processes (for CDF1) (Dataset S1, Tables S8–S10). Collectively, the targets of CRF4, SNZ, and CDF1 encompass (i) 54% of N-uptake/assimilation pathway genes (35/65), (ii) 75% of the NxTime genes in the N-uptake/assimilation pathway (12/16), and (iii) 23 N-pathway genes that are not NxTime-responsive (Fig. 2). 
We note that the cell-based TARGET system can identify direct targets based on TF regulation (Fig. 2, solid lines), because translation of mRNA from primary TF targets is blocked (36). By contrast, in planta TF perturbations cannot distinguish direct from indirect regulated targets (Fig. 2, dashed lines).
Fig. 2.
Three TFs (CRF4, SNZ, and CDF1) regulate 54% of the N-uptake/assimilation pathway genes. A time-based machine learning approach, DFG (5, 15), was used to infer TF-target influence in an N-response GRN in shoots. Validated genome-wide targets of three TFs in this GRN (CRF4, SNZ, and CDF1) are shown to regulate 54% (35/65) of the genes in the N-uptake/assimilation pathway (Dataset S1, Tables S5–S7 and S15). TF edges to N-responsive genes (green nodes) that are predicted by the GRN and validated by TF perturbations are shown by asterisks and thicker edge width (Dataset S1, Table S14). Gray circles indicate other cellular processes validated to be regulated by these three TFs (Dataset S1, Tables S8–S10).
The GRN Is Pruned Using Genome-Wide TF-Target Validation Data to Identify Higher-Confidence Edge Predictions.
Next, to validate our edge predictions in the time-derived GRN, we used experimentally derived TF-target regulation data for CRF4, SNZ, and CDF1. First, we tested the significance of the DFG TF-target edge rankings in our GRN by performing an area under the precision-recall curve (AUPR) analysis (30, 39, 40). We compared the ranked TF-target edge predictions from the DFG-inferred network to a random ranking of TF-target edges (1,000 iterations). This analysis showed that the AUPR of the DFG-inferred network (0.24) is significantly better than the mean AUPR for random networks (0.14) (P < 0.001). Next, to identify higher-confidence edges in the GRN (39), we chose a cutoff point (precision = 0.345) before the precision-recall curve flattens. This precision cutoff point of 0.345 corresponds to a TF-target edge score of 0.95554 in our GRN. Thus, only TF-target edges with an edge score ≥0.95554 were retained in our pruned DFG network (Dataset S1, Table S3). This pruned GRN includes 85 validated targets out of the 245 predicted TF-target edges between CRF4, SNZ, and CDF1 and genes in the NxTime shoot set. These predicted and validated targets of CRF4, SNZ, and CDF1 include five key genes in N uptake/assimilation (NRT1.1, NR1 and NR2, NIR, and GLN1.1) (Fig. 2, edges denoted by asterisks), 10 genes involved in transcription/translation, and genes in the circadian clock (e.g., TIC) (Dataset S1, Table S14). As our pruned GRN was optimized to increase precision at the cost of low recall, it likely underestimates the influence of a given TF on the GRN. For example, only 9/24 experimentally validated edges from these three TFs to the 12 N-responsive genes in the N-assimilation pathway (Fig. 2, green nodes) are in the pruned network (Fig. 2, edges with asterisks) (Dataset S1, Table S14).
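The comparison of a ranked edge list against random rankings, followed by score-based pruning, can be sketched as below. The sketch estimates the AUPR by average precision and derives an empirical P value from shuffled rankings; both choices, the function names, and the toy data are assumptions for illustration and may differ from the procedure the authors used.

```python
import random

def average_precision(ranked_labels):
    """Average precision of a ranking; ranked_labels[i] is 1 if the edge at
    rank i + 1 is experimentally validated, else 0."""
    n_true = sum(ranked_labels)
    if n_true == 0:
        return 0.0
    hits, total = 0, 0.0
    for rank, is_true in enumerate(ranked_labels, start=1):
        if is_true:
            hits += 1
            total += hits / rank  # precision at this recall step
    return total / n_true

def random_ranking_test(ranked_labels, iterations=1000, seed=0):
    """Return (observed AUPR, mean AUPR of shuffled rankings, empirical P)."""
    rng = random.Random(seed)
    observed = average_precision(ranked_labels)
    shuffled = list(ranked_labels)
    random_scores = []
    for _ in range(iterations):
        rng.shuffle(shuffled)
        random_scores.append(average_precision(shuffled))
    n_as_good = sum(score >= observed for score in random_scores)
    p_value = (n_as_good + 1) / (iterations + 1)
    return observed, sum(random_scores) / iterations, p_value

def prune_by_score(edge_scores, cutoff):
    """Keep only edges whose score is at least the chosen cutoff."""
    return {edge: s for edge, s in edge_scores.items() if s >= cutoff}

# Toy ranking: 5 validated edges ranked above 15 unvalidated ones.
labels = [1] * 5 + [0] * 15
print(random_ranking_test(labels))
print(prune_by_score({("TF1", "g1"): 0.96, ("TF1", "g2"): 0.95}, 0.95554))
```

With 1,000 shuffles the smallest reportable empirical P value under this formula is 1/1001, which is consistent in scale with a reported P < 0.001.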
Cross-Validation of the Pruned N-Signaling GRN Using TF-Target Regulation Data from Four Additional TFs.
The above pruned GRN, at an average precision of ∼0.345, can now predict the influence of 155 N-responsive TFs on 608 NxTime genes in the dynamic N response in shoots (Dataset S1, Table S3). To independently validate this precision rate, we identified the regulated TF targets of four additional TFs in the GRN (HHO5/6, PHL1, and TGA1) in shoot cells using the TARGET system (Dataset S1, Tables S22–S25). The precision for each of these four TFs in the GRN ranged from 0.17 to 0.45, for an overall average of 0.32. In total, 110/349 predicted TF targets in the pruned GRN were experimentally validated, including six genes involved in N uptake/reduction. These four TFs also influence a significant number of genes and processes in the NxTime gene set in shoots (Dataset S1, Tables S27–S30). This independent TF validation shows that the initial network precision of 0.345 used to prune the shoot N-response GRN extends beyond the three TFs used in the network pruning stage (i.e., CRF4, SNZ, and CDF1) and can be used to predict targets of 155 TFs in the N-response GRN with an overall precision of ∼0.33. We note that this overall precision of ∼0.33 (i.e., about one in three predicted edges is likely to be true) is of a scale comparable to the maximum precision of 0.5 achieved using an ensemble approach of multiple network inference methods in microbes (30).
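The precision figures quoted in this section can be checked with simple arithmetic on the edge counts given in the text (85/245 for the three pruning TFs, 110/349 for the four additional TFs). Pooling the two groups is shown only as one plausible way to arrive at an overall value near 0.33; the text does not state how that figure was computed.

```python
# Counts quoted in the text: (validated edges, predicted edges in the pruned GRN).
PRUNING_TFS = (85, 245)    # CRF4, SNZ, and CDF1
HELD_OUT_TFS = (110, 349)  # TGA1, HHO5/6, and PHL1

def precision(validated, predicted):
    """Fraction of predicted edges that were experimentally validated."""
    return validated / predicted

def pooled_precision(*groups):
    """Precision over several (validated, predicted) groups combined."""
    validated = sum(v for v, _ in groups)
    predicted = sum(p for _, p in groups)
    return validated / predicted

print(round(precision(*PRUNING_TFS), 3))
print(round(precision(*HELD_OUT_TFS), 3))
print(round(pooled_precision(PRUNING_TFS, HELD_OUT_TFS), 3))
```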
Independent TF-Target Binding Data Support Predicted Edges in the NxTime GRN.
The TF-target edges in the pruned DFG network (Dataset S1, Table S3) identified as hubs (i.e., influential TFs) multiple known/validated regulators of N signaling (e.g., TGA1/4, NLP7/8, NAC4, HRS1, and LBD37/38/39) (7–11, 17), as well as 146 potential regulators, including six validated herein: CRF4, SNZ, CDF1, HHO5/6, and PHL1 (Dataset S1, Tables S5–S7 and S22–S24). To add further edge support, the DFG-predicted edges in the pruned GRN were queried using an independent source of TF-target binding data for the 40 NxTime TFs in shoots that are present in the DAP-Seq dataset (16) (Fig. 3). A TF-target edge in the pruned DFG network is considered supported by DAP-Seq TF-target data (16) only if that TF is shown to bind to the promoter of the target gene in the DAP-Seq assay (41) (Fig. 3 and Dataset S1, Table S4). We note that the actual DAP-Seq TF-DNA binding data (41) were used to establish TF-target binding, not the in silico cis-motif information (16). The 19 TFs in the pruned GRN that have DAP-Seq data and high N-specificity include four known TFs in the N response (TGA1/4, NAC4, and NLP7) and four TFs (CRF4, HHO5/6, and PHL1) validated herein (Fig. 3, red underlined TFs).
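The binding-support rule stated above (an edge is kept only if the TF binds the promoter of its predicted target) amounts to a set-membership filter. The sketch below assumes the binding data have already been reduced to a mapping from each TF to the set of genes whose promoters it binds; the data layout, function name, and identifiers are hypothetical.

```python
def binding_supported_edges(pruned_edges, bound_targets):
    """Return the predicted (TF, target) edges that have binding support.

    pruned_edges: iterable of (TF, target) pairs from the pruned network.
    bound_targets: dict mapping a TF to the set of genes whose promoters it binds.
    TFs without binding data contribute no supported edges.
    """
    return [
        (tf, target)
        for tf, target in pruned_edges
        if target in bound_targets.get(tf, set())
    ]

# Toy example: TF2 has no binding data, so its edge cannot be supported.
edges = [("TF1", "g1"), ("TF1", "g2"), ("TF2", "g1")]
binding = {"TF1": {"g1", "g5"}}
print(binding_supported_edges(edges, binding))
```

Because unsupported edges include those from TFs that were never assayed, the absence of support here is not evidence against an edge.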
Fig. 3.
A time-dependent GRN uncovers TFs in dynamic N signaling in shoots. A time-based machine learning approach, DFG (5, 15), was used to infer TF-target influence in a GRN. Validated genome-wide targets of three TFs (CRF4, SNZ, and CDF1) were used to prune the GRN for TF-target precision based on AUPR analysis (Dataset S1, Tables S5–S7). This TF-target precision was reconfirmed using data for four independent TFs (TGA1, HHO5/6, and PHL1) (Dataset S1, Tables S22–S25). The TF-target edges supported by an independent source of TF-target binding data [DAP-Seq (16, 41)] capture regulation of 208 N-responsive target genes by 35 TFs (Dataset S1, Table S4). TFs with a significant N-specificity index are highlighted in red (Dataset S1, Table S12). Validated TF regulators of the N response are underlined: NLP7 (7), TGA1/4 (8), NAC4 (9), LBD37, 38 (11), and CRF4, HHO5/6, and PHL1 (this study).
CRF4—the Earliest TF in the N-signaling GRN—Regulates N Uptake and N Use in Planta.
The pruned DFG network, refined by TF-target binding data, places CRF4 at the top of the N-signaling cascade (Fig. 3), based on its early response (5-min JIT) and its GRN connections (Dataset S1, Table S3). Indeed, our validation studies support the early and specific role of CRF4 in mediating the dynamic N-response GRN in planta. Inducible expression using a CRF4-OX transplanta line (35) reveals that CRF4 controls a highly significant number of NxTime genes, spanning early and later JIT gene sets (Dataset S1, Tables S18 and S21). Notably, CRF4 directly or indirectly regulates approximately one-third of the genes in the N-uptake/assimilation pathway (21/65), including seven N-uptake genes (Fig. 2). In planta CRF4 targets are also enriched in N-metabolic processes and translation (in shoots) and in response to nitrate and root development (in roots). Moreover, these CRF4-mediated changes in gene regulation affect N uptake and use in planta (Fig. 4). CRF4 overexpression results in significantly lowered shoot biomass (P < 1e-5) (Fig. 4), primary root length, and number of lateral roots under low-N conditions, where the high-affinity N-transporter NRT2.1 is the major functional nitrate-uptake system (42). Further, repression of NRT2.1 in shoots and roots of CRF4-OX plants (Fig. 4 and Dataset S1, Tables S5 and S20) leads to lower rates of nitrate uptake under low-N conditions in the CRF4-OX line (35). Using a 15NO3 tracer (43), we found that 15NO3 uptake was significantly reduced in the induced CRF4-OX line, to levels comparable to those of the nrt2.1 mutant impaired in high-affinity nitrate uptake (44), compared with the uninduced CRF4-OX line and wild-type controls, under low-N conditions (two-way ANOVA with Tukey honestly significant difference analysis) (Fig. 4 and Dataset S1, Table S16). 
These results validate the important role CRF4 plays in regulating N uptake/use—acting either directly or indirectly through its downstream TFs, such as SNZ and CDF1 (Fig. 4).
Fig. 4.
CRF4 overexpression represses high-affinity nitrate uptake and biomass in planta. (A) CRF4 overexpression via β-estradiol (+βE) induction (35) represses SNZ, CDF1, and NRT2.1 (Dataset S1, Table S5). SNZ and CDF1 overexpression in shoot cells (36) induces NRT2.1 expression (Dataset S1, Tables S6 and S7). CRF4 overexpression in low-N (1 mM NO3) conditions significantly reduces (B) the rate of nitrate 15NO3− uptake (Dataset S1, Table S16) and (C) shoot biomass in planta. N.S., not significant.
Discussion
N, a key nutrient and signal, regulates dynamic plant processes including circadian rhythm (4) and root foraging (5, 13, 18–20). However, the underlying temporal mechanisms are unknown. Our JIT analysis uncovered discrete waves of transcriptional responses to N signaling in shoots (Fig. 1). For example, we confirm and extend the role of N signaling as an input to the circadian clock in plants (4). N signaling regulates TFs in the circadian clock, inducing TOC1 and CDF1, and repressing ZTL within 20–45 min after N supply. Overall, the shoot NxTime gene set shows significant enrichment for genes with peak expression at predawn (45) (Dataset S1, Table S26).
A Fine-Scale Time Course and GRN Establishes the Temporal Hierarchy of N-Signaling Regulators.
Next, we used the DFG network inference method (5, 15) to derive GRNs that reveal the transcriptional logic underlying dynamic N signaling in shoots. The resulting N-response network pruned for precision (Dataset S1, Table S3) now places 155 N-responsive TFs in shoots in a temporal hierarchy and predicts their likely temporal interactions. For example, the 12 TFs that respond earliest to the N signal in shoots (5-min JIT) include TFs previously validated in the N response, LBD37/38/39 (11) and HRS1 (10), and an “early” TF validated herein, CRF4 (Fig. 3). We note that some of the earliest steps of N-signal transduction are also likely to occur via posttranslational modifications (18) or changes in TF localization, as shown for NLP7 (7). This pruned network is further supported by TF-target binding data and reveals a set of 15 TFs that are specific to the N response (Fig. 3, TFs in red) (Dataset S1, Table S12). This establishes the power of applying de novo GRN inference approaches to expression datasets, as shown for GRNs mediating environmental responses in rice (29) and drought responses in Arabidopsis (46).
CRF4 Regulates N Uptake and N Use in Planta.
Our time-based N-regulatory network revealed CRF4 as an early player in mediating the N-signaling response. Indeed, our genome-wide target studies and phenotypic analysis support the key role CRF4 plays in mediating nitrate uptake and use in planta (Fig. 4). In response to N supply, CRF4 represses genes in the N-assimilation pathway, including the high-affinity nitrate transporter NRT2.1 (Dataset S1, Tables S5 and S20), which is repressed under high-N conditions in wild type (44). Additionally, we validated two downstream TF targets of CRF4 and found that SNZ is largely an activator, while CDF1 activates or represses genes in the N-assimilation pathway (Fig. 2). CRF4 targets in shoots include ribosomal proteins, induced within 30–45 min of N supply (Dataset S1, Table S5). In roots, CRF4 regulates nitrate uptake and root development processes, consistent with the in vivo phenotypes (Fig. 4 and Dataset S1, Tables S20–S21). N signaling is an additional role for CRF4, whose only previously described role was in the plant cold response (47). In addition, we discovered that 11/12 members of the CRF family (47) are N-responsive, including eight in shoots (CRF1–6, CRF10, and CRF11) and three (CRF3–4 and CRF11) in roots (Dataset S1, Tables S1 and S2). This highlights a potential role of the CRF family in linking the N response and cytokinin signaling (34). Our study also identifies multiple TFs that link nitrogen and phosphate responses (HHO5/6 and PHL1), as previously shown for HRS1 (10).
In addition to discovering TFs in the N-response network, the “transcriptional logic” of N signaling uncovered herein can also suggest the temporal mode of action for TFs and guide combinatorial TF experiments, which will be valuable for the global goal of enhancing NUE. More broadly, our time-centric approach that uses fine-scale time-course data to fuel causal network inference can now be applied to understand any stimulus-driven GRN in any organism. 
Moreover, the analysis approaches we described (JIT and the N-specificity index) can be used to uncover the regulatory structure and signal specificity in any time-series transcriptome dataset. When coupled with genome-wide TF-target binding data [e.g., ChIP-Seq and DAP-Seq (16)] and other layers of genome-wide dynamic interaction data [e.g., chromatin accessibility maps (27)], the approach employed in our time-based study can identify key molecular players, their hierarchy, and other emergent network properties in any complex transcriptional regulatory system in biology, agriculture, or medicine.
Methods
Plant Material and N Treatments.
Arabidopsis thaliana (Col-0) seeds were grown on N-free Murashige and Skoog (MS) media + 1 mM KNO3 for 2 wk in long-day conditions. Two hours after the start of the light period, plants were treated with (i) standard MS media (20 mM KNO3 + 20 mM NH4NO3) (4, 48) or (ii) 20 mM KCl. Triplicate shoot and root samples were harvested at 0, 5, 10, 15, 20, 30, 45, 60, 90, and 120 min after treatments.
CRF4-OX NUE Phenotyping.
Col-0 and the CRF4-OX TRANSPLANTA line (CS2104639) (35) were grown on 30 mM N [½ MS (48)] for 7 d and treated with (i) 10 μM β-estradiol or DMSO (solvent) and (ii) low N (1 mM nitrate) or high N [30 mM ½ MS (48)] for 5 d. Shoot dry weight, root length, and lateral root number and length were assayed. Influx of 15NO3- was assayed as described previously on Col-0, CRF4-OX, and nrt2.1 mutant (44) plants.
Genome-Wide TF-Target Validation.
In planta genome-wide targets of CRF4 were identified by differential expression in Col-0 vs. the CRF4-OX line (CS2104639) (35), 24 h after TF induction by 10 μM β-estradiol. The TARGET system (36) was used to identify the genome-wide targets of seven TFs (CRF4, SNZ, CDF1, TGA1, HHO5/6, and PHL1) (Dataset S1, Tables S5–S7 and S22–S25) as in ref. 37. Transformed shoot cells were treated sequentially with (i) 20 mM KNO3 + 20 mM NH4NO3 for 2 h, (ii) 35 μM cycloheximide for 20 min, and (iii) 10 μM dexamethasone to induce TF nuclear localization (37). Shoot cells overexpressing the TF or carrying an empty vector were collected in triplicate, and their transcriptomes were profiled on the Illumina NextSeq 500 platform. Genes differentially expressed in response to TF overexpression were identified using the DESeq2 package (FDR < 0.05).
Time-Course Transcriptome Profiling.
RNA was extracted from ∼100 mg shoot or root tissue and used for Illumina-compatible library preparation. cDNA libraries were sequenced on the Illumina HiSeq 2500 v4 platform (100 bp, PE). Gene expression values were determined after quality filtering, genome alignment, and quantile normalization. Differentially expressed genes (FDR < 0.05) were identified by fitting a cubic spline model (df = 5) (23).
GRN Inference and Network Pruning.
DFG (15) was used to infer interactions between 172 TFs and 2,174 NxTime genes in shoots (Dataset S1, Table S1). Experimentally determined TF-target relationships from CRF4, SNZ, and CDF1 were used to perform an area under the precision-recall curve (AUPR) analysis and identify a pruning threshold of precision = 0.345, which was independently confirmed with validated targets of four TFs in the GRN: TGA1, HHO5/6, and PHL1. Further support for predicted TF–target interactions was obtained from in vitro TF-promoter binding for 40 TFs that have binding data (16, 41) (Fig. 3).
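The pruning step described above can be illustrated with a minimal sketch; it is not the authors' script. It assumes that each predicted edge carries a confidence score (higher meaning more confident) and that validated TF-target pairs are available as a set. The function names, the toy scores, and the gene labels g1 to g4 are hypothetical. The sketch ranks edges by score, tracks precision and recall down the ranking, and returns the lowest score at which precision still meets a chosen cutoff (such as the 0.345 reported here).

```python
from typing import List, Optional, Set, Tuple

Edge = Tuple[str, str]  # (TF, target)


def precision_recall_curve(
    scored_edges: List[Tuple[Edge, float]], validated: Set[Edge]
) -> List[Tuple[float, float, float]]:
    """Return (score, precision, recall) at each rank, best-scoring edge first.

    Assumption: scored_edges contains only edges whose TF has validation data,
    so that an edge absent from `validated` counts as a false positive.
    """
    if not validated:
        raise ValueError("need at least one validated edge")
    ranked = sorted(scored_edges, key=lambda item: item[1], reverse=True)
    curve = []
    hits = 0
    for rank, (edge, score) in enumerate(ranked, start=1):
        if edge in validated:
            hits += 1
        curve.append((score, hits / rank, hits / len(validated)))
    return curve


def score_cutoff(
    curve: List[Tuple[float, float, float]], min_precision: float
) -> Optional[float]:
    """Lowest score at which the edges ranked so far still reach min_precision."""
    passing = [score for score, precision, _ in curve if precision >= min_precision]
    return min(passing) if passing else None


# Toy data only: scores and gene labels are invented for illustration.
toy_edges = [
    (("CRF4", "g1"), 0.9),
    (("CRF4", "g2"), 0.8),
    (("SNZ", "g3"), 0.7),
    (("CDF1", "g4"), 0.6),
]
toy_validated = {("CRF4", "g1"), ("SNZ", "g3")}

toy_curve = precision_recall_curve(toy_edges, toy_validated)
toy_cutoff = score_cutoff(toy_curve, min_precision=0.6)
```

Edges scoring at or above the returned cutoff would be retained in the pruned network. Because precision is not monotonic along the ranking, the sketch takes the deepest rank that still satisfies the cutoff; other conventions are possible.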
JIT Analysis of Time-Series Transcriptome Data.
Each NxTime gene was assigned to the first time bin at which its expression in N-treated samples was ≥1.5-fold that of control (Dataset S1, Tables S1 and S2). Each JIT gene set was analyzed to identify overrepresentation of cis-regulatory motifs [FDR E-value < 0.05, Elefinder (28)], and these cis elements were hierarchically clustered (Fig. 1). JIT gene sets were also analyzed to identify overrepresented GO terms (49) (Fig. 1).
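The bin assignment can be sketched as follows, assuming one expression value per gene, time point, and treatment (for example, a replicate mean). The function names and the toy expression values are hypothetical, and the sketch considers only the ≥1.5-fold criterion stated above.

```python
from collections import defaultdict
from typing import Dict, List, Optional, Sequence, Tuple

# Sampling times (min) listed under Plant Material and N Treatments.
TIME_POINTS = [0, 5, 10, 15, 20, 30, 45, 60, 90, 120]


def first_response_time(
    treated: Sequence[float],
    control: Sequence[float],
    times: Sequence[int] = TIME_POINTS,
    fold: float = 1.5,
) -> Optional[int]:
    """Return the first time point at which treated/control >= fold, else None."""
    for t, n_val, c_val in zip(times, treated, control):
        if c_val > 0 and n_val / c_val >= fold:
            return t
    return None


def jit_gene_sets(
    expression: Dict[str, Tuple[Sequence[float], Sequence[float]]],
    times: Sequence[int] = TIME_POINTS,
    fold: float = 1.5,
) -> Dict[int, List[str]]:
    """Group genes by the time bin of their first response.

    `expression` maps gene -> (N-treated series, control series).
    """
    bins = defaultdict(list)
    for gene, (treated, control) in expression.items():
        t = first_response_time(treated, control, times, fold)
        if t is not None:
            bins[t].append(gene)
    return {t: sorted(genes) for t, genes in sorted(bins.items())}


# Toy values only, invented for illustration.
toy_expression = {
    "geneA": ([10, 10, 16, 20, 20, 20, 20, 20, 20, 20], [10] * 10),
    "geneB": ([5, 5, 5, 5, 5, 9, 9, 9, 9, 9], [5] * 10),
    "geneC": ([8] * 10, [8] * 10),  # never reaches the fold cutoff
}
toy_bins = jit_gene_sets(toy_expression)
```

Each resulting gene set could then be passed to a motif or GO enrichment tool; that step is outside the scope of this sketch.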
Nitrogen-Specificity Index for TFs in the GRN.
For each of the 40 NxTime TFs with in vitro genome-wide TF-target binding data (16, 41), we retrieved the genome-wide targets in the shoot NxTime set. TFs with a significantly higher proportion of targets in the NxTime set relative to their genome-wide distribution (one-tailed t test, P < 0.01) were accepted as being specific to the N signal (Dataset S1, Table S12).
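The idea behind the index, asking whether a TF's bound targets fall in the NxTime set more often than expected from the genome as a whole, can be sketched as below. Note the substitution: the analysis above used a one-tailed t test, whereas this sketch uses a hypergeometric upper-tail test, a common choice for set-overlap questions, purely for illustration. The function names, the toy gene labels, and the toy genome size are hypothetical.

```python
from math import comb
from typing import Set, Tuple


def hypergeom_upper_tail(k: int, M: int, n: int, N: int) -> float:
    """P(X >= k) for X ~ Hypergeometric(population M, n marked genes, N draws)."""
    upper = min(n, N)
    favorable = sum(comb(n, i) * comb(M - n, N - i) for i in range(k, upper + 1))
    return favorable / comb(M, N)


def n_specificity(
    tf_targets: Set[str],
    nxtime_genes: Set[str],
    genome_size: int,
    alpha: float = 0.01,
) -> Tuple[float, float, bool]:
    """Return (fraction of TF targets in the NxTime set, P value, P < alpha).

    Assumption: both gene sets are drawn from the same genome of `genome_size` genes.
    """
    if not tf_targets:
        raise ValueError("TF has no targets")
    in_set = len(tf_targets & nxtime_genes)
    p_value = hypergeom_upper_tail(
        in_set, genome_size, len(nxtime_genes), len(tf_targets)
    )
    return in_set / len(tf_targets), p_value, p_value < alpha


# Toy example only: a 10-gene "genome" with 4 NxTime genes.
toy_nxtime = {"g1", "g2", "g3", "g4"}
toy_targets = {"g1", "g2", "g5"}
toy_fraction, toy_p, toy_specific = n_specificity(toy_targets, toy_nxtime, genome_size=10)
```

With real data, `genome_size` would be the number of genes assayed for binding, and the P values across the 40 TFs would typically be corrected for multiple testing.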
Data Availability.
All raw sequence data from this project have been deposited into NCBI’s GEO database (accession no. GSE97500). All data and scripts used in this study are available in Data Dryad, dx.doi.org/10.5061/dryad.248g184.