Pui Shan Wong, Kosuke Tashiro, Satoru Kuhara, Sachiyo Aburatani.
Abstract
Functional genomics and gene regulation inference have expanded our knowledge and understanding of gene interactions with regard to expression regulation. With the advancement of time-series transcriptome sequencing comes the ability to study sequential changes in the transcriptome. Here, we present a new method that augments regulation networks accumulated in the literature with transcriptome data gathered from time-series experiments to construct a sequential representation of transcription factor activity. We apply our method to a time-series RNA-Seq data set of Escherichia coli as it transitions from growth to stationary phase over five hours, and investigate activity in the gene regulation process by using the correlation between regulatory gene pairs to examine their activity on a dynamic network. We analyse the changes in metabolic activity of the pagP gene and its associated transcription factors during phase transition, and visualize the sequential transcriptional activity to describe the change in metabolic pathway activity originating from the pagP transcription factor, phoP. We observe a shift from amino acid and nucleic acid metabolism to energy metabolism during the transition to stationary phase in E. coli.
Keywords: Escherichia coli; gene regulation; network; time-series
Year: 2017 PMID: 28479747 PMCID: PMC5405090 DOI: 10.6026/97320630013025
Source DB: PubMed Journal: Bioinformation ISSN: 0973-2063
Figure 2. The three equations used in the methodology for network construction. (a) The cross-correlation between transcription factor a and regulated gene b at lag time h, computed for 0 ≤ i + h < N, where N is the number of values in a and b, and ā and b̄ are their respective means. (b) The difference in expression of a gene v between adjacent time points i and j, where i and j take values from 3, 4, 5, 6 and 8, and j > i. (c) A matrix M populated by the pairwise differences between all combinations of da(i,j) and db(x,y), with da ordered along the rows and db ordered along the columns.
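The equations themselves are not reproduced in this text, only described. A plausible reading of the three definitions is sketched below. The symbol r_ab(h), the normalisation of the cross-correlation by the full-series variances, and the sign conventions in (b) and (c) are assumptions based on the standard sample cross-correlation and on the caption wording, not a transcription of the original figure.

```latex
% (a) Cross-correlation at lag h; the numerator sums over all i with 0 <= i + h < N
r_{ab}(h) =
  \frac{\sum_{i} (a_i - \bar{a})\,(b_{i+h} - \bar{b})}
       {\sqrt{\sum_{i=0}^{N-1} (a_i - \bar{a})^2}\;
        \sqrt{\sum_{i=0}^{N-1} (b_i - \bar{b})^2}}

% (b) Difference in expression of gene v between adjacent time points i and j, j > i
d_v(i, j) = v_j - v_i

% (c) Matrix of pairwise differences; rows indexed by (i, j), columns by (x, y)
M_{(i,j),(x,y)} = d_a(i, j) - d_b(x, y)
```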
Figure 1. A flow chart of the lag time h detection method, read from top to bottom. The cross-correlation between a transcription factor a and its regulated gene b is calculated first, and the highest cross-correlation is selected. The lag time h corresponding to the highest cross-correlation is used to exclude the a and b pair if 0 < h ≤ 3; otherwise the analysis continues. The differences in expression between adjacent time points are calculated for a and b individually, and the differences between those differences are then collected in a matrix M, where da forms the rows and db forms the columns. The activation times ta and tb are defined as the row and column of the smallest absolute value in M.
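The flow described in the Figure 1 caption can be sketched in Python with NumPy. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names are hypothetical, the cross-correlation uses the standard normalised form, lags are scanned over the full range from -(N-1) to N-1, and the earlier time point of each adjacent interval is reported as the activation time. The exclusion rule (0 < h ≤ 3) is taken as worded in the caption.

```python
import numpy as np


def cross_correlation(a, b, h):
    """Normalised cross-correlation of a and b at lag h (assumed standard form)."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    n = len(a)
    a_c = a - a.mean()
    b_c = b - b.mean()
    i = np.arange(n)
    valid = (i + h >= 0) & (i + h < n)  # keep only indices with 0 <= i + h < N
    num = np.sum(a_c[i[valid]] * b_c[i[valid] + h])
    den = np.sqrt(np.sum(a_c ** 2) * np.sum(b_c ** 2))
    return num / den if den > 0 else 0.0  # constant series: define as 0


def best_lag(a, b):
    """Lag with the highest cross-correlation (scanned range is an assumption)."""
    n = len(a)
    return max(range(-(n - 1), n), key=lambda h: cross_correlation(a, b, h))


def activation_times(a, b, times):
    """Row and column of the smallest |entry| in M, mapped back to time points."""
    d_a = np.diff(np.asarray(a, dtype=float))  # differences between adjacent time points
    d_b = np.diff(np.asarray(b, dtype=float))
    m = d_a[:, None] - d_b[None, :]            # rows follow d_a, columns follow d_b
    row, col = np.unravel_index(np.argmin(np.abs(m)), m.shape)
    # Assumption: each interval is labelled by its earlier time point.
    return times[row], times[col]


def detect_activation(a, b, times):
    """Return (t_a, t_b), or None when the pair is excluded by the lag rule."""
    h = best_lag(a, b)
    if 0 < h <= 3:  # exclusion rule as stated in the Figure 1 caption
        return None
    return activation_times(a, b, times)
```

For a transcription factor and gene measured at time points `[3, 4, 5, 6, 8]`, `detect_activation(a, b, [3, 4, 5, 6, 8])` returns either `None` or a pair of activation times. Ties in M are resolved by taking the first minimum in row-major order, which is another choice made for this sketch.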
Frequency counts of activation times ta and tb for transcription factor a and the gene b that it regulates.
| Sample | Gene b (hr) | Transcription factor a (hr): 3 | 4 | 5 | 6 | Sum |
|---|---|---|---|---|---|---|
| Sample 1 | 3 | 650 | | | | 650 |
| | 4 | 130 | 491 | | | 621 |
| | 5 | 156 | 82 | 539 | | 777 |
| | 6 | 155 | 177 | 99 | 138 | 569 |
| | Sum | 1,091 | 750 | 638 | 138 | 2,617 |
| Sample 2 | 3 | 571 | | | | 571 |
| | 4 | 77 | 197 | | | 274 |
| | 5 | 172 | 118 | 286 | | 576 |
| | 6 | 37 | 219 | 89 | 212 | 557 |
| | Sum | 857 | 534 | 375 | 212 | 1,978 |
| Sample 3 | 3 | 299 | | | | 299 |
| | 4 | 285 | 497 | | | 787 |
| | 5 | 141 | 235 | 297 | | 673 |
| | 6 | 160 | 218 | 184 | 191 | 753 |
| | Sum | 885 | 950 | 481 | 191 | 2,507 |
Figure 3. The three networks created from each sample, with phoP highlighted in red. The dynamic highlighting of activation times and labels, and the visualization controls, are not shown. (a) Sample 1 produced the largest network, containing 67 vertices and 80 edges; (b) sample 2 contained 6 vertices and 6 edges; and (c) sample 3 contained 26 vertices and 26 edges.