Zhengchang Su, Fenglou Mao, Phuongan Dam, Hongwei Wu, Victor Olman, Ian T Paulsen, Brian Palenik, Ying Xu.
Abstract
Deciphering the regulatory networks encoded in the genome of an organism represents one of the most interesting and challenging tasks in the post-genome sequencing era. As an example of this problem, we have predicted a detailed model for the nitrogen assimilation network in the cyanobacterium Synechococcus sp. WH 8102 (WH8102) using a computational protocol based on comparative genomics analysis and on mining experimental data from related, relatively well-studied organisms. This computational model is in excellent agreement with the microarray gene expression data collected under ammonium-rich versus nitrate-rich growth conditions, suggesting that our computational protocol is capable of predicting biological pathways/networks with high accuracy. We then refined the computational model using the microarray data and proposed a new model for the nitrogen assimilation network in WH8102. An intriguing discovery from this study is that nitrogen assimilation affects the expression of many genes involved in photosynthesis, suggesting a tight coordination between the nitrogen assimilation and photosynthesis processes. Moreover, for some of these genes, this coordination is probably mediated by NtcA through the canonical NtcA promoters in their regulatory regions.
Year: 2006 PMID: 16473855 PMCID: PMC1363776 DOI: 10.1093/nar/gkj496
Source DB: PubMed Journal: Nucleic Acids Res ISSN: 0305-1048 Impact factor: 16.971
The components in the initial network model
| Name | Protein ID | Synonym | Operon | Templates | SAM | Rank of NtcA binding site |
|---|---|---|---|---|---|---|
| | 33864789 | | | PCC6803 | | 9 |
| | 33865607 | | | PCC7120, PCC6803 | −3.820 | 3 |
| | 33864998 | | | PCC7120, PCC6803 | 6.422 | 26 |
| | 33866067 | | | PCC7120 | | 299 |
| | 33864702 | | | PCC7120, PCC6803 | | 618 |
| | 33866994 | | | PCC7120, PCC6803, WH8103 | | 616 |
| | 33867007 | | | PCC7120, PCC6803, WH8103 | | 11 |
| | 33867017 | | | PCC7942 | −5.132 | 4 |
| | 33867016 | | | PCC7942 | | 4 |
| | 33867015 | | | PCC7942 | | 4 |
| | 33867020 | | | PCC7942 | −3.420 | 735 |
| | 33866993 | | | WH8103 | −5.184 | 960 |
| | 33864811 | | | PCC7120, PCC6803 | −5.098 | 1 |
| | 33865285 | | | PCC7120 | −6.842 | 162 |
| | 33866250 | | | PCC7120 | | 1101 |
| | 33866979 | | | PCC7120 | −3.602 | 98 |
| | 33866978 | | | PCC7120 | −6.018 | 98 |
| | 33866977 | | | PCC7120 | −7.659 | 98 |
| | 33866976 | | | PCC7120 | | 98 |
| | 33866975 | | | PCC7120 | | 78 |
| | 33866974 | | | PCC7120 | | 78 |
| | 33866973 | | | PCC7120 | | 78 |
| | 33866972 | | | PCC7120 | | 2 |
| | 33866971 | | | PCC7120 | −4.551 | 2 |
| | 33866970 | | | PCC7120 | −2.995 | 2 |
| | 33866969 | | | PCC7120 | −4.429 | 2 |
| | 33866968 | | | PCC7120 | −4.423 | 2 |

ᵃ SAM D values: a positive or negative number indicates that the gene was up- or down-regulated by ammonium, respectively.
ᵇ The top 54 predictions are considered putative NtcA promoters. The NtcA promoter for nrtP was probably missed by our program; see text.

(A) Distributions of the confidence scores of the protein–protein interactions predicted by orthology mapping. (B) Distributions of the confidence scores of the protein–protein interactions predicted by protein fusion analysis. For both methods, the distribution of the scores of the non-verified predictions in E. coli K12 (black lines) and that of the predictions in WH8102 (green lines) are very similar to that of the verified predictions in E. coli K12 (red lines). (C) Distributions of the confidence scores of the combined protein–protein interactions predicted by the two methods. The distributions of the scores of the predicted protein–protein interactions in E. coli K12 and WH8102 are very similar to each other, suggesting that similar prediction accuracy has been achieved for both species.

(A) Predicted genome-scale protein–protein interaction map in WH8102. Each vertex represents a protein, and an edge represents a predicted interaction. Vertices in red are proteins in the initial model of the nitrogen assimilation network. (B) The distribution of the degree of the vertices of the predicted protein–protein interactions in WH8102. It can be fitted to a power-law function, $C(n) = n^{-\gamma}$, where $C(n)$ is the number of proteins with $n$ interacting partners and $\gamma$ is a constant. (C) Proteins recruited into the network model through predicted protein–protein interactions. The vertices in red are the proteins in the initial model and those in black are proteins recruited. Proteins marked by ‘+’ or ‘−’ were up- or down-regulated by ammonium, respectively.
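The power-law fit mentioned in panel (B) can be illustrated with a minimal sketch. It assumes an ordinary least-squares fit on log-transformed values and allows a constant prefactor, neither of which is stated in the caption; the function name `fit_power_law_exponent` and the toy counts are hypothetical and are not data from the study.

```python
import math

def fit_power_law_exponent(degree_counts):
    """Estimate gamma in C(n) ~ a * n**(-gamma) by least squares in log-log space.

    degree_counts maps n (number of interacting partners) to C(n)
    (number of proteins with that many partners).
    """
    # Only positive n and C(n) have a defined logarithm.
    points = [(math.log(n), math.log(c))
              for n, c in degree_counts.items() if n > 0 and c > 0]
    if len(points) < 2:
        raise ValueError("need at least two positive (n, C) pairs")
    mean_x = sum(x for x, _ in points) / len(points)
    mean_y = sum(y for _, y in points) / len(points)
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    slope = sxy / sxx
    # log C = log a - gamma * log n, so gamma is the negated slope.
    return -slope

# Hypothetical counts generated from C(n) = 1000 * n**-2 (not data from the paper).
toy_counts = {n: 1000 * n ** -2 for n in (1, 2, 4, 8, 16)}
gamma = fit_power_law_exponent(toy_counts)
```

On real degree data, a log-log regression of this kind is sensitive to sparsely populated high-degree bins, so the resulting exponent is best read as a rough estimate.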
The components of the network recruited by NtcA promoter predictions at P < 0.05
| Rank | Transcription unit | Names | NtcA site | Downstream of NtcA binding site and −10 like box | NtcA site position | Score |
|---|---|---|---|---|---|---|
| 1 | | | | | −49 | 14.518 |
| 2 | | | | | −52 | 14.188 |
| 3 | | | | | −60 | 13.978 |
| 4 | | | | | −51 | 13.831 |
| 5 | | | | | −39 | 13.152 |
| 6 | | | | | −64 | 13.112 |
| 7 | | | | | −62 | 13.111 |
| 8 | | | | | −173 | 13.097 |
| 9 | | | | | −37 | 13.015 |
| 10 | | | | | −143 | 12.792 |
| 11 | | | | | −349 | 12.711 |
| 12 | | | | | −69 | 12.678 |
| 13 | | | | | −199 | 12.586 |
| 14 | | | | | −111 | 12.538 |
| 15 | | | | | −527 | 12.477 |
| 16 | | | | | −52 | 12.416 |
| 17 | | | | | −56 | 12.168 |
| 18 | | | | | −56 | 12.160 |
| 19 | | | | | −72 | 12.098 |
| 20 | | | | | −53 | 11.997 |
| 21 | | | | | −32 | 11.987 |
| 22 | | | | | −251 | 11.976 |
| 23 | | | | | −141 | 11.947 |
| 24 | | | | | −93 | 11.936 |
| 25 | | | | | −627 | 11.920 |
| 26 | | | | | −51 | 11.915 |
| 27 | | | | | −51 | 11.883 |
| 28 | | | | | −423 | 11.865 |
| 29 | | | | | −51 | 11.816 |
| 30 | | | | | −222 | 11.787 |
| 31 | | | | | −89 | 11.753 |
| 32 | | | | | −71 | 11.743 |
| 33 | | | | | −80 | 11.615 |
| 34 | | | | | −662 | 11.614 |
| 35 | | | | | −70 | 11.564 |
| 36 | | | | | −159 | 11.563 |
| 37 | | | | | −91 | 11.533 |
| 38 | | | | | −628 | 11.526 |
| 39 | | | | | −398 | 11.480 |
| 40 | | | | | −391 | 11.432 |
| 41 | | | | | −452 | 11.380 |
| 42 | | | | | −151 | 11.356 |
| 43 | | | | | −103 | 11.318 |
| 44 | | | | | −704 | 11.288 |
| 45 | | | | | −177 | 11.273 |
| 46 | | | | | −113 | 11.257 |
| 47 | | | | | −37 | 11.236 |
| 48 | | | | | −209 | 11.232 |
| 49 | | | | | −130 | 11.211 |
| 50 | | | | | −53 | 11.203 |
| 51 | | | | | −77 | 11.201 |
| 52 | | | | | −149 | 11.175 |
| 53 | | | | | −286 | 11.172 |
| 54 | | | | | −94 | 11.166 |

ᵃ Bold face or underlined: down- or up-regulated when grown on ammonium relative to nitrate, respectively.
ᵇ Bold face: putative −10 like box.

Phylogenetic profile analysis. (A) 2D representation of the clusters of the proteins of WH8102 based on the similarity of their phylogenetic profiles. The open circles near the horizontal axis indicate the mapping coordinates of the proteins in the initial model of the nitrogen assimilation network; crowded circles are arbitrarily separated along the vertical axis so that they remain distinguishable. The cluster labeled by bar I contains proteins unique to WH8102. The next seven clusters, labeled by bar II, contain proteins unique to the nine cyanobacterial genomes used in our analysis. Cluster III contains 20 proteins shared by all 231 genomes in our analysis. (B, C and D) Close-up views of the clusters containing at least one protein in the initial network model. The open circles represent proteins in the initial network. Proteins that were down-regulated by ammonium are labeled by ‘−’. The P-values indicate the statistical significance levels of the clusters.
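As a rough illustration of what a phylogenetic profile is, the sketch below represents each protein as a presence/absence vector across reference genomes and compares two profiles by Hamming distance. The distance measure, the function names and the five-genome toy profiles are assumptions made for illustration; the caption does not state which similarity measure or clustering procedure was used.

```python
def hamming_distance(profile_a, profile_b):
    """Count the genomes in which exactly one of the two proteins has a homolog."""
    if len(profile_a) != len(profile_b):
        raise ValueError("profiles must cover the same set of genomes")
    return sum(a != b for a, b in zip(profile_a, profile_b))

def classify_profile(profile, own_genome_index=0):
    """Label a profile as 'universal', 'unique' to the organism itself, or 'partial'."""
    if all(profile):
        return "universal"   # homolog present in every genome (compare cluster III)
    if profile[own_genome_index] and sum(profile) == 1:
        return "unique"      # present only in the organism itself (compare cluster I)
    return "partial"

# Hypothetical profiles over five genomes; index 0 stands for the organism itself.
protein_x = (1, 1, 1, 0, 0)
protein_y = (1, 1, 0, 0, 0)
protein_z = (1, 1, 1, 1, 1)

distance_xy = hamming_distance(protein_x, protein_y)
label_z = classify_profile(protein_z)
```

Proteins with small pairwise distances have similar profiles and would fall into the same cluster under this simplified view.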
Validation of the components of the nitrogen assimilation network recruited by the methods of the computational protocol
| Methods | Number of genes | Number of genes affected | P-value |
|---|---|---|---|
| Initial model | 27 | 14 | 2.35 × 10⁻⁷ |
| Protein–protein interactions | 16 (32) | 4 (14) | 0.0525 (4.02 × 10⁻⁶) |
| Phylogenetic profile | 4 (12) | 0 (7) | − (2.96 × 10⁻⁵) |
| Regulon prediction | 89 (102) | 24 (32) | 1.55 × 10⁻⁴ (3.89 × 10⁻⁷) |
| Operon | 2 (17) | 1 (9) | 0.018 (1.36 × 10⁻⁵) |
| Combined | 133 | 42 | 5.68 × 10⁻⁹ |

ᵃ The number in parentheses is the result when the genes in the initial network model are also counted for the method; see text.

(A) The cumulative probability functions of the scores of putative NtcA promoters (an NtcA binding site plus a downstream TAN₃T/A box) found for genes down-regulated [p(S(g_d) > s), pink] and for genes unaffected [p(S(g_o) > s), blue] by ammonium, and their log odds ratio function (LOR, red; see Materials and Methods). The solid vertical line indicates the score cutoff of 11.166 for the NtcA regulon prediction with P < 0.05. (B) The cumulative probability functions of the scores of putative canonical NtcA binding sites found for genes up-regulated [p(S(g_u) > s), pink] and for genes unaffected [p(S(g_o) > s), blue] by ammonium, and their log odds ratio function (LOR, red). The solid vertical line indicates the score cutoff of 8.02 for predicting the canonical NtcA binding sites that are possibly involved in repressive regulation by NtcA (see text).
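The quantities plotted in this figure can be sketched as follows. The sketch assumes that the LOR at a score s is the natural logarithm of the ratio of the two empirical cumulative probabilities p(S > s); the exact definition is given in the paper's Materials and Methods, which is not reproduced here. The function names and score lists are hypothetical.

```python
import math

def survival(scores, s):
    """Empirical p(S > s): fraction of scores strictly greater than s."""
    return sum(1 for x in scores if x > s) / len(scores)

def log_odds_ratio(regulated_scores, unaffected_scores, s):
    """Log ratio of p(S > s) for regulated genes over that for unaffected genes.

    Returns None when either probability is zero, where the log ratio is undefined.
    """
    p_regulated = survival(regulated_scores, s)
    p_unaffected = survival(unaffected_scores, s)
    if p_regulated == 0 or p_unaffected == 0:
        return None
    return math.log(p_regulated / p_unaffected)

# Hypothetical promoter scores (not data from the paper).
down_regulated = [12.5, 11.9, 11.3, 9.0]
unaffected = [11.5, 9.8, 9.1, 8.7, 8.2, 7.9, 7.5, 7.0]

# Evaluate at the caption's cutoff of 11.166.
lor_at_cutoff = log_odds_ratio(down_regulated, unaffected, 11.166)
```

A positive value at a given cutoff means that high-scoring sites are enriched among the regulated genes relative to the unaffected ones.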

The working model of the nitrogen assimilation network in WH8102. Genes recruited either by our computational methods (Supplementary Table S7) or by microarray gene expression data (Supplementary Tables S5 and S6) are all considered to be members of the network. Genes that are predicted to bear NtcA promoters (Table 2) or canonical NtcA binding sites (and are also up-regulated by ammonium, Supplementary Table S6) are considered to be members of the NtcA regulon, while the others are considered to be non-NtcA regulon members of the network. Solid arrows represent substance translocations or chemical reactions, and dashed arrows represent regulatory relationships.