Jiajia Xu, Andrea Bräutigam, Andreas P M Weber, Xin-Guang Zhu.
Abstract
Identification of potential cis-regulatory motifs controlling the development of C4 photosynthesis is a major focus of current research. In this study, we used time-series RNA-seq data collected from etiolated maize and rice leaf tissues sampled during a de-etiolation process to systematically characterize the expression patterns of C4-related genes and to further identify potential cis elements in five different genomic regions (i.e. promoter, 5'UTR, 3'UTR, intron, and coding sequence) of C4 orthologous genes. The results demonstrate that although most of the C4 genes show similar expression patterns, a number of them, including chloroplast dicarboxylate transporter 1, aspartate aminotransferase, and triose phosphate transporter, show shifted expression patterns compared with their C3 counterparts. A number of conserved short DNA motifs between maize C4 genes and their rice orthologous genes were identified not only in the promoter, 5'UTR, 3'UTR, and coding sequences, but also in the introns of core C4 genes. We also identified cis-regulatory motifs that are present in maize C4 genes and in genes with expression patterns similar to those of maize C4 genes, but absent from rice C3 orthologs, suggesting a possible recruitment of pre-existing cis-elements from genes unrelated to C4 photosynthesis into C4 photosynthesis genes during C4 evolution.
Keywords: C4 photosynthesis; cell specificity; cis element; etiolation; evolution; systems biology.
Year: 2016 PMID: 27436282 PMCID: PMC5014158 DOI: 10.1093/jxb/erw275
Source DB: PubMed Journal: J Exp Bot ISSN: 0022-0957 Impact factor: 6.992
Fig 1. Pathway-level gene expression of maize and rice during the de-etiolation process. The dot size represents the gene expression level across the whole genome. The color code indicates the relative gene expression level within a given pathway, from low (yellow) to high (red). (A) Photosynthetic pathways, and (B) non-photosynthesis related pathways. CCM, CO2 concentration mechanism.
Fig 2. Expression curves of C4 gene families. The x-axis represents different time points and the y-axis represents the RPKM value. Curves were fitted by 3rd-order polynomial regression, whilst points indicate actual RPKM values.
Fig 3. Expression patterns of C4 orthologous gene pairs between maize (red) and rice (blue). RPKM values were normalized after 3rd-order polynomial regression.
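The curve fitting mentioned in the captions of Figs 2 and 3 can be illustrated with a minimal sketch: a least-squares cubic fit followed by normalization. The time points and RPKM values below are hypothetical, the helper names are made up for this example, and min-max scaling is an assumption, since the captions do not say which normalization was applied.

```python
def fit_polynomial(xs, ys, degree=3):
    """Least-squares polynomial fit via the normal equations.

    Returns coefficients c such that y ~ c[0] + c[1]*x + ... + c[degree]*x**degree.
    """
    n = degree + 1
    # Normal equations A c = b with A[i][j] = sum(x^(i+j)), b[i] = sum(y * x^i).
    a = [[float(sum(x ** (i + j) for x in xs)) for j in range(n)] for i in range(n)]
    b = [float(sum(y * x ** i for x, y in zip(xs, ys))) for i in range(n)]
    # Gaussian elimination with partial pivoting.
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        b[col], b[pivot] = b[pivot], b[col]
        for row in range(col + 1, n):
            factor = a[row][col] / a[col][col]
            for k in range(col, n):
                a[row][k] -= factor * a[col][k]
            b[row] -= factor * b[col]
    # Back substitution.
    coeffs = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(a[i][j] * coeffs[j] for j in range(i + 1, n))
        coeffs[i] = (b[i] - tail) / a[i][i]
    return coeffs


def evaluate(coeffs, x):
    """Evaluate the fitted polynomial at x."""
    return sum(c * x ** i for i, c in enumerate(coeffs))


def min_max_normalize(values):
    """Scale values to [0, 1] (assumed normalization; returns zeros for a flat curve)."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


# Hypothetical time points (hours of illumination) and RPKM values, for illustration only.
hours = [0, 1, 2, 3, 4, 5]
rpkm = [5.0, 12.0, 30.0, 55.0, 70.0, 72.0]

coeffs = fit_polynomial(hours, rpkm)
fitted = [evaluate(coeffs, h) for h in hours]
normalized = min_max_normalize(fitted)
```

Normalizing the fitted curve rather than the raw points makes the shapes of the maize and rice curves comparable even when absolute expression levels differ.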
Euclidean distances between maize and rice orthologous gene pairs. For columns headed 1–7, the 1st column indicates the pattern observed in Fig. 3, where ‘s’ stands for similar and ‘d’ stands for different; the 2nd column is the Euclidean distance between the two clusters that the maize and rice genes fall into; the 3rd column is the rank correlation coefficient between maize and rice RPKM vectors ordered across time points; the 4th column is the rank correlation coefficient between maize and rice RPKM vectors ordered across genes; the 5th column is the mutual information value calculated by the R package ‘infotheo’ with the bin number set to 3; the 6th column is the maximum mutual information value; the 7th column is the random mutual information value, taken as the average of 100 permutations of RPKM values across time points.
| Gene ID | Maize ID | Rice ID | 1 | 2 | 3 | 4 | 5 | 6 | 7 |
|---|---|---|---|---|---|---|---|---|---|
| | GRMZM2G083841 | LOC_Os01g11054 | s | 2.21 | 0.82 | 0.21 | 0.73 | 1.00 | 0.39 |
| | GRMZM2G097457 | LOC_Os05g33570 | s | 0.00 | 0.54 | 0.77 | 0.26 | 1.00 | 0.38 |
| | GRMZM2G085019 | LOC_Os01g09320 | d | 12.25 | 0.46 | 0.31 | 0.26 | 1.00 | 0.38 |
| | GRMZM2G001696 | LOC_Os03g15050 | d | 8.26 | -0.61 | -0.74 | 0.46 | 1.00 | 0.38 |
| | GRMZM2G070605 | LOC_Os01g13770 | d | 10.47 | -0.11 | -0.12 | 0.26 | 1.00 | 0.39 |
| | GRMZM2G121878 | LOC_Os01g45274 | s | 3.76 | 0.29 | 0.03 | 0.26 | 1.00 | 0.39 |
| | GRMZM2G174107 | LOC_Os08g25624 | s | 2.14 | 0.68 | 0.86 | 0.46 | 1.00 | 0.36 |
| | GRMZM2G090718 | LOC_Os02g52940 | d | 10.26 | 0.18 | 0.28 | 0.46 | 1.00 | 0.40 |
| | GRMZM2G028379 | LOC_Os03g48080 | d | 8.14 | 0.64 | 0.31 | 0.46 | 1.00 | 0.37 |
| | GRMZM2G383088 | LOC_Os12g33080 | d | 13.74 | -0.79 | -0.81 | 1.00 | 1.00 | 0.39 |
| | GRMZM2G138258 | LOC_Os01g72710 | s | 4.15 | 0.50 | 0.59 | 0.26 | 1.00 | 0.40 |
| | GRMZM2G178192 | LOC_Os08g01770 | s | 3.54 | 0.82 | 0.75 | 0.73 | 1.00 | 0.39 |
| | GRMZM2G131286 | LOC_Os07g34640 | s | 7.46 | -0.50 | -0.17 | 0.73 | 1.00 | 0.38 |
| | GRMZM2G175140 | LOC_Os04g43070 | d | 13.13 | 0.25 | 0.23 | 0.26 | 1.00 | 0.44 |
| | GRMZM5G836910 | LOC_Os02g55420 | d | 12.91 | -0.82 | -0.42 | 0.73 | 1.00 | 0.39 |
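As a rough illustration of how columns 3, 5, and 7 of the table above could be computed, the following sketch implements a rank correlation, a binned mutual information, and a permutation baseline in plain Python. It is not the authors' pipeline (they used the R package ‘infotheo’). The equal-width binning, the use of bits as the unit, the fixed random seed, and the example vectors are all assumptions made for this example.

```python
import math
import random
from collections import Counter


def ranks(values):
    """1-based ranks; tied values share their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = mean_rank
        i = j + 1
    return result


def pearson(x, y):
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


def discretize(values, bins=3):
    """Equal-width binning (an assumption; other binning schemes exist)."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / bins or 1.0
    return [min(int((v - lo) / width), bins - 1) for v in values]


def mutual_information(x, y, bins=3):
    """Mutual information in bits between two binned vectors."""
    dx, dy = discretize(x, bins), discretize(y, bins)
    n = len(dx)
    px, py, pxy = Counter(dx), Counter(dy), Counter(zip(dx, dy))
    return sum((c / n) * math.log2(c * n / (px[a] * py[b]))
               for (a, b), c in pxy.items())


def permuted_mi(x, y, n_perm=100, seed=0):
    """Average mutual information after shuffling y across time points."""
    rng = random.Random(seed)
    shuffled = list(y)
    total = 0.0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        total += mutual_information(x, shuffled)
    return total / n_perm


# Hypothetical RPKM vectors across six time points, for illustration only.
maize = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
rice = [2.0, 4.0, 6.0, 8.0, 10.0, 12.0]

rho = spearman(maize, rice)
mi = mutual_information(maize, rice)
baseline = permuted_mi(maize, rice)
```

Comparing the observed mutual information with the permutation average gives a sense of how much of the dependence between two orthologs exceeds what shuffled time points would produce.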
Mapping motifs predicted by the k80 and k30 approaches. Total predicted motifs is the total number of motifs predicted from the gene list obtained by k-means clustering using the k80 approach. Mapped motifs is the number that could be mapped by STAMP to motifs predicted by the k30 approach, with a P-value cut-off set at 0.01.
| Genomic section | Total predicted motifs | Mapped motifs | Mapped rate |
|---|---|---|---|
| Maize promoter | 747 | 645 | 86.3% |
| Rice promoter | 545 | 461 | 84.6% |
| Maize 5′UTR | 311 | 175 | 56.3% |
| Rice 5′UTR | 208 | 101 | 48.6% |
| Maize 3′UTR | 385 | 217 | 56.4% |
| Rice 3′UTR | 311 | 217 | 69.8% |
| Maize CDS | 531 | 382 | 71.9% |
| Rice CDS | 440 | 333 | 75.7% |
| Maize intron | 538 | 436 | 81.0% |
| Rice intron | 556 | 184 | 33.1% |
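The mapped rate in the table above is the number of mapped motifs divided by the total number of predicted motifs. The short sketch below recomputes it from the tabulated counts. The dictionary layout and function name are choices made for this example only.

```python
# (total predicted motifs, mapped motifs) copied from the table above.
counts = {
    "Maize promoter": (747, 645),
    "Rice promoter": (545, 461),
    "Maize 5'UTR": (311, 175),
    "Rice 5'UTR": (208, 101),
    "Maize 3'UTR": (385, 217),
    "Rice 3'UTR": (311, 217),
    "Maize CDS": (531, 382),
    "Rice CDS": (440, 333),
    "Maize intron": (538, 436),
    "Rice intron": (556, 184),
}


def mapped_rate(total, mapped):
    """Percentage of k80 motifs that map to a k30 motif."""
    return 100.0 * mapped / total


rates = {section: round(mapped_rate(total, mapped), 1)
         for section, (total, mapped) in counts.items()}
```

The recomputed values agree with the percentages listed in the table, with rice introns standing out as the region where the two approaches agree least.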
Likelihood of identifying cis elements in genomic regions of C4 orthologous genes. ‘√’ indicates that conserved motifs were identified between the different methods; ‘X’ indicates that no conserved motifs were identified
The numbered columns are as follows: 1, PEPC; 2, PPDK; 3, NADP-ME; 4, PEP-CK; 5, TPT; 6, CA; 7, PPT; 8, PP; 9, AlaAT; 10, DiT1; 11, Mep3; 12, AMK; 13, PPCK-RP; 14, MEP1; 15, AspAT.
| Species | Section | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 | 12 | 13 | 14 | 15 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Maize | Promoter | √ | √ | √ | √ | √ | √ | √ | X | √ | √ | √ | √ | √ | √ | √ |
| Maize | 5′UTR | √ | X | √ | √ | √ | √ | X | √ | √ | √ | X | X | X | √ | √ |
| Maize | 3′UTR | √ | √ | √ | X | √ | √ | X | √ | X | √ | √ | √ | √ | √ | √ |
| Maize | Intron | √ | X | √ | X | √ | √ | √ | √ | √ | √ | X | X | √ | √ | √ |
| Maize | CDS | √ | √ | √ | X | √ | √ | √ | X | √ | √ | √ | √ | √ | √ | √ |
| Rice | Promoter | √ | √ | √ | √ | √ | √ | √ | √ | √ | √ | √ | X | √ | √ | √ |
| Rice | 5′UTR | √ | √ | √ | X | √ | X | √ | √ | √ | X | √ | X | √ | X | √ |
| Rice | 3′UTR | √ | √ | √ | √ | √ | √ | √ | X | √ | √ | √ | √ | √ | √ | X |
| Rice | Intron | √ | √ | X | X | √ | √ | √ | X | √ | X | √ | X | √ | X | √ |
| Rice | CDS | √ | √ | √ | √ | √ | √ | √ | √ | √ | √ | √ | X | √ | √ | √ |
The most conserved motifs predicted for maize and rice. The motifs listed in this table satisfy the following criteria: (1) conserved between at least two prediction methods; (2) conserved between maize and rice orthologous genes; and (3) conserved across maize or rice genes. ‘–’ indicates that no motifs were identified under the specified conditions. Overlapping results between the k80 and k30 approaches are indicated in bold. Numbers in brackets indicate the numbers of copies of this particular motif in the corresponding maize and rice genomic segments, respectively. M, A or C; R, A or G; W, A or T; S, C or G; Y, C or T; K, G or T; V, not T; H, not G; D, not C; B, not A
| Motif (copies in maize, copies in rice) |
|---|
| CGTTGC (1,7) |
| TCGAGCAG (2,0) |
| AACAAG (10,3) |
| TCGCGCAC (0,1) |
| CCCATA (3,8) |
| GTGTAG (9,4) |
| NTACCC (15,12) |
| TAACAN (30,8) |
| CTCGNC (2,6) |
| AMCCAA (1,2) |
| ACGTTY (5,9) |
| AAYNTC (22,113) |
| CGTTNC (13,16) |
| TTGAAR (14,62) |
| TTGYNC (27,115) |
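The degenerate base codes defined in the table caption above follow the standard IUPAC nucleotide notation, so a motif such as ACGTTY can be expanded into a regular expression and counted in a sequence. The sketch below is only an illustration of that notation: the function names and the test sequence are invented, N (any base) is added from the IUPAC standard because it appears in the motifs, and only the forward strand is scanned, which may differ from how copies were counted in the study.

```python
import re

# IUPAC nucleotide codes, matching the legend in the table caption plus N.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "M": "[AC]", "R": "[AG]", "W": "[AT]", "S": "[CG]",
    "Y": "[CT]", "K": "[GT]",
    "V": "[ACG]", "H": "[ACT]", "D": "[AGT]", "B": "[CGT]",
    "N": "[ACGT]",
}


def motif_to_regex(motif):
    """Translate a degenerate motif into a regular expression string."""
    return "".join(IUPAC[base] for base in motif.upper())


def count_sites(motif, sequence):
    """Count motif occurrences on the forward strand, including overlapping ones."""
    # A zero-width lookahead lets overlapping matches each be counted once.
    pattern = re.compile("(?=" + motif_to_regex(motif) + ")")
    return sum(1 for _ in pattern.finditer(sequence.upper()))


# Hypothetical sequence, for illustration only.
sequence = "ACGTTCGGACGTTTACGTTA"
n_sites = count_sites("ACGTTY", sequence)
```

Scanning the reverse complement as well would require a second pass over the complemented sequence, which is omitted here for brevity.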
Fig 4. Diagram showing numbers and species of conserved DNA motifs between maize and rice. Conserved DNA motifs identified with the k80 approach are marked with color as indicated in the keys, and the number of mapped sites is shown. Overlapping results between the k80 and k30 approaches are marked in bold in the keys.
Fig 5. The number of recruited motif sites in different segments of C4 genes. The total number of mapped sites for potential recruited motifs in maize identified using the k80 approach is given in the corresponding genomic segments, and the number of overlapping motifs is indicated in brackets.