Muhammad-Redha Abdullah-Zawawi1, Nur-Farhana Ahmad-Nizammuddin2, Nisha Govender3, Sarahani Harun1, Norfarhan Mohd-Assaad2, Zeti-Azura Mohamed-Hussein1,2.
Abstract
Transcription factors (TFs) form the major class of regulatory genes and play key roles in multiple plant stress responses. In most eukaryotic plants, TF families (WRKY, MADS-box and MYB) activate unique cellular-level abiotic and biotic stress-responsive strategies, which are considered key determinants of defense and developmental processes. Arabidopsis and rice are two important representative model systems for dicot and monocot plants, respectively. A comprehensive comparative study of 101 OsWRKY, 34 OsMADS-box and 122 OsMYB genes (rice genome) and 71 AtWRKY, 66 AtMADS-box and 144 AtMYB genes (Arabidopsis genome) showed various relationships among TFs across species. The phylogenetic analysis clustered WRKY, MADS-box and MYB TF family members into 10, 7 and 14 clades, respectively. All clades in the WRKY and MYB TF families and almost half of the clades in the MADS-box TF family are shared between both species. Chromosomal and gene structure analysis showed that the Arabidopsis-rice orthologous TF gene pairs were unevenly localized within their chromosomes, whilst the distribution of exon-intron gene structure and motif conservation indicated plausible functional similarity in both species. The type and distribution patterns of abiotic and biotic stress-responsive cis-regulatory elements in the promoter regions of Arabidopsis and rice WRKY, MADS-box and MYB orthologous gene pairs provide insight into their role as conserved regulators in both species. Co-expression network analysis showed correlation between WRKY, MADS-box and MYB genes in each independent rice and Arabidopsis network, indicating their role in stress responsiveness and developmental processes.
Year: 2021 PMID: 34608238 PMCID: PMC8490385 DOI: 10.1038/s41598-021-99206-y
Source DB: PubMed Journal: Sci Rep ISSN: 2045-2322 Impact factor: 4.379
Figure 1. Phylogenetic tree of collated rice and Arabidopsis full-length WRKY protein sequences. Red dots represent the rice-Arabidopsis orthologous gene pairs. The tree is built using the neighbor-joining (NJ) method (MEGA7.0 software) and is divided into ten clades, numbered in bold.
Figure 2. Phylogenetic tree of collated rice and Arabidopsis full-length MADS-box protein sequences. Red dots represent the rice-Arabidopsis orthologous gene pairs. The tree is built using the neighbor-joining (NJ) method (MEGA7.0 software) and is divided into seven clades, numbered in bold.
Figure 3. Phylogenetic tree of collated rice and Arabidopsis full-length MYB protein sequences. Red dots represent the rice-Arabidopsis orthologous gene pairs. The tree is built using the neighbor-joining (NJ) method (MEGA7.0 software) and is divided into 14 clades, numbered in bold.
Figure 4. Chromosomal distribution of rice-Arabidopsis WRKY, MADS-box and MYB orthologous genes. (A) Distribution of gene loci on Arabidopsis chromosomes. (B) Distribution of gene loci on rice chromosomes. The colour of each gene locus name indicates its transcription factor family: WRKY, black; MADS-box, purple; MYB, red.
Orthologous WRKY, MADS-box and MYB gene-pairs in Arabidopsis and rice.
| Gene identifier | Name* | Chr | Location | ORF length (bp) | Protein length (aa) | Protein pI | Protein molecular weight (Da) | Exon number |
|---|---|---|---|---|---|---|---|---|
| AT1G18750 | AtAGL65¹ | 1 | 6,466,761–6,469,984 | 1170 | 389 | 6.504 | 44,877.5 | 10 |
| AT4G28110 | AtMYB41² | 4 | 13,968,029–13,969,384 | 849 | 282 | 5.903 | 31,651.6 | 3 |
| AT5G56110 | AtMYB80³ | 5 | 22,719,191–22,720,664 | 963 | 320 | 7.322 | 35,983.4 | 3 |
| AT3G13540 | AtMYB5⁴ | 3 | 4,420,173–4,421,701 | 750 | 249 | 8.285 | 27,793.5 | 2 |
| AT5G35550 | AtMYB123⁵ | 5 | 13,726,743–13,727,860 | 777 | 258 | 8.903 | 29,611.4 | 3 |
| AT5G12870 | AtMYB46⁶ | 5 | 4,062,724–4,064,992 | 843 | 280 | 6.037 | 31,541.3 | 2 |
| AT1G63910 | AtMYB103⁷ | 1 | 23,719,783–23,721,774 | 1113 | 370 | 5.681 | 42,262.6 | 3 |
| AT3G60460 | AtMYB125⁸ | 3 | 22,342,429–22,343,491 | 894 | 297 | 6.075 | 33,649.6 | 3 |
| AT3G09230 | AtMYB1⁹ | 3 | 2,833,398–2,835,338 | 1182 | 393 | 5.217 | 42,811.4 | 2 |
| AT1G09770 | AtMYBCDC5¹⁰ | 1 | 3,161,841–3,165,360 | 2535 | 844 | 6.731 | 95,766.6 | 4 |
| AT2G37630 | AtMYB91¹¹ | 2 | 15,781,615–15,783,433 | 1104 | 367 | 9.555 | 42,243.1 | 1 |
| AT3G18100 | AtMYB4R1¹² | 3 | 6,200,524–6,204,644 | 2544 | 847 | 5.580 | 96,084.4 | 7 |
| AT1G29280 | AtWRKY65¹³ | 1 | 10,236,367–10,237,467 | 780 | 259 | 5.469 | 29,054.4 | 2 |
| AT1G68150 | AtWRKY9¹⁴ | 1 | 25,543,969–25,545,717 | 1125 | 374 | 7.816 | 42,743.0 | 5 |
| AT2G40740 | AtWRKY55¹⁵ | 2 | 16,997,177–16,999,277 | 879 | 292 | 8.049 | 32,488.8 | 3 |
| AT4G26640 | AtWRKY20¹⁶ | 4 | 13,437,071–13,440,835 | 1458 | 485 | 7.102 | 53,601.5 | 5 |
| AT4G30935 | AtWRKY32¹⁷ | 4 | 15,051,814–15,054,042 | 1401 | 466 | 5.895 | 51,480.4 | 5 |
| AT2G37260 | AtWRKY44¹⁸ | 2 | 15,644,840–15,647,065 | 1290 | 429 | 9.399 | 47,141.2 | 4 |
| AT5G43290 | AtWRKY49¹⁹ | 5 | 17,371,838–17,373,201 | 825 | 274 | 7.924 | 31,580.6 | 3 |
| AT2G46130 | AtWRKY43²⁰ | 2 | 18,957,226–18,957,911 | 330 | 109 | 9.992 | 12,951.8 | 2 |
| AT5G13080 | AtWRKY75²¹ | 5 | 4,149,740–4,151,150 | 438 | 145 | 9.593 | 16,801.8 | 2 |
| AT4G12020 | AtWRKY19²² | 4 | 7,201,656–7,209,648 | 5397 | 1798 | 7.019 | 199,996.0 | 15 |
| LOC_Os11g43740 | OsMADS68¹ | 11 | 26,414,394–26,418,442 | 1179 | 392 | 6.829 | 43,366.9 | 11 |
| LOC_Os07g37210 | OsMYB102² | 7 | 22,293,735–22,295,309 | 1107 | 368 | 7.092 | 39,929.0 | 3 |
| LOC_Os04g39470 | OsMYB80³ | 4 | 23,510,412–23,512,029 | 1119 | 372 | 6.146 | 39,699.2 | 3 |
| LOC_Os01g50110 | OsMYB13⁴ | 1 | 28,796,516–28,797,732 | 828 | 275 | 6.107 | 29,793.3 | 2 |
| LOC_Os03g29614 | OsMYB46⁵ | 3 | 16,879,442–16,883,640 | 966 | 321 | 6.624 | 34,049 | 3 |
| LOC_Os12g33070 | OsMYB122⁶ | 12 | 19,991,426–19,994,401 | 1230 | 409 | 6.824 | 43,722.4 | 2 |
| LOC_Os08g05520 | OsMYB93⁷ | 8 | 2,948,522–2,951,372 | 1080 | 359 | 6.624 | 39,954.7 | 3 |
| LOC_Os04g46384 | OsMYB58⁸ | 4 | 27,503,041–27,504,784 | 1032 | 343 | 7.919 | 37,110.9 | 3 |
| LOC_Os01g63160 | OsMYB19⁹ | 1 | 36,606,535–36,608,135 | 1242 | 413 | 6.697 | 44,329.6 | 2 |
| LOC_Os04g28090 | OsMYB50¹⁰ | 4 | 16,579,869–16,587,180 | 2919 | 972 | 4.878 | 109,684 | 4 |
| LOC_Os12g38400 | OsMYB125¹¹ | 12 | 23,554,928–23,560,551 | 1029 | 342 | 10.28 | 39,041.6 | 2 |
| LOC_Os07g04700 | OsMYB87¹² | 7 | 2,084,106–2,091,653 | 2907 | 968 | 8.639 | 106,868.0 | 13 |
| LOC_Os01g54600 | OsWRKY13¹³ | 1 | 31,409,004–31,410,978 | 951 | 316 | 4.601 | 34,294.6 | 3 |
| LOC_Os02g53100 | OsWRKY32¹⁴ | 2 | 32,489,017–32,495,070 | 1815 | 604 | 4.800 | 62,940.3 | 6 |
| LOC_Os01g60490 | OsWRKY22¹⁵ | 1 | 34,981,468–34,985,447 | 798 | 265 | 7.110 | 29,807.4 | 3 |
| LOC_Os07g39480 | OsWRKY87¹⁶ | 7 | 23,654,076–23,659,625 | 1857 | 618 | 6.332 | 66,163.6 | 6 |
| LOC_Os08g17400 | OsWRKY89¹⁷ | 8 | 10,633,195–10,639,603 | 1653 | 550 | 6.707 | 59,781.9 | 4 |
| LOC_Os01g62510 | OsWRKY119¹⁸ | 1 | 36,188,702–36,191,681 | 612 | 203 | 5.042 | 21,483.5 | 2 |
| LOC_Os01g74140 | OsWRKY17¹⁹ | 1 | 42,946,753–42,948,750 | 1233 | 410 | 4.685 | 45,109.9 | 3 |
| LOC_Os01g53260 | OsWRKY23²⁰ | 1 | 30,604,295–30,608,077 | 765 | 254 | 6.903 | 27,796.2 | 2 |
| LOC_Os11g29870 | OsWRKY72²¹ | 11 | 17,352,085–17,355,820 | 729 | 242 | 9.335 | 25,857.2 | 2 |
| LOC_Os05g45230 | OsWRKY58²² | 5 | 26,256,951–26,257,809 | 546 | 181 | 4.631 | 18,481.3 | 2 |
Each gene is described by its chromosomal location, open reading frame (ORF) length, properties of the encoded protein and exon number.
*Matching superscript numbers in the name column indicate orthologous gene pairs.
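The ORF and protein length columns can be cross-checked against each other. The following is a minimal sketch, assuming that the reported ORF length includes the stop codon (an assumption that the tabulated values are consistent with, not something stated in the table). The helper name `protein_length_from_orf` is hypothetical, and only four rows are copied in for illustration.

```python
# Selected rows copied from the table above: (name, ORF length in bp, protein length in aa).
ROWS = [
    ("AtAGL65", 1170, 389),
    ("AtWRKY19", 5397, 1798),
    ("OsMADS68", 1179, 392),
    ("OsWRKY58", 546, 181),
]


def protein_length_from_orf(orf_bp: int) -> int:
    """Number of residues encoded by an ORF whose length includes the stop codon.

    Assumption: the ORF runs from the start codon through the stop codon,
    so one codon (3 bp) does not contribute an amino acid.
    """
    if orf_bp % 3 != 0:
        raise ValueError("ORF length is not a multiple of three")
    return orf_bp // 3 - 1


# Each listed protein length should equal ORF/3 - 1 under the assumption above.
for name, orf_bp, protein_aa in ROWS:
    assert protein_length_from_orf(orf_bp) == protein_aa, name
```

For example, AtAGL65 has a 1170 bp ORF, which is 390 codons, one of which is the stop codon, leaving 389 residues.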
Figure 5. Exon–intron structure of Arabidopsis and rice WRKY (blue column), MADS-box (yellow column) and MYB (green column) orthologous gene pairs, displayed according to clade numbers in their TF family phylogenetic tree. The exon–intron structure is drawn as follows: yellow rectangles and grey lines denote exons and introns, respectively, whilst blue boxes represent the untranslated regions (UTRs).
Figure 6. Distribution pattern of conserved motifs in Arabidopsis and rice WRKY, MADS-box and MYB orthologous genes, identified by the MEME web server. Orthologous gene pairs are presented by transcription factor (TF) family: blue column, WRKY; yellow column, MADS-box; green column, MYB. The p-values are significant at 0.05. Each coloured box represents a unique numbered motif, as indicated in the legend, and box width reflects motif length.
Figure 7. Distribution of the cis-regulatory elements (CREs) in the 1.5 kb promoter region of Arabidopsis and rice WRKY, MADS-box and MYB orthologous genes, as identified by PlantCARE and visualized using the IBS software (http://ibs.biocuckoo.org). The CREs are denoted by different shapes and colours. Each CRE is drawn as follows: (i) thick black line for the reverse strand and (ii) thin black line for the forward strand.
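To illustrate what a forward-strand versus reverse-strand CRE hit means, here is a minimal sketch of scanning a promoter for short core motifs on both strands. This is not PlantCARE's algorithm. The core sequences in `MOTIFS` are commonly cited cores and should be treated as illustrative assumptions, the function names are hypothetical, and the promoter string is made up.

```python
import re

# Illustrative core sequences (assumptions, not taken from this record).
MOTIFS = {
    "ABRE": "ACGTG",
    "G-box": "CACGTG",
    "W-box": "TTGACC",
    "TGACG-motif": "TGACG",
    "CGTCA-motif": "CGTCA",
}

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def scan_promoter(promoter: str, motifs=MOTIFS):
    """Return sorted (start, strand, name) hits.

    start is 0-based on the forward strand; a '-' hit means the motif
    core is found on the reverse strand at that forward-strand position.
    """
    promoter = promoter.upper()
    hits = []
    for name, core in motifs.items():
        for strand, pattern in (("+", core), ("-", reverse_complement(core))):
            # A zero-width lookahead reports overlapping occurrences too.
            for match in re.finditer(f"(?={pattern})", promoter):
                hits.append((match.start(), strand, name))
    return sorted(hits)


# Hypothetical promoter fragment, for demonstration only.
for hit in scan_promoter("AATTGACCGGTGACGTTCACGTGAA"):
    print(hit)
```

Under these assumed cores, TGACG and CGTCA are reverse complements of each other, so a single site is reported under both names on opposite strands.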
Comparison of plant development, hormone and stress-responsive cis-regulatory elements (CREs) in the promoter regions of Arabidopsis and rice WRKY, MADS-box, and MYB orthologous gene pairs.
| Clade | Gene identifier | Name | Development CREs | Hormone-response CREs | Abiotic/biotic stress CREs |
|---|---|---|---|---|---|
| 2 | LOC_Os11g43740 | OsMADS68 | N/A | CGTCA-motif, TGACG-motif, ABRE, ABRE3a, ABRE4 | G-box |
| 2 | AT1G18750 | AtAGL65 | N/A | ABRE | S-box, GT1-motif, MBS, MYB, STRE, TCT-motif |
| 1 | LOC_Os07g37210 | OsMYB102 | C-box, O2-site | ABRE3a, ABRE4, CGTCA-motif, TGACG-motif, O2-site | C-box, Sp1 |
| 1 | AT4G28110 | AtMYB41 | CAT-box | N/A | MBS, MYC, MYB |
| 2 | LOC_Os04g39470 | OsMYB80 | Motif I, AP-2 like | ABRE3a, ABRE4, CGTCA-motif, TGACG-motif | G-box, GC-motif, Sp1 |
| 2 | AT5G56110 | AtMYB80 | As-1 | ABRE, As-1 | MYB, MYC, STRE, TCT-motif, W-box |
| 4 | LOC_Os01g50110 | OsMYB13 | N/A | CGTCA-motif, TGACG-motif | GC-motif, Sp1 |
| 4 | AT3G13540 | AtMYB5 | N/A | ABRE | TCT-motif, MYC, AE-box, GT1-motif, MYB |
| 4 | LOC_Os03g29614 | OsMYB46 | N/A | ABRE, CGTCA-motif, TGACG-motif | G-box, I-box, CCAAT-box |
| 4 | AT5G35550 | AtMYB123 | N/A | N/A | MYC, GATA-motif, MYB |
| 5 | LOC_Os12g33070 | OsMYB122 | AP-2 like | CGTCA-motif, TGACG-motif | ARE, G-box, GC-motif, I-box, Sp1 |
| 5 | AT5G12870 | AtMYB46 | As-1 | As-1 | S-box, MBS, MYB, STRE, W-box |
| 5 | LOC_Os08g05520 | OsMYB93 | O2-site | CGTCA-motif, TGACG-motif, O2-site | G-box, GC-motif, Sp1 |
| 5 | AT1G63910 | AtMYB103 | As-1, CAT-box | ABRE, As-1 | AE-box, GATA-motif, MYB, MYC |
| 10 | LOC_Os04g46384 | OsMYB58 | N/A | ABRE, CGTCA-motif, TGACG-motif | G-box, Sp1 |
| 10 | AT3G60460 | AtMYB125 | As-1 | ABRE, As-1 | W-box, MYC, MYB, sbp-CMA1c |
| 10 | LOC_Os01g63160 | OsMYB19 | GCN4_motif | ABRE, CGTCA-motif, TGACG-motif | ARE, GC-motif, I-box, LTR, P-box, Sp1 |
| 10 | AT3G09230 | AtMYB1 | As-1 | As-1 | AE-box, GT1-motif, MBS, MYB, MYC, STRE |
| 12 | LOC_Os04g28090 | OsMYB50 | N/A | ABRE3a, ABRE4 | ARE, G-box, P-box, Sp1 |
| 12 | AT1G09770 | AtMYBCDC5 | As-1, CAT-box | As-1 | GATA-motif, STRE, TCT-motif, W-box |
| 12 | LOC_Os12g38400 | OsMYB125 | C-box, AP-2 like | CGTCA-motif, TGACG-motif | C-box, CCAAT-box, G-box, Sp1 |
| 12 | AT2G37630 | AtMYB91 | As-1, CAT-box | ABRE, JERE | AE-box, MYB, MYC, STRE |
| 13 | LOC_Os07g04700 | OsMYB87 | AP-2 like | ABRE3a, ABRE4, CGTCA-motif, TGACG-motif | LTR, P-box, Sp1 |
| 13 | AT3G18100 | AtMYB4R1 | As-1 | As-1 | GT1-motif, MBS, MYB, MYC, TCT-motif |
| 1 | LOC_Os01g54600 | OsWRKY13 | N/A | CGTCA-motif, TGACG-motif | GC-motif, G-box |
| 1 | AT1G29280 | AtWRKY65 | As-1 | ABRE, As-1 | MYB, MYC, STRE |
| 4 | LOC_Os02g53100 | OsWRKY32 | N/A | ABRE, CGTCA-motif, TGACG-motif | G-box, CCAAT-box, Sp1 |
| 4 | AT1G68150 | AtWRKY9 | As-1 | ABRE, As-1 | AE-box, G-box, MYB, MYC |
| 5 | LOC_Os01g60490 | OsWRKY22 | O2-site | ABRE, ABRE3a, ABRE4, CGTCA-motif, TGACG-motif, O2-site | Box II |
| 5 | AT2G40740 | AtWRKY55 | As-1, CAT-box | ABRE, As-1 | AE-box, S-box, GT1-motif, MYB, MYC, W-box |
| 7 | LOC_Os07g39480 | OsWRKY87 | GCN4_motif, O2-site | CGTCA-motif, TGACG-motif, O2-site | ARE, GC-motif |
| 7 | AT4G26640 | AtWRKY20 | As-1 | ABRE, As-1 | AE-box, S-box, G-box, GATA-motif, MBS, MYB, MYC, STRE, TCT-motif, W-box |
| 7 | LOC_Os08g17400 | OsWRKY89 | O2-site | ABRE, CGTCA-motif, TGACG-motif, O2-site | CCAAT-box, Sp1 |
| 7 | AT4G30935 | AtWRKY32 | As-1 | As-1 | GATA-motif, GT1-motif, MBS, MYB, MYC, W-box |
| 7 | LOC_Os01g62510 | OsWRKY119 | N/A | ABRE3a, ABRE4, CGTCA-motif, TGACG-motif | ARE, G-box, GC-motif, Sp1 |
| 7 | AT2G37260 | AtWRKY44 | As-1 | As-1 | MBS, MYC, STRE, TCT-motif, W-box |
| 8 | LOC_Os01g74140 | OsWRKY17 | N/A | CGTCA-motif, TGACG-motif | ARE, G-box, Sp1, GC-motif |
| 8 | AT5G43290 | AtWRKY49 | As-1 | ABRE, As-1 | S-box, DRE core, Gap-box, MYB, MYC, STRE |
| 8 | LOC_Os01g53260 | OsWRKY23 | AP-2 like | CGTCA-motif, TGACG-motif | G-box, Sp1 |
| 8 | AT2G46130 | AtWRKY43 | As-1 | ABRE, As-1 | MYB, MYC |
| 8 | LOC_Os11g29870 | OsWRKY72 | N/A | CGTCA-motif, TGACG-motif | CCAAT-box |
| 8 | AT5G13080 | AtWRKY75 | As-1 | As-1 | MYC, MYB, AE-box, G-box, STRE |
| 9 | LOC_Os05g45230 | OsWRKY58 | N/A | ABRE, CGTCA-motif, TGACG-motif | CCAAT-box, GC-motif, Sp1 |
| 9 | AT4G12020 | AtWRKY19 | As-1, CAT-box | As-1 | DRE core, MYB, MYC, STRE, W-box |
Figure 8. Gene co-expression network of Arabidopsis and rice WRKY, MADS-box and MYB orthologous genes. (A) Frequencies of co-expression interactions identified by PLANEX. Increasing r-values indicate stronger positive correlation, and vice versa. (B) Co-expression network in which nodes represent genes, node colours indicate the transcription factor family (red = MYB, blue = WRKY, purple = MADS-box), and edges indicate positive (red lines) or negative (blue lines) correlations.
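As a rough illustration of how signed co-expression edges like those in panel B can be derived, the sketch below computes a pairwise correlation and keeps pairs whose absolute r exceeds a cutoff. It assumes that r is a Pearson correlation coefficient. The gene names, expression values and the 0.8 threshold are all hypothetical and do not come from PLANEX.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    if sd_x == 0 or sd_y == 0:
        raise ValueError("correlation is undefined for a constant profile")
    return cov / (sd_x * sd_y)


def coexpression_edges(expression, threshold):
    """Return (gene1, gene2, r, sign) for every pair with |r| >= threshold."""
    genes = sorted(expression)
    edges = []
    for i, g1 in enumerate(genes):
        for g2 in genes[i + 1:]:
            r = pearson_r(expression[g1], expression[g2])
            if abs(r) >= threshold:
                sign = "positive" if r > 0 else "negative"
                edges.append((g1, g2, r, sign))
    return edges


# Hypothetical expression profiles across four samples.
toy = {
    "geneA": [1.0, 2.0, 3.0, 4.0],
    "geneB": [2.0, 4.0, 6.0, 8.0],
    "geneC": [4.0, 3.0, 2.0, 1.0],
}
for edge in coexpression_edges(toy, threshold=0.8):
    print(edge)
```

In this toy input, geneA and geneB rise together (a positive edge), while geneC falls as the others rise (negative edges).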
Functional similarity between the Arabidopsis and rice WRKY, MADS-box and MYB orthologous gene-pairs.
| Rice gene ID | Rice name | Rice probe ID | Arabidopsis gene ID | Arabidopsis name | Arabidopsis probe ID | Kappa statistic |
|---|---|---|---|---|---|---|
| LOC_Os11g43740 | OsMADS68 | OsAffx.19355.1.S1_at | AT1G18750 | AtAGL65 | 261423_at | 0.500029866483831 |
| LOC_Os07g37210 | OsMYB102 | Os.3390.1.S1_at | AT4G28110 | AtMYB41 | 253851_at | 0.236403501449661 |
| LOC_Os04g39470 | OsMYB80 | OsAffx.14205.1.S1_at | AT5G56110 | AtMYB80 | 248051_at | 0.162086262661477 |
| LOC_Os01g50110 | OsMYB13 | Os.55528.1.S1_at | AT3G13540 | AtMYB5 | 256985_at | 0.342306085866936 |
| LOC_Os03g29614 | OsMYB46 | Os.56985.1.S1_a_at | AT5G35550 | AtMYB123 | 249704_at | 0.355608075400779 |
| LOC_Os12g33070 | OsMYB122 | OsAffx.19945.2.S1_at | AT5G12870 | AtMYB46 | 250322_at | 0.122141588277817 |
| LOC_Os08g05520 | OsMYB93 | Os.49830.1.S1_at | AT1G63910 | AtMYB103 | 260326_at | 0.316581470795973 |
| LOC_Os12g38400 | OsMYB125 | Os.12994.1.S1_at | AT2G37630 | AtMYB91 | 267157_at | 0.348381644725455 |
| LOC_Os01g54600 | OsWRKY13 | Os.2160.2.S1_x_at | AT1G29280 | AtWRKY65 | 260882_at | 0.293798964280973 |
| LOC_Os02g53100 | OsWRKY32 | OsAffx.12620.1.S1_at | AT1G68150 | AtWRKY9 | 260432_at | 0.439323046291688 |
| LOC_Os01g60490 | OsWRKY22 | OsAffx.23871.1.S1_at | AT2G40740 | AtWRKY55 | 266052_at | 0.309001957815947 |
| LOC_Os07g39480 | OsWRKY87 | Os.18862.1.S1_at | AT4G26640 | AtWRKY20 | 253983_at | 0.263336526419911 |
| LOC_Os08g17400 | OsWRKY89 | Os.27818.1.S1_at | AT4G30935 | AtWRKY32 | 253603_at | 0.206776541657964 |
| LOC_Os01g62510 | OsWRKY119 | OsAffx.9554.1.S1_at | AT2G37260 | AtWRKY44 | 265954_at | 0.185878341728013 |
| LOC_Os01g53260 | OsWRKY23 | Os.30386.1.S1_at | AT2G46130 | AtWRKY43 | 266597_at | 0.372919466881054 |
| LOC_Os05g45230 | OsWRKY58 | OsAffx.27315.1.S1_at | AT4G12020 | AtWRKY19 | 254852_at | 0.208021020730278 |
The co-expression datasets are retrieved and analyzed using Kappa statistics from PLANEX.
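The record does not describe how the kappa values are calculated. As a hedged illustration of the general idea, the sketch below computes Cohen's kappa between two binary profiles, assuming each gene is encoded as a presence/absence vector over a shared list of features (for example, co-expressed partners or annotation terms). That encoding, the function name and the example vectors are assumptions, not PLANEX's documented procedure.

```python
def cohen_kappa(a, b):
    """Cohen's kappa for two equal-length binary (0/1) profiles.

    kappa = (observed agreement - chance agreement) / (1 - chance agreement)
    """
    if len(a) != len(b) or not a:
        raise ValueError("profiles must be non-empty and of equal length")
    n = len(a)
    both = sum(1 for x, y in zip(a, b) if x and y)
    neither = sum(1 for x, y in zip(a, b) if not x and not y)
    observed = (both + neither) / n
    p_a = sum(1 for x in a if x) / n
    p_b = sum(1 for y in b if y) / n
    # Agreement expected if the two profiles were independent.
    expected = p_a * p_b + (1 - p_a) * (1 - p_b)
    if expected == 1:
        raise ValueError("kappa is undefined when chance agreement is 1")
    return (observed - expected) / (1 - expected)


# Hypothetical presence/absence profiles over eight shared features.
rice_profile = [1, 1, 1, 0, 0, 0, 0, 0]
arabidopsis_profile = [1, 1, 0, 1, 0, 0, 0, 0]
print(cohen_kappa(rice_profile, arabidopsis_profile))
```

A kappa of 1 means the two profiles agree completely, and a value near 0 means they agree no more often than chance would predict.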