Literature DB >> 31941983

A Comprehensive Transcriptome-Wide Identification and Screening of WRKY Gene Family Engaged in Abiotic Stress in Glycyrrhiza glabra.

Pooja Goyal1, Malik Muzafar Manzoor1, Ram A Vishwakarma1, Deepak Sharma2, Manoj K Dhar2, Suphla Gupta3.   

Abstract

The study reports 147 full-length WRKY genes based on the transcriptome analysis of Glycyrrhiza genus (n class="Species">G. glabra and G. uralensis). Additional motifs in G. glabra included DivIVA (GgWRKY20) and SerS Superfamily (GgWRKY21) at the C-terminal, and Coat family motifs (GgWRKY55) at the N-terminal of the proteins, while Exo70 exo cyst complex subunit of 338 amino acid (GuWRKY9) was present at the N-terminal of G. uralensis only. Plant Zn cluster super-family domain (17 WRKYs) and bZIP domain (2 WRKYs) were common between the two species. Based on the number of WRKY domains, sequence alignment and phylogenesis, the study identified GuWRKY27 comprising of 3 WRKY domains in G. uralensis and a new subgroup-IIf (10 members), having novel zinc finger pattern (C-X4-C-X22-HXH) in G. glabra. Multiple WRKY binding domains (1-11) were identified in the promoter regions of the GgWRKY genes indicating strong interacting network between the WRKY proteins. Tissue-specific expression of 25 GgWRKYs, under normal and treated conditions, revealed 11 of the 18 induction factor triggered response corroborating to response observed in AtWRKYs. The study identified auxin-responsive GgWRKY 55 & GgWRKY38; GA3 responsive GgWRKYs15&59 in roots and GgWRKYs8, 20, 38, 57 &58 in the shoots of the treated plant. GgWRKYs induced under various stresses included GgWRKY33 (cold), GgWRKY4 (senescence), GgWRKYs2, 28 & 33 (salinity) and GgWRKY40 (wounding). Overall, 23 GgWRKYs responded to abiotic stress, and 17 WRKYs were induced by hormonal signals. Of them 13 WRKYs responded to both suggesting inter-connection between hormone signalling and stress response. The present study will help in understanding the transcriptional reprogramming, protein-protein interaction and cross-regulation during stress and other physiological processes in the plant.

Entities:  

Mesh:

Substances:

Year:  2020        PMID: 31941983      PMCID: PMC6962277          DOI: 10.1038/s41598-019-57232-x

Source DB:  PubMed          Journal:  Sci Rep        ISSN: 2045-2322            Impact factor:   4.379


Introduction

The complexity in plant cell organization can be directly related to intricate inter-connections between genes and regulatory network inside the cell. This observation is further substantiated by studies on vast genomic and transcriptomic sequence information available in the public domain[1]. The inter-cellular biological circuit in higher plants is governed at several discreet levels, one of them is regulated by a specific group of DNA binding proteins, the WRKYs[2] are among the ten largest families of transcription factors (TFs) in higher plants. The literature cites several papers after the first report on Ipomea batata (SPF1) in 1994[3], from dicots[4], monocots[5], orchids[6] to unicellular eukaryote (Giardia lamblia) and the slime mold (Dictyostelium discoideum), revealing their evolutionary significance and complex organization[7,8]. The 60 amino-acid characteristic conserved sequence of WRKY transcription factor (TFs) are most commonly identified by specific hepta-nucleotide signature sequence (WRKYGQK), the W-Box, which binds to the promoter sequence of target gene(s) modulating its activity[9]. The large WRKY super-family is phylogenetically classified into three groups (I, II &III) based on the number of WRKY domains and type of zinc finger sequences at the C-terminal. WRKY proteins classified in group I is characterized by two WRKY domains and zinc-finger motifs (C2H2), while group II and III WRKY proteins constitute single WRKY domain. Zinc-finger motif in group II & III comprise of C2H2 and C2HC zinc-finger pattern, respectively[10]. Studies have shown WRKY binding motifs (W-boxes) are present in multiple numbers in WRKY responsive gene promoters[11]. The promoters of 83% genes of the 72 WRKYs in Arabidopsis, contain at least two perfect W-boxes (TTGACC/T), and 58% had four or more core element sequence (TTGAC)[11]. Some WRKYs had 11 to 12 (AtWRKY66, AtWRKY17) core elements in the promoter fragment as analysed by Dong et al.[12]. Interestingly, studies confirm the presence of W-boxes also in the promoter region of WRKY genes, suggesting a potentially strong transcriptional networking between WRKY proteins[11]. Studies using co-transfection assays have revealed role of WRKY proteins on the promoters of their own genes and on other WRKY genes thereby modulating reporter gene[13]. Also in-vitro DNA-protein binding assays have highlighted single WRKY binding to several target gene promoters as elucidated in WRKY53 binding to three different WRKY genes, confirming complex interactive regulatory network. Microarray experiments using Arabidopsis genome illustrated more than 70% (45 out of 61) of the WRKY genes are co-regulated with other WRKYs[14] and transcription factors[12]. Biological role of WRKYs are being studied in several plants[15]. They have been found to regulate several target genes in response to stress[16] including metal stress[17], development[18] and secondary metabolite biosynthesis[1]. WRKYs have shown regulatory role in pathogen-induced response[12] resulting in concerted activation of variety of genes. WRKY TFs have been found to rapidly and transiently regulate gene induction in response to signalling molecule[19], wounding, stress, physiological processes like flowering[20], seed germination and development[21] and senescence[4]. Expressed Sequence Tags (ESTs) and other plant database have revealed presence of several hundred WRKYs in various tissues under different physiology, stress[18], cold[22], stomatal movement[23] and defense[24,25] implying their predominant role in varied biological functions. However, under normal growth conditions also, WRKY proteins have demonstrated broad-spectrum regulatory role as reported in morphogenesis and development of trichomes[26] embryo development[18], senescence[13], dormancy[27], plant growth[28], immunity[29], systematic acquired resistant and metabolic pathways[30]. Two decades of studies on WRKY TFs has resulted in more than 14500 WRKY genes from 165 plant species[31] with most of the species from eudicots (100 species) followed by monocots (38 species) and chlorophytae (16 species)[31]. Legumes with 12 species contributed to 1094 WRKY genes[32]. n class="Chemical">No report on WRKY transcription factors has been published from Glycyrrhiza species, though transcriptome, genome and EST databases are available in public domain from G. uralensis. Glycyrrhiza belongs to Fabaceae sub-family of Leguminoseae family. The underground roots (Licon class="Species">rice) of the genus (G. uralensis, G. glabra and G. echinata) are commercially valued for its pharmaceutical, flavour enhancer natural sweetener, and cosmaceutical properties[33]. Roots of the plant are rich in bioactive flavonoids and tri-terpenoid saponins including glycyrrhizin[34]. Glycyrrhizin molecule is pharmaceutically sought molecule for its multitude of bioactivities[33]. The global demand of the roots of Glycyrrhiza is evident by a market report, as per Transparency Market report (ALBANY, New York, April 4, 2017 /PRNewswire). Where projected compound annual growth rate was estimated to be 5.7% during 2017–2025 equivalent to USD 2,393.9 million by 2025. Present research underlines the transcriptome-wide identification and characterization of 147 WRKY TFs from Glycyrrhiza genus. Here, we analysed 87 WRKY genes from n class="Species">G. glabra and 60 from G. uralensis, categorized them into different structural groups based on conserved motif composition. We also predicted functions based on STRING prediction algorithm in G. glabra WRKY members. Subsequently their expression profiles were investigated under various stress conditions in the aerial tissues of the in vitro cultured plant. We also characterized 31 promoters (between 0.5 kb to 4.1 kb) of the 87 GgWRKY genes (from the transcriptomic data) to get an insight into its functioning and regulation of secondary metabolites.

Results and Discussion

Transcriptome-wide analysis and characterization of Glycyrrhiza WRKY TF

We have done the transcriptomics of G. glabra plant and mined the data for the n class="Gene">WRKY transcription factor. Among the 125 sequences that matched WRKY genes on BLAST and PF03106 HMM profile searches, 87 GgWRKYs had complete CDS, and 38 gene sequences were partial (Table 1). All of these were revalidated using Uniprot (https://www.uniprot.org/) resulting in 78 sequences with best hits, while 47 sequences were found unique. Out of these, 55 (UniProt hits) and 32 (unique) sequences were full length, and 23 (UniProt hits) and 15 (unique) were partial sequences (Table 1). Further, we used the publicly available G. uralensis transcriptome data as a reference source (http://ngs-data-archive.psc.riken.jp/Gur-genome/download.pl.) to retrieve the WRKY transcription factor using BLAST and PF03106 HMM profile searches, we could identify 60 WRKY genes from G. uralensis. Subsequently, all the full-length protein sequences (147) were re-examined for the presence of WRKY domains using conserved domain database (https://www.ncbi.nlm.nih.gov/cdd/) and through HMMScan (https://www.ebi.ac.uk/Tools/hmmer/search/hmmscan). The GgWRKY sequences were submitted to NCBI, and their accession numbers are given in (Supplementary File 1). The identified GuWRKY protein sequences were included in the sequence alignment and phylogenetic studies only (Supplementary File 2). The detailed GgWRKY protein sequence features are listed in Table 2. The deduced GgWRKY proteins had amino acid residues between 112 (GgWRKY67) to 760 (GgWRKY12). The coding sequences of 87 full-length GgWRKYs ranged from 339 bp (GgWRKY67) to 2283 bp (GgWRKY12), and their molecular weight (MW) varied between 13291.91 Da (GgWRKY67) to 82181.16 Da (GgWRKY12) (Table 3). The isoelectric point (pI) of 44 GgWRKYs were acidic, one (GgWRKY55) was neutral with pI value equal to 7.0, and the remaining 42 were basic proteins. According to the instability index proteins with index value higher than 40.0 is unstable[35]. In the present study most of the GgWRKYs were found to be unstable, having maximum instability index of 68.68 (GgWRKY34) with the exception of ten GgWRKYs namely, GgWRKY10 (30.20), GgWRKY16 (38.82), GgWRKY48 (39.86), GgWRKY50 (33.92), GgWRKY60 (39.40), GgWRKY73 (33.70), GgWRKY80 (39.40), GgWRKY83 (37.16), GgWRKY84 (32), GgWRKY86 (35.94) (Table 3). Additionally, the WoLFPSORT prediction showed that 81 GgWRKY proteins were localised in nucleus, suggesting that they play regulatory role predominantly in cell nucleus, while 4 GgWRKYs (-23, 32, 75, 84) had chloroplast orientation. GgWRKY73 had mitochondrial and GgWRKY 86 had cytoplasmic subcellular localization (Table 3). Further, five GgWRKY members (GgWRKYs 10,-33,-67,-68,-87) had WRKYGKK domain instead of the common WRKYGQK (Table 2). Earlier studies have also reported replacement of Q by K as common variant. Rice WRKYs have shown 19 variants, where the characteristic WRKY is substitution by WRRY,WSKY,WKKY, WVKY or WKKY motifs[5].
Table 1

Sequence information of WRKY genes in G. glabra.

Sequence typeUniprot matchesUniqueTotal
Full length sequences553287
Partial sequences231538
Total7847125
Table 2

Sequence features of WRKY genes in G. glabra.

GgWRKYsGene IDCDS (bp)ORF (aa)groupConserved motifDomain patternZinc finger
GgWRKY1MK511239156352012(WRKYGQK)C-X4-C-X22-HNHC2H2
GgWRKY2MK511240115538412(WRKYGQK)C-X4-C-X22-HNHC2H2
GgWRKY3MK5112418042672eWRKYGQKC-X5-C-X23-HNHC2H2
GgWRKY4MK5112428402792cWRKYGQKC-X4-C-X23-HNHC2H2
GgWRKY5MK51124376225312(WRKYGQK)C-X4-C-X22-HNHC2H2
GgWRKY6MK5112446032002fWRKYGQKC-X4-C-X22-HNHC2H2
GgWRKY7MK511245176758812(WRKYGQK)C-X4-C-X22-HNHC2H2
GgWRKY8MK511246112837512(WRKYGQK)C-X4-C-X22-HNHC2H2
GgWRKY9MK511247144047912(WRKYGQK)C-X4-C-X22-HNHC2H2
GgWRKY10MK5112485071682cWRKYGKKC-X4-C-X23-HXHC2H2
GgWRKY11MK51124911043672cWRKYGQKC-X4-C-X23-HNHC2H2
GgWRKY12MK511250228376012(WRKYGQK)C-X4-C-X23-HNHC2H2
GgWRKY13MK51125111043672cWRKYGQKC-X4-C-X23-HNHC2H2
GgWRKY14MK51125212664212fWRKYGQKC-X4-C-X22-HXHC2H2
GgWRKY15MK511253130543412(WRKYGQK)C-X4-C-X22-HNHC2H2
GgWRKY16MK511254120640112(WRKYGQK)C-X4-C-X23-HNHC2H2
GgWRKY17MK5112556692222fWRKYGQKC-X4-C-X22-HXHC2H2
GgWRKY18MK511256160853512(WRKYGQK)C-X4-C-X22-HNHC2H2
GgWRKY19MK511257152750712(WRKYGQK)C-X4-C-X23-HXHC2H2
GgWRKY20MK51125819746572bWRKYGQKC-X5-C-X23-HXHC2H2
GgWRKY21MK51125917435802bWRKYGQKC-X5-C-X23-HNHC2H2
GgWRKY22MK511260103534412(WRKYGQK)C-X4-C-X22-HXHC2H2
GgWRKY23MK5112617232402cWRKYGQKC-X4-C-X23-HXHC2H2
GgWRKY24MK5112627772582eWRKYGQKC-X5-C-X23-HNHC2H2
GgWRKY25MK5112636242072eWRKYGQKC-X5-C-X23-HNHC2H2
GgWRKY26MK5112646242072eWRKYGQKC-X5-C-X23-HNHC2H2
GgWRKY27MK51126516175382eWRKYGQKC-X5-C-X23-HNHC2H2
GgWRKY28MK5112666122032fWRKYGQKC-X4-C-X22-HXHC2H2
GgWRKY29MK51126711433802eWRKYGQKC-X5-C-X23-HNHC2H2
GgWRKY30MK51126810683553WRKYGQKC-X7-C-X23-HXCC2HC
GgWRKY31MK511269143747812(WRKYGQK)C-X4-C-X22-HXHC2H2
GgWRKY32MK5112704951653WRKYGQKC-X7-C-X23-HXCC2HC
GgWRKY33MK5112714981652cWRKYGKKC-X4-C-X23-HXHC2H2
GgWRKY34MK5112729663213WRKYGQKC-X7-C-X23-HXCC2HC
GgWRKY35MK51127310293422cWRKYGQKC-X4-C-X23-HNHC2H2
GgWRKY36MK5112745791922cWRKYGQKC-X4-C-X23-HNHC2H2
GgWRKY37MK5112757172382aWRKYGQKC-X5-C-X23-HNHC2H2
GgWRKY38MK51127611583852bWRKYGQKC-X5-C-X23-HNHC2H2
GgWRKY39MK51127718816262bWRKYGQKC-X5-C-X23-HNHC2H2
GgWRKY40MK5112787892622aWRKYGQKC-X5-C-X23-HNHC2H2
GgWRKY41MK511279124841512(WRKYGQK)C-X4-C-X22-HXHC2H2
GgWRKY42MK51128010293422fWRKYGQKC-X4-C-X22-HXHC2H2
GgWRKY43MK511281114938212(WRKYGQK)C-X4-C-X22-HXHC2H2
GgWRKY44MK51128212964312bWRKYGQKC-X5-C-X23-HNHC2H2
GgWRKY45MK5112837862612eWRKYGQKC-X5-C-X23-HNHC2H2
GgWRKY46MK511284135645112(WRKYGQK)C-X4-C-X22-HXHC2H2
GgWRKY47MK511285153050912(WRKYGQK)C-X4-C-X22-HXHC2H2
GgWRKY48MK5112865941972cWRKYGQKC-X4-C-X23-HNHC2H2
GgWRKY49MK5112877892622cWRKYGQKC-X4-C-X23-HNHC2H2
GgWRKY50MK5112884681552fWRKYGQKC-X4-C-X22-HXHC2H2
GgWRKY51MK5112898942972cWRKYGQKC-X4-C-X23-HXHC2H2
GgWRKY52MK5112907322432aWRKYGQKC-X5-C-X23-HNHC2H2
GgWRKY53MK5112919453142aWRKYGQKC-X5-C-X23-HNHC2H2
GgWRKY54MK5112928672882aWRKYGQKC-X5-C-X23-HNHC2H2
GgWRKY55MK5112938822932aWRKYGQKC-X5-C-X23-HNHC2H2
GgWRKY56MK5112947502482aWRKYGQKC-X5-C-X23-HNHC2H2
GgWRKY57MK5112957832602fWRKYGQKC-X4-C-X22-HNHC2H2
GgWRKY58MK5112969363112cWRKYGQKC-X4-C-X23-HXHC2H2
GgWRKY59MK51129710143372fWRKYGQKC-X4-C-X22-HXHC2H2
GgWRKY60MK5112986512162cWRKYGQKC-X4-C-X23-HNHC2H2
GgWRKY61MK5112997322432aWRKYGQKC-X5-C-X23-HNHC2H2
GgWRKY62MK5113004291422aWRKYGQKC-X5-C-X13-HNC2H2
GgWRKY63MK51130110923632dWRKYGQKC-X5-C-X23-HNHC2H2
GgWRKY64MK51130210503492dWRKYGQKC-X5-C-X23-HNHC2H2
GgWRKY65MK51130310653542eWRKYGQKC-X5-C-X23-HNHC2H2
GgWRKY66MK51130410473482dWRKYGQKC-X5-C-X23-HNHC2H2
GgWRKY67MK5113053391122cWRKYGKKC-X4-C-X23-HNHC2H2
GgWRKY68MK5113065881952cWRKYGKKC-X4-C-X23-HNHC2H2
GgWRKY69MK5113079333102dWRKYGQKZinc cluster
GgWRKY70MK5113089813262dWRKYGQKZinc cluster
GgWRKY71MK51130910713562dWRKYGQKC-X5-C-X23-HNHC2H2
GgWRKY72MK5113109573182dWRKYGQKC-X5-C-X23-HNHC2H2
GgWRKY73MK5113113871282dWRKYGQKC-X5-C-X23-HNHC2H2
GgWRKY74MK51131210893622fWRKYGQKC-X4-C-X22-HNHC2H2
GgWRKY75MK5113136902292cWRKYGQKC-X4-C-X23-HNHC2H2
GgWRKY76MK51131410053342cWRKYGQKC-X4-C-X23-HNHC2H2
GgWRKY77MK51131510683553WRKYGQKC-X7-C-X23-HXCC2HC
GgWRKY78MK5113167232402fWRKYGQKC-X4-C-X22-HNHC2H2
GgWRKY79MK5113176902292cWRKYGQKC-X4-C-X23-HXHC2H2
GgWRKY80MK5113186512162cWRKYGQKC-X4-C-X23-HNHC2H2
GgWRKY81MN6257349123032dWRKYGQKZinc cluster
GgWRKY82MN6257357802592dWRKYGQKZinc cluster
GgWRKY83MK511319591196NGWRKYGQK
GgWRKY84MK511320531176NGWRKYGQK
GgWRKY85MN625736654217NGWRKYGQK
GgWRKY86MN625737552183NGWRKYGQK
GgWRKY87MN625738426141NGWRKYGKK
Table 3

Physical parameters of GgWRKY genes.

GgWRKYspIMw(Da)Instability IndexAliphatic indexGRAVYSubcellular localization
GgWRKY16.6456446.5761.1358.15−0.760Nucleus
GgWRKY26.6142812.2557.1957.60−1.054Nucleus
GgWRKY34.8730524.6967.4959.14−1.006Nucleus
GgWRKY46.5531717.6141.3163.19−0.743Nucleus
GgWRKY56.5128190.9453.7648.50−1.096Nucleus
GgWRKY66.1722124.2361.9444.85−1.109Nucleus
GgWRKY77.1864879.5859.8741.12−0.988Nucleus
GgWRKY88.8941166.9853.5646.53−0.984Nucleus
GgWRKY98.6352606.4252.8752.55−0.903Nucleus
GgWRKY105.3019325.0930.2043.99−1.340Nucleus
GgWRKY115.9139455.3750.9649.92−0.843Nucleus
GgWRKY125.7482181.1654.4352.75−0.836Nucleus
GgWRKY135.9139455.3750.9649.92−0.843Nucleus
GgWRKY145.6345531.9551.9655.39−0.773Nucleus
GgWRKY156.1248273.3866.4438.82−1.086Nucleus
GgWRKY167.7044055.5138.8255.39−0.859Nucleus
GgWRKY175.5224792.9645.3350.05−1.027Nucleus
GgWRKY185.8158714.9651.3261.23−0.758Nucleus
GgWRKY196.3456060.6452.1659.04−1.002Nucleus
GgWRKY206.4570757.1558.0961.11−0.700Nucleus
GgWRKY216.8663675.2853.7264.60−0.622Nucleus
GgWRKY228.1038457.0952.9170.20−0.784Nucleus
GgWRKY235.8126383.7147.2948.25−1.033Chloroplast
GgWRKY245.5128633.9661.5558.22−0.728Nucleus
GgWRKY255.7323136.9555.8959.32−0.679Nucleus
GgWRKY265.7223236.3454.1665.89−0.515Nucleus
GgWRKY276.0159133.8860.5254.94−0.827Nucleus
GgWRKY286.8622645.061.6665.27−1.052Nucleus
GgWRKY298.2241362.9560.2759.34−0.720Nucleus
GgWRKY305.3240108.9749.4463.18−0.690Nucleus
GgWRKY316.8852267.9857.4955.46−0.992Nucleus
GgWRKY325.9618879.8955.3753.15−1.062Chloroplast
GgWRKY336.3018917.7160.5348.97−1.024Nucleus
GgWRKY345.4436176.0168.6869.84−0.771Nucleus
GgWRKY356.2338723.2262.5048.22−1.141Nucleus
GgWRKY369.6022200.1558.5160.94−0.886Nucleus
GgWRKY376.2526645.0152.9074.20−0.690Nucleus
GgWRKY388.7240884.3152.1261.53−0.570Nucleus
GgWRKY395.5667756.8746.2359.11−0.720Nucleus
GgWRKY406.3229579.3751.5575.57−0.747Nucleus
GgWRKY417.6345591.8853.8762.46−0.896Nucleus
GgWRKY426.4837329.1562.9356.14−1.051Nucleus
GgWRKY439.0941765.7255.6455.60−0.958Nucleus
GgWRKY447.9947206.1655.8275.66−0.515Nucleus
GgWRKY455.4228438.5560.8958.35−0.670Nucleus
GgWRKY467.2249789.8250.1066.08−0.735Nucleus
GgWRKY477.2855366.6054.6860.33−0.935Nucleus
GgWRKY488.9621525.7339.8656.85−0.812Nucleus
GgWRKY496.7528771.6145.2357.98−0.842Nucleus
GgWRKY507.7717309.3333.9259.74−0.684Nucleus
GgWRKY517.0932735.0963.3346.09−0.864Nucleus
GgWRKY529.2327088.4643.6963.74−0.833Nucleus
GgWRKY538.7034991.3451.4461.72−0.825Nucleus
GgWRKY548.9831973.9650.4559.86−0.840Nucleus
GgWRKY557.0032897.2652.0376.45−0.568Nucleus
GgWRKY568.8827837.5252.3571.20−0.610Nucleus
GgWRKY575.6929064.8864.7253.27−0.994Nucleus
GgWRKY586.0634087.4145.8254.15−0.973Nucleus
GgWRKY597.1635981.7653.2057.63−0.706Nucleus
GgWRKY609.1323611.0839.4057.27−0.838Nucleus
GgWRKY619.2327088.4643.6963.74−0.833Nucleus
GgWRKY629.6516437.9145.3374.15−0.896Nucleus
GgWRKY639.7040591.7954.8766.61−0.769Nucleus
GgWRKY649.6838091.7959.7263.52−0.724Nucleus
GgWRKY655.5838415.2654.7953.79−0.706Nucleus
GgWRKY669.7339237.3851.3461.93−0.809Nucleus
GgWRKY679.6113291.9158.9841.70−1.143Nucleus
GgWRKY686.2322180.2360.6545.95−1.086Nucleus
GgWRKY699.8634964.5952.3767.94−0.742Nucleus
GgWRKY709.8936444.1055.5468.19−0.728Nucleus
GgWRKY719.2438874.5947.5366.29−0.631Nucleus
GgWRKY729.9134646.2051.3965.06−0.586Nucleus
GgWRKY739.9614291.4633.7058.67−0.802Mitochondria
GgWRKY748.3639382.6555.2253.400.858Nucleus
GgWRKY757.6025909.5342.0251.05−0.895Chloroplast
GgWRKY766.2737066.8156.8953.95−0.845Nucleus
GgWRKY775.4540568.9149.0557.13−0.830Nucleus
GgWRKY784.9326513.7863.0354.08−1.010Nucleus
GgWRKY798.3126029.3443.7962.14−0.738Nucleus
GgWRKY809.1323611.0839.4057.27−0.838Nucleus
GgWRKY819.9633185.3863.2562.54−0.746Nucleus
GgWRKY8210.0528317.1359.6264.40−0.639Nucleus
GgWRKY839.1122167.3637.1666.68−0.463Nucleus
GgWRKY847.7319597.613257.05−0.807Chloroplast
GgWRKY855.5823670.7364.1437.83−1.053Nucleus
GgWRKY869.0520003.8735.9476.72−0.551Cytoplasm
GgWRKY876.2816102.5355.8643.55−1.207Nucleus
Sequence information of WRKY genes in G. glabra. Sequence features of WRKY genes in n class="Species">G. glabra. Physical parameters of GgWRKY genes.

Conserved domain in Glycyrrhiza WRKY members

Generally, similar domains in a protein impart similar function. Transcription factors gene families have a common conserved domain involved in DNA binding. All the 147 WRKYs (n class="Species">G. glabra & G. uralensis) had a distinctive hepta-peptide DNA binding sequence (WRKYG[Q/K]K), the identifying character of the WRKY family. In the present study, 28 WRKYs showed the presence of additional motif besides the WRKY domain (Figs. 1 and S1). GgWRKY55 possessed Coat family motif (30 amino acid residues) at the N-terminal while GgWRKY20 and GgWRKY21 had DivIVA super-family (63 amino acid residues) and SerS Superfamily motif (61 amino acid residues), respectively at the C-terminal. G. uralensis, on the other hand, had GuWRKY9 having Exo70 exo cyst complex subunit (338 amino acid) and Flac-arch super (GuWRKY26) at the N-terminal and PAT1 (GuWRKY23) and SGNH_hydrolase (GuWRKY60) at the C-terminal. However two motifs were common in both the species-Plant Zinc Cluster (26–40 amino acid) in 16 WRKYs and bZIP domain (42amino acid) in 2 WRKYs was present in both the species. Plant Zn cluster super-family domain was present in nine GgWRKYs (GgWRKYs63,64, 66, 69,70,71,72,73 and 86) and seven GuWRKYs (GuWRKYs1,13,18,29,35,49 and 56). All the domains simultaneously reported in the present study in both the species of Glycyrrhiza were reported individually in different plants in earlier studies. We have not come across any report mentioning DivIVA, SerS Superfamily, PAT1, Exo70 exo cyst complex subunit SGNH_hydrolase, Flac-arch super & Coat family protein in plant WRKY proteins. However, bZIP & plant zinc cluster have been reported from A. thaliana earlier[36]. The analysis of sequence motifs using MEME platform (http://meme.nbcr.net/meme/cgi-bin/meme.cgi)[37] displaying common and unique motifs within the GgWRKY sequences are shown in Fig. 2.
Figure 1

Classification of full-length GgWRKY amino acid sequences with different conserved domains (DivIVA, SerS, bZIP, Coat & Plant Zn cluster, WRKY). The conserved domains were investigated by CDD; * are exceptions in the classified groups and sub-groups in the phylogeny.

Figure 2

(a) Visualization of classification of 82 GgWRKY proteins. Conserved regions of GgWRKYs were used to construct the NJ phylogenetic tree with 1000 bootstrap value. (b) Architecture of 15 conserved protein motifs in GgWRKYs. Each motif is represented in different color (Motif 1–15). The conserved motifs were predicted by MEME program.

Classification of full-length GgWRKY amino acid sequences with different conserved domains (DivIVA, n class="Chemical">SerS, bZIP, Coat & Plant Zn cluster, WRKY). The conserved domains were investigated by CDD; * are exceptions in the classified groups and sub-groups in the phylogeny. (a) Visualization of classification of 82 GgWRKY proteins. Conserved regions of GgWRKYs were used to construct the NJ phylogenetic tree with 1000 bootstrap value. (b) Architecture of 15 conserved protein motifs in GgWRKYs. Each motif is represented in difn class="Chemical">ferent color (Motif 1–15). The conserved motifs were predicted by MEME program.

Phylogeny

The relatedness among 136 Glycyrrhiza WRKY proteins with the 109WRKYs identified from n class="Species">Arabidopsis thaliana, Psychometrella patens, Human FLYWCH CRAa and GCMa were investigated (Fig. 3) and tabulated in Table 4. The phylogeny of 136 WRKY proteins from the genus Glycyrrhiza displayed 22WRKYs (17GgWRKYs & 5GuWRKYs) belonging to group-I, 98 WRKYs (61 GgWRKYS & 37 GuWRKYs) clustering in group-II and 16 WRKY members comprising of group-III (4GgWRKYS & 12GuWRKYs). Group-II was further sub-divided into five sub-groups, IIa (11), IIb (17), IIc (16 + 8), IId (17), IIe (15) and an additional novel sub-group IIf (14) based on WRKY transcription factor rules adopted in Arabidopsis[9]. The present paper reports few exceptions observed in the WRKY members identified in the genus Glycyrrhiza. The GuWRKY27 possessed three WRKY domains (N1, N2 &C). Few recent publications have also reported more than 2 WRKY domains in Gossypium raimondii[38], Linum usitatissimum Lupinus angustifolius, Aquilegia coerulea and Setaria italic[32]. Phylogenetic analysis of the indicated proteins, however clubbed them into different subgroups. For example, in G. raimondii (WRKY108) the three domains (WRKY108N1, WRKY108N2 &WRKY108C) were clustered into IIc, III & IId sub-groups, respectively. In the present study, however, all the three WRKY domains (N1, N2 &C) of GuWRKY27 were found to be clustered into Group-III having Zn finger pattern similar to groupIII. This implies that the GuWRKY27 protein sequences are highly homologous to the group III WRKY member proteins, unlike the earlier published reports. Another exception was seen in GuWRKY20, where the protein was classified into group I based on the number of WRKY domains (2). However, it was clustered into group-III in the phylogenetic classification. MSA revealed that both the WRKY domains had Zn finger pattern similar to Group-III (C-X7-C-X23-HXC). The third exception was observed in GuWRKY3 whose Zn finger pattern was unlike any of the existing subgroups of group II. It could be the starting point for the evolution of a new subgroup in group II.
Figure 3

Neighbour-Joining JTT model of phylogenetic tree comprising of 82 Glycyrrhiza glabra (maroon), 54 Glycyrrhiza uralensis (cyan blue), 70 Arabidopsis thaliana (dark green), 37 Psycometrella patens (violet), with GCMa (blue) and FLYWCH CRAa (red) WRKY domains. Suffix ‘N’ and ‘C’ indicates the N-terminal and the C-terminal of 60 amino acids WRKY domains of Group I.

Table 4

Phylogenetic classification of WRKY domains identified from G. glabra, A. thaliana, P. patens and G. uralensis WRKY proteins.

GroupSub groupGene numberGgWRKYsAtWRKYsPpWRKYsGuWRKYs
GgWRKYsAtWRKYsPpWRKYsGuWRKYs
IIN171335

41,46,47,

31,43,1,2,15,9,16,

19,22,18,

12,7,8,5

58,20,1,

32,3,4,19,44,2,34,

33,25,26

30,39,214,57,6,50,55
IC171335

19,41,46,

22,15,5,

47,31,43,

12,18,9,1,2,7,8,16

58,4,3,33,20,2,26,

34,19,25,

44,32,1

39,30,214,57,6,50,55
IIIIa9302

53,54,61,

52,55,56,

62,37,40

40,60,18--------58,28
IIb58512

20,21,44,

38,39

36,6,31,

42,47,61,

9,72

7,9,12,

14,29

54,43,5,12,

22,23,47,31,37,48,32,26

IIc1117195

36,35,76,

51,79,75,

4,10,68,

67,33

45,75,43,

24,56,48,

57,23,68,

71,8,28,

13,12,50,

51,59

37,11,25,

10,3,23,

20,8,13,

32,1,4,19,31,40,28,

5,6,24

40,24,33,36,59
IId10757

70,69,81,

82,66,63,

64,72,71,

73

11,17,15,

21,39,74,

7

15,22,17,

2,38

1,49,35,13,

56,29,18

IIe8807

27,29,3,

65,25,26,

24,45

65,69,29,

27,22,16,

14,35

--------

46,34,14,2,

44,25,42

IC + IIc8100

58,23,60,

80,48,49,

11,13

10------------------
IN + IIf10004

78,28,50,

14,59,17,

74,6,57,42

---------------------53,7,60,3
III413512

34,30,32,

77

63,64,66,

67,38,62,

54,70,55,

46,30,41,

53

16,27,26,

33,34

9,16,11,39,

10,19,38,8,

17,51,20N,

20C,27N1,

27N2,27C

Total82703754
Neighbour-Joining JTT model of phylogenetic tree comprising of 82 n class="Species">Glycyrrhiza glabra (maroon), 54 Glycyrrhiza uralensis (cyan blue), 70 Arabidopsis thaliana (dark green), 37 Psycometrella patens (violet), with GCMa (blue) and FLYWCH CRAa (red) WRKY domains. Suffix ‘N’ and ‘C’ indicates the N-terminal and the C-terminal of 60 amino acids WRKY domains of Group I. Phylogenetic classification of WRKY domains identified from G. glabra, n class="Species">A. thaliana, P. patens and G. uralensis WRKY proteins. 41,46,47, 31,43,1,2,15,9,16, 19,22,18, 12,7,8,5 58,20,1, 32,3,4,19,44,2,34, 33,25,26 19,41,46, 22,15,5, 47,31,43, 12,18,9,1,2,7,8,16 58,4,3,33,20,2,26, 34,19,25, 44,32,1 53,54,61, 52,55,56, 62,37,40 20,21,44, 38,39 36,6,31, 42,47,61, 9,72 7,9,12, 14,29 54,43,5,12, 22,23,47,31,37,48,32,26 36,35,76, 51,79,75, 4,10,68, 67,33 45,75,43, 24,56,48, 57,23,68, 71,8,28, 13,12,50, 51,59 37,11,25, 10,3,23, 20,8,13, 32,1,4,19,31,40,28, 5,6,24 70,69,81, 82,66,63, 64,72,71, 73 11,17,15, 21,39,74, 7 15,22,17, 2,38 1,49,35,13, 56,29,18 27,29,3, 65,25,26, 24,45 65,69,29, 27,22,16, 14,35 46,34,14,2, 44,25,42 58,23,60, 80,48,49, 11,13 78,28,50, 14,59,17, 74,6,57,42 34,30,32, 77 63,64,66, 67,38,62, 54,70,55, 46,30,41, 53 16,27,26, 33,34 9,16,11,39, 10,19,38,8, 17,51,20N, 20C,27N1, 27N2,27C It was further observed among the 82 GgWRKY proteins in the phylogenetic tree, Group IN (17 members) clubbed with ten GgWRKYs (GgWRKYs 59,-14,-17,-28,-50,-42,-6,-57,-74,-78) belonging to Group-II with unique n class="Chemical">Zn finger pattern (C-X4-C-X22-HXH) which was not reported earlier in this group. We propose a new subgroup-IIf based on the present findings which could be the initiation of divergence into a new sub-group maintaining the characteristic WRKY domain. The phylogenetic analysis of the 60 amino acid region of the Glycyrrhiza WRKY proteins indicated their diverse origin. The n class="Chemical">N-terminal and the C-terminal of Group I of the WRKY proteins clustered them into different clades indicating their dissimilar background. Further, the majority of the subgroup IIc proteins (8 proteins) were found to assemble with group IC indicating their common origin with respective clusters. Contrary to our results, Zhu et al.[39] found that subgroup IIc WRKY domain in Triticum aestivum, originated from the N-terminal WRKY domain of group I. However, recent study on legumes have revealed that IIc sub-groups have multiple origins[32]. The present study also showed gathering of sub-groups IIa & IIb, while sub-group IId + IIe were clustered with group III, signifying close relationship with members with respective groups. Previously Zhang & Wang[8] proposed a phylogenetic tree based evolutionary relationships which classified the WRKY gene family into four clades including groups I + IIc, groups IIa + IIb, group IId, and group IIe. But according to Rinerson et al.[40] hypothesis the WRKY protein evolution may have followed two alternative paths, “Group I Hypothesis” which proposed that all WRKY proteins evolved from the C-terminal WRKY domains of group I proteins, and the “ IIa + b Separate Hypothesis” which suggested that groups IIa and IIb have evolved directly from a single domain algal gene separated from a group I-derived lineage. It is hard to explain the origin of the WRKY gene family on the basis of any one hypothesis, as mounting number of studies have demonstrated their multiple origins. Based on our phylogenetic analyses, we found that a phylogenetic cluster was a mix of WRKY genes from at least two different groups or sub-group indicating their dynamic nature. Further, the present study could identify eleven WRKY proteins (GgWRKY-83 to-87 & GuWRKY-15, -21, -30, -41, -45 &-52), not included in the phylogenetic analyses, that possessed WRKY domain but had truncated characteristic zinc finger motif. Earlier studies on Vitis venifera[41] and n class="Species">rice[42] had also shown loss of Zn finger motifs in WRKY proteins. The phylogenetic clustering was further examined at sequence level by multiple sequence alignment (MSA).

Multiple sequence alignment of the identified WRKY proteins

The multiple sequence alignment of 60 amino acids conserved region of all the 87 GgWRKY proteins were clustered in 9 different groups and sub-groups with very high homology (>70%) as shown in (Fig. 4). Group In class="Chemical">N displayed conserved motif 1 (DG[Y/F]NWRKYGQK[L/Q/H]VK) and zinc finger pattern of C-X4-C-X22-HXH showing conservancy with 27 GgWRKY proteins, 17 of them belonged to Group IN and 10 GgWRKYs (59,-14,-17,-28,-50,-42,-6,-57,-74,-78) belonging to new sub-group IIf, having Zn finger motif (C-X4-C-X22-HXH) which was similar to Zn finger domain of group IN unlike group II members. While Group IC had 17 GgWRKYs belonging to group IC and 8 GgWRKYs from group IIc (GgWRKYs 11,-13,-48,-60,-80,-49,-23,-58). All the 25 GgWRKY members in Group IC displayed conserved motif 2 (DG[Y/F]RWRKYGQK), zinc finger pattern of C-X4-C-X23-HXH and high identity (70.4%) as shown in Fig. 4. The third group IIa had nine GgWRKY proteins (GgWRKYs40,-37,-62,-55,-56,-61,-52,-53,-54) displaying motif 3 (DGYQWRKYGQKVT[R/K] DN) and a zinc finger motif pattern of C-X5-C-X23-HNH having 86.2% identity, except GgWRKY62 which had C-X5-C-X13-HN Zn finger pattern. Group IIb had five GgWRKYs (GgWRKYs 20,-21,-44,-38,-39) with three conserved motifs, motif 4 (WRKYG[Q/K]K), motif 5 (PRAYYRC) and motif 6 (CPVRKQVQRC) with 85.8% identity, while Group IIc comprised of 11 sequences (GgWRKYs35,-36,-76,-51,-79,-75,-4,-10,-67,-68,-33) with conserved sequence motif 4 (WRKYG[Q/K]K) and a zinc finger motif pattern of C-X4-C-X23-HXH (77.6% identity). The 60 amino acid signature sequence of Group IId proteins comprising of 10 GgWRKYs, when aligned together showed only 32% identity (Fig. 5). Six of the ten members had zinc finger, while four (GgWRKYs 69,-70,-81,82) had no zinc finger present in them; instead they had 50 amino acid zinc cluster domain at the N terminal. Based on the conservancy, when this group was divided into two sub-groups IId1 (GgWRKYs 69,-70,-81,-82) & IId2 (GgWRKYs 72, -64,-71,-73,-63,-66,) the identity was significantly increased from 32% to 54.9% and 83.7%, respectively. The members clustered in group IId1 showed conserved motif 7(WRKYGQKPIKGSP) and no zinc finger at the C-terminal end, while subgroup IId2 displayed conservancy of three motifs- 7(WRKYGQKPIKGSP), 8(PRGYYKC) & 9(RGCPARKHVER) along with a common zinc finger pattern C-X5-C-X23-HNH (Fig. 5). Further, when conserved domain sequence of 60amino acid of all the 10 GgWRKYs of sub-group IId, was increased to 110 amino acids the two sub-groups (IId1&IId2) combined into a single group (IId) displaying 70.68% identity among all the members. We also confirmed the conservancy of each group and subgroup with the WRKY members belonging to A. thaliana and G. uralensis (Figs. 4 and 5). The MSA further proves the dynamic nature of GgWRKYs.
Figure 4

Multiple sequence alignment (MSA) of conserved GgWRKY domain. The alignment was performed using Clustal W program and displayed using DNAMAN software. Conserved motifs (1–10) and type of zinc-finger pattern are indicated within groups or sub-groups. Blue color represents 100% sequence identity. Pink color is for more than 75% while cyan color is for less than 75% sequence identity.

Figure 5

Multiple sequence alignment (MSA) profile of group IId (10 sequences). Initially conserved 60 amino acids region is used to build alignment that showed low sequence identity (32%). When it was separated in two groups (IId1&IId2), identity increased significantly (54.9& 83.7%). Sub-group IId1 (4 sequences) with sequences upstream to WRKY domain having Plant Zinc cluster with motif 7and no zinc finger; subgroup IId2 (6 sequences) with 7, 8, 9 motifs and Zn finger. When four sequences of sub group IId1 were extended 50 amino acids towards N’terminal (total110 amino acid), sequence identity of sub groupIId increased to 70.68%.

Multiple sequence alignment (MSA) of conserved GgWRKY domain. The alignment was performed using Clustal W program and displayed using DNAMAn class="Chemical">N software. Conserved motifs (1–10) and type of zinc-finger pattern are indicated within groups or sub-groups. Blue color represents 100% sequence identity. Pink color is for more than 75% while cyan color is for less than 75% sequence identity. Multiple sequence alignment (MSA) profile of group IId (10 sequences). Initially conserved 60 amino acids region is used to build alignment that showed low sequence identity (32%). When it was separated in two groups (IId1&IId2), identity increased significantly (54.9& 83.7%). Sub-group IId1 (4 sequences) with sequences upstream to WRKY domain having Plant Zinc cluster with motif 7and no zinc finger; subgroup IId2 (6 sequences) with 7, 8, 9 motifs and Zn finger. When four sequences of sub group IId1 were extended 50 amino acids towards n class="Chemical">N’terminal (total110 amino acid), sequence identity of sub groupIId increased to 70.68%.

Promoter analysis

The upstream region of 31 GgWRKY genes was examined for the presence of Cis-regulatory elements. Several stress-responsive elements like UV, salinity, ABA, GA signalling, etiolation, water stress, auxin and sulphur responsive elements were identified (Fig. 6). Also, several copies of WRKY binding motifs were identified in the promoter region of GgWRKY genes. The DNA binding WRKY motifs in the promoter region ranged from 1 (GgWRKY20, 23) to 11 (GgWRKYs 18 &62). Overall, twenty-seven GgWRKYs had three or more W-boxes in their promoter region. Observation revealed presence of multiple W-box elements mostly in the stress-related genes, which is following the earlier studies[6,12]. Additionally promoters of several n class="Chemical">glycyrrhizin biosynthesis genes (CYP88D6, CYP72A154 & squalene epoxidase) contained W boxes (unpublished data) suggesting regulatory role of WRKY in glycyrrhizin biosynthesis, thereby providing a platform to understand its regulation.
Figure 6

Analysis of cis-regulatory elements (CREs) in GgWRKY promoter region. Total ten stress responsive elements were mapped on sense and anti-sense strand using RSAT tool.

Analysis of cis-regulatory elements (CREs) in GgWRKY promoter region. Total ten stress responsive elements were mapped on sense and anti-sense strand using RSAT tool.

Protein-protein interaction

The protein-protein interaction of GgWRKYs was predicted by STRING[43] with n class="Species">A. thaliana (taxonomic ID 3702) as a model using Markov clustering (MCL) having inflation factor of 8.5. The STRING software is a prediction pipeline for deducing protein-protein associations from co-expression data and interaction conservation (Fig. S2; Supplementary File 3). It predicts interaction between orthologs in taxonomically different organism. The corresponding GgWRKY orthologs selected had more than 60% protein sequence homology having one WRKY domain (PF03106) as predicted by Pfam and three domains (IPR003657, IPR003657 and IPR017412) as analysed by INTERPRO. The analysis revealed 74, 08 and 03 GO term significantly enriched in biological processes, molecular function and cellular components, respectively (Supplementary File 4). The MCL clustering displayed 8 distinct groups, largest being associated with 8 WRKY proteins (red) showing strong interaction (AtWRKYs 15,22,11,33,40,53,30 &48) corresponding to predicted orthologs GgWRKYs 73, 29, 73, 15, 53, 32, 32 & 67, respectively (Fig. 7; Supplementary File 5). These specific associations indicated that these proteins jointly contributed to a shared function of cis or trans in nature as inferred from curated databases or experimentally determined data available in public domains[43]. The AtWRKYs and corresponding GgWRKYs were shown to be involved in various biological processes including ROS induced modulation, plant growth and osmotic stress (AtWRKY15/GgWRKY73), development (AtWRKY22/GgWRKY29), Jasmonic acid-induced response (AtWRKY11/GgWRKY73), wound-induced response and positive regulator of stress (AtWRKY33/GgWRKY15), senescence (AtWRKY40/GgWRKY53), leaf development and senescence (AtWRKY53/GgWRKY32), abiotic stress and senescence (AtWRKY30/GgWRKY32), and hormonal signal response and defense (AtWRKY48/GgWRKY67). Strong association between 8 AtWRKYs and corresponding ortholog GgWRKYs indicated co-regulation of several biological processes related to senescence, Jasmonate response, hormonal signaling and wound induced response (Supplementary File 5).
Figure 7

Protein-Protein interaction of GgWRKYs transcription factor based on AtWRKYs orthologs as predicted by STRING search tool.

Protein-Protein interaction of GgWRKYs transcription factor based on AtWRKYs orthologs as predicted by STRING search tool. The associated proteins need not physically connect in a protein-protein interaction of a specific step instead, they may form functional protein linkages especially in transcriptional or post-transcriptional regulation of a process. Also, it has been observed that evolutionarily related proteins usually maintain their three-dimensional structure, even when they have diverged[43,44]. This interaction between orthologs is expected to display high degree of interaction conservation more so in indirect or transient types of protein-protein associations. Based on protein conservancy of GgWRKYs with AtWRKYs, we assessed the putative functions of GgWRKYs and verified the expression profile of few of the predicted functions of GgWRKYs experimentally in Lab (Table 5) under n class="Disease">abiotic stress.
Table 5

AtWRKYs, their induction factor and experimentally verified responses in GgWRKYs.

S.NGgWRKYsOrthologsAtWRKYsOrthologs(% Identity)Reported Induction factor (AtWRKYs)Experimentally verified Induction factor (GgWRKYs)
1GgWRKY2AtWRKY453%P. syringae, Salicyclic acid (SA), Jasmonic acid (JA), sucrose, senescence, cold, salinitySenescence, Carbon starvation, NAA, GA3
2GgWRKY4AtWRKY2474%unknownsalinity, Carbon starvation
3GgWRKY5AtWRKY458%P. syringae, SA, JA, sucrose, senescence, cold, salinitySenescence, Salinity, dark, GA3
4GgWRKY8AtWRKY3361%

Salinity, mannitol, cold, heat,

H2O2, ozone, UV, chitin, B. cinerea, P. syringae, A. brassiciola

Senescence, Salinity, Carbon starvation, NAA, GA3
5GgWRKY14AtWRKY2053%unknownHeat, cold, Salinity. UV, GA3
6GgWRKY15AtWRKY3362%

Salinity, mannitol, cold, heat,

H2O2, ozone, UV, chitin, B. cinerea, P. syringae, A. brassiciola

Senescence, cold, Carbon starvation, NAA, GA3
7GgWRKY20AtWRKY6151%unknownGA3
8GgWRKY24AtWRKY6552%Fe starvationDark, heat, UV, Salinity
9GgWRKY29AtWRKY2264%H2O2, dark, chitin, flagellinSenescence, cold, Nwrky
10GgWRKY36AtWRKY2877%Salinity, mannitol, H2O2Salinity, heat, UV, dark, Carbon starvation
11GgWRKY38AtWRKY6177%unknownSenescence, dark, NAA, GA3
12GgWRKY40AtWRKY660%

H2O2, methyl viologen,

Pi and B starvation

Cold, dark
13GgWRKY44AtWRKY3167%unknownGA3
14GgWRKY45AtWRKY6964%unknownSenescence, dark, Carbon starvation, GA3
15GgWRKY51AtWRKY7175%unknownDark, Carbon starvation, GA3
16GgWRKY53AtWRKY4055%ABA signaling, SA, chitin, woundingWounding, cold, dark
17GgWRKY54AtWRKY4058%ABA signaling, SA, chitin, woundingSenescence, wounding, cold, dark, NAA, GA3
18GgWRKY55AtWRKY4043%ABA signaling, SA, chitin, woundingSenescence, UV, cold, NAA, GA3
19GgWRKY56AtWRKY951%unknownCold, dark, NAA
20GgWRKY57AtWRKY278%Salinity,manitol,ABASalinity, cold, dark, GA3
21GgWRKY58AtWRKY264%Salinity, manitol, ABACold, Carbon starvation, GA3
22GgWRKY59AtWRKY2054%unknownCold, GA3
23GgWRKY62AtWRKY1869%ABA, SASalinity, heat, cold
24GgWRKY69AtWRKY 2165%unknownCold, dark
25GgWRKY70AtWRKY 2163%unknownSalinity, cold, dark
AtWRKYs, their induction factor and experimentally verified responses in GgWRKYs. Salinity, mannitol, cold, heat, H2O2, n class="Chemical">ozone, UV, chitin, B. cinerea, P. syringae, A. brassiciola Salinity, mannitol, cold, heat, H2O2, n class="Chemical">ozone, UV, chitin, B. cinerea, P. syringae, A. brassiciola H2O2, n class="Chemical">methyl viologen, Pi and B starvation

Real-time expression analysis

The A. thaliana based protein conservancy for the functional prediction of putative orthologs in n class="Species">G. glabra was experimentally performed. The expression profile of twenty-five GgWRKY genes was investigated post-hormonal treatments (NAA & GA3) and under eight abiotic stress treatments including carbon starvation, salinity, heat, cold, dark, UV, senescence and wounding administered to the aerial tissues of the in-vitro cultured G. glabra plant. Out of the 25 GgWRKYs examined, eight GgWRKYs responded to the NAA treatment (Fig. 8). As can be seen from the heat map, transcripts of GgWRKYs 8, 15 & 29 accumulated maximum (4.1, 3.3 &1.6 folds, respectively) between 0.5 to 1.00 hrs, GgWRKY55 took longer (1.30 hrs) to display its maxima (17.5 folds). GgWRKYs 54 & 56 were mostly up-regulated all the time, while GgWRKYs 4 & 38 were down-regulated in the specified time of study. It seems GgWRKY56 & GgWRKY 38 had definite positive (3.3 folds) & negative regulatory (0.001folds) effects, respectively on the aerial tissues of the plant treated with auxin. GA3 treatment, on the other hand (Fig. 8), revealed GgWRKY 58 was highly up-regulated (257.3 folds) in the aerial tissues of the plant, while GgWRKY15 was up-regulated (43.6 fold) in the underground tissues of the plant as compared to the control. Most of the GgWRKYs responded within 1.5 hrs of GA3 treatment, except GgWRKY 20 which took longer to show their maxima (2.0 hrs) in the root tissues. The results inferred from the present study were compared with the earlier published reports on the functions of AtWRKYs[45] which are presented in Table 5. Of the 25 GgWRKYs assessed for abiotic stress treatment, maximum showed response to post-cold treatment (17), followed by dark (13). Nine GgWRKYs responded to senescence and salinity, while eight triggered a response on carbon starvation. Maximum number of GgWRKYs were up-regulated (10) after dark treatment followed by senescence (9). Darkness induced up-regulation of GgWRKYs 5, 24, 36, 38, 40, 45, 51, 53, 54 and 57, while GgWRKYs 56, 69 & 70 were down-regulated. Nine GgWRKYs 2, 5, 8, 15, 29, 38, 45, 54 & 55 were up-regulated during senescence, GgWRKYs 5, 14, 24 & 54 were down-regulated under saline conditions. The transcript levels of GgWRKYs 14, 24 & 36 were more under heat stress, while GgWRKY 24 was up-regulated on UV treatment. The injured plant showed up-regulated transcripts of GgWRKY 54, while GgWRKY53 was found to be down-regulated. Cold treated samples showed higher transcript levels of GgWRKYs 15, 53 & 54, while GgWRKYs14, 40, 55, 56, 57, 58, 59, 62, 69 &70 were down-regulated. The Carbon starved plants showed up-regulation of only GgWRKY51 while GgWRKYs 2, 4, 8, 15, 36, 45 & 58 were found to be down-regulated (Fig. 9). Significantly up-regulated (P ≤ 0.001) GgWRKYs were observed only in senescence (GgWRKYs 45&15), while in salinity, GgWRKY36 was significantly down-regulated. Out of the ten different treatments performed to assess the role of GgWRKYs in abiotic stress, predicted by STRING based on AtWRKY protein conservancy, the response of 11 GgWRKYs corroborated very well with 15 AtWRKYs whose functions were reported in literature (Table 5). Among the 25 GgWRKYs examined, 23 responded to abiotic stress, 17 were induced by hormone while 13 were common to both, suggesting role of hormone under stress conditions. Further study on these functionally assigned GgWRKYs will throw light on their role in underlying molecular mechanism. On comparing the experimental data with the STRING predicted data, it was found that our results corroborated well with the earlier reports on the induction of AtWRKY4 on senescence, AtWRKY40 on wounding, AtWRKYs 2, 28 &33 during salinity and AtWRKY33 under cold treatments. Few AtWRKYs whose functions were not assigned, like AtWRKYs 21, 24, 31, 38, 61, 69 and 71, were also designated putative function based on identity percentage.
Figure 8

GgWRKY genes are represented as rows and treatment time duration as columns in the matrix. Expression analysis of selected GgWRKY genes displaying differential expression pattern in shoot and roots under various hormonal stress. Heat map showing- Cluster analysis of GgWRKY genes according to their expression profiles in (a) shoots and (b) roots after GA3 treatment for 0.5, 1, 2, 4, 8, 12 and 24 h time interval; (c) Cluster analysis of GgWRKY genes according to their expression profiles in shoots after NAA treatment for 0.5, 1, 1.5, 2 and 3 h time interval.

Figure 9

Expression profiles of selected GgWRKY genes under eight different stresses. The Y-axis indicates relative expression level and X-axis indicates control shoot tissues (C) and treated shoot tissues (T). (a) expression patterns under etiolated conditions; (b–h) expression profiles under heat, UV, wounding, cold, dark, carbon starvation and salinity, respectively. Actin was used as internal reference. Three biological replicates were used to calculate error bars using standard deviation. Asterisks indicate that the corresponding gene was significantly up- or down regulated in a given treatment (*P ≤ 0.05; **P ≤ 0.01; ***P ≤ 0.001).

GgWRKY genes are represented as rows and treatment time duration as columns in the matrix. Expression analysis of selected GgWRKY genes displaying differential expression pattern in shoot and roots under various hormonal stress. Heat map showing- Cluster analysis of GgWRKY genes according to their expression profiles in (a) shoots and (b) roots after GA3 treatment for 0.5, 1, 2, 4, 8, 12 and 24 h time interval; (c) Cluster analysis of GgWRKY genes according to their expression profiles in shoots after n class="Chemical">NAA treatment for 0.5, 1, 1.5, 2 and 3 h time interval. Expression profiles of selected GgWRKY genes under eight different stresses. The Y-axis indicates relative expression level and X-axis indicates control shoot tissues (C) and treated shoot tissues (T). (a) expression patterns under etiolated conditions; (b–h) expression profiles under heat, UV, wounding, cold, dark, n class="Chemical">carbon starvation and salinity, respectively. Actin was used as internal reference. Three biological replicates were used to calculate error bars using standard deviation. Asterisks indicate that the corresponding gene was significantly up- or down regulated in a given treatment (*P ≤ 0.05; **P ≤ 0.01; ***P ≤ 0.001). In any biological process, understanding the role of transcription factor provides an insight into its regulatory mechanism. WRKY transcriptional factors have been extensively studied in a plant for plant growth, development, and response to biotic and abiotic stresses. However, WRKY genes present in n class="Species">Glycyrrhiza species have not been elucidated. In conclusion, we identified and characterized 147 full length putative WRKY genes in the genus Glycyrrhiza. These putative genes were grouped based on the number of WRKY domains & zinc finger pattern and further analysed for various properties like molecular weight, iso-electric point, instability index, sub-cellular location. The phylogenetic analysis categorised more than one group/sub-group together, indicating their multiple origins. The present paper highlights several findings not reported earlier, like the novel Zn finger motif of C-X4-CX22-HXH type (sub-group IIf). Also these group-II members shared homology with group IN WRKY members, unlike the other members of group II. This paper also reports several additional domains (DivIVA, SerS, Coat, Exo70 exo cyst complex subunit, Flac-arch super, PAT1and SGNH_hydrolase) apart from the conserved WRKY domain in the WRKY proteins. MSA based 60 amino acid signature sequence of group IId showed very low sequence identity (32%), however when its length was increased to 110 amino acid the identity increased to 70.7%. A closer look at the subgroup IId showed presence of 50 amino acid plant Zn cluster domain upstream to the WRKY domain in four members. However, characteristic Zn finger motif was absent in these members. Additionally, putative functions were assigned to the identified GgWRKYs, based on STRING database which comprised of both theoretically reported and experimentally verified data. Verification of the data in the Lab displayed 11 out of 15 functions as assigned. The study provides significant evidence to further investigate and validate the role of WRKYs in n class="Species">Glycyrrhiza species in growth, under stress condition and in secondary metabolite biosynthesis.

Materials and Methods

Identification and sequence annotation of WRKY genes

Transcriptome-wide identification of WRKY genes in G. glabra and n class="Species">G. uralensis transcriptome data was done by local similarity (tblastn) search and HMM profile methods. Initially, seventy AtWRKY proteins were downloaded from Arabidopsis Information Resource (TAIR; http://www.Arabidopsis.org/), and HMM profile of WRKY family with accession number PF03106 was retrieved from the Pfam protein family database (https://pfam.xfam.org/). The A. thaliana (AtWRKYs) and PF03106 profile were used as a query sequence to search against the transcriptome data of G. glabra and G. uralensis. An e-value cut off of 1e−50 was applied for the homologue recognition. Parsing the BLAST data from G. glabra, a total of 125 contig hits were found. All these contigs were further analysed in ORF finder (https://www.ncbi.nlm.nih.gov/orffinder/) to get the full-length CDS of 87 GgWRKY sequences. Publicly available transcriptome database of G. uralensis (http://ngs-data-archive.psc.riken.jp/Gur-genome/download.pl.) was used to get 60 GuWRKYs. The retrieved coding sequences (CDSs) were then translated by ExPASy translate (https://web.expasy.org/translate/) tool and validated using the Uniprot protein database (https://www.uniprot.org/), conserved domain database (https://www.ncbi.nlm.nih.gov/cdd/) and HMMScan (https://www.ebi.ac.uk/Tools/hmmer/search/hmmscan). The molecular weight (MW), Theoretical isoelectric point (pI), instability index, aliphatic index, Grand average of hydropathicity (GRAVY) of GgWRKY proteins were predicted via the ProtParam (http://web.expasy.org/protparam/). Additionally, subcellular localisation was also predicted by an advanced protein subcellular localisation prediction tool WoLFPSORT (https://wolfpsort.hgc.jp/).

Multiple sequence alignment, phylogenetic analysis and classification

The multiple sequence alignment (MSA) of 245 WRKY proteins was performed using 82 WRKY proteins of G. glabra (GgWRKY), 54 WRKY proteins from n class="Species">G. uralensis (GuWRKY), 70 from A. thaliana (AtWRKYs), 37 from P. patens (PpWRKYs) and one each from Human FLYWCH CRAa and GCMa. The protein sequences of Arabidopsis were downloaded from TAIR (http://www.Arabidopsis.org/), GuWRKYs from (http://ngs-data-archive.psc.riken.jp/Gur-genome/download.pl.), PpWRKYs were obtained from P. patens v3.3 (https://phytozome.jgi.doe.gov/pz/portal.html#!info?alias = Org_Ppatens), Human FLYWCH CRAa (EAW85440) and GCMa (BAA13651) were retrieved from NCBI (http://www.ncbi.nlm.nih.gov/protein/). The conserved regions of 60 amino acids for the WRKY proteins were searched using HMMScan and aligned using CLUSTALW for the construction of the phylogenetic tree. For the GgWRKY based phylogenetic tree, complete protein sequences were used. The tree was constructed using MEGA 7.0 with neighbor-joining method using JTT substitution model and pair-wise deletion method with 1000 bootstrap value. The 60 amino acid conserved region of MSA of G. glabra was visualised using DNAMAN. The MSA included conserved region of WRKY members representing each group and subgroup from A. thaliana and G. uralensis as reference.

Protein-protein interaction analysis and motif detection

The conserved motifs of GgWRKY proteins were analyzed using Multiple Expectation Maximization for Motif Elicitation (MEME: http://meme-suite.org/tools/meme) with the following parameters: minimum and maximum motif widths 6 and 50, respectively and the maximum number of motifs 15. Protein-protein interactions were predicted by STRING[43] with n class="Species">A. thaliana as model using Markov clustering with inflation factor of 8.5.

Analysis of cis-regulatory elements in GgWRKYs promoter regions

Promoter sequences of 31 GgWRKYs of up to 2.5 kb (kilobase) upstream to the transcription start site were retrieved manually (Supplementary File 6). These promoter sequences were used as queries to scan the presence of various Cis-regulatory elements in Plant Cis-acting Regulatory DNA Elements (PLACE, http://www.dna.affrc.go.jp/PLACE/)[46]. The position of identified CREs (biotic and n class="Disease">abiotic stress-responsive elements) was mapped on both sense and anti-sense strand using RSAT[47] (http://rsat.sb-roscoff.fr/feature-map_form.cgi) drawing tool.

Plant material and treatments

Five months old in-vitro cultured plants grown in SPB medium[48] under controlled conditions of 25 °C (±1.5) temperature and a 16 h light/8 h dark cycle (light intensity of 200mmol m–2 s–1), were exposed to various treatments including hormone, temperature, salinity, senescence and wounding. The plantlets, grown in liquid SPB medium were individually subjected to 50 µM auxin (NAA) for 0.5, 1.0, 1.5, 2.0 & 3.0 hrs, and 10 µM of n class="Chemical">gibberellin (GA3) treatments for 0.5, 1.0, 2.0, 4.0, 8.0, 12 & 24 hrs. Controls were sprayed with water. Different sets of plants were independently subjected to different abiotic treatments like NaCl (500 mM) for 72 hrs, dark, cold (4 °C) and heat (55 °C) treatments for 48, 24 and 8 hrs, respectively. For the Ultra-violet treatment, plants were kept under UV-C for 30 minutes. Mechanically injured plants were examined after 8 hrs of injury. Yellow aerial tissues of plants were used for senescence study, and Carbon starvation was given to the plant for 48 hrs in SPB medium having no sugar. All the respective controls were kept under culture conditions. The control and treated plants were harvested at the appropriate times as indicated, frozen in liquid nitrogen and stored at −80 °C for RNA extraction. Each treatment was used in triplicate and was repeated at least twice.

RNA extraction and quantitative real-time reverse transcription PCR (qRT-PCR)

Total RNA of control and treated shoots and root tissues were extracted using the Pure Link RNA Mini Kit (Invitrogen, US). RNA integrity was analysed on a 1.5% agarose gel and quantity was determined using a NanoDrop 2000C spectrophotometer (Thermo Scientific, USA). cDNA synthesis was carried out using SuperScript™ VILO™ cDNA Synthesis Kit (Thermo Scientific, USA). qRT-PCRs were performed using the SYBR Green PCR Master Mix (Takara, Japan) and carried out in triplicate for each tissue sample. Gene-specific RT primers were designed manually (Supplementary File 7). The amplicons size ranged between 200 to 250 bp. Actin gene was selected as an internal reference gene. The amplification was done in a ten μl reaction volume, which contained 5.0 μl of SYBR Green PCR Master Mix, 0.2 μl of each primer (10 pc), 0.2 μl of ROX, 1.0 μl cDNA template (150 ng/μl), and 3.4 μl ddH2O. PCRs with no-template controls were also performed for each primer pair. The real-time PCRs were performed employing7500 Fast Real-Time PCR System and software (Applied Biosystems, USA). All the PCRs were performed under following conditions: 30 sec at 95 °C, 3 sec at 95 °C, respective optimized Tm for 1 min (40 cycles) followed by 95 °C (15 seconds), 60 °C (30 sec) and 95 °C (15 sec) in MicroAmp fast reaction tubes (Applied Biosystems, USA). The specificity of amplicons was verified by melting curve analysis (55 to 95 °C) after 40 cycles. Supplementary information.
  21 in total

Review 1.  WRKY transcription factors.

Authors:  Paul J Rushton; Imre E Somssich; Patricia Ringler; Qingxi J Shen
Journal:  Trends Plant Sci       Date:  2010-03-19       Impact factor: 18.313

Review 2.  Regulation of specialized metabolism by WRKY transcription factors.

Authors:  Craig Schluttenhofer; Ling Yuan
Journal:  Plant Physiol       Date:  2014-12-10       Impact factor: 8.340

Review 3.  Protein-protein interactions in the regulation of WRKY transcription factors.

Authors:  Yingjun Chi; Yan Yang; Yuan Zhou; Jie Zhou; Baofang Fan; Jing-Quan Yu; Zhixiang Chen
Journal:  Mol Plant       Date:  2013-03-02       Impact factor: 13.164

4.  Expression profiles of the Arabidopsis WRKY gene superfamily during plant defense response.

Authors:  Jixin Dong; Chunhong Chen; Zhixiang Chen
Journal:  Plant Mol Biol       Date:  2003-01       Impact factor: 4.076

5.  Characterization of a cDNA encoding a novel DNA-binding protein, SPF1, that recognizes SP8 sequences in the 5' upstream regions of genes coding for sporamin and beta-amylase from sweet potato.

Authors:  S Ishiguro; K Nakamura
Journal:  Mol Gen Genet       Date:  1994-09-28

6.  The WRKY transcription factor superfamily: its origin in eukaryotes and expansion in plants.

Authors:  Yuanji Zhang; Liangjiang Wang
Journal:  BMC Evol Biol       Date:  2005-01-03       Impact factor: 3.260

Review 7.  WRKY proteins: signaling and regulation of expression during abiotic stress responses.

Authors:  Aditya Banerjee; Aryadeep Roychoudhury
Journal:  ScientificWorldJournal       Date:  2015-03-24

8.  A Genome-Wide Identification of the WRKY Family Genes and a Survey of Potential WRKY Target Genes in Dendrobium officinale.

Authors:  Chunmei He; Jaime A Teixeira da Silva; Jianwen Tan; Jianxia Zhang; Xiaoping Pan; Mingzhi Li; Jianping Luo; Jun Duan
Journal:  Sci Rep       Date:  2017-08-23       Impact factor: 4.379

9.  Activation of a gene network in durum wheat roots exposed to cadmium.

Authors:  Alessio Aprile; Erika Sabella; Marzia Vergine; Alessandra Genga; Maria Siciliano; Eliana Nutricati; Patrizia Rampino; Mariarosaria De Pascali; Andrea Luvisi; Antonio Miceli; Carmine Negro; Luigi De Bellis
Journal:  BMC Plant Biol       Date:  2018-10-16       Impact factor: 4.215

10.  Characterization of WRKY co-regulatory networks in rice and Arabidopsis.

Authors:  Stefano Berri; Pamela Abbruscato; Odile Faivre-Rampant; Ana C M Brasileiro; Irene Fumasoni; Kouji Satoh; Shoshi Kikuchi; Luca Mizzi; Piero Morandini; Mario Enrico Pè; Pietro Piffanelli
Journal:  BMC Plant Biol       Date:  2009-09-22       Impact factor: 4.215

View more
  5 in total

Review 1.  WRKY transcription factors: evolution, regulation, and functional diversity in plants.

Authors:  Pooja Goyal; Ritu Devi; Bhawana Verma; Shahnawaz Hussain; Palak Arora; Rubeena Tabassum; Suphla Gupta
Journal:  Protoplasma       Date:  2022-07-13       Impact factor: 3.186

2.  Molecular cloning and expression analysis of a WRKY transcription factor gene, GbWRKY20, from Ginkgo biloba.

Authors:  Tingting Zhou; Xiaoming Yang; Guibin Wang; Fuliang Cao
Journal:  Plant Signal Behav       Date:  2021-05-23

Review 3.  Transcription Factors in Plant Stress Responses: Challenges and Potential for Sugarcane Improvement.

Authors:  Talha Javed; Rubab Shabbir; Ahmad Ali; Irfan Afzal; Uroosa Zaheer; San-Ji Gao
Journal:  Plants (Basel)       Date:  2020-04-10

4.  Transcriptome-Wide Identification of WRKY Transcription Factors and Their Expression Profiles under Different Types of Biological and Abiotic Stress in Pinus massoniana Lamb.

Authors:  Sheng Yao; Fan Wu; Qingqing Hao; Kongshu Ji
Journal:  Genes (Basel)       Date:  2020-11-23       Impact factor: 4.096

5.  Characterization of the WRKY gene family in Akebia trifoliata and their response to Colletotrichum acutatum.

Authors:  Feng Wen; Xiaozhu Wu; Tongjian Li; Mingliang Jia; Liang Liao
Journal:  BMC Plant Biol       Date:  2022-03-14       Impact factor: 4.215

  5 in total

北京卡尤迪生物科技股份有限公司 © 2022-2023.