Genome wide analysis of Arabidopsis thaliana reveals high frequency of AAAGN7CTTT motif.

Rajesh Mehrotra, Vishesh Jain, Chandra Shekhar, Sandhya Mehrotra.

Abstract

Sequence-specific elements in DNA regulate transcription by recruiting transcription factors. The Dof proteins are a large family of transcription factors that share a single highly conserved zinc finger. The core to which Dof proteins bind has a consensus AAAG or ACTTTA sequence. These motifs are over-represented in many promoters. We performed a genome-wide analysis of AAAG repeat elements, increasing the spacer length from 0 to 25. A similar analysis was performed for AAAG-CTTT motifs. We report an unusually high frequency of AAAGN7CTTT in the Arabidopsis thaliana genome. We also conclude that there is a preference for A/G nucleotides in the spacer sequence between two AAAG repeats.

Keywords:  Arabidopsis thaliana; Cis element; Designer promoter; Dof; Genome

Year:  2014        PMID: 25606443      PMCID: PMC4288566          DOI: 10.1016/j.mgene.2014.05.003

Source DB:  PubMed          Journal:  Meta Gene        ISSN: 2214-5400


Introduction

Promoters frequently contain multiple functional regulatory elements (Wray et al., 2003). This raises an inherent question: how do redundancy and cis-element multiplicity evolve? Cis elements are non-coding DNA sequences present upstream of a gene that are required for proper spatio-temporal expression of the downstream gene. They contain binding sites for transcription factors. The Dof domain proteins are a typical example of plant-specific transcription factors (Riechmann et al., 2000, Yanagisawa and Sheen, 1998, Yanagisawa, 2002). Dof transcription factors bind to the core sequence AAAG, as shown by Vicente-Carbajosa et al. (1997) in a pull-down assay. Dof domain proteins have been shown to interact with another class of transcription factors (Zhang et al., 1995). We are interested in how transcription factors select their target-like sequences, which are scattered across the entire chromosome, and how they function at these sites. We have shown earlier that the minimal core sequences of commonly occurring cis elements can enhance promoter expression, even when used out of their native contexts (Mehrotra and Mehrotra, 2010, Mehrotra and Panwar, 2009, Mehrotra et al., 2005, Sawant et al., 2005). Using the ACGT core sequence, we showed that the evolution of cis-element multiplicity does not follow a probabilistic model (Mehrotra et al., 2012, Mehrotra et al., 2013). In this study we searched for multiplicity of the AAAG core sequence in the genome of Arabidopsis thaliana and report that AAAGn7CTTT is a preferred sequence in the genome. This information will be useful for designer promoters in which specific interactions could be directed.

Methodology

The objective was to determine the frequency of recurring sequences. Chromosome sequences were downloaded from the NCBI website (www.ncbi.nlm.nih.gov) and converted to single-line sequences using Notepad++. An ANSI C program was written first, and a program in Python 2.6.5 was later used to obtain the results. The code written is as follows:

[Listing: code to find the frequency of AAAG(A/G/C/T)AAAG, as in Table 1]
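The authors' listing does not appear in this text. As a rough illustration of the kind of count described, and not the authors' code, a minimal Python 3 sketch might look like the following. The function names and the toy FASTA record are hypothetical, and two choices are assumptions that the paper does not state: overlapping occurrences are counted, and only the given strand is scanned.

```python
import re

def fasta_to_single_line(fasta_text):
    """Drop FASTA header lines (starting with '>') and join the rest into one string."""
    lines = fasta_text.splitlines()
    return "".join(line.strip() for line in lines if not line.startswith(">")).upper()

def count_aaag_n_aaag(seq):
    """Count AAAG + one base + AAAG on the given strand.

    The zero-width lookahead lets overlapping hits be counted (an assumption).
    """
    return sum(1 for _ in re.finditer(r"(?=AAAG[ACGT]AAAG)", seq))

# Toy record standing in for a downloaded chromosome file (hypothetical data).
demo_fasta = """>toy_chromosome
TTAAAGAAAAGCCAAAG
GAAAGTAAAGTT
"""

seq = fasta_to_single_line(demo_fasta)
print(len(seq), count_aaag_n_aaag(seq))
```

Joining the lines in code replaces the manual single-line conversion step, so a motif that straddles a line break in the downloaded file is still found.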
Table 1

Frequency of two AAAG motifs separated by all possible distances (up to 25 bp), across the five chromosomes. ‘n’ represents the intervening distance between the motifs. The second column displays the value of ‘n’.

Motif       n   chr1   chr2   chr3   chr4   chr5    Total
AAAGnAAAG   0   3224   2171   2501   1934   2908   12,738
AAAGnAAAG   1   2951   1873   2282   1711   2499   11,316
AAAGnAAAG   2   3314   2088   2546   2302   3215   13,465
AAAGnAAAG   3   2635   1755   2112   1693   2390   10,585
AAAGnAAAG   4   2732   1751   2038   1594   2377   10,492
AAAGnAAAG   5   2577   1746   2076   1570   2256   10,225
AAAGnAAAG   6   2529   1792   2107   1541   2373   10,342
AAAGnAAAG   7   2407   1663   2278   1589   2278   10,215
AAAGnAAAG   8   2533   1644   2134   1548   2341   10,200
AAAGnAAAG   9   2201   1454   1720   1330   2026     8731
AAAGnAAAG  10   2148   1518   1737   1390   2067     8860
AAAGnAAAG  11   2308   1543   1719   1365   2073     9008
AAAGnAAAG  12   2169   1454   1763   1274   2021     8681
AAAGnAAAG  13   2194   1438   1671   1352   1939     8594
AAAGnAAAG  14   2497   1501   1909   1428   2080     9415
AAAGnAAAG  15   2172   1435   1738   1348   2022     8715
AAAGnAAAG  16   2556   1507   1789   1439   2142     9433
AAAGnAAAG  17   2482   1690   2028   1583   2331   10,114
AAAGnAAAG  18   2154   1459   1888   1345   1925     8771
AAAGnAAAG  19   2230   1476   1819   1378   1917     8820
AAAGnAAAG  20   2338   1553   1776   1418   2105     9190
AAAGnAAAG  21   2144   1430   1646   1296   1939     8455
AAAGnAAAG  22   2129   1308   1609   1267   1881     8194
AAAGnAAAG  23   2159   1467   1733   1435   2015     8809
AAAGnAAAG  24   2254   1443   1689   1549   1997     8932
AAAGnAAAG  25   2147   1504   1721   1400   1923     8695

[Listing: Python code to find the frequency of two AAAG/CTTT motifs separated by a 0–25 nt spacer]
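The spacer sweep behind Tables 1 to 4 can be sketched in Python 3 as below. This is an illustrative reconstruction under stated assumptions, not the authors' program: `spacer_profile` and the toy sequence are hypothetical, the spacer is restricted to unambiguous bases, overlapping hits are counted, and only the given strand is scanned.

```python
import re

def spacer_profile(seq, first="AAAG", second="AAAG", max_spacer=25):
    """Return {n: count} of `first`, any n bases, then `second`, for n = 0..max_spacer."""
    seq = seq.upper()
    profile = {}
    for n in range(max_spacer + 1):
        # [ACGT]{n} limits the spacer to A/C/G/T; the lookahead counts
        # overlapping hits. Both choices are assumptions.
        pattern = re.compile("(?=%s[ACGT]{%d}%s)" % (first, n, second))
        profile[n] = sum(1 for _ in pattern.finditer(seq))
    return profile

# Toy sequence (hypothetical data), not a real chromosome.
demo = "AAAGAAAGCCAAAGTTTTTTTCTTTGG"
same = spacer_profile(demo, "AAAG", "AAAG", max_spacer=8)
mixed = spacer_profile(demo, "AAAG", "CTTT", max_spacer=8)
print({n: c for n, c in same.items() if c})   # spacers with at least one AAAG...AAAG hit
print({n: c for n, c in mixed.items() if c})  # spacers with at least one AAAG...CTTT hit
```

Swapping the `first` and `second` arguments gives the CTTTnAAAG orientation, so the same function covers all four motif combinations tabulated in the paper.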

Results and discussions

AAAGn7CTTT sequence is highly preferred in A. thaliana genome

Dof proteins, which are typically composed of 200–400 amino acids, are defined as DNA-binding proteins that have a highly conserved Dof domain. The strong similarity among Dof DNA-binding domains suggested that all Dof proteins display similar DNA-binding specificity. Indeed, an AAAG sequence or its reverse-oriented sequence, CTTT, is always found in the binding sequences of individual Dof proteins (Chen et al., 1996, dePaolis et al., 1996, Kang and Singh, 2000, Mena et al., 1998, Plesch et al., 2001, Washio, 2001, Yanagisawa and Izui, 1993), except for a pumpkin Dof protein (AOBP) that recognizes an AGTA motif (Kisu et al., 1998). In A. thaliana, two AAAGs separated by one nucleotide form a known binding site for the OBP-1 protein (Yanagisawa, 2002). Similarly, clusters of AAAG sites have been shown to contribute additively to the guard cell specificity of the AtMYB60 promoter (Cominelli et al., 2011).

With the intention of discovering potential new Dof binding sites in A. thaliana, we determined the frequency of two AAAG or CTTT motifs separated by an increasing distance. AAAGAAAG without any spacer has a high occurrence of 12,738, as shown in Table 1 and Fig. 1. As the spacer length increased, the frequency of occurrence generally decreased. There was a slight increase in frequency for spacer lengths 14–17. Statistical analyses (data not shown) indicated that these increases were not significant, as the deviation was essentially within 10–15%. A similar trend was observed for CTTTnCTTT, as shown in Table 2 and Fig. 2.
Fig. 1

Frequency of two AAAG motifs separated by all possible distances till 25 bp across the five chromosomes of Arabidopsis thaliana.

Table 2

Frequency of two CTTT motifs separated by all possible distances (up to 25 bp), across the five chromosomes. ‘n’ represents the intervening distance between the motifs. The second column displays the value of ‘n’.

Motif       n   chr1   chr2   chr3   chr4   chr5    Total
CTTTnCTTT   0   3195   2086   2447   1896   2764   12,388
CTTTnCTTT   1   3192   1910   2262   1755   2627   11,746
CTTTnCTTT   2   3269   2168   2465   2139   3060   13,101
CTTTnCTTT   3   2648   1706   2150   1677   2497   10,678
CTTTnCTTT   4   2616   1688   2081   1547   2374   10,306
CTTTnCTTT   5   2582   1723   2078   1606   2452   10,441
CTTTnCTTT   6   2591   1740   2014   1612   2344   10,301
CTTTnCTTT   7   2402   1708   2183   1664   2292   10,249
CTTTnCTTT   8   2416   1729   2097   1576   2287   10,105
CTTTnCTTT   9   2286   1448   1836   1390   2053     9013
CTTTnCTTT  10   2260   1528   1698   1384   2108     8978
CTTTnCTTT  11   2326   1531   1770   1381   2212     9220
CTTTnCTTT  12   2231   1484   1683   1293   1939     8630
CTTTnCTTT  13   2143   1484   1683   1407   1896     8613
CTTTnCTTT  14   2360   1606   1837   1435   2136     9374
CTTTnCTTT  15   2493   1523   1656   1431   1978     9081
CTTTnCTTT  16   2227   1494   1829   1477   2327     9354
CTTTnCTTT  17   2402   1673   2043   1568   2320   10,006
CTTTnCTTT  18   2237   1482   1797   1318   1985     8819
CTTTnCTTT  19   2240   1444   1657   1353   2001     8695
CTTTnCTTT  20   2305   1555   1746   1402   2101     9109
CTTTnCTTT  21   2180   1459   1610   1402   2045     8696
CTTTnCTTT  22   2124   1401   1578   1278   1946     8327
CTTTnCTTT  23   2218   1527   1747   1361   2014     8867
CTTTnCTTT  24   2156   1428   1625   1321   1916     8446
CTTTnCTTT  25   2219   1457   1703   1373   1986     8738
Fig. 2

Frequency of two CTTT motifs separated by all possible distances till 25 bp, across the five chromosomes of Arabidopsis thaliana.

A very interesting observation was made when we looked at combinations of AAAG and CTTT sequences. An unexpectedly high frequency was observed for AAAGn7CTTT: 14,977 occurrences, more than twice the frequency for the preceding spacer length (n = 6), which is 7177, as shown in Table 3 and Fig. 3. However, when the orientation was changed to CTTTn7AAAG, this tendency was not observed, as shown in Table 4. One implication of this is that transcription factor binding is direction specific. Not all AAAG motifs in plant promoters are targets of the Dof domain proteins. However, since an AAAG and a CTTT motif separated by a distance of 7 bp are present at an exceptionally high frequency, we think it is highly likely that this sequence combination has a functional significance yet to be discovered.
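The size of the n = 7 peak can be checked directly from the published totals. The short Python 3 sketch below copies the genome-wide totals from Table 3 and the CTTTn7AAAG total from Table 4. The comparison against the median of the other spacer lengths is an illustrative choice made here, not a statistic reported by the authors.

```python
from statistics import median

# Genome-wide totals for AAAGnCTTT, copied from Table 3 (n = 0..25).
aaag_n_cttt = [9058, 5939, 4654, 5291, 5180, 5182, 7177, 14977, 6009, 6156,
               6241, 6484, 6244, 6102, 6383, 6626, 6697, 7064, 6736, 6101,
               6199, 6416, 6388, 6232, 6363, 6512]
# Genome-wide total for CTTTn7AAAG, copied from Table 4.
cttt_n7_aaag = 5926

n7 = aaag_n_cttt[7]
others = aaag_n_cttt[:7] + aaag_n_cttt[8:]  # every spacer length except n = 7

ratio_to_n6 = n7 / aaag_n_cttt[6]           # versus the preceding spacer length
ratio_to_median = n7 / median(others)       # versus a typical spacer length (assumed baseline)
ratio_to_reverse = n7 / cttt_n7_aaag        # versus the opposite orientation

print("n=7 vs n=6:              %.2f" % ratio_to_n6)
print("n=7 vs median of others: %.2f" % ratio_to_median)
print("n=7 vs CTTTn7AAAG:       %.2f" % ratio_to_reverse)
```

All three ratios exceed 2, which is consistent with the statement that the n = 7 count is more than twice that of its predecessor.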
Table 3

Frequency of an AAAG and a CTTT motif separated by all possible distances (up to 25 bp), across the five chromosomes. ‘n’ represents the intervening distance between the motifs. The second column displays the value of ‘n’.

Motif       n   chr1   chr2   chr3   chr4   chr5    Total
AAAGnCTTT   0   2379   1352   1437   1570   2320     9058
AAAGnCTTT   1   1504    910   1236    905   1384     5939
AAAGnCTTT   2   1187    792    921    736   1018     4654
AAAGnCTTT   3   1398    903    993    792   1205     5291
AAAGnCTTT   4   1199    842    992    957   1190     5180
AAAGnCTTT   5   1308    863    995    795   1221     5182
AAAGnCTTT   6   1853   1205   1396   1069   1654     7177
AAAGnCTTT   7   3827   2482   2854   2358   3456   14,977
AAAGnCTTT   8   1546    990   1201    922   1350     6009
AAAGnCTTT   9   1534   1026   1197    994   1405     6156
AAAGnCTTT  10   1674   1050   1183    968   1366     6241
AAAGnCTTT  11   1544   1006   1218   1083   1633     6484
AAAGnCTTT  12   1620    976   1201    977   1470     6244
AAAGnCTTT  13   1557   1033   1180    974   1358     6102
AAAGnCTTT  14   1660   1081   1245   1012   1385     6383
AAAGnCTTT  15   1687   1119   1309   1032   1479     6626
AAAGnCTTT  16   1664   1107   1335   1016   1575     6697
AAAGnCTTT  17   1715   1092   1251   1135   1871     7064
AAAGnCTTT  18   1685   1119   1508    970   1454     6736
AAAGnCTTT  19   1518    992   1189   1014   1388     6101
AAAGnCTTT  20   1649   1040   1231    925   1354     6199
AAAGnCTTT  21   1673   1111   1269    951   1412     6416
AAAGnCTTT  22   1635   1046   1298    955   1454     6388
AAAGnCTTT  23   1548   1066   1196    937   1485     6232
AAAGnCTTT  24   1631   1059   1243    969   1461     6363
AAAGnCTTT  25   1655   1081   1278    977   1521     6512
Fig. 3

Frequency of CTTT and AAAG motifs separated by all possible distances till 25 bp, across the five chromosomes of Arabidopsis thaliana.

Table 4

Frequency of a CTTT and an AAAG motif separated by all possible spacer distances (up to 25 bp), across the five chromosomes. ‘n’ represents the intervening distance between the motifs. The second column displays the value of ‘n’.

Motif       n   chr1   chr2   chr3   chr4   chr5    Total
CTTTnAAAG   0    871    587    623    505    749     3335
CTTTnAAAG   1   1206    878   1005    736   1162     4987
CTTTnAAAG   2   1356    915   1097    833   1191     5392
CTTTnAAAG   3   1289    866   1065    818   1076     5114
CTTTnAAAG   4   1341    917   1381    871   1318     5828
CTTTnAAAG   5   1354    861   1039    785   1212     5251
CTTTnAAAG   6   1552   1003   1107    884   1403     5949
CTTTnAAAG   7   1461   1021   1118    924   1402     5926
CTTTnAAAG   8   1442    939   1100    832   1279     5592
CTTTnAAAG   9   1659   1052   1235    975   1375     6296
CTTTnAAAG  10   1635    941   1223    885   1406     6090
CTTTnAAAG  11   1689   1144   1590   1082   1635     7140
CTTTnAAAG  12   1697   1131   1279   1149   1854     7110
CTTTnAAAG  13   1606   1054   1223   1003   1544     6430
CTTTnAAAG  14   1433   1004   1117    878   1414     5846
CTTTnAAAG  15   1553   1060   1191    853   1442     6099
CTTTnAAAG  16   1703   1154   1214   1056   1471     6598
CTTTnAAAG  17   1629   1037   1203    961   1450     6280
CTTTnAAAG  18   1595   1016   1199   1002   1525     6337
CTTTnAAAG  19   1680   1135   1271    976   1484     6546
CTTTnAAAG  20   1579   1067   1256    957   1451     6310
CTTTnAAAG  21   1545   1040   1239    929   1392     6145
CTTTnAAAG  22   1632   1124   1251   1011   1459     6477
CTTTnAAAG  23   1541   1096   1265    998   1356     6256
CTTTnAAAG  24   1619   1002   1106    969   1406     6102
CTTTnAAAG  25   1650   1120   1201    952   1414     6337

A and G are preferred as flanking nucleotides

We were interested in which residues predominate in the sequence flanking AAAG. Such analyses are important because many studies indicate that flanking sequences are very important for binding specificity (Foster et al., 1994, Izawa et al., 1993). We changed one nucleotide at a time following AAAG. As shown in Table 5, A and G predominate as flanking residues, although there is an exception when the two AAAG motifs are separated by one nucleotide: there the frequency of G is 1918, which is less than that of C (2057). In all other cases G dominates as a flanking sequence over C and T.
Table 5

Frequency of flanking nucleotides between two AAAG motifs separated by a spacer of increasing length, across the five chromosomes. The first column gives the spacer sequence between the two AAAG motifs.

Spacer   chr1   chr2   chr3   chr4   chr5   Total
A        1480    962   1143    885   1264    5734
G         487    306    399    296    430    1918
C         551    330    413    286    477    2057
T         433    275    327    244    328    1607

AA        725    441    574    403    646    2789
AG        402    248    286    234    327    1479
AC        189    124    135    115    366     929
AT        196    118    178    123    173     788

AAA       286    170    189    156    250    1051
AAG       152    105    143     96    136     632
AAC        74     41     59     48     55     277
AAT        65     37     28     36     43     209

AAAA      187     98    109     81    141     616
AAAG      122     79     95     65    118     479
AAAC       52     35     43     23     31     184
AAAT       33     18     23     19     38     131

AAAAA      76     42     69     49     66     302
AAAAG      33     26     45     30     37     171
AAAAC      22     21     13     14     15      85
AAAAT      15     13     11      6     18      63

AAAAAA     55     37     45     28     43     208
AAAAAG     20     13     18      9     27      87
AAAAAC      4      6      5      6     14      35
AAAAAT      8      1      6      6      2      22

AGA       102     85     87     72    108     454
AGG        47     32     39     25     49     192
AGC        32     30     30     27     26     145
AGT        55     34     47     41     59     236
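A tally like Table 5 can be sketched by capturing the spacer itself rather than only counting matches. The Python 3 function below is a hypothetical illustration under the same assumptions as before (overlapping hits counted, given strand only, spacer limited to A/C/G/T). It is not the authors' code, and the toy sequence is invented for the example.

```python
import re
from collections import Counter

def spacer_composition(seq, spacer_len, motif="AAAG"):
    """Tally the exact spacer sequences of length `spacer_len` between two copies of `motif`."""
    seq = seq.upper()
    # The capture group inside the lookahead records the spacer while still
    # allowing overlapping matches (an assumption about how hits are counted).
    pattern = re.compile("(?=%s([ACGT]{%d})%s)" % (motif, spacer_len, motif))
    return Counter(m.group(1) for m in pattern.finditer(seq))

# Toy sequence (hypothetical data) with AAAG at positions 0, 6, 12 and 18.
demo = "AAAGAAAAAGAGAAAGAAAAAG"
composition = spacer_composition(demo, 2)
for spacer, count in composition.most_common():
    print(spacer, count)
```

Filtering the resulting counter by prefix (for example, spacers beginning with A or with AG) would give row groups of the kind shown in Table 5.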

Conclusions

The promoter regions of many genes contain multiple binding sites for the same transcription factor. One possibility is that individuals with multiple, redundant binding sites have higher fitness. Cis-regulatory element multiplicity has been correlated with several gene properties; for example, promoters containing multiple sites evolve more slowly. In this paper we focused on the multiplicity of the AAAG sequence with varied spacer lengths, and also in combination with the CTTT sequence. We report that AAAGn7CTTT is a preferred sequence in the genome of A. thaliana. This information will be useful for designer promoters in which specific interactions could be directed.

References (10 of 23 listed)

1. Shuichi Yanagisawa. The Dof family of plant transcription factors. Trends Plant Sci. 2002-12. (Review)

2. Samir V Sawant, Kanti Kiran, Rajesh Mehrotra, Chandra Prakash Chaturvedi, Suraiya A Ansari, Pratibha Singh, Niraj Lodhi, Rakesh Tuli. A variety of synergistic and antagonistic interactions mediated by cis-acting DNA motifs regulate gene expression in plant cells and modulate stability of the transcription complex formed on a basal promoter. J Exp Bot. 2005-07-12.

3. W Chen, G Chao, K B Singh. The promoter of a H2O2-inducible, Arabidopsis glutathione S-transferase gene contains closely linked OBF- and OBP1-binding sites. Plant J. 1996-12.

4. A De Paolis, S Sabatini, L De Pascalis, P Costantino, I Capone. A rolB regulatory factor belongs to a new class of single zinc finger plant proteins. Plant J. 1996-08.

5. R Foster, T Izawa, N H Chua. Plant bZIP proteins gather at ACGT elements. FASEB J. 1994-02. (Review)

6. G Plesch, T Ehrhardt, B Mueller-Roeber. Involvement of TAAAG elements suggests a role for Dof transcription factors in guard cell-specific gene expression. Plant J. 2001-11.

7. M Mena, J Vicente-Carbajosa, R J Schmidt, P Carbonero. An endosperm-specific DOF protein from barley, highly conserved in wheat, binds to and activates transcription from the prolamin-box of a native B-hordein promoter in barley endosperm. Plant J. 1998-10.

8. Y Kisu, T Ono, N Shimofurutani, M Suzuki, M Esaka. Characterization and expression of a new class of zinc finger protein that binds to silencer region of ascorbate oxidase gene. Plant Cell Physiol. 1998-10.

9. Rajesh Mehrotra, Amit Yadav, Purva Bhalothia, Ratna Karan, Sandhya Mehrotra. Evidence for directed evolution of larger size motif in Arabidopsis thaliana genome. ScientificWorldJournal. 2012-05-03.

10. Rajesh Mehrotra, Sachin Sethi, Ipshita Zutshi, Purva Bhalothia, Sandhya Mehrotra. Patterns and evolution of ACGT repeat cis-element landscape across four plant genomes. BMC Genomics. 2013-03-25.