Computational identification and analysis of MADS box genes in Camellia sinensis.

Madhurjya Gogoi, Sangeeta Borchetia, Tanoy Bandyopadhyay

Abstract

MADS (Minichromosome Maintenance1 Agamous Deficiens Serum response factor) box genes encode transcription factors that play a key role in the growth and development of flowering plants. There are two types of MADS box genes: Type I (serum response factor (SRF)-like) and Type II (myocyte enhancer factor 2 (MEF2)-like). Type II MADS box genes have a conserved MIKC domain (MADS DNA-binding domain, intervening domain, keratin-like domain, and C-terminal domain) and have been extensively studied in plants. Compared to other plants, very little is known about MADS box genes in Camellia sinensis. The present study aims at identifying and analyzing the MADS box genes present in Camellia sinensis. A comparative bioinformatics and phylogenetic analysis of Camellia sinensis sequences together with Arabidopsis thaliana MADS box sequences available in public databases led to the identification of 16 genes orthologous to Type II MADS box gene family members. The protein sequences were classified into distinct clades associated with the conserved functions of flower and seed development. The identified genes may be used in gene expression and gene manipulation studies to elucidate their role in the development and flowering of tea, which may pave the way to improving crop productivity.

Keywords:  Bioinformatics; Crop productivity; Flowering; MADS box genes; Tea; Transcription factor

Year:  2015        PMID: 25914445      PMCID: PMC4403032          DOI: 10.6026/97320630011115

Source DB:  PubMed          Journal:  Bioinformation        ISSN: 0973-2063


Background

MADS (Minichromosome Maintenance1 Agamous Deficiens Serum response factor) box genes form one of the best-studied transcription factor families. They are key regulators of development in almost all groups of eukaryotes and play an important role in the growth and development of flowering plants [1]. MADS box proteins possess a highly conserved DNA-binding MADS domain of around 60 amino acids. MADS box genes are divided into two types: type I (serum response factor (SRF)-like) and type II (myocyte enhancer factor 2 (MEF2)-like). Type II has a conserved MIKC domain (MADS DNA-binding domain, intervening domain, keratin-like domain, and C-terminal domain). Only a few type I MADS box genes have been functionally characterized, whereas type II MADS box genes have been extensively studied in plants [2]. Considerable progress has been made in deciphering the molecular mechanisms involved in the floral transition [3]. Furthermore, the complete genome sequence of Arabidopsis provided a clearer picture of the complexity and diversity of MADS box genes [4]. Many MADS box genes involved in various steps of the transition from vegetative to reproductive growth have been identified. Most flowering genes encode transcription factors with a MADS box domain. Compared to other plants, very little is known about the MADS box genes in Camellia sinensis. The floral buds of C. sinensis were found to be a major sink of assimilates produced by the maintenance foliage and are considered a limiting factor in the proper partitioning of assimilates. Root starch was found to be enriched when flower buds were controlled, which induced more vegetative growth. Considering the role of MADS box genes in flowering and the possible implications for improving tea productivity by controlling flowering through gene manipulation, the present study aimed at identifying and analyzing the MADS box genes present in Camellia sinensis.
Comparative bioinformatics and phylogenetic analysis identified the probable orthologs of Arabidopsis MADS box genes in tea. The protein sequences of the identified genes were classified into distinct clades and were found to be associated with the conserved functions of flower and seed development. Biotechnological interventions targeting the identified MADS box genes may elucidate their role in the flowering of tea and may also lead to an increase in tea crop productivity.

Methodology

Database search of MADS box sequences:

The NCBI NR, NCBI dbEST and NCBI TSA databases were searched for Camellia sinensis MADS box sequences using the tblastn module of NCBI BLAST. The query sequence was the band consensus of the MADS region, generated by the COBBLER program (Consensus Biasing by Locally Embedding Residues) [5] from the published MADS box sequences of Arabidopsis thaliana. Camellia sinensis BLAST hits with significant similarity (E-value cutoff 1e-15) and a score greater than 100 were selected. The reads obtained from the BLAST hits were combined, clustered and assembled using the CAP3 program to form contigs and singletons. Contig names were prefixed with CsC and singleton names with CsS, followed by a number. To define the putative coding frame of the transcripts, the NCBI ORF Finder tool was used. The open reading frame of each transcript was determined and the corresponding protein sequence was retrieved. Sequences that were partial or had an incomplete ORF were discarded from further analysis.
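The hit-selection step described above can be sketched in Python. This is only an illustration, not the authors' pipeline: it assumes the tblastn results were saved in the standard 12-column BLAST+ tabular layout (`-outfmt 6`) and interprets "score" as the bit score. The function name `select_hits` and the sample rows are hypothetical.

```python
def select_hits(tabular_lines, max_evalue=1e-15, min_score=100.0):
    """Return subject IDs of hits passing the E-value and score thresholds.

    Assumes the default 12 tab-separated columns of BLAST+ tabular output:
    qseqid sseqid pident length mismatch gapopen qstart qend sstart send
    evalue bitscore. Treating "score" as the bit score is an assumption.
    """
    selected = []
    for line in tabular_lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment lines
        fields = line.split("\t")
        sseqid = fields[1]
        evalue = float(fields[10])
        bitscore = float(fields[11])
        # Keep each subject once, in order of first appearance.
        if evalue <= max_evalue and bitscore > min_score and sseqid not in selected:
            selected.append(sseqid)
    return selected


# Hypothetical rows: IDs and values are made up for illustration only.
example = [
    "MADS_consensus\tread_A\t85.0\t60\t9\t0\t1\t60\t10\t189\t2e-30\t118.0",
    "MADS_consensus\tread_B\t60.0\t55\t22\t0\t1\t55\t4\t168\t1e-08\t52.4",
    "MADS_consensus\tread_C\t90.0\t60\t6\t0\t1\t60\t31\t210\t5e-20\t96.3",
]
print(select_hits(example))
```

In this made-up input only `read_A` satisfies both thresholds: `read_B` fails the E-value cutoff and `read_C` fails the score cutoff. The selected reads would then be passed to the assembly step.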

Conserved domain identification:

To verify the presence of the conserved MADS box domain, the protein sequences were inspected with the NCBI Batch CD-Search program [6], and sequences without a MADS box domain were discarded.

Phylogenetic analysis:

For the phylogenetic analysis, Arabidopsis thaliana MADS box gene sequences and the corresponding protein sequences were retrieved from the TAIR database (The Arabidopsis Information Resource) based on keyword search and published gene sequences. Amino acid sequences were used because they are more conserved than the highly variable nucleotide sequences. The dataset contained the predicted Camellia sinensis MADS box protein sequences and the Arabidopsis thaliana MADS box protein sequences. MEGA 5 software [7] was used for the phylogenetic analysis. Sequence alignment was performed in MEGA using ClustalW, and the phylogenetic tree was obtained using the Neighbor-joining method [8] with Poisson distances and the pairwise deletion option. To assess the reliability of the tree, 1000 bootstrap replications were performed.
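The distance measure named above can be illustrated with a short Python sketch. It assumes the usual Poisson correction, d = -ln(1 - p), where p is the proportion of differing amino acid sites, and models pairwise deletion by skipping, for each pair, only the aligned sites where either sequence has a gap or missing-data symbol. The function name and the toy sequences are hypothetical, and this is not MEGA's own implementation.

```python
import math


def poisson_distance(seq_a, seq_b, gap_chars="-?"):
    """Poisson-corrected distance between two aligned protein sequences.

    Sites where either sequence has a gap or missing-data symbol are
    ignored for this pair only (pairwise deletion).
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    different = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a in gap_chars or b in gap_chars:
            continue  # pairwise deletion: drop this site for this pair
        compared += 1
        if a != b:
            different += 1
    if compared == 0:
        raise ValueError("no comparable sites between the two sequences")
    p = different / compared
    if p >= 1.0:
        return math.inf  # the correction is undefined when every site differs
    return -math.log(1.0 - p)


# Toy aligned fragments (made up for illustration).
d_full = poisson_distance("MGRGKIEIKR", "MGRGKVEIKR")  # 1 of 10 sites differ
d_gap = poisson_distance("MGRG-IEIKR", "MGRGKVEIKR")   # 1 of 9 compared sites differ
print(d_full, d_gap)
```

A matrix of such pairwise distances is the input that a neighbor-joining procedure works from. Bootstrap support is then typically estimated by resampling alignment columns with replacement and rebuilding the tree for each replicate.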

Motif identification and analysis:

The MEME program [9] was used to identify the motif clusters present in the MADS box sequences. The order of sequences in the phylogenetic tree was maintained in the input file for the MEME program to facilitate the observation of common motifs between closely related sequences.
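For readers who want to reproduce a comparable run locally, the settings reported in the Figure 2 caption (zero or one occurrence per sequence, at most 20 motifs, motif width 6 to 300) map onto command-line options of the MEME Suite `meme` program. The Python sketch below only assembles such a command. It assumes a local MEME installation, whereas the study cites the web server, and the file and directory names are hypothetical.

```python
import shlex


def build_meme_command(fasta_path, out_dir):
    """Assemble a `meme` command line mirroring the reported settings.

    Assumes a locally installed MEME Suite; paths are placeholders.
    """
    return [
        "meme", fasta_path,
        "-protein",          # input sequences are proteins
        "-mod", "zoops",     # zero or one motif occurrence per sequence
        "-nmotifs", "20",    # maximum number of motifs
        "-minw", "6",        # minimum motif width
        "-maxw", "300",      # maximum motif width
        "-oc", out_dir,      # output directory (overwritten if present)
    ]


# Hypothetical input file holding sequences in phylogenetic-tree order.
cmd = build_meme_command("mads_tree_order.fasta", "meme_out")
print(shlex.join(cmd))
```

The resulting list can be passed to `subprocess.run(cmd, check=True)` on a machine where `meme` is on the PATH. Keeping the FASTA records in tree order, as the text describes, means the motif diagram lists related sequences next to each other.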

Results & Discussion

Assembly of the sequences using CAP3 resulted in 13 contigs and 13 singletons. Based on the results of the NCBI Batch CD-Search program, sequences without a MADS box domain were discarded. Finally, only 8 contigs and 8 singletons were used for further analysis. Except for two contigs (CsC7 and CsC8) and three singletons (CsS5, CsS6 and CsS7), all sequences were found to be complete at both the N and C termini (Table 1, see supplementary material). However, all the sequences contained a complete MADS box domain, including the two contigs and three singletons mentioned above. Accession numbers of the contigs and singletons are provided in Table 2 (see supplementary material). The phylogenetic tree (Figure 1), comprising MADS box protein sequences of Arabidopsis and C. sinensis, showed that the sequences clustered into different groups. All the transcripts from C. sinensis grouped with different subfamilies of type II MADS box proteins (Figure 1). The grouping could be further visualized through the motif grouping results (Figure 2) of the MEME program.
Figure 1

Phylogenetic Tree: Phylogenetic tree constructed from MADS box protein sequences of Camellia sinensis and published Arabidopsis thaliana MADS box protein sequences. The Neighbor-joining method was used with Poisson distances and the pairwise deletion option to construct the tree. Bootstrap values smaller than 50% were omitted and the corresponding branches were merged. The Camellia sinensis protein sequences, together with the Arabidopsis thaliana protein sequences, grouped mainly into seven subfamilies (square brackets cover the subfamily members). The colours in the phylogenetic tree are used to graphically distinguish the subfamilies.

Figure 2

Graphic representation showing the complete motif groupings of Camellia sinensis and Arabidopsis thaliana MADS box sequences obtained using the MEME program (Multiple EM for Motif Elicitation, http://meme.sdsc.edu/meme/meme.html). The parameters used were: distribution of motif occurrences, zero or one per sequence; maximum number of motifs, 20; maximum motif width, 300; and minimum motif width, 6.

AGL2 subfamily:

The AGL2 subfamily is sister to the AGL6 subfamily, with only one transcript from C. sinensis clustering with it (Figure 1). The CsS4 transcript appears to be homologous to AGL2 and AGL4 and, together with AGL3 and AGL9, forms one clade (Figure 1). For this subfamily, the motif grouping among sequences reflected this conservation (Figure 2). Studies have indicated that the AGL2 gene may play a fundamental role in floral organ identity and development as well as in seed and embryo development [3]. By suppressing the expression of the native AGL2 gene and other regulatory elements linked to it through biotechnological approaches such as antisense, co-suppression or gene replacement, a delay in flowering may be achieved. This may lengthen the vegetative phase and thus increase vegetative tissue yield, particularly in foliage crops.

AGL6 subfamily:

The contig CsC6 clustered with the AGL6 subfamily and appears to be very closely related to AGL6 and AGL13 (Figure 1). The overlapping role of the AGL2 and AGL6 subfamily genes is also evident from the grouping of these subfamilies under one superclade in the phylogenetic tree (Figure 1). Sequences within the subfamily showed high similarity in motif grouping (Figure 2). In Arabidopsis, AGL6 and AGL13 belong to the AGL6 subfamily. Studies have shown that the AGL6 subfamily plays a key role in regulating floral organ identity [10] and floral meristem determinacy in rice, maize and Petunia hybrida. It also controls the circadian clock and is involved in the negative regulation of the FLC/MAF subfamily genes and the positive regulation of FT genes [11].

SQUA-Like subfamily:

Two contigs, CsC3 and CsC4, grouped with the SQUA-Like subfamily (Figure 1). CsC3 appears to be homologous to Fruitfull (FUL), and CsC4 to Cauliflower (CAL) and Apetala 1 (AP1). The functions of MADS box genes of the SQUA-Like subfamily include controlling the transition from vegetative to reproductive growth, determining floral organ identity and regulating fruit maturation [12]. Apetala 1 (AP1) is one of the members of this subfamily. Together with the FUL and CAL genes, AP1 acts redundantly to control inflorescence architecture and meristem identity. Constitutive overexpression of the AP1 gene led to early flowering in transgenic Chrysanthemum plants [13]. Studies can be undertaken to downregulate or knock down AP1 and related genes to see whether this would extend the vegetative phase of foliage crops such as tea.

FLOWERING LOCUS C (FLC) subfamily:

Two transcripts from C. sinensis, CsC8 and CsS6, formed another small group that is homologous to FLC (AGL25). These two transcripts, along with the FLC (AGL25), MAF1 (AGL27) and MAF2 (AGL31) genes of Arabidopsis, formed one clade representing the FLC subfamily in the phylogenetic tree (Figure 1). FLC is a repressor of the flowering transition, and other members of the FLC-like subfamily are also directly involved in the response of flowering to seasonal environmental factors. FLC also controls a major life-history transition, seed germination, and its expression is associated with natural variation in temperature-dependent germination [14]. Transgenic Chinese cabbage overexpressing Brassica rapa FLC showed delayed flowering and remained in the vegetative phase for a longer time [15]. Thus, the FLC gene appears to be an important target for genetic manipulation to increase biomass and obtain high yields of vegetative tissue.

AG subfamily:

The two transcripts CsC1 and CsS1 clustered with the AG subfamily and were homologous to AGAMOUS (AT4G18960) of Arabidopsis (Figure 1). The motif grouping pattern (Figure 2) is highly similar among the two C. sinensis transcripts and Arabidopsis AGAMOUS (AT4G18960). The floral homeotic gene AGAMOUS (AG), a class C gene of the MADS box transcription factor family, is necessary for the specification and development of stamens and carpels as well as for floral meristem determinacy [16]. In Arabidopsis, it interacts with other MADS box proteins and plays an important role in the induction of reproductive organ development [17]. Besides AGAMOUS, a few more genes, namely AGL1, AGL5, AGL11 and AGL12 (Figure 1), constitute the AG subfamily. In situ hybridization studies have shown that the AGAMOUS gene is strongly expressed only in the third and fourth floral whorls, after floral bud formation and just before the primordia of stamens and carpels arise [3]. It does not interfere with the normal vegetative growth of the plant and thus may be a suitable target for genetic modification for crop improvement.

Tomato MADS box transcription factor3 (TM3) like genes subfamily:

A total of four transcripts, CsC2, CsC5, CsS2 and CsS3, grouped with the TM3-like subfamily (Figure 1) and shared common motif groupings (Figure 2). All four transcripts appeared to be homologous to AGL20/SOC1. The TM3-like subfamily clade also contains the AGL14, AGL42 and AGL71 genes of Arabidopsis. Many members of this subfamily are expressed in both vegetative and reproductive organs of angiosperms [18]. The SOC1 gene of the TM3 subfamily is regulated by several pathways and coordinates the responses to environmental signals. As SOC1 acts as a major hub in the regulatory network of floral timing and development [19], it may be an important target for biotechnological intervention for crop improvement through overexpression or underexpression of the SOC1 polypeptide. Overexpression of SOC1 may result in early flowering and increased flower and fruit production, whereas underexpression may benefit foliage crops by extending the duration of the vegetative phase.

STMADS11-Like (SVP - SHORT VEGETATIVE PHASE):

SVP is another major subfamily, with which four transcripts from C. sinensis grouped (Figure 1). CsC7 and CsS7 appear to be homologs of the AGL24 gene, whereas CsS5 and CsS8 were homologous to the AGL22 (SVP) gene. The SVP gene plays an important role in two developmental phases of plants. During the vegetative phase it acts as a repressor of the floral transition, and later it plays a vital role in the specification of the floral meristem [20]. Altering the expression of the SVP gene in C. sinensis through genetic modification may lead to late flowering and an extended vegetative growth phase.

Conclusion

In this study, the presence of type II MADS domain proteins and the absence of type I MADS domain proteins were observed in Camellia sinensis. In future, studies can be undertaken to understand the molecular mechanisms of MADS box proteins in the growth, development and stress responses of C. sinensis, and steps can also be taken to develop transgenic or cisgenic plants without seeds. The MADS box gene family appears to be a promising target for obtaining such plants, considering its role in the reproductive development of plants. A seedless genetically modified tea plant with desired characters, if obtained, might be highly welcomed by the tea community, provided that the ethical issues relating to transgenics are resolved. Seedless transgenic plants with the desired traits would also eliminate the danger of uncontrolled dispersal of transgenic plants in the environment. The tea plant is mainly propagated through vegetative cloning, so the absence of seed should not be a problem for its cultivation. The narrowing of the genetic diversity of tea through such seedless plantations can be overcome by taking steps to preserve the genetic pool of tea. As tea is a foliage crop, the vegetative phase is more important than the reproductive phase. Control of flowering may increase vegetative growth, so it may be possible to enhance the vegetative growth of tea plants by manipulating the genes involved in flower development. Thus, understanding the role of the MADS box gene family in the tea plant may pave the way for improving crop productivity and benefit the tea industry.

Conflict of interest

None
  16 in total

1.  Major flowering time gene, flowering locus C, regulates seed germination in Arabidopsis thaliana.

Authors:  George C K Chiang; Deepak Barua; Elena M Kramer; Richard M Amasino; Kathleen Donohue
Journal:  Proc Natl Acad Sci U S A       Date:  2009-06-29       Impact factor: 11.205

2.  MEGA5: molecular evolutionary genetics analysis using maximum likelihood, evolutionary distance, and maximum parsimony methods.

Authors:  Koichiro Tamura; Daniel Peterson; Nicholas Peterson; Glen Stecher; Masatoshi Nei; Sudhir Kumar
Journal:  Mol Biol Evol       Date:  2011-05-04       Impact factor: 16.240

3.  The neighbor-joining method: a new method for reconstructing phylogenetic trees.

Authors:  N Saitou; M Nei
Journal:  Mol Biol Evol       Date:  1987-07       Impact factor: 16.240

4.  Two ancient classes of MIKC-type MADS-box genes are present in the moss Physcomitrella patens.

Authors:  Katrin Henschel; Rumiko Kofuji; Mitsuyasu Hasebe; Heinz Saedler; Thomas Münster; Günter Theissen
Journal:  Mol Biol Evol       Date:  2002-06       Impact factor: 16.240

5.  Delayed flowering time in Arabidopsis and Brassica rapa by the overexpression of FLOWERING LOCUS C (FLC) homologs isolated from Chinese cabbage (Brassica rapa L.: ssp. pekinensis).

Authors:  Soo-Yun Kim; Beom-Seok Park; Soo-Jin Kwon; Jungsun Kim; Myung-Ho Lim; Young-Doo Park; Dool Yi Kim; Seok-Chul Suh; Yong-Moon Jin; Ji Hoon Ahn; Yeon-Hee Lee
Journal:  Plant Cell Rep       Date:  2006-10-06       Impact factor: 4.570

6.  AGAMOUS-LIKE 6 is a floral promoter that negatively regulates the FLC/MAF clade genes and positively regulates FT in Arabidopsis.

Authors:  Seung Kwan Yoo; Xuelin Wu; Jong Seob Lee; Ji Hoon Ahn
Journal:  Plant J       Date:  2010-11-10       Impact factor: 6.417

7.  Functional characterization of OsMADS18, a member of the AP1/SQUA subfamily of MADS box genes.

Authors:  Fabio Fornara; Lucie Parenicová; Giuseppina Falasca; Nilla Pelucchi; Simona Masiero; Stefano Ciannamea; Zenaida Lopez-Dee; Maria Maddalena Altamura; Lucia Colombo; Martin M Kater
Journal:  Plant Physiol       Date:  2004-08-06       Impact factor: 8.340

8.  Regulation of floral patterning by flowering time genes.

Authors:  Chang Liu; Wanyan Xi; Lisha Shen; Caiping Tan; Hao Yu
Journal:  Dev Cell       Date:  2009-05       Impact factor: 12.270

9.  Characterization of SOC1's central role in flowering by the identification of its upstream and downstream regulators.

Authors:  Richard G H Immink; David Posé; Silvia Ferrario; Felix Ott; Kerstin Kaufmann; Felipe Leal Valentim; Stefan de Folter; Froukje van der Wal; Aalt D J van Dijk; Markus Schmid; Gerco C Angenent
Journal:  Plant Physiol       Date:  2012-07-12       Impact factor: 8.340

10.  The study of two barley type I-like MADS-box genes as potential targets of epigenetic regulation during seed development.

Authors:  Aliki Kapazoglou; Cawas Engineer; Vicky Drosou; Chrysanthi Kalloniati; Eleni Tani; Aphrodite Tsaballa; Evangelia D Kouri; Ioannis Ganopoulos; Emmanouil Flemetakis; Athanasios S Tsaftaris
Journal:  BMC Plant Biol       Date:  2012-09-17       Impact factor: 4.215
