
Identification and characterization of histone lysine methylation modifiers in Fragaria vesca.

Tingting Gu1, Yuhui Han1, Ruirui Huang1, Richard J McAvoy2, Yi Li1,2.   

Abstract

The diploid woodland strawberry (Fragaria vesca) is an important model for fruit crops because of several unique characteristics, including its small genome size, an ethylene-independent fruit ripening process, and fruit flesh derived from receptacle tissues rather than from the ovary wall, which is more typical of fruiting plants. Histone methylation is an important factor in gene regulation in higher plants, but little is known about its roles in fruit development. We have identified 45 SET methyltransferase, 22 JmjC demethylase and 4 LSD demethylase genes in F. vesca. The analysis of these histone modifiers in eight plant species supports the clustering of those genes into major classes consistent with their functions. We also provide evidence that whole genome duplication and dispersed duplications via retrotransposons may have played pivotal roles in the expansion of histone modifier genes in F. vesca. Furthermore, expression data demonstrated that the expression of some SET genes increases as the fruit develops and peaks at the turning stage. We have also observed that expression of those SET genes responds to cold and heat stresses. Our results indicate that regulation of histone methylation may play a critical role in fruit development as well as in responses to abiotic stresses in strawberry.

Year:  2016        PMID: 27049067      PMCID: PMC4822149          DOI: 10.1038/srep23581

Source DB:  PubMed          Journal:  Sci Rep        ISSN: 2045-2322            Impact factor:   4.379


Chromatin is a complex composed of histones, other chromosomal proteins, DNA and small RNAs. The nucleosome is the basic unit of chromatin, consisting of a histone octamer (two copies each of histones H2A, H2B, H3 and H4) wrapped by 146–148 bp of DNA. Post-translational modifications of the N-terminal tails of core histones can affect nucleosome spacing and higher-order nucleosome interactions, and greatly affect the accessibility of chromatin to transcriptional regulatory proteins. Thus, in eukaryotes, post-translational histone modifications are a determinant of the active/silent state of the associated genes, and are of great importance to a variety of biological processes1. The covalent modifications of histones include methylation, acetylation, phosphorylation, ubiquitination and SUMOylation2. Methylation of lysine residues is among the most varied and important histone modifications. The majority of methyl marks are located on lysines 4, 9, 27, 36 and 79 of histone H3, and on lysine 20 of histone H4. The different active/silent chromatin states are characterized by different combinatorial patterns of histone modifications. Generally speaking, methylations on H3K9, H3K27, H3K79 and H4K20 are silent marks, while methylations on H3K4 and H3K36 are active marks3,4. Despite some variations, the mechanisms determining the chromatin states are quite conserved between plants and animals. Histone lysine methylation is dynamic during organ development, and is determined by histone modifying proteins, including histone lysine methyltransferases (HKMTases) and histone demethylases (HDMases), and by histone turnover. The majority of HKMTases have a SET (Suppressor of variegation, Enhancer of zeste and Trithorax in Drosophila) domain mediating the methyltransferase catalytic activity5. The only known HKMTase that does not have a SET domain is Dot1/Dot1L, which is responsible for H3K79 methylation6,7. 
Based on the amino acid sequence conservation of SET domains, there are seven classes of SET protein encoding genes in Arabidopsis, each of which has preferred targets5,8. Class I consists of three polycomb group genes homologous to Drosophila E(Z) (Enhancer of Zeste), capable of transferring methyl groups to H3K27. Class II consists of five SET genes homologous to Drosophila ASH1, responsible for methylations on H3K4 and/or H3K36. Class III is another group of SET genes responsible for the active mark H3K4me1/2/3, similar to Drosophila TRX. Class IV is plant-specific and responsible for monomethylation of H3K27, a silent mark essential for transposon silencing9. Class V is the largest SET group, consisting of 15 SET genes homologous to Drosophila SU(VAR)3–9. Similar to their orthologs in animals, class V SET genes in Arabidopsis play an essential role in the establishment of the heterochromatic mark H3K9me1/2/3, as well as H4K20me and H3K27me2. Classes VI and VII consist of genes having a SET domain with functions yet to be determined. HDMases consist of two major types of enzymes, the LSD (Lysine Specific Demethylase) type and the JmjC domain-containing HDMases10. LSD HDMases can only demethylate mono- and di-methylated lysine residues7. The LSD family in Arabidopsis has four members, which are able to demethylate H3K4me1/2 and H3K9me1/2. In contrast, JmjC is a much larger gene family, consisting of 21 genes in Arabidopsis, and its members can also act on trimethylated residues11. Within the JmjC family, members with the activity to demethylate methyl groups on H3K4, H3K9, H3K27, H3K36 and H4K20 have been identified7. H3K79 methylation is mediated by the only non-SET domain HKMTase, Dot1/Dot1L; thus it is tempting to speculate that another class of HDMase might be responsible for H3K79 demethylation. The first plant SET genes identified were CURLY LEAF (CLF) and MEDEA (MEA) in Arabidopsis thaliana12,13. 
Since then, the identification and functional investigation of HKMTases and HDMases in plants have been the subject of numerous studies. These studies suggest that HKMTases and HDMases are pivotal in phase transitions between sporophyte and gametophyte, gametophyte and seed development, the embryo-seedling transition, induction of flowering, and vernalization1. In addition, histone methylations determined by both HKMTases and HDMases play an important role in the memory mechanism in response to recurring stresses1,14. For example, the “memory genes” that respond to recurrent dehydration maintain the active mark H3K4me3 during the recovery phase when transcription is low, which serves as a mark of “transcriptional stress memory”14. The list of cellular processes known to involve HKMTases/HDMases is still growing, and these histone modifiers are believed to play essential roles in all aspects of the regulation of plant development. In spite of the essential roles of histone modifications in cellular processes, little is known about histone modifiers in strawberry. The cultivated strawberry (F. x ananassa) is a young crop species and is considered a model for non-climacteric fruit. F. x ananassa has an extremely complex octoploid genome harboring 56 chromosomes (2n = 8x = 56) derived from four diploid ancestors. Thus, the sequenced diploid woodland strawberry Fragaria vesca, with its small genome (240 Mb, 2n = 2x = 14), offers substantial advantages for genomic research15. In this study, we identified genes encoding the histone lysine methylation modifiers, both HKMTases and HDMases, in F. vesca. Comprehensive analyses of their phylogeny, evolutionary history, structure, and expression patterns in different stages/organs and in response to abiotic stresses were performed to give an overview of this important group of genes in F. vesca. 
This study provides the first characterization of the full set of histone lysine methylation modifiers in strawberry, and should greatly facilitate the functional characterization of those epigenetic regulators in this economically important crop species.

Results

Identification of genes encoding putative histone HKMTases containing SET domains in F. vesca

To identify histone HKMTases, the full alignment of SET domains downloaded from Pfam was searched against the F. vesca proteome with the HMMER toolset (see Methods for details). This sequence-based search identified 45 SET domain-containing genes in F. vesca (Fig. 1, Table 1 and Supplementary Table S1). To better understand the expansion and evolutionary history of SET genes in F. vesca, genes encoding SET domain-containing proteins were also identified in seven other species representing the major clades of terrestrial plants (Fig. 1). The basal angiosperm Amborella trichopoda is suggested to be the single living representative of the sister lineage to all other extant flowering plants16. A. trichopoda originated prior to the split of eudicots and monocots, and has not experienced any whole genome duplication (WGD) since then16, while the other six angiosperms had several rounds of whole genome duplication/triplication after their split from A. trichopoda (Fig. 1, data from the PGDD website, http://chibba.agtec.uga.edu/duplication/)17, which should have contributed to the evolution of SET genes. To standardize gene names, the Arabidopsis genes with known functions were named as published, following the standard gene symbol conventions with all capital letters, while genes in other species were named SET1-62 following the species abbreviation.
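The hit-filtering step of such a search can be sketched as follows. This is a minimal illustration, not the authors' pipeline: it assumes the search was run with `hmmsearch --tblout`, whose tabular output lists the target name in the first column and the full-sequence E-value in the fifth. The E-value cutoff of 1e-5, the example rows and the identifier `gene99999` are assumptions made for illustration only; the actual parameters are given in the Methods.

```python
from typing import Dict

def parse_tblout(text: str, max_evalue: float = 1e-5) -> Dict[str, float]:
    """Return {target name: best full-sequence E-value} for hits passing the cutoff.

    Expects the whitespace-delimited layout written by `hmmsearch --tblout`:
    column 1 is the target name and column 5 is the full-sequence E-value.
    The default cutoff is an illustrative assumption, not the paper's setting.
    """
    hits: Dict[str, float] = {}
    for line in text.splitlines():
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and comment/header lines
        fields = line.split()
        target, evalue = fields[0], float(fields[4])
        if evalue <= max_evalue:
            # keep the best (smallest) E-value if a target appears more than once
            hits[target] = min(evalue, hits.get(target, evalue))
    return hits

# Hypothetical two-row example; the scores and E-values are made up.
EXAMPLE = """\
# target name  accession  query name  accession  E-value  score  bias
gene02275  -  SET  PF00856  1.2e-30  105.1  0.1
gene99999  -  SET  PF00856  0.5  10.2  0.0
"""

if __name__ == "__main__":
    for gene, evalue in sorted(parse_tblout(EXAMPLE).items()):
        print(gene, evalue)
```

Candidate genes retained this way would still need manual inspection of domain composition, as done for the classification described below.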
Figure 1

The Taxonomy Common Tree of the eight species (F. vesca, A. thaliana, V. vinifera, N. nucifera, O. sativa, Z. mays, A. trichopoda and S. moellendorffii) and the numbers of SET, JmjC, LSD and Dot1/Dot1L genes retained in each genome.

The Taxonomy Common Tree was constructed online by Taxonomy Browser in NCBI (http://www.ncbi.nlm.nih.gov/Taxonomy/CommonTree/wwwcmt.cgi).

Table 1

Characterization of SET-domain containing genes in F. vesca.

Class | A. thaliana name | A. thaliana annotation | A. thaliana locus tag | Specificity | F. vesca name | F. vesca annotation | F. vesca locus tag | Protein length (aa) | Chr
I | AT-MEDEA | MEA | AT1G02580.1 | H3K27me3 | | | | |
 | AT-EZA1 | SWN | AT4G02020.1 | H3K27me3 | FV-SET29 | EZA | gene02275 | 880 | 5
 | AT-CLF | CLF | AT2G23380.1 | H3K27me3 | FV-SET4 | CLF | gene23813 | 913 | 1
II | AT-ASHH1 | ASHH1 | AT1G76710.1 | H3;H4 | FV-SET38 | ASHH1 | gene00440 | 1522 | 7
 | AT-ASHH2 | ASHH2 | AT1G77300.1 | H3K4me3 | FV-SET15 | ASHH2 | gene11269 | 2113 | 2
 | AT-ASHH3 | ASHH3 | AT2G44150.1 | | FV-SET1; FV-SET6 | ASHH3; – | gene30492#; gene30492# | 372; 372 | 1; 1
 | AT-ASHH4 | ASHH4 | AT3G59960.1 | | | | | |
 | AT-ASHR3 | ASHR3 | AT4G30860.1 | H3K4me2, H3K36me2/me3 | FV-SET25 | ASHR3 | gene27657 | 485 | 4
III | AT-SDG25 | ATXR7 | AT5G42400.1 | H3K4me1/me2/me3 | FV-SET32 | | gene29324 | 1229 | 5
 | AT-ATXR3 | ATXR3 | AT4G15180.1 | H3K4me1/me2/me3 | FV-SET2 | ATXR3 | gene23883 | 2402 | 1
 | AT-ATX1; AT-ATX2 | ATX1; ATX2 | AT2G31650.1; AT1G05830.1 | H3K4me3; H3K4me2 | FV-SET24 | ATX2 | gene22427 | 1075 | 4
 | AT-ATX3 | ATX3 | AT3G61740.1 | | FV-SET8; FV-SET39 | ATX3a; ATX3b | gene22028; gene19196 | 756; 908 | 1; 7
 | AT-ATX4; AT-ATX5 | ATX4; ATX5 | AT4G27910.1; AT5G53430.1 | | FV-SET42 | ATX5 | gene12861 | 1069 | 7
 | | | | | FV-SET16 | | gene10999 | 2170 | 2
IV | AT-ATXR5 | ATXR5 | AT5G09790.2 | H3K27me1 | FV-SET7 | ATXR5 | gene17873 | 379 | 1
 | AT-ATXR6 | ATXR6 | AT5G24330.1 | H3K27me1 | FV-SET41 | ATXR6 | gene12820 | 344 | 7
V | AT-SUVH1; AT-SUVH3; AT-SUVH7; AT-SUVH8; AT-SUVH8 | SUVH1; SUVH3; SUVH7; SUVH8; SUVH10 | AT5G04940.1; AT1G73100.1; AT1G17770.1; AT2G24740.1; AT2G05900.1 | | FV-SET33; FV-SET40 | SUVH1a; SUVH1b | gene02482; gene03234 | 702; 664 | 6; 7
 | AT-SUVH2; AT-SUVH9 | SUVH2; SUVH9 | AT2G33290.1; AT4G13460.1 | H3K9me1/me2, H4K20me, H3K27me2; – | FV-SET14 | SUVH9 | gene08379 | 674 | 2
 | AT-SUVH4 | SUVH4 | AT5G13960.1 | H3K9me1/me2 | FV-SET34 | SUVH4a | gene01396 | 651 | 6
 | | | | | FV-SET26; FV-SET45; FV-SET18 | SUVH4b; SUVH4c; SUVH4d | gene06630; gene07293; gene19999 | 406; 639; 412 | 4; unknown; 3
 | AT-SUVH5; AT-SUVH6 | SUVH5; SUVH6 | AT2G35160.1; AT2G22740.2 | H3K9me1/me2; H3K9me1/me2 | FV-SET30 | SUVH6 | gene20484 | 1083 | 5
 | AT-SUVR1; AT-SUVR2 | SUVR1; SUVR2 | AT1G04050.1; AT5G43990.2 | | FV-SET13 | SUVR2 | gene08324 | 825 | 2
 | AT-SUVR3 | SUVR3 | AT3G03750.2 | | FV-SET11; FV-SET17 | SUVR3a; SUVR3b | gene11265; gene21746 | 340; 346 | 2; 2
 | AT-SUVR4 | SUVR4 | AT3G04380.1 | H3K9me2/me3 | FV-SET35 | SUVR4a | gene16684 | 511 | 6
 | | | | | FV-SET44; FV-SET22 | SUVR4u1; SUVR4u2 | gene07805; gene11562 | 812; 1303 | unknown; 4
 | AT-SUVR5 | SUVR5 | AT2G23740.2 | H3K9me2, H3K27me2 | FV-SET31 | SUVR5 | gene29411 | 1520 | 5
VI | AT-ASHR2 | ASHR2 | AT2G19640.2 | | FV-SET20 | ASHR2 | gene22755 | 395 | 4
 | AT-ATXR2 | ATXR2 | AT3G21820.1 | | FV-SET3 | ATXR2 | gene23908 | 472 | 1
 | AT-RE-PR | | AT1G33400.1 | | | | | |
 | AT-SET38 | | AT5G06620.1 | | FV-SET36 | ATXR4 | gene18071 | 327 | 6
 | AT-SETC4 | | AT1G43245.1 | | FV-SET9 | - | gene08123 | 646 | 2
 | AT-SET35 | | AT1G26760.1 | | FV-SET28 | - | gene04715 | 522 | 4
 | AT-SET37 | | AT2G17900.1 | | FV-SET10; FV-SET12 | ASHR1a; ASHR1b | gene11111; gene25503 | 483; 525 | 2; 2
VII | AT-SET40 | | AT5G17240.1 | | FV-SET5 | | gene31667 | 512 | 1
 | AT-RUB-ME | | AT3G07670.1 | | FV-SET37 | | gene15541 | 499 | 6
 | AThal9 | | AT1G14030.1 | | FV-SET27 | | gene04693 | 598 | 4
 | AT-PLAS | | AT4G20130.1 | | FV-SET19 | | gene19947 | 514 | 3
 | AT-LRU-PR | | AT1G24610.1 | | FV-SET21 | | gene06267 | 483 | 4
 | AT-SETC1 | | AT1G01920.1 | | FV-SET23 | | gene16946 | 563 | 4
 | AT-SETC2 | | AT3G55080.1 | | FV-SET43 | | gene23354 | 461 | 7
 | AT-SETC3 | | AT3G56570.1 | | | | | |

#FV-SET1 and FV-SET6 have the same gene tag (gene30492) in the “Hybrid GeneMArk Predictions” (predicted from EST data, https://www.rosaceae.org/species/fragaria/fragaria_vesca/genome_v1.0). However, the genomic assembly indicates that these two genes are located at different loci on chromosome 1 (Fig. 3A) and have different XP tags (Supplementary Table S1). In addition, these two genes have different intron/exon structures and DNA sequences (Supplementary Fig. S1). Thus, we retained both of these genes.

Expansion and evolution of SET genes in F. vesca

In order to investigate the classification of F. vesca SET genes, phylogenetic trees were constructed using the Maximum Likelihood method (see Methods for details) based on the amino acid sequences of the conserved SET domains of the SET genes identified in the angiosperms F. vesca, A. thaliana, Z. mays, O. sativa and A. trichopoda (Fig. 2). The topologies of phylogenetic trees constructed by different methods are slightly different, and the bootstrap support for some nodes is lower than 60 (Fig. 2, Supplementary Fig. S1–S6). Thus, the domain composition of the whole proteins (Supplementary Fig. S1) and the motif composition of the SET domains (Supplementary Fig. S2) were taken into consideration for the classification of SET genes as well. Accordingly, the SET genes could be grouped into seven classes (Fig. 2 and Table 1), which is consistent with what was previously reported in A. thaliana8. Classes I–V consist of most of the canonical SET genes known to be involved in the catalysis of histone methylation (in Arabidopsis), while classes VI and VII consist of relatively shorter genes with no known specificity (Table 1 and Fig. 2).
Figure 2

A maximum likelihood phylogenetic tree of predicted SET genes identified in F. vesca, A. thaliana, O. sativa, Z. mays and A. trichopoda.

The phylogenetic tree was constructed based on the amino acid sequences of the SET domains with 1000 bootstrap replicates. The F. vesca lineage-specific duplicated gene pairs (since its split from A. thaliana) are highlighted in yellow. The seven classes of SET genes are marked by different colors. Refer to Table 1 for more basic information on SET genes in F. vesca and A. thaliana.

The seven SET classes have different domain architectures and motif compositions (Supplementary Fig. S1–S6). In addition to the domain compositions identified by Pfam, the 20 most common motifs embedded in the SET domains were identified by MEME for class I–V SET genes (Supplementary Fig. S2). Class I has few domains besides SET, but all the SET domains in this class consistently have the class I-specific motifs 16 and 17. Although four SET genes in other classes have motif 16 as well, the sequences are highly degenerate. Class II is characterized by AWS, SET and Post-SET domains, motifs 1, 2, 3, 4 and 9, and a class II-specific motif 20. Class III SET genes have PHD, zf-HC5HC2H_2, SET and Post-SET domains at high frequency, motifs 1, 2, 3 and 9, and class III-specific motifs 7, 8, 12 and 15. Class IV is characterized by PHD and SET domains, motifs 1 and 2, and the class IV-specific motif 18. Class V SET genes have SAD_SRA, Pre-SET, SET and Post-SET domains, motifs 1, 2, 3 and 4 at high frequency, and several class V-specific motifs (5, 6, 11, 12, 14 and 15) (Supplementary Fig. S1,S2). Class VI and VII SET genes, in contrast, are relatively short and have few of the domains known to be essential for HKMTase catalytic function. To explore the detailed evolutionary history of the SET domain-containing genes, the phylogeny of each class was investigated. In general, most of the SET genes reside in sub-clades consisting of genes from all five species (Supplementary Fig. S3–S6), but there are some exceptions as well. For example, the AT-RE-PR sub-clade in class VI and the AT-SETC3 sub-clade in class VII do not have an F. vesca member, indicating a lineage-specific loss in F. vesca (Supplementary Fig. S5,S6). On the other hand, in some sub-clades, two F. vesca genes clustered together either with a single Arabidopsis gene or without any corresponding gene in Arabidopsis (Fig. 2, highlighted in yellow), indicating a lineage-specific duplication in F. vesca (that is, a duplication that happened in F. vesca after its split from A. thaliana). In total, there are 7 pairs of such F. vesca SET genes, with 4 pairs in class V and the other 3 in other classes. To study the evolutionary constraints acting on the 7 recent duplicates in F. vesca, dn (nonsynonymous substitutions per site) and ds (synonymous substitutions per site) between the members of each duplicate pair were calculated. The dn values of class V duplicated pairs are higher than those of other classes (Fig. 3B). Furthermore, the dn/ds values of class V duplicates range from 0.62 to 0.93, significantly higher than those of the 3 duplicates from other classes (Fig. 3B), indicating a relaxation of negative selection on the class V duplicated pairs after the duplication events18. In addition, some of those recent duplicates are coupled with domain gains and losses. For example, in the duplicate pair FV-SET22 and FV-SET44 of class V, FV-SET22 obtained two domains at the N-terminus (Supplementary Fig. S3). Overall, recent duplicate pairs more frequently originated or were maintained in class V, and class V SET gene duplicate pairs evolved faster than those of other classes.
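The dn/ds reasoning above can be illustrated with a small sketch. It only computes and labels the ratio from already-estimated dn and ds values; it does not estimate substitution rates from sequences, which requires dedicated software. The function names, the example numbers and the 0.5 threshold separating "relaxed" from "strong" purifying selection are assumptions chosen for illustration and do not come from the paper.

```python
def dn_ds_ratio(dn: float, ds: float) -> float:
    """Return dn/ds (often written omega) for one duplicate pair."""
    if ds <= 0:
        raise ValueError("ds must be positive to form a ratio")
    return dn / ds

def interpret(omega: float, relaxed_threshold: float = 0.5) -> str:
    """Attach a coarse label to a dn/ds value.

    omega > 1 is conventionally read as positive selection and omega well
    below 1 as purifying (negative) selection. The relaxed_threshold is an
    arbitrary illustrative cutoff, not a value used by the authors.
    """
    if omega > 1.0:
        return "positive selection"
    if omega >= relaxed_threshold:
        return "relaxed purifying selection"
    return "strong purifying selection"

if __name__ == "__main__":
    # Hypothetical (dn, ds) values for two duplicate pairs.
    for name, dn, ds in [("pair_A", 0.31, 0.40), ("pair_B", 0.05, 0.60)]:
        omega = dn_ds_ratio(dn, ds)
        print(name, round(omega, 2), interpret(omega))
```

Under this reading, ratios in the reported 0.62–0.93 range stay below 1, which matches the interpretation of relaxed negative selection rather than positive selection.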
Figure 3

F. vesca lineage-specific duplicated SET gene pairs in class V exhibit weaker selective constraint than those in other SET classes.

(A) Chromosomal locations of SET and JmjC genes on the seven chromosomes of F. vesca. The lineage-specific duplicated SET and JmjC pairs are connected by dashed lines. The scale on the left is in megabases. (B) The correlation between ds vs. dn, and ds vs. dn/ds for those duplicated gene pairs. Red circles denote duplicate genes pairs in class V, while blue diamonds denote those in other classes. For the FV-SET34/45/18/26 gene set, ds and dn were calculated for FV-SET34 vs.45, 34 vs.18 and 34 vs.26, according to the phylogenetic tree shown in Supplementary Fig. S5.

At the DNA level, WGD, large segmental duplication, or tandem duplication might have produced those duplicated pairs. To evaluate the relative contribution of those mechanisms to the expansion of the SET gene family in F. vesca, all SET genes were mapped to the seven chromosomes (Fig. 3A) and analyzed by MCscan19. The MCscan results suggest that in F. vesca, 6 out of the 45 SET genes were related to WGD, while the others resulted from dispersed duplications (Supplementary Table S2,S3). In class V, FV-SET34/45/18/26 cluster with a single Arabidopsis gene, AT-SUVH4 (Fig. 2), indicating that more than one duplication event happened to the F. vesca orthologs of AT-SUVH4. The fact that both FV-SET18 and FV-SET26 have a single exon suggests that retro-transposition may have contributed to the expansion of this gene set. In summary, our results suggest that most of the F. vesca SET genes originated before the split of eudicots and monocots, and that WGDs and dispersed duplications, in some cases via retro-transposition, have also contributed to the evolution of SET genes in F. vesca.

Identification of genes encoding histone HDMases and investigation of their evolution in F. vesca

To investigate the histone HDMases in F. vesca, LSD HDMases and JmjC domain-containing HDMases were identified by sequence-based search using HMMER toolset. All the LSD HDMases characterized previously contain two domains, a SWIRM domain and an amino oxidase domain7. Thus, the proteins with both domains were identified as putative LSD HDMases. In total, the sequence-based search identified 5, 4, 4, 4 and 4 genes encoding proteins with both the SWIRM and amino oxidase domains in A. trichopoda, O. sativa, Z. mays, A. thaliana and F. vesca, respectively (Fig. 4, Table 2 and Supplementary Table S4). The consistent number of LSD HDMases indicates that duplication events may not contribute much to the evolution of LSD HDMases in angiosperms.
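The two-domain criterion for putative LSD HDMases amounts to intersecting two hit lists. The sketch below assumes that the protein identifiers hitting each domain profile have already been collected (for example from two separate HMMER searches); the function name and all identifiers are hypothetical placeholders.

```python
from typing import Iterable, List

def putative_lsd(swirm_hits: Iterable[str], amino_oxidase_hits: Iterable[str]) -> List[str]:
    """Return proteins found in BOTH hit lists, sorted for stable output.

    A protein is kept only if it carries a SWIRM domain and an amino
    oxidase domain, mirroring the criterion described in the text.
    """
    return sorted(set(swirm_hits) & set(amino_oxidase_hits))

if __name__ == "__main__":
    # Hypothetical identifiers used purely for illustration.
    swirm = ["protA", "protB", "protC"]
    amino_oxidase = ["protB", "protC", "protD", "protE"]
    print(putative_lsd(swirm, amino_oxidase))
```

Proteins with only one of the two domains (such as other amine oxidases) are excluded by construction.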
Figure 4

Maximum likelihood phylogenetic trees and schematic diagrams of the domain composition of JmjC (A) and LSD (B) genes in the species investigated. The phylogenetic trees were constructed based on the amino acid sequences of either the JmjC domain (A) or the whole LSD protein (B) with 1000 bootstrap replicates, and bootstrap values larger than 50% are shown.

Table 2

Characterization of JmjC and LSD histone HDMase encoding genes in F. vesca.

Class | A. thaliana name | Annotation* | A. thaliana locus tag | F. vesca name | F. vesca locus tag | Protein length (aa) | Chr
PKDM7 | AT-JmjC9 | PKDM7A | AT2G38950 | FV_JmjC2 | gene08108 | 859 | 2
 | AT-JmjC15; AT-JmjC8; AT-JmjC4 | PKDM7B; PKDM7C; PKDM7E | AT4G20400; AT2G34880.1; AT1G30810.1 | FV_JmjC5 | gene20210 | 1069 | 3
 | AT-JmjC1 | PKDM7D | AT1G08620 | FV_JmjC15 | gene16665 | 1218 | 6
PKDM9 | AT-JmjC13 | PKDM9A | AT3G48430.1 | FV_JmjC10 | gene09903 | 1492 | 5
 | AT-JmjC17 | PKDM9B | AT5G04240.1 | FV_JmjC18 | gene23255 | 1590 | 7
PKDM11 | AT-JmjC21 | PKDM11 | AT5G63080.1 | FV_JmjC8 | gene31779 | 475 | 5
PKDM12 | AT-JmjC11 | PKDM12A | AT3G20810.2 | FV_JmjC6 | gene20651 | 416 | 3
 | AT-JmjC19 | PKDM12B | AT5G19840.2 | FV_JmjC1 | gene20265 | 563 | 2
PKDM13 | AT-JmjC12 | PKDM13 | AT3G45880.1 | FV_JmjC3 | gene10362 | 348 | 3
KDM3 | AT-JmjC2 | KDM3A | AT1G09060.3 | FV_JmjC16 | gene22503 | 965 | 6
 | AT-JmjC16 | KDM3B | AT4G21430.1 | | | |
 | AT-JmjC10 | KDM3C | AT3G07610.3 | FV_JmjC14; FV_JmjC13; FV_JmjC20; FV_JmjC12; FV_JmjC21 | gene22017; gene27692; gene04808; gene11798; gene09210 | 445; 960; 1234; 876; 804 | 5; 5; 7; 5; 7
 | AT-JmjC14 | KDM3D | AT4G00990.1 | FV_JmjC22 | gene12874 | 882 | 7
 | | | | FV_JmjC19 | gene19156 | 1017 | 7
 | AT-JmjC3; AT-JmjC5 | KDM3E; KDM3F | AT1G11950.1; AT1G62310.1 | FV_JmjC4 | gene19492 | 867 | 3
KDM5 | AT-JmjC6 | KDM5 | AT1G63490.1 | FV_JmjC7 | gene32474 | 1839 | 5
PKDM8 | AT-JmjC20 | KDM8 | AT5G46910.1 | FV_JmjC9 | gene13745 | 878 | 5
PKDM6 | AT-JmjC18 | JMJD6A | AT5G06550.1 | FV_JmjC17 | gene18131 | 519 | 6
 | AT-JmjC7 | JMJD6B | AT1G78280.1 | FV_JmjC11 | gene11964 | 959 | 5
LSD | AT-LSD1 | | AT1G62830.1 | FV-LSD1 | gene08618 | 791 | 2
 | AT-LSD2 | | AT3G10390.1 | FV-LSD2 | gene23463 | 911 | 7
 | AT-LSD3 | | AT3G13682.1 | FV-LSD3 | gene25010 | 748 | 3
 | AT-LSD4 | | AT4G16310.1 | FV-LSD4 | gene15221 | 1863 | 2

*Refer to Qian et al.20 for annotations of JmjC demethylase genes.

In contrast to LSD HDMases, the number of JmjC HDMases varies among the five species. In total, A. trichopoda, O. sativa, Z. mays, A. thaliana and F. vesca have 17, 17, 26, 21 and 22 JmjC domain-containing genes, respectively (Fig. 1 and Fig. 4). According to phylogenetic trees and domain composition, JmjC HDMases are grouped into 9 classes (Fig. 4), which is consistent with previous reports20. Specifically, F. vesca lineage-specific duplications only happened in the KDM3 class, in which the sister group of AT-KDM3C (AT-JmjC10) in F. vesca has five members (FV_JmjC12, 13, 14, 20 and 21). Interestingly, among all the JmjC domain-containing genes, FV_JmjC12 and FV_JmjC21 are the only two genes having a single exon (Supplementary Fig. S7), suggesting that their common ancestor resulted from a retrotransposition event in which a transcribed messenger RNA was reverse-transcribed and inserted into the genome. In most classes, the F. vesca JmjC HDMases have not expanded, but in the KDM3 class, a series of duplication events occurred, leading to the F. vesca lineage-specific expansion of this particular class. In order to investigate which mechanisms might have contributed to those duplication events, JmjC genes were mapped to F. vesca chromosomes (Fig. 3A) and analyzed by MCscan. The MCscan results suggest that out of the 26 HDMase genes, 2 were WGD-related and 24 resulted from dispersed duplications (Supplementary Table S2,S3). Therefore, similar to SET HKMTases, most of the F. vesca LSD and JmjC HDMases originated before the split of eudicots and monocots, and recent dispersed duplications and retro-transpositions might have played a pivotal role in the evolution of HDMases in F. vesca.

Expression profiles of histone HKMTases and HDMases in flower and fruit development in F. vesca

To investigate the expression profiles of individual histone HKMTase and HDMase genes in different organs and developmental stages, transcriptome data from flower development and early-stage fruit development were examined21,22. One of the 45 F. vesca SET genes has no expression data available in the database, and was omitted from the following analysis. The genes encoding HKMTases and HDMases have quite diversified expression patterns (Fig. 5A). Firstly, for the seven classes of SET genes, the members within a particular class show different tissue/stage-specific expression. For example, in class I, CLF is moderately expressed in each tissue/developmental stage, while the mRNA of EZA1 is depleted in pollen and the early-stage embryo, and is more enriched in the developing pith and cortical tissues of strawberry flesh. Secondly, the respective members of the recently duplicated SET pairs have different tissue specificity. Five of the six duplicated F. vesca SET pairs with transcription data available show a similar pattern: one gene of the duplicated pair is more evenly and ubiquitously expressed, while the other gene is silent in the flesh (pith and cortex), the anther, and some tissues of the seed (embryo, ghost, wall). This suggests that, although highly conserved at the amino acid sequence level, those duplicated genes are differently regulated. Thirdly, LSD and different classes of JmjC HDMases show different expression profiles in different organs/stages as well, and the recent JmjC HDMase duplicates are also differentially expressed (Fig. 5A). Overall, the expression of histone HKMTase and HDMase genes has different organ/stage specificity, indicating a functional diversification coupled with the expansion of those gene families in F. vesca.
Figure 5

Expression profiles of identified histone HKMTase and HDMase genes in F. vesca.

(A) The mRNA levels of histone modifiers in different tissues during flower and early-stage fruit development. The expression levels (RPKM) for the genes of interest were directly retrieved from http://bioinformatics.towson.edu/strawberry/ and plotted in log2 scale. (B) Expression profiles of SET genes in flesh (including pith and cortex, without seeds) during fruit ripening. (C) Expression profiles of SET genes in response to heat and cold stresses. For (B,C), the expression levels relative to GAPDH were measured by quantitative RT-PCR, and displayed in log2 scale. Three biological replicates and three technical replicates were done for each data point.

On the other hand, the different organs/stages have very different combinatorial expression patterns of HKMTase and HDMase genes (Fig. 5A). Firstly, most of the SET genes are expressed at extremely low levels in pollen, with the majority of the class V and class I SET genes silent there. Secondly, the mRNAs of all the LSD HDMases and most of the JmjC HDMases are depleted in pollen as well, but interestingly, FV_JmjC5 and FV_JmjC16 show their highest expression there, indicating that those two JmjC HDMases might play a dominant role in histone demethylation in pollen. Thirdly, in the developing strawberry flesh (pith and cortex), both active and silent mark-related SET genes, as well as LSD and JmjC genes, show decreasing expression levels during early-stage fruit development (pollination to big green). Fourthly, genes encoding both HKMTases and HDMases are overall more highly expressed in tissues of developing flowers (carpels, perianth, flowers, receptacles and microspores) than in other tissues. Thus, the different tissues in different developmental stages have diversified expression patterns of histone lysine methylation-related genes, indicating specific regulatory roles of those genes in cellular processes. The expression profiles shown above reveal that the members of duplicated SET pairs are quite distinct, with one silent in early-stage fruit development and the other relatively ubiquitously expressed in all tissues (Fig. 5A). To investigate how those duplicated SET genes are expressed during strawberry fruit ripening, the expression levels of the more ubiquitously expressed genes were investigated in strawberry fleshy fruits (stripped of seeds, including pith and cortex only) at the big green stage, big white stage (with red seeds and white flesh), turning stage (with red seeds and light white flesh) and red stage (2–3 days after the turning stage) by quantitative RT-PCR assays (Fig. 5B). In addition, a subset of SET genes representing different classes was investigated as well. 
In contrast to the overall decreasing expression during early-stage fruit development, the mRNA levels of a substantial number of SET genes showed increasing expression levels during fruit ripening, and peaked at the turning stage (9 out of 14 genes investigated, Fig. 5B). The expression patterns of those SET genes in fruit development revealed that histone lysine HKMTase genes are dynamically expressed, and that the genomic histone lysine methylation patterns might undergo a dramatic change at the onset of fruit ripening.
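For readers unfamiliar with how qRT-PCR values "relative to GAPDH" end up on a log2 scale, the following sketch shows one common convention, the ΔCt method. It assumes perfect amplification efficiency (a doubling per cycle), so that log2 relative expression equals the reference Ct minus the target Ct. Whether the authors used exactly this calculation is not stated in this section, and all Ct values below are hypothetical.

```python
from statistics import mean
from typing import Sequence

def log2_relative_expression(ct_target: Sequence[float], ct_reference: Sequence[float]) -> float:
    """Return log2(target / reference) from replicate Ct values.

    Assumes 100% amplification efficiency, so relative expression is
    2 ** -(Ct_target - Ct_reference); taking log2 leaves the negated
    delta-Ct. Replicate Ct values are averaged before subtraction.
    """
    if not ct_target or not ct_reference:
        raise ValueError("need at least one Ct value for each gene")
    delta_ct = mean(ct_target) - mean(ct_reference)
    return -delta_ct

if __name__ == "__main__":
    # Hypothetical Ct values: three technical replicates per gene.
    set_gene_ct = [24.0, 24.2, 23.8]
    gapdh_ct = [18.0, 18.1, 17.9]
    print(log2_relative_expression(set_gene_ct, gapdh_ct))
```

On this scale, a one-unit increase between two stages corresponds to a two-fold increase in transcript abundance relative to the reference gene.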

Expression profiles of SET genes during heat/cold shock in F. vesca

Histone modifications are suggested to play an important role in the regulation of gene expression in response to abiotic stresses1,14. Strawberry plants are quite sensitive to extreme temperatures, and cold and heat shock are two recurrent stresses that strawberry encounters in the natural environment. To study how HKMTase genes are regulated during heat and cold stresses, the expression patterns of a subset of SET genes were investigated in seedlings. The qRT-PCR results demonstrate that those HKMTase genes respond differentially to a particular abiotic stress (Fig. 5C). Three of the 13 investigated SET genes show increased expression levels after 3 h of cold shock, while the other 10 genes display no significant changes. Two SET genes show increased expression upon heat shock at 4 h (3 h heat shock + 1 h recovery) as well. Interestingly, ATX3b and SUVH4a respond to both cold and heat shock. Furthermore, the members of recently duplicated gene pairs respond differently. For SUVH4a/b/c/d, the expression level of SUVH4a increases after cold and heat shock, while the other three are not responsive at all. In summary, some SET genes show dynamic expression patterns upon cold and heat shock, which indicates that these genes may be involved in the responses of F. vesca to temperature stresses.

Discussion

Sequence-based searching and phylogenetic analysis have proved to be an effective way to identify histone modifiers in a sequenced genome20,23,24,25,26,27. In this study, we identified genes encoding SET HKMTases, LSD HDMases and JmjC HDMases in F. vesca plus seven other plant species representing the major clades of terrestrial plants. The extensive phylogenetic analysis revealed the evolutionary history of those histone modifiers in F. vesca and also in other angiosperms. In total, 45 SET HKMTase genes grouped in seven classes were identified in F. vesca. These phylogenetic studies suggest that the identified SET genes are highly conserved in each class across a wide spectrum of plants, indicating their essential regulatory roles in the plant kingdom. Of the SET genes studied, the most intriguing observation was the expansion of class V in both eudicots and monocots, especially in Z. mays. Class V SET HKMTases are specific for methylation of H3K9, which is involved in transposon silencing and heterochromatin formation25. There is a vast expansion of transposable elements in Z. mays28, which might explain the maintenance of a large number of class V SET genes in Z. mays to protect genome integrity. In addition, class V genes diverged faster than other classes (Fig. 3 and Table 3). For each species, the number of genes grouped into each class is summarized in Table 3. Overall, there is no simple linear correlation between SET gene numbers and genome size or total gene numbers (Table 3). Furthermore, there is no significant difference among the five species in class I, II, III, VI or VII in terms of gene numbers. This suggests that those five classes did not experience any extensive expansion in angiosperms, and that most of the duplicated genes from the multiple whole genome duplication/triplication events were lost during evolution.
Table 3

Number of SET genes identified in each class in the five species.

Species        SET  I  II  III  IV  V   VI  VII  Genes  Genome  WGD*
A. trichopoda  34   3  4   5    0   10  5   7    26846  701M    2
O. sativa      44   2  5   6    2   14  7   8    39049  373M    5
Z. mays        54   4  6   6    2   21  8   7    63480  2053M   6
A. thaliana    47   3  5   7    2   15  7   8    27416  118M    5
F. vesca       45   2  5   7    2   15  7   7    24771  212M    3

*Refer to Fig. 1 for more details of WGDs.
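The statement that SET gene number does not track genome size can be checked directly against Table 3. The following is a minimal sketch, not part of the original analysis: the values are transcribed from Table 3, the "M" suffix is assumed to denote Mb, and the helper name pearson_r is introduced here for illustration. With only five species, the coefficient is a rough descriptive check rather than a statistical test.

```python
import math

# Values transcribed from Table 3: SET genes per class (I-VII), reported
# SET total, and genome size (the "M" suffix is assumed to mean Mb).
TABLE3 = {
    "A. trichopoda": {"classes": [3, 4, 5, 0, 10, 5, 7], "total": 34, "genome_mb": 701},
    "O. sativa": {"classes": [2, 5, 6, 2, 14, 7, 8], "total": 44, "genome_mb": 373},
    "Z. mays": {"classes": [4, 6, 6, 2, 21, 8, 7], "total": 54, "genome_mb": 2053},
    "A. thaliana": {"classes": [3, 5, 7, 2, 15, 7, 8], "total": 47, "genome_mb": 118},
    "F. vesca": {"classes": [2, 5, 7, 2, 15, 7, 7], "total": 45, "genome_mb": 212},
}


def pearson_r(xs, ys):
    """Plain Pearson correlation coefficient for two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Sanity check: the seven class counts should add up to the reported total.
class_sums_match = all(sum(row["classes"]) == row["total"] for row in TABLE3.values())

totals = [row["total"] for row in TABLE3.values()]
genomes = [row["genome_mb"] for row in TABLE3.values()]
r_total_vs_genome = pearson_r(totals, genomes)

print("class sums match totals:", class_sums_match)
print("Pearson r (SET total vs genome size):", round(r_total_vs_genome, 2))
```

The same helper could be applied to the Genes column or to the class V counts alone to look at the class-specific expansion discussed in the text.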

In contrast to the conserved number of SET genes in the five classes mentioned above, class IV and V show distinct evolutionary characteristics. Firstly, class IV SET genes are absent in A. trichopoda. Phylogenetic trees suggest that class IV genes are present in S. moellendorffii (Supplementary Fig. S7). Thus it is likely that the H3K27me1-specific HKMTase was lost in A. trichopoda. Secondly, there is an expansion of class V SET genes in both eudicots and monocots, especially in Z. mays. Overall, for the five species investigated, the number of SET genes in class I, II, III, VI and VII remains quite constant, while the number of genes in class IV and V fluctuates in angiosperms, indicating different evolutionary histories shaped by rounds of WGDs and subsequent gene losses/gains under natural selection constraints.
We identified 26 HDMase genes in F. vesca. The number of LSD HDMase genes remained nearly the same in the eight plant species and the domain architecture was highly conserved (Fig. 4B). The JmjC HDMase genes fall into 11 classes, indicating that the genes coding for JmjC HDMases underwent rapid expansion and probably functional specialization in plant genomes. This expansion and specialization process was likely involved in the evolution of epigenetic regulatory mechanisms in plants with distinct biological features. Furthermore, JmjC genes in the KDM3 group were preferentially expanded in the strawberry genome compared to other species, implying that KDM3 group genes may have evolved in strawberry to meet some unique regulatory needs. Notably, no H3K79me-specific Dot1/Dot1L HKMTase was identified in the five angiosperms (Fig. 1). Considering that the Dot1/Dot1L gene is present in S. moellendorffii (Fig. 1) and also in animals6, it is likely that the Dot1/Dot1L HKMTases were lost in the common ancestor of angiosperms.
The phylogeny based on both sequence conservation and domain architecture in this study suggests that WGDs and dispersed duplications contributed to the expansion of some histone lysine modifiers in angiosperms, which is consistent with previous reports2029. On the other hand, all the F. vesca lineage-specific duplications originated from dispersed duplications, in some cases specifically from retro-transpositions. The fact that those recently duplicated gene pairs have greatly diverged in expression patterns suggests that they might have been retained in the F. vesca genome by selection to more precisely regulate the developmental processes in which histone methylation plays a role. The SET, LSD and JmjC families each have several genes, and their functions could be redundant, complementary, or both. Thus, the overall histone modifications in particular tissues are determined by the combinatorial expression profiles of histone modifiers. Indeed, the different organs/stages have distinct combinatorial expression patterns during flower and fruit development, and many of those histone modifier genes show abrupt up- or down-regulation in specific tissues or developmental stages. Anthers and carpels, where sporogenesis and gametogenesis occur, appear to have more genes being up-regulated, irrespective of whether they code for HKMTases or HDMases, highlighting the potentially active and dynamic regulation of histone modifications in these tissues. However, pollen (considered to be in a division- and growth-quiescent state) showed the smallest number of genes with active expression, particularly among the class V SET genes (Fig. 5A). The lack of expression of both H3K9me- and H3K27me2/3-specific SET genes (both for silencing chromatin) is consistent with the de-condensed chromatin state of the vegetative cell in pollen. Pollen grains have three cells: one large vegetative cell and two germ cells.
It is known that in Arabidopsis, the vegetative cell lacks H3K9me2 marks, resulting in the genome-wide activation of transposons. The small RNAs produced by transposon activation are delivered to the germ cells to silence the transposons and thus maintain DNA integrity30. The lack of expression of class V SET genes might reflect this strategy of compromising the large vegetative cell to protect the germ line. A few genes coding for demethylases were sharply up-regulated in specific stages or tissues, for example FV_JmjC9 and FV_JmjC22 in pollen, where they might serve a critical regulatory function. Another interesting phenomenon is that the expression levels of some of the active SET genes that we investigated in pith and cortical tissues decreased during early-stage fruit development (from pollination to the big green stage), but gradually increased beginning at the white stage and reached a peak at the turning stage (Fig. 5). It has been reported that epigenetic marks fluctuate during fruit development and ripening, e.g. overall DNA methylation levels decrease during tomato fruit ripening31. Global DNA methylation measurements revealed that DNA methylation varied in both peel and flesh of sweet orange32. Our results suggest that histone modifications might be dynamic during strawberry fruit development and ripening as well, and that histone modifiers are probably involved in this regulatory process. Epigenetic regulation of cellular processes during abiotic stresses has been the subject of several recent investigations, which suggest that both DNA modifications and histone modifications play a pivotal role in plant responses to various stresses1143334. The RT-qPCR results revealed that the expression levels of a set of HKMTase genes were elevated after cold/heat shock (Fig. 5C), indicating their possible participation in the response to extreme temperatures in strawberry plants.
Overall, the expression of histone modifiers is dynamic across tissues and developmental stages, and in response to abiotic stresses as well, suggesting regulatory roles in various cellular processes in strawberry. Compared to the well-studied model plants Arabidopsis and rice, strawberry has several distinct characteristics, including the development of “fruits” from receptacle tissues and the adaptability of the species to different environments. Because histone modifiers are essential regulatory factors, how they are involved in various cellular processes is of great interest. The majority of previous studies of histone modifiers focused on Arabidopsis, which does not have fleshy fruits. Although some work has been published on histone modifiers in tomato, the data are limited. Our identification and characterization of histone modifiers in woodland strawberry is the first comprehensive analysis of HKMTases and HDMases in a non-climacteric fruit species. Our study provides an overview of the histone lysine methylation modifiers in strawberry, and should greatly facilitate molecular, biochemical and physiological characterization of histone methylation in strawberry and other Rosaceae species.

Methods

Data retrieval

Eight plant genomes were analyzed: Fragaria vesca, Arabidopsis thaliana, Vitis vinifera, Nelumbo nucifera, Oryza sativa, Zea mays, Amborella trichopoda and Selaginella moellendorffii. The complete protein sequences and corresponding annotation information for Fragaria vesca and Nelumbo nucifera were downloaded from NCBI, and those for the other species were downloaded from Phytozome (version 10.3; http://phytozome.jgi.doe.gov/pz/portal.html). See Supplementary Table S5 for versions and resources of the databases. In the proteome datasets, if more than one protein was annotated for the same gene because of alternative splicing, the longest form was used for further analysis.
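The longest-isoform rule described above can be sketched as follows. This is a minimal illustration only: it assumes the proteome has already been parsed into (gene_id, protein_id, sequence) tuples, and the function name, identifiers and toy sequences are hypothetical rather than taken from the original pipeline.

```python
def longest_isoform_per_gene(proteins):
    """Keep one protein per gene: the longest annotated form.

    proteins: iterable of (gene_id, protein_id, sequence) tuples.
    Returns a dict mapping gene_id -> (protein_id, sequence).
    """
    best = {}
    for gene_id, protein_id, sequence in proteins:
        current = best.get(gene_id)
        # On ties the first isoform seen is kept, so results are deterministic.
        if current is None or len(sequence) > len(current[1]):
            best[gene_id] = (protein_id, sequence)
    return best


# Toy records (hypothetical identifiers and sequences).
records = [
    ("geneA", "geneA.1", "MSTK"),
    ("geneA", "geneA.2", "MSTKLLPQ"),
    ("geneB", "geneB.1", "MAAR"),
]
representatives = longest_isoform_per_gene(records)
for gene_id, (protein_id, sequence) in representatives.items():
    print(gene_id, protein_id, len(sequence))
```

How gene and protein identifiers are linked depends on the annotation source (NCBI or Phytozome), so the parsing step is deliberately left out of the sketch.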

Identification of genes with the domain(s) of interest

To identify the genes with the domain(s) of interest, the amino acid sequences of the SET domain (PF00856), JmjC domain (PF02373), Amino_oxidase domain (PF01593) and SWIRM domain (PF04433) were downloaded from the Pfam database V27.035 and each was used as a query to find homologous sequences in the proteome datasets. To confirm the presence of those domains, the resulting sequences were checked against the Pfam database (http://pfam.xfam.org/search), the Conserved Domain Database36 (CDD; http://www.ncbi.nlm.nih.gov/Structure/cdd/wrpsb.cgi) available from NCBI, and the Simple Modular Architecture Research Tool database37 (SMART; http://smart.embl-heidelberg.de/), with a threshold of e-value < 1e−4. Proteins with both an Amino_oxidase domain and a SWIRM domain were identified as LSD HDMases.
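The filtering and classification logic above can be sketched as follows. This is an illustration under stated assumptions, not the original pipeline: domain hits are assumed to be already parsed into (protein_id, domain_name, e_value) tuples from whichever search tool was used, and the function names, family labels and protein identifiers are hypothetical.

```python
E_VALUE_CUTOFF = 1e-4  # threshold stated in the Methods


def confirmed_domains(hits, cutoff=E_VALUE_CUTOFF):
    """Collect, per protein, the domains whose e-value passes the cutoff.

    hits: iterable of (protein_id, domain_name, e_value) tuples.
    """
    domains = {}
    for protein_id, domain_name, e_value in hits:
        if e_value < cutoff:
            domains.setdefault(protein_id, set()).add(domain_name)
    return domains


def classify(domains):
    """Assign family labels; LSD requires both Amino_oxidase and SWIRM."""
    labels = {}
    for protein_id, found in domains.items():
        families = []
        if "SET" in found:
            families.append("SET HKMTase")
        if "JmjC" in found:
            families.append("JmjC HDMase")
        if {"Amino_oxidase", "SWIRM"} <= found:
            families.append("LSD HDMase")
        if families:
            labels[protein_id] = families
    return labels


# Toy hits (hypothetical protein identifiers and e-values).
hits = [
    ("p1", "SET", 3e-20),
    ("p2", "Amino_oxidase", 1e-30),
    ("p2", "SWIRM", 2e-8),
    ("p3", "Amino_oxidase", 5e-12),  # no SWIRM domain, so not an LSD
    ("p4", "JmjC", 1e-3),            # fails the e-value cutoff
]
labels = classify(confirmed_domains(hits))
print(labels)
```

In this sketch a protein with only an amine oxidase domain is left unclassified, which mirrors the requirement that both domains be present.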

Sequence alignment and phylogenetic analysis

Protein sequences were aligned using MUSCLE v3.8.31 with default parameters38. Phylogenetic trees were constructed with RAxML (version 8.1.16) using a gamma distribution of rate heterogeneity and 1000 bootstrap replicates39. The topology of each phylogenetic tree was verified with MrBayes v3.2.440.

Domain and motif analysis, and identification of F. vesca lineage-specific duplicated pairs

All identified proteins were searched against the Pfam, SMART and CDD databases to find other known domains. All domains found in any of the three databases with e-value < 1e−4 were kept. In addition, motif analyses were performed online with MEME (version 4.10.2, http://meme-suite.org/tools/meme)41. The number of motifs was set at no more than 20, with a length of 15–50 amino acids for each search. The lineage-specific duplicated gene pairs were identified based on the phylogenetic trees, the domain composition of the whole proteins and the motif composition of the SET domains. FV-SET13/35 was not identified as an F. vesca lineage-specific duplicated pair by the phylogenetic tree constructed for SET class V (Supplementary Fig. S4) and thus was omitted, while FV-SET33/40 and FV-SET10/12 were included based on either domain composition (Supplementary Fig. S2) or the phylogenetic trees constructed for each SET class (Supplementary Fig. S6).

Plant growth conditions, stress treatments and material collection

A 7th generation inbred line of woodland strawberry (Fragaria vesca), Ruegen F7-4 (kindly provided by Janet Slovin), was used for all strawberry material collection. Strawberry flesh at different developmental stages was collected from plants grown in 10 cm × 10 cm pots in a controlled-environment growth chamber set at 16 h light/8 h dark cycles, 22 °C, and 65% relative humidity. Strawberry fruits at the 12-day-old big green stage, big white stage (white flesh with red seeds), pink stage (slightly pink flesh and red seeds), and red stage (2–3 days after the pink stage) were collected and immediately put into liquid nitrogen. The strawberry seedlings for cold/heat shock were grown on MS medium in a growth chamber set at 16 h light/8 h dark cycles, 22 °C. Four-week-old seedlings were transferred to a growth chamber set at either 4 °C or 38 °C for cold/heat shock. Cold-shocked seedlings were collected at 1 h, 3 h and 8 h, while heat-shocked seedlings were collected at 1 h and 3 h of heat shock, and at 4 h (3 h heat shock plus 1 h recovery at 22 °C) and 8 h (3 h heat shock plus 5 h recovery at 22 °C). The collected materials were immediately put into liquid nitrogen for RNA processing.

RNA extraction and expression analysis

Before RNA extraction, the achenes were stripped from the frozen strawberry fruit, and only the de-seeded flesh was processed for RNA isolation. RNA was isolated from either the de-seeded flesh or seedlings by a modified CTAB method. After DNase I treatment, RNAs were used for cDNA synthesis with the PrimeScript RT reagent Kit with gDNA Eraser (Takara). The cDNAs were used as templates for quantitative RT-PCR to measure transcript abundance. Quantitative RT-PCR was performed using SYBR Premix Ex Taq (Takara) on a Bio-Rad iQ5. Primers used are listed in Supplementary Table S6. Results were analyzed by the ΔΔCT method42 with GAPDH as the control locus43. Three biological and three technical replicates were performed and analyzed.
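For readers unfamiliar with the ΔΔCT calculation, the arithmetic can be sketched as below. This is a generic illustration of the 2^(−ΔΔCT) formula, which assumes roughly equal and near-100% amplification efficiency for the target and reference genes. The function name and the Ct values are hypothetical and are not data from this study, and averaging replicate Ct values before taking differences is one common convention rather than necessarily the one used here.

```python
from statistics import mean


def relative_expression(ct_target_treated, ct_ref_treated,
                        ct_target_control, ct_ref_control):
    """Fold change of a target gene by the 2^(-ddCt) formula.

    Each argument is a list of replicate Ct values; the reference gene
    (GAPDH in this study) normalizes for the amount of input cDNA.
    """
    delta_treated = mean(ct_target_treated) - mean(ct_ref_treated)
    delta_control = mean(ct_target_control) - mean(ct_ref_control)
    delta_delta = delta_treated - delta_control
    return 2 ** (-delta_delta)


# Hypothetical Ct values for one target gene with three replicates each.
fold_change = relative_expression(
    ct_target_treated=[24.0, 24.5, 23.5],   # mean 24.0
    ct_ref_treated=[18.0, 18.5, 17.5],      # mean 18.0
    ct_target_control=[26.0, 26.5, 25.5],   # mean 26.0
    ct_ref_control=[18.0, 18.25, 17.75],    # mean 18.0
)
print(fold_change)  # dCt 6.0 vs 8.0, ddCt = -2.0, so a four-fold increase
```

A fold change above 1 indicates higher transcript abundance in the treated sample than in the control after normalization to the reference gene.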

Additional Information

How to cite this article: Gu, T. et al. Identification and characterization of histone lysine methylation modifiers in Fragaria vesca. Sci. Rep. 6, 23581; doi: 10.1038/srep23581 (2016).
  42 in total

1.  Analysis of relative gene expression data using real-time quantitative PCR and the 2(-Delta Delta C(T)) Method.

Authors:  K J Livak; T D Schmittgen
Journal:  Methods       Date:  2001-12       Impact factor: 3.608

2.  JmjC-domain-containing proteins and histone demethylation.

Authors:  Robert J Klose; Eric M Kallin; Yi Zhang
Journal:  Nat Rev Genet       Date:  2006-09       Impact factor: 53.242

3.  Identification and characterization of the SET domain gene family in maize.

Authors:  Yexiong Qian; Yilong Xi; Beijiu Cheng; Suwen Zhu; Xianzhao Kan
Journal:  Mol Biol Rep       Date:  2014-01-04       Impact factor: 2.316

Review 4.  The discovery of histone demethylases.

Authors:  Yujiang Geno Shi; Yuichi Tsukada
Journal:  Cold Spring Harb Perspect Biol       Date:  2013-09-01       Impact factor: 10.005

5.  Single-base resolution methylomes of tomato fruit development reveal epigenome modifications associated with ripening.

Authors:  Silin Zhong; Zhangjun Fei; Yun-Ru Chen; Yi Zheng; Mingyun Huang; Julia Vrebalov; Ryan McQuinn; Nigel Gapper; Bao Liu; Jenny Xiang; Ying Shao; James J Giovannoni
Journal:  Nat Biotechnol       Date:  2013-01-27       Impact factor: 54.908

Review 6.  Transcriptional 'memory' of a stress: transient chromatin and memory (epigenetic) marks at stress-response genes.

Authors:  Zoya Avramova
Journal:  Plant J       Date:  2015-04-15       Impact factor: 6.417

7.  Arabidopsis Histone Lysine Methyltransferases.

Authors:  Frédéric Pontvianne; Todd Blevins; Craig S Pikaard
Journal:  Adv Bot Res       Date:  2010-01-01       Impact factor: 2.175

8.  Methylation of H3-lysine 79 is mediated by a new family of HMTases without a SET domain.

Authors:  Qin Feng; Hengbin Wang; Huck Hui Ng; Hediye Erdjument-Bromage; Paul Tempst; Kevin Struhl; Yi Zhang
Journal:  Curr Biol       Date:  2002-06-25       Impact factor: 10.834

Review 9.  The many faces of histone lysine methylation.

Authors:  Monika Lachner; Thomas Jenuwein
Journal:  Curr Opin Cell Biol       Date:  2002-06       Impact factor: 8.382

10.  SMART: recent updates, new developments and status in 2015.

Authors:  Ivica Letunic; Tobias Doerks; Peer Bork
Journal:  Nucleic Acids Res       Date:  2014-10-09       Impact factor: 16.971
