
HEATSTER: A Database and Web Server for Identification and Classification of Heat Stress Transcription Factors in Plants.

Jannik Berz1, Stefan Simm1,2, Sebastian Schuster3, Klaus-Dieter Scharf1, Enrico Schleiff1,2, Ingo Ebersberger4,5,6.   

Abstract

Heat stress transcription factors (HSFs) regulate transcriptional response to a large number of environmental influences, such as temperature fluctuations and chemical compound applications. Plant HSFs represent a large and diverse gene family. The HSF members vary substantially both in gene expression patterns and molecular functions. HEATSTER is a web resource for mining, annotating, and analyzing members of the different classes of HSFs in plants. A web-interface allows the identification and class assignment of HSFs, intuitive searches in the database and visualization of conserved motifs, and domains to classify novel HSFs.

Keywords:  HSF; database; heat stress; motif search

Year:  2019        PMID: 30670918      PMCID: PMC6327235          DOI: 10.1177/1177932218821365

Source DB:  PubMed          Journal:  Bioinform Biol Insights        ISSN: 1177-9322


Introduction

Plants have evolved a remarkable complexity in their stress response. The transcriptional reprogramming at higher temperatures is controlled by heat stress transcription factors (HSFs) leading to the activation of genes involved in heat stress response (HSR).[1,2] In general, HSFs control the expression of genes responsive to numerous abiotic stresses (eg, heat, drought, and salinity), while recently a function in developmental regulation was observed as well.[3-6] The abundance and function of HSFs are controlled by various mechanisms like protein degradation, cooperative interactions between distinct HSF members, and the interaction with chaperones.[7] The HSF gene family comprises between 15 and 50 members depending on the plant species.[8] All HSFs share the presence of 2 conserved functional domains, the N-terminal DNA-binding domain (DBD), containing a helix-turn-helix motif flanked by 2 β-strands on each side, and 2 heptad repeat patterns (HR-A/B) of hydrophobic amino acids (aa) building the oligomerization domain (OD). Based on the length of the insertion in the linker sequence between the 2 HR patterns, HSFs have been differentiated into class A (21 aa), B (0 aa), and C (7 aa). Furthermore, differences in the primary and secondary structure of the 2 conserved domains have been used for classifying plant HSFs (Figure 1).[8-10]
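The class rule based on the linker insertion length can be written down as a small lookup. This is only an illustrative sketch: the function name is hypothetical, and treating any other insertion length as unassigned is an assumption, since the text gives only the three canonical lengths.

```python
# Insertion lengths (in amino acids) in the linker between HR-A and HR-B,
# as stated in the text: class A = 21 aa, class B = 0 aa, class C = 7 aa.
LINKER_INSERT_TO_CLASS = {21: "A", 0: "B", 7: "C"}


def hsf_class_from_linker(insert_length):
    """Return 'A', 'B', or 'C' for a canonical insertion length, else None.

    Returning None for other lengths is an assumption of this sketch;
    in practice the primary and secondary structure of the DBD and OD
    are also used for classification.
    """
    return LINKER_INSERT_TO_CLASS.get(insert_length)


if __name__ == "__main__":
    for length in (21, 0, 7, 12):
        print(length, hsf_class_from_linker(length))
```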
Figure 1.

Domain architecture of the 3 major HSF classes in plants. Gray lines represent regions of low conservation, variable length, and without annotated motifs. The domain architecture comprises the conserved DNA-binding (yellow) and oligomerization domains (HR-A/B region, green), the NLS and NES (orange), and the transcriptional activator and repressor domains (blue). AHA indicates activator motifs/domains; DBD, DNA-binding domain; HR-A/B, heptad repeat patterns; HSF, heat stress transcription factors; NES, nuclear export signal/sequence; NLS, nuclear localization signal/sequence; RD, repressor motifs/domains.

Adapted from Scharf et al[10]

Each HSF class is further divided into sub-classes, eg, HsfA1, based on the characteristic architecture of the functional motifs. These functional motifs regulate the DNA binding (DBD), the oligomerization (HR-A, HR-B), and the intracellular localization (nuclear localization signal/sequence [NLS]; nuclear export signal/sequence [NES]). With respect to the latter, only class A HSFs, but not class B and C HSFs, harbor an NES. In addition, aromatic, hydrophobic, and acidic sequence stretches can act as activator domains (AHA) or repressor domains (RD). These domains fine-tune the functionality of the individual transcription factors.[10] The RD is generally associated with class B HSFs, while AHA motifs are typically found in class A HSFs. In some cases, a sub-class of HSFs can comprise up to 5 different factors, annotated by additional letters (eg, HsfA1a). Overall, HSFs of different classes and sub-classes establish a complex network that controls a fine-tuned program of stress response.
Functional analyses in model plants, eg, Solanum lycopersicum, Oryza sativa, or Arabidopsis thaliana, have greatly contributed to elucidating the function of this regulatory network.[6,11] The information about the function of the individual HSFs in maintenance of homeostasis and recovery after stress cycles serves as a basis to investigate the HSR in plants, with a particular focus on crops.[5] Therefore, a comprehensive sequence- and structure-based mining, annotation, and analysis of plant HSFs in different species will provide a deep understanding of their role in plant abiotic stress responses with a strong focus on HSR, which should lead to a decrease in crop losses worldwide.[4] HEATSTER is a web-based reference platform for integrating HSF research across plants. In version 1.0 (v1.0), the underlying database comprises 848 manually curated HSFs from 32 plant species.[10] In version 2.0 (v2.0), this data set is complemented by a further 1000 mostly automatically annotated HSFs from an additional 29 plant species of different ranks, 1 Phaeophyta, and 3 Rhodophyta. Complementary to the data repository, HEATSTER provides a rich environment for annotation and classification of HSFs in new species. Furthermore, HEATSTER facilitates the analysis of HSFs in a functional and evolutionary context.

Materials and Methods

Deposited data

Full-length amino acid sequences (v1.0 and v2.0) and coding sequence (CDS) nucleotide sequences (v1.0) of plant HSFs are deposited in the HEATSTER database. HEATSTER v1.0 from January 2014 includes 32 manually curated angiosperm species with 26 Eudicotyledons and 6 Monocotyledons. HEATSTER v2.0 from September 2016 extends the database to 65 species including 1 Phaeophyta, 3 Rhodophyta, and 61 species of Viridiplantae (5 Chlorophyta, 1 Bryophyta, 1 Lycopodiidae, 1 basal Magnoliophyta, 3 Gymnosperms, 15 Monocotyledons, and 35 Eudicotyledons) (Figure 2; Supplementary Table 1).
Figure 2.

Species included in the HEATSTER databases. The taxonomic tree displays the species collection that is currently included in HEATSTER. The color code corresponds to the taxonomic assignment of the individual species. Species names are abbreviated (first 3 letters of the genus and first 2 letters of the species epithet). Supplementary Table 1 links the short name to the full name of the species. The tree is rooted using the Phaeophyta (dark gray) and Rhodophyta (light gray) as outgroups. Species represented in the manually curated v1.0 of HEATSTER are marked with an asterisk.

HEATSTER v1.0 features a collection of manually curated HSFs from plant genomes available prior to January 2014. The curation step served to correct sequencing errors and wrong gene models resulting from an automated gene annotation procedure. The curation procedure included the following analysis steps: (1) editing based on homology comparison to model organisms and (2) scanning for conserved signature sequence motifs within the predicted genomic region, as well as in the adjacent 5’ and 3’ intergenic regions. Although we are confident that the curation procedure removed most annotation errors, an ultimate validation must await experimental evidence. All sequences of HEATSTER v2.0 were extracted directly from the databases Phytozome, NCBI, Dendrome, Bambogdb, Banana hub, Kazusa Database, and Cucurbit Genomics Database (Supplementary Table 1; September 2016).
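The species abbreviation scheme given in the Figure 2 caption (first 3 letters of the genus plus first 2 letters of the species epithet) is easy to reproduce. The sketch below is illustrative only; the function name is hypothetical, and the capitalization is an assumption modeled on the "Solly" prefix used for tomato in the v2.0 annotations.

```python
def abbreviate_species(binomial):
    """Abbreviate a binomial name: 3 letters of genus + 2 of epithet.

    Capitalizing only the first letter is an assumption of this sketch.
    """
    genus, epithet = binomial.split()[:2]
    return genus[:3].capitalize() + epithet[:2].lower()


if __name__ == "__main__":
    for name in ("Solanum lycopersicum", "Arabidopsis thaliana", "Oryza sativa"):
        print(name, "->", abbreviate_species(name))
```

For the authoritative mapping between short and full names, Supplementary Table 1 remains the reference.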

Signature motif libraries

The HEATSTER database provides 2 sets of signature motifs for the HSF sub-classes, the first for the manually curated sequences of v1.0 and the second for the automatically annotated HSFs of v2.0. For the creation of the signature motif libraries, training sets of HSF sequences for each sub-class were defined. To compile the training data for the signature motifs of v1.0, the assignment of HSF sub-class sequences was performed in 3 stages using sequences from 32 angiosperm species: (1) Homolog search with known HSF sequences in publicly available EST, cDNA, and protein databases; (2) Refinement of identified HSF sequences by BLAST search in plant genome databases; (3) Classification of newly identified HSF sequences based on conserved functional and signature sequence motifs according to the widely accepted nomenclature for plant HSFs.[8,9] The resulting signature motif library of v1.0, based on the nomenclature of Nover et al,[8,9] was used as the starting point to classify HSF sub-classes in all 61 plant species of v2.0. Conserved signature motifs within the predicted HSF sequences were identified with MEME, TOMTOM, and MAST.[12,13] For each HSF sub-class, a motif search was performed with MEME using different parameter sets for motif width (8-20 aa, 20-35 aa, 35-50 aa) and site distribution model (OOPS, ZOOPS, and ANR). The option maxiter was set to 100 and nmotifs was set to 20. For each HSF sub-class, we created a decoy database containing the randomly shuffled sequences of the HSF sub-class and performed the motif search 10 times (×9 shuffling, ×1 original) for each parameter setting. The identification of the same signature motif in the decoy database was counted as a false positive (FP) and used to calculate a false discovery rate (FDR). All motifs with an FDR below a threshold of 0.3 were selected as signature motifs for the HSF sub-classes. Furthermore, signature motifs with an identity above 95% (identified by CD-HIT) were merged via TOMTOM.
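The parameter grid and the decoy procedure described above can be sketched as follows. This is a hedged illustration, not the authors' pipeline: the function names, file names, and output directory scheme are hypothetical, and the FDR definition used here (fraction of the 9 shuffled runs in which the motif is recovered) is an assumption, because the text does not give the exact formula. The MEME options shown (`-protein`, `-mod`, `-minw`, `-maxw`, `-nmotifs`, `-maxiter`, `-oc`) are standard MEME command-line options; the commands are only built, not executed.

```python
import random
from itertools import product

# Parameter sets taken from the text.
WIDTH_RANGES = [(8, 20), (20, 35), (35, 50)]   # motif width in aa
SITE_MODELS = ["oops", "zoops", "anr"]         # MEME -mod values


def meme_commands(fasta_path, out_prefix):
    """Build one MEME command line per (width range, site model) pair."""
    commands = []
    for (minw, maxw), model in product(WIDTH_RANGES, SITE_MODELS):
        out_dir = f"{out_prefix}_{model}_{minw}_{maxw}"  # hypothetical naming
        commands.append([
            "meme", fasta_path, "-protein",
            "-mod", model,
            "-minw", str(minw), "-maxw", str(maxw),
            "-nmotifs", "20", "-maxiter", "100",
            "-oc", out_dir,
        ])
    return commands


def shuffled_decoys(sequences, n_decoys=9, seed=0):
    """Return n_decoys decoy sets; each sequence is shuffled residue-wise."""
    rng = random.Random(seed)
    decoys = []
    for _ in range(n_decoys):
        decoy_set = []
        for seq in sequences:
            residues = list(seq)
            rng.shuffle(residues)  # keeps length and composition
            decoy_set.append("".join(residues))
        decoys.append(decoy_set)
    return decoys


def decoy_fdr(n_decoy_runs_with_motif, n_decoy_runs=9):
    """ASSUMED FDR: share of shuffled runs that recover the same motif."""
    return n_decoy_runs_with_motif / n_decoy_runs


if __name__ == "__main__":
    for cmd in meme_commands("HsfA1.fasta", "meme_HsfA1"):
        print(" ".join(cmd))
    print(decoy_fdr(2) < 0.3)  # a motif found in 2 of 9 decoy runs passes
```

Under this assumed definition, a motif recovered in at most 2 of the 9 shuffled runs stays below the 0.3 threshold.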
Finally, the signature motifs of each HSF sub-class were cross-validated by searching them against the signature motif libraries of all other HSF sub-classes with MAST and TOMTOM, so that only sub-class specific signature motifs were retained.
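The outcome of this cross-validation can be expressed as a simple set operation. The sketch below assumes that motifs found to match across libraries (by MAST or TOMTOM) have already been collapsed to a shared identifier; the function name and the motif identifiers in the example are hypothetical.

```python
from collections import Counter


def subclass_specific_motifs(motifs_by_subclass):
    """Keep, per sub-class, only motifs that occur in no other sub-class.

    motifs_by_subclass maps a sub-class name to an iterable of motif IDs.
    """
    # Count in how many sub-classes each motif ID appears.
    counts = Counter(
        motif
        for motifs in motifs_by_subclass.values()
        for motif in set(motifs)
    )
    return {
        subclass: {m for m in motifs if counts[m] == 1}
        for subclass, motifs in motifs_by_subclass.items()
    }


if __name__ == "__main__":
    library = {
        "HsfA1": {"motif_01", "motif_02", "motif_shared"},
        "HsfA2": {"motif_03", "motif_shared"},
        "HsfB1": {"motif_04"},
    }
    print(subclass_specific_motifs(library))
```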

Results

HEATSTER platform

The HEATSTER website is free and open to all users without a login requirement. HEATSTER is written in PHP, HTML, and CSS and uses HMMscan from HMMER (see http://hmmer.org/) and MAST from the MEME suite (see http://meme-suite.org/doc/mast.html) for the classification and identification. Furthermore, MySQL (see https://www.mysql.com/de/), jQuery (see https://jquery.com/), sorttable (see https://www.kryogenix.org/code/browser/sorttable/), and CSS Bootstrap (see https://getbootstrap.com/docs/3.3/css/) are included in the website. The HEATSTER website is located at http://applbio.biologie.uni-frankfurt.de/hsf/heatster/. The website is divided into an HSF classification tool and a visualization tool. Besides classification and visualization, HEATSTER provides downloadable content, such as logo plots of the HSF motifs and FASTA files of HSF sequences, in the download section. The underlying MySQL database provides comprehensive information about curated HSFs from 26 Eudicotyledons and 6 Monocotyledons of v1.0.[10] HEATSTER v2.0 provides information for 5 Chlorophyta, 1 Bryophyta, 1 Lycopodiidae, 1 basal Magnoliophyta, 3 Gymnosperms, 15 Monocotyledons, and 35 Eudicotyledons (Figure 2; Supplementary Table 1). Furthermore, 3 Rhodophyta and 1 Phaeophyta were included. HEATSTER features the nomenclature of plant HSFs suggested by Nover et al.[8,9] All sequences together with their annotation can be accessed via the web-interface.

Annotation and visualization tool

The sequence analysis routines of the HEATSTER classification tool facilitate the online annotation and classification of novel HSF candidates. A library of signature motifs characterizing the individual HSF classes and sub-classes forms the foundation of these analyses. To generate this library in v2.0, we first compiled HSF sub-classes containing 61 plant species based on the v1.0 nomenclature and used MEME to identify and validate shared motif sets in the individual sub-classes. Signature motifs with a q-value lower than 1.0e-09 and sequence identity of more than 70% were merged via TOMTOM.[12] To arrive at sub-class specific signature motif sets, we removed those motifs that are represented in more than 1 HSF sub-class. The web-logos of the signature motif library can be accessed online. For the classification of an HSF candidate, HEATSTER first assigns the candidate to the HSF classes A, B, or C based on the characteristic appearance of the DBD and OD domains. Sequences harboring only one of these domains are called HSF-related. Subsequently, HEATSTER maps the signature motif sets of all HSF sub-classes against the candidate with MAST.[13] The sequence is then assigned to the best matching HSF sub-class. For the HSF classification, HEATSTER requires sequences in FASTA format as input, either submitted as a file or pasted into the provided textbox. HEATSTER also allows batch searches where the entire gene set of an organism can be screened for the presence of HSFs. The output is presented as a table and as a modified MAST visualization. The tables are downloadable in csv format and the sequences in FASTA format. The visualization tool displays HSF sequences together with their signature motif architecture, which makes comparative studies on HSFs intuitive and straightforward. For visualization of the HSF class-specific motifs, the input is selected via dropdown menus.
In the plain mode, the user can display a set of signature motifs for sequences representing a single or multiple HSF sub-classes. In the extended mode, the user can additionally paste or upload a query protein in FASTA format for a comparison to a set of pre-selected HSFs. This mode also allows, if present, the display of signature motifs characteristic for other sub-classes. Thereby, novel motif combinations can be detected that may provide first insights into the specific functions of the HSFs and a more fine-grained classification of a novel HSF. The output is also presented and downloadable in a modified MAST HTML format.
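The decision flow of the classification tool, as described above, can be summarized in a few lines. This is a sketch under stated assumptions and not HEATSTER's implementation: the function name and input layout are hypothetical, domain detection and class assignment are taken as already computed upstream, and "best matching" is interpreted here as the sub-class with the lowest MAST e-value.

```python
def annotate_candidate(has_dbd, has_od, hsf_class, subclass_evalues):
    """Return (label, sub_class) for one candidate sequence.

    has_dbd / has_od : whether the DBD and OD domains were detected.
    hsf_class        : 'A', 'B', or 'C' (assigned upstream), or None.
    subclass_evalues : dict of sub-class name -> MAST e-value (hypothetical).
    """
    if not has_dbd and not has_od:
        return ("no HSF", None)
    if has_dbd != has_od:
        # Exactly one of the two conserved domains is present.
        return ("HSF-related", None)
    if not subclass_evalues:
        return (hsf_class, None)
    # ASSUMPTION: the best match is the sub-class with the lowest e-value.
    best = min(subclass_evalues, key=subclass_evalues.get)
    return (hsf_class, best)


if __name__ == "__main__":
    print(annotate_candidate(True, True, "A", {"HsfA1a": 1e-200, "HsfA1b": 1e-150}))
    print(annotate_candidate(True, False, None, {}))
```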

Applications of HEATSTER

As an example, we demonstrate the use of HEATSTER for the genome-wide identification of HSFs in the ITAG2.4[14] reference genome of S lycopersicum (tomato). This corresponds to the first step in reconstructing the HSF network in a newly sequenced genome (Figure 3 outlines the procedure).
Figure 3.

HSF classification in HEATSTER. Candidates are first assigned to the HSF classes A, B, or C, and are subsequently sub-classified based on the signature motif architecture (colored boxes). Logo plots for each motif can be visualized alongside the architecture. NLS and NES are shown as examples. HsfA1a: Solyc08g005170; HsfA1c: Solyc08g076590; HsfA1b: Solyc03g097120; HsfA1e: Solyc06g072750. HSF indicates heat stress transcription factors; NES, nuclear export signal; NLS, nuclear localization signal.

For comparison of the HEATSTER versions 1.0 and 2.0, we analyzed the tomato proteome with the HEATSTER batch search, which predicts HSFs in multiple sequences. In v1.0, we identified 15 HsfAs, 8 HsfBs, and 1 HsfC. Furthermore, 3 HSF-like sequences are known from literature.[10] HEATSTER v2.0 could identify 26 of the known 27 HSFs (Table 1).
Table 1.

HSFs in Solanum lycopersicum identified by the HEATSTER.

Identifier ITAG2.4    v1.0           v2.0           e-value v2.0
Solyc03g097120.2.1    SolycHsfA1b    SollyHsfA1b    4.40E-262
Solyc06g072750.2.1    SolycHsfA1e    SollyHsfA1e    9.70E-256
Solyc08g005170.2.1    SolycHsfA1a    SollyHsfA1a    3.80E-231
Solyc08g076590.2.1    SolycHsfA1c    SollyHsfA1c    1.80E-190
Solyc08g062960.2.1    SolycHsfA2     SollyHsfA2     6.50E-195
Solyc09g009100.2.1    SolycHsfA3     SollyHsfA3     0
Solyc02g072000.2.1    SolycHsfA4c    SollyHsfA4c    3.60E-194
Solyc03g006000.2.1    SolycHsfA4a    SollyHsfA4a    2.50E-257
Solyc07g055710.2.1    SolycHsfA4b    SollyHsfA4b    1.60E-199
Solyc12g098520.1.1    SolycHsfA5     SollyHsfA5     0
Solyc09g065660.2.1    SolycHsfA7     SollyHsfA6     1.40E-160
Solyc09g082670.2.1    SolycHsfA6a    SollyHsfA6a    5.90E-152
Solyc09g059520.2.1    SolycHsfA8     SollyHsfA8     1.00E-210
Solyc02g072060.1.1    SolycHsfl1     SollyHsfA9     1.20E-57
Solyc02g079180.1.1    SolycHsfl2     SollyHsfA9     3.70E-34
Solyc07g040680.2.1    SolycHsfA9     SollyHsfA9     2.40E-220
Solyc02g090820.2.1    SolycHsfB1     SollyHsfB1     1.20E-148
Solyc03g026020.2.1    SolycHsfB2a    SollyHsfB2a    9.50E-158
Solyc08g080540.2.1    SolycHsfB2b    SollyHsfB2b    4.20E-208
Solyc04g016000.2.1    SolycHsfB3a    SollyHsfB3a    1.10E-155
Solyc10g079380.1.1    SolycHsfB3b    SollyHsfB3b    3.00E-162
Solyc04g078770.2.1    SolycHsfB4a    SollyHsfB4a    7.30E-174
Solyc11g064990.1.1    SolycHsfB4b    SollyHsfB4b    1.70E-112
Solyc02g078340.2.1    SolycHsfB5     SollyHsfB5     2.50E-160
Solyc12g007070.1.1    SolycHsfC1     SollyHsfC1     1.90E-176
Solyc06g053960.2.1    SolycHsfA6b    SollyN.C.
Solyc11g008410.2.1    SolycHsfl3

The table provides for each gene represented by the ITAG2.4 identifier, the corresponding annotation by the HEATSTER v1.0 and v2.0 as well as the e-value of the classification by the v2.0.
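The comparison in Table 1 can be tallied programmatically. The sketch below transcribes the v1.0 and v2.0 labels from the table with the species prefixes (Solyc, Solly) dropped for readability. Two points are assumptions of this sketch: the v2.0 entry "SollyN.C." is read as a detected sequence without a sub-class assignment, and an empty v2.0 cell is read as not detected.

```python
# (ITAG2.4 identifier, v1.0 label, v2.0 label); None = empty v2.0 cell.
TABLE_1 = [
    ("Solyc03g097120.2.1", "HsfA1b", "HsfA1b"),
    ("Solyc06g072750.2.1", "HsfA1e", "HsfA1e"),
    ("Solyc08g005170.2.1", "HsfA1a", "HsfA1a"),
    ("Solyc08g076590.2.1", "HsfA1c", "HsfA1c"),
    ("Solyc08g062960.2.1", "HsfA2", "HsfA2"),
    ("Solyc09g009100.2.1", "HsfA3", "HsfA3"),
    ("Solyc02g072000.2.1", "HsfA4c", "HsfA4c"),
    ("Solyc03g006000.2.1", "HsfA4a", "HsfA4a"),
    ("Solyc07g055710.2.1", "HsfA4b", "HsfA4b"),
    ("Solyc12g098520.1.1", "HsfA5", "HsfA5"),
    ("Solyc09g065660.2.1", "HsfA7", "HsfA6"),
    ("Solyc09g082670.2.1", "HsfA6a", "HsfA6a"),
    ("Solyc09g059520.2.1", "HsfA8", "HsfA8"),
    ("Solyc02g072060.1.1", "Hsfl1", "HsfA9"),
    ("Solyc02g079180.1.1", "Hsfl2", "HsfA9"),
    ("Solyc07g040680.2.1", "HsfA9", "HsfA9"),
    ("Solyc02g090820.2.1", "HsfB1", "HsfB1"),
    ("Solyc03g026020.2.1", "HsfB2a", "HsfB2a"),
    ("Solyc08g080540.2.1", "HsfB2b", "HsfB2b"),
    ("Solyc04g016000.2.1", "HsfB3a", "HsfB3a"),
    ("Solyc10g079380.1.1", "HsfB3b", "HsfB3b"),
    ("Solyc04g078770.2.1", "HsfB4a", "HsfB4a"),
    ("Solyc11g064990.1.1", "HsfB4b", "HsfB4b"),
    ("Solyc02g078340.2.1", "HsfB5", "HsfB5"),
    ("Solyc12g007070.1.1", "HsfC1", "HsfC1"),
    ("Solyc06g053960.2.1", "HsfA6b", "N.C."),  # "SollyN.C." in the table
    ("Solyc11g008410.2.1", "Hsfl3", None),     # no v2.0 entry
]


def compare_versions(rows):
    """Return (total, detected_in_v2, {identifier: (v1, v2)} for differing labels)."""
    detected = [row for row in rows if row[2] is not None]
    changed = {ident: (v1, v2) for ident, v1, v2 in detected if v1 != v2}
    return len(rows), len(detected), changed


if __name__ == "__main__":
    total, detected, changed = compare_versions(TABLE_1)
    print(f"{detected} of {total} detected by v2.0")
    for ident, (v1, v2) in changed.items():
        print(ident, v1, "->", v2)
```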

Furthermore, 2 of the 3 HSF-like sequences could be annotated as HsfA9s and only the very similar HsfA6 and HsfA7 showed differences in the classification between HEATSTER v1.0 and v2.0. As the 4 classified tomato HsfA1 sequences showed distinct patterns of signature motifs (Figure 3), a further, more fine-grained classification of the HSF, as indicated by lower case letters, is possible.

Supplemental material, Supplementary_Table1_revised_xyz1201380f61038 for HEATSTER: A Database and Web Server for Identification and Classification of Heat Stress Transcription Factors in Plants by Jannik Berz, Stefan Simm, Sebastian Schuster, Klaus-Dieter Scharf, Enrico Schleiff and Ingo Ebersberger in Bioinformatics and Biology Insights
