Yiying Liao, Zhiming Liu, Andrew W Gichira, Min Yang, Ruth Wambui Mbichi, Linping Meng, Tao Wan.
Abstract
Heat shock factor (HSF) genes are essential in several basic developmental pathways in plants. Despite extensive studies on the structure, functional diversification, and evolution of HSF genes, their divergence history and gene duplication patterns remain unknown. To clarify the probable divergence patterns in these subfamilies, we analyzed the evolutionary history of HSF genes using phylogenetic reconstruction and genomic synteny analyses, taking advantage of the increased sampling of genomic data from pteridophytes, gymnosperms, and basal angiosperms. We identified a novel clade that includes HSFA2, HSFA6, HSFA7, and HSFA9 with a complex relationship, which is very likely due to orthologous or paralogous genes retained after frequent gene duplication events. We hypothesize that HSFA9 derived from HSFA2 through gene duplication in ancestral eudicots and then expanded in a lineage-specific way. Our findings indicate that HSFB3 and HSFB5 emerged before the divergence of ancestral angiosperms but were lost in the most recent common ancestor of monocots. We also presume that HSFC2 derived from HSFC1 in ancestral monocots. This work proposes that during the radiation of flowering plants, the size of the HSF gene family was also being adjusted, with considerable sub- or neo-functionalization. The independent evolution of HSFs in eudicots and monocots included lineage-specific gene duplication, which gave rise to a new gene in ancestral eudicots and in ancestral monocots, and lineage-specific gene loss in ancestral monocots. Our analyses provide essential insights for studying the evolutionary history of this multigene family. ©2022 Liao et al.
Keywords: Diversification; Gene family evolution; Heat shock factor; Lineage-specific expansions; Whole genome duplication
Year: 2022 PMID: 35966928 PMCID: PMC9373977 DOI: 10.7717/peerj.13603
Source DB: PubMed Journal: PeerJ ISSN: 2167-8359 Impact factor: 3.061
The species used for phylogenetic tree construction, and the category of HSFs.
| Taxonomy | Species | Abbreviation | A1 | A2 | A3 | A4 | A5 | A6 | A7 | A8 | A9 | B1 | B2 | B3 | B4 | B5 | C1 | C2 | HSF like (N.C.) | Total |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Chlorophyta | | Chlre | 1 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 2 | 4 |
| Chlorophyta | | Volca | 0 | 1 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 2 |
| Bryophyta | | Phypa | 4 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 4 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 8 |
| Pteridophyta | | Selmo | 4 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 2 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 7 |
| Pteridophyta | | Azofi | 7 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 3 | 0 | 0 | 3 | 0 | 0 | 0 | 1 | 14 |
| Pteridophyta | | Cerga | 3 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 2 | 0 | 0 | 0 | 0 | 0 | 0 | 3 | 9 |
| Pteridophyta | | Lygja | 5 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 2 | 0 | 0 | 0 | 0 | 0 | 0 | 6 | 13 |
| Pteridophyta | | Pteaq | 1 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 3 | 0 | 0 | 0 | 0 | 0 | 0 | 5 | 10 |
| Pteridophyta | | Salcu | 7 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 4 | 0 | 0 | 3 | 0 | 0 | 0 | 0 | 15 |
| Gymnosperm | | Abifi | 2 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 2 | 1 | 0 | 2 | 0 | 0 | 0 | 3 | 11 |
| Gymnosperm | | Aracu | 4 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 4 | 1 | 0 | 1 | 0 | 0 | 0 | 3 | 13 |
| Gymnosperm | | Cepsi | 3 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 2 | 1 | 0 | 1 | 0 | 0 | 0 | 1 | 8 |
| Gymnosperm | | Cycre | 4 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 1 | 0 | 1 | 0 | 0 | 0 | 1 | 8 |
| Gymnosperm | | Epheq | 4 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 2 | 7 |
| Gymnosperm | | Ginbi | 5 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 2 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 8 |
| Gymnosperm | | GinbiR | 4 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 2 | 0 | 0 | 0 | 2 | 9 |
| Gymnosperm | | Gnemo | 6 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 4 | 0 | 0 | 0 | 13 | 24 |
| Gymnosperm | | Metgl | 6 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 2 | 1 | 0 | 1 | 0 | 0 | 0 | 2 | 12 |
| Gymnosperm | | Picab | 4 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 7 | 1 | 0 | 2 | 0 | 0 | 0 | 5 | 19 |
| Gymnosperm | | PicabR | 3 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 2 | 1 | 0 | 0 | 0 | 0 | 0 | 2 | 9 |
| Gymnosperm | | Picgl | 4 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 11 | 1 | 0 | 2 | 0 | 0 | 0 | 0 | 18 |
| Gymnosperm | | Pinta | 7 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 15 | 1 | 0 | 4 | 0 | 0 | 0 | 20 | 48 |
| Gymnosperm | | PintaR | 4 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 2 | 1 | 0 | 2 | 0 | 0 | 0 | 1 | 10 |
| Gymnosperm | | Podma | 3 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 3 | 1 | 0 | 0 | 0 | 0 | 0 | 3 | 10 |
| Gymnosperm | | Scive | 2 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 3 | 1 | 0 | 1 | 0 | 0 | 0 | 3 | 10 |
| Gymnosperm | | Taxch | 5 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 2 | 1 | 0 | 0 | 0 | 0 | 0 | 1 | 9 |
| Gymnosperm | | Welmi | 8 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 1 | 0 | 0 | 0 | 2 | 12 |
| Gymnosperm | | Zamfu | 4 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 2 | 1 | 0 | 1 | 0 | 0 | 0 | 1 | 9 |
| Basal angiosperms | | Ambtr | 2 | 1 | 1 | 0 | 1 | 1 | 0 | 0 | 0 | 1 | 1 | 0 | 2 | 1 | 0 | 0 | 2 | 13 |
| Basal angiosperms | | Lirch | 2 | 2 | 1 | 1 | 2 | 1 | 1 | 0 | 0 | 2 | 2 | 1 | 1 | 1 | 1 | 0 | 0 | 18 |
| Basal angiosperms | | Nymco | 3 | 1 | 1 | 1 | 1 | 0 | 1 | 0 | 0 | 1 | 2 | 0 | 3 | 1 | 2 | 0 | 4 | 21 |
| Eudicots | | Arath | 4 | 1 | 1 | 2 | 1 | 2 | 2 | 1 | 0 | 1 | 2 | 1 | 1 | 0 | 1 | 0 | 4 | 24 |
| Eudicots | | Cajca | 2 | 1 | 1 | 2 | 1 | 2 | 1 | 1 | 1 | 2 | 2 | 1 | 4 | 1 | 1 | 0 | 4 | 27 |
| Eudicots | | Citla | 1 | 2 | 1 | 3 | 1 | 2 | 0 | 2 | 1 | 1 | 2 | 2 | 3 | 1 | 2 | 0 | 0 | 24 |
| Eudicots | | Mimgu | 2 | 1 | 1 | 2 | 0 | 2 | 0 | 1 | 0 | 0 | 2 | 1 | 2 | 1 | 1 | 0 | 5 | 21 |
| Eudicots | | Nelnu | 4 | 2 | 1 | 2 | 1 | 1 | 0 | 1 | 0 | 2 | 2 | 2 | 3 | 2 | 2 | 0 | 0 | 25 |
| Eudicots | | Poptr | 3 | 1 | 1 | 3 | 2 | 4 | 0 | 2 | 1 | 1 | 3 | 2 | 4 | 2 | 1 | 0 | 4 | 34 |
| Eudicots | | Prupe | 2 | 1 | 1 | 2 | 0 | 2 | 0 | 1 | 1 | 1 | 2 | 1 | 1 | 1 | 1 | 0 | 3 | 20 |
| Eudicots | | Solly | 4 | 1 | 1 | 3 | 1 | 2 | 0 | 1 | 3 | 1 | 2 | 2 | 2 | 1 | 1 | 0 | 1 | 26 |
| Monocots | | Bradi | 1 | 3 | 1 | 2 | 1 | 2 | 2 | 1 | 0 | 1 | 3 | 0 | 3 | 0 | 2 | 2 | 2 | 26 |
| Monocots | | Orybr | 0 | 3 | 0 | 2 | 1 | 2 | 2 | 1 | 0 | 1 | 1 | 0 | 3 | 0 | 2 | 1 | 3 | 22 |
| Monocots | | Orysa | 1 | 4 | 1 | 2 | 1 | 2 | 2 | 1 | 0 | 1 | 3 | 0 | 4 | 0 | 2 | 2 | 3 | 29 |
| Monocots | | Phoda | 7 | 3 | 2 | 2 | 2 | 2 | 0 | 0 | 0 | 2 | 4 | 0 | 3 | 0 | 1 | 2 | 1 | 31 |
| Monocots | | Phyhe | 1 | 4 | 2 | 3 | 2 | 3 | 1 | 0 | 0 | 2 | 2 | 0 | 4 | 0 | 2 | 1 | 14 | 41 |
| Monocots | | Sorbi | 1 | 3 | 1 | 1 | 1 | 2 | 2 | 1 | 0 | 1 | 3 | 0 | 3 | 0 | 2 | 2 | 3 | 26 |
| Monocots | | Triur | 1 | 5 | 1 | 1 | 1 | 0 | 0 | 1 | 0 | 1 | 0 | 0 | 0 | 0 | 2 | 1 | 6 | 20 |
| Monocots | | Zeama | 2 | 2 | 1 | 3 | 1 | 2 | 2 | 2 | 0 | 2 | 4 | 0 | 2 | 0 | 2 | 2 | 13 | 40 |
Notes.
The data from transcriptomes. N.C.: the sequence contains only some of the domains required for a heat shock transcription factor and therefore could not be classified.
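Each Total in the table above should equal the sum of the subfamily counts plus the unclassified HSF-like (N.C.) sequences. A minimal sketch of that consistency check follows, using three rows copied from the table (Chlre, Arath, Orysa). The names `COLUMNS`, `ROWS`, and `row_total_matches` are hypothetical and are not part of the original study.

```python
# Column order follows the table header: A1-A9, B1-B5, C1, C2, HSF like (N.C.).
# The reported Total is stored separately so it can be compared with the sum.
COLUMNS = [
    "A1", "A2", "A3", "A4", "A5", "A6", "A7", "A8", "A9",
    "B1", "B2", "B3", "B4", "B5", "C1", "C2", "N.C.",
]

# Three rows copied from the table, keyed by species abbreviation:
# (per-subfamily counts, reported Total).
ROWS = {
    "Chlre": ([1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 2], 4),
    "Arath": ([4, 1, 1, 2, 1, 2, 2, 1, 0, 1, 2, 1, 1, 0, 1, 0, 4], 24),
    "Orysa": ([1, 4, 1, 2, 1, 2, 2, 1, 0, 1, 3, 0, 4, 0, 2, 2, 3], 29),
}


def row_total_matches(counts, reported_total):
    """Return True if the per-subfamily counts add up to the reported Total."""
    if len(counts) != len(COLUMNS):
        raise ValueError("expected one count per subfamily column")
    return sum(counts) == reported_total


for abbr, (counts, total) in ROWS.items():
    status = "ok" if row_total_matches(counts, total) else "mismatch"
    print(f"{abbr}: sum={sum(counts)} reported={total} {status}")
```

The same check can be extended to every row of the table to catch transcription errors.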
Figure 1. An unrooted maximum-likelihood tree showing the phylogeny and classification of 670 HSF sequences from 44 species representing seven main taxa: Chlorophyta, Bryophyta, Pteridophyta, Gymnospermae, basal angiosperms, eudicots, and monocots.
Species information and sequence accession numbers used for the tree are listed in File S1. HSFA, HSFB, and HSFC cluster into three main clades. The clades of subfamilies HSFA2-7, HSFA8 and HSFA9, HSFB2-5, and HSFC1 and HSFC2 are shown over the relevant branches in different colors. The three groups HSFA, HSFB, and HSFC are highlighted with different color shades. The scale bar represents amino acid substitutions per site.
The detected paralogous genes within different species.
| Taxonomy | Species | Paralogous gene pairs |
|---|---|---|
| Pteridophyta | | HSFB1-HSFB4 |
| Basal angiosperms | | HSFC1-HSFC1, HSFA2-HSFA2, HSFA4-HSFA5, HSFB1-HSFB1, HSFB2-HSFB2 |
| Eudicots | | HSFA1-HSFA1, HSFA4-HSFA4, HSFA6-HSFA6, HSFA6-HSFA7 |
| Eudicots | | HSFA1-HSFA1, HSFA4-HSFA4, HSFA5-HSFA5, HSFA6-HSFA6, HSFA8-HSFA8, HSFA9-HSFA2, HSFB2-HSFB2, HSFB3-HSFB3, HSFB4-HSFB4, HSFB5-HSFB5 |
| Eudicots | | HSFA2-HSFA9, HSFA6-HSFA6, HSFB2-HSFB2 |
| Eudicots | | HSFA1-HSFA1, HSFA4-HSFA4, HSFA6-HSFA6, HSFA9-HSFA2, HSFB2-HSFB2, HSFB3-HSFB3 |
| Eudicots | | HSFB4-HSFB4 |
| Monocots | | HSFA2-HSFA2, HSFA6-HSFA2, HSFB2-HSFB2, HSFB4-HSFB4, HSFC2-HSFC2 |
| Monocots | | HSFA2-HSFA2, HSFA2-HSFA6, HSFA6-HSFA6, HSFB2-HSFB2, HSFC2-HSFC2 |
| Monocots | | HSFA1-HSFA1, HSFA2-HSFA2, HSFA4-HSFA4, HSFB1-HSFB1, HSFB2-HSFB1, HSFB2-HSFB2 |
| Monocots | | HSFA2-HSFA2, HSFA6-HSFA6, HSFB2-HSFB2, HSFB4-HSFB4, HSFC2-HSFC2 |
The orthologous gene clusters detected between different species.
| Taxa compared | Species pair | Orthologous gene clusters |
|---|---|---|
| Gymnosperm-Gymnosperm | | HSFA1-HSFA1 |
| Gymnosperm-Basal angiosperms | | HSFA1-HSFA1 |
| Gymnosperm-Basal angiosperms | | HSFA4-HSFA1, HSFA5-HSFA1 |
| Basal angiosperms-Basal angiosperms | | HSFA1-HSFA1, HSFA2-HSFA2, HSFA3-HSFA3, HSFA5-HSFA5, HSFA6-HSFA6, HSFB2-HSFB2, HSFB5-HSFB5 |
| Basal angiosperms-Eudicots | | HSFA1-HSFA1, HSFA5-HSFA5, HSFA6-HSFA6, HSFA6-HSFA7, HSFB2-HSFB2 |
| Basal angiosperms-Eudicots | | HSFA1-HSFA1, HSFA2-HSFA2, HSFA4-HSFA4, HSFA4-HSFA5, HSFB1-HSFB1, HSFB2-HSFB2, HSFB3-HSFB3, HSFC1-HSFC1, HSFC1-HSFC1 |
| Basal angiosperms-Eudicots | | HSFA1-HSFA1, HSFA2-HSFA2, HSFA2-HSFA9, HSFA4-HSFA4, HSFA4-HSFA5, HSFB1-HSFB1, HSFB2-HSFB2, HSFB3-HSFB3, HSFB4-HSFB4, HSFC1-HSFC1, HSFC1-HSFC1 |
| Basal angiosperms-Eudicots | | HSFA1-HSFA1, HSFA2-HSFA2, HSFA2-HSFA9, HSFA5-HSFA5, HSFA6-HSFA6, HSFB5-HSFB2, HSFB5-HSFB5 |
| Basal angiosperms-Monocots | | HSFA2-HSFA6, HSFA3-HSFA3, HSFA6-HSFA6, HSFB2-HSFB2 |
| Basal angiosperms-Monocots | | HSFB1-HSFB1, HSFB2-HSFB2, HSFB4-HSFB4 |
| Basal angiosperms-Monocots | | HSFA2-HSFA7, HSFA3-HSFA3, HSFA6-HSFA2, HSFA6-HSFA6, HSFB2-HSFB2 |
| Basal angiosperms-Monocots | | HSFA4-HSFA4, HSFA7-HSFA2, HSFB1-HSFB1, HSFB2-HSFB2, HSFB4-HSFB4 |
| Eudicots-Monocots | | HSFA6-HSFA2, HSFA6-HSFA6, HSFA7-HSFA2 |
| Eudicots-Monocots | | HSFA2-HSFA6, HSFA4-HSFA4, HSFA6-HSFA6, HSFA7-HSFA2, HSFB1-HSFB1, HSFB2-HSFB2, HSFB4-HSFB4 |
| Eudicots-Monocots | | HSFA2-HSFA6 |
| Eudicots-Monocots | | HSFA2-HSFA6, HSFA6-HSFA6, HSFB1-HSFB1, HSFB2-HSFB2 |
| Monocots-Monocots | | HSFA1-HSFA1, HSFA1-HSFA5, HSFA2-HSFA2, HSFA3-HSFA3, HSFA4-HSFA4, HSFA6-HSFA2, HSFA6-HSFA6, HSFA7-HSFA7, HSFA8-HSFA8, HSFB1-HSFB1, HSFB2-HSFB2, HSFB4-HSFB4, HSFC1-HSFC1, HSFC2-HSFC2 |
| Eudicots-Eudicots | | HSFA1-HSFA1, HSFA2-HSFA2, HSFA3-HSFA3, HSFA4-HSFA4, HSFA5-HSFA5, HSFA6-HSFA6, HSFA6-HSFA7, HSFB1-HSFB1, HSFB2-HSFB2, HSFB3-HSFB3, HSFC1-HSFC1 |
Figure 2. (A) Synteny analysis among the subfamilies HSFA2, HSFA6, HSFA7, and HSFA9 of seven representative plant species (Amborella trichopoda, Liriodendron chinense, Arabidopsis thaliana, Solanum lycopersicum, Oryza sativa, Sorghum bicolor, Zea mays). (B) Synteny analysis among the subfamilies HSFB1, HSFB2, HSFB4, and HSFB5 of seven representative plant species (Selaginella moellendorffii, Amborella trichopoda, Liriodendron chinense, Arabidopsis thaliana, Solanum lycopersicum, Oryza sativa, Zea mays). (C) Synteny analysis among the subfamilies HSFA1, HSFA4, and HSFA5 of eight representative plant species (Ginkgo biloba, Amborella trichopoda, Liriodendron chinense, Arabidopsis thaliana, Solanum lycopersicum, Oryza sativa, Sorghum bicolor, Zea mays).
Black and blue lines indicate orthologous and paralogous gene pairs, respectively. Circles of different colors represent HSF genes from different subfamilies. The gene name is shown inside each circle.
Figure 3. A dated phylogenetic reconstruction for the subfamilies HSFA2 and HSFA9.
Red ovals indicate gene duplication events. The divergence times of HSFA2 and HSFA9 are marked in red. The blue numbers on each node are the mean estimates of the time to the MRCA; the blue numbers in parentheses are the 95% highest posterior density intervals.
Figure 4. A dated phylogenetic reconstruction for the subfamilies HSFC1 and HSFC2. Red ovals indicate gene duplication events.
The divergence times of HSFC1 and HSFC2 are marked in red. The blue numbers on each node are the mean estimates of the time to the MRCA; the blue numbers in parentheses are the 95% highest posterior density intervals.
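The 95% highest posterior density (HPD) intervals reported on the nodes of the dated trees are, by definition, the narrowest intervals containing 95% of the posterior samples of a node age. The sketch below illustrates how such an interval and the mean can be computed from a set of samples. It is an illustration only: the samples are synthetic (drawn from an arbitrary gamma distribution), the function name `hpd_interval` is hypothetical, and nothing here reproduces the dating software or the results of the study.

```python
import math
import random


def hpd_interval(samples, mass=0.95):
    """Return the narrowest interval that contains `mass` of the samples."""
    if not samples:
        raise ValueError("samples must be non-empty")
    xs = sorted(samples)
    n = len(xs)
    # Number of consecutive sorted samples the interval must cover.
    # The small epsilon guards against floating-point round-up in mass * n.
    k = max(1, math.ceil(mass * n - 1e-9))
    best_lo, best_hi = xs[0], xs[k - 1]
    # Slide a window of k samples and keep the narrowest one.
    for i in range(1, n - k + 1):
        lo, hi = xs[i], xs[i + k - 1]
        if hi - lo < best_hi - best_lo:
            best_lo, best_hi = lo, hi
    return best_lo, best_hi


# Synthetic node-age samples (arbitrary units); NOT data from the study.
random.seed(0)
ages = [random.gammavariate(20.0, 5.0) for _ in range(5000)]

mean_age = sum(ages) / len(ages)
low, high = hpd_interval(ages)
print(f"mean = {mean_age:.1f}, 95% HPD = ({low:.1f}, {high:.1f})")
```

For a skewed posterior, the HPD interval is generally not symmetric around the mean, which is why both values are reported on each node.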