Yongzhe Gu, Shilai Xing, Chaoying He.
Abstract
Gene loss is the driving force for changes in genome and morphology; however, this particular evolutionary event has been poorly investigated in leguminous plants. Legumes (Fabaceae) have some lineage-specific and diagnostic characteristics that are distinct from other angiosperms. To understand the potential role of gene loss in the evolution of legumes, we compared six genome-sequenced legume species of Papilionoideae, the largest representative clade of Fabaceae, such as Glycine max, with 34 nonlegume plant species, such as Arabidopsis thaliana. The results showed that putative orthologs of 34 Arabidopsis genes belonging to 29 gene families were absent in these legume species but conserved in the sequenced nonlegume angiosperm lineages. Further evolutionary analyses indicated that the orthologs of these genes were almost completely lost in the Papilionoideae ancestors, and these genes were thus designated as the legume lost genes (LLGs); they underwent purifying selection in nonlegume plants. Most LLGs were functionally unknown. In Arabidopsis, two LLGs, HARMLESS TO OZONE LAYER 1 and HOPZ-ACTIVATED RESISTANCE 1, are well-known genes with a role in plant immunity, and 16 additional LLGs were predicted to participate in plant-pathogen interactions by in silico expression and protein-protein interaction network analyses. Most of these LLGs' orthologs in various plants were also found to be associated with biotic stress response, indicating a conserved role of these genes in plant defense. The evolutionary implication of LLGs during the development of symbiotic nitrogen fixation, which involves plant and bacterial interactions and is a well-known characteristic of most legumes, is also discussed. Our work sheds light on the evolutionary implication of gene loss events in Papilionoideae evolution and provides new insights into crop design to improve nitrogen fixation capacity.
Keywords: defense response; gene loss; genome evolution; legume; nitrogen fixation
Year: 2016 PMID: 26868598 PMCID: PMC4824202 DOI: 10.1093/gbe/evw021
Source DB: PubMed Journal: Genome Biol Evol ISSN: 1759-6653 Impact factor: 3.416
LLGs
| Query ID | Vv | Pt | Pp | Gm | Pv | Cc | Mt | Ca | Lj | Gene Symbol |
|---|---|---|---|---|---|---|---|---|---|---|
| | y (1) | y (1) | y (1) | N (0) | N (0) | N (0) | N (0) | N (0) | N (0) | |
| AT1G35340.1 (1) | y (1) | y (1) | y (1) | N (0) | N (0) | N (0) | n (0) | n (0) | N (0) | |
| AT1G64385.1 (1) | y (1) | y (2) | y (1) | N (0) | n (0) | N (0) | n (0) | N (0) | N (0) | |
| AT2G43210.1 (1) | y (1) | y (2) | y (1) | n (0) | n (0) | n (0) | N (0) | N (0) | n (0) | |
| | y (0) | y (0) | y (0) | N (0) | N (0) | N (0) | N (0) | N (0) | N (0) | |
| | y (0) | y (0) | y (0) | N (0) | N (0) | N (0) | N (0) | N (0) | N (0) | |
| AT2G43940.1 (1) | y (2) | y (1) | y (1) | n (0) | n (0) | N (0) | N (0) | n (0) | N (0) | |
| | y (1) | y (1) | y (1) | N (0) | N (0) | N (0) | N (0) | N (0) | N (0) | |
| AT4G22160.2 (1) | y (2) | y (3) | y (1) | N (0) | N (0) | N (0) | n (0) | n (0) | N (0) | |
| | y (1) | y (1) | y (1) | N (0) | N (0) | N (0) | N (0) | N (0) | N (0) | |
| | y (1) | y (1) | y (1) | N (0) | N (0) | N (0) | N (0) | N (0) | N (0) | |
| | y (1) | y (1) | y (1) | N (0) | N (0) | N (0) | N (0) | N (0) | N (0) | |
| | y (1) | y (1) | y (1) | N (0) | N (0) | N (0) | N (0) | N (0) | N (0) | |
| AT1G13630.1 (1) | y (1) | y (1) | y (2) | n (0) | n (0) | n (0) | n (0) | n (0) | n (0) | |
| AT1G55580.1 (1) | y (1) | y (3) | y (1) | n (0) | n (0) | n (0) | n (0) | n (0) | n (0) | |
| AT1G55590.1 (1) | y (1) | y (1) | y (1) | n (0) | n (0) | n (0) | n (0) | n (0) | n (0) | |
| AT1G68940.3 (3) | y (2) | y (4) | y (2) | n (2) | n (1) | n (1) | n (1) | n (1) | n (1) | |
| AT1G71120.1 (1) | y (1) | y (1) | y (1) | n (0) | n (0) | n (0) | n (0) | n (0) | n (0) | |
| AT2G05810.1 (1) | y (1) | y (2) | y (1) | n (0) | n (0) | n (0) | n (0) | n (0) | n (0) | |
| AT2G18520.1 (2) | y (1) | y (2) | y (1) | n (0) | n (0) | n (0) | n (0) | n (0) | n (0) | |
| AT4G36680.1 (2) | y (1) | y (2) | y (1) | n (0) | n (0) | n (0) | n (0) | n (0) | n (0) | |
| AT2G39100.1 (1) | y (1) | y (1) | y (1) | n (0) | n (0) | n (0) | n (0) | n (0) | n (0) | |
| AT2G45530.1 (1) | y (1) | y (2) | y (1) | n (0) | n (0) | n (0) | n (0) | n (0) | n (0) | |
| AT3G24515.1 (1) | y (1) | y (2) | y (1) | n (0) | n (0) | n (0) | n (0) | n (0) | n (0) | |
| AT3G50950.2 (1) | y (1) | y (1) | y (2) | n (0) | n (0) | n (0) | n (0) | n (0) | n (0) | |
| AT3G61210.1 (3) | y (0) | y (0) | y (0) | n (0) | n (0) | n (0) | n (0) | n (0) | n (0) | |
| AT4G11670.1 (1) | y (1) | y (1) | y (1) | n (0) | n (0) | n (0) | n (0) | n (0) | n (0) | |
| AT4G24340.1 (2) | y (1) | y (3) | y (1) | n (0) | n (0) | n (0) | n (0) | n (0) | n (0) | |
| AT4G24350.1 (2) | y (1) | y (3) | y (1) | n (0) | n (0) | n (0) | n (0) | n (0) | n (0) | |
| AT5G01015.1 (1) | y (1) | y (2) | y (1) | n (0) | n (0) | n (0) | n (0) | n (0) | n (0) | |
| AT5G04840.1 (1) | y (1) | y (2) | y (1) | n (0) | n (0) | n (0) | n (0) | n (0) | n (0) | |
| AT5G10830.1 (1) | y (2) | y (1) | y (2) | n (0) | n (0) | n (0) | n (0) | n (0) | n (0) | |
| AT5G12460.1 (1) | y (1) | y (2) | y (2) | n (0) | n (0) | n (0) | n (0) | n (0) | n (0) | |
| AT5G66160.1 (1) | y (1) | y (1) | y (1) | n (0) | n (0) | n (0) | n (0) | n (0) | n (0) | |
Vv, Vitis vinifera; Pt, Populus trichocarpa; Pp, Prunus persica; Gm, Glycine max; Pv, Phaseolus vulgaris; Cc, Cajanus cajan; Mt, Medicago truncatula; Ca, Cicer arietinum; Lj, Lotus japonicus. The symbol y indicates that the LLG had putative orthologs in this species, and n indicates that the LLG had homologous protein sequences but no putative ortholog in this species. N indicates that the LLG had no homologous sequence in this species. Genes in bold indicate the Group 2 LLGs. The numbers in parentheses represent the number of genes in this species within the orthoMCL group containing the LLG.
HOL1 and HOL2 are in one orthoMCL group; and HOL1, HOL2, and HOL3 belong to the HOL family.
AT2G18520 and AT4G36680 are in one orthoMCL group.
AT4G24340 and AT4G24350 are in one orthoMCL group.
AT1G68940 and two non-LLGs (AT1G20780 and AT1G76390) belong to one orthoMCL group.
AT3G61210.1 and two non-LLGs (AT1G55450 and AT3G54150) comprise one orthoMCL group.
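The y/n/N coding described in the legend can be read mechanically. The following is a minimal Python sketch, not the authors' pipeline: it parses one row of the table above and checks whether the row fits the LLG pattern (putative orthologs in the nonlegume columns, none in the legume columns). The function names are hypothetical, and only the three nonlegume species shown in the table are used, although the study compared 34 nonlegume species.

```python
import re

# Column order follows the table header; the nonlegume/legume split
# follows the table legend.
NONLEGUMES = ("Vv", "Pt", "Pp")
LEGUMES = ("Gm", "Pv", "Cc", "Mt", "Ca", "Lj")
COLUMNS = NONLEGUMES + LEGUMES

# A cell looks like "y (1)": a call (y, n, or N) and an orthoMCL gene count.
CELL = re.compile(r"^([ynN])\s*\((\d+)\)$")


def parse_row(row):
    """Split one pipe-delimited row into (query_id, {species: (call, count)})."""
    cells = [c.strip() for c in row.strip().strip("|").split("|")]
    if len(cells) < 1 + len(COLUMNS):
        raise ValueError("row has too few cells")
    query_id = cells[0]
    calls = {}
    for species, cell in zip(COLUMNS, cells[1:1 + len(COLUMNS)]):
        match = CELL.match(cell)
        if match is None:
            raise ValueError(f"unexpected cell {cell!r} for {species}")
        calls[species] = (match.group(1), int(match.group(2)))
    return query_id, calls


def is_legume_lost(calls):
    """True if every nonlegume has an ortholog (y) and no legume does (n or N)."""
    orthologs_in_nonlegumes = all(calls[s][0] == "y" for s in NONLEGUMES)
    no_ortholog_in_legumes = all(calls[s][0] in ("n", "N") for s in LEGUMES)
    return orthologs_in_nonlegumes and no_ortholog_in_legumes


def legumes_retaining_homologs(calls):
    """Legume species that keep a homolog (n) although the ortholog is gone."""
    return [s for s in LEGUMES if calls[s][0] == "n"]


row = "| AT1G64385.1 (1) | y (1) | y (2) | y (1) | N (0) | n (0) | N (0) | n (0) | N (0) | N (0) | |"
query_id, calls = parse_row(row)
print(query_id, is_legume_lost(calls), legumes_retaining_homologs(calls))
```

Separating the ortholog check from the homolog check mirrors the distinction the legend draws between n (homolog retained) and N (no homologous sequence at all).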
Figure. Local synteny around the HOL genes in legumes and nonlegume species. Species names are provided on the left. The block arrow on the horizontal line represents one open reading frame of a gene and its orientation. The yellow blocks are HOL family genes. The same color block arrows connected with the same color lines indicate the putative orthologs in different species. The gray block arrows are nonhomologous genes. The nomenclature of each putative gene is shown above the corresponding block arrow.
Figure. Selection analyses of LLGs in nonlegumes. Selection was evaluated in Arabidopsis thaliana, Prunus persica, Populus trichocarpa, and Vitis vinifera. Yellow lines and cyan lines indicate the dN/dS values of the Group 1 LLGs and Group 2 LLGs, respectively. The dN/dS distribution of the 5,935 conserved genes in these four plants is presented in black columns.
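As a reminder of how dN/dS values are usually read, the sketch below computes the ratio from a nonsynonymous rate (dN) and a synonymous rate (dS) and labels the result. It is an illustration only: the input numbers and the `tolerance` band around 1 are assumptions made for this example, not values from the study, and a real analysis would estimate the rates with a codon-substitution model and test significance statistically.

```python
def dn_ds(dn, ds):
    """Ratio of nonsynonymous to synonymous substitution rates."""
    if ds <= 0:
        raise ValueError("dS must be positive to form the ratio")
    return dn / ds


def selection_regime(omega, tolerance=0.05):
    """Label a dN/dS ratio; `tolerance` is an arbitrary illustrative band."""
    if omega < 1 - tolerance:
        return "purifying"
    if omega > 1 + tolerance:
        return "positive"
    return "neutral"


# Hypothetical rates for a single gene, chosen only for illustration.
omega = dn_ds(0.02, 0.2)
print(round(omega, 3), selection_regime(omega))
```

A ratio well below 1 means that amino acid changes accumulate more slowly than silent changes, which is the signature of purifying selection reported for the LLGs in nonlegumes.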
Figure. LLG evolution in sequenced angiosperms. Gray represents species without any homologous sequence to LLGs. Orange indicates species with LLG homologs but not orthologs. Blue represents species with putative LLG orthologs. The black and gray stars indicate the ancestors of Papilionoideae and Angiospermae, respectively, and their ancestral states are represented at the bottom (for details, see supplementary fig. S4, Supplementary Material online). Forty plant species whose genomes have been sequenced are included, and their phylogeny was deduced from Cogepedia and APG III. Selaginella moellendorffii was used as outgroup.
Figure. Heat map of LLG expression under different stresses in Arabidopsis. Hormone treatments include ABA, ACC, and MeJA. Elicitors include CaCl2 (Ca), glutathione S-transferase (GST), hrpz, GST-necrosis-inducing Phytophthora protein 1 (npp1), flagellin (flg), and lipopolysaccharide (lps). Pathogen stresses include Pseudomonas syringae pv. tomato DC3000 (PstDC3000), P. syringae pv. tomato avrRPM1 (Pstavrrpm1), P. syringae pv. tomato DC3000 hrcC (Psthrcc), P. syringae pv. phaseolicola (Pstpsph), Botrytis cinerea (B. cinerea), Erysiphe orontii (E. orontii), and Phytophthora infestans (P. infestans). The column above each treatment represents gene expression at different time points after treatment (for details, see supplementary fig. S10, Supplementary Material online). The color scale indicates the log2 values of expression change (treatments/control). Yellow indicates upregulation under treatments, and blue indicates downregulation under treatments. Green boxes indicate log2 fold changes (treatments/control) > 1.0, and red boxes indicate log2 fold changes (treatments/control) < −1.0. Gene names in red are Group 2 LLGs. AT2G43910/20 represents AT2G43910 (HOL1) and AT2G43920 (HOL2), and AT4G24340/50 represents AT4G24340 and AT4G24350, because one probe ID could detect the two closely related genes.
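The box thresholds in this caption correspond to a twofold change in either direction. A minimal sketch of that calculation follows, assuming positive expression values on a linear scale; the function names and the example intensities are hypothetical and do not come from the underlying microarray data.

```python
import math


def log2_fold_change(treatment, control):
    """log2 of the treatment/control expression ratio."""
    if treatment <= 0 or control <= 0:
        raise ValueError("expression values must be positive")
    return math.log2(treatment / control)


def classify(lfc, threshold=1.0):
    """Apply the caption's cutoffs: > 1.0 is up, < -1.0 is down."""
    if lfc > threshold:
        return "up"
    if lfc < -threshold:
        return "down"
    return "unchanged"


# Hypothetical signal intensities for one probe under three treatments.
for treatment, control in [(400.0, 100.0), (100.0, 400.0), (150.0, 100.0)]:
    lfc = log2_fold_change(treatment, control)
    print(f"{lfc:+.2f}", classify(lfc))
```

Working on the log2 scale makes up- and downregulation symmetric: a fourfold increase and a fourfold decrease map to +2 and −2.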
Summary of LLGs Involved in Plant Defense Response
| LLGs | Rice | Tomato | Grape | ||||
|---|---|---|---|---|---|---|---|
| Function | Expression | PPI | Expression | ||||
| AT1G35340.1 | Y | Y | Y | Y | |||
| AT1G64385.1 | Y | ||||||
| AT2G43910.2 (HOL1) | Y | Y | |||||
| AT2G43920.1 (HOL2) | Y | ||||||
| AT4G14970.1 | Y | ||||||
| AT4G29560.1 | Y | ||||||
| AT5G44010.1 | Y | Y | |||||
| AT5G49110.2 | Y | ||||||
| AT5G65740.2 | Y | ||||||
| AT2G05810.1 | Y | Y | |||||
| AT2G18520.1 | Y | Y | |||||
| AT4G36680.1 | Y | Y | Y | ||||
| AT2G39100.1 | Y | ||||||
| AT3G24515.1 ( | Y | Y | |||||
| AT3G50950.2 ( | Y | Y | Y | Y | |||
| AT3G61210.1 | Y | Y | Y | ||||
| AT4G11670.1 | Y | ||||||
| AT4G24340.1 | Y | Y | Y | ||||
| AT4G24350.1 | Y | Y | Y | ||||
| AT5G01015.1 | Y | ||||||
| AT5G10830.1 | Y | Y | Y | Y | |||
Y indicates that evidence for LLG involvement in defense response was detected. PPI, protein-protein interaction.
Figure. The largest PPI network associated with LLGs. Each node represents a protein. The four nodes covered by a green circle indicate the four LLGs. The colored connecting lines represent the types of evidence supporting each association: coexpression (dark brown), experiments (pink), databases (cyan), homology (violet), and text mining (light green).