Naoki Shinohara¹, Kazuhiko Nishitani¹
¹ Department of Biology, Kanagawa University, Tsuchiya 2946, Hiratsuka, Kanagawa 259-1293, Japan.
Abstract
All land plants encode large multigene families of xyloglucan endotransglucosylase/hydrolases (XTHs), plant-specific enzymes that cleave and reconnect plant cell-wall polysaccharides. Despite the ubiquity of these enzymes, considerable uncertainty remains regarding the evolutionary history of the XTH family. Phylogenomic and comparative analyses in this study traced the non-plant origins of the XTH family to Alphaproteobacteria ExoKs, bacterial enzymes involved in loosening biofilms, rather than Firmicutes licheninases, plant biomass-digesting enzymes, as previously supposed. The relevant horizontal gene transfer (HGT) event was mapped to the divergence of non-swimming charophycean algae in the Cryogenian geological period. This HGT event was the likely origin of charophycean EG16-2s, which are putative intermediates between ExoKs and XTHs. Another HGT event in the Cryogenian may have led from EG16-2s or ExoKs to fungal Congo Red Hypersensitive proteins (CRHs), enzymes that cleave and reconnect chitin and glucans in fungal cell walls. This successive transfer of enzyme-encoding genes may have supported the adaptation of plants and fungi to the ancient icy environment by facilitating their sessile lifestyles. Furthermore, several protein evolutionary steps, including coevolution of substrate-interacting residues and putative intra-family gene fusion, occurred in the land plant lineage and drove diversification of the XTH family. At least some of those events correlated with the evolutionary gain of broader substrate specificities, which may have underpinned the expansion of the XTH family by enhancing duplicated gene survival. Together, this study highlights the Precambrian evolution of life and the mode of multigene family expansion in the evolutionary history of the XTH family.
Early speculation on the mechanisms underlying plant cell-wall extension proposed the existence of a hypothetical enzyme involved in cleavage and reconnection of cell-wall polysaccharides cross-linking cellulose microfibrils (Albersheim 1975). This prediction was later supported by the experimental detection of such enzymatic activities (Nishitani and Tominaga 1991, Smith and Fry 1991, Farkas et al. 1992, Fry et al. 1992) and substantiated by molecular cloning and characterization of the enzymes, named endo-xyloglucan transferases (Nishitani and Tominaga 1992, Okazawa et al. 1993, Purugganan et al. 1997, Campbell and Braam 1999a). The enzyme sequences demonstrated mutual homology and also exhibited homology with xyloglucan endo-β(1,4)-glucanase of nasturtium (de Silva et al. 1993). This multifaceted similarity initiated an analysis of the relationships between sequence and activity in the enzyme family (Nishitani 1997, Campbell and Braam 1999b). Subsequently, the first discovered enzyme class was renamed xyloglucan endotransglucosylase (XET), the xyloglucan glucanase class was renamed xyloglucan endohydrolase (XEH) and the larger encompassing enzyme family was termed the XET/hydrolase (XTH) family (Rose et al. 2002).
Crystallographic and mutational analyses revealed that both XET and XEH enzymes cleave substrate xyloglucan chains via a catalytic glutamic acid nucleophile at the active center (Johansson et al. 2004, Baumann et al. 2007, Mark et al. 2009). In the reaction model of XEH, the cleaved substrate chain is transferred to a water molecule at the completion of the hydrolysis reaction (Baumann et al. 2007, Mark et al. 2009). In the reaction model of XET, the cleaved substrate chain is transferred to the non-reducing end of another xyloglucan molecule at the completion of the transglycosylation reaction (Johansson et al. 2004, Baumann et al. 2007, Mark et al. 2009). In this context, the substrate molecule being cut is termed the donor substrate, and the substrate molecule to which the cleaved chain is connected is termed the acceptor substrate. On a single XTH molecule, the donor and acceptor binding sites are delimited by the catalytic nucleophile and together compose the substrate-binding cleft (Johansson et al. 2004, Baumann et al. 2007, Mark et al. 2009).
The biological significance of the XTH family is underscored by its prolific and highly conserved nature: each land plant genome encodes 20–60 XTH genes (Yokoyama and Nishitani 2001, Eklöf and Brumer 2010, Yokoyama et al. 2010). Sequence-based classification of XTH genes has been ongoing over the last two decades. Early attempts classified 23 cDNA sequences from nine angiosperms and the 33 genes (AtXTH1-AtXTH33) from the Arabidopsis thaliana genome into three major groups (hereafter I, II and III) on the basis of phylogram topology (Nishitani 1997, Rose et al. 2002). With some modifications, including additional groups, this classification is widely accepted. Group III was divided into groups III-A and III-B, and XET and XEH activities were delineated, the latter of which is confined to group III-A (Baumann et al. 2007, Kaewthai et al. 2013). In contrast to this neat delineation, the incorporation of 29 members (OsXTH1-OsXTH29) of the rice XTH family into the phylogeny of the 33 A. thaliana XTHs introduced ambiguity into the boundary between groups I and II, and composite group I/II was introduced (Yokoyama et al. 2004). A small group named ancestral was later separated from the rest of the composite group I/II (Baumann et al. 2007). As its name suggests, ancestral group members form an early-diverging clade in phylograms (Baumann et al. 2007, Yokoyama et al. 2010, Shinohara et al. 2017); however, surprisingly, genomic datasets of bryophytes (early-diverging land plants) do not include members of the ancestral group (Yokoyama et al. 2010, Shinohara et al. 2017). Thus, the XTH family requires further research to clarify its classification.
Some XTHs have additional transglycosylation activities alongside their XET activity (Hrmova et al. 2007, Simmons et al. 2015, Shinohara et al. 2017, Stratilová et al. 2019, 2020). For example, EfHTG of Equisetum fluviatile (water horsetail) catalyzes two additional types of transglycosylation: cleavage of mixed-linkage β(1,3;1,4)-glucan (MLG) with reconnection to xyloglucan, and cleavage of cellulose with reconnection to xyloglucan. This broader substrate specificity is not found in three closely related members (EfXTH-A, EfXTH-H and EfXTH-I) of the E. fluviatile XTH family (Holland et al. 2020). Despite this variation in substrate specificity, all four XTHs belong to composite group I/II (Simmons et al. 2015, Holland et al. 2020). The classification of composite group I/II may therefore benefit from revision to delineate additional enzymatic activities of group members. The evolutionary implications of the broader substrate specificities of some XTH family members remain to be explained.
The recent rapid expansion of genome and transcriptome datasets has promoted phylogenomics-based classification of XTHs in the land plant lineage (Eklöf and Brumer 2010, Del-Bem and Vincentz 2010). Additional data were provided by a census of plant glycoside hydrolase family 16 (GH16), which included more than 15,000 XTH sequences (Behar et al. 2018), and a subgrouping project of the whole GH16 family, including a large number of prokaryotic and eukaryotic enzymes (Viborg et al. 2019). These data resources are highly valuable for in-depth analysis of the evolutionary history of intra-family and inter-family relationships in the XTH family.
Considerable uncertainty remains regarding the evolutionary emergence of the XTH family. One report concluded that the first XTHs emerged in Zygnematophyceae (Del-Bem and Vincentz 2010), a sister group of the land plants (Embryophyta). Another study found XTH-encoding transcripts in transcriptome datasets of non-charophycean green algae, remote relatives of land plants (Behar et al. 2018). Therefore, it remains unclear which algal lineage first acquired XTHs. Uncertainty also remains regarding the non-plant origins of XTHs. Bacterial licheninases, which digest plant biomass to provide nutrients to the bacteria (Stülke and Hillen 2000), were long considered the non-plant ancestors of the XTH family because of their sequence similarities (Barbeyron et al. 1998, Michel et al. 2001). However, bacterial evolutionary gain of a plant digestive function presupposes the existence of plants for the bacteria to feed upon and, in this context, the plant-degrading function is unlikely to fully reflect the ancient function of the non-plant origin of XTHs. To address these questions, phylogenomic and comparative analyses were performed to provide insights regarding first, the evolutionary emergence of XTHs; second, the non-plant origin of the XTH family; and third, the molecular evolutionary events underlying diversification of the XTH family.
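The difference between the XEH and XET reaction models introduced above can be illustrated with a minimal bookkeeping sketch. It is only a toy model, not a description of any published method: chains are reduced to their lengths in backbone units, and the function name `xth_reaction` and its parameters are hypothetical. The sketch shows that hydrolysis (water as acceptor) turns one chain into two shorter ones, whereas transglycosylation (xyloglucan as acceptor) conserves the number of chains and redistributes their lengths.

```python
from typing import Optional, Tuple


def xth_reaction(donor_len: int, cleavage_site: int,
                 acceptor_len: Optional[int] = None) -> Tuple[int, int]:
    """Toy model of donor cleavage followed by transfer to an acceptor.

    Chains are represented only by their length in backbone units
    (an assumption of this sketch). Returns a pair:
    (length of the product carrying the enzyme-bound fragment,
     length of the released donor fragment).
    """
    if not 0 < cleavage_site < donor_len:
        raise ValueError("cleavage site must lie inside the donor chain")

    bound = cleavage_site                  # fragment held by the enzyme
    released = donor_len - cleavage_site   # fragment that leaves first

    if acceptor_len is None:
        # XEH-like outcome: water is the acceptor, so the bound
        # fragment is released unchanged (hydrolysis).
        return bound, released

    # XET-like outcome: the bound fragment is joined to an acceptor
    # chain (transglycosylation), giving a longer hybrid product.
    return bound + acceptor_len, released


# One donor of 10 units cut after unit 4.
hydrolysis_products = xth_reaction(10, 4)
transfer_products = xth_reaction(10, 4, acceptor_len=3)
```

In this example the hydrolysis products have lengths 4 and 6, and the transglycosylation products have lengths 7 and 6; in both cases the total number of backbone units is conserved.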
Results
Overview of the evolutionary history of the XTH family
Phylogenetic and comparative approaches were used to examine evolutionary relationships among XTH family members. The main conclusions are graphically summarized in Figs. 1 and 2. Fig. 1 outlines the inter-family relationships of XTHs and other related GH16 enzymes. As shown in Fig. 1, Alphaproteobacteria ExoKs may have been adopted by plants and lost their extended loops to become plant EG16-2s, followed by gain of the acceptor-side extension to become the XTH family. Fig. 2 includes a geological timeline and the plant evolutionary pathway. Intra-family relationships of XTHs are outlined, with their multiple diversification events shown relative to protein evolution.
Fig. 1
Comparison of XTH and other related enzymes. Three-dimensional structures are derived from experimental (PDB code) or predicted (gene ID of modeled protein and PDB code of template protein) data. Substructures absent in the jellyroll β-sandwich of EG16 are colored in purple. A hypothetical evolutionary route is shown at the top, where gain (+) and loss (−) events ([1]-[5], explained at the left) were parsimoniously mapped, and the dotted line represents horizontal gene transfer. In the table, ticks denote presence, crosses denote absence and ellipses denote unknown status.
Fig. 2
Evolutionary history of the XTH family. Green and gray lines represent the evolutionary histories of Streptophyta and bacteria, respectively. Black lines represent the molecular evolution of XTH and related enzymes, and the black dotted line indicates HGT. A wide light blue bar represents the Cryogenian period (720–635 million years ago). Evolutionary gain events related to plant cell-wall evolution (details in Discussion) are shown in purple, where XyG indicates xyloglucan.
Emergence of the XTH family
Available plant transcriptome datasets with XTH-encoding sequences include the non-charophycean green algae Dunaliella salina, D. tertiolecta and Pseudoscourfieldia mariana (Behar et al. 2018), which are remotely related to land plants. Phylogenetic analysis revealed that these sequences were more closely related to angiosperm XTHs than to liverwort XTHs. The incongruence between the protein tree and the currently accepted species tree (Wickett et al. 2014) may be explained by variable sampling procedures for their transcriptome datasets or by multiple horizontal gene transfer (HGT) events from land plants to algae. However, in either case, the XTHs are unlikely to be direct descendants of the first XTHs to emerge.
As previously reported (Del-Bem and Vincentz 2010), an early lineage of the XTH family was found in freshwater-dwelling charophycean algae belonging to Zygnematophyceae, a sister group of land plants, i.e. a late-diverging clade of Charophyta (Timme et al. 2012). In a prior census (Behar et al. 2018), nine of the 46 charophycean species examined had XTH transcripts, and all nine belonged to Zygnematophyceae. In addition, two recently sequenced Zygnematophyceae genomes (Spirogloea muscicola and Mesotaenium endlicherianum) harbored multiple XTH genes (Cheng et al. 2019), and the genome of Klebsormidium nitens, a charophycean alga that diverged earlier than Zygnematophyceae, contained no XTH genes (Hori et al. 2014, Bowman et al. 2017). Accordingly, the emergence of the XTH family was mapped to the last common ancestor of Zygnematophyceae and land plants.
Fig. 3
Phylogenetic inference of XTH diversification in the land plant lineage. A maximum-likelihood phylogram of XTH protein sequences in genome datasets of Mesotaenium endlicherianum (charophycean alga), Spirogloea muscicola (charophycean alga), Marchantia polymorpha (liverwort), Selaginella moellendorffii (lycophyte), Oryza sativa (monocot) and Arabidopsis thaliana (eudicot) and a transcriptome dataset of Equisetum diffusum (horsetail; fern ally) is shown. Individual XTHs of EfHTG (E. fluviatile; horsetail), PttXET16-34 (Populus tremula × P. tremuloides; poplar) and TmXET6.3 (Tropaeolum majus; nasturtium) are also included. Pairs of uppercase letters around the phylogram denote amino acid residues corresponding to Q102 and R116 of the mature PttXET16-34 protein. Support values of major nodes are shown in parentheses. A list of source organisms and their mutual relationships is shown on the left, where the numbers of XTH sequences in genome [G] and transcriptome [T] datasets are given in brackets.
Fig. 4
Curation of XTH-homologs in the Klebsormidium nitens genome. (A) A maximum-likelihood phylogram of XTH-homologs in a genome dataset of K. nitens (formerly K. flaccidum) is shown. A genome dataset of Spirogloea muscicola (Zygnematophyceae), previously reported EG16-2s of Charophyta, EG16s of Coleochaetales, moss and eudicots and XTHs of eudicots, are included as reference sequences. The branch position of Kfl00047_0180, which was determined by a separate phylogenetic analysis of its fragmentary sequence, is indicated with an arrow. Support values of major nodes are shown on branches. Triplets of letters on the right denote amino acid residues corresponding to the 44th–46th residues of the mature PttXET16-34 protein, and signature residues of EG16s are emphasized. (B) Schematic representation of K. nitens XTH-homolog proteins, with gaps, predicted signal peptides, active sites (EXDXE) and regions similar to the EG16 protein of VvEG16 (XP_002273975.1) shown. (C) Number of genes in each genome. Parentheses represent uncertain status. In (A) and (C), lists of source organisms and their mutual relationships are shown on the left.
Ancestry of the XTH family in the plant kingdom
Shared ancestry between plant XTHs and bacterial licheninases was proposed because of their sequence similarities (Barbeyron et al. 1998, Michel et al. 2001). Licheninases, which have alternative names in the literature (e.g. lichenases or MLGases), hydrolyze linear polysaccharides found in lichens and cereals. These polysaccharides, termed MLGs, are composed of β(1,3)- and β(1,4)-glucosyl residues, and licheninases specifically cleave the β(1,4)-glucosyl linkage subsequent to the β(1,3)-glucosyl residue. Evolutionary intermediate groups have now been recognized between plant XTHs and bacterial licheninases (Eklöf et al. 2013, McGregor et al. 2017, Behar et al. 2018, Viborg et al. 2019).
Although EG16s were previously included in the XTH family (Yokoyama et al. 2010), they now have their own group name (Eklöf et al. 2013, McGregor et al. 2017, Behar et al. 2018), belonging to the GH16 subgroup closest to the XTH family (Viborg et al. 2019). EG16s have licheninase-like endo-glycosidase activity with broad substrate specificity that cleaves both MLG and xyloglucan (Eklöf et al. 2013, McGregor et al. 2017). Together with their shared protein structural features, this activity prompted the proposal that EG16s were licheninase-to-XTH intermediates (Eklöf and Brumer 2010, McGregor et al. 2017). Recently, another plant enzyme group, termed EG16-2, was identified (Behar et al. 2018). EG16-2s are structurally and phylogenetically close to EG16s but remain biochemically uncharacterized (Behar et al. 2018). In a subgrouping analysis of the GH16 family, the XTH family was assigned its own subgroup number, GH16_20, but no such number has yet been assigned to the EG16 group (Viborg et al. 2019).
The plant origin of the composite clade of EG16-2s, EG16s and XTHs was explored. K. nitens encodes six EG16-2s, but no EG16s or XTHs (Hori et al. 2014, Bowman et al. 2017). Two charophycean algae that diverged earlier than K. nitens, Mesostigma viride and Chlorokybus atmophyticus, did not encode any EG16-2s, EG16s or XTHs (Behar et al. 2018, Del-Bem 2018, Wang et al. 2020). Unlike XTHs, EG16s lack signal peptides for apoplastic secretion (Behar et al. 2018). However, five of the six EG16-2s in the K. nitens genome v1.1 dataset (Hori et al. 2014) were predicted to have conventional signal peptides, suggesting that apoplastic secretion was the ancestral form in the composite clade of EG16-2s, EG16s and XTHs. EG16-2s emerged early and lacked the derived characteristics found in EG16s and XTHs, namely the loss of the signal peptide in EG16s and the gain of an acceptor-side extension in XTHs; EG16-2s are therefore the putative ancestors of both EG16s and XTHs. Furthermore, two charophycean algae belonging to Coleochaetales have EG16 transcripts but no XTH transcripts (Behar et al. 2018). Coleochaetales resides phylogenetically between the Klebsormidiophyceae and Zygnematophyceae (Wickett et al. 2014), indicating that the emergence of EG16s predated the emergence of XTHs.
No EG16 sequences were attributed to non-charophycean green algae in an earlier census (Behar et al. 2018), but some EG16-2 sequences were found in the algal species Botryococcus braunii and Interfilum paradoxum (Behar et al. 2018). The latter of these was originally classified as a non-charophycean green alga (Chodat 1922) but was subsequently reclassified as a charophycean alga closely related to Klebsormidium (Mikhailyuk et al. 2008). Therefore, of the two species, only B. braunii was a non-charophycean green alga. Furthermore, when the two fragment EG16-2 sequences identified from B. braunii (Behar et al. 2018) were subjected to BLAST searches against the JGI B. braunii v2.1 genome dataset, no corresponding genes were found. The fragment sequences also lacked the signature residues common to other EG16-2s and were therefore relegated to an unclassified category (Dataset S1). The emergence of EG16-2s was thus mapped to the origin of Klebsormidiophyceae.
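The reasoning used in this section, placing the emergence of each enzyme group at the earliest-diverging lineage in which it is detected, can be sketched as a simple lookup. The following is a minimal illustration under stated assumptions, not the analysis pipeline of this study: the lineages are arranged as a linear ladder (a simplification of the species tree), presence calls are limited to those stated in the text, `None` marks calls not stated in this section, and the names `LINEAGES` and `first_appearance` are hypothetical.

```python
from typing import Dict, List, Optional, Tuple

# Streptophyte lineages ordered from earliest- to latest-diverging
# (a linear simplification of the species tree; an assumption of this sketch).
# True/False follow the presence/absence statements in the text;
# None means the status is not stated in this section.
LINEAGES: List[Tuple[str, Dict[str, Optional[bool]]]] = [
    ("Mesostigma/Chlorokybus", {"EG16-2": False, "EG16": False, "XTH": False}),
    ("Klebsormidiophyceae",    {"EG16-2": True,  "EG16": False, "XTH": False}),
    ("Coleochaetales",         {"EG16-2": None,  "EG16": True,  "XTH": False}),
    ("Zygnematophyceae",       {"EG16-2": None,  "EG16": None,  "XTH": True}),
    ("Land plants",            {"EG16-2": None,  "EG16": True,  "XTH": True}),
]


def first_appearance(family: str) -> Optional[str]:
    """Return the earliest-diverging lineage with a confirmed presence call."""
    for lineage, calls in LINEAGES:
        if calls.get(family) is True:
            return lineage
    return None


emergence = {fam: first_appearance(fam) for fam in ("EG16-2", "EG16", "XTH")}
```

With these inputs, `emergence` maps EG16-2 to Klebsormidiophyceae, EG16 to Coleochaetales and XTH to Zygnematophyceae, which matches the order of emergence inferred in the text. A full analysis would also need to weigh secondary losses and incomplete transcriptome sampling, which this lookup ignores.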
Out of the plant kingdom
Accumulating evidence indicates that HGT events are common among virtually all the phylogenetic branches of life (Husnik and McCutcheon 2018), and HGT is a potential explanation for the non-plant origin of XTHs. In a GH16 subgrouping study (Viborg et al. 2019), the plant-specific composite clade of EG16s and XTHs was closely related to the GH16_21 group, which is enriched with bacterial enzymes including licheninases, and to the composite fungal enzyme clade of GH16_18 and GH16_19. A phylogram of representative sequences from the groups was constructed, and charophycean EG16-2s were found to be closest to the bacterial enzymes of the GH16_21 group. This topology supported the notion that the non-plant origin of XTHs was a bacterial gene, as proposed previously (Barbeyron et al. 1998, Michel et al. 2001, Eklöf et al. 2013, McGregor et al. 2017, Viborg et al. 2019).
Fig. 5
Phylogenetic inference of the non-plant origin of XTH-related proteins. (A) A maximum-likelihood phylogram of representative members of GH16_18 (CRH1 of Saccharomyces cerevisiae), GH16_19 (CRH2 and CRR1 of Saccharomyces cerevisiae), GH16_20 (XTHs of Mesotaenium endlicherianum, Zygnematophyceae), EG16-2 (GH16s of Klebsormidium nitens) and GH16_21 (QHA14546.1: an ExoK-like protein of Rhodobacter sphaeroides, a facultative photosynthetic bacterium; AAA16048.1: an ExoK of Ensifer meliloti, a nitrogen-fixing root-nodule bacterium) is shown. Support values are shown on branches. The output tree (A) supports the topology (B) and HGT from bacteria via plants to fungi (D) and discounts the topology (C) and HGT from bacteria via fungi to plants (E). Topology (B) is also explained by two separate bacteria-to-fungi and bacteria-to-plant HGT events (F).
The fungal origin of the composite group of GH16_18 and GH16_19 was explored further, and group members were found in Zoopagomycota and later-diverging fungi. The common ancestral gene of GH16_18 and GH16_19 may therefore have been transferred to the common ancestor of terrestrial fungi, whose emergence was associated with the loss of flagella and the gain of hyphal growth (Naranjo-Ortiz and Gabaldón 2019). Together, GH16_18 and GH16_19 are termed the CRH family. The CRH family derives its name from yeast knockout mutants with increased sensitivity to cell-wall-interfering compounds (e.g. Congo red and calcofluor white), termed Congo Red Hypersensitive mutants (Rodríguez-Peña et al. 2000). GH16_18 and GH16_19 both include biochemically characterized members, all of which mediate cleavage of chitin, a β-(1,4)-linked N-acetylglucosamine homopolymer, and reconnection of the cleaved chain to another molecule of chitin, β-(1,3)-glucan or β-(1,6)-glucan (Arroyo et al. 2016, Fang et al. 2019). This reaction resembles that of XTHs in that both are transglycosylation reactions involving the endolytic cleavage of cellulose analogs, as previously noted (Hrmova et al. 2009). The protein tree topology and the structural features (Fang et al. 2019) suggest that the CRH family may have originated from plant EG16-2s or bacterial GH16 enzymes.
The bacterial origin of the XTH family
Bacterial licheninases (Hahn et al. 1995, Furtado et al. 2011), long-standing candidates for the origin of plant XTHs (Barbeyron et al. 1998, Michel et al. 2001), were recently classified into the GH16_21 group (Viborg et al. 2019). To revisit bacterial origins, a protein tree of GH16_21 members and charophycean EG16-2s was constructed and superimposed on the taxonomy of source organisms and the occurrence of known enzymatic activity. This analysis revealed that bacterial GH16_21 proteins are encoded by single-copy genes, contrasting with the dozens of XTH genes in land plant genomes. The GH16_21 proteins were divided into two major clades that reflected differences in bacterial taxonomic groups and enzymatic activity. One clade comprised Firmicutes enzymes, including well-characterized licheninases of Bacillus and Clostridium species. The other clade comprised Alphaproteobacteria enzymes, including ExoKs. Charophycean EG16-2s were nested within the latter clade.
Fig. 6
Phylogenetic inference of the bacterial origin of EG16-2s. A maximum-likelihood phylogram of protein sequences in GH16_21 and of Charophyta EG16-2s is shown. Arrows indicate biochemically characterized enzymes. Support values of major nodes are shown on branches.
An ExoK gene was first discovered in a gene cluster of Ensifer meliloti (formerly Rhizobium meliloti or Sinorhizobium meliloti; Glucksmann et al. 1993), a nitrogen-fixing bacterium dwelling in root nodules of leguminous hosts (Corbin et al. 1983). The gene cluster is involved in the biosynthesis, export and modification of succinoglycan (Mendis et al. 2013), a branched acidic polysaccharide found in biofilms of Proteobacteria (Halder et al. 2017). ExoKs cleave the β(1,4)-glucosyl linkage subsequent to the β(1,3)-galactosyl residue of succinoglycan (Staehelin et al. 2006) and liberate oligosaccharides that facilitate symbiosis between bacteria and host plants (Mendis et al. 2013). However, succinoglycan occurs in bacteria that have no clear symbiotic relationships with plants (Halder et al. 2017), and conservation of ExoK genes in such bacteria suggests a more general function for ExoKs. Recently, an ExoK knockout mutant of Agrobacterium sp. ATCC 31749 was reported to produce a firmer biofilm (Gao et al. 2020). Accordingly, biofilm-loosening ExoK-like enzymes are the likely bacterial origin of plant XTHs.
Alphaproteobacteria ExoKs and Firmicutes licheninases exhibited similar folding structures but also had notable variations in amino acid residues at the aligned sites corresponding to Q46, S67 and H83 of mature PttXET16-34. The Q46 and H83 sites were located in the substrate-binding cleft, but the Q and H residues of ExoKs provided more space than did the R and W residues of licheninases. Considered alongside their different substrate specificities (Planas 2000, Staehelin et al. 2006), this structural variation supports the notion that ExoKs and licheninases are functionally differentiated. The deep branches in the protein tree suggest that divergence occurred as early as the last common ancestor of Alphaproteobacteria and Firmicutes, which can be traced to more than 3 billion years ago (Moore et al. 2017). The distinct differences in cell-wall structure and composition between those bacterial groups are widely recognized and visualized by Gram staining (Battistuzzi and Hedges 2009) and may underlie the self/nonself targeting nature of these enzyme groups. Firmicutes may have gained licheninase-like enzymes during evolution to digest other bacteria in the ancient prokaryotic world, where such bacterial cells were postulated to be almost the only substrates for heterotrophs (Schönheit et al. 2016).
Fig. 7
Comparison between Firmicutes licheninases and Alphaproteobacteria ExoKs. (A) Mutual information profile estimating amino acid covariation in licheninase- and ExoK-clades of GH16_21. In the profile, peaks above 0.7 (> the 3-σ detection limit) are annotated, and the active site is shaded in light blue. (B) Frequency logos at the positions of the annotated peaks. (C) Three-dimensional structures of licheninase (3o5s) and ExoK (AAA16048 modeled using the 3o5s template).
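The kind of profile shown in panel (A) can be illustrated with a minimal sketch that scores each alignment column by the mutual information between the residue at that column and the clade label of the sequence. This is one plausible reading of the caption, not the authors' documented procedure; the toy alignment, the function names and the use of bits (log base 2) are assumptions, and only the 0.7 cutoff is taken from the caption.

```python
import math
from collections import Counter
from typing import List, Sequence


def mutual_information(residues: Sequence[str], labels: Sequence[str]) -> float:
    """Mutual information (in bits) between residues and clade labels."""
    n = len(residues)
    joint = Counter(zip(residues, labels))
    res_counts = Counter(residues)
    lab_counts = Counter(labels)
    mi = 0.0
    for (res, lab), count in joint.items():
        p_joint = count / n
        # p(res, lab) / (p(res) * p(lab)) written with raw counts
        ratio = count * n / (res_counts[res] * lab_counts[lab])
        mi += p_joint * math.log2(ratio)
    return mi


def mi_profile(alignment: List[str], labels: List[str]) -> List[float]:
    """Score every column of an equal-length alignment."""
    return [mutual_information(column, labels) for column in zip(*alignment)]


# Hypothetical three-column toy alignment (not real sequence data).
toy_alignment = ["AQH", "AQH", "AQH", "ARW", "ARW", "GRW"]
toy_labels = ["ExoK", "ExoK", "ExoK", "lich", "lich", "lich"]

profile = mi_profile(toy_alignment, toy_labels)
peaks = [i for i, score in enumerate(profile) if score > 0.7]
```

In the toy data, the second and third columns separate the two clades perfectly and reach the two-clade maximum of 1 bit, whereas the first column varies within a single clade and stays below the 0.7 cutoff.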
GH16_21 comprises numerous bacterial enzymes and a small number of eukaryotic enzymes (Viborg et al. 2019). The eukaryotic enzymes were nested within the clade of Firmicutes licheninases and confined to Neocallimastigomycota (), a group of anaerobic fungi dwelling in the gastrointestinal tracts of herbivorous animals (Hibbett et al. 2007). HGT events of plant-degrading enzyme genes from bacteria are rampant in this fungal lineage and are postulated to coincide with or postdate the evolutionary habitat transition to the guts of herbivores (Murphy et al. 2019). This transition is estimated to have occurred close to 70 million years ago (Wang et al. 2019), much later than the emergence of XTHs in plants (∼600 million years ago; ). The differences in their putative bacterial donors and the estimated timings of the relevant HGT events indicate that the GH16_21 proteins of Neocallimastigomycota are essentially unrelated to the non-plant origin of the XTH family.
Early divergence of XTHs in the land plant lineage
Phylogenetic analysis revealed that the diversification of XTH groups I and III predated the emergence of land plants (, ), but it was unclear which group emerged first. Genome datasets from bryophytes (Rensing et al. 2008, Bowman et al. 2017, Zhang et al. 2020), the basal lineage of the land plants (Wickett et al. 2014), included only these two groups (Dataset S1; ), and the bryophyte XTHs were equidistant from the algal XTHs (). Group III XTHs include a group III-specific extension in loop 2 (Eklöf and Brumer 2010). This extension was absent in algal XTHs and was confined to group III throughout the land plant lineage (Dataset S1; ). This indicates that group I was the ancestral lineage from which group III arose (). Differences in enzymatic activity between groups I and III remain unknown, and both groups have XET activity (Eklöf and Brumer 2010).
Divergence of group II XTHs in euphyllophytes
A previous protein tree of the A. thaliana XTH family suggested that group II members formed a distinct monophyletic clade (Rose et al. 2002). However, incorporation of the Oryza sativa XTH family into the protein tree introduced ambiguity into the group II boundary and led to the introduction of composite group I/II (Yokoyama et al. 2004). The signature residues of group II A. thaliana XTH family members were therefore examined in this context, and H102 and Q116 were identified (). Q116 was previously reported as the signature residue of group II (Eklöf and Brumer 2010), and the two corresponding residues (Q102 and R116) of PttXET16-34 (group I) were adjacent to the catalytic nucleophile E89, allowing interaction with the β-(1,4)-glucan backbone of the xyloglucan substrate (Johansson et al. 2004; ). Examination of amino acid variation at the two sites in an evolutionary context revealed a coevolutionary link between the sites (, ). Most of the examined XTH sequences had either the Q102-R116 or the H102-Q116 combination, the former of which was predominant and was the ancestral form (, ). The evolutionary shift to the H102-Q116 combination occurred at least five times: once in a charophycean alga, twice in bryophytes, once in Selaginella moellendorffii, and once in the euphyllophyte lineage that includes ferns and seed plants (, ). Similar to convergent evolution of proteins (Uchiyama et al. 2020), the first four events were instances of convergent coevolution of amino acid residues, but the last event led to the formation of group II (). In this circumscription, group II of the O. sativa XTH family comprised 15 members, two of which (OsXTH4 and OsXTH17) harbored secondary mutations (). These secondary mutations may underlie the previous fusion of groups I and II.
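The tally of residue combinations at two aligned sites described above can be illustrated with a minimal sketch. It is not the authors' code: the function name `pair_counts`, the toy sequences and the column indices are all hypothetical, and columns 1 and 4 merely stand in for the alignment positions corresponding to sites 102 and 116 of PttXET16-34.

```python
from collections import Counter


def pair_counts(alignment, col_a, col_b):
    """Count residue combinations at two 0-based alignment columns."""
    return Counter(seq[col_a] + seq[col_b] for seq in alignment)


# Hypothetical aligned fragments (not taken from Dataset S1).
# Column 1 stands in for site 102 and column 4 for site 116.
toy = ["AQGGRA", "AQGTRA", "AHGGQA", "AQGGRS", "AHGTQA"]

counts = pair_counts(toy, 1, 4)
for pair, n in counts.most_common():
    print(pair, n)
```

In a tally of this kind, a strong coevolutionary link would appear as a few dominant combinations (here the toy Q-R and H-Q pairs) and few or no mixed pairs such as Q-Q or H-R.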
Fig. 8
Comparison of XTH family groups. (A) Mutual information profiles estimating amino acid covariation in groups of the Arabidopsis thaliana XTH family. In the profiles, peaks above 0.8 (> the 3-σ detection limit) are annotated, and the active site is shaded in light blue. In the annotation, digits after + denote Nth gap sites from preceding residues in PttXET16-34. (B) Frequency logos at the positions of the annotated peaks. The empty symbol represents a gap in aligned amino acid sequences. Signature residues of group II are colored in purple, and the corresponding residues of the other groups are colored in green. (C) Q102 and R116 residues in the three-dimensional structure of the enzyme-substrate complex (1umz) of PttXET16-34 and xyloglucan nonasaccharide, where E89 is the catalytic nucleophile.
Within the group of horsetail (E. fluviatile) XTHs, EfHTG has broader substrate specificities than EfXTH-A, EfXTH-H or EfXTH-1 (Simmons et al. 2015, Holland et al. 2020). Similarly, nasturtium (Tropaeolum majus) TmXET6.3 has a broad substrate specificity that has not been found in other nasturtium XTHs (Stratilová et al. 2019). In the aforementioned circumscription, both EfHTG and TmXET6.3 belonged to group II (Dataset S1; ), and EfXTH-A, EfXTH-H and EfXTH-1 belonged to group I (Dataset S1). Group I was ancestral to composite group I/II (), and the broad substrate specificities of EfHTG and TmXET6.3 can therefore be viewed as an evolutionary gain caused by the coevolution of the substrate-interacting residues.
Divergence of group III-A in seed plants
Compared with XTHs in composite group I/II, group III XTHs contain an extension in loop 2 (Eklöf and Brumer 2010), and this group is divided into groups III-A and III-B (Baumann et al. 2007). XTHs in group III-A predominantly have XEH activity (Baumann et al. 2007, Kaewthai et al. 2013). This activity is a derived characteristic attributed to the loop 2 extension, which is longer than the group III-B loop extension (Baumann et al. 2007, Eklöf and Brumer 2010). Multiple alignment of land plant XTH sequences showed that the III-A-type loop extension was confined to seed plants, being found in conifers, cycads and angiosperms but not in ferns, lycophytes or bryophytes (). TmNXG1, the first identified member of group III-A, is postulated to be involved in the mobilization of seed-storage xyloglucan (de Silva et al. 1993), and some other members of group III-A are localized to rapidly growing tissues (Tabuchi et al. 2001, Kaewthai et al. 2013, Hara et al. 2014). Therefore, it is likely that the emergence of group III-A is correlated with the evolutionary development of seeds and rapid seedling growth.
Divergence of the ancestral group of XTHs in core angiosperms
Four members (AtXTH1, AtXTH2, AtXTH3 and AtXTH11) of the A. thaliana XTH family, first reported as members of group I (Rose et al. 2002), were reclassified as a distinct small group within the expanded collection of XTH sequences (Baumann et al. 2007, Miedes and Lorences 2007). This small group is called the ‘ancestral’ group (Baumann et al. 2007) or ‘group IV’ (Miedes and Lorences 2007). As the name ‘ancestral’ suggests, this group displays deep phylogram branching (Baumann et al. 2007, Yokoyama et al. 2010, Shinohara et al. 2017; ). However, early-diverging land plants such as bryophytes and lycopods encode no members of the ancestral group (Yokoyama et al. 2010, Shinohara et al. 2017; ), and this constitutes an incongruity between the protein and species trees.
Signature residues of the ancestral group were examined to address the protein/species discrepancy (). A comparison between the ancestral group and other XTH groups identified no signature residues, but a comparison between the ancestral group and the remaining members of composite group I/II identified two signature residues in the N-terminal quarter region (). The two residues were found preferentially in both the ancestral group and group III, suggesting that an intra-family gene fusion between groups I and III led to the emergence of the ancestral group. Gene fusions are known to occur between promoter and coding regions as well as between different coding regions, presumably in a random fashion, and a small percentage of genes have undergone traceable inter-protein fusion events (Durrens et al. 2008). In addition, many examples of circularly permutated gene pairs have been identified whose evolutionary origins are explained by intra-family gene fusion (Bliven and Prlić 2012). These observations support the hypothesis that intra-family gene fusion events may have occurred and descended over generations at non-negligible levels.
The deep branch of the ancestral group can be explained by the early divergence of its parental groups but is not explained by early emergence of this group per se.
An early lineage of plants encoding members of the ancestral group was examined. Amborella genome and Austrobaileyales transcriptome datasets did not include any group members, but transcriptome datasets of Peperomia fraseri and Houttuynia cordata, both early-diverging core angiosperms, did include some ancestral group members (). The origin of the ancestral group was thus traced to core angiosperms (i.e. Mesangiospermae). Some monocots (e.g. O. sativa and Zea mays) also encoded some ancestral XTHs (), but others, namely Spirodela polyrhiza (common duckmeat) and Dioscorea alata (purple yam), did not (). The eudicot tree Eucalyptus grandis (flooded gum) encoded a relatively large number of ancestral group members (five; ). Thus, both lineage-specific loss and expansion of the ancestral group occurred in core angiosperms. The emergence of the ancestral group was mapped as the latest event subsequent to the aforementioned divergence of groups I-to-III, I-to-II and III-A-to-III-B (). In this light, the term ‘group IV’ (Miedes and Lorences 2007) may be preferable to ‘ancestral’ for future use.
Four A. thaliana XTHs were members of the ancestral group. Of these, AtXTH3 had cellulose:cellooligosaccharide and cellulose:xyloglucan-oligosaccharide transglycosylation activities alongside its XET activity (Shinohara et al. 2017). This broad substrate specificity resembles the activities of EfHTG (Simmons et al. 2015) and TmXET6.3 (Stratilová et al. 2019). However, AtXTH3 neither had the H102-Q116 combination nor belonged to group II (). Therefore, it is likely that multiple types of protein evolution have led to the evolutionary gain of broad substrate specificities in the XTH family.
Discussion
Co-occurrence of XTHs and aligned cellulose microfibrils
In the evolutionary history of the plant cell wall, the origins of the rosette-type cellulose-synthase complex and some of the core component proteins that facilitate the association between cellulose synthases and cortical microtubules are found in Zygnematophyceae (Lampugnani et al. 2019). This corresponds with the origin of the XTH family. Thus, there is a noticeable evolutionary shift in the plant cell wall between Coleochaetophyceae (and earlier charophycean algae) and the composite clade of Zygnematophyceae and land plants. This evolutionary change theoretically allows plant cells to grow in a microtubule-oriented manner with the aid of XTHs and is consistent with the previously reported notion that the function of XTHs is the reversible loosening of plant cell walls composed of aligned cellulose microfibrils enmeshed in matrix polysaccharides (Albersheim 1975, Nishitani 1997).
The mechanism underlying the expansion of the XTH multigene family
The genomes of two Zygnematophyceae species (S. muscicola and M. endlicherianum) contained multigene XTH families (Cheng et al. 2019). These two algae are unicellular (Cheng et al. 2019), and the expansion of the XTH family in Zygnematophyceae is therefore unlikely to be driven only by the acquisition of a multicellular body plan, which often involves multiple types of cell differentiation. The question of how duplicated genes survive and undergo neofunctionalization over evolutionary time scales has been debated for decades, and various theoretical models have been proposed (Soskine and Tawfik 2010). Currently favored models entail latent and promiscuous protein functions to explain both survival and neofunctionalization of duplicated genes (Soskine and Tawfik 2010). Some XTHs have enzymatic activities in addition to XET activity (Simmons et al. 2015, Shinohara et al. 2017, Stratilová et al. 2019, 2020), and this broader activity is likely to have emerged multiple times during evolution. Thus, the expansion and diversification of the XTH family may have been driven not only by the acquisition of new enzymatic activities (e.g. XEH in group III-A), but also by the relaxation of substrate specificity (e.g. the broader specificities of group II).
Vehicles for seemingly altruistic gene transfer
The bacterial origin of plant XTHs and other related enzymes, such as EG16s and EG16-2s, appeared to be Alphaproteobacteria ExoKs rather than the previously supposed Firmicutes licheninases. Alphaproteobacteria species are not uncommon donors of prokaryote-to-eukaryote HGT events, and the mechanisms responsible may include type IV secretion systems, gene transfer agents (GTAs) and Ti plasmids (Le et al. 2014). Of those mechanisms, GTAs are widely distributed in Rhodobacterales (Lang et al. 2012), whose ExoK-like enzymes are phylogenetically proximal to the root of charophycean EG16-2s. GTAs are phage-like particles loaded with random genomic fragments of donor bacteria (Lang et al. 2012) and are capable of transforming evolutionarily unrelated bacteria with high frequency (McDaniel et al. 2010). It is conceivable that the hypothetical HGT event of an Alphaproteobacteria ExoK-like gene to a Klebsormidium-related charophyte may have been caused by GTAs rather than by an accidental intake of naked DNA.
Adaptation of plants to an ancient icy environment
The HGT event from ExoK-like enzymes in bacteria to EG16-2s in plants was mapped to the time of divergence between M. viride, free-swimming freshwater algae covered with mineralized scaly cell walls (Domozych et al. 1991), and Klebsormidium, sessile but airborne algae that are found globally, including in polar regions (Ryšánek et al. 2016). The HGT event coincides with the evolutionary gain of the sessile vegetative phase in plants (Nishiyama et al. 2018) and the biosynthesis of xyloglucan (Del-Bem 2018, Herburger et al. 2018). Recent estimates (Donoghue and Paps 2020) mapped this HGT event to the Cryogenian (720–635 million years ago). The climate in this geological period was hypothesized to have led to global glaciation or Snowball Earth (Hoffman et al. 2017), and sessile plant traits on the modern Earth may have been co-opted as an adaptation to the ancient icy environment. In this light, acquired EG16-2s may have played a role in controlled cell-wall-loosening under sporadically occurring favorable conditions for growth on ice. Another possible role of EG16-2s may have been the modification of secreted polysaccharides, changing ice nucleation and allowing algae to effectively sunbathe on ice. Recently, high xyloglucan contents were reported in soil of a glacial forefield (Galloway et al. 2018), and environmental xyloglucan secreted by plant roots and rhizoids was proposed to have a pivotal role in the formation of the soil biosphere (Galloway et al. 2018). Such secretions may have originated from ancient charophycean snow algae.
The trinity of bacteria, plants and fungi
The HGT event leading to the development of the fungal CRH family may have coincided with the emergence of terrestrial fungi. Fungal terrestrialization involved the loss of flagellar-mediated swimming ability and the gain of hyphal growth. Acquisition of these features overlapped with the Cryogenian period, which led to the notion that fungal terrestrialization took place in the ancient icy environment (Naranjo-Ortiz and Gabaldón 2019). In this light, as with the acquisition of EG16-2s by charophycean plants, the acquisition of CRHs by fungi may have assisted growth by controlling cell-wall-loosening under sporadically occurring favorable conditions on ice, allowing adaptation to the Snowball Earth. Overall, the ancient successive transfer of GH16 enzymes from bacteria to plants and fungi may have provided the basis for the sessile lifestyles of plants and fungi.
Conclusions
The aim of this study was to improve understanding of the evolutionary origins and development of XTHs. Several conclusions can be drawn. (i) The likely non-plant origin of the plant-specific XTH family is Alphaproteobacteria ExoKs, which cleave succinoglycans and loosen bacterial biofilms. This functional framework is conserved among its putative descendants, including XTHs, which cleave xyloglucan in plant cell walls, and CRHs, which cleave chitin in fungal cell walls. (ii) Free-swimming scaly algae related to M. viride may have acquired Alphaproteobacteria ExoK-like genes during the Cryogenian period. This HGT event may have helped algae to develop sessile lifestyles and adapt to the ancient icy environment. Putative direct descendants of the acquired enzymes are Klebsormidium EG16-2s. (iii) Flagellated fungi in the Cryogenian period may have experienced an HGT event leading to the fungal CRHs. This HGT event may have aided development of hyphal growth in fungi and survival in the ancient icy environment. (iv) The first XTHs may have evolved in desmid-related freshwater algae through extension of the C-terminal region of EG16-2s. This event coincides with the evolutionary gain of microtubule-oriented directional growth, which is well conserved across the land plants. (v) Multiple XTH family diversification events occurred, primarily in the land plant lineage. These diversification events correlated with protein evolution (loop extension, coevolution of substrate-interacting residues and intra-family gene fusion) and also correlated with the emergence of some plant taxonomic groups (land plants, euphyllophytes and seed plants). At least some of the protein evolutionary events were related to the relaxation of substrate specificity, and this relaxation may have underpinned XTH family expansions. Overall, this study provides new insights into the evolutionary history of the XTH family and will facilitate future analysis of related enzymes.
Materials and Methods
Collection of protein sequences
Protein sequences of XTHs, EG16s and EG16-2s were retrieved from public databases and published datasets. Identities and sequences are collated in Dataset S1. Retrieved sequences were examined using multiple sequence alignments and phylogenetic analysis to obtain a non-redundant set of sequences for each genome. Transcriptome datasets from a prior census (Behar et al. 2018) were used for analysis when genome datasets for species of interest were unavailable at the time of analysis. Bacterial and fungal sequences of the GH16_18, 19, 21, 22 and 23 subfamilies (Viborg et al. 2019) were retrieved from the CAZy database. Taxonomically representative and/or biochemically characterized microbial sequences were used for analysis and are listed in Dataset S1.
Indexing of protein sequence data
Sequences used in this study are detailed in Dataset S1. This comprises a ‘Summary’ worksheet and four separate worksheets for enzyme groups ‘GH16_21’, ‘EG16s’ (+ EG16-2s), ‘XTHs’ and ‘Fungal’ GH16_18, _19, _22 and _23 enzymes. The ‘Summary’ sheet details the members in each enzyme group and provides references within this study (e.g. figures and tables). The four enzyme group worksheets collate the protein sequences and their source organisms and are formatted as in the prior census (Behar et al. 2018). Group information is also included where possible. Fragmentary sequences (often less than ∼200 residues) and presumably mispredicted or pseudogenized protein sequences, some of which were omitted from phylogenetic analysis, are retained within Dataset S1 and are marked accordingly.
Species trees and time scale estimates
Classification of organisms was based on the NCBI taxonomy browser and the literature—for Streptophyta (Wickett et al. 2014), Zygnematophyceae (Cheng et al. 2019), Lycophyta (Field et al. 2016), Chlorophyta (Leliaert et al. 2012), fungi (Hibbett et al. 2007, Naranjo-Ortiz and Gabaldón 2019) and bacteria (Battistuzzi and Hedges 2009, Castelle and Banfield 2018). Estimates of divergence times and placement of evolutionary gain events were based on the literature—for divergence time in plant (Langdale and Harrison 2008, Donoghue and Paps 2020), fungal (Berbee et al. 2017, Tedersoo et al. 2018, Naranjo-Ortiz and Gabaldón 2019) and bacterial (Battistuzzi and Hedges 2009, Moore et al. 2017) lineages and for the evolutionary gain events in plants (Del-Bem 2018, Nishiyama et al. 2018, Lampugnani et al. 2019).
User interfaces for sequence analysis
The SignalP 4.1 server (Petersen et al. 2011) was used for signal peptide prediction. SeaView software (Gouy et al. 2010) was used for MUSCLE sequence alignment (Edgar 2004), maximum-likelihood tree construction (Guindon et al. 2010) and calculation of approximate likelihood-ratio test (aLRT) (Anisimova and Gascuel 2006) branch support values. The ggtree package (Yu et al. 2017) was used for tree rendering and graphical annotation.
Computation of mutual information
Mutual information values for amino acid variation (including gaps) and grouping of aligned sequences were calculated as previously reported (Weckwerth and Selbig 2003). Values were normalized according to Equation 4 (Kvålseth 1987) to allow comparison between datasets. An overview of the procedure is summarized ().
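The calculation described above can be sketched as follows. This is an illustrative sketch rather than the authors' implementation: the function names and the toy alignment are hypothetical, and the normalization used here (mutual information divided by the arithmetic mean of the two entropies) is an assumption, because the exact form of Equation 4 in Kvålseth (1987) is not reproduced in the text. Gaps are treated as an ordinary symbol, as stated above.

```python
from collections import Counter
from math import log2


def entropy(symbols):
    """Shannon entropy (in bits) of a sequence of discrete symbols."""
    n = len(symbols)
    return -sum((c / n) * log2(c / n) for c in Counter(symbols).values())


def normalized_mi(column, groups):
    """Normalized mutual information between one alignment column and
    the group label of each sequence. Gaps ('-') count as a symbol.

    ASSUMPTION: normalization by the mean of the two entropies; the
    paper's Equation 4 (Kvalseth 1987) may use a different denominator.
    """
    if len(column) != len(groups):
        raise ValueError("column and groups must have equal length")
    h_col = entropy(column)
    h_grp = entropy(groups)
    h_joint = entropy(list(zip(column, groups)))
    mi = h_col + h_grp - h_joint
    denom = (h_col + h_grp) / 2
    return mi / denom if denom > 0 else 0.0


def mi_profile(alignment, groups):
    """Normalized MI for every column of an equal-length alignment."""
    return [normalized_mi(col, groups) for col in zip(*alignment)]


# Hypothetical toy alignment and group labels (not from Dataset S1).
alignment = ["QAR", "QGR", "HAQ", "HGQ"]
groups = ["I", "I", "II", "II"]

profile = mi_profile(alignment, groups)
print(profile)
```

In this toy case, the first and third columns separate the two groups perfectly and the second column carries no group information, so a profile of this kind peaks at group-specific (signature) positions.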
Comparison of three-dimensional protein structures
Three-dimensional coordinates of protein structures were retrieved from the RCSB Protein Data Bank (Burley et al. 2020) or were modeled on and retrieved from the SWISS-MODEL server (Waterhouse et al. 2018). The UCSF Chimera software (Pettersen et al. 2004) was used for comparison and rendering of the protein structures. The N-terminal loop extension (A45–R59 in 3o5s; S53–S67 in P33693) and donor-side (formerly C-terminal) extension (C207–I272 in 1umw) were annotated as described previously (Behar et al. 2018). The acceptor-side extension (K24–S41, A245–D266 in 6ibw) was annotated in this study.
Authors: Paul F Hoffman; Dorian S Abbot; Yosef Ashkenazy; Douglas I Benn; Jochen J Brocks; Phoebe A Cohen; Grant M Cox; Jessica R Creveling; Yannick Donnadieu; Douglas H Erwin; Ian J Fairchild; David Ferreira; Jason C Goodman; Galen P Halverson; Malte F Jansen; Guillaume Le Hir; Gordon D Love; Francis A Macdonald; Adam C Maloof; Camille A Partin; Gilles Ramstein; Brian E J Rose; Catherine V Rose; Peter M Sadler; Eli Tziperman; Aiko Voigt; Stephen G Warren Journal: Sci Adv Date: 2017-11-08 Impact factor: 14.136
Authors: Stefan A Rensing; Daniel Lang; Andreas D Zimmer; Astrid Terry; Asaf Salamov; Harris Shapiro; Tomoaki Nishiyama; Pierre-François Perroud; Erika A Lindquist; Yasuko Kamisugi; Takako Tanahashi; Keiko Sakakibara; Tomomichi Fujita; Kazuko Oishi; Tadasu Shin-I; Yoko Kuroki; Atsushi Toyoda; Yutaka Suzuki; Shin-Ichi Hashimoto; Kazuo Yamaguchi; Sumio Sugano; Yuji Kohara; Asao Fujiyama; Aldwin Anterola; Setsuyuki Aoki; Neil Ashton; W Brad Barbazuk; Elizabeth Barker; Jeffrey L Bennetzen; Robert Blankenship; Sung Hyun Cho; Susan K Dutcher; Mark Estelle; Jeffrey A Fawcett; Heidrun Gundlach; Kousuke Hanada; Alexander Heyl; Karen A Hicks; Jon Hughes; Martin Lohr; Klaus Mayer; Alexander Melkozernov; Takashi Murata; David R Nelson; Birgit Pils; Michael Prigge; Bernd Reiss; Tanya Renner; Stephane Rombauts; Paul J Rushton; Anton Sanderfoot; Gabriele Schween; Shin-Han Shiu; Kurt Stueber; Frederica L Theodoulou; Hank Tu; Yves Van de Peer; Paul J Verrier; Elizabeth Waters; Andrew Wood; Lixing Yang; David Cove; Andrew C Cuming; Mitsuyasu Hasebe; Susan Lucas; Brent D Mishler; Ralf Reski; Igor V Grigoriev; Ralph S Quatrano; Jeffrey L Boore Journal: Science Date: 2007-12-13 Impact factor: 47.728
Authors: Stephen K Burley; Charmi Bhikadiya; Chunxiao Bi; Sebastian Bittrich; Li Chen; Gregg V Crichlow; Cole H Christie; Kenneth Dalenberg; Luigi Di Costanzo; Jose M Duarte; Shuchismita Dutta; Zukang Feng; Sai Ganesan; David S Goodsell; Sutapa Ghosh; Rachel Kramer Green; Vladimir Guranović; Dmytro Guzenko; Brian P Hudson; Catherine L Lawson; Yuhe Liang; Robert Lowe; Harry Namkoong; Ezra Peisach; Irina Persikova; Chris Randle; Alexander Rose; Yana Rose; Andrej Sali; Joan Segura; Monica Sekharan; Chenghua Shao; Yi-Ping Tao; Maria Voigt; John D Westbrook; Jasmine Y Young; Christine Zardecki; Marina Zhuravleva Journal: Nucleic Acids Res Date: 2021-01-08 Impact factor: 16.971