Elisabeth Fitzek1,2,3, Lauren Orton1, Sarah Entwistle1, W Scott Grayburn1, Catherine Ausland1, Melvin R Duvall1, Yanbin Yin1,4.
Abstract
Previous analysis of charophyte green algal (CGA) genomes and transcriptomes for specific protein families revealed that numerous land plant characteristics had already evolved in CGA. In this study, we have sequenced and assembled the transcriptome of Zygnema circumcarinatum UTEX 1559, and combined its predicted protein sequences with those of 13 additional species [five embryophytes (Emb), six charophytes (Cha), and two chlorophytes (Chl) as the outgroup] for a comprehensive comparative genomics analysis. In total, 25,485 orthologous gene clusters (OGCs, equivalent to protein families) of the 14 species were classified into nine OGC groups. For example, the Cha+Emb group contains 4,174 OGCs found in both Cha and Emb but not Chl species, representing protein families that evolved in the common ancestor of Cha and Emb. Different OGC groups were subjected to a Gene Ontology (GO) enrichment analysis with the Chl+Cha+Emb group (including 5,031 OGCs found in Chl, Cha, and Emb) as the control. Interestingly, nine of the top 20 enriched GO terms in the Cha+Emb group are cell wall-related, such as biological processes involving celluloses, pectins, lignins, and xyloglucans. Furthermore, three glycosyltransferase families (GT2, GT8, and GT43) were selected for in-depth phylogenetic analyses, which confirmed their presence in UTEX 1559. More importantly, among the different CGA groups, only Zygnematophyceae has land plant cellulose synthase (CesA) orthologs, while other charophyte CesAs form a CGA-specific CesA-like (Csl) subfamily (which likely also carries cellulose synthesis activity). Quantitative real-time PCR experiments were performed on selected GT family genes in UTEX 1559. After osmotic stress treatment, significantly elevated expression was found for the GT2 family genes ZcCesA, ZcCslC, and ZcCslA-like (the latter two possibly xyloglucan and mannan synthases, respectively), as well as for GT8 family genes (possibly pectin synthases). Together, these results suggest that the UTEX 1559 cell wall polysaccharide synthesis-related genes respond to osmotic stress in a manner similar to land plants.
Keywords: RNA-seq; Zygnema circumcarinatum; charophyte green algae; gene expression; glycosyltransferases; osmotic stress
Year: 2019; PMID: 31231410; PMCID: PMC6566377; DOI: 10.3389/fpls.2019.00732
Source DB: PubMed; Journal: Front Plant Sci; ISSN: 1664-462X; Impact factor: 5.753
TABLE 1. Summary of UTEX 1559 RNA-Seq assembly and annotation.
| Product | Count | Percent |
|---|---|---|
| Total no. of genes | 58,087 | |
| Total no. of transcripts | 66,952 | |
| Total no. of proteins | 43,573 | |
| GC% | 50.6 | |
| Contig N50 | 2,011 bp | |
| Avg. contig length | 1,027 bp | |
| Total assembled bases | 68,772,302 | |
| TAIR10_BLASTX | 25,366 | 37.88<sup>a</sup> |
| Kni_BLASTX | 27,166 | 40.57<sup>a</sup> |
| UniRef_BLASTP | 29,133 | 66.86<sup>b</sup> |
| TAIR10_BLASTP | 24,608 | 56.48<sup>b</sup> |
| Kni_BLASTP | 26,516 | 60.86<sup>b</sup> |
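The footnotes for the a and b markers in the Percent column did not survive in this copy. The short check below suggests that a is the share of the 66,952 transcripts and b the share of the 43,573 proteins. That reading is an inference from the counts in the table, not something the table itself states.

```python
# Counts are copied from the assembly summary table above.
# ASSUMPTION: BLASTX percentages use the transcript total as denominator and
# BLASTP percentages use the protein total; the source footnotes are missing.
TOTAL_TRANSCRIPTS = 66_952
TOTAL_PROTEINS = 43_573

hits = {
    # label: (hit count, reported percent, assumed denominator)
    "TAIR10_BLASTX": (25_366, 37.88, TOTAL_TRANSCRIPTS),
    "Kni_BLASTX": (27_166, 40.57, TOTAL_TRANSCRIPTS),
    "UniRef_BLASTP": (29_133, 66.86, TOTAL_PROTEINS),
    "TAIR10_BLASTP": (24_608, 56.48, TOTAL_PROTEINS),
    "Kni_BLASTP": (26_516, 60.86, TOTAL_PROTEINS),
}


def percent(count, total):
    """Return count as a percentage of total."""
    return 100.0 * count / total


for label, (count, reported, total) in hits.items():
    computed = percent(count, total)
    print(f"{label}: computed {computed:.2f}%, reported {reported}%")
```

Under these assumed denominators, every computed value lands within about 0.01 percentage points of the reported one.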
TABLE 2. List of species and their protein count.
| Clade | Species | Abbreviation | Source | Number of proteins |
|---|---|---|---|---|
| Embryophyte (Emb) | | Ath | Phytozome v12 | 27,416 |
| | | Ptr | | 41,335 |
| | | Osa | | 42,189 |
| | | Smo | | 22,273 |
| | | Ppa | | 32,926 |
| Chlorophyte (Chl) | | Vca | | 14,247 |
| | | Cre | | 17,741 |
| Charophyte (Cha-KCM) | | Kni | | 17,207 |
| | | Mvia | SRR1594255 | 110,511 |
| Charophyte (Cha-ZCC) | | Nmia | SRR486217, SRR494512 | 95,381 |
| | | Cora | SRR1594679 | 90,444 |
| | | Spra | SRR1594156 | 23,577 |
| | | Zcira | SRP117803 | 67,762 |
| | | Zygya | SRX5449751 (this study) | 43,573 |
TABLE 3. Classification of orthologous gene clusters (OGCs) into nine groups.
| OGC groups<sup>a</sup> | Max no. of species<sup>b</sup> | Total no. of proteins | Total no. of OGCs |
|---|---|---|---|
| Chl+Cha+Emb | 14 | 115,796 | 5,031 |
| Cha+Emb | 12 | 66,491 | 4,174 |
| Chl+Cha | 9 | 10,323 | 1,221 |
| Chl+Emb | 7 | 1,030 | 140 |
| Emb | 5 | 28,954 | 4,849 |
| Cha | 7 | 22,077 | 3,600 |
| ZCC | 5 | 19,524 | 3,807 |
| KCM | 2 | 257 | 66 |
| Chl | 2 | 6,012 | 2,597 |
| Total | | 270,464 | 25,485 |
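The grouping in the table above can be read as a rule on which clades contribute at least one protein to a cluster. The sketch below is one plausible reading, not the authors' pipeline. It assumes a charophyte-only cluster is labeled Cha when it spans both charophyte sub-clades and ZCC or KCM when it is confined to one, which is consistent with the maximum species counts of 7, 5, and 2 for those rows. The function name `classify_ogc` is hypothetical.

```python
ALLOWED = {"Chl", "KCM", "ZCC", "Emb"}


def classify_ogc(clades_present):
    """Map the clades represented in one cluster to an OGC group label.

    ASSUMPTION: charophyte-only clusters are split into "Cha" (both
    sub-clades present), "KCM", or "ZCC" (only one sub-clade present).
    """
    clades = set(clades_present)
    if not clades or not clades <= ALLOWED:
        raise ValueError(f"unexpected clades: {clades_present!r}")
    has_chl = "Chl" in clades
    has_emb = "Emb" in clades
    cha = clades & {"KCM", "ZCC"}
    if cha and (has_chl or has_emb):
        # Shared groups do not distinguish the charophyte sub-clades.
        parts = (["Chl"] if has_chl else []) + ["Cha"] + (["Emb"] if has_emb else [])
        return "+".join(parts)
    if cha:
        return "Cha" if len(cha) == 2 else next(iter(cha))
    if has_chl and has_emb:
        return "Chl+Emb"
    return "Chl" if has_chl else "Emb"


# OGC counts copied from the table above.
ogc_counts = {
    "Chl+Cha+Emb": 5031, "Cha+Emb": 4174, "Chl+Cha": 1221, "Chl+Emb": 140,
    "Emb": 4849, "Cha": 3600, "ZCC": 3807, "KCM": 66, "Chl": 2597,
}
cha_unique = ogc_counts["Cha"] + ogc_counts["ZCC"] + ogc_counts["KCM"]
print(sum(ogc_counts.values()), cha_unique)
```

The two printed numbers are the grand total of OGCs and the count of OGCs unique to charophytes, so they can be compared against the Total row and the Figure 1 caption.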
FIGURE 1. Distribution of the nine OGC groups in the 14 species. (A) The Venn diagram shows the numbers of OGCs shared by and unique to the three major plant/algal taxonomic groups. The 7,473 OGCs unique to the Cha group are the sum of three OGC numbers (3,600 + 3,807 + 66) in Table 3. Note that the sizes of the seven areas in the diagram are proportional to the number of OGCs and not to the number of proteins. For example, the number of proteins in the Chl+Cha+Emb group is 115,796, which is the largest in Table 3. (B) For each species, the 25,485 OGCs were examined to see whether each contains a protein from that species. The OGC counts for that species were then plotted, with different OGC groups shown in different colors. The species labels on the y-axis are arranged according to phylogenetic relatedness. Note that the x-axis shows the number of OGCs, not the number of proteins.
TABLE 4. Overview of gene ontology (GO) annotation for OGCs of the nine groups.
| OGC groups | No. of GO-annotated proteins | % of GO-annotated proteins | No. of over-represented GO terms<sup>b</sup> |
|---|---|---|---|
| Chl+Cha+Emb | 80,871 | 69.84% | NA<sup>a</sup> |
| Cha+Emb | 41,532 | 62.46% | 501 |
| Chl+Cha | 4,248 | 41.15% | 76 |
| Chl+Emb | 907 | 88.06% | 65 |
| Emb | 24,831 | 85.76% | 404 |
| Cha | 3,785 | 17.14% | 20 |
| ZCC | 6,201 | 31.76% | 42 |
| KCM | 85 | 33.07% | 6 |
| Chl | 4,440 | 73.85% | 54 |
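The percentage column can be reproduced by dividing each group's GO-annotated protein count by that group's total protein count from the OGC classification table. The sketch below is a consistency check on the published numbers only; the dictionary and function names are illustrative.

```python
# Total proteins per group (from the OGC classification table) and
# GO-annotated proteins with reported percentages (from the table above).
total_proteins = {
    "Chl+Cha+Emb": 115_796, "Cha+Emb": 66_491, "Chl+Cha": 10_323,
    "Chl+Emb": 1_030, "Emb": 28_954, "Cha": 22_077,
    "ZCC": 19_524, "KCM": 257, "Chl": 6_012,
}
go_annotated = {
    "Chl+Cha+Emb": 80_871, "Cha+Emb": 41_532, "Chl+Cha": 4_248,
    "Chl+Emb": 907, "Emb": 24_831, "Cha": 3_785,
    "ZCC": 6_201, "KCM": 85, "Chl": 4_440,
}
reported_percent = {
    "Chl+Cha+Emb": 69.84, "Cha+Emb": 62.46, "Chl+Cha": 41.15,
    "Chl+Emb": 88.06, "Emb": 85.76, "Cha": 17.14,
    "ZCC": 31.76, "KCM": 33.07, "Chl": 73.85,
}


def annotated_percent(group):
    """Percentage of a group's proteins that carry a GO annotation."""
    return 100.0 * go_annotated[group] / total_proteins[group]


for group in total_proteins:
    print(f"{group}: {annotated_percent(group):.2f}% (reported {reported_percent[group]}%)")
```

The charophyte-specific groups (Cha, ZCC, KCM) have by far the lowest annotation rates, which is worth keeping in mind when comparing their counts of over-represented GO terms with those of the other groups.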
FIGURE 2. Top 20 GO functions/terms that are over-represented in the Cha+Emb group of OGCs. The y-axis shows the GO terms, and the binomial test adjusted P-values in -log10 form are shown beside the bars. The bars are color-coded according to which of the three top GO levels the term is from. GO terms for cell wall-related functions are highlighted in red font. The detailed methods for this GO enrichment analysis are described in Methods. Detailed data used for this plot are available in Supplementary Data Sheet S1.
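To make the "binomial test adjusted P-value in -log10 form" concrete, here is a minimal sketch of how such a score could be computed for one OGC group against a control group. Several choices are assumptions, since the caption does not specify them: the test is taken as one-sided (over-representation only), the adjustment is a Bonferroni correction, and all counts in the example are made up for illustration.

```python
import math


def binomial_upper_tail(k, n, p):
    """Exact P(X >= k) for X ~ Binomial(n, p)."""
    if k <= 0:
        return 1.0
    return sum(math.comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


def enrichment(term_counts_fg, n_fg, term_counts_bg, n_bg):
    """Return {term: (adjusted P, -log10 adjusted P)} for the foreground group.

    ASSUMPTION: one-sided test with Bonferroni correction; the source does
    not state which multiple-testing adjustment was used.
    """
    n_tests = len(term_counts_fg)
    results = {}
    for term, k in term_counts_fg.items():
        expected_rate = term_counts_bg.get(term, 0) / n_bg  # rate in the control
        p_adj = min(1.0, binomial_upper_tail(k, n_fg, expected_rate) * n_tests)
        score = math.inf if p_adj == 0 else abs(math.log10(p_adj))
        results[term] = (p_adj, score)
    return results


# Hypothetical toy counts, not values from the paper.
foreground = {"cell wall organization": 12, "translation": 5}
background = {"cell wall organization": 50, "translation": 60}
results = enrichment(foreground, 100, background, 1000)
for term, (p_adj, score) in results.items():
    print(f"{term}: adjusted P = {p_adj:.4g}, -log10 = {score:.2f}")
```

In this toy input, the first term appears at more than twice its control rate and comes out enriched, while the second term appears slightly below its control rate and does not.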
TABLE 5. Selected GT enzyme-encoding genes from UTEX 1559 and their best Arabidopsis homologs.
| GT family | Gene name | Length (aa) | Best hit in Arabidopsis | Arabidopsis protein length (aa) | Sequence identity (%) | BLASTP E-value |
|---|---|---|---|---|---|---|
| GT2 | ZcCesA | 1137 | AT5G05170 (AtCesA3) | 1065 | 63 | 0.0 |
| | ZcCesA-like | 280 | AT4G32410 (AtCesA1) | 1081 | 52 | 7e-97 |
| | ZcCslC | 715 | AT2G24630 (AtCslC8) | 690 | 61 | 0.0 |
| | ZcCslA-like | 666 | AT2G35650 (AtCslA7) | 484 | 45 | 2e-106 |
| GT8 | ZcGAUT3 | 731 | AT4G38270 (AtGAUT3) | 676 | 59 | 0.0 |
| | ZcGAUT10 | 624 | AT2G20810 (AtGAUT10) | 536 | 50 | 2e-178 |
| | ZcGAUT13 | 562 | AT3G01040 (AtGAUT13) | 532 | 50 | 0.0 |
| | ZcGATL7-like | 423 | AT3G62660 (AtGATL7) | 361 | 27 | 7e-06 |
| | ZcGolS | 345 | AT1G56600 (AtGolS2) | 335 | 46 | 8e-84 |
| | ZcPGSIP-A-like | 924 | AT1G08990 (AtPGSIP5) | 566 | 35 | 2e-37 |
| | ZcPGSIP-B | 555 | AT5G18480 (AtPGSIP6) | 537 | 49 | 5e-159 |
| | ZcPGSIP-C | 510 | AT4G16600 (AtPGSIP8) | 494 | 47 | 4e-154 |
| GT43 | ZcGT43-A | 501 | AT1G27600 (AtIRX9L) | 394 | 65 | 1e-124 |
| | ZcGT43-B | 711 | AT5G67230 (AtIRX14L) | 492 | 35 | 2e-82 |
| | ZcGT43-C | 485 | AT1G27600 (IRX-9) | 394 | 32 | 1e-35 |
FIGURE 3. The phylogeny of GT2 proteins from selected species of land plants and algae. In total, 248 GT2 protein sequences from 14 plant and algal species were used to build this phylogeny (see section “Materials and Methods”). For Arabidopsis and rice proteins, the gene names (adopted from Wang et al., 2010; Yin et al., 2014) are included in the tree leaves. The UTEX 1559 proteins that were selected for qRT-PCR analysis are highlighted with an orange background, and the proposed gene names (Table 5) are indicated with black lines.
FIGURE 4. Expression response of 15 selected GT genes to osmotic stress. Quantitative RT-PCR was used to measure relative expression as fold change (average ± SE, n = 6) in UTEX 1559 after 1 h of 300 mM sorbitol treatment vs. no treatment (control). Significant p-values (<0.05) from t-tests are marked with ∗, indicating that expression was significantly higher in the sorbitol-treated samples than in the control samples.
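For readers unfamiliar with the "average ± SE, n = 6" notation in the caption, the sketch below computes a mean and its standard error from six replicate fold-change values. The replicate values are hypothetical and chosen only to keep the arithmetic easy to follow. How the fold changes themselves were derived from the raw qRT-PCR data is not stated in this caption and is not modeled here.

```python
import math
import statistics


def mean_and_se(values):
    """Return (mean, standard error of the mean) using the sample SD."""
    n = len(values)
    if n < 2:
        raise ValueError("need at least two replicates to estimate SE")
    return statistics.mean(values), statistics.stdev(values) / math.sqrt(n)


# Hypothetical fold-change replicates (n = 6), not data from the paper.
fold_changes = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
mean_fc, se_fc = mean_and_se(fold_changes)
print(f"fold change = {mean_fc:.2f} ± {se_fc:.2f} (n = {len(fold_changes)})")
```

The standard error shrinks with the square root of the replicate count, so the error bars in a plot like this describe the precision of the mean rather than the spread of the individual replicates.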
FIGURE 5. The phylogeny of land plant CesAs and CGA CesAs (former CslD-like clade). In total, 85 protein sequences from 15 plant and algal species were used to build this phylogeny (see section “Materials and Methods”). These include six proteins (leaf names contain “cbr| ”) from the sequenced C. braunii genome and 79 proteins from the CesA and CslD-like clades of Figure 3. The three seed plant clades are collapsed as black triangles with the representative Arabidopsis proteins indicated (AtCesA4, 7, and 8 are reportedly involved in secondary cell wall cellulose synthase complex (CSC) assembly, and the remaining AtCesAs are involved in primary cell wall CSC assembly). The complete version of this phylogeny is shown in Supplementary Figure S5.