
Genomic resources for energy cane breeding in the post genomics era.

Augusto L Diniz1, Sávio S Ferreira2, Felipe Ten-Caten1, Gabriel R A Margarido3, João M Dos Santos4, Geraldo V de S Barbosa4, Monalisa S Carneiro5, Glaucia M Souza1.   

Abstract

Sugarcane is one of the most sustainable energy crops and has the highest tonnage among cultivated plants. Its high productivity of sugar, bioethanol and bioelectricity makes it a promising green alternative to petroleum. Furthermore, the myriad of products that can be derived from sugarcane biomass has been driving breeding programs towards varieties with a higher yield of fiber and a more vigorous and sustainable performance: the energy cane. Here we provide an overview of the energy cane, including plant description, breeding efforts, types, and end-uses. In addition, we describe recently published genomic resources for the development of this crop and discuss current knowledge of cell wall metabolism, as well as the bioinformatic tools and databases available to the community.
© 2019 The Authors.

Keywords:  Bioenergy; Biofuels; Biomass; Bioproducts; Genomics; Grasses

Year:  2019        PMID: 31871586      PMCID: PMC6906722          DOI: 10.1016/j.csbj.2019.10.006

Source DB:  PubMed          Journal:  Comput Struct Biotechnol J        ISSN: 2001-0370            Impact factor:   7.271


Introduction

Sugarcane (Saccharum spp.) is a perennial, tropical or subtropical non-cereal C4 grass; it is a major crop for food and bioenergy production and the highest tonnage crop in the world [1]. Mainly grown for sugar production in the tropical and subtropical regions of the world, sugarcane has one of the highest solar energy conversion efficiencies and one of the highest biomass yields of any crop [2], [3], [4]. Due to its high yields of both sugar and lignocellulosic biomass, it is considered an important feedstock for substituting fossil fuel energy, either as natural biomass or transformed into liquid or gaseous forms [5]. An additional advantage is that this semi-perennial grass allows harvesting for several years (4 to 5 years) without the need for replanting, which reduces the cost of bioenergy production [6]. The development of conversion processes that use all plant carbon in a biorefinery approach [7], [8] has stimulated the development of plants with several co-products for different applications, including sugars, biofuels and bioelectricity. In this scenario, breeders are designing crosses towards the development and improvement of energy cane, a new type of cane with a high yield of fiber. Over the last century, breeding programs and conventional agricultural research made a great effort to raise the yield of sugarcane and sugar to current levels [9]. In addition, research groups around the world have generated a large amount of molecular and physiological data on sugarcane, and sugarcane geneticists have invested significant effort to explore and dissect the complex genome of cane using a wide range of genomic tools. Together, these factors make data integration a key step towards a broader understanding of cane physiology and its interaction with the environment and climate change, as well as towards the design of more productive varieties.
In this mini review, we focus on bioenergy and energy cane and provide an overview of the current genomic resources and databases for the development of this crop. It is important to note that the theoretical potential of sugarcane dry biomass production is 177 t/(ha yr), equivalent to a fresh weight cane yield of 381 t/(ha yr) [10]. Worldwide sugarcane yields average only around 39 t/(ha yr) of dry biomass and 84 t/(ha yr) of fresh cane. There is therefore great interest in, and opportunity for, narrowing this yield gap.
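The size of the yield gap follows directly from the figures quoted above. The short Python sketch below works through the arithmetic; the variable and function names are illustrative and not taken from the article.

```python
# Figures quoted in the text, in t/(ha yr). Names are illustrative only.
theoretical = {"dry biomass": 177.0, "fresh cane": 381.0}
world_average = {"dry biomass": 39.0, "fresh cane": 84.0}


def yield_gap(potential, actual):
    """Return (absolute gap, fraction of the potential currently realized)."""
    return potential - actual, actual / potential


for basis in theoretical:
    gap, realized = yield_gap(theoretical[basis], world_average[basis])
    print(f"{basis}: gap = {gap:.0f} t/(ha yr), realized = {realized:.0%}")
```

On either basis, average yields reach only about one fifth of the theoretical potential.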

The Saccharum complex

Sugarcane is the world’s leading biomass crop, produced in over 100 countries [1]. Modern sugarcane cultivars are polyploid interspecific hybrids, typically with 10–13 sets of their 10 basic chromosomes, of which 80–85% derive from Saccharum officinarum (2n = 80), 10–15% from S. spontaneum (2n = 40–128) and ~5% are chromosomes recombined between these two ancestors [11], [12]. Gene duplications resulting from polyploidization alter the transcriptional landscape [13] and provide additional flexibility to adapt and evolve new patterns of gene expression for homo(eo)logous gene copies [14]. This flexibility has been suggested to be an important mechanism allowing the diversification of adaptive traits [15], [16] through neofunctionalization of duplicated genes [17] and tissue-specific expression [18]. The high productivity of cane makes this crop an excellent source of sugar, bioethanol and bioelectricity [19] and a promising green alternative to petroleum [20], [21], [22], with vast potential to mitigate climate change without affecting food security [23]. Additionally, the myriad of products that can be derived from sugarcane biomass [24], such as cellulosic bioethanol, further enhances opportunities for sugarcane in a portfolio of technologies needed to transition to a low carbon ‘bioeconomy’. In this scenario, hybrids obtained by crossing commercial varieties of sugarcane with ancestral species, such as S. spontaneum, have allowed the production of genotypes characterized by high fiber content, moderate brix levels, fine stalks and a higher tillering rate – the energy cane [25], [26], [27].

The energy cane

Energy cane is a type of sugarcane with a high yield of fiber that is more vigorous and rustic, i.e. the plants are less demanding in soil, climate, water and nutrients and more resistant to pests and diseases, which brings a series of economic and environmental advantages [25], [28], [29]. Efforts to develop this crop include interspecific hybridization between modern sugarcane varieties and closely related wild species, such as S. spontaneum, which has the greatest potential as a source of genetic variation for a number of important traits for bioenergy production and low-input adaptability [30]. Initially, Tew and Cobill [31] classified energy cane hybrids into two categories: one (Type I) defined as a cane close to conventional sugarcane in sucrose content but with higher fiber content; and the other (Type II) with only marginal sugar content but with fiber content higher than Type I, used exclusively for biomass production. More recently, Kumar et al. [32] proposed a classification of cane varieties based on variation in sucrose and fiber content. According to the authors, cane type I includes traditional commercial sugarcane varieties, with high sugar and commercial yield (13% sucrose and 12% fiber content); cane type II also comprises varieties with high sugar and commercial yield, but with increased fiber content (13% sucrose content and >14% fiber content); cane type III (energy cane) includes multi-purpose varieties, focusing on high biomass production (sucrose content <12% and >22% fiber content); and cane type IV (energy cane) comprises varieties for energy cogeneration purposes (sucrose content <5% and fiber >22%). Compared to commercial sugarcane varieties, energy canes have higher ratooning ability and number of tillers [27], [29], [30], [33], which combined are very important in defining total biomass yield (Fig. 1).
Because this crop is propagated vegetatively, this characteristic is also important in overcoming one of the most significant economic constraints in cane cultivation: clonal multiplication [27]. In addition, there is a profound difference in the root systems; energy cane produces an abundant and vigorous root system, surpassing conventional cane in lateral extension, depth and volume. This trait, shared with the S. spontaneum progenitor, which is considered an invasive weed in some countries [34], allows cultivation in marginal lands because it gives greater rusticity, helps mitigate soil erosion, boosts permanent carbon sequestration and extends the crop life cycle up to 10 years; this is an important attribute due to the high cost of replanting sugarcane [27], [30].
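The four cane types attributed to Kumar et al. [32] can be read as a simple decision rule on sucrose and fiber content. The Python sketch below is only an illustration under stated assumptions: the text gives single reference values for types I and II (13% sucrose; 12% and >14% fiber) rather than ranges, so the boundaries marked as assumptions in the comments are ours, not the authors'.

```python
def classify_cane(sucrose_pct, fiber_pct):
    """Assign a cane type from sucrose and fiber content (both in %).

    Thresholds for types III and IV (fiber > 22%, sucrose < 12% / < 5%)
    are quoted in the text. The cut-offs for types I and II are
    ASSUMPTIONS made for illustration only.
    """
    if fiber_pct > 22:
        if sucrose_pct < 5:
            return "Type IV (energy cane, cogeneration)"
        if sucrose_pct < 12:
            return "Type III (energy cane, multi-purpose)"
        return "unclassified"
    if sucrose_pct >= 13:  # assumption: 13% treated as a lower bound
        if fiber_pct > 14:
            return "Type II (sugarcane, increased fiber)"
        return "Type I (traditional sugarcane)"  # assumption: fiber <= 14%
    return "unclassified"


print(classify_cane(13, 12))
print(classify_cane(3, 25))
```

Combinations that fall between the quoted values are returned as "unclassified" rather than forced into a type.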
Fig. 1

An energy cane RB hybrid and a sugarcane variety (SP791011) at six months after planting, under field conditions at the experimental site ‘Estação de Floração e Cruzamento da Serra do Ouro’ (lat 9° 13′ S, long 35° 50′ W, alt 450 m asl) in Alagoas, Brazil.

Energy cane breeding initiatives began in Puerto Rico, with a pioneering commercial project established to conduct an integrated exploration of sugarcane as a biomass feedstock for multiple products, instead of only sugar [35]. The growing interest in bioenergy in recent decades pushed several sugarcane breeding programs worldwide to also produce commercial energy cane varieties. In the United States, the cultivar L 79-1002 (‘CP 52-68’ x Tainan, S. spontaneum clone) was developed by the Louisiana State University Agricultural Center in cooperation with the USDA-ARS and the American Sugarcane League, Inc. This cultivar has high biomass yield and fiber content, averaging 257 g kg−1 [36]. The breeding program in Barbados has vigorous canes with exceptional fiber content (>30%), which are suitable for energy cogeneration [37]. Mauritius developed a similar program, aiming to increase biomass and fiber yield [38]. Other genetic improvement initiatives have been conducted in Australia [39], [40], Colombia [41], Japan [42], [43], [44], and Thailand [45]. Further details about breeding programs using S. spontaneum as an integral part of their activities can be found in Matsuoka et al. [27] and da Silva [30]. In Brazil, energy cane hybrids were obtained by Canavialis, a private sugarcane breeding company [27]. In 2011, the Brazilian group GranBio, an industrial innovation company focused on ethanol production through biomass conversion, was launched to create the first biorefinery in South America producing cellulosic ethanol from sugarcane residues and energy cane genotypes [46].
New varieties have been developed by RIDESA (Inter-University Network for the Development of Sugarcane Industry) at the experimental site ‘Estação de Floração e Cruzamento da Serra do Ouro’ (lat 9° 13′ S, long 35° 50′ W, alt 450 m asl), Federal University of Alagoas (UFAL), Brazil. This institution maintains an important collection of sugarcane germplasm, which holds modern hybrids and a myriad of Saccharum, Erianthus and Miscanthus accessions. In a study conducted by UFAL/RIDESA researchers, six energy cane clones were selected which presented an overall average of 24.7% higher yield of dry biomass/ha as compared to the standard sugarcane variety (RB0442) (Table 1). Plants were grown under field conditions at São Miguel dos Campos – Alagoas – Brazil; and traits were measured at 13 months after planting and in the second harvest year.
Table 1

Comparison of yield related traits between six RB energy cane clones and a sugarcane standard variety (RB0442).

Group                         Value     FB      DB      Fiber (%)   TFH     TRS     TSH
Top 6 RB energy cane clones   Min       131.8   43.2    20.2        26.6    51      10.6
                              Average   137.6   46.4    22.7        31.2    67.8    12.7
                              Max       153.3   50.1    24.1        33.9    96.2    15.1
RB0442                        Average   126.5   37.2    13.8        17.4    122.1   18
RB clones vs RB0442           %         8.8     24.7    64.5        79.3    −44.5   −29.4
CV                            %         20.94   21.12   16.97       25.00   29.01   28.49

CV = coefficient of variation; FB = fresh biomass (t/ha); DB = dry biomass (t/ha); TFH = fiber (t/ha); TRS = Total Recoverable Sugar (%); TSH = sugar (t/ha)
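The "RB clones vs RB0442" row of Table 1 is the relative difference between the clone average and the standard variety. As a check, the Python sketch below recomputes it from the two "Average" rows; the dictionary and function names are illustrative.

```python
# "Average" rows of Table 1 (units as given in the table legend).
clone_average = {"FB": 137.6, "DB": 46.4, "Fiber": 22.7,
                 "TFH": 31.2, "TRS": 67.8, "TSH": 12.7}
rb0442 = {"FB": 126.5, "DB": 37.2, "Fiber": 13.8,
          "TFH": 17.4, "TRS": 122.1, "TSH": 18.0}


def percent_difference(value, reference):
    """Relative difference of value with respect to reference, in percent."""
    return (value - reference) / reference * 100.0


differences = {
    trait: round(percent_difference(clone_average[trait], rb0442[trait]), 1)
    for trait in clone_average
}
print(differences)
```

The recomputed values agree with the table, including the 24.7% gain in dry biomass and the 44.5% and 29.4% reductions in TRS and TSH.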

There are three major problems in the selection of energy cane clones: high incidence of smut disease (Sporisorium scitamineum), high flowering rates and low unit stem mass. Thus, the major challenge of genetic breeding for the commercial cultivation and consolidation of energy cane cultivars is to use more effective strategies to overcome these three problems. In addition, there are other technological (agro-industrial) bottlenecks, such as: (i) in mechanized harvesting, the development of harvesting machines for high biomass and high fiber cultivars; and (ii) in industrial processing, improving the efficiency of milling for juice extraction (Geraldo Barbosa, UFAL, personal communication).

Sugarcane genomic resources

As mentioned before, modern sugarcane cultivars are polyploid interspecific hybrids and have a large (~10 Gb) and complex genome. Nevertheless, opportunities to accelerate breeding progress and enrich the knowledge of the fundamental biology of this important crop drive efforts to explore and dissect its complex genome using different genomic tools and to develop a high-quality reference genome. After over a decade of multiple parallel genome sequencing initiatives [47], [48], Garsmeur et al. published the first mosaic monoploid genome reference of the modern cultivar R570 [49]. This study relied on 4535 bacterial artificial chromosome (BAC) sequences that were colinear with the gene-rich portion of the sorghum genome (used as a reference). The final assembly consisted of 382 Mb of high-quality sequence in 3965 contigs, organized as a single tiling path, representing the single copy sugarcane gene space, and includes 25,316 predicted protein-coding gene models. The next milestone was the publication of the first allele-defined genome reference of a tetraploid S. spontaneum genotype (AP85-441) [50]. To this end, the authors took advantage of multiple sequencing technologies, including high-throughput chromatin conformation capture (Hi-C), to assemble 32 pseudo-chromosomes (2.9 Gbp) comprising 8 homologous groups of 4 members each, bearing 35,525 genes with alleles defined. Subsequently, Nascimento et al. [51] published a new S. spontaneum gene space reference (from accession US851008), including 39,234 genes. Recently, our group completed the assembly (4.26 Gb) of the Brazilian modern cultivar SP80-3280 [52], which includes the complete sequence of 373,869 genes and their upstream regions, which may be further explored to identify regulatory promoter elements. This is the largest genomic data set available to the sugarcane research community and includes putative homo(eo)logs (mostly 2–5 copies) for a large fraction of the SP80-3280 gene space [52].
To date, most of the sugarcane RNA-seq initiatives are based on de novo transcriptome assembly [53], [54], [55], [56], [57], [58], [59], [60], [61], [62], [63], [64], [65], [66], [67], [68], [69]. Nevertheless, a few studies also consider the closely related species Sorghum bicolor [70], the S. officinarum gene indices (SoGI) v3.0 [71] or the previously de novo assembled transcriptome [72] as a reference for transcriptome analysis. Further transcriptome studies are expected to take advantage of the multiple reference genomes now available. To investigate the value of the three largest public assemblies as a genomic resource, we used RNA-seq data from Kasirajan et al. [71]. These authors created RNA-seq libraries from top and bottom internodes of 20 different genotypes, including commercial cultivars and introgression lines derived from crosses with wild S. spontaneum relatives and Erianthus. We used HISAT2 [73] with default parameters to align these RNA-seq reads against the monoploid R570 hybrid assembly [49], the AP85-441 tetraploid S. spontaneum reference [50] and the SP80-3280 hybrid gene space assembly [52]. For the vast majority of sequenced samples, the SP80-3280 assembly resulted in higher alignment rates than those against the monoploid reference and the S. spontaneum assembly (Fig. 2). Only in the very high end of alignment rates did the AP85-441 assembly perform better for three of the 40 samples. These results show that these genome assemblies, and in particular the SP80-3280, can be used as a reference for downstream genomic studies. Combining both de novo and reference-guided transcriptome approaches, especially by taking advantage of multiple genome references, may allow the understanding of the extent to which homo(eo)logs resemble or differ from each other in their expression patterns, the spatiotemporal dynamics of these relationships, and how epistatic interactions between individual homo(eo)logs affect biological traits.
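The comparison described above amounts to indexing each reference, aligning every library with default HISAT2 parameters and reading the overall alignment rate from the summary. The Python sketch below only builds the command lines and parses a summary; the file names are hypothetical, and the use of `-U` (unpaired reads) is an assumption, since the text does not state the library layout.

```python
import re

# Matches the final line of a HISAT2 alignment summary,
# e.g. "87.45% overall alignment rate".
_RATE = re.compile(r"([\d.]+)% overall alignment rate")


def hisat2_commands(reference_fasta, index_base, reads_fastq, sam_out):
    """Build (index, align) command lists using default parameters.

    All paths are hypothetical; "-U" assumes single-end reads.
    """
    build = ["hisat2-build", reference_fasta, index_base]
    align = ["hisat2", "-x", index_base, "-U", reads_fastq, "-S", sam_out]
    return build, align


def overall_alignment_rate(summary_text):
    """Extract the overall alignment rate (in %) from a HISAT2 summary."""
    match = _RATE.search(summary_text)
    if match is None:
        raise ValueError("no overall alignment rate found in summary")
    return float(match.group(1))


build_cmd, align_cmd = hisat2_commands(
    "SP80-3280.fa", "sp80_index", "sample01.fastq", "sample01.sam")
print(" ".join(build_cmd))
print(" ".join(align_cmd))
```

The command lists can be passed to `subprocess.run`; HISAT2 writes its summary to standard error, so that stream is the one to capture and parse for each sample and reference.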
Fig. 2

Alignment rates of RNA-seq libraries from the top and bottom internodes of 20 different genotypes [71], contrasting for fiber content, against the monoploid R570 hybrid assembly [49], the AP85-441 tetraploid S. spontaneum genome reference [50] and the SP80-3280 hybrid gene space assembly [52].


Linking genomic data to biomass improvement by exploring the plant cell wall metabolism

Plant biomass is composed mainly of secondary cell walls (SCW) and, consequently, achieving tailor-made biomass for bioenergy could benefit from a detailed understanding of SCW biosynthesis. Using the currently available sugarcane genomic resources to identify classes of genes involved in SCW biosynthesis will help to shed light on these aspects. Table 2 indicates the number of cell wall-related genes found in different Saccharum genomic resources, as well as transcription factors (TFs) from the two main families, NAC and MYB, involved in the gene regulatory network (GRN) controlling SCW biosynthesis.
Table 2

Number of cell wall-related genes and NAC and MYB transcription factors identified in different Saccharum spp. genomic databases: the R570 monoploid genome reference (R570) [49]; the allele-defined genome reference of S. spontaneum (AP85-441) [50]; the SP80-3280 gene-space assembly; and the SAS (Sugarcane Assembled Sequences) from the SUCEST Project (SUCEST) [94].

Gene class                     R570   AP85-441   SP80-3280   SUCEST
Cell Wall Differentiation      83     386        742         175
Cell Wall Growth/Extension     54     219        434         58
Lignin metabolism              31     168        541         53
Other Glycan Degradation       117    456        870         177
Phenylpropanoid biosynthesis   58     311        511         123
Polysaccharide biosynthesis    40     220        475         112
Structural proteins            15     68         103         27
Unknown                        1      7          7           5
MYB                            178    999        613         120
NAC                            101    412        884         63
The SCW-GRN has been elucidated in Arabidopsis and is composed of a three-layered structure. At the top level, TFs from the NAC family, named VASCULAR-RELATED NAC-DOMAIN (VND) and NAC SECONDARY WALL THICKENING PROMOTING FACTOR (NST), act as master switches activating cell differentiation, including programmed cell death to form tracheary elements, and SCW deposition in vessels and fibers [74], [75], [76]. They activate a second layer of master switches, composed of TFs from the MYB family (AtMYB46 [77] and AtMYB83 [78] in Arabidopsis). These second-level TFs activate biosynthetic genes for cellulose, hemicellulose (xylan) and lignin, as well as a third layer of TFs, turning on SCW deposition. These downstream TFs activate other aspects of SCW deposition, with some redundancy, and include transcriptional repressors of the NAC master switches, such as AtMYB32 [79], establishing a negative feedback loop. Although this three-layer core structure and the key players of the GRN are conserved among the vascular plants studied so far [80], [81], including grasses like rice, maize, brachypodium, sorghum, miscanthus and switchgrass, considerable divergence in transcription factor targets has been reported [82], [83], [84], [85], [86], [87], [88], [89], [90], [91]. Even among grasses, some divergence in TF target repertoires may exist, such as in orthologs of MYB SCW repressors [92]. Within the R570, SP80-3280, AP85-441 and SUCEST SAS (see below) databases (Table 2), we have found up to 884 NAC and 999 MYB genes, which are potential targets for further exploration.
For the NAC family, this number is 8–11 times higher than in the rice (105 genes) and Arabidopsis thaliana (75 genes) genomes [93], which puts into perspective the intricacy and diversity found in the Saccharum complex. By comparing the overlap of genes among the three references, we can estimate how these datasets can complement each other. In the MYB transcription factor family, 15–50% of the genes are common to all three genotypes (Fig. 3), suggesting that all three databases have their particularities, given that they are derived from different species (S. spontaneum) or varieties (R570 and SP80-3280). This is evident for AP85-441 (S. spontaneum), in which approximately 50% of MYB genes do not overlap with those of the sugarcane varieties (Fig. 3C).
Fig. 3

Overlap of MYB genes among the three sugarcane genomic assemblies (datasets). All genes classified as MYBs (Table 2) were used for reciprocal blastp analysis among all three datasets. A, overlap of SP80-3280 with the other two databases; B, overlap of R570 with the other two databases; C, overlap of AP85-441 with the other two databases. Genes with coverage and identity >=90% were considered “overlapping genes”.
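The overlap criterion in the Fig. 3 legend (coverage and identity of at least 90% in a reciprocal blastp) can be expressed as a filter on tabular BLAST output. The Python sketch below is one possible reading, not the authors' script: it assumes the hits were exported with the four columns `qseqid sseqid pident qcovs`, and it treats a gene as overlapping only when the same pair passes the thresholds in both search directions. The gene identifiers are made up.

```python
from collections import namedtuple

Hit = namedtuple("Hit", "query subject identity coverage")


def parse_hits(lines):
    """Parse tab-separated lines: qseqid, sseqid, pident, qcovs (assumed layout)."""
    hits = []
    for line in lines:
        if not line.strip():
            continue
        query, subject, identity, coverage = line.rstrip("\n").split("\t")[:4]
        hits.append(Hit(query, subject, float(identity), float(coverage)))
    return hits


def passing_pairs(hits, min_identity=90.0, min_coverage=90.0):
    """Return the (query, subject) pairs meeting both thresholds."""
    return {(h.query, h.subject) for h in hits
            if h.identity >= min_identity and h.coverage >= min_coverage}


def reciprocal_overlap(a_vs_b, b_vs_a):
    """Genes of dataset A with a passing hit in B that also passes from B to A."""
    forward = passing_pairs(a_vs_b)
    backward = passing_pairs(b_vs_a)
    return {a for (a, b) in forward if (b, a) in backward}


# Hypothetical identifiers, for illustration only.
a_vs_b = parse_hits(["mybA1\tmybB1\t95.0\t98", "mybA2\tmybB2\t85.0\t99"])
b_vs_a = parse_hits(["mybB1\tmybA1\t96.0\t92", "mybB2\tmybA2\t99.0\t99"])
print(sorted(reciprocal_overlap(a_vs_b, b_vs_a)))
```

Repeating the comparison for each pair of datasets gives the per-dataset overlaps summarized in panels A to C.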

As lignin is one of the main causes of biomass recalcitrance, hampering lignocellulosic biofuel production [95], [96], [97], [98], [99], much effort has been made to understand its biosynthesis and polymerization and how to engineer it. Lignin is a phenolic polymer composed of three main units, p-hydroxyphenyl (H), guaiacyl (G), and syringyl (S), crosslinked to hemicellulose, providing strength and rigidity to the cell wall [100]. Although the pathway has been known for years [100], additional insights into the phenylpropanoid pathway and lignin synthesis have emerged recently. For example, the caffeoyl-shikimate esterase (CSE) [101], the bifunctional phenylalanine/tyrosine ammonia lyase (PTAL) [102] and a bifunctional cytosolic ascorbate peroxidase functioning as C3H [103] are missing links in the phenylpropanoid pathway that have only recently been discovered. There are ~511 genes in the phenylpropanoid pathway of SP80-3280 (Table 2), whereas other species like Arabidopsis thaliana and the model C4 grass Setaria viridis have only 26 [104] and 56 [105], respectively. Gaps in our knowledge include how monolignols, the lignin monomers, are transported to the apoplast for polymerization [106]. To date, only one transporter has been characterized, AtABCG29, which is responsible for transporting the precursor of p-hydroxyphenyl (H-units), a minor component of lignin, but other routes should exist since loss-of-function mutations in AtABCG29 reduce H-units but do not eliminate them [107].
The majority of studies on plant cell wall metabolism are derived from dicot model plants, particularly Arabidopsis. However, grasses and dicots diverged ~150 million years ago; therefore, considerable differences in their vascular and morpho-anatomic patterns and in cell wall structure and composition have emerged. Grass cell walls have several features that distinguish them from those of eudicots [108], with different abundances of pectin, structural proteins and phenolic compounds, as well as differences in hemicellulose structure and composition [109]. Eudicots have only traces of H-units and low levels of other phenolic compounds in their cell walls, whereas grasses have significant amounts of H-units and increased levels of hydroxycinnamic acids [110], especially ferulic and p-coumaric acids esterified to arabinoxylan [108], [111], [112] and ferulate-monolignol conjugates incorporated into lignin [113]. Furthermore, the flavonoid tricin was discovered in monocot lignin [114], [115], acting as a nucleation site for lignification. Grasses have arabinoxylan as the main hemicellulose, but eudicots do not have arabinosyl substitution in secondary wall xylan, which affects how lignin is crosslinked to hemicellulose [116]. Moreover, mixed-linkage glucan is a monocot-specific hemicellulose, due to the absence in eudicots of the genes responsible for its biosynthesis, cellulose synthase-like F and H (CslF and CslH) [117], [118], [119]. Also absent in eudicots, the bifunctional PTAL can use tyrosine as well as phenylalanine as substrate in the first step of the phenylpropanoid pathway, yielding 4-coumarate and thus bypassing the reaction catalyzed by cinnamate 4-hydroxylase (C4H), which gives plasticity to the metabolism [102]. Furthermore, a transcription factor from the MYB family (BdSWAM1) was recently reported as an SCW biosynthesis regulator in Brachypodium distachyon, although its clade is not found in the Brassicaceae family [120], which includes Arabidopsis.
On the other hand, CSE, an essential enzyme in the eudicot phenylpropanoid pathway whose down-regulation improves biomass saccharification, does not have a bona fide ortholog identified in grasses so far [101], [121], [122]. Given all these differences, considerable genetic divergence is expected, and much of the knowledge from dicot cell walls cannot be extrapolated to grasses. These differences can therefore only be uncovered by studying grass functional genomics. How SCW biosynthesis is connected to plant growth and biomass accumulation is still poorly understood, but as yet unknown factors linking these two processes may exist [123]. For example, one of these factors could be the transcriptional regulatory Mediator complex, since it has been reported to directly control lignin biosynthesis and its disruption rescues the dwarf phenotype of Arabidopsis lignin-deficient mutants [124], [125]. Such interactions are completely unknown for grasses and may be species-specific, raising the need to study crop plant omics [123]. Categorizing these hidden molecular hubs linking SCW biosynthesis to plant growth and other physiological processes is crucial to move forward in developing novel biotechnological strategies to improve plant biomass [123]. Identifying genes of interest, addressing grass specificities and finding the missing links to improve biomass accumulation and quality is a major challenge, especially in a species with a genome as complex as that of sugarcane. We expect that sugarcane researchers can take advantage of genomic databases such as the ones described here to explore, for example, cell wall-related genes (Table 2), thus helping to advance sugarcane functional genomics and opening new opportunities for molecular breeding to develop and improve energy canes.

SUCEST-FUN Database: A platform for sugarcane data integration in a genomic context

In addition to the recent genome assemblies, plant genomic databases such as GRASSIUS [126], TropGENE [127], Phytozome [128], Plant TF database [129], MOROKOSHI [130], KBase [131], Gramene [132], PLAZA [133] and Plant GDB [134] are important foundations for molecular breeders to mine candidate genes and to facilitate molecular crop breeding. Specifically for sugarcane and energy cane breeders, the SUCEST-FUN Platform (http://sucest-fun.org/) [135] was developed to allow data analysis on five main aspects: i) gene annotation; ii) gene expression; iii) integration of public resources; iv) sequencing projects; and v) functional genomics. The database was initially based on 43,141 SAS (Sugarcane Assembled Sequences) from the SUCEST Project [94] and subsequently on the 17,500 ORFeome genes generated using RNA-seq of sugarcane ancestral and hybrid varieties [69], which are useful for protein characterization, single nucleotide polymorphism analysis, identification of splicing variants, and evolutionary and comparative studies. An important advantage of the SUCEST-FUN Database is the in-depth automatic and manual annotation conducted by our group and the definition of curated catalogs of transcription factors, cell wall genes, signal transduction genes (including kinases and phosphatases), KEGG metabolic pathways and enzymes, and transposable elements, as well as orthologous gene analysis among grasses. For gene expression studies, the SUCEST-FUN Database supports three microarray platforms: (i) the Signal Transduction (SUCAST) array, composed of 1900 genes with 152 hybridizations; (ii) the RNA/Carbohydrate Metabolism and Signal Transduction (SUCAMET) array, composed of 4600 genes with more than 150 hybridizations; and (iii) the general regulatory function (CaneRegNet) array, composed of 14,522 genes, including sense and antisense probes, with 122 hybridizations.
These transcriptome studies used samples from multiple plant materials, such as ancestral genotypes and commercial varieties; multiple tissues, such as leaves, internodes and roots; multiple conditions, such as field and greenhouse; and multiple treatments, such as drought stress, developmental and circadian stages, and high CO2 [136], [137], [138], [139], [140]. These experiments, summarized in Table 3, are a valuable data source for co-expression analysis, a promising approach to unravel complex biological processes and regulatory networks, which can be carried out with the tools available in the platform [141].
Table 3

Public microarray data using the Signal Transduction (SUCAST [137]) array and general regulatory function (CaneRegNet [138]) array.

Array        GEO platform   GEO series   Experiment description                                        Number of hybridizations   Reference
SUCAST       GPL3799        GSE4966      Phosphate starvation                                          16                         [137]*
                            GSE4967      Response to herbivory by Diatraea saccharalis                 8                          *
                            GSE4968      ABA treatment                                                 16                         *
                            GSE4969      MeJa treatment                                                12                         *
                            GSE4970      Response to N2-fixing endophytic bacteria association         8                          *
                            GSE4971      Drought response                                              12                         *
                            GSE14732     Sucrose content related to drought and cell wall metabolism   80                         *
CaneRegNet   GPL14862       GSE33574     Drought                                                       6                          [138]
                            GSE42725     Circadian rhythms                                             22                         [139]
                            GSE87826     Sugarcane vs Leifsonia xyli subsp. xyli                       4                          [142]
                            GSE124990    SP80-3280 growth and maturation                               30                         [52]
                                         Sugarcane Ancestral                                           36                         [141]
             GPL22278       GPL22278     Ethephon- and AVG-induced transcriptional changes             24                         [143]
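The per-experiment counts in Table 3 can be cross-checked against the totals quoted in the text (152 hybridizations for SUCAST and 122 for CaneRegNet). The Python sketch below sums them by array. Counting the GPL22278 row and the "Sugarcane Ancestral" row under CaneRegNet is an assumption on our part, made because the table gives no separate array name for them.

```python
# (array, GEO series or label, number of hybridizations) as listed in Table 3.
experiments = [
    ("SUCAST", "GSE4966", 16),
    ("SUCAST", "GSE4967", 8),
    ("SUCAST", "GSE4968", 16),
    ("SUCAST", "GSE4969", 12),
    ("SUCAST", "GSE4970", 8),
    ("SUCAST", "GSE4971", 12),
    ("SUCAST", "GSE14732", 80),
    ("CaneRegNet", "GSE33574", 6),
    ("CaneRegNet", "GSE42725", 22),
    ("CaneRegNet", "GSE87826", 4),
    ("CaneRegNet", "GSE124990", 30),
    ("CaneRegNet", "Sugarcane Ancestral", 36),  # assumption: no series listed
    ("CaneRegNet", "GPL22278", 24),             # assumption: grouped here
]

totals = {}
for array, _series, hybridizations in experiments:
    totals[array] = totals.get(array, 0) + hybridizations

print(totals)
```

Under that grouping, the sums match the 152 and 122 hybridizations reported for the two arrays.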
At the genome level, the annotation of public sugarcane BAC sequences [47] and the availability of a genome browser (available at http://sucest-fun.org/cgi-bin/cane_regnet/gbrowse2/gbrowse/microsoft_genome_moleculo_scga7/) with the gene space assembly of the polyploid cultivar SP80-3280 [52] enable the survey of sequences and annotation in a global and dynamic way. Fig. 4 shows one example of how this genomic tool can be explored. Using the ‘SCRURZ3080F11.g’ SAS ID, we searched for SP80-3280 contigs holding this transcript sequence (annotated as a MYB transcription factor). As a result, we found unique matches to nine different contigs, which may represent putative homo(eo)logs of this gene (Fig. 4A). In addition, we present the genomic features of one of these contigs (scga7_uti_cns_0226226), such as SAS [94] matches, predicted genes [52] and RNA-seq [69] alignment results (Fig. 4B).
Fig. 4

A view of the SUCEST-FUN genome browser, available at http://sucest-fun.org/cgi-bin/cane_regnet/gbrowse2/gbrowse/microsoft_genome_moleculo_scga7/. A: Screen shot of the result of searching for the ‘SCRURZ3080F11.g’ SAS (Sugarcane Assembled Sequence) derived from the SUCEST Project [94]. This SAS is annotated as a MYB transcription factor and has 9 matches (putative homo(eo)logs) in the SP80-3280 gene space. Yellow bars represent contigs and red diamonds indicate match position. B: Screen shot of the result of searching for the ‘scga7_uti_cns_0226226’ contig, which contains ‘SCRURZ3080F11.g’. (For interpretation of the references to colour in this figure legend, the reader is referred to the web version of this article.)

Final considerations

Increasing plant yields is one of the greatest challenges of biotechnology. Computational tools that link genome sequences to their functions and to attributes useful for breeding are greatly needed to speed up the improvement of sugarcane and energy cane. Plant yields are directly impacted by lignocellulosic metabolism, and we have given an overview of the main genes involved and their regulators. Considering the size and complexity of cane genomes, the fact that many species have been used in breeding, and the polyploidization that arose, data mining for genes of interest is a significant bioinformatics challenge for this crop. The SUCEST-FUN Platform provides a robust infrastructure for storage, continuous updating, annotation, easy and controlled access, and integration among the functional catalogs of the sugarcane transcriptome. Through various data-driven clustering tools, crossings and enrichment analysis, it allows for a systems biology approach. This will be an important resource, since progenies need to be analyzed in an integrated manner for multiple characteristics (technological, physiological, biochemical and genetic traits) for the construction of gene networks.

Author contributions

GMS conceived the topic and outline, wrote and critically revised the manuscript. ALD wrote the manuscript and consolidated authors’ contribution. MSC, JMS and GVSB provided energy cane breeding information and image. SSF wrote and critically revised the manuscript and identified cell wall-related genes and NAC and MYB transcription factors in different Saccharum genomic references. GRAM performed RNA-seq alignment to Saccharum genomic references. FTC curated the SUCEST-FUN genome browser. All authors reviewed the manuscript.

Declaration of Competing Interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.