Literature DB >> 34079574

Combining QTL Mapping and Transcriptomics to Decipher the Genetic Architecture of Phenolic Compounds Metabolism in the Conifer White Spruce.

Justine Laoué1, Claire Depardieu1,2, Sébastien Gérardi1, Manuel Lamothe2, Claude Bomal2, Aïda Azaiez1, Marie-Claude Gros-Louis2, Jérôme Laroche3, Brian Boyle3, Almuth Hammerbacher4, Nathalie Isabel1,2, Jean Bousquet1.   

Abstract

Conifer forests worldwide are becoming increasingly vulnerable to the effects of climate change. Although the production of phenolic compounds (PCs) has been shown to be modulated by biotic and abiotic stresses, the genetic basis underlying the variation in their constitutive production level remains poorly documented in conifers. We used QTL mapping and RNA-Seq to explore the complex polygenic network underlying the constitutive production of PCs in a white spruce (Picea glauca) full-sib family for 2 years. QTL detection was performed for nine PCs and differentially expressed genes (DEGs) were identified between individuals with high and low PC contents for five PCs exhibiting stable QTLs across time. A total of 17 QTLs were detected for eight metabolites, including one major QTL explaining up to 91.3% of the neolignan-2 variance. The RNA-Seq analysis highlighted 50 DEGs associated with phenylpropanoid biosynthesis, several key transcription factors, and a subset of 137 genes showing opposite expression patterns in individuals with high levels of the flavonoids gallocatechin and taxifolin glucoside. A total of 19 DEGs co-localized with QTLs. Our findings represent a significant step toward resolving the genomic architecture of PC production in spruce and facilitate the functional characterization of genes and transcriptional networks responsible for differences in constitutive production of PCs in conifers.
Copyright © 2021 Laoué, Depardieu, Gérardi, Lamothe, Bomal, Azaiez, Gros-Louis, Laroche, Boyle, Hammerbacher, Isabel and Bousquet.

Entities:  

Keywords:  Picea glauca; QTL; RNA-Seq; co-regulation; conifers; metabolites; phenolic compounds

Year:  2021        PMID: 34079574      PMCID: PMC8166253          DOI: 10.3389/fpls.2021.675108

Source DB:  PubMed          Journal:  Front Plant Sci        ISSN: 1664-462X            Impact factor:   5.753


Introduction

Empirical evidence indicates that global warming will increase pressure on conifer forests via the intensification of drought events and the introduction of new insects and pathogens (Dale et al., 2001; Sturrock et al., 2011). As long-lived species, trees largely depend on their mechanical and chemical defenses to survive and reproduce (Davis and Shaw, 2001). Trees can protect themselves by producing secondary metabolites such as terpenes and phenolic compounds including flavonoids, monolignols, phenolic acids, stilbenes, and coumarins (Berini et al., 2018). The constitutive production of phenolic compounds (PCs) is usually thought to confer broad resistance to pathogens and abiotic factors in trees (Francheschi et al., 2005; Kessler, 2015). In spruces (Picea spp.), PCs are frequently associated with defense responses against pathogenic fungi (Brignolas et al., 1995; Hammerbacher et al., 2011, 2013, 2014) or pest insects (Brignolas et al., 1998; Faccoli and Schlyter, 2007; Delvas et al., 2011; Schiebe et al., 2012; Warren et al., 2015), and a recent study also highlighted their involvement in drought resistance in Picea abies (Schiop et al., 2017). Hence, PCs likely play a major role in climate adaptation in conifers. However, the genetic basis underlying their quantitative variation remains poorly understood (Ralph et al., 2006a; Warren et al., 2015), mainly due to the occurrence of large and complex gene families in conifers, combined with a limited knowledge of their functions (Hamberger and Bohlmann, 2006; Prunier et al., 2016). The regulatory mechanism of phenylpropanoid biosynthesis is compn>lex and involves several families of transcripn>tion factors (TFs) that regulate the expn>ression of downstream genes (Yang et al., 2012). Recent advances showed that the PCs pathway in plants is also controlled at different branches by R2R3-MYB repressors (Cavallini et al., 2015; Ma and Constabel, 2019). The regulation of late biosynthetic anthocyanin and proanthocyanin genes is orchestrated by the ternary MBW complex, involving transcription factors from the R2R3-MYB, basic helix–loop–helix (bHLH) and WD40 classes (Xu et al., 2015; Ma et al., 2018). Although the MBW network is evolutionary conserved in plants (Ramsay and Glover, 2005), the considerable expansion of TF families in plant lineages (Feller et al., 2011) could lead to functionally divergent genes between angiosperms and gymnosperms. The complex regulatory system of PCs metabolism, i.e., genes and their networks of interactions, remains largely undeciphered in conifer species. Recent work evidenced protein-protein interactions among different MYB, bHLH and WDR (Nemesio-Gorriz et al., 2017) and NAC gene family (Dalman et al., 2017) in Norway spruce, while other studies identified a variety of gene families (Hamberger and Bohlmann, 2006; Ralph et al., 2006a; Warren et al., 2015), genes (Celedon et al., 2017) and transcription factors (Bomal et al., 2008; Bedon et al., 2010; Deng and Lu, 2017) playing a role in the regulation of PC pathway. The majority of PC traits are known to be predominantly polygenic (Külheim et al., 2011; Francisco et al., 2016; Ganthaler et al., 2017), hence the need to deploy genome-wide approaches to study their synthesis. Quantitative trait loci (QTL) mapping was used successfully to uncover the genetic architecture of PCs production in crops (Li et al., 2016; Czyczyło-Mysza et al., 2019) and tree species (Verdu et al., 2014; Caseys et al., 2015). QTL analyses have also been successfully conducted in the Pinaceae, using traits relative to growth and phenology (Pelgas et al., 2011; Prunier et al., 2013) and biotic stress resistance (Porth et al., 2012; Lind et al., 2014). However, these analyses suffer from low resolution in conifers because of the low density of genetic linkage maps currently available and because of the considerable time and resources needed to grow, phenotype and genotype progenies. Recently, several studies in angiosperms showed that combining QTL mapping and transcriptome profiling is a robust approach to identify candidate genes underlying complex traits (Xu et al., 2015; Ye et al., 2017; Zhang et al., 2017; Jian et al., 2019). Transcriptome profiling can be used in combination with genetic mapping to narrow down the number of candidate genes identified by QTL and pointing out key genes involved in the mechanisms of interest. Hence, we used a similar approach to identify key genes involved in the PC pathway in white spruce [Picea glauca (Moench) Voss], a conifer species with high genome complexity and for which considerable genomic resources such as a draft genome sequences (Birol et al., 2013; Warren et al., 2015), an annotated expressed gene catalog (Rigault et al., 2011), high-throughput genotyping arrays (Pavy et al., 2013), and a high-density genetic linkage map (Pavy et al., 2017) are currently available. In this study, we first conducted QTL mapping to assess the genomic architecture of the constitutive production of nine candidate PCs in a white spruce full-sib family. Three flavonoids, one neolignan and one stilbenoid, for which we identified QTLs stable across 2 years of measurement, were further analyzed with RNA-Seq in order to identify key genes involved in the regulatory network of PC metabolism. Differentially expressed genes (DEGs) among individuals displaying contrasting phenotypes (high vs. low metabolite content) were identified and compared with genes located in QTLs.

Materials and Methods

Workflow

In this study, QTL and RNA-Seq approaches were used to identify genes involved in the metabolism of nine PCs in a random subset of 1,976 siblings previously used to produce a high-resolution genetic map for white spruce (Pavy et al., 2017). The nine PCs, all related to various biotic and abiotic stress responses in spruce species as well as in other plants (Table 1), were quantified for 164 and 202 siblings in two different years (Figure 1). QTL analyses were conducted with phenotypes obtained for the same progeny used for genetic mapping. From there, five metabolites showed a significant underlying genetic component for both years of assessment and were deemed of prime interest to be further investigated at the transcriptomic level using an RNA-Seq approach. For each of these metabolites, at least 20 siblings presenting contrasting phenotypes (i.e., highest and lowest metabolite content) were selected for individual transcriptome profiling and analyzed to highlight sets of differentially expressed genes.
Table 1

Bark phenolic compounds analyzed in a white spruce full-sib family.

MetaboliteMolecule skeletonClassMW (g mol−1)aFormulaBiotic stressbAbiotic stressc
AstringinStilbenoid406.1C20H22O9Fungus Hammerbacher et al. (2011)Not studied
CatechinFlavonoid290.3C15H14O6Fungus Bahnweg et al. (2000)Chobot et al. (2009)
GallocatechinFlavonoid306.3C15H14O7Fungus Hammerbacher et al. (2018)Wang et al. (2016)
IsorhapontinStilbenoid420.4C21H24O9Fungus Hammerbacher et al. (2011)Not studied
Neolignan-2Lignan400.5C23H28O6Insect Schiebe et al. (2012)Moura et al. (2010)
PiceidStilbenoid390.4C20H22O8Insect and fungus Brignolas et al. (1998)Villangó et al. (2016)
Procyanidin B1Flavonoid578.5C30H26O12Insect Rohde et al. (1996)Varela et al. (2016)
TaxifolinFlavonoid304.2C15H12O7Fungus Evensen et al. (2000)Not studied
Taxifolin glucosideFlavonoid466.4C21H22O12Insect and fungus Brignolas et al. (1995)Not studied

Molecular weight of the phenolic compound (g mol.

Studies demonstrating the role of the phenolics tested in tree response to biotic stresses (insect and pathogens) in Picea sp. are reported.

Studies demonstrating the role of the phenolics tested in tree response to abiotic stresses (drought) in plant species are reported.

Figure 1

Schematic representation of experimental design. Bark samples were collected from 164 to 202 white spruce siblings in 2014 and 2017, respectively. QTL analyses were performed on nine phenolic compounds for both years. For five metabolites, RNA sequencing was conducted on a subset of individuals displaying high vs. low metabolite contents in 2017. For each of the five metabolites, differentially expressed genes (DEGs) among groups of individuals displaying contrasting phenotypes were identified.

Bark pan class="Chemical">phenolic compounds analyzed in a n>an class="Species">white spruce full-sib family. Molecular weight of the pan class="Chemical">phenolic compound (g mol. Studies demonstrating the role of the phenolics tested in tree response to biotic pan class="Disease">stresses (insect and pathogens) in Picea spn>. are reported. Studies demonstrating the role of the phenolics tested in tree response to pan class="Disease">abiotic stresses (drought) in plant spn>ecies are reported. Schematic representation of experimental design. Bark samples were collected from 164 to 202 white spruce siblings in 2014 and 2017, respectively. QTL analyses were performed on nine phenolic compounds for both years. For five metabolites, RNA sequencing was conducted on a subset of individuals displaying high vs. low metabolite contents in 2017. For each of the five metabolites, differentially expressed genes (DEGs) among groups of individuals displaying contrasting phenotypes were identified.

Plant Material and Phenotypic Data Measurements

Twigs were collected on 164 and 202 full-sib siblings in August 2014 (age 15) and August 2017 (age 18), respectively, from a QTL mapping population previously used in several studies (cross C94-1-2516 between ♀77111 × ♂2388; Figure 1; see Pelgas et al., 2011; Pavy et al., 2012, 2017) and raised in a common garden located in Saint-Antonin, Québec (47°45N; 69°28 W). Seventy-nine siblings were common to both sampling years. Twigs (3-4 copies each) corresponding to the two parents originally crossed were also sampled in August 2015, 2016, and 2017 at the Cap-Tourmente Arboretum of Natural Resources Canada, Québec (47°05 N; −70°47 W). All sampled trees were originally produced from grafted material with a normal growth. For each tree, current-year twigs (18 cm long) were collected at breast height on the same side. Each sample was cut longitudinally with a razor blade to separate the woody inner stem tissues, the pith, from the bark tissues. Each sample was then cut into two parts, immediately frozen in liquid pan class="Chemical">nitrogen, and stored at−80°C until further use: one part for metabolite quantification and the other one for transcripn>tomic analyses.

Metabolite Quantification: Extraction of Phenolic Compounds

Extraction of PCs, namely astringin, catechin, gallocatechin, isorhapontin, 4-[1,3-dihydroxy-2-[2-hydroxy-4-(3-hydroxypropyl)phenoxy]propyl]-β-D-xylopyranoside (neolignan-2; Supplementary Figure 1), piceid, procyanidin B1, taxifolin and taxifolin glucoside, was performed at the Max Planck Institute in Germany. Briefly, PCs were extracted from ground bark samples and quantified by LC-tandem mass spectrometry on an Agilent 1200 HPLC system (Agilent, Santa Clara, CA, United States) coupled to an API 3200 mass analyzer (Sciex, Darmstadt, Germany). Analyst v1.5 software (Applied Biosystems) was used for data acquisition and processing. Linearity of metabolite detection for quantification was verified by external calibration curves for catechin, taxifolin, astringin and procyanidin B1. Other metabolite concentrations were determined relative to the calibration curve of the metabolite most closely resembling it. Detailed procedures are given in Supplementary Methods 1 in Supplementary Material. Descriptive statistics and phenotypic distributions of metabolites are reported in Supplementary Table 1 and Supplementary Figure 2. Pairwise correlations between metabolite concentrations measured in 2014 and 2017 were calculated with the software R 3.5 (R Core Team, 2013) using the rcor function in the Hmisc R package (Harrell, 2014).

QTL Analyses

QTL analyses were conducted in order to identify genomic regions accounting for variation in metabolite concentrations among siblings and underlying candidate genes putatively carrying causal variants. QTLs detection was performed for the nine PCs investigated (Table 1). Data were analyzed on a yearly basis using the two parental linkage mapn>s consisting in 2,774 (female) and 2,308 (male) high-quality SNPs mapping in as many separated genetic bins (see details in Pavy et al., 2017), resulting in four distinct QTL analyses per metabolite. All individuals included herein were part of the progeny originally used to generate the consensus linkage map consisting of 8,793 distinct gene loci (Pavy et al., 2017). QTL analyses were conducted with the software MapQTL v6.0 (van Ooijen and Kyazma, 2009), using interval mapping, which is a robust method against deviations from normality (van Ooijen and Kyazma, 2009). A QTL with a LOD score ≥ 3.1 was considered significant (P < 0.05 based on 1,000 genome-wide permutations of the markers). QTLs detected on each parental map were then positioned on the consensus map. Genes within 1 LOD of either side of the QTL peak were considered as candidate genes. Given the extremely high LOD score associated with neolignan-2 QTL, the window was expanded to 15 cM, which corresponded to the average window size of other significant QTLs detected in this study.

RNA-Seq Experiment

QTL analyses revealed that five metabolites, namely gallocatechin, neolignan-2, piceid, procyanidin B1 and taxifolin glucoside, had significant QTLs across both years surveyed (Figure 1). The expression of genes involved in the biosynthesis of those five metabolites were further investigated using RNA-Seq to identify sets of differentially expressed genes (DEGs) between two groups of individuals that showed contrasting concentrations of metabolites. Out of the 202 siblings collected in 2017, we selected 80 individuals showing high or low concentrations in gallocatechin, piceid, procyanidin B1 and taxifolin glucoside metabolites (Figure 1), some of them showing extreme phenotypes for more than one metabolite. The transcriptomic profiles of 20 phenotypically divergent individuals per PC (10 high-metabolite content vs. 10 low-metabolite content individuals) were then compared. Since the neolignan-2 displayed a bimodal distribution of concentrations (Figure 2), each of the two contrasted groups of individuals included 35 siblings instead of 10.
Figure 2

Metabolite concentrations and QTL mapping for neolignan-2. Concentrations of neolignan-2 obtained (A) from grafted material (3–4 ramets) corresponding to both parents (♀77111, ♂2388) sampled in August 2015, 2016, and 2017, and (B) from 164 and 202 siblings sampled in August 2014 and 2017, respectively. (C) Linkage map of the white spruce progeny showing location of the putative QTLs with LOD score for neolignan-2 in 2014 (orange line) and 2017 (green line). The peak LOD score reached 75.87 and 86.29 in 2014 and 2017, respectively. The vertical dotted line indicates the threshold for significant linkages of the LOD scores (i.e., 3.1). Ticks on the chromosome represent mapped genes according to Pavy et al. (2017). Within the QTL interval, red and black ticks are genes for which markers were detected for ♂2388 and ♀77111, respectively.

Metabolite concentrations and QTL mapping for neolignan-2. Concentrations of neolignan-2 obtained (A) from grafted material (3–4 ramets) corresponding to both parents (♀77111, ♂2388) sampled in August 2015, 2016, and 2017, and (B) from 164 and 202 siblings sampled in August 2014 and 2017, respectively. (C) Linkage map of the white spruce progeny showing location of the putative QTLs with LOD score for neolignan-2 in 2014 (orange line) and 2017 (green line). The peak LOD score reached 75.87 and 86.29 in 2014 and 2017, respectively. The vertical dotted line indicates the threshold for significant linkages of the LOD scores (i.e., 3.1). Ticks on the chromosome represent mapped genes according to Pavy et al. (2017). Within the QTL interval, red and black ticks are genes for which markers were detected for ♂2388 and ♀77111, respectively.

RNA Extraction, Libraries Preparation and Sequencing

For the 80 individuals selected for RNA-Seq, total RNA was isolated from frozen bark tissues following the method of Chang et al. (1993), with the protocol modifications described in Pavy et al. (2008), as detailed in Supplementary Methods 2 in Supplementary Material. For each sibling, a cDNA library was generated from 1 μg of total RNA using the KAPA stranded mRNA-Seq Kit (KAPA Biosystems, Roche Sequencing solutions, Canada). This kit contains all buffers and enzymes required for the poly(A) mRNA capture and the construction of stranded mRNA-Seq libraries of 100 ng−4 μg of intact total RNA. A PCR of 13 cycles was then performed for each cDNA library having a specific adapter (pan class="Disease">Illumina TruSeq HT). The quality of total RNA was assessed using the Agilent 2100 Bioanalyzer with RNA 6000 Nano LabChipn>s (Agilent Technologies Inc, Santa Clara, CA, USA) and RNA concentration was determined with a Nanodrop 1000 spectrophotometer (Thermo Fisher Scientific, Wilmington, DE, U.S.A). For all sampn>les, RNA concentration exceeded 100 ng μl−1 and RNA Integrity Number (RIN) exceeded 7.8. Libraries were first tagged individually, and then merged into a single sequencing pool at equimolar concentrations. The pool was sequenced at McGill University and Genome Quebec Innovation Center (Montreal, Quebec, Canada) using two lanes of Illumina NovaSeq 6000 S2 PE100.

Pre-processing and Differential Gene Expression Analyses

For each of the five metabolites tested, gene expression was compared between groups of individuals showing high and low PC contents. Quality of raw sequence data was first checked using FASTQC v0.11.2 (Andrews et al., 2014) for each sibling sequenced. Adapn>ter sequences were trimmed using Cutadapn>t (Martin, 2011) and checked again with FASTQC v0.11.2 to ensure proper trimming. For each metabolite, reads of individuals making up the high and low PC content groups were pooled. High-quality reads were then aligned against the most recent version of the white spruce reference transcriptome (Rigault et al., 2011) using Salmon v0.11.0 with the following options: –Gcbias, –seqBias, –validatingMappings and minScoreFraction (0.8) (Patro et al., 2017). Read counts were further normalized using the DESeq2 Bioconductor package in R (Anders and Huber, 2010; Love et al., 2014). Differential gene expression analyses were then carried out using the default pipeline implemented in the function “DESeq.” The DEGs were then determined using a threshold of < 0.05 (adjusted p-value after applying the Benjamini-Hochberg's multiple-testing correction; Benjamini and Hochberg, 1995). Expression level of DEGs was further classified as high (log2fc ≥ 1) or low (log2fc ≤ −1) according to the log2 fold change values (log2fc) obtained when compn>aring expn>ression in high vs. low metabolite content individuals. DEGs with a total read count lower than 10 were discarded. Venn diagrams were plotted to highlight unique and shared DEGs among metabolites using a bio-analytic web tool (http://bioinformatics.psb.ugent.be/webtools/Venn/).

Functional Characterization and Enrichment Tests of Differentially Expressed Genes

Using the BLAST2GO PRO suite (Gotz et al., 2008), functional annotations of the DEGs were retrieved by performing a BLASTX search against the RefSeq database (RefSeq release 91, using a cut-off e-value of ≤10−10). Homologous protein domains from translated sequences were then identified by searching against the Interpro database. GO annotations (molecular functions and biological processes) were obtained for each DEG and slimmed down into the more general GO plant slim terms. For each metabolite, enrichment tests were performed by comparing GO associated with DEGs with those associated with the full set of expressed genes, using Fisher's exact tests (P < 0.05) in the BLAST2GO PRO suite. The transcription factor database PlantRegMap/PlantTFDB v5.0 was used to identify potential transcription factors among DEGs (TFs; http://planttfdb.cbi.pku.edu.cn/; Jin et al., 2017; Tian et al., 2020). Finally, DEG functions were visualized in MapMan v3.5.1 (Thimm et al., 2004) using DEG mapping files generated by Mercator (v3.6; Lohse et al., 2014). MapMan classifies genes into various relevant functional categories, also referred as BIN classes herein.

Results

Variation in Phenolic Compounds and Correlations

The nine pan class="Chemical">PCs examined in this study belonged to the n>an class="Chemical">flavonoid, the stilbenoid, and the lignan biosynthesis pathways (Table 1). Astringin and isorhapontin were the most abundant metabolites (62.35 and 22.20 mg g−1 DW, respectively), followed by taxifolin (12.55 mg g−1 DW). The other PCs were relatively less abundant, with concentrations <3.28 mg g−1 DW (Supplementary Table 1). All metabolites followed a unimodal and continuous distribution (Supplementary Figure 2), except for neolignan-2, which displayed a clear bimodal distribution in parents (Figure 2A) and progeny (Figure 2B), suggesting the possibility of a major gene effect. Most metabolite concentrations within individuals were positively correlated (Figure 3). An exception was found for gallocatechin and taxifolin glucoside, which appeared negatively correlated in 2014 (Figure 3). As a general trend, PCs synthesized from the same biosynthetic pathway were most highly correlated, as observed for the two stilbenoids astringin and piceid (r = 0.60 in 2014 and r = 0.70 in 2017), and for the flavonoids catechin and procyanidin B1 (r > 0.73). However, one other exception was the strong correlation between piceid (a stilbenoid) and catechin (a flavonoid) (r > 0.78 in 2017). The concentration of neolignan-2 only moderately correlated with other metabolites (0.14 < r < 0.36), but variations in concentration among individuals were stable across time (r = 0.94, P < 0.05; based on the 79 individuals measured in both years; data not shown).
Figure 3

Pairwise correlations between phenolic compounds contents measured in 2014 (bottom left; 164 siblings) and 2017 (top right; 202 siblings). Metabolites are classified into three major classes. The scale bar reports positive (red) and negative (blue) correlations. For each phenotypic correlation, the correlation coefficient and significance level are indicated. Levels of significance are as follows: *P < 0.05, **P < 0.01, and ***P < 0.001.

Pairwise correlations between pan class="Chemical">phenolic compounds contents measured in 2014 (bottom left; 164 siblings) and 2017 (top right; 202 siblings). Metabolites are classified into three major classes. The scale bar reports positive (red) and negative (blue) correlations. For each phenotypn>ic correlation, the correlation coefficient and significance level are indicated. Levels of significance are as follows: *P < 0.05, **P < 0.01, and ***P < 0.001. A total of 17 significant QTLs (LOD score ≥ 3.1) were detected for eight of the nine metabolites tested (Table 2). Individual QTLs included between 23 and 160 genes, resulting in a list of 871 unique candidate genes (Supplementary Table 2). QTLs for astringin, catechin and taxifolin were only detected from 2017 data (Table 2). Significant QTLs for gallocatechin, neolignan-2, piceid, and procyanidin B1 were observed for both years on at least one co-localizing region of a unique parent. In addition, two distinct QTLs were detected for taxifolin glucoside in 2014 and 2017, on different parents and genomic regions. The phenotypic variance explained (PVE) by single QTLs ranged between 8.0% (which is approximately corresponding to the genome-wide significance threshold for this experiment) and 91.3%. Interestingly, very high LOD scores (86.3 and 75.9) and PVE values (91.3 and 82.4%) were obtained for the only QTL detected for neolignan-2 in both years (Table 2; Figure 2C). Those observations highlight the occurrence of a genetic effect with strong impact on this phenotype on linkage group 4 (Figure 2C). Overall, QTLs were located on seven out of the 12 linkage groups (LG), corresponding to the 12 chromosomes of white spruce. Co-localization of QTLs for both years of measurement was observed for four PC (gallocatechin on LG8; neolignan-2 and piceid on LG4; and procyanidin B1 on LG1). This indicates that genotype-phenotype linkages were robust to inter-annual climate variations, which strengthen the involvement of these regions in the control of the studied phenotypes. These compounds were thus selected for transcriptomic analysis along with taxifolin glucoside, for which QTLs were detected for both years of measurement, but on different parents (Table 2; Figure 1).
Table 2

Summary statistics of significant QTLs (LOD score ≥ 3.1) detected for eight phenolic compounds in the C94-1-2516 white spruce full-sib family in 2014 and 2017.

TraitYearParentLinkage groupaQTL
PositionbLOD score maxcPVEdNumber of genes in the QTLeMarker with highest LODf
Astringin2014No QTL found
2017771112107.1–124.34.339.4101GQ0201_C16.1:213
Catechin201423886129–137.33.318.974GQ03417_E16.1:213
GQ03517_A20.1:183
GQ03813_N18.1:900
2017No QTL found
Gallocatechin2014771116102.6–115.24.2611.352GQ04010_D06.1:89 (LOD 4.21)
2388880.5–95.68.2320.851GQ03706_F01.1:209 (LOD 8.16)
PGLM1-0902 (LOD 8.16)
GQ03222_J15.1:397 (LOD 8.16)
201777111468.1–103.84.159.1160PGLM2-1250
77111894.1–117.83.888.599GQ04108_O24.1:246
2388885.8–99.36.9314.752GQ03616_J03.1:82
Neolignan-2g201423884122.3–134.586.2991.367GQ03509_O17.1:634
201723884122.3–134.575.8782.461GQ03509_O17.1:634
Piceid2014771114150.3–163.57.4119.172GQ0168_L11.2:1112
2017771114151.3–157.24.8310.523GQ02820_P07.1:861
Procyanidin B12014238810.5–6.23.449.436GQ03608_I02.1:558 (LOD 3.24)
GQ03711_A03.1:507 (LOD 3.24)
PGLM2-0208 (LOD 3.24)
2017238815.2–27.34.9710.8100GQ03222_P17.1:624
GQ03230_C18.1:470
GQ03718_P22.2:170
Taxifolin2014No QTL found
20177711110106.4–124.24.229.260GQ02811_E24.1:907
GQ03001_G19.1:192
GQ02802_M06.1:581
PGLM1-0106
GQ0024_A06.1:139
6102.6–111.13.668.035WS00110_I05.1:387 (LOD 3.63)
Taxifolin glucoside20142388663.5–72.34.1811.738GQ03220_G12.1:1305 (LOD 4.01)
20177711190.0–8.93.818.427PGLM2-0975 (LOD 3.79)
GQ03613_M22.1:156 (LOD 3.79)
PGLM1-1021 (LOD 3.79)

Linkage group (LG) according to Pavy et al. (.

Position on the consensus map ± 1 LOD in centimorgan (cM).

LOD score max: maximum LOD score for mapped markers.

PVE: phenotypic variance explained, expressed in percentage (%).

Number of genes found in the QTL.

Associated marker (Pavy et al., .

QTL position for neolignan-2 was defined as the average window size of significant QTLs detected in other metabolites (e.g., 15 cM), as only one gene was mapped in the neolignan-2 QTL when using a window of ± 1LOD.

Summary statistics of significant QTLs (LOD score ≥ 3.1) detected for eight pan class="Chemical">phenolic compounds in the C94-1-2516 n>an class="Species">white spruce full-sib family in 2014 and 2017. Linkage group (LG) according to Pavy et al. (. Position on the consensus map ± 1 LOD in centimorgan (cM). LOD score max: maximum LOD score for mapped markers. PVE: phenotypic variance explained, expressed in percentage (%). Number of genes found in the QTL. Associated marker (Pavy et al., . QTL position for pan class="Chemical">neolignan-2 was defined as the average window size of significant QTLs detected in other metabolites (e.g., 15 cM), as only one gene was mapn>ped in the n>an class="Chemical">neolignan-2 QTL when using a window of ± 1LOD. The functional annotation of genes located within QTL regions relied on BLAST2GO annotations, GO terms classification, and protein family signatures. While most QTL genes had no known direct role in the phenylpropanoid pathway, we found two genes previously reported as involved in the PC biosynthesis in Picea abies, namely a leucoanthocyanidin reductase-like gene (LAR; GQ03701_M12) for procyanidin B1, and a flavonoid 3′,5′-hydroxylase 2-like gene (F3′5′H; GQ03712_G11) for taxifolin glucoside (Supplementary Table 2).

Relative Functions of Differentially Expressed Genes Between Groups of High and Low PC Content Individuals

The five selected metabolites (gallocatechin, neolignan-2, piceid, procyanidin B1, and taxifolin glucoside) were further investigated using RNA-Seq to compare the transcriptomic profiles of individuals showing contrasting phenotypes (high vs. low metabolite concentrations for each PC). A summary of RNA-Seq statistics is presented in Supplementary Table 3. A total of 603 unique DEGs were identified at a rate of 3 to 372 DEGs per metabolite (Figure 4A; Supplementary Table 4). DEGs overlap among metabolites was very limited to non-existent, except for gallocatechin and taxifolin glucoside, which shared 137 DEGs (Figure 4B). Except for one gene, these DEGs showed higher expression in high taxifolin glucoside content individuals and lower expression in high gallocatechin content individuals (Figure 4C). Among the 603 DEGs, 436 were successfully annotated with a GO term associated to molecular functions and biological processes (Supplementary Figure 3). The GO enrichment analyses revealed that DEGs were particularly overrepresented in monosaccharide metabolic (2.5%), secondary metabolic (1.9%) and phenylpropanoid metabolic (1.9%) processes (Supplementary Figure 3). Overrepresented molecular functions included stress-related functions such as catalytic (59.2%), oxidoreductase (16.8%), lyase (7.6%), and transferase (5.7%) activities, and functions related to the phenylpropanoid pathway such as UDP-glycosyltransferase and O-methyltransferase activities (Supplementary Figure 3).
Figure 4

Results of differential expression analyses for five phenolic compounds. (A) Proportion of differentially expressed genes (DEGs) among groups of individuals displaying contrasting phenotypes, for each metabolite. The proportion of DEGs was calculated as the number of DEGs divided by the total number of genes expressed. The number of DEGs is reported above the bars. (B) Venn diagrams showing the overlap among the 603 DEGs identified for all five metabolites. The set of 137 genes shared between gallocatechin and taxifolin glucoside (surrounded by a blue and red diamond) is further presented in panel (c). (C) Opposite expression profiles of the 137 genes shared between gallocatechin and taxifolin glucoside. The log fold changes of DEGs identified for taxifolin glucoside are plotted against log fold changes of DEGs identified for gallocatechin. The correlation coefficient and p-values are reported. Log2FC: log2 fold change.

Results of differential expression analyses for five phenolic compounds. (A) Proportion of differentially expn>ressed genes (DEGs) among groups of individuals displaying contrasting phenotypes, for each metabolite. The proportion of DEGs was calculated as the number of DEGs divided by the total number of genes expn>ressed. The number of DEGs is reported above the bars. (B) Venn diagrams showing the overlapn> among the 603 DEGs identified for all five metabolites. The set of 137 genes shared between gallocatechin and taxifolin glucoside (surrounded by a blue and red diamond) is further presented in panel (c). (C) Opposite expression profiles of the 137 genes shared between gallocatechin and taxifolin glucoside. The log fold changes of DEGs identified for taxifolin glucoside are plotted against log fold changes of DEGs identified for gallocatechin. The correlation coefficient and p-values are reported. Log2FC: log2 fold change. MapMan analyses were conducted to classify DEGs into relevant functional bins (or functional categories; Supplementary Figure 4; Supplementary Table 5). DEGs were assigned to 29 functional bins. While an important proportion of the 603 DEGs were not assigned (37%), a significant proportion of DEGs was related to secondary metabolism (8.9%), stress (5.9%) and signaling (5%) (Supplementary Figure 4). The stress-related genes included 20 unique DEGs involved in abiotic stress response (Figure 5A). The most relevant functions associated to the 144 unique DEGs involved in biotic stress response included pathogenesis-related (PR) proteins, heat shock proteins, transcription factors as well as various genes involved in hormone signaling and defense (Figure 5A). The 59 defense-related DEGs identified by MapMan (Figure 5A) included 11 putative PR-proteins, and 48 genes involved in secondary metabolism, among which 36 genes were involved in the shikimate/phenylpropanoid pathway (Figure 5B).
Figure 5

MapMan-based classification of differentially expressed genes (DEGs) involved in stress responses. (A) Expression profiles of DEGs involved in biotic and abiotic stresses. (B) Expression profiles of DEGs involved in secondary metabolism. The scale bar represents positive (red) and negative (blue) regulation of gene expression based on log2fc scores. Black squares represent genes that were either high or low expressed depending on the phenolic compound considered.

MapMan-based classification of differentially expressed genes (DEGs) involved in stress responses. (A) Expn>ression profiles of DEGs involved in biotic and abiotic stresses. (B) Expression profiles of DEGs involved in secondary metabolism. The scale bar represents positive (red) and negative (blue) regulation of gene expression based on log2fc scores. Black squares represent genes that were either high or low expressed depending on the phenolic compound considered.

Identification of Candidate Genes Specifically Associated to the Phenylpropanoid Pathway

For all metabolites considered, the combination of MapMan classification and Blast2GO functional annotations allowed for the identification of 50 DEGs involved in the phenylpropanoid pathway associated with monophenols, phenylpropanoids, flavonoids, as well as lignins and lignans metabolism (Table 3; Figure 5B). From this subset, 31 and 30 DEGs were respectively associated to gallocatechin and taxifolin glucoside, including 15 genes shared between the two DEG sets. In addition, four other genes were associated with neolignan-2 and procyanidin B1. For all metabolites, the most represented gene families were the putative dirigent proteins (DIR) (9 genes), laccases (7 genes) and plant peroxidases (6 genes) (Table 3). Among the 50 DEGs identified, 36 DEGs were differentially expressed between the groups of high and low PC content individuals. A large proportion of putative DIR showed higher expression, with three genes associated with taxifolin glucoside (GQ03808_J11, GQ03806_D05, GQ03307_E08), two with gallocatechin (GQ036606_J07, GQ03815_M16), one with neolignan-2 (WS00740_J05) and one with procyanidin B1 (GQ01301_K10). Two DIR showing lower expression were detected for gallocatechin (GQ03806_D05, GQ03803_O03). Most laccases associated to the phenylpropanoid pathway were also differentially expressed, with four highly expressed and three lower expressed genes in individuals with high taxifolin glucoside and high gallocatechin content, respectively. In addition, three putative peroxidases (GQ0202_L09, GQ03322_C02, GQ02016_E21), two O-methyltransferases (WS00740_E09, GQ03507_F11), one cytochrome P450 (WS00736_D10) and one NmrA-like (GQ03009_B07) putative protein showed higher expression in high taxifolin glucoside content individuals.
Table 3

List of differentially expressed genes (DEGs; adjusted P < 0.05) involved in phenolic compounds metabolism.

MetaboliteDEG IDSequence descriptionaInterPro classificationbAdjusted p-valuesLog2fcExpressionc
GallocatechinGQ04107_C21Flavonoid 3′,5′-hydroxylase 2-likeCytochrome P450 E-class group I4.14E-052.30HIGH
GQ03606_J07Dirigent protein 11-likeDirigent protein4.41E-022.20HIGH
GQ04105_L05Protein DMR6-LIKE OXYGENASE 2Isopenicillin N synthase-like, oxoglutarate/iron-dependent dioxygenase6.39E-032.05HIGH
GQ03815_M16Dirigent protein 1-likeDirigent protein1.82E-031.76HIGH
GQ03111_E17Probable mannitol dehydrogenaseLeucine-rich repeat domain superfamily7.62E-03−1.17LOW
GQ03313_A02Cinnamoyl-coa reductase 1-likeNAD-dependent epimerase/dehydratase2.65E-02−1.00LOW
GQ03805_H10Laccase-3-like isoform X1Cupredoxin, laccase, multicopper oxidase type 19.81E-03−1.07LOW
GQ03806_D05Dirigent protein 11Dirigent protein1.72E-02−1.08LOW
GQ02901_F15Bifunctional pinoresinol-lariciresinol reductase 2NmrA-like domain3.38E-03−1.10LOW
GQ03322_C02Peroxidase 11Plant peroxidase3.04E-02−1.11LOW
GQ03216_M13Laccase-5-likeLaccase, multicopper oxidase type 17.00E-03−1.15LOW
GQ03812_J09Xanthohumol 4'-O-methyltransferaseWinged helix-like DNA-binding domain superfamily, O-methyltransferase domain4.56E-03−1.17LOW
GQ03803_O03Dirigent protein 22-likeDirigent protein2.07E-03−1.20LOW
GQ0202_L09Peroxidase 72-likePlant peroxidase2.45E-03−1.22LOW
GQ03009_B07Isoflavone reductase homolog PCBER-likeNmrA-like domain2.61E-04−1.30LOW
GQ03807_A11Omega-hydroxypalmitate O-feruloyl transferaseChloramphenicol acetyltransferase-like domain superfamily7.05E-03−1.31LOW
GQ03214_N14Laccase-12-likeCupredoxin, multicopper oxidase type 24.07E-03−1.39LOW
GQ0253_H12UDP-glycosyltransferase 85A8-likeFAD/NAD(P)-binding domain superfamily1.18E-03−1.43LOW
GQ03804_M07Peroxidase 40Plant peroxidase1.44E-02−1.58LOW
GQ0082_B18Flavonol synthase/flavanone 3-hydroxylase-likeIsopenicillin N synthase-like8.28E-06−1.74LOW
GQ03229_E14UDP-glycosyltransferase 86A1Alpha/Beta hydrolase fold6.39E-031.00
GQ03507_H08Isoflavone reductase homolog TP7Concanavalin A-like lectin/glucanase domain superfamily3.50E-020.94
GQ03901_P05Probable 2-oxoglutarate-dependent dioxygenase ANSIsopenicillin N synthase-like2.39E-020.53
GQ03810_P08Isoflavone reductase-like proteinNmrA-like domain3.77E-03−0.47
GQ03312_B13Phenylalanine ammonia-lyasePhenylalanine ammonia-lyase shielding domain superfamily2.11E-02−0.56
GQ0043_N14Anthranilate N-methyltransferase-likeO-methyltransferase domain1.02E-03−0.58
GQ03207_H04Isoflavone reductase homolog A622-likeNmrA-like domain2.49E-03−0.60
GQ03712_G11Flavonoid 3′,5′-hydroxylase 2-likeCytochrome P4503.24E-02−0.85
WS00736_D10Cinnamoyl-coa reductase 1-like isoform X2Cytochrome P450 superfamily3.06E-02−0.86
GQ03712_H19Anthocyanidin reductase ((2S)-flavan-3-ol-forming)Citrate synthase superfamily3.50E-04−0.87
GQ04102_M17Protein DMR6-LIKE OXYGENASE 2-likeOxoglutarate/iron-dependent dioxygenase4.97E-02−0.91
Neolignan-2WS00740_J05Dirigent protein 11-likeDirigent protein1.01E-051.60HIGH
GQ03232_H18Protein DMR6-LIKE OXYGENASE 2-likeOxoglutarate/iron-dependent dioxygenase4.25E-041.26HIGH
GQ02820_P07Anthranilate N-benzoyltransferase protein 1Chloramphenicol acetyltransferase-like domain superfamily6.28E-12−0.27
Procyanidin B1GQ01301_K10Disease resistance response protein 206 isoform X2Dirigent protein2.11E-022.45HIGH
Taxifolin glucosideWS00740_E09Caffeic acid 3-O-methyltransferaseO-methyltransferase domain6.39E-032.51HIGH
GQ0253_H12UDP-glycosyltransferase 85A8-likeFAD/NAD(P)-binding domain superfamily2.29E-041.88HIGH
GQ03519_N09Flavonol synthase/flavanone 3-hydroxylaseProtein kinase-like domain superfamily7.88E-031.82HIGH
GQ02016_E21Peroxidase 72-like isoform X2Plant peroxidase1.11E-061.74HIGH
GQ03507_F11Caffeic acid 3-O-methyltransferase-likeO-methyltransferase COMT-type2.92E-021.67HIGH
GQ0082_B18Flavonol synthase/flavanone 3-hydroxylase-likeIsopenicillin N synthase-like1.10E-041.59HIGH
GQ03805_O13Laccase-3-likeMulticopper oxidase type 25.88E-041.53HIGH
GQ03807_A11Omega-hydroxypalmitate O-feruloyl transferaseChloramphenicol acetyltransferase-like domain superfamily6.55E-081.52HIGH
GQ03808_J11Dirigent protein 22-likeDirigent protein8.72E-041.41HIGH
GQ03214_N14Laccase-12-likeCupredoxin, multicopper oxidase type 23.42E-041.37HIGH
GQ03805_H10Laccase-3-like isoform X1Cupredoxin, laccase, multicopper oxidase type 14.69E-041.31HIGH
GQ03806_D05Dirigent protein 11Dirigent protein4.17E-041.26HIGH
GQ03216_M13Laccase-5-likeLaccase, multicopper oxidase type 19.87E-051.23HIGH
GQ03812_J09Xanthohumol 4'-O-methyltransferaseWinged helix-like DNA-binding domain superfamily1.35E-051.22HIGH
GQ03307_E08Disease resistance response protein 206 precursorDirigent protein3.04E-051.22HIGH
GQ0202_L09Peroxidase 72-likePlant peroxidase4.20E-041.17HIGH
WS00736_D10Cinnamoyl-coa reductase 1-like isoform X2Cytochrome P450 superfamily2.33E-021.15HIGH
GQ03111_E17Probable mannitol dehydrogenaseLeucine-rich repeat domain superfamily3.26E-031.11HIGH
GQ03009_B07Isoflavone reductase homolog PCBER-likeNmrA-like domain9.14E-041.09HIGH
GQ03322_C02Peroxidase 11Plant peroxidase9.84E-031.04HIGH
GQ03814_I06Cinnamoyl-coa reductase 1-likeNAD-dependent epimerase/dehydratase3.70E-021.03HIGH
GQ03206_H08Dihydroflavonol 4-reductase-likeNAD-dependent epimerase/dehydratase1.06E-021.03HIGH
GQ04004_H10Geraniol 8-hydroxylase-likeCytochrome P450 superfamily2.93E-02−1.00LOW
GQ03712_H19Anthocyanidin reductase ((2S)-flavan-3-ol-forming)NAD-dependent epimerase/dehydratase1.50E-041.00
GQ0074_I15Hydroquinone glucosyltransferase-likeLysM domain1.91E-040.93
GQ03004_G22Phenylalanine ammonia-lyasePhenylalanine ammonia-lyase shielding domain superfamily1.89E-020.92
WS0322_G204-coumarate–coa ligase 2B-cell receptor-associated protein 29/311.70E-030.87
GQ03321_M15Cytochrome P450 CYP736A12-likeRibosomal protein S11, cytochrome P450 E-class group I superfamily2.06E-020.83
GQ03803_O03Dirigent protein 22-likeDirigent protein3.18E-020.83
GQ03313_I03Protein SRG1Isopenicillin N synthase-like, Oxoglutarate2.57E-020.75

Sequence description: annotations obtained using BLAST2GO (P < 0.05).

InterPro classification: most informative InterPro names.

Expression: DEGs having higher expression (log2 fold change ≥ 1) in trees producing high levels of PCs were labeled as HIGH, while DEGs having lower expression (log2 fold change ≤ −1) in individuals producing high levels of PCs were labeled as LOW. Piceid does not appear in the table as none of the 3 DEGs identified for this metabolite was associated with the phenylpropanoid pathway.

List of differentially expressed genes (DEGs; adjusted P < 0.05) involved in pan class="Chemical">phenolic compounds metabolism. Sequence description: annotations obtained using BLAST2GO (P < 0.05). InterPro classification: most informative InterPro names. Expression: DEGs having pan class="Disease">higher expn>ression (log2 fold change ≥ 1) in trees producing n>an class="Disease">high levels of PCs were labeled as HIGH, while DEGs having lower expression (log2 fold change ≤ −1) in individuals producing high levels of PCs were labeled as LOW. Piceid does not appear in the table as none of the 3 DEGs identified for this metabolite was associated with the phenylpropanoid pathway. In this study, genes known to be involved in the phenylpropanoid pathway in white spruce or other plant species, and genes that were found differentially expressed were designated as key genes. Among the 50 DEGs associated to the phenylpropanoid pathway (Table 3), we identified five key DEGs, namely the 4-coumarate-CoA ligase (4CL), the phenylalanine ammonia-lyase (PAL), cinnamoyl-CoA reductase (CCR), p-coumaroyl shikimate/quinate 3′–hydroxylase (C3H) and caffeic O-methyltransferase (COMT) (Figure 6). Six other key enzymes were associated specifically to the flavonoid pathway, and included the flavonol synthase (FLS), the UDP-dependent glucosyl transferase (UGT), the flavonoid 3′5′-hydroxylase (F3′5′H - CYP75A), the bifunctional dihydroflavonol 4-reductase/flavanone 4-reductase (DFR), the leucoanthocyanidin reductase (LAR) and the anthocyanidin reductase (ANR) (Figure 6). It should be noted that FLS, UGT and F3H displayed opposite gene expression patterns in individuals producing high levels of taxifolin glucoside, compared to those producing high levels of gallocatechin (see Supplementary Table 4 for details).
Figure 6

Proposed phenolic compound biosynthesis pathway in spruce. The pathway presented here is adapted from Warren et al. (2015). Candidate gene IDs identified in this study were retrieved from the gene catalog GCAT3.3 (Rigault et al., 2011) and are reported in black and italic. Genes identified in QTLs only are shown in orange boxes. Lower- or highly-expressed genes are indicated by blue and red boxes, respectively. Remaining DEGs (i.e., excluding high- and low expressed genes) are shown in white boxes. Genes within the taxifolin glucoside QTL and differentially expressed for gallocatechin are shown in white boxes with bold margins. The metabolite for which genes are differentially expressed is reported in each box according to the following nomenclature: G, gallocatechin; N, neolignan-2; P, piceid; PB, procyanidin B1; Tg, taxifolin glucoside. The five metabolites studied are underlined in green. Known key enzymes involved in the phenylpropanoid pathway are labeled in blue and bold capital letters. Abbreviations are as follow: 4CL, 4-coumarate-CoA ligase; ANR, anthocyanidin reductase; ANS, anthocyanidin synthase; C3H, p-coumaroyl shikimate/quinate 3′ -hydroxylase; C4H, cinnamate 4-hydroxylase; CAD, cinnamyl-alcohol dehydrogenase; CAD, cinamyl alcohol deshydrogenase; CCoAOMT, caffeoyl-CoA O-methyltransferase; CCR, cinnamoyl-CoA reductase; CHI, chalcone isomerase; CHS, naringenin-chalcone synthase; COMT, caffeic O-methyltransferase DFR, bifunctional dihydroflavonol 4-reductase/flavanone 4-reductase; DIR, dirigent protein; F3H, flavanone 3-hydroxylase; F3′H, flavonoid 3′-hydroxylase; F3′5′H, flavonoid 3′5′-hydroxylase; FLS, flavonol synthase; HCT, hydroxycinnamoyl-CoA:shikimate/quinate hydroxycinnamoyltransferase; LAC, laccase; LAR, leucoanthocyanidin reductase; PAL, phenylalanine ammonia-lyase; PER, peroxidase; PLR, pinoresinol lariciresinol reductase; STS, trihydroxystilbene synthase; UGT, UDP-dependent glucosyl transferase.

Proposed phenolic compound biosynthesis pathway in spruce. The pathway presented here is adapn>ted from Warren et al. (2015). Candidate gene IDs identified in this study were retrieved from the gene catalog GCAT3.3 (Rigault et al., 2011) and are reported in black and italic. Genes identified in QTLs only are shown in orange boxes. Lower- or highly-expressed genes are indicated by blue and red boxes, respectively. Remaining DEGs (i.e., excluding high- and low expressed genes) are shown in white boxes. Genes within the taxifolin glucoside QTL and differentially expressed for gallocatechin are shown in white boxes with bold margins. The metabolite for which genes are differentially expressed is reported in each box according to the following nomenclature: G, gallocatechin; N, neolignan-2; P, piceid; PB, procyanidin B1; Tg, taxifolin glucoside. The five metabolites studied are underlined in green. Known key enzymes involved in the phenylpropanoid pathway are labeled in blue and bold capital letters. Abbreviations are as follow: 4CL, 4-coumarate-CoA ligase; ANR, anthocyanidin reductase; ANS, anthocyanidin synthase; C3H, p-coumaroyl shikimate/quinate 3′ -hydroxylase; C4H, cinnamate 4-hydroxylase; CAD, cinnamyl-alcohol dehydrogenase; CAD, cinamyl alcohol deshydrogenase; CCoAOMT, caffeoyl-CoA O-methyltransferase; CCR, cinnamoyl-CoA reductase; CHI, chalcone isomerase; CHS, naringenin-chalcone synthase; COMT, caffeic O-methyltransferase DFR, bifunctional dihydroflavonol 4-reductase/flavanone 4-reductase; DIR, dirigent protein; F3H, flavanone 3-hydroxylase; F3′H, flavonoid 3′-hydroxylase; F3′5′H, flavonoid 3′5′-hydroxylase; FLS, flavonol synthase; HCT, hydroxycinnamoyl-CoA:shikimate/quinate hydroxycinnamoyltransferase; LAC, laccase; LAR, leucoanthocyanidin reductase; PAL, phenylalanine ammonia-lyase; PER, peroxidase; PLR, pinoresinol lariciresinol reductase; STS, trihydroxystilbene synthase; UGT, UDP-dependent glucosyl transferase.

Differentially Expressed Transcription Factors as Potential Regulators of the Phenylpropanoid Pathway

A total of 41 transcription factors (TFs) were differentially expressed among the high and low PCs content groups (Supplementary Table 6). Among those, MYB (10 genes), WRKY (5 genes), NAC and AP2-EREBP (3 genes) were the most represented families. No significant TFs were found for piceid, one for procyanidin B1, while only three TFs were detected for neolignan-2. The remaining TFs were associated to taxifolin glucoside and gallocatechin. One bHLH (GQ01301_E17), one WD40 (GQ03304_I14), one WRKY (GQ03304_E19) and eight MYBs genes showed high and low expression in individuals producing high levels of taxifolin glucoside and gallocatechin, respectively.

Combining QTL and Transcriptomic Analyses to Identify Candidate Genes for Phenolic Compound Variation

A total of 19 DEGs that were associated to pan class="Chemical">gallocatechin, n>an class="Chemical">neolignan-2, and taxifolin glucoside, were also located within QTL regions (Supplementary Table 7). Those genes can be safely designated as candidate genes for PC variation. They included five hydrolases, four uncharacterized putative proteins as well as two genes encoding methylesterases and transferases. None of them were formally associated to the phenylpropanoid pathway.

Discussion

Neolignan: A Likely Rare Case of Monogenic Trait in Conifers

Two types of phenotypic distributions were observed in the present study (Figure 2; Supplementary Figure 2). Stilbenoids and flavonoids displayed a continuous and unimodal distribution of concentrations typical for quantitative traits controlled by multiple genes with small effects (Supplementary Figure 2), as generally observed for Pinaceae taxa (Routaboul et al., 2012; Wahyuni et al., 2014; Ganthaler et al., 2017). The distribution of their 15 QTLs across 7 linkage groups (Table 2), the moderate proportion of variance explained by single QTLs (10.9% ≤ PVE ≤ 13%; Table 2), and the distribution of 197 DEGs across the 12 spruce linkage groups (Supplementary Table 2) further support the scenario of a polygenic control of stilbenoid and flavonoid production. In contrast, neolignan-2 displayed a bimodal distribution of metabolite content (Figure 2). This distribution suggests a monogenic control of metabolite concentration, consistent with the identification of a single major QTL on LG4 (Figure 2C) explaining up to 91.3% of the phenotypic variance (Table 2). Until now, studies reporting the occurrence of neolignans in the Picea genus, and more generally in conifers, are still scarce [but see Hong et al. (2014); for review see Tanase et al. (2019)]. The neolignan analyzed in the current study (neolignan-2) was recently identified in Norway spruce (Nemesio-Gorriz et al., 2017), and found under the regulation of several MYB genes. Neolignans have diverse physiological roles in angiosperm plants, including defense against fungus and insects (Choi et al., 2009; Saguez et al., 2013), but little is known regarding their specific role in conifers. Monogenic traits usually represent prime candidates for marker-assisted selection in breeding programs. In this sense, further research should focus on the characterization and physiological role of neolignan-2 in white spruce, and densifying the QTL involved with more gene markers, given that less than one third of the transcriptome has been mapped in white spruce (Pavy et al., 2017).

Metabolite Data and Expression Profiles Reveal New Insights Into the Transcriptional Control of Flavonoids

Transcriptomic data highlighted a set of 137 genes showing opposite expn>ression patterns between high gallocatechin and high taxifolin glucoside individuals (r = −0.88, P < 0.001; Figure 4). While further investigations are needed to determine which of these genes play a role in the constitutive production of these two flavonoids, such a strong negative relationship suggests that these 137 genes are likely co-regulated, as commonly observed for genes related to the biosynthesis of flavonoids (Honda et al., 2002; Tian et al., 2017). This opposite expression pattern also points to a possible feedback mechanism influencing the production of these two metabolites. However, the lack of significant negative correlation between gallocatechin and taxifolin glucoside concentrations in contrasted phenotypes (Supplementary Figure 5) indicates that gene expression and metabolite levels are decoupled. This does not rule out the possibility of a feedback mechanism, given that metabolite concentrations can be affected by metabolic network connectivity, when metabolites are involved in several metabolic pathways (Wegner and Kummer, 2005; Zelezniak et al., 2014). Alternatively, the activation of these genes, or a subset of them, may simply enhance the production of taxifolin glucoside, without affecting the production of gallocatechin, while their repression would yield an opposite pattern. Since both pathways require the same substrate, this scenario would imply that substrate availability is not a limiting factor for the synthesis of these two metabolites.

Trancriptomics Highlighted Key Genes Involved in the Phenylpropanoid Pathway and Environmental Stress Response

Transcriptomics revealed that members of gene families generally associated with the phenylpropanoid pathway in plants (Reinprecht et al., 2017) such as DIR, PER-LAC, cytochrome P450, NmrA-like and O-methyltransferases were differentially expn>ressed in white spruce (Table 3; Figure 6). Some key genes involved in the flavonoid pathway (FLS, UGT and F3H; Figure 6) displayed opposite gene expression patterns in individuals producing high levels of taxifolin glucoside and gallocatechin. This observation indicates that these genes are important actors of the transcriptional machinery governing the production of flavonoids. Two genes showing higher expression in trees producing high taxifolin glucoside and gallocatechin contents (GQ03519_N09 and GQ0082_B18; Figure 6) encoded the flavanone 3-hydroxylase (F3H), an enzyme mediating the production of taxifolin in spruce (Hammerbacher et al., 2019). A previous study reported a negative feedback regulation of F3H expression by catechins in Camellia sinensis (Singh et al., 2008). Similar mechanism may occur in white spruce, since catechin and taxifolin metabolite levels were strongly positively correlated in trees producing high or low levels of gallocatechin and taxifolin glucoside (Supplementary Figure 5). Expression level of UGT85A8 was higher in individuals producing high levels of taxifolin glucoside, in line with the commonly accepted hypothetical biosynthesis pathway in spruce where UGT facilitated the glucosylation of taxifolin to taxifolin glucoside (Hammerbacher et al., 2019). One gene encoding a flavonoid 3′5′-hydroxylase (F3′5′H; GQ04107_C21) had a higher expression level in high gallocatechin content individuals (Table 3; Figure 6). F3′5′H is an important branch point enzyme in flavonoid biosynthesis that catalyzes the conversion of flavonols into 3′,4′,5′-hydroxylated derivatives and allows the formation of gallocatechin (Deng and Lu, 2017; Figure 6). In Norway spruce, the formation of gallocatechin preferentially leads through catechin rather than dihydromyricetin and leucodelphinidin, and was induced following bark beetle-fungus infection (Hammerbacher et al., 2018, 2019). Finally, one gene located in the procyanidin B1 QTL (GQ03701_M12) encoded the leucoanthocyanidin reductase (LAR) (Figure 6), a bifunctional enzyme catalyzing the reduction of leucocyanidin and controlling the degree of polymerization of proanthocyanidins (Jun et al., 2018; Yu et al., 2019). In Norway spruce, the homologous gene of GQ03701_M12 (Hammerbacher et al., 2014) and other LAR genes (Oliva et al., 2015; Nemesio-Gorriz et al., 2016) were found to be involved in response to fungus infection. Interestingly, one dirigent protein (DIR, WS00740_J05) was highly expressed in individuals with high neolignan-2 content (Table 3; Figure 6). Dirigent proteins are involved in the stereoselective reaction forming the lignan pinoresinol from coniferyl alcohol (Davin and Lewis, 2005), and could also act in the formation of lignin (Burlat et al., 2001). These proteins are usually induced by wounding as well as weevil and budworm herbivory attacks (Ralph et al., 2006b; Lippert et al., 2007). pan class="Chemical">Phenylpropanoid biosynthesis is a common plant respn>onse to biotic and n>an class="Disease">abiotic stress. In line with this idea, several DEGs identified herein were previously found involved in climate adaptation (Supplementary Table 8; Hornoy et al., 2015), cold hardening and cold acclimation in various spruce species (Supplementary Table 8; Holliday et al., 2008; Kayal et al., 2011; Pelgas et al., 2011). We also identified DEGs related to plant defense pathways other than the phenylpropanoid pathway, such as the terpenoid pathway (Figure 5B; Supplementary Tables 4, 8), possibly indicating signaling crosstalk between secondary metabolites production and stress response (Jacobo-Velázquez et al., 2015; Isah, 2019).

Identification of Transcription Factors Potentially Involved in the Transcriptional Control of Flavonoids

We identified 40 differentially expressed TFs from families (i.e., MYB, bHLH, WD40, WRKY and AP2/EREBP; Supplementary Table 6) known as involved in spruce PCs pathway (e.g., Bomal et al., 2008; Bedon et al., 2010; Nemesio-Gorriz et al., 2017). This is consistent with the fact that in higher plants, several R2R3-MYB proteins are known to activate the early steps of flavonoid biosynthesis, whereas late biosynthetic genes are rather controlled by the MYB-bHLH-WD40 (MBW) complex (Xu et al., 2015; Ma et al., 2018). In addition, recent studies reported that the action of the MBW complex can be modified by WRKY TFs (Lloyd et al., 2017). In this study, we found two R2R3-MYBs (GQ03719_G10, white spruce homolog of PaMYB33; and GQ04002_F03, white spruce homolog of PaMYB31) showing higher expression in high taxifolin glucoside content individuals (Supplementary Table 6), and for which protein-protein interactions with bHLH members were shown to play a role in the regulation of the flavonoid pathway in Norway spruce (Nemesio-Gorriz et al., 2017). Interestingly, the overexpression of PaMYB33 in transgenic cell lines also activated the expression of genes encoding PAL, ANR and LAR enzymes (Nemesio-Gorriz et al., 2017), suggesting that similar mechanisms may govern the late flavonoid biosynthesis in white spruce. The 12 TFs displaying opposite expression patterns between gallocatechin and taxifolin glucoside in our study (Supplementary Table 6) might be directly involved in the formation of the MBW complex or might be downstream target genes of the MBW complex. Given that the regulatory mechanism involving the MBW complex appears highly conserved in higher plants (Xu et al., 2015), our findings provide a relevant framework for studying the complex transcriptional network governing the biosynthesis of flavonoids, in white spruce and other conifer species.

Integrating QTL Mapping and Transcriptomic Approaches: Potential, Limits and Perspectives

While RNA-Seq provided valuable insights into genes involved in white spruce phenylpropanoid pathway and their regulation, the combination of both QTL mapping and RNA-Seq approaches yielded mixed results. We identified 19 candidate DEGs colocalizing within QTLs, but none of them were formally associated with the PC pathway (Supplementary Table 7). However, considering that the landscape of genes essential for the regulation of phenylpropanoids in plants is still not exhaustive (for a review see Biala and Jasiński, 2018), we should not exclude the possibility that these candidate genes could be indirectly linked to the phenylpropanoid biosynthesis pathway. Further, the fact that one third (203) of the 603 DEGs identified herein have not been positioned yet on the current white spruce genetic map (Pavy et al., 2017) suggests that some additional DEGs could have colocalized with the detected QTLs. Association studies focusing on these DEGs would also help identify more candidate genes. Transcriptomics alone proved powerful to investigate the combinatorial gene regulation of flavonoids in white spruce. It allowed the identification of 137 genes likely co-regulated, along with several candidate regulators, and 50 genes encoding key enzymes of the white spruce phenylpropanoid pathway. These results may be further explored by focusing on the discovery of variable genomic regions responsible for variations in gene expression (i.e., eQTL studies), given that PCs regulatory networks appear relatively conserved in plants (e.g., MBW complex; see Xu et al., 2015).

Data Availability Statement

The original contributions presented in the study are publicly available. This data can be found here: Concentrations in pan class="Chemical">phenolic compounds (file: Concentration_metabolites.xlsx) were deposited in Dryad and are available at the following link: https://doi.org/10.5061/dryad.z08kpn>rrcd. The raw read sequencing data was deposited in the European Nucleotide Archive (Accessions: ERR5545518, ERR5545555, ERR5545556, ERR5545558, ERR5545562, ERR5545566, ERR5545567, ERR5545569, ERR5545571, ERR5545573, ERR5545575, ERR5545578-ERR5545580, ERR5545583, ERR5545584, ERR5545586, ERR5545587, ERR5545591, ERR5545621-ERR5545624, ERR5545643-ERR5545645, ERR5545647-ERR5545649, ERR5545651-ERR5545653, ERR5545656, ERR5545657, ERR5545660, ERR5545661-ERR5545664, ERR5545666-ERR5545671, ERR5545673, ERR5545674, ERR5545676-ERR5545680, ERR5545682, ERR5545684-ERR5545687, ERR5545689-ERR5545695, ERR5545697-ERR5545699, ERR5545739-ERR5545743, ERR5545745, ERR5545746, ERR5545748-ERR5545753). The reference transcripn>tome (file: GCAT_WS-3.3.cluseq.fasta) used to mapn> RNA-Seq reads was deposited in Dryad and is available at the following link: https://doi.org/10.5061/dryad.z08kpn>rrcd.

Author Contributions

JLao, pan class="Disease">AH, NI, and JB designed the study. JLao, SG, n>an class="Disease">AH, AA, M-CG-L, BB, and JLar designed methods and carried out the experiments. JLao, CD, SG, CB, and ML performed the analyses and discussed the results. JLao and CD wrote the manuscript with contributions and feedbacks from SG, ML, CB, AA, M-CG-L, JLar, BB, AH, NI, and JB. All authors contributed to the article and approved the submitted version.

Conflict of Interest

The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.
  84 in total

1.  The molecular basis of quantitative variation in foliar secondary metabolites in Eucalyptus globulus.

Authors:  Carsten Külheim; Suat Hui Yeoh; Ian R Wallis; Shawn Laffan; Gavin F Moran; William J Foley
Journal:  New Phytol       Date:  2011-05-24       Impact factor: 10.151

Review 2.  Abiotic and biotic stresses and changes in the lignin content and composition in plants.

Authors:  Jullyana Cristina Magalhães Silva Moura; Cesar Augusto Valencise Bonine; Juliana de Oliveira Fernandes Viana; Marcelo Carnier Dornelas; Paulo Mazzafera
Journal:  J Integr Plant Biol       Date:  2010-04       Impact factor: 7.061

3.  Molecular events of apical bud formation in white spruce, Picea glauca.

Authors:  Walid El Kayal; Carmen C G Allen; Chelsea J-T Ju; Eri Adams; Susanne King-Jones; L Irina Zaharia; Suzanne R Abrams; Janice E K Cooke
Journal:  Plant Cell Environ       Date:  2011-01-07       Impact factor: 7.228

4.  Subgroup 4 R2R3-MYBs in conifer trees: gene family expansion and contribution to the isoprenoid- and flavonoid-oriented responses.

Authors:  Frank Bedon; Claude Bomal; Sébastien Caron; Caroline Levasseur; Brian Boyle; Shawn D Mansfield; Axel Schmidt; Jonathan Gershenzon; Jacqueline Grima-Pettenati; Armand Séguin; John MacKay
Journal:  J Exp Bot       Date:  2010-08-23       Impact factor: 6.992

5.  QTL mapping in white spruce: gene maps and genomic regions underlying adaptive traits across pedigrees, years and environments.

Authors:  Betty Pelgas; Jean Bousquet; Patrick G Meirmans; Kermit Ritland; Nathalie Isabel
Journal:  BMC Genomics       Date:  2011-03-10       Impact factor: 3.969

6.  Organ-Specific Quantitative Genetics and Candidate Genes of Phenylpropanoid Metabolism in Brassica oleracea.

Authors:  Marta Francisco; Mahmoud Ali; Federico Ferreres; Diego A Moreno; Pablo Velasco; Pilar Soengas
Journal:  Front Plant Sci       Date:  2016-01-28       Impact factor: 5.753

7.  Overexpression of PaNAC03, a stress induced NAC gene family transcription factor in Norway spruce leads to reduced flavonol biosynthesis and aberrant embryo development.

Authors:  Kerstin Dalman; Julia Johanna Wind; Miguel Nemesio-Gorriz; Almuth Hammerbacher; Karl Lundén; Ines Ezcurra; Malin Elfstrand
Journal:  BMC Plant Biol       Date:  2017-01-06       Impact factor: 4.215

8.  Salmon provides fast and bias-aware quantification of transcript expression.

Authors:  Rob Patro; Geet Duggal; Michael I Love; Rafael A Irizarry; Carl Kingsford
Journal:  Nat Methods       Date:  2017-03-06       Impact factor: 28.547

9.  Joint QTL mapping and transcriptome sequencing analysis reveal candidate flowering time genes in Brassica napus L.

Authors:  Hongju Jian; Aoxiang Zhang; Jinqi Ma; Tengyue Wang; Bo Yang; Lan Shuan Shuang; Min Liu; Jiana Li; Xinfu Xu; Andrew H Paterson; Liezhao Liu
Journal:  BMC Genomics       Date:  2019-01-09       Impact factor: 3.969

10.  Enhancing genetic mapping of complex genomes through the design of highly-multiplexed SNP arrays: application to the large and unsequenced genomes of white spruce and black spruce.

Authors:  Nathalie Pavy; Betty Pelgas; Stéphanie Beauseigle; Sylvie Blais; France Gagnon; Isabelle Gosselin; Manuel Lamothe; Nathalie Isabel; Jean Bousquet
Journal:  BMC Genomics       Date:  2008-01-18       Impact factor: 3.969

View more
  1 in total

1.  Analysis of transcriptome data and quantitative trait loci enables the identification of candidate genes responsible for fiber strength in Gossypium barbadense.

Authors:  Yajie Duan; Qin Chen; Quanjia Chen; Kai Zheng; Yongsheng Cai; Yilei Long; Jieyin Zhao; Yaping Guo; Fenglei Sun; Yanying Qu
Journal:  G3 (Bethesda)       Date:  2022-08-25       Impact factor: 3.542

  1 in total

北京卡尤迪生物科技股份有限公司 © 2022-2023.