The metabolomic landscape of rice heterosis highlights pathway biomarkers for predicting complex phenotypes.

Zhiwu Dan1, Yunping Chen1, Hui Li1, Yafei Zeng1, Wuwu Xu1, Weibo Zhao1, Ruifeng He2, Wenchao Huang1.   

Abstract

Understanding the molecular mechanisms underlying complex phenotypes requires systematic analyses of complicated metabolic networks and contributes to improvements in the breeding efficiency of staple cereal crops and diagnostic accuracy for human diseases. Here, we selected rice (Oryza sativa) heterosis as a complex phenotype and investigated the mechanisms of both vegetative and reproductive traits using an untargeted metabolomics strategy. Heterosis-associated analytes were identified, and the overlapping analytes were shown to underlie the association patterns for six agronomic traits. The heterosis-associated analytes of four yield components and plant height collectively contributed to yield heterosis, and the degree of contribution differed among the five traits. We performed dysregulated network analyses of the high- and low-better parent heterosis hybrids and found multiple types of metabolic pathways involved in heterosis. The metabolite levels of the significantly enriched pathways (especially those from amino acid and carbohydrate metabolism) were predictive of yield heterosis (area under the curve = 0.907 with 10 features), and the predictability of these pathway biomarkers was validated with hybrids across environments and populations. Our findings elucidate the metabolomic landscape of rice heterosis and highlight the potential application of pathway biomarkers in achieving accurate predictions of complex phenotypes.
© The Author(s) 2021. Published by Oxford University Press on behalf of American Society of Plant Biologists.

Year:  2021        PMID: 34608951      PMCID: PMC8491067          DOI: 10.1093/plphys/kiab273

Source DB:  PubMed          Journal:  Plant Physiol        ISSN: 0032-0889            Impact factor:   8.340


Introduction

Variations in the levels of specific metabolites are closely related to the quantitative changes in complex phenotypes. For example, in a previous study in tomato (Solanum lycopersicum), most of the identified metabolites that belong to central metabolic pathways were significantly correlated with whole-plant phenotypic traits (Schauer et al., 2006). More recently, 40 plasma metabolites were found to explain the variance in gut microbiome α-diversity in humans (Wilmanski et al., 2019). Although the combination of metabolites has potential for predicting multiple polygenic phenotypes (Wen et al., 2014; Dan et al., 2019, 2020), the prediction of individuals with the same performance is hampered by molecular heterogeneity (Chen et al., 2014; Menche et al., 2017; Guo et al., 2019). Moreover, metabolites that were statistically insignificant under one condition were ignored under other conditions, so their contribution to phenotypic variance was overlooked. With the rapid advancements in dysregulated network analysis of metabolomics (Chong et al., 2018; Shen et al., 2019), the development of metabolomic biomarkers at the pathway level, moving beyond discrete metabolites, offers a way to increase the predictability of complex phenotypes. Heterosis, which has been widely used for improving global food production, has complex characteristics, and its metabolomic mechanisms have yet to be elucidated (Darwin, 1876; Williams, 1959). With continuously growing populations and dramatic climatic changes, the breeding of new heterotic and adaptive hybrids is a major challenge for traditional breeding programs (Varshney et al., 2018; Hickey et al., 2019). Previous studies conducted on hybrid crops (including maize, wheat, and rice) have demonstrated that the screened metabolites detected from leaves or roots have predictive power for biomass (Lisec et al., 2011), grain weight and production (Zhao et al., 2015; Xu et al., 2016; Dan et al., 2019), and yield heterosis (Dan et al., 2020).
Obstacles such as feature selection and cross-validation procedures still exist (Crossa et al., 2017; Dan et al., 2019), and the metabolomic connections between components (e.g. grain number and grain weight) and complex traits (e.g. yield and biomass) are largely unknown. Therefore, metabolome-based precision designs require optimization to achieve accurate predictions across populations and environments. To understand the metabolomic mechanisms of heterosis and identify robust pathway biomarkers for yield heterosis in rice, we identified heterosis-associated analytes and revealed their contribution to six agronomic traits. The metabolic pathways involved in heterosis were identified through dysregulated network analysis of the high- and low-better parent heterosis hybrids, and the overlapping pathways revealed the metabolomic landscape of heterosis for both vegetative and reproductive traits. Quantitative changes in the significantly enriched pathways were predictive of yield heterosis, and a small number of pathway biomarkers were further validated with hybrids across environments and in a separate hybrid population, suggesting a wide application potential for predicting complex phenotypes.

Results

Identifying heterosis-associated analytes for six agronomic traits

To identify metabolic analytes associated with rice (Oryza sativa) heterosis, we phenotyped grain yield; four yield components (seed setting rate, grain weight, grain number, and tiller number); and plant height (PH, a yield-related trait) for a hybrid population (complete diallel crosses with 18 parents) and collected untargeted metabolite profiles from 15-d-old parental seedlings (Supplemental Table S1). Previous results have demonstrated that the calculated average parental metabolite levels are appropriate for representing the hybrid metabolite profiles (Dan et al., 2020). We performed a Pearson correlation analysis between the transformed parental metabolite levels and better-parent heterosis (BPH) of the six investigated traits (Supplemental Figure S1). BPH estimates the degree to which a hybrid outperforms its better parent, and breeders pursue high values. Although the degree of heterosis varied widely across traits at both individual and population levels (Figure 1A), the average parental metabolite levels were more closely linked to heterosis than the differences in or ratios of the parental levels were, based on the number of significant correlations (Figure 1B).
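The correlation screen described above can be sketched in a few lines. This is a minimal illustration, not the authors' pipeline: it assumes the conventional definition BPH = (hybrid − better parent) / better parent, assumes the difference and ratio are taken as parent A relative to parent B (the text does not state the direction), and uses made-up toy records. All function and variable names are hypothetical.

```python
from math import sqrt

def better_parent_heterosis(hybrid, parent_a, parent_b):
    """BPH as a fraction: how far the hybrid exceeds its better parent (assumed definition)."""
    better = max(parent_a, parent_b)
    return (hybrid - better) / better

def parental_transforms(level_a, level_b):
    """Mean, difference, and ratio of one analyte's level in the two parents.
    The A-minus-B and A-over-B directions are assumptions for illustration."""
    return {
        "mean": (level_a + level_b) / 2,
        "difference": level_a - level_b,
        "ratio": level_a / level_b,
    }

def pearson_r(xs, ys):
    """Plain Pearson correlation coefficient."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

# Hypothetical toy records:
# (hybrid yield, parent A yield, parent B yield, analyte level in A, analyte level in B)
records = [
    (30.0, 25.0, 20.0, 1.0, 3.0),
    (33.0, 22.0, 24.0, 2.0, 4.0),
    (27.0, 26.0, 21.0, 1.5, 1.5),
    (36.0, 24.0, 23.0, 4.0, 3.0),
]
bph = [better_parent_heterosis(h, a, b) for h, a, b, _, _ in records]
for kind in ("mean", "difference", "ratio"):
    values = [parental_transforms(la, lb)[kind] for _, _, _, la, lb in records]
    print(kind, round(pearson_r(values, bph), 3))
```

In the study this comparison is repeated for each of the 3,746 analytes and six traits, and the number of correlations with P < 0.05 is counted per transformation; the P-value step is omitted here.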
Figure 1

Identification of heterosis-associated analytes for six agronomic traits. A, Heterosis of six agronomic traits at the population and individual levels. Five reproductive traits (including yield and four yield components) and one vegetative trait (PH) were recorded. Bars represent standard errors. B, Number of correlations between transformed parental metabolite levels and heterosis. The means of, differences in, and ratios of parental metabolite levels were calculated to perform Pearson correlations with heterosis of the six traits. Correlations with P <0.05 were considered significant. N = 3,746. C, Changes in r values with different numbers of predictive analytes in the PLS regressions. The optimal number of predictive analytes for each trait is marked with a black arrow. D–E, Correlations between the observed and predicted values of heterosis for PH (D) and yield (E) with correspondingly identified heterosis-associated analytes. F, MS/MS spectra of an analyte with peak tag M163T337_NEG and 4-hydroxycinnamic acid standard. G, Correlation between metabolite levels of 4-hydroxycinnamic acid and PH heterosis. H, Venn diagram of heterosis-associated analytes for yield and four yield components. In (A, D, and E) and (G) N = 287. SSR, seed setting rate; TGW, thousand-grain weight; GNP, grain number per panicle; TPP, tiller number per plant; YPP, yield per plant.

Next, we performed partial least squares (PLS) regression analysis (Wold, 1975), which handles high-dimensional megavariate relationships, on the average parental metabolite levels to identify predictive analytes for heterosis, namely, heterosis-associated analytes. The number of latent factors that are proxies for blocks of directly observed variables ranged from 1 to 17, and 3 or 4 latent factors, at which the r value was the highest, were chosen for each trait in building predictive models (Supplemental Figure S2).
In addition, both 10-fold cross-validation and a permutation test were performed for the six predictive models to assess overfitting (Supplemental Figure S3). For each trait, the optimal number of predictive analytes, which ranged from 100 to 300, was chosen after removing redundant feature information (Figure 1C). The correlation coefficients between the observed and predicted values of BPH for PH and grain yield at the maturation stage were 0.68 and 0.60, respectively (Figure 1, D and E), showing higher predictability for the vegetative trait than for the reproductive traits (Supplemental Figure S4). For PH, 100 heterosis-associated analytes were identified, and one analyte (peak tag: M163T337_NEG) was annotated as 4-hydroxycinnamic acid by comparison with the corresponding standard (Figure 1F). The metabolite levels of 4-hydroxycinnamic acid, whose positive relationship with PH has been confirmed in diverse plants (Gui et al., 2011; Riedelsheimer et al., 2012b; Li et al., 2015), were significantly positively correlated with PH heterosis (Figure 1G). No heterosis-associated analyte was shared by all five reproductive traits (Figure 1H). Based on the number of overlapping heterosis-associated analytes, seed setting rate and tiller number carried more weight in yield heterosis than grain number and grain weight.
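To make the modelling and overfitting checks concrete, the sketch below fits a single-response PLS model with the textbook NIPALS algorithm, scores it by 10-fold cross-validation, and compares that score against label-shuffled data. It is an illustration under stated assumptions: the text does not say which software or settings were used, the data are synthetic, and every name (`pls1_fit`, `cross_validated_r`, `permutation_test`) is hypothetical.

```python
import random
from math import sqrt

def pearson_r(xs, ys):
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

def pls1_fit(X, y, n_components):
    """Fit single-response PLS by NIPALS on mean-centred data."""
    n, p = len(X), len(X[0])
    x_means = [sum(row[j] for row in X) / n for j in range(p)]
    y_mean = sum(y) / n
    Xr = [[row[j] - x_means[j] for j in range(p)] for row in X]  # X residuals
    yr = [v - y_mean for v in y]                                  # y residuals
    components = []
    for _ in range(n_components):
        w = [sum(Xr[i][j] * yr[i] for i in range(n)) for j in range(p)]
        norm = sqrt(sum(v * v for v in w))
        if norm < 1e-12:
            break  # no covariance left between X and y
        w = [v / norm for v in w]                                  # weight vector
        t = [sum(Xr[i][j] * w[j] for j in range(p)) for i in range(n)]  # scores
        tt = sum(v * v for v in t)
        load = [sum(Xr[i][j] * t[i] for i in range(n)) / tt for j in range(p)]
        q = sum(yr[i] * t[i] for i in range(n)) / tt
        Xr = [[Xr[i][j] - t[i] * load[j] for j in range(p)] for i in range(n)]
        yr = [yr[i] - q * t[i] for i in range(n)]
        components.append((w, load, q))
    return x_means, y_mean, components

def pls1_predict(model, x):
    """Predict one sample by replaying the deflation steps of the fit."""
    x_means, y_mean, components = model
    xr = [v - m for v, m in zip(x, x_means)]
    y_hat = y_mean
    for w, load, q in components:
        t = sum(a * b for a, b in zip(xr, w))
        y_hat += q * t
        xr = [v - t * l for v, l in zip(xr, load)]
    return y_hat

def cross_validated_r(X, y, n_components, k=10, seed=0):
    """Correlation between observed values and k-fold held-out predictions."""
    order = list(range(len(y)))
    random.Random(seed).shuffle(order)
    predicted = [0.0] * len(y)
    for fold in range(k):
        held_out = set(order[fold::k])
        train = [i for i in order if i not in held_out]
        model = pls1_fit([X[i] for i in train], [y[i] for i in train], n_components)
        for i in held_out:
            predicted[i] = pls1_predict(model, X[i])
    return pearson_r(y, predicted)

def permutation_test(X, y, n_components, n_perm=50, seed=1):
    """Share of label-shuffled runs whose cross-validated r reaches the observed r."""
    observed = cross_validated_r(X, y, n_components)
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_perm):
        shuffled = list(y)
        rng.shuffle(shuffled)
        if cross_validated_r(X, shuffled, n_components) >= observed:
            hits += 1
    return observed, (hits + 1) / (n_perm + 1)

# Synthetic stand-in for "analyte levels -> heterosis": 60 samples, 8 features.
rng = random.Random(42)
X = [[rng.gauss(0, 1) for _ in range(8)] for _ in range(60)]
y = [2 * row[0] - row[1] + rng.gauss(0, 0.3) for row in X]
r_obs, p_value = permutation_test(X, y, n_components=2)
print(round(r_obs, 2), round(p_value, 3))
```

A model that only memorises noise would show a cross-validated r close to that of the shuffled runs; a large gap between the two is the evidence against overfitting that the permutation test is meant to provide.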

Connections of heterosis-associated analytes among traits

To investigate the connections of heterosis-associated analytes among the traits, we performed both partial and Pearson correlation analyses on heterosis of the five reproductive traits and PH (Figure 2A). Notably, heterosis of seed setting rate (R = 0.72) and tiller number (R = 0.66) contributed more than those of grain number (R = 0.34) and grain weight (R = 0.16) to yield heterosis, based on the correlation coefficients. We then investigated the relationship between the metabolite levels of the 27 overlapping heterosis-associated analytes for yield and seed setting rate (Supplemental Table S2), and found that all analytes had consistent positive or negative correlations with heterosis of the two traits (Figure 2B;  Supplemental Table S3). Furthermore, positive and negative correlations were detected among the five reproductive traits, and consistent or opposite relationships were found between the metabolite levels of overlapping heterosis-associated analytes and heterosis (Figure 2C;  Supplemental Figure S5 and Supplemental Table S3), indicating that the overlapping analytes underlie the association patterns for the traits.
Figure 2

Connections of heterosis-associated analytes among traits. A, Correlations among heterosis of five reproductive traits and PH. Partial correlations were performed to investigate the contribution of four yield components and PH to yield heterosis. Pearson correlations were conducted to analyze the relationship among the four yield components and PH. Correlation coefficients of the partial and Pearson correlations are indicated with R and r, respectively. *, **, statistically significant at 0.05 and 0.01 levels, respectively; ns, no statistically significant correlation. B, Correlations between metabolite levels of M853T560_NEG and heterosis of seed setting rate and yield. C, Correlations between metabolite levels of M131T16_NEG and heterosis of seed setting rate and tiller number. D, Correlation between the observed and predicted values of yield heterosis based on heterosis of the four yield components and PH. An equation was obtained through stepwise regression analysis: BPH-YPP = BPH-SSR*1.674 + BPH-TPP*0.949 + BPH-TGW*0.571 + BPH-GNP*0.533 + BPH-PH*0.504 + 0.299. E, Correlation between the observed and predicted values of yield heterosis based on heterosis-associated analytes of the four yield components and PH with the equation in Figure 2D. In (A–E), N = 287.

We then performed stepwise regression analysis on the heterosis of yield and the four components and found an equation that explained the variance in yield heterosis (r = 0.81; Supplemental Figure S6). Because the degree of heterosis for the four components was predicted with corresponding heterosis-associated analytes (Supplemental Figure S4), we used the predicted values in the equation and calculated new values for yield heterosis. A significant correlation was observed between the observed and predicted values (r = 0.52; Supplemental Figure S6).
Furthermore, the percentage of explained variance for yield heterosis slightly increased with the addition of PH to the regression equation (r = 0.82; Figure 2D), and the correlation coefficient increased to 0.53, based on the heterosis-associated analytes of the five traits (Figure 2E). Heterosis of PH was positively correlated with almost all investigated reproductive traits, except seed setting rate (Figure 2A), and overlapping heterosis-associated analytes were found among these traits with the same correlations as those shown in Figure 2, B and C (Supplemental Figure S7 and Supplemental Table S3). These results indicated that the heterosis-associated analytes of the yield components and the yield-related trait (PH) collectively contributed to yield heterosis.
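The stepwise regression equation reported in the Figure 2D caption can be applied directly. The sketch below encodes its coefficients and intercept as printed there; the function name, the dictionary layout, and the example input values are hypothetical, and the inputs are assumed to be on the same BPH scale that the authors used when fitting the equation (the text does not state whether that is a fraction or a percentage).

```python
# Coefficients and intercept as printed in the Figure 2D caption.
COEFFICIENTS = {
    "SSR": 1.674,  # seed setting rate
    "TPP": 0.949,  # tiller number per plant
    "TGW": 0.571,  # thousand-grain weight
    "GNP": 0.533,  # grain number per panicle
    "PH": 0.504,   # plant height
}
INTERCEPT = 0.299

def predict_yield_bph(trait_bph):
    """Predicted BPH of yield per plant from the BPH of the five traits."""
    return INTERCEPT + sum(COEFFICIENTS[t] * trait_bph[t] for t in COEFFICIENTS)

# Hypothetical hybrid: the trait values below are invented for illustration.
example = {"SSR": 0.10, "TPP": 0.20, "TGW": 0.00, "GNP": -0.10, "PH": 0.05}
print(round(predict_yield_bph(example), 4))
```

Feeding the function observed trait heterosis corresponds to Figure 2D, while feeding it the values predicted from heterosis-associated analytes corresponds to Figure 2E.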

Metabolic pathways involved in heterosis

The metabolic pathways involved in heterosis need to be elucidated. Of the 3,746 analytes in our study, only 114 had been annotated, making it difficult to perform pathway enrichment analysis based on limited metabolite information. To identify enriched pathways for heterosis of each trait, we first divided the diallel cross population into two distinct regions of high- and low-BPH based on the quartiles (25th and 75th percentiles) at which most of the differential analytes from the empirical Bayesian analysis overlapped with the corresponding heterosis-associated analytes (Figure 3A). We then performed dysregulated network analysis on the two groups with Metabolite identification and Dysregulated Network Analysis software (MetDNA; Shen et al., 2019), which annotates metabolites with a recursive algorithm and identifies dysregulated metabolic pathways based on differential metabolic peaks. The results showed that only two pathways were simultaneously enriched for the five reproductive traits (Figure 3B). The enriched pathways for heterosis of the seed setting rate and tiller number had higher percentages of overlapping pathways with yield heterosis than those of grain number and grain weight (Figure 3B), which was consistent with the results shown in Figures 1, H and 2, A. With respect to quantitative information on the enriched pathways (the average levels of all metabolites per pathway), 77.3% of the pathways for yield heterosis showed significant differences between the high- and low-BPH hybrids (17 pathways; Figure 3C;  Supplemental Tables S4 and S5), and 81.8% of those were significantly correlated with yield heterosis (Figure 3D;  Supplemental Table S6). This result confirmed previously reported metabolites that have positive or negative correlations with grain yield or biomass at the pathway level and indicated that the metabolite levels of the enriched pathways were closely related to yield heterosis (Table 1).
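The grouping and pathway-level quantification described above can be sketched as follows. This is an assumption-laden illustration: the exact quartile convention is not given in the text, so the example uses the inclusive method of Python's `statistics.quantiles`, treats hybrids at or below the 25th percentile as low-BPH and those at or above the 75th percentile as high-BPH, and takes a pathway's level as the plain mean of its metabolites. Names and data are hypothetical.

```python
from statistics import mean, quantiles

def split_by_quartiles(bph_by_hybrid):
    """Return (low, high): hybrids at or below Q1 and at or above Q3 of BPH."""
    q1, _, q3 = quantiles(list(bph_by_hybrid.values()), n=4, method="inclusive")
    low = [h for h, v in bph_by_hybrid.items() if v <= q1]
    high = [h for h, v in bph_by_hybrid.items() if v >= q3]
    return low, high

def pathway_level(metabolite_levels):
    """Average level of all metabolites assigned to one pathway."""
    return mean(metabolite_levels)

# Hypothetical population of 287 hybrids with distinct BPH values.
bph = {f"H{i:03d}": float(i) for i in range(287)}
low, high = split_by_quartiles(bph)
print(len(low), len(high))
```

With 287 distinct values this convention places 72 hybrids in each group, which is consistent with the N = 72 given for the group comparison and N = 144 for the correlation in the Figure 3 caption, although the authors' exact rule may differ.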
Figure 3

Enriched metabolic pathways for heterosis. A, Overlap of analytes between PLS regression and Bayesian method. B, Venn diagram of enriched pathways for heterosis of the five reproductive traits. The percentages of overlapping pathways for each of the four yield components with yield heterosis are correspondingly shown in brackets. The numbers of overlapping and per se enriched pathways for the four yield components are indicated at the left and right side of the slash, respectively. NA, not applicable. C, Comparison of metabolite levels of pentose and glucuronate interconversions between the high- and low-BPH-YPP hybrids. Independent samples t test, two-tailed. N = 72. The center line of each boxplot represents the 50th percentile. The bottom and top of each boxplot represent the 25th and 75th percentiles, respectively. The whiskers represent the minimum and maximum values. The circles represent outliers. D, Correlation between metabolite levels of pentose and glucuronate interconversions and yield heterosis. N = 144. E, Correlation pattern of significantly enriched pathways for yield heterosis. A total of 17 pathways were significantly enriched for yield heterosis, and Pearson correlations were performed among these pathways based on their quantitative information. The purple and green arrows indicate that the high-BPH-YPP hybrids had high or low metabolite levels, respectively. The percentages of regulated pathways from amino acid metabolism and carbohydrate metabolism are shown in brackets. The correlation between cyanoamino acid metabolism and propanoate metabolism is highlighted with a black square. F, Correlations between metabolite levels of the citrate cycle and two pathways from amino acid and carbohydrate metabolism. N = 144.

Table 1

The enriched metabolic pathways for yield heterosis

| Pathway name | P-value of enrichment analysis | P-value of t test | Metabolite level | Previously known metabolites | Species |
|---|---|---|---|---|---|
| Tyrosine Metabolism | 0.046378 | 3.47E-10 | Low | Succinic acid, tyrosine, maleic acid, dopamine, fumarate | Arabidopsis (Meyer et al., 2007; Sulpice et al., 2013), maize (Riedelsheimer et al., 2012b; Obata et al., 2015) |
| Pantothenate and CoA Biosynthesis | 0.001271 | 3.54E-04 | Low | Aspartate, valine | Arabidopsis (Meyer et al., 2007; Sulpice et al., 2010), tomato (Schauer et al., 2006), maize (Obata et al., 2015; de Abreu et al., 2017) |
| Propanoate Metabolism | 0.001338 | 1.81E-04 | Low | Succinic acid | Arabidopsis (Meyer et al., 2007), tomato (Schauer et al., 2006) |
| Nicotinate and Nicotinamide Metabolism | 0.014907 | 2.60E-04 | Low | Succinic acid, aspartate, fumarate, nicotinate, gamma-aminobutyric acid | Arabidopsis (Sulpice et al., 2013), tomato (Schauer et al., 2006), maize (Obata et al., 2015; de Abreu et al., 2017) |
| C5-Branched Dibasic Acid Metabolism | 0.017663 | 1.31E-05 | Low | Glutamate, 2-oxoglutarate, itaconate | Arabidopsis (Sulpice et al., 2010), tomato (Schauer et al., 2006), maize (Obata et al., 2015) |
| Citrate Cycle | 0.00471 | 2.95E-08 | Low | Succinic acid, citric acid, fumarate, malate | Arabidopsis (Meyer et al., 2007; Sulpice et al., 2013), tomato (Schauer et al., 2006), maize (Obata et al., 2015) |
| Glyoxylate and Dicarboxylate Metabolism | 0.021109 | 1.72E-03 | Low | Succinic acid, glutamine, citric acid, serine, glycine, 2-oxoglutarate, malate, glyceric acid, glutamate | Arabidopsis (Meyer et al., 2007; Sulpice et al., 2009, 2010, 2013), tomato (Schauer et al., 2006), maize (Obata et al., 2015) |
| Butanoate Metabolism | 0.00072 | 0.17 | Low | Succinic acid, maleic acid, glutamate, 2-oxoglutarate, fumarate, gamma-aminobutyric acid | Arabidopsis (Meyer et al., 2007; Sulpice et al., 2010, 2013), tomato (Schauer et al., 2006), maize (Obata et al., 2015) |
| Galactose Metabolism | 0.008015 | 5.09E-05 | High | Glycerol, raffinose, galactinol, glucose | Maize (Obata et al., 2015), Miscanthus (Maddison et al., 2017) |
| Pentose and Glucuronate Interconversions | 0.014907 | 5.06E-04 | High | Glycerol, xylose, xylitol | Maize (Obata et al., 2015) |
| Sulfur Metabolism | 0.017663 | 4.67E-04 | High | Succinic acid | Maize (Obata et al., 2015) |
| Cysteine and Methionine Metabolism | 0.022276 | 2.28E-05 | High | Aspartate | Maize (Obata et al., 2015) |
| Pentose Phosphate Pathway | 0.022462 | 1.16E-06 | High | Glycerate, glucose | Maize (Obata et al., 2015), Miscanthus (Maddison et al., 2017) |
| Monobactam Biosynthesis | 0.029861 | 2.33E-03 | High | Aspartate, threonine | Maize (Obata et al., 2015) |
| Tropane, Piperidine and Pyridine Alkaloid Biosynthesis | 0.030102 | 5.58E-04 | High | Putrescine, nicotinate, nicotinate | Arabidopsis (Meyer et al., 2007), maize (de Abreu et al., 2017; Obata et al., 2015) |
| Lysine Degradation | 0.001925 | 9.41E-03 | High | Succinic acid | Maize (Obata et al., 2015) |
| Valine, Leucine and Isoleucine Biosynthesis | 0.00705 | 2.89E-03 | High | Valine, threonine | Maize (Obata et al., 2015) |
| Cyanoamino Acid Metabolism | 0.043081 | 2.85E-02 | High | Glycine, tyrosine, asparagine | Arabidopsis (Gärtner et al., 2009; Sulpice et al., 2013) |
| Phenylalanine, Tyrosine and Tryptophan Biosynthesis | 0.022462 | 0.08 | High | - | - |
| Glycine, Serine and Threonine Metabolism | 0.01074 | 0.33 | High | Glycerate, threonine, aspartate | Maize (Obata et al., 2015) |
| Pyruvate Metabolism | 0.001879 | 0.32 | High | Succinic acid, fumarate | Maize (Obata et al., 2015) |
| Phenylalanine Metabolism | 0.036123 | 0.67 | High | Benzoic acid, succinic acid, fumarate | Arabidopsis (Sulpice et al., 2013), maize (Obata et al., 2015) |
| Synthesis and Degradation of Ketone Bodies | 0.004325 | - | - | - | - |

The pathway name, P-value, metabolite level, previously known metabolites, and corresponding species are provided. Since two pathways have no reported metabolites and one pathway’s quantitative information is not available, corresponding areas are marked with horizontal lines.

We then investigated the correlations of the 17 significantly enriched pathways for yield heterosis, which were mainly from amino acid and carbohydrate metabolism. Two distinct clustering trends were found among the metabolic pathways (Figure 3E), and they were close to the correlation pattern of the 100 yield heterosis-associated analytes (Supplemental Figure S8). Because 114 of the analytes had already been successfully annotated, we converted the compound names of these metabolites into Kyoto Encyclopedia of Genes and Genomes (KEGG) IDs and mapped them to the KEGG metabolic pathways. A total of 18 metabolites were mapped to the pathways listed in Figure 3E, and six metabolites in the cyanoamino acid metabolism (l-phenylalanine, l-aspartate, and l-tyrosine) and propanoate metabolism (dihydroxyacetone phosphate, alpha-hydroxybutyric acid, and pyruvaldehyde) pathways were selected for further correlation analysis. Metabolites in the same pathways had significant positive correlations, and metabolites in different pathways had significant negative or no correlations (Supplemental Table S7). As shown in Figure 3E, the average levels of the six metabolites in the two pathways were significantly negatively correlated (Supplemental Figure S9). After the metabolite levels of the enriched pathways were compared between the high- and low-BPH hybrids, we found that all pathways involved in amino acid metabolism, except for tyrosine metabolism, had high metabolite levels in high-BPH hybrids, and 57.1% of the pathways from carbohydrate metabolism had low metabolite levels in high-BPH hybrids (Supplemental Table S5).
Because negative correlations existed between the metabolite levels of amino acid and carbohydrate metabolism (Figure 3F; Supplemental Table S6), we speculated that higher metabolite levels of amino acid metabolism and lower metabolite levels of carbohydrate metabolism were closely related to a higher degree of yield heterosis. With respect to the four yield components, the significantly enriched pathways showed different correlation patterns across traits, and most of these patterns were similar to those of the corresponding heterosis-associated analytes (Supplemental Figures S10–S12). Accordingly, we constructed a metabolomic landscape for heterosis of both reproductive and vegetative traits through overlapping pathways (Figure 4). In concordance with yield heterosis (Figure 3E), most of the significantly enriched pathways from amino acid metabolism were positively correlated with heterosis of grain weight (100%) and seed setting rate (66.7%), and the pathways from carbohydrate metabolism were negatively correlated (100% and 25%, respectively). In contrast to the reproductive traits, 83.3% of the enriched pathways from amino acid metabolism were negatively correlated with PH heterosis, and 75% of those from carbohydrate metabolism were positively correlated. Thus, the metabolite levels of the significantly enriched pathways (especially those in amino acid and carbohydrate metabolism) for the four yield components always had consistent correlation patterns with the degree of yield heterosis, whereas those for the vegetative trait (PH) showed relationships opposite to those for the five reproductive traits (yield and yield components).
Figure 4

Metabolomic landscape of heterosis for six agronomic traits. The landscape of heterosis was created by the overlapping metabolic pathways between traits. All the significantly enriched pathways from amino acid metabolism were positively correlated with heterosis of grain weight, and all the pathways from carbohydrate metabolism were negatively correlated. Similarly, four of six significantly enriched pathways from amino acid metabolism displayed positive correlations with heterosis of seed setting rate, and one out of four pathways from carbohydrate metabolism displayed a negative correlation. Eight significantly enriched pathways for grain number (namely, zeatin biosynthesis, two pathways in amino acid metabolism, and five in carbohydrate metabolism) showed negative relationships, and the pentose phosphate pathway showed a positive correlation. Only one pathway was significantly enriched for tiller heterosis, and the metabolite levels of pentose and glucuronate interconversions were positively correlated with tiller heterosis. In contrast to the above-mentioned correlation patterns, five out of six significantly enriched pathways in amino acid metabolism showed negative correlations with heterosis of PH, and three out of four pathways in carbohydrate metabolism showed positive correlations. Pearson correlation analysis was performed based on the metabolite levels of the significantly enriched pathways, and a correlation was significant when the P <0.05. Positive and negative correlations are indicated in different colors. The metabolic pathways from different types are marked correspondingly. Purple and green arrows indicate high-BPH hybrids with high or low metabolite levels, respectively. Numbers in brackets represent percentages of regulated pathways from amino acid and carbohydrate metabolism.

The enriched pathways are predictive of yield heterosis

Based on the metabolite levels of the significantly enriched pathways for yield heterosis, we performed biomarker analysis by calculating the ratios of all pathway pairs, which can increase the chance of identifying individual biomarkers (Chong et al., 2019). The univariate receiver operating characteristic (ROC) curve analysis showed that a cutoff of 0.551 for the ratio of tyrosine metabolism to sulfur metabolism could distinguish between the high- and low-BPH hybrids, with an area under the curve (AUC) of 0.836 (Figure 5, A and B; Supplemental Table S8). When multivariate ROC curve analysis was performed to identify biomarkers, the AUC increased to 0.907, and the predictive accuracy was 0.827 (Figure 5, C and D). The best model contained only 10 features; tyrosine metabolism was highly important and was frequently selected (Figure 5E; Supplemental Figure S13 and Supplemental Table S9), demonstrating the critical role of tyrosine metabolism in yield heterosis.
Figure 5

The enriched pathways are predictive of yield heterosis. A, AUC for the ratio of tyrosine metabolism to sulfur metabolism. Univariate ROC curve analysis was performed on high- and low-BPH-YPP hybrids from the diallel cross population to identify biomarkers. The shadow is the computed 95% confidence band. B, Box plot of ratios of tyrosine metabolism to sulfur metabolism. The red line indicates the optimal cutoff value. N = 72. C, AUC for the top 10 features based on the multivariate ROC curve analysis. D, Predictive accuracies with different numbers of features. E, Average importance of the top 10 features. Met = metabolism. F, Correlation between the metabolite levels of l-tyrosine and yield heterosis. N = 287. G, Correlation between the average metabolite levels of the five annotated metabolites in tyrosine metabolism and yield heterosis. N = 287. H, Comparison of yield heterosis for 34 hybrids across growth conditions. Paired samples t test, two-tailed. N = 33. I, Correlation between the metabolite levels of tyrosine metabolism and yield heterosis of the 34 hybrids grown under different conditions. N = 34. J, Comparison of the metabolite levels of tyrosine metabolism between the high- and low-BPH-YPP hybrids (N = 53 and 54, respectively) from a testcross population. K, Correlation between the metabolite levels of tyrosine metabolism and yield heterosis of the testcross population (N = 107). The center line of each boxplot represents the 50th percentile. The bottom and top of each boxplot represent the 25th and 75th percentiles, respectively. The whiskers represent the minimum and maximum values. The circles represent outliers.

We investigated the relationship between the metabolite levels of l-tyrosine and yield heterosis in the whole hybrid population and found no significant correlation (Figure 5F). 
However, the average levels of the five annotated metabolites that participate in tyrosine metabolism (some of which had significant negative correlations with yield heterosis; Supplemental Table S10), namely, l-tyrosine, maleic acid, atrolactic acid, 4-hydroxycinnamic acid, and 1,4-dihydroxybenzene, were significantly negatively correlated with yield heterosis (r = −0.23; Figure 5G). Furthermore, we evaluated the impact of changes in pathway information on predictions by adding new metabolites to tyrosine metabolism, given that KEGG or other databases are dynamic and more metabolites can be identified and added to a metabolic pathway. We first included two putatively annotated metabolites (succinate and acetoacetate) when calculating the metabolite levels of tyrosine metabolism. The correlation coefficient increased to 0.28 when succinate was added (P = 1.0e-6), and it further changed to 0.34 after the two metabolites were used (P = 2.0e-9; Supplemental Figure S14). However, the correlation coefficients decreased when using other metabolites (uracil and l-phenylalanine) that are not involved in tyrosine metabolism (Supplemental Figure S14). Thus, the metabolite levels of tyrosine metabolism, rather than those of l-tyrosine alone, were predictive of yield heterosis, and the performance of pathway biomarkers was determined by the completeness and accuracy of the pathway information. To validate the contribution of quantitative changes in tyrosine metabolism in predicting yield heterosis, both univariate and multivariate ROC curve analyses were performed on the metabolite levels of 34 hybrids with different performances across growth conditions (Figure 5H). Tyrosine metabolism functioned as a critical feature in both analyses (Supplemental Figures S15 and 16; Supplemental Tables S11 and 12), and a significant negative correlation was found between tyrosine metabolism and yield heterosis (Figure 5I). 
Subsequently, we obtained the metabolite levels of tyrosine metabolism from another testcross population containing 107 hybrids (Supplemental Table S13). As shown in Figure 3E, the metabolite levels of tyrosine metabolism in the high-BPH group were significantly lower than those in the low-BPH group (Figure 5J). Furthermore, the metabolite levels of tyrosine metabolism showed a significant negative correlation with yield heterosis (Figure 5K). Thus, the metabolite levels of the significantly enriched pathways were predictive of yield heterosis across environments and populations.
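The contrast between a single metabolite and a pathway-level average can be illustrated with a minimal sketch. It assumes that the pathway level of a hybrid is the plain mean of its annotated metabolites, as described in the methods; the function names and all numbers below are hypothetical toy values, not data from this study.

```python
from math import sqrt
from statistics import mean


def pathway_level(sample, metabolites):
    """Average level of the listed metabolites in one sample (dict: name -> level)."""
    return mean(sample[m] for m in metabolites)


def pearson_r(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


# Hypothetical toy values: one dict of metabolite levels per hybrid.
hybrids = [
    {"l-tyrosine": 5.0, "maleic acid": 9.0, "succinate": 8.0},
    {"l-tyrosine": 6.0, "maleic acid": 6.0, "succinate": 6.0},
    {"l-tyrosine": 4.0, "maleic acid": 5.0, "succinate": 3.0},
    {"l-tyrosine": 5.0, "maleic acid": 2.0, "succinate": 1.0},
]
yield_heterosis = [-10.0, 0.0, 10.0, 25.0]  # hypothetical BPH values (%)

core = ["l-tyrosine", "maleic acid"]
extended = core + ["succinate"]  # pathway after adding one more metabolite

r_single = pearson_r([h["l-tyrosine"] for h in hybrids], yield_heterosis)
r_core = pearson_r([pathway_level(h, core) for h in hybrids], yield_heterosis)
r_extended = pearson_r([pathway_level(h, extended) for h in hybrids], yield_heterosis)

print(f"l-tyrosine alone: r = {r_single:.2f}")
print(f"pathway average:  r = {r_core:.2f}")
print(f"extended pathway: r = {r_extended:.2f}")
```

In this toy example the pathway average tracks heterosis more closely than l-tyrosine alone, which mirrors the qualitative pattern reported above; the magnitudes carry no meaning.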

Discussion

With the rapid developments in systems biology, the elucidation of molecular mechanisms and exploration of biomarkers based on metabolic pathways for complex phenotypes can accelerate the establishment of precision design programs, such as precision breeding or precision medicine. In this study, untargeted metabolite profiles and computational analyses were combined to explore the metabolomic mechanisms underlying heterosis of six agronomic traits in rice. Consistent with previous findings (Dan et al., 2019, 2020), we found that the average parental metabolite levels, which are additive metabolite profiles, are appropriate predictors for diverse over-dominant phenotypes (better parent heterosis). The changes from metabolomic additive effects to phenotypic over-dominance effects may be partially explained by the combination of hierarchical structure and multiplicative interactions of complex traits (Dan et al., 2015). Additional systematic analyses, incorporating both hybrid individuals and populations, can be performed in the near future. We determined the optimal number of heterosis-associated analytes for each trait by performing the PLS regression multiple times. This strategy makes possible the optimal selection of features for diverse phenotypes (Sprenger et al., 2018; Dan et al., 2019; Hu et al., 2019). In evaluating the performance of PLS or random forest models, changes in the number of predictive variables (top 50–3,746 predictive analytes in Figure 1C and top 5–100 predictive features in Figure 5D) yielded only slight variations in model performance, which is similar to the findings on predicting potato drought tolerance using the random forest method (Sprenger et al., 2018). We speculate that this phenomenon may arise from the inclusion of the most strongly contributing predictive variables, namely, the top 50 analytes in Figure 1C and top 5 features in Figure 5D, in all of these predictive models. 
We also analyzed the connections between metabolite levels of specific analytes and heterosis of multiple traits, which have rarely been reported in previous studies (Dan et al., 2016; Xu et al., 2016; Wilmanski et al., 2019). The overlapping heterosis-associated analytes were found to underlie the association patterns among traits. The metabolic pathways involved in heterosis were finally identified through dysregulated network analysis of the high- and low-BPH hybrids, among which the high-performance hybrids are usually selected by plant breeders, and the correlation patterns of the significantly enriched pathways were similar to those of the corresponding heterosis-associated analytes. However, we were unable to pair the analytes and metabolic pathways because the number of annotated metabolites was rather low (3% of all detected analytes), and the functions of the lipids (which account for about 50% of the annotated metabolites) were mostly unknown. The annotation of new metabolites and functional analyses are urgently required to obtain more details about the connections between predictive analytes and enriched metabolic pathways. Pathway biomarkers were developed for yield heterosis based on quantitative information on significantly enriched metabolic pathways, and the performance of these biomarkers was validated with hybrids across environments and populations. Because all metabolites per pathway, rather than a single metabolite, were used for the calculation of metabolite levels, the pathway biomarkers may overcome the negative effects of molecular heterogeneity in predicting individuals with the same performance (Menche et al., 2017; Guo et al., 2019). 
In addition, the changes in molecular levels that are triggered by environmental discrepancies can also be “buffered” by the pathway biomarkers with the inclusion of both significant and “insignificant” variables in predictive models, which may contribute to the breeding of adaptive varieties (Varshney et al., 2018; Hickey et al., 2019). The robust predictive power of the pathway biomarkers was unexpected, given that the predictability of grain weight and yield heterosis with sets of metabolites was <0.8 in previous studies (Dan et al., 2019, 2020). The metabolite levels of tyrosine metabolism were stable biomarkers for both the training and validation sets, and the average levels of the five metabolites involved in tyrosine metabolism also displayed a significant negative correlation with yield heterosis. However, the metabolite levels of l-tyrosine showed no significant correlation with yield heterosis. We believe that the metabolomic biomarkers identified in this study emphasize quantitative changes in enriched metabolic pathways rather than differences between metabolites. The metabolite levels of l-tyrosine may have significant negative correlations with yield heterosis, and the remaining metabolites involved in tyrosine metabolism (which had significant negative correlations with yield heterosis) in this study can have no correlation with yield heterosis in other hybrid populations. This contradiction can be understood as metabolomic heterogeneity among populations, similar to the expressional heterogeneity of complex diseases among patients (Menche et al., 2017; Guo et al., 2019). Furthermore, the latest findings demonstrate that changes in metabolite levels of steroid hormone biosynthesis are precisely timed to gestation in pregnant women (Liang et al., 2020). 
Thus, we anticipate that refined pathway biomarkers based on omics analyses, including genomics (Riedelsheimer et al., 2012a; Millet et al., 2019), transcriptomics (Sprenger et al., 2018; Azodi et al., 2020), proteomics (Zhang et al., 2016; Dou et al., 2020), and lipidomics (Aviram et al., 2016; de Abreu et al., 2018), may provide better predictions than the traditional sets of predictive variables. The prevailing negative correlations between metabolite levels of amino acid metabolism and carbohydrate metabolism suggest that focusing on the regulation of specific metabolic pathways may facilitate the formation of yield heterosis. With respect to the metabolomic connections of heterosis among traits, the significantly enriched pathways for the yield components always had similar correlation patterns with yield heterosis, whereas those for PH showed an opposite relationship with yield heterosis. Thus, we speculate that there is a rough balance between amino acid metabolism and carbohydrate metabolism in yield heterosis (Dan et al., 2015, 2020), and this balance may originate from metabolomic connections of the remaining reproductive traits (yield components) and vegetative traits (yield-related traits) with different degrees of contribution. The strategy of investigating metabolomic connections between the component and complex traits through overlapping pathways may be used to analyze molecular connections among different complex human diseases, given that patients with different diseases share sets of disease-associated genes (Barabasi et al., 2011; Menche et al., 2015, 2017). Our results provide a metabolomic landscape of heterosis in rice, as well as an evaluation of the application potential of biomarkers based on enriched pathways for yield heterosis. Optimal balances among specific metabolic pathways and reproductive and vegetative traits are critical for yield heterosis. 
Quantitative changes in pathway biomarkers predict yield heterosis without considering discrepancies in growth conditions and hybrid populations, indicating the wide application potential of pathway biomarkers for predicting complex phenotypes and thus achieving precision design programs.

Materials and methods

Plant materials and phenotyping

Eighteen traditional rice (O. sativa) cultivars, including both indica and japonica, were the parents of one hybrid population with a complete diallel cross design (Dan et al., 2020). Phenotypic data of five reproductive traits, namely, seed setting rate, thousand-grain weight (Dan et al., 2019), grain number per panicle, tiller number per plant, and yield per plant (YPP, Dan et al., 2020), were collected at the maturation stage. Plant height was also measured at the maturation stage. Trait values of the 18 parents and 287 hybrids were collected and used for the analyses. Another testcross population was derived from a Honglian-type cytoplasmic male-sterile line (Yuetai A) and recombinant inbred lines (F5). The YPP of the maintainer line (Yuetai B) and of the 107 parent-hybrid pairs was measured at the maturation stage. A total of 34 hybrids that were reciprocals from the diallel cross population were replanted with the testcross population, and their yield performance was recorded for analysis. Details such as locations, planting time, and plant densities of the two hybrid populations were described in a previous study (Dan et al., 2019).

Metabolomics

Metabolite profiling analysis of the parental seedlings was performed as described previously (Dan et al., 2020). Briefly, untargeted metabolite profiles of 15-d-old seedlings were collected with a 1290 Infinity liquid chromatography system (Agilent Technologies, Santa Clara, CA, USA), Agilent quadrupole time-of-flight mass spectrometer (Agilent 6550 iFunnel QTOF; Agilent Technologies, Santa Clara, CA, USA), and Triple TOF 6600 mass spectrometer (AB SCIEX, Foster City, CA, USA). The metabolites were annotated using an in-house standard spectral library, and the lipids were annotated through matching with an in-house tandem mass spectrometry (MS/MS) spectral library. Data reliability was checked using a quality control sample, and the metabolite levels of a total of 3,746 detected analytes, among which 114 metabolites were annotated using the in-house spectral libraries, were normalized (sum, log, and none) for the statistical analyses.

Identification of heterosis-associated analytes

To identify analytes that were closely associated with heterosis of each trait, we used the PLS regression method (Wold, 1975). PLS is an iterative algorithm that involves latent factors and is suitable for multivariate analysis when the number of predictor variables (X variables) greatly exceeds that of response variables (Y variables). The latent factors or latent variables, which can be numerically assessed and provide consistent information for further development of predictive models (Wold, 1975), are formed not only to maximize the explained variance of predictive variables, but also to maximize the covariance of observations (Bijlsma et al., 2006). The means of parental metabolite levels and the values of BPH were the X and Y variables, respectively. The number of latent factors was first set to 50, and the largest number of extracted latent factors was 17. The number of latent factors was then set to three or four, at which the r value was the highest among predictive models with different numbers of latent factors, to perform the second regression. To evaluate the performance of the PLS-based models, both cross-validation and a permutation test were performed to check whether the models were overfitted. Hybrids from the diallel cross population were divided into high- and low-BPH groups according to the 75th and 25th percentiles of heterosis of each trait. The PLS-discriminant analysis was then performed with the module “Statistical Analysis” on MetaboAnalyst (www.metaboanalyst.ca; Xia and Wishart, 2011). The 10-fold cross-validation method was used, and three parameters were provided to describe the model performance: prediction accuracy, sum of squares of the model (R2), and cross-validated R2 (i.e. Q2; Wold et al., 2001). The separation distance (B/W), which is the ratio of the between-group sum of squares (B) and the within-group sum of squares (W; Bijlsma et al., 2006), was selected for the permutation test (2,000 permutations). 
The relationship of the B/W distribution between the original and permuted data is indicated by the observed statistical P-value. Subsequently, the values of variable importance in the projection, which are the weighted sums of squares of the model’s weights (Wold et al., 2001), of the three or four latent factors were averaged to evaluate the importance of each analyte. To remove redundant feature information, the top 2,000, 1,500, 1,000, 500, 300, 200, 100, 50, 25, 10, and 5 analytes from the 3,746 predictive analytes were selected for multiple PLS regressions. The optimal number of predictive analytes for each trait was determined when r plateaued. The predictive analytes chosen for multiple traits were treated as overlapping heterosis-associated analytes. The parameters for heterosis-associated analytes and constants were used to describe the connections between metabolite levels and heterosis.
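The B/W permutation test described above can be sketched in a few lines. This is a simplified, one-dimensional illustration rather than the MetaboAnalyst procedure: it assumes each hybrid is summarized by a single score (for example, a hypothetical score on the first latent factor), and the function names, the toy scores, and the `(hits + 1) / (n_perm + 1)` P-value convention are assumptions made for the example.

```python
import random
from statistics import mean


def bw_ratio(values, labels):
    """Between-group sum of squares (B) divided by within-group sum of squares (W)."""
    grand = mean(values)
    groups = {}
    for v, g in zip(values, labels):
        groups.setdefault(g, []).append(v)
    between = 0.0
    within = 0.0
    for vs in groups.values():
        group_mean = mean(vs)
        between += len(vs) * (group_mean - grand) ** 2
        within += sum((v - group_mean) ** 2 for v in vs)
    return between / within


def permutation_p(values, labels, n_perm=2000, seed=0):
    """Empirical P-value: share of label shufflings whose B/W reaches the observed B/W."""
    rng = random.Random(seed)
    observed = bw_ratio(values, labels)
    shuffled = list(labels)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if bw_ratio(values, shuffled) >= observed:
            hits += 1
    return observed, (hits + 1) / (n_perm + 1)


# Hypothetical toy scores for six high-BPH and six low-BPH hybrids.
scores = [2.1, 1.8, 2.5, 2.0, 1.9, 2.3, -1.9, -2.2, -1.7, -2.4, -2.0, -1.6]
labels = ["high"] * 6 + ["low"] * 6

observed, p_value = permutation_p(scores, labels)
print(f"observed B/W = {observed:.1f}, permutation P = {p_value:.4f}")
```

A small P-value here means that random group assignments rarely separate the hybrids as well as the true high/low labels do, which is the sense in which the test guards against overfitting.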

Dysregulated network analysis

To identify the metabolic pathways involved in heterosis of the six traits, pathway enrichment analysis was performed on the diallel cross population. Because only 114 metabolites (3% of all detected analytes) had been annotated using the in-house standard spectral libraries, it was difficult to conduct pathway enrichment analysis using traditional strategies. Thus, we utilized the metabolic reaction network-based recursive algorithm (MetDNA; Shen et al., 2019), which can achieve large-scale metabolite annotations for untargeted metabolomics without depending on comprehensive standard spectral libraries. The principle of MetDNA is that metabolites in a reaction pair with similar structures tend to have similar MS2 spectra. Starting from a small library of MS2 spectra, MetDNA progressively expands the number of annotated metabolites through the recursive algorithm. The dysregulated metabolic peaks were first discovered using a univariate test (Student’s t test or Mann–Whitney–Wilcoxon test), and the dysregulated peaks with annotations were then mapped to the KEGG metabolic pathways. The metabolite level of one dysregulated pathway was the average level of all annotated metabolites in the pathway. To ensure the sensitivity and specificity of the pathway biomarkers, the diallel cross population was divided into high and low parts based on the 75th and 25th percentiles of the heterosis of each trait. When performing dysregulated network analysis with the MetDNA web server (http://metdna.zhulab.cn), the high- and low-BPH hybrids (hybrids with heterosis ≥75th and ≤25th percentiles, respectively) were the control and case groups, respectively. Analytes with m/z, retention time, and average parental metabolite levels constituted the MS1 peak table, and the raw MS/MS files (mgf format) of a quality control sample (two injections) were the MS2 data files. 
The corresponding parameters were as follows: ionization polarity, negative; liquid chromatograph, RP; MS instrument, Sciex TripleTOF; collision energy, 35 ± 15; univariate statistics, Student’s t test; species: Arabidopsis thaliana (Thale Cress); cutoff P-value, 0.05; P-value adjustment, yes. For the testcross population, the hybrids were divided into two parts (54 hybrids and 53 hybrids) in the dysregulated network analysis, according to the values of yield heterosis. Metabolic pathways were grouped according to the KEGG pathway database (https://www.genome.jp/kegg/pathway.html; Kanehisa et al., 2014).
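The grouping step and the univariate comparison can be sketched as follows. This is only an illustration of the percentile split and of a pooled-variance (Student's) t statistic; it does not reproduce MetDNA, the percentile interpolation rule (`method="inclusive"`) is an assumption, and the hybrid identifiers and values are hypothetical.

```python
from math import sqrt
from statistics import mean, quantiles, variance


def split_by_quartiles(heterosis):
    """Return (high_ids, low_ids): hybrids with heterosis >= 75th or <= 25th percentile."""
    q1, _, q3 = quantiles(list(heterosis.values()), n=4, method="inclusive")
    high = sorted(h for h, v in heterosis.items() if v >= q3)
    low = sorted(h for h, v in heterosis.items() if v <= q1)
    return high, low


def student_t(a, b):
    """Two-sample t statistic with pooled (equal) group variance."""
    na, nb = len(a), len(b)
    pooled = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / (na + nb - 2)
    return (mean(a) - mean(b)) / sqrt(pooled * (1 / na + 1 / nb))


# Hypothetical yield heterosis (BPH, %) and pathway levels for eight hybrids.
heterosis = {"H1": -20, "H2": -10, "H3": -5, "H4": 0,
             "H5": 5, "H6": 10, "H7": 20, "H8": 30}
pathway = {"H1": 6.0, "H2": 5.5, "H3": 5.0, "H4": 4.6,
           "H5": 4.1, "H6": 3.7, "H7": 3.0, "H8": 2.5}

high_ids, low_ids = split_by_quartiles(heterosis)
t_stat = student_t([pathway[h] for h in high_ids], [pathway[h] for h in low_ids])
print("high-BPH:", high_ids, "low-BPH:", low_ids, f"t = {t_stat:.2f}")
```

Converting the t statistic into a P-value requires the t distribution, which the Python standard library does not provide; a statistics package would be needed for that step.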

ROC curve analysis

Quantitative information on the significantly enriched pathways for yield heterosis was used for the ROC curve analysis with the module “Biomarker Analysis” on MetaboAnalyst (Xia and Wishart, 2011). For both univariate and multivariate ROC curve analyses, no sample normalization or data scaling was performed. The top 100 metabolite ratios (viz. pathway ratios) were computed and included to facilitate the identification of individual biomarkers (Chong et al., 2019). The top 20 metabolite ratios were computed and included in the ROC curve analyses of the 34 hybrids. Random forest (Breiman, 2001) was selected as both the classification method and the feature ranking method in the multivariate ROC curve analysis. To ensure the performance of random forest models, the “Biomarker Analysis” module performs Monte Carlo cross-validation through balanced subsampling. In each cross-validation, two-thirds of the hybrids were used to evaluate feature rankings, and the top 2, 3, 5, 10, etc., important analytes were selected to build classification models, which were then validated with the remaining one-third of the hybrids. The cross-validation procedures were repeated 500 times to calculate the performance and 95% confidence interval (95% confidence band) for each model.
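A univariate ROC analysis of a pathway ratio can be sketched as below. The sketch computes the AUC from pairwise comparisons (equivalent to the Mann-Whitney statistic) and picks a cutoff by maximizing Youden's J; the cutoff rule, the choice of low-BPH hybrids as the "positive" class, and all pathway levels are assumptions for illustration and are not taken from MetaboAnalyst or from the study data.

```python
def auc(scores_pos, scores_neg):
    """Probability that a random positive scores above a random negative (ties count 0.5)."""
    wins = 0.0
    for p in scores_pos:
        for n in scores_neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(scores_pos) * len(scores_neg))


def best_cutoff(scores_pos, scores_neg):
    """Cutoff maximizing Youden's J = sensitivity + specificity - 1 (first maximum wins)."""
    best_j, best_c = float("-inf"), None
    for c in sorted(set(scores_pos) | set(scores_neg)):
        sensitivity = sum(p >= c for p in scores_pos) / len(scores_pos)
        specificity = sum(n < c for n in scores_neg) / len(scores_neg)
        j = sensitivity + specificity - 1
        if j > best_j:
            best_j, best_c = j, c
    return best_c, best_j


# Hypothetical (tyrosine metabolism, sulfur metabolism) pathway levels per hybrid.
low_bph = [(7.0, 10.0), (6.2, 10.0), (5.8, 10.0), (6.6, 10.0)]
high_bph = [(4.0, 10.0), (5.2, 10.0), (6.0, 10.0), (4.5, 10.0)]

# Pathway ratio used as the single candidate biomarker.
low_ratios = [tyr / sul for tyr, sul in low_bph]
high_ratios = [tyr / sul for tyr, sul in high_bph]

ratio_auc = auc(low_ratios, high_ratios)
cutoff, youden_j = best_cutoff(low_ratios, high_ratios)
print(f"AUC = {ratio_auc:.4f}, cutoff = {cutoff:.2f}, J = {youden_j:.2f}")
```

The multivariate analysis in the text additionally ranks features with random forest inside repeated Monte Carlo cross-validation; that part is not shown here because it depends on an external machine-learning library.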

Statistical analyses

Pearson correlations between heterosis and transformed parental metabolite levels, among heterosis of the investigated traits (pairwise) and among heterosis-associated analytes (pairwise), were performed using the analysis path of “Correlation Heatmaps” in the module “Statistical Analysis” on MetaboAnalyst (Xia and Wishart, 2011). Correlations with P < 0.05 were considered significant. Empirical Bayesian analysis of differential analytes for the high- and low-BPH groups was performed with the analysis path of “Empirical Bayesian Analysis of Metabolites.” An equal group variance was assumed, and 0.9 was set as the fudge factor (a0) and posterior delta. Unpaired t tests (adjusted P-value cutoff: 0.05) with equal group variance were performed between the high- and low-BPH groups with the analysis path of “T tests.” Compound names of the annotated metabolites were converted into KEGG IDs with the analysis path of “Compound ID Conversion” in the module “Other Utilities.” PLS regressions of BPH and metabolite levels were performed using SPSS (IBM SPSS Statistics for Windows, Version 20.0. Armonk, NY: IBM Corp.). Partial correlations (two-tailed) between the four yield components/PH and yield were performed to investigate the contribution of the four yield components and PH to yield heterosis using SPSS. The two analyzed traits were variables, and the remaining four traits were treated as control variables in partial correlations. Pearson correlations (two-tailed) between the observed and predicted BPH, or between metabolite levels and BPH, were implemented using SPSS, with the correlation coefficient as predictability. Stepwise regression was used to describe yield heterosis (dependent variable) with the four components and PH (independent variables) using SPSS. 
Independent samples t test (two-tailed) and paired samples t test (two-tailed) were used to compare the differences in pathway levels between the high- and low-BPH hybrids and phenotypic differences in the 34 hybrids across growth conditions using SPSS. Venn diagrams were drawn using a webtool from http://bioinformatics.psb.ugent.be/webtools/Venn.

Accession numbers

All phenotypic data are provided in the supplemental data, and the raw metabolite profiles were deposited in the MetaboLights metabolomics database (MTBLS742; Dan et al., 2020).

Supplemental data

The following materials are available in the online version of this article.
Supplemental Figure S1. Heatmap for correlations between heterosis and transformed parental metabolite levels.
Supplemental Figure S2. Determining the number of latent factors for heterosis of each trait.
Supplemental Figure S3. Cross-validation and permutation test of the PLS-based models.
Supplemental Figure S4. Scatter plots for observed and metabolome-predicted heterosis of the four yield components.
Supplemental Figure S5. Correlations between the metabolite levels of overlapping heterosis-associated analytes and heterosis of yield and yield components.
Supplemental Figure S6. Scatter plots for observed and yield components-predicted yield heterosis.
Supplemental Figure S7. Correlations between the metabolite levels of overlapping heterosis-associated analytes and heterosis of PH and yield and yield components.
Supplemental Figure S8. Heatmap for correlations among the 100 yield heterosis-associated analytes.
Supplemental Figure S9. Correlation between average levels of metabolites in cyanoamino acid metabolism and propanoate metabolism.
Supplemental Figure S10. Heatmaps for correlations among the screened heterosis-associated analytes and enriched metabolic pathways for SSR.
Supplemental Figure S11. Heatmaps for correlations among the screened heterosis-associated analytes and enriched metabolic pathways for TGW.
Supplemental Figure S12. Heatmaps for correlations among the screened heterosis-associated analytes and enriched metabolic pathways for grain number per plant.
Supplemental Figure S13. Multivariate ROC curve analysis of high- and low-BPH hybrids from the diallel cross population.
Supplemental Figure S14. Correlations between the average levels of metabolites and yield heterosis.
Supplemental Figure S15. AUC for tyrosine metabolism based on the univariate ROC curve analysis of 34 hybrids.
Supplemental Figure S16. Multivariate ROC curve analysis of the 34 hybrids.
Supplemental Table S1. Phenotypic data of parents and hybrids.
Supplemental Table S2. Overlapping heterosis-associated analytes among traits.
Supplemental Table S3. Correlations between metabolite levels of overlapping heterosis-associated analytes and BPH.
Supplemental Table S4. Metabolite levels of the enriched pathways for yield heterosis of the high- and low-BPH hybrids from the diallel cross population.
Supplemental Table S5. T test of metabolite levels of the enriched pathways for yield heterosis.
Supplemental Table S6. Correlations between metabolite levels of the enriched pathways and yield heterosis.
Supplemental Table S7. Correlations of six metabolites in cyanoamino acid metabolism and propanoate metabolism.
Supplemental Table S8. Univariate ROC curve analysis of the 17 significantly enriched pathways for yield heterosis.
Supplemental Table S9. Multivariate ROC curve analysis of the 17 significantly enriched pathways for yield heterosis.
Supplemental Table S10. Correlations between yield heterosis and the five annotated metabolites in tyrosine metabolism.
Supplemental Table S11. Univariate ROC curve analysis of the significantly enriched pathways for yield heterosis of the 34 hybrids.
Supplemental Table S12. Multivariate ROC curve analysis of the significantly enriched pathways for yield heterosis of the 34 hybrids.
Supplemental Table S13. Metabolite levels of the enriched pathways for hybrids from the testcross population.
  46 in total

1.  Genome-based establishment of a high-yielding heterotic pattern for hybrid wheat breeding.

Authors:  Yusheng Zhao; Zuo Li; Guozheng Liu; Yong Jiang; Hans Peter Maurer; Tobias Würschum; Hans-Peter Mock; Andrea Matros; Erhard Ebmeyer; Ralf Schachschneider; Ebrahim Kazman; Johannes Schacht; Manje Gowda; C Friedrich H Longin; Jochen C Reif
Journal:  Proc Natl Acad Sci U S A       Date:  2015-12-09       Impact factor: 11.205

2.  Web-based inference of biological patterns, functions and pathways from metabolomic data using MetaboAnalyst.

Authors:  Jianguo Xia; David S Wishart
Journal:  Nat Protoc       Date:  2011-05-05       Impact factor: 13.491

3.  Genomic and metabolic prediction of complex heterotic traits in hybrid maize.

Authors:  Christian Riedelsheimer; Angelika Czedik-Eysenberg; Christoph Grieder; Jan Lisec; Frank Technow; Ronan Sulpice; Thomas Altmann; Mark Stitt; Lothar Willmitzer; Albrecht E Melchinger
Journal:  Nat Genet       Date:  2012-01-15       Impact factor: 38.330

4.  Genome-wide association analyses provide genetic and biochemical insights into natural variation in rice metabolism.

Authors:  Wei Chen; Yanqiang Gao; Weibo Xie; Liang Gong; Kai Lu; Wensheng Wang; Yang Li; Xianqing Liu; Hongyan Zhang; Huaxia Dong; Wan Zhang; Lejing Zhang; Sibin Yu; Gongwei Wang; Xingming Lian; Jie Luo
Journal:  Nat Genet       Date:  2014-06-08       Impact factor: 38.330

5.  Proteogenomic Characterization of Endometrial Carcinoma.

Authors:  Yongchao Dou; Emily A Kawaler; Daniel Cui Zhou; Marina A Gritsenko; Chen Huang; Lili Blumenberg; Alla Karpova; Vladislav A Petyuk; Sara R Savage; Shankha Satpathy; Wenke Liu; Yige Wu; Chia-Feng Tsai; Bo Wen; Zhi Li; Song Cao; Jamie Moon; Zhiao Shi; MacIntosh Cornwell; Matthew A Wyczalkowski; Rosalie K Chu; Suhas Vasaikar; Hua Zhou; Qingsong Gao; Ronald J Moore; Kai Li; Sunantha Sethuraman; Matthew E Monroe; Rui Zhao; David Heiman; Karsten Krug; Karl Clauser; Ramani Kothadia; Yosef Maruvka; Alexander R Pico; Amanda E Oliphant; Emily L Hoskins; Samuel L Pugh; Sean J I Beecroft; David W Adams; Jonathan C Jarman; Andy Kong; Hui-Yin Chang; Boris Reva; Yuxing Liao; Dmitry Rykunov; Antonio Colaprico; Xi Steven Chen; Andrzej Czekański; Marcin Jędryka; Rafał Matkowski; Maciej Wiznerowicz; Tara Hiltke; Emily Boja; Christopher R Kinsinger; Mehdi Mesri; Ana I Robles; Henry Rodriguez; David Mutch; Katherine Fuh; Matthew J Ellis; Deborah DeLair; Mathangi Thiagarajan; D R Mani; Gad Getz; Michael Noble; Alexey I Nesvizhskii; Pei Wang; Matthew L Anderson; Douglas A Levine; Richard D Smith; Samuel H Payne; Kelly V Ruggles; Karin D Rodland; Li Ding; Bing Zhang; Tao Liu; David Fenyö
Journal:  Cell       Date:  2020-02-13       Impact factor: 41.582

6.  A directed learning strategy integrating multiple omic data improves genomic prediction.

Authors:  Xuehai Hu; Weibo Xie; Chengchao Wu; Shizhong Xu
Journal:  Plant Biotechnol J       Date:  2019-04-14       Impact factor: 9.803

7.  Comprehensive metabolic profiling and phenotyping of interspecific introgression lines for tomato improvement.

Authors:  Nicolas Schauer; Yaniv Semel; Ute Roessner; Amit Gur; Ilse Balbo; Fernando Carrari; Tzili Pleban; Alicia Perez-Melis; Claudia Bruedigam; Joachim Kopka; Lothar Willmitzer; Dani Zamir; Alisdair R Fernie
Journal:  Nat Biotechnol       Date:  2006-03-12       Impact factor: 54.908

8.  Metabolome-based genome-wide association study of maize kernel leads to novel biochemical insights.

Authors:  Weiwei Wen; Dong Li; Xiang Li; Yanqiang Gao; Wenqiang Li; Huihui Li; Jie Liu; Haijun Liu; Wei Chen; Jie Luo; Jianbing Yan
Journal:  Nat Commun       Date:  2014-03-17       Impact factor: 14.919

9.  Integrating personalized gene expression profiles into predictive disease-associated gene pools.

Authors:  Jörg Menche; Emre Guney; Amitabh Sharma; Patrick J Branigan; Matthew J Loza; Frédéric Baribaud; Radu Dobrin; Albert-László Barabási
Journal:  NPJ Syst Biol Appl       Date:  2017-03-13

10.  Hierarchical additive effects on heterosis in rice (Oryza sativa L.).

Authors:  Zhiwu Dan; Jun Hu; Wei Zhou; Guoxin Yao; Renshan Zhu; Wenchao Huang; Yingguo Zhu
Journal:  Front Plant Sci       Date:  2015-09-11       Impact factor: 5.753
