François Vasseur1, Denis Cornet2,3, Grégory Beurier2,3, Julie Messier4, Lauriane Rouan2,3, Justine Bresson1, Martin Ecarnot3, Mark Stahl5, Simon Heumos6,7, Marianne Gérard1, Hans Reijnen1, Pascal Tillard8, Benoît Lacombe8, Amélie Emanuel1,8, Justine Floret1,9, Aurélien Estarague1, Stefania Przybylska1, Kevin Sartori1, Lauren M Gillespie1, Etienne Baron1, Elena Kazakou10, Denis Vile9, Cyrille Violle1.
Abstract
The trait-based approach in plant ecology aims at understanding and classifying the diversity of ecological strategies by comparing plant morphology and physiology across organisms. The major drawback of the approach is that the time and financial cost of measuring the traits across many individuals and environments can be prohibitive. We show that combining near-infrared spectroscopy (NIRS) with deep learning resolves this limitation by quickly, non-destructively, and accurately measuring a suite of traits, including plant morphology, chemistry, and metabolism. Such an approach also makes it possible to position plants within the well-known CSR triangle that depicts the diversity of plant ecological strategies. The processing of NIRS through deep learning identifies the effect of growth conditions on trait values, an issue that plagues traditional statistical approaches. Together, the coupling of NIRS and deep learning is a promising high-throughput approach to capture a range of ecological information on plant diversity and functioning and can accelerate the creation of extensive trait databases.
Keywords: Arabidopsis thaliana; functional traits; machine learning; metabolomics; multivariate analysis; near-infrared spectroscopy (NIRS); trait-based ecology
Year: 2022 PMID: 35668791 PMCID: PMC9163986 DOI: 10.3389/fpls.2022.836488
Source DB: PubMed Journal: Front Plant Sci ISSN: 1664-462X Impact factor: 6.627
Figure I. Leaf reflectance as a function of light wavelength. All spectra available in the database used to analyze the ability of spectral reflectance to predict trait values and plant categories are represented here and colored according to the experiment they come from (see Supplementary Table S1 and Supplementary Material for details about experiments, conditions, as well as number of spectra per experiment). Colored lines represent the mean absorbance spectra, light grey lines represent the median absorbance spectra, the dark shaded area represents spectra with absorbance ranging between the 5th and 95th percentiles, and the light shaded area represents the entire absorbance range covered by the spectra.
Prediction accuracy for functional traits.
| Variable | n | SD | R² | RMSE | Bias | Slope | RPD |
|---|---|---|---|---|---|---|---|
| LDMC (mg g−1) | 2,932 | 52.73 | 0.86 | 16.10 | 0.38 | 1.06 | 3.28 |
| SLA (mm2 mg−1) | 3,423 | 20.90 | 0.85 | 7.47 | 0.14 | 1.01 | 2.80 |
| LNC (%) | 1,961 | 2.18 | 0.93 | 0.53 | −0.06 | 0.97 | 4.12 |
| Leaf thickness (μm) | 4,143 | 178.08 | 0.89 | 69.49 | 2.79 | 1.02 | 2.56 |
| RWC (%) | 1,421 | 22.06 | 0.17 | 4.52 | 0.40 | 1.27 | 4.88 |
| LCC (%) | 1,960 | 4.78 | 0.65 | 1.17 | 0.03 | 0.86 | 4.10 |
| δ13C | 1,222 | 1.59 | 0.83 | 0.62 | −0.04 | 0.95 | 2.56 |
| δ15N | 1,223 | 3.76 | 0.28 | 1.83 | −0.13 | 0.82 | 2.06 |
| Plant lifespan (days) | 1,403 | 10.55 | 0.17 | 8.01 | −1.31 | 0.86 | 1.32 |
| Plant growth rate (mg d−1) | 701 | 0.01 | 0.53 | 0.00 | 0.00 | 0.96 | 1.94 |
| | 2,905 | 10.25 | 0.88 | 3.28 | −0.02 | 1.03 | 3.13 |
| | 2,905 | 11.64 | 0.75 | 2.57 | 0.19 | 1.11 | 4.53 |
| | 2,905 | 17.03 | 0.87 | 4.79 | 0.33 | 0.99 | 3.55 |
LDMC, leaf dry matter content; SLA, specific leaf area; LNC, leaf nitrogen content; RWC, relative water content; LCC, leaf carbon content; δ13C, fraction of 13C isotope; and δ15N, fraction of 15N isotope. CSR scores were estimated from leaf traits by the algorithm from Pierce et al. (2017). n is the total number of leaves from our database used for modelling, that is, leaves associated with both trait and spectra measurements. All predictions have been obtained from convolutional neural network (CNN) models (see Supplementary Material for details). SD, standard deviation; RMSE, root mean square error; and RPD, ratio of performance to deviation (SD/RMSE).
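The validation statistics in the table can be recomputed from paired observed and predicted trait values. The following is a minimal sketch, not the authors' code: the function name `validation_metrics` is hypothetical, and the sign convention for bias (predicted minus observed), the regression direction for the slope (observed on predicted), and the use of the sample standard deviation are assumptions. The one relation that can be checked against the table is RPD = SD/RMSE (for LDMC, 52.73/16.10 ≈ 3.28).

```python
import math


def validation_metrics(observed, predicted):
    """Return RMSE, bias, slope and RPD for paired observed/predicted values.

    Assumptions (not stated in the source): bias is mean(predicted - observed),
    slope is the least-squares slope of observed regressed on predicted,
    and SD is the sample standard deviation of the observed values.
    """
    n = len(observed)
    residuals = [p - o for o, p in zip(observed, predicted)]
    rmse = math.sqrt(sum(r * r for r in residuals) / n)
    bias = sum(residuals) / n

    mean_o = sum(observed) / n
    mean_p = sum(predicted) / n
    sxy = sum((p - mean_p) * (o - mean_o) for o, p in zip(observed, predicted))
    sxx = sum((p - mean_p) ** 2 for p in predicted)
    slope = sxy / sxx

    sd = math.sqrt(sum((o - mean_o) ** 2 for o in observed) / (n - 1))
    return {"rmse": rmse, "bias": bias, "slope": slope, "rpd": sd / rmse}


# Consistency check against the LDMC row of the table (SD = 52.73, RMSE = 16.10).
ldmc_rpd = round(52.73 / 16.10, 2)  # 3.28, the tabulated RPD
```

A higher RPD means the prediction error is small relative to the natural spread of the trait, which is why it is reported alongside RMSE rather than instead of it.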
Figure 1. Predictions of the leaf economics spectrum and CSR strategies. Log10 relationships between specific leaf area (SLA, mm2 mg−1) and leaf nitrogen content (LNC, %; A); between leaf nitrogen content (LNC, %) and leaf dry matter content (LDMC, mg g−1; B). Only predicted values in the validation dataset (1/4 of the whole dataset, n = 123) were plotted here. Observed trait values are colored in blue and predicted trait values are colored in red. Regression lines have been estimated by standard major axis (SMA). P is the p value of the SMA test of slope difference between observed and predicted relationships. (C) 3D representation of the leaf economics spectrum between observed and predicted trait values in the validation dataset (n = 123). (D) CSR triangle between observed and predicted trait values in the validation dataset (n = 699) depicting the variation of plant ecological strategies between competitive ability (C), stress-tolerance (S), and ruderalism (R). CSR scores (%) have been measured from leaf traits following the method from Pierce et al. (2017) (see Supplementary Material). Only measurements performed on fully expanded but non-senescing leaves, and only under non-stressing conditions, were used here.
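For readers unfamiliar with standard major axis regression, the sketch below fits an SMA line to log10-transformed trait pairs using the textbook estimator, slope = sign(r) · sd(y)/sd(x). It is an illustration under stated assumptions only: the function names `sma_fit` and `log10_sma` are hypothetical, the example values are made up, and the test of slope difference mentioned in the caption is not implemented here.

```python
import math


def sma_fit(x, y):
    """Standard major axis fit of y on x.

    The slope is sd(y) / sd(x), signed like the covariance; the line
    passes through the point of means. Returns (slope, intercept).
    """
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    # copysign gives a positive slope when the covariance is exactly zero.
    slope = math.copysign(math.sqrt(syy / sxx), sxy)
    intercept = mean_y - slope * mean_x
    return slope, intercept


def log10_sma(x, y):
    """Fit the SMA line on log10-transformed values (all values must be > 0)."""
    return sma_fit([math.log10(a) for a in x], [math.log10(b) for b in y])


# Hypothetical trait pairs, for illustration only (not data from the study).
slope, intercept = log10_sma([1, 10, 100], [2, 20, 200])
```

Unlike ordinary least squares, SMA treats both variables symmetrically, which suits trait-trait relationships where neither axis is a clear predictor.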
Prediction accuracy for five plant categories.
| Category (number of classes) | Calibration accuracy (proportion) | Validation accuracy (proportion) |
|---|---|---|
| Survival (2) | 0.988 | 0.915 |
| Genotypes (10) | 0.831 | 0.640 |
| Indoor/Outdoor (2) | 0.998 | 1.000 |
| CSR categories (11) | 0.980 | 0.700 |
| Treatment (2) | 0.955 | 0.714 |
Plant survival has two categories (dead or alive), which were measured according to the protocol described in Estarague et al. (2021). Genotypes have 10 categories corresponding to the 10 natural accessions used here. Indoor/outdoor represents whether a plant has been grown in a greenhouse or growth chamber (indoor) or in a common garden (outdoor) across all the experiments included in the database used here. CSR categories are the intermediate CSR classes estimated from leaf traits by the algorithm from Pierce et al. (2017), such as R/SR, S/SC, RS, and C/CSR (see Supplementary Material). Treatment has two categories (control and water stress) from the dedicated experiments included in the database (see Supplementary Material). All predictions have been obtained from CNN models.
Figure 2. Prediction accuracy of plant survival and growth conditions. Confusion matrices showing the classification performance for the prediction of (A) plant survival (positive P) and mortality (negative N), and (B) the growth condition: indoor (positive P) vs. outdoor (negative N). Precision score = true P/(false P + true P). Recall score = true P/(false N + true P). Accuracy score = (true P + true N)/(true P + false N + true N + false P). F1 score = 2 × Precision score × Recall score/(Precision score + Recall score).
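The four scores defined in the caption follow directly from the cells of a binary confusion matrix. A minimal sketch is given below; the function name `classification_scores` and the example counts are hypothetical and are not taken from the study, and the function does not guard against empty classes (zero denominators).

```python
def classification_scores(true_p, false_p, true_n, false_n):
    """Compute precision, recall, accuracy and F1 from confusion-matrix counts,
    using the definitions given in the figure caption."""
    precision = true_p / (false_p + true_p)
    recall = true_p / (false_n + true_p)
    accuracy = (true_p + true_n) / (true_p + false_n + true_n + false_p)
    f1 = 2 * precision * recall / (precision + recall)
    return {
        "precision": precision,
        "recall": recall,
        "accuracy": accuracy,
        "f1": f1,
    }


# Hypothetical counts, for illustration only.
scores = classification_scores(true_p=8, false_p=2, true_n=9, false_n=1)
```

Precision and recall are computed with respect to the class labelled positive, so swapping which class is "positive" (for example, survival vs. mortality) changes both scores while leaving accuracy unchanged.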
Prediction accuracy for 67 metabolites.
| Class | Metabolite | SD | R² | RMSE | Bias | Slope | RPD |
|---|---|---|---|---|---|---|---|
| Sugars | Glucose | 6764.56 | 0.14 | 1621.88 | −4.49 | 0.95 | 4.17 |
| | Fructose | 10240.92 | 0.56 | 1316.93 | 352.08 | 1.17 | 7.78 |
| | Sucrose | 11380.72 | 0.00 | 2086.69 | 538.48 | −12.55 | 5.45 |
| | Fucose | 28.65 | 0.03 | 1.90 | 0.37 | 0.75 | 15.04 |
| | Isomaltose | 26.02 | 0.16 | 6.58 | 1.44 | 1.41 | 3.95 |
| | Cellobiose | 157.51 | 0.39 | 73.21 | 19.87 | 1.85 | 2.15 |
| | Arabinose | 37.57 | 0.00 | 51.42 | 9.39 | 100.65 | 0.73 |
| | Galactose | 293.66 | 0.18 | 304.29 | 82.21 | 1.11 | 0.97 |
| | Inositol | 911.06 | 0.31 | 136.28 | 23.17 | 1.29 | 6.69 |
| | Maltose | 58.40 | 0.02 | 57.31 | 19.37 | 0.86 | 1.02 |
| | Mannose | 219.79 | 0.42 | 35.78 | 12.77 | 2.19 | 6.14 |
| | Raffinose | 644.65 | 0.57 | 457.00 | 112.77 | 1.12 | 1.41 |
| | Rhamnose | 68.56 | 0.02 | 95.56 | 17.09 | −1150.74 | 0.72 |
| | Ribose | 32.35 | 0.00 | 42.17 | 13.41 | 138.61 | 0.77 |
| | Palatinose | 236.89 | 0.00 | 294.60 | 36.80 | −5.60 | 0.80 |
| | Melezitose | 15.62 | 0.38 | 7.47 | 1.31 | 1.26 | 2.09 |
| | Melibiose | 200.00 | 0.09 | 264.69 | 47.47 | 0.69 | 0.76 |
| | Trehalose | 176.00 | 0.00 | 146.34 | 23.78 | −1.69 | 1.20 |
| | Xylose | 35.75 | 0.13 | 7.09 | 1.54 | 1.32 | 5.04 |
| Hormones | ABA | 12.54 | 0.06 | 11.25 | 1.43 | 0.57 | 1.12 |
| | IAA | 21.37 | 0.26 | 18.16 | 1.84 | 0.95 | 1.18 |
| | JA | 337.70 | 0.29 | 197.91 | 31.53 | 1.03 | 1.71 |
| | SA | 799.00 | 0.00 | 495.41 | 147.44 | −10.54 | 1.61 |
| | CMLX | 7277.61 | 0.02 | 8086.67 | 2421.27 | 63.66 | 0.90 |
| Glucosinolates | Glucoalyssin | 28.79 | 0.10 | 27.76 | 3.95 | 1.05 | 1.04 |
| | Glucobrassicin | 1462.69 | 0.15 | 914.32 | 210.01 | 0.76 | 1.60 |
| | Glucoerucin | 12.22 | 0.39 | 5.88 | 0.51 | 0.86 | 2.08 |
| | Gluconapin | 5005.90 | 0.00 | 4703.53 | 2123.30 | 0.43 | 1.06 |
| | Gluconasturtiin | 94.36 | 0.00 | 91.73 | 12.46 | 0.63 | 1.03 |
| | Glucoraphanin | 1308.98 | 0.00 | 1166.48 | 250.14 | 0.22 | 1.12 |
| | Glucoraphenin | 1.78 | 0.74 | 0.62 | 0.07 | 0.91 | 2.88 |
| | Epigallocatechin | 210.86 | 0.27 | 163.05 | 2.91 | 0.83 | 1.29 |
| | Progoitrin | 666.26 | 0.01 | 564.65 | 135.83 | 0.38 | 1.18 |
| | Epiprogoitrin | 6316.22 | 0.09 | 5944.42 | 1814.64 | 0.74 | 1.06 |
| | Isobutyl | 473.57 | 0.03 | 356.50 | 56.56 | 0.67 | 1.33 |
| | Glucosinalbin | 10.35 | 0.00 | 7.96 | 1.28 | 2.52 | 1.30 |
| | Sinigrin | 4445.20 | 0.07 | 4259.39 | 1571.86 | 1.04 | 1.04 |
| | Hexyl | 49.96 | 0.00 | 45.61 | 12.28 | 0.53 | 1.10 |
| | Butyl | 5.49 | 0.51 | 3.20 | −0.24 | 1.07 | 1.72 |
| | Neoglucobrassicin Peak1 | 265.97 | 0.73 | 273.80 | 59.08 | 1.86 | 0.97 |
| | Neoglucobrassicin Peak2 | 1051.25 | 0.06 | 254.92 | 24.16 | 0.41 | 4.12 |
| | X3MTP | 47.48 | 0.51 | 9.63 | 0.36 | 1.41 | 4.93 |
| | X5MTP | 20.76 | 0.61 | 11.56 | 1.14 | 1.40 | 1.80 |
| | X6MSH | 51.83 | 0.22 | 48.64 | 9.55 | 1.09 | 1.07 |
| | X7MSH | 261.68 | 0.18 | 277.93 | 88.23 | 1.19 | 0.94 |
| | X7MTH | 244.30 | 0.36 | 224.81 | 36.56 | 1.04 | 1.09 |
| | X8MSO | 2013.33 | 0.31 | 1528.42 | 169.92 | 0.87 | 1.32 |
| | X8MTO | 1278.38 | 0.17 | 1053.50 | 176.17 | 0.85 | 1.21 |
| Other secondary metabolites | Apigenin rutinoside | 1140.31 | 0.31 | 848.50 | 73.33 | 0.63 | 1.34 |
| | Caffeic acid | 30.01 | 0.32 | 0.96 | −0.20 | 0.74 | 31.31 |
| | Chlorogenic acid | 29.55 | 0.66 | 16.29 | 1.38 | 1.09 | 1.81 |
| | Citrate | 2647.54 | 0.44 | 1894.98 | 169.09 | 1.08 | 1.40 |
| | Cyanidin rhamnoside | 1431.34 | 0.53 | 842.46 | −56.16 | 0.81 | 1.70 |
| | Cyanidin sophoroside glucoside | 674.85 | 0.31 | 387.08 | 88.61 | 1.04 | 1.74 |
| | Dihydro caffeoyl glucuronide | 27.05 | 0.85 | 8.96 | 0.01 | 1.12 | 3.02 |
| | Fumarate | 294.76 | 0.10 | 174.41 | 18.17 | 0.68 | 1.69 |
| | Kaempferol glucosyl rhamnosyl glucoside | 989.20 | 0.14 | 518.91 | 97.70 | 0.69 | 1.91 |
| | Kaempferol rutinoside | 2788.98 | 0.59 | 1613.31 | 127.58 | 0.88 | 1.73 |
| | Kaempferol xylosyl rhamnoside | 1362.13 | 0.56 | 774.66 | 7.04 | 0.88 | 1.76 |
| | Malate | 1078.18 | 0.16 | 786.53 | 133.47 | 0.61 | 1.37 |
| | m-Coumaric acid | 144.26 | 0.00 | 143.67 | 18.09 | 0.84 | 1.00 |
| | p-Coumaric acid | 4.00 | 0.46 | 1.35 | −0.08 | 1.02 | 2.95 |
| | Pelargonidin coumaroyl diglucoside glucoside | 69.47 | 0.65 | 34.69 | −0.28 | 0.94 | 2.00 |
| | Pelargonidin sambubioside | 291.72 | 0.47 | 223.17 | 13.31 | 0.81 | 1.31 |
| | Prenyl naringenin | 36.74 | 0.63 | 14.89 | −2.09 | 0.93 | 2.47 |
| | Quercetin glucoside | 56.73 | 0.23 | 54.09 | 11.77 | 1.41 | 1.05 |
| | Succinate | 60.74 | 0.16 | 45.15 | 0.70 | 0.93 | 1.35 |
Metabolites have been measured with GC–MS or LC–MS depending on the metabolite (n = 124 per metabolite) on leaves harvested from 4-week-old plants grown in the greenhouse. Sugars are given in μmol/gFW; hormones in ng/gFW. For glucosinolates and other secondary metabolites, foliar relative concentrations were estimated by dividing the peak area corresponding to the metabolite by the fresh weight of the sample. SD, standard deviation; RMSE, root mean square error; and RPD, ratio of performance to deviation (SD/RMSE). All predictions have been obtained from CNN models (see Supplementary Material for details).