Cultivar-specific nutritional status of potato (Solanum tuberosum L.) crops.

Zonlehoua Coulibali, Athyna Nancy Cambouris, Serge-Étienne Parent.

Abstract

Gradients in the elemental composition of potato leaf tissue (i.e., its ionome) can be linked to crop potential. Because the ionome is a function of genetics and environmental conditions, practitioners aim to fine-tune fertilization to obtain an optimal ionome based on the needs of potato cultivars. Our objective was to assess the validity of cultivar grouping and to predict potato tuber yields using foliar ionomes. The dataset comprised 3382 observations collected in Québec (Canada) from 1970 to 2017. The first mature leaves from the top were sampled at the beginning of flowering for total N, P, K, Ca, and Mg analysis. We preprocessed nutrient concentrations (ionomes) by centering each nutrient on the geometric mean of all nutrients and a filling value, a transformation known as row-centered log ratios (clr). A density-based clustering algorithm (dbscan) applied to these preprocessed ionomes failed to delineate groups of high-yield cultivars. We also used the preprocessed ionomes to assess their effects on tuber yield classes (high and low yields) on a cultivar basis using k-nearest neighbors, random forest and support vector machine classification algorithms. Our machine learning models returned an average accuracy of 70%, a fair diagnostic potential for detecting in-season nutrient imbalance of potato cultivars using clr variables, considering potential confounding factors. The optimal ionomic region of a new cultivar could be assigned to that of the closest documented cultivar.

Year:  2020        PMID: 32168339      PMCID: PMC7069643          DOI: 10.1371/journal.pone.0230458

Source DB:  PubMed          Journal:  PLoS One        ISSN: 1932-6203            Impact factor:   3.240


1 Introduction

Potato cultivars are commonly classified into maturity groups based on the number of days from planting to maturity [1]. Compared to other maturity groups, cultivars with longer maturity generally show similar or higher yield potential [2-4] because of higher genetic potential [5] related to higher foliar nitrogen status [6] and root acquisition rate [7]. Hence, nutrient management of potato cultivars often considers the cultivar maturity group. However, nutrient profiles or ionomes [8, 9] may vary among potato cultivars of the same maturity group because cultivars inherit specific traits for nutrient absorption and assimilation from a diversity of parents [10]. Indeed, White et al. [11] provided evidence of important ionome variations in angiosperm species and stated that plant families could be distinguished by their shoot ionomes. Successful classifications of plant species based on axis reductions have been implemented on compositionally preprocessed plant ionomes [12, 13]. Potato cultivars may be classified similarly, allowing newly introduced cultivars to benefit from the documented nutrient management of older cultivars. Hence, the foliar ionome, easily collected from field trials, could provide a tool for the fertilization of newly introduced cultivars. The tissue ionome portrays plant nutritional status [13] under the assumption of causal relationships between plant growth rate and nutrient concentration in a tissue [14, 15]. In survey datasets, reference compositions are those that are nutritionally balanced [12]. Imbalanced ionomes could theoretically be rebalanced through a perturbation operation [16], i.e., a change in tissue composition after nutrient stress has been applied. Any factor impacting yield response to nutrients can perturb leaf composition [17]. Fertilization perturbs soil composition [18] by supplying readily available plant nutrients [19].
Because nutrients interact in the plant, Baxter [20] suggested that the ionome could be treated as a combination of elements rather than elements taken in isolation. Parent [13] described ionomes as multivariate balance systems of isometric log-ratios [16]. Isometric log-ratios map vectors of concentrations, which are strictly positive data constrained to the measurement unit and convey only relative information, to a real space of orthonormal coordinates [21]. Indeed, ionomes are intrinsically multivariate: each part cannot be interpreted without being related to the other parts of the whole [22]. Parent and Dafir [23] developed the compositional nutrient diagnosis in plants using row-centered log-ratios (clr). Thereafter, compositional data transformation has been used to preprocess combined nutrient traits of plant species and cultivars [13, 24–26] as well as animal species [27] and human food [28, 29]. The first objective of this study was to identify clusters of potato cultivars based on their leaf ionomes. The second objective was to develop, evaluate and compare the performance of machine learning algorithms in predicting yield categories using ionomes. The third objective was to develop a conceptual workflow to adjust the ionome of potato cultivars using compositional perturbations. Our hypotheses were that (1) nutritionally balanced leaf ionomes differ among potato cultivars, (2) tuber yield is affected by specific leaf compositional traits, and (3) cultivar-specific leaf ionomes could be rebalanced using a perturbation operation.

2 Methodology

2.1 Data set

The data set is a collection of potato surveys and nitrogen (N), phosphorus (P) and potassium (K) fertilizer trials conducted in the province of Québec (Canada) from 1970 to 2017 (S1 Table) between the US border (45th parallel) and the northern limit of cultivation (49th parallel). The data set was filtered to remove foliar samples collected too early or too late relative to the beginning (10%) of flowering, as reported by scouting teams, and samples where three or more of the five major elements (N, P, K, Ca and Mg) had not been quantified. The complete data set comprised 3382 observations from 151 field trials. Five maturity classes were represented, and we matched the duration from planting to harvest described by the Canadian Food Inspection Agency [1], although the names differed: early season (65–70 days), early mid-season (70–90 days), mid-season (90–110 days), mid-season late (110–130 days) and late season (130 days and more) cultivars. The number of samples per cultivar and the corresponding maturity classes are reported in S2 Table.

2.2 Diagnostic tissue composition

The potato diagnostic tissue is the first mature leaf (4th leaf from the top) sampled at the beginning (10%) of the blooming stage [15, 30]. Twenty to 30 leaves were collected at random in each plot, composited, dried at 65°C, ground to pass through a 1 mm sieve, and analyzed for N, P, K, Ca and Mg concentrations after dissolution or combustion. Total N was determined by micro-Kjeldahl or Dumas combustion (Leco CNS-2000 analyzer, St. Joseph, MI, USA). After acid dissolution [31], K, Ca, and Mg concentrations were quantified by atomic absorption spectrometry or inductively coupled plasma spectroscopy (ICP), and P by colorimetry or ICP. We made no distinction between methodologies in the analysis of ionomes.

2.3 Processing nutrient composition to nutrient balances

The compositional space [16] of the leaf tissue comprised five nutrients (N, P, K, Mg, Ca) and undetermined components amalgamated into a filling value (Fv) computed as the difference between the measurement unit and the sum of quantified nutrients. Tissue components were preprocessed using the row-centered log-ratio transformation, as follows [23]:

$$\mathrm{clr}_i = \ln\left(\frac{x_i}{g(x)}\right) \qquad (1)$$

where $x_i$ is the raw concentration of the ith component and $g(x)$ is the geometric mean across components, including the filling value.
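The transformation described above can be sketched in a few lines. The snippet below is a minimal illustration, not the authors' R code: the function name `clr_with_filling` and the sample concentrations are hypothetical, and concentrations are assumed to be mass fractions so that the measurement unit is 1.

```python
import math

def clr_with_filling(nutrients, unit=1.0):
    """Centered log-ratios of the nutrients plus a filling value (Fv).

    `nutrients` maps element names to strictly positive concentrations
    expressed in the same unit as `unit` (assumption: mass fractions).
    """
    fv = unit - sum(nutrients.values())          # undetermined components
    if fv <= 0 or min(nutrients.values()) <= 0:
        raise ValueError("all parts, including Fv, must be strictly positive")
    parts = dict(nutrients, Fv=fv)
    logs = {name: math.log(value) for name, value in parts.items()}
    mean_log = sum(logs.values()) / len(logs)    # log of the geometric mean g(x)
    return {name: value - mean_log for name, value in logs.items()}

# Hypothetical leaf sample (g per g of dry matter), for illustration only.
sample = {"N": 0.05, "P": 0.004, "K": 0.04, "Ca": 0.01, "Mg": 0.005}
clr = clr_with_filling(sample)
```

By construction the clr values of a sample sum to zero, which is a quick sanity check after preprocessing.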

2.4 Clustering cultivars

Yield thresholds are useful for decision-making. Because tuber yield potential varies widely among cultivars, we proceeded by discretizing tuber yields into low- and high-productivity categories [12]: we ranked the marketable yield in ascending order within a given cultivar and selected the yield corresponding to the 65th percentile as the cut-off between the two subgroups. Hence, each cultivar had its own cut-off with respect to its yield potential, as shown in S2 Table. The ionomes of the high-yielding subpopulation were used to assess whether cultivars could be clustered. This subgroup comprised 1190 occurrences (after the exclusion of 144 outliers) across 151 trials and 47 cultivars. A density-based clustering method [32] was used to delineate cultivar groups of similar compositions using clr variables.
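The per-cultivar discretization can be illustrated as follows. This is a sketch under stated assumptions: the percentile uses linear interpolation (one of several common definitions, and not necessarily the one used in the study), yields equal to the cut-off are counted as high, and the records are invented for the example.

```python
from collections import defaultdict

def percentile(values, q):
    """Percentile by linear interpolation between order statistics (q in 0..100)."""
    ordered = sorted(values)
    pos = (len(ordered) - 1) * q / 100.0
    lower = int(pos)
    upper = min(lower + 1, len(ordered) - 1)
    return ordered[lower] + (ordered[upper] - ordered[lower]) * (pos - lower)

def label_yield_classes(records, q=65):
    """Return per-cultivar cut-offs and a 'high'/'low' label for each record.

    `records` is a list of (cultivar, marketable_yield) pairs.
    Assumption: a yield equal to the cut-off is labeled 'high'.
    """
    by_cultivar = defaultdict(list)
    for cultivar, tuber_yield in records:
        by_cultivar[cultivar].append(tuber_yield)
    cutoffs = {c: percentile(ys, q) for c, ys in by_cultivar.items()}
    labels = ["high" if y >= cutoffs[c] else "low" for c, y in records]
    return cutoffs, labels

# Hypothetical yields (Mg per ha) for two made-up cultivars.
records = [("A", 10), ("A", 20), ("A", 30), ("A", 40), ("A", 50),
           ("B", 100), ("B", 200)]
cutoffs, labels = label_yield_classes(records)
```

Because the cut-off is computed within each cultivar, a yield that is "high" for one cultivar can be "low" for another with greater potential.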

2.5 Ionome effect and yield prediction

Machine learning algorithms can either regress to predict continuous variables or classify to predict categories [33]. Tuber yield categories were predicted using clr variables and information on ionomic groups of the full data set (high and low yielders, i.e., 3382 rows). Three machine learning algorithms were compared: k-nearest neighbors, random forest and support vector machines. We estimated the relative influence of variables in the model and their rank by examining how much the prediction error increases when data for a variable are permuted while all others are left unchanged [34, 35]. A variable can score zero or a very small value compared to others. Deleting such a variable from the dataset should not affect the results. The random forest model was used for feature selection to assess the importance of each clr variable in predicting tuber yield, but none of the variables was removed. The data were split into training (75%) and testing (remaining 25%) sets at the cultivar level, i.e., for each cultivar the samples were randomly separated according to these proportions. The performance of the classification models was assessed using accuracy computed with the testing set. Applied to this context, the four quadrants defined by Swets [36] in binary system diagnosis to delineate the response classes are presented in the contingency table (Table 1). The accuracy is the proportion of correctly predicted instances:

$$\mathrm{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN} \qquad (2)$$
Table 1. Term definitions used for the study.

Predicted yield | Observed yield: Low (unbalanced) | Observed yield: High (balanced)
Low | True positive (TP): observed low-yielders correctly predicted as low-yielders. | False positive (FP): observed high-yielders incorrectly predicted as low-yielders.
High | False negative (FN): observed low-yielders incorrectly predicted as high-yielders. | True negative (TN): observed high-yielders correctly predicted as high-yielders.

As in medical sciences, the negative term is used in cases where no intervention is needed after diagnosis.
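As a small worked illustration of the accuracy definition built on the four quadrants of Table 1, the sketch below computes the proportion of correctly predicted instances. The counts are hypothetical and chosen only to make the arithmetic easy to follow.

```python
def accuracy(tp, fp, fn, tn):
    """Proportion of correctly predicted instances: (TP + TN) / all cases."""
    total = tp + fp + fn + tn
    if total == 0:
        raise ValueError("the contingency table is empty")
    return (tp + tn) / total

# Hypothetical contingency table (not taken from the study):
# 45 low-yielders and 25 high-yielders correctly predicted out of 100 cases.
acc = accuracy(tp=45, fp=10, fn=20, tn=25)
```

With these made-up counts, 70 of 100 cases are correctly classified, so `acc` is 0.7.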

2.6 Rebalancing a composition: The enchanting islands

A compositional perturbation is a translation in the compositional space [37, 38]. A perturbation vector expressed as clr values contains a series of deltas (differences). Once back-transformed into the compositional space, the perturbation vector alters a composition through a perturbation (⊕) operation as follows [37]:

$$A \oplus B = \mathcal{C}\left[a_1 b_1, a_2 b_2, \ldots, a_D b_D\right] \qquad (3)$$

where a D-part composition A is perturbed (⊕) by a D-part composition B, and $\mathcal{C}$ is the closure operator to constant sum. We used the testing set to display the effect of a perturbation across the simplex. We selected two elements (N and P) and simulated a theoretical 20% increase of their initial (observed) clr values. The observed clr vector (ionome of the instance) and the new clr vector (perturbed ionome) were back-transformed into the N, P, K, Ca, Mg and Fv compositional space for comparison using familiar concentration units. The high yielders of the training set correctly diagnosed as balanced (true negative specimens) by the most accurate model were used as the reference subpopulations. The clr values of these reference specimens were used as the reference nutritional status at high yield potential. A potato nutrient imbalance index was computed as the distance from the closest high-yielding specimen using the Aitchison distance, i.e., the Euclidean distance between compositions using clr-transformed concentrations [39]. For any imbalanced or new specimen of a given cultivar, the closest true negative (closest reference composition) was identified as the sample with the minimum Aitchison distance from the new composition. The nutrient clr differences defining the Aitchison distance may be considered as an apparent excess or deficiency of the nutrient requiring corrective measures from a multivariate and compositional data perspective [40]. Hence, the clr space of nutrient components (N, P, K, Ca, Mg) was described not as an ellipsoidal hyper-space [41] but as islands of high-yielding specimens dispersed in the hyper-space of differently yielding specimens. The closer a specimen is to an enchanting island, the higher its chance of becoming a high-yielder [40]. The clr difference was converted into a perturbation vector between two nutrient compositions expressed as familiar nutrient concentrations.
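The rebalancing steps described in this section (perturbation, Aitchison distance, and the search for the closest reference) can be sketched as below. This is a minimal illustration and not the study's R implementation: all function names are hypothetical, and the three-part compositions are toy values.

```python
import math

def closure(x):
    """Rescale strictly positive parts so that they sum to 1."""
    total = sum(x)
    return [v / total for v in x]

def clr(x):
    """Centered log-ratio coordinates of a composition."""
    logs = [math.log(v) for v in x]
    mean_log = sum(logs) / len(logs)
    return [v - mean_log for v in logs]

def perturb(a, b):
    """Perturbation: closure of the element-wise product of two compositions."""
    return closure([ai * bi for ai, bi in zip(a, b)])

def aitchison_distance(a, b):
    """Euclidean distance between clr-transformed compositions."""
    return math.dist(clr(a), clr(b))

def nearest_reference(sample, references):
    """Reference composition with the minimum Aitchison distance to `sample`."""
    return min(references, key=lambda ref: aitchison_distance(sample, ref))

def perturbation_to(sample, reference):
    """Perturbation vector that moves `sample` onto `reference`."""
    return closure([r / s for s, r in zip(sample, reference)])

# Toy three-part compositions, for illustration only.
sample = [0.2, 0.3, 0.5]
references = [[0.6, 0.2, 0.2], [0.25, 0.30, 0.45]]
target = nearest_reference(sample, references)
shift = perturbation_to(sample, target)
rebalanced = perturb(sample, shift)   # lands on the closed target
```

Because the Aitchison distance works on log-ratios, it is unchanged when a composition is rescaled, which is why compositions that do not sum exactly to 1 can still be compared.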

2.7 Statistical analysis

Statistical computations were performed in the R statistical environment version 3.6.1 [42]. Compositional data analysis was conducted using the R compositions package version 1.40–2 [43]. Multivariate outliers were removed for robust multivariate analysis [44] using the Mahalanobis distance at a 0.01 level of significance with the R mvoutlier package version 2.0.9 [45]. Clustering was performed using the dbscan package version 1.1–3 [32]. Linear discriminant analysis (LDA) was conducted using the R ade4 package version 1.7–13 [46], which computes the linear combinations of clr coordinates that best discriminate the centroids of cultivar ionomes. Supervised analysis was conducted using the caret package version 6.0–84 [47]. Our results are reproducible using the R computation codes and data given as supplementary information and available online in a GitHub repository (https://git.io/Jvt2r).

3 Results

3.1 Cluster analysis

The data set used for clustering is described in S2 Table. The AC Chaleur cultivar showed the lowest marketable tuber yield cut-off (65th percentile) at 17.4 Mg ha⁻¹ and Red-Maria the highest at 64.6 Mg ha⁻¹. Average marketable yield was 40.5 Mg ha⁻¹ for high yielders and 24.8 Mg ha⁻¹ for low yielders. In comparison, average potato tuber yields in Canada and Québec were 31.2 Mg ha⁻¹ and 32.2 Mg ha⁻¹, respectively, in 2018 [48]. The dbscan clustering function looked for dense regions in the clr space and detected a single cluster of cultivars, i.e., cultivars were scattered without any particular dense region. A principal components analysis mapped cultivars and nutrients in the biplot shown in Fig 1. The principal component scores mapped on the distance biplot (Fig 1A) showed no particular pattern that would allow partitioning into groups. The clr correlation loadings (Fig 1B) showed a negative relationship between K and Mg and between P and Ca, and a positive relationship between N and P, in agreement with concentration changes over time as the plant matures [49]. Discrepancies between cultivars were driven mainly by Mg and K on the first axis, and by P and Ca on the second axis (right-hand side plot).
Fig 1

Principal components biplot of potato ionome showing (A) scores in distance scaling and (B) loadings in correlation scaling.

3.2 Predicting tuber yield

Classification models related explanatory clr variables to two marketable tuber yield categories: high and low yielders. The random forest algorithm ranked the importance of variables in the model. The clr of nitrogen appeared to be the most discriminant variable between tuber yield categories, followed by the amalgamated unknown components (Fv), then Ca, Mg and, finally, P. After splitting data into training (75%) and testing (25%) data sets, we used a ten-fold cross-validation process that sequentially splits the training data set into ten parts, using nine parts for calibration and the remainder for validation. The k-nearest neighbors, random forest and support vector machine models returned practically similar predictive accuracies (although slightly lower for the support vector machine algorithm), with a mean accuracy of 70%, representing 591 successful and 254 unsuccessful classifications with the testing set. The null hypothesis of a random classifier, i.e., a non-informative classification consisting of an equal distribution of 50% successful and 50% unsuccessful cases, was rejected after a χ² homogeneity test (χ² = 69.135, p < 2.2 × 10⁻¹⁶). Since all the models returned practically similar accuracy over the testing set, predictions from the k-nearest neighbors model were used for interpretation. There was high variation in model fit by cultivar, as shown in Fig 2. The accuracy at testing varied from 25% for Estima and Waneta to 100% for Ambra, Carolina, Dark Red Chieftain, Harmony, Peribonka and Viking. All these cultivars had small sample sizes in the dataset, as shown in S2 Table.
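The data partitioning described above (a 75/25 split within each cultivar, followed by ten-fold cross-validation on the training set) can be sketched as follows. This is an illustration only: the study used the R caret package, whereas the helper names, the rounding rule and the toy cultivar labels here are assumptions.

```python
import random
from collections import defaultdict

def split_by_cultivar(cultivars, train_frac=0.75, seed=0):
    """Randomly split row indices into training and testing sets within each cultivar."""
    rng = random.Random(seed)
    groups = defaultdict(list)
    for index, cultivar in enumerate(cultivars):
        groups[cultivar].append(index)
    train, test = [], []
    for indices in groups.values():
        rng.shuffle(indices)
        n_train = round(train_frac * len(indices))  # assumption: round to nearest
        train.extend(indices[:n_train])
        test.extend(indices[n_train:])
    return sorted(train), sorted(test)

def k_folds(indices, k=10, seed=0):
    """Shuffle indices and deal them into k folds of near-equal size."""
    rng = random.Random(seed)
    shuffled = list(indices)
    rng.shuffle(shuffled)
    return [shuffled[i::k] for i in range(k)]

# Hypothetical data set: 40 rows of cultivar "A" and 20 rows of cultivar "B".
cultivars = ["A"] * 40 + ["B"] * 20
train, test = split_by_cultivar(cultivars)
folds = k_folds(train, k=10)

# Each fold serves once for validation; the other nine are used for calibration.
for held_out in folds:
    calibration = [i for fold in folds if fold is not held_out for i in fold]
```

For reference, the testing-set counts reported above give 591 / (591 + 254) ≈ 0.70, consistent with the stated mean accuracy.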
Fig 2

The k nearest neighbors model evaluation accuracies for cultivars.

3.3 Ionome perturbation

The true negative specimens (correctly diagnosed as balanced), comprising 783 occurrences in the training data set, provided the clr reference values required to compute the Aitchison distance, which is equal to the Euclidean distance across clr-transformed compositions. S3 Table displays mean values for each cultivar. Using the Aitchison metric, the closest true negative specimen was set as the reference composition for each imbalanced specimen. In the clr space, the difference between the reference and the imbalanced compositions returns a perturbation vector. Fig 3 shows the imbalanced sample with the highest Aitchison distance from its reference and the perturbation to apply as a translation to reach a balanced ionome.
Fig 3

Perturbation vector example mapped using the most imbalanced sample.

The most imbalanced observation nutrient composition was (0.0601, 0.0037, 0.0355, 0.0032, 0.0048, 0.8919), the nearest reference composition was (0.0561, 0.0036, 0.0603, 0.0052, 0.0184, 0.8565), and the corresponding perturbation vector was (0.0919, 0.0965, 0.1696, 0.1629, 0.3832, 0.0959) for N, P, K, Mg, Ca and Fv, respectively. The Aitchison distance computed between the observation and its associated true negative was 1.135.
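As a check on the numbers in the Fig 3 caption, the sketch below recomputes the perturbation vector as the closed element-wise ratio of the reference to the observation. The variable names are illustrative, and because the published compositions are rounded to four decimals, the result is expected to match the reported vector only approximately.

```python
def closure(x):
    """Rescale strictly positive parts so that they sum to 1."""
    total = sum(x)
    return [v / total for v in x]

# Values copied from the Fig 3 caption, in the order N, P, K, Mg, Ca, Fv.
observation = [0.0601, 0.0037, 0.0355, 0.0032, 0.0048, 0.8919]
reference = [0.0561, 0.0036, 0.0603, 0.0052, 0.0184, 0.8565]
reported = [0.0919, 0.0965, 0.1696, 0.1629, 0.3832, 0.0959]

# Perturbation that moves the observation onto the reference.
perturbation = closure([r / o for o, r in zip(observation, reference)])

# Largest absolute gap between the recomputed and reported vectors.
max_gap = max(abs(p - q) for p, q in zip(perturbation, reported))

# Applying the perturbation to the observation recovers the (closed) reference.
recovered = closure([o * p for o, p in zip(observation, perturbation)])
```

The largest component of the vector is Ca, reflecting the roughly fourfold difference in Ca between the observation and its reference.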

4 Discussion

4.1 Clustering potato cultivars

The Canadian Food Inspection Agency classified potato cultivars broadly into maturity groups based on the time elapsed between planting and maturity [1]. However, nutrient requirements, especially nitrogen, vary widely between cultivars of the same maturity group. In New Brunswick (Canada), Zebarth et al. [50] recommended 200–208 kg N ha⁻¹ for Russet Norkota (early-season cultivar) and Russet Burbank (late-season cultivar), 190 kg N ha⁻¹ for Superior (early-mid-season cultivar) and Goldrush (mid-season cultivar), 175 kg N ha⁻¹ for Shepody (mid-season), 135 kg N ha⁻¹ for early cultivars for the table market, 160–180 kg N ha⁻¹ for other mid-season, 180–200 kg N ha⁻¹ for other late, and 135–160 kg N ha⁻¹ for low N requirement cultivars. Such large discrepancies within the same cultivar maturity group were attributed to differential foliar gene expression [6] and root development [7]. Hence, information additional to maturity grouping is needed to assess nutrient requirements of potato cultivars. Huang and Salt [51] reported that ionomics allows the discovery of genes controlling natural variation in the plant ionome, and for Salt et al. [9], ionomics could capture information about the functional state of an organism driven by genetic and environmental factors. The content of plant tissue reflects what the plant can absorb from the soil and, for each nutrient, there is a correlation between its concentration and yield. Moreover, since tissue analysis is also carried out to observe the effect of fertilizer applications and to determine the in-season or next-season nutrient requirement [52, 53], ionomes could be useful in discriminating potato cultivars. Indeed, using a small data set of eight potato cultivars, Hernandes et al. [10] showed that foliar nutrient profiles varied widely among cultivars of the same maturity group. According to Parent et al. [12], variations in ionomes could be interpreted only partly as a genotypic effect, and phenotypic plasticity can also be driven by the nutrient supply capacity specific to agroecosystems, while breeding programs are conducted under relatively luxurious environments to reach high productivity. The N, Mg and K clr values, which dominated the principal components (Fig 1), could reflect the abilities of individual cultivars to acquire and use those nutrients more efficiently [54, 55]. Natale et al. [56] provided evidence that, for fruit trees, macronutrient contents generally differ among species and among cultivars within the same species. For N, K and Ca, this range is wider because of the higher requirement of these elements by plants, and narrower for P, Mg and S, indicating smaller demand for the latter. To cluster is to recognize that objects are sufficiently similar to be put in the same group, and to identify distinctions or separations between groups of objects [57, 58]. Based on the assumptions of differential genotypic potential, root development, nutrient requirements, nutrient uptake and use efficiency, the goal was to discover interesting structures in the N, P, K, Mg and Ca contents of the diagnostic tissue in order to decipher dissimilarities between cultivars [33]. However, the process failed to discriminate groups of cultivars along the clr coordinates. Hernandes et al. [10] reached similar results, with overlapping nutrient profiles between cultivar groups depending on isometric log ratio (ilr) coordinates. They found similar nutrient profiles between cultivar groups along some ilr coordinates and very different ones along others. While ionome dissimilarities are not numerically compelling, they could assist in classifying new cultivars into an appropriate ionomic group so that they benefit from costly fertilizer trials conducted on cultivars of the same group.

4.2 Tuber marketable yield prediction

The P content of the diagnostic leaf did not appear useful in predicting potato tuber yield classes. Other elements (N, K, Ca and Mg) showed important contribution of their clr values to the prediction quality metric, especially N, which is directly related to photosynthesis [59]. Since the fertilization trials were conducted over a time span of 47 years (1970–2017), the question arises whether the different methodology of quantifying P (colorimetry/ICP) may have contributed to depreciating this variable in predicting tuber yield classes. The ICP method is shown to be faster and to give higher results for total phosphorus content in ‘soil’ extracts in comparison to the colorimetric method. However, there are exceptions and controversial results [60-62]. Ivanov et al. [61] found that the two methods for total P determination in plant material were highly correlated, and the results were generally within 5% to 10% of one another. Moreover, Valkama et al. [63] reported that, although agricultural practices, soil conditions and analytical techniques have undergone substantial changes over time, the differences between old and recent experiments in yield responses to P application were not statistically important. For all these reasons, we consider the two analytical methods equally relevant to the analysis. The low importance of the P clr variable in predicting tuber yield classes may come from its correlation with Ca. Globally, the selection of relevant features is achieved, by first checking the correlation between features and response to select the features that have correlation above a selected level (e.g., 0.5). Then, the independent variables need to be uncorrelated with one another. If some features are correlated, only one is kept. The process selected the clr_Ca variable (alphabetical order) instead of clr_P since these features are correlated as shown in Fig 1B. In this study no element was discarded from the process relative to its importance. 
The tested algorithms (k-nearest neighbors, random forest and support vector machine) returned similar accuracies in the prediction of yield classes using clr variables as predictors and showed fair diagnostic potential to detect nutrient imbalance. The correctly predicted high and low yielders reached 70% in the testing data set. The models classified the yield categories more accurately than a random classifier [64]. Specimens classified as false negatives (i.e., low yielders incorrectly classified as high yielders) are attributable to limiting conditions other than N, P, K, Mg, and Ca nutrition: soil physical and chemical properties [65, 66], fertilization [67], management failures, diseases [68] or weather events [69] impacting plant growth and yield potential. False positive specimens (i.e., high yielders incorrectly classified as low yielders) indicate luxury consumption when nutrient concentrations are higher [12, 70], or other particularly favorable growth conditions. The confusion matrix built for cultivars revealed poor predictive accuracy for certain cultivars (e.g., 25% for Estima and Waneta) and conversely an accuracy of 100% for others (e.g., Ambra, Peribonka), as shown in Fig 2. These cultivars mainly had small sample sizes (only one, two or three high-yielders and five, six or slightly more low-yielders). The problems of small data in machine learning are numerous, but mainly revolve around over-fitting. The division into training and testing datasets could place observations of only one class in the training set, so that the model would learn to always predict this dominant class [71]. The model could also memorize labels, which is not ideal for generalizing to new data.
Brownlee [72] explained that imbalanced classifications (one or fewer examples in a minority class for hundreds or more examples in the other) pose a challenge for predictive modeling, as most of the machine learning algorithms used for classification were designed around the assumption of an equal number of examples for each class. This results in models that have poor predictive performance, specifically for the minority class. The questionable accuracy levels for some cultivars (especially the low ones) could also come from other yield-limiting factors specific to the experiments but not involved in this study, as for false positive specimens. Our model was not effective for these cultivars treated separately. The differential nutrition of potato cultivars could be addressed objectively using mineral analysis of the diagnostic leaf. More data are needed for poorly documented cultivars. Moreover, dedicated models could be trained for cultivars for which sufficient data are available (e.g., Goldrush, Superior, FL 1207, Chieftain). Other algorithmic, sampling and quality measurement approaches could further be implemented to deal with the problems of small data and unequal distribution of classes [71, 72]. One could extend the predictors to the experimental conditions (soils, weather data), fitting a site-and-cultivar-specific nutrient diagnosis model.

4.3 Perturbation vector for fertilizer recommendation

Rational fertilization requires information on the nutrients that are available in the soil, and the nutritional status of the plant [14] as portrayed by the diagnostic tissue composition [14, 15]. However, the diagnosis of deficiency and toxicity of mineral nutrients may be complicated in field-grown plants where more than one mineral nutrient is deficient or where there is a deficiency of one nutrient and simultaneously toxicity of another [14]. The scientific principle behind tissue analysis is that healthy plants contain predictable concentrations of analytical nutrients [73]. The values are compared to established norms for inadequate, adequate and excess levels. However, Parent et al. [13] proved that this concept of growth-limiting nutrient concentrations supported by the Law of minimum and illustrated by Liebig’s barrel, should be replaced by a concept of growth-limiting nutrient balances illustrated by a pan balance design, where groups of elements are balanced optimally against each other in weighing pans. The difference between two equal-length compositional vectors can only be computed using tools of compositional data analysis. The perturbation vector concept applied to foliar tissue diagnosis returns a scaling operator [21] that when applied to an imbalanced composition translates it (theoretically) into a balanced composition with high yield potential (i.e., true negative). Although the closure of the simplex implies that a perturbation on the clr of a specific nutrient is methodologically not a change in proportion of a single nutrient, perturbations expressed in the clr space appear suitable for interpretation. Indeed, the difference measured between clr values of the diagnosed sample and reference (true negative) specimen can be ranked using the sign of that difference [10, 74, 75], hence indicating which components are at excessive or deficient levels. 
As proposed by Parent [40], K and Mg were apparently deficient while N, P and Ca were apparently in excess compared to the closest reference specimen (Fig 3). Using the same approach, ionomes of newly introduced cultivars with unknown nutrient requirements could be assigned to the cultivars of known nutrient requirements showing the closest ionomes. A perturbation such as the one shown in Fig 3 should not be interpreted as shifts of individual components, since an operation on a single component reverberates across the whole simplex [40]. For instance, the offset in the simplex (N, P, K, Ca, Mg, Fv) composition following a theoretical 20% increase of the N and P clr values is displayed in Fig 4. The K, Ca and Mg concentrations seemed more stable than the others. Although P clr values were increased, the P proportion decreased globally in the new equilibrium of the simplex. The offset was higher for the selected components, followed by the filling value (Fv).
Fig 4

Effect of the perturbation of N and P clr coordinates on the other element proportions.

‘Observation’ stands for the element’s original proportion; ‘Perturbation’ designates the new proportion after the ‘Observed’ vector’s clr value was offset. Greyed boxplots show the distribution of perturbed elements of the simplex.

Perturbation (as defined in Eq 3) is the measure of compositional change from one composition to another [37]. Because foliar composition belongs to the compositional data family, Fig 4 illustrates the principle that changing one proportion of such data affects at least one other proportion of the simplex [16]. The result displayed variable offsets for other elements, decreasing or increasing to reach another balance in the simplex.
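The simulation summarized in Fig 4 can be illustrated with a single composition. The sketch below assumes that "increasing a clr value by 20%" means multiplying it by 1.2; this is one possible reading and may differ from the authors' code. The composition is the observation reported in the Fig 3 caption, and the function names are hypothetical.

```python
import math

def closure(x):
    """Rescale strictly positive parts so that they sum to 1."""
    total = sum(x)
    return [v / total for v in x]

def clr(x):
    """Centered log-ratio coordinates of a composition."""
    logs = [math.log(v) for v in x]
    mean_log = sum(logs) / len(logs)
    return [v - mean_log for v in logs]

def clr_inverse(z):
    """Back-transform clr-like values to proportions (exponentiate, then close)."""
    return closure([math.exp(v) for v in z])

parts = ["N", "P", "K", "Mg", "Ca", "Fv"]
observed = closure([0.0601, 0.0037, 0.0355, 0.0032, 0.0048, 0.8919])
z = clr(observed)

# Assumption: a 20% increase is applied as a multiplication by 1.2.
z_new = [v * 1.2 if name in ("N", "P") else v for name, v in zip(parts, z)]
perturbed = clr_inverse(z_new)
```

Under this assumption, the clr of N is positive and the clr of P is negative for this sample, so scaling both by 1.2 raises the N proportion but lowers the P proportion, while the closure slightly shifts every other part. This offers one possible reading of why the P proportion can decrease even though its clr value was nominally increased.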

5 Conclusion

Since the concept of compositional data analysis was applied to plant tissues, several studies have classified plant species and cultivars using multivariate analysis of nutrient compositions. This study is, to our knowledge, the third (following Parent et al. [49] and Hernandes et al. [10]) to use statistical tools to address the differential nutrition of potato cultivars using combinations of nutrient concentrations in the diagnostic leaf, and the first to use machine learning tools to predict marketable tuber yield. The potato ionomes showed some dissimilarities in principal components analysis, but these were not compelling enough to separate definite density-based clusters of cultivars on the basis of the clr values. However, the ionome showed a determinant effect on tuber yield. Used as predictors in machine learning tools, clr variables showed diagnostic potential to detect in-season nutrient imbalance and to address objectively the differential response of cultivars to fertilization. The perturbation vector of the leaf compositional space could indicate cultivar sensitivity to fertilization and address specific problems of nutrient imbalance in new cultivars. Tissue testing remains an informative, diagnostic and preventive tool with real-world applications for growers in evaluating the effectiveness of their nutrient management program. When correctly interpreted, timely and accurate tissue testing helps diagnose the presence and magnitude of suspected nutrient deficiencies. By using the compositional perturbation vector involving interactions among nutrients, our study provided a useful tool for potato precision fertilization in Quebec. The perturbation vector can help identify limiting nutrients requiring corrective measures as a season progresses or for subsequent seasons. Moreover, our study implicitly provided robust multi-nutrient norms for potato crops, gathering more cultivars of different maturity classes than previous works.
These norms are sets of true-negative or nutritionally-balanced compositions per cultivar (enchanting islands) with high-yield potential. More data are needed to fine-tune the models, especially for poorly-documented cultivars. New algorithms, other sampling methods and model quality measures could be tested to deal with the problem of small-data and imbalanced classification. Further studies extending predictive features to site-specific conditions could improve the diagnosis with a site- and cultivar-specific nutrient diagnosis model.
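As an illustration of how a perturbation vector might point to a limiting nutrient, the minimal sketch below compares an observed composition with a reference composition in clr space. The concentrations, the reference composition and the helper names are hypothetical and are not norms from this study; ranking nutrients by the largest positive clr difference is an assumption made for illustration, not the diagnostic rule used by the authors.

```python
import math

def clr(parts):
    """Centered log-ratio: log of each part over the geometric mean of all parts."""
    logs = [math.log(p) for p in parts]
    mean_log = sum(logs) / len(logs)
    return [v - mean_log for v in logs]

def with_filling(nutrients):
    """Append the filling value so that the composition sums to 1."""
    return nutrients + [1.0 - sum(nutrients)]

names = ["N", "P", "K", "Ca", "Mg", "Fv"]

# Hypothetical mass fractions; the reference stands in for a balanced,
# high-yield (true-negative) composition of a documented cultivar.
observed = with_filling([0.035, 0.003, 0.042, 0.010, 0.005])
reference = with_filling([0.045, 0.003, 0.040, 0.010, 0.005])

# In clr coordinates, the perturbation from observed to reference
# is a plain difference of clr vectors.
difference = [r - o for r, o in zip(clr(reference), clr(observed))]

# Rank nutrients from relatively most short to relatively most in excess.
ranked = sorted(zip(names, difference), key=lambda pair: pair[1], reverse=True)
for name, value in ranked:
    print(f"{name}: {value:+.3f}")

most_limiting = ranked[0][0]
print("Relatively most limiting:", most_limiting)
```

With these made-up numbers N has the largest positive difference, so it would be the first candidate for correction. The differences sum to zero, which reflects that a relative shortage of one nutrient is always expressed against the others.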

Quebec potato leaves ionome data set.

The raw_leaf_df.csv file is available online in the data repository at https://git.io/Jvt2r. (CSV)

Potato data set used for cluster analysis.

(DOCX)

True negatives mean clr values for cultivars.

(DOCX)

27 Dec 2019

PONE-D-19-27444

Cultivar-specific nutritional status of potato (Solanum tuberosum L.) crops

PLOS ONE

Dear Dr. Parent,

Thank you for submitting your manuscript to PLOS ONE. After careful consideration, we feel that it has merit but does not fully meet PLOS ONE’s publication criteria as it currently stands. Therefore, we invite you to submit a revised version of the manuscript that addresses the points raised during the review process.

We would appreciate receiving your revised manuscript by Feb 10 2020 11:59PM. When you are ready to submit your revision, log on to https://www.editorialmanager.com/pone/ and select the 'Submissions Needing Revision' folder to locate your manuscript file.

If you would like to make changes to your financial disclosure, please include your updated statement in your cover letter.

To enhance the reproducibility of your results, we recommend that if applicable you deposit your laboratory protocols in protocols.io, where a protocol can be assigned its own identifier (DOI) such that it can be cited independently in the future. For instructions see: http://journals.plos.org/plosone/s/submission-guidelines#loc-laboratory-protocols

Please include the following items when submitting your revised manuscript:

A rebuttal letter that responds to each point raised by the academic editor and reviewer(s). This letter should be uploaded as separate file and labeled 'Response to Reviewers'.
A marked-up copy of your manuscript that highlights changes made to the original version. This file should be uploaded as separate file and labeled 'Revised Manuscript with Track Changes'.
An unmarked version of your revised paper without tracked changes. This file should be uploaded as separate file and labeled 'Manuscript'.
Please note while forming your response, if your article is accepted, you may have the opportunity to make the peer review history publicly available. The record will include editor decision letters (with reviews) and your responses to reviewer comments. If eligible, we will contact you to opt in or out.

We look forward to receiving your revised manuscript.

Kind regards,
Paul Esker
Academic Editor
PLOS ONE

Additional Editor Comments:

This paper presents a machine learning approach to modeling tuber yields as a function of foliar ionomes. The concept is interesting and the database extensive considering time and cultivar diversity. Overall, the paper does add something new to the literature, but as indicated by the primary reviewer, revisions are required before the article is ready for publication. I would like to apologize for the delay in returning this review since there were challenges with finding reviewers, as well as having several situations where the reviewer was released from finalizing the review due to non-response. Nonetheless, both the primary reviewer and I are in agreement regarding areas for improvement for this manuscript.

In my case, I was confused somewhat by the description of how trials were selected, and what the yield range was since in the methods, trials that had less than 28 Mg ha-1 were dropped from inclusion, yet the low yielding group (high versus low) averaged 24.8 Mg ha-1. I assume this means that within the selected trials, there were still many low-yielding cultivars, correct? Also, the sample sizes by cultivar were quite variable and in some cases, it appeared that there were very few of one class or the other when looking at the supplementary material. As indicated by Reviewer 1, I think this is important to provide further details or context since it may partially explain the high variation in fit by cultivar - this is not necessarily addressed well in the discussion.
Some more specific observations include:

Lines 229-231: Confusing statement
Line 269 (and in other statements), the citation style was rather odd, with a double mention of the author.
Lines 300-304: Confused with pre-selection procedure - also ties in to the question that Reviewer 1 had for lines 305-307.
Lines 319-323: Seems like a transition is missing to connect the different thoughts.

Journal Requirements:

When submitting your revision, we need you to address these additional requirements:

1. Please ensure that your manuscript meets PLOS ONE's style requirements, including those for file naming. The PLOS ONE style templates can be found at http://www.plosone.org/attachments/PLOSOne_formatting_sample_main_body.pdf and http://www.plosone.org/attachments/PLOSOne_formatting_sample_title_authors_affiliations.pdf

2. Thank you for stating in your Funding Statement: "ZC is partly funded by the Natural Sciences and Engineering Council of Canada (CRDPJ 385199-09 and DG-2254), the Quebec Ministry of Agriculture, Fisheries and Food (IA216581), Centre SEVE, Patate Dolbec Inc., Groupe Gosselin FG, Agriparmentier Inc., Ferme Daniel Bolduc Inc., Patate Laurentienne, Ferme Bergeron-Niquet, and Patates Lac-St-Jean. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript."

a. Please provide an amended statement that declares *all* the funding or sources of support (whether external or internal to your organization) received during this study, as detailed online in our guide for authors at http://journals.plos.org/plosone/s/submit-now. Please also include the statement “There was no additional external funding received for this study.” in your updated Funding Statement.

b. Please include your amended Funding Statement within your cover letter. We will change the online submission form on your behalf.

3.
Thank you for stating the following in the Competing Interests section: 'The authors have declared that no competing interests exist.' We note that you received funding from commercial sources: Patate Dolbec Inc., Groupe Gosselin FG, Agriparmentier Inc., Ferme Daniel Bolduc Inc.

a. Please provide an amended Competing Interests Statement that explicitly states this commercial funder, along with any other relevant declarations relating to employment, consultancy, patents, products in development, marketed products, etc. Within this Competing Interests Statement, please confirm that this does not alter your adherence to all PLOS ONE policies on sharing data and materials by including the following statement: "This does not alter our adherence to PLOS ONE policies on sharing data and materials.” (as detailed online in our guide for authors http://journals.plos.org/plosone/s/competing-interests). If there are restrictions on sharing of data and/or materials, please state these. Please note that we cannot proceed with consideration of your article until this information has been declared.

b. Please include your amended Competing Interests Statement within your cover letter. We will change the online submission form on your behalf.

Please know it is PLOS ONE policy for corresponding authors to declare, on behalf of all authors, all potential competing interests for the purposes of transparency. PLOS defines a competing interest as anything that interferes with, or could reasonably be perceived as interfering with, the full and objective presentation, peer review, editorial decision-making, or publication of research or non-research articles submitted to one of the journals. Competing interests can be financial or non-financial, professional, or personal. Competing interests can arise in relationship to an organization or another person.
Please follow this link to our website for more details on competing interests: http://journals.plos.org/plosone/s/competing-interests

4. We note that Figure 1 in your submission contains map images which may be copyrighted. All PLOS content is published under the Creative Commons Attribution License (CC BY 4.0), which means that the manuscript, images, and Supporting Information files will be freely available online, and any third party is permitted to access, download, copy, distribute, and use these materials in any way, even commercially, with proper attribution. For these reasons, we cannot publish previously copyrighted maps or satellite images created using proprietary data, such as Google software (Google Maps, Street View, and Earth). For more information, see our copyright guidelines: http://journals.plos.org/plosone/s/licenses-and-copyright.

We require you to either (a) present written permission from the copyright holder to publish these figures specifically under the CC BY 4.0 license, or (b) remove the figures from your submission:

a. You may seek permission from the original copyright holder of Figure(s) [#] to publish the content specifically under the CC BY 4.0 license. We recommend that you contact the original copyright holder with the Content Permission Form (http://journals.plos.org/plosone/s/file?id=7c09/content-permission-form.pdf) and the following text: “I request permission for the open-access journal PLOS ONE to publish XXX under the Creative Commons Attribution License (CCAL) CC BY 4.0 (http://creativecommons.org/licenses/by/4.0/). Please be aware that this license allows unrestricted use and distribution, even commercially, by third parties.
Please reply and provide explicit written permission to publish XXX under a CC BY license and complete the attached form.” Please upload the completed Content Permission Form or other proof of granted permissions as an "Other" file with your submission. In the figure caption of the copyrighted figure, please include the following text: “Reprinted from [ref] under a CC BY license, with permission from [name of publisher], original copyright [original copyright year].”

b. If you are unable to obtain permission from the original copyright holder to publish these figures under the CC BY 4.0 license or if the copyright holder’s requirements are incompatible with the CC BY 4.0 license, please either i) remove the figure or ii) supply a replacement figure that complies with the CC BY 4.0 license. Please check copyright information on all replacement figures and update the figure caption with source information. If applicable, please specify in the figure caption text when a figure is similar but not identical to the original image and is therefore for illustrative purposes only.

The following resources for replacing copyrighted map figures may be helpful:

USGS National Map Viewer (public domain): http://viewer.nationalmap.gov/viewer/
The Gateway to Astronaut Photography of Earth (public domain): http://eol.jsc.nasa.gov/sseop/clickmap/
Maps at the CIA (public domain): https://www.cia.gov/library/publications/the-world-factbook/index.html and https://www.cia.gov/library/publications/cia-maps-publications/index.html
NASA Earth Observatory (public domain): http://earthobservatory.nasa.gov/
Landsat: http://landsat.visibleearth.nasa.gov/
USGS EROS (Earth Resources Observatory and Science (EROS) Center) (public domain): http://eros.usgs.gov/#
Natural Earth (public domain): http://www.naturalearthdata.com/

[Note: HTML markup is below.
Please do not edit.]

Reviewers' comments:

Reviewer's Responses to Questions

Comments to the Author

1. Is the manuscript technically sound, and do the data support the conclusions? The manuscript must describe a technically sound piece of scientific research with data that supports the conclusions. Experiments must have been conducted rigorously, with appropriate controls, replication, and sample sizes. The conclusions must be drawn appropriately based on the data presented.

Reviewer #1: Partly

2. Has the statistical analysis been performed appropriately and rigorously?

Reviewer #1: Yes

3. Have the authors made all data underlying the findings in their manuscript fully available? The PLOS Data policy requires authors to make all data underlying the findings described in their manuscript fully available without restriction, with rare exception (please refer to the Data Availability Statement in the manuscript PDF file). The data should be provided as part of the manuscript or its supporting information, or deposited to a public repository. For example, in addition to summary statistics, the data points behind means, medians and variance measures should be available. If there are restrictions on publicly sharing data (e.g. participant privacy or use of data from a third party), those must be specified.

Reviewer #1: Yes

4. Is the manuscript presented in an intelligible fashion and written in standard English? PLOS ONE does not copyedit accepted manuscripts, so the language in submitted articles must be clear, correct, and unambiguous. Any typographical or grammatical errors should be corrected at revision, so please note any specific errors here.

Reviewer #1: Yes

5. Review Comments to the Author

Please use the space provided to explain your answers to the questions above.
You may also include additional comments for the author, including concerns about dual publication, research ethics, or publication ethics. (Please upload your review as an attachment if it exceeds 20,000 characters)

Reviewer #1:

In lines 305-307, the authors state that the sample size is small and results should be carefully interpreted. It would be misleading to the reader to indicate any relationship or conclusions at this stage until further investigation/research is done to explore such implications from this data.

Line 25-26, “The scarcity of data, in particular for new cultivars, constrains to group cultivars into maturity groups.” Awkward sentence – consider revision.

Line 51 “In particular, foliar gene expression….” Incomplete sentence- revise grammar.

Line 109, Can the maturity classes be further described by their range of days from planting to maturity?

Line 113-114, If I understand the protocol correctly, this leaf sampling was taken at different times throughout the growing season due to the differences in maturity of the potato cultivars?

Line 115-116, “ground to less than 1mm” what does that mean exactly? The plant material was ground to less than 1mm particle diameter? Just curious.

Line 265-266, “…information additional to maturity grouping is needed to assess nutrient requirements of potato cultivars.” Can you provide some additional considerations and why? Are they practical?

Line 278-280, The paragraph starts with describing the variation in cultivars and foliar nutrient profiles and would help if there’s more of a transition in explaining to the reader how this variation could be explained through the clustering process. Sentence 278-280 is a stark transition in thought. Consider the addition of another sentence to guide the reader.

Line 290, could the different methodology of quantifying P (colorimetry/ICP) have contributed to the insignificance of its content in predicting tuber yield classes?

Lines 305-307, Expand on why the predictive accuracy for some cultivars were very high while others were not. Would these potential factors need to be considered in future yield prediction models?

Lines 367-369, could the perturbation vector of leaf compositional space assist with correcting for in-season nutrient imbalances (per cultivar) to improve yield potential?

Conclusion section

This section needs further expansion and discussion. What are the implications of the study on potato fertility and management? What about future directions and next steps? Does this research provide a positive direction in precision potato production in Canada? Does this mean that cultivar-specific nutrition recommendations may need revision or can be more precise in the future? What are the economic implications of this research for the potato industry? Are there any?

Why use potato? Has there been significant background research already conducted in this species or has there been cultivar-specific fertility variability that has led to the investigations with potato? Has there been economic impact of varied fertility regimes on potato cultivars that more precise fertility recommendations based on this model could address?

Could the dataset have been more robust if collected from other geographical sites? Would it be possible to do some similar analyses with select potato cultivars grown in very controlled conditions and compare these results also to the analyses here?

6. PLOS authors have the option to publish the peer review history of their article (what does this mean?). If published, this will include your full peer review and any attached files.
If you choose “no”, your identity will remain anonymous but your review may still be made public. Do you want your identity to be public for this peer review? For information about this choice, including consent withdrawal, please see our Privacy Policy.

Reviewer #1: No

[NOTE: If reviewer comments were submitted as an attachment file, they will be attached to this email and accessible via the submission site. Please log into your account, locate the manuscript record, and check for the action link "View Attachments". If this link does not appear, there are no attachment files to be viewed.]

While revising your submission, please upload your figure files to the Preflight Analysis and Conversion Engine (PACE) digital diagnostic tool, https://pacev2.apexcovantage.com/. PACE helps ensure that figures meet PLOS requirements. To use PACE, you must first register as a user. Registration is free. Then, login and navigate to the UPLOAD tab, where you will find detailed instructions on how to use the tool. If you encounter any issues or have any questions when using PACE, please email us at figures@plos.org. Please note that Supporting Information files do not need this step.

11 Feb 2020

The reviewers' comments were thoroughly reviewed in our file Response to Reviewers.docx.

Submitted filename: Response to Reviewers.docx

2 Mar 2020

Cultivar-specific nutritional status of potato (Solanum tuberosum L.) crops

PONE-D-19-27444R1

Dear Dr. Parent,

We are pleased to inform you that your manuscript has been judged scientifically suitable for publication and will be formally accepted for publication once it complies with all outstanding technical requirements.

Within one week, you will receive an e-mail containing information on the amendments required prior to publication.
When all required modifications have been addressed, you will receive a formal acceptance letter and your manuscript will proceed to our production department and be scheduled for publication.

Shortly after the formal acceptance letter is sent, an invoice for payment will follow. To ensure an efficient production and billing process, please log into Editorial Manager at https://www.editorialmanager.com/pone/, click the "Update My Information" link at the top of the page, and update your user information. If you have any billing related questions, please contact our Author Billing department directly at authorbilling@plos.org.

If your institution or institutions have a press office, please notify them about your upcoming paper to enable them to help maximize its impact. If they will be preparing press materials for this manuscript, you must inform our press team as soon as possible and no later than 48 hours after receiving the formal acceptance. Your manuscript will remain under strict press embargo until 2 pm Eastern Time on the date of publication. For more information, please contact onepress@plos.org.

With kind regards,
Paul Esker
Academic Editor
PLOS ONE

Additional Editor Comments (optional):

Thank you for taking the time and energy to consider the reviewer comments. The manuscript reads very well and I am satisfied that it can be moved along for publication.

Reviewers' comments:

None required.

3 Mar 2020

PONE-D-19-27444R1

Cultivar-specific nutritional status of potato (Solanum tuberosum L.) crops

Dear Dr. Parent:

I am pleased to inform you that your manuscript has been deemed suitable for publication in PLOS ONE. Congratulations! Your manuscript is now with our production department.

If your institution or institutions have a press office, please notify them about your upcoming paper at this point, to enable them to help maximize its impact.
If they will be preparing press materials for this manuscript, please inform our press team within the next 48 hours. Your manuscript will remain under strict press embargo until 2 pm Eastern Time on the date of publication. For more information please contact onepress@plos.org.

For any other questions or concerns, please email plosone@plos.org.

Thank you for submitting your work to PLOS ONE.

With kind regards,
PLOS ONE Editorial Office Staff
on behalf of
Dr. Paul Esker
Academic Editor
PLOS ONE