Anna O Conrad, Pierluigi Bonello.
Abstract
New approaches for identifying disease-resistant trees are needed as the incidence of diseases caused by non-native and invasive pathogens increases. These approaches must be rapid, reliable, and cost-effective, and should have the potential to be adapted for high-throughput screening or phenotyping. Within the context of trees and tree diseases, we summarize vibrational spectroscopic and chemometric methods that have been used, based on chemical fingerprint data, to distinguish between groups of trees that vary in disease susceptibility or other important characteristics. We also provide specific examples from the literature where these approaches have been used successfully. Finally, we discuss future applications of these approaches for wide-scale screening and phenotyping efforts aimed at identifying disease-resistant trees and managing forest diseases.
Keywords: chemometrics; disease; infrared and Raman spectroscopy; resistance; trees
Year: 2016 PMID: 26779211 PMCID: PMC4703757 DOI: 10.3389/fpls.2015.01152
Source DB: PubMed Journal: Front Plant Sci ISSN: 1664-462X Impact factor: 5.753
Figure 1. A representative chemical fingerprint using raw spectra collected in the mid-IR region (4000–700 cm⁻¹). FT-IR spectroscopy produces chemical fingerprints that can be analyzed using chemometrics to identify spectral differences between resistant and susceptible trees.
Commonly used pre-processing methods for infrared spectroscopy and Raman spectroscopy derived data.
| Standard normal variate (SNV) | Removes multiplicative scatter and particle size interference (Barnes et al., |
| Multiplicative scatter correction (MSC) | Corrects for noise and scatter; removes multiplicative effects (Lupoi et al., |
| Derivative | First and second derivative functions are commonly used to reduce error and resolve overlapping bands (peaks) (Sankaran et al., |
| Savitzky-Golay polynomial filter | Used for smoothing data and computing derivatives (Gierlinger et al., |
| Detrending | Corrects for variation in baseline shifts and co-linearity (Barnes et al., |
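Three of the pre-processing steps listed above can be illustrated with a minimal NumPy sketch. This is not the implementation used in the cited papers: the function names (`snv`, `msc`, `detrend`), the choice of the mean spectrum as the MSC reference, the second-order detrending polynomial, and the synthetic spectra are all assumptions made for illustration. Spectra are assumed to be stored one per row.

```python
import numpy as np

def snv(spectra):
    """Standard normal variate: center and scale each spectrum (row) on its own."""
    spectra = np.asarray(spectra, dtype=float)
    mean = spectra.mean(axis=1, keepdims=True)
    std = spectra.std(axis=1, ddof=1, keepdims=True)
    return (spectra - mean) / std

def msc(spectra, reference=None):
    """Multiplicative scatter correction against a reference spectrum.

    Assumption: the reference defaults to the mean spectrum of the set.
    """
    spectra = np.asarray(spectra, dtype=float)
    ref = spectra.mean(axis=0) if reference is None else np.asarray(reference, dtype=float)
    corrected = np.empty_like(spectra)
    for i, row in enumerate(spectra):
        # Fit row ~ intercept + slope * ref, then undo the offset and the scaling.
        slope, intercept = np.polyfit(ref, row, 1)
        corrected[i] = (row - intercept) / slope
    return corrected

def detrend(spectra, order=2):
    """Subtract a polynomial baseline fitted to each spectrum (order is an assumption)."""
    spectra = np.asarray(spectra, dtype=float)
    x = np.arange(spectra.shape[1])
    out = np.empty_like(spectra)
    for i, row in enumerate(spectra):
        coeffs = np.polyfit(x, row, order)
        out[i] = row - np.polyval(coeffs, x)
    return out

# Synthetic example: one Gaussian band recorded with three different
# multiplicative (scatter) and additive (baseline) distortions.
base = np.exp(-0.5 * ((np.arange(200) - 100) / 10.0) ** 2)
raw = np.vstack([a * base + b for a, b in [(1.0, 0.0), (1.5, 0.2), (0.7, -0.1)]])

snv_spectra = snv(raw)  # rows now have zero mean and unit standard deviation
msc_spectra = msc(raw)  # rows are mapped onto the scale of the mean spectrum
```

Because the three synthetic spectra differ only by a scale factor and an offset, both corrections make them coincide; real spectra would retain the chemical differences that the downstream chemometric methods exploit.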
List of commonly used chemometric methods for visualizing and mining spectral data, and for building predictive models from spectral data for trees.
| Discriminant function analysis (DFA) | Supervised projection method, which identifies regions that are important for separating groups. |
| K-nearest neighbors (KNN) | Compares the distance between unknown samples (testing set) and samples in the training set. Samples are classified based on proximity to training set samples (Guzmán et al., |
| Linear discriminant analysis (LDA) | A supervised method for classifying data with two or more classes, which selects latent variables that maximize variance between groups and minimize variance within groups. Uses a discriminant function to assign classes to unknown samples (Sankaran et al., |
| Principal components analysis (PCA) | Unsupervised method for visualizing and grouping data based on natural clustering patterns. Can be used to reduce the dimensionality of data, minimize co-linearity, examine spectral variance, and identify outliers (Gierlinger et al., |
| Partial least squares regression (PLSR) | Combines methods for reduction of high dimensional and potentially co-linear data with regression to develop predictive (calibration) models for quantitative traits of interest. This supervised method is commonly used for the analysis of NIR spectra (Fackler et al., |
| Partial least squares discriminant analysis (PLS-DA) | Supervised classification analysis that resolves separation between groups and identifies the most important variables for discriminating between groups. Similar to PLSR but with categorical (qualitative) response variables (Guzmán et al., |
| Soft independent modeling of class analogy (SIMCA) | A supervised classification method that develops principal components models for each training group and identifies important variables for discriminating between groups. Can be used to predict group memberships of unknown samples (Guzmán et al., |
Please refer to cited papers for additional information.
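As a rough illustration of how two of the listed methods can be chained, the sketch below reduces spectra with PCA (unsupervised) and then classifies held-out samples with KNN (supervised) in the PCA score space. It is a sketch under stated assumptions, not a procedure taken from the cited papers: the helper names (`pca_fit`, `pca_transform`, `knn_predict`), the use of two components, `k=3`, Euclidean distance, and the synthetic "resistant" and "susceptible" spectra are all hypothetical.

```python
import numpy as np

def pca_fit(X, n_components):
    """Return the column means and the first n_components principal axes of X."""
    X = np.asarray(X, dtype=float)
    mean = X.mean(axis=0)
    # Rows of vt are principal axes, ordered by decreasing explained variance.
    _, _, vt = np.linalg.svd(X - mean, full_matrices=False)
    return mean, vt[:n_components]

def pca_transform(X, mean, components):
    """Project spectra onto the principal axes to obtain scores."""
    return (np.asarray(X, dtype=float) - mean) @ components.T

def knn_predict(train_scores, train_labels, test_scores, k=3):
    """Majority vote among the k nearest training samples (Euclidean distance)."""
    train_labels = np.asarray(train_labels)
    predictions = []
    for point in test_scores:
        dist = np.linalg.norm(train_scores - point, axis=1)
        nearest = train_labels[np.argsort(dist)[:k]]
        values, counts = np.unique(nearest, return_counts=True)
        predictions.append(values[np.argmax(counts)])
    return np.array(predictions)

# Synthetic data (assumption): the two groups differ in the height of one band.
rng = np.random.default_rng(42)
n_wavenumbers = 50
band = np.exp(-0.5 * ((np.arange(n_wavenumbers) - 20) / 3.0) ** 2)
resistant = 1.0 * band + rng.normal(0, 0.02, (20, n_wavenumbers))
susceptible = 0.5 * band + rng.normal(0, 0.02, (20, n_wavenumbers))
X = np.vstack([resistant, susceptible])
y = np.array(["resistant"] * 20 + ["susceptible"] * 20)

# Hold out five trees per group as the testing set.
train = np.r_[0:15, 20:35]
test = np.r_[15:20, 35:40]

# Fit PCA on the training set only, then project both sets.
mean, components = pca_fit(X[train], n_components=2)
train_scores = pca_transform(X[train], mean, components)
test_scores = pca_transform(X[test], mean, components)

predicted = knn_predict(train_scores, y[train], test_scores, k=3)
accuracy = float((predicted == y[test]).mean())
```

Fitting PCA on the training set alone keeps the held-out trees out of the model, which mirrors the training/testing split described for KNN in the table.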
Figure 2. This hypothetical output from SIMCA analysis displays the relative, dimension-free distance between samples (trees), and groupings of individual trees into resistant and susceptible phenotypes. Dashed lines represent critical sample residual thresholds. Trees in quadrant (A) would be classified as resistant by the SIMCA model, while trees in quadrant (B) would be classified as neither resistant nor susceptible, i.e., as ambiguous. Trees in quadrant (C) could be classified as either resistant or susceptible, and may therefore include trees of intermediate phenotype. Trees in quadrant (D) would be classified as susceptible.
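The quadrant logic in this caption can be written as a simple decision rule. The sketch below assumes that each tree has one residual distance to the resistant class model and one to the susceptible class model, and that a tree fits a class when its distance is at or below that class's critical threshold. The function name `classify_simca`, its arguments, and the example distances are hypothetical; how the distances and thresholds are computed is left to the SIMCA software.

```python
def classify_simca(dist_resistant, dist_susceptible, crit_resistant, crit_susceptible):
    """Assign a tree to a quadrant of the hypothetical SIMCA plot.

    Assumption: a tree fits a class model when its residual distance to that
    model does not exceed the model's critical threshold.
    """
    fits_resistant = dist_resistant <= crit_resistant
    fits_susceptible = dist_susceptible <= crit_susceptible
    if fits_resistant and fits_susceptible:
        return "either"        # quadrant C: may include intermediate phenotypes
    if fits_resistant:
        return "resistant"     # quadrant A
    if fits_susceptible:
        return "susceptible"   # quadrant D
    return "neither"           # quadrant B: ambiguous

# Hypothetical (distance to resistant model, distance to susceptible model) pairs,
# with both critical thresholds set to 1.0 for illustration.
trees = {"tree_1": (0.4, 2.1), "tree_2": (2.5, 0.6), "tree_3": (0.8, 0.9), "tree_4": (1.9, 2.3)}
calls = {name: classify_simca(dr, ds, 1.0, 1.0) for name, (dr, ds) in trees.items()}
```

Keeping "either" and "neither" as explicit outcomes, rather than forcing every tree into one of the two phenotypes, preserves the distinction the figure draws between intermediate and ambiguous trees.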