
Composite modeling of leaf shape along shoots discriminates Vitis species better than individual leaves.

Abigail E Bryson¹,², Maya Wilson Brown³, Joey Mullins⁴, Wei Dong², Keivan Bahmani⁴, Nolan Bornowski⁴, Christina Chiu⁵, Philip Engelgau⁴, Bethany Gettings³, Fabio Gomezcano², Luke M Gregory³, Anna C Haber⁴, Donghee Hoh⁶,⁷, Emily E Jennings³,⁸, Zhongjie Ji⁵, Prabhjot Kaur⁴,⁹, Sunil K Kenchanmane Raju³, Yunfei Long¹⁰, Serena G Lotreck³, Davis T Mathieu¹,², Thilanka Ranaweera³, Eleanore J Ritter³, Rie Sadohara⁵, Robert Z Shrote⁵, Kaila E Smith³, Scott J Teresi⁴, Julian Venegas¹¹, Hao Wang¹¹, McKena L Wilson³, Alyssa R Tarrant⁴, Margaret H Frank¹², Zoë Migicovsky¹³, Jyothi Kumar³, Robert VanBuren⁴, Jason P Londo¹⁴, Daniel H Chitwood⁴,¹¹.

Abstract

PREMISE: Leaf morphology is dynamic, continuously deforming during leaf expansion and among leaves within a shoot. Here, we measured the leaf morphology of more than 200 grapevines (Vitis spp.) over four years and modeled changes in leaf shape along the shoot to determine whether a composite leaf shape comprising all the leaves from a single shoot can better capture the variation and predict species identity compared with individual leaves.
METHODS: Using homologous universal landmarks found in grapevine leaves, we modeled various morphological features as polynomial functions of leaf nodes. The resulting functions were used to reconstruct modeled leaf shapes across the shoots, generating composite leaves that comprehensively capture the spectrum of leaf morphologies present.
RESULTS: We found that composite leaves are better predictors of species identity than individual leaves from the same plant. We were able to use composite leaves to predict the species identity of previously unassigned grapevines, which were verified with genotyping.
DISCUSSION: Observations of individual leaf shape fail to capture the true diversity between species. Composite leaf shape, an assemblage of modeled leaf snapshots across the shoot, is a better representation of the dynamic and essential shapes of leaves, in addition to serving as a better predictor of species identity than individual leaves.


Keywords: Vitis; grapevine; landmark analysis; leaf shape; modeling; morphometrics

Year: 2020 PMID: 33344095 PMCID: PMC7742203 DOI: 10.1002/aps3.11404

Source DB: PubMed Journal: Appl Plant Sci ISSN: 2168-0450 Impact factor: 1.936

Leaf shape is dynamic. Allometry (the differential growth of shape attributes relative to size or other attributes) was first documented in leaves by Stephen Hales in 1727. Imprinting a grid of points onto expanding young fig leaves, he observed “the difference of the progressive and lateral motions of these points in different leaves,” finding that they “were of very different lengths in proportion to their breadths” (Hales, 1727). Not only is the shape of a leaf dynamic over its development as it expands, but so too are the shapes of multiple leaves located at different nodes within a plant. Heteroblasty (phenotypic changes in sequential lateral organs, such as leaves) was first described by Johann Wolfgang von Goethe (a contributor to art, philosophy, and botany; Friedman and Diggle, 2011) when he compared the transformation of mature leaf shapes within a plant to Proteus, the Greek god of the sea, and the mutable nature of water (Goethe, 1817, 1952). Our ability to recognize differences in leaf shape indicates that this is a quantifiable trait. A number of morphometric approaches have been proposed to measure leaf shape across the angiosperms, from focusing on the venation using computer vision (Wilf et al., 2016) to the closed contour of the blade using topological data analysis (Li et al., 2018). In some special cases, simple but powerful geometric approaches have been developed to aid in the classification of plants. For example, all grapevine (Vitis L.) leaves have corresponding homologous primary and secondary veins. With the introduction of North American rootstocks in Europe during the late 19th and early 20th centuries to combat the spread of phylloxera, viticulturists were exposed to new and unfamiliar varieties; thus, they needed a method to confirm rootstock identity. 
Without the ability to genotype, viticulturists used phenotyping, and proposed that the angle of the petiolar veins, which form the petiolar sinus, could be used to differentiate between varieties (Goethe, 1876, 1878; Ravaz, 1902). Pierre Galet (Galet, 1979, 1985, 1988, 1990, 2000) extended this system to other major veins in the leaf by measuring their relative angles and the ratios of lobe and sinus lengths. A system of homologous features was seized upon to further elaborate all veins hierarchically and to enumerate the corresponding teeth in which they terminate (Rodrigues, 1939, 1941a, 1941b, 1952a, 1952b). Martínez and Grenan (1999) used these approaches to calculate an average leaf shape and with this mathematical framework classified varieties, clones, and even their similarity to depictions of grapevines in art (Martínez et al., 1995, 1997a, 1997b; Santiago et al., 2005, 2007, 2008; Gago et al., 2009a, 2009b, 2014). These morphometric methods, which have been tailored to the unique geometric properties of grapevine leaves for over a century, have been extended to formal landmark‐based methods and have been used in the study of the genetic basis of leaf shape (Chitwood et al., 2014; Demmings et al., 2019). The unique features of grapevine leaves allow for not only the classification of different varieties, but also the study of the dynamics of leaf morphology on individual vines, as these features vary along the grapevine shoot (Fig. 1). From shoot base to tip, two developmental processes—allometry and heteroblasty—are discernable through the observation of leaf size and shape (Fig. 2). Following their initiation at the shoot tip, leaves rapidly undergo expansion while their shape continuously deforms, governed by the allometric processes first described by Hales (1727). Heteroblasty is caused by temporal changes in the shoot apical meristem that alter the phenotype of the subsequent lateral organs (in this case, leaves) produced, including leaf shape. 
Leaves found at the shoot base were the first to emerge from buds; thus, leaves at nodes closer to the base are relatively mature and have reached their maximum size. At this point in development, differences in leaf shape are predominantly attributable to the transformations from “first to last” described by Goethe (1817) between mature leaves with different shapes.

Figure 1

Examples of changes in leaf traits between different developmental stages in different grapevine species. Vineyard‐collected leaves (adaxial side up, except for Vitis riparia) from the tip (top of stack) to base (bottom of stack) of the shoot. The size, shape, and color (among other traits) vary from node to node. Images are not to scale relative to each other.

Figure 2

Quantifying leaf shape changes along the shoot. (A) Two Vitis labrusca leaves from the second (left) and fourth (right) nodes from the shoot tip. Landmarks are indicated to the left, and lobe tips, sinuses, and associated nomenclature to the right. Leaves are scaled to show the allometric decrease in the ratio of vein‐to‐blade area that occurs during expansion. (B) Examples of morphological changes in V. riparia and V. labrusca leaves sampled along the shoot. Shoot tip, shoot base, nodes, and scale are indicated. Leaves in (A) are from the V. labrusca shoot shown here. (C) Diagrammatic representation of the methods used to quantify leaf shape change along the shoot. Leaves are first superimposed and scaled using Procrustean methods. The outlines shown were formed using Procrustean coordinates derived from the blade of each leaf shown in (B). The relative node position is calculated as the node number (starting at the tip) divided by the total leaf count (for the shoot), such that all nodes are assigned a fractional value between 0 and 1. Coordinate x‐ and y‐values are modeled as a function of relative node position. Dots connected between leaves by the dotted line correspond to landmark 19 of the distal sinus.

Previously, we measured the leaf shape of hundreds of vines from a germplasm collection representative of North American Vitis species (Chitwood et al., 2016a). We were able to capture changes in shape between species as well as changes due to allometry and heteroblasty by sampling the leaves from all nodes on a single shoot. 
We found that the effects of species and developmental processes on leaf shape were additive and statistically distinct, meaning that, regardless of the node chosen along the vine, a given leaf can be used to identify the species and vice versa. We sampled leaves from the same vines in a second season (Chitwood et al., 2016b), and, by comparing equivalent leaves (by the vine and node) between the two growing seasons (2013 and 2015), we observed interannual variability in leaf shape that was attributable to climate rather than genotype or development. Here, using the same vines as in our previous work (Chitwood et al., 2016a, 2016b), we sampled leaves from two more seasons (2016 and 2017) and successfully used composite leaf modeling to predict species identity in vines with an unknown background, with the predictions confirmed by genotyping. Our work demonstrates that phenotypic modeling of dynamic changes in leaf shape using composite leaves improves genotype predictions compared with approaches that statistically account for individual leaves only.

METHODS

Germplasm, sample collection, and imaging

A total of 8465 leaves were collected from 209 vines at the USDA germplasm repository vineyard in Geneva, New York, USA. Samples were taken from the same vines during the second week of June, annually, in 2013 and 2015–2017. This study builds upon and analyzes previous work, including data sets from 2013 (Chitwood et al., 2016a) and 2013 + 2015 (Chitwood et al., 2016b). The vines sampled represent 11 species (Ampelopsis glandulosa (Wall.) Momiy. var. brevipedunculata (Maxim.) Momiy., V. acerifolia Raf., V. aestivalis Michx., V. amurensis Rupr., V. cinerea (Engelm.) Millardet, V. coignetiae Pulliat ex Planch., V. labrusca L., V. palmata Vahl, V. riparia Michx., V. rupestris Scheele, and V. vulpina L.), four hybrids (V. ×andersonii Rehder, V. ×champinii Planch., V. ×doaniana Munson ex Viala, and V. ×novae‐angliae Fernald), and 13 Vitis vines with an unassigned identity. Starting at the shoot tip (with shoot order noted for each leaf), leaves greater than ~1 cm in length were collected in stacks (Fig. 1) and stored in a cooler in labeled plastic bags with ventilation holes. Within two days of collection, the leaves were arranged on a large‐format Epson Workforce DS‐50000 scanner (Tokyo, Japan) in the order they were collected, with a small number near each leaf indicating which node it came from and a ruler for scale within the image file. The image files were named with the vine identification number, followed by a sequential lowercase letter if multiple scans were needed.

Landmarking and generalized Procrustes analysis

A total of 21 landmarks were manually annotated on the abaxial side of each leaf, as in Fig. 2A, using the point tool in ImageJ (version 1.52k; Abràmoff et al., 2004). For the 8465 leaves used in this study, 177,765 landmarks (355,530 coordinate values in total) were analyzed. Landmarks were placed sequentially for each leaf in a scan and saved as a text file of x‐ and y‐coordinate values. To check for errors, landmarks from each scan were visualized using ggplot2 (Wickham, 2016) in R (version 3.6.0; R Core Team, 2019), and landmarking was redone as necessary. A generalized Procrustes analysis (GPA) was performed using the shapes package (version 1.2.4; Dryden and Mardia, 2016) in R, allowing for reflection. The resulting superimposed Procrustes coordinates were used in subsequent analyses.
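The superimposition step can be illustrated with a minimal NumPy sketch. It is not the shapes package routine used in the study; it is a simplified generalized Procrustes loop written under stated assumptions: each leaf is an array of 21 (x, y) landmarks, shapes are scaled to unit centroid size, and reflection is allowed (no determinant constraint on the rotation). The function names `align` and `gpa` and the fixed iteration count are hypothetical choices made for illustration.

```python
import numpy as np

def align(shape, ref):
    """Rotate or reflect a centered, unit-size shape onto a reference shape.

    Solves the orthogonal Procrustes problem with an SVD. Because the
    determinant is not constrained, reflections are permitted.
    """
    u, _, vt = np.linalg.svd(shape.T @ ref)
    return shape @ (u @ vt)

def gpa(shapes, n_iter=10):
    """Simplified generalized Procrustes analysis (illustrative only).

    shapes: array of shape (n_leaves, n_landmarks, 2).
    Returns the superimposed coordinates with the same shape.
    """
    # Translate each leaf so that its centroid sits at the origin.
    x = shapes - shapes.mean(axis=1, keepdims=True)
    # Scale each leaf to unit centroid size (Frobenius norm of 1).
    x = x / np.linalg.norm(x, axis=(1, 2), keepdims=True)
    mean = x[0]  # start from the first leaf as the reference
    for _ in range(n_iter):
        # Align every leaf to the current mean shape, then update the mean.
        x = np.array([align(s, mean) for s in x])
        mean = x.mean(axis=0)
        mean = mean / np.linalg.norm(mean)
    return x
```

Calling `gpa` on an array of shape (n_leaves, 21, 2) returns coordinates from which position, size, orientation, and handedness have been removed, so that the remaining differences between leaves reflect shape only.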

Data analyses

The data analysis methods used in this study were devised during the Fall 2019 semester by students in Foundation in Computational Plant Science, a graduate‐level course offered by the Department of Horticulture at Michigan State University, East Lansing, Michigan, USA, which focused on the integration of computational and plant science approaches. Jupyter notebooks (Kluyver et al., 2016) were an important element in the data analyses. An innovative feature of Jupyter notebooks is their multifunctional use in both education and research. Additionally, as an open‐source platform, Jupyter notebooks facilitate sharing between multiple sources, including the classroom and laboratory. All Python code (Jupyter notebooks) and R scripts used for the data analysis are available on GitHub: https://github.com/DanChitwood/grapevine_shoots. The original scans of grapevine leaves used for the analysis are available on Dryad (Chitwood et al., 2020). The concept of modeling Procrustean coordinate values across nodes using polynomial functions was introduced to students using published leaf shape data from Passiflora L. spp. (Chitwood and Otoni, 2017a, 2017b): https://github.com/DanChitwood/PlantsAndPython/blob/master/PlantsAndPython10_STUDENT_A_Passion_for_Passiflora.ipynb. Jupyter notebooks used as instructional materials for the class are available in the PlantsAndPython repository: https://github.com/DanChitwood/PlantsAndPython. Analyses and visualizations in Python were done with NumPy (Oliphant, 2006), pandas (McKinney, 2010), scikit‐learn (Pedregosa et al., 2011), and Matplotlib (Hunter, 2007). In order to compare the node position of each leaf against other leaves, we created a relative node position, which is simply the node position for each leaf divided by the total leaf count for the shoot, where 0 is the shoot tip and 1 is the shoot base (Fig. 2). 
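The relative node position can be sketched in a few lines of pandas. The table below is a made-up example, and the column names `vine`, `node`, and `relative_node` are hypothetical; in the real data the grouping would presumably also include the sampling year. Note that with this definition the leaf at the tip receives a value of 1/n rather than exactly 0.

```python
import pandas as pd

# Toy table: one row per leaf, nodes counted from the shoot tip.
leaves = pd.DataFrame({
    "vine": ["A", "A", "A", "A", "B", "B"],  # placeholder vine labels
    "node": [1, 2, 3, 4, 1, 2],
})

# Total leaf count per shoot, repeated on every row of that shoot.
leaf_count = leaves.groupby("vine")["node"].transform("count")

# Relative node position: node number divided by the total leaf count.
leaves["relative_node"] = leaves["node"] / leaf_count
```

Shoots with different numbers of leaves are thereby placed on a common scale ending at 1 (the shoot base).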
Procrustean coordinates for each vine were modeled as a function of the relative node using a second‐degree polynomial function fitted to the data using the NumPy polyfit and poly1d functions (Appendix S1). Ten modeled leaf shapes were calculated across the normalized range of node values from zero to one for each shoot. Collectively, the coordinate values for these 10 modeled leaf shapes were used in subsequent analyses, representing leaf shape changes across the shoot as a composite leaf shape. The data set for the individual leaves contains 8465 leaves that are each treated independently. They arise from 209 vines sampled across four years, in which each vine sampled had a variable number of leaves. A total of 42 data points (arising from 21 landmarks) describe the shape of each individual leaf. The composite leaf data set comprises 836 composite leaves that arise from 209 vines, each sampled once in each of the four years. Each composite leaf is described by 420 data points (arising from 10 modeled leaves with 21 landmarks each). A principal component analysis (PCA) and a linear discriminant analysis (LDA) were performed using scikit‐learn. PCA is a dimension reduction technique in which the data are reoriented along axes that explain most of the variation in the data set; LDA is similar to PCA except that the axes are reoriented to explain most of the variation that discriminates the predefined groups of a categorical variable. Unlike PCA, LDA can be used for modeling, and in this work LDA is used to model the species and node as a function of leaf shape. For individual and composite leaf data sets, the PCAs and LDAs are separate; that is, the individual and composite leaf data sets are not combined in dimension reduction analyses. For LDA and classification, random resampling was used to even the replication between species. For the test set, 20% of the data were used, while the remaining 80% was used for training. Vitis spp. 
with no assigned species identity and Vitis hybrids of unknown parentage were omitted from the training set but included in the test set, to determine which species their leaves most resembled. The LDA prediction was run 1000 times to estimate precision, recall, accuracy, and F1 statistics. Precision is the number of true positives divided by the sum of the number of true positives and false positives. Recall is the number of true positives divided by the sum of the number of true positives and false negatives. The F1 score is the harmonic mean of precision and recall and it is a measure of test accuracy.
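The classification step and the metric definitions above can be sketched with scikit‐learn as follows. This is an illustration under stated assumptions, not the authors' notebook: the data are synthetic, the species labels `sp_a`, `sp_b`, and `sp_c` are placeholders, the split is stratified (the text does not say whether it was), and a single 80/20 split is shown instead of the 1000 repeated runs.

```python
import numpy as np
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.metrics import confusion_matrix
from sklearn.model_selection import train_test_split

# Synthetic stand-in for Procrustes coordinates: 42 values per leaf
# (21 landmarks x 2), with equal replication for each placeholder species.
rng = np.random.default_rng(0)
species = np.array(["sp_a", "sp_b", "sp_c"])
n_per_class, n_features = 60, 42
y = np.repeat(species, n_per_class)
centers = rng.normal(size=(species.size, n_features))
X = np.repeat(centers, n_per_class, axis=0) + rng.normal(size=(y.size, n_features))

# 80% of the data for training, 20% for testing (stratification is an assumption).
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0
)
y_pred = LinearDiscriminantAnalysis().fit(X_train, y_train).predict(X_test)

# Rows are actual classes, columns are predicted classes.
cm = confusion_matrix(y_test, y_pred, labels=species)
tp = np.diag(cm)
precision = tp / cm.sum(axis=0)  # TP / (TP + FP)
recall = tp / cm.sum(axis=1)     # TP / (TP + FN)
f1 = 2 * precision * recall / (precision + recall)  # harmonic mean
```

Repeating the split with different random seeds and averaging the per‐class values would approximate the repeated‐run estimates described in the text.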

Genotypic data and ADMIXTURE analysis

The genotype data are derived from Klein et al. (2018). The VCF files containing genotype data for the grapevines were processed using PLINK2 (Chang et al., 2015), resulting in a binary biallelic genotype table (.bed), PLINK extended MAP file (.bim), and PLINK sample information file (.fam). PLINK2 was used to calculate the eigenvectors and eigenvalues used in the PCA (Galinsky et al., 2016). The biallelic genotype table was used in a single run of ADMIXTURE (Alexander et al., 2009). K values from 3–15 were run and the cross‐validation (CV) error was calculated. The CV error decreased consistently from 0.20442 where K = 3 to 0.16453 where K = 10, after which it fluctuated around 0.164 for K values of 11–15. For the final analysis, K = 10 was used. The resulting table of group proportions was analyzed in R and visualized with ggplot2.

RESULTS

Using composite leaves to model leaf shape along the shoot

The vein area relative to that of the whole leaf blade decreases exponentially with leaf expansion in grapevine, making this trait an important allometric feature of leaf morphology (Fig. 2A; Chitwood et al., 2016b). We measured leaf shape and vein width using 21 homologous landmarks, which were superimposed using a generalized Procrustes analysis such that the resulting coordinates for each sample were translated, rotated, reflected, and scaled, thus allowing for a cross‐comparison of shape (Gower, 1975). The relative node position was used to compare vines with different numbers of leaves (Fig. 2B, C). With comparable Procrustes‐adjusted coordinates and relative node positions, the coordinate x‐ and y‐values for each of the 21 homologous landmarks (42 per leaf) were modeled as a function of node position. There was never more than one inflection point and the data were smooth; thus, a second‐order polynomial function was used for modeling (Appendix S1). This was also beneficial to avoid overfitting the data, which can occur with higher‐order polynomials. Using the resulting functions, coordinate values for 10 modeled leaf shapes were calculated for each vine in intervals of 0.1 across the relative node space from 0.1–1. The resulting modeled leaf shapes (with 210 landmarks and 420 coordinate values) are referred to as composite leaves and represent the dynamic changes in shape across each sampled shoot. Composite leaves can be superimposed and visualized, enabling a comparison of the changes in leaf shape along the shoots of different species, effectively reflecting both genotypic (species) and developmental (node position) differences (Fig. 3). Some of the most noticeable differences in leaf shape were observed between different species; for example, the wide reniform leaves of V. rupestris; the broad orbicular leaves of V. labrusca, V. coignetiae, and V. amurensis; and the deep‐lobed leaves of V. palmata and Ampelopsis glandulosa var. brevipedunculata. 
Developmental trends were also observed; generally, the youngest leaves (at the shoot tip) were thinner and had deeper lobes than the mature leaves found at the shoot base. Composite leaves were able to capture the dynamic developmental changes in leaf shape along the shoot, which individual leaves could not.
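A minimal sketch of the composite leaf construction is given below, assuming that the Procrustes coordinates of one shoot are stored as an array with one row per leaf and 42 columns. It uses the NumPy polyfit and poly1d functions named in the Methods; the function name `composite_leaf` and its argument names are hypothetical and are not taken from Appendix S1.

```python
import numpy as np

def composite_leaf(relative_node, coords, n_models=10, degree=2):
    """Model each coordinate as a polynomial of relative node position.

    relative_node: array (n_leaves,) with values in (0, 1] for one shoot.
    coords: array (n_leaves, 42) of Procrustes coordinates for that shoot.
    Returns an array (n_models, 42): one modeled leaf per grid position.
    """
    # Ten evenly spaced relative node positions: 0.1, 0.2, ..., 1.0.
    grid = np.linspace(0.1, 1.0, n_models)
    modeled = np.empty((n_models, coords.shape[1]))
    for j in range(coords.shape[1]):
        # Fit a second-degree polynomial to coordinate j along the shoot.
        fit = np.poly1d(np.polyfit(relative_node, coords[:, j], degree))
        # Evaluate the fitted function at the ten grid positions.
        modeled[:, j] = fit(grid)
    return modeled
```

Flattening the returned array gives the 420 values (10 modeled leaves, 21 landmarks, 2 coordinates) that describe one composite leaf. A shoot needs at least three leaves for the second‐degree fit to be determined.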

Figure 3

Modeling leaf shape along grapevine shoots with composite leaves. For each species, the modeled leaf shapes for 10 relative node positions along the shoot were superimposed and illustrated, forming a composite leaf. Illustrations are grouped by species‐relatedness: (A) Vitis riparia, V. acerifolia, and V. rupestris; (B) V. cinerea and V. vulpina; (C) V. aestivalis and V. labrusca; (D) V. coignetiae and V. amurensis; (E) V. palmata; (F) Ampelopsis glandulosa var. brevipedunculata. For each species, the number of leaves and vines sampled is given (note that every vine is sampled across four years, yielding a pseudoreplication of four). Composite leaves are colored as a gradient from gray (the shoot tip, node 1) to pink‐purple (the shoot base, node 10).

Composite leaves outperform individual leaves in predicting species identity

We hypothesized that composite leaves would better predict species identity than individual leaves. To test this, we performed a PCA on individual vs. composite leaf shapes. Performing a PCA for individual leaves permits the representation of the axes of variation along principal components (PCs) as eigenleaves (theoretical leaf shapes representing the variation along each PC axis at a chosen standard deviation value; Fig. 4A). A set of four example species with distinct leaf morphologies (V. riparia, V. acerifolia, V. rupestris, and Ampelopsis glandulosa var. brevipedunculata) was compared, and the individual‐leaf PCAs separated the species predominantly by PC2 (Fig. 4B). The eigenleaf representations support this structure; for example, a short and wide reniform leaf type characteristic of V. rupestris is associated with low PC2 values, whereas the longer leaf type with more prominent lobes more similar to A. glandulosa var. brevipedunculata is associated with high PC2 values. In contrast, the separation between species is much greater for the composite leaf PCA space (Fig. 4C). The PCA analyses for the individual and composite leaf data sets were performed independently from each other.
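The eigenleaf idea can be sketched with scikit‐learn: a theoretical leaf is the mean shape displaced along one principal component by a chosen number of standard deviations. The example below is an illustration only. It uses random numbers in place of real Procrustes coordinates, the helper name `eigenleaf` is hypothetical, and the coordinates are assumed to be ordered x1, y1, x2, y2, and so on.

```python
import numpy as np
from sklearn.decomposition import PCA

# Random placeholder for Procrustes coordinates: 200 leaves x 42 values.
rng = np.random.default_rng(1)
coords = rng.normal(size=(200, 42))

pca = PCA(n_components=4).fit(coords)

def eigenleaf(pca, pc, n_sd):
    """Theoretical leaf at n_sd standard deviations along component pc.

    Returns an array (21, 2) of landmark coordinates, assuming the 42
    input columns are interleaved as x1, y1, x2, y2, ...
    """
    # Standard deviation of the scores along this component.
    sd = np.sqrt(pca.explained_variance_[pc])
    # Move from the mean shape along the component axis.
    flat = pca.mean_ + n_sd * sd * pca.components_[pc]
    return flat.reshape(-1, 2)

# Shapes at the two extremes of the first component (±3 SD).
leaf_plus = eigenleaf(pca, 0, 3)
leaf_minus = eigenleaf(pca, 0, -3)
```

Plotting `leaf_minus` and `leaf_plus` side by side for each component shows which aspects of shape that axis describes.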

Figure 4

Principal component analysis (PCA) of individual vs. composite leaves. (A) Eigenleaf representations of shape variance at ±3 standard deviations explained by principal components (PCs) for a PCA performed on all individual leaves. The percent variance explained by each PC is shown on the left. (B, C) Comparison of results from two separate PCAs, one with all individual (B) and the other with composite (C) leaves. Confidence ellipses (95%) for four species (following the color legend) are provided in addition to all data points (gray). (D) Relative node position discretized into nodes counting from one to 10 projected onto the individual leaf PCA space. For the relative node position, there is no composite leaf PCA as the nodes are accounted for and integrated into the resulting values for that analysis. (E, F) Individual (E) and composite (F) leaf PCAs with 95% confidence ellipses for each year (following the color legend).

Node position (discretized into 10 relative nodes) could only be projected for individual leaf PCA, as it is intrinsic to the modeling approach used. Node position varied mostly by PC1 in the individual leaf PCA (Fig. 4D) associated with eigenleaf representations characteristic of the shoot tip (high PC1 values) and the shoot base (low PC1 values). The species factor varied along one axis (PC2) and node position along another (PC1), consistent with previous observations that species and developmental effects on leaf shape are additive and orthogonal (Chitwood et al., 2016a). The main morphological difference between the two axes was the association of the petiolar sinus with other features (Fig. 4A). Four replicates for each vine were included in the composite leaf space, corresponding to the four growing seasons during which each vine was sampled. Comparing individual (Fig. 4E) and composite (Fig. 
4F) leaf PCA with data divided by year yielded little separation, as was expected given that the main sources of variance (species type and developmental effects) were balanced in each analysis. The increased separation of species in the composite leaf PCA compared with individual leaf PCA (Fig. 4B, C) suggests that composite leaves may better facilitate the discrimination between species than individual leaves. To test the ability of these two methods to predict species identity, we used an LDA. The resulting confusion matrices, which plot the proportion of predicted species (horizontal axis) for each actual species class (vertical axis), showed that composite leaves outperform individual leaves in predicting species identity (Fig. 5A, B). Precision (true positives divided by the total number of positive predictions), recall (true positives divided by the sum of true positives and false negatives), and the F1 score (the harmonic mean of precision and recall) for species prediction were higher for composite than individual leaves (Table 1). For species prediction, the minimum values of precision, recall, accuracy, and the F1 score were 0.57, 0.50, 0.50, and 0.53, respectively, for individual leaves, and 0.85, 0.84, 0.84, and 0.86 for composite leaves, demonstrating the superior predictive ability of composite leaves. Node position can be predicted for individual leaves but not for composite leaves. For individual leaves, the prediction was the most accurate at the shoot tip and base (Fig. 5C, Table 2), as the effects of allometry and heteroblasty are most pronounced at the tip and base of the shoot, respectively, with little influence over leaves in the middle. Although matched by vine and species, the prediction of year was also better using composite than individual leaves (Fig. 5D, E; Table 3). This indicates that composite leaves retain morphological information useful for discriminating effects more subtle than genotype or development, such as the environment.

Figure 5

Comparison of linear discriminant analysis (LDA) results for individual vs. composite leaves. (A, B) Comparison of confusion matrices from two separate LDAs, one for individual (A) and the other for composite (B) leaves. The proportion of actual species (vertical) assigned to a predicted species identity (horizontal) is indicated by color. Vitis hybrids and Vitis spp. vines with no assigned species identity were not used in the training set and were only assigned an identity in the test set. (C) Confusion matrix for an LDA performed on the relative node position discretized into 10 nodes along the shoot for individual leaves. For the relative node position, there is no composite leaf LDA as the nodes are accounted for and integrated into the resulting values for that analysis. (D, E) Individual (D) and composite (E) leaf LDAs predicting year. All panels use the indicated color scheme for the assigned proportion, from 0 (white) to 1 (dark green).

Table 1

Comparison of species prediction using LDA for individual vs. composite leaves.

	Individual leaves				Composite leaves
Species	Precision	Recall	Accuracy	F1	Precision	Recall	Accuracy	F1
Ampelopsis glandulosa	0.93	1.00	1.00	0.96	1.00	1.00	1.00	1.00
Vitis acerifolia	0.57	0.50	0.50	0.53	0.90	0.94	0.94	0.92
V. aestivalis	0.68	0.56	0.56	0.62	0.96	0.97	0.97	0.97
V. amurensis	0.65	0.73	0.73	0.69	0.99	0.97	0.97	0.98
V. cinerea	0.59	0.69	0.69	0.64	0.97	0.91	0.91	0.94
V. coignetiae	0.74	0.77	0.77	0.75	1.00	1.00	1.00	1.00
V. labrusca	0.67	0.65	0.65	0.66	0.95	0.97	0.97	0.96
V. palmata	0.85	0.84	0.84	0.85	1.00	1.00	1.00	1.00
V. riparia	0.58	0.51	0.51	0.54	0.85	0.87	0.87	0.86
V. rupestris	0.76	0.73	0.73	0.75	0.99	0.84	0.84	0.91
V. vulpina	0.69	0.75	0.75	0.72	0.88	0.99	0.99	0.93

Table 2

Relative node position prediction using LDA for individual leaves.

Node position	Precision	Recall	Accuracy	F1
1 (tip)	0.66	0.66	0.66	0.66
2	0.42	0.37	0.37	0.39
3	0.41	0.44	0.44	0.42
4	0.35	0.40	0.40	0.37
5	0.27	0.30	0.30	0.28
6	0.26	0.26	0.26	0.26
7	0.29	0.22	0.22	0.25
8	0.28	0.23	0.23	0.25
9	0.35	0.39	0.39	0.36
10 (base)	0.53	0.56	0.56	0.54

Table 3

Comparison of year prediction using LDA for individual vs. composite leaves.

	Individual leaves				Composite leaves
Year	Precision	Recall	Accuracy	F1	Precision	Recall	Accuracy	F1
2013	0.58	0.56	0.56	0.57	0.83	0.79	0.79	0.80
2015	0.62	0.64	0.64	0.63	0.79	0.85	0.85	0.82
2016	0.74	0.79	0.79	0.76	0.95	0.96	0.96	0.95
2017	0.79	0.73	0.73	0.76	0.97	0.95	0.95	0.96

Using composite leaves to predict genotype

We demonstrated that composite leaves outperform individual leaves in discriminating vines of known species identity. We subsequently sought to test the predictive ability of composite leaves on vines of unassigned identity. Of the 209 vines measured, 147 had been genotyped previously (Klein et al., 2018). An ADMIXTURE analysis of genotyped vines confirmed most known species groups (Moore, 1991; Miller et al., 2013), with some exceptions (Fig. 6). Some intraspecies population structure was detected in V. riparia and V. acerifolia vines. Only a few vines represent the species V. amurensis and V. coignetiae, potentially limiting the resolution of these species (Fig. 6A). The V. aestivalis vines were found to be either misidentified from V. palmata or unresolved hybrids of V. palmata × V. labrusca + V. aestivalis. A number of other misassigned vines are indicated by small roman numerals in Fig. 6A. These and the V. aestivalis vines occupy positions in the PCA genotype space inconsistent with their assigned identities, or between species groups reflecting complex ancestry (Fig. 6C, D).

Figure 6

Comparing species identity predictions based on morphology to known ancestry. (A) Ancestry for each individual using K = 10 from ADMIXTURE. Each population is assigned a different color. Species designations for each vine are as previously assigned for this collection, without prior genetic knowledge, and arranged by known phylogenetic relationships. Vines with genetic identities at odds with their assigned identity are indicated by black arrows and lowercase roman numerals. (B) For Vitis spp. with genetic information, the ancestry (left) and predicted species identity (based on morphology) for each shoot for each of the four study years (right) are provided. Morphological predictions consistent with genetic identity are indicated in bold. (C) Principal component analysis (PCA) of the same individuals in (A). Vines with conflicting assigned and genetic identities are indicated by black arrows and lowercase roman numerals as in (A). Vitis spp. in (B) are indicated by black dots and vine identification numbers. Species are indicated by colors that do not correspond with the color scheme of other panels, and the number of vines with genetic information is provided.

The Vitis vines of unassigned species identity totaled 13, of which 10 have been genotyped. For each Vitis spp. vine with no assigned species identity for which we had genotype data, we predicted the species identity using composite leaves with our LDA classifier (Fig. 5) for each of the four years sampled (Fig. 6B). Three vines were unambiguously identified using leaf shape consistently across the four years: vines 588282 and 588508 were correctly predicted to be V. riparia, and vine 588501 as V. labrusca. These vines clustered clearly with their respective species groups within the genotype PCA space (Fig. 6C, D). Two additional vines were ambiguously identified: vine 588529 has a V. labrusca + V. aestivalis × V. palmata genotype characteristic of V. aestivalis and was identified as V. labrusca or V. 
aestivalis in three of the four years. Vine 597293 with a V. amurensis + V. coignetiae ancestry was identified as V. coignetiae in two of the four years. Vine 588549 was not predicted correctly using composite leaf shape and has a V. labrusca + V. aestivalis × V. palmata genotype similar to vine 588529, but lies even farther from the V. labrusca cluster in the genotype PCA space, suggesting that the more ambiguous the genotype of a vine from a well‐characterized species, the more difficult morphological‐based prediction is. Vine 588628 had a unique V. rupestris × V. palmata ancestry and occupied an ambiguous position in the genotypic PCA. We expect that this vine was not classified correctly using morphology. The remaining vines that were not predicted are all of V. amurensis + V. coignetiae ancestry (vines 597294, 597295, and 597298). Because of the small sample sizes for these two species, it is likely that we have not adequately sampled leaf shape for this lineage and therefore cannot accurately predict for test cases. Our results show that composite leaf shape can successfully predict species identity, but not for vines with complex ancestry or species that have not been adequately sampled for leaf shape variation.
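The distinction drawn above between vines identified consistently in all four years and vines identified in only some years can be made explicit by tallying the per-year predictions for each vine. The following is a minimal sketch in Python, not the authors' code: the function name `summarize_years` and the species labels in the usage note are hypothetical placeholders, and the per-year labels are assumed to come from a classifier such as the LDA described in the text, which is not reproduced here.

```python
from collections import Counter
from typing import Dict, List


def summarize_years(predictions: List[str]) -> Dict[str, object]:
    """Tally one vine's predicted species labels, one label per study year."""
    if not predictions:
        raise ValueError("need at least one yearly prediction")
    counts = Counter(predictions)
    top_count = max(counts.values())
    # Keep every label tied for the most years, sorted for a stable result.
    top_species = sorted(s for s, c in counts.items() if c == top_count)
    return {
        "top_species": top_species,
        "years_agreeing": top_count,
        "years_total": len(predictions),
        # True only when every sampled year yields the same label.
        "unanimous": top_count == len(predictions),
    }
```

For example, `summarize_years(["species_A"] * 4)` reports a unanimous call, whereas a vine labeled `species_A` in two years and two other species in the remaining years is reported as agreeing in 2 of 4 years.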

DISCUSSION

Previously, we analyzed leaf morphology across nodes in different species of Vitis and Passiflora, revealing that allometric and heteroblastic effects are statistically separable from species‐specific differences (Chitwood et al., 2016a; Chitwood and Otoni, 2017a). The tracking of alterations to leaf shape between nodes was critical for several observations, including (1) the determination that species‐specific differences in leaf shape arise from a common juvenile form (Chitwood and Otoni, 2017b) and (2) the identification of alterations to leaf morphology between growing seasons and/or years (Chitwood et al., 2016b; Baumgartner et al., 2020). In these studies, individual leaf shape was statistically analyzed, meaning that node position was treated as a statistical effect rather than as part of the phenomenon studied (shape). By normalizing the nodes against the overall leaf count of each shoot (Fig. 2B, C) and subsequently modeling Procrustes‐adjusted coordinates as a function of node position (Appendix S1), we were able to construct composite leaves (Fig. 3). These composite leaves capture multiple leaf forms along the shoot within a single object. Measuring the landmarks in so many leaves requires a significant amount of time; however, landmarks are a powerful way to capture the intricate details of leaf shape, and the comparisons in this work (individual vs. composite leaves) rely on the same data sets. For the same amount of time and work to create the initial data set, composite leaves offer superior discrimination by species compared with individual leaves alone. Shifting the concept of leaf shape from singular leaves to multiple sequential leaves found along the shoot has repercussions for the morphological species concept and phenotype. Species are better resolved and predicted morphologically when using composite (Fig. 4) rather than individual leaves (Fig. 5).
With an adequate training data set, composite leaves can help to identify vines of an unknown species identity (Fig. 6). However, it is vital to have a well‐characterized data set in which the underlying genetics that explain leaf shape are known; without adequate sampling of hybrids or intraspecies genetic variation, the ability to predict genotypes based on leaf shape diminishes significantly (Fig. 6). Leaf shape (a feature that is complex, conspicuous, and easily observable) is readily used to classify closely related botanical specimens, as well as to assess the developmentally or environmentally driven alterations within single plants. While the diversity of leaf shape can be recognized by specialists and communicated to others through writing, the geometric properties of leaf shape allow for the quantification of differences that we perceive visually (Amézquita et al., 2020). Furthermore, our assessment of leaf morphology is limited when it relies on individual leaves, which reveal only facets of the comprehensive phenotype. Thus, compared with individual leaves, composite leaves can better help to identify and define species by capturing dynamic morphological variation arising from developmental and environmental conditions.
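The construction of a composite leaf summarized above (normalize node position by the leaf count of the shoot, model each Procrustes‐adjusted coordinate as a polynomial function of that position, then evaluate the models at fixed positions) can be sketched as follows. This is an illustration under stated assumptions rather than the authors' implementation: the polynomial degree (2), the number of modeled positions (10), the normalization `node / leaf_count`, and all function names are hypothetical, and the small pure‐Python least‐squares routine stands in for whatever fitting procedure was actually used.

```python
from typing import List, Sequence


def polyfit(xs: Sequence[float], ys: Sequence[float], degree: int) -> List[float]:
    """Least-squares polynomial fit; returns coefficients, lowest order first."""
    m = degree + 1
    # Augmented normal equations [X^T X | X^T y].
    aug = [[sum(x ** (r + c) for x in xs) for c in range(m)]
           + [sum((x ** r) * y for x, y in zip(xs, ys))]
           for r in range(m)]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(m):
        pivot = max(range(col, m), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        if abs(aug[col][col]) < 1e-12:
            raise ValueError("singular system: too few distinct node positions")
        for r in range(m):
            if r != col:
                f = aug[r][col] / aug[col][col]
                aug[r] = [a - f * b for a, b in zip(aug[r], aug[col])]
    return [aug[i][m] / aug[i][i] for i in range(m)]


def polyval(coeffs: Sequence[float], x: float) -> float:
    """Evaluate a polynomial given coefficients, lowest order first."""
    return sum(c * x ** k for k, c in enumerate(coeffs))


def composite_leaf(shoot: Sequence[Sequence[float]],
                   degree: int = 2,          # assumed, not from the paper
                   n_positions: int = 10     # assumed, not from the paper
                   ) -> List[float]:
    """Build one composite vector from the leaves of a single shoot.

    `shoot` lists leaves in node order; each leaf is a flat list of
    Procrustes-adjusted coordinates [x1, y1, x2, y2, ...].
    """
    n_leaves = len(shoot)
    if n_leaves <= degree:
        raise ValueError("need more leaves than the polynomial degree")
    # Relative node position: node number divided by the shoot's leaf count.
    positions = [(i + 1) / n_leaves for i in range(n_leaves)]
    n_coords = len(shoot[0])
    # One polynomial model per coordinate, as a function of node position.
    models = [polyfit(positions, [leaf[j] for leaf in shoot], degree)
              for j in range(n_coords)]
    # Evaluate every model at the same fixed positions for every shoot,
    # then concatenate the modeled leaves into a single vector.
    grid = [(k + 1) / n_positions for k in range(n_positions)]
    composite: List[float] = []
    for p in grid:
        composite.extend(polyval(model, p) for model in models)
    return composite
```

Because every shoot is evaluated at the same relative positions, shoots with different leaf counts yield composite vectors of equal length, which is what allows them to be compared or passed to a classifier as single objects.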

AUTHOR CONTRIBUTIONS

D.H.C. and J.P.L. conceived of the project and experimental design. D.H.C., J.M., Z.M., and M.H.F. collected data. D.H.C., R.V., and J.K. developed the curriculum and innovated the teaching methods used to guide student contributions to the manuscript. All authors contributed to the data analysis. D.H.C. wrote the first draft of the manuscript, which was read and edited by all authors.

APPENDIX S1. Modeled coordinate values across grapevine shoots.

16 in total

1. Climate and Developmental Plasticity: Interannual Variability in Grapevine Leaf Morphology.

Authors: Daniel H Chitwood; Susan M Rundell; Darren Y Li; Quaneisha L Woodford; Tommy T Yu; Jose R Lopez; Daniel Greenblatt; Julie Kang; Jason P Londo
Journal: Plant Physiol Date: 2016-01-29 Impact factor: 8.340

2. Charles Darwin and the origins of plant evolutionary developmental biology.

Authors: William E Friedman; Pamela K Diggle
Journal: Plant Cell Date: 2011-04-22 Impact factor: 11.277

3. High-throughput sequencing data clarify evolutionary relationships among North American Vitis species and improve identification in USDA Vitis germplasm collections.

Authors: Laura L Klein; Allison J Miller; Claudia Ciotir; Katie Hyma; Simon Uribe-Convers; Jason Londo
Journal: Am J Bot Date: 2018-03-12 Impact factor: 3.844

4. Second-generation PLINK: rising to the challenge of larger and richer datasets.

Authors: Christopher C Chang; Carson C Chow; Laurent Cam Tellier; Shashaank Vattikuti; Shaun M Purcell; James J Lee
Journal: Gigascience Date: 2015-02-25 Impact factor: 6.524

5. Morphometric analysis of Passiflora leaves: the relationship between landmarks of the vasculature and elliptical Fourier descriptors of the blade.

Authors: Daniel H Chitwood; Wagner C Otoni
Journal: Gigascience Date: 2017-01-01 Impact factor: 6.524

6. Topological Data Analysis as a Morphometric Method: Using Persistent Homology to Demarcate a Leaf Morphospace.

Authors: Mao Li; Hong An; Ruthie Angelovici; Clement Bagaza; Albert Batushansky; Lynn Clark; Viktoriya Coneva; Michael J Donoghue; Erika Edwards; Diego Fajardo; Hui Fang; Margaret H Frank; Timothy Gallaher; Sarah Gebken; Theresa Hill; Shelley Jansky; Baljinder Kaur; Phillip C Klahs; Laura L Klein; Vasu Kuraparthy; Jason Londo; Zoë Migicovsky; Allison Miller; Rebekah Mohn; Sean Myles; Wagner C Otoni; J C Pires; Edmond Rieffer; Sam Schmerler; Elizabeth Spriggs; Christopher N Topp; Allen Van Deynze; Kuang Zhang; Linglong Zhu; Braden M Zink; Daniel H Chitwood
Journal: Front Plant Sci Date: 2018-04-25 Impact factor: 5.753

7. Quantitative Trait Locus Analysis of Leaf Morphology Indicates Conserved Shape Loci in Grapevine.

Authors: Elizabeth M Demmings; Brigette R Williams; Cheng-Ruei Lee; Paola Barba; Shanshan Yang; Chin-Feng Hwang; Bruce I Reisch; Daniel H Chitwood; Jason P Londo
Journal: Front Plant Sci Date: 2019-11-15 Impact factor: 5.753

8. Vitis phylogenomics: hybridization intensities from a SNP array outperform genotype calls.

Authors: Allison J Miller; Naim Matasci; Heidi Schwaninger; Mallikarjuna K Aradhya; Bernard Prins; Gan-Yuan Zhong; Charles Simon; Edward S Buckler; Sean Myles
Journal: PLoS One Date: 2013-11-13 Impact factor: 3.240

9. Latent developmental and evolutionary shapes embedded within the grapevine leaf.

Authors: Daniel H Chitwood; Laura L Klein; Regan O'Hanlon; Steven Chacko; Matthew Greg; Cassandra Kitchen; Allison J Miller; Jason P Londo
Journal: New Phytol Date: 2015-11-18 Impact factor: 10.151

10. The shape of things to come: Topological data analysis and biology, from molecules to organisms.

Authors: Erik J Amézquita; Michelle Y Quigley; Tim Ophelders; Elizabeth Munch; Daniel H Chitwood
Journal: Dev Dyn Date: 2020-04-13 Impact factor: 3.780
