Carmelo Fruciano, Mélina A Celik, Kaylene Butler, Tom Dooley, Vera Weisbecker, Matthew J Phillips.
Abstract
Geometric morphometrics is routinely used in ecology and evolution and morphometric datasets are increasingly shared among researchers, allowing for more comprehensive studies and higher statistical power (as a consequence of increased sample size). However, sharing of morphometric data opens up the question of how much nonbiologically relevant variation (i.e., measurement error) is introduced in the resulting datasets and how this variation affects analyses. We perform a set of analyses based on an empirical 3D geometric morphometric dataset. In particular, we quantify the amount of error associated with combining data from multiple devices and digitized by multiple operators and test for the presence of bias. We also extend these analyses to a dataset obtained with a recently developed automated method, which does not require human-digitized landmarks. Further, we analyze how measurement error affects estimates of phylogenetic signal and how its effect compares with the effect of phylogenetic uncertainty. We show that measurement error can be substantial when combining surface models produced by different devices and even more among landmarks digitized by different operators. We also document the presence of small, but significant, amounts of nonrandom error (i.e., bias). Measurement error is heavily reduced by excluding landmarks that are difficult to digitize. The automated method we tested had low levels of error, if used in combination with a procedure for dimensionality reduction. Estimates of phylogenetic signal can be more affected by measurement error than by phylogenetic uncertainty. Our results generally highlight the importance of landmark choice and the usefulness of estimating measurement error. Further, measurement error may limit comparisons of estimates of phylogenetic signal across studies if these have been performed using different devices or by different operators. 
Finally, we also show how widely held assumptions do not always hold true, particularly that measurement error affects inference more at a shallower phylogenetic scale and that automated methods perform worse than human digitization.
Keywords: geometric morphometrics; measurement error; photogrammetry; phylogenetic signal
Year: 2017 PMID: 28904781 PMCID: PMC5587461 DOI: 10.1002/ece3.3256
Source DB: PubMed Journal: Ecol Evol ISSN: 2045-7758 Impact factor: 2.912
Figure 1. Schematic representation of the workflow of the present study. Red boxes represent data acquisition and preparation. Light blue boxes represent analyses of measurement error and bias. Dark blue boxes indicate analyses on the effect of measurement error on phylogenetic signal.
Figure 2. Phylogenetic tree used in analyses of phylogenetic signal, pruned to match the most comprehensive dataset used. Clade A and Clade B highlight two of the subsets used (see text and Appendix S1).
Figure 3. Scatterplot of the scores along the first two between‐group principal components (species used as group) for the dataset comprising all the landmarks and a dataset in which the most difficult landmarks had been removed.
Procrustes ANOVAs of various marsupial cranial datasets
| Effect | SS | %Var | MS | df | F | p | Repeatability |
|---|---|---|---|---|---|---|---|
| Full dataset, all landmarks | |||||||
| Individual (species) | 0.965853 | 83.19789 | 0.000954 | 1012 | 65.87 | <.0001 | 0.832 |
| Side | 0.000724 | 0.062351 | 1.81E‐05 | 40 | 1.25 | .1415 | |
| Individual × Side | 0.012751 | 1.098381 | 1.45E‐05 | 880 | 0.91 | .9638 | |
| Device | 0.063118 | 5.436964 | 1.6E‐05 | 3956 | 0.8 | 1 | |
| Operator | 0.118464 | 10.20441 | 2E‐05 | 5934 | |||
| Full dataset, reduced landmarks | |||||||
| Individual (species) | 0.910388 | 94.37447 | 0.001182 | 770 | 66.54 | <.0001 | 0.961 |
| Side | 0.000742 | 0.076948 | 2.47E‐05 | 30 | 1.39 | .0812 | |
| Individual × Side | 0.011728 | 1.215769 | 1.78E‐05 | 660 | 2.66 | <.0001 | |
| Device | 0.01996 | 2.069179 | 6.68E‐06 | 2990 | 1.37 | <.0001 | |
| Operator | 0.021836 | 2.263638 | 4.87E‐06 | 4485 | |||
SS, sum of squares; %Var, percentage of variance accounted for by the term (computed by dividing the sum of squares for the term by the total sum of squares); MS, mean squares; df, degrees of freedom; F, F‐statistic; p, p‐value (parametric); repeatability, value of repeatability obtained using the formulas for the intraclass correlation coefficient on Procrustes ANOVA terms (see the text for details).
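The %Var column can be checked directly from the sums of squares, as the footnote describes. The sketch below is illustrative only: it assumes that the five terms listed for the "Full dataset, all landmarks" block make up the total sum of squares, and the function name `percent_variance` is hypothetical.

```python
# Sums of squares copied from the "Full dataset, all landmarks" block of the table.
ss = {
    "Individual (species)": 0.965853,
    "Side": 0.000724,
    "Individual x Side": 0.012751,
    "Device": 0.063118,
    "Operator": 0.118464,
}


def percent_variance(ss_by_term):
    """Return each term's share of the total sum of squares, in percent.

    Assumption: the terms passed in add up to the total sum of squares.
    """
    total = sum(ss_by_term.values())
    return {term: 100.0 * value / total for term, value in ss_by_term.items()}


pct_var = percent_variance(ss)
for term, pct in pct_var.items():
    print(f"{term}: {pct:.2f}%")
```

Under that assumption the computed shares agree with the tabulated %Var values to within rounding of the reported sums of squares.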
Significance of the test of bias for different subsets of our marsupial cranial data. The table reports p‐values based on a within‐subject permutation procedure (1000 random permutations). For comparisons between devices, p‐values above the diagonal were obtained with landmark sets digitized by Operator 1 and p‐values below the diagonal with datasets digitized by Operator 2. Significant comparisons in bold
| | Between devices digitized by the same operator | | | Between operators, same device | | |
|---|---|---|---|---|---|---|
| | Solutionix | NextEngine | Photogrammetry | Solutionix | NextEngine | Photogrammetry |
| All landmarks | | | | | | |
| Solutionix | – | 0.11 | 0.32 | 0.25 | 0.12 | 0.09 |
| NextEngine | 0.52 | – | | | | |
| Photogrammetry | 0.19 | 0.17 | – | | | |
| Reduced set of landmarks | | | | | | |
| Solutionix | – | | | | | |
| NextEngine | | – | 0.17 | | | |
| Photogrammetry | | 0.14 | – | | | |
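A within‐subject permutation test of the kind named in the caption can be sketched as follows. This is a minimal illustration and not the procedure used in the study: the test statistic (Euclidean distance between the two mean configurations), the p‐value convention, the function names, and the example data are all assumptions, and the coordinates are assumed to be already superimposed.

```python
import math
import random


def mean_difference_norm(diffs):
    """Euclidean norm of the mean paired-difference vector."""
    n = len(diffs)
    dim = len(diffs[0])
    mean = [sum(d[j] for d in diffs) / n for j in range(dim)]
    return math.sqrt(sum(m * m for m in mean))


def within_subject_permutation_test(set_a, set_b, n_perm=1000, seed=0):
    """Paired permutation test for a systematic difference between two sets.

    set_a[i] and set_b[i] are the flattened coordinates of the same specimen
    acquired under two conditions (e.g., two devices).
    """
    rng = random.Random(seed)
    diffs = [[a - b for a, b in zip(xa, xb)] for xa, xb in zip(set_a, set_b)]
    observed = mean_difference_norm(diffs)
    exceed = 0
    for _ in range(n_perm):
        permuted = []
        for d in diffs:
            # Swapping the two labels within a specimen flips the sign
            # of that specimen's difference vector.
            sign = 1 if rng.random() < 0.5 else -1
            permuted.append([sign * v for v in d])
        if mean_difference_norm(permuted) >= observed:
            exceed += 1
    # Count the observed arrangement as one of the permutations.
    return observed, (exceed + 1) / (n_perm + 1)


# Made-up example data: 20 specimens, 6 coordinates each (not from the study).
rng = random.Random(1)
set_a = [[rng.gauss(0, 1) for _ in range(6)] for _ in range(20)]
# Second condition: same shapes plus a constant shift (bias) and small noise.
set_b = [[v + 0.05 + rng.gauss(0, 0.01) for v in row] for row in set_a]

observed, p_value = within_subject_permutation_test(set_a, set_b)
print(f"observed distance = {observed:.4f}, p = {p_value:.4f}")
```

Permuting labels only within each specimen keeps the large among‐specimen variation out of the null distribution, so the test targets systematic differences between conditions rather than random error.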
Results of analyses of measurement error on data automatically acquired using GPSA with and without dimensionality reduction
| | df | SS | MS | Rsq | F | | p | Repeatability |
|---|---|---|---|---|---|---|---|---|
| Procrustes ANOVA, full set of nonzero principal coordinates | ||||||||
| Species | 23 | 11394.1 | 495.4 | 0.72365 | 5.1235 | 2.1345 | .001 | 0.58 |
| Residuals | 45 | 4351.1 | 96.69 | |||||
| Total | 68 | 15745.3 | ||||||
| Procrustes ANOVA, first five principal coordinates | ||||||||
| Species | 23 | 7061.6 | 307.024 | 0.96809 | 59.364 | 2.8411 | .001 | 0.95 |
| Residuals | 45 | 232.7 | 5.172 | |||||
| Total | 68 | 7294.3 | ||||||
df, degrees of freedom; SS, sum of squares; MS, mean squares; Rsq, r squared; p, p‐value; in the pairwise test for bias, above the diagonal test based on the full set of nonzero principal coordinates and below the diagonal test based on the first five principal coordinates.
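The Rsq and F entries can be recomputed from the tabulated sums of squares and degrees of freedom, and the repeatability values are close to what a one‐way intraclass correlation gives. The sketch below is a rough check under stated assumptions: the exact repeatability formula used in the paper is not given in this section, the replicate count of 3 per species is assumed (the degrees of freedom suggest a slightly unbalanced design), and `one_way_summary` is a hypothetical helper.

```python
def one_way_summary(ss_among, df_among, ss_within, df_within, group_size):
    """Summary statistics for a one-way design from sums of squares.

    group_size is the assumed number of replicate acquisitions per species.
    """
    ms_among = ss_among / df_among
    ms_within = ss_within / df_within
    return {
        "r_squared": ss_among / (ss_among + ss_within),
        "f_stat": ms_among / ms_within,
        # One-way intraclass correlation coefficient (assumed formula).
        "icc": (ms_among - ms_within)
        / (ms_among + (group_size - 1) * ms_within),
    }


# Values copied from the table (Species and Residuals rows).
full = one_way_summary(11394.1, 23, 4351.1, 45, group_size=3)
first_five = one_way_summary(7061.6, 23, 232.7, 45, group_size=3)

for label, result in (("full set", full), ("first five", first_five)):
    print(
        f"{label}: Rsq = {result['r_squared']:.4f}, "
        f"F = {result['f_stat']:.3f}, ICC = {result['icc']:.3f}"
    )
```

Small discrepancies from the tabulated F values are expected because the sums of squares in the table are rounded.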
Figure 4. Plots of the coefficient of variation of KMULT (across unique device/operator/landmark set combinations) against phylogenetic diversity for randomly drawn taxa (5, 10, 15).
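The coefficient of variation plotted in Figure 4 is the standard deviation of the estimates divided by their mean. A minimal sketch, assuming the sample standard deviation is used; the KMULT values below are invented for illustration and are not results from the study.

```python
import statistics


def coefficient_of_variation(values):
    """Sample standard deviation divided by the mean."""
    return statistics.stdev(values) / statistics.mean(values)


# Hypothetical K_mult estimates for one random draw of taxa, one value per
# device/operator/landmark-set combination (made-up numbers).
k_mult = [0.42, 0.45, 0.40, 0.47, 0.44, 0.43]
cv = coefficient_of_variation(k_mult)
print(f"CV of K_mult across combinations: {cv:.3f}")
```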
Figure 5. Distribution of the value of K for subsets (unique device/operator/landmark set combinations) computed using the posterior distribution of trees obtained from the phylogenetic analysis.
Descriptive summary of the results
| Analysis | Results |
|---|---|
| Levels of error (human‐digitized landmarks) | Using all landmarks, measurement error accounts for about 10% of total variance (repeatability around 0.8) |
| Presence of bias (human‐digitized landmarks) | Using all landmarks, generally no significant bias |
| Levels of error (automated method) | Using all the nonzero principal coordinates, error accounts for almost 30% of variance (repeatability 0.58) |
| Presence of bias (automated method) | Significant bias generally present |
| Measurement error and phylogenetic signal, single tree | In some cases, the value of KMULT for unique operator/device combinations is more variable at a broader than at a shallower phylogenetic scale (KMULT differences between subsets between 0.01 and 0.18). No clear association of phylogenetic diversity and variation in KMULT estimates across operator/device combinations for random samples of taxa. |
| Measurement error and phylogenetic signal, posterior distribution of trees | When using all landmarks, typically 60%–80% of variance due to error |