Veronica K Chong, Hannah F Fung, John R Stinchcombe.
Abstract
Measuring natural selection through the use of multiple regression has transformed our understanding of selection, although the methods used remain sensitive to the effects of multicollinearity due to highly correlated traits. While measuring selection on principal component (PC) scores is an apparent solution to this challenge, this approach has been heavily criticized due to difficulties in interpretation and relating PC axes back to the original traits. We describe and illustrate how to transform selection gradients for PC scores back into selection gradients for the original traits, addressing issues of multicollinearity and biological interpretation. In addition to reducing multicollinearity, we suggest that this method may have promise for measuring selection on high-dimensional data such as volatiles or gene expression traits. We demonstrate this approach with empirical data and examples from the literature, highlighting how selection estimates for PC scores can be interpreted while reducing the consequences of multicollinearity.
Keywords: Flowering time; life history; multicollinearity; principal component analysis; principal component regression; selection gradients; trade‐offs
Year: 2018 PMID: 30283681 PMCID: PMC6121829 DOI: 10.1002/evl3.63
Source DB: PubMed Journal: Evol Lett ISSN: 2056-3744
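The back-transformation described in the abstract can be sketched in a few lines. This is a minimal illustration, not the authors' code: it assumes standardized traits, a PCA on the trait correlation matrix, and an ordinary least-squares regression of relative fitness on the retained PC scores. The function name `pc_regression_gradients` and the simulated data are hypothetical.

```python
import numpy as np


def pc_regression_gradients(traits, rel_fitness, n_components):
    """Project selection gradients on PC scores back onto the original traits."""
    # Standardize traits so gradients are expressed in standard-deviation units.
    z = (traits - traits.mean(axis=0)) / traits.std(axis=0, ddof=1)

    # PCA via eigendecomposition of the correlation matrix;
    # columns of `loadings` are the eigenvectors of the retained PCs.
    corr = np.corrcoef(z, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(corr)
    order = np.argsort(eigvals)[::-1]
    loadings = eigvecs[:, order[:n_components]]
    scores = z @ loadings

    # Regress relative fitness on the retained PC scores (with an intercept).
    design = np.column_stack([np.ones(len(scores)), scores])
    coef, *_ = np.linalg.lstsq(design, rel_fitness, rcond=None)
    gamma = coef[1:]  # selection gradients on the PC scores

    # Back-projection: gradients on the original traits = loadings @ gamma.
    beta = loadings @ gamma
    return beta, gamma, loadings


# Simulated data (an assumption for illustration only): two strongly
# negatively correlated traits plus one independent trait.
rng = np.random.default_rng(1)
n = 200
shared = rng.normal(size=n)
traits = np.column_stack([
    shared + 0.2 * rng.normal(size=n),
    -shared + 0.2 * rng.normal(size=n),
    rng.normal(size=n),
])
fitness = np.exp(0.3 * traits[:, 2] - 0.2 * traits[:, 0]
                 + rng.normal(scale=0.3, size=n))
rel_fitness = fitness / fitness.mean()

# Drop the smallest PC, then express the gradients in the original trait space.
beta_pcr, gamma, loadings = pc_regression_gradients(traits, rel_fitness, 2)
```

When every PC is retained, the back-projected gradients equal the ordinary multiple-regression gradients on the standardized traits. Any reduction in multicollinearity therefore comes from dropping the PCs with the smallest eigenvalues.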
Figure 1. Line mean correlation between flowering time and flowering duration in the Arabidopsis chamber experiment. The inbred line mean correlation is –0.95.
Line mean correlations for the Arabidopsis study, with eigenvector loadings for PC1–PC4
| Trait | Flowering time | Flowering duration | Branch number | Rosette diameter | PC1 | PC2 | PC3 | PC4 |
|---|---|---|---|---|---|---|---|---|
| Flowering time | 1 | | | | 0.522 | –0.215 | 0.154 | –0.352 |
| Flowering duration | –0.95 | 1 | | | –0.523 | 0.157 | –0.244 | 0.426 |
| Branch number | –0.74 | 0.72 | 1 | | –0.476 | –0.018 | 0.856 | –0.189 |
| Rosette diameter | –0.12 | 0.06 | –0.09 | 1 | 0.045 | 0.913 | –0.047 | –0.397 |
| Rosette leaf number | 0.70 | –0.73 | –0.64 | 0.33 | 0.473 | 0.306 | 0.425 | 0.708 |
| Percent variance explained | | | | | 65.17% | 23% | 6.94% | 3.5% |
Selection estimates for the Arabidopsis experiment
| Trait | Univariate s ± SE | Univariate t | Univariate P | Multiple regression β ± SE | Multiple regression t | Multiple regression P | VIF | Principal component regression β ± SE |
|---|---|---|---|---|---|---|---|---|
| Flowering time | | | | –0.30 ± 0.19 | 1.57 | 0.12 | 12.54 | –0.17 ± 0.05 |
| Flowering duration | | | | 0.05 ± 0.18 | 0.27 | 0.78 | 11.32 | 0.17 ± 0.06 |
| Branch number | | | | 0.07 ± 0.08 | 0.95 | 0.35 | 2.45 | 0.09 ± 0.08 |
| Rosette diameter | 0.11 ± 0.07 | 1.52 | 0.136 | 0.059 ± 0.07 | 0.87 | 0.39 | 1.60 | 0.07 ± 0.07 |
| Rosette leaf number | | | | 0.061 ± 0.09 | 0.63 | 0.53 | 3.21 | –0.06 ± 0.10 |
Univariate estimates of selection are from regressions of relative fruit production on each trait alone; significant differentials are shown in bold. Selection gradients are estimated from a multiple regression of relative fitness on all traits, although the estimates for flowering time and duration are uncertain because of multicollinearity (indicated by variance inflation factors (VIF)). Principal component regression estimates are regression coefficients for PC1–PC4 projected back into the original trait space.
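As a rough illustration of why the flowering time and flowering duration gradients are uncertain, variance inflation factors can be computed as the diagonal of the inverse trait correlation matrix. The sketch below uses the rounded line mean correlations tabulated above. It is an assumption that these are a reasonable stand-in for the data behind the reported VIFs, so the results are not expected to match the published values exactly.

```python
import numpy as np

trait_names = [
    "flowering time",
    "flowering duration",
    "branch number",
    "rosette diameter",
    "rosette leaf number",
]

# Lower triangle of the line mean correlation matrix (rounded values).
lower = [
    [1.00],
    [-0.95, 1.00],
    [-0.74, 0.72, 1.00],
    [-0.12, 0.06, -0.09, 1.00],
    [0.70, -0.73, -0.64, 0.33, 1.00],
]

# Fill a symmetric matrix from the lower triangle.
corr = np.eye(len(trait_names))
for i, row in enumerate(lower):
    for j, r in enumerate(row):
        corr[i, j] = corr[j, i] = r

# VIF_j = 1 / (1 - R_j^2), which equals the j-th diagonal element
# of the inverse correlation matrix.
vif = np.diag(np.linalg.inv(corr))

for name, value in zip(trait_names, vif):
    print(f"{name}: VIF = {value:.2f}")
```

Because flowering time and flowering duration are correlated at –0.95, each has a VIF of at least 1 / (1 – 0.95²) ≈ 10.3, whatever the other traits contribute.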
Figure 2. Selection differentials for (A) flowering time, (B) flowering duration, (C) rosette leaf number, and (D) branch number in the Arabidopsis experiment. Differentials are portrayed in the original trait units; plotted points are line means.