Venelin Mitov, Krzysztof Bartoszek, Tanja Stadler.
Abstract
Phylogenetic comparative methods are widely used to understand and quantify the evolution of phenotypic traits, based on phylogenetic trees and trait measurements of extant species. Such analyses depend crucially on the underlying model. Gaussian phylogenetic models like Brownian motion and Ornstein-Uhlenbeck processes are the workhorses of modeling continuous-trait evolution. However, these models fit poorly to big trees, because they neglect the heterogeneity of the evolutionary process in different lineages of the tree. Previous works have addressed this issue by introducing shifts in the evolutionary model occurring at inferred points in the tree. However, for computational reasons, in all current implementations, these shifts are "intramodel," meaning that they allow jumps in 1 or 2 model parameters, keeping all other parameters "global" for the entire tree. There is no biological reason to restrict a shift to a single model parameter or, even, to a single type of model. Mixed Gaussian phylogenetic models (MGPMs) incorporate the idea of jointly inferring different types of Gaussian models associated with different parts of the tree. Here, we propose an approximate maximum-likelihood method for fitting MGPMs to comparative data comprising possibly incomplete measurements for several traits from extant and extinct phylogenetically linked species. We applied the method to the largest published tree of mammal species with body- and brain-mass measurements, showing strong statistical support for an MGPM with 12 distinct evolutionary regimes. Based on this result, we state a hypothesis for the evolution of the brain-body-mass allometry over the past 160 million years.
Keywords: clustering; correlated quantitative traits; evolutionary regimes; nonultrametric tree; selection
Year: 2019 PMID: 31375629 PMCID: PMC6708313 DOI: 10.1073/pnas.1813823116
Source DB: PubMed Journal: Proc Natl Acad Sci U S A ISSN: 0027-8424 Impact factor: 11.205
Fig. 1. An MGPM of phylogenetically linked body- and brain-mass data from mammal species. (A) A tree of 629 extant species representative of 21 mammal orders (subsampled from ref. 29). (B) Body and brain masses measured as log10-transformed mean values from finite samples of individual organisms from each species (curated measurements available from ref. 28). In A, a colored number followed by an uppercase letter denotes the regime number and model type selected for each of the 12 model regimes found in MGPM*. (C) “Standard” estimates for the 95% contours and linear regression line of brain mass on body mass for 3 regimes (1, 3, and 10). These estimates ignore the phylogenetic relationship, assuming independence of the data points in each group. (D) Expected 95% contours and regression lines for regimes 1, 3, and 10, according to MGPM*. Under the hypothesis that the inferred MGPM is the true model, the distributions in D represent the expectation at the present time for samples of species that have evolved independently from the root to an arbitrary tip in the corresponding regime, following the regime shifts on that path. Thus, the MGPM* expectations correct for possible biases due to phylogenetic relationship. We observed an agreement between the standard estimates and the MGPM* expectations for most of the 12 groups.
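The "standard" estimates in panel C (a regression line and a 95% contour computed as if the species in a group were independent) can be illustrated with a minimal sketch. It assumes a bivariate Gaussian for (body, brain) on the log10 scale; the function names and the toy numbers are hypothetical and are not taken from the paper or its data.

```python
import math

def standard_estimates(body, brain):
    """Sample mean, covariance, and least-squares line of brain on body.

    Treats the data points as independent, i.e. ignores the phylogeny.
    """
    n = len(body)
    mx = sum(body) / n
    my = sum(brain) / n
    sxx = sum((x - mx) ** 2 for x in body) / (n - 1)
    syy = sum((y - my) ** 2 for y in brain) / (n - 1)
    sxy = sum((x - mx) * (y - my) for x, y in zip(body, brain)) / (n - 1)
    slope = sxy / sxx                 # regression slope of brain on body
    intercept = my - slope * mx
    return (mx, my), (sxx, sxy, syy), slope, intercept

def inside_95_contour(point, mean, cov):
    """True if `point` lies within the 95% contour of a bivariate Gaussian."""
    sxx, sxy, syy = cov
    dx, dy = point[0] - mean[0], point[1] - mean[1]
    det = sxx * syy - sxy ** 2
    # Squared Mahalanobis distance, using the closed-form 2x2 inverse.
    d2 = (syy * dx * dx - 2.0 * sxy * dx * dy + sxx * dy * dy) / det
    # For 2 dimensions the 95% chi-square quantile is -2*ln(0.05).
    return d2 <= -2.0 * math.log(0.05)

# Hypothetical log10 body and brain masses for four species in one group.
body = [1.0, 2.0, 3.0, 4.0]
brain = [0.5, 1.3, 2.0, 2.8]
mean, cov, slope, intercept = standard_estimates(body, brain)
```

The MGPM* expectations in panel D use the same two summaries (a covariance-based contour and a slope), but take the mean and covariance from the fitted model rather than from the raw sample.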
Competing model fits to the mammal data
| Model | q | R | p | ℓℓ | AIC | ΔAIC |
|---|---|---|---|---|---|---|
| Global | n.a. | 1 | 4 | −540.79 | 1,089.58 | 1,321.28 |
| Global | n.a. | 1 | 5 | 30.60 | −51.19 | 180.51 |
| Global | n.a. | 1 | 8 | −540.79 | 1,097.58 | 1,329.28 |
| Global | n.a. | 1 | 9 | 30.60 | −43.19 | 188.51 |
| Global | n.a. | 1 | 10 | 47.62 | −75.24 | 156.46 |
| Global | n.a. | 1 | 11 | 62.89 | −103.77 | 127.93 |
| SURFACE OU | 20 | 1 | 8 | −540.83 | 1,097.66 | 1,329.37 |
| SCALAR OU | 20 | 6 | 38 | 98.37 | −120.74 | 110.97 |
| RATEMATRIX BM | 20 | 9 | 37 | 116.15 | −158.30 | 73.40 |
| MGPM* | 20 | 12 | 115 | 230.85 | −231.70 | 0.00 |
q, minimal number of tips visible from a shift node; R, number of inferred regimes; p, number of parameters; ℓℓ, log-likelihood (higher values are better); AIC, Akaike information criterion (higher values are worse); ΔAIC, difference with respect to the best AIC score (higher values are worse); n.a., not applicable. The optimal parameter values of the models are described in .
We note that, up to a small error of the numerical optimization, the models BM_A, OU_C, and SURFACE OU converged to the same BM model. The SCALAR OU model was the third-best fit to the data. This fit converged to a BM model with shifts. The fit of the RATEMATRIX BM model (which is also a BM model with shifts) resulted in the second-best AIC score.
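The AIC and ΔAIC columns can be recomputed from p and ℓℓ with the standard definition AIC = 2p − 2ℓℓ, which the tabulated values are consistent with. The sketch below does this for three rows of the table; the variable and function names are illustrative, and small discrepancies in the last digit are expected because the table reports ℓℓ rounded to 2 decimals.

```python
def aic(p, loglik):
    """Akaike information criterion: 2 * (number of parameters) - 2 * log-likelihood."""
    return 2 * p - 2 * loglik

# (p, log-likelihood) pairs copied from the table of competing model fits.
fits = {
    "SCALAR OU": (38, 98.37),
    "RATEMATRIX BM": (37, 116.15),
    "MGPM*": (115, 230.85),
}

scores = {name: aic(p, ll) for name, (p, ll) in fits.items()}
best = min(scores.values())                      # lowest AIC is the best fit
delta = {name: s - best for name, s in scores.items()}

for name in fits:
    print(f"{name}: AIC = {scores[name]:.2f}, dAIC = {delta[name]:.2f}")
```

Despite having roughly 3 times as many parameters as the next-best model, MGPM* keeps the lowest AIC because its gain in log-likelihood outweighs the 2p penalty.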
Fig. 2. An MGPM reconstruction of the evolution of body mass and brain mass and their allometric relationship in mammals. (Left and Center) Inferred evolution of body mass, brain mass, and the brain–body-mass regression slope for each lineage in the mammal tree, starting from the root (166.2 Ma ago) and ending at a random tip (extant species) in each of the 12 regimes in MGPM*. The allometry between brain mass and body mass is quantified as the deviation from 1 of the regression slope; an increasing regression slope corresponds to decreasing allometry. The thicker lines represent the expected evolution for the mean trait value and the regression slope in each of the 12 regimes, assuming the hypothesis that the model MGPM* is the true model; each thinner line represents the corresponding expectation from an MGPM fit to 1 of 50 “parametric bootstrap” datasets, which were generated by simulating MGPM* on the mammal tree (Fig. 1). The background colors correspond to the 12 inferred regimes in the tree according to MGPM* (Fig. 1). The error bars on white background on the right side of each plot denote the standard estimates with 95% confidence intervals from the extant species in each regime, ignoring the phylogenetic relationship; for each of the 12 regimes, the selected model type in MGPM* and the number of extant species are written in the top right corner of each plot. (Right) Silhouette images courtesy of Phylopic/T. Michael Keesey, Joseph Wolf, Natasha Vitek, Daniel Jaron, Catherine Yasuda, Allis Markham, Gareth Monger, Jan A. Venter, Herbert H. T. Prins, David A. Balfour, Rob Slotow, C. De Muizon, Scott Hartman, Michael Scroggie, Yan Wong, and Becky Barnes.