Tanja Stadler, Alexandra Gavryushkina, Rachel C M Warnock, Alexei J Drummond, Tracy A Heath.
Abstract
A birth-death-sampling model gives rise to phylogenetic trees with samples from the past and the present. Interpreting "birth" as branching speciation, "death" as extinction, and "sampling" as fossil preservation and recovery, this model, also referred to as the fossilized birth-death (FBD) model, gives rise to phylogenetic trees on extant and fossil samples. The model has been mathematically analyzed and successfully applied to a range of datasets on different taxonomic levels, such as penguins, plants, and insects. However, the current mathematical treatment of this model does not allow for a group of temporally distinct fossil specimens to be assigned to the same species. In this paper, we provide a general mathematical FBD modeling framework that explicitly takes "stratigraphic ranges" into account, with a stratigraphic range being defined as the lineage interval associated with a single species, ranging through time from the first to the last fossil appearance of the species. To assign a sequence of fossil samples in the phylogenetic tree to the same species, i.e., to specify a stratigraphic range, we need to define the mode of speciation. We provide expressions to account for three common speciation modes: budding (or asymmetric) speciation, bifurcating (or symmetric) speciation, and anagenetic speciation. Our equations allow for flexible joint Bayesian analysis of paleontological and neontological data. Furthermore, our framework is directly applicable to epidemiology, where a stratigraphic range is the observed duration of infection of a single patient, "birth" via budding is transmission, "death" is recovery, and "sampling" is sequencing the pathogen of a patient. Thus, we present a model that allows for incorporation of multiple observations through time from a single patient.
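To make the process described in the abstract concrete, the following is a minimal forward-simulation sketch, not the authors' implementation. It assumes constant rates and budding speciation only, and it measures time as age before the present. The function names (`poisson_times`, `simulate_fbd_budding`, `stratigraphic_range`), the parameter names (`lam`, `mu`, `psi`, `rho`, `t_origin`), and the choice to let a sampled extant species close its range at age 0 are illustrative assumptions.

```python
import random


def poisson_times(rng, rate, t_start, t_stop):
    """Ages of homogeneous Poisson events between t_start (older) and t_stop (younger)."""
    times = []
    if rate <= 0:
        return times
    t = t_start - rng.expovariate(rate)
    while t > t_stop:
        times.append(t)
        t -= rng.expovariate(rate)
    return times


def simulate_fbd_budding(lam, mu, psi, rho, t_origin, seed=None, max_species=10_000):
    """Simulate species under birth (lam), death (mu), fossil sampling (psi),
    and extant sampling (rho), starting from one species at age t_origin.

    Assumption: budding speciation only, so a parent species persists
    unchanged after each speciation event.
    """
    rng = random.Random(seed)
    species = []
    pending = [(t_origin, None)]  # (origination age, index of parent species)
    while pending:
        if len(species) >= max_species:
            raise RuntimeError("too many species; lower lam or t_origin")
        t_birth, parent = pending.pop()
        # Exponential lifetime; the species is extant if it outlives the present.
        lifetime = rng.expovariate(mu) if mu > 0 else float("inf")
        extant = lifetime >= t_birth
        t_end = 0.0 if extant else t_birth - lifetime
        idx = len(species)
        species.append({
            "parent": parent,
            "t_birth": t_birth,
            "t_end": t_end,
            "extant": extant,
            "fossils": poisson_times(rng, psi, t_birth, t_end),
            "sampled_extant": extant and rng.random() < rho,
        })
        # Each budding event along this species' lifetime founds a new species.
        for t_bud in poisson_times(rng, lam, t_birth, t_end):
            pending.append((t_bud, idx))
    return species


def stratigraphic_range(sp):
    """Return (first appearance, last appearance) as ages, or None if unsampled."""
    ages = list(sp["fossils"])
    if sp["sampled_extant"]:
        ages.append(0.0)  # assumption: an extant sample closes the range at the present
    if not ages:
        return None
    return max(ages), min(ages)


if __name__ == "__main__":
    tree = simulate_fbd_budding(lam=1.0, mu=0.5, psi=2.0, rho=0.8, t_origin=3.0, seed=42)
    sampled = [r for r in map(stratigraphic_range, tree) if r is not None]
    print(len(tree), "species simulated,", len(sampled), "with a stratigraphic range")
```

Because every species carries its own list of fossil ages, several temporally distinct fossils are attached to one species by construction, which is the situation the stratigraphic-range framework is meant to handle.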
Keywords: Birth-death process; Fossils; Macroevolution; Phylogenetic tree; Sampling-through-time; Tree prior
Year: 2018 PMID: 29550451 PMCID: PMC5931795 DOI: 10.1016/j.jtbi.2018.03.005
Source DB: PubMed Journal: J Theor Biol ISSN: 0022-5193 Impact factor: 2.691
Fig. 1. Three speciation modes as described in Foote (1996). The gray and white rectangles represent distinct species. In (i) asymmetric or budding speciation, the ancestral species (gray rectangle) survives after the speciation event, whereas in the (ii) symmetric or bifurcating and (iii) anagenetic cases, the ancestral species is replaced by two or one descendant species, respectively.
Fig. 2. A complete species tree of three species that originated through asymmetric speciation is shown on the left. In the middle, an “oriented” species tree is shown with asymmetric speciation corresponding to the species tree of the same three taxa. At each speciation event, one of the two new branches is labeled with A, because it represents a continuation of the ancestral species, and the other with D, designating the new descendant species. In an oriented tree, every species is identified by a unique sequence of A and D branches. Thus, the oldest species is identified by DA, the one that diverges next by DDA, and the most recent by DDD. On the right, a labeled species tree is shown where the orientations are omitted and every species is assigned a label (taxon name) instead. The labeled tree representation is more common in existing phylogenetic software. In all three representations the same-colored segments represent the same species. (For interpretation of the references to color in this figure legend, the reader is referred to the web version of this article.)
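The A/D labeling in the Fig. 2 caption can be read as a small string rule, sketched below under the assumption that all speciation is budding and that the root branch is labeled D, as the caption's DA, DDA, and DDD identifiers suggest. The function name `species_branches` is hypothetical, and the rule is one reading of the caption rather than code from the paper: a species starts on a D branch and continues along every following A branch, so stripping trailing A's from a tip's path gives the branch on which its species originated.

```python
def species_branches(tip_path):
    """List the branches (as root-to-branch A/D paths) that belong to the
    species found at the tip with the given path.

    Assumes budding speciation only and a root branch labeled "D".
    """
    if not tip_path or tip_path[0] != "D" or set(tip_path) - {"A", "D"}:
        raise ValueError("path must start with 'D' and contain only 'A' and 'D'")
    # The species originated on the last D branch of the path.
    origin = tip_path.rstrip("A")
    # Every prefix from the origin branch down to the tip is the same species.
    return [tip_path[:i] for i in range(len(origin), len(tip_path) + 1)]


if __name__ == "__main__":
    for tip in ("DA", "DDA", "DDD"):
        print(tip, species_branches(tip))
```

For the three tips named in the caption, the branch lists do not overlap and together account for all five branches of the three-species tree, which matches the statement that same-colored segments represent the same species.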
Notation used throughout this paper.
| Rate of branching speciation | |
| Rate of anagenetic speciation | |
| Probability of symmetric (vs. asymmetric) speciation | |
| Fossil sampling rate | |
| Extant species sampling probability | |
| Extinction rate | |
| ( | |
| Time of origin of a tree | |
| Speciation times in a sampled tree | |
| Orientation of the two branches descending a budding branching event | |
| Orientation of the two branches descending a general branching event | |
| Number of sampled stratigraphic ranges, | |
| Number of sampled stratigraphic ranges where the associated species goes extinct before present | |
| Number of sampled stratigraphic ranges with an extant species sample | |
| Number of sampled-ancestor-stratigraphic ranges | |
| Total number of sampled fossils | |
| Total number of sampled fossils that represent the start and end times of stratigraphic ranges (including ranges represented by a single occurrence) | |
| Total number of sampled fossils within the stratigraphic ranges ( | |
| Indicates the presence of a fossil within a stratigraphic interval if =1, and absence if =0 | |
| Number of branching speciation events in the labeled tree where we know the orientation | |
| Number of budding speciation events (out of the | |
| Extinction time of species associated with stratigraphic range | |
| Time of first observed fossil corresponding to the species represented by stratigraphic range | |
| Time of last observed fossil corresponding to the species represented by stratigraphic range | |
| Branching event in extended sampled tree giving rise to the straight line on which stratigraphic range | |
| Number of lineages co-existing at the birth time | |
| Most recent stratigraphic range ancestral to stratigraphic range | |
| Time of augmented unobserved speciation event that gave rise to the species associated with stratigraphic range | |
| Set of stratigraphic ranges, with | |
| Sum of all stratigraphic range lengths | |
| Length of a sub-branch spanning a stratigraphic interval | |
| Start time of branch | |
| End time of branch | |
| Oriented extended sampled tree | |
| Labeled extended sampled tree | |
| Oriented sampled tree | |
| Tree when ignoring the | |
| Tree when ignoring the number of fossils within a stratigraphic interval | |
| Summary of fossil occurrence data with |
Fig. 3. Example of a complete tree (left) and its extended sampled tree (middle) and sampled tree (right). We mark all fossil and extant species’ samples with a diamond. The stratigraphic ranges are marked in blue, the extended stratigraphic ranges in grey. We remind the reader that a straight line in these trees represents our graphical representation, meaning the oldest branch in a line is the D branch and all subtending branches are A branches. We omit D/A here for clarity of the figure. Furthermore, we omit in the extended sampled tree the fossils within stratigraphic range i that are younger than o, and in the sampled trees the fossils that appear between o and y, as the times of these fossils do not contribute to the probability density of the respective tree. The numbering of species and bifurcation events is chosen to simplify the notation and does not reflect the chronological order of the events. Theorem 2 provides the probability density for the oriented extended sampled tree, Corollary 5 for the labeled extended sampled tree, and Corollary 6 for the extended sampled tree when summing over the possible tree topologies. Theorem 8 provides the probability density for the oriented sampled tree. (For interpretation of the references to color in this figure legend, the reader is referred to the web version of this article.)
Fig. 4. Illustration of sampled-ancestor-stratigraphic range assignment to non-stratigraphic range lineages X, Y, Z of a sampled tree. If sampled ancestor SA1 is assigned to lineage X, then SA2 can be assigned to Y or Z, while if SA1 is assigned to lineage Y, then SA2 can be assigned only to Z.
Fig. 5. A complete species tree with three speciation modes (mixed speciation) is shown on the left. A sampled tree with mixed speciation is shown on the right.