Thomas Marcussen, Lise Heier, Anne K Brysting, Bengt Oxelman, Kjetill S Jakobsen.
Abstract
Allopolyploidization accounts for a significant fraction of speciation events in many eukaryotic lineages. However, existing phylogenetic and dating methods require tree-like topologies and are unable to handle the network-like phylogenetic relationships of lineages containing allopolyploids. No explicit framework has so far been established for evaluating competing network topologies, and few attempts have been made to date phylogenetic networks. We used a four-step approach to generate a dated polyploid species network for the cosmopolitan angiosperm genus Viola L. (Violaceae Batch.). The genus contains ca 600 species and both recent (neo-) and more ancient (meso-) polyploid lineages distributed over 16 sections. First, we obtained DNA sequences of three low-copy nuclear genes and one chloroplast region, from 42 species representing all 16 sections. Second, we obtained fossil-calibrated chronograms for each nuclear gene marker. Third, we determined the most parsimonious multilabeled genome tree and its corresponding network, resolved at the section (not the species) level. Reconstructing the "correct" network for a set of polyploids depends on recovering all homoeologs, i.e., all subgenomes, in these polyploids. Assuming the presence of Viola subgenome lineages that were not detected by the nuclear gene phylogenies ("ghost subgenome lineages") significantly reduced the number of inferred polyploidization events. We identified the most parsimonious network topology from a set of five competing scenarios differing in the interpretation of homoeolog extinctions and lineage sorting, based on (i) fewest possible ghost subgenome lineages, (ii) fewest possible polyploidization events, and (iii) least possible deviation from expected ploidy as inferred from available chromosome counts of the involved polyploid taxa. Finally, we estimated the homoploid and polyploid speciation times of the most parsimonious network. 
Homoploid speciation times were estimated by coalescent analysis of gene tree node ages. Polyploid speciation times were estimated by comparing branch lengths and speciation rates of lineages with and without ploidy shifts. Our analyses recognize Viola as an old genus (crown age 31 Ma) whose evolutionary history has been profoundly affected by allopolyploidy. Between 16 and 21 allopolyploidizations are necessary to explain the diversification of the 16 major lineages (sections) of Viola, suggesting that allopolyploidy has accounted for a high percentage (between 67% and 88%) of the speciation events at this level. The theoretical and methodological approaches presented here for (i) constructing networks and (ii) dating speciation events within a network have general applicability for phylogenetic studies of groups where allopolyploidization has occurred. They make explicit use of a hitherto underexplored source of ploidy information from chromosome counts to help resolve phylogenetic cases where incomplete sequence data hampers network inference. Importantly, the coalescent-based method used herein circumvents the assumption of tree-like evolution required by most techniques for dating speciation events.
Keywords: Dating; Viola; low-copy nuclear gene; polyploidy; species network; Violaceae
Year: 2014 PMID: 25281848 PMCID: PMC4265142 DOI: 10.1093/sysbio/syu071
Source DB: PubMed Journal: Syst Biol ISSN: 1063-5157 Impact factor: 15.683
Table 1. Subdivision of the genus Viola (Violaceae) into sections, showing number of species (altogether 583–620), distribution, chromosome numbers, base chromosome numbers and inferred ploidy
| Section | Species | Distribution | Chromosome numbers (haploid) | Estimated ploidy |
|---|---|---|---|---|
| Sect. | 113 | S America | ||
| Sect. | 61 | N America, northeast Asia; | ||
| Sect. | 8 | southern S America | – | |
| Sect. | 3 | western Eurasia: southern Spain; Balkans | ||
| Sect. | 11–18 | eastern Australia; Tasmania | ||
| Sect. | 19 | S America | ||
| Sect. | 125 | western Eurasia; | ||
| Sect. | 31–50 | N, C and northern S America; Beringia; Hawaii | ||
| Sect. nov. A ( | 1–3 | Africa: equatorial high mountains | ||
| Sect. nov. B ( | 7–9 | western and central Asia: northern Iraq to Mongolia | – | |
| Sect. | 120 | northern hemisphere; ca 5 spp. in Australasia | ||
| Sect. | 3–6 | S America: Chile | ||
| Sect. | 1–4 | northeastern Africa to southwestern Asia | ||
| Sect. | 2 | southern S America | ||
| Sect. | 75 | northern hemisphere | ||
| Sect. | 3–4 | Mediterranean region; |
Notes: The systematics is provisional, based on earlier treatments (Becker 1925; Clausen 1929; Brizicky 1961; Clausen 1964) and our own studies, published (Marcussen et al. 2010; 2011; 2012) and in progress. Known chromosome numbers (n) within each section are indicated, with species names if only a few species have been counted, and numbers interpreted to be the base chromosome number (x) for each section are underlined. Inferred but not observed base chromosome numbers are underlined and given within square brackets. No base number is inferrable for section Melanium, owing to dysploidy (but see Erben 1996; Yockteng et al. 2003).
a) sensu Brizicky (1961), Clausen (1929, 1964) and Marcussen et al. (2012); i.e., including sect. Dischidium Ging. and grex Orbiculares Pollard.
b) sensu Marcussen et al. (2012); i.e., including subsects. Boreali-Americanae (W.Becker) Gil-ad, Langsdorffianae (W.Becker) auct., Mexicanae (W.Becker) auct. and Pedatae (Pollard) auct.
c) sensu Marcussen et al. (2012); i.e., excluding subsects. Boreali-Americanae (W.Becker) Gil-ad, Langsdorffianae (W.Becker) auct., Mexicanae (W.Becker) auct. and Pedatae (Pollard) auct.
d) sensu Brizicky (1961), Clausen (1964; as sect. Rostellatae Boiss.) and Marcussen et al. (2012).
e) V. montagnei, V. roigii (Sanso and Seo 2005).
f) V. delphinantha, V. cazorlensis (Schmidt 1964; Leal Perez-chao et al. 1980; Diosdado et al. 1993).
g) “V. hederacea complex” (Moore in Smith-White 1959).
h) V. dombeyana (Heilborn 1926; as V. humboldtii).
i) e.g., Yockteng et al. (2003).
j) following interpretation in Marcussen et al. (2012).
k) V. abyssinica (Morton 1993).
l) V. rubella (Blaxland & Windham in Marcussen et al. 2012).
m) V. stocksii (Khatoon and Ali 1993).
n) V. tridentata (Moore 1967).
o) V. arborescens, V. saxifraga (Arrigoni and Mori 1980; Galland 1985, 1988; Verlaque and Espeut 2007).
Figure. A flow diagram of the four-step approach used to estimate a dated network from individual gene trees, using Viola as an example. See main text for further details.
Table 2. Parsimony evaluation of five HOLM phylogenetic networks (A–E; Supplementary Table S4, http://dx.doi.org/10.5061/dryad.jc754) for Viola
| Scenario | Lineages | (1) Inferred ghost lineages | (2) Sum polyploidizations | Sect. Erpetion: inferred ploidy | Sect. Erpetion: actual ploidy | Sect. Erpetion: (3) undetected polyploidizations | Sect. Tridens: inferred ploidy | Sect. Tridens: actual ploidy | Sect. Tridens: (3) undetected polyploidizations | Parsimony score |
|---|---|---|---|---|---|---|---|---|---|---|
| A | 1 | 0 | 6 | | | 2 | | | 2 | 10 |
| B | 2 | 0 | 9 | | | 1 | | | 0 | 10 |
| C | 2 | 1: Erpetion 1 | 9 | | | 0 | | | 0 | 10 |
| D | 2 | 1: Chilenium 3 | 9 | | | 1 | | | 0 | 11 |
| E | 2 | 2: Chilenium 3, Erpetion 1 | 7 | | | 0 | | | 0 | 9 |
Notes: The corresponding multilabeled trees are shown in Figure 2. Parsimony criteria were (1) fewest possible “ghost” subgenome lineages (i.e., subgenome lineages that were not detected by the nuclear gene phylogenies), (2) fewest possible polyploidization events, and (3) least possible deviation in ploidy as inferred from available chromosome counts, i.e., for sect. Erpetion and for sect. Tridens. The three parsimony criteria were given equal weights and the sum of steps for each scenario was compared. Scenario E (nine steps) results in the most parsimonious network (Fig. 2). Lineages refer to whether the gene tree Clade I and Clade II (Fig. 4) are considered to represent one genome lineage (scenario A) under the assumption that the gene tree incongruence is due to deep coalescence, or two genome lineages (scenarios B–E) under the assumption of polyploidization followed by complementary loss of homoeologs in Clade I and Clade II.
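The equal-weight scoring described in these notes can be reproduced with a short sketch. The step counts are copied from the table above; the dictionary layout and the function name `parsimony_score` are illustrative assumptions, not part of the original analysis.

```python
# Step counts per scenario, copied from the table above:
# (ghost lineages, polyploidizations,
#  undetected polyploidizations in the first section column,
#  undetected polyploidizations in the second section column)
SCENARIOS = {
    "A": (0, 6, 2, 2),
    "B": (0, 9, 1, 0),
    "C": (1, 9, 0, 0),
    "D": (1, 9, 1, 0),
    "E": (2, 7, 0, 0),
}


def parsimony_score(ghosts, polyploidizations, undetected_first, undetected_second):
    """Equal-weight sum of steps over the three parsimony criteria."""
    return ghosts + polyploidizations + undetected_first + undetected_second


scores = {name: parsimony_score(*counts) for name, counts in SCENARIOS.items()}
best = min(scores, key=scores.get)  # scenario with the fewest steps

print(scores)
print("Most parsimonious scenario:", best)
```

Under these counts the sums match the parsimony scores listed in the table, with scenario E having the fewest steps.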
Figure 2. Multilabeled genome trees corresponding to the five competing network scenarios A–E in Table 2, reconciled from three low-copy nuclear gene phylogenies. The five competing network scenarios are explained in the text. The focal subclades with conflicting topology are shaded, and each inferred but not observed ghost lineage therein is indicated with a broken line and a small ghost symbol. The subclades CHAM and MELVIO have identical topology under all scenarios and are shown only for scenario E (collapsed in trees A–D), which produces the most parsimonious network. The individual gene phylogenies (GPI, NRPD2a, SDH) are superimposed on the genome tree for scenario E, and node labels are given. The network corresponding to scenario E is shown in Figure 4; networks for scenarios A–D are shown in Supplementary Table S4, http://dx.doi.org/10.5061/dryad.jc754. Homoeolog clades are indicated with numbers following section names.
Figure 4. Most parsimonious HOLM network for the 16 provisional sections of Viola based on the multilabeled tree in Figure 2E. Node labels correspond to those in Figure 2 and Supplementary Table S5. Lineages immediately involved in allopolyploid speciation are drawn as curved lines. A total of between 15 and 21 polyploid speciations was inferred; all 21 are shown here. Whether events 13–18 and 21 are seven independent polyploidizations or homoploid segregates of a single polyploidization is unclear. A multigene chloroplast phylogeny (Yoo and Jang 2010) and the pattern of ITS ribotype segregation (Ballard et al. 1999; Yockteng et al. 2003) indicate more than one origin, but these data are incomplete in terms of sampling and/or resolution. Estimated ages for homoploid and polyploid speciations are shown as means and medians, respectively. Estimated ploidy for section lineages ranges from diploidy (2x) to dodecaploidy (12x), as indicated after each section name and by line thickness for each lineage.
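The range of 15 to 21 polyploid speciations follows from the two readings of events 13–18 and 21 given in this caption. A minimal arithmetic sketch, with illustrative variable names that do not come from the original analysis:

```python
# All polyploid speciation events drawn in the network.
total_events_drawn = 21

# Events that may be either independent polyploidizations or
# homoploid segregates of a single polyploidization: 13-18 and 21.
ambiguous_events = list(range(13, 19)) + [21]

# Upper bound: every ambiguous event is an independent polyploidization.
upper = total_events_drawn

# Lower bound: the ambiguous events collapse into one polyploidization.
lower = total_events_drawn - len(ambiguous_events) + 1

print(lower, upper)
```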
Figure. The two types of branches in a polyploid species network and their associated parameters. a) Species branch with no polyploid speciation event. The branch spans the time interval between two homoploid speciations occurring at time and time and is associated with a homoploid speciation rate . b) Species network containing one polyploid speciation event. The branch spans the time interval between the youngest split from the parental lineages at time and the homoploid speciation at time , within which interval an allopolyploidization happened at time . This branch is associated with a polyploid speciation rate before polyploidization (i.e., between and ) and with the general homoploid speciation rate after polyploidization (i.e., between and ). When , and are known, and can be estimated from the data. The difference in the estimates for and is due to different coalescence and arises because the individuals representing the parental lineages and the polyploid are not identical with those originally involved in the allopolyploidization.