
A complete time-calibrated multi-gene phylogeny of the European butterflies.

Martin Wiemers, Nicolas Chazot, Christopher W Wheat, Oliver Schweiger, Niklas Wahlberg.

Abstract

With the aim of supporting ecological analyses in butterflies, the third most species-rich superfamily of Lepidoptera, this paper presents the first time-calibrated phylogeny of all 496 extant butterfly species in Europe, including 18 very localised endemics for which no public DNA sequences had been available previously. It is based on a concatenated alignment of the mitochondrial gene COI and up to eleven nuclear gene fragments, using Bayesian inferences of phylogeny. To avoid analytical biases that could result from our region-focussed sampling, our European tree was grafted upon a global genus-level backbone butterfly phylogeny for analyses. In addition to a consensus tree, the posterior distribution of trees and the fully concatenated alignment are provided for future analyses. Altogether a complete phylogenetic framework of European butterflies for use by the ecological and evolutionary communities is presented.

Keywords:  Butterflies of Europe; divergence times; macroecology; phylogeny; time tree

Year:  2020        PMID: 32550787      PMCID: PMC7289901          DOI: 10.3897/zookeys.938.50878

Source DB:  PubMed          Journal:  Zookeys        ISSN: 1313-2970            Impact factor:   1.546


Introduction

The incorporation of phylogenetic information in ecological theory and research has led to significant advancements by facilitating the connection of large-scale and long-term macro-evolutionary processes with ecological processes in the analysis of species interactions with their abiotic and biotic environments (Webb et al. 2002; Mouquet et al. 2012). Phylogenies are increasingly used across diverse areas of macroecological research (Roquet et al. 2013), such as studies on large-scale diversity patterns (De Palma et al. 2017), disentangling historical and contemporary processes (Mazel et al. 2017), latitudinal diversity gradients (Economo et al. 2018) or improving species-area relationships (Mazel et al. 2015). Phylogenetic information has also improved studies on assembly rules of local communities (Cavender-Bares et al. 2009; Gerhold et al. 2015; D’Amen et al. 2018), including spatiotemporal community dynamics (Monnet et al. 2014) and multi-spatial and -temporal context-dependencies (Ovaskainen et al. 2017). Additionally, phylogenetic information has provided insights into the mechanisms and consequences of biological invasions (Knapp et al. 2008; Winter et al. 2009; Li et al. 2015; Gallien et al. 2017). Phylogenies also contribute to assessments of ecosystem functioning and service provisioning (Díaz et al. 2013; Davies et al. 2016), though phylogenetic relationships cannot simply be taken as a one-to-one proxy for ecosystem functioning (Winter et al. 2013; Mazel et al. 2018). However, they are of great value for studies of species traits and niche characteristics by quantifying the amount of phylogenetic conservatism (Wiens and Graham 2005) and ensuring statistical independence (Kühn et al. 2009) in multi-species studies. Using an ever-increasing toolkit of phylogenetic metrics (Schweiger et al. 2008; Tucker et al. 2017), and a growing body of phylogenetic insights, the aforementioned advances across diverse research fields document how integrating evolutionary and ecological information can enhance assessments of future impacts of global change on biodiversity (Thuiller et al. 2011; Lavergne et al. 2013; Morales-Castilla et al. 2017) and consequently inform conservation efforts (Thuiller et al. 2015; but see also Winter et al. 2013).

Although the amount of molecular data has increased exponentially during the last decades, most available phylogenetic studies are either restricted to a selected subset of species, higher taxa, or to small geographic areas. Complete and dated species-level phylogenetic hypotheses for species-rich taxa of larger regions have been restricted to vascular plants (Durka and Michalski 2012) or vertebrates, such as global birds (Jetz et al. 2012) or European tetrapods (Roquet et al. 2014), or the analyses are based on molecular data from a small subset of species (e.g., 5% in ants; Economo et al. 2018). Regionally complete phylogenetic hypotheses are rare for insects, although they comprise the majority of multicellular life on Earth (Stork 2018), have enormous impacts on ecosystem functioning, provide a multitude of ecosystem services (Noriega et al. 2018), and have long been used as biodiversity indicators (McGeoch 2007).

Here, we present the first comprehensive time-calibrated molecular phylogeny of all 496 extant European butterfly species (Lepidoptera: Papilionoidea), based on one mitochondrial and up to eleven nuclear genes, and the most recent systematic list of European butterflies (Wiemers et al. 2018). European butterflies are well-studied, ranging from population level analyses (Settele et al. 2009) to large-scale impacts of global change (Devictor et al. 2012). There is also good knowledge of species traits and environmental niche characteristics (Bartonova et al. 2014; Schweiger et al. 2014), population trends (van Swaay et al. 2006; van Swaay et al. 2010) and large-scale distributions (Settele et al. 2008; Kudrna et al. 2011). Butterflies are thus well placed for studies in the emerging field of ecophylogenetics (Mouquet et al. 2012).

Compared to other groups of insects, the phylogenetic relationships of butterflies are reasonably well-known, with robust backbone molecular phylogenies at the subfamily (Wahlberg et al. 2005a; Heikkilä et al. 2012; Espeland et al. 2018) and genus-level (Chazot et al. 2019). In addition, molecular phylogenies also exist for most butterfly families (Campbell et al. 2000; Caterino et al. 2001; Wahlberg et al. 2003; Braby et al. 2006; Warren et al. 2008; Wahlberg et al. 2009; Wahlberg et al. 2014; Espeland et al. 2015; Sahoo et al. 2016; Seraphim et al. 2018; Toussaint et al. 2018; Allio et al. 2020) as well as major subgroups (Wahlberg et al. 2005b; Peña et al. 2006; Nylin and Wahlberg 2008; Peña and Wahlberg 2008; Wiemers et al. 2010; Talavera et al. 2013; Peña et al. 2015; Condamine et al. 2018) and comprehensive COI data at the species level are available from DNA barcoding studies (Wiemers and Fiedler 2007; Dincă et al. 2011; Hausmann et al. 2011; Dincă et al. 2015; Huemer and Wiesmair 2017; Litman et al. 2018).

Some ecological studies on butterflies have already incorporated phylogenetic information, e.g., on the impact of climate change on abundance trends (Bowler et al. 2015; Bowler et al. 2017), the sensitivity of butterflies to invasive species (Gallien et al. 2017; Schleuning et al. 2016) or the ecological determinants of butterfly vulnerability (Essens et al. 2017). However, the phylogenetic hypotheses used in these studies had incomplete taxon coverage and were not made available for reuse by other researchers. A first complete phylogeny of European butterflies was published by Dapporto et al. (2019) but this tree was not based on a global backbone phylogeny and therefore was also not time-calibrated.

To fill these gaps in the literature, and to facilitate the growing field of ecophylogenetics, here we present the first complete and time-calibrated species-level phylogeny of a speciose higher invertebrate taxon above the family level for an entire continent. Importantly, we provide this continent-wide fully resolved phylogeny in standard analysis formats for further advancements in theoretical and applied ecology.

Materials and methods

Taxonomic, spatial, and temporal coverage

We analyse a dataset comprising all extant European species of butterflies (Papilionoidea), including the families Papilionidae, Hesperiidae, Pieridae, Lycaenidae, Riodinidae, and Nymphalidae. We base our species concepts, as well as the area defined as Europe, on the latest checklist of European butterflies (Wiemers et al. 2018).

Acquisition of sequence data

The data were mainly collated from published sources and downloaded from NCBI GenBank (Suppl. material 1). One mitochondrial gene, cytochrome c oxidase subunit I (COI, 1464 bp), was available for all species in the data matrix, in particular the 5’ half of the gene (658 bp, also known as the DNA barcode). Eleven nuclear genes were included when available: elongation factor-1α (EF-1α, 1240 bp), carbamoyl-phosphate synthase domain protein (CAD, 850 bp), cytosolic malate dehydrogenase (MDH, 733 bp), isocitrate dehydrogenase (IDH, 711 bp), glyceraldehyde-3-phosphate dehydrogenase (GAPDH, 691 bp), ribosomal protein S5 (RpS5, 617 bp), arginine kinase (ArgK, 596 bp), wingless (412 bp), ribosomal protein S2 (RpS2, 411 bp), DOPA decarboxylase (DDC, 373 bp), and histone 3 (H3, 329 bp). H3 has been sequenced almost exclusively for the family , while the other gene regions have been sampled widely also in the other butterfly families. For each gene, the longest available sequence was used. However, in the case of several available sequences of similar length, those of European origin were preferentially used. Sequences were aligned manually to maintain protein reading frame, and were curated and managed using VoSeq (Peña and Malm 2012). In many cases, new sequences were generated for this study. For these specimens, protocols followed Wahlberg and Wheat (2008) or Wiemers and Fiedler (2007). These include several species that did not have any available published sequences, many of which are island endemics (Table 1). The 239 new sequences have been submitted to GenBank (accessions KC462784–KC462854, MN752702–MN752850, MN829460–MN829496).
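The selection rule described above (longest sequence per gene, with a preference for European material among sequences of similar length) can be sketched as follows. This is only an illustration: the text does not define "similar length", so the 5% tolerance, the record fields and the accession labels are all hypothetical.

```python
def pick_sequence(candidates, tolerance=0.05):
    """Pick one sequence per gene from a list of candidate records.

    Each candidate is a dict with hypothetical keys:
    'accession' (str), 'length' (int, bp) and 'european' (bool).
    'tolerance' is an assumed definition of "similar length":
    sequences within this fraction of the longest one.
    """
    longest = max(c["length"] for c in candidates)
    # Candidates whose length is close to that of the longest sequence
    similar = [c for c in candidates if c["length"] >= longest * (1 - tolerance)]
    # Prefer European origin among the similar-length candidates
    european = [c for c in similar if c["european"]]
    pool = european or similar
    return max(pool, key=lambda c: c["length"])


# Hypothetical records for one species and one gene
records = [
    {"accession": "ACC1", "length": 658, "european": False},
    {"accession": "ACC2", "length": 650, "european": True},
    {"accession": "ACC3", "length": 400, "european": True},
]
chosen = pick_sequence(records)
```

With the assumed tolerance, the 650 bp European record is chosen over the slightly longer non-European one; with a tolerance of zero the rule reduces to "longest sequence wins".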
Table 1.

Newly sequenced species for which no published sequences had previously been available.

Taxon                   Origin                  COI       EF-1α     GAPDH     Wingless
Coenonympha orientalis  Greece                  MN829478  MN829462  -         -
Glaucopsyche paphos     Cyprus                  MN829481  MN829463  -         -
Gonepteryx maderensis   Portugal: Madeira       MN829482  MN829464  -         -
Hipparchia azorina      Portugal: Azores        MN829483  MN829465  -         -
Hipparchia bacchus      Spain: Canary Islands   MN829484  MN829466  -         -
Hipparchia cretica      Greece: Crete           MN752718  MN829467  MN752786  MN752837
Hipparchia gomera       Spain: Canary Islands   MN829485  MN829468  -         -
Hipparchia maderensis   Portugal: Madeira       MN829486  -         -         -
Hipparchia mersina      Greece: Samos           MN752720  MN829469  MN752785  MN752836
Hipparchia miguelensis  Portugal: Madeira       MN829487  -         -         -
Hipparchia sbordonii    Italy: Pontine Islands  MN752723  -         -         -
Hipparchia tamadabae    Spain: Canary Islands   MN829488  -         -         -
Hipparchia tilosi       Spain: Canary Islands   MN829489  -         -         -
Hipparchia wyssii       Spain: Canary Islands   MN829490  MN829470  -         -
Lycaena bleusei         Spain                   MN829492  -         -         -
Pieris balcana          North Macedonia         KC462788  -         -         -
Pieris wollastoni       Portugal: Madeira       KC462820  -         -         -
Thymelicus christi      Spain: Canary Islands   MN829496  -         -         -

Almost all genera are represented by multiple genes, except , , , , and (the latter recently synonymised with ; Fric et al. 2019) which are represented only by the COI gene. Species represented by only the DNA barcode tend to be closely related to species with more genes sequenced (Suppl. material 1), minimising the potential bias these samples could have in our analyses.

Phylogenetic tree reconstructions

A biogeographically restricted tree of a given taxon is inherently very asymmetrically sampled. To avoid potentially strong biases when estimating topology and divergence times we chose to build upon the recent genus-level tree of butterflies (Chazot et al. 2019), which provides a well-supported time-calibrated backbone and is congruent with a recent phylogenomic analysis of Lepidoptera (Kawahara et al. 2019). This backbone tree contains 994 taxa, each taxon representing a genus across all butterflies. The tree was time-calibrated using a set of 14 fossil calibration points, which provided minimum ages, and ten calibration points based on ages of host plant clades taken from the literature, which provided maximum ages. Importantly, Chazot et al. (2019) tested the robustness of their results to a wide range of alternative assumptions made in the time-calibration analysis, and showed that the estimated times of divergences were robust.

Analysis overview

To estimate a time-calibrated tree of European butterflies, we first identified the position of the European lineages and designed a grafting procedure accordingly. We split the European butterflies that needed to be added to the tree into 12 subclades. For each of these subclades we combined the DNA sequences of the taxa already included in the backbone to the DNA sequences of the European taxa to assemble an aligned molecular matrix. After identifying the best partitioning scheme, we performed a tree reconstruction without time-calibration (i.e., only estimating branch lengths proportional to relative time). The subclade trees were then rescaled using the ages estimated in the backbone and were subsequently grafted. This procedure was repeated using 1000 trees from BEAST posterior distributions of the backbone and subclade trees in order to obtain a posterior distribution of grafted trees. The details of these procedures are described below.
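The rescaling step in this overview can be illustrated with a minimal sketch. It assumes an ultrametric subclade tree with relative branch lengths, stored as nested dictionaries (a hypothetical representation chosen for this example, not the format used in the study), and multiplies every branch by the ratio of the backbone crown age to the relative root-to-tip height.

```python
def tip_depth(node):
    """Distance from a node down to its deepest tip (sum of branch lengths)."""
    children = node.get("children", [])
    if not children:
        return 0.0
    return max(child["length"] + tip_depth(child) for child in children)


def rescale(node, factor):
    """Return a copy of the tree with every branch length multiplied by factor."""
    return {
        "name": node.get("name"),
        "length": node.get("length", 0.0) * factor,
        "children": [rescale(c, factor) for c in node.get("children", [])],
    }


def rescale_to_crown_age(subclade_root, crown_age):
    """Stretch a relative-time tree so that its root sits at crown_age."""
    factor = crown_age / tip_depth(subclade_root)
    return rescale(subclade_root, factor)


# Hypothetical subclade with relative branch lengths (root height = 1.0)
relative_tree = {
    "name": "crown",
    "children": [
        {"name": "A", "length": 1.0},
        {"name": "AB", "length": 0.4, "children": [
            {"name": "B", "length": 0.6},
            {"name": "C", "length": 0.6},
        ]},
    ],
}
# Hypothetical crown age of 30 (in the time units of the backbone)
dated_tree = rescale_to_crown_age(relative_tree, crown_age=30.0)
```

Because only a single scaling factor is applied, the relative timing of divergences inside the subclade is preserved; only the absolute ages change.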

Backbone and subclades

The time-calibrated backbone tree provided by Chazot et al. (2019) contained about 55% of all butterfly genera, including 79% of the genera occurring in Europe. A fixed topology was obtained using RAxML (Stamatakis 2014) and node ages were estimated with BEAST v.1.8.3 (Suchard et al. 2018). We used this fixed topology from Chazot et al. (2019) to identify at which nodes European clades should be grafted. We partitioned the analysis into 12 subclades. For each subclade, the DNA sequences of all taxa already included in the global backbone (including also non-European taxa) were combined with the DNA sequences of all the new European taxa that were added. In addition to the focal taxa, we added between two and four outgroups. We note that the relationships of the 12 subclades were fixed according to Chazot et al. (2019), while the relationships of species within the 12 subclades were estimated with the new data. The subclades, sorted by families, were defined as follows:
– All were placed into one subclade.
– We identified two main clades to graft within the : and . The subclade was extended to also encompass the subfamilies and . The genus , not available in the backbone, was included in the subclade.
– All were considered as a single clade.
– All were considered as a single clade.
– The only European species, , was already available in the backbone tree.
– European were divided into seven subclades.
  (i) A subclade for the .
  (ii) In order to add we generated a tree of .
  (iii) We combined the sister clades and into a single subclade.
  (iv) was treated as a single subclade.
  (v) A first clade of contained the genera , , , , and .
  (vi) A second clade contained the genera , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , and .
  (vii) A third clade was created for the genus .

were not treated separately from the backbone. is the only occurring in Europe and (which is very closely related to ; Aduse-Poku et al. 2009) was already included in the backbone tree from Chazot et al. (2019). Hence, we used the position of for .

Partitioning the dataset

For each subclade we ran PartitionFinder 2.1.1 (Lanfear et al. 2016) in order to select the best partitioning strategy and corresponding substitution models. The dataset was initially partitioned into genes and codon positions. Branch lengths were set to linked and the comparison between partitioning strategies was made using the greedy algorithm and BIC score (Lanfear et al. 2012).
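As a rough sketch of the starting point for this step, the code below builds gene-by-codon-position data blocks and computes a BIC score. The block names follow the gene_posN pattern seen in subset names such as RpS2_pos2; the gene order, the assumption that each fragment starts at codon position 1, and the helper names are illustrative only, and the BIC formula is the standard definition rather than anything specific to PartitionFinder.

```python
import math

# Fragment lengths (bp) as reported in the sequence acquisition section;
# only three genes are listed here to keep the example short.
GENE_LENGTHS = [("COI", 1464), ("EF1a", 1240), ("CAD", 850)]


def codon_blocks(gene_lengths):
    """Return (name, start, end, step) blocks, one per gene and codon position.

    Assumes genes are concatenated in the given order and that every
    fragment begins at codon position 1.
    """
    blocks = []
    start = 1
    for gene, length in gene_lengths:
        end = start + length - 1
        for pos in range(3):
            blocks.append((f"{gene}_pos{pos + 1}", start + pos, end, 3))
        start = end + 1
    return blocks


def bic(log_likelihood, n_params, n_sites):
    """Bayesian information criterion; lower values indicate a preferred scheme."""
    return n_params * math.log(n_sites) - 2.0 * log_likelihood


blocks = codon_blocks(GENE_LENGTHS)
```

A greedy search would start from blocks like these, merge pairs of subsets, and keep a merge whenever it lowers the BIC score.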

Phylogenetic reconstruction

For each subclade, the dataset was imported in BEAUTi 1.8.3 (Drummond et al. 2012) and partitioned according to the partitioning strategy identified by PartitionFinder. We enforced the monophyly of the clade to be grafted (i.e., excluding the outgroups). All other relationships were estimated by BEAST 1.8.3. (Suchard et al. 2018). We used an uncorrelated relaxed clock with lognormal distribution. By default, we started by setting one molecular clock per partition. If convergence or good mixing could not be obtained after running BEAST we reduced the number of molecular clocks (see details for each dataset further below). We did not add any time-calibration and therefore only estimated the relative timing of divergence. We performed at least two independent runs with BEAST for each subclade. We checked for convergence and mixing of the MCMC using Tracer 1.7.1 (Rambaut et al. 2018) and in the case of full convergence of the runs, the posterior distribution of trees from different runs were combined after removing the burn-in fraction.
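The last step, combining independent runs after discarding burn-in, can be sketched in a few lines. The burn-in fraction is not stated in the text, so the 10% default below is an assumption, and the plain lists stand in for the sampled trees of each run.

```python
def combine_runs(runs, burnin_fraction=0.1):
    """Drop the first burnin_fraction of each run and concatenate the rest.

    'runs' is a list of lists, each holding the samples of one
    independent MCMC run in the order they were logged.
    """
    if not 0 <= burnin_fraction < 1:
        raise ValueError("burnin_fraction must be in [0, 1)")
    combined = []
    for samples in runs:
        n_burnin = int(len(samples) * burnin_fraction)
        combined.extend(samples[n_burnin:])
    return combined


# Two hypothetical runs of ten samples each
run_a = list(range(10))
run_b = list(range(100, 110))
posterior = combine_runs([run_a, run_b], burnin_fraction=0.2)
```

Runs should only be pooled in this way once convergence and mixing have been checked, as described above.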

Grafting procedure

Subclades were grafted on the backbone as follows. One backbone was sampled from the posterior distribution of time-calibrated trees from Chazot et al. (2019). For each subclade, one subclade tree was sampled from the posterior distribution of trees, the outgroups removed, and the tree was rescaled based on the crown age of the subclade extracted from the backbone tree. Finally, the rescaled subclade tree was grafted on the backbone after removing all lineages belonging to this subclade in the backbone (i.e., only keeping the stem branch). We repeated this procedure for 1000 backbone trees and 1000 subclade trees, and we thus obtained a posterior distribution of 1000 grafted trees. The topology of the backbone was fixed (see Chazot et al. 2019) but the topologies of the subclades were free. Hence the posterior distribution of grafted trees includes a posterior distribution of topologies and node ages. We describe below the details of the phylogenetic tree reconstruction for each subclade.

1. Dataset – The dataset for the consisted of 36 taxa to which three outgroups were added: (), (), (). We concatenated 11 gene fragments (COI, CAD, EF-1α, GAPDH, ArgK, IDH, MDH, RpS2, RpS5, DDC, wingless).
   PartitionFinder – PartitionFinder identified 12 subsets (Suppl. material 2: Table S1).
   BEAST analysis – In order to improve the quality of our runs we replaced the default priors for rates of substitutions by uniform priors ranging between 0 and 10 for the following cases: subset5.at, subset5.cg, subset7.cg, subset7.gt, subset12.cg, subset12.gt. We used one molecular clock per subset identified by PartitionFinder and obtained good mixing and convergence. We used a birth-death tree prior. We performed three runs of 40 million generations, sampling trees and parameters every 4000 generations.
   Grafting – For grafting, the outgroups were removed, as well as , the first to diverge and endemic to Mexico (Allio et al. 2020), i.e., we grafted at the most recent common ancestor (MRCA) of all but .

2. Dataset – The dataset for the consisted of 169 taxa to which two outgroups were added: (: ), (: ). We concatenated 10 gene fragments (COI, CAD, EF-1α, GAPDH, ArgK, IDH, MDH, RpS2, RpS5, wingless).
   PartitionFinder – PartitionFinder identified 17 subsets (Suppl. material 2: Table S2).
   BEAST analysis – Preliminary analyses showed problems with subset 3 (ArgKin_pos3), which was therefore removed from the analyses. In order to improve the quality of our runs we replaced the default priors for rates of substitutions by uniform priors ranging between 0 and 10 for the following case: subset17.cg. The substitution model for subset 14 was also changed into HKY+I after preliminary analyses. We used one molecular clock per subset identified by PartitionFinder and obtained good mixing and convergence. We used a birth-death tree prior. We performed two runs of 150 million generations, sampling trees and parameters every 15000 generations.
   Grafting – For grafting, the outgroups were removed and the subclade grafted at the MRCA of .

3. Dataset – The dataset for the consisted of 77 taxa to which three outgroups were added: (: ), (: ), and (: ). We concatenated ten gene fragments (COI, CAD, EF-1α, GAPDH, ArgK, IDH, MDH, RpS2, RpS5, wingless).
   PartitionFinder – PartitionFinder identified 14 subsets (Suppl. material 2: Table S3).
   BEAST analysis – In order to improve the quality of our runs we replaced the default priors for rates of substitutions by uniform priors ranging between 0 and 10 for the following cases: subset7.ac, subset7.gt, subset14.cg, subset3.cg. Preliminary analyses showed problems when using a separate molecular clock for each subset identified by PartitionFinder. We restricted the analysis to one molecular clock. We used a birth-death tree prior. We performed two runs of 100 million generations, sampling trees and parameters every 10000 generations.
   Grafting – For grafting, the outgroups were removed, and the subclade grafted at the MRCA of .

4. Dataset – The dataset for the consisted of 126 taxa to which three outgroups were added: (), (), and (). We concatenated eleven gene fragments (COI, CAD, EF-1α, GAPDH, ArgK, IDH, MDH, RpS2, RpS5, DDC, wingless).
   PartitionFinder – PartitionFinder identified 17 subsets (Suppl. material 2: Table S4).
   BEAST analysis – In order to improve the quality of our runs we replaced the default priors for rates of substitutions by uniform priors ranging between 0 and 10 for the following case: subset7.cg. The substitution model for subset 7 was also changed into GTR+G after preliminary analyses. We used one molecular clock per subset identified by PartitionFinder and obtained good mixing and convergence. We used a birth-death tree prior. We performed two runs of 100 million generations, sampling trees and parameters every 10000 generations.
   Grafting – For grafting, the outgroups were removed, and the subclade grafted at the MRCA of .

5. Dataset – The dataset for the consisted of 187 taxa to which three outgroups were added: (), () and (). We concatenated 12 gene fragments (COI, CAD, EF-1α, GAPDH, ArgK, IDH, MDH, RpS2, RpS5, DDC, wingless and H3).
   PartitionFinder – PartitionFinder identified 12 subsets (Suppl. material 2: Table S5).
   BEAST analysis – In order to improve the quality of our runs we replaced the default priors for rates of substitutions by uniform priors ranging between 0 and 10 for the following cases: subset3.cg, subset6.ag, subset6.at, subset11.gt_subst7.cg. We used one molecular clock per subset identified by PartitionFinder and obtained good mixing and convergence. We used a birth-death tree prior. We performed two runs of 150 million generations, sampling trees and parameters every 15000 generations.
   Grafting – For grafting, the outgroups were removed, and the subclade grafted at the MRCA of .

6. Dataset – The dataset for the consisted of 7 taxa to which two outgroups were added: (: ) and (: ). We concatenated 9 gene fragments (COI, CAD, EF-1α, GAPDH, IDH, MDH, RpS2, RpS5, wingless).
   PartitionFinder – PartitionFinder identified eight subsets (Suppl. material 2: Table S6).
   BEAST analysis – We used one molecular clock per subset identified by PartitionFinder and obtained good mixing and convergence. We used a birth-death tree prior. We performed two runs of 20 million generations, sampling trees and parameters every 2000 generations.
   Grafting – For grafting, the outgroups were removed, and the subclade grafted at the MRCA of .

7. Dataset – The dataset for the consisted of nine taxa to which two outgroups were added: (: ) and (: ). We concatenated ten gene fragments (COI, CAD, EF-1α, GAPDH, ArgK, IDH, MDH, RpS2, RpS5, wingless).
   PartitionFinder – PartitionFinder identified seven subsets (Suppl. material 2: Table S7).
   BEAST analysis – We used one molecular clock per subset identified by PartitionFinder and obtained good mixing and convergence. We used a birth-death tree prior. We performed two runs of 20 million generations, sampling trees and parameters every 2000 generations.
   Grafting – For grafting, the outgroups were removed, and the subclade grafted at the MRCA of .

8. Dataset – The dataset combined the sister clades and and consisted of 92 taxa to which three outgroups were added: (: ), (: ) and (: ). We concatenated eleven gene fragments (COI, CAD, EF-1α, GAPDH, ArgK, IDH, MDH, RpS2, RpS5, DDC, wingless).
   PartitionFinder – PartitionFinder identified 14 subsets (Suppl. material 2: Table S8).
   BEAST analysis – Preliminary analyses showed problems with subset 14 (RpS2_pos2), which was therefore removed from the analyses. In order to improve the quality of our runs we replaced the default priors for rates of substitutions by uniform priors ranging between 0 and 10 for the following case: subset7.cg. We used one molecular clock per subset identified by PartitionFinder and obtained good mixing and convergence. We used a birth-death tree prior. We performed two runs of 100 million generations, sampling trees and parameters every 10000 generations.
   Grafting – For grafting, the outgroups were removed, and the subclade grafted at the split between and .

9. Dataset – The dataset of consisted of 83 taxa to which two outgroups were added: (: ) and (: ). We concatenated eleven gene fragments (COI, CAD, EF-1α, GAPDH, ArgK, IDH, MDH, RpS2, RpS5, DDC, wingless).
   PartitionFinder – PartitionFinder identified 12 subsets (Suppl. material 2: Table S9).
   BEAST analysis – In order to improve the quality of our runs we replaced the default priors for rates of substitutions by uniform priors ranging between 0 and 10 for the following case: subset5.cg. Preliminary analyses revealed problems when using one molecular clock per subset identified by PartitionFinder. We restricted the analysis to one molecular clock for the mitochondrial gene fragments and one molecular clock for the nuclear gene fragments. We used a birth-death tree prior. We performed two runs of 100 million generations, sampling trees and parameters every 10000 generations.
   Grafting – For grafting, the outgroups were removed, and the subclade grafted at the MRCA of .

10. Dataset – The first dataset consisted of 13 taxa, belonging to the genera , , , , , and , to which three outgroups were added: (: ), (: ), and (: ). We concatenated 5 gene fragments (COI, EF-1α, GAPDH, RpS5, wingless).
    PartitionFinder – PartitionFinder identified six subsets (Suppl. material 2: Table S10).
    BEAST analysis – We used one molecular clock per subset identified by PartitionFinder and obtained good mixing and convergence. We used a birth-death tree prior. We performed two runs of 20 million generations, sampling trees and parameters every 2000 generations.
    Grafting – For grafting, the outgroups were removed, and the subclade grafted at the crown of the clade after removing the outgroups.

11. Dataset – The second dataset consisted of 161 taxa, belonging to the genera , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , and , to which three outgroups were added: (: ), (: ), and (: ). We concatenated ten gene fragments (COI, CAD, EF-1α, GAPDH, ArgK, IDH, MDH, RpS2, RpS5, wingless).
    PartitionFinder – PartitionFinder identified eleven subsets (Suppl. material 2: Table S11).
    BEAST analysis – In order to improve the quality of our runs we replaced the default priors for rates of substitutions by uniform priors ranging between 0 and 10 for the following cases: subset5.ac, subset5.ag, subset5.at, subset5.cg, subset5.gt. We used one molecular clock per subset identified by PartitionFinder and obtained good mixing and convergence. We used a birth-death tree prior. We performed two runs of 100 million generations, sampling trees and parameters every 10000 generations.
    Grafting – For grafting, the outgroups were removed, and the subclade grafted at the crown of the clade after removing the outgroups.

12. Dataset – The third dataset consisted of 15 taxa all belonging to the genus , to which two outgroups were added: (: ) and (: ). We concatenated nine gene fragments (COI, CAD, EF-1α, GAPDH, IDH, MDH, RpS2, RpS5, wingless).
    PartitionFinder – PartitionFinder identified six subsets (Suppl. material 2: Table S12).
    BEAST analysis – We used one molecular clock per subset identified by PartitionFinder and obtained good mixing and convergence. We used a birth-death tree prior. We performed two runs of 20 million generations, sampling trees and parameters every 2000 generations.
    Grafting – For grafting, the outgroups were removed, and the subclade grafted at the crown of .
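Two pieces of bookkeeping in the procedure above lend themselves to a short sketch: the number of states logged per run, and the repeated drawing of one backbone tree plus one tree per subclade. The run settings are those reported above; the function names are hypothetical, the initial state is ignored in the count, and drawing with replacement is an assumption, since the text does not say how the 1000 trees were drawn.

```python
import random


def samples_per_run(generations, sampling_interval):
    """Number of logged states in one run, ignoring the initial state."""
    return generations // sampling_interval


# (generations, sampling interval) pairs reported for the subclade analyses
RUN_SETTINGS = [
    (40_000_000, 4_000),
    (150_000_000, 15_000),
    (100_000_000, 10_000),
    (20_000_000, 2_000),
]
counts = [samples_per_run(g, i) for g, i in RUN_SETTINGS]


def draw_graft_inputs(n_backbone, n_subclade, n_subclades=12, n_draws=1000, seed=None):
    """For each grafted tree, pick one backbone index and one tree index per subclade."""
    rng = random.Random(seed)
    draws = []
    for _ in range(n_draws):
        backbone_index = rng.randrange(n_backbone)
        subclade_indices = [rng.randrange(n_subclade) for _ in range(n_subclades)]
        draws.append((backbone_index, subclade_indices))
    return draws


draws = draw_graft_inputs(n_backbone=1000, n_subclade=1000, seed=1)
```

Each of the four reported settings logs the same number of states per run, because the sampling interval was scaled with the chain length.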

Quality control

Species identities of the chosen sequences for the dataset were validated by blasting the DNA barcode sequences against the Barcode Of Life Database (http://www.boldsystems.org/), which has a good representation of European butterfly species due to a number of barcoding projects implemented in different countries (e.g., Wiemers and Fiedler 2007; Dincă et al. 2011; Hausmann et al. 2011; Dincă et al. 2015; Huemer and Wiesmair 2017; Litman et al. 2018). In most cases, the sequences came from the same voucher specimen. In 17% of cases (Suppl. material 1), the sequences used were from different individuals. In these cases special care was taken to use sequences from reliable sources, preferably those with voucher photographs.

We estimated our time-calibration from a recent re-evaluation of the timing of divergence of higher-level . We used the topology inferred by Chazot et al. (2019) as a backbone in our grafting procedure. This topology was fixed in Chazot et al. (2019), hence only node ages were estimated. However, within each subclade we grafted, we let BEAST estimate the topology in addition to node ages.

Several sections of the European butterfly tree remain poorly supported. This most likely arises from the lack of molecular information as well as recent and rapid diversification events within , , or for example. Further, more detailed work is needed in these groups, building on preliminary studies (e.g., Wiemers and Fiedler 2007; Vila et al. 2010; Wiemers et al. 2010; Verovnik and Wiemers 2016; Vishnevskaya et al. 2016), which might show that some of the taxa need to be synonymised (as e.g., with ; see Aarvik et al. 2017). Most of the higher relationships among genera are well supported, however. Exceptions with low support values are the relationships among the genera , , and (: ), among , , , and (: ), some relationships among the () and between and (: ). This also means that the apparent non-monophyly of the genera , , , and in our tree needs to be confirmed by further studies. The only subfamily relationship with low support is the sister relationship of with . In Dapporto et al. (2019) turned out as sister to the remaining , a result in line with Espeland et al. (2018), although with low support in the latter study. In most of these cases, the low support values are due to insufficient molecular information for those groups of taxa.

We show here a synthetic tree summarising the posterior distribution of topologies and node ages, but the posterior distribution of grafted trees can be found in the Supporting Information, providing a distribution of alternative topologies and node ages estimated by BEAST. We strongly advise any researcher using these phylogenetic trees to repeat any analyses on at least 100 trees randomly sampled from this posterior distribution in order to account for topology and node age uncertainties. This tree can also help to identify the sections of the tree lacking molecular information and therefore points at the sections that should be targeted in the future when generating new molecular data.
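The recommendation to repeat analyses over at least 100 posterior trees could be followed along these lines. The sketch assumes the posterior trees have already been read into a list (for example one Newick string per entry) and that `analysis` is any user-supplied function returning a number for one tree; both names are placeholders, not part of the published dataset.

```python
import random
import statistics


def sample_trees(trees, n=100, seed=None):
    """Draw n distinct trees at random from the posterior sample."""
    if n > len(trees):
        raise ValueError("cannot sample more trees than are available")
    return random.Random(seed).sample(trees, n)


def repeat_analysis(trees, analysis, n=100, seed=None):
    """Run the analysis on n sampled trees and summarise the spread of results."""
    results = [analysis(tree) for tree in sample_trees(trees, n, seed)]
    return statistics.mean(results), min(results), max(results)


# Placeholder posterior: 1000 labels standing in for 1000 Newick strings
posterior_trees = [f"tree_{i}" for i in range(1000)]
picked = sample_trees(posterior_trees, n=100, seed=1)
# Placeholder analysis: any function mapping one tree to a number
mean_value, low, high = repeat_analysis(posterior_trees, len, n=100, seed=1)
```

Reporting the spread of results across sampled trees, rather than a single value from the consensus tree, is what carries the topological and node-age uncertainty into the downstream analysis.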

Dataset descriptions

The analysed dataset (a concatenated alignment of the genes COI, CAD, EF-1α, GAPDH, ArgK, IDH, MDH, RpS2, RpS5, DDC, wingless, and H3) is available in NEXUS format and the posterior distribution of ML trees and the consensus tree in NEWICK format at DOI: https://dx.doi.org/10.5281/zenodo.3531555.

Conclusions

We have generated a robust phylogenetic hypothesis for all European species of butterflies with estimations of divergence times (Fig. 1, Suppl. material 3: Fig. S1) as well as subtrees of major sections (Suppl. material 4: Fig. S2, Suppl. material 5: Fig. S3, Suppl. material 6: Fig. S4, Suppl. material 7: Fig. S5), a tree with posterior probabilities (Fig. 2, Suppl. material 8: Fig. S6) and gene coverage (Fig. 3). Our purpose is to provide a complete phylogenetic framework for use by the ecological and evolutionary communities. The demand for such phylogenetic information is high and various proxies have been used that are not ideal, starting already in 2005 (Päivinen et al. 2005). Although the topology of major clades in our consensus tree is largely congruent with the one by Dapporto et al. (2019), differences can be found e.g., in the monophyly of which appeared as a paraphylum in the trees of Dapporto et al. (2019) and Espeland et al. (2018). Our tree also confirms the monophyly of most of the European butterfly genera in the recent checklist of Wiemers et al. (2018). An exception is the genus which turned out to be a paraphylum. This result is in line with the tree in Dapporto et al. (2019) and a recent study by Zhang et al. (2020), that revises the taxonomy of accordingly, leading to a change of several names (Table 2). We provide a posterior distribution of topologies and node ages for researchers to be able to take phylogenetic and node age uncertainty into account in the analyses. The tree files are provided in standard Newick format as output from BEAST. Since there are easily applied methods to prune the phylogeny to the species pool of a particular study, e.g., the ape package (Paradis et al. 2004) in R (R Core Team 2018), our tree is readily applicable to a large variety of ecological analyses ranging from the very local and regional scales, where the species pool only represents a subset of the European species, to the European scale. 
Since butterflies are an important indicator taxon for biodiversity studies, this time-calibrated phylogeny will provide a solid basis to advance our understanding of large-scale biodiversity patterns and their underlying mechanisms. It allows macro-evolutionary processes to be incorporated into biodiversity analyses at macroecological, landscape and local community scales, and trait- and phylogeny-based assessments of species assembly processes to be combined.

Figure 1.

Time-calibrated tree of European butterflies with time scale and taxonomic assignment to subfamilies and families.

Figure 2.

Majority rule consensus tree topology of a set of 1000 trees from the posterior distribution of time-calibrated trees of European butterflies. Circles at the nodes display clade support with a colour gradient from 50% (red) via 75% (yellow) to 100% (green).

Figure 3.

Time-calibrated tree of European butterflies. Grey bars indicate gene coverage per taxon.

Table 2.

Proposal for changes in the current taxonomic checklist by Wiemers et al. (2018) according to the recent revision of by Zhang et al. (2020).

Current name (Wiemers et al. 2018) | Proposed name (Zhang et al. 2020)
Muschampia cribrellum (Eversmann, 1841) | Favria cribrellum (Eversmann, 1841)
Carcharodus lavatherae (Esper, 1783) | Muschampia (Reverdinus) lavatherae (Esper, 1783)
Carcharodus orientalis Reverdin, 1913 | Muschampia (Reverdinus) orientalis (Reverdin, 1913)
Carcharodus floccifera (Zeller, 1847) | Muschampia (Reverdinus) floccifera (Zeller, 1847)
Carcharodus stauderi Reverdin, 1913 | Muschampia (Reverdinus) stauderi (Reverdin, 1913)
Carcharodus baeticus (Rambur, 1839) | Muschampia (Reverdinus) baeticus (Rambur, 1840)