Large geographic distance versus small DNA barcode divergence: Insights from a comparison of European to South Siberian Lepidoptera.

Peter Huemer, Paul D N Hebert, Marko Mutanen, Christian Wieser, Benjamin Wiesmair, Axel Hausmann, Roman Yakovlev, Markus Möst, Brigitte Gottsberger, Patrick Strutzenberger, Konrad Fiedler.

Abstract

Spanning nearly 13,000 km, the Palearctic region provides an opportunity to examine the level of geographic coverage required for a DNA barcode reference library to be effective in identifying species with broad ranges. This study examines barcode divergences between populations of 102 species of Lepidoptera from Europe and South Siberia, sites roughly 6,000 km apart. While three-quarters of these species showed divergence between their Asian and European populations, these divergence values ranged between 0-1%, distinctly less than the distance to the Nearest-Neighbor species in all but a few cases. Our results suggest that further taxonomic studies may be required for 16 species that showed either extremely low interspecific or high intraspecific variation. For example, seven species pairs showed low or no barcode divergence, but four of these cases are likely to reflect taxonomic over-splitting while the others involve species pairs that are either young or show evidence for introgression. Conversely, some of the nine species with deep intraspecific divergence at varied spatial levels may include overlooked species. Although these 16 cases require further investigation, our overall results indicate that barcode reference libraries based on records from one locality can be very effective in identifying specimens across an extensive geographic area.

Year:  2018        PMID: 30388147      PMCID: PMC6214556          DOI: 10.1371/journal.pone.0206668

Source DB:  PubMed          Journal:  PLoS One        ISSN: 1932-6203            Impact factor:   3.240


Introduction

In many cases, DNA barcoding can be an effective tool for both specimen identification and species discovery. In animals, a 648 base pair segment of the mitochondrial cytochrome c oxidase subunit 1 (COI) gene has been adopted as the barcode region [1], [2]. Numerous researchers have added data to BOLD, the Barcode of Life Data Systems (www.boldsystems.org), which at present includes more than 6 million barcode records from about 550,000 operational taxonomic units (i.e. BINs–see [3]). Currently, more than 22,000 registered users are accessing these records. Despite varied coverage among taxonomic groups and regions, these data are increasingly useful to address diverse research questions in ecology and evolutionary biology. One important issue that needs further investigation relates to the performance of barcode-based species identifications across large distances. In particular, since species’ distributions vary from narrow endemism to global occurrence, it needs to be assessed whether DNA barcodes from one site or region can be used to identify specimens of the same species from distant localities. This is especially important in the Palearctic region because its elongate axis spans more than 13,000 km and many species are thought to occur from the Atlantic Ocean in the west to the Pacific Ocean in the east. For the same reason, the Palearctic region is ideal to quantify the influence of geographic distance on intraspecific variation under relatively comparable conditions (similar ecotypes). In recent years, a few studies on Lepidoptera have examined the congruence of DNA barcodes across larger geographic distances including 1,000 species shared by Fennoscandia and Central Europe [4], butterflies from Central Asia [5], and 1,500 species of Noctuoidea in North America [6]. However, these studies still are the exception and in contrast to the present paper either only cover a single taxonomic sub-group of Lepidoptera or a comparatively small geographic distance. 
Moreover, most prior work has examined patterns of sequence variation at a national or regional level [7], [8], [9]. Ideally, DNA barcodes from specimens collected at a single locality would enable the identification of conspecifics from the entire species distribution. This might not be the case if intraspecific sequence variation within widespread taxa is greater than interspecific differences. In other words, identification problems will arise whenever intraspecific variation blurs the ‘barcode gap’ which is critical to assign specimens to their correct species, either a Linnaean name or an Operational Taxonomic Unit (e.g. BIN). In such cases DNA barcodes fail to correctly identify species and additional diagnostic characters, particularly morphological traits and high density genetic markers, have to be considered to firmly identify species. Our study is the first to examine patterns of DNA barcode variation across a very large geographic range for a broad set of Lepidoptera (102 species, 22 families) shared by Europe and South Siberia. Specifically, we ascertain levels of barcode divergence between putatively conspecific specimens from southern Siberia, i.e. Russian Altai, and Europe, particularly Northern Europe and the Alps. Although higher intraspecific variation within populations spanning Siberia and Europe compared to the respective populations from each region considered separately can be expected, the magnitude of this variation will determine whether an effective system for DNA barcode-based identifications can be based on a narrowly parameterized reference library. To examine this matter, we compared intraspecific divergences between populations of 102 species from Siberia and their divergences to the 5,016 species (41,583 specimens) in a carefully validated dataset of European Lepidoptera [10]. 
We also ascertained if intraspecific distances are lower in species with a near-continuous Euro-Siberian distribution than in those with a disjunct arctic-alpine or central-Asian-alpine distribution. Finally, we asked if patterns of isolation by geographic distance as measured by COI barcode sequences are influenced by overall sequence divergence or distribution type.

Material and methods

Taxon sampling strategy

This study examined two Palaearctic sub-regions separated by a distance of about 6,000 km: central/northern Europe with a focus on the Alps and Finland, and South Siberia (Altai Republic, Russia), supplemented by a few reliably identified specimens from other areas (Fig 1).
Fig 1

Geographic origin of the voucher specimens for the 102 sequenced species of Eurasian Lepidoptera.

Map created with SimpleMappr (http://www.simplemappr.net).

Whereas DNA barcode coverage for lepidopteran taxa is generally high for species from central and northern Europe, only a few records are available from South Siberia. We therefore sought to obtain specimens of >100 species shared by these regions. We focused on species with a disjunct arctic-alpine and South Siberian-alpine distribution based on the expectation that they would be likely to show higher intraspecific barcode variation. Species identification was based exclusively on morphological traits. In general, we analyzed three specimens from South Siberia for each of these species to estimate intraspecific divergence, but only two specimens were available for 18 species whereas for 15 species the number of voucher specimens ranged between 4 and 8. The average number of successfully sequenced specimens per species from Asia was 3.24. By comparison, the number of sequenced specimens was much higher for most European representatives of these species with 16.34 sequenced specimens per species on average. Existing specimens from museum collections were analyzed where possible and were supplemented with material from an expedition to the Russian Altai Mountains from late July to mid-August 2016 [11]. A permit was not required for the Altai specimens as no protected species were collected. Collections in other countries were made in compliance with current legislation. In Finland, permits were issued by the Finnish Centre for Economic Development, Transport, and the Environment to MM under permissions VARELY/441/07.01/2012 and LAPELY/275/07.01/2012, while collecting permits were not necessary for scientific research in Austria/Tyrol. The Nagoya protocol was not applicable because our European material was collected before October 12, 2014 and because the protocol has not been ratified by Russia.
Most sequences considered in this study derive from specimens held in the Tiroler Landesmuseum Ferdinandeum, Innsbruck, Austria; the University of Oulu, Oulu, Finland; the Bavarian State Collection of Zoology, Munich, Germany; and another 25 specimen depositories. Wherever possible, data were supplemented by publicly available sequences in BOLD ([12], see http://www.boldsystems.org).

DNA sequencing

For freshly collected specimens, a single leg was removed and placed in a 96-well lysis plate that was submitted for analysis to the CCDB (Canadian Center for DNA Barcoding, University of Guelph, Canada) where DNA extraction, PCR amplification, and sequencing were performed following standard high-throughput protocols [13]. Altogether, 315 specimens of 102 South Siberian species that also occur in Europe were sequenced. Moreover, we examined 1,682 previously published sequences (>500 bp) [10] from specimens of the same species from sites in Europe including Finland (423), Austria (410), Germany (329), Russia (315), and 19 other countries (520) (Fig 1). Information regarding the institutions hosting each publicly available specimen, sample and process IDs, and GenBank accession numbers is available in S1 Table. Further details on each specimen, including complete voucher data and images, are available on BOLD [12] in the public dataset “Lepidoptera of Altai Mountains (DS-LEPEUALT)” under the DOI: 10.5883/DS-LEPEUALT.

Data analysis

The extent of intraspecific sequence variation in the COI sequences for each species was estimated under the Kimura-2-parameter (K2P) model of nucleotide substitution using analytical tools on BOLD v4.0 (http://www.boldsystems.org) and MEGA v.6 [14]. There has been an interesting debate over the choice and justification of K2P and other distance measures used in barcoding analyses (e.g., [15]); however, the ‘best method’ depends on the dataset under consideration, and the effects of different distance measures and models on the distances and identification success are generally small (e.g., [16]). Model choice is therefore unlikely to affect the main results of this study, and we applied the K2P method as implemented in BOLD. For each species we obtained four estimates of intraspecific divergence by calculating the arithmetic mean for all pairwise distances (K2P) among conspecific individuals within the following spatial contexts: (a) ‘total intraspecific’ (mean distance for all data for each species); (b) ‘within Europe’ (mean distance for all European samples); (c) ‘within Asia’ (mean distance for all South Siberian-Central Asian samples); and (d) ‘inter Europe-Asia’ (mean distance within each species for all pairs of specimens from Europe vs. Asia). Furthermore, we examined the potential impact of distribution type on intraspecific divergences. For this analysis, each species was assigned to one of two categories: (a) those with largely continuous distributions across Eurasia, i.e. with known gaps <500 km; and (b) those with highly disjunct distributions, i.e. with gaps between known populations >2,000 km. These two categories basically reflect what has been termed Euro-Siberian versus arctic-alpine and South Siberian-alpine distribution patterns in biogeographic studies [17].
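The distance measure and the four spatial averages described above can be illustrated with a minimal Python sketch. It is not the BOLD or MEGA implementation: the function names, the input format (a list of region and aligned-sequence pairs for one species) and the pairwise deletion of gapped or ambiguous sites are assumptions made for illustration. Distances are returned as proportions, so they must be multiplied by 100 to compare with the percentages reported here.

```python
import math
from itertools import combinations

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def k2p_distance(seq_a, seq_b):
    """Kimura 2-parameter distance between two aligned sequences.

    Sites with a gap or ambiguous base in either sequence are skipped
    (pairwise deletion, an assumption of this sketch).
    """
    transitions = transversions = compared = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a not in "ACGT" or b not in "ACGT":
            continue
        compared += 1
        if a == b:
            continue
        pair = {a, b}
        if pair <= PURINES or pair <= PYRIMIDINES:
            transitions += 1      # A<->G or C<->T
        else:
            transversions += 1    # purine <-> pyrimidine
    if compared == 0:
        raise ValueError("no comparable sites")
    p = transitions / compared
    q = transversions / compared
    # d = -1/2 * ln((1 - 2P - Q) * sqrt(1 - 2Q)); raises ValueError
    # for saturated pairs where the argument is not positive.
    return -0.5 * math.log((1 - 2 * p - q) * math.sqrt(1 - 2 * q))


def context_means(records):
    """Mean pairwise K2P distance in the four spatial contexts.

    records: list of (region, sequence) for ONE species, with region
    being "Europe" or "Asia" (hypothetical labels).
    """
    buckets = {
        "total": [],
        "within Europe": [],
        "within Asia": [],
        "inter Europe-Asia": [],
    }
    for (reg_a, seq_a), (reg_b, seq_b) in combinations(records, 2):
        d = k2p_distance(seq_a, seq_b)
        buckets["total"].append(d)
        if reg_a != reg_b:
            buckets["inter Europe-Asia"].append(d)
        else:
            buckets["within " + reg_a].append(d)
    # A context without any specimen pair yields None rather than 0.
    return {k: (sum(v) / len(v) if v else None) for k, v in buckets.items()}
```

Returning None for an empty context keeps "no pairs available" distinct from a genuine divergence of zero, which matters for species represented by a single specimen in one region.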
We compared mean intraspecific sequence divergences across the three spatial levels (intra-Europe, intra-Asia, inter-Europe-Asia) using a non-parametric Friedman ANOVA of ranks because of uneven variance and sequence numbers for the 102 species. Total mean intraspecific barcode divergence between the two types of species distributions was compared using a Mann-Whitney U-test. In addition, we examined the strength of isolation by distance within every species. For this purpose, we calculated a Mantel correlation coefficient for the matrix of geographic distances between sampling localities and the K2P distance matrix for every species using the Geographic Distance Correlation tool in BOLD. These correlation coefficients were then tested for contingency upon distribution type or overall intraspecific sequence divergence using a Mann-Whitney test and a Spearman rank correlation, respectively. Statistical analyses were performed using Statistica 8.0 (StatSoft Inc.). Finally, we compared the mean and maximum intraspecific divergence for each of the 102 species with its Nearest-Neighbor (NN) distance, because a gap between intraspecific and interspecific variation is essential for DNA barcoding to be effective in specimen identification. For this purpose we used the DS-MARKALL dataset (dx.doi.org/10.5883/DS-MARKALL). It includes >500 bp sequence records for 41,583 specimens representing 5,016 species of Lepidoptera [10]. We limited comparisons to this dataset because it is both comprehensive and identifications are very reliable. Sequences from the present study and from DS-MARKALL were pooled, and a barcode gap analysis was then carried out on BOLD using the K2P model. This analysis estimated the minimum genetic divergence to the NN and both the mean and maximum intraspecific divergences for each species.
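The barcode gap analysis described above reduces, for each species, to three numbers: the mean and maximum distance among conspecifics and the minimum distance to any specimen of another species. The following Python sketch shows that logic under stated assumptions; it is not the BOLD tool itself, and the function name, the input format (a list of species labels plus a symmetric distance matrix) and the output keys are hypothetical.

```python
def barcode_gap(labels, dist):
    """Summarise intraspecific and nearest-neighbor distances per species.

    labels: species name for each specimen (hypothetical input format).
    dist:   symmetric square matrix (list of lists) of pairwise distances.
    """
    per_species = {}
    n = len(labels)
    for i in range(n):
        entry = per_species.setdefault(
            labels[i], {"intra": [], "nn": None, "nn_dist": None}
        )
        for j in range(n):
            if i == j:
                continue
            d = dist[i][j]
            if labels[j] == labels[i]:
                if j > i:  # count each conspecific pair once
                    entry["intra"].append(d)
            elif entry["nn_dist"] is None or d < entry["nn_dist"]:
                # closest heterospecific specimen seen so far
                entry["nn"], entry["nn_dist"] = labels[j], d
    summary = {}
    for name, entry in per_species.items():
        intra = entry["intra"]
        max_intra = max(intra) if intra else None
        summary[name] = {
            "mean_intra": sum(intra) / len(intra) if intra else None,
            "max_intra": max_intra,
            "nn": entry["nn"],
            "nn_dist": entry["nn_dist"],
            # A gap exists when the nearest neighbor is farther away
            # than the most divergent conspecific pair.
            "gap": (
                None
                if max_intra is None or entry["nn_dist"] is None
                else entry["nn_dist"] > max_intra
            ),
        }
    return summary
```

The sketch compares the nearest-neighbor distance with the maximum intraspecific distance, which is the stricter of the two comparisons made in this study; swapping in "mean_intra" gives the more lenient one.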

Results

Sequenced species

We collected 1,997 sequences >500 bp from the 102 species. Among them, 54 sequences were not barcode compliant according to the standards in BOLD, most likely due to partially degraded DNA. These standards, as set out by the Consortium for the Barcode of Life (CBOL), require a minimum sequence length of 500 bp, less than 1% ambiguous bases, the presence of two trace files, a minimum of low trace quality status, and the presence of a country specification in the record. Nevertheless, these 54 sequences were still considered in the analysis as they were correctly placed with their conspecifics in an initial neighbor-joining (NJ) tree. The seven families with the largest numbers of sequences were Noctuidae (551), Geometridae (389), Erebidae (175), Tortricidae (157), Nymphalidae (146), Gelechiidae (144), and Lycaenidae (133).
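The compliance criteria named above can be written as a simple record check. The Python sketch below is illustrative only: the record fields, the trace-quality labels and their ordering are assumptions rather than the BOLD data model, and "two trace files" is read as "at least two".

```python
from dataclasses import dataclass


@dataclass
class BarcodeRecord:
    sequence: str        # ungapped nucleotide string
    trace_files: int     # number of trace files attached to the record
    trace_quality: str   # assumed labels: "failed", "low", "medium", "high"
    country: str         # empty string if no country is recorded


QUALITY_ORDER = ["failed", "low", "medium", "high"]  # assumed ordering


def compliance_failures(rec):
    """Return the list of criteria a record fails (empty list = compliant)."""
    failures = []
    seq = rec.sequence.upper()
    if len(seq) < 500:
        failures.append("length")
    # Anything other than A, C, G or T is counted as ambiguous.
    ambiguous = sum(1 for base in seq if base not in "ACGT")
    if not seq or ambiguous / len(seq) >= 0.01:
        failures.append("ambiguity")
    if rec.trace_files < 2:
        failures.append("traces")
    # Raises ValueError for a quality label outside QUALITY_ORDER.
    if QUALITY_ORDER.index(rec.trace_quality) < QUALITY_ORDER.index("low"):
        failures.append("trace_quality")
    if not rec.country.strip():
        failures.append("country")
    return failures
```

Returning the failed criteria instead of a single boolean makes it easy to see why a sequence is non-compliant, for example whether it is merely short or also lacks trace files.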

Intraspecific barcode divergences

Intraspecific barcode divergence was generally <1% with a mean (± SD) of 0.68 ± 0.67% (median: 0.43%; range: 0.00 to 3.46%) for the 102 species. As expected, there were highly significant differences among the three regional comparisons (Friedman ANOVA: χ² = 77.82, df = 2; p<0.0001). Divergences were lowest within the Asiatic samples, as expected, because they originated from few collecting sites with low numbers of specimens, while divergences within Europe averaged higher, and those between the European and Asiatic samples were highest (Fig 2, Table 1). In post-hoc comparisons, all three pairwise comparisons were highly significant (Wilcoxon tests, p<0.007).
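For readers unfamiliar with the Friedman test, the statistic can be sketched in a few lines of Python. Each species is a block, the three spatial levels are the treatments, values are ranked within each block, and the rank sums are compared. This is a generic textbook formulation with the usual tie correction, not the Statistica routine used in this study, and the function names are hypothetical. With three treatments the test has 2 degrees of freedom, for which the chi-square tail probability has the closed form exp(-x/2).

```python
import math


def average_ranks(values):
    """1-based ranks of values, with tied values sharing their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # mean of positions i+1 .. j+1
        for pos in range(i, j + 1):
            ranks[order[pos]] = mean_rank
        i = j + 1
    return ranks


def friedman_chi_square(blocks):
    """Tie-corrected Friedman statistic.

    blocks: list of rows, one per species, each holding one value per
    treatment (here: intra-Asia, intra-Europe, inter-Europe-Asia).
    Raises ZeroDivisionError if every row is completely tied.
    """
    n = len(blocks)
    k = len(blocks[0])
    rank_sums = [0.0] * k
    tie_term = 0
    for row in blocks:
        for col, rank in enumerate(average_ranks(row)):
            rank_sums[col] += rank
        for value in set(row):
            t = row.count(value)
            tie_term += t ** 3 - t
    q = 12.0 / (n * k * (k + 1)) * sum(r * r for r in rank_sums) - 3 * n * (k + 1)
    correction = 1 - tie_term / (n * k * (k * k - 1))
    return q / correction


def chi2_sf_df2(x):
    """Upper-tail probability of a chi-square variable with exactly 2 df."""
    return math.exp(-x / 2)
```

Applied to the reported statistic, chi2_sf_df2(77.82) is far below 0.0001, consistent with the p-value given above.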
Fig 2

Mean intraspecific sequence divergences for 102 Lepidoptera species in geographic comparisons.

Boxplots (median, interquartile range, total range) of mean intraspecific sequence divergences (Kimura-2-Parameter) for 102 Lepidoptera species: total intraspecific divergences (mean distance for all data for each species), and intraspecific divergences at three geographic levels: intra-Asia (mean distance for all South Siberian samples); intra-Europe (mean distance for all European samples); inter- Europe-Asia (mean distance within each species for all pairs of specimens from Europe vs. Asia).

Table 1

Mean intraspecific barcode divergences (% Kimura-2P-distances) for 102 Lepidoptera species from Europe and South Siberia and for the geographic comparisons, and distribution type.

| Species | total-intra | intra-Asia | intra-Europe | inter-Europe-Asia | distribution type |
|---|---|---|---|---|---|
| Acleris aspersana | 0.23 | 0.20 | 0.25 | 0.22 | continuous |
| Acompsia cinerella | 0.86 | 0.00 | 0.94 | 0.79 | continuous |
| Acronicta auricoma | 0.09 | 0.15* | 0.08 | 0.12 | continuous |
| Aethes kindermanniana | 1.08 | 0.46* | 0.85 | 1.41 | continuous |
| Agrotis fatidica | 0.18 | 0.00 | 0.17 | 0.23 | disjunct |
| Anaplectoides prasina | 0.03 | 0.15* | 1.63 | 0.89 | continuous |
| Apamea furva | 0.41 | 0.20 | 0.34 | 0.54 | continuous |
| Apamea lateritia | 0.11 | 0.00 | 0.13 | 0.07 | continuous |
| Arctia caja | 0.65 | 1.56 | 0.39 | 1.51 | continuous |
| Arctia flavia | 0.21 | 0.00 | 0.19 | 0.26 | disjunct |
| Argyresthia pygmaeella | 0.28 | 0.10 | 0.30 | 0.25 | continuous |
| Arichanna melanaria | 0.12 | 0.00 | 0.07 | 0.19 | continuous |
| Athrips pruinosella | 1.43 | 0.00 | 1.51 | 1.63 | continuous |
| Autographa pulchrina | 0.17 | 0.00 | 0.17 | 0.10 | continuous |
| Boloria dia | 0.32 | 0.21 | 0.12 | 0.62 | continuous |
| Boloria napaea | 1.00 | 0.79 | 0.28 | 1.54 | disjunct |
| Boloria titania | 0.74 | 0.07 | 0.20 | 1.90 | disjunct |
| Brenthis ino | 0.47 | 0.22 | 0.53 | 0.42 | continuous |
| Carsia sororiata | 0.54 | 0.00 | 0.53 | 0.60 | disjunct |
| Caryocolum leucomelanella | 1.09 | 2.06 | 0.53 | 1.89 | continuous |
| Caryocolum pullatella | 2.02 | 2.30* | 2.02 | 2.10 | disjunct |
| Catoptria languidellus | 0.98 | 0.10 | 0.52 | 1.52 | disjunct |
| Celypha rivulana | 0.54 | 0.46* | 0.58 | 0.48 | continuous |
| Cerapteryx graminis | 0.39 | 0.41 | 0.36 | 0.47 | continuous |
| Charissa ambiguata | 0.86 | 0.31* | 0.60 | 1.59 | continuous |
| Chionodes distinctella | 1.35 | 1.11 | 1.38 | 1.27 | continuous |
| Chionodes holosericella | 0.26 | 0.00 | 0.19 | 0.41 | disjunct |
| Coenonympha glycerion | 0.32 | 0.00 | 0.33 | 0.32 | continuous |
| Coenonympha tullia | 0.33 | 0.18 | 0.22 | 0.51 | continuous |
| Colostygia aptata | 0.43 | 0.15* | 0.43 | 0.43 | disjunct |
| Coscinia cribraria | 3.46 | 0.95 | 3.53 | 3.16 | continuous |
| Crambus perlella | 0.39 | 0.38 | 0.27 | 0.57 | continuous |
| Crocallis elinguaria | 1.38 | 0.20 | 1.49 | 0.92 | continuous |
| Cupido minimus | 0.23 | 0.11 | 0.24 | 0.21 | continuous |
| Cyaniris semiargus | 0.31 | 0.40 | 0.29 | 0.34 | continuous |
| Diarsia brunnea | 0.20 | 0.00 | 0.21 | 0.15 | continuous |
| Diarsia mendica | 1.86 | 0.00 | 2.00 | 1.38 | continuous |
| Dicallomera fascelina | 1.50 | 0.51 | 0.22 | 3.53 | continuous |
| Eana osseana | 1.83 | 0.10 | 0.39 | 3.90 | continuous |
| Eana penziana | 0.33 | 0.00 | 0.32 | 0.35 | continuous |
| Eilema lutarella | 0.13 | 0.00* | 0.15 | 0.08 | continuous |
| Elachista bedellella | 1.03 | 0.65 | 0.87 | 1.31 | continuous |
| Entephria caesiata | 0.70 | 0.00 | 0.65 | 0.94 | continuous |
| Epermenia illigerella | 0.73 | 0.00 | 0.83 | 0.60 | continuous |
| Epinotia cruciana | 0.92 | 0.00 | 1.12 | 0.67 | continuous |
| Epinotia trigonella | 1.05 | 1.13 | 1.06 | 0.10 | continuous |
| Eudonia alpina | 0.06 | 0.00* | 0.10 | 0.05 | continuous |
| Eulamprotes wilkella | 1.87 | 0.10 | 1.97 | 1.71 | continuous |
| Eulithis populata | 0.25 | 0.00* | 0.27 | 0.19 | continuous |
| Eulithis prunata | 2.47 | 0.00 | 2.02 | 4.28 | continuous |
| Eulithis testata | 0.37 | 0.00* | 0.19 | 0.62 | continuous |
| Eumedonia eumedon | 1.08 | 1.53 | 0.74 | 1.69 | continuous |
| Euphyia unangulata | 0.16 | 0.15* | 0.15 | 0.19 | continuous |
| Eupithecia pusillata | 0.20 | 0.00* | 0.22 | 0.10 | continuous |
| Eurois occulta | 0.06 | 0.00 | 0.07 | 0.04 | continuous |
| Euxoa recussa | 0.06 | 0.00 | 0.08 | 0.04 | continuous |
| Gazoryctra ganna | 2.16 | 1.39* | 1.08 | 3.58 | disjunct |
| Graphiphora augur | 0.10 | 0.00 | 0.12 | 0.06 | continuous |
| Gypsonoma nitidulana | 1.35 | 0.00* | 1.16 | 2.12 | continuous |
| Hadena compta | 0.16 | 0.00 | 0.19 | 0.10 | continuous |
| Lasionycta imbecilla | 0.54 | 0.10 | 0.54 | 0.58 | continuous |
| Lasionycta proxima | 0.72 | 0.48 | 0.57 | 1.02 | continuous |
| Levipalpus hepatariella | 0.45 | 0.00 | 0.52 | 0.44 | disjunct |
| Lycaena virgaureae | 0.21 | 0.00 | 0.23 | 0.13 | continuous |
| Macaria brunneata | 0.47 | 0.00 | 0.46 | 0.57 | continuous |
| Matilella fusca | 0.24 | 0.00* | 0.28 | 0.17 | continuous |
| Miltochrista miniata | 0.00 | 0.00* | 0.00 | 0.00 | continuous |
| Mompha locupletella | 0.08 | 0.00 | 0.10 | 0.05 | continuous |
| Monopis spilotella | 1.03 | 0.51 | 0.61 | 1.33 | continuous |
| Noctua interposita | 0.06 | 0.00 | 0.07 | 0.04 | continuous |
| Ochsenheimeria urella | 2.10 | 0.15 | 2.10 | 2.54 | continuous |
| Oidaematophorus rogenhoferi | 0.51 | 0.31 | 0.46 | 0.61 | disjunct |
| Papestra biren | 0.09 | 0.00* | 0.11 | 0.06 | continuous |
| Parnassius phoebus | 0.49 | 0.69 | 0.24 | 0.57 | disjunct |
| Pediasia aridella | 0.46 | 0.00* | 0.51 | 0.42 | continuous |
| Perizoma hydrata | 0.00 | 0.00 | 0.00 | 0.00 | continuous |
| Phiaris obsoletana | 0.09 | 0.10 | 0.08 | 0.09 | continuous |
| Plebejus orbitulus | 0.43 | 0.10 | 0.08 | 0.82 | disjunct |
| Polia bombycina | 0.07 | 0.00* | 0.07 | 0.04 | continuous |
| Polypogon tentacularia | 0.23 | 0.00 | 0.23 | 0.25 | continuous |
| Pontia callidice | 1.81 | 0.10 | 0.23 | 3.42 | disjunct |
| Protolampra sobrina | 0.03 | 0.10 | 0.00 | 0.05 | continuous |
| Pyrausta aerealis | 0.99 | 0.00 | 1.13 | 0.82 | continuous |
| Scopula incanata | 1.15 | 0.10 | 1.24 | 0.94 | continuous |
| Scopula virgulata | 0.25 | 0.00 | 0.31 | 0.20 | continuous |
| Scotopteryx chenopodiata | 0.15 | 0.00* | 0.17 | 0.09 | continuous |
| Scrobipalpula diffluella | 2.06 | 0.18 | 0.51 | 3.55 | disjunct |
| Selagia spadicella | 0.82 | 0.00 | 0.55 | 1.21 | continuous |
| Setina irrorella | 2.04 | 0.00 | 2.22 | 1.95 | continuous |
| Sparganothis pilleriana | 0.40 | 0.20 | 0.47 | 0.37 | continuous |
| Syngrapha ain | 0.11 | 0.06* | 0.03 | 0.20 | continuous |
| Syngrapha hochenwarthi | 0.31 | 0.00 | 0.21 | 0.59 | disjunct |
| Syngrapha interrogationis | 0.07 | 0.00* | 0.06 | 0.09 | continuous |
| Trichiura crataegi | 0.82 | 0.20 | 0.73 | 1.21 | continuous |
| Udea uliginosalis | 1.76 | 0.00 | 1.65 | 2.20 | disjunct |
| Xanthorhoe decoloraria | 0.59 | 0.00 | 0.60 | 0.60 | disjunct |
| Xanthorhoe montanata | 1.39 | 0.00 | 1.53 | 0.79 | continuous |
| Xestia speciosa | 1.44 | 0.93 | 1.33 | 1.91 | continuous |
| Yponomeuta evonymella | 0.22 | 0.46* | 0.15 | 0.41 | continuous |
| Ypsolopha dentella | 0.34 | 0.00 | 0.23 | 0.49 | continuous |
| Ypsolopha nemorella | 0.54 | 0.00 | 0.62 | 0.50 | continuous |
| Zeiraphera griseana | 0.05 | 0.00 | 0.06 | 0.03 | continuous |
| Mean Values | 0.55 | 0.21 | 0.47 | 0.72 | |

Species with an asterisk (*) indicate intraspecific variation assessed from 2 specimens.

Relationship between distribution type and intraspecific barcode divergences

Contrary to expectation, total intraspecific divergence values were only slightly larger in species with disjunct as opposed to those with continuous distributions (Mann-Whitney test: z = 2.09; p = 0.036; Fig 3). Species with continuous ranges (n = 83) had an average intraspecific sequence divergence of 0.63 ± 0.66% (median: 0.37%; range = 0.00–3.46%), while those with disjunct distributions (n = 19) showed a divergence of 0.89 ± 0.70% (median: 0.54%; range = 0.18–2.16%).
Fig 3

Mean intraspecific sequence divergences for 102 Lepidoptera species in different distribution types.

Boxplot (median, interquartile range, total range) of total mean intraspecific sequence divergences (Kimura-2-Parameter) for 102 Lepidoptera species from Europe and South Siberia, comparing species with continuous versus disjunct distributions.

Factors affecting isolation by distance within species

As expected, the extent of sequence divergence between members of a species was often related to the distance between their sites of collection. However, the extent of this isolation-by-distance effect was highly variable among species. Sequence divergences in 56 of the 102 species showed no association with distance, while 13 species showed a weakly significant Mantel correlation (p<0.05) and 33 species showed a strong relationship (p<0.01). Evidence for isolation-by-distance was stronger in species with disjunct (mean Mantel r = 0.59±0.32) than continuous distributions (mean Mantel r = 0.28±0.27; Mann-Whitney test: z = 4.19, p<0.0001; Fig 4). In species with disjunct distributions, the extent of isolation-by-distance was only weakly and non-significantly related to overall sequence divergence (Spearman rank correlation: rS = 0.40, p = 0.087), and this relationship was even weaker and also non-significant for species with continuous ranges (rS = 0.20; p = 0.073). The strength of isolation-by-distance patterns within species did not co-vary with the maximum distance between sampling sites (rS = -0.005, p = 0.96), but it was negatively related to the number of sequences available for a taxon (rS = -0.27, p = 0.007).
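The Mantel correlations reported above relate two distance matrices for the same specimens, one geographic and one genetic. The following Python sketch shows a simple permutation version of the test. It is an illustration under stated assumptions, not the Geographic Distance Correlation tool in BOLD: the function names, the default of 999 permutations, the fixed random seed and the one-sided p-value (testing for a positive association) are all choices made here.

```python
import math
import random


def _upper(matrix, order):
    """Upper-triangle entries of matrix after reordering rows and columns."""
    n = len(order)
    return [matrix[order[i]][order[j]] for i in range(n) for j in range(i + 1, n)]


def _pearson(x, y):
    """Pearson correlation; raises ZeroDivisionError if either input is constant."""
    mx = sum(x) / len(x)
    my = sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def mantel(geo, gen, permutations=999, seed=0):
    """Mantel r and one-sided permutation p-value for two distance matrices.

    geo, gen: symmetric square matrices (lists of lists) over the same
    specimens, e.g. geographic distances in km and K2P distances.
    """
    n = len(geo)
    identity = list(range(n))
    x = _upper(geo, identity)
    r_obs = _pearson(x, _upper(gen, identity))
    rng = random.Random(seed)
    hits = 0
    for _ in range(permutations):
        perm = identity[:]
        rng.shuffle(perm)  # relabel specimens in the genetic matrix only
        if _pearson(x, _upper(gen, perm)) >= r_obs:
            hits += 1
    # The observed arrangement counts as one of the permutations.
    p_value = (hits + 1) / (permutations + 1)
    return r_obs, p_value
```

Whole rows and columns are permuted together rather than individual cells, because the pairwise distances within a matrix are not independent of one another. With only a handful of specimens per species the number of distinct permutations is small, which limits how low the p-value can go.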
Fig 4

Relationship of intraspecific sequence divergence and geographic distance.

Relationship between mean overall intraspecific sequence divergence and the extent of isolation by distance (as quantified by the Mantel correlation coefficient, r), with species partitioned according to their type of distribution. Species with disjunct distributions (blue circles) tended to show stronger isolation-by-distance (i.e. higher r values) than species with continuous distributions (orange triangles), and this pattern was marginally stronger in species with higher overall levels of intraspecific sequence divergence.

Relationships between interspecific and intraspecific divergences

Nearest Neighbor distances (K2P) for the 102 species averaged 4.52%, but ranged from 0.00–12.98%. By comparison, maximum intraspecific divergence values averaged 1.69% (range = 0.00–7.32%) while mean intraspecific variation values averaged 0.68% (range = 0.00–3.46%). Therefore, the gap to the NN species averaged 2.73-fold the maximum intraspecific variation (Wilcoxon test: z = 7.22, p<0.0001), and 6.90-fold the mean intraspecific variation (z = 8.47, p<0.0001). While the barcode gap was clear in most cases, divergence to the NN was either absent or less than intraspecific variation in 12 cases (Figs 5 and 6, Table 2). The four cases (Table 2) which completely lacked interspecific divergence may reflect taxonomic over-splitting or introgression, as discussed in Mutanen et al. (2016) [10].
Fig 5

Mean intraspecific sequence divergences for 102 Lepidoptera species in relation to nearest neighbor.

Barcode sequence distances to the nearest neighbor species in relation to mean intraspecific distances for 102 species of Palearctic Lepidoptera. The straight line indicates where the distance to the nearest neighbor equals the respective intraspecific distance; a ‘barcode gap’ exists for species above this line.

Fig 6

Maximum intraspecific sequence divergences for 102 Lepidoptera species in relation to nearest neighbor.

Barcode sequence distances to the nearest neighbor species in relation to maximum intraspecific distances for 102 species of Palearctic Lepidoptera. The straight line indicates where the distance to the nearest neighbor equals the respective intraspecific distance; a ‘barcode gap’ exists for species above this line.

Table 2

Nearest-Neighbor distances (% K2P) for 102 species of Lepidoptera as well as the mean and maximum intraspecific divergences for the new records obtained in the present study and DS-MARKALL dataset.

| Species | N | Nearest Neighbor | Distance to NN | Max intra | Mean intra |
|---|---|---|---|---|---|
| Acleris aspersana | 14 | Acleris shepherdana | 4.20 | 0.65 | 0.23 |
| Acompsia cinerella | 23 | Acompsia subpunctella | 3.45 | 2.03 | 0.90 |
| Acronicta auricoma | 24 | Acronicta rumicis | 5.67 | 0.31 | 0.09 |
| Aethes kindermanniana | 8 | Aethes smeathmanniana | 3.18 | 1.86 | 1.08 |
| Agrotis fatidica | 12 | Agrotis cinerea | 2.92 | 0.51 | 0.18 |
| Anaplectoides prasina | 20 | Eurois occulta | 3.86 | 0.31 | 0.03 |
| Apamea furva | 15 | Apamea platinea | 2.98 | 0.93 | 0.41 |
| Apamea lateritia | 18 | Apamea schildei | 3.56 | 0.46 | 0.11 |
| Arctia caja | 24 | Arctia flavia | 4.75 | 2.35 | 0.65 |
| Arctia flavia | 9 | Borearctia menetriesi | 3.30 | 0.50 | 0.21 |
| Argyresthia pygmaeella | 14 | Argyresthia curvella | 4.25 | 1.24 | 0.28 |
| Arichanna melanaria | 12 | Bupalus piniaria | 5.25 | 0.31 | 0.12 |
| Athrips pruinosella | 8 | Athrips mouffetella | 7.80 | 2.51 | 1.43 |
| Autographa pulchrina | 108 | Autographa buraetica | 0.88 | 0.96 | 0.17 |
| Boloria dia | 29 | Boloria titania | 5.79 | 0.77 | 0.33 |
| Boloria napaea | 15 | Boloria aquilonaris | 0.82 | 2.02 | 1.01 |
| Boloria titania | 26 | Boloria chariclea | 0.15 | 2.23 | 0.75 |
| Brenthis ino | 28 | Brenthis daphne | 0.62 | 2.19 | 0.47 |
| Carsia sororiata | 14 | Aplocera simpliciata | 9.59 | 0.96 | 0.54 |
| Caryocolum leucomelanella | 16 | Caryocolum mazeli | 3.79 | 1.55 | 0.78 |
| Caryocolum pullatella | 12 | Caryocolum marmorea | 3.20 | 3.64 | 2.02 |
| Catoptria languidellus | 9 | Catoptria digitellus | 7.98 | 1.87 | 0.99 |
| Celypha rivulana | 10 | Celypha flavipalpana | 4.74 | 1.08 | 0.54 |
| Cerapteryx graminis | 22 | Tholera decimalis | 3.77 | 0.80 | 0.39 |
| Charissa ambiguata | 14 | Charissa predotae | 1.43 | 1.94 | 0.86 |
| Chionodes distinctella | 31 | Chionodes continuella | 4.91 | 3.37 | 1.35 |
| Chionodes holosericella | 20 | Chionodes fumatella | 5.73 | 0.61 | 0.26 |
| Coenonympha glycerion | 27 | Coenonympha rhodopensis | 7.94 | 0.93 | 0.32 |
| Coenonympha tullia | 21 | Coenonympha rhodopensis* | 0.31 | 1.03 | 0.33 |
| Colostygia aptata | 22 | Colostygia aqueata | 6.57 | 1.08 | 0.43 |
| Coscinia cribraria | 44 | Euplagia quadripunctaria | 8.6 | 7.32 | 3.33 |
| Crambus perlella | 17 | Crambus monochromellus* | 0.00 | 1.08 | 0.39 |
| Crocallis elinguaria | 30 | Crocallis albarracina* | 0.00 | 7.17 | 1.39 |
| Cupido minimus | 43 | Cupido osiris | 3.48 | 0.62 | 0.23 |
| Cyaniris semiargus | 27 | Agriades glandon | 4.27 | 1.24 | 0.31 |
| Diarsia brunnea | 31 | Diarsia dahlii | 3.63 | 0.68 | 0.20 |
| Diarsia mendica | 37 | Diarsia dahlii | 3.30 | 5.29 | 1.87 |
| Dicallomera fascelina | 13 | Gynaephora selenitica | 7.42 | 3.81 | 1.51 |
| Eana osseana | 12 | Eana argentana | 2.81 | 4.27 | 1.83 |
| Eana penziana | 30 | Eana nervana | 3.46 | 1.15 | 0.33 |
| Eilema lutarella | 17 | Setema cereola | 2.50 | 0.46 | 0.13 |
| Elachista bedellella | 13 | Elachista lugdunensis | 2.26 | 1.87 | 0.90 |
| Entephria caesiata | 30 | Entephria nobiliaria | 4.92 | 1.72 | 0.70 |
| Epermenia illigerella | 16 | Epermenia falciformis | 7.90 | 1.55 | 0.74 |
| Epinotia cruciana | 14 | Epinotia mercuriana | 3.96 | 2.98 | 0.92 |
| Epinotia trigonella | 12 | Epinotia indecorana* | 0.00 | 2.51 | 1.05 |
| Eudonia alpina | 5 | Eudonia mercurella | 4.55 | 0.16 | 0.06 |
| Eulamprotes wilkella | 19 | Eulamprotes libertinella | 7.92 | 6.29 | 1.88 |
| Eulithis populata | 27 | Eulithis prunata | 5.77 | 0.80 | 0.25 |
| Eulithis prunata | 27 | Eulithis populata | 5.77 | 5.78 | 2.48 |
| Eulithis testata | 8 | Eulithis prunata | 6.00 | 0.80 | 0.37 |
| Eumedonia eumedon | 25 | Plebejus orbitulus | 4.05 | 2.82 | 1.09 |
| Euphyia unangulata | 16 | Euphyia adumbraria | 4.48 | 0.46 | 0.16 |
| Eupithecia pusillata | 40 | Eupithecia oxycedrata | 5.31 | 1.55 | 0.20 |
| Eurois occulta | 20 | Spaelotis suecica | 3.34 | 0.32 | 0.06 |
| Euxoa recussa | 15 | Euxoa vitta | 2.01 | 0.46 | 0.06 |
| Gazoryctra ganna | 8 | Gazoryctra fuscoargenteus | 8.14 | 4.28 | 2.07 |
| Graphiphora augur | 15 | Eurois occulta | 3.77 | 0.46 | 0.10 |
| Gypsonoma nitidulana | 18 | Archips crataegana | 7.30 | 2.50 | 1.36 |
| Hadena compta | 19 | Hadena magnolii | 2.38 | 0.92 | 0.16 |
| Lasionycta imbecilla | 17 | Papestra biren | 5.55 | 1.08 | 0.54 |
| Lasionycta proxima | 20 | Polia bombycina | 5.89 | 1.41 | 0.73 |
| Levipalpus hepatariella | 14 | Agonopterix cluniana | 7.05 | 0.92 | 0.43 |
| Lycaena virgaureae | 28 | Lycaena tityrus | 2.91 | 0.62 | 0.19 |
| Macaria brunneata | 21 | Macaria wauaria | 7.26 | 1.24 | 0.47 |
| Matilella fusca | 12 | Selagia spadicella | 5.29 | 0.77 | 0.24 |
| Miltochrista miniata | 28 | Eucarta virgo | 8.75 | 0.00 | 0.00 |
| Mompha locupletella | 12 | Mompha miscella | 9.28 | 0.46 | 0.08 |
| Monopis spilotella | 6 | Monopis laevigella | 8.06 | 1.86 | 1.03 |
| Noctua interposita | 24 | Noctua atlantica | 4.09 | 0.46 | 0.06 |
| Ochsenheimeria urella | 14 | Ochsenheimeria vacculella | 9.63 | 3.77 | 2.03 |
| Oidaematophorus rogenhoferi | 15 | Oidaematophorus vafradactylus | 10.68 | 1.08 | 0.51 |
| Papestra biren | 15 | Lacanobia oleracea | 3.47 | 0.31 | 0.09 |
| Parnassius phoebus | 17 | Parnassius apollo | 1.97 | 1.55 | 0.49 |
| Pediasia aridella | 9 | Pediasia truncatellus | 5.25 | 0.93 | 0.47 |
| Perizoma hydrata | 19 | Perizoma affinitata | 0.15 | 0.00 | 0.00 |
| Phiaris obsoletana | 7 | Phiaris metallicana | 1.08 | 0.31 | 0.09 |
| Plebejus orbitulus | 10 | Agriades glandon | 3.03 | 1.24 | 0.43 |
| Polia bombycina | 18 | Polia hepatica | 4.26 | 0.32 | 0.07 |
| Polypogon tentacularia | 17 | Zanclognatha zelleralis | 3.95 | 0.62 | 0.23 |
| Pontia callidice | 9 | Pieris bryoniae | 8.35 | 3.64 | 1.83 |
| Protolampra sobrina | 10 | Spaelotis suecica | 5.34 | 0.16 | 0.03 |
| Pyrausta aerealis | 15 | Anania crocealis | 7.82 | 4.60 | 0.96 |
| Scopula incanata | 22 | Scopula marginepunctata | 3.46 | 3.31 | 1.16 |
| Scopula virgulata | 13 | Calamodes subscudularia | 7.22 | 0.62 | 0.26 |
| Scotopteryx chenopodiata | 28 | Scotopteryx bipunctaria | 12.98 | 0.62 | 0.15 |
| Scrobipalpula diffluella | 14 | Scrobipalpula tussilaginis | 1.09 | 4.29 | 2.07 |
| Selagia spadicella | 10 | Ortholepis betulae | 3.45 | 1.55 | 0.83 |
| Setina irrorella | 23 | Setina aurita | 0.00 | 3.84 | 2.01 |
| Sparganothis pilleriana | 9 | Doloploca punctulana | 6.88 | 1.08 | 0.40 |
| Syngrapha ain | 14 | Syngrapha microgamma | 2.82 | 0.46 | 0.10 |
| Syngrapha hochenwarthi | 13 | Syngrapha interrogationis | 2.85 | 0.62 | 0.32 |
| Syngrapha interrogationis | 25 | Syngrapha hochenwarthi | 2.85 | 0.46 | 0.07 |
| Trichiura crataegi | 28 | Trichiura castiliana | 4.50 | 1.87 | 0.82 |
| Udea uliginosalis | 23 | Udea alpinalis | 1.08 | 3.64 | 1.77 |
| Xanthorhoe decoloraria | 16 | Xanthorhoe montanata | 5.40 | 1.08 | 0.59 |
| Xanthorhoe montanata | 31 | Xanthorhoe decoloraria | 5.40 | 2.67 | 0.43 |
| Xestia speciosa | 39 | Xestia viridescens | 3.01 | 2.96 | 1.44 |
| Yponomeuta evonymella | 14 | Yponomeuta cagnagella | 1.15 | 0.77 | 0.22 |
| Ypsolopha dentella | 10 | Ypsolopha falcella | 6.48 | 0.77 | 0.34 |
| Ypsolopha nemorella | 12 | Ypsolopha falcella | 4.58 | 1.39 | 0.54 |
| Zeiraphera griseana | 23 | Zeiraphera rufimitrana | 4.09 | 0.48 | 0.05 |

Species with an asterisk (*) indicate cases where the nearest neighbor may represent an example of taxonomic over-splitting (cf. [10]). ‘Mean intra’ values correspond to ‘total-intra’ values in Table 1.

Mean intraspecific sequence divergences for 102 Lepidoptera species in relation to nearest neighbor.

Barcode sequence distances to the nearest neighbor species in relation to mean intraspecific distances for 102 species of Palearctic Lepidoptera. The straight line indicates where the distance to the nearest neighbor equals the respective intraspecific distance; for species above this line, a ‘barcode gap’ exists.

Maximum intraspecific sequence divergences for 102 Lepidoptera species in relation to nearest neighbor.

Barcode sequence distances to the nearest neighbor species in relation to maximum intraspecific distances for 102 species of Palearctic Lepidoptera. The straight line indicates where the distance to the nearest neighbor equals the respective intraspecific distance; for species above this line, a ‘barcode gap’ exists.
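The ‘barcode gap’ criterion described in the two figure captions can be illustrated with a short check. The following is a minimal sketch, not the authors' analysis: it assumes that the three numeric columns of the preceding table are, in order, distance to the nearest neighbor, maximum intraspecific distance, and mean intraspecific distance (all in %), and it treats a species lying exactly on the line as lacking a gap. The names `Row`, `has_barcode_gap` and `species_without_gap` are hypothetical, and only six rows of the table are copied in.

```python
from typing import List, NamedTuple


class Row(NamedTuple):
    species: str
    nn_dist: float     # distance to nearest-neighbor species (%), assumed column order
    max_intra: float   # maximum intraspecific distance (%), assumed column order
    mean_intra: float  # mean intraspecific distance (%), assumed column order


# A handful of rows copied from the table above (not the full data set).
ROWS = [
    Row("Eupithecia pusillata", 5.31, 1.55, 0.20),
    Row("Perizoma hydrata", 0.15, 0.00, 0.00),
    Row("Setina irrorella", 0.00, 3.84, 2.01),
    Row("Scrobipalpula diffluella", 1.09, 4.29, 2.07),
    Row("Udea uliginosalis", 1.08, 3.64, 1.77),
    Row("Xestia speciosa", 3.01, 2.96, 1.44),
]


def has_barcode_gap(row: Row, use_max: bool = True) -> bool:
    """True if the nearest-neighbor distance strictly exceeds the chosen
    intraspecific distance, i.e. the species plots above the diagonal."""
    intra = row.max_intra if use_max else row.mean_intra
    return row.nn_dist > intra


def species_without_gap(rows: List[Row], use_max: bool = True) -> List[str]:
    """Names of the species that plot on or below the diagonal."""
    return [r.species for r in rows if not has_barcode_gap(r, use_max)]


if __name__ == "__main__":
    for label, use_max in (("max intra", True), ("mean intra", False)):
        print(label, "->", species_without_gap(ROWS, use_max))
```

For these six rows, Setina irrorella, Scrobipalpula diffluella and Udea uliginosalis fall on or below the line under both the maximum and the mean criterion, whereas Perizoma hydrata shows a gap despite its very small nearest-neighbor distance because no intraspecific variation was recorded for it.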

Discussion

Our analysis of DNA barcode sequences from a phylogenetically diverse group of Lepidoptera from Asia and Europe revealed that intraspecific divergences increased with sampling intensity and distance. However, intraspecific divergences in most species remained low, with mean K2P divergences averaging 0.68% and exceeding 2.5% in 23 species of the complete sample. Divergence was >2.5% in just 9 of the 102 species at one or more of the three spatial levels of our analysis. By comparison, the species with divergence above 2.5% showed a mean sequence divergence of 4.62% to European populations of 5,016 species of Lepidoptera. This result corroborates patterns from earlier studies on North American [6] and European Lepidoptera [4], confirming that the barcode region of COI is an efficient tool for species identification, provided that the databases are of high quality, even when the reference sequences used for species identification derive from sites far distant from the locality under study. Irrespective of their origin, most sequences could be unambiguously allocated to a taxonomically defined species, although several cases of high intraspecific divergence may reflect overlooked species (as discussed later).

Conversely, 4 of the 7 species pairs (Crambus perlella/monochromella, Crocallis elinguaria/albarracina, Epinotia trigonella/indecorana, Coenonympha tullia/rhodopensis) that either lacked or possessed very limited (<0.5%) divergence from their NN may indicate taxonomic over-splitting rather than the failure of DNA barcoding to discriminate valid species (see [10]). For three other species pairs (Setina irrorella/aurita, Boloria titania/chariclea and Perizoma hydrata/affinitata), the low NN values suggest a recent divergence of valid, morphologically well-defined species or recent mitochondrial introgression. For example, an earlier study suggested that the low NN divergence between P. hydrata and P. affinitata resulted from mitochondrial introgression from P. hydrata to P. affinitata [18].

Our comparisons of European and South Siberian populations revealed regional sequence divergence in the respective region in about half the species, but most values were well below 2%. In addition, regional barcode variation was similar in species with disjunct distributions and in those with continuous ranges, indicating substantial gene flow in both cases. In part, this may reflect the fact that current distributions of Euro-Siberian Lepidoptera largely result from range expansions in the brief interval since the last glacial maximum, i.e. within less than 15,000 years [19], [20]. However, when intraspecific sequence divergences were examined using an isolation-by-distance approach, they were slightly stronger in species with disjunct ranges. Despite our limited sampling, some species (e.g. Elachista bedellella, Boloria napaea, B. titania and Plebejus orbitulus) showed clear divergence between South Siberian and European populations (see Table 1). In addition, populations of some species from northern Europe clustered with those from Asia rather than with those from central Europe (e.g. Xestia speciosa). This pattern likely indicates that formerly glaciated areas in northern Europe were sometimes recolonized by lineages from Asia. All these intraspecific patterns need to be examined in more detail through increased sampling in intermediate areas, and should be cross-checked using morphology and nuclear markers to clarify phylogeographic histories. Yet, for the purpose of species identification, we did not encounter any significant barriers, even in these taxa.
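The divergence values discussed here are Kimura two-parameter (K2P) distances. As an illustration of what such a value measures, the following minimal sketch applies the standard K2P formula, d = -1/2 ln(1 - 2P - Q) - 1/4 ln(1 - 2Q), where P and Q are the proportions of transitional and transversional differences between two aligned sequences. It is not the pipeline used in the study: the function name `k2p_distance` is hypothetical, the toy sequences are invented, and skipping sites with gaps or ambiguous bases is an assumption of this sketch.

```python
import math

PURINES = {"A", "G"}
VALID = set("ACGT")


def k2p_distance(seq1: str, seq2: str) -> float:
    """Kimura two-parameter distance between two aligned DNA sequences.

    Sites where either sequence has a gap or an ambiguous base are
    skipped (an assumption of this sketch). Multiply by 100 for percent.
    """
    if len(seq1) != len(seq2):
        raise ValueError("sequences must be aligned to the same length")

    transitions = transversions = compared = 0
    for a, b in zip(seq1.upper(), seq2.upper()):
        if a not in VALID or b not in VALID:
            continue
        compared += 1
        if a == b:
            continue
        # Purine<->purine or pyrimidine<->pyrimidine is a transition;
        # any purine<->pyrimidine change is a transversion.
        if (a in PURINES) == (b in PURINES):
            transitions += 1
        else:
            transversions += 1

    if compared == 0:
        raise ValueError("no comparable sites")

    p = transitions / compared
    q = transversions / compared
    w1 = 1.0 - 2.0 * p - q
    w2 = 1.0 - 2.0 * q
    if w1 <= 0.0 or w2 <= 0.0:
        raise ValueError("sequences too divergent for the K2P correction")
    return -0.5 * math.log(w1) - 0.25 * math.log(w2)


if __name__ == "__main__":
    # Invented 20-bp toy alignment with a single A->G transition.
    ref = "A" * 20
    alt = "G" + "A" * 19
    print(f"K2P distance: {100 * k2p_distance(ref, alt):.2f}%")
```

At the low divergences typical of intraspecific comparisons, the K2P value is only slightly larger than the raw proportion of differing sites; the correction becomes more noticeable as divergence grows.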

High intraspecific divergences–potential cryptic diversity

High intraspecific barcode divergences (>2–3%) may indicate the existence of overlooked species of Lepidoptera, but may also be due to mitochondrial introgression from a sister species [21]. Therefore, all such cases should be analyzed in more detail by examining divergence patterns at nuclear loci and morphological characters. We detected high intraspecific divergences (>2.5% max divergence) between European and Asian populations for 9 of the 102 species (Table 3). Six of these species have a disjunct distribution, suggesting the possible existence of cryptic species in South Siberia versus Europe. In three other species (e.g. Coscinia cribraria), barcode variation was high even within Europe without an obvious geographical pattern. The remainder of this section discusses these nine species in more detail. All of them group into two or more different BINs [3] (S1 Table), operational taxonomic units which in Lepidoptera are frequently but not always congruent with species boundaries (e.g. [7], [22]). In fact, deep barcode splits may be caused by pseudogenes, Wolbachia infection, hybridization, etc. [23], and these cases need to be analyzed using an integrative approach (e.g. [24]).
Table 3

Nine Euro-Siberian species of Lepidoptera with a max intra-specific K2P distance for COI >2.5% between Asia and Europe.

Species | total-Eurasia | intra-Asia | intra-Europe | inter-Europe-Asia
Caryocolum pullatella | 3.1 | 2.3 | 2.0 | 3.1
Coscinia cribraria | 3.5 | 1.0 | 3.5 | 3.2
Dicallomera fascelina | 1.5 | 0.5 | 0.2 | 3.5
Eana osseana | 3.0 | 0.1 | 0.4 | 3.9
Eulithis prunata | 2.5 | 0.0 | 2.0 | 4.3
Gazoryctra ganna | 2.2 | 1.4 | 1.1 | 3.6
Ochsenheimeria urella | 2.1 | 0.2 | 0.2 | 12.5
Pontia callidice | 1.8 | 0.1 | 0.2 | 3.4
Scrobipalpula diffluella | 2.1 | 0.2 | 0.5 | 3.6

1. Caryocolum pullatella (Tengström, 1848) (Gelechiidae)

C. pullatella is a Holarctic species that is widespread in northern Europe, but restricted to isolated localities in the Alps and Balkans [25]. As its Palearctic populations include two DNA barcode clusters with allopatric distributions (central/south-east Europe versus north Europe-South Siberia), this may indicate cryptic diversity. The situation potentially becomes more complex when North American specimens are considered, as they include additional BINs, and it requires further assessment.

2. Coscinia cribraria (Linnaeus, 1758) (Erebidae)

This morphologically variable species is widely distributed across the Palearctic. Numerous forms and subspecies have been described, including ssp. sibirica (Staudinger, 1892) from the Altai Mountains which was recently synonymized by Dubatolov (2010) [26]. However, Witt & Ronkay (2011) [27] suspected that sequence data would indicate the existence of a species complex. Current DNA barcode sequences are assigned to five clades; specimens from Altai belong to the same BIN as those from northern and central Europe. As the clusters within Europe do not show a clear phylogeographic pattern, sequence variation may indicate introgression or the impacts of Wolbachia infection [23].

3. Dicallomera fascelina (Linnaeus, 1758) (Erebidae)

D. fascelina is almost continuously distributed in temperate Eurasia, extending from northern Spain east to Korea, although absent from the Mediterranean region and the British Isles. Several subspecies have been recognized. Populations from the Altai region have been attributed to the nominotypical subspecies, but the clear differences in their external morphology and genitalia [28], coupled with their barcode divergence, suggest they represent a cryptic species.

4. Eana osseana (Scopoli, 1763) (Tortricidae)

E. osseana is a widespread Holarctic species, restricted to mountainous areas at the southern limits of its distribution. DNA barcodes indicate two divergent BINs, one from Europe, and a second from the Altai Mountains. As three additional BINs are known from North America, the species requires integrative revisionary work.

5. Eulithis prunata (Linnaeus, 1758) (Geometridae)

This species is almost continuously distributed in temperate Eurasia, but is restricted to mountainous areas in the southern parts of its range. Hausmann & Viidalepp (2012) [29] found high COI sequence divergence in E. prunata, with distances reaching 5.9% and at least six divergent haplotypes in Europe and Turkey. South Siberian populations have been assigned to the ssp. leucoptera (Djakonov, 1929), but it may represent a distinct species given its deep barcode divergence from other populations.

6. Gazoryctra ganna (Hübner, 1804) (Hepialidae)

G. ganna is an arctic-alpine species with a disjunct distribution. It occurs in the Alps and High Tatra Mountains, northern Finland, and European Russia, as well as at isolated localities to the Far East [30]. Moderate sequence divergence exists between northern and central European populations [8] while those from the Altai Mountains show high sequence divergence from both European clusters. Because of their differing flight times (late afternoon in Asia versus early morning in Europe) and slightly different phenotypes, the Asian specimens likely represent an overlooked species.

7. Ochsenheimeria urella (Fischer von Röslerstamm, 1842) (Ypsolophidae)

O. urella is widely although locally distributed in central and northern Europe, including European Russia. A previously doubtful record from the Far East [30], together with our record from Altai [11], indicates a much wider distribution in Asia. Members of this species are placed in two BINs, one shared by the Alps and Finland, and the other by Finland and the Altai Mountains.

8. Pontia callidice (Hübner, 1800) (Pieridae)

P. callidice shows a disjunct distribution in the high mountains of Eurasia from the Pyrenees to the Himalayas, and in the subarctic tundra from the Ural Mountains to the Far East. Linked to their geographic isolation, populations show considerable variation in wing patterns and have been assigned to several subspecies. The nominotypical subspecies occurs in the high mountains of Europe (Pyrenees and Alps). Della Bruna et al. (2004) [31] assigned populations from the Altai to ssp. hinducucica Verity, 1811 (type locality Hindu Kush), whereas Tshikolovets et al. (2009) [32] listed ssp. kalora (Moore, 1865) from Altai (type locality NW Himalaya). Korb & Bolshakov (2016) [33] listed ssp. halasia Huang et Murayama, 1992 from SW Altai (described from Halasi, [Chinese] Altai). Despite this nomenclatural uncertainty, the DNA barcode results indicate that specimens from the Alps belong to a barcode cluster very distinct from that of specimens from Russia (Altai), Kyrgyzstan and Tajikistan.

9. Scrobipalpula diffluella (Frey, 1870) (Gelechiidae)

In the Palearctic, the genus Scrobipalpula includes a complex of closely related species with disputed taxonomy [25]. S. diffluella shows a typical boreo-montane distribution with most records from northern and central Europe, extending to the southern Urals. Specimens of the newly detected population from the Altai show close morphological similarity with European material, but clear barcode divergence, suggesting cryptic diversity.

Conclusions

This study of a phylogenetically diverse sample of Lepidoptera across a wide geographic range within the Palearctic region corroborates the utility of DNA barcode data for enabling both species identification and species discovery. For most species, unequivocal identifications could be established for samples from a widely distant region (the Russian Altai mountains), even though the available reference data largely derived from regions in north and central Europe. On the other hand, in a few ‘species’ taxonomically known since Linnean times, patterns of sequence divergence suggest the possibility of unrecognized cryptic species diversity and demand further assessment using an integrative taxonomic approach. Hence, this study exemplifies the usefulness of well-curated DNA barcode libraries, whose power and versatility will expand as more sequence data are collated under strict quality standards.

Accession numbers and BINs.

List of species names, sample-IDs, process-IDs (from BOLD database), GenBank Accession numbers, BINs, and Institution/collection storing vouchers. (PDF)

1.  Biological identifications through DNA barcodes.

Authors:  Paul D N Hebert; Alina Cywinska; Shelley L Ball; Jeremy R deWaard
Journal:  Proc Biol Sci       Date:  2003-02-07       Impact factor: 5.349

2.  DNA barcoding Central Asian butterflies: increasing geographical dimension does not significantly reduce the success of species identification.

Authors:  Vladimir A Lukhtanov; Andrei Sourakov; Evgeny V Zakharov; Paul D N Hebert
Journal:  Mol Ecol Resour       Date:  2009-02-25       Impact factor: 7.090

3.  From writing to reading the encyclopedia of life.

Authors:  Paul D N Hebert; Peter M Hollingsworth; Mehrdad Hajibabaei
Journal:  Philos Trans R Soc Lond B Biol Sci       Date:  2016-09-05       Impact factor: 6.237

4.  A systematic catalogue of butterflies of the former Soviet Union (Armenia, Azerbaijan, Belarus, Estonia, Georgia, Kyrgyzstan, Kazakhstan, Latvia, Lituania, Moldova, Russia, Tajikistan, Turkmenistan, Ukraine, Uzbekistan) with special account to their type specimens (Lepidoptera: Hesperioidea, Papilionoidea).

Authors:  Stanislav K Korb; Lavr V Bolshakov
Journal:  Zootaxa       Date:  2016-09-01       Impact factor: 1.091

5.  Molecular biogeography of Europe: Pleistocene cycles and postglacial trends.

Authors:  Thomas Schmitt
Journal:  Front Zool       Date:  2007-04-17       Impact factor: 3.172

6.  Genetic patterns in European geometrid moths revealed by the Barcode Index Number (BIN) system.

Authors:  Axel Hausmann; H Charles J Godfray; Peter Huemer; Marko Mutanen; Rodolphe Rougerie; Erik J van Nieukerken; Sujeevan Ratnasingham; Paul D N Hebert
Journal:  PLoS One       Date:  2013-12-17       Impact factor: 3.240

7.  Deep intraspecific DNA barcode splits and hybridisation in the Udea alpinalis group (Insecta, Lepidoptera, Crambidae) - an integrative revision.

Authors:  Richard Mally; Peter Huemer; Matthias Nuss
Journal:  Zookeys       Date:  2018-03-13       Impact factor: 1.546

8.  DNA barcodes for 1/1000 of the animal kingdom.

Authors:  Paul D N Hebert; Jeremy R Dewaard; Jean-François Landry
Journal:  Biol Lett       Date:  2009-12-16       Impact factor: 3.703

9.  A DNA-based registry for all animal species: the barcode index number (BIN) system.

Authors:  Sujeevan Ratnasingham; Paul D N Hebert
Journal:  PLoS One       Date:  2013-07-08       Impact factor: 3.240

10.  Probing planetary biodiversity with DNA barcodes: The Noctuoidea of North America.

Authors:  Reza Zahiri; J Donald Lafontaine; B Christian Schmidt; Jeremy R deWaard; Evgeny V Zakharov; Paul D N Hebert
Journal:  PLoS One       Date:  2017-06-01       Impact factor: 3.240
