Marko Mutanen, Mari Kekkonen, Sean W J Prosser, Paul D N Hebert, Lauri Kaila.
Abstract
Each holotype specimen provides the only objective link to a particular Linnean binomen. Sequence information from holotypes is increasingly valuable because of the growing use of DNA barcodes in taxonomy. As type specimens are often old, it may only be possible to recover fragmentary sequence information from them. We tested the efficacy of short sequences from type specimens in the resolution of a challenging taxonomic puzzle: the Elachista dispunctella complex, which includes 64 described species with minuscule morphological differences. We applied a multistep procedure to resolve the taxonomy of this species complex. First, we sequenced a large number of newly collected specimens and as many holotypes as possible. Second, we used all sequences >400 bp to examine species boundaries. We employed three unsupervised methods (BIN, ABGD, GMYC) with specified criteria on how to handle discordant results and examined diagnostic bases from each delineated putative species (operational taxonomic units, OTUs). Third, we evaluated the morphological characters of each OTU. Finally, we associated short barcodes from types with the delineated OTUs. In this step, we employed various supervised methods, including distance-based, tree-based and character-based approaches. We recovered 658 bp barcode sequences from 194 of 215 fresh specimens and recovered an average of 141 bp from 33 of 42 holotypes. We observed strong congruence among all methods and good correspondence with morphology. We demonstrate potential pitfalls with tree-, distance- and character-based approaches when associating sequences of varied length. Our results suggest that sequences as short as 56 bp can often provide valuable taxonomic information. The results support significant taxonomic oversplitting of species in the Elachista dispunctella complex.
Keywords: Automatic Barcode Gap Discovery; Barcode Index Number; Elachista; GMYC; species delimitation; species delineation
Year: 2015 PMID: 25524367 PMCID: PMC4964951 DOI: 10.1111/1755-0998.12361
Source DB: PubMed Journal: Mol Ecol Resour ISSN: 1755-098X Impact factor: 7.090
Figure 1. Schematic presentation of the study. The molecular data were divided into two subsets according to sequence length. Only barcode sequences >400 bp were employed for OTU delineation for re‐examination of species boundaries (unsupervised, exploratory methods), whereas both subsets were used for type association (supervised, reference‐based methods). Both molecular stages (grey boxes) were followed by evaluation based on morphological characters (male genitalia) to aid the elimination of possible errors caused by single‐locus DNA data. OTUs supported by both morphology and DNA will be preferred in the taxonomic revision unless evidence supporting conflicting boundaries (i.e. cryptic species) emerges during revisionary work.
Figure 2. Length of DNA barcode sequences from 42 type specimens in the complex.
Figure 3. A Bayesian inference tree used for OTU delineation via GMYC (includes >400‐bp sequences) with bootstrap (from RAxML analysis) and posterior probability values (on the left). Coloured bars indicate the OTUs delineated by different methods (Barcode Index Numbers BIN, Automatic Barcode Gap Discovery ABGD, General Mixed Yule‐coalescent GMYC and morphology) (in the middle). ‘Types’ includes all DNA barcodes of type specimens associated with OTUs in this study (on the right). Types marked with * have controversial results, owing either to conflicts between the methods used or to a rather long distance to the nearest OTU (see text for further information). E. casascoensis (marked with #) is placed according to the distance measures, BLOG results and morphology, as none of the tree‐based methods associated E. casascoensis with any OTU.
The number of OTUs recognized by Automatic Barcode Gap Discovery (ABGD) analyses among 191 COI sequences >400 bp using three distance metrics
| Subst. model | X | Partition | Prior P = 0.0599 | 0.0359 | 0.0215 | 0.0129 | 0.00774 | 0.00464 | 0.00278 | 0.00167 | 0.001 |
|---|---|---|---|---|---|---|---|---|---|---|---|
| P | 1.5 | Initial | 1 | 5 | 35 | 35 | 35 | 35 | 35 | 35 | 35 |
| P | 1.5 | Recursive | 1 | 5 | 37 | 40 | 40 | 44 | 50 | 50 | 82 |
| JC | 1.5 | Initial | 0 | 19 | 19 | 19 | 19 | 19 | 19 | 19 |  |
| JC | 1.5 | Recursive | 0 | 19 | 19 | 19 | 20 | 25 | 27 | 62 |  |
| K2P | 1.5 | Initial | 0 | 19 | 19 | 19 | 19 | 19 | 19 | 19 |  |
| K2P | 1.5 | Recursive | 0 | 19 | 19 | 19 | 20 | 25 | 27 | 62 |  |
Columns 4–12 give the number of OTUs at each prior intraspecific divergence (P). X, relative gap width; P (in the model column), p‐distance; JC, Jukes‐Cantor (JC69); K2P, Kimura 2‐parameter.
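The nine prior values in the table are evenly spaced on a logarithmic scale. A minimal sketch of how such a grid can be generated, assuming a range of 0.001 to 0.1 in 10 steps: these settings are inferred from the spacing of the tabulated values and are not stated in the source, and the function name is hypothetical.

```python
def log_spaced_priors(p_min=0.001, p_max=0.1, steps=10):
    """Return `steps` prior values spaced geometrically from p_min to p_max.

    The defaults are assumptions chosen to reproduce the column headers
    of the table above; they are not taken from the source.
    """
    ratio = (p_max / p_min) ** (1.0 / (steps - 1))
    return [p_min * ratio ** k for k in range(steps)]


# Print the grid from largest to smallest, as in the table header.
for prior in reversed(log_spaced_priors()):
    print(f"{prior:.3g}")
```

Under these assumptions the second-largest grid value rounds to 0.0599 and the smallest is 0.001, matching the first and last columns of the table; the largest value (0.1) has no column.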
Results of the General Mixed Yule‐coalescent (GMYC) analyses for OTU formation (>400 bp sequences, n = 191, haplotypes = 78)
| Analysis | Clusters (CI) | Entities (CI) | Likelihood (null) | Likelihood (GMYC) | Likelihood ratio | Threshold(s) |
|---|---|---|---|---|---|---|
| Single | 14 (13–15) | 22 (20–25) | 644.652 | 660.319 | 31.33367*** | −0.001613839 |
| Multiple | 16 (15–16) | 27 (24–28) | 644.652 | 661.441 | 33.57823*** | −0.001613839; −0.001016101; −0.0002361306 |
Clusters, OTUs delineated by GMYC with more than one specimen; Entities, all OTUs (clusters and singletons) delineated by GMYC; CI, confidence interval; Likelihood (null), likelihood of the null model; Likelihood (GMYC), likelihood of the GMYC model; Threshold(s), the threshold(s) between speciation and coalescence processes; Single, single threshold model; Multiple, multiple threshold model. *** P < 0.001.
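The likelihood ratio column can be checked from the two likelihood columns, since a likelihood ratio statistic is twice the difference between the log-likelihoods of the alternative and null models. The sketch below is illustrative only: the function names are hypothetical, and the 2 degrees of freedom used for the chi-square tail are an assumption, not a value given in the source.

```python
import math


def likelihood_ratio(loglik_null, loglik_model):
    """Likelihood ratio statistic: 2 * (lnL_model - lnL_null)."""
    return 2.0 * (loglik_model - loglik_null)


def chi2_tail_df2(statistic):
    """Upper-tail probability of a chi-square variable with 2 degrees of
    freedom, which has the closed form exp(-x / 2). The choice of 2
    degrees of freedom is an assumption made for this example."""
    return math.exp(-statistic / 2.0)


# Rounded likelihoods from the table above.
single = likelihood_ratio(644.652, 660.319)
multiple = likelihood_ratio(644.652, 661.441)
print(round(single, 3), round(multiple, 3))
print(chi2_tail_df2(single) < 0.001, chi2_tail_df2(multiple) < 0.001)
```

The results agree with the tabulated 31.33367 and 33.57823 to about three decimals; the small residual is expected because the table reports rounded likelihoods.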
Diagnostic characters of delineated operational taxonomic units (OTUs)
| OTU | Diagnostic characters |
|---|---|
|  | 11 |
|  | 6 |
|  | 7 |
|  | 14 |
|  | 7 |
|  | 5 |
|  | 7 |
|  | 6 |
|  | 4 |
|  | 2 |
|  | 2 |
|  | 11 |
|  | 9 |
|  | 4 |
|  | 4 |
|  | 6 |
|  | 4 |
|  | 2 |
|  | 3 |
Morphological differentiation between operational taxonomic units (OTUs) in the complex, expressed as an identification key. Some species appear more than once in the key because characters sometimes vary within species, or because variation in dissection technique and success can produce apparent rather than real differences. For some species, the diagnostic morphological characters are found in one sex only.
| 1. Forewing fringe scales distally grey forming grey distal line along termen__ 2 |
| – Forewing fringe scales white, sometimes with single dark grey or brown tips of otherwise white scales __3 |
| 2. Juxta lobes with at least 5 distinctive setae; female bursa oval__ OTU 14 |
| – Juxta lobes without or with at most two small setae; female bursa divided into two portions separated by median narrowing__OTU 13 |
| 3. Digitate process twice as long as juxta lobes__ OTU 18 |
| – Digitate process at most as long as juxta lobes__4 |
| 4. Phallus longer than valva__ OTU 15 |
| – Phallus at most as long as valva__5 |
| 5. Uncus lobes narrow, three times as long as wide__OTU 5 |
| – Uncus lobes at most twice as long as broad__6 |
| 6. Phallus with curved apex__7 |
| – Phallus with straight apex__11 |
| 7. Forewing unicolorous white; digitate process laterally orientated__OTU 4 |
| – Forewing with scattered dark grey scales; digitate process posteriorly orientated__8 |
| 8. Length of phallus 5/6 of valva; juxta lobes as long as digitate process__OTU 7 |
| – Length of phallus at most 2/3 of valva__9 |
| 9. Digitate process elongate, at least three times as long as wide__OTU 2 |
| – Digitate process broad and triangular, length at most twice its width at base__10 |
| 10. Juxta lobes reduced__OTU 1 |
| – Juxta lobes developed, as large as digitate process__OTU 3 |
| 11. Juxta lobes longer than uncus lobes__OTU 7 |
| – Juxta lobes shorter than uncus lobes__12 |
| 12. Uncus lobes laterally produced, elongate, with pointed apex__OTU 6 |
| – Uncus lobes posteriorly directed, with rounded or at most slightly lateroposteriorly conical apex__13 |
| 13. Phallus as long as valva__14 |
| – Phallus shorter than valva__15 |
| 14. Female ostium bursae as wide as deep__ OTU 8 |
| – Female ostium bursae three times as wide as deep__ OTU 9; OTU 10 |
| 15. Valva somewhat S‐shaped, narrowest medially; phallus basally significantly broader than distally__OTU 12 |
| – Valva straight, parallel‐sided; phallus slender, hardly tapered toward apex__16 |
| 16. Uncus lobes as long as broad__OTU 17 |
| – Uncus lobes longer than broad__17 |
| 17. Valva 3× as long as its width basally__OTU 16 |
| – Valva 4× as long as its width basally__18 |
| 18. Valva 4× as long as digitate process__OTU 11 |
| – Valva 5× as long as digitate process__OTU 9; OTU 10 |
The type specimens associated with OTUs according to tree‐based methods (maximum likelihood, Bayesian inference, neighbour‐joining with K2P) with bootstrap and posterior probability values. Discovered differences in amino acids between types and corresponding OTUs are also given. E. casascoensis and E. moroccoensis are excluded as they associated with none of the OTUs in tree‐based analyses
| OTU | Types (ML) | Bootstrap (ML) | Types (BI) | PP | Types (NJ) | Bootstrap (NJ) | Amino acids |
|---|---|---|---|---|---|---|---|
| 1 |  | 88 |  | 0.97 |  | 63 |  |
| 7 |  | 48 |  | 0.6 |  | 60 |  |
| 8 |  | 79 |  | 0.99 |  | 30 |  |
|  |  |  |  |  |  |  |  |
|  |  |  |  |  |  |  |  |
|  |  |  |  |  |  |  |  |
|  |  |  |  |  |  |  |  |
|  |  |  |  |  |  |  |  |
|  |  |  |  |  |  |  |  |
| 9 |  | 69 |  | 1 |  | 53 |  |
|  |  |  |  |  |  |  |  |
|  |  |  |  |  |  |  |  |
| 10 |  | 73 |  | 0.99 |  | 48 |  |
|  |  |  |  |  |  |  |  |
|  |  |  |  |  |  |  |  |
| 12 |  | 56 |  | 0.97 |  | 32 |  |
|  |  |  |  |  |  |  |  |
|  |  |  |  |  |  |  |  |
|  |  |  |  |  |  |  |  |
|  |  |  |  |  |  |  |  |
|  |  |  |  |  |  |  |  |
|  |  |  |  |  |  |  |  |
|  |  |  |  |  |  |  |  |
| 13 |  | 84 |  | 0.98 | NO |  | One difference |
| 15 |  | 36 |  | 0.96 |  | 19 |  |
|  |  |  |  |  |  |  |  |
|  |  |  |  |  |  |  |  |
|  |  |  |  |  |  |  |  |
| 16 | NO |  | NO |  |  | 41 | One difference |
| 17 |  | 89 |  | 1 |  | 52 |  |
|  |  | 56 |  | 0.94 |  | 22 | One difference |
OTU, operational taxonomic unit; Types (ML), Types (BI), Types (NJ), types associated with OTUs in maximum likelihood, Bayesian inference or neighbour‐joining (respectively) analyses; Bootstrap (ML), Bootstrap (NJ), bootstrap values of OTUs including types in maximum likelihood or neighbour‐joining (respectively) analyses; PP, posterior probability values of OTUs including types in Bayesian inference analysis; NO, no association.
Type specimen associated as a sister to its corresponding OTU.
Best correspondence (by least K2P distance) between the short sequences from 33 type specimens in the complex and operational taxonomic units (OTUs). The second column gives the sequence length in base pairs. Values in columns 3–5 are the minimum K2P distances between the type specimen and the OTUs in question.
| Type | Seq. length | Best hit | 2nd best hit | 3rd best hit |
|---|---|---|---|---|
|  | 164 | OTU17 (0.000) | OTU18 (0.012) | OTU7 (0.031) |
|  | 164 | OTU10 (0.000) | OTU9 (0.019) | OTU7 (0.019) |
|  | 94 | OTU9 (0.011) | OTU7 (0.033) | OTU8, OTU10, OTU17, OTU18 (0.044) |
|  | 325 | OTU12 (0.000) | OTU16 (0.038) | OTU1 (0.042) |
|  | 164 | OTU7 (0.000) | OTU9, OTU10, OTU13, OTU15 (0.025) | OTU18 (0.031) |
|  | 164 | OTU18 (0.006) | OTU17 (0.019) | OTU7 (0.038) |
|  | 164 | OTU7 (0.006) | OTU10 (0.012) | OTU9 (0.012) |
|  | 94 | OTU13 (0.011) | OTU16 (0.022) | OTU7 (0.044) |
|  | 94 | OTU12 (0.000) | OTU15 (0.022) | OTU7 (0.027) |
|  | 94 | OTU15 (0.000) | OTU12 (0.013) | OTU7 (0.022) |
|  | 164 | OTU8 (0.000) | OTU9 (0.025) | OTU10 (0.031) |
|  | 164 | OTU8 (0.000) | OTU9 (0.025) | OTU10 (0.031) |
|  | 94 | OTU8 (0.000) | OTU9 (0.033) | OTU5, OTU7, OTU10 (0.044) |
|  | 56 | OTU9 (0.018) | OTU7 (0.037) | OTU6, OTU8, OTU10, OTU17, OTU18 (0.056) |
|  | 56 | OTU12 (0.000) | OTU15 (0.018) | OTU1, OTU16 (0.037) |
|  | 56 | OTU18 (0.000) | OTU17 (0.018) | OTU5 (0.037) |
|  | 658 | N/A | OTU18 (0.045) | OTU17 (0.051) |
|  | 164 | OTU8 (0.006) | OTU9 (0.031) | OTU10 (0.038) |
|  | 164 | OTU8 (0.000) | OTU9 (0.025) | OTU10 (0.031) |
|  | 164 | OTU15 (0.000) | OTU7 (0.025) | OTU12 (0.028) |
|  | 94 | OTU12 (0.000) | OTU15 (0.022) | OTU7 (0.027) |
|  | 94 | OTU1 (0.014) | OTU7 (0.044) | OTU12, OTU15, OTU17 (0.055) |
|  | 56 | OTU8 (0.000) | OTU5, OTU7, OTU9, OTU10 (0.037) | OTU18 (0.057) |
|  | 94 | OTU12 (0.000) | OTU15 (0.022) | OTU7 (0.027) |
|  | 164 | OTU12 (0.000) | OTU7 (0.031) | OTU10, OTU13, OTU15, OTU16 (0.038) |
|  | 93 | OTU12 (0.000) | OTU15 (0.022) | OTU7 (0.027) |
|  | 164 | OTU8 (0.019) | OTU7 (0.031) | OTU9, OTU10, OTU15, OTU17, OTU18 (0.038) |
|  | 164 | OTU10 (0.000) | OTU7, OTU9 (0.019) | OTU8, OTU12 (0.031) |
|  | 90 | OTU9 (0.011) | OTU7 (0.034) | OTU8, OTU10, OTU17, OTU18 (0.046) |
|  | 164 | OTU12 (0.000) | OTU7 (0.025) | OTU10, OTU13, OTU15, OTU16 (0.031) |
|  | 56 | OTU12, OTU15 (0.000) | OTU1 (0.018) | OTU5 (0.037) |
|  | 94 | OTU15 (0.000) | OTU12 (0.013) | OTU7 (0.022) |
|  | 56 | OTU10 (0.000) | OTU7 (0.027) | OTU8, OTU9 (0.037) |
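To illustrate how a minimum K2P distance between a short type fragment and reference barcodes can be computed, here is a minimal sketch using the standard Kimura 2‐parameter formula, d = −½ ln[(1 − 2P − Q)·√(1 − 2Q)], where P and Q are the proportions of transitions and transversions among compared sites. It assumes the sequences are already aligned to equal length and skips gaps and ambiguity codes; the function names, the handling of missing sites and the toy sequences are assumptions for illustration, not the authors' pipeline.

```python
import math

PURINES = frozenset("AG")
PYRIMIDINES = frozenset("CT")


def k2p_distance(seq_a, seq_b):
    """K2P distance over sites where both aligned sequences have A, C, G or T."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = transitions = transversions = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a not in "ACGT" or b not in "ACGT":
            continue  # skip gaps, N and other ambiguity codes
        compared += 1
        if a == b:
            continue
        pair = {a, b}
        if pair <= PURINES or pair <= PYRIMIDINES:
            transitions += 1
        else:
            transversions += 1
    if compared == 0:
        raise ValueError("no comparable sites")
    p = transitions / compared
    q = transversions / compared
    # math.log raises ValueError if the sequences are too divergent for K2P.
    d = -0.5 * math.log((1 - 2 * p - q) * math.sqrt(1 - 2 * q))
    return abs(d)  # abs() turns -0.0 into 0.0 for identical sequences


def best_hit(type_seq, otu_refs):
    """Return (otu_label, distance) for the closest reference sequence.

    otu_refs maps a hypothetical OTU label to a list of aligned references.
    """
    scored = [(k2p_distance(type_seq, ref), label)
              for label, refs in otu_refs.items() for ref in refs]
    distance, label = min(scored)
    return label, distance


# Toy example: a 56-site fragment differing by a single A->G transition.
reference = "ACGT" * 14
fragment = "G" + reference[1:]
print(round(k2p_distance(fragment, reference), 3))  # 0.018
```

With so few sites, one substitution already yields a distance of about 0.018 at 56 bp, which is consistent with the coarse steps seen for the shortest types in the table and shows why ties between OTUs become more likely as fragments get shorter.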
Results of BOLD ID engine. Type sequences were searched against all barcode records on BOLD with a minimum sequence length of 500 bp
| Types | BOLD ID engine (# matching sequences) | Result with highest similarity (%) |
|---|---|---|
|  | OTU1 (3) | 97.85 |
|  | OTU7 (7) | 100 |
|  | OTU7 (2) | 99.38 |
|  | OTU8 (7) | 99.38 |
|  | OTU8 (16) | 100 |
|  | N/A |  |
|  | OTU8 (7) | 100 |
|  | OTU8 (7) | 100 |
|  | OTU8 (7) | 100 |
|  | OTU8 (7) | 98.15 |
|  | OTU9 | 98.92 |
|  | N/A |  |
|  | OTU9 | 98.89 |
|  | OTU10 (7) | 100 |
|  | N/A |  |
|  | OTU10 (7) | 100 |
|  | OTU12 (20) | 100 |
|  | OTU12 (20) | 100 |
|  | OTU12 (20) | 100 |
|  | OTU12 (20) | 100 |
|  | OTU12 (1) | 100 |
|  | OTU12 (20) | 100 |
|  | OTU12 (20) | 100 |
|  | N/A |  |
|  | OTU13 (20) | 98.92 |
|  | N/A |  |
|  | OTU15 (9) | 100 |
|  | OTU15 (9) | 100 |
|  | OTU15 (9) | 100 |
|  | OTU17 (2) | 100 |
|  | OTU18 (7) | 99.38 |
|  | N/A |  |
|  |  | 100 |
OTU, operational taxonomic unit.
Type specimen associated with itself.
Results of four BLOG analyses (654, 162, 93, 54 bp). The type of E. moroccoensis is excluded from this table as it associated only with itself in all analyses
| OTU | # corr. (654 bp) | % corr. (654 bp) | # wrong (654 bp) | # n.c. (654 bp) | # corr. (162 bp) | % corr. (162 bp) | # wrong (162 bp) | # n.c. (162 bp) | # corr. (93 bp) | % corr. (93 bp) | # wrong (93 bp) | # n.c. (93 bp) | # corr. (54 bp) | % corr. (54 bp) | # wrong (54 bp) | # n.c. (54 bp) |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| OTU1 | 0 | 0 | 0 | 1 | 1 | 100 | 0 | 0 | 1 | 100 | 0 | 0 | 1 | 100 | 0 | 0 |
| OTU2 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| OTU3 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| OTU4 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| OTU5 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| OTU6 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| OTU7 | 0 | 0 | 0 | 2 | 2 | 100 | 0 | 0 | 2 | 100 | 0 | 0 | 2 | 100 | 0 | 0 |
| OTU8 | 0 | 0 | 0 | 7 | 5 | 71.43 | 1 | 1 | 5 | 71.43 | 1 | 1 | 6 | 85.71 | 1 | 0 |
| OTU9 | 3 | 100 | 0 | 0 | 3 | 100 | 0 | 0 | 3 | 100 | 0 | 0 | 3 | 100 | 0 | 0 |
| OTU10 | 0 | 0 | 0 | 3 | 3 | 100 | 0 | 0 | 3 | 100 | 0 | 0 | 3 | 100 | 0 | 0 |
| OTU11 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| OTU12 | 0 | 0 | 0 | 8 | 8 | 100 | 0 | 0 | 8 | 100 | 0 | 0 | 8 | 100 | 0 | 0 |
| OTU13 | 0 | 0 | 0 | 1 | 1 | 100 | 0 | 0 | 1 | 100 | 0 | 0 | 1 | 100 | 0 | 0 |
| OTU14 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| OTU15 | 0 | 0 | 0 | 4 | 1 | 25 | 0 | 3 | 3 | 75 | 0 | 1 | 4 | 100 | 0 | 0 |
| OTU16 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| OTU17 | 0 | 0 | 0 | 1 | 1 | 100 | 0 | 0 | 1 | 100 | 0 | 0 | 1 | 100 | 0 | 0 |
| OTU18 | 0 | 0 | 0 | 2 | 1 | 50 | 0 | 1 | 1 | 50 | 0 | 1 | 1 | 50 | 0 | 1 |
# corr., number of correctly classified elements; % corr., percentage of correctly classified elements; # wrong, number of wrongly classified elements; # n.c., number of unclassified elements; OTU, operational taxonomic unit.
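The percentage columns follow directly from the three count columns: correct classifications divided by all elements of that OTU (correct, wrong and unclassified). A short sketch, with a hypothetical function name, that reproduces a few rows of the table; reporting 0 for an OTU with no elements mirrors the all-zero rows above and is an assumption about how those cells were filled.

```python
def percent_correct(n_correct, n_wrong, n_unclassified):
    """Percentage of correctly classified elements, rounded to 2 decimals.

    Returns 0.0 when an OTU has no elements at all (assumed convention,
    matching the all-zero rows in the table).
    """
    total = n_correct + n_wrong + n_unclassified
    if total == 0:
        return 0.0
    return round(100.0 * n_correct / total, 2)


# OTU8 at 162 bp: 5 correct, 1 wrong, 1 not classified.
print(percent_correct(5, 1, 1))  # 71.43
# OTU15 at 162 bp: 1 correct, 0 wrong, 3 not classified.
print(percent_correct(1, 0, 3))  # 25.0
```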
Type association based on DNA barcodes. Numbers in cells correspond to the operational taxonomic unit (OTU) where each method linked a given type specimen. Discordant results are marked in bold
| Types | NJ | ML | BI | Pw. dist. | BOLD ID | BLOG |
|---|---|---|---|---|---|---|
|  | 1 | 1 | 1 | 1 | 1 | 1 |
|  | 7 | 7 | 7 | 7 | 7 | 7 |
|  |  |  |  | 7 | 7 | 7 |
|  | 8 | 8 | 8 | 8 | 8 | 8 |
|  | 8 | 8 | 8 | 8 | 8 | 8 |
|  | 8 | 8 | 8 | 8 | N/A | 8 |
|  | 8 | 8 | 8 | 8 | 8 | 8 |
|  | 8 | 8 | 8 | 8 | 8 | 8 |
|  | 8 | 8 | 8 | 8 | 8 | 8 |
|  | 8 | 8 | 8 | 8 | 8 |  |
|  | 9 | 9 | 9 | 9 | 9 | 9 |
|  | 9 | 9 | 9 | 9 | N/A | 9 |
|  | 9 | 9 | 9 | 9 | 9 | 9 |
|  | 10 | 10 | 10 | 10 | 10 | 10 |
|  | 10 | 10 | 10 | 10 | N/A | 10 |
|  | 10 | 10 | 10 | 10 | 10 | 10 |
|  | 12 | 12 | 12 | 12 | 12 | 12 |
|  | 12 | 12 | 12 | 12 | 12 | 12 |
|  | 12 | 12 | 12 | 12 | 12 | 12 |
|  | 12 | 12 | 12 | 12 | 12 | 12 |
|  | 12 | 12 | 12 | 12 | 12 | 12 |
|  | 12 | 12 | 12 | 12 | 12 | 12 |
|  | 12 | 12 | 12 | 12 | 12 | 12 |
|  | 12 | 12 | 12 | 12 | N/A | 12 |
|  |  | 13 | 13 | 13 | 13 | 13 |
|  | 15 | 15 | 15 |  | N/A | 15 |
|  | 15 | 15 | 15 | 15 | 15 | 15 |
|  | 15 | 15 | 15 | 15 | 15 | 15 |
|  | 15 | 15 | 15 | 15 | 15 | 15 |
|  | 17 | 17 | 17 | 17 | 17 | 17 |
|  | 18 | 18 | 18 | 18 | 18 | 18 |
|  | 18 | 18 | 18 | 18 | N/A | N/A |
|  | None | None | None | N/A | N/A | N/A |
NJ, Neighbour‐Joining; ML, maximum likelihood; BI, Bayesian inference; Pw. dist., pairwise distances; BOLD ID, BOLD Identification system; BLOG, BLOG analysis with the sequence length of 54 bp.
Placed as a sister lineage.
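A table of this kind lends itself to a simple per-type check of whether the methods agree. The sketch below is a hypothetical helper, not the authors' procedure: it treats missing results (None, "N/A", "None", empty) as abstentions, returns the most frequent OTU among the remaining methods, and flags the type as discordant when more than one OTU was proposed.

```python
from collections import Counter

# Values treated as "no result" for a method (an assumption of this sketch).
MISSING = {None, "N/A", "None", ""}


def summarize_assignments(assignments):
    """Return (most_frequent_otu, is_discordant) for one type specimen.

    assignments maps a method name to the OTU it proposed, or to a
    missing marker when the method gave no association.
    """
    votes = Counter(v for v in assignments.values() if v not in MISSING)
    if not votes:
        return None, False
    top_otu, _ = votes.most_common(1)[0]
    return top_otu, len(votes) > 1


# A row where every method that produced a result points to the same OTU.
row = {"NJ": 15, "ML": 15, "BI": 15, "Pw. dist.": None,
       "BOLD ID": "N/A", "BLOG": 15}
print(summarize_assignments(row))  # (15, False)

# A hypothetical discordant row, for illustration only.
print(summarize_assignments({"NJ": 16, "ML": 7, "BI": 7}))  # (7, True)
```

Whether a missing result should count as disagreement is a design choice; here it does not, so a type supported by only some methods is reported as concordant as long as those methods agree.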