Zhijun Zhou1, Huifang Guo2, Li Han2, Jinyan Chai2, Xuting Che2, Fuming Shi3.
Abstract
BACKGROUND: DNA barcoding has been developed as a useful tool for species discrimination. Several sequence-based species delimitation methods were used: Barcode Index Number (BIN), REfined Single Linkage (RESL), Automatic Barcode Gap Discovery (ABGD), jMOTU (a Java program that uses an explicit, determinate algorithm to define Molecular Operational Taxonomic Units), Generalized Mixed Yule Coalescent (GMYC), and a Bayesian implementation of the Poisson Tree Processes model (bPTP). Our aim was to estimate Chinese katydid biodiversity using standard DNA barcode cytochrome c oxidase subunit I (COI-5P) sequences.
Keywords: Clustering-based method, Similarity-based method; Cryptic species; DNA barcoding; Katydids; Species delimitation
Year: 2019 PMID: 30871464 PMCID: PMC6419471 DOI: 10.1186/s12862-019-1404-5
Source DB: PubMed Journal: BMC Evol Biol ISSN: 1471-2148 Impact factor: 3.260
Summary of K2P distances between barcode sequences at each taxonomic level
| Dataset | Label | n | Taxa* | Comparisons | Min Dist (%) | Mean Dist (%) | Max Dist (%) | SE Dist (%) |
|---|---|---|---|---|---|---|---|---|
| Entire dataset | Within Species | 2070 | 186 | 40,374 | 0.00 | 1.44 | 27.45 | 0.00 |
| | Within Genus | 2091 | 34 | 88,915 | 0.00 | 14.21 | 28.82 | 0.00 |
| | Within Family | 2164 | 1 | 2,211,077 | 1.85 | 22.21 | 34.09 | 0.00 |
| Identified Morphospecies | Within Species | 1614 | 109 | 36,966 | 0.00 | 1.54 | 27.45 | 0.00 |
| | Within Genus | 1526 | 24 | 55,140 | 0.00 | 15.35 | 28.74 | 0.00 |
| | Within Family | 1636 | 1 | 1,245,324 | 2.16 | 22.29 | 32.88 | 0.00 |
| Unidentified BIN-species | Within BIN-Species | 456 | 77 | 3,408 | 0.00 | 0.39 | 2.81 | 0.00 |
| | Within Genus | 502 | 18 | 13,109 | 1.07 | 12.86 | 28.07 | 0.00 |
| | Within Family | 528 | 1 | 122,611 | 3.27 | 21.67 | 32.78 | 0.00 |
*Singleton morphospecies or BIN-species were excluded
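The distances in this table are Kimura 2-parameter (K2P) distances. As a minimal sketch of how one such pairwise value can be computed, the function below applies the standard K2P formula, d = -0.5 ln((1 - 2P - Q) sqrt(1 - 2Q)), where P and Q are the proportions of transitions and transversions. The function name and the handling of gaps (sites with a non-ACGT character in either sequence are skipped, as in pairwise deletion) are assumptions for illustration, not the BOLD implementation.

```python
import math

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def k2p_distance(seq1, seq2):
    """K2P distance between two aligned sequences (illustrative helper)."""
    if len(seq1) != len(seq2):
        raise ValueError("sequences must be aligned to the same length")
    transitions = transversions = compared = 0
    for a, b in zip(seq1.upper(), seq2.upper()):
        # Pairwise deletion: skip sites where either base is a gap or ambiguous.
        if a not in "ACGT" or b not in "ACGT":
            continue
        compared += 1
        if a == b:
            continue
        # A change within purines or within pyrimidines is a transition.
        if {a, b} <= PURINES or {a, b} <= PYRIMIDINES:
            transitions += 1
        else:
            transversions += 1
    if compared == 0:
        raise ValueError("no comparable sites")
    p = transitions / compared    # proportion of transitions
    q = transversions / compared  # proportion of transversions
    # math.log raises ValueError if the sequences are too divergent (argument <= 0).
    return -0.5 * math.log((1 - 2 * p - q) * math.sqrt(1 - 2 * q))


# One transition in ten compared sites.
print(round(100 * k2p_distance("AAAAAAAAAA", "GAAAAAAAAA"), 2), "%")
```

Multiplying the result by 100 gives a percentage comparable to the Dist (%) columns.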
Normalized divergence statistics
| Quantity | Entire dataset | Identified Morphospecies | Unidentified BIN-species |
|---|---|---|---|
| Species Count | 186 | 109 | 77 |
| Mean Within-Species Dist (%) | 1.01 | 1.40 | 0.45 |
| SE of Mean Within-Species Dist (%) | 0.01 | 0.02 | 0.01 |
| Min Between-Species Dist (%) | 0.00 | 0.00 | 1.07 |
The within-species distribution is normalized to reduce bias in sampling at the species level
Records for which the distance to the nearest neighbour (NN) is less than 2% or less than the maximum intraspecific distance
| Species | Mean Intra-Sp | Max Intra-Sp | Nearest Neighbour (Process ID) | Distance to NN |
|---|---|---|---|---|
| | 0.72 | 1.08 | | 1.85 |
| | 1.05 | 1.38 | | 1.85 |
| | 5.1 | 9.25 | | 7.03 |
| | 9.45 | 18.85 | | 18 |
| | 0.15 | 0.15 | | 0.61 |
| | 11.58 | 17.81 | | 0.61 |
| | 0.2 | 0.46 | | 1.55 |
| | 0.44 | 1.39 | | 1.55 |
| | 0.3 | 0.3 | | 1.38 |
| | 0 | 0 | | 1.38 |
| | N/A | 0 | | 1.7 |
| | N/A | 0 | | 1.7 |
| | 0.31 | 0.46 | | 1.86 |
| | 1.03 | 1.69 | | 0.15 |
| | 1.17 | 3.45 | | 0.15 |
| | 0.8 | 27.45 | | 2.49 |
| | 2.62 | 4.91 | | 2.49 |
| | 1.95 | 4.9 | | 0.3 |
| | 0.86 | 1.85 | | 0 |
| | 1.37 | 3.78 | | 0 |
| | 2.43 | 4.4 | | 3.76 |
| | 0.24 | 0.61 | | 1.07 |
| | 0.23 | 0.46 | | 1.07 |
| | 2.12 | 7.97 | | 4.62 |
| | 0.13 | 0.46 | | 1.38 |
| Parapsyra sp. 2 | 0.1 | 0.15 | | 1.86 |
| Parapsyra sp. 3 | 0.43 | 0.76 | | 1.38 |
| | 5.49 | 8.09 | | 7.72 |
| | 0.15 | 0.46 | | 1.23 |
| | 0.15 | 0.3 | | 1.23 |
| | 2.69 | 10.36 | | 5.08 |
| | 0.05 | 0.3 | | 1.07 |
| | 1.25 | 2.81 | | 2.79 |
| | 2.82 | 6.59 | | 1.07 |
| | 1.33 | 3.27 | | 0.15 |
| | 0.7 | 1.39 | | 0.15 |
| | 0.46 | 0.76 | | 0.15 |
| | 2.12 | 3.78 | | 2.17 |
| | 1.75 | 7.92 | | 4.74 |
| | 8.41 | 12.14 | | 10.39 |
| | 1.21 | 2.65 | | 1.23 |
| | 0.81 | 1.38 | | 1.85 |
| | 9.25 | 13.02 | | 10.39 |
| | 0.51 | 0.77 | | 1.23 |
| | 0.31 | 0.46 | | 0.3 |
| | 4.57 | 7.79 | | 1.07 |
| | N/A | 0 | | 0.3 |
| | 0.88 | 1.7 | | 0.77 |
| | 0.89 | 1.54 | | 1.07 |
| | 0.06 | 0.3 | | 1.85 |
| | 5.28 | 9.43 | | 7.12 |
| | 0.92 | 0.92 | | 0.77 |
Alignment: MUSCLE (Edgar, 2004). Filters applied: sequences ≥ 600 bp only; records that contained contaminants, had stop codons, or were flagged as misidentifications or errors were excluded. Deletion method: pairwise deletion. N/A, not applicable
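The selection rule in the table caption can be written as a one-line check. This is a minimal sketch that assumes the caption means a record is listed when its nearest-neighbour distance is below 2% or below its own maximum intraspecific distance; the function name and the default threshold argument are illustrative. The first three value pairs are taken from the table, and the last pair is a hypothetical record that would not be listed.

```python
def is_flagged(max_intra, dist_to_nn, threshold=2.0):
    """True when the NN distance (%) is below the threshold or below the
    maximum intraspecific distance (%) of the same species."""
    return dist_to_nn < threshold or dist_to_nn < max_intra


examples = [
    (1.08, 1.85),  # from the table: NN closer than 2%
    (9.25, 7.03),  # from the table: NN closer than the max intraspecific distance
    (2.81, 2.79),  # from the table: NN just below the max intraspecific distance
    (0.50, 5.00),  # hypothetical record with a clear barcode gap
]
for max_intra, dist_to_nn in examples:
    print(max_intra, dist_to_nn, is_flagged(max_intra, dist_to_nn))
```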
Fig. 1 Maximum intraspecific variation (K2P) against record count (a) and maximum geographic extent (b) of sampled individuals. a Linear regression, y = 0.0007x + 0.0174, adjusted R² = 0.142, P < 0.001; b linear regression, y = 2E-05x + 0.0131, adjusted R² = 0.080, P < 0.001
Number of MOTUs of Chinese katydids inferred by each single-locus species delimitation method
| Quantity | DBCHL | DBMEC | DBPPM | DBTB | Total |
|---|---|---|---|---|---|
| COI-5P sequences | 596 | 376 | 993 | 200 | 2165 |
| Haplotypes | 390 | 158 | 530 | 147 | 1225 |
| Identified morphospecies/BINs/Record count | 40/69/579 | 40/51/341 | 43/63/551 | 8/13/164 | 131/196/1635 |
| Singleton BINs/Record count | 11/11 | 20/20 | 19/19 | 2/2 | 52/52 |
| Concordant BINs/Record count | 55/502 | 27/288 | 44/532 | 10/54 | 136/1376 |
| Discordant BINs/Record count | 3/66 | 4/33 | 0/0 | 1/108 | 8/207 |
| Unidentified BIN-species/Record count | 8/17 | 5/35 | 118/441 | 19/36 | 150/529 |
| Singleton BINs/Record count | 2/2 | 4/4 | 57/57 | 10/10 | 73/73 |
| Concordant BINs/Record count | 6/15 | 1/31 | 61/384 | 9/26 | 77/456 |
| Total BINs count/Record count | 77/596 | 56/376 | 181/992a | 32/200 | 346/2164 |
| RESL* | 71 | 56 | 180 | 42 | 349 |
| jMOTU (13 bp) | 61 | 57 | 169 | 31 | 318 |
| ABGD (2.15%) | 42 | 53 | 140 | 20 | 255 |
| Single-GMYC | 87 | 66 | 197 | 32 | 382 |
| Multiple-GMYC | 94 | 60 | 206 | 37 | 397 |
| bPTP | 55 | 56 | 178 | 23 | 312 |
Record count: the number of COI-5P sequences after quality control. *Alignment: MUSCLE (Edgar, 2004). Filters applied: sequences ≥ 600 bp only; records that contained contaminants, had stop codons, or were flagged as misidentifications or errors were excluded. Deletion method: pairwise deletion. One COI-5P sequence in the DBPPM dataset (marked a) has no BIN record because it did not meet the barcode compliance standards. All bPTP results were from Bayesian MCMC analyses. Results of the GMYC and bPTP analyses were from genealogies based on COI-5P haplotype sequences
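A quick arithmetic check of the table: for the rows below, the Total column should equal the sum of the four per-dataset counts (DBCHL, DBMEC, DBPPM, DBTB). The sketch copies the counts from the table; the dictionary layout and the function name are illustrative choices.

```python
# Per-dataset counts (DBCHL, DBMEC, DBPPM, DBTB) and the reported Total,
# copied from the table above.
ROWS = {
    "COI-5P sequences": ([596, 376, 993, 200], 2165),
    "Haplotypes": ([390, 158, 530, 147], 1225),
    "Total BINs": ([77, 56, 181, 32], 346),
    "RESL": ([71, 56, 180, 42], 349),
    "jMOTU (13 bp)": ([61, 57, 169, 31], 318),
    "ABGD (2.15%)": ([42, 53, 140, 20], 255),
    "Single-GMYC": ([87, 66, 197, 32], 382),
    "Multiple-GMYC": ([94, 60, 206, 37], 397),
    "bPTP": ([55, 56, 178, 23], 312),
}


def mismatched_rows(rows):
    """Return the names of rows whose per-dataset counts do not sum to the total."""
    return [name for name, (parts, total) in rows.items() if sum(parts) != total]


print(mismatched_rows(ROWS))  # prints [] because every row is consistent
```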
Morphospecies-BIN perfect matches in this study
| Species | BIN (Record count) | Species | BIN (Record count) |
|---|---|---|---|
| | BOLD:ADB5605 (1) | | BOLD:ADB5036 (19) |
| | BOLD:ADB6076 (2) | | BOLD:ACD4991 (16) |
| | BOLD:ADB9406 (2) | | BOLD:ADB6578 (1) |
| | BOLD:ACD5704 (2) | | BOLD:ADB5974 (1) |
| | BOLD:ADB6973 (18) | | BOLD:ADB6970 (4) |
| | BOLD:ADC2356 (3) | | BOLD:AAL2811 (13) |
| | BOLD:ADB5791 (24) | | BOLD:ACD4406 (2) |
| | BOLD:ADC0408 (7) | | BOLD:ADB6754 (1) |
| | BOLD:ADC0317 (3) | | BOLD:ACD4648 (1) |
| | BOLD:ADC0398 (7) | | BOLD:ADB4064 (1) |
| | BOLD:AAR9918 (6) | | BOLD:ADB6233 (10) |
| | BOLD:ACD8581 (34) | | BOLD:ADB9362 (2) |
| | BOLD:ACD4634 (45) | | BOLD:ADC0410 (5) |
| | BOLD:AAR9916 (3) | | BOLD:ADC0545 (3) |
| | BOLD:ACD8224 (2) | | BOLD:ADB4056 (1) |
| | BOLD:ADB3725 (10) | | BOLD:ACD5159 (2) |
| | BOLD:AAI9644 (1) | | BOLD:ACI0393 (7) |
| | BOLD:ADB5148 (18) | | BOLD:ACH8980 (5) |
| | BOLD:ADB5531 (2) | | BOLD:ACH9706 (6) |
| | BOLD:ADE1944 (9) | | BOLD:ACD4415 (2) |
| | BOLD:ADB3480 (11) | | BOLD:ACD5863 (1) |
| | BOLD:ADE1953 (1) | | BOLD:ADB5353 (7) |
| | BOLD:ADE1822 (1) | | BOLD:ACD5305 (2) |
| | BOLD:ADB3478 (17) | | BOLD:ACD5306 (2) |
| | BOLD:ACD5212 (7) | | BOLD:ABY3224 (1) |
| | BOLD:ACX8629 (2) | | BOLD:ADE3080 (1) |
| | BOLD:ACX8886 (6) | | BOLD:ADE0541 (6) |
| | BOLD:ACD7247 (60) | | BOLD:ACD6661 (5) |
| | BOLD:ADB4481 (1) | | BOLD:ADE3081 (1) |
| | BOLD:ADB5805 (3) | | BOLD:ADE3990 (1) |
| | BOLD:ACD7803 (3) | | BOLD:ADE1671 (3) |
| | BOLD:ACD5474 (3) | | BOLD:ADE3991 (1) |
| | BOLD:ACD6794 (4) | | BOLD:ADE3283 (1) |
| | BOLD:ADE0701 (1) | | BOLD:ADE0468 (1) |
| | BOLD:AAF0977 (7) | | BOLD:ADE1374 (3) |
| | BOLD:ADB4148 (4) | | BOLD:ADE4028 (7) |
| | BOLD:ACD6675 (2) | | BOLD:ACD5524 (3) |
| | BOLD:ADB9525 (1) | | BOLD:ADE1431 (3) |
| | BOLD:ACD8365 (7) | | BOLD:ADE2569 (2) |
| | BOLD:ADB5364 (7) | | BOLD:ADE2568 (2) |
| | BOLD:ADB5358 (3) | | BOLD:ADE2811 (3) |
| | BOLD:ADB5037 (9) | | |
The number of barcodes included in each BIN is given in parentheses
Discordant BINs report: different morphospecies that were assigned to a single BIN
| URI | Rank | Species 1 (Record count) | Species 2 (Record count) | Species 3 (Record count) |
|---|---|---|---|---|
| BOLD:AAY1322 | Species | | | |
| BOLD:ACD5539 | Species | | | |
| BOLD:ACD6726 | Species | | | |
| BOLD:ACD8335 | Species | | | |
| BOLD:ADB3332 | Species | | | |
| BOLD:ADB3697 | Species | | | |
| BOLD:ADB5868 | Species | | | |
| BOLD:ADE4977 | Species | | | |
The underlined species were split into more than one BIN
Summary of the 37 morphospecies that were assigned to more than one BIN
| Species | BIN (Record count) |
|---|---|
| | BOLD:ACD4962 (4), BOLD:ADB5725 (19), BOLD:ADB5876 (7), BOLD:ACD4960 (1), BOLD:ADB5726 (1), BOLD:ADB9877 (1), BOLD:ADC0465 (1) |
| | BOLD:ADB9302 (3), BOLD:ADB9301 (1), BOLD:ADB9303 (1) |
| | BOLD:ADB9596 (2), BOLD:ADB6577 (2), BOLD:ADC0531 (1) |
| | BOLD:ADC0256 (3), BOLD:ADC0257 (4), |
| | BOLD:ADB6782 (2), BOLD:ADB6356 (24), BOLD:ADB5579 (2), BOLD:ACD2116 (3), BOLD:ADB6002 (5), BOLD:ACN8107 (7), BOLD:ADB6842 (5), BOLD:ACD4542 (12), BOLD:ACD4543 (6), BOLD:ADC0188 (3), BOLD:ABV1952 (1) |
| | BOLD:ACE7214 (101), BOLD:ADB6191 (7), BOLD:ACD7324 (33) |
| | BOLD:ACA6035 (2), BOLD:ADM8991 (1), BOLD:ADM8992 (1) |
| | BOLD:AAP6087 (10), |
| | BOLD:ADB5001 (116), BOLD:ADE2467 (1) |
| | BOLD:ADB5352 (2), BOLD:ADB5870 (1) |
| | BOLD:ADA6038 (2), BOLD:ADA6039 (3), BOLD:ADA6037 (3) |
| | BOLD:ADA5568 (5), BOLD:ADA6837 (6), BOLD:ADA6838 (8), BOLD:ADA6839 (5), BOLD:ADA6836 (1) |
| | BOLD:ACX8110 (7), BOLD:ACD8277 (14), BOLD:ACD8278 (3), BOLD:ADM2486 (4) |
| | BOLD:ACD5194 (30), BOLD:ACD5193 (10) |
| | BOLD:ACD7465 (14), BOLD:ADB5963 (2), BOLD:ADB3600 (5), BOLD:ADB5962 (1) |
| | BOLD:ACD8152 (42), BOLD:ACQ5648 (1), BOLD:ACQ0048 (1), BOLD:ACQ0049 (1) |
| | BOLD:ADB5808 (4), BOLD:ADB5615 (12) |
| | BOLD:ADB4776 (3), BOLD:ADB5035 (1) |
| | BOLD:ADB5607 (7), BOLD:ADB5606 (2), BOLD:ACD4881 (5), BOLD:ADB4615 (6), BOLD:ADB5608 (1), BOLD:ADB6880 (1) |
| | BOLD:ADB7056 (7), BOLD:ADB9469 (1) |
| | BOLD:ACD7529 (9), BOLD:ACD6433 (20) |
| | BOLD:ACD5503 (6), BOLD:ADE5391 (5), BOLD:ADE5392 (1), |
| | BOLD:ACD5257 (41), BOLD:ACD5256 (7) |
| | BOLD:ACH8981 (2), BOLD:ADE5243 (2), BOLD:ADC2408 (5) |
| | BOLD:ADB3789 (42), BOLD:ACD8228 (3) |
| | BOLD:ACD6622 (17), BOLD:ACD6623 (3) |
| | BOLD:ADE1666 (2), BOLD:ADE1667 (1) |
| | BOLD:ADB7052 (1), BOLD:ADE1670 (1), BOLD:ADE1669 (1), BOLD:ADE1668 (1) |
| | BOLD:ACD4254 (7), BOLD:ADE3375 (3) |
| | BOLD:ADB5688 (10), BOLD:ADE3141 (4), |
| | BOLD:ADB3846 (61), BOLD:ADE2939 (1) |
| | BOLD:ADE2449 (2), BOLD:ADE2447 (1), BOLD:ADE2448 (1) |
| | BOLD:ADE0562 (5), BOLD:ADE0560 (4), BOLD:ADE0561 (1) |
| | BOLD:ADB3348 (24), BOLD:ADE0823 (3) |
The underlined BINs were shared by more than one morphospecies. The number of barcodes included in each BIN is given in parentheses
Fig. 2 Correspondence between the number of putative molecular species (MOTUs) and the cut-off value (bp) generated by jMOTU
Fig. 3 Correspondence between the number of genetically distinct MOTUs and the prior maximal distance (%) in ABGD, based on the K2P model
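Both figures plot how the number of MOTUs depends on a distance threshold. The toy sketch below illustrates that dependence with plain single-linkage clustering on the count of differing bases. It is an assumption-laden illustration, not the jMOTU or ABGD algorithm: the function names, the union-find approach, and the toy sequences are all hypothetical.

```python
def count_differences(a, b):
    """Number of aligned sites where two sequences carry different A/C/G/T bases."""
    return sum(1 for x, y in zip(a, b) if x != y and x in "ACGT" and y in "ACGT")


def cluster_by_cutoff(seqs, cutoff):
    """Single-linkage clusters: sequences differing by <= cutoff bases are linked."""
    parent = list(range(len(seqs)))

    def find(i):
        # Follow parent links to the cluster root, compressing the path on the way.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in range(len(seqs)):
        for j in range(i + 1, len(seqs)):
            if count_differences(seqs[i], seqs[j]) <= cutoff:
                parent[find(i)] = find(j)

    clusters = {}
    for i in range(len(seqs)):
        clusters.setdefault(find(i), []).append(i)
    return list(clusters.values())


# Toy alignment (hypothetical data): the cluster count drops as the cut-off grows.
toy = ["AAAAAAAA", "AAAAAAAT", "TTTTTTTT"]
for cutoff in (0, 1, 8):
    print(cutoff, len(cluster_by_cutoff(toy, cutoff)))
```

A larger cut-off can only merge clusters, so the number of MOTUs never increases as the threshold is raised.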
Bidirectional concordance among clustering methods for the dataset of identified katydid specimens, measured with the Adjusted Wallace coefficient and its 95% CI
| | Morphospecies | BIN-species | ABGD | jMOTU | RESL | sGMYC | mGMYC | bPTP |
|---|---|---|---|---|---|---|---|---|
| Morphospecies | – | 0.709 (0.683–0.735) | 0.955 (0.945–0.965) | 0.861 (0.839–0.883) | 0.841 (0.827–0.855) | 0.391 (0.375–0.408) | 0.451 (0.429–0.472) | 0.940 (0.929–0.951) |
| BIN-species | 0.954 (0.946–0.963) | – | 1.000 (1.000–1.000) | 0.999 (0.999–1.000) | 0.999 (0.998–0.999) | 0.535 (0.517–0.553) | 0.608 (0.583–0.633) | 0.998 (0.996–0.999) |
| ABGD | 0.859 (0.846–0.873) | 0.668 (0.644–0.693) | – | 0.880 (0.858–0.901) | 0.793 (0.779–0.808) | 0.367 (0.352–0.382) | 0.416 (0.398–0.435) | 0.969 (0.959–0.978) |
| jMOTU | 0.881 (0.867–0.894) | 0.760 (0.737–0.782) | 1.000 (1.000–1.000) | – | 0.874 (0.865–0.883) | 0.415 (0.399–0.431) | 0.469 (0.448–0.490) | 0.995 (0.993–0.996) |
| RESL | 0.953 (0.943–0.963) | 0.841 (0.815–0.868) | 1.000 (1.000–1.000) | 0.969 (0.949–0.990) | – | 0.450 (0.434–0.466) | 0.513 (0.492–0.535) | 0.998 (0.996–0.999) |
| sGMYC | 0.959 (0.951–0.966) | 0.974 (0.964–0.983) | 1.000 (1.000–1.000) | 0.995 (0.993–0.996) | 0.973 (0.963–0.982) | – | 0.916 (0.906–0.927) | 0.995 (0.992–0.998) |
| mGMYC | 0.969 (0.964–0.975) | 0.972 (0.964–0.980) | 0.996 (0.994–0.997) | 0.986 (0.983–0.989) | 0.974 (0.965–0.982) | 0.804 (0.796–0.813) | – | 0.990 (0.987–0.993) |
| bPTP | 0.872 (0.859–0.884) | 0.687 (0.663–0.711) | 0.998 (0.996–1.000) | 0.901 (0.880–0.922) | 0.816 (0.803–0.828) | 0.376 (0.361–0.392) | 0.427 (0.408–0.446) | – |
Values in parentheses indicate the total number of clusters generated for each analysis. Each value in the table indicates how well the clusters generated by the method indicated by the row label correspond to the clusters yielded by the method indicated in the column label. Each pair of methods is represented by two values in the table
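The two values per pair of methods arise because the Wallace coefficient is directional. The sketch below assumes the usual definitions: W(A→B) is the fraction of specimen pairs grouped together by A that are also grouped together by B, and the adjusted form subtracts the agreement expected by chance, AW = (W - Wi) / (1 - Wi), with Wi the probability that a random pair shares a cluster in B. Function names and the toy labels are illustrative, and confidence intervals are not computed here.

```python
from collections import Counter
from itertools import combinations


def wallace(a, b):
    """W(A->B): share of pairs co-clustered in A that are also co-clustered in B."""
    together_a = together_both = 0
    for i, j in combinations(range(len(a)), 2):
        if a[i] == a[j]:
            together_a += 1
            if b[i] == b[j]:
                together_both += 1
    if together_a == 0:
        raise ValueError("partition A has no co-clustered pairs")
    return together_both / together_a


def adjusted_wallace(a, b):
    """Adjusted Wallace A->B: Wallace corrected for chance agreement."""
    n = len(a)
    # Chance that a random pair of specimens falls in the same cluster of B.
    expected = sum(c * (c - 1) for c in Counter(b).values()) / (n * (n - 1))
    if expected == 1:
        raise ValueError("partition B puts every specimen in one cluster")
    return (wallace(a, b) - expected) / (1 - expected)


# Hypothetical labels for six specimens: 'fine' splits what 'coarse' lumps.
fine = [0, 0, 1, 1, 2, 2]
coarse = [0, 0, 0, 0, 1, 1]
print(adjusted_wallace(fine, coarse))  # splitter -> lumper
print(adjusted_wallace(coarse, fine))  # lumper -> splitter
```

In this toy case the splitter predicts the lumper perfectly while the reverse direction is much weaker, which is the kind of asymmetry visible between the row and column entries of the table.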