Junning Liu, Jiamei Jiang, Shuli Song, Luke Tornabene, Ryan Chabarria, Gavin J P Naylor, Chenhong Li.
Abstract
Species identification using DNA sequences, known as DNA barcoding, has been widely used in many applied fields. Current barcoding methods are usually based on a single mitochondrial locus, such as cytochrome c oxidase subunit I (COI). This type of barcoding does not always work when applied to species separated by short divergence times or to species that contain introgressed genes from closely related species. Herein we introduce a more effective multilocus barcoding framework based on gene capture and "next-generation" sequencing. We selected 500 independent nuclear markers for ray-finned fishes and designed a three-step pipeline for multilocus DNA barcoding. We applied our method to two exemplar datasets, each containing a pair of sister fish species: Siniperca chuatsi vs. Sini. kneri and Sicydium altum vs. Sicy. adelum, where the COI barcoding approach failed. Both our empirical and simulated results demonstrated that, given limited gene flow and sufficient separation time, species can be correctly identified using the multilocus barcoding method. We anticipate that, as the cost of DNA sequencing continues to fall, our multilocus barcoding approach will eclipse existing single-locus DNA barcoding methods as a means to better understand the diversity of the living world.
Year: 2017 PMID: 29192249 PMCID: PMC5709489 DOI: 10.1038/s41598-017-16920-2
Source DB: PubMed Journal: Sci Rep ISSN: 2045-2322 Impact factor: 4.379
Figure 1. Intra- (red) and interspecific (blue) p-distance of Siniperca chuatsi and Sini. kneri calculated using different numbers of nuclear loci or the COI gene. The scale for distances larger than 0.020 was compressed to fit all data points on the artboard.
Figure 2. Intra- (red) and interspecific (blue) p-distance of Sicydium altum and Sicy. adelum calculated using different numbers of nuclear loci or the COI gene.
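The p-distance plotted in Figures 1 and 2 is the proportion of aligned sites at which two sequences differ. The following is a minimal sketch of how such a distance could be pooled over several loci; it is not the authors' implementation. The function names, the dictionary layout (locus name mapped to an aligned sequence), the choice to skip gap and ambiguous sites, and the pooling of sites across loci are all assumptions made for illustration.

```python
# Hypothetical helper names; the source does not specify an implementation.
MISSING = {"-", "N", "?"}  # assumption: these symbols mark uninformative sites


def count_sites(seq_a, seq_b):
    """Return (differing, compared) site counts for two aligned sequences."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    differing = compared = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a in MISSING or b in MISSING:
            continue  # skip sites where either sample lacks a base call
        compared += 1
        if a != b:
            differing += 1
    return differing, compared


def multilocus_p_distance(sample_a, sample_b):
    """Pooled p-distance over the loci present in both samples.

    Each sample is a dict mapping locus name -> aligned sequence.
    """
    differing = compared = 0
    for locus in sample_a.keys() & sample_b.keys():
        d, c = count_sites(sample_a[locus], sample_b[locus])
        differing += d
        compared += c
    if compared == 0:
        raise ValueError("no comparable sites between the two samples")
    return differing / compared


# Toy data, invented for illustration only.
fish_1 = {"locus1": "ACGTACGT", "locus2": "GGCC"}
fish_2 = {"locus1": "ACGTACGA", "locus2": "GGC-"}
print(multilocus_p_distance(fish_1, fish_2))
```

In the toy data, one of the 11 comparable sites differs, so the pooled distance is 1/11. Pooling sites weights longer loci more heavily; averaging per-locus distances would be an equally plausible alternative.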
Figure 3. The relationship between number of loci used and success rate of identification between Siniperca chuatsi and Sini. kneri (green dots), and between Sicydium altum and Sicy. adelum (red triangles).
Figure 4. Identification success rate using simulated sequences under different scenarios. (a) migration rate equals zero and divergence time equals 700,000 generations (red line), 100,000 generations (black crosses), 10,000 generations (blue triangles), and 1000 generations (green circles); (b) divergence time equals 10,000 and migration rate equals 0 (red line), 0.000001 (black crosses), 0.00001 (blue triangles) and 0.0001 (green circles); (c) divergence time equals 100,000 and migration rate equals 0 (red line), 0.000001 (black crosses), 0.00001 (blue triangles) and 0.0001 (green circles); (d) divergence time equals 700,000 and migration rate equals 0 (red line), 0.000001 (black crosses), 0.00001 (blue triangles) and 0.0001 (green circles).
Figure 5. Comparison between success rates of species identification based on a single locus (red circles) and multiple loci (blue triangles). The length of the single locus equals the total length of multiple loci (300 bp each).
Results for species delimitation on unknown sample 893_3 (Sini. kneri) using BFD*, based on all 500 nuclear loci, with 20%, 30% or 50% of the 500 loci missing, or with conspecifics of Sini. kneri missing from the database.
| Data treatment | Model | Marginal likelihood | 2lnBF |
|---|---|---|---|
| Using all data | Lumping 839_3 and | −1575.80 | 20.62 |
| | Splitting 839_3 and | −1586.11 | |
| Excluding conspecifics of 839_3 | Lumping 839_3 and | −2350.77 | |
| | Splitting 839_3 and | −1222.90 | 2255.7 |
| Excluding 20% loci of the 839_3 | Lumping 839_3 and | −1467.41 | 26.54 |
| | Splitting 839_3 and | −1480.68 | |
| Excluding 30% loci of the 839_3 | Lumping 839_3 and | −1247.44 | 22.12 |
| | Splitting 839_3 and | −1258.50 | |
| Excluding 50% loci of the 839_3 | Lumping 839_3 and | −914.60 | 22.40 |
| | Splitting 839_3 and | −925.80 | |
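The 2lnBF column can be reproduced from the marginal likelihoods in the table, assuming those values are log marginal likelihoods: the statistic is twice the difference between the better-supported model and its alternative, and in each pair it is listed beside the model with the higher marginal likelihood. The sketch below only illustrates that arithmetic; the function name and the "lump"/"split" labels are hypothetical and do not come from BFD* itself.

```python
def two_ln_bf(lnl_lump, lnl_split):
    """Compare two models by their log marginal likelihoods.

    Returns (favored_model, 2lnBF), where 2lnBF is twice the difference
    in log marginal likelihood in favor of the better-supported model.
    """
    favored = "lump" if lnl_lump >= lnl_split else "split"
    return favored, 2.0 * abs(lnl_lump - lnl_split)


# Values taken from the "Using all data" rows of the table.
favored, stat = two_ln_bf(-1575.80, -1586.11)
print(favored, round(stat, 2))

# Values taken from the "Excluding conspecifics" rows of the table.
favored, stat = two_ln_bf(-2350.77, -1222.90)
print(favored, round(stat, 1))
```

With all data, lumping is favored by 2 × (−1575.80 − (−1586.11)) = 20.62. Once conspecifics are excluded from the database, splitting is favored by 2 × (−1222.90 − (−2350.77)) ≈ 2255.7. The other three rows check out the same way.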
Figure 6. A three-step multilocus DNA barcoding pipeline.