Monika Mioduchowska, Michał Jan Czyż, Bartłomiej Gołdyn, Jarosław Kur, Jerzy Sell.
Abstract
The cytochrome c oxidase subunit I (cox1) gene is the main mitochondrial molecular marker, playing a pivotal role in phylogenetic research, and is a crucial barcode sequence. Folmer's "universal" primers, designed to amplify this gene in metazoan invertebrates, have allowed quick and easy barcoding and phylogenetic analysis. On the other hand, the growing number of barcoding studies leads to more frequent publication of incorrect sequences, due to amplification of non-target taxa and insufficient analysis of the obtained sequences. Consequently, some sequences deposited in genetic databases are incorrectly described as obtained from invertebrates while in fact being bacterial sequences. In our study, in which we used Folmer's primers to amplify COI sequences of the crustacean fairy shrimp Branchipus schaefferi (Fischer 1834), we also obtained COI sequences of microbial contaminants from Aeromonas sp. When we searched the GenBank database for sequences closely matching these contaminants, we found entries described as representatives of Gastrotricha and Mollusca. When these entries were compared with other sequences bearing the same names in the database, the genetic distance between the incorrect and correct sequences attributed to the same species was ca. 65%. Although the responsibility for the correct molecular identification of species rests with researchers, the errors found in already published sequence data have not been re-evaluated so far. On the basis of a standard sampling technique, we have estimated with 95% probability that the chances of finding incorrectly described metazoan sequences in GenBank depend on the systematic group and vary from less than 1% (Mollusca and Arthropoda) up to 6.9% (Gastrotricha). Consequently, the increasing popularity of DNA barcoding and metabarcoding analysis may lead to overestimation of species diversity.
Finally, the study also discusses the sources of the problems with amplification of non-target sequences.
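To make the "genetic distance" comparison in the abstract concrete, the following is a minimal Python sketch of an uncorrected p-distance between two aligned sequences. The abstract does not say which distance model the authors used, so the choice of p-distance is an assumption made here for illustration. The function name `p_distance` and the two toy fragments are hypothetical and are not taken from the study.

```python
def p_distance(seq_a: str, seq_b: str) -> float:
    """Uncorrected p-distance: share of compared sites that differ.

    Assumes both sequences come from the same alignment (equal length).
    Sites with a gap or an ambiguous base in either sequence are skipped.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    differing = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        # Pairwise deletion: ignore anything that is not A, C, G or T.
        if a not in "ACGT" or b not in "ACGT":
            continue
        compared += 1
        if a != b:
            differing += 1
    if compared == 0:
        raise ValueError("no comparable sites")
    return differing / compared


# Hypothetical toy fragments, not real cox1 data.
entry_a = "ATGGCATTCCGA-TTAGGT"
entry_b = "ATGCCGTTGCGAATTTGGA"
print(f"p-distance: {p_distance(entry_a, entry_b):.3f}")
```

Under this measure, a value close to the ca. 65% reported above means that well over half of the compared sites differ. That is far more than would be expected between two sequences that really come from the same species.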
Year: 2018 PMID: 29933389 PMCID: PMC6014667 DOI: 10.1371/journal.pone.0199609
Source DB: PubMed Journal: PLoS One ISSN: 1932-6203 Impact factor: 3.240
Table 1. Correctly and incorrectly labeled eukaryotic and microbial cox1 gene fragment sequences used in our study.
| Accession number | Query cover | Identity | E-value | Reference | |
|---|---|---|---|---|---|
| KP702848 | 100% | 99% | 0.0 | Gandolfi et al., unpublished | |
| KT583299.1 | 99% | 83% | 2e-162 | Halliburton et al., unpublished | |
| AF308960.1 | 99% | 83% | 4e-163 | [ | |
| AF308958.1 | 99% | 82% | 2e-160 | [ | |
| KR685968.1 | 99% | 83% | 5e-156 | [ | |
| CP006579.1 | 100% | 88% | 0.0 | [ | |
| CP007567.1 | 100% | 87% | 0.0 | [ | |
| CP013067.1 | 100% | 82% | 3e-153 | [ | |
| CP002209.1 | 100% | 80% | 8e-141 | [ | |
| CP013650.1 | 100% | 80% | 8e-141 | Kim and Lee, unpublished | |
| AB238526.1 | 99% | 80% | 6e-136 | [ | |
| KM221059.1 | 99% | 74% | 7e-91 | [ | |
| AB161601.1 | 99% | 74% | 1e-88 | [ | |
| AB161622.1 | 99% | 74% | 4e-88 | [ | |
| AB161608.1 | 99% | 73% | 5e-87 | [ | |
| JF432035.1 | 100% | 98% | 0.0 | [ | |
| CP013251.1 | 100% | 83% | 1e-158 | [ | |
| AP014637.1 | 100% | 78% | 4e-138 | [ | |
| CP001978.1 | 98% | 79% | 7e-123 | [ | |
| CP000155.1 | 100% | 79% | 1e-125 | [ | |
| JF432028.1 | 99% | 79% | 1e-132 | [ | |
| JF432025.1 | 98% | 78% | 1e-118 | [ | |
| JF432023.1 | 89% | 76% | 4e-100 | [ | |
| JF432021.1 | 98% | 74% | 2e-79 | [ | |
| JF432026.1 | 97% | 73% | 3e-83 | [ | |
Fig 1. Maximum Likelihood tree showing evolutionary relatedness of eukaryotic and prokaryotic COI barcoding sequences.
Bootstrap values are shown at the nodes. Asterisk colors indicate correctly and incorrectly described sequences of the same species. The sequences obtained in our study for B. schaefferi were correctly labeled and deposited in GenBank. More information about the sequences used is provided in Table 1.