María L Peláez, José L Horreo, Ricardo García-Jiménez, Antonio G Valdecasas.
Abstract
Public molecular databases are fundamental tools for modern taxonomic studies whose usefulness relies on the soundness of the data within them. Here, we study potential errors that can arise along the data pipeline from sampling, specimen identification and molecular processing (digestion, amplification and sequencing) to the submission of sequences to these databases by using the DNA sequences of Hydrachnidia (Acari, Parasitengona) as a case study. Our results indicate that molecular information is available for only about 3% of the Hydrachnidia species known to date; yet, within this small percentage, errors are present in almost 5% of the species analyzed (0.5% of the sequences and almost 11% of the genera). This study underscores the scarcity of genetic data available for Hydrachnidia, but also that the proportion of errors in DNA sequences is relatively small. Even so, it highlights the danger associated with using DNA sequences from public databases, particularly for species identification, and reinforces the need for greater quality control measures and/or protocols to avoid an intensification of errors in the (post) genomics era. Finally, our study emphasizes that potential errors may also reveal cryptic diversity within a species.
Entities:
Keywords: BOLD; Cryptic diversity; GenBank; Phylogeny; Species identification; Water mites
Year: 2022 PMID: 35212872 DOI: 10.1007/s10493-022-00703-0
Source DB: PubMed Journal: Exp Appl Acarol ISSN: 0168-8162 Impact factor: 2.132