Gontran Sonet, Kurt Jordaens, Zoltán T Nagy, Floris C Breman, Marc De Meyer, Thierry Backeljau, Massimiliano Virgilio.
Abstract
Identification by DNA barcoding is more likely to be erroneous when there is a large distance between the query (the barcode sequence of the specimen to identify) and its best match in a reference barcode library. The number of such false positive identifications can be decreased by setting a distance threshold above which an identification has to be rejected. To this end, we recently proposed using an ad hoc distance threshold that produces identifications with an estimated relative error probability fixed by the user (e.g. 5%). Here we introduce two R functions that automate the calculation of ad hoc distance thresholds for reference libraries of DNA barcodes. The scripts of both functions, a user manual and an example file are available on the JEMU website (http://jemu.myspecies.info/computer-programs) as well as on the Comprehensive R Archive Network (CRAN, http://cran.r-project.org).
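The rejection rule described above can be illustrated with a minimal Python sketch. It is not the code of the two R functions: the function name `best_match_with_threshold` and the input format (pairs of reference species label and distance to the query) are assumptions made for illustration, and ties between equally close references are ignored.

```python
from typing import Optional, Sequence, Tuple


def best_match_with_threshold(
    matches: Sequence[Tuple[str, float]], threshold: float
) -> Optional[str]:
    """Return the species of the closest reference barcode, or None if rejected.

    `matches` holds (species label, distance to the query) pairs, one per
    reference barcode. This hypothetical helper does not handle ties.
    """
    # The best match is the reference barcode at the smallest distance.
    species, distance = min(matches, key=lambda m: m[1])
    # Reject the identification when the best match lies above the threshold.
    if distance > threshold:
        return None
    return species
```

With a threshold of 0.0334, a query whose best match is at distance 0.01 is identified, while a query whose best match is at distance 0.05 is rejected.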
Keywords: COI; Species identification; accuracy; precision; reference library; relative error
Year: 2013 PMID: 24453565 PMCID: PMC3890685 DOI: 10.3897/zookeys.365.6034
Source DB: PubMed Journal: Zookeys ISSN: 1313-2970 Impact factor: 1.546
Figure 1. DNA barcoding identification using the best close match method.
Figure 2. Estimation of the ad hoc distance threshold. Example of output obtained using the function adhocTHR with default settings (30 arbitrary distance thresholds, linear fit and an estimated relative identification error (RE) of 5%). The following message was given by the function: "for a RE of 0.05 use a threshold of 0.0334".
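The idea behind this estimate (RE evaluated at several distance thresholds, a linear fit, then the threshold read off at the target RE) can be sketched in Python as follows. This is not the adhocTHR implementation. It assumes that RE is the fraction of accepted identifications that are wrong, and that each library sequence has already been queried against the rest of the library to give a (best-match distance, identification correct?) pair. All function names are hypothetical.

```python
from typing import List, Optional, Sequence, Tuple


def relative_error(
    results: Sequence[Tuple[float, bool]], threshold: float
) -> Optional[float]:
    """Assumed RE: wrong identifications / accepted identifications."""
    # Keep only identifications whose best-match distance is within the threshold.
    accepted = [correct for distance, correct in results if distance <= threshold]
    if not accepted:
        return None  # nothing accepted, so RE is undefined at this threshold
    return accepted.count(False) / len(accepted)


def fit_line(xs: Sequence[float], ys: Sequence[float]) -> Tuple[float, float]:
    """Ordinary least-squares fit of y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def threshold_for_error(
    thresholds: List[float], errors: List[float], target: float = 0.05
) -> float:
    """Solve the fitted line for the threshold giving the target RE."""
    slope, intercept = fit_line(thresholds, errors)
    if slope <= 0:
        raise ValueError("RE does not increase with the threshold; no solution")
    return (target - intercept) / slope
```

In use, `relative_error` would be evaluated at each candidate threshold, and the resulting (threshold, RE) pairs passed to `threshold_for_error` together with the target RE (e.g. 0.05).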