Youri Lammers, Tamara Peelen, Rutger A Vos, Barbara Gravendeel.
Abstract
BACKGROUND: Mixtures of internationally traded organic substances can contain parts of species protected by the Convention on International Trade in Endangered Species of Wild Fauna and Flora (CITES). These mixtures often raise the suspicion of border control and customs offices, which can lead to confiscation, for example in the case of Traditional Chinese medicines (TCMs). High-throughput sequencing of DNA barcoding markers obtained from such samples provides insight into species constituents of mixtures, but manual cross-referencing of results against the CITES appendices is labor intensive. Matching DNA barcodes against NCBI GenBank using BLAST may yield misleading results both as false positives, due to incorrectly annotated sequences, and false negatives, due to spurious taxonomic re-assignment. Incongruence between the taxonomies of CITES and NCBI GenBank can result in erroneous estimates of illegal trade.
Year: 2014 PMID: 24502833 PMCID: PMC3922334 DOI: 10.1186/1471-2105-15-44
Source DB: PubMed Journal: BMC Bioinformatics ISSN: 1471-2105 Impact factor: 3.169
Figure 1. Steps of the analysis pipeline. Section A shows both the version control of the local CITES database and the updating of that database with NCBI taxonomy IDs of CITES-protected taxa. Section B shows the process of running a BLAST search on the user's input FASTA file, filtering the output according to minimum BLAST quality settings and blacklisted GenBank entries, and flagging CITES-protected taxa by comparing the BLAST hits against the local CITES database (and optionally other databases).
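The filtering and flagging steps of Section B can be illustrated with a minimal Python sketch. It is not the authors' implementation: the `Hit` record, the `flag_cites_hits` function, the threshold defaults, the taxonomy IDs and the `HYPO...` accessions are all hypothetical placeholders chosen for illustration. Only the blacklisted accession EF090607 is taken from the blacklist table in this record.

```python
from dataclasses import dataclass


@dataclass
class Hit:
    """One BLAST hit (hypothetical record layout, not the pipeline's own)."""
    query: str        # cluster or read identifier
    accession: str    # GenBank accession of the subject sequence
    taxid: int        # NCBI taxonomy ID of the subject sequence
    identity: float   # percent identity
    length: int       # alignment length
    evalue: float     # BLAST e-value


def flag_cites_hits(hits, cites_db, blacklist,
                    min_identity=97.0, min_length=100, max_evalue=1e-20):
    """Return (hit, cites_entry) pairs for hits that pass all filters.

    The default thresholds are assumed values for the example only.
    """
    flagged = []
    for hit in hits:
        # Skip GenBank entries known to be incorrectly annotated.
        if hit.accession in blacklist:
            continue
        # Apply the minimum BLAST quality settings.
        if (hit.identity < min_identity or hit.length < min_length
                or hit.evalue > max_evalue):
            continue
        # Flag the hit if its taxonomy ID is in the local CITES database.
        entry = cites_db.get(hit.taxid)
        if entry is not None:
            flagged.append((hit, entry))
    return flagged


# Hypothetical local CITES database keyed by NCBI taxonomy ID.
cites_db = {1001: {"cites_taxon": "Example protected taxon", "appendix": 2}}
blacklist = {"EF090607"}

hits = [
    Hit("cluster_0", "HYPO0001", 1001, 99.0, 250, 1e-50),   # passes, flagged
    Hit("cluster_1", "EF090607", 1001, 100.0, 300, 1e-80),  # blacklisted
    Hit("cluster_2", "HYPO0002", 1001, 90.0, 250, 1e-30),   # identity too low
    Hit("cluster_3", "HYPO0003", 2002, 99.5, 250, 1e-60),   # not CITES-listed
]

flagged = flag_cites_hits(hits, cites_db, blacklist)
```

In this sketch, a user-specified additional names database could be handled by merging extra taxonomy ID entries into `cites_db` before the call.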
User-specified, additional names database
| 44587 | |
| 44686 |
Examples of species of Panax that are listed in NCBI GenBank as distinct species, but are considered to be synonyms of P. ginseng (listed on CITES Appendix II) by most other taxonomic databases, including the CITES Appendices, due to their unpublished status and close genetic similarity to P. ginseng.
User-specified blacklist
| EF090607 | Nyctaginaceae | |
| EU135905 | Nyctaginaceae |
Examples of NCBI GenBank accessions placed in our user-specified blacklist. These accessions are erroneously listed on GenBank as belonging to Gastrodia elata, a highly endangered orchid (monocot) species listed on CITES Appendix I, but instead contain nrITS sequences of the non-endangered eudicot family Nyctaginaceae.
Summarized test cases
| Incense cone | Cluster 0 | 210372 | 2 | ||
| Agarwood chips | Cluster 22 | 314115 | 2 | ||
| Cluster 5 | Orchidaceae spp. | 179352 | 2 | ||
| Cluster 1400 | Orchidaceae spp. | 335151 | 2 | ||
| Cluster 4500 | Orchidaceae spp. | 161871 | 2 |
Condensed pipeline results for the incense cone, agarwood chips and the Dendrobium stem IonTorrent clusters. Only the clusters with CITES hits are listed; for each CITES hit, the cluster with the lowest e-value and highest sequence similarity percentage was selected. For simplicity's sake, columns with BLAST hit metadata (i.e. bit-score, e-value, accession numbers etc.) have been omitted; the full results are accessible in Additional file 1.
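The per-hit selection described in this caption could be sketched as follows. This is an illustrative assumption rather than the authors' code: the caption does not say how e-value and similarity are combined, so the sketch ranks by lowest e-value first and uses highest similarity as a tie-breaker. The function name, row layout and all values are hypothetical.

```python
def best_cluster_per_taxon(rows):
    """Keep one row per CITES taxon: lowest e-value, then highest identity.

    The ordering of the two criteria is an assumption made for this example.
    """
    best = {}
    for row in rows:
        # Smaller tuples rank better: low e-value, then high identity.
        rank = (row["evalue"], -row["identity"])
        taxon = row["taxon"]
        if taxon not in best or rank < best[taxon][0]:
            best[taxon] = (rank, row)
    return {taxon: row for taxon, (_, row) in best.items()}


# Hypothetical condensed hits (not taken from Additional file 1).
rows = [
    {"taxon": "Taxon A", "cluster": "c1", "evalue": 1e-40, "identity": 98.0},
    {"taxon": "Taxon A", "cluster": "c2", "evalue": 1e-60, "identity": 97.0},
    {"taxon": "Taxon A", "cluster": "c3", "evalue": 1e-60, "identity": 99.0},
    {"taxon": "Taxon B", "cluster": "c4", "evalue": 1e-10, "identity": 95.0},
]

selected = best_cluster_per_taxon(rows)
```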