Pedro M Pedro1,2, Jandui Amorim1, Martha V R Rojas1, Ivy Luizi Sá1, Allan Kardec Ribeiro Galardo3,4, Noel Fernandes Santos Neto3,4, Dario Pires de Carvalho5, Kaio Augusto Nabas Ribeiro5, Maria Tereza Pepe Razzolini6, Maria Anice Mureb Sallum1.
Abstract
A practical limitation to many metabarcoding initiatives is that sampling methods tend to collect many non-target taxa, which become "amplicon noise" that can saturate Next Generation Sequencing results and lead to both financial and resource inefficiencies. An available molecular tool that can significantly reduce these non-target amplicons, and with them the need for pre-DNA-extraction sorting of bycatch, is the design of PCR primers tailored to the taxa under investigation. We assessed whether the D2 expansion segment of the 28S ribosomal operon can limit this shortcoming within the context of mosquito (Culicidae) monitoring. We designed PCR primers that are fully conserved across mosquitoes and exclude from amplification most other taxa likely to be collected with current sampling apparatuses. We show that, given enough sequencing depth, D2 is an effective marker for the detection of mosquito sequences within mock genomic DNA pools. As few as 3,050 quality-filtered Illumina reads were able to recover all 17 species in a bulk pool containing as little as 0.2% of constituent DNA from single taxa. We also mixed these mosquito DNA pools with high concentrations of non-Culicidae bycatch DNA and show that the component mosquito species are generally still recoverable and faithful to their original relative frequencies. Finally, we show that there is little loss of fidelity in abundance parameters when pools from degraded DNA samples were sequenced using the D2 primers.
©2020 Pedro et al.
Keywords: Abundance estimates; D2 expansion segment; Metabarcoding; Mosquito monitoring; Non-target taxa
Year: 2020 PMID: 32607275 PMCID: PMC7315618 DOI: 10.7717/peerj.9057
Source DB: PubMed Journal: PeerJ ISSN: 2167-8359 Impact factor: 2.984
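The detection figure quoted in the abstract (3,050 reads, a taxon at 0.2% of the pool) can be put in perspective with a rough sampling calculation. The sketch below is not the authors' analysis: it assumes that each read is an independent draw and that a taxon's share of reads equals its share of the pooled DNA, which ignores amplification bias and copy-number variation. The function names are hypothetical.

```python
def expected_reads(total_reads: int, fraction: float) -> float:
    """Expected read count for a taxon that makes up `fraction` of the pool.

    Assumption: read share equals DNA share (no primer or copy-number bias).
    """
    return total_reads * fraction


def detection_probability(total_reads: int, fraction: float) -> float:
    """Probability of sampling at least one read from the taxon.

    Assumption: reads are independent draws (simple binomial sampling),
    so P(at least one) = 1 - (1 - fraction) ** total_reads.
    """
    return 1.0 - (1.0 - fraction) ** total_reads


if __name__ == "__main__":
    n_reads, share = 3050, 0.002  # values taken from the abstract
    print(f"expected reads: {expected_reads(n_reads, share):.1f}")
    print(f"P(at least one read): {detection_probability(n_reads, share):.4f}")
```

Under these simplifying assumptions a 0.2% taxon is expected to contribute only about six reads at this depth, which illustrates why the abstract conditions its conclusion on having enough sequencing depth.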
Figure 1. Sequence logos for the D2 primers tested herein.
The comparisons are divided among Culicidae, all Nematocera apart from Culicidae, orders that are often collected as bycatch in sampling traps (Hymenoptera, Lepidoptera, and Coleoptera), and the remainder of all arthropods. Primer sequences are defined above the logos in black type (forward primer is listed first).
Summary of datasets used to evaluate amplicon length and taxonomic resolution for each primer set discussed in the text.
The sequence counts in the first two columns are the GenBank sequences that fell within the ecopcr maximum of five nucleotide mismatches and that also contained both priming sites.
| Marker (reference) | Number of sequences used for all arthropod taxa (mean amplicon length including primers) | Number of sequences used for Culicidae only (mean amplicon length including primers) | Species resolution, all arthropod taxa | Species resolution, Culicidae only |
|---|---|---|---|---|
| D2 (herein) | 15,356 (465 bp) | 227 (400 bp) | 0.910 | 0.969 |
| 16S ( | 79,039 (205 bp) | 83 (217 bp) | 0.886 | 0.838 |
| 16S ( | 77,160 (140 bp) | 96 (145 bp) | 0.867 | 0.816 |
| CO1 ( | 793,484 (154 bp) | 13,782 (154 bp) | 0.866 | 0.765 |
| CO1 ( | 3,232 (220 bp) | 20 (220 bp) | 0.926 | 0.929 |
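The table does not state how the species resolution values were computed. As an illustration only, the sketch below uses one common definition, assumed here rather than taken from the source: the fraction of species whose amplicon sequences are not shared with any other species. The function name and the toy records are hypothetical.

```python
from collections import defaultdict


def species_resolution(records):
    """Fraction of species that a marker identifies unambiguously.

    `records` is an iterable of (species_name, amplicon_sequence) pairs.
    Assumed definition: a species counts as resolved only if none of its
    amplicon sequences also occurs in a different species.
    """
    species_by_seq = defaultdict(set)
    seqs_by_species = defaultdict(set)
    for species, seq in records:
        species_by_seq[seq].add(species)
        seqs_by_species[species].add(seq)

    if not seqs_by_species:
        return 0.0

    resolved = sum(
        1
        for seqs in seqs_by_species.values()
        if all(len(species_by_seq[s]) == 1 for s in seqs)
    )
    return resolved / len(seqs_by_species)


if __name__ == "__main__":
    # Toy data: sp_b and sp_c share an amplicon, so neither is resolved.
    toy = [
        ("sp_a", "ACGT"),
        ("sp_b", "ACGA"),
        ("sp_c", "ACGA"),
        ("sp_d", "TTTT"),
    ]
    print(species_resolution(toy))
```

With a definition of this kind, a value near 1 means that almost every species in the dataset carries a unique amplicon for that marker.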
Figure 2. Representation of the normalized read counts extrapolated from the average of the four Fresh Pools (A–D) for each species (x-axis) and the values from the four individual pools (y-axis).
Figure 3. The Best Estimate values (x-axis) against the estimates from standardized values in the Bycatch Pools (y-axis).
(A) The 1:1 and 1:10 A Bycatch Pools and their replicates. (B) The two dilution types for Bycatch Pools B–D. Empty icons were from “contaminant” specimen tissue within the CDC DNA, as discussed in the text. Regression lines were drawn based only on the non-contaminant specimens.
Regression line parameters for the pools created from the bycatch mixed DNA (created from either 1:1 or 1:10 mixtures of fresh pools A–D with CDC bycatch) and degraded DNA (I–IV).
The results for the bycatch mixtures and the first four columns of pools I–IV are regressed against the standardized averages of the four pure-culicid fresh pools. The second I–IV regression parameters are based on a regression against the average of the standardized output for each species from those same pools.
| | A | A-REP | B | C | D | A1:1 | A1:1-REP | A1:10 | A1:10-REP | B1:1 | B1:10 | C1:1 | C1:10 | D1:1 | D1:10 | I | II | III | IV |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| R² | 0.981 | 0.980 | 0.993 | 0.985 | 0.974 | 0.888 | 0.807 | 0.237 | 0.002 | 0.914 | 0.004 | 0.911 | 0.004 | 0.947 | 0.141 | 0.942 | 0.856 | 0.961 | 0.898 |
| slope | 0.937 | 0.973 | 0.934 | 1.012 | 1.063 | 1.413 | 1.004 | 0.916 | −0.061 | 3.548 | 0.668 | 1.962 | 0.177 | 6.441 | 10.545 | 1.057 | 0.964 | 1.047 | 1.029 |
| y-intercept | 0.004 | 0.002 | 0.004 | −0.001 | −0.004 | −0.003 | 0.005 | 0.013 | 0.031 | −0.004 | 0.029 | −0.007 | 0.029 | 0.000 | 0.019 | −0.004 | 0.001 | −0.003 | −0.002 |
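The regression parameters in the table above (coefficient of determination, slope, and y-intercept) describe how closely each pool's per-species read proportions track a reference estimate. The following is a minimal sketch of that kind of comparison using ordinary least squares. The function names and input values are hypothetical, and the authors' exact standardization procedure is not reproduced here.

```python
def normalize(counts):
    """Convert raw per-species read counts to proportions of the pool total."""
    total = sum(counts.values())
    if total == 0:
        raise ValueError("pool contains no reads")
    return {species: n / total for species, n in counts.items()}


def regress(x, y):
    """Ordinary least-squares fit of y on x.

    Returns (slope, intercept, r_squared). A slope near 1, an intercept
    near 0, and an r_squared near 1 indicate that y reproduces x faithfully.
    """
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have equal length of at least 2")
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    if sxx == 0 or syy == 0:
        raise ValueError("x and y must each vary")
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r_squared = (sxy * sxy) / (sxx * syy)
    return slope, intercept, r_squared


if __name__ == "__main__":
    # Hypothetical read counts for three species in a reference and a test pool.
    reference = normalize({"sp_a": 500, "sp_b": 300, "sp_c": 200})
    test_pool = normalize({"sp_a": 450, "sp_b": 350, "sp_c": 200})
    species = sorted(reference)
    slope, intercept, r2 = regress(
        [reference[s] for s in species],
        [test_pool[s] for s in species],
    )
    print(f"slope={slope:.3f} intercept={intercept:.3f} R2={r2:.3f}")
```

Read this way, the low R² values reported for several 1:10 bycatch pools indicate that the recovered proportions no longer track the fresh-pool estimates at that dilution.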