Literature DB >> 28542313

Environmental DNA (eDNA) metabarcoding assays to detect invasive invertebrate species in the Great Lakes.

Katy E Klymus1, Nathaniel T Marshall1, Carol A Stepien1.   

Abstract

Describing and monitoring biodiversity comprise integral parts of ecosystem management. Recent research coupling metabarcoding and environmental DNA (eDNA) demonstrate that these methods can serve as important tools for surveying biodiversity, while significantly decreasing the time, expense and resources spent on traditional survey methods. The literature emphasizes the importance of genetic marker development, as the markers dictate the applicability, sensitivity and resolution ability of an eDNA assay. The present study developed two metabarcoding eDNA assays using the mtDNA 16S RNA gene with Illumina MiSeq platform to detect invertebrate fauna in the Laurentian Great Lakes and surrounding waterways, with a focus for use on invasive bivalve and gastropod species monitoring. We employed careful primer design and in vitro testing with mock communities to assess ability of the markers to amplify and sequence targeted species DNA, while retaining rank abundance information. In our mock communities, read abundances reflected the initial input abundance, with regressions having significant slopes (p<0.05) and high coefficients of determination (R2) for all comparisons. Tests on field environmental samples revealed similar ability of our markers to measure relative abundance. Due to the limited reference sequence data available for these invertebrate species, care must be taken when analyzing results and identifying sequence reads to species level. These markers extend eDNA metabarcoding research for molluscs and appear relevant to other invertebrate taxa, such as rotifers and bryozoans. Furthermore, the sphaeriid mussel assay is group-specific, exclusively amplifying bivalves in the Sphaeridae family and providing species-level identification. Our assays provide useful tools for managers and conservation scientists, facilitating early detection of invasive species as well as improving resolution of mollusc diversity.

Entities:  

Mesh:

Substances:

Year:  2017        PMID: 28542313      PMCID: PMC5436814          DOI: 10.1371/journal.pone.0177643

Source DB:  PubMed          Journal:  PLoS One        ISSN: 1932-6203            Impact factor:   3.240


Introduction

Detecting and monitoring species presence using environmental DNA (eDNA) is a rapidly growing research area. Species identification from recovered DNA that was shed into the environment by organisms constitutes a powerful tool for conservation biologists, allowing them to monitor taxa with greater sensitivity, less effort, and fewer negative affects relative to traditional survey methods [1-2]. Since only trace DNA amounts are required, it is particularly attractive to detect low abundance taxa. Consequently, there is great interest in using eDNA tools for early detection of invasive species when populations are small and confined, which can facilitate successful eradication outcomes [3]. Most initial eDNA studies have taken a targeted single species-specific approach, using conventional PCR, quantitative PCR (qPCR) or digital droplet PCR [4-6]. Others have used this single-species approach to investigate eDNA production and degradation dynamics [7-10]. Alternative to the single species-specific approach, analyzing eDNA with high-throughput (also termed next-generation) sequencing allows for multiple taxa identification simultaneously from a single sample via the general concepts of DNA barcoding [11] and metabarcoding [12]. Metabarcoding is especially useful for discerning unanticipated taxa. Beyond species detection, metabarcoding further enables the analysis of community composition from eDNA samples [13-17]. Although promising, difficulties in applying metabarcoding techniques to eDNA samples exist, such as obtaining species level discriminatory markers, as well as possible PCR and sequencing biases. The latter may influence the presumed relationship between sequence reads and eDNA abundance. Metabarcoding studies rely on universal primers that amplify DNA of multiple target species at a specific locus. For samples that contain the entire organism (e.g., microorganisms, meiofauna, bulk samples, diet analyses, eggs, gametes, spores, or larvae), universal primers that target a large barcoding region (i.e., COI in animals [11], rbcL and matK in plants [18], and ITS in fungi [19]) have been used [20-22]. However, macro-organismal DNA that has been cast into the environment (e.g., eDNA, forensic sampling) often is degraded, which results in poor amplification of larger markers [23]. Thus a difficulty in eDNA metabarcoding is finding short amplifiable markers (~100–200 base pairs) that can discriminate among species. Such mini-barcodes have been developed for some taxa; for instance, Hajibabei et al. [24] successfully amplified and identified museum arthropod specimens using markers 100–400 bp long. For primer development, researchers either have used programs such as EcoPrimer [25] and Primer Tree [16], or visually discerned prospective markers. Despite these efforts, it remains difficult to target species-discrimination regions given the short length eDNA marker requirements. A second concern related to metabarcoding of eDNA, is whether read abundance can be used to assess a species’ abundance or biomass [26]. Targeted species specific studies using qPCR have found positive relationships between copy number and species biomass [7,27]; however, few eDNA metabarcoding studies have reported a similar relationship between read abundance and biomass, and those that have, discerned moderate rather than strong relationships [14,17]. Furthermore many studies assume that high-throughput sequencing produces quantitative output, but do not explicitly test whether read abundance accurately reflects the starting amount of DNA. Notably, metabarcoding of bulk samples [28-29] and artificial communities of fungal PCR products [30] indicate that biases in primer binding, sequencing, and bioinformatic pipelines can limit quantitative metabarcoding abilities. To address these two issues, we designed high-throughput sequencing assays to discern invasive mollusc species in the Laurentian Great Lakes by developing group-specific (rather than broadly universal) primers and tested them using mock communities of mixed template samples, at various proportions of concentrations. Focusing on potential primer bias, the group-specific primers were formulated to detect 19 invasive or potentially invasive bivalve and snail species listed on government databases and watch lists [31-33]. These included NOAA’s Great Lakes Aquatic Nonindigenous Species Information System (GLANSIS) database [31], the U.S. Army Corps of Engineers Great Lakes and Mississippi River Interbasin Study (GLMRIS) report “Non-native Species of Concern and Dispersal Risk for the Great Lakes and Mississippi River Interbasin Study” [32], and the USEPA report “Predicting Future Introductions of Nonindigenous Species to the Great Lakes” [33]. We also developed a primer set specific to the Sphaeriidae family of clams, a cosmopolitan group having several native and invasive species in the Great Lakes region. Like other understudied invertebrates, their identification requires extensive taxonomic expertise due to morphological similarity and phenotypic plasticity. Our goal was to develop primers that amplify short (100–300 bp) fragments to discriminate among species, aiding taxonomic studies and surveys. To address PCR amplification bias, we used mock communities containing known amounts of DNA (copy number) per species to validate our assays. Finally, the assays were used to test eDNA samples from known lab aquarium assemblages and the field.

Materials and methods

Primer design and reference database development

We evaluated three DNA regions that have been used in phylogenetic studies of the targeted invertebrate taxa. Sequences from 16S mitochondrial (mt) ribosomal (r)DNA, cytochrome oxidase I (COI) mtDNA, and 28S nuclear rDNA genes were downloaded from NIH GenBank (https://www.ncbi.nlm.nih.gov/genbank/) for all targeted species. Relative to the COI and 28S data, we found that the 16S sequence data were most abundant and well-resolved for the targeted taxa. Furthermore, 16S was found to possess conserved regions in which primers could readily be designed. Those regions were interspersed with interspecific variable regions, allowing for targeted species identification. COI appeared too variable, leading to difficulty in primer design and 28S was too conserved, circumventing discrimination among species. Focusing on 16S, we used the downloaded GenBank 16S sequences for the targeted taxa to design out reference database (Table 1). Specimens of these taxa and close relatives had their DNA extracted, amplified, and sequenced at the 16S locus using universal 16S primers, 16Sar-L: 5’-CGCCTGTTTATCAAAAACAT-3’ and 16Sbr-H: 5’-CCGGTCTGAACTCAGATCACGT-3’ [34]. Polymerase chain reactions (PCR) of 25 μl, including 16.7 μl ddH2O, 1X AmpliTaq® PCR buffer, 0.8 mM dNTPs, 0.5 μM of each primer, 1.5 U AmpliTaq®, and 25–50 ng template DNA. Cycling conditions consisted of initial denaturation at 94° C for 5 min, followed by 40 cycles of 94° C for 1 min, 48.5°C for 1 min, and 72°C for 1.5 min, with a final 5 min extension at 72°C. PCR products were purified using QIAquick® PCR Purification kits (Qiagen). Sequencing was performed at Cornell University Life Sciences Core Laboratories Center using an Applied Biosystems Automated 3730 DNA Analyzer (Fullerton, CA, USA). These sequences then were added to the reference database of downloaded GenBank (https://www.ncbi.nlm.nih.gov/genbank/) sequences in our laboratory (Table 1) as well as to the GenBank nucleotide database (KY426891-KY426915). Subsequent primer design was based on this reference database.
Table 1

List of targeted species for the MOL16S assay.

SpeciesScientific nameGenBank accession numbers
Bivalvesgolden musselLimnoperna fortune (Dunker, 1857)(1) JQ267790
*dark false musselMytilopsis leucophaeata (Conrad, 1831)(3) AF507051–52, EF414448 (1) AF038998 (1) KY426891
*quagga musselDreissena rostriformis (Deshayes, 1838)(2) AF038996, KY426895 (9) DQ333745–46, AF507047–48, AY302247, JX099457, KY426892-94 (1) JQ348913 (1) KY426896
*zebra musselD. polymorpha (Pallas, 1771)(14) GBXKY426897-902, DQ333747–48, EF414464–66, AF038997, JX099458, AF507049
*Asian clamCorbicula fluminea (O. F. Müller, 1774)(7) AB522656, AF038999, AF152024, DQ280039, KC429294, KY426903-04
*European fingernail clamSphaerium corneum (Linnaeus, 1758)(1) GU128616 (18) GU128617–35 (2) KY426905-06
Ŧgrooved fingernail clamS. similie (Say, 1817)(1) KY426907
greater European peaclamPisidium amnicum (O. F. Müller, 1774)(7) EU559086–89, AY093572, DQ062609–10
henslow peaclamP. henslowanum (Sheppard, 1825)(1) DQ062644 (8) EU559115–22, DQ062620–22 (1) KF483297
pygmy peaclamP. moitessierianum Paladihe, 1866(3) DQ062626, DQ06262628–29 (1) DQ06262627
humpbacked peaclamP. supinum Schmidt, 1850(7) DQ062646–50, EU559148, AY093569
Ŧridged-beak peaclamP. compressum Prime, 1852(1) AY093560 (2) AF152029, KY426908 (1) AY957810 (1) AY957812
Gastropods*New Zealand mud snailPotamopyrgus antipodarum (Gray, 1843)(1) AY634079 (1) AY955392 (2) AY9553886–87 (3) AY955388–89, AY634106 (1) AY955377 (1) AY955378 (1) AY955381 (1) AY955391 (8) AY955379–80, AY955382–85, AY634104, AY634107 (5) JN639013–14, KY426909, JQ346706, AY955393 (1) AY634109 (10) JQ346702–05, JQ346708–09, AY634080, AY955376, AY314009, EU573989
*faucet snailBithynia tentaculata (Linnaeus, 1758)(2) AF445344, JX970531 (1) FJ160288
*Oriental mystery snailCipangopaludina chinensis malleata/ japonica (Gray, 1834)C. chinensis: (2) FJ710213–214 (8) LC028474–76, LC028479–81, KY42610-11, (1) LC028482 (4) LC0284732, LC028483–84, LC028477 (1) LC028473 (1) LC028478; C. japonica: (1) LC028460 (1) LC028461 (1) LC028463 (5) LC028462, LC028464–66, LC028468 (1) LC028467 (1) LC028469 (2) LC028470–71 (1) FJ405736; C. diachiensis: (1) GU1988839 (3) GU198835, GU198871–72 (1) GU198836; C. longispira: (2) GU198863–64 (1) KJ867106; C. ventricosa: (1) KJ867107
*buffalo pebble snailGillia altilis (Lea, 1841)(1) KY426912
*red-rim melania (Malaysian trumpet snail)Melanoides tuberculata (O. F. Müller, 1774)(1) AF101006 (1) KP284119 (1) KP284124 (5) KP284118, KP284120, KP284122, KP2841227, AY010517 (9) KP774656–58, KP774669–72, AY791930–31 (23) AY791911, AY791915, KP774640, KP774644–45, KP774647–55, KP774659, KP774661–62, KP774664–68, KY426914 (1) KP284121 (1) AY456618 (5) AY791910, KP774660, KP774663, AY456616–17 (1) KP284126 (1) AY791914 (7) KP774633–39 (3) KP284125, KP284128–29
island apple snailPomacea maculata Syn. P. insularum Perry, 1810P. insularum (1) EF519108 (1) FJ71028 (1) FJ71029; P. bridgesi: (2) EU274500, KC109970; P. papyracea: (1) FJ710248; P. guyanensis: (1) FJ710243; P. flagellata: (1) FJ710247; P. patula: (1) FJ710246; P. sordida: (1) FJ710244; P. camena: (1) FJ710245; P. haustrum: (1) FJ710239; P. paludosa: (1) FJ710237 (1) FJ710238; P. doloides: (1) FJ710232 (1) FJ710233; P. lineata: (1) FJ710230 (1) FJ710231; P. diffusa: (1) FJ710242 (1) KM389472; P. scalaris: (1) FJ710240 (1) FJ710241; P. canaliculata: (1) KF032562 (4) KF032563–65, KF032578 (1) KF032570 (1) KF032572 (1) KF032574 (1) KF032577 (1) FJ710236 (1) FJ710234 (9) KF032500-501, KF032503-504, KF032507, KF032566–67, KF032569, KF032579 (1) KF032568 (1) KF032571 (1) KF032573 (1) KF032575 (1) KF032576 (7) EU27450, FJ710235, KF002499, KF002502, KF002505–06, KJ766112
European ear snailRadix auricularia (Linnaeus, 1758)R. auricularia: (1) AF485646 (2) KP098540, NC_026538; R. peregra: (2) HQ283242, U82074; R. rubiginosa: (2) GU 451749, GU167907 (1) U82076; R. natalenisis: (1) HQ28342; *Radix balthica: (1) KY426913
*European stream valvataValvata piscinalis (O. F. Müller, 1774)(1) FJ917248 (1) KY426915
banded mystery snailViviparus georgianus (Lea, 1834)(1) AY377626

Parentheses denote unique OTUs for the MOL16S amplicon, with the number inside representing the number of sequences belonging to that OTU. Accession numbers are for 16S sequence data for each species.

* indicates invasive species whose extractions were used to test primers in vitro.

Ŧ indicates native or non-invasive species also used to test primers. All other species are invasive in the Great Lakes.

Parentheses denote unique OTUs for the MOL16S amplicon, with the number inside representing the number of sequences belonging to that OTU. Accession numbers are for 16S sequence data for each species. * indicates invasive species whose extractions were used to test primers in vitro. Ŧ indicates native or non-invasive species also used to test primers. All other species are invasive in the Great Lakes. Sequences first were aligned using Geneious Software (7.1.9) with MUSCLE [35]. Alignments were verified by visual inspection and corrections were made. Non-target taxa (human Homo sapiens, walleye Sander vitreus, and spiny water flea Bythotrephes longimanus) also were included in alignments to assess the relative target specificity of the primers. Primer pairs were designed by visually inspecting alignments and searching for regions of diagnostic interspecific variability in the target taxa that were flanked by conserved regions. Primers then were evaluated using IDT’s OligoAnalyzer ® software for GC content, annealing temperature, and possible dimer products. Finally, primer pairs were tested in vitro with DNA extractions from targeted species for which we had tissue samples from (Table 1), including a non-target fish species (S. vitreus). We developed primer sets for two assays. First, degenerate primers were designed to differentiate invasive bivalve and snail species using a short fragment of the mtDNA 16S RNA gene (herein referred to as MOL16S assay): MOL16S_F: 5’–RRWRGACRAGAAGACCCT– 3’, MOL16S_R: 5’-ARTCCAACATCGAGGT-3’. The primer set amplified some non-target species as well, since it is partially conserved across other invertebrate taxa and some fishes. Due to concern that fish DNA in an environmental water sample could potentially swamp out invertebrate DNA signal, a blocking primer was designed to reduce the former. The 5’ end of the blocking primer overlaps the 3’ MOL16S reverse primer, but extends further to capture fish specific nucleotide variation to increase specific binding to fish DNA. The 3’ end was modified with a C3 spacer to prevent product elongation [36], MOL16S_FISBLOCK: 5’–AGGTCGTAACCCCCTRG/3SpC3/–3’. For our second assay, non-degenerate primers amplified all 58 identified sphaeriid mussel species obtained from GenBank sequences for 16S (herein referred to as SPH16S assay) (S1 Table), SPH16S_F:5’–TAGGGGAAGGTATGAATGGTTTG–3’, SPH16S_R: 5’–ACATCGAGGTCGCAACC–3’. Sphaeriid mussels are defined here as the three genera in the Sphaeriidae family (Sphaerium, Musculium, Pisidium,and Euglesa). Targeted amplicon length for SPH16S was 299 bp, whereas amplicon length varied among species for the MOL16S assay from 183–310 bp.

Mock community design

Five mock communities were designed to encompass varying copy numbers of the targeted amplicon from 11 species (Table 2). We included two native sphaeriid species, as well as walleye, to increase diversity and to test the fish blocking primer. To quantify the target copy number per extraction, a series of competitive PCRs were run, in which an extraction or native template (NT) and an internal standard (IS) were co-amplified with the primer set. The IS is a synthetic double stranded homolog of the target marker that has the same primer annealing sites as the NT, enabling it to compete equally for the same primers, as well as a small deletion that enables products of the IS and NT to be differentiated by size. Given the similar amplification efficiencies between IS and NT, the initial IS and NT proportions should correspond to their ratio in the amplified product [37]. Using known concentrations (copy number/μl) of IS, we determined when this ratio was 1:1 and calculated the NT copy number. To determine this 1:1 ratio, a set of PCRs with a constant amount of target DNA extraction (25 ng) and a known IS amount were amplified, with the latter serially diluted among reactions. Molarity of each product was measured on a 2100 Bioanalyzer (Agilent Technologies). We then calculated the ratios of the log10 transformed NT:IS template molarities, which were regressed against the log10 transformed copy number of known IS concentrations used in each competitive PCR. The y-intercept of the regression line indicated the 1:1 ratio of NT:IS, which was the copy number concentration of the NT. Once copy number concentrations of the original DNA extractions were determined, we created our mock community samples using calculated volumes of the extraction required to provide the desired copy number for each target species.
Table 2

Composition of the five mock communities showing the target amplicon copy number per extraction.

SpeciesMock Communities
12345
Sphaerium similie909018183637214
Dreissena rostriformis181836372149090
Sander vitreus (walleye)363721490901818
Sphaerium corneum721490901818363
Pisidium compressum149090181836372
Mytilopsis leucophaeata22721136568284142
Dreissena polymorpha11365682841422272
Potamopyrgus antipodarum56828414222721136
Gillia altilis28414222721136568
Cipangopaludina chinensis14222721136568284
Melanoides tuberculata1818181818
To distinguish IS and NT amplification products on the 2100 Bioanalyzer (Agilent Technologies), IS was ordered as a gBlocks ® gene fragment (IDT) of target amplicons with a deletion interior to the priming sites, making the IS 10% shorter than the NT amplicon. A 10% difference in size has been shown to not cause differences in PCR efficiency between NT and IS templates [36].

eDNA samples from aquaria and the Maumee River

We collected Corbicula fluminea, Dreissena polymorpha, D. rostriformis, sphaeriid clams (Sphaerium sp. and Pisidium sp.) and pleurocerid snails from streams in western Lake Erie during summer 2016. The sphaeriid clams were morphologically identified to genus level and the snails were identified to family level. These were placed in two 38-liter aquaria, whose assemblage compositions are described in Table 3. After two weeks, one liter of water was collected from each aquarium (the water was first mixed) for eDNA extraction and analysis.
Table 3

Number of individuals from each species placed into two aquaria for later eDNA sampling.

SpeciesApproximate N individuals Tank AApproximate N individuals Tank B
Sphaerium spp.4040
Pisidium spp.100
Dreissena spp.1010
Corbicula fluminea41
Pleurocerid snails33
The Ohio EPA gathered data for identifying fish and macroinvertebrate assemblages using traditional sampling methods in the Maumee River in 2012. At the same time, they took one liter water samples from each site for later eDNA analysis by the Stepien laboratory, which were labeled and stored at -80°C, and used here.

Samples and DNA extractions

Extractions for the mock community samples and primer testing were prepared using tissue from taxonomically identified specimens stored in 95% EtOH (S2 & S3 Tables). Specimens were identified by the original collector (S3 Table). The Qiagen DNeasy® Blood and Tissue kit was used to extract bivalve DNA, whereas the Omega Bio-tek E.Z.N.A® Mollusc DNA Kit was used for snail DNA. Water samples were filtered through a polyethersulfone (PES) membrane using a vacuum pump. For the aquarium samples, one liter of water was filtered through a 0.45-micron PES filter, and 400 ml of the river water samples were filtered with a 0.2-micron PES filter. We also filtered and extracted DNA from 400 ml of molecular grade purified ddH2O as a negative control or blank sample. DNA from water samples was extracted following the Turner et al. [38] cetyl trimethyl ammonium bromide (CTAB) protocol. Three river water samples and the blank were tested with our MOL16S assay (without the fish blocking primer). Sample selection was based on presence of invasive species from our list, according to Ohio EPA’s visual identification data (http://www.epa.ohio.gov/Portals/35/documents/MaumeeTSD_2014.pdf) [39]. Aquaria samples were tested with both the MOL16S (without the fish blocking primer) and the SPH16S assays.

Library preparation

To prepare samples for paired-end, high-throughput sequencing on the MiSeq platform (Illumina, San Diego, CA, USA), libraries were prepared in a two step PCR process. In the first step, the target was amplified with assay specific primers that were appended with 7–17 additional nucleotides (spacers) and a 33–34 bp sequencing primer region on the 5’ end. Four primer sets were created, which differed only in their spacer combinations (Table 4). The spacer region increased nucleotide diversity of the sequence reads and enhanced cluster formation for improved sequencing on the MiSeq platform [40-41] and was not analogous to the C3 spacer modification used in the blocking primer. Each sample was prepared with one of these four primer pairs. PCRs contained 25 μl reaction volume, including 14.69 μl of the 10 μM blocking primer (5.88 μM final concentration), 1X NEB PCR buffer, 0.2 mM dNTPs, 0.5 μM of each primer, 1.57 U of AmpliTaq ®, and 2.5 μl of template DNA. In samples lacking the fish blocking primer, the volume of blocking primer was replaced with an equal volume of ddH20. Conditions were: 30 sec initial denaturation at 95°C, followed by 25–35 cycles of 95°C for 30 sec, 58°C for 30 sec, and 68°C for 1 min, with a 2 min final extension at 68°C. Products were cleaned using QIAquick® PCR purification kits (Qiagen).
Table 4

Primers used for library preparation.

AssayPrimersSequence 5’– 3’
MOL16SMOL16S_E_FTCGTCGGCAGCGTCAGATGTGTATAAGAGACAGTCCTATGRRWRGACRAGAAGACCCT
MOL16S_F_FTCGTCGGCAGCGTCAGATGTGTATAAGAGACAGATGCTACAGTRRWRGACRAGAAGACCCT
MOL16S_G_FTCGTCGGCAGCGTCAGATGTGTATAAGAGACAGCGAGGCTACAACTCRRWRGACRAGAAGACCCT
MOL16S_H_FTCGTCGGCAGCGTCAGATGTGTATAAGAGACAGGATACGATCTCGCACTCRRWRGACRAGAAGACCCT
MOL16S_E_RGTCTCGTGGGCTCGGAGATGTGTATAAGAGACAGCGTACTAGATGTACGAARTCCAACATCGAGGT
MOL16S_F_RGTCTCGTGGGCTCGGAGATGTGTATAAGAGACAGTCACTAGCTGACGCARTCCAACATCGAGGT
MOL16S_G_RGTCTCGTGGGCTCGGAGATGTGTATAAGAGACAGGAGTAGCTGAARTCCAACATCGAGGT
MOL16S_H_RGTCTCGTGGGCTCGGAGATGTGTATAAGAGACAGATCGGCTARTCCAACATCGAGGT
SPH16SSPH16S_E_FTCGTCGGCAGCGTCAGATGTGTATAAGAGACAGTCCTATGTAGGGGAAGGTATGAATGGTTTG
SPH16S_F_FTCGTCGGCAGCGTCAGATGTGTATAAGAGACAGATGCTACAGTTAGGGGAAGGTATGAATGGTTTG
SPH16S_G_FTCGTCGGCAGCGTCAGATGTGTATAAGAGACAGCGAGGCTACAACTCTAGGGGAAGGTATGAATGGTTTG
SPH16S_H_FTCGTCGGCAGCGTCAGATGTGTATAAGAGACAGGATACGATCTCGCACTCTAGGGGAAGGTATGAATGGTTTG
SPH16S_E_RGTCTCGTGGGCTCGGAGATGTGTATAAGAGACAGCGTACTAGATGTACGAACATCGAGGTCGCAACC
SPH16S_F_RGTCTCGTGGGCTCGGAGATGTGTATAAGAGACAGTCACTAGCTGACGCACATCGAGGTCGCAACC
SPH16S_G_RGTCTCGTGGGCTCGGAGATGTGTATAAGAGACAGGAGTAGCTGAACATCGAGGTCGCAACC
SPH16S_H_RGTCTCGTGGGCTCGGAGATGTGTATAAGAGACAGATCGGCTACATCGAGGTCGCAACC

Assay specific primers (in bold) with a 7–17 nucleotide spacer (underlined) and the sequencing primer region (italicized). 5’ to 3’ direction for all primers, the last character in the primer name indicates direction (F = forward, R = reverse).

Assay specific primers (in bold) with a 7–17 nucleotide spacer (underlined) and the sequencing primer region (italicized). 5’ to 3’ direction for all primers, the last character in the primer name indicates direction (F = forward, R = reverse). The second PCR step used the prior step’s column-cleaned product as the template and incorporated Nextera paired end indices (Illumina ®, kit FC-121-1011) as well as the P5 and P7 adaptor sequences, which enabled the prepared product to bind onto the surface of the Illumina® MiSeq flowcell. The added indices allowed for multiple samples to be pooled together in a single MiSeq run. PCR was carried out in a 25 μl reaction including a variable volume of H20 (depending on volume of template added), 1X NEB (New England Biolabs) PCR buffer, 0.2 mM dNTPs, 2.5 μl of each indexed primer, 1.57 U of NEB Hotstart Taq polymerase, and up to 24 ng of purified product from the first reaction. Conditions consisted of a 30 sec initial denaturation at 95°C, followed by 8 cycles of 95°C for 30 sec, 55°C for 30 sec, and 68° C for 1 min, with a final 2 min extension at 68°C. After column clean up, product was sized and quantified using a 2100 Bioanalyzer (Agilent Technologies) before sending to Ohio State University’s Molecular and Cellular Imaging Center in Wooster, OH for MiSeq analysis (http://mcic.osu.edu/). To avoid sequencing dimer product observed around 200–250 bp, the targeted fragments (320–550 bp) were size-selected with a 1.5% agarose gel cassette on Pippin Prep (Sage Science). Concentration of pooled product was measured with a Qubit fluorometer (Invitrogen). Pooled samples were run on an Illumina® MiSeq with 2 X 300 bp V3 chemistry. An additional 40–50% PhiX DNA spike-in control was added to improve data quality of low nucleotide diversity samples. The first MiSeq run included the following 10 pooled samples: (1–5) one sample from each mock community with the MOL16S primers and fish blocking primer, (6–7) samples from mock communities 3 and 4 using MOL16S primers but without the fish blocking primer, and (8–10) samples from mock communities 1, 2, and 4 with the SPH16S primers. Two additional MiSeq runs were conducted to sequence the water samples.

Bioinformatics analyses

Reads returned from sequencing were trimmed of adaptors and sequencing primers. We used custom PERL scripts to merge paired reads and trim assay-specific primers [42]. For the mock community samples, an exhaustive search was implemented via custom PERL script to assign reads to taxa using the reference sequence database. For the mock community sample, only reads that were an exact match were retained, thus eliminating chimeric sequences and sequencing errors. For the water samples, our custom reference sequence datasets were not used to identify reads. Instead, after primer trimming, reads were clustered using QIIME’s pick_de_novo_otus python workflow script with the default 97% similarity threshold and UCLUST [43-44]. The subsequent output file, OTU Biom (Biological Observation Matrix; http://biom-format.org/) table, and representative sequence set then were employed to create a table listing the number of reads that matched each OTU (operational taxonomic unit) with the Biom package in R [45-46]. Finally, a BLAST [47] search was performed on the OTU frequency table using the NCBI nucleotide database to identify each OTU to species. The search retained each OTU’s first BLAST hit sequence, scientific name, accession number, identity percentage, coverage percent, and e- (expect) value. Only OTUs with ≥80% coverage and more than one sequence were retained. Reads that were assigned the same GenBank accession number were grouped together as a single OTU. OTUs assigned an identity (% similarity between query and subject sequence) ≥97% were considered identified to species level.

Statistical analyses of mock communities

MOL16S assay

Percentages of reads per species in each mock community were calculated, log10 transformed, and analyzed by regressing observed on to expected read percentages. We compared these within species (among mock communities), as well as within mock communities (among species). To compare the ability of the blocking primer to reduce amplification of walleye (and other fish) DNA, we calculated the reduction in walleye DNA amplification in the two mock community samples run with and without the blocking primer.

SPH16S assay

Since just three sphaeriid species were included in the mock communities, we did not run regression analyses on those results. Instead we show the untransformed expected and observed read percentages for comparison.

eDNA samples

To compare our DNA sequence results from the aquaria and river water samples with those from their morphological identifications, we included data from both the molecular and the morphological analyses that were defined at species level. For the molecular barcoding assays, these included those OTUs having 80% coverage and 97% similarity to query sequences from the BLAST search (as discussed above). For the morphological identifications, those taxa that were defined at the species level by morphology were included (S1 File).

Results

Primers and amplification results

According to our reference database, the MOL16S and SPH16S primer sets should amplify all targeted species (see Table 1). Primers were tested in vitro using tissue extractions (*labels, Table 1), which included three non-targeted species: two native sphaeriid mussels Sphaerium similie and Pisidium compressum, and the snail Radix balthica, which is a congeneric relative of the targeted European ear snail (R. auricularia) for which we did not have tissue. All extractions amplified with MOL16S. Just the Sphaeriid mussels amplified with the SPH16S primers, indicating their targeted specificity.

MiSeq read results

Sequencing of the pooled libraries from our first run resulted in a total of 15,184,208 reads that passed cluster quality filtering. Of these, 53% were indexed and corresponded to our samples. This is in concordance with the 40–50% PhiX spike in. Of the indexed reads, approximately 84% merged using the custom PERL script, leading to 6,790,019 merged reads. Primers then were trimmed and taxa assigned to reads using custom PERL scripts and an exhaustive search with our reference sequence database. Percentage of reads that were trimmed ranged from 83.44–99.89% among our 10 samples, and the trimmed reads that exactly matched the database sequences ranged from 25.83–50.08% (S4 Table). The eDNA samples were divided between two other MiSeq runs. These two runs included a number of samples that were part of another study. The first run, which included two Ohio EPA water samples and the water blank, had 15,209,380 reads that passed quality filtering. Of these, 58% (8,821,442) reads were indexed and the rest belonged to the ~40–50% PhiX spike in. Using our custom PERL script, 95–97% merged correctly (excluding the blank sample in which only 5% of reads merged). Finally, 53–95% of these reads were retained after processing, which included: trimming primers, removing singletons, and excluding reads <100 bp long. The second run included one additional Ohio EPA water sample and the four aquaria samples. In this run, 14,329,380 reads passed quality filtering and 41% (5,857,044) were indexed for our samples, with the rest attributed to the PhiX spike. Using our custom PERL script, 96–98% of the reads merged correctly. Finally, the percentage of reads that remained after primer trimming and length processing ranged from 50–96% (S5 Table).

MOL16S assay mock communities

For each sample, the total number of reads, percentage of expected reads, and percentage of observed reads per species are shown in Table 5. All regression comparisons revealed significant relationships (p<0.05) between observed and expected percentage of reads for the five mock community samples using the MOL16S and fish blocking primers (Figs 1 and 2), indicating that the observed read percentages closely reflected those expected. The two samples without fish blocking primer (replicate samples of mock communities 3 and 4) showed an 88–100% reduction in amplified walleye DNA (Table 6), while yielding similar proportions of reads for the other species. Since the proportion of walleye reads was smaller than expected for all mock communities even without the fish blocking primer, we believe that the extraction copy number was miscalculated leading to a lower than expected copy numbers of walleye in each mock community. Because of this, we also include results comparing the use of the fish blocking primer in a preliminary data set from a separate mock community composition. Those results show that even at relatively high concentrations of fish DNA (~41% sample 1), we obtained a 62% reduction in walleye DNA with the blocking primer (Table 6).
Table 5

Number of reads, percentage observed and percentage expected for each of the five mock community samples run with the MOL16S assay and fish blocking primer.

Mock Community 1Mock Community 2Mock Community 3
SpeciesRead Number% Observed% ExpectedRead Number% Observed% ExpectedRead Number% Observed% Expected
Sphaerium similie13807543.1157.6237452.1511.528850.532.30
Dreissena rostriformis8501026.5411.5221741.252.303410.200.46
Sander vitreus (walleye)130.002.3000.000.4600.000.09
Sphaerium corneum13850.430.46420.020.0911257367.2857.62
Pisidium compressum1950.060.0914935985.8957.62153599.1811.52
Mytilopsis leucophaeta230027.1814.4025361.467.208000.483.60
Dreissena polymorpha5997318.727.2051662.973.6026861.611.80
Potamopyrgus antipodarum55941.753.605650.321.802090.120.90
Gillia altilis62351.951.807830.450.903179519.0014.40
Cipangopaludina chinensis7110.220.9095135.4714.4026541.597.20
Melanoides tuberculata990.030.11120.010.11130.010.11
Mock Community 4Mock Community 5
SpeciesRead Number% Observed% ExpectedRead Number% Observed% Expected
Sphaerium similie7730.290.46300.010.09
Dreissena rostriformis2440.090.0916382578.4057.62
Sander vitreus (walleye)26791.0257.62630.0311.52
Sphaerium corneum11699444.6411.5221991.052.30
Pisidium compressum176206.722.303310.160.46
Mytilopsis leucophaeta17940.681.802460.120.90
Dreissena polymorpha48721.860.903345916.0114.40
Potamopyrgus antipodarum4380716.7114.4037521.807.20
Gillia altilis6730925.687.2044942.153.60
Cipangopaludina chinensis59212.263.605360.281.80
Melanoides tuberculata840.030.11260.010.11
Fig 1

Regressions of log 10 transformed observed read percentages versus log 10 transformed expected read percentages for each of the five mock communities run with the MOL16S assay.

Fig 2

Regressions of log 10 transformed observed read percentages versus log 10 transformed expected read percentages for each of the species in the mock communities run with the MOL16S assay.

Table 6

Number of reads, percentage observed and percentage expected for each of the mock community samples run with and without the fish blocking primer and the MOL16S assay.

Mock Community 3Mock Community 4
SpeciesRead Number with FB% Observed with FBRead Number without FB% Observed without FBRead Number with FB% Observed with FBRead Number without FB% Observed without FB
Sphaerium similie8850.5297180.4967730.2953140.229
Dreissena rostriformis3410.2042840.1962440.0932570.187
Sander vitreus (walleye)00.00020.00126791.022119568.717
Sphaerium corneum11257367.28210086269.73411699444.6385476639.931
Pisidium compressum153599.180128238.866176206.72374615.440
Mytilopsis leucophaeta8000.4787010.48517940.68410550.769
Dreissena polymorpha26861.60519241.33048721.85922761.659
Potamopyrgus antipodarum2090.1251440.1004380716.7142358317.195
Gillia altilis3179519.0032514717.3866730925.6813254923.732
Cipangopaludina chinensis26541.58620241.39959212.25928762.097
Melanoides tuberculata130.00890.006840.032570.042
Trial Run
SpeciesRead Numbers with FB% Observed with FBRead Number without FB% Observed without FB
Dreissena polymorpha4988364.3122948844.726
Dreissena rostriformis22532.90514492.198
Sphaerium similie1210.156310.047
Cipangopaludina chinensis00.00000.000
Sander vitreus1192915.3802715941.194
Corbicula fluminea1207815.572716410.867
Potamopyrgus antipodarum190.02450.008
Valvata piscinalis9921.2792050.311
Sphaerium corneum180.02360.009
Mytilopsis leucophaeta00.00000.000

SPH16S assay mock communities

Results for the three samples (mock communities 1, 2, and 4) using the SPH16S primers show similar trends as found with the MOL16S assay, in that among the three sphaeriid species, observed read percentages reflected expected rank (Table 7). Furthermore, no reads from non-sphaeriid taxa were observed, indicating that this primer set is highly specific to sphaeriid clams.
Table 7

Number of reads, percentage observed and percentage expected for each of the mock community samples run with the SPH16S assay.

Mock Community 1Mock Community 2Mock Community 4
SpeciesRead Numbers% Observed% ExpectedRead Numbers% Observed% ExpectedRead Numbers% Observed% Expected
Sphaerium similie10876099.48099.06314611.43716.6455000.0063.196
Dreissena rostriformis00.0000.00000.0000.00000.0000.000
Sander vitreus (walleye)00.0000.00000.0000.00000.0000.000
Sphaerium corneum5510.5040.785160.0160.1288299092.21980.692
Pisidium compressum170.0160.15310019098.54783.22765027.22516.112
Mytilopsis leucophaeta00.0000.00000.0000,00000.0000.000
Dreissena polymorpha00.0000.00000.0000.00000.0000.000
Potamopyrgus antipodarum00.0000.00000.0000.00000.0000.000
Gillia altilis00.0000.00000.0000.0000.0000.0000.000
Cipangopaludina chinensis00.0000.00000.0000.00000.0000.000
Melanoides tuberculata00.0000.00000.0000.00000.0000.000

eDNA aquaria samples

As expected, samples from both tanks run with the MOL16S primers yielded sequences comprising D. polymorpha, D. rostriformis and sphaeriid species. Based on known compositions within those tanks, sequence reads followed rank abundances, except for C. fluminea that did not amplify and the snails whose DNA did amplify but was not identified by our 97% threshold (Fig 3). The dominant sphaeriid species was S. striatinum, whereas P. casertanum and P. compressum also were discerned in tank A in which Pisidum clams had been placed. These three species were identified as common at the collection site (J. Boehler, pers. comm.). At the 97% similarity threshold, the MOL6S primers also identified a freshwater limpet (Ferrissia fragilis), oligochaete worms, bryozoans, and human DNA (Table 8). Although pleurocerid snails were placed in the tanks, no snail reads were identified at the 97% sequence similarity cut off. However, below this threshold, OTUs from three snail genera (Leptoxis, Lithasia, and Pleurocera) were detected (S6 Table). Given that pleurocerid species were identified as abundant at the collection site (J. Boehler, pers. comm.), it is likely that the actual species placed in the tanks belonged to these or other closely related snail genera, and that there is no sequence data for these specific species in GenBank. No C. fluminea were identified from the eDNA, although they had been placed in the tanks. Using the SPH16S assay, only the sphaeriid clams were identified at the 97% threshold (Fig 3) (Table 8).
Fig 3

Bar charts denoting the proportion of reads for each assay (MOL16S and SPH16S) that were recovered from sequencing of water samples from tanks A and B. The middle bar shows the original composition of the tanks.

Table 8

Number of reads per OTU identified at 97% similarity via BLAST search of aquaria samples.

Assay and SampleSequence ID Accession NumberSpeciesNumber of ReadsIdentity (%)Coverage (%)Percentage of total reads (%)Percentage of just the 97% ID reads (%)
MOL16S Tank A|gb|AF152045Sphaerium striatinum2570798–991008.98986.737
|gb|KP052744Dreissena polymorpha30371001001.06210.247
|gb|AF038996Dreissena rostriformis2351001000.0820.793
|gb|KF889403Ferrissia fragilis2181001000.0760.736
|gb|AY957830**Pisidium casertanum132991000.0460.445
|dbj|AB365626Plumatella emarginata1171001000.0410.395
|gb|DQ459934Pristina aequiseta7497–991000.0260.250
|gb|JN681057Plumatella emarginata561001000.0200.189
|gb|AY957811Pisidium compressum52991000.0180.175
|gb|AF152044Sphaerium striatinum10981000.0030.034
MOL16STank B|gb|AF152045Sphaerium striatinum21806991007.92293.128
|gb|KP052744Dreissena polymorpha12861001000.4675.492
|gb|JX099457Dreissena rostriformis2221001000.0810.948
|gb|KC429295*Sphaerium nucleus351001000.0130.149
|gb|JN681057Plumatella emarginata291001000.0110.124
|gb|GQ355403Chaetogaster diaphanus25991000.0090.107
|gb|KX594326Homo sapiens71001000.0030.030
|dbj|AB365626Plumatella emarginata51001000.0020.021
SPH16STank A|gb|AF152045Sphaerium striatinum2401659910099.52099.959
|gb|AF152044Sphaerium striatinum41698980.1720.173
|gb|AY957830**Pisidium casertanum215991000.0890.089
|gb|KC429295*Sphaerium nucleus68981000.0280.028
SPH16STank B|gb|AF152045Sphaerium striatinum2177259910099.78399.978
]gb|KC429295*Sphaerium nucleus481001000.0220.022

*Denotes species from NCBI GenBank (https://www.ncbi.nlm.nih.gov/genbank/) not recognized on the World Register of Marine Species (WoRMS; http://www.marinespecies.org/).

** Pisidium casertanum is accepted as Euglesa casertana

Bar charts denoting the proportion of reads for each assay (MOL16S and SPH16S) that were recovered from sequencing of water samples from tanks A and B. The middle bar shows the original composition of the tanks. *Denotes species from NCBI GenBank (https://www.ncbi.nlm.nih.gov/genbank/) not recognized on the World Register of Marine Species (WoRMS; http://www.marinespecies.org/). ** Pisidium casertanum is accepted as Euglesa casertana

Maumee River field eDNA samples

Comparing results from our metabarcoding assay to those from the Ohio EPA’s morphological identification, we find that the two approaches are biased to particular taxa. Given that we designed our MOL16S assay specifically for molluscs, this is not surprising; however, results show that some non-molluscan taxa including bryozoans, annelids, rotifers, and chordates also amplified. Without our fish blocking primer to reduce amplification of fish DNA, the number of fish reads also was not unexpected, however, we did not anticipate the amplification of those other taxa. Morphological identification of other taxonomic groups at the collection site was highest for fishes and arthropods (mainly insects) (S1 File). The morphological survey failed to discern sphaeriid clams or pleurocerid snails (from the Physella and Elimia genera) to species level, whereas our molecular assays identified them to species level (Table 9) (Fig 4). Interestingly the molecular assay found quagga mussel D. rostriformis in water samples at river miles 26.7 and 58.1, while the morphological survey did not record this species (Table 9). Furthermore, DNA from the European stream valve snail Valvata piscninalis was detected from river mile 58.1, but not detected visually in sampling. Some sequences were identified as belonging to the snail Radix swinhoei by our MOL16S molecular assay. This Asian species of snail has not been noted on watch lists as a potential invasive; however, upon further investigation including a BLAST search of the referenced OTU (GenBank accession number KP279638), the reference OTU sequence was closer to Physella acuta (a North American snail) than to other Radix species, indicating that the original Radix swinhoei GenBank entry (gb|KP279638) was incorrect. Thus we believe the correct identification of our reads for this OTU was P. acuta, which is a common snail species found in the sampling region, and not R. swinhoei. The morphological survey was able to discern one to three species of unionid mussels across the three water samples, whereas the MOL16S assay did not detect them. Finally, our water blank sample had 71 reads, eight of those being singletons with the remaining belonging to Sphaerium striatinum (Table 9), likely due to amplicon contamination in the lab.
Table 9

Number of reads per OTU identified at 97% similarity via BLAST search of Maumee River water samples.

SampleSequence ID Accession NumberSpeciesNumber of ReadsIdentity(%)Coverage (%)Percentage of total reads (%)Percentage of the 97% ID reads (%)
Maumee River RM 26.7|dbj|AB365626Plumatella emarginata476981000.06449.739
|gb|FJ426631Brachionus calyciflorus124991000.01712.957
|gb|HQ691208Aeolosoma sp.78991000.0118.150
|gb|GQ343273Brachionus calyciflorus471001000.0064.911
|gb|KX372733Homo sapiens431001000.0064.493
|gb|EU038309Physella acuta381001000.0053.971
|gb|CP012504Aeromonas veronii37981000.0053.866
|gb|AY093554Musculium transversum311001000.0043.239
|gb|FJ426630Brachionus angularis18991000.0021.881
|gb|KP306894Ictiobus cyprinellus141001000.0021.463
|dbj|AB365638Fredericella indica11991000.0011.149
|gb|JF275060Meleagris gallopavo81001000.0010.836
|gb|DQ459934Pristina aequiseta7991000.0010.731
|gb|GQ466406Brachionus calyciflorus798910.0010.731
|dbj|AB365640Pectinatella magnifica51001000.0010.522
|gb|AY520093Aplodinotus grunniens41001000.0010.418
|gb|GQ343273Brachionus calyciflorus398800.0000.313
|dbj|AB466325Paludicella articulata21001000.0000.209
|dbj|AB365642Lophopodella carteri21001000.0000.209
|gb|AF038996Dreissena rostriformis2991000.0000.209
Maumee River RM 58.1|dbj|AB365626Plumatella emarginata12532097–10098–10014.20465.371
|dbj|AB365631Hyalinella punctata2895997–981003.28215.106
|dbj|AB365642Lophopodella carteri1678798–1001001.9038.757
|gb|KJ186046Valvata piscinalis81011001000.9184.226
|dbj|AB365623Plumatella rugosa32361001000.3671.688
|gb|KP306894Ictiobus cyprinellus27441001000.3111.431
|gb|AF038996Dreissena rostriformis176098–991000.1990.918
|dbj|AB365640Pectinatella magnifica15811001000.1790.825
|gb|DQ311116Elimia livescens11601001000.1310.605
|gb|KX372733Homo sapiens7951001000.0900.415
|gb|AY216556Pimephales notatus39197–10094–1000.0440.204
|gb|KM051966Brachionus angularis380991000.0430.198
|gb|GQ343301Plumatella emarginata140971000.0160.073
|dbj|AB126083Carpiodes carpio1321001000.0150.069
|gb|AF038486Notropis atherinoides106991000.0120.055
|dbj|AB466325Paludicella articulata961001000.0110.050
|gb|GQ343296Plumatella emarginata798990.0010.004
|dbj|AB365622Plumatella repens5971000.0010.003
|gb|KF162319Homo sapiens497930.0000.002
|gb|FJ426630Brachionus angularis398960.0000.002
Maumee River RM 76.1|gb|KP279638*Radix swinhoei9031001000.25524.048
|dbj|AB365623Plumatella rugosa6011001000.17016.005
|gb|AY093554Musculium transversum5921001000.16715.770
|gb|DQ311116Elimia livescens4271001000.12111.372
|gb|AY885589*Pristina longiseta341991000.0969.081
|gb|AY216556Pimephales notatus2511001000.0716.68
|gb|FJ372631Saldula sp.140981000.0403.73
|gb|DQ536422Cyprinella spiloptera1181001000.0333.142
|gb|KX353761Homo sapiens911001000.0262.423
|gb|KP013098Sander vitreus88991000.0252.343
|dbj|AB365628Plumatella reticulata691001000.0191.838
|gb|KR476977Sander lucioperca551001000.0161.464
|gb|DQ912062Dorosoma cepedianum341001000.0100.905
|gb|AF152045Sphaerium striatinum29991000.0080.772
|gb|KP013087Lepomis cyanellus161001000.0050.426
Maumee River Blank|gb|AF152044Sphaerium striatinum6399100
|dbj|AB365642Lophopodella carteri199100
|dbj|AB365626Plumatella emarginata199100
|gb|GQ343301Plumatella emarginata195100
|gb|AF152044Sphaerium striatinum194100
|dbj|AB365641Asajirella gelatinosa19497
|gb|AF499051Synchaeta pectinata18975
|gb|AF325131*Asplanchna sieboldi18779
|gb|AF499051Synchaeta pectinata18775

* Denotes species from NCBI GenBank (https://www.ncbi.nlm.nih.gov/genbank/) not recognized in the World Register of Marine Species (WoRMS; http://www.marinespecies.org/).

Fig 4

Pie charts comparing the molecular and morphological/ visual species identification methods from three different water samples taken on the Maumee River.

* Denotes species from NCBI GenBank (https://www.ncbi.nlm.nih.gov/genbank/) not recognized in the World Register of Marine Species (WoRMS; http://www.marinespecies.org/).

Discussion

MOL16S assay with mock community samples

This study shows that careful primer design and testing reduces amplification bias in target amplicon sequencing, allowing for the inference of relative abundances among targeted taxa from high-throughput sequences of eDNA samples. Results from the mock communities support the hypothesis that the number of sequence reads correlates with the relative amounts of DNA initially put into the sample for each OTU. Although our MOL16S primer set is not specific to molluscs, it provides useful relative abundance measurements for the targeted group. The MOL16S primer set also amplifies some vertebrate and non-target invertebrate species. In fact after development of this marker, further research found that our forward MOL16S primer overlaps a previously designed primer used by Karlsson and Holmund [48] for forensic identification of mammal species. Thus we believe that this region of 16S may be very useful for development of eDNA assays that aim for species level identification. We also showed that our fish blocking primer successfully reduced the amplification of added walleye DNA in the mock communities. Interestingly, it was not reads from fishes that dominated our river samples as we expected, but rather reads from rotifers and bryozoans. Designing blocking primers for these taxa will decrease their amplification. Alternatively, this primer set may be very useful for efforts targeting these other taxa.

SPH16S assay with mock community samples

Results from our SPH16S sphaeriid specific assay also demonstrated its ability to reflect relative abundance of the targeted species in a sample. Furthermore, and perhaps even more useful, these primers were shown to be extremely specific to the Sphaeriidae family, amplifying and identifying solely the target species in both the mock communities and the eDNA samples. Given the difficulty of morphological identification in this group, a concerted effort between sphaeriid taxonomists and molecular systematists would increase the amount of 16S sequence data from morphologically verified sphaeriid clams in genetic databases. Ultimately such an effort would enable biologists to then use the SPH16S marker as a mini-barcode for sphaeriid clams, thereby enhancing survey and monitoring efforts for this bivalve family.

eDNA samples

Results from eDNA (water) samples with our two assays show an improvement over morphological surveys for bivalve and snail species identifications. In the aquaria trials, the MOL16S assay detected all bivalve taxa that were housed in the tank except for C. fluminea, which, at the time of water sampling, were buried in the tank substrate. We believe that any DNA shed by individuals of this species was less likely to enter the water column, thus reducing their detectability. Had we sampled the substrate, we likely would have had a better chance at detecting C. fluminea. Environmental DNA is not distributed homogenously in the environment, and species habitat preferences likely influence eDNA detection. Taking soil or substrate samples would be a better technique for detecting eDNA from benthic organisms. Snail species were not detected at the 97% similarity level, but several pleurocerid snail OTUs were identified just below that threshold. Despite North America having the highest diversity of snails in the family Pleuroceridae, current diversity and taxonomy of this group is not well understood [49-50] and like many other invertebrate taxa, few genetic data have been collected. We believe that this lack of pleurocerid sequences in GenBank led to our assays’ inability to make an identity at the 97% threshold. Our SPH16S assay results demonstrate this primer pair’s specificity to sphaeriid bivalves. For both assays, the proportion of reads that aligned with Sphaerium spp. was higher than expected, especially in comparison to the Dreissena and Pisidium spp. Ten Pisidium and 10 Dreissena mussels were placed into tank A, but the number of reads assigned was much lower than expected. Individual Pisidium clams were much smaller in size than the Sphaerium clams, suggesting that biomass influences the amount of DNA detected. In fact other studies [7,51] report a strong relationship between biomass and eDNA, rather than between abundance (numbers of organisms) and eDNA. Biologists considering eDNA methods should first decide whether their question requires abundance as measured in number of individuals, or if biomass would suffice to answer their question. The river water samples demonstrate that these metabarcoding assays improve species level identification for the targeted molluscan species relative to visual morphology. Our MOL16S assay detected three different snail and three different bivalve OTUs among the three water samples; whereas the visual inspection was unable to define the sphaeriid clams and pleurocerid snails to species level. Furthermore, our molecular assays detected invasive D. rostriformis and V. piscinalis that the visual method did not identify. The sole molluscan taxon that the visual method was able to discern to species was four unionid mussels. Upon further investigation of our molecular assay, we found several mismatches between our MOL16S primers and unionid mussel sequences from GenBank, thus explaining the lack of unionid sequences. Our MOL16S assay amplified several non-mollusc taxa as well, particularly bryozoans and rotifers. Given the abundance of these organisms in the aquatic communities, it is not surprising that most of the sequence reads aligned to these groups. Design of taxon specific primers to block these groups will likely improve the MOL16S assay’s ability to detect more mollusc eDNA. Although rotifers made up a smaller percentage of reads relative to the bryozoans at the 97% ID threshold, analysis of reads below that threshold (S7 Table) show that the majority of reads closely match a number of rotifer species. Our primers did not amplify many arthropods, whereas arthropods were one of the dominant groups that the morphological method diagnosed to species level. Results from the Maumee River methods comparison demonstrate the respective strengths and weaknesses of both methods. Larger-sized taxa that are well studied such as insects (arthropods) and unionid mussels were readily diagnosed to species level through morphology by trained individuals. Our molecular assays outperformed this visual classification for other molluscs. The MOL16S assay also appears to be useful for bryozoan and rotifer identities. In fact the assay detected six genera and 10 different species of bryozoans, including the invasive Lophopodella carteri, whereas the morphological survey only diagnosed it and Plumatella spp. to the genus level. Given the diversity of bryozoans detected with our MOL16S assay, further work with this marker would aid assessment of its potential as a mini-barcode for bryozoans. Finally our blank sample did suggest that some contamination occurred, but in a relatively small amount (Table 9). It is unclear as to whether the contamination occurred during the filtering or amplification stage. Given the multi-step PCR process employed for library preparation which included opening of tubes after amplification, it is likely that contamination could have occurred in the process. Nevertheless, the amount of contamination we observed was relatively small. Blanks should be used throughout the process in order to limit and document the extent of contamination. Our study demonstrates that careful primer design is important in preserving relative abundance relationships among taxa in a metabarcoding study. This and our assays’ abilities to out preform morphological species level identification in less well studied taxa suggest that metabarcoding of eDNA samples will be a promising addition to biodiversity surveys. However one of the largest obstacles we came across when analyzing our field data was the lack of genetic data (and specifically voucher referenced data) in the publically-available genetic sequence databases. For example, our pleurocerid snail data from the aquaria experiments revealed that snail OTUs were only identified below the 97% similarity threshold. Thus, although we had snails of this family in the tanks, their DNA was not discerned according to our threshold. Similarly with the river water samples, a large number of rotifers were found below the 97% cut-off, thus not making our final species list. Finally, some of our reads showed 97% similarity to an OTU that was incorrectly identified in GenBank. For instance, some reads were identified by the BLAST search as R. swinhoei (gb|KP279638) and Gammarus balcanicsus (gb|DQ320034). Upon using these references in a BLAST (Basic Local Alignment Search Tool) search against the National Institutes of Health (NIH) NCBI (National Center for Biotechnology Information) nucleotide database (https://blast.ncbi.nlm.nih.gov/Blast.cgi), we found that the R. swinhoei sequence in GenBank (gb|KP279638) was more similar to P. acuta sequences than to other Radix spp. and the G. balcanicsus was more similar to insects than to other gammarid amphipods. Checking suspicious results is important for data interpretation, especially when detection of invasive species can lead to rapid and costly management action. Researchers and managers interested in metabarcoding surveys need to be aware that, like any tool, metabarcoding comes with assumptions and limitations. For instance sequence divergence thresholds do not necessarily reflect actual species boundaries [52-53]. Furthermore the percentage similarity in sequence thresholds will vary with species, loci used, and the amount of genetic data available. Nevertheless, we believe that metabarcoding capabilities will be greatly improved with increase in taxonomically verified sequences from diverse taxa. As others have proposed, we emphasize the need for increased collaboration among taxonomists and molecular biologists to catalogue taxonomically verified specimens with genetic sequence barcode data across all taxa to increase the capability of genetic barcoding methods [54-57].

Conclusions

Environmental DNA metabarcoding surveys can significantly augment efforts for identifying and eradicating invasive species, and conserving native species. Through low-impact sampling, eDNA samples processed with high-throughput sequencing can provide biological community composition data, identifying both native and invasive species. Our work shows that careful primer design and optimization can allow for targeted group analyses in which read abundance correlates well with original amount of DNA, providing potentially informative relative abundance or biomass data for taxa. We also have shown that eDNA methods and our two assays constitute useful tools for invasive species detection, as well as native mollusc (and perhaps other taxa) survey efforts, providing a valuable complement to traditional survey methods. The present study further demonstrates that metabarcoding data are only as good as the sequence and taxonomic information provided on genetic databases. Increased collaboration among taxonomists and molecular systematists is required in order to gain maximum benefits of this developing tool.

Excel file of the Ohio EPA Maumee River sample data.

(XLSX) Click here for additional data file.

List of targeted species for the Sphaeriidae SPH16S assay.

* indicates invasive species whose extractions were used to test primers in vitro. Ŧ indicates native or non-invasive species also used to test primers. Parentheses indicate unique OTUs for the SPH16S amplicon, the number inside the parentheses represents the number of sequences belonging to that OTU. (DOCX) Click here for additional data file.

Collection information for each of the samples used for the mock communities and primer design.

(DOCX) Click here for additional data file.

Collection information for additional samples used for primer design.

(DOCX) Click here for additional data file.

Number of reads, merged reads and exact match trimmed reads of the mock community samples.

(DOCX) Click here for additional data file.

Number of reads, merged reads and processed reads of the eDNA samples.

(DOCX) Click here for additional data file.

Number of reads per OTU identified below 97% similarity via BLAST search of aquaria samples.

(DOCX) Click here for additional data file.

Number of reads per OTU identified below 97% similarity via BLAST search of Maumee River samples.

(DOCX) Click here for additional data file.
  38 in total

1.  Biological identifications through DNA barcodes.

Authors:  Paul D N Hebert; Alina Cywinska; Shelley L Ball; Jeremy R deWaard
Journal:  Proc Biol Sci       Date:  2003-02-07       Impact factor: 5.349

2.  Nuclear ribosomal internal transcribed spacer (ITS) region as a universal DNA barcode marker for Fungi.

Authors:  Conrad L Schoch; Keith A Seifert; Sabine Huhndorf; Vincent Robert; John L Spouge; C André Levesque; Wen Chen
Journal:  Proc Natl Acad Sci U S A       Date:  2012-03-27       Impact factor: 11.205

3.  A molecular phylogeny of Mobile River drainage basin pleurocerid snails (Caenogastropoda: Cerithioidea).

Authors:  C Lydeard; W E Holznagel; J Garner; P Hartfield; J M Pierson
Journal:  Mol Phylogenet Evol       Date:  1997-02       Impact factor: 4.286

Review 4.  Environmental conditions influence eDNA persistence in aquatic systems.

Authors:  Matthew A Barnes; Cameron R Turner; Christopher L Jerde; Mark A Renshaw; W Lindsay Chadderton; David M Lodge
Journal:  Environ Sci Technol       Date:  2014-01-21       Impact factor: 9.028

5.  Species detection using environmental DNA from water samples.

Authors:  Gentile Francesco Ficetola; Claude Miaud; François Pompanon; Pierre Taberlet
Journal:  Biol Lett       Date:  2008-08-23       Impact factor: 3.703

6.  Phasing amplicon sequencing on Illumina Miseq for robust environmental microbial community analysis.

Authors:  Liyou Wu; Chongqing Wen; Yujia Qin; Huaqun Yin; Qichao Tu; Joy D Van Nostrand; Tong Yuan; Menting Yuan; Ye Deng; Jizhong Zhou
Journal:  BMC Microbiol       Date:  2015-06-19       Impact factor: 3.605

7.  Estimation of fish biomass using environmental DNA.

Authors:  Teruhiko Takahara; Toshifumi Minamoto; Hiroki Yamanaka; Hideyuki Doi; Zen'ichiro Kawabata
Journal:  PLoS One       Date:  2012-04-26       Impact factor: 3.240

8.  MUSCLE: a multiple sequence alignment method with reduced time and space complexity.

Authors:  Robert C Edgar
Journal:  BMC Bioinformatics       Date:  2004-08-19       Impact factor: 3.169

9.  MiFish, a set of universal PCR primers for metabarcoding environmental DNA from fishes: detection of more than 230 subtropical marine species.

Authors:  M Miya; Y Sato; T Fukunaga; T Sado; J Y Poulsen; K Sato; T Minamoto; S Yamamoto; H Yamanaka; H Araki; M Kondoh; W Iwasaki
Journal:  R Soc Open Sci       Date:  2015-07-22       Impact factor: 2.963

10.  The utility of DNA metabarcoding for studying the response of arthropod diversity and composition to land-use change in the tropics.

Authors:  Kingsly Chuo Beng; Kyle W Tomlinson; Xian Hui Shen; Yann Surget-Groba; Alice C Hughes; Richard T Corlett; J W Ferry Slik
Journal:  Sci Rep       Date:  2016-04-26       Impact factor: 4.379

View more
  16 in total

1.  Environmental RNA as a Tool for Marine Community Biodiversity Assessments.

Authors:  Marissa S Giroux; Jay R Reichman; Troy Langknecht; Robert M Burgess; Kay T Ho
Journal:  Sci Rep       Date:  2022-10-22       Impact factor: 4.996

2.  Detection of Galba truncatula, Fasciola hepatica and Calicophoron daubneyi environmental DNA within water sources on pasture land, a future tool for fluke control?

Authors:  Rhys Aled Jones; Peter M Brophy; Chelsea N Davis; Teri E Davies; Holly Emberson; Pauline Rees Stevens; Hefin Wyn Williams
Journal:  Parasit Vectors       Date:  2018-06-08       Impact factor: 3.876

3.  Invasion genetics of the silver carp Hypophthalmichthys molitrix across North America: Differentiation of fronts, introgression, and eDNA metabarcode detection.

Authors:  Carol A Stepien; Matthew R Snyder; Anna E Elz
Journal:  PLoS One       Date:  2019-03-27       Impact factor: 3.240

4.  New PCR primers for metabarcoding environmental DNA from freshwater eels, genus Anguilla.

Authors:  Aya Takeuchi; Tetsuya Sado; Ryo O Gotoh; Shun Watanabe; Katsumi Tsukamoto; Masaki Miya
Journal:  Sci Rep       Date:  2019-05-28       Impact factor: 4.379

Review 5.  Prospects and challenges of implementing DNA metabarcoding for high-throughput insect surveillance.

Authors:  Alexander M Piper; Jana Batovska; Noel O I Cogan; John Weiss; John Paul Cunningham; Brendan C Rodoni; Mark J Blacket
Journal:  Gigascience       Date:  2019-08-01       Impact factor: 6.524

6.  The detection of a non-anemophilous plant species using airborne eDNA.

Authors:  Mark D Johnson; Robert D Cox; Matthew A Barnes
Journal:  PLoS One       Date:  2019-11-20       Impact factor: 3.240

7.  Environmental DNA reveals quantitative patterns of fish biodiversity in large rivers despite its downstream transportation.

Authors:  Didier Pont; Mathieu Rocle; Alice Valentini; Raphaël Civade; Pauline Jean; Anthony Maire; Nicolas Roset; Michael Schabuss; Horst Zornig; Tony Dejean
Journal:  Sci Rep       Date:  2018-07-10       Impact factor: 4.379

8.  The effects of spatial and temporal replicate sampling on eDNA metabarcoding.

Authors:  Kevin K Beentjes; Arjen G C L Speksnijder; Menno Schilthuizen; Marten Hoogeveen; Berry B van der Hoorn
Journal:  PeerJ       Date:  2019-07-26       Impact factor: 2.984

9.  Analysis of 13,312 benthic invertebrate samples from German streams reveals minor deviations in ecological status class between abundance and presence/absence data.

Authors:  Dominik Buchner; Arne J Beermann; Alex Laini; Peter Rolauffs; Simon Vitecek; Daniel Hering; Florian Leese
Journal:  PLoS One       Date:  2019-12-23       Impact factor: 3.240

Review 10.  Genomic biosurveillance of forest invasive alien enemies: A story written in code.

Authors:  Richard C Hamelin; Amanda D Roe
Journal:  Evol Appl       Date:  2019-09-10       Impact factor: 5.183

View more

北京卡尤迪生物科技股份有限公司 © 2022-2023.