Vasco Elbrecht, Ecaterina Edith Vamos, Dirk Steinke, Florian Leese.
Abstract
BACKGROUND: DNA metabarcoding is used to generate species composition data for entire communities. However, sequencing errors in high-throughput sequencing instruments are fairly common, usually requiring reads to be clustered into operational taxonomic units (OTUs), losing information on intraspecific diversity in the process. While Cytochrome c oxidase subunit I (COI) haplotype information is limited in resolving intraspecific diversity, it is nevertheless often useful, e.g. in a phylogeographic context, helping to formulate hypotheses on taxon distribution and dispersal.
Keywords: CO1; Ecosystem assessment; Exact sequence variant; Haplotyping; High-throughput sequencing; Metabarcoding; Population genetics
Year: 2018 PMID: 29666773 PMCID: PMC5896493 DOI: 10.7717/peerj.4644
Source DB: PubMed Journal: PeerJ ISSN: 2167-8359 Impact factor: 2.984
Figure 1. Overview of DNA metabarcoding data of a single-species mock sample containing specimens with 15 distinct haplotypes amplified with the fwh1 primer set (black circles), with red numbers above each circle showing the original 31 haplotypes using the full 658 bp barcoding region (Elbrecht & Leese, 2015; Vamos, Elbrecht & Leese, 2017).
Detected haplotypes (unexpected ones shown in grey and blue) are plotted against specimen biomass for the processed data (A) and after read denoising using unoise3 (C). Denoising was applied to both replicates individually, with a circle shown if the read was detected in both replicates (error bar = SD) and “A” or “B” if the read was found in only one replicate. For processing of the multi-species samples (B, Fig. 2), all samples were pooled and jointly denoised, followed by OTU clustering and read mapping, and then by discarding haplotypes below a 5% threshold within each sample.
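The final filtering step described above (discarding haplotypes below a 5% threshold within each sample) can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function name `filter_low_abundance`, the nested-dictionary input layout, and the read counts are hypothetical, and the sketch assumes the threshold is applied to each haplotype's share of the reads supplied for that sample (for example, the reads of one OTU), which the caption does not spell out.

```python
from typing import Dict

# Hypothetical layout: counts[sample][haplotype] = number of reads.
ReadCounts = Dict[str, Dict[str, int]]


def filter_low_abundance(counts: ReadCounts, min_fraction: float = 0.05) -> ReadCounts:
    """Keep only haplotypes whose share of a sample's reads is >= min_fraction.

    Assumption: the share is computed against the total reads passed in
    for that sample, not against the pooled dataset.
    """
    filtered: ReadCounts = {}
    for sample, hap_counts in counts.items():
        total = sum(hap_counts.values())
        if total == 0:
            # Nothing to compare against; keep the sample with no haplotypes.
            filtered[sample] = {}
            continue
        filtered[sample] = {
            hap: n for hap, n in hap_counts.items() if n / total >= min_fraction
        }
    return filtered


# Invented read counts, for illustration only.
example = {
    "site_1": {"hap_A": 900, "hap_B": 80, "hap_C": 20},
    "site_2": {"hap_A": 30, "hap_C": 970},
}
kept = filter_low_abundance(example)
```

With these invented counts, `hap_C` falls below 5% in `site_1` and `hap_A` falls below 5% in `site_2`, so each is dropped only from the sample where it is rare.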
Figure 2. Haplotype maps and networks extracted from multi-species monitoring metabarcoding datasets amplified with the BF2+BR2 primer set for four abundant macroinvertebrate taxa (A = Taeniopteryx nebulosa, B = Hydropsyche pellucidula, C = Oulimnius tuberculatus, D = Asellus aquaticus).
Numbers next to each sampling site indicate the sample size of the respective taxon based on morphological identification in a sample (Elbrecht et al., 2017). Conflicts between DNA and morphology-based detections are highlighted in yellow. Haplotype frequency composition per site is indicated by pie charts. For A. aquaticus, only the 10 most common haplotypes are visualised with different colors (remaining ones in white). Each crossline in a network represents one base pair difference between the respective haplotypes. Dashed lines around a circle indicate novel haplotypes that were not available in the BOLD reference database. An “A” or “B” next to a haplotype in the map or network indicates the presence of this haplotype in only one replicate. Shapefile data © OpenStreetMap contributors, licensed under Creative Commons 2.0 (CC BY-SA).
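The crosslines in the haplotype networks count base pair differences between haplotypes. For aligned sequences of equal length, that count is the Hamming distance, sketched below. The function name and the two short sequences are hypothetical and serve only to illustrate the counting; they are not data from the study.

```python
def hamming_distance(seq_a: str, seq_b: str) -> int:
    """Count positions at which two aligned, equal-length sequences differ."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(base_a != base_b for base_a, base_b in zip(seq_a, seq_b))


# Invented 8 bp fragments; they differ at positions 5 and 8.
differences = hamming_distance("ACGTACGT", "ACGTTCGA")
```

Under this reading, two haplotypes joined by an edge with two crosslines would differ at two positions, as the invented fragments above do.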