Bachir Balech, Anna Sandionigi, Caterina Manzari, Emiliano Trucchi, Apollonia Tullo, Flavio Licciulli, Giorgio Grillo, Elisabetta Sbisà, Stefano De Felici, Cecilia Saccone, Anna Maria D'Erchia, Donatella Cesaroni, Maurizio Casiraghi, Saverio Vicario.
Abstract
DNA meta-barcoding is a powerful tool for rapidly characterizing the biodiversity of an environmental sample by integrating the DNA barcoding approach with High Throughput Sequencing technologies. It consists mainly of the parallel sequencing of one or more informative genomic fragments able to discriminate between living entities. Although this approach has been widely studied, several of its key steps still need optimization. A fundamental element is the standardization of bioinformatic analysis pipelines. The aim of the present study was to highlight a number of critical parameters of laboratory material preparation and taxonomic assignment pipelines in DNA meta-barcoding experiments using the cytochrome oxidase subunit-I (coxI) barcode region, known as a suitable molecular marker for animal species identification. We compared nine taxonomic assignment pipelines, including a custom in-house method based on Hidden Markov Models. Moreover, we evaluated the potential influence of universal primer amplification bias using qPCR, as well as the correlation between GC content and taxonomic assignment results. The pipelines were tested on a community of known terrestrial invertebrates collected by pitfall traps from a chestnut forest in Italy. Although the present analysis was not exhaustive and needs additional investigation, our results suggest some potential improvements in laboratory material preparation and the introduction of additional parameters in taxonomic assignment pipelines. These include the correct setup of the OTU clustering threshold, the calibration of GC content, which affects sequencing quality and taxonomic classification, and the evaluation of the effect of PCR primer amplification bias on the final biodiversity pattern. Careful attention to, and further validation and optimization of, these variables would therefore be required in a routine DNA meta-barcoding experiment.
Keywords: Amplification bias; Biodiversity; Carabids; DNA metabarcoding; GC content; High Throughput Sequencing; Taxonomic assignment; coxI
Year: 2018 PMID: 29915686 PMCID: PMC6004112 DOI: 10.7717/peerj.4845
Source DB: PubMed Journal: PeerJ ISSN: 2167-8359 Impact factor: 2.984
Taxonomically identified organisms in MPE5 sample and their corresponding biomass.
MPE4 simply contains the same organisms at equal biomass.
| Taxonomic group | Total biomass (g) | Species name | Biomass of single organism (g) |
|---|---|---|---|
| 22.89 | 12.94 | ||
| 4.85 | |||
| 0.58 | |||
| 0.29 | |||
| 0.23 | |||
| 0.18 | |||
| 0.07 | |||
| 4.12 | |||
| 2.44 | |||
| 1.02 | |||
| 0.82 | |||
| 0.7 | |||
| 0.64 | |||
| 0.63 | |||
| 0.21 | |||
| 0.1 | |||
| 0.02 |
Notes.
Class taxonomy rank.
Laemostenus latialis = LL1 and Carabus (Tomocarabus) convexus dilatatus = CC1
Figure 1. Bioinformatics analysis pipelines overview.
The same reference database is used by all methods to assign each sequence to a known taxon name. RDP and BLAST classifiers take as input the output of OTU picking methods (Usearch, Uclust, Blast). HSTA, Usearch-Ref and Uclust-Ref use the denoised sequences as direct input. The Bayes Factor (BF) equation is a logical representation of that implemented in HSTA (see OTU picking and Taxonomic Assignment section, letter c.)
Figure 2. Order level taxonomic assignments in MPE5 sample.
(A) Forward strand or 5′ coxI, (B) reverse strand or 3′ coxI. Each color corresponds to a taxonomic classifier (HSTA, RDP, BLAST) or a reference-based assigner algorithm (Usearch-Ref, Uclust-Ref). The symbols indicate either the combination of an OTU picking method with a classifier or the classifier/algorithm used for direct taxonomic assignment. The x-axis corresponds to the similarity thresholds used in OTU picking or with the direct assignments in HSTA and reference-based algorithms.
Figure 3. MPE5 taxonomic assignments plots at species level.
(A) Forward strand or 5′ coxI, (B) reverse strand or 3′ coxI. Each color corresponds to a taxonomic classifier (HSTA, RDP, BLAST) or a reference-based assigner algorithm (Usearch-Ref, Uclust-Ref). The symbols indicate either the combination of an OTU picking method with a classifier or the classifier/algorithm used for direct taxonomic assignment. The x-axis corresponds to the similarity thresholds used in OTU picking or with the direct assignment in HSTA and reference-based algorithms.
Sites diversity in sequence reads predicted by sites diversity in references and by GC content.
The results are reported for sequences assigned to the order Coleoptera, for the 5′ and 3′ ends of the coxI barcode region.
| | 5′ | | | 3′ | | |
|---|---|---|---|---|---|---|
| | GCref | Dref | Residuals | GCref | Dref | Residuals |
| Degrees of freedom | 1 | 1 | 562 | 1 | 1 | 562 |
| Sum of squares | 4.518 | 58.766 | 123.808 | 37.999 | 50.025 | 135.407 |
| Mean square | 4.518 | 58.766 | 0.220 | 37.999 | 50.025 | 0.241 |
| F value | 20.507 | 266.754 | – | 157.71 | 207.63 | – |
| Explained variance (%) | 31.41 | 66.17 | 22.3 | 66.17 | | |
| Pr(>F) | <2.2e−16 | – | <2.2e−16 | – | | |
Notes.
GCref and Dref are the GC content and sites diversity of reference sequences respectively.
Significance thresholds: 0, 0.001, 0.01, 0.05.
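The arithmetic behind an ANOVA table of this kind can be sketched as follows. The sums of squares and degrees of freedom are copied from the table; the dictionary layout and the `summarize` helper are illustrative names, not part of the original analysis. The sketch assumes that each F value is the term's mean square divided by the residual mean square, and that "explained variance" means a term's share of the total sum of squares; the second assumption in particular is a guess about how the percentages were derived.

```python
# Sums of squares and degrees of freedom copied from the ANOVA table.
# The structure and names below are illustrative assumptions.
anova = {
    "5'": {"GCref": (4.518, 1), "Dref": (58.766, 1), "Residuals": (123.808, 562)},
    "3'": {"GCref": (37.999, 1), "Dref": (50.025, 1), "Residuals": (135.407, 562)},
}


def summarize(terms):
    """Return mean square, F value and share of total sum of squares per term."""
    total_ss = sum(ss for ss, _ in terms.values())
    res_ss, res_df = terms["Residuals"]
    res_ms = res_ss / res_df  # residual mean square
    summary = {}
    for name, (ss, df) in terms.items():
        mean_square = ss / df
        # The residual row has no F value of its own.
        f_value = None if name == "Residuals" else mean_square / res_ms
        summary[name] = {
            "mean_square": mean_square,
            "f_value": f_value,
            # Assumption: "explained variance" = share of total sum of squares.
            "share_pct": 100.0 * ss / total_ss,
        }
    return summary


results = {end: summarize(terms) for end, terms in anova.items()}

for end, rows in results.items():
    for name, row in rows.items():
        f_text = "-" if row["f_value"] is None else f"{row['f_value']:.2f}"
        print(f"{end} {name}: MS={row['mean_square']:.3f} F={f_text} "
              f"share={row['share_pct']:.2f}%")
```

Under these assumptions the recomputed F values agree with the tabulated ones to within rounding, which is a quick way to check that the rows of a transcribed table are internally consistent.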
Universal primers amplification bias analysis results of MPE5 and MPE4 samples.
All reported values are CC1/LL1 ratios. For the DNA extract, PCR products and library preparation columns, the ratio is of qPCR signal intensity; for the 5′ and 3′ count columns, it is of the number of reads assigned to CC1 and LL1 by the HSTA pipeline. LL1 = Laemostenus latialis, CC1 = Carabus (Tomocarabus) convexus dilatatus.
| | | qPCR signal intensity | | | HSTA assigned reads | |
|---|---|---|---|---|---|---|
| Sample | Biomass | DNA extract | PCR products | Library preparation | Count at 5′ | Count at 3′ |
| MPE5 | 56.26 | 1508.46 | 620.31 | 122.34 | 231.35 | 103.55 |
| MPE4 | 1 | 53.92 | 163.76 | 19.59 | 1.83 | 1.29 |
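One way to read this table is to express each CC1/LL1 ratio relative to the biomass ratio of the same sample, so that a value of 1 would mean the two species are represented in proportion to their biomass. The sketch below does this with the values copied from the table; the normalization itself, the key names and the `relative_to_biomass` helper are illustrative assumptions and not a step described in the study.

```python
# CC1/LL1 ratios copied from the table, one entry per sample.
# Key names are illustrative, not taken from the original analysis.
ratios = {
    "MPE5": {
        "biomass": 56.26,
        "dna_extract": 1508.46,
        "pcr_products": 620.31,
        "library_prep": 122.34,
        "reads_5p": 231.35,
        "reads_3p": 103.55,
    },
    "MPE4": {
        "biomass": 1.0,
        "dna_extract": 53.92,
        "pcr_products": 163.76,
        "library_prep": 19.59,
        "reads_5p": 1.83,
        "reads_3p": 1.29,
    },
}


def relative_to_biomass(row):
    """Divide every CC1/LL1 ratio by the sample's CC1/LL1 biomass ratio."""
    base = row["biomass"]
    return {stage: value / base for stage, value in row.items() if stage != "biomass"}


normalized = {sample: relative_to_biomass(row) for sample, row in ratios.items()}

for sample, row in normalized.items():
    for stage, value in row.items():
        print(f"{sample} {stage}: {value:.2f}x the biomass ratio")
```

Because MPE4 has a biomass ratio of 1, its normalized values equal the tabulated ones, while the MPE5 values are scaled down by 56.26, which puts the two samples on a comparable footing.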