Simone Maestri, Emanuela Cosentino, Marta Paterno, Hendrik Freitag, Jhoana M Garces, Luca Marcolungo, Massimiliano Alfano, Iva Njunjić, Menno Schilthuizen, Ferry Slik, Michele Menegon, Marzia Rossato, Massimo Delledonne.
Abstract
Genetic markers (DNA barcodes) are often used to support and confirm species identification. Barcode sequences can be generated in the field using portable systems based on the Oxford Nanopore Technologies (ONT) MinION sequencer. However, to achieve broader application, current proof-of-principle workflows for on-site barcoding analysis must be standardized to ensure reliable and robust performance under suboptimal field conditions without increasing costs. Here, we demonstrate the implementation of a new on-site workflow for DNA extraction, PCR-based barcoding, and the generation of consensus sequences. The portable laboratory features inexpensive instruments that can be carried as hand luggage and uses standard molecular biology protocols and reagents that tolerate adverse environmental conditions. Barcodes are sequenced using MinION technology and analyzed with ONTrack, an original de novo assembly pipeline that requires as few as 1000 reads per sample. ONTrack-derived consensus barcodes have high accuracy, ranging from 99.8% to 100%, despite the presence of homopolymer runs. The ONTrack pipeline has a user-friendly interface and returns consensus sequences in minutes. The remarkable accuracy and low computational demand of the ONTrack pipeline, together with the inexpensive equipment and simple protocols, make the proposed workflow particularly suitable for tracking species under field conditions.
Keywords: barcoding; biodiversity; field ecology; long reads; nanopore sequencing; portable lab
Year: 2019; PMID: 31226847; PMCID: PMC6627956; DOI: 10.3390/genes10060468
Source DB: PubMed; Journal: Genes (Basel); ISSN: 2073-4425; Impact factor: 4.096
Figure 1. The portable genomics laboratory. Panel (a) shows the equipment comprising the portable genomics laboratory, namely (i) micropipettes, (ii) a mini-microcentrifuge, (iii) a thermal cycler, (iv) an electrophoresis system, (v) a fluorometer, (vi) the nanopore sequencer MinION, and (vii) a laptop. Panel (b) shows how the laboratory is transported.
Sequencing statistics. For each sample, we show the cytochrome oxidase I (COI) primers used for PCR amplification, the number of sequenced reads, the mean and the standard deviation of read length in base pairs, and the average accuracy of MinION reads.
| Sample ID | Sample Name | COI Amplicon Primers | Reads | Mean Read Length (SD) | Average Read Accuracy |
|---|---|---|---|---|---|
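The per-sample statistics named in the caption above (number of reads, mean read length, and standard deviation) can be computed as in the following minimal sketch. The function name `read_length_stats` and the toy reads are hypothetical, and using the sample standard deviation is an assumption, since the record does not say which estimator the authors used.

```python
import statistics


def read_length_stats(reads):
    """Return (number of reads, mean length, SD of length) for a list of read sequences.

    Assumption: the SD is the sample standard deviation (n - 1 denominator).
    """
    lengths = [len(seq) for seq in reads]
    if not lengths:
        raise ValueError("no reads supplied")
    mean_len = statistics.mean(lengths)
    # stdev needs at least two values; report 0.0 for a single read.
    sd_len = statistics.stdev(lengths) if len(lengths) > 1 else 0.0
    return len(lengths), mean_len, sd_len


# Toy reads of length 8, 12 and 16 bp (illustrative only, not data from the paper).
toy_reads = ["ACGT" * 2, "ACGT" * 3, "ACGT" * 4]
n_reads, mean_len, sd_len = read_length_stats(toy_reads)
print(n_reads, mean_len, sd_len)
```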
Figure 2. ONTrack pipeline flowchart. (i) MinION reads were clustered at 70% identity using VSEARCH, and only reads in the most abundant cluster were retained for subsequent analysis. (ii) Next, 200 reads were subsampled with Seqtk and aligned with MAFFT, and a draft consensus was extracted with EMBOSS cons. (iii) The draft consensus sequence was then polished using Nanopolish, based on a second set of 200 randomly sampled reads.
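The read-selection logic in the flowchart (keep the most abundant cluster, then draw two random sets of 200 reads) can be illustrated with a short sketch. This is not the ONTrack code and does not call VSEARCH or Seqtk: it assumes cluster assignments are already available as a mapping from read ID to cluster ID, and the names `largest_cluster_reads` and `draw_subsample`, the seeds, and the toy data are all hypothetical.

```python
import random
from collections import Counter


def largest_cluster_reads(cluster_of):
    """Return the sorted IDs of reads assigned to the most abundant cluster.

    `cluster_of` maps read ID -> cluster ID (assumed to come from a prior
    clustering step, e.g. at 70% identity).
    """
    counts = Counter(cluster_of.values())
    top_cluster, _ = counts.most_common(1)[0]
    return sorted(r for r, c in cluster_of.items() if c == top_cluster)


def draw_subsample(read_ids, n=200, seed=0):
    """Draw up to `n` read IDs at random without replacement (seeded for repeatability)."""
    rng = random.Random(seed)
    return rng.sample(read_ids, min(n, len(read_ids)))


# Toy assignment: 900 reads in cluster "c1", 100 reads in cluster "c2".
cluster_of = {f"read{i}": ("c1" if i % 10 else "c2") for i in range(1000)}

kept = largest_cluster_reads(cluster_of)
draft_set = draw_subsample(kept, n=200, seed=1)   # reads for the draft consensus
polish_set = draw_subsample(kept, n=200, seed=2)  # second set, used for polishing
print(len(kept), len(draft_set), len(polish_set))
```

The two subsamples here are drawn independently, so they may share reads. Whether the pipeline's second set excludes reads from the first is not stated in the caption.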
Accuracy of consensus sequences generated by the ONTrack pipeline. For each sample, we show the percentage accuracy of the consensus sequences obtained.
| Sample ID | Consensus Accuracy |
|---|---|
Figure 3. Analysis of residual errors in the ONTrack final consensus sequences. Alignment of the MinION consensus sequence (Query) to the Sanger sequence (Sbjct) is shown for samples BC01 (a) and BC06 (b). The residual errors, present in homopolymeric runs of 6 and 8 nt, are highlighted in red. Properly reconstructed homopolymers of 7 nt are highlighted in green.
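Consensus accuracy of the kind reported here can be expressed as the number of matching positions divided by the alignment length. The sketch below assumes the query and subject are already aligned to equal length with `-` marking gaps. The function name `alignment_accuracy` and the toy sequences are hypothetical, and the toy example mimics a one-base deletion in a homopolymer run rather than reproducing any sample from the paper.

```python
def alignment_accuracy(query, subject):
    """Return (matches, alignment length, percent accuracy) for two aligned sequences.

    Both strings must have the same length; '-' marks a gap. A column counts
    as a match only when both sequences carry the same base.
    """
    if len(query) != len(subject):
        raise ValueError("aligned sequences must have equal length")
    if not query:
        raise ValueError("empty alignment")
    matches = sum(1 for q, s in zip(query, subject) if q == s and q != "-")
    return matches, len(query), 100.0 * matches / len(query)


# Toy alignment: the query is missing one T in a homopolymer run.
query = "ACGTTTTT-ACG"
sbjct = "ACGTTTTTTACG"
matches, aln_len, pct = alignment_accuracy(query, sbjct)
print(f"{matches}/{aln_len} ({pct:.1f}%)")
```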
Accuracy of consensus sequences generated by combining three iterations of the ONTrack pipeline. For each sample, we show the number of properly reconstructed positions over the alignment length and (in parentheses) the percentage accuracy of the consensus sequences for each of the three iterations, the accuracy of the final consensus sequence generated based on the majority rule from the three iterations and the number of iterations supporting it.
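The majority rule described in the caption above can be sketched as a per-column vote over the three iteration consensuses. This is an illustrative assumption about how such a vote could be implemented, not the authors' code: it presumes the three sequences have already been aligned to equal length with `-` for gaps, and the function name `majority_consensus` and the toy sequences are hypothetical.

```python
from collections import Counter


def majority_consensus(aligned):
    """Combine aligned consensus sequences by per-column majority vote.

    Returns (final consensus with gaps removed, number of input sequences
    identical to that final consensus). In a column with no majority, the
    character from the earliest iteration wins (an arbitrary choice made here).
    """
    if not aligned or len({len(s) for s in aligned}) != 1:
        raise ValueError("need at least one sequence, all of equal aligned length")
    voted = [Counter(column).most_common(1)[0][0] for column in zip(*aligned)]
    final = "".join(voted).replace("-", "")
    supporting = sum(1 for s in aligned if s.replace("-", "") == final)
    return final, supporting


# Toy input: iteration 2 drops one T from a homopolymer run.
iterations = ["ACGTTTTTTA", "ACGTTTTT-A", "ACGTTTTTTA"]
final, supporting = majority_consensus(iterations)
print(final, supporting)
```

In this toy case two of the three iterations agree with the final consensus, which corresponds to the "number of iterations supporting it" mentioned in the caption.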