Andres Santos1,2,3, Ronny van Aerle3, Leticia Barrientos1,2,3, Jaime Martinez-Urtaza3.
Abstract
Assessment of bacterial diversity through sequencing of 16S ribosomal RNA (16S rRNA) genes has been widely used in environmental microbiology, particularly since the advent of high-throughput sequencing technologies. These technologies also created the need to develop new strategies to manage and investigate the massive amount of sequencing data generated. This situation stimulated the rapid expansion of the field of bioinformatics, with the release of new tools for the downstream analysis and interpretation of sequencing data, mainly generated using Illumina technology. In recent years, a third generation of sequencing technologies has been developed and applied in parallel with, and as a complement to, the earlier sequencing strategies. In particular, Oxford Nanopore Technologies (ONT) introduced nanopore sequencing, which has become very popular among molecular ecologists. Nanopore technology offers low cost, portability and fast sequencing throughput. This technology has recently been tested for 16S rRNA analyses, showing promising results. However, compared with previous technologies, there is a scarcity of bioinformatic tools and protocols designed specifically for the analysis of Nanopore 16S sequences. Owing to its notable characteristics, researchers have recently started assessing the suitability of MinION for 16S rRNA sequencing studies and have obtained remarkable results. Here we present a review of the state of the art of MinION technology applied to microbiome studies, its current possible applications and the main challenges for its use in 16S rRNA metabarcoding.
Keywords: Microbial diversity; MinION; Third generation sequencing
Year: 2020 PMID: 32071706 PMCID: PMC7013242 DOI: 10.1016/j.csbj.2020.01.005
Source DB: PubMed Journal: Comput Struct Biotechnol J ISSN: 2001-0370 Impact factor: 7.271
Fig. 1 Most common metabarcoding sequencing strategies for each sequencing technology generation. (a) First generation sequencing (Sanger). Under this approach, metabarcoding is classically performed by amplifying full-length 16S rRNA genes from an environmental DNA sample; once the amplicons have been obtained, they are cloned: the sequences are inserted into a vector, which is then transformed into a host; finally, plasmid extraction and purification are performed and the 16S rRNA inserts are sequenced by the Sanger method. (b) Second generation sequencing (Illumina). Specific regions of the 16S rRNA gene are amplified by PCR from environmental DNA samples; depending on the scope of the study, one or two regions of the 16S gene can be amplified, with regions V1-V2 and V3-V4 being the most frequently used; for these regions, a paired-end library preparation is often used (a library being the mix of DNA fragments with adapters attached to their ends, ready to be sequenced); adapters (exogenous nucleic acids ligated to the nucleic acid molecule to be sequenced) and indexes (unique DNA sequences ligated to fragments within a sequencing library, which allow the later sorting and identification of different samples sequenced in the same run) are added to the ends of the 16S amplicons, and libraries of ~300 bp in length are finally sequenced on the Illumina MiSeq platform. (c) Third generation sequencing (Nanopore). This recently developed approach starts with the amplification of the full-length 16S rRNA gene from environmental DNA using universal primers; indexes for multiplexing are added to the amplicons in the same PCR reaction; once the amplicons have been purified, the library is prepared by adding a protein at a specific tagged region of the 16S amplicons (10 min for library preparation); finally, the samples are sequenced directly on the MinION sequencer.
Comparison of the available sequencing platforms for 16S metagenomic analysis using a metabarcoding approach.
| Sequencing platform | Read length (bp) | Accuracy | Output | Sequencing chemistry | Run time | Advantages in metabarcoding approaches |
|---|---|---|---|---|---|---|
| Sanger | 400–900 | 99.999% | 1.9–84 kb | Dideoxy chain termination | 20 min–3 h | Long read length, high quality |
| Illumina MiSeq | 75–300 | 99.9% | 13.2–20 Gb | Sequencing by synthesis | 21–56 h | High throughput, read quality |
| MinION | >200,000 | ~95% | ~50 Gb | Single-molecule, real-time sequencing of long reads | 1–48 h | High throughput, long read length, portability |
| PacBio | 10,000–15,000 | 99.999% | 5–10 Gb | Single-molecule, real-time sequencing of long reads | 4 h | Long read length and quality |
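
To put the accuracy column in perspective, the short sketch below converts a per-base accuracy into an expected number of miscalled bases per read and into the equivalent Phred quality score, Q = -10·log10(error probability). The accuracies are taken from the table above; the read lengths (about 1,500 bp for a full-length 16S amplicon, 300 bp for a MiSeq read) and the assumption that errors are independent and uniform along the read are simplifications made here for illustration only.

```python
import math


def expected_errors(read_length_bp: int, accuracy: float) -> float:
    """Expected number of miscalled bases, assuming independent, uniform errors."""
    return read_length_bp * (1.0 - accuracy)


def phred_from_accuracy(accuracy: float) -> float:
    """Phred quality score equivalent to a per-base accuracy (accuracy < 1)."""
    return -10.0 * math.log10(1.0 - accuracy)


# Accuracies come from the comparison table; read lengths are illustrative assumptions.
examples = [
    ("MinION, full-length 16S amplicon", 1500, 0.95),
    ("Illumina MiSeq, single 300 bp read", 300, 0.999),
]

for label, length_bp, accuracy in examples:
    errors = expected_errors(length_bp, accuracy)
    quality = phred_from_accuracy(accuracy)
    print(f"{label}: about {errors:.1f} expected errors (Q{quality:.0f})")
```

Under these assumptions a full-length nanopore read carries on the order of tens of errors, which is one reason dedicated filtering and assignment strategies are discussed for Nanopore 16S data.
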
Fig. 2 Classic pipelines MOTHUR [21] and QIIME2 [20] and their complete workflows for 16S rRNA amplicon analyses. The “common processes” flow contains the steps shared by both pipelines.
Fig. 3 Recommended MinION 16S rRNA amplicon pipeline for bacterial diversity analysis [90], [91], [92].
Different tools used to analyze Nanopore 16S data in metabarcoding studies.
| Analysis approach | Data processes included | Tools used for analysis | Taxonomic database | Reference |
|---|---|---|---|---|
| Profiling of bacterial communities | Basecalling, demultiplexing, adapter and barcode trimming, chimera removal, taxonomic assignment | Albacore v2.3.1, Porechop, Yacrd 0.3, Minimap, EPI2ME | NCBI and rrn database | |
| In-field metagenome bacterial community analysis | Basecalling, demultiplexing, taxonomic assignment, diversity analysis | Albacore v1.10, SINTAX, usearch v10.0.240 | Ribosomal Database Project | |
| Rapid identification of bacterial pathogens | Basecalling, human read removal, taxonomic assignment of bacterial reads | Albacore 2.2.4, TanTan v13, Minimap2, R | GenomeSync database, NCBI database | |
| Microbial monitoring of an anaerobic digestion system | Basecalling, demultiplexing, adapter trimming, taxonomic assignment | Metrichor, EPI2ME, poRe, Porechop, QIIME, BLAST | GreenGenes database | |
| Microbiome characterization | Basecalling, OTU picking, taxonomic assignment | Metrichor v2.42.2, Poretools, QIIME 1.9, RDP classifier, BLASTn | GreenGenes database | |
| Microbiome amplicon sequencing workflow | Basecalling, alignment, re-orientation of reads, de novo clustering, chimera removal | Fast5-to-fastq, seqtk, INC-Seq, blastn, Graphmap, POA, chopSeq, nanoClust, R | No taxonomic assignment | |
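
Several of the workflows in the table above assign taxonomy by aligning reads against a reference database, for example with Minimap2, whose default output is the tab-separated PAF format. The following is a minimal sketch of one possible post-processing step, not the method of any of the listed studies: it keeps, for each read, the alignment with the most matching bases and counts reads per reference sequence. The read and reference names, the example lines and the `min_identity` parameter are hypothetical; mapping reference IDs to taxa depends on the database used and is not shown.

```python
from collections import Counter


def best_hits(paf_lines, min_identity=0.0):
    """Map each read to the reference of its best alignment.

    PAF columns used (1-based): 1 = query name, 6 = target name,
    10 = number of matching bases, 11 = alignment block length.
    """
    best = {}
    for line in paf_lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 12:
            continue  # skip malformed or empty lines
        read_id, target_id = fields[0], fields[5]
        matches, block_len = int(fields[9]), int(fields[10])
        if block_len == 0 or matches / block_len < min_identity:
            continue
        # Keep the alignment with the highest number of matching bases.
        if read_id not in best or matches > best[read_id][1]:
            best[read_id] = (target_id, matches)
    return {read_id: target for read_id, (target, _) in best.items()}


def tally(assignments):
    """Count how many reads were assigned to each reference sequence."""
    return Counter(assignments.values())


# Hypothetical PAF records: read1 has two candidate references, read2 has one.
example_paf = [
    "read1\t1500\t0\t1500\t+\tref_A\t1550\t10\t1510\t1400\t1500\t60",
    "read1\t1500\t0\t1500\t+\tref_B\t1540\t5\t1505\t1300\t1500\t0",
    "read2\t1480\t0\t1480\t-\tref_B\t1540\t20\t1500\t1380\t1480\t60",
]

for reference, count in sorted(tally(best_hits(example_paf)).items()):
    print(reference, count)
```

Choosing the hit with the most matching bases is only one of several reasonable criteria; mapping quality or alignment score could be used instead, depending on the study design.
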
Bioinformatic tools for Nanopore 16S rRNA metabarcoding data.
| Process | Tool | Input file | Programming languages | Available from | Reference |
|---|---|---|---|---|---|
| Basecalling | Albacore | fast5 | Python | ONT | |
| | Guppy | fast5 | Python | ONT | |
| | DeepNano | fast5 | Python | | |
| | Chiron | fast5 | Python | | |
| Sequencing report | NanoPlot | fastq, fasta, sequencing_summary (Albacore or Guppy basecaller) | Python | | |
| | poRe | fastq, fasta | R | | |
| | pauvre | fastq | | GitHub | |
| | poretools | fastq, fast5 | Python | | |
| Demultiplexing | Albacore | fast5 | Python | ONT | |
| | qcat | fastq | Python | GitHub | |
| | Porechop | fastq, fasta | C++, Python | GitHub | |
| Filtering and trimming | NanoFilt | fastq | Python | | |
| | Filtlong | fastq | C++, Python | GitHub | |
| | Porechop | fastq | C++, Python | GitHub | |
| Taxonomic assignment | Minimap2 | fastq, fasta | C++, Python | | |
| | WIMP | fastq | Cloud-based | ONT | |
| | Centrifuge | fastq, fasta | g++ | | |
| | LASTZ | fastq, fasta | g++, Python | GitHub | |
| Clustering | NanoClust | USEARCH/VSEARCH format | Python | | |
| | CARNAC-LR | paf | C++, Python | | |
| Data exploration | Pavian | Kraken and MetaPhlAn formats | R | | |
| | PHINCH | biom | Cloud-based | | |
| | Krona | Krona format | – | | |
| | MEGAN6 | OTU table | – | | |
| | MicrobiomeAnalyst | OTU table, taxonomy table | Cloud-based | | |
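
The filtering and trimming step in the table above can be illustrated with a small, dependency-free sketch of what such a filter does conceptually: keep reads whose length is close to that of the full-length 16S gene and whose mean quality is above a cutoff. This is not the code of NanoFilt or Filtlong. The function names and the thresholds (1,200–1,800 bp, mean Q ≥ 7) are assumptions chosen for illustration, and the mean quality is computed by averaging error probabilities rather than raw Phred scores.

```python
import math


def mean_quality(quality_string, offset=33):
    """Mean read quality, averaged over error probabilities (Phred+33 encoding)."""
    if not quality_string:
        return 0.0
    probs = [10 ** (-(ord(char) - offset) / 10) for char in quality_string]
    return -10 * math.log10(sum(probs) / len(probs))


def parse_fastq(lines):
    """Yield (name, sequence, quality) from well-formed four-line FASTQ records."""
    iterator = iter(lines)
    for header in iterator:
        if not header.strip():
            continue  # tolerate blank lines between records
        sequence = next(iterator).strip()
        next(iterator)  # '+' separator line
        quality = next(iterator).strip()
        yield header.strip()[1:], sequence, quality


def filter_reads(records, min_len=1200, max_len=1800, min_q=7.0):
    """Keep reads within the length window and at or above the quality cutoff.

    The default thresholds are illustrative assumptions, not recommended values.
    """
    for name, sequence, quality in records:
        if min_len <= len(sequence) <= max_len and mean_quality(quality) >= min_q:
            yield name, sequence, quality


# Hypothetical reads: one good, one too short, one of low quality.
example_fastq = [
    "@good_read", "A" * 1500, "+", "5" * 1500,     # Q20 throughout
    "@short_read", "A" * 500, "+", "5" * 500,      # fails the length window
    "@noisy_read", "A" * 1500, "+", "#" * 1500,    # Q2 throughout
]

kept = [name for name, _, _ in filter_reads(parse_fastq(example_fastq))]
print(kept)
```

A length window of this kind also discards truncated amplicons and concatenated reads, which is why it is often applied before taxonomic assignment.
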