Alex N Salazar, Franklin L Nobrega, Christine Anyansi, Cristian Aparicio-Maldonado, Ana Rita Costa, Anna C Haagsma, Anwar Hiralal, Ahmed Mahfouz, Rebecca E McKenzie, Teunke van Rossum, Stan J J Brouns, Thomas Abeel.
Abstract
The last decade has witnessed a remarkable increase in our ability to measure genetic information. Advances in sequencing technologies are challenging the existing methods of data storage and analysis. While methods to cope with the data deluge are progressing, many biologists have lagged behind due to the fast pace of computational advancements and the number of tools available to address their scientific questions. Future generations of biologists must be more computationally aware and capable. This means they should be trained in the computational skills needed to keep pace with technological developments. Here, we propose a model that bridges experimental and bioinformatics concepts using the Oxford Nanopore Technologies (ONT) sequencing platform. We provide both a guide to begin to empower the new generation of educators, scientists, and students in performing long-read assembly of bacterial and bacteriophage genomes and a standalone virtual machine containing all the required software and learning materials for the course.
Year: 2020 PMID: 31971941 PMCID: PMC6977714 DOI: 10.1371/journal.pcbi.1007314
Source DB: PubMed Journal: PLoS Comput Biol ISSN: 1553-734X Impact factor: 4.475
Fig 1. Course overview.
Integrated bioinformatics training with time on the x-axis. Lectures (green) give students the necessary background to execute and understand Practical (blue) and Project (purple) sessions. Laboratory sessions (yellow) enable students to employ their biological background and prepare their own DNA libraries from samples of interest. Libraries prepared by each student group are pooled together and run on a MinION device (Oxford Nanopore Technologies, Oxford, UK), generating data to be processed in Project sessions. Backup data previously prepared from the same samples can be used if the students’ MinION run fails to provide enough quality data for analysis. In the Practical sessions, students learn to use established bioinformatics methods, with an emphasis on processing long-read data (see Fig 2, S1 Table and S1 Text). In the Project sessions, they then apply these methods to the generated data to answer specific research questions. After intragroup and intergroup discussions of results, students prepare their final project report and present their results in a poster format.
Fig 2. Pipeline for genome assembly using MinION data.
First, the barcoded sequences are demultiplexed using Deepbinner [11] and basecalled using Albacore (Oxford Nanopore Technologies, Oxford, UK). Nanoplot [12] is used to assess the quality of the sequencing data for downstream processing. If the data are of sufficient quality, they are assembled using, e.g., Canu [13]. Confidence in the resulting consensus assembly is obtained by mapping with Minimap2 [14]. The assembly is polished to remove common mistakes using Nanopolish [15], and then Circlator [16] is used to determine the zero-based start of the genome, which depends on whether it is a bacterial sequence or a bacteriophage sequence. Finally, the assembled genome is annotated using Prokka [17]. Please refer to S1 Text for further details.
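The stage order and the quality gate described in this caption can be sketched in a few lines of Python. This is only an illustrative outline, not the course's actual scripts: the `Stage` record, the `read_n50` helper, and the `min_n50` threshold are hypothetical names introduced here, and using read-length N50 as the "sufficient quality" criterion is an assumption (the caption does not say which metric is used). No tool is actually invoked.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Stage:
    step: str   # what the stage does
    tool: str   # tool named in the Fig 2 caption


# Stage order as listed in the caption.
PIPELINE = [
    Stage("demultiplex barcoded reads", "Deepbinner"),
    Stage("basecall", "Albacore"),
    Stage("assess read quality", "Nanoplot"),
    Stage("assemble", "Canu"),
    Stage("map reads to gauge assembly confidence", "Minimap2"),
    Stage("polish", "Nanopolish"),
    Stage("set genome start", "Circlator"),
    Stage("annotate", "Prokka"),
]


def read_n50(lengths):
    """Return the N50: the length L such that reads of length >= L
    together contain at least half of all sequenced bases."""
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length
    return 0  # no reads


def stages_to_run(lengths, min_n50):
    """Return the stages to execute; stop after quality assessment
    when the reads fall below the (assumed) N50 threshold."""
    qc_index = next(i for i, s in enumerate(PIPELINE) if s.tool == "Nanoplot")
    if read_n50(lengths) < min_n50:
        return PIPELINE[: qc_index + 1]
    return PIPELINE


if __name__ == "__main__":
    # Hypothetical read lengths and threshold, for illustration only.
    for stage in stages_to_run([12000, 8000, 15000, 3000], min_n50=5000):
        print(f"{stage.tool}: {stage.step}")
```

In a course setting, a gate like this is the natural point at which a group would switch to the backup data if its own MinION run did not yield enough usable reads.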