| Literature DB >> 31665473 |
Noam Harel1, Moran Meir1, Uri Gophna1, Adi Stern1.
Abstract
One of the key challenges in the field of genetics is the inference of haplotypes from next generation sequencing data. The MinION Oxford Nanopore sequencer allows sequencing long reads, with the potential of sequencing complete genes, and even complete genomes of viruses, in individual reads. However, MinION suffers from high error rates, rendering the detection of true variants difficult. Here, we propose a new statistical approach named AssociVar, which differentiates between true mutations and sequencing errors from direct RNA/DNA sequencing using MinION. Our strategy relies on the assumption that sequencing errors will be dispersed randomly along sequencing reads, and hence will not be associated with each other, whereas real mutations will display a non-random pattern of association with other mutations. We demonstrate our approach using direct RNA sequencing data from evolved populations of the MS2 bacteriophage, whose small genome makes it ideal for MinION sequencing. AssociVar inferred several mutations in the phage genome, which were corroborated using parallel Illumina sequencing. This allowed us to reconstruct full genome viral haplotypes constituting different strains that were present in the sample. Our approach is applicable to long read sequencing data from any organism for accurate detection of bona fide mutations and inter-strain polymorphisms.Entities:
Mesh:
Substances:
Year: 2019 PMID: 31665473 PMCID: PMC7107797 DOI: 10.1093/nar/gkz907
Source DB: PubMed Journal: Nucleic Acids Res ISSN: 0305-1048 Impact factor: 16.971