NanoSim: nanopore sequence read simulator based on statistical characterization.

Chen Yang, Justin Chu, René L Warren, Inanç Birol.

Abstract

Background: The MinION sequencing instrument from Oxford Nanopore Technologies (ONT) produces long reads from single-molecule sequencing, both valuable features for detailed genome characterization. To realize the potential of this platform, a number of groups are developing bioinformatics tools tuned to the unique characteristics of its data. These development efforts would benefit from simulation software whose output could be used to benchmark analysis tools.
Results: Here, we introduce NanoSim, a fast and scalable read simulator that captures the technology-specific features of ONT data and can be adjusted as nanopore sequencing technology improves. NanoSim works in two steps. The first step, read characterization, performs a comprehensive alignment-based analysis and generates a set of read profiles. These profiles are the input to the second step, the simulation stage, which uses the model built during characterization to produce in silico reads for a given reference genome. NanoSim is written in Python and R. The source files and manual are available at the Genome Sciences Centre website: http://www.bcgsc.ca/platform/bioinfo/software/nanosim.
Conclusion: In this work, we model the base-calling errors of ONT reads to inform the simulation of sequences with similar characteristics. We showcase the performance of NanoSim on publicly available datasets generated using the R7 and R7.3 chemistries and different sequencing kits, and we compare the resulting synthetic reads to those of other long-sequence simulators and to experimental ONT reads. We expect NanoSim to have an enabling role in the field and to benefit the development of scalable next-generation sequencing analysis software for long nanopore reads, including tools for genome assembly, mutation detection, and even metagenomic analysis.
© The Authors 2017. Published by Oxford University Press.
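
The two-stage workflow described in the Results (characterize aligned reads, then simulate from the resulting profile) can be illustrated with a toy sketch. This is not NanoSim's model or interface: the abstract does not describe its statistical models, so the sketch assumes uniform per-base mismatch, insertion, and deletion rates and an empirical read-length list. The function names `characterize` and `simulate`, and the gapped-string input format, are hypothetical.

```python
import random
from collections import Counter


def characterize(alignments):
    """Build a toy read profile from gapped (reference, read) alignment pairs.

    Each pair holds two equal-length strings in which '-' marks a gap.
    Assumption: errors are summarized as uniform per-reference-base rates.
    """
    counts = Counter()
    lengths = []
    for ref_aln, read_aln in alignments:
        if len(ref_aln) != len(read_aln):
            raise ValueError("aligned strings must have equal length")
        for r, q in zip(ref_aln, read_aln):
            if r == "-":
                counts["ins"] += 1      # base present only in the read
            elif q == "-":
                counts["del"] += 1      # base present only in the reference
            elif r == q:
                counts["match"] += 1
            else:
                counts["mis"] += 1
        # Ungapped read length, kept as an empirical length distribution.
        lengths.append(sum(1 for q in read_aln if q != "-"))
    ref_bases = counts["match"] + counts["mis"] + counts["del"]
    if ref_bases == 0:
        raise ValueError("no aligned reference bases")
    return {
        "lengths": lengths,
        "mis": counts["mis"] / ref_bases,
        "del": counts["del"] / ref_bases,
        "ins": counts["ins"] / ref_bases,
    }


def simulate(reference, profile, n_reads, seed=0):
    """Generate in silico reads from a reference using a toy profile."""
    rng = random.Random(seed)
    reads = []
    for _ in range(n_reads):
        # Sampled read length is used as the reference segment length
        # (an approximation; indels change the final read length).
        length = min(rng.choice(profile["lengths"]), len(reference))
        start = rng.randrange(len(reference) - length + 1)
        out = []
        for base in reference[start:start + length]:
            u = rng.random()
            if u < profile["del"]:
                continue                                  # deletion
            if u < profile["del"] + profile["mis"]:
                out.append(rng.choice([b for b in "ACGT" if b != base]))
            else:
                out.append(base)                          # correct call
            if rng.random() < profile["ins"]:
                out.append(rng.choice("ACGT"))            # insertion
        reads.append("".join(out))
    return reads


if __name__ == "__main__":
    toy_alignments = [("ACGT-ACGT", "ACGTTAC-T"), ("ACGTACGT", "ACCTACGT")]
    toy_profile = characterize(toy_alignments)
    for read in simulate("ACGTACGTTTGACCAGTACG", toy_profile, n_reads=3):
        print(read)
```

Separating the profile from the simulator mirrors the design stated in the abstract: the profile can be rebuilt from new data when the sequencing chemistry changes, without touching the simulation code.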

Keywords:  NanoSim; nanopore sequencing; sequence read simulation; statistical modeling

Year:  2017        PMID: 28327957      PMCID: PMC5530317          DOI: 10.1093/gigascience/gix010

Source DB:  PubMed          Journal:  Gigascience        ISSN: 2047-217X            Impact factor:   6.524

