
Vulcan: Improved long-read mapping and structural variant calling via dual-mode alignment.

Yilei Fu, Medhat Mahmoud, Viginesh Vaibhav Muraliraman, Fritz J Sedlazeck, Todd J Treangen.

Abstract

BACKGROUND: Long-read sequencing has enabled unprecedented surveys of structural variation across the entire human genome. To maximize the potential of long-read sequencing in this context, novel mapping methods have emerged that have primarily focused on either speed or accuracy. Various heuristics and scoring schemas have been implemented in widely used read mappers (minimap2 and NGMLR) to optimize for speed or accuracy, which have variable performance across different genomic regions and for specific structural variants. Our hypothesis is that constraining read mapping to the use of a single gap penalty across distinct mutational hot spots reduces read alignment accuracy and impedes structural variant detection.
FINDINGS: We tested our hypothesis by implementing a read-mapping pipeline called Vulcan that uses two distinct gap penalty modes, which we refer to as dual-mode alignment. The high-level idea is that Vulcan first maps all reads with minimap2, uses the normalized edit distance of those alignments to identify poorly aligned reads, and then realigns only those reads with NGMLR, a long-read mapper that is more accurate but computationally more expensive. In support of our hypothesis, we show that Vulcan improves the alignments of Oxford Nanopore Technologies long reads for both simulated and real datasets. These improvements, in turn, lead to more accurate structural variant calling on human genome datasets compared to either of the read-mapping methods alone.
CONCLUSIONS: Vulcan is the first long-read mapping framework that combines two distinct gap penalty modes for improved structural variant recall and precision. Vulcan is open-source and available under the MIT License at https://gitlab.com/treangenlab/vulcan.
© The Author(s) 2021. Published by Oxford University Press GigaScience.
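
The selection step described under FINDINGS can be illustrated with a minimal Python sketch. It is not Vulcan's actual code: the function names, the input layout (read name, SAM `NM` value, CIGAR string), the choice to normalize by the number of alignment columns, and the fractional cutoff are all assumptions made for illustration. The abstract states only that a normalized edit distance from the minimap2 alignments is used to pick reads for realignment with NGMLR.

```python
import re

# Matches one CIGAR operation, e.g. "100M" or "5I".
CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")


def aligned_length(cigar: str) -> int:
    """Count alignment columns: matches/mismatches, insertions, and deletions.

    Soft/hard clips, padding, and skipped regions are ignored.
    """
    return sum(int(n) for n, op in CIGAR_OP.findall(cigar) if op in "MID=X")


def normalized_edit_distance(nm: int, cigar: str) -> float:
    """Edit distance (SAM NM tag) divided by the number of alignment columns.

    Assumption: this normalization is illustrative; the abstract does not
    define the exact formula.
    """
    length = aligned_length(cigar)
    return nm / length if length else 1.0


def split_reads(records, keep_fraction=0.9):
    """Partition reads into (keep, realign) lists by normalized edit distance.

    records: iterable of (read_name, nm, cigar) taken from a first-pass mapping.
    keep_fraction: hypothetical cutoff; the best-aligned fraction is kept and
    the remainder is flagged for realignment with the slower mapper.
    """
    scored = sorted(
        (normalized_edit_distance(nm, cigar), name) for name, nm, cigar in records
    )
    n_keep = int(len(scored) * keep_fraction)
    keep = [name for _, name in scored[:n_keep]]
    realign = [name for _, name in scored[n_keep:]]
    return keep, realign


if __name__ == "__main__":
    # Toy records: (read name, NM value, CIGAR string).
    reads = [
        ("r1", 5, "100M"),
        ("r2", 10, "50M50I"),
        ("r3", 40, "90M10D"),
        ("r4", 2, "10S100M"),
    ]
    keep, realign = split_reads(reads, keep_fraction=0.75)
    print("keep:", keep)
    print("realign:", realign)
```

In a pipeline of this shape, the reads in `realign` would be written out and passed to the second mapper, and the two alignment sets merged afterwards. The cutoff controls the trade-off between runtime and the number of reads that receive the more expensive alignment.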

Keywords:  gap penalty; long-read; read mapping; structural variation

Year:  2021        PMID: 34561697      PMCID: PMC8463296          DOI: 10.1093/gigascience/giab063

Source DB:  PubMed          Journal:  Gigascience        ISSN: 2047-217X            Impact factor:   6.524


