
Long-read mapping to repetitive reference sequences using Winnowmap2.

Chirag Jain, Arang Rhie, Nancy F Hansen, Sergey Koren, Adam M Phillippy.

Abstract

Approximately 5-10% of the human genome remains inaccessible due to the presence of repetitive sequences such as segmental duplications and tandem repeat arrays. We show that existing long-read mappers often yield incorrect alignments and variant calls within long, near-identical repeats, as they remain vulnerable to allelic bias. In the presence of a nonreference allele within a repeat, a read sampled from that region could be mapped to an incorrect repeat copy. To address this limitation, we developed a new long-read mapping method, Winnowmap2, by using minimal confidently alignable substrings. Winnowmap2 computes each read mapping through a collection of confident subalignments. This approach is more tolerant of structural variation and more sensitive to paralog-specific variants within repeats. Our experiments highlight that Winnowmap2 successfully addresses the issue of allelic bias, enabling more accurate downstream variant calls in repetitive sequences.
© 2022. The Author(s), under exclusive licence to Springer Nature America, Inc.
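
The idea of a "minimal confidently alignable substring" can be illustrated with a toy sketch. This is not Winnowmap2's implementation. It assumes two equal-length repeat copies, a read whose offset within the repeat is already known, Hamming distance in place of real alignment, and a simple mismatch margin in place of a mapping-quality threshold. The function names (`mismatches`, `minimal_confident_substrings`) and the `margin` parameter are hypothetical and chosen only for illustration.

```python
from collections import Counter


def mismatches(a, b):
    """Hamming distance between two equal-length strings."""
    return sum(x != y for x, y in zip(a, b))


def minimal_confident_substrings(read, copies, offset, margin=1):
    """For each read start i, find the shortest read[i:j] whose best-matching
    repeat copy beats the runner-up by at least `margin` mismatches.

    Toy assumptions: `copies` maps a name to a sequence, there are at least
    two copies, and the read aligns to every copy at the same `offset`.
    Returns a list of (start, end, best_copy_name) tuples.
    """
    found = []
    for i in range(len(read)):
        for j in range(i + 1, len(read) + 1):
            scores = sorted(
                (mismatches(read[i:j], seq[offset + i:offset + j]), name)
                for name, seq in copies.items()
            )
            if scores[1][0] - scores[0][0] >= margin:
                found.append((i, j, scores[0][1]))
                break  # keep only the minimal-length substring for this start
    return found


# Two near-identical repeat copies that differ at a single position (index 7).
copies = {
    "A": "ACGTACGTACGTACGT",
    "B": "ACGTACGAACGTACGT",
}
read = copies["B"][2:12]  # read sampled from copy B, starting at offset 2

mcas = minimal_confident_substrings(read, copies, offset=2)
votes = Counter(name for _, _, name in mcas)
```

In this toy setting, only substrings that span the copy-distinguishing position become confident, and each of them supports copy B. Start positions past that position yield no confident substring at all, which is why the sketch collects a set of confident subalignments rather than scoring the read as a whole.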

Year:  2022        PMID: 35365778     DOI: 10.1038/s41592-022-01457-8

Source DB:  PubMed          Journal:  Nat Methods        ISSN: 1548-7091            Impact factor:   28.547


