Using the longest run subsequence problem within homology-based scaffolding.

Sven Schrinner, Manish Goel, Michael Wulfert, Philipp Spohr, Korbinian Schneeberger, Gunnar W Klau.

Abstract

Genome assembly is one of the most important problems in computational genomics. Here, we address an issue that arises in homology-based scaffolding, that is, when contigs are linked and ordered into larger pseudo-chromosomes by means of a second, incomplete assembly of a related species. The idea is to use alignments of binned regions in one contig to find the most homologous contig in the other assembly. We show that ordering the contigs of the other assembly can be expressed as a new string problem, the longest run subsequence problem (LRS). We show that LRS is NP-hard and present reduction rules and two algorithmic approaches that, together, are able to solve large instances of LRS to provable optimality. We demonstrate the usefulness of LRS within an existing larger scaffolding approach by solving realistic instances resulting from partial Arabidopsis thaliana assemblies in short computation time. All data used in the experiments as well as our source code are freely available.
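
The abstract does not define LRS, so the following is only an illustrative sketch under a stated assumption: a run subsequence of a string is a subsequence in which every symbol occurs in at most one contiguous run, and LRS asks for a longest such subsequence. In the scaffolding setting, each symbol could stand for the contig of the other assembly that a bin aligns to best. The function name `longest_run_subsequence` and the subset-based dynamic program are hypothetical illustrations, not the authors' implementation, their reduction rules, or either of their two algorithmic approaches. The running time is exponential in the alphabet size, in line with the NP-hardness result.

```python
def longest_run_subsequence(s):
    """Length of a longest run subsequence of s (assumed definition:
    every symbol may appear in at most one contiguous run)."""
    symbols = sorted(set(s))
    bit = {c: 1 << i for i, c in enumerate(symbols)}

    # State: (bitmask of symbols already used, symbol of the open run).
    # Value: longest run subsequence that ends in this state.
    best = {(0, None): 0}

    for c in s:
        updated = dict(best)  # skipping c is always allowed
        for (mask, cur), length in best.items():
            if cur == c:
                # c extends the currently open run.
                updated[(mask, cur)] = max(updated[(mask, cur)], length + 1)
            elif not mask & bit[c]:
                # c has not been used yet: close the open run, start a new one.
                key = (mask | bit[c], c)
                if updated.get(key, -1) < length + 1:
                    updated[key] = length + 1
        best = updated

    return max(best.values())


if __name__ == "__main__":
    # "abab" -> e.g. "aab" or "abb"; no run subsequence of length 4 exists.
    print(longest_run_subsequence("abab"))  # 3
```

The state keeps only the set of used symbols and the open run, which is enough because the symbols that have been closed matter, not their order. Realistic instances would need the reduction rules and exact methods described in the paper.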

Keywords:  Alignment; Assembly; Longest subsequence; String algorithm

Year:  2021        PMID: 34183036     DOI: 10.1186/s13015-021-00191-8

Source DB:  PubMed          Journal:  Algorithms Mol Biol        ISSN: 1748-7188            Impact factor:   1.405


