On the complexity of Minimum Path Cover with Subpath Constraints for multi-assembly.

Romeo Rizzi, Alexandru I Tomescu, Veli Mäkinen.   

Abstract

BACKGROUND: Multi-assembly problems have attracted much attention in recent years, as Next-Generation Sequencing technologies have been applied to mixed settings, such as reads from the transcriptome (RNA-Seq) or from viral quasi-species. One classical model that has resurfaced in many multi-assembly methods (e.g. in Cufflinks, ShoRAH, BRANCH, CLASS) is the Minimum Path Cover (MPC) Problem, which asks for the minimum number of directed paths that cover all the nodes of a directed acyclic graph. The MPC Problem is popular because the acyclicity of the graph ensures that it can be solved in polynomial time.
RESULTS: In this paper, we consider two generalizations of the MPC Problem that integrate constraints arising from long reads or paired-end reads; these extensions have also been considered by two recent methods, but not fully solved. More specifically, we study the two problems in which a set of subpaths, or pairs of subpaths, of the graph must also be entirely covered by some path in the MPC. We show that in the case of long reads (subpaths), the generalized problem can be solved in polynomial time by a reduction to the classical MPC Problem. We also consider the weighted case, and show that it can be solved in polynomial time by a reduction to a min-cost circulation problem. As a side result, we also improve the time complexity of the classical minimum weight MPC Problem. In the case of paired-end reads (pairs of subpaths), the generalized problem becomes NP-hard, but we show that it is fixed-parameter tractable (FPT) in the total number of constraints. This computational dichotomy between long reads and paired-end reads is also a general insight into multi-assembly problems.
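
ILLUSTRATION: The classical (unconstrained, unweighted) MPC Problem named in the background can be sketched in a few lines. The following is a minimal textbook-style sketch, not the algorithm of the paper: it assumes nodes are numbered 0..n-1, allows paths to share nodes, and uses the standard reduction to maximum bipartite matching on the transitive closure of the graph. The function names `transitive_closure` and `min_path_cover_size` are hypothetical, chosen for this example only.

```python
def transitive_closure(n, edges):
    """Return reach[i][j] = True iff j is reachable from i by a non-empty path."""
    reach = [[False] * n for _ in range(n)]
    for u, v in edges:
        reach[u][v] = True
    # Warshall's algorithm: allow node k as an intermediate stop.
    for k in range(n):
        for i in range(n):
            if reach[i][k]:
                for j in range(n):
                    if reach[k][j]:
                        reach[i][j] = True
    return reach


def min_path_cover_size(n, edges):
    """Minimum number of directed paths covering every node of a DAG.

    Paths may share nodes. The answer equals n minus the size of a
    maximum matching in the bipartite graph that links u (left side)
    to v (right side) whenever v is reachable from u.
    """
    reach = transitive_closure(n, edges)
    match_right = [-1] * n  # match_right[v] = left node matched to v, or -1

    def try_augment(u, seen):
        # Search for an augmenting path starting at left node u.
        for v in range(n):
            if reach[u][v] and not seen[v]:
                seen[v] = True
                if match_right[v] == -1 or try_augment(match_right[v], seen):
                    match_right[v] = u
                    return True
        return False

    matching = sum(1 for u in range(n) if try_augment(u, [False] * n))
    return n - matching


# Example: edges 0->2, 1->2, 2->3, 2->4. Two paths suffice when paths may
# share node 2 (for instance 0-2-3 and 1-2-4).
print(min_path_cover_size(5, [(0, 2), (1, 2), (2, 3), (2, 4)]))
```

This sketch runs in cubic time because of the closure step and returns only the number of paths, not the paths themselves. It does not handle subpath constraints, paired-end constraints, or weights, which are the subject of the results above.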

Year: 2014; PMID: 25252805; PMCID: PMC4168716; DOI: 10.1186/1471-2105-15-S9-S5

Source DB: PubMed; Journal: BMC Bioinformatics; ISSN: 1471-2105; Impact factor: 3.169


