Toward simplifying and accurately formulating fragment assembly.

E W Myers

Abstract

The fragment assembly problem is that of reconstructing a DNA sequence from a collection of randomly sampled fragments. Traditionally, the objective of this problem has been to produce the shortest string that contains all the fragments as substrings, but in the case of repetitive target sequences this objective produces answers that are overcompressed. In this paper, the problem is reformulated as one of finding a maximum-likelihood reconstruction with respect to the two-sided Kolmogorov-Smirnov statistic, and it is argued that this is a better formulation of the problem. Next, the fragment assembly problem is recast in graph-theoretic terms as one of finding a noncyclic subgraph with certain properties, and the objectives of being shortest or maximally likely are also recast in this framework. Finally, a series of graph reduction transformations is given that dramatically reduces the size of the graph to be explored in practical instances of the problem. This reduction is very important as the underlying problems are NP-hard. In practice, the transformed problems are so small that simple branch-and-bound algorithms successfully solve them, thus permitting auxiliary experimental information to be taken into account in the form of overlap, orientation, and distance constraints.
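
The contrast between the shortest-string objective and the Kolmogorov-Smirnov objective can be illustrated with a small sketch. This is not the paper's algorithm: the function name `ks_uniform`, the two toy layouts, and the simplification that fragment start positions are compared against a uniform distribution over the whole reconstruction are all assumptions made for illustration. The idea is that random sampling should spread fragment starts evenly, so a layout that collapses two repeat copies into one shows a pile-up of starts and a larger deviation from uniformity.

```python
def ks_uniform(starts, length):
    """Two-sided Kolmogorov-Smirnov statistic of fragment start positions
    against a uniform distribution on [0, length] (illustrative assumption)."""
    xs = sorted(s / length for s in starts)
    n = len(xs)
    d = 0.0
    for i, x in enumerate(xs):
        # The empirical CDF steps from i/n to (i+1)/n at x; the uniform CDF is x.
        # Take the largest gap on either side of the step.
        d = max(d, (i + 1) / n - x, x - i / n)
    return d


# Hypothetical layout A: ten fragments spread evenly over a 100-base reconstruction.
even_starts = [0, 10, 20, 30, 40, 50, 60, 70, 80, 90]
even_length = 100

# Hypothetical layout B: a shorter, overcompressed reconstruction (80 bases) in
# which fragments from a second repeat copy are stacked on the first copy.
collapsed_starts = [0, 10, 20, 30, 20, 30, 40, 50, 60, 70]
collapsed_length = 80

d_even = ks_uniform(even_starts, even_length)
d_collapsed = ks_uniform(collapsed_starts, collapsed_length)

# A smaller statistic means the starts look more like uniform random sampling.
print("even layout:", d_even)
print("collapsed layout:", d_collapsed)
```

In this toy case the shorter layout would win under the shortest-string objective, while the deviation from uniformity favors the longer, uncompressed layout, which is the intuition behind preferring a likelihood-based objective for repetitive targets.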

Year:  1995        PMID: 7497129     DOI: 10.1089/cmb.1995.2.275

Source DB:  PubMed          Journal:  J Comput Biol        ISSN: 1066-5277            Impact factor:   1.479


  57 in total

1.  An Eulerian path approach to DNA fragment assembly.

Authors:  P A Pevzner; H Tang; M S Waterman
Journal:  Proc Natl Acad Sci U S A       Date:  2001-08-14       Impact factor: 11.205

2.  RePS: a sequence assembler that masks exact repeats identified from the shotgun data.

Authors:  Jun Wang; Gane Ka-Shu Wong; Peixiang Ni; Yujun Han; Xiangang Huang; Jianguo Zhang; Chen Ye; Yong Zhang; Jianfei Hu; Kunlin Zhang; Xin Xu; Lijuan Cong; Hong Lu; Xide Ren; Xiaoyu Ren; Jun He; Lin Tao; Douglas A Passey; Jian Wang; Huanming Yang; Jun Yu; Songgang Li
Journal:  Genome Res       Date:  2002-05       Impact factor: 9.043

3.  A graph-theoretic approach to comparing and integrating genetic, physical and sequence-based maps.

Authors:  Immanuel V Yap; David Schneider; Jon Kleinberg; David Matthews; Samuel Cartinhour; Susan R McCouch
Journal:  Genetics       Date:  2003-12       Impact factor: 4.562

4.  Using the miraEST assembler for reliable and automated mRNA transcript assembly and SNP detection in sequenced ESTs.

Authors:  Bastien Chevreux; Thomas Pfisterer; Bernd Drescher; Albert J Driesel; Werner E G Müller; Thomas Wetter; Sándor Suhai
Journal:  Genome Res       Date:  2004-05-12       Impact factor: 9.043

5.  De novo repeat classification and fragment assembly.

Authors:  Pavel A Pevzner; Haixu Tang; Glenn Tesler
Journal:  Genome Res       Date:  2004-09       Impact factor: 9.043

6.  New Generations: Sequencing Machines and Their Computational Challenges.

Authors:  David C Schwartz; Michael S Waterman
Journal:  J Comput Sci Technol       Date:  2010-01-01       Impact factor: 1.571

7.  Paired de Bruijn graphs: a novel approach for incorporating mate pair information into genome assemblers.

Authors:  Paul Medvedev; Son Pham; Mark Chaisson; Glenn Tesler; Pavel Pevzner
Journal:  J Comput Biol       Date:  2011-10-14       Impact factor: 1.479

8.  Genome sequencing in microfabricated high-density picolitre reactors.

Authors:  Marcel Margulies; Michael Egholm; William E Altman; Said Attiya; Joel S Bader; Lisa A Bemben; Jan Berka; Michael S Braverman; Yi-Ju Chen; Zhoutao Chen; Scott B Dewell; Lei Du; Joseph M Fierro; Xavier V Gomes; Brian C Godwin; Wen He; Scott Helgesen; Chun Heen Ho; Chun He Ho; Gerard P Irzyk; Szilveszter C Jando; Maria L I Alenquer; Thomas P Jarvie; Kshama B Jirage; Jong-Bum Kim; James R Knight; Janna R Lanza; John H Leamon; Steven M Lefkowitz; Ming Lei; Jing Li; Kenton L Lohman; Hong Lu; Vinod B Makhijani; Keith E McDade; Michael P McKenna; Eugene W Myers; Elizabeth Nickerson; John R Nobile; Ramona Plant; Bernard P Puc; Michael T Ronan; George T Roth; Gary J Sarkis; Jan Fredrik Simons; John W Simpson; Maithreyan Srinivasan; Karrie R Tartaro; Alexander Tomasz; Kari A Vogt; Greg A Volkmer; Shally H Wang; Yong Wang; Michael P Weiner; Pengguang Yu; Richard F Begley; Jonathan M Rothberg
Journal:  Nature       Date:  2005-07-31       Impact factor: 49.962

9.  Genome assembly reborn: recent computational challenges.

Authors:  Mihai Pop
Journal:  Brief Bioinform       Date:  2009-05-29       Impact factor: 11.622

10.  Assembly errors cause false tandem duplicate regions in the chicken (Gallus gallus) genome sequence.

Authors:  Qu Zhang; Niclas Backström
Journal:  Chromosoma       Date:  2013-11-10       Impact factor: 4.316
