Toward a statistically explicit understanding of de novo sequence assembly.

Mark Howison, Felipe Zapata, Casey W Dunn.

Abstract

MOTIVATION: Draft de novo genome assemblies are now available for many organisms. These assemblies are point estimates of the true genome sequences. Each is a specific hypothesis, drawn from among many alternative hypotheses, of the sequence of a genome. Assembly uncertainty, the inability to distinguish between multiple alternative assembly hypotheses, can be due to real variation between copies of the genome in the sample, errors and ambiguities in the sequenced data and assumptions and heuristics of the assemblers. Most assemblers select a single assembly according to ad hoc criteria, and do not yet report and quantify the uncertainty of their outputs. Those assemblers that do report uncertainty take different approaches to describing multiple assembly hypotheses and the support for each.
RESULTS: Here we review and examine the problem of representing and measuring uncertainty in assemblies. A promising recent development is the implementation of assemblers that are built according to explicit statistical models. Some new assembly methods, for example, estimate and maximize assembly likelihood. These advances, combined with technical advances in the representation of alternative assembly hypotheses, will lead to a more complete and biologically relevant understanding of assembly uncertainty. This will in turn facilitate the interpretation of downstream analyses and tests of specific biological hypotheses.

Year:  2013        PMID: 24021385     DOI: 10.1093/bioinformatics/btt525

Source DB:  PubMed          Journal:  Bioinformatics        ISSN: 1367-4803            Impact factor:   6.937


  10 in total

1.  The Global Invertebrate Genomics Alliance (GIGA): developing community resources to study diverse invertebrate genomes.

Authors:  Heather Bracken-Grissom; Allen G Collins; Timothy Collins; Keith Crandall; Daniel Distel; Casey Dunn; Gonzalo Giribet; Steven Haddock; Nancy Knowlton; Mark Martindale; Mónica Medina; Charles Messing; Stephen J O'Brien; Gustav Paulay; Nicolas Putnam; Timothy Ravasi; Greg W Rouse; Joseph F Ryan; Anja Schulze; Gert Wörheide; Maja Adamska; Xavier Bailly; Jesse Breinholt; William E Browne; M Christina Diaz; Nathaniel Evans; Jean-François Flot; Nicole Fogarty; Matthew Johnston; Bishoy Kamel; Akito Y Kawahara; Tammy Laberge; Dennis Lavrov; François Michonneau; Leonid L Moroz; Todd Oakley; Karen Osborne; Shirley A Pomponi; Adelaide Rhodes; Scott R Santos; Nori Satoh; Robert W Thacker; Yves Van de Peer; Christian R Voolstra; David Mark Welch; Judith Winston; Xin Zhou
Journal:  J Hered       Date:  2014 Jan-Feb       Impact factor: 2.645

2.  ALLMAPS: robust scaffold ordering based on multiple maps.

Authors:  Haibao Tang; Xingtan Zhang; Chenyong Miao; Jisen Zhang; Ray Ming; James C Schnable; Patrick S Schnable; Eric Lyons; Jianguo Lu
Journal:  Genome Biol       Date:  2015-01-13       Impact factor: 13.583

Review 3.  Beyond the whole genome consensus: unravelling of PRRSV phylogenomics using next generation sequencing technologies.

Authors:  Zen H Lu; Alan L Archibald; Tahar Ait-Ali
Journal:  Virus Res       Date:  2014-10-12       Impact factor: 3.303

4.  ILP-based maximum likelihood genome scaffolding.

Authors:  James Lindsay; Hamed Salooti; Ion Măndoiu; Alex Zelikovsky
Journal:  BMC Bioinformatics       Date:  2014-09-10       Impact factor: 3.169

5.  Evaluation of de novo transcriptome assemblies from RNA-Seq data.

Authors:  Bo Li; Nathanael Fillmore; Yongsheng Bai; Mike Collins; James A Thomson; Ron Stewart; Colin N Dewey
Journal:  Genome Biol       Date:  2014-12-21       Impact factor: 13.583

6.  Extensive error in the number of genes inferred from draft genome assemblies.

Authors:  James F Denton; Jose Lugo-Martinez; Abraham E Tucker; Daniel R Schrider; Wesley C Warren; Matthew W Hahn
Journal:  PLoS Comput Biol       Date:  2014-12-04       Impact factor: 4.475

7.  Bayesian genome assembly and assessment by Markov chain Monte Carlo sampling.

Authors:  Mark Howison; Felipe Zapata; Erika J Edwards; Casey W Dunn
Journal:  PLoS One       Date:  2014-06-26       Impact factor: 3.240

8.  Automated ensemble assembly and validation of microbial genomes.

Authors:  Sergey Koren; Todd J Treangen; Christopher M Hill; Mihai Pop; Adam M Phillippy
Journal:  BMC Bioinformatics       Date:  2014-05-03       Impact factor: 3.169

Review 9.  Structural and Computational Biology in the Design of Immunogenic Vaccine Antigens.

Authors:  Lassi Liljeroos; Enrico Malito; Ilaria Ferlenghi; Matthew James Bottomley
Journal:  J Immunol Res       Date:  2015-10-07       Impact factor: 4.818

10.  Genomic repeats, misassembly and reannotation: a case study with long-read resequencing of Porphyromonas gingivalis reference strains.

Authors:  Luis Acuña-Amador; Aline Primot; Edouard Cadieu; Alain Roulet; Frédérique Barloy-Hubler
Journal:  BMC Genomics       Date:  2018-01-16       Impact factor: 3.969

