Genome assembly quality: assessment and improvement using the neutral indel model.

Stephen Meader, LaDeana W Hillier, Devin Locke, Chris P Ponting, Gerton Lunter.

Abstract

We describe a statistical and comparative-genomic approach for quantifying error rates of genome sequence assemblies. The method exploits not substitutions but the pattern of insertions and deletions (indels) in genome-scale alignments for closely related species. Using two- or three-way alignments, the approach estimates the amount of aligned sequence containing clusters of nucleotides that were wrongly inserted or deleted during sequencing or assembly. Thus, the method is well-suited to assessing fine-scale sequence quality within single assemblies, between different assemblies of a single set of reads, and between genome assemblies for different species. When applying this approach to four primate genome assemblies, we found that average gap error rates per base varied considerably, by up to sixfold. As expected, bacterial artificial chromosome (BAC) sequences contained lower, but still substantial, predicted numbers of errors, arguing for caution in regarding BACs as the epitome of genome fidelity. We then mapped short reads, at approximately 10-fold statistical coverage, from a Bornean orangutan onto the Sumatran orangutan genome assembly originally constructed from capillary reads. This resulted in a reduced gap error rate and a separation of error-prone from high-fidelity sequence. Over 5000 predicted indel errors in protein-coding sequence were corrected in a hybrid assembly. Our approach contributes a new fine-scale quality metric for assemblies that should facilitate development of improved genome sequencing and assembly strategies.
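The abstract does not give the computation itself. As a rough illustration of the raw signal that an indel-based quality measure works from, the sketch below tallies the lengths of ungapped stretches between indels in a pairwise alignment. The function name `intergap_segment_lengths` and the toy alignment rows are hypothetical, chosen only for this example. It is not the authors' implementation, and the statistical model that would be fitted to such lengths is not reproduced here.

```python
from collections import Counter


def intergap_segment_lengths(row_a: str, row_b: str, gap: str = "-") -> list[int]:
    """Return lengths of maximal runs of alignment columns with no gap in either row.

    Hypothetical helper for illustration. Runs at the two ends of the alignment
    are included even though they are bounded by a gap on one side only.
    """
    if len(row_a) != len(row_b):
        raise ValueError("aligned rows must have equal length")

    lengths = []
    run = 0
    for a, b in zip(row_a, row_b):
        if a == gap or b == gap:
            # A gap column closes the current ungapped run, if there is one.
            if run:
                lengths.append(run)
            run = 0
        else:
            run += 1
    if run:
        lengths.append(run)
    return lengths


# Toy two-way alignment (made-up data, not from the paper).
row_a = "ACGT--ACGTACG-TTA"
row_b = "ACGTTTAC-TACGATTA"

lengths = intergap_segment_lengths(row_a, row_b)
histogram = Counter(lengths)  # segment length -> number of segments
```

On real genome-scale alignments the histogram would be built over millions of columns; an unusual concentration of closely spaced indels is the kind of pattern the abstract refers to as clusters of wrongly inserted or deleted nucleotides.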

Year:  2010        PMID: 20305016      PMCID: PMC2860169          DOI: 10.1101/gr.096966.109

Source DB:  PubMed          Journal:  Genome Res        ISSN: 1088-9051            Impact factor:   9.043


