
Optimization of de novo transcriptome assembly from next-generation sequencing data.

Yann Surget-Groba, Juan I Montoya-Burgos.

Abstract

Transcriptome analysis has important applications in many biological fields. However, assembling a transcriptome without a known reference remains a challenging task requiring algorithmic improvements. We present two methods for substantially improving transcriptome de novo assembly. The first method relies on the observation that the use of a single k-mer length by current de novo assemblers is suboptimal for assembling transcriptomes in which the sequence coverage of transcripts is highly heterogeneous. We present the Multiple-k method, in which various k-mer lengths are used for de novo transcriptome assembly. We demonstrate its good performance by de novo assembly of a published next-generation transcriptome sequence data set of Aedes aegypti, using the existing genome to check the accuracy of our method. The second method relies on the use of a reference proteome to improve the de novo assembly. We developed the Scaffolding using Translation Mapping (STM) method, which uses mapping against the closest available reference proteome to scaffold contigs that map onto the same protein. In a controlled experiment using simulated data, we show that the STM method considerably improves the assembly, with few errors. We applied these two methods to assemble the transcriptome of the non-model catfish Loricaria gr. cataphracta. Using the Multiple-k and STM methods, the assembly increases in contiguity and in gene identification, showing that our methods clearly improve assembly quality and can be widely used. The new methods were used to successfully assemble the transcripts of the core set of genes regulating tooth development in vertebrates, whereas classic de novo assembly failed.
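
The point about a single k-mer length can be illustrated with the standard relationship between read coverage and k-mer coverage: a read of length L contains L - k + 1 k-mers, so expected k-mer coverage falls as k grows. The sketch below is only an illustration of that arithmetic, not the authors' implementation. The function names, the read length, the coverage values, and the minimum k-mer coverage threshold are all hypothetical choices made for the example.

```python
from typing import Iterable, List


def kmer_coverage(read_coverage: float, read_length: int, k: int) -> float:
    """Expected k-mer coverage for a transcript.

    A read of length L yields L - k + 1 k-mers, so k-mer coverage is
    read_coverage * (L - k + 1) / L.
    """
    if not 1 <= k <= read_length:
        raise ValueError("k must be between 1 and the read length")
    return read_coverage * (read_length - k + 1) / read_length


def usable_k_values(
    read_coverage: float,
    read_length: int,
    k_values: Iterable[int],
    min_kmer_coverage: float,
) -> List[int]:
    """Return the k values whose expected k-mer coverage meets a threshold.

    The threshold is a hypothetical stand-in for whatever coverage an
    assembler needs to extend a contig; it is not taken from the paper.
    """
    return [
        k
        for k in k_values
        if kmer_coverage(read_coverage, read_length, k) >= min_kmer_coverage
    ]


if __name__ == "__main__":
    # Hypothetical values: 36 bp reads, three candidate k-mer lengths.
    candidate_ks = [19, 25, 31]
    for label, coverage in [("low-coverage transcript", 10.0),
                            ("high-coverage transcript", 100.0)]:
        ks = usable_k_values(coverage, 36, candidate_ks, min_kmer_coverage=3.0)
        print(label, "-> usable k values:", ks)
```

Under these assumed numbers, a weakly covered transcript clears the threshold only at the shorter k values, while a highly covered one clears it at every k. This is consistent with the abstract's argument that no single k suits all transcripts when coverage is heterogeneous.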

Year:  2010        PMID: 20693479      PMCID: PMC2945192          DOI: 10.1101/gr.103846.109

Source DB:  PubMed          Journal:  Genome Res        ISSN: 1088-9051            Impact factor:   9.043


