Comprehensive evaluation of de novo transcriptome assembly programs and their effects on differential gene expression analysis.

Sufang Wang, Michael Gribskov.

Abstract

Motivation: With the decreased cost of RNA-Seq, an increasing number of non-model organisms have been sequenced. Due to the lack of reference genomes, de novo transcriptome assembly is required. However, there is limited systematic research evaluating the quality of de novo transcriptome assemblies and how the assembly quality influences downstream analysis.
Results: We used two authentic RNA-Seq datasets from Arabidopsis thaliana and produced transcriptome assemblies using eight programs (BinPacker, Bridger, IDBA-tran, Oases-Velvet, SOAPdenovo-Trans, SSP, Trans-ABySS and Trinity) with a series of k-mer sizes from 25 to 71. We measured assembly quality in terms of reference genome base and gene coverage, transcriptome assembly base coverage, number of chimeras and number of recovered full-length transcripts. SOAPdenovo-Trans performed best in base coverage, while Trans-ABySS performed best in gene coverage and number of recovered full-length transcripts. In terms of chimeric sequences, BinPacker and Oases-Velvet performed worst, while IDBA-tran, SOAPdenovo-Trans, Trans-ABySS and Trinity produced fewer chimeras across all single k-mer assemblies. In differential gene expression analysis, about 70% of the significantly differentially expressed genes (DEGs) were the same whether the reference genome or a de novo assembly was used. We further identified four reasons for the differences in significant DEGs between the reference genome and de novo transcriptome assemblies: incomplete annotation, exon-level differences, transcript fragmentation and incorrect gene annotation. These findings suggest that de novo assembly is beneficial even when a reference genome is available.
Availability and Implementation: The software used in this study is publicly available at the authors' websites.
Contact: gribskov@purdue.edu
Supplementary Information: Supplementary data are available at Bioinformatics online.

Year:  2017        PMID: 28172640     DOI: 10.1093/bioinformatics/btw625

Source DB:  PubMed          Journal:  Bioinformatics        ISSN: 1367-4803            Impact factor:   6.937

