
Differentially expressed genes from RNA-Seq and functional enrichment results are affected by the choice of single-end versus paired-end reads and stranded versus non-stranded protocols.

Susan M Corley, Karen L MacKenzie, Annemiek Beverdam, Louise F Roddam, Marc R Wilkins.

Abstract

BACKGROUND: RNA-Seq is now widely used as a research tool. A choice must be made between paired-end (PE) and single-end (SE) sequencing, and between strand-specific and non-strand-specific (NS) library preparation kits. To date there has been no analysis of how these choices affect the identification of differentially expressed genes (DEGs) between control and treated samples, or downstream functional analysis.
RESULTS: We undertook four mammalian transcriptomics experiments to compare the effect of SE and PE protocols on read mapping, feature counting, identification of DEGs and functional analysis. For three of these experiments we also compared a non-stranded (NS) and a strand-specific approach to mapping the paired-end data. In all four experiments, SE mapping resulted in fewer reads mapped to features and a lower read count per gene. Up to 4.3% of genes in the SE data and up to 12.3% of genes in the NS data had read counts that differed significantly from the PE data. Comparison of DEGs showed that the SE reads produced false positives (average 5%, using voom) and false negatives (average 5%, using voom). These rates increased by a further one or two percentage points with the NS data. Gene ontology (GO) functional enrichment of the DEGs arising from the SE or NS approaches revealed striking differences in the top 20 GO terms, with as little as 40% concordance with the PE results. Caution is therefore advised in interpreting such results. By comparison, gene set enrichment analysis results were consistent overall.
CONCLUSIONS: A strand-specific protocol should be used in library preparation to generate the most reliable and accurate expression profile. Ideally, PE reads should also be used, particularly for transcriptome assembly. Although SE reads produce a DEG list with around 5% false positives and 5% false negatives, SE sequencing can substantially reduce sequencing cost, and this saving could be used to increase the number of biological replicates, thereby increasing the power of the experiment. Because SE reads, when used with gene set enrichment analysis, can generate accurate biological results, this may be a desirable trade-off.
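
The comparisons summarised in the abstract (false positives and false negatives in a DEG list, and concordance of the top 20 GO terms) can be illustrated with a minimal Python sketch. This is not the authors' pipeline. It assumes the PE result is treated as the reference, that false positives are expressed as a percentage of the test (SE or NS) DEG list, and that false negatives are expressed as a percentage of the reference (PE) DEG list; the abstract does not state the denominators used. The function names and the toy gene and term identifiers are hypothetical.

```python
def compare_deg_lists(reference, test):
    """Compare a test DEG list (e.g. SE) against a reference DEG list (e.g. PE).

    Assumed definitions (not taken from the paper):
      false positive = gene called in the test list but not in the reference
      false negative = gene called in the reference but missed by the test
    """
    reference, test = set(reference), set(test)
    false_pos = test - reference
    false_neg = reference - test
    return {
        "false_positive_pct": 100.0 * len(false_pos) / len(test) if test else 0.0,
        "false_negative_pct": 100.0 * len(false_neg) / len(reference) if reference else 0.0,
    }


def top_n_concordance(reference_terms, test_terms, n=20):
    """Percentage of the top-n reference terms that also appear in the top-n test terms.

    Both inputs are assumed to be lists already ranked by enrichment significance.
    """
    ref_top = set(reference_terms[:n])
    test_top = set(test_terms[:n])
    return 100.0 * len(ref_top & test_top) / n


if __name__ == "__main__":
    # Toy data only: 20 hypothetical PE DEGs; the SE list misses one and adds one.
    pe_degs = [f"G{i}" for i in range(1, 21)]
    se_degs = pe_degs[:19] + ["G99"]
    print(compare_deg_lists(pe_degs, se_degs))

    # Toy ranked GO term lists sharing 8 of their top 20 terms.
    pe_terms = [f"T{i}" for i in range(20)]
    se_terms = [f"T{i}" for i in range(12, 32)]
    print(top_n_concordance(pe_terms, se_terms))
```

Using sets makes the comparison independent of gene order, which suits DEG lists; the GO comparison keeps the ranked lists because only the top n terms matter.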

Keywords:  Differential expression; Non-strand-specific; Paired-end reads; RNA-Seq; Single-end reads; Strand-specific; Transcriptomics

Year:  2017        PMID: 28535780      PMCID: PMC5442695          DOI: 10.1186/s12864-017-3797-0

Source DB:  PubMed          Journal:  BMC Genomics        ISSN: 1471-2164            Impact factor:   3.969


