Xu Liu, Jialu Zhao, Liting Xue, Tian Zhao, Wei Ding, Yuying Han, Haihong Ye.
Abstract
BACKGROUND: RNA-seq technology has been applied ever more widely, and the number of available analysis procedures has grown in recent years. Selecting an appropriate workflow has become an important issue for researchers in the field.
Keywords: Ballgown; Cuffdiff; DESeq2; Differentially expressed analysis; Differentially expressed genes (DEGs); RNA-seq; Sleuth; Transcriptome data analysis
Year: 2022 PMID: 35337265 PMCID: PMC8957167 DOI: 10.1186/s12864-022-08465-0
Source DB: PubMed Journal: BMC Genomics ISSN: 1471-2164 Impact factor: 3.969
Fig. 1. A schematic overview of the evaluation workflow. (a) The six RNA-seq analysis procedures compared in this article: (1) HISAT2-HTseq-DESeq2; (2) HISAT2-HTseq-edgeR; (3) HISAT2-HTseq-limma; (4) HISAT2-StringTie-Ballgown; (5) HISAT2-Cufflinks-Cuffdiff; (6) Kallisto-Sleuth. (b) Time and memory consumed by each software tool.
Fig. 2. Evaluation and comparison of gene expression levels in the mouse dataset. (a) MA plots of the different analytical procedures. (b) Comparison of gene expression levels estimated by different quantification software without any filtering. (c) Comparison of gene expression levels obtained with different quantification software after removing the genes with the top and bottom 10% of expression levels. The numbers in brackets represent the procedure number. R² was calculated via Pearson's correlation analysis.
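The comparison described in Fig. 2c (Pearson's R² after dropping the most extreme expression values) can be illustrated with a minimal Python sketch. The caption does not say how genes were ranked for trimming, so ranking by the first tool's values is an assumption here, and the function names and the toy expression values are hypothetical.

```python
def pearson_r2(x, y):
    """Squared Pearson correlation coefficient of two equal-length lists."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return (sxy * sxy) / (sxx * syy)


def trim_extremes(x, y, fraction=0.10):
    """Drop gene pairs ranked in the top and bottom `fraction` by x.

    Assumption: genes are ranked by the first tool's estimate (x);
    the source caption does not specify the ranking rule.
    """
    order = sorted(range(len(x)), key=lambda i: x[i])
    k = int(len(x) * fraction)
    keep = order[k:len(x) - k]
    return [x[i] for i in keep], [y[i] for i in keep]


# Hypothetical expression estimates for ten genes from two tools.
tool_a = [0.1, 1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 50.0]
tool_b = [5.0, 1.1, 2.1, 2.9, 4.2, 5.0, 5.9, 7.1, 8.0, 20.0]

r2_all = pearson_r2(tool_a, tool_b)
trimmed_a, trimmed_b = trim_extremes(tool_a, tool_b)
r2_trimmed = pearson_r2(trimmed_a, trimmed_b)
print(f"R2 (all genes): {r2_all:.3f}; R2 (trimmed): {r2_trimmed:.3f}")
```

In this toy data the two extreme genes disagree between tools, so trimming raises R²; real datasets need not behave the same way.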
Fig. 3. Evaluation and comparison of the fold change (FC) of gene expression levels obtained with different analytical procedures for the mouse dataset. (a-h) Comparison of log2FC obtained with different procedures. R² and p were calculated via Pearson's correlation analysis. (i) Set visualization of DEG numbers when |log2FC| > 1 was used as the threshold to define DEGs. The numbers in brackets represent the procedure number.
Fig. 4. Evaluation and comparison of p values from different analytical procedures for the mouse dataset. (a-h) Comparison of p values obtained from different procedures. R² and p were calculated via Pearson's correlation analysis. (i) Set visualization of DEG numbers when p < 0.01 was used as the threshold to define DEGs. The numbers in brackets represent the procedure number.
Fig. 5. Number of DEGs defined by combining FC and p value for the mouse dataset. (a) Line chart of the total number of DEGs estimated by different procedures with |log2FC| > 1 and different p values. (b) Histogram of the number of DEGs per interval estimated by different procedures with |log2FC| > 1 and different p values. (c) Line chart of the total number of DEGs estimated by different procedures with p < 0.01 and different |log2FC|. (d) Histogram of the number of DEGs per interval estimated by different procedures with p < 0.01 and different |log2FC|. (e) Set visualization of DEG numbers when |log2FC| > 1 and p < 0.01. The numbers in brackets represent the procedure number.
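The DEG definition used in Fig. 5 (|log2FC| > 1 combined with a p-value cutoff) can be sketched in a few lines of Python. This is only an illustration: the gene names and values are hypothetical, and reading the "interval" counts as the difference between totals at successive cutoffs is an assumption, not something the caption states.

```python
def count_degs(results, lfc_cutoff=1.0, p_cutoff=0.01):
    """Count genes with |log2FC| > lfc_cutoff and p < p_cutoff.

    `results` maps a gene name to a (log2FC, p value) pair.
    """
    return sum(
        1
        for log2fc, p in results.values()
        if abs(log2fc) > lfc_cutoff and p < p_cutoff
    )


# Hypothetical output of one differential expression procedure.
results = {
    "geneA": (2.3, 0.0004),
    "geneB": (-1.7, 0.008),
    "geneC": (0.4, 0.0001),   # significant p, but fold change too small
    "geneD": (1.2, 0.03),     # large fold change, but p above 0.01
    "geneE": (-3.1, 0.2),
}

# Total DEGs at increasingly permissive p cutoffs (line-chart style).
p_cutoffs = [0.001, 0.01, 0.05]
totals = [count_degs(results, p_cutoff=c) for c in p_cutoffs]

# DEGs gained between successive cutoffs (histogram style; assumed meaning).
intervals = [totals[0]] + [b - a for a, b in zip(totals, totals[1:])]

print(dict(zip(p_cutoffs, totals)))
print(dict(zip(p_cutoffs, intervals)))
```

The same function covers panels c and d by holding `p_cutoff` at 0.01 and varying `lfc_cutoff` instead.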
Fig. 6. Correlation of log2FC for the same genes between the different procedures and qRT-PCR experiments. A total of 21 genes were assessed. VR, verification rate. R² and p were calculated via Pearson's correlation analysis.
Fig. 7. Guidelines for researchers to decide the appropriate procedure for RNA-seq analysis.