Samarendra Das, Anil Rai, Michael L Merchant, Matthew C Cave, Shesh N Rai.
Abstract
Single-cell RNA-sequencing (scRNA-seq) is a recent high-throughput sequencing technique for studying gene expression at the cell level. Differential expression (DE) analysis is a major downstream analysis of scRNA-seq data. DE analysis in the presence of noise from different sources remains a key challenge in scRNA-seq. Earlier practice addressed this by borrowing methods from bulk RNA-seq, which are based on non-zero differences in the average expression of genes across cell populations. Later, several methods specifically designed for scRNA-seq were developed. To provide guidance on choosing an appropriate tool or developing a new one, it is necessary to comprehensively study the performance of DE analysis methods. Here, we review and classify the different DE approaches adapted from bulk RNA-seq practice as well as those specifically designed for scRNA-seq. We also evaluate the performance of 19 widely used methods in terms of 13 performance metrics on 11 real scRNA-seq datasets. Our findings suggest that some bulk RNA-seq methods are quite competitive with the single-cell methods and that their performance depends on the underlying models, DE test statistic(s), and data characteristics. Further, it is difficult to identify a single method that performs best globally under any individual performance criterion. However, the multi-criteria and combined-data analysis indicates that DECENT and EBSeq are the best options for DE analysis. The results also reveal the similarities among the tested methods in terms of detecting common DE genes. Our evaluation provides guidelines for selecting the tool that performs best under particular experimental settings in the context of scRNA-seq.
Keywords: TOPSIS; combined data settings; differential expression; multiple criteria decision making; scRNA-seq; statistical models
Year: 2021 PMID: 34946896 PMCID: PMC8701051 DOI: 10.3390/genes12121947
Source DB: PubMed Journal: Genes (Basel) ISSN: 2073-4425 Impact factor: 4.096
Description of scRNA-seq DE methods.
| SN. | Methods | Distribution | Utility | Input | DE Test Stat. | Runtime | Availability | Ref. |
|---|---|---|---|---|---|---|---|---|
| 01 | DESeq2 | NB | Bulk-cell | Counts | Wald | Low | Bioconductor | [ |
| 02 | edgeR | NB | Bulk-cell | Counts | QLF, LRT | Low | Bioconductor | [ |
| 03 | LIMMA | Linear Model | Bulk-cell | Norm. | Bayesian Wald | Low | Bioconductor | [ |
| 04 | DEGseq | Poisson | Bulk-cell | Counts | Z-score | Low | Bioconductor | [ |
| 05 | T-test | T-test | General | Norm. | t stat. | Low | CRAN | [ |
| 06 | Wilcoxon | Wilcoxon test | General | Counts | Wilcox | Low | CRAN | [ |
| 07 | baySeq | NB | Bulk-cell | Counts | Posterior prob. | Low | Bioconductor | [ |
| 08 | NBseq | NB | Bulk-cell | Counts | Fisher’s stat. | Low | CRAN | [ |
| 09 | EBSeq | NB | Bulk-cell | Counts | Bayesian | High | Bioconductor | [ |
| 10 | Cuffdiff | Beta-NB | Bulk-cell | sam file | | Low | Linux | [ |
| 11 | SAMseq | NP | Bulk-cell | Counts | Wilcox | Low | CRAN | [ |
| 12 | Ballgown | Linear Model | Bulk-cell | Counts | Lin. Mod. test stat. | Medium | Bioconductor | [ |
| 13 | TSPM | Poisson | Bulk-cell | Counts | | Low | R code | [ |
| 14 | ROTS | NP | Bulk-cell | Norm. | | Medium | Bioconductor | [ |
| 15 | metagenomeSeq | | Bulk-cell | | | Medium | | [ |
| 16 | SCDE | Mixture Model | Single-cell | UMI | Bayesian Stat. | High | Bioconductor | [ |
| 17 | scDD | Multi-Modal Bayesian | Single-cell | Norm. | Bayesian Stat. | High | Bioconductor | [ |
| 18 | D3E | NP | Single-cell | UMI | Cramér-von Mises test/KS test | High | GitHub, Python | [ |
| 19 | BPSC | Beta-Poisson | Single-cell | UMI | LRT | Medium | GitHub | [ |
| 20 | MAST | Hurdle Model | Single-cell | Norm. | LRT | Medium | Bioconductor | [ |
| 21 | Monocle2 | GAM | Single-cell | Norm. | LRT | Medium | Bioconductor | [ |
| 22 | DEsingle | ZINB | Single-cell | UMI | LRT | High | Bioconductor, GitHub | [ |
| 23 | DECENT | ZINB, Beta-Binomial | Single-cell | UMI | LRT | High | GitHub | [ |
| 24 | DESCEND | Poisson | Single-cell | UMI | | High | GitHub | [ |
| 25 | EMDomics | NP | Single-cell | Norm. | Euclidean distance | High | Bioconductor | [ |
| 26 | Sincera | NP | Single-cell | Norm. | Welch’s t-stat. (LS) | High | GitHub | [ |
| 27 | ZIAQ | Logistic Regression | Single-cell | Norm. | Fisher’s stat. | Medium | GitHub | [ |
| 28 | sigEMD | NP | Single-cell | Norm. | Distance measure | High | GitHub | [ |
| 29 | TASC | Logistic, Poisson | Single-cell | UMI | LRT | High | GitHub | [ |
| 30 | ZINB-Wave | ZINB | Single-cell | UMI | LRT | High | Bioconductor, GitHub | [ |
| 31 | SwarnSeq | ZINB | Single-cell | UMI | LRT | High | GitHub | [ |
| 32 | NODES | Wilcoxon test | Single-cell | Norm. | Wilcox | Medium | *Dropbox | [ |
| 33 | BASiCS | Poisson-Gamma | Single-cell | Norm. | Posterior prob. | High | Bioconductor | [ |
| 34 | NBID | NB | Single-cell | UMI | LRT | Medium | R code | [ |
| 35 | tradeSeq | GAM | Single-cell | UMI | Wald | Medium | GitHub | [ |
| 36 | SC2P | ZIP | Single-cell | UMI | Posterior prob. | High | GitHub | [ |
Bulk-cell: bulk RNA-seq; Single-cell: single-cell RNA-seq; NB: Negative Binomial; ZINB: Zero Inflated Negative Binomial; ZIP: Zero Inflated Poisson; NP: Non-Parametric; UMI: Unique Molecular Identifier counts; Norm.: Normalized (Continuous); Ref.: Reference; GAM: Generalized Additive Model; LRT: Likelihood Ratio Test; LS: Large Samples; SS: Small Samples; KS: Kolmogorov-Smirnov test; QLF: Quasi-Likelihood F-test; Wilcox: Wilcoxon signed rank/Mann–Whitney.
Figure 1. Schematic representation of the classification of scRNA-seq DE methods and tools. Schematic overview illustrating the breakdown of the methods that can be adapted from RNA-seq practice to fit scRNA-seq data (Class I) as well as those specifically designed for single-cell data (Classes II, III), based on the different distribution models that they fit. Example tools belonging to each category are listed in pink boxes. (1) Methods that use external RNA spike-ins; (2) parametric approaches that can handle multi-modality of the data.
Classification of methods used for detection of DE genes in scRNA-seq data.
| SN. | Classes | Descriptions |
|---|---|---|
| 01 | Class I | Underlying Models: |
| | | Features: |
| | | Limitations: |
| | | Tools: |
| 02 | Class II | Methods: |
| | | Features: |
| | | Limitations: |
| | | Tools: |
| 03 | Class III | Models: |
| | | Features: |
| | | Limitations: |
| | | Tools: |
SN.: Serial Number; DE: Differentially Expressed; GLM: Generalized Linear Model; GAM: Generalized Additive Model.
List of the scRNA-seq datasets used in this study.
| SN. | Data | Description | Accession | #Genes | #Cells | Ref. |
|---|---|---|---|---|---|---|
| 01 | Tung | Human induced Pluripotent stem cell lines | GSE77288 | 18938 | 576 | [ |
| 02 | Islam | single-cell (ES and MEF) transcriptional landscape by highly multiplex RNA-Seq | GSE29087 | 22928 | 92 | [ |
| 03 | Soumillon1 | Differentiating adipose cells by scRNA-Seq (Days 1 vs. 2) | GSE53638 | 23895 | 1835 | [ |
| 04 | Soumillon2 | Differentiating adipose cells by scRNA-Seq (Days 1 vs. 3) | GSE53638 | 23895 | 2268 | [ |
| 05 | Soumillon3 | Differentiating adipose cells by scRNA-Seq (Days 2 vs. 3) | GSE53638 | 23895 | 1613 | [ |
| 06 | Klein | Mouse ES cells | GSE65525 | 24174 | 1481 | [ |
| 07 | Gierahn | Single-cell RNA sequencing experiments of HEK cells | GSE92495 | 24176 | 1453 | [ |
| 08 | Chen | scRNA-seq of Rh41 using 10x Genomics | GSE113660 | 33694 | 7261 | [ |
| 09 | Savas | Breast cancer cells using 10x Genomics | GSE110686 | 33694 | 6311 | [ |
| 10 | Grun | Mouse ES single cells using CEL-seq technique | GSE54695 | 12467 | 320 | [ |
| 11 | Ziegenhain | scRNA-seq of mouse ES cells | GSE75790 | 39016 | 583 | [ |
SN: Serial Number; #Genes: number of genes/transcripts, #Cells: number of cells; ES: Embryonic Stem; MEF: Mouse Embryonic Fibroblast; Ref.: Reference.
Figure 2. Comparative evaluation of DE methods through performance metrics for the Soumillon2 data. The tested DE methods are evaluated on the Soumillon2 data using performance metrics such as TP, FP, TPR, FPR, PPR, FDR, Accuracy, F1-score, and AUROC. The 19 tested methods are shown on the X-axis. Violin plots compare the tested methods by (A) TP; (B) FP; (C) PPR; (D) TPR; (E) FPR; (F) Accuracy; (G) FDR; (H) F1-score; and (I) AUROC. Each violin plot shows the full distribution of a performance metric computed for each tested method. The box represents the inter-quartile range, the horizontal line represents the median, and the bars on the boxes show 1.5× the inter-quartile range.
Evaluation of DE methods based on performance evaluation metrics for Soumillon2 scRNA-seq data.
| Method | TP | FP | TN | FN | TPR | FPR | FDR | PPR | NPV | ACC | F1 | AUROC |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| BPSC | 1478 | 1522 | 11113 | 1522 | 0.493 | 0.120 | 0.507 | 0.493 | 0.880 | 0.805 | 0.493 | 0.722 |
| DECENT | 1674 | 1326 | 11309 | 1326 | 0.558 | 0.105 | 0.442 | 0.558 | 0.895 | 0.830 | 0.558 | 0.857 |
| DEGseq | 1228 | 1772 | 10863 | 1772 | 0.409 | 0.140 | 0.591 | 0.409 | 0.860 | 0.773 | 0.409 | 0.585 |
| DESeqNB | 1653 | 1347 | 11288 | 1347 | 0.551 | 0.107 | 0.449 | 0.551 | 0.893 | 0.828 | 0.551 | 0.811 |
| DESeqLRT | 1247 | 1753 | 10882 | 1753 | 0.416 | 0.139 | 0.584 | 0.416 | 0.861 | 0.776 | 0.416 | 0.666 |
| DEsingle | 1428 | 1572 | 11063 | 1572 | 0.476 | 0.124 | 0.524 | 0.476 | 0.876 | 0.799 | 0.476 | 0.709 |
| EBSeq | 1110 | 1890 | 10745 | 1890 | 0.370 | 0.150 | 0.630 | 0.370 | 0.850 | 0.758 | 0.370 | 0.654 |
| edgeRLRT | 1537 | 1463 | 11172 | 1463 | 0.512 | 0.116 | 0.488 | 0.512 | 0.884 | 0.813 | 0.512 | 0.729 |
| edgeRQLF | 1506 | 1494 | 11141 | 1494 | 0.502 | 0.118 | 0.498 | 0.502 | 0.882 | 0.809 | 0.502 | 0.758 |
| EMDomics | 844 | 2156 | 10479 | 2156 | 0.281 | 0.171 | 0.719 | 0.281 | 0.829 | 0.724 | 0.281 | 0.594 |
| LIMMA | 1612 | 1388 | 11247 | 1388 | 0.537 | 0.110 | 0.463 | 0.537 | 0.890 | 0.822 | 0.537 | 0.768 |
| MAST | 1337 | 1663 | 10972 | 1663 | 0.446 | 0.132 | 0.554 | 0.446 | 0.868 | 0.787 | 0.446 | 0.685 |
| Monocle | 1454 | 1546 | 11089 | 1546 | 0.485 | 0.122 | 0.515 | 0.485 | 0.878 | 0.802 | 0.485 | 0.691 |
| NBSeq | 1497 | 1503 | 11132 | 1503 | 0.499 | 0.119 | 0.501 | 0.499 | 0.881 | 0.808 | 0.499 | 0.718 |
| NODES | 1173 | 1827 | 10808 | 1827 | 0.391 | 0.145 | 0.609 | 0.391 | 0.855 | 0.766 | 0.391 | 0.620 |
| ROTS | 1170 | 1830 | 10805 | 1830 | 0.390 | 0.145 | 0.610 | 0.390 | 0.855 | 0.766 | 0.390 | 0.618 |
| scDD | 697 | 2303 | 10332 | 2303 | 0.232 | 0.182 | 0.768 | 0.232 | 0.818 | 0.705 | 0.232 | 0.547 |
| T-test | 1501 | 1499 | 11136 | 1499 | 0.500 | 0.119 | 0.500 | 0.500 | 0.881 | 0.808 | 0.500 | 0.719 |
| Wilcox | 1413 | 1587 | 11048 | 1587 | 0.471 | 0.126 | 0.529 | 0.471 | 0.874 | 0.797 | 0.471 | 0.695 |
TP: True Positives; FP: False Positives; TN: True Negatives; FN: False Negatives; TPR: True Positive Rate; FPR: False Positive Rate; FDR: False Discovery Rate; PPR: Positive Prediction Rate; NPV: Negative Predictive Value; ACC: Accuracy; F1: F1-score; AUROC: Area Under the Receiver Operating Characteristic curve. Values are computed for a DE gene set of size 3000.
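The rate columns of the table follow from the four confusion-matrix counts. The sketch below is a minimal illustration using the standard definitions; the function name `confusion_metrics` is hypothetical, and reading PPR as TP/(TP + FP) is an assumption inferred from the tabulated values (PPR = 1 − FDR in every row), not a definition stated in this section. AUROC is omitted because it cannot be derived from a single set of counts.

```python
def confusion_metrics(tp, fp, tn, fn):
    """Return rate metrics derived from confusion-matrix counts.

    Standard textbook definitions; PPR is treated as precision,
    which is an assumption consistent with the table values.
    """
    precision = tp / (tp + fp)   # reported as PPR in the table (assumed)
    recall = tp / (tp + fn)      # TPR
    return {
        "TPR": recall,
        "FPR": fp / (fp + tn),
        "FDR": fp / (tp + fp),
        "PPR": precision,
        "NPV": tn / (tn + fn),
        "ACC": (tp + tn) / (tp + fp + tn + fn),
        "F1": 2 * precision * recall / (precision + recall),
    }


# BPSC row of the Soumillon2 table: TP=1478, FP=1522, TN=11113, FN=1522
bpsc = confusion_metrics(1478, 1522, 11113, 1522)
for name, value in bpsc.items():
    print(f"{name}: {value:.3f}")
```

Rounded to three decimals, these values agree with the BPSC row (TPR 0.493, FPR 0.120, FDR 0.507, PPR 0.493, NPV 0.880, ACC 0.805, F1 0.493). Because each method reports exactly 3000 genes and the reference set also has 3000 genes, FP equals FN, so TPR, PPR, and F1 coincide in every row.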
Figure 3. Performance evaluation of methods under the MCDM setup for the Soumillon2 data. Comparative performance analysis of the 19 tested methods was carried out through the TOPSIS approach under the MCDM setup on the Soumillon2 dataset. MCDM-TOPSIS analysis was carried out under (i) multiple criteria including the runtime criterion and (ii) multiple criteria excluding the runtime criterion. The X-axis represents the tested methods and the Y-axis represents TOPSIS scores. (A) MCDM-TOPSIS analysis of the DE methods based on 12 performance metrics, excluding the runtime criterion; (B) MCDM-TOPSIS analysis of the methods based on 13 performance metrics, including the runtime criterion; (C) average similarities between the evaluated DE methods based on the 13 performance metrics. The dendrogram was obtained by average-linkage hierarchical clustering based on the matrix of average values of the performance metrics over all gene sets; (D) similarity analysis among the methods based on their ability to detect common DE genes. The p-values for each comparison were computed through the Binomial test (Supplementary Document S12). Significant proportions (at the 1% level of significance) of common genes shared among the methods are shown in various colors, and empty white cells represent non-significant values.
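For readers unfamiliar with TOPSIS, the scoring idea can be sketched as follows. This is a minimal textbook version, not the authors' implementation: the function name `topsis`, the equal criterion weights, the vector normalisation, and the choice of only three criteria (TPR, FDR, AUROC) for four methods from the Soumillon2 table are all assumptions made for illustration.

```python
import math


def topsis(matrix, benefit, weights=None):
    """Score alternatives (rows) over criteria (columns) with TOPSIS.

    benefit[j] is True when larger is better for criterion j and
    False when smaller is better (a cost criterion such as FDR).
    """
    n_crit = len(matrix[0])
    if weights is None:
        weights = [1.0 / n_crit] * n_crit  # assumption: equal weights
    # Step 1: vector-normalise each column and apply the weights.
    norms = [math.sqrt(sum(row[j] ** 2 for row in matrix)) for j in range(n_crit)]
    v = [[weights[j] * row[j] / norms[j] for j in range(n_crit)] for row in matrix]
    # Step 2: build the ideal-best and ideal-worst points.
    cols = list(zip(*v))
    best = [max(c) if benefit[j] else min(c) for j, c in enumerate(cols)]
    worst = [min(c) if benefit[j] else max(c) for j, c in enumerate(cols)]
    # Step 3: relative closeness to the ideal-best point, in [0, 1].
    scores = []
    for row in v:
        d_best = math.dist(row, best)
        d_worst = math.dist(row, worst)
        scores.append(d_worst / (d_best + d_worst))
    return scores


# TPR, FDR, AUROC taken from the Soumillon2 table (illustrative subset).
methods = ["DECENT", "LIMMA", "EBSeq", "scDD"]
matrix = [
    [0.558, 0.442, 0.857],
    [0.537, 0.463, 0.768],
    [0.370, 0.630, 0.654],
    [0.232, 0.768, 0.547],
]
scores = topsis(matrix, benefit=[True, False, True])
ranking = sorted(zip(methods, scores), key=lambda pair: pair[1], reverse=True)
for method, score in ranking:
    print(f"{method}: {score:.3f}")
```

In this small subset DECENT is best and scDD is worst on all three chosen columns, so they coincide with the ideal-best and ideal-worst points. The ordering it prints reflects only these three columns of one dataset and should not be read as the ranking reported in the figure.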
Figure 4. Combined data analysis of methods based on F1-score through the TOPSIS technique. Comparative performance evaluation of the DE methods was performed based on F1-score through the TOPSIS approach under the multi-data setup. This analysis was performed on the data matrix holding the F1-scores of the tested methods across the 11 considered datasets. (A) Results from the TOPSIS analysis of the tested methods; (B) similarity analysis of the tested methods through clustering. The dendrogram was obtained by average-linkage hierarchical clustering based on the matrix of average F1-scores (over DE gene sets). (C) Correlation analysis of the methods through rank correlation. The correlation plot was obtained by Spearman’s rank correlation method using the matrix of average (over DE gene sets) F1-scores across all datasets. (D) Weighted similarity analysis of the tested methods (Supplementary Document S12) based on their ability to detect common genes. Nodes represent tested methods and edges represent the degree of similarity shared between pairs of methods. Red edges (scores > 0.7) indicate the highest similarity, blue edges indicate high similarity ([0.5, 0.7]), green edges indicate low similarity ([0.2, 0.5]), and magenta edges indicate the lowest degree of similarity ([0, 0.2]) among the methods. Nodes in the network are abbreviated as EMD: EMDomics; LIM: LIMMA; EBS: EBSeq; scD: scDD; DEG: DEGseq; DSN: DESeqNB; NOD: NODES; BPS: BPSC; NBS: NBSeq; Wil: Wilcox; Mon: Monocle; DEC: DECENT; MAS: MAST; Tst: T-test; DEs: DEsingle; ROT: ROTS; DSL: DESeqLRT; edQ: edgeRQLF; and edL: edgeRLRT.
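Spearman’s rank correlation, used for panel (C), is the Pearson correlation of the ranks. The sketch below is a plain-Python illustration with hypothetical helper names (`average_ranks`, `spearman`). The figure correlates methods using their F1-scores across datasets; since per-dataset F1-scores are not listed in this section, the example instead correlates the F1 and AUROC columns of the Soumillon2 table for six methods, purely to show the computation.

```python
import math


def average_ranks(values):
    """Rank values from 1 to n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman's rho: Pearson correlation computed on the ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(x)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = math.sqrt(sum((a - mx) ** 2 for a in rx))
    sy = math.sqrt(sum((b - my) ** 2 for b in ry))
    return cov / (sx * sy)


# F1 and AUROC from the Soumillon2 table, in the order:
# DECENT, DESeqNB, LIMMA, edgeRLRT, edgeRQLF, scDD
f1 = [0.558, 0.551, 0.537, 0.512, 0.502, 0.232]
auroc = [0.857, 0.811, 0.768, 0.729, 0.758, 0.547]
rho = spearman(f1, auroc)
print(f"Spearman rho = {rho:.3f}")
```

Only edgeRLRT and edgeRQLF swap places between the two rankings, so the sum of squared rank differences is 2 and rho = 1 − 6·2/(6·35) ≈ 0.943.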
Figure 5. Combined data analysis of methods based on FDR and Accuracy metrics through the TOPSIS technique. Comparative performance evaluation of the DE methods was performed based on the FDR and Accuracy metrics through the TOPSIS technique under the multi-data setup. This analysis was performed on the matrix holding the FDR and Accuracy scores of the tested methods across the 11 considered datasets. (A) TOPSIS analysis of the tested methods based on FDR across the real datasets. Bars show TOPSIS scores and the number on each bar represents the method's rank; (B) similarity analysis of the methods based on FDR through clustering. The dendrogram was obtained by average-linkage hierarchical clustering based on the matrix of average FDR scores across all datasets. (C) Similarity analysis of the methods based on FDR through correlation. The correlation plot was obtained by Spearman’s rank correlation method. White boxes show non-significant correlations; (D) TOPSIS analysis of the methods based on Accuracy across the real datasets. Bars show TOPSIS scores and the number on each bar represents the method's rank; (E) similarity analysis of the methods based on Accuracy through clustering. (F) Similarity analysis of the methods based on Accuracy through correlation. The correlation plot was obtained by Spearman’s rank correlation for the Accuracy scores across all datasets. The strength of correlation is shown through color intensity.