Shaoqiang Zhang1, Linjuan Xie1, Yaxuan Cui1, Benjamin R Carone2, Yong Chen2.
Abstract
The detection of differentially expressed genes (DEGs) is one of the most important computational challenges in the analysis of single-cell RNA sequencing (scRNA-seq) data. However, because of the high heterogeneity and dropout noise inherent in scRNA-seq data, methods that model gene expression levels with a single distribution struggle to detect DEGs, leaving considerable room to improve the precision and robustness of current DEG detection methods. Here, we propose DEGman, a new method that combines several candidate distributions with the Bhattacharyya distance. DEGman automatically selects the best-fitting distributions of gene expression levels and then detects DEGs by permutation testing of the Bhattacharyya distances between the distributions selected for two cell groups. Compared with several popular DEG analysis tools on both large-scale simulated data and real scRNA-seq data, DEGman shows an overall improvement in the balance of sensitivity and precision. We applied DEGman to scRNA-seq data of TRAP; Ai14 mouse neurons to detect fear-memory-related genes that are significantly differentially expressed between neurons with and without fear memory. DEGman detected well-known fear-memory-related genes and many novel candidates. Interestingly, we found 25 DEGs shared by five neuron clusters that are functionally enriched for synaptic vesicles, indicating that the coupled dynamics of synaptic vesicles across neurons plays a critical role in remote memory formation. The proposed method leverages diverse distributions in DEG analysis and performs better on composite scRNA-seq datasets in real applications.
Keywords: Bhattacharyya distance; differentially expressed gene; memory formation; scRNA-seq
Year: 2022 PMID: 36009024 PMCID: PMC9405875 DOI: 10.3390/biom12081130
Source DB: PubMed Journal: Biomolecules ISSN: 2218-273X
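The abstract describes scoring each gene by the Bhattacharyya distance between the expression distributions of two cell groups and assessing that distance by permutation testing. The following is a minimal sketch of that idea, not DEGman's implementation: it substitutes empirical histograms of integer counts for the best-fitting parametric distributions that DEGman selects, and the function names, the one-sided test, the add-one p-value correction, and the toy counts are all assumptions made for illustration.

```python
import math
import random
from collections import Counter


def empirical_pmf(counts):
    """Empirical probability mass function of integer expression counts."""
    n = len(counts)
    return {value: freq / n for value, freq in Counter(counts).items()}


def bhattacharyya_distance(p, q):
    """BD = -ln(sum_x sqrt(p(x) * q(x))) for two discrete distributions."""
    bc = sum(math.sqrt(p[x] * q[x]) for x in p.keys() & q.keys())
    # Disjoint supports give a coefficient of 0, i.e. an infinite distance.
    return math.inf if bc == 0.0 else max(0.0, -math.log(bc))


def permutation_p_value(group_a, group_b, n_perm=1000, seed=0):
    """Observed BD and a one-sided permutation p-value for one gene."""
    rng = random.Random(seed)
    observed = bhattacharyya_distance(
        empirical_pmf(group_a), empirical_pmf(group_b)
    )
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    hits = 0
    for _ in range(n_perm):
        # Shuffle cell labels, then recompute the distance on the relabeled groups.
        rng.shuffle(pooled)
        perm_bd = bhattacharyya_distance(
            empirical_pmf(pooled[:n_a]), empirical_pmf(pooled[n_a:])
        )
        if perm_bd >= observed:
            hits += 1
    # Add-one correction (an assumption here) keeps the estimate above zero.
    return observed, (hits + 1) / (n_perm + 1)


# Hypothetical toy counts for one gene in two cell groups.
group_a = [0, 0, 1, 0, 2, 0, 1, 0, 0, 1] * 3
group_b = [3, 4, 2, 5, 3, 4, 6, 3, 2, 4] * 3
bd, p_value = permutation_p_value(group_a, group_b)
print(f"BD = {bd:.3f}, permutation p-value = {p_value:.4f}")
```

In a full analysis this test would be repeated per gene and the p-values adjusted for multiple testing; the abstract does not state which adjustment is used, so none is shown here.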
Figure 1. Overview of the DEGman method and its use in detecting fear-memory-related genes. (a) Workflow of the DEGman method. (b) Application of DEGman to detect memory-related genes in mouse neurons with fear memory. BD: Bhattacharyya distance. NUMI: normalized UMI. ACC: anterior cingulate cortex.
Figure 2. Performance of DEGman. (a) ROC curves and AUC values of nine DEG analysis tools on simulated data. (b) F1-scores of DEGman on simulated data with different thresholds of Bhattacharyya distance. (c) F1-scores of DEGman, DESeq2 and DEsingle at different dropout levels.
Performance comparison of ten tools on the simulated data. Adjusted p-value < 0.05.
| Tools | Average DEGs | Average TPs | Sensitivity | Precision | F1-Score | Time (s) |
|---|---|---|---|---|---|---|
| DEGman | 1697.3 | 1591.1 | 0.796 | 0.937 | 0.861 | 567.59 |
| DEsingle | 1609.2 | 1514.3 | 0.757 | 0.941 | 0.839 | 1581.48 |
| SigEMD | 1458.6 | 1226.4 | 0.613 | 0.841 | 0.709 | 3006.17 |
| DESeq2 | 1411.2 | 1335.8 | 0.668 | 0.947 | 0.783 | 88.77 |
| scDD | 1236.4 | 1092.6 | 0.546 | 0.884 | 0.675 | 3344.19 |
| Monocle2 | 4883.5 | 1672.3 | 0.836 | 0.342 | 0.486 | 191.71 |
| edgeR | 1254.3 | 1163.2 | 0.582 | 0.927 | 0.715 | 25.41 |
| singleCellHaystack | 32 | 32 | 0.016 | 1.000 | 0.031 | 8.11 |
| glmmTMB | 1192.2 | 1045.6 | 0.523 | 0.877 | 0.655 | 1687.76 |
| NEBULA-HL | 1334.3 | 1194.2 | 0.597 | 0.895 | 0.714 | 94.34 |
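The sensitivity, precision and F1-score columns above are related to the DEG and TP counts by standard definitions, which the short sketch below spells out. The function name is hypothetical, and the count of 2000 true DEGs per simulated dataset is an assumption inferred from the reported ratio of TPs to sensitivity; it is not stated in this excerpt.

```python
def detection_metrics(true_positives, called_degs, true_degs):
    """Return (sensitivity, precision, F1) from raw detection counts."""
    sensitivity = true_positives / true_degs   # TP / all true DEGs
    precision = true_positives / called_degs   # TP / all called DEGs
    total = precision + sensitivity
    # F1 is the harmonic mean of precision and sensitivity.
    f1 = 2 * precision * sensitivity / total if total > 0 else 0.0
    return sensitivity, precision, f1


# DEGman row of the table; true_degs=2000 is an inferred assumption.
sens, prec, f1 = detection_metrics(
    true_positives=1591.1, called_degs=1697.3, true_degs=2000
)
print(round(sens, 3), round(prec, 3), round(f1, 3))
```

Under that assumption the rounded values agree with the DEGman row (0.796, 0.937, 0.861), and the same call can be used to check the other rows.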
Numbers of predicted DEGs, and of TPs among the 1000 gold-standard genes, for the ten tools, together with the numbers of DEGs detected in negative-control real data (FPs). Adjusted p-value < 0.05.
| Tools | TPs | DEGs | FPs of 10,000 Genes | FP Rate |
|---|---|---|---|---|
| DEGman | 808 | 8175 | 5 | 0.0005 |
| DEsingle | 779 | 8242 | 4 | 0.0004 |
| SigEMD | 488 | 3702 | 51 | 0.0051 |
| DESeq2 | 695 | 8437 | 19 | 0.0019 |
| scDD | 351 | 2638 | 5 | 0.0005 |
| Monocle2 | 765 | 8674 | 917 | 0.0917 |
| edgeR | 580 | 4447 | 0 | 0 |
| singleCellHaystack | 238 | 1739 | 0 | 0 |
| glmmTMB | 417 | 3652 | 121 | 0.0121 |
| NEBULA-HL | 349 | 3947 | 114 | 0.0114 |
Figure 3. Comparative analysis of DEGs among mouse neurons. (a) UMAP dimensional reduction of the clustering result for all Snap25+ neurons (3530 cells). (b) Numbers of TRAPed FR and NF cells in the five largest clusters. (c) Venn diagram of the DEGs called by DEGman, DEsingle, glmmTMB and NEBULA-HL at an adjusted p-value threshold of 0.05. (d) Venn diagram of the DEGs called by DEGman, DESeq2, scDD and edgeR at an adjusted p-value threshold of 0.05.
Figure 4. Expression heatmap and cellular locations of 25 fear-memory-related genes. (a) Heatmap of the 25 genes detected by DEGman in the first cell cluster. Hierarchical clustering analysis identifies three gene groups. * denotes genes previously reported in RA-DEGs. (b) Illustration of gene locations. TF: transcription factor.