Kefei Liu, Li Shen, Hui Jiang.
Abstract
BACKGROUND: A fundamental problem in RNA-seq data analysis is to identify genes or exons that are differentially expressed across experimental conditions based on read counts. Because RNA-seq measurements are relative, between-sample normalization of read counts is an essential step in differential expression (DE) analysis. In most existing methods, normalization is performed before the DE analysis. Recently, Jiang and Zhan proposed a statistical method that introduces sample-specific normalization parameters into a joint model, allowing simultaneous normalization and DE analysis of log-transformed RNA-seq data. An ℓ0 penalty is used to yield a sparse solution that selects a subset of DE genes. In their work, the experimental conditions are restricted to be categorical.
Keywords: Between-sample normalization; Differential expression; RNA-seq; ℓ0-regularized regression
Year: 2019 PMID: 31787074 PMCID: PMC6886201 DOI: 10.1186/s12859-019-3070-4
Source DB: PubMed Journal: BMC Bioinformatics ISSN: 1471-2105 Impact factor: 3.169
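The joint model summarized in the abstract can be sketched as below. This is an illustrative formulation under assumed notation, not one quoted from this record: $y_{ij}$ denotes the log-transformed read count of gene $i$ in sample $j$, $x_j$ the experimental condition of sample $j$, $\alpha_i$ a gene-specific intercept, $\beta_i$ the gene-specific condition effect (nonzero only for DE genes), $d_j$ the sample-specific normalization parameter, and $\lambda_i$ a tuning parameter for the ℓ0 penalty.

```latex
% Assumed notation; a sketch of a joint normalization + DE model with an l0 penalty.
\begin{align}
  y_{ij} &= \alpha_i + \beta_i x_j + d_j + \varepsilon_{ij},
    \qquad i = 1,\dots,m,\; j = 1,\dots,n, \\
  (\hat\alpha, \hat\beta, \hat d) &= \arg\min_{\alpha,\beta,d}\;
    \frac{1}{2}\sum_{i=1}^{m}\sum_{j=1}^{n}
      \bigl(y_{ij} - \alpha_i - \beta_i x_j - d_j\bigr)^2
    + \sum_{i=1}^{m} \lambda_i \,\mathbf{1}(\beta_i \neq 0).
\end{align}
```

Under this sketch, genes with $\hat\beta_i \neq 0$ are the ones selected as DE, and the $d_j$ are estimated jointly with the gene effects rather than in a separate normalization step.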
Models and parameters for synthetic data generation
| length of gene | |
| other log scaling factors of gene | |
| log-fold change for non-DE genes | |
| log-fold change for up-regulated DE genes | |
| log-fold change for down-regulated DE genes | |
| covariates for sample | |
| library size of sample | |
| other log scaling factors of sample | |
| mean read counts of gene |
The AUCs of edgeR-robust, DESeq2, limma-voom, ELMSeq and rSeqRobust based on log-normally distributed data
| DE (%) | Up (%) | edgeR-robust | DESeq2 | limma-voom | ELMSeq | rSeqRobust |
|---|---|---|---|---|---|---|
| 1 | 50 | 0.9734 | 0.9736 | 0.9717 | 0.9757 | |
| (0.0091) | (0.009) | (0.0065) | (0.0064) | |||
| 1 | 75 | 0.954 | 0.9531 | 0.9343 | 0.935 | |
| (0.0113) | (0.0141) | (0.0153) | (0.018) | |||
| 1 | 100 | 0.9525 | 0.9531 | 0.9476 | 0.9594 | |
| (0.0144) | (0.0137) | (0.0151) | (0.0139) | |||
| 10 | 50 | 0.958 | 0.9623 | 0.9573 | 0.9627 | |
| (0.0079) | (0.0069) | (0.0069) | (0.0067) | |||
| 10 | 75 | 0.9707 | 0.9632 | 0.964 | 0.9668 | |
| (0.0057) | (0.0045) | (0.0061) | (0.0057) | |||
| 10 | 100 | 0.9403 | 0.9272 | 0.94 | 0.9435 | |
| (0.0107) | (0.0142) | (0.0128) | (0.0128) | |||
| 30 | 50 | 0.9689 | 0.9696 | 0.9665 | 0.9678 | |
| (0.0056) | (0.0048) | (0.0053) | (0.0052) | |||
| 30 | 75 | 0.9318 | 0.9265 | 0.9458 | 0.9564 | |
| (0.0113) | (0.0116) | (0.0096) | (0.0078) | |||
| 30 | 100 | 0.8771 | 0.8693 | 0.8753 | 0.9372 | |
| (0.0153) | (0.0091) | (0.0145) | (0.0149) | |||
| 50 | 50 | 0.9466 | 0.954 | 0.9425 | 0.9557 | |
| (0.0092) | (0.0059) | (0.0087) | (0.0071) | |||
| 50 | 75 | 0.9099 | 0.906 | 0.9145 | 0.9401 | |
| (0.0167) | (0.0123) | (0.0178) | (0.0135) | |||
| 50 | 100 | 0.7083 | 0.7236 | 0.7197 | 0.879 | |
| (0.022) | (0.0291) | (0.0242) | (0.0195) | |||
| 70 | 50 | 0.967 | 0.9655 | 0.9652 | 0.9655 | |
| (0.0039) | (0.0034) | (0.0036) | (0.0031) | |||
| 70 | 75 | 0.8569 | 0.8351 | 0.8564 | 0.9089 | |
| (0.0193) | (0.0161) | (0.0194) | (0.0118) | |||
| 70 | 100 | 0.4536 | 0.5212 | 0.4893 | 0.4786 | |
| (0.0344) | (0.0296) | (0.018) | (0.037) | |||
| 90 | 50 | 0.953 | 0.9538 | 0.9513 | 0.9512 | |
| (0.0064) | (0.0064) | (0.0081) | (0.0049) | |||
| 90 | 75 | 0.7203 | 0.6918 | 0.7256 | 0.6906 | |
| (0.0239) | (0.0177) | (0.0323) | (0.0167) | |||
| 90 | 100 | 0.2568 | 0.506 | 0.2566 | 0.3516 | |
| (0.0257) | (0.0265) | (0.0278) | (0.0345) |
The sample size is n=20. The variance of the normal distribution is . The table shows the percent of DE genes (DE %), percent of up-regulated genes among all the DE genes (Up %), and the mean AUCs (standard errors in parentheses) for all five methods with 10 simulated replicates. The highest AUC value is shown in bold
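The summary statistics reported in these tables (mean AUC with its standard error over simulated replicates) can be computed along the following lines. This is a minimal sketch, not the authors' code: the function names `auc` and `summarize` are hypothetical, the AUC is computed with the standard rank-based (Mann-Whitney) definition, and the standard error is assumed to be the sample standard deviation divided by the square root of the number of replicates.

```python
from math import sqrt
from statistics import mean, stdev


def auc(scores, labels):
    """Rank-based AUC: fraction of (DE, non-DE) gene pairs in which the
    DE gene has the higher score; ties count as one half."""
    pos = [s for s, y in zip(scores, labels) if y == 1]  # true DE genes
    neg = [s for s, y in zip(scores, labels) if y == 0]  # true non-DE genes
    if not pos or not neg:
        raise ValueError("need at least one DE and one non-DE gene")
    wins = 0.0
    for p in pos:
        for q in neg:
            if p > q:
                wins += 1.0
            elif p == q:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def summarize(aucs):
    """Mean AUC and standard error (assumed: sample SD / sqrt(k))
    over k >= 2 simulated replicates."""
    k = len(aucs)
    return mean(aucs), stdev(aucs) / sqrt(k)


# Toy example with four genes: labels mark the true DE status (1 = DE).
toy_auc = auc([0.9, 0.2, 0.8, 0.1], [1, 1, 0, 0])  # 3 of 4 pairs ordered correctly
```

In the toy example, three of the four (DE, non-DE) pairs are ranked correctly, so `toy_auc` is 0.75. The pairwise loop is quadratic in the number of genes; for genome-scale data a sort-based rank computation would be the more practical choice.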
The AUCs of edgeR-robust, DESeq2, limma-voom, ELMSeq and rSeqRobust based on negative-binomially distributed data
| DE (%) | Up (%) | edgeR-robust | DESeq2 | limma-voom | ELMSeq | rSeqRobust |
|---|---|---|---|---|---|---|
| 1 | 50 | 0.9585 | 0.9635 | 0.9636 | 0.9622 | |
| (0.0105) | (0.0105) | (0.0106) | (0.0112) | |||
| 1 | 75 | 0.9644 | 0.9696 | 0.967 | 0.9711 | |
| (0.0114) | (0.0098) | (0.0105) | (0.0095) | |||
| 1 | 100 | 0.9711 | 0.977 | 0.9765 | 0.9754 | |
| (0.0083) | (0.0061) | (0.005) | (0.0056) | |||
| 10 | 50 | 0.9576 | 0.9604 | 0.9613 | 0.9647 | |
| (0.005) | (0.0035) | (0.0039) | (0.0042) | |||
| 10 | 75 | 0.9551 | 0.957 | 0.9559 | 0.9613 | |
| (0.0054) | (0.0075) | (0.0061) | (0.0075) | |||
| 10 | 100 | 0.9469 | 0.9496 | 0.9474 | 0.9611 | |
| (0.0105) | (0.008) | (0.0103) | (0.0059) | |||
| 30 | 50 | 0.9509 | 0.9528 | 0.949 | 0.9582 | |
| (0.0083) | (0.0045) | (0.0101) | (0.0043) | |||
| 30 | 75 | 0.9413 | 0.9428 | 0.9406 | 0.9664 | |
| (0.0093) | (0.0056) | (0.0069) | (0.0026) | |||
| 30 | 100 | 0.8689 | 0.8629 | 0.879 | 0.9128 | |
| (0.015) | (0.0106) | (0.0168) | (0.0113) | |||
| 50 | 50 | 0.9599 | 0.9618 | 0.9543 | 0.962 | |
| (0.0081) | (0.006) | (0.0086) | (0.006) | |||
| 50 | 75 | 0.8834 | 0.8902 | 0.892 | 0.9279 | |
| (0.0123) | (0.0131) | (0.01) | (0.0132) | |||
| 50 | 100 | 0.7465 | 0.7003 | 0.7425 | 0.8802 | |
| (0.0302) | (0.0174) | (0.0318) | (0.012) | |||
| 70 | 50 | 0.9565 | 0.9629 | 0.956 | 0.9636 | |
| (0.0049) | (0.0036) | (0.0054) | (0.0026) | |||
| 70 | 75 | 0.8164 | 0.7922 | 0.8264 | 0.8847 | |
| (0.0187) | (0.0066) | (0.0248) | (0.0107) | |||
| 70 | 100 | 0.4964 | 0.488 | 0.5462 | 0.4482 | |
| (0.0323) | (0.0227) | (0.0315) | (0.0224) | |||
| 90 | 50 | 0.9503 | 0.9463 | 0.9584 | 0.9478 | |
| (0.0064) | (0.0077) | (0.0037) | (0.0062) | |||
| 90 | 75 | 0.6657 | 0.6272 | 0.6879 | 0.5992 | |
| (0.0205) | (0.0124) | (0.0226) | (0.0131) | |||
| 90 | 100 | 0.2455 | 0.4752 | 0.2905 | 0.2826 | |
| (0.0317) | (0.0225) | (0.0316) | (0.0214) |
The table shows the percent of DE genes (DE %), percent of up-regulated genes among all the DE genes (Up %), and the mean AUCs (standard errors in parentheses) for all five methods with 10 simulated replicates. The highest AUC value is shown in bold
The AUCs of edgeR-robust, DESeq2, limma-voom, ELMSeq and rSeqRobust based on log-normally distributed data
| DE (%) | Up (%) | edgeR-robust | DESeq2 | limma-voom | ELMSeq | rSeqRobust |
|---|---|---|---|---|---|---|
| 1 | 50 | 0.9349 | 0.9087 | 0.9243 | 0.9277 | |
| (0.0222) | (0.0265) | (0.0154) | (0.0156) | |||
| 1 | 75 | 0.9349 | 0.9423 | 0.9359 | 0.9315 | |
| (0.0153) | (0.0125) | (0.015) | (0.0147) | |||
| 1 | 100 | 0.907 | 0.8781 | 0.8498 | 0.8481 | |
| (0.0391) | (0.0456) | (0.0579) | (0.0596) | |||
| 10 | 50 | 0.8743 | 0.8687 | 0.8604 | 0.864 | |
| (0.0177) | (0.0211) | (0.0194) | (0.0192) | |||
| 10 | 75 | 0.9043 | 0.8916 | 0.8751 | 0.8729 | |
| (0.0256) | (0.0276) | (0.0329) | (0.0373) | |||
| 10 | 100 | 0.9217 | 0.8959 | 0.9174 | 0.9194 | |
| (0.0185) | (0.0191) | (0.0233) | (0.0201) | |||
| 30 | 50 | 0.9154 | 0.9111 | 0.8874 | 0.8937 | |
| (0.0141) | (0.0177) | (0.023) | (0.0224) | |||
| 30 | 75 | 0.8762 | 0.8942 | 0.8777 | 0.8862 | |
| (0.0395) | (0.0407) | (0.0458) | (0.0509) | |||
| 30 | 100 | 0.8599 | 0.8431 | 0.8658 | 0.8391 | |
| (0.0201) | (0.0175) | (0.022) | (0.0265) | |||
| 50 | 50 | 0.9018 | 0.9035 | 0.8978 | 0.8914 | |
| (0.0187) | (0.0162) | (0.0162) | (0.0252) | |||
| 50 | 75 | 0.8704 | 0.8681 | 0.8724 | 0.8719 | |
| (0.02) | (0.021) | (0.0182) | (0.027) | |||
| 50 | 100 | 0.7227 | 0.759 | 0.7251 | 0.8133 | |
| (0.0331) | (0.0278) | (0.0291) | (0.036) | |||
| 70 | 50 | 0.8804 | 0.8641 | 0.9004 | 0.8885 | |
| (0.0247) | (0.0348) | (0.0258) | (0.0301) | |||
| 70 | 75 | 0.8073 | 0.8202 | 0.8088 | 0.8747 | |
| (0.0275) | (0.0285) | (0.0241) | (0.0227) | |||
| 70 | 100 | 0.4748 | 0.5097 | 0.4891 | 0.4778 | |
| (0.0507) | (0.0415) | (0.0601) | (0.0614) | |||
| 90 | 50 | 0.8905 | 0.8625 | 0.9094 | 0.8581 | |
| (0.0299) | (0.0322) | (0.0116) | (0.0433) | |||
| 90 | 75 | 0.6897 | 0.6534 | 0.7015 | 0.6706 | |
| (0.0485) | (0.0438) | (0.045) | (0.0379) | |||
| 90 | 100 | 0.2229 | 0.2818 | 0.3102 | 0.411 | |
| (0.04) | (0.0365) | (0.041) | (0.0916) |
The table shows the percent of DE genes (DE %), percent of up-regulated genes among all the DE genes (Up %), and the mean AUCs (standard errors in parentheses) for all five methods with 10 simulated replicates. The highest AUC value is shown in bold
The AUCs of edgeR-robust, DESeq2, limma-voom, ELMSeq and rSeqRobust based on negative-binomially distributed data
| DE (%) | Up (%) | edgeR-robust | DESeq2 | limma-voom | ELMSeq | rSeqRobust |
|---|---|---|---|---|---|---|
| 1 | 50 | 0.8696 | 0.8944 | 0.8686 | 0.8924 | |
| (0.0378) | (0.0175) | (0.0389) | (0.017) | |||
| 1 | 75 | 0.9038 | 0.8961 | 0.9001 | 0.9057 | |
| (0.0146) | (0.0166) | (0.0163) | (0.0162) | |||
| 1 | 100 | 0.9108 | 0.898 | 0.8992 | 0.8933 | |
| (0.0228) | (0.0279) | (0.0223) | (0.0237) | |||
| 10 | 50 | 0.9176 | 0.9141 | 0.9092 | 0.9126 | |
| (0.009) | (0.0091) | (0.0091) | (0.008) | |||
| 10 | 75 | 0.8999 | 0.8994 | 0.8892 | 0.8961 | |
| (0.0099) | (0.0124) | (0.0122) | (0.0108) | |||
| 10 | 100 | 0.8558 | 0.8646 | 0.854 | 0.8651 | |
| (0.0263) | (0.0217) | (0.029) | (0.0258) | |||
| 30 | 50 | 0.9148 | 0.9082 | 0.9046 | 0.9037 | |
| (0.0108) | (0.0126) | (0.0097) | (0.0095) | |||
| 30 | 75 | 0.8963 | 0.9002 | 0.8879 | 0.8935 | |
| (0.0134) | (0.008) | (0.0171) | (0.0126) | |||
| 30 | 100 | 0.8655 | 0.8843 | 0.8489 | 0.8962 | |
| (0.02) | (0.0091) | (0.0244) | (0.0085) | |||
| 50 | 50 | 0.8924 | 0.8804 | 0.8888 | 0.8895 | |
| (0.0146) | (0.0201) | (0.0129) | (0.0123) | |||
| 50 | 75 | 0.8837 | 0.9025 | 0.8761 | 0.8925 | |
| (0.0214) | (0.0095) | (0.0219) | (0.024) | |||
| 50 | 100 | 0.6974 | 0.6906 | 0.6963 | 0.7648 | |
| (0.0255) | (0.0261) | (0.0236) | (0.029) | |||
| 70 | 50 | 0.8985 | 0.8948 | 0.8897 | 0.8806 | |
| (0.0175) | (0.0168) | (0.0144) | (0.0189) | |||
| 70 | 75 | 0.7951 | 0.7845 | 0.806 | 0.8163 | |
| (0.0203) | (0.0094) | (0.0236) | (0.0158) | |||
| 70 | 100 | 0.5673 | 0.4875 | 0.5651 | 0.48 | |
| (0.0271) | (0.0255) | (0.0326) | (0.0261) | |||
| 90 | 50 | 0.8809 | 0.8658 | 0.8841 | 0.8025 | |
| (0.0184) | (0.0233) | (0.0169) | (0.0367) | |||
| 90 | 75 | 0.6859 | 0.6557 | 0.651 | 0.6562 | |
| (0.0422) | (0.032) | (0.0378) | (0.0565) | |||
| 90 | 100 | 0.2348 | 0.3932 | 0.2105 | 0.2978 | |
| (0.0256) | (0.0273) | (0.0355) | (0.0196) |
The table shows the percent of DE genes (DE %), percent of up-regulated genes among all the DE genes (Up %), and the mean AUCs (standard errors in parentheses) for all five methods with 10 simulated replicates. The highest AUC value is shown in bold
The AUCs of edgeR-robust, DESeq2, limma-voom, ELMSeq and rSeqRobust based on log-normally distributed data
| DE (%) | Up (%) | edgeR-robust | DESeq2 | limma-voom | ELMSeq | rSeqRobust |
|---|---|---|---|---|---|---|
| 1 | 50 | 0.9727 | 0.9861 | 0.9864 | 0.9863 | |
| (0.0077) | (0.0066) | (0.0061) | (0.0063) | |||
| 1 | 75 | 0.9951 | 0.9991 | 0.9986 | 0.9991 | |
| (0.0032) | (0.0009) | (0.0009) | (0.0008) | |||
| 1 | 100 | 0.9774 | 0.9892 | 0.9811 | 0.9845 | |
| (0.0089) | (0.0068) | (0.0093) | (0.0135) | |||
| 10 | 50 | 0.9807 | 0.9889 | 0.983 | 0.9847 | |
| (0.0038) | (0.0016) | (0.0026) | (0.0025) | |||
| 10 | 75 | 0.9803 | 0.9856 | 0.9889 | 0.987 | |
| (0.0037) | (0.0027) | (0.0019) | (0.0023) | |||
| 10 | 100 | 0.9601 | 0.9568 | 0.9784 | 0.9763 | |
| (0.0072) | (0.007) | (0.0052) | (0.0073) | |||
| 30 | 50 | 0.9811 | 0.9878 | 0.9854 | 0.9864 | |
| (0.002) | (0.002) | (0.0009) | (0.001) | |||
| 30 | 75 | 0.9321 | 0.946 | 0.9576 | 0.9836 | |
| (0.005) | (0.0036) | (0.0031) | (0.0026) | |||
| 30 | 100 | 0.8313 | 0.7859 | 0.8892 | 0.9725 | |
| (0.0217) | (0.0072) | (0.0171) | (0.0036) | |||
| 50 | 50 | 0.9836 | 0.9856 | 0.9889 | 0.9893 | |
| (0.002) | (0.0013) | (0.0013) | (0.0013) | |||
| 50 | 75 | 0.8518 | 0.8061 | 0.8857 | 0.9787 | |
| (0.0218) | (0.011) | (0.0167) | (0.0024) | |||
| 50 | 100 | 0.5708 | 0.5533 | 0.5863 | 0.896 | |
| (0.0356) | (0.0086) | (0.0223) | (0.0078) | |||
| 70 | 50 | 0.9763 | 0.97 | 0.986 | 0.9871 | |
| (0.0034) | (0.0085) | (0.0022) | (0.0019) | |||
| 70 | 75 | 0.7051 | 0.5986 | 0.7466 | 0.885 | |
| (0.0226) | (0.0139) | (0.0311) | (0.0109) | |||
| 70 | 100 | 0.3702 | 0.5275 | 0.3727 | 0.3825 | |
| (0.0052) | (0.0097) | (0.013) | (0.0018) | |||
| 90 | 50 | 0.9792 | 0.9851 | 0.9766 | 0.9878 | |
| (0.0034) | (0.0027) | (0.0035) | (0.0019) | |||
| 90 | 75 | 0.4242 | 0.5324 | 0.4887 | 0.4061 | |
| (0.0163) | (0.0135) | (0.0205) | (0.0049) | |||
| 90 | 100 | 0.3881 | 0.5456 | 0.3553 | 0.3833 | |
| (0.003) | (0.0119) | (0.0027) | (0.0026) |
The table shows the percent of DE genes (DE %), percent of up-regulated genes among all the DE genes (Up %), and the mean AUCs (standard errors in parentheses) for all five methods with 10 simulated replicates. The highest AUC value is shown in bold
The AUCs of edgeR-robust, DESeq2, limma-voom, ELMSeq and rSeqRobust based on negative-binomially distributed data
| DE (%) | Up (%) | edgeR-robust | DESeq2 | limma-voom | ELMSeq | rSeqRobust |
|---|---|---|---|---|---|---|
| 1 | 50 | 0.9934 | 0.9919 | 0.9922 | 0.9937 | |
| (0.0038) | (0.0043) | (0.0048) | (0.0045) | |||
| 1 | 75 | 0.9933 | 0.9933 | 0.993 | 0.9922 | |
| (0.0033) | (0.0043) | (0.0036) | (0.0047) | |||
| 1 | 100 | 0.9882 | 0.9836 | 0.9867 | 0.9891 | |
| (0.0046) | (0.0057) | (0.0054) | (0.0047) | |||
| 10 | 50 | 0.9866 | 0.9892 | 0.9876 | 0.9895 | |
| (0.0024) | (0.0021) | (0.0024) | (0.0019) | |||
| 10 | 75 | 0.9775 | 0.9803 | 0.9795 | 0.9867 | |
| (0.0037) | (0.0044) | (0.0032) | (0.0024) | |||
| 10 | 100 | 0.9724 | 0.9739 | 0.9788 | 0.9864 | |
| (0.0045) | (0.0046) | (0.0035) | (0.0032) | |||
| 30 | 50 | 0.9838 | 0.9851 | 0.9874 | 0.9878 | |
| (0.0022) | (0.0022) | (0.0017) | (0.0014) | |||
| 30 | 75 | 0.9568 | 0.9601 | 0.9614 | 0.9837 | |
| (0.0058) | (0.0022) | (0.0052) | (0.0023) | |||
| 30 | 100 | 0.8809 | 0.8902 | 0.89 | 0.9837 | |
| (0.0171) | (0.0044) | (0.0143) | (0.0014) | |||
| 50 | 50 | 0.982 | 0.9823 | 0.9867 | 0.9869 | |
| (0.0022) | (0.0027) | (0.0017) | (0.0016) | |||
| 50 | 75 | 0.9178 | 0.8977 | 0.9228 | 0.9799 | |
| (0.008) | (0.0074) | (0.0069) | (0.0015) | |||
| 50 | 100 | 0.5817 | 0.5509 | 0.6413 | 0.9157 | |
| (0.0345) | (0.0104) | (0.027) | (0.0026) | |||
| 70 | 50 | 0.9811 | 0.9807 | 0.9873 | 0.9871 | |
| (0.0026) | (0.0022) | (0.0013) | (0.0014) | |||
| 70 | 75 | 0.7935 | 0.6559 | 0.8258 | 0.9108 | |
| (0.0348) | (0.023) | (0.0306) | (0.0061) | |||
| 70 | 100 | 0.3529 | 0.4508 | 0.3866 | 0.3371 | |
| (0.0082) | (0.0222) | (0.0188) | (0.003) | |||
| 90 | 50 | 0.9842 | 0.9867 | 0.9849 | 0.987 | |
| (0.0023) | (0.0019) | (0.0019) | (0.0015) | |||
| 90 | 75 | 0.5017 | 0.5326 | 0.5683 | 0.4044 | |
| (0.0238) | (0.0121) | (0.0247) | (0.0104) | |||
| 90 | 100 | 0.3403 | 0.5145 | 0.2979 | 0.3167 | |
| (0.0033) | (0.0092) | (0.0021) | (0.003) |
The table shows the percent of DE genes (DE %), percent of up-regulated genes among all the DE genes (Up %), and the mean AUCs (standard errors in parentheses) for all five methods with 10 simulated replicates. The highest AUC value is shown in bold
The computational times (in seconds) of edgeR-robust, DESeq2, limma-voom, ELMSeq and rSeqRobust
| edgeR | DESeq2 | voom | ELMSeq | rSeqRobust | |
|---|---|---|---|---|---|
| 7 | 5.45 | 0.76 | 403.39 | 16.63 | |
| 20 | 9.49 | 1.51 | 987.87 | 21.84 | |
| 200 | 70.68 | 49.30 | 2225.95 | 76.93 |
Percent of DE genes: 10%; percent of up-regulated genes among the DE genes: 50%. The shortest time is shown in bold
Fig. 1 Venn diagram based on the set of differentially expressed genes identified by edgeR, DESeq2, limma-voom, ELMSeq and rSeqRobust