Kyungtaek Park, Jaehoon An, Jungsoo Gim, Minseok Seo, Woojoo Lee, Taesung Park, Sungho Won.
Abstract
BACKGROUND: Transcriptomic profiles can improve our understanding of the molecular basis of phenotypes in biological research, and many statistical methods have been proposed to identify differentially expressed genes (DEGs) between two or more conditions with RNA-seq data. However, statistical analyses of RNA-seq data are often limited by small sample sizes. Global variance estimates of RNA expression levels have therefore been used as prior distributions for gene-specific variance estimates, which makes these methods difficult to generalize to more complicated settings. We propose a Bartlett-Adjusted Likelihood-based LInear mixed model approach (BALLI) to analyze more complicated RNA-seq data. The proposed method estimates the technical and biological variances with a linear mixed-effects model, with and without adjusting for small-sample bias using Bartlett's correction.
Keywords: Bartlett’s correction; Differentially expressed genes; Linear mixed model; RNA sequencing
Year: 2019 PMID: 31266443 PMCID: PMC6604381 DOI: 10.1186/s12864-019-5851-6
Source DB: PubMed Journal: BMC Genomics ISSN: 1471-2164 Impact factor: 3.969
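For orientation, the Bartlett correction named in the abstract and keywords has the following generic textbook form for a likelihood ratio statistic. This is only a sketch of the general construction; the specific correction term used by BALLI is not given in this record, and the symbols $W$, $q$, $n$, and $b$ are introduced here purely for illustration.

```latex
\begin{align}
  W &= 2\left\{\ell(\hat\theta) - \ell(\hat\theta_0)\right\}, \\
  \mathbb{E}[W] &= q\left(1 + \frac{b}{n}\right) + O(n^{-2}), \\
  W^{*} &= \frac{W}{1 + b/n}, \qquad W^{*} \overset{\text{approx.}}{\sim} \chi^2_q .
\end{align}
```

Here $\ell$ is the log-likelihood, $\hat\theta$ and $\hat\theta_0$ are the unrestricted and null-restricted maximum likelihood estimates, $q$ is the number of tested parameters, and $n$ is the sample size. Rescaling $W$ so that its mean matches that of $\chi^2_q$ improves the chi-square approximation in small samples.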
Estimated type-1 error rates with simulation data based on Nigerian people’s RNA-seq data. Type-1 error rates and their 95% confidence intervals were estimated for BALLI, DESeq2, edgeR, LLI, and voom at N = 12, 16, 20, and 24. Type-1 error rates are marked in bold if their 95% confidence interval includes, or lies below, the nominal significance level.
The source table has ten data columns (BALLI, DESeq2, edgeR, LLI, and voom, listed twice) in two row blocks. Many cells are empty in this record, so the remaining entries are listed in source order and cannot be assigned to a method or sample size.

| Row block | Nominal level | Remaining entries: estimate (95% CI) |
|---|---|---|
| 1 | 0.1 | 0.12695 (0.11443, 0.13947); 0.11070 (0.10223, 0.11916); 0.11613 (0.10522, 0.12704) |
| 1 | 0.05 | 0.06636 (0.05767, 0.07506); 0.05762 (0.05218, 0.06305); 0.06072 (0.05355, 0.06790) |
| 1 | 0.01 | 0.01441 (0.01186, 0.01697); 0.01662 (0.01328, 0.01996); 0.01393 (0.01212, 0.01575); 0.01349 (0.01118, 0.01580) |
| 1 | 0.005 | 0.00741 (0.00531, 0.00952); 0.00891 (0.00709, 0.01072); 0.00942 (0.00720, 0.01163); 0.00695 (0.00555, 0.00835); 0.00809 (0.00692, 0.00925); 0.00714 (0.00575, 0.00853) |
| 2 | 0.1 | 0.10962 (0.10028, 0.11896); 0.10874 (0.10024, 0.11724) |
| 2 | 0.05 | 0.05774 (0.05147, 0.06401); 0.05881 (0.05343, 0.06419) |
| 2 | 0.01 | 0.01381 (0.01152, 0.01609); 0.01317 (0.01163, 0.01472) |
| 2 | 0.005 | 0.00789 (0.00643, 0.00934); 0.00728 (0.00636, 0.00819) |
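The bold-font rule in the table caption can be illustrated with a short Python sketch. It assumes a normal-approximation (Wald) interval for a proportion; the record does not say how the published intervals were computed, and the function names `type1_error_with_ci` and `is_controlled` are hypothetical.

```python
import math


def type1_error_with_ci(p_values, alpha, z=1.959963984540054):
    """Empirical type-1 error rate at level alpha with a 95% Wald interval.

    p_values: p values obtained under the null hypothesis (no true DEGs).
    Assumption: a normal-approximation interval, which may differ from
    the interval used in the source.
    """
    n = len(p_values)
    rate = sum(p < alpha for p in p_values) / n
    half_width = z * math.sqrt(rate * (1.0 - rate) / n)
    return rate, rate - half_width, rate + half_width


def is_controlled(lower, alpha):
    """Caption rule: the interval includes alpha or lies entirely below it.

    Both cases reduce to the lower bound not exceeding alpha.
    """
    return lower <= alpha


# An entry reported in the table, 0.12695 (0.11443, 0.13947) at nominal 0.1,
# fails the rule because its lower bound exceeds 0.1.
print(is_controlled(0.11443, 0.1))
```

Under this rule an interval such as (0.09, 0.11) at the 0.1 level would count as controlled, while (0.11443, 0.13947) would not.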
Fig. 1. Estimated powers and precisions with simulation data based on Nigerian people’s RNA-seq data. Statistical powers and precisions of BALLI, DESeq2, edgeR, LLI, and voom were estimated at the FDR-adjusted 0.1 significance level when δ = 0.5σ or 1σ and N = 12, 16, 20, 24, 28, 40, 64, and 68. (a) Estimated power when δ = 0.5σ. (b) Estimated precision when δ = 0.5σ. (c) Estimated power when δ = 1σ. (d) Estimated precision when δ = 1σ.
Estimated type-1 error rates with RNA-seq data simulated from a negative binomial distribution. Type-1 error rates and their 95% confidence intervals were estimated for BALLI, DESeq2, edgeR, LLI, and voom at N = 12, 16, 20, and 24. Type-1 error rates are marked in bold if their 95% confidence interval includes, or lies below, the nominal significance level.
The source table has ten data columns (BALLI, DESeq2, edgeR, LLI, and voom, listed twice) in two row blocks. Many cells are empty in this record, so the remaining entries are listed in source order and cannot be assigned to a method or sample size.

| Row block | Nominal level | Remaining entries: estimate (95% CI) |
|---|---|---|
| 1 | 0.1 | 0.12172 (0.11984, 0.12360); 0.13198 (0.13030, 0.13367); 0.12028 (0.11911, 0.12145); 0.11853 (0.11729, 0.11977) |
| 1 | 0.05 | 0.05604 (0.05492, 0.05716); 0.07063 (0.06926, 0.07200); 0.05616 (0.05528, 0.05703); 0.06345 (0.06239, 0.06452) |
| 1 | 0.01 | 0.01267 (0.01209, 0.01326); 0.01858 (0.01798, 0.01919); 0.01276 (0.01225, 0.01328); 0.01469 (0.01419, 0.01519) |
| 1 | 0.005 | 0.00549 (0.00517, 0.00581); 0.00689 (0.00650, 0.00729); 0.01043 (0.00992, 0.01094); 0.00543 (0.00515, 0.00572); 0.00682 (0.00653, 0.00711); 0.00803 (0.00759, 0.00846) |
| 2 | 0.1 | 0.11844 (0.11718, 0.11971); 0.11073 (0.10925, 0.11221); 0.11846 (0.11686, 0.12005); 0.10867 (0.10747, 0.10987) |
| 2 | 0.05 | 0.05748 (0.05625, 0.05871); 0.05780 (0.05662, 0.05898); 0.06050 (0.05953, 0.06147); 0.05518 (0.05427, 0.05608) |
| 2 | 0.01 | 0.01271 (0.01204, 0.01338); 0.01328 (0.01267, 0.01388); 0.01242 (0.01203, 0.01281); 0.01212 (0.01167, 0.01257) |
| 2 | 0.005 | 0.00536 (0.00501, 0.00572); 0.00688 (0.00647, 0.00730); 0.00707 (0.00661, 0.00753); 0.00662 (0.00639, 0.00685); 0.00625 (0.00595, 0.00655) |
Fig. 2. Effect of varying library sizes on the type-1 error rates. Type-1 error rates were estimated for BALLI, DESeq2, edgeR, LLI, and voom at the 0.05 nominal significance level when u = 0.2, 0.4, 0.6, 0.8, and 1 and the sample size (N) is 12 (a), 16 (b), 20 (c), 24 (d), 28 (e), 40 (f), 64 (g), and 68 (h).
Fig. 3. Effect of varying library sizes on the statistical power and precision. Statistical powers and precisions of BALLI, DESeq2, edgeR, LLI, and voom were empirically estimated at the FDR-adjusted 0.1 significance level when u = 0.2, δ = 0.5σ or 1σ, and the sample size (N) is 12, 16, 20, 24, 28, 40, 64, and 68. (a) Estimated power when u = 0.2 and δ = 0.5σ. (b) Estimated precision when u = 0.2 and δ = 0.5σ. (c) Estimated power when u = 0.2 and δ = 1σ. (d) Estimated precision when u = 0.2 and δ = 1σ.
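The power and precision reported in the figure captions can be computed from a list of called genes and a list of simulated true DEGs. The sketch below assumes the usual definitions (power as the fraction of true DEGs that are called, precision as the fraction of called genes that are true DEGs); the record does not spell these out, and `power_and_precision` is a hypothetical helper name.

```python
def power_and_precision(called, truth):
    """Return (power, precision) for one simulated data set.

    called: gene identifiers declared significant (for example at FDR 0.1).
    truth:  gene identifiers simulated as truly differentially expressed.
    Assumed definitions: power = TP / |truth|, precision = TP / |called|.
    """
    called, truth = set(called), set(truth)
    true_positives = len(called & truth)
    power = true_positives / len(truth) if truth else float("nan")
    precision = true_positives / len(called) if called else float("nan")
    return power, precision


# Toy example with made-up gene labels: 2 of 4 true DEGs are found,
# and 2 of 3 calls are correct.
print(power_and_precision(["g1", "g2", "g3"], ["g1", "g2", "g4", "g5"]))
```

In a simulation study these two quantities would typically be averaged over replicates for each combination of sample size and effect size.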
Analysis results for the true DEGs of the Holstein milk data. The Holstein milk data were analyzed with BALLI, DESeq2, edgeR, LLI, and voom; p values are given with FDRs in parentheses.
| Gene | BALLI | DESeq2 | edgeR | LLI | voom |
|---|---|---|---|---|---|
| TOX4 | 1.058 × 10⁻⁵ (1.267 × 10⁻¹) | 2.949 × 10⁻⁴ (6.298 × 10⁻¹) | 2.797 × 10⁻² (1) | 7.476 × 10⁻⁷ (8.947 × 10⁻³) | 4.980 × 10⁻³ (9.999 × 10⁻¹) |
| HNRNPL | 3.913 × 10⁻⁴ (9.222 × 10⁻¹) | 1.551 × 10⁻³ (9.997 × 10⁻¹) | 1.684 × 10⁻¹ (1) | 6.796 × 10⁻⁵ (1.787 × 10⁻¹) | 4.677 × 10⁻² (9.999 × 10⁻¹) |
| SPTSSB | 4.457 × 10⁻⁴ (9.222 × 10⁻¹) | 2.660 × 10⁻³ (9.997 × 10⁻¹) | 2.160 × 10⁻⁴ (1) | 8.686 × 10⁻⁵ (1.787 × 10⁻¹) | 1.037 × 10⁻³ (9.999 × 10⁻¹) |
| NOS3 | 4.676 × 10⁻⁴ (9.222 × 10⁻¹) | 2.396 × 10⁻⁴ (6.298 × 10⁻¹) | 2.549 × 10⁻⁴ (1) | 8.957 × 10⁻⁵ (1.787 × 10⁻¹) | 8.693 × 10⁻⁴ (9.999 × 10⁻¹) |
| SLC4A1 | 2.145 × 10⁻² (9.999 × 10⁻¹) | 8.753 × 10⁻² (9.997 × 10⁻¹) | 1.025 × 10⁻¹ (1) | 9.856 × 10⁻³ (7.179 × 10⁻¹) | 7.579 × 10⁻² (9.999 × 10⁻¹) |
| NLN | 9.513 × 10⁻² (9.999 × 10⁻¹) | 3.028 × 10⁻¹ (9.997 × 10⁻¹) | 3.789 × 10⁻¹ (1) | 6.084 × 10⁻² (9.774 × 10⁻¹) | 1.214 × 10⁻¹ (9.999 × 10⁻¹) |
| KALRN | 9.792 × 10⁻² (9.999 × 10⁻¹) | 8.943 × 10⁻² (9.997 × 10⁻¹) | 8.815 × 10⁻² (1) | 6.307 × 10⁻² (9.790 × 10⁻¹) | 1.054 × 10⁻¹ (9.999 × 10⁻¹) |
| PMCH | 1.635 × 10⁻¹ (9.999 × 10⁻¹) | 2.225 × 10⁻¹ (9.997 × 10⁻¹) | 2.353 × 10⁻¹ (1) | 1.176 × 10⁻¹ (9.999 × 10⁻¹) | 1.758 × 10⁻¹ (9.999 × 10⁻¹) |
| C25H16orf88 | 1.765 × 10⁻¹ (9.999 × 10⁻¹) | 1.494 × 10⁻¹ (9.997 × 10⁻¹) | 2.627 × 10⁻¹ (1) | 1.289 × 10⁻¹ (9.999 × 10⁻¹) | 1.516 × 10⁻¹ (9.999 × 10⁻¹) |
Fig. 4. Significant genes of Holstein milk data. Venn diagram of the genes significant at the 0.001 nominal significance level for BALLI, DESeq2, edgeR, LLI, and voom.
Genes of the Holstein milk data significant in all methods. Genes significant at the 0.005 nominal significance level for all methods (BALLI, DESeq2, edgeR, LLI, and voom) are listed; p values are given with FDRs in parentheses.
| Gene | BALLI | DESeq2 | edgeR | LLI | voom |
|---|---|---|---|---|---|
| SPTSSB | 4.457 × 10⁻⁴ (9.222 × 10⁻¹) | 2.660 × 10⁻³ (9.997 × 10⁻¹) | 2.160 × 10⁻⁴ (1) | 8.686 × 10⁻⁵ (1.787 × 10⁻¹) | 1.037 × 10⁻³ (9.999 × 10⁻¹) |
| NOS3 | 4.676 × 10⁻⁴ (9.222 × 10⁻¹) | 2.396 × 10⁻⁴ (6.298 × 10⁻¹) | 2.549 × 10⁻⁴ (1) | 8.957 × 10⁻⁵ (1.787 × 10⁻¹) | 8.693 × 10⁻⁴ (9.999 × 10⁻¹) |
| FXYD3 | 7.198 × 10⁻⁴ (9.507 × 10⁻¹) | 2.813 × 10⁻⁴ (6.298 × 10⁻¹) | 5.182 × 10⁻⁴ (1) | 1.513 × 10⁻⁴ (2.012 × 10⁻¹) | 2.097 × 10⁻³ (9.999 × 10⁻¹) |
| SPESP1 | 1.103 × 10⁻³ (9.507 × 10⁻¹) | 4.834 × 10⁻³ (9.997 × 10⁻¹) | 1.669 × 10⁻³ (1) | 2.579 × 10⁻⁴ (2.385 × 10⁻¹) | 4.001 × 10⁻³ (9.999 × 10⁻¹) |
| CHST1 | 1.339 × 10⁻³ (9.507 × 10⁻¹) | 8.624 × 10⁻⁴ (9.383 × 10⁻¹) | 1.872 × 10⁻³ (1) | 3.166 × 10⁻⁴ (2.385 × 10⁻¹) | 6.108 × 10⁻⁴ (9.999 × 10⁻¹) |
| LEPREL1 | 1.387 × 10⁻³ (9.507 × 10⁻¹) | 2.554 × 10⁻³ (9.997 × 10⁻¹) | 3.297 × 10⁻³ (1) | 3.334 × 10⁻⁴ (2.385 × 10⁻¹) | 1.487 × 10⁻³ (9.999 × 10⁻¹) |
| JUB | 1.486 × 10⁻³ (9.507 × 10⁻¹) | 1.755 × 10⁻³ (9.997 × 10⁻¹) | 2.445 × 10⁻³ (1) | 3.674 × 10⁻⁴ (2.385 × 10⁻¹) | 2.804 × 10⁻³ (9.999 × 10⁻¹) |
| MIA | 1.509 × 10⁻³ (9.507 × 10⁻¹) | 4.210 × 10⁻⁴ (6.298 × 10⁻¹) | 2.247 × 10⁻³ (1) | 3.666 × 10⁻⁴ (2.385 × 10⁻¹) | 2.432 × 10⁻³ (9.999 × 10⁻¹) |
| C4BPA | 2.241 × 10⁻³ (9.999 × 10⁻¹) | 3.190 × 10⁻⁴ (6.298 × 10⁻¹) | 1.307 × 10⁻³ (1) | 6.000 × 10⁻⁴ (2.951 × 10⁻¹) | 2.866 × 10⁻³ (9.999 × 10⁻¹) |
| CLDN6 | 2.653 × 10⁻³ (9.999 × 10⁻¹) | 1.282 × 10⁻³ (9.997 × 10⁻¹) | 2.414 × 10⁻³ (1) | 7.377 × 10⁻⁴ (3.270 × 10⁻¹) | 1.542 × 10⁻³ (9.999 × 10⁻¹) |
| PALMD | 3.737 × 10⁻³ (9.999 × 10⁻¹) | 1.151 × 10⁻³ (9.997 × 10⁻¹) | 2.648 × 10⁻³ (1) | 1.127 × 10⁻³ (3.776 × 10⁻¹) | 3.032 × 10⁻³ (9.999 × 10⁻¹) |
| KLK12 | 3.840 × 10⁻³ (9.999 × 10⁻¹) | 2.766 × 10⁻³ (9.997 × 10⁻¹) | 9.826 × 10⁻⁴ (1) | 1.225 × 10⁻³ (3.776 × 10⁻¹) | 3.162 × 10⁻³ (9.999 × 10⁻¹) |
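The FDR values in parentheses are adjusted p values. The record does not name the adjustment procedure; the sketch below assumes the Benjamini-Hochberg step-up method, which is a common choice for RNA-seq analyses, and `benjamini_hochberg` is a hypothetical function name.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p values in the input order.

    Assumption: BH is used here for illustration only; the source does
    not state which FDR procedure produced the published values.
    """
    m = len(p_values)
    # Indices of the p values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Toy p values (not taken from the tables).
print(benjamini_hochberg([0.01, 0.04, 0.03, 0.005]))
```

Because the adjusted values are monotone in the raw p values, many genes can share the same FDR within one method, as in the repeated values in the columns above.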