Ning Leng, Yuan Li, Brian E McIntosh, Bao Kim Nguyen, Bret Duffin, Shulan Tian, James A Thomson, Colin N Dewey, Ron Stewart, Christina Kendziorski.
Abstract
MOTIVATION: With improvements in next-generation sequencing technologies and reductions in price, ordered RNA-seq experiments are becoming common. Of primary interest in these experiments is identifying genes that are changing over time or space, for example, and then characterizing the specific expression changes. A number of robust statistical methods are available to identify genes showing differential expression among multiple conditions, but most assume conditions are exchangeable and thereby sacrifice power and precision when applied to ordered data.
Year: 2015 PMID: 25847007 PMCID: PMC4528625 DOI: 10.1093/bioinformatics/btv193
Source DB: PubMed Journal: Bioinformatics ISSN: 1367-4803 Impact factor: 6.937
Fig. 1. (a) An auto-regressive hidden Markov component models dynamic paths. (b) An auto-regressive non-hidden Markov component models constant and sporadic paths
Operating characteristics for identifying changes in Sim I
| Method | Power (%) | FDR (%) | F1 score (%) | Power (strong) (%) | Power (weak) (%) |
|---|---|---|---|---|---|
| EBSeqHMM | 98.6 | 4.3 | 97.1 | 99.7 | 97.5 |
| EBSeq | 90.0 | 0.1 | 94.7 | 93.9 | 86.1 |
| DESeq2 | 92.4 | 0 | 96.1 | 95.4 | 89.4 |
| edgeR | 92.5 | 0.1 | 96.1 | 96.1 | 89.4 |
| voom | 91.9 | 0 | 95.8 | 95.1 | 88.6 |
| maSigPro (0.7) | 46.8 | 0 | 63.8 | 56.1 | 37.5 |
| maSigPro (0.5) | 76.1 | 0.1 | 86.4 | 81.5 | 70.6 |
| maSigPro (0.3) | 86.9 | 0.5 | 92.8 | 90.6 | 83.2 |
| FC (2.5) | 0.6 | 0.2 | 1.2 | 0.8 | 0.5 |
| FC (2) | 3.4 | 1.4 | 6.6 | 4.3 | 2.6 |
| FC (1.5) | 42.1 | 3.5 | 58.7 | 55.7 | 28.6 |
| FC (1.3) | 90.0 | 8.5 | 90.7 | 97.5 | 82.4 |
| FC (1.2) | 98.6 | 19.7 | 88.6 | 99.8 | 97.9 |
The first three columns show the average power, FDR and F1 score for detecting DE genes in Sim I. Power within the strong and weak groups is further evaluated in columns 4 and 5. Averages are calculated over 100 Sim I simulations. The standard errors (not shown) for EBSeq-HMM, EBSeq, DESeq2, edgeR, voom and maSigPro (and in most cases FC) were .
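The relation between the first three columns can be checked with a short sketch. It assumes the standard definition of F1 as the harmonic mean of precision and recall, with recall taken as power and precision taken as 1 − FDR; the function name `f1_from_power_fdr` is hypothetical. Because the table reports averages over 100 simulations and rounds each entry, the F1 recomputed from the averaged power and FDR need not match the reported average F1 exactly.

```python
def f1_from_power_fdr(power_pct, fdr_pct):
    """Return F1 in percent from power and FDR, both given in percent.

    Assumes recall = power and precision = 1 - FDR (standard definitions).
    """
    recall = power_pct / 100.0
    precision = 1.0 - fdr_pct / 100.0
    if precision + recall == 0:
        return 0.0
    return 100.0 * 2.0 * precision * recall / (precision + recall)


# (power, FDR) pairs copied from the Sim I table above.
rows = {
    "EBSeqHMM": (98.6, 4.3),
    "EBSeq": (90.0, 0.1),
}

for name, (power, fdr) in rows.items():
    print(name, round(f1_from_power_fdr(power, fdr), 1))
```

For these two rows the recomputed values round to the F1 scores listed in the table; other rows may differ slightly in the last digit for the reasons given above.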
Fig. 2. Shown are two genes identified exclusively by EBSeq-HMM in Sim I data (upper) and in case study data (lower). The x-axis shows time points (upper) and positions on mouse limb (lower), and the y-axis shows median gene expression adjusted for library sizes
Operating characteristics for identifying changes in Sim II
| Method | Power (%) | FDR (%) | F1 score (%) | Power (strong) (%) | Power (weak) (%) | Power (sporadic) (%) |
|---|---|---|---|---|---|---|
| EBSeqHMM | 94.5 | 4.5 | 95.0 | 99.7 | 97.4 | 86.4 |
| EBSeq | 81.4 | 0.1 | 89.7 | 93.9 | 86.1 | 64.2 |
| DESeq2 | 84.1 | 0 | 91.4 | 95.2 | 89.3 | 67.9 |
| edgeR | 84.4 | 0 | 91.6 | 95.4 | 89.5 | 68.3 |
| voom | 83.2 | 0 | 90.8 | 95.0 | 88.7 | 65.9 |
| maSigPro (0.7) | 33.1 | 0 | 49.7 | 56.0 | 37.8 | 5.5 |
| maSigPro (0.5) | 56.8 | 0.1 | 72.4 | 81.6 | 70.6 | 18.2 |
| maSigPro (0.3) | 67.4 | 0.5 | 80.4 | 89.9 | 82.3 | 30.0 |
| FC (2.5) | 0.4 | 0.4 | 0.8 | 0.7 | 0.4 | 0.1 |
| FC (2) | 2.5 | 1.9 | 4.9 | 4.2 | 2.5 | 0.8 |
| FC (1.5) | 36.1 | 4.0 | 52.5 | 55.9 | 28.6 | 23.9 |
| FC (1.3) | 83.0 | 9.0 | 86.8 | 97.4 | 82.5 | 69.2 |
| FC (1.2) | 95.8 | 20.1 | 87.1 | 99.8 | 97.9 | 89.6 |
The first three columns show the average power, FDR and F1 score for detecting DE genes in Sim II. For dynamic genes, the power within the strong and weak groups is further evaluated in columns 4 and 5. Power within the sporadic group is evaluated in column 6. Averages are calculated over 100 Sim II simulations. The standard errors (not shown) for EBSeq-HMM, EBSeq, DESeq2, edgeR, voom and maSigPro (and in most cases FC) were .
Fig. 3. Shown are the number of genes (ground truth) simulated in Sim I as being in each of eight dynamic paths (these eight are shown as they contain the most genes among all simulated paths). Also shown are the average number classified into each path by EBSeq-HMM and by FC analysis at thresholds 1.2 and 1.3 (averages are calculated over 100 Sim I datasets). Correct classifications are shown in blue (first bar); incorrect are shown in red (second bar)
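To make the idea of classifying a gene into an expression path by fold change concrete, here is a minimal sketch. It is not the procedure used in the paper, whose details are not given in this record. It assumes that a transition between consecutive conditions is called "Up" when the ratio of mean expression is at least the threshold, "Down" when the ratio is at most the reciprocal of the threshold, and otherwise left uncalled. The function name `classify_path` and the label "EE" for an uncalled transition are illustrative choices, and all means are assumed to be strictly positive.

```python
def classify_path(means, threshold):
    """Label each consecutive transition as Up, Down or EE (no call).

    means: per-condition mean expression in condition order (all > 0).
    threshold: fold-change cutoff, e.g. 1.2 or 1.3.
    """
    labels = []
    for prev, curr in zip(means, means[1:]):
        ratio = curr / prev
        if ratio >= threshold:
            labels.append("Up")
        elif ratio <= 1.0 / threshold:
            labels.append("Down")
        else:
            labels.append("EE")
    return "-".join(labels)


# Toy example (made-up numbers): the first transition has ratio 1.25,
# which is called Up at threshold 1.2 but left uncalled at 1.3.
toy_means = [100.0, 125.0, 120.0, 90.0]
print(classify_path(toy_means, 1.2))
print(classify_path(toy_means, 1.3))
```

Under this rule a lower threshold calls more transitions, which is consistent with the trade-off visible in the tables: FC at 1.2 has higher power but also a higher FDR than FC at 1.3.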
Fig. 4. Shown are median expression levels of 33 Hox genes identified as DE by EBSeq-HMM. The expression values were adjusted for library size and further scaled to mean 0 and standard deviation 1 for each gene; median expression over three replicates is shown. Genes were clustered via hierarchical clustering using Euclidean distance and complete linkage. The x-axis shows seven positions over the mouse limb
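The preprocessing and clustering named in the caption can be sketched in plain Python. This is an illustration under stated assumptions, not the authors' code: it assumes library-size adjustment has already been done, that scaling uses the sample standard deviation across all samples of a gene, and that the tree is cut at a chosen number of clusters. The names `scaled_medians` and `complete_linkage` and the toy data are hypothetical.

```python
import math
from statistics import mean, median, stdev


def scaled_medians(gene):
    """Scale one gene to mean 0 / SD 1, then take the median per position.

    gene: list of positions, each a list of replicate expression values.
    """
    flat = [v for reps in gene for v in reps]
    m, s = mean(flat), stdev(flat)  # sample SD; assumes s > 0
    return [median([(v - m) / s for v in reps]) for reps in gene]


def complete_linkage(profiles, n_clusters):
    """Naive agglomerative clustering with Euclidean distance and
    complete linkage; returns clusters as lists of profile indices."""
    clusters = [[i] for i in range(len(profiles))]

    def dist(a, b):
        # Complete linkage: the largest pairwise distance between members.
        return max(math.dist(profiles[i], profiles[j]) for i in a for j in b)

    while len(clusters) > n_clusters:
        pairs = [
            (dist(clusters[i], clusters[j]), i, j)
            for i in range(len(clusters))
            for j in range(i + 1, len(clusters))
        ]
        _, i, j = min(pairs)  # merge the closest pair of clusters
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return clusters


# Toy data (made up): 4 genes, 3 positions, 3 replicates each.
genes = [
    [[1, 2, 3], [4, 5, 6], [7, 8, 9]],           # increasing
    [[10, 20, 30], [40, 50, 60], [70, 80, 90]],  # increasing, larger scale
    [[9, 8, 7], [6, 5, 4], [3, 2, 1]],           # decreasing
    [[90, 80, 70], [60, 50, 40], [30, 20, 10]],  # decreasing, larger scale
]
profiles = [scaled_medians(g) for g in genes]
print(complete_linkage(profiles, n_clusters=2))
```

Scaling each gene before clustering means genes group by the shape of their profile across positions rather than by absolute expression level, which is why the two increasing toy genes end up together despite a tenfold difference in magnitude.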
Fig. 5. (a), (b) Shown are genes classified as following an Up-Down-Up-Down-Down-Down (left panel, 827 genes) or Down-Up-Down-Up-Up-Up (right panel, 218 genes) expression path in the case study data. Each line indicates one gene. The x-axis shows seven positions over the mouse limb; the y-axis shows median scaled expression within each position