Abstract
BACKGROUND: The application of next-generation sequencing technologies to transcriptomics (RNA-seq) for gene expression profiling is widely used to study different biological conditions, including cancers. However, RNA-seq experiments still tend to have small sample sizes because of their cost. Recently, attention has increasingly turned to meta-analysis methods for integrated differential expression analysis in the search for potential biomarkers. In this study, we propose a p-value combination method for meta-analysis of multiple independent but related RNA-seq studies that accounts for the sample size of each study and the direction of expression of genes in the individual studies.
Keywords: Differential expression; Glioblastoma; Meta-analysis; RNA-seq
Year: 2022 PMID: 35931958 PMCID: PMC9354357 DOI: 10.1186/s12859-022-04859-9
Source DB: PubMed Journal: BMC Bioinformatics ISSN: 1471-2105 Impact factor: 3.307
Simulation settings for the inter-study variability parameter (σ), number of studies and number of replicates per study
| Setting | σ | No. of studies | No. of replicates (case, control) | AUC (MIN, IN, FIN) | Std. dev (MIN, IN, FIN) |
|---|---|---|---|---|---|
| 1 | 0.15 | 3 | (10, 10) (15, 10) (12, 16) | 0.886, 0.920, 0.920 | 0.005, 0.003, 0.003 |
| 2 | 0.15 | 5 | (10, 10) (15, 10) (12, 16) (14, 12) (20, 20) | 0.953, 0.970, 0.970 | 0.005, < 0.001, 0.001 |
| 3 | 0.5 | 3 | (10, 10) (15, 10) (12, 16) | 0.950, 0.965, 0.966 | 0.004, 0.005, 0.005 |
| 4 | 0.5 | 5 | (10, 10) (15, 10) (12, 16) (14, 12) (20, 20) | 0.957, 0.977, 0.977 | 0.005, 0.005, 0.005 |
Area under the receiver operating characteristic curves (AUC) for inverse-normal (IN), modified inverse-normal (MIN) and fused inverse-normal (FIN) methods computed using 100 trials for each simulation setting. Std. dev: Standard deviation
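For readers unfamiliar with the baseline, the classical inverse-normal (IN) combination can be sketched as below. This is a minimal illustration of the standard weighted Stouffer form, assuming one-sided per-study p-values and weights proportional to the square root of each study's total replicate count. It is not the modified (MIN) or fused (FIN) method, whose exact definitions are not given in this excerpt. The function name `inverse_normal_combine` and the `eps` clipping are illustrative choices.

```python
from math import sqrt
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # standard normal, mean 0 and sd 1


def inverse_normal_combine(pvalues, sample_sizes, eps=1e-15):
    """Combine one gene's per-study p-values with the weighted
    inverse-normal (Stouffer) method.

    Assumption: weights are sqrt(n_s / sum of n), so the squared weights
    sum to 1 and the combined statistic is N(0, 1) under the null.
    Returns (combined statistic, combined one-sided p-value).
    """
    if len(pvalues) != len(sample_sizes):
        raise ValueError("need one sample size per p-value")
    total = sum(sample_sizes)
    weights = [sqrt(n / total) for n in sample_sizes]

    statistic = 0.0
    for p, w in zip(pvalues, weights):
        # Clip to the open interval (0, 1) so the quantile is finite.
        p = min(max(p, eps), 1.0 - eps)
        statistic += w * _STD_NORMAL.inv_cdf(1.0 - p)

    combined_p = 1.0 - _STD_NORMAL.cdf(statistic)
    return statistic, combined_p


# Total replicates per study for simulation setting 1:
# (10, 10), (15, 10), (12, 16) -> 20, 25, 28
stat, p_comb = inverse_normal_combine([0.01, 0.02, 0.03], [20, 25, 28])
```

Because larger studies receive larger weights, a small p-value from a well-replicated study moves the combined statistic more than the same p-value from a small study.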
Information about GBM RNA-seq datasets used for integrated analysis using different p-value combination methods in our study
| Datasets | No. of replicates (cases/normal) | No. of genes (after filtering) | Up DEGs | Down DEGs |
|---|---|---|---|---|
| GSE123892 | 4/3 | 15,024 | 1914 | 1837 |
| GSE151352 | 12/12 | 12,916 | 670 | 1545 |
| TCGA-GBM | 160/5 | 17,943 | 3746 | 3183 |
Up and down differentially expressed genes (DEGs) refer to the up and down-regulated DEGs obtained in per-study differential analysis
Fig. 1 Performance comparison of the modified inverse-normal, inverse-normal and fused inverse-normal methods. Receiver operating characteristic (ROC) curves averaged over 100 trials for each simulation setting for all three methods. Rows (top to bottom) correspond to low (σ = 0.15) and high (σ = 0.5) inter-study variability; columns (left to right) correspond to 3 (S = 3) and 5 (S = 5) studies combined. The black, blue and red ROC curves represent the modified inverse-normal (MIN), inverse-normal (IN) and fused inverse-normal (FIN) methods, respectively
Fig. 2 Characteristics of the modified inverse-normal method. a False discovery rates (FDR) for the modified inverse-normal (MIN) method for all simulation settings. b Proportion of true positives (TPs) among the unique differentially expressed genes (DEGs) identified by the MIN method as compared to the inverse-normal (IN) method. c Proportion of truly unique DEGs (MIN) whose observed effective direction of expression matches the true direction of expression
Fig. 3 Characteristics of the fused inverse-normal method. a False discovery rates (FDR) for the fused inverse-normal (FIN) method for all simulation settings. b Proportion of true positives (TPs) among the unique differentially expressed genes (DEGs) identified by the FIN method as compared to the inverse-normal (IN) method. c Proportion of truly unique DEGs (FIN) whose observed effective direction of expression matches the true direction of expression
Fig. 4 Comparison of results from meta-analysis methods. a Histograms of raw p-values obtained from per-study differential analysis of the GSE123892, GSE151352 and TCGA-GBM datasets used in the real data application. b Venn diagram of the differentially expressed genes (DEGs) identified using the inverse-normal (IN), modified inverse-normal (MIN) and fused inverse-normal (FIN) methods
Top 10 up- and down-regulated differentially expressed genes (DEGs) identified by the fused inverse-normal method
| DEGs (up) | Statistic | Mean \|logFC\| | Effect | BH p-value | DEGs (down) | Statistic | Mean \|logFC\| | Effect | BH p-value |
|---|---|---|---|---|---|---|---|---|---|
|  | 10.45 | 3.33 | +++ |  |  | 11.19 | 4.32 | −−− |  |
|  | 10.39 | 4.04 | +++ |  |  | 11.10 | 4.19 | −−− |  |
|  | 10.39 | 3.68 | +++ |  |  | 11.07 | 3.71 | −−− |  |
|  | 10.29 | 4.67 | +++ |  |  | 10.99 | 4.79 | −−− |  |
|  | 10.24 | 5.79 | +++ |  |  | 10.98 | 3.35 | −−− |  |
|  | 10.15 | 4.48 | +++ |  |  | 10.91 | 3.95 | −−− |  |
|  | 10.12 | 5.80 | +++ |  |  | 10.91 | 4.40 | −−− |  |
|  | 10.09 | 5.48 | +++ |  |  | 10.90 | 2.83 | −−− |  |
|  | 10.07 | 5.95 | +++ |  |  | 10.88 | 5.33 | −−− |  |
|  | 10.04 | 4.63 | +++ |  |  | 10.85 | 2.35 | −−− |  |
The DEGs are sorted by the value of the test statistic, and the mean of the absolute value of the logFC is reported. Effect signifies the direction of expression of the DEGs in the per-study differential analysis. BH p-value: Benjamini-Hochberg p-value
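The Benjamini-Hochberg (BH) p-values referred to in these tables are adjusted p-values. A minimal sketch of the standard step-up adjustment follows; the function name `benjamini_hochberg` is an illustrative choice, and the input values are made up for demonstration rather than taken from the study.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the input order.

    Each p-value is scaled by m / rank (rank 1 = smallest), then a
    running minimum from the largest rank downwards enforces
    monotonicity and caps the result at 1.
    """
    m = len(pvalues)
    # Indices of the p-values from smallest to largest.
    order = sorted(range(m), key=lambda i: pvalues[i])

    adjusted = [0.0] * m
    running_min = 1.0
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Illustrative values only.
adj = benjamini_hochberg([0.01, 0.04, 0.03, 0.005])
```

A gene is then called differentially expressed when its adjusted value falls below the chosen FDR threshold.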
Number of differentially expressed genes (DEGs) found in one, two or all three datasets
| Method | Expression direction | Present in one study | Present in two studies | Present in three studies | Total DEGs (per method) |
|---|---|---|---|---|---|
| IN | Same | 1368 | 1085 | 3465 | 5918 |
|  | Mismatched | 0 | 0 | 0 |  |
| MIN | Same | 1182 | 1035 | 3442 | 5892 |
|  | Mismatched | 0 | 52 | 181 |  |
| FIN | Same | 1359 | 1083 | 3461 | 6138 |
|  | Mismatched | 0 | 53 | 182 |  |
Same and Mismatched indicate whether or not the direction of expression of a DEG was consistent across studies. IN: Inverse-normal, MIN: Modified inverse-normal, FIN: Fused inverse-normal
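The Same/Mismatched split can be illustrated with a short sketch that reads an Effect string such as `+++` or `+−+` (one sign per study) and reports whether all signs agree. This is an assumption about how the labels relate to the Effect column, not the authors' code; the helper names `classify_direction` and `tally_directions` are hypothetical.

```python
from collections import Counter


def classify_direction(effect):
    """Classify an Effect string as 'same' or 'mismatched'.

    Accepts both the ASCII hyphen and the Unicode minus sign (U+2212).
    """
    signs = {ch for ch in effect.replace("\u2212", "-") if ch in "+-"}
    if not signs:
        raise ValueError("effect string contains no '+' or '-' signs")
    return "same" if len(signs) == 1 else "mismatched"


def tally_directions(effects):
    """Count how many Effect strings fall into each class."""
    return Counter(classify_direction(e) for e in effects)


# Effect patterns of the kind shown in the tables of top DEGs.
counts = tally_directions(["+++", "\u2212\u2212\u2212", "+\u2212+", "\u2212+\u2212"])
```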
Fig. 5 Significant pathways identified by IPA. The top ten significant pathways, ranked by Benjamini-Hochberg (BH) p-value, among the canonical pathways identified by Ingenuity Pathway Analysis (IPA) for the up-regulated differentially expressed genes (DEGs) (orange bars) and down-regulated DEGs (green bars). The numbers on the bar plot show the ratio between the number of DEGs enriched and the total number of genes in each of these pathways
Top 10 differentially expressed genes (DEGs) with mismatched direction of expression across datasets identified by the fused inverse-normal method
| DEGs | Statistic | Mean \|logFC\| | Effect | BH p-value |
|---|---|---|---|---|
|  | 7.58 | 1.30 | +−+ |  |
|  | 7.58 | 2.82 | +−+ |  |
|  | −7.53 | 1.37 | −+− |  |
|  | −7.53 | 1.31 | −+− |  |
|  | 7.52 | 1.31 | +−+ |  |
|  | −7.47 | 1.67 | −+− |  |
|  | 7.43 | 4.35 | +−+ |  |
|  | −7.31 | 1.91 | −+− |  |
|  | 7.18 | 2.08 | +−+ |  |
|  | −7.06 | 1.23 | +−− |  |
The DEGs are sorted by the absolute value of the test statistic, and the mean of the absolute value of the logFC is reported. Effect signifies the direction of expression of the DEGs in the per-study differential analysis for GSE123892, GSE151352 and TCGA-GBM, respectively. BH p-value: Benjamini-Hochberg p-value