Jiajing Xie1,2, Yang Xu1,2, Haifeng Chen3, Meirong Chi1,2, Jun He1,2, Meifeng Li1,2, Hui Liu1,2, Jie Xia1,2, Qingzhou Guan1,2, Zheng Guo1,2, Haidan Yan1,2.
Abstract
MOTIVATION: For some specific tissues, such as the heart and brain, normal controls are difficult to obtain. Thus, studies that contain only disease samples of one particular type (one phenotype) cannot be analyzed using common methods, such as significance analysis of microarrays (SAM), edgeR and limma. The RankComp algorithm, which was mainly developed to identify individual-level differentially expressed genes (DEGs), can be applied to identify population-level DEGs in one-phenotype data, but it cannot identify the dysregulation directions of those DEGs.
Year: 2020 PMID: 32428201 PMCID: PMC7520039 DOI: 10.1093/bioinformatics/btaa523
Source DB: PubMed Journal: Bioinformatics ISSN: 1367-4803 Impact factor: 6.937
Datasets used in this study
| Tissues | Datasets | Platform | Control | Case |
|---|---|---|---|---|
| Heart | GSE42955 | GPL6244 | 5 | 12 |
| Heart | GSE22253 | GPL6244 | 108 | — |
| Heart | GSE57338 | GPL11532 | 136 | 95 |
| Heart | GSE141910 | RNA-seq | 166 | — |
| Heart | GSE116250 | RNA-seq | 14 | 13 |
| Heart | GSE26887 | GPL6244 | 5 | 12 |
| Heart | GSE46224 | RNA-seq | 8 | 8 |
| Brain | GSE35493 | GPL570 | 9 | — |
| Brain | GSE94349 | GPL570 | 27 | 10 |
| Brain | GSE68015 | GPL570 | 16 | 11 |
| Brain | GSE86574 | GPL570 | 10 | 15 |
| Brain | GSE26966 | GPL570 | 9 | — |
| Brain | GTEx | RNA-seq | 406 | — |
| Brain | GSE50161 | GPL570 | 13 | 34 |
| Brain | TCGA | RNA-seq | 5 | 156 |
| Breast | GSE20194 | GPL96 | 27 | 37 |
| Breast | GSE23988 | GPL96 | 13 | 16 |
| Breast | GSE20271 | GPL96 | 8 | 26 |
Control: tissues sampled from normal heart, normal brain or pCR ER-negative breast cancer.
Case: tissues sampled from ICM, glioma or RD ER-negative breast cancer.
Datasets were independent datasets used to evaluate the performance of the algorithms.
Fig. 1. RankComp and PhenoComp algorithms to identify population-level DEGs. For a dataset with only disease samples, we first identified the individual-level DEGs for a particular type of disease sample using the stable REOs determined by previously accumulated normal samples as normal background. Then, based on the individual-level differential expression statuses of genes, RankComp and PhenoComp developed different methods to infer population-level DEGs.
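The first step in the Fig. 1 caption, using stable REOs from normal samples as a background, can be illustrated with a minimal sketch. It assumes that "REO" refers to the relative expression ordering of a gene pair within a sample and that a pair counts as stable when the same ordering holds in a large fraction of normal samples. The function names, the data layout and the 0.95 threshold are illustrative assumptions, not the authors' implementation, and the statistical tests that RankComp and PhenoComp apply afterwards are not reproduced here.

```python
from itertools import permutations


def stable_reos(normal, min_fraction=0.95):
    """Return ordered gene pairs (a, b) with a > b in most normal samples.

    normal: dict mapping gene name -> list of expression values,
            one value per normal sample (same sample order for every gene).
    min_fraction: assumed stability threshold (hypothetical default).
    """
    stable = set()
    for a, b in permutations(normal, 2):
        n_samples = len(normal[a])
        # Count the samples in which gene a is expressed above gene b.
        higher = sum(x > y for x, y in zip(normal[a], normal[b]))
        if higher / n_samples >= min_fraction:
            stable.add((a, b))
    return stable


def reversed_reos(stable, sample):
    """Return the stable pairs whose ordering is reversed in one disease sample.

    sample: dict mapping gene name -> expression value in that sample.
    """
    return {(a, b) for a, b in stable if sample[a] < sample[b]}


# Toy data: three genes measured in three normal samples.
normal = {"A": [5, 6, 7], "B": [3, 2, 4], "C": [1, 1, 0]}
background = stable_reos(normal)

# One disease sample in which gene A has dropped below B and C.
disease_sample = {"A": 1, "B": 4, "C": 2}
flipped = reversed_reos(background, disease_sample)
```

In this toy example gene A takes part in every reversed pair, which is the kind of evidence an individual-level method could use to flag A as dysregulated in that sample.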
Fig. 2. Distributions of the f, f and f in different datasets for ICM (A) and glioma (B). The black squares on the box represent the average values of the f, f and f in each dataset.
Fig. 3. Comparison of the performance of PhenoComp and RankComp in simulated data. The p and p were estimated by the mean values of the f, f and f for PhenoComp.
Fig. 4. Comparison of DEGs identified by different methods. (A) Concordance analysis of DEGs identified by two different methods. The p and p were estimated by the median values of the f, f and f for PhenoComp. Overlaps are the DEGs identified by both PhenoComp and a common method (edgeR, limma or SAM). The con_overlaps are the overlaps that have the same dysregulation direction in both lists of DEGs. POG12 is the proportion of consistent overlaps among the DEGs identified by PhenoComp. POG21 is the proportion of consistent overlaps among the DEGs identified by the common method. (B) Number of pathways enriched with DEGs identified by different methods. DEGs identified with FDR < 5% and with FDR < 1% were both used for pathway enrichment analysis. (C) Comparison of DEGs identified by PhenoComp and RankComp. Uncertain genes are the DEGs identified by RankComp with uncertain directions. Overlap with uncertain genes denotes genes whose directions were uncertain in RankComp but that PhenoComp detected with clear dysregulation directions.
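The concordance quantities defined in the Fig. 4 caption (overlaps, con_overlaps, POG12 and POG21) reduce to simple set arithmetic. The sketch below follows the caption's definitions; the function name, the dictionary layout and the toy gene labels are hypothetical and chosen only for illustration.

```python
def concordance(degs_1, degs_2):
    """Compute overlap statistics between two DEG lists.

    degs_1, degs_2: dicts mapping gene name -> direction ("up" or "down").
    List 1 plays the role of PhenoComp, list 2 that of the common method.
    """
    overlaps = set(degs_1) & set(degs_2)
    # Consistent overlaps: shared genes with the same dysregulation direction.
    con_overlaps = {g for g in overlaps if degs_1[g] == degs_2[g]}
    return {
        "overlaps": len(overlaps),
        "con_overlaps": len(con_overlaps),
        # Proportion of consistent overlaps among the DEGs of list 1.
        "POG12": len(con_overlaps) / len(degs_1) if degs_1 else float("nan"),
        # Proportion of consistent overlaps among the DEGs of list 2.
        "POG21": len(con_overlaps) / len(degs_2) if degs_2 else float("nan"),
    }


# Toy lists: G1 agrees in direction, G2 is shared but disagrees.
list_1 = {"G1": "up", "G2": "down", "G3": "up", "G4": "down"}
list_2 = {"G1": "up", "G2": "up", "G5": "down"}
result = concordance(list_1, list_2)
```

Because the two denominators differ, POG12 and POG21 are generally not equal: here one consistent overlap is divided by four DEGs in the first list and by three in the second.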
Fig. 5. Analysis of preoperative chemotherapy response data of breast cancer. (A) Number of DEGs identified by PhenoComp, SAM and limma in GSE20194, GSE20271 and GSE23988, respectively. The pathways enriched with upregulated genes (B) and downregulated genes (C) identified by PhenoComp. GeneRatio denotes the proportion of DEGs within a pathway among the total genes within the pathway.