Qingzhou Guan1, Rou Chen1, Haidan Yan1, Hao Cai1, You Guo1,2, Mengyao Li1, Xiangyu Li1, Mengsha Tong1, Lu Ao1, Hongdong Li1, Guini Hong1, Zheng Guo1.
Abstract
The highly stable within-sample relative expression orderings (REOs) of gene pairs in a particular type of normal human tissue are widely reversed in cancer. Based on this finding, we recently proposed an algorithm named RankComp to detect differentially expressed genes (DEGs) for individual disease samples measured by a particular platform. In this paper, using 461 normal lung tissue samples separately measured by four commonly used platforms, we demonstrate that tens of millions of gene pairs with significantly stable REOs in normal lung tissue can be consistently detected in samples measured by different platforms. However, about 20% of the stable REOs commonly detected by two different platforms (e.g., the Affymetrix and Illumina platforms) showed inconsistent REO patterns, owing to differences in probe design principles. Based on the significantly stable REOs (FDR<0.01) for normal lung tissue consistently detected by the four platforms, which tended to have large rank differences, RankComp detected on average 1184, 1335 and 1116 DEGs per sample, with average precisions of 96.51%, 95.95% and 94.78%, in three evaluation datasets with 25, 57 and 58 paired lung cancer and normal samples, respectively. Individualized pathway analysis revealed some common and subtype-specific functional mechanisms of lung cancer. Similar results were observed for colorectal cancer. In conclusion, based on the cross-platform significantly stable REOs for a particular normal tissue, differentially expressed genes and pathways in any disease sample measured by any of the platforms can be readily and accurately detected, which could be further exploited for dissecting the heterogeneity of cancer.
Keywords: differentially expressed genes; gene expression profiling; heterogeneity of cancer; individual level; multiple platforms
Year: 2016 PMID: 27634898 PMCID: PMC5356599 DOI: 10.18632/oncotarget.11996
Source DB: PubMed Journal: Oncotarget ISSN: 1949-2553
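The idea in the abstract, finding gene pairs whose ordering is stable across normal samples and then checking which of them are reversed in a single disease sample, can be illustrated with a minimal sketch. Everything specific here is an assumption for illustration only: the one-sided binomial test against a 0.5 null, the Benjamini-Hochberg step, the function names `binom_tail` and `stable_reos`, and the toy data. It is not the RankComp implementation, which additionally tests each gene for an excess of reversed pairs.

```python
from itertools import combinations
from math import comb


def binom_tail(k, n, p=0.5):
    """Return P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


def stable_reos(normal_samples, fdr=0.01):
    """Return ordered pairs (a, b) whose ordering a > b is significantly stable.

    Assumption: stability is tested with a one-sided binomial test (null 0.5)
    and controlled with Benjamini-Hochberg at the given FDR level.
    """
    genes = sorted(normal_samples[0])
    n = len(normal_samples)
    tests = []
    for a, b in combinations(genes, 2):
        a_high = sum(s[a] > s[b] for s in normal_samples)
        b_high = sum(s[b] > s[a] for s in normal_samples)
        # Orient the pair in its majority direction, then test that direction.
        if a_high >= b_high:
            tests.append(((a, b), binom_tail(a_high, n)))
        else:
            tests.append(((b, a), binom_tail(b_high, n)))
    # Benjamini-Hochberg: keep the largest rank whose p-value passes its threshold.
    tests.sort(key=lambda t: t[1])
    m = len(tests)
    cutoff = 0
    for rank, (_, p) in enumerate(tests, start=1):
        if p <= fdr * rank / m:
            cutoff = rank
    return {pair for pair, _ in tests[:cutoff]}


# Toy data (hypothetical): G1 > G2 > G3 in every normal sample, G4 fluctuates.
normals = [
    {"G1": 10.0 + i, "G2": 5.0 + 0.1 * i, "G3": 1.0, "G4": 0.0 if i % 2 == 0 else 100.0}
    for i in range(12)
]
stable = stable_reos(normals)

# One hypothetical disease sample in which G2 now exceeds G1.
tumor = {"G1": 5.0, "G2": 9.0, "G3": 1.0, "G4": 50.0}
reversed_in_tumor = {(a, b) for (a, b) in stable if tumor[a] < tumor[b]}
```

In this toy case only the three pairs among G1, G2 and G3 pass the FDR threshold, and the single reversal (G1, G2) flags G1 and G2 as candidates for dysregulation in that sample.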
Description of normal sample data and paired cancer-normal sample data used in this study
| Set | GEO Acc or data source | Platform | Normal sample size | Tumor sample size |
|---|---|---|---|---|
| The normal sample data for REO evaluation | | | | |
| SetA | GSE19804 | Affymetrix GPL570 | 60 | |
| | GSE18842 | Affymetrix GPL570 | 45 | |
| | GSE27262 | Affymetrix GPL570 | 25 | |
| | GSE31210 | Affymetrix GPL570 | 20 | |
| SetB | GSE19188 | Affymetrix GPL570 | 65 | |
| SetA | GSE32863 | Illumina GPL6884 | 58 | |
| SetB | GSE31267 | Illumina GPL6947 | 24 | |
| SetA | GSE40588 | Agilent GPL6480 | 60 | |
| SetB | GSE15197 | Agilent GPL6480 | 13 | |
| | GSE57148 | Illumina GPL11154 | 91 | |
| SetA | GSE21510 | Affymetrix GPL570 | 25 | |
| | GSE18105 | Affymetrix GPL570 | 17 | |
| | GSE4107 | Affymetrix GPL570 | 10 | |
| SetB | GSE8671 | Affymetrix GPL570 | 32 | |
| SetA | GSE56789 | Illumina GPL10558 | 40 | |
| SetB | GSE31279 | Illumina GPL6104 | 32 | |
| | GSE43841 | Illumina GPL14951 | 6 | |
| SetA | GSE46271 | Agilent GPL14550 | 22 | |
| | GSE50114 | Agilent GPL6480 | 9 | |
| SetB | GSE28000 | Agilent GPL4133 | 23 | |
| | GSE50760 | Illumina GPL11154 | 18 | |
| The paired cancer-normal sample data for the performance of | | | | |
| | GSE27262 | Affymetrix GPL570 | 25 | 25 |
| | GSE32863 | Illumina GPL6884 | 57 | 57 |
| | TCGA_luad | IlluminaHiSeq_RNASeqV2 | 58 | 58 |
| | GSE8671 | Affymetrix GPL570 | 32 | 32 |
| | GSE31279 | Illumina GPL6104 | 32 | 32 |
| | TCGA_coad | IlluminaHiSeq_RNASeqV2 | 26 | 26 |
Note:
To determine stable gene pairs for a particular type of normal tissue, only the normal sample sizes were described for the datasets.
Denotes mRNA_seq data; specifically, TCGA_luad and TCGA_coad denote paired lung adenocarcinoma and colon adenocarcinoma samples from TCGA, respectively.
Figure 1. The flowchart of the analysis procedure
Reproducibility of significantly stable REOs in normal samples measured by each of the platforms
| Platform | Label | Normal sample size | Gene | Number of stable REOs | Number of overlaps | POG12 | POG21 | Consistency | P |
|---|---|---|---|---|---|---|---|---|---|
| Affymetrix | SetA | 150 | 20283 | 197,546,446 | 190,118,028 | 0.9434 | 0.9519 | 0.9802 | <1.0E-16 |
| | SetB | 65 | | 195,767,556 | | | | | |
| Illumina | SetA | 58 | 23364 | 251,964,302 | 231,498,834 | 0.8906 | 0.9061 | 0.9694 | <1.0E-16 |
| | SetB | 24 | | 247,667,868 | | | | | |
| Agilent | SetA | 60 | 19596 | 181,534,752 | 151,185,241 | 0.7833 | 0.9105 | 0.9406 | <1.0E-16 |
| | SetB | 13 | | 156,176,364 | | | | | |
| Affymetrix | SetA | 52 | 20283 | 193,475,574 | 184,134,774 | 0.9136 | 0.9135 | 0.96 | <1.0E-16 |
| | SetB | 32 | | 193,501,698 | | | | | |
| Illumina | SetA | 40 | 17789 | 148,902,375 | 131,019,285 | 0.8048 | 0.8723 | 0.9147 | <1.0E-16 |
| | SetB | 38 | | 137,385,589 | | | | | |
| Agilent | SetA | 31 | 18583 | 145,935,881 | 121,390,845 | 0.8099 | 0.87 | 0.9736 | <1.0E-16 |
| | SetB | 23 | | 135,855,195 | | | | | |
Note:
Gene denotes the number of genes in SetA and SetB measured by a particular platform. POG12 (or POG21) denotes the percentage of the significantly stable gene pairs (FDR<0.01) detected from SetA (or SetB) that are consistently detected in SetB (or SetA). Consistency denotes the percentage of overlapping gene pairs that display the same REO pattern in SetA and SetB, and P denotes the significance of the consistency.
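The three reproducibility measures defined in the note can be sketched in a few lines. This is one reading of the definitions, not the authors' code: each stable REO is represented as an ordered tuple `(a, b)` meaning "a is expressed higher than b", and the function name `reproducibility` and the toy pairs are hypothetical. Under this reading POG12 equals (overlaps × consistency) / (stable REOs in SetA), which agrees with the first Affymetrix row of the table to about three decimal places: 190,118,028 × 0.9802 / 197,546,446 ≈ 0.943.

```python
def reproducibility(reos_1, reos_2):
    """Compare two sets of stable REOs given as ordered pairs (a, b) with a > b.

    Assumes neither set contains both orientations of the same gene pair.
    """
    # Overlaps: the same two genes are stable in both sets, in either direction.
    unordered_1 = {frozenset(pair) for pair in reos_1}
    unordered_2 = {frozenset(pair) for pair in reos_2}
    n_overlap = len(unordered_1 & unordered_2)
    # Consistent: the same two genes AND the same direction in both sets.
    n_consistent = len(reos_1 & reos_2)
    return {
        "overlaps": n_overlap,
        "POG12": n_consistent / len(reos_1),
        "POG21": n_consistent / len(reos_2),
        "consistency": n_consistent / n_overlap if n_overlap else float("nan"),
    }


# Toy sets (hypothetical): three shared pairs, one of which flips direction.
set_a = {("G1", "G2"), ("G1", "G3"), ("G2", "G3"), ("G4", "G5")}
set_b = {("G1", "G2"), ("G3", "G1"), ("G2", "G3"), ("G5", "G6"), ("G2", "G6")}
stats = reproducibility(set_a, set_b)
```

Here two of the three overlapping pairs keep their direction, so the consistency is 2/3, POG12 is 2/4 and POG21 is 2/5.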
Figure 2. The percentage of the gene pairs with significantly stable REOs (FDR<0.01) in all measured gene pairs
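The percentage plotted in this figure can be approximated from the reproducibility table, assuming that "all measured gene pairs" means every unordered pair among the genes measured on a platform. The helper name `stable_fraction` is hypothetical; the numbers are taken from the first Affymetrix SetA row.

```python
from math import comb


def stable_fraction(n_genes, n_stable_reos):
    """Fraction of all unordered gene pairs that have a significantly stable REO."""
    return n_stable_reos / comb(n_genes, 2)


# Affymetrix SetA row: 20283 genes, 197,546,446 stable REOs.
total_pairs = comb(20283, 2)
affy_fraction = stable_fraction(20283, 197_546_446)  # roughly 0.96
```

Under that assumption, about 96% of the roughly 206 million measured gene pairs have a significantly stable ordering in this dataset.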
Cross-platform evaluation of the significantly stable REOs for normal tissues
| Platform | Number of stable REOs | Number of overlaps | POG12 | POG21 | Consistency | P |
|---|---|---|---|---|---|---|
| Affymetrix | 94,145,902 | 80,493,915 | 0.7043 | 0.7471 | 0.8237 | <1.0E-16 |
| Illumina | 88,746,864 | | | | | |
| Affy_Illu | 66,305,728 | 52,986,997 | 0.736 | 0.6271 | 0.921 | <1.0E-16 |
| Agilent | 77,825,426 | | | | | |
| Affy_Illu_Agi | 48,802,858 | 47,832,844 | 0.9486 | 0.4667 | 0.9679 | <1.0E-16 |
| RNA_seq (GSE57148) | 99,202,212 | | | | | |
| Affymetrix | 100,855,012 | 78,495,790 | 0.6729 | 0.7569 | 0.8645 | <1.0E-16 |
| Illumina | 89,653,488 | | | | | |
| Affy_Illu | 67,862,351 | 52,201,960 | 0.7347 | 0.6223 | 0.9551 | <1.0E-16 |
| Agilent | 80,116,625 | | | | | |
| Affy_Illu_Agi | 49,856,959 | 48,851,749 | 0.9662 | 0.4453 | 0.9861 | <1.0E-16 |
| RNA_seq (GSE50760) | 108,187,244 | | | | | |
Note: Affy_Illu denotes stable gene pairs consistently detected from the data measured by Affymetrix and Illumina platforms. Similarly, Affy_Illu_Agi denotes stable gene pairs consistently detected from the data measured by Affymetrix, Illumina and Agilent platforms.
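The stepwise construction of Affy_Illu and Affy_Illu_Agi described in the note can be sketched as a running intersection of direction-aware pair sets. This is a simplified illustration under stated assumptions: stable REOs are stored as ordered tuples `(a, b)` meaning a > b, and the function name `cascade` and the toy sets are hypothetical. The table itself is in line with this reading: 80,493,915 overlaps × 0.8237 consistency ≈ 66.3 million, close to the 66,305,728 pairs listed for Affy_Illu.

```python
def cascade(platforms):
    """Intersect stable REOs platform by platform, keeping consistent pairs only.

    platforms: ordered list of (name, set of ordered pairs (a, b) with a > b).
    Returns the final merged set and the size of the merged set after each step.
    """
    name, first = platforms[0]
    merged = set(first)
    history = [(name, len(merged))]
    for next_name, reos in platforms[1:]:
        # Set intersection on ordered tuples drops pairs that are missing from
        # the next platform as well as pairs whose direction is reversed there.
        merged &= reos
        name = f"{name}_{next_name}"
        history.append((name, len(merged)))
    return merged, history


# Toy sets (hypothetical): one pair flips on "Illu", another flips on "Agi".
affy = {("G1", "G2"), ("G1", "G3"), ("G2", "G3"), ("G4", "G5")}
illu = {("G1", "G2"), ("G3", "G1"), ("G2", "G3"), ("G4", "G5")}
agi = {("G1", "G2"), ("G2", "G3"), ("G5", "G4")}
merged, history = cascade([("Affy", affy), ("Illu", illu), ("Agi", agi)])
```

Each step can only shrink the merged set, which matches the decreasing counts for Affymetrix, Affy_Illu and Affy_Illu_Agi in the table.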
Figure 3. RankComp based on significantly stable REOs can detect many more DEGs for each disease sample, with slightly decreased precision, than RankComp based on highly stable REOs (stable in more than 99% of samples)
Figure 4. The KEGG pathways separately enriched with up- and down-regulated genes in at least 10% of the TCGA lung adenocarcinoma samples (A) and the TCGA colon adenocarcinoma samples (B)