Qingzhou Guan, Haidan Yan, Yanhua Chen, Baotong Zheng, Hao Cai, Jun He, Kai Song, You Guo, Lu Ao, Huaping Liu, Wenyuan Zhao, Xianlong Wang, Zheng Guo.
Abstract
BACKGROUND: Because of experimental batch effects, applying a quantitative transcriptional signature to disease diagnosis commonly requires inter-sample data normalization, which is hardly applicable in common clinical settings. Many cancers might differ qualitatively from non-cancer states in their gene expression patterns. It is therefore reasonable to explore the power of qualitative diagnostic signatures, which are robust against experimental batch effects and other random factors.
Keywords: Batch effects; Classifiers; Diagnostic signature; Platform; Relative expression orderings
Year: 2018 PMID: 29378509 PMCID: PMC5789529 DOI: 10.1186/s12864-018-4446-y
Source DB: PubMed Journal: BMC Genomics ISSN: 1471-2164 Impact factor: 3.969
Fig. 1. Quantitative measurement variation for replicates measured by PCR-based technologies. For each sample type (sample A and sample B) measured by StaRT-PCR™ Assays and TaqMan® Assays, the red bar denotes the percentage of genes that show a coefficient of variation (CV) of at least 10% and the green bar denotes the percentage of genes that show a CV of at least 15%. The total number of such genes within each assay and sample type is marked by blue dots connected by lines and is read on the secondary axis
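The CV thresholds in Fig. 1 can be illustrated with a minimal sketch. It assumes that the CV is the sample standard deviation divided by the mean across replicates; the gene names and values below are hypothetical and are not MAQC measurements.

```python
from statistics import mean, stdev

def coefficient_of_variation(values):
    """Sample CV (standard deviation / mean) of replicate measurements."""
    m = mean(values)
    if m == 0:
        raise ValueError("CV is undefined for a zero mean")
    return stdev(values) / m

def fraction_at_least(replicates_by_gene, threshold):
    """Fraction of genes whose replicate CV is at least `threshold`."""
    cvs = [coefficient_of_variation(v) for v in replicates_by_gene.values()]
    return sum(cv >= threshold for cv in cvs) / len(cvs)

# Hypothetical replicate measurements for three genes (illustrative only).
toy = {
    "gene_a": [100.0, 100.0, 100.0],  # CV = 0.00
    "gene_b": [88.0, 100.0, 112.0],   # CV = 0.12
    "gene_c": [50.0, 100.0, 150.0],   # CV = 0.50
}

print(fraction_at_least(toy, 0.10))  # share of genes with CV >= 10%
print(fraction_at_least(toy, 0.15))  # share of genes with CV >= 15%
```

In this toy input, two of the three genes pass the 10% threshold and one passes the 15% threshold.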
Fig. 2. Sensitivity and specificity of SVM classifiers (a) and naïve Bayesian classifier (b) for validation datasets. Notably, some datasets included only colorectal cancer tissue samples or only normal tissue samples, so only sensitivity or only specificity is shown for those datasets
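As a reminder of why only one of the two metrics can be reported for single-class datasets, here is a minimal sketch of sensitivity and specificity. The function name and the label strings are assumptions made for illustration, not taken from the study.

```python
def sensitivity_specificity(true_labels, predicted_labels, positive="cancer"):
    """Return (sensitivity, specificity); a metric is None when its class is absent."""
    tp = fn = tn = fp = 0
    for truth, pred in zip(true_labels, predicted_labels):
        if truth == positive:
            if pred == positive:
                tp += 1
            else:
                fn += 1
        else:
            if pred == positive:
                fp += 1
            else:
                tn += 1
    # Sensitivity needs at least one true positive-class sample,
    # specificity needs at least one true negative-class sample.
    sensitivity = tp / (tp + fn) if (tp + fn) else None
    specificity = tn / (tn + fp) if (tn + fp) else None
    return sensitivity, specificity

# Hypothetical dataset containing tumor samples only: specificity is undefined.
truth = ["cancer"] * 4
predicted = ["cancer", "cancer", "cancer", "non-cancer"]
print(sensitivity_specificity(truth, predicted))  # (0.75, None)
```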
Fig. 3. Analysis procedure for identifying a cross-platform REO-based signature
Data used in this study
| GEO Acc | Platform | Sample size^a: Normal | Sample size: IBD | Sample size: Tumor |
|---|---|---|---|---|
| Training | ||||
| GSE32323 | Affymetrix GPL570 | 17 | 17 | |
| GSE22598 | Affymetrix GPL570 | 17 | 17 | |
| GSE41328 | Affymetrix GPL570 | 10 | 10 | |
| GSE4107 | Affymetrix GPL570 | 10 | 12 | |
| GSE4183 | Affymetrix GPL570 | 8 | 15 | 15 |
| GSE18105 | Affymetrix GPL570 | 17 | 94 | |
| GSE12251 | Affymetrix GPL570 | 23 | ||
| GSE13367 | Affymetrix GPL570 | 16 | ||
| GSE9452 | Affymetrix GPL570 | 8 | ||
| GSE16879 | Affymetrix GPL570 | 6 | 61 | |
| GSE35144 | Affymetrix GPL570 | 27 | ||
| GSE35896 | Affymetrix GPL570 | 62 | ||
| GSE33113 | Affymetrix GPL570 | 6 | 90 | |
| GSE37178 | Illumina GPL6947 | 84 | ||
| GSE48634 | Illumina GPL10558 | 69 | 102 | |
| Validation | ||||
| GSE9348 | Affymetrix GPL570 | 12 | 70 | |
| GSE23878 | Affymetrix GPL570 | 24 | 35 | |
| GSE47908 | Affymetrix GPL570 | 15 | 39 | |
| GSE36807 | Affymetrix GPL570 | 7 | 28 | |
| GSE27854 | Affymetrix GPL570 | 115 | ||
| GSE22619 | Affymetrix GPL570 | 10 | 10 | |
| GSE21510 | Affymetrix GPL570 | 25 | 123 | |
| GSE17536 | Affymetrix GPL570 | 177 | ||
| GSE14580 | Affymetrix GPL570 | 6 | 24 | |
| GSE8671 | Affymetrix GPL570 | 32 | 32 | |
| GSE9254 | Affymetrix GPL570 | 19 | ||
| GSE20916 | Affymetrix GPL570 | 44 | 91 | |
| GSE53306 | Illumina GPL10558 | 12 | 28 | |
| GSE31279 | Illumina GPL6104 | 42 | 44 | |
| GSE33126 | Illumina GPL6947 | 9 | 9 | |
| GSE68570 | Illumina GPL10558 | 5 | 6 | |
| GSE26305 | Illumina GPL6884 | 2 | 2 | |
| GSE56789 | Illumina GPL10558 | 40 | ||
| GSE43841 | Illumina GPL14951 | 6 | ||
| GSE50760^b | Illumina GPL11154 | 18 | 36 | |
| GSE72819^b | Illumina GPL11154 | 73 | ||
| TCGA_coad^b,c | IlluminaHiSeq_RNASeqV2 | 41 | 285 | |
Notes:
^a Empty cells indicate that there is no sample in the corresponding category
^b These samples were measured on the RNA-sequencing platform
^c Denotes the colorectal adenocarcinoma samples from TCGA
Fig. 4. Performance of the REO-based signature with k gene pairs applied to the training set. The majority vote rule was used for classification
The REO-based signature
| Gene pair | REO (Gi > Gj)^a |
|---|---|
| 1 | |
| 2 | |
| 3 |
Note:
^a Relative expression ordering (REO) of a gene pair: Gi > Gj denotes that the expression value of gene i is larger than that of gene j in 90% of non-cancer samples but smaller than that of gene j in 90% of colorectal cancer samples
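The REO criterion in the note above can be sketched as a search for gene pairs whose ordering reverses between the two groups. This is only an illustration under stated assumptions: 90% is read as "at least 90%", samples are represented as dictionaries of expression values, and the function names, gene names, and data are hypothetical. It is not the authors' selection pipeline, which may include further filtering steps.

```python
from itertools import permutations

def fraction_greater(samples, gene_i, gene_j):
    """Fraction of samples in which gene_i is expressed above gene_j."""
    return sum(s[gene_i] > s[gene_j] for s in samples) / len(samples)

def reversed_pairs(non_cancer, cancer, genes, threshold=0.9):
    """Ordered pairs (i, j) with i > j in non-cancer and i < j in cancer samples."""
    pairs = []
    for gi, gj in permutations(genes, 2):
        stable_in_non_cancer = fraction_greater(non_cancer, gi, gj) >= threshold
        stable_in_cancer = fraction_greater(cancer, gj, gi) >= threshold
        if stable_in_non_cancer and stable_in_cancer:
            pairs.append((gi, gj))
    return pairs

# Hypothetical toy data: g1 > g2 in every non-cancer sample and g1 < g2 in
# every cancer sample; g3 stays above both genes in all samples.
non_cancer = [{"g1": 5.0, "g2": 2.0, "g3": 10.0} for _ in range(10)]
cancer = [{"g1": 1.0, "g2": 4.0, "g3": 10.0} for _ in range(10)]

print(reversed_pairs(non_cancer, cancer, ["g1", "g2", "g3"]))
```

Only the pair (g1, g2) satisfies both conditions here, because g3 keeps the same ordering relative to the other genes in both groups.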
Fig. 5. Performance of the REO-based signature applied to multiple independent datasets from different platforms. The majority vote rule was used for classification
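The majority vote rule can be sketched as follows. The sketch assumes that each pair (Gi, Gj) votes "cancer" when Gi < Gj in the sample and that equal values count as a non-cancer vote; the gene names are placeholders, not the genes of the published signature. It also illustrates why within-sample orderings are insensitive to any strictly increasing transformation applied to a whole sample, which is the intuition behind robustness to scale-type batch effects.

```python
import math

def classify(sample, pairs):
    """Majority vote: each pair (gi, gj) votes 'cancer' when gi < gj in the sample."""
    cancer_votes = sum(sample[gi] < sample[gj] for gi, gj in pairs)
    return "cancer" if cancer_votes > len(pairs) / 2 else "non-cancer"

# Placeholder gene names (hypothetical, for illustration only).
pairs = [("A1", "B1"), ("A2", "B2"), ("A3", "B3")]

# Two of the three pairs show the reversed (cancer-like) ordering.
sample = {"A1": 2.0, "B1": 8.0, "A2": 3.0, "B2": 9.0, "A3": 7.0, "B3": 4.0}
print(classify(sample, pairs))

# A strictly increasing transformation of the whole sample (here a rescaling
# followed by a log) preserves every within-sample ordering, so the call is unchanged.
transformed = {gene: math.log2(4.0 * value + 1.0) for gene, value in sample.items()}
print(classify(transformed, pairs))
```

Both calls return the same label, since the rule depends only on which gene of each pair is higher within the sample, not on the absolute values.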
Fig. 6. The distribution of the expression levels of the 3 gene pairs in GSE8671. The gene expression levels of GPAT3 and TRIP13 (a), PYY and CKAP2 (b), and SDCBP2 and DAP3 (c)
MAQC PCR-based data used in this study
| GEO Acc | Protocol | Platform | Sample A | Sample B |
|---|---|---|---|---|
| GSE5350 | StaRT-PCR™ Assays | GPL4198 | 3 | 3 |
| GSE5350 | TaqMan® Assays | GPL4097 | 4 | 4 |