Chathura Siriwardhana, Susmita Datta, Somnath Datta.
Abstract
BACKGROUND: It is of interest to study the consistency of outcomes arising from two genomic platforms, microarray and RNASeq, which are built on fundamentally different technologies. This topic has frequently been discussed from the perspective of comparing differentially expressed genes (DEGs). In this study, we explore the inter-platform concordance between microarray and RNASeq in their ability to classify samples based on genomic information. We use a set of 7 standard multi-class classifiers and an adaptive ensemble classifier developed around them to predict the chemical modes of action (MOAs) of data profiled by microarray and RNASeq platforms from rat liver samples exposed to a variety of chemical compounds. We study the concordance between microarray and RNASeq data in various forms, based on the classifiers' performance on the two platforms.
Keywords: Classification; Microarray; RNASeq
Year: 2016 PMID: 27993158 PMCID: PMC5168706 DOI: 10.1186/s13062-016-0167-9
Source DB: PubMed Journal: Biol Direct ISSN: 1745-6150 Impact factor: 4.540
Accuracies of predicting MOAs in the adjusted test set, based on classifiers developed on gene expression sets profiled from microarray and RNASeq platforms
| Platform | Classifier | Overall Acc. % | PPARA (Sens., Spec.) | CAR/PXR (Sens., Spec.) | Control (Sens., Spec.) |
|---|---|---|---|---|---|
| Microarray | Ensemble | 75 | 89,67 | 44,94 | 100,67 |
| Microarray | SVM | 58 | 56,59 | 33,73 | 100,44 |
| Microarray | RF | 67 | 89,54 | 22,94 | 100,56 |
| Microarray | PLS+LDA | 71 | 67,73 | 56,80 | 100,61 |
| Microarray | PLS+RF | 58 | 44,68 | 44,68 | 100,45 |
| Microarray | PCA+LDA | 17 | 0,27 | 0,27 | 67,0 |
| Microarray | PCA+RF | 33 | 33,33 | 0,53 | 83,16 |
| Microarray | RPART | 62 | 100,39 | 11,93 | 83,55 |
| RNASeq | Ensemble | 67 | 56,74 | 56,74 | 100,56 |
| RNASeq | SVM | 58 | 67,54 | 22,81 | 100,45 |
| RNASeq | RF | 58 | 67,54 | 22,81 | 100,45 |
| RNASeq | PLS+LDA | 67 | 56,74 | 56,74 | 100,56 |
| RNASeq | PLS+RF | 58 | 67,54 | 22,81 | 100,45 |
| RNASeq | PCA+LDA | 25 | 33,20 | 0,40 | 50,17 |
| RNASeq | PCA+RF | 20 | 22,19 | 11,25 | 33,16 |
| RNASeq | RPART | 46 | 22,60 | 33,54 | 100,28 |
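The table reports an overall accuracy plus a sensitivity and specificity pair for each MOA class. This record does not state the formulas used, so the following is only a minimal sketch that assumes the standard one-vs-rest definitions; the authors' convention may differ. The function name `per_class_metrics` and the toy labels are hypothetical and are not taken from the study.

```python
def per_class_metrics(y_true, y_pred, classes):
    """Return overall accuracy (%) and {class: (sensitivity %, specificity %)}.

    Assumption: one-vs-rest definitions, i.e. sensitivity = TP / (TP + FN)
    and specificity = TN / (TN + FP), computed separately for each class.
    """
    pairs = list(zip(y_true, y_pred))
    overall = 100.0 * sum(t == p for t, p in pairs) / len(pairs)
    metrics = {}
    for c in classes:
        tp = sum(t == c and p == c for t, p in pairs)
        fn = sum(t == c and p != c for t, p in pairs)
        tn = sum(t != c and p != c for t, p in pairs)
        fp = sum(t != c and p == c for t, p in pairs)
        sens = 100.0 * tp / (tp + fn) if (tp + fn) else float("nan")
        spec = 100.0 * tn / (tn + fp) if (tn + fp) else float("nan")
        metrics[c] = (sens, spec)
    return overall, metrics


# Toy labels for illustration only (not data from the study).
y_true = ["PPARA", "PPARA", "CAR/PXR", "Control"]
y_pred = ["PPARA", "CAR/PXR", "CAR/PXR", "Control"]
acc, metrics = per_class_metrics(y_true, y_pred, ["PPARA", "CAR/PXR", "Control"])
print(acc, metrics)
```

With more than two classes, sensitivity and specificity are computed once per class, which is why each class column carries its own pair of values.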
Accuracies of predicting MOAs in the originally given test set, based on classifiers developed on common gene expression sets profiled from microarray and RNASeq platforms
| Platform | Classifier | Overall Acc. % | PPARA (Sens., Spec.) | CAR/PXR (Sens., Spec.) | Control (Sens., Spec.) | OTHER (Sens., Spec.) |
|---|---|---|---|---|---|---|
| Microarray | Ensemble | 50 | 44,52 | 44,52 | 100,42 | 39,58 |
| Microarray | SVM | 55 | 0,70 | 0,70 | 83,50 | 100,21 |
| Microarray | RF | 57 | 0,73 | 0,73 | 100,50 | 100,25 |
| Microarray | PLS+LDA | 40 | 67,33 | 56,36 | 67,36 | 6,67 |
| Microarray | PLS+RF | 55 | 11,67 | 11,67 | 100,48 | 83,34 |
| Microarray | PCA+LDA | 12 | 0,15 | 0,15 | 83,0 | 0,21 |
| Microarray | PCA+RF | 40 | 0,51 | 0,51 | 94,31 | 0,70 |
| Microarray | RPART | 45 | 100,30 | 0,57 | 83,39 | 28,58 |
| RNASeq | Ensemble | 50 | 33,55 | 33,55 | 100,42 | 50,50 |
| RNASeq | SVM | 60 | 0,76 | 11,73 | 100,53 | 100,30 |
| RNASeq | RF | 55 | 0,70 | 0,70 | 100,48 | 94,26 |
| RNASeq | PLS+LDA | 38 | 56,33 | 56,33 | 100,28 | 0,66 |
| RNASeq | PLS+RF | 55 | 11,67 | 22,64 | 100,48 | 78,38 |
| RNASeq | PCA+LDA | 14 | 0,18 | 0,18 | 100,0 | 0,24 |
| RNASeq | PCA+RF | 43 | 0,55 | 0,55 | 83,36 | 72,21 |
| RNASeq | RPART | 43 | 33,46 | 33,46 | 33,45 | 56,33 |
Accuracies of predicting MOAs in the adjusted test set, based on classifiers developed on complete gene expression sets profiled from microarray and RNASeq platforms
| Platform | Classifier | Overall Acc. % | PPARA (Sens., Spec.) | CAR/PXR (Sens., Spec.) | Control (Sens., Spec.) |
|---|---|---|---|---|---|
| Microarray | Ensemble | 62 | 56,66 | 44,73 | 100,49 |
| Microarray | SVM | 50 | 33,60 | 44,54 | 83,39 |
| Microarray | RF | 67 | 89,54 | 22,94 | 100,56 |
| Microarray | PLS+LDA | 67 | 67,67 | 44,81 | 100,56 |
| Microarray | PLS+RF | 54 | 44,60 | 33,67 | 100,39 |
| Microarray | PCA+LDA | 12 | 33,0 | 0,20 | 0,17 |
| Microarray | PCA+RF | 8 | 22,0 | 0,13 | 0,11 |
| Microarray | RPART | 62 | 100,39 | 11,93 | 83,55 |
| RNASeq | Ensemble | 62 | 56,66 | 44,73 | 100,49 |
| RNASeq | SVM | 54 | 44,60 | 33,67 | 100,39 |
| RNASeq | RF | 62 | 78,52 | 22,86 | 100,49 |
| RNASeq | PLS+LDA | 58 | 44,66 | 44,66 | 100,44 |
| RNASeq | PLS+RF | 50 | 44,54 | 22,67 | 100,33 |
| RNASeq | PCA+LDA | 33 | 33,33 | 0,53 | 83,16 |
| RNASeq | PCA+RF | 25 | 22,27 | 0,40 | 67,11 |
| RNASeq | RPART | 42 | 44,41 | 22,54 | 67,34 |
Accuracies of predicting MOAs in the originally given test set, based on classifiers developed on complete gene expression sets profiled from microarray and RNASeq platforms
| Platform | Classifier | Overall Acc. % | PPARA (Sens., Spec.) | CAR/PXR (Sens., Spec.) | Control (Sens., Spec.) | OTHER (Sens., Spec.) |
|---|---|---|---|---|---|---|
| Microarray | Ensemble | 55 | 33,61 | 44,58 | 100,48 | 56,54 |
| Microarray | SVM | 55 | 0,70 | 0,70 | 83,50 | 100,21 |
| Microarray | RF | 55 | 0,70 | 0,70 | 83,50 | 100,21 |
| Microarray | PLS+LDA | 38 | 67,30 | 44,36 | 100,28 | 0,66 |
| Microarray | PLS+RF | 60 | 11,73 | 0,76 | 100,53 | 100,30 |
| Microarray | PCA+LDA | 17 | 0,22 | 11,19 | 83,6 | 6,25 |
| Microarray | PCA+RF | 38 | 0,48 | 0,48 | 0,44 | 89,0 |
| Microarray | RPART | 48 | 100,34 | 0,61 | 83,42 | 33,59 |
| RNASeq | Ensemble | 55 | 44,58 | 33,61 | 100,48 | 56,54 |
| RNASeq | SVM | 55 | 0,70 | 0,70 | 83,50 | 100,21 |
| RNASeq | RF | 52 | 0,66 | 0,66 | 83,47 | 94,20 |
| RNASeq | PLS+LDA | 33 | 44,30 | 44,30 | 100,22 | 0,58 |
| RNASeq | PLS+RF | 60 | 0,76 | 22,70 | 100,53 | 94,34 |
| RNASeq | PCA+LDA | 10 | 0,13 | 22,7 | 33,6 | 0,18 |
| RNASeq | PCA+RF | 43 | 0,55 | 0,55 | 17,47 | 94,5 |
| RNASeq | RPART | 24 | 44,19 | 22,25 | 67,17 | 0,42 |
Fig. 1. Plots of prediction accuracies of RNASeq vs. microarray for two different test sets using the common gene set, by eight different classification techniques, for classifiers trained and used for prediction on the same platform
Fig. 2. Plots of prediction accuracies of RNASeq vs. microarray for two different test sets using the complete gene set, by eight different classification techniques, for classifiers trained and used for prediction on the same platform
Accuracies of predicting MOAs in the whole datasets (including testing and training sets) of the RNASeq and microarray platforms, using the classifiers trained on the corresponding opposite platform
| Procedure | Classifier | Overall Acc. % | PPARA (Sens., Spec.) | CAR/PXR (Sens., Spec.) | AhR (Sens., Spec.) | Cytotoxic (Sens., Spec.) | DNADamage (Sens., Spec.) | ER (Sens., Spec.) | HMGCOA (Sens., Spec.) | Control (Sens., Spec.) |
|---|---|---|---|---|---|---|---|---|---|---|
| Trained on microarray and predicted on RNASeq | Ensemble | 100 | 100,100 | 100,100 | 100,100 | 100,100 | 100,100 | 100,100 | 100,100 | 100,100 |
| Trained on microarray and predicted on RNASeq | SVM | 86 | 100,83 | 100,83 | 33,91 | 44,99 | 67,88 | 89,86 | 100,85 | 100,82 |
| Trained on microarray and predicted on RNASeq | RF | 92 | 100,90 | 100,90 | 56,95 | 89,93 | 67,94 | 100,91 | 100,91 | 100,90 |
| Trained on microarray and predicted on RNASeq | PLS+LDA | 100 | 100,100 | 100,100 | 100,100 | 100,100 | 100,100 | 100,100 | 100,100 | 100,100 |
| Trained on microarray and predicted on RNASeq | PLS+RF | 98 | 100,98 | 100,98 | 78,100 | 100,97 | 100,98 | 100,98 | 100,98 | 100,97 |
| Trained on microarray and predicted on RNASeq | PCA+LDA | 11 | 0,13 | 6,12 | 44,8 | 0,14 | 0,12 | 0,12 | 0,12 | 29,6 |
| Trained on microarray and predicted on RNASeq | PCA+RF | 12 | 0,15 | 6,13 | 0,13 | 0,16 | 0,13 | 0,13 | 0,13 | 50,1 |
| Trained on microarray and predicted on RNASeq | RPART | 68 | 83,65 | 61,69 | 33,71 | 67,68 | 67,68 | 67,68 | 100,65 | 62,70 |
| Trained on RNASeq and predicted on microarray | Ensemble | 100 | 100,100 | 100,100 | 100,100 | 100,100 | 100,100 | 100,100 | 100,100 | 100,100 |
| Trained on RNASeq and predicted on microarray | SVM | 87 | 100,84 | 94,86 | 22,93 | 75,88 | 80,88 | 89,87 | 100,86 | 100,83 |
| Trained on RNASeq and predicted on microarray | RF | 94 | 100,93 | 100,93 | 78,96 | 88,95 | 87,95 | 78,96 | 100,93 | 100,92 |
| Trained on RNASeq and predicted on microarray | PLS+LDA | 100 | 100,100 | 100,100 | 100,100 | 100,100 | 100,100 | 100,100 | 100,100 | 100,100 |
| Trained on RNASeq and predicted on microarray | PLS+RF | 93 | 94,93 | 100,92 | 78,94 | 100,92 | 87,94 | 78,94 | 100,92 | 97,92 |
| Trained on RNASeq and predicted on microarray | PCA+LDA | 13 | 0,16 | 0,16 | 0,14 | 0,14 | 0,14 | 0,14 | 0,14 | 47,3 |
| Trained on RNASeq and predicted on microarray | PCA+RF | 9 | 0,11 | 0,11 | 22,8 | 0,10 | 0,10 | 0,10 | 0,10 | 30,3 |
| Trained on RNASeq and predicted on microarray | RPART | 76 | 100,71 | 94,72 | 0,83 | 100,74 | 68,77 | 100,74 | 0,83 | 87,73 |
Fig. 3. Plots of prediction accuracies of RNASeq vs. microarray test sets, by eight different classification techniques, for classifiers trained on one platform and used for prediction on the other
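One simple way to put a number on the agreement shown in a plot of this kind is to correlate the overall accuracies of the two cross-platform directions across the eight classifiers. The sketch below uses the "Overall Acc. %" values from the cross-platform table above. Using Pearson correlation and the largest absolute gap as concordance summaries is an assumption made here for illustration; this record does not say which measure the authors used.

```python
import math

classifiers = ["Ensemble", "SVM", "RF", "PLS+LDA", "PLS+RF",
               "PCA+LDA", "PCA+RF", "RPART"]
# Overall accuracies (%) copied from the cross-platform table.
train_micro_pred_rnaseq = [100, 86, 92, 100, 98, 11, 12, 68]
train_rnaseq_pred_micro = [100, 87, 94, 100, 93, 13, 9, 76]


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


r = pearson(train_micro_pred_rnaseq, train_rnaseq_pred_micro)
# Largest disagreement between the two directions, in percentage points.
max_gap = max(abs(a - b) for a, b in
              zip(train_micro_pred_rnaseq, train_rnaseq_pred_micro))
print(f"Pearson r = {r:.3f}, largest accuracy gap = {max_gap} points")
```

A correlation near 1 together with a small maximum gap would indicate that a classifier that performs well in one direction also performs well in the other.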
Genes ranked by importance, based on accuracy reduction, for microarray and RNASeq, using the adjusted test set with the common set of genes
| Rank | Microarray gene | Microarray resulting accuracy | RNA-Seq gene | RNA-Seq resulting accuracy |
|---|---|---|---|---|
| 1 | Cyp1a1 | 0.561 | Fam111a | 0.540 |
| 2 | RT1-Bb | 0.575 | Evc | 0.610 |
| 3 | Fam111a | 0.585 | Cyp1a1 | 0.621 |
| 4 | Ugt2b | 0.599 | Cyp1a2 | 0.625 |
| 5 | Aldh1a7 | 0.628 | Akr1b8 | 0.625 |
| 6 | Akr1b8 | 0.632 | Hbb | 0.625 |
| 7 | Gpnmb | 0.647 | Ugt2b | 0.625 |
| 8 | Obp3 | 0.647 | Dhrs7 | 0.626 |
| 9 | Hbb | 0.649 | Mme | 0.628 |
| 10 | Vnn1 | 0.658 | Nr1d1 | 0.628 |
| 11 | Tsku | 0.660 | Cish | 0.631 |
| 12 | Aldh1a1 | 0.668 | Abcc3 | 0.631 |
| 13 | RGD1309362 | 0.668 | Adora1 | 0.632 |
| 14 | Socs2 | 0.669 | Fos | 0.636 |
| 15 | LOC685020 | 0.671 | Abcd2 | 0.640 |
| 16 | Aldh1b1 | 0.672 | Irs3 | 0.642 |
| 17 | RGD1564865 | 0.674 | Asrgl1 | 0.644 |
| 18 | Cyp1a2 | 0.675 | Pilra | 0.646 |
| 19 | Psat1 | 0.676 | Ddhd1 | 0.646 |
| 20 | Gadd45g | 0.678 | Ugt2b17 | 0.647 |
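The table ranks genes by the accuracy that results when a gene's contribution is taken away, so the genes with the lowest resulting accuracy matter most. The exact procedure is not described in this record. The following is only a sketch of one plausible reading: drop one gene at a time, re-evaluate a classifier, and sort by the resulting accuracy. The nearest-centroid classifier, the function names, and the toy data are all hypothetical stand-ins, not the authors' ensemble or data.

```python
def nearest_centroid_accuracy(train_X, train_y, test_X, test_y, keep):
    """Accuracy of a nearest-centroid classifier using only feature indices in `keep`."""
    centroids = {}
    for label in sorted(set(train_y)):
        rows = [x for x, y in zip(train_X, train_y) if y == label]
        centroids[label] = [sum(r[j] for r in rows) / len(rows) for j in keep]
    correct = 0
    for x, y in zip(test_X, test_y):
        # Predict the label whose centroid is closest in squared Euclidean distance.
        pred = min(centroids, key=lambda lab: sum(
            (x[j] - c) ** 2 for j, c in zip(keep, centroids[lab])))
        correct += (pred == y)
    return correct / len(test_y)


def rank_genes_by_accuracy_reduction(genes, train_X, train_y, test_X, test_y):
    """Drop each gene in turn; return (gene, resulting accuracy), lowest accuracy first."""
    results = []
    for i, gene in enumerate(genes):
        keep = [j for j in range(len(genes)) if j != i]
        acc = nearest_centroid_accuracy(train_X, train_y, test_X, test_y, keep)
        results.append((gene, acc))
    return sorted(results, key=lambda item: item[1])


# Toy data (hypothetical): only geneA separates the two classes.
genes = ["geneA", "geneB", "geneC"]
train_X = [[5.0, 1.0, 1.0], [6.0, 2.0, 1.0], [0.0, 1.0, 1.0], [1.0, 2.0, 1.0]]
train_y = ["treated", "treated", "control", "control"]
test_X = [[5.0, 1.5, 1.0], [1.0, 1.5, 1.0]]
test_y = ["treated", "control"]
ranking = rank_genes_by_accuracy_reduction(genes, train_X, train_y, test_X, test_y)
print(ranking)
```

In this toy setting, removing the only informative gene leaves the classifier unable to separate the classes, so that gene lands at the top of the ranking.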
Analysis 3: Genes ranked by importance, for microarray and RNASeq, using the adjusted test set with the complete sets of genes
| Rank | Microarray gene | Microarray resulting accuracy | RNA-Seq gene | RNA-Seq resulting accuracy |
|---|---|---|---|---|
| 1 | Fam111a | 0.572 | Abcb1b | 0.551 |
| 2 | Abcc3 | 0.606 | GTP_EFTU_D3.1 | 0.563 |
| 3 | Adam8 | 0.624 | Hba-a2 | 0.564 |
| 4 | LOC100911107 | 0.628 | Hbb | 0.569 |
| 5 | Atf3 | 0.632 | Cyp1a1 | 0.569 |
| 6 | Krt10 | 0.635 | LOC360504 | 0.572 |
| 7 | Aldh1a7 | 0.638 | Casp12 | 0.572 |
| 8 | MGC108823 | 0.638 | Ugt2b | 0.572 |
| 9 | Ckap2 | 0.638 | Apof | 0.575 |
| 10 | Cyp1a1 | 0.638 | MGC72973 | 0.578 |
| 11 | Asrgl1 | 0.639 | blarkly | 0.578 |
| 12 | Hamp | 0.640 | Dhrs7 | 0.578 |
| 13 | Hbb | 0.640 | Laminin_G_2.1 | 0.579 |
| 14 | Angptl4 | 0.640 | LOC313220 | 0.579 |
| 15 | Oas1a | 0.640 | Car3 | 0.579 |
| 16 | Psat1 | 0.640 | Dbp | 0.579 |
| 17 | Igfbp2 | 0.642 | Mcm5 | 0.581 |
| 18 | Gsta3 | 0.643 | TCTP.0 | 0.581 |
| 19 | Obp3 | 0.649 | Egln3 | 0.581 |
| 20 | Pik3r1 | 0.649 | Fam111a | 0.581 |
Genes ranked by importance (based on the measure given by R), for microarray and RNASeq, using the whole data including 8 varieties of MOAs with the common gene set
| Rank | Microarray gene | Microarray resulting accuracy | Microarray importance measure | RNASeq gene | RNASeq resulting accuracy | RNASeq importance measure |
|---|---|---|---|---|---|---|
| 1 | Cyp1a1 | 0.9538 | 0.0064 | Cyp1a1 | 0.9658 | 0.0063 |
| 2 | RT1-Bb | 0.9707 | 0.0018 | Abcc3 | 0.9786 | 0.0019 |
| 3 | Gstp1 | 0.9740 | 0.0017 | Cyp7a1 | 0.9689 | 0.0016 |
| 4 | Usp2 | 0.9600 | 0.0015 | Cyp1a2 | 0.9751 | 0.0016 |
| 5 | Nr1d1 | 0.9693 | 0.0012 | Fabp2 | 0.9705 | 0.0015 |
| 6 | Obp3 | 0.9694 | 0.0011 | Sgcb | 0.9677 | 0.0014 |
| 7 | Fam111a | 0.9733 | 0.0011 | Atf3 | 0.9672 | 0.0014 |
| 8 | Prss23 | 0.963 | 0.0009 | Gdf15 | 0.9692 | 0.0013 |
| 9 | Igtp | 0.9668 | 0.0009 | Apoa4 | 0.9699 | 0.0011 |
| 10 | Taf8 | 0.9725 | 0.0008 | Slc13a3 | 0.9751 | 0.0011 |
| 11 | Dmbt1 | 0.9768 | 0.0008 | Ugt2b17 | 0.9751 | 0.0011 |
| 12 | Ccng1 | 0.9611 | 0.0008 | Acy3 | 0.9670 | 0.0011 |
| 13 | Cav1 | 0.9654 | 0.0008 | Porcn | 0.9732 | 0.0011 |
| 14 | Rnf152 | 0.9697 | 0.0008 | Slc7a5 | 0.9652 | 0.0011 |
| 15 | Cxcl10 | 0.9711 | 0.0008 | Hdc | 0.9676 | 0.0010 |
| 16 | Rhbdf2 | 0.9764 | 0.0008 | Ddhd1 | 0.9686 | 0.0010 |
| 17 | Casp4 | 0.9683 | 0.0008 | Rprm | 0.9743 | 0.0010 |
| 18 | Cyp2c12 | 0.9688 | 0.0008 | Btg3 | 0.9700 | 0.0010 |
| 19 | Aldh1a7 | 0.9697 | 0.0008 | Maff | 0.9757 | 0.0010 |
| 20 | Abcc3 | 0.9721 | 0.0008 | Fabp4 | 0.9734 | 0.0009 |
Genes ranked by importance (based on the measure given by R), for microarray and RNASeq, using the whole data including 8 varieties of MOAs with the complete gene set
| Rank | Microarray gene | Microarray resulting accuracy | Microarray importance measure | RNASeq gene | RNASeq resulting accuracy | RNASeq importance measure |
|---|---|---|---|---|---|---|
| 1 | LOC100912602 | 0.9616 | 0.0096 | LOC690286 | 0.9407 | 0.0098 |
| 2 | Il1rap | 0.9821 | 0.008 | Plcd3 | 0.9913 | 0.0087 |
| 3 | Htatip2 | 0.9736 | 0.0074 | Sgcb | 0.9732 | 0.0078 |
| 4 | Cd276 | 0.9557 | 0.0073 | Retsat | 0.9733 | 0.0077 |
| 5 | Ankrd33b | 0.9637 | 0.0065 | Zfp39 | 0.9924 | 0.0076 |
| 6 | Id1 | 0.9836 | 0.0064 | Abcg5 | 0.9745 | 0.0074 |
| 7 | Hgd | 0.9649 | 0.0062 | perja | 0.9927 | 0.0073 |
| 8 | RGD1305928 | 0.9562 | 0.0059 | Sgk2 | 0.9530 | 0.0073 |
| 9 | Acot2 | 0.9848 | 0.0052 | Naaladl1 | 0.9657 | 0.0072 |
| 10 | Dusp1 | 0.9860 | 0.0040 | Mrps18b | 0.9830 | 0.0071 |
| 11 | Sat2 | 0.9870 | 0.0040 | flergar | 0.9842 | 0.0067 |
| 12 | Adcy4 | 0.9663 | 0.0038 | Nol3 | 0.9933 | 0.0067 |
| 13 | Rexo4 | 0.9863 | 0.0037 | stukaw | 0.9755 | 0.0065 |
| 14 | Dtnb | 0.9863 | 0.0037 | Igf2bp2 | 0.9837 | 0.0064 |
| 15 | Hbb | 0.9873 | 0.0037 | slakoy | 0.9937 | 0.0063 |
| 16 | Fam111a | 0.9676 | 0.0034 | Serpinb1a | 0.9852 | 0.0058 |
| 17 | LOC690020 | 0.9770 | 0.0031 | Ccnd1 | 0.9856 | 0.0054 |
| 18 | Ddias | 0.9870 | 0.0031 | Id1 | 0.9947 | 0.0053 |
| 19 | Resp18 | 0.9779 | 0.0031 | Nrxn2 | 0.9947 | 0.0053 |
| 20 | Mlc1 | 0.9879 | 0.0030 | LOC494499 | 0.9658 | 0.0053 |