Wenfei Zhang, Ying Liu, Mindy Zhang, Cheng Zhu, Yuefeng Lu.
Abstract
BACKGROUND: High-throughput assays are widely used in biological research to select potential targets. A single high-throughput experiment can efficiently study a large number of candidates simultaneously, but is subject to substantial variability. It is therefore scientifically important to perform quantitative reproducibility analysis to identify reproducible targets with consistent and significant signals across replicate experiments. A few methods exist, but all have limitations.
Keywords: Bayesian classification; EM algorithm; Empirical Bayes; Gaussian mixture; High-throughput experiment; Reproducibility
Year: 2017 PMID: 28800759 PMCID: PMC5553769 DOI: 10.1186/s12918-017-0444-y
Source DB: PubMed Journal: BMC Syst Biol ISSN: 1752-0509
Summary of misclassification rates for the four compared methods under different significance levels (α) and proportions of reproducible genes (γ)
| The proposed method | | The copula mixture method [10] | | Benjamini & Heller method [9] | | The rank product method | |
|---|---|---|---|---|---|---|---|
| 0.007(0.001) | 0.008(0.0012) | 0.24(0.0708) | 0.271(0.0954) | 0.025(0.0022) | 0.032(0.0025) | 0.197(0.0044) | 0.25(0.0036) |
| 0.007(0.0013) | 0.008(0.0013) | 0.402(0.0022) | 0.404(0.0028) | 0.022(0.0017) | 0.027(0.002) | 0.073(0.0031) | 0.099(0.0035) |
| 0.005(0.001) | 0.006(0.001) | 0.568(0.0059) | 0.541(0.01) | 0.016(0.0017) | 0.02(0.0019) | 0.02(0.0018) | 0.028(0.0021) |
| 0.004(8e-04) | 0.004(8e-04) | 0.166(0.0026) | 0.186(0.0015) | 0.01(0.0014) | 0.013(0.0015) | 0.004(9e-04) | 0.006(0.0011) |
| 0.002(6e-04) | 0.002(6e-04) | 0.058(0.0104) | 0.077(0.0075) | 0.007(0.001) | 0.008(0.0011) | 0.002(5e-04) | 0.002(6e-04) |
| 0.001(5e-04) | 0.001(5e-04) | 0.011(0.0038) | 0.025(0.0042) | 0.004(9e-04) | 0.005(0.001) | 0.001(4e-04) | 0.001(3e-04) |
| 0.001(4e-04) | 0(4e-04) | 0.001(6e-04) | 0.002(9e-04) | 0.001(7e-04) | 0.002(7e-04) | 0.001(4e-04) | 0(3e-04) |
Summary of sensitivities for the four compared methods under different significance levels (α) and proportions of reproducible genes (γ)
| The proposed method | | The copula mixture method [10] | | Benjamini & Heller method [9] | | The rank product method | |
|---|---|---|---|---|---|---|---|
| 0.992(0.0014) | 0.991(0.0016) | 0.948(0.0881) | 0.905(0.1184) | 0.97(0.0027) | 0.96(0.0031) | 0.754(0.0055) | 0.687(0.0045) |
| 0.99(0.002) | 0.988(0.0021) | 0.978(0.0071) | 0.956(0.0119) | 0.966(0.0028) | 0.955(0.0033) | 0.878(0.0052) | 0.836(0.0058) |
| 0.989(0.0024) | 0.987(0.0024) | 0.975(0.0069) | 0.937(0.0161) | 0.962(0.0046) | 0.951(0.005) | 0.949(0.0045) | 0.931(0.0051) |
| 0.985(0.0037) | 0.983(0.004) | 0.176(0.0149) | 0.069(0.0081) | 0.949(0.007) | 0.937(0.0076) | 0.978(0.0046) | 0.972(0.0053) |
| 0.984(0.0048) | 0.982(0.0051) | 0.421(0.1033) | 0.228(0.0746) | 0.934(0.0098) | 0.92(0.0108) | 0.985(0.0053) | 0.982(0.0055) |
| 0.984(0.0069) | 0.983(0.0075) | 0.773(0.0741) | 0.509(0.0832) | 0.925(0.0191) | 0.909(0.0195) | 0.99(0.0049) | 0.988(0.0057) |
| 0.986(0.0176) | 0.984(0.0177) | 0.907(0.0592) | 0.842(0.0882) | 0.866(0.0673) | 0.844(0.0706) | 0.99(0.0163) | 0.99(0.0163) |
Summary of specificities for the four compared methods under different significance levels (α) and proportions of reproducible genes (γ)
| The proposed method | | The copula mixture method [10] | | Benjamini & Heller method [9] | | The rank product method | |
|---|---|---|---|---|---|---|---|
| 0.996(0.002) | 0.997(0.0017) | 0.009(0.0058) | 0.025(0.0152) | 0.994(0.0021) | 0.999(0.001) | 1(0) | 1(0) |
| 0.998(9e-04) | 0.999(7e-04) | 0.029(0.0075) | 0.057(0.0144) | 0.997(0.0015) | 0.999(7e-04) | 1(0) | 1(0) |
| 0.999(7e-04) | 0.999(6e-04) | 0.07(0.0136) | 0.139(0.0268) | 0.999(7e-04) | 1(4e-04) | 1(0) | 1(0) |
| 0.999(4e-04) | 0.999(3e-04) | 0.999(9e-04) | 1(4e-04) | 1(3e-04) | 1(1e-04) | 1(0) | 1(0) |
| 0.999(4e-04) | 1(3e-04) | 1(1e-04) | 1(1e-04) | 1(1e-04) | 1(1e-04) | 1(2e-04) | 1(1e-04) |
| 1(3e-04) | 1(3e-04) | 1(1e-04) | 1(0) | 1(1e-04) | 1(0) | 1(3e-04) | 1(1e-04) |
| 1(3e-04) | 1(3e-04) | 1(1e-04) | 1(0) | 1(0) | 1(0) | 1(4e-04) | 1(2e-04) |
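The three summaries above report misclassification rate, sensitivity, and specificity. This record does not spell out how the paper computes them, so the following is only a minimal sketch under the standard definitions, treating "reproducible" as the positive class. The function name `classification_summary` and the toy labels are hypothetical and are not taken from the paper.

```python
import math


def classification_summary(truth, called):
    """Summarise calls against known labels (True = reproducible).

    Assumes the standard definitions:
      misclassification = (FP + FN) / N
      sensitivity       = TP / (TP + FN)
      specificity       = TN / (TN + FP)
    """
    if len(truth) != len(called):
        raise ValueError("truth and called must have the same length")
    if not truth:
        raise ValueError("at least one gene is required")

    tp = sum(1 for t, c in zip(truth, called) if t and c)
    tn = sum(1 for t, c in zip(truth, called) if not t and not c)
    fp = sum(1 for t, c in zip(truth, called) if not t and c)
    fn = sum(1 for t, c in zip(truth, called) if t and not c)

    def ratio(num, den):
        # Undefined when a class is absent; report NaN instead of failing.
        return num / den if den else math.nan

    return {
        "misclassification": (fp + fn) / len(truth),
        "sensitivity": ratio(tp, tp + fn),
        "specificity": ratio(tn, tn + fp),
    }


# Hypothetical toy data: 10 genes, 4 of them truly reproducible.
truth = [True, True, True, True, False, False, False, False, False, False]
called = [True, True, True, False, True, False, False, False, False, False]
summary = classification_summary(truth, called)
```

In a simulation study each tabulated value would presumably be such a summary averaged over repeated runs, with the run-to-run spread given in parentheses; that reading is an assumption, since the record does not say what the parenthesised numbers are.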
Fig. 1 Bivariate plot of test statistics from two studies. The x axis represents the test statistics from the GSE 28042 study [18], and the y axis represents the test statistics from the GSE 33566 study [19]. The green points are the top 500 reproducible genes selected by the proposed method
Fig. 2 Bivariate plot of test statistics from two studies. The x axis represents the test statistics from the GSE 28042 study [18], and the y axis represents the test statistics from the GSE 33566 study [19]. The green points are the top 500 reproducible genes selected by the copula mixture model [10]
Fig. 3 Bivariate plot of test statistics from two studies. The x axis represents the t-statistics from the GSE 28042 study [18], and the y axis represents the t-statistics from the GSE 33566 study [19]. The green points are the top 500 reproducible genes selected by the Benjamini & Heller method [9]
The 23 genes that are among the top 500 reproducible genes selected by the Benjamini & Heller method [9] but have signals of opposite sign in the two studies
| | Gene | t-statistic in GSE 28042 [18] | t-statistic in GSE 33566 [19] |
|---|---|---|---|
| 1 | A1BG | 3.34 | -3.63 |
| 2 | ANKRD39 | 3.93 | -3.35 |
| 3 | CA4 | -4.4 | 4.94 |
| 4 | CDK14 | -4.88 | 3.34 |
| 5 | CHCHD2 | 3.5 | -3.65 |
| 6 | CXCR2 | -4.67 | 3.38 |
| 7 | HCG27 | -4.68 | 3.29 |
| 8 | KAT6A | -3.48 | 3.54 |
| 9 | MFSD3 | 4.25 | -3.29 |
| 10 | MMP9 | -3.51 | 5.77 |
| 11 | MRPL14 | 4.06 | -3.69 |
| 12 | MRPL15 | 3.99 | -3.38 |
| 13 | MRPL55 | 3.63 | -3.95 |
| 14 | NDUFB7 | 3.79 | -3.54 |
| 15 | NDUFS3 | 3.98 | -3.89 |
| 16 | PRPS1 | 3.87 | -4.13 |
| 17 | RBBP6 | 3.66 | -3.67 |
| 18 | ROMO1 | 3.33 | -3.41 |
| 19 | SEPHS1 | 4 | -3.44 |
| 20 | TANC2 | -3.59 | 3.95 |
| 21 | TCN1 | -4.69 | 3.36 |
| 22 | TMEM141 | 3.45 | -3.64 |
| 23 | TRIM33 | -4.64 | 3.47 |
The 7 genes that are among the top 500 reproducible genes selected by the copula mixture model [10] but have signals of opposite sign in the two studies
| | Gene | t-statistic in GSE 28042 [18] | t-statistic in GSE 33566 [19] |
|---|---|---|---|
| 1 | CA4 | -4.4 | 4.94 |
| 2 | CDK14 | -4.88 | 3.34 |
| 3 | CXCR2 | -4.67 | 3.38 |
| 4 | HCG27 | -4.68 | 3.29 |
| 5 | MME | -6.05 | 3.25 |
| 6 | TCN1 | -4.69 | 3.36 |
| 7 | TRIM33 | -4.64 | 3.47 |
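The two gene lists above flag genes whose t-statistics point in opposite directions in the two studies. A minimal sketch of that check follows, using the seven rows of the copula mixture table as input. The helper name `opposite_sign` is hypothetical, and this illustrates only the sign comparison, not how any of the compared methods selects genes.

```python
def opposite_sign(t_first, t_second):
    """Return True when two test statistics have strictly opposite signs."""
    return t_first * t_second < 0


# t-statistics copied from the 7-gene table above: (GSE 28042, GSE 33566).
copula_genes = {
    "CA4": (-4.4, 4.94),
    "CDK14": (-4.88, 3.34),
    "CXCR2": (-4.67, 3.38),
    "HCG27": (-4.68, 3.29),
    "MME": (-6.05, 3.25),
    "TCN1": (-4.69, 3.36),
    "TRIM33": (-4.64, 3.47),
}

# Genes whose signal direction disagrees between the two studies.
discordant = [
    gene
    for gene, (t_28042, t_33566) in copula_genes.items()
    if opposite_sign(t_28042, t_33566)
]
```

All seven genes are flagged, matching the table. A statistic of exactly zero is treated as not discordant, which is a design choice of this sketch.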