Abstract
In this work, we provide a comparative study of the main available association measures for characterizing gene regulatory strengths. Detecting associations between genes (as well as RNAs, proteins, and other molecules) is essential for deciphering their functional relationships from genomic data. As more high-throughput datasets become available, quantifying meaningful relationships with association measures helps make sense of the data. Various quantitative measures have been proposed for identifying molecular associations. They rest on different statistical assumptions, serve different purposes, and incur different computational costs when associations are calculated among thousands of genes. Here, we comprehensively summarize the association measures employed and developed for describing gene regulatory relationships. We compare these measures in terms of their consistency and specificity in detecting gene regulations from both simulated and real gene expression profiling data. These measures, applied here to genes, can readily be extended to other biological molecules or to associations across molecule types.
Keywords: association measure; bioinformatics; gene coexpression; gene regulatory network; high-throughput data
Year: 2017 PMID: 28751908 PMCID: PMC5507966 DOI: 10.3389/fgene.2017.00096
Source DB: PubMed Journal: Front Genet ISSN: 1664-8021 Impact factor: 4.599
Figure 1. The strategy for building a gene coexpression-based regulatory network from gene expression data. (A) The gene expression patterns of m genes in n samples. (B) The gene coexpression patterns quantified by an association measure. (C) With prior knowledge of transcription factors (TFs), the gene coexpression relationships can be refined into a gene regulatory network.
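The three panels of Figure 1 can be read as a small pipeline. The following is a minimal sketch of that idea, not the authors' implementation: it assumes Pearson correlation as the association measure, a fixed absolute cutoff for keeping gene pairs, and a known TF list used only to orient edges from TF to target. The gene names, the cutoff of 0.9, and the function names `coexpression_edges` and `orient_with_tfs` are all hypothetical.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy) if sx > 0 and sy > 0 else 0.0


def coexpression_edges(expr, threshold):
    """(A) -> (B): score every gene pair, keep pairs with |r| >= threshold."""
    edges = {}
    for g1, g2 in combinations(sorted(expr), 2):
        r = pearson(expr[g1], expr[g2])
        if abs(r) >= threshold:
            edges[(g1, g2)] = r
    return edges


def orient_with_tfs(edges, tfs):
    """(B) -> (C): keep pairs that involve a known TF, directed TF -> target."""
    directed = []
    for (g1, g2), r in edges.items():
        if g1 in tfs:
            directed.append((g1, g2, r))
        if g2 in tfs:
            directed.append((g2, g1, r))
    return directed


# Toy expression matrix: 4 hypothetical genes (m = 4) in 5 samples (n = 5).
expr = {
    "tf1": [1.0, 2.0, 3.0, 4.0, 5.0],
    "geneA": [2.1, 3.9, 6.2, 8.0, 9.9],  # rises with tf1
    "geneB": [5.0, 4.1, 2.9, 2.0, 1.1],  # falls as tf1 rises
    "geneC": [3.0, 1.0, 4.0, 1.0, 5.0],  # only weakly related
}
edges = coexpression_edges(expr, threshold=0.9)
network = orient_with_tfs(edges, tfs={"tf1"})
for source, target, r in network:
    print(f"{source} -> {target}  r = {r:.3f}")
```

In this toy case the geneA/geneB pair passes the coexpression cutoff but is dropped in the last step because neither gene is a listed TF, which is the kind of refinement panel (C) describes.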
Summary of some association measures used to quantify gene regulations.
| Measure | Full name | Symbol | Properties | Reference |
|---|---|---|---|---|
| Pearson | Pearson's | | Linear, widely used, no parameter, coeff. ∈ [−1, 1] | Pearson |
| Spearman | Spearman's | ρ | Monotonic, rank-based, no parameter, coeff. ∈ [−1, 1] | Spearman |
| Kendall | Kendall's | τ | Monotonic, rank-based, no parameter, coeff. ∈ [−1, 1] | Kendall |
| Hoeffding | Hoeffding's | | Non-linear, rank-based, no parameter, coeff. ∈ [0, 1] | Hoeffding |
| Blomqvist | Blomqvist's | β | Monotonic, rank-based, no parameter, coeff. ∈ [−1, 1] | Blomqvist |
| Goodman | Goodman and Kruskal's | γ | Monotonic, cross classifications, rank-based, no parameter, coeff. ∈ [−1, 1] | Goodman and Kruskal |
| WWH | Wang, Waterman, Huang's | | Monotonic, rank-based, no parameter, coeff. ∈ [0, +∞) | Wang et al. |
| MI | Mutual information | | Non-linear, entropy-based, no parameter, coeff. ∈ [0, +∞) | Shannon |
| MIC | Maximal information coefficient | | Non-linear, entropy-based, 1 parameter, coeff. ∈ [0, 1] | Reshef et al. |
| Wilks | Wilks' | | Linear, covariance-based, no parameter, coeff. ∈ [0, 1] | Wilks |
| KCCA | Kernel canonical correlation analysis | | Non-linear, covariance-based, 1 parameter, coeff. ∈ [0, 1] | Bach and Jordan |
| dCor | Distance correlation | | Non-linear, covariance-based, 1 parameter, coeff. ∈ [0, 1] | Szekely and Rizzo |
| CMMD | Copula-based maximum mean discrepancy | | Non-linear, copula-based, 1 parameter, coeff. ∈ [0, 1] | Poczos et al. |
| RDC | Randomized dependence coefficient | | Non-linear, copula-based, 2 parameters, coeff. ∈ [0, 1] | Lopez-Paz et al. |
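To make the "linear", "monotonic", and "non-linear" labels in the table concrete, the sketch below implements four of the listed measures in plain Python and applies them to two toy relationships. It is an illustration under stated assumptions rather than the code used in the study: Kendall's coefficient is computed as tau-a (no tie correction), and distance correlation uses plain absolute differences, that is, its single parameter (the distance exponent) is assumed to be 1.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation: strength of a linear relationship."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy) if sx > 0 and sy > 0 else 0.0


def ranks(v):
    """Ranks 1..n, with tied values sharing their average rank."""
    order = sorted(range(len(v)), key=lambda i: v[i])
    r = [0.0] * len(v)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and v[order[j + 1]] == v[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            r[order[k]] = avg
        i = j + 1
    return r


def spearman(x, y):
    """Spearman's rho: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


def kendall(x, y):
    """Kendall's tau-a: (concordant - discordant) / number of pairs."""
    n = len(x)
    s = 0
    for i in range(n):
        for j in range(i + 1, n):
            prod = (x[i] - x[j]) * (y[i] - y[j])
            s += (prod > 0) - (prod < 0)
    return s / (n * (n - 1) / 2)


def dcor(x, y):
    """Sample distance correlation with distance exponent 1 (assumed)."""
    n = len(x)

    def centered(v):
        # Double-centered pairwise distance matrix.
        d = [[abs(a - b) for b in v] for a in v]
        row = [sum(r) / n for r in d]
        grand = sum(row) / n
        return [[d[i][j] - row[i] - row[j] + grand for j in range(n)]
                for i in range(n)]

    a, b = centered(x), centered(y)
    pairs = [(i, j) for i in range(n) for j in range(n)]
    dcov2 = sum(a[i][j] * b[i][j] for i, j in pairs) / n ** 2
    dvarx = sum(a[i][j] ** 2 for i, j in pairs) / n ** 2
    dvary = sum(b[i][j] ** 2 for i, j in pairs) / n ** 2
    denom = sqrt(dvarx * dvary)
    return sqrt(max(dcov2, 0.0) / denom) if denom > 0 else 0.0


x = [-3, -2, -1, 0, 1, 2, 3]
cubic = [v ** 3 for v in x]   # monotonic but not linear
square = [v ** 2 for v in x]  # dependent but not monotonic

for name, y in (("cubic", cubic), ("square", square)):
    print(name, round(pearson(x, y), 3), round(spearman(x, y), 3),
          round(kendall(x, y), 3), round(dcor(x, y), 3))
```

On the cubic data the rank-based measures reach 1 while Pearson stays below it; on the squared data Pearson, Spearman, and Kendall are all zero, and only distance correlation reports the dependence.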
Figure 2. The performance of different association measures in inferring the 10-node regulatory network of the DREAM challenges. (A) ROC curves of the 14 association measures with maximum AUC in the four datasets. (B) Box plots of the AUC of the 14 association measures.
The performance details of inferring benchmark gene regulatory networks by 14 association measures.
| Pearson | 10 | 0.500 ± 0.093 | 0.545 ± 0.166 | 0.506 ± 0.098 | 0.518 ± 0.121 | 0.030 ± 0.162 | 0.592 ± 0.048 |
| | 50 | 0.536 ± 0.102 | 0.510 ± 0.121 | 0.535 ± 0.099 | 0.507 ± 0.074 | 0.014 ± 0.044 | 0.554 ± 0.027 |
| | 100 | 0.531 ± 0.047 | 0.487 ± 0.078 | 0.530 ± 0.046 | 0.504 ± 0.048 | 0.004 ± 0.021 | 0.536 ± 0.021 |
| Spearman | 10 | 0.617 ± 0.162 | 0.477 ± 0.155 | 0.600 ± 0.150 | 0.526 ± 0.141 | 0.074 ± 0.191 | 0.574 ± 0.055 |
| | 50 | 0.511 ± 0.083 | 0.504 ± 0.071 | 0.510 ± 0.081 | 0.502 ± 0.059 | 0.005 ± 0.036 | 0.538 ± 0.031 |
| | 100 | 0.501 ± 0.055 | 0.506 ± 0.086 | 0.501 ± 0.053 | 0.497 ± 0.043 | 0.002 ± 0.019 | 0.533 ± 0.025 |
| Kendall | 10 | 0.601 ± 0.192 | 0.500 ± 0.117 | 0.589 ± 0.175 | 0.536 ± 0.125 | 0.082 ± 0.198 | 0.574 ± 0.057 |
| | 50 | 0.499 ± 0.098 | 0.518 ± 0.083 | 0.500 ± 0.095 | 0.498 ± 0.053 | 0.005 ± 0.034 | 0.536 ± 0.031 |
| | 100 | 0.509 ± 0.054 | 0.503 ± 0.085 | 0.509 ± 0.053 | 0.499 ± 0.040 | 0.003 ± 0.017 | 0.532 ± 0.025 |
| Hoeffding | 10 | 0.519 ± 0.591 | 0.591 ± 0.091 | 0.528 ± 0.080 | 0.544 ± 0.042 | 0.073 ± 0.062 | 0.539 ± 0.039 |
| | 50 | 0.507 ± 0.072 | 0.494 ± 0.102 | 0.507 ± 0.070 | 0.492 ± 0.064 | 0.00006 ± 0.038 | 0.544 ± 0.032 |
| | 100 | 0.504 ± 0.071 | 0.523 ± 0.061 | 0.504 ± 0.069 | 0.508 ± 0.042 | 0.006 ± 0.018 | 0.535 ± 0.025 |
| Blomqvist | 10 | 0.563 ± 0.069 | 0.409 ± 0.189 | 0.544 ± 0.060 | 0.451 ± 0.136 | −0.019 ± 0.125 | 0.570 ± 0.030 |
| | 50 | 0.457 ± 0.126 | 0.496 ± 0.134 | 0.458 ± 0.120 | 0.444 ± 0.069 | −0.016 ± 0.028 | 0.535 ± 0.030 |
| | 100 | 0.550 ± 0.066 | 0.583 ± 0.056 | 0.551 ± 0.065 | 0.560 ± 0.020 | 0.030 ± 0.008 | 0.574 ± 0.022 |
| Goodman | 10 | 0.411 ± 0.130 | 0.500 ± 0.053 | 0.422 ± 0.073 | 0.437 ± 0.073 | −0.063 ± 0.073 | 0.539 ± 0.067 |
| | 50 | 0.470 ± 0.086 | 0.454 ± 0.083 | 0.469 ± 0.082 | 0.448 ± 0.037 | −0.0246 ± 0.0194 | 0.531 ± 0.026 |
| | 100 | 0.531 ± 0.068 | 0.529 ± 0.059 | 0.531 ± 0.067 | 0.524 ± 0.027 | 0.014 ± 0.011 | 0.527 ± 0.018 |
| WWH | 10 | 0.411 ± 0.248 | 0.591 ± 0.174 | 0.433 ± 0.200 | 0.416 ± 0.148 | −0.006 ± 0.103 | 0.569 ± 0.069 |
| | 50 | 0.352 ± 0.116 | 0.660 ± 0.099 | 0.360 ± 0.111 | 0.437 ± 0.083 | 0.003 ± 0.019 | 0.532 ± 0.016 |
| | 100 | 0.392 ± 0.137 | 0.619 ± 0.145 | 0.395 ± 0.134 | 0.442 ± 0.070 | 0.003 ± 0.009 | 0.522 ± 0.018 |
| MI | 10 | 0.557 ± 0.149 | 0.409 ± 0.241 | 0.539 ± 0.111 | 0.416 ± 0.115 | −0.022 ± 0.111 | 0.534 ± 0.041 |
| | 50 | 0.470 ± 0.100 | 0.443 ± 0.081 | 0.470 ± 0.098 | 0.448 ± 0.069 | −0.028 ± 0.043 | 0.569 ± 0.046 |
| | 100 | 0.468 ± 0.081 | 0.471 ± 0.069 | 0.468 ± 0.079 | 0.462 ± 0.046 | −0.014 ± 0.020 | 0.544 ± 0.034 |
| MIC | 10 | 0.500 ± 0.051 | 0.636 ± 0.196 | 0.517 ± 0.042 | 0.547 ± 0.066 | 0.090 ± 0.121 | 0.573 ± 0.062 |
| | 50 | 0.515 ± 0.120 | 0.494 ± 0.084 | 0.515 ± 0.116 | 0.492 ± 0.070 | 0.003 ± 0.044 | 0.551 ± 0.031 |
| | 100 | 0.510 ± 0.058 | 0.502 ± 0.071 | 0.510 ± 0.057 | 0.501 ± 0.038 | 0.003 ± 0.017 | 0.531 ± 0.024 |
| Wilks | 10 | 0.522 ± 0.113 | 0.477 ± 0.087 | 0.517 ± 0.109 | 0.498 ± 0.098 | 0.0004 ± 0.13 | 0.592 ± 0.048 |
| | 50 | 0.536 ± 0.102 | 0.509 ± 0.120 | 0.536 ± 0.099 | 0.507 ± 0.073 | 0.014 ± 0.044 | 0.554 ± 0.027 |
| | 100 | 0.523 ± 0.050 | 0.502 ± 0.080 | 0.523 ± 0.049 | 0.508 ± 0.048 | 0.006 ± 0.021 | 0.538 ± 0.025 |
| KCCA | 10 | 0.472 ± 0.267 | 0.432 ± 0.202 | 0.467 ± 0.231 | 0.393 ± 0.168 | −0.067 ± 0.219 | 0.623 ± 0.083 |
| | 50 | 0.442 ± 0.121 | 0.464 ± 0.119 | 0.442 ± 0.117 | 0.428 ± 0.070 | −0.031 ± 0.037 | 0.541 ± 0.058 |
| | 100 | 0.453 ± 0.100 | 0.502 ± 0.090 | 0.454 ± 0.098 | 0.462 ± 0.058 | −0.011 ± 0.024 | 0.541 ± 0.036 |
| dCor | 10 | 0.506 ± 0.061 | 0.545 ± 0.166 | 0.511 ± 0.069 | 0.520 ± 0.102 | 0.034 ± 0.140 | 0.573 ± 0.060 |
| | 50 | 0.529 ± 0.084 | 0.513 ± 0.103 | 0.529 ± 0.082 | 0.512 ± 0.067 | 0.014 ± 0.042 | 0.556 ± 0.031 |
| | 100 | 0.514 ± 0.061 | 0.510 ± 0.091 | 0.514 ± 0.060 | 0.505 ± 0.049 | 0.006 ± 0.021 | 0.538 ± 0.025 |
| CMMD | 10 | 0.573 ± 0.201 | 0.545 ± 0.129 | 0.569 ± 0.176 | 0.540 ± 0.112 | 0.085 ± 0.164 | 0.611 ± 0.066 |
| | 50 | 0.508 ± 0.081 | 0.491 ± 0.088 | 0.508 ± 0.079 | 0.494 ± 0.065 | −0.00006 ± 0.041 | 0.547 ± 0.031 |
| | 100 | 0.512 ± 0.071 | 0.505 ± 0.068 | 0.512 ± 0.070 | 0.503 ± 0.044 | 0.004 ± 0.019 | 0.532 ± 0.028 |
| RDC | 10 | 0.522 ± 0.147 | 0.568 ± 0.227 | 0.528 ± 0.139 | 0.527 ± 0.143 | 0.062 ± 0.203 | 0.599 ± 0.038 |
| | 50 | 0.518 ± 0.085 | 0.522 ± 0.076 | 0.518 ± 0.083 | 0.515 ± 0.076 | 0.013 ± 0.039 | 0.551 ± 0.032 |
| | 100 | 0.517 ± 0.070 | 0.515 ± 0.042 | 0.517 ± 0.069 | 0.051 ± 0.04 | 0.007 ± 0.018 | 0.534 ± 0.026 |
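As a rough illustration of how an association measure can be scored against a benchmark network, the sketch below computes the AUC (the metric named in the Figure 2 and Figure 3 captions) directly from ranked association scores. The Matthews correlation coefficient is included only as an assumed example of a threshold-based score, since the column labels of the table are not available here. The scores, labels, and the 0.5 cutoff are hypothetical.

```python
from math import sqrt


def auc(scores, labels):
    """AUC as the probability that a true edge outscores a non-edge.

    Ties between a true edge and a non-edge count as one half.
    """
    pos = [s for s, l in zip(scores, labels) if l]
    neg = [s for s, l in zip(scores, labels) if not l]
    if not pos or not neg:
        raise ValueError("need at least one true edge and one non-edge")
    wins = sum((p > q) + 0.5 * (p == q) for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


def mcc(scores, labels, threshold):
    """Matthews correlation coefficient after thresholding the scores."""
    tp = fp = tn = fn = 0
    for s, l in zip(scores, labels):
        predicted = s >= threshold
        if predicted and l:
            tp += 1
        elif predicted and not l:
            fp += 1
        elif not predicted and l:
            fn += 1
        else:
            tn += 1
    denom = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0


# Hypothetical association scores for six candidate gene pairs and
# whether each pair is an edge in the benchmark network (1) or not (0).
scores = [0.9, 0.8, 0.7, 0.4, 0.3, 0.1]
labels = [1, 0, 1, 0, 1, 0]

example_auc = auc(scores, labels)
example_mcc = mcc(scores, labels, threshold=0.5)
print(f"AUC = {example_auc:.3f}, MCC = {example_mcc:.3f}")
```

An AUC near 0.5 means the measure ranks true edges no better than chance, which is the reference point for reading the values in the table above.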
Figure 3. The ranks of the 14 association measures in inferring regulatory networks of different node sizes. The numbers in the color blocks are the ranks of the corresponding association measures by mean AUC in these benchmark networks.
Figure 4. The reconstructed gene coexpression regulatory network during HCV infection. (A) The gene association network constructed by the PCC-based method. Isolated genes are not shown. (B) The overlap of the gene regulations inferred by four association measures, i.e., Pearson, MI, KCCA, and dCor.
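The overlap in panel (B) amounts to partitioning the inferred gene pairs by the set of measures that report them. A minimal sketch of that bookkeeping follows, assuming undirected gene pairs. The edge lists and gene names (`g1` to `g4`) are invented for illustration and do not come from the HCV data.

```python
def normalize(edges):
    """Treat (a, b) and (b, a) as the same undirected gene pair."""
    return {tuple(sorted(e)) for e in edges}


def overlap_regions(edge_sets):
    """Group each edge by the exact set of measures that inferred it."""
    regions = {}
    for edge in set().union(*edge_sets.values()):
        found_by = frozenset(m for m, es in edge_sets.items() if edge in es)
        regions.setdefault(found_by, set()).add(edge)
    return regions


# Hypothetical inferred edges for the four measures named in Figure 4(B).
edge_sets = {
    "Pearson": normalize([("g1", "g2"), ("g2", "g3"), ("g1", "g4")]),
    "MI": normalize([("g2", "g1"), ("g3", "g4")]),
    "KCCA": normalize([("g1", "g2"), ("g2", "g3")]),
    "dCor": normalize([("g1", "g2"), ("g1", "g4")]),
}
regions = overlap_regions(edge_sets)
for measures, found in sorted(regions.items(), key=lambda kv: -len(kv[0])):
    print(sorted(measures), sorted(found))
```

Each key of `regions` corresponds to one region of a four-set Venn diagram, so the region sizes can be read off with `len`.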