Changyong Zhao, Xiaoman Li, Haiyan Hu.
Abstract
The identification of enhancer-target gene (ETG) pairs is vital for understanding gene transcriptional regulation. Experimental approaches such as Hi-C have generated valuable resources of ETG pairs, and several computational methods have been developed to predict ETG interactions. Despite this progress, high-throughput experimental approaches remain costly, and existing computational approaches remain suboptimal and difficult to apply. Here we developed a motif-module-based approach called PETModule that predicts ETG pairs. In tests on eight human cell types and two mouse cell types, a large number of our predictions were supported by Hi-C and/or ChIA-PET experiments. Compared with two recently developed approaches for ETG pair prediction, PETModule had a much better recall, a similar or better F1 score, and a larger area under the receiver operating characteristic curve. The PETModule tool is freely available at http://hulab.ucf.edu/research/projects/PETModule/.
Year: 2016 PMID: 27436110 PMCID: PMC4951774 DOI: 10.1038/srep30043
Source DB: PubMed Journal: Sci Rep ISSN: 2045-2322 Impact factor: 4.379
PETModule prediction on three datasets with experimentally defined ETG pairs.
| Dataset | Enhancers | Known pairs | Predicted pairs | Known pairs predicted | Recall | Precision | ROC AUC | F1 score |
|---|---|---|---|---|---|---|---|---|
| ChIA-PET (K562) | 3300 | 4110 | 9244 | 1917 | 0.466 | 0.207 | 0.938 | 0.287 |
| ChIA-PET (MCF7) | 341 | 370 | 560 | 187 | 0.505 | 0.334 | 0.968 | 0.402 |
| Hi-C (IMR90) | 10920 | 19666 | 26467 | 7811 | 0.397 | 0.295 | 0.942 | 0.338 |
| Overall | 14561 | 24146 | 36271 | 9915 | 0.411 | 0.273 | 0.949 | 0.328 |
The known ETG pairs here do not contain any of the positive ETG pairs used for training.
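The Recall, Precision, and F1 columns can be reproduced from the count columns. The sketch below assumes the standard definitions (recall = known pairs predicted / known pairs, precision = known pairs predicted / predicted pairs, F1 = their harmonic mean); the record does not state the formulas, but these reproduce the ChIA-PET (K562) row. The function name `pair_metrics` is illustrative only.

```python
def pair_metrics(known: int, predicted: int, known_predicted: int) -> tuple[float, float, float]:
    """Return (recall, precision, F1) from ETG pair counts.

    Assumes the standard definitions; all counts must be positive.
    """
    recall = known_predicted / known          # fraction of known pairs recovered
    precision = known_predicted / predicted   # fraction of predictions that are known pairs
    f1 = 2 * precision * recall / (precision + recall)
    return recall, precision, f1


# Counts taken from the ChIA-PET (K562) row of the table above.
recall, precision, f1 = pair_metrics(known=4110, predicted=9244, known_predicted=1917)
print(f"recall={recall:.3f} precision={precision:.3f} F1={f1:.3f}")
# prints: recall=0.466 precision=0.207 F1=0.287
```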
PETModule prediction on IMR90 assessed with Hi-C contact matrices.
| Cutoff | #Enhancers with supporting Hi-C data | #Predicted ETG pairs | #Known ETG pairs | #Known ETG pairs predicted | Recall | Precision | ROC AUC | F1 score |
|---|---|---|---|---|---|---|---|---|
| 5 | 10881 | 23454 | 64075 | 17354 | 0.271 | 0.740 | 0.890 | 0.397 |
| 10 | 9918 | 22869 | 32837 | 12031 | 0.366 | 0.526 | 0.914 | 0.432 |
| 15 | 8433 | 21145 | 20319 | 8413 | 0.414 | 0.398 | 0.924 | 0.406 |
| 20 | 7069 | 19131 | 14024 | 6054 | 0.431 | 0.316 | 0.928 | 0.365 |
| 25 | 5945 | 17025 | 10219 | 4479 | 0.438 | 0.263 | 0.929 | 0.329 |
The cutoff specifies the minimum number of supporting Hi-C reads required to define known ETG pairs. The known ETG pairs here do not contain any of the positive ETG pairs used for training.
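As a rough illustration of how such a cutoff defines the known set, the sketch below keeps only pairs supported by at least `cutoff` reads. The enhancer IDs, gene names, read counts, and the helper `known_pairs` are all invented for the example and are not taken from the study's data or code.

```python
# Hypothetical Hi-C read counts per (enhancer, gene) pair; all values are invented.
hic_reads = {
    ("enh1", "GENE_A"): 27,
    ("enh1", "GENE_B"): 8,
    ("enh2", "GENE_C"): 12,
    ("enh3", "GENE_A"): 4,
}


def known_pairs(read_counts, cutoff):
    """Keep pairs supported by at least `cutoff` Hi-C reads."""
    return {pair for pair, reads in read_counts.items() if reads >= cutoff}


# Same cutoffs as in the table above: a stricter cutoff yields a smaller known set.
for cutoff in (5, 10, 15, 20, 25):
    print(cutoff, len(known_pairs(hic_reads, cutoff)))
```

In the table, the known set shrinks from 64075 pairs at cutoff 5 to 10219 at cutoff 25, which is accompanied by higher recall and lower precision.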
Figure 1. The importance of the four features ranked by four methods.
Prediction results on two mouse cells.
| Prediction Model | Dataset | Enhancers | Known pairs | Predicted pairs | Known pairs predicted | Recall | Precision | ROC AUC | F1 score |
|---|---|---|---|---|---|---|---|---|---|
| Human model | CH12 | 14195 | 24516 | 124102 | 16540 | 0.667 | 0.133 | 0.938 | 0.220 |
| Human model | macrophage | 387 | 387 | 3171 | 251 | 0.650 | 0.076 | 0.923 | 0.135 |
| Mouse model | CH12 | 14195 | 24516 | 64512 | 18252 | 0.744 | 0.283 | 0.968 | 0.410 |
| Mouse model | macrophage | 387 | 387 | 1468 | 271 | 0.700 | 0.167 | 0.961 | 0.269 |
Comparison of PETModule with IM-PET and PreSTIGE.
| Dataset | Tools | Enhancers | Known pairs | Predicted pairs | Known pairs predicted | Recall | Precision | ROC AUC | F1 score |
|---|---|---|---|---|---|---|---|---|---|
| ChIA-PET (K562) | PETModule | 694 | 907 | 2285 | 429 | 0.473 | 0.188 | 0.938 | 0.269 |
| ChIA-PET (K562) | IM-PET | 694 | 907 | 1872 | 278 | 0.307 | 0.149 | 0.88 | 0.200 |
| ChIA-PET (K562) | PreSTIGE | 694 | 907 | 1468 | 382 | 0.421 | 0.260 | 0.8 | 0.322 |
| ChIA-PET (MCF7) | PETModule | 94 | 107 | 282 | 61 | 0.570 | 0.216 | 0.968 | 0.314 |
| ChIA-PET (MCF7) | IM-PET | 94 | 107 | 191 | 33 | 0.308 | 0.173 | 0.88 | 0.221 |
| ChIA-PET (MCF7) | PreSTIGE | 94 | 107 | 178 | 62 | 0.579 | 0.348 | 0.8 | 0.435 |
| Hi-C (IMR90) | PETModule | 202 | 411 | 714 | 184 | 0.448 | 0.258 | 0.942 | 0.327 |
| Hi-C (IMR90) | IM-PET | 202 | 411 | 282 | 75 | 0.182 | 0.266 | 0.89 | 0.216 |
| Hi-C (IMR90) | PreSTIGE | 202 | 411 | 342 | 114 | 0.277 | 0.333 | 0.8 | 0.303 |
| Overall | PETModule | 990 | 1425 | 3281 | 674 | 0.473 | 0.205 | 0.949 | 0.286 |
| Overall | IM-PET | 990 | 1425 | 2345 | 386 | 0.271 | 0.164 | 0.88 | 0.205 |
| Overall | PreSTIGE | 990 | 1425 | 1988 | 558 | 0.392 | 0.281 | 0.8 | 0.327 |
Only the common enhancers with predictions by three methods were considered.
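Restricting the comparison to shared enhancers amounts to a set intersection over each tool's predicted enhancers, which is why every tool reports the same enhancer and known-pair counts within a dataset. The following is a minimal sketch under that reading; the enhancer IDs, gene names, and dictionary layout are hypothetical and do not come from any of the three tools.

```python
# Hypothetical predictions: tool name -> {enhancer ID: set of predicted target genes}.
predictions = {
    "PETModule": {"enh1": {"GENE_A"}, "enh2": {"GENE_B", "GENE_C"}, "enh3": {"GENE_D"}},
    "IM-PET": {"enh1": {"GENE_A"}, "enh2": {"GENE_C"}},
    "PreSTIGE": {"enh1": {"GENE_E"}, "enh2": {"GENE_B"}, "enh4": {"GENE_F"}},
}

# Enhancers for which every tool made at least one prediction.
common = set.intersection(*(set(preds) for preds in predictions.values()))

# Restrict each tool's predictions to the shared enhancers before scoring.
restricted = {
    tool: {enh: genes for enh, genes in preds.items() if enh in common}
    for tool, preds in predictions.items()
}

print(sorted(common))
# prints: ['enh1', 'enh2']
```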
Figure 2. The procedure to calculate the GO terms of an enhancer.