Hang Xu, Shijie Zhang, Xianfu Yi, Dariusz Plewczynski, Mulin Jun Li.
Abstract
Mechanisms underlying gene regulation are key to understanding how multicellular organisms with various cell types develop from the same genetic blueprint. Dynamic interactions between enhancers and genes have been revealed to play central roles in controlling gene transcription, but the determinants that link functional enhancer-promoter pairs remain elusive. A major challenge is the lack of reliable approaches to detect and verify functional enhancer-promoter interactions (EPIs). In this review, we summarize the current methods for detecting EPIs and describe how developing techniques facilitate EPI identification by assessing the merits and drawbacks of these methods. We also review recent state-of-the-art EPI prediction methods in terms of their rationale, data usage and characterization. Furthermore, we briefly discuss the evolved strategies for validating functional EPIs.
Keywords: Chromatin Conformation Capture; Chromatin loop; Computational method; Enhancer-promoter interaction; Machine learning; cis-Regulatory element
Year: 2020 PMID: 32226593 PMCID: PMC7090358 DOI: 10.1016/j.csbj.2020.02.013
Source DB: PubMed Journal: Comput Struct Biotechnol J ISSN: 2001-0370 Impact factor: 7.271
Fig. 1 Definition of functional EPIs. Functional EPIs require evidence from three aspects: (A) active status of enhancers and promoters; (B) spatial proximity between enhancer and promoter (though some studies have revealed exceptional cases); (C) context-dependent alteration of gene expression.
Fig. 2 Conventional workflow for detecting, predicting and validating functional EPIs. (A) Epigenomic features and nascent transcripts are the major characteristics of active CREs. (B) Functional EPIs require the enhancer and promoter to be spatially adjacent. (C) Candidate EPIs are routinely derived from the combination of active CREs and chromatin loops. (D) Computational methods are developed on candidate EPIs using either supervised or unsupervised algorithms. (E) Disrupting CREs and testing the effects on gene transcription are the main approaches to validating candidate EPIs.
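Step (C) of the workflow above, pairing active CREs that fall inside the two anchors of a chromatin loop, can be sketched with simple interval arithmetic. This is a minimal illustration, not the pipeline of any specific tool from this review; all coordinates and element names are invented.

```python
# Minimal sketch of deriving candidate EPIs: pair an enhancer and a
# promoter whenever their intervals sit in opposite anchors of a loop.
# Coordinates and names are illustrative, not taken from the review.

def overlaps(interval, anchor):
    """True if a CRE interval (start, end) overlaps a loop anchor."""
    return interval[0] < anchor[1] and anchor[0] < interval[1]

def candidate_epis(enhancers, promoters, loops):
    """Enumerate enhancer-promoter pairs bridged by a chromatin loop."""
    pairs = []
    for anchor_a, anchor_b in loops:
        for e_name, e_iv in enhancers.items():
            for p_name, p_iv in promoters.items():
                if (overlaps(e_iv, anchor_a) and overlaps(p_iv, anchor_b)) or \
                   (overlaps(e_iv, anchor_b) and overlaps(p_iv, anchor_a)):
                    pairs.append((e_name, p_name))
    return pairs

# Toy data: one loop whose anchors contain enhancer E1 and promoter P1.
enhancers = {"E1": (1000, 1500), "E2": (9000, 9400)}
promoters = {"P1": (52000, 52500)}
loops = [((900, 1600), (51800, 52600))]
print(candidate_epis(enhancers, promoters, loops))  # [('E1', 'P1')]
```

Real pipelines would operate per chromosome and use interval trees for speed, but the pairing logic is the same.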
Computational methods for EPI prediction.
| Tool | Year | Method category | Features | Algorithm | Links |
|---|---|---|---|---|---|
| Ernst et al. | 2011 | Correlation-based | Histone marks, TF binding | Pearson’s Correlation | |
| Thurman et al. | 2012 | Correlation-based | DHS | Pearson’s Correlation | |
| DRE-target | 2013 | Correlation-based | DHS, Sequence homology | Pearson’s Correlation | |
| Andersson et al. | 2014 | Correlation-based | CAGE | Pearson’s Correlation | |
| PreSTIGE | 2014 | Distance-based | Distance, Insulator | Linear Domain Models | |
| gkm-SVM | 2014 | Train Classifier | DNA | Support Vector Machine | |
| IM-PET | 2014 | Train Classifier | Histone marks, TF binding, DNA, RNA-seq | Random Forest | |
| ELMER | 2015 | Correlation-based | DNA methylation, RNA-seq | Pearson’s Correlation | |
| RIPPLE | 2015 | Train Classifier | Histone marks, TF binding, DHS, DNA-seq | Random Forest | |
| PEGASUS | 2015 | Correlation-based | Conservation | Linkage Scoring | |
| Basset | 2016 | Train Classifier | DNA | CNN | |
| TargetFinder | 2016 | Train Classifier | Histone marks, TF binding, DHS, CAGE | Gradient Tree Boosting | |
| PETModule | 2016 | Train Classifier | Histone marks, Conservation, Motif | Random Forest | |
| EpiTensor | 2016 | Decomposition-based | Histone marks | Tensor Decomposition | |
| JEME | 2017 | Regression-based | Histone marks, DHS, DNA methylation, eRNA | Linear Regression | |
| McEnhancer | 2017 | Train Classifier | DHS | Markov Chain Model | |
| SWIPE-NMF | 2017 | Decomposition-based | eQTL, DHS | Matrix Factorization | |
| EPIANN | 2017 | Train Classifier | DNA | CNN + Attention Model | |
| PEP | 2017 | Train Classifier | DNA | Gradient Tree Boosting | |
| CISD | 2017 | Train Classifier | MNase-seq | Logistic Regression | |
| FOCS | 2018 | Regression-based | DHS, CAGE, GRO-seq | Linear Regression | |
| Cicero | 2018 | Correlation-based | scATAC-seq | Graphical Lasso | |
| TransDecomp | 2018 | Decomposition-based | CAGE | Decomposition | |
| Rambutan | 2018 | Train Classifier | DNA, DHS | CNN | |
| SPEID | 2018 | Train Classifier | DNA | CNN | |
| 3DEpiLoop | 2018 | Train Classifier | Histone marks, TF binding | Random Forest | |
| EP2vec | 2018 | Train Classifier | DNA | Word2vec + Gradient Boosted Regression Trees | |
| DeepTACT | 2019 | Train Classifier | DHS, DNA | CNN + Attention Model | |
| C3D | 2019 | Correlation-based | DHS | Pearson’s Correlation | |
| EPIP | 2019 | Train Classifier | DHS, Histone marks | Adaboost | |
| DRAGON | 2019 | Polymer Simulation | Histone marks, TF binding | Maximum Entropy | |
| CHINN | 2019 | Train Classifier | DNA | CNN | |
| CT-FOCS | 2019 | Regression-based | DHS | Linear Mixed Effect Models | |
| HiC-Reg | 2019 | Regression-based | DHS, Histone marks, TF binding | Random Forests Regression | |
| ABC | 2019 | Distance-based | Distance, DHS, Histone marks | Activity-by-contact Model | |
| 3DPredictor | 2020 | Train Classifier | CAGE, CTCF | Gradient Boosting | |
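The ABC entry in the table scores each enhancer-gene pair as enhancer activity times enhancer-promoter contact, normalised over all candidate enhancers for that gene. A toy numerical sketch of that idea (the signal values below are invented, and real ABC uses DHS/H3K27ac activity and Hi-C contact frequencies):

```python
# Toy illustration of the activity-by-contact (ABC) idea:
# score_i = A_i * C_i / sum_j(A_j * C_j), where A is enhancer activity
# and C is contact frequency with the target promoter. Values invented.

def abc_scores(activities, contacts):
    """Normalised activity-by-contact score for each candidate enhancer."""
    products = {e: activities[e] * contacts[e] for e in activities}
    total = sum(products.values())
    return {e: p / total for e, p in products.items()}

activities = {"E1": 10.0, "E2": 2.0}   # e.g. DHS / H3K27ac signal
contacts   = {"E1": 0.5,  "E2": 0.25}  # e.g. normalised Hi-C contact
scores = abc_scores(activities, contacts)
print(scores)  # E1 dominates: 5.0 / 5.5, roughly 0.91
```

The normalisation makes scores comparable across genes with different numbers of candidate enhancers.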
Fig. 3 Overview of computational methods for EPI prediction. Strategies to predict EPIs fall into two major categories, unsupervised learning and supervised learning. Unsupervised approaches include: (A) distance-based methods, which assign enhancers to the nearest genes (some methods additionally restrict the regulatory scope); (B) correlation-based methods, which detect EPIs from the high correlation of chromatin features between enhancer and promoter across a panel of samples; (C) decomposition-based methods, which decompose a feature matrix/tensor into subspaces that capture the spatial features of the genome and thus can be used to detect EPIs. Supervised approaches include: (D) train-classifier methods, which build machine learning classifiers to distinguish positive EPIs from randomly selected negative sets; (E) regression-based methods, which measure the relationship between gene activity and enhancers by estimating the regulatory potential of enhancers for a specific gene. ML: machine learning, DL: deep learning.
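The correlation-based strategy in panel (B) can be made concrete with a short sketch: an enhancer-promoter link is called when their chromatin signals co-vary across many samples. The signal vectors below are invented, and the 0.95 cutoff is an arbitrary illustration, not a threshold used by any tool in the table.

```python
import math

# Sketch of a correlation-based EPI call: compute Pearson's r between
# an enhancer's and a promoter's signal (e.g. DHS) across a sample
# panel, and link the pair if r exceeds a chosen cutoff. Data invented.

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length vectors."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

# DHS-like signal across six samples (cell types / tissues).
enhancer_signal = [1.0, 3.0, 2.0, 8.0, 5.0, 7.0]
promoter_signal = [0.9, 3.2, 1.8, 7.5, 5.1, 6.8]  # tracks the enhancer
r = pearson(enhancer_signal, promoter_signal)
print(round(r, 3), "-> linked" if r > 0.95 else "-> not linked")
```

Methods in this category differ mainly in which signal they correlate (DHS, CAGE, DNA methylation, eRNA) and how they control for distance and multiple testing.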