Yu Ni 1,2,3, Linqi Fan 4, Miao Wang 2, Ning Zhang 1, Yongchun Zuo 5, Mingzhi Liao 6.
1. College of Life Sciences, Northwest A&F University, Taicheng Road, Yangling, 712100, China.
2. College of Information Engineering, Northwest A&F University, Taicheng Road, Yangling, 712100, China.
3. The State Key Laboratory of Reproductive Regulation and Breeding of Grassland Livestock, College of Life Sciences, Inner Mongolia University, Hohhot, 010070, China.
4. The 5th Paradigm Technology Co., Ltd, Yuanjiang Street, Shanghai, 200000, China.
5. The State Key Laboratory of Reproductive Regulation and Breeding of Grassland Livestock, College of Life Sciences, Inner Mongolia University, Hohhot, 010070, China. yczuo@imu.edu.cn.
6. College of Life Sciences, Northwest A&F University, Taicheng Road, Yangling, 712100, China. liaomingzhi83@163.com.
Abstract
MOTIVATION: Enhancer-promoter interactions (EPIs) are an essential part of the gene regulation process. However, detecting EPIs with traditional wet-lab techniques is time-consuming and expensive, so computational methods are valuable for understanding the mechanism of EPIs. A number of approaches have been proposed to address this problem, but existing methods still leave room for exploration and improvement.

METHODS: In this study, we propose a novel deep-learning model, EPI-Mind, to predict EPIs from sequence features. First, enhancer and promoter sequences were encoded with pre-trained DNA vectors. Then, a convolutional neural network (CNN) was used for coarse extraction of global and local features. Finally, a transformer mechanism was introduced to refine the extracted features. We first trained a model named EPI-Mind_spe, which predicts EPIs within a single cell line. To achieve general prediction across cell lines and further improve performance, a second round of training was carried out: the re-divided dataset was used to train a new model, EPI-Mind_gen, which predicts EPIs across different cell lines. To further improve accuracy, a third model, EPI-Mind_best, was trained using EPI-Mind_gen as a pre-trained model.

RESULTS: EPI-Mind_spe predicts EPIs with an average AUROC above 90% and an average AUPR above 70%, but only in a cell-line-specific manner. EPI-Mind_gen predicts EPIs across different cell lines, and its average AUROC is about 4.8% higher than that of EPI-Mind_spe. EPI-Mind_best is superior to the state-of-the-art predictors on benchmarking datasets: it achieved the best score on 5 of the 12 indicators (AUPR and AUROC values), more than earlier predictors.

CONCLUSION: This study proposes EPI-Mind, a deep-learning framework that predicts EPIs using only enhancer and promoter sequences. It may provide a new route to solving the EPI prediction problem.
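The two indicators reported above, AUROC and AUPR, can be illustrated with a minimal pure-Python sketch. This is not the authors' evaluation code: the function names, the toy labels, and the toy scores are all hypothetical, AUPR is approximated here by the step-wise average precision, and the sketch assumes no tied scores among the predictions.

```python
def auroc(labels, scores):
    """Fraction of (positive, negative) pairs ranked correctly; ties count 0.5."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def average_precision(labels, scores):
    """Mean of the precision values measured at the rank of each positive."""
    n_pos = sum(labels)
    if n_pos == 0:
        raise ValueError("need at least one positive example")
    # Rank predictions from highest to lowest score.
    ranked = sorted(zip(scores, labels), key=lambda pair: -pair[0])
    hits = 0
    total = 0.0
    for rank, (_, y) in enumerate(ranked, start=1):
        if y == 1:
            hits += 1
            total += hits / rank  # precision at this recall step
    return total / n_pos


# Hypothetical toy data: 1 = interacting pair, 0 = non-interacting pair.
labels = [1, 0, 1, 0, 0, 0, 0, 1]
scores = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2]

print("AUROC:", auroc(labels, scores))
print("AUPR (average precision):", average_precision(labels, scores))
```

Because EPI datasets typically contain far fewer interacting than non-interacting pairs, AUPR tends to be the stricter of the two measures, which is consistent with the gap between the AUROC and AUPR figures quoted above.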