
ES-ARCNN: Predicting enhancer strength by using data augmentation and residual convolutional neural network.

Ting-He Zhang, Mario Flores, Yufei Huang.

Abstract

Enhancers are non-coding DNA sequences bound by proteins called transcription factors. They function as distant regulators of gene transcription and participate in the development and maintenance of cell types and tissues. Because experimental validation of enhancers is expensive and time-consuming, many computational methods have been developed to predict enhancers and their strength. However, most of these methods still perform poorly when predicting enhancer strength. Here, we present a method to predict Enhancer Strength (i.e., strong or weak) by using Augmented data and a Residual Convolutional Neural Network (ES-ARCNN). To train ES-ARCNN, we applied two data augmentation techniques (reverse complement and shift) to enlarge a dataset of previously identified enhancers. We then trained a residual convolutional neural network on the augmented dataset. In a 10-fold cross-validation (CV) test, ES-ARCNN achieved the best performance among state-of-the-art methods, with an accuracy of 66.17%, and the data augmentation techniques effectively improved prediction performance. We further tested ES-ARCNN on an independent dataset and obtained an accuracy of 65.5%, an improvement of more than 4% over three existing methods. The results of the 10-fold CV and independent tests show that ES-ARCNN can effectively predict enhancer strength. A transcription factor binding site (TFBS) enrichment analysis shows that, from a mechanistic perspective, enhancer strength is associated with a higher density of important TFBSs in a tissue. A user-friendly web application is also provided at http://compgenomics.utsa.edu/ES-ARCNN/.
Copyright © 2021 Elsevier Inc. All rights reserved.
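
The two augmentation techniques named in the abstract can be illustrated with a minimal Python sketch. The abstract does not describe the implementation, so everything below is an assumption made for illustration: the helper names (`reverse_complement`, `shifted_windows`, `augment`), the interpretation of "shift" as cutting fixed-length windows at successive offsets from a longer genomic region, and the parameters `window`, `max_shift`, and `step`. It is not the authors' code.

```python
from typing import List

# Translation table mapping each base to its complement (upper and lower case).
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def shifted_windows(region: str, window: int, max_shift: int, step: int = 1) -> List[str]:
    """Cut fixed-length windows from `region` at offsets 0, step, ..., max_shift.

    Assumption: `region` contains the enhancer plus flanking sequence,
    so that shifted windows still overlap the original enhancer.
    """
    windows = []
    for offset in range(0, max_shift + 1, step):
        piece = region[offset:offset + window]
        if len(piece) == window:  # skip windows that run past the end
            windows.append(piece)
    return windows


def augment(region: str, window: int, max_shift: int, step: int = 1) -> List[str]:
    """Return every shifted window followed by its reverse complement."""
    augmented = []
    for piece in shifted_windows(region, window, max_shift, step):
        augmented.append(piece)
        augmented.append(reverse_complement(piece))
    return augmented


if __name__ == "__main__":
    # Toy region; real enhancer sequences are far longer.
    for sample in augment("ACGTACGGTA", window=6, max_shift=2):
        print(sample)
```

In a sketch like this, each augmented sequence would presumably keep the strong or weak label of the enhancer it was derived from, which is what lets augmentation enlarge the training set without new experimental data.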

Keywords:  Augmentation; Residual convolutional neural network; Reverse complement; Shift; Strong enhancer; Weak enhancer

Year: 2021   PMID: 33535061   DOI: 10.1016/j.ab.2021.114120

Source DB: PubMed   Journal: Anal Biochem   ISSN: 0003-2697   Impact factor: 3.365


