Deep4mC: systematic assessment and computational prediction for DNA N4-methylcytosine sites by deep learning.

Haodong Xu, Peilin Jia, Zhongming Zhao.

Abstract

DNA N4-methylcytosine (4mC) modification represents a novel form of epigenetic regulation. It is involved in various cellular processes, including DNA replication, cell cycle and gene expression, among others. In addition to experimental identification of 4mC sites, in silico prediction of 4mC sites in the genome has emerged as a promising alternative approach. In this study, we first reviewed the current progress in the computational prediction of 4mC sites and systematically evaluated the predictive capacity of eight conventional machine learning algorithms, as well as 12 feature types commonly used in previous studies, in six species. Using a representative benchmark dataset, we investigated the contribution of feature selection and the stacking approach to model construction, and found that feature optimization and proper reinforcement learning could improve the performance. We next recollected newly added 4mC sites in the six species' genomes and developed a novel deep learning-based 4mC site predictor, namely Deep4mC. Deep4mC applies convolutional neural networks with four representative features. For species with small numbers of samples, we extended our deep learning framework with a bootstrapping method. Our evaluation indicated that Deep4mC achieved high accuracy and robust performance, with average area under the curve (AUC) values greater than 0.9 in all species (range: 0.9005-0.9722). Compared with previous tools, Deep4mC improved AUC values by 10.14% to 46.21% in these six species. A user-friendly web server (https://bioinfo.uth.edu/Deep4mC) was built for predicting putative 4mC sites in a genome.
© The Author(s) 2020. Published by Oxford University Press. All rights reserved. For Permissions, please email: journals.permissions@oup.com.
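
The abstract reports performance as area under the ROC curve (AUC). As a minimal sketch of how that metric can be computed from predicted scores, the Python below uses the rank-based (Mann-Whitney U) formulation. The function name `auc_score` and the toy labels and scores are illustrative assumptions; this is not the authors' evaluation code and the numbers are not from the study.

```python
def auc_score(labels, scores):
    """AUC via the Mann-Whitney U statistic, using average ranks for ties.

    labels: iterable of 0/1 (1 = 4mC site, 0 = non-4mC site); hypothetical input.
    scores: iterable of predicted scores, higher meaning more likely positive.
    """
    pairs = sorted(zip(scores, labels))
    n = len(pairs)
    ranks = [0.0] * n
    i = 0
    while i < n:
        # Find the run of tied scores starting at position i.
        j = i
        while j + 1 < n and pairs[j + 1][0] == pairs[i][0]:
            j += 1
        avg_rank = (i + j) / 2 + 1  # 1-based average rank for the tied run
        for k in range(i, j + 1):
            ranks[k] = avg_rank
        i = j + 1

    n_pos = sum(1 for _, y in pairs if y == 1)
    n_neg = n - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("AUC needs at least one positive and one negative sample")

    pos_rank_sum = sum(r for r, (_, y) in zip(ranks, pairs) if y == 1)
    # U statistic of the positives, normalised by the number of pos/neg pairs.
    return (pos_rank_sum - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)


if __name__ == "__main__":
    # Toy example: 3 of the 4 positive/negative pairs are ranked correctly.
    print(auc_score([1, 1, 0, 0], [0.9, 0.4, 0.6, 0.2]))
```

An AUC of 0.5 corresponds to random ranking and 1.0 to perfect separation of positive and negative sites, which is the scale on which the reported range of 0.9005-0.9722 should be read.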

Keywords:  4mC; DNA N4-methylcytosine; deep learning; epigenetic modification; feature selection; methyladenine

Year:  2021        PMID: 32578842      PMCID: PMC8138820          DOI: 10.1093/bib/bbaa099

Source DB:  PubMed          Journal:  Brief Bioinform        ISSN: 1467-5463            Impact factor:   11.622

