
i6mA-stack: A stacking ensemble-based computational prediction of DNA N6-methyladenine (6mA) sites in the Rosaceae genome.

Jhabindra Khanal, Dae Young Lim, Hilal Tayara, Kil To Chong.

Abstract

DNA N6-methyladenine (6mA) is an epigenetic modification that plays a vital role in a variety of cellular processes in both eukaryotes and prokaryotes. Accurate information on 6mA sites in the Rosaceae genome may assist in understanding genomic 6mA distributions and various biological functions such as epigenetic inheritance. Various studies have shown that 6mA sites can be identified experimentally, but the procedures are time-consuming and costly. To overcome the drawbacks of experimental methods, we propose an accurate computational approach based on machine learning (ML) to identify 6mA sites in Rosa chinensis (R. chinensis) and Fragaria vesca (F. vesca). To improve the performance of the proposed model and to avoid overfitting, a recursive feature elimination with cross-validation (RFECV) strategy is used to select the optimal number of features (ONF) subset from five different DNA sequence encoding schemes: Binary Encoding (BE), Ring-Function-Hydrogen-Chemical Properties (RFHC), Electron-Ion-Interaction Pseudo Potentials of Nucleotides (EIIP), Dinucleotide Physicochemical Properties (DPCP), and Trinucleotide Physicochemical Properties (TPCP). Subsequently, we use the ONF subset to train a two-layer ML-based stacking model, yielding a bioinformatics tool named 'i6mA-stack'. This tool outperforms its peer tool in general and is currently available at http://nsclbio.jbnu.ac.kr/tools/i6mA-stack/.
Copyright © 2020 Elsevier Inc. All rights reserved.
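
As an illustration of two of the encoding schemes named in the abstract (BE and EIIP), the following is a minimal sketch and not the authors' implementation. The one-hot column order (A, C, G, T), the handling of unknown bases, and the function names `binary_encode`, `eiip_encode`, and `encode` are assumptions made here; the EIIP values are the commonly cited ones for the four nucleotides.

```python
# Assumed one-hot layout (A, C, G, T); the abstract does not give the column order.
BINARY = {
    "A": (1, 0, 0, 0),
    "C": (0, 1, 0, 0),
    "G": (0, 0, 1, 0),
    "T": (0, 0, 0, 1),
}

# Commonly cited electron-ion interaction pseudopotential values per nucleotide.
EIIP = {"A": 0.1260, "C": 0.1340, "G": 0.0806, "T": 0.1335}


def binary_encode(seq):
    """Flatten a DNA sequence into a one-hot vector of length 4 * len(seq)."""
    out = []
    for base in seq.upper():
        # Unknown bases (e.g. N) map to all zeros; this is an assumption.
        out.extend(BINARY.get(base, (0, 0, 0, 0)))
    return out


def eiip_encode(seq):
    """Map each nucleotide to its EIIP value (0.0 for unknown bases)."""
    return [EIIP.get(base, 0.0) for base in seq.upper()]


def encode(seq):
    """Concatenate the two encodings into a single feature vector."""
    return binary_encode(seq) + eiip_encode(seq)


features = encode("ACGTA")
print(len(features))
```

Concatenating per-scheme vectors in this way produces the kind of combined feature matrix from which a feature selection step could then pick a subset.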

Entities:  

Keywords:  DNA N6-methyladenine; Machine learning; RFECV; Sequence analysis; Stacking
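
The RFECV and stacking steps listed in the keywords could be sketched with scikit-learn roughly as follows. This is a hedged illustration under stated assumptions, not the i6mA-stack pipeline: the data are synthetic stand-ins for encoded sequences, and the choice of base learners (random forest, k-nearest neighbours), the logistic-regression meta-learner, and all hyperparameters are assumptions, since the abstract does not name them.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier, StackingClassifier
from sklearn.feature_selection import RFECV
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

# Synthetic stand-in for an encoded-sequence feature matrix (assumption).
X, y = make_classification(
    n_samples=300, n_features=40, n_informative=8, random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# Step 1: recursive feature elimination with cross-validation.
# Features are dropped two at a time; the subset size with the best
# cross-validated accuracy is kept.
selector = RFECV(
    estimator=LogisticRegression(max_iter=1000),
    step=2,
    cv=5,
    scoring="accuracy",
)
selector.fit(X_train, y_train)
X_train_sel = selector.transform(X_train)
X_test_sel = selector.transform(X_test)

# Step 2: two-layer stacking. Layer one holds the base learners; layer two
# is a meta-learner trained on their cross-validated predictions.
stack = StackingClassifier(
    estimators=[
        ("rf", RandomForestClassifier(n_estimators=50, random_state=0)),
        ("knn", KNeighborsClassifier(n_neighbors=5)),
    ],
    final_estimator=LogisticRegression(max_iter=1000),
    cv=5,
)
stack.fit(X_train_sel, y_train)

n_selected = selector.n_features_
accuracy = stack.score(X_test_sel, y_test)
print(n_selected, accuracy)
```

Fitting the selector on the training split only, and then applying the same mask to the test split, keeps the held-out data out of the feature selection step.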

Year:  2020        PMID: 33010390     DOI: 10.1016/j.ygeno.2020.09.054

Source DB:  PubMed          Journal:  Genomics        ISSN: 0888-7543            Impact factor:   5.736


  6 in total

1.  Using k-mer embeddings learned from a Skip-gram based neural network for building a cross-species DNA N6-methyladenine site prediction model.

Authors:  Trinh Trung Duong Nguyen; Van Ngu Trinh; Nguyen Quoc Khanh Le; Yu-Yen Ou
Journal:  Plant Mol Biol       Date:  2021-11-29       Impact factor: 4.076

Review 2.  DNA N6-Methyladenine Modification in Eukaryotic Genome.

Authors:  Hao Li; Ning Zhang; Yuechen Wang; Siyuan Xia; Yating Zhu; Chen Xing; Xuefeng Tian; Yinan Du
Journal:  Front Genet       Date:  2022-06-24       Impact factor: 4.772

3.  Identifying DNA N4-methylcytosine sites in the Rosaceae genome with a deep learning model relying on distributed feature representation.

Authors:  Jhabindra Khanal; Hilal Tayara; Quan Zou; Kil To Chong
Journal:  Comput Struct Biotechnol J       Date:  2021-03-19       Impact factor: 7.271

4.  i6mA-Vote: Cross-Species Identification of DNA N6-Methyladenine Sites in Plant Genomes Based on Ensemble Learning With Voting.

Authors:  Zhixia Teng; Zhengnan Zhao; Yanjuan Li; Zhen Tian; Maozu Guo; Qianzi Lu; Guohua Wang
Journal:  Front Plant Sci       Date:  2022-02-14       Impact factor: 5.753

5.  An Explainable Supervised Machine Learning Model for Predicting Respiratory Toxicity of Chemicals Using Optimal Molecular Descriptors.

Authors:  Keerthana Jaganathan; Hilal Tayara; Kil To Chong
Journal:  Pharmaceutics       Date:  2022-04-11       Impact factor: 6.525

6.  Robust proportional overlapping analysis for feature selection in binary classification within functional genomic experiments.

Authors:  Muhammad Hamraz; Naz Gul; Mushtaq Raza; Dost Muhammad Khan; Umair Khalil; Seema Zubair; Zardad Khan
Journal:  PeerJ Comput Sci       Date:  2021-06-01
