i6mA-VC: A Multi-Classifier Voting Method for the Computational Identification of DNA N6-methyladenine Sites.

Tian Xue, Shengli Zhang, Huijuan Qiao.

Abstract

DNA N6-methyladenine (6mA) is an essential epigenetic modification that cannot be neglected in the study of genetic regulation mechanisms. Efficient and accurate prediction of 6mA sites benefits the development of biological genetics. Biochemical experimental methods are considered time-consuming and laborious. Most established machine learning methods are built on a single dataset, and although some of them have achieved cross-species prediction, their results are not satisfactory. Therefore, we designed a novel statistical model called i6mA-VC to improve the prediction accuracy for 6mA sites. On the one hand, kmer and binary encoding are applied to extract features, and a gradient boosting decision tree (GBDT) embedded method is then applied as the feature selection strategy. On the other hand, DNA sequences are represented as vectors through the ring-function-hydrogen-chemical properties (RFHCP) feature extraction method, with ExtraTree as the feature selection strategy. After fusing the two optimal feature sets, a voting classifier based on GBDT, light gradient boosting machine (LightGBM) and multilayer perceptron classifier (MLPC) is constructed for final classification and prediction. With five-fold cross-validation, the accuracies on the Rice dataset and the M. musculus dataset are 0.888 and 0.967, respectively. A cross-species dataset is selected as the independent testing dataset, on which the accuracy reaches 0.848. Rigorous experiments demonstrate that the proposed predictor is convincing and applicable. The i6mA-VC predictor is expected to provide an effective way to recognize N6-methyladenine sites, and it should also help biological geneticists further study gene expression and DNA modification. In addition, an accessible web-server for i6mA-VC is available from http://www.zhanglab.site/ .
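
As an illustration only, the kmer and binary encoding steps and the final voting step named in the abstract can be sketched in plain Python. This is a minimal sketch under stated assumptions, not the authors' implementation: the function names (kmer_frequencies, binary_encoding, majority_vote), the choice of k = 2, the one-hot bit layout and the use of simple majority (hard) voting are all hypothetical, since the abstract does not specify them. Feature selection and the GBDT, LightGBM and MLPC models themselves are omitted.

```python
from itertools import product

ALPHABET = "ACGT"  # assumed nucleotide alphabet; other symbols (e.g. N) are ignored


def kmer_frequencies(seq, k=2):
    """Return normalized k-mer frequencies, ordered lexicographically over ACGT^k."""
    kmers = ["".join(p) for p in product(ALPHABET, repeat=k)]
    counts = dict.fromkeys(kmers, 0)
    total = len(seq) - k + 1  # number of sliding windows of length k
    if total <= 0:
        return [0.0] * len(kmers)
    for i in range(total):
        window = seq[i:i + k]
        if window in counts:  # skip windows containing ambiguous bases
            counts[window] += 1
    return [counts[m] / total for m in kmers]


def binary_encoding(seq):
    """One-hot encode each base as 4 bits (assumed layout: A, C, G, T)."""
    one_hot = {
        "A": [1, 0, 0, 0],
        "C": [0, 1, 0, 0],
        "G": [0, 0, 1, 0],
        "T": [0, 0, 0, 1],
    }
    vector = []
    for base in seq:
        vector.extend(one_hot.get(base, [0, 0, 0, 0]))  # unknown base -> all zeros
    return vector


def majority_vote(labels):
    """Hard voting over binary labels (0/1) from an odd number of classifiers."""
    return 1 if sum(labels) * 2 > len(labels) else 0


if __name__ == "__main__":
    sequence = "ACGTAC"  # toy sequence, not taken from any dataset in the paper
    features = kmer_frequencies(sequence, k=2) + binary_encoding(sequence)
    print(len(features))             # 16 k-mer frequencies + 4 bits per base
    print(majority_vote([1, 0, 1]))  # labels from three hypothetical classifiers
```

In this sketch the two encodings are simply concatenated into one feature vector; in practice each classifier would be trained on the selected features and their predicted labels passed to the voting step.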

Keywords:  DNA N6-methyladenine sites; Light gradient boosting machine; Multilayer perceptron classifier; Ring-function-hydrogen-chemical properties; Voting

Year:  2021        PMID: 33834381     DOI: 10.1007/s12539-021-00429-4

Source DB:  PubMed          Journal:  Interdiscip Sci        ISSN: 1867-1462            Impact factor:   2.233

