iRecSpot-EF: Effective sequence based features for recombination hotspot prediction.

Md Rafsan Jani, Md Toha Khan Mozlish, Sajid Ahmed, Niger Sultana Tahniat, Dewan Md Farid, Swakkhar Shatabda.

Abstract

Meiotic recombination plays an important role in genetic evolution. It introduces genetic variation, is a vital source of biodiversity, and acts as a driving force in evolutionary development. Local regions of chromosomes where recombination events tend to be concentrated are known as hotspots, and regions with relatively low recombination frequencies are called coldspots. Predicting hotspots and coldspots can shed light on the structure of recombination and on genome evolution. In this paper, we propose a predictor called iRecSpot-EF to predict recombination hotspots and coldspots. iRecSpot-EF uses a novel set of features extracted from genome sequences: we introduce the frequency of (l,k,p)-mers in the sequence as features. The proposed feature extraction method relies solely on the nucleotide sequences, making it cost-effective and robust. After feature extraction, the most informative features are selected using the AdaBoost algorithm, and logistic regression is used as the classification algorithm. iRecSpot-EF was tested on a standard benchmark dataset using cross-validation. It achieved an accuracy of 95.14% and an area under the receiver operating characteristic curve (auROC) of 0.985. The performance of iRecSpot-EF is significantly better than that of state-of-the-art methods. iRecSpot-EF is readily available for use at http://iRecSpot.pythonanywhere.com/server. All relevant code is available in an open repository at https://github.com/mrzResearchArena/iRecSpot.
Copyright © 2018 Elsevier Ltd. All rights reserved.
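
The abstract describes features derived purely from nucleotide sequences. As a rough illustration of that idea, the sketch below computes normalized frequencies of plain contiguous k-mers. It is not the authors' implementation: the (l,k,p)-mer definition is not given in this record and is not reproduced here, and the function names `kmer_frequencies` and `feature_vector` and the `max_k` parameter are hypothetical.

```python
from collections import Counter
from itertools import product

ALPHABET = "ACGT"


def kmer_frequencies(sequence, k):
    """Normalized frequencies of all 4**k contiguous k-mers, in lexicographic order.

    Assumption: plain contiguous k-mers only, not the (l,k,p)-mers of the paper.
    """
    sequence = sequence.upper()
    # Count every window of length k in the sequence.
    counts = Counter(sequence[i:i + k] for i in range(len(sequence) - k + 1))
    kmers = ["".join(p) for p in product(ALPHABET, repeat=k)]
    # Windows containing symbols outside ACGT (e.g. N) are ignored.
    total = sum(counts[m] for m in kmers)
    return [counts[m] / total if total else 0.0 for m in kmers]


def feature_vector(sequence, max_k=3):
    """Concatenate k-mer frequency vectors for k = 1 .. max_k."""
    features = []
    for k in range(1, max_k + 1):
        features.extend(kmer_frequencies(sequence, k))
    return features


vec = feature_vector("ACGTACGTAC", max_k=2)
print(len(vec))  # 4 mononucleotide + 16 dinucleotide features
```

Vectors of this kind could then be passed to a feature selection step and a classifier, such as the AdaBoost and logistic regression stages named in the abstract.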

Entities:  

Keywords:  Classification; DNA sequences; Feature selection; Novel feature extraction; Recombination hotspots; Web application

Year:  2018        PMID: 30336361     DOI: 10.1016/j.compbiomed.2018.10.005

Source DB:  PubMed          Journal:  Comput Biol Med        ISSN: 0010-4825            Impact factor:   4.589


  4 in total

1.  PyFeat: a Python-based effective feature generation tool for DNA, RNA and protein sequences.

Authors:  Rafsanjani Muhammod; Sajid Ahmed; Dewan Md Farid; Swakkhar Shatabda; Alok Sharma; Abdollah Dehzangi
Journal:  Bioinformatics       Date:  2019-10-01       Impact factor: 6.937

2.  A novel lncRNA-protein interaction prediction method based on deep forest with cascade forest structure.

Authors:  Xiongfei Tian; Ling Shen; Zhenwu Wang; Liqian Zhou; Lihong Peng
Journal:  Sci Rep       Date:  2021-09-23       Impact factor: 4.379

3.  ACP-MHCNN: an accurate multi-headed deep-convolutional neural network to predict anticancer peptides.

Authors:  Sajid Ahmed; Rafsanjani Muhammod; Zahid Hossain Khan; Sheikh Adilina; Alok Sharma; Swakkhar Shatabda; Abdollah Dehzangi
Journal:  Sci Rep       Date:  2021-12-08       Impact factor: 4.379

4.  Epigenetic Marks and Variation of Sequence-Based Information Along Genomic Regions Are Predictive of Recombination Hot/Cold Spots in Saccharomyces cerevisiae.

Authors:  Guoqing Liu; Shuangjian Song; Qiguo Zhang; Biyu Dong; Yu Sun; Guojun Liu; Xiujuan Zhao
Journal:  Front Genet       Date:  2021-06-29       Impact factor: 4.599

