Md Siddiqur Rahman, Usma Aktar, Md Rafsan Jani, Swakkhar Shatabda.
Abstract
In bacterial DNA, specific nucleotide sequences called promoters bind RNA polymerase. Sigma70 (σ70) promoters are among the most important because they are involved in most DNA regulatory functions. In this paper, we identify the most effective sequence-based features for predicting σ70 promoter sequences in a bacterial genome. Our proposed method uses both short-range and long-range DNA sequences. Features are extracted from multiple windows of different sizes within the DNA sequences, and a very small number of effective features are then selected from this large extracted set. We call our prediction method iPro70-FMWin and have made it freely accessible online as a web application at http://ipro70.pythonanywhere.com/server for the convenience of researchers. We tested our method on a standard benchmark dataset. In the experiments, iPro70-FMWin achieved an area under the receiver operating characteristic curve of 0.959 and an accuracy of 90.57%, which significantly outperforms the state-of-the-art predictors.
Keywords: Feature selection; Multi-windowing; Prokaryote; Sequence-based features; promoter
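The multi-window idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' feature set or code: the k-mer frequency features, the function names `kmer_frequencies` and `multi_window_features`, the toy sequence, and the window boundaries are all assumptions chosen for illustration. The sketch only shows how one feature vector per window can be computed and concatenated, which is the kind of large candidate set a feature selection step would then reduce.

```python
from collections import Counter
from itertools import product


def kmer_frequencies(seq, k):
    """Return normalized k-mer frequencies, ordered over the alphabet ACGT."""
    kmers = ["".join(p) for p in product("ACGT", repeat=k)]
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    total = max(len(seq) - k + 1, 1)  # avoid division by zero on short windows
    return [counts[m] / total for m in kmers]


def multi_window_features(seq, windows, k=2):
    """Concatenate k-mer frequency vectors computed on each (start, end) window."""
    features = []
    for start, end in windows:
        features.extend(kmer_frequencies(seq[start:end], k))
    return features


# Hypothetical toy sequence and window boundaries (not taken from the paper).
seq = "ACGT" * 20 + "A"
windows = [(0, len(seq)), (0, 40), (40, len(seq))]
features = multi_window_features(seq, windows, k=2)
print(len(features))  # 3 windows x 16 dinucleotides
```

Each window contributes 4^k values, so the vector grows quickly as windows and k-mer sizes are added. That growth is why selecting a small subset of features, as the abstract describes, matters in practice.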
Year: 2018 PMID: 30187132 DOI: 10.1007/s00438-018-1487-5
Source DB: PubMed Journal: Mol Genet Genomics ISSN: 1617-4623 Impact factor: 3.291