PyFeat: a Python-based effective feature generation tool for DNA, RNA and protein sequences.

Rafsanjani Muhammod¹, Sajid Ahmed¹, Dewan Md Farid¹, Swakkhar Shatabda¹, Alok Sharma²,³,⁴, Abdollah Dehzangi⁵.

Abstract

MOTIVATION: Extracting a useful feature set that contains significant discriminatory information is a critical step in effectively presenting sequence data to predict the structure, function, interaction and expression of proteins, DNAs and RNAs. In addition, filtering features with significant information and avoiding sparsity in the extracted features requires efficient feature selection techniques. Here we present PyFeat, a practical and easy-to-use toolkit implemented in Python for extracting various features from proteins, DNAs and RNAs. To build PyFeat, we mainly focused on extracting features that capture information about the interaction of neighboring residues, in order to provide more local information. We then employ the AdaBoost technique to select features with maximum discriminatory information. In this way, we can significantly reduce the number of extracted features and enable PyFeat to represent the combination of effective features from large neighboring residues. As a result, PyFeat is able to extract features using 13 different techniques and represent a context-free combination of effective features. The source code for the PyFeat standalone toolkit and the employed benchmarks, together with a comprehensive user manual explaining its system and workflow step by step, are publicly available.
RESULTS: https://github.com/mrzResearchArena/PyFeat/blob/master/RESULTS.md.
AVAILABILITY AND IMPLEMENTATION: Toolkit, source code and manual to use PyFeat: https://github.com/mrzResearchArena/PyFeat/.
SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
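
The abstract describes features that capture the interaction of neighboring residues. As a rough illustration of that idea only, the sketch below counts pairs of residues separated by a fixed gap and normalizes the counts. It is not taken from the PyFeat source code: the function name `gapped_dimer_features`, its parameters and the feature naming scheme are hypothetical, and the AdaBoost-based selection step mentioned in the abstract is not shown.

```python
from collections import Counter
from itertools import product


def gapped_dimer_features(sequence, alphabet="ACGT", max_gap=2):
    """Hypothetical helper: normalized counts of residue pairs X..Y
    separated by 0..max_gap positions. Not PyFeat's actual API."""
    sequence = sequence.upper()
    features = {}
    for gap in range(max_gap + 1):
        step = gap + 1
        # Count every pair of residues separated by exactly `gap` positions.
        counts = Counter(
            (sequence[i], sequence[i + step])
            for i in range(len(sequence) - step)
        )
        # Only pairs made of alphabet symbols contribute to the total.
        total = sum(counts[pair] for pair in product(alphabet, repeat=2))
        for first, second in product(alphabet, repeat=2):
            # Feature name: the two residues with one "_" per gap position.
            name = f"{first}{'_' * gap}{second}"
            features[name] = counts[(first, second)] / total if total else 0.0
    return features


# Example: 16 pair features for each of the gaps 0 and 1.
features = gapped_dimer_features("ACGTACGT", max_gap=1)
```

Each gap value contributes one feature per ordered residue pair, so the vector grows linearly with `max_gap`. This growth is the kind of feature expansion that the abstract's selection step is meant to reduce.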

Substances:
Proteins
RNA
DNA

Year: 2019 PMID: 30850831 PMCID: PMC6761934 DOI: 10.1093/bioinformatics/btz165

Source DB: PubMed Journal: Bioinformatics ISSN: 1367-4803 Impact factor: 6.937

10 in total

1. iRecSpot-EF: Effective sequence based features for recombination hotspot prediction.

Authors: Md Rafsan Jani; Md Toha Khan Mozlish; Sajid Ahmed; Niger Sultana Tahniat; Dewan Md Farid; Swakkhar Shatabda
Journal: Comput Biol Med Date: 2018-10-11 Impact factor: 4.589

2. Identifying Sigma70 Promoters with Novel Pseudo Nucleotide Composition.

Authors: Hao Lin; Zhi-Yong Liang; Hua Tang; Wei Chen
Journal: IEEE/ACM Trans Comput Biol Bioinform Date: 2017-02-08 Impact factor: 3.710

3. iFeature: a Python package and web server for features extraction and selection from protein and peptide sequences.

Authors: Zhen Chen; Pei Zhao; Fuyi Li; André Leier; Tatiana T Marquez-Lago; Yanan Wang; Geoffrey I Webb; A Ian Smith; Roger J Daly; Kuo-Chen Chou; Jiangning Song
Journal: Bioinformatics Date: 2018-07-15 Impact factor: 6.937

4. BioSeq-Analysis: a platform for DNA, RNA and protein sequence analysis based on machine learning approaches.

Authors: Bin Liu
Journal: Brief Bioinform Date: 2019-07-19 Impact factor: 11.622

5. Pse-in-One: a web server for generating various modes of pseudo components of DNA, RNA, and protein sequences.

Authors: Bin Liu; Fule Liu; Xiaolong Wang; Junjie Chen; Longyun Fang; Kuo-Chen Chou
Journal: Nucleic Acids Res Date: 2015-05-09 Impact factor: 16.971

6. propy: a tool to generate various modes of Chou's PseAAC.

Authors: Dong-Sheng Cao; Qing-Song Xu; Yi-Zeng Liang
Journal: Bioinformatics Date: 2013-02-19 Impact factor: 6.937

7. PAI: Predicting adenosine to inosine editing sites by using pseudo nucleotide compositions.

Authors: Wei Chen; Pengmian Feng; Hui Ding; Hao Lin
Journal: Sci Rep Date: 2016-10-11 Impact factor: 4.379

8. iDNAProt-ES: Identification of DNA-binding Proteins Using Evolutionary and Structural Features.

Authors: Shahana Yasmin Chowdhury; Swakkhar Shatabda; Abdollah Dehzangi
Journal: Sci Rep Date: 2017-11-02 Impact factor: 4.379

9. Pse-Analysis: a python package for DNA/RNA and protein/ peptide sequence analysis based on pseudo components and kernel methods.

Authors: Bin Liu; Hao Wu; Deyuan Zhang; Xiaolong Wang; Kuo-Chen Chou
Journal: Oncotarget Date: 2017-02-21

10. Enhanced regulatory sequence prediction using gapped k-mer features.

Authors: Mahmoud Ghandi; Dongwon Lee; Morteza Mohammad-Noori; Michael A Beer
Journal: PLoS Comput Biol Date: 2014-07-17 Impact factor: 4.475

17 in total

1. MathFeature: feature extraction package for DNA, RNA and protein sequences based on mathematical descriptors.

Authors: Robson P Bonidia; Douglas S Domingues; Danilo S Sanches; André C P L F de Carvalho
Journal: Brief Bioinform Date: 2022-01-17 Impact factor: 11.622

2. BioAutoML: automated feature engineering and metalearning to predict noncoding RNAs in bacteria.

Authors: Robson P Bonidia; Anderson P Avila Santos; Breno L S de Almeida; Peter F Stadler; Ulisses N da Rocha; Danilo S Sanches; André C P L F de Carvalho
Journal: Brief Bioinform Date: 2022-07-18 Impact factor: 13.994

3. TACOS: a novel approach for accurate prediction of cell-specific long noncoding RNAs subcellular localization.

Authors: Young-Jun Jeon; Md Mehedi Hasan; Hyun Woo Park; Ki Wook Lee; Balachandran Manavalan
Journal: Brief Bioinform Date: 2022-07-18 Impact factor: 13.994

4. BoT-Net: a lightweight bag of tricks-based neural network for efficient LncRNA-miRNA interaction prediction.

Authors: Muhammad Nabeel Asim; Muhammad Ali Ibrahim; Christoph Zehe; Johan Trygg; Andreas Dengel; Sheraz Ahmed
Journal: Interdiscip Sci Date: 2022-08-10 Impact factor: 3.492

5. Geographic encoding of transcripts enabled high-accuracy and isoform-aware deep learning of RNA methylation.

Authors: Daiyun Huang; Kunqi Chen; Bowen Song; Zhen Wei; Jionglong Su; Frans Coenen; João Pedro de Magalhães; Daniel J Rigden; Jia Meng
Journal: Nucleic Acids Res Date: 2022-10-14 Impact factor: 19.160

6. iFeatureOmega: an integrative platform for engineering, visualization and analysis of features from molecular sequences, structural and ligand data sets.

Authors: Zhen Chen; Xuhan Liu; Pei Zhao; Chen Li; Yanan Wang; Fuyi Li; Tatsuya Akutsu; Chris Bain; Robin B Gasser; Junzhou Li; Zuoren Yang; Xin Gao; Lukasz Kurgan; Jiangning Song
Journal: Nucleic Acids Res Date: 2022-05-07 Impact factor: 19.160