repDNA: a Python package to generate various modes of feature vectors for DNA sequences by incorporating user-defined physicochemical properties and sequence-order effects.

Bin Liu, Fule Liu, Longyun Fang, Xiaolong Wang, Kuo-Chen Chou.

Abstract

Developing powerful computational predictors of the biological features or attributes of DNA requires a suitable way to represent DNA sequences, and finding one is among the most challenging problems in the field. To facilitate studies of DNA and nucleotides, we developed a Python package called representations of DNAs (repDNA), which generates widely used features reflecting the physicochemical properties and sequence-order effects of DNA and nucleotides. The package provides 15 features in three groups. The first group calculates three nucleic acid composition features that describe local sequence information by means of k-mers; the second group calculates six autocorrelation features that describe the level of correlation between two oligonucleotides along a DNA sequence in terms of their specific physicochemical properties; the third group calculates six pseudo nucleotide composition features, which represent a DNA sequence as a discrete model or vector while still retaining considerable sequence-order information via the physicochemical properties of its constituent oligonucleotides. All of these features can be calculated from either the built-in or user-defined properties.
AVAILABILITY AND IMPLEMENTATION: The repDNA Python package is freely accessible to the public at http://bioinformatics.hitsz.edu.cn/repDNA/.
CONTACT: bliu@insun.hit.edu.cn or kcchou@gordonlifescience.org
SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
© The Author 2014. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com.
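
To make the first two feature groups in the abstract concrete, here is a minimal plain-Python sketch of a k-mer composition vector and of one common form of lag-based auto covariance over dinucleotides. It does not use or reproduce the repDNA API: the function names `kmer_composition` and `auto_covariance` are hypothetical, and `TOY_GC_COUNT` is an illustrative stand-in for a user-defined property table, not a real physicochemical index.

```python
from collections import Counter
from itertools import product

ALPHABET = "ACGT"


def kmer_composition(seq, k=2, normalize=True):
    """Return k-mer counts (or frequencies) in lexicographic k-mer order."""
    seq = seq.upper()
    kmers = ["".join(p) for p in product(ALPHABET, repeat=k)]
    # Slide a window of width k along the sequence and count each k-mer.
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    # Windows containing characters outside ACGT are simply not reported.
    vec = [counts.get(km, 0) for km in kmers]
    if normalize:
        total = sum(vec)
        vec = [c / total if total else 0.0 for c in vec]
    return vec


# Hypothetical user-defined property: number of G/C bases in each dinucleotide.
# This is a toy value table for illustration only.
TOY_GC_COUNT = {
    a + b: float((a in "GC") + (b in "GC"))
    for a in ALPHABET
    for b in ALPHABET
}


def auto_covariance(seq, prop, lag=1):
    """Auto covariance of one dinucleotide property at a given lag.

    Assumed form: the mean, over all positions i, of
    (P(d_i) - mean) * (P(d_{i+lag}) - mean), where d_i is the dinucleotide
    starting at position i. Raises KeyError for non-ACGT characters.
    """
    seq = seq.upper()
    values = [prop[seq[i:i + 2]] for i in range(len(seq) - 1)]
    n = len(values) - lag
    if n <= 0:
        raise ValueError("sequence too short for this lag")
    mean = sum(values) / len(values)
    return sum(
        (values[i] - mean) * (values[i + lag] - mean) for i in range(n)
    ) / n


if __name__ == "__main__":
    dna = "GACTGAACTGCACTTTGGTTTCATATTATTTGCTC"
    print(kmer_composition(dna, k=2))
    print(auto_covariance(dna, TOY_GC_COUNT, lag=2))
```

Concatenating such values over several properties and lags yields a fixed-length vector regardless of sequence length, which is the sense in which these descriptors retain sequence-order information in a discrete representation.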

Year:  2014        PMID: 25504848     DOI: 10.1093/bioinformatics/btu820

Source DB:  PubMed          Journal:  Bioinformatics        ISSN: 1367-4803            Impact factor:   6.937


  83 in total

1.  iRSpot-GAEnsC: identifing recombination spots via ensemble classifier and extending the concept of Chou's PseAAC to formulate DNA samples.

Authors:  Muhammad Kabir; Maqsood Hayat
Journal:  Mol Genet Genomics       Date:  2015-08-30       Impact factor: 3.291

2.  repRNA: a web server for generating various feature vectors of RNA sequences.

Authors:  Bin Liu; Fule Liu; Longyun Fang; Xiaolong Wang; Kuo-Chen Chou
Journal:  Mol Genet Genomics       Date:  2015-06-18       Impact factor: 3.291

3.  MULTiPly: a novel multi-layer predictor for discovering general and specific types of promoters.

Authors:  Meng Zhang; Fuyi Li; Tatiana T Marquez-Lago; André Leier; Cunshuo Fan; Chee Keong Kwoh; Kuo-Chen Chou; Jiangning Song; Cangzhi Jia
Journal:  Bioinformatics       Date:  2019-09-01       Impact factor: 6.937

4.  Evolutionary mechanism and biological functions of 8-mers containing CG dinucleotide in yeast.

Authors:  Yan Zheng; Hong Li; Yue Wang; Hu Meng; Qiang Zhang; Xiaoqing Zhao
Journal:  Chromosome Res       Date:  2017-02-09       Impact factor: 5.239

5.  Comparative analysis of housekeeping and tissue-selective genes in human based on network topologies and biological properties.

Authors:  Lei Yang; Shiyuan Wang; Meng Zhou; Xiaowen Chen; Yongchun Zuo; Dianjun Sun; Yingli Lv
Journal:  Mol Genet Genomics       Date:  2016-02-20       Impact factor: 3.291

6.  Protein remote homology detection by combining Chou's distance-pair pseudo amino acid composition and principal component analysis.

Authors:  Bin Liu; Junjie Chen; Xiaolong Wang
Journal:  Mol Genet Genomics       Date:  2015-04-21       Impact factor: 3.291

7.  iLearnPlus: a comprehensive and automated machine-learning platform for nucleic acid and protein sequence analysis, prediction and visualization.

Authors:  Zhen Chen; Pei Zhao; Chen Li; Fuyi Li; Dongxu Xiang; Yong-Zi Chen; Tatsuya Akutsu; Roger J Daly; Geoffrey I Webb; Quanzhi Zhao; Lukasz Kurgan; Jiangning Song
Journal:  Nucleic Acids Res       Date:  2021-06-04       Impact factor: 16.971

8.  Pse-in-One: a web server for generating various modes of pseudo components of DNA, RNA, and protein sequences.

Authors:  Bin Liu; Fule Liu; Xiaolong Wang; Junjie Chen; Longyun Fang; Kuo-Chen Chou
Journal:  Nucleic Acids Res       Date:  2015-05-09       Impact factor: 16.971

9.  EnANNDeep: An Ensemble-based lncRNA-protein Interaction Prediction Framework with Adaptive k-Nearest Neighbor Classifier and Deep Models.

Authors:  Lihong Peng; Jingwei Tan; Xiongfei Tian; Liqian Zhou
Journal:  Interdiscip Sci       Date:  2022-01-10       Impact factor: 2.233

10.  Comparison of genomic data via statistical distribution.

Authors:  Saeid Amiri; Ivo D Dinov
Journal:  J Theor Biol       Date:  2016-07-25       Impact factor: 2.691
