Sequence comparison and essential gene identification with new inter-nucleotide distance sequences.

Yushuang Li, Yanfen Lv, Xiaonan Li, Wenli Xiao, Chun Li.

Abstract

Four new inter-nucleotide distance sequences for a DNA sequence are defined. They differ from the ones presented by Afreixo et al. and overcome the irreversibility defect of the global inter-nucleotide distance sequence proposed by Nair and Mahalakshmi. Five basic statistical quantities are extracted from the (ordered) precise inter-nucleotide distance sequences to construct a 20-dimensional feature vector. This simple mathematical descriptor of a DNA sequence plays a crucial role in sequence comparison and essential gene identification. The Euclidean distance between feature vectors is used to compare the whole mitochondrial genomes of 18 eutherian mammals and 23 sequences of 16S ribosomal RNA, respectively. The derived phylogenetic trees agree well with several popular studies. Furthermore, using the feature vector as input, a support vector machine (SVM)-based method is developed to distinguish essential genes from non-essential genes in 5 bacteria. AUC values higher than some well-known results (minimum 0.7971, maximum 0.8751, average 0.8174) confirm the performance of the method.
Copyright © 2017 Elsevier Ltd. All rights reserved.
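
The pipeline summarized in the abstract (distance sequences, summary statistics, feature vector, Euclidean comparison) can be illustrated with a minimal sketch. The abstract does not define the four new distance sequences or name the five statistical quantities, so everything specific here is an assumption: the sketch uses the plain distance between consecutive occurrences of the same nucleotide, one sequence per base, and a hypothetical choice of statistics (minimum, maximum, mean, median, population standard deviation). It happens to yield 20 features as well, but it is not the authors' construction.

```python
import math
from statistics import mean, median, pstdev

NUCLEOTIDES = "ACGT"


def inter_nucleotide_distances(seq, base):
    """Gaps between consecutive occurrences of `base` in `seq`.

    Assumption: this is the simple per-nucleotide distance, not one of
    the four sequences defined in the paper.
    """
    positions = [i for i, ch in enumerate(seq.upper()) if ch == base]
    return [b - a for a, b in zip(positions, positions[1:])]


def summary(distances):
    """Five summary statistics (a hypothetical choice, see lead-in)."""
    if not distances:
        # A base occurring fewer than twice has no distances.
        return [0.0] * 5
    return [
        float(min(distances)),
        float(max(distances)),
        float(mean(distances)),
        float(median(distances)),
        float(pstdev(distances)),
    ]


def feature_vector(seq):
    """Concatenate the five statistics for A, C, G and T (4 x 5 = 20)."""
    vec = []
    for base in NUCLEOTIDES:
        vec.extend(summary(inter_nucleotide_distances(seq, base)))
    return vec


def euclidean(u, v):
    """Euclidean distance between two feature vectors."""
    return math.dist(u, v)


if __name__ == "__main__":
    # Toy sequences, invented for illustration only.
    a = feature_vector("ACGTACGTTGCA")
    b = feature_vector("ACGTTTGCAACG")
    print(len(a), euclidean(a, b))
```

A pairwise matrix of such distances could then feed a tree-building method, and the same vectors could serve as input to an SVM classifier; neither step is shown here.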

Entities:  

Keywords:  Essential gene identification; Feature vector; Inter-nucleotide distance sequence; Sequence comparison; Statistical quantity; Support vector machine

Year: 2017    PMID: 28137599    DOI: 10.1016/j.jtbi.2017.01.031

Source DB: PubMed    Journal: J Theor Biol    ISSN: 0022-5193    Impact factor: 2.691


  1 in total

1.  Sequence-based information-theoretic features for gene essentiality prediction.

Authors:  Dawit Nigatu; Patrick Sobetzko; Malik Yousef; Werner Henkel
Journal:  BMC Bioinformatics       Date:  2017-11-09       Impact factor: 3.169
