
A simple feature representation vector for phylogenetic analysis of DNA sequences.

Shuyan Ding, Qi Dai, Hongmei Liu, Tianming Wang.

Abstract

In this study, a simple 4^k-dimensional feature representation vector is proposed for reconstructing phylogenetic trees, where k is the word length. The elements of the vector characterize the relative difference between a biological sequence and a sequence generated by an independent random process. In addition, the variance of the vector obtained by averaging each column of the feature representation matrix is used to determine an appropriate word length. In our experiments, reliable results were always obtained at word lengths below 7, which appears to keep the computational complexity low. Phylogenetic trees of 24 transferrins and 48 Hepatitis E viruses reconstructed at word length 6 are in good agreement with previous studies, which shows that our method is efficient and powerful.
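The abstract does not give the exact formula, so the following is only a minimal sketch of the idea under stated assumptions: each of the 4^k words is scored as (observed frequency - expected frequency) / expected frequency, where the expected frequency under the independent random model is assumed to be the product of the single-nucleotide frequencies of the sequence. The function names `kmer_feature_vector` and `column_mean_variance` are hypothetical, and how the variance is turned into a choice of k is not stated in the abstract and is left to the caller.

```python
from collections import Counter
from itertools import product

ALPHABET = "ACGT"


def kmer_feature_vector(seq, k):
    """Return a list of 4**k relative deviations, one per k-word.

    Assumption (not from the abstract): the deviation of a word is
    (observed - expected) / expected, with `expected` computed as the
    product of single-base frequencies (independent random model).
    """
    seq = seq.upper()
    n_windows = len(seq) - k + 1
    if n_windows <= 0:
        raise ValueError("sequence is shorter than the word length k")

    # Single-base frequencies, ignoring any symbol outside A, C, G, T.
    base_counts = Counter(c for c in seq if c in ALPHABET)
    total = sum(base_counts.values())
    if total == 0:
        raise ValueError("sequence contains no A, C, G or T")
    base_freq = {b: base_counts[b] / total for b in ALPHABET}

    # Observed counts of every overlapping k-word.
    word_counts = Counter(seq[i:i + k] for i in range(n_windows))

    vector = []
    for letters in product(ALPHABET, repeat=k):  # fixed lexicographic order
        observed = word_counts["".join(letters)] / n_windows
        expected = 1.0
        for b in letters:
            expected *= base_freq[b]
        # Words that are impossible under the model are scored 0.0 here;
        # this convention is an assumption of the sketch.
        vector.append((observed - expected) / expected if expected > 0 else 0.0)
    return vector


def column_mean_variance(matrix):
    """Variance of the vector formed by averaging each column of `matrix`.

    `matrix` holds one feature vector per sequence (rows) for a fixed k.
    """
    col_means = [sum(col) / len(col) for col in zip(*matrix)]
    mean = sum(col_means) / len(col_means)
    return sum((m - mean) ** 2 for m in col_means) / len(col_means)
```

A caller could build one row per sequence with `kmer_feature_vector(seq, k)`, compute `column_mean_variance` for each candidate k below 7, and then compare the vectors with a distance measure of their choice to build a tree.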

Year: 2010    PMID: 20665972    DOI: 10.1016/j.jtbi.2010.05.027

Source DB: PubMed    Journal: J Theor Biol    ISSN: 0022-5193    Impact factor: 2.691


  4 in total

1.  DNA word analysis based on the distribution of the distances between symmetric words.

Authors:  Ana H M P Tavares; Armando J Pinho; Raquel M Silva; João M O S Rodrigues; Carlos A C Bastos; Paulo J S G Ferreira; Vera Afreixo
Journal:  Sci Rep       Date:  2017-04-07       Impact factor: 4.379

2.  Genome analysis with the conditional multinomial distribution profile.

Authors:  Guisong Chang; Tianming Wang
Journal:  J Theor Biol       Date:  2010-12-01       Impact factor: 2.691

3.  An Associated Representation Method for Defining Agricultural Cases in a Case-Based Reasoning System for Fast Case Retrieval.

Authors:  Zhaoyu Zhai; José-Fernán Martínez Ortega; Victoria Beltran; Néstor Lucas Martínez
Journal:  Sensors (Basel)       Date:  2019-11-22       Impact factor: 3.576

4.  A new graph-theoretic approach to determine the similarity of genome sequences based on nucleotide triplets.

Authors:  Subhram Das; Arijit Das; D K Bhattacharya; D N Tibarewala
Journal:  Genomics       Date:  2020-08-19       Impact factor: 5.736

