Fast and accurate genome comparison using genome images: The Extended Natural Vector Method.

Shaojun Pei, Wenhui Dong, Xiuqiong Chen, Rong Lucy He, Stephen S-T Yau.

Abstract

Numerical methods for genome comparison have long been important in bioinformatics. The Chaos Game Representation (CGR) is an effective genome sequence mapping technique that converts genome sequences into CGR images. To each CGR image we associate a vector called an Extended Natural Vector (ENV), which is based on the distribution of intensity values. This mapping produces a one-to-one correspondence between CGR images and their ENVs. We define the distance between two DNA sequences as the distance between their associated ENVs. We cluster and classify several datasets, including Influenza A viruses, Bacillus genomes, and Conoidea mitochondrial genomes, to build their phylogenetic trees. Results show that our method combining CGR and ENV (CGR-ENV) compares favorably in classification accuracy and efficiency against the multiple sequence alignment (MSA) method and other alignment-free methods. The research provides significant insights into the study of phylogeny, evolution, and efficient DNA comparison algorithms for large genomes.
Copyright © 2019 Elsevier Inc. All rights reserved.
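
The pipeline described in the abstract (sequence to CGR image, image to vector, vector distance) can be illustrated with a minimal Python sketch. This is not the authors' ENV definition, which the abstract does not spell out. The corner assignment for A, C, G, T, the grid size, the clipping level `max_level`, and the per-intensity summary (pixel count, mean row, mean column) are all assumptions chosen for illustration, and the function names are hypothetical.

```python
import math

# Corner assignment is a common CGR convention, assumed here (not from the abstract).
CORNERS = {"A": (0.0, 0.0), "C": (0.0, 1.0), "G": (1.0, 1.0), "T": (1.0, 0.0)}


def cgr_image(seq, size=8):
    """Return a size x size grid of CGR point counts (the 'intensity' image)."""
    grid = [[0] * size for _ in range(size)]
    x, y = 0.5, 0.5  # start at the centre of the unit square
    for base in seq.upper():
        if base not in CORNERS:
            continue  # skip ambiguous symbols such as N
        cx, cy = CORNERS[base]
        # Chaos game step: move halfway towards the corner of this base.
        x, y = (x + cx) / 2, (y + cy) / 2
        col = min(int(x * size), size - 1)
        row = min(int(y * size), size - 1)
        grid[row][col] += 1
    return grid


def intensity_vector(grid, max_level=4):
    """Simplified stand-in for an ENV (assumption): for each intensity level
    0..max_level (higher counts are clipped to max_level), record the number
    of pixels at that level and their mean row and mean column."""
    vec = []
    for level in range(max_level + 1):
        cells = [(r, c) for r, row in enumerate(grid)
                 for c, v in enumerate(row) if min(v, max_level) == level]
        n = len(cells)
        mean_r = sum(r for r, _ in cells) / n if n else 0.0
        mean_c = sum(c for _, c in cells) / n if n else 0.0
        vec.extend([float(n), mean_r, mean_c])
    return vec


def sequence_distance(seq_a, seq_b, size=8):
    """Euclidean distance between the two sequences' intensity vectors."""
    va = intensity_vector(cgr_image(seq_a, size))
    vb = intensity_vector(cgr_image(seq_b, size))
    return math.dist(va, vb)
```

A pairwise distance matrix built with `sequence_distance` could then be passed to a standard clustering or tree-building routine. The method in the paper may use additional terms in its vector; the sketch keeps only enough structure to show how a distance between sequences follows from a distance between image-derived vectors.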

Keywords:  Chaos game representation; Extended natural vector; Genome comparison

Year: 2019   PMID: 31563612   DOI: 10.1016/j.ympev.2019.106633

Source DB: PubMed   Journal: Mol Phylogenet Evol   ISSN: 1055-7903   Impact factor: 4.286


  3 in total

1. Chaos game representation and its applications in bioinformatics. (Review)

Authors: Hannah Franziska Löchel; Dominik Heider
Journal: Comput Struct Biotechnol J   Date: 2021-11-10   Impact factor: 7.271

2. Comparative analysis and prediction of nucleosome positioning using integrative feature representation and machine learning algorithms.

Authors: Guo-Sheng Han; Qi Li; Ying Li
Journal: BMC Bioinformatics   Date: 2021-06-02   Impact factor: 3.307

3. Alignment-free genomic sequence comparison using FCGR and signal processing.

Authors: Daniel Lichtblau
Journal: BMC Bioinformatics   Date: 2019-12-30   Impact factor: 3.169
