A new method to cluster genomes based on cumulative Fourier power spectrum.

Rui Dong, Ziyue Zhu, Changchuan Yin, Rong L He, Stephen S-T Yau.

Abstract

Analyzing phylogenetic relationships with mathematical methods has always been important in bioinformatics, because quantitative research can interpret raw biological data precisely. Multiple Sequence Alignment (MSA) is frequently used to analyze biological evolution, but it is very time-consuming. When the scale of the data is large, alignment methods cannot finish the calculation in a reasonable time. Therefore, we present a new method that uses moments of the cumulative Fourier power spectrum to cluster DNA sequences. Each sequence is translated into a vector in Euclidean space, and the distances between the vectors reflect the relationships between the sequences. The mapping between the spectra and the moment vector is one-to-one, which means that no information in the power spectra is lost during the calculation. We cluster and classify several datasets, including Influenza A, primate, and human rhinovirus (HRV) datasets, to build phylogenetic trees. Results show that the newly proposed cumulative Fourier power spectrum method is much faster and more accurate than MSA and another alignment-free method known as k-mer. The research provides new insights into the study of phylogeny, evolution, and efficient DNA comparison algorithms for large genomes. The computer programs for the cumulative Fourier power spectrum are available on GitHub (https://github.com/YaulabTsinghua/cumulative-Fourier-power-spectrum).
Copyright © 2018 Elsevier B.V. All rights reserved.
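
The pipeline described in the abstract (sequence, then power spectrum, then cumulative spectrum, then moment vector, then Euclidean distance) can be illustrated with a minimal sketch. The abstract does not give the formulas, so everything below is an assumption made for illustration: the four binary indicator sequences, the normalization by the final cumulative value, the averaged power moments, the choice of three moments per nucleotide, and the names `cumulative_power_spectrum`, `moment_vector`, and `pairwise_distances`. The exact definitions in the paper and in the authors' GitHub programs may differ. A naive O(N^2) DFT is used only to keep the sketch free of dependencies; an FFT library would be the practical choice.

```python
import cmath
import math
from itertools import accumulate

NUCLEOTIDES = "ACGT"


def cumulative_power_spectrum(seq, nucleotide):
    """Cumulative sums of the DFT power spectrum of one indicator sequence."""
    n = len(seq)
    if n < 2:
        raise ValueError("sequence must contain at least two bases")
    # Assumed encoding: 1 where the base matches, 0 elsewhere.
    indicator = [1.0 if base == nucleotide else 0.0 for base in seq.upper()]
    power = []
    for k in range(1, n):  # skip k = 0, which only counts the bases
        coeff = sum(
            x * cmath.exp(-2j * math.pi * k * t / n)
            for t, x in enumerate(indicator)
        )
        power.append(abs(coeff) ** 2)
    return list(accumulate(power))


def moment_vector(seq, n_moments=3):
    """Map a DNA sequence to a fixed-length vector (4 * n_moments entries)."""
    features = []
    for nucleotide in NUCLEOTIDES:
        cps = cumulative_power_spectrum(seq, nucleotide)
        total = cps[-1]
        # Assumed normalization: scale the cumulative spectrum into [0, 1].
        scaled = [c / total for c in cps] if total > 1e-12 else [0.0] * len(cps)
        for j in range(1, n_moments + 1):
            # Averaging makes sequences of different lengths comparable.
            features.append(sum(s ** j for s in scaled) / len(scaled))
    return features


def pairwise_distances(vectors):
    """Euclidean distance matrix between moment vectors."""
    return [[math.dist(u, v) for v in vectors] for u in vectors]


if __name__ == "__main__":
    # Toy sequences, not data from the paper.
    toy = ["ACGTACGTTGCA", "ACGTACGTTGCA", "AAAACCCCGGGGTTTTACGT"]
    for row in pairwise_distances([moment_vector(s) for s in toy]):
        print([round(d, 4) for d in row])
```

Because every sequence maps to a vector of the same length regardless of its own length, the resulting distance matrix can be passed to any standard clustering or tree-building routine without aligning the sequences first.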

Entities:  

Keywords:  Cumulative Fourier power spectrum; DNA sequences; Moment vectors; Phylogenetic trees

Year: 2018 | PMID: 29935353 | DOI: 10.1016/j.gene.2018.06.042

Source DB: PubMed | Journal: Gene | ISSN: 0378-1119 | Impact factor: 3.688


  5 in total

1.  Analysis of the Hosts and Transmission Paths of SARS-CoV-2 in the COVID-19 Outbreak.

Authors:  Rui Dong; Shaojun Pei; Changchuan Yin; Rong Lucy He; Stephen S-T Yau
Journal:  Genes (Basel)       Date:  2020-06-09       Impact factor: 4.096

2.  Full Chromosomal Relationships Between Populations and the Origin of Humans.

Authors:  Rui Dong; Shaojun Pei; Mengcen Guan; Shek-Chung Yau; Changchuan Yin; Rong L He; Stephen S-T Yau
Journal:  Front Genet       Date:  2022-02-02       Impact factor: 4.599

3.  Identification of HIV Rapid Mutations Using Differences in Nucleotide Distribution over Time.

Authors:  Nan Sun; Jie Yang; Stephen S-T Yau
Journal:  Genes (Basel)       Date:  2022-01-19       Impact factor: 4.096

4.  Context dependent prediction in DNA sequence using neural networks.

Authors:  Christian Grønbæk; Yuhu Liang; Desmond Elliott; Anders Krogh
Journal:  PeerJ       Date:  2022-09-20       Impact factor: 3.061

5.  Large-Scale Genome Comparison Based on Cumulative Fourier Power and Phase Spectra: Central Moment and Covariance Vector.

Authors:  Shaojun Pei; Rui Dong; Rong Lucy He; Stephen S-T Yau
Journal:  Comput Struct Biotechnol J       Date:  2019-07-11       Impact factor: 7.271
