Comparison of genomic data via statistical distribution.

Saeid Amiri, Ivo D Dinov.

Abstract

Sequence comparison has become an essential tool in bioinformatics because highly homologous sequences usually imply significant functional or structural similarity. Traditional sequence analysis techniques are based on preprocessing and alignment, which facilitate the measurement and quantitative characterization of genetic differences, variability, and complexity. However, recent developments in next-generation and whole-genome sequencing technologies give rise to new challenges related to measuring similarity and capturing rearrangements of large segments of the genome. This work illustrates several recently introduced methods for quantifying sequence distances and variability. Most alignment-free methods rely on counting words, which are small contiguous fragments of the genome. Our approach instead considers the locations of nucleotides in the sequences and relies more on appropriate statistical distributions. The results of this technique for comparing sequences, obtained by extracting information and comparing matching fidelity and location regularization information, are very encouraging, particularly for classifying mutation sequences.
Copyright © 2016 Elsevier Ltd. All rights reserved.
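
The word-counting idea mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' method (which relies on nucleotide locations and statistical distributions); it only shows a generic k-tuple frequency comparison. The function names `ktuple_frequencies` and `ktuple_distance`, the default `k=2`, and the choice of Euclidean distance are assumptions made for illustration.

```python
from collections import Counter
from itertools import product
from math import sqrt


def ktuple_frequencies(seq, k=2, alphabet="ACGT"):
    """Relative frequencies of overlapping k-tuples (hypothetical helper).

    Words containing symbols outside `alphabet` (e.g. 'N') are ignored.
    """
    seq = seq.upper()
    # Count every overlapping window of length k.
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    # Fixed word order so that vectors from different sequences align.
    words = ["".join(p) for p in product(alphabet, repeat=k)]
    total = sum(counts[w] for w in words)
    if total == 0:
        raise ValueError("sequence contains no valid k-tuples")
    return [counts[w] / total for w in words]


def ktuple_distance(a, b, k=2):
    """Euclidean distance between k-tuple frequency vectors (an assumed choice)."""
    fa = ktuple_frequencies(a, k)
    fb = ktuple_frequencies(b, k)
    return sqrt(sum((x - y) ** 2 for x, y in zip(fa, fb)))


if __name__ == "__main__":
    # Two short toy sequences; real inputs would be whole genes or genomes.
    print(ktuple_distance("ACGTACGTAC", "ACGTTCGTAC", k=2))
```

Because such a profile records only how often each word occurs, two sequences with the same word counts but different arrangements receive a distance of zero; that is the kind of positional information the abstract says the proposed approach takes into account.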

Keywords:  Alignment-free; Clustering; Distance; K-tuple

Year: 2016; PMID: 27460589; PMCID: PMC5361063; DOI: 10.1016/j.jtbi.2016.07.032

Source DB: PubMed; Journal: J Theor Biol; ISSN: 0022-5193; Impact factor: 2.691

