Alignment-free genetic sequence comparisons: a review of recent approaches by word analysis.

Oliver Bonham-Carter, Joe Steele, Dhundy Bastola.   

Abstract

Modern sequencing and genome assembly technologies have produced a wealth of data that will soon require comparative analysis for discovery. Sequence alignment, a fundamental task in bioinformatics research, may be used, but with some caveats. Seminal dynamic programming methods are proving ineffective for this work owing to their inherent computational expense when processing large amounts of sequence data. They are also prone to giving misleading information because of genetic recombination, genetic shuffling and other inherent biological events. New approaches from information theory, frequency analysis and data compression are available and provide powerful alternatives to dynamic programming. These new methods are often preferred because their algorithms are simpler and are not affected by synteny-related problems. In this review, we provide a detailed discussion of computational tools that stem from alignment-free methods based on statistical analysis of word frequencies. We provide several clear examples to demonstrate applications and their interpretation across several areas of alignment-free analysis, such as base-base correlations, feature frequency profiles, compositional vectors, an improved string composition and the D2 statistic metric. Additionally, we provide a detailed discussion and an example of analysis by Lempel-Ziv techniques from data compression.
© The Author 2013. Published by Oxford University Press. For Permissions, please email: journals.permissions@oup.com.

Keywords:  alignment-free; information theory; sequence-alignment; word-analysis

Year: 2013 | PMID: 23904502 | PMCID: PMC4296134 | DOI: 10.1093/bib/bbt052

Source DB: PubMed | Journal: Brief Bioinform | ISSN: 1467-5463 | Impact factor: 11.622
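
The word-frequency idea named in the abstract can be illustrated with a minimal sketch of the D2 statistic in its basic form: the sum, over all words of length k, of the product of that word's counts in the two sequences. This is an illustration only and not code from the review; the function names `word_counts` and `d2_statistic` are hypothetical, and the sketch assumes overlapping words and case-insensitive input.

```python
from collections import Counter


def word_counts(seq: str, k: int) -> Counter:
    """Count overlapping words (k-mers) of length k in a sequence."""
    seq = seq.upper()
    # A sequence shorter than k yields an empty range and thus no words.
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


def d2_statistic(seq_a: str, seq_b: str, k: int) -> int:
    """Basic D2: sum over words w of count_a(w) * count_b(w).

    Hypothetical helper for illustration; no normalization is applied.
    """
    counts_a = word_counts(seq_a, k)
    counts_b = word_counts(seq_b, k)
    # Counter returns 0 for words absent from seq_b, so unshared words
    # contribute nothing to the sum.
    return sum(n * counts_b[w] for w, n in counts_a.items())


if __name__ == "__main__":
    print(d2_statistic("ACGTACGT", "ACGT", k=2))
```

A larger value indicates more shared words between the two sequences. Because the raw score grows with sequence length, practical comparisons usually work with normalized variants rather than this unnormalized count.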