Alignment-free genetic sequence comparisons: a review of recent approaches by word analysis.

Oliver Bonham-Carter, Joe Steele, Dhundy Bastola.

Abstract

Modern sequencing and genome assembly technologies have produced a wealth of data that will soon need to be analysed by comparison to yield discoveries. Sequence alignment, a fundamental task in bioinformatics research, may be used, but with some caveats. Seminal alignment techniques based on dynamic programming are proving ineffective for this work owing to their inherent computational expense when processing large amounts of sequence data. They are also prone to giving misleading information because of genetic recombination, genetic shuffling and other inherent biological events. New approaches from information theory, frequency analysis and data compression are available and provide powerful alternatives to dynamic programming. These newer methods are often preferred because their algorithms are simpler and are not affected by synteny-related problems. In this review, we provide a detailed discussion of computational tools that stem from alignment-free methods based on the statistical analysis of word frequencies. We provide several clear examples to demonstrate applications and their interpretation across several areas of alignment-free analysis, such as base-base correlations, feature frequency profiles, compositional vectors, an improved string composition and the D2 statistic. Additionally, we provide a detailed discussion and an example of analysis by Lempel-Ziv techniques from data compression.
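
As a rough illustration of the word-frequency idea mentioned in the abstract, the sketch below computes the basic D2 statistic, taken here in its common form as the sum, over all words of length k, of the product of that word's counts in the two sequences. The function names `kmer_counts` and `d2_statistic` are hypothetical and chosen for this example. It is a minimal sketch under those assumptions, not the implementation discussed in the review, and it omits the normalised variants of D2.

```python
from collections import Counter


def kmer_counts(seq: str, k: int) -> Counter:
    """Count every overlapping word (k-mer) of length k in seq."""
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


def d2_statistic(seq_a: str, seq_b: str, k: int) -> int:
    """Basic D2: sum over words of count_in_a * count_in_b.

    Words absent from either sequence contribute zero, so only
    the words found in seq_a need to be visited.
    """
    counts_a = kmer_counts(seq_a, k)
    counts_b = kmer_counts(seq_b, k)
    # Counter returns 0 for a missing key without inserting it.
    return sum(n * counts_b[word] for word, n in counts_a.items())
```

For instance, `d2_statistic("ACGT", "ACGT", 2)` returns 3, because the three words AC, CG and GT each occur once in both sequences. No alignment is computed, so the cost grows linearly with sequence length rather than with the product of the two lengths.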

Keywords: alignment-free; information theory; sequence-alignment; word-analysis

Year: 2013 PMID: 23904502 PMCID: PMC4296134 DOI: 10.1093/bib/bbt052

Source DB: PubMed Journal: Brief Bioinform ISSN: 1467-5463 Impact factor: 11.622
