Frameshift alignment: statistics and post-genomic applications.

Sergey L Sheetlin, Yonil Park, Martin C Frith, John L Spouge.

Abstract

MOTIVATION: The alignment of DNA sequences to proteins, allowing for frameshifts, is a classic method in sequence analysis. It can help identify pseudogenes (which accumulate mutations), analyze raw DNA and RNA sequence data (which may have frameshift sequencing errors), investigate ribosomal frameshifts, etc. Often, however, only ad hoc approximations or simulations are available to provide the statistical significance of a frameshift alignment score.
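To make the notion of a frameshift concrete, here is a minimal Python sketch. It translates a toy DNA sequence, inserts a single base, and shows that a fixed-frame translation is garbled downstream of the insertion while a reading that skips the extra base recovers the protein. The sequence, the inserted base, and the helper name `translate` are illustrative assumptions, not taken from the paper, and this is not the paper's alignment algorithm.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO = ("FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR"
         "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG")
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}


def translate(dna: str, frame: int = 0) -> str:
    """Translate dna in the given forward frame (0, 1 or 2), ignoring a trailing partial codon."""
    seq = dna.upper()[frame:]
    return "".join(CODON_TABLE[seq[i:i + 3]] for i in range(0, len(seq) - 2, 3))


original = "ATGGCCAAGTTTGGA"                   # hypothetical toy coding sequence
mutated = original[:6] + "C" + original[6:]    # one inserted base (hypothetical error)

print(translate(original))                     # protein encoded by the intact sequence
print(translate(mutated))                      # fixed-frame reading, wrong after the insertion
# A frameshift-aware reading skips the extra base and continues in frame.
print(translate(mutated[:6]) + translate(mutated[7:]))
```

A frameshift-aware aligner effectively searches over such frame changes while scoring, charging a penalty for each shift, rather than being told where the extra base lies.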
RESULTS: We describe a method to estimate statistical significance of frameshift alignments, similar to classic BLAST statistics. (BLAST presently does not permit its alignments to include frameshifts.) We also illustrate the continuing usefulness of frameshift alignment with two 'post-genomic' applications: (i) when finding pseudogenes within the human genome, frameshift alignments show that most anciently conserved non-coding human elements are recent pseudogenes with conserved ancestral genes; and (ii) when analyzing metagenomic DNA reads from polluted soil, frameshift alignments show that most alignable metagenomic reads contain frameshifts, suggesting that metagenomic analysis needs to use frameshift alignment to derive accurate results.

Published by Oxford University Press 2014. This work is written by US Government employees and is in the public domain in the US.
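For orientation, classic BLAST statistics convert a local alignment score S into an expected number of chance hits, E = K m n exp(-lambda S), where m and n are the query and database lengths and lambda and K are Gumbel parameters of the scoring system. The Python sketch below shows only that textbook relationship. The default values of `lam` and `K` are placeholders rather than estimates from this paper, and the sketch omits finite-size corrections and anything specific to frameshift alignment.

```python
import math


def bit_score(raw_score: float, lam: float = 0.267, K: float = 0.041) -> float:
    """Normalize a raw score to bits: S' = (lambda * S - ln K) / ln 2."""
    return (lam * raw_score - math.log(K)) / math.log(2)


def evalue(raw_score: float, m: int, n: int,
           lam: float = 0.267, K: float = 0.041) -> float:
    """Expected number of chance alignments scoring at least raw_score: E = K * m * n * exp(-lambda * S)."""
    return K * m * n * math.exp(-lam * raw_score)


def pvalue(e: float) -> float:
    """Probability of at least one chance hit, P = 1 - exp(-E)."""
    return -math.expm1(-e)


# Hypothetical search: a 300-residue query against 1,000,000 database residues.
e = evalue(50, m=300, n=1_000_000)
print(f"bit score = {bit_score(50):.1f}, E = {e:.3g}, P = {pvalue(e):.3g}")
```

The bit score makes results comparable across scoring systems, since E equals m n 2^(-S') whatever lambda and K happen to be.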

Year:  2014        PMID: 25172925      PMCID: PMC4253828          DOI: 10.1093/bioinformatics/btu576

Source DB:  PubMed          Journal:  Bioinformatics        ISSN: 1367-4803            Impact factor:   6.937


  40 in total

1.  Pro-Frame: similarity-based gene recognition in eukaryotic DNA sequences with errors.

Authors:  A A Mironov; P S Novichkov; M S Gelfand
Journal:  Bioinformatics       Date:  2001-01       Impact factor: 6.937

2.  Rapid significance estimation in local sequence alignment with gaps.

Authors:  Ralf Bundschuh
Journal:  J Comput Biol       Date:  2002       Impact factor: 1.479

3.  Alignments of DNA and protein sequences containing frameshift errors.

Authors:  X Guan; E C Uberbacher
Journal:  Comput Appl Biosci       Date:  1996-02

4.  Local alignment statistics.

Authors:  S F Altschul; W Gish
Journal:  Methods Enzymol       Date:  1996       Impact factor: 1.600

5.  Identification of protein coding regions by database similarity search.

Authors:  W Gish; D J States
Journal:  Nat Genet       Date:  1993-03       Impact factor: 38.330

6.  New finite-size correction for local alignment score distributions.

Authors:  Yonil Park; Sergey Sheetlin; Ning Ma; Thomas L Madden; John L Spouge
Journal:  BMC Res Notes       Date:  2012-06-12

7.  A new repeat-masking method enables specific detection of homologous sequences.

Authors:  Martin C Frith
Journal:  Nucleic Acids Res       Date:  2010-11-24       Impact factor: 16.971

8.  HMM-FRAME: accurate protein domain classification for metagenomic sequences containing frameshift errors.

Authors:  Yuan Zhang; Yanni Sun
Journal:  BMC Bioinformatics       Date:  2011-05-24       Impact factor: 3.169

9.  GHOSTM: a GPU-accelerated homology search tool for metagenomics.

Authors:  Shuji Suzuki; Takashi Ishida; Ken Kurokawa; Yutaka Akiyama
Journal:  PLoS One       Date:  2012-05-04       Impact factor: 3.240

10.  PhyloSift: phylogenetic analysis of genomes and metagenomes.

Authors:  Aaron E Darling; Guillaume Jospin; Eric Lowe; Frederick A Matsen; Holly M Bik; Jonathan A Eisen
Journal:  PeerJ       Date:  2014-01-09       Impact factor: 2.984

  16 in total

1.  ALP & FALP: C++ libraries for pairwise local alignment E-values.

Authors:  Sergey Sheetlin; Yonil Park; Martin C Frith; John L Spouge
Journal:  Bioinformatics       Date:  2015-10-01       Impact factor: 6.937

2.  Application of computational approaches to analyze metagenomic data. (Review)

Authors:  Ho-Jin Gwak; Seung Jae Lee; Mina Rho
Journal:  J Microbiol       Date:  2021-02-10       Impact factor: 3.422

3.  RIFRAF: a frame-resolving consensus algorithm.

Authors:  Kemal Eren; Ben Murrell
Journal:  Bioinformatics       Date:  2018-11-15       Impact factor: 6.937

4.  MAIRA- real-time taxonomic and functional analysis of long reads on a laptop.

Authors:  Benjamin Albrecht; Caner Bağcı; Daniel H Huson
Journal:  BMC Bioinformatics       Date:  2020-09-17       Impact factor: 3.169

5.  Companion: a web server for annotation and analysis of parasite genomes.

Authors:  Sascha Steinbiss; Fatima Silva-Franco; Brian Brunk; Bernardo Foth; Christiane Hertz-Fowler; Matthew Berriman; Thomas D Otto
Journal:  Nucleic Acids Res       Date:  2016-04-21       Impact factor: 16.971

6.  MEGAN-LR: new algorithms allow accurate binning and easy interactive exploration of metagenomic long reads and contigs.

Authors:  Daniel H Huson; Benjamin Albrecht; Caner Bağcı; Irina Bessarab; Anna Górska; Dino Jolic; Rohan B H Williams
Journal:  Biol Direct       Date:  2018-04-20       Impact factor: 4.540

7.  Search for potential reading frameshifts in cds from Arabidopsis thaliana and other genomes.

Authors:  Y M Suvorova; M A Korotkova; K G Skryabin; E V Korotkov
Journal:  DNA Res       Date:  2019-04-01       Impact factor: 4.458

8.  Strain-level metagenomic assignment and compositional estimation for long reads with MetaMaps.

Authors:  Alexander T Dilthey; Chirag Jain; Sergey Koren; Adam M Phillippy
Journal:  Nat Commun       Date:  2019-07-11       Impact factor: 14.919

9.  AnABlast: a new in silico strategy for the genome-wide search of novel genes and fossil regions.

Authors:  Juan Jimenez; Caia D S Duncan; María Gallardo; Juan Mata; Antonio J Perez-Pulido
Journal:  DNA Res       Date:  2015-10-21       Impact factor: 4.458

10.  Parallels between experimental and natural evolution of legume symbionts.

Authors:  Camille Clerissi; Marie Touchon; Delphine Capela; Mingxing Tang; Stéphane Cruveiller; Clémence Genthon; Céline Lopez-Roques; Matthew A Parker; Lionel Moulin; Catherine Masson-Boivin; Eduardo P C Rocha
Journal:  Nat Commun       Date:  2018-06-11       Impact factor: 14.919
