FORRepeats: detects repeats on entire chromosomes and between genomes.

A Lefebvre, T Lecroq, H Dauchel, J Alexandre.

Abstract

MOTIVATION: As more and more whole genomes become available, there is a need for new methods to compare large sequences and transfer biological knowledge from annotated genomes to related new ones. BLAST is not suitable for comparing multimegabase DNA sequences. MegaBLAST is designed to compare closely related large sequences. Some tools for detecting repeats in large sequences, such as MUMmer and REPuter, have already been developed, but they also have time or space restrictions. Moreover, in terms of applications, REPuter only computes repeats and MUMmer works better with related genomes.
RESULTS: We present a heuristic method, named FORRepeats, which is based on a novel data structure called the factor oracle. In the first step, it detects exact repeats in large sequences. In the second step, it computes approximate repeats and performs pairwise comparison. We compared its computational characteristics with those of BLAST and REPuter. The results demonstrate that it is fast and space economical. We show the ability of FORRepeats to perform intra-genomic comparison and to detect repeated DNA sequences in the complete genome of the model plant Arabidopsis thaliana.
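
For readers unfamiliar with the factor oracle named in the abstract, the following is a minimal sketch of the standard online construction of that data structure, with a helper that checks whether a word is accepted. It is not the FORRepeats implementation and does not reproduce its repeat detection or approximate-repeat steps; the function names `build_factor_oracle` and `accepts` are hypothetical and chosen only for illustration.

```python
def build_factor_oracle(text):
    """Build a factor oracle for `text` using the standard online construction.

    Returns (transitions, supply), where transitions[s] maps a character to a
    target state and supply[s] is the supply link of state s (-1 for state 0).
    Names and layout are illustrative assumptions, not taken from FORRepeats.
    """
    transitions = [{}]   # state 0 is the initial state
    supply = [-1]        # the initial state has no supply link
    for i, ch in enumerate(text, start=1):
        transitions.append({})
        supply.append(0)
        # Internal transition: state i-1 reaches the new state i by ch.
        transitions[i - 1][ch] = i
        # Follow supply links, adding external transitions by ch to state i
        # until a state that already has a transition by ch is found.
        k = supply[i - 1]
        while k > -1 and ch not in transitions[k]:
            transitions[k][ch] = i
            k = supply[k]
        supply[i] = 0 if k == -1 else transitions[k][ch]
    return transitions, supply


def accepts(transitions, word):
    """Return True if `word` can be read from state 0 (every state accepts)."""
    state = 0
    for ch in word:
        if ch not in transitions[state]:
            return False
        state = transitions[state][ch]
    return True


# Example on a short DNA string: every substring of the text is accepted.
dna = "ACGTACGA"
oracle, links = build_factor_oracle(dna)
print(accepts(oracle, "GTAC"))
```

A factor oracle built on a sequence of length n has n + 1 states and accepts every substring of that sequence, but it may also accept a few words that are not substrings, so any match read from it is a candidate that still has to be verified against the sequence.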

Year:  2003        PMID: 12584116     DOI: 10.1093/bioinformatics/btf843

Source DB:  PubMed          Journal:  Bioinformatics        ISSN: 1367-4803            Impact factor:   6.937


  11 in total

1.  Mauve: multiple alignment of conserved genomic sequence with rearrangements.

Authors:  Aaron C E Darling; Bob Mau; Frederick R Blattner; Nicole T Perna
Journal:  Genome Res       Date:  2004-07       Impact factor: 9.043

2.  A fast, lock-free approach for efficient parallel counting of occurrences of k-mers.

Authors:  Guillaume Marçais; Carl Kingsford
Journal:  Bioinformatics       Date:  2011-01-07       Impact factor: 6.937

3.  Bioinformatics and genomic analysis of transposable elements in eukaryotic genomes.

Authors:  Mateusz Janicki; Rebecca Rooke; Guojun Yang
Journal:  Chromosome Res       Date:  2011-08       Impact factor: 4.620

4.  DiagHunter and GenoPix2D: programs for genomic comparisons, large-scale homology discovery and visualization.

Authors:  Steven B Cannon; Alexander Kozik; Brian Chan; Richard Michelmore; Nevin D Young
Journal:  Genome Biol       Date:  2003-09-19       Impact factor: 13.583

5.  Estimating the k-mer Coverage Frequencies in Genomic Datasets: A Comparative Assessment of the State-of-the-art.

Authors:  Swati C Manekar; Shailesh R Sathe
Journal:  Curr Genomics       Date:  2019-01       Impact factor: 2.236

6.  A new method to compute K-mer frequencies and its application to annotate large repetitive plant genomes.

Authors:  Stefan Kurtz; Apurva Narechania; Joshua C Stein; Doreen Ware
Journal:  BMC Genomics       Date:  2008-10-31       Impact factor: 3.969

7.  Simultaneous identification of long similar substrings in large sets of sequences.

Authors:  Jürgen Kleffe; Friedrich Möller; Burghardt Wittig
Journal:  BMC Bioinformatics       Date:  2007-05-24       Impact factor: 3.169

8.  OrthoParaMap: distinguishing orthologs from paralogs by integrating comparative genome data and gene phylogenies.

Authors:  Steven B Cannon; Nevin D Young
Journal:  BMC Bioinformatics       Date:  2003-09-02       Impact factor: 3.169

9.  Seq'ing identity and function in a repeat-derived noncoding RNA world. (Review)

Authors:  Rachel J O'Neill
Journal:  Chromosome Res       Date:  2020-03-07       Impact factor: 5.239

10.  A benchmark study of k-mer counting methods for high-throughput sequencing.

Authors:  Swati C Manekar; Shailesh R Sathe
Journal:  Gigascience       Date:  2018-12-01       Impact factor: 6.524
