Annotating large genomes with exact word matches.

John Healy, Elizabeth E Thomas, Jacob T Schwartz, Michael Wigler.

Abstract

We have developed a tool for rapidly determining the number of exact matches of any word within large, internally repetitive genomes or sets of genomes. Thus we can readily annotate any sequence, including the entire human genome, with the counts of its constituent words. We create a Burrows-Wheeler transform of the genome, which, together with auxiliary data structures that facilitate counting, can reside in about one gigabyte of RAM. Our original interest was motivated by oligonucleotide probe design, and we describe a general protocol for defining unique hybridization probes. But our method also has applications for the analysis of genome structure and assembly. We demonstrate the identification of chromosome-specific repeats, and outline a general procedure for finding undiscovered repeats. We also illustrate the changing contents of the human genome assemblies by comparing the annotations built from different genome freezes.
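
As a rough illustration of how a Burrows-Wheeler transform can support exact word counting, the sketch below uses the textbook backward-search technique on a toy sequence. It is not the authors' implementation: the abstract does not describe their auxiliary data structures, so the function names (`build_index`, `count_word`, `annotate`), the `$` sentinel, and the uncompressed occurrence table are all assumptions made for illustration. The naive suffix sort and full tables would not scale to a real genome.

```python
def build_index(text):
    """Build a toy BWT index: (first, occ, n). Hypothetical helper."""
    text += "$"  # sentinel; assumed absent from text and sorting before A/C/G/T
    # Naive suffix array: acceptable for a toy input, far too slow for a genome.
    sa = sorted(range(len(text)), key=lambda i: text[i:])
    # BWT: the character preceding each sorted suffix (wraps to '$' for i == 0).
    bwt = "".join(text[i - 1] for i in sa)
    alphabet = sorted(set(text))
    # first[c]: number of characters in text strictly smaller than c.
    first, total = {}, 0
    for c in alphabet:
        first[c] = total
        total += text.count(c)
    # occ[c][i]: occurrences of c in bwt[:i] (uncompressed; assumption).
    occ = {c: [0] * (len(bwt) + 1) for c in alphabet}
    for i, ch in enumerate(bwt):
        for c in alphabet:
            occ[c][i + 1] = occ[c][i] + (ch == c)
    return first, occ, len(bwt)


def count_word(index, word):
    """Count exact (possibly overlapping) occurrences of word."""
    first, occ, n = index
    lo, hi = 0, n  # half-open range of suffix-array rows matching the suffix read so far
    for c in reversed(word):  # backward search: extend the match leftwards
        if c not in first:
            return 0
        lo = first[c] + occ[c][lo]
        hi = first[c] + occ[c][hi]
        if lo >= hi:
            return 0
    return hi - lo


def annotate(index, seq, k):
    """Annotate seq with the genome count of each of its k-letter words."""
    return [count_word(index, seq[i:i + k]) for i in range(len(seq) - k + 1)]


if __name__ == "__main__":
    genome = "ACGTACGTTT"  # toy stand-in for a genome
    idx = build_index(genome)
    print(count_word(idx, "ACG"))
    print(annotate(idx, "ACGTA", 3))
```

In this picture, a word with a count of exactly one is a candidate unique probe, while runs of high counts along an annotated sequence point to repeats.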

Year:  2003        PMID: 12975312      PMCID: PMC403711          DOI: 10.1101/gr.1350803

Source DB:  PubMed          Journal:  Genome Res        ISSN: 1088-9051            Impact factor:   9.043


  10 in total

1.  REPuter: fast computation of maximal repeats in complete genomes.

Authors:  S Kurtz; C Schleiermacher
Journal:  Bioinformatics       Date:  1999-05       Impact factor: 6.937

2.  REPuter: the manifold applications of repeat analysis on a genomic scale.

Authors:  S Kurtz; J V Choudhuri; E Ohlebusch; C Schleiermacher; J Stoye; R Giegerich
Journal:  Nucleic Acids Res       Date:  2001-11-15       Impact factor: 16.971

3.  Selection of optimal DNA oligos for gene expression arrays.

Authors:  F Li; G D Stormo
Journal:  Bioinformatics       Date:  2001-11       Impact factor: 6.937

4.  BLAT--the BLAST-like alignment tool.

Authors:  W James Kent
Journal:  Genome Res       Date:  2002-04       Impact factor: 9.043

5.  Repbase update: a database and an electronic journal of repetitive elements.

Authors:  J Jurka
Journal:  Trends Genet       Date:  2000-09       Impact factor: 11.639

6.  A 9.1-kb gap in the genome reference map is shown to be a stable deletion/insertion polymorphism of ancestral origin.

Authors:  Renato Robledo; Sandro Orru; Antonella Sidoti; Rosella Muresu; Diane Esposito; Marie Claude Grimaldi; Carlo Carcassi; Antoniettina Rinaldi; Luigi Bernini; Licinio Contu; Massimo Romani; Bruce Roe; Marcello Siniscalco
Journal:  Genomics       Date:  2002-12       Impact factor: 5.736

7.  The UCSC Genome Browser Database.

Authors:  D Karolchik; R Baertsch; M Diekhans; T S Furey; A Hinrichs; Y T Lu; K M Roskin; M Schwartz; C W Sugnet; D J Thomas; R J Weber; D Haussler; W J Kent
Journal:  Nucleic Acids Res       Date:  2003-01-01       Impact factor: 16.971

8.  Basic local alignment search tool.

Authors:  S F Altschul; W Gish; W Miller; E W Myers; D J Lipman
Journal:  J Mol Biol       Date:  1990-10-05       Impact factor: 5.469

9.  Improved tools for biological sequence comparison.

Authors:  W R Pearson; D J Lipman
Journal:  Proc Natl Acad Sci U S A       Date:  1988-04       Impact factor: 11.205

10.  Representational oligonucleotide microarray analysis: a high-resolution method to detect genome copy number variation.

Authors:  Robert Lucito; John Healy; Joan Alexander; Andrew Reiner; Diane Esposito; Maoyen Chi; Linda Rodgers; Amy Brady; Jonathan Sebat; Jennifer Troge; Joseph A West; Seth Rostan; Ken C Q Nguyen; Scott Powers; Kenneth Q Ye; Adam Olshen; Ennapadam Venkatraman; Larry Norton; Michael Wigler
Journal:  Genome Res       Date:  2003-09-15       Impact factor: 9.043

  25 in total

1.  A fast, lock-free approach for efficient parallel counting of occurrences of k-mers.

Authors:  Guillaume Marçais; Carl Kingsford
Journal:  Bioinformatics       Date:  2011-01-07       Impact factor: 6.937

2.  Short Read Mapping: An Algorithmic Tour.

Authors:  Stefan Canzar; Steven L Salzberg
Journal:  Proc IEEE Inst Electr Electron Eng       Date:  2015-09-07       Impact factor: 10.961

3.  Mouse genomic representational oligonucleotide microarray analysis: detection of copy number variations in normal and tumor specimens.

Authors:  B Lakshmi; Ira M Hall; Christopher Egan; Joan Alexander; Anthony Leotta; John Healy; Lars Zender; Mona S Spector; Wen Xue; Scott W Lowe; Michael Wigler; Robert Lucito
Journal:  Proc Natl Acad Sci U S A       Date:  2006-07-14       Impact factor: 11.205

4.  Representational oligonucleotide microarray analysis: a high-resolution method to detect genome copy number variation.

Authors:  Robert Lucito; John Healy; Joan Alexander; Andrew Reiner; Diane Esposito; Maoyen Chi; Linda Rodgers; Amy Brady; Jonathan Sebat; Jennifer Troge; Joseph A West; Seth Rostan; Ken C Q Nguyen; Scott Powers; Kenneth Q Ye; Adam Olshen; Ennapadam Venkatraman; Larry Norton; Michael Wigler
Journal:  Genome Res       Date:  2003-09-15       Impact factor: 9.043

5.  Different clustering of genomes across life using the A-T-C-G and degenerate R-Y alphabets: early and late signaling on genome evolution?

Authors:  V Kirzhner; A Paz; Z Volkovich; E Nevo; A Korol
Journal:  J Mol Evol       Date:  2007-03-19       Impact factor: 2.395

6.  Distribution of short paired duplications in mammalian genomes.

Authors:  Elizabeth E Thomas; Nathan Srebro; Jonathan Sebat; Nicholas Navin; John Healy; Bud Mishra; Michael Wigler
Journal:  Proc Natl Acad Sci U S A       Date:  2004-07-06       Impact factor: 11.205

7.  Minimizing off-target signals in RNA fluorescent in situ hybridization.

Authors:  Aaron Arvey; Anita Hermann; Cheryl C Hsia; Eugene Ie; Yoav Freund; William McGinnis
Journal:  Nucleic Acids Res       Date:  2010-02-17       Impact factor: 16.971

8.  Loss of epigenetic silencing in tumors preferentially affects primate-specific retroelements.

Authors:  Sebastian Szpakowski; Xueguang Sun; José M Lage; Andrew Dyer; Jill Rubinstein; Diane Kowalski; Clarence Sasaki; Jose Costa; Paul M Lizardi
Journal:  Gene       Date:  2009-08-21       Impact factor: 3.688

9.  Ultrafast and memory-efficient alignment of short DNA sequences to the human genome.

Authors:  Ben Langmead; Cole Trapnell; Mihai Pop; Steven L Salzberg
Journal:  Genome Biol       Date:  2009-03-04       Impact factor: 13.583

10.  Identification of repeat structure in large genomes using repeat probability clouds.

Authors:  Wanjun Gu; Todd A Castoe; Dale J Hedges; Mark A Batzer; David D Pollock
Journal:  Anal Biochem       Date:  2008-05-20       Impact factor: 3.365
