
GNARE: automated system for high-throughput genome analysis with grid computational backend.

Dinanath Sulakhe, Alex Rodriguez, Mark D'Souza, Michael Wilde, Veronika Nefedova, Ian Foster, Natalia Maltsev.

Abstract

Recent progress in genomics and experimental biology has brought exponential growth of the biological information available for computational analysis in public genomics databases. However, applying the potentially enormous scientific value of this information to the understanding of biological systems requires computing and data storage technology of an unprecedented scale. The Grid, with its aggregated and distributed computational and storage infrastructure, offers an ideal platform for high-throughput bioinformatics analysis. To leverage this, we have developed the Genome Analysis Research Environment (GNARE), a scalable computational system for the high-throughput analysis of genomes, which provides an integrated database and computational backend for data-driven bioinformatics applications. GNARE efficiently automates the major steps of genome analysis, including acquisition of data from multiple genomic databases; data analysis by a diverse set of bioinformatics tools; and storage of results and annotations. High-throughput computations in GNARE are performed using distributed heterogeneous Grid computing resources such as Grid2003, TeraGrid, and the DOE Science Grid. Multi-step genome analysis workflows, involving massive data processing, the use of application-specific tools and algorithms, and updating of an integrated database to provide interactive web access to results, are all expressed and controlled by a "virtual data" model, which transparently maps computational workflows to distributed Grid resources. This paper describes how Grid technologies such as Globus, Condor, and the GriPhyN Virtual Data System were applied in the development of GNARE. It focuses on our approach to Grid resource allocation and to the use of GNARE as a computational framework for the development of bioinformatics applications.

Year:  2005        PMID: 16328950     DOI: 10.1007/s10877-005-3463-y

Source DB:  PubMed          Journal:  J Clin Monit Comput        ISSN: 1387-1307            Impact factor:   2.502


Cited by (6 in total)

Review 1.  Systems genetics, bioinformatics and eQTL mapping.

Authors:  Hong Li; Hongwen Deng
Journal:  Genetica       Date:  2010-09-03       Impact factor: 1.082

2.  Reconstruction and validation of RefRec: a global model for the yeast molecular interaction network.

Authors:  Tommi Aho; Henrikki Almusa; Jukka Matilainen; Antti Larjo; Pekka Ruusuvuori; Kaisa-Leena Aho; Thomas Wilhelm; Harri Lähdesmäki; Andreas Beyer; Manu Harju; Sharif Chowdhury; Kalle Leinonen; Christophe Roos; Olli Yli-Harja
Journal:  PLoS One       Date:  2010-05-14       Impact factor: 3.240

3.  Sequence-based analysis of pQBR103; a representative of a unique, transfer-proficient mega plasmid resident in the microbial community of sugar beet.

Authors:  Adrian Tett; Andrew J Spiers; Lisa C Crossman; Duane Ager; Lena Ciric; J Maxwell Dow; John C Fry; David Harris; Andrew Lilley; Anna Oliver; Julian Parkhill; Michael A Quail; Paul B Rainey; Nigel J Saunders; Kathy Seeger; Lori A S Snyder; Rob Squares; Christopher M Thomas; Sarah L Turner; Xue-Xian Zhang; Dawn Field; Mark J Bailey
Journal:  ISME J       Date:  2007-07-05       Impact factor: 10.302

4.  A primer on high-throughput computing for genomic selection.

Authors:  Xiao-Lin Wu; Timothy M Beissinger; Stewart Bauck; Brent Woodward; Guilherme J M Rosa; Kent A Weigel; Natalia de Leon Gatti; Daniel Gianola
Journal:  Front Genet       Date:  2011-02-24       Impact factor: 4.599

5.  PUMA2--grid-based high-throughput analysis of genomes and metabolic pathways.

Authors:  Natalia Maltsev; Elizabeth Glass; Dinanath Sulakhe; Alexis Rodriguez; Mustafa H Syed; Tanuja Bompada; Yi Zhang; Mark D'Souza
Journal:  Nucleic Acids Res       Date:  2006-01-01       Impact factor: 16.971

6.  Sentra: a database of signal transduction proteins for comparative genome analysis.

Authors:  Mark D'Souza; Elizabeth M Glass; Mustafa H Syed; Yi Zhang; Alexis Rodriguez; Natalia Maltsev; Michael Y Galperin
Journal:  Nucleic Acids Res       Date:  2006-11-29       Impact factor: 16.971

