Mycobacteriophage genome database.

Jerrine Joseph, Vasanthi Rajendran, Sameer Hassan, Vanaja Kumar.

Abstract

Mycobacteriophage genome database (MGDB) is a dedicated repository of the 64 completely sequenced mycobacteriophages with annotated information. It is a comprehensive compilation of gene parameters gathered from several databases and pooled together to support mycobacteriophage researchers. MGDB (version 1.0) comprises 6086 genes from 64 mycobacteriophages, classified into 72 families based on the ACLAME database. Manual curation was aided by information available from public databases, which was further enriched by analysis. Its web interface allows browsing as well as querying the classification. The main objective is to collect and organize the complexity inherent to mycobacteriophage protein classification in a rational way. The other objective is to browse existing and new genomes and describe their functional annotation. AVAILABILITY: The database is available for free at http://mpgdb.ibioinformatics.org/mpgdb.php.

Keywords: MySQL; Mycobacteriophages; PHP; annotation; database; genome

Year: 2011 PMID: 21904429 PMCID: PMC3163920 DOI: 10.6026/97320630006393

Source DB: PubMed Journal: Bioinformation ISSN: 0973-2063

Background

Mycobacteriophage genomes were chosen because of their abundance, their wide size distribution and their very compact genome organization. These properties make them an ideal genetic tool in tuberculosis research. The same features were expected to raise most of the technical challenges to be solved in building the database. Despite their importance, little is known about the genomic diversity harbored in phages. Methods to capture complete phage genomes have been hampered by the lack of knowledge about the target genomes [1]. The sheer abundance and importance of mycobacteriophages, coupled with knowledge of their genetic makeup, demand comprehensive annotation methods that can be applied to these genomes. This wealth of information helps to decipher the genetic framework that drives phage biology, thus providing a window into understanding how these important organisms modulate and, by extension, impact human health.

Database Content

The present version of the mycobacteriophage genome database (version 1.0) contains 6086 genes from 64 mycobacteriophages. All sequence data were downloaded from the NCBI genomes section (http://www.ncbi.nlm.nih.gov/genomes). As a first step towards building a comprehensive functional classification of mycobacteriophages, the 64 sequenced genomes were submitted to several servers, including PFAM [2], PROSITE [3], CATH [4] and SCOP [5], for annotation. The proteins from the 64 mycobacteriophages were clustered into families based on the ACLAME classification [6]. The analysis yielded 72 clusters covering a total of 2082 proteins, representing 34.7% of all analyzed proteins.

Methodology

Tool Design and implementation

MGDB uses a MySQL database to store annotated gene and protein sequence data. Programs written in PHP enable database searches using keywords such as ‘Genome name’ or ‘mycobacteriophage genes’. Query results can be retrieved in ‘Text’, ‘Table’ or ‘Graphical’ formats, presented as ‘Description’, ‘Summary’, ‘predicted results’ and ‘graphs’. Detailed sequence data can be obtained from the corresponding hyperlinks. The data flow diagram for MGDB is shown in Figure 1. The browsing and query web pages are shown as snapshots in Figure 2.

Figure 1

Data flow diagram for mycobacteriophage genome database.

Figure 2

Mycobacteriophage genome database query and browser sample web pages.
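
The keyword search described in this section can be illustrated with a minimal sketch. It is not the MGDB implementation: Python with the standard-library sqlite3 module stands in for the PHP and MySQL stack, and the table names, column names, function names and sample rows are all hypothetical placeholders chosen for illustration.

```python
import sqlite3

# Hypothetical schema: table and column names are illustrative assumptions,
# not the published MGDB layout.
SCHEMA = """
CREATE TABLE genome (
    genome_id INTEGER PRIMARY KEY,
    name      TEXT NOT NULL
);
CREATE TABLE gene (
    gene_id     INTEGER PRIMARY KEY,
    genome_id   INTEGER NOT NULL REFERENCES genome(genome_id),
    gene_name   TEXT NOT NULL,
    description TEXT,
    family      TEXT
);
"""


def build_demo_db():
    """Create an in-memory database filled with placeholder rows."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    conn.executemany(
        "INSERT INTO genome (genome_id, name) VALUES (?, ?)",
        [(1, "ExamplePhageA"), (2, "ExamplePhageB")],
    )
    conn.executemany(
        "INSERT INTO gene (gene_id, genome_id, gene_name, description, family) "
        "VALUES (?, ?, ?, ?, ?)",
        [
            (1, 1, "gp1", "placeholder lysin", "family_1"),
            (2, 1, "gp2", "placeholder tail protein", "family_2"),
            (3, 2, "gp1", "placeholder integrase", None),  # no family assigned
        ],
    )
    conn.commit()
    return conn


def search_genes(conn, keyword):
    """Return genes whose genome name, gene name or description contains keyword.

    The keyword is bound as a query parameter rather than pasted into the SQL
    string. LIKE wildcards inside the keyword are not escaped in this sketch.
    """
    pattern = f"%{keyword}%"
    query = """
        SELECT genome.name, gene.gene_name, gene.description, gene.family
        FROM gene
        JOIN genome ON gene.genome_id = genome.genome_id
        WHERE genome.name LIKE ?
           OR gene.gene_name LIKE ?
           OR gene.description LIKE ?
        ORDER BY genome.name, gene.gene_name
    """
    return conn.execute(query, (pattern, pattern, pattern)).fetchall()


def as_table(rows):
    """Render result rows as tab-separated text with a header line."""
    header = "genome\tgene\tdescription\tfamily"
    lines = ["\t".join("" if v is None else str(v) for v in row) for row in rows]
    return "\n".join([header] + lines)


if __name__ == "__main__":
    demo = build_demo_db()
    print(as_table(search_genes(demo, "ExamplePhageA")))
```

Keeping genomes and genes in separate tables lets a single keyword match either a genome name or a gene-level field, which mirrors the two kinds of search term mentioned above. A production system would add indexes and links to the detailed sequence records.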

Future Development

The database will be periodically updated and enriched with more features to make it more interactive and user-friendly. Tools such as BLAST [7] and CLUSTAL W [8] are to be incorporated into the database to better equip the mycobacteriophage research community. Phage genome comparison and newly identified drug targets of M. tuberculosis will also be added in the future. While these resources are used by much of the mycobacteriophage research community, we will undertake initiatives to acquire, curate and enhance the content of this database in service to the wider research community.

References

1. Analysis of high-throughput sequencing and annotation strategies for phage genomes.

Authors: Matthew R Henn; Matthew B Sullivan; Nicole Stange-Thomann; Marcia S Osburne; Aaron M Berlin; Libusha Kelly; Chandri Yandava; Chinnappa Kodira; Qiandong Zeng; Michael Weiand; Todd Sparrow; Sakina Saif; Georgia Giannoukos; Sarah K Young; Chad Nusbaum; Bruce W Birren; Sallie W Chisholm
Journal: PLoS One Date: 2010-02-05

2. Pfam: multiple sequence alignments and HMM-profiles of protein domains.

Authors: E L Sonnhammer; S R Eddy; E Birney; A Bateman; R Durbin
Journal: Nucleic Acids Res Date: 1998-01-01

3. ScanProsite: detection of PROSITE signature matches and ProRule-associated functional and structural residues in proteins.

Authors: Edouard de Castro; Christian J A Sigrist; Alexandre Gattiker; Virginie Bulliard; Petra S Langendijk-Genevaux; Elisabeth Gasteiger; Amos Bairoch; Nicolas Hulo
Journal: Nucleic Acids Res Date: 2006-07-01

4. CATH--a hierarchic classification of protein domain structures.

Authors: C A Orengo; A D Michie; S Jones; D T Jones; M B Swindells; J M Thornton
Journal: Structure Date: 1997-08-15

5. SCOP: a structural classification of proteins database for the investigation of sequences and structures.

Authors: A G Murzin; S E Brenner; T Hubbard; C Chothia
Journal: J Mol Biol Date: 1995-04-07

6. ACLAME: a CLAssification of Mobile genetic Elements.

Authors: Raphaël Leplae; Aline Hebrant; Shoshana J Wodak; Ariane Toussaint
Journal: Nucleic Acids Res Date: 2004-01-01

7. Gapped BLAST and PSI-BLAST: a new generation of protein database search programs.

Authors: S F Altschul; T L Madden; A A Schäffer; J Zhang; Z Zhang; W Miller; D J Lipman
Journal: Nucleic Acids Res Date: 1997-09-01

8. CLUSTAL V: improved software for multiple sequence alignment.

Authors: D G Higgins; A J Bleasby; R Fuchs
Journal: Comput Appl Biosci Date: 1992-04
