
phiSITE: database of gene regulation in bacteriophages.

Lubos Klucar, Matej Stano, Matus Hajduk.

Abstract

We have developed phiSITE, a database of gene regulation in bacteriophages. To date it contains detailed information about more than 700 experimentally confirmed or predicted regulatory elements (promoters, operators, terminators and attachment sites) from 32 bacteriophages belonging to the Siphoviridae, Myoviridae and Podoviridae families. The database is manually curated; the data are collected mainly from scientific papers, cross-referenced with other database resources (EMBL, UniProt, NCBI taxonomy database, NCBI Genome, ICTVdb, PubMed Central) and stored in an SQL-based database system. The system provides full-text search for regulatory elements, graphical visualization of phage genomes and several export options. In addition, visualizations of gene regulatory networks for five phages (Bacillus phage GA-1, Enterobacteria phage lambda, Enterobacteria phage Mu, Enterobacteria phage P2 and Mycoplasma phage P1) have been defined and made available. phiSITE is accessible at http://www.phisite.org/.

Year:  2009        PMID: 19900969      PMCID: PMC2808901          DOI: 10.1093/nar/gkp911

Source DB:  PubMed          Journal:  Nucleic Acids Res        ISSN: 0305-1048            Impact factor:   16.971


INTRODUCTION

Bacteriophages, though very simple in composition and replication, are the most abundant biological entities on earth. They are the main force in the global carbon cycle, in the evolution of bacterial species and in maintaining the balance of bacteria across the whole biosphere. The abundance and turnover of bacteriophages can be illustrated by the fact that phage predation destroys an estimated half of the world's bacterial population every 48 h (1). The extreme natural adaptability of phages and their strict (or broad) host specificity make them ideal candidates for combating human (or other) bacterial diseases. This approach, generally termed phage therapy, has been known since the discovery of phages almost a century ago by Twort (2) and d’Herelle (3), but since the advent of chemical antibiotics in the 1940s it has been little used in the West (4). Bacteriophages were the first organisms studied at the molecular level. In the 1970s, the genomes of bacteriophages MS2 and phi-X174 were the first to be completely determined (5,6), and all discoveries in gene regulation are generally based on research into bacteriophage and bacterial operons. Over 5500 bacteriophages have been examined in the electron microscope (7). At present, 550 phage genomes are completely known. The EMBL database contains entries from ∼1500 different bacteriophages and prophages, which gives the approximate number of known and studied bacteriophages. Regulatory elements and gene regulation mechanisms are, however, described for only a few dozen phage genomes. Knowing the details of gene regulation is interesting for several reasons. Post-genomic research mainly involves analyzing the dynamics of gene regulation. The commonly accepted assumption that co-regulated genes share similarities in their regulatory mechanism has led to a major challenge for computational biologists: detecting novel regulatory elements (motifs) in such sets of co-expressed genes.
These similarities at the transcriptional level imply that the promoter region might contain consensus motifs recognized by the same regulatory proteins. In the upstream regions of such sets of co-regulated genes, the common consensus motifs are statistically over-represented compared with their frequencies in a background set (of non-co-regulated genes) (8). Knowledge of gene regulation systems can lead to several novel practical applications, ranging from ‘designing of better phages’ used for controlling cellular behavior for medical or biotechnology purposes (9,10) to highly promising bio-nanotechnology applications (toggle switches, oscillators, nano-devices) (9,11,12). Gene regulatory networks (GRNs) are quite well characterized and catalogued for eukaryotes. Examples include TRANSFAC (a database of eukaryotic transcription factors, their DNA-binding sites and DNA-binding profiles) (13) and the Eukaryotic Promoter Database (14). For prokaryotic organisms there are only a few projects under development: PRODORIC (Prokaryotic Database of Gene Regulation) (15) or RegulonDB (transcriptional regulatory network of Escherichia coli K12) (16) covering several hundreds of completely sequenced bacterial genomes. Information about gene regulation in bacteriophages is scattered across scientific papers and books and is only partially present in primary DNA and protein databases; it has not yet been collected in the form of a publicly available database. To address this deficiency we have developed phiSITE, a database of gene regulation in bacteriophages, described in this article.
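The over-representation argument above can be made concrete with a small calculation. The following is a minimal sketch, not the method of reference (8): it simply compares k-mer frequencies in a foreground set of upstream regions with those in a background set. The sequences, the function names and the pseudocount are assumptions chosen for illustration; real motif finders use proper statistical models.

```python
from collections import Counter


def kmer_frequencies(sequences, k):
    """Return the relative frequency of every k-mer found in the sequences."""
    counts = Counter()
    for seq in sequences:
        seq = seq.upper()
        for i in range(len(seq) - k + 1):
            counts[seq[i:i + k]] += 1
    total = sum(counts.values())
    return {kmer: n / total for kmer, n in counts.items()} if total else {}


def enrichment(foreground, background, k, pseudo=1e-6):
    """Ratio of foreground to background frequency for each foreground k-mer.

    The pseudocount (an assumption of this sketch) avoids division by zero
    for k-mers that never occur in the background set.
    """
    fg = kmer_frequencies(foreground, k)
    bg = kmer_frequencies(background, k)
    return {kmer: f / (bg.get(kmer, 0.0) + pseudo) for kmer, f in fg.items()}


# Toy, made-up sequences: three "co-regulated" upstream regions sharing one
# hexamer, and three unrelated background regions.
co_regulated = ["GGTTGACAAT", "CCTTGACATT", "ATTTGACAGG"]
background = ["GGCATCGTAC", "ACGTAGCTAG", "TCGATCGGCA"]

scores = enrichment(co_regulated, background, k=6)
best = max(scores, key=scores.get)
print(best)  # the hexamer shared by all three foreground sequences
```

In this toy input the shared hexamer occurs once in each foreground sequence and never in the background, so it receives the highest ratio.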

DATABASE CURATION AND CONTENT

phiSITE (release 2009.3) contains detailed information about 714 experimentally confirmed or predicted regulatory elements from 32 bacteriophages from the Siphoviridae, Myoviridae and Podoviridae families (Table 1). Data related to phage gene regulation are extracted primarily from scientific papers but also from other scientific publications and primary databases. The focus is on experimentally confirmed regulatory sites, though predicted sites are also collected. Many predicted sites in phage genomes are so widely accepted by the scientific community that no further experimental evidence is expected. To separate entries according to the evidence, the experimental/predicted flag of each site is clearly marked in all search results, making it possible to select and/or analyze only experimental or only predicted entries. Phage genome data are parsed from EMBL database entries using a semi-automated parser. All additional data are inserted by curators into the MySQL database back-end using web forms. phiSITE is available to any individual and for any purpose, and it is distributed under the ‘Creative Commons Attribution-Share Alike 3.0 Unported License’ (http://creativecommons.org/licenses/by-sa/3.0/).
Table 1.

Statistics of the phiSITE content (Release 2009.3)

Collected phages (with complete genome)         32 (29)
    Myoviridae                                  5
    Podoviridae                                 18
    Siphoviridae                                9
Regulatory sites (experimentally identified)    714 (423)
    Promoters                                   482
    Operators                                   61
    Terminators                                 165
    Attachment sites                            6
Source publications                             127

The base element of phiSITE is a site, representing one regulatory element present on a phage genome. A site can be a promoter, an operator, a transcription terminator or an attachment site. A site can be segmented into several subsites (if known), i.e. particular cis-regulatory signals (e.g. −35 and −10 for a prokaryotic promoter). The database also provides references to the method of evidence for experimentally confirmed sites. All sites are linked to the other phiSITE tables describing the phage and its features. Information about the complete phage genome is also included (if available), together with the names and positions of all known genes. phiSITE also keeps up-to-date information about phage and phage host taxonomy, together with numerous links to other database resources described in the section ‘Phage genome browser’ below. Several accompanying analysis tools are under development and accessible in the Tools section. These include: PSSM-convert, a tool for creation and conversion of Position Specific Scoring Matrices in different formats; Free Energy, a tool for computation of the Gibbs free energy distribution in a DNA sequence; and Promoter Hunter, a tool for promoter search in prokaryotic genomes. Each tool is accompanied by corresponding help instructions, and a detailed description of the tools is beyond the scope of this paper. The phiSITE database is continuously updated and new releases are published several times a year.
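The site/subsite organization described above can be pictured as a small data model. The sketch below is only an illustration and does not reproduce the actual phiSITE MySQL schema: the class names, field names, coordinates and the example entries are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import List

# The four site types named in the text.
SITE_TYPES = {"promoter", "operator", "terminator", "attachment site"}


@dataclass
class Subsite:
    name: str   # e.g. "-35" or "-10" for a prokaryotic promoter
    start: int  # genome coordinate, inclusive (hypothetical convention)
    end: int    # genome coordinate, inclusive


@dataclass
class Site:
    name: str
    site_type: str
    phage: str
    start: int
    end: int
    experimental: bool  # True = experimentally confirmed, False = predicted
    subsites: List[Subsite] = field(default_factory=list)

    def __post_init__(self):
        if self.site_type not in SITE_TYPES:
            raise ValueError(f"unknown site type: {self.site_type}")
        for sub in self.subsites:
            # Every subsite must lie inside its parent site.
            if not (self.start <= sub.start <= sub.end <= self.end):
                raise ValueError(f"subsite {sub.name} lies outside {self.name}")


def select_by_evidence(sites, experimental):
    """Keep only experimentally confirmed (True) or predicted (False) sites."""
    return [s for s in sites if s.experimental == experimental]


# Made-up example entries; names and coordinates are placeholders.
sites = [
    Site("P_example", "promoter", "Example phage", 100, 140, True,
         [Subsite("-35", 105, 110), Subsite("-10", 128, 133)]),
    Site("T_example", "terminator", "Example phage", 900, 930, False),
]
confirmed = select_by_evidence(sites, experimental=True)
print([s.name for s in confirmed])
```

Keeping the evidence flag on the site itself mirrors the experimental/predicted marking that the text says is shown in all search results.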

DATABASE ACCESS

The main access to the database is provided via the web interface at http://www.phisite.org/. The phiSITE portal is based on the well-established LAMP platform (Linux/Apache/MySQL/PHP). Users can approach the data in several ways: searching and exporting the entries via Quick Search and Advanced Search; exploring phage genomes via the graphical applet in the Phages section; exploring phage GRNs via the BioTapestry Viewer; browsing and exporting the entries according to the phage or host taxonomy in the Browse section; and downloading the whole content of the database in XML format in the Downloads section.

Searching the entries

Users can search the content of the database using ‘Quick Search’ or ‘Advanced Search’. Search terms are looked up either in all text fields (phage name, host name, site name, site description or site type) or in a single field selected by the user. In ‘Advanced Search’, a different search field can be specified for each search term, with optional use of wildcards. Search results are provided in the form of a table with customizable order. Each entry includes the site name, type (promoter, operator, terminator or attachment site), method of evidence, source reference, phage details and a semi-graphical representation of the DNA segment containing the site (Figure 1). All sites are linked to the Sequence Ontology thesaurus (17). An arbitrary number of entries from the search result page can be manually selected and exported using the exporting module described in the section ‘Browsing and exporting the entries’ below.
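The search behavior described above (all text fields or a single field, with optional wildcards) can be sketched as follows. This is a hypothetical illustration, not the portal's PHP/MySQL implementation: the field names, the substring default for terms without wildcards, and the toy entries are assumptions. It relies on the standard-library `fnmatch` module for `*` and `?` wildcards.

```python
from fnmatch import fnmatchcase

# Text fields named in the paragraph above (keys are hypothetical).
FIELDS = ("phage", "host", "name", "description", "type")


def matches(entry, term, field=None):
    """Case-insensitive match of one search term against an entry.

    A term without wildcards is treated as a substring search (an assumption
    of this sketch); '*' and '?' are interpreted by fnmatch.
    """
    pattern = term.lower()
    if not any(ch in pattern for ch in "*?"):
        pattern = f"*{pattern}*"
    fields = (field,) if field else FIELDS
    return any(fnmatchcase(str(entry.get(f, "")).lower(), pattern)
               for f in fields)


def search(entries, term, field=None):
    """Return all entries matching the term in any field or in one field."""
    return [e for e in entries if matches(e, term, field)]


# Toy entries for illustration only.
entries = [
    {"phage": "Enterobacteria phage lambda", "host": "Escherichia coli",
     "name": "PRM", "description": "toy entry", "type": "promoter"},
    {"phage": "Enterobacteria phage Mu", "host": "Escherichia coli",
     "name": "T1", "description": "toy entry", "type": "terminator"},
]

quick = search(entries, "lambda")                # all fields
advanced = search(entries, "P*", field="name")   # single field, wildcard
print(len(quick), len(advanced))
```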
Figure 1.

Result of a ‘Quick Search’ for all regulatory sites from Enterobacteria phage lambda. Each entry contains description of a site and links to other information resources. Three sites (PR', PFE and PRM) are selected and can be exported using the ‘Export selected’ button.


Phage genome browser

The system includes a proprietary graphical genome browser (Figure 2), which is used to visualize all phages with a known and annotated genome. It is based on Adobe Flash technology (http://www.adobe.com/products/flash/) and is dynamically linked to the phiSITE MySQL back-end. The genome browser provides a graphical representation of all phage genes and regulatory sites, and all elements are zoomable down to the primary sequence level. Users can use the mouse to zoom in and out and to drag along the genome sequence. All elements are labeled with a name and a short description. The Features section contains the taxonomic classification of the phage and its host and provides a set of links to related bioinformatics resources (EMBL, UniProt, NCBI taxonomy database, NCBI Genome, ICTVdb and PubMed Central) (18–21) as well as to other sections of the phiSITE portal: the BioTapestry viewer (for selected phages) and a direct link to the list of all sites associated with a particular phage.
Figure 2.

Graphical applet zoomed to the region between Enterobacteria phage lambda genes cI and cro. User can use a mouse to zoom in/out and to scroll along the phage genome.


BioTapestry viewer

We have adapted the BioTapestry tool for visual representation of phage GRNs. BioTapestry is a free and open-source Java-based interactive tool for building, visualizing and simulating GRNs (22). It can output a regulatory network in SBML format (23), which can be read into a GRN simulation environment such as Dizzy (24). Source data for visualization in BioTapestry Editor are imported as comma-separated value (CSV) files from the phiSITE back-end, where interaction instructions extracted from the scientific literature are defined. Source type ‘gene’ is used for genes and gene products, and source type ‘box’ for regulatory sites. Several types of interactions are described in the BioTapestry model: (i) initiation of transcription of a gene from a promoter, (ii) activation of transcription by a product of a phage gene, (iii) repression of transcription by a gene product binding to the operator of the target promoter, (iv) repression of transcription by the operator negatively influencing the promoter, (v) termination of transcription initiated from the promoter and (vi) antitermination of transcription by a product of an antiterminator gene. Positive regulation is depicted as an arrowed line pointing from the master to the slave element (i–iii), negative regulation as a ‘T’ shaped line pointing to the slave element (iv,vi) and a neutral relation as a straight line between master and slave elements (v). The Editor automatically creates a network of interactions, and the assembled model is made available on the web using Java Web Start technology. Only interactions among phage genome elements are defined at the moment, though future versions may also include phage host regulatory elements. An example of the Enterobacteria phage lambda regulatory region is given in Supplementary Data.
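The six interaction classes and their three drawing styles can be summarized as a lookup table that is applied while writing an edge list. The sketch below is an assumption-laden illustration: the column names, the interaction labels and the output layout are hypothetical and do not reproduce the CSV format that BioTapestry actually imports. The three example edges describe the well-known cI/OR/PR/cro circuit of phage lambda and are not taken from the phiSITE model itself.

```python
import csv
import io

# Interaction classes (i)-(vi) from the text, mapped to the line style the
# text assigns to them. The dictionary keys are hypothetical labels.
INTERACTIONS = {
    "transcription_initiation": "positive",  # (i)   arrow
    "activation": "positive",                # (ii)  arrow
    "repressor_binding": "positive",         # (iii) arrow
    "operator_repression": "negative",       # (iv)  'T' shaped line
    "termination": "neutral",                # (v)   straight line
    "antitermination": "negative",           # (vi)  'T' shaped line
}


def to_csv(edges):
    """Serialize (source_type, source, target_type, target, kind) tuples."""
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    writer.writerow(["source_type", "source", "target_type", "target",
                     "interaction", "sign"])
    for source_type, source, target_type, target, kind in edges:
        writer.writerow([source_type, source, target_type, target,
                         kind, INTERACTIONS[kind]])
    return buf.getvalue()


# 'gene' is used for genes and gene products, 'box' for regulatory sites.
edges = [
    ("gene", "cI", "box", "OR", "repressor_binding"),
    ("box", "OR", "box", "PR", "operator_repression"),
    ("box", "PR", "gene", "cro", "transcription_initiation"),
]
table = to_csv(edges)
print(table)
```

Splitting repression into a product-to-operator edge and an operator-to-promoter edge follows the distinction the text draws between interaction types (iii) and (iv).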

Browsing and exporting the entries

A set of phiSITE entries can be exported using the dynamic export module and used for further analyses in a variety of bioinformatics tools. Users can select a group of sites according to the phage or phage host taxonomic hierarchy. The evidence (experimental, predicted or both) and the site and subsite types can also be selected. Each taxonomic selection step is coupled with a background count of the sites currently selected. After selection, the user has the option (i) to build a motif representation for the selected sites, (ii) to export the sites as FASTA sequences or (iii) to export the selected sites in XML format. Selecting ‘Build motif representation’ is followed by a sequence alignment step performed by the ClustalW2 algorithm (25), and the motif is exported in several output formats: TRANSFAC database (13), FASTA, Patser (26), PromScan (27), Position Weight Matrix (26) and Sequence logo (28). The XML format is based on the XML version 1.0 specification, and the output file is coupled with an XML Document Type Definition (DTD).
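Two of the export options above, FASTA output and a matrix-style motif representation, can be sketched in a few lines. This is a simplified illustration rather than the phiSITE export module: it assumes the sequences are already aligned and of equal length (phiSITE uses ClustalW2 for that step), and the function names, record names and toy sequences are hypothetical.

```python
def to_fasta(records, width=60):
    """Format (name, sequence) pairs as FASTA text with wrapped lines."""
    lines = []
    for name, seq in records:
        lines.append(f">{name}")
        lines.extend(seq[i:i + width] for i in range(0, len(seq), width))
    return "\n".join(lines) + "\n"


def count_matrix(aligned, alphabet="ACGT"):
    """Per-position base counts for equal-length, pre-aligned sequences."""
    length = len(aligned[0])
    if any(len(seq) != length for seq in aligned):
        raise ValueError("sequences must be aligned to the same length")
    matrix = {base: [0] * length for base in alphabet}
    for seq in aligned:
        for pos, base in enumerate(seq.upper()):
            if base in matrix:  # gap characters are skipped
                matrix[base][pos] += 1
    return matrix


def consensus(matrix):
    """Most frequent base at each position of a count matrix."""
    length = len(next(iter(matrix.values())))
    return "".join(max(matrix, key=lambda b: matrix[b][i])
                   for i in range(length))


# Toy, made-up aligned hexamers standing in for selected sites.
aligned_sites = ["TATAAT", "TATGAT", "TACAAT", "TATACT"]
matrix = count_matrix(aligned_sites)
fasta = to_fasta([(f"site{i + 1}", s) for i, s in enumerate(aligned_sites)])
print(consensus(matrix))
print(fasta)
```

A count matrix of this kind is the usual starting point for the weight-matrix and sequence-logo formats listed above; converting it to any specific tool's file format is left out of this sketch.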

CONCLUSION

phiSITE is a manually curated database dedicated to gene regulation in bacteriophages. It is the first resource of this kind and it is freely available to all potential users. It collects mainly experimentally detected cis-regulatory elements of phage genomes, harvested from scientific articles. These data are accompanied by additional information about phages and phage hosts, external links and associated tools. Curation and updating of the phiSITE database will continue. Further enhancements will include improved visualization models for selected bacteriophages, with possible application in systems biology simulation engines, and the implementation of web services to access the data. The next version of the genome browser will also provide a direct link to the description of genes and regulatory elements, reached by clicking the corresponding element in the browser, as well as improved graphical rendering of the visualized entities. We welcome feedback from the scientific community in order to improve the services provided by the phiSITE platform.

SUPPLEMENTARY DATA

Supplementary Data are available at NAR Online.

FUNDING

Slovak Research and Development Agency [grant number APVT-51-025004]; Scientific Grant Agency of the Ministry of Education of the Slovak Republic and of Slovak Academy of Sciences [grant number VEGA 2/0100/09].

Conflict of interest statement. None declared.
  25 in total

Review 1.  Computational studies of gene regulatory networks: in numero molecular biology.

Authors:  J Hasty; D McMillen; F Isaacs; J J Collins
Journal:  Nat Rev Genet       Date:  2001-04       Impact factor: 53.242

2.  Designing better phages.

Authors:  S S Skiena
Journal:  Bioinformatics       Date:  2001       Impact factor: 6.937

3.  A higher-order background model improves the detection of promoter regulatory elements by Gibbs sampling.

Authors:  G Thijs; M Lescot; K Marchal; S Rombauts; B De Moor; P Rouzé; Y Moreau
Journal:  Bioinformatics       Date:  2001-12       Impact factor: 6.937

4.  Bio-nanotechnology: Two-way traffic.

Authors:  T Andrew Taton
Journal:  Nat Mater       Date:  2003-02       Impact factor: 43.841

5.  The systems biology markup language (SBML): a medium for representation and exchange of biochemical network models.

Authors:  M Hucka; A Finney; H M Sauro; H Bolouri; J C Doyle; H Kitano; A P Arkin; B J Bornstein; D Bray; A Cornish-Bowden; A A Cuellar; S Dronov; E D Gilles; M Ginkel; V Gor; I I Goryanin; W J Hedley; T C Hodgman; J-H Hofmeyr; P J Hunter; N S Juty; J L Kasberger; A Kremling; U Kummer; N Le Novère; L M Loew; D Lucio; P Mendes; E Minch; E D Mjolsness; Y Nakayama; M R Nelson; P F Nielsen; T Sakurada; J C Schaff; B E Shapiro; T S Shimizu; H D Spence; J Stelling; K Takahashi; M Tomita; J Wagner; J Wang
Journal:  Bioinformatics       Date:  2003-03-01       Impact factor: 6.937

Review 6.  Bacteriophages: evolution of the majority.

Authors:  Roger W Hendrix
Journal:  Theor Popul Biol       Date:  2002-06       Impact factor: 1.570

7.  Construction of phi29 DNA-packaging RNA monomers, dimers, and trimers with variable sizes and shapes as potential parts for nanodevices.

Authors:  Dan Shu; Lisa P Huang; Stephen Hoeprich; Peixuan Guo
Journal:  J Nanosci Nanotechnol       Date:  2003-08

8.  WebLogo: a sequence logo generator.

Authors:  Gavin E Crooks; Gary Hon; John-Marc Chandonia; Steven E Brenner
Journal:  Genome Res       Date:  2004-06       Impact factor: 9.043

9.  Complete nucleotide sequence of bacteriophage MS2 RNA: primary and secondary structure of the replicase gene.

Authors:  W Fiers; R Contreras; F Duerinck; G Haegeman; D Iserentant; J Merregaert; W Min Jou; F Molemans; A Raeymaekers; A Van den Berghe; G Volckaert; M Ysebaert
Journal:  Nature       Date:  1976-04-08       Impact factor: 49.962

Review 10.  Domain architectures of sigma54-dependent transcriptional activators.

Authors:  David J Studholme; Ray Dixon
Journal:  J Bacteriol       Date:  2003-03       Impact factor: 3.490

