
PIG--the pathogen interaction gateway.

Tim Driscoll, Matthew D Dyer, T M Murali, Bruno W Sobral.

Abstract

Protein-protein interactions (PPIs) play a vital role in initiating infection in a number of pathogens. Identifying which interactions allow a pathogen to infect its host can help us to understand methods of pathogenesis and provide potential targets for therapeutics. Public resources for studying host-pathogen systems, in particular PPIs, are scarce. To facilitate the study of host-pathogen PPIs, we have collected and integrated host-pathogen PPI (HP-PPI) data from a number of public resources to create the Pathogen Interaction Gateway (PIG). PIG provides a text-based search and a BLAST interface for searching the HP-PPI data. Each entry in PIG includes information such as the functional annotations and the domains present in the interacting proteins. PIG provides links to external databases to allow for easy navigation among the various websites. Additionally, PIG includes a tool for visualizing a single HP-PPI network or comparing two HP-PPI networks. PIG can be accessed at http://pig.vbi.vt.edu.

Substances: Proteins

Year: 2008 PMID: 18984614 PMCID: PMC2686532 DOI: 10.1093/nar/gkn799

Source DB: PubMed Journal: Nucleic Acids Res ISSN: 0305-1048 Impact factor: 16.971

INTRODUCTION

Protein–protein interactions (PPIs) play a vital role in initiating infection in a number of pathogens. For example, HIV uses host surface proteins to gain entrance to the host cell. The HIV protein ENV attaches to the human glycoprotein CD4 and subsequently to the host chemokine receptors CCR5 and CXCR4. These binding events cause conformational changes in viral proteins that allow the membrane of the virus to fuse with the host cell membrane, enabling the virus to enter the host cell. Knowing which PPIs allow a pathogen to infect its host provides critical insights into methods of pathogenesis and potential targets for therapeutics.

Unfortunately, studying host–pathogen PPIs currently requires the navigation of several websites. To address this, we have created the Pathogen Interaction Gateway (PIG), an integrated platform of experimentally verified and manually curated host–pathogen PPIs (HP–PPIs) and associated computational tools. Currently there are a number of public databases (1–4) and other resources (e.g. http://www.proteomicsresource.org) that store data for experimentally verified and manually curated host–pathogen PPIs. PIG is designed to integrate data from these various public resources and the primary literature into a single data warehouse. Currently PIG only contains data on human–pathogen PPIs; data for other host–pathogen systems will be included as they become available.

Our goals for PIG are: (i) to create a centralized location for experimentally verified and manually curated HP–PPIs that is integrated with other public resources; (ii) to provide an easy-to-use web interface for accessing and using data that would otherwise require the navigation of several websites; (iii) to provide a platform upon which various tools can be developed for identifying potential targets for therapeutics; and (iv) to set the stage for developing methods for predicting host–pathogen PPIs.

CONSTRUCTION AND CONTENT

We designed a PostgreSQL relational database to store the data in PIG. From each interaction database, we gather important information for each HP–PPI including protein IDs, organism information, literature references and the experimental method used to identify the interaction. Next, we map all protein IDs to UniProt (5) entry IDs to allow for easy integration of other external resources and genomic information. We ignore any interaction in which a protein does not have a mapping to the UniProt system. We focus on HP–PPIs corresponding to host–pathogen systems of interest (currently only human pathogens). We then create a non-redundant set of HP–PPIs and insert them into PIG. In addition to HP–PPIs we also gather and integrate intra-species PPIs from public resources (1,3,4,6–9). We obtain protein sequence information from UniProt (5), functional annotations for protein entries from the Gene Ontology (10) and functional domain data from InterProScan (11). PIG currently contains 20 113 host–pathogen PPIs for 206 different pathogen strains collated from 1322 literature sources. A summary of the PPIs included in PIG can be seen in Table 1. Users can download the data stored in PIG as either a PostgreSQL dump or as tab-delimited files.
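The mapping, filtering and de-duplication steps described above can be sketched as follows. This is a minimal illustration and not the PIG implementation: the record layout, the function name `build_nonredundant`, and all identifiers in the example data are hypothetical, and we assume that two records are redundant when they map to the same pair of UniProt entries.

```python
from typing import Dict, Iterable, NamedTuple, Set, Tuple


class RawInteraction(NamedTuple):
    """One HP-PPI record as gathered from a source database (hypothetical layout)."""
    host_id: str       # protein ID in the source database's own scheme
    pathogen_id: str   # protein ID in the source database's own scheme
    reference: str     # literature reference
    method: str        # experimental method used to identify the interaction


def build_nonredundant(
    raw: Iterable[RawInteraction],
    id_to_uniprot: Dict[str, str],
) -> Dict[Tuple[str, str], Set[Tuple[str, str]]]:
    """Map IDs to UniProt, skip unmapped interactions, merge duplicate pairs.

    Returns a dict keyed by (host UniProt ID, pathogen UniProt ID) whose value
    is the set of (reference, method) pairs supporting that interaction.
    """
    merged: Dict[Tuple[str, str], Set[Tuple[str, str]]] = {}
    for rec in raw:
        host = id_to_uniprot.get(rec.host_id)
        pathogen = id_to_uniprot.get(rec.pathogen_id)
        if host is None or pathogen is None:
            continue  # at least one protein has no UniProt mapping: ignore it
        merged.setdefault((host, pathogen), set()).add((rec.reference, rec.method))
    return merged


# Hypothetical example data: two source databases report the same interaction
# under different IDs, and a third record contains an unmappable protein.
raw_records = [
    RawInteraction("dbA:h1", "dbA:p1", "ref-1", "yeast two-hybrid"),
    RawInteraction("dbB:h1x", "dbB:p1x", "ref-2", "co-immunoprecipitation"),
    RawInteraction("dbA:h2", "dbA:unknown", "ref-3", "yeast two-hybrid"),
]
id_map = {
    "dbA:h1": "UNIPROT_HOST_1",
    "dbB:h1x": "UNIPROT_HOST_1",
    "dbA:p1": "UNIPROT_PATH_1",
    "dbB:p1x": "UNIPROT_PATH_1",
    "dbA:h2": "UNIPROT_HOST_2",
}
hp_ppis = build_nonredundant(raw_records, id_map)
```

In this sketch the two records for the same protein pair collapse into one entry that keeps both pieces of supporting evidence, while the record with the unmapped pathogen protein is dropped.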

Table 1.

Summary of the host–pathogen PPI data currently stored in PIG

Group	Number of PPIs	Number of strains	Number of references
HIV	9095	49	1035
Yersinia	4111	3	5
Bacillus	3077	3	11
Francisella	1383	1	1
Hepatitis	1244	16	95
Influenza	287	4	5
Papillomavirus	229	12	177
Epstein-Barr virus	206	2	36
Adenovirus	82	9	61
Herpesvirus	63	20	56
Sarcoma virus	51	6	57
Clostridium	45	4	5
T-lymphotropic virus	25	2	12
Escherichia coli	22	2	7
Chlamydia	20	2	3
Neisseria	16	1	3
Streptococcus	14	5	7
Vaccinia virus	13	4	8
Staphylococcus	12	3	17
Pseudomonas	11	1	4
Measles virus	10	3	5
Polyomavirus	8	3	11
Leukemia virus	7	1	1
Shigella	6	1	7
Plasmodium	6	2	2
Anemia virus	4	4	8
Hantaan virus	4	1	1
SARS	4	1	5
Listeria	4	1	3
Salmonella	3	1	4
Dengue virus	3	3	3
Seoul virus	3	1	1
Echovirus	3	2	2
Helicobacter	3	2	5
Rotavirus	3	3	3
Foamy virus	2	1	1
SIV	2	2	1
Dictyostelium	2	1	2
Puumala virus	2	1	1
Orf virus	2	2	1
Aeromonas	2	1	1
Stomatitis virus	2	1	1
Mycoplasma	2	1	1
Bothrops	2	1	1
Campylobacter	2	1	1
Vipera	1	1	1
Sendai virus	1	1	1
Pneumocystis	1	1	1
Corynephage	1	1	1
Nucleopolyhedrovirus	1	1	1
Candida	1	1	1
Rabies virus	1	1	1
Toxoplasma	1	1	1
Poliovirus	1	1	1
Nipah virus	1	1	1
Klebsiella	1	1	1
Enterobacteria	1	1	1
Mokola virus	1	1	1
West Nile virus	1	1	1
Tula virus	1	1	1
Ebolavirus	1	1	1
Total	20 113	206	1322

For each host–pathogen system, we list the number of known PPIs, the number of strains and the number of literature references.


USING THE PIG WEB INTERFACE

PIG contains a number of tools for identifying data of interest. First, PIG has two search tools: a simple text search (Figure 1a) and a BLASTP (12) interface (Figure 1b). The text interface allows users to search the database for key terms of interest (e.g. protein identifier or protein name). This search is executed against the protein IDs and descriptions stored in PIG. Valid results return links to protein-specific pages that contain information about the protein itself, such as functional annotations and domains, along with a list of interactions in which the protein participates (Figure 1c). Protein attributes such as ID, functional annotations and domains are hyperlinked to their respective external websites to allow easy navigation among different resources (Figure 1d). The BLASTP interface allows users to search entries in PIG that have sequence similarity to a protein of interest. Users can adjust BLAST search parameters. The search is executed using the BLASTP algorithm against all protein entries in PIG. Results are displayed in a standard fashion and each significant hit links back to a protein-specific page in PIG.
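The text search described above can be illustrated with a small sketch. The source states only that the search runs against the protein IDs and descriptions stored in PIG; the case-insensitive substring matching used here, the function name `text_search`, and the example records are assumptions made for illustration.

```python
from typing import Dict, List


def text_search(proteins: List[Dict[str, str]], term: str) -> List[Dict[str, str]]:
    """Return proteins whose ID or description contains the search term.

    Matching is case-insensitive substring matching (an assumption; the
    matching rules actually used by PIG are not described in the source).
    """
    needle = term.strip().lower()
    if not needle:
        return []  # an empty query matches nothing rather than everything
    return [
        p for p in proteins
        if needle in p["id"].lower() or needle in p["description"].lower()
    ]


# Hypothetical protein records; the IDs are placeholders, not real accessions.
proteins = [
    {"id": "HOST_0001", "description": "T-cell surface glycoprotein"},
    {"id": "HOST_0002", "description": "Chemokine receptor"},
    {"id": "PATH_0001", "description": "Envelope glycoprotein"},
]
hits = text_search(proteins, "glycoprotein")
```

Each hit would then link to the corresponding protein-specific page.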

Figure 1.

Summary of tools available on PIG. Users can search data within PIG using either (a) a simple text search or (b) a BLAST interface. Search results provide users with links to (c) individual protein pages. The individual protein pages contain information on functional annotations, domains, and known inter-species and intra-species PPIs. (d) Each piece of information contains a direct link to the corresponding external database. (e) PIG also contains visualization tools. Users can select a host–pathogen system of interest and (f) view the corresponding network. Users can use the genomic information in PIG to view subsets of the networks using (g) domain, (h) annotation and (i) experimental method data. (j) Users can follow links from the visualization page to corresponding external databases.

In addition to search tools, PIG also provides a platform for visualizing HP–PPI networks. Users can visualize the network for either a single host–pathogen system or two host–pathogen systems. In both cases the user first selects a host system and a group of pathogens (Figure 1e). Each group entry expands to list specific strains, thus allowing a user to compare closely related strains of a particular pathogen or pathogens from two different groups. Networks can be visualized directly within a Web browser using a custom applet (Java required), or downloaded as GraphML files for offline use (Figure 1f). Our custom applet provides full interactivity to the user, including zooming and panning, click-and-drag, and rich contextual information. The visualization of two HP–PPI networks allows users to identify human proteins that interact with both pathogens. These conserved interactors can provide critical insights into general strategies employed by pathogens during pathogenesis. The visualization tool also allows users to quickly identify functional annotations and domains for a selected protein in the network. For each of the protein's interactors, the supporting literature references are also displayed along with the technology used to identify the interaction. These features can be used to visualize subsets of the HP–PPI network (e.g. those interactions identified using a yeast two-hybrid approach) (Figure 1g–i). Each of these features also links to the corresponding external website (Figure 1j). Finally, users can restrict a network using GO-Slim terms (10) to focus on specific functional sub-networks.
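The two-network comparison and the method-based sub-network view can be expressed as simple set and filter operations. The sketch below is illustrative only: the edge representation (host protein, pathogen protein, experimental method), the function names, and the example networks are hypothetical and are not taken from the PIG applet.

```python
from typing import List, Set, Tuple

# An edge is (host protein ID, pathogen protein ID, experimental method).
Edge = Tuple[str, str, str]


def host_interactors(edges: List[Edge]) -> Set[str]:
    """Collect the host proteins that appear in an HP-PPI network."""
    return {host for host, _pathogen, _method in edges}


def shared_host_proteins(network_a: List[Edge], network_b: List[Edge]) -> Set[str]:
    """Host proteins that interact with both pathogens."""
    return host_interactors(network_a) & host_interactors(network_b)


def filter_by_method(edges: List[Edge], method: str) -> List[Edge]:
    """Keep only the interactions identified with the given method."""
    return [edge for edge in edges if edge[2] == method]


# Hypothetical networks for two pathogens, A and B.
network_a = [
    ("H1", "A_p1", "yeast two-hybrid"),
    ("H2", "A_p1", "co-immunoprecipitation"),
    ("H3", "A_p2", "yeast two-hybrid"),
]
network_b = [
    ("H2", "B_p1", "yeast two-hybrid"),
    ("H4", "B_p1", "pull-down"),
]

shared = shared_host_proteins(network_a, network_b)
y2h_subnetwork = filter_by_method(network_a, "yeast two-hybrid")
```

Here `shared` holds the host proteins targeted by both pathogens, and `y2h_subnetwork` is the subset of network A supported by yeast two-hybrid evidence.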

CONCLUSIONS

PIG is an integrated resource that acts as a centralized location for public information on known HP–PPIs. Each protein entry in PIG is hyperlinked to its corresponding entry in the UniProt database (5), functional annotations to the Gene Ontology (10), functional domains to InterProScan (11) and interactions to PubMed entries. These links allow for easy navigation among the various websites. PIG includes a number of tools for accessing available data, including a simple text search, a BLAST interface and a tool for visualizing and comparing HP–PPI networks. Data stored in PIG are available for download on the website. Future improvements of PIG will include the integration of additional data sources such as known virulence factors and toxins (9), tools for identification of therapeutic targets and computational methods to allow users to predict PPIs between any host and pathogen system of their choice. We believe that PIG will be a valuable resource for researchers of host–pathogen systems and may aid in the identification of potential targets for therapeutics.

FUNDING

National Institute of Allergy and Infectious Diseases grant HHSN26620040035C. Funding for open access charge: NIH grant. Conflict of interest statement. None declared.

12 in total

1. Gene ontology: tool for the unification of biology. The Gene Ontology Consortium.

Authors: M Ashburner; C A Ball; J A Blake; D Botstein; H Butler; J M Cherry; A P Davis; K Dolinski; S S Dwight; J T Eppig; M A Harris; D P Hill; L Issel-Tarver; A Kasarskis; S Lewis; J C Matese; J E Richardson; M Ringwald; G M Rubin; G Sherlock
Journal: Nat Genet Date: 2000-05 Impact factor: 38.330

2. IntAct: an open source molecular interaction database.

Authors: Henning Hermjakob; Luisa Montecchi-Palazzi; Chris Lewington; Sugath Mudali; Samuel Kerrien; Sandra Orchard; Martin Vingron; Bernd Roechert; Peter Roepstorff; Alfonso Valencia; Hanah Margalit; John Armstrong; Amos Bairoch; Gianni Cesareni; David Sherman; Rolf Apweiler
Journal: Nucleic Acids Res Date: 2004-01-01 Impact factor: 16.971

3. The Database of Interacting Proteins: 2004 update.

Authors: Lukasz Salwinski; Christopher S Miller; Adam J Smith; Frank K Pettit; James U Bowie; David Eisenberg
Journal: Nucleic Acids Res Date: 2004-01-01 Impact factor: 16.971

4. Biomolecular interaction network database.

Authors: Don Gilbert
Journal: Brief Bioinform Date: 2005-06 Impact factor: 11.622

5. MINT: a Molecular INTeraction database.

Authors: Andreas Zanzoni; Luisa Montecchi-Palazzi; Michele Quondam; Gabriele Ausiello; Manuela Helmer-Citterich; Gianni Cesareni
Journal: FEBS Lett Date: 2002-02-20 Impact factor: 4.124

6. Human protein reference database--2006 update.

Authors: Gopa R Mishra; M Suresh; K Kumaran; N Kannabiran; Shubha Suresh; P Bala; K Shivakumar; N Anuradha; Raghunath Reddy; T Madhan Raghavan; Shalini Menon; G Hanumanthu; Malvika Gupta; Sapna Upendran; Shweta Gupta; M Mahesh; Bincy Jacob; Pinky Mathew; Pritam Chatterjee; K S Arun; Salil Sharma; K N Chandrika; Nandan Deshpande; Kshitish Palvankar; R Raghavnath; R Krishnakanth; Hiren Karathia; B Rekha; Rashmi Nayak; G Vishnupriya; H G Mohan Kumar; M Nagini; G S Sameer Kumar; Rojan Jose; P Deepthi; S Sujatha Mohan; T K B Gandhi; H C Harsha; Krishna S Deshpande; Malabika Sarker; T S Keshava Prasad; Akhilesh Pandey
Journal: Nucleic Acids Res Date: 2006-01-01 Impact factor: 16.971

7. MPact: the MIPS protein interaction resource on yeast.

Authors: Ulrich Güldener; Martin Münsterkötter; Matthias Oesterheld; Philipp Pagel; Andreas Ruepp; Hans-Werner Mewes; Volker Stümpflen
Journal: Nucleic Acids Res Date: 2006-01-01 Impact factor: 16.971

8. MvirDB--a microbial database of protein toxins, virulence factors and antibiotic resistance genes for bio-defence applications.

Authors: C E Zhou; J Smith; M Lam; A Zemla; M D Dyer; T Slezak
Journal: Nucleic Acids Res Date: 2006-11-07 Impact factor: 16.971

9. Reactome: a knowledgebase of biological pathways.

Authors: G Joshi-Tope; M Gillespie; I Vastrik; P D'Eustachio; E Schmidt; B de Bono; B Jassal; G R Gopinath; G R Wu; L Matthews; S Lewis; E Birney; L Stein
Journal: Nucleic Acids Res Date: 2005-01-01 Impact factor: 16.971

10. InterProScan: protein domains identifier.

Authors: E Quevillon; V Silventoinen; S Pillai; N Harte; N Mulder; R Apweiler; R Lopez
Journal: Nucleic Acids Res Date: 2005-07-01 Impact factor: 16.971
