Updates on the web-based VIOLIN vaccine database and analysis system.

Yongqun He, Rebecca Racz, Samantha Sayers, Yu Lin, Thomas Todd, Junguk Hur, Xinna Li, Mukti Patel, Boyang Zhao, Monica Chung, Joseph Ostrow, Andrew Sylora, Priya Dungarani, Guerlain Ulysse, Kanika Kochhar, Boris Vidri, Kelsey Strait, George W Jourdian, Zuoshuang Xiang.

Abstract

The integrative Vaccine Investigation and Online Information Network (VIOLIN) vaccine research database and analysis system (http://www.violinet.org) curates, stores, analyses and integrates various vaccine-associated research data. Since its first publication in NAR in 2008, significant updates have been made. Starting from 211 vaccines annotated at the end of 2007, VIOLIN now includes over 3240 vaccines for 192 infectious diseases and eight noninfectious diseases (e.g. cancers and allergies). Under the umbrella of VIOLIN, >10 relatively independent programs are developed. For example, Protegen stores over 800 protective antigens experimentally proven valid for vaccine development. VirmugenDB annotated over 200 'virmugens', a term coined by us to represent those virulence factor genes that can be mutated to generate successful live attenuated vaccines. Specific patterns were identified from the genes collected in Protegen and VirmugenDB. VIOLIN also includes Vaxign, the first web-based vaccine candidate prediction program based on reverse vaccinology. VIOLIN collects and analyzes different vaccine components including vaccine adjuvants (Vaxjo) and DNA vaccine plasmids (DNAVaxDB). VIOLIN includes licensed human vaccines (Huvax) and veterinary vaccines (Vevax). The Vaccine Ontology is applied to standardize and integrate various data in VIOLIN. VIOLIN also hosts the Ontology of Vaccine Adverse Events (OVAE) that logically represents adverse events associated with licensed human vaccines.

Year:  2013        PMID: 24259431      PMCID: PMC3964998          DOI: 10.1093/nar/gkt1133

Source DB:  PubMed          Journal:  Nucleic Acids Res        ISSN: 0305-1048            Impact factor:   16.971


INTRODUCTION

Vaccination is one of the most significant inventions in modern medicine and has dramatically improved human health. However, efforts to develop vaccines against many diseases have not been successful. For example, the infectious diseases AIDS, tuberculosis and malaria are three of the top five threats to human health (1), but no effective and safe vaccine is available against any of them. Vaccines can also be developed against many noninfectious diseases, including cancer, allergy and autoimmune diseases. Governments and nonprofit foundations have increased their funding for vaccine research. For example, the Gates Foundation has donated billions of dollars to vaccine research and development. It has been anticipated that the Gates Foundation donation, combined with commitments from the USA and other governments, could prevent the deaths of 8 million children from 2010 to 2019 (2). Intensive vaccine research and development has produced a large volume of data and publications. A recent study confirmed that the number of vaccine-related records in PubMed (3) is increasing at an exponential rate (4). To address the challenge of integrating and analyzing published vaccine-related results, we developed the Vaccine Investigation and Online Information Network (VIOLIN, http://www.violinet.org) (5). VIOLIN has become the largest web-based vaccine database and analysis system for vaccine researchers. The vaccines annotated in VIOLIN include licensed vaccines, vaccines being tested in clinical trials and vaccines that have been studied in research and experimentally verified to be effective in at least one laboratory animal model. The data collected for each vaccine cover vaccine components (e.g. vaccine adjuvants), protection efficacy and host immune responses. VIOLIN also contains many specific software programs, including several for vaccine literature mining and vaccine design. The first VIOLIN paper was published in the Database Issue of Nucleic Acids Research in 2008 (5). Since then, substantial progress has been made. This article summarizes the major changes and updates since 2008.

OVERALL SYSTEM DESIGN, ANNOTATION PIPELINE AND STATISTICS

VIOLIN is currently implemented using a three-tier architecture built on two HP ProLiant DL380 G6 servers that run a Red Hat Linux operating system (Red Hat Enterprise Linux ES 4). Users submit database queries through the web. These queries are processed using PHP/SQL (middle tier, application server based on Apache) against a MySQL (version 5.0) relational database (back end, database server). The result of each query is then presented to the user through the web browser. The data stored on the two servers are routinely backed up to each other and to additional storage space available at the University of Michigan.

All the annotated information in VIOLIN is obtained by manual curation from peer-reviewed literature or other reliable websites. PubMed (3) is our major source of peer-reviewed publications. Manual curation emphasizes the retrieval of experimental evidence of vaccine efficacy and ensures the accuracy of vaccine information in VIOLIN. Additionally, a web-based curation and literature mining system (Limix) is used (5,6). The interactive Limix system allows data curators to search literature, copy and edit text, add references, submit data to the database and check submission history. Limix also provides data reviewers with a platform and tools to review, edit and approve the curated data. All these features are available through a user-friendly web interface. Data submitted to VIOLIN are reviewed by an expert and, upon approval, published to the VIOLIN website.

As shown in our original VIOLIN publication, VIOLIN contained ∼200 vaccines or vaccine candidates against 18 pathogens at the end of 2007 (5). After 5 years of work, VIOLIN now includes over 3200 vaccines or vaccine candidates against 192 infectious diseases and eight noninfectious diseases. Table 1 lists the most heavily annotated pathogens and diseases together with the related vaccine information stored in VIOLIN. More details can be found on the VIOLIN statistics page: http://www.violinet.org/stat.php.
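As an illustration of the query flow described in this section (web request, middle-tier SQL, back-end relational database), the following minimal Python sketch runs a parameterized keyword query. It is not the VIOLIN implementation, which uses PHP against MySQL: the table name, column names and rows are hypothetical, and an in-memory SQLite database stands in for the MySQL back end.

```python
import sqlite3


def build_demo_db() -> sqlite3.Connection:
    """Create an in-memory stand-in for the back-end database (hypothetical schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE vaccine ("
        "id INTEGER PRIMARY KEY, name TEXT, pathogen TEXT, licensed INTEGER)"
    )
    # Placeholder rows invented for this example; not VIOLIN records.
    conn.executemany(
        "INSERT INTO vaccine (name, pathogen, licensed) VALUES (?, ?, ?)",
        [
            ("Demo vaccine A", "Brucella spp.", 1),
            ("Demo vaccine B", "Brucella spp.", 0),
            ("Demo vaccine C", "Influenza virus", 1),
        ],
    )
    return conn


def search_vaccines(conn: sqlite3.Connection, keyword: str):
    """Middle-tier step: run a parameterized keyword query and return the rows."""
    pattern = f"%{keyword}%"
    cursor = conn.execute(
        "SELECT name, pathogen, licensed FROM vaccine "
        "WHERE name LIKE ? OR pathogen LIKE ? ORDER BY name",
        (pattern, pattern),
    )
    return cursor.fetchall()


conn = build_demo_db()
rows = search_vaccines(conn, "Brucella")
for name, pathogen, licensed in rows:
    # Presentation step: in a web application this would be rendered as HTML.
    print(name, pathogen, "licensed" if licensed else "candidate")
```

Passing the keyword as a bound parameter rather than concatenating it into the SQL string keeps user input out of the query text, which is a common precaution for any web-facing database.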
Table 1.

Representative VIOLIN statistics as of 18 October 2013

| Pathogen (disease) | No. of vaccines and licensed vaccines | No. of protective antigens used | No. of virmugens used |
|---|---|---|---|
| Gram-positive bacteria | | | |
| Clostridium tetani (tetanus) | 60 (57)a | 2 | 0 |
| Mycobacterium tuberculosis (Tuberculosis) | 43 (2) | 26 | 15 |
| Erysipelothrix rhusiopathiae (Erysipelas) | 29 (28) | 1 | 0 |
| Bacillus anthracis (Anthrax) | 26 (4) | 13 | 1 |
| Corynebacterium diphtheriae (Diphtheria) | 22 (22) | 1 | 0 |
| Gram-negative bacteria | | | |
| Leptospira spp. (Leptospirosis) | 132 (130) | 2 | 0 |
| Salmonella spp. (Salmonellosis) | 62 (19) | 6 | 46 |
| Brucella spp. (Brucellosis) | 60 (7) | 25 | 15 |
| Escherichia coli (Hemorrhagic colitis) | 40 (14) | 17 | 4 |
| Haemophilus influenzae (Meningitis) | 30 (11) | 14 | 0 |
| Viruses | | | |
| Bovine herpesvirus 1 (Infectious bovine rhinotracheitis) | 159 (146) | 7 | 2 |
| Influenza virus [Influenza (flu)] | 153 (89) | 49 | 2 |
| Bovine viral diarrhea virus 1 [Bovine viral Diarrhea (BVD)] | 129 (128) | 0 | 0 |
| Bovine Parainfluenza 3 Virus (BPIV-3) | 108 (108) | 0 | 0 |
| Newcastle disease virus (Newcastle disease) | 97 (95) | 3 | 0 |
| Parasite | | | |
| Plasmodium spp. (Malaria) | 36 (0) | 33 | 7 |
| Leishmania donovani (Visceral leishmaniasis) | 15 (1) | 12 | 1 |
| Toxoplasma gondii (Toxoplasmosis) | 14 (0) | 12 | 3 |
| Trypanosoma cruzi (Chagas disease) | 14 (0) | 16 | 0 |
| Eimeria spp. (Coccidiosis) | 11 (8) | 1 | 0 |
| Fungi | | | |
| Coccidioides spp. (Coccidioidomycosis) | 4 (0) | 9 | 0 |
| Noninfectious disease | | | |
| Cancer | 52 (2) | 72 | 0 |
| Arthritis | 4 (0) | 4 | 0 |
| Diabetes | 4 (0) | 5 | 0 |
| Atherosclerosis (Atherosclerosis, arteriosclerotic vascular disease) | 2 (0) | 0 | 0 |
| Allergy | 1 (0) | 15 | 0 |

aThe number in parentheses corresponds to the number of licensed vaccines.

VIOLIN includes many relatively independent programs (e.g. Protegen and Vaxjo) targeting specific vaccine-related domains (e.g. protective antigens and vaccine adjuvants) (Figure 1 and Table 2). Their relations are shown in Figure 1. Specifically, a vaccine is developed to protect (or treat) a host against a disease. The disease can be an infectious disease caused by an infectious microbe or a noninfectious disease such as cancer or an autoimmune disease. A vaccine has different components, such as a protective antigen and a vaccine adjuvant. Depending on the classification criteria, different vaccine types exist, such as live attenuated vaccines and DNA vaccines. Once administered, a vaccine induces specific immune responses, including humoral antibody responses and cell-mediated immunity. The level of protection is considered the gold standard for evaluating the efficacy of a vaccine (Figure 1).
Figure 1.

Overview of major VIOLIN components and their relations. The program names inside parentheses are databases or tools. For components with no quoted program names, the information is listed in the general VIOLIN database.

Table 2.

Statistics of specific VIOLIN programs as of 18 October 2013

| Program | Description | URL | Release year | Reference | New or old | Comment |
|---|---|---|---|---|---|---|
| Pathogen genes/proteins and vaccine candidate prediction | | | | | | |
| Protegen | 857 protective antigens | http://violinet.org/protegen | 2009 | (8) | new | |
| VirmugenDB | 225 virmugens | http://violinet.org/virmugendb | 2011 | (12) | new | |
| VBLAST | BLAST vaccine genes | http://violinet.org/blast | 2007 | (5) | old | |
| Vaxign | Vaccine candidate prediction | http://violinet.org/vaxign | 2009 | (15,16,40) | new | precomputed or dynamic |
| Vaxitop | Immune epitope prediction | http://violinet.org/vaxign/vaxitop | 2009 | (16,40) | new | |
| Vaccine components (other than protective antigens) | | | | | | |
| Vaxjo | 93 vaccine adjuvants | http://violinet.org/vaxjo | 2010 | (22) | new | |
| DNAVaxDB | 144 DNA vaccine plasmids | http://violinet.org/dnavaxdb | 2010 | (23) | new | |
| Vaxvec | Vaccine vectors | http://violinet.org/vaxvec | 2010 | | new | |
| Specific vaccine types | | | | | | |
| DNAVaxDB | 419 DNA vaccines | http://violinet.org/dnavaxdb | 2012 | (23) | new | for 99 pathogens |
| VirmugenDB | 207 attenuated vaccines | http://violinet.org/virmugendb | 2011 | (12) | new | for 57 pathogens |
| Vevax | >1000 licensed veterinary vaccines | http://violinet.org/vevax | 2011 | | new | for 106 pathogens |
| Huvax | 184 licensed human vaccines | http://violinet.org/huvax | 2011 | | new | for 29 pathogens |
| Vaccine literature mining | | | | | | |
| VO-SciMiner | Brucella vaccine-gene interaction network | http://violinet.org/vo-sciminer | 2011 | (32) | new | |
| Vaxmesh | MeSH-based literature mining | http://violinet.org/litesearch/meshtree/meshtree.php | 2007 | (5) | old | |
| Litesearch | Vaccine literature search | http://violinet.org/litesearch/keywords_search.php | 2007 | (5) | old | |
| Hosting of community efforts | | | | | | |
| VO | Community-based VO | http://violinet.org/vaccineontolog | 2008 | (26,27) | new | >4800 terms |
| OVAE | Ontology of Vaccine Adverse Events | http://www.violinet.org/ovae/ | 2013 | (41) | new | >1300 terms |
| ICoVax2012 | 2012 Computational vaccinology workshop | http://violinet.org/icovax2012 | 2012 | (35) | new | |
| ICoVax2013 | 2013 Computational vaccinology workshop | http://violinet.org/icovax2013 | 2013 | | new | |
| Data retrieval (other than protective antigens) | | | | | | |
| V-Utilities | Data query and retrieval utilities | http://violinet.org/v-utilities | 2010 | | new | for software programming |

The individual VIOLIN programs share common annotated data in the general VIOLIN database. However, these programs often contain additional data that are unique to the program, such as vaccine adjuvant-specific data (e.g. adjuvant structure), and that are not typically found in the description of a specific vaccine. These individual programs also include their own query interfaces, as well as other related resources, including BLAST analysis (7) and web pages that provide our annotated data for download.

VACCINE-RELATED PATHOGEN GENES/PROTEINS

In modern vaccine research, it is critical to identify genes and proteins that can be directly used for vaccine development. Two types of pathogen genes or proteins are used for developing vaccines. One type comprises protective antigens, which are able to induce antigen-specific protective immunity. The other type comprises microbial virulence factors that can be mutated in virulent pathogens to make live attenuated vaccines. VIOLIN has incorporated individual programs specifically targeted to these two types of genes and proteins.

Protegen: database of protective antigens

Protegen was developed in 2010 to store and analyze protective antigens (Table 2) (8). To qualify as a protective antigen and be included in Protegen, an antigen must have been used in the development of an experimentally verified vaccine or have been experimentally shown to induce an immune response (e.g. production of neutralizing antibody) that correlates with protection. This is a key difference between the antigens collected in Protegen and those in some other databases, such as AntigenDB, which focuses on the induction of immune responses without requiring protection (9). Currently, Protegen holds over 800 protective antigens from 200 pathogens (Table 2). Over 200 protective antigens have been added since the Protegen paper was published in 2010. The protective antigens collected in Protegen have been used in the development of different types of vaccines, particularly protein subunit vaccines, DNA vaccines and recombinant vector vaccines (8). We have also analyzed the Protegen data to find specific patterns (10). For example, among 201 protective protein antigens from Gram-negative bacteria, 48% are extracellular or outer membrane proteins and ∼40% are adhesins or adhesin-like proteins. Among 69 protective protein antigens from Gram-positive bacteria available in Protegen, 64% are extracellular or cell wall proteins, and 54% are adhesins or adhesin-like proteins. Many conserved domains, including autotransporter and TonB domains, are enriched in these bacterial protective antigens (10). In addition to our own data analysis, the data in Protegen have been used as the gold standard for evaluating different vaccine candidate prediction methods (11). In the study conducted by Jaiswal et al. (11), the Protegen data were used to evaluate four software programs, including Vaxign, in the prediction of protein vaccine candidates. The many nonadhesin and nonsurface bacterial vaccine candidates collected in Protegen remain a challenge for current vaccine candidate prediction programs.

VirmugenDB: database of ‘virmugens’

A pathogen carries various virulence factors, but not all of them can be knocked out to make an effective live attenuated vaccine. The term ‘virmugen’ was coined by Dr Yongqun He to represent a gene encoding a virulence factor of a pathogen that, when mutated, has been proven feasible in laboratory animal studies to create a live attenuated vaccine (12). Currently, VirmugenDB contains over 220 virmugens that were mutated to make more than 200 vaccines experimentally verified as useful for vaccination against over 50 bacterial, viral and protozoal pathogens (Table 2). Significant patterns were identified from analysis of the virmugen data. For example, the aroA gene has been used as a virmugen in 10 Gram-negative bacteria and one Gram-positive bacterium. The aroA gene sequences in the 10 Gram-negative bacteria share at least 50% identity (12). This gene encodes a key enzyme involved in aromatic amino acid biosynthesis. This finding suggests that interfering with the aromatic amino acid biosynthesis pathway is a good strategy for live attenuated vaccine development. Indeed, the analysis of all virmugens found that virmugens tend to be involved in the metabolism of nutrients (e.g. amino acids, carbohydrates, nucleotides) and in cell membrane formation. Compared with other virulence factors, virmugens are likely to have specific characteristics, which deserve more investigation. Host genes whose expression was regulated by virmugen mutation vaccines or by wild-type virulent pathogens were also annotated and compared, with the ultimate aim of identifying the protective immune mechanisms specifically targeted by vaccines (12).

Customized BLAST analysis programs

BLAST (7) is a commonly used tool for gene and protein sequence similarity searches, and several customized BLAST programs are available in VIOLIN. Protegen and VirmugenDB include customized BLAST libraries of the DNA and protein sequences of protective antigens and virmugens, respectively, for sequence similarity searches. DNAVaxDB also includes a BLAST search program for comparing user-provided gene or protein sequence(s) with protective antigens used in the development of DNA vaccines. In addition, VIOLIN provides the VBLAST program for searching all pathogen genes and proteins annotated in the VIOLIN system. Different BLAST programs (e.g. blastn, blastp and RPS BLAST) are included as well.

Vaxign/Vaxitop: genome sequence-based vaccine candidate prediction

As an emerging vaccine development strategy in the genomics era, reverse vaccinology initiates vaccine development from bioinformatics analysis of genome sequences (13). Over the last decade, different parameters have been included for vaccine candidate prediction. The Vaxign program in VIOLIN is the first web-based vaccine design software program based on the reverse vaccinology strategy (14,15). The Vaxign pipeline predicts potential vaccine protein candidates based on the prediction of the following criteria: subcellular localization, transmembrane helices, adhesin probability, microbial sequence conservation by ortholog analysis, exclusion of proteins having orthologs in selected genome(s), similarity to host proteins and identification of major histocompatibility complex (MHC) Class I or Class II immune epitopes. The MHC Class I and II binding epitope prediction is performed using Vaxitop, an internally developed tool. Vaxitop is a position-specific scoring matrix (PSSM)-based epitope prediction program. Whereas other tools use an arbitrary percentage or rank cutoff, Vaxitop relies on a statistical P-value to examine the likelihood of a candidate peptide being an immune epitope (14,16). Vaxign has been demonstrated to effectively predict vaccine candidates for Brucella spp. (15,17,18), uropathogenic Escherichia coli (14) and human herpesvirus 1 (16). The Vaxign program has also been used for genome reannotation and prediction of virulence factors. For example, Vaxign has been applied to reannotate genome and predict virulence factor genes using the genome sequences of Campylobacter fetus subspecies (19) and Corynebacterium diphtheriae NCTC13129 (20). Currently over 350 genomes have been precomputed using the Vaxign pipeline (Table 2). The predicted results can be queried using a user-friendly web interface. Vaxign also allows users to perform dynamic vaccine candidate predictions by inputting specific sequences of up to 500 proteins.
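To make the PSSM-plus-P-value idea concrete, the sketch below builds a log-odds scoring matrix from a handful of aligned peptides, scores a candidate peptide and estimates an empirical P-value against randomly generated peptides. This is an illustrative assumption about how such a calculation can be set up, not Vaxitop's actual algorithm: the training peptides, the uniform background, the pseudocount and the random-peptide null model are all invented for the example.

```python
import math
import random

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def build_pssm(binders, pseudocount=1.0):
    """Build a log-odds PSSM (one dict per position) from equal-length peptides."""
    length = len(binders[0])
    if any(len(pep) != length for pep in binders):
        raise ValueError("all training peptides must have the same length")
    background = 1.0 / len(AMINO_ACIDS)  # assumed uniform background frequency
    pssm = []
    for pos in range(length):
        counts = {aa: pseudocount for aa in AMINO_ACIDS}
        for pep in binders:
            counts[pep[pos]] += 1
        total = sum(counts.values())
        pssm.append(
            {aa: math.log2((counts[aa] / total) / background) for aa in AMINO_ACIDS}
        )
    return pssm


def score(pssm, peptide):
    """Sum the position-specific log-odds scores of a peptide."""
    return sum(column[aa] for column, aa in zip(pssm, peptide))


def empirical_p_value(pssm, peptide, n_random=2000, seed=0):
    """Fraction of random peptides scoring at least as high as the candidate."""
    rng = random.Random(seed)  # fixed seed so the estimate is reproducible
    observed = score(pssm, peptide)
    hits = 0
    for _ in range(n_random):
        rand_pep = "".join(rng.choice(AMINO_ACIDS) for _ in range(len(pssm)))
        if score(pssm, rand_pep) >= observed:
            hits += 1
    # The +1 correction keeps the estimate strictly above zero.
    return (hits + 1) / (n_random + 1)


# Hypothetical 9-mer "binders" invented for this example; not real epitope data.
binders = ["ALAKAAAAV", "LLDKAGAAL", "ALNKSAGAV", "KLAEAAGTV", "ALAKTAAAL"]
pssm = build_pssm(binders)
p_motif = empirical_p_value(pssm, "ALAKAAAAV")
p_other = empirical_p_value(pssm, "WWWWWWWWW")
print(p_motif, p_other)
```

A P-value of this kind has the same interpretation regardless of the matrix or peptide length, which is one reason it can be preferable to a fixed score or rank cutoff.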

VACCINE COMPONENTS

Vaccines typically contain multiple components. One of the most critical components is the protective antigen(s). As described above, VIOLIN Protegen contains the information of known protective antigens. Live attenuated vaccines contain mutations of virulence factors (i.e. virmugens) leading to the attenuation phenotype. The VIOLIN VirmugenDB collects a list of known virmugens. However, a virmugen is not considered as a vaccine component since it is not part of the vaccine recipe. There are many other types of vaccine components (21). Below we introduce two more VIOLIN programs collecting two specific types of vaccine components:

Vaxjo: database of vaccine adjuvants

A vaccine adjuvant is a vaccine component used to accelerate, prolong or enhance host immune responses to coadministered protective antigens in a vaccine. Vaccine adjuvants have different actions in vivo. They may modify the cytokine immune network, deliver and present antigens to appropriate immune effector cells, induce CD8+ cytotoxic T-lymphocyte (CTL) responses or generate a short-term or long-term depot to give a continuous or pulsed release. Vaxjo is a vaccine adjuvant database that annotates and stores various vaccine adjuvants and their usage in different vaccines (22). Currently, Vaxjo contains 93 vaccine adjuvants used in 378 vaccines for over 70 pathogens (Table 2). For each vaccine adjuvant, Vaxjo provides its name, components, preparation, the vaccines in VIOLIN that use it and at least one reliable reference. Different types of vaccine adjuvants are collected. The types with the highest numbers of adjuvants are synthetic adjuvants (28), microorganism-derived adjuvants (18), emulsion adjuvants (15) and mineral salt adjuvants (13). Aluminum hydroxide is the most common adjuvant, with 62 associated vaccines collected in VIOLIN. Freund’s complete and incomplete adjuvants are also commonly used, each being associated with 42 vaccines (22).

DNA vaccine plasmids collected in DNAVaxDB

As of 14 September 2013, 141 DNA vaccine plasmids have been annotated in DNAVaxDB (23) (Table 2). These plasmids have been used to generate over 400 DNA vaccines. The most commonly used plasmids include pcDNA3.1, pcDNA3, pVAX1, pVR1012 and pCI. Specific patterns have been identified by analyzing the plasmids collected in VIOLIN. The most commonly used promoter is the human cytomegalovirus (CMV) immediate-early promoter, which elicits higher expression levels. Some plasmids have been used more frequently for DNA vaccines against one type of pathogen than against others. For example, 10 Gram-negative bacterial DNA vaccines use the plasmid pCMVi-UB, but this plasmid has not been used in DNA vaccines against any Gram-positive bacteria, viruses or parasitic pathogens (23). VIOLIN also covers other vaccine components; for example, Vaxvec collects and analyzes vaccine vectors (e.g. bacterial vaccine vectors, viral vaccine vectors). More work is necessary to systematically annotate and classify such information.

SPECIFIC VACCINE TYPES

VIOLIN includes commercial vaccines as well as those still undergoing preclinical or clinical trials. Here we introduce two programs targeting two different types of vaccines:

Huvax: licensed human vaccine database

Huvax collects licensed human vaccine data and makes them available for query. Huvax has curated all 104 human vaccines currently licensed in the USA and Canada, including 27 bacterial vaccines, 47 viral vaccines and 30 combination vaccines. The annotated data for each licensed human vaccine cover vaccine types, preparation, adjuvants, preservatives, allergens, age groups, administration routes, manufacturers, immune responses and adverse events (AEs). Analysis of the data for all licensed human vaccines revealed several patterns. For example, aluminum salts, including Al(OH)3 and AlPO4, are the most commonly used adjuvants. In addition, several preservatives, including phenol, thimerosal and 2-phenoxyethanol (24), have been commonly used in human vaccine preparation.

DNAVaxDB: DNA vaccines

DNAVaxDB is designed specifically to store and analyze DNA vaccines and their related plasmids and protective antigens. Currently, DNAVaxDB holds over 417 DNA vaccines using 141 DNA vaccine plasmids and 375 protective antigens (Table 2). These vaccines were developed against 99 infectious and noninfectious diseases (including arthritis, cancer and diabetes). To meet the needs of the many researchers who are interested only in DNA vaccines, independent web query interfaces have been developed for the DNA vaccines, plasmids and protective antigens used in DNA vaccines.

APPLICATIONS OF VACCINE ONTOLOGY ON VIOLIN DATA INTEGRATION AND LITERATURE MINING

Application of VO on VIOLIN data exchange and integration

Originally VIOLIN used VIOLINML, an eXtensible Markup Language (XML)-based format for VIOLIN data exchange (5). Over the past few years, we have switched to rely on the community-based Vaccine Ontology (VO) for data exchange. A biomedical ontology is a set of terms and relations that represent entities in a biomedical domain and how they relate to each other, and terms in ontologies are typically expressed in computer and human interpretable formats to support automated reasoning (25). VO is a biomedical ontology in the vaccine and vaccination domain (21,26,27). The development of VO follows the OBO Foundry principles, including openness, collaboration, and using a common shared syntax (28). Using the Web Ontology Language (OWL) format (http://www.w3.org/TR/owl2-quick-reference/), VO is developed to support machine processing and automated reasoning. In order to properly and efficiently develop and analyze VO, we have also developed several software programs (25,29,30), which have been widely used by the ontology community for development and analysis of other biomedical ontologies. As demonstrated in the Ontobee program (30), currently VO has over 4800 ontology terms (http://www.ontobee.org/ontostat.php?ontology=VO). These terms cover most vaccines, vaccine components (e.g. protective antigens, adjuvants), virmugens and vaccination types stored in VIOLIN. Other top-level terms and term relations are also included in VO. Through systematic alignments with top level ontologies, VO logically represents these vaccine-specific terms and the relations among them and other terminologies, such as pathogens, diseases and vaccines.

VO-SciMiner: VO-based literature mining

The SciMiner literature mining program supports literature indexing and gene name tagging (31). By integrating VO and SciMiner, VO-SciMiner was developed to retrieve, store and analyze vaccines, microbial genes and vaccine-gene interaction networks based on literature mining of PubMed articles (32). VO-SciMiner was first evaluated using the bacterial model of Brucella, a Gram-negative bacterium that causes zoonotic brucellosis in humans and various animals (6). A set of rules was established for term expansion and literature indexing of VO terms. On 100 manually annotated biomedical articles, VO-SciMiner demonstrated high recall (91%) and precision (99%) for indexing PubMed papers. The asserted and inferred VO hierarchies provide semantic support. As a result, VO-SciMiner indexing outperformed the MeSH-based PubMed literature search method in retrieving Brucella vaccine-related papers. Using the extracted abstracts of all Brucella-related papers, VO-SciMiner identified 140 Brucella genes associated with Brucella vaccines. These Brucella genes included protective antigens, virulence factors and other vaccine-related genes. The enriched biological functional categories of these genes were also identified. An integrative interaction network of Brucella vaccines and genes was constructed and used to address different questions. A web-based query interface has been developed to facilitate its use (Table 2). Our study shows that VO-SciMiner could be further developed to improve the efficiency of PubMed searching in the vaccine domain. The expansion of VO-SciMiner to other pathogens is underway.
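The effect of hierarchy-based term expansion on recall, and the way precision and recall are computed, can be illustrated with a small sketch. The hierarchy, paper texts and relevance labels below are toy data invented for the example, and the matching rule (case-insensitive substring search) is a simplification rather than the VO-SciMiner rule set.

```python
# Toy is-a hierarchy (parent -> children); not the real VO content.
CHILDREN = {
    "Brucella vaccine": ["RB51", "S19"],
}

# Toy paper texts and manual relevance labels, invented for this example.
PAPERS = {
    "p1": "Protection induced by RB51 in mice",
    "p2": "S19 immunization of cattle",
    "p3": "Review of Brucella vaccine efficacy",
    "p4": "Salmonella aroA mutants as live vaccines",
}
RELEVANT = {"p1", "p2", "p3"}


def expand(term, children):
    """Return the term plus all of its descendants in the hierarchy."""
    found = {term}
    stack = [term]
    while stack:
        for child in children.get(stack.pop(), []):
            if child not in found:
                found.add(child)
                stack.append(child)
    return found


def index_papers(papers, terms):
    """Return the ids of papers whose text mentions any of the terms."""
    lowered = [t.lower() for t in terms]
    return {pid for pid, text in papers.items() if any(t in text.lower() for t in lowered)}


def precision_recall(retrieved, relevant):
    """Precision = TP / retrieved; recall = TP / relevant."""
    true_positives = len(retrieved & relevant)
    precision = true_positives / len(retrieved) if retrieved else 0.0
    recall = true_positives / len(relevant) if relevant else 0.0
    return precision, recall


plain = index_papers(PAPERS, ["Brucella vaccine"])
expanded = index_papers(PAPERS, expand("Brucella vaccine", CHILDREN))
print(precision_recall(plain, RELEVANT))
print(precision_recall(expanded, RELEVANT))
```

In this toy setting the plain keyword finds only the paper that uses the parent term literally, whereas expansion through the hierarchy also retrieves papers that mention only the specific vaccine names.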

ONTOLOGY-BASED REPRESENTATION AND ANALYSIS OF VACCINE ADVERSE EVENTS

Although licensed vaccines are in general very safe, they sometimes induce different types of adverse events (AEs) in vaccine recipients. In the USA, the Vaccine Adverse Event Reporting System (VAERS) has been used for decades to collect vaccine AE (VAE) cases (33). The Ontology of Adverse Events (OAE) is a community-based biomedical ontology in the area of AEs (33,34). OAE has been used to analyze VAERS AE data (33). Furthermore, to better represent and analyze vaccine AEs, we developed the Ontology of Vaccine Adverse Events (OVAE; http://www.violinet.org/ovae) by extending the OAE and the VO. OVAE was used to represent and classify the AEs recorded in package insert documents of commercial vaccines licensed in the USA. With over 1300 terms, OVAE includes 87 distinct types of VAEs associated with 63 human vaccines licensed in the USA. OVAE can be used to answer questions such as which 10 vaccines are associated with the highest numbers of VAEs and which 10 VAEs are most frequently observed among vaccines. More efforts will be made to use OVAE for better analysis of VAERS data.
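The ranking questions mentioned above (top vaccines by number of VAEs, top VAEs by number of vaccines) reduce to counting over vaccine-AE associations. The sketch below shows one way to do this; the vaccine names and associations are placeholders rather than OVAE content, and the flat list of pairs is an assumed simplification of the ontology.

```python
from collections import defaultdict

# Hypothetical (vaccine, adverse event) associations; placeholders, not OVAE data.
ASSOCIATIONS = [
    ("Vaccine A", "fever"),
    ("Vaccine A", "injection-site pain"),
    ("Vaccine A", "headache"),
    ("Vaccine B", "fever"),
    ("Vaccine B", "injection-site pain"),
    ("Vaccine C", "fever"),
    ("Vaccine C", "fever"),  # duplicate record, counted only once
]


def rank_by_distinct_partners(pairs, key_index, top_n=10):
    """Rank items at key_index by how many distinct partners they are paired with."""
    partners = defaultdict(set)
    for pair in pairs:
        partners[pair[key_index]].add(pair[1 - key_index])
    # Sort by descending count, then by name so that ties are ordered deterministically.
    ranked = sorted(partners.items(), key=lambda item: (-len(item[1]), item[0]))
    return [(name, len(others)) for name, others in ranked[:top_n]]


top_vaccines = rank_by_distinct_partners(ASSOCIATIONS, key_index=0)
top_events = rank_by_distinct_partners(ASSOCIATIONS, key_index=1)
print(top_vaccines)
print(top_events)
```

Using sets of partners rather than raw record counts means that a vaccine listed twice with the same AE does not inflate its rank.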

VIOLIN VACCINE DATA QUERY

Vaxquery (http://www.violinet.org/vaxquery) is the primary data query system for searching curated vaccine data and related information stored in the VIOLIN system. The default keyword search returns four sections of output containing the keyword(s): vaccines, pathogens, vaccine-related genes and vaccine-related literature. Vaxquery also provides a set of advanced search programs (http://www.violinet.org/vaxquery/adv_vaxquery.php). The advanced Vaxquery search can be performed in three ways: a vaccine search, a pathogen search or a hierarchical data comparison. For each of these methods, a user can type keywords for specific parameters (e.g. vaccine trade name, antigen, adjuvant). The advanced hierarchical search and comparison program provides a hierarchical structure of the VIOLIN data and allows users to display selected vaccine information. These query and visualization approaches allow users to customize their search for vaccine-related information. In addition to Vaxquery, different VIOLIN programs (e.g. Protegen, Huvax) have their own query interfaces. These specific query programs search only the information in specific domains (e.g. protective antigens, human licensed vaccines).

OTHER VIOLIN PROGRAMS

Several other VIOLIN programs have been developed. For example, in addition to VO-SciMiner, three other vaccine literature mining programs exist: Litesearch, Vaxmesh and Vaxlert. These programs were described in the original VIOLIN paper published in 2008 (5). Litesearch is a simple literature search of vaccine-related publications. Vaxlert is a program that provides newly published vaccine papers and literature email alerts. Vaxmesh includes a MeSH tree hierarchy and publication records related to each MeSH term in the tree hierarchy. Several new VIOLIN programs are being developed. Vevax is a licensed veterinary vaccine database. Compared with human vaccines, many more animal vaccines have been licensed. Currently Vevax contains over 1000 licensed vaccines for 17 animal species. For the analysis and study of vaccine-related molecular mechanisms, VIOLIN provides two programs, Vaxism and Vaxar. Vaxism focuses on introducing basic information on microbial pathogenesis, protective immunity and animal models. Vaxar targets the classification and analysis of animal responses to vaccinations. In Vaxar, we have so far collected vaccine-induced responses from 35 host species. For software programmers who need to query and retrieve data, VIOLIN also provides a programming utility service (V-Utilities; http://www.violinet.org/v-utilities). The VIOLIN website has also been used to host several community-based efforts. For example, VIOLIN hosts the VO project (http://www.violinet.org/vaccineontology). VIOLIN also hosts the official websites for two International Computational Vaccinology workshops (ICoVax): ICoVax 2012 (http://www.violinet.org/icovax2012) (35) and ICoVax 2013 (http://www.violinet.org/icovax2013).

DISCUSSION

VIOLIN development over the past 5 years has been very productive. According to Google Analytics, VIOLIN has been visited by ∼60 000 unique visitors since 2008, with more visits in the last 2 years. This article introduces many individual VIOLIN programs (e.g. Protegen, VirmugenDB, Vaxjo and Vaxign), most of which have been developed since 2008. These programs can also be integrated for the study of a specific pathogen. For example, we have previously reported the application of different VIOLIN programs to simultaneously study vaccines for Brucella (17). Similar approaches can be used to study other pathogens.

The mechanisms of vaccine-induced protection against various diseases remain unclear. One largely ignored research area is the identification of mechanisms by which successful vaccines stimulate protective immunity. Systems vaccinology provides a feasible strategy to tackle this problem (36,37). Systematic annotation of host genes whose expression is induced by vaccines allows for the collection and meta-analysis of experimentally verified results identified from a large volume of peer-reviewed publications. Our previous ontology-based meta-analysis study allowed the identification of experimental factors that significantly contribute to the protection efficacy of whole organism Brucella vaccines (42,43). Analysis of omics data from publicly available high-throughput data repositories can also provide valuable novel insights regarding mechanism. We are currently exploring these possibilities to gain a better understanding of vaccine-specific protective immunity and to potentially allow the identification of early innate signatures for the immunogenicity of vaccines, the discovery of novel immune regulation mechanisms and support for rational vaccine design.

The Semantic Web aims at extending the existing web of documents into a web of data designed to be processed automatically (38). Within the Semantic Web framework, a movement known as ‘Linked Open Data’ (LOD) has emerged with the goal of publishing various open datasets on the web using machine-parsable languages such as the Resource Description Framework (RDF) and establishing good practices for sharing these data (30,39). The VO provides a foundation for integrating various vaccine datasets. Based on VO and other related ontologies, we plan to develop a ‘Linked Open Vaccine Data’ (LOVD) system to support deep data integration and sharing. Such a LOVD system will promote further basic and translational vaccine research and development.

One limitation of our manual curation of vaccine-related information is that we often cannot maintain up-to-date and complete information given the rapidly growing number of publications. A delay in updating our database is to be expected, so we will frequently miss newly developed vaccines or vaccine components published in peer-reviewed journals. As a result, a failure to find some information in the database does not mean that such information does not exist in the literature. Although literature mining programs exist that automatically extract relevant information, we find that none matches the quality and accuracy of manual curation by trained researchers. Nevertheless, we do provide some literature mining and curation programs, such as Limix and VO-SciMiner, to facilitate manual curation. We recognize this tradeoff, but ultimately we envision VIOLIN as a resource that provides high-quality information about various aspects of vaccines, at the expense of a potential delay in providing the most up-to-date information. In addition, one major goal of our study is to identify scientifically sound patterns and hypotheses from curated data, which often does not require the inclusion of all possible data. For example, our VirmugenDB study shows that many genes encoding enzymes involved in the metabolism of nutrients (e.g. amino acids, carbohydrates and nucleotides), such as the aroA gene encoding a key enzyme in aromatic amino acid biosynthesis, have been frequently knocked out to make live attenuated vaccines (12). Such a finding could be generated from the papers that we were able to find, even though that collection might not be complete.

Over the past years, we have focused on the annotation and analysis of preventive vaccines against infectious diseases. In the future, we will expand to cover vaccines against other types of diseases and expand the coverage of existing vaccine types (e.g. cancer vaccines). As therapeutic vaccine development becomes more extensive in those research fields, manual annotation and analysis of data on therapeutic vaccines will become a primary research topic in our continued VIOLIN project development. We anticipate that VIOLIN will continue to be a comprehensive and crucial source for vaccine knowledge collection, vaccine data analysis and rational vaccine design.
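
As a rough illustration of the Linked Open Data idea discussed above, the following sketch shows how a single curated vaccine record might be expressed as RDF triples in N-Triples syntax. It is not part of VIOLIN or of the planned LOVD system: every IRI under example.org, the hasMutatedGene property and the record itself are hypothetical placeholders rather than Vaccine Ontology identifiers. Only the rdf:type and rdfs:label IRIs are standard W3C terms.

```python
# Minimal sketch: serialize a vaccine record as RDF triples (N-Triples).
# ASSUMPTION: the example.org namespace and all identifiers built from it are
# hypothetical placeholders, not real Vaccine Ontology (VO) terms.
EX = "http://example.org/lovd/"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
RDFS_LABEL = "http://www.w3.org/2000/01/rdf-schema#label"


def iri(value):
    """Wrap an IRI in angle brackets, as N-Triples requires."""
    return f"<{value}>"


def literal(text):
    """Quote a plain string literal, escaping backslashes and double quotes."""
    escaped = text.replace("\\", "\\\\").replace('"', '\\"')
    return f'"{escaped}"'


def to_ntriples(triples):
    """Serialize (subject, predicate, object) terms, one triple per line."""
    return "\n".join(f"{s} {p} {o} ." for s, p, o in triples)


# One hypothetical record: a live attenuated vaccine with a mutated aroA gene.
vaccine = iri(EX + "vaccine/1")
triples = [
    (vaccine, iri(RDF_TYPE), iri(EX + "Vaccine")),
    (vaccine, iri(RDFS_LABEL), literal("example live attenuated vaccine")),
    (vaccine, iri(EX + "hasMutatedGene"), iri(EX + "gene/aroA")),
]

print(to_ntriples(triples))
```

Because each statement is a self-contained subject-predicate-object line, records published this way by different databases could in principle be merged by simple concatenation, provided they share ontology identifiers for the same entities.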

FUNDING

NIH NIAID R01 grant [R01AI081062 to Y.H., for R.R., S.S., Y.L., X.L. and Z.X.]; Startup Funding (to Y.H.) from the University of Michigan; Undergraduate Research Opportunity Program (UROP) (to Y.H.) at the University of Michigan (for J.O., A.S., P.D., G.U., K.K., B.V. and K.S.). Funding for open access charge: National Institutes of Health R01 grant [R01AI081062].

Conflict of interest statement. None declared.
