Asif M Khan1,2, Yongli Hu3, Olivo Miotto4,5, Natascha M Thevasagayam3, Rashmi Sukumaran3, Hadia Syahirah Abd Raman6, Vladimir Brusic7, Tin Wee Tan3, J Thomas August8.
Abstract
BACKGROUND: Viral vaccine target discovery requires understanding the diversity of both the virus and the human immune system. The readily available and rapidly growing pool of viral sequence data in the public domain enables the identification and characterization of immune targets relevant to adaptive immunity. A systematic bioinformatics approach is necessary to facilitate the analysis of such large datasets for the selection of potential candidate vaccine targets.
Keywords: Bioinformatics; Database; Reverse vaccinology; Target discovery; Tools; Vaccine design; Viral diversity
Year: 2017 PMID: 29322922 PMCID: PMC5763473 DOI: 10.1186/s12920-017-0301-2
Source DB: PubMed Journal: BMC Med Genomics ISSN: 1755-8794 Impact factor: 3.063
Fig. 1 Overview of the methodology pipeline. a Summary of the main steps involved. b A workflow describing the main steps involved
Bioinformatics tools, web servers and tutorials relevant to the vaccine target discovery pipeline described herein
| Tool | URL (http or ftp) |
|---|---|
| AVANA | |
| BioEdit | |
| BLAST Web server | |
| BLAST+ | |
| BLAST+ Databases | |
| BLAST+ Manual | |
| BLAST2GO | |
| CDD | |
| ClustalX | |
| HCV Immunology Database | |
| HIV Molecular Immunology Database | |
| HIV Sequence Database | |
| Immune Epitope Database | |
| Influenza Database | |
| Jalview | |
| MHCBN | |
| MULTIPRED2 | |
| Muscle | |
| NCBI Entrez Databases | |
| NCBI Entrez Protein Database | |
| NCBI Entrez Taxonomy Database | |
| NetCTL | |
| PEPVAC | |
| Pfam | |
| PROMALS3D | |
| Prosite | |
| R | |
| SYFPEITHI | |
| Tutorial 1: create BLAST searchable database | |
| Tutorial 2: remove duplicate sequences | |
| Tutorial 3: generate tabbed alignment file (in.taln format) for input to AVANA | |
| Tutorial 4: Notes on performing BLAST search with short queries | |
| UniProt | |
Fig. 2 An example of a search result at the NCBI Entrez Taxonomy Database for data collection
Fig. 3 A sample nonamer position x with five distinct nonamer sequences and their incidences
Fig. 4 Entropy plot of nonamer peptide diversity for the envelope proteins of dengue virus (all four serotypes) and HIV-1 clade B. Large entropy values imply high variability, whereas values closer to zero represent high conservation. The envelope protein of HIV-1 clade B viruses is more diverse than the envelope protein of all reported dengue virus serotypes
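The per-position entropy plotted in Fig. 4 can be illustrated with a minimal sketch. It assumes plain, uncorrected Shannon entropy (in bits) over the distinct nonamers at each alignment position, and it skips windows containing gaps or `X`; the function names and the gap handling are illustrative assumptions and are not taken from the pipeline's own tools.

```python
import math
from collections import Counter


def nonamer_counts(aligned_seqs, start, k=9):
    """Count the distinct k-mers that start at alignment column `start`."""
    windows = (seq[start:start + k] for seq in aligned_seqs)
    # Assumption of this sketch: ignore windows with gaps or unknown residues.
    return Counter(
        w for w in windows if len(w) == k and "-" not in w and "X" not in w
    )


def shannon_entropy(counts):
    """Shannon entropy (bits) of a Counter of peptide incidences."""
    total = sum(counts.values())
    if total == 0:
        return 0.0
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def entropy_profile(aligned_seqs, k=9):
    """Entropy at every overlapping k-mer position of the alignment."""
    length = min(len(s) for s in aligned_seqs)
    return [
        shannon_entropy(nonamer_counts(aligned_seqs, i, k))
        for i in range(length - k + 1)
    ]
```

A position where every sequence carries the same nonamer scores zero, and the score grows as more distinct nonamers share the incidence more evenly.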
Fig. 5 Variant analysis of the envelope proteins of dengue virus (all four serotypes) and HIV-1 clade B. a Density plots for the incidence of total variants of the index nonamer and the entropy of the nonamer sequences for the envelope proteins of dengue virus (all four serotypes) and HIV-1 clade B. The incidence of variants is widely distributed for HIV-1 clade B, whereas for dengue it is mainly localized in the higher range, with very few positions (6%) under the 20% incidence region (dotted line). b Density plots for the incidence of all variants of the index nonamer and of the major variant at each nonamer position for the envelope proteins of dengue virus (all four serotypes) and HIV-1 clade B. The boxed regions and the adjacent values indicate the fraction and number of total nonamer positions analyzed that are highly conserved, with fewer than 20% variants of the index sequence and less than 10% incidence of the major variant. Despite the lower overall entropy of dengue virus (all four serotypes) compared to HIV-1 clade B, only 2% (9 nonamer sites) of the dengue virus envelope nonamers satisfied these criteria, while more favourable targets (9%; 80 nonamer sites) are present for the envelope of HIV-1 clade B
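The selection criteria in the Fig. 5 caption can be sketched as follows. The sketch assumes that the index nonamer is the most frequent sequence at a position, that the total variants are all other sequences, and that the major variant is the most frequent of those; the 20% and 10% thresholds come from the caption, while the function names and the returned fields are hypothetical.

```python
from collections import Counter


def variant_summary(nonamers):
    """Summarise the nonamers observed at one alignment position."""
    counts = Counter(nonamers)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no nonamers supplied for this position")
    ranked = counts.most_common()
    index_seq, index_n = ranked[0]  # assumed: index = most frequent sequence
    major_n = ranked[1][1] if len(ranked) > 1 else 0  # most frequent variant
    return {
        "index": index_seq,
        "index_pct": 100.0 * index_n / total,
        "total_variants_pct": 100.0 * (total - index_n) / total,
        "major_variant_pct": 100.0 * major_n / total,
    }


def is_favourable(summary, max_variants_pct=20.0, max_major_pct=10.0):
    """Apply the caption's thresholds: <20% total variants, <10% major variant."""
    return (
        summary["total_variants_pct"] < max_variants_pct
        and summary["major_variant_pct"] < max_major_pct
    )
```

For example, a position with five distinct nonamers at incidences of 85, 8, 4, 2 and 1 out of 100 sequences has 15% total variants and an 8% major variant, so it passes both thresholds.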
Fig. 6 Distribution in nature of hepatitis A virus (HAV) protein nonamers. Nonamers of HAV are found across viruses, archaea, bacteria, and eukaryota kingdoms. Picornaviridae are viruses of the same family as HAV. HAV is a single-stranded RNA virus that encodes four structural proteins, namely 1A (VP4), 1B (VP2), 1C (VP3), and 1D (VP1), and seven non-structural proteins (2A, 2B, 2C, 3A, 3B, 3C and 3D). The length of each protein bar indicates the number of nonamers of that protein that are matched by the other species. Highly conserved nonamers that do not match any of the species, in particular the co-circulating viruses, are favourable for selection as potential vaccine targets in order to avoid potential altered ligand effects
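The idea of counting nonamers shared with other species can be illustrated with a small sketch. It uses exact string matching over in-memory sequences, which is a simplification chosen for illustration rather than the BLAST-based search listed among the tools; the function names and the input dictionary are hypothetical.

```python
def nonamers(seq, k=9):
    """Return the set of overlapping k-mers in a protein sequence."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def shared_nonamers(target, others, k=9):
    """Map each named sequence in `others` to the k-mers it shares with `target`."""
    target_set = nonamers(target, k)
    return {name: target_set & nonamers(seq, k) for name, seq in others.items()}


def unmatched_nonamers(target, others, k=9):
    """k-mers of `target` that appear in none of the other sequences."""
    matched = set().union(*shared_nonamers(target, others, k).values())
    return nonamers(target, k) - matched
```

Nonamers returned by `unmatched_nonamers` correspond to the candidates the caption describes as favourable, provided they are also highly conserved within the virus.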