
Promiscuous prediction and conservancy analysis of CTL binding epitopes of HCV 3a viral proteome from Punjab Pakistan: an in silico approach.

Abida Shehzadi, Shahid Ur Rehman, Muhammad Idrees.

Abstract

BACKGROUND: HCV is a positive-sense RNA virus that affects approximately 180 million people worldwide, including about 10 million people in Pakistan. HCV genotype 3a is the major cause of infection in the Pakistani population. A major problem of HCV infection, especially in developing countries, is that antiviral therapy is limited by long-term treatment, high dosage and side effects. Studying the antigenic epitopes of viral sequences of a specific origin can provide an effective way to overcome the high mutation rate and to identify promiscuous binders for epitope-based subunit vaccine design. An in silico approach was applied to the entire HCV proteome of Pakistani origin in order to identify viral epitopes and their conservancy in HCV genotypes 1, 2 and 3 of diverse origin.
RESULTS: Immunoinformatic tools were applied for the predictive analysis of antigenic epitopes of HCV 3a of Pakistani origin. All predicted epitopes were then subjected to conservancy analysis in HCV genotypes 1, 2 and 3 of diverse (worldwide) origin. Using freely available web servers, 150 MHC II epitopes were predicted as promiscuous binders against 51 alleles. The E2 protein accounted for 20% of all predicted MHC II epitopes. Of the predicted MHC II epitopes, 75.33% were 77-100% conserved in genotype 3, compared with 47.33% and 40.66% in genotypes 1 and 2, respectively. 69 MHC I epitopes were predicted as promiscuous binders against 47 alleles. NS4b accounted for 26% of all predicted MHC I epitopes. Epitope conservancy was significantly higher in genotype 3 (78.26%) than in genotypes 1 and 2 (21.05%).
CONCLUSIONS: The study provides a comprehensive catalogue of potential HCV-derived CTL epitopes from the viral proteome of Pakistani origin. A considerable number of the predicted epitopes were conserved across HCV genotypes. However, the number of conserved epitopes was significantly higher in HCV genotype 3 than in genotypes 1 and 2. Despite the lower conservancy in genotypes 1 and 2, all the predicted epitopes have important implications for diagnostics and for CTL-based rational vaccine design that is effective for most populations of the world, and especially for the Pakistani population.

Year:  2011        PMID: 21303499      PMCID: PMC3042956          DOI: 10.1186/1743-422X-8-55

Source DB:  PubMed          Journal:  Virol J        ISSN: 1743-422X            Impact factor:   4.099


Background

The family Flaviviridae comprises small enveloped pathogens classified into three genera: Flavivirus, Pestivirus and Hepacivirus. Members of these genera cause various diseases in humans and in other animals such as birds, horses and pigs. The genus Flavivirus alone contains more than 70 members, including Dengue virus, West Nile virus and tick-borne encephalitis virus, while Hepatitis C virus (HCV) belongs to the genus Hepacivirus [1-3]. HCV is a positive-sense RNA virus that affects approximately 180 million people worldwide, including about 10 million people in Pakistan [4,5]. The HCV genome comprises about 9400 nucleotides that encode a single polyprotein of approximately 3010 to 3033 amino acids [6]. This polyprotein is processed by viral as well as host proteases into three structural proteins (core, E1 and E2) and four non-structural proteins (NS2, NS3, NS4 and NS5A) [7]. HCV spreads mainly via blood supply, reuse of glass syringes and needles, unsterilized medical equipment, use of the toothbrushes of HCV patients, etc. [7], and causes acute and chronic infections [8]. Clinical manifestations of acute hepatitis C infection include jaundice, fever, myalgia, fatigue, lethargy, increased ALT, anorexia and fulminant hepatic failure [7]. About 80% of HCV-infected individuals develop chronic infection [9]. Chronic infection progresses to chronic hepatitis, cirrhosis and hepatocellular carcinoma within about 10, 20 and 30 years of viral infection, respectively [10,11]. Of the 70-80% of individuals who become chronically infected, 20% develop cirrhosis and 1-5% progress to end-stage liver disease [12]. Hepatic steatosis, the accumulation of lipids in hepatocytes, is reported to be a cause of cirrhosis [13], with the more severe cases reported in patients infected with HCV genotype 3a [14]. The prevalence of steatosis in the Pakistani population is about 61.5-65.5%, compared with 32.8-81.2% in western countries [15].
The percentage of males with HCV-related chronic liver disease is higher than that of females, and most patients are between 40 and 50 years of age [5]. HCV is classified into six genotypes, each having various subtypes [16-18]. These genotypes are distributed differently in various parts of the world, and the genetic variance between them is about one third. Genotypes 1, 2 and 3 are distributed worldwide, but significant differences are observed in subtype distribution. Subtype 1a is mostly found in North America and Europe, followed by 2b and 3a. Subtype 1b is frequently found in South East Europe and Tunisia, and 2c in North Italy. Genotype 4 is mainly restricted to the Middle East and Central Africa, and genotype 5 to South Africa. Genotype 6 is distributed throughout South East Asia and has also been isolated from Hong Kong and Vietnam [17]. The most frequent HCV genotype in Pakistan is 3a (49.05%), followed by 3b (17.66%) [19]. Knowledge of HCV genotype distribution is crucial for therapy and vaccination because of its predictive value for the response to both. Effective responses to antiviral therapy are normally associated with genotypes 2 and 3 rather than with any other genotype [20]. HCV replicates at a rate of about 10^12 new virions per day. Replication is carried out by an RNA-dependent RNA polymerase, which lacks proofreading ability; this results in a high mutation rate of about 8-18 mutations in the genomic RNA per year [20,21]. Such a high mutation rate limits therapy and vaccination. The current therapy for HCV is interferon alpha (IFN-α) combined with ribavirin, which is effective in only about 50% of patients [22]. Although this response rate is not discouraging, the high dosage, long-term treatment and side effects limit its use [21,23].
In the next few years, new antiviral agents such as inhibitors of the viral protease, helicase or polymerase may further improve the response rate of the current therapeutic agents. However, antiviral therapy is not affordable in most developing countries, where the prevalence of HCV is generally the highest. Thus, given the huge reservoir of HCV worldwide, the development of an effective vaccine may be the cheapest way to control disease associated with HCV infection. Development of an effective HCV vaccine requires an understanding of the immune response. The antiviral immune response involves major histocompatibility complex (MHC) proteins and T lymphocytes (T cells). MHC molecules are classified into two broad categories, MHC I and MHC II [24]. MHC molecules bind viral antigenic epitopes and present them to T lymphocytes, which mediate viral clearance: MHC I presents antigenic epitopes to CD8+ T cells and MHC II presents them to CD4+ T cells [25,26]. CD8+ T cells, also referred to as cytotoxic T lymphocytes (CTL or Tc), limit viral infection by recognizing and subsequently killing infected cells and by secreting cytokines. CD4+ T cells, referred to as helper T cells (Th), provide growth factors and signals for the generation and maintenance of CD8+ T cells [27]. T cells recognize antigens only when they are associated with MHC, a glycoprotein exposed on the surface of vertebrate cells. T cell epitopes are also attractive vaccine candidates because they are linear and hence easy to synthesize. A vaccine developed against HCV elsewhere may not be effective for the Pakistani population because HCV genomic sequences and their distribution vary with geographical area. A large proportion of the infected Pakistani population carries HCV 3a, and the number of patients enrolled in hospitals is increasing day by day.
There is therefore a current need to develop a vaccine against HCV, and in particular against HCV 3a, that will cover as much of the Pakistani population as possible. Current vaccine approaches include DNA vaccines, peptide vaccines and epitopic vaccines. Epitopes are the small antigenic segments of viral proteins that are recognized by the host immune system. Epitopic vaccines provide a more potent and controlled immune response and eliminate the potentially lethal effects of using whole viral proteins [28]. Promiscuous epitopes (epitopes capable of binding a large number of HLA alleles) may overcome the problem of population coverage. In addition, conserved epitopes reduce the antigen escape associated with viral mutation [29]. The present study was therefore designed to predict promiscuous epitopes and to analyze their conservancy in the general population. Any mutation in a peptide/epitope lowers its conservancy, so the pI value of each mutated amino acid residue was also analyzed: if it remains in the same range as in the original epitope, the epitope is more likely to be useful for epitopic vaccine design, offering effective control over viral mutation and an immune response with minimal side effects.

Methods

Sequence Retrieval and Analysis

The fully sequenced genome and protein sequences of HCV 3a of Pakistani origin were retrieved from NCBI [GU294484]. The numbers of individual bases in the genome (adenine, cytosine, guanine and thymine) were calculated using the DDBJ database. The molecular weight of each protein and the percentages of its most and least repeated amino acids were calculated using the sequence search and analysis tool at the PIR database (http://pir.georgetown.edu/).

Epitope Prediction

Promiscuous epitopes of HCV 3a viral proteins were predicted for HLA I and HLA II binding alleles using the freely available immunoinformatics tools ProPred1 and ProPred, respectively. Compared with other epitope prediction tools, ProPred1 and ProPred cover the largest number of human leukocyte antigen (HLA) alleles and have been used for epitope prediction in HBV and tuberculosis. ProPred1 predicts antigenic epitopes for 47 MHC I alleles and ProPred predicts epitopes for 51 MHC II alleles. Predictions with these tools can be carried out at thresholds from 1 to 10%. The algorithms underlying these tools are based on linear coefficients of matrices. Most of the matrices were retrieved from BIMAS, where the score of each peptide is calculated by multiplication and/or summation. For example, the score of the peptide "PACDPGRAA" can be calculated using either of the following equations:

Score = P(1) × A(2) × C(3) × D(4) × P(5) × G(6) × R(7) × A(8) × A(9)

Score = P(1) + A(2) + C(3) + D(4) + P(5) + G(6) + R(7) + A(8) + A(9)

where P(1) is the score of P at position 1. Only the promiscuous epitopes with a score higher than the chosen threshold score were assigned as predicted epitopes for the selected HLA alleles [30]. For the present study the default threshold of 4% was used, at which sensitivity and specificity are nearly the same for most of the HLA alleles available in the ProPred1 and ProPred servers. Moreover, MHC I epitopes were predicted with the proteasome and immunoproteasome filters switched on at a 5% threshold, because MHC binders with a proteasomal cleavage site at the C-terminus have a higher likelihood of being T-cell epitopes [31]. The predicted promiscuous epitopes were listed in the tables in decreasing order of their score.
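The multiplicative and additive scoring schemes described above can be sketched in a few lines of Python. This is a minimal illustration, not the ProPred or BIMAS implementation: the function name `score_peptide`, the toy coefficients in `toy_matrix`, and the neutral default used for residues missing from the toy matrix are all assumptions made for the example.

```python
from math import prod

def score_peptide(peptide, matrix, mode="product"):
    """Score a peptide with a position-specific coefficient matrix.

    `matrix` maps (residue, 1-based position) to a coefficient. Pairs that are
    missing from the matrix fall back to a neutral value (1.0 for the product,
    0.0 for the sum); this fallback is an assumption of the sketch.
    """
    if mode not in ("product", "sum"):
        raise ValueError("mode must be 'product' or 'sum'")
    neutral = 1.0 if mode == "product" else 0.0
    coeffs = [matrix.get((res, pos), neutral)
              for pos, res in enumerate(peptide, start=1)]
    return prod(coeffs) if mode == "product" else sum(coeffs)

# Hypothetical coefficients, invented for illustration only.
toy_matrix = {("P", 1): 2.0, ("A", 2): 0.5, ("R", 7): 3.0}

print(score_peptide("PACDPGRAA", toy_matrix, mode="product"))  # 3.0
print(score_peptide("PACDPGRAA", toy_matrix, mode="sum"))      # 5.5
```

In a real prediction the score would then be compared with the allele-specific threshold score, which is not reproduced here.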

Epitope Conservancy Analysis

All the predicted epitopes of HCV 3a proteins of Pakistani origin were subjected to worldwide conservancy analysis among HCV genotypes 1, 2 and 3. For each HCV protein used for epitope prediction, five sequences were retrieved randomly from NCBI. The predicted epitopes of HCV 3a (Pakistani origin), together with the five selected sequences of an individual genotype (genotype 1, 2 or 3, one at a time), were submitted to the epitope conservancy analysis tool available at the IEDB database (http://tools.immuneepitope.org/tools/conservancy/iedb_input). All epitopes with 77-100% conservancy were selected, while epitopes with variation at the anchor residues were rejected. The anchor residues in the predicted epitopes are highlighted in bold. Epitopes that were 100% conserved in the selected proteins of all three genotypes are shown fully in bold. Epitopes with 88% or 77% conservancy carry a single or double amino acid variation, respectively, and the varied residues are shown in bold in the conservancy column of each genotype. An asterisk (*) indicates that one of the five selected sequences either does not respond to epitope conservancy or has a conservancy lower than 77%. A double asterisk (**) indicates that only one sequence shows 77-100% conservancy for the selected epitope.
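The conservancy levels quoted above follow from simple residue identity: for a nine-residue epitope, one substitution gives 8/9 ≈ 88.9% and two give 7/9 ≈ 77.8%, which the text reports as 88% and 77%. The sketch below illustrates that idea only; it is not the algorithm of the IEDB tool. The function names are hypothetical, and `toy_fragment` is a short string made up for the example around a variant listed in Table 2.

```python
def identity(epitope, window):
    """Fraction of positions at which two equal-length peptides agree."""
    if len(epitope) != len(window):
        raise ValueError("peptides must have equal length")
    return sum(a == b for a, b in zip(epitope, window)) / len(epitope)

def best_match(epitope, protein):
    """Return (identity, window) for the best same-length window of `protein`."""
    k = len(epitope)
    windows = [protein[i:i + k] for i in range(len(protein) - k + 1)]
    if not windows:  # protein shorter than the epitope
        return 0.0, ""
    best = max(windows, key=lambda w: identity(epitope, w))
    return identity(epitope, best), best

epitope = "YVLPRRGPR"             # predicted epitope (Table 2, capsid)
toy_fragment = "GGVYLLPRRGPRLGV"  # hypothetical sequence fragment
frac, window = best_match(epitope, toy_fragment)
mismatches = round((1 - frac) * len(epitope))
print(window, f"{100 * frac:.1f}%", mismatches)  # YLLPRRGPR 88.9% 1
```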

Validation of varied amino acids using pI value

Peptides with single or double amino acid variation were analyzed for their hydropathic characteristics or pI value [32]. The pI indicates whether a varied residue remained within the amino acid group of the original residue or diverted from it, and thus informs whether the peptide should be used or rejected. Varied residues with a diverted group (a considerable change of pI value) are marked with the superscript "D" for a single variation and "DD" for doubly varied peptides in which both residues diverted. In doubly varied peptides, the superscript "D" represents partial variation, i.e. one of the varied residues retained the amino acid group while the other shifted group with a considerable change of pI value.
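The "D"/"DD" flagging can be illustrated with a short Python sketch. The text does not state the exact criterion, so the rule used here is an assumption: a substitution counts as diverted if the residue changes its polarity/charge group or if its pI shifts by more than a hypothetical tolerance of 0.5. The pI values are approximate textbook values for the free amino acids (sources differ slightly), and all names are illustrative.

```python
# Approximate textbook isoelectric points of the free amino acids.
PI = {"A": 6.00, "R": 10.76, "N": 5.41, "D": 2.77, "C": 5.07, "E": 3.22,
      "Q": 5.65, "G": 5.97, "H": 7.59, "I": 6.02, "L": 5.98, "K": 9.74,
      "M": 5.74, "F": 5.48, "P": 6.30, "S": 5.68, "T": 5.60, "W": 5.89,
      "Y": 5.66, "V": 5.96}

# One common grouping; the assignment of G, C and Y varies between textbooks.
GROUPS = {"neutral nonpolar": "AVLIMFWPG", "neutral polar": "STNQCY",
          "acidic polar": "DE", "basic polar": "KRH"}
GROUP_OF = {aa: name for name, members in GROUPS.items() for aa in members}

def is_diverted(original, variant, tol=0.5):
    """Assumed rule: group change, or pI shift larger than `tol`."""
    return (GROUP_OF[original] != GROUP_OF[variant]
            or abs(PI[original] - PI[variant]) > tol)

def divergence_flag(epitope, variant, tol=0.5):
    """Return '', 'D' or 'DD' for a variant with at most two substitutions."""
    if len(epitope) != len(variant):
        raise ValueError("peptides must have equal length")
    changed = [(a, b) for a, b in zip(epitope, variant) if a != b]
    if len(changed) > 2:
        raise ValueError("more than two substitutions")
    diverted = sum(is_diverted(a, b, tol) for a, b in changed)
    if diverted == 0:
        return ""
    return "DD" if diverted == 2 else "D"

print(repr(divergence_flag("YVLPRRGPR", "YLLPRRGPR")))  # '' (V to L retained)
print(repr(divergence_flag("LGVRATRKA", "LGVRATRKT")))  # 'D' (A to T diverted)
print(repr(divergence_flag("VVIVGHIEL", "VVIVGRIIL")))  # 'DD'
```

With this assumed rule the three examples reproduce the flags shown for the same peptides in Table 2, but other rows of the tables may have been judged differently.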

Results

The HCV 3a genome of Pakistani origin comprises 9474 bp, including 2622 guanine and 2700 cytosine bases. The GC content is 12.35 percentage points higher than the AT content. The genome encodes a polyprotein that is subsequently cleaved into structural and non-structural proteins of distinct molecular weight. The envelope protein E2 has the highest molecular weight, 38755.3 Da (Table 1). Leucine (L), a neutral nonpolar amino acid residue, is the most repeated residue in E2 (13.1%), and the basic polar residue lysine (K) is the least repeated (1.4%). The shortest viral protein is NS4a (molecular weight 5751.69 Da), comprising 54 amino acid residues. In NS4a, leucine (L) and valine (V) are the most repeated residues (14.8%), and histidine (H), methionine (M), threonine (T) and tryptophan (W) are the least repeated (1.9%). The molecular weights of the other viral proteins and the percent repetition of their amino acid residues are listed in Table 1. The percentage of amino acid residues gives an indication of their pI value and of their probability of occurring in the antigenic epitopes.
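The base composition reported here can be checked with a few lines of arithmetic. The sketch below uses the four base counts listed in Table 1; the variable names are illustrative.

```python
# Base counts of the HCV 3a genome as listed in Table 1.
base_counts = {"A": 1974, "C": 2700, "G": 2622, "T": 2178}

total = sum(base_counts.values())
gc_percent = 100 * (base_counts["G"] + base_counts["C"]) / total
at_percent = 100 * (base_counts["A"] + base_counts["T"]) / total

print(total)                               # 9474
print(round(gc_percent, 2))                # 56.17
print(round(at_percent, 2))                # 43.83
print(round(gc_percent - at_percent, 2))   # 12.35
```

The difference of 12.35 is therefore a difference in percentage points between the GC and AT fractions of the genome.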
Table 1

Data on HCV genome size, proteins, molecular weight and percentage of the most and least repeated amino acid residues in the individual proteins

Bases | No. | Proteins | aa Number | Mol. Wt. | Highly repeated aa | % of repetition | Least repeated aa | % of repetition
Total bp | 9474 | Capsid | 114 | 12985.8 | R | 18.4 | C/F | 0.9
A | 1974 | Core | 75 | 7638.88 | L | 16 | E/K/M/Y | 1.3
C | 2700 | E1 | 190 | 20643.9 | V | 11.1 | E | 1.1
G | 2622 | E2 | 350 | 38755.3 | L | 13.1 | K | 1.4
T | 2178 | NS3 | 149 | 15423.6 | A/G | 11.4 | N | 0.7
- | - | NS4a | 54 | 5751.69 | L/V | 14.8 | H/M/T/W | 1.9
- | - | NS4b | 194 | 20167.5 | A | 13.4 | C | 0.4
- | - | NS5a-1a | 62 | 6700.72 | G | 14.5 | E/D | 1.6
- | - | NS5a-1b | 101 | 11224.6 | P | 11.9 | K/W | 1
F (phenylalanine), I (isoleucine), L (leucine), M (methionine), V (valine), W (tryptophan) and Y (tyrosine) were the main anchor residues of the predicted MHC II epitopes and are nonpolar in nature. A total of 150 epitopes were predicted against 51 MHC II alleles (Table 2). The highest number of epitopes was contributed by the E2 protein, which accounted for 20% of all predicted MHC II epitopes. VFLLNPCGL, FVILVFLLL, WHINSTVLH, FNLLDVPKA, LELINTHGS and VQYLYGVGS are promiscuous binders of 45-50 MHC II alleles. E2 is followed by the NS2 and NS4b proteins, each representing 14.66% of the predicted MHC II epitopes. For NS2, VRAHVLVRL, VILLTSLLY and VRLCMFVRS are the best binders in terms of both score and HLA allele coverage (50-51 MHC II alleles). FFNILGGWV, VNLLPAILS and VVNLLPAIL are the best binders of the NS4b protein in terms of both HLA coverage (41 alleles for the first epitope and 51 for the other two) and binding efficiency. LVVGVICAA, FNILGGWVA, WQKLEAFWH, IQYLAGLST and VVGVICAAL are also epitopes of good quality, covering 31 to 35 of the HLA alleles available in ProPred. For NS5a_1a only three epitopes (MRLAGPRTC, FISCQKGYK and VVSTRCPCG) were predicted as promiscuous binders with a binding score higher than the selected threshold. Of these three epitopes, MRLAGPRTC is capable of binding all the HLA alleles available in the ProPred server, while FISCQKGYK and VVSTRCPCG bind 22 and 25 HLA alleles, respectively. The predicted promiscuous binders of the other proteins are also summarized in Table 2.
Table 2

Predicted HLA II epitopes of HCV proteins of Pakistani origin and their conservancy in genotypes 1, 2 and 3 worldwide

Epitope start position | Predicted T-cell epitopes | HLA alleles | HCV genotype 1 | HCV genotype 2 | HCV genotype 3
Capsid

43LGVRATRKA23LGVRATRKTDLGVRATRKTLGVRATRKTD

36LPRRGPRLG15LPRRGPRLGLPRRGPRLGLPRRGPRLG

106WGPNDPRRR16WGPTDPRRRWGPTDPRHRDWGPNDPRRR

34YVLPRRGPR24YLLPRRGPRYLLPRRGPRYVLPRRGPR

21VKFPGGGQI8VKFPGGGQI *VKFPGGGQIVKFPGGGQI

35VLPRRGPRL9VLPRRGPRL

45VRATRKASE25VRATRKTSEDVRATRKTSEDVRATRKTSED

30VGGVYVLPR39VGGVYLLPR *VGGVYLLPRVGGVYVLPR

15IRRPQDVKF6IRRPQDVKF

95WLLSPRGSR28WLLSPRGSRWLLSPRGSRWLLSPRGSR

29IVGGVYVLP3IVGGVYLLP *IVGGVYLLPIVGGVYVLP

82WPLYGNEGC10WPLYGNEGCWPLYGNEGCWPLYGNEGC

85YGNEGCGWA11YGNEGCGWAYGNEGCGWA *YGNEGCGWA

33VYVLPRRGP1VYLLPRRGPVYLLPRRGPVYVLPRRGP

Core

61FLLALLSCL50FLLALLSCLFLLALLSCIFLLALLSCL

64LALLSCLIH45LALLSCLTVDDLALLSCLIH *

15FADLMGYIP41FADLMGYIPFADLMGYIPFADLMGYIP

24LVGAPVGGV44LVGAPLGGALVGAPVGGV *

63LLALLSCLI36LLALLSCLTDLLALLSCITDLLALLSCLI *

62FLLALLSCL24FLLALLSCLFLLALLSCIFLLALLSCL *

32VARALAHGV10VARALAHGVVARALAHGV

21YIPLVGAPV28YIPLVGAPLYIPVVGAPLYIPLVGAPV

19MGYIPLVGA26MGYIPLVGAMGYIPVVGAMGYIPLVGA

E1

58YVGATTASI41YVGATTASI *

140MVVAHILRL39MVVAHILRL*

2WRNTSGLYV27WRNTSGLYV

138VGMVVAHIL28

56VKYVGATTA21VRYVGATTAD *

9YVLTNARSN31YVLTNDCSNDD

161WGVLAGLAY15WGVLAGMAYWGVVFGLAYWGILAGLAY

93FLVGQAFTF11FLVGQLFTFFLVGQAFTF

181IIMVMFSGV91IIMVMFSGV

130MMMNWSPAV35MMMNWSPTADMMMNWSPAM

134WSPAVGMVV6WSPAMGMVV *

132MNWSPAVGM14MNWSPAMGM *

169YYTMQGNWA18YYSMQGNWA

47WTPMTPTVA21WTPVTPTVA *

172MQGNWAKVA25MVGNWAKVLDMQGAWAKVIDWTPVTPTVAD *

145ILRLPQTLF19ILRLPQTLF

E2

122MLPHHRPVV3

151VFLLNPCGL48

337WEFVILVFL4WEFIVLVFL

339FVILVFLLL46FIVLVFLLL

35WHINSTVLH41

342LVFLLLADALLFLLLADALLFLLLADALVFLLLADA

100VLLAYAPRP50

198FRPLLPHRL47

218VRLGALVDT12

62FNLLDVPKA45

26LELINTHGS46

57FYYHKFNLL12FYYHKFNSSDFYYHKFNSTDD

83VGPLDRCQH26

58YYHKFNLLD24

286LLHSTTELA17LLHSTTEWALLHSTTELA

129VVVGTTDPK14VVVGTTDKLDDVVVGTTDRLDD *VVVGTTDAK

320VQYLYGVGS46VQYLYGVGSVQYLYGVGS

159LLVVGGLGG14

293LAILPCSFT7LAILPCSFT

335LKWEFVILV4LKWEFIVLV

322YLYGVGSGM5YLYGVGSSIDYLYGVGSGM

300FTPMPALST17FTPMPALST

245FYTVQGEDV4

18IVRGPEQRL26

100VLLAYAPRP4

257VWHRFTAAC19VEHRLTAACD *

206LLQETSRGH8

1YITGGTAAR8

267WTRGERCDI10WTRGERCEI

310IHLHQNIVD11IHLHQNIVD *IHLHQNIVD

NS2

101VRAHVLVRL51VRAHVLVRL

62VILLTSLLY50VILLTSLLY *

73LVFDIAKLL24LVFDITKLLD *LVFDITKLLD *LIFDITKLLD

153LKDLAVATE7LKDLAVATE *

113FVRSVTGGK37

130VGRWFNTYL11VGRWFNTYL *

123FQMAILSVG31FQMIILHVGD

137YLYDHLAPM21YLYDHLAPM

74VFDIAKLLIA23VFDITKLLLA D *VFDITKLLLAD

107LVRLCMFVR36LVRLCMLVR

108VRLCMFVRS51VRLCMLVRS *

89YFVRAHVLV33YFVRAHVLV

11ILVLFGFFT15

37YAICRCESA18IINGLPVSADYTICRCESAD *

33WWNQYAICR8WWNQYTICRD

185ILCGLPVSA10IINGLPVSA *IINGLPVSAILCGLPVSA

145MQHWAAAGL18MQHWAAAGL

50VPPLLARGS21VPSLLARGSD *

88LYLIQAAIT35LYLIQTAITD *

158VATEPVIFS14VAVEPVVFSDVAVEPVVFSVATEPVIFS

37YAICRCESA19YTICRCESAD *

175WGADTAACG11WGADTAACG *WGADTAACG *WGADTAACG

NS3

4VQVLSTATQ46VQIVSTATQVQVLSSVTQD

43LQMYTNVDQ42

129VCTRGVAKA21VCTRGVAKAVCARGVAKSDD *

24WTVYHGAGS13WTVYHGAGTWTVYHGAGN

84VIPARRRGD18VIPVRRRGD *

138LQFIPVETL45

140FIPVETLST43FIPVENLGTD

6VLSTATQTF19IVSTATQTF

53LVGWPAPPG29LVGWPAPQGLVGWPSPPGD

27YHGAGSRTL22YHGAGTRTI

14FLGTTLGGV10

77LVTREADVI25LVTRHADVI D *LVTRNADVID *

98LSPRPLACL12LSPRPLSTLD

124IFRAAVCTR44

NS4a

23VVIVGHIEL43VVIVGRIILDDVVIVGRIVLDDVVIVGHIEL

3WVLLGGVLAA43WVLVGGVLAAWVLVGGVLAAWVLLGGVLAA

4VLLGGVLAAL40VLVGGVLAALVLVGGVLAALVLLGGVLAAL

38VPDKEVLYQ11VPDKEVLYQ *

24VIVGHIELG8VIVGHIELG

10LAALAAYCLS8LAALAAYCLT *LAALAAYCLSLAALAAYCLS

16YCLSVGCVV6YCLSTGCVVDYCLSVGCVV

26VGHIELGGK9VGHIELGGK

25IVGHIELGG29IVGHIELGG

20VGCVVIVGH15VGCVVIVGH

9VLAALAAYC9VLAALAAYC*VLAALAAYCVLAALAAYC

29IELGGKPAL14IELGGKPAL

NS4b

81FFNILGGWV41FFNILGGWV

153VNLLPAILS51VNLLPAILSVNLLPAILSVNLLPAILS

152VVNLLPAIL51

39WNFVSGIQY16WNFISGIQYWNFISGIQYWNFVSGIQY

165LVVGVICAA35LVVGVVCAALVVGVVCAALVVGVICAA

82FNILGGWVA32FNILGGWVAFNILGGWVAFNILGGWVA

81FFNILGGWV5FFNILGGWV

63LMAFAASVT9LMAFTASITDLMAFTAAVTDDLMAFTASVTD

27WQKLEAFWH35WQKLEVFWADWQKLEAFWH *

167VGVICAALL11VGVVCAAILVGVVCAAILVGVICAAIL

45IQYLAGLST35IQYLAGLSTIQYLAGLSTIQYLAGLST

64MAFAASVTS23MAFTASITSMAFTAAVTSDDMAFTASVTSD

84ILGGWVATH24ILGGWVAAQDDILGGWVAAQDDILGGWVATH

103VVSGLAGAA10VGAGLAGAADVVSGLAGAA

166VVGVICAAL31VVGVVCAAIVVGVVCAAIVGVICAAIL

85LGGWVATHL3LGGWVAAQLDDLGGWVAAQLDDLGGWVATHL

60VASLMAFAA15VASLMAFTAD

41FVSGIQYLA8FISGIQYLAFISGIQYLAFVSGIQYLA

139FKIMGGELP21FKIMSGEVPDFKIMGGEFP *

9LQRATQQQA14LQRATQQQA *

122LDILAGYGA6LDILAGYGA *

104VSGLAGAAI3VSGLAGAAI

NS5a_1a

39MRLAGPRTC51MRIVGPRTC *FISCQKGYRD *MRLAGPRTC*

3FISCQKGYK22FFSCQRGYKDD *FISCQKGYK *

19VVSTRCPCG25VMSTRCPCG *

NS5a_1b

73LLRDEITFV20LLRDEVTFQD*LLRDEVTFQ D **LLRDEITFV *

16WRVAANSYV33WRVAAEEYVDD *WRVAASEYVDWRVAANSYV

55FTEVDGVRL4FTELDGVRL*FTEVDGVRL **FTEVDGVRL

80FVVGLNSYA25FVVGLNSYA *

32FHYITGATE16FHYITGATE

61VRLHRYAPP27VRLHRYAPA*VRLHRYAPP *

87YAIGSQLPC20YVVGSQLPC *VRLHRYAPAD **YAIGSQLPC *

23YVEVRRVGD14YVEVTRVGDD *YVEVTRVGDD **YVEVRRVGD

Bold amino acid residues in the T-cell epitope column indicate the anchor residues

Bold individual amino acid residues in the HCV genotype 1, 2 and 3 columns indicate variation in the peptide in comparison to the predicted epitope

* Indicates that one of the protein sequences selected for epitope conservancy either does not respond or has conservancy lower than 70%

** Indicates that only one of the selected protein sequences responds to epitope conservancy

D Indicates that, in case of single/double variation, an amino acid residue diverted from its group compared to the primary epitope, as judged by pI value

DD Indicates that, in case of double variation, both amino acid residues diverted from their group compared to the primary epitope, as judged by pI value

A total of 69 epitopes were predicted as promiscuous epitopes for MHC I alleles. The anchor residues of the MHC I epitopes vary considerably, both in the residues themselves and in their nature. Most anchor residues are neutral nonpolar or neutral polar; however, a small percentage are acidic polar or basic polar. The highest number of MHC I binding epitopes was contributed by the NS4b protein, which accounted for 26% of all predicted MHC I epitopes. The NS4b epitope NFVSGIQYL is the best promiscuous binder, with the highest binding score. NS4b is followed by NS2 (20.28% of the epitopes) and by E2 and NS3 (11.59% each). For NS2, 14 promiscuous epitopes were predicted with varying binding efficiency. GSRDGVILL, DGVILLTSL, WAAAGLKDL and LQVWVPPLL are good binders in terms of both score and HLA allele coverage (21, 28, 27 and 28 alleles, respectively). The predicted E2 epitopes cover 20 to 28 HLA alleles, except for the PLLHSTTEL epitope, which covers only 11 HLA alleles but has the highest binding efficiency.
NS3 epitopes cover 8 to 25 HLA alleles and were also ranked by their binding efficiency as predicted by the score. NS5a_1a was the least represented protein, with only one epitope (HVKNGSMRL) predicted as a promiscuous binder, for 16 MHC I alleles. The promiscuous binders of MHC I for the other proteins are summarized in Table 3.
Table 3

Predicted HLA I epitopes of HCV proteins of Pakistani origin and their conservancy in genotypes 1, 2 and 3 worldwide

Epitope start position | Predicted T-cell epitopes | HLA alleles | HCV genotype 1 | HCV genotype 2 | HCV genotype 3
Capsid

38RRGPRLGVR9RRGPRLGVRRRGPRLGVRRRGPRLGVR

35VLPRRGPRL25VLPRRGPRL

Core

55PGCSFSIFL8PGCSFSIFLPGCSFSIFLPGCSFSIFL *

41RALEDGINF20RVLEDGVNF **RALEDGINF *

7VIDTLTCGF15VIDTLTCGFVIDTITCGF *VIDTLTCGF *

35ALAHGVRAL24ALAHGVRVLALAHGVRVLALAHGVRAL *

24LVGAPVGGV18LVGAPLGGA *LVGAPVGGV *

26GAPVGGVAR9GAPLGGAAR *GAPLGGVARGAPVGGVAR *

E1

135SPAVGMVVA14SPAMGMVVA *

86GDVCGAVFL19GDLCGSVFLDGDVCGAVMIGDMCGAVFL *

144HILRLPQTL22HILRLPQTL *

156IAGAHWGVL27IAGAHWGVLIAGAHWGIL

64ASIRGHVDL25ASIRSHVDLD

E2

285PLLHSTTEL11PLLHSTTEWDPLLHSTTEL

305ALSTGLIHL25ALTTGLIHLALSTGLIHL

227CSFTPMPAL20CSFTTLPALD *CSFTPMPAL

71QQLQAHHFL27

157CGLLVVGGL28

212RGHIQPVRL24

6TAARGGQGL25

157CGLLVVGGL28

NS2

172VITWGADTA6VITWGADTA *

75FDIAKLLIA12FDITKLLLADFDITKLLLAD *FDITKLLIAD

70YPSLVFDIA15YPSLIFDITDD

57GSRDGVILL21GGRDAVILLD **GGRDAVILLD *GSRDGVILL

60DGVILLTSL26DGVILLTSL *

148WAAAGLKDL27WAASGLRDLDD**WAAAGLKDL *

50VPPLLARGS11VPPLLARGS *

46LQVWVPPLL28LHVWVPPLNDDLHVWVPPLNDD *LQVWVPPLL *

117VTGGKYFQM16VVGGKYFQMD *

65LTSLLYPSL23LTSLLYPSL

6TLGAGILVL48TLGAGVLVL *

73LVFDIAKLL31LVFDITKLLDLVFDITKLLD *LIFDITKLLD

145MQHWAAAGL26MQHWAAAGL

178DTAACGDIL21DTAACGDIIDTAACGDIID *DTAACGDIL

NS3

119GHVAGIFRA8GHAVGIFRA *GHVVGLFRA *

27YHGAGSRTL14YHGAGTRTIYHGAGNKTLD

128AVCTRGVAK8AVCTRGVAKAVCTRGVAK *

57PAPPGAKSL11PAPQGARSLDD *PSPPGTKSLDD

98LSPRPLACL25LSPRPLSTLD

95ASLLSPRPL24

130CTRGVAKAL20CTRGVAKAV

7LSTATQTFL24

NS4a

3WVLLGGVLA11WVLVGGVLAWVLVGGVLAWVLLGGVLA

23VVIVGHIEL21VVIVGRIILDDVVIVGRIVLDDVVIVGHIEL

10LAALAAYCL24LAALAAYCL *LAALAAYCLLAALAAYCL

5LLGGVLAAL27LVGGVLAALLVGGVLAALLLGGVLAAL

NS4b

96PQSSSAFVV6PQSSSAFVV

40NFVSGIQYL30NFISGIQYLNFISGIQYLNFVSGIQYL

81FFNILGGWV13FFNILGGWV

46QYLAGLSTL17QYLAGLSTLQYLAGLSTLQYLAGLSTL

102FVVSGLAGA9FVGAGLAGADFVVSGLAGA

54LPGNPAVAS14LPGNPAIASLPGNPAIASLPGNPAVAS

161SPGALVVGV14SPGALVVGVSPGALVVGVSPGALVVGV

141IMGGELPNA7IMGGEFPTAD *

164ALVVGVICA11ALVVGVVCAALVVGVVCAALVVGVICA

117LGRVLLDIL22LGKVLVDILD*LGKVLVDILDLGKVLLDILD *

59AVASLMAFA9AIASLMAFTDAIASLMAFTDAVASLMAFTD

152VVNLLPAIL15

113GIGLGRVLL24GIGLGKVLLD *

56GNPAVASLM12GNPAIASLMGNPAIASLMGNPAVASLM

52STLPGNPAV21STLPGNPAISTLPGNPAVSTLPGNPAV

85LGGWVATHL26LGGWVAAQLDDLGGWVAAQLDDLGGWVATHL

145ELPNAEDVV11

99SSAFVVSGL22SSAFVVSGL

NS5a_1a

33HVKNGSMRL16HVKNGSMRI *HVKNGSMRI **HVKNGSMRL

NS5a_1b

49VPAAEFFTE6VPAPEFFTE *VPAPEFFTE **VPAAEFFTE

79TFVVGLNSY10TFQVGLNQYD *TFTVGLNSFD *TFTVGLNSYD *

76DEITFVVGL19DEVTFQVGLD *DEVTFTVGLD*DEITFMVGL *

Bold amino acid residues in the T-cell epitope column indicate the anchor residues

Bold individual amino acid residues in the HCV genotype 1, 2 and 3 columns indicate variation in the peptide in comparison to the predicted epitope

* Indicates that one of the protein sequences selected for epitope conservancy either does not respond or has conservancy lower than 70%

** Indicates that only one of the selected protein sequences responds to epitope conservancy

D Indicates that, in case of single/double variation, an amino acid residue diverted from its group compared to the primary epitope, as judged by pI value

DD Indicates that, in case of double variation, both amino acid residues diverted from their group compared to the primary epitope, as judged by pI value

Of the 150 predicted MHC II epitopes, 75.33% were 77-100% conserved in genotype 3 (Table 2) against the randomly selected viral proteins. Of these conserved peptides, 71.68% were 100% conserved, while 22.12% had a single residue variation (88% epitope conservancy). Of the singly varied peptides, only 40% diverted from their amino acid group and pI value, while 60% retained the amino acid group of the predicted HCV 3a epitope. A further 6.19% of the peptides showed 77% epitope conservancy because of double residue variation in the peptides of the general population compared with the predicted epitopes of HCV 3a of Pakistani origin. Of these doubly varied peptides, 42.85% retained their amino acid group and nearly the same pI value as the predicted epitope, 28.57% showed partial group diversion, and 28.57% diverted their amino acid group because of considerable variation in the pI value.
Similar data were obtained for HCV genotypes 1 and 2, with 47.33% and 40.66% conservancy, respectively. However, in contrast to genotype 3, only 23.94% of the predicted epitopes were 100% conserved in the randomly selected sequences of genotype 1 and 22.95% in genotype 2. Their rates of single/double residue variation were also determined and are shown in Figure 1.
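The nested percentages reported for the MHC II epitopes in genotype 3 can be reproduced from whole-number counts. The counts below (113, 81, 25 and 7) are not stated in the text; they are inferred here from the reported percentages and should be read as an assumption of this sketch.

```python
def pct(part, whole):
    """Percentage of `whole` represented by `part`, rounded to two decimals."""
    return round(100 * part / whole, 2)

total_predicted = 150   # reported number of predicted MHC II epitopes
conserved = 113         # inferred: 75.33% of 150
breakdown = {           # inferred split of the conserved peptides
    "identical (100%)": 81,
    "single variation (88%)": 25,
    "double variation (77%)": 7,
}

assert sum(breakdown.values()) == conserved
print(pct(conserved, total_predicted))        # 75.33
for label, count in breakdown.items():
    print(label, count, pct(count, conserved))
```

The three shares come out as 71.68%, 22.12% and 6.19%, matching the figures given for genotype 3.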
Figure 1

Comparative analysis of the HCV 3a epitopes predicted against MHC II alleles and their conservancy in genotypes 1, 2 and 3 worldwide.

Of the 69 predicted MHC I epitopes, 78.26% were 77-100% conserved in genotype 3 (Table 3) against the randomly selected viral proteins. Of these conserved peptides, 72.22% were 100% conserved, while 22.22% had a single residue variation (88% epitope conservancy). Of the singly varied peptides, 40.66% retained the amino acid group of the predicted HCV 3a epitope, while 58.33% diverted from their amino acid group and pI value. A further 5.5% of the peptides showed 77% epitope conservancy because of double residue variation in the peptides of the general population compared with the predicted epitopes of HCV 3a of Pakistani origin. Of these doubly varied peptides, 66.66% showed partial group diversion and 33.33% diverted their amino acid group because of considerable variation in the pI value. Similar data were obtained for HCV genotypes 1 and 2, with 55.07% conservancy. However, in contrast to genotype 3, only 21.05% of the predicted epitopes were 100% conserved in the randomly selected sequences of genotypes 1 and 2. Their rates of single/double residue variation were also determined and are shown in Figure 2.
Figure 2

A comparative analysis of HCV 3a epitopes predicted against MHC I alleles and their conservancy in genotypes 1, 2 and 3 worldwide.

Discussion

A modern approach to controlling HCV infection is a vaccine preparation that can specifically induce antibody-mediated immunity. Rapid advances in computational methods and immunoinformatics provide new strategies for designing antigen-specific epitope-based vaccines against infectious agents such as viruses and other pathogens. Epitope-based vaccines against HIV, malaria and tuberculosis have given promising results and support the protective and therapeutic use of such vaccines [33]. In the present study, a systematic immunoinformatics approach was therefore applied to predict antigenic epitopes of HCV 3a proteins of Pakistani origin, followed by analysis of their diversity and conservancy in genotypes 1, 2 and 3 using randomly selected HCV sequences from NCBI, mainly from Thailand, Cuba, UK, USA, China, Japan, France, Italy and Germany. The immunogenic epitopes identified were nonamers and could be used diagnostically to detect HCV-specific CTL responses in patients and after vaccination. A CTL-based HCV vaccine might not be efficient enough to prevent infection, but it might protect the body from the disease. The analysis showed that the minimal number of epitopes required to represent the complete antigenicity of the whole proteins is significantly smaller than that required to represent the full-length proteins. The majority of the epitopes reported here had intermediate to high HLA binding affinity. With an efficient CTL-based epitope delivery technology, the predicted epitopes could eventually become vaccines on their own or fused as polytopes. Designing an HCV vaccine around conserved epitopes can help counter viral mutation and may thus give more effective results. The study shows that the predicted epitopes were highly conserved in HCV genotype 3 and less conserved in genotypes 1 and 2, for both MHC I and MHC II.
Moreover, to ensure viral detection at all stages of its intracellular evolution, we used all the viral proteins. The total number of predicted epitopes was therefore maximized in proportion to the number of proteins covered by the analysis.

Abbreviations

HCV: hepatitis C virus; HLA: human leukocyte antigen; MHC: major histocompatibility complex; CTL: cytotoxic T lymphocyte.

Competing interests

The authors declare that they have no competing interests.

Authors' contributions

AS and SH designed the study. AS performed the immunoinformatics analysis and drafted the manuscript. MI critically reviewed the manuscript. All authors have read and approved the final manuscript.