
The possible regions to design Human Papilloma Viruses vaccine in Iranian L1 protein.

Behzad Dehghani¹, Zahra Hasanshahi¹, Tayebeh Hashempour¹, Mohamad Motamedifar^1,2.

Abstract

The Human Papilloma Virus (HPV) genome encodes several proteins, including L1, the major capsid protein, and L2, the minor capsid protein. Among all HPV types, HPV-16 and HPV-18 are the most common high-risk HPV (HR-HPV) types globally, and the majority of cases are infected with these types. HPV entry and the initial interaction with the host cell are mainly mediated by the L1 protein, which is the main component of HPV vaccines. The aim of this research was a comparative analysis of all Iranian L1 protein sequences submitted to NCBI GenBank to find the major substitutions as well as the structural and immune properties of this protein. All HPV L1 protein sequences from Iranian isolates from 2014 to 2016 were selected and obtained from the NCBI data bank. "CLC Genomics Workbench" was used to translate and align the sequences. To predict B cell epitopes, we employed several programs. Modification sites such as phosphorylation, glycosylation, and disulfide bonds were determined. Secondary and tertiary structures of all sequences were analyzed. Several mutations were found, and the major mutations were in amino acid residues 102, 202, 207, 292, 379, and 502. These mutations had only a minor effect on the B cell epitope and physicochemical properties of the L1 protein. Six disulfide bonds, as well as several N-linked glycosylation and phosphorylation sites, were identified in the L1 protein. Five L1 loops were determined, which had great potential to be B cell epitopes with high antigenic properties. All in all, this research, as the first report from Iran, described the tremendous potential of two L1 loops (BC and FG) to induce an immune response; these loops can be used as decent candidates for designing a new vaccine against HPV in the Iranian population. In addition, some differences between the reference sequence and Iranian patients' sequences were determined. It is essential to consider these differences when monitoring the effectiveness and efficacy of the vaccine for the Iranian population. Our results provide a broad understanding of the L1 protein that can be useful for further studies on HPV infections and new vaccine generations. © Institute of Molecular Biology, Slovak Academy of Sciences 2019.

Keywords: Bioinformatics; HPV; L1

Year: 2019 PMID: 32435064 PMCID: PMC7223900 DOI: 10.2478/s11756-019-00386-w

Source DB: PubMed Journal: Biologia (Bratisl) ISSN: 0006-3088 Impact factor: 1.350

Introduction

Globally, 2,784 million women aged 15 years and older are at risk of developing cervical cancer; 569,847 women are diagnosed with cervical cancer and 311,365 die from the disease annually (Bruni et al. 2019). After breast cancer, it is the second major cancer of women’s reproductive organs related to cancer deaths (Beckmann et al. 2010; Rahimifar et al. 2010). Studies showed that more than 69.4% of these cancers are related to HPV infection, as HPV was found in pre-invasive lesions (Bruni et al. 2019). The rate of high-risk Human Papillomavirus (HR-HPV) infection among Iranian women has increased in recent years; several studies have estimated this rate among healthy women at 31% to 34% (Jamdar et al. 2018; Shafaghi et al. 2013; Yousefzadeh et al. 2014). This rate was around 38.6% in Iranian patients with cervical infections and 23% in those with breast cancer (Dadashi et al. 2017; Haghshenas et al. 2016). HPV is a non-enveloped particle and, like other Papillomaviruses (PVs), it has an icosahedral capsid about 55 nm in diameter (Doorbar et al. 2015). The HPV genome expresses six nonstructural viral regulatory proteins (E1, E2, E4, E5, E6, and E7), and the virus capsid contains two encoded proteins, the L1 major and L2 minor capsid proteins (Frazer 2009; Zheng and Baker 2006). The L1 protein is highly conserved: only 10% of the L1 sequence differs among HPV types, and it can self-assemble into virus-like particles (VLPs) (Kim 2016; Xu et al. 2016); a cell surface protein appears to be involved in VLP binding. VLP-based vaccines were introduced that offer highly effective protection against HPV infections (Buck et al. 2013). L1 has an indispensable role in HPV entry, and the initial interaction is largely attributable to L1 interactions with proteoglycans (Buck et al. 2013). HPV types are categorized as low-risk or high-risk.
Prior studies showed that about 50–60% of human anogenital carcinomas were associated with HPV-16 and around 20% with HPV-18 (Vera-Bravo et al. 2003). Likewise, several studies showed that the majority of HPV infections in Iranian patients were related to HPV-16. Two recommended HPV vaccines, Gardasil and Cervarix, contain VLPs assembled from the L1 proteins of different HPV types (Monie et al. 2008; Siddiqui and Perry 2006). Consequently, any change in HPV L1 can decrease the effectiveness and efficacy of commonly used vaccines. The aim of this study, the first report of the most prevalent substitutions of the HPV L1 protein among sequences obtained from Iranian patients, was to find mutations and their effects on the structure, physicochemical properties, and antigenic features of the protein. We also attempted to provide a comprehensive view of the L1 protein as a major candidate for designing vaccines against HPV infections.

Materials and methods

L1 protein sequences availability

All 60 L1 sequences from 2014 to 2016 (KM058636-KM058666, KP160988-KP160999, KP161000-KP161014, and KX827590) and a reference sequence (K02718.1) were obtained from the NCBI databank (http://www.ncbi.nlm.nih.gov); all sequences used in this study belonged to HPV-16. Table 1 lists the software used in this study.

Table 1

The software used in this study and related URLs

No.	Software	URL	Function
1	Signal-BLAST	http://sigpep.services.came.sbg.ac.at/signalblast.html	Signal peptide prediction
2	predisi	http://www.predisi.de/	Signal peptide prediction
3	SignalP	http://www.cbs.dtu.dk/services/SignalP/	Signal peptide prediction
4	ProtParam	http://expasy.org/tools/protparam.html	Physicochemical properties
5	DiANNA	http://clavius.bc.edu/~clotelab/DiANNA/	Disulfide bonds prediction
6	SCRATCH	http://scratch.proteomics.ics.uci.edu/	Disulfide bonds prediction
7	NetPhosK	http://www.cbs.dtu.dk/services/NetPhosK/	Phosphorylation sites prediction
8	DISPHOS	http://www.dabi.temple.edu/disphos/	Phosphorylation sites prediction
9	NetPhos	http://www.cbs.dtu.dk/services/NetPhos/	Phosphorylation sites prediction
10	NetNGlyc	www.cbs.dtu.dk/services/NetNGlyc/	N-glycosylation sites prediction
11	GlycoEP	www.imtech.res.in/raghava/glycoep/submit.html	N-glycosylation sites prediction
12	SOPMA	https://npsa-prabi.ibcp.fr/NPSA/npsa_sopma.html	Secondary structure prediction
13	Phyre2	http://www.sbg.bio.ic.ac.uk/~phyre2/	Secondary structure prediction
14	I-TASSER	https://zhanglab.ccmb.med.umich.edu/I-TASSER/	Tertiary structure prediction
15	Phyre2	http://www.sbg.bio.ic.ac.uk/~phyre2/html/page.cgi?id=index	Tertiary structure prediction
16	(PS)2-v2	http://ps2.life.nctu.edu.tw/	Tertiary structure prediction
17	Qmean	https://swissmodel.expasy.org/qmean/	Tertiary structure qualification
18	RAMPAGE	http://mordred.bioc.cam.ac.uk/~rapper/rampage.php	Ramachandran plot analysis
19	immuneepitope	https://www.iedb.org	Immuno-informatic analysis
20	BcePred	crdd.osdd.net/raghava/bcepred/	Immuno-informatic analysis
21	ABCpred	http://crdd.osdd.net/raghava/abcpred/	Immuno-informatic analysis
22	Bepipred	http://www.cbs.dtu.dk/services/BepiPred/	Immuno-informatic analysis
23	AlgPred	http://crdd.osdd.net/raghava/algpred/	Immuno-informatic analysis
24	VaxiJen	http://www.ddg-pharmfac.net/vaxijen/VaxiJen/VaxiJen.	Immuno-informatic analysis
25	IEDB	http://tools.iedb.org/bcell/	Immuno-informatic analysis

Substitutions

The CLC Genomics Workbench was used to analyze the mutations, align the translated peptides, and assess homology among the sequences. Based on their mutations, the selected sequences were categorized into 3 groups (Table 2).

Table 2

Summary of all mutations (all 60 sequences harbored the substitution at amino acid 228) and the groups of selected sequences based on the most prevalent mutations

Mutations	Frequency
Q 2 E	1(1.6%)
L 39 S	1(1.6%)
H 102 Y	40(66.6%)
T 202 N	40(66.6%)
N 207 T	1(1.6%)
V 220 I	1(1.6%)
H 228 D	60 (100%)
T 292 A	53(88.3%)
A 310 V	1(1.6%)
T 379 P	40(66.6%)
T 424 S	1(1.6%)
L 502 F	40(66.6%)
K 514 N	1(1.6%)

Groups	Mutations for each group
1	102, 202, 207, 228, 292, 379, and 502
2	228 and 292
3	228
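
The comparison behind Table 2 can be illustrated with a minimal Python sketch. It assumes the translated sequences have already been aligned to the reference (the study used CLC Genomics Workbench for this step, not the code shown here); the function names and the 6-residue toy peptides are hypothetical and are not real L1 data.

```python
from collections import Counter


def call_substitutions(reference, sample):
    """Return substitutions such as 'H102Y' for two aligned, equal-length sequences.

    Positions are 1-based alignment columns; they match residue numbers
    only when the reference row contains no gaps.
    """
    if len(reference) != len(sample):
        raise ValueError("sequences must be aligned to the same length")
    subs = []
    for pos, (ref_aa, aa) in enumerate(zip(reference, sample), start=1):
        # Skip identical residues and alignment gaps.
        if ref_aa != aa and "-" not in (ref_aa, aa):
            subs.append(f"{ref_aa}{pos}{aa}")
    return subs


def substitution_frequencies(reference, samples):
    """Map each substitution to (count, percentage of samples carrying it)."""
    counts = Counter(
        sub for seq in samples for sub in call_substitutions(reference, seq)
    )
    return {sub: (n, 100.0 * n / len(samples)) for sub, n in counts.items()}


# Toy data (hypothetical peptides, not real L1 sequences).
toy_reference = "MSHTLK"
toy_samples = ["MSYTLK", "MSYTLN", "MSHTLK"]
freqs = substitution_frequencies(toy_reference, toy_samples)
for sub, (count, percent) in sorted(freqs.items()):
    print(f"{sub}\t{count} ({percent:.1f}%)")
```

Grouping sequences by their set of substitutions, as done for groups 1 to 3, would then amount to using the sorted list returned by call_substitutions as a dictionary key.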

Signal peptide prediction

To predict signal peptide, we used “Signal-BLAST” (Frank and Sippl 2008) and “predisi” (Hiller et al. 2004).

Physicochemical properties

The instability index, aliphatic index, theoretical isoelectric point (pI), and grand average of hydropathy (GRAVY) were predicted with Expasy’s “ProtParam” (Gasteiger et al. 2005).
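
Two of these indices have simple closed-form definitions, sketched below in Python for illustration only; the study itself used the ProtParam server. GRAVY is the mean Kyte-Doolittle hydropathy over all residues, and the aliphatic index follows Ikai's formula based on the mole percentages of Ala, Val, Ile, and Leu. The function names are hypothetical, and the example input is the BC loop sequence listed in Table 8 rather than the full L1 protein.

```python
# Kyte-Doolittle hydropathy values for the 20 standard amino acids.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def gravy(seq):
    """Grand average of hydropathy: mean Kyte-Doolittle value per residue."""
    return sum(KYTE_DOOLITTLE[aa] for aa in seq) / len(seq)


def aliphatic_index(seq):
    """Ikai aliphatic index: X(Ala) + 2.9*X(Val) + 3.9*(X(Ile) + X(Leu)),
    where X is the mole percentage of each residue."""
    n = len(seq)

    def mole_percent(aa):
        return 100.0 * seq.count(aa) / n

    return (
        mole_percent("A")
        + 2.9 * mole_percent("V")
        + 3.9 * (mole_percent("I") + mole_percent("L"))
    )


# Example input: BC loop peptide from Table 8 (a fragment, not the whole protein).
bc_loop = "FPIKKPNNNKILVPA"
print(f"GRAVY: {gravy(bc_loop):.3f}")
print(f"Aliphatic index: {aliphatic_index(bc_loop):.2f}")
```

A negative GRAVY value indicates an overall hydrophilic sequence, which is how the GRAVY row of a ProtParam report is usually read.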

Post-modification positions

“DiANNA” (Ferrè and Clote 2006) and “SCRATCH” (Cheng et al. 2005) were used to predict disulfide bonds. Phosphorylation sites were predicted by “NetPhosK” (Blom et al. 2004), “DISPHOS” (Iakoucheva et al. 2004), and “NetPhos” (Blom et al. 1999), and N-glycosylation sites were predicted by “NetNGlyc” (Gupta et al. 2017) and “GlycoEP” (Chauhan et al. 2013).
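
As a rough illustration of what an N-glycosylation site is at the sequence level, the Python sketch below scans for the textbook N-X-S/T sequon (X being any residue except proline). This is an assumption-laden simplification: NetNGlyc and GlycoEP are trained predictors that score sequons in context, so a plain motif scan will generally report more candidates than those tools do. The function name and the start parameter are hypothetical; the example peptide is the DE loop sequence from Table 8.

```python
import re

# N, followed by any residue except P, followed by S or T.
# The lookahead lets overlapping sequons be reported.
SEQUON = re.compile(r"N(?=[^P][ST])")


def find_sequons(seq, start=1):
    """Return 1-based positions of Asn residues in N-X-S/T sequons.

    `start` is the residue number of the first character of `seq`,
    which is useful when scanning a fragment of a longer protein.
    """
    return [m.start() + start for m in SEQUON.finditer(seq)]


# DE loop fragment from Table 8, which begins at residue 157.
de_loop = "NASAYAANAGVDNR"
print(find_sequons(de_loop, start=157))
```

Applied to a full-length sequence with start=1, the same function lists every candidate asparagine, which can then be compared with the positions reported by the prediction servers.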

Secondary and tertiary structure

The “SOPMA” (Geourjon and Deleage 1995) and “Phyre2” (Kelley and Sternberg 2009) servers were used to predict the secondary structure. “I-TASSER” (Roy et al. 2010), “Phyre2” (Kelley and Sternberg 2009), and “(PS)2-v2” (Chen et al. 2006) were used to predict the tertiary structure of the selected sequences. All predicted 3D structures were evaluated for stereochemistry, reliability, and quality by “Qmean” (Benkert et al. 2008) and “RAMPAGE”. The obtained 3D structures were refined with the “GalaxyRefine” (Heo et al. 2013) program. “Discovery Studio” was used to find the loops on HPV L1.

Immunoinformatics analysis

The online programs “Immuneepitope” (Vita et al. 2018), “BcePred” (Saha and Raghava 2004), “ABCpred” (Saha and Raghava 2006b), and “Bepipred” (Jespersen et al. 2017) were used to find B cell epitope sites. Allergenic properties were predicted by “AlgPred” (Saha and Raghava 2006a), and “VaxiJen” (Doytchinova and Flower 2007) was used to predict protective antigens and subunit vaccine candidates. Hydrophilicity, flexibility, and surface accessibility of L1 were predicted using “IEDB” tools.
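
Propensity-scale methods of this kind generally average a per-residue scale over a sliding window and flag positions whose window score exceeds a threshold. The Python sketch below shows that general idea under stated assumptions: it uses the Hopp-Woods hydrophilicity scale as a stand-in (the IEDB tools offer other scales, such as Parker hydrophilicity), and the function names, window size, and threshold are illustrative choices, not the settings used in this study.

```python
# Hopp-Woods hydrophilicity scale (stand-in; not the scale used by IEDB).
HOPP_WOODS = {
    "R": 3.0, "D": 3.0, "E": 3.0, "K": 3.0, "S": 0.3,
    "N": 0.2, "Q": 0.2, "G": 0.0, "P": 0.0, "T": -0.4,
    "A": -0.5, "H": -0.5, "C": -1.0, "M": -1.3, "V": -1.5,
    "I": -1.8, "L": -1.8, "Y": -2.3, "F": -2.5, "W": -3.4,
}


def window_scores(seq, scale, window=7):
    """Return (center_position, mean score) for every full window.

    Positions are 1-based and refer to the central residue of the window.
    """
    half = window // 2
    scores = []
    for i in range(len(seq) - window + 1):
        chunk = seq[i:i + window]
        mean = sum(scale[aa] for aa in chunk) / window
        scores.append((i + half + 1, mean))
    return scores


def above_threshold(scores, threshold):
    """Return the center positions whose window score exceeds the threshold."""
    return [pos for pos, score in scores if score > threshold]


# Example on the FG loop peptide from Table 8 (a fragment, not the whole protein).
fg_loop = "TVGENVPDDLYIKGSGSTANLASSN"
profile = window_scores(fg_loop, HOPP_WOODS, window=7)
print(above_threshold(profile, threshold=0.0))
```

Runs of consecutive positions above the threshold correspond to the highlighted regions in a propensity plot.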

Results

Mutation

All the selected sequences contained a mutation at amino acid 228, and the most prevalent mutations occurred at positions 102, 202, 207, 292, 379, and 502 (Table 2).

Signal peptide

Neither of the two selected programs, Signal-BLAST and predisi, predicted a signal peptide in the L1 protein.

Protparam analysis

The L1 protein, with a length of 531 amino acids and a molecular weight of 59,554.02 Da, had a pI of 8.27, indicating that it is a basic protein. Its estimated half-life was 30 h in mammalian reticulocytes (in vitro) and more than 10 h in E. coli (in vivo). Its instability index classified it as a stable protein (Table 3).

Table 3

Physicochemical properties of L1 protein in selected sequences and reference sequence

Protparam	ref(K02718.1)	1	2	3
Molecular weight	59,554.02	59,542.99	59,473.93	59,503.95
pI	8.27	8.26	8.27	8.27
Estimated half-life	30 h (mammalian reticulocytes, in vitro).	30 h (mammalian reticulocytes, in vitro).	30 h (mammalian reticulocytes, in vitro).	30 h (mammalian reticulocytes, in vitro).
	>20 h (yeast, in vivo).	>20 h (yeast, in vivo).	>20 h (yeast, in vivo).	>20 h (yeast, in vivo).
	>10 h (Escherichia coli, in vivo).	>10 h (Escherichia coli, in vivo).	>10 h (Escherichia coli, in vivo).	>10 h (Escherichia coli, in vivo).
Instability index	36.11 stable.	35.87 stable	36.19 stable	36.19 stable
GRAVY	−0.298	−0.294	−0.289	−0.294
Aliphatic index	76.53	75.99	76.72	76.53

Post-modification prediction

Table 4 shows the disulfide bond predictions: 6 probable disulfide bonds were found in all sequences; however, because of mutations, different bonds were predicted at some positions. Prediction of glycosylation sites by several tools showed 4 conserved positions in all sequences (157, 242, 367, and 421). The analysis showed that L1 is a highly phosphorylated protein, with a large number of predicted phosphorylation positions (Table 5). In comparison with the reference sequence, 4 new phosphorylation positions (229, 376, 507, and 514) were found in the 3 groups of selected sequences. None of the listed mutations coincided with these new positions; therefore, other factors may have influenced one of the programs to predict these positions.

Table 4

Disulfide bonds were computed by DiANNA and SCRATCH with several similar positions among the sequences

disulfide bonds	Reference	1	2	3
13–13	+	+	+	+
13–211	+	+	+	+
13–371	+	+	+	+
128–211	+	+	+	+
172–405	+	+	+	+
201–453	+	–	–	–
211–453	+	–	–	–
211–454	–	+	+	+
255–371	+	+	+	+
405–453	+	–	–	–
350–454	–	+	+	+
187–454	–	+	+	+
350–371	–	+	–	–

+: The bond was predicted in the sequence. –: The bond was not found in the sequence
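
The bonds shared by the reference and all three groups can be read off Table 4 mechanically. The short Python sketch below transcribes the table (with "-" standing in for the table's "–" symbol) and extracts the bonds predicted in every column together with the cysteine positions they involve; the variable names are illustrative only.

```python
# Table 4 transcribed as bond -> presence in (reference, group 1, group 2, group 3).
BONDS = {
    (13, 13): "++++",
    (13, 211): "++++",
    (13, 371): "++++",
    (128, 211): "++++",
    (172, 405): "++++",
    (201, 453): "+---",
    (211, 453): "+---",
    (211, 454): "-+++",
    (255, 371): "++++",
    (405, 453): "+---",
    (350, 454): "-+++",
    (187, 454): "-+++",
    (350, 371): "-+--",
}

# Bonds predicted in the reference and in every group.
conserved_bonds = [bond for bond, presence in BONDS.items() if presence == "++++"]

# Distinct cysteine positions taking part in those conserved bonds.
conserved_cysteines = sorted({residue for bond in conserved_bonds for residue in bond})

print(len(conserved_bonds), "conserved bonds")
print("cysteine positions:", conserved_cysteines)
```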

Table 5

Phosphorylation sites found by “NetPhosK”, “DISPHOS”, and “NetPhos”; several serine, threonine, and tyrosine residues were predicted

	Serine	Threonine	Tyrosine
Reference	49, 115, 244, 308, 324, 369, 375, 422, 518, 519, 521	36, 65, 121, 292, 320, 362, 380, 439, 443, 544, 516, 523, 507, 517, 520, 522	53, 161, 260, 268, 302, 381, 444
1	+	+ and 376, 514	+
2	+	+ and 507, 514	+
3	+	+ and 229, 507, 514	+

+: All mentioned positions were predicted in the sequence

Structural analysis

According to the analysis, the largest share of the L1 secondary structure was random coil (42.5%); extended strand and alpha helix accounted for 28% and 19%, respectively (Table 6). Figure 1 shows the predicted tertiary structure of L1. The Qmean and Ramachandran plot results are summarized in Table 7. The coverage of each predicted structure is shown in Fig. 2. Using “Discovery Studio”, 5 loops were found on the L1 structure, which had been refined by “GalaxyRefine”; the results are summarized in Table 8 and illustrated in Fig. 3.

Table 6

Secondary structure prediction results for L1 protein; the majority of L1 protein structure consisting of random coil

Secondary structure	Reference	1	2	3
Alpha helix	19.77%	19.59%	19.96%	19.40%
Extended strand	28.25%	28.81%	28.63%	28.63%
Beta turn	9.42%	8.85%	9.04%	9.42%
Random coil	42.56%	42.75%	42.37%	42.56%

Fig. 1

HPV L1 3D model structure; yellow: 5 identified loops, and red: α-helices

Table 7

Ramachandran plot and Qmean results for the selected and reference sequences

	Phyre²		(PS)2-v2		I-TASSER
	Ramachandran^a	Qmean	Ramachandran^a	Qmean	Ramachandran^a	Qmean
Ref	86%,10.2%	−7.34	95.4%,4.2%	−3.73	79.2%, 14.4%	−8.79
1	85.5%,10.3%	−7.6	95.1%,4.4%	−3.51	80.3%, 14.0%	−7.71
2	85.7%,10.1%	−7.57	95.8%,4.0%	−3.98	81.1%, 14.0%	−6.77
3	85.7%,10.1%	−7.57	95.6%,4.0%	−3.72	81.9%, 14.0%	−6.12

^a Percentage of residues in favored and allowed regions

Fig. 2

Coverage of the tertiary structures predicted by the 3 programs. The coverage of “I-TASSER” was 100%; it was 90% for “Phyre2” and around 85% for “(PS)2-v2”

Table 8

Five identified loops on the L1 protein and the regions related to each loop. Codons which were mutated in selected sequences were bolded

Region	Loop
76-FPIKKPNNNKILVPA-89	BC or A
157-NASAYAANAGVDNR-170	DE or B
198-GSPCTNVAVNPGDCPP-213	EF or C
292-TVGENVPDDLYIKGSGSTANLASSN-316	FG or D
374-ISTSETTYK-382	H1 or E

Fig. 3

Propensity scale plots of L1, Flexibility, Hydrophilicity, and Surface accessibility. The Horizontal red line is the threshold. Yellow colors, above the threshold, indicate favorable regions consisting of higher scored residues

Immune properties of L1

B cell epitope prediction showed 6 regions with high potential (37–50, 79–86, 194–210, 303–313, 427–442, and 509–530). “AlgPred” could not find any IgE epitope, and “VaxiJen” classified L1 as a probable antigen. The “IEDB” results are displayed in Fig. 3 for 3 distinct parameters that were used to locate the regions with the highest antigenic ability; all prediction calculations are based on propensity scales for each of the 20 amino acids. In each diagram, the regions scoring above the threshold were favorable regions consisting of higher-scored residues. Combining the results of all three diagrams could define the regions with high potential as antigen candidates.
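
How the predicted epitope regions relate to the loops of Table 8 can be checked with a simple interval comparison. The Python sketch below uses the loop boundaries from Table 8 and the 6 epitope regions listed in this section; treating any shared residue as an overlap is an assumption made here for illustration, and the names used are hypothetical.

```python
# Loop boundaries from Table 8 (first and last residue number).
LOOPS = {
    "BC": (76, 89),
    "DE": (157, 170),
    "EF": (198, 213),
    "FG": (292, 316),
    "H1": (374, 382),
}

# Predicted B cell epitope regions listed in this section.
EPITOPES = [(37, 50), (79, 86), (194, 210), (303, 313), (427, 442), (509, 530)]


def overlaps(a, b):
    """True if two closed intervals share at least one residue."""
    return a[0] <= b[1] and b[0] <= a[1]


# For each loop, collect the epitope regions that overlap it.
epitopes_per_loop = {
    name: [region for region in EPITOPES if overlaps(span, region)]
    for name, span in LOOPS.items()
}

for name, regions in epitopes_per_loop.items():
    print(name, regions)
```

With this criterion, regions 79–86 and 303–313 fall entirely inside the BC and FG loops, while 194–210 only partly covers the EF loop.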

Discussion

Several studies on Iranian patients with HPV infections showed that the majority of cases were infected with HPV-16, with HPV-18 ranked second (Farjadian et al. 2003; Ghaffari et al. 2006; Salehi-Vaziri et al. 2016; Shahramian et al. 2011). As there were few sequences related to HPV-18, all the sequences used in this study belonged to HPV-16. Salehi-Vaziri et al. studied 436 Iranian women with different cervical lesions in 2013, and 32.8% of the samples were HPV-16 (Salehi-Vaziri et al. 2016). A study of 100 Iranian patients with cervical carcinomas by Mortazavi et al. in 2002 showed that more than 73% of them were infected with HPV-16 and around 12% with HPV-18 and HPV-33 types (Mortazavi et al. 2002). In 2014, research on 851 Iranian women showed that 7.3% of them were infected with HPV-16 (Yousefzadeh et al. 2014). While numerous investigations have used L1 sequences for HPV detection and genotyping, no study has reported mutations in this gene. Therefore, this study is the first report on HPV L1 mutations in Iranian patients, and the results may help to increase the efficiency of the current vaccine. Touze et al. showed in 1998 that substitutions in the amino acid region from residues 83 to 97 seemed to affect the expression level of the L1 protein, which has an important effect on HPV vaccine development and serological tests (Touze et al. 1998). In 2014, Fleury et al. found that only a few mutations within the FG loop are sufficient to generate a new serotype that escapes vaccination (Fleury et al. 2014). Pande et al. described in 2008 that, in the most frequent amino acid variation (T353P), threonine was replaced by a non-polar aliphatic amino acid, which may affect the structure or function of the L1 protein (Pande et al. 2008).
It was suggested that this variation had a significant role in immune recognition and vaccine development strategies and led to conformational changes within epitopes relevant for viral neutralization. In 2003, Ishii et al. found that three cysteines (175, 185, and 428) are required for the normal assembly of L1 capsids through trimerization and dimerization of L1; substitutions at these positions also affect the normal function of HPV (Ishii et al. 2003). The present study showed several mutations in the L1 protein; however, we did not find any mutation similar to those reported in other countries, probably because of the different geographical regions or study types. Lee et al. in 2008 examined the expression rate of the HPV L1 capsid protein in uterine cervical specimens and determined that the low-risk HPV group had a higher L1 capsid expression rate than the high-risk HPV group; they suggested that L1 capsid expression might be related to favorable disease biology (Lee et al. 2008). Touze et al. (1998) expressed L1 in insect cells using a recombinant baculovirus system, showing the stability of this protein in eukaryotic cells and that the yield depended on mutations in a region (residues 83 to 97) that seemed to affect the level of expression. In 2009, Bazan et al. expressed and purified recombinant HPV16 L1 in the methylotrophic yeast Pichia pastoris, which could be useful for producing low-cost vaccines (Bazan et al. 2009). Wang in 1999 and Zhang in 1998 produced a fused form in Escherichia coli using an inducible expression system and showed the stability of the L1 protein in this host (Zhang et al. 1998). Likewise, our results supported the stability of the L1 protein in prokaryotic and eukaryotic hosts and showed that it is a thermostable and hydrophilic protein; neither of the two selected programs predicted a signal peptide for the L1 protein.
Li et al. in 1998 determined the vital role of disulfide bonds in papillomavirus capsid assembly and identified a conserved position (cysteine (C) 424); the corresponding mutant could not assemble in vitro into capsid-like structures (Li et al. 1998). In 1998, Sapp et al. described the conserved exclusive disulfide bond between C176 and C427 and confirmed the importance of this bond for the structure of the papillomavirus capsid and for DNA packaging (Sapp et al. 1998). Conway et al. in 2011 indicated that capsids mature properly and become stabilized over time (10 to 20 days) (Conway et al. 2011). In addition, several individual L1 cysteine residues (428, 185, and 175) have an indispensable role in this process. Similar to previous studies, the present results identified several disulfide bonds and found that some bonds were lost as a result of mutations in the selected sequences. However, seven cysteine residues (13, 128, 172, 211, 255, 371, and 405) were found in bonds shared by all selected and reference sequences, and these have a critical role in the structure of the L1 protein. The differences between the positions found in the present study and those in previous studies may be related to the different geographical regions, the different methods used, and the different HPV genotypes. It can be suggested that the disulfide bond 13–13, as well as all other disulfide bonds, may play critical roles in constructing the L1 pentamer, which interacts with L2 proteins. Although our results showed 4 conserved glycosylation positions in all sequences, we could not find any function for the glycosylated regions in previous studies. Zhou et al. in 1993 showed that while the majority of L1 protein localizes in the cell nucleus, glycosylated L1 remains in the endoplasmic reticulum and is neither exported from the cell nor translocated to the cell membrane or the cell nucleus (Zhou et al. 1993).
Therefore, they concluded that glycosylated L1 was not important in the construction of the papillomavirus virion. While it was suggested that N-linked glycosylation played a significant role in VLP binding to cells, in 1999 Joyce et al. could not find any concentration of tunicamycin suitable for measuring the effect of glycosylation (Joyce et al. 1999). Two conserved amino acid residues (threonine (T) 340 and T129) were introduced as phosphorylation positions on the L1 protein by Buck et al. in 2013 (Buck et al. 2013). In addition, Xi and Banks showed that L1 was phosphorylated only in the earlier weeks and that this phosphorylation seems to be fairly unstable (Xi and Banks 1991). The present results indicated several phosphorylation sites that were conserved among all selected sequences and the reference sequence.

Bioinformatics has provided a powerful framework for defining the structure of virus proteins (Behzad et al. 2019; Dehghani et al. 2017, 2019a, b; Moattari et al. 2015; Zahra et al. 2020). Bioinformatics tools were employed widely in this research to determine the structure of L1 and illustrated that the majority of the secondary structure consisted of random coil and extended strand. The tertiary structure prediction by three programs showed that (PS)2-v2 constructed the most reliable structure, with the highest percentage of amino acid residues in the favored region of the Ramachandran plot and the highest Qmean score. However, the structure predicted by (PS)2-v2 covered only 85% of the L1 sequence; considering this, we can conclude that I-TASSER predicted a more useful structure because it covered the complete sequence. In addition, Phyre2 showed lower accuracy than (PS)2-v2, and its coverage was around 85%. In our study, the structural analysis with “Discovery Studio” showed five distinct loops on the L1 protein and the sequences related to each loop. Some mutations were located in these loops, i.e. T202N and N207T in EF, T292A in FG, and T379P in H1, which may have a significant effect on the loop structures and the antibody response. Carter et al. showed multiple neutralizing epitopes on the HPV virion surface, of which three loops (DE, FG, and HI) were the most important for binding by neutralizing antibodies (Carter et al. 2006). The crystal structures of four L1 pentamers were determined by Bishop et al. in 2007 (Bishop et al. 2007). They determined that the surface loops contain the known epitopes for neutralizing monoclonal antibodies (NmAbs). Bissett et al. identified the DE loop and the late region of the FG loop as neutralizing antibody epitopes; their findings were critical for describing vaccine-induced cross-neutralizing antibodies, which can play a vital role in vaccine-induced cross-protection (Bissett et al. 2012). Roth et al., using monoclonal antibodies (mAbs), recognized three loops (BC, DE, and FG) as the main epitopes and showed that the majority of neutralizing epitopes are located on the tip of the capsomere and are related to these variable loops; these data are crucial for designing intertypic HPV vaccines (Roth et al. 2006). Comparison of the identified loops in terms of hydrophilicity, flexibility, and surface accessibility revealed that the BC, DE, and H1 loops were located in the regions with the highest surface accessibility scores. In addition, the highest flexibility scores belonged to the DE, EF, and FG loops, and 4 loops (BC, DE, FG, and H1) had high hydrophilicity scores. Based on the three indices (hydrophilicity, flexibility, and surface accessibility), we found that BC, DE, FG, and H1 had the highest antigenic ability among the regions of the L1 protein. The combination of B cell epitope prediction and antigenic properties indicated that BC and FG were the most capable regions of L1, with the highest potential for use in recombinant vaccines against HPV infections (Table 9). Considering the list of mutations, the BC loop was completely conserved, and just one codon (292) in the FG loop showed substitution among the selected sequences.

Table 9

Comparison of all identified loops in terms of hydrophilicity, flexibility, surface accessibility, and B cell epitope prediction

Loops	Surface accessibility	Flexibility	Hydrophilicity	B-cell epitope	Final selection
BC	*		*	*	*
DE	*	*	*
EF		*		*
FG		*	*	*	*
H1	*		*

Finally, two loops, BC and FG, were selected as the most capable regions. *: indicates the more capable loops for each parameter
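
The selection summarized in Table 9 can be reproduced with a few set operations. In the Python sketch below, the four sets are transcribed from the table; the rule "at least two of the three propensity indices, plus a predicted B cell epitope" is an assumption that happens to reproduce the table's final column and is not stated in this form by the authors.

```python
# Loops marked with * in Table 9 for each parameter.
SURFACE_ACCESSIBILITY = {"BC", "DE", "H1"}
FLEXIBILITY = {"DE", "EF", "FG"}
HYDROPHILICITY = {"BC", "DE", "FG", "H1"}
B_CELL_EPITOPE = {"BC", "EF", "FG"}

LOOPS = ["BC", "DE", "EF", "FG", "H1"]


def index_count(loop):
    """Number of propensity indices (out of 3) in which the loop is marked."""
    return sum(
        loop in marked
        for marked in (SURFACE_ACCESSIBILITY, FLEXIBILITY, HYDROPHILICITY)
    )


# Assumed rule: a loop counts as highly antigenic if marked in >= 2 indices.
high_antigenic = [loop for loop in LOOPS if index_count(loop) >= 2]

# Final selection: highly antigenic loops that are also predicted B cell epitopes.
final_selection = [loop for loop in high_antigenic if loop in B_CELL_EPITOPE]

print("High antigenic ability:", high_antigenic)
print("Final selection:", final_selection)
```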

Humoral immunity always has a key role in controlling virus infections (Ajorloo et al. 2015; Alborzi et al. 2017; Negahdaripour et al. 2017). B cell epitope prediction showed 6 potential regions, 3 of which coincided with 3 loops (BC, EF, and FG), illustrating the importance of these loops in the humoral response against the HPV L1 protein. In comparison with previous studies, it can be concluded that FG is the most frequently reported loop, with tremendous potential as an epitope for neutralizing antibodies. The mentioned mutations did not seem to have a significant effect on the B cell epitope regions and their scores; however, the substitution at amino acid 202 had a positive impact on the EF loop and increased the score of this region as a potential B cell epitope. To conclude, the present results suggest two regions in the HPV L1 protein with high potential to induce an immune response, which can be used as candidates for new HPV vaccines. Despite several substitutions in the studied sequences, these had an insignificant impact on the features of the L1 protein; it can be inferred that L1 has a conserved sequence and that the existing vaccines are still functional for Iranian patients. However, monitoring mutations is crucial for assessing the effectiveness and efficacy of the vaccines. The structural and immune properties of Iranian L1 showed the vital role of 5 loops, which lead to a strong immune response against HPV infections.

References

1. PrediSi: prediction of signal peptides and their cleavage positions.

Authors: Karsten Hiller; Andreas Grote; Maurice Scheer; Richard Münch; Dieter Jahn
Journal: Nucleic Acids Res Date: 2004-07-01 Impact factor: 16.971

2. Prediction of continuous B-cell epitopes in an antigen using recurrent neural network.

Authors: Sudipto Saha; G P S Raghava
Journal: Proteins Date: 2006-10-01

Review 3. The papillomavirus major capsid protein L1.

Authors: Christopher B Buck; Patricia M Day; Benes L Trus
Journal: Virology Date: 2013-06-22 Impact factor: 3.616

4. Identification of human papillomavirus type 16 L1 surface loops required for neutralization by human sera.

Authors: Joseph J Carter; Greg C Wipf; Margaret M Madeleine; Stephen M Schwartz; Laura A Koutsky; Denise A Galloway
Journal: J Virol Date: 2006-05 Impact factor: 5.103

5. Intercapsomeric disulfide bonds in papillomavirus assembly and disassembly.

Authors: M Li; P Beard; P A Estes; M K Lyon; R L Garcea
Journal: J Virol Date: 1998-03 Impact factor: 5.103

6. The L1 major capsid protein of human papillomavirus type 11 recombinant virus-like particles interacts with heparin and cell-surface glycosaminoglycans on human keratinocytes.

Authors: J G Joyce; J S Tung; C T Przysiecki; J C Cook; E D Lehman; J A Sands; K U Jansen; P M Keller
Journal: J Biol Chem Date: 1999-02-26 Impact factor: 5.157

7. High risk HPV types in southern Iranian patients with cervical cancer.

Authors: S Farjadian; E Asadi; M Doroudchi; A Samsami Dehaghani; S Z Tabei; V P Kumar; A Ghaderi
Journal: Pathol Oncol Res Date: 2003-07-14 Impact factor: 3.201

8. Human papillomavirus type 16 variant analysis of E6, E7, and L1 genes and long control region in biopsy samples from cervical cancer patients in north India.

Authors: Shailja Pande; Neeraj Jain; Bhupesh K Prusty; Suresh Bhambhani; Sanjay Gupta; Rajyashri Sharma; Swaraj Batra; Bhudev C Das
Journal: J Clin Microbiol Date: 2008-01-16 Impact factor: 5.948

9. DiANNA 1.1: an extension of the DiANNA web server for ternary cysteine classification.

Authors: F Ferrè; P Clote
Journal: Nucleic Acids Res Date: 2006-07-01 Impact factor: 16.971

Review 10. Human papilloma virus in oral cancer.

Authors: Soung Min Kim
Journal: J Korean Assoc Oral Maxillofac Surg Date: 2016-12-27
